PaddlePaddle
diff --git a/docs/algorithm/formula_recognition/algorithm_rec_ppformulanet.md b/docs/algorithm/formula_recognition/algorithm_rec_ppformulanet.md
Lines changed: 15 additions & 11 deletions
diff --git a/docs/algorithm/formula_recognition/algorithm_rec_ppformulanet_en.md b/docs/algorithm/formula_recognition/algorithm_rec_ppformulanet_en.md
Lines changed: 17 additions & 11 deletions
diff --git a/ppocr/metrics/rec_metric.py b/ppocr/metrics/rec_metric.py
Lines changed: 5 additions & 7 deletions
@@ -2,20 +2,24 @@
 
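 ## 1. Introduction
 
-`PP-FormulaNet` is a formula recognition model developed in-house by Baidu PaddlePaddle. It is trained on a 5-million-sample dataset built internally by PaddleX and achieves the following accuracy on the corresponding test sets:
+`PP-FormulaNet` is an advanced formula recognition model developed by the Baidu PaddlePaddle vision team that can recognize 50,000 common LaTeX source tokens. The PP-FormulaNet-S variant uses PP-HGNetV2-B4 as its backbone and applies techniques such as parallel masking and model distillation to greatly increase inference speed while keeping recognition accuracy high; it suits simple printed formulas and simple printed formulas that span multiple lines. The PP-FormulaNet-L variant is based on Vary_VIT_B and was trained extensively on a large-scale formula dataset; it improves markedly on complex formulas and suits simple printed, complex printed, and handwritten formulas.
 
+To improve performance further, the Baidu PaddlePaddle team built an enhanced version, `PP-FormulaNet_plus`, on top of PP-FormulaNet. It is trained on a rich dataset drawn from Chinese dissertations, professional books, textbooks and exam papers, and mathematics journals, which greatly improves recognition capability. PP-FormulaNet_plus-M and PP-FormulaNet_plus-L add support for Chinese formulas and raise the maximum number of predicted tokens from 1024 to 2560, significantly improving recognition of complex formulas, while PP-FormulaNet_plus-S focuses on strengthening English formula recognition. As a result, the PP-FormulaNet_plus series handles complex and varied formula recognition tasks better.
 
+The accuracy of the models above on the corresponding test sets is as follows: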
 
-| Model        | Backbone       | Config                                                  | SPE-<br/>BLEU↑ | CPE-<br/>BLEU↑  | Easy-<br/>BLEU↑ | Middle-<br/>BLEU↑ | Hard-<br/>BLEU↑| Avg-<br/>BLEU↑ | Download link |
-|-----------|------------|------------------|:--------------:|:---------:|:----------:|:----------------:|:---------:|:-----------------:|:-----------------:|
-| UniMERNet | Donut Swin | [UniMERNet.yaml](../../../configs/rec/UniMERNet.yaml) |     0.9187  |    0.9252       | 0.8658  |    0.8228   | 0.7740 |     0.8613        |[trained model](https://paddleocr.bj.bcebos.com/contribution/rec_unimernet_train.tar)|
-| PP-FormulaNet-S | PPHGNetV2_B4 | [PP-FormulaNet-S.yaml](../../../configs/rec/PP-FormuaNet/PP-FormulaNet-S.yaml) |    0.8694   |    0.8071       | 0.9294  |    0.9112    | 0.8391 |    0.8712       |[trained model](https://paddleocr.bj.bcebos.com/contribution/rec_ppformulanet_s_train.tar)|
-| PP-FormulaNet-L | Vary_VIT_B | [PP-FormulaNet-L.yaml](../../../configs/rec/PP-FormuaNet/PP-FormulaNet-L.yaml) |     0.9055   |     0.9206       | 0.9392  |     0.9273    | 0.9141 |     0.9213         |[trained model](https://paddleocr.bj.bcebos.com/contribution/rec_ppformulanet_l_train.tar )|
-| PP-FormulaNet_plus-S | PPHGNetV2_B4 | [PP-FormulaNet_plus-S.yaml](../../../configs/rec/PP-FormuaNet/PP-FormulaNet_plus-S.yaml) |     -   |     -       | -  |     -    | - |     -         |[trained model](https://paddleocr.bj.bcebos.com/contribution/rec_ppformulanet_plus_s_train.tar )|
-| PP-FormulaNet_plus-M | PPHGNetV2_B6 | [PP-FormulaNet_plus-M.yaml](../../../configs/rec/PP-FormuaNet/PP-FormulaNet_plus-M.yaml) |     -   |     -       | -  |     -    | - |     -         |[trained model](https://paddleocr.bj.bcebos.com/contribution/rec_ppformulanet_plus_m_train.tar )|
-| PP-FormulaNet_plus-L | Vary_VIT_B | [PP-FormulaNet_plus-L.yaml](../../../configs/rec/PP-FormuaNet/PP-FormulaNet_plus-L.yaml) |     -   |     -       | -  |     -    | - |     -         |[trained model](https://paddleocr.bj.bcebos.com/contribution/rec_ppformulanet_plus_l_train.tar )|
+| Model          | Backbone       | Config                   | En-<br/>BLEU↑ |Zh-<br/>BLEU(%)↑ |OmniDocBench-<br/>BLEU(%)↑  |GPU Inference Time (ms)| Download link |
+|-----------|--------|----------------------------------------|:----------------:|:---------:|:-----------------:|:--------------:|:--------------:|
+| UniMERNet | Donut Swin | [UniMERNet.yaml](../../../configs/rec/UniMERNet.yaml) |     85.91  |   43.50       | 67.75 | 2266.96 | [trained model](https://paddleocr.bj.bcebos.com/contribution/rec_unimernet_train.tar)|
+| PP-FormulaNet-S | PPHGNetV2_B4 | [PP-FormulaNet-S.yaml](../../../configs/rec/PP-FormuaNet/PP-FormulaNet-S.yaml) |   87.00   |   45.71   | 59.57| 202.25 |[trained model](https://paddleocr.bj.bcebos.com/contribution/rec_ppformulanet_s_train.tar)|
+| PP-FormulaNet-L | Vary_VIT_B | [PP-FormulaNet-L.yaml](../../../configs/rec/PP-FormuaNet/PP-FormulaNet-L.yaml) |    90.36   |    45.78       | 64.81  | 1976.52  |[trained model](https://paddleocr.bj.bcebos.com/contribution/rec_ppformulanet_l_train.tar )|
+| PP-FormulaNet_plus-S | PPHGNetV2_B4 | [PP-FormulaNet_plus-S.yaml](../../../configs/rec/PP-FormuaNet/PP-FormulaNet_plus-S.yaml) |     88.71   |     53.32       | 70.54  |     191.69  |[trained model](https://paddleocr.bj.bcebos.com/contribution/rec_ppformulanet_plus_s_train.tar )|
+| PP-FormulaNet_plus-M | PPHGNetV2_B6 | [PP-FormulaNet_plus-M.yaml](../../../configs/rec/PP-FormuaNet/PP-FormulaNet_plus-M.yaml) |     91.45   |     89.76       | 72.07  |     1301.56    |[trained model](https://paddleocr.bj.bcebos.com/contribution/rec_ppformulanet_plus_m_train.tar )|
+| PP-FormulaNet_plus-L | Vary_VIT_B | [PP-FormulaNet_plus-L.yaml](../../../configs/rec/PP-FormuaNet/PP-FormulaNet_plus-L.yaml) |     92.22   |     90.64       | 72.45  |     1745.25    |[trained model](https://paddleocr.bj.bcebos.com/contribution/rec_ppformulanet_plus_l_train.tar )|
+| LaTeX-OCR | Hybrid ViT |[LaTeX_OCR_rec.yaml](https://github.com/PaddlePaddle/PaddleOCR/blob/main/configs/rec/LaTeX_OCR_rec.yaml)|   74.55   |       39.96        | 47.59| 1244.61   |[trained model](https://paddleocr.bj.bcebos.com/contribution/rec_latex_ocr_train.tar)|
 
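-Here, SPE and CPE are UniMERNet's simple and complex formula datasets; Easy, Middle, and Hard are the simple (LaTeX code length 0-64), medium (LaTeX code length 64-256), and complex (LaTeX code length 256+) formula datasets built internally by PaddleX.
+
+Here, En-BLEU, Zh-BLEU (%), and OmniDocBench-BLEU (%) denote the BLEU scores on English formulas, Chinese formulas, and OmniDocBench, respectively. The English evaluation set comprises UniMERNet's simple and complex formulas plus the simple, medium, and complex formulas built internally by PaddleX. The Chinese evaluation set comes from PaddleX's self-built Chinese formula dataset. The OmniDocBench evaluation set consists of the display-formula images in [OmniDocBench](https://github.com/opendatalab/OmniDocBench).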
 
 ## 2. Environment
 Please refer to ["Environment Preparation"](../../ppocr/environment.md) to configure the PaddleOCR environment, and to ["Project Clone"](../../ppocr/blog/clone.md) to clone the project code.
@@ -89,7 +93,7 @@ python3  -m paddle.distributed.launch --gpus '0,1,2,3' --ips=127.0.0.1   tools/t
 ```shell
 # Note: set the pretrained_model path to a local path.
 python3 tools/infer_rec.py -c configs/rec/PP-FormuaNet/PP-FormulaNet-S.yaml \
-  -o  Global.infer_img='./docs/datasets/images/pme_demo/0000099.png'\
+  -o  Global.infer_img='./docs/datasets/images/pme_demo/0000295.png'\
    Global.pretrained_model=./rec_ppformulanet_s_train/best_accuracy.pdparams
 # To predict all images in a folder, set infer_img to the folder, e.g. Global.infer_img='./doc/datasets/pme_demo/'.
 ```
 
@@ -3,18 +3,24 @@
 ## 1. Introduction
 
 
-PP-FormulaNet is a formula recognition model independently developed by Baidu PaddlePaddle. It is trained on a self-built dataset of 5 million samples within PaddleX, achieving the following accuracy on the corresponding test set:
+`PP-FormulaNet` is an advanced formula recognition model developed by Baidu’s PaddlePaddle Vision Team, supporting the recognition of 50,000 common LaTeX source terms. The PP-FormulaNet-S version uses PP-HGNetV2-B4 as its backbone network, leveraging techniques like parallel masking and model distillation to significantly enhance inference speed while maintaining high recognition accuracy, making it suitable for scenarios involving simple printed formulas and cross-line simple printed formulas. On the other hand, the PP-FormulaNet-L version is based on Vary_VIT_B and has undergone extensive training on a large-scale formula dataset, showing significant improvement in recognizing complex formulas, and is applicable to simple printed, complex printed, and handwritten formulas.
 
-| Model           | Backbone       | config                                                  |SPE-<br/>BLEU↑ | CPE-<br/>BLEU↑  | Easy-<br/>BLEU↑ | Middle-<br/>BLEU↑ | Hard-<br/>BLEU↑| Avg-<br/>BLEU↑  | Download link |
-|-----------|--------|---------------------------------------------------|:--------------:|:-----------------:|:----------:|:----------------:|:---------:|:-----------------:|:--------------:|
-| UniMERNet | Donut Swin | [UniMERNet.yaml](../../../configs/rec/UniMERNet.yaml) |     0.9187  |    0.9252       | 0.8658  |    0.8228   | 0.7740 |     0.8613        |[trained model](https://paddleocr.bj.bcebos.com/contribution/rec_unimernet_train.tar)|
-| PP-FormulaNet-S | PPHGNetV2_B4 | [PP-FormulaNet-S.yaml](../../../configs/rec/PP-FormuaNet/PP-FormulaNet-S.yaml) |    0.8694   |    0.8071       | 0.9294  |    0.9112    | 0.8391 |    0.8712       |[trained model](https://paddleocr.bj.bcebos.com/contribution/rec_ppformulanet_s_train.tar)|
-| PP-FormulaNet-L | Vary_VIT_B | [PP-FormulaNet-L.yaml](../../../configs/rec/PP-FormuaNet/PP-FormulaNet-L.yaml) |     0.9055   |     0.9206       | 0.9392  |     0.9273    | 0.9141 |     0.9213         |[trained model](https://paddleocr.bj.bcebos.com/contribution/rec_ppformulanet_l_train.tar )|
-| PP-FormulaNet_plus-S | PPHGNetV2_B4 | [PP-FormulaNet_plus-S.yaml](../../../configs/rec/PP-FormuaNet/PP-FormulaNet_plus-S.yaml) |     -   |     -       | -  |     -    | - |     -         |[trained model](https://paddleocr.bj.bcebos.com/contribution/rec_ppformulanet_plus_s_train.tar )|
-| PP-FormulaNet_plus-M | PPHGNetV2_B6 | [PP-FormulaNet_plus-M.yaml](../../../configs/rec/PP-FormuaNet/PP-FormulaNet_plus-M.yaml) |     -   |     -       | -  |     -    | - |     -         |[trained model](https://paddleocr.bj.bcebos.com/contribution/rec_ppformulanet_plus_m_train.tar )|
-| PP-FormulaNet_plus-L | Vary_VIT_B | [PP-FormulaNet_plus-L.yaml](../../../configs/rec/PP-FormuaNet/PP-FormulaNet_plus-L.yaml) |     -   |     -       | -  |     -    | - |     -         |[trained model](https://paddleocr.bj.bcebos.com/contribution/rec_ppformulanet_plus_l_train.tar )|
+To further enhance performance, the Baidu PaddlePaddle team developed an enhanced version called `PP-FormulaNet_plus` based on PP-FormulaNet. This version utilizes a rich dataset sourced from Chinese theses, professional books, textbooks, exam papers, and mathematical journals, greatly improving recognition capability. Among them, PP-FormulaNet_plus-M and PP-FormulaNet_plus-L add support for Chinese formulas and increase the maximum predicted token count from 1024 to 2560, significantly enhancing the recognition performance of complex formulas. Meanwhile, PP-FormulaNet_plus-S focuses on enhancing English formula recognition, making the PP-FormulaNet_plus series models perform more exceptionally in handling complex and diverse formula recognition tasks.
 
-Among them, SPE and CPE refer to the simple and complex formula datasets of UniMERNet, respectively. Easy, Middle, and Hard are simple (LaTeX code length 0-64), medium (LaTeX code length 64-256), and complex formula datasets (LaTeX code length 256+) built internally by PaddleX.
+The accuracy of the above models on the corresponding test sets is as follows:
+
+| Model           | Backbone       | config                  | En-<br/>BLEU↑ |Zh-<br/>BLEU(%)↑ |OmniDocBench-<br/>BLEU(%)↑  | GPU Inference Time (ms)| Download link |
+|-----------|--------|----------------------------------------|:----------------:|:---------:|:-----------------:|:--------------:|:--------------:|
+| UniMERNet | Donut Swin | [UniMERNet.yaml](../../../configs/rec/UniMERNet.yaml) |     85.91  |   43.50       | 67.75 | 2266.96 | [trained model](https://paddleocr.bj.bcebos.com/contribution/rec_unimernet_train.tar)|
+| PP-FormulaNet-S | PPHGNetV2_B4 | [PP-FormulaNet-S.yaml](../../../configs/rec/PP-FormuaNet/PP-FormulaNet-S.yaml) |   87.00   |   45.71   | 59.57| 202.25 |[trained model](https://paddleocr.bj.bcebos.com/contribution/rec_ppformulanet_s_train.tar)|
+| PP-FormulaNet-L | Vary_VIT_B | [PP-FormulaNet-L.yaml](../../../configs/rec/PP-FormuaNet/PP-FormulaNet-L.yaml) |    90.36   |    45.78       | 64.81  | 1976.52  |[trained model](https://paddleocr.bj.bcebos.com/contribution/rec_ppformulanet_l_train.tar )|
+| PP-FormulaNet_plus-S | PPHGNetV2_B4 | [PP-FormulaNet_plus-S.yaml](../../../configs/rec/PP-FormuaNet/PP-FormulaNet_plus-S.yaml) |     88.71   |     53.32       | 70.54  |     	191.69  |[trained model](https://paddleocr.bj.bcebos.com/contribution/rec_ppformulanet_plus_s_train.tar )|
+| PP-FormulaNet_plus-M | PPHGNetV2_B6 | [PP-FormulaNet_plus-M.yaml](../../../configs/rec/PP-FormuaNet/PP-FormulaNet_plus-M.yaml) |     91.45   |     89.76       | 	72.07  |     	1301.56    |[trained model](https://paddleocr.bj.bcebos.com/contribution/rec_ppformulanet_plus_m_train.tar )|
+| PP-FormulaNet_plus-L | Vary_VIT_B | [PP-FormulaNet_plus-L.yaml](../../../configs/rec/PP-FormuaNet/PP-FormulaNet_plus-L.yaml) |     92.22   |     90.64       | 	72.45  |     1745.25    |[trained model](https://paddleocr.bj.bcebos.com/contribution/rec_ppformulanet_plus_l_train.tar )|
+| LaTeX-OCR | Hybrid ViT |[LaTeX_OCR_rec.yaml](https://github.com/PaddlePaddle/PaddleOCR/blob/main/configs/rec/LaTeX_OCR_rec.yaml)|   74.55   |       39.96        | 47.59| 	1244.61   |[trained model](https://paddleocr.bj.bcebos.com/contribution/rec_latex_ocr_train.tar)|
+
+
+En-BLEU, Zh-BLEU (%), and OmniDocBench-BLEU (%) represent the BLEU scores for English formulas, Chinese formulas, and the OmniDocBench, respectively. The evaluation dataset for English formulas includes simple and complex formulas from UniMERNet, as well as simple, intermediate, and complex formulas from PaddleX’s internally developed dataset. The evaluation dataset for Chinese formulas comes from PaddleX’s internally developed Chinese formula dataset. The OmniDocBench evaluation set consists of inter-line formula images from  [OmniDocBench](https://github.com/opendatalab/OmniDocBench).
 
 
 ## 2. Environment
@@ -74,7 +80,7 @@ Prediction:
 ```shell
 # The configuration file used for prediction must match the training
 python3 tools/infer_rec.py -c configs/rec/PP-FormuaNet/PP-FormulaNet-S.yaml \
-  -o  Global.infer_img='./docs/datasets/images/pme_demo/0000099.png'\
+  -o  Global.infer_img='./docs/datasets/images/pme_demo/0000295.png'\
    Global.pretrained_model=./rec_ppformulanet_s_train/best_accuracy.pdparams
 ```
 
 
@@ -190,7 +190,6 @@ def __init__(self, main_indicator="exp_rate", cal_bleu_score=False, **kwargs):
         self.e1_right = []
         self.e2_right = []
         self.e3_right = []
-        self.editdistance_total_length = 0
         self.exp_total_num = 0
         self.edit_dist = 0
         self.exp_rate = 0
@@ -210,11 +209,12 @@ def __call__(self, preds, batch, **kwargs):
         word_pred = preds
         word_label = batch
         line_right, e1, e2, e3 = 0, 0, 0, 0
-        lev_dist = []
+        bleu_list, lev_dist = [], []
         for labels, prediction in zip(word_label, word_pred):
             if prediction == labels:
                 line_right += 1
             distance = compute_edit_distance(prediction, labels)
+            bleu_list.append(compute_bleu_score([prediction], [labels]))
             lev_dist.append(Levenshtein.normalized_distance(prediction, labels))
             if distance <= 1:
                 e1 += 1
@@ -228,19 +228,17 @@ def __call__(self, preds, batch, **kwargs):
         self.edit_dist = sum(lev_dist)  # float
         self.exp_rate = line_right  # float
         if self.cal_bleu_score:
-            self.bleu_score = compute_bleu_score(word_pred, word_label)
+            self.bleu_score = sum(bleu_list)
+            self.bleu_right.append(self.bleu_score)
         self.e1 = e1
         self.e2 = e2
         self.e3 = e3
         exp_length = len(word_label)
         self.edit_right.append(self.edit_dist)
         self.exp_right.append(self.exp_rate)
-        if self.cal_bleu_score:
-            self.bleu_right.append(self.bleu_score * batch_size)
         self.e1_right.append(self.e1)
         self.e2_right.append(self.e2)
         self.e3_right.append(self.e3)
-        self.editdistance_total_length = self.editdistance_total_length + exp_length
         self.exp_total_num = self.exp_total_num + exp_length
 
     def get_metric(self):
@@ -254,7 +252,7 @@ def get_metric(self):
         cur_edit_distance = sum(self.edit_right) / self.exp_total_num
         cur_exp_rate = sum(self.exp_right) / self.exp_total_num
         if self.cal_bleu_score:
-            cur_bleu_score = sum(self.bleu_right) / self.editdistance_total_length
+            cur_bleu_score = sum(self.bleu_right) / self.exp_total_num
         cur_exp_1 = sum(self.e1_right) / self.exp_total_num
         cur_exp_2 = sum(self.e2_right) / self.exp_total_num
         cur_exp_3 = sum(self.e3_right) / self.exp_total_num
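
As the `rec_metric.py` hunks read, the old code scored each batch once and weighted that score by the batch size, while the new code sums one BLEU score per sample and divides by the total sample count in `get_metric`. The sketch below illustrates only that accumulation pattern. `RunningSampleMean` and `token_overlap` are hypothetical names introduced here, and `token_overlap` is a toy stand-in for `compute_bleu_score`, not the BLEU implementation used in PaddleOCR.

```python
from typing import Callable, List, Sequence


def token_overlap(pred: str, label: str) -> float:
    """Toy per-sample score (NOT BLEU): fraction of label tokens found in pred."""
    label_tokens = label.split()
    if not label_tokens:
        # An empty label is only matched by an empty prediction.
        return 1.0 if not pred.split() else 0.0
    pred_tokens = set(pred.split())
    return sum(tok in pred_tokens for tok in label_tokens) / len(label_tokens)


class RunningSampleMean:
    """Sum a per-sample score within each batch, then average over all samples."""

    def __init__(self, score_fn: Callable[[str, str], float]) -> None:
        self.score_fn = score_fn
        self.batch_sums: List[float] = []  # plays the role of bleu_right
        self.total_samples = 0             # plays the role of exp_total_num

    def update(self, preds: Sequence[str], labels: Sequence[str]) -> None:
        # One score per (prediction, label) pair, summed over the batch.
        batch_sum = sum(self.score_fn(p, l) for p, l in zip(preds, labels))
        self.batch_sums.append(batch_sum)
        self.total_samples += len(labels)

    def result(self) -> float:
        # Mean over samples, independent of how they were split into batches.
        if self.total_samples == 0:
            return 0.0
        return sum(self.batch_sums) / self.total_samples


metric = RunningSampleMean(token_overlap)
metric.update(["a b", "c"], ["a b", "d"])   # per-sample scores: 1.0 and 0.0
metric.update(["x y z"], ["x y q w"])       # per-sample score: 0.5
print(metric.result())                      # 0.5
```

Because every sample contributes exactly one term to the numerator and one count to the denominator, the reported mean does not depend on how the evaluation set is split into batches.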