Merge main branch changes to release/3.3 #17514

Closed
Bobholamovic wants to merge 76 commits into PaddlePaddle:release/3.3 from Bobholamovic:cp

Conversation

@Bobholamovic
Member

No description provided.

TingquanGao and others added 30 commits October 16, 2025 22:06
* update docs

* add methods
Frigate is a real-time NVR system that uses PaddleOCR for License Plate Recognition (LPR).

Co-authored-by: AmirHossein_Omidi <151873319+AmirHoseinOmidi@users.noreply.github.com>
* update PaddleOCR-VL paper url

* polish README
* Add hardware support

* Add hardware support

* fix

* update

* update
* update FAQ

* Update PaddleOCR-VL.en.md

* Update PaddleOCR-VL.en.md

* Update PaddleOCR-VL.en.md

* Update PaddleOCR-VL.md
…addle#16756)

Signed-off-by: Adler Fleurant <2609856+AdlerFleurant@users.noreply.github.com>
* update readme

* fix code-style for readme
* Optimize docs for deployment of PaddleOCR-VL

* Update docs

* Fix not-using-doc-preprocessor bug

* Update dockerfiles and docs

* Add SFT

* Fix code style

* Add PaddleOCR-VL-0.9B model into offline pipeline image

* Support Windows

* Add lower bound for paddleocr version

* Revert windows and paddle 3.2.1

* Support setting paddleocr version

* Fix typo

* Update docker image sizes

* Fix bug

* Fix doc
* polish README

* polish
* update PaddleOCR-VL.md

* update

* update

* update

* update

* add en docs
zhang-prog and others added 23 commits November 28, 2025 18:00
* fix tips

* update
* fix tips

* fix tips
* [METAX] supports paddleOCR VL in metax_gpu

* Fix code style

* Fix code style
* fix docs

* update

* update
…ion (PaddlePaddle#16994)

* Fix: Prevent auto-splitting of French accented words in text recognition

Added support for Latin characters with diacritics (é, è, à, ç, etc.) and French contractions (n'êtes) in word grouping logic of BaseRecLabelDecode.get_word_info().

This fix ensures that French words are no longer split at accented characters during OCR text recognition.

* moved test file and fix some style errors

* fix: Move test file to tests/ directory and correct Unicode name check

- Moved test_french_accents.py to tests/ directory following project structure
- Removed invalid 'FRENCH' prefix from Unicode name check
- Unicode standard only uses 'LATIN' prefix for all Latin-based characters
- All French accented characters (é, è, à, ç, etc.) are correctly matched
- Verified with comprehensive character set including uppercase/lowercase variants

* style: Remove emojis from test file to maintain project code style
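
The Unicode name check described in the commits above can be illustrated with a minimal sketch. This is not the actual `BaseRecLabelDecode.get_word_info()` code; the helper names `is_latin_letter` and `group_latin_words` are hypothetical, and the apostrophe handling is only one plausible way to keep contractions such as `n'êtes` together. It relies on the fact that Unicode names accented Latin letters with a `LATIN` prefix (there is no `FRENCH` prefix).

```python
import unicodedata


def is_latin_letter(ch: str) -> bool:
    """Return True for a single alphabetic character whose Unicode name starts with 'LATIN'."""
    if len(ch) != 1 or not ch.isalpha():
        return False
    # unicodedata.name() returns the default ("") for unnamed code points.
    return unicodedata.name(ch, "").startswith("LATIN")


def group_latin_words(text: str) -> list[str]:
    """Group consecutive Latin letters into words (hypothetical helper).

    An apostrophe is kept inside a word only when it sits between
    two Latin letters, as in the French contraction "n'êtes".
    """
    words, current = [], ""
    for i, ch in enumerate(text):
        inner_apostrophe = (
            ch in ("'", "\u2019")
            and current != ""
            and i + 1 < len(text)
            and is_latin_letter(text[i + 1])
        )
        if is_latin_letter(ch) or inner_apostrophe:
            current += ch
        else:
            if current:
                words.append(current)
            current = ""
    if current:
        words.append(current)
    return words


print(group_latin_words("Vous n'êtes pas français"))
# ['Vous', "n'êtes", 'pas', 'français']
```

The sketch assumes precomposed (NFC) input; a decomposed `e` followed by a combining accent would still be split, because combining marks are not alphabetic.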
* Support Qianfan PP-StructureV3 MCP server

* Bump version to 0.4.1
* update docs

* update

* update

* update
…#17401)

* Support MetaX GPU docker image

* Update

* Update dockerfile

* Add docs

* Polish docs

* Update docs

* Add mkdocs.yml
PaddlePaddle#17201)

* fix: support accented characters in word segmentation for return_word_box

Fixes PaddlePaddle#17156

The word segmentation in get_word_info() was using [a-zA-Z0-9] regex which
only matched ASCII letters and digits. This caused words with accented
characters (ä, ö, ü, é, à, etc.) to be incorrectly split into separate
segments.

Changed to use \w with re.UNICODE flag which properly matches:
- All Unicode letter characters (including accented/diacritic characters)
- Digits from all scripts
- Excludes underscore (which \w includes but we want as splitter)

This fix enables proper word grouping for German, French, Polish, and
other languages with accented characters while maintaining backward
compatibility with existing ASCII text processing.

Example: 'Grüßen' now stays as one word instead of ['Gr', 'üß', 'en']

* fix: resolve pytest warning by using assert instead of return
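
The difference between the old `[a-zA-Z0-9]` class and the Unicode-aware `\w` described in the commit above can be reproduced with a small sketch. It is not the real `get_word_info()` logic: `split_runs` is a hypothetical helper, and `[^\W_]` is assumed here as one way to express "`\w` without the underscore". In Python 3, `str` patterns are already Unicode-aware, so `re.UNICODE` is kept only to mirror the commit message.

```python
import re

# Old behaviour: ASCII letters and digits only.
ASCII_CHAR = re.compile(r"[a-zA-Z0-9]")
# New behaviour (assumed form): any Unicode word character except underscore.
WORD_CHAR = re.compile(r"[^\W_]", re.UNICODE)


def split_runs(text: str, pattern: re.Pattern) -> list[str]:
    """Split text into maximal runs of matching and non-matching characters."""
    runs, current, state = [], "", None
    for ch in text:
        matches = bool(pattern.match(ch))
        # Start a new run whenever the match state flips.
        if current and matches != state:
            runs.append(current)
            current = ""
        current += ch
        state = matches
    if current:
        runs.append(current)
    return runs


print(split_runs("Grüßen", ASCII_CHAR))  # ['Gr', 'üß', 'en']
print(split_runs("Grüßen", WORD_CHAR))   # ['Grüßen']
print(split_runs("foo_bar", WORD_CHAR))  # ['foo', '_', 'bar']
```

The first call reproduces the split reported in the commit message, and the last one shows the underscore still acting as a splitter.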
Automatically generated security fix

Co-authored-by: orbisai0security <orbisai0security@users.noreply.github.com>
Co-authored-by: Lin Manhui <mhlin425@whu.edu.cn>
…arseQHead (PaddlePaddle#17019)

Co-authored-by: Lin Manhui <mhlin425@whu.edu.cn>
* Modify parameters

* Change explanation to Description
* docs: Update the parameter description format in PP-DocTranslation and other pipeline docs

* Update doc_preprocessor.md

* Update parameters and descriptions in PP-DocTranslation

* Clarify usage of predict() method in doc_preprocessor

Removed redundant explanation about predict_iter() method and clarified the usage of predict() method.
* Modify module list

* Update doc_vlm.md

* Update table_classification.md

* Update table_structure_recognition.md

* Update seal_text_detection.md
* Modify table parameter descriptions

* Modify table parameter descriptions

* Modify PaddleOCR-VL docs

* Enhance OCR.md with parameter explanations

Added detailed descriptions for various parameters related to device selection, inference settings, and performance optimizations in the OCR documentation.

* Update PP-ChatOCRv4.md

* Update PaddleOCR-VL documentation for save_to_json()

Clarify the behavior of save_to_json() method regarding output paths and numpy array conversion.

* Fix default value for layout_nms in documentation

Updated default initialization value for layout_nms parameter to True.

---------

Co-authored-by: cuicheng01 <45199522+cuicheng01@users.noreply.github.com>
* Modify English pipeline list docs

* Fix duplicate entry in PP-DocTranslation documentation

Removed duplicate entry for saving visualization results.

* Change explanation to Description
* Modify English model list

* Change explanation to Description

* Fix incorrect entries
* add iluvatar docs

* add npu doc

* add npu en doc

* delete safetensors desc

* update

* revert dockerfile

* update

---------

Co-authored-by: Bobholamovic <bob1998425@hotmail.com>
Co-authored-by: Lin Manhui <mhlin425@whu.edu.cn>
@paddle-bot

paddle-bot bot commented Jan 19, 2026

Thanks for your contribution!

@Bobholamovic Bobholamovic changed the base branch from main to release/3.3 January 19, 2026 13:10
@Bobholamovic Bobholamovic deleted the cp branch January 19, 2026 13:49