Releases: docling-project/docling

v2.70.0

23 Jan 15:22

Feature

Fix

  • md: Handle pipe symbols that are not table markers (#2904) (86eaef5)
  • Remove direct vllm dependency (#2910) (7a1952a)
  • PPTX parsing: bullet points not grouped correctly under subheadings (#2663) (#2855) (999dbb2)

Documentation

  • Add comprehensive docstrings to PdfPipelineOptions (#2827) (ab91786)

v2.69.1

21 Jan 12:45

Fix

  • Off-by-one error for page indexing in vlm_pipeline (#2902) (08f49e2)

v2.69.0

20 Jan 11:14

Feature

Fix

Documentation

  • Correct broken link to supported formats (#2878) (16e88d5)

v2.68.0

13 Jan 10:46

Feature

  • Support for DeepSeek-OCR in VLM pipeline (#2798) (19af03f)

Fix

  • logging: Include page numbers in preprocess error messages (#2858) (89bea24)
  • docx: Handle grouped pictures (#2861) (5c1f8f0)

Documentation

  • Fix Colab badge links and Weaviate typo in docs examples (#2871) (72851cc)
  • example: Update sample image path to be relative (#2864) (211c759)
  • Add Semantica integration (#2860) (bf80e32)

v2.67.0

09 Jan 08:27

Feature

Fix

  • Lock new deps and update Python 3.14 warnings (#2844) (d9295df)
  • Correct type hint for table_structure_options usage (#2823) (a0530a2)
  • Transformers models lazy-loaded (#2826) (3ef4525)
  • Font download by passing font_path to RapidOcr (#2822) (ffafe58)
  • cli: Add Layout and Table models to --show-external-plugins (#2832) (ed57089)

v2.66.0

24 Dec 10:46

Feature

  • Add preset for using granite-docling via vLLM and other APIs (#2792) (241d19e)

Fix

  • docx: Handle tables with merged cells causing IndexError (#2813) (faff935)
  • markdown: Allow text before headers also in mixed markdown and html (#2801) (595115d)

Documentation

v2.65.0

15 Dec 16:55

Feature

Fix

  • rapidocr: Use correct parameter name for rec_keys_path (#2762) (1d78418)
  • docx: Handle missing value in paragraph style name (#2761) (a97d950)

Documentation

  • Add Pydantic field documentation for PipelineOptions (#2771) (7c24b01)
  • gpu: Add benchmarks of standard pipeline with OCR (#2764) (d03439c)

v2.64.1

09 Dec 09:06

Fix

  • Clear word/char cells when force_full_page_ocr is used (#2738) (1df0560)
  • Add missing font download in the rapidocr artifacts (#2735) (edbabfc)
  • Ensure proper image_scale for generated page images in VLM pipelines (#2728) (609069d)
  • html: Tackle paragraphs with block-level elements (#2720) (d007ba0)
  • html: Prevent hierarchy reset in rich table cells (#2716) (aebe25c)
  • docx: Parse integrals as n-ary objects without chr element (#2712) (c97715f)

v2.64.0

02 Dec 11:25

Feature

  • experimental: Add experimental TableCropsLayoutModel (#2669) (1344362)
  • Factory and plugin-capability for Layout and Table models (#2637) (ad97e52)

Fix

  • InputFormat.IMAGE must have correct pipeline (#2707) (6ef4ffd)
  • Do not consider singleton cells in xlsx as TableItems but rather TextItems (#2589) (54cd6d7)
  • docx: Missing list items after numbered header (#2665) (e580554)

Documentation

  • Example on how to apply external OCR as post processing (#2517) (fa21128)
  • More GPU results and improvements in the example docs (#2674) (b75c646)
  • Fix typo on jobkit page (#2671) (146b4f0)

v2.63.0

20 Nov 14:42

Feature

Fix

  • Respect document_timeout in new threaded StandardPdfPipeline (#2653) (2087c6b)
  • Make the nullable name parameter optional in DocumentConverter.convert_string() (#2660) (6fb9a5f)
  • Enable GPU for RapidOCR when available (#2659) (463a3fd)
  • Remove py3.14 requirement for default rapidocr (#2639) (da4c2e9)

Documentation

  • Add Hector as compatible AI agent platform integration (#2662) (ce5a099)
  • Add documentation for using SuryaOCR via the docling-surya plugin (#2533) (b216ad8)
  • Fix broken homepage links (#2651) (03e7c7d)
  • examples: Processing parquet file of images (#2641) (8af228f)
  • Move Installation and Quickstart (Usage) under Getting started (#2644) (d549445)
  • Add redirection from getting started page (#2640) (ac9fc58)
  • examples: Remove deprecation warnings with export_to_dataframe (#2638) (f552862)