⚠️ This post links to an external website. ⚠️
In computer vision, document layout analysis is the process of identifying and categorizing the regions of interest in the scanned image of a text document. A reading system requires the segmentation of text zones from non-textual ones and the arrangement in their correct reading order. Detection and labeling of the different zones (or blocks) as text body, illustrations, math symbols, and tables embedded in a document is called geometric layout analysis. But text zones play different logical roles inside the document (titles, captions, footnotes, etc.) and this kind of semantic labeling is the scope of the logical layout analysis. – Wikipedia
In our case, we work with the PDF document itself rather than an image representation of it. The following categories of tools are available:
- Word extractors
- Page segmenters
- Reading order detectors
- Other layout tools
- Export – Viewing/exporting the results of document layout analysis
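To make the first three categories concrete, here is a minimal sketch in Python of how they could chain together: words with bounding boxes go in, they are grouped into lines (a very crude form of page segmentation), and the lines are then put into reading order. This is only an illustration under stated assumptions and not the API of the linked project: the `Word` class, the function names, and the `tolerance` parameter are all hypothetical, and the coordinates assume the usual PDF convention of an origin at the bottom-left with y increasing upwards.

```python
from dataclasses import dataclass


@dataclass
class Word:
    """Hypothetical output of a word extractor: text plus a bounding box."""
    text: str
    x0: float  # left
    y0: float  # bottom
    x1: float  # right
    y1: float  # top


def group_into_lines(words, tolerance=0.5):
    """Greedy segmentation: cluster words into lines by vertical centre.

    A word joins the current line when its centre is within
    `tolerance * word height` of the centre of that line's first word.
    """
    lines = []
    # Visit words from the top of the page to the bottom.
    for word in sorted(words, key=lambda w: -(w.y0 + w.y1) / 2):
        centre = (word.y0 + word.y1) / 2
        height = word.y1 - word.y0
        if lines:
            first = lines[-1][0]
            line_centre = (first.y0 + first.y1) / 2
            if abs(centre - line_centre) <= tolerance * height:
                lines[-1].append(word)
                continue
        lines.append([word])
    return lines


def reading_order(lines):
    """Naive reading order: lines stay top to bottom, words go left to right."""
    return [sorted(line, key=lambda w: w.x0) for line in lines]


def page_text(words):
    """Run the whole toy pipeline and return the page as plain text."""
    ordered = reading_order(group_into_lines(words))
    return "\n".join(" ".join(w.text for w in line) for line in ordered)
```

A single-column page with roughly horizontal text is the only layout this handles. Multi-column pages, rotated text, and tables are exactly where dedicated page segmenters and reading order detectors earn their keep.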
Continue reading on github.com