Replies: 1 comment
At the moment, Markitdown doesn't automatically distinguish or filter out headers, footers, or page numbers inserted into the body during PDF-to-markdown conversion.
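As a workaround, you could post-process the converted text yourself. Below is a minimal sketch, not a Markitdown feature: the function name `strip_headers_footers`, the `min_repeats` threshold, and the page-number pattern are all assumptions of mine. It drops lines that recur across the document (after masking digits, so footers that embed a page number still match each other) and lines that consist only of a page number.

```python
import re
from collections import Counter

# Assumed page-number shapes: "12", "Page 12", "3 / 40", "Page 3 of 40".
PAGE_NUMBER = re.compile(r"^(?:page\s+)?\d+(?:\s*(?:/|of)\s*\d+)?$", re.IGNORECASE)


def _key(line: str) -> str:
    """Normalize a line so that 'Manual - 1' and 'Manual - 2' compare equal."""
    return re.sub(r"\d+", "#", line.strip())


def strip_headers_footers(markdown_text: str, min_repeats: int = 3) -> str:
    """Remove lines that look like repeated headers/footers or bare page numbers.

    Hypothetical helper: a line is treated as a header/footer if its
    digit-masked form occurs at least `min_repeats` times in the text.
    """
    lines = markdown_text.splitlines()
    # Only count lines containing letters, so table rules like "| --- |" are ignored.
    counts = Counter(_key(l) for l in lines if re.search(r"[A-Za-z]", l))

    kept = []
    for line in lines:
        if PAGE_NUMBER.match(line.strip()):
            continue  # bare page number
        if counts.get(_key(line), 0) >= min_repeats:
            continue  # repeated header or footer
        kept.append(line)
    return "\n".join(kept)
```

You would call it on the converted text, for example on the `text_content` of the result returned by `MarkItDown().convert(...)`. Note that this is a heuristic: any legitimate body line that repeats `min_repeats` times or more, or that contains only a number, will be removed as well, so tune the threshold to your document's page count.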
Every page's header ("XYZ MANUAL"), footer ("XYT Manual - XXX"), and page number from my PDF have been inserted directly into the body text of the markdown.
Is there a way to have markitdown identify headers, footers, page indicators and similar elements, and prevent this behaviour?
Thank you!