feat: Enable transformations on PDFs #5172
Merged
29 commits
33f5d7f feat: Adding Docling RAG demo (franciscojavierarceo)
292f2c9 updated demo (franciscojavierarceo)
38f8c23 cleaned up notebook (franciscojavierarceo)
fea2a5e adding chunk id (franciscojavierarceo)
a33545b adding quickstart demo that is WIP and updating docling-demo to expor… (franciscojavierarceo)
8c7dc00 adding current tentative exmaple repo (franciscojavierarceo)
d1f4269 adding current temporary work (franciscojavierarceo)
6609cc3 updating demo script to rename things (franciscojavierarceo)
48f77e4 updated quickstart (franciscojavierarceo)
26919a3 added comment (franciscojavierarceo)
fb1bce4 checking in progress (franciscojavierarceo)
4340ca6 checking in progress for now, still have some issues with vector retr… (franciscojavierarceo)
0564f7d okay think i have most things working (franciscojavierarceo)
637df02 removing commenting and unnecessary code (franciscojavierarceo)
31e7f85 updated type mapping for PDFs (franciscojavierarceo)
7978df9 updated test case (franciscojavierarceo)
b249b87 updated unit test (franciscojavierarceo)
fdedcbe linter (franciscojavierarceo)
3a8fa01 limiting test run (franciscojavierarceo)
b8de03a missed import (franciscojavierarceo)
f83153c uploading demo (franciscojavierarceo)
1fbfdb1 only running on mac 13 (franciscojavierarceo)
3c85653 updated implementation to work with docling (franciscojavierarceo)
7b1f059 remove print statement (franciscojavierarceo)
b99530a using get_online_features instead (franciscojavierarceo)
09c73a7 demo working (franciscojavierarceo)
c416bcc removing files from other PR (franciscojavierarceo)
def595e removing files from other PR (franciscojavierarceo)
8379a97 reverting print (franciscojavierarceo)
Is it necessary to have a new type? Maybe just use BYTES?
Yes, running type inference on raw bytes will fail when using a transformation expecting PDF content.
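To make the point in the reply concrete, here is a minimal sketch of one way the problem can arise. It is not the code from this PR: `ValueKind`, `infer_kind`, and `pdf_transformation` are hypothetical names invented for illustration, and the check inside the transformation is an assumption about how a PDF-only transformation might validate its input.

```python
from enum import Enum


class ValueKind(Enum):
    """Hypothetical value-type tags; not the project's real enum."""

    BYTES = "bytes"
    PDF_BYTES = "pdf_bytes"


def infer_kind(value: object) -> ValueKind:
    """Infer a tag from the Python type alone.

    Every bytes value maps to BYTES, so PDF content is indistinguishable
    from any other binary payload at this point.
    """
    if isinstance(value, (bytes, bytearray)):
        return ValueKind.BYTES
    raise TypeError(f"unsupported value type: {type(value).__name__}")


def pdf_transformation(value: bytes, kind: ValueKind) -> str:
    """Stand-in for a transformation that only accepts PDF content."""
    if kind is not ValueKind.PDF_BYTES:
        raise TypeError(f"expected PDF_BYTES, got {kind.name}")
    return f"parsed {len(value)} bytes of PDF"


sample = b"%PDF-1.7 placeholder payload"

# Path 1: rely on inference. The value is tagged BYTES and gets rejected.
inferred = infer_kind(sample)
try:
    pdf_transformation(sample, inferred)
    inferred_ok = True
except TypeError:
    inferred_ok = False

# Path 2: declare the dedicated type up front. The transformation runs.
declared_result = pdf_transformation(sample, ValueKind.PDF_BYTES)
```

Under these assumptions, a dedicated tag lets the schema carry the intent explicitly, so the transformation does not have to guess from the raw payload.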