refactor: make office docs optional and reduce core dependency footprint #255
Open
mlikasam-askui wants to merge 3 commits into main
- Move MarkItDown to new `office_document` extra and lazy-load in markdown conversion
- Remove bson usage; generate time-ordered IDs via `time_ns` + UUID fragment
- Promote `pure-python-adb` to default deps; replace `android` extra in `all`
- Relax Python constraint to `>=3.10` and align setup/readme docs
- Remove obsolete mypy ignore for `bson`
all = ["askui[android,bedrock,otel,vertex,web]"]
android = [
"pure-python-adb>=0.3.0.dev0"
all = ["askui[office-document,bedrock,otel,vertex,web]"]
Contributor
Why is `android` no longer part of the `all` group?
Contributor
Ah, I see: because it is now a default, right?
Contributor
Author
Yes, it's included by default. We should also start using `pip install askui` without the `all` extra, especially on Windows, to improve installation speed.
str: Time-ordered ID string
"""
return f"{prefix}_{str(bson.ObjectId())}"
Contributor
I do not know what bson was doing or what effect removing it has. Out of curiosity, can you explain?
Contributor
Author
bson was the reason the SDK was not compatible with Python 3.14 and later. With it removed, the imagehash library is now the remaining cause of incompatibility.
Summary
This PR reduces the default installation footprint and clarifies optional dependency usage.
Dependency changes
- Removed `markitdown` from core dependencies and introduced optional extra: `office_document`
- Updated `all` extra to include `office-document` (and no longer depend on the removed `android` extra)
- Moved `pure-python-adb` into default dependencies
- Removed `bson` from core dependencies
Runtime/code changes
- `generate_time_ordered_id()` no longer uses `bson.ObjectId`; now builds IDs from `time.time_ns()` + UUID suffix
- `convert_to_markdown()` now imports `markitdown` lazily and raises a clear install hint: `pip install "askui[office-document]"`
Documentation/config updates
- README documents the default install (`pip install askui`) and explains optional extras
- `docs/10_extracting_data.md` explicitly notes Excel/Word (`OfficeDocumentSource`) requires `office-document`
- `docs/01_setup.md` updated Python requirement text
- `pyproject.toml` / `pdm.lock` synchronized with new extras + deps
- Removed obsolete `mypy` ignore section for `bson`
Why
This keeps the base package lighter, avoids forcing Office-conversion dependencies on all users, and makes Office document support explicit and discoverable.