Skip to content

Loosen package dependencies#166

Merged
tonywu71 merged 7 commits intomainfrom
loosen-package-deps
Jan 10, 2025
Merged

Loosen package dependencies#166
tonywu71 merged 7 commits intomainfrom
loosen-package-deps

Conversation

@tonywu71
Copy link
Contributor

@tonywu71 tonywu71 commented Jan 9, 2025

Description

Loosen default dependencies, but keep stricter dep ranges for the train dependency group.

Features

Added

  • Add expected scores in ColPali E2E test

Changed

  • Loosen package dependencies

Notes

With the new dependencies, I have observed some flakiness similar to what I had observed previously in huggingface/transformers#33736.

In the test in tests/models/paligemma/colpali/test_colpali_e2e.py, I have slightly different scores depending on the installed dependencies.

With the train dependency group:

[
    [15.8125, 7.1562, 15.2500],
    [11.5625, 15.9375, 10.5000],
    [14.9375, 12.6875, 20.7500],
]

Without the train dependency group:

[
    [16.5000, 7.5938, 15.6875],
    [12.0625, 16.2500, 11.1250],
    [15.2500, 12.6250, 21.0000],
]

From a quick analysis, this is probably due to the versions of transformers and accelerate.

Note that I have allowed a larger tolerance in the E2E test to make the test pass despite the delta in scores.

@tonywu71 tonywu71 added the enhancement New feature or request label Jan 9, 2025
@tonywu71 tonywu71 self-assigned this Jan 9, 2025
@tonywu71 tonywu71 linked an issue Jan 9, 2025 that may be closed by this pull request
@tonywu71 tonywu71 force-pushed the loosen-package-deps branch from a22b636 to 62518cc Compare January 9, 2025 15:24
@tonywu71 tonywu71 requested a review from ManuelFay January 9, 2025 17:10
Copy link
Collaborator

@ManuelFay ManuelFay left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we need to run this with a clean install and test (1) at inference time (2) a training run

"peft>=0.11.0,<0.12.0",
"pillow>=9.2.0,<11.0.0",
"numpy",
"peft>=0.11.0",
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

bump to 12 but in this case, still bind 13 (or latest).
And test !!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is bumping to 12 really necessary?

"datasets>=2.19.1",
"mteb>=1.16.3,<1.17.0",
"typer>=0.12.3, <1.0.0",
"peft>=0.11.0,<0.12.0",
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what's the idea of having double requirements here and above ? is there something I don't know about ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The point, which was raised by @galleon, is that our users might want to work with loosened dependencies, e.g. what if someone wants to use colpali-engine with the latest version of `transformers?

My proposition is:

  • when users want to train their models, they should install colpali-engine[train] where we can make sure that training behaves as planned i.e. wit a strict dep config
  • when users want to use our models they should install colpali-engine where we have loosened the deps.

While there might be some minor discrepancies in the results, I think it's inevitable and that my proposition achieves a good happy medium. But happy to hear your opinion and/or your replacement solutions on this!

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree that it would be nice to have less strict restrictions on inference only usage than training.
Idk how the override behavior works in pyproject depending on the optional dependencies.

I still think we need to upper bond the transformers version. People that want to use with newer versions can always install colpali-engine with --no-deps but otherwise at terms this will just lead to bugs I fear...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds like a good compromise, thanks!

So following your propositions, I have added the upper bound for transformers in the default package configuration and restored the correct upper bounds in the train config. Does the PR look ready to be merged now?

FYI I've tested the pyproject override behavior and it works just as planned: each dep takes the intersection of lower/upper bounds defined in the different groups. So installing colpali-engine[train] takes the intersection of the default deps and the ones defined in the train group.

@tonywu71 tonywu71 requested a review from ManuelFay January 10, 2025 12:42
pyproject.toml Outdated
"typer>=0.12.3, <1.0.0",
"peft>=0.11.0,<0.12.0",
"pillow>=9.2.0,<11.0.0",
"transformers>=4.46.1,<4.47.0",
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tonywu71 does it make sense to have transformers here as it is already in dependencies ?

Otherwise a medium ground solution would be to have transformers >=4.46.1 only in dependencies and keep the upper bound for training.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the 1st point, this line is indeed redundant. While it doesn't change anything, I'll remove it for clarity.

For the 2nd point, I'm going with @ManuelFay's proposition: we'll cap transformers and if users really want to use a higher version they'll use pip install --no-deps colpali-engine.

@ManuelFay
Copy link
Collaborator

So from my POV, if dependency resolving are tested and work like you say, we can merge - for 99% o f people who just do inference, it should be fine !

I am not super happy about it, just bumping all versions would be the cleanest, and I am pretty sure there will some compatibility problems for people training, but I am aware this is not immediate and not maintainable from our POV so let's do a minor like this and be ready to fix stuff if training deps break !

At terms, would be nice to bump to peft 0.12 (or later) after testing as well (todo list).
As is, let's merge !

@tonywu71 tonywu71 merged commit 0bf8c4d into main Jan 10, 2025
5 checks passed
@tonywu71 tonywu71 deleted the loosen-package-deps branch January 10, 2025 14:26
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

enhancement New feature or request

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Not working with smolagents

3 participants