Mine hard negatives: optionally output similarity scores by tsbalzhanov · Pull Request #3506 · huggingface/sentence-transformers

tsbalzhanov · 2025-08-30T14:45:33Z

Hello

This PR adds an option to include similarity scores in the result of the mine_hard_negatives function.
This option might be helpful for tuning the parameters of the mining function without needing to recalculate the scores, or for moving the negative-selection logic outside the mining function.
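As a rough illustration of the second use case, the sketch below re-filters candidates that have already been scored, so the thresholds can be changed without running the model again. The helper `reselect_negatives` and its parameter names are hypothetical and chosen only for this example; it is not part of sentence-transformers and does not reproduce the library's actual selection logic.

```python
from typing import List, Optional, Tuple


def reselect_negatives(
    candidates: List[str],
    scores: List[float],
    positive_score: float,
    absolute_margin: Optional[float] = None,
    max_score: Optional[float] = None,
    num_negatives: int = 3,
) -> List[Tuple[str, float]]:
    """Hypothetical helper: pick hard negatives from precomputed scores."""
    kept = []
    # Visit the highest-scoring (hardest) candidates first.
    ranked = sorted(zip(candidates, scores), key=lambda pair: pair[1], reverse=True)
    for text, score in ranked:
        # Drop candidates above an absolute similarity ceiling.
        if max_score is not None and score > max_score:
            continue
        # Drop candidates that score too close to the positive.
        if absolute_margin is not None and score > positive_score - absolute_margin:
            continue
        kept.append((text, score))
    return kept[:num_negatives]
```

Because the scores are stored next to the candidates, different values for the margin or ceiling can be tried on the same mined output.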

Tsyren Balzhanov

tomaarsen · 2025-09-22T08:17:17Z

Hello!

I think it's indeed a good idea to also allow exporting scores, but so far I've introduced that via the n-tuple-scores output format: https://github.com/UKPLab/sentence-transformers/blob/1def8d3d6289e72bfa6a6a48592b1342053e6ff2/sentence_transformers/util/hard_negatives.py#L209

If we instead add a parameter akin to include_scores, then we'll have to deprecate the n-tuple-scores presumably. That's not really an issue, though. I'll do some more thinking on it.

Tom Aarsen

tsbalzhanov · 2025-10-24T15:01:53Z

@tomaarsen

Hi, did you decide on what to do with output_format=n-tuple-scores?
I've just updated the PR: I rebased it on the current master branch and made output_format=n-tuple with include_scores=True equivalent to output_format=n-tuple-scores.

tomaarsen · 2025-10-24T15:48:21Z

Apologies for the delay. I think it would be preferable indeed to move towards output_scores and deprecate n-tuple-scores. If n-tuple-scores is passed, we can simply give a warning and set output_format="n-tuple" and include_scores=True indeed.
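A minimal sketch of the shim described above, assuming the `output_scores` name that the PR later settled on. The function `normalize_output_options`, the warning category, and the warning text are all assumptions made for illustration; the merged code may differ.

```python
import warnings
from typing import Tuple


def normalize_output_options(output_format: str, output_scores: bool = False) -> Tuple[str, bool]:
    """Hypothetical helper: map the deprecated format onto the new options."""
    if output_format == "n-tuple-scores":
        # Assumption: a DeprecationWarning; the library may use another category.
        warnings.warn(
            '`output_format="n-tuple-scores"` is deprecated. '
            'Use `output_format="n-tuple"` with `output_scores=True` instead.',
            DeprecationWarning,
            stacklevel=2,
        )
        return "n-tuple", True
    # Every other format passes through unchanged.
    return output_format, output_scores
```

Existing callers that pass the old format keep working and see one warning, while the rest of the function only has to handle the n-tuple case.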

I want to share that I'll be taking 3 weeks off starting Monday, so I won't be able to move this PR forward in the meantime. Apologies for this.

Tom Aarsen

tsbalzhanov · 2025-10-24T18:18:43Z

> If n-tuple-scores is passed, we can simply give a warning and set output_format="n-tuple" and include_scores=True indeed.

Okay, I've implemented this

tomaarsen · 2025-12-08T16:24:38Z

@tsbalzhanov
I made some more changes to be a bit more in line with the general format of Sentence Transformer training datasets:

The score and label (and also scores and labels for the CrossEncoder class) columns are "special" in ST: they're considered the label column and they're passed to the labels section of a loss. (In this PR I'm extending these special columns to all 4 for all classes)
a. This means that positive_scores and negative_scores wouldn't be considered labels and wouldn't be passed to a loss, which is not ideal.
b. Multiple columns that match the special columns are also not supported: it would be unclear whether the binarized "labels" or the "scores" should be used in the losses.

So, I've made it so it's either labels or scores, instead of both for labeled-pair and labeled-list.
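To make the ambiguity concrete, here is a small sketch of how a dataset's columns could be split into a label column and input columns. The helper `split_label_column` is hypothetical, and raising an error on multiple matches is a deliberate simplification for illustration; it is not a description of how the Sentence Transformers data collator actually behaves.

```python
from typing import List, Optional, Tuple

# The four special column names mentioned in this PR.
SPECIAL_LABEL_COLUMNS = ("label", "score", "labels", "scores")


def split_label_column(column_names: List[str]) -> Tuple[Optional[str], List[str]]:
    """Hypothetical helper: separate the label column from the input columns."""
    matches = [name for name in column_names if name in SPECIAL_LABEL_COLUMNS]
    if len(matches) > 1:
        # Assumption: fail loudly so that the ambiguity is visible.
        raise ValueError(f"Ambiguous label columns: {matches}")
    label_column = matches[0] if matches else None
    input_columns = [name for name in column_names if name != label_column]
    return label_column, input_columns
```

With only one of "labels" or "scores" present, the split is unambiguous, which is the property the either-or output is meant to preserve.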

I think the current implementation gives you all the outputs that you might want, while also working nicely out of the box with the Sentence Transformers trainers etc.
I hope you like the proposal here, I'd like to include it in the upcoming v5.2 release.

Tom Aarsen

tsbalzhanov · 2025-12-08T19:23:34Z

@tomaarsen

I think using scores without labels defeats the purpose of using them in the first place, because we need to be able to distinguish between hard negatives and positives when training a cross encoder.
The label information is the important bit; scores are very useful but, ultimately, secondary.

In the case of labeled-pair we can't distinguish between positives and negatives at all. In the cases of triplet, n-tuple and labeled-list we can use the position in the scores array, but that's not a great solution, because it relies on particularities of the current implementation, which might change in the future.

Is there some way to have both scores and labels included in the output?

tomaarsen · 2025-12-09T09:09:37Z

Thanks for your considered response. I agree completely that more information is almost always better, but this time it contradicts one of the goals of mine_hard_negatives: that it produces a dataset that immediately works with some loss(es) in Sentence Transformers. If there's both a labels and a scores column, in that order, then I think the scores would be included as a text column and only labels would be used as the labels for the loss. It would be rather confusing.

I also agree that the "gold" (human-annotated) positives vs negatives (i.e. labels) is valuable for some losses, while others prefer the "silver" (machine-generated) positives vs negatives (i.e. scores) for a more detailed range from not similar to extremely similar. But I can't really picture a situation where someone would want both simultaneously (except I suppose to experiment with multiple losses). We also recently added caching to embeddings, so it would be possible to cheaply rerun the hard negatives mining with different output formats (unless you're also using a CrossEncoder).

Do you know of a situation where you'd need both columns?

Tom Aarsen

Copilot

Pull request overview

This PR adds an optional output_scores parameter to the mine_hard_negatives function, allowing users to include similarity scores in the output dataset alongside the mined hard negatives. This enables fine-tuning mining parameters without recalculating scores and supports extracting selection logic outside the mining function. The PR also deprecates the n-tuple-scores output format in favor of using n-tuple with output_scores=True.

Key changes:

Added output_scores parameter to optionally include similarity scores in all output formats
Deprecated n-tuple-scores format with a migration path to n-tuple + output_scores=True
Updated data collator to recognize "labels" and "scores" as valid label columns

Reviewed changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 7 comments.

| File | Description |
| --- | --- |
| sentence_transformers/util/hard_negatives.py | Implements `output_scores` parameter, deprecates `n-tuple-scores` format, and adds score extraction logic for all output formats |
| tests/util/test_hard_negatives.py | Adds comprehensive test coverage for `output_scores` parameter across all output formats and validates deprecated format behavior |
| sentence_transformers/data_collator.py | Extends valid label columns to include "labels" and "scores" alongside existing "label" and "score" |
| docs/sentence_transformer/training_overview.md | Updates documentation to reflect new valid label column names |
| docs/cross_encoder/training_overview.md | Removes obsolete comparison note about label column differences |
| docs/cross_encoder/loss_overview.md | Documents ability to output similarity scores instead of binary labels |

Copilot

Pull request overview

Copilot reviewed 6 out of 6 changed files in this pull request and generated 3 comments.

tsbalzhanov marked this pull request as ready for review August 30, 2025 14:46

tsbalzhanov force-pushed the mine_hard_negatives branch from 4053087 to e0ac98e Compare October 24, 2025 14:33

Mine hard negatives: optionally output scores

61f4dc4

tsbalzhanov force-pushed the mine_hard_negatives branch from e0ac98e to 61f4dc4 Compare October 24, 2025 14:55

Deprecate "n-tuple-scores"

1d56932

tomaarsen added 2 commits December 8, 2025 16:51

Update include_scores -> output_scores, either labels or scores, neve…

1db0059

…r both And consider "scores" and "labels" special label columns for all model archetypes, not just CrossEncoder

Merge branch 'main' into pr-3506

a348ca3

tomaarsen approved these changes Dec 9, 2025

tomaarsen requested a review from Copilot December 10, 2025 12:57

Copilot started reviewing on behalf of tomaarsen December 10, 2025 12:58

Copilot AI reviewed Dec 10, 2025

tomaarsen added 2 commits December 10, 2025 14:55

Fix indexing error re. multiple positives

d7fb823

Update docstring with some fixes/consistency issues

b957c74

tomaarsen requested a review from Copilot December 10, 2025 14:01

Copilot started reviewing on behalf of tomaarsen December 10, 2025 14:01

Copilot AI reviewed Dec 10, 2025

Hard negs: expand test suite considerably, fix several issues re. mul…

f277c34

…tiple positives

tomaarsen requested a review from Copilot December 10, 2025 17:21

Copilot started reviewing on behalf of tomaarsen December 10, 2025 17:22

Copilot AI reviewed Dec 10, 2025

tomaarsen added 2 commits December 10, 2025 18:45

Add missing ` and . in warning

70ed0d8

Merge branch 'main' into mine_hard_negatives

5f5f08d

tomaarsen enabled auto-merge (squash) December 11, 2025 10:57

tomaarsen merged commit 32cb5de into huggingface:main Dec 11, 2025
17 checks passed
