
Question marks at the end swallowed #39

@dakinggg


Looks like the example with just question marks is good now:

>>> segmenter.segment("??")
['??']

But the example with double question marks as the last token of a sentence still loses them:

>>> segmenter.segment("T stands for the vector transposition. As shown in Fig. ??")
['T stands for the vector transposition.', 'As shown in Fig.']

This looks like the minimal repro:

>>> segmenter.segment("Fig. ??")
['Fig.']
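
A rough sketch of how this class of bug could be caught in a regression test: check that the segments, joined back together, still contain every non-whitespace character of the input. `preserves_text` is a hypothetical helper written for illustration, not part of the segmenter's API, and the inputs and outputs below are copied from the report above rather than produced by calling the library.

```python
def preserves_text(original, segments):
    """Return True if the segments keep every non-whitespace character
    of the original text, in the same order."""
    def squash(text):
        # Drop all whitespace so differences in spacing between
        # sentences do not count as lost text.
        return "".join(text.split())

    return squash(original) == squash("".join(segments))


# Outputs as reported in this issue (hard-coded, not recomputed).
print(preserves_text("??", ["??"]))         # True
print(preserves_text("Fig. ??", ["Fig."]))  # False: the "??" was dropped
```

The whitespace-insensitive comparison is a deliberate simplification: it flags swallowed tokens like the trailing `??` without failing on segmenters that trim spaces between sentences.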

Labels: bug, edge-cases (update rules to account for the edge cases)
