Does not properly segment within quotations #118
Comments
I confirm the... bug? Not sure if it is a bug or intentional but
I was expecting
It almost does the right thing when using single quotes. Almost. The sentence is split correctly, but the terminating single quote is treated as a separate sentence.
diasks2/pragmatic_segmenter#13 and #45 seem to have the answer: it was a design choice. But I wonder if we could have an option to change that behavior, as a parameter in the constructor?
Not sure if this is related, but a text like
results in a single span:
whereas I would expect two.
When dealing with a long statement of facts quoted from legal text, the text is not split into sentences between the left double quotation mark (“) and the right double quotation mark (”). These are different from the straight " character. I cannot share the text here as it deals with sensitive content.
import pysbd
seg = pysbd.Segmenter(language='en')
sentences = seg.segment(above_text)
This returns a list of length 1 and does not split the text into sentences. The expected behavior is to split the text into sentences within the quotations.
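Until the library offers an option for this, one possible workaround is to pull out the curly-quoted spans and segment their contents separately. The following is only a rough sketch under stated assumptions: `segment_inside_quotes` and `naive_split` are hypothetical helpers written for illustration and are not part of pysbd, the quotes are assumed to be non-nested “ … ” pairs, and the quotation marks themselves are dropped from the output.

```python
import re
from typing import Callable, List

# Matches a span wrapped in left/right double quotation marks (U+201C ... U+201D).
# Assumption: quotes are balanced and not nested.
CURLY_QUOTED = re.compile("\u201c([^\u201d]*)\u201d")


def segment_inside_quotes(text: str, segment: Callable[[str], List[str]]) -> List[str]:
    """Hypothetical helper: segment the text outside and inside curly quotes separately."""
    sentences: List[str] = []
    pos = 0
    for match in CURLY_QUOTED.finditer(text):
        before = text[pos:match.start()]
        if before.strip():
            sentences.extend(segment(before.strip()))
        inner = match.group(1)
        if inner.strip():
            # Segment the quoted content on its own, without the quote marks.
            sentences.extend(segment(inner.strip()))
        pos = match.end()
    tail = text[pos:]
    if tail.strip():
        sentences.extend(segment(tail.strip()))
    return [s.strip() for s in sentences if s.strip()]


def naive_split(text: str) -> List[str]:
    """Stand-in segmenter so the sketch runs without pysbd installed."""
    return re.split(r"(?<=[.!?])\s+", text.strip())


sample = "He said: \u201cFirst fact. Second fact.\u201d Then he left."
result = segment_inside_quotes(sample, naive_split)
```

To try this with pysbd, pass `pysbd.Segmenter(language='en').segment` in place of `naive_split`. Note that this approach also breaks the sentence that introduces the quotation away from the quoted text, which may or may not be what you want.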