enlp.processing.stdtools.retain_spaces
enlp.processing.stdtools.retain_spaces(processed)
Retaining spaces around punctuation at the end of a sentence
Use this function after joining tokens into a string to restore the original spacing around sentence-final punctuation.
Without the function: lemma = ‘the quick brown fox jump over the lazy dog .’
With the function: lemma = ‘the quick brown fox jump over the lazy dog.’
- Parameters
    processed : str
        processed text string
- Returns
    updated_text : str
        processed sentence, updated so that the spacing around symbols matches the original
Notes
Only punctuation at the end of a sentence is handled. Other symbols, for example %, $ or #, are not accounted for.
Examples
>>> from enlp.processing.stdtools import retain_spaces
>>> tokens = ['Den', 'raske', 'brune', 'reven', 'hoppet', 'over', 'den', 'late', 'hunden', '.']
>>> joined_tokens = ' '.join(tokens)
>>> print('Original:', joined_tokens)
Original: Den raske brune reven hoppet over den late hunden .
>>> print('Fixed spaces:', retain_spaces(joined_tokens))
Fixed spaces: Den raske brune reven hoppet over den late hunden.
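The behaviour described above can be approximated with a single regular expression. The sketch below is an assumption for illustration only and is not the enlp implementation: the name `retain_spaces_sketch` and the choice of `.`, `!` and `?` as sentence-final punctuation are hypothetical. It also shows the limitation from the Notes, since a symbol such as % keeps its surrounding spaces.

```python
import re

# Hypothetical stand-in, NOT the enlp implementation: matches whitespace that
# precedes '.', '!' or '?' when that mark is followed by whitespace or the
# end of the string (i.e. the space left behind by ' '.join(tokens)).
_SPACE_BEFORE_END_PUNCT = re.compile(r"\s+([.!?])(?=\s|$)")


def retain_spaces_sketch(processed: str) -> str:
    """Drop the space before sentence-final punctuation in a joined string."""
    return _SPACE_BEFORE_END_PUNCT.sub(r"\1", processed)


tokens = ['the', 'quick', 'brown', 'fox', 'jump', 'over', 'the', 'lazy', 'dog', '.']
fixed = retain_spaces_sketch(' '.join(tokens))
print(fixed)    # the quick brown fox jump over the lazy dog.

# Other symbols are deliberately left alone, mirroring the limitation in Notes.
percent = retain_spaces_sketch('growth of 5 % this year .')
print(percent)  # growth of 5 % this year.
```

The lookahead `(?=\s|$)` keeps the pattern from touching a period that sits inside a token, such as a decimal number, which would not be sentence-final.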