Commit

Remove trailing whitespace from paragraphs
jncraton committed Jan 23, 2025
1 parent 1e22202 commit 03391d6
Showing 1 changed file with 1 addition and 1 deletion.
2 changes: 1 addition & 1 deletion languagemodels/preprocess.py
@@ -54,7 +54,7 @@ def handle_data(self, data):
             self.paras.append(data)

         def get_plain(self):
-            return "\n\n".join([p for p in self.paras if len(p) > 140])
+            return "\n\n".join([p.rstrip() for p in self.paras if len(p) > 140])

     extractor = ParagraphExtractor()
     extractor.feed(src)
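
The effect of this change can be sketched in isolation. The snippet below is an illustration only, not code from the repository: `join_before`, `join_after`, and the sample paragraphs are hypothetical names and data, and only the two join expressions are taken from the diff.

```python
# Hypothetical helpers that isolate the two join expressions from the diff.
# Only the list comprehensions come from the commit; everything else is
# made up for illustration.

def join_before(paras):
    # Old behavior: paragraphs are emitted exactly as collected.
    return "\n\n".join([p for p in paras if len(p) > 140])


def join_after(paras):
    # New behavior: trailing whitespace is stripped from each kept paragraph.
    return "\n\n".join([p.rstrip() for p in paras if len(p) > 140])


# Sample data: two long paragraphs with trailing whitespace, one short one.
paras = ["x" * 150 + "  \n", "short", "y" * 141 + "\t"]

before = join_before(paras)
after = join_after(paras)

# The length filter still runs on the unstripped text, so a paragraph that
# exceeds 140 characters only because of trailing whitespace is kept and
# then emitted in its shorter, stripped form.
edge = join_after(["z" * 139 + "  "])
```

Note that the `len(p) > 140` check is unchanged by the commit, so which paragraphs are kept stays the same; only their trailing whitespace differs in the output.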