CHANGELOG.md

0.5.0 / 2021-02-15

  • Internal refactoring for clarity

0.4.4 / 2021-02-13

  • Do a better job of stripping out script tags

0.4.3 / 2020-07-18

  • Update dependencies

0.4.2 / 2020-03-11

  • Update dependencies

0.4.1 / 2019-07-04

  • Fix bug in min_clause_words_filter (used in article_sentence_extractor)
  • Allow tests to run in Docker
  • Update CircleCI configuration so builds continue to work
  • Add architecture flow
  • Code formatting
  • Add min words filter specs
  • Add label action specs
  • Add missing test case to ignorable element spec
  • Add merge_next case to text block spec
  • DRY up includes

0.4.0 / 2017-09-15

  • Add KeepEverythingWithMinKWords Extractor
  • Add ArticleSentence Extractor

0.3.0 / 2017-09-12

  • Add LargestContent Extractor
  • Add KeepEverything Extractor
  • Add NumWordsRules Extractor
  • Add Canola Extractor

0.2.0 / 2017-09-11

  • Add Default Extractor
  • Tweak dependency to use Nokogiri 1.6.6.2 or newer
  • Add Apache 2.0 license to reflect original work by Christian Kohlschütter

0.1.1 / 2017-09-11

  • Fix newline character escaping bug

0.1.0 / 2017-09-08

  • Add Article Extractor