Improvements in data structure (keywords and image added) and corrections in href logic #12

morissonmaciel · 2018-11-10T16:04:10Z

Hello, @thomastuts,
I made some changes to article-extractor to improve the data structure returned from its main extractArticle function.

First I included two new attributes:

keywords
Obtained from tags related to the "keywords" name and the "swiftype > keys" variant, which is commonly used in articles on the web (see Engadget.com and all Vox Media article pages).
image
Obtained from one of two sources: a scored ranking of all <img> elements in the <body> or <main> section, or otherwise from tags related to the "swiftype > image" variant.
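
To illustrate the two lookups described above, here is a minimal sketch; it is not the code from this pull request. It assumes the meta tags and images have already been parsed into plain objects, that the Swiftype variant appears as meta tags with class "swiftype" and name "keys", and that images are scored by pixel area. The function names pickKeywords and pickImage, the object shapes, and the scoring rule are all hypothetical.

```javascript
// Hypothetical input shape: meta tags parsed into { name, className, content }.
// The class/name pairing used for the Swiftype variant is an assumption.
function pickKeywords(metaTags) {
  // Split a comma-separated list, trim each entry, and drop empty entries.
  const splitList = (value) =>
    value.split(',').map((item) => item.trim()).filter(Boolean);

  // Prefer the standard <meta name="keywords" content="a, b, c"> tag.
  const standard = metaTags.find((tag) => tag.name === 'keywords' && tag.content);
  if (standard) return splitList(standard.content);

  // Otherwise collect the content of every Swiftype "keys" tag.
  return metaTags
    .filter((tag) => tag.className === 'swiftype' && tag.name === 'keys' && tag.content)
    .flatMap((tag) => splitList(tag.content));
}

// Hypothetical scoring rule: rank images by width * height.
// The actual ranking criteria are not stated in this description.
function pickImage(images, swiftypeImage) {
  const scored = images
    .filter((img) => img.src)
    .map((img) => ({ src: img.src, score: (img.width || 0) * (img.height || 0) }))
    .sort((a, b) => b.score - a.score);

  // Use the best-scoring image when one has a usable size.
  if (scored.length > 0 && scored[0].score > 0) return scored[0].src;

  // Fall back to the Swiftype image value, or null when nothing is available.
  return swiftypeImage || null;
}
```

With this sketch, a page that only exposes Swiftype tags still yields keywords, and a page without any sized <img> elements falls back to the Swiftype image.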

I also changed how author data is obtained, using tags related to the "swiftype > blogger_name" variant.

Note: the author image can be obtained from blogger_image and may be pushed to a new metadata property in a future improvement.
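
The author lookup can be sketched in the same spirit. This is an illustration only: it assumes meta tags parsed into plain objects, assumes a standard name="author" tag is tried first, and assumes the Swiftype variant is a meta tag with class "swiftype" and name "blogger_name". The function name pickAuthor is hypothetical.

```javascript
// Hypothetical input shape: meta tags parsed into { name, className, content }.
function pickAuthor(metaTags) {
  // Return the trimmed content of the first matching tag, or null.
  const find = (predicate) => {
    const tag = metaTags.find((t) => predicate(t) && t.content);
    return tag ? tag.content.trim() : null;
  };

  return (
    // Assumption: a standard <meta name="author"> tag takes precedence.
    find((t) => t.name === 'author') ||
    // Assumption: the Swiftype variant uses class "swiftype", name "blogger_name".
    find((t) => t.className === 'swiftype' && t.name === 'blogger_name') ||
    null
  );
}
```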

The documentation was slightly improved to cover these new fields, and the minor version was incremented to 1.1.0.

morissonmaciel added 5 commits November 10, 2018 13:36

-- Improvments and corrections

8dfd7d1

-- Increased minor version

7e967b9

Update README.md

71cc5fa

-- Updated in code documentation

7f00d07

Merge branch 'master' of https://github.com/morissonmaciel/article-extractor

3f9556b
