Releases: B-Souty/html2dict

Html2Dict

19 Aug 16:39
17dfdad
Html2Dict Pre-release

Installation

Html2Dict is available on PyPI. Install it with pip3 install html2dict

Changelog

This release completely changes how the script is used.

New Structure

The script is now a package, 'html2dict', which includes 3 new modules:

  • resources
  • base_extractor
  • extractors

New classes

  • Table: A Table object holds information about a table, including its name,
    header row and data rows.
  • TableExtractor: The skeleton for more advanced extractor classes.
  • BasicTableExtractor: The basic extractor, which extracts data as plain text.

3 New constructors

  • my_extractor = BasicTableExtractor.from_html_string(html_string=<html_string>)

  • my_extractor = BasicTableExtractor.from_html_file(html_file=<relative_or_absolute_filepath>)

  • my_extractor = BasicTableExtractor.from_url(url=<url>)
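
The three constructors above follow the usual "alternative constructor" pattern: each one obtains an HTML string from a different source and hands it to the same extractor. The following is a minimal, standard-library-only sketch of that pattern, not html2dict's actual implementation. SketchTable, SketchTableExtractor and the tables attribute are hypothetical names invented for illustration. Only the constructor names and their keyword arguments are taken from the list above, and these notes do not state the import path for the real classes.

```python
from dataclasses import dataclass, field
from html.parser import HTMLParser
from pathlib import Path
from urllib.request import urlopen


@dataclass
class SketchTable:
    """Hypothetical stand-in for a table: a name, a header row, data rows."""
    name: str
    headers: list = field(default_factory=list)
    rows: list = field(default_factory=list)


class _PlainTextTableParser(HTMLParser):
    """Collects the plain text of every <th>/<td> cell, table by table."""

    def __init__(self):
        super().__init__()
        self.tables = []
        self._row = None         # cells of the row being read, or None
        self._cell = None        # text fragments of the open cell, or None
        self._is_header = False  # True once the current row contains a <th>

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            self.tables.append(SketchTable(name=f"table_{len(self.tables)}"))
        elif tag == "tr":
            self._row, self._is_header = [], False
        elif tag in ("th", "td") and self._row is not None:
            self._cell = []
            self._is_header = self._is_header or tag == "th"

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("th", "td"):
            if self._cell is not None and self._row is not None:
                self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr":
            if self._row is not None and self.tables:
                table = self.tables[-1]
                # The first row containing <th> cells becomes the header row.
                if self._is_header and not table.headers:
                    table.headers = self._row
                else:
                    table.rows.append(self._row)
            self._row = None


class SketchTableExtractor:
    """Hypothetical extractor; every constructor ends up in __init__."""

    def __init__(self, html_string):
        parser = _PlainTextTableParser()
        parser.feed(html_string)
        parser.close()
        self.tables = parser.tables

    @classmethod
    def from_html_string(cls, html_string):
        return cls(html_string)

    @classmethod
    def from_html_file(cls, html_file):
        # Assumes a UTF-8 encoded file; relative or absolute path.
        return cls(Path(html_file).read_text(encoding="utf-8"))

    @classmethod
    def from_url(cls, url):
        with urlopen(url) as response:
            charset = response.headers.get_content_charset() or "utf-8"
            return cls(response.read().decode(charset, errors="replace"))


html = ("<table><tr><th>Name</th><th>Age</th></tr>"
        "<tr><td>Ada</td><td>36</td></tr></table>")
extractor = SketchTableExtractor.from_html_string(html_string=html)
print(extractor.tables[0].headers)  # ['Name', 'Age']
print(extractor.tables[0].rows)     # [['Ada', '36']]
```

Keeping the parsing in one place and letting each class method only fetch the HTML means a new source needs one extra constructor and no changes to the extraction logic.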