MatthewAndreTaylor/xml-to-pydict

xmlpydict 📑

Requirements

  • Python 3.7+

Installation

To install xmlpydict using pip:

pip install xmlpydict

Quickstart

>>> from xmlpydict import parse
>>> parse("<package><xmlpydict language='python'/></package>")
{'package': {'xmlpydict': {'@language': 'python'}}}
>>> parse("<person name='Matthew'>Hello!</person>")
{'person': {'@name': 'Matthew', '#text': 'Hello!'}}

Goals

Create a consistent parsing strategy between XML and Python dictionaries. xmlpydict takes a more laid-back approach to enforcing XML syntax, but still parses quickly by using finite automata.

Features

xmlpydict allows multiple root elements. The document root is represented by the returned Python dictionary, so each top-level element becomes one of its keys.

xmlpydict supports the following:

CDataSection: CDATA Sections are stored as {'#text': CData}.

Comments: Comments are tokenized for correctness, but have no effect on what is returned.

Element Tags: Duplicate attributes are allowed; however, only the last one defined is kept.

Characters: As with CDATA, text is stored as {'#text': Char}; however, this text is stripped.
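The duplicate-attribute and text-stripping rules in the list above can be modelled in plain Python. This is only an illustration of the described semantics, not xmlpydict's implementation and not the output of parse. It assumes that "stripped" means leading and trailing whitespace is removed, as with str.strip(), and the attribute pairs are made-up sample data.

```python
# Hypothetical attribute pairs in document order, as they might appear in
# <p id="first" id="second" class="x">.
pairs = [("id", "first"), ("id", "second"), ("class", "x")]

attrs = {}
for name, value in pairs:
    attrs["@" + name] = value  # a later duplicate overwrites the earlier one

# Assumption: character data is stripped like str.strip() would do.
stripped = "  Hello!\n".strip()

node = {**attrs, "#text": stripped}
print(node)  # {'@id': 'second', '@class': 'x', '#text': 'Hello!'}
```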

dict.get(key[, default]) will not cause exceptions

# Empty tags are containers
>>> from xmlpydict import parse
>>> parse("<a></a>")
{'a': {}}
>>> parse("<a/>")
{'a': {}}
>>> print(parse("<a/>").get('href'))
None
>>> parse("")
{}
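Because empty elements come back as empty dictionaries, lookups can be chained without guarding against exceptions. The sketch below wraps that pattern in a hypothetical `dig` helper (not part of xmlpydict) and uses literals copied from the outputs shown in this README instead of calling parse.

```python
# Literals copied from outputs shown in this README; parse() is not called here.
doc = {'package': {'xmlpydict': {'@language': 'python'}}}
empty = {'a': {}}

_MISSING = object()  # sentinel to tell "key absent" apart from a stored None


def dig(node, *keys, default=None):
    """Hypothetical helper: follow keys through nested dicts, or return default."""
    for key in keys:
        if not isinstance(node, dict):
            return default
        node = node.get(key, _MISSING)
        if node is _MISSING:
            return default
    return node


print(dig(doc, 'package', 'xmlpydict', '@language'))  # python
print(dig(doc, 'package', 'missing', '@language'))    # None
print(dig(empty, 'a', '@href', default=''))           # prints an empty line
```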

Attribute prefixing

# Change prefix from default "@" with keyword argument attr_prefix
>>> from xmlpydict import parse
>>> parse('<p width="10" height="5"></p>', attr_prefix="$")
{'p': {'$width': '10', '$height': '5'}}

Exceptions

# Grammar and structure of the xml_content is checked while parsing
>>> from xmlpydict import parse
>>> parse("<a></ a>")
Exception: not well formed (violation at pos=5)
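To give a feel for how a finite automaton can report the position of a violation, here is a toy sketch. It is not xmlpydict's implementation: the states, the transition table, and the `first_violation` function are all hypothetical, and the automaton only understands bare `<name>` and `</name>` tags made of letters (no attributes, text, or self-closing tags, and no check that tags match).

```python
def classify(ch):
    """Map a character to the input class used by the toy transition table."""
    if ch in "<>/":
        return ch
    return "letter" if ch.isalpha() else "other"


# (state, input class) -> next state; any missing pair is a violation.
TRANSITIONS = {
    ("between", "<"): "lt",
    ("lt", "letter"): "open_name",
    ("lt", "/"): "close_start",
    ("open_name", "letter"): "open_name",
    ("open_name", ">"): "between",
    ("close_start", "letter"): "close_name",
    ("close_name", "letter"): "close_name",
    ("close_name", ">"): "between",
}


def first_violation(xml):
    """Return the index of the first rejected character, or None if accepted."""
    state = "between"
    for pos, ch in enumerate(xml):
        state = TRANSITIONS.get((state, classify(ch)))
        if state is None:
            return pos
    # Input that ends in the middle of a tag is rejected at its end.
    return None if state == "between" else len(xml)


print(first_violation("<a></a>"))   # None
print(first_violation("<a></ a>"))  # 5
```

On the input from the example above, the toy automaton rejects the space after `</` at index 5, which is the same position the error message reports. Each character is examined once and the next state is a single table lookup, which is what keeps this style of parsing fast.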

Unsupported

Prolog / Enforcing Document Type Definition and Element Type Declarations

Entity Referencing

Namespaces