Supp: PyQuery HTML Parser Package

Ricky Woo edited this page Nov 26, 2018 · 1 revision

The Python package pyquery lets you query XML documents with a jQuery-like API. It uses lxml for fast XML and HTML manipulation.

Import data

The PyQuery class can load an XML document from:

  • a string,
  • an lxml document,
  • a file, or
  • a URL.

Here is the corresponding example:
from pyquery import PyQuery as pq
from lxml import etree

d = pq("<html></html>")  # from a string
d = pq(etree.fromstring("<html></html>"))  # from an lxml document
d = pq(filename=path_to_html_file)  # from a file; path_to_html_file is a placeholder for your own path
d = pq(url='http://google.com/')  # from a URL (performs a network request)

Now you can operate on d much as you would in jQuery: