Replaced PyQuery with BeautifulSoup4 and lxml. #52

Open
wants to merge 3 commits into base: master

Conversation

phonique

Should properly parse UTF-8 now.
Added some options (see the README).
Should produce (mostly) valid HTML5 when --replace-all=yes is set.
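
For anyone wanting to check the UTF-8 behaviour in isolation, here is a minimal sketch of parsing raw UTF-8 bytes with BeautifulSoup4. It is not the code from this PR: `RAW_PAGE` and `parse_utf8` are made-up names for illustration, and the sketch uses the built-in `html.parser` backend so it runs without lxml installed (passing `"lxml"` as the parser should behave the same way for this input, assuming lxml is available).

```python
from bs4 import BeautifulSoup

# Hypothetical sample page, stored as raw UTF-8 bytes (not taken from the PR).
RAW_PAGE = (
    b'<html><head><meta charset="utf-8">'
    b"<title>Caf\xc3\xa9</title></head>"
    b'<body><a href="/feed.rss">r\xc3\xa9sum\xc3\xa9</a></body></html>'
)


def parse_utf8(raw_bytes, parser="html.parser"):
    """Parse raw bytes as UTF-8 (hypothetical helper, not part of the PR)."""
    # from_encoding tells BeautifulSoup to decode the bytes as UTF-8
    # instead of guessing the encoding.
    return BeautifulSoup(raw_bytes, parser, from_encoding="utf-8")


soup = parse_utf8(RAW_PAGE)
print(soup.title.string)  # decoded title text
print(soup.a.get_text())  # decoded link text
print(soup.a["href"])     # link target
```

Passing bytes plus an explicit encoding keeps the decoding step in one place, which is usually easier to reason about than decoding by hand before parsing.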

Should solve #51, #43, and #39.
Adds an alternative for #41.

Needed the changes for a project, and thought I'd share.

The code still contains some TODOs.

George added 3 commits January 17, 2015 05:10
…w. Adjusted README, requirements. Code contains TODO:s
more control

Signed-off-by: geotti <george@geoco.de>
Signed-off-by: geotti <george@geoco.de>
@phonique
Author

b6cd08b, 2ffbd66, and 513338a supersede this by replacing docopt with argparse (more control?).

Unicode support should now work properly, including RSS links.
Added a proper switch, and the CLI is now more forgiving (courtesy of argparse).
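
To illustrate what "more forgiving" can mean with argparse, here is a minimal sketch. It is not the parser from these commits: only the `--replace-all=yes` option comes from this PR, while `yes_no`, `build_parser`, the program name, and the set of accepted spellings are assumptions made for the example.

```python
import argparse


def yes_no(value):
    """Convert a yes/no style string to a bool (hypothetical helper)."""
    normalized = value.strip().lower()
    if normalized in ("yes", "y", "true", "1"):
        return True
    if normalized in ("no", "n", "false", "0"):
        return False
    raise argparse.ArgumentTypeError("expected yes or no, got %r" % value)


def build_parser():
    """Build a parser with a single --replace-all option (illustrative only)."""
    parser = argparse.ArgumentParser(prog="example")  # hypothetical program name
    parser.add_argument(
        "--replace-all",
        type=yes_no,
        default=False,
        metavar="yes|no",
        help="whether to replace all elements (assumed meaning)",
    )
    return parser


parser = build_parser()
# Both the "=" form and the space-separated form are accepted.
print(parser.parse_args(["--replace-all=yes"]).replace_all)
print(parser.parse_args(["--replace-all", "No"]).replace_all)
# Unambiguous prefixes of long options are accepted by default.
print(parser.parse_args(["--replace=yes"]).replace_all)
```

The prefix matching and the two argument forms come for free with argparse's defaults. The case-insensitive yes/no handling is the part that needs a custom `type` function.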

I would appreciate it if someone could look at the commented-out lxml code and get it working, since the code currently runs BeautifulSoup twice.

This also needs testing with git (I only need to generate a local copy without gh).
