Python Script To Copy WuxiaWorld Chapters Into EPUB File.
Copies The Novel Chapters Along With The Novel Details. The Cover Image Is Occasionally Missed (Roughly Once Every 6-10 Runs); The Cause Is Unknown, Possibly An Internal BeautifulSoup4 Problem.
How Does The Script Work? Just Enter The Novel URL Inside The Script And The Rest Follows.
Initial Implementation By : Aundinn
Check Out This Other Novel Website: https://wxuiaworld.co. Why This Website? It Has Novels From Webnovel (Qidian) And WuxiaWorld With All The Latest Chapters Unlocked. No Spirit Stones, No Patreon, No Subscription Or Anything Else Required To Read The Latest Chapters. Don't Take My Word For It; Check It Out.
- Get The List Of Chapters From The Novel Website And Follow The Links In That List Instead Of Stepping Through Pages Sequentially, Since Some Pages Do Not Have Sequential Names.
- Implement Multiprocessing To Speed Up The Process.
- None Yet (Report If Any).
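The first item above (collecting chapter links from the index page instead of guessing sequential names) could be sketched roughly as follows. This is only an illustration, not the script's code: it uses the standard library's html.parser module rather than BeautifulSoup so that it runs without extra packages, and the sample markup, the example.com base URL, the path_marker filter and all helper names are assumptions. The same idea carries over to BeautifulSoup's find_all("a").

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class ChapterLinkCollector(HTMLParser):
    """Collect chapter links, in page order, from a novel's index page."""

    def __init__(self, base_url, path_marker):
        super().__init__()
        self.base_url = base_url
        # Hypothetical filter: a substring shared by all chapter URLs.
        self.path_marker = path_marker
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if not href or self.path_marker not in href:
            return
        url = urljoin(self.base_url, href)
        if url not in self.links:  # skip repeats, keep first-seen order
            self.links.append(url)


def collect_chapter_links(index_html, base_url, path_marker):
    """Return absolute chapter URLs found in index_html."""
    collector = ChapterLinkCollector(base_url, path_marker)
    collector.feed(index_html)
    return collector.links


# Made-up index page; the second chapter's name is not sequential.
SAMPLE_INDEX = """
<ul>
  <li><a href="/novel/example/ex-chapter-1">Chapter 1</a></li>
  <li><a href="/novel/example/ex-chapter-2-part-1">Chapter 2 (Part 1)</a></li>
  <li><a href="/about">About</a></li>
</ul>
"""

chapter_links = collect_chapter_links(
    SAMPLE_INDEX, "https://example.com", "/novel/example/"
)
print(chapter_links)
```

Because the links are taken from the page itself, a chapter with an unusual name is picked up like any other, which is the point of the to-do item.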
- For Beginners: After Setting Up A Working Python 3 Environment (Along With The Latest pip), You Need To Install Some Packages. To Install, Run These Commands In Your CMD/Terminal:
    pip3 install bs4
    pip3 install ebooklib
    pip3 install requests
    pip3 install html5lib=="0.9999999"
- Download The Python Script And Unzip It.
- Open The Script With A Text Editor And Read The Details Inside.
- In Case The Script Has Not Been Updated To Match Changes In The Website, You May Refer To The BeautifulSoup Docs To Make Changes Accordingly.
- To Run, Open CMD/Terminal, Navigate To The Unzip Location And Type:
  - Linux -
    python3 code.py
  - Windows -
    python code.py
    or
    py code.py
- The EPUB File Will Be Saved At The Location Of The Script.
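Before running the script, it may help to confirm that the packages from the install step are visible to your Python. The check below is a small sketch, not part of the script; the helper name missing_modules is made up for illustration, and the import names are assumed to match the pip package names listed above.

```python
import importlib.util

# Import names for the packages installed above.
REQUIRED_MODULES = ("bs4", "ebooklib", "requests", "html5lib")


def missing_modules(names=REQUIRED_MODULES):
    """Return the names that cannot be found in the current environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_modules()
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All required packages are installed.")
```

If anything is reported missing, rerun the matching pip3 install command with the same Python you use to run code.py.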
- Set The Novel Link In novelURL.
- If Only A Specific Number Of Chapters Is To Be Downloaded, Enter 2 And Provide The start And end Chapters.
- The EPUB File Will Be Saved In The Format NovelName_start-chapter_end-chapter.epub
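The chapter range and file name described above can be pictured with a short sketch. It assumes that start and end are 1-based, inclusive chapter numbers and that the start-chapter and end-chapter parts of the name are those numbers; the helper names select_chapters and epub_filename are hypothetical and are not taken from the script.

```python
def select_chapters(chapter_links, start, end):
    """Return chapters start..end (1-based, inclusive) from an ordered list."""
    if start < 1 or end < start:
        raise ValueError("expected 1 <= start <= end")
    return chapter_links[start - 1:end]


def epub_filename(novel_name, start, end):
    """Build a name following NovelName_start-chapter_end-chapter.epub."""
    return f"{novel_name}_{start}_{end}.epub"


# Placeholder chapter list standing in for real chapter links.
links = [f"chapter-{n}" for n in range(1, 11)]

picked = select_chapters(links, 3, 5)
print(picked)
print(epub_filename("NovelName", 3, 5))
```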
html5lib Is Used Because, Although It Is A Tiny Bit Slow, It Generates Valid HTML. You May Compare It With The Other Parsers In The "Differences Between Parsers" Section Of The BeautifulSoup Docs.
I've Copied The Table From The BS4 Website Below To Give A Brief Overview.
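Parser | Typical usage | Advantages | Disadvantages |
Python’s html.parser | BeautifulSoup(markup, "html.parser") | Batteries included; decent speed | Not as fast as lxml; less lenient than html5lib |
lxml’s HTML parser | BeautifulSoup(markup, "lxml") | Very fast; lenient | External C dependency |
lxml’s XML parser | BeautifulSoup(markup, "lxml-xml") or BeautifulSoup(markup, "xml") | Very fast; the only currently supported XML parser | External C dependency |
html5lib | BeautifulSoup(markup, "html5lib") | Extremely lenient; parses pages the same way a web browser does; creates valid HTML5 | Very slow; external Python dependency |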
- In Case You Update html5lib Accidentally, You Can Reinstall The Specific Version Using The Command In The Details For Beginners.
- Another Choice: Change html5lib To lxml If It Is Installed, Otherwise To Python's Inbuilt html.parser.
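The fallback order in the note above (html5lib, then lxml, then the built-in html.parser) could be automated along these lines. This is a sketch under assumptions, not the script's code: pick_parser is a made-up helper, and it only checks whether a parser package is installed, not which version.

```python
import importlib.util

# Preference order from the note above; html.parser is the final fallback.
PREFERRED_PARSERS = ("html5lib", "lxml")


def pick_parser(preferred=PREFERRED_PARSERS):
    """Return the first installed parser name, else Python's built-in one."""
    for name in preferred:
        if importlib.util.find_spec(name) is not None:
            return name
    return "html.parser"  # ships with Python, so it is always available


parser_name = pick_parser()
print("Using parser:", parser_name)

# With bs4 installed, the chosen name is passed as the second argument:
#   soup = BeautifulSoup(markup, parser_name)
```

Keep in mind that the parsers can build slightly different trees from the same broken HTML, so output may differ a little after switching.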
Copyright © 2018 Kogam22. Released under the terms of the Apache 2.0 license.