Convert your Meebook ereader highlights and notes (from the Haoqing reading app) to CSV format for seamless import into Readwise.
This tool bridges the gap between your Meebook ereader and Readwise by converting exported HTML book notes into CSV format that's fully compatible with Readwise's bulk import feature.
- Preserve Your Reading Journey: Don't lose all those valuable highlights and notes from your Meebook
- Centralize Your Knowledge: Get all your highlights into Readwise alongside notes from other sources
- Automated Processing: Handles multiple books at once with proper metadata extraction
- Readwise Ready: Generates CSV files in the exact format Readwise expects
Book-notes-html-to-csv/
├── html-files/ # Place your HTML files here
├── output/ # Generated CSV files will appear here
├── html_to_csv_converter.py # Main conversion script
├── batch_convert.bat # Easy-to-use batch script for Windows
├── requirements.txt # Python dependencies
└── README.md # This file
- Meebook Compatible: Specifically designed for HTML exports from Meebook ereaders
- Complete Data Extraction: Captures highlights, personal notes, timestamps, book titles, and authors
- Batch Processing: Handle multiple books simultaneously
- Dual Output: Creates both individual CSV files per book AND a master combined file
- Readwise Ready: Output format matches Readwise's CSV import requirements exactly
- Metadata Preservation: Maintains reading dates and location information
- Multi-language Support: Handles books in any language with proper UTF-8 encoding
- On your Meebook ereader, go to your highlights/notes for each book
- Export or share your notes in HTML format
- Transfer these HTML files to your computer (via USB, email, cloud storage, etc.)
For Windows Users:
- Download or clone this repository to your computer
- Place your Meebook HTML files in the html-files/ folder
- Double-click batch_convert.bat
- Done! Check the output/ folder for your CSV files
For Mac/Linux Users:
- Install Python (3.7 or newer) if you don't have it
- Install dependencies:
pip install beautifulsoup4
- Place your HTML files in the html-files/ folder
- Run:
python html_to_csv_converter.py --batch
- Check results in the output/ folder
- Go to readwise.io/import_bulk
- Upload your generated CSV file(s)
- Readwise will automatically process all your highlights with proper book information
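Before uploading, it can help to sanity-check a generated file. The following is a minimal sketch, not part of this tool: the function name `check_readwise_csv` is hypothetical, and the expected column names are taken from the column list later in this README.

```python
import csv
import io

# Column names as documented in this README's column list.
EXPECTED_COLUMNS = ["Highlight", "Title", "Author", "URL", "Note", "Location", "Date"]


def check_readwise_csv(stream):
    """Return a list of problems found in a CSV stream (empty list means OK)."""
    reader = csv.DictReader(stream)
    problems = []
    header = reader.fieldnames or []
    missing = [name for name in EXPECTED_COLUMNS if name not in header]
    if missing:
        problems.append("missing columns: " + ", ".join(missing))
    # Highlight is the one mandatory field, so flag records where it is blank.
    for record_number, row in enumerate(reader, start=1):
        if not (row.get("Highlight") or "").strip():
            problems.append(f"record {record_number}: empty Highlight")
    return problems


# Small in-memory demonstration with made-up data.
demo = io.StringIO("Highlight,Title\r\n,Example Book\r\n")
for problem in check_readwise_csv(demo):
    print(problem)
```

To check a real file, open it with `newline=""` and `encoding="utf-8"` and pass the file handle to the function.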
- Download this repository (click "Code" → "Download ZIP" on GitHub)
- Extract the ZIP file to a folder on your computer
- That's it! The batch file will handle Python setup automatically
- Ensure Python 3.7+ is installed:
python3 --version
- Download this repository
- Install dependencies:
pip install beautifulsoup4
# Convert a single HTML file
python html_to_csv_converter.py "html-files/book.html" -o "output/book.csv"
# Process all files with batch mode (recommended)
python html_to_csv_converter.py --batch
# Get help
python html_to_csv_converter.py --help
The generated CSV files contain the following columns (compatible with Readwise):
- Highlight: The actual text of the highlight (mandatory)
- Title: The name of the book (extracted from HTML header)
- Author: The author of the book (extracted from HTML header)
- URL: (empty for books, used for articles)
- Note: Your personal notes attached to the highlight
- Location: Sequential number indicating highlight order
- Date: When the highlight was made (YYYY-MM-DD HH:MM:SS format)
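To illustrate the layout, here is a rough sketch of writing rows with these seven columns using Python's standard `csv` module. It is not the converter's actual code; the helper name `write_readwise_csv` and the sample row are made up for the example.

```python
import csv
import io

FIELDNAMES = ["Highlight", "Title", "Author", "URL", "Note", "Location", "Date"]


def write_readwise_csv(rows, stream):
    """Write highlight dicts to `stream` using the seven columns listed above."""
    # restval="" leaves a column empty when a row does not supply it (e.g. URL).
    writer = csv.DictWriter(stream, fieldnames=FIELDNAMES, restval="")
    writer.writeheader()
    for row in rows:
        writer.writerow(row)


# Made-up sample data, for illustration only.
sample = [{
    "Highlight": "An example highlight, with a comma.",
    "Title": "Example Book",
    "Author": "Example Author",
    "Note": "An example note",
    "Location": 1,
    "Date": "2025-10-26 08:39:43",
}]

buffer = io.StringIO()
write_readwise_csv(sample, buffer)
print(buffer.getvalue())
```

The `csv` module quotes fields that contain commas or line breaks, so highlight text does not need manual escaping. When writing to a real file, open it with `newline=""` and `encoding="utf-8"`.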
The tool automatically extracts book titles and authors from the HTML file's header (the <h2> tag), which typically follows the format: "Book Title - Author Name".
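As a rough illustration of this step, the sketch below pulls the text of the first `<h2>` and splits it into title and author. It is an assumption-laden stand-in, not the script's real logic: the actual tool uses BeautifulSoup, while this version uses the standard library's `html.parser` to stay dependency-free, and splitting on the last " - " is a guess at a reasonable rule. All names here are hypothetical.

```python
from html.parser import HTMLParser


class FirstH2Parser(HTMLParser):
    """Collect the text of the first <h2> element in a document."""

    def __init__(self):
        super().__init__()
        self._inside = False
        self._done = False
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag == "h2" and not self._done:
            self._inside = True

    def handle_endtag(self, tag):
        if tag == "h2" and self._inside:
            self._inside = False
            self._done = True  # ignore any later <h2> elements

    def handle_data(self, data):
        if self._inside:
            self.parts.append(data)


def split_title_author(heading):
    """Split 'Book Title - Author Name' on the LAST ' - ' separator."""
    heading = heading.strip()
    title, separator, author = heading.rpartition(" - ")
    if not separator:  # no separator: treat the whole heading as the title
        return heading, ""
    return title.strip(), author.strip()


def extract_title_author(html):
    """Return (title, author) from the first <h2> in an HTML string."""
    parser = FirstH2Parser()
    parser.feed(html)
    return split_title_author("".join(parser.parts))


print(extract_title_author("<h2>To Kill a Mockingbird - Harper Lee</h2>"))
```

Splitting on the last separator keeps a dash inside the title intact, but an author name that itself contains " - " would still be split in the wrong place, which is why manual correction is sometimes needed.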
If the book title or author appears incorrect in your CSV:
- Open the generated CSV file in Excel, Google Sheets, or any text editor
- Manually edit the Title and Author columns as needed
- Save the file and proceed with Readwise import
Common reasons for incorrect extraction: unusual formatting in the original Meebook export, titles with dashes, or missing author information.
Here's exactly what happens when you use this tool:
html-files/
├── Why We Sleep_ Unlocking the Power of Sleep and Dreams - Matthew Walker_20251026_083943.html
└── To Kill a Mockingbird - Harper Lee_20251026_083931.html
Double-click batch_convert.bat (Windows) or run python html_to_csv_converter.py --batch
Console Output:
Found 2 HTML files to process:
- Why We Sleep_ Unlocking the Power of Sleep and Dreams - Matthew Walker_20251026_083943.html
- To Kill a Mockingbird - Harper Lee_20251026_083931.html
Processing: Why We Sleep_ Unlocking the Power of Sleep and Dreams - Matthew Walker_20251026_083943.html
  Created: output\Why We Sleep_ Unlocking the Power of Sleep and Dreams - Matthew Walker.csv
  Found 36 highlights
Processing: To Kill a Mockingbird - Harper Lee_20251026_083931.html
  Created: output\To Kill a Mockingbird - Harper Lee.csv
  Found 7 highlights
============================================================
BATCH PROCESSING COMPLETE!
============================================================
Combined CSV file: output\all_books_combined.csv
Total highlights across all books: 43
output/
├── Why We Sleep_ Unlocking the Power of Sleep and Dreams - Matthew Walker.csv (36 highlights)
├── To Kill a Mockingbird - Harper Lee.csv (7 highlights)
└── all_books_combined.csv (43 total highlights)
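The combined file is essentially the per-book files concatenated under a single header. A minimal sketch of that idea follows; it assumes every input shares the same columns, and the function name `combine_csv_streams` is hypothetical rather than taken from the script.

```python
import csv
import io


def combine_csv_streams(streams, out_stream):
    """Concatenate CSV streams that share one header; return the row count."""
    writer = None
    total = 0
    for stream in streams:
        reader = csv.DictReader(stream)
        if reader.fieldnames is None:  # completely empty input, skip it
            continue
        if writer is None:
            # The first non-empty input decides the header of the combined file.
            writer = csv.DictWriter(out_stream, fieldnames=reader.fieldnames)
            writer.writeheader()
        for row in reader:
            # DictWriter raises ValueError if a row has columns the header lacks.
            writer.writerow(row)
            total += 1
    return total


# In-memory demonstration with made-up rows.
book_a = io.StringIO("Highlight,Title\r\nfirst,Book A\r\nsecond,Book A\r\n")
book_b = io.StringIO("Highlight,Title\r\nthird,Book B\r\n")
combined = io.StringIO()
print(combine_csv_streams([book_a, book_b], combined))
```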
- Open all_books_combined.csv in Excel or Google Sheets
- Edit the Title column to clean up book titles:
  - Change "Why We Sleep_ Unlocking the Power of Sleep and Dreams" → "Why We Sleep"
  - Keep "To Kill a Mockingbird" as is
- Save the file
- Go to readwise.io/import_bulk
- Upload your edited all_books_combined.csv
- Success! All 43 highlights are now in Readwise with clean titles, proper dates, and personal notes
Result in Readwise:
- "Why We Sleep" by Matthew Walker (36 highlights)
- "To Kill a Mockingbird" by Harper Lee (7 highlights)
- Open an issue on GitHub with your HTML file structure (remove personal content)
- Include error messages and your operating system
- Pull requests welcome!
- Test with different Meebook models/firmware versions
- Add support for other ereader HTML formats
- Language: Python 3.7+
- Dependencies: BeautifulSoup4 for HTML parsing
- Encoding: Full UTF-8 support for international characters
- Date Handling: Converts Meebook timestamps to Readwise-compatible format
- File Safety: Handles special characters in book titles/filenames automatically
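For the file-safety point, one common approach is to replace characters that Windows forbids in file names. The sketch below shows that approach under stated assumptions: the helper `safe_filename`, the underscore replacement, and the "untitled" fallback are illustrative choices, not confirmed behavior of the script.

```python
import re

# Characters Windows does not allow in file names, plus ASCII control characters.
_UNSAFE = re.compile(r'[<>:"/\\|?*\x00-\x1f]')


def safe_filename(title, replacement="_"):
    """Return `title` with unsafe characters replaced, usable as a file stem."""
    cleaned = _UNSAFE.sub(replacement, title)
    # Windows also rejects names that end in a dot or a space.
    cleaned = cleaned.rstrip(". ")
    return cleaned or "untitled"


print(safe_filename("Why We Sleep: Unlocking the Power of Sleep and Dreams"))
```

Non-ASCII characters are left untouched, so titles in other languages keep their original spelling.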
Made with ❤️ for the reading community
If this tool helped you preserve your reading notes, consider starring the repository or sharing it with other Meebook users!