This project is a local OCR application for macOS based on Pix2Text (no internet connection required). It recognizes mathematical formula images from the clipboard and converts them to LaTeX, which is then copied back to the clipboard. It also supports text recognition (Text OCR) on general images.
Note: ⚠️ This application is only available for macOS.
The initial code of this project was forked from: horennel/LaTex-OCR_for_macOS. Special thanks to the author of this project.
After opening the application, you will see the Pix2Text icon in the Mac menu bar, as shown below. It offers four OCR modes.
This mode can recognize images containing both mathematical formulas and text. The recognition result is in Markdown format, which can be pasted into the Pix2Text Online Service to view the rendered result.
For example, it can recognize the following image (assets/mixed-en.jpg):
This mode can recognize images containing only mathematical formulas. The recognition result is in LaTeX format, which can be pasted into the Pix2Text Online Service to view the rendered result.
For example, it can recognize the following image (assets/math-formula-42.png):
This mode can recognize images containing only text. The recognition result is in plain text.
For example, it can recognize the following image (assets/text.jpg):
If an image has a complex layout, such as multiple columns or tables, use this mode. It additionally loads the Layout Analysis and Table Recognition models from pix2text~=1.1 to recognize all information in the image and merge the results into Markdown format. You can paste the results into the Pix2Text web version to view the rendered output.
The recognition results are also saved to a local folder, set by the output_md_root_dir variable in the configuration file config.yaml (default: /tmp/output_mds). The parsing results are saved to a separate local folder, set by the output_debug_dir variable in config.yaml (default: /tmp/output_debugs). You can change the values of these two variables to choose the storage locations.
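As a minimal sketch of how the two defaults behave, the snippet below resolves both directories from an already-parsed configuration. It assumes the two variables are top-level keys in config.yaml, which is not confirmed here, and `resolve_output_dirs` is a hypothetical helper, not part of the application.

```python
from pathlib import Path

# Defaults as documented above; the keys mirror the config.yaml variable names.
DEFAULT_DIRS = {
    "output_md_root_dir": "/tmp/output_mds",
    "output_debug_dir": "/tmp/output_debugs",
}


def resolve_output_dirs(config: dict) -> dict:
    """Return the Markdown and debug output directories, applying defaults.

    `config` is assumed to be the parsed content of config.yaml with both
    variables as top-level keys (an assumption made for this sketch).
    """
    resolved = {}
    for key, default in DEFAULT_DIRS.items():
        value = config.get(key) or default
        resolved[key] = Path(value).expanduser()
    return resolved


if __name__ == "__main__":
    # Hypothetical override: store Markdown results under the home directory
    # and leave the debug directory at its default.
    dirs = resolve_output_dirs({"output_md_root_dir": "~/Documents/p2t_mds"})
    for name, path in dirs.items():
        print(f"{name}: {path}")
```

Pointing output_md_root_dir at a folder outside /tmp may be worthwhile if the results should be kept, since the system can clear temporary files.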
For example, it can recognize the following image (assets/page.png):
git clone https://github.com/breezedeus/Pix2Text-Mac
pip install -r requirements.txt
If you want to recognize text images in languages other than Simplified Chinese and English, please run the following command to install additional dependencies:
pip install "pix2text[multilingual]>=1.1.0.1"
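To confirm which Pix2Text version ended up in the environment, something like the following may help. It uses only the standard library; the version comparison is a deliberate simplification that looks at leading numeric components only (not a full PEP 440 parser), and the helper names are illustrative.

```python
import re
from importlib import metadata


def version_tuple(version: str) -> tuple:
    """Turn '1.1.0.1' into (1, 1, 0, 1), stopping at the first non-numeric part."""
    parts = []
    for piece in version.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    return tuple(parts)


def installed_version(package: str = "pix2text"):
    """Return the installed version string, or None if the package is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def meets_minimum(version: str, minimum: str = "1.1.0.1") -> bool:
    """Simplified check that `version` is at least `minimum`."""
    return version_tuple(version) >= version_tuple(minimum)


if __name__ == "__main__":
    found = installed_version()
    if found is None:
        print("pix2text is not installed in this environment")
    else:
        status = "ok" if meets_minimum(found) else "older than 1.1.0.1"
        print(f"pix2text {found}: {status}")
```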
Use the following command to verify if the installed Pix2Text is working normally:
p2t predict -l en,ch_sim --resized-shape 768 --file-type page -i assets/page.png -o output-page --save-debug-res output-debug-page
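If you prefer to run this check from a script, the same command can be assembled in Python. This is only a sketch: `build_p2t_command` is a hypothetical helper, and it uses exactly the flags shown in the command above without assuming any others.

```python
import shutil
import subprocess


def build_p2t_command(image, output_dir, debug_dir,
                      languages=("en", "ch_sim"), resized_shape=768,
                      file_type="page"):
    """Build the argument list for the `p2t predict` check shown above."""
    return [
        "p2t", "predict",
        "-l", ",".join(languages),
        "--resized-shape", str(resized_shape),
        "--file-type", file_type,
        "-i", str(image),
        "-o", str(output_dir),
        "--save-debug-res", str(debug_dir),
    ]


if __name__ == "__main__":
    cmd = build_p2t_command("assets/page.png", "output-page", "output-debug-page")
    if shutil.which("p2t") is None:
        # The CLI is not on PATH, so only show what would be executed.
        print("p2t not found; would run:", " ".join(cmd))
    else:
        # check=True raises CalledProcessError if the command fails.
        subprocess.run(cmd, check=True)
```

Passing the arguments as a list avoids shell quoting issues with paths that contain spaces.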
python setup.py py2app -A
- You can find the application Pix2Text.app in the generated dist folder. Double-click to open it, or move it to the Applications folder.
- Launch the application
  - Start the Pix2Text.app application, and you will see the Pix2Text application icon in the menu bar.
  - Click the On / Off button in the menu bar icon to ensure that the Mixed OCR, Formula OCR, and Text OCR buttons are lit up.
- Take a screenshot
  - Use any screenshot software, such as Snipaste, to capture the image and copy it to the clipboard.
- Recognition
  - Recognize images with both mathematical formulas and text
    - Click the Text_Formula OCR button.
    - After successful recognition, you will receive a notification in the notification center.
  - Recognize images with pure mathematical formulas
    - Click the Formula OCR button.
    - After successful recognition, you will receive a notification in the notification center.
  - Recognize images with pure text
    - Click the Text OCR button.
    - After successful recognition, you will receive a notification in the notification center.
  - Recognize screenshots of pages with complex layouts
    - Click the Page OCR button.
    - After successful recognition, you will receive a notification in the notification center.
  - If you do not want to receive notifications, you can turn them off in the system settings.
  - After receiving a notification, you can paste the result into the Pix2Text Online Service to view the rendered result.
  - You can modify the initialization configuration of Pix2Text by editing the configuration file config.yaml, for example which model to use and the path to the model. If you have purchased the premium models (which provide better results), refer to pro-config.yaml when modifying config.yaml.
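The four menu actions differ mainly in the kind of image they expect and the format they return. The sketch below summarizes that mapping as data; the button labels and output formats are taken from the descriptions in this document, while the `OcrMode` type and `output_format_for` helper are purely illustrative and not part of the application.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OcrMode:
    button: str         # label of the menu bar button
    input_kind: str     # what the clipboard image is expected to contain
    output_format: str  # format of the recognition result


MODES = (
    OcrMode("Text_Formula OCR", "mixed text and formulas", "Markdown"),
    OcrMode("Formula OCR", "formulas only", "LaTeX"),
    OcrMode("Text OCR", "text only", "plain text"),
    OcrMode("Page OCR", "complex page layouts (columns, tables)", "Markdown"),
)


def output_format_for(button: str) -> str:
    """Look up the result format for a given menu button label."""
    for mode in MODES:
        if mode.button == button:
            return mode.output_format
    raise KeyError(f"unknown OCR mode: {button}")


if __name__ == "__main__":
    for mode in MODES:
        print(f"{mode.button:18} {mode.input_kind:40} {mode.output_format}")
```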
- The first time you start the application, it will download models and configuration files, resulting in a long startup time. Subsequent startups will return to normal speed.
- The downloaded models and configuration files are stored in ~/.cnstd, ~/.cnocr, and ~/.pix2text.
- The application depends on the Python environment used during packaging. If that environment changes (e.g., the virtual environment used for packaging is deleted, its dependencies are removed or modified, or Python is completely uninstalled from the computer), the application may stop working and will need to be repackaged.
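To see whether the first-run downloads have completed, you can inspect the three storage folders. The following is a rough sketch using only the standard library; `cache_report` is a hypothetical helper, and a folder that exists may still be incomplete if a download was interrupted.

```python
from pathlib import Path

# Storage locations listed in the notes above.
CACHE_DIRS = ("~/.cnstd", "~/.cnocr", "~/.pix2text")


def dir_size_bytes(path: Path) -> int:
    """Sum the sizes of all regular files below `path`."""
    return sum(p.stat().st_size for p in path.rglob("*") if p.is_file())


def cache_report(dirs=CACHE_DIRS) -> dict:
    """Map each folder to its total size in bytes, or None if it is missing."""
    report = {}
    for raw in dirs:
        path = Path(raw).expanduser()
        report[raw] = dir_size_bytes(path) if path.is_dir() else None
    return report


if __name__ == "__main__":
    for folder, size in cache_report().items():
        if size is None:
            print(f"{folder}: not present (nothing downloaded there yet)")
        else:
            print(f"{folder}: {size / 1_000_000:.1f} MB")
```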
- The initial code of this project was forked from: horennel/LaTex-OCR_for_macOS. Special thanks to the author of this project.
- Pix2Text
- pyperclip
- rumps
- py2app