pycerpt is a command-line utility for extracting highlighted text from PDFs.
Get the latest version with pip install pycerpt.
pycerpt outputs Markdown by default. Run excerpt test.pdf to print the result to the terminal, or save it to a file with either excerpt test.pdf > out.md or excerpt test.pdf out.md.
PDF output needs additional dependencies, which you can install with pip install pycerpt[pdf].
Usage: excerpt test.pdf out.pdf.
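
To process a whole folder of PDFs, the two-argument form shown above (excerpt test.pdf out.md) can be wrapped in a short script. The following is a minimal sketch, not part of pycerpt: the helper names build_excerpt_command, output_path_for, and extract_all are hypothetical, and it assumes the excerpt command is on your PATH and exits with a non-zero status on failure.

```python
import subprocess
from pathlib import Path


def build_excerpt_command(pdf_path, out_path=None):
    """Return the argument list for one `excerpt` call (hypothetical helper).

    Without out_path this mirrors `excerpt test.pdf`, which prints to stdout;
    with out_path it mirrors `excerpt test.pdf out.md`.
    """
    command = ["excerpt", str(pdf_path)]
    if out_path is not None:
        command.append(str(out_path))
    return command


def output_path_for(pdf_path):
    """Put the Markdown file next to the PDF, swapping the extension."""
    return Path(pdf_path).with_suffix(".md")


def extract_all(folder):
    """Run `excerpt` once per PDF found directly inside `folder`."""
    for pdf_path in sorted(Path(folder).glob("*.pdf")):
        command = build_excerpt_command(pdf_path, output_path_for(pdf_path))
        # check=True raises CalledProcessError if the command reports failure
        # (assumes `excerpt` signals errors through its exit status).
        subprocess.run(command, check=True)
```

Calling extract_all(".") would write one .md file beside each PDF in the current directory. The sketch sticks to Markdown output so that an output file never shares a name with its input PDF.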