Scrape ABSITE scoring PDFs to extract data and import into Excel file for analysis.

ClancyClark/absite

absite

Files: absite.py and wrong_answers.py

Python 2.7.11

by Clancy J. Clark

3/15/2016

http://www.clancyclark.com

Scrape 2016 ABSITE scoring PDFs, extract the data, and import it into an Excel file for analysis.

Two Python programs, written deliberately simply, process data stuck in PDFs. This work is partly an introduction to processing string text and partly a practical effort to extract data for surgery resident education.

PDFMiner grabs the text of each PDF as one long string.
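A minimal sketch of this step, not the code in absite.py: it assumes the pdfminer.six fork, whose pdfminer.high_level.extract_text returns a whole document as one string. The original scripts target Python 2.7 and may use the older PDFMiner interface. The names pdf_to_text and collapse_whitespace are hypothetical.

```python
import re


def pdf_to_text(pdf_path):
    """Return all text in the PDF at pdf_path as a single string."""
    # Assumption: pdfminer.six is installed. The import is lazy so that
    # collapse_whitespace below can be used without it.
    from pdfminer.high_level import extract_text
    return extract_text(pdf_path)


def collapse_whitespace(text):
    """Squash runs of spaces, tabs and newlines into single spaces."""
    return re.sub(r"\s+", " ", text).strip()
```

Collapsing whitespace first is optional, but it tends to make later pattern matching less sensitive to where the PDF happened to break its lines.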

Regular expressions pull the needed values out of that string.
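The idea can be illustrated with a short, hedged example. The sample text, the field labels and the pattern below are all invented for illustration; the real ABSITE report layout is not described here, so the actual patterns in absite.py and wrong_answers.py will differ.

```python
import re

# Hypothetical excerpt; real ABSITE report text will be laid out differently.
SAMPLE_TEXT = "Name: DOE, JANE PGY Level: 3 Percent Correct: 72 Percentile: 64"

# One named group per field. The labels are assumptions for illustration.
REPORT_PATTERN = re.compile(
    r"Name:\s*(?P<name>.+?)\s+"
    r"PGY Level:\s*(?P<pgy>\d+)\s+"
    r"Percent Correct:\s*(?P<percent_correct>\d+)\s+"
    r"Percentile:\s*(?P<percentile>\d+)"
)


def parse_report(text):
    """Return a dict of fields, or None if the pattern does not match."""
    match = REPORT_PATTERN.search(text)
    if match is None:
        return None
    record = match.groupdict()
    # Convert the numeric fields from strings to integers.
    for key in ("pgy", "percent_correct", "percentile"):
        record[key] = int(record[key])
    return record
```

Named groups keep each extracted value tied to a column name, which is convenient when the records are later written out as spreadsheet rows.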

XlsxWriter writes the results to an Excel file.
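As a rough sketch of the writing step, assuming the extracted data is held as a list of dicts (an assumption, not necessarily how the original scripts store it): the helper names records_to_rows and write_xlsx are hypothetical, while Workbook, add_worksheet, write_row and close are standard XlsxWriter calls.

```python
def records_to_rows(records, columns):
    """Turn a list of dicts into a header row followed by data rows."""
    rows = [list(columns)]
    for record in records:
        # Missing fields become empty cells rather than raising KeyError.
        rows.append([record.get(column, "") for column in columns])
    return rows


def write_xlsx(xlsx_path, rows):
    """Write rows (a list of lists) to the first sheet of a new workbook."""
    # Assumption: XlsxWriter is installed. The import is lazy so that
    # records_to_rows can be used without it.
    import xlsxwriter

    workbook = xlsxwriter.Workbook(xlsx_path)
    worksheet = workbook.add_worksheet()
    for row_index, row in enumerate(rows):
        worksheet.write_row(row_index, 0, row)
    # The file is only written to disk when the workbook is closed.
    workbook.close()
```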

Run each program from the folder that contains the PDF files.

The Excel file is created in the same folder.
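One way the folder handling could look, as a sketch only: the function names and the output filename absite_scores.xlsx are hypothetical, and the original scripts may locate their input and name their output differently.

```python
import glob
import os


def find_pdfs(folder="."):
    """Return the PDF files in folder, sorted by name."""
    # Note: "*.pdf" is case-sensitive on most non-Windows systems.
    return sorted(glob.glob(os.path.join(folder, "*.pdf")))


def output_path(folder=".", filename="absite_scores.xlsx"):
    """Build the path of the Excel file inside the same folder."""
    return os.path.join(folder, filename)
```

With the default folder of "." both functions work relative to the current working directory, which matches running the script from inside the folder of PDFs.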

Note: This is not fancy but does the job.
