Shortcuts to all available datasets and computer code will be provided here.
- Equations on how to calculate milligrams of morphine equivalents
- Dix Hospital Ledger data cleaning and basic descriptives
These datasets are open access and available to the public under various open access licenses. They contain no personally identifying information. All datasets are either used with permission or contain only publicly available records. Attributions to original source material are provided. We respect data use agreements.
-
Corpus of FDA-approved controlled substance US drug labels
Published: October 3, 2019 by Nabarun Dasgupta
Observations: 2,177
Format: ZIP archive of .txt files
Size: 50 MB
Origin: United States Food and Drug Administration via NIH National Library of Medicine DailyMed API service
License: Creative Commons Sharealike 4.0 International
-
Inactive Ingredients in Controlled Substances
Published: November 14, 2019 by Nabarun Dasgupta
Observations: 18,876
Format: .xls codebook
Size: 8.8 MB
Origin: United States Food and Drug Administration via NIH National Library of Medicine DailyMed API service
License: Creative Commons Sharealike 4.0 International
Each set of code listed below is provided for public use, with the author's consent. Each code set contains a license that details sharing and reuse permissions. Code is written in SAS, Python, Stata, and/or R.
-
PubMed Search Volume Query
Python 3.7, November 2019, MIT License
By Nabarun Dasgupta
This straightforward code takes a list of potential search terms and returns the number of PubMed results for each term. It was created to identify noisy and irrelevant terms when building a search strategy for a systematic review.
-
Distinguishing Brand Versus Generic Controlled Substances
Stata MP 16, October 2019, MIT License
By Nabarun Dasgupta
This code differentiates brand names from generic names of controlled substances. It is useful for cleaning data from prescription monitoring programs (PMP or PDMP), insurance claims, electronic health records, etc.
-
Data Cleaning for Historical Dix Asylum Intake Ledger
Stata MP 16 in IPython notebook, November 13, 2019, MIT License
By Nabarun Dasgupta
This code will process, format, and create variables from the Dix Asylum (Raleigh, NC, USA) Intake Ledger, covering years 1856 to 1919.
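To give a sense of what the PubMed Search Volume Query listed above does, here is a minimal sketch of the general idea. It is not the published code. It assumes the NCBI E-utilities ESearch endpoint with `rettype=count` and `retmode=json`, and it assumes the JSON response carries the hit count as a string under `esearchresult` → `count`. The function names (`build_count_url`, `parse_count`, `fetch_count`, `rank_terms`) and the example terms are hypothetical, chosen only for illustration.

```python
import json
import time
import urllib.parse
import urllib.request

# Assumed endpoint: NCBI E-utilities ESearch.
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_count_url(term):
    """Build an ESearch URL that asks only for the number of PubMed hits."""
    params = {"db": "pubmed", "term": term, "rettype": "count", "retmode": "json"}
    return ESEARCH_URL + "?" + urllib.parse.urlencode(params)


def parse_count(payload):
    """Extract the hit count from an ESearch JSON response (assumed layout)."""
    return int(json.loads(payload)["esearchresult"]["count"])


def fetch_count(term, timeout=30):
    """Query PubMed over the network and return the hit count for one term."""
    with urllib.request.urlopen(build_count_url(term), timeout=timeout) as resp:
        return parse_count(resp.read().decode("utf-8"))


def rank_terms(counts):
    """Sort (term, count) pairs from most to fewest hits."""
    return sorted(counts.items(), key=lambda kv: kv[1], reverse=True)


if __name__ == "__main__":
    # Hypothetical candidate terms for a search strategy.
    candidates = ["naloxone", "opioid overdose", "harm reduction"]
    counts = {}
    for term in candidates:
        counts[term] = fetch_count(term)
        time.sleep(0.4)  # stay under NCBI's limit for requests without an API key
    for term, n in rank_terms(counts):
        print(f"{n:>10}  {term}")
```

Terms that return very large counts relative to the others are candidates for being too noisy, while terms with near-zero counts may be irrelevant or misspelled. Keeping the URL builder and the response parser separate from the network call makes both easy to check without contacting PubMed.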