A csv file enroll_data.csv in a Dropbox folder named recruitment_project with columns: site ID, date of consent, cohort, birth date
Download
- We have shared the above folder with you. You will be required to upload your results there. To be able to do that, you need a Dropbox account. Create one if you do not have one already.
- Write Python code to pull the csv from Dropbox using their API.
Hint
- upload() and download() examples at https://github.com/dropbox/dropbox-sdk-python/blob/main/example/updown.py
- Remember to include / to access a folder via the API, i.e. /recruitment_project
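A minimal sketch of the download step, assuming the official `dropbox` Python SDK is installed and that you have generated an access token for your account. The helper names `dropbox_path` and `download_csv` are hypothetical and only illustrate the leading `/` from the hint; `files_download_to_file` is the SDK call that writes a remote file to disk.

```python
def dropbox_path(folder: str, filename: str) -> str:
    """Build an API path with the required leading slash."""
    return "/" + folder.strip("/") + "/" + filename.lstrip("/")


def download_csv(token: str, remote_path: str, local_path: str) -> None:
    """Save the remote file at remote_path to local_path."""
    # Imported here so the path helper can be used without the SDK installed.
    import dropbox

    dbx = dropbox.Dropbox(token)
    dbx.files_download_to_file(local_path, remote_path)
```

A call would then look like `download_csv(token, dropbox_path("recruitment_project", "enroll_data.csv"), "enroll_data.csv")`, where `token` is your own access token.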
Anonymize
- Human subjects sign a consent form to participate in a research study. Your task is to disguise the date of consent to protect their privacy. First, modify the csv by replacing the given dates with ones earlier than the year 1925. You must use a random number of days (offset) for each subject so that the original dates cannot be traced back. Second, replace the birth date column with age in years at the original date of consent. Save the modified csv as enroll_data_anon_{your_initials}.csv.

  Hint
  enroll_data_anon_{your_initials}.csv should look like:

  |   | site ID | date of consent | cohort | age |
  | --- | --- | --- | --- | --- |
  | 1 | BWH | 8/13/1924 | CHR | 45 |
  | . | ... | ... | ... | ... |

- For us to conveniently check your work, save the offset in a file enroll_data_offset_{your_initials}.csv.

  Hint
  enroll_data_offset_{your_initials}.csv should look like:

  |   | days_offset |
  | --- | --- |
  | 1 | 35041 |
  | 2 | 35049 |
  | 3 | 35055 |
  | . | ... |
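One possible way to approach the anonymization, sketched with the standard library only. It assumes the input dates are written as M/D/YYYY (the format seen in the sample output; adjust `DATE_FMT` if the real file differs). It also assumes that any random extra shift of 1 to 3650 days past the cutoff is acceptable; that range is an arbitrary choice, not something the task specifies. All function names are hypothetical, and the leading row-index column shown in the samples is omitted here.

```python
import csv
import random
from datetime import date, datetime, timedelta
from typing import Tuple

DATE_FMT = "%m/%d/%Y"      # assumed input format, e.g. 8/13/2020
CUTOFF = date(1925, 1, 1)  # shifted dates must fall before this day


def age_in_years(birth: date, on: date) -> int:
    """Whole years elapsed between birth and the reference date."""
    before_birthday = (on.month, on.day) < (birth.month, birth.day)
    return on.year - birth.year - before_birthday


def shift_before_cutoff(consent: date, rng: random.Random) -> Tuple[date, int]:
    """Return (shifted date, offset in days) with the shifted date before CUTOFF."""
    minimum = max((consent - CUTOFF).days, 0)  # days needed to reach the cutoff
    offset = minimum + rng.randint(1, 3650)    # random extra shift (assumed range)
    return consent - timedelta(days=offset), offset


def fmt(d: date) -> str:
    """Format as M/D/YYYY without zero padding, as in the sample."""
    return f"{d.month}/{d.day}/{d.year}"


def anonymize(in_path: str, anon_path: str, offset_path: str) -> None:
    """Write the anonymized csv and the per-subject offsets."""
    rng = random.Random()
    with open(in_path, newline="") as src, \
            open(anon_path, "w", newline="") as anon, \
            open(offset_path, "w", newline="") as off:
        anon_writer = csv.writer(anon)
        off_writer = csv.writer(off)
        anon_writer.writerow(["site ID", "date of consent", "cohort", "age"])
        off_writer.writerow(["days_offset"])
        for row in csv.DictReader(src):
            consent = datetime.strptime(row["date of consent"], DATE_FMT).date()
            birth = datetime.strptime(row["birth date"], DATE_FMT).date()
            shifted, offset = shift_before_cutoff(consent, rng)
            anon_writer.writerow(
                [row["site ID"], fmt(shifted), row["cohort"], age_in_years(birth, consent)]
            )
            off_writer.writerow([offset])
```

Age is computed from the original dates before the shift is applied, since the task asks for age at the original date of consent.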
Upload
- Push the two csvs back to dropbox using their API.
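The upload can be sketched the same way, again assuming the official `dropbox` SDK and an access token of your own. `output_names` and `upload_csv` are hypothetical helper names; `files_upload` and `dropbox.files.WriteMode.overwrite` are the calls used in the SDK's updown.py example.

```python
def output_names(initials: str) -> list:
    """File names for the two result csvs."""
    return [
        f"enroll_data_anon_{initials}.csv",
        f"enroll_data_offset_{initials}.csv",
    ]


def upload_csv(token: str, local_path: str, remote_path: str) -> None:
    """Upload local_path to remote_path, replacing any existing file."""
    # Imported here so output_names can be used without the SDK installed.
    import dropbox

    dbx = dropbox.Dropbox(token)
    with open(local_path, "rb") as f:
        dbx.files_upload(
            f.read(), remote_path, mode=dropbox.files.WriteMode.overwrite
        )
```

Each name returned by `output_names` would be uploaded to `/recruitment_project/<name>`, keeping the leading `/` in the remote path.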
Write a web application in JavaScript, HTML, and CSS with:
- a time period filter for the original date of consent
- a drop-down menu for groups: cohort and age
- two visualizations, using the same graphics:
  - X-axis is site, Y-axis is enrollment by cohort
  - X-axis is site, Y-axis is enrollment by age group (in increments of 10 years)
You may use any JavaScript library you want.
Sample Output
- The Y-axis represents the total enrollment after the Group By and Time Period controls are applied.
- There are two cohorts: CHR and HC. Hence, in the following example, there are two legends. When displaying groups by age, add as many legends as the number of groups.
- Numbers overlaid on the bar segments represent the percentage of the entire bar height covered by that segment.
- Hint: for filling segments of a single bar, you can use the SVG elements defs and linearGradient. But you can also plot multiple bars contiguously. If you do the latter and have a hard time calculating coordinates, just flip the X and Y axes, i.e. display the enrollment on the X-axis and sites on the Y-axis.
- The Time Period filter is over the original date of consent. This task is different from anonymization, so we ask that you use the originals.
- The sample shows fictitious numbers and site names; do not let them confuse you.
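The arithmetic behind the bar segments does not depend on the language. It is sketched here in Python, to match the data-preparation code, and would need porting to JavaScript for the web application. The bin labels (for example `40-49`) and the function names are assumptions, and the sample rows in the usage note are made up for illustration.

```python
from collections import Counter, defaultdict


def age_group(age: int) -> str:
    """Label a 10-year bin, e.g. 45 -> '40-49' (assumed bin boundaries)."""
    low = (age // 10) * 10
    return f"{low}-{low + 9}"


def segment_percentages(rows, group_of):
    """For each site, the share of the bar covered by each group, in percent.

    rows is an iterable of (site, value) pairs that already passed the
    time period filter; group_of maps a value to its group label.
    """
    counts = defaultdict(Counter)
    for site, value in rows:
        counts[site][group_of(value)] += 1

    result = {}
    for site, groups in counts.items():
        total = sum(groups.values())  # full bar height for this site
        result[site] = {g: round(100 * n / total, 1) for g, n in groups.items()}
    return result
```

For instance, `segment_percentages([("BWH", 45), ("BWH", 41), ("BWH", 23), ("BWH", 38)], age_group)` gives 50.0 for `40-49` and 25.0 each for `20-29` and `30-39`. Grouping by cohort works the same way with `group_of=lambda cohort: cohort`.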
- Make a GitHub repo and push up all your work
- Write a skeletal README.md, as you would for any GitHub repo, so that a remote user can get a sense of your project. You can also describe any limitations you faced or anything new you learnt while doing this project.
- Launch a public webpage from GitHub so that Task 2 can be viewed from the internet.
- Good coding practice (comments, spacing, etc.) is recommended but not required.
- You are expected to navigate through this project on your own. But do reach out to us via the comment box below if anything is unclear or if you are stuck on any part of the project.