Health-AI Ethics Atlas - Requirements #29
Replies: 11 comments 16 replies
-
Hi,
-
Hi, I am Mitali. I am very interested in this project. Kindly guide me on the next steps.
-
There is no specific reason to use only D3 and Leaflet. The plan is to decide based on the final group discussion, and we are open to suggestions.
-
Let's discuss: Web Scraping Task
-
Hello @selenbw, I'd like to discuss how we can ensure that our web scraping approach prioritizes trustworthy sources for AI ethics guidelines and policies in healthcare and medicine. Are there specific websites or databases known for hosting authoritative content in this area? Also, should we concentrate on scraping particular websites, or gather data from a wide range of sources to build a comprehensive dataset?

I've been exploring web scraping techniques for this task. I experimented with BeautifulSoup and spaCy on PubMed, UNESCO, and IEEE, but ran into an obstacle: certain websites have measures in place to deter web scrapers. Despite this, I made progress, particularly with PubMed. The script I've been refining retrieves the HTML of PubMed search results, parses it with BeautifulSoup, and extracts details such as guideline names, organizations, publication dates, and summaries. I also integrated spaCy for text processing.

There are still areas where the script can be improved. Do you have any suggestions on how we should handle websites that block web scrapers?
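One possible way to approach the blocking question above, offered as a sketch under assumptions and not as the script described in this comment: respect each site's robots.txt before fetching, and for PubMed query NCBI's E-utilities `esearch` endpoint instead of parsing search-result HTML. This swaps HTML scraping for an official API where one exists. The helper names `build_pubmed_search_url` and `is_allowed`, the sample robots.txt text, and the `ethics-atlas-bot` user agent are hypothetical and only for illustration. Nothing here makes a network request.

```python
from urllib.parse import urlencode
from urllib.robotparser import RobotFileParser

# NCBI E-utilities search endpoint (public API for PubMed queries).
EUTILS_ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_pubmed_search_url(term: str, retmax: int = 20) -> str:
    """Build an esearch URL that returns matching PubMed IDs as JSON.

    Hypothetical helper: it only constructs the URL, it does not fetch it.
    """
    params = {"db": "pubmed", "term": term, "retmax": retmax, "retmode": "json"}
    return f"{EUTILS_ESEARCH}?{urlencode(params)}"


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text permits user_agent to fetch url.

    Hypothetical helper: the caller is assumed to have downloaded robots.txt.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Made-up robots.txt content, used only to illustrate the check.
SAMPLE_ROBOTS = "User-agent: *\nDisallow: /private/\n"

search_url = build_pubmed_search_url("AI ethics guidelines")
open_page_ok = is_allowed(SAMPLE_ROBOTS, "ethics-atlas-bot", "https://example.org/guidelines")
private_page_ok = is_allowed(SAMPLE_ROBOTS, "ethics-atlas-bot", "https://example.org/private/x")
```

If a site disallows a path, the simplest option is to skip it or look for an official API or data export, rather than trying to work around the restriction.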
-
Hello everyone, I am Atharv Sabde, and I am very excited to be part of this project.
-
Hello all,
-
Hello @selenbw! I have ethically scraped one of the toughest platforms: with 2 crawlers, I was able to fetch nearly 200,000 company profiles along with 16 different details for each company, all in two steps. You can find my repo for it here: Linkedin Company Directory Scraping System. I am new to GSoC and would like to know if I still have time to submit a proposal for this project. I LOVE ETHICS!
-
Hello everyone,
Email: Khyati9505@gmail.com
-
Hello,
-
This project wasn't selected for GSoC, but my team will continue working on it. We're aiming for a conference or journal paper and plan to finish in 3 months. It will require consistent commitment. If you're seriously interested in joining the team, please contact me at selen.bozkurt@emory.edu.
-
Is there any specific reason to mention only D3 and Leaflet? I think we can use deck.gl. The only downside is the heavy memory demand on clients' machines to render a map, which results in less mobile browser compatibility. Open to discussion.