web-scraping-country-music_hof-SudeepGhantasala created by GitHub Classroom

SudeepGhantasala/Web-Scraping-Country-Music-Hall-of-Fame

Web Scraping the Country Music Hall of Fame

Used BeautifulSoup in Python to scrape the Country Music Hall of Fame website and build a dataset of all inductees. The questions below were answered as part of an assignment at Nashville Software School.

This project involved scraping multiple pages of the members listing, which starts at the following link: https://countrymusichalloffame.org/hall-of-fame/members/

  1. Start by using either the inspector or by viewing the page source. Can you identify a tag that might be helpful for finding the names of all inductees? Make use of this to create a list containing just the names of each inductee.

  2. Next, try to find a tag that could be used to find the year that each member was inducted. Extract these into a list. When you do this, be sure to only include the year and not the full text. For example, for Roy Acuff, the list entry should be "1962" and not "Inducted 1962". Double-check that the resulting list has the correct number of elements and is in the same order as your inductees list.

  3. Take the two lists you created in parts 1 and 2 and convert them into a pandas DataFrame.

  4. Now, you need to take what you created for the first page and apply it across the rest of the pages so that you can scrape all inductees. Notice that when you click the next page button at the bottom of the page, the URL changes to "https://countrymusichalloffame.org/hall-of-fame/members/page/2". Check that the code that you wrote for the first page still works for page 2. Once you have verified that your code will still work, write a for loop that will cycle through all 16 pages and build a DataFrame containing all inductees and the year of their induction.

  5. Create a visual using the data that you scraped. Prepare a short (<5 minute) presentation.

  6. Bonus Question: If you navigate to Roy Acuff's page, you will see that his date of birth and date of death are listed towards the top of the page, along with his birthplace. Write some code that will extract these three values. Once you get it working for Roy Acuff, figure out how you can automatically extract these values across the whole dataset of artists. In doing this, you'll need to figure out a way to automatically determine the correct URLs for each artist. Note also that not every artist will have these three values, so write your code in a way that it can handle cases where these values are missing. Alabama is one such example.
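Steps 1 to 3 could be sketched roughly as follows. This is an illustration only, not the code from this repository: the HTML snippet is made up, the tag and class names (`h3.name`, `p.year`) are assumptions, "Example Artist" is a placeholder, and `parse_inductees` is a hypothetical helper. The real page's markup has to be checked with the inspector first.

```python
import re

from bs4 import BeautifulSoup

# Made-up markup standing in for one listing page. The tags and class
# names are assumptions; the real site's structure will differ.
SAMPLE_HTML = """
<div class="inductee"><h3 class="name">Roy Acuff</h3><p class="year">Inducted 1962</p></div>
<div class="inductee"><h3 class="name">Example Artist</h3><p class="year">Inducted 2000</p></div>
"""


def parse_inductees(html):
    """Return (names, years) from one page of (hypothetical) listing markup."""
    soup = BeautifulSoup(html, "html.parser")
    names = [tag.get_text(strip=True) for tag in soup.find_all("h3", class_="name")]
    years = []
    for tag in soup.find_all("p", class_="year"):
        # Keep only the four-digit year, e.g. "Inducted 1962" -> "1962".
        match = re.search(r"\d{4}", tag.get_text())
        years.append(match.group() if match else None)
    # Step 2 asks to double-check that both lists line up.
    if len(names) != len(years):
        raise ValueError("names and years have different lengths")
    return names, years


names, years = parse_inductees(SAMPLE_HTML)
records = [{"inductee": n, "year": y} for n, y in zip(names, years)]
print(records)
```

For step 3, passing `records` to `pandas.DataFrame(records)` would give a DataFrame with one row per inductee.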
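The loop in step 4 might look something like the sketch below. It only builds the page URLs in the pattern quoted above and leaves downloading and parsing to functions passed in by the caller. The names `page_url`, `scrape_all`, `fetch_page`, and `parse_page` are hypothetical, and the demo uses a stand-in fetcher so nothing is actually requested from the site.

```python
BASE_URL = "https://countrymusichalloffame.org/hall-of-fame/members/"


def page_url(page):
    """Page 1 is the base URL; later pages append page/<n>."""
    return BASE_URL if page == 1 else f"{BASE_URL}page/{page}"


def scrape_all(fetch_page, parse_page, last_page=16):
    """Collect the rows from every listing page, in page order.

    fetch_page(url) should return HTML text and parse_page(html) should
    return a list of rows; both are supplied by the caller.
    """
    rows = []
    for page in range(1, last_page + 1):
        html = fetch_page(page_url(page))
        rows.extend(parse_page(html))
    return rows


# Demo with stand-ins: record which URLs would be visited, parse nothing.
visited = []


def fake_fetch(url):
    visited.append(url)
    return ""


rows = scrape_all(fake_fetch, lambda html: [], last_page=3)
print(visited)
```

In a real run, `fetch_page` could be something like `lambda url: requests.get(url).text`, and `parse_page` would be whatever code was verified on pages 1 and 2.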
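For the bonus question, one way to cope with artists whose page lacks some of the three values is to look each one up separately and fall back to `None` when it is absent. The sketch below assumes made-up markup (`span` elements with classes `birth`, `death`, and `birthplace`) and placeholder values; `text_or_none` and `parse_artist` are hypothetical helpers, and the real artist pages need to be inspected to find the right selectors.

```python
from bs4 import BeautifulSoup

# Made-up artist pages with placeholder values; the real markup will differ.
FULL_PAGE = """
<span class="birth">January 1, 1900</span>
<span class="death">January 1, 1990</span>
<span class="birthplace">Example Town, Tennessee</span>
"""
GROUP_PAGE = "<h1>Example Group</h1>"  # no birth, death, or birthplace listed


def text_or_none(soup, tag, class_name):
    """Return the stripped text of the first match, or None if there is none."""
    found = soup.find(tag, class_=class_name)
    return found.get_text(strip=True) if found else None


def parse_artist(html):
    """Extract the three values, leaving missing ones as None."""
    soup = BeautifulSoup(html, "html.parser")
    return {
        "birth": text_or_none(soup, "span", "birth"),
        "death": text_or_none(soup, "span", "death"),
        "birthplace": text_or_none(soup, "span", "birthplace"),
    }


full = parse_artist(FULL_PAGE)
group = parse_artist(GROUP_PAGE)
print(full)
print(group)
```

Keeping missing values as `None` means every artist yields a row with the same three keys, which makes it straightforward to combine the results with the inductee DataFrame later.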
