nikoelvambuena95/Web-Scrape-Web-App---Live-Mars-Data

Mars Web Application

This repo highlights the following three data skills:

  1. Web scraping "Mars" websites
  2. Storing data with MongoDB
  3. Building a web application through Flask

1. Web scraping

Four different websites were scraped using the open-source tool Splinter to automate browser actions...

from splinter import Browser
from webdriver_manager.chrome import ChromeDriverManager

executable_path = {'executable_path': ChromeDriverManager().install()}
browser = Browser('chrome', **executable_path, headless=False)

and the Python package Beautiful Soup to parse the HTML...

from bs4 import BeautifulSoup as bs

url = "https://mars.nasa.gov/news/"
browser.visit(url)
html = browser.html
news_site = bs(html, 'html.parser')

to extract the relevant data, in this case the latest Mars news headline and its first paragraph.

result = news_site.find('div', class_ = 'list_text')
news_title = result.find('a').text
news_para = result.find('div', class_ = 'article_teaser_body').text
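
The lookups above raise an AttributeError if the page layout changes and find() returns None. A minimal sketch of a more defensive version follows; the helper name parse_latest_news and the SAMPLE_HTML markup are hypothetical, chosen only to mirror the class names used above so the example runs without a browser.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup that mirrors the class names used above;
# the live page's structure may differ or change over time.
SAMPLE_HTML = """
<div class="list_text">
  <div class="content_title"><a href="/news/1">Sample headline</a></div>
  <div class="article_teaser_body">Sample teaser paragraph.</div>
</div>
"""


def parse_latest_news(html):
    """Return (title, paragraph) for the first news item, or (None, None)."""
    soup = BeautifulSoup(html, "html.parser")
    result = soup.find("div", class_="list_text")
    if result is None:
        # No news item found, so there is nothing to extract.
        return None, None
    link = result.find("a")
    teaser = result.find("div", class_="article_teaser_body")
    title = link.get_text(strip=True) if link else None
    para = teaser.get_text(strip=True) if teaser else None
    return title, para


news_title, news_para = parse_latest_news(SAMPLE_HTML)
```

Returning None instead of raising lets the caller decide whether to keep the previously stored values when a scrape comes back empty.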

The scraping logic is wrapped in a function, scrape(), in the scrape_mars.py file and is called from the app.py file through the Flask route "/scrape".

2. Storing Data

In the app.py file, I connect to a local MongoDB database...

from flask_pymongo import PyMongo

mongo = PyMongo(app, uri="mongodb://localhost:27017/mars_app")

then store the scraped Mars data in the database, or update it if it already exists, from within a route.

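@app.route("/scrape")
def scrape():
    # Run the scraper and get the latest Mars data as a dictionary
    mars_data = scrape_mars.scrape()
    # Replace the stored document, inserting it if the collection is empty
    # (Collection.update was removed in PyMongo 4)
    mongo.db.collection.replace_one({}, mars_data, upsert=True)
    # A Flask view must return a response; send the browser back to the home page
    return redirect("/")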
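
The store-or-update step in the route above keeps a single document in the collection: the first scrape inserts it and every later scrape overwrites it. A minimal sketch of that pattern, written against PyMongo's replace_one(filter, replacement, upsert=True) call, is shown below. InMemoryCollection and save_mars_data are hypothetical names, and the in-memory class is only a stand-in so the example runs without a MongoDB server; with a real database the same helper would receive mongo.db.collection.

```python
class InMemoryCollection:
    """Hypothetical stand-in exposing only replace_one and find_one."""

    def __init__(self):
        self.docs = []

    def replace_one(self, filter, replacement, upsert=False):
        # Only the empty filter {} is modelled: it matches the first document.
        if self.docs:
            self.docs[0] = dict(replacement)
        elif upsert:
            self.docs.append(dict(replacement))

    def find_one(self):
        return self.docs[0] if self.docs else None


def save_mars_data(collection, mars_data):
    """Overwrite the single stored document, inserting it on the first call."""
    collection.replace_one({}, mars_data, upsert=True)


collection = InMemoryCollection()
save_mars_data(collection, {"news_title": "First scrape"})
save_mars_data(collection, {"news_title": "Second scrape"})
```

After both calls the collection still holds one document, containing the data from the second scrape.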

3. Web Application

Finally, Flask renders the HTML page that displays the stored data.

from flask import Flask, render_template, redirect

@app.route("/")
def home():

    data = mongo.db.collection.find_one()
    return render_template("index.html", mars_data = data)
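
The route above can be exercised without a running server or database. The sketch below is one way to do that, under several assumptions: STORED_DATA stands in for the document returned by find_one(), the keys news_title and news_para are guesses at how the scraped values are stored, and the inline PAGE template replaces index.html.

```python
from flask import Flask, render_template_string

app = Flask(__name__)

# Hypothetical stand-in for the document returned by find_one().
STORED_DATA = {"news_title": "Sample headline", "news_para": "Sample teaser."}

# Hypothetical inline template; the repo renders index.html instead.
PAGE = "<h1>{{ mars_data.news_title }}</h1><p>{{ mars_data.news_para }}</p>"


@app.route("/")
def home():
    # Pass the stored document to the template under the name mars_data.
    return render_template_string(PAGE, mars_data=STORED_DATA)


# Exercise the route with Flask's built-in test client.
client = app.test_client()
response = client.get("/")
body = response.get_data(as_text=True)
```

The template reads values from mars_data by key, so index.html only needs to know the key names that scrape() returns.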

Considerations

I had trouble displaying the dataframe of Mars facts. After some tinkering, I realized that a likely cause was the class attribute of the generated table.

<table border="1" class="dataframe table">

The easiest fix I could think of was editing the table's HTML directly in index.html.

<table border="1" class="table">
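
An alternative to editing index.html by hand is to adjust the class attribute in Python before the table is stored. The sketch below assumes the table was produced with pandas' DataFrame.to_html, which adds a default "dataframe" class in front of any classes passed in; the sample rows are placeholders, not real Mars facts.

```python
import pandas as pd

# Hypothetical sample facts; the real values come from the scraped site.
facts = pd.DataFrame(
    {"Description": ["Sample fact A", "Sample fact B"], "Value": ["1", "2"]}
)

# to_html prepends its default "dataframe" class to any classes passed in.
raw_html = facts.to_html(classes="table", index=False)

# Drop the default class so only "table" remains.
clean_html = raw_html.replace('class="dataframe table"', 'class="table"')
```

Doing the replacement in the scraper means the fix survives every re-scrape, whereas a manual edit to a pasted table would need repeating.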

Source Sites


Contact

LinkedIn | https://www.linkedin.com/in/niko-elvambuena/
Email | niko.elvambuena95@gmail.com