- Julia Gajda
- Kathleen Graham
- Tamara Najjar
Our project was originally inspired by this shortened clip from an episode of The Newsroom. At the 1:09 mark, the character rattles off a series of statistics comparing the United States with the rest of the world. His delivery is certainly abrasive, but he makes some interesting points about factual information. There is no such thing as perfect data, but we thought we'd take a closer look at a few of the statistics he discussed. Eventually, we decided that plotting military data from around the world would be especially interesting. We also wanted to plot a food or other consumer product and one more trending topic, such as billionaires around the world.
We wanted a visualization with three layers on one map and three base views: light, dark, and satellite. After researching our chosen topics, we had trouble getting data, so our final layers show wine consumption in liters by country, total number of Olympic medals won, and number of overseas military bases. The U.S. leads in all three, which surprised us in the case of wine consumption. Our steps were as follows:
FIND THE DATA! This is never as easy as it sounds. We were able to find a PDF containing wine consumption data and a few different sites on Olympic medal and international military bases data that we could scrape. Throughout the course of this project, we came across more and more helpful resources. Each resource will be referenced at the appropriate step.
We were able to find a PDF containing wine consumption data for 2015-2017 from the Wine Institute. However, we needed to find a way to convert that data from a PDF into a CSV so we could use it in our code. We used PDF Element to do just that. The extraction did most of the heavy lifting so there wasn't quite as much cleaning to do in the CSV after that.
We originally wanted to plot all the billionaires around the world but ran into some difficulties. Both Forbes and Bloomberg had lists that were nearly impossible to scrape. There was no visible body in the HTML. It was linked to a private directory that we could not access, so we resorted to a different topic - Summer Olympic Medals Won by Country.
We were able to scrape the Olympic medal data, but converting it to a CSV directly from Jupyter Notebook was not working properly, so we exported to an .xlsx file and then saved as a CSV before changing it to geojson.
import requests
import pandas as pd
from splinter import Browser
from bs4 import BeautifulSoup as bs
executable_path = {'executable_path': '../chromedriver.exe'}
browser = Browser('chrome', **executable_path, headless=False)
url = 'https://www.worldatlas.com/articles/countries-with-the-most-olympic-medals.html'
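# read_html returns a list with one DataFrame per <table> on the page
table = pd.read_html(url)
# the medal table is the first one
df = table[0]
# export to Excel (requires the xlsxwriter package)
writer = pd.ExcelWriter('olympics.xlsx', engine='xlsxwriter')
df.to_excel(writer, sheet_name='List')
writer.close()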
We were able to scrape International Military Bases by Country data from Wikipedia. This was the most difficult site to scrape because Wikipedia has multiple contributors that can alter the HTML. When inspecting the HTML, we found that not all the countries/bases were in the same div or unordered list so it was difficult to iterate through and return the desired results. We found a suitable workaround but it took quite some time.
We set up our military bases Jupyter notebook file.
import requests
import pandas as pd
from splinter import Browser
from bs4 import BeautifulSoup as bs
executable_path = {'executable_path': '../chromedriver.exe'}
browser = Browser('chrome', **executable_path, headless=False)
We accessed the URL and parsed the HTML with Beautiful Soup. We found a common element among the entries we wanted: a span containing the flag images. From there we worked our way back up with `.parent` to get the names of the countries that had overseas bases, deleting the last parent element with `.pop()` because it was not actually one of the countries.
Then, with `.find_next('a')`, we were able to scrape the names of the countries where the overseas bases are located. There were some special cases, such as the unordered list of Turkey's overseas bases, that had lists of lists, so we had to clean up the data by appending and inserting base names where appropriate.
We turned these two lists into a dataframe with pandas.
To check for correctness, we inspected the count of overseas bases for each country.
Finally, we saved to a CSV file. Not every base had a name or any other details, so later we went back and manually added more information about each military base since we wanted correct data for all bases, not just the bases with the most information available.
military_base_df.to_csv('military_bases.csv')
We discovered that local GeoJSON files don't always behave the same as GeoJSON files loaded from a URL. Through the Leaflet Choropleth tutorial, we figured out how to add a script to our HTML that defines a variable holding the GeoJSON outlines of all the countries in the world. We then used that variable to create the GeoJSON layer in our logic.js file.
This turned out to give us a lot more control over what was put on our map in three different layers. When we wanted to add more data to the geojson file, we were able to manipulate it using a website called geojson.io. We added references to the appropriate latitude and longitude, names of bases, and even images of little flag icons that could display in a popup or tooltip. We even converted to geojson from CSV.
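The CSV-to-GeoJSON step can be sketched in a few lines. This is only an illustration: the `lat` and `lon` column names, the helper name `rowsToGeoJSON`, and the sample row are assumptions, not the contents of our actual file. The one rule that is fixed is that GeoJSON stores coordinates as [longitude, latitude], the reverse of Leaflet's [lat, lng].

```javascript
// Minimal sketch: turn parsed CSV rows into a GeoJSON FeatureCollection.
// The lat/lon column names are assumptions for illustration.
function rowsToGeoJSON(rows) {
  return {
    type: 'FeatureCollection',
    features: rows.map(({ lat, lon, ...properties }) => ({
      type: 'Feature',
      // GeoJSON order is [longitude, latitude]
      geometry: { type: 'Point', coordinates: [Number(lon), Number(lat)] },
      // every remaining column becomes a feature property
      properties
    }))
  };
}

// Hypothetical example row (made-up values, not from our data set)
const sample = [
  { country: 'Exampleland', base_name: 'Example Base', lat: '10.5', lon: '-20.25' }
];
const militaryGeoJSON = rowsToGeoJSON(sample);
console.log(JSON.stringify(militaryGeoJSON, null, 2));
```

Saving the result as `const militaryData = {...}` in a .js file would match the script-variable approach described above.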
Leaflet.js has become one of our favorite visualization tools. The interactivity is really fun, especially when you get it to work as you envisioned. Reading through Leaflet's documentation helped us come up with some even better ways of visualizing multiple layers at once.
First, we created the base layers through the Mapbox API and included images in the layer names by adding HTML image tags. The three views we chose were `mapbox.satellite`, `mapbox.light`, and `mapbox.dark`. These layers were collected in an object called `baseMaps` and weren't added to the map until all the map overlays were ready.
// link to maps with api in config.js
const mapboxLink = 'https://api.tiles.mapbox.com/v4/{id}/{z}/{x}/{y}.png?access_token={accessToken}';
// create satellite map layer
const satmap = L.tileLayer(mapboxLink,{
attribution: attribution,
maxZoom: 18,
id: 'mapbox.satellite',
accessToken: API_KEY
});
// create light map layer
const lightmap = L.tileLayer(mapboxLink,{
attribution: attribution,
maxZoom: 18,
id: 'mapbox.light',
accessToken: API_KEY
});
// create dark map layer
const darkmap = L.tileLayer(mapboxLink,{
attribution: attribution,
maxZoom: 18,
id: 'mapbox.dark',
accessToken: API_KEY
});
// create basemap layer with the other maps
const baseMaps = {
"<span> Satellite Map <img class='layer-img' src='../images/satellite.jpg'/></span>": satmap,
"<span> Light Map <img class='layer-img' src='../images/lightmap.jpg'/></span>": lightmap,
"<span> Dark Map <img class='layer-img' src='../images/darkmap.jpg'/></span>": darkmap
};
Later, we came back to this point in our logic.js file and added variables for our layers before any functions because we kept getting errors in the console about layers not being defined yet. This was the best place to create them all at once.
// make variables for mapOverlay layers to adjust later
var wineLayer, olympicsLayer, militaryLayer;
Next, we began to create our different layers that would overlap the base layers.
Our first map overlay was a choropleth layer with Wine Consumption by Country, and the first function we made was the `countryColor()` function, which uses a 5-class sequential color scheme from ColorBrewer. This took a little while to get right, but we finally decided on just 5 colors with breaks at powers of ten from 10^2 to 10^6.
// set countryColor based on consumption of wine
function countryColor(d) {
return d > 1000000 ? '#016c59' :
d > 100000 ? '#1c9099' :
d > 10000 ? '#67a9cf' :
d > 1000 ? '#bdc9e1' :
d > 100 ? '#f6eff7' :
'white';
}
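One possible refinement, sketched here rather than taken from our logic.js, is to keep the breaks and colors in a single table so the color function and any legend read from the same source. The names `WINE_BINS` and `colorFor` are hypothetical; the thresholds and hex values are the ones used in `countryColor()`.

```javascript
// Sketch: one table of breaks and colors, highest threshold first.
// WINE_BINS and colorFor are illustrative names, not from our logic.js.
const WINE_BINS = [
  { above: 1000000, color: '#016c59' },
  { above: 100000,  color: '#1c9099' },
  { above: 10000,   color: '#67a9cf' },
  { above: 1000,    color: '#bdc9e1' },
  { above: 100,     color: '#f6eff7' }
];

function colorFor(value, bins = WINE_BINS, fallback = 'white') {
  // the first bin whose threshold the value exceeds wins
  const bin = bins.find(b => value > b.above);
  return bin ? bin.color : fallback;
}

console.log(colorFor(250000)); // #1c9099
```

Countries with missing data fall through to the fallback color, just as they do in the ternary chain.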
We used the `countryColor()` function inside `style(feature)`, which styles each feature. Originally, we had all the countries and outlines brought to the front when hovering, as shown in the `highlightFeature(e)` function, but we found that this covered the Olympic layer markers when both layers were checked. Eventually, we decided to go back and comment out the section of this function that brings the country to the front until we can figure out a better solution. This messes up the borders of the countries when highlighting, but that is less noticeable than not being able to look at all the data at once when selecting all layers. Whenever we fix this function in the future, the `resetHighlight(e)` function will still work exactly as it should, since it simply resets whatever was changed in `highlightFeature(e)`. These functions also update the `info` legend with more information depending on which country is being hovered over and highlighted. The legend was made later, and I'm still confused
// fxn for filling in the countries
function style(feature) {
return {
weight: 2,
opacity: 1,
color: 'white',
dashArray: '3',
fillOpacity: 0.5,
fillColor: countryColor(feature.properties.wineConsumption)
};
}
// fxn for highlighting outline of country on hover
function highlightFeature(e) {
let layer = e.target;
layer.setStyle({
weight: 5,
color: '#666',
dashArray: '',
fillOpacity: 0.7
});
// don't want to bring to front because it covers up the olympic circles when both layers checked
// if (!L.Browser.ie && !L.Browser.opera && !L.Browser.edge) {
// layer.bringToFront();
// }
info.update(layer.feature.properties);
}
// fxn to reset the outline of countries when not hovering anymore
function resetHighlight(e) {
wineLayer.resetStyle(e.target);
info.update();
}
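One approach we have not tested in our map is to bring the hovered country to the front and then immediately re-raise the marker layer, so the circles end up on top again. The sketch below assumes both layers are drawn by the same renderer and uses only `bringToFront()` and `hasLayer()`, which Leaflet provides on paths, feature groups, and the map. The helper name `highlightKeepingMarkersOnTop` is hypothetical, and the browser check from the commented-out code is left out for brevity.

```javascript
// Sketch: raise the hovered country, then raise the markers above it.
// highlightKeepingMarkersOnTop is a hypothetical helper name.
function highlightKeepingMarkersOnTop(countryLayer, markerGroup, map) {
  countryLayer.bringToFront();
  // only re-raise the markers if that overlay is currently on the map
  if (markerGroup && map.hasLayer(markerGroup)) {
    markerGroup.bringToFront();
  }
}

// Intended use inside highlightFeature(e) (needs Leaflet, so commented out):
// highlightKeepingMarkersOnTop(e.target, olympicsLayer, myMap);
```

Leaflet's custom map panes (`map.createPane()` plus the `pane` layer option) are another route worth exploring, since they fix the stacking order once instead of on every hover.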
One of our favorite functions we found in Leaflet's documentation was `zoomToFeature(e)`. This is boilerplate, but it's so cool to include it and see it in action!
// fxn to zoom in to each country once clicked
function zoomToFeature(e) {
myMap.fitBounds(e.target.getBounds());
}
Our last function is important because it's what brings all these features together. The name `onEachFeature()` comes from Leaflet's documentation, but we only included the parameters and functions inside that we wanted. Although the signature takes both `feature` and `layer`, `layer` is the only parameter we actually use. Even though we're only using this function on one specific layer, `wineLayer`, we wanted to allow the function to be used on other layers that would have the same functionality if we decided to extend this project to include more data points (such as multiple layers for the years of data we have besides 2017).
The call to `layer.on()` registers our three functions (`highlightFeature`, `resetHighlight`, and `zoomToFeature`) so that upon hovering over a country (`mouseover`), it highlights; upon `mouseout`, it resets; and upon clicking (`click`), the map zooms in on the country that was clicked.
// fxn to bring all previous feature fxns together
function onEachFeature(feature,layer) {
layer.on({
mouseover: highlightFeature,
mouseout: resetHighlight,
click: zoomToFeature
});
}
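As written, `resetHighlight` refers to `wineLayer` directly, so reusing `onEachFeature` on another layer would need a small change. One way to do that, sketched here under our own assumptions, is a factory that receives the layer group, the map, and the hover callback. The name `makeOnEachFeature` and its arguments are hypothetical; the getters exist because `wineLayer`, `myMap`, and `info` are all assigned later in the file.

```javascript
// Sketch: build an onEachFeature handler for any GeoJSON layer.
// makeOnEachFeature, getGroup, getMap and onHover are hypothetical names.
function makeOnEachFeature({ getGroup, getMap, onHover }) {
  return function onEachFeature(feature, layer) {
    layer.on({
      mouseover: e => {
        // same highlight style as highlightFeature(e)
        e.target.setStyle({ weight: 5, color: '#666', dashArray: '', fillOpacity: 0.7 });
        onHover(e.target.feature.properties);
      },
      mouseout: e => {
        // look the group up lazily, then let it restore the default style
        getGroup().resetStyle(e.target);
        onHover();
      },
      click: e => getMap().fitBounds(e.target.getBounds())
    });
  };
}

// Possible wiring in logic.js (needs Leaflet, so commented out):
// const wineHandlers = makeOnEachFeature({
//   getGroup: () => wineLayer,
//   getMap: () => myMap,
//   onHover: props => info.update(props)
// });
```

A second layer, for example one per year of wine data, could then call the factory again with its own group.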
Last for this layer, we created the layer itself using `L.geoJson` and referencing `wineData` as the first parameter. As mentioned briefly before, this was our GeoJSON data for wine consumption saved as a single variable in .js format that could then be loaded in our index.html. The second parameter when creating this layer included just two main functions, `style` and `onEachFeature`.
// create wine layer that includes styling on three features:
// highlight and resethighlight when hovering or not, and click to zoom
wineLayer = L.geoJson(wineData, {
style: style,
onEachFeature: onEachFeature
});
Next, we moved to our layer with the total number of Summer Olympic medals won by country. This layer was less complicated because we just wanted colorful circle markers for each country that could be hovered over to show a Tooltip.
We started with the function `olympicsSize(m)`, which takes the medal count as its parameter and sets the size of the marker based on it. When comparing the circle sizes for the United States (rank 1) and Sweden (rank 9), we can see that Sweden's circle is about the same size even though Sweden has significantly fewer medals (a difference shown only by the color and the Tooltips). We believe this is partly because of the unavoidable distortion of the Mercator projection, which enlarges circles at higher latitudes. The tiered multipliers in the function also contribute, since countries with fewer medals get a larger multiplier.
// create markerSize based on number of medals won
function olympicsSize(m) {
return m > 1000 ? m*150 :
m > 500 ? m*250 :
m > 100 ? m*500 :
m*1000
}
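The two effects can be checked with a little arithmetic. `L.circle` takes its radius in meters, and Web Mercator stretches ground distances by roughly 1 / cos(latitude), so the same radius looks larger farther from the equator. The sketch below copies `olympicsSize` so it runs on its own; `mercatorScale` and `apparentRadius` are hypothetical helper names, and the latitudes are round numbers chosen for illustration rather than values from our data.

```javascript
// Copy of olympicsSize from above, so this sketch runs on its own
function olympicsSize(m) {
  return m > 1000 ? m * 150 :
         m > 500  ? m * 250 :
         m > 100  ? m * 500 :
                    m * 1000;
}

// Web Mercator stretches ground distances by 1 / cos(latitude)
function mercatorScale(latDegrees) {
  return 1 / Math.cos(latDegrees * Math.PI / 180);
}

// Approximate on-screen radius, in equator-equivalent meters
function apparentRadius(medals, latDegrees) {
  return olympicsSize(medals) * mercatorScale(latDegrees);
}

// The tiers are not monotonic: 1000 medals draws larger than 1001
console.log(olympicsSize(1000), olympicsSize(1001)); // 250000 150150
// A circle at 60 degrees north looks about twice as wide as at the equator
console.log(mercatorScale(60).toFixed(2)); // 2.00
```

A single continuous scale, such as one based on the square root of the medal count, would avoid the jumps between tiers, though we have not tried it here.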
We made the `olympicsColor(m)` function with the same parameter, medal count, to choose the color of the marker. The colors we used are recognizable from typical Olympic symbols.
function olympicsColor(m) {
return m > 800 ? '#FBB32E' :
m > 400 ? '#0186C3' :
m > 200 ? '#158C39' :
'#EE304D'
}
Although this layer was a little simpler than the wineLayer, calling all the features correctly was crucial. We used `olympicsData` as the first parameter in `L.geoJson()`, just like the wineLayer. But then we passed in a function option called `pointToLayer`, which takes `feature` and `latlng` as parameters and returns a new circle for each point of data. For each circle, we used `latlng` as the first parameter and then set the `radius` and `fillColor` with the two functions made previously for this layer. We wanted a Tooltip, so we then used `.bindTooltip` and `.openTooltip` to include information about the country and its number of Summer Olympic medals.
// create olympics layer
olympicsLayer = L.geoJson(olympicsData,{
pointToLayer:function(feature,latlng){
return L.circle(latlng,
{radius:olympicsSize(feature.properties.medals),fillColor:olympicsColor(feature.properties.medals),fillOpacity:0.9,stroke:false})
.bindTooltip('<div><h4>'+feature.properties.country+'<br><img class="flag-img" src="'+feature.properties.flag
+'"><hr>Rank: '+feature.properties.rank+'</h4><h5>'
+'<img class="medal-img" src="images/gold-medal.svg">Medals: '+feature.properties.medals+'</h5></div>',{'className': 'medal-tooltip'})
.openTooltip()
}
})
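The long string concatenation in the tooltip can be hard to read. A minimal sketch of the same markup built with a template literal follows; the helper name `medalTooltip` is hypothetical, while the property names (`country`, `flag`, `rank`, `medals`) are the ones used in the layer above.

```javascript
// Sketch: build the tooltip markup with a template literal.
// medalTooltip is a hypothetical helper name.
function medalTooltip({ country, flag, rank, medals }) {
  return `<div><h4>${country}<br><img class="flag-img" src="${flag}"><hr>Rank: ${rank}</h4>` +
         `<h5><img class="medal-img" src="images/gold-medal.svg">Medals: ${medals}</h5></div>`;
}

// Possible use inside pointToLayer (needs Leaflet, so commented out):
// .bindTooltip(medalTooltip(feature.properties), { className: 'medal-tooltip' })
```

Pulling the markup into a function also makes it possible to check the output without loading the map.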
Last, we made a simple military bases layer with tank icons (created in the `pointToLayer` function with `L.marker`) and with Popups (`.bindPopup`) giving more information about the bases and the countries to which they belong.
// define tank icon to be used for markers in military layer
const tankIcon = L.icon({
iconUrl: '../images/tank.svg',
iconSize: [38, 95]
});
// TODO: look up the difference between L.SVG and L.marker with an icon as a parameter
militaryLayer = L.geoJson(militaryData, {
pointToLayer: function (feature, latlng) {
return L.marker(latlng, {icon: tankIcon})
.bindPopup('<h5>'+feature.properties.country+'</h5>'+feature.properties.base_name, {'className': 'tank-popup'});
}
});
We combined the three layers into a variable called `mapOverlay`.
// create overlays
const mapOverlay = {
"<span> Wine Consumption <img class='layer-img' src='../images/glass.svg'/></span>": wineLayer,
"<span> Summer Olympic Medals <img class='layer-img' src='../images/medal.png'/></span>": olympicsLayer,
"<span> Overseas Military Bases <img class='layer-img' src='../images/tank.svg'/></span>": militaryLayer
};
Once we had the base layers and the map overlays, we were able to make the map variable `myMap` and choose what to load by default. We decided to center the map north of the equator (at 45° N) with a zoom of 3, and we wanted the lightmap base layer and the wine layer overlay to load first.
// load lightmap and winelayer as default
const myMap = L.map('map', {
center: [45,0],
zoom: 3,
layers: [lightmap, wineLayer]
});
We wanted the map to show more information depending on which layers were shown. Making these different controls appear and disappear at the appropriate time was the biggest challenge when plotting/mapping with Leaflet.js.
We started with the layer control, the section where the user can choose which baselayers or map overlays to observe. We decided to not allow this control to collapse to allow a user to more easily switch between layers without having to wait for the control to open up again.
// add all map layers to control div
const layerDiv = L.control.layers(baseMaps, mapOverlay, {
collapsed: false
})
layerDiv.addTo(myMap);
We wanted users to be able to see the amount of wine in liters each country consumed whenever hovering over the country. These controls are created here, even though they are used in the hover functions shown previously. The `L.Control` and `L.DomUtil` pieces were boilerplate, but understanding how they work took some studying.
// control that shows country info on hover
let info = L.control({ position: 'bottomleft' });
// add info div to wine layer
info.onAdd = function() {
this._div = L.DomUtil.create('div', 'info');
this.update();
return this._div;
};
// update info div whenever hovering over a country
info.update = function(props) {
this._div.innerHTML = '<h4>World Wine Consumption (2017)</h4>' + (props ?
'<b>' + props.name + '</b><br />' + props.wineConsumption + ' L'
: 'Hover over a country<br><br>');
};
// add info div to myMap for wine layer
info.addTo(myMap);
We also made a legend explaining what the range of colors mean for wine consumption. Again, this was boilerplate from documentation, only changing where necessary to match our own data.
// create wine legend
const legend = L.control({position: 'bottomleft'});
// add function to legend for wine layer
legend.onAdd = function() {
const div = L.DomUtil.create('div', 'legend');
const consumption = [0, 100, 1000, 10000, 100000, 1000000]
// const labels = []
for (let i = 0; i < consumption.length; i++){
div.innerHTML +=
'<i style="background:' + countryColor(consumption[i] + 1) + '"></i> ' +
consumption[i] + (consumption[i + 1] ? '–' + consumption[i + 1] + '<br>' : '+')
}
return div
}
legend.addTo(myMap);
Since we added a legend for the wine consumption layer, we thought it'd be best to also add a legend to explain what the marker colors mean for the number of medals in the Summer Olympic medals layer.
// create olympic legend
const olympicsLegend = L.control({position: 'bottomright'});
// add function to legend for olympics layer
olympicsLegend.onAdd = function() {
const div = L.DomUtil.create('div', 'oLegend');
const medals = [0,200,400,800]
// const labels = []
div.innerHTML = '<h5>Total Summer Olympic Medals<br>Won by Country<br>(up to 2016)</h5>'
for (let i = 0; i < medals.length; i++){
div.innerHTML +=
'<i style="background:' + olympicsColor(medals[i] + 1) + '"></i> ' +
medals[i] + (medals[i + 1] ? '–' + medals[i + 1] + '<br>' : '+')
}
return div
}
At this point, we were really proud of our visualization. But there were some things bothering us. Whenever we unchecked the wine consumption layer, the information div and the legend for this layer stayed on the screen. Of course, the information div didn't work anymore because the hover handlers were removed along with the layer, but we weren't sure how to attach these controls to the layer itself. So after a little research on Stack Exchange, we determined that adding event listeners to add or remove controls would be best. We were able to make a function that added or removed the controls if the event's layer name matched the name we had given earlier when declaring the `mapOverlay` variable.
// show info and legend depending on which layer is checked
myMap.on('overlayadd', function(eventLayer){
if (eventLayer.name === "<span> Wine Consumption <img class='layer-img' src='../images/glass.svg'/></span>"){
myMap.addControl(info);
myMap.addControl(legend);
} else if (eventLayer.name === "<span> Summer Olympic Medals <img class='layer-img' src='../images/medal.png'/></span>") {
myMap.addControl(olympicsLegend);
}
});
// remove info and legend depending on which layer is unchecked
myMap.on('overlayremove', function(eventLayer){
if (eventLayer.name === "<span> Wine Consumption <img class='layer-img' src='../images/glass.svg'/></span>"){
myMap.removeControl(info);
myMap.removeControl(legend);
} else if (eventLayer.name === "<span> Summer Olympic Medals <img class='layer-img' src='../images/medal.png'/></span>") {
myMap.removeControl(olympicsLegend);
}
});
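Matching on the full HTML label works, but it breaks as soon as a label is edited. Leaflet's `overlayadd` and `overlayremove` events also carry the layer object itself in `e.layer`, so a lookup keyed by layer is one alternative. This is a sketch we have not run against our map; `syncControls` and `controlsByLayer` are hypothetical names.

```javascript
// Sketch: look up controls by the layer object instead of its label.
// syncControls and controlsByLayer are hypothetical names.
function syncControls(map, controlsByLayer) {
  map.on('overlayadd', e => {
    // layers without an entry simply have no controls to show
    (controlsByLayer.get(e.layer) || []).forEach(c => map.addControl(c));
  });
  map.on('overlayremove', e => {
    (controlsByLayer.get(e.layer) || []).forEach(c => map.removeControl(c));
  });
}

// Possible wiring in logic.js (needs Leaflet, so commented out):
// syncControls(myMap, new Map([
//   [wineLayer, [info, legend]],
//   [olympicsLayer, [olympicsLegend]]
// ]));
```

Adding controls for a new overlay then only needs one more entry in the Map.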
We would love to extend this project in the future to include the following considerations:
- using GitHub Pages to allow anyone to observe our visualization instead of only being able to observe the gifs in this README (we tried to use GitHub Pages, but most of the SVGs don't show up and it ruins the experience)
- designing more layer controls with specific years
- changing the toggling of layers to be only two combinations at once (such as radio buttons for wine and olympic layers but a checkbox for military layer)
- adding other types of controls, such as dropdowns, that allow more data but with different selections instead of all at once
- plotting more trends that are popular to compare across the globe (again, with fewer at once or more control over which are shown together)
- adding flag icons instead of circle markers for the olympic layer (currently, this would affect the tanks because they are using the same coordinates)
- using a database to get real time data on other trends that change more frequently, such as current billionaires around the world
As with any project, the scope changed and we learned a lot! But we also came across some things that we didn't have time to research further given our deadline. Some things we'd like to understand better are as follows:
- when to use `let` and when to use `const` (this still gets a little confusing when looking through other people's code for examples or ideas).
- the difference between `this._div.innerHTML` and `div.innerHTML` (we are currently assuming that the first refers to the current div created through L.control in Leaflet and the second is a div that was created inside a function by the programmer).
- differences between and pros/cons of D3 and Leaflet for certain types of plotting.
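For the first two questions, a small experiment helps. The sketch below runs without Leaflet: `fakeControl` is a hypothetical stand-in for a control object, and the plain object assigned to `this._div` stands in for the element returned by `L.DomUtil.create('div', 'info')`. It suggests our assumption is close: `this._div` is a property stored on the control so that other methods can reach it later, while a local `div` only exists inside the function that created it.

```javascript
// const: the binding cannot be reassigned, but the array it names can change
const legendBreaks = [0, 200, 400];
legendBreaks.push(800); // fine: this mutates the array
let reassignFailed = false;
try {
  legendBreaks = []; // throws TypeError: assignment to constant variable
} catch (err) {
  reassignFailed = err instanceof TypeError;
}

// let: use it when the binding itself has to be reassigned
let hovered = null;
hovered = 'France';

// this._div lives on the control object, so update() can reach it later.
// fakeControl stands in for a Leaflet control so the sketch runs anywhere.
const fakeControl = {
  onAdd() {
    this._div = { innerHTML: '' }; // stand-in for L.DomUtil.create('div', 'info')
    this.update();
    return this._div;
  },
  update(text) {
    this._div.innerHTML = text || 'Hover over a country';
  }
};
const el = fakeControl.onAdd();
fakeControl.update(hovered);
console.log(reassignFailed, el.innerHTML); // true France
```

A common rule of thumb is to default to `const` and switch to `let` only when a variable really is reassigned, as with `wineLayer`, `olympicsLayer`, and `militaryLayer`.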
Visualizing data across the globe can look powerful, but it can be difficult to get clean data in the first place, and plotting it all on one map can make the screen very busy. Limiting ourselves to three trends was a good idea, and the number could be adjusted in the future for an even cleaner look.