students/jupa1089.Rmd

---
title: "jupa1089 Details"
author: "Julia P."
date: "January 17, 2016"
output: 
  html_document:
    toc: true
    toc_depth: 2
---

## Content

Does citizen monitoring (via **SMS surveys**) impact the quality of municipal waste collection services in *Kampala, Uganda*?

[link](https://github.com/citizen-monitoring/citizen-monitoring.github.io)

![](images/jupa1089_ugandasms.png)

## Techniques

The most applicable techniques:

* data organization,
  + calculating summary statistics,
  + clear, shareable code.

## Data

We have preliminary survey result data from Mark Buntaine. These are in excel format and will have any identifying information removed.

```{r}
#read csv
d = read.csv('data/jupa1089_ugandaSMS.csv')

#output summary
summary(d)
```


##Data Wrangling

```{r, eval=FALSE}
#present working directory
getwd()

#change working directory
setwd('C:/Users/paltsev/Documents/gitsnorlax/env-info')

#list files
list.files()

#list files that end in '.jpg'
list.files(pattern=glob2rx('*.jpg'))

#file exists
file.exists('test.png')

setwd('students')
```

###Install Packages

```{r, eval=FALSE}
#run this chunk only once in your Console
#do not evaluate when knitting Rmarkdown

#list of packages
pkgs = c(
  'readr',
  'readxl',
  'dplyr',
  'tidyr',
  'nycflights13',
  'gapminder')
  
#install packages if not found
for (p in pkgs){
  if (!require(p, character.only=T)){
    install.packages(p)
  }
}
```

###utils::read.csv

Traditionally, you would read a CSV like so:

```{r}
d = read.csv('../data/r-ecology/species.csv')
d
head(d)
summary(d)
```

###readr::read_csv

Better yet, try read_csv:

```{r}
library(readr)

d = read_csv('../data/r-ecology/species.csv')
d
head(d)
summary(d)
```

###dplry::tbl_df

Now convert to a dplyr table:

```{r}
library(readr)
library(dplyr)

d = read_csv('../data/r-ecology/species.csv')
d = tbl_df(d)

d = read_csv('../data/r-ecology/species.csv') %>%
  tbl_df()

d
head(d)
summary(d)
glimpse(d)

```

###dplry loosely
####What year does species "NL" show up in survey.csv?

```{r}
library(readr)
library(dplyr)

read_csv('../data/r-ecology/surveys.csv') %>%
  select(species_id, year) %>%
  #filter(species_id == 'NL') %>%
  group_by(year) %>%
  summarize(count = n())

d = read_csv('../data/r-ecology/surveys.csv') %>%
  tbl_df()

d
head(d)
summary(d)
glimpse(d)
```

##Wrangling Webinar
```{r}
#View(iris)
#iris %>%
  #select(Species, Petal.Length)

#install.packages("devtools")
#devtools::install_github("rstudio/EDAWR")
#library(EDAWR)
#View(storms)
#View(cases)
#View(pollution)
#storms$storm
#storms$wind
#storms$pressure
#storms$date
#cases$country
#names(cases)[-1]
#unlist(cases[1:3, 2:4])
#pollution$amount[1:5]
#storms$pressure / storms$wind
#library(tidyr)
#?gather
#?spread
#gather(cases, "year", "n", 2:4)
#spread(pollution, size, amount)
#storms2 <- separate(storms, date, c("year", "month", "day"), sep = "-")
#unite(storms2, "date", year, month, day, sep = "-")
```