students/bbest.Rmd

---
title: "bbest"
author: "Ben Best"
date: "January 15, 2016"
output:
  html_document:
    theme: united
    highlight: tango
---

## Content
        
What is your burning environmental question that you'd like to address? Feel free to provide group project, dissertation, and/or personal interest. What's the study area?

![cool idea](images/bbest_cool-idea.png)
        
## Techniques
        
What techniques from the course do you think will be most applicable?
        
## Data
        
What data have you already identified? Feel free to provide a link and/or details on the variables of interest.
  
Here is some data from [Shipping in Canada (2011)](http://www.statcan.gc.ca/pub/54-205-x/2011000/part-partie1-eng.htm):
  
```{r}
ports_bc = read.csv('data/bbest_ports-bc.csv')
summary(ports_bc)
```

## Data Wrangling

```{r, eval=FALSE}
# present working directory
getwd()

# change working directory
setwd('.')

# list files
list.files()

# list files that end in '.jpg'
list.files(pattern=glob2rx('*.jpg'))

# file exists
file.exists('test.png')
```

# Install Packages

```{r, eval=FALSE}
# Run this chunk only once in your Console
# Do not evaluate when knitting Rmarkdown

# list of packages
pkgs = c(
  'readr',        # read csv
  'readxl',       # read xls
  'dplyr',        # data frame manipulation
  'tidyr',        # data tidying
  'nycflights13', # test dataset of NYC flights for 2013
  'gapminder')    # test dataset of life expectancy and popultion

# install packages if not found
for (p in pkgs){
  if (!require(p, character.only=T)){
    install.packages(p)
  }
}
```


## utils::read.csv

Traditionally, you would read a CSV like so:

```{r}
d = read.csv('../data/r-ecology/species.csv')
d
head(d)
summary(d)
```

## readr::read_csv

Better yet, try read_csv:

```{r}
library(readr)

d = read_csv('../data/r-ecology/species.csv')
d
head(d)
summary(d)
```


## dplry::tbl_df

Now convert to a dplyr table:

```{r}
library(readr)
library(dplyr)

d = read_csv('../data/r-ecology/species.csv')
d = tbl_df(d)

d = read_csv('../data/r-ecology/species.csv') %>%
  tbl_df()

d = tbl_df(read_csv('../data/r-ecology/species.csv'))
d
head(d)
summary(d)
glimpse(d)

```

## dplyr loosely

### What year does species 'NL' show up in the surveys.csv?


```{r}
library(readr)
library(dplyr)

read_csv('../data/r-ecology/surveys.csv') %>%
  select(species_id, year) %>%
  #filter(species_id == 'NL') %>%
  group_by(species_id, year) %>%
  summarize(count = n())

d = read_csv('../data/r-ecology/species.csv') %>%
  tbl_df()

d = tbl_df(read_csv('../data/r-ecology/species.csv'))
d
head(d)
summary(d)
glimpse(d)

```