R code examples to teach basic Web scraping with rvest and related packages.

Used at a two-day workshop in November 2018: refer to the introductory slides, in French, for details.

Please report any bugs or errors by opening an issue in this repository, or by emailing me.

DEMOS

  1. lagasafn · legal cross-references in Icelandic law
  2. jorf · XML field extraction from the French Official Journal
  3. cop21 · word extraction from the UNFCCC Paris Agreement
  4. qosd · keyword co-occurrence in French parliamentary questions
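
For readers new to rvest, here is a minimal sketch of the kind of extraction the demos build on. It is not taken from the repository: the inline HTML snippet, the `/law/...` paths and the `refs` data frame are hypothetical, and the page is parsed from a string so that the example runs without network access. It assumes rvest 1.0.0 or later, where `html_elements()` is available.

```r
library(rvest)

# Hypothetical page content, inlined so the sketch needs no network access.
page <- read_html('
  <ul>
    <li><a href="/law/1">Act no. 1</a></li>
    <li><a href="/law/2">Act no. 2</a></li>
  </ul>
')

# Select every link inside a list item with a CSS selector.
links <- html_elements(page, "li a")

# Collect the link text and target into a data frame, one row per link.
refs <- data.frame(
  title = html_text(links, trim = TRUE),
  url   = html_attr(links, "href"),
  stringsAsFactors = FALSE
)

print(refs)
```

On a real site, `read_html()` would take the page URL instead of a string, and the CSS selector would have to be adapted to that page's markup.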

Projects mentioned but not included in the repository:

Slides shown but not included in the repository (available on request):

  • "Large-scale legislative data collection from online sources" (2016)
  • "Web scraping et APIs avec R" (2017)

HOWTO

  1. Run the dependencies.r script to install all required packages.
  2. Run each code folder separately. Each has its own .Rproj file.
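
The two steps above might look as follows from an R console. This is a sketch that assumes the working directory is the root of a local copy of the repository. The `script` and `projects` variable names are illustrative, and the existence check is only there so that the snippet fails gently when run from elsewhere.

```r
# Step 1: install the required packages by running the dependencies script.
script <- "dependencies.r"

if (file.exists(script)) {
  source(script)
} else {
  message("dependencies.r not found: set the working directory to the repository root")
}

# Step 2: list the .Rproj files, one per demo folder, to open them one at a time.
projects <- list.files(pattern = "\\.Rproj$", recursive = TRUE)
print(projects)
```

Opening one of the listed .Rproj files in RStudio sets the working directory to that demo's folder, which keeps each demo's relative paths separate from the others.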

THANKS

  • Sabrina Granger and Isabelle Scarpat-Bouvet for excellent logistics.
  • Thomas J. Leeper for his word_count function, used in the cop21 example.
  • Emiliano Grossman for inspiring the qosd example.