<!doctype html>
<html>
<head>
<meta charset="utf-8">
<meta http-equiv="X-UA-Compatible" content="chrome=1">
<title>Data-Fa14 by uiuc-cse</title>
<link rel="stylesheet" href="stylesheets/styles.css">
<link rel="stylesheet" href="stylesheets/pygment_trac.css">
<script src="https://ajax.googleapis.com/ajax/libs/jquery/1.7.1/jquery.min.js"></script>
<script src="javascripts/respond.js"></script>
<!--[if lt IE 9]>
<script src="//html5shiv.googlecode.com/svn/trunk/html5.js"></script>
<![endif]-->
<!--[if lt IE 8]>
<link rel="stylesheet" href="stylesheets/ie.css">
<![endif]-->
<meta name="viewport" content="width=device-width, initial-scale=1, user-scalable=no">
</head>
<body>
<div id="header">
<nav>
<li class="fork"><a href="https://github.com/uiuc-cse/data-fa14">View On GitHub</a></li>
<li class="downloads"><a href="https://github.com/uiuc-cse/data-fa14/zipball/master">ZIP</a></li>
<li class="downloads"><a href="https://github.com/uiuc-cse/data-fa14/tarball/master">TAR</a></li>
<li class="title">DOWNLOADS</li>
</nav>
</div><!-- end header -->
<div class="wrapper">
<section>
<div id="title">
<h1>Data-Fa14</h1>
<p>CSE Training Workshops in Data Analytics, Fall 2014 • DCL L440, 1–3 pm</p>
<hr>
<span class="credits left">Project maintained by <a href="https://github.com/uiuc-cse">uiuc-cse</a></span>
<span class="credits right">Hosted on GitHub Pages — Theme by <a href="https://twitter.com/michigangraham">mattgraham</a></span>
</div>
<p>All workshops will be held in L440 Digital Computer Laboratory, an EWS computer laboratory in the basement. There is no sign-up for this series—walk-ins are welcome and encouraged!</p>
<h1>
<a id="introduction-to-r-part-1" class="anchor" href="#introduction-to-r-part-1" aria-hidden="true"><span class="octicon octicon-link"></span></a><a href="#r1">Introduction to R, Part 1</a>
</h1>
<h3>
<a id="sep-8-100300-pm--dcl-l440" class="anchor" href="#sep-8-100300-pm--dcl-l440" aria-hidden="true"><span class="octicon octicon-link"></span></a>Sep. 8, 1:00–3:00 pm • DCL L440</h3>
<p>This workshop targets students with some programming experience and little to no prior exposure to the statistical and data analysis language R. We will conduct a hands-on walkthrough of basic R features and packages.</p>
<p>We will cover the following topics:</p>
<ul>
<li>Reading and cleaning data</li>
<li>Major data types and their uses</li>
<li>Useful functions</li>
<li>Data display</li>
</ul>
<p><a href="https://github.com/uiuc-cse/data-fa14/blob/gh-pages/lessons/install_Rstudio.md">How to install RStudio on EWS workstations</a></p>
<p><a href="https://github.com/uiuc-cse/data-fa14/blob/gh-pages/lessons/R1_intro.pdf?raw=true">Link to lesson</a></p>
<p><a href="https://github.com/uiuc-cse/data-fa14/blob/gh-pages/lessons/R1_exercises.pdf?raw=true">Link to exercises</a></p>
<h1>
<a id="introduction-to-r-part-2" class="anchor" href="#introduction-to-r-part-2" aria-hidden="true"><span class="octicon octicon-link"></span></a><a href="#r2">Introduction to R, Part 2</a>
</h1>
<h3>
<a id="sep-15-100300-pm--dcl-l440" class="anchor" href="#sep-15-100300-pm--dcl-l440" aria-hidden="true"><span class="octicon octicon-link"></span></a>Sep. 15, 1:00–3:00 pm • DCL L440</h3>
<p>This tutorial continues the introduction to R from Part 1 and adds new topics such as importing packages.</p>
<p><a href="https://github.com/uiuc-cse/data-fa14/blob/gh-pages/lessons/R2_intro.md">Link to lesson</a></p>
<p><a href="https://github.com/uiuc-cse/data-fa14/blob/gh-pages/lessons/R2_exercise_capstone.pdf?raw=true">Link to capstone exercise</a></p>
<h1>
<a id="pandas-python-data-analysis-library" class="anchor" href="#pandas-python-data-analysis-library" aria-hidden="true"><span class="octicon octicon-link"></span></a><a href="#pandas">Pandas (Python Data Analysis) Library</a>
</h1>
<h3>
<a id="sep-17-9001100--dcl-l440" class="anchor" href="#sep-17-9001100--dcl-l440" aria-hidden="true"><span class="octicon octicon-link"></span></a>Sep. 17, 9:00–11:00 • DCL L440</h3>
<p>The <a href="http://pandas.pydata.org/">Pandas</a> module provides an R-like interface for manipulating and analyzing data sets and their statistics.</p>
<h1>
<a id="introduction-big-data-and-analytics-seminar-by-big-data-and-analytics-council" class="anchor" href="#introduction-big-data-and-analytics-seminar-by-big-data-and-analytics-council" aria-hidden="true"><span class="octicon octicon-link"></span></a>Introduction to Big Data and Analytics (Seminar by Big Data and Analytics Council)</h1>
<h3>
<a id="sep-25-500-pm--1002-lincoln-hall" class="anchor" href="#sep-25-500-pm--1002-lincoln-hall" aria-hidden="true"><span class="octicon octicon-link"></span></a>Sep. 25, 5:00 pm • 1002 Lincoln Hall</h3>
<p>Although it is not part of the CSE workshop series, we recommend this talk, hosted by the student group Big Data and Analytics Council. It will give an accessible overview of Big Data and its applications for anyone interested in applying data analysis techniques to their research and coursework.</p>
<h1>
<a id="data-mining-applications" class="anchor" href="#data-mining-applications" aria-hidden="true"><span class="octicon octicon-link"></span></a><a href="#mining">Data Mining Applications</a>
</h1>
<h3>
<a id="sep-29-100300-pm--dcl-l440" class="anchor" href="#sep-29-100300-pm--dcl-l440" aria-hidden="true"><span class="octicon octicon-link"></span></a>Sep. 29, 1:00–3:00 pm • DCL L440</h3>
<h1>
<a id="machine-learning-applications" class="anchor" href="#machine-learning-applications" aria-hidden="true"><span class="octicon octicon-link"></span></a><a href="#ml">Machine Learning Applications</a>
</h1>
<h3>
<a id="oct-6-100300-pm--dcl-l440" class="anchor" href="#oct-6-100300-pm--dcl-l440" aria-hidden="true"><span class="octicon octicon-link"></span></a>Oct. 6, 1:00–3:00 pm • DCL L440</h3>
<h1>
<a id="machine-learning-in-python" class="anchor" href="#machine-learning-in-python" aria-hidden="true"><span class="octicon octicon-link"></span></a><a href="#sklearn">Machine Learning in Python</a>
</h1>
<h3>
<a id="oct-22-9001100--dcl-l440" class="anchor" href="#oct-22-9001100--dcl-l440" aria-hidden="true"><span class="octicon octicon-link"></span></a>Oct. 22, 9:00–11:00 • DCL L440</h3>
<h1>
<a id="illinoisnlpcloud" class="anchor" href="#illinoisnlpcloud" aria-hidden="true"><span class="octicon octicon-link"></span></a><a href="#nlpcloud">IllinoisCloudNLP</a>
</h1>
<h3>
<a id="oct-23-9001100-am--dcl-l440" class="anchor" href="#oct-23-9001100-am--dcl-l440" aria-hidden="true"><span class="octicon octicon-link"></span></a>Oct. 23, 9:00–11:00 am • DCL L440</h3>
<p><em>Presented by Dr Mark Sammons and Hao Wu of the <a href="http://cogcomp.cs.illinois.edu/">Cognitive Computation Group</a>. Please note the updated date and time.</em></p>
<p>Come learn how to perform cloud processing of natural language, whether your interest is business intelligence, computer science, computational linguistics, or text mining. </p>
<p><a href="http://cogcomp.cs.illinois.edu/page/software">IllinoisCloudNLP</a> makes it straightforward for experts and nonexperts alike to process large texts as needed.</p>
<h3>
<a id="setup" class="anchor" href="#setup" aria-hidden="true"><span class="octicon octicon-link"></span></a>Setup</h3>
<p>We will follow the instructions <a href="http://cogcomp.cs.illinois.edu/page/software_view/CloudNLP">here</a>. Unless you already have an <a href="https://aws.amazon.com/">Amazon Web Services</a> account, you will use a CSE training account uniquely assigned to you in the workshop. (Good user practice: account credentials should not normally be posted publicly. The ones on this page are "okay" to show only because I'll reset them immediately after the workshop.)</p>
<p><a href="http://cogcomp.cs.illinois.edu/member_pages/sammons/temp/tutorial-cloudnlp-supp.php">Mark Sammons's page</a></p>
<ul>
<li>
<p>On your EWS machine, please open a terminal window to work in and execute the following code:</p>
<pre><code>module load sun-jdk/1.7.0-latest-el6-x86_64
</code></pre>
</li>
<li><p>Follow the instructions <a href="http://cogcomp.cs.illinois.edu/page/software_view/CloudNLP">here</a>.</p></li>
<li>
<p>When your terminal output reads</p>
<pre><code>[info] play - Application started (Prod)
[info] play - Listening for HTTP on /0:0:0:0:0:0:0:0:9000
</code></pre>
<p>then navigate to <a href="http://127.0.0.1:9000/"><code>http://127.0.0.1:9000/</code></a>.</p>
</li>
<li><p>To monitor jobs, log in to the <a href="https://uiuc-cse.signin.aws.amazon.com/console">AWS site</a> with the username assigned to you (of the form <code>csetrainingXX</code>) and the password <code>Capricorn1</code>, then select <code>EC2</code> and <code>Monitor Instances</code> on the left. There you can see your machine instances running in the cloud and some data about their execution.</p></li>
</ul>
<h1>
<a id="knime-graphical-analytics" class="anchor" href="#knime-graphical-analytics" aria-hidden="true"><span class="octicon octicon-link"></span></a><a href="#knime">KNIME Graphical Analytics</a>
</h1>
<h3>
<a id="oct-27-100300-pm--dcl-l440" class="anchor" href="#oct-27-100300-pm--dcl-l440" aria-hidden="true"><span class="octicon octicon-link"></span></a>Oct. 27, 1:00–3:00 pm • DCL L440</h3>
<p><a href="http://www.knime.org/knime/">KNIME</a> is an open platform for sophisticated data mining and statistics on your data. The visual workbench combines data access, transformation, investigation, predictive analytics, and visualization in one package. Come to this hands-on workshop and get started today!</p>
<h3>
<a id="setup-1" class="anchor" href="#setup-1" aria-hidden="true"><span class="octicon octicon-link"></span></a>Setup</h3>
<p>KNIME can be executed directly from the extracted <a href="http://www.knime.org/knime_downloads/linux/knime-latest-linux.gtk.x86_64.tar.gz">archive</a>.</p>
<h3>
<a id="data-files" class="anchor" href="#data-files" aria-hidden="true"><span class="octicon octicon-link"></span></a>Data Files</h3>
<ul>
<li><a href="https://raw.githubusercontent.com/uiuc-cse/data-fa14/gh-pages/data/airlines.dat"><code>airlines.dat</code></a></li>
<li><a href="https://raw.githubusercontent.com/uiuc-cse/data-fa14/gh-pages/data/airports.dat"><code>airports.dat</code></a></li>
<li><a href="https://raw.githubusercontent.com/uiuc-cse/data-fa14/gh-pages/data/routes.dat"><code>routes.dat</code></a></li>
<li><a href="https://raw.githubusercontent.com/uiuc-cse/data-fa14/gh-pages/data/iris.csv"><code>iris.csv</code></a></li>
</ul>
<h1>
<a id="big-data-hadoop-and-mapreduce" class="anchor" href="#big-data-hadoop-and-mapreduce" aria-hidden="true"><span class="octicon octicon-link"></span></a><a href="#hadoop">Big Data: Hadoop and MapReduce</a>
</h1>
<h3>
<a id="nov-10-100300-pm--dcl-l440" class="anchor" href="#nov-10-100300-pm--dcl-l440" aria-hidden="true"><span class="octicon octicon-link"></span></a>Nov. 10, 1:00–3:00 pm • DCL L440</h3>
<p>Today we will discuss MapReduce and Hadoop, respectively a popular algorithm and a popular platform for large-scale data analytics. We will also use Amazon Web Services’ cloud computing infrastructure.</p>
<ul>
<li>
<p>Please download this file to your Desktop.</p>
<ul>
<li><a href="https://raw.githubusercontent.com/uiuc-cse/data-fa14/gh-pages/lessons/mapreduce.ipynb">MapReduce Notebook</a></li>
</ul>
</li>
<li>
<p>Then open a command-line window and execute the following:</p>
<pre><code>cd Desktop
module load canopy
ipython notebook mapreduce.ipynb
</code></pre>
</li>
<li><p>Later, you will also log in to <a href="https://uiuc-cse.signin.aws.amazon.com/console">the AWS CSE workshop site</a> using the login and password distributed in class.</p></li>
</ul>
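<p>To give a feel for the MapReduce idea before the workshop, here is a minimal single-machine sketch in plain Python. It is not taken from the workshop notebook and does not use Hadoop; the function names <code>map_phase</code>, <code>shuffle</code>, <code>reduce_phase</code>, and <code>word_count</code> are made up for this illustration. It counts words by emitting key-value pairs, grouping them by key, and then reducing each group.</p>

```python
from collections import defaultdict


def map_phase(line):
    """Map step: emit a (word, 1) pair for every word in one line of text."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Shuffle step: group all emitted values by their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce step: combine the values for one key into a single count."""
    return key, sum(values)


def word_count(lines):
    """Run the three steps in sequence on a list of text lines."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())


# Hypothetical sample input, for illustration only.
sample = ["the quick brown fox", "the lazy dog", "the fox"]
print(word_count(sample))
```

<p>On a real cluster the map and reduce steps would run in parallel on many machines, with the framework handling the grouping between them; this sketch only shows how the data flows.</p>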
<h1>
<a id="big-data-sql-pig-and-the-hadoop-zoo" class="anchor" href="#big-data-sql-pig-and-the-hadoop-zoo" aria-hidden="true"><span class="octicon octicon-link"></span></a><a href="#pig">Big Data: SQL, Pig, and the Hadoop Zoo</a>
</h1>
<h3>
<a id="nov-17-100300-pm--dcl-l440" class="anchor" href="#nov-17-100300-pm--dcl-l440" aria-hidden="true"><span class="octicon octicon-link"></span></a>Nov. 17, 1:00–3:00 pm • DCL L440</h3>
<p>We will teach the database language SQL; Pig, the SQL-like interface to Hadoop; and the elements of the Hadoop Zoo, the ecosystem of tools and platforms around Hadoop.</p>
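<p>As a small taste of SQL, the sketch below runs a grouping query against an in-memory SQLite database using Python's standard <code>sqlite3</code> module. The table name <code>routes</code>, its columns, the sample rows, and the helper <code>routes_per_airline</code> are all assumptions made up for this illustration, not workshop material. Pig expresses the same kind of grouping and counting over data stored in Hadoop.</p>

```python
import sqlite3


def routes_per_airline(rows):
    """Count routes per airline with a SQL GROUP BY query.

    rows is a list of (airline, source, destination) tuples. The table
    and column names are hypothetical, chosen only for this example.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE routes (airline TEXT, source TEXT, destination TEXT)"
    )
    conn.executemany("INSERT INTO routes VALUES (?, ?, ?)", rows)
    query = """
        SELECT airline, COUNT(*) AS n_routes
        FROM routes
        GROUP BY airline
        ORDER BY n_routes DESC, airline ASC
    """
    result = conn.execute(query).fetchall()
    conn.close()
    return result


# Hypothetical sample rows, for illustration only.
sample = [("AA", "ORD", "LAX"), ("AA", "ORD", "JFK"), ("UA", "ORD", "SFO")]
print(routes_per_airline(sample))
```

<p>The query groups the rows by airline, counts the rows in each group, and sorts the airlines from most to fewest routes.</p>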
<h1>
<a id="about-these-workshops" class="anchor" href="#about-these-workshops" aria-hidden="true"><span class="octicon octicon-link"></span></a>About These Workshops</h1>
<h3>
<a id="contributors" class="anchor" href="#contributors" aria-hidden="true"><span class="octicon octicon-link"></span></a>Contributors</h3>
<p>Neal Davis and Yuanzhi Qi developed these materials. This content is available under a Creative Commons Attribution 3.0 Unported License.</p>
<p><img src="https://i.creativecommons.org/l/by/4.0/88x31.png" alt="Creative Commons Attribution license badge"></p>
<h1>
<a id="contact" class="anchor" href="#contact" aria-hidden="true"><span class="octicon octicon-link"></span></a>Contact</h1>
<p>If you have any questions about course availability, concepts, or content, please contact Neal Davis, Training Coördinator for Computational Science &amp; Engineering, at davis68 at illinois dot edu.</p>
</section>
</div>
<!--[if !IE]><script>fixScale(document);</script><![endif]-->
<script type="text/javascript">
var gaJsHost = (("https:" == document.location.protocol) ? "https://ssl." : "http://www.");
document.write(unescape("%3Cscript src='" + gaJsHost + "google-analytics.com/ga.js' type='text/javascript'%3E%3C/script%3E"));
</script>
<script type="text/javascript">
try {
var pageTracker = _gat._getTracker("UA-53962544-5");
pageTracker._trackPageview();
} catch(err) {}
</script>
</body>
</html>