---
title: "Core Concepts"
---
## Overview
This section covers some of the most fundamental operations of both languages. These include [variable]{.py}/[object]{.r} assignment, data [type]{.py}/[class]{.r}, arithmetic, etc. External data are _not_ included in this page.
Note that any text in a code chunk that follows a hash symbol (`#`) on the same line is a "comment" and is not evaluated in either language. Including comments is generally good practice because it helps humans read and understand code that may otherwise be unclear to them.
## Assignment
At its most basic, we want to store data in code in such a way that we can use / manipulate it via our scripts. This requires **assigning** data to a [variable]{.py}/[object]{.r} with the **assignment operator**.
:::panel-tabset
## [{{< fa brands r-project >}} R]{.r}
In [{{< fa brands r-project >}} R]{.r}, the assignment operator is `<-`. To use it, the name of the new object-to-be is on the left of the arrow and the information to assign is on the right.
```{r r-assign}
# Make a simple object
a <- 2
# Check it out
a
```
## [{{< fa brands python >}} Python]{.py}
In [{{< fa brands python >}} Python]{.py}, the assignment operator is `=`. To use it, the name of the new variable-to-be is on the left of the equal sign and the information to assign is on the right.
```{python py-assign}
# Make a simple variable
a = 2
# Check it out
a
```
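
As a small sketch beyond the chunk above, assignment stores a value under a name, and assigning to the same name again replaces what was stored. The variable names here (`n_sites`, `n_visits`, `total_surveys`) are hypothetical and exist only for illustration.

```python
# Hypothetical variable names, used only for illustration
n_sites = 4
n_visits = 3

# A stored value can be reused by name in later operations
total_surveys = n_sites * n_visits
print(total_surveys)  # 12

# Assigning to an existing name overwrites the old value
n_sites = 10
print(n_sites)  # 10

# Variables computed earlier do not update on their own
print(total_surveys)  # 12
```

Note that `total_surveys` keeps its earlier value until we explicitly assign to it again.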
:::
Once we've created a [variable]{.py}/[object]{.r} we can then use the information stored inside of it in downstream operations! For example, we could perform basic arithmetic on our [variable]{.py}/[object]{.r} and assign the result to a new [variable]{.py}/[object]{.r}.
:::panel-tabset
## [{{< fa brands r-project >}} R]{.r}
Addition, subtraction, multiplication, and division share operators across both languages (`+`, `-`, `*`, and `/` respectively). However, in [{{< fa brands r-project >}} R]{.r} exponents use `^`.
```{r r-math}
# Raise to an exponent
b <- a^3
# Check out the result
b
```
## [{{< fa brands python >}} Python]{.py}
Addition, subtraction, multiplication, and division share operators across both languages (`+`, `-`, `*`, and `/` respectively). However, in [{{< fa brands python >}} Python]{.py} exponents use `**`.
```{python py-math}
# Raise to an exponent
b = a**3
# Check out the result
b
```
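
For reference, here is a minimal sketch of all five operators mentioned above applied to the same [variable]{.py}. One caveat for anyone coming from [{{< fa brands r-project >}} R]{.r}: `^` is also valid in [{{< fa brands python >}} Python]{.py} but performs a bitwise operation rather than exponentiation, so it can silently return an unexpected number.

```python
a = 2

# The four shared arithmetic operators
print(a + 3)  # 5
print(a - 3)  # -1
print(a * 3)  # 6
print(a / 4)  # 0.5

# Exponents use two asterisks
print(a ** 3)  # 8

# The caret runs without error but is bitwise XOR, not an exponent
print(a ^ 3)  # 1
```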
:::
## [Type]{.py} & [Class]{.r}
Some operations are only possible on some categories of information. For instance, we can only perform arithmetic on numbers. In [{{< fa brands python >}} Python]{.py} this category is known as the [variable]{.py}'s [type]{.py}, while in [{{< fa brands r-project >}} R]{.r} it is the [object]{.r}'s [class]{.r}. In either case, it's important to know--and be able to check--this information about the [variables]{.py}/[objects]{.r} with which we are working.
:::panel-tabset
## [{{< fa brands r-project >}} R]{.r}
In [{{< fa brands r-project >}} R]{.r} we use the `class` function to get this information. Note that the names of [{{< fa brands r-project >}} R]{.r} classes sometimes differ from their equivalents in [{{< fa brands python >}} Python]{.py}.
```{r r-class1}
# Check class of a whole number (R stores this as "numeric"; write 37L to get "integer")
class(37)
```
```{r r-class2}
# Check class of a decimal
class(3.14159)
```
```{r r-class3}
# Check class of text
class("my hands are typing words")
```
## [{{< fa brands python >}} Python]{.py}
In [{{< fa brands python >}} Python]{.py}, the `type` function returns the type of the data object. Note that the names of [{{< fa brands python >}} Python]{.py} types sometimes differ from their equivalents in [{{< fa brands r-project >}} R]{.r}.
```{python py-type1}
# Check type of an integer
type(37)
```
```{python py-type2}
# Check type of a decimal
type(3.14159)
```
```{python py-type3}
# Check type of text
type("my hands are typing words")
```
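
To see why the [type]{.py} matters in practice, the sketch below stores the same digits as a number and as text, then attempts arithmetic on each. The variable names (`count`, `label`, `error_raised`) are hypothetical. It also uses the built-in `isinstance` function, which is a common way to test for a specific [type]{.py} before running an operation.

```python
# The same digits stored as a number and as text (hypothetical names)
count = 37
label = "37"

print(type(count))  # <class 'int'>
print(type(label))  # <class 'str'>

# Arithmetic works on the number
print(count + 1)  # 38

# The same arithmetic on text raises a TypeError
error_raised = False
try:
    label + 1
except TypeError:
    error_raised = True
print(error_raised)  # True

# isinstance() answers "is this value of this type?" with True or False
print(isinstance(count, int))  # True
print(isinstance(label, int))  # False
```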
:::
## Indexing
When our [variables]{.py}/[objects]{.r} have more than one [item]{.py}/[element]{.r} we may want to examine the piece of information at a specific position. This position is the "index position" and can be accessed in either language fairly easily.
In order to explore this more fully, let's make some example multi-component [variables]{.py}/[objects]{.r}.
:::panel-tabset
## [{{< fa brands r-project >}} R]{.r}
In [{{< fa brands r-project >}} R]{.r}, one of the fundamental data structures is a "vector". Vectors are assembled with the concatenation function (`c`) where each element is separated by commas (`,`) and the set of them is wrapped in parentheses (`(...)`).
Note that the class of the object comes from the vector's _contents_ rather than the fact that it is a vector. All elements in a vector therefore must share a class.
```{r r-index-prep}
# Make a multi-element object
x <- c(1, 2, 3, 4, 5)
# Check it out
class(x)
```
## [{{< fa brands python >}} Python]{.py}
In [{{< fa brands python >}} Python]{.py} the fundamental data structure is a "list". Lists are assembled either by wrapping the items to include in square brackets (`[...]`) or by using the `list` function. In either case, each item is separated from the others by commas (`,`).
Note that the type of the variable comes from the _list itself_ rather than its contents. Lists therefore support items of multiple different types.
```{python py-index-prep}
# Make a multi-item variable
x = [1, 2, 3, 4, 5]
# Check it out
type(x)
```
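
As a short illustration of both points above, the sketch below builds one list with items of three different types and a second list with the `list` function. The variable names (`mixed`, `item_types`, `from_function`) are hypothetical, and the bracketed `for` expression is just a compact way to check the type of every item in turn.

```python
# A single list can hold items of different types
mixed = [1, 3.14159, "my hands are typing words"]

# The type of the variable is still 'list'
print(type(mixed))  # <class 'list'>

# Look up the type name of each item inside the list
item_types = [type(item).__name__ for item in mixed]
print(item_types)  # ['int', 'float', 'str']

# The list function builds a list from another collection of items
from_function = list((1, 2, 3))
print(from_function)  # [1, 2, 3]
print(from_function == [1, 2, 3])  # True
```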
:::
One crucial difference between [{{< fa brands r-project >}} R]{.r} and [{{< fa brands python >}} Python]{.py} is that [{{< fa brands python >}} Python]{.py} is "0-based" meaning that the first [item]{.py} is at index position `0` while in [{{< fa brands r-project >}} R]{.r} the position of the equivalent [element]{.r} is `1`.
Fortunately, in either language the syntax for indexing is the same.
:::panel-tabset
## [{{< fa brands r-project >}} R]{.r}
To index a multi-element object, simply append square brackets to the end of the object name and specify the number of the index position in which you are interested.
```{r r-index}
# Access the first element of the vector
x[1]
```
## [{{< fa brands python >}} Python]{.py}
To index a multi-item variable, simply append square brackets to the end of the variable name and specify the number of the index position in which you are interested.
```{python py-index}
# Access the first item of the list
x[0]
```
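
Because the first item sits at index position `0`, the last item of a five-item list sits at `4`, and an [{{< fa brands r-project >}} R]{.r} position can be translated by subtracting one. The sketch below illustrates this. The variable names `r_position` and `out_of_range` are hypothetical.

```python
x = [1, 2, 3, 4, 5]

# The first item is at index position 0
print(x[0])  # 1

# The last item is at index position (number of items - 1)
print(len(x))  # 5
print(x[4])  # 5

# Translate a 1-based R position into a 0-based Python index
r_position = 3
print(x[r_position - 1])  # 3

# Asking for an index position past the end raises an IndexError
out_of_range = False
try:
    x[5]
except IndexError:
    out_of_range = True
print(out_of_range)  # True
```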
:::
## Slicing
When we index more than one position, this is known as "slicing". We can still use square brackets in either language to slice multiple [items]{.py}/[elements]{.r} and the syntax inside of those brackets _seems_ shared but yields different results due to inherent syntactical differences.
:::panel-tabset
## [{{< fa brands r-project >}} R]{.r}
In [{{< fa brands r-project >}} R]{.r}, when we write two numbers separated by a colon (`:`), that indicates that we want those two numbers and all integers between them.
```{r r-slice-cont1}
# Demonstrate that the colon is shorthand for 'all numbers between'
1:10
```
We can use this to slice out multiple _contiguous_ index positions from an object.
```{r r-slice-cont2}
# Slice elements of the `x` object
x[2:4]
```
## [{{< fa brands python >}} Python]{.py}
In order to slice in [{{< fa brands python >}} Python]{.py}, we include the start and stop _bounds_ of the items that we want to slice separated by a colon (`:`) inside of square brackets. The first bound (i.e., bound position 0) is actually the starting bracket of the list! This means that we can treat the first number in the slice in the same way we would in single indexing but the second number is actually the bound before the item with that index value.
Another way of thinking about this is that it is similar to a mathematical set. The starting bound is _inclusive_ while the ending bound is _exclusive_.
```{python py-slice-cont}
# Strip out several items of the Python list
x[2:4]
```
Notice that we only get the third and fourth items (index positions `2` and `3`) despite `4` being after the colon (which as a single index would return the fifth item). That is because the fourth bound is after the fourth item but _before_ the fifth item.
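
The sketch below works through a few more slices of the same list to make the inclusive start and exclusive end concrete. It also shows one way to reproduce the result of `x[2:4]` from [{{< fa brands r-project >}} R]{.r} (which returns the elements `2`, `3`, and `4`): subtract one from the start and leave the end unchanged.

```python
x = [1, 2, 3, 4, 5]

# Start bound is inclusive, end bound is exclusive
print(x[2:4])  # [3, 4]
print(x[0:2])  # [1, 2]

# The number of items returned is the end bound minus the start bound
print(len(x[2:4]))  # 2

# Python equivalent of R's x[2:4]
print(x[1:4])  # [2, 3, 4]
```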
:::