-
Notifications
You must be signed in to change notification settings - Fork 3
/
Copy pathREADME.Rmd
161 lines (116 loc) · 4.23 KB
/
README.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
---
output: rmarkdown::github_document
editor_options:
chunk_output_type: console
---
```{r pkg-knitr-opts, include=FALSE}
knitr::opts_chunk$set(collapse=TRUE, fig.retina=2, message=FALSE, warning=FALSE)
options(width=120)
```
[](https://travis-ci.org/hrbrmstr/iptrie)
[](https://codecov.io/gh/hrbrmstr/iptrie)
[](https://cran.r-project.org/package=iptrie)
# iptrie
Efficiently Store and Query 'IPv4' Internet Addresses with Associated Data
## Description
Tries are great ways to store data that has obvious hierarchichal
properties, such as 'IPv4' addresses. Methods are provided to create 'IPv4'
tries and store, retrieve and delete 'IPv4' address keys with values.
Functions are based on the 'zmap' 'iptree' 'C' library.
## NOTE
This is an experiment that will not likely turn into a CRAN package or get migrated
to either the `iptools` or `asntools` package.
I initially wanted to feel the pain (again) of using R's own C interface
(i.e. no `Rcpp` crutch) since it's easy to forget just how handy Rcpp is.
(It turns out that my ill memories of playing at the R C level are also unjustified).
This is orders of magnitude slower than the method used in `astools::as_asntrie()`
for an `astools::routeviews_latest()` data set.
## What's Inside The Tin
The following functions are implemented:
- `as_iptrie`: Convert a data frame to an IP trie
- `iptrie`: Efficiently Store and Query 'IPv4' Internet Addresses with Associated Data
- `iptrie_create`: Create a new IPv4 Trie
- `iptrie_destroy`: Destroy an IP trie
- `iptrie_insert`: Insert a value for an IPv4 Address+CIDR combo into an IPv4 Trie
- `iptrie_ip_in`: Lookup a value for an IPv4 Address+CIDR combo into an IPv4 Trie or Test for Existence
- `iptrie_lookup`: Lookup a value for an IPv4 Address+CIDR combo into an IPv4 Trie or Test for Existence
- `iptrie_remove`: Remove a trie entry for an IPv4 Address+CIDR combo into an IPv4 Trie
- `is_iptrie`: Create a new IPv4 Trie
## Installation
```{r install-ex, eval=FALSE}
devtools::install_git("https://sr.ht.com/~hrbrmstr/iptrie.git")
# or
devtools::install_git("https://gitlab.com/hrbrmstr/iptrie.git")
# or (if you must)
devtools::install_github("hrbrmstr/iptree")
```
## Usage
```{r lib-ex}
library(iptrie)
library(tidyverse)
# current version
packageVersion("iptree")
```
### Basic Usage
```{r basic}
x <- iptrie_create()
iptrie_insert(x, "10.1.10.0/24", "HOME")
iptrie_ip_in(x,"10.1.10.1/32")
iptrie_ip_in(x,"10.1.11.1/32")
iptrie_lookup(x, "10.1.10.1/32")
iptrie_lookup(x, "10.1.10.1/32", "exact")
```
### Data frame to iptrie
```{r dftotrie}
xdf <- data.frame(a = "10.1.10.0/24", b = "HOME", stringsAsFactors = FALSE)
(xt <- as_iptrie(xdf))
is_iptrie(xt)
(xt <- as_iptrie(xdf, "a", "b"))
iptrie_ip_in(xt, "10.1.10.6")
```
### Bigger example (kinda what `as.data.frame.iptrie` does)
```{r bigger, cache=TRUE}
# Make a trie from autonomous system CIDRs
xdf <- astools::routeviews_latest()
cat(scales::comma(nrow(xdf)), "\n")
asntrie <- iptrie_create()
system.time(for (i in 1:nrow(xdf)) {
iptrie_insert(asntrie, xdf[["cidr"]][i], xdf[["asn"]][i])
})
# Get a block list (picked at ransom)
blklst <- url("https://iplists.firehol.org/files/botscout_1d.ipset")
y <- readLines(blklst)
close(blklst)
y <- y[!grepl("^#", y)] # comments at the top
cat(scales::comma(length(y)), "\n")
system.time(do.call(
rbind.data.frame,
lapply(y, function(.x) {
r <- iptrie_lookup(asntrie, .x, "best")
if (is.null(r)) {
data.frame(
ip = .x,
asn = NA_character_,
cidr = NA_character_,
stringsAsFactors = FALSE
)
} else {
data.frame(
ip = .x,
asn = r,
cidr = sprintf("%s/%d", attr(r, "ip"), attr(r, "mask")),
stringsAsFactors = FALSE
)
}
})
) -> cdf)
as_tibble(cdf)
count(cdf, cidr, sort=TRUE)
```
## iptree Metrics
```{r cloc, echo=FALSE}
cloc::cloc_pkg_md()
```
## Code of Conduct
Please note that this project is released with a [Contributor Code of Conduct](CONDUCT.md).
By participating in this project you agree to abide by its terms.