forked from snap-stanford/snap
-
Notifications
You must be signed in to change notification settings - Fork 0
/
ReadMe.txt
306 lines (273 loc) · 13.2 KB
/
ReadMe.txt
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
========================================================================
SNAP : Stanford Network Analysis Platform
http://snap.stanford.edu
========================================================================
Stanford Network Analysis Platform (SNAP) is a general purpose, high
performance system for analysis and manipulation of large networks.
SNAP is written in C++ and it scales to massive graphs with hundreds
of millions of nodes and billions of edges.
/////////////////////////////////////////////////////////////////////////////
Directory structure:
http://snap.stanford.edu/snap/description.html
snap-core:
the core SNAP graph library
snap-adv:
advanced SNAP components, not in the core, but used by examples
snap-exp:
experimental SNAP components, still in development
examples:
small sample applications that demonstrate SNAP functionality
tutorials:
simple programs, demonstrating use of various classes
glib-core:
STL-like library that implements basic data structures, like vectors
(TVec), hash-tables (THash) and strings (TStr), provides
serialization and so on
test:
unit tests for various classes
doxygen:
SNAP reference manuals
Code compiles under Windows (Microsoft Visual Studio, CygWin with gcc) and
Linux and Mac (gcc). Use the SnapExamples*.sln or provided makefiles.
Some of applications expect that GnuPlot and GraphViz are installed and
accessible -- paths are in the system PATH variable or they reside in the
working directory.
/////////////////////////////////////////////////////////////////////////////
Example applications for advanced SNAP functionality are available
in the examples directory and described at:
http://snap.stanford.edu/snap/description.html.
To compile from the command line, execute:
make all # compiles SNAP and all sample applications
To compile on Mac OS X, using Xcode:
1. From the Toolbar, select Scheme (e.g. 'bigclam').
2. Product -> Build. (or Cmd + B).
3. Run executable via the command line; or
Choose the scheme's executable (Product -> Edit Scheme -> Run -> Info)
and run: Product -> Run (or Cmd + R).
Note: If using Gnuplot, add the PATH to the scheme's environment variables.
or create symlink to /usr/bin:
sudo ln -s <gnuplot_dir>/gnuplot /usr/bin/
For code completion, the "docs" target has been created which includes all
Snap-related files and example programs.
Description of examples:
agmfit :
Detects network communities from a given network by fitting
AGM to the given network by maximum likelihood estimation.
agmgen :
Implements the Affiliation Graph Model (AGM). AGM generates
a realistic looking graph from the community affiliation of the nodes.
bigclam :
Formulates community detection problems into non-negative matrix
factorization and discovers community membership factors of nodes.
cascadegen :
Identifies cascades in a list of events.
cascades :
Simulates a SI (susceptible-infected) model on a network and computes
structural properties of cascades.
centrality :
Computes node centrality measures for a graph: closeness, eigen,
degree, betweenness, page rank, hubs and authorities.
cesna :
Implements a large scale overlapping community detection method
for networks with node attributes based on Communities from
Edge Structure and Node Attributes (CESNA).
circles :
Implements a method for identifying users social circles.
cliques :
Finds overlapping dense groups of nodes in networks,
based on the Clique Percolation Method.
coda :
Implements a large scale overlapping community detection method
based on Communities through Directed Affiliations (CoDA), which
handles directed as well as undirected networks. The method is able
to find 2-mode communities where the member nodes form a bipartite
connectivity structure.
community :
Implements network community detection algorithms: Girvan-Newman,
Clauset-Newman-Moore and Infomap.
concomp :
Computes weakly, strongly and biconnected connected components,
articulation points and bridge edges of a graph.
flows :
Computes the maximum network flow in a network.
forestfire :
Generates graphs using the Forest Fire model.
graphgen :
Generates undirected graphs using one of the many SNAP graph generators.
graphhash :
Demonstrates the use of TGHash graph hash table, useful for
counting frequencies of small subgraphs or information cascades.
infopath :
Implements stochastic algorithm for dynamic network inference from
cascade data, see http://snap.stanford.edu/infopath/.
kcores :
Computes the k-core decomposition of the network and plots
the number of nodes in a k-core of a graph as a function of k.
kronem :
Estimates Kronecker graph parameter matrix using EM algorithm.
kronfit :
Estimates Kronecker graph parameter matrix.
krongen :
Generates Kronecker graphs.
localmotifcluster :
Implements a local method for motif-based clustering using MAPPR.
lshtest :
Implements locality sensitive hashing.
magfit :
Estimates Multiplicative Attribute Graph (MAG) model parameter.
maggen :
Generates Multiplicative Attribute Graphs (MAG).
mkdatasets :
Demonstrates how to load different kinds of networks in various
network formats and how to compute various statistics of the network.
motifcluster :
Implements a spectral method for motif-based clustering.
motifs :
Counts the number of occurrence of every possible subgraph on K nodes
in the network.
ncpplot :
Plots the Network Community Profile (NCP).
netevol :
Computes properties of an evolving network, like evolution of
diameter, densification power law, degree distribution, etc.
netinf :
Implements netinf algorithm for network inference from
cascade data, see http://snap.stanford.edu/netinf.
netstat :
Computes statistical properties of a static network, like degree
distribution, hop plot, clustering coefficient, distribution of sizes
of connected components, spectral properties of graph adjacency
matrix, etc.
randwalk :
Computes Personalized PageRank between pairs of nodes.
rolx :
Implements the rolx algorithm for analysing the structural
roles in the graph.
testgraph :
Demonstrates some of the basic SNAP functionality.
temporalmotifs :
Counts temporal motifs in temporal networks.
zygote :
Demonstrates how to use SNAP with the Zygote library, which
significantly speeds up computations that need to process the
same large graph many times.
/////////////////////////////////////////////////////////////////////////////
SNAP documentation:
http://snap.stanford.edu/snap/doc.html
The library defines Graphs (nodes and edges) and Networks (graphs with data
associated with nodes and edges).
Graph types:
TNGraph :
directed graph (single directed edge between a pair of nodes)
TUNGraph :
undirected graph (single undirected edge between a pair of nodes)
TNEGraph :
directed multi-graph (multiple directed edges can exist between
a pair of nodes)
Network types:
TNodeNet<TNodeData> :
like TNGraph, but with TNodeData object for each node
TNodeEDatNet<TNodeData,TEdgeData> :
like TNGraph, but with TNodeData object for each node and TEdgeData
object for each edge
TNodeEdgeNet<TNodeData, TEdgeData> :
like TNEGraph but with TNodeData object for each node and TEdgeData
object for each edge
TNEANet :
like TNEGraph, but with attributes on nodes and edges. The attributes
are dynamic in that they can be defined at runtime
TBigNet<TNodeData> :
memory efficient implementation of TNodeNet (avoids memory
fragmentation)
To generate reference manuals, install doxygen (www.doxygen.org), and execute:
cd doxygen; make all # generates user and developer reference manuals
/////////////////////////////////////////////////////////////////////////////
SNAP tutorials
Sample programs demonstrating the use of foundational SNAP classes and
functionality are available in the tutorials directory.
To compile all the tutorials, execute the following command line:
cd tutorials; make all # generates all the tutorials
/////////////////////////////////////////////////////////////////////////////
SNAP unit tests
Unit tests are available in the test directory.
To run unit tests, install googletest (code.google.com/p/googletest) and
execute:
cd test; make run # compiles and runs all the tests
/////////////////////////////////////////////////////////////////////////////
Description of SNAP files:
http://snap.stanford.edu/snap/description.html
snap-core:
alg.h : Simple algorithms like counting node degrees, simple graph
manipulation (adding/deleting self edges, deleting isolated nodes)
and testing whether graph is a tree or a star.
anf.h : Approximate Neighborhood Function: linear time algorithm to
approximately calculate the diameter of massive graphs.
bfsdfs.h : Algorithms based on Breath First Search (BFS) and Depth First
Search (DFS): shortest paths, spanning trees, graph diameter, and
similar.
bignet.h : Memory efficient implementation of a network with data on
nodes. Use when working with very large networks.
casc.h : Computes cascades from a list of events.
centr.h : Node centrality measures: closeness, betweenness, PageRank, ...
cmty.h : Algorithms for network community detection: Modularity,
Girvan-Newman, Clauset-Newman-Moore.
cncom.h : Connected components: weakly, strongly and biconnected
components, articular nodes and bridge edges.
ff.h : Forest Fire model for generating networks that densify and have
shrinking diameters.
flow.h: Maximum flow algorithms.
gbase.h : Defines flags that are used to identify functionality of graphs.
ggen.h : Various graph generators: random graphs, copying model,
preferential attachment, RMAT, configuration model, Small world model.
ghash.h : Hash table with directed graphs (<tt>TNGraph</tt>) as keys. Uses
efficient adaptive approximate graph isomorphism testing to scale to
large graphs. Useful when one wants to count frequencies of various
small subgraphs or cascades.
gio.h : Graph input output. Methods for loading and saving various textual
and XML based graph formats: Pajek, ORA, DynNet, GraphML (GML),
Matlab.
graph.h : Implements graph types TUNGraph, TNGraph and TNEGraph.
gstat.h : Computes many structural properties of static and evolving networks.
gsvd.h : Eigen and singular value decomposition of graph adjacency matrix.
gviz.h : Interface to GraphViz for plotting small graphs.
kcore.h : K-core decomposition of networks.
network.h : Implements network types TNodeNet, TNodeEDatNet and TNodeEdgeNet.
randwalk.h : Computing random walk scores and personalized PageRank
between pairs of nodes
Snap.h : Main include file of the library.
statplot.h : Plots of various structural network properties: clustering,
degrees, diameter, spectrum, connected components.
subgraph.h : Extracting subgraphs and converting between different
graph/network types.
timenet.h : Temporally evolving networks.
triad.h : Functions for counting triads (triples of connected nodes in the
network) and computing clustering coefficient.
util.h : Utilities to manipulate PDFs, CDFs and CCDFs. Quick and dirty
string manipulation, URL and domain manipulation routines.
snap-adv:
agm*.h : Implements the Affiliation Graph Model (AGM).
cliques.h : Maximal clique detection and Clique Percolation method.
graphcounter.h : Performs fast graph isomorphism testing to count the
frequency of topologically distinct sub-graphs.
kronecker.h : Kronecker Graph generator and KronFit algorithm for
estimating parameters of Kronecker graphs.
mag.h : Implements the Multiplicative Attribute Graph (MAG).
motifcluster.h : Implements motif-based clustering algorithms.
ncp.h : Network community profile plot. Implements local spectral graph
partitioning method to efficiently find communities in networks.
rolx.h : Node role detection.
subgraphenum.h : Enumerates all connected induced sub-graphs of particular
size.
snap-exp:
arxiv.h : Functions for parsing Arxiv data and standardizing author names.
dblp.h : Parser for XML dump of DBLP data.
imdbnet.h : Actors-to-movies bipartite network of IMDB.
mxdag.h Finds the maximum directed-acyclic subgraph of a given
directed graph.
signnet.h : Networks with signed (+1, -1) edges that can denote
trust/distrust between the nodes of the network.
sir.h : SIR epidemic model and SIR parameter estimation.
spinn3r.h : Past parser for loading blog post data from Spinn3r.
trawling.h : Algorithm of extracting bipartite cliques from the network.
wgtnet.h : Weighted networks.
wikinet.h : Networks based on Wikipedia.