Merge pull request #70 from TensorBFS/fix-62-69-66
Fix issues: 62 69 66
mroavi authored Sep 10, 2023
2 parents b9797d4 + ea48d21 commit fbbd5a4
Showing 7 changed files with 278 additions and 15 deletions.
7 changes: 5 additions & 2 deletions docs/make.jl
@@ -48,13 +48,16 @@ makedocs(;
),
pages=[
"Home" => "index.md",
"Background" => "background.md",
"Background" => [
"Probabilistic Inference" => "probabilisticinference.md",
"Tensor Networks" => "tensornetwork.md",
"UAI file formats" => "uai-file-formats.md"
],
"Examples" => [
"Overview" => "examples-overview.md",
"Asia Network" => "generated/asia/main.md",
"Hard-core Lattice Gas" => "generated/hard-core-lattice-gas/main.md",
],
"UAI file formats" => "uai-file-formats.md",
"Performance tips" => "generated/performance.md",
"API" => [
"Public" => "api/public.md",
24 changes: 24 additions & 0 deletions docs/src/assets/preambles/the-tensor-network.tex
@@ -0,0 +1,24 @@
\usepackage{tikz}
\usepackage{xcolor}
\usetikzlibrary{positioning}

\definecolor{c01}{HTML}{5790fc}
\definecolor{c02}{HTML}{f89c20}
\definecolor{c03}{HTML}{e42536}
\definecolor{c04}{HTML}{964a8b}
\definecolor{c05}{HTML}{9c9ca1}
\definecolor{c06}{HTML}{7a21dd}

\tikzset {
mytensor/.style={
circle,
thick,
fill=white,
draw=black!100,
font=\small,
minimum size=0.5cm
},
myedge/.style={
line width=0.80pt,
}
}
5 changes: 3 additions & 2 deletions docs/src/index.md
@@ -59,9 +59,10 @@ more complex, real-world models.
## Outline
```@contents
Pages = [
"background.md",
"examples-overview.md",
"probabilisticinference.md",
"tensornetwork.md",
"uai-file-formats.md",
"examples-overview.md",
"performance.md",
"api/public.md",
"api/internal.md",
@@ -1,4 +1,4 @@
# Background
# Probabilistic inference

*TensorInference* implements efficient methods to perform Bayesian inference in
*probabilistic graphical models*, such as Bayesian Networks or Markov random
229 changes: 229 additions & 0 deletions docs/src/tensornetwork.md
@@ -0,0 +1,229 @@
# Tensor networks

We now introduce the core ideas of tensor networks, highlighting their
connections with probabilistic graphical models (PGM) to align the terminology
between them.

For our purposes, a tensor is equivalent to the concept of a factor as defined
in the PGM domain, which we detail more formally below.

## What is a tensor?

*Definition*: A tensor $T$ is defined as:
```math
T: \prod_{V \in \bm{V}} \mathcal{D}_{V} \rightarrow \texttt{number}.
```
Here, the function $T$ maps each possible instantiation of the random
variables in its scope $\bm{V}$ to a generic number type. In the context of tensor networks,
a minimum requirement is that the number type forms a commutative semiring.
To define a commutative semiring with the addition operation $\oplus$ and the multiplication operation $\odot$ on a set $S$, the following relations must hold for arbitrary elements $a, b, c \in S$.
```math
\newcommand{\mymathbb}[1]{\mathbb{#1}}
\begin{align*}
(a \oplus b) \oplus c = a \oplus (b \oplus c) & \hspace{5em}\text{$\triangleright$ commutative monoid $\oplus$ with identity $\mymathbb{0}$}\\
a \oplus \mymathbb{0} = \mymathbb{0} \oplus a = a &\\
a \oplus b = b \oplus a &\\
&\\
(a \odot b) \odot c = a \odot (b \odot c) & \hspace{5em}\text{$\triangleright$ commutative monoid $\odot$ with identity $\mymathbb{1}$}\\
a \odot \mymathbb{1} = \mymathbb{1} \odot a = a &\\
a \odot b = b \odot a &\\
&\\
a \odot (b\oplus c) = a\odot b \oplus a\odot c & \hspace{5em}\text{$\triangleright$ left and right distributive}\\
(a\oplus b) \odot c = a\odot c \oplus b\odot c &\\
&\\
a \odot \mymathbb{0} = \mymathbb{0} \odot a = \mymathbb{0}
\end{align*}
```
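For concreteness, here is a small, self-contained Julia sketch (written for this
page only, independent of any package API) that checks these axioms for the
tropical (max-plus) semiring, where $\oplus = \max$ with identity $-\infty$ and
$\odot = +$ with identity $0$:
```julia
⊕(a, b) = max(a, b)   # semiring addition; -Inf plays the role of 𝟘
⊙(a, b) = a + b       # semiring multiplication; 0.0 plays the role of 𝟙

a, b, c = 1.5, -2.0, 0.3
(a ⊕ b) ⊕ c == a ⊕ (b ⊕ c)         # associativity of ⊕
a ⊕ -Inf == a                       # -Inf is the ⊕ identity
a ⊙ 0.0 == a                        # 0.0 is the ⊙ identity
a ⊙ (b ⊕ c) == (a ⊙ b) ⊕ (a ⊙ c)   # distributivity of ⊙ over ⊕
a ⊙ -Inf == -Inf                    # the ⊕ identity annihilates under ⊙
```
The ordinary sum-product pair $(+, \times)$ over the nonnegative reals satisfies
the same axioms; it is the semiring used when computing partition functions,
while the max-plus semiring underlies most-likely-configuration queries in log
space.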
Tensors are represented using multidimensional arrays of nonnegative numbers
with labeled dimensions. These labels correspond to the array's indices, which
in turn represent the set of random variables that the tensor is a function
of. Thus, in this context, the terms **label**, **index**, and
**variable** are synonymous and hence used interchangeably.
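As a toy illustration of this correspondence, a factor over two binary random
variables can be stored as a plain $2 \times 2$ array whose axes are labeled by
the variables; the bookkeeping below is purely illustrative and not part of any
package API:
```julia
T = [0.9 0.1;
     0.2 0.8]      # T[a, b] is the value of the factor at the instantiation (A = a, B = b)
vars = (:A, :B)    # illustrative labels: axis 1 ↔ variable A, axis 2 ↔ variable B
T[2, 1]            # the tensor evaluated at A = 2, B = 1, i.e. 0.2
```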

## What is a tensor network?

We now turn our attention to defining a **tensor network**, a mathematical
object used to represent a multilinear map between tensors. This concept is
widely employed in fields like condensed matter physics
[^Orus2014][^Pfeifer2014], quantum simulation [^Markov2008][^Pan2022], and
even in solving combinatorial optimization problems [^Liu2023]. It is worth
noting that we use a generalized version of the conventional notation, best
known through the
[einsum](https://numpy.org/doc/stable/reference/generated/numpy.einsum.html)
function, which is widely used in high-performance computing. Packages that
implement this conventional notation include
- [numpy](https://numpy.org/doc/stable/reference/generated/numpy.einsum.html)
- [OMEinsum.jl](https://github.com/under-Peter/OMEinsum.jl)
- [PyTorch](https://pytorch.org/docs/stable/generated/torch.einsum.html)
- [TensorFlow](https://www.tensorflow.org/api_docs/python/tf/einsum)

This approach allows us to represent a broader range of sum-product
multilinear operations between tensors, thus meeting the requirements of the
PGM field.

*Definition*[^Liu2023]: A tensor network is a multilinear map represented by the triple
$\mathcal{N} = (\Lambda, \mathcal{T}, \bm{\sigma}_0)$, where:
- $\Lambda$ is the set of variables present in the network
$\mathcal{N}$.
- $\mathcal{T} = \{ T^{(k)}_{\bm{\sigma}_k} \}_{k=1}^{M}$ is the set of
input tensors, where each tensor $T^{(k)}_{\bm{\sigma}_k}$ is identified
by a superscript $(k)$ and has an associated scope $\bm{\sigma}_k$.
- $\bm{\sigma}_0$ specifies the scope of the output tensor.

More specifically, each tensor $T^{(k)}_{\bm{\sigma}_k} \in \mathcal{T}$ is
labeled by a string $\bm{\sigma}_k \in \Lambda^{r \left(T^{(k)} \right)}$, where
$r \left(T^{(k)} \right)$ is the rank of $T^{(k)}$. The multilinear map, also
known as the `contraction`, applied to this triple is defined as
```math
\texttt{contract}(\Lambda, \mathcal{T}, \bm{\sigma}_0) = \sum_{\bm{\sigma}_{\Lambda
\setminus [\bm{\sigma}_0]}} \prod_{k=1}^{M} T^{(k)}_{\bm{\sigma}_k},
```
Notably, the summation extends over all instantiations of the variables that
are not part of the output tensor.
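To make this definition concrete, the following deliberately naive Julia sketch
evaluates `contract` by brute-force enumeration over all instantiations of the
variables in $\Lambda$; the function name `contract_naive` and its argument
layout are illustrative only, and the closing check anticipates the
matrix-multiplication example discussed next:
```julia
function contract_naive(Λ, tensors, σ0, sizes)
    # Λ       : vector of all variables in the network
    # tensors : vector of pairs `scope => array`, one per input tensor
    # σ0      : scope (label string) of the output tensor
    # sizes   : dictionary mapping each variable to its domain size
    out = zeros(Float64, (sizes[v] for v in σ0)...)
    for assignment in Iterators.product((1:sizes[v] for v in Λ)...)
        env = Dict(zip(Λ, assignment))                          # one instantiation of all variables
        term = prod(t[(env[v] for v in σk)...] for (σk, t) in tensors)
        out[(env[v] for v in σ0)...] += term                    # sums over variables outside σ0
    end
    return out
end

A, B = rand(2, 3), rand(3, 4)
C = contract_naive([:i, :j, :k], [[:i, :j] => A, [:j, :k] => B], [:i, :k],
                   Dict(:i => 2, :j => 3, :k => 4))
C ≈ A * B   # true: matrix multiplication as a tensor network contraction
```
This brute-force evaluation is exponential in $|\Lambda|$; the point of the
contraction-order optimization discussed later on this page is to avoid it.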

As an example, consider matrix multiplication, which can be specified as a
tensor network contraction:
```math
(AB)_{ik} = \texttt{contract}\left(\{i,j,k\}, \{A_{ij}, B_{jk}\}, ik\right),
```
Here, matrices $A$ and $B$ are input tensors labeled by strings $ij, jk \in
\{i, j, k\}^2$. The output tensor is labeled by string $ik$. Summations run
over indices $\Lambda \setminus [ik] = \{j\}$. The contraction corresponds to
```math
\texttt{contract}\left(\{i,j,k\}, \{A_{ij}, B_{jk}\}, ik\right) = \sum_j
A_{ij}B_{jk},
```
In the einsum notation commonly used in various programming languages, this is
equivalent to `ij, jk -> ik`.
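This can be verified numerically with one of the packages listed above, for
example OMEinsum.jl; the sketch below assumes only its `ein` string-macro
interface:
```julia
using OMEinsum

A, B = rand(2, 3), rand(3, 4)
ein"ij,jk->ik"(A, B) ≈ A * B   # true: the einsum contraction reproduces A * B
```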

Diagrammatically, a tensor network can be represented as an *open hypergraph*.
In this diagram, a tensor maps to a vertex, and a variable maps to a
hyperedge. Tensors sharing the same variable are connected by the same
hyperedge for that variable. The diagrammatic representation of matrix
multiplication is:
```@eval
using TikzPictures
tp = TikzPicture(
L"""
\matrix[row sep=0.8cm,column sep=0.8cm,ampersand replacement= \& ] {
\node (1) {}; \&
\node (a) [mytensor] {$A$}; \&
\node (b) [mytensor] {$B$}; \&
\node (2) {}; \&
\\
};
\draw [myedge, color=c01] (1) edge node[below] {$i$} (a);
\draw [myedge, color=c02] (a) edge node[below] {$j$} (b);
\draw [myedge, color=c03] (b) edge node[below] {$k$} (2);
""",
options="every node/.style={scale=2.0}",
preamble="\\input{" * joinpath(@__DIR__, "assets", "preambles", "the-tensor-network") * "}",
)
save(SVG("the-tensor-network1"), tp)
```

```@raw html
<img src="the-tensor-network1.svg" style="margin-left: auto; margin-right: auto; display: block; width: 50%">
```

In this diagram, we use different colors to denote different hyperedges. The
hyperedges for $i$ and $k$ are left open to denote the variables in the output
string $\bm{\sigma}_0$. The reason we use hyperedges rather than regular edges
will become clear in the following star contraction example.
```math
\texttt{contract}(\{i,j,k,l\}, \{A_{il}, B_{jl}, C_{kl}\}, ijk) = \sum_{l}A_{il}
B_{jl} C_{kl}
```
The equivalent einsum notation employed by many programming languages is `il,
jl, kl -> ijk`.
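The same contraction written as a sketch with OMEinsum.jl's string-macro
interface:
```julia
using OMEinsum

A, B, C = rand(2, 5), rand(3, 5), rand(4, 5)
out = ein"il,jl,kl->ijk"(A, B, C)
size(out)                                                     # (2, 3, 4)
out[1, 2, 3] ≈ sum(A[1, l] * B[2, l] * C[3, l] for l in 1:5)  # true
```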

Since the variable $l$ is shared across all three tensors, a simple graph
can't capture the diagram's complexity. The more appropriate hypergraph
representation is shown below.
```@eval
using TikzPictures
tp = TikzPicture(
L"""
\matrix[row sep=0.4cm,column sep=0.4cm,ampersand replacement= \& ] {
\&
\&
\node[color=c01] (j) {$j$}; \&
\&
\&
\\
\&
\&
\node (b) [mytensor] {$B$}; \&
\&
\&
\\
\node[color=c03] (i) {$i$}; \&
\node (a) [mytensor] {$A$}; \&
\node[color=c02] (l) {$l$}; \&
\node (c) [mytensor] {$C$}; \&
\node[color=c04] (k) {$k$}; \&
\\
};
\draw [myedge, color=c01] (j) edge (b);
\draw [myedge, color=c02] (b) edge (l);
\draw [myedge, color=c03] (i) edge (a);
\draw [myedge, color=c02] (a) edge (l);
\draw [myedge, color=c02] (l) edge (c);
\draw [myedge, color=c04] (c) edge (k);
""",
options="every node/.style={scale=2.0}",
preamble="\\input{" * joinpath(@__DIR__, "assets", "preambles", "the-tensor-network") * "}",
)
save(SVG("the-tensor-network2"), tp)
```

```@raw html
<img src="the-tensor-network2.svg" style="margin-left: auto; margin-right: auto; display: block; width: 50%">
```

As a final note, our definition of a tensor network allows for repeated
indices within the same tensor, which translates to self-loops in their
corresponding diagrams.
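For instance, a matrix whose two indices carry the same variable corresponds to
a self-loop that sums over the diagonal, i.e. computes the trace; a sketch with
OMEinsum.jl (assuming, as documented there, that repeated indices within one
tensor are supported):
```julia
using OMEinsum, LinearAlgebra

A = rand(3, 3)
ein"ii->"(A)[] ≈ tr(A)   # true: the zero-dimensional output holds the trace of A
```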

## Tensor network contraction orders

The performance of a tensor network contraction depends on the order in which
the tensors are contracted. The contraction order is usually specified by a
binary tree: its leaves are the input tensors, each internal node is the
intermediate tensor obtained by contracting its two children, and the root is
the output tensor.

Numerous approaches have been proposed to determine efficient contraction
orderings, which include:
- Greedy algorithms
- Breadth-first search and dynamic programming [^Pfeifer2014]
- Graph bipartitioning [^Gray2021]
- Local search [^Kalachev2021]

Some of these have been implemented in the
[OMEinsum](https://github.com/under-Peter/OMEinsum.jl) package. Please check
[Performance Tips](@ref) for more details.
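A minimal sketch of searching for a contraction order with OMEinsum.jl is shown
below. The optimizer `TreeSA` (a local-search method) and the
`contraction_complexity` report follow that package's documented interface,
though the exact option names may differ between versions:
```julia
using OMEinsum

code = ein"ij,jk,kl,lm->im"                     # a chain of four matrix products
sizes = uniformsize(code, 100)                  # assume every index has dimension 100
optcode = optimize_code(code, sizes, TreeSA())  # search for a good contraction tree
contraction_complexity(optcode, sizes)          # time / space / read-write complexity
```
The resulting nested code can then be applied to concrete tensors just like the
un-optimized one.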

## References

[^Orus2014]:
Orús R. A practical introduction to tensor networks: Matrix product states and projected entangled pair states[J]. Annals of physics, 2014, 349: 117-158.

[^Markov2008]:
Markov I L, Shi Y. Simulating quantum computation by contracting tensor networks[J]. SIAM Journal on Computing, 2008, 38(3): 963-981.

[^Pfeifer2014]:
Pfeifer R N C, Haegeman J, Verstraete F. Faster identification of optimal contraction sequences for tensor networks[J]. Physical Review E, 2014, 90(3): 033315.

[^Gray2021]:
Gray J, Kourtis S. Hyper-optimized tensor network contraction[J]. Quantum, 2021, 5: 410.

[^Kalachev2021]:
Kalachev G, Panteleev P, Yung M H. Multi-tensor contraction for XEB verification of quantum circuits[J]. arXiv:2108.05665, 2021.

[^Pan2022]:
Pan F, Chen K, Zhang P. Solving the sampling problem of the sycamore quantum circuits[J]. Physical Review Letters, 2022, 129(9): 090502.

[^Liu2023]:
Liu J G, Gao X, Cain M, et al. Computing solution space properties of combinatorial optimization problems via generic tensor networks[J]. SIAM Journal on Scientific Computing, 2023, 45(3): A1239-A1270.
5 changes: 3 additions & 2 deletions examples/asia/main.jl
@@ -60,8 +60,9 @@ tn = TensorNetworkModel(model)

# ---

# Calculate the ``\log_{10}`` partition function
probability(tn) |> first |> log10
# Calculate the partition function.
# Since the factors in this model are normalized, the partition function is the same as the total probability, $1$.
probability(tn) |> first

# ---

21 changes: 13 additions & 8 deletions examples/hard-core-lattice-gas/main.jl
@@ -26,19 +26,23 @@ using GenericTensorNetworks.Graphs: edges, nv
graph = unit_disk_graph(vec(sites), blockade_radius)
show_graph(graph; locs=sites, texts=fill("", length(sites)))

# These constraints defines a independent set problem that characterized by the following energy based model.
# Let $G = (V, E)$ be a graph, where $V$ is the set of vertices and $E$ be the set of edges. The energy model for the hard-core lattice gas problem is
# These constraints define an independent set problem that is characterized by the following energy-based model.
# Let $G = (V, E)$ be a graph, where $V$ is the set of vertices and $E$ is the set of edges.
# The energy model for the hard-core lattice gas problem is
# ```math
# E(\mathbf{n}) = -\sum_{i \in V}w_i n_i + \infty \sum_{(i, j) \in E} n_i n_j
# E(\mathbf{n}) = -\sum_{i \in V}w_i n_i + U \sum_{(i, j) \in E} n_i n_j
# ```
# where $n_i \in \{0, 1\}$ is the number of particles at site $i$, and $w_i$ is the weight associated with it. For unweighted graphs, the weights are uniform.
# The solution space hard-core lattice gas is equivalent to that of an independent set problem. The independent set problem involves finding a set of vertices in a graph such that no two vertices in the set are adjacent (i.e., there is no edge connecting them).
# $U$ is the repulsive interaction strength between two particles.
# To represent the independence constraint, we let $U = \infty$, i.e., the coexistence of two particles at sites connected by an edge is completely forbidden.
# The solution space of the hard-core lattice gas is equivalent to that of an independent set problem.
# The independent set problem involves finding a set of vertices in a graph such that no two vertices in the set are adjacent (i.e., there is no edge connecting them).
# One can create a tensor-network-based model of an independent set problem with the package [`GenericTensorNetworks.jl`](https://github.com/QuEraComputing/GenericTensorNetworks.jl).
using GenericTensorNetworks
problem = IndependentSet(graph; optimizer=GreedyMethod());

# There has been a lot of discussions related to solution space properties in the `GenericTensorNetworks` [documentaion page](https://queracomputing.github.io/GenericTensorNetworks.jl/dev/generated/IndependentSet/).
# In this example, we show how to use `TensorInference` to use probabilistic inference for understand the finite temperature properties of this statistic physics model.
# There are plenty of discussions related to solution space properties in the `GenericTensorNetworks` [documentation page](https://queracomputing.github.io/GenericTensorNetworks.jl/dev/generated/IndependentSet/).
# In this example, we show how to use `TensorInference` to perform probabilistic inference to understand the finite-temperature properties of this statistical model.
# We use [`TensorNetworkModel`](@ref) to convert a combinatorial optimization problem to a probabilistic model.
# Here, we let the inverse temperature be $\beta = 3$.

@@ -62,7 +66,8 @@ pmodel2 = TensorNetworkModel(problem, β; mars=[[e.src, e.dst] for e in edges(gr
mars = marginals(pmodel2);

# We show the probability that both sites on an edge are not occupied
show_graph(graph; locs=sites, edge_colors=[(b = mars[[e.src, e.dst]][1, 1]; (1-b, 1-b, 1-b)) for e in edges(graph)], texts=fill("", nv(graph)), edge_line_width=5)
show_graph(graph; locs=sites, edge_colors=[(b = mars[[e.src, e.dst]][1, 1]; (1-b, 1-b, 1-b)) for e in edges(graph)], texts=fill("", nv(graph)),
    edge_line_widths=[8 * mars[[e.src, e.dst]][1, 1] for e in edges(graph)])

# ## The most likely configuration
# The MAP and MMAP can be used to get the most likely configuration given an evidence.
@@ -90,5 +95,5 @@ sum(config2)
# One can use [`sample`](@ref) to generate samples from the hard-core lattice gas at finite temperature.
# The return value is a matrix, with the columns corresponding to different samples.
configs = sample(pmodel3, 1000)
sizes = sum(configs; dims=1)
sizes = sum.(configs)
[count(==(i), sizes) for i=0:34] # counting sizes
