introduction.tex

\section{Introduction}\label{sec:introduction}

A blockchain is a list of \emph{blocks}, each reporting the hash
of a previous block, satisfying some consistency or \emph{consensus} rules.
Blocks hold \emph{transactions}, whose exact nature is not relevant here:
they are requests to update the state of a global abstract machine
(a ledger of payments as in Bitcoin~\cite{Nakamoto08,Antonopoulos17})
or a sort of global memory where data structures are allocated and modified
(as in Ethereum~\cite{AntonopoulosW18}).
By using hashes as machine-independent
pointers, blockchains can be distributed in a network of peers.
This is desirable since data gets safely duplicated
and no special peer determines the history by itself.
However, peers expand the blockchain, independently from other
peers, hence the blockchain becomes a tree rather than a list.
A notion of chain quality incentivizes peers to append blocks to the highest-quality chain
(the \emph{best} chain).
Therefore, a peer could replace its current best chain with another, even better chain,
a so called \emph{history change}.

As presented above, peers are free to generate new blocks at maximal speed, flood the network
with new blocks and make the emergence of a best chain difficult.
This is an efficiency and security issue: frequent history changes allow
\emph{double spending} and network forks.
%, when the same money is moved in the ledger twice, once in the current history
%and once in the updated history.
The actual genious of Nakamoto~\cite{Nakamoto08} was to
(largely) solve this issue with a consensus rule requiring blocks to answer a \emph{challenge}
contained in their previous block. Namely, the hash of each block must be smaller than
a \emph{difficulty} value computed from the previous blocks, directly bound to the quality of the chain.
Therefore, who creates (\emph{mines}) a new block runs a \emph{proof of work} algorithm
that rotates (\emph{grinds}) many alternative values for a block field (\emph{nonce}), until
the hash of the block is smaller than the difficulty. This hardens the creation of new blocks,
makes it impossible to create blocks at arbitrary speed and introduces an
incentive to expanding the best chain only,
rather than creating alternative histories by mining on multiple chains.
%, since otherwise the miner
%risks spending work (concretely, electricity) for creating blocks discarded by its peers.
The difficulty changes overtime, to account for change in the total hashing power
of the network. As shown in~\cite{GarayKL17}, this stabilizes the block creation rate and
supports network consistency (all honest peers converge to the same chain, eventually).

Proof of work is a brute-force algorithm,
because of the non-correlation property of hash functions.
Therefore, it consumes energy (as much as a medium-sized country, for Bitcoin);
moreover, it is not egalitarian, being
worthwhile only in countries where electricity is cheap; furthermore, it
is more efficient over dedicated, expensive hardware (such as ASICs),
against the promise of a democratic and open network.
Therefore, the current trend is towards \emph{proof of stake}.
Its different flavors share the common idea that mining is limited
to a (static or dynamic, exclusive or delegatable)
set of peers (\emph{validators}), that \emph{stake} a collateral in exchange of mining rights.
Many criticize proof of stake for being centralized and undemocratic
(\emph{rich becomes richer}).
Moreover, it suffers from what we call a \emph{start-up issue}: as long as the cryptocurrency
of a newborn blockchain has still no value, validators have no incentive to work
and be updated. Moreover, validators get punished (\emph{slashed}) if they misbehave or are offline, which
might be perceived as unfair if that happens because of a connectivity issue or black-out.

A further alternative is \emph{proof of space}~\cite{AtenieseBFG14,DziembowskiFKP15}, where
miners must dedicate a large chunk of disk memory for answering challenges.
Its energy consumption is negligeable and no special
hardware helps for mining, currently: the technology is both cheap
and democratic. Moreover, proof of space allows
one to capitalize on unused memory, for free, while proof of work has always an
inherent electricity cost.
For fairness, proof of space protocols should only allow to generate answers of
quality directly proportional to the allocated space, or otherwise they are said
to suffer from a time/memory tradeoff. As a drawback,
cheap answers introduce new \emph{nothing-at-stake} security
attacks~\cite{ParkKFGAP18}, that are instead anti-economic with proof of work,
since computing power can only be dedicated to one mining task:
%
\begin{description}
\item[Block grinding:] Miners might find it profitable (\emph{rational})
  to mine many alternative new blocks, each holding
  different transactions, finally selecting the block that leads to better answers
  to subsequent challenges.
\item[Challenge grinding:] Miners might find it profitable
  to provide suboptimal answers to the current challenge if
  this leads to subsequent challenges for which they have much better answers.
\item[Mining on multiple chains:] Miners might find it profitable
  to mine multiple chains simultaneously (not only the best one).
\end{description}
%
These attacks increase the risk of double spending and make it convenient
to mine through space \emph{and} work, thus neutralizing the benefits of proof of space.

Another nothing-at-stake problem, that has not received great attention up to now,
is the \emph{newborn attack}~\cite{TangZDWLG0L19}: a miner that has allocated a large space
for mining for a blockchain network $N$ could use the same space, unchanged, for
mining for a newborn, small network $N'$. If the total space used by the peers of $N'$ is
initially relatively small, it could be possible for the miner to hijack the history of $N'$,
effectively taking its control.

Most formalizations of proof of space~\cite{AtenieseBFG14,DziembowskiFKP15,RenD16} are
based on challenges against graphs of high pebbling complexity, but
no actual blockchain has ever been built that way: only SpaceMint~\cite{ParkKFGAP18} exists
which is not a real blockchain but a minimal non-maintained prototype
of the theoretical consensus protocol only. Namely, a real graph pebbling blockchain has
never been implemented because:
%
%\begin{enumerate}
%\item
(1) the protocol includes an \emph{initialization phase},
  run for each new prover (miner) that joins the blockchain,
  that complicates the protocol itself and
  requires to spend cryptocurrency \emph{before}
  starting mining;
  %Compare this with Bitcoin, for instance, that allows one to start mining
  %and \emph{later} collect cryptocurrency. This initialization is a barrier against new
  %miners, that is, a limit to democracy;
  %\item
(2) answers to challenges (\emph{proofs}),
  included in blockchain, are relatively large~\cite{AbusalahACKPR17}:
  kilobytes or even megabytes for proofs created in the initialization phase.
%\end{enumerate}

An alternative is Permacoin~\cite{MillerJSPK14}, based on proof of retrievability rather than
graph pebbling, still in the family of the proof of space algorithms~\cite{RenD16}.
But neither Permacoin has been implemented: only a prototypical and minimal implementation
of only its consensus algorithm exists, discontinued in~2014. Neither SpaceMint nor Permacoin
have been shown to support smart contracts.

Our target of analysis has been Signum~\cite{Signum}, instead, since we wanted a fully-fledged implementation,
which gives practical relevance to our work. Signum (previously Burstcoin) is
actually deployed and runs continously since~2014. It provides smart contracts on top of its consensus algorithms.
%Nothing of that can be said about SpaceMint or
%Permacoin, that only exist as minimal prototype implementations limited to only their consensus
%algorithm, are not deployed blockchains, have not been shown to support smart contracts
%and their code has been discontinued a decade ago.
Signum is not based on graph pebbling: instead,
each miner precomputes a large \emph{plot file} of hashes, that is not shared nor
stored in blockchain.
A peer that wants to mine the next block derives a challenge from the current blockchain head
and challenges a miner for an answer, called
\emph{deadline}\ie a small (less than $200$ bytes) data structure, that can be very quickly
derived from the plot, with a quality measure
(its \emph{waiting time}) proportional, on average, to the size of the plot.
Signum's protocol is attractive since it has no initialization phase (each miner creates its plot file
independently and off-line) and its answers are very small.
In principle, Signum can be mined as described above (proof of space)
but also by recomputing the plot file on-the-fly, at each challenge, without any disk allocation
(proof of work). Because of that, Signum's mining is sometimes called \emph{proof of capacity}.
However, the proof of work version of Signum remains theoretical and there is no evidence that Signum has ever
been mined like that. This is because plots use an expensive hashing algorithm (shabal256) and are very big, so that their recomputation takes longer than the block creation rate.
Specialized hardware might change the situation in the future but, from its inception in 2014 to
the present day, Signum seems to have been only mined with proof of space.

The actual drawback of Signum is that the underlying theory has never been formalized nor defined up to now.
Moroever, \cite{ParkKFGAP18}~warned about a potential block grinding attack.
Without a formalization, it is impossible to judge if it is real.

Therefore, this paper provides the following contributions about Signum:
%
\begin{itemize}
\item a formalization of its algorithm, recostructed and interpolated from a very informal and partial
  description~\cite{SignumPlotting} and its poorly commented code~\cite{SignumSource};
\item a proof that block grinding attacks are impossible, against a previous hint~\cite{ParkKFGAP18};
\item a proof that challenge grinding attacks are limited;
\item a new protection against newborn attacks.
\end{itemize}
%
These results are relevant since they show that Signum's consensus
is actually supported by a formal theory and protected from a large class of attacks.

The rest of this paper is organized as follows.
Sec.~\ref{sec:nonces_and_plots} formalizes the structure of the plot files.
Sec.~\ref{sec:challenges_and_deadlines} defines the challenges that the consensus
algorithm must solve, and their answers (deadlines).
Sec.~\ref{sec:blockchain_construction} presents Signum's mining algorithm.
Sec.~\ref{sec:prolog_structure_attacks} studies grinding attacks
in Signum and proposes a new solution against newborn attacks.
Sec.~\ref{sec:related_work} presents related work.
Sec.~\ref{sec:conclusion} concludes. Proofs are reported in appendix.
Tab.~\ref{tab:notations} collects notations used throughout the paper.
It also reports specific choices
made in~\cite{SignumPlotting} (rightmost column), but this paper remains parametric \wrt them.
For instance, for genericity,
our formalization uses three hashing functions, that might actually coincide.

\begin{table}[t]
\begin{center}
\begin{tabular}{|c|c|c|}
  \hline
  \textbf{notation} & \textbf{meaning} & \textbf{in~\cite{SignumPlotting}}\\\hline\hline

  $\append$ & concatenation of sequences & \\\hline
  $\numberofscoops$ & number of scoops contained in a nonce & $4096$\\\hline

  $h_\deadline$ & hashing function for computing nonces, plots and deadlines & shabal256\\\hline

  $h_\generation$ & hashing function for computing the generations of challenges & shabal256\\\hline

  $h_\block$ & hashing function for computing the hash of the blocks & sha256 \\\hline

  $\kappa$ & threshold to the number of bytes fed to $h_\deadline$ in Alg.~\ref{alg:nonce_construction} & $4096$\\\hline

  $\beat$ & target block creation time interval (ms) & $240000$ \\\hline

  $\sigma_\genesis$ & generation signature for the genesis block & \\\hline

  $\tau_\now$ & current time (ms from Unix epoch) & \\\hline

  $\oblivion$ & acceleration reaction to changes of mining power ($0$ to $1$) & \\\hline
\end{tabular}
\end{center}
\caption{Notations and contextual information used in our formalization and their specific instantiations
  used in~\cite{SignumPlotting}, when available.}
\label{tab:notations}
\end{table}

\vspace*{1ex}
\textbf{Acknowledgments:}
%Removed for anonymization. %\newpage
We thank the developers of Signum
%for discussions on their Discord channel
and K.\ Pietrzak for discussing some
theoretical aspects of Signum.