Fix gamma and delta encoding in ex. 6.2 and improve consistency
flandolfi committed Feb 7, 2018
1 parent 9f2c4b1 commit e71abca
Showing 3 changed files with 57 additions and 38 deletions.
29 changes: 18 additions & 11 deletions 09_posting_compression/150605-2.tex
@@ -28,14 +28,14 @@

$$S'=(1,3,4,5,9,16,23,27,28,31,40).$$

Then, in order to encode the values obtained as binary strings, we use $b = \lceil \log_2{40}\rceil = 6$ bits.
$w = \lceil \log_2{\frac{2^b}{n}}\rceil = 3$ of these bits are used to produce the $L$ sequence, and the remaining ones
are used to produce the $H$ sequence\footnote{Refer to the note about integers' encoding given during the course in order
to better understand how sequence $H$ is constructed.}.

Then we encode the integers in $S'$ as binary strings of
$b = \lceil \log_2{40}\rceil = 6$ bits, separating the
$w = \lceil \log_2{\frac{2^b}{n}}\rceil = \lceil \log_2{\frac{2^6}{11}}\rceil = 3$
least significant bits, as in the following table:
%
\begin{center}
\begin{tabular}{ c | c | c }
$S'$ & $z$ & $w$ \\ \hline
$S'$ & \multicolumn{2}{c}{$(S'_i)_2$} \\ \hline
1 & 000 & 001 \\
3 & 000 & 011 \\
4 & 000 & 100 \\
@@ -49,8 +49,15 @@
40 & 101 & 000
\end{tabular}
\end{center}

\begin{align*}
&L = 001011100101001000111011100111000, \\
&H = 1111010110111001000.
\end{align*}
%
Then, we build the vector $L$ by concatenating the $w$ least significant bits of
each number
%
$$L = 001\;011\;100\;101\;001\;000\;111\;011\;100\;111\;000\;.$$
%
Finally, we complete the Elias-Fano encoding by storing, for each binary string $s$
of $z = b - w = 6 - 3 = 3$ bits (viz. from $s=000$ to $s=111$), the
inverse-unary representation of the number of consecutive occurrences of $s$ in
the second column of the above table, obtaining
%
$$H = 11110\;10\;110\;1110\;0\;10\;0\;0\;.$$
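
The construction of $L$ and $H$ can be checked mechanically. The following is a minimal Python sketch, not part of the exercise: the function name `elias_fano` is hypothetical, and it computes $b$ with `int.bit_length()`, which coincides with $\lceil \log_2{40}\rceil = 6$ for this input (this is an assumption made to avoid trouble when the largest value is an exact power of two).

```python
from math import ceil, log2

def elias_fano(values):
    """Return (L, H) as bit strings for a sorted list of positive integers.

    Hypothetical helper following the convention used in the text:
    w = ceil(log2(2^b / n)) low bits go to L, the remaining z = b - w
    high bits are summarised in H with inverse-unary bucket counts.
    """
    n = len(values)
    b = max(values).bit_length()        # assumption: equals ceil(log2(40)) = 6 here
    w = ceil(log2(2 ** b / n))          # number of low bits per value
    z = b - w                           # number of high bits per value
    mask = (1 << w) - 1
    low = "".join(format(v & mask, f"0{w}b") for v in values)
    counts = [0] * (1 << z)             # one bucket per z-bit string
    for v in values:
        counts[v >> w] += 1
    high = "".join("1" * c + "0" for c in counts)
    return low, high

L, H = elias_fano([1, 3, 4, 5, 9, 16, 23, 27, 28, 31, 40])
print(L)  # 001011100101001000111011100111000
print(H)  # 1111010110111001000
```

The bucket counts here are 4, 1, 2, 3, 0, 1, 0, 0 for the high parts $000$ to $111$, which is why $H$ contains one `0` terminator per bucket, including the empty ones.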
54 changes: 33 additions & 21 deletions 09_posting_compression/160111-2.tex
@@ -10,29 +10,34 @@

\solution

The following table shows the gamma and delta encodings for the $n=7$ given
integers:
For the first two encodings, since we have a monotonically increasing sequence, we first apply the \emph{gap encoding}, obtaining
%
$$S'=(1, 5, 9, 3, 3, 3, 6).$$
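
As a quick check of the gap sequence, here is a minimal Python sketch. The helper names `gaps` and `ungaps` are hypothetical, and the input is the sequence $S$ listed in the first column of the tables of this exercise.

```python
from itertools import accumulate

def gaps(seq):
    """Keep the first element, then store differences of consecutive elements."""
    return [seq[0]] + [cur - prev for prev, cur in zip(seq, seq[1:])]

def ungaps(g):
    """Invert gaps() with a running sum."""
    return list(accumulate(g))

S = [1, 6, 15, 18, 21, 24, 30]
print(gaps(S))          # [1, 5, 9, 3, 3, 3, 6]
print(ungaps(gaps(S)))  # [1, 6, 15, 18, 21, 24, 30]
```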
%
The following table shows the gamma and delta encodings for the $n=7$
integers of $S'$:
%
\begin{center}
\begin{tabular}{ c | l | l | l }
$S_i$ & Binary & Gamma & Delta \\ \hline
1 & 1 & 1 & 1 \\
6 & 110 & 00\;110 & 011\;10 \\
15 & 1111 & 000\;1111 & 00100\;111 \\
18 & 10010 & 0000\;10010 & 00101\;0010 \\
21 & 10101 & 0000\;10101 & 00101\;0101 \\
24 & 11000 & 0000\;11000 & 00101\;1000 \\
30 & 11110 & 0000\;11110 & 00101\;1110 \\
\begin{tabular}{ c | c | l | l | l }
$S_i$ & $S'_i$ & Binary & Gamma & Delta \\ \hline
1 & 1 & 1 & 1 & 1 \\
6 & 5 & 101 & 00\;101 & 011\;01 \\
15 & 9 & 1001 & 000\;1001 & 00100\;001 \\
18 & 3 & 11 & 0\;11 & 010\;1 \\
21 & 3 & 11 & 0\;11 & 010\;1 \\
24 & 3 & 11 & 0\;11 & 010\;1 \\
30 & 6 & 110 & 00\;110 & 011\;10 \\
\end{tabular}
\end{center}
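
The Gamma and Delta columns can be reproduced with a few lines of Python. This is only a sketch under the usual definitions (gamma: the binary length minus one in zeros, followed by the binary representation; delta: the gamma code of the binary length, followed by the binary representation without its leading 1); the function names are hypothetical.

```python
def gamma(x):
    """Elias gamma code of a positive integer, as a bit string."""
    bits = format(x, "b")
    return "0" * (len(bits) - 1) + bits

def delta(x):
    """Elias delta code: gamma code of the length, then the bits after the leading 1."""
    bits = format(x, "b")
    return gamma(len(bits)) + bits[1:]

for g in [1, 5, 9, 3, 3, 3, 6]:
    print(g, gamma(g), delta(g))
# 1 1 1
# 5 00101 01101
# 9 0001001 00100001
# 3 011 0101
# 3 011 0101
# 3 011 0101
# 6 00110 01110
```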

For the Elias-Fano encoding, we encode the integers in $S$ using $b = \lceil \log_2{30} \rceil = 5$ bits;
$w = \lceil \log_2{\frac{2^b}{n}} \rceil = \lceil \log_2{\frac{2^5}{7}} \rceil = 3$ of these bits are used to produce the
$L$ sequence, and $z = b - w = 5 - 3 = 2$ are used to produce the $H$ sequence.

For the Elias-Fano encoding, instead, we encode the integers in $S$ using $b =
\lceil \log_2{30} \rceil = 5$ bits, separating the $w = \lceil
\log_2{\frac{2^b}{n}} \rceil = \lceil \log_2{\frac{2^5}{7}} \rceil = 3$ least
significant bits, as in the following table:
%
\begin{center}
\begin{tabular}{ c | c | c }
$S'$ & $z$ & $w$ \\ \hline
$S_i$ & \multicolumn{2}{c}{$(S_i)_2$} \\ \hline
1 & 00 & 001 \\
6 & 00 & 110 \\
15 & 01 & 111 \\
@@ -42,8 +47,15 @@
30 & 11 & 110
\end{tabular}
\end{center}

\begin{align*}
&L = 001110111010101000110, \\
&H = 11010110110.
\end{align*}
%
Then, we build the vector $L$ by concatenating the $w$ least significant bits of
each number
%
$$ L = 001\;110\;111\;010\;101\;000\;110\;. $$
%
Finally, we complete the Elias-Fano encoding by storing, for each binary string $s$
of $z = b - w = 5 - 3 = 2$ bits (viz. from $s=00$ to $s=11$), the
inverse-unary representation of the number of consecutive occurrences of $s$ in
the second column of the above table, obtaining
%
$$H = 110\;10\;110\;110\;.$$
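
To see that the pair $(L, H)$ determines the original integers, one can decode it. The following is a minimal Python sketch with a hypothetical function name; it assumes $L$ and $H$ are given as plain bit strings and that $w$ is known. Each `1` in $H$ emits the next value of the current bucket, and each `0` moves to the next high part.

```python
def elias_fano_decode(L, H, w):
    """Rebuild the sorted integers from the low-bit string L and the
    inverse-unary bucket string H, given the number w of low bits."""
    values = []
    bucket = 0   # high part currently being read
    i = 0        # index of the next w-bit group in L
    for bit in H:
        if bit == "1":
            low = int(L[i * w:(i + 1) * w], 2)
            values.append((bucket << w) | low)
            i += 1
        else:
            bucket += 1  # a 0 closes the current bucket
    return values

print(elias_fano_decode("001110111010101000110", "11010110110", 3))
# [1, 6, 15, 18, 21, 24, 30]
```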
12 changes: 6 additions & 6 deletions 09_posting_compression/160627-3.tex
@@ -29,15 +29,15 @@
\end{tabular}
\end{center}
%
Then, we build a vector $L$ by concatenating the $w$ least significant bits of each
number
Then, we build the vector $L$ by concatenating the $w$ least significant bits of
each number
%
$$L = 10\; 00\; 00\; 10\; 01\; 10\; 11\; 01.$$

Finally, we complete the Elias-Fano encoding by storing, for each binary string $S$
of $b-w$ bits (viz. from $S=000$ to $S=111$), the inverse-unary representation
of the number of consecutive occurrences of $S$ in the second column of the above
table, obtaining
Finally, we complete the Elias-Fano encoding by storing, for each binary string $s$
of $z = b-w = 3$ bits (viz. from $s=000$ to $s=111$), the inverse-unary
representation of the number of consecutive occurrences of $s$ in the second
column of the above table, obtaining
%
$$H=10\;10\;110\;\underbracket{1110}_{\clubsuit}\;0\;10\;0\;0.$$
%