From e71abca2a6f29e3ce732d4748bebfda1245f408e Mon Sep 17 00:00:00 2001
From: Francesco Landolfi
Date: Wed, 7 Feb 2018 20:42:41 +0100
Subject: [PATCH] Fix gamma and delta encoding in ex. 6.2 and improve consistency

---
 09_posting_compression/150605-2.tex | 29 ++++++++++------
 09_posting_compression/160111-2.tex | 54 ++++++++++++++++++-----------
 09_posting_compression/160627-3.tex | 12 +++----
 3 files changed, 57 insertions(+), 38 deletions(-)

diff --git a/09_posting_compression/150605-2.tex b/09_posting_compression/150605-2.tex
index 4631914..22c39e9 100644
--- a/09_posting_compression/150605-2.tex
+++ b/09_posting_compression/150605-2.tex
@@ -28,14 +28,14 @@
 
 $$S'=(1,3,4,5,9,16,23,27,28,31,40).$$
 
-Then, in order to encode the values obtained as binary strings, we use $b = \lceil \log_2{40}\rceil = 6$ bit.
-$w = \lceil log_2{\frac{2^b}{n}}\rceil = 3$ of this bits are used to produce the $L$ sequence, and the remaining ones
-are used to produce the $H$ sequence\footnote{Refer to the note about integers' encoding given during the course in order
-to better understand how sequence $H$ is constructed.}.
-
+Then, in order to encode the values obtained as binary strings, we encode the
+integers in $S'$ using $b = \lceil \log_2{40}\rceil = 6$ bits, separating the
+$w = \lceil \log_2{\frac{2^b}{n}}\rceil = 3$ least significant bits, as in the
+following table:
+%
 \begin{center}
   \begin{tabular}{ c | c | c }
-    $S'$ & $z$ & $w$ \\ \hline
+    $S'$ & \multicolumn{2}{c}{$(S'_i)_2$} \\ \hline
     1 & 000 & 001 \\
     3 & 000 & 011 \\
     4 & 000 & 100 \\
@@ -49,8 +49,15 @@
     40 & 011 & 111
   \end{tabular}
 \end{center}
-
-\begin{align*}
-  &L = 001011100101001000111011100111000, \\
-  &H = 1111010110111001000.
-\end{align*}
+%
+Then, we build the vector $L$ concatenating the $w$ least significant bits of
+each number
+%
+$$L = 001\;011\;100\;101\;001\;000\;111\;011\;100\;111\;000\;.$$
+%
+Finally, we complete the Elias-Fano encoding storing, for each binary string $s$
+of $z = b - w = 6 - 3 = 3$ bits (viz. from $s=000$ to $s=111$), the
+inverse-unary representation of the number of consecutive $s$ in the second
+column of the above table, obtaining
+%
+$$H = 11110\;10\;110\;1110\;0\;10\;0\;0\;.$$
diff --git a/09_posting_compression/160111-2.tex b/09_posting_compression/160111-2.tex
index d4743da..bb338df 100644
--- a/09_posting_compression/160111-2.tex
+++ b/09_posting_compression/160111-2.tex
@@ -10,29 +10,34 @@
 
 \solution
 
-The following table shows the gamma and delta encodings for the $n=7$ given
-integers:
+For the first two encodings, since we have a monotonically increasing sequence, we first apply the \emph{gap encoding}, obtaining
+%
+$$S'=(1, 5, 9, 3, 3, 3, 6).$$
+%
+The following table shows the gamma and delta encodings for the $n=7$
+integers of $S'$:
 %
 \begin{center}
-  \begin{tabular}{ c | l | l | l }
-    $S_i$ & Binary & Gamma & Delta \\ \hline
-    1 & 1 & 1 & 1 \\
-    6 & 110 & 00\;110 & 011\;10 \\
-    15 & 1111 & 000\;1111 & 00100\;111 \\
-    18 & 10010 & 0000\;10010 & 00101\;0010 \\
-    21 & 10101 & 0000\;10101 & 00101\;0101 \\
-    24 & 11000 & 0000\;11000 & 00101\;1000 \\
-    30 & 11110 & 0000\;11110 & 00101\;1110 \\
+  \begin{tabular}{ c | c | l | l | l }
+    $S_i$ & $S'_i$ & Binary & Gamma & Delta \\ \hline
+    1 & 1 & 1 & 1 & 1 \\
+    6 & 5 & 101 & 00\;101 & 011\;01 \\
+    15 & 9 & 1001 & 000\;1001 & 00100\;001 \\
+    18 & 3 & 11 & 0\;11 & 010\;1 \\
+    21 & 3 & 11 & 0\;11 & 010\;1 \\
+    24 & 3 & 11 & 0\;11 & 010\;1 \\
+    30 & 6 & 110 & 00\;110 & 011\;10 \\
   \end{tabular}
 \end{center}
 
-For the Elias-Fano encoding, we encode the integers in $S$ using $b = \lceil \log_2{30} \rceil = 5$ bit,
-$w = \lceil \log_2{\frac{2^b}{n}} \rceil = \lceil \log_2{\frac{2^5}{7}} \rceil = 3$ of this bits are used to produce the
-$L$ sequence, and $z = b - w = 5 - 3 = 2$ are used to procude the $H$ sequence.
-
+For the Elias-Fano encoding, instead, we encode the integers in $S$ using $b =
+\lceil \log_2{30} \rceil = 5$ bits, separating the $w = \lceil
+\log_2{\frac{2^b}{n}} \rceil = \lceil \log_2{\frac{2^5}{7}} \rceil = 3$ least
+significant bits, as in the following table:
+%
 \begin{center}
   \begin{tabular}{ c | c | c }
-    $S'$ & $z$ & $w$ \\ \hline
+    $S_i$ & \multicolumn{2}{c}{$(S_i)_2$} \\ \hline
     1 & 00 & 001 \\
     6 & 00 & 110 \\
     15 & 01 & 111 \\
@@ -42,8 +47,15 @@
     30 & 11 & 110
   \end{tabular}
 \end{center}
-
-\begin{align*}
-  &L = 001110111010101000110, \\
-  &H = 11010110110.
-\end{align*}
+%
+Then, we build the vector $L$ concatenating the $w$ least significant bits of
+each number
+%
+$$ L = 001\;110\;111\;010\;101\;000\;110\;. $$
+%
+Finally, we complete the Elias-Fano encoding storing, for each binary string $s$
+of $z = b - w = 5 - 3 = 2$ bits (viz. from $s=00$ to $s=11$), the
+inverse-unary representation of the number of consecutive $s$ in the second
+column of the above table, obtaining
+%
+$$H = 110\;10\;110\;110\;.$$
diff --git a/09_posting_compression/160627-3.tex b/09_posting_compression/160627-3.tex
index ee4f4a1..294ddda 100644
--- a/09_posting_compression/160627-3.tex
+++ b/09_posting_compression/160627-3.tex
@@ -29,15 +29,15 @@
   \end{tabular}
 \end{center}
 %
-Then, we build a vector $L$ concatenating the $w$ less significant bits of each
-number
+Then, we build the vector $L$ concatenating the $w$ least significant bits of
+each number
 %
 $$L = 10\; 00\; 00\; 10\; 01\; 10\; 11\; 01.$$
 
-Finally, we complete the Elias-Fano encoding storing, for each binary string $S$
-of $b-w$ bits (viz. from $S=000$ to $S=111$), the inverse-unary representation
-of the number of consecutive $S$ in the second column of the above table,
-obtaining
+Finally, we complete the Elias-Fano encoding storing, for each binary string $s$
+of $z = b-w = 3$ bits (viz. from $s=000$ to $s=111$), the inverse-unary
+representation of the number of consecutive $s$ in the second column of the
+above table, obtaining
 %
 $$H=10\;10\;110\;\underbracket{1110}_{\clubsuit}\;0\;10\;0\;0.$$
 %