14_binary_search.html

<!DOCTYPE html>
<html>
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width">
<title>14. Binary Search</title>
<link rel="stylesheet" href="template/style.css">
</head>
<body>


<span style="float: right">
[
<a href="13_searching.html">previous</a>, 
<a href="index.html">contents</a>, 
<a href="15_merge_inplace.html">next</a>
]
</span>


<a name="L14.-Binary-Search"></a>
<h1>14. Binary Search</h1>

<a name="Bisection-in-math"></a>
<h2>Bisection in math</h2>

<p>Bisection is an idea which is pretty ancient.
It was first used in a mathematical setting by <a href="https://en.wikipedia.org/wiki/Joseph-Louis_Lagrange">Lagrange</a> who applied it for
finding roots of polynomials, that was around 1796.
Two people independently discovered it
in late 1810s.
The first is <a href="https://en.wikipedia.org/wiki/Bernard_Bolzano">Bernard Bolzano</a>
the second is <a href="https://en.wikipedia.org/wiki/Augustin-Louis_Cauchy">Augustin-Louis Cauchy</a>.
They both invented the very famous theorem which is called <a href="https://en.wikipedia.org/wiki/Bisection_method">bisection</a>.
Where does it appear in mathematics?
The <a href="https://en.wikipedia.org/wiki/Intermediate_value_theorem">intermediate value theorem</a>, if you have a <a href="https://en.wikipedia.org/wiki/Continuous_function">continuous function</a> <code>f</code> which goes from negative to positive, it has to cross zero.</p>

<pre><code>                ____f(b) &gt; 0
               /
              /
        0 -- / --------
            /
 f(a) &lt; 0 \/
</code></pre>

<p>How do you find zero by bisection?
You divide the interval <code>[a, b]</code> in half, and you see if the new endpoint
is above or below, then you shrink the interval:</p>

<pre><code>[a   c)    b]
</code></pre>

<p><code>0</code> is on the left side if <code>f(c) &gt;= 0</code> and otherwise
<code>0</code> is on the right side.
It is a great idea.</p>

<a name="Partitions"></a>
<h2>Partitions</h2>

<p>We are still struggling on our path to binary search<sup id="fnref:1"><a href="#fn:1" rel="footnote">1</a></sup>.
It deals with two things.
First, it deals with a monotonically non-decreasing sequence.
Second of all it has some
binary predicate which establishes <code>StrictWeakOrdering</code>
on the sequence and which allows you to compare elements.
This is too much, even for people who design APIs.
In order to write it correctly we need to reduce it to a simpler problem.</p>

<p>Even simpler than the notion of a sorted sequence
is a <strong>partitioned sequence</strong>.
A sequence is partitioned based on a predicate.
A sequence is partitioned if the predicate is true for some sub range of items, and then false for the rest[^order].</p>

<a name="Is-partitioned"></a>
<h3>Is partitioned</h3>

<p>First we want a function to tell us
if a sequence is partitioned.
Who should go first? Good guys, or bad guys?
This is a tricky thing and STL gets it wrong.
When I did STL I thought good guys should naturally
come before bad.
Satisfying is good.
Don&rsquo;t you want satisfying things?</p>

<p>But it&rsquo;s wrong.
We want partition sequence to be sorted on the boolean values and since STL
assumes that ascending order is natural,
The right thing to do is make partition consistent<sup id="fnref:2"><a href="#fn:2" rel="footnote">2</a></sup>, false values go first.
But, for our course we will follow the standard.</p>

<pre><code>template&lt;typename I, typename P&gt;
// I is InputIterator
// P is unary predicate 
// value_type of I == argument_type of P
inline
bool is_partitioned(I first, I last, P pred) {
  first = find_if_not(first, last, pred);
  first = find_if(first, last, pred);
  return first == last;
}
</code></pre>

<p>Now for counted ranges:</p>

<pre><code>template&lt;typename I, typename N, typename P&gt;
// I is InputIterator
// P is unary predicate 
// N is integral type
// value_type of I == argument_type of P
inline
bool is_partitioned_n(I first, N n, P pred) {
  std::pair&lt;I, N&gt; p = find_if_not_n(first, n, pred);
  p = find_if_n(p.first, p.second, pred);
  return p.second == N(0);
}
</code></pre>

<a name="Partition-point"></a>
<h3>Partition point</h3>

<p>When we partition we will have true guys followed by false:</p>

<pre><code>T T T F F F
      ^
</code></pre>

<p>There is only one special thing, the partition point.
If we understand the partition point everything else will be very simple and there is no ambiguity.
<code>find_if</code> actually finds the partition point.
But, it does too many predicate applications.
We could do better if our range is at least forward iterable.</p>

<p>Even for singly linked lists such a thing is useful, because
if your predicate is very expensive,
while iteration is relatively fast<sup id="fnref:3"><a href="#fn:3" rel="footnote">3</a></sup>,
you still could reduce the number of tests to <code>log(n)</code>.
As we shall see we have a very good bound on the number of traversal
operations which is literally <code>n</code>, not order of <code>n</code>.
So we can get it so it works for everything.
Then it works on arrays much much faster than linked lists.</p>

<p>A distinguished computer scientist recently asked me, &ldquo;what if we just make it linear. will it really affect practical algorithms&rdquo;.
The answer is yes, very badly.</p>

<p>The algorithm to find it faster is to test the middle.
How do we go to the middle?
Divide by 2.
Dividing numbers is easier so we will start with a counted range, instead of bounded.</p>

<pre><code>template&lt;typename I, typename N, typename P&gt;
// I is ForwardIterator
// P is UnaryPredicate 
// N is integral type
// value_type of I == argument_type of P
inline
I partition_point_n(I first, N n, P pred) {
  while (n) {
    N half = n &gt;&gt; 1;
    I middle = first;
    std::advance(middle, half);
    if (pred(*middle)) {
      ++middle;
      first = middle;
      n -= (half + 1);
    } else {
      n = half;
    }
  }
  return first;
}
</code></pre>

<p>Why did I use a shift (<code>n &gt;&gt; 1</code>)? We know it&rsquo;s non-negative.
I&rsquo;m a penny pincher.
Maybe the compiler will automatically do it for <code>n / 2</code> maybe it will not.
Now it will.</p>

<p>How many <code>++</code> operations do we do?</p>

<pre><code>n/2 + n/4 + ... + = n.
</code></pre>

<p>We are traversing more than linear search on the average case.
We are also not trying to be lucky and find equal.</p>

<pre><code>template&lt;typename I, typename I, typename P&gt;
// I is ForwardIterator
// P is UnaryPredicate 
// value_type of I == argument_type of P
inline
I partition_point(I first, I last, P pred) {
  return partition_point_n(first, std::distance(first, last), pred);
}
</code></pre>

<a name="Upper-and-lower-bound"></a>
<h2>Upper and lower bound</h2>

<p>We need to talk a little bit about sorted ranges.
A precondition to binary search is not that the range is partitioned,
but it is sorted.
Again, we have a range and it could be counted or bounded
How do we know a range is sorted?
If you say <em>monotonically increasing</em>
then you would be wrong.
<em>Non decreasing</em> things are sorted.
Increasing is too much.
If I want to sort my coins and there are
two pennies however much I want to sort,
I&rsquo;m not going to have an ascending sequence.
What we need to guarantee is that for no pair of elements
<code>x_i</code>, <code>x_j</code> where <code>j &gt; i</code>
that <code>x_i &gt; x_j</code>.</p>

<p>But, that&rsquo;s too much to check.
We need to rely on this amazing property that if we check adjacent elements and
the property holds,
then it will hold for all the elements.
Why? Because of transitivity.
This again shows mathematics is actually important.
If you have a non-transitive relation it might not work.
You might say, &ldquo;well everybody knows that&rdquo;.
I could give you examples of famous people who do not quite get it.
We have transitivity from a requirement of <code>StrictWeakOrdering</code>.</p>

<a name="Is-sorted"></a>
<h3>Is sorted</h3>

<p>Let&rsquo;s write <code>is_sorted</code>.
We know an empty range is sorted.
As long as we didn&rsquo;t reach the end, and we didn&rsquo;t
find a counterexample (not suddenly decreasing)
we can go on.</p>

<pre><code>template &lt;typename I, typename R&gt;
// I is ForwardIterator
// R is WeakStrictOrdering on the value type of I
inline
bool is_sorted(I first, I last, R r) {
  if (first == last) return true;
  I previous = first;
  while (++first != last &amp;&amp; !r(*first, *previous)) previous = first;
  return first == last;
}
</code></pre>

<p>When you design a library for yourself, friends, or the world, provide all the versions.
So if you have a comparison operator, provide a version where you don&rsquo;t need
to pass the relation:</p>

<pre><code>template &lt;typename I&gt;
// I is ForwardIterator with a totally ordered value type
inline
bool is_sorted(I first, I last) {
  typedef typename std::iterator_traits&lt;I&gt;::value_type T;
  return is_sorted(first, last, std::less&lt;T&gt;());
}
</code></pre>

<p>There are a few choices which humankind forced upon us.
Increasing is better than decreasing.
1, 2, 3. Not 3, 2, 1.
Natural numbers go up.
The second choice is if we should use <code>&lt;</code> or <code>&lt;=</code>,
strict or non-strict.
I made a choice that <code>&lt;</code> is the primitive one.</p>

<p><strong>Exercise:</strong> There is an STL function called <a href="https://en.cppreference.com/w/cpp/algorithm/adjacent_find"><code>std::adjacent_find</code></a>
    and there is a profound relation between <code>is_sorted</code> and <code>adjacent_find</code>
    but you&rsquo;re going to discover it yourself.
    Write a program that uses <code>std::adjacent_find</code> and
    try to figure out this relationship.</p>

<a name="Binary-search-with-partition-points"></a>
<h2>Binary search with partition points</h2>

<p>How do we get from partition point to lower bound?
Let us assume we have a sorted range, of something.
Somebody important comes with an element <code>a</code> in the same domain.
I claim there are two points
which <code>a</code> establishes in this sorted sequence.
Let us assume that they are integers and <code>a = 5</code>.
<code>5</code> gives us two iterators in this sequence.
The beginning of the range containing <code>5</code> and the
end of the range,
and it could be empty.
Consider:</p>

<pre><code>2 3 5 5 6 7
    ^   ^
</code></pre>

<p>We get iterators pointing to the first <code>5</code>,
and just beyond.
How could we describe the property of the first iterator?
We have a dichotomy.
Everybody to the left is less than <code>5</code>.
In other words, it is the partition point<sup id="fnref:4"><a href="#fn:4" rel="footnote">4</a></sup>.
This is why we don&rsquo;t want equal because equal leads
to all these bad binary search that return <code>-1</code>,
because they only look for <code>5</code>.</p>

<p>As far as I could ascertain, these names are invented by me but I think they&rsquo;re good names,
<code>upper_bound</code> and <code>lower_bound</code>, there are two.
So what property does the upper bound have?
It is the first element which is greater.
Both lower bound and upper bound split our range into two ranges.
So in some sense we actually have 3 ranges:</p>

<pre><code>[first [lower)   [upper)  last)

 1. [first, lower)
 2. [lower, upper)
 3. [upper, last)
</code></pre>

<p>You can actually find them both together a little faster, than you can separately<sup id="fnref:5"><a href="#fn:5" rel="footnote">5</a></sup>.
There is a function <a href="https://en.cppreference.com/w/cpp/algorithm/equal_range"><code>std::equal_range</code></a> which does that.</p>

<p>First let&rsquo;s implement a function object
which is our predicate for partitioning.
It is defined by <code>P(x) = a &lt; x</code>
for a fixed element <code>a</code> and a comparison operator.</p>

<pre><code>template &lt;typename R, typename T&gt;
// R is StrictWeakOrdering
// T is an argument type of R
class lower_bound_predicate
{
  private:
    R r;
    const T* a;
  public:
    lower_bound_predicate(const R&amp; r, const T&amp; a) : r(r), a(&amp;a) {}
    bool operator()(const T&amp; x) { return r(x, *a); }
};
</code></pre>

<p>Now we can write <code>lower_bound</code>, which is the version
of &ldquo;binary search&rdquo; that we want<sup id="fnref:6"><a href="#fn:6" rel="footnote">6</a></sup>:</p>

<pre><code>template &lt;typename I, typename N, typename R&gt;
// I is ForwardIterator
// N is Integral
// R is WeakStrictOrdering on the value type of I
inline
I lower_bound_n(I first, N n, 
    const typename std::iterator_traits&lt;I&gt;::value_type&amp; a, R r) {
  // precondition: is_sorted_n(first, n, r)
  typedef typename std::iterator_traits&lt;I&gt;::value_type T;
  return partition_point_n(first, n, lower_bound_predicate&lt;R, T&gt;(r, a));
}

template &lt;typename I, typename R&gt;
// I is ForwardIterator
// R is WeakStrictOrdering on the value type of I
inline
I lower_bound(I first, I last, R r, 
    const typename std::iterator_traits&lt;I&gt;::value_type&amp; a) {
  // precondition: is_sorted(first, last, r)
  return lower_bound_n(first, std::distance(first, last), r);
}
</code></pre>

<p>We implement similar functions for <code>upper_bound</code>
found in the final code.</p>

<a name="Project:-Partitioning"></a>
<h2>Project: Partitioning</h2>

<p>So we have worked a lot with partitions, but
how do we actually partition a range?
A few lessons ago we saw the first example of sorting.
We are sorting linked lists, and we are sorting them well.
We will see many other examples of sort and sort like function.
Often people
have an array, or a list, or whatever they like,
and they want to distinguish between good things and bad things.
So, they say, &ldquo;how will I do this? I&rsquo;ll sort&rdquo;.
This is not necessarily a good solution because they are solving a simple problem,
the problem of partitioning a sequence, with the help
of expensive <code>n log(n)</code> algorithms.</p>

<p>I want you to think about this partitioning,
especially in terms of our wonderful <code>binary_counter</code> device.
However, we also want the partition to be <strong>stable</strong>.
You want to move all the bad guys up front and good guys to the tail end.
You will return an iterator pointing to the partition point
separating good from bad.
This partition is <em>stable if the relative order of good guys and bad guys
is maintained</em>,
meaning if I have an alphabetically sorted list of employees and I
want to divide them by say gender,
if I partition them, they will still be alphabetically sorted.</p>

<p>Of course, you don&rsquo;t have extra memory
I claim our marvelous device will allow you to do it.
But, we have to figure out how.
If you try to use something like <a href="https://en.wikipedia.org/wiki/Bubble_sort">bubble sort</a>.
It won&rsquo;t give you a very good solution.
It will be stable, but it&rsquo;s quadratic.</p>

<p>Start with linked list where the problem is easy,
then try to do it on an array.</p>

<a name="Code"></a>
<h2>Code</h2>

<ul>
<li><a href="code/partition.h">partition.h</a></li>
<li><a href="code/test_binary_search.cpp">test_binary_search.cpp</a></li>
</ul>

<div class="footnotes">
<hr/>
<ol>
<li id="fn:1">
Alex: Paul McJones has a good friend <a href="https://en.wikipedia.org/wiki/Butler_Lampson">Butler Lampson</a> who is a Turing award winner.
We went to lunch and he told us binary search is the
only algorithm a programmer needs to know.
I think sort should be there too, but we&rsquo;ll take his opinion.<a href="#fnref:1" rev="footnote">&#8617;</a></li>
<li id="fn:2">
Alex: Changing something in this standard is impossible,
because they just don&rsquo;t listen to arguments.
Whatever arguments you give them, they just say, &ldquo;It&rsquo;s the standard&rdquo;.<a href="#fnref:2" rev="footnote">&#8617;</a></li>
<li id="fn:3">
In writing my Master&rsquo;s thesis
I actually came across a comparison operator which is very expensive to evaluate,
called the Dehornoy ordering for <a href="https://github.com/justinmeiners/braid-rank-thesis">braid groups</a>.
This provides a total ordering which I used for sorting, removing duplicates, and other algorithms in STL style.
In this case having
algorithms that carefully optimized the number of comparisons made a significant different in performance.<a href="#fnref:3" rev="footnote">&#8617;</a></li>
<li id="fn:4">
<p>Framing binary search as finding the partition point solves an important problem.
Suppose you want to search an array of records by a particular field.
You can of course make a record having arbitrary values for all the other
fields besides the one you care about,
but it would be nice to provide just the key you want.
For example:</p>

<pre><code>binary_search(
    people.begin(),
    people.end(),
    "bob",
    [](Person a, const char* name){ a.name &lt; name }
);
</code></pre>

<p>But what is the theoretical basis for a function comparing a key to a person record?
It&rsquo;s not a <code>StrictWeakOrdering</code> as  <code>Name</code> and <code>Person</code> are not elements of the same domain.
In this case the comparison function is no longer an ordering at all, or even an operation.
If we condense the key and comparison into a predicate, and find the partition point, then this problem goes away.</p>

<p>It appears the C++ standards committee was confused about this for some time.
See <a href="http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2001/n1313.html">&ldquo;Binary Search with Heterogeneous Comparison&rdquo;</a>
and <a href="https://cplusplus.github.io/LWG/issue270">&ldquo;Binary search requirements overly strict&rdquo;</a> for further discussion.</p><a href="#fnref:4" rev="footnote">&#8617;</a></li>
<li id="fn:5">
<p>Here is one improvement.
First find the lower bound in the range <code>[first, last)</code>.
Then find the upper bound in the range <code>[lower, last)</code>.
Further optimization is possible.</p><a href="#fnref:5" rev="footnote">&#8617;</a></li>
<li id="fn:6">
Alex: If you remember there was this
grand review one day where the committee threw out a bunch of useful functions.
Well they inserted some too.
One is called <a href="https://en.cppreference.com/w/cpp/algorithm/binary_search"><code>std::binary_search</code></a>.
A friend asked me, &ldquo;where is binary search?&rdquo;
I explained all this <code>upper_bound</code>, <code>lower_bound</code>, stuff
to him,
and he said, &ldquo;but, where is binary search?&rdquo;.
So, I did it for him.
Who can argue with a best friend.
Will I ever use it? No.<a href="#fnref:6" rev="footnote">&#8617;</a></li>
</ol>
</div>


<span style="float: right">
[
<a href="13_searching.html">previous</a>, 
<a href="index.html">contents</a>, 
<a href="15_merge_inplace.html">next</a>
]
</span>


</body>
</html>