Commit

deploy: 60055ad
prrao87 committed Feb 9, 2024
1 parent 7f72516 commit ea86c69
Showing 17 changed files with 196 additions and 205 deletions.
5 changes: 1 addition & 4 deletions index.html
@@ -198,14 +198,11 @@ <h2 id="whats-covered-in-this-book"><a class="header" href="#whats-covered-in-th
<tr><td>Mock data generation</td><td>File-handling</td><td>RNG, sampling</td></tr>
<tr><td>Age grouping</td><td>File-handling</td><td>enums</td></tr>
<tr><td>Datetime parsing</td><td>File-handling</td><td>chrono, lifetimes</td></tr>
<tr><td>Preprocessing data for NLP</td><td>Parallelism</td><td>rayon, parallelism</td></tr>
<tr><td>Extract pronouns from text</td><td>Parallelism</td><td>rayon, parallelism</td></tr>
<tr><td>Polars datetimes</td><td>DataFrames</td><td>datetimes</td></tr>
<tr><td>Polars EDA</td><td>DataFrames</td><td>TBD</td></tr>
<tr><td>Postgres</td><td>Databases</td><td>async, sqlx, tokio</td></tr>
<tr><td>DuckDB</td><td>Databases</td><td>arrow, in-memory DB</td></tr>
<tr><td>Meilisearch</td><td>Databases</td><td>async, async-std, clap</td></tr>
<tr><td>Qdrant</td><td>Databases</td><td>async, tokio, gRPC</td></tr>
<tr><td>KùzuDB</td><td>Databases</td><td>async, graph</td></tr>
<tr><td>REST API to Postgres</td><td>APIs</td><td>axum, async, tokio</td></tr>
<tr><td>REST API to local LLM</td><td>APIs</td><td>axum, LLMs</td></tr>
<tr><td>PyO3 mock data generation</td><td>Unification</td><td>TBD</td></tr>
5 changes: 1 addition & 4 deletions introduction/_index.html
@@ -198,14 +198,11 @@ <h2 id="whats-covered-in-this-book"><a class="header" href="#whats-covered-in-th
<tr><td>Mock data generation</td><td>File-handling</td><td>RNG, sampling</td></tr>
<tr><td>Age grouping</td><td>File-handling</td><td>enums</td></tr>
<tr><td>Datetime parsing</td><td>File-handling</td><td>chrono, lifetimes</td></tr>
<tr><td>Preprocessing data for NLP</td><td>Parallelism</td><td>rayon, parallelism</td></tr>
<tr><td>Extract pronouns from text</td><td>Parallelism</td><td>rayon, parallelism</td></tr>
<tr><td>Polars datetimes</td><td>DataFrames</td><td>datetimes</td></tr>
<tr><td>Polars EDA</td><td>DataFrames</td><td>TBD</td></tr>
<tr><td>Postgres</td><td>Databases</td><td>async, sqlx, tokio</td></tr>
<tr><td>DuckDB</td><td>Databases</td><td>arrow, in-memory DB</td></tr>
<tr><td>Meilisearch</td><td>Databases</td><td>async, async-std, clap</td></tr>
<tr><td>Qdrant</td><td>Databases</td><td>async, tokio, gRPC</td></tr>
<tr><td>KùzuDB</td><td>Databases</td><td>async, graph</td></tr>
<tr><td>REST API to Postgres</td><td>APIs</td><td>axum, async, tokio</td></tr>
<tr><td>REST API to local LLM</td><td>APIs</td><td>axum, LLMs</td></tr>
<tr><td>PyO3 mock data generation</td><td>Unification</td><td>TBD</td></tr>
4 changes: 2 additions & 2 deletions introduction/why-rust.html
@@ -185,7 +185,7 @@ <h1 id="why-use-rust-with-python"><a class="header" href="#why-use-rust-with-pyt
<p>It's possible to write relatively high-performance code in Python these days by leveraging its rich ecosystem of libraries (which are typically wrappers around C, C++ or Cython code). However, performance and concurrency are <em>not</em> Python's strong suits, so performance-critical code often has to be implemented in lower-level languages. For many Python developers, using languages like C, C++ and Cython is a daunting prospect.</p>
<p><a href="https://www.rust-lang.org/">Rust</a> is a statically typed, compiled programming language that's known for its relatively steep learning curve. Its design philosophy is centered around three core principles: performance, safety, and fearless concurrency. It offers a modern, high-level syntax and a rich type system that makes it possible to write code that runs really fast without the need for manual memory management, eliminating entire classes of bugs.</p>
<p>Although it's possible to write all sorts of complex tools and applications in Rust, it's not the best option for <em>every</em> situation. In cases like research and prototyping, where speed of iteration is important, Rust's strict compiler can slow down development, and Python is still the better choice.</p>
<p>We believe that Python 🐍 and Rust 🦀 form a near-perfect pair to address either side of the so-called &quot;two-world problem&quot;, explained below.</p>
<p>We believe that Python 🐍 and Rust 🦀 form a near-perfect pair to address either side of the so-called "two-world problem", explained below.</p>
<h2 id="the-two-world-problem"><a class="header" href="#the-two-world-problem">The two-world problem</a></h2>
<p>The programming world often finds itself divided in two: those who prefer high-level, dynamically typed languages, and those who prefer low-level, statically typed languages.</p>
<p>Many high-level languages are interpreted (i.e., they execute each line as it's read, sequentially). These languages are generally easier to learn because they abstract away the details of memory management, allowing for rapid prototyping and development.</p>
@@ -194,7 +194,7 @@ <h2 id="the-two-world-problem"><a class="header" href="#the-two-world-problem">T
<p><img src="/image/two-world-problem.png" alt="" /></p>
<p>The image above is a figurative representation of two distributions of people, typically disparate individuals from either background (with the languages listed in no specific order).</p>
<h2 id="has-the-two-world-problem-been-solved-before"><a class="header" href="#has-the-two-world-problem-been-solved-before">Has the two-world problem been solved before?</a></h2>
<p>Many readers will have heard of Julia, a dynamically typed, just-in-time (JIT) compiled alternative to Python that is often touted as a &quot;high-level language with the performance of C&quot;. While Julia is no doubt a great language, its popularity is largely limited to the scientific community, and its library ecosystem and user community haven't yet matured to the extent that Python's have. As such, the &quot;two-language problem&quot; that Julia <a href="https://julialang.org/blog/2012/02/why-we-created-julia/">attempts to solve</a> is still largely unsolved.</p>
<p>Many readers will have heard of Julia, a dynamically typed, just-in-time (JIT) compiled alternative to Python that is often touted as a "high-level language with the performance of C". While Julia is no doubt a great language, its popularity is largely limited to the scientific community, and its library ecosystem and user community haven't yet matured to the extent that Python's have. As such, the "two-language problem" that Julia <a href="https://julialang.org/blog/2012/02/why-we-created-julia/">attempts to solve</a> is still largely unsolved.</p>
<p>Other languages like Mojo explain in <a href="https://docs.modular.com/mojo/why-mojo.html">their vision</a> how they aim to solve the two-world problem by providing a single unified language (acting like a superset of Python) that can be compiled to run on any hardware. However, Mojo is still very much in its infancy as a language: it hasn't gained widespread adoption, and its user community is still very small.</p>
<h2 id="rust-and-pyo3"><a class="header" href="#rust-and-pyo3">Rust and PyO3</a></h2>
<p>The most interesting aspect of PyO3 in combination with Rust is that they offer a new way to think about the two-world problem. Rather than trying to <em>solve</em> the problem by creating a new language that offers the best of many worlds, Rust and PyO3 <em>embrace</em> the problem by allowing a developer to move between the worlds and choose the best tool for each part of a larger task.</p>
4 changes: 2 additions & 2 deletions pieces/hello-world.html
@@ -185,15 +185,15 @@ <h1 id="hello-world"><a class="header" href="#hello-world">Hello world!</a></h1>
<p>Navigate to the <code>pieces/hello_world</code> directory in the <a href="https://github.com/thedataquarry/rustinpieces/tree/main/pieces/hello_world">repo</a> to get started.</p>
<h2 id="python"><a class="header" href="#python">Python</a></h2>
<p>The file <code>main.py</code> has just one line of code:</p>
<pre><code class="language-python">print(&quot;Hello, world!&quot;)
<pre><code class="language-python">print("Hello, world!")
</code></pre>
<p>The program is run as follows:</p>
<pre><code class="language-bash">python main.py
</code></pre>
<h2 id="rust"><a class="header" href="#rust">Rust</a></h2>
<p>The file <code>main.rs</code> has just three lines of code:</p>
<pre><code class="language-rs">fn main() {
    println!(&quot;Hello, world!&quot;);
    println!("Hello, world!");
}
</code></pre>
<p>The program is run via <code>cargo</code>:</p>
54 changes: 27 additions & 27 deletions pieces/intro/dicts_vs_hashmaps.html
@@ -189,27 +189,27 @@ <h2 id="python"><a class="header" href="#python">Python</a></h2>
<p>Consider the below function in Python, where we define a dictionary of processors and their
corresponding market names.</p>
<pre><code class="language-py">processors = {
    &quot;13900KS&quot;: &quot;Intel Core i9&quot;,
    &quot;13700K&quot;: &quot;Intel Core i7&quot;,
    &quot;13600K&quot;: &quot;Intel Core i5&quot;,
    &quot;1800X&quot;: &quot;AMD Ryzen 7&quot;,
    &quot;1600X&quot;: &quot;AMD Ryzen 5&quot;,
    &quot;1300X&quot;: &quot;AMD Ryzen 3&quot;,
    "13900KS": "Intel Core i9",
    "13700K": "Intel Core i7",
    "13600K": "Intel Core i5",
    "1800X": "AMD Ryzen 7",
    "1600X": "AMD Ryzen 5",
    "1300X": "AMD Ryzen 3",
}

# Check for presence of value
is_item_in_dict = &quot;AMD Ryzen 3&quot; in processors.values()
print(f'Is &quot;AMD Ryzen 3&quot; in the dict of processors?: {is_item_in_dict}')
is_item_in_dict = "AMD Ryzen 3" in processors.values()
print(f'Is "AMD Ryzen 3" in the dict of processors?: {is_item_in_dict}')
# Lookup by key
key = &quot;13900KS&quot;
key = "13900KS"
lookup_by_key = processors[key]
print(f'Key &quot;{key}&quot; has the value &quot;{lookup_by_key}&quot;')
print(f'Key "{key}" has the value "{lookup_by_key}"')
</code></pre>
<p>The first portion checks for the presence of a value in the dictionary, while the second portion
looks up the value by key.</p>
<p>Running the above function via <code>main.py</code> gives us the following output:</p>
<pre><code class="language-bash">Is &quot;AMD Ryzen 3&quot; in the dict of processors?: True
Key &quot;13900KS&quot; has the value &quot;Intel Core i9&quot;
<pre><code class="language-bash">Is "AMD Ryzen 3" in the dict of processors?: True
Key "13900KS" has the value "Intel Core i9"
</code></pre>
<h2 id="rust"><a class="header" href="#rust">Rust</a></h2>
<p>We define the below function in Rust, where we define a hashmap of processors and their
@@ -218,25 +218,25 @@ <h2 id="rust"><a class="header" href="#rust">Rust</a></h2>

fn run8() {
    let mut processors = HashMap::new();
    processors.insert(&quot;13900KS&quot;, &quot;Intel Core i9&quot;);
    processors.insert(&quot;13700K&quot;, &quot;Intel Core i7&quot;);
    processors.insert(&quot;13600K&quot;, &quot;Intel Core i5&quot;);
    processors.insert(&quot;1800X&quot;, &quot;AMD Ryzen 7&quot;);
    processors.insert(&quot;1600X&quot;, &quot;AMD Ryzen 5&quot;);
    processors.insert(&quot;1300X&quot;, &quot;AMD Ryzen 3&quot;);
    processors.insert("13900KS", "Intel Core i9");
    processors.insert("13700K", "Intel Core i7");
    processors.insert("13600K", "Intel Core i5");
    processors.insert("1800X", "AMD Ryzen 7");
    processors.insert("1600X", "AMD Ryzen 5");
    processors.insert("1300X", "AMD Ryzen 3");

    // Check for presence of value
    let value = &quot;AMD Ryzen 3&quot;;
    let value = "AMD Ryzen 3";
    let mut values = processors.values();
    println!(
        &quot;Is \&quot;AMD Ryzen 3\&quot; in the hashmap of processors?: {}&quot;,
        "Is \"AMD Ryzen 3\" in the hashmap of processors?: {}",
        values.any(|v| v == &amp;value)
    );
    // Lookup by key
    let key = &quot;13900KS&quot;;
    let key = "13900KS";
    let lookup_by_key = processors.get(key);
    println!(
        &quot;Key \&quot;{}\&quot; has the value \&quot;{}\&quot;&quot;,
        "Key \"{}\" has the value \"{}\"",
        key,
        lookup_by_key.unwrap()
    );
@@ -245,8 +245,8 @@ <h2 id="rust"><a class="header" href="#rust">Rust</a></h2>
<p>Just like in the Python version, the first portion checks for the presence of a value in the
hashmap, while the second portion looks up the value by key.</p>
<p>Running the function via <code>main.rs</code> gives us the same output as in Python:</p>
<pre><code class="language-bash">Is &quot;AMD Ryzen 3&quot; in the hashmap of processors?: true
Key &quot;13900KS&quot; has the value &quot;Intel Core i9&quot;
<pre><code class="language-bash">Is "AMD Ryzen 3" in the hashmap of processors?: true
Key "13900KS" has the value "Intel Core i9"
</code></pre>
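<p>The lookup above calls <code>unwrap()</code>, which panics if the key is missing (much like a <code>KeyError</code> in Python). A minimal sketch of a non-panicking lookup follows, assuming a hypothetical helper named <code>describe</code> that is not part of the original code:</p>

```rust
use std::collections::HashMap;

/// Hypothetical helper: returns a message for a key lookup instead of panicking.
fn describe(processors: &HashMap<&str, &str>, key: &str) -> String {
    // `get` returns an Option, so both outcomes must be handled explicitly
    match processors.get(key) {
        Some(name) => format!("Key \"{}\" has the value \"{}\"", key, name),
        None => format!("Key \"{}\" is not in the hashmap", key),
    }
}

fn main() {
    let mut processors = HashMap::new();
    processors.insert("13900KS", "Intel Core i9");
    processors.insert("1300X", "AMD Ryzen 3");

    println!("{}", describe(&processors, "13900KS"));
    println!("{}", describe(&processors, "9999X"));
}
```

<p>Matching on the <code>Option</code> returned by <code>get</code> makes the missing-key case visible in the code rather than leaving it to a runtime panic.</p>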
<h2 id="takeaways"><a class="header" href="#takeaways">Takeaways</a></h2>
<p>Python and Rust contain collections that store key-value pairs for fast lookups. A key difference is
@@ -255,18 +255,18 @@ <h2 id="takeaways"><a class="header" href="#takeaways">Takeaways</a></h2>
<p>In Python, this <code>dict</code> is perfectly valid:</p>
<pre><code class="language-py"># You can have a dict with keys of different types
example = {
    &quot;a&quot;: 1,
    "a": 1,
    1: 2
}
</code></pre>
<p>In Rust, the compiler enforces that all keys share a single type and all values share a single type,
both inferred from the first entry that is inserted.</p>
<pre><code class="language-rs">let mut example = HashMap::new();
example.insert(&quot;a&quot;, 1);
example.insert("a", 1);
// This errors because the first entry specified the key as &amp;str
example.insert(1, 2);
// This is valid
example.insert(&quot;b&quot;, 2);
example.insert("b", 2);
</code></pre>
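<p>If mixed key types are genuinely needed, one option is to wrap them in an enum so that the hashmap still has a single key type. The sketch below is illustrative only: the <code>Key</code> enum and the <code>build_example</code> function are hypothetical names that do not appear in the original code.</p>

```rust
use std::collections::HashMap;

// Hypothetical enum: one key type that can hold either a string or an integer.
// Hash and Eq are required for any type used as a HashMap key.
#[derive(Debug, PartialEq, Eq, Hash)]
enum Key {
    Text(String),
    Number(i64),
}

/// Builds a map that mirrors the Python dict {"a": 1, 1: 2}.
fn build_example() -> HashMap<Key, i32> {
    let mut example = HashMap::new();
    example.insert(Key::Text("a".to_string()), 1);
    example.insert(Key::Number(1), 2);
    example
}

fn main() {
    let example = build_example();
    println!("{:?}", example.get(&Key::Number(1)));
    println!("{:?}", example.get(&Key::Text("a".to_string())));
}
```

<p>The trade-off is that every lookup has to name the variant explicitly, which is more verbose than Python but keeps the type checks at compile time.</p>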

</main>
8 changes: 4 additions & 4 deletions pieces/intro/enumerate.html
@@ -188,9 +188,9 @@ <h2 id="python"><a class="header" href="#python">Python</a></h2>
<code>Person</code> class with a name and an age attribute.</p>
<p>We can instantiate a list of <code>Person</code> objects and iterate over them using <code>enumerate</code>.</p>
<pre><code class="language-py">def run2() -&gt; None:
    persons = [Person(&quot;James&quot;, 33), Person(&quot;Salima&quot;, 31)]
    persons = [Person("James", 33), Person("Salima", 31)]
    for i, person in enumerate(persons):
        print(f&quot;Person {i}: {str(person)}&quot;)
        print(f"Person {i}: {str(person)}")
</code></pre>
<p>Running the above function via <code>main.py</code> gives us the same output as in Rust:</p>
<pre><code class="language-bash">Person 0: James is 33 years old
@@ -211,9 +211,9 @@ <h2 id="rust"><a class="header" href="#rust">Rust</a></h2>
<p>For most purposes, vectors in Rust perform the same function as Python lists.
Unlike a Python list, a vector in Rust can only contain objects of the same type, in this case, <code>Person</code>.</p>
<pre><code class="language-rs">fn run2() {
    let persons = vec![Person::new(&quot;James&quot;, 33), Person::new(&quot;Salima&quot;, 31)];
    let persons = vec![Person::new("James", 33), Person::new("Salima", 31)];
    for (i, p) in persons.iter().enumerate() {
        println!(&quot;Person {}: {}&quot;, i, p)
        println!("Person {}: {}", i, p)
    }
}
</code></pre>
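<p>The <code>Person</code> struct is not visible in this excerpt, so the self-contained sketch below assumes a definition (a <code>String</code> name, a <code>u8</code> age, and a <code>Display</code> implementation matching the printed output). The helper <code>numbered_lines</code> is a hypothetical name; it shows one way to mimic Python's <code>enumerate(persons, start=1)</code> by offsetting the index.</p>

```rust
use std::fmt;

// Assumed definition: the original Person struct is not shown in this excerpt.
struct Person {
    name: String,
    age: u8,
}

impl Person {
    fn new(name: &str, age: u8) -> Self {
        Person { name: name.to_string(), age }
    }
}

impl fmt::Display for Person {
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
        write!(f, "{} is {} years old", self.name, self.age)
    }
}

/// Hypothetical helper: one line per person, numbered from `start` instead of zero.
fn numbered_lines(persons: &[Person], start: usize) -> Vec<String> {
    persons
        .iter()
        .enumerate()
        .map(|(i, p)| format!("Person {}: {}", i + start, p))
        .collect()
}

fn main() {
    let persons = vec![Person::new("James", 33), Person::new("Salima", 31)];
    for line in numbered_lines(&persons, 1) {
        println!("{}", line);
    }
}
```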
8 changes: 4 additions & 4 deletions pieces/intro/lambdas_vs_closures.html
@@ -190,10 +190,10 @@ <h2 id="python"><a class="header" href="#python">Python</a></h2>
<p>In the following example, we use the <code>sorted</code> function to sort a list of <code>Person</code> objects by their
age.</p>
<pre><code class="language-py">def run5() -&gt; None:
    persons = [Person(&quot;Aiko&quot;, 41), Person(&quot;Rohan&quot;, 18)]
    persons = [Person("Aiko", 41), Person("Rohan", 18)]
    sorted_by_age = sorted(persons, key=lambda person: person.age)
    youngest_person = sorted_by_age[0]
    print(f&quot;{youngest_person.name} is the youngest person at {youngest_person.age} years old&quot;)
    print(f"{youngest_person.name} is the youngest person at {youngest_person.age} years old")
</code></pre>
<p>The <code>sorted</code> function takes an optional <code>key</code> argument, which is a function that is called on each
item in the list to determine the value to sort by. In this case, we use a lambda to return the
@@ -206,12 +206,12 @@ <h2 id="rust"><a class="header" href="#rust">Rust</a></h2>
<p>In the following example, we use the <code>sort_by_key</code> method to sort a vector of <code>Person</code> objects by
their age.</p>
<pre><code class="language-rs">fn run5() {
    let mut persons = vec![Person::new(&quot;Aiko&quot;, 41), Person::new(&quot;Rohan&quot;, 18)];
    let mut persons = vec![Person::new("Aiko", 41), Person::new("Rohan", 18)];
    // Sort by age
    persons.sort_by_key(|p| p.age);
    let youngest_person = persons.first().unwrap();
    println!(
        &quot;{} is the youngest person at {} years old&quot;,
        "{} is the youngest person at {} years old",
        youngest_person.name, youngest_person.age
    );
}
</code></pre>
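<p>When only the youngest person is needed, a closure can also be passed to <code>min_by_key</code>, which avoids sorting and mutating the vector. This is a sketch under assumptions: the <code>Person</code> definition is inferred from how it is used above, and <code>youngest</code> is a hypothetical helper name.</p>

```rust
// Assumed definition: the original Person struct is not shown in this excerpt.
struct Person {
    name: String,
    age: u8,
}

impl Person {
    fn new(name: &str, age: u8) -> Self {
        Person { name: name.to_string(), age }
    }
}

/// Hypothetical helper: finds the youngest person without sorting.
/// Returns None for an empty slice instead of panicking.
fn youngest(persons: &[Person]) -> Option<&Person> {
    // The closure plays the same role as Python's `key=lambda person: person.age`
    persons.iter().min_by_key(|p| p.age)
}

fn main() {
    let persons = vec![Person::new("Aiko", 41), Person::new("Rohan", 18)];
    match youngest(&persons) {
        Some(p) => println!("{} is the youngest person at {} years old", p.name, p.age),
        None => println!("No persons given"),
    }
}
```

<p>Returning an <code>Option</code> also removes the need for <code>unwrap()</code> on an empty collection.</p>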
14 changes: 7 additions & 7 deletions pieces/intro/list_comprehensions_vs_map.html
@@ -189,14 +189,14 @@ <h2 id="python"><a class="header" href="#python">Python</a></h2>
<p>Consider the following function, which prints the persons from a list of <code>Person</code> objects
who were born after the year 1995, based on their current age.</p>
<pre><code class="language-py">def run7() -&gt; None:
    &quot;&quot;&quot;
    """
    1. List comprehensions
    &quot;&quot;&quot;
    persons = [Person(&quot;Issa&quot;, 39), Person(&quot;Ibrahim&quot;, 26)]
    """
    persons = [Person("Issa", 39), Person("Ibrahim", 26)]
    persons_born_after_1995 = [
        (person.name, person.age) for person in persons if approx_year_of_birth(person) &gt; 1995
    ]
    print(f&quot;Persons born after 1995: {persons_born_after_1995}&quot;)
    print(f"Persons born after 1995: {persons_born_after_1995}")
</code></pre>
<p>The list comprehension in the above function essentially does the following:</p>
<ol>
@@ -211,13 +211,13 @@ <h2 id="rust"><a class="header" href="#rust">Rust</a></h2>
<p>We can define the function below in Rust, which prints the persons from a vector of <code>Person</code>
objects who were born after the year 1995, based on their current age.</p>
<pre><code class="language-rs">fn run7() {
    let persons = vec![Person::new(&quot;Issa&quot;, 39), Person::new(&quot;Ibrahim&quot;, 26)];
    let persons = vec![Person::new("Issa", 39), Person::new("Ibrahim", 26)];
    let result = persons
        .into_iter()
        .filter(|p| approx_year_of_birth(p) &gt; 1995)
        .map(|p| (p.name, p.age))
        .collect::&lt;Vec&lt;(String, u8)&gt;&gt;();
    println!(&quot;Persons born after 1995: {:?}&quot;, result)
    println!("Persons born after 1995: {:?}", result)
}
</code></pre>
<p>The <code>filter</code> and <code>map</code> functions in the above function essentially do the following:</p>
<ol>
@@ -227,7 +227,7 @@ <h2 id="rust"><a class="header" href="#rust">Rust</a></h2>
<li>Collect all the tuples into a vector of type <code>Vec&lt;(String, u8)&gt;</code>, i.e., a vector of (name, age) tuples</li>
</ol>
<p>Running the function via <code>main.rs</code> gives us the same output as in Python:</p>
<pre><code class="language-bash">Persons born after 1995: [(&quot;Ibrahim&quot;, 26)]
<pre><code class="language-bash">Persons born after 1995: [("Ibrahim", 26)]
</code></pre>
<p>The Rust version is a little more verbose than the Python version, but it's still quite readable.</p>
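<p>The version above uses <code>into_iter()</code>, which consumes the vector. A borrowing variant is sketched below for comparison. It rests on assumptions: the <code>Person</code> definition is inferred from usage, and <code>approx_year_of_birth</code> is a hypothetical stand-in whose real signature is not shown in this excerpt (here the current year is passed in explicitly so the result is deterministic). <code>born_after</code> is likewise a made-up helper name.</p>

```rust
// Assumed definition: the original Person struct is not shown in this excerpt.
struct Person {
    name: String,
    age: u8,
}

impl Person {
    fn new(name: &str, age: u8) -> Self {
        Person { name: name.to_string(), age }
    }
}

// Hypothetical stand-in for the original helper; the year is a parameter
// here rather than being read from the system clock.
fn approx_year_of_birth(person: &Person, current_year: u16) -> u16 {
    current_year - u16::from(person.age)
}

/// Hypothetical helper: filter, then map, then collect, borrowing the input.
fn born_after(persons: &[Person], year: u16, current_year: u16) -> Vec<(String, u8)> {
    persons
        .iter() // borrows each Person instead of moving it
        .filter(|p| approx_year_of_birth(p, current_year) > year)
        .map(|p| (p.name.clone(), p.age)) // clone the name because we only hold a reference
        .collect()
}

fn main() {
    let persons = vec![Person::new("Issa", 39), Person::new("Ibrahim", 26)];
    let result = born_after(&persons, 1995, 2024);
    println!("Persons born after 1995: {:?}", result);
    // `persons` is still usable here because it was only borrowed
    println!("Total persons: {}", persons.len());
}
```

<p>Borrowing costs one <code>clone()</code> per retained name, whereas <code>into_iter()</code> moves the names out for free but makes the original vector unavailable afterwards.</p>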
<h2 id="takeaways"><a class="header" href="#takeaways">Takeaways</a></h2>