Rust workshop given at Accenture Interactive on September 22, 2021 to introduce the language and showcase its features.

mrnaveira/rust-workshop

rust-workshop

This is the written version of the Rust workshop given at Accenture Interactive on September 22, 2021 to introduce the language and showcase its features.

Introduction

Memory safety matters

Memory safety is a big concern. According to figures published by Microsoft, ~70% of the vulnerabilities it assigns a CVE each year continue to be memory safety issues. This is across all Microsoft products and all languages. These types of problems include overflows, dereferencing issues and race conditions.

One may think that this is easily fixed just by using a high-level language (like JavaScript) instead of a low-level one (like C/C++). But it is not so simple because:

  • High-level languages need runtimes, virtual machines and interpreters. Those tools must themselves be written in low-level languages, which makes them a source of the memory issues mentioned earlier.
  • High-level languages often carry a significant performance penalty.

Enter Rust

Because of the memory safety concerns introduced earlier, it's important to have a low-level, memory-safe language that can replace C/C++ while matching their performance.

To address this, Rust was designed by Mozilla Research's Graydon Hoare, with contributions from the likes of JavaScript creator Brendan Eich. Rust became the core language for some of the fundamental features of the Firefox browser and its Gecko engine, as well as Mozilla's Servo engine.

Today, Rust has repeatedly been voted the most-loved language in Stack Overflow's developer survey and is used in production by many companies and products, including Mozilla Firefox, multiple Microsoft products, Dropbox, Discord, Coursera, Tor and, recently, even the Linux kernel.

Rust was able to achieve top performance and top safety at the same time. But of course nothing is perfect, and Rust has its downsides as well. The main downside of Rust is language ergonomics: assume that any program written in Rust will be more difficult to write and understand than if it were written in a high-level language. Also, the learning curve is steep, as new developers need to understand memory management concepts like ownership, borrowing and lifetimes.

Rust vs Go

Rust and Go are often compared because they have very similar goals: both are safe, fast, compiled languages with a focus on concurrency.

As many benchmarks show, Rust significantly outperforms Go in a multitude of tasks, including CPU computation as well as I/O-bound operations. An important point is that Rust's performance is more stable and predictable over time. A good example is Discord, which switched from Go to Rust because of performance issues with Go's garbage collector.

Go is considered a very safe language, but in this aspect the Rust compiler checks are even more exhaustive, as the race condition issue that Microsoft found in Kubernetes Helm shows.

The main advantage of Go over Rust is a much easier learning curve, as memory is handled automatically via the runtime/garbage collector. Rust forces the developer to learn and understand the memory ownership system.

As expected, there is no clear winner between the two languages. You must weigh Rust's performance and safety against Go's easier learning curve to make a decision for your particular project.

So, when to use Rust?

In summary, the main use case that makes Rust attractive is software with a high demand for performance and/or concurrency. Basically, anything that in the past would have been written in C/C++ but where you now want the memory safety that Rust provides:

  • Powerful, cross-platform command-line tools.
  • Distributed online services.
  • Embedded devices.
  • Anywhere else you would need systems programming, like browser engines.

Another important use case to mention is WebAssembly development. Rust is widely used for that purpose: it has a very small runtime, generates very efficient wasm code, and the Rust WebAssembly tooling is very mature.

Memory management exercises

We will take a look at some basic examples of Rust code to showcase the basic features of the language. We will focus on the memory safety constraints of Rust, which constitute the biggest part of the initial learning curve of the language.

For this section there is no need to have Rust installed on your machine; you can simply use the online Rust playground.

Hello, world!

To get things started, let's write the typical Hello World example:

fn main() {
    println!("Hello, world!");
}

Run the above code in the online Rust playground. To learn how to compile and run this code in a local environment, check the Rust reference.

The example defines a function in Rust. The main function is special: it is always the first code that runs in every executable Rust program. One detail may look confusing at first: println! is a macro, not a function (note the trailing !). Rust functions cannot take a variable number of arguments, so formatted printing is implemented as a macro (for more information on this, check the Rust reference book on macros).
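To make the macro point concrete, here is a minimal sketch. The describe helper is hypothetical and exists only for this illustration. It uses format!, which accepts the same format strings as println! but returns a String instead of printing.

```rust
// `describe` is a hypothetical helper, used only for illustration.
fn describe(name: &str, age: u32) -> String {
    // The number and types of the arguments are checked against the
    // format string at compile time.
    format!("{} is {} years old", name, age)
}

fn main() {
    // The same macro accepts zero, one or many arguments,
    // which a regular Rust function could not do.
    println!("no arguments");
    println!("{}", describe("Ferris", 7));
    println!("{} + {} = {}", 1, 2, 1 + 2);
}
```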

Ownership

The concept that truly separates Rust from other languages is the memory management mechanism called ownership. The Rust compiler has some very restrictive rules regarding memory (heap):

  • Each memory value in Rust has one, and only one variable that's called its owner.
  • There can only be one owner at a time.
  • When the owner goes out of scope (i.e. the code block of the variable closes), the memory value will be automatically deleted.
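The third rule can be observed directly by implementing the standard Drop trait, whose drop method Rust calls when an owner goes out of scope. This is a minimal sketch: the Buffer type, the DROPPED counter and the drops_in_inner_scope function are hypothetical names introduced only for this illustration.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

// Global counter, used only so the example can check how many values were dropped.
static DROPPED: AtomicUsize = AtomicUsize::new(0);

// `Buffer` is a hypothetical type standing in for any value that owns heap memory.
struct Buffer {
    name: String,
}

impl Drop for Buffer {
    // Rust calls `drop` automatically when the owner goes out of scope.
    fn drop(&mut self) {
        println!("freeing {}", self.name);
        DROPPED.fetch_add(1, Ordering::SeqCst);
    }
}

// Returns how many buffers were freed before the function itself returned.
fn drops_in_inner_scope() -> usize {
    let before = DROPPED.load(Ordering::SeqCst);
    let _outer = Buffer { name: String::from("outer") };
    {
        let _inner = Buffer { name: String::from("inner") };
    } // `_inner` goes out of scope here, so it is freed
    // `_outer` is still alive at this point; it is freed when the function returns
    DROPPED.load(Ordering::SeqCst) - before
}

fn main() {
    println!("freed inside the function: {}", drops_in_inner_scope());
}
```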

The example code below will fail because of ownership rules.

fn main() {
    // i32 is a fixed-size type that lives on the stack and implements Copy,
    // so assigning it copies the value and causes no ownership problems
    let x: i32 = 5;
    let _y: i32 = x; // this performs a copy

    // String stores its data on the heap, so we need to handle ownership.
    // The memory value can only have one owner, so the assignment below
    // moves ownership from "s1" to "_s2"
    let s1: String = String::from("hello");
    let _s2: String = s1; // "s1" is no longer valid after this line

    println!("{}, world!", s1); // compiler error: "s1" was moved
}
Exercise 1

Fix the above code by using the clone function to make a copy of the memory value.

Solution
fn main() {
    let x: i32 = 5;
    let _y: i32 = x;

    let s1: String = String::from("hello");
    // "clone" returns a copy of the memory value 
    let _s2: String = s1.clone();
    
    println!("{}, world!", s1);
}

Passing a variable to a function will move or copy, just as assignment does. The example code below will fail because of ownership rules.

fn main() {
    // s comes into scope
    let s = String::from("hello");  

    // s's value moves into the function...
    takes_ownership(s);             
    
    // ... and so is no longer valid afterwards
    // moreover, the memory value in the heap is deleted after the call finishes

    // The code below will fail to compile because the ownership was transferred
    println!("{}, world!", s);

}

fn takes_ownership(some_string: String) { // some_string comes into scope
    println!("{}", some_string);
} // Here, some_string goes out of scope and `drop` is called. The backing
  // memory is freed.
Exercise 2

Fix the above code, again using the clone function to make a copy of the memory value.

Solution
fn main() {
    let s = String::from("hello");  
    takes_ownership(s.clone());             
    println!("{}, world!", s);
}

fn takes_ownership(some_string: String) {
    println!("{}", some_string);
}
Exercise 3

This time, instead of making a clone, modify the takes_ownership function to return the value and assign it back to the s variable. Variables in Rust are immutable by default, so you will also need to use the mut keyword.

Solution
fn main() {
    let mut s = String::from("hello");
    // ownership is returned to "s" after the call, so we can use the variable again
    s = takes_ownership(s);      
    println!("{}, world!", s);
}

fn takes_ownership(some_string: String) -> String {
    println!("{}", some_string);
    return some_string;
}

The solutions we explored in exercises 2 and 3 are not ideal. Cloning a memory value (exercise 2) has a performance penalty, and because the function only receives a copy, it cannot modify the original value. Returning the value (exercise 3) solves those problems but makes the code very difficult to manage when we are manipulating multiple values. In the next section we will explore a better solution, called borrowing.

Borrowing

Borrowing is the action of temporarily using a reference to a value without taking ownership of it. With borrowing, we have another way of fixing the code from the previous section.

Below is the same ownership problem from the previous section.

fn main() {
    let s = String::from("hello");  
    borrow(s);             
   
    // The below code will fail to compile because the ownership was transferred
    println!("{}, world!", s);
}

fn borrow(some_string: String) {
    println!("{}", some_string);
}
Exercise 4

Use borrowing to pass a reference of the variable s to the function borrow.

Solution
fn main() {
    let s = String::from("hello");  
    borrow(&s);             
    println!("{}, world!", s);
}

fn borrow(some_string: &String) {
    println!("{}", some_string);
}

But what happens if we want to modify the value that we are borrowing? By default, Rust will not let you modify a borrowed value unless both the caller and the callee explicitly declare it. The next example will fail to compile for that reason:

fn main() {
    let s = String::from("hello");  
    borrow(&s);             
    println!("{}, world!", s);
}

fn borrow(some_string: &String) {
    // we are modifying the value, adding more characters to the string 
    some_string.push_str("ooooo");
}
Exercise 5

Use a mutable reference to allow the function borrow to modify the string s.

Solution
fn main() {
    // note that the variable must also be declared as mutable!
    let mut s = String::from("hello");  
    borrow(&mut s); // passing a mutable reference             
    println!("{}, world!", s);
}

fn borrow(some_string: &mut String) { // receiving a mutable reference
    some_string.push_str("ooooo");
}

To prevent data races and dangling pointers, the Rust compiler imposes some restrictions on references:

  • You can have only one mutable reference to a particular piece of data at a time.
  • You cannot have a mutable reference while immutable references to the same data are still in use.
  • You can't return a reference to a variable defined inside the function, because that variable is dropped when the function returns.
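The restrictions above can be sketched in a small program that compiles, with the rejected variants left as comments. The append_twice function is a hypothetical helper used only for this illustration.

```rust
// `append_twice` is a hypothetical helper: it needs a mutable reference
// because it modifies the borrowed string.
fn append_twice(text: &mut String, suffix: &str) {
    text.push_str(suffix);
    text.push_str(suffix);
}

// This would not compile: `local` is dropped when the function returns,
// so the returned reference would dangle.
// fn dangling() -> &String { let local = String::from("x"); &local }

fn main() {
    let mut s = String::from("hello");

    // Any number of immutable references may coexist.
    let r1 = &s;
    let r2 = &s;
    println!("{} {}", r1, r2);
    // `r1` and `r2` are not used after this point, so their borrows end here,
    // which makes it legal to create a single mutable reference now.
    let m = &mut s;
    append_twice(m, "!");

    // This would not compile: a second mutable borrow while `m` is still in use.
    // let m2 = &mut s; m.push_str("?"); m2.push_str("?");

    println!("{}", s);
}
```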

Lifetimes

The Rust compiler keeps track of how long each reference lasts, to prevent a reference from pointing to data that no longer exists (a dangling pointer). That span is called the lifetime: it begins when the reference is created and ends when the reference is last used. In most cases the lifetime is inferred by the compiler so we, as developers, don't need to specify it.
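As a minimal sketch of the inferred case, the hypothetical first_word function below takes one reference and returns one, so the compiler can work out the relationship between them without any annotation.

```rust
// The compiler infers that the returned slice borrows from `text`,
// as if we had written: fn first_word<'a>(text: &'a str) -> &'a str
fn first_word(text: &str) -> &str {
    text.split_whitespace().next().unwrap_or("")
}

fn main() {
    let sentence = String::from("memory safety matters");
    let word = first_word(&sentence);
    println!("{}", word);
}
```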

But when the compiler cannot work out how the lifetime of a returned reference relates to the lifetimes of the arguments, we need to annotate the lifetimes manually. Consider the following example:

fn longest<'a>(x: &'a str, y: &'a str) -> &'a str {
    if x.len() > y.len() {
        x
    } else {
        y
    }
}

Whether longest returns x or y cannot be determined at compile time. The 'a annotation tells the compiler that the result is only valid while both x and y are valid, so if either argument doesn't live long enough for the result to be used safely, the compiler will let you know about it.

The code below will produce an error, since string2 goes out of scope while result may still refer to it:

fn main() {
    let string1 = String::from("a very long string");
    let result;
    {
        let string2 = String::from("short string");
        result = longest(string1.as_str(), string2.as_str());
    }
    println!("The longest string is {}", result);
}

fn longest<'a>(x: &'a str, y: &'a str) -> &'a str {
    if x.len() > y.len() {
        x
    } else {
        y
    }
}
Exercise 6

Modify the previous code to fix the compiler error about lifetimes by putting the variables string1 and string2 in the same scope.

Solution
fn main() {
    // both strings live in the same scope, so they can be safely passed to "longest" in any shorter scope
    let string1 = String::from("a very long string");
    let string2 = String::from("short string");
    let result;
    {
        result = longest(string1.as_str(), string2.as_str());
    }
    println!("The longest string is {}", result);
}

fn longest<'a>(x: &'a str, y: &'a str) -> &'a str {
    if x.len() > y.len() {
        x
    } else {
        y
    }
}

Other Rust concepts

This workshop focused on memory management exercises, probably the biggest part of the initial learning curve. But there are other Rust concepts worth mentioning at this point.

Concurrency

The examples in this workshop show how the Rust compiler will not allow a memory value to be modified from different scopes. But what if we specifically want to read and write the same memory value from multiple threads? One of the core goals of Rust is to allow safe concurrency. To access the same value from multiple threads at the same time, the programmer must specify the synchronization mechanism to use: either message passing or shared state.
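Both mechanisms are available in the standard library. The sketch below adds up worker IDs twice, once with a channel (message passing) and once with a mutex behind an Arc (shared state). The function names sum_with_channel and sum_with_mutex are hypothetical and used only for this illustration.

```rust
use std::sync::{mpsc, Arc, Mutex};
use std::thread;

// Message passing: each worker sends its ID through a channel.
fn sum_with_channel(workers: u32) -> u32 {
    let (tx, rx) = mpsc::channel();
    for id in 1..=workers {
        let tx = tx.clone();
        thread::spawn(move || {
            tx.send(id).expect("receiver is still alive");
        });
    }
    // Drop the original sender so the receiver stops once all workers finish.
    drop(tx);
    rx.iter().sum()
}

// Shared state: every worker locks the same counter before updating it.
fn sum_with_mutex(workers: u32) -> u32 {
    let total = Arc::new(Mutex::new(0));
    let handles: Vec<_> = (1..=workers)
        .map(|id| {
            let total = Arc::clone(&total);
            thread::spawn(move || {
                *total.lock().unwrap() += id;
            })
        })
        .collect();
    for handle in handles {
        handle.join().unwrap();
    }
    let result = *total.lock().unwrap();
    result
}

fn main() {
    println!("channel: {}", sum_with_channel(4));
    println!("mutex: {}", sum_with_mutex(4));
}
```

Removing the Mutex, or moving the counter into the threads without Arc, is rejected by the compiler, which is how Rust prevents data races at compile time.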

Object orientation

Rust supports object-oriented patterns, but there is no subtyping and no inheritance of data in Rust. The relationships between various data types are established using traits.

trait Speaks {
    fn speak(&self);

    fn noise(&self) -> &str;
}

trait Animal {
    fn animal_type(&self) -> &str;
}

struct Dog {}

impl Animal for Dog {
    fn animal_type(&self) -> &str {
        "dog"
    }
}  

impl Speaks for Dog {
    fn speak(&self) {
        println!("The dog said {}", self.noise());
    }

    fn noise(&self) -> &str {
        "woof"
    }
}

fn main() {
    let dog = Dog {};
    dog.speak();
}

You can think of traits as interfaces. But unlike interfaces in languages like Java, new traits can be implemented for existing types. That means abstractions can be created after-the-fact, and applied to existing libraries.
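As a minimal sketch of that last point, the hypothetical Shout trait below is defined in our own code and then implemented for str, a type that comes from the standard library. This is allowed as long as either the trait or the type is defined in the current crate.

```rust
// `Shout` is a hypothetical trait defined in our own code...
trait Shout {
    fn shout(&self) -> String;
}

// ...and implemented for `str`, a type we did not write.
impl Shout for str {
    fn shout(&self) -> String {
        format!("{}!", self.to_uppercase())
    }
}

fn main() {
    // Every string slice now has the new method.
    println!("{}", "woof".shout());
}
```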

There is also trait inheritance:

trait Show {
    fn show(&self) -> String;
}

trait Location {
    fn location(&self) -> String;
}

trait ShowTell: Show + Location {}
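A type that implements ShowTell must also implement Show and Location, so code that receives a ShowTell can call the methods of both. A minimal sketch, assuming a hypothetical Landmark type and announce function:

```rust
trait Show {
    fn show(&self) -> String;
}

trait Location {
    fn location(&self) -> String;
}

trait ShowTell: Show + Location {}

// `Landmark` is a hypothetical type used only for illustration.
struct Landmark {
    name: String,
    city: String,
}

impl Show for Landmark {
    fn show(&self) -> String {
        self.name.clone()
    }
}

impl Location for Landmark {
    fn location(&self) -> String {
        self.city.clone()
    }
}

// This impl only compiles because both supertraits are implemented above.
impl ShowTell for Landmark {}

// Anything that is ShowTell can use the methods of Show and Location.
fn announce(item: &dyn ShowTell) -> String {
    format!("{} is in {}", item.show(), item.location())
}

fn main() {
    let landmark = Landmark {
        name: String::from("Giralda"),
        city: String::from("Seville"),
    };
    println!("{}", announce(&landmark));
}
```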

Error handling

Error handling in Rust is done by wrapping function return values in the Result enum type to later match the outcome from the caller:

use std::fs::File;

fn main() {
    let f = File::open("main.jpg");
    match f {
        Ok(f) => {
            println!("file found {:?}", f);
        }
        Err(e) => {
            println!("file not found \n{:?}", e); // handled error
        }
    }
    println!("end of main");
}

To reduce the boilerplate of error handling in Rust, two libraries are commonly used:

  • thiserror to avoid writing lots of boilerplate code on custom error declaration.
  • anyhow to avoid boilerplate on error chaining in function calls.
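To show the kind of boilerplate these libraries aim to remove, here is a sketch that uses only the standard library. The ConfigError type and the parse_port function are hypothetical. The Display and From implementations are written by hand, and the ? operator uses From to convert and propagate the error.

```rust
use std::fmt;
use std::num::ParseIntError;

// `ConfigError` is a hypothetical custom error type.
#[derive(Debug)]
enum ConfigError {
    Missing,
    InvalidPort(ParseIntError),
}

// Hand-written boilerplate: how the error is displayed.
impl fmt::Display for ConfigError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            ConfigError::Missing => write!(f, "port is missing"),
            ConfigError::InvalidPort(e) => write!(f, "invalid port: {}", e),
        }
    }
}

impl std::error::Error for ConfigError {}

// Hand-written boilerplate: lets `?` convert a ParseIntError into ConfigError.
impl From<ParseIntError> for ConfigError {
    fn from(e: ParseIntError) -> Self {
        ConfigError::InvalidPort(e)
    }
}

fn parse_port(input: Option<&str>) -> Result<u16, ConfigError> {
    let raw = input.ok_or(ConfigError::Missing)?;
    // `?` returns early with the converted error if parsing fails.
    let port = raw.trim().parse::<u16>()?;
    Ok(port)
}

fn main() {
    for input in [Some("8080"), Some("abc"), None] {
        match parse_port(input) {
            Ok(port) => println!("port {}", port),
            Err(e) => println!("error: {}", e),
        }
    }
}
```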

Asynchronous Programming

Asynchronous programming is typically used when we want to run many instances of a small task concurrently (for example, handling network requests) while avoiding the overhead of creating an OS thread for each one. Keep in mind that async code should never spend a long time without reaching an await, because until then it blocks the thread it runs on. For expensive CPU processing it's better to stick to threads.

In Rust, asynchronous functions return a Future value similar to promises in JavaScript. We can then specify how and when to block the thread execution.

// "join!" comes from the "futures" crate; "download_async" is assumed to be
// an async function defined elsewhere.
use futures::join;

async fn get_two_sites_async() {
    // Create two different "futures" which, when run to completion,
    // will asynchronously download the webpages.
    let future_one = download_async("https://www.foo.com");
    let future_two = download_async("https://www.bar.com");

    // Run both futures to completion at the same time.
    join!(future_one, future_two);
}

Rust currently provides only the bare essentials for writing async code. A full async runtime is not yet provided in the standard library. In the meantime, community-provided async ecosystems fill in these gaps, Tokio being the most widely used.
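To illustrate what a runtime has to do, here is a toy executor built only from standard library pieces. It is a sketch for learning purposes, not a substitute for Tokio. The names block_on, ThreadWaker, double and double_twice are hypothetical, and the code assumes the 2018 edition or later.

```rust
use std::future::Future;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};
use std::thread::{self, Thread};

// Wakes the executor by unparking the thread that is waiting on the future.
struct ThreadWaker(Thread);

impl Wake for ThreadWaker {
    fn wake(self: Arc<Self>) {
        self.0.unpark();
    }
}

// Toy executor: polls one future on the current thread until it is ready.
fn block_on<F: Future>(future: F) -> F::Output {
    let mut future = Box::pin(future);
    let waker = Waker::from(Arc::new(ThreadWaker(thread::current())));
    let mut cx = Context::from_waker(&waker);
    loop {
        match future.as_mut().poll(&mut cx) {
            Poll::Ready(output) => return output,
            // Sleep until the waker signals that progress is possible.
            Poll::Pending => thread::park(),
        }
    }
}

async fn double(x: u32) -> u32 {
    x * 2
}

async fn double_twice(x: u32) -> u32 {
    let once = double(x).await;
    double(once).await
}

fn main() {
    // Calling an async function only creates the future; nothing runs yet.
    let future = double_twice(5);
    // The future makes progress only when an executor polls it.
    println!("{}", block_on(future));
}
```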

Resources
