A brief introduction to memory management in Rust

Memory management

In Rust, memory is managed through a system of ownership, borrowing and lifetimes. This system is a set of rules that the compiler checks at compile time, and it is used instead of a garbage collector to manage memory.

Ownership is the breakout feature of Rust. It allows safe Rust to be memory-safe and efficient while avoiding garbage collection.

Rust, Ownership and Lifetimes

The stack and heap are regions of memory that store data for a program to use. Ownership, borrowing and lifetimes prevent many of the problems caused by incorrect memory management, which is possible in languages such as C++.

When interacting with the heap and stack, the Rust compiler:

Keeps track of areas of the program using data on the heap
Prevents Invalid Memory Access in heap and stack
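Prevents most Memory leaks (leaks are still possible in safe code, for example through reference cycles or std::mem::forget)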
Prevents Mismatched Allocation/Deallocation errors
Prevents Missing Allocation errors
Clears unused data on the heap
Prevents Uninitialized Memory Access in heap and stack
Minimizes the amount of duplicated data on the heap
Prevents Cross Stack Access

Ownership, scopes and lifetimes

Imagine a puzzle made of entirely unique pieces, where each piece makes up part of a greater whole. Values are the pieces, the shape of each piece is the name for that piece, and the puzzle is your program. While some values may look the same or represent similar types of data, each name is unique to its value.

Each value in Rust has a name, and that name is the "owner" of its value. There can only be one owner for each value at a time.

Scopes play an important part in ownership, borrowing, and lifetimes. They indicate to the compiler when borrows are valid, when resources can be freed, and when values are dropped.

The scope of a name binding - an association of a name to an entity, such as a variable - is the part of a program where the name binding is valid, that is where the name can be used to refer to the entity. In other parts of the program the name may refer to a different entity (have a different binding), or to nothing at all (may be unbound).

Wikipedia, Scope (Computer Science)

The "lifetime" of a value - the time between when it was created and when it was dropped - depends on the scope it's in. Once the program exits the scope of that name, its value is dropped (i.e. its lifetime has ended). Rust does this automatically by running the value's destructor (the drop method of the Drop trait, if the type implements it) and then freeing the memory used by that value.

Note: You can call std::mem::drop manually to drop a value.
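
To make the scope rules above concrete, here is a minimal sketch. The `Noisy` type, its fields and the `drop_order` function are hypothetical names invented for this illustration: each `Noisy` value records its name in a shared log when it is dropped, so we can see when each lifetime ends.

```rust
use std::cell::RefCell;
use std::rc::Rc;

/// Hypothetical type that writes its name to a shared log when dropped.
struct Noisy {
    name: &'static str,
    log: Rc<RefCell<Vec<&'static str>>>,
}

impl Drop for Noisy {
    // Rust runs this automatically when a `Noisy` value is dropped.
    fn drop(&mut self) {
        self.log.borrow_mut().push(self.name);
    }
}

/// Returns the names of the values dropped so far, in drop order.
fn drop_order() -> Vec<&'static str> {
    let log = Rc::new(RefCell::new(Vec::new()));

    let _outer = Noisy { name: "outer", log: Rc::clone(&log) };
    {
        let _inner = Noisy { name: "inner", log: Rc::clone(&log) };
        // `_inner` goes out of scope at the closing brace and is dropped.
    }

    let early = Noisy { name: "early", log: Rc::clone(&log) };
    // `std::mem::drop` ends the lifetime of `early` before its scope ends.
    drop(early);

    // `_outer` is still alive here, so it is not in the log yet.
    let order = log.borrow().clone();
    order
}

fn main() {
    println!("{:?}", drop_order()); // prints ["inner", "early"]
}
```

Note that `drop` here is `std::mem::drop`, which is available without an import because it is part of the prelude.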

The stack and heap

Imagine you had a piece of lined paper of infinite length and a box of infinite size. If you wanted to write a line (say, a name) on that paper, you'd simply go to the next available line and write it down. If you wanted to put another box inside the infinite box, you'd pick where to put it (allocate a space for it) before putting it in. The stack is the lined paper and the heap is the box.

The stack stores values in the order it gets them and gives them out in the opposite order (first in, last out). Adding data to the stack is referred to as "pushing onto the stack"; removing data is referred to as "popping off the stack". All data on the stack must have a known, fixed size. This is why pushing onto the stack is not considered allocating.

Computer Science Wiki, Illustration of a Stack (Memory).

The heap is less organized than the stack and is for data with an unknown size at compile time or data with a variable size.

When you put data on the heap, you ask for some amount of space. The memory allocator (which in turn obtains memory from the OS, the Operating System) finds an empty spot in the heap that's big enough, marks it as being in use and returns the address of that location. This process is referred to as "allocating on the heap" (allocating), and the returned address is called a pointer.

When you want to access this data, you have to follow the pointer to the data's location on the heap.
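
As a small sketch of the difference, the snippet below keeps one integer on the stack and puts another on the heap with `Box::new`. The function name `sum_stack_and_heap` is made up for this example.

```rust
/// Stores one value on the stack and one on the heap, then adds them.
fn sum_stack_and_heap() -> i32 {
    // An `i32` has a known, fixed size, so it lives on the stack.
    let on_stack: i32 = 40;

    // `Box::new` allocates space on the heap and returns a pointer to it.
    // The pointer itself (the `Box`) is stored on the stack.
    let on_heap: Box<i32> = Box::new(2);

    // `*on_heap` follows the pointer to read the value on the heap.
    on_stack + *on_heap
    // The heap allocation is freed here, when `on_heap` goes out of scope.
}

fn main() {
    println!("{}", sum_stack_and_heap()); // prints 42
}
```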

Requesting, allocating and freeing memory

Scalar types are stored on the stack because they have a fixed size. Because of this, data on the stack is easy to find and quick to reference.

Complex types (such as String) store their contents on the heap. In order to support a growable, mutable data type such as String, the String value itself must hold three pieces of information: the "pointer" (ptr), the "length" (len) and the "capacity".

This data (ptr, len and capacity) is stored on the stack.

The pointer points to the memory on the heap that holds the contents of the String
The length is how much memory (in bytes) the contents of the String currently use
The capacity is the total memory (in bytes) that the allocator has given to the String

*A diagram of `&String s` pointing at `String s1`.*
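
The three parts of a String described above can be inspected at runtime. This is only a sketch: `string_parts` is a hypothetical helper, and the requested capacity of 16 is an arbitrary choice for the example.

```rust
/// Builds a String and returns its length and capacity in bytes.
fn string_parts() -> (usize, usize) {
    // Ask for room for at least 16 bytes up front.
    let mut s = String::with_capacity(16);
    s.push_str("Hello");

    // The pointer to the heap buffer that holds the contents.
    println!("ptr      = {:p}", s.as_ptr());
    // The number of bytes currently in use.
    println!("len      = {}", s.len());
    // The number of bytes allocated (at least the 16 requested).
    println!("capacity = {}", s.capacity());

    (s.len(), s.capacity())
}

fn main() {
    string_parts();
}
```

The length is 5 because "Hello" takes five bytes in UTF-8, while the capacity stays at 16 or more until the String needs to grow beyond it.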

When a value of unknown size needs to be stored in memory, Rust requests heap memory from the allocator at runtime, and that memory is returned when the lifetime of the value ends.

For this example, let's say we want to store a String in memory:

We request memory when we call the String::from method
In many other languages, freeing memory is done by the garbage collector (GC), but since Rust doesn't have a garbage collector, this is handled based on the lifetime and scope of that value
Rust will automatically drop the value when its scope ends (i.e. ending the lifetime of that value)

When managing memory manually, you need to think about when that value will be no longer needed so you can drop the value and free the memory correctly.

Moving, copying and cloning values

The difference between a copy operation and a clone operation is that a copy is a simple, implicit bit-by-bit duplicate of a value, while a clone is always explicit (and sometimes expensive, i.e. resource intensive) and may run arbitrary code to create a duplicate of the value.
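
A short sketch of both operations side by side. The `Point` struct and `copy_and_clone` function are hypothetical and exist only for this illustration. `Point` can be `Copy` because all of its fields are `Copy`, whereas `String` owns heap memory and can only be cloned.

```rust
/// Hypothetical type whose fields are all `Copy`, so it can derive `Copy`.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Point {
    x: i32,
    y: i32,
}

fn copy_and_clone() -> (Point, Point, String, String) {
    let a = Point { x: 1, y: 2 };
    // Implicit copy: `a` is duplicated bit by bit and stays usable.
    let b = a;

    let s = String::from("hi");
    // Explicit clone: allocates a new heap buffer and copies the contents.
    let t = s.clone();

    (a, b, s, t)
}

fn main() {
    let (a, b, s, t) = copy_and_clone();
    println!("{:?} {:?} {} {}", a, b, s, t);
}
```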

Say you have a struct that defines a person's name and age:

struct Person {
  name: String,
  age: i32
}

Now, if for some reason we want to create a copy of this struct (for, say, John), we might be tempted to write:

let john = Person {
  name: String::from("John"),
  age: 25
};
let jonathan = john;

This, however, is wrong because we've not copied the value of john, we've moved it. The most noticeable difference is the "use of moved value" error (E0382) the next time you try to use the john variable.
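
The move can be seen in a runnable form below. This is a sketch built on the `Person` struct from above; the `move_person` function is a hypothetical wrapper, and the line that would fail to compile is left commented out.

```rust
struct Person {
    name: String,
    age: i32,
}

/// Moves a `Person` to a new name and describes it through that name.
fn move_person() -> String {
    let john = Person {
        name: String::from("John"),
        age: 25,
    };

    // Ownership of the value moves from `john` to `jonathan`.
    let jonathan = john;

    // Uncommenting the next line makes the compiler reject the program
    // with error E0382, because `john` no longer owns a value.
    // println!("{}", john.name);

    format!("{} is {}", jonathan.name, jonathan.age)
}

fn main() {
    println!("{}", move_person()); // prints "John is 25"
}
```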

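It is tempting to use the derive macro (more on that later) to implement the Copy trait and its supertrait Clone. That does not compile here: Copy can only be implemented for types whose fields are all Copy, and String is not. Instead, we derive Clone alone and clone john explicitly.

#[derive(Clone)]
struct Person {
  name: String,
  age: i32
}
let john = Person {
  name: String::from("John"),
  age: 25
};
let jonathan = john.clone();

By implementing the Clone trait for Person, we can create jonathan as an independent duplicate of john, and john remains usable afterwards. A struct whose fields are all Copy (for example, one holding only integers) could derive both Copy and Clone, and assignment would then copy it implicitly.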

An example of an explicit clone is String::clone:

let alex = String::from("Alex");
let alexander = alex.clone();

Note: You can derive Clone or implement the Clone trait manually to control how a duplicate of a value is created.

Borrowing and referencing values

Borrowing allows us to have one or more references to a single value without breaking the “single owner” concept. A reference is an address that points to a value without owning it, and it is often passed to a function as an argument. When we borrow a value, we reference its memory address using the & operator.

Note: a value cannot be owned by multiple names at once, and it cannot have more than one mutable borrow at a time (or a mutable borrow alongside immutable ones).
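
The rules in the note can be sketched as follows. The function name `borrow_rules` is hypothetical. Several immutable references coexist without trouble, and a mutable reference is allowed once the immutable ones are no longer used.

```rust
/// Shows shared borrows followed by a single mutable borrow.
fn borrow_rules() -> (usize, String) {
    let mut s = String::from("Hello");

    // Any number of immutable references may exist at the same time.
    let r1 = &s;
    let r2 = &s;
    let total = r1.len() + r2.len();

    // `r1` and `r2` are not used after this point, so their borrows have
    // ended and exactly one mutable reference may now be taken.
    let m = &mut s;
    m.push_str(" World!");

    // Taking a second `&mut s` while `m` is still in use would not compile.
    (total, s)
}

fn main() {
    let (total, s) = borrow_rules();
    println!("{} {}", total, s); // prints "10 Hello World!"
}
```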

Immutable references

The following code is a demonstration of a function that borrows a String as a reference and returns its calculated length:

fn main() {
  // Create a string (string 1)
  let s1 = String::from("Hello World!");

  // Calculate the length of the string and put it in a variable
  // Note: pass the string as a reference (i.e. "calculate_length" is borrowing "s1")
  let len = calculate_length(&s1);

  // Print the length of the string
  println!("The length of '{}' is {}.", s1, len);
}

// Calculate the length of a borrowed string
fn calculate_length(s: &String) -> usize {
  s.len()
}

Mutable references

The following code is a demonstration of a function that borrows a mutable integer and increments it by one:

fn main() {
  // Create a mutable variable with a value of 1
  let mut x = 1;

  // Pass the variable as a mutable reference to the "increment" function
  increment(&mut x);

  // Print the new value of "x" (2)
  println!("{}", x);
}

// Take a mutable reference to a number and increment it by 1
fn increment(num: &mut i32) {
  // Dereference "num" to change the value it points to
  *num += 1;
}

Notice how, when we increment num, we write the new value through the dereferenced name (*num). In doing so, we're not changing a local copy but the value behind the passed reference directly.

Creating owned copies of borrowed values

Clone works only for going from &T to T. The std::borrow::ToOwned trait generalizes Clone to construct owned data from any borrow of a given type.
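
As a sketch of that generalization, the snippet below calls `to_owned` on two borrowed types whose owned form is a different type: a slice `&[i32]` becomes a `Vec<i32>`, and a `&Path` becomes a `PathBuf`. The function name `owned_from_borrows` and the example path are made up for illustration.

```rust
use std::path::{Path, PathBuf};

/// Builds owned values from borrows whose owned form is a different type.
fn owned_from_borrows() -> (Vec<i32>, PathBuf) {
    // A borrowed slice of integers.
    let numbers: &[i32] = &[1, 2, 3];
    // `to_owned` on `[i32]` produces a `Vec<i32>`.
    let owned_numbers: Vec<i32> = numbers.to_owned();

    // A borrowed path.
    let path: &Path = Path::new("src/main.rs");
    // `to_owned` on `Path` produces a `PathBuf`.
    let owned_path: PathBuf = path.to_owned();

    (owned_numbers, owned_path)
}

fn main() {
    let (numbers, path) = owned_from_borrows();
    println!("{:?} {}", numbers, path.display());
}
```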

An example use case for the ToOwned trait is when you have a borrow such as &str, which Clone cannot turn into an owned String, and you want an owned copy of the data.

// Borrow the name "James" as a string slice
let james: &str = "James";

// Create an owned String from `james`
let jameson: String = james.to_owned();

Like the Clone trait, you can also implement the ToOwned trait manually.

References

Geeks for Geeks - Memory leak in C++ and How to avoid it?
Geeks for Geeks - Core Dump (Segmentation fault) in C/C++
Geeks for Geeks - new and delete Operators in C++ For Dynamic Memory
Geeks for Geeks - Introduction to Stack – Data Structure and Algorithm Tutorials
Geeks for Geeks - Memory Layout of C Programs
Rust Docs - Function std::mem::drop
Rust Docs - Primitive Type pointer
Rust Docs - Struct std::string::String
Rust Docs - Trait std::marker::Copy
Rust Docs - Trait std::clone::Clone
Rust Docs - Defining and Instantiating Structs
Rust Docs - Module std::clone
Rust Docs - Operator expressions
Rust Docs - Trait std::borrow::ToOwned
Rust Lang Book - Scalar Types
Rust Lang Book - Error code E0382
StackOverflow - Is it possible in Rust to delete an object before the end of scope?
Wikipedia - Garbage collection (Computer Science)
Wikipedia - Operating system

Note: References for C/C++ and articles on Wikipedia provide general context and not Rust programming information.