Patterns, Predictions & Puzzles

Rust notes

By LG 2025-06-08

The meaning of "!" in Rust

In Rust, the "!" in println! signifies that we are calling a macro rather than a regular function.

Here's a breakdown of what that means:

  1. Macros vs. Functions: While they can look similar, the key difference is that macros are expanded at compile-time, meaning they write code that is then compiled with the rest of your program. Functions, on the other hand, are called at runtime. The ! is a special syntax in Rust to make it clear that a macro is being invoked.

  2. Why println! is a Macro: println! needs to be a macro to provide its powerful features, primarily for handling a variable number of arguments. A Rust function must declare a specific number and type of arguments it accepts. However, println! can take a format string and a varying number of additional arguments to print, like so:

let x = 5;
let y = "hello";
println!("The value of x is {} and y is '{}'", x, y);

A regular function couldn't be defined to handle this flexibility, but a macro can be programmed to parse the format string and generate the appropriate code to handle the arguments at compile time.

In short, the "!" is a visual cue in Rust's syntax to let you know you're using a macro, which is a piece of code that generates other code.
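
To make the variable-arguments point concrete, here is a minimal sketch of a user-defined macro. The name sum_all is hypothetical and exists only for this illustration; it is not part of the standard library. It shows how a single macro definition can accept any number of arguments and expand into ordinary code at compile time.

```rust
// Hypothetical macro for illustration: sums any number of expressions.
macro_rules! sum_all {
    // No arguments at all.
    () => { 0 };
    // One or more comma-separated expressions, with an optional trailing comma.
    ($first:expr $(, $rest:expr)* $(,)?) => {
        $first $(+ $rest)*
    };
}

fn main() {
    // The same macro accepts one, two, or five arguments.
    println!("{}", sum_all!(1)); // 1
    println!("{}", sum_all!(1, 2)); // 3
    println!("{}", sum_all!(1, 2, 3, 4, 5)); // 15
}
```

A call such as sum_all!(1, 2, 3) is rewritten into 1 + 2 + 3 before the program is compiled, which is the same general idea println! relies on.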


The meaning of ":?" in Rust

The :? inside the braces of a println! call, as in println!("{:?}", value), is a format specifier that tells Rust to print the value using its debug format.

Here’s a more detailed explanation:

  1. What is Formatting?: In Rust, when we use macros like println! or format!, the {} curly braces are placeholders for variables. We can add extra information inside the braces to change how the variable is printed.

  2. {} vs. {:?}:

    • {} (Display Trait): This is the standard, "pretty" format, intended for user-facing output. For a type to be printable with {}, it must implement the Display trait. Simple types like numbers (i32, f64) and strings (&str) have this by default.
    • {:?} (Debug Trait): This is the "debug" format, intended for developer-facing output. It provides a more detailed, programmer-friendly representation of the value. For a type to be printable with {:?}, it must implement the Debug trait.
  3. Why is {:?} Needed Here?: The function array_and_vec() likely returns a compound type like an array or a Vec (a vector). These types don't have a single, simple value to display to an end-user, so they don't implement the Display trait by default. You can't print a whole vector with {}.

    However, they do implement the Debug trait, which knows how to print all the elements within the collection for debugging purposes.
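
A short sketch of this behaviour, using only the standard library. The helper name debug_string is an assumption made for this example.

```rust
// Hypothetical helper: formats a slice of i32 using the Debug trait.
fn debug_string(values: &[i32]) -> String {
    format!("{:?}", values)
}

fn main() {
    let v = vec![1, 2, 3];
    let a = [4, 5];

    // println!("{}", v); // compile error: Vec<i32> does not implement Display

    println!("{:?}", v); // [1, 2, 3]
    println!("{:?}", a); // [4, 5]
    println!("{}", debug_string(&v)); // [1, 2, 3]
}
```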

Example with a Struct:

If you create your own struct, it won't be printable by default. You can automatically make it printable for debugging by adding #[derive(Debug)].

// This annotation automatically implements the Debug trait for our struct.
#[derive(Debug)]
struct Point {
    x: i32,
    y: i32,
}

fn main() {
    let p = Point { x: 10, y: 20 };

    // This would cause a compile error because Point doesn't implement Display.
    // println!("My point is {}", p);

    // This works perfectly because we derived the Debug trait.
    // It will print a developer-friendly representation of the struct.
    println!("My point is {:?}", p);
    // Output: My point is Point { x: 10, y: 20 }
}

Bonus Tip: Pretty-Printing

You can make the debug output even more readable for complex types by using :#?. This adds newlines and indentation.

#[derive(Debug)]
struct Point {
    x: i32,
    y: i32,
}

fn main() {
    let p = Point { x: 10, y: 20 };
    println!("My point, pretty-printed:\n{:#?}", p);
}

Output:

My point, pretty-printed:
Point {
    x: 10,
    y: 20,
}
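
If we do want {} to work for our own type, one option is to implement the Display trait by hand. The sketch below is one possible way to do it; the "(x, y)" layout is an arbitrary choice made for this example, not a convention Rust imposes.

```rust
use std::fmt;

#[derive(Debug)]
struct Point {
    x: i32,
    y: i32,
}

// Hand-written Display implementation; the output layout is our own choice.
impl fmt::Display for Point {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "({}, {})", self.x, self.y)
    }
}

fn main() {
    let p = Point { x: 10, y: 20 };
    println!("Display: {}", p); // Display: (10, 20)
    println!("Debug:   {:?}", p); // Debug:   Point { x: 10, y: 20 }
}
```

Display cannot be derived, so it always has to be written out like this, whereas Debug usually comes from #[derive(Debug)].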

The meaning of "&" in Rust

The & in the function signature fn vec_loop(input: &[i32]) signifies a reference, but the combination &[i32] as a whole means the function accepts a slice of i32 integers.

Let's break down what that means.

1. The & Symbol: Borrowing

In Rust, the & symbol is used to create a reference. A reference allows us to "borrow" access to data without taking ownership of it.

  • Without & (Ownership Transfer): If the type were just Vec<i32>, the function would take ownership of the vector passed to it. The original owner in the calling code could no longer use that vector after the function call.
  • With & (Borrowing): By using a reference, the function vec_loop only borrows the data. The calling code retains ownership and can continue to use the data after the function finishes, without having to clone the collection first just to keep its own copy.
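
The difference between the two bullets can be sketched as follows. The function names sum_owned and sum_borrowed are hypothetical and only serve this illustration.

```rust
// Takes ownership: the caller's Vec is moved into `data`.
fn sum_owned(data: Vec<i32>) -> i32 {
    data.iter().sum()
}

// Borrows: the caller keeps ownership of the data.
fn sum_borrowed(data: &[i32]) -> i32 {
    data.iter().sum()
}

fn main() {
    let numbers = vec![1, 2, 3];

    let a = sum_borrowed(&numbers);
    println!("borrowed sum = {}, still usable: {:?}", a, numbers);

    let b = sum_owned(numbers);
    println!("owned sum = {}", b);

    // println!("{:?}", numbers); // compile error: `numbers` was moved into sum_owned
}
```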

2. The [i32] part: The Slice

The [i32] part of the type is not an array (which would be [i32; N] with a fixed size N). Instead, [i32] represents a "slice"—a dynamically sized view into a contiguous sequence of i32s.

A slice is powerful because it's an abstraction. It doesn't care if the underlying data comes from a Vec<i32>, an array [i32; 5], or even part of another slice. It just represents "a sequence of integers in memory."

Putting It Together: &[i32] (A Borrowed Slice)

When we combine them, &[i32] means "a borrowed reference to a slice of i32s". This is the idiomatic and most flexible way to accept a sequence of items in Rust.

Here’s why it's so useful in the vec_loop function:

  1. Efficiency: The function avoids copying the data by just borrowing it.
  2. Flexibility: The function is more generic. It can now accept input from multiple sources without changing the signature.

Code Example:

Notice how the vec_loop function can accept a reference to a Vec and a reference to an array, and even a slice of an array, without any changes.

fn vec_loop(input: &[i32]) -> Vec<i32> {
    let mut output = Vec::new();
    for item in input {
        // We're just doubling the items for this example.
        output.push(item * 2);
    }
    output
}

fn main() {
    // 1. Call it with a Vector
    let my_vec = vec![1, 2, 3];
    // Rust automatically converts &Vec<i32> into &[i32] (this is called "deref coercion")
    let result_from_vec = vec_loop(&my_vec);
    println!("From Vec: {:?}", result_from_vec); // Prints "From Vec: [2, 4, 6]"
    println!("Original vec is still usable: {:?}", my_vec); // Prints "[1, 2, 3]"

    // 2. Call it with an Array
    let my_array = [10, 20, 30];
    let result_from_array = vec_loop(&my_array);
    println!("From Array: {:?}", result_from_array); // Prints "From Array: [20, 40, 60]"

    // 3. Call it with a slice of an array
    let result_from_slice = vec_loop(&my_array[1..3]); // Pass only [20, 30]
    println!("From slice of Array: {:?}", result_from_slice); // Prints "From slice of Array: [40, 60]"
}

Summary Table

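| Parameter Type | Meaning                  | Can it accept a Vec? | Can it accept an array? | Ownership        |
|----------------|--------------------------|----------------------|-------------------------|------------------|
| Vec<i32>       | An owned vector.         | Yes                  | No                      | Takes ownership. |
| &Vec<i32>      | A reference to a vector. | Yes                  | No                      | Borrows.         |
| &[i32]         | A reference to a slice.  | Yes                  | Yes                     | Borrows.         |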

The meaning of "|" in Rust (anonymous function)

The | characters are used to define a closure.

A closure is a short, anonymous (unnamed) function that can be passed around as a variable or as an argument to other functions.

Let's break down: input.iter().map(|element| element + 1).collect()

  1. .iter(): This creates an iterator from the input slice. An iterator is a special object that lets you process a sequence of items one by one.

  2. .map(...): This is a method on the iterator that transforms each item. It takes a closure as its argument, and it calls that closure on every single element the iterator produces.

  3. |element| element + 1: This is the closure itself.

    • The |...| part: This defines the parameters for the anonymous function. In this case, |element| means "this closure accepts one argument, and we will call it element inside the closure's body."
    • The element + 1 part: This is the body of the closure. It's the code that gets executed for each element. It takes the input element and returns its value plus one.

So, for each number that .iter() provides from the input slice, the .map() function calls our closure. The number is passed in as element, and the closure returns element + 1.
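
Put into a complete function, the expression might look like the sketch below. The function name vec_map is an assumption, since the notes only quote the expression itself. The second half of main shows a closure stored in a variable that uses a value from the surrounding scope, which a plain function cannot do.

```rust
// Assumed shape of the surrounding function; the name is hypothetical.
fn vec_map(input: &[i32]) -> Vec<i32> {
    input.iter().map(|element| element + 1).collect()
}

fn main() {
    let numbers = [1, 2, 3];
    println!("{:?}", vec_map(&numbers)); // [2, 3, 4]

    // A closure can be stored in a variable and can capture `offset`
    // from the enclosing scope.
    let offset = 10;
    let add_offset = |n: i32| n + offset;
    println!("{}", add_offset(5)); // 15
}
```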

A Simple Analogy

Imagine we have a list of numbers and we tell a friend: "For every number I give you, add one to it and tell me the result."

  • The list of numbers is input.iter().
  • Our instruction, "add one to it," is the closure |element| element + 1.
  • The process of applying that instruction to each number is .map().
  • Finally, .collect() is the gathering of our friend's answers into a new list.

How it could look with a regular function

To make it clearer, we could achieve the same result by defining a separate, named function and passing it to map.

// A named function that does the same thing as the closure.
fn add_one(n: &i32) -> i32 {
    n + 1
}

fn vec_map_example_with_function(input: &[i32]) -> Vec<i32> {
    // Note: iter() yields references (`&i32`), so our function
    // must accept a reference.
    input.iter().map(add_one).collect()
}

fn main() {
    let numbers = vec![1, 2, 3];
    let result = vec_map_example_with_function(&numbers);
    println!("{:?}", result); // Prints "[2, 3, 4]"
}

The closure |element| element + 1 is just a much more concise, inline way of writing the add_one function right where you need it.


The meaning of "<T>" type in Rust

Let's take the following example: fn largest<T>(list: &[T]) -> &T.

High-Level Meaning

In plain English, this signature means: "I am defining a function named largest that works on a slice of any type, as long as all elements in the slice are of the same type. It doesn't take ownership of the slice, and it will return a reference to the single largest element found within that slice."

Detailed Breakdown

1. fn largest

This is the standard way to start a function definition. It declares a function named largest.

2. <T> (The Generic Type Parameter)

This is the core of what makes the function "generic."

  • T is a placeholder for a specific, concrete type. Think of it as a variable for a data type. It's conventional to use T for "Type."
  • By using <T>, we are telling the Rust compiler: "This function can operate on any type T we want, whether it's i32, f64, char, or a struct we made ourselves. The compiler will figure out what T is when the function is called."
  • This allows us to write one function that works for many different types, avoiding code duplication (e.g., not having to write largest_i32, largest_char, etc.).
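
As a smaller warm-up before largest, here is a sketch of a generic function that needs no comparison at all. The name swap_pair is hypothetical. The same definition is reused for three different concrete types, and the compiler infers T at each call site.

```rust
// Hypothetical generic helper: swaps the two halves of a pair.
fn swap_pair<T>(pair: (T, T)) -> (T, T) {
    let (a, b) = pair;
    (b, a)
}

fn main() {
    let ints = swap_pair((1, 2)); // T = i32
    let chars = swap_pair(('a', 'b')); // T = char
    let words = swap_pair(("hi", "there")); // T = &str
    println!("{:?} {:?} {:?}", ints, chars, words);
}
```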

3. list: &[T] (The Parameter)

This defines the function's single input parameter, named list.

  • [T]: This is a "slice" of elements of our generic type T. A slice is a view into a block of memory, like a part of an array or a Vec.
  • &: This is the "borrow" symbol. It means the function is taking a reference to the slice. It does not take ownership of the data.
  • Putting it together (&[T]): The function accepts a borrowed slice of items of some type T. This is highly efficient and flexible because:
    • Efficient: The program doesn't need to copy the entire list of data into the function. It just passes a pointer to the original data.
    • Flexible: It can accept a reference to a Vec<T>, an array [T; N], or a part of either.

4. -> &T (The Return Type)

This specifies what the function will return.

  • ->: The arrow syntax simply indicates that what follows is the return type.
  • &T: The function returns a reference to a value of type T. This is very important. It means the function is not returning a new copy of the largest value. Instead, it's returning a pointer that points directly to the largest element's location within the original list that was passed in.

The Missing Piece: The Trait Bound

As written, that signature is incomplete: any body that compares elements with > will not compile. The compiler will ask: "You told me T could be any type, but how do I know I can compare two values of type T to see which one is larger?"

To fix this, we need to add a trait bound to guarantee that T is a type that can be ordered.

The corrected, working signature looks like this:

fn largest<T: PartialOrd>(list: &[T]) -> &T

  • : PartialOrd: This is the "trait bound." It constrains the generic type T.
  • Meaning: It says, "T can be any type, as long as it implements the PartialOrd trait." The PartialOrd trait is what gives types the ability to be compared with operators like > and <. Standard types like i32 and char already implement this.

Complete Example

Here is how the full function would look and be used:

// The complete, working signature with the trait bound.
// This function requires that the type T has an ordering.
fn largest<T: PartialOrd>(list: &[T]) -> &T {
    // Start from the first element. Note: this indexing panics on an empty slice.
    let mut largest = &list[0];

    for item in list {
        if item > largest {
            largest = item;
        }
    }

    largest
}

fn main() {
    // Example 1: Use it with numbers
    let number_list = vec![34, 50, 25, 100, 65];
    // Here, the compiler sees you passed a `&Vec<i32>`, so it sets T = i32.
    let result = largest(&number_list);
    println!("The largest number is {}", result); // Prints "The largest number is 100"

    // Example 2: Use it with characters
    let char_list = vec!['y', 'm', 'c', 'a'];
    // Here, the compiler sets T = char.
    let result = largest(&char_list);
    println!("The largest char is {}", result); // Prints "The largest char is y"
}

Summary Table

| Part of Signature | Meaning                                                                       |
|-------------------|-------------------------------------------------------------------------------|
| fn largest        | Declares a function named largest.                                            |
| <T>               | Makes the function generic; T is a placeholder for any type.                  |
| : PartialOrd      | (Required for it to work) A trait bound, ensuring T can be compared.          |
| list: &[T]        | It takes one argument, list, which is a borrowed slice of type T.             |
| -> &T             | It returns a reference to a value of type T from within the original slice.   |

The meaning of "Option" in a Rust function signature

Example: fn maybe_icecream(hour_of_day: u16) -> Option<u16>

The Option type is one of the most important and fundamental concepts in Rust for writing safe and robust code.

In simple terms, Option is an enum that represents the possibility of a value being either present or absent.

Think of it as a box that could either contain one item or be empty. We can't know for sure until we check.

The Problem Option Solves: The End of null

In many other programming languages (like Java, C#, Python, JavaScript), there is a concept of null or None. A null value means "no value here." This often leads to runtime errors.

Rust solves this problem by having no null references in safe code; a value that may be absent is expressed with the Option enum instead.

The Two States of Option

The Option enum is defined like this:

enum Option<T> {
    Some(T), // A value of type T is present.
    None,    // No value is present.
}

  1. Some(T): This variant means the "box" contains a value. The T is a generic type parameter, meaning Option can hold any type of value. For example, Some(5) is an Option<i32>, and Some("hello") is an Option<&str>.

  2. None: This variant means the "box" is empty. It's the explicit way of saying "there is no value."

Applying This to the Function Signature

Let's look at the specific function: fn maybe_icecream(hour_of_day: u16) -> Option<u16>

  • Function Name: maybe_icecream is well-named. It hints that you might not always get ice cream.
  • Input: hour_of_day: u16 is the time of day (as a number).
  • Return Type: -> Option<u16> is the crucial part. This signature makes a promise to the compiler:
    • "This function will always return an Option<u16>."
    • If it returns Some(u16), it means "Yes, you get ice cream," and the value inside might represent the number of scoops.
    • If it returns None, it means "Sorry, no ice cream for you."

This is powerful because the compiler forces whoever calls this function to handle both possibilities (Some and None).

How to Implement and Use It

Here’s a possible implementation and how we would use the result:

// Let's say we can only have ice cream between 8 PM (20) and 10 PM (22).
fn maybe_icecream(hour_of_day: u16) -> Option<u16> {
    if hour_of_day >= 20 && hour_of_day <= 22 {
        // It's a valid time! Return Some value.
        // Let's say we get 2 scoops.
        Some(2)
    } else {
        // It's not time for ice cream. Return None.
        None
    }
}

fn main() {
    // Let's check at 9 PM (21:00)
    let nine_pm_result = maybe_icecream(21);

    // To use the Option, we must check what's inside.
    // An idiomatic way to do this is with a 'match' expression.
    match nine_pm_result {
        Some(scoops) => {
            println!("Yay! I get {} scoops of ice cream!", scoops);
        },
        None => {
            println!("Aww, no ice cream for me.");
        }
    }

    // Now let's check at 10 AM (10:00)
    let ten_am_result = maybe_icecream(10);
    match ten_am_result {
        Some(scoops) => {
            println!("Yay! I get {} scoops of ice cream!", scoops);
        },
        None => {
            println!("Aww, no ice cream for me."); // This line will be printed.
        }
    }
}
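
match is not the only way to consume an Option. The sketch below reuses the same ice cream rule and shows three other standard-library tools: if let, unwrap_or, and map. The helper scoops_or_zero is a hypothetical name added for this example.

```rust
// Same rule as above: ice cream only between 20:00 and 22:00.
fn maybe_icecream(hour_of_day: u16) -> Option<u16> {
    if (20..=22).contains(&hour_of_day) {
        Some(2)
    } else {
        None
    }
}

// Hypothetical helper: falls back to 0 scoops when there is no ice cream.
fn scoops_or_zero(hour_of_day: u16) -> u16 {
    maybe_icecream(hour_of_day).unwrap_or(0)
}

fn main() {
    // `if let` handles only the Some case and ignores None.
    if let Some(scoops) = maybe_icecream(21) {
        println!("Yay! {} scoops.", scoops);
    }

    // `unwrap_or` supplies a default value for None.
    println!("At 10:00 we get {} scoops.", scoops_or_zero(10));

    // `map` transforms the inner value and leaves None untouched.
    let doubled = maybe_icecream(21).map(|s| s * 2);
    println!("{:?}", doubled); // Some(4)
}
```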

Summary

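| Concept   | Meaning                                                                                                                                                  |
|-----------|----------------------------------------------------------------------------------------------------------------------------------------------------------|
| Option<T> | An enum representing a value that could be present or absent.                                                                                            |
| Some(T)   | The variant for a present value.                                                                                                                         |
| None      | The variant for an absent value.                                                                                                                         |
| Why?      | To eliminate null-related bugs by making the possibility of absence an explicit part of the type system that the compiler forces you to handle.          |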

The meaning of "?" in Rust

The ? in a function is the try operator, also known as the question mark operator. It is a powerful piece of syntactic sugar for propagating errors.

In short, the ? unwraps a Result or Option, and if it's an error (Err) or None, it immediately returns from the current function.

Let's break it down in the context of this function:

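fn abs_i64(inputs: &[Series]) -> PolarsResult<Series> {
    let s = &inputs[0];
    let ca: &Int64Chunked = s.i64()?;
    // Assumed step (the snippet as quoted never defined `out`): apply abs to
    // each value, passing nulls (None) through unchanged.
    let out: Int64Chunked = ca.apply(|opt_v: Option<i64>| opt_v.map(|v: i64| v.abs()));
    Ok(out.into_series())
}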

1. The Context: A Function That Can Fail

First, the function's signature: fn abs_i64(inputs: &[Series]) -> PolarsResult<Series>

The key part is the return type: PolarsResult<Series>. In Polars, PolarsResult<T> is an alias for Rust's standard Result<T, PolarsError>. This signature tells us: "The abs_i64 function will either succeed and return an Ok(Series), or it will fail and return an Err(PolarsError)."

Because this function is explicitly designed to return a Result, we are allowed to use the ? operator inside it.

2. The Line with the ? Operator

let ca: &Int64Chunked = s.i64()?;

Here's what's happening step-by-step:

  1. s.i64() is called.

    • s is a Polars Series, which is a column of data. A Series can hold data of any type (integers, floats, strings, etc.).
    • The .i64() method tries to view the series as an Int64Chunked, that is, as a column of i64 (64-bit integer) values.
    • This attempt can fail! What if the Series actually contains strings? We can't treat a column of "hello" as a column of integers.
    • Therefore, .i64() does not return a plain &Int64Chunked. It returns a Result: specifically PolarsResult<&Int64Chunked>.
  2. The ? operator is applied to the Result. The ? checks the Result that s.i64() returned and does one of two things:

    • Success Case: If s.i64() returns Ok(&Int64Chunked), the ? operator unwraps it, extracting the &Int64Chunked value from inside the Ok. This extracted value is then assigned to the variable ca, and the function continues executing normally.

    • Failure Case: If s.i64() returns Err(PolarsError), the ? operator immediately stops the execution of the abs_i64 function and returns that Err(PolarsError) to whatever code called abs_i64. The code after the ? is never reached.

What the Code Looks Like Without the ?

To fully appreciate the ? operator, it's helpful to see the code you would have to write without it. The line let ca = s.i64()?; is a shortcut for this match statement:

let ca: &Int64Chunked = match s.i64() {
    Ok(chunked_array) => {
        // The result was Ok, so we can proceed with the value.
        chunked_array
    },
    Err(error) => {
        // The result was an error. We must stop and return this error
        // from the `abs_i64` function. The `.into()` is used to make sure
        // the error type is compatible.
        return Err(error.into());
    }
};

The ? operator is a much more concise and readable way to handle errors that we simply want to pass up the call stack.
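
The .into() in the expanded version is what lets ? convert between error types. The following sketch uses only the standard library; ConfigError and parse_port are hypothetical names invented for this example and have nothing to do with Polars.

```rust
use std::num::ParseIntError;

// Hypothetical error type for this sketch.
#[derive(Debug, PartialEq)]
enum ConfigError {
    BadNumber(String),
}

// `?` converts the error through `From`, so this impl is what lets a
// ParseIntError turn into a ConfigError automatically.
impl From<ParseIntError> for ConfigError {
    fn from(e: ParseIntError) -> Self {
        ConfigError::BadNumber(e.to_string())
    }
}

fn parse_port(text: &str) -> Result<u16, ConfigError> {
    // On failure, `?` returns early with the converted error.
    let port: u16 = text.trim().parse()?;
    Ok(port)
}

fn main() {
    println!("{:?}", parse_port("8080")); // Ok(8080)
    println!("{:?}", parse_port("not a number")); // an Err(BadNumber(..)) value
}
```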

Summary Table

The ? operator works on both Result and Option types.

| Expression type | Value      | What the ? operator does                              |
|-----------------|------------|-------------------------------------------------------|
| Result<T, E>    | Ok(value)  | Extracts value of type T.                             |
| Result<T, E>    | Err(error) | Returns Err(error.into()) from the current function.  |
| Option<T>       | Some(value)| Extracts value of type T.                             |
| Option<T>       | None       | Returns None from the current function.               |
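
The Option half of the summary can be sketched with the standard library alone. The function name second_word_initial is hypothetical.

```rust
// Hypothetical helper: first character of the second word, if there is one.
fn second_word_initial(text: &str) -> Option<char> {
    // Returns None from the function if there is no second word.
    let word = text.split_whitespace().nth(1)?;
    // Returns None from the function if the word has no characters.
    let initial = word.chars().next()?;
    Some(initial)
}

fn main() {
    println!("{:?}", second_word_initial("hello world")); // Some('w')
    println!("{:?}", second_word_initial("hello")); // None
}
```

Because the function returns Option<char>, using ? on an Option inside it is allowed, in the same way that ? on a Result requires a function returning Result.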

Trait bounds in Rust

Example function:

fn impl_abs_numeric<T>(ca: &ChunkedArray<T>) -> ChunkedArray<T>
where
    T: PolarsNumericType,
    T::Native: Signed,
{
    // NOTE: there's a faster way of implementing `abs`, which we'll
    // cover in section 7.
    ca.apply(|opt_v: Option<T::Native>| opt_v.map(|v: T::Native| v.abs()))
}

High-Level Purpose

In simple terms, this function, impl_abs_numeric, calculates the absolute value for every element in a Polars ChunkedArray (which is the core data structure for a column/Series). It is designed to work on any numeric type that can have a sign (like i32, i64, f64) and to correctly handle any null values in the data.

Detailed Breakdown

1. The Function Signature

fn impl_abs_numeric<T>(ca: &ChunkedArray<T>) -> ChunkedArray<T>

  • fn impl_abs_numeric<T>: This declares a generic function named impl_abs_numeric. The <T> means it can work with multiple types. T is a placeholder for a specific Polars data type (like Int32Type, Float64Type, etc.).
  • ca: &ChunkedArray<T>: The function takes one argument, ca, which is a reference (&) to a ChunkedArray of type T. A ChunkedArray is Polars' internal representation of a column of data. Using a reference is efficient because it avoids copying the entire column.
  • -> ChunkedArray<T>: The function returns a brand new ChunkedArray of the same type T. This new array will contain the results of the calculation.

2. The where Clause (Trait Bounds)

This is the most complex but also the most powerful part of the signature. It puts constraints on what the generic type T is allowed to be.

where
    T: PolarsNumericType,
    T::Native: Signed,

  • T: PolarsNumericType: This is the first constraint. It says, "T can be any type, as long as it implements the PolarsNumericType trait." This trait is used to mark all of Polars' numeric types, like Int32Type, UInt64Type, Float32Type, etc. This ensures the function can't be accidentally called on a column of strings.

  • T::Native: Signed: This is the second, more interesting constraint.

    • T::Native: This is an associated type. For a PolarsNumericType, Native is the actual, underlying Rust primitive type. For example:
      • If T is Int32Type, then T::Native is i32.
      • If T is Float64Type, then T::Native is f64.
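    • : Signed: This constrains the associated type T::Native. The Signed trait (which appears to come from the num-traits crate rather than the standard library) is implemented for Rust's signed integer types (i8, i16, i32, i64, i128, isize) and for the floating-point types (f32, f64).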
    • Putting it together: This constraint says, "The underlying Rust type for our column must be a signed number." This is crucial because the concept of "absolute value" (.abs()) only makes sense for numbers that can be negative. The compiler will now prevent you from calling this function on an unsigned integer column (like u32), which is correct behavior.
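
The same two-step pattern, one bound on the type parameter and a second bound on its associated type, can be tried without Polars. In the sketch below, I: Iterator plays the role of T: PolarsNumericType and I::Item: Display plays the role of T::Native: Signed. The function name join_all is hypothetical.

```rust
use std::fmt::Display;

// Hypothetical helper: joins the items of any iterator into one string.
fn join_all<I>(items: I) -> String
where
    I: Iterator,      // bound on the type parameter itself
    I::Item: Display, // bound on its associated type
{
    let mut out = String::new();
    for (i, item) in items.enumerate() {
        if i > 0 {
            out.push_str(", ");
        }
        out.push_str(&item.to_string());
    }
    out
}

fn main() {
    println!("{}", join_all([1, 2, 3].iter())); // 1, 2, 3
    println!("{}", join_all(vec!["a", "b"].into_iter())); // a, b
}
```

Without the I::Item: Display line, the call to item.to_string() would be rejected, just as v.abs() would be rejected without T::Native: Signed.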

3. The Function Body

ca.apply(|opt_v: Option<T::Native>| opt_v.map(|v: T::Native| v.abs()))

  • ca.apply(...): The .apply() method on a ChunkedArray is a powerful tool. It iterates through every single value in the column and applies a function (a closure) to it. It then collects the results into a new ChunkedArray.

  • |opt_v: Option<T::Native>| ...: This is the closure that .apply() will execute for each value.

    • opt_v: This is the argument to our closure. It represents one value from the column.
    • Option<T::Native>: Notice the type! The value is wrapped in an Option. This is how Polars handles nulls. If the value in the column is valid (e.g., -5), opt_v will be Some(-5). If the value is null, opt_v will be None.
  • opt_v.map(|v: T::Native| v.abs()): This is the core logic that handles the Option.

    • .map(): This is a standard method on Rust's Option type. It's a clean way to apply a function to the value inside an Option, without messy if/else checks.
    • How it works:
      1. If opt_v is Some(v) (e.g., Some(-5)), the .map() method will execute the inner closure |v| v.abs() on the contained value -5. The closure calculates (-5).abs(), which is 5, and .map() re-wraps it, returning Some(5).
      2. If opt_v is None, the .map() method does nothing and immediately returns None.

This single line elegantly handles all null values by passing them through, while applying the .abs() function only to the valid, non-null data.

Summary Walkthrough

Let's imagine ca is a column of i32s: [10, -20, null, 30].

  1. impl_abs_numeric is called. The compiler confirms that T (Int32Type) is a PolarsNumericType and its Native type (i32) is Signed. Everything is valid.
  2. ca.apply starts iterating.
  3. First element is 10: opt_v is Some(10). .map runs |v| v.abs() on 10. Result: Some(10).
  4. Second element is -20: opt_v is Some(-20). .map runs |v| v.abs() on -20. Result: Some(20).
  5. Third element is null: opt_v is None. .map does nothing. Result: None.
  6. Fourth element is 30: opt_v is Some(30). .map runs |v| v.abs() on 30. Result: Some(30).
  7. apply collects these results (Some(10), Some(20), None, Some(30)) and constructs a new ChunkedArray, which is [10, 20, null, 30]. This is the return value.
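
The walkthrough can be reproduced without Polars by letting a slice of Option<i32> stand in for the column, with None playing the role of null. This is only a stand-in sketch under that assumption: abs_column is a hypothetical name, and a real ChunkedArray is not a Vec.

```rust
// Hypothetical stand-in for a nullable column: None plays the role of null.
fn abs_column(column: &[Option<i32>]) -> Vec<Option<i32>> {
    column
        .iter()
        // Option::map applies abs to Some values and passes None through.
        .map(|&opt_v| opt_v.map(|v| v.abs()))
        .collect()
}

fn main() {
    let column = [Some(10), Some(-20), None, Some(30)];
    println!("{:?}", abs_column(&column)); // [Some(10), Some(20), None, Some(30)]
}
```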

Difference between ":" and "::"

  • : (Colon) is the type annotator. It is used to say "this variable/parameter/field is of this type".
  • :: (Double Colon) is the path separator. It is used to access items within a namespace, like a module, enum, or type.

They are not interchangeable and have completely different meanings. There is no concept of "type assignment" with ::.


The Colon (:): The Type Annotator

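The single colon (:) most often declares the type of something: a variable, a function parameter, or a struct field. It also appears in trait bounds (T: PartialOrd) and between a field name and its value in a struct literal (Point { x: 10, y: 20 }).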

Think of it as answering the question: "What type is this?"
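
A small sketch of the places where the colon shows up. The User struct and the describe function are hypothetical names used only here.

```rust
// Field types in a struct definition.
struct User {
    name: String,
    age: u8,
}

// Parameter type in a function signature.
fn describe(user: &User) -> String {
    format!("{} is {}", user.name, user.age)
}

fn main() {
    // Variable type in a `let` binding.
    let age: u8 = 30;

    // In a struct literal, the colon separates a field name from its value.
    let user = User { name: String::from("Ada"), age };
    println!("{}", describe(&user)); // Ada is 30
}
```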

The Double Colon (::): The Path Separator

The double colon (::) is a namespace resolver. It allows us to access something that is defined inside something else. (The related ::<> form, used to spell out generic type arguments explicitly, is nicknamed the "turbofish".)

Think of it as answering the question: "Where can I find this?" or as showing a relationship of "in".

1. Accessing Items in a Module: This is the most common use. It's like a directory path for our code.

// "Use the `HashMap` type which is found in the `collections` module,
// which is in the `std` (standard library) crate."
use std::collections::HashMap;

2. Calling Associated Functions (like static methods): These are functions that belong to a type itself, not to a specific instance of it.

// "Call the `new` function that is associated with the `String` type."
let s = String::new();

// "Call the `from` function that is associated with the `Vec` type."
let v = Vec::from([1, 2, 3]);

3. Accessing Enum Variants: Enum variants live inside the namespace of the enum.

// "The `my_option` variable holds the `Some` variant of the `Option` enum."
let my_option: Option<i32> = Option::Some(5);

// "The `my_result` variable holds the `Err` variant of the `Result` enum."
let my_result: Result<i32, &str> = Result::Err("Something went wrong");
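
A fourth place where :: appears is the ::<> form, nicknamed the turbofish, which names a generic type argument explicitly when the compiler cannot infer it. The sketch below uses two standard-library methods, parse and collect; the helper name parse_number is hypothetical.

```rust
// Hypothetical helper: parses text as an i32, returning None on failure.
fn parse_number(text: &str) -> Option<i32> {
    // `::<i32>` tells `parse` which type to produce.
    text.parse::<i32>().ok()
}

fn main() {
    // `::<Vec<i32>>` tells `collect` which collection to build.
    let squares = (1..=4).map(|n| n * n).collect::<Vec<i32>>();
    println!("{:?}", squares); // [1, 4, 9, 16]

    println!("{:?}", parse_number("42")); // Some(42)
    println!("{:?}", parse_number("forty-two")); // None
}
```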

Summary Table

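| Symbol | Name         | Primary Purpose | Analogy | Example                                         |
|--------|--------------|-----------------|---------|-------------------------------------------------|
| :      | Colon        | Type Annotation | "is a"  | let age: u8 = 30; ("age is a u8")               |
| ::     | Double Colon | Path Resolution | "in"    | Option::Some(5) ("Some which is in Option")     |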