Lifetimes, Clones, and Closures: Explaining the “glib::clone!()” Macro

One thing that I’ve seen confuse newcomers to writing GObject-based Rust code is the glib::clone!() macro. It’s foreign to people coming from plain Rust who are starting to write GObject-based code, and it’s foreign to many people used to writing GObject-based code in other languages (e.g. C, Python, JavaScript, and Vala). Over the years I’ve explained it a few times, and I figure it’s time to write a blog post that I can point people to, describing in detail what the clone!() macro is, what it does, and why we need it.

Closures and Clones in Plain Rust

Rust has a nifty thing called a closure. To quote the official Rust book:

…closures are anonymous functions you can save in a variable or pass as arguments to other functions. You can create the closure in one place and then call the closure to evaluate it in a different context. Unlike functions, closures can capture values from the scope in which they’re defined.

Simply put, a closure is a function you can use as a variable or an argument to another function. Closures can “capture” variables from the environment, meaning that you can easily pass variables within your scope without needing to pass them as arguments. Here’s an example of capturing:

let num = 1;
let num_closure = move || {
    println!("Num times 2 is {}", num * 2); // `num` captured here
};

num_closure();

num is an i32, or a signed 32-bit integer. Integers are cheap, statically sized primitives, and they don’t require any special behavior when they are dropped. Because of this, it’s safe to keep using them after a move – so the type can and does implement the Copy trait. In practice, that means we can use our integer after the closure captures it, as it captures a copy. So we can have:

// Everything above stays the same
num_closure();
println!("Num is {}", num);

And the compiler will be happy with us. What happens if you need something dynamically sized and stored on the heap, like the data from a String? If we try this pattern with a String:

let string = String::from("trust");
let string_closure = move || {
    println!("String contains \"rust\": {}", string.contains("rust"));
};

string_closure();
println!("String is \"{}\"", string); 

We get the following error:

error[E0382]: borrow of moved value: `string`
  --> src/main.rs:10:34
   |
4  |     let string = String::from("trust");
   |         ------ move occurs because `string` has type `String`, which does not implement the `Copy` trait
5  |     let string_closure = move || {
   |                          ------- value moved into closure here
6  |         println!("String contains \"rust\": {}", string.contains("rust"));
   |                                                  ------ variable moved due to use in closure
...
10 |     println!("String is \"{}\"", string); 
   |                                  ^^^^^^ value borrowed here after move

Values of the String type cannot be copied, so the compiler instead “moves” our string, giving the closure ownership. In Rust, only one thing can have ownership of a value. So when the closure captures string, our outer scope no longer has access to it. That doesn’t mean we can’t use string in our closure, though. We just need to be more explicit about how it should be handled.

Rust provides the Clone trait that we can implement for objects like this. Clone provides the clone() method, which explicitly duplicates an object. Types that implement Clone but not Copy are generally types that can be of an arbitrary size and are stored on the heap. Values of the String type can vary in size, which is why it falls into this category. When you call clone(), usually you are creating a new full copy of the object’s data on the heap. So, we want to create a clone, and only pass that clone into the closure:

let s = string.clone();
let string_closure = move || {
    println!("String contains \"rust\": {}", s.contains("rust"));
};

The closure will only capture our clone, and we can still use the original in our original scope.
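
As a rough, self-contained sketch of the same idea (the function name check_and_keep is made up for illustration), the clone can be moved into the closure while the original is handed back to the caller untouched:

```rust
// Hypothetical helper: clones `original`, moves only the clone into a closure,
// and returns the closure's result together with the untouched original.
fn check_and_keep(original: String) -> (bool, String) {
    let s = original.clone(); // a second, independent copy of the heap data
    let contains_rust = move || s.contains("rust"); // `s` is moved, `original` is not
    let found = contains_rust();
    (found, original) // still ours to use or give away
}
```

Calling check_and_keep(String::from("trust")) should hand back true along with the original string, since only the clone ever entered the closure.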

If you need more information on cloning and ownership, I recommend reading the “Understanding Ownership” chapter of the official Rust book.

Reference Counting, Abbreviated

When working with types of an arbitrary size, we may have types that are too large to efficiently clone(). For these types, we can use reference counting. In Rust, there are two types for this you’re likely to use: Rc<T> for single-threaded contexts, and Arc<T> for multi-threaded contexts. For now let’s focus on Rc<T>.

When working with reference-counted types, the reference-counted object is kept alive for as long as anything holds a “strong” reference. Rc<T> creates a new Rc<T> instance when you call .clone() and increments the number of strong references instead of creating a full copy. The number of strong references is decreased when an instance of Rc<T> goes out of scope. An Rc<T> can often be used in contexts where a reference &T is used. In particular, calling a method that takes &self on an Rc<T> will call the method on the underlying T. For example, some_string.as_str() would work the same if some_string were a String or an Rc<String>.
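
To watch the counting happen, here is a small sketch using only the standard library. The function name strong_counts is just for illustration; Rc::strong_count() reports how many strong references currently exist:

```rust
use std::rc::Rc;

// Returns the strong count before cloning, after cloning, and after
// dropping the clone again.
fn strong_counts() -> (usize, usize, usize) {
    let first = Rc::new(String::from("trust"));
    let before = Rc::strong_count(&first);

    let second = Rc::clone(&first); // same as first.clone(): no new String is allocated
    let during = Rc::strong_count(&first);

    // A `&self` method on String can be called straight through the Rc.
    assert_eq!(second.as_str(), "trust");

    drop(second); // the clone goes away, and so does its strong reference
    let after = Rc::strong_count(&first);

    (before, during, after)
}
```

Under these assumptions the three counts come out as 1, 2, and 1.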

For our example, we can simply wrap our String constructor with Rc::new():

use std::rc::Rc;

let string = Rc::new(String::from("trust"));
let s = string.clone();
let string_closure = move || {
    println!("String contains \"rust\": {}", s.contains("rust"));
};

string_closure();
println!("String is \"{}\"", string); 

With this, we can capture and use larger values without creating expensive copies. There are some consequences to naively using clone(), and we’ll get into those below, but in a slightly different context.

Closures and Copies in GObject-based Rust

When working with GObject-based Rust, particularly gtk-rs, closures come up most often when working with signals. Signals are a GObject concept. To (over)simplify, signals are used to react to and modify object-specific events. For more detail I recommend reading the “Signals” section in the “Type System Concepts” documentation. Here’s what you need to know:

  • Signals are emitted by objects.
  • Signals can carry data in the form of parameters that connections may use.
  • Signals can expect their handlers to have a return type that’s used elsewhere.

Let’s take a look at how this works with a C example. Say we have a GtkButton, and we want to react when the button is clicked. Most code will use the g_signal_connect () function macro to register a signal handler. g_signal_connect () takes 4 parameters:

  • The GObject that we expect to emit the signal
  • The name of the signal
  • A GCallback that is compatible with the signal’s parameters
  • data, which is a pointer to a struct.

The object here is our GtkButton instance. The signal we want to connect to is the “clicked” signal. The signal expects a callback with the signature of void clicked (GtkButton *self, gpointer user_data). So we need to write a function that has that signature. user_data here corresponds to the data parameter that we give g_signal_connect (). With all of that in mind, here’s what connecting to the signal would typically look like in C:

void
button_clicked_cb (GtkButton *button,
                   gpointer   user_data)
{
    MyObject *self = MY_OBJECT (user_data);
    my_object_do_something_with_button (self, button);
}


static void
my_object_some_setup (MyObject *self)
{
    GtkWidget *button = gtk_button_new_with_label ("Do Something");
    g_signal_connect (button, "clicked",
                      G_CALLBACK (button_clicked_cb), self);
    
    my_object_add_button (self, button); // Assume this does something to keep button alive
}

This is the simplest way to handle connecting to the signal. But we have an issue with this setup: what if we want to pass multiple values to the callback that aren’t necessarily a part of MyObject? You would need to create a custom struct that houses each value you want to pass, use that struct as data, and read each field of that struct within your callback.
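
For readers more comfortable with Rust than C, the struct-as-user-data pattern might look roughly like this. Everything here (ClickData, on_click, emit) is a made-up stand-in, not part of any GTK API:

```rust
// Hypothetical payload: every value the callback needs, bundled by hand.
struct ClickData {
    label: String,
    multiplier: i32,
}

// A plain function cannot capture anything, so all of its state has to
// arrive through `data`, much like `user_data` in the C callback.
fn on_click(data: &ClickData, clicks: i32) -> String {
    format!("{}: {}", data.label, clicks * data.multiplier)
}

// Stand-in for emitting a signal: invokes the handler with its user data.
fn emit(handler: fn(&ClickData, i32) -> String, data: &ClickData, clicks: i32) -> String {
    handler(data, clicks)
}
```

With a ClickData holding the label "score" and a multiplier of 2, emit(on_click, &data, 3) produces "score: 6". Every new combination of values needs another struct like this one.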

Instead of having to create a struct for each callback that needs to take multiple arguments, in Rust we can and do use closures. The gtk-rs bindings are nice in that they have generated functions for each signal a type can emit. So for gtk::Button we have connect_clicked (). These generated functions take a closure as an argument, with the closure taking the same arguments that the signal expects – except user_data. However, because Rust closures can capture variables, we don’t need user_data – the closure essentially becomes a struct containing captured variables, and the pointer to it becomes user_data. So, let’s try to do a direct port of the functions above, and condense them down to one function with a closure inside:

impl MyObject {
    pub fn some_setup(&self) {
        let button = gtk::Button::with_label("Do Something");

        button.connect_clicked(move |btn| {
            self.do_something_with_button(btn);
        });

        self.add_button(button);
    }
}

This looks pretty nice, right? The catch is, it doesn’t compile:

error[E0759]: `self` has an anonymous lifetime `'_` but it needs to satisfy a `'static` lifetime requirement
  --> src/lib.rs:33:36
   |
30 |           pub fn some_setup(&self) {
   |                             ----- this data with an anonymous lifetime `'_`...
...
33 |               button.connect_clicked(move |btn| {
   |  ____________________________________^
34 | |                 self.do_something_with_button(btn);
35 | |             });
   | |_____________^ ...is captured here...
   |
note: ...and is required to live as long as `'static` here
  --> src/lib.rs:33:20
   |
33 |             button.connect_clicked(move |btn| {
   |                    ^^^^^^^^^^^^^^^

Lifetimes can be a bit confusing, so I’ll try to simplify. &self is a reference to our object. It’s like the C pointer MyObject *self, except it has guarantees that C pointers don’t have: notably, a reference must always be valid where it is used. The compiler is telling us that by the time our closure runs – which could be any point where button is alive – our reference may not be valid, because our &self method argument (by declaration) only lives to the end of the method. There are two ways to solve this: change the lifetime of our reference and ensure it matches the closure’s lifetime, or find a way to pass an owned object to the closure.

Lifetimes are complex – I don’t recommend worrying about them unless you really need the extra performance from using references everywhere. There’s a big complication with trying to work with lifetimes here: our closure has a specific lifetime bound. If we take a look at the function signature for connect_clicked():

fn connect_clicked<F: Fn(&Self) + 'static>(&self, f: F) -> SignalHandlerId

We can see that the closure (and thus everything captured by the closure) must satisfy the 'static lifetime bound. This can mean different things in different contexts, but here it means that the closure needs to be able to hold onto what it captures for as long as it wants. For more detail, see “Rust by Example”’s chapter on the static lifetime. So, the only option is for the closure to own the objects it captures.
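
The effect of that bound can be reproduced without GTK. In this sketch, Emitter is a hypothetical stand-in for a widget that stores its handlers, and connect() carries the same kind of 'static bound as connect_clicked():

```rust
// Hypothetical stand-in for a widget that stores its signal handlers.
struct Emitter {
    handlers: Vec<Box<dyn Fn() -> usize + 'static>>,
}

impl Emitter {
    // Same shape of bound as connect_clicked(): the closure must be 'static.
    fn connect<F: Fn() -> usize + 'static>(&mut self, f: F) {
        self.handlers.push(Box::new(f));
    }

    fn emit(&self) -> Vec<usize> {
        self.handlers.iter().map(|handler| handler()).collect()
    }
}

fn connect_owned() -> Vec<usize> {
    let mut emitter = Emitter { handlers: Vec::new() };
    {
        let text = String::from("trust");

        // Capturing a borrow of `text` is rejected by the compiler, because
        // `text` is dropped long before 'static ends:
        // let borrowed = &text;
        // emitter.connect(move || borrowed.len());

        // Capturing an owned value satisfies the bound.
        let owned = text.clone();
        emitter.connect(move || owned.len());
    } // `text` is dropped here; the closure's own copy lives on

    emitter.emit()
}
```

The handler still runs after text is gone, because it owns everything it needs.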

The trick to giving ownership to something you don’t necessarily own is to duplicate it. Remember clone()? We can use that here. You might think it’s expensive to clone your object, especially if it’s a large and complex widget, like your main window. There’s something very nice about GObjects though: all GObjects are reference-counted. So, cloning a GObject instance is like cloning an Rc<T> instance. Instead of making a full copy, the number of strong references increases. So, we can change our code to use clone just like we did in our original String example:

pub fn some_setup(&self) {
    let button = gtk::Button::with_label("Do Something");

    let s = self.clone();
    button.connect_clicked(move |btn| {
        s.do_something_with_button(btn);
    });

    self.add_button(button);
}

All good, right? Unfortunately, no. This might look innocent, and in some programs cloning like this might not cause any issues. What if button wasn’t owned by MyObject? Take this version of the function:

pub fn some_setup(&self, button: &gtk::Button) {
    let s = self.clone();
    button.connect_clicked(move |btn| {
        s.do_something_with_button(btn);
    });
}

button is now merely passed to some_setup(). It may be owned by some other widget that may be alive for much longer than we want MyObject to be alive. Think back to the description of reference counting: objects are kept alive for as long as a strong reference exists. We’ve given a strong reference to the closure we attached to the button. That means MyObject will be forcibly kept alive for as long as the closure is alive, which is potentially as long as button is alive. MyObject and the memory associated with it may never be cleaned up, and that gets more problematic the bigger MyObject is and the more instances we have.
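
Here is a GTK-free sketch of that problem, using Rc<T> in place of a GObject. Tracked and MockButton are hypothetical stand-ins for MyObject and the button:

```rust
use std::cell::Cell;
use std::rc::Rc;

// Hypothetical stand-in for MyObject that records when it is dropped.
struct Tracked {
    dropped: Rc<Cell<bool>>,
}

impl Tracked {
    fn do_something(&self) {}
}

impl Drop for Tracked {
    fn drop(&mut self) {
        self.dropped.set(true);
    }
}

// Hypothetical stand-in for a button: it owns its connected handlers.
struct MockButton {
    handlers: Vec<Box<dyn Fn()>>,
}

fn strong_capture_keeps_alive() -> (bool, bool) {
    let dropped = Rc::new(Cell::new(false));
    let object = Rc::new(Tracked { dropped: Rc::clone(&dropped) });
    let mut button = MockButton { handlers: Vec::new() };

    let s = Rc::clone(&object); // strong reference handed to the closure
    button.handlers.push(Box::new(move || s.do_something()));

    drop(object); // our own handle goes away...
    let dropped_after_object = dropped.get(); // ...but the closure keeps the object alive

    drop(button); // dropping the button drops the closure and its reference
    let dropped_after_button = dropped.get();

    (dropped_after_object, dropped_after_button)
}
```

The object is only dropped once the button, and with it the closure, is gone.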

Now, we can structure our program differently to avoid this specific case, but for now let’s continue using it as an example. How do we keep our closure from controlling the lifetime of MyObject when we need to be able to use MyObject when the closure runs? Well, in addition to “strong” references, reference counting has the concept of “weak” references. The number of weak references an object has is tracked, but it doesn’t need to be 0 in order for the object to be dropped. With an Rc<T> instance we’d use Rc::downgrade() to get a Weak<T>, and with a GObject we use ObjectExt::downgrade() to get a WeakRef<T>. In order to turn a weak reference back into a usable instance of an object we need to “upgrade” it. Upgrading a weak reference can fail, since weak references do not keep the referenced object alive. So Weak<T>::upgrade() returns an Option<Rc<T>>, and WeakRef<T>::upgrade() returns an Option<T>. Because it’s optional, we should only move forward if T still exists.
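
A minimal sketch of the Rc<T> side of this, using only the standard library (the function name weak_lifecycle is made up for illustration):

```rust
use std::rc::{Rc, Weak};

// Returns the strong count, the weak count, whether upgrading worked while
// the value was alive, and whether upgrading failed after it was dropped.
fn weak_lifecycle() -> (usize, usize, bool, bool) {
    let strong = Rc::new(String::from("trust"));
    let weak: Weak<String> = Rc::downgrade(&strong);

    // Downgrading adds a weak reference and leaves the strong count alone.
    let strong_count = Rc::strong_count(&strong);
    let weak_count = Rc::weak_count(&strong);

    // While a strong reference exists, upgrading succeeds.
    let alive = weak.upgrade().is_some();

    // Once the last strong reference is gone, the String is dropped even
    // though `weak` still exists, and upgrading yields None.
    drop(strong);
    let gone = weak.upgrade().is_none();

    (strong_count, weak_count, alive, gone)
}
```

The weak reference never extends the value’s life; it only lets us ask whether the value is still there.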

Let’s rework our example to use weak references. Since we only care about doing something when the object still exists, we can use if let here:

pub fn some_setup(&self, button: &gtk::Button) {
    let s = self.downgrade();
    button.connect_clicked(move |btn| {
        if let Some(obj) = s.upgrade() {
            obj.do_something_with_button(btn);
        }
    });
}

Only two more lines, but a little more annoying than just calling clone(). Now, what if we have another widget we need to capture?

pub fn some_setup(&self, button: &gtk::Button, widget: &OtherWidget) {
    let s = self.downgrade();
    let w = widget.downgrade();
    button.connect_clicked(move |btn| {
        if let (Some(obj), Some(widget)) = (s.upgrade(), w.upgrade()) {
            obj.do_something_with_button(btn);
            widget.set_visible(false);
        }
    });
}

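That’s getting harder to parse. Now, what if the closure needed a return value? Let’s say it should return a boolean (a clicked handler doesn’t actually return anything, so read connect_clicked() below as a stand-in for a signal whose handlers do). We need to handle our intended behavior when MyObject and OtherWidget still exist, and we need to handle the fallback for when they don’t: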

pub fn some_setup(&self, button: &gtk::Button, widget: &OtherWidget) {
    let s = self.downgrade();
    let w = widget.downgrade();
    button.connect_clicked(move |btn| {
        if let (Some(obj), Some(widget)) = (s.upgrade(), w.upgrade()) {
            obj.do_something_with_button(btn);
            widget.visible()
        } else {
            false
        }
    });
}

Now we have something pretty off-putting. If we want to avoid keeping around unwanted objects or potential reference cycles, this will get worse for every object we want to capture. Thankfully, we don’t have to write code like this.
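
For anyone who wants to poke at the weak-reference-with-fallback pattern outside of GTK, here is a rough sketch built on Rc<T>. MockSignal is a hypothetical stand-in for a signal whose handlers return a boolean:

```rust
use std::rc::Rc;

// Hypothetical stand-in for a signal whose handlers return a bool.
struct MockSignal {
    handlers: Vec<Box<dyn Fn() -> bool>>,
}

impl MockSignal {
    fn emit(&self) -> Vec<bool> {
        self.handlers.iter().map(|handler| handler()).collect()
    }
}

fn weak_handler_results() -> (Vec<bool>, Vec<bool>) {
    let object = Rc::new(String::from("trust"));
    let mut signal = MockSignal { handlers: Vec::new() };

    let weak = Rc::downgrade(&object); // the closure only gets a weak reference
    signal.handlers.push(Box::new(move || {
        if let Some(obj) = weak.upgrade() {
            obj.contains("rust") // intended behavior while the object exists
        } else {
            false // fallback once the object is gone
        }
    }));

    let while_alive = signal.emit();
    drop(object); // nothing else holds a strong reference, so the String is freed
    let after_drop = signal.emit();

    (while_alive, after_drop)
}
```

The handler answers true while the object exists and falls back to false afterwards, without ever keeping the object alive itself.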

Enter the glib::clone!() Macro

The glib crate provides a macro to solve all of these cases. The macro takes the variables you want to capture as @weak or @strong, and the capture behavior corresponds to upgrading/downgrading and calling clone(), respectively. So, starting with the example behavior that kept MyObject around, if we really wanted that we would write the function like this:

pub fn some_setup(&self, button: &gtk::Button) {
    button.connect_clicked(clone!(@strong self as s => move |btn| {
        s.do_something_with_button(btn);
    }));
}

We use self as s because self is a keyword in Rust. We don’t need to rename a variable unless it’s a keyword or some field (e.g. foo.bar as bar). Here, glib::clone!() doesn’t prevent us from holding onto s forever, but it does provide a nicer way of doing it should we want to. If we want to use a weak reference instead, it would be:

button.connect_clicked(clone!(@weak self as s => move |btn| {
    s.do_something_with_button(btn);
}));

Just one word and we no longer have to worry about MyObject sticking around when it shouldn’t. For the example with multiple captures, we can use comma separation to pass multiple variables:

pub fn some_setup(&self, button: &gtk::Button, widget: &OtherWidget) {
    button.connect_clicked(clone!(@weak self as s, @weak widget => move |btn| {
        s.do_something_with_button(btn);
        widget.set_visible(false);
    }));
}

Very nice. It’s also simple to provide a fallback for return values:

button.connect_clicked(clone!(@weak self as s, @weak widget => @default-return false, move |btn| {
    s.do_something_with_button(btn);
    widget.visible()
}));

Now instead of spending time and code on using weak references and falling back correctly, we can rely on glib::clone!() to handle it for us succinctly.
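
To get a feel for what a macro like this does on our behalf, here is a deliberately tiny sketch for Rc<T> values. The weak_closure! macro is hypothetical and is not how glib implements clone!(); it only handles one variable and a closure without arguments:

```rust
use std::cell::Cell;
use std::rc::Rc;

// Hypothetical mini-macro for `Rc` values only; NOT glib's implementation.
// It downgrades `$var`, then builds a closure that upgrades it on every call
// and evaluates `$default` instead of `$body` when the value is gone.
macro_rules! weak_closure {
    ($var:ident => $default:expr, $body:expr) => {{
        let weak = ::std::rc::Rc::downgrade(&$var);
        move || match weak.upgrade() {
            Some($var) => $body, // `$var` is shadowed by the upgraded reference
            None => $default,
        }
    }};
}

fn counter_demo() -> (bool, i32, bool) {
    let counter = Rc::new(Cell::new(0));
    let bump = weak_closure!(counter => false, {
        counter.set(counter.get() + 1);
        true
    });

    let first = bump(); // counter is alive: the body runs
    let count = counter.get();
    drop(counter); // last strong reference goes away
    let second = bump(); // upgrade fails: the default is returned

    (first, count, second)
}
```

The real macro covers far more ground (strong captures, renaming, several variables, closures with arguments), but the downgrade, upgrade, and fallback steps are the same ones we wrote by hand earlier.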

There are a few caveats to using glib::clone!(). Errors in your closures may be harder to spot, as the compiler may point to the site of the macro, instead of the exact site of the error. rustfmt also can’t format the contents inside the macro. For that reason, if your closure is getting too long I would recommend separating the behavior into a proper function and calling that.

Overall, I recommend using glib::clone!() when working on gtk-rs codebases. I hope this post helps you understand what it’s doing when you come across it, and that you know when you should use it.

2 Replies to “Lifetimes, Clones, and Closures: Explaining the “glib::clone!()” Macro”

  1. > We’ve given a strong reference to the closure we attached to the button. That means MyObject will be forcibly kept alive for as long as the closure is alive, which is potentially as long as button is alive. MyObject and the memory associated with it may never be cleaned up, and that gets more problematic the bigger MyObject is and the more instances we have.

    If I understand correctly, in gtk-rs, connect_clicked() doesn’t automatically disconnect a closure if the data referenced by the closure (connect_clicked() doesn’t know which data) is destroyed. So you make the closure capture a weak handle to that data instead. But to my understanding, this doesn’t fix the memory leaks, since the connection and closure aren’t freed when MyObject is deleted, and neither is the backing memory behind MyObject (storing the weak and presumably strong count). All you’ve fixed is leaking memory owned by MyObject (which could include reference cycles). Do I understand correctly?

    You could *partly* fix the leak by disconnecting the signal if upgrading fails. This way, the connection is freed and the weak count drops by 1, as soon as the object’s strong count drops to 0 and the signal next fires. If you repeatedly connect and destroy objects to a signal that never fires, the leak still occurs.

    I suspect that weak references weren’t designed to free memory immediately when deleting connection targets, but to break reference cycles that would keep “MyObject owning button referencing MyObject” alive forever.

    A related GUI framework with signal connections is Qt, where connect() takes both a sender and a receiver object, and destroys the connection if either object is deleted. Passing in a lambda but not a receiver object is possible but discouraged, since the connection can leak and access a dangling pointer to a receiver object. How does GTK in C work? https://docs.gtk.org/gobject/func.signal_connect.html appears to not have any means to eagerly disconnect the connection when the recipient object’s strong count drops to 0.

    1. MyObject actually is freed – remember, the weak reference doesn’t change the lifetime of the data. It cannot ensure that MyObject still exists within memory, which is why upgrading it can fail. When MyObject is freed, all you would have is the weak reference that holds nothing.
