Last week, I continued my single case study of the following function and the errors it generated:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 |
|
Recall from the previous post that this describes a SubSupConflict
, meaning that lifetime #2 is a sub-lifetime of the lifetime of &x.y
(which we are trying to infer), and the lifetime of &x.y
is a sub-lifetime of lifetime #1. However, lifetime #2 is not necessarily a sub-lifetime of lifetime #1, and so we have to explicitly declare that it is. Note that the anonymous lifetime #1 came from &Foo
in the function parameter, and anonymous lifetime #2 came from the return value &int
. In this case, we indicate that they are the same lifetime by introducing a lifetime parameter (binding) 'a
and give both the same binding:
1 2 3 |
|
Terminologies
One potential source of confusion when it comes to lifetime parameter is that it’s slightly different from function parameter. Here, it’s helpful to introduce a couple terminologies: free lifetime and bound lifetime.
The terms “free” and “bound” came from lambda calculus. For traditional function, a binding is analogous to a parameter, and other variables are considered free. For example, for some function fn foo(x: int) { y }
, x
would be a binding and y
a free variable.
Free lifetime and bound lifetime slightly differ from their traditional function counterparts in the following ways:
-
Return value can get a lifetime binding.
-
If a function parameter or return value uses lifetimes, it gets lifetime bindings even when we don’t explicitly give it any. A function parameter or return value without a named binding may be given anonymous bindings of the form #n for some n (this is where the name “anonymous lifetime” came from). Note that just as you can write multiple lifetimes on a type declaration (for example,
&'a &'b Foo<'c>
), there may be multiple anonymous lifetimes associated with a single function parameter or return value. -
When we typecheck the body of a function, we treat those lifetimes as free instead.
Thus, “free lifetime” and “bound lifetime” are not mutually exclusive; instead, they represent different concepts. The reason why we treat lifetimes from function parameters as free is because they may have arbitrary lifetimes depending on what are passed in (that’s why another way to view free lifetime is “one that’s unbounded above”). Also, Rust doesn’t attempt to infer lifetime of return value, so it’s given the same treatment.
What is the point of having lifetime binding, then? Essentially, it serves these two purposes:
-
To indicate that a group of function parameters and/or return value have the same lifetime
-
To name a lifetime in case of error (though this ends up being not very helpful when a binding is anonymous)
Case study continued
This section doesn’t describe anything new except giving a small peek inside the Rust compiler. Here is Rust’s representation of free lifetime:
1 2 3 4 |
|
Here, scope_id
refers to the function block the lifetime is bounded to. BoundRegion refers to bound lifetime and has the following representation:
1 2 3 4 5 6 7 8 9 10 11 12 13 |
|
From the function that we are studying,
1 2 3 |
|
the Rust compiler creates the free lifetimes with bindings BrAnon(0)
and BrAnon(1)
out of the function parameter and return value, respectively. For brevity, we will refer to the former free lifetime as free(a#0)
and the latter free(a#1)
. Also, let 'v
be the lifetime of &x.y
, which we are trying to infer, and let a <= b
be the constraint that lifetime a
has to be a sub-lifetime of lifetime b
.
As the compiler walks through this function, it creates four constraints. The two that are relevant are free(a#1) <= 'v
and 'v <= free(a#0)
. At the beginning, it will go through all the constraints where 'v
is the “bigger” lifetime, which in this case is free(a#1) <= 'v
. At this point, it infers that 'v
also has the lifetime free(a#1)
. Thus, it follows that free(a#1) <= free(a#0)
. However, since these lifetimes are free, we cannot be sure how “big” either is, and so it’s not necessarily the case that one is a sub-lifetime of the other. As the result, we get an error.
Improving the error message
From this case study, it occurred to me that all SubSupConflict sub_r <= ... <= 'v <= ... <= sup_r
, where sub_r
and sup_r
are free lifetimes can use the error message “define the same lifetime parameter on r_sub
and r_sup
”.
This leads me to believe that I now have a strategy to make Idea #1 from my first post a reality. Namely, for functions with similar form to the one above, I want to make a concrete suggestion:
1 2 3 4 5 6 |
|
Here is the strategy:
-
Check that
sub_r
andsup_r
are both free lifetimes. -
Check that at least one of
sub_r
orsup_r
has anonymous binding. -
Somehow identify that they come from a function declaration and obtain the AST of that function declaration.
-
Since the compiler appears to identify the first anonymous binding of a function with the number 0 and increment each subsequent one by 1, we may be able to use the same strategy to identify “where” to insert/replace the lifetime in the AST.
-
Use the Rust pretty printer to print the new function declaration.
Questions(s)
Here are some things I’m currently unclear about:
-
Suppose that
sub_r
andsup_r
come from function declaration. Issub_r
always the lifetime of the return value? -
Suppose that
sub_r
has a named binding'a
andsup_r
has an anonymous binding, and we infer thatsub_r
andsup_r
should have the same binding, would there be a case where givingsup_r
a binding'a
would cause a new error?