Mainstream PL Big Ideas

Recently I watched an interview with Chris Lattner, the creator of Mojo, Swift and LLVM. He pointed out several choices in Mojo’s design that got me thinking. Mojo makes a couple of relatively new improvements in its memory management, but otherwise every aspect seems to come from other languages. There’s nothing wrong with that: the pieces combine to further a few goals for the language. These are what I’m calling “little big ideas.” As I’ve worked on my hobby languages I’ve had a couple of these myself.

Single ownership

For Mojo, Lattner described how they wanted to realize the benefits of functional languages in an imperative-style language. One nice aspect of functional languages is that they typically have immutable data structures. You never modify structures owned by other parts of a program because no data changes; you only make new data structures. The purity is nice, but it is at odds with how computers work, and the performance isn’t great unless the compiler does a lot of optimization to avoid all that copying.

Making the compiler aware of data “ownership” is a different way to avoid unsafe mutation. You can safely allow mutation of data if the change happens only when no other part of the program is using it. Rust does this with a “borrow checker” that analyzes the code to see whether you alter a mutable value in an unsafe way. Data that provably never changes is safe to share, and data that changes only while it has a single owner is safe too. Rust uses “lifetimes” to do this analysis. Usually they are inferred automatically, but programmers can annotate references with explicit lifetimes. Sometimes the inferred lifetimes extend beyond what the programmer would want or expect, making the borrow checker stricter than necessary.
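To make the ownership rules concrete, here is a minimal Rust sketch. The function names (total, double_all, consume) are made up for illustration. It shows a read-only borrow, an exclusive mutable borrow, and a hand-off of ownership.

```rust
// Reads through a shared borrow; the caller keeps ownership.
fn total(values: &[i32]) -> i32 {
    values.iter().sum()
}

// Mutates through an exclusive borrow; no other reference may be live meanwhile.
fn double_all(values: &mut Vec<i32>) {
    for v in values.iter_mut() {
        *v *= 2;
    }
}

// Takes ownership; the caller can no longer use the vector afterwards.
fn consume(values: Vec<i32>) -> usize {
    values.len()
}

fn main() {
    let mut data = vec![1, 2, 3];
    let before = total(&data); // shared borrow ends with the call
    double_all(&mut data); // safe: nothing else is reading `data` right now
    let after = total(&data);
    println!("{} -> {}", before, after);

    let n = consume(data); // ownership moves into `consume`
    // println!("{:?}", data); // would not compile: `data` was moved
    println!("{} items", n);
}
```

Uncommenting the marked line is the quickest way to see the borrow checker reject a use after a move.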

Mojo uses a different approach: it looks at the last use of a value and drops it right after that. Lifetime analysis gets simpler and fewer programs have ownership errors. Like Rust, at least superficially, it uses “references” to pass data without handing off ownership. The compiler has to be sure no references are still live at the point where a value gets dropped. Lattner made a good case that this approach results in more efficient programs and a simpler compiler.
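This is not Mojo code, but the difference in timing can be sketched in Rust. Rust normally destroys a value at the end of its scope, and an explicit drop call roughly imitates the drop-after-last-use behavior described above. The Buffer type and the log are hypothetical, there only to make the order of events visible.

```rust
use std::cell::RefCell;
use std::rc::Rc;

type Log = Rc<RefCell<Vec<String>>>;

// Hypothetical resource that records when it is destroyed.
struct Buffer {
    name: &'static str,
    log: Log,
}

impl Drop for Buffer {
    fn drop(&mut self) {
        self.log.borrow_mut().push(format!("drop {}", self.name));
    }
}

// Rust's default: `buf` lives until the end of the enclosing scope.
fn scope_end(log: &Log) {
    let buf = Buffer { name: "a", log: Rc::clone(log) };
    log.borrow_mut().push(format!("use {}", buf.name));
    log.borrow_mut().push("other work".to_string());
} // `buf` is dropped here

// Imitating drop-after-last-use by dropping explicitly.
fn last_use(log: &Log) {
    let buf = Buffer { name: "b", log: Rc::clone(log) };
    log.borrow_mut().push(format!("use {}", buf.name));
    drop(buf); // destroyed right after its last use
    log.borrow_mut().push("other work".to_string());
}

fn main() {
    let log: Log = Rc::new(RefCell::new(Vec::new()));
    scope_end(&log);
    last_use(&log);
    for line in log.borrow().iter() {
        println!("{}", line);
    }
}
```

In the first function the buffer outlives the unrelated work; in the second it is gone before that work starts, which is the behavior Mojo is said to give you without the explicit call.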

My own languages have used the tried-and-true garbage collection approach along with limited mutation. You don’t get the maximum performance, but it’s a lot easier to build a compiler without complicated ownership analysis.

Value Semantics

Mojo uses pass-by-value by default. Internally the function “owns” the parameter value, simply because it’s a copy and so entirely safe to change. If you want the function to effect a change outside itself you can, but you must declare that explicitly on that parameter as part of the function’s definition. This is a really good little idea.

One of my language projects took one more step in this direction, with three modes: pass-by-value, pass-by-read-only-value and pass-by-mutable-value. Essentially, if you’re passing by value but the function makes no changes to the value, the compiler can implement this as a reference, saving a copy. I used the val, var and cpy effects on the parameters; val is the default, so it doesn’t need to be written in the code.
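Rust has rough analogues of these three modes, which makes for a quick sketch. This is an illustration in Rust, not the syntax of my language or of Mojo, and the Point type and function names are invented for the example.

```rust
#[derive(Clone, Debug, PartialEq)]
struct Point {
    x: i32,
    y: i32,
}

// Read-only: passed as a reference, so no copy is made.
fn norm1(p: &Point) -> i32 {
    p.x.abs() + p.y.abs()
}

// Mutable: the caller sees the change, and the signature says so.
fn shift(p: &mut Point, dx: i32) {
    p.x += dx;
}

// By value: the function owns its parameter and may change it freely.
fn shifted(mut p: Point, dx: i32) -> Point {
    p.x += dx;
    p
}

fn main() {
    let mut a = Point { x: 1, y: -2 };
    println!("{}", norm1(&a));
    let b = shifted(a.clone(), 10); // explicit copy; `a` is untouched
    shift(&mut a, 5); // the call site also shows that `a` may change
    println!("{:?} {:?}", a, b);
}
```

The signature alone tells the reader which calls can change the caller’s data.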

Few built-in types: everything’s a library

I don’t fully understand this one yet. Mojo allows you to define new struct or other types and implement operations on them. So, for example, you could make your own complex, quaternion or matrix type and it would act like a built-in numeric type does in other languages. Mojo also lets you implement operations differently for different architectures for maximum performance.
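As a sketch of what a library-defined numeric type looks like, here is a small complex number type in Rust with its own + and * operators. It is written in Rust rather than Mojo, and the Complex struct is my own illustration, not taken from any standard library.

```rust
use std::ops::{Add, Mul};

// A user-defined numeric type; nothing here is built into the compiler.
#[derive(Clone, Copy, Debug, PartialEq)]
struct Complex {
    re: f64,
    im: f64,
}

impl Add for Complex {
    type Output = Complex;
    fn add(self, o: Complex) -> Complex {
        Complex { re: self.re + o.re, im: self.im + o.im }
    }
}

impl Mul for Complex {
    type Output = Complex;
    // (a + bi)(c + di) = (ac - bd) + (ad + bc)i
    fn mul(self, o: Complex) -> Complex {
        Complex {
            re: self.re * o.re - self.im * o.im,
            im: self.re * o.im + self.im * o.re,
        }
    }
}

fn main() {
    let a = Complex { re: 1.0, im: 2.0 };
    let b = Complex { re: 3.0, im: -1.0 };
    println!("{:?}", a + b);
    println!("{:?}", a * b);
}
```

At the call site a + b reads exactly like arithmetic on a built-in type, which is the point of pushing types into libraries.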

While too much operator overloading can certainly be a problem, freeing the developer from outdated design decisions is great. For instance, if you were stuck with small ints in an old version of C, but a new processor supports 128-bit ints and you want to do a lot of fast operations on UUID values, you’re in trouble unless you’re prepared to update the compiler. Not so with Mojo.

I need to consider the implications some more. I understand the concept, but need to see how it looks in practice.

Comp time metaprogramming

This comes from Zig, where the keyword is comptime. The idea is that you use only one language to do metaprogramming, rather than a separate template or macro system as in C++ or Rust.

At compile time your “comp time” code chooses which code to compile, essentially generating code to fit the call site, maybe based on the architecture, maybe on a data type. In some ways this does the same thing as C++ templates but is more powerful. Since the metaprogramming is all in the main language, debugging and error messages are more straightforward than with Rust macros or C++ templates.
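Rust’s const fn and const generics are a much more limited relative of this idea, but they are enough for a small taste: ordinary-looking code that the compiler evaluates, and whose result decides what code gets generated. This is only an analogy, not how Zig or Mojo do it. The lanes_for function and the 256-bit register width are assumptions made up for the example.

```rust
// An ordinary function the compiler can evaluate in a const context.
const fn lanes_for(register_bits: usize) -> usize {
    if register_bits >= 256 {
        8
    } else {
        4
    }
}

// Assumption for illustration: pretend the target has 256-bit vector registers.
const LANES: usize = lanes_for(256);

// The array length is a compile-time parameter; each N gets its own specialized code.
fn sum<const N: usize>(xs: [f32; N]) -> f32 {
    xs.iter().sum()
}

fn main() {
    let chunk = [1.0f32; LANES]; // the array size was computed by the compiler
    println!("{} lanes, sum = {}", LANES, sum(chunk));
}
```

A full comptime system goes much further, for example letting the same kind of code inspect and construct types, but the shape is the same: the choice is made in the main language, before the program runs.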

Initialization is distinct from assignment

This comes along with allowing mutable or immutable variable declarations, but doesn’t have to. The idea is that initializing a variable uses different syntax from assigning to it and is also implemented differently, so the compiler can apply one analysis to initial values and another to the effects of assignment.

In my language RCI, you declare variables like

var x = 9

And you assign a new value like:

x := 11

Syntactically this is nice: it’s easy to see the difference. := is mutation, and only mutation. = is an equality check or an initial value declaration. More importantly, initialization and assignment work completely differently.
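One concrete way the two operations differ shows up in any language with destructors. Here is a sketch in Rust, which spells both with = but still treats them differently: initialization fills an empty slot, while assignment has to destroy the old value as well. The Tracked type is hypothetical and only counts destructions.

```rust
use std::cell::Cell;
use std::rc::Rc;

// Hypothetical value type that counts how many times one is destroyed.
struct Tracked {
    drops: Rc<Cell<u32>>,
}

impl Drop for Tracked {
    fn drop(&mut self) {
        self.drops.set(self.drops.get() + 1);
    }
}

// Returns (destructions seen after initialization, after assignment).
fn demo() -> (u32, u32) {
    let drops = Rc::new(Cell::new(0));

    let mut x: Tracked; // declared but not yet initialized
    x = Tracked { drops: Rc::clone(&drops) }; // initialization: nothing to destroy
    let after_init = x.drops.get();

    x = Tracked { drops: Rc::clone(&drops) }; // assignment: the old value is destroyed
    let after_assign = x.drops.get();

    (after_init, after_assign)
}

fn main() {
    let (after_init, after_assign) = demo();
    println!("after init: {}, after assignment: {}", after_init, after_assign);
}
```

The compiler also has to prove that x is initialized before the first read, which is a separate analysis from anything it does for assignment.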

I need to understand more of how this distinction helps the design of Mojo. I got the impression it’s important.