Colin's Notes

The last few years have brought us an explosion in the number of new systems languages under development. They’re mostly trying to find good balances between safety, performance and expressivity. In this post I’ll first outline briefly what’s meant by “safety” and a little on how today’s systems languages try to achieve it. Then I’ll list what I think are some of the interesting languages in this space with thoughts on each. All can be used today but not all have gotten beyond an early experimental state.

Here are some notable new languages you could use today: Rust, Zig, Odin, Jakt, Hare

This next group ranges from somewhat to very experimental but can still be tried out: Vale, Austral, Myrddin, V, Lobster, Compis, Cone

These are the second-most recent generation of systems languages. They’re all a bit more on the applications development side, less focused on highly optimized memory management. They’re all well past 1.0 status and heavily used in real-world software: Nim, Crystal, D, Go

What do the new systems languages bring to the table over older established languages? Primarily “safety” in broad terms and secondarily syntactic conveniences. Aside from advanced memory management, they mostly do not try to explore new programming language concepts. That said, compared to C, C++ and Fortran they have borrowed from functional languages innovations like sum types.

The new wave of systems languages also typically have much saner build systems and library / dependency management and IDE integration but that’s a topic for another post.

Safety

Safety includes avoiding memory leaks, double-free errors and buffer overflows; minimizing the chances of concurrency bugs; managing resources automatically; and generally designing to avoid dangerous patterns that lead to incorrect output or program crashes. Safer languages will have fewer security vulnerabilities, but security is a separate issue from safety, and additional measures are needed to enhance program security.

Research and experience have found shared, mutable state to be a locus of a lot of unsafe conditions: bugs, or bugs waiting to happen after a slight code change. Therefore Rust and many other modern systems languages seek to minimize it through a number of techniques.

Shared mutable state may arise most commonly during concurrent execution. So, unless a language prohibits concurrency, it needs to limit shared mutable state. Doing so will make for generally safer programs anyway. Reducing shared, mutable state is a form of “temporal” memory safety.

Memory safety has both spatial and temporal dimensions. An example of a spatial safety feature is array bounds checking: You prevent the program from reading or writing memory beyond the end of an array for instance. Temporal safety relates to tracking use of memory over time as the program runs, ensuring areas of memory aren’t used or reused erroneously. The simplest example of a temporal memory bug is failing to initialize a variable before use later on.
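As a small illustration of the spatial side, here is a minimal Rust sketch of a bounds-checked read. The function name checked_read and the sample data are made up for this example; the point is only that an out-of-range index produces a value you must handle instead of a read past the end of the array.

```rust
/// Returns the element at `index`, or `None` when the index is out of bounds.
/// (`checked_read` is a hypothetical helper written for this illustration.)
fn checked_read(data: &[i32], index: usize) -> Option<i32> {
    // `get` performs the bounds check and never reads past the end of the slice.
    data.get(index).copied()
}

fn main() {
    let samples = [10, 20, 30];
    // In range: the value comes back wrapped in `Some`.
    println!("{:?}", checked_read(&samples, 1));
    // Out of range: the caller gets `None` instead of whatever sits next in memory.
    println!("{:?}", checked_read(&samples, 3));
}
```

Plain indexing on a slice is checked as well: an out-of-range index computed at run time stops the program with a panic rather than reading adjacent memory.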

Preventing uninitialized variables is a very easy problem to design around in a new language. Enforcing single ownership (C++ unique_ptr, or the more comprehensive borrow checker in Rust) is much more difficult to build and use.

Manual memory reclamation (free, delete) is also a source of temporal memory unsafety in the form of bugs or crashes. A garbage collector will completely remove this class of safety problems, at some expense in performance and memory use. Borrow checking + lifetimes or linear types are another way to relieve the programmer from manually freeing memory. They can result in faster programs compared to GC managed memory but are more difficult to use by the programmer, harder to build into a language and may slow down compile times.
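To make the "no manual free" idea concrete, here is a rough Rust sketch of cleanup driven by ownership and scope. The Buffer type and its event log are invented for the illustration; they stand in for any resource whose release you want to observe.

```rust
use std::cell::RefCell;

/// A hypothetical resource that records when it is released.
struct Buffer<'a> {
    name: &'static str,
    log: &'a RefCell<Vec<String>>,
}

impl Drop for Buffer<'_> {
    // Runs automatically when the owner goes out of scope; nobody calls free.
    fn drop(&mut self) {
        self.log.borrow_mut().push(format!("freed {}", self.name));
    }
}

/// Creates two buffers in nested scopes and returns the order of events.
fn run() -> Vec<String> {
    let log = RefCell::new(Vec::new());
    {
        let _outer = Buffer { name: "outer", log: &log };
        {
            let _inner = Buffer { name: "inner", log: &log };
            log.borrow_mut().push("inner scope ends".to_string());
        } // `_inner` is dropped here
        log.borrow_mut().push("outer scope ends".to_string());
    } // `_outer` is dropped here
    log.into_inner()
}

fn main() {
    for event in run() {
        println!("{event}");
    }
}
```

The release points are fixed by the program's structure, which is what makes this approach predictable compared to a tracing GC, at the cost of having to satisfy the compiler's ownership rules.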

Techniques to Achieve Safety

Type-checking: The first step to achieve safety is simply a good type checker, so your program is more likely to be doing what you intended at a basic level. C++ and C, for instance, while statically typed don’t feature particularly strong type checking. Depending on which errors and warnings you’ve enabled you can easily miss mistakes due to typecasting and type coercion that you’d never miss in Rust or even Free Pascal or Go.
Immutable data by default: variables may be declared immutable by default, limiting incidental mutability. Later attempts to change the data will result in compiler errors.
Pure functions: Eliminating state is a great way to make both automatic and manual program logic checking easier. Functional languages limit mutation and state. Pure functions are stateless and instead of altering data they return new data. Sometimes this approach sacrifices performance with excessive data copying. Clever compilers can minimize the amount of copying by tracking ownership of data…
Enforced single ownership: Rust and Austral use this approach. Austral is particularly interesting for its use of linear types to manage resources including but not limited to memory. Rust employs a “borrow checker” to enforce single ownership of memory: there’s only one owner at any one time, but values can be “borrowed” according to the rules: one borrower at a time if the borrow is mutable, more if it’s immutable. Memory is freed when a value’s owner goes out of scope. (This seemingly simple rule is why Rust occasionally requires the explicit use of “lifetimes” to indicate how long a value “lives.”) Borrow-checking prevents use-after-free and double-free bugs, makes leaks much less likely, and more importantly highlights places where potentially unsafe and unintended behavior could happen.
Tracing garbage collection: While less powerful than borrow-checking it is simple to use. Garbage collection manages memory behind the scenes, preventing memory leaks and memory freeing coding mistakes. It can’t help to manage non-memory resources though and it won’t on its own prevent shared state. The trade-off is less power and slightly less performance for simpler to read and write programs and faster compilation.
Reference counting: GC can add unpredictability to run times as you don’t know when the GC cycle will occur. Reference counting can at least produce predictable run times (though not the fastest) but can’t neatly dispose of retain cycles (objects that reference themselves or each other). Lobster and Vale have techniques to get the best of reference counting and limited GC. Jakt uses reference counting with strong and weak pointers to allow self-referential data structures. The experimental Vale language tries a mixed approach where non-deterministic GC is rarely needed but resources can be cleaned up without restrictive ownership constraints or manual memory management.
Sum types: Also known as “tagged unions”, which is how they’re implemented often, these are values that can be one of a number of types, or a predefined type that’s a union of a number of types. Rust and Jakt (to name two) call these “enums” – they can be used as simple enumerations but can carry data with them and that data can vary in type for each enum variant. This is powerful for representing values or errors “type T or Error”.
Pattern matching: This accounts for all possible states a value can take on so that parts of your program’s logic can be perfectly checked at compile time. Often you use sum types in conjunction with pattern matching where you match against every possible sub-type a sum type can have. You’ll find pattern matching in most modern languages from the last ten years.
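The last two items in the list above, sum types and pattern matching, are easiest to appreciate together. Below is a minimal Rust sketch; the Shape type, the tiny text format and the function names are all invented for the example. It shows an enum whose variants carry different data, a “type T or Error” return value, and a match that the compiler checks for exhaustiveness.

```rust
/// A sum type: each variant carries its own kind of data.
#[derive(Debug, PartialEq)]
enum Shape {
    Circle { radius: f64 },
    Rectangle { width: f64, height: f64 },
}

/// "Type T or Error": parsing yields either a Shape or an error message.
/// The input format ("circle 1.5", "rect 2 3") is made up for this sketch.
fn parse_shape(input: &str) -> Result<Shape, String> {
    let parts: Vec<&str> = input.split_whitespace().collect();
    match parts.as_slice() {
        ["circle", r] => {
            let radius = r.parse::<f64>().map_err(|e| e.to_string())?;
            Ok(Shape::Circle { radius })
        }
        ["rect", w, h] => {
            let width = w.parse::<f64>().map_err(|e| e.to_string())?;
            let height = h.parse::<f64>().map_err(|e| e.to_string())?;
            Ok(Shape::Rectangle { width, height })
        }
        _ => Err(format!("unrecognized shape: {input}")),
    }
}

/// Computes the area of any shape.
fn area(shape: &Shape) -> f64 {
    // The match must cover every variant; deleting an arm is a compile error.
    match shape {
        Shape::Circle { radius } => std::f64::consts::PI * radius * radius,
        Shape::Rectangle { width, height } => width * height,
    }
}

fn main() {
    for input in ["rect 2 3", "circle 1", "hexagon"] {
        match parse_shape(input) {
            Ok(shape) => println!("{input}: area {:.2}", area(&shape)),
            Err(message) => println!("{input}: error: {message}"),
        }
    }
}
```

If a new variant were added to Shape later, every match that forgot to handle it would stop compiling, which is the compile-time checking the pattern matching item describes.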

Here’s a non-exhaustive rundown of newish systems languages and some notable things about them related to safety and syntax.

While putting this list together I wondered why so many languages have gotten made recently. Could it be access to LLVM? I’m not sure – a number of these languages use C or C++ or QBE as back ends instead. I make a note on the back end for each as a point of interest.

Zig Zig home page

Only manual memory management but more streamlined than C and allows simple RAII-style cleanup patterns with “defer”
Supports tagged unions and exhaustive switch statements
Compile time function execution
Predictability as a language feature: No hidden control flow, no hidden allocations, no preprocessor or macros
Can mix Zig and C and C++ compilation units
LLVM back end

While Zig isn’t at a 1.0 release yet it’s quite polished for a 0.10.1 version. It may be the most C-like in spirit of any language here, and produces some of the fastest programs of any language. Zig is probably the most popular and successful new systems language next to Rust. Like the other languages here it is substantially safer than C, even though it has fewer safety innovations than most other languages in this list. While all memory allocation is manual, you can easily change allocators and choose a debugging allocator to check for memory safety. In fact, having to choose an allocator is a feature: it’s easy to tell by reading Zig code whether any allocations are done. To allocate you must have an allocator.

Zig is very impressive for providing an alternative C and C++ tool chain. If it weren’t for some highly restrictive conditions on our build environment, I’d re-do my current large C++ project at work to build with Zig.

The documentation is somewhat behind Rust but the project is at an earlier development stage, so that’s to be expected. There are lots of good example code snippets on the home page showing language features in use and a good language reference.

Odin Odin home page

Manual memory management including defer and easy to use allocators
Simple syntax: No overloading of operators, no uniform function call syntax
Bitsets
“Distinct” types: Actual new types based on existing types for stronger type checking.
Tagged unions
LLVM backend

Odin is designed for simplicity of syntax, high performance compilation and high performance executables. It is influenced by Pascal, Oberon and other Wirth languages which is a good thing. I really like many of the design choices in Odin. While it may have the fewest safety affordances compared to other modern systems languages, it still provides much better safety than C++ at little expense.

Odin has imperative statements like C, Pascal etc. rather than Rust or Ruby’s “everything’s an expression” approach. Odin compensates well by offering some nice operators like “or_else” to allow conditional expressions. Once you have gotten used to “if” as an expression it’s hard to go back.

Odin is being used for real high-performance applications. It has a lot of support for inter-operating with external libraries and has bindings for popular graphics and audio libraries.

As a game-dev adjacent language it kind of makes sense that complex numbers and quaternions are built-in types in the language.

Jakt Jakt home page

Memory safe with reference counting
Sum types with enums and pattern matching
Value semantics with struct, reference semantics with classes
Set types
Reference syntax is superficially like Rust’s, but without a borrow-checker applied to references
Closures
Compile time function execution
C++ output – C++ compiler as the backend

Jakt is a part of the Serenity OS project. Jakt is meant to be the language for writing Serenity applications. It was created in 2022 very quickly and is already self-hosting. Jakt isn’t limited to use in Serenity however; it currently generates C++ which could be compiled almost anywhere.

Interestingly, one of Jakt’s top goals was readability. For instance, function arguments are always named, not positional as in most languages.

The designers made many choices that really agree with my own view of how a programming language should be.

Hare Home

Manual memory management
Tagged unions
Bounds checking
Mandatory initializers
Mandatory error handling or immediate termination
Exhaustive switch and match
Nullable pointers
QBE backend

Hare is very new: the first open source release was announced in 2022. Hare comes closest to a direct C replacement. It’s small (it fits on a 3.5” floppy disk). It uses QBE as the compiler back end, which partly accounts for its compactness. QBE is an optimizing back end written in C that’s much smaller than LLVM.

Hare has various features to improve on C’s spatial memory safety, but not so much temporal safety. The road map indicates they’re considering a borrow-checker though. The exhaustive match and switch along with forced error handling and forced handling of null values make it a lot less bug-prone than C in general. Statements are expressions, so as in Rust you can initialize a variable with the result of an “if” expression-statement, for example.
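For readers who haven’t used an expression-oriented language, this is roughly what initializing a variable from an “if” looks like. The sketch is in Rust rather than Hare, and the classify function and its thresholds are made up for the example.

```rust
/// Picks a label for a temperature (hypothetical thresholds).
fn classify(temperature_c: i32) -> &'static str {
    // `if` yields a value, so `label` is initialized exactly once and never
    // needs to be declared mutable or left uninitialized.
    let label = if temperature_c < 0 {
        "freezing"
    } else if temperature_c < 25 {
        "mild"
    } else {
        "hot"
    };
    label
}

fn main() {
    println!("{}", classify(-5));
    println!("{}", classify(18));
    println!("{}", classify(31));
}
```

Because every branch has to produce a value of the same type, there is no path on which the variable is left unset.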

The Hare developers plan to only support free platforms, so they won’t ever support macOS or Windows.

There’s a document on the Hare home page dedicated to Hare safety features.

Rust Rust home page

Sum types via enum and pattern matching
Expression-oriented: most constructs, including “if” and “match”, are expressions
Borrow checker to enforce single ownership and single borrower of mutable data
Lifetimes
Manual memory management (but with lifetimes and borrow checking, no manual alloc or free is needed.)
Good built-in threading support including memory safety for concurrent programs
LLVM backend; alternative GCC and Cranelift backends

I shouldn’t need to say too much about Rust. It’s the big player and the only one on this list past a 1.0 release.

The ability to have non-reference counted, non-GC (so sort of manual) memory management while keeping Rust programs very memory safe and fast is the big trick of Rust. But combining that with other good code safety like sum types and pattern matching makes Rust great. Safe Rust (there can be blocks marked “unsafe” for exceptional situations) offers a lot of practical safety. Rust is one of those languages where if you can get your code to compile there’s a good chance it won’t crash and will actually do what you intended on the first try.
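Here is a small sketch of the borrowing rules in action: any number of shared borrows, or exactly one mutable borrow, but never both at once. The function names are invented for the illustration.

```rust
/// Sums the values through a shared (read-only) borrow.
fn total(values: &[i32]) -> i32 {
    values.iter().sum()
}

/// Appends the current total through an exclusive (mutable) borrow.
fn append_total(values: &mut Vec<i32>) {
    // The shared reborrow used for the sum is over before the push happens.
    let sum = total(values.as_slice());
    values.push(sum);
}

/// Walks through both kinds of borrow and returns the final vector.
fn demo() -> Vec<i32> {
    let mut data = vec![1, 2, 3];

    // Any number of shared borrows may coexist.
    let first = &data;
    let second = &data;
    assert_eq!(first.len(), second.len());

    // `first` and `second` are not used past this point, so a single
    // exclusive borrow is allowed.
    append_total(&mut data);

    // Rejected by the borrow checker if uncommented, because `reader`
    // would still be in use after `data` is mutated:
    // let reader = &data;
    // data.push(4);
    // println!("{}", reader.len());

    data
}

fn main() {
    println!("{:?}", demo());
}
```

Nothing here is checked at run time; the compiler proves the borrows never overlap, which is how Rust gets this safety without a GC or reference counts.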

The thing I dislike the most about Rust is the fact that lifetimes are so often automatically assigned and invisible; so much so that when you actually must set them explicitly it’s a struggle at times because it happens so rarely and you’re out of practice. Also the use of traits can get out of hand in my opinion. Oh, and lastly the borrow checker makes it hard to define some data structures that are trivial with a garbage-collected language. I tend to use recursive patterns learned from Scheme, Ruby and Java which don’t fit well with Rust. With the right lifetimes you can do it though.

Austral Austral home page

This language is at an early stage of development. While the whole language is implemented, the standard library is still under construction. The build system is at an MVP level currently. Nevertheless it’s extremely interesting. Austral is unique in its careful selection of a few key features, most notably linear types and type classes.

Linear types
Memory management: Automatic with linear types, no GC
Type classes
Capability based security
Sum types with unions
Exhaustive case statement
C backend for now

Austral sort of looks like a mini-Ada at first glance. Superficially the syntax looks similar. This hints at its intended use. Austral is designed to produce very safe programs with fairly easy to read syntax. There’s a big list of anti-features – Austral keeps things simple.

The most interesting features are linear types and type classes. Linear types partly fill the role of Rust’s borrow-checker but offer even more safety due to their strictness. Austral uses a borrow checker but it is simpler than Rust’s; linear types use lexical scope. What are linear types? A value of a linear type must be used exactly once: it can be neither silently discarded nor used twice. Rust uses affine types, where a value may be used at most once; these are more permissive than linear types and perhaps surprisingly a little harder to reason about and check. Linear types can take care of not just memory safety but resource safety (open files, network connections) generally.
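The difference between “at most once” and “exactly once” is easier to see in code. The sketch below is Rust, not Austral, and the Connection type is a made-up stand-in for a resource such as a file or socket. It shows what Rust’s affine rules enforce (no use after consumption) and what they let slide (never consuming the value at all).

```rust
/// A hypothetical handle for some external resource.
struct Connection {
    id: u32,
}

impl Connection {
    fn open(id: u32) -> Connection {
        Connection { id }
    }

    /// Consumes the handle: taking `self` by value means the caller
    /// cannot touch the connection again afterwards.
    fn close(self) -> String {
        format!("connection {} closed", self.id)
    }
}

fn main() {
    let conn = Connection::open(7);
    let message = conn.close();
    println!("{message}");
    // conn.close(); // compile error if uncommented: use of moved value `conn`

    // Affine, not linear: Rust is happy with a value that is never consumed.
    // As I understand the distinction, a linear type system would reject this
    // line because the handle is discarded without being closed.
    let _forgotten = Connection::open(8);
}
```

In Rust you would usually paper over the second case with a Drop implementation; a linear type system instead makes the programmer write the cleanup call explicitly.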

Austral provides polymorphism in the form of type classes. You can think of type classes very approximately as overloaded functions, but done in a way that’s easier to disambiguate and type-check with useful restrictions.

It’s interesting to note the current (bootstrapping) Austral compiler is written with OCaml.

Vale Vale home page

Automatic, very fast memory management
Higher RAII: What’s this?
Single ownership without the need for a borrow-checker
LLVM backend

Vale has “fast, safe, easy” as a goal. Vale uses a novel technique to manage its memory called generational references. It’s a sort of reference counting technique but with ownership analysis so that actual reference counts and free / allocation can be kept to a minimum. They claim Vale is “the most safe” natively compiled language. A Region Borrow Checker is under development to make the language even faster and safer.

The language looks very promising already. There are many features beyond what I’ve listed here that are aspirational: The language is early alpha presently. The memory management and single ownership story is very compelling.

Myrddin Myrddin home page

Algebraic Data Types
Pattern matching
Traits
Closures

Lobster : http://aardappel.github.io/lobster/README_FIRST.html

V V home page

I don’t know as much about V as some of these other languages. The home page claims:

GC memory management – kept to a minimum for performance
Simple syntax, similar to Go
Limited pattern matching
Sum types
Error handling with Result sum type
C back end

It appears to borrow some good parts from Rust and Go, with a somewhat more sophisticated type system than Go has, but keeping the garbage collector and simple concurrency support. To me these feel like good trade-offs. It has built-in JSON support and appears to have some other “batteries included” libraries under development. They already have a package manager. I can’t speak to the quality of the current release (it’s version 0.3.3). Last year there was some online controversy over how much of the claimed functionality actually worked.

There’s a C to V converter. Like Jakt or Nim you could probably include C source pretty easily and use external libraries with V.

Slightly older languages, all usable for production:

Crystal: GC. Ruby-like syntax, extremely powerful type-inference, union types. Crystal rocks.

Nim: GC or ARC, compiles to C or WASM. Sort of a Modula-2 with Python-like whitespace scoping.

Go: GC, compiler makes self-contained binaries, good concurrency support in standard library.

D: GC or manual, has “better C” support; pure functions, extremely fast compiler, multi-paradigm. Lots of nice little features.