Colin's Notes

Newtypes are user-defined types that are derived from common types in a language but treated as distinct types by the language's interpreter or compiler and its type checker. What this means precisely depends on the language. However, it’s important to understand that newtypes aren’t simply type name aliases like “type Name = ...” in Rust or “typedef” in C and C++. Instead, a newtype makes the type checker reject, for instance, assigning a newtype derived from an integer to a plain integer variable. The term “newtype” comes from Haskell, where it is a keyword. In other languages a “newtype” is a programming idiom or pattern, not a specific language feature.

Newtypes: Type Aliases with Static Type Checking

Type aliases are nice for making complex codebases more readable by letting the developer name types after their intended use.

With aliasing alone, function signatures can get much easier to read and give more context about the intent of the code. In a made-up C++ pseudo-language:

	bool register(string person, HashMap<String, List<Integer>> & icps_general) {
		...
	}

Becomes

	typedef string PersonID;
	typedef HashMap<String, List<Integer>> LocalRegistry;

	bool register(PersonID person, LocalRegistry & icps_general) {
		...
	}

But there won’t be a compiler error if you pass a plain string as ‘person’ to the register function; type aliasing doesn’t add any type checking. Newtypes add this capability. How hard it is to add newtypes depends on the language, however: less than easy in C++, fairly easy in Rust, and very easy in Python.

Just “adding” newtypes means more than allowing new type names associated with existing types; you also must consider all pre-existing operators and library functions that accept the original type. You’ll need to overload / extend operators and functions. Some languages make this easier than others.
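As a minimal sketch of what "extending operators" looks like in Rust: the Seconds type below is made up for illustration. The wrapped integer supports "+" out of the box, but the newtype does not until you implement the trait yourself.

```rust
use std::ops::Add;

// Hypothetical newtype for illustration: a duration in whole seconds.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct Seconds(u32);

// The inner u32 supports `+`, but Seconds does not until we opt in.
impl Add for Seconds {
    type Output = Seconds;

    fn add(self, other: Seconds) -> Seconds {
        Seconds(self.0 + other.0)
    }
}

fn main() {
    let total = Seconds(30) + Seconds(12);
    println!("{:?}", total);
    // `Seconds(30) + 12` would not compile: there is no Add<u32> for Seconds.
}
```

Only the operations you implement are available, which is the point: nothing is allowed by accident.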

One approach, taken by Python type hinting, creates a newtype with “NewType(NAME, TYPE)”, which static type checkers treat as a sub-class of the original type. This way, a value of the newtype still responds to all methods of the original class. This may actually be more permissive than you’d like, but it’s simple and painless.

Rust Newtypes and Type Aliases

The ‘type’ keyword adds an alias for a type, similar to “typedef”:

type CustomErrorMessage = String;
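To see that the alias adds no checking, here is a small sketch. The log_error function is a made-up example, not from any library; it is declared to take the alias but happily accepts a plain String.

```rust
type CustomErrorMessage = String;

// Hypothetical function: declared with the alias, which is just another
// name for String.
fn log_error(message: CustomErrorMessage) -> usize {
    message.len()
}

fn main() {
    let plain: String = String::from("disk full");
    // Compiles without any conversion: the alias and String are the same type.
    let length = log_error(plain);
    println!("{}", length);
}
```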

Rust has newtypes via struct:

struct PersonId(i32);
struct CustomErrorMessage(String);

This is a tuple-struct where the members are unnamed; you access the data by position, with “.0” for the first member:

let my_id = PersonId(25);

if my_id.0 == 25 {
	...
}
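The payoff shows up in function signatures. In this sketch, is_admin is a hypothetical function that only accepts a PersonId; passing a bare integer is rejected at compile time.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct PersonId(i32);

// Hypothetical check for illustration; only a PersonId is accepted here.
fn is_admin(id: PersonId) -> bool {
    id.0 == 25
}

fn main() {
    let my_id = PersonId(25);
    println!("{}", is_admin(my_id));
    // `is_admin(25)` would not compile: expected `PersonId`, found integer.
}
```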

You derive traits so that a newtype can be used much like the common type it wraps; use the “derive_more” crate to implement these traits for you, otherwise using Rust newtypes can be tedious.

use derive_more::{Display, FromStr};
#[derive(FromStr, Display)]
pub struct CustomErrorMessage(String);
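For comparison, here is roughly what you would write by hand with only the standard library. This is a sketch of the idea, not a claim about the exact code derive_more generates; the choice of Infallible as the error type is my assumption.

```rust
use std::convert::Infallible;
use std::fmt;
use std::str::FromStr;

pub struct CustomErrorMessage(String);

impl fmt::Display for CustomErrorMessage {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // Delegate to the inner String's Display.
        write!(f, "{}", self.0)
    }
}

impl FromStr for CustomErrorMessage {
    // Assumption for this sketch: wrapping a string cannot fail.
    type Err = Infallible;

    fn from_str(s: &str) -> Result<Self, Self::Err> {
        Ok(CustomErrorMessage(s.to_string()))
    }
}

fn main() {
    let msg: CustomErrorMessage = "file not found".parse().unwrap();
    println!("{}", msg);
}
```

Multiply this by every trait and every newtype and the appeal of deriving becomes clear.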

Other Languages

Similar to Rust, you could use a single-member struct in C++ as a newtype, or consider Boost’s strong typedef facility (BOOST_STRONG_TYPEDEF).

Why Newtypes Matter

So far I haven’t made much of a case for newtypes; in part I hope their value is self-evident. But think about big codebases where you see function or method signatures all over with long multi-part types like HashMap<String, List<HashSet>> or similar monstrosities. Aliasing only solves part of the problem. In Java you ought to wrap something like this in a class; it's a bit more difficult in C++ or Rust, but wrapping this sort of thing in a struct is probably the right thing to do. It's a pretty well understood good practice to name these sorts of things early on rather than letting them spread all over the codebase. Simple aliasing helps you discover where a certain structure is used; wrapping it in a class or struct helps the type checker find mistakes in the code.
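A rough sketch of the struct-wrapping approach in Rust, borrowing the PersonId and LocalRegistry names from the earlier pseudo-code. The contents of the map and the behavior of register are assumptions made for illustration.

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, PartialEq, Eq, Hash)]
struct PersonId(String);

// The long multi-part type is named once and hidden behind the wrapper.
#[derive(Debug, Default)]
struct LocalRegistry(HashMap<PersonId, Vec<i32>>);

impl LocalRegistry {
    // Assumed behavior: returns true if the person was not registered before.
    fn register(&mut self, person: PersonId) -> bool {
        if self.0.contains_key(&person) {
            false
        } else {
            self.0.insert(person, Vec::new());
            true
        }
    }
}

fn main() {
    let mut registry = LocalRegistry::default();
    let first = registry.register(PersonId("ada".to_string()));
    let second = registry.register(PersonId("ada".to_string()));
    println!("{} {}", first, second);
}
```

Callers never see the HashMap, so its shape can change later without touching every signature.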

The harder cases come in where you have relatively simple data types passed around, and where those types could be accidentally used in the wrong context. Typically we stick to using “f64”, “String”, “Vec” and so on, but maybe we shouldn't. For simple types that don't do a lot of operations on each other, the basic Rust approach with a tuple-struct and “derive_more” should work well enough. You can implement specific operators if you need to. This will improve many situations in terms of type safety and readability. But it won't scale to lots of inter-dependent types.

The classic example would be units of measure. Measures of distance, mass and temperature all need a number to store their magnitude, but they must not be mixed up in functions or expressions using them. Especially tricky are cases where the same quantity is measured in different units, like feet and meters or kilograms and pounds. You don’t want to multiply 20 meters by 25 seconds and compare that with 25 inches times 30 seconds!
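A minimal sketch of the simple tuple-struct approach applied to lengths. The Meters and Feet types and the fits_in_room function are made up for illustration; the only hard fact is that one foot is defined as 0.3048 meters.

```rust
#[derive(Debug, Clone, Copy, PartialEq, PartialOrd)]
struct Meters(f64);

#[derive(Debug, Clone, Copy, PartialEq, PartialOrd)]
struct Feet(f64);

// Conversion must be requested explicitly; 1 foot is exactly 0.3048 meters.
impl From<Feet> for Meters {
    fn from(feet: Feet) -> Meters {
        Meters(feet.0 * 0.3048)
    }
}

// Hypothetical function that only accepts meters.
fn fits_in_room(length: Meters) -> bool {
    length < Meters(5.0)
}

fn main() {
    let sofa = Feet(10.0);
    // `fits_in_room(sofa)` would not compile: expected `Meters`, found `Feet`.
    println!("{}", fits_in_room(sofa.into()));
}
```

This works for a handful of units, but every new unit needs its own conversions and operators, which is where the scaling problem mentioned above starts to bite.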

Here’s a discussion of using Rust refinement types to properly handle units of measure.