The Parquet columnar data format typically has columns of simple types: int32, int64, string and a few others. However, columns can also have the logical types “List” and “Map”, and their members may in turn be further “List” or “Map” structures or primitive types.
[Read More]
Investigating Mojo 🔥
I spent the afternoon learning about Mojo. Here are my notes.
Mojo aims to be a superset of Python by supporting Python syntax and adding new keywords for more performant and safer code. Mojo's compiler produces extremely high-performance executable binary files. It offers interop with existing Python libraries and a limited set of Python types
[Read More]
Effectively Avoid Problems When Consuming Legacy Character Encodings in Rust
There’s still a lot of old “extended” ASCII out there, and you may need to deal with it. One source is the old-fashioned “fixed-width” data formats, but it can turn up in any old files, such as Windows spreadsheets. Whatever the case, you can’t just wish it away, sadly.
[Read More]
RustConf 2023 Notes
Notes on RustConf 2023 Talks I Attended
We saw a wide range of talks, from Rust success stories to language improvement projects, to team dynamics / Rust org growing pains. I didn’t (couldn’t) attend all the talks, so I may have missed some really interesting stuff. This is just what I took away from the ones I went to.
[Read More]
Better Code Organization by Nesting Functions
The other day I found myself writing a really long Python script full of small groups of “helper” functions. Each group only “helped” a single caller. Something felt off. What a mess. Hidden under all the clutter, the script had a fairly simple structure. There was only one path through the code. Breaking it into separate files would only obscure the logic. So how could I make that clearer?
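As a rough illustration of the idea in the title, here is a minimal sketch of helpers nested inside their only caller. The names `build_report`, `clean`, and `summarize` are hypothetical and not taken from the original script; they only show the shape of the technique.

```python
def build_report(rows):
    """The single caller. Its helpers live inside it, so nothing else can use them."""
    # All names in this sketch are hypothetical, chosen only for illustration.

    def clean(row):
        # Helper used only by build_report: strip whitespace from every field.
        return [field.strip() for field in row]

    def summarize(cleaned_rows):
        # Helper used only by build_report: count the non-empty fields per row.
        return [sum(1 for field in row if field) for row in cleaned_rows]

    return summarize([clean(row) for row in rows])


print(build_report([[" a ", ""], ["b", " c "]]))  # prints [1, 2]
```

Because `clean` and `summarize` are defined in the body of `build_report`, a reader can see at a glance that they serve that one function, and they do not add to the module's top-level namespace.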
[Read More]
Add Key-Value Metadata to Parquet Files in C++
Arbitrary file-level metadata on a Parquet file can be extremely useful, but adding it in C++ isn't well documented. Here's how to do it.
Although the Parquet format allows extra metadata and the C++ libraries provide a means to read and write it, the capability isn’t well documented. I’ll show some example code to clarify how to read and write key-value Parquet metadata. This advice is specific to using the C++ libraries in the Arrow project directly.
[Read More]
My not so deep thoughts on AI
Speculation on our AI future races straight to where we fear it ends up, but we should think about what comes first (though I can't help myself in the concluding thoughts).
Will A.I. Become the New McKinsey? Yes. And it will begin when McKinsey and the other big consultants start to apply A.I. routinely. If firms like McKinsey are “capital’s willing executioners”, A.I. will at first merely sharpen their axes.
[Read More]
So Many New Systems Programming Languages II
Twelve new systems languages, and one that dates to the Carter administration
Here’s a non-exhaustive rundown of newish systems languages. I’ll list some notable things about them related to safety and syntax as I discussed in the previous post. Well, here they are, in rough order of production readiness and popularity. Sorry if I put something lower than it deserved.
[Read More]
Taking a Look at the Recent Batch of Systems Programming Languages
The new languages emphasize safety, convenient syntax, and high performance. There are so many of them!
The last few years have brought us an explosion in the number of new systems languages under development. They’re mostly trying to find good balances between safety, performance and expressivity. In this post I’ll first briefly outline what’s meant by “safety” and a little on how today’s systems languages try to achieve it.
[Read More]
Value and Reference Semantics in Modern Programming Languages
Some relatively new programming languages give the developer clear ways to choose value or reference semantics for each data type, which is unusual. This choice controls whether data is passed by value or by reference to and from functions. Languages have nearly always supported the distinction, but it wasn’t always so obvious what was going on.
[Read More]