After you have done the obvious algorithmic complexity analysis on the code you’ve written for your C++ application, what’s next on the list of ways to optimize it? How about looking for more efficient implementations of standard data structures in libraries?
[Read More]
Optimization Part II: Targeted Optimizations Assisted by Flame Graphing
Early last year IPUMS moved production of IPUMS-International micro-data to the latest version of the core DCP and a new data editing API. In doing so we discovered a number of places where the new API – while performing better than the old one on our USA and CPS test datasets – performed worse than expected on some of the IPUMSI datasets. Not a big deal except for a few datasets that took twenty or thirty times longer to process than we would expect.
[Read More]
Optimizing a Data-Intensive C++ Application, Part I
At IPUMS we continuously enhance our data products with newly available datasets, adding new variables and improvements to existing variables. We do this with the “Data Conversion Program”, a C++ application built to transform census and survey data into “harmonized” micro-data. When you visit ipums.org and make data extracts, you’re downloading data developed with the DCP.
[Read More]
Python 3 Language Notes
Notes on the Python 3 Language
[Read More]
Reparations
Introduction
[Read More]
Save the USPS
The U.S. Postal Service is required to fund itself by charging for services like a private business. Since the beginning of the COVID-19 outbreak, mail volume has dropped by more than half, severely undercutting the Postal Service’s budget.
[Read More]
The Parquet Data Format Landscape
As you begin to handle Parquet data with tools in more than one framework and language, you’ll probably wonder how all these related pieces fit together. Here is a summary of the data formats, libraries and frameworks you will encounter when working with Parquet data and Spark.
[Read More]
SF Worth Reading
The book recommendations list has moved to sfworthreading.com. It’s a static site built with Jekyll and a Ruby “updater” script I wrote to do some busy work for me that Jekyll won’t, like building a custom authors index.
[Read More]
Markdown Syntax Highlighting With Notepad++
Changing the default Notepad++ theme doesn’t change most of the colors in a Markdown document. This is especially apparent when using a dark-mode Notepad++ style and dark theme in Windows. You have to manually edit a special Markdown theme to change most of the colors and fonts.
[Read More]
Read the Report
Following the release of Special Counsel Robert Mueller’s report this spring, we’ve heard lots of interpretations that can’t survive even a casual reading of the report.
[Read More]