Hi, I'm Lukas Atkinson. My irregularly published long-form blog posts cover topics such as software development, programming language concepts, data protection, and whatever I currently find interesting.

Recent posts

You Just Don't Need Tox

Tox is a neat tool for helping test Python projects. It automatically creates “virtual environments” that include the necessary dependencies, and can then run user-defined tools for testing. Multiple environments can be created in a declarative manner to test different combinations of Python versions + dependencies. Tox will also build packages in an isolated environment.

This has been an absolutely fantastic tool in its 15 years of existence. But in that time, tooling in the Python ecosystem has moved on, and doing it the Tox way will probably slow you down.

You can get 90% of the value of Tox by wrapping Poetry or uv, and will end up with simpler, faster, and more flexible QA tooling. My preferred way to do that is to define tasks with Just, which enables something quite close to the npm run style development experience.

Is Python Code Sensitive to CPU Caching?

Cache-aware programming can make a huge performance difference, especially when writing code in C++ or Rust. Python is a much more high-level language, and doesn't give us that level of control over memory layout of our data structures. So does this mean that CPU caching effects aren't relevant in Python?

In this post, we'll conduct some basic experiments to answer this question, by accessing list elements either in sequential order or in a random order.

Results indicate that randomized access is consistently slower in Python. Once problem sizes outgrow the CPU caches, random access is several times slower. Thus, some degree of cache-aware programming may be relevant in interpreted environments like CPython 3.12 as well.
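A minimal sketch of such an experiment is shown here. It is not the code from the post: the function names `sum_in_order` and `time_access`, the list sizes, and the use of `time.perf_counter` are all assumptions made for illustration. Both runs do the same work on the same list and differ only in the order of the indices.

```python
import random
import time


def sum_in_order(values, order):
    """Sum values[i] for every index i in order."""
    total = 0
    for i in order:
        total += values[i]
    return total


def time_access(n, seed=0):
    """Time one sequential and one shuffled pass over a list of n integers.

    Returns a dict mapping the access pattern to (seconds, checksum).
    """
    values = list(range(n))
    sequential = list(range(n))
    shuffled = sequential[:]
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps runs repeatable

    results = {}
    for name, order in (("sequential", sequential), ("random", shuffled)):
        start = time.perf_counter()
        total = sum_in_order(values, order)
        results[name] = (time.perf_counter() - start, total)
    return results


if __name__ == "__main__":
    # Hypothetical sizes; pick values below and above your CPU cache sizes.
    for n in (10_000, 1_000_000):
        for name, (seconds, _) in time_access(n).items():
            print(f"n={n:>9} {name:>10}: {seconds:.4f} s")
```

The checksum is returned so that both passes can be verified to touch the same elements. Absolute timings depend on the machine and the interpreter, so a single run like this is only a rough indication.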

Allocator Testing

This post contains a bunch of ideas on testing custom allocators in C, assuming a single-threaded scenario.

Cursed Syscalls to Set IO Priority in Python

While cleaning up some of my dotfiles, I found what may be the most cursed Python code I have ever written: raw syscalls that required parsing Linux header files.

Intent, not implementation

When designing interfaces / APIs, it is easy to design the interface around the solution space. This makes such interfaces difficult to use, difficult to test, and difficult to maintain. Instead, our interfaces should allow users to easily express their intent.

Rust doesn't actually follow its Golden Rule

(when it comes to async functions)

A couple of days ago, Steve Klabnik published an article discussing Rust's Golden Rule, arguing that Rust's function signatures provide a clear contract that doesn't depend on the function's contents, which aids reasoning about the code. In particular, function signatures are never inferred.

However, the Rust language has evolved so that it violates this Golden Rule. While impl Trait return types by themselves are fine, they combine with auto-traits such as Send in an unfortunate manner. This is a noticeable limitation when it comes to writing async Rust code.

Brexit deal and GDPR: no adequacy yet, but transfers can continue for a while…

The last-minute Brexit deal essentially extends the transition period status for a few months with regard to data protection issues.

by Lukas Atkinson GDPR (4)

Interface Dispatch

Virtual method calls are simple: you just look up the method slot in a vtable and call the function pointer. Easy! Well, not quite: interfaces present a kind of multiple inheritance, and things quickly become complicated.

This post discusses interface method calls in C++ (GCC), Java (OpenJDK/HotSpot), C# (CLR), Go, and Rust.

It is an expanded version of my answer on Software Engineering Stack Exchange on Implementation of pure abstract classes and interfaces.

How to check for an array reference in Perl

So you've got a Perl $variable. Can we use it as an array or hash reference? If you do an online search for possible solutions, you'll find a number of suggestions, most of them wrong.

TL;DR: checking if ref $variable eq 'ARRAY' is almost always a bug. Depending on your use case, you want:

  • reftype $variable eq 'ARRAY' from Scalar::Util as a check for physical array references, or
  • _::is_array_ref $variable from my module Util::Underscore as a check for logical array references.
by Lukas Atkinson Perl (5)

Dist::Zilla on Travis CI

With Dist::Zilla (dzil), testing Perl projects on Travis CI can be a bit tricky. Here's my approach.

Should I Separate Unit Tests from Integration Tests?

When does it make sense to keep integration tests separate from your unit tests, and when is it OK to make no distinction?

Well, it's all about getting fast feedback.

by Lukas Atkinson Testing (6)

Dynamic vs. Static Dispatch

This article explains the difference between dynamic dispatch (late binding) and static dispatch (early binding). We'll also touch on the differences in language support for virtual and static methods, and how virtual methods can be circumvented.
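The distinction can be approximated in Python, although Python itself resolves every method call at run time. In this sketch (the classes `Animal` and `Dog` are invented for illustration and do not come from the article), a call through the instance selects the method by the object's runtime type, while naming the class explicitly picks a fixed function and so circumvents the override.

```python
class Animal:
    def speak(self):
        return "..."

    def describe(self):
        # self.speak() is looked up on the runtime type of self,
        # so subclasses can change what describe() reports.
        return f"I say {self.speak()}"


class Dog(Animal):
    def speak(self):
        return "Woof"


dog = Dog()

# Dynamic dispatch: the method is chosen by the object's actual class.
dynamic_result = dog.speak()

# Circumventing the override: the function is named explicitly,
# so the runtime type of dog plays no role in selecting it.
fixed_result = Animal.speak(dog)

if __name__ == "__main__":
    print(dynamic_result)  # Woof
    print(fixed_result)    # ...
    print(dog.describe())  # I say Woof
```

In a compiled language, the second form is roughly what a non-virtual call looks like: the target is known from the source text alone.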

Simpler Tests thanks to “Extract Method” Refactoring

I'm currently refactoring a huge method into smaller parts. It is chock-full of nested loops, maintains a complex state machine with more variables than I have fingers, and is the kind of code where I have to ask myself how I could ever have thought this was a good idea. So obviously, I'm splitting that function into smaller, independent chunks with the Extract Method refactoring technique. Since the control flow is now simplified, the code has also become easier to test – as long as I'm comfortable with testing private methods. Why?

The number of test cases needed for full path coverage corresponds directly to the McCabe complexity of the code under test. Since many simple functions often have lower total complexity than one convoluted function, the overall required testing effort is reduced. As this reduction can be substantial, there is a strong incentive to test the extracted methods directly, instead of testing only through the public interface.

by Lukas Atkinson Testing (6)

Extract Your Dependencies

Making your code ready to be tested

Global dependencies make it difficult to properly test a piece of code. By extracting all dependencies into a single manageable object, we can easily mock the necessary services and avoid a large-scale refactor.
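As a rough illustration of the idea, here is a small Python sketch. The `Services` class, its `clock` and `log` fields, and the `greet` function are hypothetical names chosen for this example; the post may structure its dependency object differently.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Callable, Optional


@dataclass
class Services:
    """Bundles everything the code would otherwise reach for globally."""
    clock: Callable[[], datetime] = datetime.now
    log: Callable[[str], None] = print


def greet(name: str, services: Optional[Services] = None) -> str:
    # Production callers pass nothing and get the real clock and logger.
    services = services or Services()
    hour = services.clock().hour
    part = "morning" if hour < 12 else "afternoon" if hour < 18 else "evening"
    message = f"Good {part}, {name}"
    services.log(message)
    return message


if __name__ == "__main__":
    # In a test, both dependencies are replaced with controllable fakes.
    logged = []
    fake = Services(clock=lambda: datetime(2024, 1, 1, 8, 0), log=logged.append)
    print(greet("Ada", fake))  # Good morning, Ada
```

Because all dependencies travel in one object, a test only needs to construct one fake instead of patching several globals.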

An Overview Of The Marpa Parser

There are many exciting parser technologies out there, and one of the most promising is Marpa. This post discusses how Marpa improves over commonly used parsers.

Emerging Objects

Building a simple object system out of closures

Object-oriented programming and functional programming imply each other. While encoding closures as objects is a well-known technique (see the command pattern, and e.g. Functors in C++), using closures to implement objects is a bit more unusual.

In this post, I will explore creating a simple object system in JavaScript, using only the functional parts.
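The post itself works in JavaScript; the following is only a comparable sketch in Python, with the names `make_counter` and `dispatch` invented for illustration. The "object" is a closure that receives a message name, its private state is a captured variable, and its methods are inner functions.

```python
def make_counter(start=0):
    """Create a counter 'object' whose state lives in the enclosing scope."""
    count = start  # private state, reachable only through the closures below

    def increment():
        nonlocal count
        count += 1
        return count

    def get():
        return count

    methods = {"increment": increment, "get": get}

    def dispatch(message, *args):
        # A method call is a lookup by name followed by a function call.
        return methods[message](*args)

    return dispatch


if __name__ == "__main__":
    counter = make_counter(10)
    counter("increment")
    print(counter("get"))  # 11
```

Each call to `make_counter` creates a fresh scope, so separate counters do not share state, much like separate instances of a class.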

Transforming Syntax

Or: how to write the easy part of a compiler

A Stack Overflow question asked how to translate a VB-like conditional into a C-like ternary. The other answers suggested regexes or treating it as Perl code *shudder*. But transpiling code to another language can be done correctly.

This post aims to cover:

  • parsing with Marpa::R2,
  • AST manipulation,
  • optimization passes,
  • compilation, and
  • Perl OO.

In the end, we'll be able to do all that in only 200 lines of code!

Since this post is already rather long, we will not discuss parsing theory. You are expected to be familiar with EBNF grammar notation.
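To give a flavour of the parse, transform, and emit pipeline, here is a deliberately tiny Python sketch. It is not the post's implementation: the post uses Marpa::R2 and Perl, whereas this uses a hand-written recursive descent parser. The input syntax `IIf(condition, a, b)` is an assumption about what a "VB-like conditional" looks like, and operands are limited to flat sequences of identifiers, integers, and operators.

```python
import re
from dataclasses import dataclass

# Identifiers, integers, two-character comparisons, then single characters.
TOKEN = re.compile(r"\s*([A-Za-z_]\w*|\d+|<>|[<>]=|[-+*/<>=(),])")


@dataclass
class Atom:
    text: str


@dataclass
class Conditional:
    test: object
    then: object
    otherwise: object


def tokenize(source):
    tokens, pos = [], 0
    source = source.strip()
    while pos < len(source):
        match = TOKEN.match(source, pos)
        if match is None:
            raise SyntaxError(f"unexpected character at offset {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


def expect(tokens, i, wanted):
    if i >= len(tokens) or tokens[i] != wanted:
        raise SyntaxError(f"expected {wanted!r} at token {i}")
    return i + 1


def parse_expr(tokens, i):
    """Parse one expression starting at token i; return (node, next index)."""
    if i < len(tokens) and tokens[i] == "IIf":
        i = expect(tokens, i + 1, "(")
        test, i = parse_expr(tokens, i)
        i = expect(tokens, i, ",")
        then, i = parse_expr(tokens, i)
        i = expect(tokens, i, ",")
        otherwise, i = parse_expr(tokens, i)
        i = expect(tokens, i, ")")
        return Conditional(test, then, otherwise), i
    start = i
    while i < len(tokens) and tokens[i] not in {",", ")"}:
        if tokens[i] in {"(", "IIf"}:
            raise SyntaxError("parentheses or IIf inside an operand are not supported")
        i += 1
    if i == start:
        raise SyntaxError(f"expected an expression at token {i}")
    return Atom(" ".join(tokens[start:i])), i


def emit(node):
    """Turn the AST into C-like source text."""
    if isinstance(node, Atom):
        return node.text
    return f"({emit(node.test)} ? {emit(node.then)} : {emit(node.otherwise)})"


def translate(source):
    tokens = tokenize(source)
    tree, end = parse_expr(tokens, 0)
    if end != len(tokens):
        raise SyntaxError("trailing input after expression")
    return emit(tree)


if __name__ == "__main__":
    print(translate("IIf(a > b, a, IIf(b > c, b, c))"))
    # (a > b ? a : (b > c ? b : c))
```

Keeping an explicit AST between parsing and emitting is what leaves room for the manipulation and optimization passes that the post discusses; this sketch has none.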

Archive

See also the note dump.