Using Timers to Learn

Using Timers to Learn
Photo by Veri Ivanova / Unsplash

When I'm trying to learn a new language or environment, one of the first things that I try to do is implement a timer: code that can be used to instrument other code and print out how long it took to run.

A long time ago, early on in my time learning Python, I wrote a client library for Etsy's StatsD server. (Steve Ivy and Rick van Hattem also created clients right around the same time. It was new and fun.) StatsD is all about counting and timing: it's instrumentation to record data about the behavior of an piece of software.

For timers in particular—because they can otherwise be the most involved—I was interested in making the instrumentation as fluent and unobtrusive as possible. For example, if I want to count "something happened," it can be as small as one line:

stats.incr('a.counter.metric')

An example of timing a chunk of code with the simplest interface might look like:

start = datetime.datetime.now()
some_code(1, 2, 3)
end = datetime.datetime.now()
delta = end - start
stats.timing('metric.name', delta.total_seconds() * 1000.0)

But we quickly start to see some cracks in the real world:

  • What if some_code raises an exception and we never execute the last 3 statements? We need to use try/finally to make sure we're always recording the time.
  • What if some_code has a return value we care about? We need to store the return value, record the time, then return the return value.
  • What if both are true??

That's a lot to worry about for every timer! For counters and other metric types, the simple interface is small enough, just one line. For timers, though, it would be helpful to encapsulate all of those concerns so we're not adding several lines of code for everything we want to measure.

Python's Affordances: Decorators and Context Managers

Fortunately, Python supports two patterns that are helpful here: decorators and context managers.

Decorators are a way to wrap a function or method with some extra code that runs every time, around the outside of the function call. For example, I could write a decorator to print the name and arguments of a function call:

from functools import wraps

def argprinter(func):
    @wraps(func)
    def _wrapper(*args, **kwargs):
        print(f"{func.__name__}: {args}, {kwargs}")
        return func(*args, **kwargs)
    return _wrapper
    

And when I define a function, I can apply the decorator with one line:

@argprinter
def add(a: int, b: int, c: int):
    return a + b + c

result = add(1, 2, 3)  # with the regular signature
print(f"result: {result}")

Which will result in:

add: (1, 2, 3), {}
result: 6

Inside the _wrapper function, we can use all the regular control flow tools. We can return a result value from inside a try/finally block to ensure code runs no matter what. We can use conditionals. We can adjust arguments and return values. And it only takes a single line of code to apply to a function: @argprinter.

Context managers are a similar tool for blocks that aren't functions. They let you wrap one chunk of code around another without changing the scope or outcomes of the inner code. They are used in the Python standard library for ensuring that resource clean-up always happens, even if the code raises exceptions. The most common example is making sure a file is closed after opening it:

with open('data.csv') as fp:
    raw_data = fp.read()

When the "managed" block exits, the file is automatically closed.

For timers, that means we might do something like:

with stats.timer('my.metric'):
    result = some_code(1, 2, 3)

Implementing them can be a little more complex than decorators, but the outcome is the same: instead of needing several lines of code and control flow mechanisms, handling error and return value cases can be encapsulated in the context manager.

The Things I Like About Timers

Timing a piece of code the simple way above is possible in every language I've seen. But what does it mean to create an easy, fluent timing instrumentation interface in Ruby, or Go, or C, or...? For output, as a first pass, I print the timing info to standard out (stdout) or standard error (stderr).

The most valuable thing to me is that this is a problem I've seen before. As much as we as programmers try to avoid solving the same problem twice, reimplementing this well-known behavior means I can focus more on the affordances, libraries, and idioms of the language.

Printing the results, especially redirecting the printer to stderr, usually means taking a "hello, world" program a few steps further. How do I format floats and strings? How do I print to non-default streams?

Getting accurate time almost always means diving into a standard library. How do I find things in the docs? How do I access those library items from my code? How do I get simple time values? Are there ways to get monotonic clock values from the system? How is time represented, and how are deltas or durations represented?

Making instrumentation concise and fluent involves a certain amount of metaprogramming and nuance. Beyond the challenge of taking, executing, and handling results and/or errors in an arbitrary piece of code, I can start to understand how the language is used. What feels natural or idiomatic? What are the ways return values and status are communicated? What tools do I have at my disposal?

In Ruby, for example, maybe we'd want a timer to take a block:

Timer.time do
  some_code(1, 2, 3)
end

In Go, maybe generics are the way to go?

type Timer[P,R any] struct {}

func (t *Timer[P, R]) Time(fn func(p P) (R, error)) (R, error) {
	// ...
}

func Double(a int) (int, error) {
	return a * 2, nil
}

t := &Timer[int, int]{}
res, err := t.Time(Double, 6)

(Or, honestly, maybe that's too much work and too restricting and we should just use closures:

p := 1
var r int
err := timer.Time(func() (err error) {
	r = p * 2
	return
})

if we're going to limit parameters and return values, anyway.)

Today I took the opportunity to dig into this in C and understand the underlying syscall infrastructure to implement a simple version of the command line tool time. (E.g. $ time some_prog -a -b.) It doesn't entirely work super well yet—I get "real" wall clock time but not sys or user time—but I've gotten to learn a lot more about what actually happens when you want to execute a subprocess. And just how much higher level code like Python's subprocess module or Ruby's backtick operators help improve the ergonomics.

(As an aside, running into functions that don't return at all on success was definitely new!)

Iteration and Rebuilding

I redo this work from time to time, especially when I'm still new to a language. As I learn more about what's idiomatic, what APIs are more or less pleasant to use, what tools are available, etc, the solution will change. My first iteration of a timer in Ruby ignored a lot of Ruby's uniqueness and design philosophy. When I understood that philosophy better, I was able to make a new attempt and come up with something that felt better to use.

How do you get comfortable in a new environment? Do you have problems like this? Do you like a set of smaller, introductory problems like koans? Dive into real work?