I was chatting with a friend recently about a potential refactor they were considering that would've reduced overall lines of code. I realized we were talking about a very concrete opportunity to apply dependency inversion. I typed up a quick note and decided to expand it here. These examples are in Python.
Here is the starting point:

    class ThingClient:
        def __init__(self):
            self.connection = establish_conn(get_auth())

This is OK, but it's not great. I understand the temptation:
- It seems like it encapsulates some details.
- It appears to minimize the number of arguments to initialize the class.

Those are good, right?
This is the "typical" order of dependencies:
`ThingClient` needs a connection and that connection needs authentication details, so `ThingClient` depends on code that creates those. One challenge here is testing: in order to test `ThingClient`, we'd need to know to monkey patch `establish_conn()`, which is something we can do in Python but not in all languages. Another challenge might arise if we needed to connect to two `Thing`s that had different AuthN details.
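To make the testing problem concrete, here is a minimal sketch of what such a test has to do. The bodies of `get_auth()` and `establish_conn()` are hypothetical stand-ins (the real ones would read credentials and open a connection), and `build_client_for_test` is a name invented for this illustration.

```python
from unittest import mock


# Hypothetical stand-ins for the helpers the class calls.
def get_auth():
    raise RuntimeError("would read real credentials")


def establish_conn(auth):
    raise RuntimeError("would open a real network connection")


class ThingClient:
    def __init__(self):
        self.connection = establish_conn(get_auth())


def build_client_for_test(fake_conn):
    # The test has to know which module-level names ThingClient touches.
    # In a real project this would more likely be
    # mock.patch("some_module.establish_conn"); patching globals() just
    # keeps this sketch in a single file.
    replacements = {
        "get_auth": lambda: "fake-auth",
        "establish_conn": lambda auth: fake_conn,
    }
    with mock.patch.dict(globals(), replacements):
        return ThingClient()


fake = object()
client = build_client_for_test(fake)
assert client.connection is fake
```

The test only works because it reaches into the module and swaps out names that `ThingClient` never mentions in its signature.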
This is slightly better, because you can provide alternate credentials, possibly for testing:
    class ThingClient:
        def __init__(self, auth):
            self.connection = establish_conn(auth)
This is where my friend found himself, with lots of instantiations that look like this:
    tc_auth = get_auth()
    tc = ThingClient(tc_auth)
And from here, the desire to remove some duplication seems totally logical. Maybe we could even update `establish_conn()` so that when `auth` is `None` it does some kind of fake or in-memory connection for testing without mocking. But that's even more code!
But an even better question is: why should `ThingClient` need to be aware of `auth` at all? What if we reversed the way we thought about these dependencies?
    class ThingClient:
        def __init__(self, conn):
            self.connection = conn
For one thing, now if we want to write tests specifically for `ThingClient`, we don't need to monkey patch anything. We can create a mock or fake connection object that provides known responses, or known errors.
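As a rough sketch of what that could look like: `FakeConnection` and the `fetch()` method below are hypothetical, added only so there is some behavior to exercise. The post's `ThingClient` doesn't define any methods beyond the constructor.

```python
class ThingClient:
    def __init__(self, conn):
        self.connection = conn

    # Hypothetical method, not from the original class.
    def fetch(self, key):
        return self.connection.get(key)


class FakeConnection:
    """In-memory stand-in that returns known responses or a known error."""

    def __init__(self, data=None, error=None):
        self.data = data or {}
        self.error = error

    def get(self, key):
        if self.error is not None:
            raise self.error
        return self.data[key]


# Known response: no patching, just pass the fake in.
ok_client = ThingClient(FakeConnection(data={b"color": b"blue"}))
assert ok_client.fetch(b"color") == b"blue"

# Known error: the fake raises whatever we configured it with.
broken_client = ThingClient(FakeConnection(error=ConnectionError("down")))
try:
    broken_client.fetch(b"color")
except ConnectionError as exc:
    failure = str(exc)
else:
    failure = None
assert failure == "down"
```

Nothing here depends on Python's ability to rewrite module attributes at runtime, so the same test shape carries over to languages without monkey patching.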
Another benefit is that we make the connection much more configurable: not only can we change the AuthN details, but we can change the service address, or even potentially what service we're using at all, and we can do it without adding any arguments to the class constructor! If we're sticking with a limited subset of, say, a SQL connection API, or the memcached and Redis APIs, we might be able to eliminate whole classes:
    import typing

    # By restricting the API we use...
    class CacheProto(typing.Protocol):
        def get(self, key: bytes) -> bytes: ...

    class ThingClient:
        def __init__(self, conn: CacheProto):
            self.connection = conn

    # ...we can replace...
    class RedisThingClient(BaseThingClient):
        # ...

    class MemcachedThingClient(BaseThingClient):
        # ...

    # ...with...
    redis_tc = ThingClient(redis.Connection())
    memcached_tc = ThingClient(memcached.Connection())
And this has the potential to eliminate substantially more, and more complex, code.
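Here is a small runnable sketch of that idea, under some assumptions: `FakeRedis` and `FakeMemcached` are hypothetical in-memory stand-ins rather than real client libraries, `lookup()` is an invented method, and `runtime_checkable` is used only so the structural match can be checked with `isinstance`.

```python
import typing


@typing.runtime_checkable
class CacheProto(typing.Protocol):
    def get(self, key: bytes) -> bytes: ...


class ThingClient:
    def __init__(self, conn: CacheProto):
        self.connection = conn

    # Hypothetical method so the sketch has something to call.
    def lookup(self, key: bytes) -> bytes:
        return self.connection.get(key)


# Hypothetical backends standing in for real Redis and memcached
# connections. Neither one inherits from CacheProto.
class FakeRedis:
    def __init__(self):
        self.store = {b"greeting": b"hello from redis"}

    def get(self, key: bytes) -> bytes:
        return self.store[key]


class FakeMemcached:
    def __init__(self):
        self.store = {b"greeting": b"hello from memcached"}

    def get(self, key: bytes) -> bytes:
        return self.store[key]


# One client class, two backends.
redis_tc = ThingClient(FakeRedis())
memcached_tc = ThingClient(FakeMemcached())

assert isinstance(FakeRedis(), CacheProto)
assert redis_tc.lookup(b"greeting") == b"hello from redis"
assert memcached_tc.lookup(b"greeting") == b"hello from memcached"
```

Because the protocol is structural, any object with a matching `get()` qualifies, which is what lets a single `ThingClient` stand in for the per-backend subclasses.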