Extract Your Dependencies

Making your code ready to be tested

Global dependencies make it difficult to properly test a piece of code. By extracting all dependencies into a single manageable object, we can easily mock the necessary services and avoid a large-scale refactor.

2016-02-07 · 9 min read · by Lukas Atkinson · Tags: Testing, C++

When testing code, the most common difficulty is … that I can't. I generally try to follow the Arrange, Act, Assert¹ structure:

I can't arrange the test case, since I cannot easily transition the system under test into a known state/set up a test fixture. For example, I might want to test a time-sensitive function at a specific date. How can I simulate that date, short of changing the time of the operating system?
I can't assert that the implementation was correct, since some effects might lie outside of the system under test. If one effect of the system was to trigger an HTTP request with a certain set of parameters, how can I capture that short of intercepting the network request?

These are symptoms of a general problem: most of our code has global dependencies. When we need a library, our first instinct is to just use that library. However, our code is now deeply integrated with that library, and I can't easily swap it out for a test. Some people might argue that dynamic linking is a solution since I can create a mock library and link that instead, but this solution is unsatisfactory because it works at the wrong level of granularity: I want to mock away a single call, not a whole library.

The next idea might be to wrap the library in a custom set of objects. To facilitate unit testing, we can supply a mock implementation. I've seen classes like this:

class Wrapper {
  // Non-const static members need an out-of-class definition:
  //   Mock* Wrapper::m_mock = nullptr;
  static Mock* m_mock;
  Impl m_impl;
public:
  Wrapper(Arg a) : m_impl(a) {}

  void perform() {
    if (m_mock) return m_mock->perform();
    return m_impl.perform();
  }

  static void set_mock(Mock* mock) {
    m_mock = mock;
  }
};

// later, in a test:
Mock mock;  // not `Mock mock();`, which would declare a function
Wrapper::set_mock(&mock);
... // perform test
Wrapper::set_mock(nullptr);

This design allows us to set a global mock object that is used to service all requests if present. While better than nothing, this is still really fragile.

Setting a mock has a global effect for all instances, even those in other parts of the system that should not be affected.
It is difficult to capture instance creation.
I have to sift through all code and all its dependencies to find out what I have to mock this way.
Doing this stuff correctly is hard. Forgot to reset the mock? Here, have a bad day figuring out your mistake.
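To make the first and last of these problems concrete, here is a small sketch. The Mock and Impl types and the unrelated_component function are hypothetical stand-ins, and perform() returns an int (unlike the listing above) purely so that the effect can be observed.

```cpp
#include <cassert>

// Hypothetical stand-ins for the Mock and Impl types used above.
// perform() returns an int here so that the effect is observable.
struct Mock {
  int perform() { return -1; }
};

struct Impl {
  int m_value;
  explicit Impl(int value) : m_value(value) {}
  int perform() { return m_value; }
};

class Wrapper {
  static Mock* m_mock;  // one mock shared by every instance
  Impl m_impl;
public:
  explicit Wrapper(int a) : m_impl(a) {}

  int perform() {
    if (m_mock) return m_mock->perform();
    return m_impl.perform();
  }

  static void set_mock(Mock* mock) { m_mock = mock; }
};

Mock* Wrapper::m_mock = nullptr;

// Code elsewhere in the system that the test never meant to touch.
int unrelated_component() {
  Wrapper wrapper(7);
  return wrapper.perform();
}

void demonstrate_global_effect() {
  assert(unrelated_component() == 7);   // real implementation

  Mock mock;
  Wrapper::set_mock(&mock);
  assert(unrelated_component() == -1);  // silently mocked as well

  // Forgetting this reset would leave a dangling pointer behind
  // once `mock` goes out of scope.
  Wrapper::set_mock(nullptr);
  assert(unrelated_component() == 7);
}
```

The unrelated component never asked for a mock, yet its behaviour changes for as long as the global pointer is set.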

Make your dependencies explicit

There is an extremely simple solution to this kind of problem: make all your dependencies explicit. Each dependency can be expressed as a single function or a bunch of functions. This dependency might be some action, or some factory method. If my system under test relies on getting the current time, we can specify that as:

using CurrentTimeProvider = std::function<Instant()>;
const CurrentTimeProvider default_get_current_time =
  Clock::now;

void system_under_test(
    int usual_argument,
    const CurrentTimeProvider& get_current_time =
      default_get_current_time)
{
  ...
}

This is great: during normal use, nothing really changes. The calling code stays the same. But during tests, I am free to specify all the dependencies. The system_under_test no longer depends on a specific implementation. I have reached dependency inversion by using the simplest dependency injection technique: passing the dependencies as arguments.
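As a rough illustration of what such a test could look like, the following sketch assumes that Clock is std::chrono::system_clock and that Instant is its time_point. The is_expired function is a hypothetical system under test, not something from the original code.

```cpp
#include <cassert>
#include <chrono>
#include <functional>

// Assumptions for this sketch: Clock and Instant map onto std::chrono.
using Clock = std::chrono::system_clock;
using Instant = Clock::time_point;
using CurrentTimeProvider = std::function<Instant()>;

// Hypothetical system under test: has the deadline passed?
bool is_expired(
    Instant deadline,
    const CurrentTimeProvider& get_current_time = Clock::now)
{
  return get_current_time() >= deadline;
}

// A test can pin "now" to any instant it likes.
void test_is_expired() {
  const Instant deadline = Instant{} + std::chrono::hours(24);
  const Instant before = deadline - std::chrono::seconds(1);
  const Instant after = deadline + std::chrono::seconds(1);

  assert(!is_expired(deadline, [=] { return before; }));
  assert(is_expired(deadline, [=] { return after; }));
}
```

Production callers keep writing is_expired(deadline) and get the real clock through the default argument.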

Of course, there are a couple of problems with this direct approach.

For a large number of dependencies, I have a humongous amount of extra arguments just for managing the dependencies. We'll cover that in a minute.
All those std::function objects incur extra runtime cost. That's true, but please consider whether that is really relevant here. Will you get more value from better performance, or from better tests?

In any case, using function objects is not a requirement for this technique. It makes writing tests a lot easier, but you could also go the traditional object-oriented route and define an interface for the dependency. If the dependency is from code you cannot change, you can easily write an adapter for the interface, similar to the above Wrapper but without the mock.

If you cannot even do that because virtual dispatch is “too expensive”, you'll have to find another way to inject dependencies – probably involving compile-time techniques such as conditional compilation. I'm so sorry.
Just because a function happens to implement the signature of our dependency does not mean it is suitable to satisfy it. E.g. for the CurrentTimeProvider above, any function returning an Instant would be accepted by the system, even if that instant was yesterday. If callers can accidentally pass in bogus implementations, we can no longer be certain that our system is correct.
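For the traditional object-oriented route mentioned above, a minimal sketch might look like the following. The TimeSource interface, both implementations, and is_expired are hypothetical names chosen for illustration, and Clock is again assumed to be std::chrono::system_clock.

```cpp
#include <cassert>
#include <chrono>

using Clock = std::chrono::system_clock;
using Instant = Clock::time_point;

// Hypothetical interface describing the dependency.
class TimeSource {
public:
  virtual ~TimeSource() = default;
  virtual Instant now() const = 0;
};

// Adapter around code we cannot change (here: the standard clock).
class SystemTimeSource final : public TimeSource {
public:
  Instant now() const override { return Clock::now(); }
};

// Test double that always reports the same instant.
class FixedTimeSource final : public TimeSource {
  Instant m_instant;
public:
  explicit FixedTimeSource(Instant instant) : m_instant(instant) {}
  Instant now() const override { return m_instant; }
};

// Hypothetical system under test.
bool is_expired(Instant deadline, const TimeSource& time_source) {
  return time_source.now() >= deadline;
}

void test_is_expired_with_interface() {
  const Instant deadline = Instant{} + std::chrono::hours(24);
  const Instant before = deadline - std::chrono::seconds(1);

  assert(!is_expired(deadline, FixedTimeSource(before)));
  assert(is_expired(deadline, FixedTimeSource(deadline)));
}
```

This costs a class per test double where the std::function version needs only a lambda, which is the convenience trade-off described above.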

Thinking in services

Every functionality you use in your code can be interpreted as a service provided to that code. We can codify this concept with a class type. Please don't be intimidated by the templates: they are merely needed to express a service in terms of the function signature it implements.

// Service - capture your dependencies as concrete services
//
// Copyright 2016 Lukas Atkinson
//
// Licensed under the Apache License, Version 2.0 (the
// "License"); you may not use this file except in
// compliance with the License. You may obtain a copy of
// the License at
//
//   http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in
// writing, software distributed under the License is
// distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR
// CONDITIONS OF ANY KIND, either express or implied. See
// the License for the specific language governing
// permissions and limitations under the License.

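#include <functional>  // std::function
#include <stdexcept>   // std::invalid_argument
#include <utility>     // std::move, std::forward

// catch-all case for the Service template
template <class SignatureT>
class Service;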

// A Service represents some dependency in your system. It
// is strongly encouraged that you subclass a service to
// provide better type safety:
//
// <pre>
//  // define a custom service type without additional
//  // semantics.
//  struct HelloWorldService
//    : public Service<void(const std::string& name)>
//  {
//      using Service::Service;
//  };
//
//  // provide a service implementation
//  HelloWorldService hello_world_service
//    = HelloWorldService([](const std::string& name) {
//      std::cout << "Hello, " << name << std::endl;
//    });
//
//  // use the service
//  void say_hello_world(const std::string& name) {
//    return hello_world_service(name);
//  }
// </pre>
template<class ResT, class... ArgsT>
class Service<ResT(ArgsT...)>
{
public:
  // The function signature implemented by this Service
  using Signature = ResT(ArgsT...);
  // The implementation used.
  using FunctionType = std::function<Signature>;
private:
  FunctionType m_function;
public:
  // Create a new service from a function.
  // Must be non-null.
  explicit Service(FunctionType&& fn)
  : m_function(std::move(fn)) {
    if (m_function == nullptr)
      throw std::invalid_argument(
        "Service: The provided function must be non-null");
  }

  // Move-construct a Service.
  Service(Service&& orig) = default;

  virtual ~Service() = default;

  // Move-assign a Service.
  Service& operator= (Service&& orig) {
    m_function = std::move(orig.m_function);
    return *this;
  }

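  // Perform the service. Declared const (like
  // std::function::operator()) so that a service can be
  // invoked through a const reference.
  ResT perform(ArgsT... args) const {
    return m_function(std::forward<ArgsT>(args)...);
  }

  // Perform the service.
  ResT operator() (ArgsT... args) const {
    return perform(std::forward<ArgsT>(args)...);
  }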
};

This is really just a very thin wrapper around std::function, but with a couple of important changes.

It can be easily and correctly subclassed.
The provided function can never be null. A service cannot exist without an implementation. This prevents unexpected failures deep within our system due to some unmet dependency.
There is no implicit construction. The programmer must explicitly declare a function to be compatible with the service. This allows us to implement stronger typing.
We move-construct our services, so they own all resources used by them. This is important for non-copyable objects, e.g. if they use std::unique_ptr internally. Notably, std::function requires its target to be copy-constructible, so it cannot hold move-only functor objects. If you need those, you can easily swap out the std::function in the above code for a better implementation of your choice (C++23 adds std::move_only_function, for example). Which you probably should, but that's a different topic.
For the template nerds: In the perform and operator() implementations, I forward the arguments to the m_function. Normally, we would capture the arguments as they are provided at the call site with template<class... Args> auto perform(Args&&... args). However, we already know the precise argument types from the class template, and these types already include the proper cv-qualifications and references. It is therefore unnecessary to capture the references with Args&& (which given such a template is not an ordinary rvalue reference, but a forwarding reference). However, we still have to properly forward the arguments to the m_function.
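The properties listed above can be checked mechanically. The sketch below uses a condensed copy of the Service template (call operator only, declared const like std::function's) and a hypothetical GreetingService that exists only for this illustration.

```cpp
#include <cassert>
#include <functional>
#include <stdexcept>
#include <string>
#include <type_traits>
#include <utility>

// Condensed copy of the Service template from this article.
template <class SignatureT>
class Service;

template <class ResT, class... ArgsT>
class Service<ResT(ArgsT...)> {
public:
  using FunctionType = std::function<ResT(ArgsT...)>;

  explicit Service(FunctionType&& fn) : m_function(std::move(fn)) {
    if (m_function == nullptr)
      throw std::invalid_argument("Service: function must be non-null");
  }
  Service(Service&&) = default;
  Service& operator=(Service&&) = default;
  virtual ~Service() = default;

  ResT operator()(ArgsT... args) const {
    return m_function(std::forward<ArgsT>(args)...);
  }

private:
  FunctionType m_function;
};

// Hypothetical service type, used only for this illustration.
struct GreetingService : Service<std::string(const std::string&)> {
  using Service::Service;
};

// Explicit construction works, implicit conversion does not.
static_assert(
    std::is_constructible<GreetingService,
                          GreetingService::FunctionType>::value,
    "a service can be constructed explicitly");
static_assert(
    !std::is_convertible<GreetingService::FunctionType,
                         GreetingService>::value,
    "a function is never silently turned into a service");

// Services are move-only.
static_assert(std::is_move_constructible<GreetingService>::value,
              "services can be moved");
static_assert(!std::is_copy_constructible<GreetingService>::value,
              "services cannot be copied");

// An empty std::function is rejected at construction time.
bool rejects_null_function() {
  GreetingService::FunctionType empty;  // holds no target
  try {
    GreetingService service(std::move(empty));
    (void)service;
    return false;
  } catch (const std::invalid_argument&) {
    return true;
  }
}

std::string greet_world() {
  GreetingService greet(
      [](const std::string& name) { return "Hello, " + name; });
  return greet("world");
}
```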

The point of this is that we can define our own service types. Going back to the current time example:

struct CurrentTimeService : Service<Instant()> {
  // inherit the constructor
  using Service::Service;

  // provide default constructor
  CurrentTimeService() : CurrentTimeService(Clock::now) {}
};

void system_under_test(
    int usual_argument,
    const CurrentTimeService& get_current_time = CurrentTimeService()) {
  ...
}

Since the CurrentTimeService is a proper class now and no longer just a typedef, we have increased our static type safety. Also, we can no longer accidentally supply null implementations. When using it to supply a custom implementation, there's another nice effect:

system_under_test(
  arg,
  CurrentTimeService([](){ ... }));

Because explicit conversion is required by the CurrentTimeService constructor, we effectively have named arguments.
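A short sketch of both effects, under a few assumptions: Clock is std::chrono::system_clock, Service is a condensed copy of the template above with a const call operator, and BuildTimeService and is_expired are hypothetical names introduced only for this example.

```cpp
#include <cassert>
#include <chrono>
#include <functional>
#include <stdexcept>
#include <type_traits>
#include <utility>

// Condensed copy of the Service template from this article.
template <class SignatureT>
class Service;

template <class ResT, class... ArgsT>
class Service<ResT(ArgsT...)> {
public:
  using FunctionType = std::function<ResT(ArgsT...)>;

  explicit Service(FunctionType&& fn) : m_function(std::move(fn)) {
    if (m_function == nullptr)
      throw std::invalid_argument("Service: function must be non-null");
  }
  Service(Service&&) = default;
  Service& operator=(Service&&) = default;
  virtual ~Service() = default;

  ResT operator()(ArgsT... args) const {
    return m_function(std::forward<ArgsT>(args)...);
  }

private:
  FunctionType m_function;
};

using Clock = std::chrono::system_clock;
using Instant = Clock::time_point;

struct CurrentTimeService : Service<Instant()> {
  using Service::Service;
  CurrentTimeService() : CurrentTimeService(Clock::now) {}
};

// Hypothetical second service that happens to share the signature.
struct BuildTimeService : Service<Instant()> {
  using Service::Service;
};

// Same signature, but the two service types are not interchangeable.
static_assert(
    !std::is_convertible<BuildTimeService, CurrentTimeService>::value,
    "a BuildTimeService is not a CurrentTimeService");

// Hypothetical system under test.
bool is_expired(
    Instant deadline,
    const CurrentTimeService& get_current_time = CurrentTimeService())
{
  return get_current_time() >= deadline;
}

void test_is_expired_with_service() {
  const Instant deadline = Instant{} + std::chrono::hours(24);
  const Instant before = deadline - std::chrono::seconds(1);

  // The service name at the call site reads like a named argument.
  assert(is_expired(deadline,
                    CurrentTimeService([=] { return deadline; })));
  assert(!is_expired(deadline,
                     CurrentTimeService([=] { return before; })));

  // Passing the bare lambda would not compile, because the
  // constructor is explicit:
  //   is_expired(deadline, [=] { return deadline; });
}
```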

Collect all dependencies

Any unit of code will typically have multiple dependencies. If a function must take multiple optional parameters for its immediate and indirect dependencies, this tends to become unmanageable. However, we can combine these arguments into a single object that describes the dependencies. Assume we needed a simple logging service and an ID generation service. The original code might look like this:

namespace evil_id_generator {
  int id = 0;
  int get_id() { return ++id; }
}

namespace evil_logger {
  void log(const std::string& msg) {
    std::cout << "LOGGER: " << msg << std::endl;
  }
}

void consumer() {
  auto id = evil_id_generator::get_id();
  {
    std::ostringstream log;
    log << "using id=" << id;
    evil_logger::log(log.str());
  }
}

int main(int argc, char** argv) {
  consumer();
  return 0;
}

By extracting the services, we get:

struct GetIdService
: public Service<int()>
{
  using Service::Service;

  GetIdService()
  : GetIdService(evil_id_generator::get_id)
  {}
};

struct LogMessageService
: public Service<void(const std::string& msg)>
{
  using Service::Service;

  LogMessageService()
  : LogMessageService(evil_logger::log)
  {}
};

void consumer(
    GetIdService& get_id_service,
    LogMessageService& log_message_service) {
  auto id = get_id_service();
  {
    std::ostringstream log;
    log << "using id=" << id;
    log_message_service(log.str());
  }
}

int main(int argc, char** argv) {
  GetIdService get_id_service;
  LogMessageService log_message_service;

  if (argc >= 2 && std::string(argv[1]) == "--with-mocks") {
    get_id_service = GetIdService([](){
      return 42;
    });
    log_message_service = LogMessageService([](const std::string&){
      /*ignore*/
    });
  }

  consumer(get_id_service, log_message_service);

  return 0;
}

By combining the services into a single object, we get:

struct Dependencies final {
  struct GetIdService
  : public Service<int()>
  {
    using Service::Service;
  };
  GetIdService get_id_service;
  int get_id() {
    return get_id_service();
  }

  struct LogMessageService
  : public Service<void(const std::string& msg)>
  {
    using Service::Service;
  };
  LogMessageService log_message_service;
  void log_message(const std::string& msg) {
    return log_message_service(msg);
  }

  // the constructor helps us check all dependencies were provided.
  Dependencies(
      GetIdService&& get_id_service_,
      LogMessageService&& log_message_service_)
  : get_id_service(std::move(get_id_service_))
  , log_message_service(std::move(log_message_service_))
  {}

  Dependencies()
  : Dependencies(
      GetIdService(evil_id_generator::get_id),
      LogMessageService(evil_logger::log))
  {}
};

void consumer(Dependencies& deps) {
  auto id = deps.get_id();
  {
    std::ostringstream log;
    log << "using id=" << id;
    deps.log_message(log.str());
  }
}

int main(int argc, char** argv) {
  Dependencies deps;
  if (argc >= 2 && std::string(argv[1]) == "--with-mocks") {
    deps.get_id_service = Dependencies::GetIdService([](){
      return 42;
    });
    deps.log_message_service = Dependencies::LogMessageService(
      [](const std::string&){
        /*ignore*/
      });
  }
  consumer(deps);
  return 0;
}

Here, the Dependencies can be viewed as an object where you can swap out the methods at run time. This allows us to easily provide our own implementations in a test.
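In a test, this could look roughly like the sketch below. It uses condensed copies of Service and Dependencies (no default constructor, since the evil_* globals are left out here), and run_consumer_with_mocks is a hypothetical test helper.

```cpp
#include <cassert>
#include <functional>
#include <sstream>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Condensed copy of the Service template from this article.
template <class SignatureT>
class Service;

template <class ResT, class... ArgsT>
class Service<ResT(ArgsT...)> {
public:
  using FunctionType = std::function<ResT(ArgsT...)>;

  explicit Service(FunctionType&& fn) : m_function(std::move(fn)) {
    if (m_function == nullptr)
      throw std::invalid_argument("Service: function must be non-null");
  }
  Service(Service&&) = default;
  Service& operator=(Service&&) = default;
  virtual ~Service() = default;

  ResT operator()(ArgsT... args) const {
    return m_function(std::forward<ArgsT>(args)...);
  }

private:
  FunctionType m_function;
};

// Condensed copy of the Dependencies bundle.
struct Dependencies final {
  struct GetIdService : Service<int()> {
    using Service::Service;
  };
  struct LogMessageService : Service<void(const std::string&)> {
    using Service::Service;
  };

  GetIdService get_id_service;
  LogMessageService log_message_service;

  Dependencies(GetIdService&& get_id, LogMessageService&& log_message)
  : get_id_service(std::move(get_id))
  , log_message_service(std::move(log_message))
  {}

  int get_id() { return get_id_service(); }
  void log_message(const std::string& msg) { log_message_service(msg); }
};

void consumer(Dependencies& deps) {
  auto id = deps.get_id();
  std::ostringstream log;
  log << "using id=" << id;
  deps.log_message(log.str());
}

// Hypothetical test helper: runs the consumer against mocks and
// returns every message that reached the logging service.
std::vector<std::string> run_consumer_with_mocks() {
  std::vector<std::string> captured;
  Dependencies deps(
      Dependencies::GetIdService([] { return 42; }),
      Dependencies::LogMessageService(
          [&captured](const std::string& msg) {
            captured.push_back(msg);
          }));
  consumer(deps);
  return captured;
}
```

The mock logger doubles as a recorder, so the test can assert on an effect that would otherwise leave the system under test.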

Some people create such an object capturing all dependencies, make it a global singleton, and call it a service locator. I think this is not a terribly good idea, but better than nothing. When passing the bundled Dependencies as an argument, we only have one extra parameter to carry around. Also, we can easily have different dependency bundles for different parts of the system, making it easier to grasp which dependencies will be used by that part of the code.

An important point of the Dependencies implementation is that we not only have a service member for each dependency, but also a trivial wrapper method. In the above example, this is not necessary aside from providing a slightly better user experience. However, this changes if services depend on each other. The wrapper method can then inject the dependencies provided by this container, without the caller having to provide them. E.g. if the ID service depended on the logging service, we could write:

struct Dependencies final {
  // LogMessageService must be declared before this point.
  struct GetIdService
  : public Service<int(LogMessageService&)>
  {
    using Service::Service;
  };
  GetIdService get_id_service;
  int get_id() {
    return get_id_service(log_message_service);
  }

  ...
};

The signature of the GetIdService changes, but everything else stays the same. In particular, we have insulated the users of Dependencies::get_id() from this change, just as if the get_id_service had an implicit static dependency on the log_message_service. This is great.
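A fuller version of this idea might look like the following sketch. It again uses condensed copies of Service and Dependencies, declares LogMessageService first so that GetIdService can name it, and the ID implementation that logs every generated ID is a hypothetical example.

```cpp
#include <cassert>
#include <functional>
#include <sstream>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Condensed copy of the Service template from this article.
template <class SignatureT>
class Service;

template <class ResT, class... ArgsT>
class Service<ResT(ArgsT...)> {
public:
  using FunctionType = std::function<ResT(ArgsT...)>;

  explicit Service(FunctionType&& fn) : m_function(std::move(fn)) {
    if (m_function == nullptr)
      throw std::invalid_argument("Service: function must be non-null");
  }
  Service(Service&&) = default;
  Service& operator=(Service&&) = default;
  virtual ~Service() = default;

  ResT operator()(ArgsT... args) const {
    return m_function(std::forward<ArgsT>(args)...);
  }

private:
  FunctionType m_function;
};

struct Dependencies final {
  struct LogMessageService : Service<void(const std::string&)> {
    using Service::Service;
  };
  // The ID service now receives the logger it depends on.
  struct GetIdService : Service<int(LogMessageService&)> {
    using Service::Service;
  };

  LogMessageService log_message_service;
  GetIdService get_id_service;

  Dependencies(GetIdService&& get_id, LogMessageService&& log_message)
  : log_message_service(std::move(log_message))
  , get_id_service(std::move(get_id))
  {}

  // The wrapper injects the logger, so callers never see it.
  int get_id() { return get_id_service(log_message_service); }
  void log_message(const std::string& msg) { log_message_service(msg); }
};

// Unchanged from the earlier version.
void consumer(Dependencies& deps) {
  auto id = deps.get_id();
  std::ostringstream log;
  log << "using id=" << id;
  deps.log_message(log.str());
}

// Hypothetical test helper with an ID service that logs each ID.
std::vector<std::string> run_consumer_with_logging_ids() {
  std::vector<std::string> captured;
  int next_id = 0;
  Dependencies deps(
      Dependencies::GetIdService(
          [&next_id](Dependencies::LogMessageService& log) {
            ++next_id;
            log("generated id=" + std::to_string(next_id));
            return next_id;
          }),
      Dependencies::LogMessageService(
          [&captured](const std::string& msg) {
            captured.push_back(msg);
          }));
  consumer(deps);
  return captured;
}
```

Note that consumer is byte-for-byte the same as before; only the bundle and the service implementation know about the new dependency.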

Creating a dedicated type describing all dependencies simplifies client code, and makes testing possible.

However, there are a couple of drawbacks to such an approach.

The dependencies can be easily corrupted accidentally. Code with dependencies described in this system will not be correct unless the services are also implemented correctly. Testing this is challenging.

My response to this is that a unit test should only test the value added by the system under test, not the system with all its dependencies.² However, it is true that unit tests are not sufficient, and that the interactions between different parts as they appear in the production configuration will also have to be tested.
This approach ignores const-correctness. I might want to pass a dependency container to a system, without that system being able to change all the dependencies.

I have largely ignored constness in the above code snippets because this is a difficult topic. When a service is performed, it is reasonable to expect that this may change the state of the service. However, this does not change the identity of the service. It would probably be all right making all necessary methods and all references const, but I have yet to come to a final conclusion regarding this problem.
The Dependencies class ends up being a god object. Since every action performed by any part of the system can be understood as a “service”, eventually the whole application consists of incomprehensible interactions between these services.

To some degree, testability and traditional best practices are at odds with one another. There is a middle ground where these two desires balance out, but that point has to be found for each piece of software individually. It makes sense to primarily capture external dependencies as services. Also, we can have multiple dependency objects for different parts of the system, so each object stays fairly small.

Should you use this technique or some part of it? Probably. It's an easy way to get dependency injection without introducing a huge framework. It provides a lot of type safety. It can be introduced without significantly changing the external interface of your current code. It is safe to refactor code to use this pattern. Your code may have different requirements that favour another technique to introduce testability. But in any case, please make sure as early as possible in your development process that testability is not compromised by your design. A completely encapsulated system is worthless if you can't show that this system is also completely correct.

¹ See http://c2.com/cgi/wiki?ArrangeActAssert for a short description.
² I learned about the idea of a value-added test from Stephen Vance's Quality Code book. He writes (ellipsis mine):

At a unit- or isolation-test level, I like to think of the purpose as the value added by the code. Those who live in countries like Mexico or the European Union with value-added tax (VAT) may find this concept clear. A company only pays VAT on the amount of value it adds to a product. Raw materials or component parts have a value when you receive them that is subtracted out of the value of the product you sell for purpose of taxation. Similarly, your code takes the libraries and collaborators on which it is built and adds an additional level of functionality or purpose for its consumers: the value added by that code.

Defining a unit test has caused considerable debate, but the value-added perspective gives us an alternative. I […] prefer to use the value-added concept for an inclusive definition of a unit test.

A unit test is a test that verifies the value added by the code under test. Any use of independently testable collaborators is simply a matter of convenience.

With this definition, use of an untestable method in the same class falls within the value added by the code under test. A testable method called from the code under test should have independent tests and therefore not be part of the verification except to the extent that it adds value to the code under test. Use of other classes can be mocked, stubbed, or otherwise test doubled because they are independently testable. […]

— Stephen Vance: Quality Code: Software Testing Principles, Practices, and Patterns. Addison-Wesley, 2013. pp. 25–26.
