Unit tests should be lightweight, quick, isolated, and easy to execute without further setup. This approach to test software works fine for individual classes and functions, but can get challenging when external dependencies are involved – such as databases.
It is not desirable to spin up a real database during unit testing, because this wouldn’t meet the aforementioned best practices and demands. However, the code in the data access layer has to be tested. To solve this problem, external dependencies such as the database are often replaced by test doubles that mimic that dependency: Stubs or mocks.
There might be exceptions – but in most cases, depending on how you implement it, this either gives a false sense of security or you end up building a database yourself.
What a Mock Does
Quite frequently, when people talk about a mock, what they actually mean is a stub. Both are some kind of test double, and the difference between them is important in this context.
A stub is initialized with predefined test data and answers incoming calls from the system under test with that test data. Merely providing data is its only purpose, and the stub has no impact on the outcome of the test.
A mock, on the other hand, is more complex than a stub. It registers all calls received from the system under test and examines those calls. The mock verifies and checks the interaction with the system under test, and this verification will affect the outcome of the test.
Deliberately using a stub is always fine. There might be a higher-level component that calls the data access layer through a repository abstraction, and especially in cases where it only reads from such a repository, the repository can be stubbed for testing.
When testing the data access layer itself and how it interacts with the database, a data-providing stub isn’t sufficient as there most likely are writing operations and complex queries. An actual mock would be needed to test the interaction with the database.
The Costs of a Mock
Such a mock doesn’t come for free, and the costs are often paid way too carelessly.
Most mocking frameworks introduce a high coupling between the unit tests and the implementation of the system under test. They let the developer set very clear expectations on how often and in which order the mock’s methods should be invoked, leading to verification of the implementation rather than the behavior. Changing the implementation of the tested class or function will now lead to failing unit tests, even though the expected result remains correct.
Also, the system under test shouldn’t notice whether it makes its calls against a test double or the real dependency. This requires an interface that hides the implementation from the system under test, introducing an additional layer of abstraction and more code to reason about.
Your Own Database
The disparity between a mock and the dependency it mimics has to be as small as possible – anything else would defeat the purpose of the mock and skew the outcome of the test. The larger the disparity, the higher the risk that an interaction passes the test but fails at runtime.
When implementing a database mock, keeping this disparity small is particularly hard. Proper verification of the interaction with the client would entail parsing SQL, modeling transactions, and acquiring locks. In fact, if you were to remove any doubt that the code in the data access layer will work with the real database using a unit test, you would need to implement your own database.
But implementing your own database that behaves like the real one isn’t a viable solution. You need to take shortcuts. Therein lies the next problem: With every shortcut taken, the significance and trustworthiness of the test decrease. It won’t offer much confidence, but the mentioned costs of the mock still have to be paid.
A common approach to circumvent this problem is to use an in-memory database like Oracle’s H2 for testing. H2 certainly comes closer to the production database than a self-built mock would, but SQL is practically not standardized and there won’t be a guarantee of feature parity between H2 and the production database. This means that database-specific features can’t be used, preventing proper optimization, efficiency, and maintainability. The even worse alternative is not to test code that utilizes vendor-specific features at all.
Both a self-built mock and an in-memory database don’t give guarantees or allow for conclusions as to whether or not the data access code will work against the production database. They merely increase the test coverage and give a false sense of security.
A Better Fit
Testing the data access layer during integration testing rather than unit testing offers much more confidence, accuracy, and trustworthiness. A real instance of the same database as used in production can be launched within an integration test, which significantly minimizes the differences between the testing and production environments.
It used to be too brittle and costly to spin up an actual database instance when running a test phase, but Docker changed this. Nowadays, a fully isolated and ready-to-use database container can be started within a few seconds. Integration testing frameworks such as Testcontainers provide a code API for managing throwaway database instances without a complex setup.
Not only gives an integration test with a real database the biggest confidence possible, but it also removes the challenges imposed by using a mock. There’s only one test class to write, there won’t be any additional abstractions, and refactoring the system under test won’t break the test.
By and large, the test will be focused on the behavior of the system rather than its implementation.