8. Unit Testing: Making sure your bugs don’t come back

Unit testing lets you check that a piece of software behaves correctly, at least for the cases you test. Once a function is written (or even before, in TDD) or a bug is fixed, you should write a test that checks that the function works properly in limit cases, or that the bug does not reappear in the future. There are several levels associated with unit testing; see https://matklad.github.io/2021/05/31/how-to-test.html. For unit testing of numerical libraries, see also https://news.ycombinator.com/item?id=42115161

In this unit we will learn the general philosophy behind it and a couple of tools to implement very basic tests, although the list of testing frameworks is very large. Furthermore, modularization will be very important, so you must have a clear understanding of how to split some given code into headers and source files, and how to compile objects and then link them, preferably through a Makefile.

It is worth mentioning that catching exceptions (try and catch blocks) also helps with handling runtime errors and reacting accordingly (see C++ exceptions and Python exceptions). Also, logging libraries like logging or loguru (Python), or spdlog (C++), let you print useful log messages that help when you are trying to understand what is going on in your program.
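As a minimal sketch of the try/catch idea mentioned above, the following C++ fragment throws on invalid input and reacts to the error instead of crashing. The names safe_divide and describe_division are hypothetical helpers invented for this illustration, and a real program would probably send the message to a logging library rather than to std::cerr.

```cpp
#include <iostream>
#include <stdexcept>
#include <string>

// Hypothetical helper (not part of any library): rejects invalid input by throwing.
double safe_divide(double a, double b)
{
    if (b == 0.0) {
        throw std::invalid_argument("safe_divide: division by zero");
    }
    return a / b;
}

// Calls safe_divide and turns a failure into a message instead of a crash.
std::string describe_division(double a, double b)
{
    try {
        return "result = " + std::to_string(safe_divide(a, b));
    } catch (const std::invalid_argument& e) {
        // React to the runtime error: report it and return a fallback value.
        std::cerr << "[error] " << e.what() << '\n';
        return std::string("error: ") + e.what();
    }
}
```

Calling describe_division(1.0, 0.0) from main returns the error string instead of terminating the program.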

But how do you write and run tests? A test:

is a small piece of code that compares the actual vs. expected behaviour of your software.
is code that you have to write yourself.
should run as automatically as possible.
should run fast.
should fail fast.
should report its status.
should track previous results and performance.
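To make the list above concrete, here is a hand-rolled sketch of a test without any framework. The function square and the helpers check_equal, run_all_checks and run_all_checks_fail_fast are made up for this illustration; a real framework does the same job with much better reporting.

```cpp
#include <cassert>
#include <iostream>
#include <string>

// Function under test (hypothetical, for illustration only).
int square(int x) { return x * x; }

// Compares actual vs. expected, reports the status, returns true on success.
bool check_equal(const std::string& name, int actual, int expected)
{
    const bool ok = (actual == expected);
    std::cout << (ok ? "[PASS] " : "[FAIL] ") << name
              << ": expected " << expected << ", got " << actual << '\n';
    return ok;
}

// Runs all checks and returns the number of failures, so a caller such as
// main() can turn it into a non-zero exit status.
int run_all_checks()
{
    int failures = 0;
    if (!check_equal("square(0)", square(0), 0)) ++failures;
    if (!check_equal("square(3)", square(3), 9)) ++failures;
    if (!check_equal("square(-4)", square(-4), 16)) ++failures;
    return failures;
}

// Fail-fast variant: assert aborts the program at the first failed check.
void run_all_checks_fail_fast()
{
    assert(square(0) == 0);
    assert(square(3) == 9);
    assert(square(-4) == 16);
}
```

A main function could simply return run_all_checks(), so that scripts and CI tools see a non-zero exit code when something fails.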

For this, it is much better to use a test framework, which will run all tests and report back result statistics. It can also be integrated with revision control systems and with automation and deployment tasks. For C++, one can use Google Test, Catch2, Boost.Test, CppUnit, …, and for Python: pytest, unittest, doctest, and so on.

If possible, try to test all your code or lines of code (100% code coverage), but do not obsess over it. You can measure coverage with tools like gcov, llvm-cov and so on.

Another tip: you can run your tests automatically when doing a commit, so that the commit is rejected if any test fails. That keeps your code clean and forces you to commit only passing code. Of course, your tests must be fast. To configure this, check git pre-commit hooks:

https://git-scm.com/book/en/v2/Customizing-Git-Git-Hooks
https://pre-commit.com/

8.1. Types of tests

A unit test should be small, concrete and precise, and should run independently of other tests. Actually, you can test at several levels:

Unit test: Tests if a software unit (like a function) works as expected.
Stress test: Checks if the software behaves as expected with large/challenging inputs or environments. Can use fuzz testing.
Integration test: Tests if several software parts work together correctly, even if they all pass unit testing.
System test: Interaction with larger software, even the OS.
Regression test: Checks if the software behaves as it did before.
Performance test: A kind of regression test, ensuring that performance is not affected by new changes.
UI/UX tests
Boundary test: Testing edge cases or very large numbers.
Error handling test: Making sure we are catching and processing all errors.
…

Sometimes it is necessary to emulate complex datatypes or dependencies through mocking/stubbing. Make sure your framework supports that.
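The idea of stubbing can be sketched by hand, without any framework support. In the fragment below, all names (TemperatureSensor, StubSensor, is_overheating) are hypothetical and only illustrate the pattern: the code under test depends on an interface, and the test replaces the real dependency with a stub that returns a canned value.

```cpp
#include <cassert>

// Interface for a dependency that is slow or hard to control in a test
// (for example a hardware sensor). All names here are hypothetical.
class TemperatureSensor {
public:
    virtual ~TemperatureSensor() = default;
    virtual double read_celsius() = 0;
};

// Code under test: it depends only on the interface.
bool is_overheating(TemperatureSensor& sensor, double limit_celsius)
{
    return sensor.read_celsius() > limit_celsius;
}

// Stub: returns a canned value and counts how often it was called.
class StubSensor : public TemperatureSensor {
public:
    explicit StubSensor(double value) : value_(value) {}
    double read_celsius() override { ++calls_; return value_; }
    int calls() const { return calls_; }
private:
    double value_;
    int calls_ = 0;
};

// A test that needs no real hardware.
void test_is_overheating()
{
    StubSensor hot(95.0);
    StubSensor cool(20.0);
    assert(is_overheating(hot, 80.0));
    assert(!is_overheating(cool, 80.0));
    assert(hot.calls() == 1);  // the dependency was queried exactly once
}
```

Mocking libraries automate the bookkeeping done here by the calls_ counter, but the design choice is the same: pass dependencies through an interface so that tests can replace them.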

8.2. Testing with Catch2

Our goal here is to learn to use Catch2 to test a very simple function taken from its tutorial. Later we will modularize the code, to practice that, and write a useful Makefile.

8.2.1. Installing catch2

If you don’t have it installed, you can install it from source, or you can use spack:

spack install catch2

If you are using a global spack install, check first whether catch2 is already installed.

Then load it as usual

spack load catch2

8.3. Tutorial example: factorial

Here we will follow the tutorial, testing a factorial function implementation. To do so, we first need to modularize our code:

Header file with declarations factorial.h

#pragma once
int factorial(int n);

Source file with implementations factorial.cpp

#include "factorial.h"

int factorial(int number)
{
    return number <= 1 ? number : factorial(number-1)*number;
}

And a first main file to use the function: main_factorial.cpp

#include <iostream>
#include "factorial.h"

int main(void)
{
    std::cout << factorial(4) << std::endl;
    return 0;
}

With these three files, we have the basic utilities to use the factorial function. To compile, we must run something like

g++ -c factorial.cpp
g++ -c main_factorial.cpp
g++ factorial.o main_factorial.o -o main_factorial.x

and then run as

./main_factorial.x

The compilation can be automated with a Makefile like the following (complete it)

all: main_factorial.x

# TODO

8.4. Exercise

Now, please modify the main file to compute the factorial for some cases of your choice. Is it working correctly? Try 0, -1, 2, large numbers, and so on. You will find some bugs. Now we need to both fix the function AND create test cases for those bugs, to check that they stay fixed.

8.5. Including a test using catch2

This is the example file extracted from the Catch2 tutorial. The following would be the test_factorial.cpp file:

// main() is provided by linking with Catch2Main (Catch2 v3), so CATCH_CONFIG_MAIN is not needed
#include "catch2/catch_test_macros.hpp"

#include "factorial.h"

TEST_CASE( "Factorials are computed", "[factorial]" ) {
    //REQUIRE( factorial(0) == 1 );
    REQUIRE( factorial(1) == 1 );
    REQUIRE( factorial(2) == 2 );
    REQUIRE( factorial(3) == 6 );
    REQUIRE( factorial(10) == 3628800 );
}

To compile, you also need to link against the Catch2 libraries (if you are using spack, do not forget to load Catch2 first with spack load catch2)

g++ -c test_factorial.cpp
g++ -c factorial.cpp
g++ test_factorial.o factorial.o -o test_factorial.x -l Catch2Main -l Catch2
./test_factorial.x

The last two flags, -l Catch2Main -l Catch2, link the program with the Catch2 implementation. After running, if all tests pass, you will get something like

Randomness seeded to: 2222863459
===============================================================================
All tests passed (4 assertions in 1 test case)

Modify your Makefile accordingly (add a test target)

all: main_factorial.x

# TODO

clean:
	rm -f *.o *.x

After running make test you will get

g++ -c test_factorial.cpp
g++ -c factorial.cpp
g++ test_factorial.o factorial.o -o test_factorial.x -l Catch2Main -l Catch2
./test_factorial.x
Randomness seeded to: 3097615407
===============================================================================
All tests passed (4 assertions in 1 test case)

8.5.1. Exercise

Please uncomment the commented line in the test, analyze and fix the factorial function. Also, implement more tests, for large numbers, negative numbers, and so on.
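After you have tried the exercise yourself, you may want to compare with one possible fix, sketched below in plain C++. This is an assumption about how the bugs could be handled, not the Catch2 tutorial's solution: the name checked_factorial and the decision to throw on negative or overflowing input are choices made only for this sketch. With a 32-bit int, 12! is the largest factorial that fits.

```cpp
#include <cassert>
#include <exception>
#include <limits>
#include <stdexcept>

// One possible fix (an assumption, not the tutorial's solution):
// 0! == 1, negative input is rejected, and overflow is detected.
int checked_factorial(int n)
{
    if (n < 0) {
        throw std::invalid_argument("factorial: negative argument");
    }
    int result = 1;
    for (int i = 2; i <= n; ++i) {
        // result * i would overflow exactly when result > max / i.
        if (result > std::numeric_limits<int>::max() / i) {
            throw std::overflow_error("factorial: result does not fit in int");
        }
        result *= i;
    }
    return result;
}

// Returns true if calling checked_factorial(n) throws any std::exception.
bool throws_for(int n)
{
    try {
        checked_factorial(n);
    } catch (const std::exception&) {
        return true;
    }
    return false;
}

// Regression checks for the bugs found in the exercise.
void test_checked_factorial()
{
    assert(checked_factorial(0) == 1);
    assert(checked_factorial(1) == 1);
    assert(checked_factorial(10) == 3628800);
    assert(throws_for(-1));
}
```

In a Catch2 test file, the same checks could be written with REQUIRE and REQUIRE_THROWS_AS instead of assert and throws_for.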

8.5.2. Tips for getting more info about tests

Catch2 also adds several CLI options; list them with

./test_factorial.x --help

If you have several test cases, you can call them by name or tag (see the previous example), like

./test_factorial.x "Factorials are computed"
Filters: "Factorials are computed"
Randomness seeded to: 133418804
===============================================================================
All tests passed (4 assertions in 1 test case)

8.5.3. Tip when having problems finding the library

Sometimes, for non standard installations, it is useful to configure the paths to find both the includes and libs, and this can be done with the pkg-config utility. For example, to get the include path one can use

$ pkg-config --cflags catch2
-I/usr/local/include # This result can change if catch2 is installed on other systems

or, for the libs path

$ pkg-config --libs-only-L catch2
-L/usr/local/lib  # This result can change if catch2 is installed on other systems

So, the compilation line could be

g++ $(pkg-config --cflags catch2) $(pkg-config --libs-only-L catch2) test_factorial.o factorial.o -o test_factorial.x -l Catch2Main -l Catch2

Again, it is better to include this in a Makefile:

SHELL:=/bin/bash

# ...

test_factorial.x: test_factorial.o factorial.o
	g++ $$(pkg-config --cflags catch2) $$(pkg-config --libs-only-L catch2) $^ -o $@ -l Catch2Main -l Catch2

# ...
clean:
	rm -f *.o *.x

NOTE: if you are using spack, you might need to add more code to your target commands

%.x: %.o factorial.o
	source $$HOME/repos/spack/share/spack/setup-env.sh; \
	spack load catch2; \
	g++ $$(pkg-config --cflags catch2) $$(pkg-config --libs-only-L catch2) $^ -o $@ -l Catch2Main -l Catch2

8.5.4. Exercise

Look into property-based and random testing. Imagine a function like

int doubleIt(int x) {
    return 2 * x;
}

and a test like the following (written with GoogleTest macros)

TEST(RandomTests, DoubleIsEven) {
    for (int i = 0; i < 100; ++i) {
        int x = rand() % 1000;
        EXPECT_EQ(doubleIt(x) % 2, 0);
    }
}

Here you are checking a property: that all the returned numbers are even. You are also using random numbers as test inputs. Create a full example to run this. Also, instead of restricting yourself to positive numbers, check what happens with random numbers in the full range. For any bug found, create a specific test case, fix the bug, and run everything again.
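As a hint for drawing random inputs from the full int range, the following sketch uses the standard <random> header with a fixed seed, so that any failure can be reproduced. The function double_it_wide is a hypothetical variant of doubleIt that widens to 64 bits (assuming the usual 32-bit int, where 2 * x can overflow), and count_property_violations is a helper invented for this example. It is one possible shape for such a test, not a definitive one.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>
#include <random>

// Variant of doubleIt that widens to 64 bits so that 2 * x cannot overflow.
// This is one possible fix, not necessarily the intended one.
std::int64_t double_it_wide(int x)
{
    return 2 * static_cast<std::int64_t>(x);
}

// Checks two properties ("result is even" and "result / 2 gives x back") for
// `trials` random inputs from the full int range; returns the violation count.
int count_property_violations(unsigned seed, int trials)
{
    std::mt19937 gen(seed);  // fixed seed => reproducible failures
    std::uniform_int_distribution<int> dist(std::numeric_limits<int>::min(),
                                            std::numeric_limits<int>::max());
    int violations = 0;
    for (int i = 0; i < trials; ++i) {
        const int x = dist(gen);
        const std::int64_t y = double_it_wide(x);
        if (y % 2 != 0 || y / 2 != x) ++violations;
    }
    return violations;
}

// Property test: no violations are expected for the widened version.
void test_double_is_even()
{
    assert(count_property_violations(12345u, 1000) == 0);
}
```

Printing the seed when a property fails is a common habit, because it lets you turn the failing random input into a fixed regression test.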

Note: for more elaborated testing you can check

8.5.5. Exercise

Check the documentation for more options, like SECTIONs, tags, data generators, signature-based tests, …

8.6. Test coverage

Test coverage refers to the amount of code exercised by your tests. It is good to aim for 100% test coverage, but how much you need depends on your project context.

To check test coverage, you can use gcov or llvm-cov, among others. Let’s see an example for the first one.

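First, you will need to compile your application with coverage instrumentation (--coverage is shorthand for -fprofile-arcs -ftest-coverage when compiling and for linking against gcov):

g++ -g --coverage -o mytest mycode.cpp mytest.cpp -lgtest -pthread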

Then you run your tests

./mytest

and, finally, you produce a coverage report

gcov mycode.cpp

This will generate a number of .gcno, .gcda and .gcov files with the coverage data and reports. Focus only on the ones you are interested in.
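To see what a coverage gap looks like, consider the small sketch below. The function sign and both test functions are hypothetical. If only test_sign_partial is run, the line returning -1 is never executed, and the annotated .gcov file marks such lines with #####.

```cpp
#include <cassert>

// Three branches: the tests reach full line coverage only if all of them run.
int sign(int x)
{
    if (x > 0) {
        return 1;
    }
    if (x < 0) {
        return -1;  // not executed by test_sign_partial below
    }
    return 0;
}

// Exercises only two of the three branches, leaving a coverage gap.
void test_sign_partial()
{
    assert(sign(5) == 1);
    assert(sign(0) == 0);
}

// Adding the missing case closes the gap.
void test_sign_negative()
{
    assert(sign(-7) == -1);
}
```

Note that 100% line coverage only says that every line ran at least once; it does not prove that the checked values are the right ones.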

8.6.1. Generating an HTML report

To do so, you can use either gcovr or lcov. Let’s show examples with both.

8.6.1.1. Reporting with `gcovr`

See: https://gcovr.com/en/stable/

First, install it as

uv pip install gcovr

After you have run your instrumented tests (gcovr runs gcov for you), create the HTML report as

gcovr --html-details coverage.html

And then open the html file.

8.6.1.2. Reporting with `lcov`

Optionally you can also have an html report using lcov. After installing it, you can run

lcov --capture --directory . --output-file coverage.info
genhtml coverage.info --output-directory coverage
firefox coverage/index.html

Note: genhtml needs the Date::Parse module from Perl. Install it as yes | perl -MCPAN -e 'install Date::Parse'

8.6.2. Exercise

Run a test coverage pass on the factorial example. Are you getting 100% coverage? Add a new function, with no test, and check the report. Update your Makefile to generate a test coverage report. Create an HTML report.

CXXFLAGS = -fprofile-arcs -ftest-coverage

all: main_factorial.x

%.x: %.o factorial.o
	g++ $(CXXFLAGS) $^ -o $@

# TODO

clean:
	rm -f *.o *.x *.gcov *.gcno *.gcda *~ a.out

8.7. Continuous integration

Automated testing can also happen remotely, by running a Continuous Integration (CI) pipeline. Tools like GitHub Actions or GitLab CI/CD can be used for this purpose; they run your jobs in fresh virtual machines or containers. In the following, an example using a GitHub CI pipeline is shown.

To start, you will need to create a file inside your repo called .github/workflows/cpp-ci.yml. The directory .github/workflows/ is where you put all the workflows you expect to run when you push or perform some other action (for example, check the workflows in this book's repo). Inside the file, you can put something like

name: C++ CI

on: [push, pull_request]

jobs:
  build:
    runs-on: ubuntu-latest

    steps:
    - name: Checkout code
      uses: actions/checkout@v3

    - name: Install dependencies (g++, make, Catch2)
      run: |
        sudo apt-get update
        sudo apt-get install -y g++ make catch2

    - name: Build with Makefile
      run: make test_factorial.x

    - name: Run tests
      run: ./test_factorial.x

This will run the workflow every time there is a push or pull_request. What it actually does is:

Spins up a clean, new virtual machine (ubuntu-latest).
Checks out the repo in that virtual machine.
Installs the dependencies (this depends on the OS; on Ubuntu it uses apt).
Builds the test target (test_factorial.x).
Runs the test executable.

You will get a report in the actions tab. If everything is ok, you will see a nice green checkmark in your repo.

To run the workflow steps inside a particular directory of the repository, you can adapt the workflow as

...
jobs:
  build:
    runs-on: ubuntu-latest

    defaults:
      run:
        working-directory: 09-testing

...

8.7.1. Exercise

Add continuous integration to your repo and check it is running in the remote repo.

8.8. [OPT] Google test example

Google Test is a well-known and advanced unit testing framework that goes well beyond what is shown here. You are invited to follow the docs to learn more.

8.8.1. Installation

Again, we will use spack

spack install googletest
mkdir googletest

8.8.2. Example

This is an example, already modularized.

Factorial and isprime header:

#ifndef GTEST_SAMPLES_SAMPLE1_H_
#define GTEST_SAMPLES_SAMPLE1_H_

// Returns n! (the factorial of n).  For negative n, n! is defined to be 1.
int Factorial(int n);

// Returns true if and only if n is a prime number.
bool IsPrime(int n);

#endif  // GTEST_SAMPLES_SAMPLE1_H_

Source file

#include "factorial.h"

// Returns n! (the factorial of n).  For negative n, n! is defined to be 1.
int Factorial(int n) {
    int result = 1;
    for (int i = 1; i <= n; i++) {
        result *= i;
    }

    return result;
}

// Returns true if and only if n is a prime number.
bool IsPrime(int n) {
    // Trivial case 1: small numbers
    if (n <= 1) return false;

    // Trivial case 2: even numbers
    if (n % 2 == 0) return n == 2;

    // Now, we have that n is odd and n >= 3.

    // Try to divide n by every odd number i, starting from 3
    for (int i = 3; ; i += 2) {
        // We only have to try i up to the square root of n
        if (i > n/i) break;

        // Now, we have i <= n/i < n.
        // If n is divisible by i, n is not prime.
        if (n % i == 0) return false;
    }

    // n has no integer factor in the range (1, n), and thus is prime.
    return true;
}

Test source file (to be compiled as an object)

#include <limits.h>
#include "factorial.h"
#include "gtest/gtest.h"
namespace {
    // Tests factorial of negative numbers.
    TEST(FactorialTest, Negative) {
        // This test is named "Negative", and belongs to the "FactorialTest"
        // test case.
        EXPECT_EQ(1, Factorial(-5));
        EXPECT_EQ(1, Factorial(-1));
        EXPECT_GT(Factorial(-10), 0);
    }
    // Tests factorial of 0.
    TEST(FactorialTest, Zero) {
        EXPECT_EQ(1, Factorial(0));
    }

    // Tests factorial of positive numbers.
    TEST(FactorialTest, Positive) {
        EXPECT_EQ(1, Factorial(1));
        EXPECT_EQ(2, Factorial(2));
        EXPECT_EQ(6, Factorial(3));
        EXPECT_EQ(40320, Factorial(8));
    }

    // Tests negative input.
    TEST(IsPrimeTest, Negative) {
        // This test belongs to the IsPrimeTest test case.

        EXPECT_FALSE(IsPrime(-1));
        EXPECT_FALSE(IsPrime(-2));
        EXPECT_FALSE(IsPrime(INT_MIN));
    }

    // Tests some trivial cases.
    TEST(IsPrimeTest, Trivial) {
        EXPECT_FALSE(IsPrime(0));
        EXPECT_FALSE(IsPrime(1));
        EXPECT_TRUE(IsPrime(2));
        EXPECT_TRUE(IsPrime(3));
    }

    // Tests positive input.
    TEST(IsPrimeTest, Positive) {
        EXPECT_FALSE(IsPrime(4));
        EXPECT_TRUE(IsPrime(5));
        EXPECT_FALSE(IsPrime(6));
        EXPECT_TRUE(IsPrime(23));
    }
}

Main google test file

#include <cstdio>
#include "gtest/gtest.h"

GTEST_API_ int main(int argc, char **argv) {
    printf("Running main() from %s\n", __FILE__);
    testing::InitGoogleTest(&argc, argv);
    return RUN_ALL_TESTS();
}

8.9. Python Unit Testing with `pytest`

While C++ has great tools like Catch2 or GTest for unit testing, Python also has a rich ecosystem for writing and running tests. The most commonly used testing framework in Python is pytest.

Other popular tools include:

unittest: Built-in module inspired by JUnit
doctest: Test code inside docstrings
hypothesis: Property-based testing (like fuzzing)

8.9.1. Getting Started with `pytest`

Install pytest if it’s not already installed:

uv pip install pytest

Create a file math_utils.py

def add(a, b):
    return a + b

def divide(a, b):
    if b == 0:
        raise ValueError("Division by zero!")
    return a / b

Now create a test file test_math_utils.py:

from math_utils import add, divide
import pytest

def test_add():
    assert add(2, 3) == 5
    assert add(-1, 1) == 0

def test_divide():
    assert divide(10, 2) == 5
    with pytest.raises(ValueError):
        divide(10, 0)

To run all tests:

pytest

You’ll see output like:

=========================== test session starts ============================
collected 2 items

test_math_utils.py ..                                             [100%]

============================ 2 passed in 0.01s =============================

8.9.2. Bonus: Randomized Testing with `pytest`

You can also try simple randomized tests using Python’s random module:

import random

from math_utils import add

def test_add_random():
    for _ in range(10):
        a = random.randint(-100, 100)
        b = random.randint(-100, 100)
        assert add(a, b) == a + b

For more powerful property-based testing, look into hypothesis:

from hypothesis import given
import hypothesis.strategies as st

from math_utils import add

@given(st.integers(), st.integers())
def test_add_hypothesis(a, b):
    assert add(a, b) == a + b

You can also test for coverage with pytest-cov or benchmarking with pytest-benchmark.