gcovr 4.0

2018-06-17, 8 min, by Lukas Atkinson

After a lot of refactoring and cleanup, gcovr 4.0 is out!

This release represents four months of work. The focus was on internal changes, but there are lots of enhancements as well: better filters, improved Python 3 support, GCC 8 support, better encoding support via --source-encoding, parallel gcov invocations with -j, and HTML customizations with --html-title, --html-medium-threshold, and --html-high-threshold.

Gcovr is a command line tool that uses gcov to produce code coverage reports as text summaries, detailed HTML reports, or Cobertura-compatible XML. It works with the GCC and Clang compilers.

You can pip install gcovr from PyPI, read the overview on GitHub, or read the user guide.

Changes in detail

For the full changelog, see https://github.com/gcovr/gcovr/blob/master/CHANGELOG.rst#40-17-june-2018

Huge internal refactoring

This should make future maintenance and improvements easier. For example, the code is now split across multiple modules. Gcovr should mostly behave the same, but I expect some minor regressions.

As a consequence of this change, gcovr must be installed via pip:

# install from PyPI
pip install gcovr

# install development version from GitHub
pip install git+https://github.com/gcovr/gcovr.git

Previously, it was possible to download the gcovr script and use it directly, but this was limiting for development. Installation via pip has advantages for users and developers. E.g. Windows users now automatically get an executable gcovr command. Some data files like the HTML templates could be extracted into a data directory (they are now Jinja templates!).

Since we're already making breaking changes, it's a good opportunity to sneak in more. For example, we dropped support for Python 2.6. Supporting Python 2.6 is not difficult in any single respect, but you suffer death by a thousand cuts. E.g. string formatting, the subprocess library, and Python 3.x compatibility grow increasingly complicated. The 2.6-compatible library ecosystem is severely limited: to use the standard library argparse module we had to find a 2.6-compatible backport.

Now ideally it would have been possible to get rid of 2.7 as well, but the world isn't ready yet.

As part of the refactoring, the test suite and build system were improved. We now collect code coverage on gcovr itself, and it's not too bad. Automated testing with various Python versions under Windows and Linux provides confidence during development.

Better filters

Filters should now behave nearly identically on Windows and Unix. Importantly, they now use forward slashes even on Windows, like:

gcovr --filter C:/Projects/foo/

Previously, filters like --filter and --exclude were treated as paths and normalized to an absolute path. They would later be re-interpreted as regexes. This worked, but not very well. On Windows the path normalization was too limited to be useful in many scenarios.

Now, gcovr looks at a filter to determine whether it's trying to match a relative or absolute path. Gcovr then matches the filter against a suitable path, without having to rewrite the filter. It will be possible to extend this approach in the future, if necessary.
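To make the idea concrete, here is a minimal sketch of how a filter could be classified and matched. This is not gcovr's actual code: the function names and the rule for deciding what counts as an "absolute" filter are assumptions made for illustration.

```python
import re


def to_forward_slashes(path):
    """Use forward slashes so the same regex works on Windows and Unix."""
    return path.replace("\\", "/")


def is_absolute_filter(pattern):
    """Hypothetical rule: a filter targets absolute paths if it starts
    with '/' or with a drive letter such as 'C:/'."""
    return pattern.startswith("/") or re.match(r"[A-Za-z]:/", pattern) is not None


def filter_matches(pattern, abs_path, root):
    """Match the filter against the absolute path, or against the path
    relative to `root`, depending on what the filter looks like."""
    abs_path = to_forward_slashes(abs_path)
    root = to_forward_slashes(root).rstrip("/") + "/"
    if is_absolute_filter(pattern):
        candidate = abs_path
    elif abs_path.startswith(root):
        candidate = abs_path[len(root):]
    else:
        candidate = abs_path
    # The filter itself is used unchanged as a regex.
    return re.match(pattern, candidate) is not None
```

The point of the design is that the filter is never rewritten. Only the path it is compared against changes.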

Encoding support

Python 2 is encoding-agnostic by default; Python 3 is not. This prevented non-ASCII source files from being processed correctly by gcovr.

Gcovr will now decode all data it reads. This changes the behaviour under Python 2. The encoding can be selected with the new --source-encoding option. One restriction is that all source files must have the same encoding.
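Roughly speaking, decoding with a single configurable encoding could look like the sketch below. The function names, the "utf-8" default, and the choice of errors="replace" are assumptions for this example, not a description of gcovr's internals.

```python
def decode_source(raw_bytes, source_encoding="utf-8"):
    """Decode raw file contents into text lines, using one encoding for all files."""
    # errors="replace" keeps going on undecodable bytes instead of raising.
    text = raw_bytes.decode(source_encoding, errors="replace")
    return text.splitlines()


def read_source(path, source_encoding="utf-8"):
    """Read a source file in binary mode so that decoding is explicit
    and behaves the same under Python 2 and Python 3."""
    with open(path, "rb") as handle:
        return decode_source(handle.read(), source_encoding)
```

For example, a Latin-1 encoded file would be read with read_source(path, "latin-1"), which corresponds to passing that encoding via --source-encoding.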

Documentation is managed with Sphinx

The gcovr website is now generated by the Sphinx documentation system. This simplifies edits as the website source is now maintained together with gcovr. Right now the user guide is still a large monolithic document, but in the future it will become feasible to split it up into smaller chapters. As an immediate win, it is possible to reuse content from the readme in the user guide. Sphinx also has a rich ecosystem of extensions. We use one to generate the command line option documentation.

Speaking of the command line docs, they have been sorted into separate groups. This makes the gcovr --help output much more readable.

The documentation also includes a contribution guide. Having a written process saves me a lot of time as a maintainer.

Gcov parsing improvements

The parser for the gcov report format was rewritten. It now provides basic support for GCC 8, and should be more robust. If there is a parse error, the new --gcov-ignore-errors option can be used to ignore it.

Gcovr can now launch multiple simultaneous gcov processes with the -j option. This can help in huge projects, but is unlikely to help in smaller projects. Due to synchronization overhead it can actually slow down a coverage report, so please don't use it indiscriminately.
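The general pattern behind such a -j option can be sketched with a bounded worker pool. This is only an illustration using the Python 3 standard library, not gcovr's implementation. The helper names are made up, and the demo uses stand-in commands where a real run would invoke gcov.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor


def run_command(command):
    """Run one command and return (exit code, decoded output)."""
    proc = subprocess.Popen(
        command, stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
    output, _ = proc.communicate()
    return proc.returncode, output.decode("utf-8", errors="replace")


def run_in_parallel(commands, jobs=1):
    """Run commands with at most `jobs` processes at a time.
    Results come back in the same order as the input commands."""
    with ThreadPoolExecutor(max_workers=jobs) as pool:
        return list(pool.map(run_command, commands))


if __name__ == "__main__":
    # Stand-in commands; a real run would call gcov once per coverage data file.
    demo = [[sys.executable, "-c", "print(%d)" % n] for n in range(4)]
    for code, out in run_in_parallel(demo, jobs=2):
        print(code, out.strip())
```

Threads are sufficient here because each worker mostly waits on a child process. The pool and result collection still add overhead, which is one reason parallelism may not pay off for small projects.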

HTML customizations

Since the refactoring, making changes to the HTML has become a lot easier.

--html-details no longer requires you to also use --html.
--html can now write to stdout.
--html-title sets the report title.
--html-medium-threshold and --html-high-threshold can be used to customize the color legend in the HTML overview.
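The two thresholds split coverage percentages into three bands. A minimal sketch of that classification follows. The function name, the band labels, and the default values of 75 and 90 are assumptions for this example, so check gcovr --help for the real defaults.

```python
def coverage_class(percent, medium_threshold=75.0, high_threshold=90.0):
    """Map a coverage percentage to a colour band for the report legend."""
    if percent >= high_threshold:
        return "high"
    if percent >= medium_threshold:
        return "medium"
    return "low"
```

With stricter settings such as coverage_class(85.0, medium_threshold=80.0, high_threshold=95.0), a file at 85% coverage lands in the "medium" band.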

Statistics

Between gcovr 3.4 and 4.0,

167 files were touched (mostly tests)
12748 lines were added (mostly tests)
3161 lines were deleted (mostly the old gcovr script)
8 new contributors submitted code

Despite these huge numbers, gcovr isn't actually that big: a large part of the effort went into the comprehensive test suite. The core gcovr application has 1800–2700 lines of code, depending on how you count.

Future plans

There is one big change that didn't make it into this release: the heuristics by which gcovr tries to find the correct directory from which to invoke gcov. This should be the same directory where the compiler was executed, and therefore depends on the exact build setup. Currently gcovr takes a few guesses and tries them out, launching a gcov process per guess. This is a major performance problem. However, I don't really understand that part of the code, and I don't quite understand what paths gcov is expecting exactly.

HTML suggestions

An often requested feature is per-subdirectory coverage summaries, so that the report website forms a tree that mirrors the source code structure. This is a great idea, but the HTML generator will require a bit of cleanup first.
Many years ago, there was a proposed redesign of the HTML reports based on Bootstrap. Since then, the gcovr codebase has diverged so that work may be unsalvageable. However, a cleaner look would be nice.
Accessibility improvements. The HTML reports use red/magenta and green a lot. Making the colours customizable would be great news for colour-blind people (including me). The branch coverage indicator symbols are tiny, and confusing when there are multiple branches per line. A stronger symbol incl. textual representation would make the reports easier to read for everyone.
The use of inline CSS is a problem when the reports are served under a Content Security Policy, e.g. through a Jenkins plugin. Adding an option to write the CSS into a separate file would be extremely helpful. It could also make the test corpus a lot smaller.
It's easy to extract function names from the gcov output. How cool would it be to summarize per-function coverage!

Other suggestions

XML reports can be made more performant. The XML data describes the coverage for all lines in a project. For large projects, this requires too much memory. Using either a more efficient DOM representation or a streaming output method will address this.
Tests with GCC 8 and Clang are important. Currently, the test suite assumes GCC 5. Different GCC versions differ slightly in the exact coverage report. That makes it very tricky to test.
A new machine readable output format (such as JSON) will make gcovr easier to script. E.g. a coverage summary could be extracted from this output in order to generate coverage badges.
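To illustrate the streaming idea from the XML suggestion above, here is a sketch that writes elements as they are produced instead of building a full DOM first. It uses XMLGenerator from the Python standard library. The element and attribute names are a simplified, Cobertura-like stand-in and not the full schema, and the function name and input shape are assumptions.

```python
import io
from xml.sax.saxutils import XMLGenerator


def write_coverage_xml(stream, files):
    """Stream a simplified coverage report to `stream`.

    `files` is any iterable of (filename, [(line_number, hits), ...]) pairs.
    It may be a generator, so only one file's data is in memory at a time.
    """
    xml = XMLGenerator(stream, encoding="utf-8")
    xml.startDocument()
    xml.startElement("coverage", {})
    for filename, lines in files:
        xml.startElement("class", {"filename": filename})
        for line_number, hits in lines:
            attrs = {"number": str(line_number), "hits": str(hits)}
            xml.startElement("line", attrs)
            xml.endElement("line")
        xml.endElement("class")
    xml.endElement("coverage")
    xml.endDocument()


if __name__ == "__main__":
    buffer = io.StringIO()
    write_coverage_xml(buffer, [("a.c", [(1, 3), (2, 0)])])
    print(buffer.getvalue())
```

Memory use then depends on the largest single file rather than on the whole project, at the cost of not being able to revisit elements that were already written.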

How to contribute

Would you like to tackle any of the above issues or an item from the issue tracker? The gcovr contributing guide explains how to get started: how to set up a development environment, how to submit the pull request, and where to get help along the way. When I have the time I enjoy mentoring, so ping me (@latk on GitHub) if you need any assistance.
