docs/doxygen/lib/refdata.md

Using reference data in C++ tests {#page_refdata}
=================================

The \ref module_testutils module provides (among other things) utilities to
write Google Test tests that compare their results against stored reference
data.  These utilities can be used either

 * for regression-style tests that just ensure that the output does not
   change, or
 * combined with manual checking of the reference data, as a different kind of
   assertion for cases where the expected results would be tedious to express
   directly as C++ code (e.g., when checking complicated data structures for
   correctness).

The current reference data functionality is quite basic, but it can be extended
if/when more control over, e.g., comparison tolerances is needed.

Reference data organization
===========================

Conceptually, the reference data consists of a tree-like structure of nodes.
Each leaf node checks a single primitive value (an integer, a floating-point
value, a string, etc.), and each inner node acts as a _compound_ value that
helps organize the data.  Within each compound node (including the root of
the tree), child nodes are identified by an `id` string.  Each node within a
single compound must have a unique `id`, and it is possible to compare
multiple values produced by the test against a single node (naturally, the
test only passes if it produces the same value in all such cases).

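The tree described above can be pictured with a small stand-alone model.  This
is only an illustrative sketch, not the implementation in the testutils module:
`RefNode`, `makeLeaf()`, `findChild()` and `makeSampleTree()` are hypothetical
names, the type and `id` strings are made up, and leaf values are kept as text
for brevity.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical model of one reference data node; not the GROMACS implementation.
struct RefNode
{
    std::string          type;     // value type for a leaf, test-defined name for a compound
    std::string          id;       // unique among the children of one compound
    std::string          value;    // used by leaf nodes only
    std::vector<RefNode> children; // used by compound nodes only
};

// Creates a leaf node that stores a single primitive value as text.
RefNode makeLeaf(const std::string& type, const std::string& id, const std::string& value)
{
    RefNode leaf;
    leaf.type  = type;
    leaf.id    = id;
    leaf.value = value;
    return leaf;
}

// Returns the child of `parent` with the given id, or nullptr if there is none.
const RefNode* findChild(const RefNode& parent, const std::string& id)
{
    for (const RefNode& child : parent.children)
    {
        if (child.id == id)
        {
            return &child;
        }
    }
    return nullptr;
}

// Builds a root that holds one leaf and one compound with two leaves.
RefNode makeSampleTree()
{
    RefNode cell;
    cell.type = "Box";
    cell.id   = "UnitCell";
    cell.children.push_back(makeLeaf("Real", "Length", "2.5"));
    cell.children.push_back(makeLeaf("Int", "Dim", "3"));

    RefNode root;
    root.type = "Root";
    root.children.push_back(makeLeaf("Int", "AtomCount", "42"));
    root.children.push_back(cell);
    assert(findChild(root, "UnitCell") != nullptr);
    return root;
}
```

In this model, a lookup only sees the direct children of one compound, so
`findChild(root, "Dim")` returns `nullptr` even though a `Dim` leaf exists
inside the `UnitCell` compound.
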
Each node also has a type (a string).  For leaf nodes, the type is from a
predetermined set of strings, and identifies the type of the value stored in
the node.  For compound nodes, the type is just a string provided by the test.
In all cases, the type in the reference data must match the type provided by
the test.  This provides additional safety when the test is changed, because
it detects mismatches between the test and the reference data.  The intention
is that compound nodes whose contents have the same structure have the same
type; this simplifies using XSLT for viewing the reference data (see below).

Some compound types are predefined, e.g., for simple sequences, but more
complicated compounds can be defined ad-hoc in tests that need them.  See below
for how to use them in the code.

As a special case, the `id` can be empty (`NULL`).  This is intended for
cases where one is checking a sequence of items, and the only thing
distinguishing the items is their position in this sequence.  Using an empty
`id` removes the need to generate unique identifiers for the items, and makes
textual diffs of the reference data files easier to read.
Only a single sequence of nodes with an empty `id` is supported within one
parent node: if you first check some nodes with an empty `id`, followed by a
node with a non-empty `id`, the next check for an empty `id` will again match
the first node in the sequence.
For clarity, all the nodes that have an empty `id` should be of the same
type, but this is not enforced.

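The matching rule for empty `id` values can be sketched as follows.  This is a
simplified model of the behavior described above, not the actual lookup code:
`CompoundSketch`, `find()` and `demoEmptyIdSequence()` are hypothetical names,
and only integer values are modelled.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for one compound node; children are (id, value) pairs.
class CompoundSketch
{
public:
    explicit CompoundSketch(std::vector<std::pair<std::string, int>> children) :
        children_(std::move(children))
    {
    }

    // Returns the value matched by `id`, or nullptr if nothing matches.
    const int* find(const std::string& id)
    {
        if (!id.empty())
        {
            // A check with a non-empty id ends the current empty-id sequence.
            nextUnnamed_ = 0;
            for (const auto& child : children_)
            {
                if (child.first == id)
                {
                    return &child.second;
                }
            }
            return nullptr;
        }
        // Empty id: continue with the next unnamed child in stored order.
        for (std::size_t i = nextUnnamed_; i < children_.size(); ++i)
        {
            if (children_[i].first.empty())
            {
                nextUnnamed_ = i + 1;
                return &children_[i].second;
            }
        }
        return nullptr;
    }

private:
    std::vector<std::pair<std::string, int>> children_;
    std::size_t                              nextUnnamed_ = 0;
};

// Usage: two unnamed items, then a named one, then an unnamed check again.
void demoEmptyIdSequence()
{
    CompoundSketch items({ { "", 10 }, { "", 20 }, { "Count", 2 } });
    assert(*items.find("") == 10);
    assert(*items.find("") == 20);
    assert(*items.find("Count") == 2);
    assert(*items.find("") == 10); // matches the first unnamed node again
}
```
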
Using reference data in code
============================

To use reference data in a test, the test should first create exactly one
instance of gmx::test::TestReferenceData.  It can do so as a local variable in
the test, as a member variable in its test fixture, or by subclassing a test
fixture that already contains such a variable (e.g., gmx::test::StringTestBase
or gmx::test::CommandLineTestBase).
Only use the default constructor!  The other constructor is intended for
self-testing utility code used in other tests (including self-testing the
reference data implementation itself), and behaves differently from what is
described here.

To access the root node of the data, call
gmx::test::TestReferenceData::rootChecker().
This returns a gmx::test::TestReferenceChecker that provides various
`check*()` methods that can be used to check values against top-level nodes.
gmx::test::TestReferenceChecker::checkCompound() can be called to create custom
compound types: it returns another gmx::test::TestReferenceChecker that can be
used to check values against child nodes of the created compound.

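The call pattern described above (a root checker, `check*()` calls for leaf
values, and `checkCompound()` returning a nested checker) can be sketched with a
self-contained stand-in.  `CheckerSketch` and `runSketchedTest()` are
hypothetical and are not gmx::test::TestReferenceChecker: the stand-in only
records values under a path instead of comparing them, and the `checkInteger()`
and `checkString()` signatures are assumptions modelled on the `check*()`
naming.  In a real test, the root checker would come from
gmx::test::TestReferenceData::rootChecker().

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical stand-in that only mimics the call pattern described above.
// It records values instead of comparing them against stored reference data.
class CheckerSketch
{
public:
    CheckerSketch(std::map<std::string, std::string>* store, const std::string& path) :
        store_(store), path_(path)
    {
        assert(store_ != nullptr);
    }

    // Handles one integer leaf directly below this checker's node.
    void checkInteger(int value, const char* id) { (*store_)[path_ + id] = std::to_string(value); }

    // Handles one string leaf directly below this checker's node.
    void checkString(const std::string& value, const char* id) { (*store_)[path_ + id] = value; }

    // Returns a checker for the children of a new compound node.
    CheckerSketch checkCompound(const char* type, const char* id)
    {
        (*store_)[path_ + id] = std::string("compound:") + type;
        return CheckerSketch(store_, path_ + id + "/");
    }

private:
    std::map<std::string, std::string>* store_;
    std::string                         path_;
};

// Mirrors the flow of a test body: root checker first, then a custom compound.
std::map<std::string, std::string> runSketchedTest()
{
    std::map<std::string, std::string> store;
    CheckerSketch                      root(&store, ""); // stands in for rootChecker()
    root.checkInteger(42, "AtomCount");
    root.checkString("water", "Name");
    CheckerSketch cell = root.checkCompound("Box", "UnitCell");
    cell.checkInteger(3, "Dim");
    return store;
}
```

The point of the sketch is the nesting: values checked through `cell` end up
below the `UnitCell` compound, while values checked through `root` are
top-level nodes.
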
Whenever a gmx::test::TestReferenceChecker method detects a mismatch against
reference data, it will generate a non-fatal Google Test failure in the current
test.  The test can naturally also use its own test assertions for additional
checks, but any mismatch against the reference data also fails the test
automatically.

It is also possible to read values of the reference data items using
gmx::test::TestReferenceChecker, so that they can be used programmatically.
For this to work, those items should first be written in the same test.
This supports tests that want to both check data against a reference, and use
that reference as a persistence layer for storing information, which is useful
at least for serialization tests.
Reading values is currently not supported for all use cases, but, with some
caveats, it can be used for testing.

When using floating-point values in reference data, the tolerance for the
comparison can be set with
gmx::test::TestReferenceChecker::setDefaultTolerance().
Per-comparison tolerances could be implemented if necessary, but
currently you can either change the default tolerance whenever you need to, or
create copies of the gmx::test::TestReferenceChecker object and set different
tolerances in the different instances.  Note that there is an implicit
assumption that a mixed- and a double-precision build will produce the same
results (within the given tolerance).  This means that some things cannot be
tested with the reference data (e.g., multiple steps of MD integration), and
that reference data for floating-point tests always needs to be generated in
double precision (unless the results are nice, exact binary floating-point
numbers).

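The precision assumption above can be illustrated with a small stand-alone
sketch.  It uses a plain relative tolerance as a simplification; the tolerance
applied by the real checker may be defined differently, and
`withinRelativeTolerance()`, `oneThirdInDouble()`, `oneThirdInSingle()` and
`singleMatchesDoubleReference()` are hypothetical names.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper using a plain relative tolerance; the tolerance used by
// the real reference data checker may be defined differently.
bool withinRelativeTolerance(double reference, double value, double relativeTolerance)
{
    assert(relativeTolerance >= 0.0);
    return std::fabs(value - reference) <= relativeTolerance * std::fabs(reference);
}

// The same quantity as a double-precision build would compute it.
double oneThirdInDouble()
{
    return 1.0 / 3.0;
}

// The same quantity as a single-precision computation would produce it.
float oneThirdInSingle()
{
    return 1.0F / 3.0F;
}

// True if a reference generated in double precision also accepts the
// single-precision result at the given tolerance.
bool singleMatchesDoubleReference(double relativeTolerance)
{
    const double reference = oneThirdInDouble();
    const float  computed  = oneThirdInSingle();
    return withinRelativeTolerance(reference, static_cast<double>(computed), relativeTolerance);
}
```

With a relative tolerance of `1e-6`, the single-precision result is accepted
against the double-precision reference, whereas a tolerance of `1e-12` rejects
it.  A value such as `0.25`, which is exactly representable in both precisions,
matches even with a zero tolerance.
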
Just creating a gmx::test::TestReferenceData instance does not enforce using
reference data in the test; the data is loaded/used only when
gmx::test::TestReferenceData::rootChecker() is first called.  If the test never
calls this method, the gmx::test::TestReferenceData object does nothing.  This
allows using the same test fixture (e.g., gmx::test::CommandLineTestBase) also
in tests that do not need the reference data, but benefit from other features
of the fixture.

Running tests that use reference data
=====================================

To run a test that uses the reference data, you just execute the test binary as
you would otherwise.  However, when you first add a test, the reference data
does not exist, and the test will fail with an assertion message saying that
the reference data could not be found.  To generate the reference data, you
need to run the test binary with a `-ref-data create` command-line option
(it is also possible to use any of the `update` options below to generate the
reference data).

If you change a test (or the tested code) such that the reference data needs to
be changed, you need to run the test binary with `-ref-data update-all` or
`-ref-data update-changed`.  The former will recreate the reference data from
scratch.  The latter will retain old reference values if they are still valid.
In other words, floating-point reference values that are within the test
tolerance will be kept at their old values.  Only values that are outside the
tolerance (or otherwise do not match or do not exist) are updated.
This is useful (at least) for tests that contain floating-point data, where it
is not expected that those floating-point values would actually need to change.
This allows you to update other parts of the reference data without doing a
double-precision build, and also makes it easier to avoid spurious changes in
the last bits of other reference data values when just a single output value is
expected to change.

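The difference between the two update modes for a single floating-point value
can be sketched as below.  This only models the rule stated above under a plain
relative tolerance; `UpdateMode`, `withinRelativeTolerance()` and
`valueToStore()` are hypothetical names and do not correspond to the actual
implementation.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical names for the two update options described above.
enum class UpdateMode
{
    UpdateAll,
    UpdateChanged
};

// Plain relative tolerance; a simplification of the real comparison.
bool withinRelativeTolerance(double reference, double value, double relativeTolerance)
{
    return std::fabs(value - reference) <= relativeTolerance * std::fabs(reference);
}

// Returns the value that ends up in the reference data.  `oldReference` is
// nullptr when no reference value exists yet.
double valueToStore(UpdateMode mode, const double* oldReference, double newValue, double relativeTolerance)
{
    assert(relativeTolerance >= 0.0);
    const bool keepOld = (mode == UpdateMode::UpdateChanged && oldReference != nullptr
                          && withinRelativeTolerance(*oldReference, newValue, relativeTolerance));
    return keepOld ? *oldReference : newValue;
}
```

For an old reference of `1.0` and a new result of `1.0 + 1e-9` with a tolerance
of `1e-6`, the `UpdateChanged` mode keeps `1.0`, while `UpdateAll` stores the
new result.  A new result of `1.5` replaces the old value in both modes.
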
To create or update reference data, the test needs to pass when run with the
corresponding flag.  All comparisons against reference data will pass in these
modes, but you need to ensure that other assertions in the test also pass, and
that the test does not throw exceptions.
Note that if your test does multiple comparisons against the same `id` node,
reference data comparison can still fail during create/update if the test does
not produce the same results for each comparison.

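Why repeated comparisons against the same `id` can still fail while creating
reference data can be seen in a minimal model.  `CreateModeSketch` and
`demoRepeatedId()` are hypothetical and handle integers only; the sketch
assumes that the first check of an `id` writes the value and that later checks
of the same `id` are compared against what was written.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical sketch of a checker in create mode for integer values only.
class CreateModeSketch
{
public:
    // Returns true if the check passes.
    bool checkInteger(int value, const std::string& id)
    {
        const auto existing = written_.find(id);
        if (existing == written_.end())
        {
            // First check of this id: the value becomes the reference.
            written_[id] = value;
            return true;
        }
        // Repeated check of the same id: compared against the value just written.
        return existing->second == value;
    }

private:
    std::map<std::string, int> written_;
};

// Usage: the third check fails because it disagrees with the first one.
void demoRepeatedId()
{
    CreateModeSketch checker;
    assert(checker.checkInteger(7, "Count"));
    assert(checker.checkInteger(7, "Count"));
    assert(!checker.checkInteger(8, "Count"));
}
```
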
With all the operations that create or update the reference data, you can use
the `--gtest_filter=<...>` command-line option provided by Google Test to
select the tests whose reference data you want to influence.

Persistence
===========

The reference data is stored in XML files under
`src/gromacs/`<em>module</em>`/tests/refdata/` in the source tree.
This part of the framework depends on `tinyxml2`, which is bundled in `src/external`.
One file is produced per test that uses reference data.  If you rename tests or
otherwise change the reference data, you currently need to manually manage the
files with `git`.

For inspecting the reference data in a browser, there are XSLT stylesheets that
transform the XML files into HTML.  Such custom transformations need to be
written for each type of test if the output is not easy to check otherwise.
Because of security features in browsers, the transformations may not work in
all browsers.  For the same reason, the XSLT files must be in the same folder
as the XML files.  For cases where the XSLT files are shared between multiple
modules, `src/testutils/copy_xsl.sh` synchronizes the copies after the master
copy is edited.