Enzo Test Suite¶
The Enzo test suite is a set of tools whose purpose is to perform regression tests on the Enzo codebase, in order to help developers discover bugs that they have introduced, to verify that the code is producing correct results on new computer systems and/or compilers, and, more generally, to demonstrate that Enzo is behaving as expected under a wide variety of conditions.
What’s in the test suite?¶
The suite is composed of a large number of individual test problems that are designed to span the range of physics and dimensionalities that are accessible using the Enzo code, both separately and in various permutations. Tests can be selected based on a variety of criteria, including (but not limited to) the physics included, the estimated runtime of the test, and the dimensionality. The testing suite runs enzo on each selected test problem, produces a series of outputs, and then uses yt to process these outputs in a variety of different ways (making projections, looking at fields, etc.). The results of these yt analyses are then compared against similarly generated results from an earlier “good” version of the enzo code run on the same problems. In test problems where we have them, analytical solutions are compared against the test results (e.g. shocktubes). Lastly, a summary of these test results are returned to the user for interpretation.
One can run individual tests or groups of tests using the various run time flags. For convenience, three pre-created, overlapping sets of tests are provided. For each set of tests, one must generate their own standard locally against which she can compare different builds of the code.
1. The “quick suite” (--suite=quick
). This is composed of
small calculations that test critical physics packages both
alone and in combination. The intent of this package is to be run
automatically and relatively frequently (multiple times a day) on
a remote server to ensure that bugs have not been introduced during the code
development process. All runs in the quick suite use no more than
a single processor. The total run time should be about 15 minutes
on the default lowest level of optimization..
2. The “push suite” (--suite=push
). This is a slightly
large set of tests, encompassing all of the quick suite and
some additional larger simulations that test a wider variety of physics
modules. The intent of this package is to provide a thorough validation
of the code prior to changes being pushed to the main repository. The
total run time is roughly 60 minutes for default optimization, and
all simulations use only a single processor.
3. The “full suite” (--suite=full
). This encompasses essentially
all of test simulations contained within the run directory. This suite
provides the most rigorous possible validation of the code in many different
situations, and is intended to be run prior to major changes being pushed
to the stable branch of the code. A small number of simulations in the full
suite are designed to be run on 2 processors and will take multiple hours to
complete. The total run time is roughly 60 hours for the default lowest
level of optimization.
How to run the test suite¶
1. Compile Enzo. If you have already built enzo, you can skip this step and the test will use your existing enzo executable. To compile enzo with the standard settings, complete these commands:
$ cd <enzo_root>/src/enzo
$ make default
$ make clean
$ make
Note that you need not copy the resulting enzo executable to your path, since the enzo.exe will be symbolically linked from the src/enzo directory into each test problem directory before tests are run.
2. Get the correct yt version The enzo tests are generated and compared using the yt analysis suite. You must be using yt 2.6.3 in order for the test suite to work. The test suite has not yet been updated to work with yt 3.0 and newer releases. If you do not yet have yt, visit http://yt-project.org/#getyt for installation instructions. If you already have yt and yt is in your path, make sure you’re using yt 2.6.3 by running the following commands:
$ cd /path/to/yt_mercurial_repository
$ hg update yt-2.x
$ python setup.py develop
3. Generate answers to test with. Run the test suite with these flags within
the run/
subdirectory in the enzo source hierarchy:
$ cd <enzo_root>/run
$ ./test_runner.py --suite=quick -o <output_dir> --answer-store
--answer-name=<test_name> --local
Note that we’re creating test answers in this example with the quick suite, but we could just as well create a reference from any number of test problems using other test problem flags.
Here, we are storing the results from our tests locally in a file called
<test_name> which will now reside inside of the <output_dir>
. If you want
to, you can leave off --answer-name
and get a sensible default.
$ ls <output_dir>
fe7d4e298cb2 <test_name>
$ ls <output_dir>/<test_name>
<test_name>.db
When we inspect this directory, we now see that in addition to the subdirectory containing the simulation results, we also have a <test_name> subdirectory which contains python-readable shelve files, in this case a dbm file. These are the files which actually contain the reference standard. You may have a different set of files or extensions depending on which OS you are using, but don’t worry Python can read this no problem. Congratulations, you just produced your own reference standard. Feel free to test against this reference standard or tar and gzip it up and send it to another machine for testing.
4. Run the test suite using your local answers. The testing suite operates
by running a series of enzo test files throughout the run
subdirectory.
Note that if you want to test a specific enzo changeset, you must update to it
and recompile enzo. You can initiate the quicksuite test simulations and their
comparison against your locally generated answers by running the following
commands:
$ cd <enzo_root>/run
$ ./test_runner.py --suite=quick -o <output_dir> --answer-name=<test_name>
--local --clobber
In this command, --output-dir=<output_dir>
instructs the test runner to
output its results to a user-specified directory (preferably outside of the enzo
file hierarchy). Make sure this directory is created before you call
test_runner.py, or it will fail. The default behavior is to use the quick
suite, but you can specify any set of tests using the --suite
or --name
flags. We are comparing the simulation results against a local (--local
)
reference standard which is named <test_name>
also located in the
<output_dir>
directory. Note, we included the --clobber
flag to rerun
any simulations that may have been present in the <output_dir>
under the
existing enzo version’s files, since the default behavior is to not rerun
simulations if their output files are already present. Because we didn’t set
the --answer-store
flag, the default behavior is to compare against the
<test_name>
.
5. Review the results. While the test_runner is executing, you should see the results coming up at the terminal in real time, but you can review these results in a file output at the end of the run. The test_runner creates a subdirectory in the output directory you provided it, as shown in the example below.
$ ls <output_dir>
fe7d4e298cb2
$ ls <output_dir>/fe7d4e298cb2
Cooling GravitySolver MHD test_results.txt
Cosmology Hydro RadiationTransport version.txt
The name of this directory will be the unique hash of the version of
enzo you chose to run with the testing suite. In this case it is
fe7d4298cb2
, but yours will likely be different, but equally
unintelligible. You can specify an optional additional suffix to be
appended to this directory name using --run-suffix=<suffix>
. This
may be useful to distinguish multiple runs of a given version of enzo,
for example with different levels of optimization. Within this
directory are all of the test problems that you ran along with their
simulation outputs, organized based on test type (e.g. Cooling
,
AMR
, Hydro
, etc.) Additionally, you should see a file called
test_results.txt
, which contains a summary of the test runs and
which ones failed and why.
My tests are failing and I don’t know why¶
A variety of things cause tests to fail: differences in compiler,
optimization level, operating system, MPI submission method,
and of course, your modifications to the code. Go through your
test_results.txt
file for more information about which tests
failed and why. You could try playing with the relative tolerance
for error using the --tolerance
flag as described in the flags
section. For more information regarding the failures of a specific
test, examine the estd.out
file in that test problem’s subdirectory
within the <output_dir>
directory structure, as it contains the
STDERR
and STDOUT
for that test simulation.
If you are receiving EnzoTestOutputFileNonExistent
errors, it
means that your simulation is not completing. This may be due to
the fact that you are trying to run enzo with MPI which your
system doesn’t allow you to initiate from the command line.
(e.g. it expects you to submit mpirun jobs to the queue).
You can solve this problem by recompiling your enzo executable with
MPI turned off (i.e. make use-mpi-no
), and then just pass the
local_nompi machine flag (i.e. -m local_nompi
) to your
test_runner.py call to run the executable directly without MPI support.
Currently, only a few tests use multiple cores, so this is not a
problem in the quick or push suites.
If you see a lot of YTNoOldAnswer
errors, it may mean that your simulation
is running to a different output than what was reached for your locally
generated answers does, and the test suite is trying to compare your last output
file against a non-existent file in the answers. Look carefully at the
results of your simulation for this test problem using the provided python file
to determine what is happening. Or it may simply mean that you specified the
wrong answer name.
Descriptions of all the testing suite flags¶
You can type ./test_runner.py --help
to get a quick summary of all
of the command line options for the testing suite. Here is a more
thorough explanation of each.
General flags
-h, --help
- list all of the flags and their argument types (e.g. int, str, etc.)
-o str, --output-dir=str
default: None- Where to output the simulation and results file hierarchy. Recommended to specify outside of the enzo source hierarchy.
-m str, --machine=str
default: local- Specify the machine on which you’re running your tests. This loads
up a machine-specific method for running your tests. For instance,
it might load qsub or mpirun in order to start the enzo executable
for the individual test simulations. You can only use machine
names of machines which have a corresponding machine file in the
run/run_templates
subdirectory (e.g. nics-kraken). N.B. the default,local
, will attempt to run the test simulations using mpirun, so if you are required to queue on a machine to execute mpirun,test_runner.py
will silently fail before finishing your simulation. You can avoid this behavior by compiling enzo without MPI and then setting the machine flag tolocal_nompi
. --repo=str
default: current directory- Path to repository being tested.
--interleave
default: False- Interleaves preparation, running, and testing of each individual test problem as opposed to default batch behavior.
--clobber
default: False- Rerun enzo on test problems which already have results in the destination directory
--tolerance=int
default: see--strict
Sets the tolerance of the relative error in the comparison tests in powers of 10.
Ex: Setting
--tolerance=3
means that test results are compared against the standard and fail if they are off by more than 1e-3 in relative error.--bitwise
default: see--strict
- Declares whether or not bitwise comparison tests are included to assure that the values in output fields exactly match those in the reference standard.
--strict=[high, medium, low]
default: lowThis flag automatically sets the
--tolerance
and--bitwise
flags to some arbitrary level of strictness for the tests. If one sets--bitwise
or--tolerance
explicitly, they trump the value set by--strict
. When testing enzo general functionality after an installation,--strict=low
is recommended, whereas--strict=high
is suggested when testing modified code against a local reference standard.high
: tolerance = 13, bitwise = Truemedium
: tolerance = 6, bitwise = Falselow
: tolerance = 3, bitwise = False--sim-only
default: False- Only run simulations, do not store the tests or compare them against a standard.
--test-only
default: False- Only perform tests on existing simulation outputs, do not rerun the simulations.
--time-multiplier=int
default: 1- Multiply simulation time limit by this factor. Useful if you’re on a slow machine or you cannot finish the specified tests in their allocated time.
--run-suffix=str
default: None- An optional suffix to append to the test run directory. Useful to distinguish multiple runs of a given changeset.
-v, --verbose
default: False- Verbose output in the testing sequence. Very good for tracking down specific test failures.
--pdb
default: False- When a test fails a pdb session is triggered. Allows interactive inspection of failed test data.
Flags for storing, comparing against different standards
--answer-store
default: False- Should we store the results as a reference or just compare against an existing reference?
--answer-name=str
default: latest gold standard- The name of the file where we will store our reference results,
or if
--answer-store
is false, the name of the reference against which we will compare our results. --local
default: False- Store/Compare the reference standard locally (i.e. not on the cloud)
Bisection flags
-b, --bisect
default: False- Run bisection on test. Requires revisions
--good
and--bad
. Best if--repo
is different from location oftest_runner.py
runs--problematic
suite. --good=str
default: None- For bisection, most recent good revision
--bad=str
default: None- For bisection, most recent bad revision
-j int, --jcompile=int
default: 1- number of processors with which to compile when running bisect
--changeset=str
default: latest- Changeset to use in simulation repo. If supplied, make clean && make is also run
Flags not used
--with-answer-testing
default: False- DO NOT USE. This flag is used in the internal yt answer testing and has no purpose in the enzo testing infrastructure.
--answer-big-data
default: False- DO NOT USE. This flag is used in the internal yt answer testing and has no purpose in the enzo testing infrastructure.
Flags for specifying test problems
These are the various means of specifying which test problems you want to include in a particular run of the testing suite.
--suite=[quick, push, full]
default: None- A precompiled collection of several different test problems. quick: 37 tests in ~15 minutes, push: 48 tests in ~30 minutes, full: 96 tests in ~60 hours.
--answer_testing_script=str
default: None
--AMR=bool
default: False- Test problems which include AMR
--author=str
default: None- Test problems authored by a specific person
--chemistry=bool
default: False- Test problems which include chemistry
--cooling=bool
default: False- Test problems which include cooling
--cosmology=bool
default: False- Test problems which include cosmology
--dimensionality=[1, 2, 3]
- Test problems in a particular dimension
--gravity=bool
default: False- Test problems which include gravity
--hydro=bool
default: False- Test problems which include hydro
--max_time_minutes=float
- Test problems which finish under a certain time limit
--mhd=bool
default: False- Test problems which include MHD
--name=str
default: None- A test problem specified by name
--nprocs=int
default: 1- Test problems which use a certain number of processors
--problematic=bool
default: False- Test problems which are deemed problematic
--radiation=[None, fld, ray]
default: None- Test problems which include radiation
--runtime=[short, medium, long]
default: None- Test problems which are deemed to have a certain predicted runtime
How to track down which changeset caused your test failure¶
In order to identify changesets that caused problems, we have
provided the --bisect
flag. This runs hg bisect on revisions
between those which are marked as –good and –bad.
hg bisect automatically manipulates the repository as it runs its
course, updating it to various past versions of the code and
rebuilding. In order to keep the tests that get run consistent through
the course of the bisection, we recommend having two separate enzo
installations, so that the specified repository (using --repo
) where
this rebuilding occurs remains distinct from the repository where the
testing is run.
To minimize the number of tests run, bisection is only run on tests
for which problematic=True
. This must be set by hand by the user
before running bisect. It is best that this is a single test problem,
though if multiple tests match that flag, failures are combined with “or”
An example of using this method is as follows:
$ echo "problematic = True" >> Cosmology/Hydro/AdiabaticExpansion/AdiabaticExpansion.enzotest
$ ./test_runner.py --output-dir=/scratch/dcollins/TESTS --repo=/SOMEWHERE_ELSE
--answer-compare-name=$mylar/ac7a5dacd12b --bisect --good=ac7a5dacd12b
--bad=30cb5ff3c074 -j 8
To run preliminary tests before bisection, we have also supplied the
--changeset
flag. If supplied, --repo
is updated to
--changeset
and compiled. Compile errors cause test_runner.py
to return that error, otherwise the tests/bisector is run.
How to add a new test to the library¶
It is hoped that any newly-created or revised physics module will be accompanied by one or more test problems, which will ensure the continued correctness of the code. This sub-section explains the structure of the test problem system as well as how to add a new test problem to the library.
Test problems are contained within the run/
directory in the
Enzo repository. This subdirectory contains a tree of directories
where test problems are arranged by the primary physics used in that
problem (e.g., Cooling, Hydro, MHD). These directories may be further
broken down into sub-directories (Hydro is broken into Hydro-1D,
Hydro-2D, and Hydro-3D), and finally into individual directories
containing single problems. A given directory contains, at minimum,
the Enzo parameter file (having extension .enzo
, described in
detail elsewhere in the manual) and the Enzo test suite parameter file
(with extension .enzotest
). The latter contains a set of
parameters that specify the properties of the test. Consider the test
suite parameter file for InteractingBlastWaves, which can be found in the
run/Hydro/Hydro-1D/InteractingBlastWaves
directory:
name = 'InteractingBlastWaves'
answer_testing_script = None
nprocs = 1
runtime = 'short'
hydro = True
gravity = False
AMR = True
dimensionality = 1
max_time_minutes = 1
fullsuite = True
pushsuite = True
quicksuite = True
This allows the user to specify the dimensionality, physics used, the
runtime (both in terms of ‘short’, ‘medium’, and ‘long’ calculations,
and also in terms of an actual wall clock time). A general rule for
choosing the runtime value is ‘short’ for runs taking less than 5 minutes,
‘medium’ for run taking between 5 and 30 minutes, and ‘long’ for runs taking
more than 30 minutes. If the test problem runs successfully in any amount
of time, it should be in the full suite, selected by setting
fullsuite=True
. If the test runs in a time that falls under ‘medium’
or ‘short’, it can be added to the push suite (pushsuite=True
). If
the test is ‘short’ and critical to testing the functionality of the code,
add it to the quick suite (quicksuite=True
).
Once you have created a new problem type in Enzo and thoroughly documented the parameters in the Enzo parameter list, you should follow these steps to add it as a test problem:
- Create a fork of Enzo.
2. Create a new subdirectory in the appropriate place in the
run/
directory. If your test problem uses multiple pieces of
physics, put it under the most relevant one.
3. Add an Enzo parameter file, ending in the extension .enzo
,
for your test problem to that subdirectory.
4. Add an Enzo test suite parameter file, ending in the extension
.enzotest
. In that file, add any relevant parameters as described
above.
5. By default, the final output of any test problem will be tested by
comparing the min, max, and mean of a set of fields. If you want to
have additional tests performed, create a script in the problem type
directory and set the answer_testing_script
parameter in the
.enzotest
file to point to your test script. For an example of
writing custom tests, see
run/Hydro/Hydro-3D/RotatingCylinder/test_rotating_cylinder.py
.
6. Submit a Pull Request with your changes and indicate that you have created a new test to be added to the testing suites.
Congratulations, you’ve created a new test problem!
What to do if you fix a bug in Enzo¶
It’s inevitable that bugs will be found in Enzo, and that some of those bugs will affect the actual simulation results (and thus the test problems used in the problem suite). Here is the procedure for doing so:
1. Run the “push suite” of test problems (--pushsuite=True
)
for your newly-revised version of Enzo, and determine which test
problems now fail.
2. Visually inspect the failed solutions, to ensure that your new version is actually producing the correct results!
3. Email the enzo-developers mailing list at enzo-dev@googlegroups.com to explain your bug fix, and to show the results of the now-failing test problems.
- Create a pull request for your fix.