# ANGLE Performance Tests
`angle_perftests` is a standalone microbenchmark testing suite that contains
tests for the OpenGL API. `angle_trace_tests` is a suite that runs captured traces for correctness and
performance. Because the traces contain confidential data, they are not publicly available.
For more details on ANGLE's tracer please see the [docs](../restricted_traces/README.md).
The tests currently run on the Chromium ANGLE infrastructure and report
results to the [Chromium perf dashboard](https://chromeperf.appspot.com/report).
Please refer to the [public dashboard docs][DashboardDocs] for help.
[DashboardDocs]: https://chromium.googlesource.com/catapult/+/HEAD/dashboard/README.md
## Running the Tests
You can follow the usual instructions to [check out and build ANGLE](../../../doc/DevSetup.md).
Build the `angle_perftests` or `angle_trace_tests` targets. Note that all
test scores are higher-is-better. You should also ensure `is_debug=false` in
your build. Running with `angle_assert_always_on` or debug validation enabled
is not recommended.
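A release configuration along these lines avoids the pitfalls above (a sketch; adjust for your setup, and note that the trace tests also build by default once the traces are checked out):

```
# out/Release/args.gn -- example settings for performance runs
is_debug = false                 # release build; required for meaningful scores
angle_assert_always_on = false   # keep asserts off when timing
build_angle_trace_tests = true   # build the angle_trace_tests target
```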
Variance can be a problem when benchmarking. We have a test harness that runs
tests repeatedly to find a lower-variance measurement. See [`src/tests/run_perf_tests.py`][RunPerfTests].
To use the script, first build `angle_perftests` or `angle_trace_tests`, set
your working directory to your build directory, and invoke the
`run_perf_tests.py` script. Use `--test-suite` to specify your test suite,
and `--filter` to specify a test filter.
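For instance, an invocation might look like this (the relative script path and the filter are illustrative; they depend on where your build directory sits in the checkout):

```
# From your build output directory, e.g. out/Release:
python3 ../../src/tests/run_perf_tests.py \
    --test-suite=angle_perftests \
    --filter='DrawCallPerfBenchmark*'
```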
[RunPerfTests]: https://chromium.googlesource.com/angle/angle/+/main/src/tests/run_perf_tests.py
### Choosing the Test to Run
You can choose individual tests to run with `--gtest_filter=*TestName*`. To
select a particular ANGLE back-end, add the name of the back-end to the test
filter. For example: `DrawCallPerfBenchmark.Run/gl` or
`DrawCallPerfBenchmark.Run/d3d11`. Many tests have sub-tests that run
slightly different code paths. You might need to experiment to find the right
sub-test and its name.
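To discover test and sub-test names, the standard GoogleTest listing flag works (the binary path is illustrative):

```
out/Release/angle_perftests --gtest_list_tests --gtest_filter='DrawCallPerfBenchmark*'
```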
### Null/No-op Configurations
ANGLE implements a no-op driver for OpenGL, D3D11 and Vulkan. To run on these,
use the `gl_null`, `d3d11_null` or `vulkan_null` test configurations. These
null drivers do no GPU work and skip the real driver entirely, which makes
them useful for diagnosing performance overhead in ANGLE code.
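For example, to measure ANGLE's CPU-side overhead on the Vulkan back-end (the binary path and test name are illustrative):

```
out/Release/angle_perftests --gtest_filter='DrawCallPerfBenchmark.Run/vulkan_null'
```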
### Command-line Arguments
Several command-line arguments control how the tests run:
* `--one-frame-only`: Runs tests once and quickly exits. Used as a quick smoke test.
* `--enable-trace`: Write a JSON event log that can be loaded in Chrome.
* `--trace-file file`: Name of the JSON event log for `--enable-trace`.
* `--calibration`: Prints the number of steps a test runs in a fixed time. Used by `perf_test_runner.py`.
* `--steps-per-trial x`: Fixed number of steps to run for each test trial.
* `--max-steps-performed x`: Upper limit on the total number of steps for the entire test run.
* `--screenshot-dir dir`: Directory to store test screenshots. Implies `--save-screenshots`. On Android this directory is on device, not local (see also `--render-test-output-dir`). Only implemented in `TracePerfTest`.
* `--save-screenshots`: Save screenshots. Only implemented in `TracePerfTest`.
* `--screenshot-frame <frame>`: Which frame to capture a screenshot of. Defaults to first frame (1). Only implemented in `TracePerfTest`.
* `--render-test-output-dir=dir`: Directory to store test artifacts (including screenshots but unlike `--screenshot-dir`, `dir` here is always a local directory regardless of platform and `--save-screenshots` isn't implied).
* `--verbose`: Print extra timing information.
* `--warmup-trials x`: Number of times to warm up the test before starting timing. Defaults to 3.
* `--warmup-steps x`: Maximum number of steps for the warmup loops. Defaults to unlimited.
* `--no-warmup`: Skip warming up the tests. Equivalent to `--warmup-steps 0`.
* `--calibration-time`: Run each test calibration step in a fixed time. Defaults to 1 second.
* `--trial-time x` or `--max-trial-time x`: Run each test trial under this max time. Defaults to 10 seconds.
* `--fixed-test-time x`: Run the tests until this much time has elapsed.
* `--trials`: Number of times to repeat testing. Defaults to 3.
* `--no-finish`: Don't call glFinish after each test trial.
* `--enable-all-trace-tests`: Run offscreen and vsync-limited trace tests, which are disabled by default to reduce test time.
* `--minimize-gpu-work`: Modify API calls so that GPU work is reduced to minimum.
* `--validation`: Enable serialization validation in the trace tests. Normally used with SwiftShader and retracing.
* `--perf-counters`: Additional performance counters to include in the result output. Separate multiple entries with colons: ':'.
For example, for an endless run with no warmup, run:
`angle_perftests --gtest_filter=TracePerfTest.Run/vulkan_trex_200 --steps 1000000 --no-warmup`
The command-line argument implementations are located in [`ANGLEPerfTestArgs.cpp`](ANGLEPerfTestArgs.cpp).
## Test Breakdown
### Microbenchmarks
* [`DrawCallPerfBenchmark`](DrawCallPerf.cpp): Runs a tight loop around DrawArrays calls.
* `validation_only`: Skips all rendering.
* `render_to_texture`: Render to a user Framebuffer instead of the default FBO.
* `vbo_change`: Applies a Vertex Array change between each draw.
* `tex_change`: Applies a Texture change between each draw.
* [`UniformsBenchmark`](UniformsPerf.cpp): Tests performance of updating various uniform counts followed by a DrawArrays call.
* `vec4`: Tests `vec4` Uniforms.
* `matrix`: Tests using Matrix uniforms instead of `vec4`.
* `multiprogram`: Tests switching Programs between updates and draws.
  * `repeating`: Skips the update of uniforms before each draw call.
* [`DrawElementsPerfBenchmark`](DrawElementsPerf.cpp): Similar to `DrawCallPerfBenchmark` but for indexed DrawElements calls.
* [`BindingsBenchmark`](BindingPerf.cpp): Tests Buffer binding performance. Does no draw call operations.
* `100_objects_allocated_every_iteration`: Tests repeated glBindBuffer with new buffers allocated each iteration.
  * `100_objects_allocated_at_initialization`: Tests repeated glBindBuffer with the same objects each iteration.
* [`TexSubImageBenchmark`](TexSubImage.cpp): Tests `glTexSubImage` update performance.
* [`BufferSubDataBenchmark`](BufferSubData.cpp): Tests `glBufferSubData` update performance.
* [`TextureSamplingBenchmark`](TextureSampling.cpp): Tests Texture sampling performance.
* [`TextureBenchmark`](TexturesPerf.cpp): Tests Texture state change performance.
* [`LinkProgramBenchmark`](LinkProgramPerfTest.cpp): Tests performance of `glLinkProgram`.
* [`glmark2`](glmark2.cpp): Runs the glmark2 benchmark.
Many other tests are documented in their test classes.
### Trace Tests
* [`TracePerfTest`](TracePerfTest.cpp): Runs replays of restricted traces, which are
not available publicly. To enable them, see [`RestrictedTraceTests`](../restricted_traces/README.md).
## Understanding the Metrics
* `cpu_time`: Amount of CPU time consumed by an iteration of the test. This is backed by
`GetProcessTimes` on Windows, `getrusage` on Linux/Android, and `zx_object_get_info` on Fuchsia.
* This value may sometimes be larger than `wall_time`. That is because we are summing up the time
on all CPU threads for the test.
* `wall_time`: Wall time taken to run a single iteration, calculated by dividing the total wall
clock time by the number of test iterations.
* For trace tests, each rendered frame is an iteration.
* `gpu_time`: Estimated GPU elapsed time per test iteration. We compute the estimate using GLES
[timestamp queries](https://www.khronos.org/registry/OpenGL/extensions/EXT/EXT_disjoint_timer_query.txt)
at the beginning and ending of each test loop.
* For trace tests, this metric is only enabled in `vsync` mode.
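As a toy illustration of how these metrics relate (the numbers are made up): `cpu_time` sums busy time across all CPU threads, so it can exceed the wall-clock time of a trial, while `wall_time` divides elapsed wall-clock time by the iteration count.

```shell
# Hypothetical trial: two worker threads, each busy 800 ms,
# inside a 1000 ms wall-clock trial of 100 iterations.
thread_a_ms=800
thread_b_ms=800
cpu_ms=$((thread_a_ms + thread_b_ms))       # summed across threads: 1600 ms
wall_ms=1000
iterations=100
per_iter_wall_ms=$((wall_ms / iterations))  # wall_time per iteration: 10 ms
echo "cpu=${cpu_ms}ms wall=${wall_ms}ms per_iteration=${per_iter_wall_ms}ms"
```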