Hash :
3f0c4a56
Author :
Date :
2019-01-10T10:20:35
Vulkan: Faster state transitions.

Implements a transition table from Pipeline Cache entry to state change neighbouring Pipeline Cache entries. We use a 64-bit mask to do a quick scan over the pipeline desc. This ends up being a lot faster than doing a full hash and memcmp over the pipeline description.

Note that there could be future optimizations to this design. We might keep a hash map of the pipeline transitions instead of a list. Or use a sorted list. This could speed up the search when there are many transitions for cache entries. Also we could skip the transition table and opt to do a full hash when there are more than a configurable number of dirty states. This might be a bit faster in some cases. Likely this will be something we can add performance tests for in the future.

Documentation is also added in a README file for the Vulkan back end. This will be extended over time.

Improves performance about 30-35% on the VBO state change test.

Bug: angleproject:3013
Change-Id: I793f9e3efd8887acf00ad60e4ac2502a54c95dee
Reviewed-on: https://chromium-review.googlesource.com/c/1369287
Commit-Queue: Jamie Madill <jmadill@chromium.org>
Reviewed-by: Yuly Novikov <ynovikov@chromium.org>
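The transition table idea in the commit message can be illustrated with a minimal sketch. This is not ANGLE's actual code: the names `PipelineDesc`, `CacheEntry`, `Transition`, `MatchesOnDirtyWords` and `FindTransition`, and the choice of splitting the description into 64 words, are assumptions made for illustration. The sketch assumes each bit of the 64-bit mask flags one word of the description as dirty, so a lookup only compares the flagged words instead of hashing the whole description.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical layout: the pipeline description is split into 64 words so
// that one bit of a 64-bit mask can flag each word as dirty.
constexpr std::size_t kDescWords = 64;
using PipelineDesc = std::array<std::uint32_t, kDescWords>;

struct CacheEntry;

// One edge of the transition table (illustrative, not ANGLE's real type).
struct Transition
{
    std::uint64_t dirtyBits;   // which words differ from the source entry
    const CacheEntry *target;  // neighbouring cache entry reached by that change
};

struct CacheEntry
{
    PipelineDesc desc;
    std::vector<Transition> transitions;
};

// Compares only the words whose bit is set in `bits`.
bool MatchesOnDirtyWords(const PipelineDesc &a, const PipelineDesc &b, std::uint64_t bits)
{
    for (std::size_t word = 0; bits != 0; ++word, bits >>= 1)
    {
        if ((bits & 1u) != 0 && a[word] != b[word])
        {
            return false;
        }
    }
    return true;
}

// Returns the neighbouring entry for this state change, or nullptr when no
// transition is recorded. A caller would then fall back to the full hash
// lookup and record a new transition.
const CacheEntry *FindTransition(const CacheEntry &current,
                                 std::uint64_t dirtyBits,
                                 const PipelineDesc &desired)
{
    for (const Transition &t : current.transitions)
    {
        if (t.dirtyBits == dirtyBits && MatchesOnDirtyWords(t.target->desc, desired, dirtyBits))
        {
            return t.target;
        }
    }
    return nullptr;
}
```

The linear scan over `transitions` corresponds to the list mentioned in the commit message; the hash map and sorted list it proposes as future work would replace that loop.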
angle_perftests is a standalone testing suite that contains targeted tests for OpenGL, Vulkan and ANGLE internal classes. The tests currently run on the Chromium ANGLE infrastructure and report results to the Chromium perf dashboard.
You can also build your own dashboards. For example, a comparison of ANGLE’s back-end draw call performance on Windows NVIDIA can be found at this link. Note that this link is not kept current.
You can follow the usual instructions to check out and build ANGLE. Build the angle_perftests target. Note that all test scores are higher-is-better. You should also ensure is_debug=false in your build. Running with dcheck_always_on or debug validation enabled is not recommended.
Variance can be a problem when benchmarking. To help mitigate it, we have a test harness that runs a single test in an infinite loop and prints some statistics. See scripts/perf_test_runner.py. To use the script, first compile angle_perftests into a folder with the word Release in its name. Then pass the name of the test as the argument to the script. The script will automatically pick up the most recently built angle_perftests and run it in an infinite loop.
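As an illustration of the kind of statistics that help judge variance, here is a minimal sketch. It does not reproduce what perf_test_runner.py prints; the names `ScoreStats` and `Summarize`, and the choice of mean, sample standard deviation and relative standard deviation, are assumptions for this example.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical summary of repeated benchmark scores (higher is better).
struct ScoreStats
{
    double mean;
    double stddev;          // sample standard deviation (n - 1 denominator)
    double relativeStddev;  // stddev / mean, a unitless measure of noise
};

ScoreStats Summarize(const std::vector<double> &scores)
{
    ScoreStats stats{0.0, 0.0, 0.0};
    if (scores.empty())
    {
        return stats;
    }

    double sum = 0.0;
    for (double score : scores)
    {
        sum += score;
    }
    stats.mean = sum / static_cast<double>(scores.size());

    // The sample standard deviation needs at least two measurements.
    if (scores.size() > 1)
    {
        double squaredDeviation = 0.0;
        for (double score : scores)
        {
            squaredDeviation += (score - stats.mean) * (score - stats.mean);
        }
        stats.stddev = std::sqrt(squaredDeviation / static_cast<double>(scores.size() - 1));
    }

    if (stats.mean != 0.0)
    {
        stats.relativeStddev = stats.stddev / stats.mean;
    }
    return stats;
}
```

A relative standard deviation that is large compared to the improvement being measured suggests that more runs are needed before drawing conclusions.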
You can choose individual tests to run with --gtest_filter=*TestName*. To select a particular ANGLE back-end, add the name of the back-end to the test filter. For example: DrawCallPerfBenchmark.Run/gl or DrawCallPerfBenchmark.Run/d3d11. Many tests have sub-tests that run slightly different code paths. You might need to experiment to find the right sub-test and its name.
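To show how a wildcard filter selects test names, here is a minimal sketch of glob-style matching with `*` (any run of characters) and `?` (any single character). It is not Google Test's implementation, and it leaves out the `:` separators and negative patterns that `--gtest_filter` also accepts. The function name `WildcardMatch` is hypothetical.

```cpp
// Returns true when `name` matches `pattern`, where '*' matches any run of
// characters (including none) and '?' matches exactly one character.
bool WildcardMatch(const char *pattern, const char *name)
{
    if (*pattern == '\0')
    {
        return *name == '\0';
    }
    if (*pattern == '*')
    {
        // Either the star matches nothing, or it consumes one character.
        return WildcardMatch(pattern + 1, name) ||
               (*name != '\0' && WildcardMatch(pattern, name + 1));
    }
    if (*name != '\0' && (*pattern == '?' || *pattern == *name))
    {
        return WildcardMatch(pattern + 1, name + 1);
    }
    return false;
}
```

With this matcher, the pattern `*DrawCallPerfBenchmark*` matches `DrawCallPerfBenchmark.Run/gl`. A pattern ending in `Run/gl*` would also match a name ending in `Run/gl_null`, while one ending in `Run/gl` without the trailing star would not, which is worth keeping in mind when selecting a back end.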
ANGLE implements a no-op driver for OpenGL, D3D11 and Vulkan. To run on these configurations use the gl_null, d3d11_null or vulkan_null test configurations. These null drivers will not do any GPU work. They will skip the driver entirely. These null configs are useful for diagnosing performance overhead in ANGLE code.
DrawCallPerfBenchmark: Runs a tight loop around DrawArrays calls.
    validation_only: Skips all rendering.
    render_to_texture: Renders to a user Framebuffer instead of the default FBO.
    vbo_change: Applies a Vertex Array change between each draw.
    tex_change: Applies a Texture change between each draw.
UniformsBenchmark: Tests performance of updating various uniform counts followed by a DrawArrays call.
    vec4: Tests vec4 Uniforms.
    matrix: Tests using Matrix uniforms instead of vec4.
    multiprogram: Tests switching Programs between updates and draws.
    repeating: Skips the update of uniforms before each draw call.
DrawElementsPerfBenchmark: Similar to DrawCallPerfBenchmark but for indexed DrawElements calls.
BindingsBenchmark: Tests Buffer binding performance. Performs no draw calls.
    100_objects_allocated_every_iteration: Tests repeated glBindBuffer with new buffers allocated each iteration.
    100_objects_allocated_at_initialization: Tests repeated glBindBuffer with the same objects each iteration.
TexSubImageBenchmark: Tests glTexSubImage update performance.
BufferSubDataBenchmark: Tests glBufferSubData update performance.
TextureSamplingBenchmark: Tests Texture sampling performance.
TextureBenchmark: Tests Texture state change performance.
LinkProgramBenchmark: Tests performance of glLinkProgram.

Many other tests exist and are documented in their classes.