    Commit

  • Hash : 7b77dc5b
    Author : Charlie Lao
    Date : 2021-04-22T09:58:33

    Vulkan: minimize-gpu-work: Skip data copy when possible
    
    When --minimize-gpu-work is specified while replaying app traces, the
    goal is to avoid GPU work wherever possible and focus on driver CPU
    logic overhead. Data copies can be lengthy, and each driver optimizes
    them differently for some real-world usage scenarios. They should be
    examined alongside normal app trace playback performance. When
    --minimize-gpu-work is specified, we want to leave them out of the
    picture. Previously I fixed TexImage2D by overwriting the pixel pointer
    with null, but that left a hole when a PBO is used. This CL fixes the
    case where data is sourced from a PBO, so the data copy is skipped there
    as well.
    
    This CL also no-ops the TexSubImage call instead of doing a 1x1 copy.
    Again, depending on the driver implementation, some may use the CPU and
    others the GPU, which will have different overhead. We can easily write
    a test to cover these performance optimizations. By skipping the
    subImage call here we will have less noise to deal with when
    investigating CPU overhead.
    
    Bug: b/184766477
    Change-Id: I84a5d26d2f25f8f0a6c5c9da72737906d6356a53
    Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/2847100
    Commit-Queue: Charlie Lao <cclao@google.com>
    Reviewed-by: Cody Northrop <cnorthrop@google.com>
    Reviewed-by: Jamie Madill <jmadill@chromium.org>
    

  • Properties

  • Git HTTP https://git.kmx.io/kc3-lang/angle.git
    Git SSH git@git.kmx.io:kc3-lang/angle.git
    Public access ? public
    Description

    A conformant OpenGL ES implementation for Windows, Mac, Linux, iOS and Android.

  • README.md

  • ANGLE Performance Tests

    angle_perftests is a standalone testing suite that contains targeted tests for OpenGL, Vulkan and ANGLE internal classes. The tests currently run on the Chromium ANGLE infrastructure and report results to the Chromium perf dashboard.

    You can also build your own dashboards. For example, a comparison of ANGLE’s back-end draw call performance on Windows NVIDIA can be found at this link. Note that this link is not kept current.

    Running the Tests

    You can follow the usual instructions to check out and build ANGLE. Build the angle_perftests target. Note that all test scores are higher-is-better. You should also ensure is_debug=false in your build. Running with dcheck_always_on or debug validation enabled is not recommended.

    Variance can be a problem when benchmarking. To help mitigate it, we have a test harness that runs a single test in an infinite loop and prints some statistics. See scripts/perf_test_runner.py. To use the script, first compile angle_perftests into a folder with the word Release in its name. Then provide the name of the test as the argument to the script. The script will automatically pick up the most recently built angle_perftests and run it in an infinite loop.
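
    As a rough illustration of the kind of statistics that help when judging variance, the sketch below accumulates scores from repeated runs and reports the mean, sample standard deviation and coefficient of variation. It is not the code in scripts/perf_test_runner.py; the class and function names are hypothetical, and the scores are assumed to be plain numbers collected by the caller.

```python
import math


class RunningStats:
    """Accumulates scores one at a time using Welford's online algorithm."""

    def __init__(self):
        self.count = 0
        self.mean = 0.0
        self._m2 = 0.0  # running sum of squared deviations from the mean

    def add(self, score):
        self.count += 1
        delta = score - self.mean
        self.mean += delta / self.count
        self._m2 += delta * (score - self.mean)

    def stddev(self):
        # Sample standard deviation; reported as 0 until two samples exist.
        if self.count < 2:
            return 0.0
        return math.sqrt(self._m2 / (self.count - 1))

    def coefficient_of_variation(self):
        # Relative spread, useful for comparing noise between tests.
        return self.stddev() / self.mean if self.mean else 0.0


def summarize(scores):
    """Feed an iterable of scores (hypothetical input) into RunningStats."""
    stats = RunningStats()
    for score in scores:
        stats.add(score)
    return stats
```

    Because the accumulator updates incrementally, it suits a loop that never terminates: each new score refines the estimate without storing the full history.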

    Choosing the Test to Run

    You can choose individual tests to run with --gtest_filter=*TestName*. To select a particular ANGLE back-end, add the name of the back-end to the test filter. For example: DrawCallPerfBenchmark.Run/gl or DrawCallPerfBenchmark.Run/d3d11. Many tests have sub-tests that run slightly different code paths. You might need to experiment to find the right sub-test and its name.
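
    To get a feel for how a wildcard filter narrows the test list, the sketch below approximates positive --gtest_filter patterns with Python's fnmatch module. This is only an approximation: it ignores negative patterns (the part after a '-'), and fnmatch additionally treats '[' specially. The test names in the list are hypothetical and chosen for illustration.

```python
from fnmatch import fnmatchcase


def matches_filter(test_name, gtest_filter):
    """Approximate --gtest_filter matching for positive patterns only."""
    # Patterns are separated by ':'; a test is selected if any pattern
    # matches its full name.
    return any(fnmatchcase(test_name, p) for p in gtest_filter.split(":"))


# Hypothetical test names, for illustration only.
names = [
    "DrawCallPerfBenchmark.Run/gl",
    "DrawCallPerfBenchmark.Run/gl_null",
    "DrawCallPerfBenchmark.Run/d3d11",
    "UniformsBenchmark.Run/gl",
]

# A wildcard filter that names both the test and the back-end.
selected = [n for n in names if matches_filter(n, "*DrawCallPerfBenchmark*gl*")]

# A filter without wildcards must match the full name exactly.
exact = [n for n in names if matches_filter(n, "DrawCallPerfBenchmark.Run/gl")]
```

    Here the wildcard filter would select both the gl and gl_null entries, while the exact filter selects only the first. This is why some experimentation with sub-test names is often needed.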

    Null/No-op Configurations

    ANGLE implements a no-op driver for OpenGL, D3D11 and Vulkan. To run on these configurations use the gl_null, d3d11_null or vulkan_null test configurations. These null drivers will not do any GPU work. They will skip the driver entirely. These null configs are useful for diagnosing performance overhead in ANGLE code.

    Command-line Arguments

    Several command-line arguments control how the tests run:

    • --one-frame-only: Runs tests once and quickly exits. Used as a quick smoke test.
    • --enable-trace: Write a JSON event log that can be loaded in Chrome.
    • --trace-file file: Name of the JSON event log for --enable-trace.
    • --calibration: Prints the number of steps a test runs in a fixed time. Used by perf_test_runner.py.
    • --steps-per-trial x: Fixed number of steps to run for each test trial.
    • --max-steps-performed x: Upper limit on the total number of steps for the entire test run.
    • --screenshot-dir dir: Directory to store test screenshots. Only implemented in TracePerfTest.
    • --render-test-output-dir=dir: Equivalent to --screenshot-dir dir.
    • --verbose: Print extra timing information.
    • --warmup-loops x: Number of times to warm up the test before starting timing. Defaults to 3.
    • --no-warmup: Skip warming up the tests. Equivalent to --warmup-loops 0.
    • --calibration-time: Run each test calibration step in a fixed time. Defaults to 1 second.
    • --test-time: Run each test trial in a fixed time. Defaults to 10 seconds.
    • --trials: Number of times to repeat testing. Defaults to 3.
    • --no-finish: Don’t call glFinish after each test trial.
    • --enable-all-trace-tests: Enable offscreen and vsync-limited trace tests, which are disabled by default to reduce test time.
    • --minimize-gpu-work: Modify API calls so that GPU work is reduced to a minimum.

    For example, for an endless run with no warmup, run:

    angle_perftests --gtest_filter=TracePerfTest.Run/vulkan_trex_200 --steps 1000000 --no-warmup

    The command-line argument implementations are located in ANGLEPerfTestArgs.cpp.
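
    When scripting repeated runs, it can be convenient to assemble the argument list programmatically. The sketch below builds an argv list from a few of the flags listed above. The helper name, its parameters and the ./angle_perftests path are assumptions made for illustration; only flags whose syntax appears in the list are used.

```python
def build_command(binary, test_filter, steps_per_trial=None,
                  warmup=True, minimize_gpu_work=False, verbose=False):
    """Assemble an argv list for angle_perftests (hypothetical helper)."""
    cmd = [binary, f"--gtest_filter={test_filter}"]
    if steps_per_trial is not None:
        # Documented above as "--steps-per-trial x".
        cmd += ["--steps-per-trial", str(steps_per_trial)]
    if not warmup:
        cmd.append("--no-warmup")
    if minimize_gpu_work:
        cmd.append("--minimize-gpu-work")
    if verbose:
        cmd.append("--verbose")
    return cmd


cmd = build_command(
    "./angle_perftests",  # assumed location of the built binary
    "TracePerfTest.Run/vulkan_trex_200",
    warmup=False,
    minimize_gpu_work=True,
)
print(" ".join(cmd))
# To launch the run, pass the list to subprocess.run(cmd, check=True).
```

    Passing the arguments as a list rather than a single string avoids shell quoting problems with filters that contain wildcard characters.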

    Test Breakdown

    • DrawCallPerfBenchmark: Runs a tight loop around DrawArrays calls.
      • validation_only: Skips all rendering.
      • render_to_texture: Render to a user Framebuffer instead of the default FBO.
      • vbo_change: Applies a Vertex Array change between each draw.
      • tex_change: Applies a Texture change between each draw.
    • UniformsBenchmark: Tests the performance of updating various uniform counts followed by a DrawArrays call.
      • vec4: Tests vec4 Uniforms.
      • matrix: Tests using Matrix uniforms instead of vec4.
      • multiprogram: Tests switching Programs between updates and draws.
      • repeating: Skip the update of uniforms before each draw call.
    • DrawElementsPerfBenchmark: Similar to DrawCallPerfBenchmark but for indexed DrawElements calls.
    • BindingsBenchmark: Tests Buffer binding performance. Does no draw call operations.
      • 100_objects_allocated_every_iteration: Tests repeated glBindBuffer with new buffers allocated each iteration.
      • 100_objects_allocated_at_initialization: Tests repeated glBindBuffer with the same objects each iteration.
    • TexSubImageBenchmark: Tests glTexSubImage update performance.
    • BufferSubDataBenchmark: Tests glBufferSubData update performance.
    • TextureSamplingBenchmark: Tests Texture sampling performance.
    • TextureBenchmark: Tests Texture state change performance.
    • LinkProgramBenchmark: Tests performance of glLinkProgram.
    • glmark2: Runs the glmark2 benchmark.
    • TracePerfTest: Runs replays of restricted traces, which are not publicly available. To enable these tests, read more in RestrictedTraceTests.

    Many other tests exist and are documented in their classes.