Commit b5b2612edfc4ddc8aca28f88546898261ee723ef

Roman Lavrov 2025-01-13T12:36:39

Tune branching in getProgramResolveLink()

Hint the likely/unlikely branches in related calls.

Notably, mShaderProgramManager->getProgram() has a flat resource array
and a fallback to an absl::flat_hash_map. As observed in
driver_overhead_2 trace-based PGO builds, the fallback gets un-inlined
by PGO (presumably due to being hit rarely) and becomes a function call.
Regular builds without the tuning in this CL inline the flat_hash_map
implementation, increasing the code size / worsening locality for a
fallback case.

This change makes the Context::useProgram() aarch64 assembly in regular
builds very close to that of the driver_overhead_2-based PGO build, and
the code size goes down from 576 to 256 bytes.

The total reduction of the .so size is 36KB (0.6%), likely due to all
the cases where the inlining is avoided by hinting.

There appears to be a ~1% perf improvement in driver_overhead_2 trace
wall_time in my tests on a couple of Android devices. Hard to tell if
this is due to the improved code locality or some other aspect of the
change in assembly.

Bug: b/383305597
Change-Id: I85c02cc74a56e7074086965e8d31018bd9ee0040
Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/6169263
Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org>
Commit-Queue: Roman Lavrov <romanl@google.com>
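
The pattern the commit message describes (a flat array on the hot path, a hash-map fallback on the cold path, and a branch hint between them) can be sketched roughly as below. This is an illustrative sketch, not ANGLE's code: SketchResourceMap, Program, querySlow and the SKETCH_LIKELY macro are hypothetical names, and std::unordered_map stands in for absl::flat_hash_map to keep the example dependency-free. The hint only tells the optimizer which branch is expected; whether the fallback ends up as an out-of-line call is still the compiler's decision.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <unordered_map>
#include <vector>

// Hypothetical hint macro for this sketch. __builtin_expect is a GCC/Clang
// builtin; on other compilers the macro degrades to a plain condition.
#if defined(__GNUC__) || defined(__clang__)
#    define SKETCH_LIKELY(x) __builtin_expect(!!(x), 1)
#else
#    define SKETCH_LIKELY(x) (x)
#endif

struct Program
{
    int id;
};

// Hypothetical container: small handles live in a flat array, the rest in a
// hash map (std::unordered_map here, assumed in place of absl::flat_hash_map).
class SketchResourceMap
{
  public:
    explicit SketchResourceMap(size_t flatSize) : mFlat(flatSize, nullptr) {}

    void assign(uint32_t handle, Program *program)
    {
        assert(program != nullptr);
        if (handle < mFlat.size())
        {
            mFlat[handle] = program;
        }
        else
        {
            mHashed[handle] = program;
        }
    }

    // Hot path: one bounds check and one indexed load. The hint marks the
    // flat-array branch as the expected one, so the compiler has less reason
    // to inline the hash-map lookup into every caller.
    Program *query(uint32_t handle) const
    {
        if (SKETCH_LIKELY(handle < mFlat.size()))
        {
            return mFlat[handle];
        }
        return querySlow(handle);
    }

  private:
    // Cold path: rarely taken hash-map lookup.
    Program *querySlow(uint32_t handle) const
    {
        auto it = mHashed.find(handle);
        return it != mHashed.end() ? it->second : nullptr;
    }

    std::vector<Program *> mFlat;
    std::unordered_map<uint32_t, Program *> mHashed;
};
```

In this sketch, a handle below the flat size resolves through the array, while a larger handle takes the fallback; unknown handles return nullptr on either path.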