kmx git

Commit	Date	Message
d8c15499	2025-05-27T03:46:24	Vulkan: Clear depth and stencil unresolve separately To take into account two situations. 1. LoadOp for depth and stencil attachments are set differently. 2. depth and stencil unresolves could be different between the previous render pass and the current render pass. Bug: angleproject:42266019 Change-Id: I9e069b3972f86abb84eee6280919e6bba2901225 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/6590197 Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Reviewed-by: Yuxin Hu <yuxinhu@google.com> Commit-Queue: Yuxin Hu <yuxinhu@google.com> Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org>
b64d380e	2025-04-15T00:54:22	Vulkan: Ghost texture's image backing if overwritten If the texture's image is in use by the GPU but is overwritten completely, this change releases the old image and creates a fresh one. If the texture was used in a render pass, this avoids breaking the render pass. Otherwise, it allows the new image to be initialized with VK_EXT_host_image_copy functionality. In the very least, an unnecessary barrier is avoided. As a targeted optimization, this functionality is limited to non-array 2D color textures which are known to benefit from this ghosting in applications. Other texture types can be easily added, but need lots of tests to be added. Bug: angleproject:42265356 Change-Id: I18a7587599e36f9f70109264ddc1003b24b8b2df Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/6456345 Reviewed-by: Yuxin Hu <yuxinhu@google.com> Auto-Submit: Shahbaz Youssefi <syoussefi@chromium.org> Reviewed-by: Cody Northrop <cnorthrop@google.com> Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org>
8cf89716	2025-03-14T20:17:07	Vulkan: Remove perFrameWindowSizeQuery feature Feature was enabled for all platforms in order for surface to be resized before acquire next image (not only after swap). Remove it, as if it's always enabled to simplify the code. Bug: angleproject:397848903 Bug: angleproject:42262287 Bug: angleproject:42262286 Bug: angleproject:40096601 Change-Id: I768772e30f5f38f68992e5b82c84430732aa77d9 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/6354166 Commit-Queue: Igor Nazarov <i.nazarov@samsung.com> Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org>
05c491e1	2025-03-15T11:56:07	Vulkan: Optimize GraphicsDriverUniforms update Unless RP is closed there is no need to dirty GraphicsDriverUniforms when the program executable changes. Bug: angleproject:386749841 Test: VulkanPerformanceCounterTest.NoUpdatesToGraphicsDriverUniformsOnProgramChange* Change-Id: Id02e8a17de93e2b73103666fc6cc62ce3cdd8f43 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/6358315 Commit-Queue: mohan maiya <m.maiya@samsung.com> Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org>
1a0c9db3	2025-02-27T10:43:00	Vulkan: Disable monolithic pipeline creation with GPL As it violates OpenGL ES rules. This change also removes Vulkan perf counter tests that attempt to verify that warmed up programs hit the cache... this fails in non-trivial ways especially with graphics pipeline library due to: - Warm up tasks being async, they may finish after the test reads the perf counters to set expectations - Some drivers report a cache miss when fast-linking libraries... but likely they don't even look at the cache (so both hit and miss would have been inaccurate) - There is no 100% guarantee that the warmup really leads to a draw-time cache hit. Things are made worse by https://chromium-review.googlesource.com/c/angle/angle/+/5421594 because we don't necessarily even wait for the warm up tasks if ANGLE's view of the pipeline description doesn't match what was used for warm up (even if internally to the driver some of the state does not affect the binary blobs). Bug: angleproject:42265839 Change-Id: Iaf96e4f64e2187abc666ff07fe1304d7474a0e86 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/6309696 Reviewed-by: Charlie Lao <cclao@google.com> Auto-Submit: Shahbaz Youssefi <syoussefi@chromium.org> Commit-Queue: Amirali Abdolrashidi <abdolrashidi@google.com> Reviewed-by: Amirali Abdolrashidi <abdolrashidi@google.com>
2e65d3d4	2025-02-03T16:26:46	Vulkan: Fine-tune submission for multiple RPs This CL aims to fine-tune submission based on a certain command buffer size. This allows us to perform one submission for multiple smaller render passes. * Added mCommandsPendingSubmissionCount to ContextVk. * In ContextVk::syncState(), preferSubmitAtFBOBoundary is only used if the render pass pending command count exceeds the threshold: kMinCommandCountToSubmit * Currently set to 32. * For now, we still submit if the command is a clear (for example glClearBufferfv()). * For now, we also still submit if the command is an invalidate (for example, glInvalidateFramebuffer()). * In ContextVk::flushImpl(), if the pending command count exceeds the threshold (kMinCommandCountToSubmit) and the device is found to be idle, the work is submitted to keep the device busy. * Modified the following unit test from VulkanPerformanceCounterTest: VerifySubmitCounterForSwitchUserFBOToDirtyUserFBO * Since there is now a minimum command count for submission, the number of draw calls has been changed so that the submission is still issued at the new FBO boundary. * After this CL, life_is_strange shows the following improvements: (From the latest measurements) * +19% wall_time * +38% cpu_time Bug: angleproject:42265052 Change-Id: I18452cc1d39ca7e0ac376f6012974b498153cce8 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/6182927 Reviewed-by: Charlie Lao <cclao@google.com> Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Commit-Queue: Amirali Abdolrashidi <abdolrashidi@google.com>
bd8bc105	2025-02-19T18:08:32	vulkan: disable pipeline cache data serialization for nvidia device. we still see the big cache data issue after driver version 520. rename hasEffectivePipelineCacheSerialization to skipPipelineCacheSerialization. Bug: b/358380399 Change-Id: Idd8354f95c3eb4c2e58678a4cf50c8b6af20f371 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/6284126 Reviewed-by: Yuxin Hu <yuxinhu@google.com> Reviewed-by: Amirali Abdolrashidi <abdolrashidi@google.com> Commit-Queue: Hailin Zhang <hailinzhang@google.com>
055123f8	2025-02-27T10:45:56	Vulkan: Don't maintain SharedCacheKeyManager for BufferBlock Dynamic descriptor type uses the underlying BufferBlock in the descriptorSet. There could be many BufferHelper objects sub-allocated from the same BufferBlock. And each BufferHelper could combine with other buffers to form a descriptorSet. This means the combination for BufferBlock could potentially be very large, in thousands with some app traces like seeing in honkai_star_rail. The overhead of maintaining mDescriptorSetCacheManager for BufferBlock could be too big. In this CL I have chosen to not maintain mDescriptorSetCacheManager in the BufferBlock. The only downside is that when BufferBlock gets destroyed, we will not able to immediately destroy all cached descriptorSets that it is part of. They will still gets evicted later on if needed (see evictStaleDescriptorSets for detail). After this CL, running with all app traces we have, the max vector size of SharedCacheKeyManager::mEmptySlotBits is no more than 2, versus ~70s before the CL. Bug: b/384839847 Bug: b/293297177 Bug: b/237686097 Change-Id: I7c7c91cd0aeacba4145575ac4270b713bf38b742 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/6310837 Reviewed-by: Yuxin Hu <yuxinhu@google.com> Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Commit-Queue: Charlie Lao <cclao@google.com>
ca7e8882	2025-02-26T14:39:47	Account for relaxed precision conversion data loss 1. Add highp precision qualifier for "expect" uniforms 2. Change EXPECT_EQ -> EXPECT_*NEAR when using mediump Bug: angleproject:391002353 Change-Id: I5a75ec81e622718f989f357fc263ae287c2c7dbc Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/6306684 Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org>
febba52a	2024-12-12T17:26:59	Optimize for swap after clear Bug: angleproject:382006939 Change-Id: Ia6b9a53042a1d188dbd5a5f6436f17ca1e389a4e Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/6072416 Reviewed-by: Charlie Lao <cclao@google.com> Reviewed-by: Cody Northrop <cnorthrop@google.com> Commit-Queue: Yuxin Hu <yuxinhu@google.com>
c0ee7b20	2024-12-12T16:49:40	Swap getWidth() and getHeight() if the swapchain is 90 emulate rotated When checking if we need to recreate swapchain, we should swap the getWidth() and getHeight() if Is90DegreeRoration(mEmulatedPreTransform) is true. This is because: When creating swapchain, if Is90DegreeRoration(mEmulatedPreTransform) is true, we store swapped mSurfaceCaps.currentExtent.width and mSurfaceCaps.currentExtent.height in getWidth() and getHeight(), but we use the original mSurfaceCaps.currentExtent.width and mSurfaceCaps.currentExtent.height to create the swapchain. On next acquire, to check if the swapchain property changes, we should swap getWidth() and getHeight() if if Is90DegreeRoration(mEmulatedPreTransform) is true, otherwise we are recreating swapchains when width and height are unchanged. Bug: b/382006939 Change-Id: I1cbe9da2ff5e76602a90963514d2d0d5fbf677e7 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/6090199 Commit-Queue: Yuxin Hu <yuxinhu@google.com> Reviewed-by: Amirali Abdolrashidi <abdolrashidi@google.com> Reviewed-by: Charlie Lao <cclao@google.com>
c75bd915	2024-12-10T23:01:44	Vulkan: Remove asyncCommandQueue It's been years and it never showed an advantage. In the meantime, performance without this feature seems close to native drivers (i.e. the feature has lost its appeal) and it's frequently a source of complication and bugs. Bug: angleproject:42262955 Bug: angleproject:42265241 Bug: angleproject:42265934 Bug: angleproject:42265368 Bug: angleproject:42265738 Bug: angleproject:42266015 Bug: angleproject:377503738 Bug: angleproject:42265678 Bug: angleproject:173004081 Change-Id: Id8d7588fdbc397c28c1dd18aafa1f64cbe77806f Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/6084760 Reviewed-by: Amirali Abdolrashidi <abdolrashidi@google.com> Reviewed-by: mohan maiya <m.maiya@samsung.com> Reviewed-by: Charlie Lao <cclao@google.com> Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org>
d8f6df3e	2024-12-09T09:20:40	Vulkan: Fix assertion in RefCountedEvent::releaseImpl RefCountedEvent uses non-atomic uint32_t for reference counting, so it is not thread safe. To ensure we do not use it in a thread unsafe way, there is an assertion I added in RefCountedEvent::releaseImpl that the release call is not come from async command queue thread (all release calls are expected come from context thread which should be protected by context share group lock). A while ago dynamic rendering is implemented. With dynamic rendering you can't do final layout change in render pass. So the final layout change to Present is added to primary command buffer at the end of RenderPassCommandBufferHelper::flushToPrimary(). And if async command queue is enabled, that flushToPrimary is called from async command queue thread, which triggers the assertion I added in RefCountedEvent::releaseImpl(). This CL releases mCurrentRefCountedEvent for this specific situation (present image + dynamic rendering + asyncCommandQueue) so that we do not hit the assertion. The only downside is that it will force to use pipelineBarrier for this specific situation. But no one ships with asyncCommandQueue enabled, so there is no real concern here as well. Bug: angleproject:382580875 Change-Id: I042e3906db7f5bb7acb299997f8fc7e21b8350b6 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/6072350 Commit-Queue: Charlie Lao <cclao@google.com> Reviewed-by: Yuxin Hu <yuxinhu@google.com> Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org>
e42047f0	2024-10-25T13:50:28	Vulkan: Disable DescriptorSet cache for SwiftShader Performance with swiftShader is not critical and cache code path not making much difference for SwiftShader renderer anyway. This CL disables descriptor set cache for SwiftShader mainly to ensure the code path gets test coverage on CI bots. This code path also ensures that we are tagging ResourceUse on DescriptorSetHelper properly after every use. Otherwise, it is very hard to test with cache enabled code path since object usually won't get destroyed due to cache and any bug associated with this is going to be very hard to debug. This CL has catch such bugs during the descriptor set cache work. Bug: angleproject:372268711 Change-Id: Iee1028f9378cf4f537d897e08746d5cf958f225a Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/6047805 Reviewed-by: Yuxin Hu <yuxinhu@google.com> Commit-Queue: Charlie Lao <cclao@google.com> Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org>
e3b3dd68	2024-11-14T23:00:52	Vulkan: Optimize color clears vs read-only depth/stencil switch When switching to read-only depth/stencil mode, if the aspect that intends to be in read-only mode has a deferred clear, the clear is flushed separately beforehands (as that would be a write operation). Prior to this change, _all_ deferred clears were flushed for simplicity. In this change, only the aspect that is switching modes is cleared, leaving the other aspects free to be optimized as loadOp of the following render pass. Bug: angleproject:378058737 Change-Id: Iba4371590bee99f5022575c09b0d32231562488c Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/6019829 Reviewed-by: Amirali Abdolrashidi <abdolrashidi@google.com> Reviewed-by: Charlie Lao <cclao@google.com> Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org>
adb80cbb	2024-11-12T22:38:38	Vulkan: Optimize read-only depth/stencil switch after clear Prior to this change, if color is cleared then read-only depth/stencil mode is enabled, ContextVk::switchToReadOnlyDepthStencilMode eagerly flushed the deferred clears (starting a render pass). However, if the render pass is marked dirty for any reason afterwards, for example because we want to flush the render pass after a query ends, the render pass that is just started above is unnecessarily closed. In this change, `switchToReadOnlyDepthStencilMode` only detects if a new render pass is needed and marks the appropriate dirt bits. This way, the render pass can only be restarted once. Bug: angleproject:378058737 Change-Id: I83a5ebae6c223882eafea338eeec19895d87e5c1 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/6023414 Reviewed-by: Charlie Lao <cclao@google.com> Reviewed-by: Amirali Abdolrashidi <abdolrashidi@google.com> Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org>
bdebee8c	2024-10-29T17:08:11	Tests: add repro for running out of outside RP serials Bug: b/375661776 Change-Id: I2cd82710bdf5b00a6165ddad6ef21f30150aa5bc Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5977123 Commit-Queue: Roman Lavrov <romanl@google.com> Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org>
6b9d3762	2024-08-22T15:29:00	Vulkan: Optimize full texture clears Currently, a full texture clear (glClearTexImageEXT()) is treated as a special case of a partial clear (glClearTexSubImageEXT() with image dims as the input). However, it can be further optimized by treating it as a clear update. * For full clears from EXT_clear_texture, the clear update path is taken. * It leads to a more optimized path, including the usage of the following APIs: * vkCmdClearColorImage() * vkCmdClearDepthStencilImage() * It uses the following enum: ClearTextureMode * If a partial clear uses the extents for the entire image, it is treated as a full clear. * Updated the method to determine if a texture is renderable in clearSubImageImpl(). * Added perf counter: fullImageClears * Added new unit tests * Single 3D texture full clear (Clear3DSingleFull) * 2D RGB SNORM clear (Clear2DRGB8Snorm) * Added Vulkan perf counter test for 2D and 3D color image clear. * Updated the related skipped tests on Pineapple. Bug: angleproject:42266869 Bug: angleproject:375425839 Change-Id: I12ef3002dee190d7f8f43204f7d3f76e05d0b54f Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5806207 Reviewed-by: Charlie Lao <cclao@google.com> Commit-Queue: Amirali Abdolrashidi <abdolrashidi@google.com> Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org>
1652f8ed	2024-10-17T13:35:39	Vulkan: end2end tests when descriptorSetCache is disabled Some end2end tests are testing specific descriptorSet cache behavior. When cache is disabled, these tests failed. In this CL these perfCounter based tests haven been modified to check total allocation to ensure the descriptorSets are properly reused instead of cache hit/miss. Bug: angleproject:372268711 Change-Id: I1d2f4cfcf622b05cdcb3317c8804416a80e72c48 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/3735732 Commit-Queue: Charlie Lao <cclao@google.com> Reviewed-by: Yuxin Hu <yuxinhu@google.com> Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org>
cc7d0220	2024-07-31T14:22:38	Vulkan: Fix serial mismatch during mid-loop flush Currently, if the total buffer updates to the image surpasses a certain threshold, it results in a flush. However, this can cause discrepencies in the queue serial, which can result in incorrect behavior on some platforms. * Updated flushStagedUpdatesImpl() so that the image serial after applying the updates matches that of the current outside command buffer. * That includes when there is a flush in the middle of the update loop, resulting in submission and new queue serial for the CB. * Added a unit test to check if a large texture can uploaded and deleted after a second small texture is uploaded. * Texture1UploadThenTexture2UploadThenTexture1Delete * Added a unit test for flushing when uploading cubemap textures. Bug: b/351650806 Bug: b/356192937 Change-Id: I7f9b20e4b7fd49115f22081a9733b4d44b740e4a Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5744377 Commit-Queue: Amirali Abdolrashidi <abdolrashidi@google.com> Reviewed-by: Charlie Lao <cclao@google.com> Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org>
46dd6457	2024-06-25T15:56:15	Vulkan: Use DONT_CARE ops for missing D/S aspects Simplifies op tracking with dynamic rendering. Bug: angleproject:42267038 Change-Id: I394c154d94458c470190fea66d82c408e6f33725 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5655873 Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org> Reviewed-by: Charlie Lao <cclao@google.com>
3f572905	2024-06-19T17:46:38	Add basic begin/end support for perf counters The AMD_performance_monitor extension has explicit begin/end calls to capture counters. This was not implemented in ANGLE and the tests were relying on ANGLE always capturing counters (incurring a small overhead). This change does not complete the implementation of that extension, but does add basic support for starting and stopping perf counter measurements. While inactive, most counters are not updated. Bug: angleproject:42267038 Change-Id: I3ff6448b22ca247c217401cb2d76ef4142c9d759 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5639343 Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org> Reviewed-by: Cody Northrop <cnorthrop@google.com> Reviewed-by: Amirali Abdolrashidi <abdolrashidi@google.com>
d193d51b	2024-06-17T22:46:08	Replace issue ids post migration to new issue tracker This change replaces anglebug.com/NNNN links. Bug: None Change-Id: I8ac3aec8d2a8a844b3d7b99fc0a6b2be8da31761 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5637912 Reviewed-by: Geoff Lang <geofflang@chromium.org> Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org>
15c182f9	2024-06-11T09:47:07	Vulkan: remove deferFlushUntilEndRenderPass feature, always on This only applies to Qualcomm chipsets, the feature was already enabled for all other devices. It was previously causing a manhattan 3.0 perf regression on some Qualcomm devices, but my tests on S24 both with ANGLE trace manhattan_31 and running gfxbench manually do not show any obvious regression. It was also not expected that this would result in a regression. As we do not aim to improve perf on older devices, removing the feature altogether so that defers are always enabled. This change resulted in a change in gold images on these traces on pixel 4 bots: pokemon_masters_ex - text was missing and now is rendered street_fighter_iv_ce_frame86 - shadow was missing and now is rendered So it looks like the feature may have been working incorrectly. Bug: b/346378481 Change-Id: I2b0d15b89e11c67dea7c316a42bc807441c43b0a Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5622115 Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Reviewed-by: Charlie Lao <cclao@google.com>
cc4fe098	2024-06-10T15:11:02	Tests: InRenderpassFlushShouldNotBreakRenderpass feature skip Flush is only deferred when deferFlushUntilEndRenderPass feature is enabled. It was intentionally disabled for Qualcomm devices due to perf: https://crrev.com/c/2441667 Bug: angleproject:42265681 Change-Id: I15d580c334a20e9ea58aec418a608f0b07bda2d1 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5617650 Reviewed-by: Charlie Lao <cclao@google.com> Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Commit-Queue: Charlie Lao <cclao@google.com>
4880a6c2	2024-05-03T14:43:20	Enhance VulkanPerformanceCounterTest coverage Enable additional config with PadBuffersToMaxVertexAttribStride feature Bug: b/271915956 Change-Id: I5fe27781fc51cdf477d5c2f453ad05a211788739 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5512875 Commit-Queue: mohan maiya <m.maiya@samsung.com> Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org>
13829f20	2024-03-26T23:03:12	Vulkan: Optimize depth/stencil resolve with glBlitFramebuffer Like color resolve, depth/stencil resolve is now also possibly done by modifying the render pass and attaching a depth/stencil resolve attachment. Bug: angleproject:7551 Change-Id: I045e3875e24006d2473a55b6c3856dd768fe8b84 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5398004 Reviewed-by: Yuxin Hu <yuxinhu@google.com> Reviewed-by: Charlie Lao <cclao@google.com> Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org>
0e9254bd	2024-03-25T16:37:51	Vulkan: Optimize color invalidates By not flushing the render pass when there is an invalidate. Previously, the tracking of invalidation, write to attachments, management of load/store ops, and whether image contents are defined or not have all been unified between color and depth/stencil images. As such, it is possible to not close the render pass when a color image is invalidated just as is not done for depth/stencil images. Together with the optimization to resolve attachments [1], it is now finally possible to efficiently do MSAA rendering with ANGLE. Note that the optimization to use resolve attachments for depth/stencil is not yet implemented. For color only, the perf test added in [2] shows the following improvement on Pixel 6: - Single sampled rendering: ~2.73ms - Resolve + invalidate (before optimizations): ~3.54ms - Resolve + invalidate (after this change): ~2.85ms [1]: https://chromium-review.googlesource.com/c/angle/angle/+/5388492 [2]: https://chromium-review.googlesource.com/c/angle/angle/+/5392548 Bug: angleproject:7551 Change-Id: I008adf9f53df97ab464b0a0399f0b312bf4d0d3f Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5391905 Reviewed-by: Yuxin Hu <yuxinhu@google.com> Reviewed-by: Amirali Abdolrashidi <abdolrashidi@google.com> Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org>
9475ac40	2023-11-15T10:25:06	Vulkan: Make efficient MSAA resolve possible Prior to this change, using a resolve attachment to implement resolve through glBlitFramebuffer was done by temporarily modifying the source FramebufferVk's framebuffer description. This caused a good deal of complexity; enough to require the render pass to be immediately closed after this optimization. The downsides to this are: - Only one attachment can be efficiently resolved - There is no chance for the MSAA attachment to be invalidated In this change, resolve attachments that are added because of glBlitFramebuffer are stored in the command buffer, with the FramebufferVk completely oblivious to them. When the render pass is closed, either the FramebufferVk's original framebuffer object is used (if no resolve attachments are added) or a temporary one is created to include those resolve attachments. With the above method, the render pass is able to accumulate many resolve attachments as well as have its MSAA attachments be invalidated before it is flushed. For a FramebufferVk that is resolved in this way, there used to be two framebuffers created each time and thrown away as the code alternated between starting a render pass without a resolve attachment and then closing with one. With this change, there is now one framebuffer (without resolve attachments) that is cached in FramebufferVk (and is not recreated every time), and only the framebuffer with resolve attachments is recreated every time. Ultimatley, when VK_KHR_dynamic_rendering is implemented in ANGLE, there would be no framebuffers to create and destroy, and this change paves the way for that support too. WindowSurfaceVk framebuffers are still imagefull. Making them imageless adds unnecessary complication with no benefit. ----------------- To achieve efficient MSAA rendering on tiling hardware, applications should do the following: ``` glBindFramebuffer(GL_FRAMEBUFFER, msaaFBO); // Clear the framebuffer to avoid a load // Or invalidate, if not needed to load: // glInvalidateFramebuffer(GL_DRAW_FRAMEBUFFER, ...); glClear(...); // Draw calls // Resolve into the single sampled framebuffer glBindFramebuffer(GL_DRAW_FRAMEBUFFER, resolveFBO); glBlitFramebuffer(...); // Immediately discard the contents of the MSAA buffer, to avoid store glInvalidateFramebuffer(GL_READ_FRAMEBUFFER, ...); ``` The above would translate to the following Vulkan render pass: - MSAA LOAD_OP_CLEAR/DONT_CARE - MSAA STORE_OP_DONT_CARE - Resolve LOAD_OP_DONT_CARE - Resolve STORE_OP_STORE This makes sure the MSAA data doesn't leave the tile memory and greatly reduces bandwidth usage. Once anglebug.com/4892 is fixed, this would also allow the MSAA image to never be allocated either. Bug: angleproject:7551 Bug: angleproject:8625 Change-Id: Ia9f4d20863d76a013d8495033f95c7b39f77e062 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5388492 Reviewed-by: Yuxin Hu <yuxinhu@google.com> Reviewed-by: Amirali Abdolrashidi <abdolrashidi@google.com> Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org>
b3ab67d3	2024-03-14T15:06:02	tests: Remove unnecessary .get() from RAII objects Bug: chromium:40942995 Change-Id: I82509869bce3ad8f51811188fe04267f2de04786 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5370904 Reviewed-by: Geoff Lang <geofflang@chromium.org> Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org>
8346addb	2024-02-06T15:40:31	Contain X11 includes and free usage of common terms This change undoes workarounds where some terms were avoided so there is no clash with X11 (such as Success, Bool and None). In particular, this helps us make sure we never include the X11 headers in such an unconstrained manner as to clash with our code. Bug: angleproject:8520 Change-Id: I53d9657c5a33164064d2c80a206b96fd52f607f1 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5273491 Reviewed-by: Geoff Lang <geofflang@chromium.org> Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org> Reviewed-by: Liza Burakova <liza@chromium.org>
ba65feb4	2023-10-18T17:33:38	Vulkan: Limit mutable texture flush to one update In case there are many updates for a mutable texture, flushing it preemptively can reduce performance, especially if it is done repeatedly. * Added getLevelUpdateCount() to ImageHelper. * Previous mutable textures will now be flushed only if they have exactly one update per mip level/cubemap face (if defined). * This means that mutable textures with no data will also not be flushed. * Added unit tests for single-level texture flushing and situations with no updates or more than one update. Bug: b/285613719 Change-Id: I1592ecf502051a55ebfbb7fcd22577c9ce87bf43 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/4953847 Commit-Queue: Amirali Abdolrashidi <abdolrashidi@google.com> Reviewed-by: Charlie Lao <cclao@google.com> Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org>
b2e6a196	2023-09-11T15:27:20	Vulkan: Use VK_EXT_host_image_copy for texture uploads Of all the scenarios where host image copy may be useful, this is likely the most common case. There are numerous conditions for when the copy may be done on the host: - The image format must support it, - It must be unused by the GPU, - It must not have any pending updates (this can potentially be mitigated if needed), and - It must be in a host-copyable layout. However, many texture uploads are done: - To compressed formats, where support is highly likely, - On init, where: - the image is never previously used, - the image has no previous uploads - the image is in the UNDEFINED layout which satisfies the conditions above. As a result of this change, when the upload is done on the host, creation of a temp buffer is avoided which greatly reduces memory pressure (specially during app loading which is when most texture data is uploaded) and may even improve performance (due to avoiding a double copy). Testing the first 3 frames of the following traces with a SwiftShader implementation shows the amount of buffer allocated for staged uploads changed as such: - Black Desert: 185MB -> 65MB - Genshin Impact: 125MB -> 12MB - Asphalt 9: 138MB -> 0MB Bug: angleproject:8341 Change-Id: Id71dcc4a7a0f8b67960d2d283fe9d19ce7429a03 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/4856676 Reviewed-by: Geoff Lang <geofflang@chromium.org> Reviewed-by: Charlie Lao <cclao@google.com> Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org>
f5986fbb	2023-07-11T12:11:20	Vulkan: Dont break RP if there is actual render feedback loop There is a bit terminology confusion here that will be fixed in next CL. If a depth attachment is read only, then there is no feedback loop, we should not call feedback loop for read only depth attachment. The real depth render feedback loop mode is formed when we write to depth and sample from depth at the same time. In this condition, the content is undefined per OpenGLES spec section 9.3.1 (https://registry.khronos.org/OpenGL/specs/es/3.2/es_spec_3.2.pdf). The shouldSwitchToReadOnlyDepthStencilFeedbackLoopMode() implementation handles the usage case that the same render pass has depth write and then switch to read only. Under this usage there is no actual feedback loop, and we should still work properly by end current render pass and start a new render pass with read only depth attachment. This implementation also treating the actual feedback loop case exactly the same way by ending render pass first, even though this is undefined behavior. gangstar_vegas has the exact this undefined behavior usage case, where it write and sample from depth buffer at the same draw call. Native driver is not ending the render pass but ANGLE currently does. This puts ANGLE into worse performance. Since this is undefined behavior, either way is correct. This CL checks if there is an actual feedback loop in the current render pass and if yes, we adopt the native driver's behavior that keep the current render pass going. This improves gangstar_vegas frame time from 4.365ms to 3.89ms, and interestingly, yield the same golden image. Bug: b/289436017 Change-Id: Ifc04ecd8ad6455a88e8615bd5452b9cce88c6687 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/4679361 Reviewed-by: Yuxin Hu <yuxinhu@google.com> Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Commit-Queue: Charlie Lao <cclao@google.com>
2a08c33b	2023-07-10T17:47:49	Vulkan: Avoid flushCommandsAndEndRenderPass for readonlyDS switch When we switch to read only depth stencil mode, right now we always call flushCommandsAndEndRenderPass, even though the started render pass is empty and loadOp is load. This flush will cause render pass actually submitted and color attachment being cleared and then color attachment gets loaded in the subsequent render pass. In this CL, we only flush if the depthStencil attachment has clear or written This CL save one renderPass for the following app traces: antutu_refinery, aztec_ruins, manhattan_10, manhattan_31. Bug: b/290833623 Change-Id: I13b7a968d797b4c913f1cfbe9677d9b8abe791d2 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/4674087 Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Reviewed-by: Yuxin Hu <yuxinhu@google.com> Commit-Queue: Charlie Lao <cclao@google.com>
2f19bb74	2023-03-16T16:03:29	Reland "Vulkan: Reactivate already started render pass when possible" This is a reland of commit ad9537af7f2bb5e22bc73f4e833fd3789adaa217 Original change's description: > Vulkan: Reactivate already started render pass when possible > > In some usage case (such as lineage_mobile), we are seeing in the middle > of render pass, app switch to another fbo just to issue a glClear() > call, which the clear call itself gets deferred. Application then switch > back to the original frame buffer. Before this CL, the render pass gets > recreated due to frame buffer binding change, even though the clear gets > deferred and new render pass and the previous render pass are > essentially the same. This CL detects this situation and reactivate the > current render pass instead of creating a new one. With this CL, > lineage_m app trace reduces frame time from 3.86ms to 3.7ms, and only > one render pass is used instead of two. > > This CL also allows the render pass started by BlitFramebuffer reused by > subsequent draw calls. Asphalt_9 is hitting this use pattern and this CL > reduces frame time by 0.1245 ms (from 5.6203 ms to 5.4958 on pixel 7 > pro) > > Bug: b/273808966 > Change-Id: I48c2671cbef3ff9d6cf59caae88c37c77828ee07 > Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/4348713 > Reviewed-by: Amirali Abdolrashidi <abdolrashidi@google.com> > Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> > Commit-Queue: Charlie Lao <cclao@google.com> Bug: b/273808966 Change-Id: Ice9062122ae320b1a0108ff981bc65bd13b2ada0 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/4406888 Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Commit-Queue: Charlie Lao <cclao@google.com> Reviewed-by: Amirali Abdolrashidi <abdolrashidi@google.com>
459f0fad	2023-04-06T13:42:49	Vulkan: Force submit when switch to system framebuffer draw Given recent finding on unnecessary wait for acquire image semaphore, since the semaphore wait is per submission, any GPU will have this same performance problem if we only do command submit per frame. This CL forces submission when we switch draw framebuffer to system default framebuffer, so that the semaphore wait will not block user's fbo rendering. Bug: b/275624771 Change-Id: Id6b941870ef296393c13d0daaf81a41b6c042b9a Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/4406882 Reviewed-by: Yuxin Hu <yuxinhu@google.com> Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Commit-Queue: Charlie Lao <cclao@google.com>
d58fbf04	2023-04-05T12:32:09	Vulkan: Wait for surface ANI semaphore only if image is used Right now there is a bug that surface's ANI semaphore is added to context when WindowSurfaceVk::getAttachmentRenderTarget get called, which gets called from FramebufferVk::syncState, which is before we end the previous render pass, due to the endRenderpass usually is deferred until next renderPass starts. This caused ANI semaphore gets added to the previous render pass's submission, which does not use surface, and thus a bubble in GPU execution pipeline where the user FBO rendering gets unnecessarily blocked until ANI semaphore is signaled. This lowers GPU utilization and thus GPU frequency gets dropped and frame time increased. This CL stores ANI semaphore to ImageHelper object and when barrier is generated, the ANI semaphore is moved to CommandBuffer. When CommandBuffer gets flushed and submitted, it gets added to the waitSemaphores vector and submitted to vulkan. Since all use of swap chain image must go through barrier code first (you need at least change layout), this ensures ANI semaphore gets waited in exact and robust way. Only the submission that references the swap chain image will be waited. With this CL, professional_baseball_spirits reduces frame time from 3.8 ms to 2.7 ms, achieving parity with native GLES on pixel 7 pro. Bug: b/275624771 Change-Id: Ifa6cacf9e3bc147bfde54eb7def2fca48c50aca0 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/4400011 Reviewed-by: Yuxin Hu <yuxinhu@google.com> Commit-Queue: Charlie Lao <cclao@google.com> Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org>
25e60197	2023-03-31T14:17:26	Vulkan: Unify buffer alloc strategy for uploads and GPU copies With this change, glCopyBufferSubData uses the same buffer allocation strategy as glBufferSubData. Only exception is with buffer self-copies which never allocate a new buffer for simplicity. Additionally, this change allows glCopyBufferSubData to be done on the CPU if possible, i.e. if the source buffer is not being written to by the GPU and whenever the equivalent glBufferSubData would have used a CPU upload. Bug: b/276002151 Change-Id: Ice8df5891c5516b148245d5d6fa9b19b787df4ce Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/4390023 Reviewed-by: Charlie Lao <cclao@google.com> Reviewed-by: Yuxin Hu <yuxinhu@google.com> Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org>
41f0a321	2023-04-03T21:58:43	Revert "Vulkan: Reactivate already started render pass when possible" This reverts commit ad9537af7f2bb5e22bc73f4e833fd3789adaa217. Reason for revert: Suspected cause for flakiness. anglebug.com/8118 Original change's description: > Vulkan: Reactivate already started render pass when possible > > In some usage case (such as lineage_mobile), we are seeing in the middle > of render pass, app switch to another fbo just to issue a glClear() > call, which the clear call itself gets deferred. Application then switch > back to the original frame buffer. Before this CL, the render pass gets > recreated due to frame buffer binding change, even though the clear gets > deferred and new render pass and the previous render pass are > essentially the same. This CL detects this situation and reactivate the > current render pass instead of creating a new one. With this CL, > lineage_m app trace reduces frame time from 3.86ms to 3.7ms, and only > one render pass is used instead of two. > > This CL also allows the render pass started by BlitFramebuffer reused by > subsequent draw calls. Asphalt_9 is hitting this use pattern and this CL > reduces frame time by 0.1245 ms (from 5.6203 ms to 5.4958 on pixel 7 > pro) > > Bug: b/273808966 > Change-Id: I48c2671cbef3ff9d6cf59caae88c37c77828ee07 > Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/4348713 > Reviewed-by: Amirali Abdolrashidi <abdolrashidi@google.com> > Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> > Commit-Queue: Charlie Lao <cclao@google.com> Bug: b/273808966 Change-Id: I81cc2dcacb52466808b2ccf5819feda466c39fc5 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/4396502 Reviewed-by: Charlie Lao <cclao@google.com> Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org> Bot-Commit: Rubber Stamper <rubber-stamper@appspot.gserviceaccount.com> Auto-Submit: Charlie Lao <cclao@google.com> Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org>
96cda1ac	2023-03-31T15:08:11	Vulkan: Switch to modified framebuffer should trigger submit The feature flag preferSubmitAtFBOBoundary intended to trigger submitCommands call when FBO is switched. Right now there is a bug that when FBO is switched, submitCommands did not get called even if feature is enabled. The reason is that when framebuffer changed, we first get FramebufferVk::syncState call, and if the FBO that we switched to is drity, we will end up calling ContextVk::flushCommandsAndEndRenderPass to immediate end the render pass. The problem with that is that later on when we get to ContextVk::syncState and saw DIRTY_BIT_DRAW_FRAMEBUFFER_BINDING dirty bit and try to issue a submit, we notice render pass already ended, so mHasDeferredFlush never gets set, which means we never call submitCommands. All apps render to system framebuffer in the last render pass, and system framebuffer is always dirty due to swap chain image rotation, we hit this almost with every app. This CL avoid flushCommandsAndEndRenderPass() call from FramebufferVk::syncState. We now relies on ContextVk::onFramebufferChanged (which calls onRenderPassFinished() which will set DIRTY_RENDER_PASS bit to trigger deferred endRenderPass) to end the current render pass. This CL also add a regression test to ensure the submit indeed occur. Bug: b/275624771 Change-Id: I92b95a7a6c435f242d6684cb7852172cf41896c3 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/4390642 Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Commit-Queue: Charlie Lao <cclao@google.com> Reviewed-by: Roman Lavrov <romanl@google.com>
ad9537af	2023-03-16T16:03:29	Vulkan: Reactivate already started render pass when possible In some usage case (such as lineage_mobile), we are seeing in the middle of render pass, app switch to another fbo just to issue a glClear() call, which the clear call itself gets deferred. Application then switch back to the original frame buffer. Before this CL, the render pass gets recreated due to frame buffer binding change, even though the clear gets deferred and new render pass and the previous render pass are essentially the same. This CL detects this situation and reactivate the current render pass instead of creating a new one. With this CL, lineage_m app trace reduces frame time from 3.86ms to 3.7ms, and only one render pass is used instead of two. This CL also allows the render pass started by BlitFramebuffer reused by subsequent draw calls. Asphalt_9 is hitting this use pattern and this CL reduces frame time by 0.1245 ms (from 5.6203 ms to 5.4958 on pixel 7 pro) Bug: b/273808966 Change-Id: I48c2671cbef3ff9d6cf59caae88c37c77828ee07 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/4348713 Reviewed-by: Amirali Abdolrashidi <abdolrashidi@google.com> Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Commit-Queue: Charlie Lao <cclao@google.com>
d6a25bfa	2023-03-07T15:06:10	Vulkan: Optimize glBufferData call to improve storage reuse If app calls glBufferData with certain size, then calls it again with size 0, and then call it again with same old size again, we should try to reuse the existing storage. When size is zero, with the existing logic, we never free the storage. When glBufferData is called third time with the same size as the first glBufferData call, we expect to reuse the existing storage. But because of the storage reuse logic is comparing buffer's new size to the old size (which is 0), we missed the opportunity to reuse the existing storage. This CL update the reuse logic so that it checks the new size against storage's size (instead of OpenGLES buffer's size) and if we will end up with same sized allocation and same pool and memory type, then we reuse instead of reallocate. This reduces efootball_pes_2021 frame time from 4.670 ms to 4.277 ms on pixel 7 pro. Bug: b/271915956 Change-Id: I6f91e3e85b104eca215b28e7d0bea413ecc4401c Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/4317488 Reviewed-by: Amirali Abdolrashidi <abdolrashidi@google.com> Commit-Queue: Charlie Lao <cclao@google.com> Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org>
244e1931	2023-01-17T19:22:10	Vulkan: Fix use of pending Outside RenderPass CommandBuffer. Regression from this commit: 730c127102b540ce2c4ec086b037c8b732706e26 "Vulkan: Submit queue more often for texture data" Bug: angleproject:6354 Test: angle_end2end_tests --gtest_filter=SubmittingOutsideCommandBufferAssertIsOpen Change-Id: I5f72f499cd7153c94c8e5f8a3415df2182726c8e Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/4296802 Commit-Queue: Igor Nazarov <i.nazarov@samsung.com> Reviewed-by: Charlie Lao <cclao@google.com> Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org>
7189e4cf	2023-02-23T11:03:40	vulkan: fix depth buffer renderpass loadOp issue. when change the depth compare function, Renderpass need to change the mAcess flag accordingly. Bug: b/269929460 Change-Id: I83826c1b07c6d22600d6cd039e7d8bfd0b5b39c3 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/4284624 Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Commit-Queue: Hailin Zhang <hailinzhang@google.com>
05e62f39	2023-02-16T23:16:46	Vulkan: Don't close render pass if rebind to same fbo In the Vulkan backend, the render pass can occasionally (and transiently) be in a state of "open but inactive". This is when the render pass is closed, but has the potential for future modifications (for example to add a resolve attachment). Under many circumstances, it is expected that an open render pass cannot be in such a state. This assumption can be broken in this scenario: - Open render pass, draw, etc - Change framebuffer binding - Change framebuffer binding back to original - Masked Clear When ContextVk is synced before clear, it sees that the framebuffer binding is changed (though it hasn't really), and it closes the render passes and sets the render pass dirty bit. If a draw were to follow, a new render pass would have started (unnecessarily). However, in the case of a masked clear, UtilsVk notices that the render pass is started, assumes it must be active, and continues recording to it. While the operation itself succeeds, the assumption that the render pass is active is false (and fails assertion). This change makes sure that framebuffer binding change is no-oped if the framebuffer is the same one that has opened the current render pass. If any application does unnecessary binding changes and back, it will be optimized by this change as well. Bug: chromium:1411210 Change-Id: I37a3a9f2eaa1a81a1b3393840b9458ec71a87377 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/4261215 Reviewed-by: Charlie Lao <cclao@google.com> Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org>
410d8ba5	2022-12-21T13:27:00	Vulkan: Cleanup ContextVk::hasStartedRenderPass APIs ContextVk has a few hasStartedRenderPass APIs which interpret "start" inconsistently. A RenderPassCommands' life should be notStarted, started, requestEnd, and end (which is equivalent to notStarted). When someone calls onRenderPassFinished on a started renderpass, it does not immediate endRenderPass, but it will set DIRTY_BIT_RENDER_PASS dirty bit so that next draw call will trigger endRenderPass and start a new renderPass. We do not have a name for this state, which adds some confusion. This CL renames the stage between start and onRenderPassFinished to be "active" renderpass, when you have mRenderPassCommandBuffer pointer being valid and you can actively adding draw commands into the renderPass. For this purpose, I haves renamed hasStartedRenderPass to hasActiveRenderPass. This CL also simplifies hasStartedRenderPass implementation to only check mRenderPassCommandBuffer and turned mRenderPassCommands.started as assertion. This CL also changes hasStartedRenderPassWithQueueSerial to actually check mRenderPassCommands.started instead of being "active", so that name reflects what it is actually checking. This CL also changed hasStartedRenderPassWithCommands to hasActiveRenderPassWithCommands to make name and implementation consistent. One added benefit of this is that after this CL we now allow load/store optimization on a started but inactive renderPass as well (for example glInvalidateFramebuffer call after glFenceSync call, or invalidate after FBO blit as demonstrated by MultisampleResolveTest.ResolveD32FSamples tests). Bug: angleproject:7903 Bug: angleproject:7551 Change-Id: I8c8ec4c0d54b9ad0a9e373108dfce6b151c8fe0e Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/4119693 Reviewed-by: Yuxin Hu <yuxinhu@google.com> Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Commit-Queue: Charlie Lao <cclao@google.com>
77c95de4	2022-11-16T21:12:28	Vulkan: Threaded monolithic pipeline creation With this change, once a pipeline is created out of libraries, a task is scheduled (if necessary) to asynchronously create a corresponding monolithic pipeline. Once the task is complete, the linked pipeline handle is replaced by the monolithic one, gaining back any performance that might have been lost due to the use of libraries. Bug: angleproject:7369 Change-Id: I525fb1e09f8bedc61b9dbef19f9cce7026ff9c53 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/4031151 Reviewed-by: Jamie Madill <jmadill@chromium.org> Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org> Reviewed-by: Yuxin Hu <yuxinhu@google.com>
f17cb883	2022-11-30T17:23:12	Vulkan: Add two tests for per context queue serial work SubmittingOutsideCommandBufferTriggersEndRenderPass: This test is added to test outside command buffer uploads that triggers endRenderPass works properly. CreateMultiSharedContextAndDraw: This test is added to test draw with shared vertex buffer in the shared context group works properly. Bug: b/255414841 Change-Id: I8b4f343fe220a9f0b7c6e042f4663e23ae6f4c9d Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/4064148 Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Reviewed-by: Yuxin Hu <yuxinhu@google.com> Commit-Queue: Charlie Lao <cclao@google.com>
7f4caaf5	2022-11-23T15:40:53	Vulkan: Fix VulkanPerformanceCounterTest.SubmittingOutsideCom VulkanPerformanceCounterTest.SubmittingOutsideCommandBufferDoesNotCollectRenderPassGarbage depends on the implementation detail on how we flush and submit commands. The recent change crrev.com/c/4038095 fixes one issue that we are now having one less submission on pixel 6 device. This CL adjust the test to account for that. This CL also changed to set mHasDeferredFlush to true only when there is a started renderpass upon FBO bind. This CL also opt in swiftshader into preferSubmitAtFBOBoundary feature for test coverage and ease of debugging since ARM GPU (which enables this flag) is not been tested on CI. Bug: b/255414841 Change-Id: I295cec33a8ca257a5d5a98604b8c4c0c29e97cdf Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/4054101 Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Reviewed-by: Yuxin Hu <yuxinhu@google.com> Commit-Queue: Charlie Lao <cclao@google.com>
68b47e58	2022-11-16T10:46:59	Vulkan: Initial support for VK_EXT_graphics_pipeline_library When available, this change uses VK_EXT_graphics_pipeline_library to create pipelines. Currently, it is only used when graphicsPipelineLibraryFastLinking is available. This restricts the use of this extension to devices where monolithic pipelines are not any more performant than linked libraries. A future change adds support for other implementations by providing async pipeline creation. Bug: angleproject:7369 Change-Id: I1e3b7ac4aa56e75c7d6f4d0d5ea91cb0b862e581 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/4031489 Reviewed-by: Charlie Lao <cclao@google.com> Reviewed-by: Amirali Abdolrashidi <abdolrashidi@google.com> Reviewed-by: Steven Noonan <steven@valvesoftware.com> Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org>
9b5fff82	2022-10-05T21:56:00	Vulkan: Emulate shader stencil export for MSRTT The MSRTT emulation code had one corner case issue that could lead to performance and memory inefficiencies. That is when stencil needs to be unresolved and VK_EXT_shader_stencil_export is not supported. This change adds a path to emulate VK_EXT_shader_stencil_export and removes this inefficiency. This should help Chromium on older Android devices that lack both this and the recent VK_EXT_multisampled_render_to_single_sampled extensions. Chromium frequently breaks the render pass (crbug.com/1336981), which easily leads to this situation. Bug: angleproject:4836 Change-Id: Ifceec43f7f3807b7e32f4b379edcd4351ae76414 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/3935892 Reviewed-by: Yuxin Hu <yuxinhu@google.com> Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org> Reviewed-by: Charlie Lao <cclao@google.com>
76f377c5	2022-06-17T16:05:16	Vulkan: Break renderpass when switch from query to non-query getQueryResult will wait for query result to be available, which means a potential CPU bubble if the result is not yet available. On tiler GPUs it will at least wait for renderpass to complete. Usually query enabled draws are very tiny (usually just draw a point to see if it is occluded or not), and query disabled draws are expensive. Some apps do issue a glFlush when switch from query draw to non-query draw, but app like dead_by_daylight does not issue such flush. In order to reduce the bubble, this CL ends renderpass and issue a flush when we switch from query enabled draws to non-query enabled draw so that the result will be available much earlier, this reduce the CPU bubble. This result in dead_by_daylight frame time improves from 5.45ms to 3.5ms (35% improvement). Bug: b/250706693 Change-Id: Ia3a32a9fb336e6f256809b3cad83f61a45415fb1 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/3931739 Reviewed-by: Yuxin Hu <yuxinhu@google.com> Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Auto-Submit: Charlie Lao <cclao@google.com> Commit-Queue: Charlie Lao <cclao@google.com>
c19ec948	2022-08-23T10:43:59	Vulkan: Implement imageless framebuffers * Added the attachment image and create info objects to be used for imageless framebuffers created in getFramebuffer(). * New helper class for framebuffers in RenderPassCommandBufferHelper: MaybeImagelessFramebuffer, which includes a framebuffer object, if the framebuffer is imageless, and the image views. This is to make sure that the args for render pass begin info will be correctly set up according to the status of the used framebuffer. * Refactored the collection of attachments in getFramebuffer() into a new function, getAttachmentsAndImagesFromRenderTargets(). It also returns their corresponding ImageHelper* objects used to create the framebuffer (from their image properties). * New struct: RenderTargetInfo; which keeps track of render targets and whether resolve image should be used for the render pass in the form of the enum class RenderTargetImage. * Added a new arg to getFramebuffer(): resolveRenderTargetIn; to use when there is a valid resolveImageViewIn. * Without using the framebuffer cache, we would require to handle the framebuffer destruction by adding it to the garbage instead of releasing it. For example, FramebufferVk::destroy() now adds mCurrentFramebuffer to the garbage. * Added new framebuffer unit tests. * Added tests where two textures with different attributes are bound to the same framebuffer before drawing, one after another. * Added test where a blit occurs from a multisample texture into a non-zero level of a resolve texture, each bound to a separate FBO. * Added a new perf test to compare performance for enabled imageless framebuffers vs disabled. (Credit: cclao) Bug: angleproject:7553 Change-Id: Iacdbd73aaa01cbb0e37abf01ae4892bdfdd4b12f Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/3827644 Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Commit-Queue: Amirali Abdolrashidi <abdolrashidi@google.com> Reviewed-by: Charlie Lao <cclao@google.com>
5d7c4eca	2022-10-02T02:27:27	Vulkan: Don't flush depth/stencil on color blit When syncing the read framebuffer for blit, deferred clears are picked up for the attachments that are not being synced. They are then redeferred so a future command would pick them hopefully as loadOp. This change improves the frame time of Pretty Derby on Pixel 6 by ~23%. Bug: angleproject:7727 Change-Id: Ie7d84c58315cd09204e5229f1ec73605d5a7f639 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/3931973 Reviewed-by: Charlie Lao <cclao@google.com> Reviewed-by: Cody Northrop <cnorthrop@google.com> Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org>
2debd07d	2022-09-21T11:40:18	Automatically query status of features for tests Now tests can skip based on what features exist, compared to what features are explicitly asked for. For example, a test suite may override-enable a (normally disabled) feature that depends on a hardware capability. With this change, it can be skipped if said hardware capability doesn't exist. As a bonus, tests now correctly skip if the feature is overriden through an environment variable. This change also cleans up VulkanPerformanceCounterTest tests which did the same for a number of specific features. Bug: b/243398683 Change-Id: I84f026e3394eab56fd123e02bee72720c7ed94c6 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/3909789 Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org> Reviewed-by: Jamie Madill <jmadill@chromium.org> Reviewed-by: Yuxin Hu <yuxinhu@google.com>
18f90857	2022-09-09T11:28:00	Vulkan: Use DontCare if attachment is invalidated If an attachment is invalidated, there is no need to preserve the old content. NONE means old content is still preserved, DontCare means discard old content. In this case we do want to discard instead of preserve old content. Bug: b/243711628 Change-Id: I242ac86db6993574b5627d61f7185d155beec0ba Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/3888938 Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Auto-Submit: Charlie Lao <cclao@google.com> Commit-Queue: Charlie Lao <cclao@google.com> Reviewed-by: Yuxin Hu <yuxinhu@google.com>
34332f85	2022-09-13T13:54:14	Fix UninstantiatedParameterizedTestSuite errors on iOS. Some test suites are instantiated only on ES31 or Vulkan, which iOS doesn't support. Bug: angleproject:5417 Change-Id: Iea202934edb3804993dabd38f2629d4992eb2095 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/3892013 Reviewed-by: Jamie Madill <jmadill@chromium.org> Commit-Queue: Jamie Madill <jmadill@chromium.org> Commit-Queue: Yuly Novikov <ynovikov@chromium.org> Auto-Submit: Yuly Novikov <ynovikov@chromium.org>
1d04539f	2022-09-06T15:20:32	Fix xfb tests rendering points Some xfb tests render points and verify a coordinate away from the points is unchanged as a means to break the render pass. Due to lack of output to gl_PointSize, these tests are flaky on SwiftShader. Bug: angleproject:7625 Change-Id: I7347516bb755ace87d57df3467c59055f28f1d69 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/3877783 Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org> Reviewed-by: Roman Lavrov <romanl@google.com> Reviewed-by: Jamie Madill <jmadill@chromium.org>
38a38b8d	2022-09-01T17:10:39	Revert "EndXfbAfterRenderPassClosed expectation (0,0) -> (w/2,h/2)" This reverts commit 2dc1c609dea184e5e51a8136df71ae14f4481f52. Reason for revert: Doesn't fix the issue Original change's description: > EndXfbAfterRenderPassClosed expectation (0,0) -> (w/2,h/2) > > Bug: None > Change-Id: I6a8006be39ff8b8208004f533157f27da8e7fe24 > Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/3863143 > Auto-Submit: Roman Lavrov <romanl@google.com> > Commit-Queue: Jamie Madill <jmadill@chromium.org> > Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> > Reviewed-by: Jamie Madill <jmadill@chromium.org> Bug: None Change-Id: Ifbb8f12798c9b5bf1f77f997302114263eceaf75 No-Presubmit: true No-Tree-Checks: true No-Try: true Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/3868935 Auto-Submit: Roman Lavrov <romanl@google.com> Commit-Queue: Rubber Stamper <rubber-stamper@appspot.gserviceaccount.com> Bot-Commit: Rubber Stamper <rubber-stamper@appspot.gserviceaccount.com>
2dc1c609	2022-08-29T15:31:56	EndXfbAfterRenderPassClosed expectation (0,0) -> (w/2,h/2) Bug: None Change-Id: I6a8006be39ff8b8208004f533157f27da8e7fe24 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/3863143 Auto-Submit: Roman Lavrov <romanl@google.com> Commit-Queue: Jamie Madill <jmadill@chromium.org> Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Reviewed-by: Jamie Madill <jmadill@chromium.org>
c6ad305c	2022-08-25T11:53:46	Vulkan: No depth load/store if depthFunc==ALWAYS/NEVER && mask==FALSE If depthFunc is set to always or never pass with depthMask disabled, and the entire render pass is drawing with that state, then there is no need to load or store depth value. Bug: b/243711628 Change-Id: I71d470bda49abc48a4a6e20895b7e056c33fa33a Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/3858143 Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Reviewed-by: Yuxin Hu <yuxinhu@google.com> Commit-Queue: Charlie Lao <cclao@google.com>
7428369a	2022-08-29T17:59:38	Vulkan: Use macros for load/store Op check Use macro instead of inline function for result check so that the correct line number gets print out for the failed check. Bug: b/243711628 Change-Id: I1141f6a63fd01bb9fe0cf5c06b81b378e8acc08e Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/3864347 Reviewed-by: Ian Elliott <ianelliott@google.com> Commit-Queue: Charlie Lao <cclao@google.com> Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org>
9a258281	2022-08-17T17:47:22	Fix submit-count perf counter test on ARM On ARM, the preferSubmitAtFBOBoundary feature causes extra submissions that need to be taken into account. Bug: chromium:1337538 Change-Id: Id545ee3e65fc943aff51ea3721e9c19bc0afd4a5 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/3835168 Auto-Submit: Shahbaz Youssefi <syoussefi@chromium.org> Reviewed-by: Yuxin Hu <yuxinhu@google.com> Commit-Queue: Yuxin Hu <yuxinhu@google.com>
2c351351	2022-08-07T22:31:40	Vulkan: Don't break render pass on read-only buffer updates When uploading to a buffer that is in use by the GPU, we either acquire a new buffer and copy the contents over, or stage the update and do a GPU copy. Ignoring all other conditions, this decision was made based on whether a small or large part of the buffer is being updated; small updates where staged. However, if the current render pass uses the buffer in read-only mode, the staged update would break it (to apply the update). In this change, this situation is detected and the acquire-and-update path is chosen even for small updates. Bug: angleproject:7534 Change-Id: Ie2c0989449dcc7d03695a003cf6f353920f8fb65 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/3812566 Reviewed-by: Charlie Lao <cclao@google.com> Reviewed-by: Amirali Abdolrashidi <abdolrashidi@google.com> Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org>
928c5016	2022-08-04T12:28:12	Vulkan: Fix garbage collection vs outside-RP-only flush In https://chromium-review.googlesource.com/c/angle/angle/+/3379231, an optimization was implemented such that the excessive recorded texture uploads would get flushed early and submitted. This caused a use-after-free bug in the following situation: * Draw with pipeline A * Delete A <--- this puts A in the Context garbage list * Upload a lot of data At this point, the flush threshold could pass and the commands recorded outside of the render pass up to this point would be submitted. Associated with this submission was the current garbage, including pipeline A. However, the render pass that uses pipeline A is still not submitted. Now if after some time the render pass is still open, but the "completed commands" are checked (another set of uploads causing another submission, a query status check, etc), the garbage can be cleaned up. When the render pass closes next and is submitted, the implementation attempts to use the pipeline, which is already deleted. In this change, outside-render-pass-only submissions no longer reference the current garbage. This has the side effect that the temporary buffers used for uploading texture data won't be released early. A future optimization may want to separate the garbage list in ContextVk to render pass and outside render pass garbage. Bug: chromium:1337538 Change-Id: I4d31edc53916785d44420f4d6b4b2578ca3996e2 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/3812555 Auto-Submit: Shahbaz Youssefi <syoussefi@chromium.org> Reviewed-by: Charlie Lao <cclao@google.com> Commit-Queue: Amirali Abdolrashidi <abdolrashidi@google.com> Reviewed-by: Amirali Abdolrashidi <abdolrashidi@google.com>
553b1334	2022-07-28T23:33:28	Vulkan: fix default msaa framebuffer resolve issue. Bug: b/239217726 Change-Id: I826aad7495814e0a178a586c4cfd5943278cddac Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/3793304 Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Reviewed-by: Charlie Lao <cclao@google.com> Commit-Queue: Jamie Madill <jmadill@chromium.org> Reviewed-by: Jamie Madill <jmadill@chromium.org>
f000215d	2022-07-26T21:16:14	Vulkan: Optimize transform feedback buffer tracking Prior to this CL, if transform feedback was active at the time of render pass closure, its buffers were cached in ContextVk. Later, these buffers were used to close the render pass if they were used for any other reason (such as vertex attribute). However, this meant that the render pass could close unnecessarily if transform feedback was ended right after the render pass is closed. The closure of the render pass was an awkward place to cache the used transform feedback buffers (because at that point, the buffers are actually no longer used). Instead, this change makes sure that the buffers are cached when transform feedback buffers are first used by the render pass, and the cache is cleared at the end of the render pass. Bug: angleproject:4622 Change-Id: I31c0a1e20d48f2e261e2cf37adb0a46db683e6fb Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/3788309 Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org> Auto-Submit: Shahbaz Youssefi <syoussefi@chromium.org> Reviewed-by: Charlie Lao <cclao@google.com> Reviewed-by: Amirali Abdolrashidi <abdolrashidi@google.com>
53d40aed	2022-07-15T15:03:25	Vulkan: Destroy descriptorSet cache when BufferHelper destroyed For atomic counter buffers or other cases, dynamic descriptor is not been used. Right now when such buffer is destroyed, the cache is still lingers around. With this CL, when a new cache entry has been created, we record the cache entry in the BufferHelper. When BufferHelper is destroyed, we also immediately destroy the cache entry since the cache will no longer reused. Bug: b/237686097 Change-Id: I26eee96318fbc003e65318c0b8263dc61092f350 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/3764044 Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Commit-Queue: Charlie Lao <cclao@google.com> Reviewed-by: Yuxin Hu <yuxinhu@google.com>
c7459a46	2022-07-15T09:55:03	Vulkan: Destroy descriptorSet cache when BufferBlock destroyed When a new cache entry has been created, we record the cache entry in the BufferBlock. When BufferBlock is destroyed, we also immediately destroy the cache entry since the cache will no longer reused. This CL also removes DescriptorCacheResult from various APIs since it is now redundant with newSharedCacheKey argument. Bug: b/237686097 Change-Id: I14fa8906fdbe7d9226c8e8ecddef2beb05fbaa5c Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/3756694 Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Reviewed-by: Yuxin Hu <yuxinhu@google.com> Commit-Queue: Charlie Lao <cclao@google.com> Auto-Submit: Charlie Lao <cclao@google.com>
496bddf3	2022-07-14T20:58:03	Skip mutable texture upload tests through feature * Added a condition in the mutable texture upload tests in VulkanPerformanceCounterTest.cpp, to skip the test if the feature `MutableMipmapTextureUpload` is disabled on that platform. Bug: angleproject:7308 Change-Id: Iff1985cabb463dc82ef15340cf3c485a0b680f0b Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/3765180 Commit-Queue: Amirali Abdolrashidi <abdolrashidi@google.com> Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Auto-Submit: Amirali Abdolrashidi <abdolrashidi@google.com>
01092c48	2022-07-12T10:11:22	Vulkan: Destroy descriptorSet cache when shader image is destroyed Similar to texture descriptor set, this applies to images used as shader resource. When a texture is used in a shader resource descriptorSet, we record it. When texture is destroyed, we also destroy that shader resource descriptorSet cache. Bug: b/237686097 Change-Id: I475982fcec45535cc285a4aebca922d01efc7ed2 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/3758884 Reviewed-by: Yuxin Hu <yuxinhu@google.com> Auto-Submit: Charlie Lao <cclao@google.com> Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Commit-Queue: Charlie Lao <cclao@google.com>
32c5fd8a	2022-05-13T14:31:03	Reland "Vulkan: Flush texture updates more often" This is a reland of 8bb7c35c2159de2fa9e9a008679c692edd4402a6 * Added a condition to make sure the previous texture is not immutable when performing the optimization. * Fixed the issue where mipmap textures with unequal dimensions were not flushed. * Added related tests. * Added kEnableMutableMipmapTextureUpload, a flag to enable/disable the feature (enabled by default). Original change's description: > Vulkan: Flush texture updates more often > > * Added a pointer to the previous texture in ShareGroupVk so we can > flush the texture updates once we switch to a new texture. > > * We check if mip levels 0 and 1 are conformant in terms of > size, format and number of samples. > > * As a part of size check, we also check depths if the texture > target is either 3D, 2D array, or cube map array. For the former > two, they have to conform to mip scaling similar to width and > height. For the latter, the depth represents layer-faces and does > not change for mipmaps. > > * Added a test to ensure the pointer to the previous texture is > deleted when the corresponding texture is deleted, so the old value > is not accessed by a future mutable texture. > > * Added tests to make sure the mutable texture is uploaded with > the appropriate mip level attributes, and not uploaded in cases of > size/format inconsistencies, incompleteness, and no base level. > > Bug: b/202744914 > Change-Id: I9c2c1af87a8a49e75d3ad25523436b0cd51a7e81 > Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/3606329 > Reviewed-by: Charlie Lao <cclao@google.com> > Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> > Commit-Queue: Amirali Abdolrashidi <abdolrashidi@google.com> Bug: b/202744914 Change-Id: I2bdbcd0182a57c18c1a18968396251a2e366731b Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/3646959 Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Reviewed-by: Charlie Lao <cclao@google.com> Commit-Queue: Amirali Abdolrashidi <abdolrashidi@google.com>
89e38b57	2022-06-22T15:04:08	Refactor to use ANGLETest vs ANGLETestWithParam Bug: angleproject:6747 Change-Id: I72ad52d0268eae0e1a401f12f3e94cc5efa402f2 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/3719002 Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org> Reviewed-by: Cody Northrop <cnorthrop@google.com>
e50351cb	2022-06-10T22:28:58	Vulkan: Don't close render pass on framebuffer fetch For applications that use framebuffer fetch in the same RP as non-fetch programs, we can save some extra RenderPasses by always creating our RP objects with input attachments enabled. This works almost identically except for needing to use the images in a "GENERAL" layout instead of "COLOR_ATTACHMENT_OPTIMAL". According to partners it is possible to achieve performance parity even with GENERAL layout. To remove any potential negative impacts of using the GENERAL layout, the context enters this always-framebuffer-fetch mode only and as soon as a framebuffer fetch program is created. Applications that don't use framebuffer fetch are thus unaffected. This eliminates 20 render passes in the Genshin Impact trace (out of about 58). On a Pixel 6 the resulting benchmark score speeds up by ~25%. For Real Racing 3, the speed up is ~30%. Based on change by jmadill@chromium.org Bug: angleproject:7375 Change-Id: Ib6c73e95d06229f8545d502b388ee2a55a582323 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/3697308 Reviewed-by: Charlie Lao <cclao@google.com> Reviewed-by: Jamie Madill <jmadill@chromium.org> Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org>
91976352	2022-06-21T15:41:02	Use C++17 attributes instead of custom macros Bug: angleproject:6747 Change-Id: Iad6c7cd8a18d028e01da49b647c5d01af11e0522 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/3718999 Reviewed-by: Cody Northrop <cnorthrop@google.com> Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org> Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org>
97a6e581	2022-05-30T16:50:26	Vulkan: Useful implementation of program binaries ANGLE already serializes the pipeline state for the sake of OES_get_program_binary. This serialization had limited usefulness however, since the Vulkan driver hasn't actually created any pipelines yet (which is a costly part of program creation). Simultaneously, ANGLE deferred Vulkan pipeline creation to draw time, which causes hitching. In this change, a handful of Vulkan pipelines are precreated at link time; those at least that are sure to create different blobs in the pipeline cache (different spec consts or SPIR-V generation). These pipelines are created in the program executable's cache. The cache is then merged into the shared renderer cache (for potential blob reuse by other programs). With this, two goals are achieved: - Most pipelines created at draw time hit the pipeline cache, avoiding costly compilation. - When the program binary is retrieved, the contents of the program executable's pipeline cache is also returned. On reload, the cache is recovered, resulting in faster startup. Bug: angleproject:5881 Change-Id: I46c5451a7d0b16dffd40e44015e094640886880b Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/3671977 Reviewed-by: Jamie Madill <jmadill@chromium.org> Reviewed-by: Charlie Lao <cclao@google.com> Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org>
c5ee5a9c	2022-06-10T10:29:11	Vulkan: Add test CreateDestroyTextureDoesNotIncreaseDescSetCache This adds a test to demonstrate a usage pattern seen with surfaceflinger (see b/234602034 for detailed reproduce steps). With every iteration of notification shade pop up, after all other optimization, we are still seeing four descriptor sets gets allocated. Surfaceflinger is allocating AHB and texture every time and after usage it gets destroyed. This test uses normal texture instead of EGLImage for easy of debugging on linux/windows platform, but it demonstrated the exact same problem with AHB texture. Bug: b/235523746 Change-Id: I7ca1ff13b61ade1449a56d3afc8a84926ad13850 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/3700570 Reviewed-by: Jamie Madill <jmadill@chromium.org> Commit-Queue: Charlie Lao <cclao@google.com> Reviewed-by: Ian Elliott <ianelliott@google.com>
3dfc8004	2022-06-08T14:24:48	Vulkan: Optimize sync followed by swap Previously, inserting a sync object immediately caused a submission. That was done in https://chromium-review.googlesource.com/c/angle/angle/+/3200274 to be able to wait until the sync object is signaled without having to wait for whatever is recorded after it until a flush naturally happens. Some applications issue a glFenceSync right before eglSwapBuffers. The submission incurred by glFenceSync disallowed the optimizations that eglSwapBuffers would have done, leading to performance degradations. This could have been avoided if glFenceSync was issued right after eglSwapBuffers, but that's not the case with a number of applications. In this change, when a fence is inserted: - For EGL sync objects, a submission is issued regardless - For GL sync objects, a submission is issued if there is no render pass open - For GL sync objects, the submission is deferred if there is an open render pass. This is done by marking the render pass closed, and flagging the context as having a deferred flash. If the context that issued the fence sync issues another draw call, the render pass is naturally closed and the submission is performed. If the context that issued the fence sync causes a submission, it would have a chance to modify the render pass before doing so. For example, it could apply swapchain optimizations before swapping, or add a resolve attachment for blit. If the context that issued the fence sync doesn't cause a submission before another context tries to access it (get status, wait, etc), the other context will flush its render pass and cause a submission on its behalf. This is possible because the deferral of submission is done only for GL sync objects, and those are only accessible by other contexts in the same share group. Bug: angleproject:7379 Change-Id: I3dd1c1bfd575206d730dd9ee2e33ba2254318521 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/3695520 Reviewed-by: Charlie Lao <cclao@google.com> Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org> Reviewed-by: Jamie Madill <jmadill@chromium.org>
ee1dd7f4	2022-06-08T13:17:39	Vulkan: Add test for glEGLImageTargetTexture2DOES issue This add a test that repeatedly calling glEGLImageTargetTexture2DOES on the same source EGLImage with the same texture parameters should not causing texture's descriptor set cache to keep growing. This is the usage pattern we are seeing with surfaceflinger. Bug: b/234602034 Change-Id: I38ec0a0b2580b8985c27e8c9f7edf14aa7843023 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/3696677 Reviewed-by: Ian Elliott <ianelliott@google.com> Reviewed-by: Jamie Madill <jmadill@chromium.org> Commit-Queue: Charlie Lao <cclao@google.com>
d655ad29	2022-05-31T14:20:16	Vulkan: Add tests for FramebufferCache growth bugs When texture attached to FBO gets respecified, we shouldn't keep growing FramebufferCache. When texture attached to fbo get glTexParameteri(GL_TEXTURE_SWIZZLE_R) call with the same value, we should also not destroy/recreate framebuffers (in fact should not recreate VkImageView). We ran into this usage pattern on surfaceflinger. When texture attached to fbo get glTexParameteri(GL_TEXTURE_SWIZZLE_R) call with different value, we should also not destroy/recreate framebuffers (in fact should not recreate VkImageView). We ran into this usage pattern on surfaceflinger. Bug: b/234769934 Bug: b/234602034 Change-Id: I9fc881486f95cc3da843f50fa0a8cdcbfd4fc625 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/3681081 Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Reviewed-by: Ian Elliott <ianelliott@google.com> Commit-Queue: Charlie Lao <cclao@google.com>
b0d75fb5	2022-05-31T16:55:23	Vulkan: Use 64-bit counters Some upcoming counters don't fit in 32 bits. Bug: angleproject:5881 Change-Id: I2de8a603cabdb5f7417c29d5f37a50899485d6d3 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/3679488 Commit-Queue: Charlie Lao <cclao@google.com> Auto-Submit: Shahbaz Youssefi <syoussefi@chromium.org> Reviewed-by: Jamie Madill <jmadill@chromium.org> Reviewed-by: Charlie Lao <cclao@google.com>
e56f227d	2022-05-13T08:50:12	Vulkan: Add case: TextureSampleByDrawDispatchDraw This case is used to verify the implicit synchronization when GL executables switch from draw to dispatch. Besides, suppress a VVL on it. Bug: angleproject:7031 Change-Id: Idab68cfd0d4b17685f5eb5b3eec7f2cad12e5877 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/3646927 Reviewed-by: Charlie Lao <cclao@google.com> Commit-Queue: Charlie Lao <cclao@google.com> Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org>
03ccd9cc	2022-05-13T16:12:11	Revert "Vulkan: Flush texture updates more often" This reverts commit 8bb7c35c2159de2fa9e9a008679c692edd4402a6. Reason for revert: crashes tests in linux-rel Example: https://ci.chromium.org/ui/p/chromium/builders/try/linux-rel/1012030/overview Also possible flakiness https://anglebug.com/7308 Repro: out/Debug/bin/run_blink_web_tests fast/canvas/OffscreenCanvas-2d-drawImage.html Original change's description: > Vulkan: Flush texture updates more often > > * Added a pointer to the previous texture in ShareGroupVk so we can > flush the texture updates once we switch to a new texture. > > * We check if mip levels 0 and 1 are conformant in terms of > size, format and number of samples. > > * As a part of size check, we also check depths if the texture > target is either 3D, 2D array, or cube map array. For the former > two, they have to conform to mip scaling similar to width and > height. For the latter, the depth represents layer-faces and does > not change for mipmaps. > > * Added a test to ensure the pointer to the previous texture is > deleted when the corresponding texture is deleted, so the old value > is not accessed by a future mutable texture. > > * Added tests to make sure the mutable texture is uploaded with > the appropriate mip level attributes, and not uploaded in cases of > size/format inconsistencies, incompleteness, and no base level. > > Bug: b/202744914 > Change-Id: I9c2c1af87a8a49e75d3ad25523436b0cd51a7e81 > Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/3606329 > Reviewed-by: Charlie Lao <cclao@google.com> > Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> > Commit-Queue: Amirali Abdolrashidi <abdolrashidi@google.com> Bug: b/202744914 Change-Id: Id51fd4c76d058aa5100ec58ba618098c8f614253 No-Presubmit: true No-Tree-Checks: true No-Try: true Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/3645493 Auto-Submit: Roman Lavrov <romanl@google.com> Commit-Queue: Lingfeng Yang <lfy@google.com> Reviewed-by: Lingfeng Yang <lfy@google.com>
21ad9b3c	2022-04-07T09:57:26	Vulkan: Add generic descriptors for DS cache. With the new design, the descriptor set cache keys include all identifying information needed to reconstruct the update descriptor sets calls except the specific resource handles. The places for the resource handles are held by serials intead. When we miss the cache, we no longer need a second step to then construct the update calls, and can build the update calls directly from the key structures in combination with a list of resource handles. Bug: angleproject:6776 Change-Id: If1660a557585a75e9aa2560d6a38c56b62f555c8 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/3484981 Reviewed-by: Yuxin Hu <yuxinhu@google.com> Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Commit-Queue: Jamie Madill <jmadill@chromium.org>
d8d396db	2022-04-07T09:57:25	Vulkan: Add shared descriptor set caches. This allows programs with the same sets of descriptors to share descriptor sets. Currently there is no cache eviction. This CL adds a new "Meta" class to manage the descriptor set caches. Each shared descriptor pool is unique to a descriptor set layout. The descriptor set cache is moved into the pool class. Now every instance of a descriptor pool in ANGLE has easy access to a descriptor set cache as well. Bug: angleproject:6776 Change-Id: I06982e0349f5a87e4578e769fa356ce8e7ab49f0 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/3424660 Commit-Queue: Jamie Madill <jmadill@chromium.org> Reviewed-by: Yuxin Hu <yuxinhu@google.com> Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org>
8bb7c35c	2022-03-23T19:14:54	Vulkan: Flush texture updates more often * Added a pointer to the previous texture in ShareGroupVk so we can flush the texture updates once we switch to a new texture. * We check if mip levels 0 and 1 are conformant in terms of size, format and number of samples. * As a part of size check, we also check depths if the texture target is either 3D, 2D array, or cube map array. For the former two, they have to conform to mip scaling similar to width and height. For the latter, the depth represents layer-faces and does not change for mipmaps. * Added a test to ensure the pointer to the previous texture is deleted when the corresponding texture is deleted, so the old value is not accessed by a future mutable texture. * Added tests to make sure the mutable texture is uploaded with the appropriate mip level attributes, and not uploaded in cases of size/format inconsistencies, incompleteness, and no base level. Bug: b/202744914 Change-Id: I9c2c1af87a8a49e75d3ad25523436b0cd51a7e81 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/3606329 Reviewed-by: Charlie Lao <cclao@google.com> Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Commit-Queue: Amirali Abdolrashidi <abdolrashidi@google.com>
3d55cf0c	2021-12-30T11:27:26	Vulkan: Optimize the vkImage layout when used as GL_image If one vkImage has been used as GL_image in compute shader and as a GL_texture in fragment shader, no dependencies are needed for the fragment shader and other pre-fragment graphics shaders, like vertex/tess/geom. If we only assign the vkImage layout as writable when running GL executables that have Image Textures, we can specify more precise read-only barriers when running read-only GL executables. Bug: angleproject:6862 Change-Id: Iff37fdce13fea637751899253e535bf3f6663200 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/3366014 Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org> Reviewed-by: Charlie Lao <cclao@google.com>
d075dfe2	2022-05-03T16:25:26	Vulkan: Reduce kMaxBufferToImageCopySize to 64M Bug: b/230538246 Change-Id: Id2ef9c35f74fb6f526744903402562f9354bfcdb Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/3625834 Commit-Queue: Charlie Lao <cclao@google.com> Reviewed-by: Amirali Abdolrashidi <abdolrashidi@google.com> Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org>
3e05b93a	2022-04-12T19:39:19	Vulkan: MSAA swapchain resolve based on renderArea * Updated the MSAA resolve subpass so it can only be performed if the render pass is covering the entire area (e.g., not scissored). * Added test to make sure that the subpass resolve does not occur when the render pass does not cover the entire area. Bug: angleproject:6762 Bug: angleproject:7196 Change-Id: Iac3ab4b655dfeb7bff1348cc5e289a77a4dc0b83 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/3584942 Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Reviewed-by: Jamie Madill <jmadill@chromium.org> Commit-Queue: Amirali Abdolrashidi <abdolrashidi@google.com>
3eb2bcf7	2022-04-27T16:13:04	Vulkan: Fix syncval errors with DONT_CARE for unused attachments DONT_CARE is a write operation for synchronization purposes. ANGLE doesn't synchronize depth/stencil attachments that are not written to, as it uses the read-only layout. This change makes sure LOAD/STORE_OP_NONE are used instead of DONT_CARE for attachments that are not used, even if they don't have defined contents. This allows ANGLE to continue to not do additional synchronization. Bug: angleproject:5371 Bug: angleproject:5962 Bug: angleproject:6411 Bug: angleproject:6584 Change-Id: I539379aa34f6655f00e798e8c4a5c57f40f7a12d Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/3612182 Reviewed-by: Jamie Madill <jmadill@chromium.org> Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org> Reviewed-by: Charlie Lao <cclao@google.com>
7d31a47f	2022-04-23T00:19:15	Vulkan: Optimize away eglSwapBuffers for single buffer surfaces For single buffer surfaces, eglSwapBuffers serves two purposes: - Switch to/from single buffer mode - Implicitly issue a glFlush Simultaneously, for single buffer surfaces, glFlush serves three purposes: - Submit the commands - Call queue present (if necessary) - Throttle the CPU In this mode, ContextVk::flush() already redirects to the surface, calling WindowSurfaceVk::swapImpl() which calls back to ContextVk::flushImpl() (to submit the commands), calls queue present and throttles the CPU. If the application calls eglSwapBuffers(), the exact same thing happens (i.e. WindowSurfaceVk::swapImpl() is called to the same effect). Calling swapImpl() leads to an addition of the corresponding submit serial to the "swap history". The CPU throttling code always throttles the CPU to the serial of two swaps ago. Unnecessary calls to eglSwapBuffers() (when there is no command to be flushed) in single buffer mode would thus lead to the CPU throttled to the end of the last submission, effectively turning into a glFinish(). In this change, eglSwapBuffers() in single buffer mode, when not switching to/from this mode, is redirected to glFlush() as it's functionally equivalent. Simultaneously, ContextVk now tracks whether it has any pending commands for submission at all, and skips glFlush() altogether if there are none. Together, this results in the unnecessary eglSwapBuffers() to become no-op. Bug: b/229908040 Change-Id: I0e3b4a8b7eb4f6b0e0ed22260644825fc67dd330 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/3603841 Reviewed-by: Yiwei Zhang <zzyiwei@chromium.org> Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org> Reviewed-by: Jamie Madill <jmadill@chromium.org>
4aae5815	2022-04-22T13:21:03	Vulkan: Overlay widgets for submission statistics Bug: angleproject:7084 Change-Id: I68e69bda43862f9f2711c25a28dbe4745c19a45c Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/3602832 Reviewed-by: Jamie Madill <jmadill@chromium.org> Reviewed-by: Amirali Abdolrashidi <abdolrashidi@google.com> Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org>
2db718ed	2022-04-21T23:13:02	Vulkan: Skip empty submissions A number of places in ANGLE perform an implicit flush; eglSwapBuffers(), glFenceSync() etc. Sometimes these flushes are unnecessary because there is nothing to submit. Additionally, an application may unnecessarily issue glFlush() with nothing recorded. In this change, empty command buffers are automatically not submitted, optimizing these unnecessary flushes away. Bug: angleproject:7084 Change-Id: Iecb865b6b9ef8045dfecda7b5221874f7031b42e Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/3600837 Reviewed-by: Jamie Madill <jmadill@chromium.org> Reviewed-by: Amirali Abdolrashidi <abdolrashidi@google.com> Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org>
3b38b379	2022-04-20T10:44:24	Vulkan: Add feature avoid HOST_VISIBLE and DEVICE_LOCAL combination Discrete GPUs device local memory usually is not CPU accessible. This adds a feature flag to control that. Fixed bug in BufferVk that when mapRangeImpl is called from angle internal, unmapImpl was using front end mapping parameters that is incorrect. We have to cache the mapping parameters in the backend to hangle the mapRangeImpl/unmapImpl calls from internal. Fixed the test bug in ComputeShaderTest.BufferImageBufferMapWrite that we are calling glMapBufferRange with GL_MAP_READ_BIT but are actually writing to the map pointer. This should result in undefined behavior per spec. Fixed the test bug in GLSLTest.* that VerifyBuffer calls glMapBufferRange, but was giving incorrect length which result in data only been partially copied. This bug was hidden due to previously all buffers are CPU accessible and there is no copy needed. Fixed the test bug in ReadPixelsPBOTest.* and ReadPixelsPBONVTest.* that calls glMapBufferRangeEXT, but was giving incorrect length which result in data only been partially copied. This bug was hidden due to previously all buffers are CPU accessible and there is no copy needed. Added new skipped syncval messages. Because this CL triggers a copyToBuffer call for some of the buffers and that changes the syncval message signature for the same reasons (i.e, feedback loop or synval does not know the exact range of buffer been used for vertex buffers etc). Bug: angleproject:7047 Change-Id: I28c96ae0f23db8e5b51af8259e5b97e12e8b91f2 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/3597711 Reviewed-by: Yuxin Hu <yuxinhu@google.com> Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Commit-Queue: Charlie Lao <cclao@google.com>
b2a1f0d2	2022-04-14T07:58:32	Track total vs per-frame descriptor set counters. This will give more consistent measurements for descriptor set caches and descriptor set allocations. Bug: angleproject:6776 Change-Id: I584b8807ad19f8393ae54cc1d88b319c8f7f9f39 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/3584636 Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Reviewed-by: Yuxin Hu <yuxinhu@google.com> Commit-Queue: Jamie Madill <jmadill@chromium.org>
fcec6904	2022-04-13T14:18:06	Generate feature variable names from display names The json file now only contains the feature display name. The variable name is automaticaly derived. For consistence with Chromium and other Chromium-based projects, the display name is now always snake_case, and that's what's specified in the json files. This also makes camelCase variable name generation trivial (as opposed to the other way around). Feature overrides now accept both snake_case and camelCase names to ensure compatibility with existing scripts. This is done by removing _ and comparing override names with feature names in lower case. Bug: angleproject:6435 Change-Id: I0b6ed2bbf5c312bc4f4be7b3c7d55dbaca2a9886 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/3584630 Reviewed-by: Amirali Abdolrashidi <abdolrashidi@google.com> Reviewed-by: Jamie Madill <jmadill@chromium.org> Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org>
3cea7fcc	2022-03-16T16:33:43	Split Context ResourceUseList to RP Commandbuffers * Added mResourceUseList to each command buffer helper in an effort to move mResourceUseList away from ContextVk. * submitFrameImpl() renamed to submitCommands() * Moved the functions acquireResourceUseList() and onRenderPassFinished() in submitCommands() to the submitFrame functions calling it. Bug: angleproject:7103 Change-Id: I2487d5b86ea0a4d504f283aa7128501651317fe0 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/3531368 Reviewed-by: Charlie Lao <cclao@google.com> Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Commit-Queue: Amirali Abdolrashidi <abdolrashidi@google.com>
607d398e	2022-03-14T16:32:21	Vulkan: Optimize resolve of multisample swapchains * Resolves the multisampled image if the last render pass draws into the default framebuffer. * Added test to check the number of resolves in the optimization subpass (credit: Xinyi He) * Added test to check the number of resolves outside the subpass. * Added disabled test to see if the subpass resolve works. Bug: angleproject:6762 Change-Id: I86a8db3387851ab97d5f7a3d8a0ff26961254c14 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/3523062 Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Reviewed-by: Jamie Madill <jmadill@chromium.org> Commit-Queue: Amirali Abdolrashidi <abdolrashidi@google.com>
0ffff9ed	2022-04-05T15:56:23	Vulkan: Perf counters test for glInvalidateSubFramebuffer Bug: angleproject:7183 Change-Id: Id07c6467c746de312d6ba9695bdc98c9460144ca Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/3573182 Reviewed-by: Cody Northrop <cnorthrop@google.com> Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org>

d8c15499

2025-05-27T03:46:24

Vulkan: Clear depth and stencil unresolve separately To take into account two situations. 1. LoadOp for depth and stencil attachments are set differently. 2. depth and stencil unresolves could be different between the previous render pass and the current render pass. Bug: angleproject:42266019 Change-Id: I9e069b3972f86abb84eee6280919e6bba2901225 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/6590197 Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Reviewed-by: Yuxin Hu <yuxinhu@google.com> Commit-Queue: Yuxin Hu <yuxinhu@google.com> Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org>

b64d380e

2025-04-15T00:54:22

Vulkan: Ghost texture's image backing if overwritten If the texture's image is in use by the GPU but is overwritten completely, this change releases the old image and creates a fresh one. If the texture was used in a render pass, this avoids breaking the render pass. Otherwise, it allows the new image to be initialized with VK_EXT_host_image_copy functionality. In the very least, an unnecessary barrier is avoided. As a targeted optimization, this functionality is limited to non-array 2D color textures which are known to benefit from this ghosting in applications. Other texture types can be easily added, but need lots of tests to be added. Bug: angleproject:42265356 Change-Id: I18a7587599e36f9f70109264ddc1003b24b8b2df Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/6456345 Reviewed-by: Yuxin Hu <yuxinhu@google.com> Auto-Submit: Shahbaz Youssefi <syoussefi@chromium.org> Reviewed-by: Cody Northrop <cnorthrop@google.com> Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org>

8cf89716

2025-03-14T20:17:07

Vulkan: Remove perFrameWindowSizeQuery feature Feature was enabled for all platforms in order for surface to be resized before acquire next image (not only after swap). Remove it, as if it's always enabled to simplify the code. Bug: angleproject:397848903 Bug: angleproject:42262287 Bug: angleproject:42262286 Bug: angleproject:40096601 Change-Id: I768772e30f5f38f68992e5b82c84430732aa77d9 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/6354166 Commit-Queue: Igor Nazarov <i.nazarov@samsung.com> Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org>

05c491e1

2025-03-15T11:56:07

Vulkan: Optimize GraphicsDriverUniforms update Unless RP is closed there is no need to dirty GraphicsDriverUniforms when the program executable changes. Bug: angleproject:386749841 Test: VulkanPerformanceCounterTest.NoUpdatesToGraphicsDriverUniformsOnProgramChange* Change-Id: Id02e8a17de93e2b73103666fc6cc62ce3cdd8f43 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/6358315 Commit-Queue: mohan maiya <m.maiya@samsung.com> Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org>

1a0c9db3

2025-02-27T10:43:00

Vulkan: Disable monolithic pipeline creation with GPL As it violates OpenGL ES rules. This change also removes Vulkan perf counter tests that attempt to verify that warmed up programs hit the cache... this fails in non-trivial ways especially with graphics pipeline library due to: - Warm up tasks being async, they may finish after the test reads the perf counters to set expectations - Some drivers report a cache miss when fast-linking libraries... but likely they don't even look at the cache (so both hit and miss would have been inaccurate) - There is no 100% guarantee that the warmup really leads to a draw-time cache hit. Things are made worse by https://chromium-review.googlesource.com/c/angle/angle/+/5421594 because we don't necessarily even wait for the warm up tasks if ANGLE's view of the pipeline description doesn't match what was used for warm up (even if internally to the driver some of the state does not affect the binary blobs). Bug: angleproject:42265839 Change-Id: Iaf96e4f64e2187abc666ff07fe1304d7474a0e86 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/6309696 Reviewed-by: Charlie Lao <cclao@google.com> Auto-Submit: Shahbaz Youssefi <syoussefi@chromium.org> Commit-Queue: Amirali Abdolrashidi <abdolrashidi@google.com> Reviewed-by: Amirali Abdolrashidi <abdolrashidi@google.com>

2e65d3d4

2025-02-03T16:26:46

Vulkan: Fine-tune submission for multiple RPs This CL aims to fine-tune submission based on a certain command buffer size. This allows us to perform one submission for multiple smaller render passes. * Added mCommandsPendingSubmissionCount to ContextVk. * In ContextVk::syncState(), preferSubmitAtFBOBoundary is only used if the render pass pending command count exceeds the threshold: kMinCommandCountToSubmit * Currently set to 32. * For now, we still submit if the command is a clear (for example glClearBufferfv()). * For now, we also still submit if the command is an invalidate (for example, glInvalidateFramebuffer()). * In ContextVk::flushImpl(), if the pending command count exceeds the threshold (kMinCommandCountToSubmit) and the device is found to be idle, the work is submitted to keep the device busy. * Modified the following unit test from VulkanPerformanceCounterTest: VerifySubmitCounterForSwitchUserFBOToDirtyUserFBO * Since there is now a minimum command count for submission, the number of draw calls has been changed so that the submission is still issued at the new FBO boundary. * After this CL, life_is_strange shows the following improvements: (From the latest measurements) * +19% wall_time * +38% cpu_time Bug: angleproject:42265052 Change-Id: I18452cc1d39ca7e0ac376f6012974b498153cce8 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/6182927 Reviewed-by: Charlie Lao <cclao@google.com> Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Commit-Queue: Amirali Abdolrashidi <abdolrashidi@google.com>

bd8bc105

2025-02-19T18:08:32

vulkan: disable pipeline cache data serialization for nvidia device. we still see the big cache data issue after driver version 520. rename hasEffectivePipelineCacheSerialization to skipPipelineCacheSerialization. Bug: b/358380399 Change-Id: Idd8354f95c3eb4c2e58678a4cf50c8b6af20f371 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/6284126 Reviewed-by: Yuxin Hu <yuxinhu@google.com> Reviewed-by: Amirali Abdolrashidi <abdolrashidi@google.com> Commit-Queue: Hailin Zhang <hailinzhang@google.com>

055123f8

2025-02-27T10:45:56

Vulkan: Don't maintain SharedCacheKeyManager for BufferBlock Dynamic descriptor type uses the underlying BufferBlock in the descriptorSet. There could be many BufferHelper objects sub-allocated from the same BufferBlock. And each BufferHelper could combine with other buffers to form a descriptorSet. This means the combination for BufferBlock could potentially be very large, in thousands with some app traces like seeing in honkai_star_rail. The overhead of maintaining mDescriptorSetCacheManager for BufferBlock could be too big. In this CL I have chosen to not maintain mDescriptorSetCacheManager in the BufferBlock. The only downside is that when BufferBlock gets destroyed, we will not able to immediately destroy all cached descriptorSets that it is part of. They will still gets evicted later on if needed (see evictStaleDescriptorSets for detail). After this CL, running with all app traces we have, the max vector size of SharedCacheKeyManager::mEmptySlotBits is no more than 2, versus ~70s before the CL. Bug: b/384839847 Bug: b/293297177 Bug: b/237686097 Change-Id: I7c7c91cd0aeacba4145575ac4270b713bf38b742 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/6310837 Reviewed-by: Yuxin Hu <yuxinhu@google.com> Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Commit-Queue: Charlie Lao <cclao@google.com>

ca7e8882

2025-02-26T14:39:47

Account for relaxed precision conversion data loss 1. Add highp precision qualifier for "expect" uniforms 2. Change EXPECT_EQ -> EXPECT_*NEAR when using mediump Bug: angleproject:391002353 Change-Id: I5a75ec81e622718f989f357fc263ae287c2c7dbc Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/6306684 Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org>

febba52a

2024-12-12T17:26:59

Optimize for swap after clear Bug: angleproject:382006939 Change-Id: Ia6b9a53042a1d188dbd5a5f6436f17ca1e389a4e Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/6072416 Reviewed-by: Charlie Lao <cclao@google.com> Reviewed-by: Cody Northrop <cnorthrop@google.com> Commit-Queue: Yuxin Hu <yuxinhu@google.com>

c0ee7b20

2024-12-12T16:49:40

Swap getWidth() and getHeight() if the swapchain is 90 emulate rotated When checking if we need to recreate swapchain, we should swap the getWidth() and getHeight() if Is90DegreeRoration(mEmulatedPreTransform) is true. This is because: When creating swapchain, if Is90DegreeRoration(mEmulatedPreTransform) is true, we store swapped mSurfaceCaps.currentExtent.width and mSurfaceCaps.currentExtent.height in getWidth() and getHeight(), but we use the original mSurfaceCaps.currentExtent.width and mSurfaceCaps.currentExtent.height to create the swapchain. On next acquire, to check if the swapchain property changes, we should swap getWidth() and getHeight() if if Is90DegreeRoration(mEmulatedPreTransform) is true, otherwise we are recreating swapchains when width and height are unchanged. Bug: b/382006939 Change-Id: I1cbe9da2ff5e76602a90963514d2d0d5fbf677e7 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/6090199 Commit-Queue: Yuxin Hu <yuxinhu@google.com> Reviewed-by: Amirali Abdolrashidi <abdolrashidi@google.com> Reviewed-by: Charlie Lao <cclao@google.com>

c75bd915

2024-12-10T23:01:44

Vulkan: Remove asyncCommandQueue It's been years and it never showed an advantage. In the meantime, performance without this feature seems close to native drivers (i.e. the feature has lost its appeal) and it's frequently a source of complication and bugs. Bug: angleproject:42262955 Bug: angleproject:42265241 Bug: angleproject:42265934 Bug: angleproject:42265368 Bug: angleproject:42265738 Bug: angleproject:42266015 Bug: angleproject:377503738 Bug: angleproject:42265678 Bug: angleproject:173004081 Change-Id: Id8d7588fdbc397c28c1dd18aafa1f64cbe77806f Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/6084760 Reviewed-by: Amirali Abdolrashidi <abdolrashidi@google.com> Reviewed-by: mohan maiya <m.maiya@samsung.com> Reviewed-by: Charlie Lao <cclao@google.com> Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org>

d8f6df3e

2024-12-09T09:20:40

Vulkan: Fix assertion in RefCountedEvent::releaseImpl RefCountedEvent uses non-atomic uint32_t for reference counting, so it is not thread safe. To ensure we do not use it in a thread unsafe way, there is an assertion I added in RefCountedEvent::releaseImpl that the release call is not come from async command queue thread (all release calls are expected come from context thread which should be protected by context share group lock). A while ago dynamic rendering is implemented. With dynamic rendering you can't do final layout change in render pass. So the final layout change to Present is added to primary command buffer at the end of RenderPassCommandBufferHelper::flushToPrimary(). And if async command queue is enabled, that flushToPrimary is called from async command queue thread, which triggers the assertion I added in RefCountedEvent::releaseImpl(). This CL releases mCurrentRefCountedEvent for this specific situation (present image + dynamic rendering + asyncCommandQueue) so that we do not hit the assertion. The only downside is that it will force to use pipelineBarrier for this specific situation. But no one ships with asyncCommandQueue enabled, so there is no real concern here as well. Bug: angleproject:382580875 Change-Id: I042e3906db7f5bb7acb299997f8fc7e21b8350b6 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/6072350 Commit-Queue: Charlie Lao <cclao@google.com> Reviewed-by: Yuxin Hu <yuxinhu@google.com> Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org>

e42047f0

2024-10-25T13:50:28

Vulkan: Disable DescriptorSet cache for SwiftShader Performance with swiftShader is not critical and cache code path not making much difference for SwiftShader renderer anyway. This CL disables descriptor set cache for SwiftShader mainly to ensure the code path gets test coverage on CI bots. This code path also ensures that we are tagging ResourceUse on DescriptorSetHelper properly after every use. Otherwise, it is very hard to test with cache enabled code path since object usually won't get destroyed due to cache and any bug associated with this is going to be very hard to debug. This CL has catch such bugs during the descriptor set cache work. Bug: angleproject:372268711 Change-Id: Iee1028f9378cf4f537d897e08746d5cf958f225a Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/6047805 Reviewed-by: Yuxin Hu <yuxinhu@google.com> Commit-Queue: Charlie Lao <cclao@google.com> Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org>

e3b3dd68

2024-11-14T23:00:52

Vulkan: Optimize color clears vs read-only depth/stencil switch When switching to read-only depth/stencil mode, if the aspect that intends to be in read-only mode has a deferred clear, the clear is flushed separately beforehands (as that would be a write operation). Prior to this change, _all_ deferred clears were flushed for simplicity. In this change, only the aspect that is switching modes is cleared, leaving the other aspects free to be optimized as loadOp of the following render pass. Bug: angleproject:378058737 Change-Id: Iba4371590bee99f5022575c09b0d32231562488c Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/6019829 Reviewed-by: Amirali Abdolrashidi <abdolrashidi@google.com> Reviewed-by: Charlie Lao <cclao@google.com> Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org>

adb80cbb

2024-11-12T22:38:38

Vulkan: Optimize read-only depth/stencil switch after clear Prior to this change, if color is cleared then read-only depth/stencil mode is enabled, ContextVk::switchToReadOnlyDepthStencilMode eagerly flushed the deferred clears (starting a render pass). However, if the render pass is marked dirty for any reason afterwards, for example because we want to flush the render pass after a query ends, the render pass that is just started above is unnecessarily closed. In this change, `switchToReadOnlyDepthStencilMode` only detects if a new render pass is needed and marks the appropriate dirt bits. This way, the render pass can only be restarted once. Bug: angleproject:378058737 Change-Id: I83a5ebae6c223882eafea338eeec19895d87e5c1 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/6023414 Reviewed-by: Charlie Lao <cclao@google.com> Reviewed-by: Amirali Abdolrashidi <abdolrashidi@google.com> Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org>

bdebee8c

2024-10-29T17:08:11

Tests: add repro for running out of outside RP serials Bug: b/375661776 Change-Id: I2cd82710bdf5b00a6165ddad6ef21f30150aa5bc Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5977123 Commit-Queue: Roman Lavrov <romanl@google.com> Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org>

6b9d3762

2024-08-22T15:29:00

Vulkan: Optimize full texture clears Currently, a full texture clear (glClearTexImageEXT()) is treated as a special case of a partial clear (glClearTexSubImageEXT() with image dims as the input). However, it can be further optimized by treating it as a clear update. * For full clears from EXT_clear_texture, the clear update path is taken. * It leads to a more optimized path, including the usage of the following APIs: * vkCmdClearColorImage() * vkCmdClearDepthStencilImage() * It uses the following enum: ClearTextureMode * If a partial clear uses the extents for the entire image, it is treated as a full clear. * Updated the method to determine if a texture is renderable in clearSubImageImpl(). * Added perf counter: fullImageClears * Added new unit tests * Single 3D texture full clear (Clear3DSingleFull) * 2D RGB SNORM clear (Clear2DRGB8Snorm) * Added Vulkan perf counter test for 2D and 3D color image clear. * Updated the related skipped tests on Pineapple. Bug: angleproject:42266869 Bug: angleproject:375425839 Change-Id: I12ef3002dee190d7f8f43204f7d3f76e05d0b54f Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5806207 Reviewed-by: Charlie Lao <cclao@google.com> Commit-Queue: Amirali Abdolrashidi <abdolrashidi@google.com> Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org>

1652f8ed

2024-10-17T13:35:39

Vulkan: end2end tests when descriptorSetCache is disabled Some end2end tests are testing specific descriptorSet cache behavior. When cache is disabled, these tests failed. In this CL these perfCounter based tests haven been modified to check total allocation to ensure the descriptorSets are properly reused instead of cache hit/miss. Bug: angleproject:372268711 Change-Id: I1d2f4cfcf622b05cdcb3317c8804416a80e72c48 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/3735732 Commit-Queue: Charlie Lao <cclao@google.com> Reviewed-by: Yuxin Hu <yuxinhu@google.com> Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org>

cc7d0220

2024-07-31T14:22:38

Vulkan: Fix serial mismatch during mid-loop flush Currently, if the total buffer updates to the image surpasses a certain threshold, it results in a flush. However, this can cause discrepencies in the queue serial, which can result in incorrect behavior on some platforms. * Updated flushStagedUpdatesImpl() so that the image serial after applying the updates matches that of the current outside command buffer. * That includes when there is a flush in the middle of the update loop, resulting in submission and new queue serial for the CB. * Added a unit test to check if a large texture can uploaded and deleted after a second small texture is uploaded. * Texture1UploadThenTexture2UploadThenTexture1Delete * Added a unit test for flushing when uploading cubemap textures. Bug: b/351650806 Bug: b/356192937 Change-Id: I7f9b20e4b7fd49115f22081a9733b4d44b740e4a Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5744377 Commit-Queue: Amirali Abdolrashidi <abdolrashidi@google.com> Reviewed-by: Charlie Lao <cclao@google.com> Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org>

46dd6457

2024-06-25T15:56:15

Vulkan: Use DONT_CARE ops for missing D/S aspects Simplifies op tracking with dynamic rendering. Bug: angleproject:42267038 Change-Id: I394c154d94458c470190fea66d82c408e6f33725 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5655873 Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org> Reviewed-by: Charlie Lao <cclao@google.com>

3f572905

2024-06-19T17:46:38

Add basic begin/end support for perf counters The AMD_performance_monitor extension has explicit begin/end calls to capture counters. This was not implemented in ANGLE and the tests were relying on ANGLE always capturing counters (incurring a small overhead). This change does not complete the implementation of that extension, but does add basic support for starting and stopping perf counter measurements. While inactive, most counters are not updated. Bug: angleproject:42267038 Change-Id: I3ff6448b22ca247c217401cb2d76ef4142c9d759 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5639343 Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org> Reviewed-by: Cody Northrop <cnorthrop@google.com> Reviewed-by: Amirali Abdolrashidi <abdolrashidi@google.com>

d193d51b

2024-06-17T22:46:08

Replace issue ids post migration to new issue tracker This change replaces anglebug.com/NNNN links. Bug: None Change-Id: I8ac3aec8d2a8a844b3d7b99fc0a6b2be8da31761 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5637912 Reviewed-by: Geoff Lang <geofflang@chromium.org> Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org>

15c182f9

2024-06-11T09:47:07

Vulkan: remove deferFlushUntilEndRenderPass feature, always on This only applies to Qualcomm chipsets, the feature was already enabled for all other devices. It was previously causing a manhattan 3.0 perf regression on some Qualcomm devices, but my tests on S24 both with ANGLE trace manhattan_31 and running gfxbench manually do not show any obvious regression. It was also not expected that this would result in a regression. As we do not aim to improve perf on older devices, removing the feature altogether so that defers are always enabled. This change resulted in a change in gold images on these traces on pixel 4 bots: pokemon_masters_ex - text was missing and now is rendered street_fighter_iv_ce_frame86 - shadow was missing and now is rendered So it looks like the feature may have been working incorrectly. Bug: b/346378481 Change-Id: I2b0d15b89e11c67dea7c316a42bc807441c43b0a Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5622115 Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Reviewed-by: Charlie Lao <cclao@google.com>

cc4fe098

2024-06-10T15:11:02

Tests: InRenderpassFlushShouldNotBreakRenderpass feature skip Flush is only deferred when deferFlushUntilEndRenderPass feature is enabled. It was intentionally disabled for Qualcomm devices due to perf: https://crrev.com/c/2441667 Bug: angleproject:42265681 Change-Id: I15d580c334a20e9ea58aec418a608f0b07bda2d1 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5617650 Reviewed-by: Charlie Lao <cclao@google.com> Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Commit-Queue: Charlie Lao <cclao@google.com>

4880a6c2

2024-05-03T14:43:20

Enhance VulkanPerformanceCounterTest coverage Enable additional config with PadBuffersToMaxVertexAttribStride feature Bug: b/271915956 Change-Id: I5fe27781fc51cdf477d5c2f453ad05a211788739 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5512875 Commit-Queue: mohan maiya <m.maiya@samsung.com> Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org>

13829f20

2024-03-26T23:03:12

Vulkan: Optimize depth/stencil resolve with glBlitFramebuffer Like color resolve, depth/stencil resolve is now also possibly done by modifying the render pass and attaching a depth/stencil resolve attachment. Bug: angleproject:7551 Change-Id: I045e3875e24006d2473a55b6c3856dd768fe8b84 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5398004 Reviewed-by: Yuxin Hu <yuxinhu@google.com> Reviewed-by: Charlie Lao <cclao@google.com> Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org>

0e9254bd

2024-03-25T16:37:51

Vulkan: Optimize color invalidates By not flushing the render pass when there is an invalidate. Previously, the tracking of invalidation, write to attachments, management of load/store ops, and whether image contents are defined or not have all been unified between color and depth/stencil images. As such, it is possible to not close the render pass when a color image is invalidated just as is not done for depth/stencil images. Together with the optimization to resolve attachments [1], it is now finally possible to efficiently do MSAA rendering with ANGLE. Note that the optimization to use resolve attachments for depth/stencil is not yet implemented. For color only, the perf test added in [2] shows the following improvement on Pixel 6: - Single sampled rendering: ~2.73ms - Resolve + invalidate (before optimizations): ~3.54ms - Resolve + invalidate (after this change): ~2.85ms [1]: https://chromium-review.googlesource.com/c/angle/angle/+/5388492 [2]: https://chromium-review.googlesource.com/c/angle/angle/+/5392548 Bug: angleproject:7551 Change-Id: I008adf9f53df97ab464b0a0399f0b312bf4d0d3f Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5391905 Reviewed-by: Yuxin Hu <yuxinhu@google.com> Reviewed-by: Amirali Abdolrashidi <abdolrashidi@google.com> Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org>

9475ac40

2023-11-15T10:25:06

Vulkan: Make efficient MSAA resolve possible Prior to this change, using a resolve attachment to implement resolve through glBlitFramebuffer was done by temporarily modifying the source FramebufferVk's framebuffer description. This caused a good deal of complexity; enough to require the render pass to be immediately closed after this optimization. The downsides to this are: - Only one attachment can be efficiently resolved - There is no chance for the MSAA attachment to be invalidated In this change, resolve attachments that are added because of glBlitFramebuffer are stored in the command buffer, with the FramebufferVk completely oblivious to them. When the render pass is closed, either the FramebufferVk's original framebuffer object is used (if no resolve attachments are added) or a temporary one is created to include those resolve attachments. With the above method, the render pass is able to accumulate many resolve attachments as well as have its MSAA attachments be invalidated before it is flushed. For a FramebufferVk that is resolved in this way, there used to be two framebuffers created each time and thrown away as the code alternated between starting a render pass without a resolve attachment and then closing with one. With this change, there is now one framebuffer (without resolve attachments) that is cached in FramebufferVk (and is not recreated every time), and only the framebuffer with resolve attachments is recreated every time. Ultimatley, when VK_KHR_dynamic_rendering is implemented in ANGLE, there would be no framebuffers to create and destroy, and this change paves the way for that support too. WindowSurfaceVk framebuffers are still imagefull. Making them imageless adds unnecessary complication with no benefit. ----------------- To achieve efficient MSAA rendering on tiling hardware, applications should do the following: ``` glBindFramebuffer(GL_FRAMEBUFFER, msaaFBO); // Clear the framebuffer to avoid a load // Or invalidate, if not needed to load: // glInvalidateFramebuffer(GL_DRAW_FRAMEBUFFER, ...); glClear(...); // Draw calls // Resolve into the single sampled framebuffer glBindFramebuffer(GL_DRAW_FRAMEBUFFER, resolveFBO); glBlitFramebuffer(...); // Immediately discard the contents of the MSAA buffer, to avoid store glInvalidateFramebuffer(GL_READ_FRAMEBUFFER, ...); ``` The above would translate to the following Vulkan render pass: - MSAA LOAD_OP_CLEAR/DONT_CARE - MSAA STORE_OP_DONT_CARE - Resolve LOAD_OP_DONT_CARE - Resolve STORE_OP_STORE This makes sure the MSAA data doesn't leave the tile memory and greatly reduces bandwidth usage. Once anglebug.com/4892 is fixed, this would also allow the MSAA image to never be allocated either. Bug: angleproject:7551 Bug: angleproject:8625 Change-Id: Ia9f4d20863d76a013d8495033f95c7b39f77e062 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5388492 Reviewed-by: Yuxin Hu <yuxinhu@google.com> Reviewed-by: Amirali Abdolrashidi <abdolrashidi@google.com> Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org>

b3ab67d3

2024-03-14T15:06:02

tests: Remove unnecessary .get() from RAII objects Bug: chromium:40942995 Change-Id: I82509869bce3ad8f51811188fe04267f2de04786 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5370904 Reviewed-by: Geoff Lang <geofflang@chromium.org> Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org>

8346addb

2024-02-06T15:40:31

Contain X11 includes and free usage of common terms This change undoes workarounds where some terms were avoided so there is no clash with X11 (such as Success, Bool and None). In particular, this helps us make sure we never include the X11 headers in such an unconstrained manner as to clash with our code. Bug: angleproject:8520 Change-Id: I53d9657c5a33164064d2c80a206b96fd52f607f1 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5273491 Reviewed-by: Geoff Lang <geofflang@chromium.org> Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org> Reviewed-by: Liza Burakova <liza@chromium.org>

ba65feb4

2023-10-18T17:33:38

Vulkan: Limit mutable texture flush to one update In case there are many updates for a mutable texture, flushing it preemptively can reduce performance, especially if it is done repeatedly. * Added getLevelUpdateCount() to ImageHelper. * Previous mutable textures will now be flushed only if they have exactly one update per mip level/cubemap face (if defined). * This means that mutable textures with no data will also not be flushed. * Added unit tests for single-level texture flushing and situations with no updates or more than one update. Bug: b/285613719 Change-Id: I1592ecf502051a55ebfbb7fcd22577c9ce87bf43 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/4953847 Commit-Queue: Amirali Abdolrashidi <abdolrashidi@google.com> Reviewed-by: Charlie Lao <cclao@google.com> Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org>

b2e6a196

2023-09-11T15:27:20

Vulkan: Use VK_EXT_host_image_copy for texture uploads Of all the scenarios where host image copy may be useful, this is likely the most common case. There are numerous conditions for when the copy may be done on the host: - The image format must support it, - It must be unused by the GPU, - It must not have any pending updates (this can potentially be mitigated if needed), and - It must be in a host-copyable layout. However, many texture uploads are done: - To compressed formats, where support is highly likely, - On init, where: - the image is never previously used, - the image has no previous uploads - the image is in the UNDEFINED layout which satisfies the conditions above. As a result of this change, when the upload is done on the host, creation of a temp buffer is avoided which greatly reduces memory pressure (specially during app loading which is when most texture data is uploaded) and may even improve performance (due to avoiding a double copy). Testing the first 3 frames of the following traces with a SwiftShader implementation shows the amount of buffer allocated for staged uploads changed as such: - Black Desert: 185MB -> 65MB - Genshin Impact: 125MB -> 12MB - Asphalt 9: 138MB -> 0MB Bug: angleproject:8341 Change-Id: Id71dcc4a7a0f8b67960d2d283fe9d19ce7429a03 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/4856676 Reviewed-by: Geoff Lang <geofflang@chromium.org> Reviewed-by: Charlie Lao <cclao@google.com> Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org>

f5986fbb

2023-07-11T12:11:20

Vulkan: Dont break RP if there is actual render feedback loop There is a bit terminology confusion here that will be fixed in next CL. If a depth attachment is read only, then there is no feedback loop, we should not call feedback loop for read only depth attachment. The real depth render feedback loop mode is formed when we write to depth and sample from depth at the same time. In this condition, the content is undefined per OpenGLES spec section 9.3.1 (https://registry.khronos.org/OpenGL/specs/es/3.2/es_spec_3.2.pdf). The shouldSwitchToReadOnlyDepthStencilFeedbackLoopMode() implementation handles the usage case that the same render pass has depth write and then switch to read only. Under this usage there is no actual feedback loop, and we should still work properly by end current render pass and start a new render pass with read only depth attachment. This implementation also treating the actual feedback loop case exactly the same way by ending render pass first, even though this is undefined behavior. gangstar_vegas has the exact this undefined behavior usage case, where it write and sample from depth buffer at the same draw call. Native driver is not ending the render pass but ANGLE currently does. This puts ANGLE into worse performance. Since this is undefined behavior, either way is correct. This CL checks if there is an actual feedback loop in the current render pass and if yes, we adopt the native driver's behavior that keep the current render pass going. This improves gangstar_vegas frame time from 4.365ms to 3.89ms, and interestingly, yield the same golden image. Bug: b/289436017 Change-Id: Ifc04ecd8ad6455a88e8615bd5452b9cce88c6687 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/4679361 Reviewed-by: Yuxin Hu <yuxinhu@google.com> Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Commit-Queue: Charlie Lao <cclao@google.com>

2a08c33b

2023-07-10T17:47:49

Vulkan: Avoid flushCommandsAndEndRenderPass for readonlyDS switch When we switch to read only depth stencil mode, right now we always call flushCommandsAndEndRenderPass, even though the started render pass is empty and loadOp is load. This flush will cause render pass actually submitted and color attachment being cleared and then color attachment gets loaded in the subsequent render pass. In this CL, we only flush if the depthStencil attachment has clear or written This CL save one renderPass for the following app traces: antutu_refinery, aztec_ruins, manhattan_10, manhattan_31. Bug: b/290833623 Change-Id: I13b7a968d797b4c913f1cfbe9677d9b8abe791d2 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/4674087 Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Reviewed-by: Yuxin Hu <yuxinhu@google.com> Commit-Queue: Charlie Lao <cclao@google.com>

2f19bb74

2023-03-16T16:03:29

Reland "Vulkan: Reactivate already started render pass when possible" This is a reland of commit ad9537af7f2bb5e22bc73f4e833fd3789adaa217 Original change's description: > Vulkan: Reactivate already started render pass when possible > > In some usage case (such as lineage_mobile), we are seeing in the middle > of render pass, app switch to another fbo just to issue a glClear() > call, which the clear call itself gets deferred. Application then switch > back to the original frame buffer. Before this CL, the render pass gets > recreated due to frame buffer binding change, even though the clear gets > deferred and new render pass and the previous render pass are > essentially the same. This CL detects this situation and reactivate the > current render pass instead of creating a new one. With this CL, > lineage_m app trace reduces frame time from 3.86ms to 3.7ms, and only > one render pass is used instead of two. > > This CL also allows the render pass started by BlitFramebuffer reused by > subsequent draw calls. Asphalt_9 is hitting this use pattern and this CL > reduces frame time by 0.1245 ms (from 5.6203 ms to 5.4958 on pixel 7 > pro) > > Bug: b/273808966 > Change-Id: I48c2671cbef3ff9d6cf59caae88c37c77828ee07 > Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/4348713 > Reviewed-by: Amirali Abdolrashidi <abdolrashidi@google.com> > Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> > Commit-Queue: Charlie Lao <cclao@google.com> Bug: b/273808966 Change-Id: Ice9062122ae320b1a0108ff981bc65bd13b2ada0 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/4406888 Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Commit-Queue: Charlie Lao <cclao@google.com> Reviewed-by: Amirali Abdolrashidi <abdolrashidi@google.com>

459f0fad

2023-04-06T13:42:49

Vulkan: Force submit when switch to system framebuffer draw Given recent finding on unnecessary wait for acquire image semaphore, since the semaphore wait is per submission, any GPU will have this same performance problem if we only do command submit per frame. This CL forces submission when we switch draw framebuffer to system default framebuffer, so that the semaphore wait will not block user's fbo rendering. Bug: b/275624771 Change-Id: Id6b941870ef296393c13d0daaf81a41b6c042b9a Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/4406882 Reviewed-by: Yuxin Hu <yuxinhu@google.com> Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Commit-Queue: Charlie Lao <cclao@google.com>

d58fbf04

2023-04-05T12:32:09

Vulkan: Wait for surface ANI semaphore only if image is used Right now there is a bug that surface's ANI semaphore is added to context when WindowSurfaceVk::getAttachmentRenderTarget get called, which gets called from FramebufferVk::syncState, which is before we end the previous render pass, due to the endRenderpass usually is deferred until next renderPass starts. This caused ANI semaphore gets added to the previous render pass's submission, which does not use surface, and thus a bubble in GPU execution pipeline where the user FBO rendering gets unnecessarily blocked until ANI semaphore is signaled. This lowers GPU utilization and thus GPU frequency gets dropped and frame time increased. This CL stores ANI semaphore to ImageHelper object and when barrier is generated, the ANI semaphore is moved to CommandBuffer. When CommandBuffer gets flushed and submitted, it gets added to the waitSemaphores vector and submitted to vulkan. Since all use of swap chain image must go through barrier code first (you need at least change layout), this ensures ANI semaphore gets waited in exact and robust way. Only the submission that references the swap chain image will be waited. With this CL, professional_baseball_spirits reduces frame time from 3.8 ms to 2.7 ms, achieving parity with native GLES on pixel 7 pro. Bug: b/275624771 Change-Id: Ifa6cacf9e3bc147bfde54eb7def2fca48c50aca0 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/4400011 Reviewed-by: Yuxin Hu <yuxinhu@google.com> Commit-Queue: Charlie Lao <cclao@google.com> Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org>

25e60197

2023-03-31T14:17:26

Vulkan: Unify buffer alloc strategy for uploads and GPU copies With this change, glCopyBufferSubData uses the same buffer allocation strategy as glBufferSubData. Only exception is with buffer self-copies which never allocate a new buffer for simplicity. Additionally, this change allows glCopyBufferSubData to be done on the CPU if possible, i.e. if the source buffer is not being written to by the GPU and whenever the equivalent glBufferSubData would have used a CPU upload. Bug: b/276002151 Change-Id: Ice8df5891c5516b148245d5d6fa9b19b787df4ce Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/4390023 Reviewed-by: Charlie Lao <cclao@google.com> Reviewed-by: Yuxin Hu <yuxinhu@google.com> Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org>

41f0a321

2023-04-03T21:58:43

Revert "Vulkan: Reactivate already started render pass when possible" This reverts commit ad9537af7f2bb5e22bc73f4e833fd3789adaa217. Reason for revert: Suspected cause for flakiness. anglebug.com/8118 Original change's description: > Vulkan: Reactivate already started render pass when possible > > In some usage case (such as lineage_mobile), we are seeing in the middle > of render pass, app switch to another fbo just to issue a glClear() > call, which the clear call itself gets deferred. Application then switch > back to the original frame buffer. Before this CL, the render pass gets > recreated due to frame buffer binding change, even though the clear gets > deferred and new render pass and the previous render pass are > essentially the same. This CL detects this situation and reactivate the > current render pass instead of creating a new one. With this CL, > lineage_m app trace reduces frame time from 3.86ms to 3.7ms, and only > one render pass is used instead of two. > > This CL also allows the render pass started by BlitFramebuffer reused by > subsequent draw calls. Asphalt_9 is hitting this use pattern and this CL > reduces frame time by 0.1245 ms (from 5.6203 ms to 5.4958 on pixel 7 > pro) > > Bug: b/273808966 > Change-Id: I48c2671cbef3ff9d6cf59caae88c37c77828ee07 > Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/4348713 > Reviewed-by: Amirali Abdolrashidi <abdolrashidi@google.com> > Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> > Commit-Queue: Charlie Lao <cclao@google.com> Bug: b/273808966 Change-Id: I81cc2dcacb52466808b2ccf5819feda466c39fc5 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/4396502 Reviewed-by: Charlie Lao <cclao@google.com> Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org> Bot-Commit: Rubber Stamper <rubber-stamper@appspot.gserviceaccount.com> Auto-Submit: Charlie Lao <cclao@google.com> Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org>

96cda1ac

2023-03-31T15:08:11

Vulkan: Switch to modified framebuffer should trigger submit The feature flag preferSubmitAtFBOBoundary intended to trigger submitCommands call when FBO is switched. Right now there is a bug that when FBO is switched, submitCommands did not get called even if feature is enabled. The reason is that when framebuffer changed, we first get FramebufferVk::syncState call, and if the FBO that we switched to is drity, we will end up calling ContextVk::flushCommandsAndEndRenderPass to immediate end the render pass. The problem with that is that later on when we get to ContextVk::syncState and saw DIRTY_BIT_DRAW_FRAMEBUFFER_BINDING dirty bit and try to issue a submit, we notice render pass already ended, so mHasDeferredFlush never gets set, which means we never call submitCommands. All apps render to system framebuffer in the last render pass, and system framebuffer is always dirty due to swap chain image rotation, we hit this almost with every app. This CL avoid flushCommandsAndEndRenderPass() call from FramebufferVk::syncState. We now relies on ContextVk::onFramebufferChanged (which calls onRenderPassFinished() which will set DIRTY_RENDER_PASS bit to trigger deferred endRenderPass) to end the current render pass. This CL also add a regression test to ensure the submit indeed occur. Bug: b/275624771 Change-Id: I92b95a7a6c435f242d6684cb7852172cf41896c3 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/4390642 Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Commit-Queue: Charlie Lao <cclao@google.com> Reviewed-by: Roman Lavrov <romanl@google.com>

ad9537af

2023-03-16T16:03:29

Vulkan: Reactivate already started render pass when possible In some usage case (such as lineage_mobile), we are seeing in the middle of render pass, app switch to another fbo just to issue a glClear() call, which the clear call itself gets deferred. Application then switch back to the original frame buffer. Before this CL, the render pass gets recreated due to frame buffer binding change, even though the clear gets deferred and new render pass and the previous render pass are essentially the same. This CL detects this situation and reactivate the current render pass instead of creating a new one. With this CL, lineage_m app trace reduces frame time from 3.86ms to 3.7ms, and only one render pass is used instead of two. This CL also allows the render pass started by BlitFramebuffer reused by subsequent draw calls. Asphalt_9 is hitting this use pattern and this CL reduces frame time by 0.1245 ms (from 5.6203 ms to 5.4958 on pixel 7 pro) Bug: b/273808966 Change-Id: I48c2671cbef3ff9d6cf59caae88c37c77828ee07 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/4348713 Reviewed-by: Amirali Abdolrashidi <abdolrashidi@google.com> Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Commit-Queue: Charlie Lao <cclao@google.com>

d6a25bfa

2023-03-07T15:06:10

Vulkan: Optimize glBufferData call to improve storage reuse If app calls glBufferData with certain size, then calls it again with size 0, and then call it again with same old size again, we should try to reuse the existing storage. When size is zero, with the existing logic, we never free the storage. When glBufferData is called third time with the same size as the first glBufferData call, we expect to reuse the existing storage. But because of the storage reuse logic is comparing buffer's new size to the old size (which is 0), we missed the opportunity to reuse the existing storage. This CL update the reuse logic so that it checks the new size against storage's size (instead of OpenGLES buffer's size) and if we will end up with same sized allocation and same pool and memory type, then we reuse instead of reallocate. This reduces efootball_pes_2021 frame time from 4.670 ms to 4.277 ms on pixel 7 pro. Bug: b/271915956 Change-Id: I6f91e3e85b104eca215b28e7d0bea413ecc4401c Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/4317488 Reviewed-by: Amirali Abdolrashidi <abdolrashidi@google.com> Commit-Queue: Charlie Lao <cclao@google.com> Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org>

244e1931

2023-01-17T19:22:10

Vulkan: Fix use of pending Outside RenderPass CommandBuffer. Regression from this commit: 730c127102b540ce2c4ec086b037c8b732706e26 "Vulkan: Submit queue more often for texture data" Bug: angleproject:6354 Test: angle_end2end_tests --gtest_filter=*SubmittingOutsideCommandBufferAssertIsOpen* Change-Id: I5f72f499cd7153c94c8e5f8a3415df2182726c8e Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/4296802 Commit-Queue: Igor Nazarov <i.nazarov@samsung.com> Reviewed-by: Charlie Lao <cclao@google.com> Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org>

7189e4cf

2023-02-23T11:03:40

vulkan: fix depth buffer renderpass loadOp issue. when change the depth compare function, Renderpass need to change the mAcess flag accordingly. Bug: b/269929460 Change-Id: I83826c1b07c6d22600d6cd039e7d8bfd0b5b39c3 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/4284624 Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Commit-Queue: Hailin Zhang <hailinzhang@google.com>

05e62f39

2023-02-16T23:16:46

Vulkan: Don't close render pass if rebind to same fbo In the Vulkan backend, the render pass can occasionally (and transiently) be in a state of "open but inactive". This is when the render pass is closed, but has the potential for future modifications (for example to add a resolve attachment). Under many circumstances, it is expected that an open render pass cannot be in such a state. This assumption can be broken in this scenario: - Open render pass, draw, etc - Change framebuffer binding - Change framebuffer binding back to original - Masked Clear When ContextVk is synced before clear, it sees that the framebuffer binding is changed (though it hasn't really), and it closes the render passes and sets the render pass dirty bit. If a draw were to follow, a new render pass would have started (unnecessarily). However, in the case of a masked clear, UtilsVk notices that the render pass is started, assumes it must be active, and continues recording to it. While the operation itself succeeds, the assumption that the render pass is active is false (and fails assertion). This change makes sure that framebuffer binding change is no-oped if the framebuffer is the same one that has opened the current render pass. If any application does unnecessary binding changes and back, it will be optimized by this change as well. Bug: chromium:1411210 Change-Id: I37a3a9f2eaa1a81a1b3393840b9458ec71a87377 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/4261215 Reviewed-by: Charlie Lao <cclao@google.com> Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org>

410d8ba5

2022-12-21T13:27:00

Vulkan: Cleanup ContextVk::hasStartedRenderPass APIs ContextVk has a few hasStartedRenderPass APIs which interpret "start" inconsistently. A RenderPassCommands' life should be notStarted, started, requestEnd, and end (which is equivalent to notStarted). When someone calls onRenderPassFinished on a started renderpass, it does not immediate endRenderPass, but it will set DIRTY_BIT_RENDER_PASS dirty bit so that next draw call will trigger endRenderPass and start a new renderPass. We do not have a name for this state, which adds some confusion. This CL renames the stage between start and onRenderPassFinished to be "active" renderpass, when you have mRenderPassCommandBuffer pointer being valid and you can actively adding draw commands into the renderPass. For this purpose, I haves renamed hasStartedRenderPass to hasActiveRenderPass. This CL also simplifies hasStartedRenderPass implementation to only check mRenderPassCommandBuffer and turned mRenderPassCommands.started as assertion. This CL also changes hasStartedRenderPassWithQueueSerial to actually check mRenderPassCommands.started instead of being "active", so that name reflects what it is actually checking. This CL also changed hasStartedRenderPassWithCommands to hasActiveRenderPassWithCommands to make name and implementation consistent. One added benefit of this is that after this CL we now allow load/store optimization on a started but inactive renderPass as well (for example glInvalidateFramebuffer call after glFenceSync call, or invalidate after FBO blit as demonstrated by MultisampleResolveTest.ResolveD32FSamples tests). Bug: angleproject:7903 Bug: angleproject:7551 Change-Id: I8c8ec4c0d54b9ad0a9e373108dfce6b151c8fe0e Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/4119693 Reviewed-by: Yuxin Hu <yuxinhu@google.com> Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Commit-Queue: Charlie Lao <cclao@google.com>

77c95de4

2022-11-16T21:12:28

Vulkan: Threaded monolithic pipeline creation With this change, once a pipeline is created out of libraries, a task is scheduled (if necessary) to asynchronously create a corresponding monolithic pipeline. Once the task is complete, the linked pipeline handle is replaced by the monolithic one, gaining back any performance that might have been lost due to the use of libraries. Bug: angleproject:7369 Change-Id: I525fb1e09f8bedc61b9dbef19f9cce7026ff9c53 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/4031151 Reviewed-by: Jamie Madill <jmadill@chromium.org> Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org> Reviewed-by: Yuxin Hu <yuxinhu@google.com>

f17cb883

2022-11-30T17:23:12

Vulkan: Add two tests for per context queue serial work SubmittingOutsideCommandBufferTriggersEndRenderPass: This test is added to test outside command buffer uploads that triggers endRenderPass works properly. CreateMultiSharedContextAndDraw: This test is added to test draw with shared vertex buffer in the shared context group works properly. Bug: b/255414841 Change-Id: I8b4f343fe220a9f0b7c6e042f4663e23ae6f4c9d Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/4064148 Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Reviewed-by: Yuxin Hu <yuxinhu@google.com> Commit-Queue: Charlie Lao <cclao@google.com>

7f4caaf5

2022-11-23T15:40:53

Vulkan: Fix VulkanPerformanceCounterTest.SubmittingOutsideCom VulkanPerformanceCounterTest.SubmittingOutsideCommandBufferDoesNotCollectRenderPassGarbage depends on the implementation detail on how we flush and submit commands. The recent change crrev.com/c/4038095 fixes one issue that we are now having one less submission on pixel 6 device. This CL adjust the test to account for that. This CL also changed to set mHasDeferredFlush to true only when there is a started renderpass upon FBO bind. This CL also opt in swiftshader into preferSubmitAtFBOBoundary feature for test coverage and ease of debugging since ARM GPU (which enables this flag) is not been tested on CI. Bug: b/255414841 Change-Id: I295cec33a8ca257a5d5a98604b8c4c0c29e97cdf Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/4054101 Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Reviewed-by: Yuxin Hu <yuxinhu@google.com> Commit-Queue: Charlie Lao <cclao@google.com>

68b47e58

2022-11-16T10:46:59

Vulkan: Initial support for VK_EXT_graphics_pipeline_library When available, this change uses VK_EXT_graphics_pipeline_library to create pipelines. Currently, it is only used when graphicsPipelineLibraryFastLinking is available. This restricts the use of this extension to devices where monolithic pipelines are not any more performant than linked libraries. A future change adds support for other implementations by providing async pipeline creation. Bug: angleproject:7369 Change-Id: I1e3b7ac4aa56e75c7d6f4d0d5ea91cb0b862e581 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/4031489 Reviewed-by: Charlie Lao <cclao@google.com> Reviewed-by: Amirali Abdolrashidi <abdolrashidi@google.com> Reviewed-by: Steven Noonan <steven@valvesoftware.com> Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org>

9b5fff82

2022-10-05T21:56:00

Vulkan: Emulate shader stencil export for MSRTT The MSRTT emulation code had one corner case issue that could lead to performance and memory inefficiencies. That is when stencil needs to be unresolved and VK_EXT_shader_stencil_export is not supported. This change adds a path to emulate VK_EXT_shader_stencil_export and removes this inefficiency. This should help Chromium on older Android devices that lack both this and the recent VK_EXT_multisampled_render_to_single_sampled extensions. Chromium frequently breaks the render pass (crbug.com/1336981), which easily leads to this situation. Bug: angleproject:4836 Change-Id: Ifceec43f7f3807b7e32f4b379edcd4351ae76414 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/3935892 Reviewed-by: Yuxin Hu <yuxinhu@google.com> Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org> Reviewed-by: Charlie Lao <cclao@google.com>

76f377c5

2022-06-17T16:05:16

Vulkan: Break renderpass when switch from query to non-query getQueryResult will wait for query result to be available, which means a potential CPU bubble if the result is not yet available. On tiler GPUs it will at least wait for renderpass to complete. Usually query enabled draws are very tiny (usually just draw a point to see if it is occluded or not), and query disabled draws are expensive. Some apps do issue a glFlush when switch from query draw to non-query draw, but app like dead_by_daylight does not issue such flush. In order to reduce the bubble, this CL ends renderpass and issue a flush when we switch from query enabled draws to non-query enabled draw so that the result will be available much earlier, this reduce the CPU bubble. This result in dead_by_daylight frame time improves from 5.45ms to 3.5ms (35% improvement). Bug: b/250706693 Change-Id: Ia3a32a9fb336e6f256809b3cad83f61a45415fb1 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/3931739 Reviewed-by: Yuxin Hu <yuxinhu@google.com> Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Auto-Submit: Charlie Lao <cclao@google.com> Commit-Queue: Charlie Lao <cclao@google.com>

c19ec948

2022-08-23T10:43:59

Vulkan: Implement imageless framebuffers * Added the attachment image and create info objects to be used for imageless framebuffers created in getFramebuffer(). * New helper class for framebuffers in RenderPassCommandBufferHelper: MaybeImagelessFramebuffer, which includes a framebuffer object, if the framebuffer is imageless, and the image views. This is to make sure that the args for render pass begin info will be correctly set up according to the status of the used framebuffer. * Refactored the collection of attachments in getFramebuffer() into a new function, getAttachmentsAndImagesFromRenderTargets(). It also returns their corresponding ImageHelper* objects used to create the framebuffer (from their image properties). * New struct: RenderTargetInfo; which keeps track of render targets and whether resolve image should be used for the render pass in the form of the enum class RenderTargetImage. * Added a new arg to getFramebuffer(): resolveRenderTargetIn; to use when there is a valid resolveImageViewIn. * Without using the framebuffer cache, we would require to handle the framebuffer destruction by adding it to the garbage instead of releasing it. For example, FramebufferVk::destroy() now adds mCurrentFramebuffer to the garbage. * Added new framebuffer unit tests. * Added tests where two textures with different attributes are bound to the same framebuffer before drawing, one after another. * Added test where a blit occurs from a multisample texture into a non-zero level of a resolve texture, each bound to a separate FBO. * Added a new perf test to compare performance for enabled imageless framebuffers vs disabled. (Credit: cclao) Bug: angleproject:7553 Change-Id: Iacdbd73aaa01cbb0e37abf01ae4892bdfdd4b12f Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/3827644 Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Commit-Queue: Amirali Abdolrashidi <abdolrashidi@google.com> Reviewed-by: Charlie Lao <cclao@google.com>

5d7c4eca

2022-10-02T02:27:27

Vulkan: Don't flush depth/stencil on color blit When syncing the read framebuffer for blit, deferred clears are picked up for the attachments that are not being synced. They are then redeferred so a future command would pick them hopefully as loadOp. This change improves the frame time of Pretty Derby on Pixel 6 by ~23%. Bug: angleproject:7727 Change-Id: Ie7d84c58315cd09204e5229f1ec73605d5a7f639 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/3931973 Reviewed-by: Charlie Lao <cclao@google.com> Reviewed-by: Cody Northrop <cnorthrop@google.com> Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org>

2debd07d

2022-09-21T11:40:18

Automatically query status of features for tests Now tests can skip based on what features exist, compared to what features are explicitly asked for. For example, a test suite may override-enable a (normally disabled) feature that depends on a hardware capability. With this change, it can be skipped if said hardware capability doesn't exist. As a bonus, tests now correctly skip if the feature is overriden through an environment variable. This change also cleans up VulkanPerformanceCounterTest tests which did the same for a number of specific features. Bug: b/243398683 Change-Id: I84f026e3394eab56fd123e02bee72720c7ed94c6 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/3909789 Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org> Reviewed-by: Jamie Madill <jmadill@chromium.org> Reviewed-by: Yuxin Hu <yuxinhu@google.com>

18f90857

2022-09-09T11:28:00

Vulkan: Use DontCare if attachment is invalidated If an attachment is invalidated, there is no need to preserve the old content. NONE means old content is still preserved, DontCare means discard old content. In this case we do want to discard instead of preserve old content. Bug: b/243711628 Change-Id: I242ac86db6993574b5627d61f7185d155beec0ba Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/3888938 Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Auto-Submit: Charlie Lao <cclao@google.com> Commit-Queue: Charlie Lao <cclao@google.com> Reviewed-by: Yuxin Hu <yuxinhu@google.com>

34332f85

2022-09-13T13:54:14

Fix UninstantiatedParameterizedTestSuite errors on iOS. Some test suites are instantiated only on ES31 or Vulkan, which iOS doesn't support. Bug: angleproject:5417 Change-Id: Iea202934edb3804993dabd38f2629d4992eb2095 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/3892013 Reviewed-by: Jamie Madill <jmadill@chromium.org> Commit-Queue: Jamie Madill <jmadill@chromium.org> Commit-Queue: Yuly Novikov <ynovikov@chromium.org> Auto-Submit: Yuly Novikov <ynovikov@chromium.org>

1d04539f

2022-09-06T15:20:32

Fix xfb tests rendering points Some xfb tests render points and verify a coordinate away from the points is unchanged as a means to break the render pass. Due to lack of output to gl_PointSize, these tests are flaky on SwiftShader. Bug: angleproject:7625 Change-Id: I7347516bb755ace87d57df3467c59055f28f1d69 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/3877783 Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org> Reviewed-by: Roman Lavrov <romanl@google.com> Reviewed-by: Jamie Madill <jmadill@chromium.org>

38a38b8d

2022-09-01T17:10:39

Revert "EndXfbAfterRenderPassClosed expectation (0,0) -> (w/2,h/2)" This reverts commit 2dc1c609dea184e5e51a8136df71ae14f4481f52. Reason for revert: Doesn't fix the issue Original change's description: > EndXfbAfterRenderPassClosed expectation (0,0) -> (w/2,h/2) > > Bug: None > Change-Id: I6a8006be39ff8b8208004f533157f27da8e7fe24 > Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/3863143 > Auto-Submit: Roman Lavrov <romanl@google.com> > Commit-Queue: Jamie Madill <jmadill@chromium.org> > Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> > Reviewed-by: Jamie Madill <jmadill@chromium.org> Bug: None Change-Id: Ifbb8f12798c9b5bf1f77f997302114263eceaf75 No-Presubmit: true No-Tree-Checks: true No-Try: true Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/3868935 Auto-Submit: Roman Lavrov <romanl@google.com> Commit-Queue: Rubber Stamper <rubber-stamper@appspot.gserviceaccount.com> Bot-Commit: Rubber Stamper <rubber-stamper@appspot.gserviceaccount.com>

2dc1c609

2022-08-29T15:31:56

EndXfbAfterRenderPassClosed expectation (0,0) -> (w/2,h/2) Bug: None Change-Id: I6a8006be39ff8b8208004f533157f27da8e7fe24 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/3863143 Auto-Submit: Roman Lavrov <romanl@google.com> Commit-Queue: Jamie Madill <jmadill@chromium.org> Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Reviewed-by: Jamie Madill <jmadill@chromium.org>

c6ad305c

2022-08-25T11:53:46

Vulkan: No depth load/store if depthFunc==ALWAYS/NEVER && mask==FALSE If depthFunc is set to always or never pass with depthMask disabled, and the entire render pass is drawing with that state, then there is no need to load or store depth value. Bug: b/243711628 Change-Id: I71d470bda49abc48a4a6e20895b7e056c33fa33a Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/3858143 Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Reviewed-by: Yuxin Hu <yuxinhu@google.com> Commit-Queue: Charlie Lao <cclao@google.com>

7428369a

2022-08-29T17:59:38

Vulkan: Use macros for load/store Op check Use macro instead of inline function for result check so that the correct line number gets print out for the failed check. Bug: b/243711628 Change-Id: I1141f6a63fd01bb9fe0cf5c06b81b378e8acc08e Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/3864347 Reviewed-by: Ian Elliott <ianelliott@google.com> Commit-Queue: Charlie Lao <cclao@google.com> Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org>

9a258281

2022-08-17T17:47:22

Fix submit-count perf counter test on ARM On ARM, the preferSubmitAtFBOBoundary feature causes extra submissions that need to be taken into account. Bug: chromium:1337538 Change-Id: Id545ee3e65fc943aff51ea3721e9c19bc0afd4a5 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/3835168 Auto-Submit: Shahbaz Youssefi <syoussefi@chromium.org> Reviewed-by: Yuxin Hu <yuxinhu@google.com> Commit-Queue: Yuxin Hu <yuxinhu@google.com>

2c351351

2022-08-07T22:31:40

Vulkan: Don't break render pass on read-only buffer updates When uploading to a buffer that is in use by the GPU, we either acquire a new buffer and copy the contents over, or stage the update and do a GPU copy. Ignoring all other conditions, this decision was made based on whether a small or large part of the buffer is being updated; small updates where staged. However, if the current render pass uses the buffer in read-only mode, the staged update would break it (to apply the update). In this change, this situation is detected and the acquire-and-update path is chosen even for small updates. Bug: angleproject:7534 Change-Id: Ie2c0989449dcc7d03695a003cf6f353920f8fb65 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/3812566 Reviewed-by: Charlie Lao <cclao@google.com> Reviewed-by: Amirali Abdolrashidi <abdolrashidi@google.com> Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org>

928c5016

2022-08-04T12:28:12

Vulkan: Fix garbage collection vs outside-RP-only flush In https://chromium-review.googlesource.com/c/angle/angle/+/3379231, an optimization was implemented such that the excessive recorded texture uploads would get flushed early and submitted. This caused a use-after-free bug in the following situation: * Draw with pipeline A * Delete A <--- this puts A in the Context garbage list * Upload a lot of data At this point, the flush threshold could pass and the commands recorded outside of the render pass up to this point would be submitted. Associated with this submission was the current garbage, including pipeline A. However, the render pass that uses pipeline A is still not submitted. Now if after some time the render pass is still open, but the "completed commands" are checked (another set of uploads causing another submission, a query status check, etc), the garbage can be cleaned up. When the render pass closes next and is submitted, the implementation attempts to use the pipeline, which is already deleted. In this change, outside-render-pass-only submissions no longer reference the current garbage. This has the side effect that the temporary buffers used for uploading texture data won't be released early. A future optimization may want to separate the garbage list in ContextVk to render pass and outside render pass garbage. Bug: chromium:1337538 Change-Id: I4d31edc53916785d44420f4d6b4b2578ca3996e2 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/3812555 Auto-Submit: Shahbaz Youssefi <syoussefi@chromium.org> Reviewed-by: Charlie Lao <cclao@google.com> Commit-Queue: Amirali Abdolrashidi <abdolrashidi@google.com> Reviewed-by: Amirali Abdolrashidi <abdolrashidi@google.com>

553b1334

2022-07-28T23:33:28

Vulkan: fix default msaa framebuffer resolve issue. Bug: b/239217726 Change-Id: I826aad7495814e0a178a586c4cfd5943278cddac Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/3793304 Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Reviewed-by: Charlie Lao <cclao@google.com> Commit-Queue: Jamie Madill <jmadill@chromium.org> Reviewed-by: Jamie Madill <jmadill@chromium.org>

f000215d

2022-07-26T21:16:14

Vulkan: Optimize transform feedback buffer tracking Prior to this CL, if transform feedback was active at the time of render pass closure, its buffers were cached in ContextVk. Later, these buffers were used to close the render pass if they were used for any other reason (such as vertex attribute). However, this meant that the render pass could close unnecessarily if transform feedback was ended right after the render pass is closed. The closure of the render pass was an awkward place to cache the used transform feedback buffers (because at that point, the buffers are actually no longer used). Instead, this change makes sure that the buffers are cached when transform feedback buffers are first used by the render pass, and the cache is cleared at the end of the render pass. Bug: angleproject:4622 Change-Id: I31c0a1e20d48f2e261e2cf37adb0a46db683e6fb Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/3788309 Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org> Auto-Submit: Shahbaz Youssefi <syoussefi@chromium.org> Reviewed-by: Charlie Lao <cclao@google.com> Reviewed-by: Amirali Abdolrashidi <abdolrashidi@google.com>

53d40aed

2022-07-15T15:03:25

Vulkan: Destroy descriptorSet cache when BufferHelper destroyed For atomic counter buffers or other cases, dynamic descriptor is not been used. Right now when such buffer is destroyed, the cache is still lingers around. With this CL, when a new cache entry has been created, we record the cache entry in the BufferHelper. When BufferHelper is destroyed, we also immediately destroy the cache entry since the cache will no longer reused. Bug: b/237686097 Change-Id: I26eee96318fbc003e65318c0b8263dc61092f350 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/3764044 Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Commit-Queue: Charlie Lao <cclao@google.com> Reviewed-by: Yuxin Hu <yuxinhu@google.com>

c7459a46

2022-07-15T09:55:03

Vulkan: Destroy descriptorSet cache when BufferBlock destroyed When a new cache entry has been created, we record the cache entry in the BufferBlock. When BufferBlock is destroyed, we also immediately destroy the cache entry since the cache will no longer reused. This CL also removes DescriptorCacheResult from various APIs since it is now redundant with newSharedCacheKey argument. Bug: b/237686097 Change-Id: I14fa8906fdbe7d9226c8e8ecddef2beb05fbaa5c Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/3756694 Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Reviewed-by: Yuxin Hu <yuxinhu@google.com> Commit-Queue: Charlie Lao <cclao@google.com> Auto-Submit: Charlie Lao <cclao@google.com>

496bddf3

2022-07-14T20:58:03

Skip mutable texture upload tests through feature * Added a condition in the mutable texture upload tests in VulkanPerformanceCounterTest.cpp, to skip the test if the feature `MutableMipmapTextureUpload` is disabled on that platform. Bug: angleproject:7308 Change-Id: Iff1985cabb463dc82ef15340cf3c485a0b680f0b Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/3765180 Commit-Queue: Amirali Abdolrashidi <abdolrashidi@google.com> Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Auto-Submit: Amirali Abdolrashidi <abdolrashidi@google.com>

01092c48

2022-07-12T10:11:22

Vulkan: Destroy descriptorSet cache when shader image is destroyed Similar to texture descriptor set, this applies to images used as shader resource. When a texture is used in a shader resource descriptorSet, we record it. When texture is destroyed, we also destroy that shader resource descriptorSet cache. Bug: b/237686097 Change-Id: I475982fcec45535cc285a4aebca922d01efc7ed2 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/3758884 Reviewed-by: Yuxin Hu <yuxinhu@google.com> Auto-Submit: Charlie Lao <cclao@google.com> Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Commit-Queue: Charlie Lao <cclao@google.com>

32c5fd8a

2022-05-13T14:31:03

Reland "Vulkan: Flush texture updates more often" This is a reland of 8bb7c35c2159de2fa9e9a008679c692edd4402a6 * Added a condition to make sure the previous texture is not immutable when performing the optimization. * Fixed the issue where mipmap textures with unequal dimensions were not flushed. * Added related tests. * Added kEnableMutableMipmapTextureUpload, a flag to enable/disable the feature (enabled by default). Original change's description: > Vulkan: Flush texture updates more often > > * Added a pointer to the previous texture in ShareGroupVk so we can > flush the texture updates once we switch to a new texture. > > * We check if mip levels 0 and 1 are conformant in terms of > size, format and number of samples. > > * As a part of size check, we also check depths if the texture > target is either 3D, 2D array, or cube map array. For the former > two, they have to conform to mip scaling similar to width and > height. For the latter, the depth represents layer-faces and does > not change for mipmaps. > > * Added a test to ensure the pointer to the previous texture is > deleted when the corresponding texture is deleted, so the old value > is not accessed by a future mutable texture. > > * Added tests to make sure the mutable texture is uploaded with > the appropriate mip level attributes, and not uploaded in cases of > size/format inconsistencies, incompleteness, and no base level. > > Bug: b/202744914 > Change-Id: I9c2c1af87a8a49e75d3ad25523436b0cd51a7e81 > Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/3606329 > Reviewed-by: Charlie Lao <cclao@google.com> > Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> > Commit-Queue: Amirali Abdolrashidi <abdolrashidi@google.com> Bug: b/202744914 Change-Id: I2bdbcd0182a57c18c1a18968396251a2e366731b Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/3646959 Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Reviewed-by: Charlie Lao <cclao@google.com> Commit-Queue: Amirali Abdolrashidi <abdolrashidi@google.com>

89e38b57

2022-06-22T15:04:08

Refactor to use ANGLETest vs ANGLETestWithParam Bug: angleproject:6747 Change-Id: I72ad52d0268eae0e1a401f12f3e94cc5efa402f2 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/3719002 Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org> Reviewed-by: Cody Northrop <cnorthrop@google.com>

e50351cb

2022-06-10T22:28:58

Vulkan: Don't close render pass on framebuffer fetch For applications that use framebuffer fetch in the same RP as non-fetch programs, we can save some extra RenderPasses by always creating our RP objects with input attachments enabled. This works almost identically except for needing to use the images in a "GENERAL" layout instead of "COLOR_ATTACHMENT_OPTIMAL". According to partners it is possible to achieve performance parity even with GENERAL layout. To remove any potential negative impacts of using the GENERAL layout, the context enters this always-framebuffer-fetch mode only and as soon as a framebuffer fetch program is created. Applications that don't use framebuffer fetch are thus unaffected. This eliminates 20 render passes in the Genshin Impact trace (out of about 58). On a Pixel 6 the resulting benchmark score speeds up by ~25%. For Real Racing 3, the speed up is ~30%. Based on change by jmadill@chromium.org Bug: angleproject:7375 Change-Id: Ib6c73e95d06229f8545d502b388ee2a55a582323 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/3697308 Reviewed-by: Charlie Lao <cclao@google.com> Reviewed-by: Jamie Madill <jmadill@chromium.org> Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org>

91976352

2022-06-21T15:41:02

Use C++17 attributes instead of custom macros Bug: angleproject:6747 Change-Id: Iad6c7cd8a18d028e01da49b647c5d01af11e0522 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/3718999 Reviewed-by: Cody Northrop <cnorthrop@google.com> Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org> Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org>

97a6e581

2022-05-30T16:50:26

Vulkan: Useful implementation of program binaries ANGLE already serializes the pipeline state for the sake of OES_get_program_binary. This serialization had limited usefulness however, since the Vulkan driver hasn't actually created any pipelines yet (which is a costly part of program creation). Simultaneously, ANGLE deferred Vulkan pipeline creation to draw time, which causes hitching. In this change, a handful of Vulkan pipelines are precreated at link time; those at least that are sure to create different blobs in the pipeline cache (different spec consts or SPIR-V generation). These pipelines are created in the program executable's cache. The cache is then merged into the shared renderer cache (for potential blob reuse by other programs). With this, two goals are achieved: - Most pipelines created at draw time hit the pipeline cache, avoiding costly compilation. - When the program binary is retrieved, the contents of the program executable's pipeline cache is also returned. On reload, the cache is recovered, resulting in faster startup. Bug: angleproject:5881 Change-Id: I46c5451a7d0b16dffd40e44015e094640886880b Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/3671977 Reviewed-by: Jamie Madill <jmadill@chromium.org> Reviewed-by: Charlie Lao <cclao@google.com> Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org>

c5ee5a9c

2022-06-10T10:29:11

Vulkan: Add test CreateDestroyTextureDoesNotIncreaseDescSetCache This adds a test to demonstrate a usage pattern seen with surfaceflinger (see b/234602034 for detailed reproduce steps). With every iteration of notification shade pop up, after all other optimization, we are still seeing four descriptor sets gets allocated. Surfaceflinger is allocating AHB and texture every time and after usage it gets destroyed. This test uses normal texture instead of EGLImage for easy of debugging on linux/windows platform, but it demonstrated the exact same problem with AHB texture. Bug: b/235523746 Change-Id: I7ca1ff13b61ade1449a56d3afc8a84926ad13850 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/3700570 Reviewed-by: Jamie Madill <jmadill@chromium.org> Commit-Queue: Charlie Lao <cclao@google.com> Reviewed-by: Ian Elliott <ianelliott@google.com>

3dfc8004

2022-06-08T14:24:48

Vulkan: Optimize sync followed by swap Previously, inserting a sync object immediately caused a submission. That was done in https://chromium-review.googlesource.com/c/angle/angle/+/3200274 to be able to wait until the sync object is signaled without having to wait for whatever is recorded after it until a flush naturally happens. Some applications issue a glFenceSync right before eglSwapBuffers. The submission incurred by glFenceSync disallowed the optimizations that eglSwapBuffers would have done, leading to performance degradations. This could have been avoided if glFenceSync was issued right after eglSwapBuffers, but that's not the case with a number of applications. In this change, when a fence is inserted: - For EGL sync objects, a submission is issued regardless - For GL sync objects, a submission is issued if there is no render pass open - For GL sync objects, the submission is deferred if there is an open render pass. This is done by marking the render pass closed, and flagging the context as having a deferred flash. If the context that issued the fence sync issues another draw call, the render pass is naturally closed and the submission is performed. If the context that issued the fence sync causes a submission, it would have a chance to modify the render pass before doing so. For example, it could apply swapchain optimizations before swapping, or add a resolve attachment for blit. If the context that issued the fence sync doesn't cause a submission before another context tries to access it (get status, wait, etc), the other context will flush its render pass and cause a submission on its behalf. This is possible because the deferral of submission is done only for GL sync objects, and those are only accessible by other contexts in the same share group. Bug: angleproject:7379 Change-Id: I3dd1c1bfd575206d730dd9ee2e33ba2254318521 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/3695520 Reviewed-by: Charlie Lao <cclao@google.com> Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org> Reviewed-by: Jamie Madill <jmadill@chromium.org>

ee1dd7f4

2022-06-08T13:17:39

Vulkan: Add test for glEGLImageTargetTexture2DOES issue This add a test that repeatedly calling glEGLImageTargetTexture2DOES on the same source EGLImage with the same texture parameters should not causing texture's descriptor set cache to keep growing. This is the usage pattern we are seeing with surfaceflinger. Bug: b/234602034 Change-Id: I38ec0a0b2580b8985c27e8c9f7edf14aa7843023 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/3696677 Reviewed-by: Ian Elliott <ianelliott@google.com> Reviewed-by: Jamie Madill <jmadill@chromium.org> Commit-Queue: Charlie Lao <cclao@google.com>

d655ad29

2022-05-31T14:20:16

Vulkan: Add tests for FramebufferCache growth bugs When texture attached to FBO gets respecified, we shouldn't keep growing FramebufferCache. When texture attached to fbo get glTexParameteri(GL_TEXTURE_SWIZZLE_R) call with the same value, we should also not destroy/recreate framebuffers (in fact should not recreate VkImageView). We ran into this usage pattern on surfaceflinger. When texture attached to fbo get glTexParameteri(GL_TEXTURE_SWIZZLE_R) call with different value, we should also not destroy/recreate framebuffers (in fact should not recreate VkImageView). We ran into this usage pattern on surfaceflinger. Bug: b/234769934 Bug: b/234602034 Change-Id: I9fc881486f95cc3da843f50fa0a8cdcbfd4fc625 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/3681081 Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Reviewed-by: Ian Elliott <ianelliott@google.com> Commit-Queue: Charlie Lao <cclao@google.com>

b0d75fb5

2022-05-31T16:55:23

Vulkan: Use 64-bit counters Some upcoming counters don't fit in 32 bits. Bug: angleproject:5881 Change-Id: I2de8a603cabdb5f7417c29d5f37a50899485d6d3 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/3679488 Commit-Queue: Charlie Lao <cclao@google.com> Auto-Submit: Shahbaz Youssefi <syoussefi@chromium.org> Reviewed-by: Jamie Madill <jmadill@chromium.org> Reviewed-by: Charlie Lao <cclao@google.com>

e56f227d

2022-05-13T08:50:12

Vulkan: Add case: TextureSampleByDrawDispatchDraw This case is used to verify the implicit synchronization when GL executables switch from draw to dispatch. Besides, suppress a VVL on it. Bug: angleproject:7031 Change-Id: Idab68cfd0d4b17685f5eb5b3eec7f2cad12e5877 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/3646927 Reviewed-by: Charlie Lao <cclao@google.com> Commit-Queue: Charlie Lao <cclao@google.com> Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org>

03ccd9cc

2022-05-13T16:12:11

Revert "Vulkan: Flush texture updates more often" This reverts commit 8bb7c35c2159de2fa9e9a008679c692edd4402a6. Reason for revert: crashes tests in linux-rel Example: https://ci.chromium.org/ui/p/chromium/builders/try/linux-rel/1012030/overview Also possible flakiness https://anglebug.com/7308 Repro: out/Debug/bin/run_blink_web_tests fast/canvas/OffscreenCanvas-2d-drawImage.html Original change's description: > Vulkan: Flush texture updates more often > > * Added a pointer to the previous texture in ShareGroupVk so we can > flush the texture updates once we switch to a new texture. > > * We check if mip levels 0 and 1 are conformant in terms of > size, format and number of samples. > > * As a part of size check, we also check depths if the texture > target is either 3D, 2D array, or cube map array. For the former > two, they have to conform to mip scaling similar to width and > height. For the latter, the depth represents layer-faces and does > not change for mipmaps. > > * Added a test to ensure the pointer to the previous texture is > deleted when the corresponding texture is deleted, so the old value > is not accessed by a future mutable texture. > > * Added tests to make sure the mutable texture is uploaded with > the appropriate mip level attributes, and not uploaded in cases of > size/format inconsistencies, incompleteness, and no base level. > > Bug: b/202744914 > Change-Id: I9c2c1af87a8a49e75d3ad25523436b0cd51a7e81 > Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/3606329 > Reviewed-by: Charlie Lao <cclao@google.com> > Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> > Commit-Queue: Amirali Abdolrashidi <abdolrashidi@google.com> Bug: b/202744914 Change-Id: Id51fd4c76d058aa5100ec58ba618098c8f614253 No-Presubmit: true No-Tree-Checks: true No-Try: true Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/3645493 Auto-Submit: Roman Lavrov <romanl@google.com> Commit-Queue: Lingfeng Yang <lfy@google.com> Reviewed-by: Lingfeng Yang <lfy@google.com>

21ad9b3c

2022-04-07T09:57:26

Vulkan: Add generic descriptors for DS cache. With the new design, the descriptor set cache keys include all identifying information needed to reconstruct the update descriptor sets calls except the specific resource handles. The places for the resource handles are held by serials intead. When we miss the cache, we no longer need a second step to then construct the update calls, and can build the update calls directly from the key structures in combination with a list of resource handles. Bug: angleproject:6776 Change-Id: If1660a557585a75e9aa2560d6a38c56b62f555c8 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/3484981 Reviewed-by: Yuxin Hu <yuxinhu@google.com> Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Commit-Queue: Jamie Madill <jmadill@chromium.org>

d8d396db

2022-04-07T09:57:25

Vulkan: Add shared descriptor set caches. This allows programs with the same sets of descriptors to share descriptor sets. Currently there is no cache eviction. This CL adds a new "Meta" class to manage the descriptor set caches. Each shared descriptor pool is unique to a descriptor set layout. The descriptor set cache is moved into the pool class. Now every instance of a descriptor pool in ANGLE has easy access to a descriptor set cache as well. Bug: angleproject:6776 Change-Id: I06982e0349f5a87e4578e769fa356ce8e7ab49f0 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/3424660 Commit-Queue: Jamie Madill <jmadill@chromium.org> Reviewed-by: Yuxin Hu <yuxinhu@google.com> Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org>

8bb7c35c

2022-03-23T19:14:54

Vulkan: Flush texture updates more often * Added a pointer to the previous texture in ShareGroupVk so we can flush the texture updates once we switch to a new texture. * We check if mip levels 0 and 1 are conformant in terms of size, format and number of samples. * As a part of size check, we also check depths if the texture target is either 3D, 2D array, or cube map array. For the former two, they have to conform to mip scaling similar to width and height. For the latter, the depth represents layer-faces and does not change for mipmaps. * Added a test to ensure the pointer to the previous texture is deleted when the corresponding texture is deleted, so the old value is not accessed by a future mutable texture. * Added tests to make sure the mutable texture is uploaded with the appropriate mip level attributes, and not uploaded in cases of size/format inconsistencies, incompleteness, and no base level. Bug: b/202744914 Change-Id: I9c2c1af87a8a49e75d3ad25523436b0cd51a7e81 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/3606329 Reviewed-by: Charlie Lao <cclao@google.com> Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Commit-Queue: Amirali Abdolrashidi <abdolrashidi@google.com>

3d55cf0c

2021-12-30T11:27:26

Vulkan: Optimize the vkImage layout when used as GL_image If one vkImage has been used as GL_image in compute shader and as a GL_texture in fragment shader, no dependencies are needed for the fragment shader and other pre-fragment graphics shaders, like vertex/tess/geom. If we only assign the vkImage layout as writable when running GL executables that have Image Textures, we can specify more precise read-only barriers when running read-only GL executables. Bug: angleproject:6862 Change-Id: Iff37fdce13fea637751899253e535bf3f6663200 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/3366014 Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org> Reviewed-by: Charlie Lao <cclao@google.com>

d075dfe2

2022-05-03T16:25:26

Vulkan: Reduce kMaxBufferToImageCopySize to 64M Bug: b/230538246 Change-Id: Id2ef9c35f74fb6f526744903402562f9354bfcdb Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/3625834 Commit-Queue: Charlie Lao <cclao@google.com> Reviewed-by: Amirali Abdolrashidi <abdolrashidi@google.com> Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org>

3e05b93a

2022-04-12T19:39:19

Vulkan: MSAA swapchain resolve based on renderArea * Updated the MSAA resolve subpass so it can only be performed if the render pass is covering the entire area (e.g., not scissored). * Added test to make sure that the subpass resolve does not occur when the render pass does not cover the entire area. Bug: angleproject:6762 Bug: angleproject:7196 Change-Id: Iac3ab4b655dfeb7bff1348cc5e289a77a4dc0b83 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/3584942 Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Reviewed-by: Jamie Madill <jmadill@chromium.org> Commit-Queue: Amirali Abdolrashidi <abdolrashidi@google.com>

3eb2bcf7

2022-04-27T16:13:04

Vulkan: Fix syncval errors with DONT_CARE for unused attachments DONT_CARE is a write operation for synchronization purposes. ANGLE doesn't synchronize depth/stencil attachments that are not written to, as it uses the read-only layout. This change makes sure LOAD/STORE_OP_NONE are used instead of DONT_CARE for attachments that are not used, even if they don't have defined contents. This allows ANGLE to continue to not do additional synchronization. Bug: angleproject:5371 Bug: angleproject:5962 Bug: angleproject:6411 Bug: angleproject:6584 Change-Id: I539379aa34f6655f00e798e8c4a5c57f40f7a12d Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/3612182 Reviewed-by: Jamie Madill <jmadill@chromium.org> Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org> Reviewed-by: Charlie Lao <cclao@google.com>

7d31a47f

2022-04-23T00:19:15

Vulkan: Optimize away eglSwapBuffers for single buffer surfaces For single buffer surfaces, eglSwapBuffers serves two purposes: - Switch to/from single buffer mode - Implicitly issue a glFlush Simultaneously, for single buffer surfaces, glFlush serves three purposes: - Submit the commands - Call queue present (if necessary) - Throttle the CPU In this mode, ContextVk::flush() already redirects to the surface, calling WindowSurfaceVk::swapImpl() which calls back to ContextVk::flushImpl() (to submit the commands), calls queue present and throttles the CPU. If the application calls eglSwapBuffers(), the exact same thing happens (i.e. WindowSurfaceVk::swapImpl() is called to the same effect). Calling swapImpl() leads to an addition of the corresponding submit serial to the "swap history". The CPU throttling code always throttles the CPU to the serial of two swaps ago. Unnecessary calls to eglSwapBuffers() (when there is no command to be flushed) in single buffer mode would thus lead to the CPU throttled to the end of the last submission, effectively turning into a glFinish(). In this change, eglSwapBuffers() in single buffer mode, when not switching to/from this mode, is redirected to glFlush() as it's functionally equivalent. Simultaneously, ContextVk now tracks whether it has any pending commands for submission at all, and skips glFlush() altogether if there are none. Together, this results in the unnecessary eglSwapBuffers() to become no-op. Bug: b/229908040 Change-Id: I0e3b4a8b7eb4f6b0e0ed22260644825fc67dd330 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/3603841 Reviewed-by: Yiwei Zhang <zzyiwei@chromium.org> Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org> Reviewed-by: Jamie Madill <jmadill@chromium.org>

4aae5815

2022-04-22T13:21:03

Vulkan: Overlay widgets for submission statistics Bug: angleproject:7084 Change-Id: I68e69bda43862f9f2711c25a28dbe4745c19a45c Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/3602832 Reviewed-by: Jamie Madill <jmadill@chromium.org> Reviewed-by: Amirali Abdolrashidi <abdolrashidi@google.com> Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org>

2db718ed

2022-04-21T23:13:02

Vulkan: Skip empty submissions A number of places in ANGLE perform an implicit flush; eglSwapBuffers(), glFenceSync() etc. Sometimes these flushes are unnecessary because there is nothing to submit. Additionally, an application may unnecessarily issue glFlush() with nothing recorded. In this change, empty command buffers are automatically not submitted, optimizing these unnecessary flushes away. Bug: angleproject:7084 Change-Id: Iecb865b6b9ef8045dfecda7b5221874f7031b42e Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/3600837 Reviewed-by: Jamie Madill <jmadill@chromium.org> Reviewed-by: Amirali Abdolrashidi <abdolrashidi@google.com> Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org>

3b38b379

2022-04-20T10:44:24

Vulkan: Add feature avoid HOST_VISIBLE and DEVICE_LOCAL combination Discrete GPUs device local memory usually is not CPU accessible. This adds a feature flag to control that. Fixed bug in BufferVk that when mapRangeImpl is called from angle internal, unmapImpl was using front end mapping parameters that is incorrect. We have to cache the mapping parameters in the backend to hangle the mapRangeImpl/unmapImpl calls from internal. Fixed the test bug in ComputeShaderTest.BufferImageBufferMapWrite that we are calling glMapBufferRange with GL_MAP_READ_BIT but are actually writing to the map pointer. This should result in undefined behavior per spec. Fixed the test bug in GLSLTest.* that VerifyBuffer calls glMapBufferRange, but was giving incorrect length which result in data only been partially copied. This bug was hidden due to previously all buffers are CPU accessible and there is no copy needed. Fixed the test bug in ReadPixelsPBOTest.* and ReadPixelsPBONVTest.* that calls glMapBufferRangeEXT, but was giving incorrect length which result in data only been partially copied. This bug was hidden due to previously all buffers are CPU accessible and there is no copy needed. Added new skipped syncval messages. Because this CL triggers a copyToBuffer call for some of the buffers and that changes the syncval message signature for the same reasons (i.e, feedback loop or synval does not know the exact range of buffer been used for vertex buffers etc). Bug: angleproject:7047 Change-Id: I28c96ae0f23db8e5b51af8259e5b97e12e8b91f2 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/3597711 Reviewed-by: Yuxin Hu <yuxinhu@google.com> Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Commit-Queue: Charlie Lao <cclao@google.com>

b2a1f0d2

2022-04-14T07:58:32

Track total vs per-frame descriptor set counters. This will give more consistent measurements for descriptor set caches and descriptor set allocations. Bug: angleproject:6776 Change-Id: I584b8807ad19f8393ae54cc1d88b319c8f7f9f39 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/3584636 Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Reviewed-by: Yuxin Hu <yuxinhu@google.com> Commit-Queue: Jamie Madill <jmadill@chromium.org>

fcec6904

2022-04-13T14:18:06

Generate feature variable names from display names The json file now only contains the feature display name. The variable name is automaticaly derived. For consistence with Chromium and other Chromium-based projects, the display name is now always snake_case, and that's what's specified in the json files. This also makes camelCase variable name generation trivial (as opposed to the other way around). Feature overrides now accept both snake_case and camelCase names to ensure compatibility with existing scripts. This is done by removing _ and comparing override names with feature names in lower case. Bug: angleproject:6435 Change-Id: I0b6ed2bbf5c312bc4f4be7b3c7d55dbaca2a9886 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/3584630 Reviewed-by: Amirali Abdolrashidi <abdolrashidi@google.com> Reviewed-by: Jamie Madill <jmadill@chromium.org> Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org>

3cea7fcc

2022-03-16T16:33:43

Split Context ResourceUseList to RP Commandbuffers * Added mResourceUseList to each command buffer helper in an effort to move mResourceUseList away from ContextVk. * submitFrameImpl() renamed to submitCommands() * Moved the functions acquireResourceUseList() and onRenderPassFinished() in submitCommands() to the submitFrame functions calling it. Bug: angleproject:7103 Change-Id: I2487d5b86ea0a4d504f283aa7128501651317fe0 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/3531368 Reviewed-by: Charlie Lao <cclao@google.com> Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Commit-Queue: Amirali Abdolrashidi <abdolrashidi@google.com>

607d398e

2022-03-14T16:32:21

Vulkan: Optimize resolve of multisample swapchains * Resolves the multisampled image if the last render pass draws into the default framebuffer. * Added test to check the number of resolves in the optimization subpass (credit: Xinyi He) * Added test to check the number of resolves outside the subpass. * Added disabled test to see if the subpass resolve works. Bug: angleproject:6762 Change-Id: I86a8db3387851ab97d5f7a3d8a0ff26961254c14 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/3523062 Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Reviewed-by: Jamie Madill <jmadill@chromium.org> Commit-Queue: Amirali Abdolrashidi <abdolrashidi@google.com>

0ffff9ed

2022-04-05T15:56:23

Vulkan: Perf counters test for glInvalidateSubFramebuffer Bug: angleproject:7183 Change-Id: Id07c6467c746de312d6ba9695bdc98c9460144ca Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/3573182 Reviewed-by: Cody Northrop <cnorthrop@google.com> Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org>

kc3-lang/angle/src/tests/gl_tests/VulkanPerformanceCounterTest.cpp

src/tests/gl_tests/VulkanPerformanceCounterTest.cpp

Log