src/libANGLE/renderer/vulkan/ContextVk.cpp


Log

Author Commit Date CI Message
Shahbaz Youssefi f72428bb 2025-01-10T22:24:07 Vulkan: Fix MSAA swapchain resolve out of render pass .. vs the optimization that transitions the swapchain image to PRESENT_SRC as part of the render pass. When the swapchain image is the resolve attachment, its layout is decided when the render pass is closed (or through manual calls to finalizeImageLayout()). If the render area forbids the swapchain image from becoming the resolve attachment, it must be resolved after the render pass. Prior to this change, the swapchain image was still marked for transition to PRESENT_SRC by the render pass. This is inefficient, as the following out-of-render-pass resolve would have to transition the image back out of PRESENT_SRC. Nevertheless, this also had a bug exposed by an ASSERT in the dynamic rendering path: * Before the out-of-render-pass resolve, the image layouts are forcefully finalized. * The code in finalizeImageLayout() checks which image is being finalized; if the image is not any of the ones in the render pass, it was silently ignored. * The image marked for transition to PRESENT_SRC (mImageOptimizeForPresent) is not separately checked, as it is expected to be an attachment as well. The code that optimized the final render pass always marked the swapchain image for optimization, even if it was not going to become the resolve attachment. This change makes sure this optimization is done only if the image is definitely an attachment of the render pass. Bug: angleproject:389048224 Change-Id: I9f451d2698944111ac96bd97fefd6efa23859b7f Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/6168388 Reviewed-by: Amirali Abdolrashidi <abdolrashidi@google.com> Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org> Reviewed-by: Yuxin Hu <yuxinhu@google.com>
Mohan Maiya 0a207b65 2025-01-10T12:01:01 Vulkan: Assert size of GraphicsDriverUniformsExtended is within limits ANGLE updates driver uniforms using push constants, ensure size of ANGLE's driver uniform struct is within Vulkan spec's guaranteed limit of 128 bytes. Bug: angleproject:386749841 Change-Id: Iaa5ca8a46865a804b4c854ba27448bf4b6546646 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/6164689 Commit-Queue: mohan maiya <m.maiya@samsung.com> Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Reviewed-by: Charlie Lao <cclao@google.com>
Shahbaz Youssefi bafb661e 2024-12-13T12:01:19 Vulkan: Remove debug log + dead code Bug: angleproject:376572258 Change-Id: Ie774fd248a37fc65b4e05df0b4e4dffb778b9b4f Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/6090907 Auto-Submit: Shahbaz Youssefi <syoussefi@chromium.org> Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org> Commit-Queue: Roman Lavrov <romanl@google.com> Reviewed-by: Roman Lavrov <romanl@google.com>
Shahbaz Youssefi c75bd915 2024-12-10T23:01:44 Vulkan: Remove asyncCommandQueue It's been years and it never showed an advantage. In the meantime, performance without this feature seems close to native drivers (i.e. the feature has lost its appeal) and it's frequently a source of complication and bugs. Bug: angleproject:42262955 Bug: angleproject:42265241 Bug: angleproject:42265934 Bug: angleproject:42265368 Bug: angleproject:42265738 Bug: angleproject:42266015 Bug: angleproject:377503738 Bug: angleproject:42265678 Bug: angleproject:173004081 Change-Id: Id8d7588fdbc397c28c1dd18aafa1f64cbe77806f Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/6084760 Reviewed-by: Amirali Abdolrashidi <abdolrashidi@google.com> Reviewed-by: mohan maiya <m.maiya@samsung.com> Reviewed-by: Charlie Lao <cclao@google.com> Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org>
Shahbaz Youssefi fc4fc174 2024-12-10T22:01:28 Vulkan: Prevent crash with D/S FF without D/S attachment The spec says that the values for gl_LastFragDepth/StencilARM are undefined if there is no depth/stencil attachment. This "just" works on tiling GPUs, because reading input attachments simply translates to reading _something_ from the tile memory. For ANGLE, the situation is a little more complicated. ANGLE has to bind descriptors for input attachments (because non-tilers read from the input attachment descriptor instead of using the knowledge that input and color/depth/stencil attachments are one and the same thing in tile memory). When a depth/stencil attachment is missing, there is no image to bind to the descriptor set. ANGLE cannot skip binding an image to the descriptor set, because OpImageRead (translated from subpassLoad()) attempts to access the input descriptor; skipping this causes an internal crash in SwiftShader for example. ANGLE cannot bind a bogus image as input attachment, as Vulkan requires that input attachments are also color/depth/stencil attachments. ANGLE _could_ bind a bogus image as input attachment and also as depth/stencil attachment. This is rather risky, as it then also has to be careful to make sure that depth/stencil attachment is never actually used (i.e. it affects the depth/stencil state, load/store ops etc). In this change, the shader itself is modified to remove references to the depth/stencil input attachments if the attachment is missing. This is rather inefficient, as it means the pipeline warmup will not produce a usable pipeline, but it's accepted as a workaround for something apps shouldn't really be doing. Bug: angleproject:376572258 Change-Id: I0de68252b61615cb82cba7d1730699aadf41e92f Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/6085368 Reviewed-by: Charlie Lao <cclao@google.com> Reviewed-by: Yuxin Hu <yuxinhu@google.com> Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org>
Shahbaz Youssefi e9ba1681 2024-12-10T21:29:26 Vulkan: Fix DR vs FF vs non-draw RP start DR is Dynamic Rendering FF is Framebuffer Fetch RP is Render Pass With DR, whether framebuffer fetch is used or not is no longer tracked in the framebuffer's RenderPassDesc, because that property has no bearing on the framebuffer anymore. It still exits in RenderPassDesc to support legacy VkRenderPass objects. After a draw call starts a render pass, the state of the command buffer's copy of RenderPassDesc (copied from the framebuffer's) is updated to include the correct framebuffer fetch mode. However, this was not done when the render pass starts through other means, such as when a scissored or masked clear would call `Context::startRenderPass`. This change moves the aforementioned update of the framebuffer fetch mode to `Context::startRenderPass` so it affects everywhere the render pass may start from. Bug: angleproject:383356851 Change-Id: I82eff43863fc5b9fe67e57453269ee73859a6cd7 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/6085367 Reviewed-by: Charlie Lao <cclao@google.com> Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org> Reviewed-by: Yuxin Hu <yuxinhu@google.com>
Igor Nazarov 09578c42 2024-11-27T19:39:09 Vulkan: Cleanup CommandPoolAccess implementation This is a follow up for: Add a new mutex in CommandQueue to protect Vulkan Command Pool crrev.com/c/angle/angle/+/6020895 Change simplifies `CommandPoolAccess` implementation as well as removes source of bugs when need to know/remember what method from `CommandPoolAccess` to use in order to collect/destroy the primary command buffer. Additionally, `CommandBatch` converted into a class to avoid invalid use. The "retire" word replaced with "release" in methods such as `releaseFinishedCommands()`. Bug: b/362604439 Change-Id: Iaa72c55458604e5ea8ea5a402e437129a5c9180a Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/6056019 Reviewed-by: Charlie Lao <cclao@google.com> Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Reviewed-by: Yuxin Hu <yuxinhu@google.com> Commit-Queue: Igor Nazarov <i.nazarov@samsung.com>
Igor Nazarov 739bcef0 2024-12-03T17:58:34 Vulkan: Rework finishOneCommandBatchAndCleanup This is a follow up for: Vulkan: Fix finishOneCommandBatchAndCleanupImplLocked crrev.com/c/angle/angle/+/6055419 The original `finishOneCommandBatchAndCleanup()` was not optimal in a sense that it is always finish a batch, when cleaning existing garbage may be sufficient. Also, because there was no feedback from the `cleanupGarbage()`, this method may just end up finishing one batch without cleaning any garbage, causing clients to "think" that there is no more garbage to clean. Additionally, `cleanupGarbage()` was called under the `CommendQueue::mMutex` lock, causing uncessary blocking. This change replaces this method with new `cleanupSomeGarbage()`. It solves all problems of the original method described above. It no longer has the `retireFinishedCommandsLocked()` call, because it is not necessary in scenarios where `cleanupSomeGarbage()` is used. In order to implement the new method, output feedback parameter was added to the `cleanupGarbage()` as well as to all methods that it uses. Bug: b/280304441 Change-Id: I7078899838609a0c3e5edbc4f507c2fe4364380a Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/6063126 Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Reviewed-by: Charlie Lao <cclao@google.com> Commit-Queue: Igor Nazarov <i.nazarov@samsung.com>
Igor Nazarov 74609065 2024-11-27T16:09:44 Vulkan: Fix finishOneCommandBatchAndCleanupImplLocked Fix the `finishOneCommandBatchAndCleanupImplLocked()` to always do cleanup regardless if there is something to finish. This method is designed not only to free space in `mInFlightCommands` but also to cleanup already retired commends (in `mFinishedCommandBatches`) and renderer's garbage. In case if `mInFlightCommands` is empty cleanup was skipped - which is incorrect. Change removed `Impl` from the name since it is already have `Locked`. The `finishOneCommandBatchAndCleanup()` is updated to simply call the locked version with the mutex lock held. Change also improved `FixedQueue` assertions (always check that `mSize <= mMaxSize`). Bug: b/280304441 Change-Id: I67bd7c35b164b84e9c07306a5bf48b0adefdfa5e Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/6055419 Commit-Queue: Igor Nazarov <i.nazarov@samsung.com> Reviewed-by: Charlie Lao <cclao@google.com> Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org>
Shahbaz Youssefi 224f836c 2024-12-04T09:35:09 Vulkan: Update setupDispatch comment The comment was wrong and a source of confusion. Bug: angleproject:382090958 Change-Id: I7b1a3d5f3b1c86539164d346e320d30b61254f2c Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/6069354 Auto-Submit: Shahbaz Youssefi <syoussefi@chromium.org> Reviewed-by: Yuxin Hu <yuxinhu@google.com> Commit-Queue: Yuxin Hu <yuxinhu@google.com>
Charlie Lao cb31b886 2024-11-22T13:15:57 Vulkan: pruneDefaultBufferPools when there is excessive garbage What commonly happens with game (and traces) is that we upload a lot of textures before first frame is drawn. These texture uploads uses a lot of buffer memory and creates peak memory usage moment if not clean up quickly. This CL adds back the check if there is excessive suballocation memory gets destroyed, don't wait until frame boundary to prune empty buffer blocks. Do the prune immediately so that these memory could be reused for other purpose for the first frame rendering. Bug: angleproject:372268711 Change-Id: Ie548245b5ce108be0e2c19b296a28025bface395 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/6043405 Commit-Queue: Charlie Lao <cclao@google.com> Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Reviewed-by: Yuxin Hu <yuxinhu@google.com>
Charlie Lao 70d1ef67 2024-11-20T11:34:39 Vulkan: Ensure onFramebufferBoundary is called for offscreen There is peak memory regression observed from crrev.com/c/6022549. What I suspect happening is that for offscreen or single buffered case, glFlush/glFinish is called but bail out because it already submitted or deferred. So we end up not calling onFramebufferBoundary(). This CL ensures we always call onFramebufferBoundary from these two functions for single buffer or offscreen. Also fixed a bug when onSharedPresentContextFlush is called we may end up calling onFramebufferBoundary. To make API names consistent, existing flushImpl() is renamed to flushAndSubmitCommands() and a new flushIMpl is added to wrap around most logic inside flush(). Bug: angleproject:372268711 Change-Id: I54eed8a81f4153d52ab962f213cacc87a73b89ac Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/6037491 Reviewed-by: Yuxin Hu <yuxinhu@google.com> Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Commit-Queue: Charlie Lao <cclao@google.com>
Charlie Lao ce53aff0 2024-11-05T16:57:57 Vulkan: Add per descriptorSet LRU cache eviction Before this CL, the descriptor set cache eviction is at the pool level. Either the entire pool is deleted or not. It is also not LRU based. This CL adds a per descriptor set cache eviction and reuse evicted descriptorSet before allocating a new pool. This eviction is LRU based so that it is more precise. The mCurrentFrameCount is passed into various API so that it can make eviction decision based on the frame number. In this CL, anything not been used in last 10 frames will be evicted and recycled before allocate a new pool. Since eviction is based on individual descriptor set, not by pool, ProgramExecutableVk no longer needs to track the DescriptorSetPool object. mDescriptorPools has been removed from ProgramExecutableVk class. As measured by crrev.com/c/5425496/133 This LRU linked list maintenance does not add any measurable time difference, but reduces total descriptorSet pool count by one third (from 75 down to 48). running test name: "TracePerf", backend: "_vulkan", story: "batman_telltale" Before this CL: cacheMissCount: 200, averageTime:23998 ns cacheHitCount: 1075445, averageTime:626 ns descriptorSetEvicted: 0, descriptorSetPoolCount:75 Average frame time 3.9262 ms After this CL: cacheMissCount: 200, averageTime:23207 ns cacheHitCount: 1025415, averageTime:602 ns descriptorSetEvicted: 102708, descriptorSetPoolCount:48 Average frame time 3.9074 ms BYPASS_LARGE_CHANGE_WARNING Bug: angleproject:372268711 Change-Id: I84daaf46f4557cbbfdb94c10c5386001105f5046 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5985112 Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Reviewed-by: Yuxin Hu <yuxinhu@google.com> Commit-Queue: Charlie Lao <cclao@google.com>
Charlie Lao 74f74b63 2024-11-14T10:07:34 Vulkan: Add ContextVk::onFramebufferBoundary() function This makes a more formal API to track frame boundary. Also adds a uint32_t mCurrentFrameCount to track the total number of frames rendered so that we could use for heuristic purpose. Bug: angleproject:372268711 Change-Id: I153497403ed0d8fde18f1786186ce600df60c514 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/6022549 Reviewed-by: Yuxin Hu <yuxinhu@google.com> Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Commit-Queue: Charlie Lao <cclao@google.com>
Charlie Lao 087cc411 2024-11-14T11:05:14 Vulkan: Add mRenderer to ShareGroupVk class For convenience, instead of passing renderer to shareGroupVk's API, keep mRenderer in SharGroupVk class at constructor call. Bug: angleproject:372268711 Change-Id: I9534f7dbe24121856221b89ccf8fc6a353bbb0cc Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/6022548 Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Commit-Queue: Charlie Lao <cclao@google.com> Reviewed-by: Yuxin Hu <yuxinhu@google.com>
Shahbaz Youssefi e3b3dd68 2024-11-14T23:00:52 Vulkan: Optimize color clears vs read-only depth/stencil switch When switching to read-only depth/stencil mode, if the aspect that intends to be in read-only mode has a deferred clear, the clear is flushed separately beforehands (as that would be a write operation). Prior to this change, _all_ deferred clears were flushed for simplicity. In this change, only the aspect that is switching modes is cleared, leaving the other aspects free to be optimized as loadOp of the following render pass. Bug: angleproject:378058737 Change-Id: Iba4371590bee99f5022575c09b0d32231562488c Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/6019829 Reviewed-by: Amirali Abdolrashidi <abdolrashidi@google.com> Reviewed-by: Charlie Lao <cclao@google.com> Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org>
Shahbaz Youssefi adb80cbb 2024-11-12T22:38:38 Vulkan: Optimize read-only depth/stencil switch after clear Prior to this change, if color is cleared then read-only depth/stencil mode is enabled, ContextVk::switchToReadOnlyDepthStencilMode eagerly flushed the deferred clears (starting a render pass). However, if the render pass is marked dirty for any reason afterwards, for example because we want to flush the render pass after a query ends, the render pass that is just started above is unnecessarily closed. In this change, `switchToReadOnlyDepthStencilMode` only detects if a new render pass is needed and marks the appropriate dirt bits. This way, the render pass can only be restarted once. Bug: angleproject:378058737 Change-Id: I83a5ebae6c223882eafea338eeec19895d87e5c1 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/6023414 Reviewed-by: Charlie Lao <cclao@google.com> Reviewed-by: Amirali Abdolrashidi <abdolrashidi@google.com> Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org>
Shahbaz Youssefi ba81145f 2024-11-08T15:45:44 Vulkan: Emulate coherent framebuffer fetch everywhere Many apps expect coherent framebuffer fetch to be available, and multiple downstream emulators end up forcing coherent framebuffer fetch enabled despite the hardware not being coherent. This change attempts to do a best-effort emulation of coherent framebuffer fetch by automatically inserting barriers before framebuffer fetch draws. While this doesn't correctly handle self-overlapping geometry, it works well enough in practice for the applications. As a result, framebuffer fetch is practically enabled everywhere after this change. Bug: angleproject:377923479 Change-Id: I3900a1de0f4db755b7e70871f57df3ea112073f9 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/6004336 Reviewed-by: Charlie Lao <cclao@google.com> Reviewed-by: Cody Northrop <cnorthrop@google.com> Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org>
Mohan Maiya e2cd9082 2024-11-05T04:07:17 Vulkan: Bugfix in setCurrentImageLayout Make sure to update mLastNonShaderReadOnlyLayout and mCurrentShaderReadStageMask when updating the current layout. Bug: angleproject:40644776 Change-Id: Ie8652099a0d4caca9f9aea5bac38256a513b08e7 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5992020 Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Reviewed-by: Charlie Lao <cclao@google.com> Commit-Queue: mohan maiya <m.maiya@samsung.com>
Charlie Lao 2df8d32b 2024-10-25T13:49:40 Vulkan: Tag DescriptorSets properly with every command buffer When descriptorSetCache is disabled, there is a bug that the current descriptorSet is not properly tagged with current ResourceUse. Right now when we get a descriptorSet (from cache or reused or allocated new), we retain to the current command buffer. But if we submit command buffer and get a new command buffer, and draw with the same program/buffer/textures, we will be reusing the current bound descriptorSets, but it is not retained with new command buffer. In theory, we have the same bug for pool eviction as well when cache is enabled. It's just very hard to hit due to pool eviction occur very rare. But with cache disabled, this is very easy to hit with multiple tests. In this CL, the retainResource call is moved from ProgramExecutableVk::getOrAllocateDescriptorSet() call to ProgramExecutableVk::bindDescriptorSets() call. Since bindDescriptorSets is always get called when we get a new descriptorSet, and is always get called when a new command buffer is allocated, this covers all usage case. And even better, with this change we are able to remove commandBufferHelper from arguments of quite a few functions. Bug: angleproject:372268711 Change-Id: I1f21a3e7e9ea34e2842e54025b5eb930dbf6c593 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/4743599 Commit-Queue: Charlie Lao <cclao@google.com> Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Reviewed-by: Yuxin Hu <yuxinhu@google.com>
Charlie Lao 1652f8ed 2024-10-17T13:35:39 Vulkan: end2end tests when descriptorSetCache is disabled Some end2end tests are testing specific descriptorSet cache behavior. When cache is disabled, these tests failed. In this CL these perfCounter based tests haven been modified to check total allocation to ensure the descriptorSets are properly reused instead of cache hit/miss. Bug: angleproject:372268711 Change-Id: I1d2f4cfcf622b05cdcb3317c8804416a80e72c48 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/3735732 Commit-Queue: Charlie Lao <cclao@google.com> Reviewed-by: Yuxin Hu <yuxinhu@google.com> Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org>
Shahbaz Youssefi 08c1724f 2024-10-11T14:29:00 Vulkan: Support GL_ARM_shader_framebuffer_fetch_depth_stencil Bug: angleproject:352364582 Change-Id: I63fd78314fa7ebccbf366c252e309a9c0f09c8c1 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5938150 Reviewed-by: Charlie Lao <cclao@google.com> Reviewed-by: Yuxin Hu <yuxinhu@google.com> Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org>
Shahbaz Youssefi 65fcf9c4 2024-10-26T10:53:18 Vulkan: Remove redundant dependent feature checks Since [1], when a feature is overriden, the dependent features automatically take the override into account. Tests no longer need to account for dependent features, neither does the logic in the code. [1]:https://chromium-review.googlesource.com/c/angle/angle/+/4749524 Bug: angleproject:42266725 Change-Id: I5440aba4a89cffbe710e26ad7de4cfee783e9bdf Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5967414 Reviewed-by: Yuxin Hu <yuxinhu@google.com> Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org> Reviewed-by: Charlie Lao <cclao@google.com>
Igor Nazarov a769fad4 2024-10-24T20:46:42 Vulkan: Optimize and fix glFinish for single buffered surfaces Fixed bug: When calling `onSharedPresentContextFlush()` from `ContextVk::finish()` need to also call `finishImpl()` to wait for submitted commands. This bug was introduced in the original commit where `onSharedPresentContextFlush()` was added. Optimization: Skip calling `onSharedPresentContextFlush()` from `ContextVk::finish()` similarly to `ContextVk::flush()` when there is nothing to flush. Bug: angleproject:42265370 Bug: b/229908040 Change-Id: Ide9f9c5d8757257c925970faece1e137acf10dec Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5961290 Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Reviewed-by: Charlie Lao <cclao@google.com> Commit-Queue: Igor Nazarov <i.nazarov@samsung.com>
Charlie Lao fe99836c 2024-10-25T14:34:23 Vulkan: Use ANGLE_PERF_WARNING when no serial for reserved serial When we run out of OutsideRenderPassCommands' queue serial, we have to call flushCommandsAndEndRenderPass() so that we can get a new set of reserved serials for OutsideRenderPassCommands. The problem is that we call ANGLE_VK_PERF_WARNING macro before calling flushCommandsAndEndRenderPass(), which could insert a CommandID::InsertDebugUtilsLabel command when debug marker is enabled. This end up with mOutsideRenderPassCommands becomes not empty and subsequent call of flushCommandsAndEndRenderPass end up with flushOutsideRenderPassCommands and not able to early out due to command buffer is not empty. This CL simply changes ANGLE_VK_PERF_WARNING to ANGLE_PERF_WARNING to avoid getting into this situation. Assertion is also added to catch the problem at at the spot it happens. Bug: b/375661776 Change-Id: I2434af81b139c6b04d7ef1963f76035d60dfd471 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5967615 Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Reviewed-by: Roman Lavrov <romanl@google.com> Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org>
Charlie Lao 31c80bbf 2024-10-17T10:56:16 Vulkan: Avoid redundant work in updateFullActiveTextures ContextVk keeps mActiveTexturesDesc, which gets updated by UpdatePreCacheActiveTextures(). This is only used for cache lookup. When there is a cache miss, we end up call updateFullActiveTextures() which recomputes DescriptorSetDesc again, which is redundant work. This CL removes mActiveTexturesDesc from ContextVk. UpdatePreCacheActiveTextures has been changed to be a DescriptorSetDescBuilder method so that it can directly update the mDesc. updateFullActiveTextures has been renamed to updateActiveTexturesForCacheMiss which avoid mDesc calculation. updateFullActiveTextures is still kept for now which will be used in next CL when cache is disabled. Bug: b/372268711 Change-Id: Ic9a0cdaa7cefca5f72b599d26d079cef14888f07 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5905766 Reviewed-by: Yuxin Hu <yuxinhu@google.com> Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Commit-Queue: Charlie Lao <cclao@google.com>
Shahbaz Youssefi 47c66901 2024-10-21T12:47:22 Vulkan: Set gl_Layer to 0 if the framebuffer is not layered Bug: angleproject:372390039 Change-Id: I29067c9488e06f6dd2e90f207fecb843267fb77c Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5949263 Reviewed-by: Charlie Lao <cclao@google.com> Reviewed-by: Yuxin Hu <yuxinhu@google.com> Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org>
Shahbaz Youssefi e40d8581 2024-10-16T10:57:39 Vulkan: Fix render pass revival vs framebuffer fetch and DR Bug: angleproject:352364582 Change-Id: I86548251fc1dec75031a23e3461bf296c852919c Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5937412 Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org> Auto-Submit: Shahbaz Youssefi <syoussefi@chromium.org> Reviewed-by: Yuly Novikov <ynovikov@chromium.org> Commit-Queue: Yuly Novikov <ynovikov@chromium.org>
Shahbaz Youssefi a1584f49 2024-10-11T21:17:32 Vulkan: Qualify framebuffer fetch with "Color" In preparation for depth/stencil framebuffer fetch, many framebuffer fetch symbols are affixed with Color to indicate that they pertain to color framebuffer fetch logic. Bug: angleproject:352364582 Change-Id: I86000ada5e6ef47387dec0b6a3fca589d816cdc2 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5926593 Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org> Reviewed-by: Charlie Lao <cclao@google.com> Reviewed-by: Yuxin Hu <yuxinhu@google.com>
Shahbaz Youssefi 1608d0be 2024-10-10T16:53:15 Vulkan: Isolate framebuffer fetch no-RP-break optim from DR Prior to [1], changes to framebuffer fetch usage by shaders caused a render pass break. This was due to a limitation of render pass compatibility rules. It also caused other headache, such as needing to clear the render pass cache, recreating pipelines etc. [1]:https://chromium-review.googlesource.com/c/angle/angle/+/3697308 In [1] an important optimization was implemented for tiling GPUs where ANGLE permanently switched to framebuffer fetch mode on first encountering framebuffer fetch use. From that point on, ANGLE would always make every render pass framebuffer fetch compatible. In reality, the render pass break was unnecessary, which became apparent with dynamic rendering (for example that whether the render pass includes input attachments has no bearing on a pipeline that doesn't use input attachments at all). In [2], dynamic rendering kept the render pass break + permanent switch behavior for simplicity. [2]:https://chromium-review.googlesource.com/c/angle/angle/+/5637155 This change untangles the optimization done for legacy render passes from dynamic rendering, allowing dynamic rendering to start every render pass without framebuffer fetch and enable it later if a framebuffer fetch program is used. This is in preparation for supporting depth/stencil framebuffer fetch, where a perma-switch is troublesome (for example in combination with read-only depth/stencil feedback loops). Bug: angleproject:352364582 Change-Id: I31221cf22a28d58b9b2bf188e9c0b786cd0fe3d2 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5923120 Reviewed-by: Yuxin Hu <yuxinhu@google.com> Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org> Reviewed-by: Charlie Lao <cclao@google.com>
Mohan Maiya b38cc7fa 2024-09-30T12:43:09 Vulkan: Consolidate read colorspace override states ColorspaceState struct is now used to cache read colorspace related states to determine the colorspace of Vulkan read image views. ImageViewHelper methods are called during initialization and when colorspace related states are toggled dynamically which in turn process these states and determine the final read colorspace. Bug: angleproject:40644776 Tests: ImageTest*Colorspace*Vulkan SRGBTextureTest.SRGB*TextureParameter*Vulkan SRGBTextureTestES3.SRGBDecodeTexelFetch*Vulkan Change-Id: I16b3666cd80865936b826dc0738fc9210dabeda9 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5901255 Reviewed-by: Charlie Lao <cclao@google.com> Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Commit-Queue: mohan maiya <m.maiya@samsung.com>
Amirali Abdolrashidi 67a5ea45 2024-09-23T16:09:12 Vulkan: Fix the error from multiple lineloop draws Since Vulkan does not support line-loop draws natively, such a draw call requires the conversion of the related buffers to prepare them for this operation. For glDrawElementsIndirect(), the index and the indirect buffers would need conversion. However, what currently happens in this case is that the original buffer pointer is overwritten after the conversion, removing the link to the original buffer. Therefore, if there is a second line-loop call just after the first, it will try to use the converted buffer as the new source, which leads to errors due the buffer already being in use. The index buffer for the draw is bound when the related dirty bit is handled. Therefore, instead of using the draw index buffer directly for handling the line-loop scenario, we can use the index buffer in the form of a local pointer passed between functions. Then, in order to reconcile line-loop with the other cases, the draw index buffer is set just before setting up the indexed draw. * Functions handling line-loop draws do not modify the element array buffer in VertexArrayVk directly, but use local buffer pointers to pass the current element array pointer to further processing and drawing. * Added mCurrentElementArrayBuffer for ContextVk to be bound to the index buffer to used for draw instead of the one from its vertex array object. * Before the indexed draw, mCurrentElementArrayBuffer is set to the last destination index buffer. * Added unit test that makes a line-loop draw and then a non-LL call using the same element array. Bug: angleproject:360758685 Change-Id: I6d6328f6326c1a1f9f80e5ef346aa077c867d344 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5878764 Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Commit-Queue: Amirali Abdolrashidi <abdolrashidi@google.com> Reviewed-by: Charlie Lao <cclao@google.com>
Shahbaz Youssefi 7c811715 2024-09-25T11:09:44 Vulkan: fix crash when clearing stencil with ClearBuffer Follow up to [1] which fixed a crash with glClear, but the bug remained with glClearBufferiv. This change refactors the "is stencil write masked out" query to always take the framebuffer's stencil bit count into account (practically always 8), which also happens to make the rest of the code checking this query more accurate in the presence of nonsense masks where the bottom 8 bits are 0. [1]: https://chromium-review.googlesource.com/c/angle/angle/+/3315158 Bug: chromium:40207259 Bug: angleproject:42266334 Change-Id: I68a6b0b75c67ed2cdc8c4d03b243efe5495efce1 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5889788 Reviewed-by: Geoff Lang <geofflang@chromium.org> Auto-Submit: Shahbaz Youssefi <syoussefi@chromium.org> Commit-Queue: Geoff Lang <geofflang@chromium.org>
Yuxin Hu eaffa034 2024-09-24T20:56:04 Revert "Vulkan: Consolidate colorspace override states" This reverts commit bffcd235ba6c031603d798daaa98f1cf9a3f3e46. Reason for revert: Breaks Android test `org.skia.skqp.SkQPRunner#UnitTest_DMSAA_dst_read`. Details: https://b.corp.google.com/issues/369388539. Original change's description: > Vulkan: Consolidate colorspace override states > > ColorspaceState struct is now used to cache colorspace related states > and used to determine the colorspace of Vulkan image views. > ImageViewHelper methods are called during initialization and when > colorspace related states are toggled dynamically which in turn process > these states and determine the final read and write colorspaces. > > We can now fully support rendering to EGLImages, with colorspace > overrides, via texture or renderbuffer EGLImage targets > > Bug: angleproject:40644776 > Tests: ImageTest*Colorspace*Vulkan > MultithreadingTestES3.SharedSrgbTextureMultipleContexts*Vulkan > SRGBTextureTest.SRGB*TextureParameter*Vulkan > SRGBTextureTestES3.SRGBDecodeTexelFetch*Vulkan > ReadPixelsPBOTest.SrgbUnorm*Vulkan > Change-Id: I1cc2b5bd834b519b83deab4d80a2fcaabeb271d6 > Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5841290 > Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> > Reviewed-by: Charlie Lao <cclao@google.com> > Commit-Queue: mohan maiya <m.maiya@samsung.com> Bug: angleproject:40644776 Change-Id: I5bf6cf2ed0c8ec22fc02d8c3da92673ee85fe002 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5888506 Reviewed-by: Yuxin Hu <yuxinhu@google.com> Bot-Commit: Rubber Stamper <rubber-stamper@appspot.gserviceaccount.com> Commit-Queue: Yuxin Hu <yuxinhu@google.com>
Mohan Maiya bffcd235 2024-09-13T14:58:00 Vulkan: Consolidate colorspace override states ColorspaceState struct is now used to cache colorspace related states and used to determine the colorspace of Vulkan image views. ImageViewHelper methods are called during initialization and when colorspace related states are toggled dynamically which in turn process these states and determine the final read and write colorspaces. We can now fully support rendering to EGLImages, with colorspace overrides, via texture or renderbuffer EGLImage targets Bug: angleproject:40644776 Tests: ImageTest*Colorspace*Vulkan MultithreadingTestES3.SharedSrgbTextureMultipleContexts*Vulkan SRGBTextureTest.SRGB*TextureParameter*Vulkan SRGBTextureTestES3.SRGBDecodeTexelFetch*Vulkan ReadPixelsPBOTest.SrgbUnorm*Vulkan Change-Id: I1cc2b5bd834b519b83deab4d80a2fcaabeb271d6 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5841290 Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Reviewed-by: Charlie Lao <cclao@google.com> Commit-Queue: mohan maiya <m.maiya@samsung.com>
Charlie Lao dbdc9551 2024-09-16T10:19:47 Vulkan: Let asyncCommandBufferReset control garbage cleanup So that people can toggle the flag to compare perf/power difference with async thread doing garbage clean up AND command buffer reset. This also renames feature flag asyncCommandBufferReset to asyncCommandBufferResetAndGarbageCleanup to reflect the implementation. Bug: b/255411748 Change-Id: Id459e6f4dc81ec76b6c0c2dba0db46041ea6ae8a Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5867389 Reviewed-by: Amirali Abdolrashidi <abdolrashidi@google.com> Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Commit-Queue: Charlie Lao <cclao@google.com>
Mohan Maiya 1b4d6185 2024-09-12T09:18:46 Vulkan: Cleanup sRGB related code Image and image view code is littered with sRGB related enums, even in places that don't deal with sRGB. Remove sRGB related parameters from initLayerImageView and getLevelLayerDrawImageView methods, which now assume default values. Add dedicated methods that allow overriding sRGB state values. Also introduce ColorspaceState struct that consolidates all sRGB related states, this will be used in follow up changes to track and infer colorspace of image views Bug: angleproject:40644776 Change-Id: Ifb366db48043e376f9ff6c30c852c44dd96562a1 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5860808 Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Reviewed-by: Charlie Lao <cclao@google.com> Commit-Queue: mohan maiya <m.maiya@samsung.com>
Charlie Lao acf63b9e 2024-08-22T09:52:41 Vulkan: call traceGpuEvent after queueSerial has been allocated traceGpuEvent() will write to mOutsideRenderPassCommands. This means if we destroy context immediately it will flush OutsideRenderPassCommands. This will trigger assertion that the queueSerialIndex is invalid. This CL moves queueSerialIndex allocation before traceGpuEvent. Bug: b/361570359 Change-Id: I70909ebd23c43c05cb793319b255f81e65a17a9d Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5806203 Commit-Queue: Charlie Lao <cclao@google.com> Reviewed-by: Greg Schlomoff <gregschlom@google.com> Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org>
Charlie Lao 790e0162 2024-08-09T17:11:38 Vulkan: Add dirty range to VertexConversionBuffer class Previously, ConversionBuffer only has a boolean indicates it is dirty or not. This CL adds mDirtyRange to it to indicate which range of data has been modified. The existing dirty boolean has been changed to mEntireBufferDirty so that all the current code will still work. Right now mEntireBufferDirty is always set when we mark it dirty, which means entire buffer gets converted. mDirtyRange has not been used to reduce the data to be converted. Right now the range is always being merged to the existing range and not actually being used in this CL. It will be used in the next CL. Bug: b/357622380 Change-Id: Ibfa702b29011f4e26c511d5db85c07cbf2a4aefb Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5778347 Reviewed-by: Yuxin Hu <yuxinhu@google.com> Commit-Queue: Charlie Lao <cclao@google.com>
Charlie Lao 86508e20 2024-08-16T14:56:37 Vulkan: Make VertexConversionBuffer a class And wrap the cache key (i.e, formatID/stride/offset) into a CacheKey struct so that we can easily add more data members. This CL also changes ConversionBuffer from struct to class to have better encapsulation. No functional changes is expected here. Bug: b/357622380 Change-Id: Ieecf5c922b95a940137c8e54657ef3f458c55fc9 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5793921 Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Commit-Queue: Charlie Lao <cclao@google.com> Reviewed-by: Yuxin Hu <yuxinhu@google.com>
Charlie Lao 7080b766 2024-08-16T11:31:39 Vulkan: Move LineLoopHelper from vk to rx namespace There is no line loop support in vulkan. LineLoopHelper is a utility function for backend, not a helper function for vulkan object. So it is better fit in rx namespace instead of vk namespace. This also helps my next CL where I am going to change initBufferForVertexConversion to take a ConversionBuffer instead of BufferHelper. LineLoopHelper uses initBufferForVertexConversion, which means I have to change LineLoopHelper to uses ConversionBuffer. This causes header inclusion problem that now vk namespace object end up have to include rx namespace header. This CL fixes this inclusion problem by moving it to the proper namespace. Bug: b/357622380 Change-Id: I6d6cf1aa926f726bb1b1ab1017bcab092eaf5d37 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5787502 Reviewed-by: Yuxin Hu <yuxinhu@google.com> Commit-Queue: Charlie Lao <cclao@google.com>
Gowtham Tammana f8fc8ac3 2024-08-05T11:50:11 Vulkan: Remove dependency on ContextVk for CommandBufferHelper Following on the changes in [1], this makes the `CommandBufferHelperCommon` and `OutsideRenderPassCommandBufferHelper` interfaces independent of `ContextVk` state. Any dependency is made explicit. In addition, interfaces that are not specific to GLES context are also updated. [1]: Commit (bcf814fda5 Vulkan: Constrain the dependency on ContextVk in BufferHelper) Bug: angleproject:8544 Change-Id: I7d90ad915e8c14187ab5584453b9e8802bd91e2b Signed-off-by: Gowtham Tammana <g.tammana@samsung.com> Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5319147 Reviewed-by: Charlie Lao <cclao@google.com> Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org>
Charlie Lao 06ae828f 2024-07-31T16:42:54 Vulkan: Avoid breaking render pass for vertex buffer conversion Adreno driver does not support VK_FORMAT_FEATURE_VERTEX_BUFFER_BIT for VK_FORMAT_R8G8B8A8_USCALED. This cases we hit fallback code path using GPU to do conversion. The conversion buffer gets reused since all access are from GPU, but that causes render pass break since there is a WAR hazard on the conversion buffer. This CL adds the isRenderPassStartedAndUsesBuffer check and allocate a new conversion buffer if reuse current buffer may break the render pass. Bug: b/356473483 Change-Id: Iaa0b9235ba42787f0e3629f0d9174ae768456f8b Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5754324 Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Commit-Queue: Charlie Lao <cclao@google.com> Reviewed-by: Yuxin Hu <yuxinhu@google.com>
Charlie Lao 7c77bb75 2024-07-24T12:05:29 Vulkan: Remove implicit buffer barrier for shader write When app uses shaders to write to SSBO, right now we are inserting an implicit barrier to ensure WAW are in order. But Spec says that "Explicit synchronization is required to ensure that the effects of buffer and texture data stores performed by shaders will be visible to subsequent operations using the same objects". This CL removes the implicit barrier for buffer write if the current write comes from shaders and relies on explicit glMemoryBarrier to insert a global barrier. Bug: angleproject:350994515 Change-Id: I8ab039610be9be2ded27ea60dab54bdad08502f6 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5719258 Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Commit-Queue: Charlie Lao <cclao@google.com> Reviewed-by: Yuxin Hu <yuxinhu@google.com>
Charlie Lao bd3a3308 2024-07-22T14:03:20 Vulkan: Remove implicit image barrier for shader write When app uses compute or fragment shader to write to an image and makes multiple dispatchCompute or draw calls, right now we are inserting an implicit barrier to ensure WAW is hazard free. But Spec says that "Explicit synchronization is required to ensure that the effects of buffer and texture data stores performed by shaders will be visible to subsequent operations using the same objects". This CL records the bits from the last glMemoryBarrier call and will skip the barrier calls in ContextVk::updateActiveImages if there is no layout change, unless there is requirement from prior glMemoryBarrier. Bug: angleproject:350994515 Change-Id: I8bdeeb658993824369824aaa0f25cb4b6e3785f7 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5719024 Reviewed-by: Yuxin Hu <yuxinhu@google.com> Commit-Queue: Charlie Lao <cclao@google.com> Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org>
Shahbaz Youssefi 57202584 2024-07-26T13:07:44 Vulkan: Fix dispatch-after-closed-render-pass bug Bug: b/355567160 Change-Id: I4bc6acec53a50330507bfadcc0a4c1093366aae6 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5741786 Reviewed-by: mohan maiya <m.maiya@samsung.com> Reviewed-by: Yuxin Hu <yuxinhu@google.com> Reviewed-by: Charlie Lao <cclao@google.com>
Shahbaz Youssefi 7691cea7 2024-07-22T13:46:14 Vulkan: Remove seamful cubemap emulation Practically, the Vulkan backend is never expected to run on ES2 hardware. It _may_ for WebGL, but seamful cubemap emulation was disabled for webgl anyway. Bug: angleproject:354729454 Change-Id: Iafa20fbdbe232c4df4c777b12e7698ef7a87cf24 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5730143 Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org> Auto-Submit: Shahbaz Youssefi <syoussefi@chromium.org> Reviewed-by: Yuxin Hu <yuxinhu@google.com> Reviewed-by: Charlie Lao <cclao@google.com>
Shahbaz Youssefi 1db80b88 2024-07-10T12:47:42 Reland "Vulkan: Use VK_KHR_dynamic_rendering[_local_read]" This is a reland of commit c379ff48043a47e444c388c45270db40d3172d50 Original change's description: > Vulkan: Use VK_KHR_dynamic_rendering[_local_read] > > Bug: angleproject:42267038 > Change-Id: I1f4eb0f309992a9c1c287a69520dadf5eff23b26 > Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5637155 > Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org> > Reviewed-by: Amirali Abdolrashidi <abdolrashidi@google.com> > Reviewed-by: Charlie Lao <cclao@google.com> Bug: angleproject:42267038 Change-Id: I083e6963b5421386695e49a9872edbb2016c9763 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5691342 Reviewed-by: Charlie Lao <cclao@google.com> Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org> Reviewed-by: Amirali Abdolrashidi <abdolrashidi@google.com>
Shahbaz Youssefi 1f87cbc9 2024-07-15T13:07:35 Vulkan: Fix late-added resolve attachment tracking Resolve attachments may be added after the fact to a render pass due to glBlitFramebuffer or eglSwapBuffer. Previously, only the resolve image views were tracked by the render pass, and otherwise the state tracking (layout, content defined, etc) treated the resolve images as generically written-to by the render pass. As a result, the render pass was unable to finalize the layout of the resolve images early. Optimizing the layout of the swapchain image when the surface is multisampled for example was not done due to this issue. In this change, when resolve attachments are added late, they are tracked identically to when they are added at the beginning of the render pass, fixing the issues described above. Bug: angleproject:42265625 Bug: angleproject:42266019 Change-Id: I765560762bb8caf39ba1096fb028177201c082d7 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5707470 Reviewed-by: Charlie Lao <cclao@google.com> Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org>
Shahbaz Youssefi 584fbcee 2024-07-10T12:43:34 Vulkan: Rework swap-time barrier logic Avoids unnecessary transitions when overlay is enabled Bug: angleproject:42267038 Change-Id: I0534911c0142c5e94cf3be112283fb98fcde0f6c Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5691346 Reviewed-by: Amirali Abdolrashidi <abdolrashidi@google.com> Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org> Reviewed-by: Charlie Lao <cclao@google.com>
Charlie Lao 9ca3ed37 2024-07-08T16:48:51 Vulkan: Let ContextVk::onResourceAccess uses retainImage Right now ContextVk::onResourceAccess calls retainResource for everything. Mean time we also have a retainImage() function, which adds a bit confusion to why we have two retain API. This CL moves retainImage from CommandBufferHelperCommon to OutsideRenderPassCommandBufferHelper and RenderPassCommandBufferHelper so that ContextVk::onResourceAccess can use retainImage directly. The slightly behavior difference between RenderPassCommandBufferHelper and OutsideRenderPassCommandBufferHelper's retainImage is from compute shader's image access, which we are using VkEvent to track images, mainly due to we tailor VkEvent to the manhattan's usage case, which involves compute. Bug: b/336844257 Change-Id: Id3fb694f683289a4720cc279387dbc27642745de Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5686352 Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Commit-Queue: Charlie Lao <cclao@google.com>
Shahbaz Youssefi 7d461b21 2024-07-10T14:11:53 Revert "Vulkan: Use VK_KHR_dynamic_rendering[_local_read]" This reverts commit c379ff48043a47e444c388c45270db40d3172d50. Reason for revert: Regresses CPU perf and memory when _not_ using DR Original change's description: > Vulkan: Use VK_KHR_dynamic_rendering[_local_read] > > Bug: angleproject:42267038 > Change-Id: I1f4eb0f309992a9c1c287a69520dadf5eff23b26 > Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5637155 > Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org> > Reviewed-by: Amirali Abdolrashidi <abdolrashidi@google.com> > Reviewed-by: Charlie Lao <cclao@google.com> Bug: angleproject:42267038 Change-Id: I3865f0d86813f0eeb9085a92875a33bd449b907f Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5691337 Bot-Commit: Rubber Stamper <rubber-stamper@appspot.gserviceaccount.com> Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org>
Shahbaz Youssefi c379ff48 2024-06-10T22:01:57 Vulkan: Use VK_KHR_dynamic_rendering[_local_read] Bug: angleproject:42267038 Change-Id: I1f4eb0f309992a9c1c287a69520dadf5eff23b26 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5637155 Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org> Reviewed-by: Amirali Abdolrashidi <abdolrashidi@google.com> Reviewed-by: Charlie Lao <cclao@google.com>
Shahbaz Youssefi 2a87db69 2024-07-06T22:20:40 Vulkan: Remove unused render pass closure reason Bug: angleproject:42266019 Change-Id: I1c516b88677d7c9d3e97e9fd7525cf727be50cc3 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5678940 Reviewed-by: Amirali Abdolrashidi <abdolrashidi@google.com> Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org> Auto-Submit: Shahbaz Youssefi <syoussefi@chromium.org>
Shahbaz Youssefi 3f572905 2024-06-19T17:46:38 Add basic begin/end support for perf counters The AMD_performance_monitor extension has explicit begin/end calls to capture counters. This was not implemented in ANGLE and the tests were relying on ANGLE always capturing counters (incurring a small overhead). This change does not complete the implementation of that extension, but does add basic support for starting and stopping perf counter measurements. While inactive, most counters are not updated. Bug: angleproject:42267038 Change-Id: I3ff6448b22ca247c217401cb2d76ef4142c9d759 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5639343 Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org> Reviewed-by: Cody Northrop <cnorthrop@google.com> Reviewed-by: Amirali Abdolrashidi <abdolrashidi@google.com>
Amirali Abdolrashidi a1dea207 2024-06-13T11:40:39 Implement KHR_blend_equation_advanced_coherent * Updated the validation for glBlendBarrier() and ~KHR(). * GL state now includes mBlendAdvancedCoherent. * Updated glEnable() to accept GL_BLEND_ADVANCED_COHERENT_KHR. * It can be queried via glGetIntegerv(), etc. * EXTENDED_DIRTY_BIT_BLEND_ADVANCED_COHERENT added. * Added a corresponding bit to ExternalContextState. * If coherence is supported, there should be no need to use blend barriers, as, based on the spec, the advanced blend ops should then follow order like the basic blend ops. * Relevant tests: *GLES31.functional.blend_equation_advanced.coherent* * On Linux/NVIDIA: From "Not supported" to "Passed" Bug: angleproject:42262258 Change-Id: I7e0e43bdc71524eec111c2d3b024fe73c9795e55 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5634381 Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Commit-Queue: Amirali Abdolrashidi <abdolrashidi@google.com> Reviewed-by: Cody Northrop <cnorthrop@google.com> Reviewed-by: Geoff Lang <geofflang@chromium.org>
Shahbaz Youssefi d193d51b 2024-06-17T22:46:08 Replace issue ids post migration to new issue tracker This change replaces anglebug.com/NNNN links. Bug: None Change-Id: I8ac3aec8d2a8a844b3d7b99fc0a6b2be8da31761 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5637912 Reviewed-by: Geoff Lang <geofflang@chromium.org> Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org>
Mohan Maiya 5703bd61 2024-06-14T14:12:41 Vulkan: Further optimize ProgramExecutableVk::resetLayout 1. Handle compute pipelines similar to how we handle graphics pipelines 2. Track valid compute pipeline permutations Bug: angleproject:8297 Change-Id: I58200517e5a44a2b3092777ea24d1529ceee00f5 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5634574 Reviewed-by: Charlie Lao <cclao@google.com> Commit-Queue: mohan maiya <m.maiya@samsung.com> Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org>
Roman Lavrov 15c182f9 2024-06-11T09:47:07 Vulkan: remove deferFlushUntilEndRenderPass feature, always on This only applies to Qualcomm chipsets, the feature was already enabled for all other devices. It was previously causing a manhattan 3.0 perf regression on some Qualcomm devices, but my tests on S24 both with ANGLE trace manhattan_31 and running gfxbench manually do not show any obvious regression. It was also not expected that this would result in a regression. As we do not aim to improve perf on older devices, removing the feature altogether so that defers are always enabled. This change resulted in a change in gold images on these traces on pixel 4 bots: pokemon_masters_ex - text was missing and now is rendered street_fighter_iv_ce_frame86 - shadow was missing and now is rendered So it looks like the feature may have been working incorrectly. Bug: b/346378481 Change-Id: I2b0d15b89e11c67dea7c316a42bc807441c43b0a Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5622115 Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Reviewed-by: Charlie Lao <cclao@google.com>
Yuxin Hu 9c391154 2024-06-06T18:24:18 Vulkan: Do not apply advanced blend emulation when blend is disabled The emulateAdvancedBlendEquations code path does not check if GL_BLEND is disabled. This CL adds the check so that the blend is not applied when we disable the GL_BLEND. This CL also adds an updateAdvancedBlendEquations() when DIRTY_BIT_BLEND_ENABLED bit is set. This ensures DIRTY_BIT_DRIVER_UNIFORMS bit is set when GL_BLEND state changes, meaning we will regenerate the uniforms if GL_BLEND state changes: GL_BLEND is enabled: pass the advanced blend equations to the uniforms; GL_BLEND is disabled: do not pass the advanced blend equations to the unforms Bug: b/345581214 Change-Id: I5708a4051647bc29b5b38a027e836f5bf717d1d5 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5605109 Auto-Submit: Yuxin Hu <yuxinhu@google.com> Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Reviewed-by: Charlie Lao <cclao@google.com> Commit-Queue: Yuxin Hu <yuxinhu@google.com>
Charlie Lao 295ff607 2024-06-05T14:49:33 Vulkan: Precompute stageMask of kImageMemoryBarrierData Right now every time we need a pipelineStage in kImageMemoryBarrierData, we are doing a bitwise AND with mSupportedVulkanPipelineStageMask. This get called multiple times from barrier call. This CL adds mImageLayoutAndMemoryBarrierDataMap that has already precomputed all stageMask, thus avoid run time bitwise OR. This CL also precomputes the bufferWritePipelineStageMask so that flushImpl can be use it without construct every time. Bug: b/345279810 Change-Id: I878bd31c967cd217477061976f07df13b043fa7f Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5601073 Commit-Queue: Charlie Lao <cclao@google.com> Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Reviewed-by: Amirali Abdolrashidi <abdolrashidi@google.com>
Igor Nazarov d0280f09 2024-06-03T13:21:06 Cleanup Program Executable post link task wait Frontend changes: - Add `ASSERT` into `ProgramExecutable::waitForPostLinkTasks()` to check that `mPostLinkSubTasks` must be empty after backend wait. - Add `UNIMPLEMENTED` into `ProgramExecutableImpl` `waitForPostLinkTasks()`, because this method is only required if backend fills `postLinkSubTasksOut` in `LinkTask`, and frontend must not call this method if `mPostLinkSubTasks` are empty. Vulkan backend changes: - Add public `ProgramExecutableVk::waitForComputePostLinkTasks()` with ASSERT to only allow usage with Compute executable. This change avoids confusion calling `waitForPostLinkTasksIfNecessary()` with `nullptr` in case of the Compute pipelines, which will always wait. - Rename `waitForPostLinkTasksIfNecessary()` to `waitForGraphicsPostLinkTasks()`, add ASSERT to only allow usage with Graphics executable, and change optional pointer to the `GraphicsPipelineDesc` to the reference. - Add `WaitableEvent::AllReady()` check into `ProgramExecutableVk` `waitForGraphicsPostLinkTasks()` when going to skip waiting, in order to process tasks (by calling wait) when all tasks are ready. Without this change, post link task may never be processed, causing repeated `GraphicsPipelineDesc` comparisons. - Replace `GraphicsPipelineDesc` hash comparison in `waitForGraphicsPostLinkTasks()` with `keyEqual()` call. This method is around 7 times faster, however effect on the overall performance will likely be unmeasurable. Changed for clarity. - Remove unnecessary `mWarmUpGraphicsPipelineDesc` reset from: `ProgramExecutableVk` initializer list (has default constructor), Compute warmup (already clean after constructor), wait post link tasks (not used when no tasks). - Other minor changes. Bug: angleproject:8297 Change-Id: I7d790da6712be013243083e314af75f82e73886d Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5592474 Reviewed-by: mohan maiya <m.maiya@samsung.com> Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Commit-Queue: Igor Nazarov <i.nazarov@samsung.com>
Roman Lavrov 818915a9 2024-05-27T10:53:35 PPO: Propagate dirty uniforms via mPPOProgramExecutables This eliminates the need for ProgramUniformUpdated notification Update mDefaultUniformBlocksDirty rather than just add a check to hasDirtyUniforms, as mDefaultUniformBlocksDirty can then be used elsewhere (for example, calcUniformUpdateRequiredSpace) Also adds ProgramExecutable.mIsPPO flag for an easy check for PPO without inspecting mPPOProgramExecutables Bug: angleproject:8666 Bug: b/335295728 Change-Id: I7725d02460104997df5c89a54d0e5ef3c3079946 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5569184 Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org>
Charlie Lao 34b832a3 2024-05-24T13:38:58 Vulkan: Add RefCountedEvent recycler Previously the recycler was disabled due to race between resetEvent and setEvent. This CL splits mFreeStack into two list: mEventsToReset and mEventsToReuse. Events are first added to mEventsToReset list. Then at OutsideRenderPassCommandBufferHelper::flushToPrimary time, VkCmdResetEvents are added to reset all events in mEventsToReset list, and that reset operation is tracked by mResettingQueue. When reset command is completed, events moved into mEventsToReuse list. Since access to renderer's RefCountedEventRecycler requires lock, RefCountedEventCollector (a queue of events) is passed between ShareGroupVk and renderer's recycler to minimize the locked access. Bug: b/336844257 Change-Id: Iffac095729a81ba65a43df68cc9255d76e4be7c9 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5576757 Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Commit-Queue: Charlie Lao <cclao@google.com> Reviewed-by: Amirali Abdolrashidi <abdolrashidi@google.com>
Igor Nazarov 3467f0f2 2024-05-20T16:24:53 Vulkan: Cleanup QCOM foveated rendering extensions support Change removed `ContextVK::updateFoveatedRendering()` method because of the following: - Modifying `mGraphicsPipelineDesc` without also updating `mGraphicsPipelineTransition` may case usage of incorrect PSO. - Despite of the above, there is no bug, because the update itself is redundant. In all cases, where `updateFoveatedRendering()` was called, there also will be `onDrawFramebufferRenderPassDescChange()` call, that will call `mGraphicsPipelineDesc->updateRenderPassDesc()`. - The `onDrawFramebufferRenderPassDescChange()` will also call `invalidateCurrentGraphicsPipeline()` therefore, there is no need for this call in the `updateFoveatedRendering()`. - In all cases, the `onRenderPassFinished()` is called before `updateFoveatedRendering()`, therefore, there is no need to set `DIRTY_BIT_RENDER_PASS` bit. - All of the above made `updateFoveatedRendering()` completely redundant (maybe except for the ASSERT). Change also removed `mRenderPassDesc` update from `FramebufferVk::updateFoveationState()`. Note: similar update may be also removed when handling `shouldUpdateSrgbWriteControlMode`. Also fixes possible `mFoveationState` and `mCurrentFramebufferDesc` desynchronization in case if `updateFragmentShadingRateAttachment()` fails. Bug: angleproject:8484 Change-Id: If453bb6691e47aac3c11d0a5a6df696e885b64cb Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5573395 Reviewed-by: Charlie Lao <cclao@google.com> Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Commit-Queue: Igor Nazarov <i.nazarov@samsung.com>
Roman Lavrov a3057eed 2024-05-27T14:48:51 Vulkan: DIRTY_BIT_DESCRIPTOR_SETS in handleDirtyUniformsImpl For consistency between graphics and compute handling Bug: angleproject:7103 Change-Id: If6db0739d2f75d9e8e77bf88a05466e56d165a0a Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5574006 Commit-Queue: Roman Lavrov <romanl@google.com> Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org>
Charlie Lao 1a9a703b 2024-05-21T11:14:40 Vulkan: Add DeviceQueueIndex to Context/BufferHelper/ImageHelper This CL adds a utility class DeviceQueueIndex, which encapsulates queueFamilyIndex and the queueIndex into one integer value so that we can pass around to barrier function. vk::Context and BufferHelper and ImageHelper class now keeps mCurrentDeviceQueueIndex instead of mCurrentQueueFamilyIndex. For All contexts by default it gets the default queue from renderer (which is always the one corresponding to Medium priority). For ContextVk, when priority changes it update mCurrentDeviceQueueIndex to match new context priority. Bug: b/337135577 Change-Id: I62cc483cfdb3e974d38db074e671c57299300074 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5555903 Reviewed-by: Yuxin Hu <yuxinhu@google.com> Commit-Queue: Charlie Lao <cclao@google.com> Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org>
Charlie Lao b22cce5f 2024-05-21T10:55:27 Vulkan: Remove Renderer::getDeviceQueueIndex Renderer::getDeviceQueueIndex() returns queueFamilyIndex. There is a function that already returns mCurrentQueueFamilyIndex, so this function is now removed. This CL also renames ImageHelper::isQueueChangeNeccesary to isQueueFamilyChangeNeccesary Bug: b/337135577 Change-Id: I3cd9ded1414d1389e162aaa5399c231a987f871e Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5553067 Commit-Queue: Charlie Lao <cclao@google.com> Reviewed-by: Yuxin Hu <yuxinhu@google.com> Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org>
Charlie Lao 0636b509 2024-05-06T12:36:20 Vulkan: Move RefCountedEvent GC and recycler to ShareGroupVk (2/3) One of the problem we had with RefCountedEvents is CPU overhead comes with it, and some part of the CPU overhead is due to atomic reference counting. The RefCountedEvents are only used by ImageHelper and ImageHelpers are per share group, so they are already protected by front end context share lock. The only reason we needs atomic here is due to garbage cleanup, which runs in separate thread and will decrement the refCount. The idea is to move that garbage list from RendererVk to ShareGroupVk so that access of RefCountedEvents are all protected already, thus we can remove the use of atomic. The down side with this approach is that a share group will hold onto its event garbage and not available for other context to reuse. But VkEvents are expected to be very light weighted objects, so that should be acceptable. This is the second CL in the series. In this CL, we added RefCountedEventsGarbageRecycler to the ShareGroupVk which is responsible to garbage collect and recycle RefCountedEvent. Since most of ImageHelper code have only access to Context argument, for convenience we also stored the RefCountedEventsGarbageRecycler pointer in the vk::Context for easy access. vk::Context argument is also passed to RefCounteEvent::init and release function so that it has access to the recycler. The garbage collection happens when RefCountedEvent is needed. The per renderer recycler is still kept to hold the RefCounteEvents that gets released from ShareGroupVk or when it is released without access to context information. Bug: b/336844257 Change-Id: I36fe5d1c8dacdbe35bb2d380f94a32b9b72bbaa5 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5529951 Commit-Queue: Charlie Lao <cclao@google.com> Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Reviewed-by: Amirali Abdolrashidi <abdolrashidi@google.com>
Charlie Lao 2e0aefe9 2024-05-06T11:03:19 Vulkan: Move RefCountedEvent GC and recycler to ShareGroupVk (1/3) One of the problem we had with RefCountedEvents is CPU overhead comes with it, and some part of the CPU overhead is due to atomic reference counting. The RefCountedEvents are only used by ImageHelper and ImageHelpers are per share group, so they are already protected by front end context share lock. The only reason we needs atomic here is due to garbage cleanup, which runs in separate thread and will decrement the refCount. The idea is to move that garbage list from RendererVk to ShareGroupVk so that access of RefCountedEvents are all protected already, thus we can remove the use of atomic. The down side with this approach is that a share group will hold onto its event garbage and not available for other context to reuse. But VkEvents are expected to be very light weighted objects, so that should be acceptable (If not, we can add some limit to the number of events it can hold in the garbage list). This is the first CL in the series. Before this CL, the RefCounteEvents are garbage collected at flushToPrimrary time, at which time we have lost ContextVk information. In order for us to do garbage collect to ShareGroupVk, we need to move the garbage collection process early, before command buffers leaving ContextVk's visibility. For OutsideRenderPassCommands, this is easy to do, we just call flushSetEvents before we call mRenderer->flushRenderPassCommands. For RenderPassCommands, that flushSetEvents call will simply make another copy of RefCountedEvents and add to the garbage list and the actual VkCmdSetEvents are defered at the executeSetEvents call that get called from flushToPrimrary time. Bug: b/336844257 Change-Id: I1948cd8240ff61d407931083b7584a54b1dc6b0d Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5517891 Reviewed-by: Amirali Abdolrashidi <abdolrashidi@google.com> Commit-Queue: Charlie Lao <cclao@google.com> Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org>
Charlie Lao 5332ab8c 2024-05-10T19:48:34 Vulkan: Add RefCountedEventRecycler to vk::Renderer This CL adds event recycler in vk::Renderer to avoid the constant create and destroy of VkEvents. When RefCountedEvent is destroyed previously, it now goes into per renderer recycler. When RefCountedEvent is created previously, it now dips into this recycler and fetch it. Before we issue VkCmdSetEvent, if this event was from recycler, we also issue VkCmdResetEvent before VkCmdSetEvebt. When glFinish/EGLSwapBuffer is called or context gets destroyed, this recycler is purged to keep the free count under limit. Bug: b/336844257 Change-Id: I92ec1b183f708112a96c3d06fcfa265024f5aa04 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5519174 Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Commit-Queue: Charlie Lao <cclao@google.com> Reviewed-by: Amirali Abdolrashidi <abdolrashidi@google.com>
Mohan Maiya 1202055c 2024-05-09T11:25:38 Vulkan: Updates to perf counters 1. don't reset pipeline cache hit / miss counters after every frame 2. bugfix in LinkTaskVk::getResult(...) 3. accumulate counters in WarmUpTaskCommon::getResultImpl(...) Bug: angleproject:8297 Change-Id: I39d7e1a438d78e2e353c5cf8246fb24fb3cfefea Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5529942 Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Commit-Queue: mohan maiya <m.maiya@samsung.com>
Charlie Lao a96e9197 2024-04-25T10:35:02 Vulkan: Add RefCountedEvent class and VkCmdSetEvent call This CL defines RefCountedEvent class that adds reference counting to VkEvent. CommandBufferHelper and ImageHelper each holds one reference count to the event. Every time an event is added to the command buffer, the corresponding RefCountedEvent will be added to the garbage list which tracks the GPU completion using ResourceUse. That event garbage's reference count will not decremented until GPU is finished, thus ensures we never destroy a VkEvent until GPU is completed. For images used by RenderPassCommands, As RenderPassCommandBufferHelper::imageRead and imageWrite get called, an event with that layout gets created and added to the image. That event is saved in RenderPassCommandBufferHelper::mRefCountedEvents and that VkCmdSetEvents calls are issued from RenderPassCommandBufferHelper::flushToPrimary(). For renderPass attachments, the events are created and added to image when attachment image gets finalized. For images used in OutsideRenderPassCommands, The events are inserted as needed as we generates commands that uses image. We do not wait until commands gets flushed to issue VkCmdSetEvent calls. A convenient function trackImageWithEvent() is added to create and setEvent and add event to image all in one call. You can add this call after the image operation whenever we think it benefits, which gives us better control. (Note: Even if forgot to insert the trackImageWithEvent call, it is still okay since every time barrier is inserted, the event gets released. Next time when we inserts barrier again we will fallback to pipelineBarrier since there is no event associated with it. But that is next CL's content). This CL only adds the VkCmdSetEvent call when feature flag is enabled. The feature flag is still disabled and no VkCmdWaitEvent is used in this CL (will be added in later CL). Bug: b/336844257 Change-Id: Iae5c4d2553a80f0f74cd6065d72a9c592c79f075 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5490203 Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Reviewed-by: Geoff Lang <geofflang@chromium.org> Commit-Queue: Charlie Lao <cclao@google.com>
Shahbaz Youssefi e4a12a67 2024-05-02T13:25:53 Vulkan: Dynamic depth test + static depth write When depth test changes in such a situation, the static state is still affected (because we mask depth write with depth test) so the graphics pipeline still needs to be invalidated. Bug: b/336386662 Change-Id: Iebba79ffd7d6fa3962a5b20c27efcca3aa35b10a Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5511602 Commit-Queue: Yuxin Hu <yuxinhu@google.com> Auto-Submit: Shahbaz Youssefi <syoussefi@chromium.org> Reviewed-by: Yuxin Hu <yuxinhu@google.com> Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org>
Gowtham Tammana 298b8739 2024-04-15T13:58:01 Vulkan: Restrict the ContextVk dependency in CommandBufferHelper Updating the interfaces that have no need for ContextVk state and instead passing in the vk::Context. Bug: angleproject:8544 Signed-off-by: Gowtham Tammana <g.tammana@samsung.com> Change-Id: Id3b72d9eabb7d1d6ee89c46cdc24a23da9e32b5c Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5492319 Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org> Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org>
Shahbaz Youssefi 80c8b6f0 2024-04-17T10:06:45 Revert "Vulkan: Only enable DS dynamic state if there is DS attachment." This reverts commit 471b50407d7d1c22491d066df77060cb8b9b2f89. The reverted change does not correctly handle UtilsVk functions, leading to validation failures. UtilsVk could be made to not set dynamic state when the depth/stencil attachments are missing, but instead the change is reverted because: - The original issue that prompted this is easily fixable (and fixed in this change) - Disabling depth/stencil dynamic state is not necessarily a performance improvement; every time a pipeline in such a render pass is bound, the driver would have to make sure to no-op the relevant state change if static, which is also costly. Instead, dynamic state may need to be set only once in the entire render pass. Bug: b/223456677 Bug: b/315353258 Bug: angleproject:8242 Change-Id: I8282b87857d6b9285dbcf307c3c6ecf69df5fadb Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5462079 Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org> Reviewed-by: Charlie Lao <cclao@google.com> Reviewed-by: Yuxin Hu <yuxinhu@google.com>
Mohan Maiya 924b40dc 2024-04-04T18:44:21 Selectively wait for post-link tasks in the frontend The frontend waits for post-link tasks only for a relink or in syncState when `disableProgramCaching` feature is not enabled. Bug: angleproject:8297 Change-Id: If7a3b8a10a2d01f82fd2bebac5c8f378be56e19e Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5427001 Reviewed-by: Charlie Lao <cclao@google.com> Commit-Queue: mohan maiya <m.maiya@samsung.com> Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org>
Shahbaz Youssefi 103c1b53 2024-03-29T14:37:23 Vulkan: Drop MSRTT emulation dependency on independentResolveNone Usage of VK_RESOLVE_MODE_NONE was removed in [1], but dependency to this property was accidentally added in [2]. [1]: https://chromium-review.googlesource.com/c/angle/angle/+/2743666 [2]: https://chromium-review.googlesource.com/c/angle/angle/+/3353895. Bug: angleproject:4836 Change-Id: I25028b5d343686edd794acdac3714c4a6cb5fa17 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5407073 Auto-Submit: Shahbaz Youssefi <syoussefi@chromium.org> Reviewed-by: Charlie Lao <cclao@google.com> Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org> Reviewed-by: Yuxin Hu <yuxinhu@google.com>
Mohan Maiya e4fe461f 2024-04-01T16:41:37 Vulkan: Apply mask during transition search When checking the transition cache for the shaders subset, mask transition bits with kShadersTransitionBitsMask Bug: angleproject:7369 Change-Id: Ic8e4ad00312d5e601dbfc0d84bbc76e809358427 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5410940 Commit-Queue: mohan maiya <m.maiya@samsung.com> Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org>
Shahbaz Youssefi b4cf07c3 2024-03-27T15:58:04 Vulkan: Move the interface pipeline library caches to share group When linking libraries into a pipeline, the linked pipeline lives in ProgramExecutableVk and may be shared between contexts in a share group. The caches for the vertex input and fragment output libraries thus cannot live in the context, but should remain alive until all contexts in the share group are destroyed. This change moves these caches to the share group. Bug: angleproject:8629 Change-Id: I2f7edf44d676505cf5e7e24640c6850c67f8b5e3 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5401514 Commit-Queue: Charlie Lao <cclao@google.com> Reviewed-by: Charlie Lao <cclao@google.com> Auto-Submit: Shahbaz Youssefi <syoussefi@chromium.org>
Shahbaz Youssefi c71a67de 2024-03-27T15:50:00 Vulkan: Move pipeline cache graph dump to renderer In preparation for moving some caches to the share group. Bug: angleproject:6565 Bug: angleproject:8629 Change-Id: I1a06a18417502e499da0edb9abb0d510e3ad99ce Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5401513 Auto-Submit: Shahbaz Youssefi <syoussefi@chromium.org> Reviewed-by: Charlie Lao <cclao@google.com> Commit-Queue: Charlie Lao <cclao@google.com> Reviewed-by: mohan maiya <m.maiya@samsung.com>
Shahbaz Youssefi 9475ac40 2023-11-15T10:25:06 Vulkan: Make efficient MSAA resolve possible Prior to this change, using a resolve attachment to implement resolve through glBlitFramebuffer was done by temporarily modifying the source FramebufferVk's framebuffer description. This caused a good deal of complexity; enough to require the render pass to be immediately closed after this optimization. The downsides to this are: - Only one attachment can be efficiently resolved - There is no chance for the MSAA attachment to be invalidated In this change, resolve attachments that are added because of glBlitFramebuffer are stored in the command buffer, with the FramebufferVk completely oblivious to them. When the render pass is closed, either the FramebufferVk's original framebuffer object is used (if no resolve attachments are added) or a temporary one is created to include those resolve attachments. With the above method, the render pass is able to accumulate many resolve attachments as well as have its MSAA attachments be invalidated before it is flushed. For a FramebufferVk that is resolved in this way, there used to be two framebuffers created each time and thrown away as the code alternated between starting a render pass without a resolve attachment and then closing with one. With this change, there is now one framebuffer (without resolve attachments) that is cached in FramebufferVk (and is not recreated every time), and only the framebuffer with resolve attachments is recreated every time. Ultimatley, when VK_KHR_dynamic_rendering is implemented in ANGLE, there would be no framebuffers to create and destroy, and this change paves the way for that support too. WindowSurfaceVk framebuffers are still imagefull. Making them imageless adds unnecessary complication with no benefit. ----------------- To achieve efficient MSAA rendering on tiling hardware, applications should do the following: ``` glBindFramebuffer(GL_FRAMEBUFFER, msaaFBO); // Clear the framebuffer to avoid a load // Or invalidate, if not needed to load: // glInvalidateFramebuffer(GL_DRAW_FRAMEBUFFER, ...); glClear(...); // Draw calls // Resolve into the single sampled framebuffer glBindFramebuffer(GL_DRAW_FRAMEBUFFER, resolveFBO); glBlitFramebuffer(...); // Immediately discard the contents of the MSAA buffer, to avoid store glInvalidateFramebuffer(GL_READ_FRAMEBUFFER, ...); ``` The above would translate to the following Vulkan render pass: - MSAA LOAD_OP_CLEAR/DONT_CARE - MSAA STORE_OP_DONT_CARE - Resolve LOAD_OP_DONT_CARE - Resolve STORE_OP_STORE This makes sure the MSAA data doesn't leave the tile memory and greatly reduces bandwidth usage. Once anglebug.com/4892 is fixed, this would also allow the MSAA image to never be allocated either. Bug: angleproject:7551 Bug: angleproject:8625 Change-Id: Ia9f4d20863d76a013d8495033f95c7b39f77e062 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5388492 Reviewed-by: Yuxin Hu <yuxinhu@google.com> Reviewed-by: Amirali Abdolrashidi <abdolrashidi@google.com> Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org>
Shahbaz Youssefi 914fe61b 2024-03-15T13:20:49 Vulkan: Rename RendererVk.* to vk_renderer.* Done in a separate CL from the move to namespace vk to avoid possible rebase-time confusion with the file name change. Bug: angleproject:8564 Change-Id: Ibab79029834b88514d4466a7a4c076b1352bc450 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5370107 Reviewed-by: Amirali Abdolrashidi <abdolrashidi@google.com> Commit-Queue: Amirali Abdolrashidi <abdolrashidi@google.com>
Mohan Maiya 21d124c4 2024-03-16T10:06:02 Vulkan: Remove support for pipeline cache control For current and upcoming uses for pipeline caches the benefits of having an externally synchronized pipeline cache is minimal at best. Remove support for that and have all caches be internally synchronized by the Vulkan driver. Bug: angleproject:8601 Change-Id: Ic5d9740934641f61b527094aa301e27302b02a57 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5375102 Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org> Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org>
Shahbaz Youssefi 60aaf4a0 2024-03-14T12:58:56 Vulkan: Move renderer to namespace vk This class is agnostic of EGL. This change moves it to namespace vk for use with the OpenCL implementation Bug: angleproject:8564 Change-Id: I57f7807d6af8b3d5d7f8efbaf8b5d537a930f881 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5371324 Reviewed-by: Austin Annestrand <a.annestrand@samsung.com> Reviewed-by: Geoff Lang <geofflang@chromium.org> Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org>
Geoff Lang e5cb7f1f 2024-03-12T16:06:37 Vulkan: Fix access to inactive attributes ... within range of active ones. Since a buffer is bound for inactive attributes, it must be considered accessed. Ultimately, the nullDescriptor feature could be used to avoid binding a buffer for inactive attributes. Bug: chromium:327807820 Change-Id: Ieceea9442310c23568c47cef7357b4094b7ebbb4 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5369336 Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org> Reviewed-by: Geoff Lang <geofflang@chromium.org>
Mohan Maiya 9bae5859 2024-03-13T10:55:18 Vulkan: Add blend factors to allow dithering to work Previously, we had - src: GL_SRC_ALPHA, dst: GL_ONE_MINUS_SRC_ALPHA Now, this adds - src: GL_ONE, dst: GL_ONE_MINUS_SRC_ALPHA This showed up in app "com.inertiasoftware.justjigsaws". Bug: b/328837151 Change-Id: I88208b1ed4dd050283d8d02cf31ccdcb3d02a444 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5369638 Commit-Queue: mohan maiya <m.maiya@samsung.com> Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org>
Mohan Maiya 91ddf851 2024-03-03T10:57:22 Vulkan: support QCOM foveated rendering extensions Add support for foveated rendering in the vulkan backend. This is done by leveraging the VK_KHR_fragment_shading_rate extension. Bug: angleproject:8484 Change-Id: I0d01d07583f710b2302ea07b19c9d113c73bfe41 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5269907 Reviewed-by: Charlie Lao <cclao@google.com> Commit-Queue: mohan maiya <m.maiya@samsung.com> Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org>
Mohan Maiya b2773c11 2024-03-01T11:24:44 Vulkan: Bug fix in immutable sampler pipeline layout recreation An immutable sampler is tied to a sampler index and changing sampler uniform location value should force a recreation of the pipeline layout Bug: b/155487768 Bug: angleproject:5033 Bug: angleproject:5773 Tests: Texture2DTestES3.TexStorage2DMultipleYuvSamplersSwitch*Vulkan Change-Id: I82aaed332d7f87f11a2fd4923cfc004403ff0bd2 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/3657480 Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Commit-Queue: mohan maiya <m.maiya@samsung.com> Reviewed-by: Charlie Lao <cclao@google.com>
Shahbaz Youssefi 545e3f6e 2024-03-01T23:27:03 Vulkan: Decouple RendererVk from egl::BlobCache The new vk::GlobalOps class abstracts access to egl::BlobCache. This is a step towards decoupling RendererVk from egl::Display for direct use with OpenCL. Bug: angleproject:8564 Change-Id: I7b3910254430df74b889759639da1749735584a7 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5332082 Reviewed-by: Geoff Lang <geofflang@chromium.org> Commit-Queue: Geoff Lang <geofflang@chromium.org> Auto-Submit: Shahbaz Youssefi <syoussefi@chromium.org>
Mohan Maiya b978974d 2024-03-03T10:48:48 Update frontend support for QCOM foveated extensions Modifications to frontend support - 1. EXTENDED_DIRTY_BIT_FOVEATED_RENDERING is removed 2. New framebuffer attachment API - getFoveationState 3. Attachment type restriction for foveated rendering is removed 4. Addition of new test - RenderbufferAttachmentClearThenDraw Bug: angleproject:8484 Change-Id: I699cbed81346c9a6344c4ff36afa51d6cc1bf052 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5338529 Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org>
Shahbaz Youssefi 4e6fe5e0 2024-02-29T15:01:06 Vulkan: Cache ImageLoadContext in context This avoids the need to requery this from the display every time. Bug: angleproject:8564 Change-Id: Ied650e7789741f59b7662c0f97c55132b105778d Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5332074 Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org> Reviewed-by: Charlie Lao <cclao@google.com> Reviewed-by: Geoff Lang <geofflang@chromium.org>
Amirali Abdolrashidi 2ee295b4 2024-02-15T11:27:39 Vulkan: Add per-level image update tracker * Add a per-level image write tracker to ImageHelper. * It tracks the updates scheduled for different parts of the image. Within each level, it also tracks different layers, currently up to 64. * kMaxParallelSubresourceUpload renamed to kMaxParallelLayerWrites; moved to vk_helper header. * It is reset when a barrier is issued for the image. * Modified ImageHelper::recordWriteBarrier(). * Added isWriteBarrierNecessary(). * Now it checks the added writes for the image. It will no longer issue a barrier if the image is in the same layout and there is no write to a part of the image to which was previously written. * Added ReadImageSubresources to CommandBufferAccess. * It is used for layouts that allow both reading and writing to the image (including self-copy): * TransferSrcDst (used in CopyImageSubData) * ComputeShaderWrite (used in compute-based mipmap generation) * CommandBufferImageWrite -> CommandBufferImageSubresourceAccess * Updated onImageSelfCopy() args to include read subresource data. * Improves gpu_time for TextureUploadETC2TranscodingBenchmark perf test * Windows/NVIDIA: ~180609 ns -> ~62669 ns (~2.88x) * Linux/NVIDIA: ~157283 ns -> ~93360 ns (~1.68x) * Windows/Intel: ~72297 ns -> ~57153 ns (~1.27x) * Added a test to show that self-copy for a write-after-read works. * ArraySelfCopyImageSubDataWithWriteAfterRead * (ArraySelfCopyImageSubData covers RAW hazards; renamed) Bug: b/308455694 Change-Id: I5cef296d991ce6ec02792edc3ffc5cc4994831e1 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5301855 Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Reviewed-by: Charlie Lao <cclao@google.com> Commit-Queue: Amirali Abdolrashidi <abdolrashidi@google.com>
Gowtham Tammana bcf814fd 2024-02-02T10:30:34 Vulkan: Constrain the dependency on ContextVk in BufferHelper Make the BufferHelper interface be not dependent on ContextVk state. This makes the interface to be suitable for implementation of other APIs with Vulkan backend. Any dependency on ContextVk is made explicit and handled in ContextVk. Bug: angleproject:8544 Change-Id: I8b285f54c8758a26dd7edf27b1371f9afcf7e241 Signed-off-by: Gowtham Tammana <g.tammana@samsung.com> Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5303573 Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org> Reviewed-by: Geoff Lang <geofflang@chromium.org>
Mohan Maiya 3ca8befb 2024-02-14T12:35:08 Vulkan: Handle multi-context apps in pipeline cache graphs Append a monotonically increasing counter to filename so apps and benchmarks with multiple contexts don't clobber each others files. Bug: angleproject:6565 Change-Id: I5c781895e1ec8cc65728aa752e28fb2acb02abe9 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5297288 Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Commit-Queue: mohan maiya <m.maiya@samsung.com>
Mohan Maiya 6607a2b9 2024-01-17T15:58:20 Vulkan: Add support for VK_EXT_vertex_input_dynamic_state Hook into VK_EXT_vertex_input_dynamic_state so pipeline states that differ only in vertex input state can reuse existing pipelines. Bug: angleproject:7162 Tests: StateChangeTestES3.Vertex* Change-Id: Icd3134dee93fc5fc2e9d284fcfa8c674b62faec8 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5207462 Commit-Queue: mohan maiya <m.maiya@samsung.com> Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org>
Shahbaz Youssefi b380ed1f 2024-02-14T09:31:26 Vulkan: Add EGL_ANGLE_global_fence_sync Chrome has an implicit assumption that due to context virtualization, signaling a fence in one context results in synchronization with _all_ contexts that have previously made submissions. This is not per EGL spec, but the functionality is easily implementable in the Vulkan backend. In the Vulkan backend, each context is given its own "timeline" of submissions (tracked by serials associated with "indices"). The required functionality is implemented through a new EGL fence sync object whose sole difference is that it synchronizes with all the existing timelines rather than the one of the current context. Bug: b/318721705 Change-Id: I6c45d065e592d0d4ed627ce9695196b1086d5021 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5297396 Reviewed-by: Charlie Lao <cclao@google.com> Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org>
Shahbaz Youssefi dbc6bd9d 2024-02-12T14:07:49 Reland "Vulkan: Fix alignment issues with SecondaryCommandBuffer" This is a reland of commit e53270c9ca1afe393d6d7d0359e81cf6755b6ca5 Original change's description: > Vulkan: Fix alignment issues with SecondaryCommandBuffer > > This solves undefined behaviour on 64-bit systems. This inflates the > size of a few commands, but most commands either already did align to 8 > bytes or could be aligned to 8 bytes with a few tweaks. > > Bug: angleproject:7852 > Change-Id: Ie61976d5bf8df7790acd95c0e15d4c79402622a1 > Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5288636 > Reviewed-by: Charlie Lao <cclao@google.com> > Reviewed-by: Yuxin Hu <yuxinhu@google.com> > Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org> Bug: angleproject:7852 Change-Id: Ie206e66fc21c5db7c9e67eb478d9cddada5db8e0 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5296376 Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org> Reviewed-by: Yuly Novikov <ynovikov@chromium.org>
Roman Lavrov c603a4f1 2024-02-08T10:53:27 Don't perf warn about ETC1->ETC2 emulation as it is efficient Format is forwards compatible: https://crsrc.org/c/third_party/angle/src/libANGLE/renderer/gl/formatutilsgl.cpp;drc=21f16cb16333802dfa942d67cac59885f904301d;l=701 Added hasInefficientlyEmulatedImageFormat() helper Bug: b/302115557 Change-Id: Ibc82c27ecf4e3afbfaac52cb45bdda776c50b4b3 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5278562 Commit-Queue: Roman Lavrov <romanl@google.com> Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org>
Mohan Maiya d05c9a5e 2024-01-25T13:01:49 Frontend support for QCOM foveated extensions Add frontend state management to support foveated rendering extensions. Bug: angleproject:8484 Test: Texture2D*Foveation* Change-Id: I0e1be9f11b2d442207674562da760f5bfd7debc8 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5208091 Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Commit-Queue: mohan maiya <m.maiya@samsung.com> Reviewed-by: Geoff Lang <geofflang@chromium.org>