src/libANGLE/renderer/vulkan/ContextVk.h


Log

Author Commit Date CI Message
Panfeng Hou 2ef85c24 2025-07-09T17:13:52 Vulkan: Add support for GL_EXT_fragment_shading_rate Add support for GL_EXT_fragment_shading_rate. Bug: angleproject:420310117 Change-Id: I7b368afc45baf8551c222b2569991269117d385b Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/6726817 Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org> Auto-Submit: Panfeng Hou <panfeng.hou@arm.com> Reviewed-by: Charlie Lao <cclao@google.com>
Amirali Abdolrashidi 279652e3 2025-07-21T16:21:28 Vulkan: Flush more often due to render pass count Accumulating too much workload before submission can lead to stuttering and reduced quality in rendering. In this change, a threshold is set for the number of render passes in order to submit the command buffer. * Added the following to ContextVk: mRenderPassCountSinceSubmit * It is incremented every time a render pass begins in the command buffer. * When the count reaches the following threshold at the RP closure, there is a submission: kMaxRenderPassCountPerCommandBuffer * (Currently set to 128) Bug: b/426439980 Change-Id: I4cde223fb81e6a27a61f78c924d3c9f2a082e995 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/6775626 Reviewed-by: Charlie Lao <cclao@google.com> Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Commit-Queue: Amirali Abdolrashidi <abdolrashidi@google.com>
Charlie Lao b4d84458 2025-05-23T18:08:19 Move Buffer from VertexBinding to VertexArray In later CL we will not taking shared context lock for certain VertexArray API calls. VertexArray itself is per context, so this sounds reasonable to do. The main challenge here is a lot of VertexArray function end up accessing gl::Buffer object, which could be modified by other shared contexts. In order to safely not taking the shared context lock, we need to separate out Buffer object out of VertexArray itself so that these lockless APIs will take VertexArray that does not have access to buffer. In this CL, VertexArray is split into two classes: VertexArrayPrivate is everything in VertexArray except buffers. VertexArray is a subclass of VertexArrayPrivate and owns all the buffers. Buffer is removed from gl::VertexBinding class. In order to let back end access to buffers, VertexArrayImpl holds a weak reference to VertexArray::mVertexArrayBuffers (which is a vector of buffers). Further, VertexArrayBufferBindingMask mBufferBindingMask is moved from VertexArrayState into VertexArray class well, since it tracks which index has a non-null buffer. The bulk of change are due to the VertexARrayImpl constructor change, since it now takes vertexArrayBuffers argument. Other bulk of changes are due to VertexBinding no long has the buffer, but you need to get it directly from VertexArray or VertexArrayImpl. This CL also reverts some of the change in crrev.com/c/6758215 that mVertexBindings no longer contains kElementArrayBufferIndex. BYPASS_LARGE_CHANGE_WARNING Bug: b/433331119 Change-Id: I15f4576f7c5c8d8f4d9c9c07d38a60ce539bfeea Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/6774702 Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Reviewed-by: Amirali Abdolrashidi <abdolrashidi@google.com> Commit-Queue: Charlie Lao <cclao@google.com>
Mohan Maiya 890b5d8f 2025-07-07T13:06:54 Vulkan: Encapsulate more descriptor set logic in ProgramExecutableVk - ProgramExecutableVk handles SharedDescriptorSetCacheKey updates - Inline most update*DescInfo methods - Add dedicated methods to handle uniform and storage buffers to remove some branches from frequently used code paths Bug: angleproject:426412564 Tests: UniformBufferTest31.UniformBufferBindingRangeChangeWith*FBF Change-Id: I54b8ae2bd8778231e4d187b2cfd30f4d71de7f3b Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/6733546 Reviewed-by: Charlie Lao <cclao@google.com> Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Commit-Queue: mohan maiya <m.maiya@samsung.com>
Amirali Abdolrashidi 065b80a6 2025-07-10T15:50:43 Vulkan: Remove the enum to indicate submitted CB Currently, the function ContextVk::submitCommands() takes the following enum to indicate whether all command buffers or only the outside command buffer is submitted: "Submit" However, ContextVk::submitCommands() is only called twice. Also, this enum is only used to manage a few things, such as garbage collection, and finalizing foreign image layouts. It is possible to move these operations to the respective callers and remove this enum completely. * Moved the operations relying on the enum "Submit" to the locations before submitCommands() as required. * Removed the enum "Submit". (Credit for the idea to move the ops up to the callers: cclao) Bug: b/425987310 Change-Id: Ic0e1c15ee3d2e7cf22a4f7a57b6ac31acc38c861 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/6724899 Reviewed-by: Charlie Lao <cclao@google.com> Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Commit-Queue: Amirali Abdolrashidi <abdolrashidi@google.com>
Mohan Maiya 30a1cbc9 2025-07-03T13:00:05 Vulkan: Separate out descriptor set for uniform buffers Bug: angleproject:426412564 Change-Id: Icdbb1e634fc543714d1e3b9cdba0530d400cb153 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/6705153 Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Commit-Queue: mohan maiya <m.maiya@samsung.com> Reviewed-by: Charlie Lao <cclao@google.com>
Mohan Maiya ce289330 2025-07-01T19:41:46 Vulkan: Simplify descriptor set management - Descriptor logic is contained in ProgramExecutableVk and doesn't leak into ContextVk - Reduces CPU overhead by not having to constantly copy and resize the DescriptorSetDescBuilder - Simplifies decoupling of descriptor set of uniform buffers from that of other shader resources Bug: angleproject:426412564 Change-Id: Ic0926d0d466ea21f611c2b2c7b844e0bb9027c1b Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/6702410 Reviewed-by: Charlie Lao <cclao@google.com> Commit-Queue: mohan maiya <m.maiya@samsung.com> Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org>
Mohan Maiya 2fd033d0 2025-05-22T04:21:11 Vulkan: Optimize updates to uniform buffers ... when only the offset is modified. Most of the work done when handling dirty uniforms can be skipped since the buffer bindings haven't changed Bug: angleproject:386749841 Tests: UniformBufferTest31.UniformBufferBindingRangeChange*Vulkan Change-Id: Ic811bd71f0f2993f88ce9bcf93f9e8e46dfc6d99 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/6581359 Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Reviewed-by: Charlie Lao <cclao@google.com> Commit-Queue: mohan maiya <m.maiya@samsung.com>
Mohan Maiya ef31b3ed 2025-06-04T12:00:04 Vulkan: Selectively dirty DIRTY_BIT_SAMPLE_SHADING When program executable changes dirty DIRTY_BIT_SAMPLE_SHADING bit if either the current or previous program enabled per sample shading Bug: angleproject:386749841 Change-Id: I82aa7df29473e455aa68dfba9fefdb1bc712a78d Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/6622156 Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org>
Mavis Deng 2a18fdbf 2025-05-21T16:14:51 Fix sync issue between XFB output and UBO input For the following scenario, where the first draw writes to the transform feedback buffer and the second draw reads from the same buffer as UBO, it is necessary to end the render pass between the two draws and add a barrier. // xfb write glBindBufferBase(GL_TRANSFORM_FEEDBACK_BUFFER, 0, xfb); glBeginTransformFeedback(); glDrawArrays(); glEndTransformFeedback(); // Draw with same buffer as UBO glBindBuffer(GL_UNIFORM_BUFFER, xfb); glDraw(); Bug: angleproject:418568423 Change-Id: Ia294d174111c6104b55762590ec26056ee759b53 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/6572198 Reviewed-by: Charlie Lao <cclao@google.com> Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org>
Shahbaz Youssefi c884460a 2025-04-24T16:43:04 Vulkan: Fix deferred clears vs noop multidraw Bug: chromium:407828338 Change-Id: I5da22aeb72605bb7943fa5ae079ae297d00888f7 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/6488794 Commit-Queue: Amirali Abdolrashidi <abdolrashidi@google.com> Reviewed-by: Cody Northrop <cnorthrop@google.com> Reviewed-by: Amirali Abdolrashidi <abdolrashidi@google.com> Auto-Submit: Shahbaz Youssefi <syoussefi@chromium.org> Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org>
Amirali Abdolrashidi 49d0a332 2025-04-08T16:07:17 Vulkan: Remove ring buffer allocators * Removed the ring buffer allocator functionality from ANGLE: angle::RingBufferAllocator * Also removed the related common files. * (Pool allocators will be used at all times.) * Removed the placeholder functions from the pool allocator. * Removed the following BUILD flag: angle_enable_vulkan_shared_ring_buffer_cmd_alloc * Removed redundant line from ContextVk. Bug: b/410036490 Change-Id: I368fb93a66ddfd192018b09f65004a32339abd5a Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/6442640 Reviewed-by: Igor Nazarov <i.nazarov@samsung.com> Commit-Queue: Amirali Abdolrashidi <abdolrashidi@google.com> Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org>
Shahbaz Youssefi dae3c851 2025-03-14T11:44:53 Vulkan: Bake non-shader state into linked pipeline When using VK_EXT_graphics_pipeline_library, previously ANGLE would create three pipelines libraries: * The Shaders library was created based on the GL program's shaders + a few static states. This typically hit the program's own pipeline's cache that was warmed up during link. * The VertexInput and FragmentOutput libraries were created at draw time, which used the global pipeline cache At draw time, immediately after creating the non-Shaders libraries, the three libraries were linked into the final pipeline to be used by the draw call. This caused an inefficiency; because the non-Shaders libraries were created independently from the Shaders library, they had to be compiled pessimistically, for example because they could not be optimized to take into account the precision of the fragment shader's outputs or whether any value is const (typically alpha being set to one). Given the creation of VertexInput and FragmentOutput libraries is typically quite fast (the former being no-op and dynamic state anyway), this change removes the need for creating those libraries, and directly specifies the vertex input and fragment output state when creating the final pipeline out of the Shaders library. In this way, the same fragment output state can be tailored to the exact shaders it is being used with and incur a smaller overhead. In this change, the linked pipeline is cached in the GL program's pipeline cache, which is never synced to the blob cache as producing it is assumed to be fast already. Bug: angleproject:42265839 Change-Id: I8496ea37771555522bdc9de94043a1b56fa5967e Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/6354205 Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org> Reviewed-by: mohan maiya <m.maiya@samsung.com> Reviewed-by: Charlie Lao <cclao@google.com>
Igor Nazarov 318d4038 2025-02-18T13:19:28 Vulkan: Fix present optimization w.r.t shared present mode Since the implementation of "EGL_KHR_mutable_render_buffer" mode (angleproject:42262606), the renderpass optimization for present (b/153885625) was not working correctly. It sill tries to transition into `ImageLayout::Present`, but instead of entirely skipping the transition, it inserts the barrier even when renderpass is still opened. When both "supportsSharedPresentableImageExtension" and "preferDynamicRendering" are enabled, code will hit ASSERT in `flushToPrimary()` when attempting to record `ImageLayout::Present` barrier into the primary command buffer. Above issue is fixed by skipping the transitioning into `ImageLayout::Present` when in shared present mode. Other changes and fixes: - removed renderpass flush when resolving with renderpass, since it is not necessary (angleproject:42265256). - above change reveled a bug in `finalizeImageLayout(&mColorImageMS)` call. This call reverts the layout in the previous finalize call from Present to ColorWrite. So it must have been inserted before finalizing the swapchain image. Issue fixed by removing both finalize calls. - updated condition to skip invalidate w.r.t shared present mode (b/229689340), that was missed during implementation of "EGL_ANDROID_front_buffer_auto_refresh" (angleproject:42265697). Test: angle_end2end_tests --gtest_filter=EGLSurfaceTest.PresentLayoutTransitionWithMSAA/* Bug: b/153885625 Bug: angleproject:42262606 Bug: angleproject:42265256 Bug: b/229689340 Bug: angleproject:42265697 Change-Id: Ifad8aea8548fa7bfac27941812c435b2af655309 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/6277445 Reviewed-by: Charlie Lao <cclao@google.com> Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Commit-Queue: Igor Nazarov <i.nazarov@samsung.com>
Amirali Abdolrashidi 2e65d3d4 2025-02-03T16:26:46 Vulkan: Fine-tune submission for multiple RPs This CL aims to fine-tune submission based on a certain command buffer size. This allows us to perform one submission for multiple smaller render passes. * Added mCommandsPendingSubmissionCount to ContextVk. * In ContextVk::syncState(), preferSubmitAtFBOBoundary is only used if the render pass pending command count exceeds the threshold: kMinCommandCountToSubmit * Currently set to 32. * For now, we still submit if the command is a clear (for example glClearBufferfv()). * For now, we also still submit if the command is an invalidate (for example, glInvalidateFramebuffer()). * In ContextVk::flushImpl(), if the pending command count exceeds the threshold (kMinCommandCountToSubmit) and the device is found to be idle, the work is submitted to keep the device busy. * Modified the following unit test from VulkanPerformanceCounterTest: VerifySubmitCounterForSwitchUserFBOToDirtyUserFBO * Since there is now a minimum command count for submission, the number of draw calls has been changed so that the submission is still issued at the new FBO boundary. * After this CL, life_is_strange shows the following improvements: (From the latest measurements) * +19% wall_time * +38% cpu_time Bug: angleproject:42265052 Change-Id: I18452cc1d39ca7e0ac376f6012974b498153cce8 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/6182927 Reviewed-by: Charlie Lao <cclao@google.com> Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Commit-Queue: Amirali Abdolrashidi <abdolrashidi@google.com>
Shahbaz Youssefi 0f75fc3d 2025-01-20T14:10:41 Vulkan: Transition foreign images to the FOREIGN queue on submit Vulkan's interaction with AHB and dmabuf images is through the FOREIGN queue family. When ANGLE uses these images, it must take ownership of the images by doing a queue family ownership transfer (QFOT) away from the FOREIGN queue family and into the graphics queue family used by the Vulkan backend. Prior to this change, ANGLE would do the QFOT away from FOREIGN once such a foreign image is imported into an EGL image. Afterwards, usage in ANGLE works correctly. What ANGLE did not handle is when a foreign entity wants to use these images _after_ ANGLE has used them. For the above to work correctly, ANGLE must do a QFOT back into FOREIGN before the image can be used by the foreign entity. Unfortunately, EGL does not provide a clear point for this hand-off to happen. ANGLE has no choice then to proactively transition the images back into FOREIGN at some point "just in case". For some native drivers, this hand-off to FOREIGN can be quite frequent. For example, on Android for most vendors there is no actual layout transition between graphics and FOREIGN queue families (the actual data layout is the same), so a cache flush/invalidate at strategic points (such as the end of the command buffer) is sufficient as equivalent to transition to FOREIGN (and another at the beginning of the command buffer as equivalent to transition from FOREIGN). As a layer over Vulkan's formalism, ANGLE is less lucky; it has to enumerate exactly which image is being transitioned to and away from FOREIGN. Transitions away from FOREIGN are in principle easy. As long as the image is marked as being in the FOREIGN queue family, it will automatically transition to the graphics queue family on first use. In this change, when a foreign image is transitioned out of the FOREIGN queue, it's added to a list of images to be transitioned back to FOREIGN at submit time. Once submission is done, the image may or may not actually be used by a foreign entity, but ANGLE cannot know that. The next time the image is used in ANGLE, it is transitioned out of FOREIGN. Verifying correctness with multi-threading is tricky, and relies on GL's requirement that access in one context is followed by a synchronization and rebind in another context before it can be used there. This means that the image's transition to FOREIGN (at the end of one submission) naturally happens before the transition back from FOREIGN (at the beginning of the next submission). Because the set of images to transition is tracked in the context, submissions in other contexts don't interfere with the above logic. The situation can be more complicated with one-off submissions, but fortunately, no such usage of foreign images is present. Another wrinkle is simultaneous usage of the image as read-only in two contexts. According to GL, this is not a hazard and requires no synchronization. However this is broken in ANGLE even for non-foreign images (see http://anglebug.com/42266349), because as what _seems_ like read-only usage of the image from GL's point of view (like sampling from the image), there are associated write operations from Vulkan's point of view (image layout transitions and QFOT). This change does not attempt to address this corner case. Bug: angleproject:42263241 Bug: angleproject:42262454 Bug: angleproject:390443243 Bug: chromium:382527242 Change-Id: Idd4ef1fecfa3fccf1a4063f1bddb08d28b85386b Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/6184604 Reviewed-by: Charlie Lao <cclao@google.com> Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org>
Charlie Lao fbd230f5 2025-01-23T12:59:06 Vulkan: Split ErrorContext into ErrorContext and Context ErrorContext continue to be context for error handling. vk::Context is added to serve as common base class for ContextVk and CLContextVk. Bug: angleproject:390443243 Change-Id: Ifac0b1d2d714ce610693ce60a35459c6c9cddf1a Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/6191438 Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org>
Shahbaz Youssefi c1214ec2 2025-01-22T14:22:56 Vulkan: Rename Context to ErrorContext In preparation for adding another Context (derived by GL and CL contexts), which includes logic that pertains to command recording (such as barrier tracking). Bug: angleproject:390443243 Change-Id: Idf495b62e63fb9aa901a2f16447fdaf3c2acd90b Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/6191248 Auto-Submit: Shahbaz Youssefi <syoussefi@chromium.org> Commit-Queue: Charlie Lao <cclao@google.com> Reviewed-by: Charlie Lao <cclao@google.com> Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org>
Charlie Lao a5016e31 2025-01-17T17:50:27 Vulkan: Fix typos found by gerrit spellchecker Bug: angleproject:360274928 Change-Id: Idbffd7a4609f28d161bd0a11ace817856dcd750c Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/6182930 Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org> Auto-Submit: Charlie Lao <cclao@google.com> Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org>
Shahbaz Youssefi fc4fc174 2024-12-10T22:01:28 Vulkan: Prevent crash with D/S FF without D/S attachment The spec says that the values for gl_LastFragDepth/StencilARM are undefined if there is no depth/stencil attachment. This "just" works on tiling GPUs, because reading input attachments simply translates to reading _something_ from the tile memory. For ANGLE, the situation is a little more complicated. ANGLE has to bind descriptors for input attachments (because non-tilers read from the input attachment descriptor instead of using the knowledge that input and color/depth/stencil attachments are one and the same thing in tile memory). When a depth/stencil attachment is missing, there is no image to bind to the descriptor set. ANGLE cannot skip binding an image to the descriptor set, because OpImageRead (translated from subpassLoad()) attempts to access the input descriptor; skipping this causes an internal crash in SwiftShader for example. ANGLE cannot bind a bogus image as input attachment, as Vulkan requires that input attachments are also color/depth/stencil attachments. ANGLE _could_ bind a bogus image as input attachment and also as depth/stencil attachment. This is rather risky, as it then also has to be careful to make sure that depth/stencil attachment is never actually used (i.e. it affects the depth/stencil state, load/store ops etc). In this change, the shader itself is modified to remove references to the depth/stencil input attachments if the attachment is missing. This is rather inefficient, as it means the pipeline warmup will not produce a usable pipeline, but it's accepted as a workaround for something apps shouldn't really be doing. Bug: angleproject:376572258 Change-Id: I0de68252b61615cb82cba7d1730699aadf41e92f Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/6085368 Reviewed-by: Charlie Lao <cclao@google.com> Reviewed-by: Yuxin Hu <yuxinhu@google.com> Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org>
Charlie Lao 70d1ef67 2024-11-20T11:34:39 Vulkan: Ensure onFramebufferBoundary is called for offscreen There is peak memory regression observed from crrev.com/c/6022549. What I suspect happening is that for offscreen or single buffered case, glFlush/glFinish is called but bail out because it already submitted or deferred. So we end up not calling onFramebufferBoundary(). This CL ensures we always call onFramebufferBoundary from these two functions for single buffer or offscreen. Also fixed a bug when onSharedPresentContextFlush is called we may end up calling onFramebufferBoundary. To make API names consistent, existing flushImpl() is renamed to flushAndSubmitCommands() and a new flushIMpl is added to wrap around most logic inside flush(). Bug: angleproject:372268711 Change-Id: I54eed8a81f4153d52ab962f213cacc87a73b89ac Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/6037491 Reviewed-by: Yuxin Hu <yuxinhu@google.com> Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Commit-Queue: Charlie Lao <cclao@google.com>
Charlie Lao 74f74b63 2024-11-14T10:07:34 Vulkan: Add ContextVk::onFramebufferBoundary() function This makes a more formal API to track frame boundary. Also adds a uint32_t mCurrentFrameCount to track the total number of frames rendered so that we could use for heuristic purpose. Bug: angleproject:372268711 Change-Id: I153497403ed0d8fde18f1786186ce600df60c514 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/6022549 Reviewed-by: Yuxin Hu <yuxinhu@google.com> Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Commit-Queue: Charlie Lao <cclao@google.com>
Charlie Lao 087cc411 2024-11-14T11:05:14 Vulkan: Add mRenderer to ShareGroupVk class For convenience, instead of passing renderer to shareGroupVk's API, keep mRenderer in SharGroupVk class at constructor call. Bug: angleproject:372268711 Change-Id: I9534f7dbe24121856221b89ccf8fc6a353bbb0cc Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/6022548 Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Commit-Queue: Charlie Lao <cclao@google.com> Reviewed-by: Yuxin Hu <yuxinhu@google.com>
Shahbaz Youssefi ba81145f 2024-11-08T15:45:44 Vulkan: Emulate coherent framebuffer fetch everywhere Many apps expect coherent framebuffer fetch to be available, and multiple downstream emulators end up forcing coherent framebuffer fetch enabled despite the hardware not being coherent. This change attempts to do a best-effort emulation of coherent framebuffer fetch by automatically inserting barriers before framebuffer fetch draws. While this doesn't correctly handle self-overlapping geometry, it works well enough in practice for the applications. As a result, framebuffer fetch is practically enabled everywhere after this change. Bug: angleproject:377923479 Change-Id: I3900a1de0f4db755b7e70871f57df3ea112073f9 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/6004336 Reviewed-by: Charlie Lao <cclao@google.com> Reviewed-by: Cody Northrop <cnorthrop@google.com> Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org>
Charlie Lao 2df8d32b 2024-10-25T13:49:40 Vulkan: Tag DescriptorSets properly with every command buffer When descriptorSetCache is disabled, there is a bug that the current descriptorSet is not properly tagged with current ResourceUse. Right now when we get a descriptorSet (from cache or reused or allocated new), we retain to the current command buffer. But if we submit command buffer and get a new command buffer, and draw with the same program/buffer/textures, we will be reusing the current bound descriptorSets, but it is not retained with new command buffer. In theory, we have the same bug for pool eviction as well when cache is enabled. It's just very hard to hit due to pool eviction occur very rare. But with cache disabled, this is very easy to hit with multiple tests. In this CL, the retainResource call is moved from ProgramExecutableVk::getOrAllocateDescriptorSet() call to ProgramExecutableVk::bindDescriptorSets() call. Since bindDescriptorSets is always get called when we get a new descriptorSet, and is always get called when a new command buffer is allocated, this covers all usage case. And even better, with this change we are able to remove commandBufferHelper from arguments of quite a few functions. Bug: angleproject:372268711 Change-Id: I1f21a3e7e9ea34e2842e54025b5eb930dbf6c593 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/4743599 Commit-Queue: Charlie Lao <cclao@google.com> Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Reviewed-by: Yuxin Hu <yuxinhu@google.com>
Shahbaz Youssefi 08c1724f 2024-10-11T14:29:00 Vulkan: Support GL_ARM_shader_framebuffer_fetch_depth_stencil Bug: angleproject:352364582 Change-Id: I63fd78314fa7ebccbf366c252e309a9c0f09c8c1 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5938150 Reviewed-by: Charlie Lao <cclao@google.com> Reviewed-by: Yuxin Hu <yuxinhu@google.com> Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org>
Shahbaz Youssefi 65fcf9c4 2024-10-26T10:53:18 Vulkan: Remove redundant dependent feature checks Since [1], when a feature is overriden, the dependent features automatically take the override into account. Tests no longer need to account for dependent features, neither does the logic in the code. [1]:https://chromium-review.googlesource.com/c/angle/angle/+/4749524 Bug: angleproject:42266725 Change-Id: I5440aba4a89cffbe710e26ad7de4cfee783e9bdf Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5967414 Reviewed-by: Yuxin Hu <yuxinhu@google.com> Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org> Reviewed-by: Charlie Lao <cclao@google.com>
Igor Nazarov a769fad4 2024-10-24T20:46:42 Vulkan: Optimize and fix glFinish for single buffered surfaces Fixed bug: When calling `onSharedPresentContextFlush()` from `ContextVk::finish()` need to also call `finishImpl()` to wait for submitted commands. This bug was introduced in the original commit where `onSharedPresentContextFlush()` was added. Optimization: Skip calling `onSharedPresentContextFlush()` from `ContextVk::finish()` similarly to `ContextVk::flush()` when there is nothing to flush. Bug: angleproject:42265370 Bug: b/229908040 Change-Id: Ide9f9c5d8757257c925970faece1e137acf10dec Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5961290 Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Reviewed-by: Charlie Lao <cclao@google.com> Commit-Queue: Igor Nazarov <i.nazarov@samsung.com>
Charlie Lao 31c80bbf 2024-10-17T10:56:16 Vulkan: Avoid redundant work in updateFullActiveTextures ContextVk keeps mActiveTexturesDesc, which gets updated by UpdatePreCacheActiveTextures(). This is only used for cache lookup. When there is a cache miss, we end up call updateFullActiveTextures() which recomputes DescriptorSetDesc again, which is redundant work. This CL removes mActiveTexturesDesc from ContextVk. UpdatePreCacheActiveTextures has been changed to be a DescriptorSetDescBuilder method so that it can directly update the mDesc. updateFullActiveTextures has been renamed to updateActiveTexturesForCacheMiss which avoid mDesc calculation. updateFullActiveTextures is still kept for now which will be used in next CL when cache is disabled. Bug: b/372268711 Change-Id: Ic9a0cdaa7cefca5f72b599d26d079cef14888f07 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5905766 Reviewed-by: Yuxin Hu <yuxinhu@google.com> Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Commit-Queue: Charlie Lao <cclao@google.com>
Shahbaz Youssefi a1584f49 2024-10-11T21:17:32 Vulkan: Qualify framebuffer fetch with "Color" In preparation for depth/stencil framebuffer fetch, many framebuffer fetch symbols are affixed with Color to indicate that they pertain to color framebuffer fetch logic. Bug: angleproject:352364582 Change-Id: I86000ada5e6ef47387dec0b6a3fca589d816cdc2 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5926593 Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org> Reviewed-by: Charlie Lao <cclao@google.com> Reviewed-by: Yuxin Hu <yuxinhu@google.com>
Shahbaz Youssefi 1608d0be 2024-10-10T16:53:15 Vulkan: Isolate framebuffer fetch no-RP-break optim from DR Prior to [1], changes to framebuffer fetch usage by shaders caused a render pass break. This was due to a limitation of render pass compatibility rules. It also caused other headache, such as needing to clear the render pass cache, recreating pipelines etc. [1]:https://chromium-review.googlesource.com/c/angle/angle/+/3697308 In [1] an important optimization was implemented for tiling GPUs where ANGLE permanently switched to framebuffer fetch mode on first encountering framebuffer fetch use. From that point on, ANGLE would always make every render pass framebuffer fetch compatible. In reality, the render pass break was unnecessary, which became apparent with dynamic rendering (for example that whether the render pass includes input attachments has no bearing on a pipeline that doesn't use input attachments at all). In [2], dynamic rendering kept the render pass break + permanent switch behavior for simplicity. [2]:https://chromium-review.googlesource.com/c/angle/angle/+/5637155 This change untangles the optimization done for legacy render passes from dynamic rendering, allowing dynamic rendering to start every render pass without framebuffer fetch and enable it later if a framebuffer fetch program is used. This is in preparation for supporting depth/stencil framebuffer fetch, where a perma-switch is troublesome (for example in combination with read-only depth/stencil feedback loops). Bug: angleproject:352364582 Change-Id: I31221cf22a28d58b9b2bf188e9c0b786cd0fe3d2 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5923120 Reviewed-by: Yuxin Hu <yuxinhu@google.com> Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org> Reviewed-by: Charlie Lao <cclao@google.com>
Amirali Abdolrashidi 67a5ea45 2024-09-23T16:09:12 Vulkan: Fix the error from multiple lineloop draws Since Vulkan does not support line-loop draws natively, such a draw call requires the conversion of the related buffers to prepare them for this operation. For glDrawElementsIndirect(), the index and the indirect buffers would need conversion. However, what currently happens in this case is that the original buffer pointer is overwritten after the conversion, removing the link to the original buffer. Therefore, if there is a second line-loop call just after the first, it will try to use the converted buffer as the new source, which leads to errors due the buffer already being in use. The index buffer for the draw is bound when the related dirty bit is handled. Therefore, instead of using the draw index buffer directly for handling the line-loop scenario, we can use the index buffer in the form of a local pointer passed between functions. Then, in order to reconcile line-loop with the other cases, the draw index buffer is set just before setting up the indexed draw. * Functions handling line-loop draws do not modify the element array buffer in VertexArrayVk directly, but use local buffer pointers to pass the current element array pointer to further processing and drawing. * Added mCurrentElementArrayBuffer for ContextVk to be bound to the index buffer to used for draw instead of the one from its vertex array object. * Before the indexed draw, mCurrentElementArrayBuffer is set to the last destination index buffer. * Added unit test that makes a line-loop draw and then a non-LL call using the same element array. Bug: angleproject:360758685 Change-Id: I6d6328f6326c1a1f9f80e5ef346aa077c867d344 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5878764 Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Commit-Queue: Amirali Abdolrashidi <abdolrashidi@google.com> Reviewed-by: Charlie Lao <cclao@google.com>
Charlie Lao 86508e20 2024-08-16T14:56:37 Vulkan: Make VertexConversionBuffer a class And wrap the cache key (i.e, formatID/stride/offset) into a CacheKey struct so that we can easily add more data members. This CL also changes ConversionBuffer from struct to class to have better encapsulation. No functional changes is expected here. Bug: b/357622380 Change-Id: Ieecf5c922b95a940137c8e54657ef3f458c55fc9 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5793921 Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Commit-Queue: Charlie Lao <cclao@google.com> Reviewed-by: Yuxin Hu <yuxinhu@google.com>
Charlie Lao 7c77bb75 2024-07-24T12:05:29 Vulkan: Remove implicit buffer barrier for shader write When app uses shaders to write to SSBO, right now we are inserting an implicit barrier to ensure WAW are in order. But Spec says that "Explicit synchronization is required to ensure that the effects of buffer and texture data stores performed by shaders will be visible to subsequent operations using the same objects". This CL removes the implicit barrier for buffer write if the current write comes from shaders and relies on explicit glMemoryBarrier to insert a global barrier. Bug: angleproject:350994515 Change-Id: I8ab039610be9be2ded27ea60dab54bdad08502f6 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5719258 Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Commit-Queue: Charlie Lao <cclao@google.com> Reviewed-by: Yuxin Hu <yuxinhu@google.com>
Charlie Lao bd3a3308 2024-07-22T14:03:20 Vulkan: Remove implicit image barrier for shader write When app uses compute or fragment shader to write to an image and makes multiple dispatchCompute or draw calls, right now we are inserting an implicit barrier to ensure WAW is hazard free. But Spec says that "Explicit synchronization is required to ensure that the effects of buffer and texture data stores performed by shaders will be visible to subsequent operations using the same objects". This CL records the bits from the last glMemoryBarrier call and will skip the barrier calls in ContextVk::updateActiveImages if there is no layout change, unless there is requirement from prior glMemoryBarrier. Bug: angleproject:350994515 Change-Id: I8bdeeb658993824369824aaa0f25cb4b6e3785f7 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5719024 Reviewed-by: Yuxin Hu <yuxinhu@google.com> Commit-Queue: Charlie Lao <cclao@google.com> Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org>
Shahbaz Youssefi 7691cea7 2024-07-22T13:46:14 Vulkan: Remove seamful cubemap emulation Practically, the Vulkan backend is never expected to run on ES2 hardware. It _may_ for WebGL, but seamful cubemap emulation was disabled for webgl anyway. Bug: angleproject:354729454 Change-Id: Iafa20fbdbe232c4df4c777b12e7698ef7a87cf24 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5730143 Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org> Auto-Submit: Shahbaz Youssefi <syoussefi@chromium.org> Reviewed-by: Yuxin Hu <yuxinhu@google.com> Reviewed-by: Charlie Lao <cclao@google.com>
Shahbaz Youssefi 1db80b88 2024-07-10T12:47:42 Reland "Vulkan: Use VK_KHR_dynamic_rendering[_local_read]" This is a reland of commit c379ff48043a47e444c388c45270db40d3172d50 Original change's description: > Vulkan: Use VK_KHR_dynamic_rendering[_local_read] > > Bug: angleproject:42267038 > Change-Id: I1f4eb0f309992a9c1c287a69520dadf5eff23b26 > Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5637155 > Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org> > Reviewed-by: Amirali Abdolrashidi <abdolrashidi@google.com> > Reviewed-by: Charlie Lao <cclao@google.com> Bug: angleproject:42267038 Change-Id: I083e6963b5421386695e49a9872edbb2016c9763 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5691342 Reviewed-by: Charlie Lao <cclao@google.com> Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org> Reviewed-by: Amirali Abdolrashidi <abdolrashidi@google.com>
Shahbaz Youssefi 1f87cbc9 2024-07-15T13:07:35 Vulkan: Fix late-added resolve attachment tracking Resolve attachments may be added after the fact to a render pass due to glBlitFramebuffer or eglSwapBuffer. Previously, only the resolve image views were tracked by the render pass, and otherwise the state tracking (layout, content defined, etc) treated the resolve images as generically written-to by the render pass. As a result, the render pass was unable to finalize the layout of the resolve images early. Optimizing the layout of the swapchain image when the surface is multisampled for example was not done due to this issue. In this change, when resolve attachments are added late, they are tracked identically to when they are added at the beginning of the render pass, fixing the issues described above. Bug: angleproject:42265625 Bug: angleproject:42266019 Change-Id: I765560762bb8caf39ba1096fb028177201c082d7 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5707470 Reviewed-by: Charlie Lao <cclao@google.com> Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org>
Shahbaz Youssefi 373ac541 2024-07-10T11:14:47 Vulkan: Make surface RP check independent from framebuffer object With dynamic rendering, there is no framebuffer object, so checking whether the currently open render pass belongs to the window surface (at swap time) is made independent from these objects. Bug: angleproject:42267038 Change-Id: I408e2376ba865b64fa1e8890316e8f57c08c695f Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5691345 Reviewed-by: Charlie Lao <cclao@google.com> Reviewed-by: Amirali Abdolrashidi <abdolrashidi@google.com> Commit-Queue: Charlie Lao <cclao@google.com> Auto-Submit: Shahbaz Youssefi <syoussefi@chromium.org>
Shahbaz Youssefi 7d461b21 2024-07-10T14:11:53 Revert "Vulkan: Use VK_KHR_dynamic_rendering[_local_read]" This reverts commit c379ff48043a47e444c388c45270db40d3172d50. Reason for revert: Regresses CPU perf and memory when _not_ using DR Original change's description: > Vulkan: Use VK_KHR_dynamic_rendering[_local_read] > > Bug: angleproject:42267038 > Change-Id: I1f4eb0f309992a9c1c287a69520dadf5eff23b26 > Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5637155 > Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org> > Reviewed-by: Amirali Abdolrashidi <abdolrashidi@google.com> > Reviewed-by: Charlie Lao <cclao@google.com> Bug: angleproject:42267038 Change-Id: I3865f0d86813f0eeb9085a92875a33bd449b907f Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5691337 Bot-Commit: Rubber Stamper <rubber-stamper@appspot.gserviceaccount.com> Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org>
Shahbaz Youssefi c379ff48 2024-06-10T22:01:57 Vulkan: Use VK_KHR_dynamic_rendering[_local_read] Bug: angleproject:42267038 Change-Id: I1f4eb0f309992a9c1c287a69520dadf5eff23b26 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5637155 Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org> Reviewed-by: Amirali Abdolrashidi <abdolrashidi@google.com> Reviewed-by: Charlie Lao <cclao@google.com>
Charlie Lao 8c546d35 2024-06-25T12:49:40 Vulkan: Limit VkEvent for usage matters for Manhattan31 only If we use VkEvent to track all image operations causes performance regression on some app traces, including manhattan10 trace. This mainly because of CPU overhead comes with VkCmdSetEvent, mostly inside vulkan driver. These app traces likely not benefit from VkEvent because the specific bubble (false dependency) does not manifest on these app traces, but the CPU overhead takes a performance toll on it. In order to strike a balance between benefit and overhead, this CL removes most of VkEvent usage and only leaves the ones that matters for manhattan31. The only we still keeps are generateMipmap, dispatchCompute, texture sampling. We can always add more if more beneficial usage cases comes up and no regression in other traces. Bug: b/336844257 Change-Id: I346fe70bc33e57edf04e933a2db0f79738c4481d Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5654737 Reviewed-by: Amirali Abdolrashidi <abdolrashidi@google.com> Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Commit-Queue: Charlie Lao <cclao@google.com>
Shahbaz Youssefi d193d51b 2024-06-17T22:46:08 Replace issue ids post migration to new issue tracker This change replaces anglebug.com/NNNN links. Bug: None Change-Id: I8ac3aec8d2a8a844b3d7b99fc0a6b2be8da31761 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5637912 Reviewed-by: Geoff Lang <geofflang@chromium.org> Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org>
Mohan Maiya 5703bd61 2024-06-14T14:12:41 Vulkan: Further optimize ProgramExecutableVk::resetLayout 1. Handle compute pipelines similar to how we handle graphics pipelines 2. Track valid compute pipeline permutations Bug: angleproject:8297 Change-Id: I58200517e5a44a2b3092777ea24d1529ceee00f5 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5634574 Reviewed-by: Charlie Lao <cclao@google.com> Commit-Queue: mohan maiya <m.maiya@samsung.com> Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org>
Igor Nazarov 3467f0f2 2024-05-20T16:24:53 Vulkan: Cleanup QCOM foveated rendering extensions support Change removed `ContextVK::updateFoveatedRendering()` method because of the following: - Modifying `mGraphicsPipelineDesc` without also updating `mGraphicsPipelineTransition` may case usage of incorrect PSO. - Despite of the above, there is no bug, because the update itself is redundant. In all cases, where `updateFoveatedRendering()` was called, there also will be `onDrawFramebufferRenderPassDescChange()` call, that will call `mGraphicsPipelineDesc->updateRenderPassDesc()`. - The `onDrawFramebufferRenderPassDescChange()` will also call `invalidateCurrentGraphicsPipeline()` therefore, there is no need for this call in the `updateFoveatedRendering()`. - In all cases, the `onRenderPassFinished()` is called before `updateFoveatedRendering()`, therefore, there is no need to set `DIRTY_BIT_RENDER_PASS` bit. - All of the above made `updateFoveatedRendering()` completely redundant (maybe except for the ASSERT). Change also removed `mRenderPassDesc` update from `FramebufferVk::updateFoveationState()`. Note: similar update may be also removed when handling `shouldUpdateSrgbWriteControlMode`. Also fixes possible `mFoveationState` and `mCurrentFramebufferDesc` desynchronization in case if `updateFragmentShadingRateAttachment()` fails. Bug: angleproject:8484 Change-Id: If453bb6691e47aac3c11d0a5a6df696e885b64cb Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5573395 Reviewed-by: Charlie Lao <cclao@google.com> Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Commit-Queue: Igor Nazarov <i.nazarov@samsung.com>
Roman Lavrov a3057eed 2024-05-27T14:48:51 Vulkan: DIRTY_BIT_DESCRIPTOR_SETS in handleDirtyUniformsImpl For consistency between graphics and compute handling Bug: angleproject:7103 Change-Id: If6db0739d2f75d9e8e77bf88a05466e56d165a0a Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5574006 Commit-Queue: Roman Lavrov <romanl@google.com> Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org>
Charlie Lao 1a9a703b 2024-05-21T11:14:40 Vulkan: Add DeviceQueueIndex to Context/BufferHelper/ImageHelper This CL adds a utility class DeviceQueueIndex, which encapsulates queueFamilyIndex and the queueIndex into one integer value so that we can pass around to barrier function. vk::Context and BufferHelper and ImageHelper class now keeps mCurrentDeviceQueueIndex instead of mCurrentQueueFamilyIndex. For All contexts by default it gets the default queue from renderer (which is always the one corresponding to Medium priority). For ContextVk, when priority changes it update mCurrentDeviceQueueIndex to match new context priority. Bug: b/337135577 Change-Id: I62cc483cfdb3e974d38db074e671c57299300074 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5555903 Reviewed-by: Yuxin Hu <yuxinhu@google.com> Commit-Queue: Charlie Lao <cclao@google.com> Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org>
Charlie Lao a96e9197 2024-04-25T10:35:02 Vulkan: Add RefCountedEvent class and VkCmdSetEvent call This CL defines RefCountedEvent class that adds reference counting to VkEvent. CommandBufferHelper and ImageHelper each holds one reference count to the event. Every time an event is added to the command buffer, the corresponding RefCountedEvent will be added to the garbage list which tracks the GPU completion using ResourceUse. That event garbage's reference count will not decremented until GPU is finished, thus ensures we never destroy a VkEvent until GPU is completed. For images used by RenderPassCommands, As RenderPassCommandBufferHelper::imageRead and imageWrite get called, an event with that layout gets created and added to the image. That event is saved in RenderPassCommandBufferHelper::mRefCountedEvents and that VkCmdSetEvents calls are issued from RenderPassCommandBufferHelper::flushToPrimary(). For renderPass attachments, the events are created and added to image when attachment image gets finalized. For images used in OutsideRenderPassCommands, The events are inserted as needed as we generates commands that uses image. We do not wait until commands gets flushed to issue VkCmdSetEvent calls. A convenient function trackImageWithEvent() is added to create and setEvent and add event to image all in one call. You can add this call after the image operation whenever we think it benefits, which gives us better control. (Note: Even if forgot to insert the trackImageWithEvent call, it is still okay since every time barrier is inserted, the event gets released. Next time when we inserts barrier again we will fallback to pipelineBarrier since there is no event associated with it. But that is next CL's content). This CL only adds the VkCmdSetEvent call when feature flag is enabled. The feature flag is still disabled and no VkCmdWaitEvent is used in this CL (will be added in later CL). Bug: b/336844257 Change-Id: Iae5c4d2553a80f0f74cd6065d72a9c592c79f075 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5490203 Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Reviewed-by: Geoff Lang <geofflang@chromium.org> Commit-Queue: Charlie Lao <cclao@google.com>
Shahbaz Youssefi 80c8b6f0 2024-04-17T10:06:45 Revert "Vulkan: Only enable DS dynamic state if there is DS attachment." This reverts commit 471b50407d7d1c22491d066df77060cb8b9b2f89. The reverted change does not correctly handle UtilsVk functions, leading to validation failures. UtilsVk could be made to not set dynamic state when the depth/stencil attachments are missing, but instead the change is reverted because: - The original issue that prompted this is easily fixable (and fixed in this change) - Disabling depth/stencil dynamic state is not necessarily a performance improvement; every time a pipeline in such a render pass is bound, the driver would have to make sure to no-op the relevant state change if static, which is also costly. Instead, dynamic state may need to be set only once in the entire render pass. Bug: b/223456677 Bug: b/315353258 Bug: angleproject:8242 Change-Id: I8282b87857d6b9285dbcf307c3c6ecf69df5fadb Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5462079 Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org> Reviewed-by: Charlie Lao <cclao@google.com> Reviewed-by: Yuxin Hu <yuxinhu@google.com>
Shahbaz Youssefi b4cf07c3 2024-03-27T15:58:04 Vulkan: Move the interface pipeline library caches to share group When linking libraries into a pipeline, the linked pipeline lives in ProgramExecutableVk and may be shared between contexts in a share group. The caches for the vertex input and fragment output libraries thus cannot live in the context, but should remain alive until all contexts in the share group are destroyed. This change moves these caches to the share group. Bug: angleproject:8629 Change-Id: I2f7edf44d676505cf5e7e24640c6850c67f8b5e3 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5401514 Commit-Queue: Charlie Lao <cclao@google.com> Reviewed-by: Charlie Lao <cclao@google.com> Auto-Submit: Shahbaz Youssefi <syoussefi@chromium.org>
Shahbaz Youssefi c71a67de 2024-03-27T15:50:00 Vulkan: Move pipeline cache graph dump to renderer In preparation for moving some caches to the share group. Bug: angleproject:6565 Bug: angleproject:8629 Change-Id: I1a06a18417502e499da0edb9abb0d510e3ad99ce Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5401513 Auto-Submit: Shahbaz Youssefi <syoussefi@chromium.org> Reviewed-by: Charlie Lao <cclao@google.com> Commit-Queue: Charlie Lao <cclao@google.com> Reviewed-by: mohan maiya <m.maiya@samsung.com>
Shahbaz Youssefi 9475ac40 2023-11-15T10:25:06 Vulkan: Make efficient MSAA resolve possible Prior to this change, using a resolve attachment to implement resolve through glBlitFramebuffer was done by temporarily modifying the source FramebufferVk's framebuffer description. This caused a good deal of complexity; enough to require the render pass to be immediately closed after this optimization. The downsides to this are: - Only one attachment can be efficiently resolved - There is no chance for the MSAA attachment to be invalidated In this change, resolve attachments that are added because of glBlitFramebuffer are stored in the command buffer, with the FramebufferVk completely oblivious to them. When the render pass is closed, either the FramebufferVk's original framebuffer object is used (if no resolve attachments are added) or a temporary one is created to include those resolve attachments. With the above method, the render pass is able to accumulate many resolve attachments as well as have its MSAA attachments be invalidated before it is flushed. For a FramebufferVk that is resolved in this way, there used to be two framebuffers created each time and thrown away as the code alternated between starting a render pass without a resolve attachment and then closing with one. With this change, there is now one framebuffer (without resolve attachments) that is cached in FramebufferVk (and is not recreated every time), and only the framebuffer with resolve attachments is recreated every time. Ultimatley, when VK_KHR_dynamic_rendering is implemented in ANGLE, there would be no framebuffers to create and destroy, and this change paves the way for that support too. WindowSurfaceVk framebuffers are still imagefull. Making them imageless adds unnecessary complication with no benefit. ----------------- To achieve efficient MSAA rendering on tiling hardware, applications should do the following: ``` glBindFramebuffer(GL_FRAMEBUFFER, msaaFBO); // Clear the framebuffer to avoid a load // Or invalidate, if not needed to load: // glInvalidateFramebuffer(GL_DRAW_FRAMEBUFFER, ...); glClear(...); // Draw calls // Resolve into the single sampled framebuffer glBindFramebuffer(GL_DRAW_FRAMEBUFFER, resolveFBO); glBlitFramebuffer(...); // Immediately discard the contents of the MSAA buffer, to avoid store glInvalidateFramebuffer(GL_READ_FRAMEBUFFER, ...); ``` The above would translate to the following Vulkan render pass: - MSAA LOAD_OP_CLEAR/DONT_CARE - MSAA STORE_OP_DONT_CARE - Resolve LOAD_OP_DONT_CARE - Resolve STORE_OP_STORE This makes sure the MSAA data doesn't leave the tile memory and greatly reduces bandwidth usage. Once anglebug.com/4892 is fixed, this would also allow the MSAA image to never be allocated either. Bug: angleproject:7551 Bug: angleproject:8625 Change-Id: Ia9f4d20863d76a013d8495033f95c7b39f77e062 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5388492 Reviewed-by: Yuxin Hu <yuxinhu@google.com> Reviewed-by: Amirali Abdolrashidi <abdolrashidi@google.com> Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org>
Shahbaz Youssefi 914fe61b 2024-03-15T13:20:49 Vulkan: Rename RendererVk.* to vk_renderer.* Done in a separate CL from the move to namespace vk to avoid possible rebase-time confusion with the file name change. Bug: angleproject:8564 Change-Id: Ibab79029834b88514d4466a7a4c076b1352bc450 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5370107 Reviewed-by: Amirali Abdolrashidi <abdolrashidi@google.com> Commit-Queue: Amirali Abdolrashidi <abdolrashidi@google.com>
Shahbaz Youssefi 60aaf4a0 2024-03-14T12:58:56 Vulkan: Move renderer to namespace vk This class is agnostic of EGL. This change moves it to namespace vk for use with the OpenCL implementation Bug: angleproject:8564 Change-Id: I57f7807d6af8b3d5d7f8efbaf8b5d537a930f881 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5371324 Reviewed-by: Austin Annestrand <a.annestrand@samsung.com> Reviewed-by: Geoff Lang <geofflang@chromium.org> Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org>
Mohan Maiya 91ddf851 2024-03-03T10:57:22 Vulkan: support QCOM foveated rendering extensions Add support for foveated rendering in the vulkan backend. This is done by leveraging the VK_KHR_fragment_shading_rate extension. Bug: angleproject:8484 Change-Id: I0d01d07583f710b2302ea07b19c9d113c73bfe41 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5269907 Reviewed-by: Charlie Lao <cclao@google.com> Commit-Queue: mohan maiya <m.maiya@samsung.com> Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org>
Shahbaz Youssefi 4e6fe5e0 2024-02-29T15:01:06 Vulkan: Cache ImageLoadContext in context This avoids the need to requery this from the display every time. Bug: angleproject:8564 Change-Id: Ied650e7789741f59b7662c0f97c55132b105778d Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5332074 Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org> Reviewed-by: Charlie Lao <cclao@google.com> Reviewed-by: Geoff Lang <geofflang@chromium.org>
Mohan Maiya 6607a2b9 2024-01-17T15:58:20 Vulkan: Add support for VK_EXT_vertex_input_dynamic_state Hook into VK_EXT_vertex_input_dynamic_state so pipeline states that differ only in vertex input state can reuse existing pipelines. Bug: angleproject:7162 Tests: StateChangeTestES3.Vertex* Change-Id: Icd3134dee93fc5fc2e9d284fcfa8c674b62faec8 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5207462 Commit-Queue: mohan maiya <m.maiya@samsung.com> Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org>
Shahbaz Youssefi b380ed1f 2024-02-14T09:31:26 Vulkan: Add EGL_ANGLE_global_fence_sync Chrome has an implicit assumption that due to context virtualization, signaling a fence in one context results in synchronization with _all_ contexts that have previously made submissions. This is not per EGL spec, but the functionality is easily implementable in the Vulkan backend. In the Vulkan backend, each context is given its own "timeline" of submissions (tracked by serials associated with "indices"). The required functionality is implemented through a new EGL fence sync object whose sole difference is that it synchronizes with all the existing timelines rather than the one of the current context. Bug: b/318721705 Change-Id: I6c45d065e592d0d4ed627ce9695196b1086d5021 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5297396 Reviewed-by: Charlie Lao <cclao@google.com> Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org>
Mohan Maiya ac71a592 2024-01-23T11:56:42 Vulkan: updates to pipeline cache graph dumping logic 1. To dump pipeline cache graph you need to - 1. add "angle_dump_pipeline_cache_graph" compile time flag to args.gn 2. set Android property "angle.dump_pipeline_cache_graph" or envvar ANGLE_DUMP_PIPELINE_CACHE_GRAPH on non-Android platforms before app start 2. Default path for dump on Android is "/data/local/tmp/angle_dumps/" 3. "angle.pipeline_cache_graph_dump_path" Android property or envvar ANGLE_PIPELINE_CACHE_GRAPH_DUMP_PATH on non-Android platforms can be used to configure the dump path Bug: angleproject:6565 Change-Id: I38848aff58f413dd7bdffc9083116bd4b95e4960 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5226054 Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Commit-Queue: mohan maiya <m.maiya@samsung.com>
Shahbaz Youssefi e00995d4 2023-12-21T15:57:39 Vulkan: Invalidate pipeline with FBO draw buffer change Enabling/disabling draw buffers can affect the graphics pipeline without changing the render pass description. In that case, the graphics pipeline was not being invalidated. Bug: angleproject:8463 Change-Id: I6848472dcbb3d3ce4c34d95be28c8ec3fc50dcd7 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5147847 Reviewed-by: Charlie Lao <cclao@google.com> Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org> Reviewed-by: mohan maiya <m.maiya@samsung.com>
Shahbaz Youssefi a1143857 2023-12-18T15:24:42 Fix UBO dirty bits vs PPOs This change fixes propagation of UBO dirty bits (such as through glUniformBlockBinding and glBindBufferRange) to program pipeline objects. Since PPOs concatenate the attached programs' UBOs in a list, a map of program UBO indices to PPO UBO indices is introduced to offset these dirty bits appropriately. Additionally, when the program's executable's buffer bindings change (through glUniformBlockBinding), a notification is send to the PPO to update its executable's buffer binding accordingly (which is otherwise only updated during PPO link). Bug: angleproject:8462 Change-Id: I4965ae23e6fc6cac0842e1643755e42e95d3d5cc Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5131418 Reviewed-by: Charlie Lao <cclao@google.com> Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org> Reviewed-by: Cody Northrop <cnorthrop@google.com>
Charlie Lao 471b5040 2023-10-26T09:33:29 Vulkan: Only enable DS dynamic state if there is DS attachment. This is discovered while investigating EXT_yuv_target crash in driver. What happens is that UtilsVk::copyImage does not set depth stencil dynamic state since there is no depth stencil attachment. But we enabled dynamic state for D/S, thus driver still does D/S state setup, which sees garbage data and hitting assertion. Even though this is discovered with EXT_yuv_target test, I believe this is a general issue. This CL adds the renderPassDesc.hasDepthAttachment() and hasStencilAttachment() check and enable depth or stencil related dynamic state only if there is depth or stencil attachment. This fixes crash in driver with ImageTestES3.ClearYUVAHB test. This also has added performance benefit that we now completely skips depth/stencil related dynamic state dirty bit handling code, thus reduces state processing CPU overhead. Bug: b/223456677 Change-Id: I3a4fe6d97b14c066d78f8b8ded21c626cb2f376c Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/4980765 Reviewed-by: Chris Forbes <chrisforbes@google.com> Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Reviewed-by: Yuxin Hu <yuxinhu@google.com> Commit-Queue: Charlie Lao <cclao@google.com>
Charlie Lao 2608c622 2023-10-06T13:32:49 Vulkan: Refactor SharedGarbageList into templated class This CL mostly involves non-functional changes to prepare for next CL. No behavior change is expected. This CL wraps the garbage list into its own templated class which maintains std::queue and tracks number bytes in the queue etc. This CL also renames SharedBufferSuballocationGarbageList to BufferSuballocationGarbageList to reduce verbosity a bit. This CL deleted GarbageAndQueueSerial and GarbageQueue since they are no longer being used. This renames vk::GarbageList to vk::GarbageObjects to reduce name confusion with SharedGarbageList. Bug: b/302739073 Change-Id: I7370c147847ffe69ad8aa3b48251d8b5762f97f9 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/4919816 Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Reviewed-by: Hailin Zhang <hailinzhang@google.com> Commit-Queue: Charlie Lao <cclao@google.com>
Roman Lavrov 34c8778b 2023-09-26T13:45:06 Use atomic counters early in perf warning macros Before this CL, snprintf was called repeatedly to format the warning message which was then discarded after 4 logs. snprintf showed up in profiling at ~2% and this CL appears to yield an ~8% power improvement in one of the traces (egypt_1500). A mutex was previously used to avoid the race condition on the static sRepeatCount variable. This CL avoids the need for that by using static atomics instead. Also updated the Debug macro to use the VK macro vararg approach so that formatting only happens when the message is actually logged. Bug: b/302112423 Change-Id: Ia8a18361cfb5a9f2aa19ff939499754ba861efb7 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/4886388 Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Reviewed-by: Charlie Lao <cclao@google.com> Commit-Queue: Roman Lavrov <romanl@google.com>
Amirali Abdolrashidi e1d2e88a 2023-09-20T11:53:15 Check pending garbage after some buffer releases * Embedded BufferHelper::releaseBufferAndDescriptorSetCache() inside a new ContextVk method: releaseBufferAllocation() * After releasing the buffer, there is a check for excess pending garbage. If the tracked pending garbage size is larger than the threshold, the context will be flushed. * Unskipped the test "BufferDataInLoopManyTimes", which was failing on Android devices. Bug: b/280304441 Change-Id: Ib34319f3291dd2200fc1a92e30645f9d1da8e2b9 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/4879086 Commit-Queue: Amirali Abdolrashidi <abdolrashidi@google.com> Reviewed-by: Charlie Lao <cclao@google.com> Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org>
Amirali Abdolrashidi 91ef1f3c 2023-09-08T16:39:53 Move buffer suballocation callers to ContextVk * Moved the following functions from BufferHelper to ContextVk. * initBufferForBufferCopy() * initBufferForImageCopy() * initBufferForVertexConversion() Bug: b/280304441 Change-Id: I890f4396b00b0c20feb44f0ad113c55924ce1014 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/4854760 Commit-Queue: Amirali Abdolrashidi <abdolrashidi@google.com> Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Reviewed-by: Charlie Lao <cclao@google.com>
Amirali Abdolrashidi a1f52f1b 2023-09-07T14:44:24 Vulkan: Flush pending image garbage more often * Added a counter to the context object to keep track of the size of the pending image garbage: mEstimatedPendingImageGarbageSize. * Modified hasExcessPendingGarbage() to use the sum of the size of the image and and suballocation garbage. * RendererVk::calculatePendingGarbageSizeLimit() provides the limit. * Currently the limit is based on the available heap sizes. It will use a fraction of the largest memory heap size. * The portion is currently kGarbageSizeLimitCoefficient = 0.2f. * Unskipped the test "TextureDataInLoopManyTimes", which was failing on Android devices. Bug: b/280304441 Change-Id: Ibcced1d118ea8a1f347028b62d29cfbd9e38e8c0 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/4851252 Reviewed-by: Charlie Lao <cclao@google.com> Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Commit-Queue: Amirali Abdolrashidi <abdolrashidi@google.com>
Amirali Abdolrashidi 27896999 2023-08-16T16:44:22 Vulkan: Flush pending suballoc garbage more often * Added a counter to the renderer object to keep track of the pending suballocation garbage. * mPendingSuballocationGarbageSizeInBytes * Once it surpasses a limit (mPendingSuballocationGarbageSizeLimit), it will flush the context so the pending garbages can be freed. * Currently the limit is based on the available heap sizes. It will use a fraction of the largest memory heap size. * The portion is currently kGarbageSizeLimitCoefficient = 0.2f. * At the end of the render pass, it is checked if the limit has been reached. If so, context flush will occur. Bug: b/280304441 Change-Id: I08e6028cfe20059ece2b2e4e971ece897544cd6d Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/4787950 Reviewed-by: Charlie Lao <cclao@google.com> Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Commit-Queue: Amirali Abdolrashidi <abdolrashidi@google.com>
Charlie Lao 1b450b92 2023-09-15T11:07:25 Vulkan: Fix buffer storage reuse bug when robustAccess is enabled There is an optimization in vulkan backend that when the bufferData is called and current storage size is big enough for new bufferData call, we just reuse the storage. Mean while, when hasRobustAccess() is true, we must use the VkBuffer with the exact user size that glBufferData call provides so that driver can set proper access boundary. In order to satisfy both requirement, if robust resource access is enabled, we create a separate VkBuffer with the exact user provided size but bind to the same memory. There is a bug here that if robustAccess is true, this buffer of user provided size is not been recreated when storage is reused but with different user size (both has same allocation size). This causes we keep using the smaller VkBuffer and subsequently causes missing triangles. This CL clears mBufferWithUserSize when size changes and storage is reused. The other bug here is that previously we are checking isRobustResourceInitEnabled, which is incorrect. We should check hasRobustAccess. This appears works for chrome possibly due to both are enabled. This CL switches it to check hasRobustAccess. This CL also renames mBufferForVertexArray to mBufferWithUserSize to reflect what its true meaning. Bug: chromium:1476475 Change-Id: I843cc3a705f8a582a97bc0307f03aa1eb9fad3ff Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/4864003 Commit-Queue: Charlie Lao <cclao@google.com> Reviewed-by: Geoff Lang <geofflang@chromium.org> Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org>
Amirali Abdolrashidi e7418836 2023-08-16T14:25:52 Vulkan: Add context flushing as OOM fallback * As a new fallback for out-of-memory errors, if an allocation results in device OOM, the context is flushed and the allocation is retried. * Functions related to buffer/image allocations now return a VkResult value instead of angle::Result, which will be bubbled up to a higher level for safer handling. * The OOM is no longer handled at the level where the allocation happens, but is moved up to the context. * Added two functions to ContextVk for allocating memory for images and buffer suballocations, which also include the fallback options. * initBufferAllocation(): Uses BufferHelper::initSuballocation() * initImageAllocation(): Uses ImageHelper::initMemory() * Moved initNonZeroMemory() out of the following functions: * BufferHelper::initSuballocation() * Moved to ContextVk::initBufferAllocation(). * ImageHelper::initMemory() * Moved to ContextVk::initImageAllocation(). * Also moved to new function: ImageHelper::initMemoryAndNonZeroFillIfNeeded(). This function replaced the rest of initMemory() usages outside initImageAllocation(). * New macros for memory allocation * VK_RESULT_TRY() * If the output of the command inside it is not VK_SUCCESS, it will return with the error result from the command. * VK_RESULT_CHECK() * If the output of the command inside it is not VK_SUCCESS, it will return with the input error. * Added a test in which allocation would fail due to too much pending garbage without the fix on some platforms. The test ends once there has been a submission. * New suite: UniformBufferMemoryTest * Added a similar test for flushing texture-related pending garbage. * New suite: Texture2DMemoryTestES3 Bug: b/280304441 Change-Id: I60248ce39eae80b5a8ffe4723d8a1c5641087f23 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/4787949 Reviewed-by: Charlie Lao <cclao@google.com> Commit-Queue: Amirali Abdolrashidi <abdolrashidi@google.com> Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org>
Shahbaz Youssefi 48e2c605 2023-09-07T14:30:46 More instances of program usage converted to executable Bug: angleproject:8297 Change-Id: I8e4eeef8f4f20610bbe0f994ce1141c17d588765 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/4850888 Reviewed-by: Shahbaz Youssefi <syoussefi@google.com> Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org> Reviewed-by: Geoff Lang <geofflang@chromium.org>
Shahbaz Youssefi 6698fb69 2023-08-25T22:21:32 Vulkan: Stop passing both ProgramExecutable and ...Vk around Now that ProgramExecutableVk is accessible through ProgramExecutable. Bug: angleproject:8297 Change-Id: Ie08770ef97400195d63b87f2d4b7e2a2c8f4ad24 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/4812147 Reviewed-by: Charlie Lao <cclao@google.com> Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org> Reviewed-by: Yuxin Hu <yuxinhu@google.com>
Shahbaz Youssefi bb135f0e 2023-08-24T15:29:11 Make ProgramExecutableImpl managed by ProgramExecutable This change allows both parts of the program executable to be safely backed up and swapped on link. Bug: angleproject:8297 Change-Id: I17e4b6c05e4e481a66a227d6047dbf943d2c2603 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/4812138 Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org> Reviewed-by: Charlie Lao <cclao@google.com> Reviewed-by: Geoff Lang <geofflang@chromium.org>
Shahbaz Youssefi 571b4cdb 2023-08-14T16:55:28 Vulkan: Move pipeline/desc-set layout creation to link job The pipeline and desc-set layout caches are consequently made thread-safe. The reference counter on the layouts are also made atomic. With this change, practically all of the link in the Vulkan backend is moved to the link job. Bug: angleproject:8297 Change-Id: Iba694ece5fc5510d34cce2c34441ae08ca5bb646 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/4774787 Reviewed-by: Charlie Lao <cclao@google.com> Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org>
Shahbaz Youssefi 16cfa28e 2023-08-08T22:08:24 Vulkan: Basic infra for parallel link This change moves pipeline warm up to a parallelizable task, mostly as an exercise to put in the infrastructure for parallel link in the Vulkan backend. Follow up changes will move more of the link step to this task. The end goal is to be able to make the link task independent of ContextVk, which would allow it to be run as an UnlockedTailCall, even if not using a worker thread. Bug: angleproject:8297 Change-Id: I17047162b2a41f0d681d9e3ee33f2e0239b4280d Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/4764231 Reviewed-by: Charlie Lao <cclao@google.com> Reviewed-by: Geoff Lang <geofflang@chromium.org> Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org>
Shahbaz Youssefi 52fe3116 2023-07-17T16:20:54 Vulkan: Deduplicate share group's context set tracking Bug: angleproject:8224 Change-Id: I7a59a37229682fb91ff777f31e02e05d7ab2b80f Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/4690345 Reviewed-by: Charlie Lao <cclao@google.com> Reviewed-by: Geoff Lang <geofflang@chromium.org> Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org>
Roman Lavrov c0f2f71e 2023-06-27T16:00:09 Use VK_EXT_legacy_dithering when available instead of emulation Yields improvement in gpu power: http://b/284462263#comment45 Bug: b/284462263 Change-Id: I5bfd115557b6baac17c05639118feaebf19c5cd4 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/4652590 Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Commit-Queue: Roman Lavrov <romanl@google.com>
Charlie Lao 6ffd0d20 2023-07-12T12:09:45 Vulkan: Clean up depth stencil feedback mode part 2 Right now the tracking of depth stencil buffer readOnly or feedback loop is in FramebufferVk class. This really belongs to ContextVk, since it is not a permanent state of framebuffer, but current state of context. This CL moves it to ContextVk and changes to use BitSet instead of four boolean. Bug: b/289436017 Change-Id: I955c439259935f82eff30ddfff776a69723e5d0d Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/4679886 Commit-Queue: Charlie Lao <cclao@google.com> Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Reviewed-by: Yuxin Hu <yuxinhu@google.com>
Shahbaz Youssefi 5f581f87 2023-06-27T20:38:03 Pass dirty bits by value Split CL from follow up change where the dirty bits need to be passed by value as they are calculated from two sets. Many cached dirty bits are turned to constexpr as a result. Bug: angleproject:8224 Change-Id: Ibdb3090d6ee93788e1502b72bce55f4677937c58 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/4650074 Reviewed-by: Roman Lavrov <romanl@google.com> Auto-Submit: Shahbaz Youssefi <syoussefi@chromium.org> Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org> Reviewed-by: Yuxin Hu <yuxinhu@google.com>
Charlie Lao 02292814 2023-06-01T14:46:05 Vulkan: Optimize the usage of FastMap in DescriptorSetDescBuilder While looking at disassemble of DescriptorSetDescBuilder::updateOneShaderBuffer() function, I noticed that there are a lot of CPU cycles spent in FastMap::operator[]. What happend here is that we are increasing size one by one as we build descriptorSet, and that hit `if (mData.size() <= key)` case and we end up resize the underline FastVector, and that resize also initialize the element with zeros, which immediately overwrite by actual data. Since we actually know the eventual size of DescriptorSetDescBuilder::mDesc/mHandles/mDynamicOffsets, we could just switch to angle::FastVector which will avoid this check size and grow every time we write to it. This CL switches the use of FastMap in DescriptorSetDescBuilder to FastVector. The only trick we need to watch out is that previously the new elements are always zero filled and now it does not. So we need to make sure we write every field of structure. This CL also renames WriteDescriptorDescBuilder to WriteDescriptorDescs since when it is read only we are passing it as const reference already, there is no added advantage to have two classes. Bug: b/282194402 Change-Id: I06a063cc51585fc17fbf0d5aa916b9aa0ab88dd4 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/4581881 Reviewed-by: Roman Lavrov <romanl@google.com> Commit-Queue: Charlie Lao <cclao@google.com> Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org>
Shahbaz Youssefi 2e209516 2023-06-26T11:58:50 Move state dirty bits definitions out of the class This is in preparation for a follow up change that splits the state class. Bug: angleproject:8224 Change-Id: Ic9b253583e40fcc93ff37605b6b6e1deb55a6e55 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/4631843 Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org> Reviewed-by: Geoff Lang <geofflang@chromium.org>
Charlie Lao 2501903e 2023-05-31T11:59:36 Vulkan: Merge UpdateShader***Buffers into updateShaderBuffers Both UpdateShaderUniformBuffers() and DescriptorSetDescBuilder::updateShaderBuffers() walks the list of uniform blocks (or storage blocks). Some of the logic are the same and we are paying that overhead twice. In this CL, UpdateShaderStorageBuffers, UpdateShaderUniformBuffers and UpdateShaderAtomicCounterBuffers functions are merged into DescriptorSetDescBuilder::updateShaderBuffers and DescriptorSetDescBuilder::updateAtomicCounters. In order to handle the usage that same buffer used by multiple shader stages, a new variant of bufferRead() function is added that takes "const gl::ShaderBitSet" instead of single PipelineStage. This also paves way for next CL so that we can call updateOneShaderBuffer individually. Bug: b/282194402 Change-Id: I09d045d6295827b60bdb4c05df9333fe593fa40e Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/4574288 Reviewed-by: Roman Lavrov <romanl@google.com> Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Commit-Queue: Charlie Lao <cclao@google.com>
Shahbaz Youssefi b0e9bbd7 2023-05-31T14:23:40 Vulkan: Split features for dynamic state When a driver bug with dynamic state is encountered, it is hard to debug which dynamic state exactly is causing an issue, due to the current granularity of disabling all entire state from an extension. With this change, every dynamic state gets its own ANGLE feature, and can be toggled as necessary. Disabling the supportsExtendedDynamicState* features implicitly disables all dependent features. Bug: b/285124778 Bug: b/275210062 Bug: fuchsia:107106 Bug: angleproject:5906 Change-Id: Ic291279872df2d0eb58618ff364ab118bdcc4a9f Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/4577553 Reviewed-by: Cody Northrop <cnorthrop@google.com> Auto-Submit: Shahbaz Youssefi <syoussefi@chromium.org> Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org> Reviewed-by: Yuxin Hu <yuxinhu@google.com>
Charlie Lao 297687c6 2023-05-18T17:27:54 Vulkan: Reduce CPU overhead for uniform buffer change One of the common usage pattern is change uniform buffers and draw. Right now every uniform buffer change goes into ShaderResource dirty code path that rebuilds entire ShaderResource cache key and descriptor set etc. This CL keeps SHaderResource dirty code path, and will be used when program changes or framebuffer changes. But uniform buffer change will go down its own code path that only update uniform buffer related state. This CL along with prior two CLs reduced asphalt_9 average frame time (with multi context hacked away) from 5.375 ms to 5.2594 ms (reduced 2.15%). Bug: b/282194402 Change-Id: Ibae2895663918ddc10bf13bc559f1483f94d2e11 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/4528314 Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Commit-Queue: Charlie Lao <cclao@google.com> Reviewed-by: Roman Lavrov <romanl@google.com>
Charlie Lao 9445fbbe 2023-05-16T10:43:55 Vulkan: Move mWriteDescriptors out of DescriptorSetDescBuilder DescriptorSetDescBuilder has two data structures: mWriteDescriptors, which keeps track of "layout" of descriptorSet, and mDescriptorInfos, which is the actual cache key for descriptorSet. mDescriptorInfos will be updated every time buffer or image changes. But mWriteDescriptors is immutable with buffer/image, it is static per program executable (with exception of InputAttachment). Right now whenever there is a buffer or image change, we call into DescriptorSetDescBuilder and update both mWriteDescriptors and mDescriptorInfos. This CL moves mWriteDescriptors out of DescriptorSetDescBuilder, and stores it in ProgramExecutableVk. To deal with InputAttachment variation, ContextVk makes a copy of mWriteDescriptors when program is bound, and then update inputAttachment with framebuffer information. This not only removes unnecessary update of mWriteDescriptors, but also removes the requirement that mShaderBuffersDescriptorDesc has to reset and rebuild as a whole (because mWriteDescriptors keeps mCurrentInfoIndex which gets incremented as we build). This allows us to do further optimization in future to do piece meal update of mDescriptorInfos with only the changed data. Bug: b/282194402 Change-Id: I443c7c3b85b7a2e2e93c68d40ea102533c43f76a Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/4540280 Reviewed-by: Roman Lavrov <romanl@google.com> Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Commit-Queue: Charlie Lao <cclao@google.com>
Charlie Lao 2c836045 2023-05-18T12:00:41 Vulkan: Remove buffer/image tracking from DescriptorSetDescBuilder Right now DescriptorSetDescBuilder keeps an array of images and buffers as we build DescriptorSet cache key. Later on if we have a cache miss and allocated a new cache entry, we walk the array and tag the images and buffers with newSharedCacheKey. If we have a cache hit, the tracked images/buffers are unused and then cleared. This means the effort of keep track of these buffers are wasted when we have cache hit, which we expect to be most likely. This was initially implemented this way simply because of convenience, but there is really not a need to add another tracker for them, as they are readily available from context state anyway. This CL remove the tracking of images/buffers from DescriptorSetDescBuilder and replaced with context API to tag images and buffers with newSharedCacheKeywhen when we have a cache miss. Bug: b/282194402 Change-Id: I355c2fbabdfc573ce71c0a4281788c942d260271 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/4539290 Commit-Queue: Charlie Lao <cclao@google.com> Reviewed-by: Roman Lavrov <romanl@google.com> Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org>
Charlie Lao 086b6c20 2023-05-04T15:55:31 Vulkan: Simplify TransformFeedback buffer tracking Right now the buffers that used by transform feedback is tracked by mCurrentTransformFeedbackBuffers, which is angle::FlatUnorderedSet. With the QueueSerial numbers, this can be much simplified. This CL removes mCurrentTransformFeedbackBuffers. It adds a new QueueSerial variable called mCurrentTransformFeedbackQueueSerial, and is inavlid if no active transform feedback. Then check if a buffer is used by transform feedback is as simple as check if it is written by mCurrentTransformFeedbackQueueSerial. Since buffers are already tracked by queue serial, so there is no new tracking needed. It is simply a check if the buffer contains mCurrentTransformFeedbackQueueSerial or not. Bug: b/280889890 Change-Id: I54bdc4a0cfc7194a12d2aa0abdc67a3211949024 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/4507978 Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Commit-Queue: Charlie Lao <cclao@google.com>
Igor Nazarov 903d9fdf 2023-01-19T15:37:45 Vulkan: Implement ExternalFence for use in SyncHelperNativeFence `ExternalFence` allows concurrent usage in `CommandQueue` and `SyncHelperNativeFence` classes eliminating need of additional `vkQueueSubmit()` call. Waiting in `CommandQueue` on `QueueSerial` or `ResourceUse` will ensure corresponding state of the native FD (because `CommandQueue` will wait on the same FD instead of some other fence). After this change there will be only single `vkQueueSubmit()` call from the `SyncHelperNativeFence::initializeWithFd()` method. This CL and the follow-up is sufficient to fix the bugs below. Bug: angleproject:8115 Bug: angleproject:8117 Test: angle_end2end_tests --gtest_filter=EGLSyncTest.AndroidNativeFence_ExternalFenceWaitVVLBug* Change-Id: Ic562ecc71a95203454a1dc438589a13bcf3bff7f Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/4392879 Reviewed-by: Charlie Lao <cclao@google.com> Commit-Queue: Igor Nazarov <i.nazarov@samsung.com> Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org>
Igor Nazarov e24f4519 2023-01-19T02:30:39 Vulkan: Add externalFence into submitCommands() Currently one-off fence in the `queueSubmitOneOff()` is used only in `SyncHelperNativeFence::initializeWithFd()` to submit external fence. Other `queueSubmitOneOff()` calls may use `QueueSerial` instead of a fence. Providing `fence` into `queueSubmitOneOff()` prevents tracking that submission with `QueueSerial`. Therefore using `mUse` to collecting `mFenceWithFd` as garbage will not work as intended. This CL removes `fence` from `queueSubmitOneOff()` and adds optional `externalFence` into `submitCommands()` instead. Providing `externalFence` will cause additional `vkQueueSubmit()` call: - first submission will submit everything as usual except using the `externalFence`. - second, will only submit internal `CommandQueue` fence for `QueueSerial` tracking. As the result of this CL, call to `initializeWithFd()` will always produce two (2) `vkQueueSubmit()` calls. Previously it may be one (1) or two (2) submissions. Future CL will reduce submission count to one (1). If add additional submission into `queueSubmitOneOff()` instead of `submitCommands()`, then maximum number of submissions will be three (3). Bug: angleproject:8117 Change-Id: I6f1ec12682aaab71bfc871e665fec2659df96b26 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/4392877 Reviewed-by: Charlie Lao <cclao@google.com> Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Commit-Queue: Igor Nazarov <i.nazarov@samsung.com>
Charlie Lao b875f47b 2023-04-24T11:11:03 Vulkan: Make mLastSubmittedQueueSerial reflect what it means Right now we always update ContextVk::mLastSubmittedQueueSerial and ContextVk::mLastFlushedQueueSerial when context becomes current, even though it does not make any submission. It was done it this way mainly for simplification (i.e, you will always see both queueSerial's index the same). But this also causes a performance cost when a mLastSubmittedQueueSerial is used to tag a resource. For example, if you insert a EGLSync right after makeCurrent, that EGLSync may get a queue serial number bigger than it should be, which will translate to longer sync wait time. This CL changes these two queue serial to exact what the name suggests, that it will only update if we made a "flush" or "submit" call. Bug: b/277644512 Change-Id: Ibe4c78985a3fe0726836d620202e5276894a8e7c Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/4458532 Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Commit-Queue: Charlie Lao <cclao@google.com> Reviewed-by: Yuxin Hu <yuxinhu@google.com>
Alexey Knyazev e27759f9 2023-04-20T00:00:00 D3D11: Ignore sample mask and A2C for single-sampled rendering Also fixed alpha-to-coverage for single-sampled rendering and simplified sample coverage code in the Vulkan backend. Fixed: angleproject:8102 Change-Id: Ieea03dfdc13c6105ddf916dca6d0fea593eb3a62 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/4455508 Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Reviewed-by: Geoff Lang <geofflang@chromium.org> Commit-Queue: Alexey Knyazev <lexa.knyazev@gmail.com>
Charlie Lao 8049d082 2023-04-20T16:13:34 Vulkan: Split ShareGroupVk class from DisplayVK into its own files When ShareGroupVk class was introduced, it is a bit of convenience to put it in the DisplayVk.h and DisplayVk.cpp files. Now we have added more and more code into ShareGroupVk class and it deserves to have its own files. This CL added two files ShareGroupVk.h and ShareGroupVk.cpp and moved the class into the new files. No functional change is expected. Bug: None Change-Id: I8683a3dc4192612d6ec8abbc7f00424958f09598 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/4454639 Commit-Queue: Charlie Lao <cclao@google.com> Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org>
Charlie Lao fa9172a3 2023-03-27T09:49:33 Reland "Vulkan: Use midRenderPass clear if RP has started but inactive" This is a reland of commit 98151770adfd990c533991da27615b4879494307 Original change's description: > Vulkan: Use midRenderPass clear if RP has started but inactive > > This CL extends prior CL's optimization so that if clear is issued right > after blitFramebuffer call (this could make sense if blit and clear are > on different buffer), we can keep the started render pass and do the > midRenderPass clear instead of endRenderPass and start another > renderPass. > > Bug: b/273808966 > Change-Id: Ia2504e8e260867a6f797d42cd4c8a72f187280ef > Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/4374145 > Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Bug: b/273808966 Change-Id: I5c8c85c173f021a7753ef579f83d9ceb24147a7c Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/4442911 Commit-Queue: Charlie Lao <cclao@google.com> Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org>
Charlie Lao 9188aa0e 2023-04-18T10:05:29 Vulkan: Disallow reactivate of UtilsVk::blitResolve renderPass We are still seeing ClearTestES3.RepeatedStencilClearWithBlitInBetween/ES3_Vulkan flakiness on win-test bot with intel GPU. The exact root cause is still unknown. For now this CL will disallow reactivate of UtilsVk::blitResolve renderPass by the subsequent user's draw calls. Bug: b/273808966 Change-Id: Iebf37da3642d1fc3ee724b0743bfc0767ac48354 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/4442446 Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Commit-Queue: Charlie Lao <cclao@google.com>
Charlie Lao 2f19bb74 2023-03-16T16:03:29 Reland "Vulkan: Reactivate already started render pass when possible" This is a reland of commit ad9537af7f2bb5e22bc73f4e833fd3789adaa217 Original change's description: > Vulkan: Reactivate already started render pass when possible > > In some usage case (such as lineage_mobile), we are seeing in the middle > of render pass, app switch to another fbo just to issue a glClear() > call, which the clear call itself gets deferred. Application then switch > back to the original frame buffer. Before this CL, the render pass gets > recreated due to frame buffer binding change, even though the clear gets > deferred and new render pass and the previous render pass are > essentially the same. This CL detects this situation and reactivate the > current render pass instead of creating a new one. With this CL, > lineage_m app trace reduces frame time from 3.86ms to 3.7ms, and only > one render pass is used instead of two. > > This CL also allows the render pass started by BlitFramebuffer reused by > subsequent draw calls. Asphalt_9 is hitting this use pattern and this CL > reduces frame time by 0.1245 ms (from 5.6203 ms to 5.4958 on pixel 7 > pro) > > Bug: b/273808966 > Change-Id: I48c2671cbef3ff9d6cf59caae88c37c77828ee07 > Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/4348713 > Reviewed-by: Amirali Abdolrashidi <abdolrashidi@google.com> > Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> > Commit-Queue: Charlie Lao <cclao@google.com> Bug: b/273808966 Change-Id: Ice9062122ae320b1a0108ff981bc65bd13b2ada0 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/4406888 Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Commit-Queue: Charlie Lao <cclao@google.com> Reviewed-by: Amirali Abdolrashidi <abdolrashidi@google.com>
Charlie Lao 41f0a321 2023-04-03T21:58:43 Revert "Vulkan: Reactivate already started render pass when possible" This reverts commit ad9537af7f2bb5e22bc73f4e833fd3789adaa217. Reason for revert: Suspected cause for flakiness. anglebug.com/8118 Original change's description: > Vulkan: Reactivate already started render pass when possible > > In some usage case (such as lineage_mobile), we are seeing in the middle > of render pass, app switch to another fbo just to issue a glClear() > call, which the clear call itself gets deferred. Application then switch > back to the original frame buffer. Before this CL, the render pass gets > recreated due to frame buffer binding change, even though the clear gets > deferred and new render pass and the previous render pass are > essentially the same. This CL detects this situation and reactivate the > current render pass instead of creating a new one. With this CL, > lineage_m app trace reduces frame time from 3.86ms to 3.7ms, and only > one render pass is used instead of two. > > This CL also allows the render pass started by BlitFramebuffer reused by > subsequent draw calls. Asphalt_9 is hitting this use pattern and this CL > reduces frame time by 0.1245 ms (from 5.6203 ms to 5.4958 on pixel 7 > pro) > > Bug: b/273808966 > Change-Id: I48c2671cbef3ff9d6cf59caae88c37c77828ee07 > Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/4348713 > Reviewed-by: Amirali Abdolrashidi <abdolrashidi@google.com> > Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> > Commit-Queue: Charlie Lao <cclao@google.com> Bug: b/273808966 Change-Id: I81cc2dcacb52466808b2ccf5819feda466c39fc5 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/4396502 Reviewed-by: Charlie Lao <cclao@google.com> Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org> Bot-Commit: Rubber Stamper <rubber-stamper@appspot.gserviceaccount.com> Auto-Submit: Charlie Lao <cclao@google.com> Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org>
Shahbaz Youssefi c5e9de23 2023-04-03T15:14:03 Revert "Vulkan: Use midRenderPass clear if RP has started but inactive" This reverts commit 98151770adfd990c533991da27615b4879494307. Reason for revert: Suspected cause for flakiness. anglebug.com/8118 Original change's description: > Vulkan: Use midRenderPass clear if RP has started but inactive > > This CL extends prior CL's optimization so that if clear is issued right > after blitFramebuffer call (this could make sense if blit and clear are > on different buffer), we can keep the started render pass and do the > midRenderPass clear instead of endRenderPass and start another > renderPass. > > Bug: b/273808966 > Change-Id: Ia2504e8e260867a6f797d42cd4c8a72f187280ef > Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/4374145 > Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Bug: b/273808966 Change-Id: I7a11635a6eceafb6f4fb3a0d95f6627ee98321c0 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/4393497 Bot-Commit: Rubber Stamper <rubber-stamper@appspot.gserviceaccount.com> Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org>
Charlie Lao 98151770 2023-03-27T09:49:33 Vulkan: Use midRenderPass clear if RP has started but inactive This CL extends prior CL's optimization so that if clear is issued right after blitFramebuffer call (this could make sense if blit and clear are on different buffer), we can keep the started render pass and do the midRenderPass clear instead of endRenderPass and start another renderPass. Bug: b/273808966 Change-Id: Ia2504e8e260867a6f797d42cd4c8a72f187280ef Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/4374145 Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org>
Charlie Lao ad9537af 2023-03-16T16:03:29 Vulkan: Reactivate already started render pass when possible In some usage case (such as lineage_mobile), we are seeing in the middle of render pass, app switch to another fbo just to issue a glClear() call, which the clear call itself gets deferred. Application then switch back to the original frame buffer. Before this CL, the render pass gets recreated due to frame buffer binding change, even though the clear gets deferred and new render pass and the previous render pass are essentially the same. This CL detects this situation and reactivate the current render pass instead of creating a new one. With this CL, lineage_m app trace reduces frame time from 3.86ms to 3.7ms, and only one render pass is used instead of two. This CL also allows the render pass started by BlitFramebuffer reused by subsequent draw calls. Asphalt_9 is hitting this use pattern and this CL reduces frame time by 0.1245 ms (from 5.6203 ms to 5.4958 on pixel 7 pro) Bug: b/273808966 Change-Id: I48c2671cbef3ff9d6cf59caae88c37c77828ee07 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/4348713 Reviewed-by: Amirali Abdolrashidi <abdolrashidi@google.com> Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Commit-Queue: Charlie Lao <cclao@google.com>
Geoff Lang c6ec59dc 2023-03-27T11:15:48 Explicitly pass the extended dirty bits to syncState. Add a the extended dirty bits and bit mask to syncState instead of calling gl::State::getAndResetExtendedDirtyBits when encountering DIRTY_BIT_EXTENDED. It disallowed us from masking the extended dirty bits and feels like an anti-pattern to modify the extended dirty bits in gl::State from the backend. This is a refactor only. Bug: chromium:1410191 Change-Id: I66fdec3eb57e3426cf0fda9ccb759700eafdda14 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/4374100 Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Reviewed-by: Yuxin Hu <yuxinhu@google.com>