src/libANGLE/renderer/vulkan/vk_cache_utils.h


Log

Author Commit Date CI Message
Charlie Lao cba665f3 2025-09-11T17:59:41 Vulkan: Add fast path for supportsVertexInputDynamicState In earlier CLs, we stored stride/offset/format/divisor in Vulkan structs directly in VertexArray. This CL tries to send these structs to the Vulkan driver directly without making another copy. The divisor code has been modified to update inputRate as well as divisor in VertexArrayVk. When the program's and VertexArray's attribute types match, we use VertexArray's mVertexInputBindingDesc and mVertexInputAttribDesc without any data copy. If the attribType mismatches, we make a copy and patch up the stride/format. Bug: b/439073246 Change-Id: I905b1e6d0609ffc4eb63b47e11a84f8617e06c29 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/6898416 Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Commit-Queue: Charlie Lao <cclao@google.com> Reviewed-by: Amirali Abdolrashidi <abdolrashidi@google.com>
Charlie Lao c292f292 2025-09-10T14:34:14 Vulkan: Remove compressVertexData feature This feature was added for performance reasons. It was used years ago to improve the performance of lego legacy. That entire game dashboard feature was disabled a few years ago, so this code path is no longer used and is not tested on bots either. This CL deletes the feature and its related code path so that we don't leave it to bit-rot. Bug: b/167404532 Bug: b/439073246 Change-Id: I384fc97021592da57d38e8c1771892071ae68a89 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/6935271 Commit-Queue: Charlie Lao <cclao@google.com> Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Reviewed-by: Geoff Lang <geofflang@chromium.org>
Shahbaz Youssefi ebf29178 2025-09-05T12:15:23 Vulkan: Rename ImageLayout to ImageAccess This enum really describes how the image is accessed, including what VkImageLayout it should be in for that access. With VK_KHR_unified_image_layouts, it makes little sense to call this enum ImageLayout anymore, given how almost all of them will have VK_IMAGE_LAYOUT_GENERAL. Bug: angleproject:422982681 Change-Id: Id0ea107d339457e90b7a167292b75211eb42f803 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/6918518 Reviewed-by: Charlie Lao <cclao@google.com> Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org> Auto-Submit: Shahbaz Youssefi <syoussefi@chromium.org> Reviewed-by: Yuxin Hu <yuxinhu@google.com>
Shahbaz Youssefi 1ed999ea 2025-08-25T16:02:01 Vulkan: Move sampler cache to share group The sampler cache (and the adjacent yuv-conversion-info cache) were in vk::Renderer, but they were not thread safe. Bug: angleproject:440364873 Change-Id: I2dc034f2db400f680ca91a9fde509d90f90c957e Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/6870736 Reviewed-by: Charlie Lao <cclao@google.com> Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org> Reviewed-by: Yuxin Hu <yuxinhu@google.com>
Tom Sepez a02670d6 2025-08-26T20:41:16 Move unsafe buffers inside header guard macros While this is exactly opposite of what Chromium has chosen to do, there is an issue with clang-format trying to indent preprocessor directives four spaces relative to include guard. This is because Angle's .clang-format file specifies IndentPPDirectives: AfterHash but Chromium's does not. The current placement is sufficient to throw off clang-format's guard detection since the guard macro no longer covers the entire file. Bug: b/436880895 Change-Id: Ic6b99c8cef6213939cdf9b42af8730e1eb423065 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/6885892 Reviewed-by: Geoff Lang <geofflang@chromium.org> Commit-Queue: Geoff Lang <geofflang@chromium.org> Auto-Submit: Tom Sepez <tsepez@chromium.org> Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org>
Tom Sepez 25390156 2025-08-21T00:13:19 Suppress unsafe buffers on a file-by-file basis in src/ [1 of N] In this CL, we suppress many files but stop short of actually enabling the warning by not removing the line from the unsafe_buffers_paths.txt file. That will happen in a follow-on CL, along with resolving any stragglers missed here. This is mostly a manual change so as to familiarize myself with the kinds of issues faced by the Angle codebase when applying buffer safety warnings. -- Re-generate affected hashes. -- Clang-format applied to all changed files. -- Add a few missing .reserve() calls to vectors as noticed. -- Fix some mismatches between file names and header comments. -- Be more consistent with header comment format (blank lines and trailing //-only lines when a filename comment adjoins license boilerplate). Bug: b/436880895 Change-Id: I3bde5cc2059acbe8345057289214f1a26f1c34aa Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/6869022 Reviewed-by: Geoff Lang <geofflang@chromium.org> Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org>
Shahbaz Youssefi 677d8281 2025-08-25T15:59:46 Vulkan: Pass ContextVk to view-creation functions In preparation for moving the ycbcr conversion cache to the share group. This change is a no-op. Bug: angleproject:440364873 Change-Id: I0c18062259b07813dd04ec02650bb6fab48947ad Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/6879204 Auto-Submit: Shahbaz Youssefi <syoussefi@chromium.org> Reviewed-by: Charlie Lao <cclao@google.com> Commit-Queue: Charlie Lao <cclao@google.com>
Mohan Maiya 890b5d8f 2025-07-07T13:06:54 Vulkan: Encapsulate more descriptor set logic in ProgramExecutableVk - ProgramExecutableVk handles SharedDescriptorSetCacheKey updates - Inline most update*DescInfo methods - Add dedicated methods to handle uniform and storage buffers to remove some branches from frequently used code paths Bug: angleproject:426412564 Tests: UniformBufferTest31.UniformBufferBindingRangeChangeWith*FBF Change-Id: I54b8ae2bd8778231e4d187b2cfd30f4d71de7f3b Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/6733546 Reviewed-by: Charlie Lao <cclao@google.com> Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Commit-Queue: mohan maiya <m.maiya@samsung.com>
Mohan Maiya 30a1cbc9 2025-07-03T13:00:05 Vulkan: Separate out descriptor set for uniform buffers Bug: angleproject:426412564 Change-Id: Icdbb1e634fc543714d1e3b9cdba0530d400cb153 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/6705153 Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Commit-Queue: mohan maiya <m.maiya@samsung.com> Reviewed-by: Charlie Lao <cclao@google.com>
Mohan Maiya ce289330 2025-07-01T19:41:46 Vulkan: Simplify descriptor set management - Descriptor logic is contained in ProgramExecutableVk and doesn't leak into ContextVk - Reduces CPU overhead by not having to constantly copy and resize the DescriptorSetDescBuilder - Simplifies decoupling of descriptor set of uniform buffers from that of other shader resources Bug: angleproject:426412564 Change-Id: Ic0926d0d466ea21f611c2b2c7b844e0bb9027c1b Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/6702410 Reviewed-by: Charlie Lao <cclao@google.com> Commit-Queue: mohan maiya <m.maiya@samsung.com> Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org>
Mohan Maiya 2fd033d0 2025-05-22T04:21:11 Vulkan: Optimize updates to uniform buffers ... when only the offset is modified. Most of the work done when handling dirty uniforms can be skipped since the buffer bindings haven't changed Bug: angleproject:386749841 Tests: UniformBufferTest31.UniformBufferBindingRangeChange*Vulkan Change-Id: Ic811bd71f0f2993f88ce9bcf93f9e8e46dfc6d99 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/6581359 Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Reviewed-by: Charlie Lao <cclao@google.com> Commit-Queue: mohan maiya <m.maiya@samsung.com>
Charlie Lao 59281334 2025-05-22T17:13:22 Vulkan: Reduce kMaxEmptySlots for SharedDescriptorSetCacheKey There is report that addKey is still showing up in simpleperf. This CL breaks addKeyImpl into three functions so that SharedFramebufferCacheKey will still have the same behavior. SharedDescriptorSetCacheKey is changed to track 64 cache key at maximum, and updateEmptySlotBits() is never called in SharedDescriptorSetCacheKey. This means the shared cache key tracking is further limited to usage case where a buffer/texture is only involved in less than 64 descriptorSets. Otherwise we will not track the remaining DescriptorSets, which means if this buffer is released, the corresponding descriptorSet will not immediately destroyed, and we will rely on cache eviction code to take care of DS growing problem. Bug: b/384839847 Change-Id: I99abd17966446377babace6d06cc8f380a71c084 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/6581492 Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Reviewed-by: Yuxin Hu <yuxinhu@google.com> Commit-Queue: Charlie Lao <cclao@google.com>
Jisun Lee d8c15499 2025-05-27T03:46:24 Vulkan: Clear depth and stencil unresolve separately To take into account two situations. 1. LoadOp for depth and stencil attachments are set differently. 2. depth and stencil unresolves could be different between the previous render pass and the current render pass. Bug: angleproject:42266019 Change-Id: I9e069b3972f86abb84eee6280919e6bba2901225 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/6590197 Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Reviewed-by: Yuxin Hu <yuxinhu@google.com> Commit-Queue: Yuxin Hu <yuxinhu@google.com> Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org>
Amirali Abdolrashidi c39f4a5c 2025-06-05T15:35:06 Vulkan: Update border color assignment for stencil In texture border clamp, if the border color is assigned together stencil mode (both using glTexParameter()), its red component is used to set up a border color to be used in each backend: * (Set up in AdjustBorderColor()) In the Vulkan backend, this is used when updating SamplerDesc, which is then used later to set up the custom border color: * (VkSamplerCustomBorderColorCreateInfoEXT) According to the spec, in case of undefined format, integer border color, and stencil image, the implementation is required to use either the first or the second component of the custom color, although it is recommended to use the first. However, at the moment, only the first component is being populated, while using the second component is also valid. * Added feature: usesSecondComponentForStencilBorderColor * Added bit to SamplerDesc: mUsesSecondComponentForStencil * It is set based on the feature flag above and the texture format. * When setting the custom border color info, the second component will be used based on the above flag. * Added test suites to test this on ES31 and ES32: TextureBorderClampTestES3*.CustomBorderColorWithStencil* * Updated capture params for glTexParameterIuivEXT(). * Suppressed the ES32 version for the following: * P4 * Linux/NVIDIA (due to out-of-date driver) Bug: b/390710636 Change-Id: Ie50c19e8ea66da40dc8b8db49d7e622a582637a5 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/6626416 Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Commit-Queue: Amirali Abdolrashidi <abdolrashidi@google.com> Reviewed-by: Yuxin Hu <yuxinhu@google.com>
Mohan Maiya c45e7c83 2025-05-22T04:14:52 Vulkan: Optimize uniform and storage buffer updates Maintain a map between buffer block index and its DescriptorDesc index in WriteDescriptorDescs and look up the map instead of repeatedly calculating it when updating DescriptorDesc Bug: angleproject:386749841 Change-Id: I74d14f6205f07992fae1e338697998d04de1c563 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/6603986 Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Commit-Queue: mohan maiya <m.maiya@samsung.com>
Shahbaz Youssefi f355e2b3 2025-04-15T18:58:25 Vulkan: Remove preferDriverUniformOverSpecConst This was practically true for every vendor on Android (where rotation matters). For Qualcomm, it was also true due to a bug in version checking and didn't seem to be causing any concerns. Where pre-rotation is supported, it is better to enable this feature to avoid excessive pipeline creation. This change removes the feature and makes sure ANGLE always uses uniforms for rotation instead of spec consts. While technically this may have an adverse effect on platforms that never need pre-rotation, the ability is retained for all vendors since pre-rotation is finding its way into more platforms and would likely eventually be needed everywhere anyway. Bug: angleproject:42265878 Bug: angleproject:42262166 Change-Id: I4b64c04da46db08cfdd44b60789b66d93d8e8b17 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/6459025 Reviewed-by: Amirali Abdolrashidi <abdolrashidi@google.com> Reviewed-by: mohan maiya <m.maiya@samsung.com> Auto-Submit: Shahbaz Youssefi <syoussefi@chromium.org> Commit-Queue: Cody Northrop <cnorthrop@google.com> Reviewed-by: Cody Northrop <cnorthrop@google.com>
Shahbaz Youssefi dae3c851 2025-03-14T11:44:53 Vulkan: Bake non-shader state into linked pipeline When using VK_EXT_graphics_pipeline_library, previously ANGLE would create three pipelines libraries: * The Shaders library was created based on the GL program's shaders + a few static states. This typically hit the program's own pipeline's cache that was warmed up during link. * The VertexInput and FragmentOutput libraries were created at draw time, which used the global pipeline cache At draw time, immediately after creating the non-Shaders libraries, the three libraries were linked into the final pipeline to be used by the draw call. This caused an inefficiency; because the non-Shaders libraries were created independently from the Shaders library, they had to be compiled pessimistically, for example because they could not be optimized to take into account the precision of the fragment shader's outputs or whether any value is const (typically alpha being set to one). Given the creation of VertexInput and FragmentOutput libraries is typically quite fast (the former being no-op and dynamic state anyway), this change removes the need for creating those libraries, and directly specifies the vertex input and fragment output state when creating the final pipeline out of the Shaders library. In this way, the same fragment output state can be tailored to the exact shaders it is being used with and incur a smaller overhead. In this change, the linked pipeline is cached in the GL program's pipeline cache, which is never synced to the blob cache as producing it is assumed to be fast already. Bug: angleproject:42265839 Change-Id: I8496ea37771555522bdc9de94043a1b56fa5967e Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/6354205 Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org> Reviewed-by: mohan maiya <m.maiya@samsung.com> Reviewed-by: Charlie Lao <cclao@google.com>
Mohan Maiya 0331847e 2025-03-07T16:32:41 Vulkan: Update VkGraphicsPipelineCreateInfo::flags ... with protected access bits if VK_EXT_pipeline_protected_access is supported Bug: angleproject:42265839 Bug: angleproject:391002353 Change-Id: Ibb00a4a0dcb1084046403bf4bfaeeb8d125b9aea Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/6336515 Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org>
Charlie Lao 9df57ec5 2025-02-28T14:35:01 Vulkan: Limit max vector size of mEmptySlotBits to 4 Add another safety guard: in case of some uncommon usage case that we end up with one buffer/texture is part of many descriptorSets, skip the tracking logic in SharedCacheKeyManager to avoid the excessive overhead associated with it. The only downside is that when BufferBlock gets destroyed, we will not able to immediately destroy all cached descriptorSets that it is part of. They will still gets evicted later on if needed (see evictStaleDescriptorSets for detail). Based on 300+ app traces we have, this appears very rare situation. Also made this behavior limited to DescriptorSetCacheManager, so that FramebufferCacheManager will not get affected. FramebufferCacheManager does not have any cache eviction, so it is important that we always destroy cache when texture is destroyed. Bug: b/293297177 Bug: b/384839847 Change-Id: I0f1eb21b014f83675b14fb59ab59b5c694a421e9 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/6314161 Reviewed-by: Yuxin Hu <yuxinhu@google.com> Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Commit-Queue: Charlie Lao <cclao@google.com>
Charlie Lao e6d28755 2025-02-27T11:09:10 Vulkan: Use VkImageLayout in DescriptorInfoDesc DescriptorInfoDesc is part of the cache key for descriptorSet cache. Right now it uses ImageLayout for DescriptorInfoDesc::imageLayoutOrRange. There are cases where two ImageLayout have the exact same VkImageLayout, which end up with cache miss. Switch to use VkImageLayout will make it cache hit. Given that this field is uint32_t, we are not really getting any benefit by using ImageLayout. Bug: b/384839847 Change-Id: I14060c3faab701b76a554a1e3a07aff44e25d7cd Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/6310838 Reviewed-by: Yuxin Hu <yuxinhu@google.com> Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Commit-Queue: Charlie Lao <cclao@google.com>
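The cache-miss problem described in e6d28755 can be illustrated with a minimal sketch. The `Access` and `Layout` enums, `ToLayout`, and `KeyedCache` below are hypothetical stand-ins invented for this example; they are not ANGLE's ImageLayout, VkImageLayout, or DescriptorInfoDesc. The point is only that when two distinct access kinds map to the same underlying layout, keying the cache on the mapped layout turns the second lookup into a hit.

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_set>

// Hypothetical access kinds; two of them share one underlying layout.
enum class Access : uint32_t
{
    FragmentShaderRead,
    ComputeShaderRead,
    ColorWrite
};

// Hypothetical stand-in for the driver-level layout values.
enum class Layout : uint32_t
{
    ShaderReadOnly,
    ColorAttachment
};

// Maps an access kind to the layout it requires (illustrative mapping only).
constexpr Layout ToLayout(Access access)
{
    return access == Access::ColorWrite ? Layout::ColorAttachment : Layout::ShaderReadOnly;
}

// Toy cache keyed on a single 32-bit field.
class KeyedCache
{
  public:
    // Returns true on a cache hit; on a miss the key is inserted.
    bool lookupOrInsert(uint32_t keyField) { return !mEntries.insert(keyField).second; }

  private:
    std::unordered_set<uint32_t> mEntries;
};
```

Keyed on `Access`, a fragment-shader read followed by a compute-shader read produces two misses; keyed on `ToLayout(access)`, the second lookup hits because both map to the same value.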
Charlie Lao 00f5944d 2025-02-26T15:42:14 Vulkan: Avoid duplicate sharedCacheKey in mDescriptorSetCacheManager There are usage cases that same buffer/texture bound to multiple binding points. When we have a cache miss, we end up walking through all binding points and record the sharedCacheKey there (so that when the buffer/texture is destroyed, the cache will be destroyed). This causes same cacheKey added to the same buffer/texture multiple times. This CL keeps track of last added sharedCacheKey and do a quick check against it and it matches, we just early return. With this CL, SharedCacheKeyManager::mEmptySlotBits max vector size reduced from ~200 to ~70 for batman_telltale. Bug: b/384839847 Change-Id: I0d405c18b3f1c807da4c7a402392667630bd7f1f Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/6306687 Reviewed-by: Yuxin Hu <yuxinhu@google.com> Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Commit-Queue: Charlie Lao <cclao@google.com>
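The "remember the last added key and return early" check from 00f5944d can be sketched roughly as follows. `SharedKey` and `KeyTracker` are hypothetical names for this illustration and do not reflect the actual SharedCacheKeyManager interface; the sketch only shows the quick comparison against the most recently added key.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical stand-in for a shared cache key.
using SharedKey = std::shared_ptr<int>;

class KeyTracker
{
  public:
    // Records the key unless it is the same key that was added last.
    // Returns true if the key was recorded, false if it was skipped.
    bool addKey(const SharedKey &key)
    {
        if (mLastAdded && key == mLastAdded)
        {
            return false;  // Same key bound to another binding point: early return.
        }
        mKeys.push_back(key);
        mLastAdded = key;
        return true;
    }

    size_t size() const { return mKeys.size(); }

  private:
    std::vector<SharedKey> mKeys;
    SharedKey mLastAdded;
};
```

Because only the last key is compared, this catches consecutive repeats cheaply but does not remove duplicates that are separated by a different key.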
Charlie Lao 9202e05a 2025-02-21T18:42:42 Vulkan: Invalidate the SharedCacheKey when cache gets evicted When descriptorSet cache gets evicted, right now we have a bug that the sharedCacheKey does not gets invalidated. This caused SharedCacheKeyManager always think the sharedCacheKey is valid and the mEmptySlotBits never gets cleared, which leads to mEmptySlotBits growth over time, and increases CPU overhead when walking mEmptySlotBits vector. This CL adds an assertion to ensure that all valid sharedCacheKeys has a corresponding entry in the cache, which means without this CL, some traces and dEQP tests are hitting the assertion. This CL also fixes the bug. Bug: b/384839847 Change-Id: If013443144aceb5d62f67f619074ef831e73653b Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/6292988 Reviewed-by: Yuxin Hu <yuxinhu@google.com> Commit-Queue: Charlie Lao <cclao@google.com> Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org>
Charlie Lao c96844d9 2025-02-14T10:26:25 Vulkan: More vector storage fix BufferPool::mBufferBlocks reserves 32 entry storage based on data gathered from trace. BufferPool::mEmptyBufferBlocks switched to queue since we almost never walk the entire list unless it gets destroyed. Renderer::CollectGarbage is changed to take only one object. The only time it get called with more than one object is from ImageHelper::releaseImage(), which in this CL we now creates and pass GarbageObjects to Renderer::collectGarbage directly. This also allows me to delete recursive CollectGarbage() and DestroyGarbage() functions (which is doing emplace_back quite often, even though only two entries). PipelineHelper::mTransitions is updated to reserve storage for 8 entries based on trace data. Bug: b/293297177 Change-Id: I3e4552939a780dd26f9b7b8a67deee0d52d4f9bc Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/6270518 Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Commit-Queue: Charlie Lao <cclao@google.com> Reviewed-by: Yuxin Hu <yuxinhu@google.com>
Charlie Lao 2ba1f129 2025-02-11T15:27:16 Vulkan: Avoid storage reallocation in UpdateDescriptorSetsBuilder UpdateDescriptorSetsBuilder::mDescriptorBufferInfos and mDescriptorImageInfos keep growing to a few hundred entries, and that growth ends up copying data and patching mWriteDescriptorSets. There is no requirement that the entire vector of mDescriptorBufferInfos and mDescriptorImageInfos be contiguous. The only requirement is that when allocDescriptorBufferInfos(count) is called, the count entries must be contiguous. This CL uses a queue of vectors so that when we need to allocate new storage we just add another vector and allocate out of the new vector. This avoids all related data copies. A similar thing applies to mWriteDescriptorSets. The only thing I added for mWriteDescriptorSets is that I try to grow the first vector big enough to hold all of the entries for the next submission to minimize the number of vkUpdateDescriptorSets calls. Bug: b/293297177 Change-Id: Ief417ace8c8f7b477a1962505e9487bf31bae2ac Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/6253675 Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Reviewed-by: Amirali Abdolrashidi <abdolrashidi@google.com> Commit-Queue: Charlie Lao <cclao@google.com>
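The queue-of-vectors idea in 2ba1f129 can be sketched as below. `ChunkedStorage` and its members are hypothetical names for this illustration, not the actual UpdateDescriptorSetsBuilder code. The sketch assumes only what the message states: each allocation of `count` entries must be contiguous, while separate allocations need not be adjacent, so a new vector can be appended instead of growing (and copying) an existing one.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <deque>
#include <vector>

template <typename T>
class ChunkedStorage
{
  public:
    explicit ChunkedStorage(size_t chunkCapacity) : mChunkCapacity(chunkCapacity) {}

    // Returns a pointer to `count` contiguous, value-initialized entries.
    // Pointers returned earlier stay valid because a chunk is never grown
    // beyond its reserved capacity.
    T *allocate(size_t count)
    {
        if (mChunks.empty() || mChunks.back().size() + count > mChunks.back().capacity())
        {
            // Not enough room left: start a new chunk instead of reallocating.
            mChunks.emplace_back();
            mChunks.back().reserve(std::max(mChunkCapacity, count));
        }
        std::vector<T> &chunk = mChunks.back();
        const size_t oldSize  = chunk.size();
        chunk.resize(oldSize + count);  // Stays within capacity, so no reallocation.
        return chunk.data() + oldSize;
    }

    size_t chunkCount() const { return mChunks.size(); }

  private:
    size_t mChunkCapacity;
    std::deque<std::vector<T>> mChunks;
};
```

A `std::deque` is used for the outer container in this sketch because appending to it does not move the existing inner vectors, which keeps previously returned pointers stable.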
Austin Annestrand b4cac1ad 2025-01-29T10:59:22 CL/VK: Hotfix: Implementation of Compute Pipeline Cache Unnecessary "new" that was leaking when moving raw ptr to cache map. Easy fix is to stack allocate and std::move the object to container when finished initializing. Bug: angleproject:391672281 Change-Id: I7b0f922de2a1332e8e452e87bc498d3c9907d7d8 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/6214690 Reviewed-by: Charlie Lao <cclao@google.com> Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Reviewed-by: Yuly Novikov <ynovikov@chromium.org> Commit-Queue: Yuly Novikov <ynovikov@chromium.org>
Austin Annestrand 95635ef0 2025-01-23T16:30:41 CL/VK: Implementation of Compute Pipeline Cache. Implemented ComputePipelineCache, hash map from OpenCL and OpenGL compute state vectors to compiled pipelines. Implemented ComputePipelineDesc, a tightly packed description of the current compute state. Compute Pipeline State includes the specialization constants, Pipeline Options (Protected, Robust). Updated-by: Austin Annestrand <a.annestrand@samsung.com> Bug: angleproject:391672281 Change-Id: I88944dc169d194d1b2c75747769d7346b041fa75 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/6191437 Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Reviewed-by: Charlie Lao <cclao@google.com> Commit-Queue: Austin Annestrand <a.annestrand@samsung.com>
Shahbaz Youssefi 0f75fc3d 2025-01-20T14:10:41 Vulkan: Transition foreign images to the FOREIGN queue on submit Vulkan's interaction with AHB and dmabuf images is through the FOREIGN queue family. When ANGLE uses these images, it must take ownership of the images by doing a queue family ownership transfer (QFOT) away from the FOREIGN queue family and into the graphics queue family used by the Vulkan backend. Prior to this change, ANGLE would do the QFOT away from FOREIGN once such a foreign image is imported into an EGL image. Afterwards, usage in ANGLE works correctly. What ANGLE did not handle is when a foreign entity wants to use these images _after_ ANGLE has used them. For the above to work correctly, ANGLE must do a QFOT back into FOREIGN before the image can be used by the foreign entity. Unfortunately, EGL does not provide a clear point for this hand-off to happen. ANGLE has no choice then to proactively transition the images back into FOREIGN at some point "just in case". For some native drivers, this hand-off to FOREIGN can be quite frequent. For example, on Android for most vendors there is no actual layout transition between graphics and FOREIGN queue families (the actual data layout is the same), so a cache flush/invalidate at strategic points (such as the end of the command buffer) is sufficient as equivalent to transition to FOREIGN (and another at the beginning of the command buffer as equivalent to transition from FOREIGN). As a layer over Vulkan's formalism, ANGLE is less lucky; it has to enumerate exactly which image is being transitioned to and away from FOREIGN. Transitions away from FOREIGN are in principle easy. As long as the image is marked as being in the FOREIGN queue family, it will automatically transition to the graphics queue family on first use. In this change, when a foreign image is transitioned out of the FOREIGN queue, it's added to a list of images to be transitioned back to FOREIGN at submit time. 
Once submission is done, the image may or may not actually be used by a foreign entity, but ANGLE cannot know that. The next time the image is used in ANGLE, it is transitioned out of FOREIGN. Verifying correctness with multi-threading is tricky, and relies on GL's requirement that access in one context is followed by a synchronization and rebind in another context before it can be used there. This means that the image's transition to FOREIGN (at the end of one submission) naturally happens before the transition back from FOREIGN (at the beginning of the next submission). Because the set of images to transition is tracked in the context, submissions in other contexts don't interfere with the above logic. The situation can be more complicated with one-off submissions, but fortunately, no such usage of foreign images is present. Another wrinkle is simultaneous usage of the image as read-only in two contexts. According to GL, this is not a hazard and requires no synchronization. However this is broken in ANGLE even for non-foreign images (see http://anglebug.com/42266349), because as what _seems_ like read-only usage of the image from GL's point of view (like sampling from the image), there are associated write operations from Vulkan's point of view (image layout transitions and QFOT). This change does not attempt to address this corner case. Bug: angleproject:42263241 Bug: angleproject:42262454 Bug: angleproject:390443243 Bug: chromium:382527242 Change-Id: Idd4ef1fecfa3fccf1a4063f1bddb08d28b85386b Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/6184604 Reviewed-by: Charlie Lao <cclao@google.com> Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org>
Charlie Lao fbd230f5 2025-01-23T12:59:06 Vulkan: Split ErrorContext into ErrorContext and Context ErrorContext continue to be context for error handling. vk::Context is added to serve as common base class for ContextVk and CLContextVk. Bug: angleproject:390443243 Change-Id: Ifac0b1d2d714ce610693ce60a35459c6c9cddf1a Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/6191438 Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org>
Shahbaz Youssefi c1214ec2 2025-01-22T14:22:56 Vulkan: Rename Context to ErrorContext In preparation for adding another Context (derived by GL and CL contexts), which includes logic that pertains to command recording (such as barrier tracking). Bug: angleproject:390443243 Change-Id: Idf495b62e63fb9aa901a2f16447fdaf3c2acd90b Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/6191248 Auto-Submit: Shahbaz Youssefi <syoussefi@chromium.org> Commit-Queue: Charlie Lao <cclao@google.com> Reviewed-by: Charlie Lao <cclao@google.com> Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org>
Charlie Lao 1d25be59 2025-01-09T11:09:22 Vulkan: Pass Context instead of Renderer to BufferHelper APIs This is preparation CL for later CL. In later CL we need to access context argument in BufferHelper's barrier related functions. release() also preferred to have context argument so that the events can be recycled within share group. Because of this, a lot of functions has to propagate back to pass context as argument instead of renderer. Bug: angleproject:360274928 Change-Id: I13e930666eeeefbcff7b542d0e3126f3b07ce286 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/6164686 Reviewed-by: Amirali Abdolrashidi <abdolrashidi@google.com> Commit-Queue: Charlie Lao <cclao@google.com> Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org>
Tingwei Guo aa263d13 2024-12-19T17:48:13 Increase GL_MAX_ARRAY_TEXTURE_LAYERS to 4096 and end2end test Increase GL_MAX_ARRAY_TEXTURE_LAYERS from 2048 to 4096, and add an end2end test to test whether the increased GL_MAX_ARRAY_TEXTURE_LAYERS meets the memory limit. Bug: angleproject:385040554 Change-Id: Ibb1ebcb2414c530dd838b3414dc82b14ce017bc4 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/6108301 Reviewed-by: Charlie Lao <cclao@google.com> Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org> Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org>
Roman Lavrov 87d891dc 2024-12-13T19:51:42 Inline more tiny functions on hot path Similar to https://crrev.com/c/6094283, not as hot but broader. Highlighted by PGO profile of driver_overhead_2 trace combined with size and offset of the corresponding .so sections to maximize reduction in TLB misses. Improves driver_overhead_2 performance by 1~2% on Pixel 8. Almost no change in .so size as functions are tiny. Bug: b/383305597 Change-Id: Ib1c021d4635141b879667b59305e4d45de7b8aef Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/6088958 Reviewed-by: Amirali Abdolrashidi <abdolrashidi@google.com> Commit-Queue: Roman Lavrov <romanl@google.com>
Charlie Lao d81834b6 2024-11-26T15:25:35 Vulkan: Store VkDevice in vk::SharedPtr So that we don't need to have two versions of destroy() APIs. In previous CLs I had to add another version of destroy() that does not take device argument due to SharedPtr may calls destroy when last reference count goes away. Because we do not have device information at that time, destroy() API was added but mostly just doing assertion that Vulkan object has been explicitly destroyed. With this CL, we now stores device in the SharedPtr so that we no longer need two destroy() APIs. The explicit destroy(device) call will be removed in the next CL. Bug: angleproject:372268711 Change-Id: Idcacbc3a922e17ac3d0f6056466b8f3aa084b02e Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/6052096 Commit-Queue: Charlie Lao <cclao@google.com> Reviewed-by: Yuxin Hu <yuxinhu@google.com> Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org>
Charlie Lao 6c1021ec 2024-11-22T16:48:45 Vulkan: Switch DescriptorSetLayout to use AtomicSharedPtr SharedPtr has better semantics and safer to use. This CL removes direct exposure of RefCounted object and also allows me to delete BindingPointer class in later CL. Bug: angleproject:372268711 Change-Id: I08a0dff3efcf794be843a4a548b9f2609bb9a5e1 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/6044328 Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org> Reviewed-by: Yuxin Hu <yuxinhu@google.com> Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org>
Charlie Lao 2dc072ec 2024-11-22T16:14:52 Vulkan: Switch PipelineLayout from AtomicBind* to AtomicSharedPtr AtomicSharedPtr/SharedPtr has better semantics and safer to use. This will allow deleting BindingPointer in later CL. Bug: angleproject:372268711 Change-Id: Ife20f68b2277a1913b06be0de153770214ac964a Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/6044326 Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Reviewed-by: Yuxin Hu <yuxinhu@google.com> Commit-Queue: Charlie Lao <cclao@google.com>
Charlie Lao 21d747de 2024-11-20T11:54:15 Vulkan: Use vk::SharedPtr for SharedDescriptorSetCacheKey This CL switches SharedDescriptorSetCacheKey from using c++ std::shared_ptr to our internal version of vk::SharedPtr. Also get rid of an extra pointer indirection that SharedDescriptorSetCacheKey is a reference counted of actual cache key instead of std::unique_ptr of cache key. Bug: angleproject:372268711 Change-Id: Id9af5070d24f67711d6decc3a30a260b8d4062d9 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/6036302 Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Commit-Queue: Charlie Lao <cclao@google.com> Reviewed-by: Yuxin Hu <yuxinhu@google.com>
Charlie Lao 8e0178fb 2024-11-19T13:31:45 Vulkan: Switch SamplerBinding to Use SharedPtr Another step to remove vk::BindingPointer. SharedPtr is used and SamplerBinding is renamed to SharedSamplerPtr. This also removed RefCountedSampler to avoid direct expose of RefCounted<SamplerHelper> which is risky due to ability of change reference count directly. Bug: angleproject:372268711 Change-Id: Ia6f352186a4f75ab9ce3396f298e33f70cd61a1b Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/6036294 Reviewed-by: Yuxin Hu <yuxinhu@google.com> Commit-Queue: Charlie Lao <cclao@google.com> Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org>
Charlie Lao ce53aff0 2024-11-05T16:57:57 Vulkan: Add per descriptorSet LRU cache eviction Before this CL, the descriptor set cache eviction is at the pool level. Either the entire pool is deleted or not. It is also not LRU based. This CL adds a per descriptor set cache eviction and reuse evicted descriptorSet before allocating a new pool. This eviction is LRU based so that it is more precise. The mCurrentFrameCount is passed into various API so that it can make eviction decision based on the frame number. In this CL, anything not been used in last 10 frames will be evicted and recycled before allocate a new pool. Since eviction is based on individual descriptor set, not by pool, ProgramExecutableVk no longer needs to track the DescriptorSetPool object. mDescriptorPools has been removed from ProgramExecutableVk class. As measured by crrev.com/c/5425496/133 This LRU linked list maintenance does not add any measurable time difference, but reduces total descriptorSet pool count by one third (from 75 down to 48). running test name: "TracePerf", backend: "_vulkan", story: "batman_telltale" Before this CL: cacheMissCount: 200, averageTime:23998 ns cacheHitCount: 1075445, averageTime:626 ns descriptorSetEvicted: 0, descriptorSetPoolCount:75 Average frame time 3.9262 ms After this CL: cacheMissCount: 200, averageTime:23207 ns cacheHitCount: 1025415, averageTime:602 ns descriptorSetEvicted: 102708, descriptorSetPoolCount:48 Average frame time 3.9074 ms BYPASS_LARGE_CHANGE_WARNING Bug: angleproject:372268711 Change-Id: I84daaf46f4557cbbfdb94c10c5386001105f5046 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5985112 Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Reviewed-by: Yuxin Hu <yuxinhu@google.com> Commit-Queue: Charlie Lao <cclao@google.com>
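The per-descriptor-set LRU eviction described in ce53aff0 can be pictured with the minimal sketch below. It is an illustration only, not ANGLE's implementation: `FakeDescriptorSet`, `LruDescriptorSetList`, `markUsed`, and `tryRecycle` are hypothetical names, and the exact comparison against the 10-frame threshold is an assumption. The idea is that sets are kept in least-recently-used order, and the oldest one is recycled only if it has been idle for more than the threshold, before any new pool would be allocated.

```cpp
#include <cstddef>
#include <cstdint>
#include <list>
#include <optional>

// Hypothetical stand-in for a cached descriptor set; not an ANGLE type.
struct FakeDescriptorSet
{
    int id;
    uint64_t lastUsedFrame;
};

class LruDescriptorSetList
{
  public:
    // Assumed threshold, taken from the "last 10 frames" wording of the commit.
    static constexpr uint64_t kEvictionFrameThreshold = 10;

    // Record a use in the given frame and move the set to the back (most recent).
    void markUsed(int id, uint64_t frame)
    {
        for (auto it = mSets.begin(); it != mSets.end(); ++it)
        {
            if (it->id == id)
            {
                it->lastUsedFrame = frame;
                mSets.splice(mSets.end(), mSets, it);
                return;
            }
        }
        mSets.push_back({id, frame});
    }

    // Return the least recently used set if it has been idle for more than the
    // threshold; otherwise return nothing, and the caller would allocate instead.
    std::optional<int> tryRecycle(uint64_t currentFrame)
    {
        if (mSets.empty())
        {
            return std::nullopt;
        }
        const FakeDescriptorSet &oldest = mSets.front();
        if (currentFrame < oldest.lastUsedFrame ||
            currentFrame - oldest.lastUsedFrame <= kEvictionFrameThreshold)
        {
            return std::nullopt;
        }
        int id = oldest.id;
        mSets.pop_front();
        return id;
    }

    size_t size() const { return mSets.size(); }

  private:
    std::list<FakeDescriptorSet> mSets;  // front = least recently used
};
```

Because the list is ordered by recency, only the front element needs to be checked on each recycle attempt. The linear search in `markUsed` is a simplification for the sketch.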
Charlie Lao 2b8d6bbe 2024-11-01T11:23:35 Vulkan: Use UpdateFullTexturesDescriptorSet when cache missed DescriptorSetDescBuilder::updateDescriptorSet() relies on the cache key to build descriptorSet. UpdateFullTexturesDescriptorSet() builds descriptorSet directly from state, it does not use cache key. Test shows UpdateFullTexturesDescriptorSet is much faster than updateActiveTexturesForCacheMiss and updateDescriptorSet pair. This CL removes updateActiveTexturesForCacheMiss() function and uses UpdateFullTexturesDescriptorSet for cache miss case. The timing code is added around the cache miss functions to measure the time. Old: asphalt_9 average 7,554 nanosec gl_driver2_off: 20,354 nanosec batman_telltale: 12,992 nanosec New: asphalt_9 average 916 nanosec gl_driver2_off: 1,839 nanosec batman_telltale: 3,437 nanosec Bug: angleproject:372268711 Change-Id: I176d67ed732c3fe3a18a079df7c4973aa926087a Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5984893 Reviewed-by: Yuxin Hu <yuxinhu@google.com> Commit-Queue: Charlie Lao <cclao@google.com> Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org>
Charlie Lao fbe34df7 2024-10-29T16:19:29 Vulkan: More texture descriptorSet code cleanup Removed unused argument `pipelineType` from updateActiveTexturesForCacheMiss(). Removed unused argument `context` from getReadImageView() Rename getBufferViewAndRecordUse() to getBufferView() since there is no "record use" happening. Moved UpdateFullActiveTexturesDescriptorSet() function from vk_cache_utils.cpp to ProgramExecutableVk.cpp anonymous name space, since it is only used in this file. Bug: angleproject:372268711 Change-Id: Ib7240c1063f727fb52588234e79fba349f9aff9e Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5977481 Reviewed-by: Yuxin Hu <yuxinhu@google.com> Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Commit-Queue: Charlie Lao <cclao@google.com>
Rafay Khurram a21b7ad0 2024-04-24T02:11:42 CL/Vulkan: Add skeleton for CLSamplerVk * It is setup to be a wrapper for the SamplerHelper interface Bug: angleproject:42266936 Change-Id: Iac7e80c4d5262687d98a8188a60a24a9be190dc2 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5801184 Reviewed-by: Geoff Lang <geofflang@chromium.org> Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org>
Charlie Lao 5c26ffea 2024-10-29T11:11:38 Vulkan: Optimize descriptorSet cache disable code path Right now the code first computes the cache key and then uses the cache key to look in the cache. If the cache misses, it builds the descriptorSet out of the cache key. This might make sense if the cache is enabled. If the cache is disabled, there is no need to go through the middle man. This CL skips the cache key build-up entirely and directly builds the descriptorSet out of context state. In this CL, updateFullActiveTextures() and updateDescriptorSet() are merged into one function UpdateFullActiveTexturesDescriptorSet() which updates VkWriteDescriptorSet directly. Bug: angleproject:372268711 Change-Id: I7ba0c60a23b967d1ac903020d04022405c29e354 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5972508 Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Reviewed-by: Yuxin Hu <yuxinhu@google.com> Commit-Queue: Charlie Lao <cclao@google.com>
Shahbaz Youssefi 08c1724f 2024-10-11T14:29:00 Vulkan: Support GL_ARM_shader_framebuffer_fetch_depth_stencil Bug: angleproject:352364582 Change-Id: I63fd78314fa7ebccbf366c252e309a9c0f09c8c1 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5938150 Reviewed-by: Charlie Lao <cclao@google.com> Reviewed-by: Yuxin Hu <yuxinhu@google.com> Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org>
Charlie Lao 9a4c7495 2024-10-15T13:05:28 Vulkan: Add feature flag to enable descriptorSet cache So that we can disable it to compare the performance difference. Bug: angleproject:372268711 Change-Id: I02da254e5d58815741080634a2dd005617aa7432 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5936135 Commit-Queue: Charlie Lao <cclao@google.com> Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Reviewed-by: Yuxin Hu <yuxinhu@google.com>
Charlie Lao 31c80bbf 2024-10-17T10:56:16 Vulkan: Avoid redundant work in updateFullActiveTextures ContextVk keeps mActiveTexturesDesc, which gets updated by UpdatePreCacheActiveTextures(). This is only used for cache lookup. When there is a cache miss, we end up call updateFullActiveTextures() which recomputes DescriptorSetDesc again, which is redundant work. This CL removes mActiveTexturesDesc from ContextVk. UpdatePreCacheActiveTextures has been changed to be a DescriptorSetDescBuilder method so that it can directly update the mDesc. updateFullActiveTextures has been renamed to updateActiveTexturesForCacheMiss which avoid mDesc calculation. updateFullActiveTextures is still kept for now which will be used in next CL when cache is disabled. Bug: b/372268711 Change-Id: Ic9a0cdaa7cefca5f72b599d26d079cef14888f07 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5905766 Reviewed-by: Yuxin Hu <yuxinhu@google.com> Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Commit-Queue: Charlie Lao <cclao@google.com>
Charlie Lao dd54eeec 2024-10-11T13:26:46 Reland "Vulkan: Track GPU progress for individual DescriptorSet" This is a reland of commit 292102944add2ab30f4aa12a971cac456cc7726b with the fix of garbage being added back to garbage list. Original change's description: > Vulkan: Track GPU progress for individual DescriptorSet > > Right now ProgramExecutableVk keeps VkDescriptorSet object, and > DescriptorSetHelper is created when a cache entry becomes invalid. > Further, DescriptorSetCache keeps the cache of {VkDescriptorSet, > RefCountedDescriptorPoolHelper} pair. So we are having three different > type of objects at different stages of life: VkDescriptorSet, > DescriptorSetHelper, and {VkDescriptorSet, > RefCountedDescriptorPoolHelper. This CL makes DescriptorSetHelper at > creation and at cache and at garbage. With this change, you have a > reference counted DescriptorSetHelper object (i.e, DescriptorSetPointer) > during entire life cycle and is passed around between cache and program > as is. This CL is preparation for the future CL where we may disable > cache for descriptorSet. The descriptorSet will be added to garbage list > and reused constantly without go through the cache code. We need to > track the individual descriptorSet with ResourceUse so that it won't > reuse until GPU is finished. This CL is making DescriptorSetHelper a GPU > tracking object so that it will still just work when cache is disabled. > > Bug: angleproject:372268711 > Change-Id: I1cfb77cc5069b202d870388fd8809e265cdca90b > Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5918586 > Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> > Commit-Queue: Charlie Lao <cclao@google.com> > Reviewed-by: Yuxin Hu <yuxinhu@google.com> Bug: angleproject:372268711 Change-Id: Ic920f99cc78cde1e94690bdbee3b885844fa155b Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5954701 Commit-Queue: Charlie Lao <cclao@google.com> Reviewed-by: Yuxin Hu <yuxinhu@google.com>
Charlie Lao 45cc47af 2024-10-22T21:41:22 Revert "Vulkan: Track GPU progress for individual DescriptorSet" This reverts commit 292102944add2ab30f4aa12a971cac456cc7726b. Reason for revert: Causing bot failure in later CLs Original change's description: > Vulkan: Track GPU progress for individual DescriptorSet > > Right now ProgramExecutableVk keeps VkDescriptorSet object, and > DescriptorSetHelper is created when a cache entry becomes invalid. > Further, DescriptorSetCache keeps the cache of {VkDescriptorSet, > RefCountedDescriptorPoolHelper} pair. So we are having three different > type of objects at different stages of life: VkDescriptorSet, > DescriptorSetHelper, and {VkDescriptorSet, > RefCountedDescriptorPoolHelper. This CL makes DescriptorSetHelper at > creation and at cache and at garbage. With this change, you have a > reference counted DescriptorSetHelper object (i.e, DescriptorSetPointer) > during entire life cycle and is passed around between cache and program > as is. This CL is preparation for the future CL where we may disable > cache for descriptorSet. The descriptorSet will be added to garbage list > and reused constantly without go through the cache code. We need to > track the individual descriptorSet with ResourceUse so that it won't > reuse until GPU is finished. This CL is making DescriptorSetHelper a GPU > tracking object so that it will still just work when cache is disabled. > > Bug: angleproject:372268711 > Change-Id: I1cfb77cc5069b202d870388fd8809e265cdca90b > Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5918586 > Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> > Commit-Queue: Charlie Lao <cclao@google.com> > Reviewed-by: Yuxin Hu <yuxinhu@google.com> Bug: angleproject:372268711 Change-Id: I4d3c34058d100112a098144276b52c0faf8d593a No-Presubmit: true No-Tree-Checks: true No-Try: true Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5955529 Auto-Submit: Charlie Lao <cclao@google.com> Commit-Queue: Rubber Stamper <rubber-stamper@appspot.gserviceaccount.com> Bot-Commit: Rubber Stamper <rubber-stamper@appspot.gserviceaccount.com>
Charlie Lao 29210294 2024-10-11T13:26:46 Vulkan: Track GPU progress for individual DescriptorSet Right now ProgramExecutableVk keeps VkDescriptorSet object, and DescriptorSetHelper is created when a cache entry becomes invalid. Further, DescriptorSetCache keeps the cache of {VkDescriptorSet, RefCountedDescriptorPoolHelper} pair. So we are having three different type of objects at different stages of life: VkDescriptorSet, DescriptorSetHelper, and {VkDescriptorSet, RefCountedDescriptorPoolHelper. This CL makes DescriptorSetHelper at creation and at cache and at garbage. With this change, you have a reference counted DescriptorSetHelper object (i.e, DescriptorSetPointer) during entire life cycle and is passed around between cache and program as is. This CL is preparation for the future CL where we may disable cache for descriptorSet. The descriptorSet will be added to garbage list and reused constantly without go through the cache code. We need to track the individual descriptorSet with ResourceUse so that it won't reuse until GPU is finished. This CL is making DescriptorSetHelper a GPU tracking object so that it will still just work when cache is disabled. Bug: angleproject:372268711 Change-Id: I1cfb77cc5069b202d870388fd8809e265cdca90b Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5918586 Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Commit-Queue: Charlie Lao <cclao@google.com> Reviewed-by: Yuxin Hu <yuxinhu@google.com>
Shahbaz Youssefi a1584f49 2024-10-11T21:17:32 Vulkan: Qualify framebuffer fetch with "Color" In preparation for depth/stencil framebuffer fetch, many framebuffer fetch symbols are affixed with Color to indicate that they pertain to color framebuffer fetch logic. Bug: angleproject:352364582 Change-Id: I86000ada5e6ef47387dec0b6a3fca589d816cdc2 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5926593 Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org> Reviewed-by: Charlie Lao <cclao@google.com> Reviewed-by: Yuxin Hu <yuxinhu@google.com>
Shahbaz Youssefi 1608d0be 2024-10-10T16:53:15 Vulkan: Isolate framebuffer fetch no-RP-break optim from DR Prior to [1], changes to framebuffer fetch usage by shaders caused a render pass break. This was due to a limitation of render pass compatibility rules. It also caused other headache, such as needing to clear the render pass cache, recreating pipelines etc. [1]:https://chromium-review.googlesource.com/c/angle/angle/+/3697308 In [1] an important optimization was implemented for tiling GPUs where ANGLE permanently switched to framebuffer fetch mode on first encountering framebuffer fetch use. From that point on, ANGLE would always make every render pass framebuffer fetch compatible. In reality, the render pass break was unnecessary, which became apparent with dynamic rendering (for example that whether the render pass includes input attachments has no bearing on a pipeline that doesn't use input attachments at all). In [2], dynamic rendering kept the render pass break + permanent switch behavior for simplicity. [2]:https://chromium-review.googlesource.com/c/angle/angle/+/5637155 This change untangles the optimization done for legacy render passes from dynamic rendering, allowing dynamic rendering to start every render pass without framebuffer fetch and enable it later if a framebuffer fetch program is used. This is in preparation for supporting depth/stencil framebuffer fetch, where a perma-switch is troublesome (for example in combination with read-only depth/stencil feedback loops). Bug: angleproject:352364582 Change-Id: I31221cf22a28d58b9b2bf188e9c0b786cd0fe3d2 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5923120 Reviewed-by: Yuxin Hu <yuxinhu@google.com> Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org> Reviewed-by: Charlie Lao <cclao@google.com>
Mohan Maiya b3d85cce 2024-09-30T14:28:35 Vulkan: Consolidate write colorspace override states ColorspaceState struct is now used to cache write colorspace related states to determine the colorspace of Vulkan draw image views. ImageViewHelper methods are called during initialization and when colorspace related states are toggled dynamically which in turn process these states and determine the final write colorspace. We can now fully support rendering to EGLImages, with colorspace overrides, via texture or renderbuffer EGLImage targets Bug: angleproject:40644776 Tests: ImageTest*Colorspace*Vulkan MultithreadingTestES3.SharedSrgbTextureMultipleContexts*Vulkan ReadPixelsPBOTest.SrgbUnorm*Vulkan Change-Id: I2be2cd3b5b2b4ac8ecb803c34cde2b846cbd1cbe Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5901256 Reviewed-by: Charlie Lao <cclao@google.com> Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Commit-Queue: mohan maiya <m.maiya@samsung.com>
Mohan Maiya b38cc7fa 2024-09-30T12:43:09 Vulkan: Consolidate read colorspace override states ColorspaceState struct is now used to cache read colorspace related states to determine the colorspace of Vulkan read image views. ImageViewHelper methods are called during initialization and when colorspace related states are toggled dynamically which in turn process these states and determine the final read colorspace. Bug: angleproject:40644776 Tests: ImageTest*Colorspace*Vulkan SRGBTextureTest.SRGB*TextureParameter*Vulkan SRGBTextureTestES3.SRGBDecodeTexelFetch*Vulkan Change-Id: I16b3666cd80865936b826dc0738fc9210dabeda9 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5901255 Reviewed-by: Charlie Lao <cclao@google.com> Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Commit-Queue: mohan maiya <m.maiya@samsung.com>
Charlie Lao b61f9f9e 2024-10-04T11:07:23 Vulkan: Add operator<< for descriptorSet for debugging Right now it is using streamOut() function which is hard to use with WARN(). This replaces the streamOut function with standard c++ operator<< so that we can use in WARN()/INFO() along with other logs for debugging. Bug: b/368566032 Change-Id: Iec98b4c59f360cbbfb8fbdd85d5d1150fcca8f4a Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5908773 Reviewed-by: Yuxin Hu <yuxinhu@google.com> Commit-Queue: Charlie Lao <cclao@google.com> Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org>
Charlie Lao eb4eaea9 2024-10-03T17:15:21 Vulkan: Improve SharedCacheKeyManager::addKey performance This function walks a vector of keys. When there are many keys this could be slow. Also when we have to grow the vector size, it involves memory reallocation which means copy the data from old storage to new storage. This CL changes mSharedCacheKeys to use std::deque instead of vector which solves storage reallocation problem. It also adds angle::BitSet64<64> to track all available (i.e., empty) slots in mSharedCacheKeys so that we don't have to loop most of time. You only loop all keys once to find all empty slots and then subsequent addKey() call will be O(1) until all empty slots are used. Bug: b/368566032 Change-Id: I4d32b461761f1cd64380f5527883b84357bb44c1 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5908690 Reviewed-by: Yuxin Hu <yuxinhu@google.com> Commit-Queue: Charlie Lao <cclao@google.com> Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org>
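The free-slot tracking described in eb4eaea9 can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the actual `SharedCacheKeyManager` code: `SlotTable`, `add`, `release`, and `lowestSetBit` are hypothetical names, a plain `uint64_t` stands in for `angle::BitSet64<64>`, and only the first 64 slots are tracked for reuse. A `std::deque` avoids moving existing elements when the storage grows, and the bitmask lets `add` pick an empty slot without walking all keys.

```cpp
#include <cstddef>
#include <cstdint>
#include <deque>
#include <optional>

class SlotTable
{
  public:
    // Store the key in a known-free slot if there is one, else append.
    // Returns the slot index used.
    size_t add(int key)
    {
        if (mFreeMask != 0)
        {
            size_t slot = lowestSetBit(mFreeMask);
            mFreeMask &= mFreeMask - 1;  // clear the lowest set bit
            mSlots[slot] = key;
            return slot;
        }
        mSlots.push_back(key);
        return mSlots.size() - 1;
    }

    // Empty a slot and, if it is within the tracked range, mark it reusable.
    void release(size_t slot)
    {
        mSlots[slot].reset();
        if (slot < 64)
        {
            mFreeMask |= (uint64_t{1} << slot);
        }
    }

    size_t size() const { return mSlots.size(); }

  private:
    // Precondition: mask != 0.
    static size_t lowestSetBit(uint64_t mask)
    {
        size_t index = 0;
        while ((mask & 1u) == 0)
        {
            mask >>= 1;
            ++index;
        }
        return index;
    }

    std::deque<std::optional<int>> mSlots;  // growing a deque does not relocate existing elements
    uint64_t mFreeMask = 0;                 // bit i set => slot i is empty
};
```

In this sketch, slots at index 64 and above are never reused once released. The commit instead describes rescanning all keys once to refill the set of empty slots, which is omitted here for brevity.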
Shahbaz Youssefi b16d105f 2024-10-03T10:25:32 Remove Desktop GL front-end support For Desktop GL applications, please use Zink! Bug: angleproject:370937467 Change-Id: Ie734634bb62a2e98c80e1b32d8b3d34624da3c04 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5905428 Reviewed-by: Geoff Lang <geofflang@chromium.org>
Gowtham Tammana cc44090d 2024-09-18T12:28:53 Vulkan: Add an extra descriptor set index In the case of CL, the clspv transcompiler can generate up to four descriptor set indices, so add an extra index to vk::DescriptorSetIndex. Also, add aliases for CL-specific naming. Bug: angleproject:369724757 Change-Id: I45ef8a6d9246c7863ebc6edf08479bc7c661c151 Signed-off-by: Gowtham Tammana <g.tammana@samsung.com> Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5893953 Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Reviewed-by: Charlie Lao <cclao@google.com>
Yuxin Hu eaffa034 2024-09-24T20:56:04 Revert "Vulkan: Consolidate colorspace override states" This reverts commit bffcd235ba6c031603d798daaa98f1cf9a3f3e46. Reason for revert: Breaks Android test `org.skia.skqp.SkQPRunner#UnitTest_DMSAA_dst_read`. Details: https://b.corp.google.com/issues/369388539. Original change's description: > Vulkan: Consolidate colorspace override states > > ColorspaceState struct is now used to cache colorspace related states > and used to determine the colorspace of Vulkan image views. > ImageViewHelper methods are called during initialization and when > colorspace related states are toggled dynamically which in turn process > these states and determine the final read and write colorspaces. > > We can now fully support rendering to EGLImages, with colorspace > overrides, via texture or renderbuffer EGLImage targets > > Bug: angleproject:40644776 > Tests: ImageTest*Colorspace*Vulkan > MultithreadingTestES3.SharedSrgbTextureMultipleContexts*Vulkan > SRGBTextureTest.SRGB*TextureParameter*Vulkan > SRGBTextureTestES3.SRGBDecodeTexelFetch*Vulkan > ReadPixelsPBOTest.SrgbUnorm*Vulkan > Change-Id: I1cc2b5bd834b519b83deab4d80a2fcaabeb271d6 > Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5841290 > Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> > Reviewed-by: Charlie Lao <cclao@google.com> > Commit-Queue: mohan maiya <m.maiya@samsung.com> Bug: angleproject:40644776 Change-Id: I5bf6cf2ed0c8ec22fc02d8c3da92673ee85fe002 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5888506 Reviewed-by: Yuxin Hu <yuxinhu@google.com> Bot-Commit: Rubber Stamper <rubber-stamper@appspot.gserviceaccount.com> Commit-Queue: Yuxin Hu <yuxinhu@google.com>
Mohan Maiya bffcd235 2024-09-13T14:58:00 Vulkan: Consolidate colorspace override states ColorspaceState struct is now used to cache colorspace related states and used to determine the colorspace of Vulkan image views. ImageViewHelper methods are called during initialization and when colorspace related states are toggled dynamically which in turn process these states and determine the final read and write colorspaces. We can now fully support rendering to EGLImages, with colorspace overrides, via texture or renderbuffer EGLImage targets Bug: angleproject:40644776 Tests: ImageTest*Colorspace*Vulkan MultithreadingTestES3.SharedSrgbTextureMultipleContexts*Vulkan SRGBTextureTest.SRGB*TextureParameter*Vulkan SRGBTextureTestES3.SRGBDecodeTexelFetch*Vulkan ReadPixelsPBOTest.SrgbUnorm*Vulkan Change-Id: I1cc2b5bd834b519b83deab4d80a2fcaabeb271d6 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5841290 Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Reviewed-by: Charlie Lao <cclao@google.com> Commit-Queue: mohan maiya <m.maiya@samsung.com>
Shahbaz Youssefi 167b9e8d 2024-09-18T21:51:38 Vulkan: Fix pipeline cache store vs monolithic pipeline race The thread that creates monolithic pipelines needs to hold the pipeline cache lock, as well as the thread that stores the pipeline cache contents to the blob cache. Bug: angleproject:42265839 Change-Id: I17cf9d2bb3f27d531f368003cb4ee00007a464fa Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5872715 Reviewed-by: Charlie Lao <cclao@google.com> Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org>
Gowtham Tammana f8fc8ac3 2024-08-05T11:50:11 Vulkan: Remove dependency on ContextVk for CommandBufferHelper Following on the changes in [1], this makes the `CommandBufferHelperCommon` and `OutsideRenderPassCommandBufferHelper` interfaces independent of `ContextVk` state. Any dependency is made explicit. In addition, interfaces that are not specific to GLES context are also updated. [1]: Commit (bcf814fda5 Vulkan: Constrain the dependency on ContextVk in BufferHelper) Bug: angleproject:8544 Change-Id: I7d90ad915e8c14187ab5584453b9e8802bd91e2b Signed-off-by: Gowtham Tammana <g.tammana@samsung.com> Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5319147 Reviewed-by: Charlie Lao <cclao@google.com> Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org>
Charlie Lao 7c77bb75 2024-07-24T12:05:29 Vulkan: Remove implicit buffer barrier for shader write When app uses shaders to write to SSBO, right now we are inserting an implicit barrier to ensure WAW are in order. But Spec says that "Explicit synchronization is required to ensure that the effects of buffer and texture data stores performed by shaders will be visible to subsequent operations using the same objects". This CL removes the implicit barrier for buffer write if the current write comes from shaders and relies on explicit glMemoryBarrier to insert a global barrier. Bug: angleproject:350994515 Change-Id: I8ab039610be9be2ded27ea60dab54bdad08502f6 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5719258 Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Commit-Queue: Charlie Lao <cclao@google.com> Reviewed-by: Yuxin Hu <yuxinhu@google.com>
Shahbaz Youssefi 7691cea7 2024-07-22T13:46:14 Vulkan: Remove seamful cubemap emulation Practically, the Vulkan backend is never expected to run on ES2 hardware. It _may_ for WebGL, but seamful cubemap emulation was disabled for webgl anyway. Bug: angleproject:354729454 Change-Id: Iafa20fbdbe232c4df4c777b12e7698ef7a87cf24 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5730143 Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org> Auto-Submit: Shahbaz Youssefi <syoussefi@chromium.org> Reviewed-by: Yuxin Hu <yuxinhu@google.com> Reviewed-by: Charlie Lao <cclao@google.com>
Shahbaz Youssefi 1db80b88 2024-07-10T12:47:42 Reland "Vulkan: Use VK_KHR_dynamic_rendering[_local_read]" This is a reland of commit c379ff48043a47e444c388c45270db40d3172d50 Original change's description: > Vulkan: Use VK_KHR_dynamic_rendering[_local_read] > > Bug: angleproject:42267038 > Change-Id: I1f4eb0f309992a9c1c287a69520dadf5eff23b26 > Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5637155 > Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org> > Reviewed-by: Amirali Abdolrashidi <abdolrashidi@google.com> > Reviewed-by: Charlie Lao <cclao@google.com> Bug: angleproject:42267038 Change-Id: I083e6963b5421386695e49a9872edbb2016c9763 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5691342 Reviewed-by: Charlie Lao <cclao@google.com> Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org> Reviewed-by: Amirali Abdolrashidi <abdolrashidi@google.com>
Shahbaz Youssefi 1f87cbc9 2024-07-15T13:07:35 Vulkan: Fix late-added resolve attachment tracking Resolve attachments may be added after the fact to a render pass due to glBlitFramebuffer or eglSwapBuffer. Previously, only the resolve image views were tracked by the render pass, and otherwise the state tracking (layout, content defined, etc) treated the resolve images as generically written-to by the render pass. As a result, the render pass was unable to finalize the layout of the resolve images early. Optimizing the layout of the swapchain image when the surface is multisampled for example was not done due to this issue. In this change, when resolve attachments are added late, they are tracked identically to when they are added at the beginning of the render pass, fixing the issues described above. Bug: angleproject:42265625 Bug: angleproject:42266019 Change-Id: I765560762bb8caf39ba1096fb028177201c082d7 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5707470 Reviewed-by: Charlie Lao <cclao@google.com> Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org>
Shahbaz Youssefi 7d461b21 2024-07-10T14:11:53 Revert "Vulkan: Use VK_KHR_dynamic_rendering[_local_read]" This reverts commit c379ff48043a47e444c388c45270db40d3172d50. Reason for revert: Regresses CPU perf and memory when _not_ using DR Original change's description: > Vulkan: Use VK_KHR_dynamic_rendering[_local_read] > > Bug: angleproject:42267038 > Change-Id: I1f4eb0f309992a9c1c287a69520dadf5eff23b26 > Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5637155 > Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org> > Reviewed-by: Amirali Abdolrashidi <abdolrashidi@google.com> > Reviewed-by: Charlie Lao <cclao@google.com> Bug: angleproject:42267038 Change-Id: I3865f0d86813f0eeb9085a92875a33bd449b907f Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5691337 Bot-Commit: Rubber Stamper <rubber-stamper@appspot.gserviceaccount.com> Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org>
Shahbaz Youssefi c379ff48 2024-06-10T22:01:57 Vulkan: Use VK_KHR_dynamic_rendering[_local_read] Bug: angleproject:42267038 Change-Id: I1f4eb0f309992a9c1c287a69520dadf5eff23b26 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5637155 Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org> Reviewed-by: Amirali Abdolrashidi <abdolrashidi@google.com> Reviewed-by: Charlie Lao <cclao@google.com>
Igor Nazarov 7d61980e 2024-06-26T18:39:53 Vulkan: Rename DescriptorSetLayoutDesc update() to addBinding() The `update()` method is never actually used to update the existing bindings (but rather to add new ones), so this change renames the method to `addBinding()` and adds a few ASSERTs for clarity. Also, after recent changes in the `DescriptorSetLayoutDesc` class, some changes made by the `update()` method are irreversible. It is possible to have different descriptions that produce the same layout if `update()` is used to rewrite the existing structure. Bug: angleproject:8677 Change-Id: If85eb2b271bc06843ee9326c024d73801d3da091 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5676345 Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org>
Shahbaz Youssefi d193d51b 2024-06-17T22:46:08 Replace issue ids post migration to new issue tracker This change replaces anglebug.com/NNNN links. Bug: None Change-Id: I8ac3aec8d2a8a844b3d7b99fc0a6b2be8da31761 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5637912 Reviewed-by: Geoff Lang <geofflang@chromium.org> Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org>
Shahbaz Youssefi d0744916 2024-05-07T12:52:53 Vulkan: Smaller PackedDescriptorSetBinding Bug: angleproject:8677 Change-Id: Id7bcef8de129514446384a019b6cce95da13b028 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5522755 Auto-Submit: Shahbaz Youssefi <syoussefi@chromium.org> Reviewed-by: Charlie Lao <cclao@google.com> Reviewed-by: mohan maiya <m.maiya@samsung.com> Commit-Queue: Charlie Lao <cclao@google.com>
Charlie Lao 295ff607 2024-06-05T14:49:33 Vulkan: Precompute stageMask of kImageMemoryBarrierData Right now every time we need a pipelineStage in kImageMemoryBarrierData, we are doing a bitwise AND with mSupportedVulkanPipelineStageMask. This gets called multiple times from the barrier call. This CL adds mImageLayoutAndMemoryBarrierDataMap that has already precomputed all stageMask values, thus avoiding the run-time bitwise AND. This CL also precomputes the bufferWritePipelineStageMask so that flushImpl can use it without constructing it every time. Bug: b/345279810 Change-Id: I878bd31c967cd217477061976f07df13b043fa7f Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5601073 Commit-Queue: Charlie Lao <cclao@google.com> Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Reviewed-by: Amirali Abdolrashidi <abdolrashidi@google.com>
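The precomputation described in 295ff607 amounts to masking a constant table once at initialization instead of on every lookup. The sketch below shows that pattern under stated assumptions: `StageMask`, `Access`, `kBaseStageMasks`, and `BarrierTable` are hypothetical names, and plain integer bits stand in for Vulkan pipeline stage flags, so it does not reflect ANGLE's actual data layout.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical stage bits standing in for VkPipelineStageFlags values.
using StageMask = uint32_t;
constexpr StageMask kVertexStage   = 1u << 0;
constexpr StageMask kFragmentStage = 1u << 1;
constexpr StageMask kGeometryStage = 1u << 2;

// Hypothetical access kinds, used as table indices.
enum class Access : size_t
{
    VertexRead,
    FragmentRead,
    AllGraphicsRead,
    Count,
};

constexpr size_t kAccessCount = static_cast<size_t>(Access::Count);

// Constant table listing every stage an access kind could touch.
constexpr std::array<StageMask, kAccessCount> kBaseStageMasks = {
    kVertexStage,
    kFragmentStage,
    kVertexStage | kFragmentStage | kGeometryStage,
};

class BarrierTable
{
  public:
    // Apply the device's supported-stage mask once, up front.
    explicit BarrierTable(StageMask supportedStages)
    {
        for (size_t i = 0; i < kAccessCount; ++i)
        {
            mStageMasks[i] = kBaseStageMasks[i] & supportedStages;
        }
    }

    // Lookups are now a plain array read with no masking.
    StageMask stageMask(Access access) const
    {
        return mStageMasks[static_cast<size_t>(access)];
    }

  private:
    std::array<StageMask, kAccessCount> mStageMasks{};
};
```

For a device without the geometry stage, `BarrierTable(kVertexStage | kFragmentStage).stageMask(Access::AllGraphicsRead)` yields only the vertex and fragment bits.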
Shahbaz Youssefi c3a1cae4 2024-04-15T14:58:55 Use angle::SimpleMutex everywhere in libGLESv2 Only cases left that use std::mutex are: - Share group and the context ErrorSet mutexes as they need try_lock() - Anywhere mutexes are used in conjunction with std::condition_variables (as they explicitly require std::mutex) Bug: angleproject:8667 Change-Id: Ib6d68938b0886f9e7c43e023162557990ecfb300 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5453294 Reviewed-by: Roman Lavrov <romanl@google.com> Reviewed-by: Charlie Lao <cclao@google.com> Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org>
Mohan Maiya 97aaad3a 2024-04-20T19:27:13 Vulkan: Pack DescriptorSetLayoutDesc layout Use angle::FastVector instead of arrays to further compact DescriptorSetLayoutDesc layout Bug: angleproject:8677 Tests: VulkanDescriptorSetLayoutDescTest* Change-Id: I5bb7b2ebf0aa5aba3d7c47c45384788245dce3dc Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5470362 Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Commit-Queue: mohan maiya <m.maiya@samsung.com>
Mohan Maiya 48132950 2024-04-17T17:05:07 Vulkan: Optimize DescriptorSetLayoutDesc layout Separate out immutable samplers into its own array so we can remove padding from PackedDescriptorSetBinding which reduces the size of that struct from 16 bytes to 4 bytes. Bug: angleproject:2462 Change-Id: I79d1ab584178202c9b7f34b0c7926edced4e21a8 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5464162 Commit-Queue: mohan maiya <m.maiya@samsung.com> Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Reviewed-by: Charlie Lao <cclao@google.com>
Shahbaz Youssefi 80c8b6f0 2024-04-17T10:06:45 Revert "Vulkan: Only enable DS dynamic state if there is DS attachment." This reverts commit 471b50407d7d1c22491d066df77060cb8b9b2f89. The reverted change does not correctly handle UtilsVk functions, leading to validation failures. UtilsVk could be made to not set dynamic state when the depth/stencil attachments are missing, but instead the change is reverted because: - The original issue that prompted this is easily fixable (and fixed in this change) - Disabling depth/stencil dynamic state is not necessarily a performance improvement; every time a pipeline in such a render pass is bound, the driver would have to make sure to no-op the relevant state change if static, which is also costly. Instead, dynamic state may need to be set only once in the entire render pass. Bug: b/223456677 Bug: b/315353258 Bug: angleproject:8242 Change-Id: I8282b87857d6b9285dbcf307c3c6ecf69df5fadb Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5462079 Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org> Reviewed-by: Charlie Lao <cclao@google.com> Reviewed-by: Yuxin Hu <yuxinhu@google.com>
Mohan Maiya b2aad1bb 2024-04-17T06:12:14 Vulkan: Track valid descriptor set layouts Instead of looping through kMaxDescriptorSetLayoutBindings in `DescriptorSetLayoutDesc::unpackBindings` track valid descriptor set layouts in `DescriptorSetLayoutDesc::update` Bug: angleproject:2462 Change-Id: I1ca2ba72875d9306b6059b14cde39c5d16250be6 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5464160 Reviewed-by: Charlie Lao <cclao@google.com> Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Commit-Queue: mohan maiya <m.maiya@samsung.com>
Mohan Maiya c1397510 2024-04-07T21:05:34 Vulkan: Fix data race in WarmUpGraphicsTask std::unordered_map doesn't support simultaneous read and write. Cache placeholder PipelineHelper in WarmUpGraphicsTask and std::move the newly created PipelineHelper when warm up is complete. Bug: angleproject:8297 Change-Id: I1cc4b3cd48147d0080666d5669d61de006c2252d Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5431830 Reviewed-by: Charlie Lao <cclao@google.com> Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org> Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org>
Mohan Maiya ad13fec3 2024-03-30T15:31:49 Vulkan: warmUpGraphicsPipelineCache(...) shouldn't set state The prepareForWarmUpPipelineCache(...) method would have already setup all necessary state for the warm up task. Make that intent explicit by calling into a method that sets no state. Bug: angleproject:8297 Change-Id: I959d8591045ff05ddb2a410fd0e0eda8dd692d37 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5408796 Reviewed-by: Charlie Lao <cclao@google.com> Commit-Queue: mohan maiya <m.maiya@samsung.com> Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org>
Shahbaz Youssefi b559efa8 2024-03-26T22:02:41 Vulkan: Allow depth and stencil resolve to be separately added In preparation for optimizing resolve through glBlitFramebuffer for depth/stencil attachments. Bug: angleproject:7551 Change-Id: I57650d82c0cc6e56f44591eadfc42ac794cfef09 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5399140 Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org> Reviewed-by: Amirali Abdolrashidi <abdolrashidi@google.com> Reviewed-by: Yuxin Hu <yuxinhu@google.com>
Shahbaz Youssefi c71a67de 2024-03-27T15:50:00 Vulkan: Move pipeline cache graph dump to renderer In preparation for moving some caches to the share group. Bug: angleproject:6565 Bug: angleproject:8629 Change-Id: I1a06a18417502e499da0edb9abb0d510e3ad99ce Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5401513 Auto-Submit: Shahbaz Youssefi <syoussefi@chromium.org> Reviewed-by: Charlie Lao <cclao@google.com> Commit-Queue: Charlie Lao <cclao@google.com> Reviewed-by: mohan maiya <m.maiya@samsung.com>
Shahbaz Youssefi 9475ac40 2023-11-15T10:25:06 Vulkan: Make efficient MSAA resolve possible Prior to this change, using a resolve attachment to implement resolve through glBlitFramebuffer was done by temporarily modifying the source FramebufferVk's framebuffer description. This caused a good deal of complexity; enough to require the render pass to be immediately closed after this optimization. The downsides to this are: - Only one attachment can be efficiently resolved - There is no chance for the MSAA attachment to be invalidated In this change, resolve attachments that are added because of glBlitFramebuffer are stored in the command buffer, with the FramebufferVk completely oblivious to them. When the render pass is closed, either the FramebufferVk's original framebuffer object is used (if no resolve attachments are added) or a temporary one is created to include those resolve attachments. With the above method, the render pass is able to accumulate many resolve attachments as well as have its MSAA attachments be invalidated before it is flushed. For a FramebufferVk that is resolved in this way, there used to be two framebuffers created each time and thrown away as the code alternated between starting a render pass without a resolve attachment and then closing with one. With this change, there is now one framebuffer (without resolve attachments) that is cached in FramebufferVk (and is not recreated every time), and only the framebuffer with resolve attachments is recreated every time. Ultimately, when VK_KHR_dynamic_rendering is implemented in ANGLE, there would be no framebuffers to create and destroy, and this change paves the way for that support too. WindowSurfaceVk framebuffers are still imagefull. Making them imageless adds unnecessary complication with no benefit.
-----------------

To achieve efficient MSAA rendering on tiling hardware, applications should do the following:

```
glBindFramebuffer(GL_FRAMEBUFFER, msaaFBO);
// Clear the framebuffer to avoid a load
// Or invalidate, if not needed to load:
// glInvalidateFramebuffer(GL_DRAW_FRAMEBUFFER, ...);
glClear(...);
// Draw calls

// Resolve into the single sampled framebuffer
glBindFramebuffer(GL_DRAW_FRAMEBUFFER, resolveFBO);
glBlitFramebuffer(...);
// Immediately discard the contents of the MSAA buffer, to avoid store
glInvalidateFramebuffer(GL_READ_FRAMEBUFFER, ...);
```

The above would translate to the following Vulkan render pass:

- MSAA LOAD_OP_CLEAR/DONT_CARE
- MSAA STORE_OP_DONT_CARE
- Resolve LOAD_OP_DONT_CARE
- Resolve STORE_OP_STORE

This makes sure the MSAA data doesn't leave the tile memory and greatly reduces bandwidth usage. Once anglebug.com/4892 is fixed, this would also allow the MSAA image to never be allocated either. Bug: angleproject:7551 Bug: angleproject:8625 Change-Id: Ia9f4d20863d76a013d8495033f95c7b39f77e062 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5388492 Reviewed-by: Yuxin Hu <yuxinhu@google.com> Reviewed-by: Amirali Abdolrashidi <abdolrashidi@google.com> Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org>
Shahbaz Youssefi efd41bd2 2024-03-15T13:25:03 Vulkan: Rename ResourceVk.* to vk_resource.* This file adds helpers to namespace vk, so its name is changed for consistency with other namespace vk files. Bug: angleproject:8564 Change-Id: I6525e7609eb9385f2a3eecaa7c52b7417fda7f12 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5370108 Reviewed-by: Amirali Abdolrashidi <abdolrashidi@google.com> Auto-Submit: Shahbaz Youssefi <syoussefi@chromium.org> Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org>
Shahbaz Youssefi 60aaf4a0 2024-03-14T12:58:56 Vulkan: Move renderer to namespace vk This class is agnostic of EGL. This change moves it to namespace vk for use with the OpenCL implementation Bug: angleproject:8564 Change-Id: I57f7807d6af8b3d5d7f8efbaf8b5d537a930f881 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5371324 Reviewed-by: Austin Annestrand <a.annestrand@samsung.com> Reviewed-by: Geoff Lang <geofflang@chromium.org> Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org>
Mohan Maiya 91ddf851 2024-03-03T10:57:22 Vulkan: support QCOM foveated rendering extensions Add support for foveated rendering in the vulkan backend. This is done by leveraging the VK_KHR_fragment_shading_rate extension. Bug: angleproject:8484 Change-Id: I0d01d07583f710b2302ea07b19c9d113c73bfe41 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5269907 Reviewed-by: Charlie Lao <cclao@google.com> Commit-Queue: mohan maiya <m.maiya@samsung.com> Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org>
Austin Annestrand fc440afa 2024-02-16T13:57:32 Vulkan: Move DS builder class to Vk utils Currently, UpdateDescriptorSetsBuilder lives in ShareGroupVk.cpp/h. The UpdateDescriptorSetsBuilder isn't really GL-specific. Thus it can be moved over to vk_cache_utils.h (more of a Vk utility class). Bug: angleproject:8546 Change-Id: I1ead04bab4c5840e6c471cdc7c5db4220e32bd50 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5303540 Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org>
Mohan Maiya 6607a2b9 2024-01-17T15:58:20 Vulkan: Add support for VK_EXT_vertex_input_dynamic_state Hook into VK_EXT_vertex_input_dynamic_state so pipeline states that differ only in vertex input state can reuse existing pipelines. Bug: angleproject:7162 Tests: StateChangeTestES3.Vertex* Change-Id: Icd3134dee93fc5fc2e9d284fcfa8c674b62faec8 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5207462 Commit-Queue: mohan maiya <m.maiya@samsung.com> Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org>
Chris Forbes f5f3304a 2024-02-02T16:27:38 Vulkan: Simplify handling of YUV filtering support When the requested filtering mode changes, we need to consider whether it is actually supported by the Vulkan driver. Now that we support renderable YUV textures, there are now three interesting cases: 1) The texture has a VkFormat, and so filtering support can be queried from GPDFP, as was already done. 2) The texture is imported from an opaque AHB using an external format, that format is renderable, and so we have assigned one of the EXTERNALn angle formats. This was *not* covered properly, and would lead to VVL errors or UB. 3) The texture is imported from an opaque AHB using an external format, and we have not assigned an EXTERNALn angle format to it, because the format is not renderable, or the Vulkan driver is missing the external format resolve functionality; In this case the angle format is NONE. This was similarly *not* covered properly, although the code did attempt to protect itself from querying the capabilities of format NONE. VVL errors and UB were still possible. To most simply cover all of these cases, capture whether the image has the VK_FORMAT_FEATURE_SAMPLED_IMAGE_YCBCR_CONVERSION_LINEAR_FILTER feature upfront, and forget about format lookups in the internals of the YcbcrConversionDesc. Bug: b/315387961 Change-Id: Ie140293d52c2b88bf06ef19bc54bb1c95927b8ce Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5259719 Reviewed-by: Charlie Lao <cclao@google.com> Reviewed-by: Cody Northrop <cnorthrop@google.com> Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org>
Shahbaz Youssefi 0c4d6446 2024-01-24T10:38:45 Rework uniform block <-> uniform buffer mapping In GLES, the shader declares which buffer binding a block (uniform, storage or atomic counter) is bound to. For example: layout(binding = 1) uniform ubo0 { ... }; layout(binding = 2) uniform ubo1 { ... }; layout(binding = 1) uniform ubo2 { ... }; In the above, ubo0 and ubo2 use data from the buffer bound to index 1 (through glBindBufferRange), while ubo1 uses data from the buffer bound to index 2. For uniform blocks in particular, omitting the binding is allowed, in which case it is implicitly bound to buffer 0. GLES allows uniform blocks (and only uniform blocks) to remap their bindings through calls to glUniformBlockBinding. This means that the mapping of uniform blocks in the program (ubo0, ubo1, ubo2) to the buffer bindings is not constant. For storage blocks and atomic counter buffers, this binding _is_ constant and is determined at link time. At link time, the mapping of blocks to buffers is determined based on values specified in the shaders. This info was stored in gl::InterfaceBlock::binding (for UBOs and SSBOs), and gl::AtomicCounterBuffer::binding. For clarity, this change renames these members to ...::inShaderBinding. When glUniformBlockBinding is called, the mapping is updated. Prior to this change, gl::InterfaceBlock::binding was directly updated, trumping the mapping determined at link time. A bug here was that after a call to glProgramBinary, GL expects the mappings to reset to their original link-time values, but instead ANGLE restored the mappings to what was configured at the time the binary was retrieved. This change tracks the uniform block -> buffer binding mapping separately from the link results so that the original values can be restored during glProgramBinary. In the process, the support data structures for tracking this mapping are moved to ProgramExecutable and the algorithms are simplified.
Program Pipeline Objects maintain this mapping identically to Programs and no longer require a special and more costly path when a buffer state changes. This change prepares for but does not yet fix the more fundamental bug that the dirty bits are tracked in the program executable instead of the context state, which makes changes not propagate to all contexts correctly. Bug: angleproject:8493 Change-Id: Ib0999f49be24db06ebe9a4917d06b90af899611e Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5235883 Reviewed-by: Geoff Lang <geofflang@chromium.org> Reviewed-by: Charlie Lao <cclao@google.com> Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org>
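A minimal sketch of the separation described in 0c4d6446, assuming the link-time bindings and the current (remappable) bindings are kept in two parallel vectors. The class `BlockBindings` and its members are hypothetical and only model the idea; they are not ANGLE's ProgramExecutable API.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical model of "link-time binding" vs. "current binding" tracking.
class BlockBindings
{
  public:
    // inShaderBindings[i] is the binding declared in the shader for block i.
    explicit BlockBindings(std::vector<uint32_t> inShaderBindings)
        : mInShaderBinding(std::move(inShaderBindings)), mCurrentBinding(mInShaderBinding)
    {}

    // Models glUniformBlockBinding: only the current mapping changes.
    void setBinding(size_t block, uint32_t binding) { mCurrentBinding[block] = binding; }

    // Models the reset expected after glProgramBinary: restore link-time values.
    void resetToLinkTime() { mCurrentBinding = mInShaderBinding; }

    uint32_t binding(size_t block) const { return mCurrentBinding[block]; }

  private:
    // Declared first so it is initialized before mCurrentBinding copies it.
    std::vector<uint32_t> mInShaderBinding;
    std::vector<uint32_t> mCurrentBinding;
};
```

With the three blocks from the commit message (bindings 1, 2, 1), remapping ubo0 to binding 5 and then resetting returns it to binding 1, because the link-time values are never overwritten.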
Charlie Lao 471b5040 2023-10-26T09:33:29 Vulkan: Only enable DS dynamic state if there is DS attachment. This was discovered while investigating an EXT_yuv_target crash in the driver. What happens is that UtilsVk::copyImage does not set depth stencil dynamic state since there is no depth stencil attachment. But we enabled dynamic state for D/S, thus the driver still does D/S state setup, which sees garbage data and hits an assertion. Even though this was discovered with an EXT_yuv_target test, I believe this is a general issue. This CL adds the renderPassDesc.hasDepthAttachment() and hasStencilAttachment() checks and enables depth or stencil related dynamic state only if there is a depth or stencil attachment. This fixes the crash in the driver with the ImageTestES3.ClearYUVAHB test. This also has the added performance benefit that we now completely skip the depth/stencil related dynamic state dirty bit handling code, thus reducing state processing CPU overhead. Bug: b/223456677 Change-Id: I3a4fe6d97b14c066d78f8b8ded21c626cb2f376c Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/4980765 Reviewed-by: Chris Forbes <chrisforbes@google.com> Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Reviewed-by: Yuxin Hu <yuxinhu@google.com> Commit-Queue: Charlie Lao <cclao@google.com>
Charlie Lao 4d7fc442 2023-10-18T12:49:06 Vulkan: Fix VK_android_external_format_resolve VVL error part 3 VUID-VkRenderPassAttachmentBeginInfo-pAttachments-parameter: The Vulkan spec states: If attachmentCount is not 0, pAttachments must be a valid pointer to an array of attachmentCount valid VkImageView handles. The bug here is that when nullColorAttachmentWithExternalFormatResolve is true, there is no color attachment, but the RenderPassDesc still appears to have a color attachment because we need to store the formatID in it. This CL switches to using mFramebuffer.getImageViews().size() instead of mRenderPassDesc.attachmentCount(), which is more correct anyway. Bug: b/223456677 Change-Id: I0f0947f0c642bac9cd18a80525b92c62ef0723ec Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/4952969 Commit-Queue: Charlie Lao <cclao@google.com> Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Reviewed-by: Chris Forbes <chrisforbes@google.com>
Charlie Lao 2e11fcc5 2023-10-16T16:40:35 Vulkan: Fix assertion when YUV image attached to resolve attachment When a YUV image is attached to a resolve attachment, mSamples is 1. Right now the code assumes resolve is an MSRT attachment, so it asserts mSamples>1. This CL adds a new API packYUVResolveAttachment so that we can assert properly for YUV and MSRT. Bug: b/223456677 Change-Id: Ib65fd3fe1e6561b85395cc27204bbd85c1f464c3 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/4942907 Commit-Queue: Charlie Lao <cclao@google.com> Reviewed-by: Chris Forbes <chrisforbes@google.com> Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org>
Roman Lavrov 6154bd93 2023-10-12T15:27:44 Remove binding from DescriptorInfoDesc. The issue with hitting the cache falsely is no longer reproduced (tests added in https://crrev.com/c/4104121) Charlie had changed the cache so mWriteDescriptors are no longer part of this class, so some of those changes might have affected that. Also mDescriptorInfos was previously a map and now is a vector, which imposes a specific ordering - and that might be taking care of the sampler swap hitting the cache falsely. Charlie suggested that https://crrev.com/c/4581881 might have taken care of this as textureUnit was used instead of bindingIndex: https://chromium-review.googlesource.com/c/angle/angle/+/4936096/comment/ad2c0aa0_441bd33d/ Bug: angleproject:7974 Change-Id: I58391790a4362313c07c7bd28ed6f38f30720781 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/4936096 Reviewed-by: Charlie Lao <cclao@google.com> Commit-Queue: Roman Lavrov <romanl@google.com> Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org>
Shahbaz Youssefi 6698fb69 2023-08-25T22:21:32 Vulkan: Stop passing both ProgramExecutable and ...Vk around Now that ProgramExecutableVk is accessible through ProgramExecutable. Bug: angleproject:8297 Change-Id: Ie08770ef97400195d63b87f2d4b7e2a2c8f4ad24 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/4812147 Reviewed-by: Charlie Lao <cclao@google.com> Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org> Reviewed-by: Yuxin Hu <yuxinhu@google.com>
Shahbaz Youssefi 571b4cdb 2023-08-14T16:55:28 Vulkan: Move pipeline/desc-set layout creation to link job The pipeline and desc-set layout caches are consequently made thread-safe. The reference counters on the layouts are also made atomic. With this change, practically all of the link in the Vulkan backend is moved to the link job. Bug: angleproject:8297 Change-Id: Iba694ece5fc5510d34cce2c34441ae08ca5bb646 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/4774787 Reviewed-by: Charlie Lao <cclao@google.com> Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org>
Shahbaz Youssefi 16cfa28e 2023-08-08T22:08:24 Vulkan: Basic infra for parallel link This change moves pipeline warm up to a parallelizable task, mostly as an exercise to put in the infrastructure for parallel link in the Vulkan backend. Follow up changes will move more of the link step to this task. The end goal is to be able to make the link task independent of ContextVk, which would allow it to be run as an UnlockedTailCall, even if not using a worker thread. Bug: angleproject:8297 Change-Id: I17047162b2a41f0d681d9e3ee33f2e0239b4280d Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/4764231 Reviewed-by: Charlie Lao <cclao@google.com> Reviewed-by: Geoff Lang <geofflang@chromium.org> Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org>
Charlie Lao 7c69116f 2023-08-08T10:14:47 Vulkan: Fix data race with DynamicDescriptorPool Right now DynamicDescriptorPool::destroyCachedDescriptorSet can be called from the garbage clean up thread while simultaneously being accessed from the context main thread, so a data race will happen and cause bugs. This can only happen when the buffer is not being suballocated. In this case, the suballocation owns the bufferBlock, and the bufferBlock gets destroyed when the suballocation is destroyed from the garbage collection thread. If the buffer is suballocated, the share group owns the pool, which owns the bufferBlocks, and they get destroyed from the share group with the share group lock held. This CL avoids this race by releasing the shared cacheKey when the buffer is released, while we still hold the share group lock. Bug: chromium:1469542 Change-Id: Ic1f99e6b6083d63e4efb9c3f408921da62c006ac Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/4761365 Commit-Queue: Charlie Lao <cclao@google.com> Reviewed-by: Yuxin Hu <yuxinhu@google.com> Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org>
Roman Lavrov c0f2f71e 2023-06-27T16:00:09 Use VK_EXT_legacy_dithering when available instead of emulation Yields improvement in gpu power: http://b/284462263#comment45 Bug: b/284462263 Change-Id: I5bfd115557b6baac17c05639118feaebf19c5cd4 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/4652590 Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Commit-Queue: Roman Lavrov <romanl@google.com>
Charlie Lao 02292814 2023-06-01T14:46:05 Vulkan: Optimize the usage of FastMap in DescriptorSetDescBuilder While looking at the disassembly of the DescriptorSetDescBuilder::updateOneShaderBuffer() function, I noticed that there are a lot of CPU cycles spent in FastMap::operator[]. What happened here is that we are increasing the size one by one as we build the descriptorSet, and that hits the `if (mData.size() <= key)` case, so we end up resizing the underlying FastVector. That resize also initializes the elements with zeros, which are immediately overwritten by actual data. Since we actually know the eventual size of DescriptorSetDescBuilder::mDesc/mHandles/mDynamicOffsets, we could just switch to angle::FastVector, which avoids this size check and growth every time we write to it. This CL switches the use of FastMap in DescriptorSetDescBuilder to FastVector. The only thing we need to watch out for is that previously the new elements were always zero filled and now they are not. So we need to make sure we write every field of the structure. This CL also renames WriteDescriptorDescBuilder to WriteDescriptorDescs since when it is read only we are passing it as a const reference already, so there is no added advantage to having two classes. Bug: b/282194402 Change-Id: I06a063cc51585fc17fbf0d5aa916b9aa0ab88dd4 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/4581881 Reviewed-by: Roman Lavrov <romanl@google.com> Commit-Queue: Charlie Lao <cclao@google.com> Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org>
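A minimal sketch contrasting the two access patterns described in 02292814, using `std::vector` as a stand-in for ANGLE's containers. `Entry`, `GrowingMap` and `BuildPresized` are hypothetical names for illustration. Note one difference from the commit: `std::vector` still value-initializes elements when sized, whereas the commit message says the FastVector path leaves new elements unfilled, so this sketch only illustrates removing the per-write size check and repeated growth.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

struct Entry
{
    uint32_t binding;
    uint32_t count;
};

// Grow-on-access container, modeling the FastMap-style operator[] from the
// commit message: every out-of-range key triggers a resize plus zero fill.
class GrowingMap
{
  public:
    Entry &operator[](size_t key)
    {
        if (mData.size() <= key)
        {
            mData.resize(key + 1, Entry{});
            ++mResizeCount;
        }
        return mData[key];
    }

    size_t resizeCount() const { return mResizeCount; }

  private:
    std::vector<Entry> mData;
    size_t mResizeCount = 0;
};

// Size-once alternative: the final count is known up front, so the container
// is sized a single time and every field of every element is then written.
std::vector<Entry> BuildPresized(size_t total)
{
    std::vector<Entry> entries(total);
    for (size_t i = 0; i < total; ++i)
    {
        entries[i].binding = static_cast<uint32_t>(i);
        entries[i].count   = 1;
    }
    return entries;
}
```

Filling four entries one by one through `GrowingMap` takes four resizes, while `BuildPresized(4)` sizes its storage once.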
Shahbaz Youssefi cafbf6e2 2023-06-22T22:50:32 Vulkan: Simplify active uniform check Bug: angleproject:7220 Change-Id: Ic0f26f3d09bac570d4ed3f791c456d569208424a Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/4636869 Reviewed-by: Charlie Lao <cclao@google.com> Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org>
Shahbaz Youssefi ec1f18db 2023-06-21T10:16:51 Vulkan: Remove ShaderVariableType and flatten info map With the conversion of the interface variable info map keys to SPIR-V ids, there is no longer a benefit to bucket resources by their type. This change removes this bucketing and flattens the map. Bug: angleproject:7220 Change-Id: If83cb02ca9e91f72dddb2deb7313fee40f9f06c3 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/4632577 Reviewed-by: Yuxin Hu <yuxinhu@google.com> Reviewed-by: Charlie Lao <cclao@google.com> Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org>
Shahbaz Youssefi c1ba8e6f 2023-06-20T16:03:20 Vulkan: Flatten shader interface variable maps This change removes duplicate entries added in the shader interface variable maps. One level of arrayness (indexed by shader type) is removed from these maps as now there is only a single entry per linked resource/etc. Bug: angleproject:7220 Change-Id: Ibf2d06a0e1f68e68797c2066f36e14cb9e667f77 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/4628677 Reviewed-by: Yuxin Hu <yuxinhu@google.com> Reviewed-by: Charlie Lao <cclao@google.com> Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org>