src/libANGLE/renderer/vulkan/BufferVk.cpp


Log

Author Commit Date CI Message
Imran Ziad 0040cda1 2024-09-25T18:16:36 Vulkan: Invalidate host visible non-coherent buffers on mapping We can not trust the cache during CPU readback when the buffer memory type is non-coherent. Bug: b/366134076 Change-Id: I89920cfa468ee0be0feb607fea9d60bc0732191f Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5890707 Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org> Reviewed-by: Charlie Lao <cclao@google.com> Auto-Submit: Imran Ziad <imranziad@chromium.org>
Charlie Lao c9d55051 2024-09-06T10:56:07 Vulkan: Consolidate dirtyRanges before vertex conversion Detect two ranges overlap or are continuous and merge them. This reduces number of dispatch calls as well as avoids redundant conversion. Bug: b/357622380 Change-Id: I06b73a1e9fd573d79af985b247f4d66bf97f756e Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5851642 Auto-Submit: Charlie Lao <cclao@google.com> Commit-Queue: Yuxin Hu <yuxinhu@google.com> Reviewed-by: Yuxin Hu <yuxinhu@google.com> Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org>
Charlie Lao 10d56e63 2024-09-06T10:56:07 Vulkan: use mEntireBufferDirty if range covers entire buffer Detect the dirtyRange covers entire buffer and set mEntireBufferDirty instead of add range into mDirtyRanges. This will get into a much simpler code path that will not have any redundant conversion. This also removes quite big portion of overlapped ranges cases without much of overhead. Bug: b/357622380 Change-Id: Iedaa3662a6fc52257e71d39ab75baddf6ad3e41b Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5840476 Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org> Reviewed-by: Yuxin Hu <yuxinhu@google.com> Auto-Submit: Charlie Lao <cclao@google.com>
Amirali Abdolrashidi 941b3df3 2024-08-21T17:23:16 Vulkan: Add new threshold for CPU buffer subdata Currently, with preferCPUForBufferSubData enabled, when the GPU is busy, we use the CPU to copy subdata, which includes duplicating the buffer. If the buffer is large, this can cause performance regression. To avoid this, we should avoid using the CPU in cases that can lead to such regressions. * Enabled preferCPUForBufferSubData for all ARM devices. * This feature is only used in the following cases: * If the buffer is smaller than than 32K; or * If the copy size is greater than 1/8 of the total buffer size. * Significant performance improvement observed in several traces. Bug: b/360118138 Change-Id: Ibc8e3de9b4e081c69457c85facba893283c8fb38 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5824347 Reviewed-by: Charlie Lao <cclao@google.com> Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Commit-Queue: Amirali Abdolrashidi <abdolrashidi@google.com>
Charlie Lao 53476d6f 2024-09-04T14:34:11 Vulkan: Do vertex conversion with fine grain dirtyRange Right now when we do vertex conversion with multiple dirty ranges, we are merging dirty ranges into single range and then UtilsVk::convertVertexBuffer() is called for the merged dirty range. If there is big gap between two ranges, the merged range could be very big. This means we end up doing many unnecessary conversion. This CL tracks individual dirty ranges and issues dispatchCompute for each dirty range, thus minimize the unnecessary conversions. On S24, this change further reduces TraceTest.gangstar_vegas frame time from ~6.0ms to ~3.8ms, almost parity to native GLES's ~3.6ms. Bug: b/357622380 Change-Id: Ia103f3963bdb5996ff3f95164c955a3e4f33f311 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5787633 Commit-Queue: Charlie Lao <cclao@google.com> Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Reviewed-by: Yuxin Hu <yuxinhu@google.com>
Charlie Lao 20511d0b 2024-09-05T17:35:33 Vulkan: Remove unnecessary perfetto trace in BufferVk::mapRange There is not much usefulness to know there is mapRange call. The more meaningful place is where we end up waiting for GPU to finish which we already have the perfetto trace in SharedFence::wait. The exitsing of this trace made a lot of noise in trace since app tends to call this a lot and there are many tiny lines on the trace. Bug: None Change-Id: I83fbbba827361d068b521bdfb2bc187865d11012 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5841067 Commit-Queue: Charlie Lao <cclao@google.com> Reviewed-by: Yuxin Hu <yuxinhu@google.com>
Charlie Lao 4e3b4e0c 2024-08-21T16:56:47 Vulkan: Clear conversion buffer's dirtyRange if bufferData called If BufferData is called, the previous dirtyRange on the conversion buffer is no longer valid (the original data in BufferVk is no longer valid). This CL clears the dirty range in conversion buffers so that they do not gets unnecessarily converted. Bug: b/357622380 Change-Id: Ic97c867a8d2141659a0ffd269049f9d4168a421f Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5804531 Reviewed-by: Yuxin Hu <yuxinhu@google.com> Commit-Queue: Charlie Lao <cclao@google.com> Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org>
Charlie Lao 790e0162 2024-08-09T17:11:38 Vulkan: Add dirty range to VertexConversionBuffer class Previously, ConversionBuffer only has a boolean indicates it is dirty or not. This CL adds mDirtyRange to it to indicate which range of data has been modified. The existing dirty boolean has been changed to mEntireBufferDirty so that all the current code will still work. Right now mEntireBufferDirty is always set when we mark it dirty, which means entire buffer gets converted. mDirtyRange has not been used to reduce the data to be converted. Right now the range is always being merged to the existing range and not actually being used in this CL. It will be used in the next CL. Bug: b/357622380 Change-Id: Ibfa702b29011f4e26c511d5db85c07cbf2a4aefb Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5778347 Reviewed-by: Yuxin Hu <yuxinhu@google.com> Commit-Queue: Charlie Lao <cclao@google.com>
Charlie Lao 86508e20 2024-08-16T14:56:37 Vulkan: Make VertexConversionBuffer a class And wrap the cache key (i.e, formatID/stride/offset) into a CacheKey struct so that we can easily add more data members. This CL also changes ConversionBuffer from struct to class to have better encapsulation. No functional changes is expected here. Bug: b/357622380 Change-Id: Ieecf5c922b95a940137c8e54657ef3f458c55fc9 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5793921 Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Commit-Queue: Charlie Lao <cclao@google.com> Reviewed-by: Yuxin Hu <yuxinhu@google.com>
Charlie Lao a3f9f6c3 2024-08-16T12:12:00 Vulkan: Move VertexConversionBuffer out of BufferVk class This is another preparation CL to make code review easier. In the future CL, VertexConversionBuffer will be used in other files. BufferVk::VertexConversionBuffer is a bit too verbose. Make VertexConversionBuffer a standalone class makes code a bit easier to read. We could potentially merge ConversionBuffer and VertexConversionBuffer into single class, but I am leaving it for future exercise. Bug: b/357622380 Change-Id: I2255a1c9b01f5d6c3846e8017cc667e6a9d66ffc Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5787504 Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Reviewed-by: Yuxin Hu <yuxinhu@google.com> Commit-Queue: Charlie Lao <cclao@google.com>
Imran Ziad 24322b7d 2024-06-28T20:10:34 Vulkan: feature for cached non-coherent for dyn/stream buffers On platforms lacking cached coherent memory, ANGLE falls back to non-cached coherent memory for dynamic/stream buffers. This impacts CPU readback performance. Add VK feature preferCachedNoncoherentForDynamicStreamBufferUsage. When enabled, ANGLE prioritizes cached non-coherent memory for these buffers. Enable this feature for Intel Meteorlake SOCs. Bug: b/347601787 Change-Id: If62af9f3df57c0bcebf18af747cac56e45f93ea7 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5667457 Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Reviewed-by: Charlie Lao <cclao@google.com> Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org>
Charlie Lao 018188c7 2024-05-28T14:44:14 Vulkan: Fix CachedPreferCoherent to actually require cached VK_MEMORY_PROPERTY_HOST_CACHED_BIT should be in requiredBits instead of preferredBits for CachedPreferCoherent buffer. This again caused pixel6 test failures. flush() call is added right after buffer allocation to fix the test failure. This likely is due to the spec says " If a range of non-coherent memory is written by the host and then invalidated without first being flushed, its contents are undefined.". Bug: b/339562049 Change-Id: Ie8529722bd03534598b03983ba447131573b1879 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5578276 Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org>
Mohan Maiya 81c2b6e7 2024-05-03T08:12:18 Vulkan: Account for padBuffersToMaxVertexAttribStride ... in BufferVk::shouldRedefineStorage. For vendors that require additional padding, redefine storage iff the padded size is greater than existing buffer size. Bug: b/271915956 Change-Id: I2acf2bb32f85b71bcfdba6919974dd2922ee787b Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5512870 Commit-Queue: mohan maiya <m.maiya@samsung.com> Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Reviewed-by: Charlie Lao <cclao@google.com>
Charlie Lao 2905a6a6 2024-04-19T15:09:41 Vulkan: Fix read pixel to cached non-coherent memory The bug here is that when we use cached non-coherent memory for image read, we must wait until DMA to finish before calling invalidate(). Otherwise CPU pre-fetching might end up populate the cache line again with old data between invalidate and DMA and causes CPU reads get the stale data from cache. This CL moves invalidate() call after we wait for copy to finish and removes requireCachedBitForStagingBuffer feature flag. Bug: b/335937565 Bug: b/315836169 Bug: b/324953979 Change-Id: Ie8a1854e17a5fe9c534c5102b2e0d51bd35c131a Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5468597 Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Commit-Queue: Charlie Lao <cclao@google.com> Reviewed-by: Cody Northrop <cnorthrop@google.com>
Shahbaz Youssefi 914fe61b 2024-03-15T13:20:49 Vulkan: Rename RendererVk.* to vk_renderer.* Done in a separate CL from the move to namespace vk to avoid possible rebase-time confusion with the file name change. Bug: angleproject:8564 Change-Id: Ibab79029834b88514d4466a7a4c076b1352bc450 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5370107 Reviewed-by: Amirali Abdolrashidi <abdolrashidi@google.com> Commit-Queue: Amirali Abdolrashidi <abdolrashidi@google.com>
Shahbaz Youssefi 60aaf4a0 2024-03-14T12:58:56 Vulkan: Move renderer to namespace vk This class is agnostic of EGL. This change moves it to namespace vk for use with the OpenCL implementation Bug: angleproject:8564 Change-Id: I57f7807d6af8b3d5d7f8efbaf8b5d537a930f881 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5371324 Reviewed-by: Austin Annestrand <a.annestrand@samsung.com> Reviewed-by: Geoff Lang <geofflang@chromium.org> Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org>
Gowtham Tammana bcf814fd 2024-02-02T10:30:34 Vulkan: Constrain the dependency on ContextVk in BufferHelper Make the BufferHelper interface be not dependent on ContextVk state. This makes the interface to be suitable for implementation of other APIs with Vulkan backend. Any dependency on ContextVk is made explicit and handled in ContextVk. Bug: angleproject:8544 Change-Id: I8b285f54c8758a26dd7edf27b1371f9afcf7e241 Signed-off-by: Gowtham Tammana <g.tammana@samsung.com> Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5303573 Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org> Reviewed-by: Geoff Lang <geofflang@chromium.org>
Charlie Lao 40f4de8f 2023-12-15T10:17:32 Vulkan: Ensure we use cached memory for readPixels stagingBuffer Previous CL crrev.com/c/5112759 does not solve the performance issue for ChromeOS. The reason is that on more recent intel GPU, there is no hostVisibleCachedCoherent heap. When we allocate staging buffer, we specify CachedCoherent as the preferredFlags instead of requiredFlags. This means we still end up getting UncachedCoherent since VMA tries to respect coherent bits as first priority. This CL Changes CachedCoherent to CachedPreferCoherent, and made Cached as required bit, thus ensures the memory allocated is cached. Since coherent bit may not be honored, thus we have to call invalidate/flush (which underline implementation will check the bit and early out if no need). Somehow on ARM GPU using cachedNonCoherent staging buffer causing many test failures, even though we do call invalidate() after allocation, and tests pass on all other GPUs. It almost indicates ARM driver have a bug with invalidate() that it is not doing expected. But before I can be sure and fixed, I added feature bit to keep ARM the old behavior, which uses UnCached memory for readPixels which should suffer the performance as well. Bug: b/315836169 Bug: b/310701311 Change-Id: I1eec6105ce74275faa893b0206be8470f0cde72f Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5122318 Commit-Queue: Charlie Lao <cclao@google.com> Auto-Submit: Charlie Lao <cclao@google.com> Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org>
Charlie Lao de591cff 2023-12-11T13:15:30 Vulkan: Add CachedCoherent staging buffer Right now if we allocate a coherent staging buffer, it always uncached. I believe the reason it picked uncached is that most usage for staging buffer is data flow from CPU to GPU. CPU only sequentially write into staging buffer. Uncached may has better performance here due to write combined. But this performs horrible if CPU ever read from it. This CL adds a CachedCoherent staging buffer and let staging buffer use that for coherent memory. UncachedCoherent is currently not used, but I still kept here in case we find regression for certain type of usage. Bug: b/315836169 Change-Id: Ica331914c1f4729baa9d2eab048dc3099a2887b5 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5112759 Reviewed-by: Amirali Abdolrashidi <abdolrashidi@google.com> Commit-Queue: Charlie Lao <cclao@google.com>
Charlie Lao 7f5143c2 2023-10-02T15:38:15 Vulkan: Notify VAO when VBO's mBufferWithUserSize changed. When buffer robust access is enabled, and bufferData is called with different size and we end up reusing the underline storage, we will have to recreate VkBuffer with user's size, and driver is relying on VkBuffer's size to implement robust access. The bug here is that we notify VAO when storage changes. But when storage is reused and we have dedicated VkBufer with user size and that VkBuffer changed, we were not notifying the VAO. This CL adds that notification so that VAO gets notified and dirty bits processed and its cache of VkBuffer gets updated Bug: chromium:1488055 Bug: b/303138134 Change-Id: Ie693c92c2edde9a22a41a25f5bde493397550d95 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/4906568 Reviewed-by: Yuxin Hu <yuxinhu@google.com> Commit-Queue: Charlie Lao <cclao@google.com> Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org>
Hailin Zhang 141bada9 2023-09-22T10:25:30 Vulkan: add prefer cached memory type for dynamic buffer usage. Bug: b/288119108 Change-Id: I0fb5d91780e83af06762b9f3e6122313e76624da Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/4886846 Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Commit-Queue: Hailin Zhang <hailinzhang@google.com> Reviewed-by: Charlie Lao <cclao@google.com>
Amirali Abdolrashidi e1d2e88a 2023-09-20T11:53:15 Check pending garbage after some buffer releases * Embedded BufferHelper::releaseBufferAndDescriptorSetCache() inside a new ContextVk method: releaseBufferAllocation() * After releasing the buffer, there is a check for excess pending garbage. If the tracked pending garbage size is larger than the threshold, the context will be flushed. * Unskipped the test "BufferDataInLoopManyTimes", which was failing on Android devices. Bug: b/280304441 Change-Id: Ib34319f3291dd2200fc1a92e30645f9d1da8e2b9 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/4879086 Commit-Queue: Amirali Abdolrashidi <abdolrashidi@google.com> Reviewed-by: Charlie Lao <cclao@google.com> Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org>
Amirali Abdolrashidi 91ef1f3c 2023-09-08T16:39:53 Move buffer suballocation callers to ContextVk * Moved the following functions from BufferHelper to ContextVk. * initBufferForBufferCopy() * initBufferForImageCopy() * initBufferForVertexConversion() Bug: b/280304441 Change-Id: I890f4396b00b0c20feb44f0ad113c55924ce1014 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/4854760 Commit-Queue: Amirali Abdolrashidi <abdolrashidi@google.com> Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Reviewed-by: Charlie Lao <cclao@google.com>
Charlie Lao 1b450b92 2023-09-15T11:07:25 Vulkan: Fix buffer storage reuse bug when robustAccess is enabled There is an optimization in vulkan backend that when the bufferData is called and current storage size is big enough for new bufferData call, we just reuse the storage. Mean while, when hasRobustAccess() is true, we must use the VkBuffer with the exact user size that glBufferData call provides so that driver can set proper access boundary. In order to satisfy both requirement, if robust resource access is enabled, we create a separate VkBuffer with the exact user provided size but bind to the same memory. There is a bug here that if robustAccess is true, this buffer of user provided size is not been recreated when storage is reused but with different user size (both has same allocation size). This causes we keep using the smaller VkBuffer and subsequently causes missing triangles. This CL clears mBufferWithUserSize when size changes and storage is reused. The other bug here is that previously we are checking isRobustResourceInitEnabled, which is incorrect. We should check hasRobustAccess. This appears works for chrome possibly due to both are enabled. This CL switches it to check hasRobustAccess. This CL also renames mBufferForVertexArray to mBufferWithUserSize to reflect what its true meaning. Bug: chromium:1476475 Change-Id: I843cc3a705f8a582a97bc0307f03aa1eb9fad3ff Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/4864003 Commit-Queue: Charlie Lao <cclao@google.com> Reviewed-by: Geoff Lang <geofflang@chromium.org> Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org>
Amirali Abdolrashidi e7418836 2023-08-16T14:25:52 Vulkan: Add context flushing as OOM fallback * As a new fallback for out-of-memory errors, if an allocation results in device OOM, the context is flushed and the allocation is retried. * Functions related to buffer/image allocations now return a VkResult value instead of angle::Result, which will be bubbled up to a higher level for safer handling. * The OOM is no longer handled at the level where the allocation happens, but is moved up to the context. * Added two functions to ContextVk for allocating memory for images and buffer suballocations, which also include the fallback options. * initBufferAllocation(): Uses BufferHelper::initSuballocation() * initImageAllocation(): Uses ImageHelper::initMemory() * Moved initNonZeroMemory() out of the following functions: * BufferHelper::initSuballocation() * Moved to ContextVk::initBufferAllocation(). * ImageHelper::initMemory() * Moved to ContextVk::initImageAllocation(). * Also moved to new function: ImageHelper::initMemoryAndNonZeroFillIfNeeded(). This function replaced the rest of initMemory() usages outside initImageAllocation(). * New macros for memory allocation * VK_RESULT_TRY() * If the output of the command inside it is not VK_SUCCESS, it will return with the error result from the command. * VK_RESULT_CHECK() * If the output of the command inside it is not VK_SUCCESS, it will return with the input error. * Added a test in which allocation would fail due to too much pending garbage without the fix on some platforms. The test ends once there has been a submission. * New suite: UniformBufferMemoryTest * Added a similar test for flushing texture-related pending garbage. * New suite: Texture2DMemoryTestES3 Bug: b/280304441 Change-Id: I60248ce39eae80b5a8ffe4723d8a1c5641087f23 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/4787949 Reviewed-by: Charlie Lao <cclao@google.com> Commit-Queue: Amirali Abdolrashidi <abdolrashidi@google.com> Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org>
Charlie Lao 7c69116f 2023-08-08T10:14:47 Vulkan: Fix data race with DynamicDescriptorPool Right now DynamicDescriptorPool::destroyCachedDescriptorSet can be called from garbage clean up thread, while simultaneously accessed from context main thread, and data race will happen and cause bugs. This can only happen when the buffer is not being suballocated. In this case, suballocation owns the bufferBlock and bufferBlock gets destroyed when suballocation is destroyed from garbage collection thread. If buffer is suballocated, the shared group owns pool which owns bufferBlocks and they gets destroyed from shared group with the share group lock. This CL avoids this race problem by release the shared cacheKey when the buffer is released, while we still had the shared group lock. Bug: chromium:1469542 Change-Id: Ic1f99e6b6083d63e4efb9c3f408921da62c006ac Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/4761365 Commit-Queue: Charlie Lao <cclao@google.com> Reviewed-by: Yuxin Hu <yuxinhu@google.com> Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org>
Igor Nazarov e24f4519 2023-01-19T02:30:39 Vulkan: Add externalFence into submitCommands() Currently one-off fence in the `queueSubmitOneOff()` is used only in `SyncHelperNativeFence::initializeWithFd()` to submit external fence. Other `queueSubmitOneOff()` calls may use `QueueSerial` instead of a fence. Providing `fence` into `queueSubmitOneOff()` prevents tracking that submission with `QueueSerial`. Therefore using `mUse` to collecting `mFenceWithFd` as garbage will not work as intended. This CL removes `fence` from `queueSubmitOneOff()` and adds optional `externalFence` into `submitCommands()` instead. Providing `externalFence` will cause additional `vkQueueSubmit()` call: - first submission will submit everything as usual except using the `externalFence`. - second, will only submit internal `CommandQueue` fence for `QueueSerial` tracking. As the result of this CL, call to `initializeWithFd()` will always produce two (2) `vkQueueSubmit()` calls. Previously it may be one (1) or two (2) submissions. Future CL will reduce submission count to one (1). If add additional submission into `queueSubmitOneOff()` instead of `submitCommands()`, then maximum number of submissions will be three (3). Bug: angleproject:8117 Change-Id: I6f1ec12682aaab71bfc871e665fec2659df96b26 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/4392877 Reviewed-by: Charlie Lao <cclao@google.com> Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Commit-Queue: Igor Nazarov <i.nazarov@samsung.com>
Shahbaz Youssefi 25e60197 2023-03-31T14:17:26 Vulkan: Unify buffer alloc strategy for uploads and GPU copies With this change, glCopyBufferSubData uses the same buffer allocation strategy as glBufferSubData. Only exception is with buffer self-copies which never allocate a new buffer for simplicity. Additionally, this change allows glCopyBufferSubData to be done on the CPU if possible, i.e. if the source buffer is not being written to by the GPU and whenever the equivalent glBufferSubData would have used a CPU upload. Bug: b/276002151 Change-Id: Ice8df5891c5516b148245d5d6fa9b19b787df4ce Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/4390023 Reviewed-by: Charlie Lao <cclao@google.com> Reviewed-by: Yuxin Hu <yuxinhu@google.com> Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org>
Shahbaz Youssefi 552d4271 2023-03-31T10:59:30 Vulkan: Refactor buffer init logic Bug: b/276002151 Change-Id: I28d3fa34ab11340cc8b38743e87664a514870068 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/4388547 Reviewed-by: Charlie Lao <cclao@google.com> Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org> Reviewed-by: Yuxin Hu <yuxinhu@google.com>
Charlie Lao f4e71351 2023-03-14T14:55:04 Vulkan: Switch acquireAndUpdate to use Buddy pool Based on survey of all app traces we have, it is common that we end up with BufferVk::acquireAndUpdate even though the buffer was created with STATIC usage. This is mostly due to glBufferSubData call on the STATIC usage buffers and on ARM we most likely end up with acquireAndUpdate. Similarly, we also getting into ghostMappedBuffer and mapRangeImpl with STATIC usage buffers, even though with less app traces. Since the usage pattern usually repeats, using generic allocation algorithm has performance penalty. This CL moves these usage to buddy algorithm to ensure alloc/free are fast. This CL and previous CL crrev.com/c/4327290, reduces efootball_pes-2021 frame time from 4.2 ms to 2.87 ms, achieves parity with native GLES on pixel 7 pro. Bug: b/271915956 Change-Id: I56e0195181c77a3130513c74ec8a5075b2b29ea4 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/4321870 Commit-Queue: Charlie Lao <cclao@google.com> Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Reviewed-by: Amirali Abdolrashidi <abdolrashidi@google.com>
Charlie Lao 4982b903 2023-03-14T19:56:51 Revert "Vulkan: Remove inUseAndRespecifiedWithoutData from BufferVk" This reverts commit 755bfe471d23bc2aac5e78493537801dc5f90792. Reason for revert: Causing flaky on pixel 6 angleproject:8082 Original change's description: > Vulkan: Remove inUseAndRespecifiedWithoutData from BufferVk > > BufferVk::setDataWithMemoryType() has one optimization that it tries to > detect glBufferData(target, size, nullptr, usage) and if existing > storage is busy, it immediately reallocate storage. With the > optimization in previous CL (crrev.com/c/4317488), the storage reuse > logic should detect if we can reuse the storage or not. If the size > matches the existing storage's size, then there is no reason we can not > reuse existing storage. Later on when glBufferSubData or > glMapBufferRange is called, there are optimization in those calls that > will detect if we should reallocate storage or not as the further > optimization. This CL removes this check and replies on the other > optimization to handle the storage reallocate (shadowing) if necessary. > This simplifies code and also potentially avoids storage reallocation in > certain usage cases. > > This CL also fixes a test bug in > BufferDataTestES3.BufferDataWithNullFollowedByMap that was calling > glMapBufferRange with MAP_UNSYNCHRONIZED_BIT but incorrectly expecting > GL to do synchronization. > > Bug: b/271915956 > Change-Id: I7901687b3e3e262e77699f14eb8602d8a57eda3e > Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/4322048 > Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> > Commit-Queue: Charlie Lao <cclao@google.com> > Reviewed-by: Amirali Abdolrashidi <abdolrashidi@google.com> Bug: b/271915956 Change-Id: Ie5716b609ab96b96afbe5927f20dfcf2bf5d4db6 No-Presubmit: true No-Tree-Checks: true No-Try: true Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/4338263 Auto-Submit: Charlie Lao <cclao@google.com> Commit-Queue: Rubber Stamper <rubber-stamper@appspot.gserviceaccount.com> Bot-Commit: Rubber Stamper <rubber-stamper@appspot.gserviceaccount.com>
Charlie Lao 755bfe47 2023-03-08T14:31:33 Vulkan: Remove inUseAndRespecifiedWithoutData from BufferVk BufferVk::setDataWithMemoryType() has one optimization that it tries to detect glBufferData(target, size, nullptr, usage) and if existing storage is busy, it immediately reallocate storage. With the optimization in previous CL (crrev.com/c/4317488), the storage reuse logic should detect if we can reuse the storage or not. If the size matches the existing storage's size, then there is no reason we can not reuse existing storage. Later on when glBufferSubData or glMapBufferRange is called, there are optimization in those calls that will detect if we should reallocate storage or not as the further optimization. This CL removes this check and replies on the other optimization to handle the storage reallocate (shadowing) if necessary. This simplifies code and also potentially avoids storage reallocation in certain usage cases. This CL also fixes a test bug in BufferDataTestES3.BufferDataWithNullFollowedByMap that was calling glMapBufferRange with MAP_UNSYNCHRONIZED_BIT but incorrectly expecting GL to do synchronization. Bug: b/271915956 Change-Id: I7901687b3e3e262e77699f14eb8602d8a57eda3e Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/4322048 Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Commit-Queue: Charlie Lao <cclao@google.com> Reviewed-by: Amirali Abdolrashidi <abdolrashidi@google.com>
Charlie Lao d6a25bfa 2023-03-07T15:06:10 Vulkan: Optimize glBufferData call to improve storage reuse If app calls glBufferData with certain size, then calls it again with size 0, and then call it again with same old size again, we should try to reuse the existing storage. When size is zero, with the existing logic, we never free the storage. When glBufferData is called third time with the same size as the first glBufferData call, we expect to reuse the existing storage. But because of the storage reuse logic is comparing buffer's new size to the old size (which is 0), we missed the opportunity to reuse the existing storage. This CL update the reuse logic so that it checks the new size against storage's size (instead of OpenGLES buffer's size) and if we will end up with same sized allocation and same pool and memory type, then we reuse instead of reallocate. This reduces efootball_pes_2021 frame time from 4.670 ms to 4.277 ms on pixel 7 pro. Bug: b/271915956 Change-Id: I6f91e3e85b104eca215b28e7d0bea413ecc4401c Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/4317488 Reviewed-by: Amirali Abdolrashidi <abdolrashidi@google.com> Commit-Queue: Charlie Lao <cclao@google.com> Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org>
Charlie Lao 67ad3ddc 2023-03-06T16:44:36 Vulkan: Relax size limit for dynamicBuffer to pick buddy algorithm If glBufferData's usage is one of the dynamic usage, app may keep calling glBufferData frequently, which means get into suballocation code frequently. There are two suballocation algorithms today: buddy algorithm (faster) and generic (slower). Right now the decision of which algorithm (i.e, which pool) to use is purely based on size or memory type. This CL also utilize usage information so that dynamic usage will pick buddy algorithm with bigger size threshold. mSmallBufferPool is removed and replaced with the BufferPoolPointerArray that gets picked based on allocation algorithm. This CL reduces average frame time of efootball_pes_2021 from 7.518 ms to 4.670 ms on pixel 7 Pro. Bug: b/271915956 Change-Id: I1c2f270ac49f56e6f405501d20691cfbab49e7eb Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/4313685 Reviewed-by: Amirali Abdolrashidi <abdolrashidi@google.com> Commit-Queue: Charlie Lao <cclao@google.com> Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org>
Charlie Lao c402ea1c 2023-02-15T12:01:38 Vulkan: Rename hasUnfinishedUse to hasResourceUseFinished Most usage of hasUnfinishedUse is for !hasUnfinishedUse, and there was feedback that negative API is not preferred. This CL changes it to positive API name. Similarly renamed hasUnsubmittedUse to hasResourceUseSubmitted. Bug: b/267348918 Change-Id: Idb10b0f998ec50116ffb6aada19a98a516e87824 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/4257105 Commit-Queue: Charlie Lao <cclao@google.com> Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org>
Charlie Lao 410d8ba5 2022-12-21T13:27:00 Vulkan: Cleanup ContextVk::hasStartedRenderPass APIs ContextVk has a few hasStartedRenderPass APIs which interpret "start" inconsistently. A RenderPassCommands' life should be notStarted, started, requestEnd, and end (which is equivalent to notStarted). When someone calls onRenderPassFinished on a started renderpass, it does not immediate endRenderPass, but it will set DIRTY_BIT_RENDER_PASS dirty bit so that next draw call will trigger endRenderPass and start a new renderPass. We do not have a name for this state, which adds some confusion. This CL renames the stage between start and onRenderPassFinished to be "active" renderpass, when you have mRenderPassCommandBuffer pointer being valid and you can actively adding draw commands into the renderPass. For this purpose, I haves renamed hasStartedRenderPass to hasActiveRenderPass. This CL also simplifies hasStartedRenderPass implementation to only check mRenderPassCommandBuffer and turned mRenderPassCommands.started as assertion. This CL also changes hasStartedRenderPassWithQueueSerial to actually check mRenderPassCommands.started instead of being "active", so that name reflects what it is actually checking. This CL also changed hasStartedRenderPassWithCommands to hasActiveRenderPassWithCommands to make name and implementation consistent. One added benefit of this is that after this CL we now allow load/store optimization on a started but inactive renderPass as well (for example glInvalidateFramebuffer call after glFenceSync call, or invalidate after FBO blit as demonstrated by MultisampleResolveTest.ResolveD32FSamples tests). Bug: angleproject:7903 Bug: angleproject:7551 Change-Id: I8c8ec4c0d54b9ad0a9e373108dfce6b151c8fe0e Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/4119693 Reviewed-by: Yuxin Hu <yuxinhu@google.com> Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Commit-Queue: Charlie Lao <cclao@google.com>
Charlie Lao 1219f55a 2022-12-07T16:19:37 Vulkan: Remove Resource::isCurrentlyInUse Due to header file include order, this function can not directly made inline. This CL removes the function and replace it with renderer->getUnfinishedUse() to reduce one extra function call of one line function. Bug: b/262048658 Change-Id: Ied33b63d0ec88336a5ce42cf7726f16b2b883b86 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/4089623 Reviewed-by: Yuxin Hu <yuxinhu@google.com> Commit-Queue: Charlie Lao <cclao@google.com> Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org>
Charlie Lao 798b97b8 2022-12-07T15:37:51 Vulkan: mapRangeImpl should call flushImpl if unflushed write BufferVk::mapRangeImpl() want to ensure any GPU write command has been flushed and finished. Right now it calls flushImpl if there is any unflushed access. It should only need to flush if there is any unflushed *write* command. This CL changes check of any access to any write access. This CL also inlines isCurrentlyInUseForWrite/finishGPUWriteCommands and removed these two single line function calls. Bug: b/261772793 Change-Id: I1628ec31eaceb87f82e654cb1f317570ff2f6c12 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/4086972 Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Reviewed-by: Yuxin Hu <yuxinhu@google.com> Commit-Queue: Charlie Lao <cclao@google.com>
Charlie Lao 2e5ca217 2022-11-18T10:44:49 Vulkan: Let each current context has its own QueueSerial. This CL makes every current context has its own queueSerial. At context creation time or when context becomes current, it allocates a QueueIndex from renderer. When it becomes non-current, it releases QueueIndex for others to reuse. This way we significantly reduces the max number of QueueIndexs for reasonable usage. Each CommandBuffer has its own unique QueueSerial and we use that to determine if a resource is being used by the given CommandBuffer. The QueueSerial for RenderPassCommands is deferred until renderPass starts, and when we generate queueSerial for renderPassCommands, we also reserve a range of serials for outsideRenderPassCommands so that we can do incremental submission of outsideRenderPassCommands without need to close renderPassCommands. In rare situation, if that reserved serials runs out, we also close renderPassCommands to ensure the ordering of serials matches ordering of command buffers. With per current context queue serial, this CL is able to set resource queue serial as it is being used. This CL completely removes usage of ResourceUseList class since it was introduced due to deferred setSerial. This CL also get rid of refCount from ResourceUse since there we no longer add it to a ResourceUseList. With that, we also able to remove SharedResourceUse class since access to ResourceUse itself is now thread safe since we are able to make a copy of it when we add it to GarbageList. Because RenderPassCommands now has its own unique QueueSerial as it encodes command, we can use it to detect if a resource is being used by it or not, thus this CL also removes usage of CommandBufferID. Bug: b/255414841 Change-Id: I36dcbeaa7bc996f04e6c04bf9ad44cd0d630f61a Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/4038096 Reviewed-by: Yuxin Hu <yuxinhu@google.com> Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Commit-Queue: Charlie Lao <cclao@google.com>
Charlie Lao 8378032e 2022-11-15T11:47:55 Vulkan: Remove get API for mLastCompletedQueueSerial In preparation for per context queue serial, this CL makes mLastSubmittedQueueSerial and mLastCompletedQueueSerial private to CommandQueue. Before this CL, we have a get function to return the last submitted serial and last completed serial and passing these serials around. This works because the serial is a single uint64_t number. With per context queue serial, this will be an array of serials and there is potential risk associated with access it from different threads. This CL makes these serials private to CommandQueue and when you want to know if GPU is completed with resource, you ask RendererVk/CommandQueue directly. This way we can ensure they have thread safe access in the CommandQueue (no lock is necessary, but all access will be restricted to one class). Bug: b/255414841 Change-Id: Ica565decce4a80588e0b447e179a2b634b55d7c3 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/4021676 Reviewed-by: Yuxin Hu <yuxinhu@google.com> Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Commit-Queue: Charlie Lao <cclao@google.com>
Shahbaz Youssefi fbd7d5fa 2022-10-17T17:20:09 Move thread pool classes to common/ In preparation for access by image_util files. Bug: b/250688943 Change-Id: I24777269a5071eae9a60f939635d01ed7246461f Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/3961454 Reviewed-by: Jamie Madill <jmadill@chromium.org> Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org>
Shahbaz Youssefi 2c351351 2022-08-07T22:31:40 Vulkan: Don't break render pass on read-only buffer updates When uploading to a buffer that is in use by the GPU, we either acquire a new buffer and copy the contents over, or stage the update and do a GPU copy. Ignoring all other conditions, this decision was made based on whether a small or large part of the buffer is being updated; small updates where staged. However, if the current render pass uses the buffer in read-only mode, the staged update would break it (to apply the update). In this change, this situation is detected and the acquire-and-update path is chosen even for small updates. Bug: angleproject:7534 Change-Id: Ie2c0989449dcc7d03695a003cf6f353920f8fb65 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/3812566 Reviewed-by: Charlie Lao <cclao@google.com> Reviewed-by: Amirali Abdolrashidi <abdolrashidi@google.com> Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org>
Shahbaz Youssefi 80022b96 2022-07-26T21:07:04 Vulkan: Fix xfb buffer redefine to smaller size In 89e11878b275b15735eaf273ababfa6fd43a2e3d, a use-after-free bug was fixed where glBufferData redefined a buffer, leading to a change in storage. This was only tested for the case where the new buffer was larger than the old buffer. When the new buffer is smaller however, another issue remains where the buffer size as cached by the transform feedback object used the old object's size. This is worked around in this change, with a fix for the real issue (that the buffer state is updated after calling into the backend instead of before) coming up. Bug: chromium:1345042 Change-Id: I6c9e9344705fefe49926a14cf6ce73ce84305872 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/3788308 Reviewed-by: Amirali Abdolrashidi <abdolrashidi@google.com> Auto-Submit: Shahbaz Youssefi <syoussefi@chromium.org> Reviewed-by: Charlie Lao <cclao@google.com> Commit-Queue: Amirali Abdolrashidi <abdolrashidi@google.com>
Charlie Lao 53d40aed 2022-07-15T15:03:25 Vulkan: Destroy descriptorSet cache when BufferHelper destroyed For atomic counter buffers or other cases, dynamic descriptor is not been used. Right now when such buffer is destroyed, the cache is still lingers around. With this CL, when a new cache entry has been created, we record the cache entry in the BufferHelper. When BufferHelper is destroyed, we also immediately destroy the cache entry since the cache will no longer reused. Bug: b/237686097 Change-Id: I26eee96318fbc003e65318c0b8263dc61092f350 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/3764044 Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Commit-Queue: Charlie Lao <cclao@google.com> Reviewed-by: Yuxin Hu <yuxinhu@google.com>
Geoff Lang 31c13df5 2022-05-30T15:06:26 Revert "Initialize buffer contents separately from BufferImpl::setData" This reverts commit 34cff1a14b635c76a9063b8710e948d04ef98a79. Reason for revert: Speculative revert for Mac M1 WebGL failures. Bug: chromium:1330314 Original change's description: > Initialize buffer contents separately from BufferImpl::setData > > Some backends can initialize buffer data faster than allocating a > zero-filled scratch buffer (GL can map and memset for example). > Allow those backends the opportunity to make these optimizations. > > Verified that GL, D3D and VK backends do not regress by using a > separate set data call. > > Bug: chromium:983167 > Change-Id: Ibcbe6016059434dc36ab3c754df6a24f0a6e5e72 > Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/3039778 > Reviewed-by: Jamie Madill <jmadill@chromium.org> > Reviewed-by: Peng Huang <penghuang@chromium.org> > Commit-Queue: Geoff Lang <geofflang@chromium.org> Bug: chromium:983167 Change-Id: Id1bfa76b832c35fd0b3ade04da16735aa089fdd2 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/3677335 Bot-Commit: Rubber Stamper <rubber-stamper@appspot.gserviceaccount.com> Commit-Queue: Peng Huang <penghuang@chromium.org>
Geoff Lang 34cff1a1 2021-07-19T14:29:35 Initialize buffer contents separately from BufferImpl::setData Some backends can initialize buffer data faster than allocating a zero-filled scratch buffer (GL can map and memset for example). Allow those backends the opportunity to make these optimizations. Verified that GL, D3D and VK backends do not regress by using a separate set data call. Bug: chromium:983167 Change-Id: Ibcbe6016059434dc36ab3c754df6a24f0a6e5e72 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/3039778 Reviewed-by: Jamie Madill <jmadill@chromium.org> Reviewed-by: Peng Huang <penghuang@chromium.org> Commit-Queue: Geoff Lang <geofflang@chromium.org>
Charlie Lao ff011779 2022-05-13T12:36:24 Vulkan: Let texture buffer handle BufferVk's storage change When buffer's storage changed due to glBufferData call, texture buffer code should also respond to this and update the texture descriptor set. This CL merges BufferVkStorageChanged message into InternalMemoryAllocationChanged and removed BufferVkStorageChanged all together. Bug: angleproject:7283 Change-Id: I230ee7268634e747d06eab1954f5a76ecf84c9d6 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/3646955 Commit-Queue: Charlie Lao <cclao@google.com> Reviewed-by: mohan maiya <m.maiya@samsung.com> Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org>
Charlie Lao 6b9c5c8d 2022-04-22T11:06:16 Vulkan: Improve GetStorageMemoryType logic This is follow up from previous CL. For discrete GPU (preferDeviceLocalMemoryHostVisible is disabled), we will get HostVisible memory if any map can be created on it. For non discrete GPU, this CL also adds the check if the buffer will never gets updated, we just use DeviceLocal memory without HostVisible bit. Bug: angleproject:7047 Change-Id: I73bdc133badbf01c098db23563b30898d4d16a41 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/3602943 Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Reviewed-by: Yuxin Hu <yuxinhu@google.com> Commit-Queue: Charlie Lao <cclao@google.com>
Charlie Lao 3b38b379 2022-04-20T10:44:24 Vulkan: Add feature avoid HOST_VISIBLE and DEVICE_LOCAL combination Discrete GPUs device local memory usually is not CPU accessible. This adds a feature flag to control that. Fixed bug in BufferVk that when mapRangeImpl is called from angle internal, unmapImpl was using front end mapping parameters that is incorrect. We have to cache the mapping parameters in the backend to hangle the mapRangeImpl/unmapImpl calls from internal. Fixed the test bug in ComputeShaderTest.BufferImageBufferMapWrite that we are calling glMapBufferRange with GL_MAP_READ_BIT but are actually writing to the map pointer. This should result in undefined behavior per spec. Fixed the test bug in GLSLTest.* that VerifyBuffer calls glMapBufferRange, but was giving incorrect length which result in data only been partially copied. This bug was hidden due to previously all buffers are CPU accessible and there is no copy needed. Fixed the test bug in ReadPixelsPBOTest.* and ReadPixelsPBONVTest.* that calls glMapBufferRangeEXT, but was giving incorrect length which result in data only been partially copied. This bug was hidden due to previously all buffers are CPU accessible and there is no copy needed. Added new skipped syncval messages. Because this CL triggers a copyToBuffer call for some of the buffers and that changes the syncval message signature for the same reasons (i.e, feedback loop or synval does not know the exact range of buffer been used for vertex buffers etc). Bug: angleproject:7047 Change-Id: I28c96ae0f23db8e5b51af8259e5b97e12e8b91f2 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/3597711 Reviewed-by: Yuxin Hu <yuxinhu@google.com> Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Commit-Queue: Charlie Lao <cclao@google.com>
Charlie Lao d3dbaa8d 2022-04-19T10:45:57 Vulkan: Remove BufferVk::mHasBeenReferencedByGPU This variable was added before due to we used to only track GPU progress on the entire buffer instead of suballocation. Now each suballocation tracks its own GPU progress, so this is no longer needed. Bug: b/201826021 Change-Id: I2c2b1744b624e028fd905f0752a4264327620515 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/3594620 Reviewed-by: Jamie Madill <jmadill@chromium.org> Reviewed-by: Yuxin Hu <yuxinhu@google.com> Commit-Queue: Charlie Lao <cclao@google.com>
Mohan Maiya 37cdf93d 2022-04-15T12:49:09 Vulkan: Acquire a new buffer even when size is unchanged If a buffer is respecified using glBufferData with no changes to size but client data pointer is null, we need to acquire a new BufferHelper to avoid affecting the results of previously submitted draws. Test: BufferDataTestES3.BufferDataWithNullFollowedByMap*Vulkan Bug: angleproject:7211 Change-Id: Icc20fe3509f94098c7a15988a9ebc888b06fd3c8 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/3588955 Reviewed-by: Jamie Madill <jmadill@chromium.org> Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Commit-Queue: mohan maiya <m.maiya@samsung.com>
Amirali Abdolrashidi bdb52cb8 2022-04-08T13:20:54 Vulkan: Remove retains before acquireBufferHelper * Removed the retainReadOnly() functions in BufferVk, since the BufferHelper object is now moved to a temporary buffer and retaining it is no longer necessary. Bug: angleproject:7103 Change-Id: Id5da88d7cfa4d7a8532eb596f552c70a9ff1d358 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/3579862 Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Reviewed-by: Charlie Lao <cclao@google.com> Commit-Queue: Amirali Abdolrashidi <abdolrashidi@google.com>
Charlie Lao fe28a429 2022-03-30T15:34:49 Vulkan: Create buffer for vertex array if robust enabled If robust access is enabled (i.e., chrome), we want to ensure vulkan driver never access beyond that OpenGL buffer boundary. But with suballocation from BufferPool, we are using the same VkBuffer for all suballocations from the same BufferBlock. this combined with the fact that there is no size information in the vkCmdBindVertexBuffers, it means vulkan driver can not properly ensure vertex access not go beyond the subrange. It can only guarantee not access beyond the entire VkBuffer size. This CL creates a dedicated vkBuffer object and bind it to the suballocation of the vkDeviceMemory so that vulkan driver will see the exact range of the subrange instead of entire buffer. Since we may allocated more memory than actual requested size and the extra paddings are not zero filled , user size is used to create this vkBuffer. This is only enabled when robust access is enabled. This CL also ported webgl conformance test out-of-bounds-index-buffers.html and out-of-bounds-array-buffers.html to end2end test. Bug: chromium:1310038 Change-Id: I3499ae600028149b1039082e5011232b3e4e5e80 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/3553940 Reviewed-by: Yuxin Hu <yuxinhu@google.com> Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Commit-Queue: Charlie Lao <cclao@google.com>
Charlie Lao 527ceb73 2022-02-07T18:25:02 Vulkan: Switch XFB counter buffer to suballocation Bug: b/205337962 Change-Id: I2e26fa3ab150b858f07665459fa108440af988d5 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/3402333 Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Reviewed-by: Ian Elliott <ianelliott@google.com> Reviewed-by: Jamie Madill <jmadill@chromium.org> Commit-Queue: Charlie Lao <cclao@google.com>
Charlie Lao 419bca3f 2022-01-19T18:22:56 Vulkan: Use Vulkan API directly for BufferPool's buffer allocation There are two motivations in this CL. 1) There are two layers of suballocator right now. BufferPool provides first suballocation. It tries to allocate from one of the buffers in the pool. If that failed, it try to create a new BufferBlock (i.e, a VkBuffer). Right now that calls into VMA which creates another pool to allocate a buffer. We really only need one layer of suballocation. And 2) Because we uses VMA to do actual VkBuffer allocation, we have to use Allocator object. But VMA can not handle external buffers, so we end up having a BufferMemory class just to handle two different cases. This CL attempts to clean up this by let ANGLE calling into vulkan driver directly for the actual buffer allocation, just like we did for VkImages. By doing so, we able to remove BufferHelper::mMemory data member as well as BufferMemory class all together. External memory is now treated exactly the same at BufferHelper. Bug: b/205337962 Change-Id: I7c183ab0fd7d9aceb6cf416b0214c300798bc010 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/3402740 Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Reviewed-by: Jamie Madill <jmadill@chromium.org> Commit-Queue: Charlie Lao <cclao@google.com>
Charlie Lao d2354968 2022-01-20T10:59:05 Vulkan: Rename BufferHelper::initFor* to allocateFor* Simply name change per feedback from other CL's review. Bug: b/205337962 Change-Id: Ieb53ed9a2922d09716a1219eb340fe273e5f1807 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/3402882 Reviewed-by: Jamie Madill <jmadill@chromium.org> Commit-Queue: Charlie Lao <cclao@google.com>
Charlie Lao 1b5efe51 2022-01-19T14:04:35 Vulkan: Rename SubAllocation to Suballocation Simply a name change to make it one word. No functional change is expected. Bug: b/205337962 Change-Id: Ic505536821f18141c0d036b13d9aa81554a8bafd Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/3403158 Reviewed-by: Jamie Madill <jmadill@chromium.org> Commit-Queue: Charlie Lao <cclao@google.com>
Charlie Lao 15439f8e 2022-01-13T14:58:41 Vulkan: Remove BufferMemoryAllocator This class was added in crrev.com/c/3036256. The original intention was to use VMA to implement buffer suballocation. Because VMA itself does not support buffer suballocation, I was thinking to use VMA custom pool to implement it and this class was intended to wrap all these functionality into one class. But now thanks to Jamie's effort, VMA exported generic suballocation algorithm via API and we have implemented buffer suballocation using that virtual allocation API. So this BufferMemoryAllocator class is really no longer useful. This CL mostly reverted that CL and flatten out the buffer allocation call to directly use VMA's Allocator object. Bug: b/205337962 Change-Id: I0336056e440f39e2ff49fee8e0ff4b1f355cefe4 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/3244022 Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Reviewed-by: Jamie Madill <jmadill@chromium.org> Commit-Queue: Charlie Lao <cclao@google.com>
Shahbaz Youssefi acd8fc76 2021-12-16T01:05:02 Vulkan: Distinguish RP and outside-RP command buffer types What goes inside and outside a render pass command buffer is largely mutually exclusive. Moreover, the size and frequency of allocations is different between the two. This change distinguishes the C++ types used for inside and outside render pass command buffers: - The type now documents which command buffer a function is able to receive. - `isRenderPass` flag passing, checking and asserting is largely removed. - A follow up change experiments with using different (Vulkan vs ANGLE) secondary command buffers for inside and outside RP command buffers. - A future change could specialize the pool behaviors per command buffer type. Bug: angleproject:6811 Change-Id: Ia4bc669d26ac7e94e8a0dfb9b361666c82f42cc3 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/3344373 Reviewed-by: Tim Van Patten <timvp@google.com> Reviewed-by: Jamie Madill <jmadill@chromium.org> Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org>
Charlie Lao 60a8b593 2022-01-05T11:06:16 Vulkan: Remove std::unique_ptr usage from BufferVk::mBuffer BufferVk::mBuffer is std::unique_ptr mostly because BufferHelper object itself does not support move assignment. Now crrev.com/c/3366855 added move assignment support, we can now use BufferHelper directly. The main downside I can see is that in BufferVk::ghostMappedBuffer() and BufferVk::acquireAndUpdate() functions where we have to use move assignment of mBuffer object, it becomes slightly more expensive than moving pointer. But switch to using BufferHelper directly makes code simpler and other access to mBuffer (which is more common usage) slightly cheaper by removing one pointer indirection. Bug: b/208323792 Change-Id: Ia7e7731e284eb6c76db954fef194e9d1de82174b Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/3362252 Reviewed-by: Jamie Madill <jmadill@chromium.org> Reviewed-by: Tim Van Patten <timvp@google.com> Commit-Queue: Charlie Lao <cclao@google.com>
Charlie Lao ad27d5d6 2021-12-21T11:22:30 Reland "Vulkan: Consolidate all vertex conversion buffers to shared pool" This is a reland of cca412cd8b349b7281727c50f2a59d115fd90a05 Further inspection shows it was red-herring. The original CL does not have the un-intended diff that I saw in the commit email. This is try to reland the original CL without any modification. Original change's description: > Vulkan: Consolidate all vertex conversion buffers to shared pool > > There are various conversion buffers that holds converted vertex or > element or index data. They are DynamicBuffer for now. This CL switches > them to use the shared group buffer pool. With this change, all > allocation is represented by a BufferHelper object instead of an offset. > I am able to remove the offset arguments from a lot of APIs. > > Bug: b/208323792 > Change-Id: Ib611beb0c16cddbdd9ddf7b8961c439da9fa5180 > Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/3352489 > Reviewed-by: Tim Van Patten <timvp@google.com> > Reviewed-by: Jamie Madill <jmadill@chromium.org> > Commit-Queue: Charlie Lao <cclao@google.com> Bug: b/208323792 Change-Id: I90852ad38c2b9ac423800bb6854757bcc17cd166 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/3370602 Reviewed-by: Tim Van Patten <timvp@google.com> Commit-Queue: Charlie Lao <cclao@google.com>
Charlie Lao 4e85bdd9 2022-01-06T17:06:25 Revert "Vulkan: Consolidate all vertex conversion buffers to shared pool" This reverts commit cca412cd8b349b7281727c50f2a59d115fd90a05. Reason for revert: There is accidental code merge bug left in. Original change's description: > Vulkan: Consolidate all vertex conversion buffers to shared pool > > There are various conversion buffers that holds converted vertex or > element or index data. They are DynamicBuffer for now. This CL switches > them to use the shared group buffer pool. With this change, all > allocation is represented by a BufferHelper object instead of an offset. > I am able to remove the offset arguments from a lot of APIs. > > Bug: b/208323792 > Change-Id: Ib611beb0c16cddbdd9ddf7b8961c439da9fa5180 > Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/3352489 > Reviewed-by: Tim Van Patten <timvp@google.com> > Reviewed-by: Jamie Madill <jmadill@chromium.org> > Commit-Queue: Charlie Lao <cclao@google.com> Bug: b/208323792 Change-Id: I18bba207d1d8bb76dff32d9855a744dba93bc6d6 No-Presubmit: true No-Tree-Checks: true No-Try: true Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/3370601 Bot-Commit: Rubber Stamper <rubber-stamper@appspot.gserviceaccount.com> Commit-Queue: Charlie Lao <cclao@google.com>
Charlie Lao cca412cd 2021-12-21T11:22:30 Vulkan: Consolidate all vertex conversion buffers to shared pool There are various conversion buffers that holds converted vertex or element or index data. They are DynamicBuffer for now. This CL switches them to use the shared group buffer pool. With this change, all allocation is represented by a BufferHelper object instead of an offset. I am able to remove the offset arguments from a lot of APIs. Bug: b/208323792 Change-Id: Ib611beb0c16cddbdd9ddf7b8961c439da9fa5180 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/3352489 Reviewed-by: Tim Van Patten <timvp@google.com> Reviewed-by: Jamie Madill <jmadill@chromium.org> Commit-Queue: Charlie Lao <cclao@google.com>
Charlie Lao 43b0e92b 2021-12-13T15:38:32 Vulkan: Consolidate mHostVisibleBufferPool and mMapInvalidate BufferVk::mHostVisibleBufferPool is allocated when BufferVk::mBuffer is not hostvisible and we need to map it. In that case mHostVisibleBufferPool is allocated and data copied from mBuffer to it and the pointer to mHostVisibleBufferPool is returned to user. BufferVk::mMapInvalidateRangeStagingBuffer is used when map is called on a small range. In this case we allocate memory for the small range of buffer and return that intead of waiting for entire buffer for GPU to finish. Also when BufferSubData is called, we also needs to allocate a staging buffer and issue a copyBuffer from staging buffer to main buffer. This CL consolidate all these three usage cases into one mStagingBuffer. It removes mHostVisibleBufferPool and mMapInvalidateRangeStagingBuffer from BufferVk class. This makes overall logic of managing data consistency much simpler as well since we only have two buffers: The main buffer storage mBuffer or mStagingBuffer. And mIsStagingBufferMapped tracks if mStagingBuffer is the one actually mapped to user or not so that at unmap time we know if we should flush the data to mBuffer or not. Bug: b/208323792 Change-Id: I4f0c79a2d86da1a43844ed2ba83ddeb7dd4a5c0b Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/3335216 Reviewed-by: Lingfeng Yang <lfy@google.com> Reviewed-by: Jamie Madill <jmadill@chromium.org> Commit-Queue: Charlie Lao <cclao@google.com>
Charlie Lao 2f3e6cc0 2021-12-13T17:40:18 Vulkan: Remove mShadowBuffer from BufferVk class. The shadow buffer was initially designed to avoid synchronization in glMapBuffer call while buffer itself is still busy. There are many optimization done inside BufferVk::mapImpl that try to avoid wait for GPU as much as we can by distinguish GPU write versus read, by detecting map call read/write intention by checking access bit, and finally by allocating a staging buffer to return a CPU friendly copy of data to caller. This shadow buffer implementation also have known bugs that are not keeping data in sync. With all these optimization added after initial mShadowBuffer implementation, I believe we do not have a good reason to still keep mShadowBuffer. And this has been disabled for months in main branch. This CL removes this code path completely which makes code a lot simpler. Bug: b/208323792 Change-Id: Ie5999e38b6120a371ec2e969f196e4754ebd0f8d Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/3313333 Reviewed-by: Jamie Madill <jmadill@chromium.org> Reviewed-by: mohan maiya <m.maiya@samsung.com> Reviewed-by: Tim Van Patten <timvp@google.com> Commit-Queue: Charlie Lao <cclao@google.com>
Charlie Lao 6c894e82 2021-11-04T14:49:41 Vulkan: Replace BufferVk::getBufferAndOffset() with getBuffer() Now BufferHelper class already keeps offset information. There is no reason for BufferVk to have that information any more. Bug: b/205337962 Change-Id: I6e014fb480bfcd5018ef9231b0fb87a50021f179 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/3266147 Reviewed-by: Jamie Madill <jmadill@chromium.org> Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Commit-Queue: Charlie Lao <cclao@google.com>
Charlie Lao 83a670ab 2021-10-29T09:12:26 Vulkan: Implement BufferPool using VMA's virtual allocator VMA's allocation calls used to be sub-allocating a pool of memory. What we really want is sub-allocate a VkBuffer object. VMA recently added support to expose the underlying range allocation algorithm via APIs, which user can use it to sub-allocate any object. This CL uses that new virtual allocation API to sub-allocate from a pool of VkBuffers. In this CL we only switched BufferVk::mBuffer to sub-allocate from the BufferPool object. Bug: b/205337962 Change-Id: Ia6ef00c22e58687e375b31bc12ac515fd89f3488 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/3266146 Reviewed-by: Jamie Madill <jmadill@chromium.org> Reviewed-by: Tim Van Patten <timvp@google.com> Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Commit-Queue: Charlie Lao <cclao@google.com>
Amirali Abdolrashidi 89e11878 2021-12-06T09:42:46 Vulkan: Fix the UAF issue with BufferData * Fixed the use-after-free issue with stale buffer handles after calling BeginTransformFeedback. * Added an observer for TransformFeedbackVk to update the buffer handles when buffer's storage is changed and the buffer update type is StorageRedefined. * Added a function to TransformFeedbackVk::onDestroy() to release the counter buffers in order to avoid crash due to TransformFeedbackVk::end() not being called, e.g., as a result of no glEndTransformFeedback() calls. Bug: chromium:1274316 Change-Id: I8ed477f36e6ff89dd4764bb59af564c69efe33e2 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/3321789 Reviewed-by: Jamie Madill <jmadill@chromium.org> Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Commit-Queue: Amirali Abdolrashidi <abdolrashidi@google.com>
Hailin Zhang 7b13a9ac 2021-12-09T18:37:59 Vulkan: Fix dynamic partial update buffer data issue. add test case for dynamic update buffer data. Signed-off-by: Hailin Zhang<hailinzhang@google.com> Bug: b/207714894 Change-Id: I8c1e93d152847c3162c0e2dd49abe3d899c859a0 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/3328869 Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Reviewed-by: Charlie Lao <cclao@google.com> Commit-Queue: Hailin Zhang <hailinzhang@google.com>
Hailin Zhang 3be551d7 2021-12-08T16:44:56 fix directUpdate buffer pointer issue. inside mapWithOffset already add the mBufferOffset Signed-off-by: Hailin Zhang<hailinzhang@google.com> Bug: b/207714894 Change-Id: Ia400bccbef1abc756cd8155e93a775338a30e8b9 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/3325025 Reviewed-by: Lingfeng Yang <lfy@google.com> Reviewed-by: Charlie Lao <cclao@google.com> Commit-Queue: Hailin Zhang <hailinzhang@google.com>
Lingfeng Yang 27bc56c6 2021-11-15T18:18:53 Vulkan: MAP_UNSYNCHRONIZED_BIT: Skip ghosting/idling Respect the following spec language: No GL error is generated if pending operations which source or modify the buffer overlap the mapped region, but the result of such previous and any subsequent operations is undefined Test: cpu time improves in unsync case in perf-tests/MapBufferRange.cpp Bug: angleproject:6680 Change-Id: I6133952546735aced6e6ee8468ef2ac695316fb6 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/3272018 Commit-Queue: Lingfeng Yang <lfy@google.com> Reviewed-by: Charlie Lao <cclao@google.com> Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Reviewed-by: Jamie Madill <jmadill@chromium.org> Reviewed-by: Tim Van Patten <timvp@google.com>
Lingfeng Yang 647a703e 2021-11-12T13:48:06 Vulkan: Reorder logic in BufferVk::mapRange This CL flattens the logic, ordering read case first, then write, and simpler cases before more complex ones. This is to prepare for an optimization where we ignore certain paths if MAP_UNSYNCHRONIZED_BIT is set. No change in functionality or performance is expected. Bug: angleproject:6680 Change-Id: I0a2e9ee969216c90353eac7af6dabf648dea2173 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/3279615 Commit-Queue: Lingfeng Yang <lfy@google.com> Reviewed-by: Tim Van Patten <timvp@google.com> Reviewed-by: Charlie Lao <cclao@google.com> Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Reviewed-by: Jamie Madill <jmadill@chromium.org>
Shahbaz Youssefi dbc0c646 2021-11-06T01:09:26 Vulkan: Output the reason for RP closure in command buffer To make it easier when viewing the command buffer in a graphics debugger, this change inserts a marker just before closing the render pass that specifies why the render pass was closed. Bug: angleproject:2472 Change-Id: I862e500cd58332d6e199c853315c560fe6a73dc2 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/3265609 Reviewed-by: Jamie Madill <jmadill@chromium.org> Reviewed-by: Tim Van Patten <timvp@google.com> Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org>
Lingfeng Yang cd822868 2021-10-28T13:00:09 Vulkan: MAP_INVALIDATE_RANGE_BIT: shadow or exclude One method of dealing with glMapBufferRange + range invalidation; treat it like bufferSubData and stage the update. Another method is to ghost the buffer but copy only memory outside the invalidated range. This CL pursues a policy where if less than half of the buffer is invalidated, we stage. Otherwise, we ghost and copy only memory outside the invalidated range. DynamicBuffer is chosen over DynamicShadowBuffer because it turns out to end up implicitly tracking all active invalidate ranges (through its freelist), and performs buffer copy on GPU. if we use a DynamicShadowBuffer and then BufferVk::stagedUpdate, it's the same thing but more work (an extra memcpy into the staging buffer). To make this clear, we split the logic of stagedUpdate into two parts, the allocation/map, and the flush, and reuse one half in glMapBufferRange, and the other half in glUnmapBuffer. Test: Faster performance in MapBufferRange perf test, no non-noisy regress in trace tests Bug: angleproject:6634 Change-Id: Ie2e6a9586824b8cb59a97419bb8052acd1de2033 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/3251686 Reviewed-by: Tim Van Patten <timvp@google.com> Reviewed-by: Charlie Lao <cclao@google.com> Reviewed-by: Jamie Madill <jmadill@chromium.org> Commit-Queue: Lingfeng Yang <lfy@google.com>
Charlie Lao bae19e06 2021-10-26T13:35:57 Vulkan: Avoid unnecessary wait if mapBufferRange indicates read only When we call BufferVk::mapRangeImpl(), both from internal code paths for data reads or due to glMapBufferRange call, we are not passing the access bit to the call. This CL passes the proper access bits to the call and only wait for GPU writes to finish if access is for read only. This CL also adds access bitfield to the BufferVk::mapImpl() API and have various callers pass in the proper access bits as well. Bug: b/203582620 Change-Id: Ica8493c902dbd7b15996266c81ce0fd4dbfc2520 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/3245487 Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Reviewed-by: Tim Van Patten <timvp@google.com> Commit-Queue: Charlie Lao <cclao@google.com>
Charlie Lao 6b315a78 2021-10-26T19:04:06 Revert "Vulkan: Let BufferVk call into VMA for allocation when possible" This reverts commit 894ce75fb2f75e718ce76e466b8938524f65ac07. Reason for revert: crbug.com/1253325 Original change's description: > Vulkan: Let BufferVk call into VMA for allocation when possible > > Previously BufferVk class maintains a DynamicBuffer pool per BufferVk > object. This CL makes BufferVk skip DynamicBuffer pool in most cases and > do its own BufferHelper allocation directly. DynamicBuffer pool is only > used when desired, which is controled by a flag. With this CL, only > UBO/SSBO/AtomicBuffer will still use DynamicBuffer pool if the buffer > has to be allocated more than once. > > Bug: b/195588159 > Change-Id: I3aa08cef10ee9ee9f01f16403c6fbb99b37f4a8a > Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/2901241 > Commit-Queue: Charlie Lao <cclao@google.com> > Reviewed-by: Tim Van Patten <timvp@google.com> > Reviewed-by: Jamie Madill <jmadill@chromium.org> Bug: b/195588159 Change-Id: Iecda3baa6bc887fa0caa86ab076994cae7c10f93 No-Presubmit: true No-Tree-Checks: true No-Try: true Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/3244257 Reviewed-by: Charlie Lao <cclao@google.com> Commit-Queue: Charlie Lao <cclao@google.com> Bot-Commit: Rubber Stamper <rubber-stamper@appspot.gserviceaccount.com>
Charlie Lao 894ce75f 2021-05-21T10:59:14 Vulkan: Let BufferVk call into VMA for allocation when possible Previously BufferVk class maintains a DynamicBuffer pool per BufferVk object. This CL makes BufferVk skip DynamicBuffer pool in most cases and do its own BufferHelper allocation directly. DynamicBuffer pool is only used when desired, which is controled by a flag. With this CL, only UBO/SSBO/AtomicBuffer will still use DynamicBuffer pool if the buffer has to be allocated more than once. Bug: b/195588159 Change-Id: I3aa08cef10ee9ee9f01f16403c6fbb99b37f4a8a Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/2901241 Commit-Queue: Charlie Lao <cclao@google.com> Reviewed-by: Tim Van Patten <timvp@google.com> Reviewed-by: Jamie Madill <jmadill@chromium.org>
Jamie Madill 123ba58d 2021-10-14T11:56:35 Vulkan: Remove "last submitted serial". This fixes race conditions with the async command processor. Instead of querying specific serial numbers, we ask the command queue to either wait for idle, or return the answer to "are you busy" directly. Bug: b/172704839 Change-Id: I06a8268d9b58d8c33b783af00ca74979ee158316 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/3223641 Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org>
Charlie Lao ea580484 2021-10-05T13:40:01 Vulkan: Add feature flag to prefer CPU copy instead of staged update For ARM GPU, use GPU to do buffer to buffer copy has performance penalty due to potential bubble in the vertex pipeline. This CL adds a feature flag preferCPUForBufferDataSubData so that we can enable this behavior for ARM GPUs. This CL also tracks if GPU has referenced this BufferVk's storage since it got new storage. Due to sub-allocation, we may get a new sub-range of the same BufferHelper object when allocating new storage. But we currently do not have a way to track GPU progress of the sub-range of a buffer. So we will end up using BufferHelper's queueSerial to decide if it is still GPU busy or not. This CL adds mHasBeenReferencedByGPU boolean variable that will set to false when we got a new allocation and set to true as soon as buffer is been referenced by any GPU command. We use this to avoid checking queueSerial if it never been referenced by GPU. This is a temporary workaround for the bug, the full fix is tracked by https://issuetracker.google.com/201826021 Bug: b/200067929 Change-Id: I231fb0a678b0165a2ce1775d0aa4dbe7512fb4a8 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/3183398 Commit-Queue: Charlie Lao <cclao@google.com> Reviewed-by: Jamie Madill <jmadill@chromium.org> Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Reviewed-by: Tim Van Patten <timvp@google.com>
Charlie Lao 74b2886f 2021-09-22T13:49:57 Vulkan: Try to use CPU to copy when preserving old buffer data When glBufferSubData is called, we may acquire a new buffer if buffer is still GPU busy. When this happened, we have to preserve buffer content if old buffer has valid data in it. Instead of always use GPU to do copy, this CL will check if GPU is not writing to the buffer, we will just use CPU to do the copy form old buffer to new buffer from the ranges outside subData, controlled by the feature flag preferCPUWhenPreservingBufferData. Bug: b/200067929 Change-Id: I42053104b2be8da5f399cca92e934254988f2fd8 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/3177322 Reviewed-by: Tim Van Patten <timvp@google.com> Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Reviewed-by: Jamie Madill <jmadill@chromium.org> Commit-Queue: Charlie Lao <cclao@google.com>
Charlie Lao e437c4ad 2021-09-22T09:15:19 Vulkan: Only preserve buffer data when BufferVk has valid data When we receive glBufferSubData call and GPU is still accessing the buffer, we have two code paths to update data. If subData is more than half of the entire buffer range, we choose to acquire a new buffer and use DMA to copy the rest of buffer that outside of subData range from the old buffer back to the new buffer so that existing buffer data is being preserved. Otherwise we stage subData to use GPU buffer to buffer copy later on when buffer is been used. The reasoning behind is to minimize the amount of data copy. The improvement here is that if previously app called glBufferData with null pointer, we really do not have any valid data in the buffer and there is no need to preserve the existing buffer data. This CL tracks whether buffer has any valid data or not and also put this into consideration when we pick which code path to go. We also use this information to avoid preserve the existing data in BufferVk::acquireAndUpdate Bug: b/200067929 Change-Id: I266dd93bed2d3c07e3a5af3e4e613e7f6023b393 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/3176500 Commit-Queue: Charlie Lao <cclao@google.com> Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Reviewed-by: Jamie Madill <jmadill@chromium.org> Reviewed-by: Tim Van Patten <timvp@google.com>
Jamie Madill cf8c5678 2021-09-17T13:16:36 Vulkan: Don't sync VAOs after BufferSubData calls. We still need to syncState after buffers that contain converted attributes are updated. Includes a perf regression test. Bug: angleproject:6371 Change-Id: I54227fc43e7b3fe79072da7783dab0177ccb0486 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/3182706 Commit-Queue: Jamie Madill <jmadill@chromium.org> Reviewed-by: Tim Van Patten <timvp@google.com> Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org>
Jamie Madill cebca7c2 2021-09-24T07:55:38 Texture: Ignore buffer contents changed events. Texture doesn't need to care when its attached buffer gets different contents via a SubData call. This CL updates the BufferVk logic to ensure that SubData calls trigger a storage changed notification when there's a new storage, and otherwise Texture can ignore SubData calls. Will make it easier to split "contents" changed notifications to their own event, for optimizing Vertex Buffer updates. Bug: angleproject:6371 Change-Id: I4f15ad3ad2da5d838bd51fb065184b7344b188d8 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/3181562 Commit-Queue: Jamie Madill <jmadill@chromium.org> Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Reviewed-by: Tim Van Patten <timvp@google.com>
Tim Van Patten a1b829dd 2021-09-22T18:29:08 Vulkan: Retain src buffer in acquireAndUpdate() It's possible for acquireBufferHelper() to garbage collect the original (src) buffer before copyFromBuffer() has a chance to retain it, so it must be retained before then. Previously, we were relying on the retain calls in copyFromBuffer() to be sufficient. However, there is a race condition when the asynchronous CommandProcessor is enabled, since the garbage could be freed before copyFromBuffer() has a chance to retain the buffer (and allow destroyIfComplete() to skip destroying the object). For the full context, see the comment chain here: https://chromium-review.googlesource.com/c/angle/angle/+/3146319/16..24/src/libANGLE/renderer/vulkan/BufferVk.cpp#b833 Bug: angleproject:5971 Change-Id: I7c812069343fdad948189d696bfebab8da68c1a3 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/3179866 Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Reviewed-by: Charlie Lao <cclao@google.com> Commit-Queue: Tim Van Patten <timvp@google.com>
Tim Van Patten 68c0da83 2021-09-15T12:00:08 Vulkan: Inform frontend when new buffer is allocated When a buffer is mapped with GL_MAP_INVALIDATE_BUFFER_BIT while it's currently in use, the Vulkan backend will allocate a new buffer, map it, and return the pointer to the new buffer. This was missing a call to inform the frontend that a new buffer was allocated, causing the old buffer data to be accessed in subsequent draw calls. The fix is to add a onStateChange(angle::SubjectMessage::SubjectGhosted) call when the new buffer is allocated, to inform the frontend. Bug: angleproject:5971 Bug: angleproject:6396 Test: TextureBufferTestES31.MapTextureBufferInvalidateThenWrite Change-Id: I9984d1049ab4d6a2066f4440fc710c9b93ff6ab8 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/3163244 Commit-Queue: Tim Van Patten <timvp@google.com> Reviewed-by: Jamie Madill <jmadill@chromium.org> Reviewed-by: Charlie Lao <cclao@google.com> Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org>
Tim Van Patten 9158436e 2021-07-31T18:26:16 Vulkan: glMapBuffer(): Create new buffer (Buffer Ghosting) When glMapBuffer() is called, if the buffer is in use but not being written to by the GPU: 1.) Create a new buffer. 2.) Copy the contents of the old buffer into the new buffer. 3.) Map the new buffer and return the pointer. Creating a new buffer prevents ending the renderpass and flushing the commands to allow the in-use buffer to be mapped. This change increases Idle Heroes performance from 40FPS to 125FPS. Bug: angleproject:5971 Test: VulkanPerformanceCounterTest.MappingGpuReadOnlyBufferGhostsBuffer Test: BufferDataTest.MapWriteArrayBufferDataDrawQuad Test: BufferDataTest.MapWriteArrayBufferDataDrawArrays Change-Id: I1d433d179f9f5110a948f191c5aedda5397acac8 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/3065799 Commit-Queue: Tim Van Patten <timvp@google.com> Reviewed-by: Jamie Madill <jmadill@chromium.org> Reviewed-by: Charlie Lao <cclao@google.com>
Tim Van Patten 57d59e83 2021-09-07T17:41:11 Vulkan: Add ResourceWrite to track Read and Write Access vk::Resource currently only tracks accesses in general, not which type of access is being performed. This CL adds the new class ResourceWrite to track whether the access is a Read or Read/Write access and when the access completes. This allows a follow-on CL to know when a buffer is being written to by the GPU or if the GPU is only reading from a buffer. Tracking write accesses to buffers is required when attempting to "Ghost" (duplicate) GPU-read-only buffers to prevent breaking the render pass when the CPU maps the buffer memory. Bug: angleproject:5971 Test: ComputeShaderTest.ImageBufferMapWrite Change-Id: I965e3e75730719ccce77334744ae4feae33c6101 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/3146319 Commit-Queue: Tim Van Patten <timvp@google.com> Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Reviewed-by: Jamie Madill <jmadill@chromium.org> Reviewed-by: Charlie Lao <cclao@google.com>
Shahbaz Youssefi eaa6961d 2021-05-17T18:56:53 Revert "Vulkan: Disable BufferVk suballocation" This reverts commit 76181384075c6eb0a5788bf1b732a1e05f6d73bc. Reason for revert: Bug exposed by this is fixed in https://chromium-review.googlesource.com/c/angle/angle/+/2896168 Original change's description: > Vulkan: Disable BufferVk suballocation > > There are still unresolved bugs. > > Bug: angleproject:5719 > Bug: chromium:1209197 > Change-Id: I6a971c421d0ae266404d1ecbf8741a9747a4e809 > Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/2897545 > Reviewed-by: Cody Northrop <cnorthrop@google.com> > Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> > Commit-Queue: Tim Van Patten <timvp@google.com> Bug: angleproject:5719 Bug: chromium:1209197 Change-Id: I5c24b5f6476eab98ed5a7b90b3d1796ffc7ca106 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/2896169 Bot-Commit: Rubber Stamper <rubber-stamper@appspot.gserviceaccount.com> Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org> Reviewed-by: Jamie Madill <jmadill@chromium.org>
Shahbaz Youssefi 8bd3d7d5 2021-05-17T13:45:33 Vulkan: Fix a bug releasing DynamicBuffer-owned buffer There was one instance of BufferVk releasing a buffer it had allocated from a DynamicBuffer. This shouldn't have happened as the DynamicBuffer owns the buffers. Bug: angleproject:5720 Change-Id: I435512f4bb099130126bf3efb48a238fcd9f3ddb Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/2896168 Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org> Reviewed-by: Mohan Maiya <m.maiya@samsung.com> Reviewed-by: Jamie Madill <jmadill@chromium.org>
Shahbaz Youssefi 76181384 2021-05-14T15:31:24 Vulkan: Disable BufferVk suballocation There are still unresolved bugs. Bug: angleproject:5719 Bug: chromium:1209197 Change-Id: I6a971c421d0ae266404d1ecbf8741a9747a4e809 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/2897545 Reviewed-by: Cody Northrop <cnorthrop@google.com> Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Commit-Queue: Tim Van Patten <timvp@google.com>
Shahbaz Youssefi e354ff1a 2021-03-05T04:07:21 Vulkan: Allow DynamicBuffer suballocation in BufferVk When allocations are made from DynamicBuffer, they suballocate from a possibly larger BufferHelper. In BufferVk, the offset of the suballocation was discarded, which limited the use of DynamicBuffer to a pool of small buffers. This change applies any such offset that may arise from suballocations everywhere, and makes BufferVk use a larger buffer size when the GL_DYNAMIC_* buffer usage hints are provided. Bug: angleproject:5719 Change-Id: I3df3317f7acff1b1b06a5e3e2bb707616a7d0512 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/2738650 Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org> Reviewed-by: Tim Van Patten <timvp@google.com> Reviewed-by: Jamie Madill <jmadill@chromium.org>
Mohan Maiya 2d5df9d9 2021-05-01T12:50:55 Vulkan: Don't assume host visibility for external buffers When importing external buffers, Vulkan ICDs could choose to import the memory into a memoryType that doesn't support the VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT property. Account for this possibility. Bug: angleproject:5073 Bug: angleproject:5909 Change-Id: Ied063b38fa48d0c8508c4aaca9214cc526f393ad Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/2783669 Commit-Queue: Mohan Maiya <m.maiya@samsung.com> Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Reviewed-by: Jamie Madill <jmadill@chromium.org>
Mohan Maiya cc3e7b5d 2021-04-26T14:26:29 Vulkan: Handle GL_MAP_PERSISTENT_BIT_EXT for external buffer When user specifies GL_MAP_PERSISTENT_BIT_EXT bit for an external buffer but we are unable to import it into a memoryType that supports host visibility, error out with GL_INVALID_OPERATION error. Bug: angleproject:5073 Bug: angleproject:5909 Change-Id: I03e5477266dfb705bfb0a1bce5ca003049ef4c7a Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/2862560 Commit-Queue: Mohan Maiya <m.maiya@samsung.com> Reviewed-by: Jamie Madill <jmadill@chromium.org> Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org>
Mohan Maiya fc0c8d18 2021-04-27T16:58:50 Vulkan: Honor mapRangeImpl and unmapImpl abstraction All BufferVk methods need to honor the abstraction provided by mapRangeImpl and unmapImpl. Do not map BufferVk::mBuffer directly, this is needed for when we support device local buffers that cannot be CPU mapped. Bug: angleproject:5909 Change-Id: I520e5cc0994560a3784b8978e349550211dc2cde Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/2862559 Reviewed-by: Jamie Madill <jmadill@chromium.org> Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Commit-Queue: Mohan Maiya <m.maiya@samsung.com>
Mohan Maiya 7fe44a53 2021-03-20T09:39:09 Vulkan: Don't acquire new BufferHelper for external buffers EXT_external_buffer spec - This extension allows the data store for an immutable buffer to be sourced from an external EGLClientBuffer, allowing sharing of EGL client buffers across APIs, across processes, and across different processing cores such as the GPU, CPU, and DSP. The intent is for a single backing memory to be reused across various processes and processors. Ensure that a glBuffer backed by external memory does not orphan the memory when glBuffer APIs like glBufferSubData or glMapBufferRangeEXT modify the glBuffer. Bug: angleproject:4380 Bug: angleproject:5073 Tests: ExternalBufferTestES31.*DoesNotCauseOrphaning*Vulkan Change-Id: I4e88f80d93ee1ba1208378121412926351d10af8 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/2776192 Commit-Queue: Mohan Maiya <m.maiya@samsung.com> Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Reviewed-by: Jamie Madill <jmadill@chromium.org>
Mohan Maiya 331640e5 2021-03-09T14:36:19 Vulkan: Pass in the correct size to acquireBufferHelper When BufferVk::acquireAndUpdate calls into acquireBufferHelper to allocate a new buffer helper we were passing in the update size instead of the full buffer size. Modified acquireAndUpdate's parameter to better reflect intent. Bug: angleproject:5689 Change-Id: Ic4fbc015651491ec028d747da5d45670264b93fa Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/2746066 Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Reviewed-by: Jamie Madill <jmadill@chromium.org> Commit-Queue: Mohan Maiya <m.maiya@samsung.com>
Mohan Maiya c054008f 2021-03-06T13:33:11 Vulkan: Check buffer usage before unmapping Buffers with dynamic usage will have frequent CPU updates. Don't CPU unmap such buffers after every update. Commits b5af8bde13 and 58c35d421 took care of performing an unmap when we release the buffer either to the renderer or mBufferFreeList. Bug: angleproject:5689 Change-Id: Ib6b8f6a7d0cb36583140e67bf164e074af098b8b Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/2741688 Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Reviewed-by: Jamie Madill <jmadill@chromium.org> Commit-Queue: Mohan Maiya <m.maiya@samsung.com>
Shahbaz Youssefi e366e2c3 2021-02-27T01:00:02 Vulkan: Keep dynamic buffer's free list trimmed ContextVk's staging buffer never gets a chance to free its free buffer list. During application load time, a large amount of memory may be allocated from this buffer to stage texture updates and they would remain throughout the life of the application. This change ensures that the free buffer list doesn't grow unbounded. In the Manhattan trace, this saves >1GB of memory on Linux. There are now three policies for vk::DynamicBuffer: - Always reuse buffers: This is useful for dynamic buffers that make frequent small allocations, such as default uniforms, driver uniforms, default vertex attributes and UBO updates. - Never reuse buffers: This is for situations where the buffer is unlikely to be used after some initial usage, such as texture data upload or vertex format emulation (as the conversion result is cached, so it's never redone). - Limited reuse of buffers: For the staging buffer in the context which is shared by all immutable texture data uploads, it's useful to keep a limited number of buffers (1 in this change) to support future texture streaming while allowing a large number of buffers allocated in a burst to be discarded. Bug: angleproject:5690 Change-Id: Ic39ce61e6beb3165dbce4b668e1d3984a2b35986 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/2725499 Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org> Reviewed-by: Tim Van Patten <timvp@google.com> Reviewed-by: Charlie Lao <cclao@google.com> Reviewed-by: Jamie Madill <jmadill@chromium.org>
Shahbaz Youssefi d7037aa2 2021-02-11T14:35:30 Vulkan: noop glMemoryBarrier(CLIENT_MAPPED_BUFFER_BARRIER_BIT_EXT) CLIENT_MAPPED_BUFFER_BARRIER_BIT_EXT requires a memory barrier: shader buffer write -> host read. According to the spec, the data is only available after a call to glFinish or wait on sync: > The application must call MemoryBarrier with the > CLIENT_MAPPED_BUFFER_BARRIER_BIT_EXT set and then call FenceSync with > SYNC_GPU_COMMANDS_COMPLETE (or Finish). Then the CPU will see the > writes after the sync is complete. When a buffer is written to by the GPU, ANGLE calls onHostVisibleBufferWrite(), which ensures a "memory write -> host read" barrier is issued at the end of the command buffer. Additionally, persistently mapped buffers use VK_MEMORY_PROPERTY_HOST_COHERENT_BIT, so there's no need for a call to vkInvalidateMappedMemoryRanges. As a result, there's nothing necessary in ANGLE to do for this barrier bit. Note that should persistenly mapped buffers start using non-coherent memory, this barrier should imply a call to vkInvalidateMappedMemoryRanges for the persistently mapped buffers. Bug: angleproject:5070 Change-Id: Iaeae019dadfa659a47d2dac41c0c09f1c15e584b Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/2689380 Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org> Reviewed-by: Mohan Maiya <m.maiya@samsung.com> Reviewed-by: Charlie Lao <cclao@google.com>
Charlie Lao 0c5a55a5 2020-12-17T14:52:59 Vulkan: MapBufferRange should avoid wait if INVALIDATE_BUFFER is set If glMapBufferRange is called with GL_MAP_INVALIDATE_BUFFER_BIT bit set, caller indicates that it don't care about the previous content. If the buffer is busy, instead of wait for GPU to finish and then map the buffer, we should just allocate a new memory and return it. brawl_stars is hitting this case. With this CL, the frame time is cutting to half on the pixel device. Bug: b/175905404 Change-Id: If1220f07ebf53dd28fe6a4732eaba84e2e57598e Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/2597784 Reviewed-by: Tim Van Patten <timvp@google.com> Reviewed-by: Jamie Madill <jmadill@chromium.org> Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Commit-Queue: Charlie Lao <cclao@google.com>