|
7adbb3e8
|
2024-11-26T17:06:07
|
|
Vulkan: Remove explicit destroy calls
Since now SharedPtr will automatically call destroy(device) when last
reference goes away, there is no need for explicitly calling
destroy(device) any more.
Bug: angleproject:372268711
Change-Id: I208b17cf7e090babd51d6f337c20fdfd74c75b6b
Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/6052886
Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org>
Commit-Queue: Charlie Lao <cclao@google.com>
Reviewed-by: Yuxin Hu <yuxinhu@google.com>
|
|
d81834b6
|
2024-11-26T15:25:35
|
|
Vulkan: Store VkDevice in vk::SharedPtr
So that we don't need to have two versions of destroy() APIs. In
previous CLs I had to add another version of destroy() that does not
take device argument due to SharedPtr may calls destroy when last
reference count goes away. Because we do not have device information at
that time, destroy() API was added but mostly just doing assertion that
Vulkan object has been explicitly destroyed. With this CL, we now stores
device in the SharedPtr so that we no longer need two destroy() APIs.
The explicit destroy(device) call will be removed in the next CL.
Bug: angleproject:372268711
Change-Id: Idcacbc3a922e17ac3d0f6056466b8f3aa084b02e
Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/6052096
Commit-Queue: Charlie Lao <cclao@google.com>
Reviewed-by: Yuxin Hu <yuxinhu@google.com>
Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org>
|
|
f51170b3
|
2024-11-21T16:30:40
|
|
Enable GL_KHR_texture_compression_astc_hdr
Vulkan supports GL_KHR_texture_compression_astc_hdr,
so this extension can be enabled in Angle.
Bug: angleproject:379186304
Change-Id: I438a120c3f884a7eefcd883ad71abf68f81cb473
Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/6038457
Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org>
Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org>
|
|
01dee1cb
|
2024-10-14T15:04:28
|
|
Add implementation for GL_EXT_texture_storage_compression
Bug: angleproject:352364583
Change-Id: I3dab4c68d5d0206d681e165e991217bd3de8eeb6
Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/6011055
Auto-Submit: Neil Zhang <Neil.Zhang@arm.com>
Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org>
Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org>
|
|
87d61997
|
2024-11-21T14:20:55
|
|
Vulkan: Switch ShaderModule to use SharedPtr
This CL gets rid of many vk::RefCounted<vk::ShaderModule> usage which is
risky due to it allows you to direct manipulate reference count. Switch
to vk::SharedPtr manages the reference counting automatically.
Bug: angleproject:372268711
Change-Id: I14f5c509bcbd9ea7d17101637e033652a68710a0
Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/6039117
Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org>
Commit-Queue: Charlie Lao <cclao@google.com>
Reviewed-by: Yuxin Hu <yuxinhu@google.com>
|
|
ce53aff0
|
2024-11-05T16:57:57
|
|
Vulkan: Add per descriptorSet LRU cache eviction
Before this CL, the descriptor set cache eviction is at the pool level.
Either the entire pool is deleted or not. It is also not LRU based.
This CL adds a per descriptor set cache eviction and reuse evicted
descriptorSet before allocating a new pool. This eviction is LRU based
so that it is more precise. The mCurrentFrameCount is passed into
various API so that it can make eviction decision based on the frame
number. In this CL, anything not been used in last 10 frames will be
evicted and recycled before allocate a new pool.
Since eviction is based on individual descriptor set, not by pool,
ProgramExecutableVk no longer needs to track the DescriptorSetPool
object. mDescriptorPools has been removed from ProgramExecutableVk
class.
As measured by crrev.com/c/5425496/133 This LRU linked list maintenance
does not add any measurable time difference, but reduces total
descriptorSet pool count by one third (from 75 down to 48).
running test name: "TracePerf", backend: "_vulkan", story:
"batman_telltale"
Before this CL:
cacheMissCount: 200, averageTime:23998 ns
cacheHitCount: 1075445, averageTime:626 ns
descriptorSetEvicted: 0, descriptorSetPoolCount:75
Average frame time 3.9262 ms
After this CL:
cacheMissCount: 200, averageTime:23207 ns
cacheHitCount: 1025415, averageTime:602 ns
descriptorSetEvicted: 102708, descriptorSetPoolCount:48
Average frame time 3.9074 ms
BYPASS_LARGE_CHANGE_WARNING
Bug: angleproject:372268711
Change-Id: I84daaf46f4557cbbfdb94c10c5386001105f5046
Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5985112
Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org>
Reviewed-by: Yuxin Hu <yuxinhu@google.com>
Commit-Queue: Charlie Lao <cclao@google.com>
|
|
987cc0de
|
2024-08-06T12:16:37
|
|
Vulkan: Add release utility for BufferViewHelper
Add a release helper that doesn't require ContextVk for
BufferViewHelper.
Bug: angleproject:378103913
Change-Id: Ib468d152501ecaec1dd71c9c6ed5527b73a1afa5
Signed-off-by: Gowtham Tammana <g.tammana@samsung.com>
Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/6004686
Reviewed-by: Geoff Lang <geofflang@chromium.org>
Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org>
|
|
adb80cbb
|
2024-11-12T22:38:38
|
|
Vulkan: Optimize read-only depth/stencil switch after clear
Prior to this change, if color is cleared then read-only depth/stencil
mode is enabled, ContextVk::switchToReadOnlyDepthStencilMode eagerly
flushed the deferred clears (starting a render pass).
However, if the render pass is marked dirty for any reason afterwards,
for example because we want to flush the render pass after a query
ends, the render pass that is just started above is unnecessarily
closed.
In this change, `switchToReadOnlyDepthStencilMode` only detects if a new
render pass is needed and marks the appropriate dirt bits. This way,
the render pass can only be restarted once.
Bug: angleproject:378058737
Change-Id: I83a5ebae6c223882eafea338eeec19895d87e5c1
Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/6023414
Reviewed-by: Charlie Lao <cclao@google.com>
Reviewed-by: Amirali Abdolrashidi <abdolrashidi@google.com>
Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org>
|
|
f44427b5
|
2024-11-05T15:18:59
|
|
Vulkan: Fix MSAA glReadPixels into PBOs
The temporary image used to resolve the MSAA framebuffer during
glReadPixels did not have the SAMPLED usage bit. Additionally, the
image was supposed to be garbage collected afterwards but ImageHelper's
release() function was accidentally immediately destroy()ing it.
This was not an issue with blocking glReadPixels paths, because the
command buffer was immediately flushed and the GPU work was waited on
before the image was destroyed in RendererScoped's destructor.
Bug: b/377437834
Change-Id: I1dca47172d6f363277059a848fe9446ac2a872d4
Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5995530
Commit-Queue: Charlie Lao <cclao@google.com>
Auto-Submit: Shahbaz Youssefi <syoussefi@chromium.org>
Reviewed-by: Yuxin Hu <yuxinhu@google.com>
Reviewed-by: Charlie Lao <cclao@google.com>
|
|
17a01469
|
2024-10-26T06:07:31
|
|
Vulkan: Bugfix TextureVk::generateMipmap
Add support for generating mipmaps of textures that are EGLImage
texture targets with colorspace overrides
Bug: angleproject:40644776
Tests: ImageTestES3.SourceAHBTarget2DGenerateMipmap*
Change-Id: I9b4ff802f4118a42d54dc8d80ab30e2f9958bfee
Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5966623
Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org>
Reviewed-by: Charlie Lao <cclao@google.com>
Commit-Queue: mohan maiya <m.maiya@samsung.com>
|
|
e2cd9082
|
2024-11-05T04:07:17
|
|
Vulkan: Bugfix in setCurrentImageLayout
Make sure to update mLastNonShaderReadOnlyLayout and
mCurrentShaderReadStageMask when updating the current layout.
Bug: angleproject:40644776
Change-Id: Ie8652099a0d4caca9f9aea5bac38256a513b08e7
Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5992020
Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org>
Reviewed-by: Charlie Lao <cclao@google.com>
Commit-Queue: mohan maiya <m.maiya@samsung.com>
|
|
79b6c7ab
|
2023-10-09T13:29:07
|
|
CL/Vulkan: Add fillWithPattern interface
In CL, the buffer can be requested to filled with a pattern. Adding a
pattern fill helper routine that fills up the buffer from CPU side.
Bug: angleproject:42267074
Change-Id: I144e9b7c6f4d1263f21cabc2491c46e8951e604f
Signed-off-by: Gowtham Tammana <g.tammana@samsung.com>
Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5916157
Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org>
Reviewed-by: Geoff Lang <geofflang@chromium.org>
|
|
6b9d3762
|
2024-08-22T15:29:00
|
|
Vulkan: Optimize full texture clears
Currently, a full texture clear (glClearTexImageEXT()) is treated as
a special case of a partial clear (glClearTexSubImageEXT() with image
dims as the input). However, it can be further optimized by treating
it as a clear update.
* For full clears from EXT_clear_texture, the clear update path is
taken.
* It leads to a more optimized path, including the usage of the
following APIs:
* vkCmdClearColorImage()
* vkCmdClearDepthStencilImage()
* It uses the following enum: ClearTextureMode
* If a partial clear uses the extents for the entire image, it is
treated as a full clear.
* Updated the method to determine if a texture is renderable in
clearSubImageImpl().
* Added perf counter: fullImageClears
* Added new unit tests
* Single 3D texture full clear (Clear3DSingleFull)
* 2D RGB SNORM clear (Clear2DRGB8Snorm)
* Added Vulkan perf counter test for 2D and 3D color image clear.
* Updated the related skipped tests on Pineapple.
Bug: angleproject:42266869
Bug: angleproject:375425839
Change-Id: I12ef3002dee190d7f8f43204f7d3f76e05d0b54f
Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5806207
Reviewed-by: Charlie Lao <cclao@google.com>
Commit-Queue: Amirali Abdolrashidi <abdolrashidi@google.com>
Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org>
|
|
af73a7eb
|
2024-10-18T17:19:29
|
|
Vulkan: Add WeakPtr to mirror std::weak_ptr
This is a follow up from previous CL crrev.com/c/5938947. Because of
pool eviction is based on reference count, there is need to have a weak
pointer to the reference counted pool that does not add an extra
reference count. This CL adds WeakPtr that works similarly to
std::weak_ptr, and replaced direct RefCountedDescriptorPool pointer with
WeakPtr wrapper. This is safer than RefCountedDescriptorPool in a way
that it does not allow modification of underline reference count. Also
the use of WeakPtr has been reduced to minimum in this CL.
Bug: angleproject:372268711
Change-Id: Idd6fa77432a9351269c968c961785a7cf5fab50c
Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5944061
Reviewed-by: Yuxin Hu <yuxinhu@google.com>
Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org>
Commit-Queue: Charlie Lao <cclao@google.com>
|
|
08c1724f
|
2024-10-11T14:29:00
|
|
Vulkan: Support GL_ARM_shader_framebuffer_fetch_depth_stencil
Bug: angleproject:352364582
Change-Id: I63fd78314fa7ebccbf366c252e309a9c0f09c8c1
Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5938150
Reviewed-by: Charlie Lao <cclao@google.com>
Reviewed-by: Yuxin Hu <yuxinhu@google.com>
Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org>
|
|
dd54eeec
|
2024-10-11T13:26:46
|
|
Reland "Vulkan: Track GPU progress for individual DescriptorSet"
This is a reland of commit 292102944add2ab30f4aa12a971cac456cc7726b
with the fix of garbage being added back to garbage list.
Original change's description:
> Vulkan: Track GPU progress for individual DescriptorSet
>
> Right now ProgramExecutableVk keeps VkDescriptorSet object, and
> DescriptorSetHelper is created when a cache entry becomes invalid.
> Further, DescriptorSetCache keeps the cache of {VkDescriptorSet,
> RefCountedDescriptorPoolHelper} pair. So we are having three different
> type of objects at different stages of life: VkDescriptorSet,
> DescriptorSetHelper, and {VkDescriptorSet,
> RefCountedDescriptorPoolHelper. This CL makes DescriptorSetHelper at
> creation and at cache and at garbage. With this change, you have a
> reference counted DescriptorSetHelper object (i.e, DescriptorSetPointer)
> during entire life cycle and is passed around between cache and program
> as is. This CL is preparation for the future CL where we may disable
> cache for descriptorSet. The descriptorSet will be added to garbage list
> and reused constantly without go through the cache code. We need to
> track the individual descriptorSet with ResourceUse so that it won't
> reuse until GPU is finished. This CL is making DescriptorSetHelper a GPU
> tracking object so that it will still just work when cache is disabled.
>
> Bug: angleproject:372268711
> Change-Id: I1cfb77cc5069b202d870388fd8809e265cdca90b
> Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5918586
> Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org>
> Commit-Queue: Charlie Lao <cclao@google.com>
> Reviewed-by: Yuxin Hu <yuxinhu@google.com>
Bug: angleproject:372268711
Change-Id: Ic920f99cc78cde1e94690bdbee3b885844fa155b
Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5954701
Commit-Queue: Charlie Lao <cclao@google.com>
Reviewed-by: Yuxin Hu <yuxinhu@google.com>
|
|
45cc47af
|
2024-10-22T21:41:22
|
|
Revert "Vulkan: Track GPU progress for individual DescriptorSet"
This reverts commit 292102944add2ab30f4aa12a971cac456cc7726b.
Reason for revert: Causing bot failure in later CLs
Original change's description:
> Vulkan: Track GPU progress for individual DescriptorSet
>
> Right now ProgramExecutableVk keeps VkDescriptorSet object, and
> DescriptorSetHelper is created when a cache entry becomes invalid.
> Further, DescriptorSetCache keeps the cache of {VkDescriptorSet,
> RefCountedDescriptorPoolHelper} pair. So we are having three different
> type of objects at different stages of life: VkDescriptorSet,
> DescriptorSetHelper, and {VkDescriptorSet,
> RefCountedDescriptorPoolHelper. This CL makes DescriptorSetHelper at
> creation and at cache and at garbage. With this change, you have a
> reference counted DescriptorSetHelper object (i.e, DescriptorSetPointer)
> during entire life cycle and is passed around between cache and program
> as is. This CL is preparation for the future CL where we may disable
> cache for descriptorSet. The descriptorSet will be added to garbage list
> and reused constantly without go through the cache code. We need to
> track the individual descriptorSet with ResourceUse so that it won't
> reuse until GPU is finished. This CL is making DescriptorSetHelper a GPU
> tracking object so that it will still just work when cache is disabled.
>
> Bug: angleproject:372268711
> Change-Id: I1cfb77cc5069b202d870388fd8809e265cdca90b
> Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5918586
> Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org>
> Commit-Queue: Charlie Lao <cclao@google.com>
> Reviewed-by: Yuxin Hu <yuxinhu@google.com>
Bug: angleproject:372268711
Change-Id: I4d3c34058d100112a098144276b52c0faf8d593a
No-Presubmit: true
No-Tree-Checks: true
No-Try: true
Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5955529
Auto-Submit: Charlie Lao <cclao@google.com>
Commit-Queue: Rubber Stamper <rubber-stamper@appspot.gserviceaccount.com>
Bot-Commit: Rubber Stamper <rubber-stamper@appspot.gserviceaccount.com>
|
|
99ba07d3
|
2024-03-26T00:58:57
|
|
CL/Vulkan: Implement createImage
Enabling functionality of:
- clCreateImage
Tests-Passing: OCLCTS.test_api get_image<1d|2d|3d>_info
Bug: angleproject:42266936
Signed-off-by: Rafay Khurram <r.khurram@samsung.com>
Change-Id: I0281f092bff13cdd81b87d596fdd15b33dda7e46
Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5796527
Reviewed-by: Geoff Lang <geofflang@chromium.org>
Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org>
|
|
29210294
|
2024-10-11T13:26:46
|
|
Vulkan: Track GPU progress for individual DescriptorSet
Right now ProgramExecutableVk keeps VkDescriptorSet object, and
DescriptorSetHelper is created when a cache entry becomes invalid.
Further, DescriptorSetCache keeps the cache of {VkDescriptorSet,
RefCountedDescriptorPoolHelper} pair. So we are having three different
type of objects at different stages of life: VkDescriptorSet,
DescriptorSetHelper, and {VkDescriptorSet,
RefCountedDescriptorPoolHelper. This CL makes DescriptorSetHelper at
creation and at cache and at garbage. With this change, you have a
reference counted DescriptorSetHelper object (i.e, DescriptorSetPointer)
during entire life cycle and is passed around between cache and program
as is. This CL is preparation for the future CL where we may disable
cache for descriptorSet. The descriptorSet will be added to garbage list
and reused constantly without go through the cache code. We need to
track the individual descriptorSet with ResourceUse so that it won't
reuse until GPU is finished. This CL is making DescriptorSetHelper a GPU
tracking object so that it will still just work when cache is disabled.
Bug: angleproject:372268711
Change-Id: I1cfb77cc5069b202d870388fd8809e265cdca90b
Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5918586
Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org>
Commit-Queue: Charlie Lao <cclao@google.com>
Reviewed-by: Yuxin Hu <yuxinhu@google.com>
|
|
4bdcdf0d
|
2024-10-16T11:44:49
|
|
Vulkan: Switch RefCountedDescriptorPoolBinding to use SharedPtr
This mostly a clean up. RefCountedDescriptorPoolBinding is replaced with
DescriptorPoolPointer, which is defined as
SharedPtr<DescriptorPoolHelper>. It has more intuitive semantics to use.
Bug: angleproject:372268711
Change-Id: I0397111b5228e896c1d226e00930851319d955a0
Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5938947
Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org>
Reviewed-by: Yuxin Hu <yuxinhu@google.com>
Commit-Queue: Charlie Lao <cclao@google.com>
|
|
c52e8493
|
2024-10-11T14:21:19
|
|
Vulkan: Switch DescriptorPoolPointer to use SharedPtr
To make the reference counting easier to maintain, SharedPtr class is
added to automatically tracking the reference counting and object
creation/destruction. Right now we have BindingPointer class doing
similar things except it does not automatically manage the object
create/destroy, which makes it less robust as well as redundant code to
manage object life cycle. The other problem with BindingPointer is that
it does not work with normal assign/copy operator which made it hard to
read and use. SharedPtr uses exact same API semantics as
std::shared_ptr, which makes the reference counting very easy to use.
The main difference of SharedPtr and std::shared_ptr is that it does not
use any atomic or lock since it assumes user only uses it under thread
safe environment, which ANGLE's backend is.
This CL also changes mDescriptorPools to mDynamicDescriptorPools to make
it clear that it is dynamic descriptor pool not the descriptor pool.
This is also preparation CL for the next CL where we will use SharedPtr
to manage DescriptorSetHelper life cycle, which otherwise a bit
complicated to manually manage.
Bug: angleproject:372268711
Change-Id: I1033d9bf259bbc075a9b374d8a28e1f67d889873
Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5926183
Commit-Queue: Charlie Lao <cclao@google.com>
Reviewed-by: Yuxin Hu <yuxinhu@google.com>
Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org>
|
|
a19f0947
|
2024-10-17T22:42:30
|
|
Vulkan: Cache depth- and stencil-only views
Existing depth/stencil blit and resolve paths created temporary depth-
and stencil-only views. For
GL_ARM_shader_framebuffer_fetch_depth_stencil, such views are needed as
well.
In preparation for that extension, this change adds depth- and
stencil-only views to ImageViewHelper and allows them to be retrieved
through RenderTargetVk. The blit and resolve paths are consequently
simplfied as a side-effect.
Bug: angleproject:352364582
Change-Id: Ia822efb44ca7c82f63afce904eb19dd1bed02ff5
Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5938149
Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org>
Commit-Queue: Yuxin Hu <yuxinhu@google.com>
Auto-Submit: Shahbaz Youssefi <syoussefi@chromium.org>
Reviewed-by: Charlie Lao <cclao@google.com>
Reviewed-by: Yuxin Hu <yuxinhu@google.com>
|
|
0dbe85f3
|
2024-10-15T13:54:36
|
|
Increase the size of vector WriteImages to max
ANGLE limits the size of vector which represents the write images when
resolving images. So when blit a multisample buffer to mrt, the sum of
write images is more than 1 and app will abort while checking the size
of the vector.
This patch increases the size of vector WriteImages to max.
Add end2end test to test blit multisampled framebuffer to MRT
framebuffer.
Bug: angleproject:361369302
Change-Id: I2d892bcd3411f2bca2ff514f6f0b6055d872668a
Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5872512
Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org>
Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org>
|
|
1608d0be
|
2024-10-10T16:53:15
|
|
Vulkan: Isolate framebuffer fetch no-RP-break optim from DR
Prior to [1], changes to framebuffer fetch usage by shaders caused a
render pass break. This was due to a limitation of render pass
compatibility rules. It also caused other headache, such as needing to
clear the render pass cache, recreating pipelines etc.
[1]:https://chromium-review.googlesource.com/c/angle/angle/+/3697308
In [1] an important optimization was implemented for tiling GPUs where
ANGLE permanently switched to framebuffer fetch mode on first
encountering framebuffer fetch use. From that point on, ANGLE would
always make every render pass framebuffer fetch compatible.
In reality, the render pass break was unnecessary, which became apparent
with dynamic rendering (for example that whether the render pass
includes input attachments has no bearing on a pipeline that doesn't use
input attachments at all). In [2], dynamic rendering kept the render
pass break + permanent switch behavior for simplicity.
[2]:https://chromium-review.googlesource.com/c/angle/angle/+/5637155
This change untangles the optimization done for legacy render passes
from dynamic rendering, allowing dynamic rendering to start every render
pass without framebuffer fetch and enable it later if a framebuffer
fetch program is used.
This is in preparation for supporting depth/stencil framebuffer fetch,
where a perma-switch is troublesome (for example in combination with
read-only depth/stencil feedback loops).
Bug: angleproject:352364582
Change-Id: I31221cf22a28d58b9b2bf188e9c0b786cd0fe3d2
Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5923120
Reviewed-by: Yuxin Hu <yuxinhu@google.com>
Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org>
Reviewed-by: Charlie Lao <cclao@google.com>
|
|
62f33a5c
|
2024-10-09T16:06:54
|
|
Vulkan: Make retainResource for descriptorSetPool consistent
Right now the cache hit and cache miss case are handled differently.
This CL makes it consistent. Now Caller of
DynamicDescriptorPool::getOrAllocateDescriptorSet will handle this
retainResource call regardless of cache hit or miss.
Bug: b/372268711
Change-Id: I11e47ab9e56a24eb68a0231c5e553ab6b8833c82
Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5921640
Commit-Queue: Charlie Lao <cclao@google.com>
Reviewed-by: Yuxin Hu <yuxinhu@google.com>
|
|
b3d85cce
|
2024-09-30T14:28:35
|
|
Vulkan: Consolidate write colorspace override states
ColorspaceState struct is now used to cache write colorspace related
states to determine the colorspace of Vulkan draw image views.
ImageViewHelper methods are called during initialization and when
colorspace related states are toggled dynamically which in turn process
these states and determine the final write colorspace.
We can now fully support rendering to EGLImages, with colorspace
overrides, via texture or renderbuffer EGLImage targets
Bug: angleproject:40644776
Tests: ImageTest*Colorspace*Vulkan
MultithreadingTestES3.SharedSrgbTextureMultipleContexts*Vulkan
ReadPixelsPBOTest.SrgbUnorm*Vulkan
Change-Id: I2be2cd3b5b2b4ac8ecb803c34cde2b846cbd1cbe
Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5901256
Reviewed-by: Charlie Lao <cclao@google.com>
Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org>
Commit-Queue: mohan maiya <m.maiya@samsung.com>
|
|
b38cc7fa
|
2024-09-30T12:43:09
|
|
Vulkan: Consolidate read colorspace override states
ColorspaceState struct is now used to cache read colorspace related
states to determine the colorspace of Vulkan read image views.
ImageViewHelper methods are called during initialization and when
colorspace related states are toggled dynamically which in turn process
these states and determine the final read colorspace.
Bug: angleproject:40644776
Tests: ImageTest*Colorspace*Vulkan
SRGBTextureTest.SRGB*TextureParameter*Vulkan
SRGBTextureTestES3.SRGBDecodeTexelFetch*Vulkan
Change-Id: I16b3666cd80865936b826dc0738fc9210dabeda9
Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5901255
Reviewed-by: Charlie Lao <cclao@google.com>
Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org>
Commit-Queue: mohan maiya <m.maiya@samsung.com>
|
|
7c811715
|
2024-09-25T11:09:44
|
|
Vulkan: fix crash when clearing stencil with ClearBuffer
Follow up to [1] which fixed a crash with glClear, but the bug remained
with glClearBufferiv. This change refactors the "is stencil write
masked out" query to always take the framebuffer's stencil bit count
into account (practically always 8), which also happens to make the rest
of the code checking this query more accurate in the presence of
nonsense masks where the bottom 8 bits are 0.
[1]: https://chromium-review.googlesource.com/c/angle/angle/+/3315158
Bug: chromium:40207259
Bug: angleproject:42266334
Change-Id: I68a6b0b75c67ed2cdc8c4d03b243efe5495efce1
Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5889788
Reviewed-by: Geoff Lang <geofflang@chromium.org>
Auto-Submit: Shahbaz Youssefi <syoussefi@chromium.org>
Commit-Queue: Geoff Lang <geofflang@chromium.org>
|
|
eaffa034
|
2024-09-24T20:56:04
|
|
Revert "Vulkan: Consolidate colorspace override states"
This reverts commit bffcd235ba6c031603d798daaa98f1cf9a3f3e46.
Reason for revert: Breaks Android test `org.skia.skqp.SkQPRunner#UnitTest_DMSAA_dst_read`. Details:
https://b.corp.google.com/issues/369388539.
Original change's description:
> Vulkan: Consolidate colorspace override states
>
> ColorspaceState struct is now used to cache colorspace related states
> and used to determine the colorspace of Vulkan image views.
> ImageViewHelper methods are called during initialization and when
> colorspace related states are toggled dynamically which in turn process
> these states and determine the final read and write colorspaces.
>
> We can now fully support rendering to EGLImages, with colorspace
> overrides, via texture or renderbuffer EGLImage targets
>
> Bug: angleproject:40644776
> Tests: ImageTest*Colorspace*Vulkan
> MultithreadingTestES3.SharedSrgbTextureMultipleContexts*Vulkan
> SRGBTextureTest.SRGB*TextureParameter*Vulkan
> SRGBTextureTestES3.SRGBDecodeTexelFetch*Vulkan
> ReadPixelsPBOTest.SrgbUnorm*Vulkan
> Change-Id: I1cc2b5bd834b519b83deab4d80a2fcaabeb271d6
> Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5841290
> Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org>
> Reviewed-by: Charlie Lao <cclao@google.com>
> Commit-Queue: mohan maiya <m.maiya@samsung.com>
Bug: angleproject:40644776
Change-Id: I5bf6cf2ed0c8ec22fc02d8c3da92673ee85fe002
Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5888506
Reviewed-by: Yuxin Hu <yuxinhu@google.com>
Bot-Commit: Rubber Stamper <rubber-stamper@appspot.gserviceaccount.com>
Commit-Queue: Yuxin Hu <yuxinhu@google.com>
|
|
bffcd235
|
2024-09-13T14:58:00
|
|
Vulkan: Consolidate colorspace override states
ColorspaceState struct is now used to cache colorspace related states
and used to determine the colorspace of Vulkan image views.
ImageViewHelper methods are called during initialization and when
colorspace related states are toggled dynamically which in turn process
these states and determine the final read and write colorspaces.
We can now fully support rendering to EGLImages, with colorspace
overrides, via texture or renderbuffer EGLImage targets
Bug: angleproject:40644776
Tests: ImageTest*Colorspace*Vulkan
MultithreadingTestES3.SharedSrgbTextureMultipleContexts*Vulkan
SRGBTextureTest.SRGB*TextureParameter*Vulkan
SRGBTextureTestES3.SRGBDecodeTexelFetch*Vulkan
ReadPixelsPBOTest.SrgbUnorm*Vulkan
Change-Id: I1cc2b5bd834b519b83deab4d80a2fcaabeb271d6
Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5841290
Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org>
Reviewed-by: Charlie Lao <cclao@google.com>
Commit-Queue: mohan maiya <m.maiya@samsung.com>
|
|
e38d25b1
|
2024-06-21T18:22:32
|
|
Vulkan: Implement EXT_clear_texture
* Added new functions to TextureVk to clear the image.
* clearImage()
* clearSubImage()
* Both implemented via clearSubImageImpl(), with the former a
special case of the latter.
* For multisample or renderable images, stagePartialClear() from
ImageHelper is called to add the update.
* For single-sampled non-renderable images, a buffer is filled with
the pixel data and applied to the image as a buffer update.
* Added new update type: ClearPartial
* Used for renderable textures. This includes multisample textures.
* LOAD_OP_CLEAR is used in a render pass to perform the clear.
* UtilsVk::clearTexture()
* (Uses ClearTextureParameters)
* Uses the following functions to get the VkClearValue from the
input data and format:
* GetVkClearColorValueFromBytes()
* GetVkClearDepthStencilValueFromBytes()
* ClearPartial updates can also be superseded and removed similar to
Buffer updates.
* Updated UtilsVk::startRenderPass() to accept a VkClearValue* as an
input arg. If used, the render pass will use LOAD_OP_CLEAR.
* Enabled the feature "clearTextureEXT" on Vulkan.
* Added new unit tests in ClearTextureEXTTest for various formats and
pixel sizes.
* Added related multisample tests in FramebufferTest.cpp.
* FramebufferTest_ES31.ClearTextureEXT*
* Disabled some of the new tests failing using OpenGL.
* Disabled stencil-only-related tests on Pineapple.
Bug: angleproject:42266869
Change-Id: I89c631d68a4ed63d9991abe1783333255ade20dd
Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5778348
Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org>
Reviewed-by: Charlie Lao <cclao@google.com>
Commit-Queue: Amirali Abdolrashidi <abdolrashidi@google.com>
|
|
9685cb96
|
2024-09-17T11:41:28
|
|
Vulkan: Clean up PackedClearValuesArray::store
The code was a bit confusing, raising question why there is check of
aspectFlags != VK_IMAGE_ASPECT_STENCIL_BIT. I believe this was
originally from
https://chromium-review.googlesource.com/c/angle/angle/+/2142711, and
then followed by
https://chromium-review.googlesource.com/c/angle/angle/+/2410366. The
code seems to having some special logic to handle packed depth stencil
clear value. This code may have been changed a bit. When I am looking at
current code base, I am not seeing good reason doing it this way. Caller
is handling the packed format by reading out depth value and pack them
together and then store. This CL replaces store/storeNoDepthStencil
pair to storeColor/storeDepthStencil.
Bug: b/167301719
Change-Id: I40cfca1e51654f5ddaf4b2e8460ae5a26c656f2b
Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5870921
Commit-Queue: Charlie Lao <cclao@google.com>
Reviewed-by: Amirali Abdolrashidi <abdolrashidi@google.com>
|
|
1b4d6185
|
2024-09-12T09:18:46
|
|
Vulkan: Cleanup sRGB related code
Image and image view code is littered with sRGB related enums, even
in places that don't deal with sRGB. Remove sRGB related parameters
from initLayerImageView and getLevelLayerDrawImageView methods, which
now assume default values. Add dedicated methods that allow overriding
sRGB state values.
Also introduce ColorspaceState struct that consolidates all sRGB
related states, this will be used in follow up changes to track
and infer colorspace of image views
Bug: angleproject:40644776
Change-Id: Ifb366db48043e376f9ff6c30c852c44dd96562a1
Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5860808
Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org>
Reviewed-by: Charlie Lao <cclao@google.com>
Commit-Queue: mohan maiya <m.maiya@samsung.com>
|
|
937c5dc8
|
2024-09-09T12:18:55
|
|
Vulkan: Make image{Read,Write} helper interface api agnostic
Removing ContextVk dependency on the imageRead/imageWrite helper utility
functions.
Bug: angleproject:42266971
Change-Id: I493e1fb11e8ae192f766c822cbee278c49c23bfe
Signed-off-by: Gowtham Tammana <g.tammana@samsung.com>
Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5845197
Reviewed-by: Charlie Lao <cclao@google.com>
Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org>
Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org>
|
|
7080b766
|
2024-08-16T11:31:39
|
|
Vulkan: Move LineLoopHelper from vk to rx namespace
There is no line loop support in vulkan. LineLoopHelper is a utility
function for backend, not a helper function for vulkan object. So it is
better fit in rx namespace instead of vk namespace.
This also helps my next CL where I am going to change
initBufferForVertexConversion to take a ConversionBuffer instead of
BufferHelper. LineLoopHelper uses initBufferForVertexConversion, which
means I have to change LineLoopHelper to uses ConversionBuffer. This
causes header inclusion problem that now vk namespace object end up have
to include rx namespace header. This CL fixes this inclusion problem by
moving it to the proper namespace.
Bug: b/357622380
Change-Id: I6d6cf1aa926f726bb1b1ab1017bcab092eaf5d37
Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5787502
Reviewed-by: Yuxin Hu <yuxinhu@google.com>
Commit-Queue: Charlie Lao <cclao@google.com>
|
|
f8fc8ac3
|
2024-08-05T11:50:11
|
|
Vulkan: Remove dependency on ContextVk for CommandBufferHelper
Following on the changes in [1], this makes the
`CommandBufferHelperCommon` and `OutsideRenderPassCommandBufferHelper`
interfaces independent of `ContextVk` state. Any dependency is made
explicit.
In addition, interfaces that are not specific to GLES context are also
updated.
[1]: Commit (bcf814fda5 Vulkan: Constrain the dependency on ContextVk in
BufferHelper)
Bug: angleproject:8544
Change-Id: I7d90ad915e8c14187ab5584453b9e8802bd91e2b
Signed-off-by: Gowtham Tammana <g.tammana@samsung.com>
Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5319147
Reviewed-by: Charlie Lao <cclao@google.com>
Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org>
Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org>
|
|
7c77bb75
|
2024-07-24T12:05:29
|
|
Vulkan: Remove implicit buffer barrier for shader write
When app uses shaders to write to SSBO, right now we are inserting an
implicit barrier to ensure WAW are in order. But Spec says that
"Explicit synchronization is required to ensure that the effects of
buffer and texture data stores performed by shaders will be visible to
subsequent operations using the same objects". This CL removes the
implicit barrier for buffer write if the current write comes from
shaders and relies on explicit glMemoryBarrier to insert a global
barrier.
Bug: angleproject:350994515
Change-Id: I8ab039610be9be2ded27ea60dab54bdad08502f6
Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5719258
Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org>
Commit-Queue: Charlie Lao <cclao@google.com>
Reviewed-by: Yuxin Hu <yuxinhu@google.com>
|
|
bd3a3308
|
2024-07-22T14:03:20
|
|
Vulkan: Remove implicit image barrier for shader write
When app uses compute or fragment shader to write to an image and makes
multiple dispatchCompute or draw calls, right now we are inserting an
implicit barrier to ensure WAW is hazard free. But Spec says that
"Explicit synchronization is required to ensure that the effects of
buffer and texture data stores performed by shaders will be visible to
subsequent operations using the same objects". This CL records the bits
from the last glMemoryBarrier call and will skip the barrier calls in
ContextVk::updateActiveImages if there is no layout change, unless there
is requirement from prior glMemoryBarrier.
Bug: angleproject:350994515
Change-Id: I8bdeeb658993824369824aaa0f25cb4b6e3785f7
Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5719024
Reviewed-by: Yuxin Hu <yuxinhu@google.com>
Commit-Queue: Charlie Lao <cclao@google.com>
Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org>
|
|
7691cea7
|
2024-07-22T13:46:14
|
|
Vulkan: Remove seamful cubemap emulation
Practically, the Vulkan backend is never expected to run on ES2
hardware. It _may_ for WebGL, but seamful cubemap emulation was
disabled for webgl anyway.
Bug: angleproject:354729454
Change-Id: Iafa20fbdbe232c4df4c777b12e7698ef7a87cf24
Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5730143
Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org>
Auto-Submit: Shahbaz Youssefi <syoussefi@chromium.org>
Reviewed-by: Yuxin Hu <yuxinhu@google.com>
Reviewed-by: Charlie Lao <cclao@google.com>
|
|
0d458614
|
2024-07-18T00:00:00
|
|
Vulkan: Fix PBO readbacks with small row length
Use CPU path when the row length is
smaller than the source area width.
Fixed: angleproject:354005999
Change-Id: I5c4686ca5387a98c6137868afb19c333aed8ac21
Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5724591
Commit-Queue: Alexey Knyazev <lexa.knyazev@gmail.com>
Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org>
|
|
1db80b88
|
2024-07-10T12:47:42
|
|
Reland "Vulkan: Use VK_KHR_dynamic_rendering[_local_read]"
This is a reland of commit c379ff48043a47e444c388c45270db40d3172d50
Original change's description:
> Vulkan: Use VK_KHR_dynamic_rendering[_local_read]
>
> Bug: angleproject:42267038
> Change-Id: I1f4eb0f309992a9c1c287a69520dadf5eff23b26
> Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5637155
> Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org>
> Reviewed-by: Amirali Abdolrashidi <abdolrashidi@google.com>
> Reviewed-by: Charlie Lao <cclao@google.com>
Bug: angleproject:42267038
Change-Id: I083e6963b5421386695e49a9872edbb2016c9763
Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5691342
Reviewed-by: Charlie Lao <cclao@google.com>
Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org>
Reviewed-by: Amirali Abdolrashidi <abdolrashidi@google.com>
|
|
1f87cbc9
|
2024-07-15T13:07:35
|
|
Vulkan: Fix late-added resolve attachment tracking
Resolve attachments may be added after the fact to a render pass due to
glBlitFramebuffer or eglSwapBuffer. Previously, only the resolve image
views were tracked by the render pass, and otherwise the state tracking
(layout, content defined, etc) treated the resolve images as generically
written-to by the render pass.
As a result, the render pass was unable to finalize the layout of the
resolve images early. Optimizing the layout of the swapchain image when
the surface is multisampled for example was not done due to this issue.
In this change, when resolve attachments are added late, they are
tracked identically to when they are added at the beginning of the
render pass, fixing the issues described above.
Bug: angleproject:42265625
Bug: angleproject:42266019
Change-Id: I765560762bb8caf39ba1096fb028177201c082d7
Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5707470
Reviewed-by: Charlie Lao <cclao@google.com>
Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org>
|
|
6578b9c0
|
2024-07-09T17:19:47
|
|
Vulkan: Exclude compute/preFrag only access images from event
This further restricts VkEvent usage for certain usage patterns. If
image is only used by compute, use VkEvent also will not benefit it
since compute itself can not overlap with compute (assume there is only
one compute engine and compute work can not overlap with each other).
Similarly this also applies to KPreFragment stages. Basically after this
CL, use of VkEvent is limited to usages that crosses different execution
units (modeled against tiler based GPUs where there are pre-fragment
stages and fragment stages and compute and all others). Before this CL,
we are seeing performance regression with antutu_refinery and
streets_of_rage_4 due to overhead of VkEvent, which is fixed with this
CL.
Bug: b/336844257
Change-Id: I5ca5d813daefe9bfcaf48f831340cdf9559f8104
Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5692760
Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org>
Reviewed-by: Amirali Abdolrashidi <abdolrashidi@google.com>
Commit-Queue: Charlie Lao <cclao@google.com>
|
|
402c8ccd
|
2024-06-26T19:28:21
|
|
Vulkan: Limit VkEvent for images that has fragment access only
One of the problem with VkEvent is that the overhead comes with
VkCmdSetEvent causes some app traces regress performance. The goal in
this CL is to further limit VkCmdSetEvent to images that that we think
are potentially subject to the pipeline bubble. The bubble usually
occurs when accesses are alternated between different stages,
specifically a mix between vertex/transfer/compute/fragment. If all
accesses are from fragment shader or color attachment, then use VkEvent
will not be beneficial, but only adds extra overhead. This CL adds the
heuristic tracking for image access. Every time an image is used, a bit
is used to indicate the usage involves fragment only or not. A bitfield
is used to track the window of the history of the usage. When image is
used (usually at the time queueSerial is set), we shift the history bits
left and the new bit is added to the right most bit. If all accesses are
from the fragment shader or color attachment, then no need to use
VkEvent. For example, if a texture is always sample from fragment shader
only, then VkEvent will not used. Another common usage is you render to
it and then texture from it, it will also excluded from VkEvent with
this CL.
Bug: b/336844257
Change-Id: I175194f30b8f1d9b8fbf38ad594778474548016f
Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5664170
Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org>
Reviewed-by: Amirali Abdolrashidi <abdolrashidi@google.com>
Commit-Queue: Charlie Lao <cclao@google.com>
|
|
373ac541
|
2024-07-10T11:14:47
|
|
Vulkan: Make surface RP check independent from framebuffer object
With dynamic rendering, there is no framebuffer object, so checking
whether the currently open render pass belongs to the window surface (at
swap time) is made independent from these objects.
Bug: angleproject:42267038
Change-Id: I408e2376ba865b64fa1e8890316e8f57c08c695f
Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5691345
Reviewed-by: Charlie Lao <cclao@google.com>
Reviewed-by: Amirali Abdolrashidi <abdolrashidi@google.com>
Commit-Queue: Charlie Lao <cclao@google.com>
Auto-Submit: Shahbaz Youssefi <syoussefi@chromium.org>
|
|
867697b7
|
2024-07-09T10:43:02
|
|
Vulkan: Add ImageHelper::onRenderPassAttach helper function
RenderPass attachments has one difference compared to other images. The
QueueSerial has to be set first so that we can detect an image is being
used as attachment. But the layout is delayed until the endRenderPass
time. This CL adds a onRenderPassAttach API to set the queueSerial so
that we have a central place to adding other code if needed.
Bug: b/336844257
Change-Id: I894fff83745691e8167a295c71cbc2e1d22f1343
Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5689452
Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org>
Commit-Queue: Charlie Lao <cclao@google.com>
|
|
9ca3ed37
|
2024-07-08T16:48:51
|
|
Vulkan: Let ContextVk::onResourceAccess uses retainImage
Right now ContextVk::onResourceAccess calls retainResource for
everything. Mean time we also have a retainImage() function, which adds
a bit confusion to why we have two retain API. This CL moves retainImage
from CommandBufferHelperCommon to OutsideRenderPassCommandBufferHelper
and RenderPassCommandBufferHelper so that ContextVk::onResourceAccess
can use retainImage directly. The slightly behavior difference between
RenderPassCommandBufferHelper and OutsideRenderPassCommandBufferHelper's
retainImage is from compute shader's image access, which we are using
VkEvent to track images, mainly due to we tailor VkEvent to the
manhattan's usage case, which involves compute.
Bug: b/336844257
Change-Id: Id3fb694f683289a4720cc279387dbc27642745de
Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5686352
Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org>
Commit-Queue: Charlie Lao <cclao@google.com>
|
|
7d461b21
|
2024-07-10T14:11:53
|
|
Revert "Vulkan: Use VK_KHR_dynamic_rendering[_local_read]"
This reverts commit c379ff48043a47e444c388c45270db40d3172d50.
Reason for revert: Regresses CPU perf and memory when _not_ using DR
Original change's description:
> Vulkan: Use VK_KHR_dynamic_rendering[_local_read]
>
> Bug: angleproject:42267038
> Change-Id: I1f4eb0f309992a9c1c287a69520dadf5eff23b26
> Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5637155
> Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org>
> Reviewed-by: Amirali Abdolrashidi <abdolrashidi@google.com>
> Reviewed-by: Charlie Lao <cclao@google.com>
Bug: angleproject:42267038
Change-Id: I3865f0d86813f0eeb9085a92875a33bd449b907f
Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5691337
Bot-Commit: Rubber Stamper <rubber-stamper@appspot.gserviceaccount.com>
Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org>
|
|
c379ff48
|
2024-06-10T22:01:57
|
|
Vulkan: Use VK_KHR_dynamic_rendering[_local_read]
Bug: angleproject:42267038
Change-Id: I1f4eb0f309992a9c1c287a69520dadf5eff23b26
Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5637155
Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org>
Reviewed-by: Amirali Abdolrashidi <abdolrashidi@google.com>
Reviewed-by: Charlie Lao <cclao@google.com>
|
|
8c546d35
|
2024-06-25T12:49:40
|
|
Vulkan: Limit VkEvent for usage matters for Manhattan31 only
If we use VkEvent to track all image operations causes performance
regression on some app traces, including manhattan10 trace. This mainly
because of CPU overhead comes with VkCmdSetEvent, mostly inside vulkan
driver. These app traces likely not benefit from VkEvent because the
specific bubble (false dependency) does not manifest on these app
traces, but the CPU overhead takes a performance toll on it. In order to
strike a balance between benefit and overhead, this CL removes most of
VkEvent usage and only leaves the ones that matters for manhattan31. The
only we still keeps are generateMipmap, dispatchCompute, texture
sampling. We can always add more if more beneficial usage cases comes up
and no regression in other traces.
Bug: b/336844257
Change-Id: I346fe70bc33e57edf04e933a2db0f79738c4481d
Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5654737
Reviewed-by: Amirali Abdolrashidi <abdolrashidi@google.com>
Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org>
Commit-Queue: Charlie Lao <cclao@google.com>
|
|
5703bd61
|
2024-06-14T14:12:41
|
|
Vulkan: Further optimize ProgramExecutableVk::resetLayout
1. Handle compute pipelines similar to how we handle graphics pipelines
2. Track valid compute pipeline permutations
Bug: angleproject:8297
Change-Id: I58200517e5a44a2b3092777ea24d1529ceee00f5
Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5634574
Reviewed-by: Charlie Lao <cclao@google.com>
Commit-Queue: mohan maiya <m.maiya@samsung.com>
Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org>
|
|
06f1b72f
|
2024-06-03T08:59:46
|
|
Vulkan: Bugfix in MSRTT emulation
Transient multisampled images should have no mips. Enforce this
requirement when MSRTT is being emulated
Bug: angleproject:4836
Tests: MultisampledRenderToTexture*MultipleLevelsMultisample*
Change-Id: I6df21bbb49a4c45aa3ee321f7d49b81f55352562
Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5601347
Commit-Queue: mohan maiya <m.maiya@samsung.com>
Reviewed-by: Charlie Lao <cclao@google.com>
Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org>
|
|
b4f3824e
|
2024-05-31T11:36:32
|
|
Reland "Vulkan: Defer texture data flush until data provided for all levels"
This is a reland of commit 490c056a88a33870cb4ba2a7906b0a9688d96262
Original change's description:
> Vulkan: Defer texture data flush until data provided for all levels
>
> One of the major overhead with VkEvent is seeing with first frame where
> all textures are being specified. The immutable textures, we always
> immediately flush out the update as data provided for each level. This
> means one VkEvent is created and SetEvent is called per level. This CL
> delays the flush until data for all levels are provided, thus there is
> only one flush per texture instead of per level. With this CL asphalt_9
> is no longer timeout on bots when VkEvent is enabled.
>
> There is also another benefit comes with this CL. On all desktop GPUs,
> ASTC format texture are falling back to RGBA8. We always stage a clear
> for the emulated format. That staged clear are able to be removed if
> data is provided later. Because of we flush out staged update when first
> level data is provided, all staged clear for the subsequent levels are
> also gets flushed out, losing the chance to be removed. This CL will
> allow all staged clears being removed.
>
> Bug: b/343976993
> Bug: b/336844257
> Change-Id: Ica731ea57db771b16966f4da92ccdc551ae93d81
> Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5588816
> Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org>
> Reviewed-by: Amirali Abdolrashidi <abdolrashidi@google.com>
> Commit-Queue: Charlie Lao <cclao@google.com>
Bug: b/343976993
Bug: b/336844257
Change-Id: Iabcc1b4ebca7d6f34a0e7f109795392fc00e7eda
Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5606146
Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org>
Reviewed-by: Amirali Abdolrashidi <abdolrashidi@google.com>
Commit-Queue: Charlie Lao <cclao@google.com>
|
|
5b8e380c
|
2024-06-10T17:54:25
|
|
Vulkan: Fix bug in ImageHelper::flushSingleSubresourceStagedUpdates
There is another bug in ImageHelper flush staged update code path that
exposed by a new test I added in crrev.com/c/5606145. When we render to
a multi-layered texture and that layer we are trying to render to has a
staged clear and followed by an buffer update, and if the buffer update
overlaps with layer we try to render to but not exact match, we will
incorrectly think that the glClear call can override the buffer update.
The bug here is that ImageHelper::flushSingleSubresourceStagedUpdates is
using ImageHelper::SubresourceUpdate::isUpdateToLayers() call to decide
if buffer update will be overriden. That isUpdateToLayers is only
looking exact layer range match. So in this case because the buffer
update's layer range is bigger than glClear, it returns false. This
causes the flushSingleSubresourceStagedUpdates think it is outside the
layer range we try to render, and causes rendering bug. This CL renames
isUpdateToLayers to isLayerRangeExactMatch to reflect the actual
behavior of the function. This CL also adds new API isWithinLayerRange
and called by flushSingleSubresourceStagedUpdates to decide if the
updates can implement using renderPass loadOp.
Bug: angleproject:345532371
Bug: angleproject:42263375
Change-Id: Ia604ed1a61b56d7bde05f12a03baef8f00af2b17
Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5619730
Commit-Queue: Charlie Lao <cclao@google.com>
Reviewed-by: Amirali Abdolrashidi <abdolrashidi@google.com>
Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org>
|
|
81452425
|
2024-06-07T11:49:28
|
|
Vulkan: Fix keeping overlapped updates in flushStagedUpdatesImpl()
There is an existing bug in ImageHelper::flushStagedUpdatesImpl() that
caused webGPU test to fail when my CL crrev.com/c/5588816 landed. The
bug is that when we flush out an update, we walk through the vector
updates and if the update is outside the range of requested layer range,
we stash away the update to updatesToKeep list. We only flush out the
updates that are intersects with the requested layer range. The bug here
is that if one of the update has bigger layer range than the requested
layer range, and there is an update that intersects with that update's
layer range but not overlap with requested layer range, now that update
may incorrectly gets moved to updatesToKeep list. Later on when that
updatesToKeep list gets flushed out, you end up overwriting the image
content.
This CL adds a new function adjustLayerRange() that first walk the
updates and calculate the actual layer range that will be flushed and
then use that adjusted layer range to determine if an update should be
kept or flushed.
Bug: angleproject:345532371
Change-Id: I59ef4ec935354766d35e4cfbb6ce4b13d9a2e868
Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5607276
Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org>
Reviewed-by: Amirali Abdolrashidi <abdolrashidi@google.com>
Commit-Queue: Charlie Lao <cclao@google.com>
|
|
295ff607
|
2024-06-05T14:49:33
|
|
Vulkan: Precompute stageMask of kImageMemoryBarrierData
Right now every time we need a pipelineStage in kImageMemoryBarrierData,
we are doing a bitwise AND with
mSupportedVulkanPipelineStageMask. This get called multiple
times from barrier call. This CL adds
mImageLayoutAndMemoryBarrierDataMap that has already precomputed all
stageMask, thus avoid run time bitwise OR.
This CL also precomputes the bufferWritePipelineStageMask so that
flushImpl can be use it without construct every time.
Bug: b/345279810
Change-Id: I878bd31c967cd217477061976f07df13b043fa7f
Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5601073
Commit-Queue: Charlie Lao <cclao@google.com>
Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org>
Reviewed-by: Amirali Abdolrashidi <abdolrashidi@google.com>
|
|
87bbeaee
|
2024-06-03T15:04:25
|
|
Vulkan: Reduce VkEvent counts by using EventStage enums
Right now we are using too many VkCmdSetEvents and causes some of the
deqp tests timeout on CI bots (because of VVL is very slow along with
the number of events being used). RefCountedEvents are per ImageLayout.
But some of ImageLayous have the same VkPipelineStageFlags, for example
TransferSrc and TransferDst. This CL changes RefCountedEvent to per
unique VkPipelineStageFlags instead of per ImageLayout, thus allows
TransferSrc and TransferDst to share one VkEvent.
To do that, EventStage enum and kEventStageAndPipelineStageFlagsMap
table are added to define the predefined VkPielineStageFlags that ANGLE
uses. RefCountedEvent now keeps EventStage instead of ImageLayout. To
further reduce the CPU overhead, a customized
mPipelineStageMaskAndEventMap table is precomputed in renderer with
supported vulkan pipeline stages.
With this CL, previously timed out tests such as
KHR-GLES3.copy_tex_image_conversions.forbidden.renderbuffer_cubemap* now
passing.
Bug: b/336844257
Change-Id: I021a8f1d6112d5cf96c61652c9af5f679b1172eb
Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5597732
Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org>
Commit-Queue: Charlie Lao <cclao@google.com>
Reviewed-by: Amirali Abdolrashidi <abdolrashidi@google.com>
|
|
92f198f6
|
2024-06-06T21:42:35
|
|
Revert "Reland "Vulkan: Defer texture data flush until data provided for all levels""
This reverts commit b93af07ac1ddb9f2e262d611d155f4b63f18999f.
Reason for revert: b/345532371
Original change's description:
> Reland "Vulkan: Defer texture data flush until data provided for all levels"
>
> This is a reland of commit 490c056a88a33870cb4ba2a7906b0a9688d96262
>
> Original change's description:
> > Vulkan: Defer texture data flush until data provided for all levels
> >
> > One of the major overhead with VkEvent is seeing with first frame where
> > all textures are being specified. The immutable textures, we always
> > immediately flush out the update as data provided for each level. This
> > means one VkEvent is created and SetEvent is called per level. This CL
> > delays the flush until data for all levels are provided, thus there is
> > only one flush per texture instead of per level. With this CL asphalt_9
> > is no longer timeout on bots when VkEvent is enabled.
> >
> > There is also another benefit comes with this CL. On all desktop GPUs,
> > ASTC format texture are falling back to RGBA8. We always stage a clear
> > for the emulated format. That staged clear are able to be removed if
> > data is provided later. Because of we flush out staged update when first
> > level data is provided, all staged clear for the subsequent levels are
> > also gets flushed out, losing the chance to be removed. This CL will
> > allow all staged clears being removed.
> >
> > Bug: b/343976993
> > Bug: b/336844257
> > Change-Id: Ica731ea57db771b16966f4da92ccdc551ae93d81
> > Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5588816
> > Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org>
> > Reviewed-by: Amirali Abdolrashidi <abdolrashidi@google.com>
> > Commit-Queue: Charlie Lao <cclao@google.com>
>
> Bug: b/343976993
> Bug: b/336844257
> Change-Id: Ie987582a44e0d73abd38ce8f6813ff8995e907e2
> Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5597810
> Reviewed-by: Cody Northrop <cnorthrop@google.com>
> Commit-Queue: Charlie Lao <cclao@google.com>
Bug: b/343976993
Bug: b/336844257
Change-Id: I9356da6b4cdb21dba47758d6e937d1ae02f0ae34
Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5606144
Commit-Queue: Charlie Lao <cclao@google.com>
Reviewed-by: Charlie Lao <cclao@google.com>
Bot-Commit: Rubber Stamper <rubber-stamper@appspot.gserviceaccount.com>
|
|
b93af07a
|
2024-05-31T11:36:32
|
|
Reland "Vulkan: Defer texture data flush until data provided for all levels"
This is a reland of commit 490c056a88a33870cb4ba2a7906b0a9688d96262
Original change's description:
> Vulkan: Defer texture data flush until data provided for all levels
>
> One of the major overhead with VkEvent is seeing with first frame where
> all textures are being specified. The immutable textures, we always
> immediately flush out the update as data provided for each level. This
> means one VkEvent is created and SetEvent is called per level. This CL
> delays the flush until data for all levels are provided, thus there is
> only one flush per texture instead of per level. With this CL asphalt_9
> is no longer timeout on bots when VkEvent is enabled.
>
> There is also another benefit comes with this CL. On all desktop GPUs,
> ASTC format texture are falling back to RGBA8. We always stage a clear
> for the emulated format. That staged clear are able to be removed if
> data is provided later. Because of we flush out staged update when first
> level data is provided, all staged clear for the subsequent levels are
> also gets flushed out, losing the chance to be removed. This CL will
> allow all staged clears being removed.
>
> Bug: b/343976993
> Bug: b/336844257
> Change-Id: Ica731ea57db771b16966f4da92ccdc551ae93d81
> Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5588816
> Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org>
> Reviewed-by: Amirali Abdolrashidi <abdolrashidi@google.com>
> Commit-Queue: Charlie Lao <cclao@google.com>
Bug: b/343976993
Bug: b/336844257
Change-Id: Ie987582a44e0d73abd38ce8f6813ff8995e907e2
Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5597810
Reviewed-by: Cody Northrop <cnorthrop@google.com>
Commit-Queue: Charlie Lao <cclao@google.com>
|
|
170851ff
|
2024-06-04T09:01:11
|
|
Revert "Vulkan: Defer texture data flush until data provided for all levels"
This reverts commit 490c056a88a33870cb4ba2a7906b0a9688d96262.
Reason for revert: breaks win-trace
https://ci.chromium.org/ui/p/angle/builders/ci/win-trace/6014/overview
Original change's description:
> Vulkan: Defer texture data flush until data provided for all levels
>
> One of the major overhead with VkEvent is seeing with first frame where
> all textures are being specified. The immutable textures, we always
> immediately flush out the update as data provided for each level. This
> means one VkEvent is created and SetEvent is called per level. This CL
> delays the flush until data for all levels are provided, thus there is
> only one flush per texture instead of per level. With this CL asphalt_9
> is no longer timeout on bots when VkEvent is enabled.
>
> There is also another benefit comes with this CL. On all desktop GPUs,
> ASTC format texture are falling back to RGBA8. We always stage a clear
> for the emulated format. That staged clear are able to be removed if
> data is provided later. Because of we flush out staged update when first
> level data is provided, all staged clear for the subsequent levels are
> also gets flushed out, losing the chance to be removed. This CL will
> allow all staged clears being removed.
>
> Bug: b/343976993
> Bug: b/336844257
> Change-Id: Ica731ea57db771b16966f4da92ccdc551ae93d81
> Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5588816
> Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org>
> Reviewed-by: Amirali Abdolrashidi <abdolrashidi@google.com>
> Commit-Queue: Charlie Lao <cclao@google.com>
Bug: b/343976993
Bug: b/336844257
Change-Id: I25854b855334c4cac1c2b40467d8e2ecb7661b8f
No-Try: true
Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5593935
Auto-Submit: Yuly Novikov <ynovikov@chromium.org>
Bot-Commit: Rubber Stamper <rubber-stamper@appspot.gserviceaccount.com>
Commit-Queue: Rubber Stamper <rubber-stamper@appspot.gserviceaccount.com>
|
|
490c056a
|
2024-05-31T11:36:32
|
|
Vulkan: Defer texture data flush until data provided for all levels
One of the major overhead with VkEvent is seeing with first frame where
all textures are being specified. The immutable textures, we always
immediately flush out the update as data provided for each level. This
means one VkEvent is created and SetEvent is called per level. This CL
delays the flush until data for all levels are provided, thus there is
only one flush per texture instead of per level. With this CL asphalt_9
is no longer timeout on bots when VkEvent is enabled.
There is also another benefit comes with this CL. On all desktop GPUs,
ASTC format texture are falling back to RGBA8. We always stage a clear
for the emulated format. That staged clear are able to be removed if
data is provided later. Because of we flush out staged update when first
level data is provided, all staged clear for the subsequent levels are
also gets flushed out, losing the chance to be removed. This CL will
allow all staged clears being removed.
Bug: b/343976993
Bug: b/336844257
Change-Id: Ica731ea57db771b16966f4da92ccdc551ae93d81
Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5588816
Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org>
Reviewed-by: Amirali Abdolrashidi <abdolrashidi@google.com>
Commit-Queue: Charlie Lao <cclao@google.com>
|
|
0d772ebe
|
2024-05-21T16:41:41
|
|
Vulkan: Cleanup releaseToExternal/acquireFromExternal API
This CL addresses some feedback from the earlier CLs.
Bug: b/337135577
Change-Id: I90c26a9374254af69bf00eb6580ce9580b71ca5a
Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5561465
Reviewed-by: Yuxin Hu <yuxinhu@google.com>
Commit-Queue: Charlie Lao <cclao@google.com>
Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org>
|
|
1a9a703b
|
2024-05-21T11:14:40
|
|
Vulkan: Add DeviceQueueIndex to Context/BufferHelper/ImageHelper
This CL adds a utility class DeviceQueueIndex, which encapsulates
queueFamilyIndex and the queueIndex into one integer value so that we
can pass around to barrier function. vk::Context and BufferHelper and
ImageHelper class now keeps mCurrentDeviceQueueIndex instead of
mCurrentQueueFamilyIndex. For All contexts by default it gets the
default queue from renderer (which is always the one corresponding to
Medium priority). For ContextVk, when priority changes it update
mCurrentDeviceQueueIndex to match new context priority.
Bug: b/337135577
Change-Id: I62cc483cfdb3e974d38db074e671c57299300074
Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5555903
Reviewed-by: Yuxin Hu <yuxinhu@google.com>
Commit-Queue: Charlie Lao <cclao@google.com>
Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org>
|
|
b22cce5f
|
2024-05-21T10:55:27
|
|
Vulkan: Remove Renderer::getDeviceQueueIndex
Renderer::getDeviceQueueIndex() returns queueFamilyIndex. There is a
function that already returns mCurrentQueueFamilyIndex, so this function
is now removed.
This CL also renames ImageHelper::isQueueChangeNeccesary to
isQueueFamilyChangeNeccesary
Bug: b/337135577
Change-Id: I3cd9ded1414d1389e162aaa5399c231a987f871e
Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5553067
Commit-Queue: Charlie Lao <cclao@google.com>
Reviewed-by: Yuxin Hu <yuxinhu@google.com>
Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org>
|
|
7cbac5de
|
2024-05-16T10:01:41
|
|
Vulkan: Switch RefCountedEventCollector to use std::deque
Right now it is using std::vector. The vector could potentially grow
big. The grow is costly with vector when storage has to resize. This CL
switches it to use std::deque.
Bug: b/293297177
Bug: b/336844257
Change-Id: Ia407e7d4e1b68bd0cb7cf76ccc5e7523c0e83415
Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5545882
Reviewed-by: Roman Lavrov <romanl@google.com>
Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org>
Reviewed-by: Yuxin Hu <yuxinhu@google.com>
Commit-Queue: Charlie Lao <cclao@google.com>
|
|
f82d812d
|
2024-05-16T15:46:05
|
|
Vulkan: Fix EventBarrier bug when asyncSubmission is enabled
Since we have moved RefCountedEvent garbage collection into
ShareGroupVk, we have changed the reference counting to use non-atomic.
There is a bug with async submission code path where the
executeSetEvents gets called from submission thread which does not have
share group lock. This CL fixes this bug by storing VkEvent in
RenderPassCommandBufferHelper so that executeSetEvents uses VkEvent
instead of RefCountedEvent when async submission is enabled.
This CL also adds assertion that RefCountedEvent::releaseImpl does not
get called from async submission thread.
Bug: b/336844257
Change-Id: Ifcbd5a09d2bc7636cc15b2c6728dbbca103d4d9c
Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5544449
Reviewed-by: Amirali Abdolrashidi <abdolrashidi@google.com>
Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org>
Commit-Queue: Charlie Lao <cclao@google.com>
|
|
0636b509
|
2024-05-06T12:36:20
|
|
Vulkan: Move RefCountedEvent GC and recycler to ShareGroupVk (2/3)
One of the problem we had with RefCountedEvents is CPU overhead comes
with it, and some part of the CPU overhead is due to atomic reference
counting. The RefCountedEvents are only used by ImageHelper and
ImageHelpers are per share group, so they are already protected by front
end context share lock. The only reason we needs atomic here is due to
garbage cleanup, which runs in separate thread and will decrement the
refCount. The idea is to move that garbage list from RendererVk to
ShareGroupVk so that access of RefCountedEvents are all protected
already, thus we can remove the use of atomic. The down side with this
approach is that a share group will hold onto its event garbage and not
available for other context to reuse. But VkEvents are expected to be
very light weighted objects, so that should be acceptable.
This is the second CL in the series. In this CL, we added
RefCountedEventsGarbageRecycler to the ShareGroupVk which is responsible
to garbage collect and recycle RefCountedEvent. Since most of
ImageHelper code have only access to Context argument, for convenience
we also stored the RefCountedEventsGarbageRecycler pointer in the
vk::Context for easy access. vk::Context argument is also passed to
RefCounteEvent::init and release function so that it has access to the
recycler. The garbage collection happens when RefCountedEvent is needed.
The per renderer recycler is still kept to hold the RefCounteEvents that
gets released from ShareGroupVk or when it is released without access to
context information.
Bug: b/336844257
Change-Id: I36fe5d1c8dacdbe35bb2d380f94a32b9b72bbaa5
Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5529951
Commit-Queue: Charlie Lao <cclao@google.com>
Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org>
Reviewed-by: Amirali Abdolrashidi <abdolrashidi@google.com>
|
|
2e0aefe9
|
2024-05-06T11:03:19
|
|
Vulkan: Move RefCountedEvent GC and recycler to ShareGroupVk (1/3)
One of the problem we had with RefCountedEvents is CPU overhead comes
with it, and some part of the CPU overhead is due to atomic reference
counting. The RefCountedEvents are only used by ImageHelper and
ImageHelpers are per share group, so they are already protected by front
end context share lock. The only reason we needs atomic here is due to
garbage cleanup, which runs in separate thread and will decrement the
refCount. The idea is to move that garbage list from RendererVk to
ShareGroupVk so that access of RefCountedEvents are all protected
already, thus we can remove the use of atomic. The down side with this
approach is that a share group will hold onto its event garbage and not
available for other context to reuse. But VkEvents are expected to be
very light weighted objects, so that should be acceptable (If not, we
can add some limit to the number of events it can hold in the garbage
list).
This is the first CL in the series. Before this CL, the RefCounteEvents
are garbage collected at flushToPrimrary time, at which time we have
lost ContextVk information. In order for us to do garbage collect to
ShareGroupVk, we need to move the garbage collection process early,
before command buffers leaving ContextVk's visibility. For
OutsideRenderPassCommands, this is easy to do, we just call
flushSetEvents before we call mRenderer->flushRenderPassCommands. For
RenderPassCommands, that flushSetEvents call will simply make another
copy of RefCountedEvents and add to the garbage list and the actual
VkCmdSetEvents are defered at the executeSetEvents call that get called
from flushToPrimrary time.
Bug: b/336844257
Change-Id: I1948cd8240ff61d407931083b7584a54b1dc6b0d
Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5517891
Reviewed-by: Amirali Abdolrashidi <abdolrashidi@google.com>
Commit-Queue: Charlie Lao <cclao@google.com>
Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org>
|
|
3d04180c
|
2024-05-02T17:30:02
|
|
Vulkan: Add a dedicated garbage list for RefCountedEvents
Previously each individual RefCountedEvent is wrapped into a
GarbageObject. That is the reason behind RefCountedEvent being a
subclass from WrappedObject, since vk::GetGarbage call requires garbage
object is a subclass of WrappedObject. This CL adds a new garbage list
dedicated for RefCountedEvents, named mRefCountedEventGarbageList. With
this new change, we no longer limited by the vk::GetGarbage requirements
since it no longer called for RefCountedEvents. RefCountedEventCollector
is a vector of RefCountedEvents and every time a RefCountedEvent needs
to be released, it adds into the collector. Then the event collector
entire thing is treated like a garbage object and gets ResourceUse
tracked and added into mRefCountedEventGarbageList. This list gets
walked and cleaned when GPU is completed. This CL is also a preparation
for later CLs that adds event recycle support.
Bug: b/336844257
Change-Id: I4eff69b66922dfe5521b6994f240e967ff3726bd
Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5516458
Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org>
Commit-Queue: Charlie Lao <cclao@google.com>
Reviewed-by: Amirali Abdolrashidi <abdolrashidi@google.com>
|
|
c3a1cae4
|
2024-04-15T14:58:55
|
|
Use angle::SimpleMutex everywhere in libGLESv2
Only cases left that use std::mutex are:
- Share group and the context ErrorSet mutexes as they need try_lock()
- Anywhere mutexes are used in conjunction with std::condition_variables
(as they explicitly require std::mutex)
Bug: angleproject:8667
Change-Id: Ib6d68938b0886f9e7c43e023162557990ecfb300
Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5453294
Reviewed-by: Roman Lavrov <romanl@google.com>
Reviewed-by: Charlie Lao <cclao@google.com>
Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org>
|
|
28d4c3eb
|
2024-05-03T13:00:03
|
|
Vulkan: Remove BarrierType argument from ImageHelper::barrierImpl
Originally I passed the barrierType argument into barrierImpl function
to force pipelineBArrier in a few edge cases. Notably when we do a
oneoff submission. In that case if we have an event that is pending
setEvent call, and we will end up inserting a waitEvent call on an event
that has not yet set, which is bad. But this really won't not happen.
This CL removed BarrierType from the API and added a few assertions in
the caller to ensure we do not have a valid event when doing a oneoff
submission (if there is a pending setEvent, mCurrentEvent will be
valid).
Bug: b/336844257
Change-Id: I7161b844fc1f36993cf7ff6c90a070d9f92930cc
Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5512878
Reviewed-by: Amirali Abdolrashidi <abdolrashidi@google.com>
Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org>
Commit-Queue: Charlie Lao <cclao@google.com>
|
|
5eb3bca0
|
2024-05-01T11:47:34
|
|
Vulkan: Minor cleanup
Bug: b/336844257
Change-Id: I8d93c6dd814a666debf9990d151cad79c45469f1
Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5503645
Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org>
Commit-Queue: Charlie Lao <cclao@google.com>
|
|
d30e4772
|
2024-02-02T14:13:33
|
|
Vulkan: Add VkCmdWaitEvents for image barriers
This CL add EventBarrierArray (sister class of PipelineBarrierArray)
that accumulates the event barriers instead of pipeline barriers.
ImageHelper::barrierImpl and ImageHelper::updateLayoutAndBarrier has
been updated to have a code path that inserts waiting event to
EventBarrierArray. PipelineBarrier code path is still kept and is also
used when event is invalid or under certain situation as a fallback
method from waitEvent. After we generate barrier (regardless it is
pipelineBarrier or eventBarrier, we always release
ImageHelper::mCurrentEvent. When next barrier/layout call is made, if we
see mCurrentEvent is invalid, we always fallback to pipelineBarrier.
This way it is safe that if somehow we did not (intentionally or
accidentally) insert a new event between two barrier calls, we will not
end up with second barrier call wait for old event which creates
synchronization hazard. With this approach, second barrier will use
pipelineBarrier which is still safe.
In this CL the useVkEventForImageBarrier feature flag is still disabled,
so no events are created and thus pipelineBarrier is still used.
Bug: b/336844257
Change-Id: Idaf5a7200b85f901eae5d376543f189d21522022
Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5263701
Reviewed-by: Geoff Lang <geofflang@chromium.org>
Commit-Queue: Charlie Lao <cclao@google.com>
Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org>
|
|
a96e9197
|
2024-04-25T10:35:02
|
|
Vulkan: Add RefCountedEvent class and VkCmdSetEvent call
This CL defines RefCountedEvent class that adds reference counting to
VkEvent. CommandBufferHelper and ImageHelper each holds one reference
count to the event. Every time an event is added to the command buffer,
the corresponding RefCountedEvent will be added to the garbage list
which tracks the GPU completion using ResourceUse. That event garbage's
reference count will not decremented until GPU is finished, thus ensures
we never destroy a VkEvent until GPU is completed.
For images used by RenderPassCommands, As
RenderPassCommandBufferHelper::imageRead and imageWrite get called, an
event with that layout gets created and added to the image. That event
is saved in RenderPassCommandBufferHelper::mRefCountedEvents and that
VkCmdSetEvents calls are issued from
RenderPassCommandBufferHelper::flushToPrimary(). For renderPass
attachments, the events are created and added to image when attachment
image gets finalized.
For images used in OutsideRenderPassCommands, The events are inserted as
needed as we generates commands that uses image. We do not wait until
commands gets flushed to issue VkCmdSetEvent calls. A convenient
function trackImageWithEvent() is added to create and setEvent and add
event to image all in one call. You can add this call after the image
operation whenever we think it benefits, which gives us better control.
(Note: Even if forgot to insert the trackImageWithEvent call, it is
still okay since every time barrier is inserted, the event gets
released. Next time when we inserts barrier again we will fallback to
pipelineBarrier since there is no event associated with it. But that is
next CL's content).
This CL only adds the VkCmdSetEvent call when feature flag is enabled.
The feature flag is still disabled and no VkCmdWaitEvent is used in this
CL (will be added in later CL).
Bug: b/336844257
Change-Id: Iae5c4d2553a80f0f74cd6065d72a9c592c79f075
Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5490203
Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org>
Reviewed-by: Geoff Lang <geofflang@chromium.org>
Commit-Queue: Charlie Lao <cclao@google.com>
|
|
caebfea1
|
2024-04-24T16:58:39
|
|
Vulkan: Make PipelineBarrierArray a class
Right now PipelineBarrierArray is just a
angle::PackedEnumMap<PipelineStage, PipelineBarrier>. To make iterate
over barrierArray faster we added mPipelineBarrierMask. They are not
encapsulated well. This CL makes PipelineBarrierArray a class which
internally tracks mPipelineBarrierMask bits. We also moved
pipelineBarrier related code into this class. This is a preparation for
the later CL that we will have a EventBarrierArray so that the
pipelineBarrier and eventBarrier are better separated.
Bug: b/336844257
Change-Id: I6002e3fdd584d5a63587f68f13a260b417b3db32
Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5484711
Commit-Queue: Charlie Lao <cclao@google.com>
Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org>
|
|
298b8739
|
2024-04-15T13:58:01
|
|
Vulkan: Restrict the ContextVk dependency in CommandBufferHelper
Updating the interfaces that have no need for ContextVk state and
instead passing in the vk::Context.
Bug: angleproject:8544
Signed-off-by: Gowtham Tammana <g.tammana@samsung.com>
Change-Id: Id3b72d9eabb7d1d6ee89c46cdc24a23da9e32b5c
Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5492319
Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org>
Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org>
|
|
1064b38d
|
2024-04-24T14:41:59
|
|
Vulkan: Move ImageLayout define to earlier in vk_helpers.h file
This is the preparation for the later CL. Later in the CL I need to
define something like "angle::PackedEnumMap<ImageLayout,
RefCountedEvent>" to use in CommandBufferHelperCommon. So I need
ImageLayout to be defined earlier. No actual changes involved other than
moving the code around for easier code review.
Bug: b/336844257
Change-Id: I0a2553ef3aebf4b00d1798e2117fe6ed32ac172d
Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5484675
Commit-Queue: Charlie Lao <cclao@google.com>
Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org>
|
|
d4abe622
|
2024-04-03T17:46:38
|
|
CL/VK: Implement enqueue NDRangeKernel & Task
Adding support for:
clEnqueueNDRangeKernel
clEnqueueTask
Bug: angleproject:8631
Change-Id: If57002be3ea00a55215e89ca47ab8fe9a422c6e7
Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5406614
Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org>
Commit-Queue: Austin Annestrand <a.annestrand@samsung.com>
Reviewed-by: Geoff Lang <geofflang@chromium.org>
|
|
103c1b53
|
2024-03-29T14:37:23
|
|
Vulkan: Drop MSRTT emulation dependency on independentResolveNone
Usage of VK_RESOLVE_MODE_NONE was removed in [1], but dependency to this
property was accidentally added in [2].
[1]: https://chromium-review.googlesource.com/c/angle/angle/+/2743666
[2]: https://chromium-review.googlesource.com/c/angle/angle/+/3353895.
Bug: angleproject:4836
Change-Id: I25028b5d343686edd794acdac3714c4a6cb5fa17
Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5407073
Auto-Submit: Shahbaz Youssefi <syoussefi@chromium.org>
Reviewed-by: Charlie Lao <cclao@google.com>
Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org>
Reviewed-by: Yuxin Hu <yuxinhu@google.com>
|
|
b559efa8
|
2024-03-26T22:02:41
|
|
Vulkan: Allow depth and stencil resolve to be separately added
In preparation for optimizing resolve through glBlitFramebuffer for
depth/stencil attachments.
Bug: angleproject:7551
Change-Id: I57650d82c0cc6e56f44591eadfc42ac794cfef09
Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5399140
Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org>
Reviewed-by: Amirali Abdolrashidi <abdolrashidi@google.com>
Reviewed-by: Yuxin Hu <yuxinhu@google.com>
|
|
9475ac40
|
2023-11-15T10:25:06
|
|
Vulkan: Make efficient MSAA resolve possible
Prior to this change, using a resolve attachment to implement resolve
through glBlitFramebuffer was done by temporarily modifying the source
FramebufferVk's framebuffer description. This caused a good deal of
complexity; enough to require the render pass to be immediately closed
after this optimization.
The downsides to this are:
- Only one attachment can be efficiently resolved
- There is no chance for the MSAA attachment to be invalidated
In this change, resolve attachments that are added because of
glBlitFramebuffer are stored in the command buffer, with the
FramebufferVk completely oblivious to them. When the render pass is
closed, either the FramebufferVk's original framebuffer object is used
(if no resolve attachments are added) or a temporary one is created to
include those resolve attachments.
With the above method, the render pass is able to accumulate many
resolve attachments as well as have its MSAA attachments be invalidated
before it is flushed.
For a FramebufferVk that is resolved in this way, there used to be two
framebuffers created each time and thrown away as the code alternated
between starting a render pass without a resolve attachment and then
closing with one. With this change, there is now one framebuffer
(without resolve attachments) that is cached in FramebufferVk (and is
not recreated every time), and only the framebuffer with resolve
attachments is recreated every time.
Ultimatley, when VK_KHR_dynamic_rendering is implemented in ANGLE, there
would be no framebuffers to create and destroy, and this change paves
the way for that support too.
WindowSurfaceVk framebuffers are still imagefull. Making them imageless
adds unnecessary complication with no benefit.
-----------------
To achieve efficient MSAA rendering on tiling hardware, applications
should do the following:
```
glBindFramebuffer(GL_FRAMEBUFFER, msaaFBO);
// Clear the framebuffer to avoid a load
// Or invalidate, if not needed to load:
// glInvalidateFramebuffer(GL_DRAW_FRAMEBUFFER, ...);
glClear(...);
// Draw calls
// Resolve into the single sampled framebuffer
glBindFramebuffer(GL_DRAW_FRAMEBUFFER, resolveFBO);
glBlitFramebuffer(...);
// Immediately discard the contents of the MSAA buffer, to avoid store
glInvalidateFramebuffer(GL_READ_FRAMEBUFFER, ...);
```
The above would translate to the following Vulkan render pass:
- MSAA LOAD_OP_CLEAR/DONT_CARE
- MSAA STORE_OP_DONT_CARE
- Resolve LOAD_OP_DONT_CARE
- Resolve STORE_OP_STORE
This makes sure the MSAA data doesn't leave the tile memory and greatly
reduces bandwidth usage.
Once anglebug.com/4892 is fixed, this would also allow the MSAA image
to never be allocated either.
Bug: angleproject:7551
Bug: angleproject:8625
Change-Id: Ia9f4d20863d76a013d8495033f95c7b39f77e062
Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5388492
Reviewed-by: Yuxin Hu <yuxinhu@google.com>
Reviewed-by: Amirali Abdolrashidi <abdolrashidi@google.com>
Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org>
|
|
60aaf4a0
|
2024-03-14T12:58:56
|
|
Vulkan: Move renderer to namespace vk
This class is agnostic of EGL. This change moves it to namespace vk for
use with the OpenCL implementation
Bug: angleproject:8564
Change-Id: I57f7807d6af8b3d5d7f8efbaf8b5d537a930f881
Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5371324
Reviewed-by: Austin Annestrand <a.annestrand@samsung.com>
Reviewed-by: Geoff Lang <geofflang@chromium.org>
Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org>
|
|
d4d34781
|
2024-03-14T13:03:55
|
|
Multisampling support check: sampleCounts > 1 and createFlags
At least two drivers are returning VK_SUCCESS from
vkGetPhysicalDeviceImageFormatProperties2
but also set sampleCounts to 1 which supposedly means no MSRTT
Qualcomm reference device driver fails vkCreateImageView
when enabling the multisampling bit one cubemaps which have
sampleCounts == 1
Additionally,
* include vk::GetMinimalImageCreateFlags() in createFlags - we don't get
the cubemap bit without that
* check both the image format and the additional view format
(linear+srgb) as we set both of these when creating the image
This fixes a bunch of cubemap and 3D tests on Qualcomm reference device
Bug: b/329286011
Change-Id: I6d3ddea0cd997cf37b503050063f42d69723bd50
Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5372826
Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org>
Commit-Queue: Roman Lavrov <romanl@google.com>
|
|
91ddf851
|
2024-03-03T10:57:22
|
|
Vulkan: support QCOM foveated rendering extensions
Add support for foveated rendering in the vulkan backend.
This is done by leveraging the VK_KHR_fragment_shading_rate extension.
Bug: angleproject:8484
Change-Id: I0d01d07583f710b2302ea07b19c9d113c73bfe41
Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5269907
Reviewed-by: Charlie Lao <cclao@google.com>
Commit-Queue: mohan maiya <m.maiya@samsung.com>
Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org>
|
|
2ee295b4
|
2024-02-15T11:27:39
|
|
Vulkan: Add per-level image update tracker
* Add a per-level image write tracker to ImageHelper.
* It tracks the updates scheduled for different parts of the image.
Within each level, it also tracks different layers, currently up
to 64.
* kMaxParallelSubresourceUpload renamed to kMaxParallelLayerWrites;
moved to vk_helper header.
* It is reset when a barrier is issued for the image.
* Modified ImageHelper::recordWriteBarrier().
* Added isWriteBarrierNecessary().
* Now it checks the added writes for the image. It will no longer
issue a barrier if the image is in the same layout and there is
no write to a part of the image to which was previously written.
* Added ReadImageSubresources to CommandBufferAccess.
* It is used for layouts that allow both reading and writing to the
image (including self-copy):
* TransferSrcDst (used in CopyImageSubData)
* ComputeShaderWrite (used in compute-based mipmap generation)
* CommandBufferImageWrite -> CommandBufferImageSubresourceAccess
* Updated onImageSelfCopy() args to include read subresource data.
* Improves gpu_time for TextureUploadETC2TranscodingBenchmark perf test
* Windows/NVIDIA: ~180609 ns -> ~62669 ns (~2.88x)
* Linux/NVIDIA: ~157283 ns -> ~93360 ns (~1.68x)
* Windows/Intel: ~72297 ns -> ~57153 ns (~1.27x)
* Added a test to show that self-copy for a write-after-read works.
* ArraySelfCopyImageSubDataWithWriteAfterRead
* (ArraySelfCopyImageSubData covers RAW hazards; renamed)
Bug: b/308455694
Change-Id: I5cef296d991ce6ec02792edc3ffc5cc4994831e1
Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5301855
Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org>
Reviewed-by: Charlie Lao <cclao@google.com>
Commit-Queue: Amirali Abdolrashidi <abdolrashidi@google.com>
|
|
3c517e45
|
2024-02-14T14:26:42
|
|
Vulkan: Process ClearEmulatedChannels update first
* When going through the level updates in flushStagedUpdates(), the
ClearEmulatedChannels updates are expected to be before the rest. In
addition, there can be only one such update in the level update list.
Therefore, now they are processed and applied before the rest of the
updates. By doing so, if this is the only update for the image, an
unnecessary layout transition can be avoided.
* Added flushStagedClearEmulatedChannelsUpdates().
* Added flushStagedUpdatesImpl() for the rest of the update types.
* Used clipLevelToUpdateListUpperLimit() to limit the flush loops to
the number of levels in subresource update list.
* Added unit test to ensure updates after ClearEmulatedChannels are not
ignored.
* ImageTestES3.IncompleteRGBXAHBImportThenUploadThenEnd
* The test contains a ClearEmulatedChannels followed by an image
update. If the latter is ignored in this test, there is a failure
during teardown due to orphanNonEmptyBufferBlock when destroying
the buffer that contains the update.
Bug: b/308455694
Change-Id: I53c73acb60a9c5440548886cde913112a664402d
Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5297317
Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org>
Reviewed-by: Charlie Lao <cclao@google.com>
Commit-Queue: Amirali Abdolrashidi <abdolrashidi@google.com>
|
|
bcf814fd
|
2024-02-02T10:30:34
|
|
Vulkan: Constrain the dependency on ContextVk in BufferHelper
Make the BufferHelper interface be not dependent on ContextVk state.
This makes the interface to be suitable for implementation of other APIs
with Vulkan backend. Any dependency on ContextVk is made explicit and
handled in ContextVk.
Bug: angleproject:8544
Change-Id: I8b285f54c8758a26dd7edf27b1371f9afcf7e241
Signed-off-by: Gowtham Tammana <g.tammana@samsung.com>
Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5303573
Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org>
Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org>
Reviewed-by: Geoff Lang <geofflang@chromium.org>
|
|
eaddd3ba
|
2024-02-09T15:39:43
|
|
Vulkan: use linear chroma filter for ycbcr by default
This aligns our choice of initial chroma filter with the initial sampler
state in a texture object with target GL_TEXTURE_EXTERNAL_OES.
It is still possible to confuse the backend in some edge cases (which
will be addressed with later patches), but at least we have the default
right.
Bug: b/315387961
Test: atest CtsNativeHardwareTestCases
Test: atest CtsMediaDecoderTestCases
Change-Id: I2430a084a95010c7c5084cd858d4255e531e6f13
Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5278362
Commit-Queue: Chris Forbes <chrisforbes@google.com>
Reviewed-by: Charlie Lao <cclao@google.com>
|
|
c603a4f1
|
2024-02-08T10:53:27
|
|
Don't perf warn about ETC1->ETC2 emulation as it is efficient
Format is forwards compatible:
https://crsrc.org/c/third_party/angle/src/libANGLE/renderer/gl/formatutilsgl.cpp;drc=21f16cb16333802dfa942d67cac59885f904301d;l=701
Added hasInefficientlyEmulatedImageFormat() helper
Bug: b/302115557
Change-Id: Ibc82c27ecf4e3afbfaac52cb45bdda776c50b4b3
Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5278562
Commit-Queue: Roman Lavrov <romanl@google.com>
Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org>
Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org>
|
|
5a061558
|
2024-01-31T13:05:59
|
|
Vulkan: Update dynamic buffer size policy
When allocating a dynamic buffer, it is checked if the new data can
fit in an existing allocation. However, if the size of the new data
exceeds that of the current buffer, a new one is allocated. To avoid
using too much memory, if the data size is less than a threshold (a
fraction of the current buffer size, a smaller size will be used for
the new buffer.
However, with a specific pattern for the new sizes, combined with the
threshold value, there could be many allocations and deallocations,
which can affect the performance.
In this CL, the policy to update the dynamic buffer size is updated to
avoid this issue.
* Instead of using a smaller buffer when the required size is less than
1/4 of the current buffer size, it is done when the average required
size is less than 1/8 of the current size.
* Added a decaying average required size for the DynamicBuffer object.
* mSizeInRecentHistory
* For each new buffer allocation, the new required size is used with
the average size to calculate the new average.
* For each calculation, kDecayCoeffPercent is used as the weight for
the existing average, and the rest is the new required size, plus
rounding.
* kDecayCoeffPercent is currently set to 20%.
* sizeIgnoringHistory renamed to minRequiredBlockSize for more clarity.
Bug: b/322216767
Change-Id: Idcabbbe50f656910fe2103925e4d6d8602ca3425
Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5254218
Commit-Queue: Amirali Abdolrashidi <abdolrashidi@google.com>
Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org>
Reviewed-by: Charlie Lao <cclao@google.com>
|
|
6367f541
|
2024-01-25T10:16:04
|
|
Vulkan: supply YcbcrConversionDesc earlier
Previously, the AHB import path would allow ImageHelper to build a bogus
YcbcrConversionDesc (in initExternal) and then later overwrite it with
what it wanted. The intermediate state was not necessarily valid, and
could cause assertion failures and VVL errors.
Instead, have ImageHelper clients provide the conversion they want
upfront. In the non-external case, build an appropriate conversion
for formats which need them, before delegating to initExternal.
Bug: b/315387961
Change-Id: Icc8f561bb2de0289ceec56d41978b8c4651a47a2
Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5232769
Reviewed-by: Charlie Lao <cclao@google.com>
Commit-Queue: Chris Forbes <chrisforbes@google.com>
Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org>
|
|
ed2a8ef8
|
2023-12-20T00:06:10
|
|
Vulkan: Defer QFOT when acquiring texture with GL_NONE layout
Instead of issuing a queue family ownership transfer with the UNDEFINED
layout (and then hack its dst layout to be GENERAL), this change simply
lets the queue family be changed when the image is next accessed (at
which point a layout transition is necessary anyway).
Bug: angleproject:8464
Change-Id: Iab36af0c641bd04029bdc0d9097e766e8a0f4145
Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5138657
Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org>
Reviewed-by: Charlie Lao <cclao@google.com>
|
|
40f4de8f
|
2023-12-15T10:17:32
|
|
Vulkan: Ensure we use cached memory for readPixels stagingBuffer
Previous CL crrev.com/c/5112759 does not solve the performance issue for
ChromeOS. The reason is that on more recent intel GPU, there is no
hostVisibleCachedCoherent heap. When we allocate staging buffer, we
specify CachedCoherent as the preferredFlags instead of requiredFlags.
This means we still end up getting UncachedCoherent since VMA tries to
respect coherent bits as first priority. This CL Changes CachedCoherent
to CachedPreferCoherent, and made Cached as required bit, thus ensures
the memory allocated is cached. Since coherent bit may not be honored,
thus we have to call invalidate/flush (which underline implementation
will check the bit and early out if no need).
Somehow on ARM GPU using cachedNonCoherent staging buffer causing many
test failures, even though we do call invalidate() after allocation, and
tests pass on all other GPUs. It almost indicates ARM driver have a bug
with invalidate() that it is not doing expected. But before I can be
sure and fixed, I added feature bit to keep ARM the old behavior, which
uses UnCached memory for readPixels which should suffer the performance
as well.
Bug: b/315836169
Bug: b/310701311
Change-Id: I1eec6105ce74275faa893b0206be8470f0cde72f
Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5122318
Commit-Queue: Charlie Lao <cclao@google.com>
Auto-Submit: Charlie Lao <cclao@google.com>
Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org>
|
|
de591cff
|
2023-12-11T13:15:30
|
|
Vulkan: Add CachedCoherent staging buffer
Right now if we allocate a coherent staging buffer, it always uncached.
I believe the reason it picked uncached is that most usage for staging
buffer is data flow from CPU to GPU. CPU only sequentially write into
staging buffer. Uncached may has better performance here due to write
combined. But this performs horrible if CPU ever read from it. This CL
adds a CachedCoherent staging buffer and let staging buffer use that for
coherent memory. UncachedCoherent is currently not used, but I still
kept here in case we find regression for certain type of usage.
Bug: b/315836169
Change-Id: Ica331914c1f4729baa9d2eab048dc3099a2887b5
Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5112759
Reviewed-by: Amirali Abdolrashidi <abdolrashidi@google.com>
Commit-Queue: Charlie Lao <cclao@google.com>
|
|
d0eb968d
|
2023-12-08T16:11:46
|
|
Vulkan: Fix the AHB leak for AHB backed buffer object
For client buffer backed OpenGL buffer object, we call
InitAndroidExternalMemory which calls AHB acquire. But when buffer
object is released/destroyed, we never call
ReleaseAndroidExternalMemory, which end up leaking AHB.
Bug: b/314791770
Change-Id: I693c74213e73008497a6dfeca93ea62e84c71352
Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5106599
Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org>
Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org>
Auto-Submit: Charlie Lao <cclao@google.com>
|
|
d896fab8
|
2023-11-09T15:03:05
|
|
Vulkan: Fix texture self-copy
A new layout is introduced to support self-copy.
Bug: angleproject:7289
Change-Id: Ib914c433d55b9a79cfeb7a91f8a2b8680824d473
Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5018797
Reviewed-by: Charlie Lao <cclao@google.com>
Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org>
Reviewed-by: Yuxin Hu <yuxinhu@google.com>
|
|
9a5d75de
|
2023-10-30T11:59:19
|
|
Vulkan: Fix incompatible redefinition of cube faces
The TextureVk::mRedefinedLevels bitmask tracked which levels are
incompatibly redefined, greatly reducing the complexity of dealing with
GL's mutable textures.
It did not however take into account the fact that GL allows each
cubemap face to be separately redefined (unlike 2D arrays, where all
layers are defined together). This change turns the bitmask into an
array of bitmasks. Previously, a single bit represented whether the
level is incompatibly redefined. Now, elements of the array track the
same information for each cube face. For non-cube-map textures, only
element 0 is used.
Bug: chromium:1494664
Change-Id: I69568d3da2391796bf5f01505861fee42c6c8924
Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/4986289
Reviewed-by: Charlie Lao <cclao@google.com>
Reviewed-by: Yuxin Hu <yuxinhu@google.com>
Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org>
|
|
9298baa9
|
2023-10-20T10:26:00
|
|
Vulkan: Fix ImageTestES3.RenderToYUVAHB assertion
The bug here is that for YUV resolve, we always set mTransience to
YuvResolveTransient. And isImageTransient() returns true if mTransience
!= Default. And if image is transient, we will do unresolve, which is
incorrect here. For nullColorAttachmentWithExternalFormatResolve() case,
the image is actually not transient. This CL will only set transience to
YuvResolveTransient if we need to create a transient color attachment.
This CL looks at mImage->getExternalFormat() as the answer for
isYUVResolve instead of rely on transience.
Bug: b/223456677
Change-Id: I1bc176df22b0abc91d668a178e48d6b90eacbdd7
Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/4959194
Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org>
Reviewed-by: Chris Forbes <chrisforbes@google.com>
Commit-Queue: Charlie Lao <cclao@google.com>
|
|
ba65feb4
|
2023-10-18T17:33:38
|
|
Vulkan: Limit mutable texture flush to one update
In case there are many updates for a mutable texture, flushing
it preemptively can reduce performance, especially if it is done
repeatedly.
* Added getLevelUpdateCount() to ImageHelper.
* Previous mutable textures will now be flushed only if they have
exactly one update per mip level/cubemap face (if defined).
* This means that mutable textures with no data will also not be
flushed.
* Added unit tests for single-level texture flushing and situations
with no updates or more than one update.
Bug: b/285613719
Change-Id: I1592ecf502051a55ebfbb7fcd22577c9ce87bf43
Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/4953847
Commit-Queue: Amirali Abdolrashidi <abdolrashidi@google.com>
Reviewed-by: Charlie Lao <cclao@google.com>
Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org>
|
|
76608554
|
2023-10-02T15:07:45
|
|
Vulkan: use cpu transcoding for small texture size.
Bug: b/250042517
Change-Id: I9a70fb7d4823d10b09f498bfc01b5384951e2ce4
Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/4908660
Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org>
Reviewed-by: Charlie Lao <cclao@google.com>
Commit-Queue: Hailin Zhang <hailinzhang@google.com>
|