|
9b00af52
|
2023-02-01T11:10:32
|
|
Metal: Add an in-memory MTLLibrary cache.
Add a small cache for (msl + compile parameters) -> MTLLibrary at the
egl::Display level. In regular executions of Chrome, the same shaders
(particularly vertex) are compiled multiple times in different programs.
Tested for a regular Chrome startup + open wikipedia + motionmark 1.2:
112/282 (40%) cache hits.
Several different caching methods were profiled (LinkProgram perf test)
- struct key with std::map : 303309
- struct key with std::unordered_map : 308090
- binary blob key with std::map : 263595
- binary blob key with std::unordered_map : 286051
- struct key + is_transparent with std::unordered_map : 304877
- struct key + is_transparent with absl::flat_hash_map : 335686
Using is_transparent allows us to search the hash map without
copying the shader source string to construct the key structure.
Bug: chromium:1385510
Change-Id: Ieec4ba526fe286276a4af7114d89cde32a8f9e1d
Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/4214012
Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org>
Commit-Queue: Geoff Lang <geofflang@chromium.org>
Reviewed-by: Kenneth Russell <kbr@chromium.org>
|
|
bf5a007a
|
2022-06-16T14:21:08
|
|
Metal: Ensure render pass has at least one valid render target.
Extend MTLRenderPipelineDescriptor validation to ensure that there is at
least one valid render target set for the the render pipeline. This is
required for certain families of metal devices to avoid a validation
failure inside the metal framework. Moving the failure here will cause
the app using ANGLE to return a GL error instead of crashing the
process.
Bug: angleproject:7436
Change-Id: I594d92492a22a61a720dbe7021843c8460b389b8
Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/4109310
Reviewed-by: Jonah Ryan-Davis <jonahr@google.com>
Reviewed-by: Kimmo Kinnunen <kkinnunen@apple.com>
Commit-Queue: Kyle Piddington <kpiddington@apple.com>
|
|
a4db9477
|
2022-10-06T10:35:39
|
|
Implement pixel local storage with metal::read_write textures
Metal's programmable blending feature isn't available on non-Apple
Silicon, so on these devices we have to polyfill pixel local storage
using read_write textures, which can also be coherent if
raster_order_groups are supported.
This change leverages the existing PLS transformation to images, and
implements just enough shader image functionality in Metal to support
the pixel local storage usecase. Missing shader image features are
marked with UNIMPLEMENTED().
Bug: angleproject:7279
Bug: angleproject:7792
Bug: angleproject:7794
Bug: angleproject:7797
Bug: angleproject:7803
Change-Id: Ia96a714693d352d57351a1bae4f45437dde000e4
Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/3993363
Reviewed-by: Kenneth Russell <kbr@chromium.org>
Reviewed-by: Quyen Le <lehoangquyen@chromium.org>
Commit-Queue: Chris Dalton <chris@rive.app>
Reviewed-by: Kyle Piddington <kpiddington@apple.com>
|
|
5b218196
|
2022-11-06T11:39:23
|
|
Metal: Remove compilation through SPIR-V
Direct metal generation is stable.
Bug: angleproject:6081
Change-Id: If9e76f61ad38f2fc9963f0181dfd03c99ffa3e2b
Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/4003675
Auto-Submit: Shahbaz Youssefi <syoussefi@chromium.org>
Reviewed-by: Jonah Ryan-Davis <jonahr@google.com>
Reviewed-by: Kenneth Russell <kbr@chromium.org>
|
|
968041b5
|
2022-08-19T12:11:23
|
|
Metal: Optimized BufferSubData per device
Adds a staging buffer path which means there are 4 paths
for bufferSubData.
1. direct copy
* get a pointer to the buffer
* copy the new data to the buffer
* if the buffer is managed, tell metal which part was updated
2. use a shadow copy
* copy the data to a shadow copy
* copy the entire shadow to a new buffer
* start using the new buffer
3. use a new buffer
* get a new buffer (or unused)
* put the new data in the new buffer
* blit any unchanged data from the old buffer to the new buffer
* start using the new buffer
4. use a staging buffer
* get a staging buffer
* put the new data in the staging buffer
* blit from the staging buffer to the existing buffer.
Further, there are 3 types of memory storage modes.
Managed, Staged, Private.
Based on the GPU type different storage modes and different
paths in different sitatutions are more performant.
So, add feature flags to select paths by GPU.
Bug: angleproject:7544
Change-Id: I741dd1874201043416374194bd2001ded8dbd9b4
Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/3842641
Reviewed-by: Kyle Piddington <kpiddington@apple.com>
Reviewed-by: Kenneth Russell <kbr@chromium.org>
Reviewed-by: Quyen Le <lehoangquyen@chromium.org>
Commit-Queue: Gregg Tavares <gman@chromium.org>
|
|
25bad36c
|
2022-09-23T13:23:57
|
|
Metal: Remove unpackLastRowSeparatelyForPaddingInclusion
This speculative fix did not work.
Bug: angleproject:7573
Change-Id: I345db1746f8725d82420aabffb37c8dd01230a34
Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/3915182
Commit-Queue: Geoff Lang <geofflang@chromium.org>
Reviewed-by: Gregg Tavares <gman@chromium.org>
Reviewed-by: Kenneth Russell <kbr@chromium.org>
|
|
70e5e90f
|
2022-09-23T13:17:34
|
|
Metal: Avoid locking IOSurfaces in glReadPixels on AMD.
The AMD driver tends to crash when locking IOSurfaces. Avoid this by
using the copyIOSurfaceToNonIOSurfaceForReadOptimization feature to do
a texture-texture copy before reading back data to the CPU.
This is a *speculative* fix due to seeing crashes in the
ClientLockIOSurface function in the AMD driver.
Bug: angleproject:7573
Change-Id: Ia120f2a96eed65431b5f8a99cf1da7d7e85da639
Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/3915181
Reviewed-by: Kenneth Russell <kbr@chromium.org>
Commit-Queue: Geoff Lang <geofflang@chromium.org>
Reviewed-by: Gregg Tavares <gman@chromium.org>
|
|
2aa52da7
|
2022-09-23T13:15:44
|
|
Metal: Upload IOSurface data with staging buffers on AMD
Crashes have been seen in the AMD driver when locking IOSurfaces. Avoid
this by always using a staging buffer and doing a GPU-GPU copy for
uploading client side data to IOSurfaces.
Bug: angleproject:7573
Change-Id: I4d981a24554a755a7248199699b486d98cbad83d
Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/3915180
Commit-Queue: Kenneth Russell <kbr@chromium.org>
Reviewed-by: Gregg Tavares <gman@chromium.org>
|
|
09446a6b
|
2022-09-02T11:29:32
|
|
Metal: Upload the last texture row separately on AMD.
Speculative fix for crashes seen when uploading texture data on AMD.
Port of the unpackLastRowSeparatelyForPaddingInclusion workaround from
the GL backend.
Currently constrained to client data 2D uploads to non-compressed
textures.
Bug: angleproject:7573
Change-Id: Idd036b92619d309e5b2a8062043e8644f4d5b2e0
Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/3870655
Reviewed-by: Kenneth Russell <kbr@chromium.org>
Commit-Queue: Geoff Lang <geofflang@chromium.org>
Reviewed-by: Gregg Tavares <gman@chromium.org>
|
|
662226a3
|
2022-09-06T14:12:26
|
|
Metal: Preemptively Start Provoking Vertex CmdBuffer on AMD
There seems to be a bug in older AMD drivers and this appears
to work around it
Bug: angleproject:7635
Change-Id: I1b22e4b7d5d1ce0d405e422d08d33eeeb731050a
Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/3877666
Reviewed-by: Kenneth Russell <kbr@chromium.org>
Commit-Queue: Kenneth Russell <kbr@chromium.org>
|
|
94320a83
|
2022-05-27T17:41:29
|
|
Metal: Validate total bits used in color attachments
Metal has 2 limits for color attachments. 1 the number of
attachments supported. 2 the total number of bits it can
write per pixel. So for example Apple4 through Apple8 GPUs
can have 8 attachments but only 512bits of output. That
means you can attach 8 RGBA8 textures (256bits), but you
can't attach 8 RGBA32UI textures (1024bits).
If there are too many bits then return
FRAMEBUFFER_UNSUPPORTED from checkFramebufferStatus
and INVALID_FRAMEBUFFER_OPERATION from draws
Bug: angleproject:7280
Change-Id: I935aebad4d57664f59a60be20a927d6b69afb4ff
Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/3674322
Reviewed-by: Kenneth Russell <kbr@chromium.org>
Commit-Queue: Gregg Tavares <gman@chromium.org>
|
|
8a0351a5
|
2022-05-26T14:29:21
|
|
Metal:Dynamically choose max draw buffers.
The code was hard coded to 4 which is lower than OpenGL's 8.
This implementation keeps a hard coded array of size 8 in
rx::mtl::RenderPassDesc and rx::mtl::RenderPipelineOutputDesc
but only uses up to the display's limit.
Bug: angleproject:7280
Bug: angleproject:5730
Change-Id: Idd7e64dc47697882b44540804159566158e1e924
Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/3671695
Reviewed-by: Kenneth Russell <kbr@chromium.org>
Commit-Queue: Gregg Tavares <gman@chromium.org>
|
|
1a144edf
|
2022-04-13T17:15:29
|
|
Metal:ReadPixels AMD Copy Texture to Buffer optimization
On AMD GPUs it's faster to copy a texture to a buffer
for read back than to read via a texture.
For reading from a normal texture 24-27ms -> 6-9ms
For reading from a IOSurface texture 17-20ms -> 7-10ms
Bug: angleproject:7117
Change-Id: I7c7f276a3121e87f5c52a1a4287d13203a6b1b37
Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/3584423
Reviewed-by: Kenneth Russell <kbr@chromium.org>
Reviewed-by: Kyle Piddington <kpiddington@apple.com>
Commit-Queue: Gregg Tavares <gman@chromium.org>
|
|
400d9fe4
|
2022-04-23T01:08:19
|
|
Rename feature files to *_autogen.h
To clarify further that they are not to be edited by hand.
Bug: angleproject:6435
Change-Id: Iaf79706d2b688a43b3ebb65700cfbdd71a49a742
Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/3603842
Reviewed-by: Jamie Madill <jmadill@chromium.org>
Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org>
|