include/platform/FeaturesMtl_autogen.h


Log

Author Commit Date CI Message
Geoff Lang 9b00af52 2023-02-01T11:10:32 Metal: Add an in-memory MTLLibrary cache. Add a small cache for (msl + compile parameters) -> MTLLibrary at the egl::Display level. In regular executions of Chrome, the same shaders (particularly vertex) are compiled multiple times in different programs. Tested for a regular Chrome startup + open wikipedia + motionmark 1.2: 112/282 (40%) cache hits. Several different caching methods were profiled (LinkProgram perf test) - struct key with std::map : 303309 - struct key with std::unordered_map : 308090 - binary blob key with std::map : 263595 - binary blob key with std::unordered_map : 286051 - struct key + is_transparent with std::unordered_map : 304877 - struct key + is_transparent with absl::flat_hash_map : 335686 Using is_transparent allows us to search the hash map without copying the shader source string to construct the key structure. Bug: chromium:1385510 Change-Id: Ieec4ba526fe286276a4af7114d89cde32a8f9e1d Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/4214012 Reviewed-by: Shahbaz Youssefi <syoussefi@chromium.org> Commit-Queue: Geoff Lang <geofflang@chromium.org> Reviewed-by: Kenneth Russell <kbr@chromium.org>
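The `is_transparent` lookup mentioned in the commit above can be illustrated with a minimal sketch. Everything here is hypothetical and is not ANGLE's code: `LibraryKey`, `LibraryKeyView`, `LibraryKeyLess`, `LibraryCache`, and the single `fastMath` parameter are made up for illustration, and an `int` stands in for the compiled MTLLibrary. The sketch uses `std::map` with a transparent comparator; the commit also profiled hash-map variants, which this sketch does not reproduce.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <string_view>
#include <tuple>
#include <utility>

// Hypothetical owning key: shader source plus one compile parameter.
struct LibraryKey {
    std::string source;
    bool fastMath;
};

// Non-owning view of the same key, used only for lookups.
struct LibraryKeyView {
    std::string_view source;
    bool fastMath;
};

// Transparent comparator: the is_transparent member lets std::map::find
// accept a LibraryKeyView directly, so the source string is never copied
// just to build a temporary LibraryKey.
struct LibraryKeyLess {
    using is_transparent = void;

    static std::tuple<std::string_view, bool> asTuple(const LibraryKey& k) {
        return {std::string_view(k.source), k.fastMath};
    }
    static std::tuple<std::string_view, bool> asTuple(const LibraryKeyView& k) {
        return {k.source, k.fastMath};
    }

    template <class A, class B>
    bool operator()(const A& a, const B& b) const {
        return asTuple(a) < asTuple(b);
    }
};

class LibraryCache {
  public:
    // Returns the cached value, or nullptr on a miss. Counts hits and misses.
    const int* find(std::string_view source, bool fastMath) {
        auto it = entries_.find(LibraryKeyView{source, fastMath});
        if (it == entries_.end()) {
            ++misses_;
            return nullptr;
        }
        ++hits_;
        return &it->second;
    }

    // Stores a value; the key takes ownership of the source string here.
    void insert(std::string source, bool fastMath, int library) {
        entries_.emplace(LibraryKey{std::move(source), fastMath}, library);
    }

    std::size_t hits() const { return hits_; }
    std::size_t misses() const { return misses_; }

  private:
    std::map<LibraryKey, int, LibraryKeyLess> entries_;
    std::size_t hits_ = 0;
    std::size_t misses_ = 0;
};
```

A caller would try `find` first and only compile and `insert` on a miss, so the source string is copied once per distinct shader rather than once per lookup.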
Dan Glastonbury bf5a007a 2022-06-16T14:21:08 Metal: Ensure render pass has at least one valid render target. Extend MTLRenderPipelineDescriptor validation to ensure that there is at least one valid render target set for the render pipeline. This is required for certain families of metal devices to avoid a validation failure inside the metal framework. Moving the failure here will cause the app using ANGLE to return a GL error instead of crashing the process. Bug: angleproject:7436 Change-Id: I594d92492a22a61a720dbe7021843c8460b389b8 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/4109310 Reviewed-by: Jonah Ryan-Davis <jonahr@google.com> Reviewed-by: Kimmo Kinnunen <kkinnunen@apple.com> Commit-Queue: Kyle Piddington <kpiddington@apple.com>
Chris Dalton a4db9477 2022-10-06T10:35:39 Implement pixel local storage with metal::read_write textures Metal's programmable blending feature isn't available on non-Apple Silicon, so on these devices we have to polyfill pixel local storage using read_write textures, which can also be coherent if raster_order_groups are supported. This change leverages the existing PLS transformation to images, and implements just enough shader image functionality in Metal to support the pixel local storage usecase. Missing shader image features are marked with UNIMPLEMENTED(). Bug: angleproject:7279 Bug: angleproject:7792 Bug: angleproject:7794 Bug: angleproject:7797 Bug: angleproject:7803 Change-Id: Ia96a714693d352d57351a1bae4f45437dde000e4 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/3993363 Reviewed-by: Kenneth Russell <kbr@chromium.org> Reviewed-by: Quyen Le <lehoangquyen@chromium.org> Commit-Queue: Chris Dalton <chris@rive.app> Reviewed-by: Kyle Piddington <kpiddington@apple.com>
Shahbaz Youssefi 5b218196 2022-11-06T11:39:23 Metal: Remove compilation through SPIR-V Direct metal generation is stable. Bug: angleproject:6081 Change-Id: If9e76f61ad38f2fc9963f0181dfd03c99ffa3e2b Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/4003675 Auto-Submit: Shahbaz Youssefi <syoussefi@chromium.org> Reviewed-by: Jonah Ryan-Davis <jonahr@google.com> Reviewed-by: Kenneth Russell <kbr@chromium.org>
Gregg Tavares 968041b5 2022-08-19T12:11:23 Metal: Optimized BufferSubData per device Adds a staging buffer path which means there are 4 paths for bufferSubData. 1. direct copy * get a pointer to the buffer * copy the new data to the buffer * if the buffer is managed, tell metal which part was updated 2. use a shadow copy * copy the data to a shadow copy * copy the entire shadow to a new buffer * start using the new buffer 3. use a new buffer * get a new buffer (or unused) * put the new data in the new buffer * blit any unchanged data from the old buffer to the new buffer * start using the new buffer 4. use a staging buffer * get a staging buffer * put the new data in the staging buffer * blit from the staging buffer to the existing buffer. Further, there are 3 types of memory storage modes. Managed, Staged, Private. Based on the GPU type different storage modes and different paths in different situations are more performant. So, add feature flags to select paths by GPU. Bug: angleproject:7544 Change-Id: I741dd1874201043416374194bd2001ded8dbd9b4 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/3842641 Reviewed-by: Kyle Piddington <kpiddington@apple.com> Reviewed-by: Kenneth Russell <kbr@chromium.org> Reviewed-by: Quyen Le <lehoangquyen@chromium.org> Commit-Queue: Gregg Tavares <gman@chromium.org>
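To make the difference between path 1 (direct copy) and path 3 (new buffer) in the commit above concrete, here is a CPU-only sketch. It is an illustration under loud assumptions: plain `std::vector` objects stand in for Metal buffers, `std::copy` stands in for the GPU blit, and the names `Bytes`, `updateDirect`, and `updateViaNewBuffer` are hypothetical and do not appear in ANGLE.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using Bytes = std::vector<std::uint8_t>;

// Path 1 (direct copy): overwrite the sub-range of the existing buffer.
void updateDirect(Bytes& buffer, std::size_t offset, const Bytes& data) {
    assert(offset <= buffer.size() && data.size() <= buffer.size() - offset);
    std::copy(data.begin(), data.end(),
              buffer.begin() + static_cast<std::ptrdiff_t>(offset));
}

// Path 3 (new buffer): write the new data into a fresh buffer, then copy
// the unchanged prefix and suffix over from the old buffer. The caller
// would start using the returned buffer in place of the old one.
Bytes updateViaNewBuffer(const Bytes& oldBuffer, std::size_t offset,
                         const Bytes& data) {
    assert(offset <= oldBuffer.size() &&
           data.size() <= oldBuffer.size() - offset);
    const auto begin = static_cast<std::ptrdiff_t>(offset);
    const auto end = static_cast<std::ptrdiff_t>(offset + data.size());

    Bytes fresh(oldBuffer.size());
    // New data goes into the fresh buffer first.
    std::copy(data.begin(), data.end(), fresh.begin() + begin);
    // Stand-in for the blit of unchanged data: prefix, then suffix.
    std::copy(oldBuffer.begin(), oldBuffer.begin() + begin, fresh.begin());
    std::copy(oldBuffer.begin() + end, oldBuffer.end(), fresh.begin() + end);
    return fresh;
}
```

Both functions produce the same final contents. The sketch does not model why one path might be faster on a given GPU; per the commit, that choice is made through feature flags.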
Geoff Lang 25bad36c 2022-09-23T13:23:57 Metal: Remove unpackLastRowSeparatelyForPaddingInclusion This speculative fix did not work. Bug: angleproject:7573 Change-Id: I345db1746f8725d82420aabffb37c8dd01230a34 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/3915182 Commit-Queue: Geoff Lang <geofflang@chromium.org> Reviewed-by: Gregg Tavares <gman@chromium.org> Reviewed-by: Kenneth Russell <kbr@chromium.org>
Geoff Lang 70e5e90f 2022-09-23T13:17:34 Metal: Avoid locking IOSurfaces in glReadPixels on AMD. The AMD driver tends to crash when locking IOSurfaces. Avoid this by using the copyIOSurfaceToNonIOSurfaceForReadOptimization feature to do a texture-texture copy before reading back data to the CPU. This is a *speculative* fix due to seeing crashes in the ClientLockIOSurface function in the AMD driver. Bug: angleproject:7573 Change-Id: Ia120f2a96eed65431b5f8a99cf1da7d7e85da639 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/3915181 Reviewed-by: Kenneth Russell <kbr@chromium.org> Commit-Queue: Geoff Lang <geofflang@chromium.org> Reviewed-by: Gregg Tavares <gman@chromium.org>
Geoff Lang 2aa52da7 2022-09-23T13:15:44 Metal: Upload IOSurface data with staging buffers on AMD Crashes have been seen in the AMD driver when locking IOSurfaces. Avoid this by always using a staging buffer and doing a GPU-GPU copy for uploading client side data to IOSurfaces. Bug: angleproject:7573 Change-Id: I4d981a24554a755a7248199699b486d98cbad83d Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/3915180 Commit-Queue: Kenneth Russell <kbr@chromium.org> Reviewed-by: Gregg Tavares <gman@chromium.org>
Geoff Lang 09446a6b 2022-09-02T11:29:32 Metal: Upload the last texture row separately on AMD. Speculative fix for crashes seen when uploading texture data on AMD. Port of the unpackLastRowSeparatelyForPaddingInclusion workaround from the GL backend. Currently constrained to client data 2D uploads to non-compressed textures. Bug: angleproject:7573 Change-Id: Idd036b92619d309e5b2a8062043e8644f4d5b2e0 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/3870655 Reviewed-by: Kenneth Russell <kbr@chromium.org> Commit-Queue: Geoff Lang <geofflang@chromium.org> Reviewed-by: Gregg Tavares <gman@chromium.org>
Gregg Tavares 662226a3 2022-09-06T14:12:26 Metal: Preemptively Start Provoking Vertex CmdBuffer on AMD There seems to be a bug in older AMD drivers and this appears to work around it Bug: angleproject:7635 Change-Id: I1b22e4b7d5d1ce0d405e422d08d33eeeb731050a Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/3877666 Reviewed-by: Kenneth Russell <kbr@chromium.org> Commit-Queue: Kenneth Russell <kbr@chromium.org>
Gregg Tavares 94320a83 2022-05-27T17:41:29 Metal: Validate total bits used in color attachments Metal has 2 limits for color attachments. 1 the number of attachments supported. 2 the total number of bits it can write per pixel. So for example Apple4 through Apple8 GPUs can have 8 attachments but only 512bits of output. That means you can attach 8 RGBA8 textures (256bits), but you can't attach 8 RGBA32UI textures (1024bits). If there are too many bits then return FRAMEBUFFER_UNSUPPORTED from checkFramebufferStatus and INVALID_FRAMEBUFFER_OPERATION from draws Bug: angleproject:7280 Change-Id: I935aebad4d57664f59a60be20a927d6b69afb4ff Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/3674322 Reviewed-by: Kenneth Russell <kbr@chromium.org> Commit-Queue: Gregg Tavares <gman@chromium.org>
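The two limits described in the commit above (attachment count and total bits per pixel) can be checked with a few lines of arithmetic. The sketch below is hypothetical: `ColorAttachmentLimits` and `fitsColorAttachmentLimits` are illustrative names, not ANGLE's, and the 8-attachment, 512-bit figures used in the usage note are the Apple4 through Apple8 example quoted in the commit message.

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical device limits for color attachments.
struct ColorAttachmentLimits {
    std::size_t maxAttachments;  // limit 1: number of attachments
    unsigned maxTotalBits;       // limit 2: total bits written per pixel
};

// Returns true when the attachment list respects both limits.
// bitsPerPixel holds one entry per attachment, e.g. 32 for RGBA8
// and 128 for RGBA32UI.
bool fitsColorAttachmentLimits(const std::vector<unsigned>& bitsPerPixel,
                               const ColorAttachmentLimits& limits) {
    if (bitsPerPixel.size() > limits.maxAttachments) {
        return false;
    }
    const unsigned totalBits =
        std::accumulate(bitsPerPixel.begin(), bitsPerPixel.end(), 0u);
    return totalBits <= limits.maxTotalBits;
}
```

With limits of 8 attachments and 512 bits, eight RGBA8 attachments total 8 × 32 = 256 bits and pass, while eight RGBA32UI attachments total 8 × 128 = 1024 bits and fail, matching the example in the commit message.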
Gregg Tavares 8a0351a5 2022-05-26T14:29:21 Metal:Dynamically choose max draw buffers. The code was hard coded to 4 which is lower than OpenGL's 8. This implementation keeps a hard coded array of size 8 in rx::mtl::RenderPassDesc and rx::mtl::RenderPipelineOutputDesc but only uses up to the display's limit. Bug: angleproject:7280 Bug: angleproject:5730 Change-Id: Idd7e64dc47697882b44540804159566158e1e924 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/3671695 Reviewed-by: Kenneth Russell <kbr@chromium.org> Commit-Queue: Gregg Tavares <gman@chromium.org>
Gregg Tavares 1a144edf 2022-04-13T17:15:29 Metal:ReadPixels AMD Copy Texture to Buffer optimization On AMD GPUs it's faster to copy a texture to a buffer for read back than to read via a texture. For reading from a normal texture 24-27ms -> 6-9ms For reading from a IOSurface texture 17-20ms -> 7-10ms Bug: angleproject:7117 Change-Id: I7c7f276a3121e87f5c52a1a4287d13203a6b1b37 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/3584423 Reviewed-by: Kenneth Russell <kbr@chromium.org> Reviewed-by: Kyle Piddington <kpiddington@apple.com> Commit-Queue: Gregg Tavares <gman@chromium.org>
Shahbaz Youssefi 400d9fe4 2022-04-23T01:08:19 Rename feature files to *_autogen.h To clarify further that they are not to be edited by hand. Bug: angleproject:6435 Change-Id: Iaf79706d2b688a43b3ebb65700cfbdd71a49a742 Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/3603842 Reviewed-by: Jamie Madill <jmadill@chromium.org> Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org>