src/aarch64/ffi.c


Log

Author Commit Date CI Message
Jeremy Huddleston Sequoia eafab235 2021-03-24T11:38:36 arm64e: Pull in pointer authentication code from Apple's arm64e libffi port (#565) NOTES: This changes the ptrauth support from #548 to match what Apple is shipping in its libffi-27 tag. Signed-off-by: Jeremy Huddleston Sequoia <jeremyhu@apple.com>
AndreRH f58e5ee6 2021-03-23T23:54:00 aarch64: Fix closures for win64 (#606)
Jeremy Huddleston Sequoia d271dbe0 2021-03-20T06:06:28 Add some missing #if conditionals from Apple's code drop (#620)

* arm/aarch64: Add FFI_CLOSURES conditionals where appropriate
  Signed-off-by: Jeremy Huddleston Sequoia <jeremyhu@apple.com>
* aarch64: Don't emit the do_closure label when building without FFI_GO_CLOSURES
  Signed-off-by: Jeremy Huddleston Sequoia <jeremyhu@apple.com>
Madhavan T. Venkataraman 9ba55921 2021-03-05T10:07:30 Static tramp v5 (#624)

* Static Trampolines

  Closure Trampoline Security Issue
  =================================

  Currently, the trampoline code used in libffi is not statically defined in a source file (except for MACH). The trampoline is either pre-defined machine code in a data buffer, or it is generated at runtime. In order to execute a trampoline, it needs to be placed in a page with executable permissions.

  Executable data pages are attack surfaces for attackers who may try to inject their own code into the page and contrive to have it executed. The security settings in a system may prevent various tricks used in user land to write code into a page and to have it executed somehow. On such systems, libffi trampolines would not be able to run.

  Static Trampoline
  =================

  To solve this problem, the trampoline code needs to be defined statically in a source file, compiled and placed in the text segment so it can be mapped and executed naturally without any tricks. However, the trampoline needs to be able to access the closure pointer at runtime.

  PC-relative data referencing
  ============================

  The solution implemented in this patch set uses PC-relative data references. The trampoline is mapped in a code page. Adjacent to the code page, a data page is mapped that contains the parameters of the trampoline:

  - the closure pointer
  - pointer to the ABI handler to jump to

  The trampoline code uses an offset relative to its current PC to access its data.

  Some architectures support PC-relative data references in the ISA itself. E.g., X64 supports RIP-relative references. For others, the PC has to somehow be loaded into a general purpose register to do PC-relative data referencing. To do this, we need to define a get_pc() kind of function and call it to load the PC in a desired register.

  There are two cases:

  1. The call instruction pushes the return address on the stack. In this case, get_pc() will extract the return address from the stack and load it in the desired register and return.
  2. The call instruction stores the return address in a designated register. In this case, get_pc() will copy the return address to the desired register and return.

  Either way, the PC next to the call instruction is obtained.

  Scratch register
  ================

  In order to do its job, the trampoline code would need to use a scratch register. Depending on the ABI, there may not be a register available for scratch. This problem needs to be solved so that all ABIs will work.

  The trampoline will save two values on the stack:

  - the closure pointer
  - the original value of the scratch register

  This is what the stack will look like:

      sp before trampoline ------> --------------------
                                   | closure pointer  |
                                   --------------------
                                   | scratch register |
      sp after trampoline -------> --------------------

  The ABI handler can do the following as needed by the ABI:

  - the closure pointer can be loaded in a desired register
  - the scratch register can be restored to its original value
  - the stack pointer can be restored to its original value (the value when the trampoline was invoked)

  To do this, I have defined prolog code for each ABI handler. The legacy trampoline jumps to the ABI handler directly. But the static trampoline defined in this patch jumps to the prolog code, which performs the above actions before jumping to the ABI handler.

  Trampoline Table
  ================

  In order to reduce the trampoline memory footprint, the trampoline code would be defined as a code array in the text segment. This array would be mapped into the address space of the caller. The mapping would, therefore, contain a trampoline table.

  Adjacent to the trampoline table mapping, there will be a data mapping that contains a parameter table, one parameter block for each trampoline. The parameter block will contain:

  - a pointer to the closure
  - a pointer to the ABI handler

  The static trampoline code would finally look like this:

  - Make space on the stack for the closure and the scratch register by moving the stack pointer down
  - Store the original value of the scratch register on the stack
  - Using PC-relative reference, get the closure pointer
  - Store the closure pointer on the stack
  - Using PC-relative reference, get the ABI handler pointer
  - Jump to the ABI handler

  Mapping size
  ============

  The size of the code mapping that contains the trampoline table needs to be determined on a per architecture basis. If a particular architecture supports multiple base page sizes, then the largest supported base page size needs to be chosen. E.g., we choose 16K for ARM64.

  Trampoline allocation and free
  ==============================

  Static trampolines are allocated in ffi_closure_alloc() and freed in ffi_closure_free().

  Normally, applications use these functions. But there are some cases out there where the user of libffi allocates and manages its own closure memory. In such cases, static trampolines cannot be used. These will fall back to using legacy trampolines. The user has to make sure that the memory is executable.

  ffi_closure structure
  =====================

  I did not want to make any changes to the size of the closure structure for this feature to guarantee compatibility. But the opaque static trampoline handle needs to be stored in the closure. I have defined it as follows:

      -  char tramp[FFI_TRAMPOLINE_SIZE];
      +  union {
      +    char tramp[FFI_TRAMPOLINE_SIZE];
      +    void *ftramp;
      +  };

  If static trampolines are used, then tramp[] is not needed to store a dynamic trampoline. That space can be reused to store the handle. Hence, the union.

  Architecture Support
  ====================

  Support has been added for x64, i386, aarch64 and arm. Support for other architectures can be added very easily in the future.

  OS Support
  ==========

  Support has been added for Linux. Support for other OSes can be added very easily.

  Signed-off-by: Madhavan T. Venkataraman <madvenka@linux.microsoft.com>

* x86: Support for Static Trampolines

  - Define the arch-specific initialization function ffi_tramp_arch () that returns trampoline size information to common code.
  - Define the trampoline code mapping and data mapping sizes.
  - Define the trampoline code table statically. Define two tables, actually, one with CET and one without.
  - Introduce a tiny prolog for each ABI handling function. The ABI handlers addressed are:
    - ffi_closure_unix64
    - ffi_closure_unix64_sse
    - ffi_closure_win64

    The prolog functions are called:
    - ffi_closure_unix64_alt
    - ffi_closure_unix64_sse_alt
    - ffi_closure_win64_alt

    The legacy trampoline jumps to the ABI handler. The static trampoline jumps to the prolog function. The prolog function uses the information provided by the static trampoline, sets things up for the ABI handler and then jumps to the ABI handler.
  - Call ffi_tramp_set_parms () in ffi_prep_closure_loc () to initialize static trampoline parameters.

  Signed-off-by: Madhavan T. Venkataraman <madvenka@linux.microsoft.com>

* i386: Support for Static Trampolines

  - Define the arch-specific initialization function ffi_tramp_arch () that returns trampoline size information to common code.
  - Define the trampoline code table statically. Define two tables, actually, one with CET and one without.
  - Define the trampoline code table statically.
  - Introduce a tiny prolog for each ABI handling function. The ABI handlers addressed are:
    - ffi_closure_i386
    - ffi_closure_STDCALL
    - ffi_closure_REGISTER

    The prolog functions are called:
    - ffi_closure_i386_alt
    - ffi_closure_STDCALL_alt
    - ffi_closure_REGISTER_alt

    The legacy trampoline jumps to the ABI handler. The static trampoline jumps to the prolog function. The prolog function uses the information provided by the static trampoline, sets things up for the ABI handler and then jumps to the ABI handler.
  - Call ffi_tramp_set_parms () in ffi_prep_closure_loc () to initialize static trampoline parameters.

  Signed-off-by: Madhavan T. Venkataraman <madvenka@linux.microsoft.com>

* arm64: Support for Static Trampolines

  - Define the arch-specific initialization function ffi_tramp_arch () that returns trampoline size information to common code.
  - Define the trampoline code mapping and data mapping sizes.
  - Define the trampoline code table statically.
  - Introduce a tiny prolog for each ABI handling function. The ABI handlers addressed are:
    - ffi_closure_SYSV
    - ffi_closure_SYSV_V

    The prolog functions are called:
    - ffi_closure_SYSV_alt
    - ffi_closure_SYSV_V_alt

    The legacy trampoline jumps to the ABI handler. The static trampoline jumps to the prolog function. The prolog function uses the information provided by the static trampoline, sets things up for the ABI handler and then jumps to the ABI handler.
  - Call ffi_tramp_set_parms () in ffi_prep_closure_loc () to initialize static trampoline parameters.

  Signed-off-by: Madhavan T. Venkataraman <madvenka@linux.microsoft.com>

* arm: Support for Static Trampolines

  - Define the arch-specific initialization function ffi_tramp_arch () that returns trampoline size information to common code.
  - Define the trampoline code mapping and data mapping sizes.
  - Define the trampoline code table statically.
  - Introduce a tiny prolog for each ABI handling function. The ABI handlers addressed are:
    - ffi_closure_SYSV
    - ffi_closure_VFP

    The prolog functions are called:
    - ffi_closure_SYSV_alt
    - ffi_closure_VFP_alt

    The legacy trampoline jumps to the ABI handler. The static trampoline jumps to the prolog function. The prolog function uses the information provided by the static trampoline, sets things up for the ABI handler and then jumps to the ABI handler.
  - Call ffi_tramp_set_parms () in ffi_prep_closure_loc () to initialize static trampoline parameters.

  Signed-off-by: Madhavan T. Venkataraman <madvenka@linux.microsoft.com>
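The "ffi_closure structure" note in the static trampoline commit argues that wrapping tramp[] in a union keeps the closure layout unchanged. The following is a minimal sketch of that argument, not the real ffi_closure definition: the struct names, the placeholder fields, and DEMO_TRAMPOLINE_SIZE are hypothetical and chosen only for illustration.

```c
#include <stddef.h>

/* Hypothetical trampoline size, for illustration only; the real
   FFI_TRAMPOLINE_SIZE is defined per architecture. */
#define DEMO_TRAMPOLINE_SIZE 24

/* Layout before the change: the closure starts with a raw code buffer. */
struct demo_closure_legacy {
  char tramp[DEMO_TRAMPOLINE_SIZE];
  void *cif;        /* placeholders for the remaining closure fields */
  void *fun;
  void *user_data;
};

/* Layout after the change: the same bytes double as a handle slot. */
struct demo_closure_static {
  union {
    char tramp[DEMO_TRAMPOLINE_SIZE];  /* legacy (dynamic) trampoline code */
    void *ftramp;                      /* opaque static trampoline handle */
  };
  void *cif;
  void *fun;
  void *user_data;
};

/* Returns 1 when the size and the field offsets match in both layouts. */
int demo_layout_unchanged(void)
{
  return sizeof(struct demo_closure_legacy) == sizeof(struct demo_closure_static)
      && offsetof(struct demo_closure_legacy, cif)
         == offsetof(struct demo_closure_static, cif)
      && offsetof(struct demo_closure_legacy, user_data)
         == offsetof(struct demo_closure_static, user_data);
}
```

Because the buffer is at least as large as a pointer, the union adds no bytes, so code compiled against the old layout would still see the other fields at the same offsets.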
AndreRH 56f7df71 2020-11-10T12:27:59 aarch64: Allow FFI_WIN64 for winelib (#593)
Martin Storsjö c06468fa 2020-04-26T04:58:33 Fix building for aarch64 windows with mingw toolchains (#555)

* aarch64: Check _WIN32 instead of _M_ARM64 for detecting windows

  This fixes building for aarch64 with mingw toolchains. _M_ARM64 is predefined by MSVC, while mingw compilers predefine __aarch64__.

  In aarch64 specific code, change checks for _M_ARM64 into checks for _WIN32.

  In arch independent code, check for (defined(_M_ARM64) || defined(__aarch64__)) && defined(_WIN32) instead of just _M_ARM64.

  In src/closures.c, coalesce checks like defined(X86_WIN32) || defined(X86_WIN64) || defined(_M_ARM64) into plain defined(_WIN32). Technically, this enables code for ARM32 windows where it wasn't, but as far as I can see it, those codepaths should be fine for that architecture variant as well.

* aarch64: Only use armasm source when building with MSVC

  When building for windows/arm64 with clang, the normal gas style .S source works fine. sysv.S and win64_armasm.S seem to be functionally equivalent, with only differences being due to assembler syntax.
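The detection rule described in the mingw commit can be written as a small preprocessor check. This is a sketch of the condition quoted in the message only; the DEMO_AARCH64_WINDOWS macro and the helper function are hypothetical names, not identifiers from libffi.

```c
/* Arch-independent code cannot rely on _M_ARM64 alone, because only MSVC
   predefines it. Accept either compiler's architecture macro and require
   _WIN32 in addition. */
#if (defined(_M_ARM64) || defined(__aarch64__)) && defined(_WIN32)
# define DEMO_AARCH64_WINDOWS 1
#else
# define DEMO_AARCH64_WINDOWS 0
#endif

/* Hypothetical helper: reports at run time what the preprocessor decided. */
int demo_is_aarch64_windows(void)
{
  return DEMO_AARCH64_WINDOWS;
}
```

Inside files that are only ever built for aarch64, the architecture half of the test is redundant, which is why the commit reduces those checks to plain _WIN32.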
Ole André Vadla Ravnås 4c7bde32 2020-03-10T02:05:42 Port to iOS/arm64e (#548)
Paul Monson c2a68590 2019-08-07T11:57:45 fix mingw build and crashing bugs for Python Windows ARM64 (#496) * fix mingw build and crashing bugs for Python Windows ARM64 * Fix issues found in PR review
ossdev07 d856743e 2019-06-26T07:31:22 libffi: added ARM64 support for Windows (#486)

* libffi: added ARM64 support for Windows

  1. ported sysv.S to win64_armasm.S for armasm64 assembler
  2. added msvc_build folder for visual studio solution
  3. updated README.md for the same
  4. MSVC solution created with the changes, and below test suites are tested with test script written in python: libffi.bhaible, libffi.call
  5. Basic functionality of above test suites are getting passed

  Signed-off-by: ossdev07 <ossdev@puresoftware.com>

* Update README.md
Dan Horák a7d6396f 2019-03-29T14:19:20 fix check for Linux/aarch64 fixes #473
Jeremy Huddleston Sequoia 05a17964 2019-02-19T04:11:28 Cleanup symbol exports on darwin and add architecture preprocessor checks to assist in building fat binaries (eg: i386+x86_64 on macOS or arm+aarch64 on iOS) (#450)

* x86: Ensure _efi64 suffixed symbols are not exported
* x86: Ensure we do not export ffi_prep_cif_machdep
  Signed-off-by: Jeremy Huddleston Sequoia <jeremyhu@apple.com>
* x86: Ensure we don't export ffi_call_win64, ffi_closure_win64, or ffi_go_closure_win64
  Signed-off-by: Jeremy Huddleston Sequoia <jeremyhu@apple.com>
* closures: Silence a semantic warning
  libffi/src/closures.c:175:23: This function declaration is not a prototype
  Signed-off-by: Jeremy Huddleston Sequoia <jeremyhu@apple.com>
* aarch64: Ensure we don't export ffi_prep_cif_machdep
  Signed-off-by: Jeremy Huddleston Sequoia <jeremyhu@apple.com>
* arm: Ensure we don't export ffi_prep_cif_machdep
  Signed-off-by: Jeremy Huddleston Sequoia <jeremyhu@apple.com>
* aarch64, arm, x86: Add architecture preprocessor checks to support easier fat builds (eg: iOS)
  Signed-off-by: Jeremy Huddleston Sequoia <jeremyhu@apple.com>
* x86: Silence some static analysis warnings
  libffi/src/x86/ffi64.c:286:21: The left operand of '!=' is a garbage value due to array index out of bounds
  libffi/src/x86/ffi64.c:297:22: The left operand of '!=' is a garbage value due to array index out of bounds
  Signed-off-by: Jeremy Huddleston Sequoia <jeremyhu@apple.com>
* aarch: Use FFI_HIDDEN rather than .hidden
  Signed-off-by: Jeremy Huddleston Sequoia <jeremyhu@apple.com>
* ffi.h: Don't advertise ffi_java_rvalue_to_raw, ffi_prep_java_raw_closure, and ffi_prep_java_raw_closure_loc when FFI_NATIVE_RAW_API is 0
  Signed-off-by: Jeremy Huddleston Sequoia <jeremyhu@apple.com>
Florian Weimer 44a6c285 2019-02-19T12:55:11 aarch64: Flush code mapping in addition to data mapping (#471) This needs a new function, ffi_data_to_code_pointer, to translate from data pointers to code pointers. Fixes issue #470.
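When a trampoline is written through a data mapping but executed through a separate code mapping, the idea in the commit above is that both address ranges need a cache flush. The sketch below assumes a GCC or Clang toolchain (it uses the `__builtin___clear_cache` builtin); `demo_flush_trampoline` is a hypothetical helper, and its `code_start` parameter stands in for the address that a data-to-code translation such as ffi_data_to_code_pointer would return.

```c
#include <stddef.h>

/* Hypothetical helper, not libffi code. Flushes the range the trampoline
   was written through and, if it differs, the range it is executed from.
   Returns the number of ranges flushed. */
int demo_flush_trampoline(void *data_start, void *code_start, size_t size)
{
  char *data = (char *) data_start;
  char *code = (char *) code_start;
  int flushed = 0;

  /* Range used for writing the trampoline bytes. */
  __builtin___clear_cache(data, data + size);
  flushed++;

  /* Range used for executing them, when it is a distinct mapping. */
  if (code != data) {
    __builtin___clear_cache(code, code + size);
    flushed++;
  }
  return flushed;
}
```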
Martin Bektchiev 4a84df4a 2018-10-31T15:53:54 Fix Q registers parameter passing on ARM64 The second two quads are located at offset 32 not 16
Andreas Schwab e66fd678 2018-02-20T10:47:09 Revert "Fix passing struct by value on aarch64" This reverts commit 482b37f00467325e3389bab322525099860dd9aa. That was actually a bug in python, see <https://bugs.python.org/issue30353>.
Andreas Schwab 482b37f0 2017-09-18T12:44:08 Fix passing struct by value on aarch64 This fixes the ctypes test in the python testsuite.
Jean-Luc Jumpertz 181fc4cc 2017-10-23T15:02:29 Merge branch 'master' based on ksjogo/libffi Added a tvOS target in Xcode project. Misc Xcode project cleanup. Fix macOS build target in Xcode project. # Conflicts: # src/aarch64/ffi.c # src/x86/ffi64.c
Jean-Luc Jumpertz a78da739 2017-09-04T15:55:34 Fix macOS build target in Xcode project. - Add missing files for desktop platforms in generate-darwin-source-and-headers.py, and in the Xcode project. - Add a static library target for macOS. - Fix "implicit conversion loses integer precision" warnings for iOS and macOS targets.
Gregory Pakosz bd72848c 2017-04-27T13:20:36 Prefix ALIGN macros with FFI_
Tom Tromey 06d7c519 2016-08-10T15:06:16 Merge pull request #269 from frida/fix/aarch64-variadic-closures-on-ios aarch64: Fix handling of variadic closures on iOS
Tom Tromey aa7ed78c 2016-08-10T15:03:37 Merge pull request #268 from frida/fix/aarch64-large-aggregates aarch64: Fix handling of aggregates larger than 16 bytes
Ole André Vadla Ravnås 4da814b1 2016-08-10T22:48:09 aarch64: Fix handling of aggregates larger than 16 bytes Instead of allocating stack space for a pointer we would allocate stack space for the actual aggregate size.
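Under the AArch64 procedure call standard, an aggregate larger than 16 bytes is copied to memory by the caller and replaced by a pointer to that copy, so the argument area only needs room for the pointer. A rough sketch of the corrected sizing follows; the function name is hypothetical, and it deliberately ignores homogeneous floating-point aggregates and the case where a small aggregate fits in registers.

```c
#include <stddef.h>

/* Hypothetical helper: bytes of stack argument space for an aggregate that
   has to go on the stack. Large aggregates are passed by reference, so they
   consume one pointer-sized slot rather than their full size. */
size_t demo_stack_bytes_for_aggregate(size_t aggregate_size)
{
  if (aggregate_size > 16)
    return sizeof(void *);                     /* pointer to the caller's copy */
  return (aggregate_size + 7) & ~(size_t) 7;   /* round up to 8-byte slots */
}
```

With this rule a 40-byte struct takes one pointer slot instead of 40 bytes, which is the over-allocation the commit message describes.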
Ole André Vadla Ravnås 5e9ac7e2 2016-08-10T15:22:19 aarch64: Fix warning about unused function on iOS
Ole André Vadla Ravnås 4d1f11f6 2016-08-10T15:21:42 aarch64: Fix operand size warning reported by Clang
Ole André Vadla Ravnås 301166b1 2016-08-10T15:59:56 aarch64: Fix handling of variadic closures on iOS
Russell Keith-Magee bc4fc07a 2015-12-21T00:37:06 Fixed #181 -- Corrected problems with ARMv7 build under iOS. Based on a patch from @fealebenpae, with input from @SolaWing and @rth7680, and testing from @superdump.
Yavor Georgiev 53636634 2015-01-16T15:19:38 aarch64: implement the trampoline table workaround for ffi closures on Apple systems This is a direct copy/paste port of the ARM code, with changes because of Aarch64 pc-relative addressing restrictions.
Anthony Green 20562ac0 2014-11-12T07:00:59 Fix for AArch64. Release as 3.2.1.
Richard Henderson c6352b66 2014-10-23T00:26:14 aarch64: Add support for Go closures
Richard Henderson 0e41c73b 2014-10-22T23:48:12 aarch64: Move x8 out of call_context Reduces stack size. It was only used by the closure, and there are available argument registers.
Richard Henderson a992f878 2014-10-22T22:58:09 aarch64: Add support for complex types
Richard Henderson 658b2b56 2014-10-22T22:36:07 aarch64: Remove aarch64_flags This field was useless from the start, since the normal flags field is available for backend use.
Richard Henderson 4a3cbcaa 2014-10-22T22:32:13 aarch64: Unify scalar fp and hfa handling Since an HFA of a single element is exactly the same as scalar, this tidies things up a bit.
Richard Henderson 12cf89ee 2014-10-22T21:53:30 aarch64: Move return value handling into ffi_closure_SYSV As with the change to ffi_call_SYSV, this avoids copying data into a temporary buffer.
Richard Henderson 4fe1aea1 2014-10-22T17:06:19 aarch64: Move return value handling into ffi_call_SYSV This lets us pass return data directly to the caller of ffi_call in most cases, rather than storing it into temporary storage first.
Richard Henderson 325471ea 2014-10-22T13:58:59 aarch64: Merge prep_args with ffi_call Use the trick to allocate the stack frame for ffi_call_SYSV within ffi_call itself.
Richard Henderson 8c8161cb 2014-10-22T12:52:07 aarch64: Tidy up abi manipulation Avoid false abstraction, like get_x_addr. Avoid recomputing data about the type being manipulated. Use NEON insns for HFA manipulation. Note that some of the inline assembly will go away in a subsequent patch.
Richard Henderson b55e0366 2014-10-22T12:33:59 aarch64: Treat void return as not passed in registers This lets us do less post-processing when there's no return value.
Richard Henderson 95a04af1 2014-10-21T22:41:07 aarch64: Reduce the size of register_context We don't need to store 32 general and vector registers. Only 8 of each are used for parameter passing.
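To make the saving concrete, here is a sketch comparing a context that holds only the eight argument registers of each class with one that holds all 32. The struct and macro names are hypothetical and do not reproduce the definitions in ffi.c.

```c
#include <stddef.h>
#include <stdint.h>

/* Only x0-x7 and v0-v7 carry arguments in the AArch64 calling convention. */
#define DEMO_N_X_ARG_REG 8
#define DEMO_N_V_ARG_REG 8

/* One 128-bit vector register, modelled as two 64-bit halves. */
struct demo_vreg { uint64_t lo, hi; };

/* Reduced context: just the registers used for parameter passing. */
struct demo_register_context {
  struct demo_vreg v[DEMO_N_V_ARG_REG];
  uint64_t x[DEMO_N_X_ARG_REG];
};

/* For comparison: a context with room for all 32 registers of each class. */
struct demo_full_register_file {
  struct demo_vreg v[32];
  uint64_t x[32];
};

/* Bytes saved per call by storing only the argument registers. */
size_t demo_bytes_saved(void)
{
  return sizeof(struct demo_full_register_file)
       - sizeof(struct demo_register_context);
}
```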
Richard Henderson 77c4cddc 2014-10-21T13:30:40 aarch64: Simplify AARCH64_STACK_ALIGN The iOS abi doesn't require padding between arguments, but that's not what AARCH64_STACK_ALIGN meant. The hardware will in fact trap if the SP register is not 16 byte aligned.
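The 16-byte requirement on SP amounts to rounding any frame size up to the next multiple of 16. A minimal sketch, using a hypothetical DEMO_STACK_ALIGN constant and helper rather than the macro from the source file:

```c
#include <stddef.h>

/* SP must stay 16-byte aligned on AArch64; this is independent of how
   individual arguments are padded within the argument area. */
#define DEMO_STACK_ALIGN 16

/* Hypothetical helper: round a frame size up to the stack alignment. */
size_t demo_align_stack(size_t bytes)
{
  return (bytes + DEMO_STACK_ALIGN - 1) & ~(size_t) (DEMO_STACK_ALIGN - 1);
}
```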
Richard Henderson b5f147d8 2014-10-21T13:27:57 aarch64: Always distinguish LONGDOUBLE Avoid if-deffery by forcing FFI_TYPE_LONGDOUBLE different from FFI_TYPE_DOUBLE. This will simply be unused on hosts that define them identically.
Richard Henderson 38b54b9c 2014-10-21T13:17:39 aarch64: Improve is_hfa The set of functions get_homogeneous_type, element_count, and is_hfa are all intertwined and recompute data. Return a compound quantity from is_hfa that contains all the data and avoids the recomputation.
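The idea of returning one compound result instead of calling three intertwined helpers can be sketched as follows. Everything here is hypothetical: the simplified type descriptor, the `demo_hfa_info` struct, and the sample types are illustrative stand-ins and do not mirror ffi_type or the encoding actually used in ffi.c.

```c
#include <stddef.h>

enum demo_kind { DEMO_INT, DEMO_FLOAT, DEMO_DOUBLE, DEMO_STRUCT };

/* Simplified type descriptor; elements is a NULL-terminated array that is
   only used for DEMO_STRUCT. */
struct demo_type {
  enum demo_kind kind;
  const struct demo_type *const *elements;
};

/* Everything a caller needs about an HFA, computed in a single walk. */
struct demo_hfa_info {
  int is_hfa;           /* 1 if the type is a homogeneous FP aggregate */
  enum demo_kind base;  /* element type, valid when is_hfa is 1 */
  int count;            /* number of elements (1 to 4) when is_hfa is 1 */
};

/* Walks nested members, recording the common base type and the count.
   Returns 0 as soon as the type cannot be an HFA. */
static int demo_walk(const struct demo_type *t, enum demo_kind *base, int *count)
{
  if (t->kind == DEMO_STRUCT) {
    for (const struct demo_type *const *e = t->elements; *e != NULL; e++)
      if (!demo_walk(*e, base, count))
        return 0;
    return 1;
  }
  if (t->kind != DEMO_FLOAT && t->kind != DEMO_DOUBLE)
    return 0;                       /* non floating-point member */
  if (*count == 0)
    *base = t->kind;                /* first member fixes the base type */
  else if (*base != t->kind)
    return 0;                       /* mixed float/double */
  return ++*count <= 4;             /* at most four members */
}

struct demo_hfa_info demo_is_hfa(const struct demo_type *t)
{
  struct demo_hfa_info info = { 0, DEMO_INT, 0 };
  enum demo_kind base = DEMO_INT;
  int count = 0;

  if (t->kind == DEMO_STRUCT && demo_walk(t, &base, &count) && count > 0) {
    info.is_hfa = 1;
    info.base = base;
    info.count = count;
  }
  return info;
}

/* Sample types: struct { float, float, float } and struct { float, double }. */
const struct demo_type demo_float = { DEMO_FLOAT, NULL };
const struct demo_type demo_double = { DEMO_DOUBLE, NULL };
const struct demo_type *const demo_vec3_elems[] =
  { &demo_float, &demo_float, &demo_float, NULL };
const struct demo_type demo_vec3 = { DEMO_STRUCT, demo_vec3_elems };
const struct demo_type *const demo_mixed_elems[] =
  { &demo_float, &demo_double, NULL };
const struct demo_type demo_mixed = { DEMO_STRUCT, demo_mixed_elems };
```

A caller gets the yes/no answer, the element type, and the element count from one call, which is the recomputation the commit message says it avoids.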
Richard Henderson 18b74ce5 2014-10-21T13:00:34 aarch64: Fix non-apple compilation
Anthony Green 862f53de 2014-09-18T19:06:08 Merge pull request #130 from frida/fix/darwin-aarch64-float-alignment Fix alignment of FFI_TYPE_FLOAT for Apple's ARM64 ABI
Ole André Vadla Ravnås aebf2c30 2014-07-25T21:40:50 Fix alignment of FFI_TYPE_FLOAT for Apple's ARM64 ABI
Ole André Vadla Ravnås 0f4e09d2 2014-07-26T00:11:06 Fix non-variadic CIF initialization for Apple/ARM64 Turns out `aarch64_nfixedargs` wasn't initialized in the non-variadic case, resulting in undefined behavior when allocating arguments.
Ole André Vadla Ravnås 419503f4 2014-04-06T20:54:13 Fix handling of variadic calls on Darwin/AArch64
Zachary Waldowski b4df9cf9 2014-02-05T14:22:52 AArch64: Fix void fall-through case when assertions are enabled
Zachary Waldowski 0a333d6c 2014-01-09T14:03:29 Darwin/aarch64: Fix size_t assumptions
Zachary Waldowski f466aad0 2014-01-21T16:38:31 AArch64: Fix missing semicolons when assertions are enabled
Zachary Waldowski 0a0f12ce 2014-01-09T13:50:17 AArch64: Remove duplicitous element_count call. This inhibits an analyzer warning by Clang.
Zachary Waldowski 4330fdcd 2014-01-09T13:53:30 Darwin/aarch64: Respect iOS ABI re: stack argument alignment
Zachary Waldowski 2c18e3c7 2013-12-30T16:14:02 Darwin/aarch64: Fix "shadows declaration" warnings
Zachary Waldowski 1b8a8e20 2014-01-09T13:55:21 Darwin/aarch64: Use Clang cache invalidation builtin
Zachary Waldowski 6030cdca 2013-12-30T15:45:51 Darwin/aarch64: Account for long double being equal to double
Zachary Waldowski 5bfe62a0 2014-01-09T13:41:27 Darwin/AArch64: Inhibit Clang previous prototype warnings
Anthony Green 3dc3f32c 2013-12-05T16:23:25 Undo iOS ARM64 changes.
Zachary Waldowski 0278284e 2013-11-30T03:03:37 Darwin/aarch64: size_t assumptions
Zachary Waldowski 9775446b 2013-11-30T02:39:34 Darwin/aarch64: Fix “shadows declaration” warnings
Zachary Waldowski 4260badc 2013-11-30T02:08:14 Darwin/aarch64: Use Clang cache invalidation builtin
Zachary Waldowski 9fa7998d 2013-11-30T02:07:48 Darwin/aarch64: Inhibit Xcode warning
Zachary Waldowski 0e832048 2013-11-30T02:07:34 Darwin/aarch64: double == long double
Anthony Green 128cd1d2 2013-10-08T06:45:51 Fix spelling errors
Anthony Green 58e8b66f 2012-10-30T07:07:19 AArch64 port
Anthony Green fa5d7479 2012-10-30T07:07:19 AArch64 port