|
2687cfc5
|
2023-02-02T09:10:00
|
|
Add wasm32 emscripten support (#763)
* added build script
* Apply libffi-emscripten patch
* Some changes to wasm32/ffi.c
* Remove exit(0); from test suites
* Fix LONGDOUBLE argument type
* Use more macros in ffi.c
* Use switch statements instead of if chains
* Implemented struct args
* Finish struct implementation
* Partially working closures
* Got closures working (most of closures test suite passes)
* Revert changes to test suite
* Update .gitignore
* Apply code formatter
* Use stackSave and stackRestore rather than directly adjusting stack pointer
* Add missing break
* Fix visibility of ffi_closure_alloc and ffi_closure_free
* Fix FFI_TYPE_STRUCT and FFI_TYPE_LONGDOUBLE when WASM_BIGINT is not used
sig needs to be vi here for FFI_TYPE_STRUCT and FFI_TYPE_LONGDOUBLE; noticed this while running the test suite without WASM_BIGINT support.
* Always use dynCall rather than direct wasmTable lookup (function pointer cast emulation changes dynCall)
* Prevent closures.c from duplicating symbols
* Try to set up CI
* Add test with bigint
* Make test methods static
* Remove BigInt shorthand because it messes up terser
* Add selenium tests
* Update tests a bit to try to make CI work
* WASM_BIGINT is a linker flag not a compile flag
* Finish getting CI working (#1)
* update gitignore
* Avoid adding "use strict;" to generated JS
This should be controlled by -s STRICT_JS in Emscripten.
* Make JavaScript ES5 compliant
* Remove redundant EXPORTED_RUNTIME_METHODS settings
* Fix definition of DEREF_I16
* Avoid marshalling FFI_TYPE_LONGDOUBLE when WASM_BIGINT is not used
* Add missing FFI_TYPE_STRUCT signature
* Improve test scripts
* Remove redundant EXPORTED_RUNTIME_METHODS settings
* Add missing EOL
* Add struct unpacking tests
* Update ci config to try to actually use WASM_BIGINT
* Revert "Avoid marshalling FFI_TYPE_LONGDOUBLE when WASM_BIGINT is not used"
This reverts commit 61bd5a3e20891623715604581b6e872ab3dfab80.
* Fix single_entry_structs tests
* Fix return from closure call
* Fix 64 bit return from closures
* only allocate as much space on stack for return pointer as needed
* Revert "only allocate as much space on stack for return pointer as needed"
This reverts commit e54a30faea3803e7ac33eed191bde9e573850fc1.
* xfail two tests
* Fix err_bad_abi test
* Remove test logging junk
* Try to set up long double marshalling for closures
* xfail err_bad_abi
* Fix reference errors in previous commit
* Add missing argument pointer assignment
* Fix signature of function pointer in cls_dbls_struct
* Fix longdouble argument
* Try some changes to bigint handling
* Fix BigInt handling
* Fix cls_longdouble test
* Fix long double closure arg with no WASM_BIGINT
* Use EM_JS to factor out js helpers
* Support for varargs closure calls
* Fix varargs calls
* Fix err_bad_abi test
* Fix typo in previous commit
* Add more assertions to closures test suite
* Fix some asserts
* Add assertions to a few more tests
* Fix some tests
* Fix more floating point assertions
* Update more tests
* Var args for ffi_call
* Don't do node tests
* Macro for allocating on stack
* Add some comments, simplify struct handling
* Try again to fix varargs calls, add comments
* Consolidate WASM_BIGINT conditionals into LOAD_U64 and STORE_U64 macros
* A bit of cleanup
* Fix another typo
* Some fixes to the testsuite
* Another testsuite fix
* Fix varargs with closures?
* Another attempt at getting closure varargs to work
* sig is initialized later
* Allow libffi.closures tests to be run
* Improve build script
* Remove redundant semicolons
* Fix a few libffi.closures test failures
* Cleanup
* Legacy dynCall API is no longer used
* Fix FFI_TYPE_LONGDOUBLE offset
* xfail 2 tests for WASM
- closure_loc_fn0; not applicable -- codeloc doesn't point to closure.
- huge_struct; function signature too long.
* Revert some redundant dg-output/printf statements
Helps Node.
* Revert "Don't do node tests"
This reverts commit a341ef4b.
* Fix assertions in cls_24byte
* More tiny formatting fixes to test suite
* Revert "Revert "Don't do node tests""
This reverts commit 7722e685ea04e2420e042886816d8c4dd31f5dcb.
* Fix 64 bit returns when WASM_BIGINT is absent
* Fix print statement in cls_24byte
* Add CALL_FUNC_PTR macro to allow pyodide to define custom calling behavior to handle fpcast
* Update single_entry_structs tests
* More explanations
* Fix compile error in last commit
* Add more support for pyodide fpcast emulation, update CI to try to test it
* Clone via https
* Fix path to pyodide emsdk_env
* Add asserts to the rest of the test suite
* Fix test compile errors
* Fix some tests
* Fix cls_ulonglong
* Fix alignment of <4 byte args
* fix cls_ulonglong again
* Use snprintf instead of sprintf
* Should assert that strncmp returned 0
* Fix va_struct1 and va_struct3
* Change double and long double tests
These tests are failing because of a strange bug with printf and doubles, but I am not convinced
it necessarily has anything to do with libffi. This version casts the double to int before printing it and avoids the issue.
* Enable node tests
* Revert "Change double and long double tests"
This reverts commit 8f3ff89c6577dc99564181cd9974f2f1ba21f1e9.
* Fix PYODIDE_FPCAST flag
* add conftest.py back in
* Fix emcc error: setting `EXPORTED_FUNCTIONS` expects `<class 'list'>` but got `<class 'str'>`
See discussion on https://github.com/pyodide/pyodide/pull/1596
* Remove test.html
* Remove duplicate test file
* More changes from upstream
* Fix some whitespace
* Add some basic debug logging statements
* Reapply libffi.exp changes
* Don't build docs (#7)
Works around build issue makeinfo: command not found.
* Update long double alignment
Emscripten 2.0.26 reduces the alignment of long double to 8. Quoting
from `ChangeLog.md`:
> The alignment of `long double`, which is a 128-bit floating-point
> value implemented in software, is reduced from 16 to 8. The lower
> alignment allows `max_align_t` to properly match the alignment we
> use for malloc, which is 8 (raising malloc's alignment to achieve
> correctness the other way would come with a performance regression).
> (#10072)
* Improve error handling a bit (#8)
* Fix handling of signed arguments to ffi_call (#11)
* Fix struct argument handling in ffi_call (#10)
* Remove fpcast emulation tests
* Align the stack to MAX_ALIGN before making call (#12)
* Increase MAX_ARGS
* Cleanup (#14)
* Fix Closure compiler error with -sASSERTIONS=1 (#15)
* Remove function pointer cast emulation (#13)
This reverts commits 593b402 and cbc54da, as it's no longer needed
after PR pyodide/pyodide#2019.
* Prefer the `__EMSCRIPTEN__` definition over `EMSCRIPTEN` (#18)
"The preprocessor define EMSCRIPTEN is deprecated. Don't pass it to code
in strict mode. Code should use the define __EMSCRIPTEN__ instead."
https://github.com/emscripten-core/emscripten/blob/84a634167a1cd9e8c47d37a559688153a4ceace6/emcc.py#L887-L890
* Install autoconf 2.71
* Try again with installing autoconf 2.71
* Fix compatibility with Emscripten 3.1.28
* CI: remove use of `EM_CONFIG` env
See commit:
https://github.com/emscripten-core/emsdk/commit/3d87d5ea8143b3636f872fb05b896eb4a19a070b
* Fix cls_multi_schar: cast rest_call to signed char
* Remove test xfails (#17)
* Fix long double when used as a varargs argument
* Enable unwindtest and fix it
* Add EM_JS_DEPS
* Also require convertJsFunctionToWasm
* Run tests very very verbose
* Echo the .emscripten file
* Remove --experimental-wasm-bigint insertion
* Build with assertions
* Move verbosity flags back out of LDFLAGS
* Remove debug print statement
* Use up to date pyodide docker image
* Explicitly cast res_call to fix test failure
* Put back name of main function in cls_longdouble_va.c
* Fix alignment
The stack pointer apparently needs to be aligned to 16. There were
some terrible subtle bugs caused by not respecting this. stackAlloc
knows that the stack should be 16 aligned, so we can use stackAlloc(0)
to enforce this. This way if alignment requirements change, as long
as Emscripten updates stackAlloc to continue to enforce them we should
be okay.
* Fix handling of systems with no Js bigint integration
When we run the node tests we use node v14 (since node v14 is
vendored with Emscripten). Node v14 has no Js bigint integration
unless the --experimental-wasm-bigint flag is passed. So only the
node tests really notice if we get this right. Turns out, it didn't
work. We can't call a JavaScript function with 64 bit integer arguments
without bigint integration.
In ffi_call, we are trying to call a wasm function that takes 64 bit
integer arguments. dynCall is designed to do this. We need to go back
to tracking the signature when we don't have WASM_BIGINT, and then use
dynCall. This works better now that emscripten can dynamically fill in
extra dynCall wrappers:
https://github.com/emscripten-core/emscripten/pull/17328
On the other hand, for the closures we are not getting a function pointer
as a first argument. We need to make our own wasm legalizer adaptor that
splits 64 bit integer arguments and then calls the JavaScript trampoline,
then the JavaScript trampoline reassembles them, calls the closure, then
splits the result (if it's a 64 bit integer) and the adaptor puts it back
together.
* Improvements to emscripten test shell scripts (#21)
This fixes the C++ unwinding tests and makes other minor improvements
to the Emscripten test shell scripts.
* Rename the test folder and move test files into emscripten test folder
* Use docker image that has autoconf-2.71
* Cleanup
* Pin emscripten 3.1.30
* Fix build.sh path
* Rearrange ci pipeline
* Fix bpo_38748 test
* Cleanup
* Improvements to comments, add static asserts, and update copyright
* Use `*_js` instead of `*_helper` for EM_JS functions (#22)
* Minor code simplification
* Xfail first dejagnu test to work around emscripten cache messages
See https://github.com/emscripten-core/emscripten/issues/18607
* Remove unneeded xfails
* Shorten conftest.py by using pytest-pyodide
* Apply formatters and linters to emscripten directory
* Fix Emscripten xfail hack
* Fix build-tests script
* Patch emscripten to quiet info messages
* Clean up compiler flags in scripts and remove some settings from circleci config
* Rename emscripten quiet script
* Add missing export
* Don't remove go.exp
* Add reference to emscripten logging issue
---------
Co-authored-by: Kleis Auke Wolthuizen <info@kleisauke.nl>
Co-authored-by: Kleis Auke Wolthuizen <github@kleisauke.nl>
Co-authored-by: Christian Heimes <christian@python.org>
|
|
9ba55921
|
2021-03-05T10:07:30
|
|
Static tramp v5 (#624)
* Static Trampolines
Closure Trampoline Security Issue
=================================
Currently, the trampoline code used in libffi is not statically defined in
a source file (except for MACH). The trampoline is either pre-defined
machine code in a data buffer or generated at runtime. In order to
execute a trampoline, it needs to be placed in a page with executable
permissions.
Executable data pages are attack surfaces for attackers who may try to
inject their own code into the page and contrive to have it executed. The
security settings in a system may prevent various tricks used in user land
to write code into a page and to have it executed somehow. On such systems,
libffi trampolines would not be able to run.
Static Trampoline
=================
To solve this problem, the trampoline code needs to be defined statically
in a source file, compiled and placed in the text segment so it can be
mapped and executed naturally without any tricks. However, the trampoline
needs to be able to access the closure pointer at runtime.
PC-relative data referencing
============================
The solution implemented in this patch set uses PC-relative data references.
The trampoline is mapped in a code page. Adjacent to the code page, a data
page is mapped that contains the parameters of the trampoline:
- the closure pointer
- pointer to the ABI handler to jump to
The trampoline code uses an offset relative to its current PC to access its
data.
Some architectures support PC-relative data references in the ISA itself.
E.g., X64 supports RIP-relative references. For others, the PC has to
somehow be loaded into a general purpose register to do PC-relative data
referencing. To do this, we need to define a get_pc() kind of function and
call it to load the PC in a desired register.
There are two cases:
1. The call instruction pushes the return address on the stack.
In this case, get_pc() will extract the return address from the stack
and load it in the desired register and return.
2. The call instruction stores the return address in a designated register.
In this case, get_pc() will copy the return address to the desired
register and return.
Either way, the address of the instruction following the call is obtained.
Scratch register
================
In order to do its job, the trampoline code would need to use a scratch
register. Depending on the ABI, there may not be a register available for
scratch. This problem needs to be solved so that all ABIs will work.
The trampoline will save two values on the stack:
- the closure pointer
- the original value of the scratch register
This is what the stack will look like:
	sp before trampoline ------>	--------------------
					| closure pointer  |
					--------------------
					| scratch register |
	sp after trampoline ------->	--------------------
The ABI handler can do the following as needed by the ABI:
- the closure pointer can be loaded in a desired register
- the scratch register can be restored to its original value
- the stack pointer can be restored to its original value
(the value when the trampoline was invoked)
To do this, I have defined prolog code for each ABI handler. The legacy
trampoline jumps to the ABI handler directly. But the static trampoline
defined in this patch jumps to the prolog code which performs the above
actions before jumping to the ABI handler.
Trampoline Table
================
In order to reduce the trampoline memory footprint, the trampoline code
would be defined as a code array in the text segment. This array would be
mapped into the address space of the caller. The mapping would, therefore,
contain a trampoline table.
Adjacent to the trampoline table mapping, there will be a data mapping that
contains a parameter table, one parameter block for each trampoline. The
parameter block will contain:
- a pointer to the closure
- a pointer to the ABI handler
The static trampoline code would finally look like this:
- Make space on the stack for the closure and the scratch register
by moving the stack pointer down
- Store the original value of the scratch register on the stack
- Using PC-relative reference, get the closure pointer
- Store the closure pointer on the stack
- Using PC-relative reference, get the ABI handler pointer
- Jump to the ABI handler
Mapping size
============
The size of the code mapping that contains the trampoline table needs to be
determined on a per architecture basis. If a particular architecture
supports multiple base page sizes, then the largest supported base page size
needs to be chosen. E.g., we choose 16K for ARM64.
Trampoline allocation and free
==============================
Static trampolines are allocated in ffi_closure_alloc() and freed in
ffi_closure_free().
Normally, applications use these functions. But there are some cases out
there where the user of libffi allocates and manages its own closure
memory. In such cases, static trampolines cannot be used. These will
fall back to using legacy trampolines. The user has to make sure that
the memory is executable.
ffi_closure structure
=====================
I did not want to make any changes to the size of the closure structure for
this feature to guarantee compatibility. But the opaque static trampoline
handle needs to be stored in the closure. I have defined it as follows:
-      char tramp[FFI_TRAMPOLINE_SIZE];
+      union {
+        char tramp[FFI_TRAMPOLINE_SIZE];
+        void *ftramp;
+      };
If static trampolines are used, then tramp[] is not needed to store a
dynamic trampoline. That space can be reused to store the handle. Hence,
the union.
Architecture Support
====================
Support has been added for x64, i386, aarch64 and arm. Support for other
architectures can be added very easily in the future.
OS Support
==========
Support has been added for Linux. Support for other OSes can be added very
easily.
Signed-off-by: Madhavan T. Venkataraman <madvenka@linux.microsoft.com>
* x86: Support for Static Trampolines
- Define the arch-specific initialization function ffi_tramp_arch ()
that returns trampoline size information to common code.
- Define the trampoline code mapping and data mapping sizes.
- Define the trampoline code table statically. Define two tables,
actually, one with CET and one without.
- Introduce a tiny prolog for each ABI handling function. The ABI
handlers addressed are:
- ffi_closure_unix64
- ffi_closure_unix64_sse
- ffi_closure_win64
The prolog functions are called:
- ffi_closure_unix64_alt
- ffi_closure_unix64_sse_alt
- ffi_closure_win64_alt
The legacy trampoline jumps to the ABI handler. The static
trampoline jumps to the prolog function. The prolog function uses
the information provided by the static trampoline, sets things up
for the ABI handler and then jumps to the ABI handler.
- Call ffi_tramp_set_parms () in ffi_prep_closure_loc () to
initialize static trampoline parameters.
Signed-off-by: Madhavan T. Venkataraman <madvenka@linux.microsoft.com>
* i386: Support for Static Trampolines
- Define the arch-specific initialization function ffi_tramp_arch ()
that returns trampoline size information to common code.
- Define the trampoline code table statically. Define two tables,
actually, one with CET and one without.
- Introduce a tiny prolog for each ABI handling function. The ABI
handlers addressed are:
- ffi_closure_i386
- ffi_closure_STDCALL
- ffi_closure_REGISTER
The prolog functions are called:
- ffi_closure_i386_alt
- ffi_closure_STDCALL_alt
- ffi_closure_REGISTER_alt
The legacy trampoline jumps to the ABI handler. The static
trampoline jumps to the prolog function. The prolog function uses
the information provided by the static trampoline, sets things up
for the ABI handler and then jumps to the ABI handler.
- Call ffi_tramp_set_parms () in ffi_prep_closure_loc () to
initialize static trampoline parameters.
Signed-off-by: Madhavan T. Venkataraman <madvenka@linux.microsoft.com>
* arm64: Support for Static Trampolines
- Define the arch-specific initialization function ffi_tramp_arch ()
that returns trampoline size information to common code.
- Define the trampoline code mapping and data mapping sizes.
- Define the trampoline code table statically.
- Introduce a tiny prolog for each ABI handling function. The ABI
handlers addressed are:
- ffi_closure_SYSV
- ffi_closure_SYSV_V
The prolog functions are called:
- ffi_closure_SYSV_alt
- ffi_closure_SYSV_V_alt
The legacy trampoline jumps to the ABI handler. The static
trampoline jumps to the prolog function. The prolog function uses
the information provided by the static trampoline, sets things up
for the ABI handler and then jumps to the ABI handler.
- Call ffi_tramp_set_parms () in ffi_prep_closure_loc () to
initialize static trampoline parameters.
Signed-off-by: Madhavan T. Venkataraman <madvenka@linux.microsoft.com>
* arm: Support for Static Trampolines
- Define the arch-specific initialization function ffi_tramp_arch ()
that returns trampoline size information to common code.
- Define the trampoline code mapping and data mapping sizes.
- Define the trampoline code table statically.
- Introduce a tiny prolog for each ABI handling function. The ABI
handlers addressed are:
- ffi_closure_SYSV
- ffi_closure_VFP
The prolog functions are called:
- ffi_closure_SYSV_alt
- ffi_closure_VFP_alt
The legacy trampoline jumps to the ABI handler. The static
trampoline jumps to the prolog function. The prolog function uses
the information provided by the static trampoline, sets things up
for the ABI handler and then jumps to the ABI handler.
- Call ffi_tramp_set_parms () in ffi_prep_closure_loc () to
initialize static trampoline parameters.
Signed-off-by: Madhavan T. Venkataraman <madvenka@linux.microsoft.com>
|