Background:
WebAssembly/component-model#371 (comment)
We pay a cost to cross the wasm-JS boundary. If this happens a lot it can be significant. One way to avoid such boundary crossings is to batch calls, to build a buffer of serialized instructions and then call JS once to read it from linear memory and execute it. This approach is taken by Emscripten's GL proxying and webcc. If there are many short calls, this can speed things up in some cases.
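A minimal sketch of the batching idea, entirely in JS for illustration: the "wasm side" appends `[opcode, nargs, args...]` records to a shared buffer, and a single real boundary call decodes and dispatches them all. The names (`OP_SET_PIXEL`, `emit`, `flushBatch`) and the record layout are assumptions for this sketch, not the actual encoding the PR generates.

```javascript
const OP_SET_PIXEL = 1;
const OP_CLEAR = 2;

const batch = new Int32Array(1024); // stand-in for a region of linear memory
let batchLen = 0;

// "Wasm side": serialize a call into the buffer instead of crossing the boundary.
function emit(op, ...args) {
  batch[batchLen++] = op;
  batch[batchLen++] = args.length;
  for (const a of args) batch[batchLen++] = a;
}

// "JS side": the one real import call; decodes the buffer and dispatches.
const calls = [];
function flushBatch() {
  let i = 0;
  while (i < batchLen) {
    const op = batch[i++];
    const nargs = batch[i++];
    const args = Array.from(batch.subarray(i, i + nargs));
    i += nargs;
    if (op === OP_SET_PIXEL) calls.push(["setPixel", ...args]);
    else if (op === OP_CLEAR) calls.push(["clear"]);
  }
  batchLen = 0;
}

emit(OP_CLEAR);
emit(OP_SET_PIXEL, 3, 4);
flushBatch(); // one crossing executes both batched calls
```

N logical calls then cost one boundary crossing plus a cheap decode loop, which is the whole point of the approach.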
This PR does something related but more general: it takes an input wasm and automatically applies batching to all calls where it can. All calls not returning a value are batched, while calls that do return a value flush the buffer and then run normally. The tool also autogenerates JS deserialization code to match the serialization; you paste that into the JS and that's it.
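The flush rule described above can be sketched as follows (a simplified model using closures instead of serialized bytes; all names here are illustrative assumptions): void calls are deferred into the batch, while a call that returns a value first flushes pending work and then runs immediately, preserving the original ordering.

```javascript
const pending = []; // batched void calls (simplified: closures, not serialized records)
const log = [];

function flush() {
  for (const fn of pending) fn();
  pending.length = 0;
}

// A call with no result can be deferred: nothing downstream depends on it yet.
function callVoid(name) {
  pending.push(() => log.push(name));
}

// A call whose result is needed must flush first, so earlier batched calls
// still run before it, then execute for real.
function callWithResult(name, value) {
  flush();
  log.push(name);
  return value;
}

callVoid("a");
callVoid("b");
const r = callWithResult("c", 42); // flushes a and b, then runs c
```

This is why workloads dominated by void calls benefit most: a value-returning call in the hot loop forces a flush and a real boundary crossing every time.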
This is not safe in general, because of issues like reentrancy (wasm->js->wasm->js) and stale data (if a pointer is serialized for later use, the data it points to must not be modified before the batch is flushed). If we decide to productionize this, there would need to be user control over what is autobatched and what is not, etc. (and in Emscripten specifically we could enable this by default on all `proxy: async` methods; other toolchains might have similar mechanisms). For now, however, this makes it easy to get benchmark numbers. I measured three things:
(These measurements are total time - I didn't measure the cost of individual wasm->js calls. But obviously this reduces that overhead to essentially 0, if you have enough calls being batched.)
So this does show a large speedup, as expected, when doing large numbers of js/wasm boundary crossings for small amounts of work. However, I don't know how common that is in practice - the last benchmark I tested, with WebGL, where I saw no speedup, is probably representative of most WebGL code out there (where proper shader and buffer usage avoids js/wasm overhead anyhow).