-
Notifications
You must be signed in to change notification settings - Fork 756
Compiling to WebAssembly with Binaryen
This page explains how you can write a compiler to WebAssembly using Binaryen. First, let's get some FAQs out of the way.
WebAssembly is a cross-browser standard for executable code. By compiling to it, you can run your code on the web, without plugins.
As well as working in browsers, WebAssembly also works as a general-purpose cross-platform binary format. For example, you can use standalone WebAssembly runtimes and the upcoming WASI system interface to create CLI applications.
There are already a few ways to compile to WebAssembly, and more will probably appear. Different approaches can have different benefits and tradeoffs. Specifically, Binaryen aims to be
- Simple,
- Fast,
- Still generate pretty good code despite the first two.
Binaryen keeps things simple and fast by using an internal IR that is almost identical to WebAssembly. This makes sense for two main reasons:
- We know our target is WebAssembly, and we don't need to think about other targets - that's a job for WebAssembly VMs! Here we don't need extra abstraction layers for numerous targets.
- Obviously multiple IRs can help in optimization - that's another reason most compilers have them (in particular, SSA-based IRs). However, this matters less for us, for 3 reasons:
- First, WebAssembly VMs in browsers will use a powerful SSA-based IR to optimize your code anyhow, because WebAssembly is a portable target, higher-level than what compilers like LLVM generally emit. For that reason, standard compiler optimizations matter less for us than in general.
- Second, WebAssembly needs a bunch of special optimizations - it's designed to be compact and efficient for transmission, and as a result, looks somewhat different than typical assembly languages. So we need a WebAssembly AST anyhow to generate good code, and that's what we have.
- Finally, it is becoming more and more clear that really good performance depends on language-specific optimizations, and not just the standard ones. That's why languages like Swift and Rust have their own IRs and optimizers before they feed into LLVM and optimize there. The same model can work with Binaryen: perform your language-specific optimizations first on your own IR that you already have, then feed that into Binaryen. As a result, you'll still have those language-specific optimizations.
Therefore even with a single IR in Binaryen we should be able to emit fairly good code. And with just one IR, you can avoid a substantial amount of overhead that most compilers have.
In addition, Binaryen's IR is designed to be lightweight and fast:
- Small data structures (e.g., nodes do not refer to their parents)
- Multithreaded optimization (simple in part because of those small data structures with minimal cross-references)
- Arena allocation, which is both fast and also encourages relevant content to be contiguous in memory
- Minimize allocation, instead prefer to reuse memory
Note that we said Binaryen uses WebAssembly as its main IR. Binaryen already has optional support for a more CFG-style input IR as well, for convenience; others may follow, but a fast one-IR path will remain.
Binaryen expects input that is generally compatible with being compiled to WebAssembly. Specifically, you should be able to emit data for it in the following form:
- Basic types are i32, i64, f32, f64 (no i8 or i16, and no i57 or anything else).
- Mathematical operations are everything in the WebAssembly spec: add, sub, mul, ctz, popclt, etc. etc., read the spec for details.
- "Native" control flow is structured - blocks, ifs, loops, etc. - and you can pass that in if you have it. If not, you can provide an arbitrary control flow graph, which Binaryen will "reloop" into structured control flow for you (more details later down).
- WebAssembly IR is structured, which means it can have nested expressions. You can pass that in if you have it. If, instead, you have a more "flat" representation of simple instructions in lists in basic blocks, then you can pass that in as well, just create Blocks and fill them with such simple instructions, using locals to avoid nesting. That will use more local variables, but the Binaryen optimizer can optimize those out. An example of using locals to avoid nesting:
(local.set $temp1
(i32.const 42))
(local.set $temp2
(call $foo
(local.get $temp1)))
(local.set $temp3
(i32.eqz
(local.get $temp2)))
(call $bar
(local.get $temp3))
;; instead of
(call $bar
(i32.eqz
(call $foo
(i32.const 42))))
- WebAssembly is not an SSA-based IR. Instead, you define an arbitrary number of local variables and can read and write to them without limit. You can pass that in if you have it. If instead you have SSA, then that should work almost trivially, just provide each SSA variable as a WebAssembly local variable (obviously you will have a lot per function, but the Binaryen optimizer can help there). For phis, the CFG support Binaryen has accepts code on branches, so you can replace a phi at a block with setting of local variables on the branch edges leading to it, which should also be almost trivial.
- WebAssembly defines no built-in ABI. All it provides is the ability to define functions and call them (without thinking about the ABI - you perform a
call
and it just happens, i.e., the native call stack is managed for you, including arguments, local variables, etc.), and a linear memory that you can use however you want. You can use that linear memory to implement a user stack, for example by deciding that address 4 is the location of the stack pointer (4 might make sense as the first non-0 aligned address), and emitting code to use that stack however you want. For example, you can see which local variables need to be on the user stack (if their address is taken, as local variables are like registers, they are not in linear memory), and allocate and free space for them in function call prologues and epilogues and so forth. Some languages may not need such a stack, of course, it's entirely up to you what you do with linear memory.
There are several ways to use Binaryen, which we will describe in this section: from JS, C, and C++.
See the binaryen.js-API docs for details. Code samples are in test/binaryen.js
.
The C API is in a single header here. That contains everything you need together with the libbinaryen
library which is built in lib/
.
There is a hello world test which is a good starting point. There is also a kitchen-sink test as well, which should cover practically all the API.
When compiling to Binaryen, you'll probably do something like what you see in those examples, which follows this pattern:
- Create a Module. A Module represents a WebAssembly module. It will contain code and data.
- Create functions.
- Each function needs a type: what type parameters it receives, and what type it returns.
- The code in the function. You can create expressions using the creator functions like
BinaryenGetLocal
which creates an AST node that reads a local. For details on what the AST nodes are, see the wasm spec and Binaryen's corewasm.h
header.- When creating nodes, you'll need to specific opcode types. Opcodes are numbers that are generated by e.g.
BinaryenAbs()
forAbs
(absolute value). These are function calls so that you don't need to use the C header if you don't want to. Their values are fixed so you can cache the output into a variable, if you want, for performance. - You'll also create literal values for constants, using e.g.
BinaryenLiteralInt32()
to get a literal representing a number of typei32
. Literals are passed around by value, they are very small structs.
- When creating nodes, you'll need to specific opcode types. Opcodes are numbers that are generated by e.g.
- You can then create the function, passing it the type and the expression which will be its body.
- Create imports and exports: what the Module receives from the outside, and what it provides. See the wasm spec for more details.
- Set up memory: you can define static data that will be part of the Module, and will reside in the memory the Module sees when it runs.
- When you have a full Module, you can perform operations on it, like
- Validate it, to check for any errors. It's helpful to validate during development and debugging, and we don't do it automatically (as it can take noticeable time), so except in stable production builds it's best to validate as soon as the module is complete, and before saving it or optimizing it, etc.
- Print it for debugging purposes.
- Write the module into a buffer. That binary data is the final WebAssembly code, that you can run in a browser.
- Modules must be cleaned up when you are finished with them, using
BinaryenModuleDispose
. That frees the Module and everything on it (i.e. the Module is the only thing you need to worry about memory management for).
For a complete example of a compiler using the C API, look at the mir2wasm project, which compiles Rust into WebAssembly.
As mentioned earlier, Binaryen's native IR is an AST, but it can also receive input in an arbitrary control flow graph, which is a very common case. Binaryen can then "reloop" that code into structured control flow. This works for any CFG, even irreducible ones.
To do so, use the CFG/Relooper part of the C API, as follows:
- Create a Relooper instance with
RelooperCreate()
. - Create basic blocks with
RelooperAddBlock()
. The contents of a basic block is a Binaryen expression! In other words, it can be anything in the WebAssembly AST. And that content is not touched when relooping, we just add necessary things around it (loops, ifs, etc.). The one thing you should avoid is control flow in the bodies of blocks - it's actually ok to have internal control flow if you want that, but if you branch outside, that can't work. (However, you probably don't want that anyhow since you're using the relooper to generate control flow for you!) In the simple case where you have just straightline code in your basic blocks, you can create an AST Block (usingBinaryenBlock()
) with the list of instructions on it. - Create branches between blocks, using
RelooperAddBranch()
. A branch goes from one block to another, if a condition occurs. The condition is a regular Binaryen expression, which should be ani32
(for example, you might create an expression that compares a local to a specific value, and the branch would be taken if that is true). The condition can also be NULL, and the branches out of a block should always contain exactly one such branch, which functions as the default: if no other condition is true, we take that route.- If a block has no outgoing branches, the Relooper assumes you created a terminator in the contents of the block. That is, the contents should end in a
return
(with or without a value, as appropriate for the function) or anunreachable
or something else that will prevent control flow from continuing. The Relooper can't do this for you, as it doesn't know how you want execution to stop (note that it's also hard to warn about this, as you may call an import from JS that you know will throw - so the IR the Relooper sees may not even contain a terminator). If you forget to add a terminating instruction, you may see odd behavior - control flow may flow into whichever basic block happens to appear right after that one. - You can also put code on a branch. The code happens as the branch is taken, before we reach the target. This is useful for implementing phis, if your compiler has them.
- If a block has no outgoing branches, the Relooper assumes you created a terminator in the contents of the block. That is, the contents should end in a
- Finally, after creating your blocks and branches, just call
RelooperRenderAndDispose()
. That will reloop the code and return a Binaryen expression which contains all those basic blocks and branches, properly structured. (This call will also dispose of the Relooper instance, and you don't need to worry about memory management.)
Note that the relooper output is not optimized by default: you will see redundant blocks and so forth. If you optimize your code (using BinaryenModuleOptimize()
) then you should see nice control flow.
See binaryen-c.h
and src/CFG/Relooper.h
for more technical details, and the test suite for concrete examples. For a full compiler, see the mir2wasm project mentioned earlier, which uses the CFG API in addition to the C API.
The C API (including the CFG API) have a tracing option that prints out C API commands for each command you issue. The result is a runnable program that does the same things you were doing when you ran the trace. This lets you easily generate a standalone testcase from your compiler, without a dependency on your compiler itself, it will just do the same Binaryen C API calls that you did.
See BinaryenSetAPITracing
in binaryen-c.h
for more details. There is also an example of this in test/example/c-api-kitchen-sink.c
, and check.py
checks that tracing outputs a proper program for that by both matching against the known correct output, and also building and running it.
C++ is what Binaryen is written in, and you can extend it in that language. Note though that the C API is likely going to be more stable, as internal APIs may change.
The Binaryen tools handle many errors, including some common I/O errors, by immediately exiting. This behavior applies to some of the APIs that users of Binaryen as a library might call. When Binaryen encounters such an error it calls the Fatal
type, which performs the early exit. It is possible to change this behavior at compile time by defining THROW_ON_FATAL
, in which case Fatal
will throw a std::runtime_error
that clients can catch.
The end result of compilation is a WebAssembly binary. You can run that in a browser, giving it its imports, and receiving and calling its exports. That's all there is to it.
However, you may also want to use the Emscripten compiler infrastructure. Emscripten lets you "link" with JavaScript libraries to do useful things, like render WebGL, run a browser main loop, handle a filesystem, provide bindings to JS so it's easy to call into your compiled code, etc., as well as a full set of compiled libraries like libc and so forth. To use that, you need to provide Emscripten with your WebAssembly file as well as a metadata file, and call emcc
. TODO: There are a few minor details to be finished on the Emscripten side for this to just work, but this is basically what already happens with the LLVM WebAssembly backend and Emscripten; we just need to generalize it a little.