Generalize side table #711
In principle this looks good. From an organization point of view, I would put everything in the side table (which represents pre-computed information on the wasm bytecode). This side table would contain information about how to take branches/jumps, where to find functions, etc. It needs to be serializable with easy random access. Regarding the other types of accessors, it's probably enough to store their position in the wasm bytecode and deserialize them on demand. This is still O(1). We can probably avoid storing the "end" parser position by always using the end of the wasm bytecode. For functions, we could also store only one position, which would be right before the
@ia0 Could you take another look at the design? Thanks.
Based on experiments, these are the smallest masks to pass CI. In other words, `SideTableEntry` only requires `u35` at this point. I'll add the fields from `func_type()` and `func()` in `module.rs` to the side table in #711, and `SideTableEntry` as `u64` might not be sufficient. #46 --------- Co-authored-by: Zhou Fang <[email protected]>
The (generalized) side table should logically be a mapping from function index to function metadata, where a function metadata has the following components:
- The type (index) of the function.
- The body (parser) of the function.
- The "side table" of the function. Let's call this branch table.
We want to serialize this logical value somehow. One idea is to have something like this:
side-table := index flatten-metadata (16-bits aligned)
index := u16 u16 ... u16 (N - 1 times where N is the number of functions in the module)
flatten-metadata := metadata metadata ... metadata (N times where `flatten-metadata[i + 1]` starts at byte position `index[i]`)
metadata := parser:(u32 u32) type:u16 branch-table (if 32-bits aligned)
| type:u16 parser:(u32 u32) branch-table (otherwise)
branch-table := branch branch ... branch
branch := delta-ip:u16 delta-stp:u16 val-cnt:u8 pop-cnt:u12
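A minimal sketch of how the `index` in this layout could give O(1) random access, assuming the positions are byte offsets relative to the start of `flatten-metadata` and that the first metadata starts at offset 0. The function name `metadata_range` is hypothetical and only illustrates the lookup rule stated above.

```rust
use std::ops::Range;

/// Returns the byte range of the metadata of function `func` inside
/// `flatten-metadata`.
///
/// `index` holds N - 1 entries for N functions: `index[i]` is the byte
/// position where the metadata of function `i + 1` starts. The metadata of
/// function 0 is assumed to start at position 0 (an assumption of this sketch).
fn metadata_range(index: &[u16], flat_len: usize, func: usize) -> Option<Range<usize>> {
    let count = index.len() + 1; // N functions
    if func >= count {
        return None;
    }
    // Start of this metadata: 0 for the first function, else the previous index entry.
    let start = if func == 0 { 0 } else { index[func - 1] as usize };
    // End of this metadata: start of the next one, or the end of the flattened data.
    let end = if func + 1 == count { flat_len } else { index[func] as usize };
    if start > end || end > flat_len {
        return None; // malformed index
    }
    Some(start..end)
}
```

With three functions and `index = [14, 32]` over 40 bytes of flattened metadata, function 1 would occupy bytes `14..32`.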
Could you elaborate on the two cases here? By "otherwise", do you mean 16 bits aligned?
@ia0 Please take a look at the generalized
Yes. Every "item" is at least 16 bits aligned.
Why does the order matter? We can read exactly based on how we write, and the alignment is known by design when we write.
For the CPUs we use, it probably doesn't matter because they are quite simple. But Rust, which supports much more, makes a distinction between aligned and unaligned accesses. But maybe the solution is to just use

Regarding the order, every metadata has an odd amount of

There's the same problem for writing. We need to write with the proper alignment. It is not known; it depends on how many branches are in each function before the current function. We could add a u16 padding after some functions to make sure all metadata start 32-bit aligned, so that even indices and odd indices can be used to tell the alignment. But I think at this point, it's probably better to just not rely on alignment and read all values as
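One way to avoid relying on alignment, sketched under assumptions: read values from a byte slice by copying the bytes out, which has no alignment requirement. The little-endian encoding and the helper names `read_u16` and `read_u32` are assumptions of this sketch, not something decided in this thread.

```rust
/// Reads a little-endian u16 at byte offset `pos`, whatever its alignment.
/// Returns `None` if the read would go out of bounds.
fn read_u16(data: &[u8], pos: usize) -> Option<u16> {
    let mut buf = [0u8; 2];
    buf.copy_from_slice(data.get(pos..pos.checked_add(2)?)?);
    Some(u16::from_le_bytes(buf))
}

/// Reads a little-endian u32 at byte offset `pos`, whatever its alignment.
/// Returns `None` if the read would go out of bounds.
fn read_u32(data: &[u8], pos: usize) -> Option<u32> {
    let mut buf = [0u8; 4];
    buf.copy_from_slice(data.get(pos..pos.checked_add(4)?)?);
    Some(u32::from_le_bytes(buf))
}
```

Since the bytes are copied into a local array first, the same code works whether a metadata starts at a 16-bit or a 32-bit boundary, so no padding or field reordering is needed.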
Thanks! I think it looks good now, modulo the redoing of the `BranchTableEntry` representation (using `[u16; 3]` and no bitfield).
Thanks! LGTM modulo the nits. I'm creating a review commit and merging.
I'd like to get some feedback about the design: the "section table" is used to improve the performance of these accessors in `module.rs`, which are used in execution. For these three accessors that return `Parser`, we can precompute the `ParseRange` during validation, and create a parser based on the range during execution.

I'm not sure about the other accessors that return types or even `Option<ExportDesc>`. IIUC, we want to store the section table in flash as the side table. Does that mean we would have to design some encoding rules for these return types?

Update 1

We will only optimize `func()` and `func_type()` in `module.rs`, because the other accessors should be fast.

Update 2

- `SideTableEntry` as `BranchTableEntry`.
- `SideTable` to include branch table and other function metadata.

#46
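The idea in the description (record where a function body lives during validation, then rebuild a parser from that range during execution) could look roughly like the sketch below. The fields of `ParseRange`, the simplified `Parser` type, and the function `parser_for` are hypothetical stand-ins, not the crate's actual definitions.

```rust
/// Hypothetical byte range of a function body within the wasm bytecode,
/// recorded once during validation.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct ParseRange {
    start: usize,
    end: usize,
}

/// Hypothetical parser: a view over the bytes left to parse.
#[derive(Debug, PartialEq, Eq)]
struct Parser<'a> {
    data: &'a [u8],
}

/// Rebuilds a parser in O(1) from a range recorded during validation,
/// instead of re-scanning the sections of the module during execution.
fn parser_for<'a>(wasm: &'a [u8], ranges: &[ParseRange], func: usize) -> Option<Parser<'a>> {
    let range = ranges.get(func)?;
    // `get` returns `None` for an out-of-bounds or inverted range.
    Some(Parser { data: wasm.get(range.start..range.end)? })
}
```

The lookup is a bounds-checked slice operation, so execution never has to walk the module to find a function body.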