FrostByte: -39
Bytecode needs/design goals/whatever
Rough structure
Do it a function at a time first, go through callAsFunction, etc. However, try hard to construct arguments directly on the execution stack, and try to make List wrap around that.
Unsure: basic block boundaries. kjs-machine has an elegant way of doing basic structured control flow, but I am not sure of the details; going to a real CFG would make optimization easier, and would just defer linearization.
Stack
Obviously, access should be cheap. And far cheaper if Maks is designing the bytecode than if anyone else is, since he has an undue fondness for RISC-like IRs. (Hey, can we do it in VIR?)
- All local and temporary access should be simple offset from something
- Stack size can be precomputed: have a ByteCodeTemp class that links in to a counter in compilation context. No need for conditionals.
- We really, really, really, want to do a single allocation for a frame anyway.
- Do we keep track of the SP?
- Pro: less ++ / --
- Con: garbage can linger
- Does it matter? Pick one
- Basic evaluation interface thus far: void evaluateTo(CompileState* comp, ByteCodeTemp& dest)
- A simple peephole may be needed to fix things like this up:
t1 = local; t2 = t1 * t3;
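A minimal sketch of the precomputed stack size idea, assuming a ByteCodeTemp that claims a slot on construction and releases it on destruction. The members liveTemps and maxTemps, the offset() accessor, and the frameSlotsForExample helper are made up for illustration and are not an existing KJS API. The high-water mark is what a single frame allocation would have to cover.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical compilation context: counts the temporaries live right now
// and remembers the largest number that were ever live at once.
struct CompileState {
    std::size_t liveTemps = 0;
    std::size_t maxTemps = 0;
};

// Hypothetical RAII temporary: takes the next free slot when constructed
// and gives it back when destroyed, so nested expressions reuse slots.
class ByteCodeTemp {
public:
    explicit ByteCodeTemp(CompileState& comp)
        : m_comp(comp), m_offset(comp.liveTemps++)
    {
        if (comp.liveTemps > comp.maxTemps)
            comp.maxTemps = comp.liveTemps;
    }
    ~ByteCodeTemp() { --m_comp.liveTemps; }

    ByteCodeTemp(const ByteCodeTemp&) = delete;
    ByteCodeTemp& operator=(const ByteCodeTemp&) = delete;

    // Offset of this temporary from the start of the temporaries area.
    std::size_t offset() const { return m_offset; }

private:
    CompileState& m_comp;
    std::size_t m_offset;
};

// Mimics the temporaries a compiler might claim for (a * b) + c and
// returns how many slots the frame needs for them.
std::size_t frameSlotsForExample()
{
    CompileState comp;
    {
        ByteCodeTemp sum(comp);          // slot 0
        {
            ByteCodeTemp lhs(comp);      // slot 1
            ByteCodeTemp rhs(comp);      // slot 2
        }                                // slots 1 and 2 released here
        ByteCodeTemp c(comp);            // slot 1 again
    }
    return comp.maxTemps;
}
```

The bookkeeping happens only at compile time; at run time every temporary is a fixed offset, which is the point of the scheme.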
Exceptions
The ideal: no work should be done checking for exceptions anywhere. The simple way of doing it is to expose the PC in the ExecState, then on raising/setException patch it up to point to the catch or a generic handler. The catch is easier, as one can just do the same sort of push/pop of handler addresses as old C++ implementations did. The other option is jumping to a special handler which figures out the catch based on PC. Probably not worth it -- try/catch should be somewhat rare. I would think. I hope. So I am probably wrong.
The problem is of course that the PC cannot be cached in a register. I don't see any way around that.
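A rough sketch of the push/pop handler scheme, assuming a toy instruction set. The opcodes, the Instr layout, the handlers vector, and the run function are all invented for illustration; only the idea of setException patching a PC stored in the ExecState comes from the notes above. Ordinary instructions do no exception checks: the loop simply continues from whatever the PC says.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy opcodes, invented for this sketch.
enum Op { PushHandler, PopHandler, Throw, Add, Halt };

struct Instr {
    Op op;
    int arg;   // handler address for PushHandler, addend for Add
};

struct ExecState {
    std::size_t pc = 0;                  // lives in memory, not in a register
    int acc = 0;
    bool halted = false;
    std::vector<std::size_t> handlers;   // addresses of the active catch blocks

    // Redirect execution: to the innermost catch if there is one,
    // otherwise to the "generic handler", which here just stops the run.
    void setException()
    {
        if (handlers.empty()) {
            halted = true;
            return;
        }
        pc = handlers.back();
        handlers.pop_back();
    }
};

int run(const std::vector<Instr>& code)
{
    ExecState exec;
    while (!exec.halted && exec.pc < code.size()) {
        // Advance the PC before dispatch so setException can overwrite it.
        const Instr in = code[exec.pc++];
        switch (in.op) {
        case PushHandler:
            exec.handlers.push_back(static_cast<std::size_t>(in.arg));
            break;
        case PopHandler:
            exec.handlers.pop_back();
            break;
        case Throw:
            exec.setException();
            break;
        case Add:
            exec.acc += in.arg;
            break;
        case Halt:
            exec.halted = true;
            break;
        }
    }
    return exec.acc;
}
```

The cost shows up exactly where the notes predict: exec.pc has to be re-read on every iteration because anything that can throw may have changed it.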
Types
It would be nice to do double math, integer counters, etc., without making JS values at all.
- Add type to temporaries... Something like this:
NumberNode::evaluateTo(CompileState* comp, ByteCodeTemp& dest)
{
    if (dest.type() == FrostByte::Number)
        emitLoadDoubleImmediate(dest, value());
    else
        evaluateConvert(comp, dest, FrostByte::Number); // Asks Node:: to do the conversion for us
}
- Problem: how much would extra conversion instructions/extra spin around the loop cost?
- Locals would still need JSValues
- When can they be demoted to temporaries? Is no eval and no nested functions enough?
- Problem: type information would help saner bytecode generation, but more data-flow-like type computations can only be done at bytecode level. Might be tricky to clean things up after the fact.
- This needs separate areas of the stack, one for JS objects, and one for things that should not be marked
- Or, simpler, typed portions of the stack, with a bit for the mark code. This is easier for globals
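One way the "bit for the mark code" variant could look, as a sketch only. The Slot union, the Frame type, its objectMask field, and the stand-in JSObject are assumptions made for this example; a fixed 64-bit mask is also a simplification that limits a frame to 64 slots.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Stand-in for a garbage-collected value; only the mark flag matters here.
struct JSObject {
    bool marked = false;
};

// One stack slot that can hold an unboxed number or a pointer to an object.
union Slot {
    double number;
    std::int32_t integer;
    JSObject* object;
};

// Hypothetical frame: a single allocation of slots plus a bitmask, fixed at
// compile time, that tells the collector which slots hold object pointers.
struct Frame {
    std::vector<Slot> slots;
    std::uint64_t objectMask;   // bit i set => slots[i].object is live data

    Frame(std::size_t count, std::uint64_t mask)
        : slots(count), objectMask(mask)
    {
        assert(count <= 64);    // limit imposed by the single-word mask
    }

    // Marks every object reachable from this frame and returns how many
    // slots were visited as objects. Unboxed slots are never inspected.
    std::size_t mark()
    {
        std::size_t visited = 0;
        for (std::size_t i = 0; i < slots.size(); ++i) {
            const bool isObject = ((objectMask >> i) & 1u) != 0;
            if (isObject && slots[i].object) {
                slots[i].object->marked = true;
                ++visited;
            }
        }
        return visited;
    }
};
```

With this layout all slots share one base pointer, which is the register-pressure argument for the bitmask over separate areas.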
Literals
- Have a literal table, for all the string literals at least
- Can't exactly stick UStrings into not-very-structured memory
- Actually, can stick Identifier* in there, as long as the AST sticks around, which it has to due to string conversion of functions.
- Number literals may be inline, or not. Unclear
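A small sketch of a literal table, assuming the bytecode refers to string literals by index rather than embedding them. LiteralTable and its methods are invented names, and std::string stands in for UString here.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical per-function literal table: each distinct string is stored
// once, and the bytecode carries only the small integer index.
class LiteralTable {
public:
    // Returns the index for s, adding it to the table on first use.
    std::size_t intern(const std::string& s)
    {
        auto it = m_index.find(s);
        if (it != m_index.end())
            return it->second;
        const std::size_t id = m_strings.size();
        m_strings.push_back(s);
        m_index.emplace(s, id);
        return id;
    }

    // Looks a literal up by index; throws std::out_of_range on a bad index.
    const std::string& at(std::size_t id) const { return m_strings.at(id); }

    std::size_t size() const { return m_strings.size(); }

private:
    std::vector<std::string> m_strings;
    std::unordered_map<std::string, std::size_t> m_index;
};
```

The same table could hold number literals too if they turn out not to be worth inlining; that question is left open above.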
Globals
- Obviously, want everything to work the same way
- Being able to access global functions by ID would be good -- but quite tricky to get right
- Multiple script tags ⇒ might have to resize
- Cannot have fixed offsets into a multi-area frame from the same location; keep multiple base pointers
- Those can be cached in interpreter locals most of the time I think --- but stuff like document.write may be tricky. Needs thought. Could perhaps abuse the exception machinery
- Do not need to have multiple areas if there is a bitmask for marking --- probably better, lower reg pressure, but a patch-up may still be needed. Hmm, there can be multiple functions executing, though, which is bad. I really need to see whether this is an issue, even with document.write, since moving the BP to a non-local will be slow.
- Can realloc the mess and with multiple BPs keep older offsets.
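To make the resize problem concrete, here is a sketch assuming globals live in one growable area addressed as base pointer plus offset. GlobalArea, addGlobals, and the use of double as a slot type are all assumptions for illustration. Offsets handed out earlier stay valid across growth, but the cached base pointer has to be patched up whenever the storage moves.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical global variable area. double stands in for a JS value slot.
struct GlobalArea {
    std::vector<double> storage;
    double* base = nullptr;   // cached base pointer, like an interpreter local

    // Reserves count new slots (for example when another script tag is
    // compiled) and returns the offset of the first one. Growing may move
    // the storage, so the cached base pointer is refreshed afterwards.
    std::size_t addGlobals(std::size_t count)
    {
        const std::size_t first = storage.size();
        storage.resize(first + count);
        base = storage.data();   // the patch-up step
        return first;
    }

    // All access is a simple offset from the base pointer.
    double& at(std::size_t offset) { return base[offset]; }
};
```

The awkward case from the notes is a resize that happens while a function holding its own copy of the base pointer is still running; this sketch avoids it only because there is a single shared copy.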