Latest revision as of 00:52, 29 January 2008

FrostByte: -39

Bytecode needs/design goals/whatever

Rough structure

Do it a function at a time first, go through callAsFunction, etc. However, try hard to construct arguments directly on the execution stack, and try to make List wrap around that.

Unsure: basic block boundaries. kjs-machine has an elegant way of doing basic structured control flow, but I am not sure of the details; going to a real CFG would make optimization easier, and just defer linearization.

Stack

Obviously, access should be cheap. And far cheaper if Maks is designing the bytecode than if anyone else is, since he has an undue fondness for RISC-like IRs. (Hey, can we do it in VIR?)

All local and temporary access should be a simple offset from something
Stack size can be precomputed: have a ByteCodeTemp class that links in to a counter in compilation context. No need for conditionals.
- We really, really, really, want to do a single allocation for a frame anyway.
Do we keep track of the SP?
- Pro: less ++ / --
- Con: garbage can linger
- Does it matter? Pick one
Basic evaluation interface thus far: void evaluateTo(CompileState* comp, ByteCodeTemp& dest)
A simple peephole may be needed to fix things like this up:

  t1 = local;
  t2 = t1 * t3;
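
The "stack size can be precomputed" item above could look roughly like the sketch below. Only the name ByteCodeTemp and the idea of a counter in the compilation context come from these notes; the fields `liveTemps` and `maxTemps`, the RAII shape, and the helper `tempsNeededForNestedExpression` are assumptions made up for illustration.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for the compilation context.
struct CompileState {
    std::size_t liveTemps = 0;  // temporaries currently in use
    std::size_t maxTemps = 0;   // high-water mark: temp slots the frame needs
};

// A temporary claims the next slot on construction and releases it on
// destruction, so the counter follows the nesting of the expression tree.
class ByteCodeTemp {
public:
    explicit ByteCodeTemp(CompileState& comp)
        : comp_(comp), slot_(comp.liveTemps++) {
        if (comp_.liveTemps > comp_.maxTemps)
            comp_.maxTemps = comp_.liveTemps;
    }
    ~ByteCodeTemp() { --comp_.liveTemps; }
    ByteCodeTemp(const ByteCodeTemp&) = delete;
    ByteCodeTemp& operator=(const ByteCodeTemp&) = delete;

    std::size_t slot() const { return slot_; }  // offset within the temp area

private:
    CompileState& comp_;
    std::size_t slot_;
};

// Mimics compiling a nested expression: one result temp, one operand temp,
// and two short-lived temps that end up sharing the same slot.
std::size_t tempsNeededForNestedExpression() {
    CompileState comp;
    ByteCodeTemp result(comp);       // slot 0
    {
        ByteCodeTemp lhs(comp);      // slot 1
        {
            ByteCodeTemp rhs(comp);  // slot 2, released at end of scope
        }
        ByteCodeTemp again(comp);    // reuses slot 2
    }
    return comp.maxTemps;
}
```

Since `maxTemps` is known once compilation finishes, the frame can be sized in a single allocation and the interpreter needs no bounds checks on push/pop.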

Exceptions

The ideal: there should be no work done checking for exceptions anywhere. The simple way of doing it is to expose the PC in the ExecState, then on raising/setException patch it up to point to the catch or a generic handler. The catch is easier, as one can just do the same sort of push/pop of handler addresses as old C++ implementations did. The other option is jumping to a special handler which figures out the catch based on PC. Probably not worth it -- try/catch should be somewhat rare. I would think. I hope. So I am probably wrong.

The problem is of course that the PC can not be cached in a register. I don't see any way around that.
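
A minimal sketch of the push/pop-handler variant, assuming a toy instruction set. The opcodes, the `Insn` layout, and every field of this `ExecState` are hypothetical and not the real KJS class. The interpreter loop never tests for a pending exception; `setException` just rewrites the PC, which is also why the PC has to live in the ExecState rather than in a local.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

enum Op { PushHandler, PopHandler, Throw, Add, Jump, Halt };
struct Insn { Op op; int arg; };

// Hypothetical ExecState: exposes the PC so raising can patch it.
struct ExecState {
    std::size_t pc = 0;
    std::vector<std::size_t> handlers;  // addresses of active catch blocks
    std::size_t genericHandler = 0;     // where uncaught exceptions land
    bool uncaught = false;

    void setException() {
        if (!handlers.empty()) {
            pc = handlers.back();       // resume at the innermost catch
            handlers.pop_back();
        } else {
            uncaught = true;
            pc = genericHandler;
        }
    }
};

// Assumption: the last instruction is Halt and doubles as the generic handler.
int run(const std::vector<Insn>& code, ExecState& exec) {
    int acc = 0;
    exec.genericHandler = code.size() - 1;
    for (;;) {
        const Insn& in = code[exec.pc++];  // PC is re-read from exec every step
        switch (in.op) {
        case PushHandler: exec.handlers.push_back(static_cast<std::size_t>(in.arg)); break;
        case PopHandler:  exec.handlers.pop_back(); break;
        case Throw:       exec.setException(); break;
        case Add:         acc += in.arg; break;
        case Jump:        exec.pc = static_cast<std::size_t>(in.arg); break;
        case Halt:        return acc;
        }
    }
}

// try { acc += 1; throw; acc += 100; } catch { acc += 10; }
std::vector<Insn> tryCatchProgram() {
    return {
        {PushHandler, 6},  // 0: enter try, catch body is at 6
        {Add, 1},          // 1
        {Throw, 0},        // 2
        {Add, 100},        // 3: skipped by the throw
        {PopHandler, 0},   // 4: leave try normally
        {Jump, 7},         // 5: skip over the catch body
        {Add, 10},         // 6: catch body
        {Halt, 0},         // 7
    };
}
```

The only per-try cost is the PushHandler/PopHandler pair; straight-line code pays nothing.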

Types

It would be nice to do double math, integer counters, etc., without making JS values at all.

Add type to temporaries... Something like this:

  void NumberNode::evaluateTo(CompileState* comp, ByteCodeTemp& dest) {
      if (dest.type() == FrostByte::Number) {
          // Destination is already a raw number: load the literal directly
          emitLoadDoubleImmediate(dest, value());
      } else {
          // Asks Node:: to do the conversion for us
          evaluateConvert(comp, dest, FrostByte::Number);
      }
  }

Problem: how much would extra conversion instructions/extra spin around the loop cost?
Locals would still need JSValues
- When can they be demoted to temporaries? Is no eval and no nested functions enough?
Problem: type information would help saner bytecode generation, but more data-flow-like type computations can only be done at bytecode level. Might be tricky to clean things up after the fact.
This needs separate areas of the stack, one for JS objects, and one for things that should not be marked
- Or, simpler, typed portions of the stack, with a bit for the mark code. This is easier for globals
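
One way to picture the "typed portions of the stack, with a bit for the mark code" variant is sketched below. `JSObject`, `Slot`, and `Frame` are hypothetical stand-ins invented for this example; a `std::vector<bool>` plays the role of the bitmask.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

struct JSObject { bool marked = false; };  // stand-in for a GC-managed value

// A slot holds either a raw number or an object pointer; the frame's
// bitmask, not the slot itself, records which one it currently is.
union Slot {
    double number;
    std::int32_t integer;
    JSObject* object;
};

class Frame {
public:
    explicit Frame(std::size_t size) : slots_(size), isObject_(size, false) {}

    void setNumber(std::size_t i, double v) { slots_[i].number = v; isObject_[i] = false; }
    void setObject(std::size_t i, JSObject* o) { slots_[i].object = o; isObject_[i] = true; }
    double number(std::size_t i) const { return slots_[i].number; }

    // The mark code only visits slots whose bit is set, so raw doubles are
    // never misread as pointers. Returns how many objects were marked.
    std::size_t mark() {
        std::size_t count = 0;
        for (std::size_t i = 0; i < slots_.size(); ++i) {
            if (isObject_[i] && slots_[i].object) {
                slots_[i].object->marked = true;
                ++count;
            }
        }
        return count;
    }

private:
    std::vector<Slot> slots_;
    std::vector<bool> isObject_;  // one bit per slot for the mark code
};
```

With this layout, double math on temporaries touches only `number`, and no JS value is created until something actually needs one.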

Literals

Have a literal table, for all the string literals at least
- Can't exactly stick UStrings into not-very-structured memory
- Actually, can stick Identifier*, as long as the AST sticks around, which it has to due to string conversion of functions.

Number literals may be inline, or not. Unclear
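
As a rough illustration of a literal table, the sketch below interns each distinct string once and hands back a small index that a bytecode operand could carry. It uses `std::string` purely as a stand-in for UString/Identifier, and the class and method names are assumptions, not anything from KJS.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <vector>

class LiteralTable {
public:
    // Returns the index for this literal, adding it on first sight.
    std::size_t intern(const std::string& text) {
        auto found = indexOf_.find(text);
        if (found != indexOf_.end())
            return found->second;
        std::size_t index = literals_.size();
        literals_.push_back(text);
        indexOf_.emplace(text, index);
        return index;
    }

    // What a "load string literal" instruction would look up at run time.
    const std::string& at(std::size_t index) const { return literals_[index]; }
    std::size_t size() const { return literals_.size(); }

private:
    std::vector<std::string> literals_;
    std::unordered_map<std::string, std::size_t> indexOf_;
};
```

The instruction stream then contains only plain integers, which sidesteps storing string objects in not-very-structured memory.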

Globals

Obviously, want everything to work the same way
- Being able to access global functions by ID would be good -- but quite tricky to get right
Multiple script tags ⇒ might have to resize
- Can not have fixed offsets into a multi-area frame from the same location; keep multiple base pointers
  - Those can be cached in interpreter locals most of the time I think --- but stuff like document.write may be tricky. Needs thought. Could perhaps abuse the exception machinery
- Do not need to have multiple areas if we have a bitmask for marking --- probably better, lower reg pressure, but still a patch up may be needed. Hmm, there can be multiple functions executing, though, which is bad. I really need to see whether this is an issue, even with document.write, since moving the BP to a non-local will be slow.

Can realloc the mess and with multiple BPs keep older offsets.

@@ Line 1: / Line 1: @@
-=== Bytecode needs/design goals/whatever ===
+= FrostByte: -39 =
+== Bytecode needs/design goals/whatever ==
-* Initial overview: do it a function at a time. Function calls still
+=== Rough structure ===
-do callAsFunction:
+Do it a function at a time first, go through callAsFunction, etc.
-** But can setup the arguments on the stack frame, along with the length.
+However, try hard to construct arguments directly on the execution stack, and
-    List integration with that may be tricky, especially for native calls.
+try to make List wrap around that.
-* Goal #1: cheap temporaries --- make the wiring inexpensive
+Unsure: basic block boundaries. kjs-machine has an ellegant
-** M.O.: my tendency is towards simpler bytecode (think Copy4, Copy8),
+way of doing basic structured control flow, but I am not
-    for simpler optimization; critical for that, nice otherwise
+sure of details; and going to a real CFG will make optimization
-** All accesses to temporaries, locals should be simple offsets
+easier, and just defer linearization.
-** Can compute stack size need easily: just have a
-     ByteCodeTemp class that links in to a counter in
-     compilation context. So there is no need to put in conditionals
-     on push/pop. Also, this gives the temporaries an explicit name,
-     which can be in a shared namespace with locals. Somewhat (see below)
-** Keep track of stack pointer?
-*** Pro: less ++ / --
-*** Con: garbage can linger
-** Basic compilation interface for evaluate:
-         void evaluateTo(CompileState* comp, ByteCodeTemp& dest);
-** A simple peephole optimization on reads for this:
-         t1 = local;
-         t2 = t1 * t3;
-    will probably be a good idea. Not sure if full copy prop is worth anything
+=== Stack ===
-  * Limit reallocations -- entering a context should do only a single
+Obviously, access should be cheap. And far more cheap if Maks
-    stack frame allocation
+is designing the bytecode than if anyone else, since he has an
+undue fondness for RISC-like IRs. (Hey, can we do it in VIR?).
+* All local and temporary access should be simple offset from something
+* Stack size can be precomputed: have a <tt>ByteCodeTemp</tt> class that links in to a counter in compilation context. No need for conditionals.
+** We really, really, really, want to do a single allocation for a frame anyway.
+* Do we keep track of the SP?
+** Pro: less ++ / --
+** Con: garbage can linger
+** Does it matter? Pick one
+* Basic evaluation interface thus far: <tt>void evaluateTo(CompileState* comp, ByteCodeTemp& dest)</tt>
+* A simple peephole may be needed to fix things like this up:
+<pre>
+  t1 = local;
+  t2 = t1 * t3;
+</pre>
-* Literals should go into some sort of a symbol table
+=== Exceptions ===
+The ideal: there should be no work done checking for exceptions anywhere.
+The simple way of doing it is to expose the PC in the ExecState,
+than on raising/setException patch it up to point to the catch or a generic
+handle. The catch is easier, as one can just do the same some sort of a
+push/pop of handler address as old C++ implementations did. The other option
+is jumping to a special handler which figures out the catch based on PC.
+Probably not worth it -- try/catch should be somewhat rare. I would think.
+I hope. So I am probably wrong.
-* Obviously, would like globals to used all of the above
+The problem is of course that PC can not be cached in a register. I don't see any way around
-  * Problem: symbol tables, stack frames might grow. May require
+that
-    segmenting them an multiple subst
+=== Types ===
+It would be nice to do double math, integer counters, etc., without making JS values at all.
+* Add type to temporaries... Something like this:
+  <pre>NumberNode::evaluateTo(CompileState* comp, ByteCodeTemp& dest) {
+       if (dest.type() == FrostByte::Number)
+           emitLoadDoubleImmediate(dest, value());
+       else
+           evaluateConvert(comp, dest, FrostByte::Number);
+           // Asks Node:: to do the conversion for us
+  }
+  </pre>
+* Problem: how much would extra conversion instructions/extra spin around the loop cost?
+* Locals would still need JSValues
+** When can they be demoted to temporaries? Is no eval and no nested functions enough?
+* Problem: type information would help saner bytecode generation, but more data-flow-like type computations  can only be done at bytecode level. Might be tricky to clean things up after the fact.
+* This needs separate areas of the stack, one for JS objects, and one for things that should not be marked
+** Or, simpler, typed portions of the stack, with a bit for the mark code.  This is easier for globals
+=== Literals ===
+* Have a literal table, for all the string literals at least
+** Can't exactly stick UStrings into not-very-structured memory
+** Actually, can stick Identifier*, as long as the AST stick arounds,
+which it has due to string conversion of functions.
+* Number literals may be inline, or not. Unclear
+=== Globals  ===
+* Obviously, want everything to work the same way
+** Being able to access global functions by ID would be good -- but quite tricky to get right
+* Multiple script tags &rArr; might have to resize
+** Can not have fixed offsets into a multi-area frame from same location, keep multiple base pointers
+*** Those can be cached in interpreter locals most of the time I think --- but stuff like document.write may be tricky. Needs thought. Could perhaps abuse the exception machinery
+** Do not need to have multiple areas if have bitmask for marking --- properly better, lower reg pressure, but still a patch up may be
+needed. Hmm, there can be multiple functions executing, though, which is bad.
+I really need to see whether this is an issue, even with document.write,
+since moving the BP to a non-local will be slow.
+* Can realloc the mess and with multiple BPs keep older offsets.