
tinySelfEE 2021-07; let's throw away the Symbolic eval code

This will be a weird blogpost, here is some context:

I've been working on tinySelfEE for some time, but basically, I haven't touched it for more than six months now. And I feel like I don't really know how to start, or what I was even working on before. That weird feeling when you find ancient artifacts left by a strange civilization, which was yourself in the past.

This means it's exploration time. I am writing this blogpost to capture my thought process and to make myself work on tinySelfEE again. It should provide me with some context of where I am and what I am doing.

The cloc utility tells me that there are 3779 lines of Java. I've somehow persuaded Idea to give me this diagram of classes used in the project:

From what I remember, the tokenizer is working fine. The parser is mostly done; there is still stuff that could be improved, but at the moment, it works.

I have the project set up so that when I run it, it parses the input file called simple_send.tself:

(|
    slot: parameter = (| var = 1. |
        (parameter + 1) printString.
        parameter + 1.
    ).
|) slot: 1

It defines an object, which has one keyword slot called poetically slot:, which takes one parameter and defines one unused local variable var. When called at the last line, it should print parameter + 1 (⇒ 2) and then also return the resulting value 2.

When I run it, it prints the AST:

Send{obj=Obj{parents=null, slots={slot:=Obj{parents=null, slots={var=NumberInt{value=1}}, arguments=[parameter], code=[Send{obj=Send{obj=Send{obj=Self{}, message=MessageUnary{message_name='parameter'}}, message=MessageBinary{message_name='+', parameter=NumberInt{value=1}}}, message=MessageUnary{message_name='printString'}}, Send{obj=Send{obj=Self{}, message=MessageUnary{message_name='parameter'}}, message=MessageBinary{message_name='+', parameter=NumberInt{value=1}}}]}}, arguments=null, code=null}, message=MessageKeyword{message_name='slot:', parameter=[NumberInt{value=1}]}}

The AST is basically a tree consisting of Send nodes, each of which takes the definition of the object as its first parameter and the definition of a message as its second parameter.

Man, I really wish I had implemented printing to the plantuml syntax, because this is pretty unreadable. Let's put it into TODO.
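
To make the shape of that dump easier to see, here is a minimal Java sketch of such a tree for the expression parameter + 1. The class names are taken from the printed output above, but the fields, constructors and the AstNode / Message interfaces are assumptions made for illustration, not the actual tinySelfEE classes.

```java
// Simplified AST nodes named after the dump above; fields are assumptions.
class AstShapeDemo {
    // Builds the AST of `parameter + 1`, i.e. (self parameter) + 1.
    static AstNode parameterPlusOne() {
        AstNode parameter = new Send(new Self(), new MessageUnary("parameter"));
        return new Send(parameter, new MessageBinary("+", new NumberInt(1)));
    }

    public static void main(String[] args) {
        System.out.println(parameterPlusOne());
    }
}

// Hypothetical marker interfaces, not taken from the real project.
interface AstNode {}

interface Message {}

class Self implements AstNode {
    @Override public String toString() { return "Self{}"; }
}

class NumberInt implements AstNode {
    final long value;
    NumberInt(long value) { this.value = value; }
    @Override public String toString() { return "NumberInt{value=" + value + "}"; }
}

class MessageUnary implements Message {
    final String name;
    MessageUnary(String name) { this.name = name; }
    @Override public String toString() {
        return "MessageUnary{message_name='" + name + "'}";
    }
}

class MessageBinary implements Message {
    final String name;
    final AstNode parameter;
    MessageBinary(String name, AstNode parameter) {
        this.name = name;
        this.parameter = parameter;
    }
    @Override public String toString() {
        return "MessageBinary{message_name='" + name + "', parameter=" + parameter + "}";
    }
}

// A Send pairs a receiver expression with the message sent to it.
class Send implements AstNode {
    final AstNode obj;
    final Message message;
    Send(AstNode obj, Message message) {
        this.obj = obj;
        this.message = message;
    }
    @Override public String toString() {
        return "Send{obj=" + obj + ", message=" + message + "}";
    }
}
```

The receiver of the outer Send is itself a Send, which is why the dump nests so deeply even for a tiny expression.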

That seems to be working. Then there is a result of compilation to the symbolic structure:

SymbolicSend(
  SymbolicObject(
    id: 0,
    version: 1,
    slots = [
      "slot:" = SymbolicObject(
          id: 1,
          version: 1,
          arguments = [
            "parameter",
          ],
          slots = [
            "var" = 1,
          ],
          code = [
            SymbolicSend(
              SymbolicSend(
                SymbolicSend(
                  (default) self,
                  SymbolicMessage(
                    message: "parameter",
                  ),
                ),

                SymbolicMessage(
                  message: "+",
                  arguments = [
                    1,
                  ],
                ),
              ),

              SymbolicMessage(
                message: "printString",
              ),
            ),
            SymbolicSend(
              SymbolicSend(
                (default) self,
                SymbolicMessage(
                  message: "parameter",
                ),
              ),

              SymbolicMessage(
                message: "+",
                arguments = [
                  1,
                ],
              ),
            ),
          ],
        ),
    ],
  )
  SymbolicMessage(
    message: "slot:",
    arguments = [
      1,
    ],
  ),
)

Hm. Lovely. I think I wanted to "compile" the AST into a simpler structure of lists of SymbolicMessage-s and nested SymbolicSend-s.

Which seems to be working, at least for this short example. I remember that I almost lost myself in the process because it required me to implement compile methods for all AST items, and it got weird with inheritance and interfaces.

I had to create a parallel structure for the compiled AST tree, consisting of Symbolic* classes.

Each of them has an .accept() method for SymbolicVisitor, which is used by the debug printers and also by the SymbolicCompiler. They also have an .evaluate() method, which is not yet implemented and should be used for symbolic evaluation. Then there are classes working with these:

I vaguely remember that I've implemented most of the SymbolicObject, so that the slot lookup and parent lookup worked.
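
For context, the .accept() / visitor arrangement mentioned above could look roughly like the following sketch. Everything here is hypothetical: the *Sketch classes, the visit() overloads and the PrintingVisitor are stand-ins I made up for illustration, and the real SymbolicVisitor interface may look quite different.

```java
import java.util.Arrays;
import java.util.List;

class VisitorSketchDemo {
    // Builds the send `1 + 1` and renders it with a printing visitor.
    static String render() {
        SymbolicNode one = new SymbolicNumberSketch(1);
        SymbolicMessageSketch plus =
            new SymbolicMessageSketch("+", Arrays.<SymbolicNode>asList(new SymbolicNumberSketch(1)));
        PrintingVisitor printer = new PrintingVisitor();
        new SymbolicSendSketch(one, plus).accept(printer);
        return printer.result();
    }

    public static void main(String[] args) {
        System.out.println(render());
    }
}

// Hypothetical visitor interface: one visit() overload per node type.
interface SymbolicVisitorSketch {
    void visit(SymbolicSendSketch send);
    void visit(SymbolicMessageSketch message);
    void visit(SymbolicNumberSketch number);
}

interface SymbolicNode {
    void accept(SymbolicVisitorSketch visitor);
}

class SymbolicNumberSketch implements SymbolicNode {
    final long value;
    SymbolicNumberSketch(long value) { this.value = value; }
    public void accept(SymbolicVisitorSketch visitor) { visitor.visit(this); }
}

class SymbolicMessageSketch implements SymbolicNode {
    final String name;
    final List<SymbolicNode> arguments;
    SymbolicMessageSketch(String name, List<SymbolicNode> arguments) {
        this.name = name;
        this.arguments = arguments;
    }
    public void accept(SymbolicVisitorSketch visitor) { visitor.visit(this); }
}

class SymbolicSendSketch implements SymbolicNode {
    final SymbolicNode receiver;
    final SymbolicMessageSketch message;
    SymbolicSendSketch(SymbolicNode receiver, SymbolicMessageSketch message) {
        this.receiver = receiver;
        this.message = message;
    }
    public void accept(SymbolicVisitorSketch visitor) { visitor.visit(this); }
}

// A debug printer: the traversal logic lives here, not in the node classes.
class PrintingVisitor implements SymbolicVisitorSketch {
    private final StringBuilder out = new StringBuilder();

    public void visit(SymbolicSendSketch send) {
        out.append("send(");
        send.receiver.accept(this);
        out.append(", ");
        send.message.accept(this);
        out.append(")");
    }

    public void visit(SymbolicMessageSketch message) {
        out.append('"').append(message.name).append('"');
        for (SymbolicNode argument : message.arguments) {
            out.append(' ');
            argument.accept(this);
        }
    }

    public void visit(SymbolicNumberSketch number) {
        out.append(number.value);
    }

    String result() { return out.toString(); }
}
```

The point of the pattern is that a printer and a compiler can both walk the same tree without the node classes knowing about either of them.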

Let's work on it

Ok. So, what is missing? From what I understand now, the evaluation should be just a matter of calling .evaluate() on the root of the output from SymbolicCompiler. Let's try that.

My Main class calls this method:

private static void runFile(String file_path) throws IOException {
    byte[] bytes = Files.readAllBytes(Paths.get(file_path));
    ArrayList<ASTItem> ast = parseSourceAndPrintErrors(new String(bytes, StandardCharsets.UTF_8));

    printRawAst(ast);

    SymbolicCompiler compiler = new SymbolicCompiler();
    compiler.compile(ast);
    printSymbolicRepresentation(compiler.getCode());
}

Let's add evaluation:

System.out.println("---");
System.out.println("Symbolic evaluation time:\n");

SymbolicObject namespace = new SymbolicObject();
SymbolicFrame frame = new SymbolicFrame();
for (SymbolicEvalProtocol item : symbolic_code) {
    item.evaluate(namespace, frame);
}

I've added an empty namespace object, which will in time hold default objects required for the interpreter to be able to do anything useful.

And I've added a top-level frame. I call it and it does .. nothing. I can't say I expected more. Let's look at the implementation of the SymbolicSend's evaluate method:

public void evaluate(SymbolicObject namespace, SymbolicFrame frame) {

}

Unsurprisingly, it's empty. Ok.

Symbolic evaluation

Hmm, how should it work? I can see that the SymbolicObject has this implementation of the .evaluate() method:

@Override
public void evaluate(SymbolicObject namespace, SymbolicFrame frame) {
    frame.push(this);
}

When called, it adds itself on top of the stack in the frame. It seems that it would be a good idea to call .evaluate() on the first object (the receiver) in the symbolic send. This will also work for the case when the receiver is another SymbolicSend. But it won't work for the cases where the receiver is self.

Because when we want to push self on the top of the stack, we have to know what self is. Hm.

When I look at the code, I can see that the Self AST node is not compiled into the symbolic representation; instead, the SymbolicSend itself records that the message is meant for self. This means that I just have to have that self stored somewhere.

Okay, think about it. When the message is sent to the object, it should be looked up in the object itself, and if not found, in all the parents. But if it is not found in the parent tree, it should be looked up in the namespace. Which means:

Bleh. I am starting to be anxious, and I feel a great urge to go procrastinate. I solved and invented so much stuff in tinySelf that it feels really, really bad to reinvent it again in tinySelfEE. Let's look at the ._do_send() implementation in tinySelf:
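
A rough sketch of that lookup order, assuming a simplified object with a slot map and a list of parents. LookupObject and its methods are hypothetical names invented for this example; the real SymbolicObject is not shown in this post and may work differently.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class LookupSketchDemo {
    static String demo() {
        LookupObject namespace = new LookupObject();
        namespace.slots.put("nil", "nil from namespace");

        LookupObject parent = new LookupObject();
        parent.slots.put("greeting", "hello from parent");

        LookupObject child = new LookupObject();
        child.parents.add(parent);
        child.slots.put("x", "x from child");

        return child.lookup("x", namespace) + "; "
            + child.lookup("greeting", namespace) + "; "
            + child.lookup("nil", namespace) + "; "
            + child.lookup("missing", namespace);
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}

// Hypothetical stand-in for an object with slots and parents.
class LookupObject {
    final Map<String, Object> slots = new HashMap<>();
    final List<LookupObject> parents = new ArrayList<>();

    // 1. the object itself, 2. its parent tree, 3. the global namespace.
    Object lookup(String name, LookupObject namespace) {
        Object found = lookupInTree(name, new ArrayList<LookupObject>());
        if (found != null) {
            return found;
        }
        return namespace.slots.get(name); // null when the slot exists nowhere
    }

    private Object lookupInTree(String name, List<LookupObject> visited) {
        if (visited.contains(this)) {
            return null; // guard against cycles in the parent graph
        }
        visited.add(this);

        if (slots.containsKey(name)) {
            return slots.get(name);
        }
        for (LookupObject parent : parents) {
            Object found = parent.lookupInTree(name, visited);
            if (found != null) {
                return found;
            }
        }
        return null;
    }
}
```

In this sketch the namespace is passed in explicitly as a last resort instead of being wired into the object graph.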

obj = self.process.frame.pop()
self._set_scope_parent_if_not_already_set(obj, code)

and:

def _set_scope_parent_if_not_already_set(self, obj, code):
    if obj.scope_parent is None:
        obj.scope_parent = self.universe

Aha. The object to which the message is sent is taken from the top of the stack frame. And the global namespace is inserted into the .scope_parent property if it is not already set. That makes sense.

And there is _do_push_self() instruction, which takes the self from the frame. So the frame knows what is self. Ok.

def _do_push_self(self, bc_index, code_obj):
    self.process.frame.push(self.process.frame.self)

    return ONE_BYTECODE_LONG

And since it is a bytecode interpreter and objects are literals, it stores the self when a new object is pushed to the frame:

elif literal_type == LITERAL_TYPE_OBJ:
    assert isinstance(boxed_literal, ObjBox)

    obj = boxed_literal.value.clone()
    if self.process.frame.self is None:
        self.process.frame.self = obj

Ok, this is an ugly solution. For one, I don't understand why I didn't solve this in the frame itself. I've updated the push() method of the frame to this:

ObjectRepr self;
boolean has_self = false;

public void push(ObjectRepr obj) {
    obj_stack.add(obj);
    pointer++;

    if (! has_self) {
        self = obj;
        has_self = true;
    }
}

public void pushSelf() {
    push(self);
}
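
As a self-contained illustration of that behaviour, here is a minimal sketch of such a frame. FrameSketch is a hypothetical stand-in that stores plain Object values instead of ObjectRepr, and, unlike the snippet above, it throws when pushSelf() is called before anything was pushed; that check is my assumption, not something taken from the project.

```java
import java.util.ArrayList;
import java.util.List;

class FrameSketchDemo {
    // The first pushed object becomes self; later pushes do not change it.
    static boolean firstPushBecomesSelf() {
        FrameSketch frame = new FrameSketch();
        Object receiver = new Object();
        Object other = new Object();

        frame.push(receiver); // becomes self
        frame.push(other);    // self stays the same
        frame.pushSelf();     // pushes `receiver` again

        return frame.pop() == receiver
            && frame.pop() == other
            && frame.pop() == receiver;
    }

    public static void main(String[] args) {
        System.out.println(firstPushBecomesSelf());
    }
}

// Hypothetical simplified frame; the real SymbolicFrame is not shown here.
class FrameSketch {
    private final List<Object> stack = new ArrayList<>();
    private Object self;
    private boolean hasSelf = false;

    void push(Object obj) {
        stack.add(obj);
        if (!hasSelf) {
            self = obj;
            hasSelf = true;
        }
    }

    void pushSelf() {
        if (!hasSelf) {
            // Assumption: fail loudly instead of pushing null.
            throw new IllegalStateException("pushSelf() called before any push()");
        }
        push(self);
    }

    Object pop() {
        return stack.remove(stack.size() - 1);
    }
}
```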

What next? Update .evaluate() of the SymbolicSend to call the receiver's .evaluate(), which pushes it on top of the frame and makes it self. Ok. It now looks like this:

public void evaluate(SymbolicObject namespace, SymbolicFrame frame) {
    if (send_to_self) {
        frame.pushSelf();
    } else {
        receiver.evaluate(namespace, frame);
    }

}

Now it would be a good idea to see if it really works and what the frame looks like when this is executed.

*After some bugfixes*

Frames:

SymbolicFrame(
    depth: 0,
    self: Object,
    Object,
)

Revelation

Now I look at the result and wonder why I even need a stack in the symbolic evaluation. Did I just use it for symbolic evaluation because I had it in tinySelf? When crunching bytecodes, it makes perfect sense, but in the symbolic evaluation, I think I can work without it.
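
To illustrate what "working without it" could mean: in a tree-walking evaluator, each node can simply return its result to its caller, so the Java call stack does the job of the explicit object stack. The following is only a toy sketch under heavy assumptions: values are plain long numbers, self is passed as an argument, and EvalNode, SelfRef, IntLiteral and BinarySend are names invented for this example.

```java
class DirectEvalDemo {
    // Evaluates `(self + 1) + 1` with self bound to 1.
    static long demo() {
        EvalNode inner = new BinarySend(new SelfRef(), "+", new IntLiteral(1));
        EvalNode outer = new BinarySend(inner, "+", new IntLiteral(1));
        return outer.evaluate(1L);
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}

// Every node returns its value directly instead of pushing it on a frame.
interface EvalNode {
    long evaluate(long self);
}

class SelfRef implements EvalNode {
    public long evaluate(long self) { return self; }
}

class IntLiteral implements EvalNode {
    private final long value;
    IntLiteral(long value) { this.value = value; }
    public long evaluate(long self) { return value; }
}

class BinarySend implements EvalNode {
    private final EvalNode receiver;
    private final String selector;
    private final EvalNode argument;

    BinarySend(EvalNode receiver, String selector, EvalNode argument) {
        this.receiver = receiver;
        this.selector = selector;
        this.argument = argument;
    }

    public long evaluate(long self) {
        // Recursion replaces the explicit push/pop of intermediate results.
        long left = receiver.evaluate(self);
        long right = argument.evaluate(self);
        switch (selector) {
            case "+": return left + right;
            case "*": return left * right;
            default:
                throw new UnsupportedOperationException("unknown selector: " + selector);
        }
    }
}
```

A bytecode interpreter cannot lean on recursion like this, which is why the explicit stack made sense in tinySelf.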

Hm. And I study the code, and I try to implement the symbolic eval, when I realize .. why am I even doing this?

I mean the whole symbolic execution. I know that I want to have a working bytecode interpreter, like I had in tinySelf. And I think I believed at some point that making symbolic execution work would be easy, and that it would allow me to prototype some things before I get to the harder part.

Now I see that I'll have to do a bunch of nonsense, create a whole parallel set of classes just for easier symbolic evaluation, and I'll still need to write all the hard stuff anyway. And then when I get to the bytecode interpreter, I'll have this stuff all over the place, just getting in the way and not being useful at all.

So let's just .. throw it away.

And I did that. I've committed everything, created a new tag symbolic_execution and then went to delete files.

TODO
