Environment and the programming language Self (part two; language)

Note:

There is also a Czech version of this article: Prostředí a programovací jazyk Selfu (díl druhý; jazyk)


The previous episode, Environment and the programming language Self (part one; environment), introduced Self as a project, showed where to download it, and gave a basic orientation in the Morphic interface. In this episode, we'll look at the language itself and at some of the more interesting parts of the standard library.

From the language point of view

On the syntactic level, Self is inspired by Smalltalk. It was born in the same place, in Xerox PARC, although a decade later. In Self, like in Smalltalk, everything revolves around sending messages. Unlike Smalltalk, Self introduces a literal for objects, which means that objects are as natural as strings and numbers in other languages.

You can think of an object as key-value storage. Individual keys are called slots.

When you send a message to an object, the corresponding slot is looked up. If the slot contains data (the object stored there doesn't contain code), that data is returned. If it contains a code object, the code is evaluated and the resulting value is returned.

A message is sent to an object by writing the message name after the object:

obj message

This is similar to C-like languages:

obj.message()

Syntax of the object

() creates an empty object, that is, an object without slots, which can't react to messages.

(| a. b. |) creates an object that contains two slots called a and b. The values of both slots are set to nil.

| opens and closes the section of slot definitions. The slots themselves are separated by dots. The above object is a "box" with two drawers called a and b. These drawers can store data or code.

An object in Self works a bit like a hashmap or a dictionary. A specific value can be stored in a specific slot. The value can be assigned during initialization:

(| a <- nil. b = nil. |)

You can see two different types of assignment in the example. The first one (<-) creates a rewritable slot, the other one (=) creates a read-only slot.

In fact, the first case creates two slots: the slot a itself, and a slot a: holding a method (an assignment primitive) that writes to it. In the latter case, only the slot itself is created, so it is not possible to write into it directly. This is useful for all kinds of constant values.
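For readers more at home in Python, the two kinds of slots can be modelled roughly as follows. This is only an analogy: the SelfObject class and its add_constant, add_assignable and send methods are made up for this sketch and say nothing about how Self is actually implemented.

```python
class SelfObject:
    """Toy model of a Self object: a mapping from slot names to values."""

    def __init__(self):
        self.slots = {}

    def add_constant(self, name, value):
        # Models `name = value`: only the data slot exists.
        self.slots[name] = value

    def add_assignable(self, name, value):
        # Models `name <- value`: the data slot plus an assignment slot `name:`.
        self.slots[name] = value

        def assign(new_value):
            self.slots[name] = new_value

        self.slots[name + ":"] = assign

    def send(self, message, *args):
        if message not in self.slots:
            raise LookupError("message not understood: " + message)
        slot = self.slots[message]
        # Code slots are evaluated, data slots are simply returned.
        return slot(*args) if callable(slot) else slot


obj = SelfObject()
obj.add_assignable("a", None)   # a <- nil
obj.add_constant("b", None)     # b = nil

obj.send("a:", 1)               # works, because the slot `a:` exists
print(obj.send("a"))            # 1
print("b:" in obj.slots)        # False, so `b` cannot be rewritten
```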

Each object can also contain code, which is written after the second vertical bar |:

(| a = 1 | a printLine)

The above example defines an unnamed object with a single slot a, set to the number object 1. It also contains code that sends the printLine message to the slot a, that is, to the value stored in that slot.

Parent slots

A special type of slot in Self is the so-called parent slot. Messages that are not found in the object itself are delegated to the objects that its parent slots point to.

When we send a message clone to the object,

(| p* = traits clonable |)

it returns a copy of itself, even though it has no method stored in a clone slot. The method is instead defined in the object referenced by the slot p*, or in other objects further up the parent chain, if that object also has parent slots.

This mechanism effectively implements inheritance. If you think about it, it resembles calling an ancestor's method on an object that inherits from a base class, as you would in Python, for example. The main difference is that a parent slot is just a slot, which means it can be changed dynamically.
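The lookup rule can be sketched in Python as a toy model. The SelfObject class, the lookup method and the example objects polite and casual are invented for illustration. Unlike real Self, which reports an ambiguity when several parents match, this sketch simply takes the first match.

```python
class SelfObject:
    """Toy model: slots live in a dict, parent slots are those whose name ends with '*'."""

    def __init__(self, slots):
        self.slots = dict(slots)

    def lookup(self, name):
        # First try the object itself.
        if name in self.slots:
            return self.slots[name]
        # Not found locally: delegate to the objects stored in parent slots.
        for slot_name, parent in self.slots.items():
            if slot_name.endswith("*"):
                try:
                    return parent.lookup(name)
                except LookupError:
                    continue
        raise LookupError("message not understood: " + name)


polite = SelfObject({"greet": "Good day"})
casual = SelfObject({"greet": "Hi"})
obj = SelfObject({"p*": polite})

print(obj.lookup("greet"))   # Good day (found in the parent)
obj.slots["p*"] = casual     # a parent slot is just a slot, so it can be replaced
print(obj.lookup("greet"))   # Hi (the same message now resolves elsewhere)
```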

Parameters

The slot definition section | ... | can also contain the parameters of a method. They are marked by a colon in front of their name. For example, the object:

(|
    x:Y: = (| :a. :b. | a printLine)
|)

contains one slot x:Y:, which points to a code object accepting the parameters a and b.

An alternative syntax for the same object is:

(|
    x: a Y: b = (a printLine)
|)

Messages

There are three standard types of messages:

Unary messages have no parameters. Binary messages have exactly one parameter and are used for operators. Keyword messages can have any number of parameters. Unlike in Smalltalk, the second and all following keywords of a multi-parameter keyword message start with a capital letter. This makes it clear where the message ends.

The message a first causes the interpreter to look for a slot called first in the object a. If the slot contains data, the data is returned. If it contains an object with code, the code is evaluated and the value of the last expression is returned.

The message a > 1 results in a lookup of the slot >, whose code object is given the parameter 1. A slot of this type can contain only code objects, because it always has to accept one parameter.

The message x set: a And: b causes a lookup of the slot set:And: in the object x. The resolved code object is given the parameters a and b.

Primitive messages

The fourth type of message is the primitive call, distinguished from the other types by the _ at the beginning of the message name:

_print

_set: s And: b

This type of message is used only for calling primitive methods of the interpreter, that is, the parts of the interpreter implemented in C++.

The whole Self as a programming environment can be understood as a language built on axioms defined by primitives.

Self

Self as a language is called "Self" because, unlike in Smalltalk, you don't have to write the keyword self in front of every message sent to the object itself.

If you want to call the method print from another method of the same object, you can use:

self print

However, you can omit self and use just:

print

This is a rather interesting feature that is worth thinking about. Each identifier you type that is not found immediately in the method's namespace is delegated to the object itself and then to all of its parent slots. What actually is a local namespace, when an object implicitly sends everything to itself?

The message a first from the previous chapter can in fact be rewritten as self a first, or as (self a) first.

Blocks

Blocks are similar to objects, except for three differences:

  1. They usually don't belong to the place where they are defined, and are evaluated somewhere else.
  2. They act as if they contained a parent slot pointing to the namespace where they were defined.
  3. They automatically contain parent* pointing to traits block.

Thanks to these features, they work like the closures known from other programming languages.

[] is an empty block.

Like objects, blocks can contain slots: [| a <- 1. |] will create a block with a rewritable slot a set to the value 1.

Blocks can also have parameters: [| :a | a printLine] and act as code objects. To call a block, you can send a message to it: value, or value: if the block expects one parameter, or value: .. With: .. With: .. if it expects multiple parameters.

Blocks are used to implement all control structures used in Self: if conditions, loops and so on.

For example, an if condition is just a keyword message ifTrue:, or ifTrue:False:, sent to a boolean object, which expects a block as a parameter:

(| :a. :b. |
    (a > b) ifTrue: [^a] False: [^b].
)

Here you can see a code object with two parameters a and b, which are compared to each other by sending the binary message > b to the object a, and then sending the message ifTrue:False: to the result.

The caret character ^ is used for return. If used in a block, it returns a value not only from the block itself, but also from the surrounding object. In this case, the value is returned from the entire code object, not just from the block (this can be done using return:). So if a is greater than b, a itself is returned from the whole code object.

Generally, a code object or block returns either the value passed to ^, or the value of its last message.
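Python has no direct counterpart of returning from the enclosing method from inside a block, but the effect can be imitated with an exception. The following is a sketch under that assumption. The names NonLocalReturn, caret, if_true_false and maximum are hypothetical helpers written for this example.

```python
class NonLocalReturn(Exception):
    """Carries the value of `^expr` out of a block to the enclosing method."""

    def __init__(self, value):
        super().__init__()
        self.value = value


def caret(value):
    # Stands in for `^value` written inside a block.
    raise NonLocalReturn(value)


def if_true_false(condition, true_block, false_block):
    # Stands in for the `ifTrue:False:` message sent to a boolean.
    return true_block() if condition else false_block()


def maximum(a, b):
    try:
        if_true_false(a > b, lambda: caret(a), lambda: caret(b))
        return "not reached"   # skipped: the block returned from the whole method
    except NonLocalReturn as ret:
        return ret.value


print(maximum(3, 7))   # 7
print(maximum(9, 2))   # 9
```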

(|
    parent* = traits boolean.
    a = (true ifTrue: [1])
|)

Here we see an object definition that contains a parent slot pointing to the traits boolean object. This makes the message true accessible.

The method stored in the a slot then sends the true message to self, which returns the content of the true slot defined in the parent chain of traits boolean. A keyword message with a block parameter, whose code consists only of the object 1, is then sent to this object.

Because the object 1 is the last value in the block, it is returned. The result of the ifTrue: message is also returned, becomes the last value in the method stored in the a slot, and is thus returned as well.

The result of the a message is therefore the object 1, even though no explicit return is used.

Delegation

As I already mentioned, Self uses something that works like inheritance, but it is not the typical inheritance known from mainstream languages. A better description would be "delegation of messages that the object can't resolve itself to the objects stored in its parent slots."

An object can have multiple parent slots whose objects contain a slot with the same name. This leads us to so-called resends, which let us specify which parent should be used for slot resolution. Syntactically, a resend is written as <parent name>.<message name>, for example parent.message.

If the following code

(|
  firstParent* = traits something.
  secondParent* = traits different.
|

   copy.
)

contained copy slots in both parents, it would be necessary to choose a specific target by rewriting the message send as a resend: secondParent.copy.
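The closest Python counterpart of a directed resend is calling the method through an explicitly named base class. This is only a loose analogy: Python classes are not parent slots, and Python silently picks the first match in its method resolution order instead of treating the message as ambiguous. The class names below merely mirror the example above.

```python
class Something:
    def copy(self):
        return "copy from Something"


class Different:
    def copy(self):
        return "copy from Different"


class Child(Something, Different):
    def copy_from_second(self):
        # Rough counterpart of `secondParent.copy`: the parent is named explicitly.
        return Different.copy(self)


child = Child()
print(child.copy())               # copy from Something (first base class wins in Python)
print(child.copy_from_second())   # copy from Different
```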

Delegation is a fairly interesting concept that allows both single and multiple inheritance. In addition, it allows things that classic languages can't offer, such as changing a parent slot at runtime, which effectively switches the object to which unresolved messages are delegated.

This design feature looks somewhat wild at first glance, but it has quite interesting use cases, for example in parsers, where it allows you to change the context easily.

Comments

Comments are written in quotes:

"this is a comment"

Annotations

Annotations provide a way of adding metadata to objects, but there is surprisingly little documentation on how they work and can be used, even though they are used across the entire standard image.

The syntax uses curly braces:

(|
  p* = traits clonable.
  {'Category: accessing'
    slot = nil.
  }
|)

This example tells the graphical interface to show the slot in the accessing category.

Personally, I find annotations somewhat confusing syntactically, and I have managed to cause all sorts of errors by creating annotations with various random labels.

Still, annotations are used everywhere, especially for the Transporter, which uses them to indicate which object belongs to which module, when it was last updated, and so on.

Annotations are not visible in the object itself when viewed in the Outliner. You need to use a mirror (see the next chapter).

From the stdlib point of view

The trap of simple languages whose syntax fits on a postcard lies in the complexity of the stdlib. Self is no exception. I won't show you the whole stdlib, just the most important parts. Curious readers can look up the specific details in Self itself.

Conditions

As is customary in Smalltalk-like languages, if conditions are implemented as messages sent to a boolean object. The messages ifTrue: and ifFalse: are available, as well as their equivalents with an "else" branch, ifTrue:False: and ifFalse:True:.
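The idea that a condition is just a message sent to a boolean object can be sketched in Python with two small classes. TrueObject, FalseObject and greater_than are hypothetical names for this illustration, and lambdas stand in for Self blocks.

```python
class TrueObject:
    def if_true(self, block):
        # The true object evaluates the block it is given.
        return block()

    def if_true_false(self, true_block, false_block):
        return true_block()


class FalseObject:
    def if_true(self, block):
        # The false object ignores the block.
        return None

    def if_true_false(self, true_block, false_block):
        return false_block()


def greater_than(a, b):
    # A comparison answers one of the two boolean objects.
    return TrueObject() if a > b else FalseObject()


print(greater_than(3, 2).if_true_false(lambda: "bigger", lambda: "not bigger"))  # bigger
print(greater_than(1, 2).if_true_false(lambda: "bigger", lambda: "not bigger"))  # not bigger
```

No if statement is needed at the call site. Which branch runs is decided purely by which object receives the message.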

Loops

Like conditions, loops are implemented as messages sent either to collections or to block objects.

The basic message is loop, sent to a block. It repeats the body of the block unconditionally.

Additionally, conditional loops such as whileTrue: are available, together with the related messages whileFalse:, untilTrue:, untilFalse:, loopExit and loopExitValue.

Other loops are the do: messages sent to numeric types, with the alternative forms to:Do: and to:By:Do:, which iterate to: a given value By: some step, much like the range() function in Python.
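To make the comparison with range() concrete, here is a small Python sketch of a to:By:Do: style loop. The helper to_do is made up for this example. It assumes a positive step and assumes that the end value is included, as in Smalltalk, so the matching range() call needs end + 1.

```python
def to_do(start, end, block, by=1):
    """Toy counterpart of `start to: end By: by Do: block` (positive step only)."""
    i = start
    while i <= end:   # assumption: the end value itself is included
        block(i)
        i += by


collected = []
to_do(1, 10, collected.append, by=3)

print(collected)                    # [1, 4, 7, 10]
print(list(range(1, 10 + 1, 3)))    # [1, 4, 7, 10]
```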

There are also various transformers and iterators for collections that work as loops. For example, you can send the messages mapBy:, mapBy:Into:, gather:, filterBy: and so on.

I can only recommend looking at the specific collection for the list of all available messages. There are dozens of messages of this kind, covering everything from filtering to searching, sorting, mapping, transforming, counting, and so on. There are definitely more messages than there are methods in Python.

Just for the sake of curiosity, here is a list of messages that a list collection can respond to:

<= x
> x
>= x
areKeysOrdered
compare: x IfLess: lb Equal: eb Greater: gb
copy
keyedStoreStringIfFail: fb
max: x
min: x

Accessing

at: k
at: i IfAbsent: b
first
first: v
firstIfAbsent: noneBlk
isEmpty
last
firstLinkFor: elem IfPresent: presentBlock IfAbsent: absentBlock
firstLinkSatisfying: conditionBlock IfPresent: presentBlock IfAbsent: absentBlock
ifNone: noneBlock
ifNone: noneBlock IfOne: oneBlock IfMany: manyBlock
keys
last: v
soleElement

Adding

add: elem
add: v WithKey: k
addAll: c
addAllFirst: c
addFirst: elem
addLast: elem

Coercing

asList

Comparing

< c
= c
compare: c IfLess: lb Equal: eb Greater: gb
hash
isPrefixOf: c
isSuffixOf: c
equalsCollection: c

Concatenating

, c

Copying

copy
copyContaining: c
copyRemoveAll

Double dispatch from universalSetOrDictionary

unsafe_with: c1 Do: b FirstKey: firstK1 FirstValue: firstV1

Inserting

insert: x AfterElementSatisfying: blk IfAbsent: aBlk
insert: x BeforeElementSatisfying: blk IfAbsent: aBlk
insertAll: x AfterElementSatisfying: blk IfAbsent: aBlk
insertAll: x BeforeElementSatisfying: blk IfAbsent: aBlk

Iterating

do: b
doFirst: f Middle: m Last: lst IfEmpty: mt
reverseDo: b
with: x Do: b
with: x ReverseDo: b
withNonindexable: c Do: b

IteratingWithEnds

do: elementBlk SeparatedBy: inBetweenBlk
doFirst: f Middle: m Last: lst
doFirst: f Middle: m Last: lst IfEmpty: e
doFirst: f MiddleLast: ml
doFirst: f MiddleLast: ml IfEmpty: e
doFirstLast: f Middle: ml
doFirstLast: f Middle: ml IfEmpty: e
doFirstMiddle: fm Last: lst
doFirstMiddle: fm Last: lst IfEmpty: e

Printing

collectionName
comment1
printStringSize: smax Depth: dmax
statePrintString
storeStringForUnkeyedCollectorIfFail: fb
storeStringIfFail: fb
storeStringNeeds
unkeyedStoreStringIfFail: fb
buildStringWith: block
continued
defaultPrintSize
leftBracket
minContentsSize
minElSize
printStringKey: k
rightBracket
separator
statePrintStringOfElements
statePrintStringOfSize

Reducing

countHowMany: testBlock
dotProduct: aCollection
harmonicMean
max
mean
median
min
percentile: nth
product
reduceWith: b
reduceWith: b IfSingleton: sb
reduceWith: b IfSingleton: sb IfEmpty: mt
rootMeanSquare
standardDeviation
sum

Removing

remove: x
remove: elem IfAbsent: block
removeAll
removeAll: aCollection
removeFirstIfAbsent: ab
removeLast
removeLastIfAbsent: ab

Searching

allSatisfy: b
anySatisfy: b
findFirst: eb IfPresent: fb
findFirst: eb IfPresent: fb IfAbsent: fail
includes: v
keyOf: elem
keyOf: elem IfAbsent: ab
noneSatisfy: b
occurrencesOf: v
occurrencesOfEachElement

setLikeOperations

includesAll: c
intersect: c
difference: c

Sizing

isEmpty
nonEmpty
size

Sorting

ascendingOrder
comment2
copySort
copySortBy: cmp
copySortBySelector: sel
isAlreadyKnownToBeSortedBy: cmp
sortedBy: cmp Do: b
sortedDo: b

Testing

isOrdered

Transforming

asByteVector
asDictionary
asList
asOrderedSet
asSequence
asSet
asString
asTreeBag
asTreeSet
asVMByteVector
asVector
copyFilteredBy: eb
copyMappedBy: eb
filterBy: filterBlock
filterBy: eb Into: c
gather: aBlock
gather: aBlock Into: aCollection
mapBy: eb
mapBy: eb Into: c

That's quite rich, isn't it?

Data structures

Data structures are built from layers of the traits hierarchy.

All collections are based on key-value pairs. Even lists are key-value based, with the individual elements used both as keys and as values.

Self offers a rather rich variety of sets, dictionaries and trees.

Trees differ from dictionaries in that they use unbalanced binary trees, which can degenerate and perform poorly if you don't know what you are doing.

There is also a variety of lists, vectors, strings and queues.

The most important messages supported by virtually all collections include:

Important messages

Message      Description
at:          Get the item at a position / key.
at:Put:      Put an item at a position / key.
add:         Add an item (to the end, if the collection is ordered).
addAll:      Add all items from the given collection.
do: [ .. ]   Run the block for each item.

Example

If you want to use a collection, just type its name in the shell or source code editor and clone it. This can be done with the clone or copy message (they do the same thing for most objects).

WARNING: In prototype-based systems it is really important to clone the collection first. Other objects use the same prototype, and if you don't clone it, you'll change the prototype for every other piece of code that works with the collection. This usually leads to a quick crash.
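The same trap can be reproduced in Python with a shared object standing in for the prototype. This is a sketch of the general idea only. The names prototype_dictionary, careful and careless are invented for the example.

```python
import copy

# Stands in for the shared prototype that everybody reaches by name.
prototype_dictionary = {}


def careful():
    d = copy.copy(prototype_dictionary)   # clone first, then modify the clone
    d["key"] = "value"
    return d


def careless():
    d = prototype_dictionary              # no clone: this *is* the prototype
    d["key"] = "value"
    return d


careful()
after_careful = dict(prototype_dictionary)
careless()
after_careless = dict(prototype_dictionary)

print(after_careful)    # {}
print(after_careless)   # {'key': 'value'}
```

After the careless call, every other piece of code that starts from the prototype sees the stray key.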

Now click on Get it.

Put the outliner somewhere on the desktop and "unpack" it using small arrow on the left:

You can see that the collection there contains zero elements (size 0). Open a shell in it and try to add something to it:

Now choose Do it. Alternatively, you can choose Get it, which gives you an outliner for the result of the message instead of just evaluating the code.

As you can see, the value has changed. Now you can look at what the object returns in response to the values message.

Here is the outliner for the result. You can put it somewhere on the desktop:

and unpack it again to see what is inside:

As you can see, the vector contains 'value' string at index 1. Keep in mind that the dictionary is unsorted.

Don't worry about the strange appearance of the Shell in the top left corner, I have broken graphics drivers and Self uses such prehistoric X bindings, that it lags and repaints it strangely. It works on my laptop as it should.

Here's an example of how to use do: to print elements and keys in the console:

do: (like all iterators in Self) expects a block that can take two optional parameters, a value and a key, named v and k in the example. Note the somewhat strange order; one would logically expect the opposite (the key and then the value).

Collector

Collector is a special data structure that responds to the binary message &. Basically, it exists because Self does not have a literal for creating lists. To create a list, the easiest way is to use a collector:

(1 & 2 & 3) asList

A collector is neither a list nor a dictionary. It can be converted to all sorts of data structures by messages of the form as<Something>, for example asList.
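The mechanism can be imitated in Python by overloading the & operator. This is a sketch of the idea, not of Self's actual collector. The Collector class and its as_list and as_set methods are hypothetical, and because Python integers already define & as bitwise AND, the first element has to be wrapped explicitly.

```python
class Collector:
    """Toy collector: gathers items with `&` and converts them on request."""

    def __init__(self, *items):
        self.items = items

    def __and__(self, other):
        # Each `&` answers a new collector holding one more item.
        return Collector(*self.items, other)

    def as_list(self):
        return list(self.items)

    def as_set(self):
        return set(self.items)


collected = Collector(1) & 2 & 3

print(collected.as_list())   # [1, 2, 3]
print(collected.as_set())    # {1, 2, 3}
```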

Exceptions

Exceptions are not supported. Messages that might end up with an error usually offer an alternative in the form of a keyword message with the parameter IfFail:. You can see the example here with an object for accessing the operating system:

It is up to the programmer to handle the error (by passing a block with the error-handling code). If they don't, the debugger should appear, or the program will crash.

Likewise, if you write a library and want to allow error handling, you need to add a variant of the message with an IfFail: parameter.
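The convention translates to Python as an optional callback parameter. The function integer_divide and its if_fail parameter are made up to illustrate the pattern and do not correspond to any particular Self message.

```python
def integer_divide(a, b, if_fail=None):
    """Toy counterpart of a message that also offers an `IfFail:` variant."""
    if b == 0:
        if if_fail is None:
            # No handler was passed: in Self this is where the debugger would appear.
            raise RuntimeError("division by zero")
        # The handler block receives the error and its value becomes the result.
        return if_fail("division by zero")
    return a // b


print(integer_divide(7, 2))                               # 3
print(integer_divide(7, 0, if_fail=lambda error: 0))      # 0
print(integer_divide(7, 0, if_fail=lambda error: error))  # division by zero
```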

In my opinion this is not the best solution, but unfortunately that's how it works.

Object model

Note: A rather interesting discussion on the topic of Self's object model can be found here: https://news.ycombinator.com/item?id=14409088

As I explained, Self uses an object model based on prototypes. This can be summarized as saying that you copy new objects with a clone or copy message, or you create an empty object directly from the source code object literal with a reference to parent*, which provides a functionality similar to an inheritance.

As for the hierarchy of different objects, they are divided between traits and mixins.

Traits

These are "ancestors", that is, objects containing shared functionality, that are often fully functional by themselves. These objects are created in order to be shared by other objects via the parent* slots.

Self has a relatively rich trait hierarchy, as can be seen in the previous chapters.

Mixins

Mixins are small clusters of shared functionality, typically without a parent* slot, and shared only at some specific level of the object hierarchy. The purpose is to provide a functionality that is mixed into an object. Their equivalent is something like an interface with partial implementation.

Reflection with Mirrors

Mirrors are a specialty of Self, which I haven't seen in any other programming language.

Mainstream programming languages usually implement reflection through various internal properties. For example, Python uses .__class__, .__dict__ or .__name__ to access internal object information.

Self uses mirrors. You can create a mirror by sending a reflect: message to an object with traits clonable in its parent hierarchy.

This will return an object that is mirroring the object given as a parameter.

You can see that it contains a slot pointing to the original object.

If you open the parent slot, you can also see all kinds of messages it can react to:

Note that I jump on the desktop to the left and right using the WIN + arrow keys shortcut. I just jumped halfway to the right.

In traits mirrors slots parent we can see that there isn't much, so we'll look at its parent:

You can see a rich list of categories that allow you to do all kinds of things with a mirror:

You can for example list all messages that the object can react to:

Which, as you can see, is an empty set:

For clarification: the object we created the mirror for contains no messages, only two slots. In this context, only slots that contain code objects count as "messages".
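The idea of a mirror as a separate object that does the reflecting can be sketched in Python. The Mirror class, the reflect function and the method names slot_names and message_names are hypothetical and are not the names used by Self's mirrors. SimpleNamespace stands in for an object with two data slots.

```python
from types import SimpleNamespace


class Mirror:
    """Toy mirror: all reflection goes through this object, not through the reflectee."""

    def __init__(self, reflectee):
        # The mirror keeps a reference to the object it reflects.
        self.reflectee = reflectee

    def slot_names(self):
        return sorted(vars(self.reflectee))

    def message_names(self):
        # Only slots that hold code count as messages here.
        return sorted(name for name, value in vars(self.reflectee).items()
                      if callable(value))


def reflect(obj):
    return Mirror(obj)


obj = SimpleNamespace(a=None, b=None)
mirror = reflect(obj)

print(mirror.slot_names())      # ['a', 'b']
print(mirror.message_names())   # []
```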

The slots themselves can be explored with messages from the slotAccess category:

You can see for example the result of the firstKey message:

The result was really the name of the slot "a". You can also resolve the whole slot:

The beauty of mirrors lies in the ability to disable them. To do that, you just have to remove the proper parent from the code where you want to disable the reflect: message, for example by overwriting it with an object returning nil. This makes it possible to run code relatively safely with functions like eval (of course, you also have to remove access to syscalls and the filesystem).

Tips & tricks

Over time, I collected useful notes, tips and tricks in my personal wiki, that can make programming in Self more pleasant.

copy message support

To support the copy message, an object must inherit some basic functionality, which you can find in the traits clonable.

Display parent slots in outliner

It's somewhat unfortunate to open up parents* in the outliner all the time just to see what's available for inherited slots.

Fortunately, the Outliner can be easily configured to display inherited slots by setting:

preferences outliner kevooidal: true

Open all subcategories

Sometimes it is very annoying to click on the black arrows when you have multiple nested categories. Double-clicking on the arrow opens all its subcategories.

Quit by sending a message

The environment can be terminated by the saveThenQuit or quitNoSave messages. Personally, I like to put a button on the desktop that calls one of these messages when pressed.

Build new image

If you don't want to use the default image provided with the Self distribution and want to build your own, you can do so from the objects/ directory in the project's source code repository (it has to be done from this directory!) with the following command:

Self -f worldBuilder.self -o morphic

-o does not specify the output file name (!) but the overclock. morphic parameter tells the worldBuilder that it should include the graphical interface.

When the script finishes, type into the console

desktop open

This will open the graphical interface and you can then save the image by using context menu.

Find slot

From any outliner:

This will give you an object, that can be used to look for a slot by its name:

Input bar at the bottom specifies the root of the search, the top one is used to specify what you want. Basic wildcards using * are supported.

Input bars are confirmed by clicking on the green square, or by pressing CTRL + Enter. Then you have to click on the top arrow to start the search:

Individual slots can be opened by clicking the square next to them.

You can also invoke all kinds of actions from the context menu.

Read serialized objects from script

If you want to run one of the scripts or load your saved module, you can use the following approaches:

bootstrap read: 'name' From: 'directory'

Note that the name doesn't end with .self. Or:

'path/to/file.self' runScript

You can find more details here: Reading a module.

radarView

As I already mentioned in the previous episode, you can get the radarView by using the following messages:

desktop worlds first addMorph: radarView

or

desktop w hands first addMorph: radarView

Better fonts

You may have noticed that Self's default font looks really awful. Self originally looks for Verdana, which is not installed on Linux by default, so it uses a fallback. The solution is described here:

Editor morph

By experimenting I found out that if you want to use the editor, you need an editorRowMorph containing an editorMorph. It then responds to the contentsString message.

I didn't find out how to use ui2_textField, ui2_textBuffer, textViewerMorph and uglyTextEditorMorph.

Next episode

The next episode, Environment and the programming language Self (part three; debugger, transporter and problems), is about the debugger and the transporter, and it also goes into detail on some of the problems with Self as a language and as an environment.
