Here, as the first writing of this new website, I put on display a language I've created, as perhaps the first of many. Homoiconicity easily enables metaprogramming and so I was inclined to add it to an already existing language, Brainfuck, and observe the result, in a similar vein to Brainfuck's derivation from P''. I call this result ``Masturbation'', from the combination of ``homo'' and ``fuck''.
As a detour of sorts, the other main reason for the creation of this language was to take one designed for ease of compilation, Brainfuck, and make it extremely difficult or impossible to compile satisfactorily. I'm of the general opinion that a language designed to be compiled easily is good for automation, but lacks many interesting qualities its not-so-easily-compiled brethren enjoy. A language that bars programs from modifying themselves during operation is boring and weak. I find it more interesting when a program can't be understood well or at all by another program. Programs that can be understood incredibly well tend to be nothing more than simple automation of a particular task.
Masturbation adds a single meaningful letter to the language, ``='', which is used to control the homoiconic qualities. For completeness, the semantics of all Masturbation letters are given:
+ Add one to the current cell.
- Subtract one from the current cell.
< Subtract one from the cell index, modulo the size of the data array, so that the index wraps around.
> Add one to the cell index, modulo the size of the data array, so that the index wraps around.
. Print the contents of the current cell, preferably as ASCII.
, Set the value of the current cell to the character code corresponding to the next input character.
[ If the value of the current cell is zero, resume execution at the letter following the corresponding ].
] Resume execution at the corresponding [.
= If the value of the current cell is zero, the instruction array overwrites the data array, up to the length of the instruction array, and execution resumes at the next letter; otherwise, the data array overwrites the instruction array, up to the length thereof, and execution resumes at the new first letter.
The data array is, by default, 30,000 cells long. A cell holds one of 256 discrete values, with arithmetic performed modulo 256. The instruction array containing the program is as long as the program is in letters; the typical strategy is to truncate it to the size of the data array if it would exceed that.
=text0[>.] Print text, which contains no commands and is terminated by the null character (represented as 0 here).
More programs will be added throughout the life of this document.
=[.>] This is a quine.
It's now obvious that a prime advantage of Masturbation over Brainfuck is the ability to easily control the contents of memory. Note that this adds strings to the language as a consequence, rather than by specific fiat; any data structure can now be literally specified and added to memory easily.
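To make the semantics and the two programs above concrete, here is a minimal interpreter sketch in Python. It is only an illustration under stated assumptions, not a transcription of any existing implementation: the names run, stdin, size and max_steps are hypothetical, end of input is assumed to read as zero, an unmatched bracket is assumed to halt the program, and the step limit exists only so that a runaway program stops.

```python
def run(program, stdin="", size=30000, max_steps=100_000):
    """Interpret `program` and return everything it printed as a string."""
    # Instruction array: one character code per letter, truncated to the
    # size of the data array if the program is longer.
    code = [ord(c) % 256 for c in program[:size]]
    data = [0] * size
    ptr = pc = steps = 0
    inp = iter(stdin)
    out = []

    def match(pos, step):
        # Scan from the bracket at `pos` in direction `step` (+1 or -1)
        # until the nesting depth returns to zero.
        depth = 0
        while 0 <= pos < len(code):
            if code[pos] == ord("["):
                depth += 1
            elif code[pos] == ord("]"):
                depth -= 1
            if depth == 0:
                return pos
            pos += step
        return len(code)  # assumption: an unmatched bracket halts

    while pc < len(code) and steps < max_steps:
        steps += 1
        op = chr(code[pc])
        if op == "+":
            data[ptr] = (data[ptr] + 1) % 256
        elif op == "-":
            data[ptr] = (data[ptr] - 1) % 256
        elif op == ">":
            ptr = (ptr + 1) % size
        elif op == "<":
            ptr = (ptr - 1) % size
        elif op == ".":
            out.append(chr(data[ptr]))
        elif op == ",":
            # assumption: reading past the end of input stores zero
            data[ptr] = ord(next(inp, "\0")) % 256
        elif op == "[":
            if data[ptr] == 0:
                pc = match(pc, +1)      # the increment below steps past ']'
        elif op == "]":
            pc = match(pc, -1) - 1      # the increment below lands on '['
        elif op == "=":
            n = len(code)
            if data[ptr] == 0:
                data[:n] = code         # instructions overwrite data
            else:
                code[:] = data[:n]      # data overwrites instructions
                pc = -1                 # resume at the new first letter
        pc += 1                         # any other letter is a no-op
    return "".join(out)
```

With these assumptions, run("=[.>]") returns the program's own text, and run(",=", stdin=".") returns a single period, because the letter read from input becomes the new first instruction.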
These first two programs were written before the language was specified; the original idea to add homoiconicity was a letter, ``^'', which would jump between the code and data arrays and begin executing there. Complications, such as communication between these two arrays, became apparent and so the copying strategy was adopted instead. The data array wraps around to enable easy storage of information between overwrites, assuming the instruction array is shorter than the data array.
An APL implementation with single-stepping is currently the only implementation.
Although the language was designed to make compilation difficult, strategies have been derived, and the discussion of these and of general implementation strategies will consume the remainder of this article.
Basic implementation strategies, sans interpretation:
An implementation may compile the instruction array into a compiled array, keeping the instruction array for homoiconic purposes, and then execute the compiled array. This approach is particularly suited to languages that allow for ease of dynamic program generation, such as a machine code or a Lisp. One advantage is that brackets can be handled with a single stack, with ``['' pushing a location and ``]'' popping and resolving both jumps. This is the chosen strategy for a future machine code implementation, as it's easy to implement at the cost of memory.
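The single-stack bracket handling can be sketched as follows, in Python rather than machine code or Lisp. This is an illustration only: compile_letters and the "halt" entry are hypothetical names, the compiled array here is just a list of (letter, target) pairs rather than real generated code, and halting on an unbalanced bracket is an assumption.

```python
def compile_letters(letters):
    """Translate letters into (op, target) pairs; only brackets use target."""
    compiled = []
    stack = []  # positions of '[' still waiting for their ']'
    for pos, letter in enumerate(letters):
        if letter == "[":
            stack.append(pos)
            compiled.append(["[", None])     # target is patched at the ']'
        elif letter == "]" and stack:
            start = stack.pop()
            compiled[start][1] = pos + 1     # '[' on zero: continue after ']'
            compiled.append(["]", start])    # ']': go back and re-test at '['
        elif letter == "]":
            compiled.append(["halt", None])  # assumption: a stray ']' halts
        else:
            compiled.append([letter, None])  # other letters keep their slot
    for start in stack:
        compiled[start][1] = len(letters)    # assumption: unclosed '[' exits
    return [tuple(entry) for entry in compiled]
```

Every letter keeps its position in the compiled array, so positions in the two arrays line up. An executor built on this would presumably call compile_letters again whenever ``='' replaces the instruction array.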
An implementation may interpret the instruction array and, upon encountering a bracket whose jump won't be taken yet, push its location onto the corresponding stack, keeping one stack for ``['' and one for ``]''. A ``['' whose jump must be taken before it has been used at least once will still need to locate the corresponding ``]''. This is the chosen strategy for a potential Forth implementation and, in comparison to the earlier strategy, is suited to languages that make dynamic code generation more difficult.
A particularly interesting strategy is to attempt to compile every possible instruction array beforehand.
The current program may be checked for a ``=''. If there isn't one, then the program can be treated as normal Brainfuck.
Each ``='' can be followed speculatively, providing what could be called ``futures'', based on the determined state of the data array at the time.
The number of ``='' in a program doesn't make a particular difference; what matters is whether any single one can be evaluated, which path it would take (that is, which array replaces the other), and the state of the data array at that point. This provides a few possible futures, one of which is the instruction array with a data array containing its contents. Of particular interest are futures dependent on user input.
A future without a ``='' could be called a ``dead end'' and optimization would cease there for that particular path.
Conditional or infinite occurrences of ``='' can be fought with a limit of 100 or so futures, at which point optimization ceases, though any number will do. I'm fairly certain (but will provide no rigorous proof) that determining whether there is a finite number of futures is equivalent to the halting problem, and so optimization should end at a certain number of them.
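The dead-end check and the limit on futures can be sketched in Python as below. This is a much narrower thing than the speculation described above: it compiles and optimizes nothing, and only records the successive instruction arrays reached along one concrete run that takes no input. The names futures, limit and max_steps are hypothetical. Stopping at ``,'' instead of branching on every possible input is an assumption, as are halting on an unmatched bracket and the step cap.

```python
def futures(program, limit=100, size=30000, max_steps=100_000):
    """List the instruction arrays (as bytes) reached by an input-free run."""
    code = [ord(c) % 256 for c in program[:size]]
    data = [0] * size
    ptr = pc = steps = 0
    found = [bytes(code)]  # the starting program counts as the first future

    def match(pos, step):
        # Find the bracket matching the one at `pos`, scanning in `step`.
        depth = 0
        while 0 <= pos < len(code):
            if code[pos] == ord("["):
                depth += 1
            elif code[pos] == ord("]"):
                depth -= 1
            if depth == 0:
                return pos
            pos += step
        return len(code)  # assumption: an unmatched bracket halts

    while pc < len(code) and steps < max_steps:
        if ord("=") not in code:
            break  # dead end: plain Brainfuck from here on
        steps += 1
        op = chr(code[pc])
        if op == ",":
            break  # this future depends on user input; not followed here
        elif op == "+":
            data[ptr] = (data[ptr] + 1) % 256
        elif op == "-":
            data[ptr] = (data[ptr] - 1) % 256
        elif op == ">":
            ptr = (ptr + 1) % size
        elif op == "<":
            ptr = (ptr - 1) % size
        elif op == "[":
            if data[ptr] == 0:
                pc = match(pc, +1)
        elif op == "]":
            pc = match(pc, -1) - 1
        elif op == "=":
            n = len(code)
            if data[ptr] == 0:
                data[:n] = code
            else:
                code[:] = data[:n]
                found.append(bytes(code))  # a new instruction array
                if len(found) >= limit:
                    break  # too many futures; give up here
                pc = -1
        pc += 1  # output and all other letters are ignored
    return found
```

For instance, futures("=+=") yields three arrays and stops at the third because it contains ``,'', while futures("=>=", limit=5) keeps replacing the instruction array with itself and is only stopped by the limit.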
Do keep in mind that reasoning about this language or its programs is mental Masturbation.