Musings on machine code have led to the article you're reading and those that will follow it. These musings have also led to the development of a tool designed for effectively manipulating some of the resulting machine codes and others.
The first that will be explained inspired the development of the tool. This machine code is named 33554432, as it had a twenty five bit unit and two to the twenty-fifth power is 33554432. As the story goes, it was 2015 and I was in a car in a parking lot, considering ways a machine code program could modify itself, and I sought a machine code that made this far easier. Among the more interesting metamodifications I was considering was having a loop disable its own instructions, so that the loop could contain its initialization code without the loss of efficiency, as it would disable that code afterwards instead of requiring some manner of testing to avoid repeated initializations. The solution that came to me was what I now deem a ``nothing'' bit; this bit, if one or true, causes the instruction to be ignored and is the first bit of an instruction.
Minutes later I thought that using n bit instructions, with the machine having a total of n instructions, and so having each instruction be composed of a series of toggles, would perhaps most easily allow for programs that modify themselves. After careful thought, I've decided that the most fundamental instruction on a machine is changing a value from one to another; that is, on a decimal machine, changing a digit from one of ten values to another, whereas, on a binary machine, this is toggling a bit; it's noteworthy that, in the binary system, changing a value and toggling a bit are the same thing, which obscured this idea for a while. This reinforced the idea that so simple a modification is a reasonable means for self-modification.
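The nothing bit and the loop that disables its own initialization can be sketched in a few lines of Python. This is only an illustration under assumptions: the names `NOTHING`, `run_loop`, `initialize`, and `body` are hypothetical, the nothing bit is taken to be the most significant bit of the twenty five bit word, and Python functions stand in for whatever the remaining bits would encode.

```python
# Hypothetical model: a 25-bit word whose first (most significant) bit is
# the "nothing" bit. The other 24 bits are not modelled; a Python function
# stands in for the effect of the instruction.
WORD_BITS = 25
NOTHING = 1 << (WORD_BITS - 1)


def run_loop(program, iterations):
    """Run every entry of `program` in order, `iterations` times.

    Each entry is a list [word, action]. An entry whose nothing bit is set
    is ignored. An action may modify `program`, which is how an instruction
    disables itself.
    """
    trace = []
    for _ in range(iterations):
        for index, (word, action) in enumerate(program):
            if word & NOTHING:
                continue  # the instruction is ignored
            action(program, index, trace)
    return trace


def initialize(program, index, trace):
    trace.append("init")
    program[index][0] ^= NOTHING  # toggle own nothing bit: never run again


def body(program, index, trace):
    trace.append("body")


program = [[0, initialize], [0, body]]
trace = run_loop(program, 3)
print(trace)  # ['init', 'body', 'body', 'body']
```

The initialization runs on the first pass only, and no pass after the first performs a test to decide whether to initialize; the only cost is the ignored word.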
Then the question of how to actually use this machine code arose. Thoughts of assemblers flooded my mind and I realized immediately how ugly it would be to fit this machine code to that model. I figured it would be even worse to use than an Intel machine code. I then combined this idea with my thinking of a better keyboard and found a model I figured would work. This paragraph and related considerations will be expanded upon with the release of the next article.
With all of this written, further fleshing out of, and experimentation with, 33554432 is now considered to be a failure in the general case. Questions of using powers of two as the unit, along with others, arose constantly. The general thought was to divide the instruction into fields, explained below, and have the instruction followed by two or three addresses to provide the data to operate on. It was decided early on to use bit-addressable memory, as it was thought this would more easily lead to variadic procedures being made, which was a secondary goal of the architecture itself. This variadic instruction format was seen as complicating the metamodification; removing it would mean more constraint, whereas forcing the extra address onto every instruction would mean more memory consumption. The idea of doubling the size of each address to fifty bits was also toyed with, as twenty five bits is a rather small address space. Ultimately, the instruction fields were left largely barren and it was later realized that the desired functions weren't at all suited to operating on the same data at once; for this reason, the idea is largely left behind, with the thinking that a far more specific machine may be able to still make use of the general idea, perhaps.
control: |nothing|meta   |extra  |follow |literal|
logic:   |       |       |       |       |       |
math:    |       |       |carry  |add    |       |
move:    |       |       |       |move   |       |
other:   |       |       |       |       |reverse|
It was intended that the machine would operate on data one bit at a time and the instructions would reflect this.
The nothing bit causes the instruction to be ignored. The meta bit causes all fields, sans the control field, to assume different meanings, meta meanings, and so use instructions that modify the following instructions; this field was never expanded upon, but ideas included providing double arguments, masking the next instruction and executing the two resulting instructions in sequence, and the use of looping meta-instructions for small loops without control transfers. The extra bit was to allow for the extra address, which then becomes the destination; when zero or false, the second address is the location of half of the data and the destination. The follow bit causes indirection on the arguments. The literal bit causes the arguments to be interpreted as the actual data; combinations with the follow bit were given thought, but never made concrete.
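As a rough illustration of the control field described above, the following Python sketch splits a word into its five fields and names the control bits. The bit order is an assumption: the fields are taken in the order of the table with the control field most significant, and the names `split_fields`, `decode_control`, and `address_count` are hypothetical.

```python
from typing import NamedTuple

FIELD_BITS = 5
FIELD_NAMES = ("control", "logic", "math", "move", "other")


class Control(NamedTuple):
    nothing: bool  # ignore the instruction
    meta: bool     # other fields take their meta meanings
    extra: bool    # a third address follows and is the destination
    follow: bool   # indirect on the arguments
    literal: bool  # arguments are the data themselves


def split_fields(word):
    """Split a 25-bit word into five 5-bit fields (first field assumed most significant)."""
    if not 0 <= word < 1 << (FIELD_BITS * len(FIELD_NAMES)):
        raise ValueError("word does not fit in twenty five bits")
    fields = {}
    for position, name in enumerate(FIELD_NAMES):
        shift = FIELD_BITS * (len(FIELD_NAMES) - 1 - position)
        fields[name] = (word >> shift) & 0b11111
    return fields


def decode_control(field):
    """Name the five control bits, the nothing bit being the first."""
    return Control(*(bool((field >> shift) & 1) for shift in range(4, -1, -1)))


def address_count(control):
    """Two addresses normally, three when the extra bit is set."""
    return 3 if control.extra else 2


fields = split_fields(0b00100_00000_00110_00010_00001)
control = decode_control(fields["control"])
print(control.extra, address_count(control))  # True 3
```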
The logic field was intended to specify five useful logical functions and also allow for the other combinations, resulting from combined application, to be derived. It was determined that this wasn't particularly different from using an index into a table, aside from the four duplicated functions.
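For comparison, the table approach mentioned above can be sketched as follows. This isn't the five bit logic field itself, whose assignments were never fixed; it's the ordinary four bit truth table index, which selects any of the sixteen functions of two bits. The name `apply_truth_table` and the constants are hypothetical.

```python
def apply_truth_table(table, a, b):
    """Return the result of the two-input function numbered `table`.

    Bit (2*a + b) of the 4-bit `table` is the output for inputs a and b.
    """
    return (table >> ((a << 1) | b)) & 1


# A few of the sixteen possible tables.
AND, OR, XOR, NAND = 0b1000, 0b1110, 0b0110, 0b0111

print([apply_truth_table(XOR, a, b) for a in (0, 1) for b in (0, 1)])  # [0, 1, 1, 0]
```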
The math field was intended to have addition and other arithmetical functions. Difficulties arose in finding suitably useful mathematical functions that could reasonably work incrementally on single bits. One result thereof was to fill the table with options for the functions that were suitably useful, such as a bit to recognize a carry.
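Addition is the obvious function that does work incrementally on single bits, provided a carry is recognized between steps. The sketch below is the standard full adder applied one bit at a time, not the encoding of the math field, which was never settled; `add_bit` and `add_serial` are hypothetical names.

```python
def add_bit(a, b, carry):
    """Full adder: add two bits and an incoming carry."""
    total = a ^ b ^ carry
    carry_out = (a & b) | (carry & (a ^ b))
    return total, carry_out


def add_serial(x, y, width):
    """Add two `width`-bit numbers one bit at a time, least significant first."""
    carry = 0
    result = 0
    for position in range(width):
        a = (x >> position) & 1
        b = (y >> position) & 1
        bit, carry = add_bit(a, b, carry)
        result |= bit << position
    return result, carry  # the final carry signals overflow


print(add_serial(5, 3, 4))   # (8, 0)
print(add_serial(15, 1, 4))  # (0, 1)
```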
The move field was the least well specified. It was intended that moves would logically occur in units of single bits, but this left little else for the field to do. It was also the thought of a move combined with the other fields that ultimately led to the discarding of this architecture for a general machine such as this.
The other field was unspecified, sans the idea of a ``reverse'' bit which would cause the instruction to execute from the end forwards, sans the control field, to be appropriately located at the end of the instruction.
A decimal machine in the same vein was considered, but not elaborated on for various reasons.
I find particularly specific machines to be alluring and so the thinking is that this general idea of instructions as a series of toggles may work well on a machine intended to have far fewer instructions and crafted for specific processing, such as image or sound data, and so lacking general instructions that ``step on the others' toes''. Phrased differently, the idea is thought to perhaps work with suitably disparate instructions.
The second machine code that will be explained is Meta-CHIP-8, a component of the prototypical tool release. This machine code is designed to manipulate the CHIP-8 address space within the context of the tool.
It was decided that every machine code targeted by the tool could have a corresponding and suited meta-machine code. This then creates a dual meaning, with ``Meta-Machine Code'' being both a component of the tool's metamodification facilities and the name of the tool, describing what one does with the tool. A meta-machine code instruction is larger than the target instruction; the original thinking was that it would be three or so most significant bits larger than an entire target instruction, leaving room for adding information to it; it was later thought this could be done by the programmer if desired, as the extra bits interfered gravely with the eight-bit-byte memory granularity of most targets.
The reason for the existence of a meta-machine code arose from the desire to simplify the tool. It was determined that most every feature provided by an assembler was not necessary to include within the system itself, and I found many of them, local labels in particular, to be aesthetically displeasing. The only facility that was to be left within the system was the ability to associate a name with a value and to associate names with memory addresses, the address being the value of the name. This then required a means to implement other features as desired. Thus meta-machine code was derived.
Various ideas of a Meta-CHIP-8 were passed through before arriving at the current version. All were touched by 33554432 and other thoughts on machine architecture in various ways. It was decided that instructions would be uniform and should attempt to use a CHIP-8 instruction format, so an attempt was made to use first form instructions, the form composed of an instruction code followed by a twelve bit address. This form proved limiting in several ways, as eight instructions were insufficient and all were also binary, making a single argument insufficient; the solution for the latter issue was to have the single argument point to argument pair forms in various ways, but all were too complicated and would also have required allocation tools to be designed. A forty bit instruction format was tested, but this was considered too limiting in the address space available. A thirty two bit format was decided upon as certainly being suitable.
The thirty two bit format was a modification of the forty bit format, mostly to save space. The most original instruction in the set, which was present with the original idea of a meta-machine code, is the disassembling instruction, which interprets memory as a target instruction, categorizes it with the type number, and organizes it in memory for easier modification. A corresponding function to compose the result of this disassembling instruction back into an instruction is also present. I relate the idea of this disassembling instruction with the nothing bit of 33554432; I'm inclined to believe that, perhaps, these are already rather optimal incarnations of the corresponding ideas, instructions as toggles and meta-machine code instructions designed around manipulating a target machine code; meta-machine codes have received far less musing and whatnot than 33554432 and so I'm inclined to believe more experimentation is warranted. Instructions that direct the environment itself are also present and a settlement between this and simply mapping the environment in the address space has been made.
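To make the disassembling instruction and its counterpart more concrete, here is a sketch in Python of splitting a sixteen bit CHIP-8 instruction into a type number and segments and composing it again. The type numbers, the grouping of CHIP-8 instructions into three forms, and the names `disassemble` and `assemble` are my assumptions for illustration; the layout Meta-CHIP-8 actually writes to memory isn't given here.

```python
# Hypothetical type numbers for three CHIP-8 instruction forms.
ADDRESS, REGISTER_BYTE, REGISTER_PAIR = 0, 1, 2

# Leading nibble of the instruction -> form (an assumed grouping).
FORM_BY_CODE = {code: ADDRESS for code in (0x0, 0x1, 0x2, 0xA, 0xB)}
FORM_BY_CODE.update({code: REGISTER_BYTE for code in (0x3, 0x4, 0x6, 0x7, 0xC, 0xE, 0xF)})
FORM_BY_CODE.update({code: REGISTER_PAIR for code in (0x5, 0x8, 0x9, 0xD)})


def disassemble(instruction):
    """Return [type, code, segments...] for a 16-bit CHIP-8 instruction."""
    code = (instruction >> 12) & 0xF
    form = FORM_BY_CODE[code]
    if form == ADDRESS:
        return [form, code, instruction & 0xFFF]
    if form == REGISTER_BYTE:
        return [form, code, (instruction >> 8) & 0xF, instruction & 0xFF]
    return [form, code, (instruction >> 8) & 0xF, (instruction >> 4) & 0xF, instruction & 0xF]


def assemble(parts):
    """Compose the output of `disassemble` back into an instruction."""
    form, code, *segments = parts
    if form == ADDRESS:
        (address,) = segments
        return (code << 12) | address
    if form == REGISTER_BYTE:
        register, byte = segments
        return (code << 12) | (register << 8) | byte
    first, second, nibble = segments
    return (code << 12) | (first << 8) | (second << 4) | nibble


parts = disassemble(0x1200)   # a jump to address 0x200
parts[2] = 0x2A0              # modify only the address segment
print(hex(assemble(parts)))   # 0x12a0
```

The point of organizing the segments separately is visible in the last lines: the address is changed without any masking or shifting by the program doing the modification.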
The instruction format is as follows: a four bit command field, followed by the first address' two toggle bits, followed by the second address' two toggle bits, followed by the first twelve bit address, followed by the second twelve bit address. The toggle bits are interpreted as follows: if both are zero or false, the address specifies a location in the meta-address space; if the second is one or true, the address specifies a location in the target address space; if the first is one or true, the meta-address is indirected upon; and if both are one or true, the first field refers to the byte it points to and the following byte and is interpreted specially by certain instructions and the second field is interpreted as a twelve-bit literal. The last toggle bit combination was created specifically because moving data to the program counter, located at locations zero and one, takes the place of a jump instruction and this was necessary to achieve such in one instruction; as the first location can only specify a single byte, it's necessary to use a twelve-bit literal for this; certain jumps can, of course, work with only moving one byte, however.
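The format just described can be sketched as a pair of Python functions. The field order follows the description, with the command field assumed to occupy the most significant bits and the first toggle bit of each pair assumed to be the higher of the two; the names `encode`, `decode`, and the mode constants are hypothetical.

```python
# Toggle bit pairs, first toggle bit as the high bit (an assumption).
META, TARGET, INDIRECT, NEXT = 0b00, 0b01, 0b10, 0b11


def encode(command, first_mode, second_mode, first_address, second_address):
    """Pack the five fields into a 32-bit word: 4 + 2 + 2 + 12 + 12 bits."""
    for value, limit in ((command, 16), (first_mode, 4), (second_mode, 4),
                         (first_address, 4096), (second_address, 4096)):
        if not 0 <= value < limit:
            raise ValueError("field out of range")
    return ((command << 28) | (first_mode << 26) | (second_mode << 24)
            | (first_address << 12) | second_address)


def decode(word):
    """Unpack a 32-bit word into the five fields, in the same order."""
    return ((word >> 28) & 0xF, (word >> 26) & 0x3, (word >> 24) & 0x3,
            (word >> 12) & 0xFFF, word & 0xFFF)


# One reading of a jump: move (command 0) the twelve-bit literal 0x200 to the
# program counter at meta-address zero and the following location.
jump = encode(0, META, NEXT, 0x000, 0x200)
print(hex(jump))  # 0x3000200
```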
In all cases, ``data'' refers to the second argument, ``destination'' refers to the first, and ``next'' specifies the last toggle bit combination for the second argument. An instruction resembles this diagram; case changes being used to distinguish fields:
00 move        Move the data to the destination and the following location if next.
01 disassemble Disassemble the data interpreted as an instruction and place the identifier and segments beginning from destination.
02 assemble    Assemble data in the form of disassemble output to destination and the following location.
03 finish      Return control to the environment; ignore arguments; this can be controlled by programs.
04 ask         Prompt the user, fulfilling the conditions of data, and place the result beginning from destination; ignore next.
05 show        Display text to the user, with count from data and the text from destination; ignore next.
06 skip=       Skip the next instruction if data equals destination.
07 add         Add data to destination.
08 name        Enable manipulations of the name space.
09 find        Enable introspection into the target metadata space.
10 x2          Data is multiplied by two and placed in destination.
11 /2          Data is divided by two and placed in destination.
12 and         Data is ANDed with destination and the result is placed in destination.
13 nand        Data is NANDed with destination and the result is placed in destination.
14 or          Data is ORed with destination and the result is placed in destination.
15 xor         Data is XORed with destination and the result is placed in destination.
Several instructions ignore a next specification. The ``name'' and ``find'' instructions are vague in purpose and overloaded in an unsatisfactory manner. Meta-CHIP-8 currently has no facilities to perform, programmatically, certain actions the user can perform. This Meta-CHIP-8 is suitable as a prototype, but must be considered subject to widespread changes in the tool after release. It was decided, while writing this article, that the second address would become the data address, as this is easy to modify in its entirety with a single instruction using a next specification, whereas the first address isn't; this enables the modification of literal data in an instruction in one instruction.
The ``finish'' instruction can be controlled by a memory mapping, so that a finish instead causes a transfer of control, which then allows programs to call one another easily.
The thought of using two fields and allowing both to be literal, while not occurring in Meta-CHIP-8, has led to the thought of an interesting controlled means for program control: a literal destination is overwritten in the instruction itself.
Reasonable orthogonality with CHIP-8 instructions is considered to be very important; the C instruction, which generates random data and ANDs it with a number before storing it, can be made to resemble the Meta-CHIP-8 method of generating random data: There is a twelve bit source of random data in the address space, which can be accessed by any instruction; currently, it can't be written to in order to seed it; a comparable instruction sequence to the CHIP-8 would be to use the ``and'' instruction with the data provided by the random source and have the destination be the mask, which could be the literal field of another instruction. This particular example was the final reason used to switch the meanings of the fields, for easier access.
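The random data example above can be sketched as follows, purely as an illustration. The twelve bit random source is modelled with Python's `random` module, the mask is assumed to sit in the low twelve bits of another instruction word, and `random_source`, `and_into_literal`, and the value of `holder` are hypothetical.

```python
import random

LITERAL_MASK = 0xFFF  # the second twelve-bit address field, used as a literal


def random_source(rng):
    """Stand-in for the twelve-bit source of random data in the address space."""
    return rng.getrandbits(12)


def and_into_literal(word, data):
    """AND `data` with the literal field of `word`, storing the result in that field.

    This mirrors the ``and'' instruction with the literal field of another
    instruction as the destination holding the mask.
    """
    literal = word & LITERAL_MASK
    return (word & ~LITERAL_MASK) | (literal & data)


rng = random.Random(2015)
holder = 0x0300000F  # hypothetical instruction word whose literal is the mask 0x00F
masked = and_into_literal(holder, random_source(rng))
print(hex(masked & LITERAL_MASK))  # some value from 0x0 to 0xf
```

As with the CHIP-8 C instruction, the mask bounds the result; here the result also replaces the mask, so a program wanting to reuse the mask would need to keep a copy.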
The Meta-CHIP-8 address space will be discussed more in the next article, which details the Meta-Machine Code (MMC) tool, itself.
Only vague consideration has been given to Meta-MIPS, Meta-ARM, Meta-V850, Meta-RISC-V, et al.