(2023-05-03) Ode to 64K virtual machines
----------------------------------------
Many, if not most, VMs and interpreters meant to run on old hardware, or
to recreate the old-hardware experience anywhere else, are designed
around a 16-bit address bus and, as such, a maximum of 65536 bytes of
addressable memory. This trend started back in the 1970s (when CPUs
themselves had 8-bit buses) with Wozniak's SWEET16 and continues to this
day, including but not limited to Uxn, VTL-2, CHIP-8 with all its flavors,
PICO-8, Minicube64, OK64, many Forth implementations, most Subleq and other
OISC implementations, and even my Equi. So, why does this work, and why do
I personally consider this amount of RAM (and 16-bit environments as a
whole) optimal for day-to-day low-power human-scale computing?

First, let's get the most obvious thing out of the way. To be as
efficient as we can, we need the maximum address to equal the maximum
value of the machine word. Hence, for systems with 8-bit machine words,
256 bytes of RAM would be architecturally perfect, but, you guessed it,
that's way too little and we can hardly fit anything in there... and if
we do, let's remember how hard it was to program for the Atari 2600 with
its 128 bytes of RAM. So, we'd rather allocate two 8-bit machine words
(or one 16-bit machine word) to store our address, which means 65536
bytes of address space, which, you guessed it, has been enough for an
entire generation of home computers and gaming consoles, especially
considering that there was often even less actual RAM, and a good part
of the address space was dedicated to video, input, sound and other
ports and internal needs.
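The arithmetic above can be sanity-checked in a few lines of Python (the
`addr16` helper is purely illustrative and not part of any of the VMs
mentioned here):

```python
# A single 8-bit machine word can only address 2**8 = 256 bytes.
# Pairing two 8-bit words (or using one 16-bit word) yields a 16-bit
# address covering 2**16 = 65536 bytes.
def addr16(hi, lo):
    """Combine a high and a low 8-bit word into one 16-bit address."""
    return (hi << 8) | lo

print(2 ** 8)              # 256: reach of a single 8-bit word
print(2 ** 16)             # 65536: reach of a 16-bit address
print(addr16(0xFF, 0xFF))  # 65535: the highest addressable byte
```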

OK, so we have found out why 64K of address space is the smallest
comfortable amount. Now, why not more? Well, we can do more, but the
next comfortable stop is 24 bits, or 16 MiB. To be honest, programs that
occupy this much space are already far from human-scale. Of course,
there are legitimate scenarios for storing large amounts of data (and
not code) in RAM all at once, like BWT-based compression (which
generally works better with larger block sizes). But even in those
cases, you can optimize your algorithms to use processing methods more
suitable for working with storage (e.g. in the case of BWT, replace
quicksort with an external merge sort). The point is, there should be
virtually nothing to fill that much RAM with; otherwise, something else
is definitely going wrong.
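As a rough sketch of what a "processing method more suitable for
storage" looks like, here is a minimal external merge sort in Python: it
sorts fixed-size runs within a small memory budget, spills each run to a
temporary file, and then k-way merges the runs back together. The
`BLOCK` size and function names are my own illustration, not tied to any
particular BWT implementation:

```python
import heapq
import tempfile

BLOCK = 64 * 1024  # in-memory budget per run: 64 KiB


def external_sort(data, block=BLOCK):
    """Sort a bytes object with bounded memory: sort fixed-size runs,
    spill them to temporary files, then k-way merge the runs."""
    runs = []
    for i in range(0, len(data), block):
        run = sorted(data[i:i + block])  # only `block` bytes in memory
        f = tempfile.TemporaryFile()
        f.write(bytes(run))
        f.seek(0)
        runs.append(f)

    def stream(f):
        # Yield one byte value at a time from a spilled run.
        while (b := f.read(1)):
            yield b[0]

    # heapq.merge lazily merges the already-sorted runs.
    result = bytes(heapq.merge(*(stream(f) for f in runs)))
    for f in runs:
        f.close()
    return result
```

With a tiny block size you can watch it work run by run, e.g.
`external_sort(b"banana", block=2)` returns `b"aaabnn"`; the same shape
scales to inputs far larger than the memory budget.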

I'm not saying that memory usage over 64KB per application/VM should be
prohibited, but I am saying it must be heavily motivated. Otherwise, DIY
and LPC projects and platforms will eventually end up in the same state
that mainstream software/platform development is in today.

--- Luxferre ---