Insidious Optimizations I: Machine Architecture

This is my first ``Insidious Optimizations'' article, and so I'll explain the basic concepts herein. The idea of insidious optimizations is important to me, being a good part of the reason I began writing articles and whatnot. An insidious optimization is an optimization which ceases to be viewed as one. An insidious optimization is insidious in how it limits thought, especially for those who come to think after it has been introduced. An insidious optimization is an optimization which is neither necessary nor necessarily always present with regard to its topic. It limits thought by contributing unnecessary constraints and assumptions; with an insidious optimization, people will simply not see alternatives, or will even have difficulty understanding that alternatives could exist. It stunts a mind.

There exist many insidious optimizations in that young field of automatic computing, and this article details those concerning machine design.

The prime insidious optimization I see is that of the register. Registers cause issues involving optimal allocation in compilers, implicit usage, the redundant instructions they make necessary, and the limited nature of their storage. The first two issues work in tandem to make writing compilers more complex than necessary; the third issue is the least egregious of the set; and that final issue causes unnecessarily complicated memory hierarchies, through having small and very limited registers act as the fastest memory available.

I've found great interest in the memory-to-memory model of machine architecture; such a model is the least explored by others and, I believe, the most inherent, and so relatively lacking in any insidious optimizations. There is little need to optimally allocate data, as access is generally equal; implicit usage can be enforced by a particular machine, and in some cases should be, but is a lesser evil there; a memory-to-memory machine needs but a lone instruction for moving any data, as all of it sits at an equal level; and, perhaps most importantly, the fastest memory in such a machine need no longer be poorly segmented. Emphasizing that last point, it's not generally possible to store an important array across the registers of a machine, but a memory-to-memory machine faces no such issue. Continuing from the third point, other instructions become unnecessary as well, including shifting and control flow, and removing register sizes as the common units exposes an interface better suited to arbitrary-length operations. Lacking registers, such a machine would do well to simply expose all of its state through the memory, which would also make recovery and resumption easier; a machine with the program counter exposed from a memory location needs no special jump instruction, as a jump would merely be a move. I prefer such a counter be placed at the zeroth memory location; a small sketch following these paragraphs illustrates the idea.

I consider the issue of fast memory rather uninteresting, but a memory-to-memory machine can resolve it nicely by providing the programmer with some control; a cache memory not under the good control of the programmer is inherently poor, I think. The programmer could merely have the machine replace a range of memory with a suitably large amount of fast memory, transparently, so the only difference is the speed with which that range is accessed. This nicely addresses the point that registers serve as hints to the machine of what's valuable; this mechanism, however, works for all kinds of data, be they code, minor data, or large structures.
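To make the shape of such a machine concrete, what follows is a minimal sketch in C of a memory-to-memory machine whose entire state, program counter included, lives in one flat memory of words; the two-word MOVE encoding and the convention that a jump to word zero halts the machine are assumptions of mine for illustration only, not a serious design.

    #include <stdio.h>

    #define PC    0   /* the program counter lives at word zero */
    #define WORDS 64  /* a toy memory of sixty-four words       */

    static unsigned mem[WORDS];

    /* Each instruction occupies two words, a destination address and a
       source address, and MOVE copies mem[src] into mem[dst]; writing
       to word zero thus changes the program counter: a jump is a move. */
    static void run(void)
    {
        while (mem[PC] != 0) {              /* a jump to word zero halts  */
            unsigned dst = mem[mem[PC]];
            unsigned src = mem[mem[PC] + 1];
            mem[PC] += 2;                   /* step past this instruction */
            mem[dst] = mem[src];            /* the lone operation         */
        }
    }

    int main(void)
    {
        mem[PC] = 10;                /* begin executing at word ten         */
        mem[10] = 31; mem[11] = 30;  /* MOVE word thirty to word thirty-one */
        mem[12] = PC; mem[13] = 33;  /* MOVE word thirty-three to word zero,
                                        a jump expressed as a move          */
        mem[30] = 42;
        mem[33] = 0;                 /* the jump target is zero, so halt    */
        run();
        printf("word 31 holds %u\n", mem[31]);
        return 0;
    }

The jump in that little program is nothing but a move into word zero, and exposing the counter in this way also leaves it free to be saved and restored as any other datum would be.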
Another insidious optimization is the collecting of memory into fixed multiples of bits, commonly by octet. This decreases an address size by a mere three bits, at the cost of flexibility, and I'm led to believe this loss of flexibility results in a net loss of memory by making optimal compacting arduous in some cases. An obvious example is needing only a single bit of storage, which requires either wasting the other seven or writing code to collect only the relevant bit. A less obvious example is the alignment of structures: were memory used bitwise, it would be trivial to use the minimum number of bits, but in many cases this is so arduous that it's not done, wasting whatever space is used to make access convenient; the extra code required to do otherwise would likely dwarf the savings. A memory-to-memory machine with bitwise memory trivially eliminates shifting instructions: a shift becomes a simple move, or merely a change of address, given the relevant data has some space set aside around it; the sketch closing this article shows the idea. The issue of using conventional memory hardware is irrelevant, as the required translations are trivial.

A prime question in the design of a memory-to-memory machine with bitwise memory is which unit serves well as a base. My solution is to use the address length as that unit, as it is rather inherent to operation anyway and does well wherever other fixed-size units are needed. A prime disadvantage of the memory-to-memory design compared to others is the inherently larger size of its instructions, but some of my designs have combatted this to nice results; the issue of large instructions can be mitigated by making each instruction more capable, which may itself be aided by the space opened for more varied instructions. I find the design both fascinating and elegant; some early automatic computers could be said to have used a variant. The simplification gained through eliminating instructions, the ease of optimizing memory usage, and the ease with which optimizations can be added transparently all suggest the design has merit.

It's a different insidious optimization that machine architecture deals almost solely in terms of numbers and doesn't span to other abstract domains, but this will be detailed elsewhere. I don't intend to imply other architectures are strictly inferior; each design has quirks I find fascinating. Abandoned fields are fertile ground for novel research, and those machine architectures which deviate are rather certainly abandoned now; this article gives context for some of my efforts.
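As to those trivial translations, here's a minimal sketch in C of bit-granular addressing laid over conventional octet-addressed storage; the most-significant-bit-first order within each octet and the names bit_read and bit_write are assumptions of mine for illustration.

    #include <stdint.h>
    #include <stdio.h>

    /* Read width bits, at most thirty-two, starting at bit address bit;
       the divisions bit / 8 and bit % 8 are the entire translation onto
       octet-addressed hardware.                                          */
    static uint32_t bit_read(const uint8_t *store, size_t bit, unsigned width)
    {
        uint32_t value = 0;
        for (unsigned i = 0; i < width; ++i, ++bit)
            value = (value << 1) | ((store[bit / 8] >> (7 - bit % 8)) & 1u);
        return value;
    }

    /* Write the low width bits of value starting at bit address bit. */
    static void bit_write(uint8_t *store, size_t bit, unsigned width,
                          uint32_t value)
    {
        for (unsigned i = 0; i < width; ++i, ++bit) {
            unsigned b = (value >> (width - 1 - i)) & 1u;
            store[bit / 8] = (uint8_t)((store[bit / 8]
                                        & ~(1u << (7 - bit % 8)))
                                       | (b << (7 - bit % 8)));
        }
    }

    int main(void)
    {
        uint8_t store[8] = {0};
        bit_write(store, 3, 5, 21);   /* a five-bit field at bit three    */
        /* Reading the same field from one bit later, with a cleared bit
           of slack left after it, yields the value shifted right by one,
           with no shift instruction in sight.                            */
        printf("%u %u\n", bit_read(store, 3, 5), bit_read(store, 4, 5));
        return 0;
    }

This prints ``21 10''; the second read is the first shifted right by one, obtained by changing an address alone.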