Assembly

   Assembly (also ASM) is, for any given [1]hardware computing platform
   ([2]ISA, basically a [3]CPU architecture), the lowest level [4]programming
   language that expresses typically a linear, unstructured (i.e. without
   nesting blocks of code) sequence of CPU instructions -- it maps (mostly)
   1:1 to [5]machine code (the actual [6]binary CPU instructions) and
   basically only differs from the actual machine code by utilizing a more
   human readable form (it gives human friendly nicknames, or mnemonics, to
   different combinations of 1s and 0s). Assembly is converted by an
   [7]assembler into the machine code, something akin to a computer
   equivalent of the "[8]DNA", the lowest level instructions for the
   computer. Assembly is similar to [9]bytecode, but bytecode is meant to be
   [10]interpreted or used as an intermediate representation in [11]compilers
   while assembly represents actual native code run by hardware. In ancient
   times when there were no higher level languages (like [12]C or
   [13]Fortran) assembly was used to write computer programs -- nowadays most
   programmers no longer write in assembly (the majority of [14]zoomer
   "[15]coders" probably never even touch anything close to it) because it's
   hard (takes a long time) and not [16]portable, however programs written in
   assembly are known to be extremely fast as the programmer has absolute
   control over every single instruction (of course that is not to say you
   can't fuck up and write a slow program in assembly).

   { see this meme lol :D http://lolwut.info/images/4chan-g1.png ~drummyfish
   }

   Assembly is NOT a single language, it differs for every architecture, i.e.
   every model of CPU has potentially a different architecture, understands
   a different machine code and hence has a different assembly (though there
   are some standardized families of assembly like x86 that work on a wide
   range of CPUs); therefore assembly is not [17]portable (i.e. the program
   won't generally work on a different type of CPU or under a different
   [18]OS)! And even the same kind of assembly language may have several
   different [19]syntax formats that also create basically slightly different
   languages which differ e.g. in comment style, order of writing arguments
   and even instruction abbreviations (e.g. x86 can be written in [20]Intel
   or [21]AT&T syntax). Because of this non-portability (and also because
   "assembly is hard") you mostly shouldn't write your programs directly in
   assembly but rather in a somewhat higher level language such as [22]C
   (which can be compiled to any CPU's assembly). However you should
   know at least the very basics of programming in assembly as a good
   programmer will come in contact with it sometimes, for example during
   hardcore [23]optimization (many languages offer an option to embed inline
   assembly in specific places), debugging, reverse engineering, when writing
   a C compiler for a completely new platform or even when designing one's
   own new platform (you'll probably want to make your compiler generate
   native assembly, so you have to understand it). You should write at least
   one program in assembly -- it gives you a great insight into how a
   computer actually works and you'll get a better idea of how your high
   level programs translate to machine code (which may help you write better
   [24]optimized code) and WHY your high level language looks the way it
   does.

   OK, but why doesn't anyone make a portable assembly? Well, people do, they
   just usually call it a [25]bytecode -- take a look at that. [26]C is
   portable and low level, so it is often called a "portable assembly",
   though it still IS significantly higher in abstraction and won't usually
   give you the real assembly vibes. [27]Forth may also be seen as close to
   such a concept. ACTUALLY [28]Dusk OS has something yet closer, called
   [29]Harmonized Assembly Layer (see
   https://git.sr.ht/~vdupras/duskos/tree/master/fs/doc/hal.txt).
   [30]WebAssembly would also probably fit the definition.

   The most common assembly languages you'll encounter nowadays are [31]x86
   (used by most desktop [32]CPUs) and [33]ARM (used by most mobile CPUs) --
   both are used by [34]proprietary hardware and though an assembly language
   itself cannot (as of yet) be [35]copyrighted, the associated architectures
   may be "protected" (restricted) e.g. by [36]patents (see also [37]IP
   cores). [38]RISC-V on the other hand is an "[39]open" alternative, though
   not yet as widespread. Other assembly languages include e.g. [40]AVR
   (8bit CPUs used e.g. by some [41]Arduinos) and [42]PowerPC.

   To be precise, a typical assembly language is actually more than a set of
   nicknames for machine code instructions, it may offer helpers such as
   [43]macros (something akin to the C preprocessor), pseudoinstructions
   (commands that look like instructions but actually translate to e.g.
   multiple instructions), [44]comments, directives, automatic inference of
   opcode from operands, named labels for jumps (as writing literal jump
   addresses would be extremely tedious) etc. I.e. it is still much easier to
   write in assembly than to write pure machine code even if you knew all
   opcodes from memory. For the same reason remember that just replacing
   assembly mnemonics with binary machine code instructions is not yet enough
   to make an executable program! More things have to be done such as
   [45]linking [46]libraries and converting the result to some [47]executable
   format such as [48]ELF which contains things like a header with
   metainformation about the program etc.

   How will programming in assembly differ from your mainstream high-level
   programming? Quite a lot, assembly is extremely low level, so you get no
   handholding or much programming "safety" (apart from e.g. CPU operation
   modes), you have to do everything yourself -- you'll be dealing with
   things such as function [49]call conventions, [50]interrupts, [51]syscalls
   and their conventions, counting CPU cycles of individual instructions,
   looking up exact hexadecimal memory addresses, opcodes, defining memory
   segments, dealing with [52]endianness, raw [53]goto jumps, [54]call frames
   etc. You have no branching (if-then-else), loops or functions, you make
   these yourself with gotos. You can't write expressions like (a + 3 * b) /
   10, no, you have to write every step of how to evaluate this expression
   using registers, i.e. something like: load a to register A, load b to
   register B, multiply B by 3, add register B to A, divide A by 10. You
   don't have any [55]data types, you have to know yourself that your
   variables really represent signed values so when you're dividing, you have
   to use signed divide instruction instead of unsigned divide -- if you mess
   this up, no one will tell you, your program simply won't work. And so on.

Typical Assembly Language

   Assembly languages are usually unstructured, i.e. there are no control
   structures such as if or while statements: these have to be manually
   implemented using labels and jump ([56]goto, branch) instructions. There
   may exist macros that mimic control structures. The typical look of an
   assembly program is however still a single column of instructions with
   arguments, one per line, each representing one machine instruction.

   In assembly it is also common to blend program instructions and data, i.e.
   sometimes you create a label after which you just put bytes that will
   represent e.g. text strings or images and after that you start to write
   program instructions that work with these data, which will likely
   physically be placed this way (after the data) in the final program. This
   may cause quite nasty bugs if you by mistake jump to a place where data
   reside and try to treat them as instructions.

   The working of the language reflects the actual [57]hardware architecture
   -- most architectures are based on [58]registers so usually there is a
   small number (something like 16) of registers which may be called
   something like R0 to R15, or A, B, C etc. Sometimes registers may even be
   subdivided (e.g. in x86 there is an eax 32bit register and its lower half
   can be used as the ax 16bit register). These registers are the fastest
   available memory (faster than the main RAM memory, they are literally
   INSIDE the CPU, even in front of the [59]cache) and are used to perform
   calculations. Some registers are general purpose and some are special:
   typically there will be e.g. the FLAGS register which holds various 1bit
   results of performed operations (e.g. [60]overflow, zero result etc.).
   Some instructions may only work with some registers (e.g. there may be
   kind of a "[61]pointer" register used to hold addresses along with
   instructions that work with this register, which is meant to implement
   [62]arrays). Values can be moved between registers and the main memory
   (with instructions called something like move, load or store).

   Writing instructions works similarly to how you call a [63]function in
   high level language: you write its name and then its [64]arguments, but in
   assembly things are more complicated because an instruction may for
   example only allow certain kinds of arguments -- it may e.g. allow a
   register and immediate constant (kind of a number literal/constant), but
   not e.g. two registers. You have to read the documentation for each
   instruction. While in high level language you may write general
   [65]expressions as arguments (like myFunc(x + 2 * y,myFunc2())), here you
   can only pass specific values.

   There are also no complex [66]data types, assembly only works with numbers
   of different size, e.g. 16 bit integer, 32 bit integer etc. Strings are
   just sequences of numbers representing [67]ASCII values, it is up to you
   whether you implement null terminated strings or Pascal style strings.
   [68]Pointers are just numbers representing addresses. It is up to you
   whether you interpret a number as signed or unsigned (some instructions
   treat numbers as unsigned, some as signed, some don't care because it
   doesn't matter).

   Instructions are typically written as three-letter abbreviations and
   follow some unwritten naming conventions so that different assembly
   languages at least look similar. Common instructions found in most
   assembly languages are for example:

     * MOV (move): move a number between registers and/or main memory (RAM).
     * JMP (jump, also e.g. BRA for branch): unconditional jump to another
       instruction (usually given by a label).
     * JEQ (jump if equal, also BEQ etc.): jump if result of previous
       comparison was equality.
     * ADD (add): add two numbers.
     * NOP (no operation): do nothing (used e.g. for delays or as
       placeholders).
     * CMP (compare): compare two numbers and set relevant flags (typically
       for a subsequent conditional jump).
     * ...

   [69]Fun note: HCF -- halt and catch fire -- is a humorous nickname for
   instructions that just stop the CPU and wait for restart.

How To

   For specific assembly language how tos see their own articles: [70]x86,
   [71]Arm etc.

   On [72]Unices the [73]objdump utility from GNU binutils can be used to
   disassemble compiled programs, i.e. view the instructions of the program
   in assembly (other tools like ndisasm can also be used). Use it e.g. as:

 objdump -d my_compiled_program

   Let's now write a simple Unix program in 64bit [74]x86 assembly -- we'll
   be using AT&T syntax that's used by [75]GNU. Write the following source
   code into a file named e.g. program.s:

 .global   _start         # make the symbol visible to the linker

 str:
 .ascii    "it works\n"   # the string data

 .text
 _start:                  # execution starts here
   mov     $5,   %rbx     # store loop counter in rbx

 .loop:
   # make a Linux "write" syscall:
                          # args to syscall will be passed in regs.
   mov     $1,   %rax     # says syscall type (1 = write)
   mov     $1,   %rdi     # says file to write to (1 = stdout)
   mov     $str, %rsi     # says the address of the string to write
   mov     $9,   %rdx     # says how many bytes to write
   syscall                # makes the syscall

   sub     $1,   %rbx     # decrement loop counter
   cmp     $0,   %rbx     # compare it to 0
   jne     .loop          # if not equal, jump to start of the loop

   # make an "exit" syscall to properly terminate:
   mov     $60,  %rax     # says syscall type (60 = exit)
   mov     $0,   %rdi     # says return value (0 = success)
   syscall                # makes the syscall

   The program just writes out it works five times: it uses a simple loop and
   a [76]Unix [77]system call for writing a string to standard output (i.e.
   it won't work on [78]Windows and similar shit).

   In general assembly source code can be manually turned into an executable
   by running an assembler such as as (or nasm for Intel-like syntax) to
   obtain the intermediate [79]object file and then [80]linking it with ld,
   but to assemble the above written code simply we may just use the gcc
   compiler which does everything for us:

 gcc -nostdlib -no-pie -o program program.s

   Now we can run the program with

 ./program

   And we should see

 it works
 it works
 it works
 it works
 it works

   As an exercise you can objdump the final executable and see that the
   output basically matches the original source code. Furthermore try to
   disassemble some primitive C programs and see how a compiler e.g. makes if
   statements or functions into assembly.

Example

   Let's take the following [81]C code:

 #include <stdio.h>

 char incrementDigit(char d)
 {
   return // remember this is basically an if statement
     d >= '0' && d < '9' ?
     d + 1 :
     '?';
 }

 int main(void)
 {
   char c = getchar();
   putchar(incrementDigit(c));
   return 0;
 }

   We will now compile it to different assembly languages (you can do this
   e.g. with gcc -S my_program.c). The output will contain some
   [82]boilerplate (assembler directives etc.); getchar and putchar are only
   called here, their implementations come from the standard library at
   link time, so we'll only be looking at the assembly
   corresponding to the above written code. Also note that the generated
   assembly will probably differ between compilers, their versions, flags
   such as [83]optimization level etc. The code will be manually commented.

   { I used this online tool: https://godbolt.org. ~drummyfish }

   { Also not sure the comments are 100% correct, let me know if not.
   ~drummyfish }

   The [84]x86 assembly may look like this (to understand the weird juggling
   of values between registers see [85]calling conventions):

 incrementDigit:
   pushq   %rbp                   # save base pointer
   movq    %rsp, %rbp             # move base pointer to stack top
   movl    %edi, %eax             # move argument to eax
   movb    %al, -4(%rbp)          # and move it to local var.
   cmpb    $47, -4(%rbp)          # compare it to 47 ('0' - 1)
   jle     .L2                    # if <=, jump to .L2
   cmpb    $56, -4(%rbp)          # else compare to 56 ('9' - 1)
   jg      .L2                    # if >, jump to .L2
   movzbl  -4(%rbp), %eax         # else get the argument
   addl    $1, %eax               # add 1 to it
   jmp     .L4                    # jump to .L4
 .L2:
   movl    $63, %eax              # move '?' to eax (return val.)
 .L4:
   popq    %rbp                   # restore base pointer
   ret
  
 main:
   pushq   %rbp                   # save base pointer
   movq    %rsp, %rbp             # move base pointer to stack top
   subq    $16, %rsp              # make space on stack
   call    getchar                # push ret. addr. and jump to func.
   movb    %al, -1(%rbp)          # store return val. to local var.
   movsbl  -1(%rbp), %eax         # move with sign extension
   movl    %eax, %edi             # arg. will be passed in edi
   call    incrementDigit
   movsbl  %al, %eax              # sign extend return val.
   movl    %eax, %edi             # pass arg. in edi again
   call    putchar
   movl    $0, %eax               # values are returned in eax
   leave
   ret

   The [86]ARM (64bit, AArch64) assembly may look like this:

 incrementDigit:
   sub   sp, sp, #16              // make room on stack
   strb  w0, [sp, 15]             // store argument from w0 to local var.
   ldrb  w0, [sp, 15]             // load back to w0
   cmp   w0, 47                   // compare to 47 ('0' - 1)
   bls   .L2                      // branch to .L2 if <= (unsigned)
   ldrb  w0, [sp, 15]             // load argument again to w0
   cmp   w0, 56                   // compare to 56 ('9' - 1)
   bhi   .L2                      // branch to .L2 if > (unsigned)
   ldrb  w0, [sp, 15]             // load argument again to w0
   add   w0, w0, 1                // add 1 to it
   and   w0, w0, 255              // keep only the lowest byte
   b     .L3                      // branch to .L3
 .L2:
   mov   w0, 63                   // set w0 (ret. value) to '?'
 .L3:
   add   sp, sp, 16               // shift stack pointer back
   ret
  
 main:
   stp   x29, x30, [sp, -32]!     // shift stack and store x regs
   mov   x29, sp
   bl    getchar
   strb  w0, [sp, 31]             // store w0 (ret. val.) to local var.
   ldrb  w0, [sp, 31]             // load it back to w0
   bl    incrementDigit
   and   w0, w0, 255              // keep only the lowest byte
   bl    putchar
   mov   w0, 0                    // set ret. val. to 0
   ldp   x29, x30, [sp], 32       // restore x regs
   ret

   The [87]RISC-V assembly may look like this:

 incrementDigit:
   addi    sp,sp,-32              # shift stack (make room)
   sw      s0,28(sp)              # save frame pointer
   addi    s0,sp,32               # shift frame pointer
   mv      a5,a0                  # get arg. from a0 to a5
   sb      a5,-17(s0)             # save it to local var.
   lbu     a4,-17(s0)             # get it to a4
   li      a5,47                  # load 47 ('0' - 1) to a5
   bleu    a4,a5,.L2              # branch to .L2 if a4 <= a5
   lbu     a4,-17(s0)             # load arg. again
   li      a5,56                  # load 56 ('9' - 1) to a5
   bgtu    a4,a5,.L2              # branch to .L2 if a4 > a5
   lbu     a5,-17(s0)             # load arg. again
   addi    a5,a5,1                # add 1 to it
   andi    a5,a5,0xff             # keep only the lowest byte
   j       .L3                    # jump to .L3
 .L2:
   li      a5,63                  # load '?'
 .L3:
   mv      a0,a5                  # move result to ret. val.
   lw      s0,28(sp)              # restore frame pointer
   addi    sp,sp,32               # pop stack
   jr      ra                     # jump to addr in ra
  
 main:
   addi    sp,sp,-32              # shift stack (make room)
   sw      ra,28(sp)              # store ret. addr on stack
   sw      s0,24(sp)              # store stack frame pointer on stack
   addi    s0,sp,32               # shift frame pointer
   call    getchar
   mv      a5,a0                  # copy return val. to a5
   sb      a5,-17(s0)             # move a5 to local var
   lbu     a5,-17(s0)             # load it again to a5
   mv      a0,a5                  # move it to a0 (func. arg.)
   call    incrementDigit
   mv      a5,a0                  # copy return val. to a5
   mv      a0,a5                  # get it back to a0 (func. arg.)
   call    putchar
   li      a5,0                   # load 0 to a5
   mv      a0,a5                  # move it to a0 (ret. val.)
   lw      ra,28(sp)              # restore return addr.
   lw      s0,24(sp)              # restore frame pointer
   addi    sp,sp,32               # pop stack
   jr      ra                     # jump to addr in ra

Links:
1. hw.md
2. isa.md
3. cpu.md
4. programming_language.md
5. machine_code.md
6. binary.md
7. assembler.md
8. dna.md
9. bytecode.md
10. interpreter.md
11. compiler.md
12. c.md
13. fortran.md
14. zoomer.md
15. coding.md
16. portability.md
17. portability.md
18. os.md
19. syntax.md
20. intel.md
21. at_and_t.md
22. c.md
23. optimization.md
24. optimization.md
25. bytecode.md
26. c.md
27. forth.md
28. duskos.md
29. hal.md
30. web_assembly.md
31. x86.md
32. cpu.md
33. arm.md
34. proprietary.md
35. copyright.md
36. patent.md
37. ip_core.md
38. risc_v.md
39. open.md
40. avr.md
41. arduino.md
42. ppc.md
43. macro.md
44. comment.md
45. linking.md
46. library.md
47. executable_format.md
48. elf.md
49. call_convention.md
50. interrupt.md
51. syscall.md
52. endianness.md
53. goto.md
54. call_frame.md
55. data_type.md
56. goto.md
57. hardware.md
58. register.md
59. cache.md
60. overflow.md
61. pointer.md
62. array.md
63. function.md
64. argument.md
65. expression.md
66. data_type.md
67. ascii.md
68. pointer.md
69. fun.md
70. x86.md
71. arm.md
72. unix.md
73. objdump.md
74. x86.md
75. gnu.md
76. unix.md
77. syscall.md
78. windows.md
79. obj.md
80. linking.md
81. c.md
82. boilerplate.md
83. optimization.md
84. x86.md
85. calling_convention.md
86. arm.md
87. risc_v.md