x86 Assembly – A crash course tutorial I

Let’s get to the point: you have these things called registers. Registers are containers that can hold up to 4 bytes of data on a 32-bit x86 processor (right now, we will only focus on the smaller 2-byte, 16-bit registers). Each register has a name, and in the 16-bit model there are 14 of them. Using instructions, which are like commands, you can manipulate the data within these registers and perform operations. The 4 registers that you should know about first are named AX, BX, CX, and DX. Their names stand for the following: A=Accumulator, B=Base, C=Count, D=Data. As I said, these registers are 2 bytes in length, and 2 bytes = 16 bits:

|   |   |   |   |   |   |   |   |   |   |   |   |   |   |   |   |

Above is a representation of a 16-bit register, where every box is a bit, and every bit can be either a 1 or a 0. Four bits can represent 16 different values (0 through 15), which is exactly the range of a single hexadecimal digit (0 through F). A register has 16 bits, or four groups of 4 bits, so the value of a 2-byte register is written as 4 hex digits.
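
As a quick check of that arithmetic, here is a small Python sketch (Python is only used as a calculator here, it is not assembly). It splits a 16-bit value into four groups of 4 bits and shows each group as one hex digit. The value 0x4C00 is just an example.

```python
value = 0x4C00  # example 16-bit value

bits = format(value, "016b")                          # the 16 bits as a string
nibbles = [bits[i:i + 4] for i in range(0, 16, 4)]    # four groups of 4 bits
hex_digits = [format(int(n, 2), "X") for n in nibbles]  # one hex digit per group

print(bits)        # 0100110000000000
print(nibbles)     # ['0100', '1100', '0000', '0000']
print(hex_digits)  # ['4', 'C', '0', '0']
```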

In assembly, you can also access the 1-byte halves of AX, BX, CX, and DX. Each of these registers has a high half and a low half, which are named as follows:

AX = |   |   |   |   |   |   |   |   |   |   |   |   |   |   |   |   |
     |______________________________| |______________________________|
                     |                               |
                    AH                               AL

BX = |   |   |   |   |   |   |   |   |   |   |   |   |   |   |   |   |
     |______________________________| |______________________________|
                     |                               |
                    BH                               BL

CX = |   |   |   |   |   |   |   |   |   |   |   |   |   |   |   |   |
     |______________________________| |______________________________|
                     |                               |
                    CH                               CL

DX = |   |   |   |   |   |   |   |   |   |   |   |   |   |   |   |   |
     |______________________________| |______________________________|
                     |                               |
                    DH                               DL

So when you modify one of the high or low halves, you also change the value of the whole 2-byte register, while the other half stays as it was. And since each half is only 1 byte, you can only specify a 1-byte value when modifying AH, AL, BH, BL, CH, CL, DH, or DL.
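
To see how the halves relate to the whole register, here is a minimal Python model (not assembly, and the starting value 0x1234 is made up for illustration). It splits AX into AH and AL, then changes only AH and rebuilds AX.

```python
ax = 0x1234            # example starting value of AX

ah = (ax >> 8) & 0xFF  # high byte of AX -> AH (0x12)
al = ax & 0xFF         # low byte of AX  -> AL (0x34)

# Model of writing 0x4C into AH: only the high byte changes, AL is kept.
ah = 0x4C
ax = (ah << 8) | al

print(format(ax, "04X"))  # 4C34
```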

Now let’s get to the instructions you can use in this language.

First you have the MOV instruction. This command copies a value into a register or memory location, overwriting whatever was there. Syntax: mov dest,src where src is a direct value, a pointer to a value, or a register to get data from, and dest is the register or memory location that receives it. All numbers in the examples are hexadecimal. Examples:
mov ax,4c00 = put the value (which is 2 bytes/4 hex digits) 0x4C00 into AX; Result: AX=0x4C00,AH=0x4C, AL=0x00
mov ah,09 = put 0x09 into 1 byte register ah; Result: AH=0x09, AL=Unchanged
mov bx,dx = put value found in dx into bx
mov ah,cl = put value found in cl into ah
mov dx,[bx] = put the 2 byte value pointed to by the address found in bx into dx
mov byte ptr [bx],ah = put value found in ah into memory location pointed to by bx
mov dh,[010f] = put value found at address 010f into dh
An important thing to remember is that the source and destination of your mov instruction must be the same size. So you cannot mov ch,bx; that would be invalid since bx is 2 bytes and ch is only 1.

As you’ve noticed, some registers or values are enclosed in brackets. This means that those values are treated as pointers. A pointer is an address that refers to a value in memory (which is also where your program file gets loaded). So mov dh,[010f] doesn’t mean place the actual value 010f inside dh; it means take the byte stored at address 010f and place it into dh. The same concept applies to registers. When you enclose a register in brackets, the register holds the address, and the value is read from (or written to) that address in memory.
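
The difference between a value and a pointer can be sketched in Python by treating memory as an array of bytes. This is only a model: the bytes 0x41 and 0x42 stored at 010f and 0110 are made up, and dh and dx are kept as separate variables for clarity (on a real CPU, loading dx would also overwrite dh).

```python
memory = bytearray(0x10000)   # 64 KB of zeroed memory

memory[0x010F] = 0x41         # example byte stored at address 010f
memory[0x0110] = 0x42         # example byte stored at address 0110

# Model of mov dh,[010f]: load the byte stored AT address 010f, not 010f itself.
dh = memory[0x010F]

# Model of mov dx,[bx] with bx=010f: bx holds the address, and the 2-byte
# load reads the low byte first because x86 is little-endian.
bx = 0x010F
dx = memory[bx] | (memory[bx + 1] << 8)

print(format(dh, "02X"))  # 41
print(format(dx, "04X"))  # 4241
```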

Now notice that when I used a register as a pointer in those operations, I only used BX. Of these four registers, BX (Base) is the only one that you can use as a pointer in 16-bit code (SI, DI, and BP can also hold addresses, but they are not covered in this part). The AX (Accumulator) register is used for math operations and, most importantly, for specifying a number that references a function for a system call (which we will talk about later). CX (Count) is used for counting and looping, while DX (Data) is used for storing additional data.

Now the next instruction is INC. Its function is simple: it increments (adds 1 to) the value of any register you specify.
inc cx = if cx=1234 before the instruction, then its new value is 1235
inc ah = if ah=4c, then its new value is 4d

The DEC instruction is the same as INC, except it decrements (subtracts 1 from) the specified register.

ADD is an instruction that adds a value to the existing value of a register.
if ch = b2 then add ch,0A makes ch = bc (yes, you need to know hexadecimal to get this)
if bx=010f then add bx,124d makes bx = 135c
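
If you want to double-check the hex sums, this Python sketch models an 8-bit and a 16-bit addition. The helper names add8 and add16 are made up for illustration. The masking step shows one thing the examples above do not cover: a register cannot grow, so a result that is too big wraps around.

```python
def add8(a, b):
    """Add the way a 1-byte register would: keep only the low 8 bits."""
    return (a + b) & 0xFF

def add16(a, b):
    """Add the way a 2-byte register would: keep only the low 16 bits."""
    return (a + b) & 0xFFFF

ch = add8(0xB2, 0x0A)        # add ch,0A with ch=b2
bx = add16(0x010F, 0x124D)   # add bx,124d with bx=010f
wrapped = add8(0xFF, 0x01)   # too big for 1 byte, so it wraps around

print(format(ch, "02X"))       # BC
print(format(bx, "04X"))       # 135C
print(format(wrapped, "02X"))  # 00
```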

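SUB has the same syntax as the ADD instruction, except it subtracts. MUL (multiply) and DIV (divide) work differently: they take only one operand, and the other value is always the accumulator. mul bl multiplies al by bl and puts the result in ax, while mul bx multiplies ax by bx and puts the result in dx:ax (high half in dx, low half in ax). div bl divides ax by bl, leaving the quotient in al and the remainder in ah, while div bx divides dx:ax by bx, leaving the quotient in ax and the remainder in dx.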
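
On the 8086, MUL and DIV take a single operand and use AX (and DX for 2-byte operands) implicitly. Here is a rough Python model of the 2-byte forms. The helper names mul16 and div16 and the sample values are made up for illustration, and the model ignores the error a real CPU raises when dividing by zero or when the quotient does not fit in AX.

```python
def mul16(ax, src):
    """Model of 'mul src' with a 2-byte operand: DX:AX = AX * src."""
    product = ax * src
    dx = (product >> 16) & 0xFFFF   # high half of the result
    ax = product & 0xFFFF           # low half of the result
    return dx, ax

def div16(dx, ax, src):
    """Model of 'div src' with a 2-byte operand: AX = quotient, DX = remainder."""
    dividend = (dx << 16) | ax
    quotient, remainder = divmod(dividend, src)
    return remainder, quotient      # returned in (dx, ax) order

dx, ax = mul16(0x1234, 0x0100)
print(format(dx, "04X"), format(ax, "04X"))  # 0012 3400

dx, ax = div16(dx, ax, 0x0100)
print(format(dx, "04X"), format(ax, "04X"))  # 0000 1234
```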

XCHG is an instruction that isn’t used much (at least not by me). It swaps the values of two registers, and the registers must be exactly the same size.
If ax=1234 and bx=2435 then xchg ax,bx makes ax=2435 and bx=1234

By now you should know about the stack; if you don’t, then see this article I wrote: http://www.lingubender.com/forum/viewtopic.php?f=24&t=107. The 2 instructions that manipulate the stack are PUSH and POP. With PUSH, you specify a register or 2 byte value to push onto the stack: push ax, push 1234. POP pops a value from the top of the stack and puts it into the register you specify: pop dx
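
The last-in, first-out behavior of PUSH and POP can be sketched with a Python list. This is a simplified model with made-up values: the real stack lives in memory and is tracked by a stack pointer register, which is left out here.

```python
stack = []             # the end of the list is the top of the stack

ax = 0x1234
stack.append(ax)       # model of push ax
stack.append(0x5678)   # model of push 5678

dx = stack.pop()       # model of pop dx: gets the value pushed last
bx = stack.pop()       # model of pop bx: gets the value pushed first

print(format(dx, "04X"))  # 5678
print(format(bx, "04X"))  # 1234
```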

Now there are two instructions that go together: CALL and RET. CALL saves the address of the next instruction on the stack and then jumps to the specified address or label, so it is like a call to a sub-procedure. RET returns from the call: it pops the address from the top of the stack and jumps back to it.

JMP is simpler than CALL, as it just goes to an address in your code without modifying the stack and without the need for a RET instruction. Example: jmp 010f. It’s like the goto command in batch.
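
To compare CALL, RET, and JMP side by side, here is a small Python model that tracks only the stack and the address of the next instruction to run (called ip here). The function names and the addresses 0103 and 010f are made up for illustration.

```python
stack = []

def call(next_instruction, target):
    """Model of CALL: save the return address, then go to the target."""
    stack.append(next_instruction)
    return target

def ret():
    """Model of RET: pop the saved address and go back to it."""
    return stack.pop()

def jmp(target):
    """Model of JMP: go to the target, leaving the stack alone."""
    return target

ip = call(0x0103, 0x010F)   # the instruction after the call sits at 0103
print(format(ip, "04X"), [format(v, "04X") for v in stack])  # 010F ['0103']

ip = ret()
print(format(ip, "04X"), stack)  # 0103 []

ip = jmp(0x010F)
print(format(ip, "04X"), stack)  # 010F []
```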

Well that’s all I have time for now. I will be making a 2nd part to this tutorial. Don’t be afraid of assembly, it is really simple but it just gets really long.

5 thoughts on “x86 Assembly – A crash course tutorial I”

  1. Thank you for this Tutorial.
    Assembly on PC is totally strange for me.
    Long time ago I wrote in Assembler on a Motorola 68000 in an Amiga 2000.
    This was totally different concerning the registers.
    There were a0 – a7 address register and d0-d7 data register each 8-bit large.
    I would like to learn coding in Assembler on PCs
