Chapter 2 Instructions: Language of the Computer
Instructions:
Language of the Machine
More primitive than higher-level languages, e.g., no sophisticated control flow
Very restrictive, e.g., MIPS arithmetic instructions
Electrical & Computer Engineering School of Engineering THE COLLEGE OF NEW JERSEY
Instructions:
We'll be working with the MIPS instruction set architecture
  Similar to other architectures developed since the 1980s
  Used by NEC, Nintendo, Silicon Graphics, Sony
Design goals: maximize performance and minimize cost, reduce design time
Instructions:
[Figure: chart by year, 1998 to 2002, vertical axis 0 to 1400, with series ARM, IA-32, MIPS, Motorola 68K, PowerPC, Hitachi SH, SPARC, Other]
Instruction Set
The repertoire of instructions of a computer
Different computers have different instruction sets
  But with many aspects in common
Early computers had very simple instruction sets
  Simplified implementation
Many modern computers also have simple instruction sets
The MIPS Instruction Set
Used as the example throughout the book
Stanford MIPS commercialized by MIPS Technologies (www.mips.com)
Large share of embedded core market
  Applications in consumer electronics, network/storage equipment, cameras, printers, …
Typical of many modern ISAs
  See MIPS Reference Data tear-out card, and Appendixes B and E
MIPS arithmetic
All instructions have 3 operands
Operand order is fixed (destination first)
Example:
  C code: A = B + C
  MIPS code: add $s0, $s1, $s2 (registers associated with variables by compiler)
"The natural number of operands for an operation like addition is three… requiring every instruction to have exactly three operands, no more and no less, conforms to the philosophy of keeping the hardware simple"
MIPS arithmetic
Design Principle: simplicity favors regularity. Why?
Of course this complicates some things...
C code:
  A = B + C + D;
  E = F - A;
MIPS code:
  add $t0, $s1, $s2
  add $s0, $t0, $s3
  sub $s4, $s5, $s0
MIPS arithmetic
Operands must be registers, only 32 registers provided
Design Principle: smaller is faster. Why?
Arithmetic Operations
Add and subtract, three operands
  Two sources and one destination
  add a, b, c # a gets b + c
All arithmetic operations have this form
Design Principle 1: Simplicity favors regularity
  Regularity makes implementation simpler
  Simplicity enables higher performance at lower cost
Registers vs. Memory
Arithmetic instruction operands must be registers; only 32 registers provided
Compiler associates variables with registers
What about programs with lots of variables?
[Figure: block diagram with Processor (Control, Datapath), Memory, Input, Output (I/O)]
Register Operands
Arithmetic instructions use register operands
MIPS has a 32 × 32-bit register file
  Use for frequently accessed data
  Numbered 0 to 31
  32-bit data called a "word"
Assembler names
  $t0, $t1, …, $t9 for temporary values
  $s0, $s1, …, $s7 for saved variables
Design Principle 2: Smaller is faster
  c.f. main memory: millions of locations
Register Operand Example
C code:
  f = (g + h) - (i + j);
  f, …, j in $s0, …, $s4
Compiled MIPS code:
  add $t0, $s1, $s2
  add $t1, $s3, $s4
  sub $s0, $t0, $t1
Registers vs. Memory
Registers are faster to access than memory
Operating on memory data requires loads and stores
  More instructions to be executed
Compiler must use registers for variables as much as possible
  Only spill to memory for less frequently used variables
  Register optimization is important!
Memory Organization
Viewed as a large, single-dimension array, with an address.
A memory address is an index into the array
"Byte addressing" means that the index points to a byte of memory.
  Address  Contents
  0        8 bits of data
  1        8 bits of data
  2        8 bits of data
  3        8 bits of data
  4        8 bits of data
  5        8 bits of data
  6        8 bits of data
  ...
Memory Organization
Bytes are nice, but most data items use larger "words"
For MIPS, a word is 32 bits or 4 bytes.
  Address  Contents
  0        32 bits of data
  4        32 bits of data
  8        32 bits of data
  12       32 bits of data
  ...
Registers hold 32 bits of data
Memory Organization
$2^{32}$ bytes with byte addresses from 0 to $2^{32}-1$
$2^{30}$ words with byte addresses 0, 4, 8, ... $2^{32}-4$
Words are aligned, i.e., what are the least 2 significant bits of a word address?
Memory Operands
Main memory used for composite data
  Arrays, structures, dynamic data
To apply arithmetic operations
  Load values from memory into registers
  Store result from register to memory
Memory is byte addressed
  Each address identifies an 8-bit byte
Words are aligned in memory
  Address must be a multiple of 4
MIPS is Big Endian
  Most-significant byte at least address of a word
  c.f. Little Endian: least-significant byte at least address
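The two byte orders can be sketched in C. The helper names below are illustrative assumptions, not part of the slides. Each one assembles a 32-bit word from four bytes stored at increasing addresses.

```c
#include <stdint.h>

/* Illustrative helpers; the names are assumptions, not from the slides. */

/* Big Endian: the byte at the lowest address is the most significant. */
uint32_t word_from_big_endian(const uint8_t b[4]) {
    return ((uint32_t)b[0] << 24) | ((uint32_t)b[1] << 16) |
           ((uint32_t)b[2] << 8)  |  (uint32_t)b[3];
}

/* Little Endian: the byte at the lowest address is the least significant. */
uint32_t word_from_little_endian(const uint8_t b[4]) {
    return ((uint32_t)b[3] << 24) | ((uint32_t)b[2] << 16) |
           ((uint32_t)b[1] << 8)  |  (uint32_t)b[0];
}
```

The same four bytes in memory therefore yield different word values depending on the convention.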
Memory Operand Example 1
C code:
  g = h + A[8];
  g in $s1, h in $s2, base address of A in $s3
Compiled MIPS code:
  Index 8 requires offset of 32
    4 bytes per word
  lw $t0, 32($s3) # load word (32 is the offset, $s3 is the base register)
  add $s1, $s2, $t0
Memory Operand Example 2
C code:
  A[12] = h + A[8];
  h in $s2, base address of A in $s3
Compiled MIPS code:
  Index 8 requires offset of 32
  lw $t0, 32($s3) # load word
  add $t0, $s2, $t0
  sw $t0, 48($s3) # store word
Immediate Operands
Constant data specified in an instruction addi $s3, $s3, 4
No subtract immediate instruction
Just use a negative constant addi $s2, $s1, -1
Design Principle 3: Make the common case fast
  Small constants are common
  Immediate operand avoids a load instruction
The Constant Zero
MIPS register 0 ($zero) is the constant 0
  Cannot be overwritten
Useful for common operations
  E.g., move between registers
  add $t2, $s1, $zero
Unsigned Binary Integers
Given an n-bit number
$$x = x_{n-1}2^{n-1} + x_{n-2}2^{n-2} + \cdots + x_1 2^1 + x_0 2^0$$
Range: 0 to $+2^n - 1$
Example
  $0000\ 0000\ 0000\ 0000\ 0000\ 0000\ 0000\ 1011_2$
  $= 0 + \cdots + 1 \times 2^3 + 0 \times 2^2 + 1 \times 2^1 + 1 \times 2^0$
  $= 0 + \cdots + 8 + 0 + 2 + 1 = 11_{10}$
Using 32 bits
  0 to +4,294,967,295
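The positional sum can be checked with a short C sketch. The function name is a hypothetical helper, not something from the slides. It reads a string of '0'/'1' characters left to right, doubling the running value for each bit, which is equivalent to the weighted sum above.

```c
#include <stdint.h>

/* Hypothetical helper: value of a string of '0'/'1' characters read as an
   unsigned binary number. Any other character (e.g. a space) is skipped.
   Assumes at most 32 binary digits. */
uint32_t unsigned_value(const char *bits) {
    uint32_t x = 0;
    for (; *bits != '\0'; ++bits) {
        if (*bits == '0' || *bits == '1') {
            x = x * 2u + (uint32_t)(*bits - '0');  /* shift in the next bit */
        }
    }
    return x;
}
```

Calling it on the slide's example pattern gives 11, and on 32 one-bits gives the top of the 32-bit range.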
2s-Complement Signed Integers
Given an n-bit number
$$x = -x_{n-1}2^{n-1} + x_{n-2}2^{n-2} + \cdots + x_1 2^1 + x_0 2^0$$
Range: $-2^{n-1}$ to $+2^{n-1} - 1$
Example
  $1111\ 1111\ 1111\ 1111\ 1111\ 1111\ 1111\ 1100_2$
  $= -1 \times 2^{31} + 1 \times 2^{30} + \cdots + 1 \times 2^2 + 0 \times 2^1 + 0 \times 2^0$
  $= -2{,}147{,}483{,}648 + 2{,}147{,}483{,}644 = -4_{10}$
Using 32 bits
  –2,147,483,648 to +2,147,483,647
2s-Complement Signed Integers
Bit 31 is sign bit
  1 for negative numbers
  0 for non-negative numbers
$-(-2^{n-1})$ can't be represented
Non-negative numbers have the same unsigned and 2s-complement representation
Some specific numbers
  0: 0000 0000 … 0000
  –1: 1111 1111 … 1111
  Most-negative: 1000 0000 … 0000
  Most-positive: 0111 1111 … 1111
Negation: Complement and add 1
  Complement means 1 → 0, 0 → 1
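A minimal C sketch of the complement-and-add-1 rule, working on the raw 32-bit pattern. The function name is an assumption made for illustration. Unsigned arithmetic is used so that the addition wraps around in a well-defined way.

```c
#include <stdint.h>

/* Hypothetical helper: negate a 32-bit 2s-complement bit pattern by
   complementing every bit and adding 1 (unsigned arithmetic wraps). */
uint32_t negate_bits(uint32_t x) {
    return ~x + 1u;
}
```

Negating the most-negative pattern 1000 0000 … 0000 returns the same pattern, which matches the point above that its positive counterpart can't be represented.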
Sign Extension
Representing a number using more bits
  Preserve the numeric value
In MIPS instruction set
  addi: extend immediate value
  lb, lh: extend loaded byte/halfword
  beq, bne: extend the displacement
Replicate the sign bit to the left
  c.f. unsigned values: extend with 0s
Examples: 8-bit to 16-bit
  +2: 0000 0010 => 0000 0000 0000 0010
  –2: 1111 1110 => 1111 1111 1111 1110
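The 8-bit to 16-bit examples can be reproduced with a small C sketch. The function names are illustrative assumptions. The first replicates bit 7 into the upper byte; the second fills the upper byte with 0s, as for unsigned values.

```c
#include <stdint.h>

/* Illustrative helpers; the names are assumptions, not from the slides. */

/* Sign extension: replicate bit 7 into bits 15..8. */
uint16_t sign_extend_8_to_16(uint8_t b) {
    return (b & 0x80u) ? (uint16_t)(0xFF00u | b) : (uint16_t)b;
}

/* Zero extension: bits 15..8 are always 0. */
uint16_t zero_extend_8_to_16(uint8_t b) {
    return (uint16_t)b;
}
```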
Instructions
Load and store instructions
Example:
  C code: A[8] = h + A[8];
  MIPS code:
    lw $t0, 32($s3)
    add $t0, $s2, $t0
    sw $t0, 32($s3)
Store word has destination last
Remember arithmetic operands are registers, not memory!
  Can't write: add 48($s3), $s2, 32($s3)
Our First Example
Can we figure out the code?
swap(int v[], int k)
{ int temp;
  temp = v[k];
  v[k] = v[k+1];
  v[k+1] = temp;
}
swap:
  muli $2, $5, 4
  add $2, $4, $2
  lw $15, 0($2)
  lw $16, 4($2)
  sw $16, 0($2)
  sw $15, 4($2)
  jr $31
So far we've learned:
MIPS
  - loading words but addressing bytes
  - arithmetic on registers only
Instruction           Meaning
add $s1, $s2, $s3     $s1 = $s2 + $s3
sub $s1, $s2, $s3     $s1 = $s2 – $s3
lw $s1, 100($s2)      $s1 = Memory[$s2+100]
sw $s1, 100($s2)      Memory[$s2+100] = $s1
Machine Language
Instructions, like registers and words of data, are also 32 bits long
Example: add $t0, $s1, $s2
  registers have numbers, $t0=8, $s1=17, $s2=18
Instruction Format:
  op      rs     rt     rd     shamt  funct
  000000  10001  10010  01000  00000  100000
Can you guess what the field names stand for?
Machine Language
Consider the load-word and store-word instructions,
What would the regularity principle have us do?
New principle: Good design demands a compromise
Machine Language
Introduce a new type of instruction format
  I-type for data transfer instructions
  other format was R-type for register
Example: lw $t0, 32($s2)
  op  rs  rt  16 bit number
  35  18  8   32
Where's the compromise?
Representing Instructions
Instructions are encoded in binary
  Called machine code
MIPS instructions
  Encoded as 32-bit instruction words
  Small number of formats encoding operation code (opcode), register numbers, …
  Regularity!
Register numbers
  $t0 – $t7 are reg's 8 – 15
  $t8 – $t9 are reg's 24 – 25
  $s0 – $s7 are reg's 16 – 23
MIPS R-format Instructions
  op      rs      rt      rd      shamt   funct
  6 bits  5 bits  5 bits  5 bits  5 bits  6 bits
Instruction fields
  op: operation code (opcode)
  rs: first source register number
  rt: second source register number
  rd: destination register number
  shamt: shift amount (00000 for now)
  funct: function code (extends opcode)
R-format Example
  op      rs      rt      rd      shamt   funct
  6 bits  5 bits  5 bits  5 bits  5 bits  6 bits
add $t0, $s1, $s2
  special  $s1    $s2    $t0    0      add
  0        17     18     8      0      32
  000000   10001  10010  01000  00000  100000
$00000010001100100100000000100000_2 = 02324020_{16}$
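The field packing above can be sketched in C. The function name `encode_r` is an assumption for illustration. It masks each field to its width and shifts it into the position shown in the R-format layout.

```c
#include <stdint.h>

/* Hypothetical encoder for the R-format layout:
   op(6) | rs(5) | rt(5) | rd(5) | shamt(5) | funct(6). */
uint32_t encode_r(uint32_t op, uint32_t rs, uint32_t rt,
                  uint32_t rd, uint32_t shamt, uint32_t funct) {
    return ((op    & 0x3Fu) << 26) |
           ((rs    & 0x1Fu) << 21) |
           ((rt    & 0x1Fu) << 16) |
           ((rd    & 0x1Fu) << 11) |
           ((shamt & 0x1Fu) << 6)  |
            (funct & 0x3Fu);
}
```

With the slide's field values for add $t0, $s1, $s2, that is `encode_r(0, 17, 18, 8, 0, 32)`, the result is the word 0x02324020.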
MIPS I-format Instructions
  op      rs      rt      constant or address
  6 bits  5 bits  5 bits  16 bits
Immediate arithmetic and load/store instructions
  rt: destination or source register number
  Constant: $-2^{15}$ to $+2^{15} - 1$
  Address: offset added to base address in rs
Design Principle 4: Good design demands good compromises
  Different formats complicate decoding, but allow 32-bit instructions uniformly
  Keep formats as similar as possible
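A companion sketch for the I-format, again with an assumed function name. The 16-bit constant is stored as its 2s-complement bit pattern. The opcode 35 for lw appears on an earlier slide. The opcode 8 for addi is taken from the standard MIPS opcode table rather than from these slides.

```c
#include <stdint.h>

/* Hypothetical encoder for the I-format layout:
   op(6) | rs(5) | rt(5) | constant or address(16). */
uint32_t encode_i(uint32_t op, uint32_t rs, uint32_t rt, int16_t imm) {
    return ((op & 0x3Fu) << 26) |
           ((rs & 0x1Fu) << 21) |
           ((rt & 0x1Fu) << 16) |
            (uint32_t)(uint16_t)imm;   /* keep only the low 16 bits */
}
```

For example, lw $t0, 32($s3) uses op 35, rs 19, rt 8 and constant 32, and addi $s2, $s1, -1 uses op 8, rs 17, rt 18 and constant -1.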
Logical Operations
Instructions for bitwise manipulation
  Operation    C    Java  MIPS
  Shift left   <<   <<    sll
  Shift right  >>   >>>   srl
  Bitwise AND  &    &     and, andi
  Bitwise OR   |    |     or, ori
  Bitwise NOT  ~    ~     nor
Useful for extracting and inserting groups of bits in a word
Shift Operations
  op      rs      rt      rd      shamt   funct
  6 bits  5 bits  5 bits  5 bits  5 bits  6 bits
shamt: how many positions to shift
Shift left logical
  Shift left and fill with 0 bits
  sll by i bits multiplies by $2^i$
Shift right logical
  Shift right and fill with 0 bits
  srl by i bits divides by $2^i$ (unsigned only)
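The multiply/divide relationship can be sketched with C's shift operators on unsigned values. The function names mirror the MIPS mnemonics but are illustrative assumptions. The shift amount is masked to 5 bits, matching the width of the shamt field.

```c
#include <stdint.h>

/* Illustrative models of sll and srl on 32-bit unsigned values. */
uint32_t sll(uint32_t x, unsigned shamt) {
    return x << (shamt & 31u);   /* fill vacated low bits with 0 */
}

uint32_t srl(uint32_t x, unsigned shamt) {
    return x >> (shamt & 31u);   /* fill vacated high bits with 0 */
}
```

Shifting the pattern for -8 right by one position with srl gives a large positive number rather than -4, which is why the division rule holds for unsigned values only.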
AND Operations
Useful to mask bits in a word
Select some bits, clear others to 0
and $t0, $t1, $t2
  $t2  0000 0000 0000 0000 0000 1101 1100 0000
  $t1  0000 0000 0000 0000 0011 1100 0000 0000
  $t0  0000 0000 0000 0000 0000 1100 0000 0000
OR Operations
Useful to include bits in a word
Set some bits to 1, leave others unchanged
or $t0, $t1, $t2
  $t2  0000 0000 0000 0000 0000 1101 1100 0000
  $t1  0000 0000 0000 0000 0011 1100 0000 0000
  $t0  0000 0000 0000 0000 0011 1101 1100 0000
NOT Operations
Useful to invert bits in a word
Change 0 to 1, and 1 to 0
MIPS has NOR 3-operand instruction
  a NOR b == NOT ( a OR b )
nor $t0, $t1, $zero
  Register 0: always read as zero
  $t1  0000 0000 0000 0000 0011 1100 0000 0000
  $t0  1111 1111 1111 1111 1100 0011 1111 1111
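The AND, OR and NOR examples can be reproduced with C's bitwise operators. The function names are illustrative assumptions. In hexadecimal, the slides' operands are $t1 = 0x00003C00 and $t2 = 0x00000DC0.

```c
#include <stdint.h>

/* Illustrative models of the MIPS and, or and nor instructions. */
uint32_t mips_and(uint32_t a, uint32_t b) { return a & b; }
uint32_t mips_or(uint32_t a, uint32_t b)  { return a | b; }
uint32_t mips_nor(uint32_t a, uint32_t b) { return ~(a | b); }
```

NOR with a zero second operand acts as NOT, which is how nor $t0, $t1, $zero inverts $t1.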
Conditional Operations
Branch to a labeled instruction if a condition is true
  Otherwise, continue sequentially
beq rs, rt, L1
  if (rs == rt) branch to instruction labeled L1;
bne rs, rt, L1
  if (rs != rt) branch to instruction labeled L1;
j L1
  unconditional jump to instruction labeled L1
Compiling If Statements
C code:
  if (i==j) f = g+h;
  else f = g-h;
  f, g, … in $s0, $s1, …
Compiled MIPS code:
        bne $s3, $s4, Else
        add $s0, $s1, $s2
        j Exit
  Else: sub $s0, $s1, $s2
  Exit: …
Assembler calculates addresses
Compiling Loop Statements
C code:
  while (save[i] == k) i += 1;
  i in $s3, k in $s5, address of save in $s6
Compiled MIPS code:
  Loop: sll  $t1, $s3, 2
        add  $t1, $t1, $s6
        lw   $t0, 0($t1)
        bne  $t0, $s5, Exit
        addi $s3, $s3, 1
        j    Loop
  Exit: …
Basic Blocks
A basic block is a sequence of instructions with
  No embedded branches (except at end)
  No branch targets (except at beginning)
A compiler identifies basic blocks for optimization
An advanced processor can accelerate execution of basic blocks
More Conditional Operations
Set result to 1 if a condition is true
  Otherwise, set to 0
slt rd, rs, rt
  if (rs < rt) rd = 1; else rd = 0;
slti rt, rs, constant
  if (rs < constant) rt = 1; else rt = 0;
Use in combination with beq, bne
  slt $t0, $s1, $s2 # if ($s1 < $s2)
  bne $t0, $zero, L # branch to L
Branch Instruction Design
Why not blt, bge, etc?
Hardware for … +1 … $t0 = 0
Procedure Calling
Steps required
  1. Place parameters in registers
  2. Transfer control to procedure
  3. Acquire storage for procedure
  4. Perform procedure's operations
  5. Place result in register for caller
  6. Return to place of call
Stored Program Computers
The BIG Picture
Instructions represented in binary, just like data
Instructions and data stored in memory
Programs can operate on programs
  e.g., compilers, linkers, …
Binary compatibility allows compiled programs to work on different computers
  Standardized ISAs
Stored Program Concept
Fetch & Execute Cycle
  Instructions are fetched and put into a special register
  Bits in the register "control" the subsequent actions
  Fetch the "next" instruction and continue
Control
Decision making instructions
  alter the control flow,
  i.e., change the "next" instruction to be executed
Control
MIPS conditional branch instructions:
  bne $t0, $t1, Label
  beq $t0, $t1, Label
Example: if (i==j) h = i + j;
         bne $s0, $s1, Label
         add $s3, $s0, $s1
  Label: ....
Control
MIPS unconditional branch instructions:
  j label
Example:
  if (i!=j) h=i+j;
  else h=i-j;
        beq $s4, $s5, Lab1
        add $s3, $s4, $s5
        j Lab2
  Lab1: sub $s3, $s4, $s5
  Lab2: ...
Can you build a simple for loop?
So far:
Instruction            Meaning
add $s1,$s2,$s3        $s1 = $s2 + $s3
sub $s1,$s2,$s3        $s1 = $s2 – $s3
lw $s1,100($s2)        $s1 = Memory[$s2+100]
sw $s1,100($s2)        Memory[$s2+100] = $s1
bne $s4,$s5,Label      Next instr. is at Label if $s4 ≠ $s5
beq $s4,$s5,Label      Next instr. is at Label if $s4 = $s5
j Label                Next instr. is at Label
Formats:
  R  op  rs  rt  rd  shamt  funct
  I  op  rs  rt  16 bit number
  J  op  26 bit address
Control Flow
We have: beq, bne, what about Branch-if-less-than?
New instruction:
  slt $t0, $s1, $s2
  if $s1 < $s2 then $t0 = 1 else $t0 = 0
Can use this instruction to build "blt $s1, $s2, Label"
  can now build general control structures
Note that the assembler needs a register to do this,
  there are policy of use conventions for registers
Policy of Use Conventions
Name      Register number  Usage                                         Preserved on call?
$zero     0                the constant value 0                          n.a.
$v0-$v1   2-3              values for results and expression evaluation  no
$a0-$a3   4-7              arguments                                     yes
$t0-$t7   8-15             temporaries                                   no
$s0-$s7   16-23            saved                                         yes
$t8-$t9   24-25            more temporaries                              no
$gp       28               global pointer                                yes
$sp       29               stack pointer                                 yes
$fp       30               frame pointer                                 yes
$ra       31               return address                                yes
Register 1 ($at) reserved for assembler, 26-27 for operating system
Memory Layout
Text: program code
Static data: global variables
  e.g., static variables in C, constant arrays and strings
  $gp initialized to address allowing ±offsets into this segment
Dynamic data: heap
  E.g., malloc in C, new in Java
Stack: automatic storage
Local Data on the Stack
Local data allocated by callee
e.g., C automatic variables
Procedure frame (activation record)
  Used by some compilers to manage stack storage
Procedure Call Instructions
Procedure call: jump and link
  jal ProcedureLabel
  Address of following instruction put in $ra
  Jumps to target address
Procedure return: jump register
  jr $ra
  Copies $ra to program counter
  Can also be used for computed jumps
    e.g., for case/switch statements
Leaf Procedure Example
C code:
  int leaf_example (int g, int h, int i, int j)
  { int f;
    f = (g + h) - (i + j);
    return f;
  }
Arguments g, …, j in $a0, …, $a3
f in $s0 (hence, need to save $s0 on stack)
Result in $v0
Leaf Procedure Example
MIPS code:
  leaf_example:
    addi $sp, $sp, -4      # Save $s0 on stack
    sw   $s0, 0($sp)
    add  $t0, $a0, $a1     # Procedure body
    add  $t1, $a2, $a3
    sub  $s0, $t0, $t1
    add  $v0, $s0, $zero   # Result
    lw   $s0, 0($sp)       # Restore $s0
    addi $sp, $sp, 4
    jr   $ra               # Return
Non-Leaf Procedures
Procedures that call other procedures
For nested call, caller needs to save on the stack:
  Its return address
  Any arguments and temporaries needed after the call
Restore from the stack after the call
Non-Leaf Procedure Example
C code:
  int fact (int n)
  {
    if (n < 1) return 1;
    else return n * fact(n - 1);
  }
Argument n in $a0
Result in $v0
Non-Leaf Procedure Example
MIPS code:
  fact:
      addi $sp, $sp, -8     # adjust stack for 2 items
      sw   $ra, 4($sp)      # save return address
      sw   $a0, 0($sp)      # save argument
      slti $t0, $a0, 1      # test for n < 1
      beq  $t0, $zero, L1
      addi $v0, $zero, 1    # if so, result is 1
      addi $sp, $sp, 8      #   pop 2 items from stack
      jr   $ra              #   and return
  L1: addi $a0, $a0, -1     # else decrement n
      jal  fact             # recursive call
      lw   $a0, 0($sp)      # restore original n
      lw   $ra, 4($sp)      #   and return address
      addi $sp, $sp, 8      # pop 2 items from stack
      mul  $v0, $a0, $v0    # multiply to get result
      jr   $ra              # and return
Byte-encoded character sets
  ASCII: 128 characters
    95 graphic, 33 control
  Latin-1: 256 characters
    ASCII, +96 more graphic characters
Unicode: 32-bit character set
  Used in Java, C++ wide characters, …
  Most of the world's alphabets, plus symbols
  UTF-8, UTF-16: variable-length encodings
Byte/Halfword Operations
Could use bitwise operations
MIPS byte/halfword load/store
  String processing is a common case
lb rt, offset(rs)     lh rt, offset(rs)
  Sign extend to 32 bits in rt
lbu rt, offset(rs)    lhu rt, offset(rs)
  Zero extend to 32 bits in rt
sb rt, offset(rs)     sh rt, offset(rs)
  Store just rightmost byte/halfword
String Copy Example
C code:
  Null-terminated string
  void strcpy (char x[], char y[])
  { int i;
    i = 0;
    while ((x[i]=y[i])!='\0')
      i += 1;
  }
Addresses of x, y in $a0, $a1
i in $s0
String Copy Example
MIPS code:
  strcpy:
      addi $sp, $sp, -4       # adjust stack for 1 item
      sw   $s0, 0($sp)        # save $s0
      add  $s0, $zero, $zero  # i = 0
  L1: add  $t1, $s0, $a1      # addr of y[i] in $t1
      lbu  $t2, 0($t1)        # $t2 = y[i]
      add  $t3, $s0, $a0      # addr of x[i] in $t3
      sb   $t2, 0($t3)        # x[i] = y[i]
      beq  $t2, $zero, L2     # exit loop if y[i] == 0
      addi $s0, $s0, 1        # i = i + 1
      j    L1                 # next iteration of loop
  L2: lw   $s0, 0($sp)        # restore saved $s0
      addi $sp, $sp, 4        # pop 1 item from stack
      jr   $ra                # and return
Constants
Small constants are used quite frequently (50% of operands)
  e.g., A = A + 5;
        B = B + 1;
        C = C - 18;
Solutions? Why not?
  put 'typical constants' in memory and load them.
  create hard-wired registers (like $zero) for constants like one.
Constants
MIPS Instructions:
  addi $29, $29, 4
  slti $8, $18, 10
  andi $29, $29, 6
  ori $29, $29, 4
How do we make this work?
Design Principle: Make the common case fast. Which format?
How about larger constants?
We'd like to be able to load a 32 bit constant into a register
Must use two instructions, new "load upper immediate" instruction
  lui $t0, 1010101010101010
    $t0 = 1010101010101010 0000000000000000 (lower half filled with zeros)
Then must get the lower order bits right, i.e.,
  ori $t0, $t0, 1010101010101010
         1010101010101010 0000000000000000
    ori  0000000000000000 1010101010101010
      =  1010101010101010 1010101010101010
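The two-instruction sequence can be modelled in C. The function names follow the MIPS mnemonics, and `load_constant` is a hypothetical wrapper added for illustration. It splits a 32-bit value into its upper and lower halves and recombines them the way lui followed by ori would.

```c
#include <stdint.h>

/* Illustrative models of lui and ori. */
uint32_t lui(uint16_t imm) {
    return (uint32_t)imm << 16;        /* lower 16 bits filled with zeros */
}

uint32_t ori(uint32_t reg, uint16_t imm) {
    return reg | (uint32_t)imm;        /* immediate is zero-extended */
}

/* Hypothetical wrapper: build any 32-bit constant with lui then ori. */
uint32_t load_constant(uint32_t value) {
    uint32_t t0 = lui((uint16_t)(value >> 16));
    return ori(t0, (uint16_t)(value & 0xFFFFu));
}
```

The slide's bit pattern 1010101010101010 is 0xAAAA, so the sequence produces 0xAAAAAAAA.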
Branch Addressing
Branch instructions specify
  Opcode, two registers, target address
Most branch targets are near branch
  Forward or backward
  op      rs      rt      constant or address
  6 bits  5 bits  5 bits  16 bits
PC-relative addressing
  Target address = PC + offset × 4
  PC already incremented by 4 by this time
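A minimal C sketch of the PC-relative calculation. The function name and the choice to pass the address of the branch itself are assumptions for illustration. The function first adds 4 to model the already-incremented PC, then adds the signed word offset times 4.

```c
#include <stdint.h>

/* Hypothetical helper: target of a branch located at branch_pc whose
   16-bit offset field holds `offset` (a signed count of words). */
uint32_t branch_target(uint32_t branch_pc, int16_t offset) {
    uint32_t next_pc = branch_pc + 4u;                 /* PC already incremented */
    return next_pc + (uint32_t)((int32_t)offset * 4);  /* offset x 4 bytes */
}
```

A negative offset gives a backward branch, for example the bottom of a loop jumping back to its top.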
Jump Addressing
Jump (j and jal) targets could be anywhere in text segment
  Encode full address in instruction
  op      address
  6 bits  26 bits
(Pseudo)Direct jump addressing
  Target address = $PC_{31 \ldots 28}$ : (address × 4)
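The concatenation can be sketched in C. The function name is an assumption. The sketch also assumes the upper four bits come from the already-incremented PC, as on the branch addressing slide. The slide itself only writes PC.

```c
#include <stdint.h>

/* Hypothetical helper: target of a j/jal located at jump_pc whose
   26-bit address field holds address26.
   Assumption: the top 4 bits are taken from the incremented PC. */
uint32_t jump_target(uint32_t jump_pc, uint32_t address26) {
    uint32_t next_pc = jump_pc + 4u;
    return (next_pc & 0xF0000000u) |            /* PC bits 31..28 */
           ((address26 & 0x03FFFFFFu) << 2);    /* address x 4    */
}
```

Because only 26 bits are encoded, a jump can only reach targets that share the same upper four address bits.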
Branching Far Away
If branch target is too far to encode with 16-bit offset, assembler rewrites the code
Example
      beq $s0, $s1, L1
        ↓
      bne $s0, $s1, L2
      j L1
  L2: …
Addressing Mode Summary
Synchronization
Two processors sharing an area of memory
  P1 writes, then P2 reads
  Data race if P1 and P2 don't synchronize
    Result depends on order of accesses
Hardware support required
  Atomic read/write memory operation
  No other access to the location allowed between the read and write
Could be a single instruction
  E.g., atomic swap of register ↔ memory
  Or an atomic pair of instructions
Synchronization in MIPS
Load linked: ll rt, offset(rs)
Store conditional: sc rt, offset(rs)
  Succeeds if location not changed since the ll
    Returns 1 in rt
  Fails if location is changed
    Returns 0 in rt
Example: atomic swap (to test/set lock variable)
  try: add $t0, $zero, $s4   # copy exchange value
       ll  $t1, 0($s1)       # load linked
       sc  $t0, 0($s1)       # store conditional
       beq $t0, $zero, try   # branch store fails
       add $s4, $zero, $t1   # put load value in $s4
Translation and Startup
Many compilers produce object modules directly
Static linking
Assembler Pseudoinstructions
Most assembler instructions represent machine instructions one-to-one
Pseudoinstructions: figments of the assembler's imagination
  move $t0, $t1    → add $t0, $zero, $t1
  blt $t0, $t1, L  → slt $at, $t0, $t1
                     bne $at, $zero, L
  $at (register 1): assembler temporary
Producing an Object Module
Assembler (or compiler) translates program into machine instructions
Provides information for building a complete program from the pieces
  Header: describes contents of object module
  Text segment: translated instructions
  Static data segment: data allocated for the life of the program
  Relocation info: for contents that depend on absolute location of loaded program
  Symbol table: global definitions and external refs
  Debug info: for associating with source code
Linking Object Modules
Produces an executable image
  1. Merges segments
  2. Resolves labels (determines their addresses)
  3. Patches location-dependent and external refs
Could leave location dependencies for fixing by a relocating loader
  But with virtual memory, no need to do this
  Program can be loaded into absolute location in virtual memory space
Loading a Program
Load from image file on disk into memory
  1. Read header to determine segment sizes
  2. Create virtual address space
  3. Copy text and initialized data into memory
     Or set page table entries so they can be faulted in
  4. Set up arguments on stack
  5. Initialize registers (including $sp, $fp, $gp)
  6. Jump to startup routine
     Copies arguments to $a0, … and calls main
     When main returns, do exit syscall
Dynamic Linking
Only link/load library procedure when it is called
  Requires procedure code to be relocatable
  Avoids image bloat caused by static linking of all (transitively) referenced libraries
  Automatically picks up new library versions
Lazy Linkage
Indirection table
Stub: loads routine ID, jumps to linker/loader
Linker/loader code
Dynamically mapped code
Starting Java Applications
Simple portable instruction set for the JVM
Interprets bytecodes
Compiles bytecodes of "hot" methods into native code for host machine
C Sort Example
Illustrates use of assembly instructions for a C bubble sort function
Swap procedure (leaf)
  void swap(int v[], int k)
  {
    int temp;
    temp = v[k];
    v[k] = v[k+1];
    v[k+1] = temp;
  }
v in $a0, k in $a1, temp in $t0
The Procedure Swap
  swap: sll $t1, $a1, 2    # $t1 = k * 4
        add $t1, $a0, $t1  # $t1 = v+(k*4)
                           #   (address of v[k])
        lw  $t0, 0($t1)    # $t0 (temp) = v[k]
        lw  $t2, 4($t1)    # $t2 = v[k+1]
        sw  $t2, 0($t1)    # v[k] = $t2 (v[k+1])
        sw  $t0, 4($t1)    # v[k+1] = $t0 (temp)
        jr  $ra            # return to calling routine
The Sort Procedure in C
Non-leaf (calls swap)
  void sort (int v[], int n)
  {
    int i, j;
    for (i = 0; i < n; i += 1) {
      for (j = i - 1; j >= 0 && v[j] > v[j + 1]; j -= 1) {
        swap(v,j);
      }
    }
  }
v in $a0, n in $a1, i in $s0, j in $s1
The Procedure Body

    # Move params
             move $s2, $a0            # save $a0 into $s2
             move $s3, $a1            # save $a1 into $s3
    # Outer loop
             move $s0, $zero          # i = 0
    for1tst: slt  $t0, $s0, $s3       # $t0 = 0 if $s0 ≥ $s3 (i ≥ n)
             beq  $t0, $zero, exit1   # go to exit1 if $s0 ≥ $s3 (i ≥ n)
    # Inner loop
             addi $s1, $s0, -1        # j = i - 1
    for2tst: slti $t0, $s1, 0         # $t0 = 1 if $s1 < 0 (j < 0)
             bne  $t0, $zero, exit2   # go to exit2 if $s1 < 0 (j < 0)
             sll  $t1, $s1, 2         # $t1 = j * 4
             add  $t2, $s2, $t1       # $t2 = v + (j * 4)
             lw   $t3, 0($t2)         # $t3 = v[j]
             lw   $t4, 4($t2)         # $t4 = v[j + 1]
             slt  $t0, $t4, $t3       # $t0 = 0 if $t4 ≥ $t3
             beq  $t0, $zero, exit2   # go to exit2 if $t4 ≥ $t3
    # Pass params & call
             move $a0, $s2            # 1st param of swap is v (old $a0)
             move $a1, $s1            # 2nd param of swap is j
             jal  swap                # call swap procedure
    # Inner loop
             addi $s1, $s1, -1        # j -= 1
             j    for2tst             # jump to test of inner loop
    # Outer loop
    exit2:   addi $s0, $s0, 1         # i += 1
             j    for1tst             # jump to test of outer loop
The Full Procedure

    sort:    addi $sp, $sp, -20       # make room on stack for 5 registers
             sw   $ra, 16($sp)        # save $ra on stack
             sw   $s3, 12($sp)        # save $s3 on stack
             sw   $s2, 8($sp)         # save $s2 on stack
             sw   $s1, 4($sp)         # save $s1 on stack
             sw   $s0, 0($sp)         # save $s0 on stack
             …                        # procedure body
             …
    exit1:   lw   $s0, 0($sp)         # restore $s0 from stack
             lw   $s1, 4($sp)         # restore $s1 from stack
             lw   $s2, 8($sp)         # restore $s2 from stack
             lw   $s3, 12($sp)        # restore $s3 from stack
             lw   $ra, 16($sp)        # restore $ra from stack
             addi $sp, $sp, 20        # restore stack pointer
             jr   $ra                 # return to calling routine
Effect of Compiler Optimization
Compiled with gcc for Pentium 4 under Linux
[Figure: four bar charts (Relative Performance, Instruction count, Clock Cycles, CPI), each plotted for optimization levels none, O1, O2, O3]
Effect of Language and Algorithm
[Figure: three bar charts (Bubblesort Relative Performance, Quicksort Relative Performance, Quicksort vs. Bubblesort Speedup), each plotted for C/none, C/O1, C/O2, C/O3, Java/int, Java/JIT]
Lessons Learnt
Instruction count and CPI are not good performance indicators in isolation
Compiler optimizations are sensitive to the algorithm
Java/JIT compiled code is significantly faster than JVM interpreted
  Comparable to optimized C in some cases
Nothing can fix a dumb algorithm!
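Why neither instruction count nor CPI is meaningful alone follows from the standard performance equation, execution time = instruction count × CPI ÷ clock rate. The sketch below uses made-up numbers (not the measurements from the charts) and hypothetical function names to show a build that executes fewer instructions yet takes longer.

```c
/* Total clock cycles = instruction count x average cycles per instruction. */
double clock_cycles(double instruction_count, double cpi) {
    return instruction_count * cpi;
}

/* Execution time in seconds = clock cycles / clock rate in Hz. */
double cpu_time(double instruction_count, double cpi, double clock_hz) {
    return clock_cycles(instruction_count, cpi) / clock_hz;
}
```

With an assumed 1 GHz clock, a hypothetical build A running 100000 instructions at CPI 1.5 needs 150000 cycles, while build B running only 60000 instructions at CPI 3.0 needs 180000 cycles, so B is slower despite its lower instruction count.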
Arrays vs. Pointers
Array indexing involves
  Multiplying index by element size
  Adding to array base address
Pointers correspond directly to memory addresses
  Can avoid indexing complexity
Comparison of Array vs. Ptr
Multiply “strength reduced” to shift
Array version requires shift to be inside loop
  Part of index calculation for incremented i
  c.f. incrementing pointer
Compiler can achieve same effect as manual use of pointers
  Induction variable elimination
  Better to make program clearer and safer
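As a rough illustration of the comparison above, here are two ways to clear an array in C. The function names are illustrative only. In the array version the address of each element is recomputed as base + i × 4 (a shift by 2 once strength reduced); in the pointer version the address itself is the loop variable.

```c
#include <stddef.h>

/* Array version: each iteration forms the address array + i * sizeof(int). */
void clear_array(int array[], size_t size) {
    for (size_t i = 0; i < size; i += 1)
        array[i] = 0;
}

/* Pointer version: the pointer advances by one element per iteration,
   so no multiply or shift is needed inside the loop. */
void clear_pointer(int *array, size_t size) {
    for (int *p = array; p < array + size; p += 1)
        *p = 0;
}
```

An optimizing compiler can typically turn the first form into the second through induction variable elimination, which is why the clearer array form is usually the better one to write.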
ARM & MIPS Similarities
ARM: the most popular embedded core
Similar basic set of instructions to MIPS

                         ARM              MIPS
Date announced           1985             1985
Instruction size         32 bits          32 bits
Address space            32-bit flat      32-bit flat
Data alignment           Aligned          Aligned
Data addressing modes    9                3
Registers                15 × 32-bit      31 × 32-bit
Input/output             Memory mapped    Memory mapped
Compare and Branch in ARM
Uses condition codes for result of an arithmetic/logical instruction
  Negative, zero, carry, overflow
  Compare instructions to set condition codes without keeping the result
Each instruction can be conditional
  Top 4 bits of instruction word: condition value
  Can avoid branches over single instructions
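The condition-code idea can be modeled in C. This is a simplified sketch, not ARM's definition: `flags_t`, `compare`, and `add_if` are hypothetical names, only the EQ and signed LT conditions are shown, and the carry flag follows the ARM convention that a subtraction sets carry when no borrow occurs.

```c
#include <stdbool.h>
#include <stdint.h>

typedef struct { bool n, z, c, v; } flags_t;

/* Model of a compare: compute a - b, keep only the flags, discard the result. */
flags_t compare(uint32_t a, uint32_t b) {
    uint32_t r = a - b;
    flags_t f;
    f.n = (r >> 31) & 1u;                    /* negative: sign bit of result */
    f.z = (r == 0);                          /* zero */
    f.c = (a >= b);                          /* carry: set when no borrow */
    f.v = (((a ^ b) & (a ^ r)) >> 31) & 1u;  /* signed overflow */
    return f;
}

bool cond_eq(flags_t f) { return f.z; }         /* equal */
bool cond_lt(flags_t f) { return f.n != f.v; }  /* signed less than */

/* Model of a conditionally executed add: the destination register
   changes only when the condition holds, so no branch is needed. */
uint32_t add_if(bool cond, uint32_t dest, uint32_t x, uint32_t y) {
    return cond ? x + y : dest;
}
```

A sequence such as a compare followed by a conditional add then corresponds to `add_if(cond_eq(compare(a, b)), dest, x, y)`, replacing a branch over a single instruction.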
Instruction Encoding
Alternative Architectures
Design alternative:
  provide more powerful operations
  goal is to reduce number of instructions executed
  danger is a slower cycle time and/or a higher CPI
“The path toward operation complexity is thus fraught with peril. To avoid these problems, designers have moved toward simpler instructions”
Alternative Architectures
Sometimes referred to as “RISC vs. CISC”
  virtually all new instruction sets since 1982 have been RISC
  VAX: minimize code size, make assembly language easy
    instructions from 1 to 54 bytes long!
We’ll look at PowerPC and Intel Architecture (IA)
The Intel x86 ISA
Evolution with backward compatibility
  8080 (1974): 8-bit microprocessor
    Accumulator, plus 3 index-register pairs
  8086 (1978): 16-bit extension to 8080
    Complex instruction set (CISC)
  8087 (1980): floating-point coprocessor
    Adds FP instructions and register stack
  80286 (1982): 24-bit addresses, MMU
    Segmented memory mapping and protection
  80386 (1985): 32-bit extension (now IA-32)
    Additional addressing modes and operations
    Paged memory mapping as well as segments
The Intel x86 ISA
Further evolution…
  i486 (1989): pipelined, on-chip caches and FPU
    Compatible competitors: AMD, Cyrix, …
  Pentium (1993): superscalar, 64-bit datapath
    Later versions added MMX (Multi-Media eXtension) instructions
    The infamous FDIV bug
  Pentium Pro (1995), Pentium II (1997)
    New microarchitecture (see Colwell, The Pentium Chronicles)
  Pentium III (1999)
    Added SSE (Streaming SIMD Extensions) and associated registers
  Pentium 4 (2001)
    New microarchitecture
    Added SSE2 instructions
The Intel x86 ISA
And further…
  AMD64 (2003): extended architecture to 64 bits
  EM64T – Extended Memory 64 Technology (2004)
    AMD64 adopted by Intel (with refinements)
    Added SSE3 instructions
  Intel Core (2006)
    Added SSE4 instructions, virtual machine support
  AMD64 (announced 2007): SSE5 instructions
    Intel declined to follow, instead…
  Advanced Vector Extension (announced 2008)
    Longer SSE registers, more instructions
If Intel didn’t extend with compatibility, its competitors would!
  Technical elegance ≠ market success
Basic x86 Registers
IA-32 Register Restrictions
Registers are not “general purpose” – note the restrictions below
Basic x86 Addressing Modes
Two operands per instruction

Source/dest operand    Second source operand
Register               Register
Register               Immediate
Register               Memory
Memory                 Register
Memory                 Immediate

Memory addressing modes
  Address in register
  Address = Rbase + displacement
  Address = Rbase + 2^scale × Rindex (scale = 0, 1, 2, or 3)
  Address = Rbase + 2^scale × Rindex + displacement
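The most general of these modes can be written as one small C function. This is a sketch of the arithmetic only, assuming 32-bit wraparound; `effective_address` and its parameter names are illustrative and not part of any Intel interface.

```c
#include <assert.h>
#include <stdint.h>

/* Address = Rbase + 2^scale x Rindex + displacement, in 32-bit arithmetic.
   Passing index = 0 and scale = 0 gives the simpler base + displacement mode. */
uint32_t effective_address(uint32_t base, uint32_t index,
                           unsigned scale, int32_t displacement) {
    assert(scale <= 3);  /* scale factors of 1, 2, 4, or 8 */
    return base + (index << scale) + (uint32_t)displacement;
}
```

For example, with a base of 0x1000, index 3, scale 2 (elements of 4 bytes), and displacement 8, the address is 0x1000 + 12 + 8 = 0x1014, which is how one instruction can reach field 8 of the fourth 4-byte element of an array.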
x86 Instruction Encoding
Variable length encoding
  Postfix bytes specify addressing mode
  Prefix bytes modify operation
    Operand length, repetition, locking, …
Implementing IA-32
Complex instruction set makes implementation difficult
  Hardware translates instructions to simpler microoperations
    Simple instructions: 1–1
    Complex instructions: 1–many
  Microengine similar to RISC
  Market share makes this economically viable
Comparable performance to RISC
  Compilers avoid complex instructions
Intel Architecture
“This history illustrates the impact of the “golden handcuffs” of compatibility”
“adding new features as someone might add clothing to a packed bag”
“an architecture that is difficult to explain and impossible to love”
A dominant architecture: 80x86
Saving grace:
  the most frequently used instructions are not too difficult to build
  compilers avoid the portions of the architecture that are slow
“what the 80x86 lacks in style is made up in quantity, making it beautiful from the right perspective”
PowerPC
Indexed addressing
  example: lw $t1,$a0+$s3      #$t1=Memory[$a0+$s3]
  What do we have to do in MIPS?
Update addressing
  update a register as part of load (for marching through arrays)
  example: lwu $t0,4($s3)      #$t0=Memory[$s3+4];$s3=$s3+4
  What do we have to do in MIPS?
Others:
  load multiple/store multiple
  a special counter register “bc Loop”
    decrement counter, if not 0 goto loop
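The semantics in the two example comments can be modeled in C. This is a sketch with hypothetical names: `memory` is a word array indexed by byte address ÷ 4, and addresses are assumed to be word aligned. In MIPS each of these operations takes two instructions (an add plus a lw for the indexed form, a lw plus an addi for the update form), which is the answer the slide's question is after.

```c
#include <stdint.h>

/* Indexed addressing: value = Memory[base + index], both byte addresses. */
uint32_t load_indexed(const uint32_t *memory, uint32_t base, uint32_t index) {
    return memory[(base + index) / 4];
}

/* Update addressing: value = Memory[base + offset], then base = base + offset.
   The base register is passed by pointer so the update is visible to the caller. */
uint32_t load_update(const uint32_t *memory, uint32_t *base, uint32_t offset) {
    *base += offset;
    return memory[*base / 4];
}
```

Calling `load_update` repeatedly with an offset of 4 marches through consecutive words of an array while keeping the base register in step.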
Fallacies
Powerful instruction ⇒ higher performance
  Fewer instructions required
  But complex instructions are hard to implement
    May slow down all instructions, including simple ones
  Compilers are good at making fast code from simple instructions
Use assembly code for high performance
  But modern compilers are better at dealing with modern processors
  More lines of code ⇒ more errors and less productivity
Fallacies
Backward compatibility ⇒ instruction set doesn’t change
  But they do accrete more instructions
[Figure: x86 instruction set]
Pitfalls
Sequential words are not at sequential addresses
  Increment by 4, not by 1!
Keeping a pointer to an automatic variable after procedure returns
  e.g., passing pointer back via an argument
  Pointer becomes invalid when stack popped
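Both pitfalls can be illustrated in C. The sketch below assumes 32-bit words and uses hypothetical function names. The first two functions show that consecutive words sit 4 bytes apart; the third shows the safe alternative to returning the address of a local variable, namely writing into storage the caller owns.

```c
#include <stddef.h>
#include <stdint.h>

/* Byte distance between word i and word i + 1 of an array of 32-bit words. */
ptrdiff_t word_stride(const int32_t *words, size_t i) {
    return (const char *)&words[i + 1] - (const char *)&words[i];
}

/* Byte address of word i: base + i * 4, not base + i. */
uint32_t word_address(uint32_t base, uint32_t i) {
    return base + (i << 2);
}

/* Safe pattern for the second pitfall: the result is stored through a
   pointer to the caller's variable, which outlives this call. Returning
   the address of a local variable instead would leave the caller with a
   pointer into a stack frame that has already been popped. */
void store_result(int *out, int value) {
    *out = value;
}
```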
Concluding Remarks
Design principles
  1. Simplicity favors regularity
  2. Smaller is faster
  3. Make the common case fast
  4. Good design demands good compromises
Layers of software/hardware
  Compiler, assembler, hardware
MIPS: typical of RISC ISAs
  c.f. x86
Concluding Remarks
Measure MIPS instruction executions in benchmark programs
  Consider making the common case fast
  Consider compromises

Instruction class    MIPS examples                        SPEC2006 Int    SPEC2006 FP
Arithmetic           add, sub, addi                       16%             48%
Data transfer        lw, sw, lb, lbu, lh, lhu, sb, lui    35%             36%
Logical              and, or, nor, andi, ori, sll, srl    12%             4%
Cond. Branch         beq, bne, slt, slti, sltiu           34%             8%
Jump                 j, jr, jal                           2%              0%
Overview of MIPS
simple instructions all 32 bits wide
very structured, no unnecessary baggage
only three instruction formats

  R    op | rs | rt | rd | shamt | funct
  I    op | rs | rt | 16 bit number
  J    op | 26 bit address

rely on compiler to achieve performance
  what are the compiler's goals?
help compiler where we can
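The three formats can be made concrete with a small encoder. This is a sketch: the function names are hypothetical, while the field widths (6-bit op, 5-bit register fields and shamt, 6-bit funct, 16-bit immediate, 26-bit address) and the opcode and register numbers used in the note below are the standard MIPS ones.

```c
#include <assert.h>
#include <stdint.h>

/* R format: op(6) | rs(5) | rt(5) | rd(5) | shamt(5) | funct(6) */
uint32_t encode_r(uint32_t op, uint32_t rs, uint32_t rt,
                  uint32_t rd, uint32_t shamt, uint32_t funct) {
    assert(op < 64 && rs < 32 && rt < 32 && rd < 32 && shamt < 32 && funct < 64);
    return (op << 26) | (rs << 21) | (rt << 16) | (rd << 11) | (shamt << 6) | funct;
}

/* I format: op(6) | rs(5) | rt(5) | 16-bit number */
uint32_t encode_i(uint32_t op, uint32_t rs, uint32_t rt, int16_t imm) {
    assert(op < 64 && rs < 32 && rt < 32);
    return (op << 26) | (rs << 21) | (rt << 16) | (uint16_t)imm;
}

/* J format: op(6) | 26-bit address */
uint32_t encode_j(uint32_t op, uint32_t addr26) {
    assert(op < 64 && addr26 < (1u << 26));
    return (op << 26) | addr26;
}
```

With $s0 to $s3 as registers 16 to 19, `add $s1, $s2, $s3` is `encode_r(0, 18, 19, 17, 0, 0x20)`, `lw $s1, 100($s2)` is `encode_i(0x23, 18, 17, 100)`, and `j 2500` is `encode_j(2, 2500)`.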
Addresses in Branches and Jumps
Instructions:
  bne $t4,$t5,Label    Next instruction is at Label if $t4 ≠ $t5
  beq $t4,$t5,Label    Next instruction is at Label if $t4 = $t5
  j Label              Next instruction is at Label
Formats:
  I    op | rs | rt | 16 bit number
  J    op | 26 bit address
Addresses are not 32 bits
  How do we handle this with load and store instructions?
Addresses in Branches
Instructions:
  bne $t4,$t5,Label    Next instruction is at Label if $t4 ≠ $t5
  beq $t4,$t5,Label    Next instruction is at Label if $t4 = $t5
Formats:
  I    op | rs | rt | 16 bit number
Could specify a register (like lw and sw) and add it to address
  use Instruction Address Register (PC = program counter)
  most branches are local (principle of locality)
Jump instructions just use high order bits of PC
  address boundaries of 256 MB
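How the 16-bit and 26-bit fields turn into full 32-bit addresses can be sketched in C. The function names are hypothetical; the arithmetic follows the usual MIPS rules, where a branch offset counts words relative to PC + 4 and a jump keeps the upper 4 bits of PC + 4.

```c
#include <stdint.h>

/* PC-relative: target = (PC + 4) + sign-extended 16-bit word offset x 4. */
uint32_t branch_target(uint32_t pc, int16_t offset_words) {
    return pc + 4 + ((uint32_t)(int32_t)offset_words << 2);
}

/* Pseudodirect: upper 4 bits of PC + 4, then the 26-bit field, then two 0 bits.
   The 28 bits supplied by the instruction are why jumps stay within a 256 MB region. */
uint32_t jump_target(uint32_t pc, uint32_t addr26) {
    return ((pc + 4) & 0xF0000000u) | ((addr26 & 0x03FFFFFFu) << 2);
}
```

A branch at address 0x00400000 with offset field 25 therefore goes to 0x00400000 + 4 + 100, and a jump with address field 2500 from the same place goes to byte address 10000.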
To summarize:

MIPS operands

Name: 32 registers
Example: $s0-$s7, $t0-$t9, $zero, $a0-$a3, $v0-$v1, $gp, $fp, $sp, $ra, $at
Comments: Fast locations for data. In MIPS, data must be in registers to perform arithmetic. MIPS register $zero always equals 0. Register $at is reserved for the assembler to handle large constants.

Name: 2^30 memory words
Example: Memory[0], Memory[4], ..., Memory[4294967292]
Comments: Accessed only by data transfer instructions. MIPS uses byte addresses, so sequential words differ by 4. Memory holds data structures, such as arrays, and spilled registers, such as those saved on procedure calls.

MIPS assembly language

Category            Instruction              Example               Meaning                                Comments
Arithmetic          add                      add $s1, $s2, $s3     $s1 = $s2 + $s3                        Three operands; data in registers
                    subtract                 sub $s1, $s2, $s3     $s1 = $s2 - $s3                        Three operands; data in registers
                    add immediate            addi $s1, $s2, 100    $s1 = $s2 + 100                        Used to add constants
Data transfer       load word                lw $s1, 100($s2)      $s1 = Memory[$s2 + 100]                Word from memory to register
                    store word               sw $s1, 100($s2)      Memory[$s2 + 100] = $s1                Word from register to memory
                    load byte                lb $s1, 100($s2)      $s1 = Memory[$s2 + 100]                Byte from memory to register
                    store byte               sb $s1, 100($s2)      Memory[$s2 + 100] = $s1                Byte from register to memory
                    load upper immediate     lui $s1, 100          $s1 = 100 * 2^16                       Loads constant in upper 16 bits
Conditional branch  branch on equal          beq $s1, $s2, 25      if ($s1 == $s2) go to PC + 4 + 100     Equal test; PC-relative branch
                    branch on not equal      bne $s1, $s2, 25      if ($s1 != $s2) go to PC + 4 + 100     Not equal test; PC-relative
                    set on less than         slt $s1, $s2, $s3     if ($s2 < $s3) $s1 = 1; else $s1 = 0   Compare less than; for beq, bne
                    set less than immediate  slti $s1, $s2, 100    if ($s2 < 100) $s1 = 1; else $s1 = 0   Compare less than constant
Unconditional jump  jump                     j 2500                go to 10000                            Jump to target address
                    jump register            jr $ra                go to $ra                              For switch, procedure return
                    jump and link            jal 2500              $ra = PC + 4; go to 10000              For procedure call
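The `lui` row and the remark about large constants can be tied together with a short C model. This is a sketch with hypothetical function names: it assumes the usual two-instruction sequence in which `lui` sets the upper 16 bits and `ori` fills in the lower 16, which is one way a 32-bit constant can be built from 16-bit immediates.

```c
#include <stdint.h>

/* lui: place a 16-bit constant in the upper half, lower half is zero.
   Matches the table entry $s1 = imm * 2^16. */
uint32_t lui(uint16_t imm) {
    return (uint32_t)imm << 16;
}

/* ori: bitwise OR with a zero-extended 16-bit constant. */
uint32_t ori(uint32_t rs, uint16_t imm) {
    return rs | imm;
}

/* Build any 32-bit constant from its upper and lower 16-bit halves. */
uint32_t load_constant(uint32_t value) {
    return ori(lui((uint16_t)(value >> 16)), (uint16_t)(value & 0xFFFFu));
}
```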
[Figure: the five MIPS addressing modes]
1. Immediate addressing: op | rs | rt | Immediate
2. Register addressing: op | rs | rt | rd | ... | funct; operand is in a register
3. Base addressing: op | rs | rt | Address; register + Address selects a byte, halfword, or word in memory
4. PC-relative addressing: op | rs | rt | Address; PC + Address selects a word in memory
5. Pseudodirect addressing: op | Address; Address combined with PC selects a word in memory
Summary
Instruction complexity is only one variable
  lower instruction count vs. higher CPI / lower clock rate
Design Principles:
  simplicity favors regularity
  smaller is faster
  good design demands compromise
  make the common case fast
Instruction set architecture
  a very important abstraction indeed!