Chapter 2. Instructions: Language of the Computer

Author: Molly Hall

Instructions

- Language of the machine
- More primitive than higher-level languages, e.g., no sophisticated control flow
- Very restrictive, e.g., MIPS arithmetic instructions

Electrical & Computer Engineering School of Engineering THE COLLEGE OF NEW JERSEY

Instructions

- We'll be working with the MIPS instruction set architecture
  - Similar to other architectures developed since the 1980s
  - Used by NEC, Nintendo, Silicon Graphics, Sony
- Design goals: maximize performance, minimize cost, and reduce design time

Instructions

[Figure: chart by year, 1998 to 2002, vertical axis 0 to 1400, with series for ARM, IA-32, MIPS, Motorola 68K, PowerPC, Hitachi SH, SPARC, and Other]

Instruction Set

- The repertoire of instructions of a computer
- Different computers have different instruction sets
  - But with many aspects in common
- Early computers had very simple instruction sets
  - Simplified implementation
- Many modern computers also have simple instruction sets

The MIPS Instruction Set

- Used as the example throughout the book
- Stanford MIPS commercialized by MIPS Technologies (www.mips.com)
- Large share of embedded core market
  - Applications in consumer electronics, network/storage equipment, cameras, printers, …
- Typical of many modern ISAs
  - See MIPS Reference Data tear-out card, and Appendixes B and E

MIPS arithmetic

- All instructions have 3 operands
- Operand order is fixed (destination first)
- Example:

    C code:     A = B + C
    MIPS code:  add $s0, $s1, $s2    (registers associated with variables by compiler)

"The natural number of operands for an operation like addition is three … requiring every instruction to have exactly three operands, no more and no less, conforms to the philosophy of keeping the hardware simple"

MIPS arithmetic

- Design Principle: simplicity favors regularity. Why?
- Of course this complicates some things...

    C code:     A = B + C + D;
                E = F - A;

    MIPS code:  add $t0, $s1, $s2
                add $s0, $t0, $s3
                sub $s4, $s5, $s0

MIPS arithmetic

- Operands must be registers; only 32 registers provided
- Design Principle: smaller is faster. Why?

Arithmetic Operations

- Add and subtract, three operands
  - Two sources and one destination

    add a, b, c    # a gets b + c

- All arithmetic operations have this form
- Design Principle 1: Simplicity favors regularity
  - Regularity makes implementation simpler
  - Simplicity enables higher performance at lower cost

Registers vs. Memory

- Arithmetic instruction operands must be registers; only 32 registers provided
- Compiler associates variables with registers
- What about programs with lots of variables?

[Figure: computer organization, showing Processor (Control, Datapath), Memory, and I/O (Input, Output)]

Register Operands

- Arithmetic instructions use register operands
- MIPS has a 32 × 32-bit register file
  - Use for frequently accessed data
  - Numbered 0 to 31
  - 32-bit data called a "word"
- Assembler names
  - $t0, $t1, …, $t9 for temporary values
  - $s0, $s1, …, $s7 for saved variables
- Design Principle 2: Smaller is faster
  - c.f. main memory: millions of locations

Register Operand Example

- C code:

    f = (g + h) - (i + j);

  - f, …, j in $s0, …, $s4
- Compiled MIPS code:

    add $t0, $s1, $s2
    add $t1, $s3, $s4
    sub $s0, $t0, $t1

Registers vs. Memory

- Registers are faster to access than memory
- Operating on memory data requires loads and stores
  - More instructions to be executed
- Compiler must use registers for variables as much as possible
  - Only spill to memory for less frequently used variables
  - Register optimization is important!

Memory Organization

- Viewed as a large, single-dimension array, with an address.
- A memory address is an index into the array
- "Byte addressing" means that the index points to a byte of memory.

    Address   Contents
    0         8 bits of data
    1         8 bits of data
    2         8 bits of data
    3         8 bits of data
    4         8 bits of data
    5         8 bits of data
    6         8 bits of data
    ...

Memory Organization

- Bytes are nice, but most data items use larger "words"
- For MIPS, a word is 32 bits or 4 bytes.

    Address   Contents
    0         32 bits of data
    4         32 bits of data
    8         32 bits of data
    12        32 bits of data
    ...

- Registers hold 32 bits of data

Memory Organization

- \(2^{32}\) bytes with byte addresses from 0 to \(2^{32} - 1\)
- \(2^{30}\) words with byte addresses 0, 4, 8, ..., \(2^{32} - 4\)
- Words are aligned, i.e., what are the least 2 significant bits of a word address?
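The alignment question above can be checked mechanically. The following is a minimal sketch in C, not part of the original slides; the function names `is_word_aligned` and `word_index` are hypothetical and chosen only for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* A word address is aligned when its two least-significant bits are 00,
   i.e. when the byte address is a multiple of 4. */
bool is_word_aligned(uint32_t byte_address) {
    return (byte_address & 0x3u) == 0;
}

/* Index of the 32-bit word that contains a given byte address. */
uint32_t word_index(uint32_t byte_address) {
    return byte_address >> 2;
}
```

Masking with 3 inspects exactly the two low bits, which is why every aligned word address ends in binary 00.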

Memory Operands

- Main memory used for composite data
  - Arrays, structures, dynamic data
- To apply arithmetic operations
  - Load values from memory into registers
  - Store result from register to memory
- Memory is byte addressed
  - Each address identifies an 8-bit byte
- Words are aligned in memory
  - Address must be a multiple of 4
- MIPS is Big Endian
  - Most-significant byte at least address of a word
  - c.f. Little Endian: least-significant byte at least address
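To make the two byte orders concrete, here is a small C sketch (an illustration, not from the slides) that lays out one 32-bit word in a 4-byte buffer both ways. The helper names are hypothetical.

```c
#include <stdint.h>

/* Big Endian: most-significant byte goes to the lowest address (mem[0]). */
void store_word_big_endian(uint8_t mem[4], uint32_t word) {
    mem[0] = (uint8_t)(word >> 24);
    mem[1] = (uint8_t)(word >> 16);
    mem[2] = (uint8_t)(word >> 8);
    mem[3] = (uint8_t)word;
}

/* Little Endian: least-significant byte goes to the lowest address. */
void store_word_little_endian(uint8_t mem[4], uint32_t word) {
    mem[0] = (uint8_t)word;
    mem[1] = (uint8_t)(word >> 8);
    mem[2] = (uint8_t)(word >> 16);
    mem[3] = (uint8_t)(word >> 24);
}

/* Reassemble a word from four bytes stored in Big Endian order. */
uint32_t load_word_big_endian(const uint8_t mem[4]) {
    return ((uint32_t)mem[0] << 24) | ((uint32_t)mem[1] << 16) |
           ((uint32_t)mem[2] << 8)  |  (uint32_t)mem[3];
}
```

Storing 0x12345678 puts 0x12 at the lowest address in the Big Endian layout and 0x78 there in the Little Endian layout.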

Memory Operand Example 1

- C code:

    g = h + A[8];

  - g in $s1, h in $s2, base address of A in $s3
- Compiled MIPS code:
  - Index 8 requires offset of 32
    - 4 bytes per word

    lw  $t0, 32($s3)    # load word: offset 32, base register $s3
    add $s1, $s2, $t0
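The offset arithmetic in this example (index 8 becomes byte offset 32) can be sketched in C as follows. This is an illustrative model only; `word_offset` and `load_word` are hypothetical names, and `load_word` assumes the offset is a multiple of 4.

```c
#include <stdint.h>

enum { BYTES_PER_WORD = 4 };

/* Byte offset of element `index` in an array of 32-bit words. */
uint32_t word_offset(uint32_t index) {
    return index * BYTES_PER_WORD;
}

/* Model of "lw rt, offset(base)": read the word at base + byte_offset.
   Assumes byte_offset is word-aligned. */
int32_t load_word(const int32_t *base, uint32_t byte_offset) {
    return base[byte_offset / BYTES_PER_WORD];
}
```

With this model, `load_word(A, word_offset(8))` reads `A[8]`, matching `lw $t0, 32($s3)`.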

Memory Operand Example 2

- C code:

    A[12] = h + A[8];

  - h in $s2, base address of A in $s3
- Compiled MIPS code:
  - Index 8 requires offset of 32

    lw  $t0, 32($s3)    # load word
    add $t0, $s2, $t0
    sw  $t0, 48($s3)    # store word

Immediate Operands

- Constant data specified in an instruction

    addi $s3, $s3, 4

- No subtract immediate instruction
  - Just use a negative constant

    addi $s2, $s1, -1

- Design Principle 3: Make the common case fast
  - Small constants are common
  - Immediate operand avoids a load instruction

The Constant Zero

- MIPS register 0 ($zero) is the constant 0
  - Cannot be overwritten
- Useful for common operations
  - E.g., move between registers

    add $t2, $s1, $zero

Unsigned Binary Integers

- Given an n-bit number

\[ x = x_{n-1}2^{n-1} + x_{n-2}2^{n-2} + \cdots + x_1 2^1 + x_0 2^0 \]

- Range: 0 to \(+2^n - 1\)
- Example
  - \(0000\ 0000\ 0000\ 0000\ 0000\ 0000\ 0000\ 1011_2 = 0 + \cdots + 1 \times 2^3 + 0 \times 2^2 + 1 \times 2^1 + 1 \times 2^0 = 0 + \cdots + 8 + 0 + 2 + 1 = 11_{10}\)
- Using 32 bits
  - 0 to +4,294,967,295

2s-Complement Signed Integers

- Given an n-bit number

\[ x = -x_{n-1}2^{n-1} + x_{n-2}2^{n-2} + \cdots + x_1 2^1 + x_0 2^0 \]

- Range: \(-2^{n-1}\) to \(+2^{n-1} - 1\)
- Example
  - \(1111\ 1111\ 1111\ 1111\ 1111\ 1111\ 1111\ 1100_2 = -1 \times 2^{31} + 1 \times 2^{30} + \cdots + 1 \times 2^2 + 0 \times 2^1 + 0 \times 2^0 = -2{,}147{,}483{,}648 + 2{,}147{,}483{,}644 = -4_{10}\)
- Using 32 bits
  - -2,147,483,648 to +2,147,483,647

2s-Complement Signed Integers

- Bit 31 is sign bit
  - 1 for negative numbers
  - 0 for non-negative numbers
- \(-(-2^{n-1})\) can't be represented
- Non-negative numbers have the same unsigned and 2s-complement representation
- Some specific numbers
  - 0: 0000 0000 … 0000
  - -1: 1111 1111 … 1111
  - Most-negative: 1000 0000 … 0000
  - Most-positive: 0111 1111 … 1111
- Negation: complement and add 1
  - Complement means 1 → 0, 0 → 1
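The "complement and add 1" rule can be tried directly on 32-bit patterns. The sketch below is an illustration in C rather than anything from the slides; it works on `uint32_t` bit patterns so that the wrap-around is well defined, and the function name is hypothetical.

```c
#include <stdint.h>

/* Negate a 32-bit two's-complement bit pattern:
   invert every bit (1 -> 0, 0 -> 1), then add 1. */
uint32_t twos_complement_negate(uint32_t bits) {
    return ~bits + 1u;
}
```

Negating the pattern for +2 (0x00000002) gives 0xFFFFFFFE, the pattern for -2. Negating the most-negative pattern 0x80000000 returns the same pattern, which illustrates why \(-(-2^{n-1})\) cannot be represented.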

Sign Extension

- Representing a number using more bits
  - Preserve the numeric value
- In MIPS instruction set
  - addi: extend immediate value
  - lb, lh: extend loaded byte/halfword
  - beq, bne: extend the displacement
- Replicate the sign bit to the left
  - c.f. unsigned values: extend with 0s
- Examples: 8-bit to 16-bit
  - +2: 0000 0010 => 0000 0000 0000 0010
  - -2: 1111 1110 => 1111 1111 1111 1110
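The 8-bit to 16-bit examples above can be reproduced with a short C sketch. This is an illustrative model with hypothetical function names; it operates on unsigned bit patterns to show the bit replication explicitly.

```c
#include <stdint.h>

/* Sign extension: copy bit 7 (the sign bit) into bits 8..15. */
uint16_t sign_extend_8_to_16(uint8_t bits) {
    uint16_t result = bits;
    if (bits & 0x80u) {
        result |= 0xFF00u;   /* negative: fill the upper byte with 1s */
    }
    return result;
}

/* Zero extension, as used for unsigned values: upper bits become 0. */
uint16_t zero_extend_8_to_16(uint8_t bits) {
    return bits;
}
```

The pattern 1111 1110 (-2) becomes 1111 1111 1111 1110 under sign extension but 0000 0000 1111 1110 (254) under zero extension.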

Instructions

- Load and store instructions
- Example:

    C code:     A[8] = h + A[8];
    MIPS code:  lw  $t0, 32($s3)
                add $t0, $s2, $t0
                sw  $t0, 32($s3)

- Store word has destination last
- Remember arithmetic operands are registers, not memory!

    Can't write:  add 48($s3), $s2, 32($s3)

Our First Example

- Can we figure out the code?

    swap(int v[], int k)
    {
        int temp;
        temp = v[k];
        v[k] = v[k+1];
        v[k+1] = temp;
    }

    swap:
        muli $2, $5, 4      # $2 = k * 4 (muli is not a standard MIPS instruction; sll $2, $5, 2 has the same effect)
        add  $2, $4, $2     # $2 = address of v[k]
        lw   $15, 0($2)
        lw   $16, 4($2)
        sw   $16, 0($2)
        sw   $15, 4($2)
        jr   $31

So far we've learned:

- MIPS
  - loading words but addressing bytes
  - arithmetic on registers only

    Instruction            Meaning
    add $s1, $s2, $s3      $s1 = $s2 + $s3
    sub $s1, $s2, $s3      $s1 = $s2 - $s3
    lw  $s1, 100($s2)      $s1 = Memory[$s2+100]
    sw  $s1, 100($s2)      Memory[$s2+100] = $s1

Machine Language

- Instructions, like registers and words of data, are also 32 bits long
  - Example: add $t0, $s1, $s2
  - Registers have numbers: $t0=8, $s1=17, $s2=18
- Instruction format:

    000000   10001   10010   01000   00000   100000
    op       rs      rt      rd      shamt   funct

- Can you guess what the field names stand for?

Machine Language

- Consider the load-word and store-word instructions
  - What would the regularity principle have us do?
  - New principle: Good design demands a compromise

Machine Language

- Introduce a new type of instruction format
  - I-type for data transfer instructions
  - The other format was R-type for register
- Example: lw $t0, 32($s2)

    35    18    8     32
    op    rs    rt    16-bit number

- Where's the compromise?

Representing Instructions

- Instructions are encoded in binary
  - Called machine code
- MIPS instructions
  - Encoded as 32-bit instruction words
  - Small number of formats encoding operation code (opcode), register numbers, …
  - Regularity!
- Register numbers
  - $t0 – $t7 are registers 8 – 15
  - $t8 – $t9 are registers 24 – 25
  - $s0 – $s7 are registers 16 – 23

MIPS R-format Instructions

    op       rs       rt       rd       shamt    funct
    6 bits   5 bits   5 bits   5 bits   5 bits   6 bits

- Instruction fields
  - op: operation code (opcode)
  - rs: first source register number
  - rt: second source register number
  - rd: destination register number
  - shamt: shift amount (00000 for now)
  - funct: function code (extends opcode)

R-format Example

    add $t0, $s1, $s2

    op        rs       rt       rd       shamt    funct
    6 bits    5 bits   5 bits   5 bits   5 bits   6 bits
    special   $s1      $s2      $t0      0        add
    0         17       18       8        0        32
    000000    10001    10010    01000    00000    100000

\(00000010001100100100000000100000_2 = 02324020_{16}\)
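The field packing in the R-format example can be reproduced with shifts and masks. The following C sketch is illustrative only; `encode_r_format` is a hypothetical helper, and each argument is masked to its field width as given in the table above.

```c
#include <stdint.h>

/* Pack the six R-format fields into one 32-bit instruction word:
   op[31:26] rs[25:21] rt[20:16] rd[15:11] shamt[10:6] funct[5:0]. */
uint32_t encode_r_format(uint32_t op, uint32_t rs, uint32_t rt,
                         uint32_t rd, uint32_t shamt, uint32_t funct) {
    return ((op    & 0x3Fu) << 26) |
           ((rs    & 0x1Fu) << 21) |
           ((rt    & 0x1Fu) << 16) |
           ((rd    & 0x1Fu) << 11) |
           ((shamt & 0x1Fu) << 6)  |
            (funct & 0x3Fu);
}
```

For `add $t0, $s1, $s2` the call `encode_r_format(0, 17, 18, 8, 0, 32)` yields 0x02324020, the hexadecimal value shown on the slide.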

MIPS I-format Instructions

    op       rs       rt       constant or address
    6 bits   5 bits   5 bits   16 bits

- Immediate arithmetic and load/store instructions
  - rt: destination or source register number
  - Constant: \(-2^{15}\) to \(+2^{15} - 1\)
  - Address: offset added to base address in rs
- Design Principle 4: Good design demands good compromises
  - Different formats complicate decoding, but allow 32-bit instructions uniformly
  - Keep formats as similar as possible
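An I-format word can be assembled the same way as an R-format word. The C sketch below is an illustration with a hypothetical helper name; the opcode values used in the usage note (35 for lw, 8 for addi) are the standard MIPS opcodes.

```c
#include <stdint.h>

/* Pack the I-format fields into one 32-bit instruction word:
   op[31:26] rs[25:21] rt[20:16] immediate[15:0].
   The signed 16-bit immediate is stored as its two's-complement pattern. */
uint32_t encode_i_format(uint32_t op, uint32_t rs, uint32_t rt, int16_t imm) {
    return ((op & 0x3Fu) << 26) |
           ((rs & 0x1Fu) << 21) |
           ((rt & 0x1Fu) << 16) |
            (uint32_t)(uint16_t)imm;
}
```

Under these assumptions, `lw $t0, 32($s3)` is `encode_i_format(35, 19, 8, 32)`, giving 0x8E680020, and `addi $s2, $s1, -1` is `encode_i_format(8, 17, 18, -1)`, giving 0x2232FFFF.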

Logical Operations

- Instructions for bitwise manipulation

    Operation      C     Java   MIPS
    Shift left     <<    <<     sll
    Shift right    >>    >>>    srl
    Bitwise AND    &     &      and, andi
    Bitwise OR     |     |      or, ori
    Bitwise NOT    ~     ~      nor

- Useful for extracting and inserting groups of bits in a word

Shift Operations

    op       rs       rt       rd       shamt    funct
    6 bits   5 bits   5 bits   5 bits   5 bits   6 bits

- shamt: how many positions to shift
- Shift left logical
  - Shift left and fill with 0 bits
  - sll by i bits multiplies by \(2^i\)
- Shift right logical
  - Shift right and fill with 0 bits
  - srl by i bits divides by \(2^i\) (unsigned only)
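The multiply and divide interpretation of shifts can be checked with a small C model. This is a sketch, not from the slides; the function names are hypothetical, and the shift amount is masked to 5 bits to mirror the width of the shamt field.

```c
#include <stdint.h>

/* Model of sll: shift left, filling vacated low bits with 0. */
uint32_t shift_left_logical(uint32_t value, unsigned shamt) {
    return value << (shamt & 31u);
}

/* Model of srl: shift right, filling vacated high bits with 0. */
uint32_t shift_right_logical(uint32_t value, unsigned shamt) {
    return value >> (shamt & 31u);
}
```

Shifting 9 left by 4 gives 144 (9 × 16), and shifting 144 right by 4 gives 9 again. Because srl fills with 0s, it only divides correctly when the value is treated as unsigned.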

AND Operations

- Useful to mask bits in a word
  - Select some bits, clear others to 0

    and $t0, $t1, $t2

    $t2   0000 0000 0000 0000 0000 1101 1100 0000
    $t1   0000 0000 0000 0000 0011 1100 0000 0000
    $t0   0000 0000 0000 0000 0000 1100 0000 0000

OR Operations

- Useful to include bits in a word
  - Set some bits to 1, leave others unchanged

    or $t0, $t1, $t2

    $t2   0000 0000 0000 0000 0000 1101 1100 0000
    $t1   0000 0000 0000 0000 0011 1100 0000 0000
    $t0   0000 0000 0000 0000 0011 1101 1100 0000

NOT Operations

- Useful to invert bits in a word
  - Change 0 to 1, and 1 to 0
- MIPS has NOR 3-operand instruction
  - a NOR b == NOT ( a OR b )

    nor $t0, $t1, $zero    # Register 0: always read as zero

    $t1   0000 0000 0000 0000 0011 1100 0000 0000
    $t0   1111 1111 1111 1111 1100 0011 1111 1111
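The bit patterns on the AND, OR, and NOT slides can be verified with C's bitwise operators. The sketch below is illustrative; the `mips_*` function names are hypothetical, and the hexadecimal constants in the usage note are the slide's register values ($t1 = 0x00003C00, $t2 = 0x00000DC0).

```c
#include <stdint.h>

/* and rd, rs, rt: keep only bits set in both operands (masking). */
uint32_t mips_and(uint32_t rs, uint32_t rt) { return rs & rt; }

/* or rd, rs, rt: set bits that are set in either operand. */
uint32_t mips_or(uint32_t rs, uint32_t rt)  { return rs | rt; }

/* nor rd, rs, rt: NOT (rs OR rt); with rt = 0 this is a plain NOT. */
uint32_t mips_nor(uint32_t rs, uint32_t rt) { return ~(rs | rt); }
```

With the slide's values, AND gives 0x00000C00, OR gives 0x00003DC0, and `mips_nor(0x00003C00, 0)` gives 0xFFFFC3FF.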

Conditional Operations

- Branch to a labeled instruction if a condition is true
  - Otherwise, continue sequentially
- beq rs, rt, L1
  - if (rs == rt) branch to instruction labeled L1;
- bne rs, rt, L1
  - if (rs != rt) branch to instruction labeled L1;
- j L1
  - unconditional jump to instruction labeled L1

Compiling If Statements

- C code:

    if (i==j) f = g+h;
    else f = g-h;

  - f, g, … in $s0, $s1, …
- Compiled MIPS code:

          bne $s3, $s4, Else
          add $s0, $s1, $s2
          j   Exit
    Else: sub $s0, $s1, $s2
    Exit: …

  - Assembler calculates addresses

Compiling Loop Statements

- C code:

    while (save[i] == k) i += 1;

  - i in $s3, k in $s5, address of save in $s6
- Compiled MIPS code:

    Loop: sll  $t1, $s3, 2
          add  $t1, $t1, $s6
          lw   $t0, 0($t1)
          bne  $t0, $s5, Exit
          addi $s3, $s3, 1
          j    Loop
    Exit: …

Basic Blocks

- A basic block is a sequence of instructions with
  - No embedded branches (except at end)
  - No branch targets (except at beginning)
- A compiler identifies basic blocks for optimization
- An advanced processor can accelerate execution of basic blocks

More Conditional Operations

- Set result to 1 if a condition is true
  - Otherwise, set to 0
- slt rd, rs, rt
  - if (rs < rt) rd = 1; else rd = 0;
- slti rt, rs, constant
  - if (rs < constant) rt = 1; else rt = 0;
- Use in combination with beq, bne

    slt $t0, $s1, $s2    # if ($s1 < $s2)
    bne $t0, $zero, L    #   branch to L

Branch Instruction Design

- Why not blt, bge, etc.?
- Hardware for … +1 … $t0 = 0

Procedure Calling

- Steps required
  1. Place parameters in registers
  2. Transfer control to procedure
  3. Acquire storage for procedure
  4. Perform procedure's operations
  5. Place result in register for caller
  6. Return to place of call

Stored Program Computers

The BIG Picture

- Instructions represented in binary, just like data
- Instructions and data stored in memory
- Programs can operate on programs
  - e.g., compilers, linkers, …
- Binary compatibility allows compiled programs to work on different computers
  - Standardized ISAs

Stored Program Concept

- Fetch & Execute Cycle
  - Instructions are fetched and put into a special register
  - Bits in the register "control" the subsequent actions
  - Fetch the "next" instruction and continue

Control

- Decision-making instructions
  - alter the control flow,
  - i.e., change the "next" instruction to be executed

Control

- MIPS conditional branch instructions:

    bne $t0, $t1, Label
    beq $t0, $t1, Label

- Example: if (i==j) h = i + j;

           bne $s0, $s1, Label
           add $s3, $s0, $s1
    Label: ....

Control

- MIPS unconditional branch instructions:

    j label

- Example:

    if (i!=j)
        h=i+j;
    else
        h=i-j;

          beq $s4, $s5, Lab1
          add $s3, $s4, $s5
          j   Lab2
    Lab1: sub $s3, $s4, $s5
    Lab2: ...

- Can you build a simple for loop?

So far:

    Instruction           Meaning
    add $s1,$s2,$s3       $s1 = $s2 + $s3
    sub $s1,$s2,$s3       $s1 = $s2 - $s3
    lw  $s1,100($s2)      $s1 = Memory[$s2+100]
    sw  $s1,100($s2)      Memory[$s2+100] = $s1
    bne $s4,$s5,Label     Next instr. is at Label if $s4 ≠ $s5
    beq $s4,$s5,Label     Next instr. is at Label if $s4 = $s5
    j   Label             Next instr. is at Label

- Formats:

    R   op   rs   rt   rd   shamt   funct
    I   op   rs   rt   16-bit number
    J   op   26-bit address

Control Flow

- We have: beq, bne; what about branch-if-less-than?
- New instruction:

    slt $t0, $s1, $s2    # if $s1 < $s2 then $t0 = 1, else $t0 = 0

- Can use this instruction to build "blt $s1, $s2, Label"
  - can now build general control structures
- Note that the assembler needs a register to do this
  - there are policy-of-use conventions for registers

Policy of Use Conventions

    Name       Register number   Usage                                           Preserved on call?
    $zero      0                 the constant value 0                            n.a.
    $v0-$v1    2-3               values for results and expression evaluation    no
    $a0-$a3    4-7               arguments                                       yes
    $t0-$t7    8-15              temporaries                                     no
    $s0-$s7    16-23             saved                                           yes
    $t8-$t9    24-25             more temporaries                                no
    $gp        28                global pointer                                  yes
    $sp        29                stack pointer                                   yes
    $fp        30                frame pointer                                   yes
    $ra        31                return address                                  yes

Register 1 ($at) reserved for assembler, 26-27 for operating system

Memory Layout

- Text: program code
- Static data: global variables
  - e.g., static variables in C, constant arrays and strings
  - $gp initialized to address allowing ±offsets into this segment
- Dynamic data: heap
  - E.g., malloc in C, new in Java
- Stack: automatic storage

Local Data on the Stack

- Local data allocated by callee
  - e.g., C automatic variables
- Procedure frame (activation record)
  - Used by some compilers to manage stack storage

Procedure Call Instructions

- Procedure call: jump and link

    jal ProcedureLabel

  - Address of following instruction put in $ra
  - Jumps to target address
- Procedure return: jump register

    jr $ra

  - Copies $ra to program counter
  - Can also be used for computed jumps
    - e.g., for case/switch statements

Leaf Procedure Example

- C code:

    int leaf_example (int g, int h, int i, int j)
    {
        int f;
        f = (g + h) - (i + j);
        return f;
    }

  - Arguments g, …, j in $a0, …, $a3
  - f in $s0 (hence, need to save $s0 on stack)
  - Result in $v0

Leaf Procedure Example

- MIPS code:

    leaf_example:
        addi $sp, $sp, -4      # Save $s0 on stack
        sw   $s0, 0($sp)
        add  $t0, $a0, $a1     # Procedure body
        add  $t1, $a2, $a3
        sub  $s0, $t0, $t1
        add  $v0, $s0, $zero   # Result
        lw   $s0, 0($sp)       # Restore $s0
        addi $sp, $sp, 4
        jr   $ra               # Return

Non-Leaf Procedures

- Procedures that call other procedures
- For nested call, caller needs to save on the stack:
  - Its return address
  - Any arguments and temporaries needed after the call
- Restore from the stack after the call

Non-Leaf Procedure Example

- C code:

    int fact (int n)
    {
        if (n < 1) return 1;
        else return n * fact(n - 1);
    }

  - Argument n in $a0
  - Result in $v0

Non-Leaf Procedure Example

- MIPS code:

    fact:
        addi $sp, $sp, -8     # adjust stack for 2 items
        sw   $ra, 4($sp)      # save return address
        sw   $a0, 0($sp)      # save argument
        slti $t0, $a0, 1      # test for n < 1
        beq  $t0, $zero, L1
        addi $v0, $zero, 1    # if so, result is 1
        addi $sp, $sp, 8      #   pop 2 items from stack
        jr   $ra              #   and return
    L1: addi $a0, $a0, -1     # else decrement n
        jal  fact             # recursive call
        lw   $a0, 0($sp)      # restore original n
        lw   $ra, 4($sp)      #   and return address
        addi $sp, $sp, 8      # pop 2 items from stack
        mul  $v0, $a0, $v0    # multiply to get result
        jr   $ra              # and return

Character Data

- Byte-encoded character sets
  - ASCII: 128 characters
    - 95 graphic, 33 control
  - Latin-1: 256 characters
    - ASCII, +96 more graphic characters
- Unicode: 32-bit character set
  - Used in Java, C++ wide characters, …
  - Most of the world's alphabets, plus symbols
  - UTF-8, UTF-16: variable-length encodings

Byte/Halfword Operations

- Could use bitwise operations
- MIPS byte/halfword load/store
  - String processing is a common case

    lb  rt, offset(rs)     lh  rt, offset(rs)
      - Sign extend to 32 bits in rt
    lbu rt, offset(rs)     lhu rt, offset(rs)
      - Zero extend to 32 bits in rt
    sb  rt, offset(rs)     sh  rt, offset(rs)
      - Store just rightmost byte/halfword

String Copy Example

- C code:
  - Null-terminated string

    void strcpy (char x[], char y[])
    {
        int i;
        i = 0;
        while ((x[i]=y[i])!='\0')
            i += 1;
    }

  - Addresses of x, y in $a0, $a1
  - i in $s0

String Copy Example

- MIPS code:

    strcpy:
        addi $sp, $sp, -4       # adjust stack for 1 item
        sw   $s0, 0($sp)        # save $s0
        add  $s0, $zero, $zero  # i = 0
    L1: add  $t1, $s0, $a1      # addr of y[i] in $t1
        lbu  $t2, 0($t1)        # $t2 = y[i]
        add  $t3, $s0, $a0      # addr of x[i] in $t3
        sb   $t2, 0($t3)        # x[i] = y[i]
        beq  $t2, $zero, L2     # exit loop if y[i] == 0
        addi $s0, $s0, 1        # i = i + 1
        j    L1                 # next iteration of loop
    L2: lw   $s0, 0($sp)        # restore saved $s0
        addi $sp, $sp, 4        # pop 1 item from stack
        jr   $ra                # and return

Constants

- Small constants are used quite frequently (50% of operands), e.g.,

    A = A + 5;
    B = B + 1;
    C = C - 18;

- Solutions? Why not?
  - put 'typical constants' in memory and load them.
  - create hard-wired registers (like $zero) for constants like one.

Constants

- MIPS Instructions:

    addi $29, $29, 4
    slti $8, $18, 10
    andi $29, $29, 6
    ori  $29, $29, 4

- How do we make this work?
- Design Principle: Make the common case fast. Which format?

How about larger constants?

- We'd like to be able to load a 32-bit constant into a register
- Must use two instructions; new "load upper immediate" instruction

    lui $t0, 1010101010101010

    1010101010101010   0000000000000000    (lower half filled with zeros)

- Then must get the lower-order bits right, i.e.,

    ori $t0, $t0, 1010101010101010

           1010101010101010   0000000000000000
    ori    0000000000000000   1010101010101010
           -----------------------------------
           1010101010101010   1010101010101010
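The two-step construction of a 32-bit constant can be modeled in C. This is a sketch under the assumption that `lui` places its 16-bit immediate in the upper half and clears the lower half, and that `ori` zero-extends its immediate, as the slide describes; the function names are hypothetical.

```c
#include <stdint.h>

/* Model of lui: immediate goes to bits 31..16, bits 15..0 become 0. */
uint32_t lui(uint16_t imm) {
    return (uint32_t)imm << 16;
}

/* Model of ori: OR the register with the zero-extended immediate. */
uint32_t ori(uint32_t rs, uint16_t imm) {
    return rs | (uint32_t)imm;
}

/* Build any 32-bit constant with one lui followed by one ori. */
uint32_t load_32bit_constant(uint32_t value) {
    return ori(lui((uint16_t)(value >> 16)), (uint16_t)(value & 0xFFFFu));
}
```

With the slide's bit pattern 1010101010101010 (0xAAAA), `lui` produces 0xAAAA0000 and the following `ori` produces 0xAAAAAAAA.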

Branch Addressing

- Branch instructions specify
  - Opcode, two registers, target address
- Most branch targets are near branch
  - Forward or backward

    op       rs       rt       constant or address
    6 bits   5 bits   5 bits   16 bits

- PC-relative addressing
  - Target address = PC + offset × 4
  - PC already incremented by 4 by this time
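The PC-relative formula can be sketched in C as below. This is an illustrative model, with `branch_target` a hypothetical name; it takes the address of the branch instruction itself and applies the increment by 4 before adding the scaled, sign-extended offset.

```c
#include <stdint.h>

/* Target address = (PC of branch + 4) + offset * 4,
   where offset is the signed 16-bit field of the branch instruction. */
uint32_t branch_target(uint32_t branch_pc, int16_t offset) {
    uint32_t next_pc = branch_pc + 4u;          /* PC already incremented */
    return next_pc + (uint32_t)((int32_t)offset * 4);
}
```

For a branch located at address 80012 with offset field 2, the target is 80012 + 4 + 8 = 80024. A negative offset yields a backward branch; offset -1 targets the branch instruction itself.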

Jump Addressing

- Jump (j and jal) targets could be anywhere in text segment
  - Encode full address in instruction

    op       address
    6 bits   26 bits

- (Pseudo)Direct jump addressing
  - Target address = PC[31…28] : (address × 4)
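The concatenation in the pseudo-direct formula can be written with a mask and a shift. The C sketch below is illustrative; `jump_target` is a hypothetical name, and it assumes the upper four bits are taken from the already-incremented PC.

```c
#include <stdint.h>

/* Target address = upper 4 bits of (PC + 4), followed by the
   26-bit address field shifted left by 2 (multiplied by 4). */
uint32_t jump_target(uint32_t jump_pc, uint32_t address26) {
    uint32_t next_pc = jump_pc + 4u;
    return (next_pc & 0xF0000000u) | ((address26 & 0x03FFFFFFu) << 2);
}
```

A jump at address 80020 with address field 20000 lands at 20000 × 4 = 80000. The 26-bit field plus the two implied zero bits cover 28 bits, so the top four bits must come from the PC.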

Branching Far Away

- If branch target is too far to encode with 16-bit offset, assembler rewrites the code
- Example

        beq $s0, $s1, L1
            ↓
        bne $s0, $s1, L2
        j   L1
    L2: …

Addressing Mode Summary


Synchronization

- Two processors sharing an area of memory
  - P1 writes, then P2 reads
  - Data race if P1 and P2 don't synchronize
    - Result depends on order of accesses
- Hardware support required
  - Atomic read/write memory operation
  - No other access to the location allowed between the read and write
- Could be a single instruction
  - E.g., atomic swap of register ↔ memory
  - Or an atomic pair of instructions

Synchronization in MIPS

- Load linked: ll rt, offset(rs)
- Store conditional: sc rt, offset(rs)
  - Succeeds if location not changed since the ll
    - Returns 1 in rt
  - Fails if location is changed
    - Returns 0 in rt
- Example: atomic swap (to test/set lock variable)

    try: add $t0,$zero,$s4    ;copy exchange value
         ll  $t1,0($s1)       ;load linked
         sc  $t0,0($s1)       ;store conditional
         beq $t0,$zero,try    ;branch store fails
         add $s4,$zero,$t1    ;put load value in $s4
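The retry loop built from ll and sc has a close analogue in C11 atomics. The sketch below is an analogy, not a translation of the MIPS code: it uses a compare-and-exchange loop from `<stdatomic.h>` in place of ll/sc, and `atomic_swap` is a hypothetical name.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Atomically store new_value into *location and return the old value.
   Like the ll/sc loop, it retries until the update succeeds without
   another access changing the location in between. */
int32_t atomic_swap(_Atomic int32_t *location, int32_t new_value) {
    int32_t observed = atomic_load(location);          /* plays the role of ll */
    while (!atomic_compare_exchange_weak(location, &observed, new_value)) {
        /* Store failed: observed now holds the current value; try again. */
    }
    return observed;                                   /* value seen before the swap */
}
```

Used as a test-and-set lock, a caller that gets 0 back from `atomic_swap(&lock, 1)` has acquired the lock, while a caller that gets 1 back found it already held. C11 also provides `atomic_exchange`, which performs this swap directly.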

Translation and Startup

[Figure: translation and startup flow, annotated "Many compilers produce object modules directly" and "Static linking"]

Assembler Pseudoinstructions

- Most assembler instructions represent machine instructions one-to-one
- Pseudoinstructions: figments of the assembler's imagination

    move $t0, $t1      →  add $t0, $zero, $t1
    blt  $t0, $t1, L   →  slt $at, $t0, $t1
                          bne $at, $zero, L

  - $at (register 1): assembler temporary

Producing an Object Module

- Assembler (or compiler) translates program into machine instructions
- Provides information for building a complete program from the pieces
  - Header: describes contents of object module
  - Text segment: translated instructions
  - Static data segment: data allocated for the life of the program
  - Relocation info: for contents that depend on absolute location of loaded program
  - Symbol table: global definitions and external refs
  - Debug info: for associating with source code

Linking Object Modules

- Produces an executable image
  1. Merges segments
  2. Resolves labels (determines their addresses)
  3. Patches location-dependent and external refs
- Could leave location dependencies for fixing by a relocating loader
  - But with virtual memory, no need to do this
  - Program can be loaded into absolute location in virtual memory space

Loading a Program

- Load from image file on disk into memory
  1. Read header to determine segment sizes
  2. Create virtual address space
  3. Copy text and initialized data into memory
     - Or set page table entries so they can be faulted in
  4. Set up arguments on stack
  5. Initialize registers (including $sp, $fp, $gp)
  6. Jump to startup routine
     - Copies arguments to $a0, … and calls main
     - When main returns, do exit syscall

Dynamic Linking

- Only link/load library procedure when it is called
  - Requires procedure code to be relocatable
  - Avoids image bloat caused by static linking of all (transitively) referenced libraries
  - Automatically picks up new library versions

Lazy Linkage

[Figure: lazy procedure linkage, with these labeled parts]
  Indirection table
  Stub: loads routine ID, jumps to linker/loader
  Linker/loader code
  Dynamically mapped code
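The indirection-table idea can be imitated in plain C with a function pointer. This is a rough sketch, not how any particular dynamic linker is implemented: `square_entry` plays the role of one table slot, `square_stub` stands in for the stub plus linker/loader, and `real_square` stands in for the dynamically mapped code. All of these names are made up for the illustration.

```c
#include <assert.h>

static int resolve_count = 0;   /* how many times the "linker" ran */

/* Stands in for the dynamically mapped library routine. */
static int real_square(int x) { return x * x; }

static int square_stub(int x);

/* One slot of the indirection table; it starts out pointing at the stub. */
static int (*square_entry)(int) = square_stub;

/* Stub: resolves the routine on first use, patches the table slot,
   then forwards the call. Later calls bypass the stub entirely. */
static int square_stub(int x)
{
    assert(square_entry == square_stub);  /* runs only before patching */
    resolve_count++;
    square_entry = real_square;
    return square_entry(x);
}

/* Callers always go through the table slot. */
static int call_square(int x) { return square_entry(x); }
```

The first `call_square` pays the cost of resolution; every later call is a single indirect jump through the patched slot.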

Starting Java Applications

[Figure: Java translation hierarchy, with these annotations]
  Simple portable instruction set for the JVM
  Compiles bytecodes of “hot” methods into native code for host machine
  Interprets bytecodes

C Sort Example 



Illustrates use of assembly instructions for a C bubble sort function
Swap procedure (leaf)

  void swap(int v[], int k)
  {
    int temp;
    temp = v[k];
    v[k] = v[k+1];
    v[k+1] = temp;
  }

  v in $a0, k in $a1, temp in $t0

The Procedure Swap

  swap: sll $t1, $a1, 2     # $t1 = k * 4
        add $t1, $a0, $t1   # $t1 = v + (k * 4) (address of v[k])
        lw  $t0, 0($t1)     # $t0 (temp) = v[k]
        lw  $t2, 4($t1)     # $t2 = v[k+1]
        sw  $t2, 0($t1)     # v[k] = $t2 (v[k+1])
        sw  $t0, 4($t1)     # v[k+1] = $t0 (temp)
        jr  $ra             # return to calling routine

The Sort Procedure in C 

Non-leaf (calls swap)

  void sort(int v[], int n)
  {
    int i, j;
    for (i = 0; i < n; i += 1) {
      for (j = i - 1; j >= 0 && v[j] > v[j + 1]; j -= 1) {
        swap(v, j);
      }
    }
  }

  v in $a0, n in $a1, i in $s0, j in $s1

The Procedure Body

           # Move params
           move $s2, $a0           # save $a0 into $s2
           move $s3, $a1           # save $a1 into $s3
           # Outer loop
           move $s0, $zero         # i = 0
  for1tst: slt  $t0, $s0, $s3      # $t0 = 0 if $s0 ≥ $s3 (i ≥ n)
           beq  $t0, $zero, exit1  # go to exit1 if $s0 ≥ $s3 (i ≥ n)
           # Inner loop
           addi $s1, $s0, -1       # j = i - 1
  for2tst: slti $t0, $s1, 0        # $t0 = 1 if $s1 < 0 (j < 0)
           bne  $t0, $zero, exit2  # go to exit2 if $s1 < 0 (j < 0)
           sll  $t1, $s1, 2        # $t1 = j * 4
           add  $t2, $s2, $t1      # $t2 = v + (j * 4)
           lw   $t3, 0($t2)        # $t3 = v[j]
           lw   $t4, 4($t2)        # $t4 = v[j + 1]
           slt  $t0, $t4, $t3      # $t0 = 0 if $t4 ≥ $t3
           beq  $t0, $zero, exit2  # go to exit2 if $t4 ≥ $t3
           # Pass params & call
           move $a0, $s2           # 1st param of swap is v (old $a0)
           move $a1, $s1           # 2nd param of swap is j
           jal  swap               # call swap procedure
           # Inner loop
           addi $s1, $s1, -1       # j -= 1
           j    for2tst            # jump to test of inner loop
           # Outer loop
  exit2:   addi $s0, $s0, 1        # i += 1
           j    for1tst            # jump to test of outer loop

The Full Procedure

  sort:    addi $sp, $sp, -20      # make room on stack for 5 registers
           sw   $ra, 16($sp)       # save $ra on stack
           sw   $s3, 12($sp)       # save $s3 on stack
           sw   $s2, 8($sp)        # save $s2 on stack
           sw   $s1, 4($sp)        # save $s1 on stack
           sw   $s0, 0($sp)        # save $s0 on stack
           …                       # procedure body
  exit1:   lw   $s0, 0($sp)        # restore $s0 from stack
           lw   $s1, 4($sp)        # restore $s1 from stack
           lw   $s2, 8($sp)        # restore $s2 from stack
           lw   $s3, 12($sp)       # restore $s3 from stack
           lw   $ra, 16($sp)       # restore $ra from stack
           addi $sp, $sp, 20       # restore stack pointer
           jr   $ra                # return to calling routine

Effect of Compiler Optimization

Compiled with gcc for Pentium 4 under Linux

[Figure: four bar charts comparing optimization levels none, O1, O2, and O3: Relative Performance, Instruction count, Clock Cycles, and CPI]

Effect of Language and Algorithm

[Figure: three bar charts across C/none, C/O1, C/O2, C/O3, Java/int, and Java/JIT: Bubblesort Relative Performance, Quicksort Relative Performance, and Quicksort vs. Bubblesort Speedup]

Lessons Learnt 





Instruction count and CPI are not good performance indicators in isolation
Compiler optimizations are sensitive to the algorithm
Java/JIT compiled code is significantly faster than JVM interpreted
  Comparable to optimized C in some cases
Nothing can fix a dumb algorithm!
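The first lesson follows from the relation clock cycles = instruction count × CPI. A small C sketch makes the point; the numbers in the usage note are invented for illustration and are not taken from the charts.

```c
#include <assert.h>

/* Clock cycles = instruction count x average cycles per instruction. */
static double clock_cycles(double instruction_count, double cpi)
{
    assert(instruction_count >= 0.0 && cpi >= 0.0);
    return instruction_count * cpi;
}
```

With hypothetical figures, a build that executes 100000 instructions at a CPI of 1.5 needs 150000 cycles, while a build that executes only 80000 instructions at a CPI of 2.0 needs 160000. The second build wins on instruction count, the first wins on CPI, and only the product says which one is faster.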

Arrays vs. Pointers

Array indexing involves
  Multiplying index by element size
  Adding to array base address
Pointers correspond directly to memory addresses
  Can avoid indexing complexity

Comparison of Array vs. Ptr

Multiply “strength reduced” to shift
Array version requires shift to be inside loop
  Part of index calculation for incremented i
  c.f. incrementing pointer
Compiler can achieve same effect as manual use of pointers
  Induction variable elimination
  Better to make program clearer and safer
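The two styles being compared can be written side by side in C. This is a minimal sketch of the idea; the function names `clear_indexed` and `clear_pointer` are chosen here for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Array version: each access conceptually computes
   base + i * sizeof(int), so a naive translation keeps a
   multiply (or shift) inside the loop. */
static void clear_indexed(int array[], size_t size)
{
    assert(array != NULL);
    for (size_t i = 0; i < size; i += 1)
        array[i] = 0;
}

/* Pointer version: the address itself is the loop variable and is
   simply incremented by one element per iteration. */
static void clear_pointer(int *array, size_t size)
{
    assert(array != NULL);
    for (int *p = array; p < array + size; p += 1)
        *p = 0;
}
```

Both functions have the same effect. An optimizing compiler can typically turn the indexed loop into the pointer form on its own, which is the argument for writing whichever version is clearer.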

ARM & MIPS Similarities  

ARM: the most popular embedded core
Similar basic set of instructions to MIPS

                          ARM              MIPS
  Date announced          1985             1985
  Instruction size        32 bits          32 bits
  Address space           32-bit flat      32-bit flat
  Data alignment          Aligned          Aligned
  Data addressing modes   9                3
  Registers               15 × 32-bit      31 × 32-bit
  Input/output            Memory mapped    Memory mapped

Compare and Branch in ARM 

Uses condition codes for result of an arithmetic/logical instruction
  Negative, zero, carry, overflow
  Compare instructions to set condition codes without keeping the result
Each instruction can be conditional
  Top 4 bits of instruction word: condition value
  Can avoid branches over single instructions

Instruction Encoding


Alternative Architectures

Design alternative:
  provide more powerful operations
  goal is to reduce number of instructions executed
  danger is a slower cycle time and/or a higher CPI

“The path toward operation complexity is thus fraught with peril. To avoid these problems, designers have moved toward simpler instructions”

Alternative Architectures

Sometimes referred to as “RISC vs. CISC”
  virtually all new instruction sets since 1982 have been RISC
  VAX: minimize code size, make assembly language easy
    instructions from 1 to 54 bytes long!
We’ll look at PowerPC and Intel Architecture (IA)

The Intel x86 ISA 

Evolution with backward compatibility
  8080 (1974): 8-bit microprocessor
    Accumulator, plus 3 index-register pairs
  8086 (1978): 16-bit extension to 8080
    Complex instruction set (CISC)
  8087 (1980): floating-point coprocessor
    Adds FP instructions and register stack
  80286 (1982): 24-bit addresses, MMU
    Segmented memory mapping and protection
  80386 (1985): 32-bit extension (now IA-32)
    Additional addressing modes and operations
    Paged memory mapping as well as segments

The Intel x86 ISA 

Further evolution…
  i486 (1989): pipelined, on-chip caches and FPU
    Compatible competitors: AMD, Cyrix, …
  Pentium (1993): superscalar, 64-bit datapath
    Later versions added MMX (Multi-Media eXtension) instructions
    The infamous FDIV bug
  Pentium Pro (1995), Pentium II (1997)
    New microarchitecture (see Colwell, The Pentium Chronicles)
  Pentium III (1999)
    Added SSE (Streaming SIMD Extensions) and associated registers
  Pentium 4 (2001)
    New microarchitecture
    Added SSE2 instructions

The Intel x86 ISA 

And further…
  AMD64 (2003): extended architecture to 64 bits
  EM64T – Extended Memory 64 Technology (2004)
    AMD64 adopted by Intel (with refinements)
    Added SSE3 instructions
  Intel Core (2006)
    Added SSE4 instructions, virtual machine support
  AMD64 (announced 2007): SSE5 instructions
    Intel declined to follow, instead…
  Advanced Vector Extension (announced 2008)
    Longer SSE registers, more instructions
If Intel didn’t extend with compatibility, its competitors would!
  Technical elegance ≠ market success

Basic x86 Registers


IA-32 Register Restrictions

Registers are not “general purpose”: note the restrictions below

[Figure: IA-32 register restrictions]

Basic x86 Addressing Modes 



Two operands per instruction

  Source/dest operand    Second source operand
  Register               Register
  Register               Immediate
  Register               Memory
  Memory                 Register
  Memory                 Immediate

Memory addressing modes
  Address in register
  Address = Rbase + displacement
  Address = Rbase + 2^scale × Rindex (scale = 0, 1, 2, or 3)
  Address = Rbase + 2^scale × Rindex + displacement
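The most general of these modes, Rbase + 2^scale × Rindex + displacement, is easy to express as a C function. This is a sketch of the arithmetic only, assuming 32-bit wraparound; the function name `effective_address` is chosen for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Effective address for the scaled-index mode with displacement.
   Passing rindex = 0 and/or displacement = 0 gives the simpler modes. */
static uint32_t effective_address(uint32_t rbase, uint32_t rindex,
                                  unsigned scale, int32_t displacement)
{
    assert(scale <= 3);   /* scale factor is 1, 2, 4, or 8 */
    return rbase + (rindex << scale) + (uint32_t)displacement;
}
```

For example, with a base of 0x1000, an index of 3, a scale of 2 (elements of 4 bytes), and a displacement of 8, the address is 0x1000 + 12 + 8 = 0x1014. This is the shape of an access to a field inside the third element of an array of structures.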

x86 Instruction Encoding 

Variable length encoding
  Postfix bytes specify addressing mode
  Prefix bytes modify operation
    Operand length, repetition, locking, …

Implementing IA-32

Complex instruction set makes implementation difficult
  Hardware translates instructions to simpler microoperations
    Simple instructions: 1–1
    Complex instructions: 1–many
  Microengine similar to RISC
  Market share makes this economically viable
Comparable performance to RISC
  Compilers avoid complex instructions

Intel Architecture

This history illustrates the impact of the “golden handcuffs” of compatibility
  “adding new features as someone might add clothing to a packed bag”
  “an architecture that is difficult to explain and impossible to love”

A dominant architecture: 80x86 

Saving grace:
  the most frequently used instructions are not too difficult to build
  compilers avoid the portions of the architecture that are slow

“what the 80x86 lacks in style is made up in quantity, making it beautiful from the right perspective”

PowerPC

Indexed addressing
  example: lw $t1,$a0+$s3    #$t1=Memory[$a0+$s3]
  What do we have to do in MIPS?
Update addressing
  update a register as part of load (for marching through arrays)
  example: lwu $t0,4($s3)    #$t0=Memory[$s3+4];$s3=$s3+4
  What do we have to do in MIPS?
Others:
  load multiple/store multiple
  a special counter register
    “bc Loop”: decrement counter, if not 0 goto loop

Fallacies 

Powerful instruction ⇒ higher performance
  Fewer instructions required
  But complex instructions are hard to implement
    May slow down all instructions, including simple ones
  Compilers are good at making fast code from simple instructions
Use assembly code for high performance
  But modern compilers are better at dealing with modern processors
  More lines of code ⇒ more errors and less productivity

Fallacies 

Backward compatibility ⇒ instruction set doesn’t change
  But they do accrete more instructions

[Figure: x86 instruction set]

Pitfalls 

Sequential words are not at sequential addresses
  Increment by 4, not by 1!
Keeping a pointer to an automatic variable after procedure returns
  e.g., passing pointer back via an argument
  Pointer becomes invalid when stack popped
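Both pitfalls have direct counterparts in C. The sketch below measures the byte distance between neighbouring `int` elements, and shows one safe alternative to handing back the address of a local variable: letting the caller own the storage. The function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Byte distance between element i and element i + 1 of an int array.
   With 4-byte ints this is 4, not 1: addresses count bytes, not words. */
static size_t byte_stride(const int *array, size_t i)
{
    assert(array != NULL);
    return (size_t)((const char *)&array[i + 1] - (const char *)&array[i]);
}

/* Safe pattern: the result is written into storage owned by the caller,
   so nothing points into this procedure's stack frame after it returns. */
static void fill_result(int *out, int value)
{
    assert(out != NULL);
    *out = value;
}
```

The unsafe variant would declare a local `int`, store the value in it, and return its address. That pointer refers to stack space that is released as soon as the procedure returns.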

Concluding Remarks 

Design principles
  1. Simplicity favors regularity
  2. Smaller is faster
  3. Make the common case fast
  4. Good design demands good compromises
Layers of software/hardware
  Compiler, assembler, hardware
MIPS: typical of RISC ISAs
  c.f. x86

Concluding Remarks 

Measure MIPS instruction executions in benchmark programs
  Consider making the common case fast
  Consider compromises

  Instruction class   MIPS examples                        SPEC2006 Int   SPEC2006 FP
  Arithmetic          add, sub, addi                       16%            48%
  Data transfer       lw, sw, lb, lbu, lh, lhu, sb, lui    35%            36%
  Logical             and, or, nor, andi, ori, sll, srl    12%            4%
  Cond. Branch        beq, bne, slt, slti, sltiu           34%            8%
  Jump                j, jr, jal                           2%             0%

Overview of MIPS

simple instructions all 32 bits wide
very structured, no unnecessary baggage
only three instruction formats

  R   op   rs   rt   rd   shamt   funct
  I   op   rs   rt   16 bit number
  J   op   26 bit address

rely on compiler to achieve performance
  what are the compiler's goals?
help compiler where we can
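To make the R and I formats concrete, here is a short C sketch that packs the fields into a 32-bit word. It assumes the standard MIPS field widths (6-bit op, 5-bit rs, rt, rd, and shamt, 6-bit funct, 16-bit immediate); the helper names `encode_r` and `encode_i` are chosen for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* R-format: op(6) rs(5) rt(5) rd(5) shamt(5) funct(6) */
static uint32_t encode_r(unsigned op, unsigned rs, unsigned rt,
                         unsigned rd, unsigned shamt, unsigned funct)
{
    assert(op < 64 && rs < 32 && rt < 32 && rd < 32 && shamt < 32 && funct < 64);
    return ((uint32_t)op << 26) | ((uint32_t)rs << 21) | ((uint32_t)rt << 16) |
           ((uint32_t)rd << 11) | ((uint32_t)shamt << 6) | (uint32_t)funct;
}

/* I-format: op(6) rs(5) rt(5) immediate(16) */
static uint32_t encode_i(unsigned op, unsigned rs, unsigned rt, int16_t imm)
{
    assert(op < 64 && rs < 32 && rt < 32);
    return ((uint32_t)op << 26) | ((uint32_t)rs << 21) | ((uint32_t)rt << 16) |
           (uint32_t)(uint16_t)imm;
}
```

With $s1, $s2, and $s3 as registers 17, 18, and 19, `add $s1, $s2, $s3` is `encode_r(0, 18, 19, 17, 0, 32)`, which gives 0x02538820, and `addi $s1, $s2, 100` is `encode_i(8, 18, 17, 100)`, which gives 0x22510064.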

Addresses in Branches and Jumps

Instructions:
  bne $t4,$t5,Label    Next instruction is at Label if $t4 ≠ $t5
  beq $t4,$t5,Label    Next instruction is at Label if $t4 = $t5
  j Label              Next instruction is at Label

Formats:
  I   op   rs   rt   16 bit number
  J   op   26 bit address

Addresses are not 32 bits. How do we handle this with load and store instructions?

Addresses in Branches

Instructions:
  bne $t4,$t5,Label    Next instruction is at Label if $t4 ≠ $t5
  beq $t4,$t5,Label    Next instruction is at Label if $t4 = $t5

Formats:
  I   op   rs   rt   16 bit number

Could specify a register (like lw and sw) and add it to address
  use Instruction Address Register (PC = program counter)
  most branches are local (principle of locality)
Jump instructions just use high order bits of PC
  address boundaries of 256 MB
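The two address calculations can be sketched in C. The sketch assumes the usual MIPS convention that the 16-bit and 26-bit fields count words rather than bytes and are applied relative to PC + 4; the function names are chosen for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* PC-relative branch: target = (PC + 4) + 4 * signed 16-bit word offset. */
static uint32_t branch_target(uint32_t pc, int16_t word_offset)
{
    assert(pc % 4 == 0);
    return pc + 4 + (uint32_t)((int32_t)word_offset * 4);
}

/* Jump: the upper 4 bits come from PC + 4, the rest from the 26-bit
   field shifted left by 2. The target therefore stays inside the
   current 256 MB region. */
static uint32_t jump_target(uint32_t pc, uint32_t addr26)
{
    assert(addr26 < (1u << 26));
    return ((pc + 4) & 0xF0000000u) | (addr26 << 2);
}
```

A branch field of 25 at PC 0 leads to address 0 + 4 + 100 = 104, and a jump field of 2500 leads to byte address 10000. Because the jump keeps the top 4 bits of the PC, 28 bits are free to vary, which is where the 256 MB boundary comes from.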

To summarize:

MIPS operands

  32 registers
    Example: $s0-$s7, $t0-$t9, $zero, $a0-$a3, $v0-$v1, $gp, $fp, $sp, $ra, $at
    Comments: Fast locations for data. In MIPS, data must be in registers to perform arithmetic. MIPS register $zero always equals 0. Register $at is reserved for the assembler to handle large constants.

  2^30 memory words
    Example: Memory[0], Memory[4], ..., Memory[4294967292]
    Comments: Accessed only by data transfer instructions. MIPS uses byte addresses, so sequential words differ by 4. Memory holds data structures, such as arrays, and spilled registers, such as those saved on procedure calls.

MIPS assembly language

  Category            Instruction              Example              Meaning                               Comments
  Arithmetic          add                      add $s1, $s2, $s3    $s1 = $s2 + $s3                       Three operands; data in registers
                      subtract                 sub $s1, $s2, $s3    $s1 = $s2 - $s3                       Three operands; data in registers
                      add immediate            addi $s1, $s2, 100   $s1 = $s2 + 100                       Used to add constants
  Data transfer       load word                lw $s1, 100($s2)     $s1 = Memory[$s2 + 100]               Word from memory to register
                      store word               sw $s1, 100($s2)     Memory[$s2 + 100] = $s1               Word from register to memory
                      load byte                lb $s1, 100($s2)     $s1 = Memory[$s2 + 100]               Byte from memory to register
                      store byte               sb $s1, 100($s2)     Memory[$s2 + 100] = $s1               Byte from register to memory
                      load upper immediate     lui $s1, 100         $s1 = 100 * 2^16                      Loads constant in upper 16 bits
  Conditional branch  branch on equal          beq $s1, $s2, 25     if ($s1 == $s2) go to PC + 4 + 100    Equal test; PC-relative branch
                      branch on not equal      bne $s1, $s2, 25     if ($s1 != $s2) go to PC + 4 + 100    Not equal test; PC-relative
                      set on less than         slt $s1, $s2, $s3    if ($s2 < $s3) $s1 = 1; else $s1 = 0  Compare less than; for beq, bne
                      set less than immediate  slti $s1, $s2, 100   if ($s2 < 100) $s1 = 1; else $s1 = 0  Compare less than constant
  Unconditional jump  jump                     j 2500               go to 10000                           Jump to target address
                      jump register            jr $ra               go to $ra                             For switch, procedure return
                      jump and link            jal 2500             $ra = PC + 4; go to 10000             For procedure call

[Figure: the five MIPS addressing modes]
  1. Immediate addressing: fields op, rs, rt, Immediate
  2. Register addressing: fields op, rs, rt, rd, ..., funct; operand is a register
  3. Base addressing: fields op, rs, rt, Address; operand is a byte, halfword, or word in memory at register + Address
  4. PC-relative addressing: fields op, rs, rt, Address; target is a word in memory at PC + Address
  5. Pseudodirect addressing: fields op, Address; target is a word in memory at Address combined with the PC

Summary

Instruction complexity is only one variable
  lower instruction count vs. higher CPI / lower clock rate
Design Principles:
  simplicity favors regularity
  smaller is faster
  good design demands compromise
  make the common case fast
Instruction set architecture
  a very important abstraction indeed!
