User Mode Execution. Computer Architecture - Overview. Supervisor Mode Execution. Processor Status Register. processor architecture

Author: Percival Howard
Computer Architecture - Overview

processor architecture
    privileged execution modes
    asynchronous exceptions (traps)

I/O architecture
    busses, controllers, devices, smart controllers
    I/O: direct, polled, mapped, DMA, interrupt driven
    sequential and random access devices
    disks and factors affecting disk I/O performance

You need to understand how these really work

User Mode Execution

able to use all of the “normal” instructions
    load and store general registers from/to memory
    arithmetic, logical, test, compare, data copying
    branches and subroutine calls

able to address some subset of memory
    what is controlled by a Memory Management Unit

not able to perform privileged operations
    I/O operations, update the MMU
    interrupt enables, enter supervisor mode

computer and I/O architecture

3/5/03 - 1


Supervisor Mode Execution

can execute privileged instructions
    able to perform I/O operations
    interrupt enable/disable/return, load PS
    instructions to change processor mode

can access privileged address spaces
    access data structures inside the OS
    access other processes' address spaces
    change and create address spaces

may have alternate registers, alternate stack

Processor Status Register

contains condition codes
    set by arithmetic/logical operations (0,+,-,ovflo)
    tested by conditional branch instructions

controls execution mode (user/supervisor)

describes which interrupts are enabled

may describe what address space to use

may control other processor features/options
    word length, endian-ness, instruction set, ...
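To make the Processor Status Register contents concrete, here is a minimal sketch in C. The 16-bit layout, the bit positions, and every name (`PS_ZERO`, `PS_SUPV`, `ps_after_add`, ...) are assumptions made up for illustration; real processors each define their own PS format. It shows condition codes being set by an arithmetic operation while the mode and interrupt-enable bits are left alone.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical 16-bit processor status (PS) layout, for illustration only. */
enum {
    PS_ZERO     = 1 << 0,   /* condition code: result was zero */
    PS_NEG      = 1 << 1,   /* condition code: result was negative */
    PS_OVFL     = 1 << 2,   /* condition code: signed overflow */
    PS_CC_MASK  = PS_ZERO | PS_NEG | PS_OVFL,
    PS_SUPV     = 1 << 3,   /* execution mode: 1 = supervisor, 0 = user */
    PS_IE_SHIFT = 4         /* bits 4..7: one enable bit per interrupt level */
};

/* Return the PS that results from a 32-bit signed add: only the
 * condition codes change; mode and interrupt enables are preserved. */
uint16_t ps_after_add(uint16_t ps, int32_t a, int32_t b)
{
    int64_t wide = (int64_t)a + (int64_t)b;   /* exact sum, cannot overflow */
    uint32_t low = (uint32_t)wide;            /* what a 32-bit ALU would keep */
    unsigned cc = 0;

    if (low == 0)                             cc |= PS_ZERO;
    if (low & 0x80000000u)                    cc |= PS_NEG;
    if (wide > INT32_MAX || wide < INT32_MIN) cc |= PS_OVFL;
    return (uint16_t)((ps & ~(unsigned)PS_CC_MASK) | cc);
}

/* Mode test, as a conditional branch or the OS might perform it. */
int ps_is_supervisor(uint16_t ps) { return (ps & PS_SUPV) != 0; }

/* Is the given interrupt level (0..3) enabled in this PS? */
int ps_interrupt_enabled(uint16_t ps, unsigned level)
{
    assert(level < 4);
    return (ps >> (PS_IE_SHIFT + level)) & 1;
}
```

For example, adding 1 to the largest 32-bit integer under this layout sets both the negative and overflow codes, which a conditional branch could then test.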

Choice of Execution Modes

computer boots up in supervisor mode
    used by bootstrap and OS to initialize the system

applications run in user mode
    OS changes to user mode before running user code
    user programs cannot do I/O, restricted address space
    they have no way to get into supervisor mode because instructions to change the PS are privileged

reentering supervisor mode is strictly controlled
    only happens in response to traps and interrupts

Asynchronous Exceptions and Handlers

most errors can be handled “in-line”
    arithmetic overflows are reflected in condition codes
    program can test for, and handle such conditions

some errors must interrupt program execution
    e.g. CPU was unable to execute this instruction
    there must be a way to inform OS if this happens

most computers accomplish this with “traps”
    a well specified list of all possible exceptions
    a means for the OS to associate handlers with each

Trap Handling

[figure]
user mode: Application Program
    ... instr; instr; instr; bad instr; instr; instr; instr ...
supervisor mode:
    TRAP vector table (PS/PC entries)
    1st level trap handler (saves registers and selects 2nd level handler)
    2nd level handler (actually deals with the problem)
    return to user mode

(Transition into Supervisor Mode)

hardware trap handling
    use trap cause to index into trap vector table for PC/PS
    load new processor status word, switch to supv mode
    push PC/PS of program that caused trap onto stack
    load new program counter (w/addr of 1st level handler)

software trap handling
    1st level handler pushes all other registers onto stack
    1st level handler gathers info, selects 2nd level handler
    2nd level handler deals with the exception condition
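The hardware and software trap-handling steps can be sketched as a small simulation in C. Everything here is an assumption for illustration: the `struct cpu` model, the three trap causes, the names `trap_enter`, `trap_return` and `take_trap`, and the 4-byte instruction length in the example handler. A real CPU does the entry sequence in hardware, and handlers are code addresses rather than C function pointers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { MODE_USER = 0, MODE_SUPV = 1 };
enum trap_cause { TRAP_ILLEGAL_INSTR, TRAP_ZERO_DIVIDE, TRAP_PAGE_FAULT, TRAP_COUNT };
enum { TRAP_STACK_DEPTH = 8 };

struct context { uint32_t pc, ps; };            /* a saved PC/PS pair */

struct cpu;
typedef int (*handler_fn)(struct cpu *, enum trap_cause, struct context *saved);

struct vector { handler_fn pc; uint32_t ps; };  /* one PC/PS entry per cause */

struct cpu {
    struct context cur;                          /* running PC/PS */
    struct context stack[TRAP_STACK_DEPTH];      /* supervisor-mode stack */
    size_t sp;
    struct vector vectors[TRAP_COUNT];           /* set up by the "OS" */
};

/* "Hardware" part of trap entry. The old PC/PS is pushed before the new
 * PS is loaded so that the interrupted mode is what gets saved. */
handler_fn trap_enter(struct cpu *cpu, enum trap_cause cause)
{
    assert((int)cause >= 0 && cause < TRAP_COUNT);
    assert(cpu->sp < TRAP_STACK_DEPTH);
    const struct vector *v = &cpu->vectors[cause];  /* index by trap cause */
    cpu->stack[cpu->sp++] = cpu->cur;               /* push interrupted PC/PS */
    cpu->cur.ps = v->ps;                            /* new PS: supervisor mode */
    return v->pc;                                   /* new PC: 1st level handler */
}

/* Privileged return: pop the (possibly modified) PC/PS and resume. */
void trap_return(struct cpu *cpu)
{
    assert(cpu->sp > 0);
    cpu->cur = cpu->stack[--cpu->sp];
}

/* Full round trip for one trap: enter, run the handler, return. */
int take_trap(struct cpu *cpu, enum trap_cause cause)
{
    handler_fn first_level = trap_enter(cpu, cause);
    int result = first_level(cpu, cause, &cpu->stack[cpu->sp - 1]);
    trap_return(cpu);
    return result;
}

/* Example handler: reports the mode it ran in and steps the saved PC past
 * the trapping instruction (assumed here to be 4 bytes long). */
int skip_instruction(struct cpu *cpu, enum trap_cause cause, struct context *saved)
{
    (void)cause;
    saved->pc += 4;
    return (int)cpu->cur.ps;
}
```

Because user code has no way to write the `vectors` array in this model, the only way into supervisor mode is through an entry the OS installed, which is the point of the vector table.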

Control of Supervisor mode transitions

all user->supervisor changes are via traps/interrupts
    it is difficult to know when these will happen

there is a designated handler for each trap/intr
    its address is stored in a trap/interrupt vector table
    the operating system sets up all of the handler vectors

ordinary programs can't access these vectors
    vectors are not in the process' address spaces

the OS controls all supervisor mode transitions
    by carefully controlling all of the trap/intr “gateways”

Dealing with the cause of a trap

some exceptions are handled by the OS
    e.g. page faults, alignment, floating point emulation
    OS simulates expected behavior and returns

some exceptions may be fatal to running task
    e.g. zero divide, illegal instruction, invalid address
    OS reflects the failure back to the running process

some exceptions may be fatal to the system
    e.g. power failure, cache parity, stack violation
    OS cleanly shuts down the affected hardware

(Returning to User Mode)

return is opposite of interrupt/trap entry
    2nd level system call handler returns to 1st level handler
    1st level handler restores all registers from stack
    use privileged return instruction to restore PC/PS
    resume user-mode execution after trapped instruction

saved registers can be changed before return
    used to set entry point for newly loaded programs
    used to deliver signals to user-mode processes
    used to set return codes from system calls

Stacking and unstacking a trap

[figure]
user mode computation
user mode stack growth
supervisor mode stack:
    user-mode PC and PS
    saved user-mode registers
    parameters to 2nd level handler
    return PC
    stack frame for 2nd level handler
    ...

Traps while in Supervisor Mode

nearly identical to traps while in user mode
    trap saves interrupted PC/PS on supervisor mode stack
    trap goes to same vector & 1st level handler
    same register saving, restoring, and return

there are very few differences
    saved PS at time of interrupt shows supervisor mode
    2nd level handler knows trap was from supervisor mode (and may consider it to be more or less severe than the same trap from user mode)

I/O architectures: busses

[figure]
main bus: control, data, address, interrupts
attached: CPU, Memory, Controller, Controller
Device

Memory type busses

came from back-plane memory-to-CPU interconnects
    a few “bus masters”, and many “slave devices”
    arbitrated multi-cycle bus transactions
        request, grant, address, respond, transfer, ack
    operations: read, write, read/modify/write, interrupt

originally most busses were of this sort
    ISA, EISA, PCMCIA, PCI, cPCI, video busses, ...
    distinguished by form-factor, speed, data width, ...
    newer busses support bridging, hot-swap, self-identifying

Network type busses

evolved as peripheral device interconnects
    SCSI, USB, 1394 (firewire), Infiniband, ...
    cables and connectors rather than back-planes
    designed for easy and dynamic extensibility
    originally slower than back-plane, but no longer

much more similar to a general purpose network
    packet switched, topology, routing, node identity
    may be master/slave (USB) or peer-to-peer (1394)
    may be implemented by controller or by host

I/O architectures: devices & controllers

I/O devices
    peripheral devices that interface between the computer and other media (disks, tapes, networks, serial ports, keyboards, displays, pointing devices, etc.)

device controllers connect a device to a bus
    communicate control operations to device
    relay status information back to the bus
    manage DMA transfers for the device
    generate interrupts for the device

a controller is usually specific to a device and a bus

mechanisms: device controller registers

device controllers export registers to the bus
    registers in controller can be addressed from bus
    writing into registers controls device or sends data
    reading from registers obtains data/status

register access method varies with CPU type
    may require special instructions (e.g. x86 IN/OUT)
        privileged instructions restricted to supervisor mode
    may be mapped onto bus like memory
        accessed with normal (load/store) instructions
        I/O address space not accessible to most processes

A simple device: 16550 UART

offset  register                    contents
0       Data Register
1       Interrupt Enable Register   x x x x MDM STS XMT RCV
2       Interrupt Register          x x x x MDM STS XMT RCV
3       Line Control Register       speed BRK PARITY STOP WORDLEN
4       Modem Control Register      DTR RTS
5       Line Status Register        RCV EMT XMT BRK FER PER OVR RER
6       Modem Status Register       DCD RI DSR CTS

A 16550 presents seven 8-bit registers to the bus.

All communication between the bus and the device (send data, receive data, status and control) is performed by reading from, and writing to these registers.

(16550 UART registers)

0: data - read received byte, write to transmit a byte (or LSB of speed divisor when speed set is enabled)
1: interrupt enables - for transmit done, data received, cd/ring (or MSB of speed divisor when speed set is enabled)
2: interrupt registers - currently pending interrupt conditions
3: line control register - character length, parity and speed
4: modem control register - control signals sent by computer
5: line status register - xmt/rcv completion and error conditions
6: modem status registers - received modem control signals
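Since all communication with the 16550 goes through its registers, a driver can be exercised against a simulated register file. The sketch below is an assumption-laden illustration in C: the offsets follow the register list above, but the `TR_DONE` and `RX_READY` bit values, the `struct sim_uart` model, and the `sim_inb`/`sim_outb` helpers are all hypothetical stand-ins for real port I/O.

```c
#include <assert.h>
#include <stdint.h>

/* Register offsets as listed for the 16550 above. */
enum uart_reg {
    UART_DATA = 0, UART_IER = 1, UART_IIR = 2, UART_LCR = 3,
    UART_MCR = 4, UART_LSR = 5, UART_MSR = 6, UART_NREGS = 7
};

/* Assumed line-status bit masks (hypothetical values). */
enum { RX_READY = 0x01, TR_DONE = 0x20 };

/* Simulated device: a register file plus a countdown of "busy" polls. */
struct sim_uart {
    uint8_t  regs[UART_NREGS];
    unsigned busy_polls;     /* LSR reads remaining before TR_DONE is set */
    unsigned lsr_reads;      /* how many times the driver polled */
};

/* Stand-in for a port read; reading LSR also advances the simulation. */
uint8_t sim_inb(struct sim_uart *u, enum uart_reg r)
{
    if (r == UART_LSR) {
        u->lsr_reads++;
        if (u->busy_polls > 0)
            u->busy_polls--;
        else
            u->regs[UART_LSR] |= TR_DONE;       /* transmitter has drained */
    }
    return u->regs[r];
}

/* Stand-in for a port write; writing data makes the transmitter busy. */
void sim_outb(struct sim_uart *u, enum uart_reg r, uint8_t value)
{
    u->regs[r] = value;
    if (r == UART_DATA)
        u->regs[UART_LSR] &= (uint8_t)~TR_DONE;
}

/* Polled write: spin on the line status register, then store the byte. */
void sim_uart_write_char(struct sim_uart *u, char c)
{
    while ((sim_inb(u, UART_LSR) & TR_DONE) == 0)
        ;                                        /* busy-wait */
    sim_outb(u, UART_DATA, (uint8_t)c);
}
```

With `busy_polls` set to 3, the driver reads the status register four times before the byte is accepted, which makes the cost of busy-waiting visible as a simple counter.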

(mechanisms: direct polled I/O)

all transfers happen under direct control of CPU
    CPU transfers data to/from device controller registers
    transfers are typically one byte or word at a time
    may be accomplished with normal or I/O instructions

CPU polls device until it is ready for data transfer
    received data is available to be read
    previously initiated write operations have been completed

advantages
    very easy to implement (both hardware and software)

Scenario: direct I/O with polling

/* inb(), outb() and the UART_LSR, UART_DATA, TR_DONE and RX_READY
   constants are assumed to come from the platform's port I/O and
   UART definitions */

void uart_write_char( char c ) {
    /* busy-wait until the previous transmit has completed */
    while( (inb(UART_LSR) & TR_DONE) == 0 )
        ;
    outb( UART_DATA, c );
}

char uart_read_char() {
    /* busy-wait until a received byte is available */
    while( (inb(UART_LSR) & RX_READY) == 0 )
        ;
    return( inb(UART_DATA) );
}

performance of direct I/O

CPU intensive data transfers
    each byte or word transferred requires multiple instructions

CPU is wasted while awaiting completion of transfers
    busy-wait polling ties up CPU until I/O is completed

devices are idle while we are running other tasks
    I/O can only happen when an I/O task is running

how can problems be dealt with
    let controller transfer data without attention from CPU
    let application block pending I/O completion
    let controller interrupt CPU when I/O is finally done

Direct Memory Access - I/O w/o the CPU

bus facilitates data flow in all directions between CPU, memory, and device controllers

CPU can be the bus-master
    initiating data transfers with memory or device controllers

device controllers can also master the bus
    CPU instructs controller what transfer is desired
        what data to move to/from what part of memory
    device controller performs transfer w/o CPU assistance
    device controller generates interrupt at end of transfer

completion interrupts - waking up CPU

device controllers, busses, and interrupts
    busses have ability to send interrupts to the CPU
    devices signal controller when they are done/ready
    when device is done, controller asserts interrupt on bus

CPUs and interrupts
    interrupts look very much like traps
        traps come from CPU, interrupts are caused externally
    unlike traps, interrupts can be selectively enabled/disabled
        a device can be told it can or cannot generate interrupts
        special instructions can enable/disable interrupts to CPU

Interrupt Handling

[figure]
user mode: Application Program
    ... instr; instr; instr; instr; instr; instr ...
supervisor mode:
    Interrupt vector table (PS/PC entries)
    1st level interrupt handler
    list of device interrupt handlers
    2nd level handler (device driver interrupt routine)
    return to user mode

interrupts vs. traps

traps are caused by an instantaneous condition
    they are triggered when something happens
    there is (usually) no persistent state that must be cleared

interrupts are caused by a device being in some state
    they are triggered when the device enters a particular state
    they will continue to be asserted until device state changes

once delivered, an interrupt must be disabled
    CPU must ignore continuing request for that interrupt
    cause must be cleared, and interrupt acknowledged
    the device is changed from DONE to BUSY again

DMA read w/completion interrupts

/* requesting side */
lock(devlock);                  /* lock device */

/* program the DMA request */
dp->loc = req_loc;
dp->adr = req_adr;
dp->cnt = req_cnt;
dp->op = READ;

/* re-enable and await completion */
dp->ctrl = IENABLE | GO;
await(devcompletion);

/* request has completed */
unlock(devlock);                /* release device */

/* interrupt side */
dev_intr_handler() {
    save = intr_enable(DISABLE);

    /* turn off device ability to interrupt */
    dp->ctrl = IDISABLE;

    /* update data read count */
    req_xfr = req_cnt - dp->cnt;

    /* wake up the requester */
    wakeup(devcompletion);

    intr_enable( save );

    /* tell intr dispatcher we're done */
    return( ACKNOWLEDGE_INTERUPT );
}
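The DMA read sequence can be tried out with a small single-threaded simulation in C. This is only a sketch under stated assumptions: `struct dma_dev`, the `CTRL_*` bits, and the functions `dma_controller_run`, `dma_intr_handler` and `dma_read` are invented for illustration, and the controller runs as an ordinary function call where real hardware would run concurrently with the CPU. Locking and sleeping (`lock`, `await`, `wakeup`) are left out.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum { OP_READ = 1 };
enum { CTRL_IENABLE = 0x1, CTRL_GO = 0x2 };

struct dma_regs {               /* hypothetical controller registers */
    size_t         loc;         /* location on the device (byte offset) */
    unsigned char *adr;         /* memory address to transfer into */
    size_t         cnt;         /* bytes still to transfer */
    int            op;
    unsigned       ctrl;
};

struct dma_dev {
    struct dma_regs      regs;
    const unsigned char *media;       /* simulated device contents */
    size_t               media_len;
    int                  intr_pending;
};

/* Controller side: move the data with no CPU involvement, leave the
 * residual count in cnt, and assert the completion interrupt. */
void dma_controller_run(struct dma_dev *d)
{
    struct dma_regs *r = &d->regs;
    if (!(r->ctrl & CTRL_GO) || r->op != OP_READ)
        return;
    size_t avail = r->loc < d->media_len ? d->media_len - r->loc : 0;
    size_t n = r->cnt < avail ? r->cnt : avail;
    memcpy(r->adr, d->media + r->loc, n);
    r->cnt -= n;
    r->ctrl &= ~(unsigned)CTRL_GO;
    if (r->ctrl & CTRL_IENABLE)
        d->intr_pending = 1;
}

/* Driver side: the interrupt handler disables further interrupts and
 * computes bytes transferred as requested count minus residual count. */
size_t dma_intr_handler(struct dma_dev *d, size_t req_cnt)
{
    d->regs.ctrl = 0;
    d->intr_pending = 0;
    return req_cnt - d->regs.cnt;
}

/* Requester: program the transfer, start it, then handle completion. */
size_t dma_read(struct dma_dev *d, size_t loc, unsigned char *buf, size_t cnt)
{
    d->regs.loc = loc;
    d->regs.adr = buf;
    d->regs.cnt = cnt;
    d->regs.op  = OP_READ;
    d->regs.ctrl = CTRL_IENABLE | CTRL_GO;
    dma_controller_run(d);      /* stands in for the hardware doing the work */
    assert(d->intr_pending);
    return dma_intr_handler(d, cnt);
}
```

A read that runs off the end of the simulated media returns a short count, which is why the handler subtracts the residual `cnt` rather than assuming the whole request completed.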

(device I/O with completion interrupts)

requesting process checks to see if device is busy
    if idle, start the I/O operation, and await its completion
    if busy, wait for the device to become idle

I/O interrupt handler
    gathers completion information from the device
    posts completion awakening requester

when current device owner finishes using the device
    wake up the next requester

we'll talk about waiting and waking up in two weeks

mechanisms: memory mapped I/O

DMA may not be the easiest way to do I/O
    consider a video game display adaptor
    continuous updates to isolated areas of the screen

implement as a bit-mapped display adaptor
    1MB display controller sits on the CPU memory bus
    each byte of display memory corresponds to one pixel
    application uses ordinary stores to update display

low overhead per update, no interrupts to service

relatively easy to program
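A bit-mapped display of this kind can be sketched in C as a byte array written with ordinary stores. The 1024 x 1024 geometry is an assumption chosen so that one byte per pixel comes to 1MB, and the `display` array stands in for the adaptor's memory; on real hardware the pointer would refer to the device's bus address. The names `put_pixel` and `fill_rect` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { FB_WIDTH = 1024, FB_HEIGHT = 1024 };   /* assumed: 1024 x 1024 x 1 byte = 1MB */

/* Stand-in for the display memory. volatile keeps the compiler from
 * dropping or merging stores that the device is meant to observe. */
static volatile uint8_t display[FB_WIDTH * FB_HEIGHT];

/* One ordinary store updates one pixel: no setup, no completion interrupt. */
void put_pixel(volatile uint8_t *fb, unsigned x, unsigned y, uint8_t color)
{
    assert(x < FB_WIDTH && y < FB_HEIGHT);
    fb[(size_t)y * FB_WIDTH + x] = color;
}

/* Update a small, isolated area of the screen one pixel at a time. */
void fill_rect(volatile uint8_t *fb, unsigned x0, unsigned y0,
               unsigned w, unsigned h, uint8_t color)
{
    for (unsigned y = y0; y < y0 + h; y++)
        for (unsigned x = x0; x < x0 + w; x++)
            put_pixel(fb, x, y, color);
}
```

Each pixel costs one CPU store, which is cheap for small scattered updates but means the CPU touches every byte, the trade-off against DMA.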

trade-off: memory mapped vs. DMA

DMA performs large transfers efficiently
    better utilization of both the devices and the CPU
        device doesn't have to wait for CPU to do transfers
    but there is considerable per transfer overhead
        setting up the operation, processing completion interrupt

memory-mapped I/O has no start/finish overhead
    but every byte is transferred by a CPU instruction

DMA better for occasional large transfers

memory-mapped better for frequent small transfers

memory-mapped devices are more difficult to share

Smart Device Controller

[figure]
device driver
    I/O instructions
    normal instructions
    I/O completion interrupts
device controller
    control registers (on bus): basic status, basic control, buffer pointers (accessed through bus)
    shared buffers (in memory) (accessed through DMA)

(I/O Mechanisms: smart controllers)

Smarter controllers can improve on basic DMA

they can queue multiple input/output requests
    when one finishes, automatically start next one
    reduce completion/start-up delays
    eliminate need for CPU to service interrupts

they can relieve CPU of other I/O responsibilities
    request scheduling to improve performance
    they can do automatic error handling & retries

they can better hide the details of underlying devices

Random vs. Sequential Access

Sequential access devices
    byte/block N must be read before byte/block N+1
    may be read/write once, or may be rewindable
    examples: magnetic tape, printer, keyboard

Random access devices
    possible to seek directly to any desired byte/block
    seeks may or may not be instantaneous
    examples: memory, magnetic disk, CD, graphics adaptor

They are used very differently

random access devices: disks

random access devices are much more interesting
    usage, performance, and scheduling techniques

key time sharing services depend on disk I/O
    program loading, file I/O, paging
    disk performance drives timesharing performance

disk I/O operations are subject to overhead
    higher overhead means fewer operations/second
    careful scheduling can reduce overhead
    clever scheduling can improve throughput and delay

Disk drive geometry

spindle
    a mounted assembly of circular platters

head assembly
    read/write head per surface, all moving in unison

track
    ring of data readable by one head in one position

cylinder
    corresponding tracks on all platters

sector
    logical records written within tracks

disk address =

Disk Drive - Physical

[figure labels: Spindle, Motor, 5 platters, 10 surfaces, 10 heads, head positioning assembly]

Disk Drive - Logical

[figure labels: Sectors, Track, Cylinder, platter/surface 0 1 ... 8 9]

Disk Drive Performance

heads           10                      platters        5
cylinders       17,000                  tracks/inch     18,000
sectors/track   400                     bytes/sector    512
RPM             7200                    speed           200Mb/sec
seek time       2-15ms (average 9ms)
latency         0-8ms (average 4ms)

time to read one 8,000 byte block

              seek    rotate   transfer   total
best case     0ms     0ms      400µs      400µs
worst case    15ms    8ms      400µs      23.4ms (58X)
average       9ms     4ms      400µs      13.4ms (33X)

Optimizing disk performance

don't start I/O until disk is on-cyl/near sector
    I/O ties up the controller, locking out other operations
    other drives seek while one drive is doing I/O

minimize head motion
    do all possible reads in current cylinder before moving
    make minimum number of trips in small increments

encourage efficient data requests
    have lots of requests to choose from
    encourage cylinder locality
    encourage largest possible block sizes
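How much head motion a scheduler saves can be measured directly. The sketch below, in C, computes total head travel for three policies over one request queue. It uses the head-travel example from this lecture (head at cylinder 76, requests for 124, 17, 269, 201, 29, 137, 12) as test data. The function names, the `MAX_REQS` limit, and the choice to sweep upward first in the scan variant are assumptions for illustration, not a definitive scheduler.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

enum { MAX_REQS = 64 };

static unsigned distance(int a, int b) { return (unsigned)abs(a - b); }

/* First come first served: visit requests in arrival order. */
unsigned travel_fcfs(int head, const int *req, size_t n)
{
    unsigned total = 0;
    for (size_t i = 0; i < n; i++) {
        total += distance(head, req[i]);
        head = req[i];
    }
    return total;
}

/* Shortest seek first: always visit the closest pending request. */
unsigned travel_ssf(int head, const int *req, size_t n)
{
    int pending[MAX_REQS];
    unsigned total = 0;
    assert(n <= MAX_REQS);
    memcpy(pending, req, n * sizeof *req);
    while (n > 0) {
        size_t best = 0;
        for (size_t i = 1; i < n; i++)
            if (distance(head, pending[i]) < distance(head, pending[best]))
                best = i;
        total += distance(head, pending[best]);
        head = pending[best];
        pending[best] = pending[--n];   /* remove the serviced request */
    }
    return total;
}

static int cmp_int(const void *a, const void *b)
{
    int x = *(const int *)a, y = *(const int *)b;
    return (x > y) - (x < y);
}

/* Scan/look: sweep upward servicing requests, then reverse and sweep down. */
unsigned travel_scan_up(int head, const int *req, size_t n)
{
    int sorted[MAX_REQS];
    const int start = head;             /* partition point for the two sweeps */
    unsigned total = 0;
    assert(n <= MAX_REQS);
    memcpy(sorted, req, n * sizeof *req);
    qsort(sorted, n, sizeof *sorted, cmp_int);
    for (size_t i = 0; i < n; i++)      /* upward sweep: cylinders >= start */
        if (sorted[i] >= start) { total += distance(head, sorted[i]); head = sorted[i]; }
    for (size_t i = n; i-- > 0; )       /* downward sweep: cylinders < start */
        if (sorted[i] < start) { total += distance(head, sorted[i]); head = sorted[i]; }
    return total;
}
```

On that queue the three functions return 880, 321 and 450 cylinders, matching the totals in the head-travel comparison. Shortest seek first wins here, but it can starve distant requests, which the elevator-style sweep avoids.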

Head Travel under various algorithms

(head starts at cylinder 76)

First Come First Served
    visit order:  124   17  269  201   29  137   12
    head travel:   48  107  252   68  172  108  125    Tot=880

Shortest Seek First
    visit order:   29   17   12  124  137  201  269
    head travel:   47   12    5  112   13   64   68    Tot=321

Scan/look (elevator algorithm)
    visit order:  124  137  201  269   29   17   12
    head travel:   48   13   64   68  240   12    5    Tot=450

key points

supervisor mode execution, privileged instructions
    trap and interrupt handling
    save/restore, vectoring 1st and 2nd level handlers

busses, devices, controllers, interconnections

I/O mechanisms, what they are, how they work
    polled I/O, direct I/O, memory mapped I/O, DMA
    interrupt driven I/O, smart controllers

random access devices
    disk geometry, disk performance, disk scheduling

For the next lecture

read sections 6-6.3 (see Greek to English dictionary regarding figure 6-3)

there will be a quiz on the reading

topics for the next lecture
    user view of processes
    process address spaces
    object modules, load modules, linkage editing
    procedure calls, stack frames, system calls, signals

Channel Controllers - I/O co-processors

channels sit between CPU and I/O devices
    think of them as extremely smart busses
    they include highly specialized CPUs

they execute channel I/O programs
    instructions to read, write and control devices
    instructions to generate progress interrupts
    command chaining
    data chaining

once started, I/O programs execute w/o CPU attention

Typical Channel Architecture

[figure]
CPU
Main bus
    Channel Controller 0x1??
        Device Controller 0x11?
            Device 0x110
            ...
            Device 0x11F
        ...
        Device Controller 0x1F?
    Channel Controller 0x2??
    ...

all channels, controllers and devices have "Geographic" addresses

Typical Channel Program (both programs located in main memory)

Main CPU
          ...
          SIO 0x101, iopgm
          ...
    intr: TIO 0x101
          ...

Channel Controller
    iopgm SEEK cyl=1020, hd=5, rec=10
          READ buf=xxx, cnt=4096
          READX buf=yyy, cnt=4096, intr
          TIC next
    next  SEEK cyl=1050, hd=0, rec=2
          WRITE buf=zzz, cnt=8192, intr
          END intr

(note, channel can concurrently execute one program per controller)
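The way a channel works through such a program without CPU attention can be sketched as a tiny interpreter in C. This is an illustration under stated assumptions, not the behavior of any real channel: the `struct ch_cmd` encoding, the `channel_run` function, and the treatment of READX as an ordinary read and of TIC as a jump to another command are all hypothetical, and no data is actually moved.

```c
#include <assert.h>
#include <stddef.h>

enum ch_op { CH_SEEK, CH_READ, CH_WRITE, CH_TIC, CH_END };

struct ch_cmd {
    enum ch_op op;
    unsigned   cnt;      /* byte count for READ/WRITE */
    int        intr;     /* raise a progress interrupt after this command */
    size_t     target;   /* for TIC: index of the command to continue at */
};

struct ch_stats { unsigned commands, bytes_read, bytes_written, interrupts; };

/* Run a channel program from index 0 until END, with no "CPU" involvement.
 * max_steps guards against a TIC loop that never reaches END. */
struct ch_stats channel_run(const struct ch_cmd *pgm, size_t len, unsigned max_steps)
{
    struct ch_stats s = {0, 0, 0, 0};
    size_t pc = 0;

    while (pc < len && s.commands < max_steps) {
        const struct ch_cmd *c = &pgm[pc];
        s.commands++;
        if (c->intr)
            s.interrupts++;                      /* progress interrupt to CPU */
        switch (c->op) {
        case CH_READ:  s.bytes_read    += c->cnt; pc++; break;
        case CH_WRITE: s.bytes_written += c->cnt; pc++; break;
        case CH_SEEK:  pc++; break;
        case CH_TIC:   pc = c->target; break;    /* continue at another command */
        case CH_END:   return s;
        }
    }
    return s;
}
```

Encoding the sample program above as seven commands, the interpreter executes all seven and raises two interrupts, one after the second read and one after the write, so the CPU is disturbed twice for 16KB of I/O.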
