Input/Output Systems
COMP 251 Computer Organization and Architecture Fall 2009

Input / Output Systems
CPU and memory system performance are only part of the picture. Consider the following scenario:
Your system’s a dog. You need to improve its performance. On average your system spends about 75% of its time using the CPU and 25% of its time doing I/O. There are several options available:
A new multiprocessor upgrade:
• 60% improvement in CPU performance for $11,000
A new I/O system:
• 55% improvement in I/O performance for $4,500
Which upgrade gives you the most bang for your buck?
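One common way to evaluate the scenario is Amdahl's Law, which gives the overall speedup when only part of the workload gets faster. The sketch below is one possible reading, not the only one: it assumes an "N% improvement" means that component runs (1 + N/100) times faster, and the function and variable names are illustrative.

```python
def overall_speedup(fraction, component_speedup):
    """Amdahl's Law: whole-system speedup when one component gets faster.

    fraction          -- share of total time spent in the improved component
    component_speedup -- how many times faster that component becomes
    """
    return 1.0 / ((1.0 - fraction) + fraction / component_speedup)

# Assumption: "60% improvement" is read as 1.60x faster, "55%" as 1.55x faster.
cpu_speedup = overall_speedup(0.75, 1.60)   # CPU accounts for 75% of the time
io_speedup = overall_speedup(0.25, 1.55)    # I/O accounts for 25% of the time

# Dollars paid for each 1% of overall improvement.
cpu_cost_per_percent = 11_000 / ((cpu_speedup - 1.0) * 100)
io_cost_per_percent = 4_500 / ((io_speedup - 1.0) * 100)

print(f"CPU upgrade: {cpu_speedup:.3f}x overall, ${cpu_cost_per_percent:.0f} per 1%")
print(f"I/O upgrade: {io_speedup:.3f}x overall, ${io_cost_per_percent:.0f} per 1%")
```

Under these assumptions the CPU upgrade gives roughly a 1.39x overall speedup and the I/O upgrade roughly 1.10x, so the more expensive CPU upgrade still costs less per percent of improvement.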
Input / Output Systems
The I/O System: A subsystem of components that transfer data between the core system and external devices.
Core System:
• CPU / Cache / Main Memory
External Devices (Peripherals, I/O devices):
• Disk Drives
• Graphics Card
• Network Card
• Keyboard / Mouse
• Etc…

I/O System Concepts
Data Transmission: The ways in which data can be transferred between devices in a system.
• Serial
• Parallel
Device Type: Indicates the way that a device physically reads/writes data.
• Character devices
• Block devices
Serial Communication
Serial: a single bit of data is transmitted at a time.
Requires 2 wires + control lines.
Each bit is transmitted as a voltage differential (Vcc typically 5 volts):
• 1: Wire 1 = +Vcc, Wire 2 = -Vcc
• 0: Wire 1 = -Vcc, Wire 2 = +Vcc
[Figure: Sender and Receiver connected by wire1 and wire2, with signal levels +Vcc, Ground and -Vcc, carrying the bits 1 0 0 1 0 1 1; "newest" and "oldest" mark the ends of the bit stream]
Examples:
• PC Serial Port (RS232/332)
• USB: 12 Mb/s (1.1), 480 Mb/s (2.0)
• Fire Wire (IEEE 1394): 400 or 800 Mb/s
• Ethernet: 10 or 100 or 1000 Mb/s

Parallel Communication
Parallel: many bits of data are transmitted simultaneously.
Requires 1 wire for each bit of data.
Each bit is transmitted as +Vcc (1) or Ground (0).
Electrical interference causes signals to degrade with distance.
High speed parallel communications are typically limited to a few feet.
Examples:
• ATA / IDE: 100 MB/s
• SCSI: 80 MB/s
• PC Printer Port
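The differential encoding used for serial communication (1 is Wire 1 = +Vcc and Wire 2 = -Vcc, 0 is the reverse) can be sketched as follows. This is a toy model, not a description of any real transceiver: the function names and the 1.5 V interference value are made up for illustration. It shows why reading the difference between the two wires is robust when the same disturbance hits both.

```python
VCC = 5.0  # volts; the slides give 5 V as a typical value

def encode_bit(bit):
    """Return the (wire1, wire2) voltages for one bit."""
    return (+VCC, -VCC) if bit == 1 else (-VCC, +VCC)

def decode_pair(wire1, wire2):
    """Recover the bit from the sign of the voltage difference."""
    return 1 if wire1 - wire2 > 0 else 0

bits = [1, 0, 0, 1, 0, 1, 1]
signal = [encode_bit(b) for b in bits]

# Hypothetical interference: the same offset is added to both wires.
noisy = [(w1 + 1.5, w2 + 1.5) for w1, w2 in signal]

# The offset cancels in wire1 - wire2, so the bits come back unchanged.
decoded = [decode_pair(w1, w2) for w1, w2 in noisy]
print(decoded)
```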
Character Devices
Character devices read / write a single byte (or word) of data at a time.
Examples:
• Main Memory
• Keyboard / Mouse
• Network Interface Card
• USB / Flash Drive?

Block Devices
Block devices read / write a collection of bytes (or words) at a time.
Example:
• Disk drive
• CD / DVD ROM
• USB / Flash Drive?

Don’t Try This At Home
What do you get if you stuff your computer’s disk drive with herbs? A thyme machine

Physical Devices
Physical devices are often complicated electromechanical machines. Each brand of device has its own peculiarities. E.g.
• Disk drive
• Monitor: CRT / LCD
• Pointing Device: Mouse / touch pad / track ball
I / O System Components
The main components in an I/O System are:
• System Bus
• BIOS / Device Drivers
• Interface cards (a.k.a. adapters)
• Physical I/O Devices
Interface Card
An Interface Card provides a uniform programmatic interface to a physical device:
• All interface cards for a type of device (graphics, disk drive, etc.) support a standard basic set of instructions for interacting with that type of device.
• Cards from specific manufacturers typically also support enhanced or expanded instructions that are specific to that manufacturer’s device.
Instructions are issued by writing into an instruction register on the interface card. Other registers on the card get loaded with parameters for the instruction or are read to obtain the results of an instruction.
Basic Interface Card Operations
Examples:
• Disk Interface:
  • Read block 22
  • Write block 74
• Graphics Interface:
  • Color the pixel at location (20,30) using the color light blue.
  • Place the character ‘A’ at location (10,15)
• Keyboard Interface:
  • Read the last keystroke.
• Mouse:
  • Read the last mouse action (left click/right click etc…).

BIOS: Basic I/O System
The BIOS contains a set of machine language subroutines for issuing basic instructions to interface cards.
• Programs can call these sub-routines to carry out basic input/output operations.
• Hides the complexity of reading and writing device registers behind the standard procedure calling interface.
BIOS also contains routines for editing the basic configuration of the computer and for bootstrapping the operating system.
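The relationship between interface card registers and BIOS subroutines can be illustrated with a small simulation. Everything here is hypothetical: the register names, the opcodes, and the `bios_read_block` / `bios_write_block` helpers are invented for the sketch and do not correspond to any real card or BIOS. The point is only that the card is driven by loading parameter registers and then writing an instruction register, and that a BIOS-style routine hides those steps behind an ordinary procedure call.

```python
class DiskInterfaceCard:
    """Toy model of a disk interface card (register names and opcodes are hypothetical)."""
    READ_BLOCK, WRITE_BLOCK = 1, 2

    def __init__(self, num_blocks, block_size=4):
        self.blocks = [bytes(block_size) for _ in range(num_blocks)]
        self.block_reg = 0     # parameter register: which block to access
        self.data_reg = b""    # parameter / result register: block contents
        self.status_reg = 0    # result register: 0 = OK, 1 = error

    def write_instruction_register(self, opcode):
        """Writing an opcode here is what makes the card carry out the operation."""
        if not 0 <= self.block_reg < len(self.blocks):
            self.status_reg = 1
        elif opcode == self.READ_BLOCK:
            self.data_reg = self.blocks[self.block_reg]
            self.status_reg = 0
        elif opcode == self.WRITE_BLOCK:
            self.blocks[self.block_reg] = self.data_reg
            self.status_reg = 0
        else:
            self.status_reg = 1


def bios_write_block(card, block_number, data):
    """BIOS-style subroutine: load the parameter registers, then issue the instruction."""
    card.block_reg = block_number
    card.data_reg = data
    card.write_instruction_register(DiskInterfaceCard.WRITE_BLOCK)
    if card.status_reg != 0:
        raise IOError(f"write of block {block_number} failed")


def bios_read_block(card, block_number):
    """BIOS-style subroutine: issue the instruction, then read the result register."""
    card.block_reg = block_number
    card.write_instruction_register(DiskInterfaceCard.READ_BLOCK)
    if card.status_reg != 0:
        raise IOError(f"read of block {block_number} failed")
    return card.data_reg


card = DiskInterfaceCard(num_blocks=100)
bios_write_block(card, 74, b"DATA")   # "Write block 74"
block = bios_read_block(card, 74)     # read it back
print(block)
```

A program using the two helper functions never touches a register directly, which is the benefit the BIOS provides.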
Bootstrapping
Bootstrapping is the process by which the computer loads its operating system when it is turned on.
Typical bootstrap process:
• PC (program counter) is initialized to the address of a small program in BIOS, the bootstrap routine.
• Bootstrap routine loads the first sector of the boot disk into a known memory location.
  • The sector contains a small program (~512 bytes) - the primary bootstrap loader.
• Bootstrap routine JUMPs to the first instruction of the primary bootstrap loader.
• Primary bootstrap loader reads a larger, more powerful program - the secondary bootstrap loader - from a specified location on disk.
Device Drivers
Device Driver: Used by the operating system, in place of BIOS routines, to interact with devices.
Device drivers can utilize the enhanced or expanded instruction set provided by the specific physical device that is present.
• Take advantage of additional features: E.g. Graphics cards with 2D or 3D support.
• Use the device more efficiently: E.g. Schedule disk read/write operations.
Methods for Performing I/O
There are four common methods that are used by BIOS routines and device drivers to perform I/O operations. Each successive method is designed to further decrease the involvement of the CPU in the I/O operation.
• Programmed I/O
• Interrupt Driven I/O
• Direct Memory Access (DMA)
• Channel I/O
Programmed I/O
In programmed I/O the CPU explicitly controls all aspects of the input and output of the data.
Disk Read Example:
• CPU requests the first block of data of the file from the disk.
• CPU waits for disk drive to read the block.
• CPU transfers block of data to main memory word by word.
• Repeat for each block of data in the file.
Used for:
• Reading and writing main memory. E.g. Copying one array into another array.
• Dedicated devices: ATM, Alarm System (polling)
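The programmed I/O disk read can be sketched as a busy-wait loop. The `SlowDisk` class below is a made-up stand-in for a real drive (it simply becomes ready after a fixed number of status checks), so treat this as an illustration of the control flow only.

```python
class SlowDisk:
    """Toy disk: a requested block becomes ready after a few status polls (hypothetical)."""
    def __init__(self, blocks, polls_until_ready=3):
        self.blocks = blocks
        self.polls_until_ready = polls_until_ready
        self._countdown = 0
        self._buffer = []

    def request(self, n):
        """Start reading block n into the disk's buffer."""
        self._countdown = self.polls_until_ready
        self._buffer = list(self.blocks[n])

    def ready(self):
        """Status check; returns True once the block is in the buffer."""
        self._countdown -= 1
        return self._countdown <= 0

    def read_word(self, i):
        return self._buffer[i]


def programmed_read(disk, num_blocks, words_per_block):
    memory, wasted_polls = [], 0
    for n in range(num_blocks):
        disk.request(n)                    # CPU requests the block
        while not disk.ready():            # CPU waits; no other work gets done
            wasted_polls += 1
        for i in range(words_per_block):   # CPU copies the block word by word
            memory.append(disk.read_word(i))
    return memory, wasted_polls


disk = SlowDisk(blocks=[[1, 2], [3, 4], [5, 6]])
memory, wasted_polls = programmed_read(disk, num_blocks=3, words_per_block=2)
print(memory, wasted_polls)
```

The `wasted_polls` counter stands for CPU time spent doing nothing but checking the device, which is the cost the other I/O methods try to reduce.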
Interrupt Driven I/O
In interrupt driven I/O the CPU is free to perform other work while a device is reading/writing the data.
Disk Read Example:
• CPU requests the first block of the file.
• Disk drive reads the first block into a buffer. Meanwhile the CPU is free to do other work.
• Disk generates an interrupt when the block has been read into the buffer.
• CPU stops (interrupts) what it is doing and transfers the data in the buffer to main memory word by word.
• Repeat for each block of the file.
Interrupt Driven I/O
What tradeoff are we making?
• Speed vs. Cost/Complexity
When is it most useful?
• When the device is significantly slower than the CPU (e.g. disk drive / network / printer).
• When the availability of data is initiated by an external entity (e.g. keyboard / mouse / network).
Cycle Stealing
Cycle Stealing: Occurs when both the CPU and the DMA controller need to access main memory.
• Only one device at a time can access main memory.
• The DMA controller “steals” memory cycles from the CPU.
  • Priority for main memory access is given to the DMA controller.
  • Prevents buffer overflows and lost data.
  • E.g. data transfer over the network interface.
Cycle stealing is the reason for sluggish system performance during periods of high I/O (e.g. printing, downloading).
• Cache helps
Double Buffering
The performance of interrupt driven I/O can be improved by using two buffers.
• Only one device, the disk or memory, can use a buffer at a time.
• Using two buffers allows both the disk and the CPU to be busy simultaneously.
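A rough timing model shows where the gain from double buffering comes from. The model is a simplification with assumed, made-up numbers: `fill` is the time the disk needs to load one buffer and `drain` is the time the CPU needs to copy it to memory. With one buffer the two steps alternate; with two buffers the disk fills one while the CPU drains the other.

```python
def single_buffer_time(n_blocks, fill, drain):
    """One buffer: the disk and the CPU must take turns for every block."""
    return n_blocks * (fill + drain)

def double_buffer_time(n_blocks, fill, drain):
    """Two buffers: after the first fill, filling and draining overlap.

    The slower of the two activities sets the pace of each overlapped step,
    and the last block still has to be drained at the end.
    """
    return fill + (n_blocks - 1) * max(fill, drain) + drain

# Hypothetical example: 10 blocks, 4 time units to fill, 3 to drain.
t_single = single_buffer_time(10, fill=4, drain=3)
t_double = double_buffer_time(10, fill=4, drain=3)
print(t_single, t_double)
```

With these example numbers the total drops from 70 to 43 time units, because the disk is almost never idle.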
Direct Memory Access (DMA)
With direct memory access a DMA controller handles the transfer of data from a device’s buffer to memory.
Disk Read Example:
• CPU instructs DMA controller to read first block of the file from disk and store it into main memory at a specified address.
• CPU goes off to do other work.
• DMA controller requests first block of file from disk.
• Disk reads the first block into a buffer and notifies the DMA controller.
• The DMA controller transfers the block from the disk buffer into main memory.
• The DMA controller generates an interrupt for the CPU.
• Repeat for each block of the file.
Channel I/O
Channel I/O is an extension of DMA in which the DMA controller becomes a small CPU (an I/O processor) capable of executing programs.
• The CPU downloads a program to the I/O processor which carries it out.
• The program may include instructions to transfer multiple blocks of data from multiple files to different locations in memory.
• In the case of CD Jukeboxes or robotic tape libraries, the program may also contain instructions for switching CDs and rewinding tapes.
Expensive
• Used primarily on high performance mainframes.
Interrupt Processing
When the CPU receives an interrupt the operating system is invoked to process the interrupt.
Two approaches:
• Polled Interrupt Processing
• Vectored Interrupt Processing

Polled Interrupt Processing
In polled interrupt processing all devices share a single interrupt request line.
• The interrupt request line is one of the wires in the control portion of the system bus.
• A device sets the interrupt request line to 1 (asserts it) when it wants to interrupt the CPU.
• When the interrupt occurs the OS is invoked.
• The OS polls each device to determine which generated the interrupt.
• The OS invokes the appropriate sub-routine / device driver to handle the interrupt.

Vectored Interrupt Processing
In vectored interrupt processing an interrupt controller is used to identify the device that generated the interrupt.
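The difference between the two approaches can be sketched in a few lines. The `Device` class, the handler, and the table layout are all hypothetical; the sketch only contrasts asking every device in turn (polled) with using the IRQ number as an index into a table of handlers (vectored).

```python
class Device:
    """Toy device with a pending-interrupt flag (all names are hypothetical)."""
    def __init__(self, name, irq):
        self.name, self.irq, self.pending = name, irq, False

def handle(device):
    """Stand-in for the OS sub-routine / device driver that services the device."""
    device.pending = False
    return f"handled {device.name}"

def polled_dispatch(devices):
    """Shared request line: the OS asks each device in turn whether it interrupted."""
    for checked, device in enumerate(devices, start=1):
        if device.pending:
            return handle(device), checked
    return None, len(devices)

def vectored_dispatch(irq, vector):
    """The IRQ number indexes directly into a table of handlers."""
    return vector[irq]()

devices = [Device("keyboard", 0), Device("mouse", 1), Device("disk", 2)]
vector = [lambda d=d: handle(d) for d in devices]   # entry i serves IRQ i

devices[2].pending = True
polled_result, devices_checked = polled_dispatch(devices)

devices[2].pending = True
vectored_result = vectored_dispatch(devices[2].irq, vector)

print(polled_result, devices_checked, vectored_result)
```

In the polled case the OS had to check three devices before finding the disk; in the vectored case the handler is found with a single table lookup.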
Vectored interrupt processing steps:
• A device asserts its interrupt request line.
• The interrupt controller asserts the INT line to the CPU.
• The CPU acknowledges the interrupt by asserting the ACK line.
• The CPU reads the IRQ number of the device that caused the interrupt.
• The IRQ number is used as an index into a table (interrupt vector).
• The vector contains the address of the OS sub-routine / device driver for handling the interrupt.
• PC is set to the address found in the table.

Persistent Storage Technology
The Early Days:
• Punch Cards / Paper Tape
• Magnetic Tape
• Hard Disks: IBM, 1956, ~5MB, 1 second access time, $1,000,000
Today:
• HDD:
  • SCSI: 1TB, 3Gb/s, 16MB buffer, 4.16ms access, 7.2k RPM, ~$230
  • ATA: 1TB, 300MB/s, 32MB buffer, 4.2 ms access, 7.2k RPM, ~$150
  • USB: 1TB, 480Mb/s, 7.2k RPM, ~$100
• Optical:
  • 4.7GB (Single-sided, Single Layer DVD)
  • 17GB (Double-sided, Double Layer DVD)
Hard Disk Structure
• Tracks
• Cylinders
• Sectors
• Interleaving
• Zoned Bit Recording
Blocks/Clusters: Disks read/write a block of data at a time.
Hard Disk Capacity
What is the total capacity of a hard disk with:
• 4 platters
• 1024 tracks per surface
• 64 sectors per track
• 512 bytes per sector
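The capacity question can be worked through as a simple product. The sketch assumes that both sides of every platter are used for data (2 surfaces per platter); the slide does not state this, so adjust `surfaces_per_platter` if the drive differs.

```python
platters = 4
surfaces_per_platter = 2      # assumption: both sides of every platter hold data
tracks_per_surface = 1024
sectors_per_track = 64
bytes_per_sector = 512

capacity_bytes = (platters * surfaces_per_platter * tracks_per_surface
                  * sectors_per_track * bytes_per_sector)
capacity_mib = capacity_bytes / 2**20

print(capacity_bytes, "bytes =", capacity_mib, "MiB")
```

Under that assumption the drive holds 268,435,456 bytes, which is 256 MiB.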
Optical Disk Structure
Optical Disks: CDROM, CDR, CDRW, DVD, DVDR, DVDRW, WORM
Hard Disk Access
Addressing Data:
• C:H:S: Cylinder Head Sector addressing
• Logical: Sectors numbered from 0 to N.
Access Time: The time for the disk to begin reading from a specified address.
• Seek Time: Time required for the heads to move to the requested cylinder.
• Rotational Latency: Time required for the requested sector to rotate under the read/write head.
• Average values are typically reported.
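Both ideas on this slide reduce to short formulas. The sketch below uses the conventional mapping from C:H:S to a logical sector number (CHS sectors are traditionally numbered from 1) and takes average rotational latency to be half a revolution. The function names and the 8.5 ms seek time are illustrative assumptions, not values from the slides.

```python
def chs_to_logical(c, h, s, heads, sectors_per_track):
    """Conventional CHS -> logical sector mapping (sector numbers start at 1 in CHS)."""
    return (c * heads + h) * sectors_per_track + (s - 1)

def average_rotational_latency_ms(rpm):
    """On average the requested sector is half a revolution away from the head."""
    ms_per_revolution = 60_000 / rpm
    return ms_per_revolution / 2

def average_access_time_ms(avg_seek_ms, rpm):
    """Access time = seek time + rotational latency (average values)."""
    return avg_seek_ms + average_rotational_latency_ms(rpm)

# First sector of cylinder 1 on a drive with 8 heads and 64 sectors per track.
print(chs_to_logical(1, 0, 1, heads=8, sectors_per_track=64))

# A 7,200 RPM drive with a hypothetical 8.5 ms average seek time.
print(round(average_rotational_latency_ms(7200), 2))
print(round(average_access_time_ms(8.5, 7200), 2))
```

For a 7,200 RPM drive the average rotational latency comes out at about 4.17 ms.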
Optical Disk Data Encoding
CD: (1 micron = 1000 nm)
• Pit: a bump on the face of the disk.
• Land: space between the pits.
• Each pit edge encodes a 1. All other regions represent 0’s.
• On a CD every 0.83 microns is a bit.
RAID
RAID ≡ Redundant Arrays of Inexpensive Disks
• A.K.A. Redundant Arrays of Independent Disks
The idea of RAID is to use multiple disk drives to increase the performance and/or reliability of secondary storage systems.
• Higher throughput
• Disk failure recovery
Original RAID description had 5 levels:
• RAID-1 … RAID-5
• Not all are commercially available.
Two additional levels have been added:
• RAID-0 … RAID-6

RAID Concepts
• Data Striping: Spreading data across multiple disks to allow for concurrent reading of different parts of the data.
• Data Mirroring: Storing copies of the same data on multiple disks for drive failure recovery.
• Parity: Storing extra bits of data for error detection and additional fault tolerance.
• Error Correcting Codes (ECC): Storing extra bits of data for error correction and additional fault tolerance.
RAID-0
RAID-0: Block Interleave Data Striping
• Not really RAID - no redundancy
• High Performance: Data striping allows for multiple concurrent read and write operations
• High Bandwidth Applications
  • Audio/Video Production & Editing
  • Image Editing

RAID-1
RAID-1: Disk Mirroring
Redundancy:
• Tolerates any single drive failure
• Tolerates certain simultaneous drive failures
Some concurrent read/write operations
Large hardware overhead
• Requires 2N disks, N = # of disks needed to store 1 copy of data.
Fast recovery
High Availability Applications
• Accounting, Ecommerce, Payroll
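The two layouts can be summarized in code. For RAID-0 the sketch assumes the simplest block interleave, a round-robin assignment of logical blocks to disks; real controllers may use larger stripe units. For RAID-1 it just restates the 2N disk requirement. Function names are illustrative.

```python
def raid0_location(logical_block, num_disks):
    """Round-robin block interleave: returns (disk index, block index on that disk)."""
    return logical_block % num_disks, logical_block // num_disks

def raid1_disks_required(data_disks):
    """Mirroring keeps two copies of everything, hence 2N disks."""
    return 2 * data_disks

# Six consecutive logical blocks striped over three disks.
layout = [raid0_location(b, num_disks=3) for b in range(6)]
print(layout)
print(raid1_disks_required(3))
```

Consecutive blocks land on different disks, which is what allows several of them to be read or written at the same time.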
RAID-3
RAID-3: Bit Interleave Data Striping with Parity Check
Redundancy: Tolerates any single disk failure.
• Continuous operation with failed disk.
Lower cost than RAID-1 (requires only N+1 disks)
High bandwidth applications (requiring redundancy)
Best for block data (E.g. Image/Video)

RAID-4
RAID-4: Block Interleave Data Striping with Parity Disk
Just like RAID-3 but stripe blocks instead of bits.
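Parity-based recovery rests on XOR: the parity disk stores the XOR of the data disks, so any one missing disk equals the XOR of everything that survives. The sketch below uses three made-up two-byte "disks" and a helper named `parity` (an illustrative name) to show a rebuild after one disk is lost.

```python
from functools import reduce

def parity(blocks):
    """XOR the corresponding bytes of equal-length blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

# N = 3 data disks plus 1 parity disk (N+1 disks in total).
data = [b"\x0f\xa0", b"\x33\x0c", b"\x55\xff"]
p = parity(data)

# Suppose disk 1 fails: XOR the surviving data disks with the parity disk.
rebuilt = parity([data[0], data[2], p])
print(rebuilt == data[1])
```

The same calculation lets the array keep serving reads while the failed disk is still missing.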
RAID-5
RAID-5: Block Interleave Data Striping with Distributed Parity
Redundancy: Can tolerate any single disk failure.
• Same cost as RAID-3
• Better random read / write performance.
• Most widely available RAID implementation
  • File/Application Servers
  • WWW/E-Mail Servers
  • Database Servers
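"Distributed parity" means the parity block moves to a different disk on each stripe instead of living on one dedicated disk. The rotation rule below is just one possible scheme chosen for illustration; actual implementations differ in the order they rotate.

```python
def raid5_stripe(stripe, num_disks):
    """Mark which disk holds parity ("P") and which hold data ("D") for one stripe.

    Assumption: parity starts on the last disk and moves one disk to the left
    on each successive stripe.
    """
    parity_disk = (num_disks - 1 - stripe) % num_disks
    return ["P" if d == parity_disk else "D" for d in range(num_disks)]

layout = [raid5_stripe(s, num_disks=4) for s in range(4)]
for row in layout:
    print(" ".join(row))
```

Because every disk holds parity for some stripes, parity updates from small random writes are spread across all disks rather than queuing on a single parity disk.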
RAID-2
RAID-2: Bit Interleave Data Striping with Error Correcting Hamming Code
Redundancy:
• Tolerates the loss of any single disk.
• On the fly error correction.
Expensive:
• Requires large number of disks.
Hamming Codes
Hamming Code: An error correcting code based on Hamming Distance.
Hamming Distance: The number of bit differences between two binary values.
Example:
  1011
  0010
• Hamming Distance = 2
Basic Idea:
• Encode data using only values that are at least a Hamming distance of three apart.
• Then if any one bit is in error, correct it to the nearest value.
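The Hamming distance from the example can be computed by counting the positions where two equal-length bit strings differ. A minimal sketch, with the values written as strings for readability:

```python
def hamming_distance(a, b):
    """Number of bit positions in which two equal-length bit strings differ."""
    if len(a) != len(b):
        raise ValueError("values must have the same number of bits")
    return sum(x != y for x, y in zip(a, b))

print(hamming_distance("1011", "0010"))   # the example above: 2
```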
Using Hamming Codes
Consider the following set of Hamming code values with Hamming distance 3:
• 00000
• 01011
• 10110
• 11101
Imagine we read the value 10000. This is not a valid value in our Hamming code. How do we correct it?
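One way to answer the question is to compare the received value with every valid code value and pick the nearest one. The sketch below does exactly that for the four code values on this slide; the function names are illustrative.

```python
from itertools import combinations

CODEWORDS = ["00000", "01011", "10110", "11101"]

def hamming_distance(a, b):
    """Number of bit positions in which two equal-length bit strings differ."""
    return sum(x != y for x, y in zip(a, b))

def correct(received):
    """Return the valid code value closest to the received value."""
    return min(CODEWORDS, key=lambda c: hamming_distance(c, received))

# Check the claim that every pair of code values is at least 3 bits apart.
min_distance = min(hamming_distance(a, b) for a, b in combinations(CODEWORDS, 2))

corrected = correct("10000")
print(min_distance, corrected)
```

The received value 10000 is one bit away from 00000 and at least two bits away from every other code value, so it is corrected to 00000.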
Other RAID Systems
RAID-6: Block Interleaved Data Striping with Multiple Independent Distributed Parity Schemes
• Tolerates multiple concurrent disk failures.
• Requires N+2 disks.

Nested RAID Levels
Combinations of multiple RAID Levels
• RAID-01: Mirrored array of striped disks.
• RAID-10: Striped array of mirrored disks.
• Yet others: 03, 30, 53, 05, 50, 15, 51
Nested RAID
RAID-10: Striped array of Mirrored Disks

                       RAID 0
      /-----------------------------------\
      |                 |                 |
    RAID 1            RAID 1            RAID 1
  /--------\        /--------\        /--------\
  |        |        |        |        |        |
120 GB   120 GB   120 GB   120 GB   120 GB   120 GB
  A1       A1       A2       A2       A3       A3
  A4       A4       A5       A5       A6       A6
  B1       B1       B2       B2       B3       B3
  B4       B4       B5       B5       B6       B6

RAID-01: Mirrored array of Striped Disks

                       RAID 1
           /--------------------------\
           |                          |
         RAID 0                     RAID 0
  /-----------------\        /-----------------\
  |        |        |        |        |        |
120 GB   120 GB   120 GB   120 GB   120 GB   120 GB
  A1       A2       A3       A1       A2       A3
  A4       A5       A6       A4       A5       A6
  B1       B2       B3       B1       B2       B3
  B4       B5       B6       B4       B5       B6
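The two six-disk layouts hold the same data but react differently to failures. The sketch below enumerates every possible pair of failed disks and counts how many each layout survives. The disk numbering (0 to 5, left to right) and the survival rules are a simplified model: RAID-10 is treated as lost only when both disks of one mirror pair fail, and RAID-01 as usable only while at least one complete stripe set is intact.

```python
from itertools import combinations

DISKS = range(6)
MIRROR_PAIRS = [(0, 1), (2, 3), (4, 5)]   # RAID-10: three RAID 1 pairs, striped together
STRIPE_SETS = [(0, 1, 2), (3, 4, 5)]      # RAID-01: two RAID 0 sets, mirrored

def raid10_survives(failed):
    """Every mirror pair must keep at least one working disk."""
    return all(any(d not in failed for d in pair) for pair in MIRROR_PAIRS)

def raid01_survives(failed):
    """At least one stripe set must be completely intact."""
    return any(all(d not in failed for d in s) for s in STRIPE_SETS)

two_disk_failures = [set(c) for c in combinations(DISKS, 2)]
raid10_ok = sum(raid10_survives(f) for f in two_disk_failures)
raid01_ok = sum(raid01_survives(f) for f in two_disk_failures)

print(len(two_disk_failures), raid10_ok, raid01_ok)
```

Under this model RAID-10 survives 12 of the 15 possible two-disk failures while RAID-01 survives 6.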