Department of Electrical Engineering, Feng-Chia University
Chapter 9 ~ 10: Memory Management
王振傑 (Chen-Chieh Wang)
[email protected]
System Programming, Spring 2010
Outline
- Background (address translation)
- Segmentation
- Paging
- Virtual Memory
- Page Replacement
Virtualizing Resources
Physical reality: different processes/threads share the same hardware. We need to multiplex the CPU (CPU scheduling), multiplex the use of memory (today's topic), and multiplex disks and devices.
Why worry about memory sharing?
The complete working state of a process and/or the kernel is defined by its data in memory (and registers). Consequently, we cannot just let different threads of control use the same memory. We probably don't want different threads to even have access to each other's memory (protection).
Multi-step Processing of a Program for Execution
Preparing a program for execution involves components at: compile time (i.e. "gcc"), link/load time (the unix "ld" program does linking), and execution time (e.g. dynamic libraries).
Addresses can be bound to final values anywhere along this path. This depends on hardware support and also on the operating system.
Dynamic libraries: linking is postponed until execution. A small piece of code, the stub, is used to locate the appropriate memory-resident library routine. The stub replaces itself with the address of the routine and executes the routine.
Dynamic Loading
A routine is not loaded until it is called. This gives better memory-space utilization: an unused routine is never loaded. It is useful when large amounts of code are needed to handle infrequently occurring cases. No special support from the operating system is required; it is implemented through program design.
Dynamic Linking
Linking is postponed until execution time. A small piece of code, the stub, is used to locate the appropriate memory-resident library routine. The stub replaces itself with the address of the routine and executes the routine. The operating system is needed to check whether the routine is in the process's memory address space. Dynamic linking is particularly useful for libraries; the system is also known as shared libraries.
Recall: Uniprogramming (no Translation or Protection)
The application always runs at the same place in physical memory, since only one application runs at a time, and the application can access any physical address.
[Memory layout: Operating System at the top (0xFFFFFFFF) down to the Application at the bottom (0x00000000) — valid 32-bit addresses]
The application is given the illusion of a dedicated machine by giving it the reality of a dedicated machine. Of course, this doesn't help us with multithreading.
Multiprogramming (First Version)
Multiprogramming without Translation or Protection: we must somehow prevent address overlap between threads.
[Memory layout: Operating System at 0xFFFFFFFF, Application2 at 0x00020000, Application1 at 0x00000000]
Trick: use the Loader/Linker to adjust addresses while the program is loaded into memory (loads, stores, jumps). Everything is adjusted to the program's memory location; translation is done by a linker-loader. This was pretty common in the early days.
With this solution there is no protection: bugs in any program can cause other programs, or even the OS, to crash.
Multiprogramming (Version with Protection)
Can we protect programs from each other without translation?
[Memory layout: Operating System at 0xFFFFFFFF, Application2 at 0x00020000, Application1 at 0x00000000; LimitAddr=0x10000, BaseAddr=0x20000]
Yes: use two special registers, BaseAddr and LimitAddr, to prevent the user from straying outside the designated area. If the user tries to access an illegal address, cause an error.
During a context switch, the kernel loads the new base/limit from the TCB. The user is not allowed to change the base/limit registers.
Simple Segmentation: Base and Bounds
[Diagram: the CPU issues a Virtual Address; it is compared against the Limit (if greater: error!) and added to the Base to form the Physical Address sent to DRAM]
We can use base & bounds/limit registers for dynamic address translation (a simple form of "segmentation"): alter every address by adding the "base", and generate an error if the address is bigger than the limit.
This gives the program the illusion that it is running on its own dedicated machine, with memory starting at 0. The program gets a continuous region of memory, and addresses within the program do not have to be relocated when the program is placed in a different region of DRAM.
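The base-and-bounds check described above can be sketched as follows (a minimal model in Python; the function and exception names are hypothetical, not from any real MMU):

```python
# A minimal sketch of base-and-bounds translation, assuming one
# (base, limit) register pair per process.

class ProtectionError(Exception):
    pass

def translate(virtual_addr: int, base: int, limit: int) -> int:
    """Map a virtual address to a physical one, or raise an error."""
    if virtual_addr >= limit:      # hardware bounds check
        raise ProtectionError(f"address {virtual_addr:#x} exceeds limit {limit:#x}")
    return base + virtual_addr     # relocation by the base register

# Example: a process loaded at base 0x20000 with a 64 KiB limit.
print(hex(translate(0x1234, base=0x20000, limit=0x10000)))  # 0x21234
```

Loading a different (base, limit) pair on each context switch is exactly what the previous slide describes the kernel doing from the TCB.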
Memory-Management Unit (MMU)
A hardware device that maps virtual to physical addresses. In the MMU scheme, the value in the relocation register is added to every address generated by a user process at the time it is sent to memory. The user program deals with logical addresses; it never sees the real physical addresses.
Issues with the simple segmentation method
[Diagram: snapshots of physical memory over time — the OS plus processes 5, 6, 9, 2, and 10 come and go, leaving scattered free holes]
Fragmentation problem: not every process is the same size, so over time the memory space becomes fragmented.
Hard to do inter-process sharing: we want to share code segments when possible, and to share memory between processes. This is helped by providing multiple segments per process.
Need enough physical memory for every process.
Dynamic Storage-Allocation Problem
How to satisfy a request of size n from a list of free holes:
First-fit: allocate the first hole that is big enough.
Best-fit: allocate the smallest hole that is big enough; must search the entire list, unless it is ordered by size. Produces the smallest leftover hole.
Worst-fit: allocate the largest hole; must also search the entire list. Produces the largest leftover hole.
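The three policies can be sketched over a list of free holes (an illustrative model, not a real allocator; the (start, size) hole list is made up):

```python
# First-fit, best-fit, and worst-fit over a free-hole list.
# Holes are (start, size) pairs; returns the chosen hole or None.

def first_fit(holes, n):
    return next((h for h in holes if h[1] >= n), None)

def best_fit(holes, n):
    fits = [h for h in holes if h[1] >= n]       # must scan whole list
    return min(fits, key=lambda h: h[1], default=None)

def worst_fit(holes, n):
    fits = [h for h in holes if h[1] >= n]
    return max(fits, key=lambda h: h[1], default=None)

holes = [(0x1000, 300), (0x3000, 600), (0x8000, 350)]
print(first_fit(holes, 320))  # hole at 0x3000: first one big enough
print(best_fit(holes, 320))   # hole at 0x8000: smallest leftover
print(worst_fit(holes, 320))  # hole at 0x3000: largest leftover
```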
Fragmentation
External fragmentation: enough total memory space exists to satisfy a request, but it is not contiguous.
Internal fragmentation: allocated memory may be slightly larger than the requested memory; this size difference is internal to a partition, but is not being used.
Reduce external fragmentation by compaction: shuffle the memory contents to place all free memory together in one large block. Compaction is possible only if relocation is dynamic and is done at execution time.
I/O problem: latch the job in memory while it is involved in I/O, or do I/O only into OS buffers.
Multiprogramming
(Translation and Protection version 2)
Problem: run multiple applications in such a way that they are protected from one another.
Goals: isolate processes and the kernel from one another, and allow flexible translation that doesn't lead to fragmentation, allows easy sharing between processes, and allows only part of a process to be resident in physical memory.
(Some of the required) hardware mechanisms:
General address translation — flexible: can fit physical chunks of memory into arbitrary places in the user's address space; not limited to a small number of segments. Think of this as providing a large number (thousands) of fixed-sized segments (called "pages").
Dual-mode operation — protection base involving the kernel/user distinction.
Example of General Address Translation
[Diagram: Prog 1's virtual address space (Code, Data, Heap, Stack) and Prog 2's virtual address space (Code, Data, Heap, Stack) each go through their own Translation Map (Map 1, Map 2) into one physical address space holding Code 1/2, Data 1/2, Heap 1/2, Stack 1/2, plus the OS code, OS data, and OS heap & stacks]
Two Views of Memory
[Diagram: the CPU issues Virtual Addresses, which the MMU translates into Physical Addresses; untranslated reads or writes bypass the MMU]
Recall: an address space is all the addresses and state a process can touch. Each process, and the kernel, has a different address space.
Consequently, there are two views of memory: the view from the CPU (what the program sees — virtual memory) and the view from memory (physical memory). The translation box converts between the two views.
Translation helps to implement protection: if task A cannot even gain access to task B's data, there is no way for A to adversely affect B.
With translation, every program can be linked/loaded into the same region of the user address space. Overlap is avoided through translation, not relocation.
Schematic View of Swapping
Extreme form of context switch: swapping. In order to make room for the next process, some or all of the previous process is moved to disk; we likely need to send out complete segments. This greatly increases the cost of context-switching.
Desirable alternative? Some way to keep only the active portions of a process in memory at any one time. This needs finer-granularity control over physical memory.
Outline
- Background (address translation)
- Segmentation
- Paging
- Virtual Memory
- Page Replacement
More Flexible Segmentation
[Diagram: segments 1–4 in the user's view of the memory space map to separate regions scattered through the physical memory space]
Logical view: multiple separate segments. Typical: Code, Data, Heap, Stack. Others: memory sharing, etc.
Each segment is given a region of contiguous memory. It has a base and a limit, and can reside anywhere in physical memory.
Implementation of Multi-Segment
[Diagram: the virtual address splits into a Seg # and an Offset; the Seg # indexes a table of eight (Base, Limit) pairs (Base0–Base7, Limit0–Limit7) with valid bits (V/N); the offset is checked against the limit (error if greater) and added to the base to form the Physical Address]
The segment map resides in the processor. The segment number is mapped into a base/limit pair; the base is added to the offset to generate the physical address, and an error check catches offsets that are out of range.
There are as many chunks of physical memory as table entries.
The segment is addressed by a portion of the virtual address. However, it could be included in the instruction instead — x86 example: mov [es:bx],ax.
What is "V/N"? Segments can be marked as invalid; this requires a check as well.
Example: Four Segments (16-bit addresses)
Virtual address format: bits 15–14 give the Seg ID #, bits 13–0 the offset.

Seg ID #    Base    Limit
0 (code)    0x4000  0x0800
1 (data)    0x4800  0x1400
2 (shared)  0xF000  0x1000
3 (stack)   0x0000  0x3000

[Diagram: in the virtual address space, segment 0 (code) starts at 0x0000, segment 1 (data) at 0x4000, segment 2 (shared) at 0x8000, and segment 3 (stack) at 0xC000; these map into the physical address space at 0x4000–0x4800 (code), 0x4800–0x5C00 (data), 0x0000–0x3000 (stack), and 0xF000 (shared — might be shared with other apps); 0x8000–0xC000 is space for other apps]
Example of segment translation 0x240 0x244 … 0x360 0x364 0x368 … 0x4050
main:
strlen: loop:
varx
la $a0, varx jal strlen … li $v0 0 ;count $v0, lb $t0, ($a0) beq $r0,$t1, done … dw 0x314159
Seg ID #
Base
Limit
0 (code)
0x4000 0x0800
1 (data)
0x4800 0x1400
2 (shared) 0xF000 0x1000 3 ((stack))
0x0000 0x3000
Let’s simulate a bit of this code to see what happens (PC=0x240): 1. Fetch 0x240. Virtual segment #? 0; Offset? 0x240 Physical address? Base=0x4000, so physical addr=0x4240 Fetch instruction at 0x4240. Get “la $a0, varx” Move 0x4050 $a0, Move PC+4PC 2 Fetch 2. F t h 0x244. 0 244 T Translated l t d tto Ph Physical=0x4244. i l 0 4244 G Gett “jal “j l strlen” tl ” Move 0x0248 $ra (return address!), Move 0x0360 PC 3. Fetch 0x360. Translated to Physical=0x4360. Get “li $v0,0” Move 0x0000 $v0,, Move PC+4PC 4. Fetch 0x364. Translated to Physical=0x4364. Get “lb $t0,($a0)” Since $a0 is 0x4050, try to load byte from 0x4050 Translate 0x4050. Virtual segment #? 1; Offset? 0x50 Physical address? Base=0x4800 Base=0x4800, Physical addr = 0x4850 0x4850, Load Byte from 0x4850$t0, Move PC+4PC
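The walkthrough above can be reproduced with a small model of the segment map (a sketch assuming the 2-bit segment field of the four-segment example; the function and exception names are made up):

```python
# Segment-table translation for a 16-bit virtual address whose top
# 2 bits select the segment (values from the four-segment example).

SEG_BITS, OFFSET_BITS = 2, 14

class SegFault(Exception):
    pass

# (base, limit, valid) per segment
seg_table = [
    (0x4000, 0x0800, True),   # 0: code
    (0x4800, 0x1400, True),   # 1: data
    (0xF000, 0x1000, True),   # 2: shared
    (0x0000, 0x3000, True),   # 3: stack
]

def translate(vaddr):
    seg = vaddr >> OFFSET_BITS                 # top 2 bits: segment #
    offset = vaddr & ((1 << OFFSET_BITS) - 1)  # low 14 bits: offset
    base, limit, valid = seg_table[seg]
    if not valid or offset >= limit:           # V/N and limit check
        raise SegFault(hex(vaddr))
    return base + offset

print(hex(translate(0x0240)))  # code segment: 0x4240, as in step 1
print(hex(translate(0x4050)))  # data segment: 0x4850, as in step 4
```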
Observations about Segmentation
The virtual address space has holes: segmentation is efficient for sparse address spaces. A correct program should never address the gaps (except as mentioned in a moment); if it does, trap to the kernel and dump core.
When is it OK to address outside the valid range? This is how the stack and heap are allowed to grow. For instance, the stack takes a fault and the system automatically increases the size of the stack.
We need a protection mode in the segment table. For example, the code segment would be read-only; data and stack would be read-write (stores allowed); a shared segment could be read-only or read-write.
What must be saved/restored on a context switch? The segment table, which is stored in the CPU, not in memory (it is small). We might also store all of a process's memory onto disk when it is switched out (called "swapping").
Outline
- Background (address translation)
- Segmentation
- Paging
- Virtual Memory
- Page Replacement
Paging: Physical Memory in Fixed-Size Chunks
Problems with segmentation? We must fit variable-sized chunks into physical memory, may move processes multiple times to fit everything, and have limited options for swapping to disk.
Fragmentation: wasted space. External: free gaps between allocated chunks. Internal: we don't need all the memory within allocated chunks.
Solution to fragmentation from segments? Allocate physical memory in fixed-size chunks ("pages"). Every chunk of physical memory is equivalent, so we can use a simple vector of bits to handle allocation: 00110001110001101 … 110010 — each bit represents a page of physical memory (1 = allocated, 0 = free).
Should pages be as big as our previous segments? No: that can lead to lots of internal fragmentation. Typically we have small pages (1K–16K). Consequently, we need multiple pages per segment.
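The bit-vector allocation just described can be sketched as follows (a toy model; the class and method names are hypothetical):

```python
# Page-frame allocation with a bit vector: one bit per physical
# page frame, 1 = allocated, 0 = free.

class FrameAllocator:
    def __init__(self, nframes):
        self.bits = [0] * nframes          # all frames start free

    def alloc(self):
        """Find a free frame, mark it allocated, return its number."""
        for i, b in enumerate(self.bits):
            if b == 0:
                self.bits[i] = 1
                return i
        raise MemoryError("out of physical frames")

    def free(self, i):
        self.bits[i] = 0

fa = FrameAllocator(8)
print(fa.alloc(), fa.alloc(), fa.alloc())  # 0 1 2
fa.free(1)
print(fa.alloc())                          # 1: the freed frame is reused
```

Because every frame is equivalent, the allocator never has to worry about hole sizes — which is exactly the fragmentation win over segments.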
How to Implement Paging?
[Diagram: the virtual address splits into a Virtual Page # and an Offset; the Virtual Page # is checked against PageTableSize (access error if too large) and indexes the page table at PageTablePtr, whose entries (pages #0–#5) hold a Physical Page # plus permission bits such as V,R / V,R,W / N; the Physical Page # is concatenated with the Offset to form the Physical Address, and a permission check can also raise an access error]
The page table (one per process) resides in physical memory. It contains the physical page and permissions for each virtual page; permissions include valid bits, read, write, etc.
Virtual address mapping: the offset from the virtual address is copied to the physical address. Example: a 10-bit offset means 1024-byte pages. The virtual page # is all the remaining bits. Example for 32 bits: 32−10 = 22 bits, i.e. 4 million entries. The physical page # is copied from the table into the physical address.
Check the page table bounds and permissions.
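The lookup just described — split the virtual address, bounds-check the virtual page #, check permissions, concatenate — can be sketched as follows (a toy single-level table with 1 KiB pages; the entry format is made up):

```python
# Single-level page-table translation with 10 offset bits (1 KiB pages).

OFFSET_BITS = 10
PAGE_SIZE = 1 << OFFSET_BITS

class AccessError(Exception):
    pass

# page table: virtual page # -> (physical page #, permissions) or None
page_table = [(3, "rw"), (1, "r"), None, (7, "rw")]

def translate(vaddr, write=False):
    vpn, offset = vaddr >> OFFSET_BITS, vaddr & (PAGE_SIZE - 1)
    if vpn >= len(page_table):             # page-table bounds check
        raise AccessError("beyond page table")
    entry = page_table[vpn]
    if entry is None:                      # invalid (N) entry
        raise AccessError("invalid page")
    ppn, perms = entry
    if write and "w" not in perms:         # permission check
        raise AccessError("write to read-only page")
    return (ppn << OFFSET_BITS) | offset   # physical page # | offset

print(hex(translate(0x0044)))  # vpn 0 -> frame 3 -> 0xc44
```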
Free Frames (Physical Pages)
[Diagram: the free-frame list before and after allocation]
What about Sharing?
[Diagram: Process A's virtual address (Virtual Page # + Offset) is translated through PageTablePtrA, and Process B's through PageTablePtrB; each table has entries for pages #0–#5 with V,R / V,R,W / N bits, and one entry in each table points to the same shared physical page — that page appears in the address space of both processes]
Simple Page Table Discussion
Example (4-byte pages): virtual memory holds a b c d at 0x00, e f g h at 0x04, and i j k l at 0x08. The page table maps virtual page 0 to frame 4, page 1 to frame 3, and page 2 to frame 1, so physical memory holds i j k l in frame 1 (0x04), e f g h in frame 3 (0x0C), and a b c d in frame 4 (0x10).
What needs to be switched on a context switch? The page table pointer and limit.
Analysis
Pros: simple memory allocation; easy to share.
Con: what if the address space is sparse? E.g. on UNIX, code starts at 0 and the stack starts at (2^31 − 1). With 1K pages, we need 4 million page table entries!
Con: what if the table is really big? Not all pages are used all the time — it would be nice to have the working set of the page table in memory.
How about combining paging and segmentation?
Multi-level Translation
What about a tree of tables? At the lowest level is a page table, with memory still allocated with a bitmap; the higher levels are often segmented. We could have any number of levels. Example (segmented top level):
[Diagram: the virtual address splits into a Virtual Seg #, a Virtual Page #, and an Offset; the Seg # selects one of eight (Base, Limit) pairs (Base0–Base7 / Limit0–Limit7) with valid bits (V/N), and the Virtual Page # is checked against the limit (access error if greater); the base points to a page table whose entries (pages #0–#5, marked V,R / V,R,W / N) supply the Physical Page #, which is combined with the Offset — after a permission check that can also raise an access error — to form the physical address]
What must be saved/restored on a context switch? The contents of the top-level segment registers (for this example), or a pointer to the top-level table (the page table).
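A two-level walk matching this structure can be sketched as follows (a toy model — segment entries pointing at per-segment page tables; all sizes, names, and entries are made up):

```python
# Two-level translation: segment table over per-segment page tables.

PAGE_BITS = 10                      # 1 KiB pages
PAGE_SIZE = 1 << PAGE_BITS

class AccessError(Exception):
    pass

# Each segment entry: (page_table, valid). Each page table maps a
# virtual page # within the segment to a physical page # (or None).
seg_table = [
    ([2, 5, None], True),           # segment 0: two mapped pages
    (None, False),                  # segment 1: invalid (N)
]

def translate(seg, vpn, offset):
    table, valid = seg_table[seg]
    if not valid or vpn >= len(table):   # valid bit and limit check
        raise AccessError("bad segment or page #")
    ppn = table[vpn]
    if ppn is None:                      # page not present
        raise AccessError("page not present")
    return (ppn << PAGE_BITS) | offset

print(hex(translate(0, 1, 0x123)))  # frame 5 -> 0x1523
```

Note that only segments that are actually in use need page tables at all, which is how the tree avoids the 4-million-entry problem of the flat table.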
Paging Hardware with TLB — Making Address Translation Fast
A cache for address translations: the Translation Lookaside Buffer (TLB).
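The TLB's role can be sketched as a small cache consulted before the page-table walk (a toy model; the table contents are made up, and random eviction is used — a later slide notes this is the typical TLB policy):

```python
# A TLB in front of the page table: a small cache from virtual
# page # to physical frame #.

import random

PAGE_BITS = 10
page_table = {0: 7, 1: 3, 2: 9}     # vpn -> ppn (toy page table)
TLB_SIZE = 2
tlb = {}

def translate(vaddr):
    vpn, offset = vaddr >> PAGE_BITS, vaddr & ((1 << PAGE_BITS) - 1)
    if vpn in tlb:                           # TLB hit: fast path
        ppn = tlb[vpn]
    else:                                    # TLB miss: walk the table
        ppn = page_table[vpn]
        if len(tlb) >= TLB_SIZE:             # evict a random entry
            tlb.pop(random.choice(list(tlb)))
        tlb[vpn] = ppn                       # cache the translation
    return (ppn << PAGE_BITS) | offset

print(hex(translate(0x0005)))  # vpn 0 -> frame 7 -> 0x1c05
print(0 in tlb)                # True: cached after the miss
```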
Outline
- Background (address translation)
- Segmentation
- Paging
- Virtual Memory
- Page Replacement
Virtual Memory
[Diagram: a 4 GB virtual memory maps through the page table onto 512 MB of physical memory, backed by a 500 GB disk]
Illusion of infinite memory: the disk is larger than physical memory, so in-use virtual memory can be bigger than physical memory, and the combined memory of the running processes can be much larger than physical memory.
Page Tables
[Figure-only slide]

Translation Lookaside Buffer (TLB)
[Figure-only slide]
TLB Misses
If the page is in memory: load the PTE from memory and retry. This could be handled in hardware — though that can get complex for more complicated page-table structures — or in software, by raising a special exception with an optimized handler.
If the page is not in memory (a page fault): the OS handles fetching the page and updating the page table, then restarts the faulting instruction.
Steps in Handling a Page Fault
[Figure-only slide]
of Electrrical Engineering,, Feng-Ch Deparrtment o hia Unive ersity
Outline
- Background (address translation)
- Segmentation
- Paging
- Virtual Memory
- Page Replacement
What happens if there is no free frame?
Page replacement: find some page in memory that is not really in use and swap it out. Algorithm performance matters — we want an algorithm that results in the minimum number of page faults. Note that the same page may be brought into memory several times.
Page Replacement Policies
Why do we care about the replacement policy? Replacement is an issue with any cache, but it is particularly important with pages: the cost of being wrong is high (we must go to disk), so we must keep important pages in memory rather than toss them out.
FIFO (First In, First Out): throw out the oldest page. Fair — every page lives in memory for the same amount of time — but bad, because it throws out heavily used pages instead of infrequently used ones.
MIN (Minimum, Optimal): replace the page that won't be used for the longest time. Great, but we can't really know the future… It makes a good comparison case, however.
RANDOM: pick a random page for every replacement. The typical solution for TLBs — simple hardware — but pretty unpredictable, which makes it hard to give real-time guarantees.
Replacement Policies (Con't)
LRU (Least Recently Used): replace the page that hasn't been used for the longest time. Programs have locality, so if something has not been used for a while, it is unlikely to be used in the near future. It seems like LRU should be a good approximation to MIN.
How to implement LRU? Use a list!
Head → Page 6 → Page 7 → Page 1 → Page 2 → Tail (LRU)
On each use, remove the page from the list and place it at the head; the LRU page is at the tail.
Problems with this scheme for paging? We need to know immediately when each page is used so that we can change its position in the list — many instructions for each hardware access. In practice, people approximate LRU (more later).
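The list-based bookkeeping above can be sketched with an ordered dictionary standing in for the linked list (a toy model; the class name is made up):

```python
# LRU page replacement: an OrderedDict plays the role of the list,
# with the most recently used page kept at one end.

from collections import OrderedDict

class LRUFrames:
    def __init__(self, nframes):
        self.nframes = nframes
        self.frames = OrderedDict()          # LRU first, MRU last

    def touch(self, page):
        """Reference a page; return True on a page fault."""
        if page in self.frames:
            self.frames.move_to_end(page)    # mark as most recently used
            return False
        if len(self.frames) >= self.nframes:
            self.frames.popitem(last=False)  # evict the LRU page
        self.frames[page] = None
        return True

lru = LRUFrames(3)
faults = sum(lru.touch(p) for p in "ABCABDADBCB")
print(faults)  # 5 page faults on this stream
```

The "problem" noted above shows up here too: touch() must run on every single reference, which is why real systems approximate LRU instead.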
Example: FIFO
Suppose we have 3 page frames, 4 virtual pages, and the following reference stream: A B C A B D A D B C B.
Consider FIFO page replacement:

Ref:     A  B  C  A  B  D  A  D  B  C  B
Frame 1: A  A  A  A  A  D  D  D  D  C  C
Frame 2:    B  B  B  B  B  A  A  A  A  A
Frame 3:       C  C  C  C  C  C  B  B  B

FIFO: 7 faults. When referencing D, replacing A is a bad choice, since we need A again right away.
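The table above can be checked with a few lines of FIFO simulation (a sketch; a queue tracks arrival order):

```python
# FIFO page replacement: evict the page that has been resident longest.

from collections import deque

def fifo_faults(refs, nframes):
    frames, order, faults = set(), deque(), 0
    for page in refs:
        if page not in frames:
            faults += 1
            if len(frames) >= nframes:
                frames.discard(order.popleft())  # evict the oldest page
            frames.add(page)
            order.append(page)
    return faults

print(fifo_faults("ABCABDADBCB", 3))  # 7, as in the table above
```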
Example: MIN (Optimal)
Suppose we have the same reference stream: A B C A B D A D B C B.
Consider MIN page replacement:

Ref:     A  B  C  A  B  D  A  D  B  C  B
Frame 1: A  A  A  A  A  A  A  A  A  C  C
Frame 2:    B  B  B  B  B  B  B  B  B  B
Frame 3:       C  C  C  D  D  D  D  D  D

MIN: 5 faults. Where will D be brought in? Look for the page not referenced for the farthest time in the future.
What will LRU do? It makes the same decisions as MIN here, but that won't always be true!
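MIN can likewise be simulated, because the simulator — unlike real hardware — can peek at the future references (a sketch):

```python
# MIN (optimal) replacement: evict the resident page whose next use
# lies farthest in the future (or that is never used again).

def min_faults(refs, nframes):
    frames, faults = set(), 0
    for i, page in enumerate(refs):
        if page not in frames:
            faults += 1
            if len(frames) >= nframes:
                future = refs[i + 1:]
                # distance to next use; never used again -> infinity
                def next_use(p):
                    return future.index(p) if p in future else float("inf")
                frames.discard(max(frames, key=next_use))
            frames.add(page)
    return faults

print(min_faults("ABCABDADBCB", 3))  # 5 faults, matching the table
```

The impossible part is of course `refs[i + 1:]` — no real OS has the future reference stream, which is why MIN is only a comparison baseline.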
When will LRU perform badly?
Consider the following reference stream: A B C D A B C D A B C D.
LRU performs as follows (the same as FIFO here):

Ref:     A  B  C  D  A  B  C  D  A  B  C  D
Frame 1: A  A  A  D  D  D  C  C  C  B  B  B
Frame 2:    B  B  B  A  A  A  D  D  D  C  C
Frame 3:       C  C  C  B  B  B  A  A  A  D

Every reference is a page fault!
MIN does much better:

Ref:     A  B  C  D  A  B  C  D  A  B  C  D
Frame 1: A  A  A  A  A  A  A  A  A  B  B  B
Frame 2:    B  B  B  B  B  C  C  C  C  C  C
Frame 3:       C  D  D  D  D  D  D  D  D  D
Graph of Page Faults Versus the Number of Frames
One desirable property: when you add memory, the miss rate goes down. Does this always happen? It seems like it should, right?
No: Belady's anomaly — certain replacement algorithms (FIFO) don't have this obvious property!
Adding Memory Doesn't Always Help the Fault Rate
Does adding memory reduce the number of page faults? Yes for LRU and MIN — but not necessarily for FIFO! (This is called Belady's anomaly.)

With 3 frames:
Ref:     1  2  3  4  1  2  5  1  2  3  4  5
Frame 1: 1  1  1  4  4  4  5  5  5  5  5  5
Frame 2:    2  2  2  1  1  1  1  1  3  3  3
Frame 3:       3  3  3  2  2  2  2  2  4  4
→ 9 page faults

With 4 frames:
Ref:     1  2  3  4  1  2  5  1  2  3  4  5
Frame 1: 1  1  1  1  1  1  5  5  5  5  4  4
Frame 2:    2  2  2  2  2  2  1  1  1  1  5
Frame 3:       3  3  3  3  3  3  2  2  2  2
Frame 4:          4  4  4  4  4  4  3  3  3
→ 10 page faults
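The anomaly can be checked directly with a FIFO simulation (a sketch, run on the same reference string with 3 and then 4 frames):

```python
# Belady's anomaly: FIFO takes MORE faults with MORE frames
# on this particular reference string.

from collections import deque

def fifo_faults(refs, nframes):
    frames, order, faults = set(), deque(), 0
    for page in refs:
        if page not in frames:
            faults += 1
            if len(frames) >= nframes:
                frames.discard(order.popleft())  # evict the oldest page
            frames.add(page)
            order.append(page)
    return faults

refs = [1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5]
print(fifo_faults(refs, 3), fifo_faults(refs, 4))  # 9 10
```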
Thrashing
If a process does not have "enough" pages, the page-fault rate is very high. This leads to low CPU utilization; the operating system thinks that it needs to increase the degree of multiprogramming, so another process is added to the system.
Thrashing: a process is busy swapping pages in and out.