Prefetching in File System
Course: CSC456  Speaker: Bin Bao

Content
- Introduction of file caching
- Caching in Linux
- Introduction of file prefetching
- Prefetching in Linux
- Some research work
File caching

Working mechanism
- Read and write requests flow through a hierarchy: CPU cache -> memory cache -> disk
- Write policy: write-back (UNIX) or write-through (DOS)
- Caching exploits temporal locality and spatial locality
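The write-back versus write-through distinction above can be sketched in a few lines of Python. This is a toy model only: `ToyCache` and its dict-based backing store are illustrative names, not UNIX or DOS code.

```python
# Toy illustration of write-back vs. write-through policies.
# All names here (ToyCache, backing) are hypothetical.

class ToyCache:
    def __init__(self, write_back):
        self.write_back = write_back
        self.cache = {}        # block number -> value
        self.dirty = set()     # dirty blocks (write-back only)
        self.backing = {}      # stands in for the disk

    def write(self, block, value):
        self.cache[block] = value
        if self.write_back:
            self.dirty.add(block)        # defer the disk write
        else:
            self.backing[block] = value  # write-through: update disk now

    def flush(self):
        for block in self.dirty:
            self.backing[block] = self.cache[block]
        self.dirty.clear()

wb = ToyCache(write_back=True)
wb.write(0, "a")
assert 0 not in wb.backing   # disk not yet updated
wb.flush()
assert wb.backing[0] == "a"  # dirty block written back on flush
```

Write-back batches disk updates (better throughput, data at risk until flush); write-through keeps disk and cache consistent at the cost of a disk write per update.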
Caching in Linux (2.6.11)
- Page cache: the core structure is address_space, embedded in the inode object
- Radix tree in address_space
  - Look up a page in the cache quickly
  - Page descriptor's mapping field -> address_space
  - Index: page number relative to each file
Radix tree in address_space (cont.)
- Retrieve pages with a certain state quickly
- Each node in the tree keeps a 1-bit PG_dirty tag and a 1-bit PG_writeback tag for each child
- The tags prune the search tree

File prefetching
- Goal
  - Consider a program sequentially accessing a file, e.g. cp
  - Caching alone is not sufficient to exploit spatial locality
  - Prefetching can hide I/O latency by fetching data into the cache before programs request it
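The tag-pruned lookup described above can be sketched as follows. This is a simplification, not the actual Linux radix tree API: a fixed fan-out of 4, two levels, and only a dirty tag; the point is that an interior node's per-child tag lets the search skip whole subtrees.

```python
# Sketch of tag pruning: each interior node records, per child, whether
# any page below that child is dirty, so a search for dirty pages can
# skip clean subtrees. Hypothetical structure, not kernel code.

FANOUT = 4

class Node:
    def __init__(self):
        self.slots = [None] * FANOUT       # children or leaf page indices
        self.dirty_tag = [False] * FANOUT  # 1 bit per child

class TaggedTree:
    def __init__(self):
        self.root = Node()

    def insert(self, index, dirty):
        hi, lo = divmod(index, FANOUT)     # two levels: index < FANOUT**2
        child = self.root.slots[hi]
        if child is None:
            child = self.root.slots[hi] = Node()
        child.slots[lo] = index
        if dirty:
            child.dirty_tag[lo] = True
            self.root.dirty_tag[hi] = True  # propagate the tag upward

    def dirty_pages(self):
        out = []
        for hi in range(FANOUT):
            if not self.root.dirty_tag[hi]:  # prune: nothing dirty below
                continue
            child = self.root.slots[hi]
            for lo in range(FANOUT):
                if child.dirty_tag[lo]:
                    out.append(child.slots[lo])
        return out

t = TaggedTree()
t.insert(3, dirty=False)
t.insert(5, dirty=True)
t.insert(9, dirty=True)
assert t.dirty_pages() == [5, 9]  # subtree holding page 3 was skipped
```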
Possible approaches
- User-inserted prefetching: hard to program, lacks portability
- Compiler-generated prefetching: hard to analyze dynamic behavior
- OS prefetching

Prefetching in Linux
- Function page_cache_readahead
- Two windows: the current window and the ahead window (pages to be prefetched)
Prefetching in Linux (cont.)
Key factors for prefetching
- What to prefetch
  - On the first access to a file: a request for the first page indicates sequential access; a request for any other page indicates random access
- When to prefetch
  - The ahead window size (a) changes adaptively and is related to the current window size (c)
  - If the RA_FLAG_MISS flag is set: a = c - 2 (min: 4); otherwise, a = 2*c or 4*c (max: 32)
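The sizing rule above can be written as a small function. This is a simplification: the kernel's choice between 2*c and 4*c depends on state not covered in these slides, so the sketch uses 2*c throughout.

```python
# Sketch of the adaptive ahead-window sizing rule:
# a = c - 2 (min 4) on a cache miss, otherwise a = 2*c (max 32).
# Simplified: the real kernel sometimes grows by 4*c instead.

def next_ahead_window(current, miss):
    if miss:                      # RA_FLAG_MISS set: shrink the window
        return max(current - 2, 4)
    return min(2 * current, 32)   # sequential hits: grow the window

assert next_ahead_window(4, miss=False) == 8    # growing phase
assert next_ahead_window(32, miss=False) == 32  # clamped at the max
assert next_ahead_window(4, miss=True) == 4     # clamped at the min
```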
Related work (1)
P. Cao, E. W. Felten, A. R. Karlin, and K. Li. A Study of Integrated Prefetching and Caching Strategies. In Proc. of the ACM SIGMETRICS, pages 188–197, Ottawa, Canada, June 1995

Rules for optimal prefetching and caching
1. Optimal Prefetching: every prefetch should bring into the cache the next block in the reference stream that is not in the cache
2. Optimal Replacement: every prefetch should discard the block whose next reference is furthest in the future
3. Do No Harm: never discard block A to prefetch block B when A will be referenced before B
4. First Opportunity: never perform a prefetch-and-replace operation when the same operations could have been performed previously

Two theoretical prefetching strategies
- Conservative: the same number of fetches as the paging algorithm MIN; prefetching is performed at the earliest opportunity consistent with the four rules
- Aggressive: a prefetch may replace the block whose next reference is furthest in the future
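Rule 2's eviction choice can be sketched given full knowledge of the future reference stream. `victim` is a hypothetical helper for illustration, not code from the paper.

```python
# Sketch of rule 2 (Optimal Replacement): among cached blocks, evict
# the one whose next reference lies furthest in the future stream.

def victim(cache, future):
    def next_use(block):
        try:
            return future.index(block)   # distance to next reference
        except ValueError:
            return float("inf")          # never referenced again
    return max(cache, key=next_use)

# Blocks 1, 2, 3 cached; 3 is referenced last in the future stream.
assert victim([1, 2, 3], future=[1, 2, 1, 3]) == 3
assert victim([1, 2, 3], future=[3, 1]) == 2  # 2 is never used again
```

This is the offline (clairvoyant) rule: it needs the full future reference stream, which is why it bounds what online policies can achieve rather than being directly implementable.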
Related work (2)
J. Griffioen and R. Appleton, Reducing file system latency using a predictive approach, in Proceedings of the USENIX Summer 1994 Technical Conference, p197-208, June 1994

Main approach
- Are relationships among files predictable? Analyze a real daily file-operation trace; certain patterns do exist
- Use past file access information to predict future requests

Probability graph
- A probability graph records file open history
  - Node: a file open operation
  - Edge (A, B): B's opening follows A's opening
  - Weight: how many times B's opening follows A's

Two parameters
- lookahead period: the number of file openings related to the current one
- chance: for an edge (A, B), the ratio of the weight of the edge from A to B to the total weight of edges leaving A; threshold: MinChance
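The graph construction and prediction can be sketched as follows. The trace and the helpers `build_graph` and `predict` are illustrative, not the paper's implementation.

```python
# Sketch of the probability graph: count how often file B's open follows
# file A's open within the lookahead period, then predict successors
# whose chance (edge weight / total outgoing weight) meets MinChance.

from collections import defaultdict

def build_graph(opens, lookahead):
    weight = defaultdict(int)              # (A, B) -> follow count
    for i, a in enumerate(opens):
        for b in opens[i + 1 : i + 1 + lookahead]:
            weight[(a, b)] += 1
    return weight

def predict(weight, a, min_chance):
    total = sum(w for (x, _), w in weight.items() if x == a)
    return sorted(b for (x, b), w in weight.items()
                  if x == a and w / total >= min_chance)

# A toy trace of file opens during repeated compiles:
trace = ["make", "cc", "ld", "make", "cc", "ld", "make", "man"]
g = build_graph(trace, lookahead=1)
assert predict(g, "cc", min_chance=0.5) == ["ld"]  # prefetch ld after cc
```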
Related work (3)
C. Li, K. Shen and A. E. Papathanasiou, Competitive Prefetching for Concurrent Sequential I/O, in Proceedings of the 2nd ACM EuroSys Conference, Lisbon, Portugal, March 2007

Main idea
- Control the prefetching depth (also called the read-ahead size)
- Set it to the amount of data that can be sequentially transferred within the average time of a single disk seek and rotation

Competitive
- Achieves at least half the performance (in terms of I/O throughput) of the optimal offline policy

Experimental results
- Better than default Linux prefetching
- Aggressive prefetching (800KB lookahead size) performs almost the same as competitive prefetching, but is worse for some programs
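The prefetching depth above is easy to compute from disk parameters; this is a back-of-the-envelope sketch with hypothetical numbers, not figures from the paper.

```python
# Competitive prefetching depth: the amount of data sequentially
# transferable in the average seek + rotation time.
# The disk parameters below are hypothetical.

def prefetch_depth(bandwidth_mb_s, seek_ms, rotation_ms):
    # depth in KB = bandwidth (KB/ms equivalent) * (seek + rotation)
    return bandwidth_mb_s * 1024 * (seek_ms + rotation_ms) / 1000.0

# e.g. 80 MB/s transfer rate, 6 ms average seek, 4 ms average rotation:
depth = prefetch_depth(80, 6, 4)
assert round(depth) == 819   # roughly 0.8 MB for this hypothetical disk
```

For disks in this parameter range the depth lands near the 800KB lookahead used by the aggressive configuration in the experiments, which helps explain why the two perform similarly on many workloads.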
Summary
- Caching in the file system, with a Linux example
- Prefetching in the file system, with a Linux example
- Related research work
  - Integrated prefetching and caching
  - Prefetching across files
  - Competitive prefetching

References
[BC06] D. Bovet and M. Cesati, Understanding the Linux Kernel, 3rd edition, 2006, O'Reilly.
[CFK95] P. Cao, E. W. Felten, A. R. Karlin, and K. Li, A Study of Integrated Prefetching and Caching Strategies.
[GA94] J. Griffioen and R. Appleton, Reducing file system latency using a predictive approach.
[LSP07] C. Li, K. Shen and A. E. Papathanasiou, Competitive Prefetching for Concurrent Sequential I/O.
[MDO96] T. Mowry, A. Demke and O. Krieger, Automatic Compiler Inserted I/O Prefetching for Out-of-core Applications.
[PGG95] R. H. Patterson, G. A. Gibson, E. Ginting, D. Stodolsky and J. Zelenka, Informed Prefetching and Caching.
[Tan01] A. Tanenbaum, Modern Operating Systems, 2nd edition, 2001, Prentice Hall.