Chapter 11: File-System Implementation


File-System Structure
File-System Implementation
Directory Implementation
Allocation Methods
Free-Space Management
Efficiency and Performance
Recovery
Log-Structured File Systems
NFS

Operating Systems CS 33211


Objectives 



To describe the details of implementing local file systems and directory structures
To discuss block allocation and free-block algorithms and trade-offs


File-System Structure 

File structure
  Logical storage unit
  Collection of related information

File system resides on secondary storage (disks)
  Allows blocks of data on disk to be read, written, and modified
  Sequential or random access to any given block of information
  I/O transfers between disk and memory occur in blocks, typically 512 bytes each, for efficiency

File system is organized into layers
  Efficiency
  Avoids duplication of code

File system maintains file structure via the file control block (FCB): a storage structure consisting of information about a file
  Ownership
  Permissions
  Locations of file contents
  …


File System Design Challenges

The file system provides efficient and convenient access to information on the disk

File system design challenges:

1. File system UI
  File attributes, operations
  Directory structure

2. Mapping logical file system to physical storage (disk)
  Requires efficient algorithms and data structures


File System Layered Design 

In a layered architecture
  Higher levels make use of features in the lower levels

I/O control
  Uses device drivers to transfer information between memory and disk

Device driver
  Translates commands to hardware instructions for use by the controller

Basic file system
  Issues generic commands to read/write physical blocks on disk

File organization
  Translates logical block addresses to physical block addresses

Logical file system
  Manages directory structure
  Manages metadata information
  Maintains file structure via the file control block (FCB)


A Typical File Control Block


Creating a new file

1. Your application program calls the logical file system
2. Logical file system creates a new FCB (allocates one from free FCBs)
3. System reads the appropriate directory into memory
  Updates it with the new file name and FCB
4. Writes the directory back onto disk

Now the new file may be used for I/O operations
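The four creation steps can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `FCB` fields come from the FCB description earlier in the deck, while `LogicalFileSystem`, `free_fcbs`, `disk_dirs`, and the dictionary standing in for the disk are hypothetical names invented for the example.

```python
from dataclasses import dataclass, field


@dataclass
class FCB:
    """Hypothetical file control block with the fields the slides list."""
    owner: str
    permissions: int
    blocks: list = field(default_factory=list)  # locations of file contents


class LogicalFileSystem:
    def __init__(self, free_fcbs):
        self.free_fcbs = free_fcbs  # number of unused FCB slots (assumption)
        self.disk_dirs = {}         # stands in for on-disk directories

    def create(self, dir_path, name, owner, permissions=0o644):
        # Step 2: allocate a new FCB from the pool of free FCBs.
        if self.free_fcbs == 0:
            raise OSError("no free FCBs")
        # Step 3: read the directory into memory and check the name.
        in_memory = dict(self.disk_dirs.get(dir_path, {}))
        if name in in_memory:
            raise FileExistsError(name)
        self.free_fcbs -= 1
        fcb = FCB(owner, permissions)
        in_memory[name] = fcb       # update directory with name and FCB
        # Step 4: write the updated directory back to "disk".
        self.disk_dirs[dir_path] = in_memory
        return fcb


fs = LogicalFileSystem(free_fcbs=2)
# Step 1: the application calls the logical file system.
fcb = fs.create("/home", "notes.txt", owner="alice")
print(fs.free_fcbs, sorted(fs.disk_dirs["/home"]))
```

After the call returns, the directory entry maps the file name to its FCB, which is what later open() calls look up.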


In-Memory File System Structures



Necessary file system structures provided by the OS. 

Figure 12-3(a) refers to opening a file.



Figure 12-3(b) refers to reading a file.


In-Memory File System Structures 

The open() call first searches the system-wide open-file table to see if the file is in use

If the file is in use:
  An entry is created in the per-process open-file table
  The entry points to the system-wide open-file table

Else:
  Search the directory structure for the file name
  Copy the file's FCB to the system-wide open-file table
  An entry is created in the per-process open-file table
  The entry points to the system-wide open-file table

The open() call returns a pointer to the entry in the per-process open-file table

All I/O on the file is performed via this pointer
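The open() logic above can be sketched as follows. This is a simplified model, not an actual kernel implementation: the class `OpenFileTables`, the `open_count` field, and the use of a list index as the returned "pointer" are all assumptions made for the example.

```python
class OpenFileTables:
    def __init__(self, directory):
        self.directory = directory  # file name -> FCB (a plain dict here)
        self.system_wide = {}       # file name -> shared open-file entry
        self.per_process = {}       # pid -> list of entries

    def open(self, pid, name):
        entry = self.system_wide.get(name)
        if entry is None:
            # File not in use: search the directory and copy its FCB
            # into the system-wide table (KeyError if the name is missing).
            fcb = self.directory[name]
            entry = {"fcb": dict(fcb), "open_count": 0}
            self.system_wide[name] = entry
        entry["open_count"] += 1    # hypothetical bookkeeping field
        # In both cases, add a per-process entry pointing at the
        # system-wide entry.
        table = self.per_process.setdefault(pid, [])
        table.append(entry)
        return len(table) - 1       # index into the per-process table


tables = OpenFileTables({"a.txt": {"owner": "alice", "blocks": [4, 7]}})
fd1 = tables.open(pid=1, name="a.txt")
fd2 = tables.open(pid=2, name="a.txt")
print(fd1, fd2, tables.system_wide["a.txt"]["open_count"])
```

Both processes end up with their own per-process entry, but the two entries refer to the same system-wide record, so the FCB is copied into memory only once.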


Virtual File Systems

OS concurrently supports multiple types of file systems via the VFS interface

Virtual File Systems (VFS) provide an object-oriented way to simplify, organize and modularize File-system implementation



VFS provides a uniform system call interface (the API) to be used for different types of file systems.



The API is to the VFS interface, rather than any specific type of file system:  Using data structures and procedures


Schematic View of Virtual File System 

Layer 1: File-system interface
  Handles system calls: open(), read(), write(), close()

Layer 2: Virtual file system
  Separates generic operations from implementation details
  API provides a uniform interface to different types of file-system implementations

Layer 3: Supported file-system types
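One way to picture the three layers is an abstract interface with interchangeable implementations behind it. The sketch below is a toy model under stated assumptions: `FileSystem`, `InMemoryFS`, `UpperCaseFS`, `VFS`, and the mount-prefix lookup are hypothetical names, not the interface of any real kernel.

```python
from abc import ABC, abstractmethod


class FileSystem(ABC):
    """Layer 3 contract: what each concrete file-system type implements."""

    @abstractmethod
    def read(self, path): ...

    @abstractmethod
    def write(self, path, data): ...


class InMemoryFS(FileSystem):
    """Hypothetical file-system type that keeps files in a dict."""

    def __init__(self):
        self.files = {}

    def read(self, path):
        return self.files[path]

    def write(self, path, data):
        self.files[path] = data


class UpperCaseFS(InMemoryFS):
    """Second hypothetical type with different internal behaviour."""

    def write(self, path, data):
        self.files[path] = data.upper()


class VFS:
    """Layer 2: routes generic calls to whichever type is mounted."""

    def __init__(self):
        self.mounts = {}

    def mount(self, prefix, fs):
        self.mounts[prefix] = fs

    def _resolve(self, path):
        # Pick the longest mount prefix that matches the path.
        matches = [p for p in self.mounts if path.startswith(p)]
        prefix = max(matches, key=len)
        return self.mounts[prefix], path[len(prefix):]

    def read(self, path):
        fs, rest = self._resolve(path)
        return fs.read(rest)

    def write(self, path, data):
        fs, rest = self._resolve(path)
        fs.write(rest, data)


# Layer 1: callers use the same read/write calls for every type.
vfs = VFS()
vfs.mount("/a/", InMemoryFS())
vfs.mount("/b/", UpperCaseFS())
vfs.write("/a/x", "hello")
vfs.write("/b/x", "hello")
print(vfs.read("/a/x"), vfs.read("/b/x"))
```

The caller never names a file-system type; only the mount table decides which implementation handles a path.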




Directory Implementation

Directory management algorithms can affect the efficiency of file systems

Linear list: simple to program but time-consuming to execute
  Maintain a list of file names with pointers to the data blocks
  To create a new file:
    Search the directory (or list) to be sure the file does not exist
    Add the new entry at the end of the list
  To delete a file:
    Search the list for the file, then delete its name from the list
  Disadvantage: linear search of the directory is time-consuming

Hash table: linear list with a hash data structure
  Linear list stores the directory entries
  Hash table takes a value computed from the file name and returns a pointer to the file name in the list
  Decreases directory search time
  Collisions: situations where two file names hash to the same location
  Disadvantage: fixed size
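The two directory organizations can be compared with a small sketch. The class `Directory` and its method names are hypothetical, and Python's built-in `dict` stands in for the hash table; unlike the fixed-size table described above, a `dict` grows automatically and resolves collisions internally, so those two drawbacks are not modeled here.

```python
class Directory:
    def __init__(self):
        self.entries = []  # linear list of [name, first_block]; None = deleted
        self.index = {}    # hash table: name -> position in self.entries

    def linear_lookup(self, name):
        # O(n): scan the list until the name is found.
        for pos, entry in enumerate(self.entries):
            if entry is not None and entry[0] == name:
                return pos
        return None

    def hashed_lookup(self, name):
        # Expected O(1): the hash table points straight at the list slot.
        return self.index.get(name)

    def create(self, name, first_block):
        # Make sure the file does not exist, then add it at the end.
        if self.hashed_lookup(name) is not None:
            raise FileExistsError(name)
        self.entries.append([name, first_block])
        self.index[name] = len(self.entries) - 1

    def delete(self, name):
        pos = self.index.pop(name)  # KeyError if the file is missing
        self.entries[pos] = None    # leave a hole rather than shifting


d = Directory()
d.create("a", 10)
d.create("b", 20)
d.create("c", 30)
d.delete("b")
print(d.linear_lookup("c"), d.hashed_lookup("c"), d.linear_lookup("b"))
```

Both lookups return the same slot; the difference is that the linear version may inspect every entry while the hashed version goes directly to it.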


Allocation Methods 

An allocation method refers to how disk blocks are allocated for files:



Contiguous allocation



Linked allocation



Indexed allocation


Contiguous Allocation 

Each file occupies a set of contiguous blocks on the disk



Simple – only starting location (block #) and length (number of blocks) are required



Random access



Wasteful of space (dynamic storage-allocation problem)



Files cannot grow


Contiguous Allocation 

Mapping from logical to physical, for logical address $LA$ and block size 512:

$$Q = \lfloor LA / 512 \rfloor, \qquad R = LA \bmod 512$$

Block to be accessed = $Q$ + starting address
Displacement into block = $R$
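As a worked instance of this mapping, the sketch below divides the logical address by the block size and adds the quotient to the file's starting block. The function name `contiguous_map`, its parameters, and the sample file (starting at block 14, three blocks long) are assumptions chosen for illustration.

```python
BLOCK_SIZE = 512


def contiguous_map(logical_address, start_block, length_blocks):
    # Quotient selects the block within the file, remainder the offset.
    q, r = divmod(logical_address, BLOCK_SIZE)
    if q >= length_blocks:
        raise ValueError("address beyond end of file")
    return start_block + q, r


# 1300 = 2 * 512 + 276, so the access lands in block 14 + 2.
print(contiguous_map(1300, start_block=14, length_blocks=3))
```

No disk reads are needed to locate the block, which is why contiguous allocation supports random access.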


Contiguous Allocation of Disk Space


Extent-Based Systems







Many newer file systems (e.g., Veritas File System) use a modified contiguous allocation scheme
Extent-based file systems allocate disk blocks in extents
An extent is a contiguous set of disk blocks
  Extents are allocated for file allocation
  A file consists of one or more extents


Linked Allocation 

Each file is a linked list of disk blocks: blocks may be scattered anywhere on the disk.

[Figure: layout of a block, showing its pointer field.]

Linked Allocation (Cont.)  

 

Simple: need only the starting address
Free-space management system: no waste of space
No random access
Mapping: $Q = \lfloor LA / 511 \rfloor$, $R = LA \bmod 511$
  Block to be accessed is the $Q$th block in the linked chain of blocks representing the file
  Displacement into block = $R + 1$
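The mapping for linked allocation can be sketched as a walk along the chain. The example assumes 512-word blocks whose word 0 holds the pointer (which is what the divisor 511 and the displacement R + 1 suggest); the function `linked_map`, the `next_block` dictionary, and the block numbers in `chain` are hypothetical.

```python
DATA_WORDS = 511  # 512-word block minus one word reserved for the pointer


def linked_map(logical_address, first_block, next_block):
    q, r = divmod(logical_address, DATA_WORDS)
    block = first_block
    for _ in range(q):  # follow q pointers: no random access
        block = next_block[block]
        if block is None:
            raise ValueError("address beyond end of file")
    return block, r + 1  # word 0 is the pointer, so data starts at word 1


# Hypothetical chain: 9 -> 16 -> 1 -> 10 -> 25 -> end of file.
chain = {9: 16, 16: 1, 1: 10, 10: 25, 25: None}
# 1300 = 2 * 511 + 278, so follow two pointers from block 9.
print(linked_map(1300, 9, chain))
```

Reaching the Qth block costs Q pointer lookups, each of which would be a disk read in a real system.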


Linked Allocation


Indexed Allocation 



Brings all pointers together into the index block.

[Figure: logical view of the index table.]


Example of Indexed Allocation


Linked Allocation File-Allocation Table (FAT)

Used by the MS-DOS and OS/2 operating systems

FAT resides at the beginning of each disk partition
The FAT is indexed by block number
Each FAT entry contains the next block number of the file
Last block has an EOF value
Directory entry contains the block number of the first block
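A short sketch shows how a file's blocks are recovered from the table. The `EOF` marker value, the function `fat_chain`, and the block numbers are assumptions for the example; a real FAT uses reserved entry values and a fixed-size on-disk array rather than a Python dictionary.

```python
EOF = -1  # hypothetical end-of-file marker


def fat_chain(fat, first_block):
    """Return the file's block numbers in order, starting from the
    first block recorded in the directory entry."""
    blocks = []
    block = first_block
    while block != EOF:
        blocks.append(block)
        block = fat[block]  # the entry for a block names the next block
    return blocks


# Hypothetical table: the file occupies blocks 217 -> 618 -> 339.
fat = {217: 618, 618: 339, 339: EOF}
print(fat_chain(fat, 217))
```

Because the whole chain lives in the table, the table can be cached in memory and the chain followed without touching the data blocks themselves.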


Indexed Allocation Small Files (max size 256K words; block size 512 words)

Recall, indexed allocation:
  Uses the directory entry table
  Uses an index table, normally one disk block in size (512 words)
  Provides random access
  Provides dynamic access without external fragmentation, but has the overhead of the index block

For small files:
  Requires only 1 index table
  Mapping from logical to physical: $Q = \lfloor LA / 512 \rfloor$, $R = LA \bmod 512$
  $Q$ = displacement into index table
  $R$ = displacement into block
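For the single-index-block case, the mapping is one division followed by one table lookup. In this sketch the function `indexed_map` and the contents of `index_block` are hypothetical; the 512-word block size and the 256K-word limit are taken from the slide.

```python
BLOCK_WORDS = 512


def indexed_map(logical_address, index_block):
    # q selects an entry of the index table, r the word inside the block.
    q, r = divmod(logical_address, BLOCK_WORDS)
    return index_block[q], r


# Hypothetical index block listing the file's data blocks in order.
index_block = [9, 16, 1, 10, 25]
# 1300 = 2 * 512 + 276, so entry 2 of the index table is used.
print(indexed_map(1300, index_block))

# One index block of 512 entries, each naming a 512-word block.
max_words = BLOCK_WORDS * BLOCK_WORDS
print(max_words == 256 * 1024)
```

The second print checks the slide's limit: 512 entries times 512 words per block gives 256K words.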

Indexed Allocation Large Files (unbounded length; block size 512 words)

For large files:
  Linked scheme: link several blocks of index table (no limit on size)
  Last word of each index block is a pointer to another index block

$$Q_1 = \lfloor LA / (512 \times 511) \rfloor, \qquad R_1 = LA \bmod (512 \times 511)$$

$Q_1$ = block of index table
$R_1$ is used as follows:

$$Q_2 = \lfloor R_1 / 512 \rfloor, \qquad R_2 = R_1 \bmod 512$$

$Q_2$ = displacement into block of index table
$R_2$ = displacement into block of file
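The linked index scheme can be sketched with two divisions. To keep the example short, the chain of index blocks is modeled as a Python list already in chain order, which hides the pointer-following a real system would do; `linked_index_map` and the block numbers are hypothetical.

```python
BLOCK_WORDS = 512
PTRS_PER_INDEX = 511  # last word of each index block links to the next one


def linked_index_map(logical_address, index_blocks):
    """index_blocks: index blocks in chain order, each a list of up to
    511 data-block numbers (the link word is not modeled)."""
    # Each index block covers 511 data blocks of 512 words.
    q1, r1 = divmod(logical_address, BLOCK_WORDS * PTRS_PER_INDEX)
    q2, r2 = divmod(r1, BLOCK_WORDS)
    return index_blocks[q1][q2], r2


# Two hypothetical index blocks with made-up data-block numbers.
index_blocks = [list(range(1000, 1511)), list(range(2000, 2511))]
# 512 * 511 + 1300 falls in the second index block, entry 2, word 276.
print(linked_index_map(512 * 511 + 1300, index_blocks))
```

Here Q1 picks the index block in the chain, Q2 picks the entry inside it, and R2 is the word offset within the data block.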

Indexed Allocation Two-Level Index 

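Two-level index (maximum file size is $512^3$ words)

$$Q_1 = \lfloor LA / (512 \times 512) \rfloor, \qquad R_1 = LA \bmod (512 \times 512)$$

$Q_1$ = displacement into outer-index
$R_1$ is used as follows:

$$Q_2 = \lfloor R_1 / 512 \rfloor, \qquad R_2 = R_1 \bmod 512$$

$Q_2$ = displacement into block of index table
$R_2$ = displacement into block of file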
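The two-level mapping can be illustrated with a small sketch. The function `two_level_map`, the dictionary used to hold second-level index blocks, and all block numbers are assumptions for the example; only the divisors come from the slide.

```python
BLOCK_WORDS = 512


def two_level_map(logical_address, outer_index, index_blocks):
    # Each outer-index entry covers 512 index entries of 512 words each.
    q1, r1 = divmod(logical_address, BLOCK_WORDS * BLOCK_WORDS)
    q2, r2 = divmod(r1, BLOCK_WORDS)
    inner = index_blocks[outer_index[q1]]  # fetch second-level index block
    return inner[q2], r2


# Hypothetical layout: outer index names index blocks 40 and 41.
outer_index = [40, 41]
index_blocks = {40: [100, 101, 102], 41: [200, 201, 202]}
# 512 * 512 + 1300 uses outer entry 1, inner entry 2, word 276.
print(two_level_map(512 * 512 + 1300, outer_index, index_blocks))
```

Every access costs two index lookups, but the file can span 512 × 512 data blocks.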


Indexed Allocation Multi-level Index 





Directory entry table points to the start of the outer-index table
Outer-index table points to a set of second-level index blocks
Second-level block may also point to another set of third-level index blocks
Third level points to fourth …
Entry in index table contains direct access block address

[Figure: outer-index, index table, file.]

Combined Scheme: UNIX File Systems (4K bytes per block) 

Inode contains N pointers of the index block (e.g., N = 12)
  First nine pointers point to direct blocks (small files, up to 4 × 9 = 36 KB)
  The 10th pointer points to an indirect block (an index block)
    Index block contains addresses of blocks that contain data
  The 11th pointer points to a double indirect block (an index block)
    The double indirect block contains?
  The 12th pointer points to a triple indirect block
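To see how far each pointer class reaches, the sketch below computes which class covers a given logical block. The 4 KB block size and the nine direct pointers come from the slide; the 4-byte pointer size (giving 1024 pointers per index block) is an assumption, as are the function names. The sketch also relies on the usual arrangement in which a double indirect block holds addresses of single indirect blocks.

```python
BLOCK_BYTES = 4096
POINTER_BYTES = 4  # assumption: not stated on the slide
DIRECT = 9
PER_BLOCK = BLOCK_BYTES // POINTER_BYTES  # pointers held by one index block


def level_for_block(n):
    """Return which pointer class covers logical block number n."""
    if n < DIRECT:
        return "direct"
    n -= DIRECT
    if n < PER_BLOCK:
        return "single indirect"
    n -= PER_BLOCK
    if n < PER_BLOCK ** 2:
        return "double indirect"
    n -= PER_BLOCK ** 2
    if n < PER_BLOCK ** 3:
        return "triple indirect"
    raise ValueError("block number beyond maximum file size")


def max_file_bytes():
    blocks = DIRECT + PER_BLOCK + PER_BLOCK ** 2 + PER_BLOCK ** 3
    return blocks * BLOCK_BYTES


print(level_for_block(8), "/", level_for_block(9))
print(DIRECT * BLOCK_BYTES // 1024, "KB reachable through direct pointers")
```

Under these assumptions small files never leave the direct pointers, and each further level multiplies the reachable size by the number of pointers per block.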


Free-Space Management Bit Vector

To create new files, we need to re-use free space from deleted files
How do you keep track of free space on the disk?
  Maintain a free-space list
  Free space: space not allocated to files or directories
  Implemented as a bit vector

n-bit vector, one bit per block (blocks 0, 1, 2, …, n-1):

$$\text{bit}[i] = \begin{cases} 1 & \Rightarrow \text{block}[i] \text{ free} \\ 0 & \Rightarrow \text{block}[i] \text{ occupied} \end{cases}$$

Block number calculation, applied to the first non-zero word:
  (number of bits per word) × (number of 0-value words) + offset of first 1 bit
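The block-number calculation can be tried out with a small sketch. It follows the formula as written, which scans for the first non-zero word, so a 1 bit is taken to mark a free block. The 8-bit word size, the choice to count bit offsets from the most significant bit, and the function name `first_free_block` are assumptions for the example.

```python
WORD_BITS = 8  # assumption: small words keep the example readable


def first_free_block(words):
    """Return the number of the first free block, or None if none is free."""
    for zero_words, word in enumerate(words):
        if word != 0:
            # Offset of the first 1 bit, counted from the most
            # significant bit of the word.
            offset = WORD_BITS - word.bit_length()
            return WORD_BITS * zero_words + offset
    return None


# Two all-zero words, then a word whose first 1 bit is at offset 2.
print(first_free_block([0b00000000, 0b00000000, 0b00101000]))
```

Skipping whole zero words is what makes the scan fast: most hardware can test a word against zero in a single instruction.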


Free-Space Management n-Bit Vector Storage 



n-bit vector requires extra space
  Example:
    block size = $2^{12}$ bytes
    disk size = $2^{30}$ bytes (1 gigabyte)
    $n = 2^{30} / 2^{12} = 2^{18}$ bits (or 32K bytes)
  Requires 32K bytes to store the bit map
Easy to get contiguous files


Free-Space Management Linked Free Space List on Disk 

Basic idea is to link all free disk blocks 

Requires a pointer to first free block 





Pointer stored on disk and in cache

Each block contains a pointer to the next free disk block

Linked list (free list)
  Cannot get contiguous space easily
  No waste of space


Efficiency and Performance 

Efficiency dependent on:
  Disk allocation and directory algorithms
    Linked allocation vs. indexed allocation
  Types of data kept in file's directory entry
  Pointer size limits the length of a file
    16-bit pointer: $2^{16}$ (64 KB)
    32-bit pointer: $2^{32}$ (4 GB)
    Larger pointers require more disk space to store index tables

Performance
  Disk cache: separate section of main memory for frequently used disk blocks
  Free-behind and read-ahead: techniques to optimize sequential access


Performance Page Cache



A page cache caches pages rather than disk blocks using virtual memory techniques



Memory-mapped I/O uses a page cache



Routine I/O through the file system uses the buffer (disk) cache



This leads to the following figure


Performance I/O Without a Unified Buffer Cache


Performance Unified Buffer Cache



A unified buffer cache uses the same page cache to cache both memory-mapped pages and ordinary file system I/O


Performance I/O Using a Unified Buffer Cache
