File System - Guest Lecture by Master Jim
File System structure: Files must be mapped onto disk blocks. Two separate concerns: the interface presented to the user vs. the mapping onto physical devices. The file system is organized in layers, top to bottom:
- Logical file system: interface presented to the user
- File-organization module: Maps logical file blocks to disk blocks
- Basic File System: Issues commands for specific disk blocks
- I/O Control: Device drivers and interrupt handlers; drives the DMA transfers
- Devices
Standard file systems: Berkeley FFS (Unix); FAT, FAT-32, and NTFS (Windows); Ext-2 and Ext-3 (Linux).
Look at open file structures (CS360 picture)
VFS (virtual file system): The OS presents one file-system interface, with many implementations beneath it. Simplifies structure.
Directory Implementation: (i) Linear list: Simple to implement, but slow to search. (ii) Hash table: Simple to implement and fast to search, but the table sometimes has to grow.
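A minimal sketch of the two directory layouts, to make the trade-off concrete. Everything here is illustrative: the entry format (name, inode number), the class name `HashedDirectory`, and the policy of doubling the bucket count once there is one entry per bucket are assumptions, not something the lecture specifies.

```python
# Hypothetical directory entries: (file name, inode number).
entries = [("notes.txt", 12), ("a.out", 47), ("lecture.tex", 5)]

def linear_lookup(entries, name):
    """Linear directory: compare against each entry in turn, O(n)."""
    for entry_name, inode in entries:
        if entry_name == name:
            return inode
    return None

class HashedDirectory:
    """Chained hash table that doubles its bucket count when it fills up."""

    def __init__(self, nbuckets=4):
        self.buckets = [[] for _ in range(nbuckets)]
        self.count = 0

    def _bucket(self, name):
        return self.buckets[hash(name) % len(self.buckets)]

    def _grow(self):
        # Rehash every entry into a table twice as large.
        old = [entry for bucket in self.buckets for entry in bucket]
        self.buckets = [[] for _ in range(2 * len(self.buckets))]
        for name, inode in old:
            self._bucket(name).append((name, inode))

    def insert(self, name, inode):
        # Assumes the name is not already present (no duplicate check).
        if self.count >= len(self.buckets):
            self._grow()
        self._bucket(name).append((name, inode))
        self.count += 1

    def lookup(self, name):
        # Only one bucket is searched, not the whole directory.
        for entry_name, inode in self._bucket(name):
            if entry_name == name:
                return inode
        return None
```

The `_grow` step is the cost hinted at by "the table sometimes has to grow": lookups stay fast, but an occasional insert pays for rehashing every entry.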
Allocation methods
Three methods, all in use (e.g. Data General RDOS supports all three):
Contiguous: Fast search. Fast traversal. Little metadata required. Standard memory allocation problem. Compaction: "Defrag". Does not deal with dynamic sizes well. Both external and internal fragmentation. The "Extent" buzzword (e.g. Veritas).
Linked: Allocation easy. Resizing easy. No external fragmentation. Internal fragmentation. Very slow search. Slower traversal. Can lump blocks together into clusters, but then you get even more internal fragmentation. Reliability difficult: one damaged pointer loses the rest of the file. FAT (file allocation table): Keep the links in a table stored in blocks at the head of the disk (and cache them, of course). Search is now easy, but there are many head seeks unless the FAT is cached.
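A small sketch of why the FAT makes search easy: the links live in one table, so finding logical block n means following n table entries rather than reading n data blocks. The end-of-chain marker `END_OF_CHAIN` and the sample table are made up for illustration; real FAT variants reserve their own special values.

```python
END_OF_CHAIN = -1  # hypothetical marker, not the value any real FAT uses

# fat[b] is the number of the block that follows block b in its file.
# One file occupies blocks 2 -> 5 -> 3; the other entries are not part of it.
fat = [END_OF_CHAIN, END_OF_CHAIN, 5, END_OF_CHAIN, END_OF_CHAIN, 3]

def file_blocks(fat, first_block):
    """Return every disk block of the file, in order, by walking the chain."""
    blocks = []
    b = first_block
    while b != END_OF_CHAIN:
        blocks.append(b)
        b = fat[b]
    return blocks

def nth_block(fat, first_block, n):
    """Disk block holding logical block n; touches only the table."""
    b = first_block
    for _ in range(n):
        b = fat[b]
        if b == END_OF_CHAIN:
            raise IndexError("logical block past end of file")
    return b
```

If the table is cached in memory, `nth_block` costs no disk accesses at all. If it is not, each step may be a seek back to the head of the disk.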
Indexed: A file is represented by a pointer to an index block, which contains the locations of its data blocks. Like a page table of disk blocks. Allocation easy. Fast search. Decent traversal. Good dynamic behavior. No external fragmentation. Internal fragmentation. How to deal with really large files?
- Linked index blocks
- Multilevel index blocks
- Mixed scheme like Unix inodes: 12 direct pointers to blocks, plus 3 pointers to indirect blocks (1 single indirect, 1 double indirect, 1 triple indirect). Assume block size is 4K and pointers are 4 bytes. Calculate potential size.
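The potential-size calculation can be worked through as below. This sketch assumes the classic inode layout of 12 direct pointers plus one single, one double, and one triple indirect pointer, with the 4K blocks and 4-byte pointers given above; the variable names are illustrative.

```python
BLOCK_SIZE = 4 * 1024                         # 4K blocks
POINTER_SIZE = 4                              # 4-byte block pointers
PTRS_PER_BLOCK = BLOCK_SIZE // POINTER_SIZE   # 1024 pointers fit in one block

direct = 12 * BLOCK_SIZE                      # 48 KB
single = PTRS_PER_BLOCK * BLOCK_SIZE          # 4 MB
double = PTRS_PER_BLOCK ** 2 * BLOCK_SIZE     # 4 GB
triple = PTRS_PER_BLOCK ** 3 * BLOCK_SIZE     # 4 TB

max_file_size = direct + single + double + triple
print(max_file_size)                          # total in bytes
print(max_file_size / 2 ** 40)                # total in TB, just over 4
```

The triple indirect block dominates: the total is 4 TB + 4 GB + 4 MB + 48 KB. A 4-byte pointer can name 2^32 blocks (16 TB of 4K blocks), so under these assumptions the pointer width is not the limiting factor.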
Performance, e.g. Sun's improvements for Unix:
- Allocate space in large clusters if possible
- Optimize reading
- Use read-ahead and free-behind caching
Free-Space Management
Bit Vector: Makes use of hardware (find first zero bit). Takes up a lot of space (one bit per disk block).
Linked List
Grouping (index block of free blocks)
Counting: Store a pointer to the first free block and a count of the free blocks that follow it. Captures contiguous runs.
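A rough sketch contrasting the bit vector and counting representations of the same free space. The convention that a 0 bit means "free" is taken from the "find first zero bit" remark above. The function names and the first-fit policy in `allocate_contiguous` are assumptions for illustration, and the Python loop stands in for what hardware would do a word at a time.

```python
def first_free(bitmap):
    """Bit vector: bitmap[i] is 1 if block i is in use, 0 if it is free."""
    for i, bit in enumerate(bitmap):
        if bit == 0:
            return i
    return None

def to_runs(bitmap):
    """Counting: the same free space as (first free block, count) pairs."""
    runs = []
    i = 0
    while i < len(bitmap):
        if bitmap[i] == 0:
            start = i
            while i < len(bitmap) and bitmap[i] == 0:
                i += 1
            runs.append((start, i - start))
        else:
            i += 1
    return runs

def allocate_contiguous(runs, n):
    """First fit: carve n blocks off the first run that is long enough."""
    for idx, (start, count) in enumerate(runs):
        if count >= n:
            if count == n:
                del runs[idx]                       # run is used up entirely
            else:
                runs[idx] = (start + n, count - n)  # shrink run from the front
            return start
    return None

bitmap = [1, 1, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0]
runs = to_runs(bitmap)   # three runs of free blocks, starting at 2, 6, and 11
```

The bit vector needs one bit per block no matter how the free space is laid out, while the counting list has one entry per free run, so it stays short when free space is mostly contiguous and finding n adjacent blocks is a scan of the list.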
Efficiency and Performance
Efficiency: Preallocating inodes. Access times. Pointer sizes.
Performance: Disk controller cache. Disk block cache - unified buffer cache. Read-ahead.