Current Linux file systems face a number of challenges in scaling to large storage subsystems. A file system must scale both in its ability to address and manage large amounts of storage and in its ability to detect, repair and tolerate errors in the data stored on disk. BTRFS is a new copy-on-write file system for Linux that aims to implement advanced features while focusing on fault tolerance, repair and easy administration. It is expected to be free of many of the limitations that other Linux file systems currently have. The extent-based approach used in BTRFS is more scalable and efficient than the fixed 4 KB block mapping of Ext3.
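As a rough illustration of why extents scale better than per-block mapping, the sketch below coalesces a list of block numbers into (start, length) extents. The function name, the 1 GiB file size and the starting block number are assumptions chosen for the example; this is not how BTRFS or Ext3 actually store their mappings on disk.

```python
def blocks_to_extents(blocks):
    """Coalesce block numbers (in file order) into (start, length) extents."""
    extents = []
    for b in blocks:
        # Extend the last extent if this block directly follows it.
        if extents and extents[-1][0] + extents[-1][1] == b:
            start, length = extents[-1]
            extents[-1] = (start, length + 1)
        else:
            extents.append((b, 1))
    return extents


BLOCK_SIZE = 4096            # 4 KB blocks, as in the Ext3 comparison
FILE_SIZE = 1 << 30          # hypothetical 1 GiB file
n_blocks = FILE_SIZE // BLOCK_SIZE

# Hypothetical layout: the file is stored contiguously from block 1000.
contiguous = list(range(1000, 1000 + n_blocks))
extents = blocks_to_extents(contiguous)

print("block pointers needed:", n_blocks)
print("extent records needed:", len(extents))
```

For a contiguous file the per-block scheme needs one pointer per 4 KB block, while the extent scheme needs a single record. A fragmented file needs more extents, but never more than one per block.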
The main BTRFS features include:
- Extent based file storage (2^64 max file size).
- Space efficient packing of small files.
- Space efficient indexed directories.
- Dynamic inode allocation.
- Writable snapshots.
- Subvolumes (separate internal file system roots).
- Object level mirroring and striping.
- Checksums on data and metadata (multiple algorithms available).
- Integrated multiple device support, with several RAID algorithms.
- Online file system check. Very fast offline file system check.
- Efficient incremental backup and FS mirroring.
- Online file system defragmentation.
BTRFS uses COW (copy-on-write) trees, also known as Rodeh's B-trees. The major advantages of COW B-trees are:
- Increases in the overall depth of the tree are infrequent
- Updates require a minimum amount of rearrangement
- A search takes O(log2 N) time
- Entire tree need not be in memory
- COW facility speeds up operations
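The copy-on-write idea behind these trees can be sketched in a few lines. The example below is a deliberately simplified model: it uses a binary search tree instead of a B-tree, and the names Node, cow_insert and lookup are hypothetical. It only shows that an update copies the nodes on the path from the root to the change and shares everything else, which is why an older root (a snapshot) stays valid.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Node:
    key: int
    value: str
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def cow_insert(root, key, value):
    """Return a new root. Nodes on the search path are copied, others shared."""
    if root is None:
        return Node(key, value)
    if key < root.key:
        return Node(root.key, root.value,
                    cow_insert(root.left, key, value), root.right)
    if key > root.key:
        return Node(root.key, root.value,
                    root.left, cow_insert(root.right, key, value))
    # Same key: build a replacement node, never modify the old one.
    return Node(key, value, root.left, root.right)


def lookup(root, key):
    while root is not None:
        if key == root.key:
            return root.value
        root = root.left if key < root.key else root.right
    return None


v1 = None
for k, v in [(50, "a"), (30, "b"), (70, "c")]:
    v1 = cow_insert(v1, k, v)

snapshot = v1                           # keeping the old root is the snapshot
v2 = cow_insert(v1, 30, "b-modified")   # update produces a new root

print(lookup(snapshot, 30), lookup(v2, 30))
print(v2.right is snapshot.right)       # untouched subtree is shared
```

Because the old root is never modified, taking a snapshot costs nothing more than keeping a reference to it. A real COW B-tree applies the same principle to multi-key blocks on disk.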
BTRFS internally knows about only three data structures: the block header (btrfs_header), the key (btrfs_key) and the item (btrfs_item). The block header contains a checksum of the block contents, the UUID of the file system, the level of the block in the tree and the block number where the block is supposed to live. The key contains a unique object id, analogous to the inode number in the ext series. The object id occupies the most significant bits of the key, so all the information associated with a particular object id is grouped together. The type field gives the type of the item, such as inode or file data. The offset field of the key indicates the byte offset for a particular item; an inode has offset value 0. An item contains the key describing the item, the offset of the item's data and the size of the item.
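The grouping effect of placing the object id in the most significant bits can be shown with a small model. In the sketch below, Key and Item are simplified stand-ins for btrfs_key and btrfs_item, and the type codes INODE_ITEM and FILE_DATA are made-up values for illustration, not the real on-disk constants.

```python
from typing import NamedTuple

# Hypothetical type codes, chosen only for this example.
INODE_ITEM = 1
FILE_DATA = 2


class Key(NamedTuple):
    objectid: int   # most significant part of the key
    type: int       # kind of item (inode, file data, ...)
    offset: int     # byte offset; 0 for an inode


class Item(NamedTuple):
    key: Key        # key describing the item
    offset: int     # where the item's data starts
    size: int       # size of the item's data


keys = [
    Key(258, FILE_DATA, 4096),
    Key(257, FILE_DATA, 0),
    Key(258, INODE_ITEM, 0),
    Key(257, INODE_ITEM, 0),
    Key(257, FILE_DATA, 8192),
]

# Tuples compare field by field, so objectid dominates the ordering.
ordered = sorted(keys)
for k in ordered:
    print(k)
```

After sorting, every key for object 257 precedes every key for object 258, and within an object the inode comes before its file data. This is the sense in which all information for one object id ends up stored together in the tree.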
The basic purpose of this seminar is to provide detailed information regarding the metadata structure, file organization strategies and features of BTRFS with a brief look at the limitations of existing UNIX file systems.