Link Search Menu Expand Document

Due by 11:59 PM on Sunday, Nov 22

Changelog

  • November 21, 2020: Part II release
  • November 21, 2020: Add clarifications about layout
  • November 08, 2020: Initial version

The Multimedia Embedded Memory Encapsulation Filesystem

The filesystem you will design in this project will follow a very specific format, to ensure that all students should be able to design code that is interoperable with each other. That filesystem is described below.

The filesystem is a simple file allocation table based filesystem with a fixed volume size of 128KiB (256 total blocks of 512 bytes each), a single-level directory structure, and 16-bit entries in its FAT. However, the filesystem has a permissions model more similar to the ext2/ext3/ext4 filesystem that is prevalent on Linux, rather than the model employed in the FAT16 or FAT32 filesystems as designed by Microsoft.

A complete filesystem consists of a backup superblock (1 block), user data area (220 blocks), a system reserved section (18 blocks), a backup file allocation table (1 block), the single-level directory (14 blocks), the main file allocation table (1 block), and the main superblock (1 block). The user data area is used for storing files on the filesystem, and has no specific format. The reserved area should always be initialized with 0 bytes and should generally not be used — that is to say that you should ignore this area of the filesystem entirely and not modify anything contained in it. The directory, file allocation table, and root blocks each have a specific format, which will be described below. All multi-byte values are stored in big endian byte ordering (also known as network byte order). Please note that the native number format of the x86-64 CPU that is in your system is little endian, so you will need to convert any multi-byte integer values to big endian.

The above layout is just an example, and refers to the image produced by mkmemefs. The location of all sections can change, the size of all sections can change, the number of blocks in an image can change, almost everything can change. That’s why you have your superblock(s) to tell you the layout of the image.

Here are a few things that won’t change:

  • The 0th block is the backup superblock
  • The last block is the primary superblock
  • The size of the block is always 512 bytes
  • The layout of a superblock (See The Superblock)
  • The layout of a directory entry (See The Directory structure)
  • The size of a FAT entry (2 bytes)

The Superblock

The filesystem superblock is always stored as the last block in the volume. As all basic volumes are 128KiB in size, this is block 255 in the volume (assuming that the first block is block 0). The superblock is responsible for directing the filesystem driver to various important pieces of data in the volume and recording some basic information that is set when the volume is formatted. A backup superblock is stored as the first block of the volume and can should be kept in-sync with the main superblock. This backup block should only be used if the main superblock is found to be corrupted (for instance, if the signature in the main superblock is corrupt or some other pre-defined value is not correct). The format of the superblock is described below.

Offset Size
(bytes)
Description
0x000 16 Filesystem Signature - ASCII characters: ‘?MEMEFS++CMSC421’
0x010 1 Set to 0 to mark the filesystem as having been cleanly unmounted. Should be set to 0xFF when the filesystem is mounted and cleared to 0 on unmount.
0x011 3 Reserved, set to 0 when formatting and ignore otherwise
0x014 4 Version number of the filesystem, set to 0x00000001
0x018 8 Filesystem creation timestamp (Binary Coded Decimal format)
0x020 2 Starting block of the main FAT, set to 0x00FE (254)
0x022 2 Size of the main FAT, in blocks; set to 0x0001
0x024 2 Starting block of the backup FAT, set to 0x00EF (239)
0x026 2 Size of the backup FAT, in blocks; set to 0x0001
0x028 2 First block of the directory, set to 0x00FD (253)
0x02A 2 Size of the directory, in blocks; set to 0x000E (14)
0x02C 2 Number of user blocks, set to 0x00DC (220)
0x02E 2 First user block, set to 0x0001
0x030 16 Volume label (ASCII characters) — fill in empty characters with NUL (‘\0’) bytes.
0x040 448 Unused, set all bytes to 0 when formatting and ignore otherwise

Binary-Coded Decimal Timestamps

Timestamps (both in the superblock and in the directory) are stored in a format known as Binary-Coded Decimal (BCD, for short). In Binary-Coded Decimal, each byte represents two digits, with 4 bits for each digit. If you look at a BCD number in hexadecimal (and ignore any indication it is in hex), it will conveniently have the decimal value of the number. All timestamps should be stored as Universal Coordinated Time (since there is no UTC offset in the timestamp). BCD is easiest to demonstrate with a few examples:

Decimal Value Binary (BCD) Hexadecimal (BCD)
20 0010 0000 0x20
18 0001 1000 0x18
31 0011 0001 0x31
5 0000 0101 0x05

Timestamps in are always stored in the following format (one byte each, each of these are single-byte quantities):

  1. Century (20)
  2. Year within century (20)
  3. Month within year (01 = January, 12 = December)
  4. Day within month (01-31)
  5. Hour of the day (00-23)
  6. Minute within hour (00-59)
  7. Second within minute (00-59)
  8. Unused, set to 0

The File Allocation Table structure

The file allocation table is an array of 16 bit values. The entire FAT fits in one or more blocks of 512 bytes on the volume, and has one entry for every block on the volume (including the block(s) it resides in). For an overview of how the FAT scheme works for allocation, see Chapter 14 of OSC10e or Chapter 12 of OSC9e. There are also other resources linked to in this project that may be helpful here. The basics of the file allocation table scheme used are as follows:

  • Free blocks should be marked with 0x0000
  • The end of chain marker is 0xFFFF
  • Each used block should contain either an end of chain marker, or a pointer to the next block in use for the file. Valid block numbers for data blocks range from 1 to 220 in the default filesystem image.

The directory is treated as a special file that is allocated starting at block 253 of the volume, with each subsequent block preceding that. That is to say that the second block of the directory is allocated at block 252, the third at block 251, and so on until the fourteenth block which is allocated at block 240 of the volume. The directory appears in the FAT as a normal file chain, which has been set up as described above.

At format time, this means that entries 1-238 in the FAT should be set to 0x0000, entry 240 should be set to 0xFFFF, entries 241-253 should be set to their entry number minus 1, and entries 0, 239, 254 and 255 should be set to 0xFFFF.

The Directory structure

Each file on this filesystem has an entry in the directory structure that describes some basic information about the file. Each entry in the directory is 32 bytes long. With 14 blocks of 512 bytes allocated for the directory at formatting time, this gives us a maximum of 224 possible files on each filesystem. Files are always allocated in block size increments, like any sensible filesystem, so the limit of 224 possible files allows for the maximum number of files that we could have on a filesystem anyway (since there are only 220 blocks available for files). Each entry in the directory follows the format described below:

Offset Size
(bytes)
Description
0x00 2 File Type and Permissions (0x0000 = file entry not used) — described below
0x02 2 Location of the first block of the file
0x04 11 8.3 style filename (ASCII string, fill unused bytes with NUL bytes, do not include the dot)
0x0F 1 Unused, set to zero.
0x10 8 BCD timestamp of the last write to the file
0x18 4 File size, in bytes
0x1C 2 Owner user ID
0x1E 2 Owner group ID

The file type and permissions bits are stored as an array of bits marking whether the file is in use and what the UNIX-style permissions are on the file. There are 9 bits reserved for this purpose (the SUID, SGID, and Sticky bits are not supported by this filesystem). The bits are ordered such that the owner’s bits are the high-order bits (bits 6, 7, and 8), the group’s bits are next (bits 3, 4, and 5), and the bits used for all other users are in the low-order bits (0, 1, and 2). As is customary on UNIX, the bits are ordered read, write, execute from highest to lowest in each set. All bits within the file type and permissions word that are not used for permissions shall be set to 1 to mark a file as a regular file. This filesystem does not support any special file types including symlinks, hard links, directories, devices, FIFOs, or sockets.

File names must be at most eight characters long, with a three character extension. The extension is always stored in the last three bytes of the filename field. Unused spaces in the pre-extension portion of the filename shall be filled in with NUL (‘\0’ bytes), as shall unused characters in the extension. Valid filenames consist only of upper- or lower-case letters A-Z, numbers 0-9, and the special characters ^, -, _, =, and |. No other characters are allowed in a filename. If you encounter a file on a volume with an illegal filename, you must ignore the entire file entry in the directory (treat it as an unused entry). Filenames in this filesystem are case-sensitive, so README.TXT and README.txt are two different filenames. To demonstrate how filenames are stored, the filename README.TXT would be stored as follows:

Byte Character
0 ‘R’
1 ‘E’
2 ‘A’
3 ‘D’
4 ‘M’
5 ‘E’
6 ‘\0’
7 ‘\0’
8 ‘T’
9 ‘X’
10 ‘T’

Files shall be deleted by setting their file type and permissions field in the directory entry to 0x0000 and by marking all blocks of the file as free in the FAT. You may clear the entire directory entry, but you are not required to do so. You also do not have to clear the blocks of the file themselves — just clear the entries in the FAT (and the backup FAT).

When a data block is initially allocated to a file, it shall be cleared to all 0 bytes (that is NUL bytes, not bytes with a literal ‘0’ character).