Reduced FAT File System

Reduced FAT File System Project

by: burt rosenberg
at: university of miami
date: nov 2025

US Patent US4905110 A

Goals

File systems are an example of data persistence. Computer operating systems make the file system one of their core provisions to the user. Modern operating systems allow for a variety of file systems to be in use at the same time.

A very popular file system is FAT. It was Microsoft's original file system, even before Windows, and continues popular to this day, used in many devices.

This file system is a reduced version of the FAT file system, retaining the key concepts. Some simplifications undo accidents of history that complicate FAT. (E.g., the use of 0xE5 as a directory marking, which turned out to conflict with the Shift-JIS encoding). Other simplifications are in the interest of a more enjoyable project.

According to ChatGPT, the original FAT was developed by Bill Gates and Marc McDonald even before MS-DOS existed. The original FAT had 8 bit FAT entries, which were later expanded to 12 bit and 16 bit FAT entries in FAT12 and FAT16. This allowed for larger disks and files sizes.

Later FAT16 allowed for long file name (Apple gave Microsoft C:\ONGRTLNS.W95 on this important event.) The official FAT32 specification is published, if you are interested in all the details.

Disks and Filesystems
The MAN Page
The Cluster Table
The FAT Table
The Root Directory and DirEnt
Actions to Implement
Development Plan
Implementation Notes
Appending to a file example
A FAT Example
Example run of fat-reduced

Disks and Filesystems

A disk is an indexed sequence of blocks. A block is a some number of bytes. Early disks had 128 byte blocks. A modern high capacity drive has 4096 byte blocks. Clusters are the software analog of blocks.

A disk driver, speaking directly to the disk, only knows how to move blocks of data to and from the disk by the disk index. A filesystem uses this notion of an array of clusters (blocks joined as clusters) to make a system that creates and maintains files and folders.

Our reduced implementation of the FAT file system will focus on three elements,

A data region as an array of clusters.
The FAT table.
The root directory.

In reality all of these will be laid into an array of clusters, but we will extract out the root diretory as an array of strut DirEnt and the FAT table as an array of integers. The data region remains fairly authentic as an array of clusters. Each strut DirEnt contains this important information,

The file name
The file length
The file type (file or directory)
The index of the first of a chain of clusters where the data for the file is stored.

The MAN Page

NAME
    fat-reduced
    
SYNOPSIS
    fat-reduced  [-v]
    
DESCRIPTION
    An interactive program to simulate the operation of the File Allocation
    Table of a very much simplified FAT file system.

OPTIONS
    -v verbose output. Can be used multiple times to increase verbosity.

COMMANDS
    ls 
       to list the file names, data lengths, and directory index
    qu
       to quit. this is an administrative command   
    rd _filename_
       to print the contents of file _filename_. Error if it does not exist.
    rm _filename_ 
       to remove the file _filename_. Error if it does not exist.
    wr _filename_ _content_ 
       to create or update the file _filename_ with content _content_. Although
       the file can contain binary values, this command creates files with ASCII
       content only.
    dd _filename_
       to display the cluster chain associated with the named file

   Filenames are at most 15 characters. Values are 64 bytes.

HISTORY
    Created for the 241 edition of CSC421 (2023-2024 academic year).
    
BUGS

The Cluster Table

In the program, the data section is the array cluster_table of entries struct Cluster, where each cluster is of size CLUSTER_SIZE.

The cluster_table is a global variable declared in fat-reduced-util.c.

The FAT Table

The central element of the FAT filesystem is the File Allocation Table, a.k.a. the FAT.

The fat_table is an array of unsigned int. It is a global variable declared in fat-reduced-util.c.

The data for a file is stored in a collection of clusters allocated for this purpose. The clusters do not have to be in sequence on the disk. Instead there is a sequence that can jump around. The sequence of clusters in encoded in the FAT as a linked list of clusters.

The entries in the FAT are in exact correspondence with the clusters in the data region. The i-th entry is associated with the i-th cluster in the data region.

If fat_table[i]==0 then cluster_table[i] is not allocated to any file.
If fat_table[i]!=0 then cluster_table[i] is allocated to a file and the next cluster in the sequence of clusters allocated to this file is cluster_table[fat_table[i]]
If fat_table[i]==-1 then this cluster is allocated to a file and is the last cluster in the sequence of clusters allocated to this file.

The Root Directory and DirEnt

In our implementation, we have an array of struct DirEnt named root_dir. It is a global variable declared in fat-reduced-util.c

The directory entry structure is defined in fat-reduced.h as,

struct  DirEnt {
    char name[FILENAME_LEN] ;  // first char \0 means entry empty; might not be null terminated
    unsigned short flags ;     // not needed for this project
    unsigned int len ;         // the actual length. the cluster might give more room.
    unsigned int starting_cluster ;  // the first cluster in the cluster chain
    } ;

Note some important points. Use strncmp and strncpy in the field name. This field is size FILENAME_LEN and filenames can be up to FILENAME_LEN in length.

See the documentation of strncpy for the appropriate string functions. If the name is shorter than FILENAME_LEN then it is null terminated. If it is equal to FILENAME_LEN then it is not null terminated.

By convention, if the first character of name is \0, then the DirEnt is unused. A typical implementation would not zero the entire struct — it would just zero this one byte.

Actions to Implement

The file fat-reduced.c reads and parses the typed command. It then jumps to the function that implements the command. These functions are what you have to implement. They unfinished versions of them are found in fat-reduced-util.c.

This is a list of the functions you are to implement.

qu_action() to implement the qu operation quit the program.
- Command takes no arguments.
- Output: The action has no output.
- Errors: ERR_ABORT: returned always.
ls_action() function to implement that ls operation to list the root directory.
- Command takes no arguments.
- Output:
  - A printf format is provided, so output matches reference.
  - The listing gives the DirEnt number, the file length, and file name.
  - The listing is in order of the DirEnt index.
  - Empty Dirent's are not listed.
- Errors: None.
rm_action() to implement that rm operation to remove a file.
- Command takes one argument: the filename.
- Remove the file by setting the filename to the empty string
- Freeing the clusters associated with the file by zeroing in the FAT table.
- Output: The action has no output.
- Errors:
  - ERR_NOEXIST: if the file to remove does not exist.
rd_action() to implement the rd operation to print the file contents.
- Command takes one argument: the filename.
- Output:
  - The format must be precisely as specified.
  - You may assume that the contents are printable ascii.
- Note: we read the entire file.
- Errors:
  - ERR_NOEXIST: if the file to read does not exist.
wr_action() function to implement the wr operation.
- Command takes two arguments: the filename and the file contents.
- If the file does not exist, create it with the given contents.
- If the file exists, append the given contents to the file.
- Clusters are allocated as needed, but never less than one.
- The content is assumed to be binary.
- The actual content size is recorded in the DirEnt.
- Output: The action has no output.
- Errors:
  - ERR_DIRFULL: no more DirEnt's available.
  - ERR_FATFULL: no more FAT table entries available.
  - ERR_DISKFULL: no more clusters available.
dd_action() to implement that dd operation to dump the files metadata.
- Command takes one argument: the filename.
- Output: Displays the cluster chain for the file named.
- Errors:
  - ERR_NOEXIST: if the file to remove does not exist.

Development Plan

I would do this project in stages. Each stage finishes something you need for the next stage, and can be complete for a partial implementation

Implement the root_dir. (Ignore entirely the cluster_table and the fat_table)
- Get ls working,
- Get wr, restricted to only changes in the root directory.
- Get rm, restricted to only changes in the root directory
Implement reading and writing of files requiring no more than one cluster.
- Ignore writes that append to existing files.
- Write code for the fat_table to find free clusters, mark them in use.
- Add to wr to allocate and record the starting cluster, and copy the data into the cluster, and record the length (possibly removing the null byte.)
- Add to rd to copy the data out of the starting cluster, using the length recorded in the DirEnt, add the null byte and return the data.
- Note: The data as stored in the clusters are not strings. In particular, in the case the data fills the cluster completely, there will be no null byte. Always go by the length field in the DirEnt.
Implement file remove consistent with the current partial implementation.
- In addition to rm working with the root_dir, it will modify the fat_table to mark the file's unique cluster as free.
Note that this is a fine partially finished project.
- Don't stop here!
- But if you have to stop, this is a nice accomplishment.
- Check this code in. Just in case.
Implement reading and writing of files requiring more than one cluster.
- Still, ignore writes that append to existing files.
- Write code for the fat_table to find free clusters, mark them in use, and form them into a cluster chain.
- Add to wr to call a subroutine (previous step) to create a chain and return the starting cluster; then copy the the data into the cluster chain.
- Add to rd to copy the data out of a chain of clusters, using the length recorded in the DirEnt, add the null byte (as data in clusters does not always have a null byte.)
Add to the rm implementation to include the tear-down of cluster chains.
Implement appending writes.

Implementation Notes

To receive full credit for the project, certain operations must be done in strict conformance with this specification.

If a filename is too long, silently truncate.
Data length is as in DirEnt.len, not according to a null byte. Your code might be tested on this.
When finding free DirEnt's or Clusters, find and use the one with smallest index. This is for purposes of automatic grading.
Use this format to implement the ls command:
Use this format to implement the rd command:
Use these formats to implement the dd command:
(2025: The template has this command implemented.)
On an appending, if the slack in the final cluster is sufficient for the additional bytes to append, do not add a cluster.
A zero byte file allocates one cluster. Otherwise, files of size S should allocate K clusters, where (K*Cluster_Size==S).

Appending to a file example

Original file with contents "abcdef"; stored in two 4 byte clusters with 2 bytes of slack
       +---+---+---+---+    +---+---+---+---+       
       | a | b | c | d |--->| e | f |   |   |   
       +---+---+---+---+    +---+---+---+---+

Wrote "ghijkl" to the file; slack used and one more 4 byte cluster added
       +---+---+---+---+    +---+---+---+---+    +---+---+---+---+
       | a | b | c | d |--->| e | f | g | h |--->| i | j | k | l |
       +---+---+---+---+    +---+---+---+---+    +---+---+---+---+

A FAT Example

2. After creating 3 files.

    +---+
0   | 1 |
    +---+
1   |-1 |
    +---+
2   | 3 |
    +---+
3   |-1 |
    +---+
4   | 5 |
    +---+
5   |-1 |
    +---+
6   | 0 |
    +---+
7   | 0 |
    +---+

1. An initialized FAT table

    +---+
0   | 0 |
    +---+
1   | 0 |
    +---+
2   | 0 |
    +---+
3   | 0 |
    +---+
4   | 0 |
    +---+
5   | 0 |
    +---+
6   | 0 |
    +---+
7   | 0 |
    +---+

We begin with a FAT table all zeros, since all blocks are free.

Three files are created, each requiring two blocks.

The first chains clusters 0 and 1, 0->1.
The second chains clusters 2 and 3, 2->3.
The third chains clusters 4 and 5, 4->5.

The clusters 6 and 7 are still free.

4. After creating a fourth file.
    +---+
0   | 1 |
    +---+
1   |-1 |
    +---+
2   | 3 |
    +---+
3   | 6 |
    +---+
4   | 5 |
    +---+
5   |-1 |
    +---+
6   |-1 |
    +---+
7   | 0 |
    +---+

3. After deleted the second file

    +---+
0   | 1 |
    +---+
1   |-1 |
    +---+
2   | 0 |
    +---+
3   | 0 |
    +---+
4   | 5 |
    +---+
5   |-1 |
    +---+
6   | 0 |
    +---+
7   | 0 |
    +---+

We delete the file that occupied clusters 2 and 3. The FAT entries are zero.
Finally, a file requiring 3 clusters is created. It takes the first 3 free clusters available and chains them into the cluster chain 2->3->6.

Example run of fat-reduced

% ./fat-reduced 
ls
wr hi hello_world
rd hi
hello_world
ls
  0     11 hi
wr bye good_night_moon
ls
  0     11 hi
  1     15 bye
rd bye
good_night_moon
wr hi _the_sun_is_up
rd hi
hello_world_the_sun_is_up
ls
  0     25 hi
  1     15 bye
rm hi
ls 
  1     15 bye
rd hi
Command rd returned error 2.
qu
%

This work is licensed under a Creative Commons Attribution-ShareAlike 3.0 Unported License.

author: burton rosenberg
created: 5 nov 2023
update: 7 nov 2025