Reduced FAT File System Project

by: burt rosenberg
at: university of miami
date: nov 2025

US Patent US4905110 A

Goals

File systems are an example of data persistence. Computer operating systems make the file system one of their core provisions to the user. Modern operating systems allow for a variety of file systems to be in use at the same time.

A very popular file system is FAT. It was Microsoft's original file system, even before Windows, and continues popular to this day, used in many devices.

This file system is a reduced version of the FAT file system, retaining the key concepts. Some simplifications undo accidents of history that complicate FAT. (E.g., the use of 0xE5 as a directory marking, which turned out to conflict with the Shift-JIS encoding). Other simplifications are in the interest of a more enjoyable project.

According to ChatGPT, the original FAT was developed by Bill Gates and Marc McDonald even before MS-DOS existed. The original FAT had 8 bit FAT entries, which were later expanded to 12 bit and 16 bit FAT entries in FAT12 and FAT16. This allowed for larger disks and files sizes.

Later FAT16 allowed for long file name (Apple gave Microsoft C:\ONGRTLNS.W95 on this important event.) The official FAT32 specification is published, if you are interested in all the details.

Table of Contents

  1. Disks and Filesystems
  2. The MAN Page
  3. The Cluster Table
  4. The FAT Table
  5. The Root Directory and DirEnt
  6. Actions to Implement
  7. Development Plan
  8. Implementation Notes
  9. Appending to a file example
  10. A FAT Example
  11. Example run of fat-reduced

Disks and Filesystems

A disk is an indexed sequence of blocks. A block is a some number of bytes. Early disks had 128 byte blocks. A modern high capacity drive has 4096 byte blocks. Clusters are the software analog of blocks.

A disk driver, speaking directly to the disk, only knows how to move blocks of data to and from the disk by the disk index. A filesystem uses this notion of an array of clusters (blocks joined as clusters) to make a system that creates and maintains files and folders.

Our reduced implementation of the FAT file system will focus on three elements,

  1. A data region as an array of clusters.
  2. The FAT table.
  3. The root directory.
In reality all of these will be laid into an array of clusters, but we will extract out the root diretory as an array of strut DirEnt and the FAT table as an array of integers. The data region remains fairly authentic as an array of clusters. Each strut DirEnt contains this important information,
  1. The file name
  2. The file length
  3. The file type (file or directory)
  4. The index of the first of a chain of clusters where the data for the file is stored.
The MAN Page
NAME
    fat-reduced
    
SYNOPSIS
    fat-reduced  [-v]
    
DESCRIPTION
    An interactive program to simulate the operation of the File Allocation
    Table of a very much simplified FAT file system.

OPTIONS
    -v verbose output. Can be used multiple times to increase verbosity.

COMMANDS
    ls 
       to list the file names, data lengths, and directory index
    qu
       to quit. this is an administrative command   
    rd _filename_
       to print the contents of file _filename_. Error if it does not exist.
    rm _filename_ 
       to remove the file _filename_. Error if it does not exist.
    wr _filename_ _content_ 
       to create or update the file _filename_ with content _content_. Although
       the file can contain binary values, this command creates files with ASCII
       content only.
    dd _filename_
       to display the cluster chain associated with the named file

   Filenames are at most 15 characters. Values are 64 bytes.

HISTORY
    Created for the 241 edition of CSC421 (2023-2024 academic year).
    
BUGS

The Cluster Table

In the program, the data section is the array cluster_table of entries struct Cluster, where each cluster is of size CLUSTER_SIZE.

The cluster_table is a global variable declared in fat-reduced-util.c.

The FAT Table

The central element of the FAT filesystem is the File Allocation Table, a.k.a. the FAT.

The fat_table is an array of unsigned int. It is a global variable declared in fat-reduced-util.c.

The data for a file is stored in a collection of clusters allocated for this purpose. The clusters do not have to be in sequence on the disk. Instead there is a sequence that can jump around. The sequence of clusters in encoded in the FAT as a linked list of clusters.

The entries in the FAT are in exact correspondence with the clusters in the data region. The i-th entry is associated with the i-th cluster in the data region.

  1. If fat_table[i]==0 then cluster_table[i] is not allocated to any file.
  2. If fat_table[i]!=0 then cluster_table[i] is allocated to a file and the next cluster in the sequence of clusters allocated to this file is cluster_table[fat_table[i]]
  3. If fat_table[i]==-1 then this cluster is allocated to a file and is the last cluster in the sequence of clusters allocated to this file.

The Root Directory and DirEnt

In our implementation, we have an array of struct DirEnt named root_dir. It is a global variable declared in fat-reduced-util.c

The directory entry structure is defined in fat-reduced.h as,

struct  DirEnt {
    char name[FILENAME_LEN] ;  // first char \0 means entry empty; might not be null terminated
    unsigned short flags ;     // not needed for this project
    unsigned int len ;         // the actual length. the cluster might give more room.
    unsigned int starting_cluster ;  // the first cluster in the cluster chain
    } ;
Note some important points. Use strncmp and strncpy in the field name. This field is size FILENAME_LEN and filenames can be up to FILENAME_LEN in length.

See the documentation of strncpy for the appropriate string functions. If the name is shorter than FILENAME_LEN then it is null terminated. If it is equal to FILENAME_LEN then it is not null terminated.

By convention, if the first character of name is \0, then the DirEnt is unused. A typical implementation would not zero the entire struct — it would just zero this one byte.

Actions to Implement

The file fat-reduced.c reads and parses the typed command. It then jumps to the function that implements the command. These functions are what you have to implement. They unfinished versions of them are found in fat-reduced-util.c.

This is a list of the functions you are to implement.

Development Plan

I would do this project in stages. Each stage finishes something you need for the next stage, and can be complete for a partial implementation

  1. Implement the root_dir. (Ignore entirely the cluster_table and the fat_table)
  2. Implement reading and writing of files requiring no more than one cluster.
  3. Implement file remove consistent with the current partial implementation.
  4. Note that this is a fine partially finished project.
  5. Implement reading and writing of files requiring more than one cluster.
  6. Add to the rm implementation to include the tear-down of cluster chains.
  7. Implement appending writes.

Implementation Notes

To receive full credit for the project, certain operations must be done in strict conformance with this specification.

  1. If a filename is too long, silently truncate.
  2. Data length is as in DirEnt.len, not according to a null byte. Your code might be tested on this.
  3. When finding free DirEnt's or Clusters, find and use the one with smallest index. This is for purposes of automatic grading.
  4. Use this format to implement the ls command:
  5. Use this format to implement the rd command:
  6. Use these formats to implement the dd command: (2025: The template has this command implemented.)
  7. On an appending, if the slack in the final cluster is sufficient for the additional bytes to append, do not add a cluster.
  8. A zero byte file allocates one cluster. Otherwise, files of size S should allocate K clusters, where (K*Cluster_Size==S).

Appending to a file example

Original file with contents "abcdef"; stored in two 4 byte clusters with 2 bytes of slack
       +---+---+---+---+    +---+---+---+---+       
       | a | b | c | d |--->| e | f |   |   |   
       +---+---+---+---+    +---+---+---+---+

Wrote "ghijkl" to the file; slack used and one more 4 byte cluster added
       +---+---+---+---+    +---+---+---+---+    +---+---+---+---+
       | a | b | c | d |--->| e | f | g | h |--->| i | j | k | l |
       +---+---+---+---+    +---+---+---+---+    +---+---+---+---+

A FAT Example

2. After creating 3 files.

    +---+
0   | 1 |
    +---+
1   |-1 |
    +---+
2   | 3 |
    +---+
3   |-1 |
    +---+
4   | 5 |
    +---+
5   |-1 |
    +---+
6   | 0 |
    +---+
7   | 0 |
    +---+
1. An initialized FAT table

    +---+
0   | 0 |
    +---+
1   | 0 |
    +---+
2   | 0 |
    +---+
3   | 0 |
    +---+
4   | 0 |
    +---+
5   | 0 |
    +---+
6   | 0 |
    +---+
7   | 0 |
    +---+
  1. We begin with a FAT table all zeros, since all blocks are free.

  2. Three files are created, each requiring two blocks.
    1. The first chains clusters 0 and 1, 0->1.
    2. The second chains clusters 2 and 3, 2->3.
    3. The third chains clusters 4 and 5, 4->5.
    The clusters 6 and 7 are still free.

    4. After creating a fourth file.
        +---+
    0   | 1 |
        +---+
    1   |-1 |
        +---+
    2   | 3 |
        +---+
    3   | 6 |
        +---+
    4   | 5 |
        +---+
    5   |-1 |
        +---+
    6   |-1 |
        +---+
    7   | 0 |
        +---+
    
    3. After deleted the second file
    
        +---+
    0   | 1 |
        +---+
    1   |-1 |
        +---+
    2   | 0 |
        +---+
    3   | 0 |
        +---+
    4   | 5 |
        +---+
    5   |-1 |
        +---+
    6   | 0 |
        +---+
    7   | 0 |
        +---+
    
  3. We delete the file that occupied clusters 2 and 3. The FAT entries are zero.

  4. Finally, a file requiring 3 clusters is created. It takes the first 3 free clusters available and chains them into the cluster chain 2->3->6.

Example run of fat-reduced

% ./fat-reduced 
ls
wr hi hello_world
rd hi
hello_world
ls
  0     11 hi
wr bye good_night_moon
ls
  0     11 hi
  1     15 bye
rd bye
good_night_moon
wr hi _the_sun_is_up
rd hi
hello_world_the_sun_is_up
ls
  0     25 hi
  1     15 bye
rm hi
ls 
  1     15 bye
rd hi
Command rd returned error 2.
qu
%
Creative Commons License
This work is licensed under a Creative Commons Attribution-ShareAlike 3.0 Unported License.

author: burton rosenberg
created: 5 nov 2023
update: 7 nov 2025