Reduced FAT File System Project
by: burt rosenberg
at: university of miami
date: nov 2025
US Patent US4905110 A
Goals
File systems are an example of data persistence. Computer operating
systems make the file system one of their core provisions to the user.
Modern operating systems allow for a variety of file systems to be in use
at the same time.
A very popular file system is FAT. It was Microsoft's original file system,
even before Windows, and continues popular to this day, used in many devices.
This file
system is a reduced version of the FAT file system, retaining the
key concepts. Some simplifications undo accidents of history that complicate FAT.
(E.g., the use of 0xE5 as a directory marking, which turned out to conflict with the Shift-JIS encoding).
Other simplifications are in the interest of a more enjoyable project.
According to ChatGPT, the original FAT was developed by Bill Gates and Marc McDonald
even before MS-DOS existed.
The original FAT had 8 bit FAT entries, which were later expanded to 12 bit and 16 bit
FAT entries in FAT12 and FAT16. This allowed for larger disks and files sizes.
Later FAT16 allowed for long file name (Apple gave Microsoft C:\ONGRTLNS.W95
on
this
important
event.)
The official FAT32 specification is published,
if you are interested in all the details.
Table of Contents
-
Disks and Filesystems
-
The MAN Page
-
The Cluster Table
-
The FAT Table
-
The Root Directory and DirEnt
-
Actions to Implement
-
Development Plan
-
Implementation Notes
-
Appending to a file example
-
A FAT Example
-
Example run of fat-reduced
Disks and Filesystems
A disk is an indexed sequence of blocks. A block is a some number of bytes. Early disks
had 128 byte blocks. A modern high capacity drive has 4096 byte blocks.
Clusters are the software analog of blocks.
A disk driver, speaking directly to the disk, only knows how to move blocks of data
to and from the disk by the disk index. A filesystem uses this notion of an array of
clusters (blocks joined as clusters) to make a system that creates and maintains
files and folders.
Our reduced implementation of the FAT file system will focus on three elements,
- A data region as an array of clusters.
- The FAT table.
- The root directory.
In reality all of these will be laid into an array of clusters, but we will extract
out the root diretory as an array of strut DirEnt and the FAT table as an
array of integers. The data region remains fairly authentic as an array of clusters.
Each strut DirEnt contains this important information,
- The file name
- The file length
- The file type (file or directory)
- The index of the first of a chain of clusters where the data
for the file is stored.
The MAN Page
NAME
fat-reduced
SYNOPSIS
fat-reduced [-v]
DESCRIPTION
An interactive program to simulate the operation of the File Allocation
Table of a very much simplified FAT file system.
OPTIONS
-v verbose output. Can be used multiple times to increase verbosity.
COMMANDS
ls
to list the file names, data lengths, and directory index
qu
to quit. this is an administrative command
rd _filename_
to print the contents of file _filename_. Error if it does not exist.
rm _filename_
to remove the file _filename_. Error if it does not exist.
wr _filename_ _content_
to create or update the file _filename_ with content _content_. Although
the file can contain binary values, this command creates files with ASCII
content only.
dd _filename_
to display the cluster chain associated with the named file
Filenames are at most 15 characters. Values are 64 bytes.
HISTORY
Created for the 241 edition of CSC421 (2023-2024 academic year).
BUGS
The Cluster Table
In the program, the data section is the array cluster_table of entries
struct Cluster, where each cluster is of size CLUSTER_SIZE.
The cluster_table is a global variable declared in fat-reduced-util.c.
The FAT Table
The central element of the FAT filesystem is the File Allocation Table, a.k.a. the FAT.
The fat_table is an array of unsigned int. It is a global variable
declared in fat-reduced-util.c.
The data for a file is stored in a collection of clusters allocated for this purpose.
The clusters do not have to be in sequence on the disk. Instead there is a sequence
that can jump around. The sequence of clusters in encoded in the FAT as a linked list of
clusters.
The entries in the FAT are in exact correspondence with the clusters in the data region.
The i-th entry is associated with the i-th cluster in the data region.
- If fat_table[i]==0 then cluster_table[i] is not allocated to any file.
- If fat_table[i]!=0 then cluster_table[i]
is allocated to a file and the next cluster in the sequence of clusters allocated to this
file is cluster_table[fat_table[i]]
- If fat_table[i]==-1 then this cluster is allocated to a file and is the last
cluster in the sequence of clusters allocated to this file.
The Root Directory and DirEnt
In our implementation, we have an array of struct DirEnt named root_dir.
It is a global variable declared in fat-reduced-util.c
The directory entry structure is defined in fat-reduced.h as,
struct DirEnt {
char name[FILENAME_LEN] ; // first char \0 means entry empty; might not be null terminated
unsigned short flags ; // not needed for this project
unsigned int len ; // the actual length. the cluster might give more room.
unsigned int starting_cluster ; // the first cluster in the cluster chain
} ;
Note some important points. Use strncmp and strncpy in the field
name. This field is size FILENAME_LEN and filenames can be up to
FILENAME_LEN in length.
See the documentation of strncpy for the appropriate string functions.
If the name is shorter than FILENAME_LEN then it is null terminated.
If it is equal to FILENAME_LEN then it is not null terminated.
By convention, if the first character of name is \0, then the
DirEnt is unused. A typical implementation would not zero the entire struct
— it would just zero this one byte.
Actions to Implement
The file fat-reduced.c reads and parses the typed command. It then
jumps to the function that implements the command. These functions are what
you have to implement. They unfinished versions of them are found in fat-reduced-util.c.
This is a list of the functions you are to implement.
- qu_action() to implement the qu operation quit the program.
- Command takes no arguments.
- Output: The action has no output.
- Errors: ERR_ABORT: returned always.
- ls_action() function to implement that ls operation to list the root directory.
- Command takes no arguments.
- Output:
- A printf format is provided, so output matches reference.
- The listing gives the DirEnt number, the file length, and file name.
- The listing is in order of the DirEnt index.
- Empty Dirent's are not listed.
- Errors: None.
- rm_action() to implement that rm operation to remove a file.
- Command takes one argument: the filename.
- Remove the file by setting the filename to the empty string
- Freeing the clusters associated with the file by zeroing in the FAT table.
- Output: The action has no output.
- Errors:
- ERR_NOEXIST: if the file to remove does not exist.
- rd_action() to implement the rd operation to print the file contents.
- Command takes one argument: the filename.
- Output:
- The format must be precisely as specified.
- You may assume that the contents are printable ascii.
- Note: we read the entire file.
- Errors:
- ERR_NOEXIST: if the file to read does not exist.
- wr_action() function to implement the wr operation.
- Command takes two arguments: the filename and the file contents.
- If the file does not exist, create it with the given contents.
- If the file exists, append the given contents to the file.
- Clusters are allocated as needed, but never less than one.
- The content is assumed to be binary.
- The actual content size is recorded in the DirEnt.
- Output: The action has no output.
- Errors:
- ERR_DIRFULL: no more DirEnt's available.
- ERR_FATFULL: no more FAT table entries available.
- ERR_DISKFULL: no more clusters available.
- dd_action() to implement that dd operation to dump the files metadata.
- Command takes one argument: the filename.
- Output: Displays the cluster chain for the file named.
- Errors:
- ERR_NOEXIST: if the file to remove does not exist.
Development Plan
I would do this project in stages. Each stage finishes something you need
for the next stage, and can be complete for a partial
implementation
- Implement the root_dir. (Ignore entirely the cluster_table and the fat_table)
- Get ls working,
- Get wr, restricted to only changes in the root directory.
- Get rm, restricted to only changes in the root directory
- Implement reading and writing of files requiring no more than one cluster.
- Ignore writes that append to existing files.
- Write code for the fat_table to find free clusters, mark them in use.
- Add to wr to allocate and record the starting cluster, and copy the
data into the cluster, and record the length (possibly removing the null byte.)
- Add to rd to copy the data out of the starting cluster, using the
length recorded in the DirEnt, add the
null byte and return the data.
- Note: The data as stored in the clusters are not strings. In particular, in the
case the data fills the cluster completely, there will be no null byte. Always
go by the length field in the DirEnt.
- Implement file remove consistent with the current partial implementation.
- In addition to rm working with the root_dir, it will modify
the fat_table to mark the file's unique cluster as free.
- Note that this is a fine partially finished project.
- Don't stop here!
- But if you have to stop, this is a nice accomplishment.
- Check this code in. Just in case.
- Implement reading and writing of files requiring more than one cluster.
- Still, ignore writes that append to existing files.
- Write code for the fat_table to find free clusters, mark them in use,
and form them into a cluster chain.
- Add to wr to call a subroutine (previous step) to create a chain and
return the starting cluster; then copy the the data into the cluster chain.
- Add to rd to copy the data out of a chain of clusters, using the
length recorded in the DirEnt, add the null byte (as data in clusters
does not always have a null byte.)
- Add to the rm implementation to include the tear-down of cluster chains.
- Implement appending writes.
Implementation Notes
To receive full credit for the project, certain operations must be done in strict
conformance with this specification.
- If a filename is too long, silently truncate.
- Data length is as in DirEnt.len, not according to a null byte. Your code might be tested on this.
- When finding free DirEnt's or Clusters, find and use the one with smallest index. This is for purposes of automatic grading.
- Use this format to implement the ls command:
- Use this format to implement the rd command:
- Use these formats to implement the dd command:
(2025: The template has this command implemented.)
-
On an appending, if the slack in the final cluster is sufficient for the additional bytes to append,
do not add a cluster.
- A zero byte file allocates one cluster. Otherwise, files of size S should allocate K clusters,
where (K*Cluster_Size==S).
Appending to a file example
Original file with contents "abcdef"; stored in two 4 byte clusters with 2 bytes of slack
+---+---+---+---+ +---+---+---+---+
| a | b | c | d |--->| e | f | | |
+---+---+---+---+ +---+---+---+---+
Wrote "ghijkl" to the file; slack used and one more 4 byte cluster added
+---+---+---+---+ +---+---+---+---+ +---+---+---+---+
| a | b | c | d |--->| e | f | g | h |--->| i | j | k | l |
+---+---+---+---+ +---+---+---+---+ +---+---+---+---+
A FAT Example
2. After creating 3 files.
+---+
0 | 1 |
+---+
1 |-1 |
+---+
2 | 3 |
+---+
3 |-1 |
+---+
4 | 5 |
+---+
5 |-1 |
+---+
6 | 0 |
+---+
7 | 0 |
+---+
1. An initialized FAT table
+---+
0 | 0 |
+---+
1 | 0 |
+---+
2 | 0 |
+---+
3 | 0 |
+---+
4 | 0 |
+---+
5 | 0 |
+---+
6 | 0 |
+---+
7 | 0 |
+---+
-
We begin with a FAT table all zeros, since all blocks are free.
-
Three files are created, each requiring two blocks.
- The first chains clusters 0 and 1, 0->1.
- The second chains clusters 2 and 3, 2->3.
- The third chains clusters 4 and 5, 4->5.
The clusters 6 and 7 are still free.
4. After creating a fourth file.
+---+
0 | 1 |
+---+
1 |-1 |
+---+
2 | 3 |
+---+
3 | 6 |
+---+
4 | 5 |
+---+
5 |-1 |
+---+
6 |-1 |
+---+
7 | 0 |
+---+
3. After deleted the second file
+---+
0 | 1 |
+---+
1 |-1 |
+---+
2 | 0 |
+---+
3 | 0 |
+---+
4 | 5 |
+---+
5 |-1 |
+---+
6 | 0 |
+---+
7 | 0 |
+---+
-
We delete the file that occupied clusters 2 and 3. The FAT entries are zero.
-
Finally, a file requiring 3 clusters is created. It takes the first 3 free clusters
available and chains them into the cluster chain 2->3->6.
Example run of fat-reduced
% ./fat-reduced
ls
wr hi hello_world
rd hi
hello_world
ls
0 11 hi
wr bye good_night_moon
ls
0 11 hi
1 15 bye
rd bye
good_night_moon
wr hi _the_sun_is_up
rd hi
hello_world_the_sun_is_up
ls
0 25 hi
1 15 bye
rm hi
ls
1 15 bye
rd hi
Command rd returned error 2.
qu
%