Fat Skeleton Project

by: burt rosenberg
at: university of miami
date: nov 2022

US Patent US4905110 A

Goals

File systems are an example of data persistence. Computer operating systems make the file system one of their core provisions to the user. Modern operating systems allow for a variety of file systems to be in use at the same time.

A very popular file system is FAT. It was Microsoft's original file system, even before windows, and continues to this day in many devices. All file systems must keep track of disk space associated with a file. Disk space is allocated in units of blocks or clusters, so that filesystem must keep for each file an ordered sequence of clusters. The file system also needs to be able to identify a file by name, or by the file name in a hierarchy of directory names.

The FAT file system did these tasks using two important data structures,

  1. The FAT Table, standing for File Allocation Table.
  2. The DIRENT, the directory entry structure.

We are implementing a very simplified FAT that is just the skeletal data structure of FAT, hence the name. A skeleton is not fat.

The FAT Table
The FAT table kept track of the clusters associated with a file as an array representation of the sequence of clusters as a linked list. If a file needed three clusters, say 17, 9 and 11 in that order, then location 17 of the fat will contain a 9, location 9 will contain an 11, and location 11 will contain a special value marking the end of the sequence. The location 17 is the head of this cluster chain, and would be entered into the DIRENT associated with the file. If an entry in the FAT table is 0, that means the cluster was unused.

See CLR 2nd ed. section 10.3 Implementing pointers and objects for more about an array implementation of a linked list.

The Directory and DIRENTS
DIRENTS are structures that contain the name, starting cluster, and exact byte length of the file. The collection of clusters has space to contain at least the requested byte length, but may contain more. The unused bytes in the last cluster are called slack space. In FAT a directory is a sequence of DIRENTS that themselves form a file. We shall implement an even simpler filesystem where the DIRENT sequence is a fixed array of struct's, and we will not have directories inside directories.

We will write code for DirEnt's in a DirEnt table, and cluster numbers in a FAT table, but we will not program anything to actually read and write data to the data clusters that the FAT table is linking together.

The MAN Page

NAME
    fat-skeleton
    
SYNOPSIS
    fat-skeleton  [-v]
    
DESCRIPTION
    An interactive program to simulate the operation of the File Allocation
    Table of a very much simplified FAT file system.
    
OPTIONS
    -v verbose output. Can be used multiple times.
        
HISTORY
    Created for the 231 edition of CSC421 (2022-2023 academic year).
    
BUGS

Project Details

To received credit for the project, some parts of the your code must be carried out in detailed concert with these requirements.

The directory structure will be an array called root_dir of structures called struct DirEnt. The structure has three fields,

An initialized FAT table

    +---+
0   | 0 |
    +---+
1   | 0 |
    +---+
2   | 0 |
    +---+
3   | 0 |
    +---+
4   | 0 |
    +---+
5   | 0 |
    +---+
6   | 0 |
    +---+
7   | 0 |
    +---+

After creating 3 files requiring 2 blocks 
(length between 129 and 256 bytes)

    +---+
0   | 1 |
    +---+
1   |-1 |
    +---+
2   | 3 |
    +---+
3   |-1 |
    +---+
4   | 5 |
    +---+
5   |-1 |
    +---+
6   | 0 |
    +---+
7   | 0 |
    +---+


After deleted the file starting at cluser 2

    +---+
0   | 1 |
    +---+
1   |-1 |
    +---+
2   | 0 |
    +---+
3   | 0 |
    +---+
4   | 5 |
    +---+
5   |-1 |
    +---+
6   | 0 |
    +---+
7   | 0 |
    +---+

After creating a file requiring 3 clusters 
(length between 257 and 384)

    +---+
0   | 1 |
    +---+
1   |-1 |
    +---+
2   | 3 |
    +---+
3   | 6 |
    +---+
4   | 5 |
    +---+
5   |-1 |
    +---+
6   |-1 |
    +---+
7   | 0 |
    +---+


The FAT Table is an array called fat_table. The i-th entry in the FAT refers to the i-th block (or cluster) on the disk. Values in the FAT create sequences of clusters, one sequence for each file. The technique is the standard method of representing a linked list in an array.

To create a link pointing from cluster i to cluster j,

	fat_table[i] = j
Two special values are 0, meaning the cluster is free, and -1, meaning the chain ends here. In standard FAT, 0 cannot viably mean cluster 0, because the lowest numbered clusters on a disk are reserved for the disk label. Hence no confusion results. We do no reserve 0, but make the rule that cluster 0, if on a chain, can only be the first cluster in the chain.

To maintain this convention and so that you match the Basic Test exactly, when creating cluster chains, you must take the clusters of lowest index first. FAT does not do this, as it would wear the disk unevenly (why?). But that's not a problem we have for our simulation.

The code uses a function call-out table to jump to the code implementing these operations. The functions have the signature,

     int (*action_function)(int actc,char * actv[])
and are kept in the call-out table paired with the string that triggers the function and the number of parameters the function should have. This parameters actc and actv mimic the parameters to a C main fucntion to pass the parameters to the action functions as an array of strings.
The action functions
  1. ls_action() function to implement that ls operation to list the root directory.
  2. cr_action() function to implement the cr operation to create a file.
  3. rm_action() to implement that rm operation to remove a file, given the file name.
  4. pc_action() to implement the pc operation to print the cluster chain. This is not a typical file operation in a file system.
  5. qu_action() to implement the qu operation quit the program.
Algorithms
The template suggest several subtasks to accomplish.

How a Skeleton is not FAT

A skeleton is not fat because it is just the bones. Here are other ways our file system is not FAT,
  1. Filenames in FAT are of a particular 8+3 format, from a restricted character set.
  2. A Dirent is free in FAT if the first character in the filename is an ASCII blank (0x20).
  3. The FAT Dirent also has some date stamps and a bit vector for file attributes.
  4. Although the original FAT had a fixed root directory with exactly 99 dirent slots, and no subdirectories, all subsequent FATs had variable size directories with subdirectories.
  5. Our directory array is in a separate memory area than where the data would be. FAT keeps the directories as data in the data area.
  6. FAT probably does not use the lowest index first placement of clusters.
  7. FAT actually stores data in the clusters!
Creative Commons License
This work is licensed under a Creative Commons Attribution-ShareAlike 3.0 Unported License.

author: burton rosenberg
created: 13 nov 2022
update: 14 nov 2022