Mini-FAT Project

by: burt rosenberg
at: university of miami
date: nov 2015
NAME
    minifat
    
SYNOPSIS	
    minifat -s -d _directory_ [-cv] _virtual-disk_
    
DESCRIPTION
    Uses FUSE to attach the file system implemented in the minifat code to the
    named directory, with the virtual data file _virtual-disk_.

OPTIONS
   The options -s and -d are passed to FUSE, and are manditory.

    -d The directory to attach.
    -s Single threaded
    -c create the file _virtual-disk_, or zero if already exists.
    -v verbose output
        
        
HISTORY
    FAT/FUSE was introduced in CSC521 2006-2007; MiniFat was introduced 2012-2014; 
    further simplified in this project.
    
BUGS


US Patent US4905110 A

Goals

You are to write a simple file system which will use a regular file as the "virtual disk". The structure of the file system is based on FAT, simplified, and called mini-FAT. In order that you can work as a file system driver but not in the kernel, Linux has a technology called FUSE — File system in User space. Built into the kernel, FUSE will redirect file system calls to a program you write and run as a background job as a regular user program. Your program will take those redirected calls and act upon the data in the "virtual disk" file to carry out the request.

The project therefore requires that you also learn about FUSE. There is some documentation, and a sample "Hello World" filesystem, that does nothing at all except pretend to have a single file whose contents is the string "Hello World". In fact, there is no file system at all, just a bunch of functions that are connected to file system calls by the FUSE mechanism, and these functions return the string "Hello World" when the file helloworld.txt is read, and pretty much ignore any other file system operation.

The Mini-FAT filesystem is a simplification of the FAT file system. The FAT filesystem, although simple and robust, has a lot of complications, which don't add to the conceptual understanding of filesystems, although are very instructive to the true nature of software technology in the context of the real world. That is to say, it is surprisingly messy, and the half of it is untold in the official documentation.

Mini FAT description

Please read the official documentation.

A large number of simplifications are made to the standard FAT file system.

File systems use block devices, that have their physically determined block size. For instance, disks have traditionally offered 512 byte sectors. A file system generally has its own block size, although it tends to be an even multiple of the device's block size. In FAT, the block is called a cluster, and can be adjusted. Also the number of clusters in the filesystem is stated at file system creation, as all the internal data structures are sized according to the number of clusters in the file system.

✶ In minifat, there are 1024 bytes in a cluster, and 1024 clusters in the file system.

All files systems have three general classes of data: the superblock, the meta-data and the content. The superblock is used by the operating system to identify the file system and the elements within the file system, and set options. The meta-data comprises all of the data structures that implement the file system organization. The content are the clusters that carry the data, they are were the data visible to the user is stored.

✶ In minfat:

  1. The super-block occupies cluster 0 and has no purposed. It can contain all zeros.
  2. The meta-data region, which consists only of the FAT table. The FAT table is an array of 32 bit integers, indexed by cluster numbers, 0 through 1023. The table occupies clusters 1 through 4.
  3. The content region, which contains files and directories; which occupies clusters 5 through 1023.
  4. Cluster 5 always contains the first cluster of the root directory.
  5. The FAT table creates a linked list of clusters in a file or directory; except that the value 0 means the cluster is free, and the value of -1 marks the end of a linked list of clusters.

A directory is a special sort of file which points to other files and directories. The minifat directory format is a simplication of the FAT directory format. Each directory is a sequence of directory entires. Each directory entry is either in use or empty. The directory can be a chain of clusters, and any cluster in the chain can contain only empty directories, for instance, after a sequence of deletions.

At first, do not worry about trimming from the chain of clusters a cluster that contains only empty directories.

Once your code is working, unconcerned about trimming directories, you should implement a function, trim_directory, which trims such clusters from the directory in a lazy fashion, typical of many filesystem implementations.

The laziness is given by the following correctness condition:

✶ A directory is either a single cluster, or the last cluster in the directory contains at least one non-empty directory entry. This means that you do not have to "compact down" directories to fit into the smallest number of clusters. Just trim what you can off the end of the cluster chain.

The format of the directory entry is given in the following table.

MiniFat Directory Entry
name offset  size description
dir_name0118+3 filename. Empty entry if first byte is 0x0. (see notes)
dir_attr1110x0 for a regular file, 0x10 for a directory file.
dir_empty1212unused
dir_firstcluster244cluster number of first cluster in file
dir_filesize284size of file in bytes, if a regular file.

Notes on directory entries:

✶ Filesize is ignored for directories. The length of a directory is the total length of the clusters in the cluster chain.

✶ Like the true FAT file system, filenames are 8+3 style and blank padded on the right. The file system is case-insensitive, but filenames are stored all upper case.

✶ An empty directory entry is indicated by a 0 (0x0) as the first character of the file name. (The FAT option of using a different mark to indicate an empty directory entry but other non-empty directory entries follow, is not supported.)

✶ A filename can only contain the characters "A" through "Z", "0" through "9", and "~". The "~" character is reserved to the file system and may substitute for characters other than in the permitted set.

✶ Files have no owner, permissions, or dates. Minifat may report the files are owned by root, have all permissions, and dates of unix time zero.

✶ Fields dir_firstcluster and dir_sizesize are little endian.

FUSE and the Hello World filesystem

FUSE is a kernel technology that passes file requests to a listener program. The program attaches to a directory, and file requests that descend into that directory at passed on the the program, for the program to respond. It allows us to hook our minifat file system into the operating system.

To get started with FUSE, copy class/proj7 to your proj7 folder, and get working the Makefile target fuse-test. The Makefile has an install target that you might have to run, to install the FUSE development package.

The documentation on FUSE is scarce. Here are some resources:

FUSE and Files

User-program <-----> Kernel <-----> FUSE <-----> minifat.c <-----> fatdisk.dmg

            filesystem                 FUSE callbacks     filesystem

User programs that are requesting files in your file system will be redirected in the kernel to FUSE, and FUSE will send those requests to your program, minifat, which will be running continuously. The requests are made through callbacks to functions loaded into FUSE through the presentation of a function vector — an array of functions.

Those functions service those requests by making file requests to the virtual disk, using normal file system calls. The virtual disk will be minifat format. To debug your program, it is helpful to be able to read the virtual disk. You can use hexdump:

    hexdump -C fatdisk.dmg
See man hexdump.

Grading, partial credit

The project is long, and should be approached in stages, to make sure an amount of credit is awarded that is proportional to the work accomplished. However, up to whatever stage of completion, the project must be correct. (This is both a grading requirement, and a lesson in incremental software development, that suggests breaking a project down into individually accomplishable and testable milestones.)

For 1 point:

Implement a version of minifat that does not support subdirectories, and only supports files of content size 0. It should be possible to create and delete files, and to list them with ls.

For 3 points:

Implement a version of minifat that does not support subdirectories. In addition to above, reading, writing and truncating is supported.

For full credit:

Implement subdirectories and directory trimming.

Creative Commons License
This work is licensed under a Creative Commons Attribution-ShareAlike 3.0 Unported License.

author: burton rosenberg
created: 8 nov 2015
update: 8 nov 2015