Each directory consists of a number of fixed-size entries. Each entry is 32 bytes long. The number of sectors in the directory is fixed for the root directory on FAT12 and FAT16 partitions, and sectors are consecutive on disk. For non-root directories, as well as for the root directory on the FAT32 partitions, the number of sectors is not fixed, and the directory is stored according to the normal cluster chain.
There are two different types of entries: one for long filenames and one for aliases (the short 8.3 names). An alias entry has the following format:
Offset in the Entry | Length in Bytes | Description |
00 | 8 | Filename |
08 | 3 | Extension |
0B | 1 | Attribute |
0C | 1 | Case |
0D | 1 | Creation time, fine part (units of 10 ms, 0..199) |
0E | 2 | Creation time |
10 | 2 | Creation date |
12 | 2 | Last access date |
14 | 2 | High word of starting cluster for FAT32 |
16 | 2 | Time stamp |
18 | 2 | Date stamp |
1A | 2 | Starting cluster |
1C | 4 | Size of the file |
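As an illustration of the table above, here is a minimal sketch that unpacks one raw 32-byte alias entry into a C structure. The names `struct dir_entry`, `parse_dir_entry`, `rd16`, and `rd32` are made up for this example. Multi-byte fields are read as little-endian, which is how FAT stores them on disk.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical in-memory form of one 32-byte alias entry. */
struct dir_entry {
    char     name[8];     /* 00: filename, blank-padded, not NUL-terminated */
    char     ext[3];      /* 08: extension, blank-padded */
    uint8_t  attr;        /* 0B: attribute flags */
    uint8_t  name_case;   /* 0C: case */
    uint8_t  ctime_fine;  /* 0D: fine part of creation time */
    uint16_t ctime;       /* 0E: creation time */
    uint16_t cdate;       /* 10: creation date */
    uint16_t adate;       /* 12: last access date */
    uint16_t cluster_hi;  /* 14: high word of starting cluster (FAT32) */
    uint16_t mtime;       /* 16: time stamp */
    uint16_t mdate;       /* 18: date stamp */
    uint16_t cluster_lo;  /* 1A: starting cluster */
    uint32_t size;        /* 1C: size of the file */
};

/* Read little-endian values byte by byte, so host byte order does not matter. */
static uint16_t rd16(const uint8_t *p) {
    return (uint16_t)(p[0] | (p[1] << 8));
}

static uint32_t rd32(const uint8_t *p) {
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

void parse_dir_entry(const uint8_t raw[32], struct dir_entry *e) {
    memcpy(e->name, raw + 0x00, 8);
    memcpy(e->ext,  raw + 0x08, 3);
    e->attr       = raw[0x0B];
    e->name_case  = raw[0x0C];
    e->ctime_fine = raw[0x0D];
    e->ctime      = rd16(raw + 0x0E);
    e->cdate      = rd16(raw + 0x10);
    e->adate      = rd16(raw + 0x12);
    e->cluster_hi = rd16(raw + 0x14);
    e->mtime      = rd16(raw + 0x16);
    e->mdate      = rd16(raw + 0x18);
    e->cluster_lo = rd16(raw + 0x1A);
    e->size       = rd32(raw + 0x1C);
}
```

Reading field by field, rather than casting the raw buffer to a packed struct, avoids any dependence on compiler padding or alignment.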
Filename and extension are left-justified and blank-padded. Note that the filename cannot consist solely of spaces, but the extension can. Watch for illegal characters in filenames. My humble suggestion is to replace any illegal characters with underscores.
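Here is a rough sketch of turning the blank-padded fields into a printable "NAME.EXT" string, with illegal characters replaced by underscores as suggested above. The function names are hypothetical. The set of illegal characters used here (control codes below 20h and the punctuation `" * + , . / : ; < = > ? [ \ ] |`) is an assumption on my part and is not taken from this text; adjust it to your needs. Any special code in the first byte of the name is assumed to have been dealt with by the caller.

```c
#include <stdint.h>
#include <string.h>

/* Assumed set of illegal short-name characters; not defined by this text. */
static int is_illegal(uint8_t c) {
    return c < 0x20 || strchr("\"*+,./:;<=>?[\\]|", c) != NULL;
}

/*
 * raw is the 11-byte filename+extension field.
 * out must hold at least 13 bytes: 8 + '.' + 3 + NUL.
 */
void format_short_name(const uint8_t raw[11], char out[13]) {
    int base_len = 8, ext_len = 3, n = 0;

    /* Strip the blank padding on the right of each field. */
    while (base_len > 0 && raw[base_len - 1] == ' ') base_len--;
    while (ext_len > 0 && raw[8 + ext_len - 1] == ' ') ext_len--;

    for (int i = 0; i < base_len; i++)
        out[n++] = is_illegal(raw[i]) ? '_' : (char)raw[i];

    /* The dot is not stored on disk; add it only if there is an extension. */
    if (ext_len > 0) {
        out[n++] = '.';
        for (int i = 0; i < ext_len; i++)
            out[n++] = is_illegal(raw[8 + i]) ? '_' : (char)raw[8 + i];
    }
    out[n] = '\0';
}
```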
Some characters in the filename have a special meaning. If the first character has the code 05, then the actual first character is E5, and it is not treated as a special marker. If the first character has the code E5, then the file was deleted. You may save some time when going through the directory structure by checking the first character in the filename. If it is zero, there are no more entries in the current directory.
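The first-byte rules above can be sketched as two small helpers. The enum and function names are invented for this example.

```c
#include <stdint.h>

enum slot_state {
    SLOT_END,      /* 00: no more entries in this directory */
    SLOT_DELETED,  /* E5: the file was deleted */
    SLOT_IN_USE    /* anything else, including 05 */
};

/* Classify a directory entry by the first byte of its filename field. */
enum slot_state classify_first_byte(uint8_t first) {
    if (first == 0x00) return SLOT_END;
    if (first == 0xE5) return SLOT_DELETED;
    return SLOT_IN_USE;
}

/* For an in-use entry, 05 stands in for a real first character of E5. */
uint8_t real_first_char(uint8_t first) {
    return first == 0x05 ? 0xE5 : first;
}
```

A directory scan would typically stop at `SLOT_END`, skip `SLOT_DELETED`, and call `real_first_char` only on entries that are in use.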
Two entries have a special meaning. They are present only in subdirectories, not in the root directory. The entry whose name consists of exactly one dot points to the directory that contains it. Its starting cluster is the first cluster of that same directory. You are best advised to ignore this value, because you already know where the directory starts if you are reading it. The entry whose name is made up of exactly two dots points to the next higher-level directory in the hierarchy. Its starting cluster is the first cluster of that directory, or zero when the parent is the root directory. These entries should be the first and the second one in the directory, respectively. Their attributes should be 10h (Directory). They are created at the time the subdirectory is created. There are no corresponding long names for them.
Attribute is a collection of bit flags:
Value | Meaning |
01 | Read Only |
02 | Hidden |
04 | System |
08 | Volume Label |
10 | Directory |
20 | Archive |
40 | Unused |
80 | Unused |
Read Only, Hidden, and System are pretty self-explanatory. I will only note that neither Hidden nor System files should be moved during defragmentation or any other disk service. If you remember, I recommended certain actions when file corruption is detected. You are best advised not to try any corrections on Hidden or System files. Also, Hidden files should not be returned by search commands unless the search explicitly asks for them.
Volume label attribute means that this entry contains the disk label in the filename and extension fields. Volume label is valid only in the root directory. Common sense says, there should be only one volume label per disk. For the entry to really contain the volume label, the attribute should be exactly 08. If Attribute is equal to 0Fh (Read Only, Hidden, System, Volume Label) then this entry does not contain the alias, but it is used as part of the long filename or long directory name.
Directory bit is set if the entry is a subdirectory. In this case the starting cluster contains the beginning cluster for the subdirectory, and the file size field is ignored (set to zero). Directories can also be Read Only, Hidden, System, or Archive. Directory bit is not set for the long directory name entries.
Archive bit is somewhat symbolic. It should be set if the file has not been archived by the backup utility. Never in my life have I seen this bit used.
Two values are unused, which means that entries with either of these bits set should be considered invalid. Another invalid combination is when both the Directory and Volume Label bits are set. Unless you are a disk analyzing tool, the best technique is to ignore entries with an invalid attribute.
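The attribute rules above can be put together into one classifier, sketched here. All names are hypothetical. One case is not covered by the text: the Volume Label bit combined with other bits in a pattern other than 0Fh. This sketch treats it as invalid, which is an assumption.

```c
#include <stdint.h>

#define ATTR_READ_ONLY    0x01
#define ATTR_HIDDEN       0x02
#define ATTR_SYSTEM       0x04
#define ATTR_VOLUME_LABEL 0x08
#define ATTR_DIRECTORY    0x10
#define ATTR_ARCHIVE      0x20
#define ATTR_UNUSED_MASK  0xC0
#define ATTR_LONG_NAME    0x0F  /* Read Only | Hidden | System | Volume Label */

enum entry_kind {
    KIND_LONG_NAME,     /* slot of a long filename, not an alias */
    KIND_VOLUME_LABEL,
    KIND_DIRECTORY,
    KIND_FILE,
    KIND_INVALID        /* best ignored unless you are a disk analysis tool */
};

enum entry_kind classify_attr(uint8_t attr) {
    if (attr == ATTR_LONG_NAME)
        return KIND_LONG_NAME;
    if (attr & ATTR_UNUSED_MASK)
        return KIND_INVALID;                 /* an unused bit is set */
    if (attr == ATTR_VOLUME_LABEL)
        return KIND_VOLUME_LABEL;            /* must be exactly 08 */
    if (attr & ATTR_VOLUME_LABEL)
        return KIND_INVALID;                 /* assumption: covers 18h and other mixes */
    if (attr & ATTR_DIRECTORY)
        return KIND_DIRECTORY;
    return KIND_FILE;
}
```

The test for 0Fh comes first on purpose: a long filename slot has the Volume Label bit set and would otherwise be rejected by the later checks.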
Case is zero if the filename and extension need to be converted to upper case. This field is used only by Windows NT.
Time stamp and creation time have the following format:
Bits | Range | Translated Range | Valid Range | Description |
0..4 | 0..31 | 0..62 | 0..58 | Seconds/2 |
5..10 | 0..63 | 0..63 | 0..59 | Minutes |
11..15 | 0..31 | 0..31 | 0..23 | Hours |
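A minimal sketch of decoding the 16-bit time field according to the table above. `struct fat_time` and `decode_fat_time` are names invented for this example.

```c
#include <stdint.h>

struct fat_time {
    int hours, minutes, seconds;
};

/*
 * Unpack a 16-bit time stamp. The fields are always filled in;
 * the return value is 1 if they are within the valid range, 0 otherwise.
 */
int decode_fat_time(uint16_t raw, struct fat_time *out) {
    out->seconds = (raw & 0x1F) * 2;      /* bits 0..4, stored as seconds/2 */
    out->minutes = (raw >> 5) & 0x3F;     /* bits 5..10 */
    out->hours   = (raw >> 11) & 0x1F;    /* bits 11..15 */
    return out->seconds < 60 && out->minutes < 60 && out->hours < 24;
}
```

Because seconds are stored divided by two, the decoded value is always even.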
Date stamp, last access date, and creation date have the following format:
Bits | Range | Translated Range | Valid Range | Description |
0..4 | 0..31 | 0..31 | 1..28 up to 1..31 | Day, blame Julian for complexity |
5..8 | 0..15 | 0..15 | 1..12 | Month |
9..15 | 0..127 | 1980..2107 | 1980..2107 | Year, add 1980 to convert |
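The date field can be unpacked in the same spirit. This is a sketch with invented names (`struct fat_date`, `decode_fat_date`, `days_in_month`). The day check uses the ordinary Gregorian leap-year rule, which is my reading of the "1..28 up to 1..31" range in the table.

```c
#include <stdint.h>

struct fat_date {
    int year, month, day;
};

/* Number of days in a month, using the Gregorian leap-year rule. */
static int days_in_month(int year, int month) {
    static const int days[12] = {31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31};
    int leap = (year % 4 == 0 && year % 100 != 0) || year % 400 == 0;
    return (month == 2 && leap) ? 29 : days[month - 1];
}

/*
 * Unpack a 16-bit date stamp. The fields are always filled in;
 * the return value is 1 if they form a valid calendar date, 0 otherwise.
 */
int decode_fat_date(uint16_t raw, struct fat_date *out) {
    out->day   = raw & 0x1F;                   /* bits 0..4 */
    out->month = (raw >> 5) & 0x0F;            /* bits 5..8 */
    out->year  = 1980 + ((raw >> 9) & 0x7F);   /* bits 9..15, add 1980 */
    if (out->month < 1 || out->month > 12)
        return 0;
    return out->day >= 1 && out->day <= days_in_month(out->year, out->month);
}
```

A field of all zeros decodes to month 0 and day 0, so it is reported as invalid rather than as a date in 1980.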
Generally, the creation time and date say when the file was created. The last access date says when the file was last read or written; note that there is only an access date, with no access time. The time and date stamps are set to the time that applications want you to think is the time of the last modification.
Starting cluster is the beginning cluster for the file or directory cluster chain. For FAT32, this value consists of two 16-bit words (the high word at offset 14h and the low word at offset 1Ah), and the high four bits of the high word should be masked out. I have never seen any documentation regarding this, but a couple of hours of playing with FAT32 convinced me that this is the case.
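Combining the two words might look like the sketch below. The function name and the `is_fat32` flag are hypothetical. How the caller decides whether the volume is FAT32 is outside the scope of this example.

```c
#include <stdint.h>

/*
 * hi is the word at offset 14h, lo is the word at offset 1Ah.
 * On FAT12 and FAT16 only the low word is used.
 */
uint32_t start_cluster(uint16_t hi, uint16_t lo, int is_fat32) {
    if (!is_fat32)
        return lo;
    /* Mask out the high four bits of the high word: 28 bits remain. */
    return (((uint32_t)hi << 16) | lo) & 0x0FFFFFFFu;
}
```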
Size of the file specifies the real file size in bytes. This value might be in conflict with the file size calculated by going through the cluster chain. Whenever they are in conflict, the smaller value takes over.
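The "smaller value takes over" rule is a one-liner. In this sketch the size implied by the cluster chain is taken to be the number of clusters in the chain multiplied by the cluster size in bytes; that interpretation, like the function name, is an assumption.

```c
#include <stdint.h>

/* Pick the smaller of the directory size and the size implied by the chain. */
uint32_t effective_size(uint32_t dir_size, uint32_t chain_clusters,
                        uint32_t bytes_per_cluster) {
    /* 64-bit product so that a long chain cannot overflow. */
    uint64_t chain_bytes = (uint64_t)chain_clusters * bytes_per_cluster;
    return chain_bytes < dir_size ? (uint32_t)chain_bytes : dir_size;
}
```

With this rule, a directory entry claiming 5000 bytes over a single 4096-byte cluster yields 4096 bytes, while an entry claiming 100 bytes over the same cluster yields 100.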
A long filename entry (a slot) has the following format:
Offset in Entry | Length in Bytes | Description |
00 | 1 | Sequence number for the slot |
01 | 10d | First five characters in filename |
0B | 1 | Attribute |
0C | 1 | Reserved, always zero |
0D | 1 | Alias checksum |
0E | 12d | Next six characters in filename |
1A | 2 | Starting cluster |
1C | 4 | Last two characters in filename |
The starting cluster number is always zero, and the attribute is always 0Fh.
Slots are always positioned right before the alias in the directory. The slot closest to the alias contains the first thirteen characters of the long filename. The slot above it contains the next thirteen characters, and so on, up to 255 characters. Additionally, the sequence number of each slot contains its number in the slot chain, starting from one. The sequence number of the last slot in the chain is or'ed with 40h to indicate the end of the chain.
Slot Number | Sequence Number | Characters |
3 | 43h | me.text |
2 | 02 | y long filena |
1 | 01 | This is a ver |
Alias | Alias | THISIS~1.TEX |
If the length of the filename is not a multiple of thirteen, the name is null-terminated. Otherwise, it is not null-terminated. If any characters are left in the slot after the null terminator, they are filled with FFFF.
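Putting the slot layout and the termination rule together, a long name can be reassembled roughly as follows. This is a sketch: the function names are invented, the characters are treated as 16-bit little-endian values (two bytes per character, as the field lengths imply), and the chain is assumed to have been validated already.

```c
#include <stddef.h>
#include <stdint.h>

/* Copy the 13 characters of one 32-byte slot into out[0..12]. */
static void slot_chars(const uint8_t *slot, uint16_t out[13]) {
    /* Byte offsets: five characters at 01, six at 0E, two at 1C. */
    static const uint8_t offsets[13] = {
        0x01, 0x03, 0x05, 0x07, 0x09,
        0x0E, 0x10, 0x12, 0x14, 0x16, 0x18,
        0x1C, 0x1E
    };
    for (int i = 0; i < 13; i++)
        out[i] = (uint16_t)(slot[offsets[i]] | (slot[offsets[i] + 1] << 8));
}

/*
 * dir points at the first slot as stored on disk, that is, the slot with
 * the highest sequence number; nslots slots follow one another and the
 * alias comes right after them. Writes at most cap-1 characters plus a
 * terminating zero into name and returns the number of characters.
 */
size_t assemble_long_name(const uint8_t *dir, size_t nslots,
                          uint16_t *name, size_t cap) {
    size_t len = 0;
    int done = 0;
    for (size_t k = 1; k <= nslots && !done; k++) {
        uint16_t chunk[13];
        /* Sequence number k sits k entries above the alias. */
        slot_chars(dir + 32 * (nslots - k), chunk);
        for (int i = 0; i < 13 && !done; i++) {
            if (chunk[i] == 0x0000 || len + 1 >= cap)
                done = 1;          /* terminator reached or buffer full */
            else
                name[len++] = chunk[i];
        }
    }
    if (cap > 0)
        name[len] = 0;
    return len;
}
```

The FFFF padding never needs special handling, because it can only appear after the null terminator.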
Checksum contains the checksum for the corresponding alias. It is calculated in the following way:
    /* name[] is the 11-byte filename and extension field of the alias */
    unsigned char sum, i;
    for (sum = i = 0; i < 11; i++)
        sum = (((sum & 1) << 7) | ((sum & 0xFE) >> 1)) + name[i];
In plainer language, the sum is rotated right by one bit, with the low bit wrapping around to the top, and the next character is added at each iteration. Note that the checksum is case-sensitive.
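One use of the checksum is to confirm that a run of slots really belongs to the alias that follows it. The sketch below checks the sequence numbers, the attribute, and the checksum of every slot. The function names are invented, and the exact set of checks is my choice rather than something prescribed here.

```c
#include <stddef.h>
#include <stdint.h>

/* Rotate-right-and-add checksum over the 11 name bytes of the alias. */
uint8_t alias_checksum(const uint8_t name[11]) {
    uint8_t sum = 0;
    for (int i = 0; i < 11; i++)
        sum = (uint8_t)((((sum & 1) << 7) | (sum >> 1)) + name[i]);
    return sum;
}

/*
 * dir points at the first slot as stored on disk (highest sequence
 * number); the alias entry follows the nslots slots.
 * Returns 1 if every slot is consistent with the alias, 0 otherwise.
 */
int lfn_chain_matches(const uint8_t *dir, size_t nslots) {
    const uint8_t *alias = dir + 32 * nslots;
    uint8_t expected = alias_checksum(alias);

    for (size_t s = 0; s < nslots; s++) {
        const uint8_t *slot = dir + 32 * s;
        uint8_t seq = (uint8_t)(nslots - s);   /* counts down to 1 */
        if (s == 0)
            seq |= 0x40;                       /* end-of-chain slot comes first on disk */
        if (slot[0x00] != seq)      return 0;  /* sequence number */
        if (slot[0x0B] != 0x0F)     return 0;  /* attribute */
        if (slot[0x0D] != expected) return 0;  /* alias checksum */
    }
    return 1;
}
```

If the check fails, a cautious reader would fall back to the alias and ignore the slots.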
When the file is deleted, all entries for the long filename start with E5.
What can go wrong with long filenames? Give some space to your imagination...
Because the directory entry is otherwise untouched, recovery is possible. However, recovery is not guaranteed. Furthermore, it is not guaranteed that the recovered file will have the same contents as the original file. To recover the deleted file: