Articles and Information
by Members for Members

Why does an Operating System need a File System and what does it do?

While you are working on a document, an audio file, or a photo, the File System is dormant (unless you have instructed the application you are working in to automatically make a backup copy). All your activity is going on in RAM (Random Access Memory), which is volatile, meaning your work will disappear when you shut down the computer. That may be just what you want – you were only goofing. More likely, you would like to save those hours of creativity and retrieve the work at a future date. In the case of “hard copy,” the document would be in a notebook or on sheets of paper that you would put in a folder in your file cabinet. That's just what a File System does – it organizes and manages your work so you can get at it later without having to reenter all that data.

The File System can save your work as – you guessed it – a file on non-volatile media. Files can be of many types: document, music, movie, graphic, etc. The Operating System and the File System work together so that when you open your work file, if it is a registered filetype, the Operating System knows the appropriate application to bring up. The File System is your PC's Post Office – but a very superior one. If the USPS tears your letter up, you have little chance of recovery. If your mail box overflows and letters blow away, tough. An efficient File System records extra information that helps make file recovery possible.

The non-volatile media mentioned can be of many types: the hard drive, a floppy, a CD (Compact Disc), a ZIP drive, a flash memory card, etc. For this article, we will concentrate on the hard drive. Okay, but what's a Hard Drive? It is a recording device composed of platters and read/write heads; think of a miniature jukebox. The platters are stacked on a spindle and spin at a high speed. This high speed spinning creates a laminar air flow over the platters that supports the read/write heads just above the surface. Never move or jostle your PC while the HD is reading or writing; a head could touch the platter and cause permanent damage, known as a head crash.

The musical jukebox handles one platter at a time and reads a continuous spiral of data via a needle tracking a groove in the vinyl record. If our HDs behaved the same way, we would soon fill them and take an inordinately long time to write or read our files. As HDs grew to capacities unimagined when they were first introduced, a different scheme of organization was required. For a sense of scale, IBM's RAMAC 305 (Random Access Method of Accounting and Control) of 1956 stored 5 MB on fifty 24-inch disks, cost about $50,000, and was as large as two refrigerators.

The stacked platters are formatted into cylinders. A cylinder is the collection of circular tracks, one on each side of each platter, that sit at the same radial distance, like tubes within tubes. Cylinders are numbered consecutively from the edge to the hub. The tracks are further divided into sectors – think of slices of a many-layered cake. Sectors are combined to form a cluster. A cluster is the smallest unit of storage the File System will hand out, a quantum of storage. Each cluster can hold data from only one file, but a large file can occupy more than one cluster. Enter the File System: it fills the clusters, records their addresses, and adds other noteworthy information to the directory, such as the date and time of creation and modification, the file's attributes, and (most importantly) where the various fragments of a file that did not fit in a single cluster are located.
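To make that bookkeeping concrete, here is a minimal Python sketch of a toy file system. It is an illustration only, not how FAT or NTFS is actually laid out on disk: the names `write_file`, `read_file`, and `CLUSTER_SIZE` are made up for this example, and the cluster is shrunk to 4 characters so the result is easy to follow.

```python
# Toy model (hypothetical, for illustration): the directory records a file's
# first cluster, and a table maps each cluster to the next cluster of the
# same file. None marks the end of the chain.
CLUSTER_SIZE = 4  # characters per cluster, deliberately tiny


def write_file(disk, table, directory, name, data):
    """Store data in free clusters and record where each piece went."""
    if not data:
        raise ValueError("this toy model does not store empty files")
    chunks = [data[i:i + CLUSTER_SIZE] for i in range(0, len(data), CLUSTER_SIZE)]
    free = [i for i, content in enumerate(disk) if content is None]
    if len(chunks) > len(free):
        raise OSError("disk full")
    used = free[:len(chunks)]
    for pos, (cluster, chunk) in enumerate(zip(used, chunks)):
        disk[cluster] = chunk
        # Link this cluster to the next one in the file, if there is one.
        table[cluster] = used[pos + 1] if pos + 1 < len(used) else None
    directory[name] = {"first_cluster": used[0], "size": len(data)}


def read_file(disk, table, directory, name):
    """Follow the chain of clusters and reassemble the file."""
    cluster = directory[name]["first_cluster"]
    parts = []
    while cluster is not None:
        parts.append(disk[cluster])
        cluster = table[cluster]
    return "".join(parts)


disk = [None] * 8   # eight empty clusters
table = {}          # cluster number -> next cluster number
directory = {}      # file name -> first cluster and size

write_file(disk, table, directory, "note.txt", "hello world")
print(read_file(disk, table, directory, "note.txt"))
```

The 11-character note does not fit in one 4-character cluster, so it is spread over three, and the table is what lets `read_file` put the pieces back in order.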

There are three Windows file systems, developed as Hard Drive capacities grew: FAT/FAT16 (File Allocation Table, 16 bits), which can handle drives up to 4 GB; FAT32, which can handle drives larger than 32 GB; and NTFS (New Technology File System). The parameter that matters most to the user is the cluster size. The table below shows the required cluster size for a hard drive's (or partition's) capacity. In the specification for FAT16, only 16 bits are allowed for the cluster address. With 2^16 = 65,536 available addresses (clusters), the size of the cluster is thus determined by the drive capacity. As we move to FAT32, which uses 28 of its 32 bits for the cluster address, we get about 268 million addresses, so the cluster size can be reduced for the larger drives. NTFS is the most efficient in reducing slack space: the unused portion of a cluster.
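The 16-bit arithmetic can be sketched in a few lines of Python. This is a simplified illustration: the function name `min_cluster_size` is invented here, cluster sizes are assumed to double from a 512-byte sector, and every one of the 65,536 addresses is treated as usable. Real FAT16 reserves a few address values, which is why published limits sit just under the round numbers (127 MB rather than 128 MB, for example).

```python
def min_cluster_size(partition_bytes, address_bits=16, sector_bytes=512):
    """Smallest power-of-two cluster size (in bytes) that lets
    2**address_bits clusters cover the whole partition.

    Simplifying assumption: every address is usable.
    """
    max_clusters = 2 ** address_bits
    size = sector_bytes
    while size * max_clusters < partition_bytes:
        size *= 2
    return size


MB = 2 ** 20
GB = 2 ** 30

for label, capacity in [("127 MB", 127 * MB), ("1 GB", 1 * GB), ("2 GB", 2 * GB)]:
    print(label, "needs clusters of", min_cluster_size(capacity) // 1024, "KB")
```

With only 65,536 addresses to go around, doubling the drive size doubles the cluster size, which is the pattern in the FAT/FAT16 figures.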

FAT/FAT16                        FAT32                            NTFS
Partition Size   Cluster Size    Partition Size   Cluster Size    Partition Size   Cluster Size
127 MB           2 KB            8 GB             4 KB            <= 0.5 GB        512 B
255 MB           4 KB            16 GB            8 KB            1 GB             1 KB
511 MB           8 KB            32 GB            16 KB           2 GB             2 KB
1 GB             16 KB           >32 GB           32 KB           >2 GB            4 KB
2 GB             32 KB
4 GB             64 KB

Consider cookies: most are only 1 KB in size, yet on a 20 GB hard drive (which is considered small in today's latest computers) a cookie would waste 15 KB of storage in the FAT32 File System, with its 16 KB clusters, but only 3 KB in NTFS, with its 4 KB clusters. When you consider it does not take a lot of Web surfing to pick up tens of cookies, you can see how much of a storage hog that can be with the older file system. It is also a good reason to dump those cookies periodically.
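The cookie arithmetic is easy to check. The short Python sketch below uses an invented helper, `slack_bytes`, and the cluster sizes from the example (16 KB for FAT32 and 4 KB for NTFS on a 20 GB drive); the count of 50 cookies is an assumption chosen only for illustration.

```python
KB = 1024


def slack_bytes(file_bytes, cluster_bytes):
    """Bytes left unused in the last cluster a file occupies."""
    clusters = -(-file_bytes // cluster_bytes)  # integer division, rounded up
    return clusters * cluster_bytes - file_bytes


cookie = 1 * KB
fat32_slack = slack_bytes(cookie, 16 * KB)
ntfs_slack = slack_bytes(cookie, 4 * KB)
print("FAT32 wastes", fat32_slack // KB, "KB per cookie")
print("NTFS wastes", ntfs_slack // KB, "KB per cookie")

# Assumed number of cookies, for illustration only.
cookies = 50
print("50 cookies waste", cookies * fat32_slack // KB, "KB on FAT32 and",
      cookies * ntfs_slack // KB, "KB on NTFS")
```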

The NTFS File System

Key elements of NTFS are designed to make it more secure in an enterprise. Home users benefit from better reliability and flexibility in a home network. Granting access to individual files allows family members to share and yet keep unauthorized eyes off sensitive files. NTFS stores file attributes within a Master File Table (MFT). The first entries in the MFT are metadata files; these include a file that marks bad clusters on the hard drive, a Cluster Allocation Bitmap to apportion space as needed, and an MFT Mirror file. The mirror file is a backup copy of the first 16 MFT files. The MFT is on the first 12% of the drive and the 16 system files are at the beginning of that space. The MFT Mirror is in the center of the other 88%. Since Windows 2000, and so in Win XP as well, NTFS also includes a Quota Table. It gives you control over how much space each user can occupy on a drive. Handy when a family member tries to make a giant jukebox out of the family PC.

Fragmentation

When you start out with a fresh new drive, whether it is a brand new device or a recently created partition, there is (typically) plenty of room for many, many files. A run-of-the-mill Word document will average maybe 50 KB, not much storage space when you consider Gigabytes (1 GB is roughly 1,000,000 KB). Now consider the cluster size (let's use the NTFS system) of 4 KB. Fifty divided by four = 12.5 – thus we need 13 clusters to store the document. Changing your mind about the whole thing, you delete the file and proceed to write a new document. Your “deleted document space” is available but will not be used if there are brand new clusters available. Deleting says the tenant is no longer in residence but does not move the furnishings out. Eventually you would fill the new clusters and the file system would look for those “empty” clusters. With our example of a 20 GB drive and considering only Word documents and the like, it would take a considerable amount of time to get to that point. However, consider how we use our computers lately. Digital photos, digital movies, and digital music – we are no longer talking KB sized files. It is common for these types to be MB and, in the case of movies, even GB in size.
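Here is a small Python sketch of that story. It follows this article's simplified description (fresh clusters are used first, freed ones only later) and is not a model of how NTFS really chooses clusters; the class `ToyVolume` and its methods are invented for the example.

```python
import math

CLUSTER_KB = 4  # the NTFS cluster size used in the example above


def clusters_needed(file_kb, cluster_kb=CLUSTER_KB):
    """A file always occupies a whole number of clusters, rounded up."""
    return math.ceil(file_kb / cluster_kb)


class ToyVolume:
    """Hypothetical volume that prefers never-used clusters over freed ones."""

    def __init__(self, total_clusters):
        self.contents = [None] * total_clusters  # what is physically stored
        self.in_use = [False] * total_clusters   # what the file system says
        self.next_fresh = 0                      # first never-used cluster

    def save(self, name, file_kb):
        n = clusters_needed(file_kb)
        fresh = list(range(self.next_fresh, len(self.contents)))
        freed = [i for i in range(self.next_fresh) if not self.in_use[i]]
        chosen = (fresh + freed)[:n]
        if len(chosen) < n:
            raise OSError("volume full")
        for i in chosen:
            self.contents[i] = name
            self.in_use[i] = True
        self.next_fresh = max([self.next_fresh] + [i + 1 for i in chosen])
        return chosen

    def delete(self, name):
        # Only the "occupied" flag changes; the old data stays in place.
        for i, owner in enumerate(self.contents):
            if owner == name:
                self.in_use[i] = False


volume = ToyVolume(total_clusters=30)
first = volume.save("draft.doc", 50)    # 50 KB needs 13 clusters
volume.delete("draft.doc")
second = volume.save("new.doc", 50)     # lands in the next 13 fresh clusters
print(len(first), "clusters used; cluster 0 still holds", volume.contents[0])
```

After the delete, cluster 0 is marked free but still contains the old draft: the tenant has left, the furnishings have not.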

We can solve limited storage by “off loading” files – burn them to CDs and DVDs and delete them from the drives. Back to the analogy of an apartment building – we then have newly available space, but not like when the building was first available for occupancy. Some apartments are still occupied. A new tenant with a family of ten will not fit in a two bedroom flat; they will have to rent more than one apartment within the building, each with its own apartment number (its address). It is not likely that the apartments will be adjacent (contiguous, in terms of space on a drive). The family will have to fragment itself into multiple domiciles.
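The apartment analogy can be played out in Python. This is a toy sketch under simple assumptions: the volume is a list in which each slot holds the name of the file occupying that cluster (or None if it is vacant), and the helper names `allocate` and `count_fragments` are invented here.

```python
def allocate(volume, name, n_clusters):
    """Take the first n vacant clusters, wherever they happen to be."""
    vacant = [i for i, owner in enumerate(volume) if owner is None]
    if len(vacant) < n_clusters:
        raise OSError("not enough free clusters")
    chosen = vacant[:n_clusters]
    for i in chosen:
        volume[i] = name
    return chosen


def count_fragments(clusters):
    """Count runs of consecutive cluster numbers."""
    if not clusters:
        return 0
    runs = 1
    for prev, cur in zip(clusters, clusters[1:]):
        if cur != prev + 1:
            runs += 1
    return runs


# A partly occupied "building": files A, B, and C are still in residence.
building = ["A", "A", None, "B", None, None, "C", None]
family = allocate(building, "family", 4)
print("family lives in clusters", family)
print("split into", count_fragments(family), "fragments")
```

The four-cluster "family" ends up in clusters 2, 4, 5, and 7: three separate pieces, even though the building had enough total space.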

Can you say defragmenter?

Defragging is the reading and re-writing of data on the drive to sort files into contiguous space. You must have enough room on the drive to temporarily assemble pieces of files in a holding place until they can be written contiguously. The file system supplies the address data and is then updated with the re-arranged locations of the files. The oft-asked question is: when and how often should I defrag? Today's high speed CPUs, fast rotating drives, and extended RAM at reasonable prices all serve to alleviate the problems of fragmentation. It is highly unlikely that you would notice a difference in performance before and after a defrag, with some notable exceptions: audio, video, and photo editing. Unless you have an unusually large amount of RAM – a Gigabyte or more – there is a good chance that the editing programs are going to make use of swap files (Win XP calls it a paging file) to store vast amounts of data, including multiple Undos, info on layers, etc. If you have not done so recently, it might be advisable to run a defrag program prior to a session of large file editing.
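In miniature, the idea looks like the Python sketch below. It is a deliberate simplification: a real defragmenter shuffles data in place on the drive and tracks the order of each file's pieces, while this toy function (`defragment`, a name invented for the example) just builds a new layout in which each file's clusters sit side by side and the free space is gathered at the end.

```python
def defragment(volume):
    """Return a new layout with every file's clusters contiguous.

    Each slot of `volume` holds the owning file's name, or None if free.
    Files keep the order in which they first appear on the volume.
    """
    order = []
    for owner in volume:
        if owner is not None and owner not in order:
            order.append(owner)
    packed = []
    for name in order:
        packed.extend([name] * volume.count(name))
    free = len(volume) - len(packed)
    return packed + [None] * free


before = ["A", "B", "A", None, "B", "A", None, "C"]
after = defragment(before)
print(after)
```

Building a second list here plays the role of the temporary holding place mentioned above.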

The Future File System

In case you think mastering the NTFS file system will be the last word – here's a wake-up call. Waiting in the wings to be introduced with Longhorn, the next generation of Windows OS, is WinFS. WinFS is set to put an entirely new look on Windows file management, with a user interface similar to Google. It is being built atop NTFS with additional functions. At Comdex in November 2003, a concept called Implicit Query (IQ) was introduced. As you compose a document, IQ would search your files for key phrases as you type them in. In the examples shown, it would search your email for From: Joe when you enter “Joe” in the To: line of a new message, read your Calendar for any appointments with Joe, and connect to the Web if you enter words that could be part of a URL – like an organization, geographic entity, or retail company.

The data model for WinFS is more complex, using types with properties, fields, and relationships. A person type could contain fields for name and address. The relationships are to work like those in a relational database.

As we have seen in the past, when it comes to PCs – there is nothing more constant than change.
