Single Instance Store Groveler

When the SIS Groveler starts, it searches the root of each NTFS volume in the system to see if it contains the SIS directory SIS Common Store and a file called MaxIndex within that directory. If it finds these items, and the SIS filter driver is installed on the system, the Groveler knows to search for and consolidate duplicate files on the volume.
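The startup check described above can be illustrated with a short sketch. This is not the Groveler's actual code: the function name `volume_has_sis_store` and the constants are hypothetical, and the sketch covers only the directory and file test. The separate check that the SIS filter driver is installed is left out.

```python
from pathlib import Path

# Names taken from the text; the constants themselves are illustrative.
SIS_DIR_NAME = "SIS Common Store"
MAX_INDEX_NAME = "MaxIndex"


def volume_has_sis_store(volume_root):
    """Return True if the volume root holds the SIS directory and its MaxIndex file.

    Hypothetical helper: it does not verify that the SIS filter driver is installed.
    """
    store = Path(volume_root) / SIS_DIR_NAME
    return store.is_dir() and (store / MAX_INDEX_NAME).is_file()
```

For example, `volume_has_sis_store("D:\\")` would report whether drive D has the layout that the Groveler looks for.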

The SIS Groveler does most of its work when the system is not busy. It uses the same technology as the Indexing Service (a service that indexes your volumes for quick search capabilities) to avoid consuming CPU time when the system cannot afford it. If the free disk space on the volume drops below a specified value, the Groveler increases its CPU usage regardless of system activity to help prevent the volume from running out of space.

A side effect of the Groveler's intelligent CPU use is that the service does not run at full speed during the first several hours after installation, even if the system is idle. This is because the service attempts to determine how much CPU and input/output (I/O) bandwidth it can use without causing problems for other system components. If you want the Groveler to run at maximum capacity, do the following:

To make SIS Groveler run at maximum capacity

  1. Expand grovctrl.ex_ from the Windows 2000 Server operating system CD. This file is located in the \i386 directory on the CD.

  2. Run grovctrl f to force the Groveler into foreground mode for all drives.

After the Groveler completes its work in foreground mode, it resumes normal operation, in which it uses CPU cycles intelligently.

When scanning a volume, the Groveler marks files that are 32 KB or larger in size and identical to one or more files on the volume. It then checks the file in more detail to verify that the content is identical. After the file is verified, it is copied into the \SIS Common Store folder, renamed with a unique GUID, and given the .sis file name extension. The identical files on the volume are then changed to reparse points. When an application tries to open the original file, the file system redirects any file input or output to the <guid>.sis file in \SIS Common Store. Figure 24.7 shows an example of two files that were combined with a reparse pointing to the SIS common store.


Figure 24.7 A File with a Reparse Pointing to the Common Store

The server has two operating system images, A and B. Both contain the Driver.sys file. The files in both directories are identical, so the data has been placed in the SIS common store and the original files are changed to reparse points with referrals to the <guid>.sis file.
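The scan-and-verify steps can be sketched as follows. This is a minimal illustration under stated assumptions, not the Groveler's real algorithm: the text does not say how candidate files are first matched, so the sketch groups files by size as a cheap first pass and then compares contents byte for byte. The function name `find_duplicate_groups` is hypothetical, and the sketch only reports duplicates; it does not create reparse points or move data.

```python
import filecmp
import os
from collections import defaultdict

MIN_SIS_FILE_SIZE = 32 * 1024  # 32 KB threshold stated in the text


def find_duplicate_groups(root):
    """Return lists of paths under root that are >= 32 KB and byte-for-byte identical."""
    # First pass (an assumption of this sketch): bucket candidate files by size.
    by_size = defaultdict(list)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path) or not os.path.isfile(path):
                continue
            size = os.path.getsize(path)
            if size >= MIN_SIS_FILE_SIZE:
                by_size[size].append(path)

    # Second pass: verify in detail that the contents really are identical.
    groups = []
    for paths in by_size.values():
        remaining = sorted(paths)
        while len(remaining) > 1:
            first, rest = remaining[0], remaining[1:]
            same = [first] + [p for p in rest if filecmp.cmp(first, p, shallow=False)]
            if len(same) > 1:
                groups.append(same)
            remaining = [p for p in rest if p not in same]
    return groups
```

In the example from Figure 24.7, the two copies of Driver.sys in images A and B would come back as one group, provided the file is at least 32 KB.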

Note

Even though the files are combined and space is being saved on the disk, for disk quota purposes the users who owned the original files are still charged as if the file had not been combined.

When a file that has been consolidated by SIS is modified or its contents replaced, such as when you copy over the file or modify it in some way, the reparse point is removed and replaced with a copy of the <guid>.sis file. The changes are then applied to the fresh copy of the original file. The results of such an operation are shown in Figure 24.8.


Figure 24.8 A File with No Reparse Point After Modification

The other reparse point or points for the original file are not changed, even if only one reparse point remains pointing to the <guid>.sis file. After the final instance of the original file is modified or deleted, the <guid>.sis file in the SIS Common Store folder is deleted.
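The life cycle of a consolidated file can be modeled with a small in-memory sketch. Everything here is hypothetical and for illustration only: the class `CommonStoreModel` and its methods are not part of SIS, dictionaries stand in for the file system, and a write replaces the whole file content rather than applying changes to a fresh copy.

```python
import uuid


class CommonStoreModel:
    """Toy in-memory model of the link and copy-on-modify behavior described above."""

    def __init__(self):
        self.store = {}  # "<guid>.sis" name -> shared bytes
        self.files = {}  # path -> ("link", sis_name) or ("data", bytes)

    def consolidate(self, paths, content):
        """Place one copy in the store and turn each path into a link to it."""
        sis_name = f"{uuid.uuid4()}.sis"
        self.store[sis_name] = content
        for path in paths:
            self.files[path] = ("link", sis_name)
        return sis_name

    def read(self, path):
        """Follow a link to the shared copy, or return the file's own data."""
        kind, value = self.files[path]
        return self.store[value] if kind == "link" else value

    def write(self, path, new_content):
        """Modify one file: it stops being a link; other links are untouched."""
        old = self.files.get(path)
        self.files[path] = ("data", new_content)
        self._release(old)

    def delete(self, path):
        """Remove one file and release its link, if it had one."""
        self._release(self.files.pop(path))

    def _release(self, old):
        """Delete the store entry once no remaining file links to it."""
        if old is None or old[0] != "link":
            return
        sis_name = old[1]
        if not any(entry == ("link", sis_name) for entry in self.files.values()):
            del self.store[sis_name]
```

Consolidating A\Driver.sys and B\Driver.sys and then writing to the copy in A leaves B linked to the shared entry. The entry is removed only after B is also modified or deleted.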