FILESTREAM Storage in SQL Server 2008
SQL Server Technical Article
Writer: Paul S. Randal (SQLskills.com)
Technical Reviewers: Alexandru Chirica, Arkadi Brjazovski, Prem Mehra, Joanna Omel, Mike Ruthruff, Robin Dhamankar
Published: October 2008
Applies to: SQL Server 2008
Summary: This white paper describes the FILESTREAM feature of SQL Server 2008, which allows storage of and efficient access to BLOB data using a combination of SQL Server 2008 and the NTFS file system. It covers choices for BLOB storage, configuring Windows and SQL Server for using FILESTREAM data, considerations for combining FILESTREAM with other features, and implementation details such as partitioning and performance.
This white paper is targeted at architects, IT Pros, and DBAs tasked with evaluating or implementing FILESTREAM. It assumes the reader is familiar with Windows and SQL Server and has at least a rudimentary knowledge of database concepts such as transactions.
In today’s society, data is generated at incredible rates and often needs to be stored and accessed in a controlled and efficient way. There are various technologies to do this and the choice of technology often depends on the nature of the data being stored – structured, semistructured, or unstructured:
- Structured data is data that can be easily stored in a relational schema, such as that representing sales data for a company. This can be stored in a database with a table of information for products the company sells, another with information about customers, and another that details sales of products to customers. The data can be accessed and manipulated using a rich query language such as Transact-SQL.
- Semistructured data is data that conforms to a loose schema but does not lend itself well to being stored in a set of database tables, such as data where each data point could have radically different attributes. Semistructured data is often stored using the xml data type in the Microsoft® SQL Server® database software and accessed using an element-based query language such as XQuery.
- Unstructured data may have no schema at all (such as a piece of encrypted data) or may be a large amount of binary data (many MBs or even GBs) which may seem to have no schema but in reality has a very simple schema inherent to it, such as image files, streaming video, or sound clips. Binary data in this case means data that can have any value, not just those that can be entered at a keyboard. These data values are commonly known as Binary Large Objects, or more simply BLOBs.
This white paper describes the FILESTREAM feature of SQL Server 2008, which allows storage of and efficient access to BLOB data using a combination of SQL Server 2008 and the NTFS file system. It covers the FILESTREAM feature itself, choices for BLOB storage, configuring the Windows® operating system and SQL Server for using FILESTREAM data, considerations for combining FILESTREAM with other features, and implementation details such as partitioning and performance.
While structured and semistructured data can easily be stored in a relational database, the choice of where to store unstructured or BLOB data is more complicated. When deciding where to store BLOB data, consider the following requirements:
- Performance: The way the data is going to be used is a critical factor. If streaming access is needed, storing the data inside a SQL Server database may be slower than storing it externally in a location such as the NTFS file system. Using file system storage, the data is read from the file and passed to the client application (either directly or with additional buffering). When the BLOB is stored in a SQL Server database, the data must first be read into SQL Server’s memory (the buffer pool) and then passed back out through a client connection to the client application. Not only does this mean the data goes through an extra processing step, it also means that SQL Server’s memory is unnecessarily “polluted” with BLOB data, which can cause further performance problems for SQL Server operations.
- Security: Sensitive data that needs to have tightly-managed access can be stored in a database and security can be controlled using the usual SQL Server access controls. If the same data is stored in the file system, different methods of security such as access control lists (ACLs) need to be implemented.
- Data size: Based on the research cited later in this white paper, BLOBs smaller than 256 kilobytes (KB) (such as widget icons) are better stored inside a database, and BLOBs larger than 1 megabyte (MB) are best stored outside the database. For those sized between 256 KB and 1 MB, the more efficient storage solution depends on the read vs. write ratio of the data, and the rate of “overwrite”. Storing BLOB data solely within the database (e.g., using the varbinary(max) data type) is limited to 2 gigabytes (GB) per BLOB.
- Client access: The protocol that the client uses to access SQL Server data, such as ODBC, may not be suited to applications such as streaming large video files. This may necessitate storing the data in the file system.
- Transactional semantics: If the BLOB data has associated structured data that will be stored in the database, changes to the BLOB data will need to adhere to transactional semantics so the two sets of data remain synchronized. For example, if a transaction creates BLOB data and a row in a database table but then rolls back, the creation of the BLOB data should be rolled back as well as the creation of the table row. This can become very complex if the BLOB data is stored in the file system with no link to the database.
- Data fragmentation: Frequent updates and overwrites will cause the BLOBs to move, either within the SQL Server database files or within the file system, depending on where the data is stored. In this case, if the BLOBs are large, then they may become fragmented (i.e., not stored in one contiguous part of the disk). This fragmentation can be more easily addressed by using the file system than by using SQL Server.
- Manageability: A solution that uses multiple technologies that are not integrated will be more complex and costly to manage than an integrated solution.
- Cost: The cost of the storage solution varies depending on the technology used.
The explanations above about size and fragmentation are based on the well-known Microsoft Research paper titled To BLOB or Not to BLOB: Large Object Storage in a Database or a Filesystem? (Gray, Van Ingen, and Sears), which has more information on the trade-offs involved.
There are a variety of solutions for BLOB storage, each with pros and cons based on the requirements above. The following table compares three common options for storing BLOB data, including FILESTREAM, in SQL Server 2008.
|                                          | File server / file system | SQL Server (using varbinary(max)) | SQL Server FILESTREAM       |
| Maximum BLOB size                        | NTFS volume size          | 2 GB – 1 byte                     | NTFS volume size            |
| Streaming performance of large BLOBs     | Excellent                 | Poor                              | Excellent                   |
| Security                                 | Manual ACLs               | Integrated                        | Integrated + automatic ACLs |
| Cost per GB                              | Low                       | High                              | Low                         |
| Integration with structured data         | Difficult                 | Data-level consistency            | Data-level consistency      |
| Application development and deployment   | More complex              | More simple                       | More simple                 |
| Recovery from data fragmentation         | Excellent                 | Poor                              | Excellent                   |
| Performance of frequent small updates    | Excellent                 | Moderate                          | Poor                        |
Table 1: Comparison of BLOB Storage Technologies Before SQL Server 2008
FILESTREAM is the only solution that provides transactional consistency of structured and unstructured data as well as integrated management, security, low cost, and excellent streaming performance. This is accomplished by storing the structured data in the database files and the unstructured BLOB data in the file system, while maintaining transactional consistency between the two stores. More details of the FILESTREAM architecture are given in the “Overview of FILESTREAM” section later in this white paper.
FILESTREAM is a new feature in the SQL Server 2008 release. It allows structured data to be stored in the database and associated unstructured (i.e., BLOB) data to be stored directly in the NTFS file system. You can then access the BLOB data through the high-performance Win32® streaming APIs, rather than having to pay the performance penalty of accessing BLOB data through SQL Server.
FILESTREAM maintains transactional consistency between the structured and unstructured data at all times, even allowing point-in-time recovery of FILESTREAM data using log backups. Consistency is maintained automatically by SQL Server and does not require any custom logic in the application. The FILESTREAM mechanism does this by maintaining the equivalent of a database transaction log, which has many of the same management requirements (described in more detail in the “Configuring FILESTREAM Garbage Collection” section later in this white paper). The combination of the database’s transaction log and the FILESTREAM transaction log allows the FILESTREAM and structured data to be transactionally recovered correctly.
Instead of being a completely new data type, FILESTREAM is a storage attribute of the existing varbinary(max) data type. FILESTREAM preserves the majority of the existing behavior of the varbinary(max) data type; it changes only how the BLOB data is stored – in the file system rather than in the SQL Server data files. Because FILESTREAM is implemented as a varbinary(max) column and integrated directly into the database engine, most SQL Server management tools and functions work without modification for FILESTREAM data.
It should be noted that the behavior of the regular varbinary(max) data type remains completely unchanged in SQL Server 2008, including the 2-GB size limit. The addition of the FILESTREAM attribute makes a varbinary(max) column essentially unlimited in size (in practice, the size is limited only by that of the underlying NTFS volume).
FILESTREAM data is stored in the file system in a set of NTFS directories called data containers, which correspond to special filegroups in the database. Transactional access to the FILESTREAM data is controlled by SQL Server and a file system filter driver that is installed as part of enabling FILESTREAM at the Windows level. The use of a file system filter driver also allows remote access to the FILESTREAM data through a UNC path. SQL Server maintains a link of sorts from table rows to the FILESTREAM files associated with them. This means that deleting or renaming any FILESTREAM files directly through the file system will result in database corruption.
The use of FILESTREAM requires several schema modifications to data tables (most notably the requirement that each row must have a unique row ID), and it also has some restrictions when it is combined with other features (such as the inability to encrypt FILESTREAM data). These are all described in detail in the “Configuring SQL Server for FILESTREAM” section later in this white paper.
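As an illustration of these schema requirements, the following Transact-SQL sketch creates a table with a FILESTREAM column. The table and column names are hypothetical, and the database is assumed to already contain a FILESTREAM filegroup:

```sql
-- Minimal sketch of a FILESTREAM-enabled table. Assumes the database was
-- created with (or altered to add) a FILESTREAM filegroup.
CREATE TABLE dbo.Documents
(
    -- FILESTREAM requires a unique ROWGUIDCOL column in each table.
    DocId   UNIQUEIDENTIFIER ROWGUIDCOL NOT NULL UNIQUE DEFAULT NEWID(),
    DocName NVARCHAR(300)    NOT NULL,
    -- The FILESTREAM attribute stores this column's data in the file system.
    DocData VARBINARY(MAX)   FILESTREAM NULL
);
```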
The FILESTREAM data can be accessed and manipulated in two ways – either with the standard Transact-SQL programming model, or via the Win32 streaming APIs. Both mechanisms are fully transaction-aware and support most DML operations including insert, update, delete, and select. FILESTREAM data is also supported for maintenance operations such as backup, restore, and consistency checking. The major exception is that partial updates to FILESTREAM data are not supported. Any update to a FILESTREAM data value translates into creating a new copy of the FILESTREAM data file. The old file is asynchronously removed, as described in the “Configuring FILESTREAM Garbage Collection” section later in this white paper.
After data is stored in a FILESTREAM column, it can be accessed by using Transact-SQL transactions or by using Win32 APIs. This section gives some high-level details of the programming models and how to use them.
By using Transact-SQL, FILESTREAM data can be inserted, updated, and deleted as follows:
- FILESTREAM fields can be prepopulated using an insert operation (with an empty value or small non-null value). However, the Win32 interfaces are a more efficient way to stream a large amount of data.
- When FILESTREAM data is updated, the underlying BLOB data in the file system is modified. When a FILESTREAM field is set to NULL, the BLOB data associated with the field is deleted. Transact-SQL chunked updates implemented as UPDATE.Write() cannot be used to perform partial updates to FILESTREAM data.
- When a row that contains FILESTREAM data is deleted or a table that contains FILESTREAM data is deleted or truncated, the underlying BLOB data in the file system is also deleted. The actual physical removal of the FILESTREAM files is an asynchronous background process, as explained in the “Configuring FILESTREAM Garbage Collection” section later in this white paper.
For more information and examples of using Transact-SQL to access FILESTREAM data, see the SQL Server 2008 Books Online topic “Managing FILESTREAM Data by Using Transact-SQL” (http://msdn.microsoft.com/en-us/library/cc645962.aspx).
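For example, assuming a hypothetical table dbo.Documents with a varbinary(max) FILESTREAM column named DocData, the operations above might be sketched as follows:

```sql
-- Prepopulate a row with a zero-length value; this creates the
-- underlying FILESTREAM file so it can later be streamed to via Win32.
INSERT INTO dbo.Documents (DocName, DocData)
VALUES (N'readme.txt', CAST('' AS VARBINARY(MAX)));

-- An update replaces the entire value (a new file is created and the
-- old one is removed later by garbage collection).
UPDATE dbo.Documents
SET DocData = CAST('New contents' AS VARBINARY(MAX))
WHERE DocName = N'readme.txt';

-- Setting the column to NULL deletes the BLOB data.
UPDATE dbo.Documents
SET DocData = NULL
WHERE DocName = N'readme.txt';
```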
Win32 Streaming Access
To allow transactional file system access to the FILESTREAM data, a new intrinsic function, GET_FILESTREAM_TRANSACTION_CONTEXT(), provides the token that represents the current transaction that the session is associated with. The transaction must have been started and not yet committed or rolled back. By obtaining a token, the application binds the FILESTREAM file system streaming operations with a started transaction. The function returns NULL in the case where there is no explicitly started transaction. A token must be obtained before FILESTREAM files can be accessed.
In FILESTREAM, the database engine controls the BLOB physical file system namespace. A new intrinsic function, PathName, provides the logical UNC path of the BLOB that corresponds to each FILESTREAM field in the table. The application uses this logical path to obtain the Win32 handle and operate on the BLOB data by using regular Win32 file system interfaces. The function returns NULL if the value of the FILESTREAM column is NULL. This emphasizes the fact that a FILESTREAM file must be created (for example, by inserting a row with a non-NULL FILESTREAM value, as described previously) before it can be accessed at the Win32 level.
The Win32 streaming support works in the context of a SQL Server transaction. After the application obtains a transaction token and a pathname, the Win32 OpenSqlFilestream API is used to obtain a Win32 file handle. Alternatively, the managed SqlFileStream API can be used. This handle can then be used by the Win32 streaming interfaces, such as ReadFile() and WriteFile(), to access and update the file by way of the file system. Again, note that FILESTREAM files cannot be deleted or renamed directly through the file system; otherwise, the link-level consistency between the database and the file system will be lost (i.e., the database essentially becomes corrupt).
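Putting the pieces together, the Transact-SQL half of a streaming access might be sketched as follows (hypothetical table and column names; the Win32 or SqlFileStream calls happen in the client between the SELECT and the COMMIT):

```sql
-- An explicit transaction is required; without one,
-- GET_FILESTREAM_TRANSACTION_CONTEXT() returns NULL.
BEGIN TRANSACTION;

SELECT DocData.PathName() AS FilePath,
       GET_FILESTREAM_TRANSACTION_CONTEXT() AS TxContext
FROM dbo.Documents
WHERE DocName = N'readme.txt';

-- The client passes FilePath and TxContext to OpenSqlFilestream (or to
-- the managed SqlFileStream constructor), streams the data with
-- ReadFile()/WriteFile(), and closes the handle before committing.

COMMIT TRANSACTION;
```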
The FILESTREAM file system access models a Transact-SQL statement by using file open and close. The statement starts when a file handle is opened and ends when the handle is closed. For example, when a write handle is closed, any possible AFTER trigger that is registered on the table fires as it would if an UPDATE statement were completed.
For more information and examples of using Win32 APIs to access FILESTREAM data, see the SQL Server 2008 Books Online topic “Managing FILESTREAM Data by Using Win32” (http://msdn.microsoft.com/en-us/library/cc645940.aspx).
All file handles must be closed before the transaction commits or rolls back. If a handle is left open when a transaction commit occurs, the commit will fail and additional reads and writes against the handle will cause a failure, as expected. The transaction must then be rolled back. Similarly, if the database or instance of the database engine shuts down, all open handles are invalidated.
Whenever a FILESTREAM file is opened for a write operation, a new zero-length file is created and the entire updated FILESTREAM data value is written to it. The old file is removed asynchronously as described in the “Configuring FILESTREAM Garbage Collection” section later in this white paper.
With FILESTREAM, the database engine ensures transaction durability on commit for FILESTREAM BLOB data that is modified from the file system streaming access. This is done using the FILESTREAM log mentioned earlier and an explicit flush of the FILESTREAM file contents to disk.
The isolation semantics are governed by database engine transaction isolation levels. When FILESTREAM data is accessed through the Win32 APIs, only the read-committed isolation level is supported. Transact-SQL access also allows the repeatable-read and serializable isolation levels. Furthermore, using Transact-SQL access, dirty reads are permitted through the read-uncommitted isolation level, or the NOLOCK query hint, but such access will not show in-flight updates of FILESTREAM data.
The file system access open operations do not wait for any locks. Instead, the open operations fail immediately if they cannot access the data because of transaction isolation. The streaming API calls fail with ERROR_SHARING_VIOLATION if the open operation cannot continue because of isolation violation.
To allow for partial updates to be made, the application can issue a device FS control (FSCTL_SQL_FILESTREAM_FETCH_OLD_CONTENT) to fetch the old content into the file that the opened handle references. This can also be done using the managed SqlFileStream API using the ReadWrite flag. This will trigger a server-side old content copy, as discussed earlier. For better application performance and to avoid running into potential time-outs when you are working with very large files, you should use asynchronous I/O.
If the FSCTL is issued after the handle has been written to, the last write operation will persist, and prior writes that were made to the handle are lost.
For more information on partial updates, see the SQL Server 2008 Books Online topic “FSCTL_SQL_FILESTREAM_FETCH_OLD_CONTENT” (http://technet.microsoft.com/en-us/library/cc627407.aspx).
Write-Through from Remote Clients
Remote file system access to FILESTREAM data is enabled over the Server Message Block (SMB) protocol. If the client is remote, caching of write operations depends on the options specified and the API used. For instance, the default for native code APIs is to perform write-through, whereas for the managed APIs the default is to use buffering. This difference reflects customer feedback on the various APIs and their uses through the prerelease CTP versions of SQL Server 2008.
It is recommended that applications that are running on remote clients consolidate small write operations (through buffering) to make fewer write operations using a larger data size. Also, if buffering is being used, an explicit flush should be issued by the client before the transaction is committed.
Creating memory mapped views (memory mapped I/O) by using a FILESTREAM handle is not supported. If memory mapping is used for FILESTREAM data, the database engine cannot guarantee consistency and durability of the data or the integrity of the database.
Even though FILESTREAM technology has many attractive features, it may not be the optimal choice in all situations. As mentioned earlier, the size of the BLOB data and the access patterns are the most significant factors when deciding whether to store the BLOB data wholly within the database or by using FILESTREAM.
Size affects the following:
- Efficiency with which the BLOB data can be accessed using either storage mechanism. As mentioned earlier, streaming access of large BLOB data is more efficient using FILESTREAM, but partial updates are (potentially much) slower.
- Efficiency of backing up the combined structured and BLOB data using either storage mechanism. A backup that combines SQL Server database files and a large number of FILESTREAM files will be slower than a backup of just SQL Server database files of an equivalent total size. This is because of the extra overhead of backing up each NTFS file (one per FILESTREAM data value). This overhead becomes more noticeable when the FILESTREAM files are smaller (because the per-file overhead becomes a larger percentage of the total backup time per MB of data).
As an example, the following graph shows the relative throughput of local reads of various sizes of BLOB data using varbinary(max), FILESTREAM through Transact-SQL, and FILESTREAM through NTFS. It can be seen (on the blue line) that Win32 access of FILESTREAM data becomes several times faster than Transact-SQL access of varbinary(max) data as data size increases. Note that the throughput measurements are in megabits per second (Mbps).
Figure 1: Read performance of various BLOB sizes
The NTFS numbers include the time necessary to start a transaction, retrieve the pathname and transaction context from SQL Server, and open a Win32 handle to the FILESTREAM data. Each test was performed using the same computer with four processor cores and a warm SQL Server buffer pool.
The other factor to consider is whether the client or mid-tier can be written (or altered) to use the Win32 streaming APIs as well as regular access to SQL Server. If this is not the case, FILESTREAM will not be appropriate, because the best performance is obtained using the Win32 streaming APIs.
As with any other deployment, before you deploy an application that uses FILESTREAM, it is important to prepare the Windows server that will be hosting the SQL Server database and associated FILESTREAM data containers. This section explains how to configure storage hardware and the NTFS file system in preparation for using FILESTREAM. It then shows how to enable FILESTREAM at the Windows level.
One of the most common causes of a poorly performing workload is an unsuitable hardware configuration. Sometimes the cause is insufficient memory, leading to “thrashing” in SQL Server’s buffer pool, and sometimes the cause is simply that the storage hardware does not have the I/O throughput capability the workload demands. For applications that will use FILESTREAM for high-performance streaming of BLOB data using the Win32 APIs, the choice and configuration of the storage hardware is critical.
The following sections describe some best practices around storage choice and layout. For a deeper discussion on this, see the TechNet white paper “Physical Database Storage Design” (http://www.microsoft.com/technet/prodtechnol/sql/2005/physdbstor.mspx). After you design an optimal layout, it is prudent to perform load testing to validate the performance capacity of the I/O subsystem. This is discussed in detail in the TechNet SQL Server Best Practices Article “Predeployment I/O Best Practices” (http://www.microsoft.com/technet/prodtechnol/sql/bestpractice/pdpliobp.mspx).
Be sure to take into account the anticipated workload on a FILESTREAM data container when you are deciding where to place it, as well as the workloads on any co-located data containers or SQL Server files. Each FILESTREAM data container may need to be on its own volume, as having multiple data containers with heavy workloads on a single volume could lead to contention.
The point to take away here is that without thinking about the workloads involved, simply placing everything on one volume may lead to performance problems. The degree of separation required will vary from customer to customer.
It is also possible to create a table schema inside SQL Server that allows crude load balancing of FILESTREAM data between multiple volumes. This is described in the section “Load Balancing of FILESTREAM Data”.
The benefits of using RAID technology are well-known, and much has been written about choosing a RAID level appropriate to the application requirements, so this white paper will not attempt to repeat all that information. The aforementioned “Physical Database Storage Design” white paper has an excellent section on RAID levels and RAID level choice. What follows here is a simple overview of the factors to consider.
RAID levels differ in a variety of ways, most notably in terms of read/write performance, resilience to failure, and cost. For instance, RAID 5 is relatively low cost, it can handle the failure of only one drive in the RAID array, and it may be unsuitable for write-heavy workloads. On the other hand, RAID 10 provides excellent read and write performance, and it can handle multiple drive failures (depending on the degree of mirroring involved), but it is more expensive, given that at least 50 percent of the drives in the RAID array are redundant. These are the three main factors involved in choosing a RAID level. RAID level choice may be different for the volume on which each user database is stored, and it may even differ between the volume storing the data files and that storing the log files for a single database.
If the workload will involve high-performance streaming of FILESTREAM data, the immediate choice may be to have the FILESTREAM data container volume use the RAID level that gives the highest read performance. However, this may not provide a high degree of resilience against failures. On the other hand, the immediate choice may be to use the same RAID level as for the other volumes that store the data for the database, but this may not provide the requisite performance levels that the workload demands.
For this white paper, the point to stress is that a conscious RAID level choice should be made for the FILESTREAM data container volumes after evaluating the trade-offs involved, not just deciding based on a single factor.
In typical databases involving BLOB data, the total size of the BLOB data may be many times more than the total size of the structured data. When implementing a solution involving FILESTREAM data stored in separate volumes, you may want to use cheaper storage for the volume, such as IDE or SATA (hereafter simply called “SATA”), instead of more expensive SCSI storage. Before this choice is made, the trade-offs involved should be understood. This section gives an overview of the different characteristics of SCSI vs. IDE/SATA to enable an informed choice to be made based on performance and reliability as well as cost.
Capacity and Performance
SATA drives tend to have higher capacity than SCSI drives, but they have slower rotational speed (RPM). While there are some 10,000 RPM SATA drives, most are 5,400 or 7,200 RPM; high-performance SCSI drives are available at 10,000 and up to 15,000 RPM. Although RPM can be a useful comparison metric, the two figures that really matter for comparison are latency (how long until the disk head is at the right position over the disk surface) and average transfer rate (how much data can be transferred to/from the disk surface per second). It is also important that the drives are able to process complex I/O patterns efficiently. When choosing drives, ensure that SATA drives support Native Command Queuing (NCQ) and SCSI drives support tagged command queuing (TCQ), which allow them to process multiple and interleaved disk I/Os for better effective performance.
To summarize, SCSI drives usually have better latency and transfer rates and so will provide better streaming performance, but potentially at a higher cost.
SQL Server relies on guaranteed write-ordering and durability to provide reliability and recoverability through its write-ahead logging mechanism. For more information on these I/O requirements, see the TechNet white paper “SQL Server I/O Basics” (http://www.microsoft.com/technet/prodtechnol/sql/2005/iobasics.mspx).
For reliability, SCSI is usually better than SATA because SCSI uniformly supports forcing data to be written to disk, whereas SATA does not. This is done either by supporting write-through, where the data to be written is not cached at all, or by supporting forcibly flushing cache contents to disk. The lack of either can impact recoverability after a hardware, software, or power failure. Both types of interface can support hot-swapping to allow repairs while maintaining availability.
The FILESTREAM feature relies on two write-ordering and durability guarantees:
- Data durability at transaction commit time
- Write-ahead logging for FILESTREAM file creation and deletion
Data durability is achieved by the FILESTREAM file system driver issuing an explicit flush of any modified files prior to the transaction committing (the details of the mechanism are beyond the scope of this white paper). This guarantees that, in the event of a power failure, committed FILESTREAM data is not left unflushed, and therefore lost, on disks that lack a sufficient battery-backed cache. If the SATA drives do not support a forced-flush operation, then recoverability may be impacted and data may be lost.
Write-ahead logging relies on the consistency of NTFS metadata. This in itself relies on the reliability of the underlying drives. There is no problem with SCSI, but if the SATA drives do not support forced-flushing, then some NTFS metadata changes may be lost in a power failure situation. This could result in a number of scenarios:
- NTFS cannot recover and the volume cannot be mounted (i.e., the FILESTREAM data container is essentially offline).
- NTFS recovers but NTFS metadata changes are lost and SQL Server does not know to roll back an uncommitted transaction that performs an insert of FILESTREAM data (i.e., the FILESTREAM data is “leaked”).
- NTFS recovers but NTFS metadata changes are lost and SQL Server does not know to roll back an uncommitted transaction that performs a delete of FILESTREAM data (i.e., the FILESTREAM data is lost).
It should be noted that all three of these scenarios are no worse than if the BLOB data were stored outside the database on an NTFS volume with underlying SATA drives that did not support forcing data to disk. The use of FILESTREAM on a volume with underlying SATA drives in this case is actually better than storing the BLOB data in raw NTFS files on the same volume, as FILESTREAM’s link-level consistency provides a mechanism to detect when such “corruptions” have occurred (through running DBCC CHECKDB on the database).
To summarize, FILESTREAM data can be reliably stored on volumes with underlying SATA storage, as long as the SATA drives support forcing data to disk through cache flushing.
Even the most carefully designed I/O subsystem running on high-performance hardware may not work as desired if the file system (in this case NTFS) is not configured correctly. This section describes some of the configuration options that can affect a workload involving FILESTREAM data.
For a more complete overview of NTFS, see the TechNet Library articles “NTFS Technical Reference” (http://technet.microsoft.com/en-us/library/cc758691.aspx) and “Working with File Systems” (http://technet.microsoft.com/en-us/library/bb457112.aspx).
Optimizing NTFS Performance
By default, NTFS is not configured to handle a high-performance workload with tens of thousands of files in an individual file system directory (i.e., the FILESTREAM scenario). There are two NTFS options that need to be configured to facilitate FILESTREAM performance. It is especially important to set these options correctly prior to undertaking any performance benchmarking; otherwise, results will not be representative of true FILESTREAM performance.
The first configuration option is to disable the generation of 8.3 names when new files are created (or renamed). This process generates a secondary name for each file that is only for backward compatibility with 16-bit applications. The algorithm generates a new 8.3 name and then has to scan through all existing 8.3 file names in the directory to ensure the new name is unique. As the number of files in the directory grows large (generally above 300,000), this process takes longer and longer. The time to create a file in turn rises and performance decreases, so turning off this process can significantly increase performance. To turn off this process, type the following at a command prompt and then restart the computer:
fsutil behavior set disable8dot3 1
Note: This option disables the generation of 8.3 names on all NTFS volumes on the server. If any volumes are being used by 16-bit applications, they may experience issues after you change this behavior.
The second option to turn off is updating the last access time for a file when it is accessed. If the workload briefly accesses many files, then a disproportionate amount of time is spent simply updating the last access time for each file. Turning off this option can also significantly increase performance. To turn off this process, type the following at a command prompt and then restart the computer:
fsutil behavior set disablelastaccess 1
All Windows file systems have the concept of a “cluster”, which is the unit of allocation when disk space is allocated. As a cluster is the smallest amount of disk space that can be allocated, if a file is very small, some of the cluster may be unused (essentially wasted). The cluster size is therefore typically quite small so that small files do not waste disk space.
Large files may have many clusters allocated to them, or files may grow over time and have clusters allocated as they grow. If a file grows a lot, but in small chunks, then the allocated clusters are likely to not be contiguous on disk (i.e., they are “fragments”). This means that the smaller the clusters are, and the more a file grows, the more “fragmented” it will become.
Cluster size is therefore a trade-off between wasting disk space and reducing fragmentation. More details about the various cluster sizes on the Windows file systems can be found in the Knowledge Base article “Default Cluster Size for FAT and NTFS” (http://support.microsoft.com/kb/140365).
The recommendation for using FILESTREAM is that the individual units of BLOB data be 1 MB in size or higher. If this is the case, it is recommended that the NTFS cluster size for the FILESTREAM data container volume is set to 64 KB to reduce fragmentation. This must be done manually as the default for NTFS volumes up to 2 terabytes (TB) is 4 KB. This can be done using the /A option of the format command. For example, at a command prompt type:
format F: /FS:NTFS /V:MyFILESTREAMContainer /A:64K
This setting should be combined with large buffer sizes, as described in the “Performance Tuning and Benchmarking Considerations” section later in this white paper.
As described earlier, when many files on a volume grow, they become fragmented. This means that the collection of clusters allocated to the file is not contiguous. When the file is read sequentially, the underlying disk heads need to read all the clusters in sequence, which may mean they have to read different portions of the disk. Even if files do not grow once they have been created, if they were created on a volume where the available free space is not in a single contiguous chunk, they may be fragmented immediately, because the necessary clusters to accommodate them are not available contiguously.
This fragmentation causes the sequential read performance to be lower than when there is no (or little) fragmentation. The problem is very similar to that of index fragmentation within a database slowing down query range scan performance.
It is therefore essential that fragmentation be periodically removed using a disk defragmenting tool to maintain sequential read performance. Also, if the volume that will be used to host the FILESTREAM data container was previously used, or if it still contains other data, the fragmentation level should be checked and fixed if necessary.
Data stored in NTFS can be compressed to save disk space, but at the expense of extra CPU to compress and decompress the data when it is written or read, respectively. Compression is also not useful if the data is essentially uncompressible. For example, random data, encrypted data, or data that has already been compressed will not compress well, but it must still be passed through the NTFS compression algorithm and incur CPU overhead.
For these reasons, it only makes sense to enable compression when the data can be heavily compressed, and when the extra CPU required will not cause the workload performance to decrease. It should also be noted that compression can only be enabled when the NTFS cluster size is 4,096 bytes or less.
Compression can be enabled on the FILESTREAM data container volume when it is formatted, using the /C option of the format command. For example:
format F: /FS:NTFS /V:MyFILESTREAMContainer /A:4096 /C
An existing volume can also be enabled for compression using the following steps:
- In My Computer or Windows Explorer, right-click the volume to compress or uncompress.
- Click Properties to display the Properties dialog box.
- On the General tab, select or clear the Compress drive to save disk space check box, and then click OK.
- In the Confirm Attribute Changes dialog box, select whether to make the compression apply to the entire volume or only to the root folder.
This is shown in the following figure.
Figure 2: Compressing an existing volume using Windows Explorer
Although multiple FILESTREAM data containers can be placed on a single NTFS volume, there are reasons for having a 1:1 mapping between data containers and NTFS volumes. Apart from the potential for workload-dependent contention, there is no way to manage FILESTREAM data container space usage from within SQL Server, so NTFS disk quotas must be used if this capability is required. Disk quotas are tracked on a per-user, per-volume basis, so having multiple FILESTREAM data containers on a single volume makes it hard to tell which data container is using more disk space. Note that all FILESTREAM files are created under the SQL Server service account; if the service account is changed, disk space starts being charged to the new account.
There is one FILESTREAM file system filter driver, per version of SQL Server, for each NTFS volume that hosts a FILESTREAM data container. Each filter driver is responsible for managing all FILESTREAM data containers on that volume for all instances of that version of SQL Server.
For example, an NTFS volume that is hosting three FILESTREAM data containers, one for each of three SQL Server 2008 instances, will have only one SQL Server 2008 FILESTREAM file system filter driver.
There are two security requirements for using the FILESTREAM feature. Firstly, SQL Server must be configured for integrated security. Secondly, if remote access will be used, then the SMB port (445) must be enabled through any firewall systems. This is the same as required for regular remote share access. For more information, see the Knowledge Base article “Service Overview and Network Port Requirements for the Windows Server System” (http://support.microsoft.com/kb/832017).
Antivirus software is ubiquitous in today’s environments. FILESTREAM cannot block antivirus software from scanning the files in the FILESTREAM data container (doing so would create security problems). Such software usually has a policy setting that determines what to do with a file suspected of being infected with a virus: either delete the file or restrict access to it (known as putting the file in “quarantine”). In both cases, access to the BLOB data in the affected file is prevented, and to SQL Server the file appears to have been deleted.
It is recommended that the antivirus software be set to quarantine files, not delete them. DBCC CHECKDB can be used inside SQL Server to discover which files seem to be missing, and then the Windows administrator can correlate the file names against the antivirus software’s log and take corrective action.
FILESTREAM is a hybrid feature that requires both the Windows administrator and the SQL Server administrator to perform actions before the feature is enabled. This is necessary to preserve the separation of duties between the two administrators, especially if the SQL Server administrator is not also the Windows administrator. Enabling FILESTREAM at the Windows level installs a file system filter driver, which is something that only a Windows administrator has privileges to do.
At the Windows level, FILESTREAM is enabled either during the installation of SQL Server 2008 or by running SQL Server Configuration Manager. Here are the steps to follow:
- On the Start menu, point to All Programs, point to Microsoft SQL Server 2008, point to Configuration Tools, and then click SQL Server Configuration Manager.
- In the list of services, right-click SQL Server Services, and then click Open.
- In the SQL Server Configuration Manager snap-in, locate the instance of SQL Server on which you want to enable FILESTREAM.
- Right-click the instance and then click Properties.
- In the SQL Server Properties dialog box, click the FILESTREAM tab.
- Select the Enable FILESTREAM for Transact-SQL access check box.
- If you want to read and write FILESTREAM data from Windows, click Enable FILESTREAM for file I/O streaming access. Enter the name of the Windows share in the Windows Share Name box.
- If remote clients must access the FILESTREAM data that is stored on this share, select Allow remote clients to have streaming access to FILESTREAM data.
- Click Apply.
The following figure shows the FILESTREAM tab as described in the procedure.
Figure 3: Configuring FILESTREAM using SQL Server Configuration Manager
This procedure must be completed for each SQL Server instance that will use the FILESTREAM feature before it can be used by SQL Server. Note that there is no specification of the FILESTREAM data container at this stage – that is done when a FILESTREAM filegroup is created in a database after FILESTREAM has been enabled within SQL Server.
Note that it is possible to disable FILESTREAM access at the Windows level even while SQL Server has it enabled. In that case, after the SQL Server instance is restarted, all FILESTREAM data will be unavailable. The following warning will be displayed.
Figure 4: Warning displayed when disabling FILESTREAM using SQL Server Configuration Manager
Each SQL Server instance that will use the FILESTREAM feature must be separately configured, both at the Windows and the SQL Server level. After FILESTREAM has been enabled, a database must be configured to store FILESTREAM data, and only then can tables be defined that include FILESTREAM columns. This section describes how to configure FILESTREAM at the SQL Server level and how to create FILESTREAM-enabled databases and tables, and it explains how FILESTREAM interacts with other features in SQL Server 2008.
FILESTREAM requires that integrated security (i.e., Windows Authentication) be used. When an application using Win32 attempts to access FILESTREAM data, the Windows user is validated through SQL Server. If the user has Transact-SQL access to the FILESTREAM data, access will be granted at the Win32 level also, as long as the transaction token is obtained in the security context of the Windows user who is performing the file open.
The requirement for Windows Authentication comes from the nature of Windows File I/O APIs. The only way to pass the client’s identity from the client application to SQL Server during a File I/O operation is to use the Windows token associated with the client’s thread.
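For example (a sketch, assuming the DocumentStore table created later in this white paper, with a row whose DocumentID is 1), a client first obtains the symbolic path name of the BLOB and a transaction context token using Transact-SQL, and then passes both to the Win32 OpenSqlFilestream API:

```sql
BEGIN TRANSACTION;
GO
-- Retrieve the symbolic path name of the BLOB and a token representing
-- the current transaction; both are passed to OpenSqlFilestream.
SELECT Document.PathName () AS FilePath,
    GET_FILESTREAM_TRANSACTION_CONTEXT () AS TxContext
FROM DocumentStore
WHERE DocumentID = 1;
GO
-- The client performs its Win32 file I/O and then commits.
COMMIT TRANSACTION;
GO
```

Because the Win32 file open executes in the security context of the Windows user who obtained the transaction token, only Windows Authentication can flow that identity through to SQL Server.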
When the FILESTREAM data container is created, it is automatically secured so that only the SQL Server service account and members of the BUILTIN\Administrators group can access the data container directory tree. Care should be taken that the data container contents are never changed except through supported transactional methods, as change through other methods will lead to the container becoming corrupt.
The second step of enabling FILESTREAM is performed within the SQL Server 2008 instance. This should not be done until FILESTREAM has been enabled at the Windows level, and the NTFS volume that will store the FILESTREAM data has been adequately prepared (as described in the “Configuring Windows for FILESTREAM” section earlier).
FILESTREAM access is controlled within SQL Server using sp_configure to set the filestream_access_level configuration option to one of three settings. The possible settings are:
- 0 – disable FILESTREAM support for this instance
- 1 – enable FILESTREAM for Transact-SQL access only
- 2 – enable FILESTREAM for Transact-SQL and Win32 streaming access
The following example shows enabling FILESTREAM for Transact-SQL and Win32 streaming access.
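```sql
EXEC sp_configure filestream_access_level, 2;
GO
RECONFIGURE;
GO
```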
The RECONFIGURE statement is necessary for the newly configured value to take effect. Note that if FILESTREAM has not been enabled at the Windows level, it will not be enabled at the SQL Server level when the code above is run. The current configured value can be found using the following code.
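```sql
EXEC sp_configure 'filestream_access_level';
GO
```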
If FILESTREAM is not configured at the Windows level, the “run_value” in the sp_configure output will remain 0 and will differ from the “config_value” after the RECONFIGURE statement has been executed.
Once FILESTREAM is enabled at the Windows and SQL Server levels, a FILESTREAM data container can be defined. This is done by defining a FILESTREAM filegroup in a database. There is a 1:1 mapping between FILESTREAM filegroups and FILESTREAM data containers.
A FILESTREAM filegroup can be defined when a database is created, or it can be created separately by using an ALTER DATABASE statement. The following example creates a FILESTREAM filegroup in an existing database.
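```sql
ALTER DATABASE Production
ADD FILEGROUP FileStreamGroup1 CONTAINS FILESTREAM;
GO
```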
The CONTAINS FILESTREAM clause is necessary to distinguish the new filegroup from regular database filegroups. If the FILESTREAM feature is disabled, this statement will fail with an error stating that the FILESTREAM feature is disabled.
Assuming FILESTREAM is enabled at the Windows and SQL Server levels, the filegroup will be created. At this point, the FILESTREAM data container is defined by adding a single file to the filegroup. The pathname specified is the pathname of the directory that will be created as the root of the data container. The entire pathname up to, but not including, the final directory name must already exist. The following example defines the data container for the FileStreamGroup1 filegroup created earlier.
ALTER DATABASE Production
ADD FILE
    (NAME = FSGroup1File, FILENAME = 'F:\Production\FSDATA')
TO FILEGROUP FileStreamGroup1;
GO
At this point, the FSDATA directory will be created. It will be empty apart from two things:
- The file filestream.hdr. This is the FILESTREAM metadata for the data container.
- The directory $FSLOG. This is the FILESTREAM equivalent of a database’s transaction log.
It should be noted that a database can have multiple FILESTREAM filegroups. This can be useful to separate the BLOB storage for multiple tables in the database.
Once the database has a FILESTREAM filegroup, tables can be created that contain FILESTREAM columns. As mentioned earlier, a FILESTREAM column is defined as a varbinary (max) column that has the FILESTREAM attribute. The following code creates a table with a single FILESTREAM column.
USE Production;
GO
CREATE TABLE DocumentStore (
    DocumentID INT IDENTITY PRIMARY KEY,
    Document VARBINARY (MAX) FILESTREAM NULL,
    DocGUID UNIQUEIDENTIFIER NOT NULL ROWGUIDCOL
        UNIQUE DEFAULT NEWID ())
FILESTREAM_ON FileStreamGroup1;
GO
A table can have multiple FILESTREAM columns, but the data from all FILESTREAM columns in a table must be stored in the same FILESTREAM filegroup. If the FILESTREAM_ON clause is not specified, whichever FILESTREAM filegroup is set to be the default will be used. This may not be the desired configuration and could lead to performance problems.
Once the table is created, the FILESTREAM data container will contain another directory, corresponding to the table, with a subdirectory that corresponds to the FILESTREAM column in the table. This subdirectory will contain the data files once data is entered into the table. The directory structure will vary depending on the number of FILESTREAM columns a table has, and whether the table is partitioned or not.
Note that for a table to have one or more FILESTREAM columns, it must also have a column of the uniqueidentifier data type that has the ROWGUIDCOL attribute. This column must not allow null values and must have either a UNIQUE or PRIMARY KEY single-column constraint. The GUID value for the column must be supplied either by an application when inserting data or by a DEFAULT constraint that uses the NEWID() function (or NEWSEQUENTIALID() if merge replication is configured, as mentioned in the “Feature Combinations and Restrictions” section later).
For more information about details and restrictions on the required table schema and options, see the SQL Server 2008 Books Online topic “CREATE TABLE (Transact-SQL)” (http://msdn.microsoft.com/en-us/library/ms174979.aspx).
FILESTREAM data files in the FILESTREAM data container cannot be partially updated. This means that any change to the BLOB data in the FILESTREAM column will create a whole new FILESTREAM data file. The “old” file must be preserved until it is no longer needed for recovery purposes. Files representing deleted FILESTREAM data, or rolled-back inserts of FILESTREAM data, are similarly preserved.
Files that are no longer needed are removed by a garbage collection process. This process is automatic, unlike Windows SharePoint® Services, where garbage collection must be implemented manually on the external BLOB store.
All FILESTREAM file operations are mapped to a Log Sequence Number (LSN) in the database transaction log. As long as the transaction log has been truncated past the FILESTREAM operation LSN, the file is no longer needed and can be garbage collected. Therefore, anything that can prevent transaction log truncation may also prevent a FILESTREAM file being physically deleted. Some examples are:
- Log backups have not been taken, in the FULL or BULK_LOGGED recovery model.
- There is a long-running active transaction.
- The replication log reader job has not run.
FILESTREAM garbage collection is a background task that is triggered by the database checkpoint process. A checkpoint is automatically run when enough transaction log has been generated. For more information, see the SQL Server 2008 Books Online topic “CHECKPOINT and the Active Portion of the Log” (http://msdn.microsoft.com/en-us/library/ms189573.aspx). Given that FILESTREAM file operations are minimally logged in the database’s transaction log, it may take a while before the number of transaction log records generated triggers a checkpoint process and garbage collection occurs. If this becomes a problem, you can force garbage collection by using the CHECKPOINT statement.
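For example (a sketch; the backup path shown is illustrative), in the FULL recovery model the log must first be backed up so that it can be truncated past the FILESTREAM file operations, after which a manual checkpoint triggers garbage collection:

```sql
-- Illustrative path; substitute a real backup location.
BACKUP LOG Production TO DISK = 'E:\Backups\Production_log.bak';
GO
CHECKPOINT;
GO
```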
If the table that contains FILESTREAM data is partitioned, the FILESTREAM_ON clause must specify a partitioning scheme involving FILESTREAM filegroups and based on the table’s partitioning function. This is necessary because the regular partitioning scheme will involve the regular filegroups that cannot be used to store FILESTREAM data. The table definition (either in a CREATE TABLE or CREATE CLUSTERED INDEX … WITH DROP_EXISTING statement) then specifies both partitioning schemes.
The FILESTREAM partitioning scheme can specify that all the partitions are mapped to a single filegroup, but this is not recommended as it can lead to performance problems.
The following (contrived) example shows this syntax.
CREATE PARTITION FUNCTION DocPartFunction (INT)
AS RANGE RIGHT FOR VALUES (100000, 200000);
GO
CREATE PARTITION SCHEME DocPartScheme
AS PARTITION DocPartFunction TO (Data_FG1, Data_FG2, Data_FG3);
GO
CREATE PARTITION SCHEME DocFSPartScheme
AS PARTITION DocPartFunction TO (FS_FG1, FS_FG2, FS_FG3);
GO
CREATE TABLE DocumentStore (
    DocumentID INT IDENTITY PRIMARY KEY,
    Document VARBINARY (MAX) FILESTREAM NULL,
    DocGUID UNIQUEIDENTIFIER NOT NULL ROWGUIDCOL
        UNIQUE DEFAULT NEWID () ON Data_FG1)
ON DocPartScheme (DocumentID)
FILESTREAM_ON DocFSPartScheme;
GO
Notice that for the DocumentID column to be used as the partitioning column, the underlying nonclustered index that enforces the UNIQUE constraint on the DocGUID column must be explicitly placed on a filegroup. This means that partition switching is only possible if the UNIQUE constraints are disabled before performing the partition switch (because they are unaligned indexes) and then re-enabled afterwards.
Continuing the previous example, the following code creates a table and then attempts a partition switch.
CREATE TABLE NonPartitionedDocumentStore (
    DocumentID INT IDENTITY PRIMARY KEY,
    Document VARBINARY (MAX) FILESTREAM NULL,
    DocGUID UNIQUEIDENTIFIER NOT NULL ROWGUIDCOL
        UNIQUE DEFAULT NEWID ());
GO
ALTER TABLE DocumentStore
SWITCH PARTITION 2 TO NonPartitionedDocumentStore;
GO
The partition switch fails with the following message.
Msg 7733, Level 16, State 4, Line 1
‘ALTER TABLE SWITCH’ statement failed. The table ‘FileStreamTestDB.dbo.DocumentStore’ is partitioned while index ‘UQ__Document__8CC331617F60ED59’ is not partitioned.
Disabling the unique index in the source table and trying again gives the following code.
ALTER INDEX [UQ__Document__8CC331617F60ED59] ON DocumentStore DISABLE;
GO
ALTER TABLE DocumentStore
SWITCH PARTITION 2 TO NonPartitionedDocumentStore;
GO
This also fails, with the following message.
Msg 4947, Level 16, State 1, Line 1
ALTER TABLE SWITCH statement failed. There is no identical index in source table ‘FileStreamTestDB.dbo.DocumentStore’ for the index ‘UQ__NonParti__8CC3316103317E3D’ in target table ‘FileStreamTestDB.dbo.NonPartitionedDocumentStore’.
The unique indexes in both the partitioned and nonpartitioned tables must be disabled before the switch can proceed.
ALTER INDEX [UQ__NonParti__8CC3316103317E3D] ON NonPartitionedDocumentStore DISABLE;
GO
ALTER TABLE DocumentStore
SWITCH PARTITION 2 TO NonPartitionedDocumentStore;
GO
ALTER INDEX [UQ__NonParti__8CC3316103317E3D] ON NonPartitionedDocumentStore
    REBUILD WITH (ONLINE = ON);
ALTER INDEX [UQ__Document__8CC331617F60ED59] ON DocumentStore
    REBUILD WITH (ONLINE = ON);
GO
More information on partitioning FILESTREAM data will be included in the upcoming white paper on partitioning in SQL Server 2008.
Partitioning can also be used to create a table schema that allows crude load balancing of FILESTREAM data between multiple volumes. This may be desirable for a variety of reasons such as hardware limitations or to allow hot-spots in a table to be stored on different volumes.
The following code shows a partitioning function and schema based on the uniqueidentifier column that effectively spreads the FILESTREAM data across 16 volumes, while striping the structured data across two filegroups.
USE master;
GO
-- Create the database
CREATE DATABASE Production ON PRIMARY
    (NAME = 'Production', FILENAME = 'E:\Production\Production.mdf'),
FILEGROUP DataFilegroup1
    (NAME = 'Data_FG1', FILENAME = 'F:\Production\Data_FG1.ndf'),
FILEGROUP DataFilegroup2
    (NAME = 'Data_FG2', FILENAME = 'G:\Production\Data_FG2.ndf'),
FILEGROUP FSFilegroup0 CONTAINS FILESTREAM
    (NAME = 'FS_FG0', FILENAME = 'H:\Production\FS_FG0'),
FILEGROUP FSFilegroup1 CONTAINS FILESTREAM
    (NAME = 'FS_FG1', FILENAME = 'I:\Production\FS_FG1'),
FILEGROUP FSFilegroup2 CONTAINS FILESTREAM
    (NAME = 'FS_FG2', FILENAME = 'J:\Production\FS_FG2'),
FILEGROUP FSFilegroup3 CONTAINS FILESTREAM
    (NAME = 'FS_FG3', FILENAME = 'K:\Production\FS_FG3'),
FILEGROUP FSFilegroup4 CONTAINS FILESTREAM
    (NAME = 'FS_FG4', FILENAME = 'L:\Production\FS_FG4'),
FILEGROUP FSFilegroup5 CONTAINS FILESTREAM
    (NAME = 'FS_FG5', FILENAME = 'M:\Production\FS_FG5'),
FILEGROUP FSFilegroup6 CONTAINS FILESTREAM
    (NAME = 'FS_FG6', FILENAME = 'N:\Production\FS_FG6'),
FILEGROUP FSFilegroup7 CONTAINS FILESTREAM
    (NAME = 'FS_FG7', FILENAME = 'O:\Production\FS_FG7'),
FILEGROUP FSFilegroup8 CONTAINS FILESTREAM
    (NAME = 'FS_FG8', FILENAME = 'P:\Production\FS_FG8'),
FILEGROUP FSFilegroup9 CONTAINS FILESTREAM
    (NAME = 'FS_FG9', FILENAME = 'Q:\Production\FS_FG9'),
FILEGROUP FSFilegroupA CONTAINS FILESTREAM
    (NAME = 'FS_FGA', FILENAME = 'R:\Production\FS_FGA'),
FILEGROUP FSFilegroupB CONTAINS FILESTREAM
    (NAME = 'FS_FGB', FILENAME = 'S:\Production\FS_FGB'),
FILEGROUP FSFilegroupC CONTAINS FILESTREAM
    (NAME = 'FS_FGC', FILENAME = 'T:\Production\FS_FGC'),
FILEGROUP FSFilegroupD CONTAINS FILESTREAM
    (NAME = 'FS_FGD', FILENAME = 'U:\Production\FS_FGD'),
FILEGROUP FSFilegroupE CONTAINS FILESTREAM
    (NAME = 'FS_FGE', FILENAME = 'V:\Production\FS_FGE'),
FILEGROUP FSFilegroupF CONTAINS FILESTREAM
    (NAME = 'FS_FGF', FILENAME = 'W:\Production\FS_FGF');
GO
USE Production;
GO
-- Create a partition function based on the last 6 bytes of the GUID
CREATE PARTITION FUNCTION LoadBalance_PF (UNIQUEIDENTIFIER)
AS RANGE LEFT FOR VALUES (
    CONVERT (uniqueidentifier, '00000000-0000-0000-0000-100000000000'),
    CONVERT (uniqueidentifier, '00000000-0000-0000-0000-200000000000'),
    CONVERT (uniqueidentifier, '00000000-0000-0000-0000-300000000000'),
    CONVERT (uniqueidentifier, '00000000-0000-0000-0000-400000000000'),
    CONVERT (uniqueidentifier, '00000000-0000-0000-0000-500000000000'),
    CONVERT (uniqueidentifier, '00000000-0000-0000-0000-600000000000'),
    CONVERT (uniqueidentifier, '00000000-0000-0000-0000-700000000000'),
    CONVERT (uniqueidentifier, '00000000-0000-0000-0000-800000000000'),
    CONVERT (uniqueidentifier, '00000000-0000-0000-0000-900000000000'),
    CONVERT (uniqueidentifier, '00000000-0000-0000-0000-a00000000000'),
    CONVERT (uniqueidentifier, '00000000-0000-0000-0000-b00000000000'),
    CONVERT (uniqueidentifier, '00000000-0000-0000-0000-c00000000000'),
    CONVERT (uniqueidentifier, '00000000-0000-0000-0000-d00000000000'),
    CONVERT (uniqueidentifier, '00000000-0000-0000-0000-e00000000000'),
    CONVERT (uniqueidentifier, '00000000-0000-0000-0000-f00000000000'));
GO
-- Create a FILESTREAM partitioning scheme that maps to 16 FILESTREAM filegroups
CREATE PARTITION SCHEME LoadBalance_FS_PS
AS PARTITION LoadBalance_PF TO (
    FSFilegroup0, FSFilegroup1, FSFilegroup2, FSFilegroup3,
    FSFilegroup4, FSFilegroup5, FSFilegroup6, FSFilegroup7,
    FSFilegroup8, FSFilegroup9, FSFilegroupA, FSFilegroupB,
    FSFilegroupC, FSFilegroupD, FSFilegroupE, FSFilegroupF);
GO
-- Create a data partitioning scheme to round-robin between two filegroups
CREATE PARTITION SCHEME LoadBalance_Data_PS
AS PARTITION LoadBalance_PF TO (
    DataFilegroup1, DataFilegroup2, DataFilegroup1, DataFilegroup2,
    DataFilegroup1, DataFilegroup2, DataFilegroup1, DataFilegroup2,
    DataFilegroup1, DataFilegroup2, DataFilegroup1, DataFilegroup2,
    DataFilegroup1, DataFilegroup2, DataFilegroup1, DataFilegroup2);
GO
-- Create the partitioned table
CREATE TABLE DocumentStore (
    DocumentID INT IDENTITY,
    Document VARBINARY (MAX) FILESTREAM NULL,
    DocGUID UNIQUEIDENTIFIER NOT NULL ROWGUIDCOL DEFAULT NEWID (),
    CONSTRAINT DocStorePK PRIMARY KEY CLUSTERED (DocGUID),
    CONSTRAINT DocStoreU UNIQUE (DocGUID))
ON LoadBalance_Data_PS (DocGUID)
FILESTREAM_ON LoadBalance_FS_PS;
GO
The load balancing can be easily tested using the following code.
SET NOCOUNT ON;
GO
-- Insert 10000 rows to test load balancing
DECLARE @count INT = 0;
WHILE (@count < 10000)
BEGIN
    INSERT INTO DocumentStore DEFAULT VALUES;
    SET @count = @count + 1;
END;
GO
-- Check the distribution
SELECT COUNT ($PARTITION.LoadBalance_PF (DocGUID))
FROM DocumentStore
GROUP BY $PARTITION.LoadBalance_PF (DocGUID);
GO
The results of a sample run of this test were 631, 641, 661, 640, 649, 637, 618, 618, 576, 608, 595, 645, 640, 616, 602, and 623 rows in each of the FILESTREAM filegroups FS_FG0 through FS_FGF.
Because the FILESTREAM feature stores data within the file system, there are some restrictions and considerations when FILESTREAM is combined with other SQL Server features. This section gives an overview of the feature combinations to be aware of. For more information, see the SQL Server 2008 Books Online topic “Using FILESTREAM with Other SQL Server Features” (http://msdn.microsoft.com/en-us/library/bb895334.aspx).
Both transactional replication and merge replication support FILESTREAM data, but there are many considerations such as:
- When the replication topology involves instances using different versions of SQL Server, there are limitations on the size of data that can be sent to down-level instances.
- The replication filter options determine whether the FILESTREAM attribute is replicated or not using transactional replication.
- The varbinary (max) maximum data value size that can be replicated in transactional replication without replicating the FILESTREAM attribute is 2 GB.
- When merge replication is used, both it and FILESTREAM require a uniqueidentifier column. Care must be taken with the table schema when using merge replication so that the GUIDs are sequential (i.e., use NEWSEQUENTIALID() rather than NEWID()).
Database mirroring does not support FILESTREAM. A FILESTREAM filegroup cannot be created on the principal server. Database mirroring cannot be configured for a database that contains FILESTREAM filegroups.
FILESTREAM data cannot be encrypted using SQL Server encryption methods. If transparent data encryption is enabled, FILESTREAM data is not encrypted.
FILESTREAM is fully supported with failover clustering. All nodes in the cluster must have FILESTREAM enabled at the Windows level, and the FILESTREAM data container(s) must be placed on shared storage so the data is available to all nodes. For more information, see the SQL Server 2008 Books Online topic: “How to: Set Up FILESTREAM on a Failover Cluster” (http://msdn.microsoft.com/en-us/library/cc645886.aspx).
Full-text indexing works with a FILESTREAM column in exactly the same way that it does with a varbinary (max) column. The table must have an extra column that contains the file name extension for the BLOB data being stored in the FILESTREAM column.
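As an illustrative sketch only (the DocExtension column and the PK_DocumentStore index name are assumptions and are not part of the schema defined earlier in this paper), a full-text index over a FILESTREAM column might be created as follows:

```sql
-- Assumes the table has a DocExtension column holding the file name
-- extension (e.g., 'docx') and a unique key index named PK_DocumentStore.
CREATE FULLTEXT CATALOG DocumentCatalog AS DEFAULT;
GO
CREATE FULLTEXT INDEX ON dbo.DocumentStore
    (Document TYPE COLUMN DocExtension LANGUAGE 1033)
    KEY INDEX PK_DocumentStore;
GO
```

The TYPE COLUMN clause tells the full-text engine which filter to use when extracting text from each BLOB.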
SQL Server does not support database snapshots for FILESTREAM data containers. If a FILESTREAM data file is included in a CREATE DATABASE ON clause, the statement will fail and an error will be raised.
If a database contains FILESTREAM data, a database snapshot of regular filegroups can still be created. In this case, a warning message will be returned and the FILESTREAM filegroups will be marked as offline in the database snapshot. Queries will work as expected against the database snapshot unless they attempt to access the FILESTREAM data. If this happens an error will result.
A database cannot be reverted to a snapshot if the database contains FILESTREAM data, because there is no way to tell what state the FILESTREAM data was in at the point in time represented by the database snapshot.
Views, Indexes, Statistics, Triggers, and Constraints
FILESTREAM columns cannot be part of an index key or specified as an INCLUDE column in a nonclustered index. A computed column can be defined that references a FILESTREAM column, but the computed column cannot be indexed.
Statistics cannot be created on FILESTREAM columns.
PRIMARY KEY, FOREIGN KEY, and UNIQUE constraints cannot be created on FILESTREAM columns.
Indexed views cannot contain FILESTREAM columns; however, nonindexed views can.
Instead-of triggers cannot be defined on tables that contain FILESTREAM columns.
When FILESTREAM data is accessed through the Win32 APIs, only the read-committed isolation level is supported. Transact-SQL access also allows the repeatable-read and serializable isolation levels. Furthermore, using Transact-SQL access, dirty reads are permitted through the read-uncommitted isolation level, or the NOLOCK query hint, but such access will not show in-flight updates of FILESTREAM data.
Backup and Restore
FILESTREAM works with all recovery models and all forms of backup and restore (full, differential, and log). In a disaster situation, if the CONTINUE_AFTER_ERROR option is specified on either a BACKUP or RESTORE statement, the FILESTREAM data may not recover with zero data loss (similar to recovery of regular data when CONTINUE_AFTER_ERROR is specified).
The SQL Server instance must be configured to use integrated security if Win32 access to the FILESTREAM data is required.
Log shipping supports FILESTREAM. Both the primary and secondary servers must be running SQL Server 2008, or a later version, and have FILESTREAM enabled at the Windows level.
SQL Server Express
SQL Server Express supports FILESTREAM. The 4-GB database size limit does not include the FILESTREAM data container.
However, if FILESTREAM data is being sent to or from the SQL Server Express instance using Service Broker, then care must be taken, because Service Broker does not support storing data as FILESTREAM in the transmission or target queues. This means that if either queue builds up, the 4-GB database size limit may be reached.
An alternative in this case is to use a scheme where the Service Broker conversation carries notifications that FILESTREAM data needs to be sent or received. Then the actual transmission of the FILESTREAM data is done using remote access through the FILESTREAM share of the SQL Server Express instance’s FILESTREAM data container.
There are several important considerations when tuning a FILESTREAM workload:
- Ensure that the hardware is configured correctly for FILESTREAM.
- Ensure that the generation of 8.3 names is disabled in NTFS.
- Ensure that last access time tracking is disabled in NTFS.
- Ensure that the FILESTREAM data container is not on a fragmented volume.
- Ensure that the BLOB data size is appropriate for storing with FILESTREAM.
- Ensure that the FILESTREAM data containers have their own, dedicated volumes.
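The two NTFS settings in the checklist above can be checked and changed with the fsutil utility, run from an elevated command prompt. (These commands change system-wide NTFS behavior; on some Windows versions a restart is required before they take full effect.)

```shell
REM Disable generation of 8.3 short file names (1 = disabled)
fsutil behavior set disable8dot3 1

REM Disable last access time tracking (1 = disabled)
fsutil behavior set disablelastaccess 1

REM Query the current values
fsutil behavior query disable8dot3
fsutil behavior query disablelastaccess
```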
One important factor to point out is the buffer size used by the SMB protocol when buffering reads of FILESTREAM data. In testing on the Windows Server® 2003 operating system, larger buffer sizes tended to give better throughput, with the best results coming from buffer sizes that are multiples of approximately 60 KB. Larger buffer sizes may be more efficient on other operating systems.
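As an illustration of the buffer-size point, the following Python sketch reads a file sequentially using a buffer that is a multiple of 60 KB. The same read-loop pattern applies whether the handle comes from a regular file open or from the Win32 FILESTREAM access APIs; the file here is a throwaway stand-in for a file on a FILESTREAM share.

```python
import os
import tempfile

# Buffer size: a multiple of ~60 KB, per the SMB buffering observation above.
# Larger multiples may perform better; benchmark for the target system.
BUFFER_SIZE = 60 * 1024

def read_in_chunks(path, buffer_size=BUFFER_SIZE):
    """Read a file sequentially using a fixed-size buffer."""
    chunks = []
    with open(path, "rb", buffering=0) as f:
        while True:
            chunk = f.read(buffer_size)
            if not chunk:
                break
            chunks.append(chunk)
    return b"".join(chunks)

# Demonstration: 150 KB of data spans multiple 60 KB buffers.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(os.urandom(150 * 1024))
    path = tmp.name
data = read_in_chunks(path)
assert len(data) == 150 * 1024
os.unlink(path)
```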
There are additional considerations when comparing a FILESTREAM workload against other storage options (once the FILESTREAM workload has been tuned):
- Ensure that the storage hardware and RAID level are the same for both.
- Ensure that the volume compression setting is the same for both.
- Take into account whether FILESTREAM is performing write-through based on the API in use and options specified.
A common scenario using SQL Server 2008 will be the migration of existing BLOB data into FILESTREAM storage. While providing a complete tool or set of code to perform such migrations is beyond the scope of this white paper, the following is a simple workflow to follow:
- Review the data size considerations for using FILESTREAM to ensure that the average data sizes involved are such that FILESTREAM storage is appropriate.
- Review the available information on feature combinations and limitations to ensure that FILESTREAM storage will work with all other requirements of the application.
- Follow the recommendations in the “Performance Tuning and Benchmarking Considerations” section earlier in this white paper.
- Ensure that the SQL Server instance is using integrated security and that FILESTREAM has been enabled at the Windows and SQL Server levels.
- Ensure that the target FILESTREAM data container location has enough disk space to store the migrated BLOB data.
- Create the required FILESTREAM filegroups.
- Duplicate the table schemas involved, changing the required BLOB columns to be FILESTREAM.
- Migrate all non-BLOB data to the new schema.
- Migrate all BLOB data into the new FILESTREAM columns.
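A minimal Transact-SQL sketch of the last three steps might look like the following. All object names are hypothetical, the FILESTREAM filegroup is assumed to already exist, and the source table is assumed to hold its BLOBs in a varbinary(max) column:

```sql
-- New schema: same shape as the source table, but with the BLOB column
-- declared as FILESTREAM (which requires a ROWGUIDCOL column).
CREATE TABLE dbo.DocumentsNew (
    DocId   UNIQUEIDENTIFIER ROWGUIDCOL NOT NULL UNIQUE DEFAULT NEWID (),
    DocName NVARCHAR (260)  NOT NULL,
    DocData VARBINARY (MAX) FILESTREAM NULL
) FILESTREAM_ON FileStreamGroup1;
GO
-- Migrate the data. For very large tables, this INSERT would typically be
-- batched (for example, on ranges of the source key) to keep each
-- transaction, and the log, to a manageable size.
INSERT INTO dbo.DocumentsNew (DocName, DocData)
    SELECT DocName, DocData
    FROM dbo.DocumentsOld;
```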
This section is a collection of best practices that have emerged from FILESTREAM usage during the prerelease internal and public testing of the feature. As with all best practices, these are generalizations and they may not apply to every situation and scenario. The best practices are as follows, in no particular order:
- Multiple small appends to a FILESTREAM file should be avoided if possible, because each append creates a whole new FILESTREAM file. This can be very expensive for large FILESTREAM files. If possible, batch multiple appends together in a varbinary(max) column and append that to the FILESTREAM column when a size threshold is reached.
- With a heavily multithreaded write workload, consider setting the AllocationSize parameter for the OpenSqlFilestream or SqlFilestream APIs. Higher initial allocation sizes will limit the potential for fragmentation at the file-system level, especially when combined with a large NTFS cluster size as described previously.
- If FILESTREAM files are large, avoid Transact-SQL updates that append or prepend data to a file. This will (generally) spool data into tempdb and back into a new physical file, which will affect performance.
- When reading a FILESTREAM value, consider the following:
- If reads require only the first few bytes, consider using Transact-SQL substring functionality.
- If the entire file is to be read, consider Win32 access.
- If random portions of the file are to be read, consider opening the file handle using SetFilePointer.
- When reading an entire file, specify the FILE_SEQUENTIAL_ONLY flag.
- Use buffers sized in multiples of 60 KB (as described earlier).
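The append-batching advice above can be sketched in Transact-SQL: small fragments accumulate in an ordinary varbinary(max) staging column and are flushed to the FILESTREAM column in one operation when a threshold is crossed, so each large file is rewritten once per batch rather than once per fragment. The table, column names, and 1 MB threshold here are illustrative:

```sql
-- Hypothetical staging pattern: PendingData is an ordinary varbinary(max)
-- column; FileData is the FILESTREAM column.
UPDATE dbo.Documents
SET PendingData = ISNULL (PendingData, 0x) + @NewFragment
WHERE DocId = @DocId;

-- Once the staged data passes the threshold (1 MB here), append it to the
-- FILESTREAM value in a single operation and clear the staging column.
UPDATE dbo.Documents
SET FileData = ISNULL (FileData, 0x) + PendingData,
    PendingData = NULL
WHERE DocId = @DocId
  AND DATALENGTH (PendingData) >= 1048576;
```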
The size of a FILESTREAM file can be obtained without having to open a handle to the file by adding a persisted computed column to the table that stores the FILESTREAM file size. The computed column is updated during write operations, while the file is already open, so no extra handle needs to be opened just to determine the size.
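Declared in Transact-SQL, such a column might look like the following (the table and column names are hypothetical):

```sql
-- Persisted computed column storing the FILESTREAM value's size, so the
-- size can be read without opening a handle to the underlying file.
ALTER TABLE dbo.Documents
    ADD DocSizeBytes AS DATALENGTH (DocData) PERSISTED;
```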
This white paper has described the FILESTREAM feature of SQL Server 2008 that allows storage of and efficient access to BLOB data using a combination of SQL Server 2008 and the NTFS file system. To conclude, it is worth reiterating the major points made during the white paper.
FILESTREAM storage is not appropriate in all cases. Based on prior research and the behavior of the FILESTREAM feature, BLOB data of 1 MB and larger that will not be accessed through Transact-SQL is best suited to storage as FILESTREAM data.
Consideration must also be given to the update workload, as any partial update to a FILESTREAM file will generate a complete copy of the file. With a particularly heavy update workload, the performance may be such that FILESTREAM is not appropriate.
The details of feature combinations should be studied to ensure that the deployment succeeds. For instance, in SQL Server 2008 RTM, database mirroring cannot be used with FILESTREAM data, nor can either flavor of snapshot isolation. Most other feature combinations are supported, but some have limitations (such as replication). This white paper does not provide an exhaustive taxonomy of features and their interactions, so the most recent SQL Server Books Online sections should be checked prior to deployment, especially as some limitations are likely to be lifted in future releases.
Finally, if FILESTREAM is deployed without correctly configuring Windows and SQL Server, the anticipated performance levels may not be reached. The best practices and configuration details described above should be used to help avoid performance problems.
For more information:
http://www.microsoft.com/sqlserver/: SQL Server Web site
http://technet.microsoft.com/en-us/sqlserver/: SQL Server TechCenter
http://msdn.microsoft.com/en-us/sqlserver/: SQL Server DevCenter
Did this paper help you? Please give us your feedback. On a scale of 1 (poor) to 5 (excellent), how would you rate this paper, and why have you given it this rating? For example:
- Are you rating it high due to having good examples, excellent screen shots, clear writing, or another reason?
- Are you rating it low due to poor examples, fuzzy screen shots, or unclear writing?
This feedback will help us improve the quality of white papers we release.