
Chapter 9 - Storage Engine Enhancements

This chapter describes the new storage engine features in Microsoft® SQL Server™ 2000 and offers tips for using them, along with some insight into how the storage engine works. Developing a basic understanding of the inner workings of the storage engine can help you get the most out of SQL Server.

In a world that is focused on highly scalable applications, databases are now designed and implemented on short schedules, and they remain in a constant state of evolution as development requirements change and production usage grows. Scalability, availability, and ease-of-use requirements demand a responsive and flexible data storage engine. Different editions of SQL Server 2000 support a wide variety of systems, ranging in size from a tiny mobile system for a Pocket PC all the way up to a high availability multiple-terabyte transaction processing or decision support system running clustered Windows® 2000 Datacenter Servers. All of these systems maintain the flexibility, security, and reliability mission-critical business systems demand.

You can deploy SQL Server 2000 applications for projects of varying purposes and sizes, as a result of intelligent, automated storage engine operations. A highly sophisticated architecture improves performance, availability, and scalability.

Availability 

Reliability and concurrency are enhanced with new algorithms for physical file interaction. These algorithms eliminate the need to run database console commands (DBCCs) as part of regular maintenance. However, DBCC is still available, and the new DBCC CHECK commands can be run without inhibiting online processing.

Scalability

The storage subsystem, which consists of the physical database files and their layout on disk, supports scaling from very small to very large databases. Data can be stored according to usage across files and filegroups that span many physical disks. SQL Server can now support up to 64 GB of physical memory and up to 32 processors.

Ease of Use

Enhanced administration capabilities help the Database Administrator (DBA) to automate and centralize server management. This also allows easy maintenance of remote servers and applications without the necessity of having a DBA at every site. Server configuration, managed by a sophisticated algorithm, is dynamically responsive to server usage patterns, freeing the DBA to work on database management and optimization tasks.

Storage Engine Enhancements

The relational database server of SQL Server 2000 has two main parts: the relational engine and the storage engine. The two engines work independently, interacting with each other through native data access components such as OLE DB. The relational engine provides an interface into the storage engine, which is composed of services to interact with the underlying database storage components and features.

The primary responsibilities of the storage engine include:

  • Providing features to improve ease of use for managing storage components. 

  • Managing the data buffers and all I/O to the physical files. 

  • Controlling concurrency, managing transactions, locking, and logging. 

  • Managing the files and physical pages used to store data. 

  • Recovering from system faults. 

The SQL Server 2000 storage engine offers new features that add conceptual simplicity and physical flexibility, while reducing the need for meticulous capacity planning and performance tuning. SQL Server 2000 reacts to its own environment and dynamically adapts to changes in database usage accurately and quickly. This breakthrough in technology has elevated the focus of database administration to the facilitation of data as a service. SQL Server 2000 DBAs can focus on designing a system that is responsive to data flow and usage, rather than spending time tuning individual parameters.

The changes in SQL Server 2000 are built on architectural enhancements, introduced in SQL Server 7.0, that provide a foundation for continuing future improvement and innovation. The primary goal of this release was to reduce the amount of time and effort spent tuning the server on a regular basis. Because most tuning parameter settings could be based on database usage, the engine was designed to dynamically adjust to situations in the database environment according to an adaptive algorithm. This automated flexibility has been implemented for tuning parameters that required constant adjustment and experimentation in earlier versions. You can still manually adjust tuning features, but SQL Server 2000 does more of the work for you.

The following table summarizes the key enhancements made to the SQL Server 2000 storage engine. They are described in greater detail later in this chapter.

Feature

Description and benefits

Application lock manager

If you need to control concurrent access to application-defined resources, such as forms, new stored procedures now allow you to lock these resources using the SQL Server application lock manager.

Database console commands (DBCCs)

The DBCC CHECK commands can now run during online processing, without blocking updates. New enhancements allow verifying consistency of physical pages to detect hardware-induced errors. In SQL Server 2000 Enterprise Edition, DBCC now runs in parallel across multiple processors.

Database options

All database options can now be modified using ALTER DATABASE. This ability simplifies administration.

Differential backups

Differential backups are quicker in SQL Server 2000, due to an enhancement that tracks database changes at the extent level.

Dynamic tuning

Using dynamic adaptive algorithms, the server automatically adjusts previously static configuration settings. Administrative control is still available to manage system-wide resources, but you will not usually need to use it. Manually set parameters adapt dynamically within their constrained boundaries.

In-row text

In tables that include a small, frequently used text column, smaller text values can be stored in the same page with the standard data row, rather than on a page of text values. In tables where this text data is accessed frequently, this feature can eliminate a large amount of disk I/O.

Index builds in parallel

In Enterprise Edition, index builds automatically make use of all processors configured for parallel processing, reducing the time it takes to build an index by as much as a factor of six on an 8-processor server. Index builds also take advantage of available resources in memory and tempdb.

Index read ahead

Index reads have been enhanced to increase performance on index scans.

Index reorganization

Improvements made to DBCC SHOWCONTIG provide more detailed information regarding index fragmentation. A new DBCC command, INDEXDEFRAG, reorganizes index pages online without disrupting database service or incurring any risk to database consistency or recovery.

Descending order key columns on indexes

Individual key columns in an index can be specified as ascending or descending.

KILL command

This command now reports completion progress. If this command is waiting on another process, such as a rollback, you can view how much of the command has been executed. This command has been enhanced to allow stopping of Microsoft Distributed Transaction Coordinator (MS DTC) transactions, which are not associated with a specific session.

Large memory support

Windows 2000 technology improves the performance of Enterprise Edition systems that use a large amount of memory. Using the AWE extensions of Windows 2000, SQL Server 2000 can support up to 64 GB of physical memory.

Locking

The lock manager has been enhanced to detect deadlocks across additional resources such as threads and memory. Concurrency improvements reduce deadlocks. This further enhances scalability in SQL Server 2000.

Logical log marks

Transact-SQL commands can create a bookmark in the log to permit restoration of the database to the point in time indicated by the bookmark. This feature also synchronizes restoration of multiple databases used for the same application.

Online index reorganization

Improvements made to DBCC SHOWCONTIG provide more detailed information regarding index fragmentation. A new DBCC command, INDEXDEFRAG, reorganizes index pages online without disrupting database service or incurring any risk to database consistency or recovery.

Optimized I/O read-ahead

SQL Server 2000 issues multiple serial read-ahead reads at once for each file involved in the scan. The query optimizer uses serial, read-ahead I/O when scanning tables and indexes for improved performance.

Passwords on backups

Backup media and individual backups can be password-protected. This prevents an unauthorized user from restoring a backup and gaining access to a database.

Recovery models

Using recovery models, you can select the level of logging in the database. This allows greater flexibility of transaction log management. The recovery model can be altered online to complement varying usage of the database throughout the day.

Shared table scans

In Enterprise Edition, multiple scans of a table can now take advantage of other ongoing scans of that table, reducing physical I/O to the disk.

Shrinking the log

The command to shrink the log runs immediately in more situations. When the log cannot be shrunk immediately, SQL Server will provide constructive feedback on what must be done before continuing with or completing the shrink operation.

Snapshot backups

Support for snapshot backups by third-party vendors has been enhanced. Snapshot backups take advantage of storage technologies to back up or restore an entire database in seconds. These backups can now be combined with conventional transaction log and differential backups to provide complete protection for OLTP databases. This is especially beneficial for moderate to very large databases in which availability is extremely important.

Space-efficient empty tables and indexes

No disk pages are allocated for empty tables and indexes in SQL Server 2000. SQL Server 7.0 allocated as many as three pages for empty tables and indexes.

Top n sort

This new feature optimizes retrieval of top n values (for example, SELECT TOP 5 * FROM tablename).

Xlock

SQL Server 2000 provides this new Transact-SQL locking hint. It can be used to explicitly invoke an exclusive transaction-level page or table lock.

SQL Server 2000 has been enriched with features that allow more efficient interaction with the data and more flexibility from an administrative perspective. The following sections give more details on these enhancements and some tips on how to use them.

Interacting with Data

In SQL Server 2000, the storage engine has been enhanced to provide even more scalability and performance when interacting with the data. Understanding these enhancements can help you use SQL Server more effectively.

The exchange of data begins with a query, whether it originates from a user interface or from an automated task. The data request is passed into the relational engine, which interacts with the storage engine to get the data and pass it back to the user. From the perspective of the user, and even the DBA, the functioning of the storage and relational engines are indistinguishable.

Reading Data More Effectively

Data flows between the server and the user through a series of transactions. The application or user initiates the work, and the database passes it to the query processor for completion and then returns the end results. The query processor does the work by accepting, interpreting, and executing SQL statements.

For example, when a user session issues a SELECT statement, the following steps occur:

  1. The relational engine compiles and optimizes the statement into an execution plan (a series of steps required to get the data). 

  2. The relational engine runs the execution plan, making calls into the storage engine to access tables and indexes and gather the necessary data.

  3. The relational engine combines all the data returned by the storage engine into the final result set and then sends it back to the user.

A couple of improvements have been made to boost performance in this process. In SQL Server 2000, the relational engine relays qualifying query predicates to the storage engine so they can be applied earlier in the process, resulting in a more efficient exchange between the storage and relational engines. This can provide a significant performance gain for qualifying queries.

Top n Enhanced 

Another improvement is in the way the storage engine handles selection of the top n records from a result set. In SQL Server 2000, a new top n engine analyzes the best path of operation for statements like this one:

SELECT TOP 5 * FROM orders ORDER BY date_ordered DESC

For this example, if the whole table must be searched, the engine analyzes the data and tracks only the top n values in the cache. This is a tremendous performance boost for this type of SELECT statement, because only the values in the top n will be sorted, rather than the whole table.

Shared Scans 

In Enterprise Edition, two or more queries can share ongoing table scans, which can improve performance in very large SQL Server 2000 installations. For example, when a query searches a very large table using an unordered scan, the pages flow through the cache to make room for the data flowing in. If another query is started, a second scan of the same table would incur disk I/O to retrieve those pages again. In an environment where there are frequent table scans, this can cause disk thrashing as both queries search the same data pages.


An optimizing process reduces the amount of disk I/O produced by this type of data access pattern. The first unordered scan of a table will read the data from the disk; instead of having to read the disk again, subsequent unordered scans of the same table can build on the information already in memory. During multiple simultaneous scans of the same table, this synchronization process may boost performance as much as eightfold. This improvement is even more noticeable in large decision support queries, where the total table size is much larger than the size of the cache.

Shared scans are a feature invoked by the storage engine to assist with queries that have no better available execution plan. The intent of this feature is to assist in frequent reads of very large tables. When the query processor determines that the best execution plan includes a table scan, this feature is invoked. However, while it is possible to use query or index tuning to force shared scans, no performance gain is achieved by forcing a table scan where a well-maintained index would do the job as well or better.

Concurrency

In order to maintain transactional consistency while many users are interacting with the data, the storage engine locks resources to manage dependencies on rows, pages, keys, key ranges, indexes, tables, and databases. By locking resources while they are being altered, the engine prevents more than one user from altering the same data at the same time. SQL Server locks are dynamically applied at various levels of granularity, in order to select the least restrictive lock required for the transaction.

In SQL Server 2000, concurrency improvements further reduce deadlocks and avoidable locking of resources. For example, the lock manager has been enhanced to be aware of other resources that might be in contention, such as threads and memory. This new ability can help a database administrator identify a wider variety of design or hardware limitations.

A new Transact-SQL interface into the lock manager has been introduced to support customized locking logic within programming code. Locks necessary for business logic can be initiated by invoking sp_getapplock within your Transact-SQL batch, which allows you to specify an application-defined resource to be locked (for example, a lock on an application resource like a form, instead of a lock on a data row), the mode of locking to use, the timeout value, and whether the scope of the lock should be the transaction or the session. After locks have been initiated with the new application lock manager, they participate in the normal lock management of SQL Server, just as if the storage engine had initiated them, so you do not have to worry that your application-initiated lock will remain open if the calling transaction is terminated.
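
For illustration, the following Transact-SQL sketch acquires and releases a transaction-scoped application lock; the resource name, timeout, and surrounding logic are hypothetical.

DECLARE @result int
BEGIN TRANSACTION
-- Request an exclusive lock on an application-defined resource (hypothetical name).
EXEC @result = sp_getapplock
    @Resource    = 'CustomerEntryForm',
    @LockMode    = 'Exclusive',
    @LockOwner   = 'Transaction',
    @LockTimeout = 5000
IF @result >= 0
BEGIN
    -- ... work with the protected resource here ...
    COMMIT TRANSACTION   -- a transaction-owned lock is released when the transaction ends
END
ELSE
    ROLLBACK TRANSACTION -- the lock could not be acquired within the timeout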

The process by which locks are acquired in SQL Server 2000 takes into account whether or not all the data on the page is committed. For example, if you run a SELECT statement against a table whose data has not changed recently, such as a table in the pubs database, the process does not produce any locks because no active transactions have recently updated the table. The storage engine accomplishes this by comparing the log sequence number on the data page to the current active transactions. In databases where most of the data is older than the oldest active transaction, this can reduce locking significantly, enhancing performance.

While locks protect data during transactions, another process, latching, controls access to physical pages. Latches are very lightweight, short-term synchronization objects protecting actions that do not need to be locked for the life of a transaction. When the engine scans a page, it latches the page, reads the row, gives it back to the relational engine, and then unlatches the page again so another process can reach the same data. Through a process called lazy latching, the storage engine optimizes access to the data pages by releasing latches only when a page is also requested by another ongoing process. If no ongoing process requests the same data page, a single latch remains valid for the entire operation on that page.

To improve concurrency in your system, focus on the design of the database system and the code objects that touch it. SQL Server 2000 is designed to support multiple terabytes of data and to scale upward for virtually limitless performance. The role of the DBA is to manage the database life cycle, a cycle of design and optimization of all database components, from code to data storage on disk, to ensure that the design continues to meet the service level agreement.

Tables and Indexes

Enhancements have also been made to the physical data structures, to allow more flexibility of design and maintenance.

As a table or index grows, SQL Server allocates new data pages in sets of eight, called extents. A row of data cannot span pages, so a single row can hold at most 8 KB of data, although associated text, ntext, or image columns can be stored on separate pages. Tables that have clustered indexes are physically stored in key order on disk. Heaps, which are tables that do not have clustered indexes, are not sorted; the records are stored in the order in which they were inserted.

SQL Server 2000 supports indexes on views (called materialized views in some other database products). When you create a clustered index on a view, the view is no longer a derived object; it becomes a base object and is stored in the database with the same structure as a table with a clustered index. An indexed view is useful for storing precalculated values, or the result of a complex join, in cases where the maintenance cost does not outweigh the performance gain. In Enterprise Edition, the query processor automatically uses an indexed view whenever this would optimize a query plan. Indexed views can improve query speed on data that is rarely changed but is frequently part of a complex join or calculation query.
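
As a minimal sketch (the orders table and its columns are hypothetical), an indexed view that stores a precalculated aggregate might be created as follows. The view must be created WITH SCHEMABINDING, aggregate views require COUNT_BIG(*), and certain SET options (such as ANSI_NULLS and QUOTED_IDENTIFIER) must be on when the view and index are created.

CREATE VIEW dbo.v_customer_totals
WITH SCHEMABINDING
AS
SELECT customer_id,
       COUNT_BIG(*)     AS order_count,   -- required for indexed views that aggregate
       SUM(order_total) AS total_sales    -- assumes order_total is declared NOT NULL
FROM dbo.orders
GROUP BY customer_id
GO
-- The unique clustered index materializes the view and stores it like a table.
CREATE UNIQUE CLUSTERED INDEX IX_v_customer_totals ON dbo.v_customer_totals (customer_id)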

In-Row Text

In-row text allows you to store small text data in the primary page. For example, if you have a table that has a text column, but the text values are frequently small enough to fit on a normal page with the rest of the row, you can set a threshold on the text column. The threshold determines the size below which data is stored on the primary page rather than a separate text page. This results in much faster performance if the majority of the data will fit on the page, and only a small percentage of the data is actually large enough to justify the creation of a text page.
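
For example (the table name and threshold value are hypothetical), the threshold is set with the sp_tableoption system stored procedure:

-- Store text values of 500 bytes or less in the data row for this (hypothetical) table.
EXEC sp_tableoption 'customer_notes', 'text in row', '500'
-- Specifying 'ON' uses the default limit of 256 bytes; 'OFF' reverts to separate text pages.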

To determine when to use this new feature, balance the storage density (how many rows are stored on each data page) against the I/O improvement. For example, suppose you have a text column for comments, and you observe that 20 percent of the text values are large but the other 80 percent are less than 100 bytes. This may seem like a logical candidate for the in-row text solution; however, you should only use in-row text if the data in that column is accessed frequently. If your users access this table frequently but do not look at the comments column unless they are doing special research, in-row text might not be the best answer: the storage density is reduced, because fewer rows are stored per page, and table scan response times increase, because the table contains more pages. The best case for implementing in-row text is a frequently accessed text column that also happens to have many small values that can be stored in the row.

New Data Types

SQL Server 2000 introduces three new data types. bigint is an 8-byte integer type. sql_variant allows the storage of data values of different data types. The third data type, table, is useful for optimizing performance. Table variables make more efficient use of tempdb and are scoped to the batch in which they are declared. As long as the data stored in them does not fill the data cache, table variables are much faster than temporary tables. As a general rule, always consider the resources of your server when you use either table variables or temporary tables.
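
For example, the following sketch (table and column names are hypothetical) declares and uses a table variable:

-- The table variable exists only for the duration of the batch in which it is declared.
DECLARE @recent_orders table
(
    order_id   int PRIMARY KEY,
    customer   varchar(40),
    order_date datetime
)

INSERT @recent_orders (order_id, customer, order_date)
SELECT order_id, customer, order_date
FROM orders
WHERE order_date >= '20000101'

SELECT COUNT(*) AS recent_order_count FROM @recent_orders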

Indexes

Access to the data is optimized through the use of indexes. Correct indexing is vital to performance. Incorrect indexing is one of the most common causes of slowness in a database. Indexing is based on usage, which usually changes over time. Therefore, in addition to performing standard index maintenance, you should periodically check current indexing and adjust it to current system usage by dropping or adding indexes as appropriate.

Several new features in SQL Server 2000 make index maintenance more efficient and easier for administration. These enhancements decrease disk I/O, increasing the performance of index scans. This is especially useful where there is a secondary index available for a range scan.

Building Indexes 

When you build an index, the storage engine samples the rows and calculates the most efficient way to utilize server resources to build the index. Options allow you to control how indexes are built and how system resources are allocated during the build. Using your knowledge of the particular database system, you can use these options to balance resources so the index build has the lowest possible impact on transaction processing.

The following options control the resources used by an index build:

  • Memory: the sp_configure advanced option index create memory specifies the amount of memory used by any index build.

  • tempdb: the CREATE INDEX option SORT_IN_TEMPDB causes disk space used for sorting during the index build to be allocated from tempdb. This can provide more I/O bandwidth if tempdb is on separate disks, and a more physically contiguous layout of index pages if the database is low on contiguous space.

  • CPU: the sp_configure advanced option max degree of parallelism limits the number of processors (CPUs) used in parallel operations, server-wide.
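
As an illustration (object names and values are hypothetical), these options might be combined as follows:

-- Limit the memory used by a single index build to 4096 KB (advanced option, value in KB).
EXEC sp_configure 'show advanced options', 1
RECONFIGURE
EXEC sp_configure 'index create memory', 4096
RECONFIGURE

-- Sort in tempdb rather than the destination filegroup while building the index.
CREATE NONCLUSTERED INDEX IX_orders_date ON orders (date_ordered)
WITH SORT_IN_TEMPDB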

For more information about these options, see SQL Server Books Online.

Another scalability feature for large systems is the parallel index build, which is available in Enterprise Edition. This process is invoked automatically when you issue a single CREATE INDEX statement. The storage engine calculates the requirements for the data and then creates separate threads, each of which builds a section of the index.


An index build can also make use of a shared table scan, to further optimize the process.

Defragmenting Indexes 

SQL Server 2000 supports online reorganization of indexes, a tremendous advancement from earlier versions. Online index reorganization has minimal impact on transaction throughput and can be stopped and restarted at any time without loss of work. The reorganization is accomplished in small increments and is fully recoverable.

As information is inserted, deleted, and updated in a table, the clustered and nonclustered index pages can eventually become fragmented, decreasing the efficiency of range queries against that data. Therefore, it can be beneficial to defragment your indexes periodically. You can use DBCC SHOWCONTIG, which has been improved in SQL Server 2000, to analyze and report fragmentation. For more information, see SQL Server Books Online.

If you determine that an index is fragmented, use DBCC INDEXDEFRAG to reorganize it. It reorders the pages in logical key order, compacting free space and moving rows within the established extents to conform to the fill factor setting. This enhances read performance by densely populating the pages, so fewer pages must be read during a scan of the data. Running DBCC INDEXDEFRAG has far less impact on online performance than rebuilding the index does, provided the index has been regularly maintained and is not completely fragmented.
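
For example (the database, table, and index names are hypothetical):

-- Report fragmentation statistics for one index.
DBCC SHOWCONTIG ('orders', 'IX_orders_date')

-- Reorganize that index online, in logical key order.
DBCC INDEXDEFRAG ('sales', 'orders', 'IX_orders_date')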

DBCC INDEXDEFRAG is one of a number of long-running online administrative operations that use small transactions internally. These small transactions maximize concurrency within the server, allow the operation to be stopped without loss of work, and are fully logged to prevent having to redo them in case of a failure.

Logging and Recovery

The transaction log is a stream of records that records changes to the database from the point the database was created until the current point in time. Every logged operation creates a log record. The log records generated by a transaction are written to disk when the transaction commits. In contrast, the data pages modified by the transaction are not immediately written to disk, but are retained in the SQL Server buffer cache and written to disk some time later. Delaying writes of the data to disk maximizes the efficiency of multiple accesses to the data pages and avoids disrupting scans. Forcing the log to disk on commit guarantees that no committed work is lost if the server goes down.

Recovery ensures that a database is transactionally consistent prior to bringing it online. If a database is transactionally consistent, all committed work is present and any uncommitted work has been undone. The log always defines the correct view of the database. Simply put, recovery is the process of making the data consistent with the transaction log at a given point in time.

Recovery is performed automatically when SQL Server starts, when a database is attached, or as the final step in restoring a database from backups. Recovery performed by SQL Server when it starts is called startup recovery. Recovery from backups is typically due to disk failure. This type of recovery is called media recovery.

Startup recovery occurs automatically each time an instance of SQL Server is started and consists of rolling back any transactions that were incomplete when the instance was last shut down. In the case of recovery from backups, the DBA may choose to recover to an earlier point in time. This is subject to restrictions. For more information, see SQL Server Books Online. In both cases, recovery operates based on this target point in time.

Recovery consists of two phases:

  1. Redo all changes until the target point in time is encountered in the transaction log. 

  2. Undo all work performed by transactions that were active at the point where redo stopped. 

SQL Server uses checkpoints to limit the amount of time required for startup recovery. A checkpoint forces all modified data pages currently in the buffer cache to disk. This creates a starting point for the redo portion of recovery. Checkpoints can be expensive, so SQL Server automatically manages checkpoints to maximize performance while minimizing restart time.

The logging and recovery technology used in SQL Server requires that writes that complete successfully be stored durably on disk. If you use write-caching disk storage, work with your storage vendor to ensure that the cache is fault tolerant. Fault tolerance means that the cache is immune to power failures or operator actions. If your cache is not fault tolerant, it should be disabled.

Logical Log Marks 

In SQL Server 7.0, it was possible to recover to any specified point in time. In the case of hardware failure, the restore process was fairly straightforward. However, another threat to a database is the possibility that invalid data may be entered or that valid data may be destroyed by a user's action. In this case, you need to determine when the problem transaction began. In SQL Server 7.0, the only way to do this was to restore logs to a copy of the database until the problem recurred; then you could run your restore to the production image up to a point in time just prior to the discovered time of error.

In SQL Server 2000, you can mark transactions in the log. Later, if you need to restore, you can reference the mark that was used at the time of execution, rather than using wall-clock time. To do this, use a named BEGIN TRANSACTION statement and the WITH MARK [description] clause. The marks are stored in msdb. Recovery can include or stop right before a transaction that contains the mark. For example, if you have a process that runs in batch and changes many records, you can use this feature to ensure that you can roll the data back to the point in time that the command was executed.

Mark names do not need to be unique. To indicate which transaction you need, specify a datetime value. The syntax for this is:

RESTORE LOG database_name WITH { STOPATMARK | STOPBEFOREMARK } = 'mark_name' AFTER datetime
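
For example, the following sketch (names, paths, and the update itself are hypothetical) marks a batch update in the log and later restores the database to the moment just before it ran:

-- Mark the transaction in the log before running the batch update.
BEGIN TRANSACTION nightly_price_update WITH MARK 'Nightly price update'
UPDATE products SET price = price * 1.05
COMMIT TRANSACTION

-- Later, restore the full backup without recovery, then stop the log restore just before the mark.
RESTORE DATABASE sales FROM DISK = 'D:\Backup\sales_full.bak' WITH NORECOVERY
RESTORE LOG sales FROM DISK = 'D:\Backup\sales_log.trn'
    WITH STOPBEFOREMARK = 'nightly_price_update', RECOVERY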

You can also use marks in a distributed transaction, known as distributed marks, to support recovery of multiple related databases to a transactionally consistent state. These related databases might reside on the same or different instances of SQL Server. You can set distributed marks across a set of databases periodically (for example, once every five minutes). If the transaction log of one of the databases is damaged, you must recover the set of databases to an earlier point in time. The distributed mark provides this point. Using distributed marks negates the worry of coordinating precise timing of backups for multiple related databases. For more information, see "Recovering to a Named Transaction" in SQL Server Books Online.

Shrinking the Transaction Log 

Log shrink operations were not executed immediately in SQL Server 7.0. They were deferred until the transaction log was next backed up or truncated. This confused many SQL Server 7.0 customers. SQL Server 2000 shrinks the log as much as possible and then indicates if further shrinking will be possible after a log backup. In this case, run the shrink command again after the log backup has completed. For more information, see "Shrinking the Transaction Log" in SQL Server Books Online.
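
For example (the database, log file, and path names are hypothetical):

-- Attempt to shrink the log file to approximately 100 MB.
DBCC SHRINKFILE (sales_log, 100)

-- If SQL Server reports that a log backup must occur first, back up the log and retry.
BACKUP LOG sales TO DISK = 'D:\Backup\sales_log.trn'
DBCC SHRINKFILE (sales_log, 100)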

The size of the log is based on your current recovery model and your application design. If you find that you need to shrink the log periodically, look beyond the symptom to the cause: investigate what is causing the log to fill up, rather than focusing on constant maintenance with the shrink command.

Recovery Models

Recovery models were added to SQL Server 2000 to facilitate data protection planning. They clarify the tradeoffs between performance, log space requirements, and protection from media (disk) failure. There are three models: Simple Recovery, Full Recovery, and Bulk-Logged Recovery.

The choice of recovery model is based on database usage and availability requirements and helps determine appropriate backup and restore procedures. Recovery models only apply to media recovery, that is, recovery from backups. Startup recovery recovers all committed work. For more information, see "Selecting a Recovery Model" in SQL Server Books Online.

You can easily transition between recovery models. For example, on a very large database, you can use the Full or Bulk-Logged Recovery model, or both: Full Recovery during the day and Bulk-Logged Recovery at night during a data load process that consists of bulk inserts and index rebuilds. You can also switch to the Bulk-Logged Recovery model while you run a data load, then switch back to Full Recovery, run a transaction log backup, and still be able to restore to that point in time without having to run a full database backup. This allows you to do the bulk processing more efficiently; all you need to do is make a transaction log backup afterward.

To change recovery models, use the following syntax:

ALTER DATABASE database_name SET RECOVERY { FULL | BULK_LOGGED | SIMPLE }
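
For example, a sketch of the day/night transition described earlier (the database name and backup path are hypothetical):

-- Switch to bulk-logged recovery before the nightly load.
ALTER DATABASE sales SET RECOVERY BULK_LOGGED

-- ... run BULK INSERT, SELECT INTO, or index rebuilds here ...

-- Switch back to full recovery and back up the log to re-establish point-in-time recovery.
ALTER DATABASE sales SET RECOVERY FULL
BACKUP LOG sales TO DISK = 'D:\Backup\sales_log.trn'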

For more information, see "Switching Recovery Models" in SQL Server Books Online.

Simple Recovery Model 

The Simple Recovery model typically requires less log space, but it incurs the greatest potential work loss if data or log files are damaged. Only events needed for basic recovery are logged. Using the Simple Recovery Model, only full database and differential database backups are available. In the event of a failure, all committed work since the last backup must be redone. This model is the simplest to administer, but it is not a good choice for a mission-critical application where loss of committed work cannot be tolerated.

This model is similar to the truncate log on checkpoint option in SQL Server 7.0 and earlier versions.

Full Recovery Model 

In the Full Recovery model, everything is logged. Full Recovery model provides complete protection against work loss from a damaged data file. If the transaction log is damaged, work committed since the most recent log backup is lost and must be redone manually.

Even when you use the Full Recovery model, it is important to use fault-tolerant disks for the transaction log to prevent data loss. The Full Recovery model also allows recovery to any specific point in time.

Bulk-Logged Recovery Model 

The Bulk-Logged Recovery model provides the highest performance for bulk operations. These operations also consume less log space than they do under the Full Recovery model. For example, the allocation of a new page is logged, but the data inserted onto the page is not. In SQL Server 2000, bulk operations consist of bulk load (BCP and BULK INSERT, including when they run within a DTS package), SELECT INTO, CREATE INDEX, WRITETEXT, and UPDATETEXT.

Compared with the Full Recovery model, the Bulk-Logged Recovery model minimizes logging for bulk operations. Keep in mind that in the event that recovery becomes necessary, if the log is damaged or if bulk operations have occurred since the most recent log backup, changes made in the database since the last log backup are lost.

This model does not support recovery to a specific point in time, but it will allow recovery to the end of a transaction log backup containing bulk changes. Transaction log backups made using the Bulk-Logged Recovery model contain the extents modified by bulk operations. This feature improves support for log shipping, because you no longer need to worry that a bulk operation will invalidate your backups. SQL Server maintains a bitmap to track the data extents modified, which optimizes the process by which SQL Server identifies changes.

Improved Backup Functionality 

In addition to the introduction of recovery models to simplify data protection in general, SQL Server 2000 has improved manageability: snapshot technology, differential backups, and security have been enhanced.

  • The transaction log backup chain is never broken. In SQL Server 7.0, certain operations, such as adding a file to a database, broke the log chain and required a subsequent full database backup. 

  • Backup operations do not conflict with applications or other administrative actions. For example, backups can occur concurrently with bulk operations such as create index and bulk load. 

  • Log and file backups can occur concurrently. 

Unattended backup operations, regardless of system activity, are also better supported in SQL Server 2000.

SQL Server supports snapshot backup and restore technologies in conjunction with independent hardware and software vendors. Snapshot backups minimize or eliminate the use of server resources to accomplish the backup. This is especially beneficial for moderate to very large databases in which availability is extremely important. The primary benefits of this technology are:

  • A backup can be created in a very short time, usually measured in seconds, with little or no impact on the server.

  • A disk backup can be used to restore a database just as quickly. 

  • Another host can create a backup with no impact on the production system. 

  • A copy of a production database can be created instantly for reporting or testing.

Snapshot backups and restores are accomplished in cooperation with third-party hardware and/or software vendors who use features of SQL Server 2000 designed for this purpose. The backup technology creates an instantaneous copy of the data being backed up, usually by splitting a mirrored set of disks. At restore time, the original is immediately available. The underlying disks are synchronized in the background, resulting in almost instantaneous restores.

Differential database backups can be completed in a time that is proportional to the amount of data changed since the last full backup. The less your data has changed, the quicker the backup. SQL Server 2000 uses a bitmap to track data extents modified since the most recent database or file backup to enable them to be located efficiently. In addition, SQL Server 2000 supports file differential backups.

Differential backups still accumulate all changes made to the database since the most recent full backup, and they function the same way in the event of recovery. They are significantly faster, however, because they record only the information that has changed; this is especially noticeable for very large databases in which only a small amount of data has changed.

For added security, you can implement password protection for your backup media and backup sets. This helps prevent unauthorized users from adding to your backups or restoring to your database.
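
For example (the database name, path, and password are hypothetical), a password-protected differential backup might look like this. Media sets can also be given a separate password with the MEDIAPASSWORD option when the media is formatted.

BACKUP DATABASE sales
    TO DISK = 'D:\Backup\sales_diff.bak'
    WITH DIFFERENTIAL, PASSWORD = 'backup_set_password'

-- The same password must be supplied when restoring:
-- RESTORE DATABASE sales FROM DISK = 'D:\Backup\sales_diff.bak'
--     WITH PASSWORD = 'backup_set_password'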

Administrative Improvements

Several administrative features of the storage engine have been altered and enhanced in SQL Server 2000.

Database Verification 

The DBCCs provide a variety of administrative capabilities, including the CHECK commands for verifying database consistency.

Experience with SQL Server 7.0 and SQL Server 2000 has shown that database inconsistency is caused by hardware problems that may or may not be detected by the database engine or applications during normal operation. This is particularly applicable to data that is accessed infrequently. In response to this need, SQL Server 2000 introduces a checking mode, Physical_Only, that is designed to detect most hardware-caused problems. It is very fast, approximately disk scan speed, and is not resource intensive.
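
For example (the database name is hypothetical):

-- Fast check oriented toward detecting hardware-caused damage.
DBCC CHECKDB ('sales') WITH PHYSICAL_ONLY

-- A full check that takes shared table locks (faster, but blocks updates):
-- DBCC CHECKDB ('sales') WITH TABLOCK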

Due to fundamental architectural improvements in the SQL Server storage engine, which started with SQL Server 7.0, it is not necessary to run database verification as part of normal maintenance. However, Microsoft remains committed to database verification tools as an important part of managing mission critical data. Microsoft recommends that you:

  • Run the Physical_Only check occasionally, depending on your confidence in underlying hardware, particularly the disk subsystems. 

  • Run a complete database check at critical times, such as a hardware or software upgrade, or whenever a problem is suspected regardless of cause. 

Microsoft does not recommend running a complete check as part of regular maintenance.

SQL Server 2000 also includes important enhancements to database verification:

  • By default, checking is fully online. Online checking has low impact on the transaction workload. This impact will vary depending on the system load, hardware configuration, and speed of tempdb. Microsoft has measured this impact at 15 to 20 percent with a medium OLTP workload (50 percent CPU). The TABLOCK option is provided to force the check to take shared table locks, which enables it to run faster but will prevent updates. 

  • Checking is done in parallel on symmetric multiprocessing (SMP) computers, limited by the maximum degree of parallelism you have set for the instance of SQL Server. 

SQL Server 2000 check commands continue to support the repair functionality introduced in SQL Server 7.0. Repair can provide an alternative to a restore from backups in some situations. The database must be offline during repair.

Database State Control 

SQL Server 2000 includes enhancements to the ALTER DATABASE statement that allow more control of database states through Transact-SQL. All database options can now be modified with greater control through the ALTER DATABASE command; sp_dboption and databaseproperty() will no longer be updated in future releases. The Transact-SQL commands sp_helpdb and DatabasePropertyEx() provide information about the state of your database.

The following database state options are available:

  • User access: SINGLE_USER, RESTRICTED_USER, MULTI_USER

  • Availability: ONLINE, OFFLINE

  • Updatability: READ_ONLY, READ_WRITE

SQL Server also sets the following states in reaction to conditions within the database: restoring, recovering, and suspect. The database options can be set by using the SET clause of the ALTER DATABASE statement, the sp_dboption system stored procedure, or, in some cases, SQL Server Enterprise Manager.

When the database state is changed, the session making changes to the database state remains connected, while sessions inconsistent with the new state can be terminated and their transactions rolled back. Session termination options include the following:

  • Terminate immediately

  • Terminate after a specified time

  • Allow the ongoing processes to complete normally 

  • Check for activity and disregard the state change if active user sessions are found

Here are two examples of the syntax:

ALTER DATABASE accting SET READ_ONLY WITH ROLLBACK IMMEDIATE
ALTER DATABASE accting SET SINGLE_USER WITH ROLLBACK AFTER 60 SECONDS

For more information, see "Setting Database Options" in SQL Server Books Online.

System Process IDs and Units of Work 

An additional administrative enhancement helps when you need to stop a process. The KILL command has been enhanced with status feedback, so if you want to learn the status of an outstanding KILL command, run the following:

KILL SPID WITH STATUSONLY

If you try to stop a system process ID (SPID) that is being stopped by another KILL command, the system returns the same status information.

In SQL Server 2000, Microsoft Distributed Transaction Coordinator (MS DTC) transactions can exist without an associated connection or SPID. Therefore, a connection can be used for other processes while waiting for a transaction or unit of work to complete. When the MS DTC transaction manager sends a message that it has completed the task, you can either commit or roll back the transaction.

The term for this is a unit of work (UOW), which is the transaction identifier used by MS DTC for the transaction. A UOW does not have a SPID. For more information, see SQL Server Books Online.
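
As a hedged sketch, one common way to find orphaned MS DTC transactions is to look for requests reported under SPID -2 in syslockinfo; the UOW value passed to KILL below is a placeholder GUID.

-- List units of work held by orphaned distributed transactions (reported under SPID -2).
SELECT DISTINCT req_transactionUOW
FROM master..syslockinfo
WHERE req_spid = -2

-- Roll back one of them by passing its UOW string to KILL (placeholder value shown).
KILL 'D5499C66-E398-45CA-BF7E-DC9C194B48CF'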

Dynamic Tuning

In SQL Server 2000, usage-based performance tuning is managed dynamically, without required or recommended manual adjustments. The static parameters have been eliminated, but administrative control has been retained for certain resources (for example, setting an upper limit on the amount of memory SQL Server can use). This method is far more accurate and responsive than a manually calculated system based on averages and estimates. This allows you to concentrate on the design aspects of database management.

Traditional database systems require a great deal of manual management and tuning. For example, to tune the system in response to usage, the DBA would be required to monitor the system, recording a vast amount of statistics over time, in order to select a static setting that seems to provide the optimal advantage for the system. Then the DBA would re-evaluate the system to judge what effect the new setting has, and the tuning process would begin again.

SQL Server 2000 introduces a dynamic algorithm into the storage engine, which actively monitors usage of the server and adjusts the settings internally. Dynamic feedback and analysis in SQL Server 2000 keeps the setting within 10 percent of the absolute optimal value. The result is a better-tuned and highly adaptive system.


Data Storage Components

SQL Server 2000 balances its work across all available processors in coordination with the Windows 2000 operating system. If you are running a dedicated instance of SQL Server, and no other applications share the same resources, leave the processor-related settings at their defaults to take full advantage of all the processors. SQL Server can use parallel processing across multiple processors for queries, index builds, and DBCCs. SQL Server 2000 Standard Edition can support up to four processors and 2 GB of physical memory (RAM). Enterprise Edition can scale up to support as many as 32 processors and 64 GB of physical memory.

The main source of memory for an instance of SQL Server is called its memory pool. Almost all data structures that use memory in an instance of SQL Server are allocated from the memory pool. Examples of objects allocated from the memory pool include the buffer cache, where recently read data pages are stored, and the procedure cache, which holds recent execution plans.

The assignments within the memory pool are highly dynamic. To optimize performance, SQL Server constantly adjusts the amounts of the memory pool assigned to the various areas. In situations where the number of stored execution plans is lower, the memory pool is adjusted to make optimal use of the resources by making more memory available for the data cache.

SQL Server 2000 is designed to use memory to minimize disk I/O as much as possible. To accomplish this, SQL Server uses the buffer cache to hold recently referenced data, in physical memory, where it can be reused. One potential way to reduce disk I/O and speed up your database system would be to add to the physical memory available to SQL Server.

Normally, memory settings do not require any adjustment. However, they can be controlled in certain situations. For example, memory requires special attention where you are running multiple instances of SQL Server on the same server, especially if you use failover clustering. You also need to monitor memory usage if you are running applications in addition to SQL Server on the same server.
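
For example (the value is hypothetical), you can cap the memory available to an instance with the max server memory setting, which is useful when multiple instances or other applications share the server:

-- Limit this instance to 1 GB of memory (advanced option, value in MB).
EXEC sp_configure 'show advanced options', 1
RECONFIGURE
EXEC sp_configure 'max server memory', 1024
RECONFIGURE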


Use of physical memory beyond 3 GB is vastly improved in SQL Server 2000 because it is built on functionality in the Windows 2000 operating system. SQL Server 2000 Enterprise Edition can use as much memory as Windows 2000 Advanced Server or Windows 2000 Datacenter Server allows.

For more information on large memory support in SQL Server 2000, see "Managing AWE Memory" in SQL Server Books Online.

Files, Filegroups, and Disks

SQL Server stores data and the log on disk files. In a basic installation, and as a default, data and log files are created in the default location specified in the server configuration. However, to maximize performance, you can apply a few basic principles:

  • Spread data over as many disks, channels, and controllers as possible.

    In general, the more disks you have (regardless of their individual size), the faster the storage engine can read and write data. The larger your system becomes, the more important it is to store the data files and log files on separate physical drives. Also, because the use of tempdb has changed, you should now store tempdb on a large number of disks, for example, with the data files or on a separate set of disks.

  • Use filegroups to scale your enterprise database. 

    Every database begins with one default filegroup. Because SQL Server 2000 can work effectively without additional filegroups, many systems will not need to add user-defined filegroups. However, as a system grows, the use of additional filegroups supports scalability by allowing you to increase performance by distributing I/O appropriately for your system.

    In SQL Server 2000, if you set a particular filegroup within a database to read-only, the data on that filegroup cannot be altered, but catalog information such as permissions can still be managed.

To successfully implement an optimized database design, carefully consider the configuration of the database storage components, including the layout of physical and logical disks, the correct sizing of hardware, and the arrangement of the physical database files across disks.
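
As a minimal sketch (all file names, sizes, and drive letters are hypothetical), a database that spreads data across a user-defined filegroup and keeps the log on separate disks might be created like this:

CREATE DATABASE sales
ON PRIMARY
    (NAME = sales_data,  FILENAME = 'D:\SQLData\sales_data.mdf',  SIZE = 200MB),
FILEGROUP history
    (NAME = sales_hist1, FILENAME = 'E:\SQLData\sales_hist1.ndf', SIZE = 200MB),
    (NAME = sales_hist2, FILENAME = 'F:\SQLData\sales_hist2.ndf', SIZE = 200MB)
LOG ON
    (NAME = sales_log,   FILENAME = 'G:\SQLLog\sales_log.ldf',    SIZE = 100MB)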

For more information about SQL Server architecture and the storage engine, see SQL Server Books Online and Kalen Delaney's Inside Microsoft SQL Server 2000. For more information about hardware and SQL Server, see "SQL Server 2000 on Large Servers" in SQL Server Books Online. For more information about database optimization, see Chapter 33, "The Data Tier: An Approach to Database Optimization."

Innovation and Evolution

For DBAs, increased flexibility and control over performance provide the freedom to focus their database technology skills and experience on managing the database code, design, and storage components as a unified approach to database system management. The storage engine for SQL Server 2000 was designed to help meet these needs.

