Data Collector Logging
The data collector captures and logs several kinds of information for each collection set and its collection packages. This information, written to tables in msdb, describes the current state of the collection set and the outcome of package execution, and also provides statistics about each collection set and its packages. The following kinds of information are available for monitoring and analyzing data collection:
The currently running collection sets.
The history of a collection set run.
For a given collection set, the hierarchy of packages that were executed as part of that collection set run.
For each package, start and stop times, duration, completion status, and the number of rows processed.
If a collection set run failed, the package that caused the failure.
If a package execution failed, the component in that package that failed together with the error message.
For each package, the sequence of log events that the package logged for each run.
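For example, the first item in the list above, the currently running collection sets, can be surfaced with a query against the documented log view. This is a sketch only: it assumes the msdb view dbo.syscollector_execution_log (described later in this topic), and that finish_time is NULL while a run is still active.

```sql
USE msdb;
GO
-- Sketch: list collection set runs that have not yet finished.
-- Assumes finish_time is NULL for an active run and that top-level
-- collection set entries have no parent_log_id.
SELECT log_id, collection_set_id, start_time, status
FROM dbo.syscollector_execution_log
WHERE parent_log_id IS NULL   -- top-level collection set entries only
  AND finish_time IS NULL;
GO
```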
The data collector implements logging by using stored procedures and SQL Server 2008 Integration Services (SSIS) task execution and logging features. These features create summary and detailed log entries for log providers when run-time events occur. For more information, see Implementing Logging in Packages.
Logged events provide reference points for all phases of program execution. Each event record is mapped to the Execution ID of the package that a task ran under. The data collector maps these IDs to the specific packages for data collection and upload.
The high-level architecture and data handling for data collector logging are as follows:
Collection set information, provided by the data collector at run time, is stored to a data collector table using a stored procedure. Every record is given a log_id that is used to reference table entries and map a package to the SSIS Execution ID.
Package information is logged at the start and end of each package run. This logging is performed by running a Transact-SQL task at the beginning and at the end of the control flow for each package.
While the package is running, the data collector makes periodic status and statistical updates. Each package has a global error handler (implemented by using SSIS package events) that logs error events; an SSIS Execute SQL task placed in the control flow performs the required event logging. For more information, see SSIS Execute SQL Task and Control Flow Elements.
Events that SSIS generates are logged to the SSIS-defined table dbo.sysssislog by using the dbo.sp_ssis_addlogentry stored procedure.
When the package ends, a success or failure event is logged.
When the collection set stops, this information is logged. As with a package, a success or failure event is logged.
The data generated by the data collector is stored in the dbo.syscollector_execution_log_internal table, which is joined to dbo.sysssislog. This provides a complete view of collection set execution. These tables are described in more detail in the following section.
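The join described above can be sketched as follows. Because the internal tables should not be queried directly, this sketch assumes the msdb view dbo.syscollector_execution_log, which fronts the internal table, and assumes that its package_execution_id column corresponds to the executionid column of dbo.sysssislog.

```sql
USE msdb;
GO
-- Sketch: correlate collection set log entries with the SSIS events
-- that were raised under the same package execution.
SELECT dcl.log_id,
       dcl.collection_set_id,
       ssis.event,
       ssis.source,
       ssis.starttime,
       ssis.message
FROM dbo.syscollector_execution_log AS dcl
JOIN dbo.sysssislog AS ssis
  ON ssis.executionid = dcl.package_execution_id
ORDER BY ssis.starttime;
GO
```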
The log tables and their columns are listed in the following table. The "Accessing and Viewing Log Data" section, later in this topic, describes the functions and views that you can use to obtain detailed information about a log table's data types and contents.
syscollector_execution_log_internal: A data collector table that stores execution information.
syscollector_execution_stats_internal: A data collector table that stores statistical information.
sysssislog: An SSIS-defined table that stores event information, such as the start of a collection set or package, and warnings or errors. For more information, see sysssislog (Transact-SQL).
Do not modify data in the log tables. The data in these tables must be inserted, updated, and deleted in specific ways as enforced by stored procedures. The table names and definitions can change at run time, do change when you update the application, and might change in future releases. Instead of directly using the tables, use the documented stored procedures and views provided with the data collector to access instance and application data.
A collection set log is kept for the same length of time as the data in the management data warehouse is kept for the collection set. This is defined by the retention period that you can configure for each collection set. When the retention period expires, a SQL Server Agent job purges the data from the log.
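The retention period can be adjusted per collection set by using the documented configuration procedure sp_syscollector_update_collection_set. A sketch (the collection set ID shown is hypothetical; look up real IDs in dbo.syscollector_collection_sets):

```sql
USE msdb;
GO
-- Keep collected data, and therefore its log entries, for 14 days.
EXEC dbo.sp_syscollector_update_collection_set
    @collection_set_id = 1,        -- hypothetical ID
    @days_until_expiration = 14;
GO
```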
Three levels of logging are provided to handle the different collection scenarios that might occur, so that you have the data you need to troubleshoot a collection set or package. You can set a different logging level for each collection set according to your logging requirements.
The following table describes the available logging options.
Level 0 (the default): Logs execution information and basic SSIS run-time events, such as package start, package end, and errors.
Level 1: Combines level 0 logging with additional events, such as warnings.
Level 2: Combines level 1 logging with detailed SSIS events.
By default, the logging level is 0. Level 0 is used to avoid excessive log messages, such as the warning events that are generated by level 1 logging. These warning events may include performance suggestions from SSIS that cannot be corrected by the user.
Level 2 provides additional event information that can be used for advanced troubleshooting.
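The logging level is set per collection set with the same configuration procedure used for other collection set properties. A sketch (the collection set ID is hypothetical):

```sql
USE msdb;
GO
-- Raise logging to level 2 for one collection set while troubleshooting.
EXEC dbo.sp_syscollector_update_collection_set
    @collection_set_id = 1,   -- hypothetical ID
    @logging_level = 2;
GO
```

Remember to return the collection set to level 0 afterward to avoid accumulating excessive log messages.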
Every data collector package uses the SQL Server event provider. The events described in the following table are logged and available for viewing.
PackageStart: A standard event that is generated when the package starts. This event is always logged and cannot be disabled.
PackageEnd: A standard event that is generated when the package completes. This event is always logged and cannot be disabled.
OnError: An error event.
OnWarning: A warning event.
PipelineBufferLeak: Provides information about any component that kept buffers alive after the buffer manager stopped. If a buffer is still alive, its resources were not released, which may cause memory leaks. This log entry provides the name of the component and the ID of the buffer.
PipelineExecutionPlan: Reports the execution plan of the data flow. The execution plan provides information about how buffers will be sent to components. This information, in combination with the PipelineExecutionTrees log entry, describes what is occurring in a Data Flow task.
PipelineExecutionTrees: Reports the execution trees of the layout in the data flow. The scheduler of the data flow engine uses the trees to build the execution plan for the data flow.
PipelineInitialization: Provides initialization information about the Data Flow task. This information includes the directories to use for temporary storage of binary large object (BLOB) data, the default buffer size, and the number of rows in a buffer. Depending on the configuration of the Data Flow task, multiple log entries may be written.
ExecuteSQLExecutingQuery: Provides information about the execution phases of the SQL statement. Log entries are written when the task connects to the database, when the task starts to prepare the SQL statement, and after the execution of the SQL statement is completed. The log entry for the prepare phase includes the SQL statement that the task uses.
FileSystemOperation: Reports the operation that the File System task performs. The log entry is written when the file system operation starts, and includes information about the source and destination.
ExecuteProcessExecutingProcess: Provides information about the process that the Execute Process task is configured to run. Two log entries are written: one contains information about the name and location of the executable that the task runs, and the other records the exit from the executable.
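Because these events land in dbo.sysssislog, they can be filtered by name for inspection. A sketch (the event names shown, such as PipelineExecutionTrees, follow the SSIS custom log entry naming):

```sql
USE msdb;
GO
-- Sketch: pull recent error and data flow plan events for inspection.
SELECT TOP (50) event, source, starttime, message
FROM dbo.sysssislog
WHERE event IN (N'OnError', N'PipelineExecutionPlan', N'PipelineExecutionTrees')
ORDER BY starttime DESC;
GO
```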
Accessing and Viewing Log Data
You can access and view data programmatically by using provided functions and views. You can also use the Log File Viewer in Management Studio Object Explorer to view the logs. For more information, see How to: View Collection Set Logs.
The following table describes the functions and views that can be used for accessing log data.
fn_syscollector_get_execution_details: Returns detailed information about collection set and package logs for the requested log_id and all of its dependent log entries.
fn_syscollector_get_execution_stats: Returns a set of statistics for the requested log_id. The statistics are calculated from entries in the syscollector_execution_stats view.
syscollector_execution_log: Provides a view of the syscollector_execution_log_internal table.
syscollector_execution_log_full: Provides a detailed view of the syscollector_execution_log_internal table and shows the log entries in a hierarchy, starting from the top-level collection log entry and going through all packages that ran as part of a collection set run.
syscollector_execution_stats: Provides a 1:1 view of the syscollector_execution_stats_internal table.
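Putting the objects described above to use, the following sketch drills into the most recent collection set run. It assumes the msdb functions dbo.fn_syscollector_get_execution_details and dbo.fn_syscollector_get_execution_stats and the dbo.syscollector_execution_log view, with top-level entries identified by a NULL parent_log_id.

```sql
USE msdb;
GO
-- Sketch: find the most recent top-level log entry, then expand it.
DECLARE @log_id int;

SELECT TOP (1) @log_id = log_id
FROM dbo.syscollector_execution_log
WHERE parent_log_id IS NULL
ORDER BY start_time DESC;

-- Detailed entries for the run and all dependent package log entries.
SELECT * FROM dbo.fn_syscollector_get_execution_details(@log_id);

-- Aggregated statistics for the same run.
SELECT * FROM dbo.fn_syscollector_get_execution_stats(@log_id);
GO
```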