Query Activity Collector Type

The Query Activity collector type is a custom collector type used by the Query Statistics collection set, one of the predefined system data collection sets.

This collector type is used to collect query statistics and query activity information along with the query plan and query text for queries that meet predefined criteria. If you create your own collection set using this collector type, it will collect the same information as the Query Statistics collection set. Therefore, we recommend that you use the predefined Query Statistics collection set.

Architecture and Processing

The Query Activity collector type consists of the following elements:

  • An input parameters schema (internal)

  • An SSIS package for collecting data (QueryActivityCollect.dtsx)

  • An SSIS package for uploading data (QueryActivityUpload.dtsx)

In addition, this collector type utilizes custom tasks or transforms to analyze and select data for storage in the management data warehouse.

The Query Activity collector type performs the following operations:

  • It collects samples of dm_exec_requests, dm_exec_sessions, and selected other related dynamic management views. This is implemented as a single joined query. Data collection happens with the frequency specified for the collection item.

  • It collects snapshots of the dm_exec_query_stats dynamic management view, with a frequency equal to the collection set upload frequency. By default, the Query Statistics collection set has an upload frequency of 15 minutes.

Collection Phase

The following table shows the query that is used during the collection phase. This query is defined in the QueryActivityCollect.dtsx SSIS package.

Collection frequency

10 seconds

Query

SET NOCOUNT ON
-- Get the collection time as UTC time
DECLARE @collection_time datetime
SET @collection_time = GETDATE()
SELECT
CONVERT(int, ROW_NUMBER() OVER (ORDER BY sess.session_id, ISNULL (req.request_id, -1), ISNULL (tasks.exec_context_id, -1)) ) AS row_id,
-- IDs and Blocking IDs
sess.session_id, 
ISNULL (req.request_id, -1) AS request_id, 
ISNULL (tasks.exec_context_id, -1) AS exec_context_id, 
ISNULL (req.blocking_session_id, 0) AS blocking_session_id,
CONVERT (bit, CASE 
                WHEN EXISTS (SELECT TOP 1 session_id FROM sys.dm_exec_requests bl WHERE bl.blocking_session_id = req.session_id) THEN 1
                ELSE 0
              END) AS is_blocking,
ISNULL (waits.blocking_exec_context_id, 0) AS blocking_exec_context_id, 
tasks.scheduler_id, 
DB_NAME(req.database_id) as database_name, 
req.[user_id], 
-- State information
LEFT (tasks.task_state, 10) AS task_state, 
LEFT (req.status, 15) AS request_status, 
LEFT (sess.status, 15) AS session_status,
req.executing_managed_code, 
-- Session information
sess.login_time, 
sess.is_user_process, 
LEFT (ISNULL (sess.[host_name], ''), 20) AS [host_name], 
LEFT (ISNULL (sess.[program_name], ''), 50) AS [program_name], 
LEFT (ISNULL (sess.login_name, ''), 30) AS login_name, 
-- Waits information
LEFT (ISNULL (req.wait_type, ''), 45) AS wait_type, 
LEFT (ISNULL (req.last_wait_type, ''), 45) AS last_wait_type, 
ISNULL (waits.wait_duration_ms, 0) AS wait_duration_ms, 
LEFT (ISNULL (req.wait_resource, ''), 50) AS wait_resource, 
LEFT (ISNULL (waits.resource_description, ''), 140) AS resource_description,
-- Transaction information
req.transaction_id, 
ISNULL(req.open_transaction_count, 0) AS open_transaction_count,
COALESCE(req.transaction_isolation_level, sess.transaction_isolation_level) AS transaction_isolation_level,
-- Request stats
req.cpu_time AS request_cpu_time, 
req.logical_reads AS request_logical_reads, 
req.reads AS request_reads, 
req.writes AS request_writes, 
req.total_elapsed_time AS request_total_elapsed_time, 
req.start_time AS request_start_time, 
-- Session stats
sess.memory_usage, 
sess.cpu_time AS session_cpu_time, 
sess.reads AS session_reads, 
sess.writes AS session_writes, 
sess.logical_reads AS session_logical_reads, 
sess.total_scheduled_time AS session_total_scheduled_time, 
sess.total_elapsed_time AS session_total_elapsed_time, 
sess.last_request_start_time, 
sess.last_request_end_time, 
req.open_resultset_count AS open_resultsets, 
sess.row_count AS session_row_count, 
sess.prev_error, 
tasks.pending_io_count, 
-- Text/Plan handles
ISNULL (req.command, 'AWAITING COMMAND') AS command,  
req.plan_handle, 
req.sql_handle, 
req.statement_start_offset, 
req.statement_end_offset,
@collection_time AS collection_time
FROM sys.dm_exec_sessions sess 
LEFT OUTER MERGE JOIN sys.dm_exec_requests req  ON sess.session_id = req.session_id
LEFT OUTER MERGE JOIN sys.dm_os_tasks tasks ON tasks.session_id = sess.session_id AND tasks.request_id = req.request_id AND tasks.task_address = req.task_address
LEFT OUTER MERGE JOIN sys.dm_os_waiting_tasks waits ON waits.session_id = sess.session_id AND waits.waiting_task_address = req.task_address
WHERE 
    sess.session_id <> @@SPID
    AND
    (
        (req.session_id IS NOT NULL AND (sess.is_user_process = 1 OR req.status COLLATE Latin1_General_BIN NOT IN ('background', 'sleeping')))-- active request
            OR 
        (sess.session_id IN (SELECT DISTINCT blocking_session_id FROM sys.dm_exec_requests WHERE blocking_session_id != 0))            -- not active, but head blocker
    )
OPTION (FORCE ORDER)

Upload Phase

During the upload phase, the collected data is analyzed to determine what data will be saved in the management data warehouse. This analysis determines a set of query statistics, query plans, and query text that needs to be saved.

A key element is an algorithm that selects which queries and query plans to save in the data warehouse. This algorithm works as follows:

  1. Collects a snapshot of sys.dm_exec_query_stats. This snapshot is collected at the frequency equal to the collection set upload frequency. (By default, this is 15 minutes.)

  2. Retrieves the most recent snapshot (from 15 minutes earlier) for comparison with the new snapshot. The most recent snapshot is cached locally and does not have to be retrieved from the management data warehouse.

  3. Selects the top three queries from each snapshot by using the following metrics:

    • Elapsed time

    • Worker time

    • Logical reads

    • Logical writes

    • Physical reads

    • Execution count

    This process provides 6 x 3 sql_handles and plan_handles.

  4. Identifies the unique sql_handles and plan_handles.

  5. Intersects this result with the sql_handles and plan_handles that are stored in the data warehouse.

    For new sql_handles and plan_handles, obtains the plan and text from the server. If the plan or text cannot be found (it may have been already been removed from the local cache), stores the handles in the management data warehouse.

  6. For each sql_handle text that is gathered, normalizes the text (for example, removes parameters and literals) and calculates the unique hash value for the normalized text. Stores the normalized text, the hash value, and mapping to the original sql_handle in the management data warehouse.

The following table shows the query that is used to obtain the snapshots, and to analyze and then upload the data to the management data warehouse. This query is defined in the QueryActivityUpload.dtsx SSIS package.

Query

SET NOCOUNT ON
DECLARE @p1 datetime
SET @p1 = GETDATE()
SELECT 
    [sql_handle],
    statement_start_offset,
    statement_end_offset,
    -- Use ISNULL here and in other columns to handle in-progress queries that are not yet in sys.dm_exec_query_stats.  
    -- These values only come from sys.dm_exec_query_stats. If the plan does not show up in sys.dm_exec_query_stats 
    -- (first execution of a still-in-progress query, visible in sys.dm_exec_requests), these values will be NULL. 
    MAX (plan_generation_num) AS plan_generation_num,
    plan_handle,
    MIN (creation_time) AS creation_time, 
    MAX (last_execution_time) AS last_execution_time,
    SUM (execution_count) AS execution_count,
    SUM (total_worker_time) AS total_worker_time,
    MIN (min_worker_time) AS min_worker_time,           -- NULLable
    MAX (max_worker_time) AS max_worker_time,
    SUM (total_physical_reads) AS total_physical_reads,
    MIN (min_physical_reads) AS min_physical_reads,     -- NULLable
    MAX (max_physical_reads) AS max_physical_reads,
    SUM (total_logical_writes) AS total_logical_writes,
    MIN (min_logical_writes) AS min_logical_writes,     -- NULLable
    MAX (max_logical_writes) AS max_logical_writes,
    SUM (total_logical_reads) AS total_logical_reads,
    MIN (min_logical_reads) AS min_logical_reads,       -- NULLable
    MAX (max_logical_reads) AS max_logical_reads,
    SUM (total_clr_time) AS total_clr_time,
    MIN (min_clr_time) AS min_clr_time,                 -- NULLable
    MAX (max_clr_time) AS max_clr_time,
    SUM (total_elapsed_time) AS total_elapsed_time,
    MIN (min_elapsed_time) AS min_elapsed_time,         -- NULLable
    MAX (max_elapsed_time) AS max_elapsed_time,
    @p1 AS collection_time
FROM
(
    SELECT  
        [sql_handle],
        statement_start_offset,
        statement_end_offset,
        plan_generation_num,
        plan_handle,
        creation_time,
        last_execution_time,
        execution_count,
        total_worker_time,
        min_worker_time,
        max_worker_time,
        total_physical_reads,
        min_physical_reads,
        max_physical_reads,
        total_logical_writes,
        min_logical_writes,
        max_logical_writes,
        total_logical_reads,
        min_logical_reads,
        max_logical_reads,
        total_clr_time,
        min_clr_time,
        max_clr_time,
        total_elapsed_time,
        min_elapsed_time,
        max_elapsed_time 
    FROM sys.dm_exec_query_stats AS q
    UNION ALL 
    SELECT 
        r.[sql_handle],
        r.statement_start_offset,
        r.statement_end_offset,
        ISNULL (qs.plan_generation_num, 0) AS plan_generation_num,
        r.plan_handle,
        ISNULL (qs.creation_time, r.start_time) AS creation_time,
        r.start_time AS last_execution_time,
        1 AS execution_count,
        r.cpu_time AS total_worker_time,
        qs.min_worker_time,     -- min should not be influenced by in-progress queries
        r.cpu_time AS max_worker_time,
        r.reads AS total_physical_reads,
        qs.min_physical_reads,  -- min should not be influenced by in-progress queries
        r.reads AS max_physical_reads,
        r.writes AS total_logical_writes,
        qs.min_logical_writes,  -- min should not be influenced by in-progress queries
        r.writes AS max_logical_writes,
        r.logical_reads AS total_logical_reads,
        qs.min_logical_reads,   -- min should not be influenced by in-progress queries
        r.logical_reads AS max_logical_reads,
        qs.total_clr_time,      -- CLR time is not available in dm_exec_requests
        qs.min_clr_time,        -- CLR time is not available in dm_exec_requests
        qs.max_clr_time,        -- CLR time is not available in dm_exec_requests
        r.total_elapsed_time AS total_elapsed_time,
        qs.min_elapsed_time,    -- min should not be influenced by in-progress queries
        r.total_elapsed_time AS max_elapsed_time
    FROM sys.dm_exec_requests AS r 
    LEFT OUTER JOIN sys.dm_exec_query_stats AS qs ON r.plan_handle = qs.plan_handle AND r.statement_start_offset = qs.statement_start_offset 
        AND r.statement_end_offset = qs.statement_end_offset 
    WHERE r.sql_handle IS NOT NULL 
) AS query_stats 
OUTER APPLY sys.dm_exec_sql_text (sql_handle) AS sql
GROUP BY [sql_handle], plan_handle, statement_start_offset, statement_end_offset 
ORDER BY [sql_handle], plan_handle, statement_start_offset, statement_end_offset

Query output

snapshots.query_stats, snapshots.notable_query_text, and snapshots.notable_query_plan