SSIS best practices - Optimum resource utilization

PDF
SQL Server Integration Services is designed to process large amounts of data row by row in memory with high speed. Because of this, it is important to understand resource utilization, i.e., the CPU, memory, I/O, and network utilization of your packages.

CPU Bound

Seek to understand how much CPU is being used by Integration Services and how much CPU is being used overall by SQL Server while Integration Services is running. This latter point is especially important if you have SQL Server and SSIS on the same box, because if there is a resource contention between these two, it is SQL Server that will typically win – resulting in disk spilling from Integration Services, which slows transformation speed.

The perfmon counter that is of primary interest to you is Process / % Processor Time (Total). Measure this counter for both sqlservr.exe and dtexec.exe. If SSIS is not able to drive close to 100% CPU load, this may be indicative of:
  • Application contention:For example, SQL Server is taking on more processor resources, making them unavailable to SSIS.
  • Hardware contention:A common scenario is that you have suboptimal disk I/O or not enough memory to handle the amount of data being processed.
  • Design limitation:The design of your SSIS package is not making use of parallelism, and/or the package uses too many single-threaded tasks.

Network Bound

SSIS moves data as fast as your network is able to handle it. Because of this, it is important to understand your network topology and ensure that the path between your source and target have both low latency and high throughput.
The following Network perfmon counters can help you tune your topology:
  • Network Interface / Current Bandwidth: This counter provides an estimate of current bandwidth.
  • Network Interface / Bytes Total / sec: The rate at which bytes are sent and received over each network adapter.
  • Network Interface / Transfers/sec: Tells how many network transfers per second are occurring. If it is approaching 40,000 IOPs, then get another NIC card and use teaming between the NIC cards.
These counters enable you to analyze how close you are to the maximum bandwidth of the system. Understanding this will allow you to plan capacity appropriately whether by using gigabit network adapters, increasing the number of NIC cards per server, or creating separate network addresses specifically for ETL traffic.

I/O Bound

If you ensure that Integration Services is minimally writing to disk, SSIS will only hit the disk when it reads from the source and writes to the target. But if your I/O is slow, reading and especially writing can create a bottleneck.

Because tuning I/O is outside the scope of this technical note, please refer to Predeployment I/O Best Practices.  Remember that an I/O system is not only specified by its size ( “I need 10 TB”) – but also by its sustainable speed (“I want 20,000 IOPs”).

Memory bound

A very important question that you need to answer when using Integration Services is: “How much memory does my package use?”

The key counters for Integration Services and SQL Server are:
  • Process / Private Bytes (DTEXEC.exe) – The amount of memory currently in use by Integration Services. This memory cannot be shared with other processes.
  • Process / Working Set (DTEXEC.exe) – The total amount of allocated memory by Integration Services.
  • SQL Server: Memory Manager / Total Server Memory: The total amount of memory allocated by SQL Server. Because SQL Server has another way to allocate memory using the AWE API, this counter is the best indicator of total memory used by SQL Server. To understand SQL Server memory allocations better, refer to Slava Ok’s Weblog.
  • Memory / Page Reads / sec – Represents to total memory pressure on the system. If this consistently goes above 500, the system is under memory pressure.
 
Home SSIS Tutorial SSIS best practices - Optimum resource utilization