Sunday, 7 April 2019

How to improve the performance of package in SSIS



In this article, I will explain how many ways to improve the performance of package in SSIS at various levels and it also provides information about features and tools that you can use to troubleshoot the performance of packages.








Performance tuning at Package level in SSIS



·       Check points:
Implement check points to have better restart ability of components in the package.

·       Disable Event Handlers:
Event handlers decrease package performance. So, unnecessary event handlers should be removed or disabled.

·       MaxConcurrentExecutables:
Increasing the number of executables will increase the parallelism of package and concurrently execute in less time.

·       MaximumErrorCount:
The default value of MaximumErrorCount is “1” means it fails for single error in the package. If you increase the error count it doesn’t fail the package until it reaches the count.


In SSIS package contains more than one task. If you add multiple tasks to the control flow, you can add the tasks by using precedence constraints.

Performance tuning at Data Flow in SSIS



·       Proper maintenance of Delay Validation:
Delay validations means True/False in between the components. True means the validations of component is delayed until the execution of other component is finished.

·        BLOB Temp Storage path:
This property is specifying at the time of working with Binary large objects such as Images, Media files etc.

·         Increase the size of Buffer:
       (See this link)

·       Engine Threads (parallel):
The default value is “10”, it will increase more threads it runs more parallel and uses more processes to finish the dataflow operations.

·       Run In Optimized Mode:
If it’s true, the dataflow avoids unnecessary transformations, conservations, operations etc.

·       Optimize event handlers. 

Performance tuning at Source level in SSIS



·         Source as Flat File, use the property “Fast parse=True”. So that the column uses faster

·         Source as Database (Table or View), Create indexes on the source table so that it retrieves the data faster. Instead of taking a table, take an SQL Query (or) SQL Command as data access mode to get the required columns and rows of data.

Performance tuning at Destination level in SSIS



·         Destination as Flat File, try to take the file local to the system.

·         Destination as Database (Table)
§  Use data access mode as “fast load” to load the data much faster.
§  Use data access mode as SQL Command to load any required rows and columns.
§  The table contains, indexes, triggers, and then loading will be slow. So we need to disable (or) drop them, later the loading is finished recreate or enable them.


Performance tuning at Transformation level in SSIS


We need to identify either the transformation is Synchronous (or) Asynchronous.

Asynchronous transformation always takes more time to run then Synchronous.

·       Synchronous Transformation:
Here the rows are processed as it is getting from the source.
Ex: Derived column, data conversion etc.

·       Asynchronous Transformation:
In this case until the required rows, the system collects and it doesn’t perform any operation.
It requires more memory and processes many rows.
These transformations either blocks partially or fully. So these transformation can also called as “Partially blocking or Full blocking” transformations.
Ex: Sort, Aggregate, Merge etc.

·       Remove unused columns from the pipeline:
Extract required data from source system like table or file. Need to avoid the tendency to pull everything available on the source. If you can follow the same method it will impact the future, it eats up network bandwidth, consumes system resource (I/O & CPU), requires extra storage, and it kills the overall performance of the ETL System.

·       Avoid unnecessary sorting in the pipeline:
Behaviour of sorting is to kills the performance and avoid unnecessary sorting can enhance the performance of the package flow.

Sometimes the source data has already been sorted before going to the next steps. Such pre-sorting can occur when the SELECT query used an ORDER BY clause or when the data was inserted into the source in sorted order. For such pre-sorted source data, you can provide a hint that the data is sorted, and thereby avoid the use of a Sort transformation to satisfy the sorting requirements of certain downstream transformations. (For example, the Merge and Merge Join transformations require sorted inputs.) 

To provide a hint that the data is sorted, you must do the following tasks:

·         Set the IsSorted property on the output of an upstream data flow component to True.
·         Specify the sort key columns on which the data is sorted.