In this article, I will explain how many
ways to improve the performance of package in SSIS at various levels and it
also provides information about features and tools that you can use to
troubleshoot the performance of packages.
Sunday, 7 April 2019
Performance tuning at Package level in SSIS
· Check points:
Implement check points to have better restart ability of
components in the package.
· Disable Event Handlers:
Event handlers decrease package performance. So,
unnecessary event handlers should be removed or disabled.
· MaxConcurrentExecutables:
Increasing the number of executables will increase the
parallelism of package and concurrently execute in less time.
· MaximumErrorCount:
The default value of MaximumErrorCount is “1” means it fails for
single error in the package. If you increase the error count it doesn’t fail
the package until it reaches the count.
In SSIS package contains more than one
task. If you add multiple tasks to the control flow, you can add the tasks by
using precedence constraints.
Performance tuning at Data Flow in SSIS
· Proper maintenance of Delay
Validation:
Delay validations means True/False in between the
components. True means the validations of component is delayed until the
execution of other component is finished.
· BLOB Temp Storage path:
This property is specifying at the time of working with
Binary large objects such as Images, Media files etc.
·
Increase the size of Buffer:
· Engine Threads (parallel):
The default value is “10”, it will increase more threads
it runs more parallel and uses more processes to finish the dataflow
operations.
· Run In Optimized Mode:
If it’s true, the dataflow avoids unnecessary
transformations, conservations, operations etc.
· Optimize event handlers.
Performance tuning at Source level in SSIS
·
Source as Flat File, use the
property “Fast parse=True”. So that the column uses faster
·
Source as Database (Table or
View), Create indexes on the source table so that it retrieves the data faster.
Instead of taking a table, take an SQL Query (or) SQL Command as data access
mode to get the required columns and rows of data.
Performance tuning at Destination level in SSIS
·
Destination as Flat File, try
to take the file local to the system.
·
Destination as Database (Table)
§ Use data access mode as “fast load” to load the data much faster.
§ Use data access mode as SQL Command to load any required rows and
columns.
§ The table contains, indexes, triggers, and then loading will be slow.
So we need to disable (or) drop them, later the loading is finished recreate or
enable them.
Performance tuning at Transformation level in SSIS
We need to identify either the
transformation is Synchronous (or) Asynchronous.
Asynchronous transformation always takes
more time to run then Synchronous.
· Synchronous Transformation:
Here the rows are processed as it is getting from the
source.
Ex: Derived column, data conversion etc.
· Asynchronous Transformation:
In this case until the required rows, the system
collects and it doesn’t perform any operation.
It requires more memory and processes many rows.
These transformations either blocks partially or fully.
So these transformation can also called as “Partially blocking or Full blocking”
transformations.
Ex: Sort, Aggregate, Merge etc.
· Remove unused columns from the pipeline:
Extract required data from source system
like table or file. Need to avoid the tendency to pull everything available on
the source. If you can follow the same method it will impact the future, it
eats up network bandwidth, consumes system resource (I/O & CPU), requires
extra storage, and it kills the overall performance of the ETL System.
· Avoid unnecessary sorting in
the pipeline:
Behaviour of sorting is to kills the
performance and avoid unnecessary sorting can enhance the performance of the
package flow.
Sometimes the source data has already been sorted before
going to the next steps. Such pre-sorting can occur when the SELECT query used
an ORDER BY clause or when the data was inserted into the source in sorted
order. For such pre-sorted source data, you can provide a hint that the data is
sorted, and thereby avoid the use of a Sort transformation to satisfy the
sorting requirements of certain downstream transformations. (For example, the
Merge and Merge Join transformations require sorted inputs.)
To
provide a hint that the data is sorted, you must do the following tasks:
·
Set the IsSorted property on
the output of an upstream data flow component to True.
·
Specify the sort key columns on
which the data is sorted.
Subscribe to:
Posts (Atom)