Satish MSBI World: March 2015

Monday, 30 March 2015

Advanced SQL Server Query Tuning Concepts with example

SQL statements can be used to retrieve data from the any database. To get same results we need to write different SQL queries. For better performance we need to use best, faster and efficient queries. So we need SQL query tuning based on the business and user requirements.
Microsoft SQL Server Performs different query tuning operations using in-memory sort & join technology. A join operator is a type of an algorithm which the SQL Server chooses in order to implement logical joins between two sets of data.

It depends on different scenarios based on the
Requested query (It performs what type of operation like intersect, union ... etc)
Available indexes (Cluster and Non Cluster indexes in columns of table)
Statistics information (Sorted order on the join column)
Number of estimated rows in each data set.

There are Three major types of join operations in SQL Server:
• Nested Loops (loop) – If one dataset is small(fewer than 10 rows) and the other dataset is fairly large and indexed on its join columns, an index nested loops join is the fastest join operation because they require the least I/O and the fewest comparisons.
• Merge Join – If the two datasets are not small but are stored on their join column, a merge join is the fastest join operation.
• Hash Match – If both datasets are large, unsorted, nonindexed inputs. This join allows reduction in the use of denormalization. (Denormalization means achieve better performance by reducing join operations)
Briefly examples the above joins operations as follows in different cases

Example - 1:
Let’s create two tables with large data

create table Dataset1 (id int identity ,name varchar(50))

declare @i int

set @i=0

while (@i<200)

begin

insert into Dataset1 (name)

select name from master.dbo.spt_values

set @i=@i+1

end

--select COUNT(*) from dbo.Dataset1 --501600

create table Dataset2 (id int identity ,name varchar(50))

declare @i int

set @i=0

while (@i<200)

begin

insert into Dataset2 (name)

select name from master.dbo.spt_values

set @i=@i+1

end

-- select COUNT(*) from dbo.Dataset2 –501600

select * from dbo.Dataset1 a join dbo.Dataset2 b on (a.id=b.id)

Case1: Without Index

Case2: With Index
Let’s create a clustered index

create unique clustered index cx_Dataset1 on Dataset1 (id)

create unique clustered index cx_Dataset2 on Dataset2 (id)

Note: If either of the table has indexed it goes hash join.

Example - 2:
Let’s create two tables with medium data

create table Dataset3 (id int identity ,name varchar(50))

declare @i int

set @i=0

while (@i<1)

begin

insert into Dataset3 (name)

select name from master.dbo.spt_values

set @i=@i+1

end

--select COUNT(*) from dbo.Dataset3 --2508

create table Dataset4 (id int identity ,name varchar(50))

declare @i int

set @i=0

while (@i<1)

begin

insert into Dataset4 (name)

select name from master.dbo.spt_values

set @i=@i+1

end

-- select COUNT(*) from dbo.Dataset4 –2508

select * from dbo.Dataset3 c join dbo.Dataset2 d on (c.id=d.id)

Case1: Without Index

Case2: With Index
Let’s create a clustered index

create unique clustered index cx_Dataset3 on Dataset3 (id)

create unique clustered index cx_Dataset4 on Dataset4 (id)

Note: If either of the table has indexed it goes hash join.

Example - 3:
Let’s create two tables with small amount of data means 10 lines only.

create table Dataset5 (id int identity,name varchar(50))

insert into Dataset5 (name)

select top 10 name from master.dbo.spt_values

-- select COUNT(*) from dbo.Dataset5 --10

create table Dataset6 (id int identity,name varchar(50))

insert into Dataset6 (name)

select top 10 name from master.dbo.spt_values

-- select COUNT(*) from dbo.Dataset6 –10

select * from dbo.Dataset5 e join dbo.Dataset6 f on (e.id=f.id)

Case1: Without Index

Case2: With Index
Let’s create a clustered index

create unique clustered index cx_Dataset5 on Dataset5 (id)

create unique clustered index cx_Dataset6 on Dataset6 (id)

Note: If either of the table has indexed it goes Nested loop join.

From the above diagram:
• If both the tables have No index then the query optimizer will choose “Hash joins” internally.
• If both the tables have indexes then the query optimizer will choose “Merge (For Large Data in tables) / Nested loop (For small data in tables)” internally.
• If either of the tables have indexes then the query optimizer will choose “Merge (For medium data in tables) / Hash (For Large data in tables) / Nested loop (For small & Large Vs small tables)” internally.

Simply explain by table as follows

Table size of First Dataset	Table size of Second Dataset	With index (both)	Without index (Both)	Either of table has index
Large Data	Large Data	Merge	Hash	Hash
Medium Data	Medium Data	Merge	Hash	Merge
Small Data	Small Data	Nested Loop	Hash	Nested Loop
Large Data	Small Data	Nested Loop	Hash	Nested Loop
Large Data	Medium Data	Nested Loop	Hash	Nested Loop

Tuesday, 17 March 2015

Cross Join without Duplicate Combinations

How can u remove the Duplicate combinations by using cross join?

Generally Cross join produces the Cartesian product of the tables involved in the join operation. The size of a Cartesian product result set is the number of rows in the first table multiplied by the number of rows in the second table.

For Example : First table contains 3 rows , Second table contains 3 rows . By applies cross join operation then u get 9 rows Output.

But in this Article, I will explain how to get Combinations with out Duplications by using Cross Join.

Briefly explain as follows

Creating a Table in SQL Server

create table Teams
(id int identity (1,1),
TeamName varchar(10))

Inserting values in the Table

insert into Teams values('IND')
insert into Teams values('WI')
insert into Teams values('NZ')

The table looks like as follows

select * from Teams

To apply Crossjoin operation with Teams table

select A.TeamName,B.TeamName from Teams A cross join Teams B

Finally u got result to eliminate duplicate combination from the output of above query by using where clause.

select A.TeamName,B.TeamName from Teams A cross join Teams B
where A.TeamName>B.TeamName