
6 Query Tuning Techniques - Solve 75% SQL Server Performance Problems (by Amit Bansal)




[Music]

The whole world is joining Data Platform Virtual Summit 2020, and there are some very good reasons for that. The summit runs 72 hours non-stop, round the clock, and not only the summit: even our pre-cons and post-cons run round the clock, which means the entire package covers all time zones and all countries. Whichever part of the world you are in, you can join the summit. All in all, we have more than 200 sessions to choose from and about 30 training classes to join. This is true value, because not only do you attend live, you also get the recordings. All of this is brought to you by more than 170 speakers: about 100 from the community and more than 70 from Microsoft. A special mention and thanks to our training class speakers, who have agreed that their classes can be recorded and the recordings shared with the delegates. Hats off to them; these are the world's best educators.

We have you covered: our technology tracks cover on-prem, cloud, database administration, development, advanced analytics and artificial intelligence, and this time there is a special track, Industry Solutions, where speakers share real-world experiences with you. So we have freshly baked sessions for you, and you will see that content is king for us: highly curated, highly selected sessions. All of this comes to you at an incredibly affordable price: for less than $300 you can buy the summit ticket and choose a training class for yourself. The price is so low that even if your company is not sponsoring you, you can buy the ticket on your own. So why wait? Log on to dataplatformvirtualsummit.com, evaluate the summit for yourself, and see whether it is worth your time and money; I'm sure you are going to be a winner. Get the DPS ticket today and grab the content. If you join a DPS 2020 training class, you get to attend live and you also get the class recordings to watch over the next 12 months,

as many times as you want. Let's get started. T-SQL guru Itzik Ben-Gan joins us. DevOps master Alex Yates, ETL gem Andy Leonard, and the world-renowned Bob Ward and Anna Hoffman with their popular class, the Azure SQL Workshop. Kubernetes? Look no further than Anthony Nocentino. Anupama Natarajan from Australia on machine learning for developers. No one knows SQL Server Big Data Clusters better than Ben Weissman. Another ETL gem, Benjamin Kettner. Want to dive deep into data science algorithms? Join the class by Dejan Sarka. Get real-world experience with enterprise Power BI with the help of Dr Greg Low. Another SQL community gem and Microsoft Certified Master, Edwin M Sarmiento, with his class on SQL Server and Docker containers. The best SQL Server performance tuning class comes to DPS, by Erik Darling. Niche topic: Cosmos DB by Hasan Savran. And yes, Itzik Ben-Gan does his second training class at DPS. Kelly K on Power BI administration. Then Kevin Feasel comes with data analysis in Python and SQL. Another niche topic: want to do self-service AI on Power BI Desktop? Join the class by Markus. Niche topic again: Azure SQL DB performance tuning by Martin. Want to do a deep dive into SQL Server migration? Join the class from Microsoft with the help of Mukesh and Raj. World-renowned Power BI expert Peter Myers comes to DPS for the fifth time with his popular class, Power BI Desktop Modeling. Phil Seamark does DAX in a Day, and Prashant Bhoyar brings the Microsoft Conversational AI master class. Community gem Rob Sewell with his PowerShell and SQL Server dbatools class, Siva Harinath with Dashboard in a Day, Jana Berkovich with Power BI data visualization, and another SQL Server expert, Uwe Ricken, with his unique performance class. Want to deploy SQL Server on Azure VMs? Join the class by five experts from the SQL Server team. And yes, Synapse is the buzzword in modern analytics: join the class by Warner Chaves. So much to choose from. Be a winner, join DPS today.

Good morning, good afternoon, good evening, depending on where you have joined from today. Welcome to Six Query Tuning Techniques That Will Solve 75% of Your Performance Problems; I mean, not your performance problems, your SQL Server's performance problems. My name is Amit Bansal, and I'm going to be your speaker and host for the next 75 to 90 minutes. There are six precise, very crisp demos from the real world. We have learned a lot from our customers' production servers, and I've created a few academic examples that I'm going to present to you. The motivation behind this session and its content is that when I learn about performance tuning from customers' problems, working with their large data, understanding their query tuning and performance issues, and interacting with developers and DBAs, I try to put whatever I learn into academic examples. I saw that there are common patterns and anti-patterns across customers, and those common problems are what I've taken and converted into this session and the demos you're going to see. So it's a simple motivation, and the inspiration is that many of your SQL Server deployments potentially have similar problems.

Now, what are your takeaways? You are definitely going to learn six different problems: you will know about them and you will learn how to fix them. And not only six; during this 75 to 90 minutes of interaction you're going to learn a lot more internals about SQL Server and performance tuning. Some of the problems I'm going to talk about you probably already know; you have faced them with your customers and may even have solved them. I totally appreciate that. Try to give some feedback in the chat window or in the Q&A window about when you faced them and how you solved them; it could be a learning for the audience and I'm going to call that out. But I hope there will be a lot of new stuff, new content, that you can learn in this session. All the scripts and the content will be made available to you after the event is over. We have two more deliveries on Saturday, so we're going to package all of that and send it to you. There is a common question we get about recording: we're not sure if we will be able to record the session, because the entire team, as you saw just before joining, is glued to and working quite hard behind Data Platform Virtual Summit. So I'm not sure we will get that much time, but one way or another we will try to bring out more content to you. Let's see how it goes.

The other housekeeping item is Q&A. Please keep your questions coming in; use the Q&A panel in Zoom. Some of you are watching live on YouTube; there are many of you on YouTube out there, so if there is room you can hop over into Zoom, but if you're not getting into Zoom due to the capacity limit, continue watching on YouTube and use the YouTube chat box. And if you're on the SQLMaestros live page, use the chat box there. Our moderators are constantly watching, so if you have questions, put them in those chat boxes. I promise no question will go unanswered today. Irrespective of where you're watching, use the tool to put down your questions; we'll bring them into Zoom and answer them live.

So friends, let's get started with query tuning, six query tuning techniques. I'm on Twitter, A underscore Bansal, that's my Twitter handle. I've been working with SQL Server for many years now, and the idea is quite simple: as an MVP, as someone who is a community freak so to say, whatever I learn I just tend to share with all of you. So I hope you enjoy this content. Let's get started.

The first one is very elementary: non-sargability. Sargability is a made-up term, not an official English word; it means search-argument-ability, the ability of the SQL Server engine, and more precisely the optimizer, to go into the index and have seeking capability in the index. That is what is meant by sargability. Sometimes, even if you have the right index on the table and the optimizer is using that index, there are two ways it can search the data in the index: it can either seek or scan. The problem here is that the optimizer is using the index, but instead of seeking it is scanning. Maybe the query is not written the right way, which limits the searching capability of the optimizer, and you end up scanning instead of seeking. What does this mean? Think about the index as a B-tree structure, and I think many of you know this: you have the root page, the intermediate levels and the leaf level. Scanning means going to the leaf level directly and reading all the leaf-level pages, and that may turn out to be quite expensive if this is a huge index. Really, when you create an index you want seeking capability, so you can traverse the B-tree from the root through the intermediate levels to the leaf level. Sargability is a very common issue that I see with our customers, and sometimes they're unaware of why it is happening, but it all boils down to the way the T-SQL code is written. So let's dive into the demonstration, and we'll discuss more concepts as we go along. In the first demo let's use AdventureWorks2016. Many folks think that AdventureWorks is not a good database; it's not about a good or bad database, it's about a good or bad data model. Even if I were using the world's best or the world's largest database, the learning would have been the same, so no worries that we're using AdventureWorks.

create a

index on modify date column which is in

sales order detail let's call this index

as idx sod modified date

and you know why am i creating this

index because i'm going to search

on this table i'm going to fire a select

query and i'm going to

filter on modify date so we have the

database the index is created let's turn

on statistics time and io now these two

statements

very very popular all of you know about

this time will give me time information

about my query how much time it has

taken

for the total execution the elapsed time

how much time has my query

spent on cpu that kind of information

i'll get from time and i o will give me

i o information

the physical reads the logical reads the

reader head reads

etc i use these two quite frequently

when i'm doing query tuning stuff now
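
For reference, a minimal sketch of the setup just described (the exact demo script isn't shown in the session, so the index options are assumed; the index name comes from the narration):

    USE AdventureWorks2016;
    GO
    -- Index on the column we are going to filter on
    CREATE NONCLUSTERED INDEX idx_sod_modifieddate
        ON Sales.SalesOrderDetail (ModifiedDate);
    GO
    -- Show elapsed/CPU time and logical, physical and read-ahead reads per statement
    SET STATISTICS TIME ON;
    SET STATISTICS IO ON;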

Now let's get down to the query itself. The query is right in front of you; it's a very simple query where you're fetching a few columns from SalesOrderHeader, joining another table called SalesOrderDetail, and then filtering on the ModifiedDate column, the same column on which you have created the index. When you look at this query, I'm sure some of you are trying to figure out where the problem is, and many of you know the problem is right in the WHERE clause. The problem is that on the left-hand side of the equality operator you're doing some math on the ModifiedDate column, and on the right-hand side you have an expression. The issue with sargability here is that the capability to seek on the index gets lost; the optimizer loses it, because when you have an attribute on the left-hand side of the operator and you apply T-SQL functions to it, like CONVERT or date/time functions or string functions, it is no longer an attribute, it becomes an expression. The overall predicate is then expression A compared with expression B, and at that point, because it is an expression, you will not be able to seek on the indexed column ModifiedDate.
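
A sketch of the non-sargable pattern being described (the exact query text, function and date literal aren't shown in the transcript; the point is the function wrapped around the indexed column):

    SELECT soh.SalesOrderID, soh.OrderDate, sod.ModifiedDate
    FROM Sales.SalesOrderHeader AS soh
    JOIN Sales.SalesOrderDetail AS sod
        ON sod.SalesOrderID = soh.SalesOrderID
    WHERE CONVERT(date, sod.ModifiedDate) = '2014-06-30';  -- function on the column: expression = expression, no seek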

So let me select this query, turn on the actual execution plan, and execute. The query runs and you do not get any data; it's not compulsory to get data, you can always get an empty result set. But let's go to the execution plan, and the first thing that will catch your attention is that instead of seeking, the optimizer is scanning. Take the cursor over the Index Scan operator and you can see this is the same object we created, idx_sod_modifieddate. This is what we call the optimizer's access method: how are you accessing the data, are you seeking, are you scanning, and so on. So a scan is happening here. If you go to the Messages tab, in terms of I/O you can see SalesOrderDetail has 337 logical reads. Not a big number, but remember this is just a demo; take the same concept to the real world and you will definitely see big numbers. If you look at the execution time, the overall elapsed time has been just slightly over half a second, about 600 milliseconds, and the query spent about 78 milliseconds on the CPU. So you know this is a typical sargability issue; the query is non-sargable, so to say, and you've got to fix it. So how can you fix this?

Well, you can rewrite the query with something like this: WHERE ModifiedDate is greater than or equal to one boundary AND the same column is less than (or equal to) the other boundary. What you have to make sure, when you rewrite a T-SQL query to fix a sargability issue, is that both queries, the previous version and the revised version, are logically equivalent, and in this case they are. Let's go and execute this and see what happens.
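
A sketch of the logically equivalent, sargable rewrite (same hypothetical date as above; the sketch uses a half-open range, and the column is left untouched on the left-hand side):

    SELECT soh.SalesOrderID, soh.OrderDate, sod.ModifiedDate
    FROM Sales.SalesOrderHeader AS soh
    JOIN Sales.SalesOrderDetail AS sod
        ON sod.SalesOrderID = soh.SalesOrderID
    WHERE sod.ModifiedDate >= '20140630'   -- bare column on the left, so the optimizer can seek
      AND sod.ModifiedDate <  '20140701';  -- range covers the same day as the CONVERT version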

First, let's go to the Messages tab. What you will see is that the logical reads have come down from 337 to 3, and I can guess why: a seek is happening now, so simply three reads, the root page, one page at the intermediate level and one page at the leaf level, maybe. If you look at the total elapsed time, remember it was around 600 milliseconds, and now it has been cut down about three times, so now you have around 200 milliseconds and very negligible time on the CPU, because a lot of I/O has been cut. Remember, I/O drives CPU: if you do a lot of I/O, you will see CPU spiking up. In this particular example we fixed it and there's hardly any CPU usage. Let's go to the execution plan and verify: if you look at the index now, we are seeking, that is an Index Seek happening, and you are seeking on the object idx_sod_modifieddate.

Now, this is a very basic example of sargability. To summarize, the problem is that you have the index but you're not seeking on it; instead the optimizer ends up scanning. You've got to go and see whether the query makes sense, and if there is something you can do with the query, rewrite it to ensure that you are seeking. Of course, SQL Server helps you here: there are DMVs, like the index usage stats DMV, in combination with a couple of other helper DMVs, that you can use to find out what's happening with your index usage: are you seeking, are you scanning, that kind of information. The session time does not allow me to go into the depth of so many different aspects and artifacts of SQL Server, but I'm calling it out: those are the helper DMVs you can use to find out, okay, I have all these indexes, how are they being used, is SQL Server seeking or scanning on them. Those numbers and statistics you can get.
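
For example, a helper query along these lines (a sketch, not the exact script from the session) joins sys.dm_db_index_usage_stats with sys.indexes to show seeks versus scans per index in the current database:

    SELECT OBJECT_NAME(ius.object_id) AS table_name,
           i.name                     AS index_name,
           ius.user_seeks,
           ius.user_scans,
           ius.user_lookups
    FROM sys.dm_db_index_usage_stats AS ius
    JOIN sys.indexes AS i
         ON i.object_id = ius.object_id
        AND i.index_id  = ius.index_id
    WHERE ius.database_id = DB_ID();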

Now, sargability is not just this, friends; it's a wide concept, a lot has been spoken about it, and there are plenty of resources available on the internet. Simple things: let's say you have an index on LastName, and you say SELECT * FROM the table WHERE LastName = 'Bansal'. You may expect the optimizer to seek on the index because you put the literal 'Bansal'. But suppose you put a wildcard character at the start of your literal, so you say LIKE '%Bansal'. If you do that, SQL Server has no way to start searching the data, because of the leading wildcard, and again SQL Server will scan instead of seek. These are the little things that developers and DBAs miss, but that's the whole idea of keeping these common problems in mind. 'Bansal%' would not be a problem, because the wildcard character comes at the end of the string. So with the index key column and the attribute values being searched on, make sure there are no wildcards preceding the literal. There can be many such examples of sargability.
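
To illustrate, assuming an index that leads on LastName (Person.Person in AdventureWorks has one):

    -- Leading wildcard: not sargable, the optimizer scans
    SELECT * FROM Person.Person WHERE LastName LIKE N'%Bansal';

    -- Trailing wildcard: still sargable, the optimizer can seek on the index
    SELECT * FROM Person.Person WHERE LastName LIKE N'Bansal%';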

So this is the first common problem I see generally across many SQL Server deployments, and it's a common thing. You know that indexes are quite powerful; in most cases indexes help you improve the performance of your queries. But remember, an index is not a magic bullet: just by the mere creation of indexes you don't improve query performance. Indexes are searching mechanisms; they help you get to the data faster, they don't just automatically improve performance. And this particular case could be even worse: you have the index, you created a huge index, you are spending resources on maintaining it, indexes can be hundreds of GBs in size depending on how large the table is and how many rows you have, and you are not seeking. Look at the resource consumption there. Well, that was the first demo; let's close it, hope it was useful, and let's move on to the second demo.

Now, the second demo. I just didn't know how to name the second problem that I see, so I call it what I hear from customers: "Hey Amit, seek is not happening." In the previous example I showed you non-sargability, where again seek was not happening, so don't confuse the first demo with this second one. In the second demo I am actually not showing you sargability; I am showing you another example, another common pattern, another reason why you have the right index in place but SQL Server is not seeking on it, and you're not able to figure out what's going wrong. Again, let's use AdventureWorks2016 and turn on the actual execution plan. Look at this very simple query: SELECT * FROM Person.Person WHERE FirstName = 'Ken'. I try to keep queries very simple in sessions like these, so that you don't spend time deciphering what the query construct is; the concepts and the learnings are more important. When I select this query and execute, you get some data, which is good. If you go to the execution plan, you can see that again the index is being scanned.
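
The demo 2 query, as described in the narration:

    USE AdventureWorks2016;
    GO
    -- Filters only on FirstName; the index that gets used is keyed on (LastName, FirstName, MiddleName)
    SELECT * FROM Person.Person WHERE FirstName = N'Ken';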

Now let's say you know there is an index which uses FirstName, but you are wondering why the optimizer is not seeking. Let's try to figure this out: I can see the index is being used, but it is being scanned. Take the cursor over the iterator, zoom in, scroll down and look at the object name. It says IX_Person_LastName_FirstName_MiddleName; great, the naming convention was well followed. If you look, you will see that the index is created on three columns, LastName, FirstName and MiddleName, and the order of columns is probably exactly what's in the object name. You can go and verify the index definition in Object Explorer or with DMVs, but let's trust AdventureWorks for the moment. So you have this index on LastName, FirstName, MiddleName. You know, friends, creating a single-column index and troubleshooting single-column indexes is relatively easy; with multi-column indexes things get a little trickier, and today I will talk about a few things related to multi-column indexes.

Now you have this index, and let's say you are the developer or DBA, the SQL Server practitioner writing all these queries, and you are trying to figure out why seek is not happening. Here is the deal: your query is only trying to search on FirstName; your query construct does not include LastName or MiddleName. And the rule of the optimizer is that when you create a multi-column index with column one, column two and column three, seeking happens from left to right. If you are not seeking on column one, you cannot seek on column two; if you are not seeking on columns one and two, you cannot seek on column three. This is the rule of the multi-column index with the optimizer.
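
A sketch of how that rule plays out against IX_Person_LastName_FirstName_MiddleName, keyed on (LastName, FirstName, MiddleName); the literals are illustrative, and whether the optimizer actually picks the seek also depends on how many rows qualify:

    -- Leading key column in the predicate: an index seek is possible
    SELECT * FROM Person.Person WHERE LastName = N'Sanchez';

    -- Leading and second key columns: a seek is possible
    SELECT * FROM Person.Person WHERE LastName = N'Sanchez' AND FirstName = N'Ken';

    -- Second key column only: no seek, the whole leaf level is scanned
    SELECT * FROM Person.Person WHERE FirstName = N'Ken';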

Always keep that in mind. When I started talking about this to some customers, saying you have this query and you have put FirstName into it, but actually the index is on LastName, then FirstName, then MiddleName, and LastName is not there, the customer thinks: okay, so shall we include LastName? Well, there are different ways to look at the problem and there are different solutions. You may create an index on FirstName; that is possibility one. You may modify the query, if the report permits. If the data set is small, the client application can discard LastName and just take FirstName, if that is the need. But the whole idea is: the order of columns in your SELECT query does not matter. This is a misconception many SQL folks have, that if I change the column order in my SELECT list I will somehow end up seeking on the index. Not really. Irrespective of the order of columns in your SELECT list, FirstName, MiddleName, LastName or LastName, MiddleName, FirstName, it really doesn't matter; what matters is only whether your query references those columns or not. But friends, the order of columns in the index does matter; that is very, very critical.

So when you are designing indexes, when you're thinking about indexing strategies, please keep this in mind, because otherwise you may end up creating more and more indexes: an index on FirstName, another index on MiddleName, another on FirstName, MiddleName and LastName, and all these permutations and combinations. That is only going to go from bad to worse, because you will end up creating too many redundant indexes. Try to create indexes that satisfy not just one query but probably multiple sets of queries, and there will be places where you may have to compromise a bit. Now, in this particular case, as I said, there is no single solution; it really depends on the workload, on the querying pattern, on what other indexes are available, on what other queries are running against this Person table and on how the filtering is happening. But the learning here is that seeking is not happening because of the order of columns, and the rule is: I am not seeking on LastName, so I cannot seek on FirstName, and SQL Server has to go to the leaf level, scan the entire data, and check every row for FirstName = 'Ken'. That is the problem. So let's go and fix this; now, by "fix" I mean I'm going to show you how this works.

If you see, I've now included both LastName and FirstName, and if I select this, execute, and go to the execution plan, you can see that now I am seeking; it's the same object, but instead of scanning I am seeking. Now, of course, you might think: well, my query is not logically the same, because now I have another filter. You're absolutely right, the query is not the same; that's why I said I may not have the solution for how you may want to rewrite the query or redesign your indexes. Is LastName really being used to search? If not, why not just modify the index, I mean drop and recreate it, and put FirstName as the first column and LastName as the second or third column, or whatever; probably that can do the job, isn't it? And if that's the case it will work, because, and I will show this to you, let me just change this and prove the point: if the search were only this much, WHERE LastName = 'Sanchez', remember I am now seeking on LastName, and LastName is the first column. When it was FirstName it wasn't seeking, it was scanning; what about now? If you execute this and go to the execution plan, yes, you are seeking, and this is what you wanted. So this proves the point that indexes follow a left-based searching mechanism, a leftmost subset: you start from the leftmost column and go all the way to the rightmost column.

And great, I remember that I can show you the wildcard character thing here. What if I put a percentage sign at the front? This is what I was trying to explain in the first demo: it makes the query non-sargable. Remember, this was seeking; now I put the wildcard character, and instead of equals let's put LIKE there, and execute. You will see in the execution plan (let me zoom out again because the tooltip is coming up) that you are scanning again. Why? Because there is a leading wildcard and the seeking capability is lost; this is again a non-sargable query. And when I was doing this I got feedback from some attendees: will LIKE really work with a seek? When you use LIKE, will the optimizer seek? Yes, it will. It's not seeking right now because of the leading wildcard character, which makes it non-sargable. So what if I remove the percentage sign from the front and put it at the end? Now, searching from the left, you have the characters, so it will seek. Let's go and execute, and you can see that we are seeking. All great. And what if there is no wildcard character at all, just LIKE 'Sanchez'? There's no equality operator; will it seek? Let's go to the execution plan: yes, you are seeking.

Great. Now, just to summarize: the second demo was not the same as the first demo; I just took the liberty in the second demo to expand more on sargability, which I explained in the first demo. The first and second demos were distinctly different, but the problem was common: you have the right index in place, but seek is not happening, it's scanning, and how can you fix that? So I just took up these two common examples that I see.

All right friends, let's go on to the next demo. Hope you're enjoying it; can I get some responses in the chat window, are you glued to the demos, are you enjoying them? And if we have some room in Zoom, maybe... no, not much, I guess, so maybe people on YouTube are continuing to watch there. If you have questions, please keep posting them in the YouTube chat window; my friend and colleague Satya is taking down all the questions from YouTube, from Facebook and from Zoom, and we will answer them at the end of the session. And I will tell you, friends, after the session is over, stick around for the Q&A, because you learn a lot from questions from your peers; different people work in different backgrounds, different deployments, different challenges, and you learn a lot from the questions. I learn a lot from your questions. So stick around for the Q&A. Okay.

Time to move on to the third demo. I'm just looking at the time; okay, we're doing well. Let's move on to the next one: implicit conversion, another common problem that is around in SQL deployments. I will assume at this point many of you really know about implicit conversion, but this is not going to be just a very simple demo where I show you a query where implicit conversion is happening and then modify the query so it doesn't happen. No, I'm going to take you to two further levels here. In the first example I will just show you a basic implicit conversion so that all of us are on the same page; then I will step it up a bit and show you something else with implicit conversion; and then I will show you a third part in the same demo, on how you can fix implicit conversion in better ways, or how you can do things in better ways. So let's get started.

We're using AdventureWorks2016 again, and here is a very simple query involving a varchar-to-nvarchar conversion. The query is quite simple: you're joining two tables and fetching some columns. The problem is here: WHERE AccountNumber equals some literal, the constant you have specified. Now, you have prefixed this literal with N, which means you're telling SQL Server that this is Unicode data, so this is nvarchar on the right-hand side of the equality operator. And AccountNumber, what's its data type? If you take the cursor over AccountNumber, the tooltip shows you that it is varchar. So what is SQL Server going to do when you send this query? It's going to figure out the best way to do this conversion and the possible data type conversions, so SQL Server is going to spend some time on it; that time could be very negligible, but it's still going to be something. SQL Server has to work out the conversion, from which data type to which data type, and that's going to happen.
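
A sketch of the pattern being described (the actual demo joins two tables and fetches a few columns; the tables, columns and literal here are assumptions chosen so that a varchar column is compared with an N-prefixed nvarchar literal):

    SELECT c.CustomerID, soh.SalesOrderID, soh.TotalDue
    FROM Sales.Customer AS c
    JOIN Sales.SalesOrderHeader AS soh
        ON soh.CustomerID = c.CustomerID
    WHERE c.AccountNumber = N'AW00029594';  -- nvarchar literal vs varchar column: CONVERT_IMPLICIT on the column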

So let's turn on the actual execution plan, and let's also turn on SET STATISTICS TIME to see how much time SQL Server spends doing that conversion, and now let's execute. When we execute, we get the data, great. Let's go to the execution plan: you have a simple plan in front of you, and I'm sure you are looking at the SELECT operator there; there is a warning symbol on it, an exclamation mark, which means SQL Server is trying to tell you something. If you take the cursor over the SELECT operator and zoom in on the tooltip, you can see the warning. It says that the type conversion in the expression, CONVERT_IMPLICIT(...), may affect "SeekPlan" in the query plan choice. Great, I'm quite happy that SQL Server gives you all these warnings in Management Studio in the execution plans, and you can go and take a look at them. Now, I already told you implicit conversion would happen, and yes, it is happening, and SQL Server is telling you that because of this implicit conversion the seek plan may be affected when the optimizer is computing the different permutations for the query plan. If you go to the Messages tab and look at the time factor, you can see the total elapsed time was just about half a second, 481 milliseconds, but the most interesting thing to note is that the query spent about 235 milliseconds on the CPU. Just keep this in mind.

Now let's go and fix this. Remember, friends, this is a very elementary demo, very basic, and I'm sure all of you know this. I'm going to fix it by simply removing the N prefix, ensuring that on the right-hand side of the equality operator I have a varchar literal. Now it is varchar to varchar, so there will be no conversion. Let's execute, and you'll see that in the execution plan there is no more warning symbol; a neat plan, absolutely no issues. Let's go to the Messages tab and see how much time the execution engine has spent: the elapsed time gets cut roughly in half, so now you get 254 milliseconds, and there is negligible time spent on CPU, zero milliseconds. So you can see that by not having the implicit conversion happen, you are saving CPU time.
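
The fix is just to match the literal's type to the column (a sketch on the same assumed query as above):

    SELECT c.CustomerID, soh.SalesOrderID, soh.TotalDue
    FROM Sales.Customer AS c
    JOIN Sales.SalesOrderHeader AS soh
        ON soh.CustomerID = c.CustomerID
    WHERE c.AccountNumber = 'AW00029594';   -- plain varchar literal: no implicit conversion on the column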

Now comes the interesting question. When I was showing this to customers, people came back with questions like: does it really matter, saving this much time in SQL Server, does every millisecond matter? My answer is that when you look at the query as a singleton, a single execution, and you see an improvement of just a few milliseconds, it might not be very motivating or encouraging. But look at the larger picture: all of you work with deployed SQL Servers with hundreds of users connected, and those users are firing all sorts of ad hoc queries, OLTP queries, very short-lived queries like this one (this is a very classic OLTP query), at SQL Server all the time. If you look at all of that in totality, and you fix a little thing like this, think about the magnitude of time, the CPU time, that can be saved. So, just to simulate that, why not put the implicit conversion back: I'm putting the N prefix back, and let's fire this query, say, 40 times. I'm saying GO 40, which tells this entire batch to go and hit SQL Server, come back, and do that 40 times: 40 round trips, 40 executions, and let's record the total time.
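
In SSMS the repetition is just the batch separator with a count, something along these lines (same assumed query as above; GO <n> is an SSMS/sqlcmd batch-repeat feature, not T-SQL):

    SELECT c.CustomerID, soh.SalesOrderID, soh.TotalDue
    FROM Sales.Customer AS c
    JOIN Sales.SalesOrderHeader AS soh
        ON soh.CustomerID = c.CustomerID
    WHERE c.AccountNumber = N'AW00029594';  -- implicit conversion put back on purpose
    GO 40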

While this is running, I'm not going to turn on the actual execution plan; I'm turning it off, because otherwise I'd get a plan for every execution, and that's not required now. We only want to see how much time the 40 executions take. So let's execute: 40 executions happen one after the other, going and hitting the engine and coming back. The good thing right now is that the client and the server are on the same box, in the same VM, so things are going to be faster; in the real world clients are spread anywhere across the globe, hitting the server over the wire. Now, if you look at the total execution here, you can see this has taken about 17 seconds; those 40 iterations took 17 seconds to execute. Now let's do something: remove the N prefix, make sure there's no implicit conversion happening, and execute the same thing. This is just a simulation, running in iteration, and if you multiply this by 100 users, look at the magnitude. When you run this, how much time does it take? Earlier it took 17 seconds; now the total execution takes just seven seconds. So if 200 milliseconds was not convincing enough, hopefully 10 seconds is. Multiply those 10 seconds by 100 users and you have 1,000 seconds, a lot of minutes, and so on. So yes, every millisecond does matter; look at the overall trade-off in the larger picture.

So the very basic demo is done; I'm assuming many of you already knew about this. Now let's move on to the next level of this demo. The first part just showed you the implicit conversion, the second showed the bigger magnitude of it, and the third may answer some of your questions: "Hey Amit, if I really do have conversions that need to happen, then what?" Because there could be a real need to do the conversion. Yes: why not do an explicit conversion and tell SQL Server what you really want to do? So I'm telling SQL Server: my AccountNumber is varchar, I want to convert this string to varchar, and I'm using the CONVERT function. That may just do the trick: I'm not letting SQL Server spend time deciding among all the different permutations and combinations of data types; I'm just telling SQL Server, go ahead and do this. What will be the scene here?
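
A sketch of the explicit-conversion variant described, assuming AccountNumber is varchar(10):

    SELECT c.CustomerID, soh.SalesOrderID, soh.TotalDue
    FROM Sales.Customer AS c
    JOIN Sales.SalesOrderHeader AS soh
        ON soh.CustomerID = c.CustomerID
    WHERE c.AccountNumber = CONVERT(varchar(10), N'AW00029594');  -- convert the literal explicitly, once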

Well, let's look at this. Why not simply jump straight into running this 40 times; that will give us the answer, isn't it? We are going to do this 40 times, with the actual execution plan turned off. The idea is that this should complete in about seven to eight seconds, and if it does, we are the winners. So let's execute this 40 times with explicit conversion; now you're telling SQL Server what data type to convert from and to, and hopefully this completes in seven to eight seconds. Yay, you are the winner, just like a Data Platform Virtual Summit pass holder: you're the winner, with seven seconds of execution. So you've seen the three levels of the demo. Okay, friends: first demo done, second demo done, and third demo done; six query tuning techniques and we are halfway through.

And because I mentioned DPS, I'm a little excited to talk about it; this conference is just a month away. But I will take your permission: in the chat window, if you allow me, I want to digress for two minutes, like a small break, have some water, talk about DPS, and then do demos four, five and six. Just give me some feedback in the chat window: how is it going, are you enjoying the demos, is it worth your time coming over today? Great, thank you for the great feedback, friends.

All right, so we were talking about all these demos, and yes, DPS just came in between, so let's take two minutes. You have probably been hearing a lot about DPS; you just watched a few slides and a few videos that I recorded to encourage the community. The whole idea is very simple: apart from all the SQL work we do as a team, once a year we run the Data Platform Summit, a very large-scale event. We get the world's best trainers and speakers, from Microsoft and the community, covering Azure Data, SQL Server, Power BI, the Azure data stack, analytics and artificial intelligence. For the last five years this conference was held in Bangalore, and for the obvious reasons of the COVID pandemic, when everyone goes virtual, we took the conference virtual, with only one thought: how can we make this conference more inclusive, and how can we reach different, diverse cultures and nations? And guess what, we decided to run the conference non-stop, around the clock, which means the conference actually runs for 72 hours, three days and three nights, so wherever you are, in whichever part of the world, you can join. We wanted to enable this for the community in collaboration with Microsoft, and we're very thankful that Microsoft supported us. Not only Microsoft: sincere thanks to all the MVPs, my MVP colleagues and friends from all over the world, who came and heavily supported this. And guess what, 170 speakers, maybe 180 as I speak to you, came on board with fresh, unique, highly curated content. Out of those speakers, 60 are from Microsoft, from the product groups for your favorite SQL Server, BI, Azure data stack and AI, and they're coming with so many sessions; the MVPs are bringing real-world experience and great content. In total we are delivering 200 sessions across five parallel tracks in those 72 hours, and the best thing is that our delegates get the recordings for all of them. It's quite an effort on our side to execute this conference, get all the recordings done, edit them, do the post-event processing, put them up on a streaming platform and give you on-demand access to the content. The community is loving it: as I speak to you today, there are registrations from 32 countries, which is very motivating.

Very encouraging. Not only this: we thought, let's go one step beyond, which no conference is doing. Our training classes, pre-cons and post-cons, are being recorded, and I want to take this opportunity to thank all our training class speakers, who of course agreed that the recordings could be shared with the delegates. So if you take up any training class, depending on which time zone you are in (remember, the training classes, pre-cons and post-cons, also run around the clock), you can choose eight-hour classes divided into four hours each day, so you can balance your learning with your work and personal life. That's another great thing about the virtual world: you can really customize this. So if you sign up for any training class, you will get the recordings of that class. That's quite a big deal; I would say true value for every community member across the globe. And of course the third thing: even when I sign up for a conference, I definitely look at the price factor, because I'm going to spend my money or my employer's money, so is it worth my time, is it really worth it? If you look at the pricing today, probably for less than $300 you can get a training class and go to the summit, and you get the recorded content for all of it for the next 12 months, to watch as many times as you want.

So, total true value, and I will encourage you, your colleagues and your friends to come over to Data Platform Virtual Summit and join. You can sign up for multiple training classes, because even if you're not able to attend a specific class live, you still get the recordings of it. I'm quite proud of our team, of Microsoft, and of all the MVPs who have come forward and supported this global initiative to take this to so many countries. And let me call this out: if you are from a low-income country, we have special packages designed to make it inclusive. So, any country in the European nations, the Middle East, Africa or the APAC region, or if you are from India and your employer is not sponsoring you (I know how Indian companies work, so no offense, but that's the reality: there is a pandemic, and there has been a step back due to COVID, cost cutting and so on), just drop an email to contact at dps10.com; I'm going to ask my colleague Sathya to put that email address down. Just let us know that your company is not sponsoring you but you still want to come on board DPS, that you want to sign up for a training class or the conference, and we'll figure it out and do something, because it is a community conference; we don't want anyone to be left out. The whole intention of launching this in the virtual world, making it more inclusive and running around the clock, days and nights, was to make sure that everyone can join. The whole world is invited; we want to work with that motto.

And friends, for all of you specifically, don't distribute this too widely: because you joined today's SQLMaestros live training, there is a special 55 percent discount code, SQLMAESTROS. We have limited seats; probably until Monday or Tuesday or Wednesday this discount code will be there, so if you really feel encouraged, if you see value, just go and sign up and use the discount code SQLMAESTROS, which will give you slightly more, five percent more, than what is available online. And friends, take a look at our training classes, real gems; I don't want to take just one name, because all of them are masters, and probably just before joining you saw that video. You can go online to dataplatformvirtualsummit.com and explore the content yourself. Please explore the content; content is king, that's my backdrop which I'm donning today. So yes, focus on the content and grab it; that's the whole idea.

Okay, with this, let's move on to the next demo, which is dynamic SQL and the plan cache. This is also a common thing we see with customers. This is a long demo, friends; it's a demo which is like one hour, because I'm trying to show you a lot of things, and I was in two minds about whether I should include this example in today's class because of its lengthy nature, but I thought, okay, I can cut it down and show you only the crux, the key thing that you should keep in mind, and probably it will be useful. So, you know what happens: dynamic SQL is great and dynamic SQL is good, but I see that dynamic SQL has two sides to it; there is the good world of dynamic SQL and the bad world. And in the bad world there is this concept of parameterization that you really need to understand well.

So let me scroll down and jump into the query; first let me show you the query. Here is what we are trying to do in AdventureWorks: we are trying to run a query that fetches data from the HumanResources.Employee table where the manager ID is equal to something. Now look at this query: you take this query and put it into a local variable, @sqlstring, and then you execute the SQL statement dynamically using sp_executesql, a simple dynamic SQL concept. But how are you concatenating the manager ID, and where is it coming from? The manager ID comes from another local variable, @cnt, and you run this entire batch in a loop for 100 managers; let's assume there are 100 managers, and you run this query for each manager. Where is the problem? The problem is right here in this query: you have not done explicit parameterization.
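
A sketch of the "bad" pattern described; the column names and the ManagerID filter are taken from the narration and are assumptions about the exact demo script. The value is concatenated straight into the statement text, so every iteration produces a textually different query:

    DECLARE @cnt int = 1;
    DECLARE @sqlstring nvarchar(500);

    WHILE @cnt <= 100
    BEGIN
        -- the manager id is concatenated into the statement text itself
        SET @sqlstring = N'SELECT BusinessEntityID, NationalIDNumber '
                       + N'FROM HumanResources.Employee '
                       + N'WHERE ManagerID = ' + CAST(@cnt AS nvarchar(10));
        EXEC sp_executesql @sqlstring;   -- 100 textually different statements -> 100 cached plans
        SET @cnt += 1;
    END;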

The problem is that when this batch runs with 100 iterations, incrementing @cnt every time, SQL Server is going to create 100 plans, one plan for each execution of the query. Why one plan per execution? Because there is a different manager ID in each execution: one plan for manager ID 1, a second plan for manager ID 2, a third plan for manager ID 3. That's going to be bad; this is a scenario we also call plan cache bloating, because you have this ad hoc query being executed through the dynamic SQL facility, and in the plan cache you end up with 100 plans. In a better world, you could have explicitly parameterized this query so that there is only one parameterized plan, irrespective of the number of manager IDs, and that would be great.

So first let's look at the bad thing, and then we'll look at the good thing. Let's take these helper queries, which we use to troubleshoot a few things; I'm going to copy this one and take it to a new window. What am I trying to do? I'm trying to see how many plans there are in the plan cache, and then I'm going to look at the plan cache size in kilobytes. Let's execute this, and you will see that right now there are about a thousand plans in the plan cache, at about 204,000 kilobytes.
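
A sketch of such a helper query against sys.dm_exec_cached_plans (the session's exact helper script isn't shown in the transcript):

    -- How many plans are cached, and how big is the plan cache in KB
    SELECT COUNT(*)                  AS cached_plan_count,
           SUM(size_in_bytes) / 1024 AS plan_cache_size_kb
    FROM sys.dm_exec_cached_plans;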

Okay, just to make our calculations simple, why not run DBCC FREEPROCCACHE and DBCC DROPCLEANBUFFERS. Just because I'm showing them in the demo doesn't mean you should start running them on your production servers, okay, so please don't do that; this is just for the demo. I ran DBCC FREEPROCCACHE, which means I cleared the plan cache, and now you can see there are only two plans in the plan cache and the plan cache size is 192 kilobytes. Keep this number in mind. Let's go and run the bad version of the query, the one with no explicit parameterization, run it in a loop, and after the 100 executions see how many plans we have in the plan cache. While it is running, why not go back to the other window and run the helper DMVs to see the total plan cache size. There you go, the execution is done, and you can see that there are 103 plans now in the plan cache, and the size of the plan cache is about 1,800 kilobytes. I can tell you that 100 of those plans are the 100 executions of that query; that is what it is.

How do we prove that? Okay, let's go back and run this statement which I skipped, going a little deeper into query execution and the query structure. There is something called the query hash, which is the signature of the query. The way this query is interpreted inside SQL Server is: SELECT something, something, something WHERE ManagerID = 1, WHERE ManagerID = 2, and so on. Look at the structure of the query: the structure remains the same, it is only the manager ID value that changes, so the hash, the signature, is the same. So I can ask SQL Server to group all the queries in the plan cache by query hash, which is one of the attributes in the DMV sys.dm_exec_query_stats, with a HAVING COUNT greater than one, so I can group all the similar queries together, where their hash, their signature, is the same. And do I have any of those? Yes, there is one query where the number of entries is 100: there are 100 instances of this query, and the sample query text is right there, our SELECT of the employee ID and national ID number. This is our query, and there are 100 instances of it in the plan cache.
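
A sketch of that grouping, using sys.dm_exec_query_stats and the sql text function:

    -- Group cached queries by their query_hash (signature); more than one entry per hash
    -- usually means the same query shape is cached many times with different literals
    SELECT qs.query_hash,
           COUNT(*)                             AS number_of_entries,
           MIN(CAST(st.text AS nvarchar(4000))) AS sample_query_text
    FROM sys.dm_exec_query_stats AS qs
    CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) AS st
    GROUP BY qs.query_hash
    HAVING COUNT(*) > 1;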

So that proves the point. Now, what are you waiting for, what is the fix? Well, the fix is this; let me take a moment to walk through it. The first part is kind of the same: you write the query and put it in the @sqlstring local variable, but see what you're doing now: you're saying WHERE ManagerID = @ManagerID. This @ManagerID is now part of your SQL statement and goes into that local variable. But where does @ManagerID come from? This is called explicit parameterization. What I have done is take another local variable, defined here, called @ParamDefinition, nvarchar(500); all my parameters are going to be explicitly defined and put into this local variable. So if you see here, at line 91 of the script, I'm saying SET @ParamDefinition = N'@ManagerID tinyint'. Because I know there are not going to be more than 100 managers, I take the tinyint data type and define the parameter @ManagerID with it. This parameter definition goes into my local variable @ParamDefinition. What if I had multiple parameters? Very simple: I would just separate them with commas, one after another. So all parameters are explicitly defined, and this is what we call explicit parameterization. Remember that the @ManagerID in the parameter definition should match the @ManagerID in the statement, because you want them to be contextually the same.

Now let's look at the execution. When you execute, you say EXECUTE sp_executesql; this is dynamic SQL execution. Where is the statement? The statement is in @sqlstring, inside that local variable. Where are my parameter definitions coming from? From the local variable @ParamDefinition, where I've defined all of them. Now comes the last part: you have defined your SQL statement, you have supplied the parameter definition to sp_executesql, but what about the values for those parameters? Well, you put a comma (I'm just talking about the syntax here) and then you say @ManagerID = @cnt. Remember, the @ManagerID in the parameter definition and the @ManagerID in the statement are the same, and the value for @ManagerID comes from that local variable @cnt, the loop counter we keep incrementing. Which means if you had more parameters, it's the same thing, friends: just put a comma, @Something = some other variable, and keep separating them with commas to supply the whole parameter set. So this is really just syntax; the first time you see all of this, it is a little tricky. If you have been doing this already, this would be like: wow, why is Amit spending so much time on this? I'm spending so much time explaining the syntax because it was initially a little difficult to get this across to developers, people who were probably looking at SQL Server's parameterization for the first time. The folks who have been working with Java or .NET know that you can do explicit parameterization in the client code as well, with classes and objects there. So that's the syntax: you declare the statement, you declare a local variable which holds all the parameter definitions, and then you supply them in your dynamic SQL execution and tell it where the values come from. Quite simple.
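
Putting the pieces together, a sketch of the parameterized version (same assumed column names as in the earlier sketch):

    DECLARE @cnt tinyint = 1;
    DECLARE @sqlstring nvarchar(500);
    DECLARE @ParamDefinition nvarchar(500);

    -- The statement keeps a parameter placeholder instead of a concatenated literal
    SET @sqlstring = N'SELECT BusinessEntityID, NationalIDNumber '
                   + N'FROM HumanResources.Employee '
                   + N'WHERE ManagerID = @ManagerID';

    -- All parameters are explicitly defined, comma-separated if there are several
    SET @ParamDefinition = N'@ManagerID tinyint';

    WHILE @cnt <= 100
    BEGIN
        EXEC sp_executesql @sqlstring,
                           @ParamDefinition,
                           @ManagerID = @cnt;  -- value supplied per execution; one parameterized plan is reused
        SET @cnt += 1;
    END;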

Now let's free the proc cache again, go back to the helper window and see how many plans we have: again two plans, and 192 kilobytes is the plan cache size. Now let's execute this entire batch, and hopefully you will see that there is no plan cache bloating; there should be just one plan in the plan cache, and it will be a parameterized plan. This is what you expect. This is another common thing that we see. You saw it took me quite some time to explain this, and there are more aspects to it which I'm not covering in this demo. Okay, so let's execute. This is done, and you can see that there are four plans; something else is also in there, that's fine, but there is just one plan for the execution that you did, and now the plan cache size is only a couple of hundred kilobytes. Remember, earlier it was 103 plans and the total plan cache size was about 1,800 kilobytes. Look at the difference, and look at the time we are saving: the CPU time we would otherwise spend computing so many plans. Now there's just one plan, so we're saving a lot of the optimizer's time, and we're saving memory. Remember, plan cache memory comes from main memory; it's one of the memory clerks, just like the buffer pool. So you really don't want plan cache bloating, and in big environments there can be hundreds and thousands of such ad hoc plans; if the plan cache hits its limit, plans get kicked out of memory, and when the same query comes again the plans are recomputed, recompiled, regenerated, all of that causing extra CPU cycles.

All right friends, let's close this one; that was the fourth demo. Let's move on to the fifth demo. How are we doing on time? Well, we're just touching about an hour and we have two more demos to go. Okay, so let's go into bookmark lookups. Bookmark lookups are good, but sometimes they can go very bad; I've seen the good side of things and I've seen the bad side. For this demo I want to thank Adam Machanic: I'm using his "big" tables, because sometimes to show these big demos you need large data, so I'm using his scripts to create the large data set, and this particular script as an example to demonstrate the bad behavior of a bookmark lookup. But the learning is not only about bookmark lookups; there are some more learnings here. So let's dive straight into this; we're using AdventureWorks2016.

let's go and look into a query

there is this query where you are

retrieving

some data from uh transactions table in

dbo so this transaction table is a big

table with 3.9 million records about

close to four million records

and there is another table called big

product table which is not very big but

it has just 25 000

records. so your query says that you want to fetch data into a temporary object #x, and you're fetching the data from the big product table and doing a cross apply with an inner expression; the inner expression is fetching data from the transactions table, and there is a filter on the product id. so, a relatively simple query with an outer and an inner expression, and you're joining them using cross apply
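The query being described is along these lines (a sketch; the exact table and column names in the demo script may differ slightly):

SELECT p.Name AS ProductName, x.*
INTO   #x
FROM   dbo.bigProduct AS p
CROSS APPLY
(
    SELECT *                          -- the SELECT * here is what later drags in the new column
    FROM   dbo.Transactions AS t
    WHERE  t.ProductID = p.ProductID
) AS x;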

let's turn on actual execution plan and

execute this query

and let's see how much time this takes

so

this query is going to run it's going to

put data into

this temporary object #x so this

select

into statement takes how much time it's

going to execute this

actual execution plan is turned on in my

last execution it took about

six to seven seconds okay yes

it takes exactly seven seconds sql

server is becoming more and more

predictable with every new release

all these demos are running on sql

server 2019

now if you go and look into the

execution plan this is a parallel plan i

can see a lot of parallel operators

everything looks good here

seven seconds all fine no problem now

let's go back to the query window

and let's discuss the scenario now let's

say

there was a need to add another column

to the transactions table right the

customer id column

now this customer id column to be added

to the transaction table is not your

requirement i'm just making up a

scenario

maybe some other application some other

department some

other set of people just wanted to add

customer id

and and the dba goes and adds customer

id column to the transactions table

if you're thinking that in real world it

doesn't happen that way there is a whole

change management process that has to be

put into place to add

an additional attribute to a table

let me tell you in real world it does

happen that way you just go

and modify schemas so all sorts of

things happen

anyway that's a separate discussion and

debate but let's say customer id column

gets added

to the transactions uh table now did i

really add that let's go and add this

okay yes it is added
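For completeness, adding the column is a one-liner along these lines (the data type is an assumption; the demo script may use something different):

ALTER TABLE dbo.Transactions
    ADD CustomerID INT NULL;   -- nullable, so existing rows are unaffected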

and now what you were not aware of this

let's assume because as i said this was

for some other application in some other

department but

just like any other normal day you go

and execute your query i need to drop

the table

to execute the query again so let's drop

the table and like any other normal day

you go and execute your query now

and to your surprise let's turn on

actual execution plan and execute

now this query takes a lot more than

what it took

earlier and you know that uh that funny

thing that we

say you know the query is taking so much

time that i can actually go to the

cafeteria

and have a cup of coffee and while i

come back the query is still running and

all of that

so while this query is still running it

has been over seven seconds

let's try to put that into action let me

grab

a bottle of water and while this is

running let me drink some water

there's no coffee here right now in

front of me so water is good enough

anyway my throat was

drying up a bit

okay friends so this is still running

isn't it that that joke of the query is

running and it takes that much time that

i can go to the cafeteria and whatnot

yeah it is done so it took about 51

seconds for the same execution

the interesting thing to note here is

that friends

only a column was added you have not

done any change to the

data so what has gone wrong let's go and

look into the execution plan

and let's

see the execution plan now and all of

you are seeing the execution plan let me

ask you friends

where do you see a problem in the plan

now don't say look up just because i

said look up you look at the plan and

evaluate yourself

i know you may want to take the cursor over the iterators and investigate; we will do that. but just at the first look at the plan

where do you see is a problem you can

use the chat window to put down some

answers

you can see a lot of iterators flying

around there where do you see

a problem any guesses

you can use the chat window, not the q&a panel, to answer; the q&a panel is for questions

you can use the chat window okay someone

says cost

80 percent okay for the key lookup sort

operation

uh could be a problem yes there is a

warning symbol on the sort operator good

observation there

okay problem is lookup key lookup io

cost

okay a lot of okay nested loops and key

lookups key lookups and sort both could

be a problem yes

you're right that both could be a

problem okay parallelism

fine lot of different answers but yes

most of you are kind of right when you

look at the plan

your first attention your attention

would probably go to the sort operator

because there is a warning symbol there

and of course in the real world we will not

ignore that

and the other thing is the key lookup, because the key lookup iterator's cost takes up 80 percent of the overall plan, relatively. now the point here is,

just for the sake of brevity, for the sake of saving time, i am not jumping into the sort operator, friends; i am going to look into the key lookup. the sort, just to let you know, might be spilling data etc., but that's not the real culprit. so just for the sake of time, let's go and look into the key lookup, and you know why the key lookup gets more attention here for

me at least because of the cost factor

the cost factor is eighty percent the

sort operator even though there is a

warning symbol the cost factor is

seven percent i'm not trying to say that

i'm not going to look into it i will

uh in real world when we are

troubleshooting performance but just

for this demo i am going to look into

the key lookup so when i take the cursor

over

key lookup you can see that

okay what do you see you can see that

the object here is pk transactions that

we are looking up on

and the key lookup is fetching a column

customer id that's the crux here friends

okay

which means what customer id was not

was not there earlier in the first

execution we added customer id to the

transactions table

and assuming i'm assuming but i'm going

to verify this probably there was an

index which was covering the query and

all the data was coming from the index

but now because you have this additional

column customer id which is not part of

the index

and because your query says select star

from transactions table which means get

me

all the columns from transactions table

the optimizer has to

apply a key lookup to fetch this

additional column and why is it so

expensive

it is expensive because friends you are

executing this key lookup

3.9 million times, once for each row, to

fetch the data so you get the data from

the index

but there's no customer id you want

customer id so go and look up

on the base table get the customer id

value then the second row go look up get

the value then the third row

go look up get the value you do that 3.9

million times that's why it took so much

time

that you had a sip of water in between

okay so this is what is happening now

let's try to

corroborate a bit how are we going to

decipher this

sql server gives you a missing index

hint so i'm going to right click

and look into the missing index details

now when i look into the missing index

details again there's a lot of advice on

the internet that missing index hints

are bad

you should just ignore them or not

blindly create them and all of that

advice some of that advice is right

don't blindly go ahead and create the

indexes recommended by sql server

this is a hint right a hint is a hint

just take the hint

try to understand why sql server (not management studio, sorry) is recommending you this hint. of course the missing index feature is working in the background, but anyway, why are you really getting this recommendation

and then try to evaluate this

recommendation

based on the existing indexes that you

have

so what sql server is telling me here is: go ahead and create an index on product id, and include transaction date, quantity, actual cost, and also include customer id
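The recommendation is roughly of this shape (a sketch; the missing index DMVs generate the exact column list and a placeholder name):

CREATE NONCLUSTERED INDEX [<Name of Missing Index, sysname,>]
ON dbo.Transactions (ProductID)
INCLUDE (TransactionDate, Quantity, ActualCost, CustomerID);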

this is interesting so let me do

something let's go to databases

expand adventure works 2016

expand tables and expand dbo

transactions

go on to indexes and here is the

index non cluster index right click and

let's script this out

and let's look at the structure of this

index

the definition of this index what are

the key columns and what are the

columns that have been included and when

you look at this index friends

you can see that this non-clustered

index actually has product id and

transaction date as part of the key

columns

and it includes quantity and actual cost
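Scripted out, the existing index looks roughly like this (a sketch; the actual index name in the demo database will differ):

CREATE NONCLUSTERED INDEX IX_Transactions_ProductID_TransactionDate
ON dbo.Transactions (ProductID, TransactionDate)   -- key columns
INCLUDE (Quantity, ActualCost);                    -- included columns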

now i would really spend a lot of time

on this, but just the key summary here: the query also needs transaction id, and transaction id is the clustered index key, so it is already part of every non-clustered index; you don't need to explicitly add that. which means this actually was the covering index until the customer id column came along. till the time the customer id column was not there, this index had all the columns that the query wanted, so there was no need to do the lookup

now because customer id is there sql

server recommends you to create another

index

where you can include customer id so

that this new index becomes the covering

index and it satisfies the query

and there will be no bookmark lookup

that is the crux

so what are the solutions in front of

you well the solution is

you can just go ahead and create this

index solution number one but that would

be the worst thing

you will do because you don't need

another index

an existing index is already there. best of all, probably you could just add the customer id column in the included section here, job done, you could do that
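If you did want to go that route, one way is to rebuild the existing index with the extra included column (a sketch, reusing the index name assumed above):

CREATE NONCLUSTERED INDEX IX_Transactions_ProductID_TransactionDate
ON dbo.Transactions (ProductID, TransactionDate)
INCLUDE (Quantity, ActualCost, CustomerID)
WITH (DROP_EXISTING = ON);   -- replaces the old definition in place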

but you know what i'll go one step ahead

and

tell you something which is very very

fundamental very very rudimentary and

developers dbs all of us i mean even i

make those mistakes

we don't follow the guidelines and the

best practices

you know the problem is not with adding

customer id as an additional column

the problem is not even with index even

though by creating this new index or by

modifying an existing

index you can solve the problem but you

know the real problem is

here, this is the real problem, friends: select star from the transactions table, that's the real problem. why star? why do you want all the columns? and even if you originally want all the columns, why don't you specify the column names

remember the problem that i talked about

customer id as an additional column was

not your requirement

if this query had actually been written specifying all the column names that were required, transaction date, transaction id, actual cost and whatnot, it wouldn't mention customer id, and then customer id, or any number of columns that get added to the transactions table, wouldn't matter; your query would run absolutely fine and would avoid the bookmark

lookup so what i'm trying to say is

this friends here look at this

i'm specifying the columns that i want

so see where we all started bookmark

lookups do get expensive those are

areas where you should look into but

remedies could be many there could be

very different

diagnosis that you can do so in this

particular example at least the

diagnosis is not to create the new index

not to modify an existing index. in some other scenarios that could be the solution, maybe, or could be required

but in this particular case the need is

just to

make sure that you write the query in the right way and you follow the best practice, which is: specify the column names, avoid select star. and if you go and execute this, see, i'm not making any other change, but let me first, friends, drop the table, and then i specify just the columns that i wanted.
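The rewritten version looks something like this (a sketch; the exact column list in the demo may differ):

SELECT p.Name AS ProductName,
       x.TransactionID, x.TransactionDate, x.Quantity, x.ActualCost
INTO   #x
FROM   dbo.bigProduct AS p
CROSS APPLY
(
    SELECT t.TransactionID, t.TransactionDate, t.Quantity, t.ActualCost   -- only the columns actually needed
    FROM   dbo.Transactions AS t
    WHERE  t.ProductID = p.ProductID
) AS x;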

this is the query let's go and execute

this how much time will this take

and we are including actual execution

plan which is fine this should again

take just about seven to eight seconds

what it originally took

so yes there you go and it takes about

seven seconds so many learnings from

this one single demo

and you know friends we talk about uh

the

putting down the column names and

avoiding select star and all these best

practices you know this i mean because

these are the first few things that we

learn when we learn about sql server or

we learn about databases

but there's another thing testing your

query with real data

with most of my customers what i've seen

is when we are writing such queries you

know this

actually showed up as a slow running

query because it was looking up on a

table with like millions of records

and when it was really fetching and

executing iterating so many times

millions of times

it was running so slow when i write this

query

in my dev environment i'm testing it

against small tables

tables with just like thousands of

records this is just going to run so

fast

and there wouldn't be a problem but when

i take this query in production

environment

and put it up against large tables that

is where performance problems are going

to come

so the learning is in your development

environment in your testing environment

make sure that you're testing your

queries

with some real data it may not be

exactly production data but today

there's so many free tools available to

kind of create and you know blow up the

table and put like millions and billions

of records for you to test

take my word on it i have seen

developers

testing their queries against empty

tables

no data and query runs absolutely fine

right now stop laughing you and i we

both have done that

okay let's move on so a lot of learnings

from this one

and let's do a bit of cleanup and drop

this column

and friends there we come to the

last demo of the day which is

query tuning parameter sniffing and this

is a bit relaxing because

this is where, i think, parameter sniffing has been talked about a lot in recent times, so many of you might know about parameter sniffing

but let's

uh see what is happening so we discussed

about parameters a bit when we were

doing the

uh the other demo about dynamic sql this

is a little

different flavor of uh parameters and uh

let's first

run a create a stored procedure run it

and then understand

the concept behind parameter sniffing

and how we can actually

fix it. so i'm using adventureworks 2016, yeah, that's fair enough, let's use that. and there is a stored procedure, get customer ship dates, which has two parameters, a ship date start and a ship date end, and the stored procedure is quite simple: a simple

select statement here where i'm fetching

data from sales order header

uh table where the ship date is between

the start and the end

quite simple let's go and execute this

and then, if you scroll down, we'll create an index on ship date just to support the select query
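Together, the procedure and the supporting index are roughly this (a sketch; the parameter names, column list, and index name are assumptions):

CREATE PROCEDURE dbo.GetCustomerShipDates
    @ShipDateStart DATETIME,
    @ShipDateEnd   DATETIME
AS
BEGIN
    SELECT CustomerID, SalesOrderID, ShipDate
    FROM   Sales.SalesOrderHeader
    WHERE  ShipDate BETWEEN @ShipDateStart AND @ShipDateEnd;
END;
GO

-- helper index on ShipDate to support the SELECT inside the procedure
CREATE NONCLUSTERED INDEX IX_SalesOrderHeader_ShipDate
ON Sales.SalesOrderHeader (ShipDate);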

so let's go and create an index here you

know this

is going to be a good demo because it's

going to sum up many things about

scanning

seeking parameter sniffing and lookups

and whatnot

i don't know it's a coincidence i didn't

plan it that way but anyway

so the stored procedure is created, you have created a helper index just to speed up the select statement which is inside the stored procedure, and now let's free the proc cache, clear the cache; again, because it is a demo, i'm doing it.

now let's run the stored procedure with two different sets of parameter values. the first execution gets us all the data, up to 2020 let's say, okay; it's getting us all the data of the last 15 years. and the second execution of the stored procedure is only going to fetch me 10 days of data, from july 10th 2005 to july 20th 2005.
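In code, the two executions are along these lines (a sketch; the exact start and end dates of the wide range aren't spelled out in the session, so they are assumed here):

-- low selectivity: returns roughly 15 years of data, so the optimizer prefers a scan
EXEC dbo.GetCustomerShipDates '2005-07-10', '2020-12-31';

-- high selectivity: only 10 days of data, so a seek plus key lookup wins
EXEC dbo.GetCustomerShipDates '2005-07-10', '2005-07-20';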

so let's free the proc cache and let's

execute the

first execution let's do the first

execution

when we do the first execution here and click on execute... okay, let me turn on the actual execution plan, i didn't do that, so execute it again. and you are going to see that sql server executes it and you get about 30,000 records if you look at the status bar, no point zooming in; we fetch all the data, and you are seeing a scan plan there. if you see, this is a clustered index scan that is happening, and rightly so: the optimizer sees that you're getting a lot of data, the query has low selectivity, which means a high number of records, and it decides to use a scan. now let's free the

proc cache again and let's execute the

stored procedure

again now and now again this is a fresh

execution

and this time the query is highly

selective only 10 days of data

and we execute this and you get no data

which is fine

getting no data is also highly selective. and you go and look into the execution plan, and this time you get a different plan: you don't get a scan plan, you get a seek with a key lookup. now don't confuse the key lookup here with the key lookup things that we talked about in the previous demo; let's just keep our head on this demo. two different executions of the stored procedure, done separately, and two different plans. what's going on? well, this is called parameter

sniffing in each execution of the stored

procedure

the optimizer sniffs the parameter value

and generates an optimized plan based on

the parameter value

now there is a thing about stored procedures which is good: stored procedures encourage plan reuse. stored procedures are compiled objects; in their first execution the execution plan gets created and it gets injected into the plan cache, and subsequent executions of the stored procedure reuse the plan which is already there in the plan cache

but in our example here what has

happened is when i

executed both the stored procedures i

was i was clearing the cache in between

which means every execution was a fresh

execution

of the stored procedure, a first-time execution of the stored procedure, and there was no plan in the plan cache

so the optimizer creates a new plan

every time and then you see the

parameter values are different so it

generates

a new plan every time but now look at

the fun

this is not what is going to happen in

real world isn't it no one is going to

fire free proc cache or something every

before or after every execution of the

stored procedure in real world this is

what is going to happen

when you execute the stored procedure

for the first time with the first set of

parameter values which is getting all of

the data

you are going to get a scan and the scan

plan gets injected

another user comes with a different set

of parameter values making the query

highly selective different values

and you execute this again you are only

getting few records or no records highly

selective

but the optimizer forces itself to use

the

scan plan which is already there in the plan cache, and look at what happens here: the second, subsequent execution of the stored procedure ends up using the scan plan. fine

but then something else could have

happened which is

instead of the first set of parameters

coming in first

you could have this one this execution

coming in first there's no plan in the

plan cache now

the first execution of the stored

procedure itself comes with

highly selective parameter values

execute this

and what happens the seek plan gets

injected into the plan cache

quite interesting and the subsequent

execution could be this probably

where it wants a lot of data now and and

making the query low selective and you

execute

and this forces the optimizer to use the

seek plan with key lookup

fetching the data from the table forcing

itself to use the seek plan

and the same problem you have the key

lookup

where the key lookup execution is 31 000

times

now the table is small with 31 000

records imagine if this was 31 million

this could have taken minutes to execute. this is the problem called parameter sniffing

parameter sniffing is a good thing in

general because the optimizer sniffs the

parameter values it's great but

sometimes

it can cause this kind of problem where

a plan gets injected into the plan cache

based on a certain set of parameter values, and for all parameter values that are good for that plan it's great. but if your table data, the column values, are skewed, a skewed distribution of data, where you have different parameter values and a variety of parameters coming in, there could be some other plan that would be better for them. and because of the way stored procedures encourage plan reuse, the optimizer will force itself to use the existing plans in the plan cache, and

friends this is the reason why sometimes

in your sql server environment when you

have these stored procedures

sometimes they run fast sometimes they

run slow sometimes they're running fast

again

sometimes they're running slow again and

all this is happening in background

parameter sniffing

is probably happening in your

environment

now how do you fix it again it's a big

concept there are a lot of other things

that i can talk about

but a quick fix could be to use the

statement level recompile hint

so i'm going to drop the stored

procedure now and

i'm going to modify the stored procedure

in this way: i'm going to create the stored proc again with the same parameters, and i'm going to say, for this statement, option recompile. i'm sure

you have probably come across this hint

option recompile where i'm telling the

optimizer that you know what

when this stored procedure comes for

this particular statement don't store

the plan

every time compile it again based on the

parameter

values so this is going to do the trick

first
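The modified procedure is along these lines (a sketch, reusing the assumed names from above):

CREATE OR ALTER PROCEDURE dbo.GetCustomerShipDates
    @ShipDateStart DATETIME,
    @ShipDateEnd   DATETIME
AS
BEGIN
    SELECT CustomerID, SalesOrderID, ShipDate
    FROM   Sales.SalesOrderHeader
    WHERE  ShipDate BETWEEN @ShipDateStart AND @ShipDateEnd
    OPTION (RECOMPILE);   -- statement-level hint: compile a fresh plan for each execution
END;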

now the moment i say this as a solution

i as i said this may not be the best

solution

it's a quick solution that alleviates a lot of other problems. but now, the moment you

hear me saying

that every time the stored procedure

comes again create the plan again you're

thinking about

the plan cache, recompilation, extra cpu cycles

being consumed more time and all of that

i'm going to come to that

but let's first see if the problem gets

solved so you have created this

stored procedure you modified it let's

go and execute the

queries again first execution of the

stored procedure

what do you get get me all of the data

you go to the execution plan

it's a scan plan all right second

execution of the stored procedure

let's go and execute this and go to the

execution plan

you get a seek. i am not running dbcc freeproccache in between, isn't it? let's

go and execute the first one back again

and you will see a scan plan again so

every execution of the stored procedure

you get a fresh new plan

based on the parameter values the

previous plan is not stored in the plan

cache

now you're thinking about the cost

factor because every time the plan

has to be you know compiled optimized

and you know extra cpu cycles

etc well there is no magic bullet here

it's about

trade off very simple to gain something

you have to lose something

this is an old saying which you have

heard all the time

so the whole idea is: if the trade-off is good... today, modern hardware is quite powerful, right, storage, cpu, huge memory on boxes, so there are cpu cycles available. with those additional cpu cycles, if the trade-off is that you're saving so much on query execution and the queries are running fast, why not? if the trade-off is good, go ahead and use it

so

that's the whole idea of parameter sniffing there. again, many of these topics require more debate, longer discussions, more scenarios, but that's what

a free session is all about. well, on sqlmaestros.com, beyond the virtual summit and all of that, i do have a lengthy video course also which is just as interactive as this, i'm on screen, but much better, with a green screen setup and all of that. so go to sqlmaestros.com, check out the performance tuning video course, which is a 15-hour course packed with a lot of demos, a lot of modules

maybe you may like that and you may want

to subscribe to that uh go ahead and do

that

uh if if you think that's uh worth it

but uh for this session

these are the six demos and as i said

the sixth one was quite

interesting it summed up so many things

about scanning, seeking, bookmark lookups

the same concepts came again but then

yes friends that's the summary these are

uh the six things that i see commonly uh

but i'll tell you

these are not the only six things, but i just picked six common query tuning techniques which

can solve

75 percent maybe it will solve 60 65 or

maybe even 90 percent that's

relative but yes common problems that

you see in sql server

and i hope your time was useful in

today's

session you learned something new uh do

give feedback friends the link is uh

there in the chat window my colleague

satya is just posting it so

i would want to hear from you if this

session was worth your time

please give some feedback the linkedin

link is there in the chat window if you

click on that

i hope that it has gone to all the

attendees uh when you're posting it

great so um uh i'm waiting for feedback

and post something there visit data

platform virtual summit dot com also and

explore the content friends content is

king

content has to be meaningful for you

just don't uh buy a ticket and come up

on board on any conference look at the

content and compare it with the price i

truly believe that the team has done an

incredible job here

trying to put up all of this together at

a very very affordable price

even if your company and your managers

are not sponsoring you

you can still uh think about coming on

board

so two things yes please give feedback

follow me on twitter, a underscore bansal

please feel free to give some feedback

on twitter as well if you are a twitter

user and on linkedin and yeah

i had a great uh 90 minutes of uh this

session time to jump into question and

answers right so your questions if

you're on youtube

post it on youtube chat window no

problem and

if you're on zoom let's use the q a

window

and let's start getting down to the

questions

so the first question, tejas has a question: how can we get an index seek when using convert functions? do we need to create computed columns which convert datetime into date and use them directly in the where condition? yes, tejas, you can actually use cast and convert to do explicit conversion and that might work

better

but there is a catch to that also there

are there is an exception if you go to

sqlmastrose.com look at some recent

blogs

i did post a very long blog where cast

and convert was also an exception

to sargability so while in most

functions it will work

but there could be some exceptions as

well

next question is from gabriel: the key lookup still seems inefficient? okay

well uh gabriel see it's not about any

iterator being

efficient or inefficient these are

different access

methods by that logic i think i can

always say scan is always inefficient

but

nothing is always bad or nothing is

always good

so to say and the and the best answer

which you don't like to hear

it depends so given a problem let's take

the problem at hand and then try to

figure out

uh what's good or bad relatively small

set of data lookups could be good

but you just can't keep creating

covering indexes to avoid

lookup that could be bad in the other

world

i know you and that was not the answer

you were expecting but that's all i had

akshay has a question i have a doubt

in the where clause you haven't used

business entity id

in the where clause so how index scan is

uh

being used. okay, akshay, that's one of the challenges of taking all the questions at the end, because your question is about a specific demo. so which demo is this about? maybe satya is following, he may tell me. business entity id, was this... okay, the doubt is: the where clause hasn't used business entity id, yet an index scan is showing. so this was one of the early questions probably, about why a seek is not happening; that question came at 7:26, maybe in the second demo

i think your question about business

entity id

could be because it's uh it's the

clustered index

so if i'm using that maybe it would be

only seeking

but can you just rephrase which demo

you're talking about post that again i

will answer that question

uh with the demo promise

jayhawk has a question do you have any

recommendation for understanding how to

analyze the execution plans

well, jayhawk, some simple guidelines are there: look at iterators with warnings, look at iterators with high cost factors, look at where a huge amount of data is flowing; the visual indicator of the arrows is good, that helps. always make it a point to go and look into the select operator, because when you take the cursor over the select operator it tells you if the query has been using an extra memory grant. those are some general guidelines. i also go one step ahead and try to figure out cardinality estimations, so if you take the cursor over the arrows, apart from the number of rows that are being transferred between iterators, try to look at the cardinality estimation factors; those are important because they may tell you whether statistics are up to date and whatnot

right so just general guidelines again

much larger

topic

anmol has a question: how does a covering index work in the execution plan? well, it's not really about how a covering index works in the execution plan; i mean, if you have a covering index, sql will either just scan it or seek it. the whole idea is that a covering index is an index which contains all the data that your query is asking for. so if your query selects columns a, b and c, and let's say there is an index which has a, b and c in whichever order, it could either scan or seek, and when it would scan versus when it would seek

that i've already explained in the first

two demos

akshay has a question: let's say we create an index on a, b and c, that is the order of index creation, okay, a, b and c. now in the where clause i say b = x and c = y; will it use a seek? no, it will not seek, because you are not filtering on a, so it won't seek on b and c, it will be a scan. what if only c = y? it will not use a seek either. what if a = something is also there? then it will use a seek, yes. so it's the order of columns: as i said, because a is the first column in the index, you have to seek on a if you want to seek on b and c, right, and in that order

and you can try that out yourself i mean

just write a simple query with with an

index with three columns and play around
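A quick way to play with this yourself, roughly (a hypothetical table and index, just for experimenting):

CREATE TABLE dbo.SeekDemo (a INT, b INT, c INT, payload CHAR(100));
CREATE NONCLUSTERED INDEX IX_SeekDemo_abc ON dbo.SeekDemo (a, b, c);

-- leading column a is filtered: this can seek
SELECT a, b, c FROM dbo.SeekDemo WHERE a = 1 AND b = 2 AND c = 3;

-- leading column a is NOT filtered: expect a scan of the index instead
SELECT a, b, c FROM dbo.SeekDemo WHERE b = 2 AND c = 3;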

santosh: hi amit, the session is going really good. i have a question: how does the optimizer handle conversion on varbinary or text data type columns, for example select * from table where col1 like 'something%'? varbinary and text data types, which are lobs, large objects, conversion on them...

sorry santosh um i don't have a good

answer for that

right now maybe just got to look into it

so

uh take a note of this uh

try to post this question in our

linkedin group so friends uh have you

joined our linkedin groups do that uh we

can continue doing more discussions even

beyond the session i'm hitting

8 30 now and that is where i will have

to wind up because i have another

session coming up for a customer uh but

i'll just quickly go over this one

post that question there i'll do some

research and get back to you

on on the linkedin group so you can join

either the sqlmaestros or the sqlservergeeks group, whichever you like, no

problem

okay, shilpa has a question: is there any way to avoid cross apply with a table-valued function, since it's causing performance issues? shilpa, that's a completely different topic altogether, table-valued functions and cross apply, some other time, right, or just post the question in the group, because tvfs are a different discussion altogether. and you know, i know where this question is coming from: the tvf is creating performance issues because it's constantly running for every row from the outer expression, so i know where it's coming from, and it won't use stats and whatnot, so cardinality estimation will go bad. rahul asks: can a scan be useful

over a seek in some cases? yes, rahul, a scan can be useful over a seek in not some but

many cases

three one zero five four one has a

question that's a great name

three one zero five four one so three

hundred and ten five four one asks a

question

is it good choice to use recompile

everywhere no it's not a good choice to

use recompile everywhere

um i just showed you um a demo and a

scenario where it could be used and it

is a hint hints are really

the last thing we should try to use when

we're doing query tuning stuff

so as i said nothing is good or bad you

know all the time

it really depends on the situation and

scenario

and then ganesh has a question how to

find parameter sniffing is happening

great question ganesh thank you how to

find

well, if you are on the latest versions of sql server, like 2019, 2017, 2016, you can use query store; it helps you identify this. but if you're not using query store, or if you're on 2012 or 2014, something like that, then there is a way you can fetch the information from the query stats dmv. the query stats dmv has information about the time a query took for its last execution and the time it took on average in previous executions; you have to find a relative factor, that is, where there is a big difference between the two times, and extract all those queries, and potentially there could be a parameter sniffing problem there

not always but potentially that's uh the

answer to that
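One hedged sketch of that kind of check against sys.dm_exec_query_stats, comparing each query's last run against its average (the thresholds here are arbitrary illustrations, not his exact script):

SELECT TOP (20)
       qs.last_elapsed_time / 1000.0                                 AS last_ms,
       (qs.total_elapsed_time / qs.execution_count) / 1000.0         AS avg_ms,
       SUBSTRING(st.text, (qs.statement_start_offset / 2) + 1, 200)  AS statement_text
FROM   sys.dm_exec_query_stats AS qs
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) AS st
WHERE  qs.execution_count > 10
  AND  qs.last_elapsed_time > 5 * (qs.total_elapsed_time / qs.execution_count)  -- last run much slower than average
ORDER BY qs.last_elapsed_time DESC;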

so, i think, on the sqlmaestros youtube channel... friends, i will invite all of you to subscribe to the youtube channel, please subscribe to sqlmaestros, do subscribe to that channel, there are a lot of videos out there. and ganesh, there is a video under the playlist called sql shigra where i've talked about that; i

didn't

post the query there but i've explained

you how it can be done

and thanks i will in some days to come i

will write a query and i will put it up

in linkedin group or i will blog about

it

so ganesh you will get the answer to

that so that's the channel there

youtube.com/sqlmaestros, please go

ahead and subscribe to that thank you

friends

tejas has another question: how can we use an index seek using convert functions, do we need to create... okay, i answered this, it came a second time

okay, the next question: do included columns in an index play a vital role in performance tuning? yes, they play a vital role, they are very powerful. remember, in sql server 7 and 2000 this include thing was not there, and we used to dump all the columns in the key definition; that used to increase the size of the index key, which increased the length of the key on the page, and it increased the number of intermediate levels, which was very bad. so include is good, but remember that if you put a particular column in the include clause you cannot seek on it; you can only seek on the columns that you put in the index key definition. otherwise, including columns in the index using the include keyword is actually quite powerful

okay, the next question, any advice for keeping system... friends, i hope those who are leaving are giving feedback on linkedin, please do that, that's the only thing i ask, and the other thing i ask is that you should join dataplatformvirtualsummit.com. okay, so the question: any advice for keeping system-generated statistics up to date? statistics is a big topic, slightly out of context for today's session

so i invite you to join the linkedin

group post the question there and we can

have some discussion

friends we have telegram groups also in

case you are a mobile freak and you like

asking questions on mobile

um my friend satya can put telegram link

also so we have

telegram groups for sqlmaestros, sqlservergeeks and dataplatformgeeks

and even dps for that matter so um

you can join uh telegram app is just

like whatsapp

um you know app so you can uh join but

the good thing about telegram is your

mobile number is not exposed and that's

why we prefer telegram the other thing

is groups can have

more members more beyond the limitation

that whatsapp has and a lot of q a

happening there

in recent times i'm not very

participative because of the dps work

but

uh the good thing is community helps so

as a community member you can ask

questions

you can also answer questions of other

folks

okay friends, so murali has a question regarding dynamic sql and the plan cache: can we use the same in a cursor rather than in a while loop? no, no, avoid

cursors uh

try to stick to relational sets

sometimes cursors are the only solutions

you can't help much with that but try to

avoid cursors in general

santosh has a question procedure means

one time uh compilation reused many

times

procedure means one time compile and

reuse many times yeah santosh i think

that's what you meant right

you create it, it compiles, and the plan gets reused, correct

when you create an index on a table with a column an sp uses... okay, more and more questions are coming. can we get the recording? yes, anurag, we'll work on the recordings. let's get all these questions answered. now, again, parameter sniffing... okay, akshay: when you create an index on a table with a column, if an sp is using that column, then will the sps get recompiled, and if not, when are they recompiled? akshay, sorry, i'm not able to follow your question right now; if you could just rephrase it

but then friends we are on the top of

the hour and i have to end the session

now

beyond 8:30 i need to jump into another meeting. but, akshay, please post your question in the

linkedin group i'll come back to this

again

more questions i'll come back to that

please give feedback on linkedin

join our groups youtubes and telegram

and we let's stay connected and talk

to each other more about sql server and

performance tuning

with this i'll wind up today's session

thank you very much all of you joining

hope it was worth your time i really

enjoyed

uh delivering this uh session uh for all

of you

and uh wherever you are and whatever

time is left of the day have a great

uh day um have a great sleep if you're

in or around the apac time zone

etc thank you very much and see you soon

maybe see you at

dps2020 goodbye

