Unusual slowness of an SQL Server View
















I'm writing an application that queries the database of a third party application.



My app creates a "library" of SQL views in the third party DB so my later queries are much easier to write and read. (Not only because of the domain logic of my app, but also because the third party DB uses terrible names for tables and columns.)



I've noticed that one of my views (which joins other views, which in turn join other views...) shows unusual slowness on a client's system. I broke it down to smaller parts but couldn't figure out any clear suspect.



Following is a minified version of the view, with other views it normally references turned into CTEs, which is still exactly as slow as the original view. If I break it into any smaller pieces though, they each execute very fast. I've also added a few comments showing examples of small changes that make the query much faster.



-- The query takes about 5s when the server has no other load
-- That's too slow because the UI of the app needs the results
with
orderLines as (
    select r.DocEntry as rdrDocId,
           r1.LineNum as rdrLineId
    from rdr1 r1
    join ordr r on r.DocEntry = r1.DocEntry
    -- If I filter only by LineStatus or only by DocStatus here, query takes <1s
    where r1.LineStatus = 'O' and r.DocStatus = 'O'
),
picklistDetails as (
    select U_KommNr as pklDocId,
           max(cast(U_Space as int)) as maxPlace
    from [@PICKING]
    where U_DeletedF = 'N'
    group by U_KommNr
),
picklistDocs as (
    select p.AbsEntry as pklDocId,
           case
               when pd.maxPlace is null then 0
               else pd.maxPlace
           end as pklDocMaxPlace
    from opkl p
    left join picklistDetails pd on pd.pklDocId = p.AbsEntry
),
picklistDocLines as (
    select AbsEntry as pklDocId,
           PickEntry as pklLineId,
           OrderEntry as rdrDocId,
           OrderLine as rdrLineId
    from PKL1
)
select p.pklDocMaxPlace
from picklistDocs p
join picklistDocLines p1 on p.pklDocId = p1.pklDocId
join orderLines r1 on r1.rdrDocId = p1.rdrDocId
                  and r1.rdrLineId = p1.rdrLineId
-- If I force parallelism by using the following option, query takes <1s
--option(querytraceon 8649)


In addition to the fact that all parts of the query execute quite fast in isolation, I also get much faster execution time (again <1s in total) when I use #temp tables instead of CTEs, like the following:



-- This batch execution returns the same result but takes <1s

select r.DocEntry as rdrDocId,
       r1.LineNum as rdrLineId
into #orderLines
from rdr1 r1
join ordr r on r.DocEntry = r1.DocEntry
where r1.LineStatus = 'O' and r.DocStatus = 'O'

select U_KommNr as pklDocId,
       max(cast(U_Space as int)) as maxPlace
into #picklistDetails
from [@PICKING]
where U_DeletedF = 'N'
group by U_KommNr

select p.AbsEntry as pklDocId,
       case
           when pd.maxPlace is null then 0
           else pd.maxPlace
       end as pklDocMaxPlace
into #picklistDocs
from opkl p
left join #picklistDetails pd on pd.pklDocId = p.AbsEntry

select AbsEntry as pklDocId,
       PickEntry as pklLineId,
       OrderEntry as rdrDocId,
       OrderLine as rdrLineId
into #picklistDocLines
from PKL1

select p.pklDocMaxPlace
from #picklistDocs p
join #picklistDocLines p1 on p.pklDocId = p1.pklDocId
join #orderLines r1 on r1.rdrDocId = p1.rdrDocId
                   and r1.rdrLineId = p1.rdrLineId


Can anyone make sense of the behavior of SQL Server here? To me it seems kind of like a bug / failure of the query optimizer.



If I can't find a way to make the view as fast as it should be, I'll probably just turn it into a procedure that uses #temp tables, like in the second snippet I pasted, but ideally I'd like to avoid that. I have dozens of views of similar complexity and none of the others are this slow.



























  • 1





    CTEs are not materialized in SQL Server. With a temp table you force a specific part of the query to be materialized and run another part on top of that. With a single multilevel CTE, the query optimizer is free to apply any "optimization", and these are not always the best. See: T-SQL Common Table Expression "Materialize" Option

    – Lukasz Szozda
    Mar 6 at 18:11







  • 1





    Views on views on views can confuse the cardinality estimator. If you look at your query execution plan, you will probably find a place where the estimated number of rows between two nodes blows way out of proportion. It isn't always immediately clear how to fix something like that.

    – Robert Sievers
    Mar 6 at 18:21











  • Nested views are the spawn of the devil. They seem so logical but they are dreadful for performance. They are so horrific that Grant Fritchey deems this practice one of the seven deadly sins for SQL Server performance. red-gate.com/simple-talk/sql/performance/… But you said you are joining views, which is completely different. It is with nested views, where one view selects data from another view, that you get into trouble.

    – Sean Lange
    Mar 6 at 19:36























sql-server tsql sql-server-2014
















asked Mar 6 at 18:10









TaylanUB














1 Answer















"Can anyone make sense of the behavior of SQL Server here? To me it seems kind of like a bug/failure of the query optimizer."




No, this is not a bug.



The task split into a few smaller units:



The temporary table approach is nothing more than splitting that large query plan into smaller pieces and executing them independently.



The smaller the piece, the better the chance that the SQL Server query optimizer avoids a dramatic mismatch in cardinality estimation and chooses the right physical operators and join types, so there is less chance of seeing a nested loop over millions of rows or some other nasty thing.



By the time the statement below is compiled, the temporary tables are already populated, so the query optimizer knows how many rows each of them contains and how the values are distributed:



select p.pklDocMaxPlace
from #picklistDocs p
join #picklistDocLines p1 on p.pklDocId = p1.pklDocId
join #orderLines r1 on r1.rdrDocId = p1.rdrDocId
                   and r1.rdrLineId = p1.rdrLineId


One unit of work:



The CTE approach, as mentioned by Lukasz and Robert in the comments, is a kind of syntactic sugar, similar to a view on a view on a view. In the end, the query optimizer has to flatten all CTEs into one consolidated, and sometimes large, query plan and execute it as one unit. Therefore, the larger the plan, the bigger the chance of performance-related surprises.



So, in contrast to the previous snippet, the query optimizer compiles the plan at a moment when the number of rows at each step is only a guess made by the cardinality estimator from statistics:



select p.pklDocMaxPlace
from picklistDocs p
join picklistDocLines p1 on p.pklDocId = p1.pklDocId
join orderLines r1 on r1.rdrDocId = p1.rdrDocId
                  and r1.rdrLineId = p1.rdrLineId
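
One way to look for such a misestimate, as Robert's comment suggests, is to capture the actual execution plan and compare the estimated and actual row counts of each operator. The following is only a sketch: it runs the body of the orderLines CTE from the question on its own, and the same SET options can be wrapped around the full CTE query. In the returned plan XML, compare the EstimateRows attribute of each operator with the ActualRows values recorded for it.

```sql
-- Return the actual execution plan as an extra XML result set per statement.
set statistics xml on;
-- Also report CPU/elapsed time and logical reads in the messages output.
set statistics time on;
set statistics io on;

-- Body of the orderLines CTE from the question (the part with the two status filters).
select r.DocEntry as rdrDocId,
       r1.LineNum as rdrLineId
from rdr1 r1
join ordr r on r.DocEntry = r1.DocEntry
where r1.LineStatus = 'O' and r.DocStatus = 'O';

set statistics io off;
set statistics time off;
set statistics xml off;
```

A large gap between estimated and actual rows at the join of rdr1 and ordr would be consistent with the observation in the question that filtering on only one of the two status columns makes the query fast, but that is a hypothesis to check against the plan, not a confirmed diagnosis.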


querytraceon 8649:



When you enable option(querytraceon 8649), you just force the query optimizer to change its behavior, the same way other query hints or trace flags such as 4199 do.
So forced parallelism may occasionally generate a better plan, but you can hardly rely on this.



Some ideas of how it can be solved:



  • Update statistics on the involved tables

  • Experiment with switching between the new and legacy cardinality estimators

  • (imho) Rewrite the CTEs as derived tables?

  • If large datasets are involved, splitting the logic into smaller pieces using the #temp table approach is a workaround that works consistently.

  • etc.
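
The first two ideas can be tried with a few statements. This is a hedged sketch, not a guaranteed fix: the table names are taken from the question, while the view name MyPicklistView is hypothetical and stands for the slow view. Trace flags 9481 (legacy cardinality estimator) and 2312 (new cardinality estimator, SQL Server 2014 and later) are the documented flags for this purpose; QUERYTRACEON normally requires sysadmin permissions, and an OPTION clause cannot be placed inside a view definition, only on the query that selects from the view.

```sql
-- Refresh statistics on the base tables behind the view (full scan for best accuracy).
update statistics rdr1 with fullscan;
update statistics ordr with fullscan;
update statistics opkl with fullscan;
update statistics PKL1 with fullscan;
update statistics [@PICKING] with fullscan;

-- Compare both cardinality estimators on the same query.
-- MyPicklistView is a hypothetical name for the slow view.
select pklDocMaxPlace
from MyPicklistView
option (querytraceon 9481);  -- legacy cardinality estimator

select pklDocMaxPlace
from MyPicklistView
option (querytraceon 2312);  -- new cardinality estimator
```

If one estimator is consistently faster, that points at estimation of the combined LineStatus/DocStatus filter as a likely culprit, which can then be confirmed in the execution plans.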

There is one exception to the flattening of views into the overall query plan:



  • Indexed views. When an indexed view is referenced with the NOEXPAND hint, its logic is not flattened into the overall plan of the query that uses it; the view's index is read like a table. (On Enterprise Edition the optimizer can also consider indexed views without the hint.)
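
As an illustration only, here is what an indexed view for the orderLines part of the question could look like. Several things are assumptions: the tables are assumed to live in schema dbo, (DocEntry, LineNum) is assumed to be unique in rdr1, and the view and index names are hypothetical. Indexed views also have restrictions: no outer joins, no MAX, no CTEs, and no references to other views, so the picklistDetails and picklistDocs parts of the question could not be indexed this way.

```sql
-- Hypothetical indexed view over the inner join from the question's orderLines CTE.
-- Requires SCHEMABINDING and two-part table names; the session creating it must have
-- SET options such as ANSI_NULLS and QUOTED_IDENTIFIER turned ON.
create view dbo.vOpenOrderLines
with schemabinding
as
select r.DocEntry as rdrDocId,
       r1.LineNum as rdrLineId
from dbo.rdr1 r1
join dbo.ordr r on r.DocEntry = r1.DocEntry
where r1.LineStatus = 'O' and r.DocStatus = 'O';
-- GO is a batch separator for SSMS/sqlcmd, not a T-SQL statement.
go

-- The first index on a view must be unique and clustered; it materializes the view.
create unique clustered index IX_vOpenOrderLines
    on dbo.vOpenOrderLines (rdrDocId, rdrLineId);
go

-- NOEXPAND makes the optimizer read the view's index instead of expanding its definition.
select p1.AbsEntry as pklDocId
from dbo.PKL1 p1
join dbo.vOpenOrderLines ol with (noexpand)
  on ol.rdrDocId = p1.OrderEntry
 and ol.rdrLineId = p1.OrderLine;
```

Be careful with this in a third-party database: once an indexed view exists, every session that modifies the base tables must run with the required SET options, otherwise its writes fail, and the extra index adds write overhead.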






























  • +1 for the excellent answer. It's worth noting that trace flag 8649 is undocumented and therefore should only be used for testing and never in production. I don't doubt that you are aware of this, but most people are not. make_parallel() by Adam Machanic, however, is a safe and viable alternative. In my experience make_parallel() and OPTION(QUERYTRACEON 8649) perform identically. The advantage of the trace flag is that it does not mess with the cardinality estimates in the execution plan, whereas make_parallel() does: it tricks the optimizer into estimating a gazillion-row result set.

    – Alan Burstein
    Mar 6 at 19:59











  • @AlanBurstein, thank you. Frankly, I mentioned 8649 because the OP used it in his code. I'm going to take a deeper look at the make_parallel() trick.

    – Alexander Volok
    Mar 7 at 8:19










Your Answer






StackExchange.ifUsing("editor", function ()
StackExchange.using("externalEditor", function ()
StackExchange.using("snippets", function ()
StackExchange.snippets.init();
);
);
, "code-snippets");

StackExchange.ready(function()
var channelOptions =
tags: "".split(" "),
id: "1"
;
initTagRenderer("".split(" "), "".split(" "), channelOptions);

StackExchange.using("externalEditor", function()
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled)
StackExchange.using("snippets", function()
createEditor();
);

else
createEditor();

);

function createEditor()
StackExchange.prepareEditor(
heartbeatType: 'answer',
autoActivateHeartbeat: false,
convertImagesToLinks: true,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: 10,
bindNavPrevention: true,
postfix: "",
imageUploader:
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
,
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
);



);













draft saved

draft discarded


















StackExchange.ready(
function ()
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f55029646%2funusual-slowness-of-an-sql-server-view%23new-answer', 'question_page');

);

Post as a guest















Required, but never shown

























1 Answer
1






active

oldest

votes








1 Answer
1






active

oldest

votes









active

oldest

votes






active

oldest

votes









3















Can anyone make sense of the behavior of SQL Server here? To me it
seems kind of like a bug/failure of the query optimizer.




No, this is not a bug.



The task split to few smaller units:



Temporary table approach is nothing more than splitting that large query plan into smaller pieces and executing them independently.



The smaller the piece, the bigger the chance that SQL Server Query Optimizer will not perform some dramatical mismatch in cardinality estimation and will choose right physical operators and types of joins, so smaller chance to see a nested loop over millions of rows or some other nasty thing.



When there is a time run the piece of code stated below, the Query optimizer knows how much rows in every involved temporary table and how they are distributed:



select p.pklDocMaxPlace
from #picklistDocs p
join #picklistDocLines p1 on p.pklDocId = p1.pklDocId
join #orderLines r1 on r1.rdrDocId = p1.rdrDocId
and r1.rdrLineId = p1.rdrLineId


One unit of work:



The CTE approach, as mentioned by Lukasz and Robert in comments, is a kind of syntax sugar, similar to view on view on view. However, in the end, query optimizer has to flatten all CTEs into one consolidated and sometimes large query plan and execute it as one unit. Therefore, the larger plan the bigger chance of performance related surprises.



So, in contrast to a previous snippet, the query optimizer compiles plan at the moment when the number of rows is just guessed by a cardinality estimation using statistics:



select p.pklDocMaxPlace
from picklistDocs p
join picklistDocLines p1 on p.pklDocId = p1.pklDocId
join orderLines r1 on r1.rdrDocId = p1.rdrDocId
and r1.rdrLineId = p1.rdrLineId


querytraceon 8649:



When you enable option(querytraceon 8649) you just force the query optimizer to change behavior, the same way as other query hints or like traces similar to 4199.
So forced parallelism, perhaps occasionally generated a better plan, but you can hardly rely on this.



Some ideas of how it can be solved:



  • Update statistics on the tables involved

  • Try switching between the new and legacy cardinality estimators

  • (imho) Rewrite the CTE as a derived table?

  • If large datasets are involved, splitting the logic into smaller pieces with the #temp table approach can be a consistent workaround.

  • etc.
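
A minimal sketch of the first three ideas follows. The table names dbo.PicklistDoc and dbo.PicklistDocLine are hypothetical, and the estimator hint assumes SQL Server 2016 SP1 or later.

```sql
-- Sketch only: dbo.PicklistDoc and dbo.PicklistDocLine are hypothetical names.

-- 1. Refresh statistics on a table involved in the query.
update statistics dbo.PicklistDoc with fullscan;

-- 2. Compare estimators for a single query: this documented hint asks for
--    the legacy cardinality estimator (SQL Server 2016 SP1 and later).
select p.pklDocMaxPlace
from dbo.PicklistDoc p
join dbo.PicklistDocLine p1 on p.pklDocId = p1.pklDocId
option (use hint ('FORCE_LEGACY_CARDINALITY_ESTIMATION'));

-- 3. The same shape with a derived table in place of a CTE.
select p.pklDocMaxPlace
from (
    select d.pklDocId, d.pklDocMaxPlace
    from dbo.PicklistDoc d
) p
join dbo.PicklistDocLine p1 on p.pklDocId = p1.pklDocId;
```

Comparing the estimated and actual row counts in the execution plan before and after each change shows whether it helped.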

There is one exception to the flattening of views:



  • Indexed views. With the NOEXPAND hint (or on Enterprise Edition), the logic of the view should not be flattened into the overall plan of the query that references it.
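
A hedged sketch of what that looks like: the view dbo.vOrderLineCounts, its index and the base table dbo.OrderLine are hypothetical names chosen for illustration, and the batches are separated with go as in SSMS or sqlcmd.

```sql
-- Sketch only: dbo.OrderLine, dbo.vOrderLineCounts and the index name are
-- hypothetical. An indexed view needs SCHEMABINDING, two-part table names,
-- and COUNT_BIG(*) when it uses GROUP BY.
create view dbo.vOrderLineCounts
with schemabinding
as
select o.rdrDocId, count_big(*) as lineCount
from dbo.OrderLine o
group by o.rdrDocId;
go

-- The unique clustered index is what materializes the view.
create unique clustered index IX_vOrderLineCounts
    on dbo.vOrderLineCounts (rdrDocId);
go

-- NOEXPAND asks the optimizer to read the materialized view directly
-- instead of expanding its definition into the outer query.
select v.rdrDocId, v.lineCount
from dbo.vOrderLineCounts v with (noexpand);
```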






  • +1 for the excellent answer. It's worth noting that trace flag 8649 is undocumented and therefore should only be used for testing and never in production. I don't doubt that you are aware of this, but most people are not. make_parallel() by Adam Machanic, however, is a safe and viable alternative. In my experience make_parallel() and OPTION(QUERYTRACEON 8649) perform identically. The advantage of the trace flag is that it does not mess with the cardinality estimates in the execution plan, whereas make_parallel() does: it tricks the optimizer into estimating a gazillion-row result set.

    – Alan Burstein
    Mar 6 at 19:59











  • @AlanBurstein, thank you. Frankly, I mentioned 8649 because the OP used it in his code. I am going to take a deeper look at the make_parallel() trick.

    – Alexander Volok
    Mar 7 at 8:19















answered Mar 6 at 18:55, edited Mar 6 at 19:37

Alexander Volok






























