XML Bulk Load: Need query optimization help.

(SQL Server 2000, SP4)
Hello all!
I'm seeing a lot of I/O generated by one of our queries, and I think it's
how the query is constructed. Consider the following:
use [tempdb]
go
if (object_id('[dbo].[Test]') is not NULL) drop table [dbo].[Test]
create table [dbo].[Test]
(
[RowID] uniqueidentifier not NULL,
[Field1] int not NULL,
[Field2] int not NULL,
[Date] datetime not NULL,
primary key clustered ([RowID])
)
insert into [dbo].[Test] values (newid(), 1, 1, getdate()) waitfor delay
'0:0:1'
insert into [dbo].[Test] values (newid(), 1, 2, getdate()) waitfor delay
'0:0:1'
insert into [dbo].[Test] values (newid(), 1, 3, getdate()) waitfor delay
'0:0:1'
insert into [dbo].[Test] values (newid(), 2, 1, getdate()) waitfor delay
'0:0:1'
insert into [dbo].[Test] values (newid(), 3, 1, getdate()) waitfor delay
'0:0:1'
insert into [dbo].[Test] values (newid(), 1, 1, getdate()) waitfor delay
'0:0:1'
insert into [dbo].[Test] values (newid(), 2, 1, getdate()) waitfor delay
'0:0:1'
insert into [dbo].[Test] values (newid(), 1, 1, getdate()) waitfor delay
'0:0:1'
select t.*
from [dbo].[Test] as t with (nolock)
select t.*
from [dbo].[Test] as t with (nolock)
inner join (
select [Field1] = t.[Field1],
[Field2] = t.[Field2],
[Date] = max(t.[Date])
from [dbo].[Test] as t with (nolock)
group by t.[Field1],
t.[Field2]
) as q on q.[Field1] = t.[Field1]
and q.[Field2] = t.[Field2]
and q.[Date] = t.[Date]
In essence, I'm trying to find the rows from Test that have the *maximum*
Date for a unique Field1/Field2 combination. This query is used in a VIEW,
and users of this VIEW typically JOIN on RowID.
I'm sure the "guts" of this VIEW can be refactored to be better, but I don't
know how to do it. Any suggestions would be *much* appreciated!
John PetersonWhat about (didn´t try that but it should work)
Select * From Test
Inner Join
(
Select Field1,Field2,Max(Date)
From Test
Group by Field1,Field2
) SUbQuery
on
SUbQuery.Field1 = Test.FIeld1 AND
SUbQuery.Field2 = Test.FIeld2 AND
SUbQuery.Date = Test.Date
Its alsways interesting for performance issues to send the execution plan
with the post, use the SET SHOWPLAN_TEXT ON Statement before issuing the
command and you´ll get the query execution plan in text displayes, which you
can copy & paste in the newsgroups.
HTH, Jens Suessmeyer.
"John Peterson" <j0hnp@.comcast.net> schrieb im Newsbeitrag
news:eM0id1OeFHA.3620@.TK2MSFTNGP09.phx.gbl...
> (SQL Server 2000, SP4)
> Hello all!
> I'm seeing a lot of I/O generated by one of our queries, and I think it's
> how the query is constructed. Consider the following:
>
> use [tempdb]
> go
> if (object_id('[dbo].[Test]') is not NULL) drop table [dbo].[Test]
> create table [dbo].[Test]
> (
> [RowID] uniqueidentifier not NULL,
> [Field1] int not NULL,
> [Field2] int not NULL,
> [Date] datetime not NULL,
> primary key clustered ([RowID])
> )
> insert into [dbo].[Test] values (newid(), 1, 1, getdate()) waitfor delay
> '0:0:1'
> insert into [dbo].[Test] values (newid(), 1, 2, getdate()) waitfor delay
> '0:0:1'
> insert into [dbo].[Test] values (newid(), 1, 3, getdate()) waitfor delay
> '0:0:1'
> insert into [dbo].[Test] values (newid(), 2, 1, getdate()) waitfor delay
> '0:0:1'
> insert into [dbo].[Test] values (newid(), 3, 1, getdate()) waitfor delay
> '0:0:1'
> insert into [dbo].[Test] values (newid(), 1, 1, getdate()) waitfor delay
> '0:0:1'
> insert into [dbo].[Test] values (newid(), 2, 1, getdate()) waitfor delay
> '0:0:1'
> insert into [dbo].[Test] values (newid(), 1, 1, getdate()) waitfor delay
> '0:0:1'
> select t.*
> from [dbo].[Test] as t with (nolock)
> select t.*
> from [dbo].[Test] as t with (nolock)
> inner join (
> select [Field1] = t.[Field1],
> [Field2] = t.[Field2],
> [Date] = max(t.[Date])
> from [dbo].[Test] as t with (nolock)
> group by t.[Field1],
> t.[Field2]
> ) as q on q.[Field1] = t.[Field1]
> and q.[Field2] = t.[Field2]
> and q.[Date] = t.[Date]
>
> In essence, I'm trying to find the rows from Test that have the *maximum*
> Date for a unique Field1/Field2 combination. This query is used in a
> VIEW, and users of this VIEW typically JOIN on RowID.
> I'm sure the "guts" of this VIEW can be refactored to be better, but I
> don't know how to do it. Any suggestions would be *much* appreciated!
> John Peterson
>|||Hello Jens!
I think your suggestion is exactly what I came up with (see original
post) -- I was hoping maybe there was an even better way to express this.
"Jens Süßmeyer" <Jens@.Remove_this_For_Contacting.sqlserver2005.de> wrote in
message news:uCi7hBPeFHA.3880@.tk2msftngp13.phx.gbl...
> What about (didn´t try that but it should work)
> Select * From Test
> Inner Join
> (
> Select Field1,Field2,Max(Date)
> From Test
> Group by Field1,Field2
> ) SUbQuery
> on
> SUbQuery.Field1 = Test.FIeld1 AND
> SUbQuery.Field2 = Test.FIeld2 AND
> SUbQuery.Date = Test.Date
> Its alsways interesting for performance issues to send the execution plan
> with the post, use the SET SHOWPLAN_TEXT ON Statement before issuing the
> command and you´ll get the query execution plan in text displayes, which
> you can copy & paste in the newsgroups.
> HTH, Jens Suessmeyer.
> "John Peterson" <j0hnp@.comcast.net> schrieb im Newsbeitrag
> news:eM0id1OeFHA.3620@.TK2MSFTNGP09.phx.gbl...
>> (SQL Server 2000, SP4)
>> Hello all!
>> I'm seeing a lot of I/O generated by one of our queries, and I think it's
>> how the query is constructed. Consider the following:
>>
>> use [tempdb]
>> go
>> if (object_id('[dbo].[Test]') is not NULL) drop table [dbo].[Test]
>> create table [dbo].[Test]
>> (
>> [RowID] uniqueidentifier not NULL,
>> [Field1] int not NULL,
>> [Field2] int not NULL,
>> [Date] datetime not NULL,
>> primary key clustered ([RowID])
>> )
>> insert into [dbo].[Test] values (newid(), 1, 1, getdate()) waitfor delay
>> '0:0:1'
>> insert into [dbo].[Test] values (newid(), 1, 2, getdate()) waitfor delay
>> '0:0:1'
>> insert into [dbo].[Test] values (newid(), 1, 3, getdate()) waitfor delay
>> '0:0:1'
>> insert into [dbo].[Test] values (newid(), 2, 1, getdate()) waitfor delay
>> '0:0:1'
>> insert into [dbo].[Test] values (newid(), 3, 1, getdate()) waitfor delay
>> '0:0:1'
>> insert into [dbo].[Test] values (newid(), 1, 1, getdate()) waitfor delay
>> '0:0:1'
>> insert into [dbo].[Test] values (newid(), 2, 1, getdate()) waitfor delay
>> '0:0:1'
>> insert into [dbo].[Test] values (newid(), 1, 1, getdate()) waitfor delay
>> '0:0:1'
>> select t.*
>> from [dbo].[Test] as t with (nolock)
>> select t.*
>> from [dbo].[Test] as t with (nolock)
>> inner join (
>> select [Field1] = t.[Field1],
>> [Field2] = t.[Field2],
>> [Date] = max(t.[Date])
>> from [dbo].[Test] as t with (nolock)
>> group by t.[Field1],
>> t.[Field2]
>> ) as q on q.[Field1] = t.[Field1]
>> and q.[Field2] = t.[Field2]
>> and q.[Date] = t.[Date]
>>
>> In essence, I'm trying to find the rows from Test that have the *maximum*
>> Date for a unique Field1/Field2 combination. This query is used in a
>> VIEW, and users of this VIEW typically JOIN on RowID.
>> I'm sure the "guts" of this VIEW can be refactored to be better, but I
>> don't know how to do it. Any suggestions would be *much* appreciated!
>> John Peterson
>>
>|||John,
There is another way to do this, but I can not asure that it will give
better performance. you have to compare both execution plan.
select
t1.*
from
[dbo].[Test] as t1 with (nolock)
where
t1.[Date] = (
select
max(t2.[Date])
from
[dbo].[Test] as t2 with (nolock)
where
t2.[Field1] = t1.[Field1]
and t2.[Field2] = t1.[Field2]
)
AMB
"John Peterson" wrote:
> Hello Jens!
> I think your suggestion is exactly what I came up with (see original
> post) -- I was hoping maybe there was an even better way to express this.
>
> "Jens SÃ¼Ã?meyer" <Jens@.Remove_this_For_Contacting.sqlserver2005.de> wrote in
> message news:uCi7hBPeFHA.3880@.tk2msftngp13.phx.gbl...
> > What about (didnÂ´t try that but it should work)
> >
> > Select * From Test
> > Inner Join
> > (
> > Select Field1,Field2,Max(Date)
> > From Test
> > Group by Field1,Field2
> > ) SUbQuery
> > on
> > SUbQuery.Field1 = Test.FIeld1 AND
> > SUbQuery.Field2 = Test.FIeld2 AND
> > SUbQuery.Date = Test.Date
> >
> > Its alsways interesting for performance issues to send the execution plan
> > with the post, use the SET SHOWPLAN_TEXT ON Statement before issuing the
> > command and youÂ´ll get the query execution plan in text displayes, which
> > you can copy & paste in the newsgroups.
> >
> > HTH, Jens Suessmeyer.
> >
> > "John Peterson" <j0hnp@.comcast.net> schrieb im Newsbeitrag
> > news:eM0id1OeFHA.3620@.TK2MSFTNGP09.phx.gbl...
> >> (SQL Server 2000, SP4)
> >>
> >> Hello all!
> >>
> >> I'm seeing a lot of I/O generated by one of our queries, and I think it's
> >> how the query is constructed. Consider the following:
> >>
> >>
> >> use [tempdb]
> >> go
> >>
> >> if (object_id('[dbo].[Test]') is not NULL) drop table [dbo].[Test]
> >> create table [dbo].[Test]
> >> (
> >> [RowID] uniqueidentifier not NULL,
> >> [Field1] int not NULL,
> >> [Field2] int not NULL,
> >> [Date] datetime not NULL,
> >> primary key clustered ([RowID])
> >> )
> >>
> >> insert into [dbo].[Test] values (newid(), 1, 1, getdate()) waitfor delay
> >> '0:0:1'
> >> insert into [dbo].[Test] values (newid(), 1, 2, getdate()) waitfor delay
> >> '0:0:1'
> >> insert into [dbo].[Test] values (newid(), 1, 3, getdate()) waitfor delay
> >> '0:0:1'
> >> insert into [dbo].[Test] values (newid(), 2, 1, getdate()) waitfor delay
> >> '0:0:1'
> >> insert into [dbo].[Test] values (newid(), 3, 1, getdate()) waitfor delay
> >> '0:0:1'
> >> insert into [dbo].[Test] values (newid(), 1, 1, getdate()) waitfor delay
> >> '0:0:1'
> >> insert into [dbo].[Test] values (newid(), 2, 1, getdate()) waitfor delay
> >> '0:0:1'
> >> insert into [dbo].[Test] values (newid(), 1, 1, getdate()) waitfor delay
> >> '0:0:1'
> >>
> >> select t.*
> >> from [dbo].[Test] as t with (nolock)
> >>
> >> select t.*
> >> from [dbo].[Test] as t with (nolock)
> >> inner join (
> >> select [Field1] = t.[Field1],
> >> [Field2] = t.[Field2],
> >> [Date] = max(t.[Date])
> >> from [dbo].[Test] as t with (nolock)
> >> group by t.[Field1],
> >> t.[Field2]
> >> ) as q on q.[Field1] = t.[Field1]
> >> and q.[Field2] = t.[Field2]
> >> and q.[Date] = t.[Date]
> >>
> >>
> >> In essence, I'm trying to find the rows from Test that have the *maximum*
> >> Date for a unique Field1/Field2 combination. This query is used in a
> >> VIEW, and users of this VIEW typically JOIN on RowID.
> >>
> >> I'm sure the "guts" of this VIEW can be refactored to be better, but I
> >> don't know how to do it. Any suggestions would be *much* appreciated!
> >>
> >> John Peterson
> >>
> >>
> >
> >
>
>|||Sorry, but thats not exact the same, because you additionaly did a
correlated query
within your subselect and additionaly did and a join outside the query.
Bit the best thing to see the differences is to get the query plan for that
options.
HTH, Jens Suessmeyer.
"John Peterson" <j0hnp@.comcast.net> schrieb im Newsbeitrag
news:uHWKtFPeFHA.1288@.tk2msftngp13.phx.gbl...
> Hello Jens!
> I think your suggestion is exactly what I came up with (see original
> post) -- I was hoping maybe there was an even better way to express this.
>
> "Jens Süßmeyer" <Jens@.Remove_this_For_Contacting.sqlserver2005.de> wrote
> in message news:uCi7hBPeFHA.3880@.tk2msftngp13.phx.gbl...
>> What about (didn´t try that but it should work)
>> Select * From Test
>> Inner Join
>> (
>> Select Field1,Field2,Max(Date)
>> From Test
>> Group by Field1,Field2
>> ) SUbQuery
>> on
>> SUbQuery.Field1 = Test.FIeld1 AND
>> SUbQuery.Field2 = Test.FIeld2 AND
>> SUbQuery.Date = Test.Date
>> Its alsways interesting for performance issues to send the execution plan
>> with the post, use the SET SHOWPLAN_TEXT ON Statement before issuing the
>> command and you´ll get the query execution plan in text displayes, which
>> you can copy & paste in the newsgroups.
>> HTH, Jens Suessmeyer.
>> "John Peterson" <j0hnp@.comcast.net> schrieb im Newsbeitrag
>> news:eM0id1OeFHA.3620@.TK2MSFTNGP09.phx.gbl...
>> (SQL Server 2000, SP4)
>> Hello all!
>> I'm seeing a lot of I/O generated by one of our queries, and I think
>> it's how the query is constructed. Consider the following:
>>
>> use [tempdb]
>> go
>> if (object_id('[dbo].[Test]') is not NULL) drop table [dbo].[Test]
>> create table [dbo].[Test]
>> (
>> [RowID] uniqueidentifier not NULL,
>> [Field1] int not NULL,
>> [Field2] int not NULL,
>> [Date] datetime not NULL,
>> primary key clustered ([RowID])
>> )
>> insert into [dbo].[Test] values (newid(), 1, 1, getdate()) waitfor delay
>> '0:0:1'
>> insert into [dbo].[Test] values (newid(), 1, 2, getdate()) waitfor delay
>> '0:0:1'
>> insert into [dbo].[Test] values (newid(), 1, 3, getdate()) waitfor delay
>> '0:0:1'
>> insert into [dbo].[Test] values (newid(), 2, 1, getdate()) waitfor delay
>> '0:0:1'
>> insert into [dbo].[Test] values (newid(), 3, 1, getdate()) waitfor delay
>> '0:0:1'
>> insert into [dbo].[Test] values (newid(), 1, 1, getdate()) waitfor delay
>> '0:0:1'
>> insert into [dbo].[Test] values (newid(), 2, 1, getdate()) waitfor delay
>> '0:0:1'
>> insert into [dbo].[Test] values (newid(), 1, 1, getdate()) waitfor delay
>> '0:0:1'
>> select t.*
>> from [dbo].[Test] as t with (nolock)
>> select t.*
>> from [dbo].[Test] as t with (nolock)
>> inner join (
>> select [Field1] = t.[Field1],
>> [Field2] = t.[Field2],
>> [Date] = max(t.[Date])
>> from [dbo].[Test] as t with (nolock)
>> group by t.[Field1],
>> t.[Field2]
>> ) as q on q.[Field1] = t.[Field1]
>> and q.[Field2] = t.[Field2]
>> and q.[Date] = t.[Date]
>>
>> In essence, I'm trying to find the rows from Test that have the
>> *maximum* Date for a unique Field1/Field2 combination. This query is
>> used in a VIEW, and users of this VIEW typically JOIN on RowID.
>> I'm sure the "guts" of this VIEW can be refactored to be better, but I
>> don't know how to do it. Any suggestions would be *much* appreciated!
>> John Peterson
>>
>>
>|||Thanks Alejandro! I'll compare/contrast this technique. Typically, a
subselect within a field has performed more slowly for me -- but not every
case is the same. I'll do as you suggest and examine the Excecution Plans.
"Alejandro Mesa" <AlejandroMesa@.discussions.microsoft.com> wrote in message
news:EA463D89-3CD7-474A-8DAD-6638C9B4066D@.microsoft.com...
> John,
> There is another way to do this, but I can not asure that it will give
> better performance. you have to compare both execution plan.
> select
> t1.*
> from
> [dbo].[Test] as t1 with (nolock)
> where
> t1.[Date] = (
> select
> max(t2.[Date])
> from
> [dbo].[Test] as t2 with (nolock)
> where
> t2.[Field1] = t1.[Field1]
> and t2.[Field2] = t1.[Field2]
> )
>
> AMB
>
>
> "John Peterson" wrote:
>> Hello Jens!
>> I think your suggestion is exactly what I came up with (see original
>> post) -- I was hoping maybe there was an even better way to express this.
>>
>> "Jens Süßmeyer" <Jens@.Remove_this_For_Contacting.sqlserver2005.de> wrote
>> in
>> message news:uCi7hBPeFHA.3880@.tk2msftngp13.phx.gbl...
>> > What about (didn´t try that but it should work)
>> >
>> > Select * From Test
>> > Inner Join
>> > (
>> > Select Field1,Field2,Max(Date)
>> > From Test
>> > Group by Field1,Field2
>> > ) SUbQuery
>> > on
>> > SUbQuery.Field1 = Test.FIeld1 AND
>> > SUbQuery.Field2 = Test.FIeld2 AND
>> > SUbQuery.Date = Test.Date
>> >
>> > Its alsways interesting for performance issues to send the execution
>> > plan
>> > with the post, use the SET SHOWPLAN_TEXT ON Statement before issuing
>> > the
>> > command and you´ll get the query execution plan in text displayes,
>> > which
>> > you can copy & paste in the newsgroups.
>> >
>> > HTH, Jens Suessmeyer.
>> >
>> > "John Peterson" <j0hnp@.comcast.net> schrieb im Newsbeitrag
>> > news:eM0id1OeFHA.3620@.TK2MSFTNGP09.phx.gbl...
>> >> (SQL Server 2000, SP4)
>> >>
>> >> Hello all!
>> >>
>> >> I'm seeing a lot of I/O generated by one of our queries, and I think
>> >> it's
>> >> how the query is constructed. Consider the following:
>> >>
>> >>
>> >> use [tempdb]
>> >> go
>> >>
>> >> if (object_id('[dbo].[Test]') is not NULL) drop table [dbo].[Test]
>> >> create table [dbo].[Test]
>> >> (
>> >> [RowID] uniqueidentifier not NULL,
>> >> [Field1] int not NULL,
>> >> [Field2] int not NULL,
>> >> [Date] datetime not NULL,
>> >> primary key clustered ([RowID])
>> >> )
>> >>
>> >> insert into [dbo].[Test] values (newid(), 1, 1, getdate()) waitfor
>> >> delay
>> >> '0:0:1'
>> >> insert into [dbo].[Test] values (newid(), 1, 2, getdate()) waitfor
>> >> delay
>> >> '0:0:1'
>> >> insert into [dbo].[Test] values (newid(), 1, 3, getdate()) waitfor
>> >> delay
>> >> '0:0:1'
>> >> insert into [dbo].[Test] values (newid(), 2, 1, getdate()) waitfor
>> >> delay
>> >> '0:0:1'
>> >> insert into [dbo].[Test] values (newid(), 3, 1, getdate()) waitfor
>> >> delay
>> >> '0:0:1'
>> >> insert into [dbo].[Test] values (newid(), 1, 1, getdate()) waitfor
>> >> delay
>> >> '0:0:1'
>> >> insert into [dbo].[Test] values (newid(), 2, 1, getdate()) waitfor
>> >> delay
>> >> '0:0:1'
>> >> insert into [dbo].[Test] values (newid(), 1, 1, getdate()) waitfor
>> >> delay
>> >> '0:0:1'
>> >>
>> >> select t.*
>> >> from [dbo].[Test] as t with (nolock)
>> >>
>> >> select t.*
>> >> from [dbo].[Test] as t with (nolock)
>> >> inner join (
>> >> select [Field1] = t.[Field1],
>> >> [Field2] = t.[Field2],
>> >> [Date] = max(t.[Date])
>> >> from [dbo].[Test] as t with (nolock)
>> >> group by t.[Field1],
>> >> t.[Field2]
>> >> ) as q on q.[Field1] = t.[Field1]
>> >> and q.[Field2] = t.[Field2]
>> >> and q.[Date] = t.[Date]
>> >>
>> >>
>> >> In essence, I'm trying to find the rows from Test that have the
>> >> *maximum*
>> >> Date for a unique Field1/Field2 combination. This query is used in a
>> >> VIEW, and users of this VIEW typically JOIN on RowID.
>> >>
>> >> I'm sure the "guts" of this VIEW can be refactored to be better, but I
>> >> don't know how to do it. Any suggestions would be *much* appreciated!
>> >>
>> >> John Peterson
>> >>
>> >>
>> >
>> >
>>|||I don't mean to be obtuse, but I must be missing something. Your query and
my original query appear to be identical.
Here's what I had originally crafted:
select t.*
from [dbo].[Test] as t with (nolock)
inner join (
select [Field1] = t.[Field1],
[Field2] = t.[Field2],
[Date] = max(t.[Date])
from [dbo].[Test] as t with (nolock)
group by t.[Field1],
t.[Field2]
) as q on q.[Field1] = t.[Field1]
and q.[Field2] = t.[Field2]
and q.[Date] = t.[Date]
And here's what you suggested:
Select * From Test
Inner Join
(
Select Field1,Field2,Max(Date)
From Test
Group by Field1,Field2
) SUbQuery
on
SUbQuery.Field1 = Test.FIeld1 AND
SUbQuery.Field2 = Test.FIeld2 AND
SUbQuery.Date = Test.Date
Formatting aside, those seem *identical* to me. Or is there some subtle
difference that I don't perceive that might influence the performance?
"Jens Süßmeyer" <Jens@.Remove_this_For_Contacting.sqlserver2005.de> wrote in
message news:%238a5LuPeFHA.2984@.TK2MSFTNGP15.phx.gbl...
> Sorry, but thats not exact the same, because you additionaly did a
> correlated query
> within your subselect and additionaly did and a join outside the query.
> Bit the best thing to see the differences is to get the query plan for
> that options.
> HTH, Jens Suessmeyer.
>
> "John Peterson" <j0hnp@.comcast.net> schrieb im Newsbeitrag
> news:uHWKtFPeFHA.1288@.tk2msftngp13.phx.gbl...
>> Hello Jens!
>> I think your suggestion is exactly what I came up with (see original
>> post) -- I was hoping maybe there was an even better way to express this.
>>
>> "Jens Süßmeyer" <Jens@.Remove_this_For_Contacting.sqlserver2005.de> wrote
>> in message news:uCi7hBPeFHA.3880@.tk2msftngp13.phx.gbl...
>> What about (didn´t try that but it should work)
>> Select * From Test
>> Inner Join
>> (
>> Select Field1,Field2,Max(Date)
>> From Test
>> Group by Field1,Field2
>> ) SUbQuery
>> on
>> SUbQuery.Field1 = Test.FIeld1 AND
>> SUbQuery.Field2 = Test.FIeld2 AND
>> SUbQuery.Date = Test.Date
>> Its alsways interesting for performance issues to send the execution
>> plan with the post, use the SET SHOWPLAN_TEXT ON Statement before
>> issuing the command and you´ll get the query execution plan in text
>> displayes, which you can copy & paste in the newsgroups.
>> HTH, Jens Suessmeyer.
>> "John Peterson" <j0hnp@.comcast.net> schrieb im Newsbeitrag
>> news:eM0id1OeFHA.3620@.TK2MSFTNGP09.phx.gbl...
>> (SQL Server 2000, SP4)
>> Hello all!
>> I'm seeing a lot of I/O generated by one of our queries, and I think
>> it's how the query is constructed. Consider the following:
>>
>> use [tempdb]
>> go
>> if (object_id('[dbo].[Test]') is not NULL) drop table [dbo].[Test]
>> create table [dbo].[Test]
>> (
>> [RowID] uniqueidentifier not NULL,
>> [Field1] int not NULL,
>> [Field2] int not NULL,
>> [Date] datetime not NULL,
>> primary key clustered ([RowID])
>> )
>> insert into [dbo].[Test] values (newid(), 1, 1, getdate()) waitfor
>> delay '0:0:1'
>> insert into [dbo].[Test] values (newid(), 1, 2, getdate()) waitfor
>> delay '0:0:1'
>> insert into [dbo].[Test] values (newid(), 1, 3, getdate()) waitfor
>> delay '0:0:1'
>> insert into [dbo].[Test] values (newid(), 2, 1, getdate()) waitfor
>> delay '0:0:1'
>> insert into [dbo].[Test] values (newid(), 3, 1, getdate()) waitfor
>> delay '0:0:1'
>> insert into [dbo].[Test] values (newid(), 1, 1, getdate()) waitfor
>> delay '0:0:1'
>> insert into [dbo].[Test] values (newid(), 2, 1, getdate()) waitfor
>> delay '0:0:1'
>> insert into [dbo].[Test] values (newid(), 1, 1, getdate()) waitfor
>> delay '0:0:1'
>> select t.*
>> from [dbo].[Test] as t with (nolock)
>> select t.*
>> from [dbo].[Test] as t with (nolock)
>> inner join (
>> select [Field1] = t.[Field1],
>> [Field2] = t.[Field2],
>> [Date] = max(t.[Date])
>> from [dbo].[Test] as t with (nolock)
>> group by t.[Field1],
>> t.[Field2]
>> ) as q on q.[Field1] = t.[Field1]
>> and q.[Field2] = t.[Field2]
>> and q.[Date] = t.[Date]
>>
>> In essence, I'm trying to find the rows from Test that have the
>> *maximum* Date for a unique Field1/Field2 combination. This query is
>> used in a VIEW, and users of this VIEW typically JOIN on RowID.
>> I'm sure the "guts" of this VIEW can be refactored to be better, but I
>> don't know how to do it. Any suggestions would be *much* appreciated!
>> John Peterson
>>
>>
>>
>|||Hi John,
The query you posted should perform just fine, provided you add the
proper indexes. It would definitely be desirable to have an index on
Test(Field1,Field2) or even on Test(Field1,Field2,"Date").
HTH,
Gert-Jan
John Peterson wrote:
> (SQL Server 2000, SP4)
> Hello all!
> I'm seeing a lot of I/O generated by one of our queries, and I think it's
> how the query is constructed. Consider the following:
> use [tempdb]
> go
> if (object_id('[dbo].[Test]') is not NULL) drop table [dbo].[Test]
> create table [dbo].[Test]
> (
> [RowID] uniqueidentifier not NULL,
> [Field1] int not NULL,
> [Field2] int not NULL,
> [Date] datetime not NULL,
> primary key clustered ([RowID])
> )
> insert into [dbo].[Test] values (newid(), 1, 1, getdate()) waitfor delay
> '0:0:1'
> insert into [dbo].[Test] values (newid(), 1, 2, getdate()) waitfor delay
> '0:0:1'
> insert into [dbo].[Test] values (newid(), 1, 3, getdate()) waitfor delay
> '0:0:1'
> insert into [dbo].[Test] values (newid(), 2, 1, getdate()) waitfor delay
> '0:0:1'
> insert into [dbo].[Test] values (newid(), 3, 1, getdate()) waitfor delay
> '0:0:1'
> insert into [dbo].[Test] values (newid(), 1, 1, getdate()) waitfor delay
> '0:0:1'
> insert into [dbo].[Test] values (newid(), 2, 1, getdate()) waitfor delay
> '0:0:1'
> insert into [dbo].[Test] values (newid(), 1, 1, getdate()) waitfor delay
> '0:0:1'
> select t.*
> from [dbo].[Test] as t with (nolock)
> select t.*
> from [dbo].[Test] as t with (nolock)
> inner join (
> select [Field1] = t.[Field1],
> [Field2] = t.[Field2],
> [Date] = max(t.[Date])
> from [dbo].[Test] as t with (nolock)
> group by t.[Field1],
> t.[Field2]
> ) as q on q.[Field1] = t.[Field1]
> and q.[Field2] = t.[Field2]
> and q.[Date] = t.[Date]
> In essence, I'm trying to find the rows from Test that have the *maximum*
> Date for a unique Field1/Field2 combination. This query is used in a VIEW,
> and users of this VIEW typically JOIN on RowID.
> I'm sure the "guts" of this VIEW can be refactored to be better, but I don't
> know how to do it. Any suggestions would be *much* appreciated!
> John Peterson

Monday, March 12, 2012

Need query optimization help.

No comments:

Post a Comment

XML Bulk Load

Blog Archive

About Me