Monday, March 26, 2012

need suggestions for indexing

I have a table that contains fields such as:
----
--
CustomerID ProductName SmallDateTime AnotherID1 AnotherID2
Amount
----
--
The data comes from another database that throws data over to this table,
and so no field in the above is a foreign key or primary key.
The ID fields are NOT going to be unique, even when combined with the
smalldatetime field.
The table contains millions of records and my stored proc needs to execute
as fast as possible. This is a mission-critical and speed-sensitive
environment.
MY QUESTION: What index(es) should I use to optimize this proc...
Create Procedure as GetCustomerProducts
@.CustomerID int,
@.ProductName varchar(10),
as
Select [field names here]
from [the-above-shown-table]
where CustomerID = @.CustomerID
and ProductName = @.ProductName
Order by ProductName, AnotherID1, AnotherID2
go
The SP will return anywhere from 1 to 100,000 records, depending upon the
two input params. I need the results "instantly". Hardware is top-notch.
Index suggestions?I would start with a clustered index on CustomerID. Try the query and
measure the results then add a nonclustered on ProductName and see if it
speeds up or slows down. Alternatively, feed the query to the index tuning
wizard and experiment with the results.
Walter
"HK" <replywithingroup@.notreal.com> wrote in message
news:let9f.3242$Hs.3087@.tornado.socal.rr.com...
>I have a table that contains fields such as:
> ----
> --
> CustomerID ProductName SmallDateTime AnotherID1 AnotherID2
> Amount
> ----
> --
> The data comes from another database that throws data over to this table,
> and so no field in the above is a foreign key or primary key.
> The ID fields are NOT going to be unique, even when combined with the
> smalldatetime field.
> The table contains millions of records and my stored proc needs to execute
> as fast as possible. This is a mission-critical and speed-sensitive
> environment.
> MY QUESTION: What index(es) should I use to optimize this proc...
> Create Procedure as GetCustomerProducts
> @.CustomerID int,
> @.ProductName varchar(10),
> as
> Select [field names here]
> from [the-above-shown-table]
> where CustomerID = @.CustomerID
> and ProductName = @.ProductName
> Order by ProductName, AnotherID1, AnotherID2
> go
>
> The SP will return anywhere from 1 to 100,000 records, depending upon the
> two input params. I need the results "instantly". Hardware is
> top-notch.
> Index suggestions?
>
>|||depending on how you load the data and how many changes
how about this...
you could make one big composite clustered with all the columns in order of
most restrictive(unique) to least
and make a non clustered on CustomerID and Product name in same order
"Walt Mallon" <waltmallon@.hotmail.nospam.com> wrote in message
news:uzX6yal3FHA.3876@.TK2MSFTNGP09.phx.gbl...
>I would start with a clustered index on CustomerID. Try the query and
>measure the results then add a nonclustered on ProductName and see if it
>speeds up or slows down. Alternatively, feed the query to the index tuning
>wizard and experiment with the results.
> Walter
>
> "HK" <replywithingroup@.notreal.com> wrote in message
> news:let9f.3242$Hs.3087@.tornado.socal.rr.com...
>>I have a table that contains fields such as:
>> ----
>> --
>> CustomerID ProductName SmallDateTime AnotherID1 AnotherID2
>> Amount
>> ----
>> --
>> The data comes from another database that throws data over to this table,
>> and so no field in the above is a foreign key or primary key.
>> The ID fields are NOT going to be unique, even when combined with the
>> smalldatetime field.
>> The table contains millions of records and my stored proc needs to
>> execute
>> as fast as possible. This is a mission-critical and speed-sensitive
>> environment.
>> MY QUESTION: What index(es) should I use to optimize this proc...
>> Create Procedure as GetCustomerProducts
>> @.CustomerID int,
>> @.ProductName varchar(10),
>> as
>> Select [field names here]
>> from [the-above-shown-table]
>> where CustomerID = @.CustomerID
>> and ProductName = @.ProductName
>> Order by ProductName, AnotherID1, AnotherID2
>> go
>>
>> The SP will return anywhere from 1 to 100,000 records, depending upon the
>> two input params. I need the results "instantly". Hardware is
>> top-notch.
>> Index suggestions?
>>
>>
>|||HK wrote:
> I have a table that contains fields such as:
> ----
> --
> CustomerID ProductName SmallDateTime AnotherID1
> AnotherID2 Amount
> ----
> --
> The data comes from another database that throws data over to this
> table, and so no field in the above is a foreign key or primary key.
> The ID fields are NOT going to be unique, even when combined with the
> smalldatetime field.
> The table contains millions of records and my stored proc needs to
> execute as fast as possible. This is a mission-critical and
> speed-sensitive environment.
> MY QUESTION: What index(es) should I use to optimize this proc...
> Create Procedure as GetCustomerProducts
> @.CustomerID int,
> @.ProductName varchar(10),
> as
> Select [field names here]
> from [the-above-shown-table]
> where CustomerID = @.CustomerID
> and ProductName = @.ProductName
> Order by ProductName, AnotherID1, AnotherID2
> go
>
> The SP will return anywhere from 1 to 100,000 records, depending upon
> the two input params. I need the results "instantly". Hardware is
> top-notch.
> Index suggestions?
I would add an identity column to the table and create a unique
clustered index on CustomerID + ProductName + ID. That way, you can
remove rows. It's no elegant (but you have no keys and that's already
not so elegant), but it does prevent you from having to duplicate any
clustered index keys in the non-clustered indexes and prevents SQL
Server from adding unique identifiers to all non-duplicate rows in a
non-unique clustered index.
The Order By clause confuses me. Why do you need to sort on the
ProductName column when you are querying for a single ProductName?
I would eliminate the ORDER BY if possible. Most queries don't require
sorting on SQL Server as the app can handle this. If you must have
sorting, then I would remove the ProductName and do two things:
1- Check performance of the queries that return the small, medium, and
larger result sets. If the Sort operation is not consuming too much
overhead for the larger result sets, then leave the indexing as is.
2- If the Sort operation is consuming too much overhead, add the
AnotherID1 + AnotherID2 to the clustered index before the ID column.
Also, you have not provided all the necessary detail for the query. You
left out the column names, which are important to know when designing
queries. For example, there's generally littlle reason to request a
column you are filtering against since you already know the answer.
Also, if you can design covering indexes, you can speed up queries by
preventing bookmark lookup operations. Not the case here as we only have
a single unique, clustered index.
--
David Gugick
Quest Software
www.imceda.com
www.quest.com

No comments:

Post a Comment