Tuesday, March 27, 2012

Help with blocking

Hi all,
I was hoping I could get some help from you all about a blocking
problem that is making me tear my hair out. I've got a client process
that is executing two DELETE statements hourly during our ten hour
production day. The first DELETE statement cleans out "expired"
entries from Table A and the second cleans out "expired" entries from
Table B.
Under normal circumstances, each of these queries executes in a few
seconds. About once a week the second query which deletes from Table B
runs long (2-3 minutes) though it does finish by itself. This creates
tons of blocking as both TABLEA and B are under heavy SELECT, INSERT,
and UPDATE activity. Here are the queries (sorta)...
Query1 - DELETE FROM TABLEA WHERE [FieldModified] < (GetDate() - '00:05:00')
Query2 - DELETE FROM TABLEB WHERE [FieldGUID] NOT IN (SELECT [FieldGUID] FROM TABLEA)
(note: FieldGUID is the PK for TABLEA)
I've checked the execution plan for the second query and I know that
it is going to perform a table scan. I also know the row counts for
Tables A and B and what my I/O subsystem can do, and I can say for certain that
the second query should not run for more than a few seconds.
I've checked sysprocesses while this second query is running and
consistently find that it is not blocked, is not waiting, the [status]
column shows sleeping, open_tran column equals 2, cpu and physical_io
columns show low values and do change slowly over time. I have also
checked my normal metrics that I capture throughout the day and can't
find a problem. My server's CPU usage is low, disk queues are low to
nonexistent, low to no paging going on, no other processes running,
etc.
I can't for the life of me figure out what causes this second query to
consistently (but infrequently) run long. I'm running SQL 2000 SP3a
Enterprise Edition on a 2-node Active/Passive cluster. Each cluster
node has 8GB of memory and 4x2.8GHz CPUs.
Any and all help is greatly appreciated. Thanks.
Steve
|||
With "implicit_transactions" on by default, your #1 query is enclosed within
a transaction. Thus, it holds the locks until the delete completes.
Your #2 query's SELECT statement is blocked at this point. That is probably
desired, because you want to be sure of the data in TableA.
So, to improve this, I would suggest putting an index on TableA's
[FieldModified] column. Also, consider using NOT EXISTS instead.
Now, if #2 is nothing more than a maintenance step which will run again, you
could use a READPAST hint (i.e. skip locked rows).
e.g.
DELETE FROM TABLEB
WHERE [FieldGUID] NOT IN (SELECT [FieldGUID] FROM TABLEA WITH (READPAST))
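(For illustration, a minimal sketch of the index and NOT EXISTS rewrite suggested
above; the index name is made up and the real schema may differ:)
-- Assumed index name; this supports the range predicate in query #1.
CREATE INDEX IX_TABLEA_FieldModified ON TABLEA ([FieldModified])
-- NOT EXISTS form of query #2; equivalent to NOT IN here because [FieldGUID]
-- is TABLEA's primary key and therefore never NULL.
DELETE FROM TABLEB
WHERE NOT EXISTS (SELECT 1 FROM TABLEA
                  WHERE TABLEA.[FieldGUID] = TABLEB.[FieldGUID])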
-oj
|||
Hi,
Did you check lastwaittype and waitresource in sysprocesses? They might give
you a hint.
We had some problems with UPDATE statements in some queries when max degree of
parallelism was set to 0 (all available processors). The queries ran with no
problems when we used OPTION (MAXDOP 1) or when we set max degree of
parallelism to a fixed value (number of processors - 1). This is just a
guess; our server has 8 CPUs.
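(As a sketch, a statement-level hint to disable parallelism on the second delete
might look like this; the query text follows the thread and is otherwise
unchanged:)
DELETE FROM TABLEB
WHERE [FieldGUID] NOT IN (SELECT [FieldGUID] FROM TABLEA)
OPTION (MAXDOP 1)  -- limit this statement to a single CPU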
Also check your log file growth settings. If your transaction log file is
large and auto-growth is enabled as a percentage (10% etc.), you may need
to consider changing the log auto-growth and auto-shrink settings as
well.
"steveg@.payerpath.com" wrote:
> Hi all,
> I was hoping I could get some help from you all about a blocking
> problem that is making me tear my hair out. I've got a client process
> that is executing two DELETE statements hourly during our ten hour
> production day. The first DELETE statement cleans out "expired"
> entries from Table A and the second cleans out "expired" entries from
> Table B.
> Under normal circumstances, each of these queries executes in a few
> seconds. About once a week the second query which deletes from Table B
> runs long (2-3 minutes) though it does finish by itself. This creates
> tons of blocking as both TABLEA and B are under heavy SELECT, INSERT,
> and UPDATE activity. Here are the queries (sorta)...
> Query1 - DELETE FROM TABLEA WHERE [FieldModified] < (GetDate() -
> '00:05:00')
> Query2 - DELETE FROM TABLEB WHERE [FieldGUID] NOT IN (SELECT
> [FieldGUID] FROM TABLEA)
> (note: FieldGUID is the PK for TABLEA)
>
> I've checked the execution plan for the second query and I know that's
> it is going to perform a table scan. I also know the row counts for
> Tables A and B and my I/O subsystem and and can say for certain that
> the second query should not run for more than a few seconds.
> I've checked sysprocesses while this second query is running and
> consistently find that it is not blocked, is not waiting, the [status]
> column shows sleeping, open_tran column equals 2, cpu and physical_io
> columns show low values and do change slowly over time. I have also
> checked my normal metrics that I capture throughout the day and can't
> find a problem. My server's CPU usage is low, disk queues are low to
> nonexistent, low to no paging going on, no other processes running,
> etc.
> I can't for the life of me figure out what causes this second query to
> consistently (but infrequently) run long. I'm running SQL 2000 SP3a
> Enterprise Edition on a 2-node Active/Passive cluster. Each cluster
> node has 8GB of memory and 4x2.8GHz CPUs.
> Any and all help is greatly appreciated. Thanks.
> Steve
|||
oj,
The client that sends these two statements sends them in the same order
every time and won't send the second statement until the first has
completed. At a high level the code looks like the following, so I
don't think the possibility you outlined is what is actually occurring.
try
    prepare first delete
    execute first delete
    prepare second delete
    execute second delete
catch
    error handling logic
Steve
|||
Sysprocesses shows waittype=0x0000, waittime=0, and lastwaittype is
PAGELATCH_SH or PAGELATCH_EX. What I get from that information is that
the DELETE statement that causes the trouble isn't waiting on anything
but that the last time it did wait was for PAGELATCH_SH or
PAGELATCH_EX.
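(For anyone following along, a minimal sketch of this kind of sysprocesses
check; the SPID value is a placeholder:)
SELECT spid, ecid, blocked, waittype, waittime, lastwaittype,
       status, open_tran, cpu, physical_io
FROM master..sysprocesses
WHERE spid = 123   -- placeholder: the SPID running the problem DELETE
ORDER BY ecid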
I too have had problems with max degree of parallelism, which is why I
have it configured at the server level to be half the number of logical
CPUs in the server. I can also see in sysprocesses, when the problem
DELETE is running, that it is not running in parallel, as there is only
one row for the problem SPID.
I have checked the file growth settings for both my database and
transaction log. Both TableA and TableB are in the same database whose
size is very stable throughout the day. This database is also set up
in Simple recovery mode so the transaction log stays mostly empty.
Steve
|||
I see.
Just reread your post. You might be experiencing this:
http://support.microsoft.com/kb/329906
-oj
|||
Also, there have been some *nasty* bugs with parallelism (some of which are
fixed in SQL 2000 SP4).
http://support.microsoft.com/search/default.aspx?query=sql+maxdop
I believe Umut is right on target here.
http://support.microsoft.com/kb/837983
Some info on upcoming fixes:
http://support.microsoft.com/kb/888799
-oj
|||
I don't think the DELETE is running as a parallel process because while
it's running I only see one row for it in sysprocesses. For parallel
queries I would expect to see multiple entries in sysprocesses for the
same SPID with different values in the ecid column for each process.
Steve
|||
Hi Steve,
PAGELATCH_SH or PAGELATCH_EX explanation:
Contention can be caused by issues other than IO or memory performance, for
example, heavy concurrent inserts into the same index range can cause this
type of contention. If a lot of inserts need to be placed on the same page
they are serialized using the latch. A lot of inserts into the same range
can also cause page splits in the index which will hold onto the latch while
allocating a new page (this can take a while). Any read accesses to the same
range as the inserts would also conflict on the latches. (from Tom Davidson
article)
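(As a quick server-level look at latch waits, DBCC SQLPERF(WAITSTATS) can be
snapshotted before and after the problem window; a sketch, noting that the
CLEAR option resets server-wide counters:)
DBCC SQLPERF(WAITSTATS, CLEAR)   -- optional: reset cumulative wait statistics
-- ...wait for the problem window, then:
DBCC SQLPERF(WAITSTATS)          -- report waits by type, including PAGELATCH_*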
I recommend Tom Davidson's article "Opening Performance Tuning Toolbox"
from SQLMag. I use it all the time and it helps me a lot.
You did not give any details about how many rows the table has and how
many rows are being deleted. If this is a batch process, consider doing
it during non-business hours. If it's part of your application and a lot
of records are affected by the delete, then I think you need to change
your code.
If only a few records are affected by the delete, you could try a cursor
and delete them one by one, but I don't recommend this if it's a batch
process and a lot of records are affected.
"steveg@.payerpath.com" wrote:
> Sysprocesses shows waittype=0x0000, waittime=0, and lastwaittype is
> PAGELATCH_SH or PAGELATCH_EX. What I get from that information is that
> the DELETE statement that causes the trouble isn't waiting on anything
> but that the last time it did wait was for PAGELATCH_SH or
> PAGELATCH_EX.
> I too have had problems with max degrees of parallelism which is why I
> have it configured at the server level to be half the number of logical
> CPUs in the server. I can also see in sysprocesses when the problem
> DELETE is running that it is not running in parallel as there is only
> one row for the problem SPID.
> I have checked the file growth settings for both my database and
> transaction log. Both TableA and TableB are in the same database whose
> size is very stable throughout the day. This database is also set up
> in Simple recovery mode so the transaction log stays mostly empty.
> Steve
|||
I don't think parallel queries cause multiple rows in sysprocesses. I only
see multiple entries in sysprocesses related to the same SPID when lightweight
pooling is set to 1 (NT fibers).
FYI: Parallel Process wait type is CXPACKET
"steveg@.payerpath.com" wrote:
> I don't thing the DELETE is running as a parallel process because while
> it's running I only see one row for it in sysprocesses. For parallel
> queries I would expect to see multiple entries in sysprocesses for the
> same SPID with different values in the ecid column for each process.
> Steve
|||
Umut,
I'm pretty sure that parallel queries do spawn multiple rows in
sysprocesses. I see them from time to time on large queries.
Sysprocesses will show, for example, 5 rows for the same SPID. The
ecid column in sysprocesses would be 0-4 for these rows and indicates
the execution context that each thread is running under. We're not
running on NT fibers, so I always *assumed* that these were from
parallel executions.
Anyway, I don't mean to turn my back on your suggestion. I don't think
my problem is related to parallel execution, but I also don't know what
it is related to, which is why I'm here soliciting suggestions from this
forum. I'll get the developers to add a query hint to disable
parallelism. I don't think it will fix the problem but it's worth
trying. Thanks for your help.
Steve
|||
Umut,
Thanks for the quote from Tom Davidson's article.
As far as row counts for these tables goes...
TableA holds one row for every user logged into our website: nominally
around 100 rows, probably peaking at about 500.
TableB holds information related to what the user is doing on the
website. This table is highly variable but is certainly larger than
TableA. It probably runs between 20,000 and 200,000 rows.
This process that executes the two DELETE statements is essentially
"expiring" user sessions which have been inactive for some period of
time. We want it to be running during production hours. As I
mentioned in my original post, these two queries normally (98%) run for
about 5 seconds and cause no problems. However, about 2% of the time
the second query appears to "hang" and finally completes after 2-3
minutes of processing.
While this second DELETE statement is "hung", sysprocesses shows that
it is sleeping. What would make SQL sleep this SPID on a simple
DELETE? I understand that it can happen when SQL is waiting for the
application to provide some feedback but this is a simple DELETE; there
shouldn't be any "back-and-forth" interaction between the client and
server that I'm aware of. Can anyone explain why a simple DELETE
statement would go to sleep?
Steve
|||
Hi Steve,
Did you by chance take any measurements of the number of records that
TABLEB has when the delete statement runs? What I mean is, from your message I
understand that the number of records in TableB varies, so sometimes you are
deleting 1,000 records and sometimes you are deleting 199,000 records, etc.
Did you try a delete like the following?
DELETE FROM TABLEB
WHERE NOT EXISTS (SELECT NULL FROM TABLEA WHERE TABLEB.col1 = TABLEA.col1)
(i.e., did you test NOT EXISTS?)
Regards..
"steveg@.payerpath.com" wrote:
> Umut,
> Thanks for the quote from Tom Davidson's article.
> As far as row counts for these tables goes...
> TableA holds one row for every user logged into our website. Nominally
> 100 and probably goes up to about 500.
> TableB holds information related to what what the user is doing on the
> website. This table is highly variable but is certainly larger than
> TableA. It proabbly runs between 20,000 - 200,000 rows.
> This process that executes the two DELETE statements is essentially
> "expiring" user sessions which have been inactive for some period of
> time. We want it to be running during production hours. As I
> mentioned in my original post, these two queries normally (98%) run for
> about 5 seconds and cause no problems. However, about 2% of the time
> the second query appears to "hang" and finally completes after 2-3
> minutes of processing.
> While this second DELETE statement is "hung", sysprocesses shows that
> it is sleeping. What would make SQL sleep this SPID on a simple
> DELETE? I understand that it can happen when SQL is waiting for the
> application to provide some feedback but this is a simple DELETE; there
> shouldn't be any "back-and-forth" interaction between the client and
> server that I'm aware of. Can anyone explain why a simple DELETE
> statement would go to sleep?
> Steve
|||
On 8 Feb 2005 05:32:33 -0800, steveg@.payerpath.com wrote:
(snip)
>While this second DELETE statement is "hung", sysprocesses shows that
>it is sleeping. What would make SQL sleep this SPID on a simple
>DELETE? I understand that it can happen when SQL is waiting for the
>application to provide some feedback but this is a simple DELETE; there
>shouldn't be any "back-and-forth" interaction between the client and
>server that I'm aware of. Can anyone explain why a simple DELETE
>statement would go to sleep?
Hi Steve,
I realize it's mentioned before in this thread, but this really sounds as
if either tempdb or the log file (probably the latter) is autogrowing.
What is the average number of rows affected by this second delete?
If I were you, I'd check that autoshrink is not accidentally enabled for
this database, manually grow the log file to twice its current size, and
check if this helps.
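(A sketch of those two checks; the database name and logical log file name are
placeholders, not the actual names from the thread:)
-- Is auto-shrink on? (1 = yes, 0 = no)
SELECT DATABASEPROPERTYEX('MyUserDb', 'IsAutoShrink')
-- Pre-grow the log file manually (placeholder logical file name and size).
ALTER DATABASE MyUserDb MODIFY FILE (NAME = MyUserDb_log, SIZE = 2000MB)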
Best, Hugo
--
(Remove _NO_ and _SPAM_ to get my e-mail address)
|||
Yes, that might be possible. As I mentioned before, in addition to this you can
capture a trace which includes log auto-growth events and the statements, to be
sure.
I don't know if it's possible with your application, but as you mentioned before,
TableA has a fairly constant number of records while TableB varies (it could be
5,000 or 200,000). It might be better to insert the surviving records (where
tablea.col1 = tableb.col1) into a new table, drop TableB, and rename the new
table to TableB. Let's assume you have 100 records in TableA and 100,000 records
in TableB: why delete all 99,900 records instead of adding 100 records to a new
table, dropping the old one, and renaming the new one? (Again, this looks like a
batch process to me.)
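(A rough sketch of that copy/drop/rename idea, using the thread's placeholder
column name col1; note that the new table would need its indexes, constraints,
and permissions recreated, and the swap needs a quiet moment, neither of which
is shown:)
SELECT b.*
INTO TABLEB_new                  -- assumed staging name
FROM TABLEB b
WHERE EXISTS (SELECT 1 FROM TABLEA a WHERE a.col1 = b.col1)

DROP TABLE TABLEB
EXEC sp_rename 'TABLEB_new', 'TABLEB'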
Regards..
"Hugo Kornelis" wrote:
> On 8 Feb 2005 05:32:33 -0800, steveg@.payerpath.com wrote:
> (snip)
> >While this second DELETE statement is "hung", sysprocesses shows that
> >it is sleeping. What would make SQL sleep this SPID on a simple
> >DELETE? I understand that it can happen when SQL is waiting for the
> >application to provide some feedback but this is a simple DELETE; there
> >shouldn't be any "back-and-forth" interaction between the client and
> >server that I'm aware of. Can anyone explain why a simple DELETE
> >statement would go to sleep?
> Hi Steve,
> I realize it's mentioned before in this thread, but this really sounds as
> if eitehr tempdb or the log file (probably the latter) is autogrowing.
> What is the average number of rows affected by this second delete?
> If I were you, I'd check that autoshrink is not accidentally enabled for
> this database, manually grow the log file to twice it's current size and
> check if this helps.
> Best, Hugo
> --
> (Remove _NO_ and _SPAM_ to get my e-mail address)
|||
Hugo,
I just checked my database settings. My user database is not set up to
auto-shrink; the only options it has set are auto create and auto
update statistics.
I'm fairly certain that neither tempdb nor my user db are
growing/shrinking during this process. I'm basing that statement on
the date/time stamp on the data and log files. It is my understanding
that the data/log file date/time will be updated when the file grows,
shrinks, or when the file is initially opened by SQL after a server
restart. All of the files (db & log) for tempdb and my user database
have date/time stamps which correspond to the last time our server was
rebooted, which happened last week. However, we experienced our hung
DELETE problem as recently as yesterday.
The second DELETE will affect a LOT of rows, probably in the
neighborhood of 100,000. I suspect that the second DELETE is taking an
Exclusive lock at the table level, based on its execution plan, the large
number of rows affected, and other processes (INSERT/UPDATE) being
blocked while attempting to put an Intent-Exclusive lock on TableB.
I'm OK with that as long as the DELETE runs for 10 seconds as opposed to 3 minutes.
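(One way to confirm the lock level while the DELETE is running is sp_lock; the
SPID is a placeholder:)
EXEC sp_lock 123   -- placeholder SPID; look for a TAB resource held in X mode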
I had a Profiler session running yesterday watching long running
queries. Once the "hung" DELETE did finally complete I could see that
the level of CPU and I/O that it used during its 150-second runtime was no
more than it normally uses during a 5-7 second runtime. It's doing no more
work, it's not waiting, it's not blocked, sysprocesses shows that it is
sleeping with open transactions.
Steve
|||
I'm going to be taking over control of these two DELETE statements
temporarily. The developers will be commenting them out of our
application and I'll be moving them into a SQL job which will run
hourly. This will give me a bit more flexibility in trying some of the
suggestions that I've garnered here. We've got the space so I'll also
expand my log files and tempdb and see if that helps any.
Any other thoughts are still welcome, particularly regarding what
outside factors would cause this process to go to sleep instead of
continuing to work. Thanks.
|||
On 8 Feb 2005 09:54:09 -0800, steveg@.payerpath.com wrote:
>I'm fairly certain that neither tempdb nor my user db are
>growing/shrinking during this process.
(snip)
Hi Steve,
With that ruled out, and pretty much everything else I can think of
already checked as well, I'm afraid I can't help you further with this.
I guess the only options you now have are to wait and see if Umut or some
other poster comes up with something, accept it and live with it, create
some creative workaround, or open a case with Microsoft PSS (product
support).
Best, Hugo
--
(Remove _NO_ and _SPAM_ to get my e-mail address)
|||
"The second DELETE will affect a LOT of rows probably in the
neighborhood of 100,000. I suspect that the second DELETE is taking an
Exclusive lock at the table level based on its execution plan, large
number of rows affected, and other processes (INSERT/UPDATE) being
blocked while attempting to put an Intent-Exclusive lock on TableB.
I'm OK with that as long as the DELETE runs for 10s as opposed to 3m. "
Steve,
With reference to the above, do you think you might want to break your
delete down into batches (using SET ROWCOUNT)? That would give you some more
flexibility.
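(A minimal sketch of that batched-delete pattern on SQL 2000; the batch size is
arbitrary and the query text follows the thread:)
SET ROWCOUNT 5000                -- arbitrary batch size
WHILE 1 = 1
BEGIN
    DELETE FROM TABLEB
    WHERE [FieldGUID] NOT IN (SELECT [FieldGUID] FROM TABLEA)
    IF @@ROWCOUNT = 0 BREAK      -- nothing left to delete
END
SET ROWCOUNT 0                   -- restore the default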
