Wednesday, March 7, 2012

Help troubleshooting SQL7 server hang

I have a Compaq ML370 running Windows 2000 Server SP3 and SQL Server 7 SP3.
Very intermittently the server hangs and disconnects all SQL users. I can
still ping the server, but there is no video and no other interaction is
possible. All that can be done is to power it off with the switch and then
bring it back up. When it does come back up, there are no fingerprints in the
event log showing what happened, except that "The previous system shutdown
yada yada yada was unexpected."
As much as it sounds like a hardware issue, all diags come back fine and all
the latest drivers and firmware are installed. To me it almost sounds like a
denial of service problem that is crashing the machine. The only
communication with the machine, though, is via TCP/IP to the SQL listener.
My question...
Is it possible that the sql traffic could cause this kind of crash?
Another question...
Can anyone suggest a way to troubleshoot this? I can't seem to force it to
happen because I can't determine what contributes to it.
Mike

You might try doing a black box trace with Profiler; it will contain the
last SQL statements that were run prior to the hang (from BOL):
Use sp_trace_create with the TRACE_PRODUCE_BLACKBOX option to define a trace
that appends trace information to a blackbox.trc file in the \Data
directory. Once the trace is started, trace information is recorded in the
blackbox.trc file until the size of the file reaches 5 megabytes (MB). The
trace then creates another trace file, blackbox_01.trc, and trace
information is written to the new file. When the size of blackbox_01.trc
reaches 5 MB, the trace reverts to blackbox.trc. Thus, up to 5 MB of trace
information is always available.
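
The BOL text above can be sketched in T-SQL roughly as follows. This is a minimal sketch, assuming a SQL Server version that provides sp_trace_create (it shipped with SQL Server 2000, so check whether it exists on a SQL 7 install before relying on it). The value 8 is the numeric form of TRACE_PRODUCE_BLACKBOX, and the path in the last statement is a hypothetical placeholder, not a path taken from this thread.

```sql
-- Define a black box trace. With option 8 (TRACE_PRODUCE_BLACKBOX) no file
-- name is supplied; the server writes blackbox.trc to its \Data directory.
DECLARE @TraceID int
EXEC sp_trace_create @TraceID OUTPUT, @options = 8

-- Status 1 starts the trace that was just defined.
EXEC sp_trace_setstatus @TraceID, 1

-- After a hang and restart, load the saved file into a result set.
-- The path below is an assumption; substitute the real \Data directory.
SELECT *
FROM ::fn_trace_gettable('C:\MSSQL\Data\blackbox.trc', default)
```

A trace defined this way does not survive a restart, so it has to be started again after each reboot. It also helps to copy blackbox.trc somewhere safe before restarting the trace, so the events from just before the hang are not overwritten.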
