[Sysadmin] Serious, egregious problem with Servermatrix Debian boxes

David Kaufman david at gigawatt.com
Thu Oct 7 11:25:10 CDT 2004


hi John,

John Handelaar <john at userfrenzy.com> wrote:
> Hey, guess what?
>
> 1. They don't bother to compile proper IDE support into
>    the kernel.

First, I'm not any sort of kernel-fu black belt.  In fact I *never*
compile kernels unless it is unavoidable, and even then I bitch like
hell.  I consider any feature in any software that requires me to
recompile my kernel as "maybe not ready for prime time" :-)

That may be a naive position (especially among hard-core Linux hackers;
it certainly annoys Gentoo folk), but on my own home systems, and those
I'm responsible for at my various day jobs over the last few years, I
have used Debian's stable branch exclusively since Potato (now Woody)
and am quite familiar with it (I have a minor in RedHat & Fedora).

That said, I was alarmed by your subject line, and so I just ran
"uname -a" on tempest, which reports:

    Linux tempest 2.4.25-bf2.4 #1 SMP Wed Mar 10 10:35:09 PST 2004 i686 unknown

which tells me that ServerMatrix does not compile their kernels at *all*,
either (which makes me glad, not mad...)  The O/S on tempest was
installed with a recent "boot floppy" (bf) 2.4 kernel (not the bleeding
edge, but recent) from a stock Debian binary distribution CD.  This is
the way I install all my production machines, too.  "bf24" is a
precompiled Debian kernel "flavor" option on the installation CD that I
also use because it's documented as "optimized" for IDE and modern
system chipsets, and has support compiled in for most recent video and
ethernet chipsets commonly found "in the wild".
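
To make the "bf2.4" reading of that uname output concrete, here is a
minimal sketch that splits a kernel release string into its version and
flavor suffix.  The function names are made up for illustration, and the
only assumption is the one stated above: that a "-bf2.4" suffix marks the
stock boot-floppy flavor rather than a locally compiled kernel.

```python
def kernel_flavor(release):
    """Split a release such as '2.4.25-bf2.4' into (version, flavor).

    The flavor is whatever follows the first '-', or '' if there is none.
    """
    version, _, flavor = release.partition("-")
    return version, flavor


def is_stock_bf24(release):
    """True if the flavor suffix is the 'bf2.4' one reported on tempest."""
    return kernel_flavor(release)[1] == "bf2.4"


# The release field copied from the uname output quoted above.
print(kernel_flavor("2.4.25-bf2.4"))
print(is_stock_bf24("2.4.25-bf2.4"))
```

A hand-rolled kernel would normally carry a different suffix (or none),
so this is only a quick sanity check, not proof of how the kernel was
built.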

> 2. DMA on the IDE interface is therefore off.
>
> 3. DMA on the IDE interface cannot therefore be enabled.
>
> 4. DMA on the IDE interface is in fact also disabled at the
>    BIOS level.

I have to admit I've never run into all these DMA issues myself.  So I'm
not disputing them, but would like to investigate them to understand the
issue.

Can you point me to some doco on this?  How does one test if DMA is
enabled on an IDE interface?


> Ergo...
>
> 5. The disk channel is running at about 10% of normal
>    performance

Yow!  If that's true I'm quite confident that it's *not* due to the
kernel we are running.  I'd be alarmed indeed if the zillions of Debian
stable users who choose this kernel (one of only two kernel choices on
the installation CD, IIRC) are suffering 90% performance degradation.

If it's off at the BIOS level, and the IDE controller (and drives) do
support DMA access, then I'm sure some tech at the place can enable it.
I'm guessing we'd need to be able to provide them with a simple,
reproducible disk access benchmark test that clearly points to a serious
and egregious lack of performance, however.  But if we can prove that the
performance is for shit, I can and will certainly escalate this to
someone who can press F2 to enter setup, if it means calling their tech
support daily and being annoying for hours on end.  Which I actually
sort of enjoy....
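
As a rough idea of what such a benchmark could look like, here is a
minimal sketch that times a sequential read of one file and reports
throughput.  The function names and the 1 MB block size are arbitrary
choices for illustration, not anything ServerMatrix or Debian provides.
The usual tool for this job is "hdparm -t" on the device itself; this is
just something that can be run without root.

```python
import time


def timed_sequential_read(path, block_size=1024 * 1024):
    """Read `path` from start to end; return (bytes_read, elapsed_seconds)."""
    total = 0
    start = time.perf_counter()
    # buffering=0 skips Python's own buffer; the OS page cache still applies.
    with open(path, "rb", buffering=0) as handle:
        while True:
            chunk = handle.read(block_size)
            if not chunk:
                break
            total += len(chunk)
    elapsed = time.perf_counter() - start
    return total, elapsed


def megabytes_per_second(total_bytes, elapsed_seconds):
    """Convert a byte count and a duration into MB/s (1 MB = 1024 * 1024 bytes)."""
    if elapsed_seconds <= 0:
        return float("inf")
    return total_bytes / (1024 * 1024) / elapsed_seconds
```

To mean anything, the file read should be bigger than RAM (1GB on
tempest) or freshly written and not yet cached, otherwise the numbers
measure memory rather than the disk.  Running it several times and
keeping all the results is what would make it reproducible.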

> 6. The split second we ever hit swap space, all the meters
>    go up to 100% usage and the box drops dead until you
>    kill almost all its running processes.

Is this an observation you made on tempest or a prediction based on
experience elsewhere with this issue?


> Also FYI
>
> 7. Servermatrix's tech support has no idea what any of this
>    means.

Well, I have to confess to having only a vague clue, either.  I know DMA
stands for Direct Memory Access, that it comes with (or is missing from)
your combination of IDE controller chip (probably on our motherboard)
and the choice of drives ...and it makes disks run faster.  That's about
it.  :-)  ...but not 10 times faster (or slower), I didn't think!  I do
reserve the right to think wrong at all times, however, and in all
circumstances, and will eat crow later if necessary (in which case, btw,
I prefer my crow with a side of mashed potatoes).

> Seb and I have just wasted three weeks of our lives trying to
> figure out why a Zope application wasn't working, and it turns
> out that this is the culprit.

Could you paste up some links that refer to this issue, especially
relating to Zope, and the specific ways that Zope is failing?  I'm not
sure what the connection is.

If disk access is in fact slow on tempest, does that actually prevent
Zope applications "from working" at all?  Don't they just work slowly?
How can I test, detect and confirm a lack of DMA, especially at the BIOS
level?
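
For the kernel side of that question, one possible check is sketched
below.  It assumes the old IDE driver's /proc/ide/<drive>/settings file,
which lists one setting per line with the name in the first column and
the current value in the second, including a "using_dma" entry.  The
function names are hypothetical, and whether tempest's disk shows up as
hda is a guess.  "hdparm -d /dev/hda" reports the same flag, and none of
this says anything about what the BIOS is set to.

```python
def parse_ide_settings(text):
    """Turn the text of a /proc/ide/<drive>/settings file into {name: value}.

    Header and separator lines have no integer in the second column,
    so they are skipped.
    """
    settings = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue
        try:
            settings[fields[0]] = int(fields[1])
        except ValueError:
            continue
    return settings


def dma_enabled(text):
    """True only if the settings text reports using_dma as 1."""
    return parse_ide_settings(text).get("using_dma") == 1
```

Feeding it the contents of /proc/ide/hda/settings (assuming that is the
right drive) would give a yes or no for the kernel's view of DMA on that
disk.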

> ...Anyone here have some extreme
> kernel-fu?  Our box needs patching, and so does mine (I've now
> had two failed attempts to sort it out result in total server
> non-boot failure).

I googled "woody kernel IDE DMA" and several hits look like issues
similar to the one you describe, with various VIA chipsets, drives, etc.
If DMA is disabled in the kernel, that might be why.  Woody is the stable
branch, so its binary distro disks come with kernels that can
reasonably be expected not to lock up when booting on common chipsets
like VIA on Athlon.

FYI:

tempest was specified as:

  MB  : Gigabyte \ SuperCeleronMarkII \ GA-8LD533
  CPU : Intel \ Celeron \ 2.4 GHz
  RAM : Transcend \ 1GB \ DDR266
  IDE : Unknown \ Onboard \ IDE
  Disk: Seagate \ 80GB:IDE:7200RPM Barracuda \ ST380011a


I also notice that it rebooted two weeks ago.  Was that part of this?

Thanks,

-dave


