[theforum] [Sysadmin] [mods] css-d list down again?

David Kaufman david at gigawatt.com
Tue Oct 31 02:09:08 CST 2006


The lists were apparently down again today, from about 2pm this afternoon
until just now (2am).

John Handelaar <john at userfrenzy.com> wrote:
> Evolt:  I'm becoming convinced that ClamAV's voracious use of memory
>         and CPU cycles is what causes qmail-send to lock up.  Someone
>         want to look into this?
>
>         Oh, and the recipe I use, for future reference, is to
>         `mailmanctl stop` on the queue, `qmailctl stop` on the
>         mail server, `qmailctl stat` to check it went down (and if
>         not, a hard kill on qmail-send), then start them again in
>         reverse order.  Works every time (so far).

I agree the weak link appears to be spamassassin.  I had to add a dash
of "/etc/init.d/spamassassin restart" to that recipe, and shake well.
Was seeing lots of these in the mail.log:

spamc[15061]: connect(AF_INET) to spamd at 127.0.0.1 failed, retrying (#3 of 3): Connection refused

which after assassinating (and resurrecting) spamassassin, became:

spamd[15332]: connection from tempest [127.0.0.1] at port 56358
spamd[15332]: info: setuid to qscand succeeded
spamd[15332]: checking message <[...]> for qscand:1016.
spamd[15332]: clean message (-2.5/5.0) for qscand:1016 in 5.7 seconds, 1475 bytes.
spamd[15332]: result: . -2 - AWL,BAYES_00 scantime=5.7,size=1475,mid=<E1GenSr-0007Gm-By@mailscan32.yourhostingaccount.com>,bayes=5.55111512312578e-17,autolearn=ham
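
For future reference, here is roughly how I'd script the whole recipe,
spamassassin step included.  It's only a sketch: where exactly the
spamassassin restart belongs in the sequence is my guess (I put it while
qmail is down), the names run, restart_mail_stack and DRY_RUN are made up
for illustration, and it defaults to just printing the commands so nobody
bounces the lists by accident.

```shell
#!/bin/sh
# Sketch of the list-restart recipe. DRY_RUN=1 (the default) only prints
# the commands; set DRY_RUN=0 to actually run them as root.
DRY_RUN="${DRY_RUN:-1}"

# run: hypothetical helper that prints or executes a command.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

restart_mail_stack() {
    run mailmanctl stop                   # stop the Mailman queue runners
    run qmailctl stop                     # stop the mail server
    run qmailctl stat                     # check that it really went down
    # If qmail-send is still alive at this point, hard-kill it by hand
    # before going any further.
    run /etc/init.d/spamassassin restart  # assumption: restart spamd here
    run qmailctl start                    # then start again in reverse order
    run mailmanctl start
}

restart_mail_stack
```

Run it as-is first to eyeball the sequence, then with DRY_RUN=0 once
you're happy with it.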

So, what to do?

On the one hand, we (I, and our list subscribers) simply cannot live
without adequate and aggressive spam-filtrage; on the other, this
kicking-the-lists-twice-a-week thing is getting old fast... I'm thinking
one or both of the following approaches should be taken:

Approach 1: Since this is an unstable network daemon issue, maybe we
should break down and finally upgrade our relatively-outdated
two-year-old kernel from 2.4 to 2.6 (maybe we should even if we weren't
having problems...)

Approach 2: Since SpamAssassin itself is such an actively developed
package, Debian Sarge's 3.0.3-2sarge1 (also circa 2004, tho up-to-date
with security patches) is getting very long in the tooth, too.  I'd
recommend upgrading to the 3.1.3-0bpo1 version available from
backports.org.

Though we have never rebooted tempest ...ever... I think a kernel
upgrade would certainly be A Good Thing.  But I'm not comfortable doing
it myself, certainly not remotely.  Any takers?  ThePlanet Support
charges us for such tasks but we can have them do it if no
cowboy-sysadmin steps up to take the challenge (and the heat).

I *am* comfortable upgrading spamassassin (and using backports' packages
in production) but I understand if others don't agree with the practice.
I've run backports.org versions of MySQL, Postfix, Subversion and
numerous other fairly mission-critical services on servers that I
administer for work with nothing but goodness to report.  But then our
machines don't get pummeled like tempest does!

Does anyone (at all) want to veto (or debate against) using backports?

Does anyone (w/root on tempest, obviously) wanna volunteer to perform
the long-overdue kernel upgrade?

-dave

PS: I'm cc-ing css-discuss moderators (cuz they're suffering too), and
theforum, (cuz someone said that, in the interests of transparency,
only sensitive (and boring) issues should take place on the private
sysadmin list).

Is this too boring?  I for one am excited by kernel upgrades (but it's
obvious that I'm nerdy in the extreme (and white as sour cream)).




