intranet search tool (was RE: [thelist] full text search with indexing services on Win2K server)

Burhan Khalid thelist at
Mon Sep 6 03:05:59 CDT 2004

On Thu, 2004-09-02 at 21:47, Joel D Canfield wrote:
> Okay, how 'bout this: on your Windows intranet, what user-friendly
> method do *you* use for full-text searching of large volumes of
> documents? (large meaning tens of thousands)

My question is similar, but with a different slant.  The company
has a repository (around 5 GB) of PDF and WHM files.  Currently, the
files are haphazardly categorized by publisher and subject matter (e.g.,
References\ORielly\CSharp) -- this is at the file system level. There are
separate directories for each publisher, etc.

We wanted to come up with an intranet solution so that employees can
search this library based on keywords and then find out what documents
they should download/checkout (or what book to ask for from the
reference library).

The files were on a Unix/Linux based server (we have since moved them to
a Windows 2003 Server).  My initial idea was to find a search engine
solution (like mnoGoSearch) and hope that it would index the files.

Another idea that was presented at a staff meeting was to have someone
write down keywords for each file and then store this information in a
database. The end user then has the option of searching for keywords, or
viewing a list of all files sorted by category/subject.
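
To make the keyword-database idea concrete, here is a rough sketch of
what it might look like. It is only an illustration, not something we
have built: it assumes SQLite, and the table layout, function names and
file names are all hypothetical.

```python
import sqlite3


def build_catalog(entries):
    """Create an in-memory catalog from (path, category, keywords) tuples.

    The schema is a hypothetical one: a row per file, plus a row per
    hand-entered keyword pointing back at its file.
    """
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE files ("
        "id INTEGER PRIMARY KEY, path TEXT UNIQUE, category TEXT)"
    )
    db.execute(
        "CREATE TABLE keywords ("
        "file_id INTEGER REFERENCES files(id), keyword TEXT)"
    )
    for path, category, keywords in entries:
        cur = db.execute(
            "INSERT INTO files (path, category) VALUES (?, ?)",
            (path, category),
        )
        # Keywords are stored lower-cased so lookups are case-insensitive.
        db.executemany(
            "INSERT INTO keywords (file_id, keyword) VALUES (?, ?)",
            [(cur.lastrowid, kw.lower()) for kw in keywords],
        )
    db.commit()
    return db


def search_keyword(db, keyword):
    """Return the paths of all files tagged with the given keyword."""
    rows = db.execute(
        "SELECT DISTINCT f.path FROM files f "
        "JOIN keywords k ON k.file_id = f.id "
        "WHERE k.keyword = ? ORDER BY f.path",
        (keyword.lower(),),
    )
    return [row[0] for row in rows]


def list_by_category(db):
    """Return (category, path) pairs sorted by category, then path."""
    rows = db.execute(
        "SELECT category, path FROM files ORDER BY category, path"
    )
    return list(rows)
```

The obvious drawback is that someone still has to type in the keywords
for every file, and nothing keeps them in sync when files are added or
moved.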

I'm really looking for an automated system that can take care of
indexing the content and provide a user-friendly interface for searching
the library.  Is there such a thing out there? I've never had to deal
with such a large number of documents -- so I'm not sure what would be
the best solution.  The platforms that are available are Windows Server
2003, FreeBSD and Linux.
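
By "automated indexing" I mean something along the lines of the sketch
below: walk the directory tree, pull the words out of each file, and
build a word-to-files lookup. This is a toy under stated assumptions,
not a recommendation: the function names are made up, it keeps the
whole index in memory, and it only reads plain text. Getting text out
of PDFs would need a separate extractor passed in as extract_text,
which is not shown here.

```python
import os
import re
from collections import defaultdict

# Words are runs of letters, digits, '#' and '+', so "c#" and "c++" survive.
TOKEN = re.compile(r"[a-z0-9#+]+")


def read_plain_text(path):
    """Read a file as text, ignoring bytes that do not decode as UTF-8."""
    with open(path, encoding="utf-8", errors="ignore") as fh:
        return fh.read()


def build_index(root, extract_text):
    """Map each word to the set of file paths under root that contain it.

    extract_text is any callable taking a path and returning a string.
    """
    index = defaultdict(set)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            for word in TOKEN.findall(extract_text(path).lower()):
                index[word].add(path)
    return index


def search_index(index, query):
    """Return sorted paths of files containing every word in the query."""
    words = TOKEN.findall(query.lower())
    if not words:
        return []
    hits = set.intersection(*(index.get(w, set()) for w in words))
    return sorted(hits)
```

Usage would be roughly index = build_index(some_root, read_plain_text)
followed by search_index(index, "c# generics"). A real search engine
adds ranking, incremental updates and on-disk storage on top of this.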

Can anyone offer some recommendations?

> > I can't find info about this. Win2K indexing services seems to only
> > search its own abstract of the document (the column 
> > and not really the text of the documents.
> > 
> > We've got thousands of docs, mostly RTF but lots of other MS and plain
> > text formats. I've used Site Server in the past and been very happy
> > with the speed of indexing and reliability of searches. Indexing
> > service doesn't seem to be doing it. Are my expectations unrealistic?
> > Should it be performing full text searches?
