[thelist] search engine issue (development issue that is)

Chris Blessing webguy at mail.rit.edu
Thu Jun 20 10:41:00 CDT 2002


Hi all once again-

I'm currently working on a project for my company's website whereby I'm
creating a search engine to search our database of content.  The details
aren't important but I am looking for some pointers on how to handle the
actual search results.  Here's my current thought process:

1) A stored procedure searches the content and plops the results into a
"search results" table which I've created ahead of time (i.e. it's not a
temp table).  Each row consists of the user's session id (the IIS session
ID, not a SQL Server SID), a search id (an incremental number starting at 1,
increasing with each search that the user performs), the article id, and the
rank for that article.

It would look like this:

session_id	search_id	article_id	rank
----------	---------	----------	----
123-134-ab	1		101		1000
123-134-ab	1		104		988
123-134-ab	1		198		450
323-333-de	1		987		980
323-333-de	1		239		430

That's two users' search results.  One user (123-134-ab) got back 3 results
based on the search terms and they are ranked accordingly.  The other user
(323-333-de) got back 2 results for his/her search.  Note that the
session_id changes per user but the search_id is constant FOR THIS SEARCH.
If 123-134-ab decided to search again, the search_id would increment (via
some ASP code) to 2, and so on.
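
To make step 1 concrete, here is a rough sketch of the table and the
per-session counter.  It uses Python with an in-memory SQLite database purely
for illustration, not the ASP and SQL Server stored procedure described
above.  The function names, the column types and the dict standing in for
the ASP Session object are all assumptions.

```python
import sqlite3


def create_results_table(conn):
    # Permanent (non-temp) table holding every user's search results.
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS search_results (
            session_id TEXT    NOT NULL,
            search_id  INTEGER NOT NULL,
            article_id INTEGER NOT NULL,
            "rank"     INTEGER NOT NULL,
            PRIMARY KEY (session_id, search_id, article_id)
        )
        """
    )


def next_search_id(session):
    # 'session' is a plain dict standing in for session-level state.
    # The first search gets 1, and each later search increments it.
    session["search_id"] = session.get("search_id", 0) + 1
    return session["search_id"]


def store_results(conn, session_id, search_id, ranked_articles):
    # ranked_articles is a list of (article_id, rank) pairs.
    conn.executemany(
        'INSERT INTO search_results (session_id, search_id, article_id, "rank") '
        "VALUES (?, ?, ?, ?)",
        [(session_id, search_id, art, rank) for art, rank in ranked_articles],
    )
    conn.commit()
```

Storing the first user's rows from the table above would then be
store_results(conn, "123-134-ab", 1, [(101, 1000), (104, 988), (198, 450)]).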

2) The actual search results are displayed by going back to this table of
search results, selecting all rows based on search_id and session_id
(search_id would be generated before the search itself and held as a
session-level variable so we can increment it every time the user searches),
and then getting the article details based on article_id (probably a simple
table join, nothing really fancy).
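
The lookup in step 2 might look roughly like the following, again sketched
in Python with SQLite rather than ASP and SQL Server.  The "articles" table,
its "title" column and the placeholder titles are hypothetical, since the
real content schema isn't described here.

```python
import sqlite3


def fetch_results(conn, session_id, search_id):
    # Select one user's rows for one search and join in the article
    # details, highest rank first.
    return conn.execute(
        """
        SELECT a.article_id, a.title, r."rank"
        FROM search_results AS r
        JOIN articles AS a ON a.article_id = r.article_id
        WHERE r.session_id = ? AND r.search_id = ?
        ORDER BY r."rank" DESC
        """,
        (session_id, search_id),
    ).fetchall()


def demo():
    # Small in-memory data set; the titles are placeholders.
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE articles (article_id INTEGER PRIMARY KEY, title TEXT);
        CREATE TABLE search_results (
            session_id TEXT, search_id INTEGER, article_id INTEGER, "rank" INTEGER
        );
        INSERT INTO articles VALUES
            (101, 'placeholder A'), (104, 'placeholder B'),
            (198, 'placeholder C'), (987, 'placeholder D');
        INSERT INTO search_results VALUES
            ('123-134-ab', 1, 198, 450),
            ('123-134-ab', 1, 101, 1000),
            ('123-134-ab', 1, 104, 988),
            ('323-333-de', 1, 987, 980);
        """
    )
    return fetch_results(conn, "123-134-ab", 1)
```

Filtering on both session_id and search_id keeps one user's rows from
leaking into another's, and keeps a user's earlier searches out of the
current result page.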

Seems pretty straightforward, no?

My problem is that this search results table is going to fill up mighty fast
if users are getting 20-200 results per query and doing 1-5 searches per
session (that's what I'm guessing will happen, anyhow).  I thought about
cleaning the search results table out at Session_OnEnd but that poses a
problem. If the user lets their session time out by sitting idle, the
results are lost and the user would probably not be too happy since they
most likely forgot whatever their search terms were AND they'd have to go
back and search again.
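
To put rough numbers on that, and to show the kind of per-session cleanup I
have in mind, here is a small Python/SQLite sketch.  The function names are
made up, and it assumes a search_results table with a session_id column as
laid out above.  It does nothing about the idle timeout problem.

```python
import sqlite3


def rows_per_session(results_per_query, searches_per_session):
    # Rows one session adds to the search results table.
    return results_per_query * searches_per_session


def purge_session(conn, session_id):
    # Delete every stored result for one session, e.g. when it ends.
    # Returns the number of rows removed.
    cur = conn.execute(
        "DELETE FROM search_results WHERE session_id = ?", (session_id,)
    )
    conn.commit()
    return cur.rowcount
```

With my guesses of 20-200 results per query and 1-5 searches per session,
rows_per_session gives anywhere from 20 to 1000 rows per session.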

Also I'm not so sure I can get the actual session ID generated by IIS once
the Session_OnEnd event begins.  If that's the case, I have no way of
identifying the session and therefore cannot delete anything reliably from
the search results table!

I suppose I could circumvent this by writing the session id to a real cookie
(instead of a session variable) and just overwrite that cookie the next time
the user comes to the site, but that seems a bit awkward.  It also still
doesn't help me with the session timeout problem (the timeout, btw, is 20
minutes on our servers).

Anyone have any insight or advice for this type of a project?  I apologize
for the lengthy message... it's so much to explain!

TIA!

Chris Blessing
webguy at mail.rit.edu
http://www.330i.net



