[thelist] msql: using an index to search across columns?

Dunstan Orchard dunstan at 1976design.com
Wed Oct 1 08:44:54 CDT 2003


> Rudy wrote:
> good idea -- will a pipe character suffice, do you think? 
> 
>      select title, body
>        from blog_post
>       where concat_ws('|',title,body) LIKE '%telegraph%'
>         and concat_ws('|',title,body) LIKE '%death%'
>         
> cute function, n'est-ce pas?        
> 
> see http://www.mysql.com/doc/en/String_functions.html

That worked perfectly, thanks rudy (and Joshua).
Very neat bit of code.

If anyone's interested in what this is being used for, it's a search engine for
a blog, but one that displays the results in a slightly different way to normal
(I hope).

Most people's search engines display a list of post titles and then a section
from the post containing just the first search term.

You can see an example here:
http://www.holovaty.com/search/css%20coded

Even though we've searched for 'css' and 'coded' the displayed excerpt is
grabbed because it contains the first term - 'css'. Some of the excerpts also
contain 'coded', but that's pure chance.

I decided that I wanted excerpts for _each_ of my search terms.

My search engine does the following:

[1] Find the first occurrence of the first term.
[2] Grab 70 characters either side of it.
[3] Find the first occurrence of the next search term.
[4] Grab 70 characters either side of it.
[5] Compare the two blocks of text - if they overlap then combine them, if not,
then print out the first one with '...' where appropriate, and then move onto
the next search term.
[6] Repeat steps 3, 4, 5 for all the search terms.

You can see the results here:

1 search term:
http://www.1976design.com/blog/search/parents

This shows 4 results, each with the search term highlighted.


2 search terms:
http://www.1976design.com/blog/search/parents+come

The first result shows the two search term excerpts combined to form one excerpt.
The second result shows two separate excerpts, with the '...' between them.
The third result is the same as the first, but this time the crossover between
the two blocks of text was smaller, so the resulting excerpt is much larger.

Here's another one:
http://www.1976design.com/blog/search/alastair+brother

The first result shows two separate excerpts, with the '...' between them.
The second result comes about because while a match was found in the DB, the
HTML code where it was found has been stripped out, and so can't be displayed in
the excerpt. In this case the first 150 characters of the post are displayed.


I'm quite pleased with it so far, and _I've_ not seen anyone else's search
engines do this, though I'm sure someone else must have had a similar idea :o)

Thanks again for the help,

- Dunstan

---------------------------
Dorset, England
http://www.1976design.com/
http://www.1976design.com/blog/


More information about the thelist mailing list