[thelist] SQL : performance - separate table vs. (datatype) SET

Tue Nov 12 16:31:01 CST 2002

Hassan Schroeder wrote:
> Let's say I have a DB with recipes, and I need to be able to find
> items with certain ingredients.

OK. There's a few approaches here: here's one that might give you an
idea. Caveat: I'm a SQL Server guy, not a mysql guy, but it should
work very similarly.

Say you have the following tables:

=======
Recipe
=======

RecipeID    RecipeName
--------    ----------
1           Welsh Rarebit
2           Cheese Omelette
3           Cheese sandwich

=================
RecipeIngredient
=================

RecipeID    IngredientName
--------    --------------
1           Cheese
1           Bread
2           Cheese
2           Eggs
3           Cheese
3           Bread
3           Butter

And you want to find all recipes that use cheese and bread, you can
do something like:

[1] SELECT RecipeID, COUNT(Ingredients)
[2] FROM RecipeIngredient
[3] WHERE
[4]     IngredientName IN ('Cheese', 'Bread')
[5] GROUP BY RecipeID
[6] HAVING
[7]     COUNT(Ingredients) = 2

This will return Welsh Rarebit and Cheese Sandwich, but not the
Cheese Omelette, because that only has one RecipeIngredient in the
IN clause; the HAVING then eliminates this row.

Line 4 and line 7 are built dynamically depending on the number
of ingredients you're searching on; for example, if you were
trying to find chocolate cakes, you might change them to:
[4]     IngredientName IN ('Eggs', 'Flour', 'Chocolate')
and
[7]     COUNT(Ingredients) = 3

Getting the actual recipe data out, rather than just an ID, is up to
you, but I'm sure it won't be too hard.

Note that you'd probably want a seperate 'ingredients' table,
and 'recipeingredient' would contain an IngredientID rather than
an IngredientName; I was just trying to simplify the example. If
you need more help, let me know.

Cheers,

Paul