[thelist] Hierarchical data in the database
Bill Moseley
moseley at hank.org
Sun Jan 21 21:09:21 CST 2007
On Sun, Jan 21, 2007 at 08:10:55PM -0500, Matt Warden wrote:
> You say:
>
> On 1/21/07, Bill Moseley <moseley at hank.org> wrote:
> > The files are currently under revision control, so need to keep some
> > of those features -- that is, I want to track previous versions.
>
> But your example table has:
>
> > create table template (
> > id SERIAL PRIMARY KEY,
> > path text, -- e.g. path/to/templates/footer.html
> > current boolean, -- true if this is the current template
> > created_time timestamp(0) with time zone not null default now(),
> > update_by integer REFERENCES person,
> > CONSTRAINT one_current_template UNIQUE( path, current )
> > );
> > create index template_path ON template ( path );
> > create index template_current ON template ( current );
>
> >From what I can tell, you will still only have one version of the file
> (which seems to make the current field even more meaningless). You
> aren't actually storing the files in the database, just the path.
Sorry, just left it out of my post -- it's been a long weekend:
alter table template add column content text;
Better? ;)
> You haven't explained all of what you need to do with these files, so
> this is quite an assumption, but my guess is that you don't need all
> of this. My guess is that you can get a happy medium by storing the
> path and filename of each file in separate fields. The only thing this
> can't represent is a folder with no files.
By path do you mean directory (without the file name)?
I need two ways to access the "files" in the database. One way is
directly by path name path/to/template/foo/header.html, for example.
This has to be fast since a page might have twenty or so templates to
fetch to build the entire page. This suggests a single column named
"path" holding the full path (or a (name, dir) set of columns).
I also need to access the files in the web interfaced used to edit the
templates. I'd like to provide a file-manager type view, showing top
level "folders" that can be expanded to show their contents. To me this
suggests some kind of hierarchy similar to what I showed.
So, my question is how best to support both of those requirements in the
database.
> > I could add a "path" column to template_node, but that's
> > de-normalizing the table too much for my tastes. But, maybe that is
> > the best solution?
>
> I should learn to read ahead. table(id, fullpath) is no more
> normalized than table(id, path, filename).
I was commenting that with my template_node example there was nothing
to prevent column "parent" pointing to "foo" but column
"path" set to "path/to/bar/<name>" -- that is the path's parent is
different from the node's parent. Bad data.
> If you're okay with your
> first table structure above, then you should be fine with moving the
> path out into its own field. Denormalization is not a bad thing if it
> doesn't impact your requirements (including potential future ones,
> which is obviously harder to tease out -- but, again, my guess is that
> no requirement you ever have will require the hierarchical node
> structure you've built in your second solution).
So are you saying for path/to/templates/foo/heading.html the columns
would look like this?
name = 'heading.html'
dir = 'path/to/templates/foo'
And then to show all the files in the root select "where dir IS NULL"
or for the "templates" directory 'where dir = "path/to/templates"'?
But that won't show the sub directories. I guess I'm not following
how to do the folder-by-folder listing.
Thanks for the comments, Matt.
--
Bill Moseley
moseley at hank.org
More information about the thelist
mailing list