[thelist] redesigning a huge database
Lauri Väin
lauri_lists at tharapita.com
Thu May 11 02:18:39 CDT 2006
Hi,
As far as internationalization goes, there are two basic approaches:
- full templates
- language strings with replacement markers in them
Translation companies tend to mess up templates that have markup in them, to
the point of sometimes even translating the markup itself. You'll be
surprised what you will experience in the coming months/years. You can
extract the strings for translation from the markup or keep them separate
from day one. Or you can manage the full templates. To manage language
strings, use something like RBmanager or one of the tools somebody else
mentioned in this thread already.
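To make the second approach concrete, here is a minimal sketch in Python of
language strings with replacement markers, kept separate from the markup. The
catalog contents, the message key and the function names are made up for
illustration; a real project would load the strings from whatever tool it
uses to manage them.

```python
import html
from string import Template

# Hypothetical catalog: one dict of strings per language, keyed by message id.
# Translators only ever see these strings, never the surrounding markup.
CATALOG = {
    "en": {"greeting": "Hello $name, you have $count new messages."},
    "nl": {"greeting": "Hallo $name, je hebt $count nieuwe berichten."},
}


def translate(lang, key, **values):
    """Look up a string, fall back to English, and fill in its markers."""
    text = CATALOG.get(lang, {}).get(key) or CATALOG["en"][key]
    # safe_substitute leaves an unknown marker in place instead of raising,
    # so a typo in a translated marker does not take the page down.
    return Template(text).safe_substitute(values)


def render_greeting(lang, name, count):
    """The markup stays in the template; only the sentence is translated."""
    sentence = translate(lang, "greeting", name=html.escape(name), count=count)
    return '<p class="greeting">' + sentence + "</p>"
```

Calling render_greeting("nl", "Rick", 3) wraps the Dutch sentence in the
paragraph tag, and an unknown language code falls back to the English string.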
Optimizing databases is a science in itself. It all depends on how you use
the database - whether you mostly insert into it, update it, select a few
rows at a time by index, run frequent sequential scans from batch jobs, or
something else.
When it's mostly selecting and adding an odd row every once in a while, then
your requirements are not that bad. 700k rows and expanding to other
European countries is not a concern for most applications if you handle your
data properly. Just make sure your database is well designed, you select by
indexes, keep your data sane and know what you are doing. Redundancy, uptime
and other requirements may make your system more complex, of course. If you
do lots of sequential scans you may need separate replica servers for that,
and if you do lots of updates, then locking can get nasty for some
applications. The whole thing really comes with research... and with
experience most of all.
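As a small illustration of "select by indexes", the sketch below uses
Python's built-in sqlite3 module to ask the database how it plans to run the
same query before and after an index exists. The table, column and index
names are invented for the example, and the exact wording of the plan differs
between SQLite versions and between database engines, so treat it as a way to
check your own queries rather than as a reference.

```python
import sqlite3


def query_plan(conn, sql, params=()):
    """Return the planner's description of how it would run the query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The last column of each row is the human-readable detail text.
    return " | ".join(row[-1] for row in rows)


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customers (id INTEGER PRIMARY KEY, country TEXT, name TEXT)"
)
conn.executemany(
    "INSERT INTO customers (country, name) VALUES (?, ?)",
    [("NL" if i % 2 else "EE", "customer %d" % i) for i in range(1000)],
)

sql = "SELECT name FROM customers WHERE country = ?"

# Without an index the only option is to walk the whole table.
plan_before = query_plan(conn, sql, ("NL",))

conn.execute("CREATE INDEX idx_customers_country ON customers (country)")

# With the index in place the planner can look the rows up directly.
plan_after = query_plan(conn, sql, ("NL",))

print(plan_before)
print(plan_after)
```

Most engines have an equivalent (EXPLAIN in MySQL and PostgreSQL), and it is
worth running it on the handful of queries that carry most of the load.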
If the European expansion program is likely to have a VERY significant
takeup rate or you have a very special kind of an application, then you may
need to look at various things, like horizontal partitioning (if you have a
good column to split the data by) or read replication (if you have extensive
read operations, you will need to replicate your data from the master to
slave servers and read it from there).
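Both ideas can be sketched in a few lines of Python. The server names, the
country-code split and the class below are assumptions made up for the
example; they only show where the routing decision sits in the application,
not how any particular database product does it.

```python
import itertools

# Hypothetical split: the country code is the "good column" to split by.
SHARDS = {"NL": "db-nl", "EE": "db-ee"}
DEFAULT_SHARD = "db-eu"


def shard_for(country_code):
    """Pick the server that holds the rows for one country."""
    return SHARDS.get(country_code.upper(), DEFAULT_SHARD)


class ReplicatedDatabase:
    """Send writes to the master and spread reads over the replicas."""

    def __init__(self, master, replicas):
        self.master = master
        # With no replicas configured, reads simply go to the master.
        self._replicas = itertools.cycle(replicas or [master])

    def server_for(self, sql):
        # Crude check that is good enough for a sketch: only plain SELECT
        # statements count as reads, everything else goes to the master.
        is_read = sql.lstrip().upper().startswith("SELECT")
        return next(self._replicas) if is_read else self.master
```

One thing to keep in mind with the second part is that replicas lag behind
the master, so a read that must see a row written a moment ago should still
go to the master.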
Make sure you can split your databases functionally, if needed, and do not
lock yourself in with your application. Functional splitting, of course,
brings other problems like multi-phase commits etc. Depending on your
application, what you find with most large systems is the lack of foreign
keys (foreign keys are something for development, not for production), and
also the lack of triggers and other tricks (these are your DBA's domain
only, if needed at all). Other techniques range from preaggregation,
denormalization and archiving to caching (if you're on PHP, memcache might
be a good solution for you).
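The caching part usually follows one simple pattern: look in the cache first,
and only hit the database on a miss. Below is a minimal sketch of it in
Python, with an in-process dictionary standing in for memcache; the class and
function names are invented for the example and are not the API of any
memcache client.

```python
import time


class TTLCache:
    """Tiny in-process stand-in for a memcache-style store with expiry."""

    def __init__(self, clock=time.monotonic):
        self._items = {}
        self._clock = clock

    def get(self, key):
        entry = self._items.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            # The entry is stale: drop it and report a miss.
            del self._items[key]
            return None
        return value

    def set(self, key, value, ttl):
        self._items[key] = (value, self._clock() + ttl)


def cached(cache, key, ttl, compute):
    """Return the cached value, computing and storing it on a miss."""
    value = cache.get(key)
    if value is None:
        # Miss: run the expensive query or aggregation once, then keep the
        # result for ttl seconds. A result of None is never cached here.
        value = compute()
        cache.set(key, value, ttl)
    return value
```

A preaggregated figure such as a per-country row count fits this well: wrap
the counting query in compute, pick a ttl that matches how stale the number
is allowed to be, and the database sees that query once per interval instead
of once per page view.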
PHP file-based sessions perform surprisingly well. Once you have multiple
frontends, depending on the load balancing method, you will need to start
storing sessions in memcache, look at what Zend or other people have to
offer, or use some other approach, like writing your own session server.
One thing to be aware of: rebuilding from scratch and other major
refactorings can be a disaster time-wise for some systems. The risk goes up
with the number of man-years of code and thereby with the amount of
maturation that has gone into it. When you have many man-years of code to
work with, the key usually is to refactor in very small pieces, one by one.
Know what you are building and for which requirements. Lots of the things
I've mentioned above may be overkill for you and there is no point in
jumping over your shadow to design 10 years in advance - you cannot predict
your future. 10 years from now the database may not be your problem, but it
may be the complexity of the underlying application, which gets even more
complex with an overdesigned database. It may well be that you will have to
add maintenance points into the system if it is not an in-house system
that you would be maintaining anyway. The key is in striking the balance,
keeping things simple and recognizing the critical points at which something
may need to be changed.
Cheers,
Lauri
----- Original Message -----
From: "Rick den Haan" <rick.denhaan at gmail.com>
To: <thelist at lists.evolt.org>
Sent: Wednesday, May 10, 2006 7:29 PM
Subject: Re: [thelist] redesigning a huge database
> John,
>
> Spot on on the requirements.
>
> I know that 700,000 rows can be considered moderate in some environments,
> but we don't usually build applications that required such large
> databases.
> So for us, this is huge.
>
> The decision has already been made that the app will be completely rebuilt
> from scratch. There's bound to be some things in the old app we can
> re-use,
> but we're not counting too heavily on it.
>
> Rick.