[thelist] WEB APP EFFICIENCY: how to determine

Steve Lewis slewis at macrovista.net
Tue Jun 25 15:06:01 CDT 2002


Chris W. Parker wrote:

> only uses from 1-10k of the file itself. so i'd like to know what
> resources are used when loading a 95k file multiples time in a minute
> verses loading a 5k file multiple times in a minute.

The first time the ecommerce package is called on, the requesting
process sends a request to the operating system.  The operating system
passes the request along to the hardware and freezes the process,
allowing other processes to run if any are waiting for CPU time.  When
the hardware delivers the data, it gets dropped into your disk cache,
and your memory cache(s) on the way to memory--and eventually to the
CPU.  Thousands of very well educated individuals have spent decades
making this system as efficient as possible.

When the ecommerce package is called on again, every time the code asks
to load more data (in this case the code for the ecommerce package)
your machine makes a call to load it from memory.  What happens there
is fairly complex, but it looks in the cache before it looks at RAM,
and if the data is not in RAM it checks your disk cache before actually
going to the disk to retrieve the data.  All of this is good.  It means
you don't need to wait for data to be loaded from disk each time the
code is called.
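You can see that cache hierarchy at work by comparing a cold read of a
file against an immediate second read.  A minimal sketch in Python (the
95k size and throwaway file just stand in for the ecommerce package):

```python
import os
import tempfile
import time

# Create a throwaway 95k file, standing in for the ecommerce package.
fd, path = tempfile.mkstemp()
os.write(fd, b"x" * 95 * 1024)
os.close(fd)

def timed_read(path):
    """Read the whole file, returning (elapsed seconds, bytes read)."""
    start = time.perf_counter()
    with open(path, "rb") as f:
        data = f.read()
    return time.perf_counter() - start, len(data)

cold, size = timed_read(path)  # may have to go all the way to disk
warm, _ = timed_read(path)     # usually served from the OS disk cache
print("cold read: %.6fs, warm read: %.6fs, bytes: %d" % (cold, warm, size))

os.remove(path)
```

On most systems the second read comes back noticeably faster because
the data never leaves the disk cache--exactly the behavior described
above.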

The monkey wrench is delivered when your OS starts swapping.  Check the
page-fault statistics (with Performance Monitor (NT) aka Performance
(2k)) to see how often your OS is swapping real memory for virtual
memory.  If you get more than a dozen page faults in 30 seconds, you
may need to install more RAM in the machine and/or adjust the page size,
because memory is your bottleneck.  When you see the problem you will
want to talk with folks who do more of this sort of thing than I do; I
can help you recognise the problem, however.
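Performance Monitor is the right tool on NT/2000.  Purely as an
illustration of the same counters elsewhere, on a Unix box Python's
standard library exposes per-process fault counts (this is an
assumption about your environment, not the NT tool itself):

```python
import resource

# getrusage reports resource usage for the current process.
# ru_majflt counts major page faults (ones that required disk I/O);
# ru_minflt counts minor faults satisfied without touching the disk.
usage = resource.getrusage(resource.RUSAGE_SELF)
print("major page faults:", usage.ru_majflt)
print("minor page faults:", usage.ru_minflt)
```

A steadily climbing major-fault count under load is the same warning
sign as a high page-fault counter in Performance Monitor.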

Generally speaking, a single request for 25k is more efficient than 5
calls for 5k each.  There are significant latency and response-time
issues with getting data flowing from disk to memory; that
per-connection overhead will likely make the single 95k file more
efficient, unless over a two-week period you were able to track which
portions of the code were used and found that (for instance) 60% of the
calls used a specific 10k.
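The bookkeeping behind that claim is easy to sketch: each file you read
pays a fixed open/close cost on top of the data transfer, so one 25k
read does less total work than five 5k reads of the same bytes.  A
rough illustration in Python (file sizes are made up to match the
example):

```python
import os
import tempfile

# One 25k file versus five 5k files holding the same total content.
big_fd, big_path = tempfile.mkstemp()
os.write(big_fd, b"x" * 25 * 1024)
os.close(big_fd)

small_paths = []
for _ in range(5):
    fd, p = tempfile.mkstemp()
    os.write(fd, b"x" * 5 * 1024)
    os.close(fd)
    small_paths.append(p)

def read_all(paths):
    """Read every file; each iteration pays its own open/close overhead."""
    total = 0
    for p in paths:
        with open(p, "rb") as f:
            total += len(f.read())
    return total

one_call = read_all([big_path])     # 1 open + 1 close, 25k transferred
five_calls = read_all(small_paths)  # 5 opens + 5 closes, 25k transferred

# Same bytes delivered either way; the five-file route just did
# five times the per-call setup work.
print(one_call, five_calls)

for p in [big_path] + small_paths:
    os.remove(p)
```

The payload is identical; only the fixed per-call overhead differs,
which is why the single larger file usually wins.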

> also, the programmers also do "SELECT * FROM myTbl..." even when they
> only need to reference one field. sometimes the tables are small, but
> sometimes they are big. and so i'd like to also determine how much more
> efficient it would be to go through and change all those sorts of
> things.

You don't mention which SQL server you are using, so I can't point to a
specific tool, but as a general rule it will be a better investment to
fix the code (select only the columns you need) before you attempt to
cluster your web servers.  Until then the benefits are hard to
estimate.  Consider the cost of fixing the code versus other efficiency
improvements, and prioritize it based on the programmers' other
development deadlines, the estimated work involved, and the load on
the server.
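The fix itself is mechanical whatever the engine.  A sketch using
Python's built-in sqlite3 (the table layout and column names here are
invented for illustration; only `myTbl` comes from the original post):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE myTbl (id INTEGER, name TEXT, big_col TEXT)")
conn.execute("INSERT INTO myTbl VALUES (1, 'widget', 'lots of data...')")

# Wasteful: drags every column back, whether you use it or not.
wide = conn.execute("SELECT * FROM myTbl").fetchone()

# Better: ask only for the field you actually reference.
narrow = conn.execute("SELECT name FROM myTbl").fetchone()

print(wide)    # all three columns
print(narrow)  # just the one you needed
```

On a small table the difference is noise; on a big table every
`SELECT *` moves columns you never look at, and the savings add up
across every request.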



