[thelist] SQL: data sampling/condensing

Matt Warden mwarden at gmail.com
Wed Feb 2 17:12:16 CST 2011


I built a webapp for myself that tracks my total portfolio and adjusts
for estimated taxation. I did this because I got tired of the
portfolio apps out there that would sum things like your 401k and your
bank account as if they were the same dollars. These would incorrectly
inflate the value of your porfolio and not acknowledge that a big part
of that portfolio is Uncle Sam's.

Almost 6 months ago, I added something that pushes various aggregated
post-tax portfolio values into a history table so that I can graph the
change in total value and various components of that total over time.
This push happens about every other day. I then graph the results with
various slices and dices using google's visualization api.

That's all well and good, but do the math. I have many data points
that are being plotted and it will only keep growing. I have tried a
couple times to work out a way to "condense" the data set, but have
thus far been unsuccessful.


Day           Value
1                $20
2                $23
3                $25
4                $21
5                $19
6                $15
7                $19
8                $22
9                $23

I want to do one of two things. I would be fine with just sampling
every nth row. The trouble is that I am using sqlite and have been
unable to get this to work using their modulo operator. i.e.:

Day           Value
1                $20
4                $21
7                $19

The holy grail, though, would be to AVERAGE every nth row to output
something like this:

Start Day    Value
1                $22.67
4                $18.33
7                $22.00

Any ideas? This definitely smells like something that will require a
SQL formulation that will ultimately impress me in its complexity
and/or creativity...

Matt Warden
Austin, TX, USA

This email proudly and graciously contributes to entropy.

More information about the thelist mailing list