[thelist] SQL: data sampling/condensing
Matt Warden
mwarden at gmail.com
Wed Feb 2 17:12:16 CST 2011
Howdy,
I built a webapp for myself that tracks my total portfolio and adjusts
for estimated taxation. I did this because I got tired of the
portfolio apps out there that would sum things like your 401k and your
bank account as if they were the same dollars. These would incorrectly
inflate the value of your porfolio and not acknowledge that a big part
of that portfolio is Uncle Sam's.
Almost 6 months ago, I added something that pushes various aggregated
post-tax portfolio values into a history table so that I can graph the
change in total value and various components of that total over time.
This push happens about every other day. I then graph the results with
various slices and dices using google's visualization api.
That's all well and good, but do the math. I have many data points
that are being plotted and it will only keep growing. I have tried a
couple times to work out a way to "condense" the data set, but have
thus far been unsuccessful.
Imagine:
Day Value
1 $20
2 $23
3 $25
4 $21
5 $19
6 $15
7 $19
8 $22
9 $23
...
I want to do one of two things. I would be fine with just sampling
every nth row. The trouble is that I am using sqlite and have been
unable to get this to work using their modulo operator. i.e.:
Day Value
1 $20
4 $21
7 $19
...
The holy grail, though, would be to AVERAGE every nth row to output
something like this:
Start Day Value
1 $22.67
4 $18.33
7 $22.00
...
Any ideas? This definitely smells like something that will require a
SQL formulation that will ultimately impress me in its complexity
and/or creativity...
--
Matt Warden
Austin, TX, USA
http://mattwarden.com
This email proudly and graciously contributes to entropy.
More information about the thelist
mailing list