[thelist] Slow SQL query

Bill Moseley moseley at hank.org
Wed Jan 3 21:13:13 CST 2007


I'm using Postgresql, and I have a query that is running very slowly
and I'm wondering if there's a better way to write it or better use of
indexes.  I'm using nested sub-queries and that seems to be killing my
speed.

Here's a somewhat long overview of the tables, if that helps:

I have a system that records transactions and then generates invoices
at given time intervals.  It does this by creating a row in the
invoice table and then links that to all the transactions that have
not been invoiced yet.

First the schema:

I have a "cart" table:

    create table cart (
        id          SERIAL PRIMARY KEY,
        first_name  text,
        last_name   text
    );


And a transaction is related to a given cart.
Thus, a cart can have multiple transactions.

    create table cart_transaction (
        id          SERIAL PRIMARY KEY,
        cart        integer REFERENCES cart,
        amount      numeric,
        invoice     integer REFERENCES invoice -- NULL until invoiced
    );

An invoice is a collection of transactions.  And a cart can have
multiple invoices.

Invoices are generated in a batch, so I have a table to group the
invoices together:

    create table invoice_group (
        id                  SERIAL PRIMARY KEY,
        transaction_time    timestamp(0) DEFAULT CURRENT_TIMESTAMP
    );

and the invoice table is used to group transactions together:

    create table invoice (
        id                  SERIAL PRIMARY KEY,
        invoice_group       integer REFERENCES invoice_group,
        invoice_time        timestamp(0) DEFAULT CURRENT_TIMESTAMP,
        cart                integer REFERENCES cart NOT NULL
    );

So, an invoice is made up of a bunch of transactions with the same
invoice ID.  And a batch of invoices all have the same invoice_group.
And again, a cart can have multiple invoices.

So, I want to be able to get a list of invoices, the current charges
and the previous balance.

    Invoice ID
    cart ID
    group ID
    invoice time
    charges
    previous balance

For any invoice, the previous balance is the sum of all transactions
invoiced before the current invoice.

I've got two views to do this.  The first one is faster, but
gives me a NULL result for previous_balance.

CREATE VIEW invoice_totals2 AS
    SELECT
        i.id AS id,
        i.cart AS cart,
        i.invoice_group AS invoice_group,
        i.invoice_time AS invoice_time,
        sum(t.amount) AS charges,

        (
            SELECT
                sum(tt.amount)
            FROM
                cart_transaction tt
                JOIN invoice ii ON tt.invoice = ii.id
            WHERE
                tt.cart = i.cart
                AND ii.invoice_time < i.invoice_time
        ) AS prev_balance

    FROM
        invoice i
        JOIN cart_transaction t ON t.invoice = i.id

    GROUP BY
        1,2,3,4;


How can I get "prev_balance" to not be null if there's no previous
balance?

Any suggestions how to improve the speed of this?  (an ANALYZE is
below).


This one give me the column I want (i.e. 0 as a previous balance)
but is a nested sub-query -- and is kind of convoluted and very slow.


CREATE VIEW invoice_totals AS
    SELECT
        i.id AS id,
        i.cart AS cart,
        i.invoice_group AS invoice_group,
        i.comment AS comment,
        i.invoice_time AS invoice_time,
        sum(t.amount) AS charges,

        (
            SELECT
                sum(CASE WHEN i.id != cutoff.id THEN amount ELSE 0 END) as prev_balance
            FROM
                cart_transaction t
                JOIN invoice i ON t.invoice = i.id
                JOIN (select id, invoice_time, cart from invoice where id = i.id) as cutoff
                    ON t.cart = cutoff.cart
            WHERE
                i.invoice_time <= cutoff.invoice_time
        ) AS prev_balance


    FROM
        invoice i
        JOIN cart_transaction t ON t.invoice = i.id

    GROUP BY
        1,2,3,4,5

    ORDER BY
        i.id, i.invoice_time;



ws2=> EXPLAIN ANALYZE select * from invoice_totals2;
                                                                      QUERY PLAN                                                                      
------------------------------------------------------------------------------------------------------------------------------------------------------
 Sort  (cost=59400.97..59404.50 rows=1412 width=61) (actual time=2286.999..2287.592 rows=1278 loops=1)
   Sort Key: i.id, i.invoice_time
   ->  HashAggregate  (cost=120.52..59327.09 rows=1412 width=61) (actual time=18.646..2279.966 rows=1278 loops=1)
         ->  Hash Join  (cost=26.65..98.35 rows=1478 width=61) (actual time=4.822..11.834 rows=1425 loops=1)
               Hash Cond: ("outer".invoice = "inner".id)
               ->  Seq Scan on cart_transaction t  (cost=0.00..34.70 rows=1670 width=13) (actual time=0.015..2.339 rows=1580 loops=1)
               ->  Hash  (cost=23.12..23.12 rows=1412 width=52) (actual time=4.738..4.738 rows=1278 loops=1)
                     ->  Seq Scan on invoice i  (cost=0.00..23.12 rows=1412 width=52) (actual time=0.007..2.279 rows=1278 loops=1)
         SubPlan
           ->  Aggregate  (cost=41.91..41.92 rows=1 width=9) (actual time=1.762..1.762 rows=1 loops=1278)
                 ->  Nested Loop  (cost=0.00..41.90 rows=1 width=9) (actual time=1.451..1.758 rows=0 loops=1278)
                       ->  Seq Scan on cart_transaction tt  (cost=0.00..38.88 rows=1 width=13) (actual time=0.811..1.735 rows=1 loops=1278)
                             Filter: (cart = $0)
                       ->  Index Scan using invoice_pkey on invoice ii  (cost=0.00..3.02 rows=1 width=4) (actual time=0.011..0.011 rows=0 loops=1775)
                             Index Cond: ("outer".invoice = ii.id)
                             Filter: (invoice_time < $1)
 Total runtime: 2288.823 ms
(17 rows)

ws2=> EXPLAIN ANALYZE select * from invoice_totals; 
                                                                           QUERY PLAN                                                                           
----------------------------------------------------------------------------------------------------------------------------------------------------------------
 Sort  (cost=69572.31..69575.84 rows=1412 width=61) (actual time=4619.448..4620.162 rows=1278 loops=1)
   Sort Key: i.id, i.invoice_time
   ->  HashAggregate  (cost=120.52..69498.44 rows=1412 width=61) (actual time=20.665..4611.668 rows=1278 loops=1)
         ->  Hash Join  (cost=26.65..98.35 rows=1478 width=61) (actual time=5.236..12.326 rows=1425 loops=1)
               Hash Cond: ("outer".invoice = "inner".id)
               ->  Seq Scan on cart_transaction t  (cost=0.00..34.70 rows=1670 width=13) (actual time=0.017..2.428 rows=1580 loops=1)
               ->  Hash  (cost=23.12..23.12 rows=1412 width=52) (actual time=5.119..5.119 rows=1278 loops=1)
                     ->  Seq Scan on invoice i  (cost=0.00..23.12 rows=1412 width=52) (actual time=0.008..2.282 rows=1278 loops=1)
         SubPlan
           ->  Aggregate  (cost=49.11..49.12 rows=1 width=17) (actual time=3.584..3.585 rows=1 loops=1278)
                 ->  Nested Loop  (cost=3.02..49.10 rows=1 width=17) (actual time=1.775..3.575 rows=1 loops=1278)
                       Join Filter: ("inner".invoice_time <= "outer".invoice_time)
                       ->  Hash Join  (cost=3.02..46.08 rows=1 width=25) (actual time=1.749..3.541 rows=1 loops=1278)
                             Hash Cond: ("outer".cart = "inner".cart)
                             ->  Seq Scan on cart_transaction t  (cost=0.00..34.70 rows=1670 width=17) (actual time=0.002..1.904 rows=1580 loops=1278)
                             ->  Hash  (cost=3.01..3.01 rows=1 width=16) (actual time=0.021..0.021 rows=1 loops=1278)
                                   ->  Index Scan using invoice_pkey on invoice  (cost=0.00..3.01 rows=1 width=16) (actual time=0.014..0.016 rows=1 loops=1278)
                                         Index Cond: (id = $0)
                       ->  Index Scan using invoice_pkey on invoice i  (cost=0.00..3.01 rows=1 width=12) (actual time=0.010..0.011 rows=1 loops=1775)
                             Index Cond: ("outer".invoice = i.id)
 Total runtime: 4621.518 ms
(21 rows)




-- 
Bill Moseley
moseley at hank.org




More information about the thelist mailing list