Re: Assistance with Query Optimisation? - Mailing list pgsql-general
From | Isak Hansen |
---|---|
Subject | Re: Assistance with Query Optimisation? |
Date | |
Msg-id | 6b9e1eb20703010452m6fa97914he629df06e01a5601@mail.gmail.com Whole thread Raw |
In response to | Assistance with Query Optimisation? (Shaun Johnston <shaunj@benon.com>) |
Responses |
Re: Assistance with Query Optimisation?
|
List | pgsql-general |
On 3/1/07, Shaun Johnston <shaunj@benon.com> wrote: > Hi > > Apologies in advance for the verbosity of my explanation for this > problem, but I think it's all pertinent. > > I have a fairly simple query which postgresql's query planner seems to > be interpreting / optimising in interesting ways: > > Query: > > SELECT * > FROM account_transaction > WHERE account_id = 10783 > ORDER BY account_transaction_timestamp asc > OFFSET 0 > LIMIT 100 > > Table "public.account_transaction" > > (I have stripped the default values etc for the sake of formatting) > > Column | Type | > -------------------------------+--------------------------+ > account_transaction_id | bigint | > account_transaction_timestamp | timestamp with time zone | > account_id | integer | > other_account_transaction_id | bigint | > transaction_reason | text | > transaction_amount | numeric(15,2) | > transaction_exchange_rate | numeric(20,10) | > transaction_base_amount | numeric(15,2) | > transaction_locked_until | timestamp with time zone | > transaction_approved | boolean | > Indexes: > "account_transaction_pkey" PRIMARY KEY, btree (account_transaction_id), > tablespace "indexes" > "account_transaction_ai" btree (account_id), tablespace "indexes" > "account_transaction_timestamp" btree (account_transaction_timestamp), > tablespace "indexes" > "account_transaction_tlu" btree (transaction_locked_until), > tablespace "indexes" > Foreign-key constraints: > "$1" FOREIGN KEY (account_id) REFERENCES > account(account_id) > "$2" FOREIGN KEY (other_account_transaction_id) REFERENCES > account_transaction(account_transaction_id) ON UPDATE CASCADE > > Query Plans: > > With Sort and Limit > --------------------------------------------------------------------------------- > QUERY > PLAN > --------------------------------------------------------------------------------- > Limit (cost=0.00..12297.59 rows=100 width=120) > (actual time=23.537..275476.496 rows=100 loops=1) > -> Index Scan Backward using account_transaction_timestamp on > account_transaction > (cost=0.00..640704.23 rows=5210 width=120) > (actual time=23.529..275475.781 rows=100 loops=1) > Filter: (account_id = 10783) > Total runtime: 275476.944 ms > > > With Sort but no Limit > -------------------------------------------------------------------------------- > QUERY > PLAN > -------------------------------------------------------------------------------- > Sort (cost=18777.41..18790.43 rows=5210 width=120) > (actual time=1081.226..1082.170 rows=308 loops=1) > Sort Key: account_transaction_timestamp > -> Index Scan using account_transaction_ai on account_transaction > (cost=0.00..18455.77 rows=5210 width=120) > (actual time=47.731..1070.788 rows=308 loops=1) > Index Cond: (account_id = 10783) > Total runtime: 1083.182 ms > > With Limit but no Sort > --------------------------------------------------------------------------------- > QUERY > PLAN > --------------------------------------------------------------------------------- > Limit (cost=0.00..354.24 rows=100 width=120) > (actual time=0.029..1.070 rows=100 loops=1) > -> Index Scan using account_transaction_ai on account_transaction > (cost=0.00..18455.77 rows=5210 width=120) > (actual time=0.022..0.467 rows=100 loops=1) > Index Cond: (account_id = 10783) > Total runtime: 1.422 ms > > With Limit and Sort, but sorted by transaction_base_amount > --------------------------------------------------------------------------------- > QUERY > PLAN > --------------------------------------------------------------------------------- > Limit (cost=18777.41..18777.66 rows=100 width=120) > (actual time=55.294..56.221 rows=100 loops=1) > -> Sort (cost=18777.41..18790.43 rows=5210 width=120) > (actual time=55.285..55.600 rows=100 loops=1) > Sort Key: transaction_base_amount > -> Index Scan using account_transaction_ai on > account_transaction > (cost=0.00..18455.77 rows=5210 width=120) > (actual time=0.057..53.187 rows=308 loops=1) > Index Cond: (account_id = 10783) > Total runtime: 56.597 ms > > The table currently contains about 10 million records. > > If I sort by account_transaction_timestamp then limit, the planner > performs a backward index scan based on account_transaction_timestamp > then limits. > > If I sort but don't limit, it performs a forward scan on the table, then > a sort on the results. > > If I limit but don't sort, it performs the forward scan then limits. > > If I limit and sort, but sort by transaction_base_amount instead, it > performs a forward index scan, then sorts the results, then limits - in > stark contrast to sorting and limiting using account_transaction_timestamp. > > As shown, there is a vast difference in efficiency between sorting and > limiting by account_transaction_timestamp, and doing the same using > transaction_base_amount (in favour of transaction_base_amount by a > factor of about 4860). This seems to be due to the way the planner is > optimising the query. > > Could this optimisation peculiarity be due to the combination of indexed > columns in the query? Also, Is there a way I can 'force' the planner to > perform an 'index scan -> sort -> limit' or even better an 'index scan > -> limit -> sort'? > > Any pointers / assistance appreciated. > Starting off with a disclaimer, just to get that out of the way; I'm a programmer; no DBA, have limited experience with Postgres, and my terminology is probably way off. First thing I notice is that your query plans seem to only use one index. Postgres should be able to combine the timestamp and account_id indexes in that first query, if the optimizer thought there was a point in doing so? Did you analyze the db recently? The optimizer won't perform too well with incorrect statistics. If that doesn't help, you could perhaps add account_id to your timestamp index. I have no idea how far the postgres planner goes in restructuring your queries, but using explicit subqueries, correlated vs. non-correlated, may give it a good hint about what you're trying to do. HTH, Isak > Kind Regards, > > Shaun Johnston > > ---------------------------(end of broadcast)--------------------------- > TIP 1: if posting/reading through Usenet, please send an appropriate > subscribe-nomail command to majordomo@postgresql.org so that your > message can get through to the mailing list cleanly >
pgsql-general by date: