Thread: speeding up a query
Hi, I'm on 8.0.10 and there is a query I cannot quite get adequately fast. Should it take 2.5s to sort these 442 rows? Are my settings bad? Is my query stupid? Would appreciate any tips. Best regards, Marcus apa=> explain analyze apa-> select apa-> ai.objectid as ai_objectid apa-> from apa-> apa_item ai apa-> where apa-> idxfti @@ to_tsquery('default', 'KCA0304') AND apa-> ai.status = 30 apa-> ORDER BY ai.calc_rating desc apa-> LIMIT 1000; Limit (cost=54.40..54.43 rows=12 width=8) (actual time=2650.254..2651.093 rows=442 loops=1) -> Sort (cost=54.40..54.43 rows=12 width=8) (actual time=2650.251..2650.515 rows=442 loops=1) Sort Key: calc_rating -> Index Scan using apa_item_fts on apa_item ai (cost=0.00..54.18 rows=12 width=8) (actual time=61.261..2649.045 rows=442 loops=1) Index Cond: (idxfti @@ '''kca0304'''::tsquery) Filter: (status = 30) Total runtime: 2651.659 ms (7 rows) apa=> explain analyze apa-> select apa-> ai.objectid as ai_objectid apa-> from apa-> apa_item ai apa-> where apa-> idxfti @@ to_tsquery('default', 'KCA0304') AND apa-> ai.status = 30 apa-> LIMIT 1000; Limit (cost=0.00..54.18 rows=12 width=4) (actual time=0.186..18.628 rows=442 loops=1) -> Index Scan using apa_item_fts on apa_item ai (cost=0.00..54.18 rows=12 width=4) (actual time=0.183..17.999 rows=442 loops=1) Index Cond: (idxfti @@ '''kca0304'''::tsquery) Filter: (status = 30) Total runtime: 19.062 ms (5 rows)
Marcus Engene <mengpg2@engene.se> writes: > Should it take 2.5s to sort these 442 rows? > Limit (cost=54.40..54.43 rows=12 width=8) (actual > time=2650.254..2651.093 rows=442 loops=1) > -> Sort (cost=54.40..54.43 rows=12 width=8) (actual > time=2650.251..2650.515 rows=442 loops=1) > Sort Key: calc_rating > -> Index Scan using apa_item_fts on apa_item ai > (cost=0.00..54.18 rows=12 width=8) (actual time=61.261..2649.045 > rows=442 loops=1) > Index Cond: (idxfti @@ '''kca0304'''::tsquery) > Filter: (status = 30) > Total runtime: 2651.659 ms It's not the sort that's taking 2.5s --- the sort looks to be taking about a millisec and a half. The indexscan is eating the other 2649 msec. The question that seems to be interesting is what's the difference between the contexts of your two queries, because they sure look like the indexscans were the same. Maybe the second one is merely benefiting from the first one having already sucked all the data into cache? regards, tom lane
Hi again, I was thinking, in my slow query it seems the sorting is the villain. Doing a simple qsort test I notice that: ehsmeng@menglap /cygdrive/c/pond/dev/tt $ time ./a.exe 430 real 0m0.051s user 0m0.030s sys 0m0.000s ehsmeng@menglap /cygdrive/c/pond/dev/tt $ time ./a.exe 430000 real 0m0.238s user 0m0.218s sys 0m0.015s ehsmeng@menglap /cygdrive/c/pond/dev/tt $ time ./a.exe 4300000 real 0m2.594s user 0m2.061s sys 0m0.108s From this very unfair test indeed I see that my machine has the capability to sort 4.3 million entries during the same time my pg is sorting 430. And i cannot stop wondering if there is some generic sorting routine that is incredibly slow? Would it be possible to, in the situations where order by is by simple datatypes of one column, to do a special sorting, like the qsort example in the end of this mail? Is this already addressed in later versions? If no, why? and if yes, where in the pg code do I look? Best regards, Marcus #include <stdio.h> #include <stdlib.h> typedef struct { int val; void *pek; } QSORTSTRUCT_INT_S; int sortstruct_int_compare(void const *a, void const *b) { return ( ((QSORTSTRUCT_INT_S *)a)->val - ((QSORTSTRUCT_INT_S *)b)->val ); } int main (int argc, char **argv) { int nbr = 0; int i = 0; QSORTSTRUCT_INT_S *sort_arr = 0; if (1 == argc) { printf("forgot amount argument\n"); exit(1); } nbr = atoi (argv[1]); if (0 == (sort_arr = malloc (sizeof(QSORTSTRUCT_INT_S) * nbr))) { printf("cannot alloc\n"); exit(1); } srand(123); for (i=0; i<nbr; i++) { sort_arr[i].val = rand(); } qsort(sort_arr, nbr, sizeof(QSORTSTRUCT_INT_S),sortstruct_int_compare); return 0; }
Tom Lane skrev: > Marcus Engene <mengpg2@engene.se> writes: > >> Should it take 2.5s to sort these 442 rows? >> > > >> Limit (cost=54.40..54.43 rows=12 width=8) (actual >> time=2650.254..2651.093 rows=442 loops=1) >> -> Sort (cost=54.40..54.43 rows=12 width=8) (actual >> time=2650.251..2650.515 rows=442 loops=1) >> Sort Key: calc_rating >> -> Index Scan using apa_item_fts on apa_item ai >> (cost=0.00..54.18 rows=12 width=8) (actual time=61.261..2649.045 >> rows=442 loops=1) >> Index Cond: (idxfti @@ '''kca0304'''::tsquery) >> Filter: (status = 30) >> Total runtime: 2651.659 ms >> > > It's not the sort that's taking 2.5s --- the sort looks to be taking > about a millisec and a half. The indexscan is eating the other 2649 > msec. The question that seems to be interesting is what's the > difference between the contexts of your two queries, because they > sure look like the indexscans were the same. Maybe the second one > is merely benefiting from the first one having already sucked all the > data into cache? > > regards, tom lane > Yes indeed you are completely right! Both queries take about the same when run after the other. And I just made a fool of myself with an optimizing idea I had... Sorry for the noise and thanks for your answer! Best regards, Marcus