Thread: speeding up a query

speeding up a query

From
Marcus Engene
Date:
Hi,

I'm on 8.0.10 and there is a query I cannot quite get adequately fast.
Should it take 2.5s to sort these 442 rows? Are my settings bad? Is
my query stupid?

Would appreciate any tips.

Best regards,
Marcus


apa=> explain analyze
apa->  select
apa->      ai.objectid as ai_objectid
apa->  from
apa->      apa_item ai
apa->  where
apa->      idxfti @@ to_tsquery('default', 'KCA0304')  AND
apa->      ai.status = 30
apa->  ORDER BY ai.calc_rating desc
apa->  LIMIT 1000;

 Limit  (cost=54.40..54.43 rows=12 width=8) (actual
time=2650.254..2651.093 rows=442 loops=1)
   ->  Sort  (cost=54.40..54.43 rows=12 width=8) (actual
time=2650.251..2650.515 rows=442 loops=1)
         Sort Key: calc_rating
         ->  Index Scan using apa_item_fts on apa_item ai
(cost=0.00..54.18 rows=12 width=8) (actual time=61.261..2649.045
rows=442 loops=1)
               Index Cond: (idxfti @@ '''kca0304'''::tsquery)
               Filter: (status = 30)
 Total runtime: 2651.659 ms
(7 rows)

apa=> explain analyze
apa->  select
apa->      ai.objectid as ai_objectid
apa->  from
apa->      apa_item ai
apa->  where
apa->      idxfti @@ to_tsquery('default', 'KCA0304')  AND
apa->      ai.status = 30
apa->  LIMIT 1000;

 Limit  (cost=0.00..54.18 rows=12 width=4) (actual time=0.186..18.628
rows=442 loops=1)
   ->  Index Scan using apa_item_fts on apa_item ai  (cost=0.00..54.18
rows=12 width=4) (actual time=0.183..17.999 rows=442 loops=1)
         Index Cond: (idxfti @@ '''kca0304'''::tsquery)
         Filter: (status = 30)
 Total runtime: 19.062 ms
(5 rows)


Re: speeding up a query

From
Tom Lane
Date:
Marcus Engene <mengpg2@engene.se> writes:
> Should it take 2.5s to sort these 442 rows?

>  Limit  (cost=54.40..54.43 rows=12 width=8) (actual
> time=2650.254..2651.093 rows=442 loops=1)
>    ->  Sort  (cost=54.40..54.43 rows=12 width=8) (actual
> time=2650.251..2650.515 rows=442 loops=1)
>          Sort Key: calc_rating
>          ->  Index Scan using apa_item_fts on apa_item ai
> (cost=0.00..54.18 rows=12 width=8) (actual time=61.261..2649.045
> rows=442 loops=1)
>                Index Cond: (idxfti @@ '''kca0304'''::tsquery)
>                Filter: (status = 30)
>  Total runtime: 2651.659 ms

It's not the sort that's taking 2.5s --- the sort looks to be taking
about a millisec and a half.  The indexscan is eating the other 2649
msec.  The question that seems to be interesting is what's the
difference between the contexts of your two queries, because they
sure look like the indexscans were the same.  Maybe the second one
is merely benefiting from the first one having already sucked all the
data into cache?

            regards, tom lane

Re: speeding up a query

From
Marcus Engene
Date:
Hi again,

I was thinking, in my slow query it seems the sorting is the villain.
Doing a simple qsort test I notice that:
ehsmeng@menglap /cygdrive/c/pond/dev/tt
$ time ./a.exe 430

real    0m0.051s
user    0m0.030s
sys     0m0.000s

ehsmeng@menglap /cygdrive/c/pond/dev/tt
$ time ./a.exe 430000

real    0m0.238s
user    0m0.218s
sys     0m0.015s

ehsmeng@menglap /cygdrive/c/pond/dev/tt
$ time ./a.exe 4300000

real    0m2.594s
user    0m2.061s
sys     0m0.108s

 From this very unfair test indeed I see that my machine has the
capability to sort 4.3 million entries during the same time my pg is
sorting 430.

And i cannot stop wondering if there is some generic sorting routine
that is incredibly slow? Would it be possible to, in the situations
where order by is by simple datatypes of one column, to do a special
sorting, like the qsort example in the end of this mail? Is this already
addressed in later versions?

If no, why? and if yes, where in the pg code do I look?

Best regards,
Marcus


#include <stdio.h>
#include <stdlib.h>

typedef struct {
    int     val;
    void   *pek;
} QSORTSTRUCT_INT_S;

int sortstruct_int_compare(void const *a, void  const *b)
{
    return ( ((QSORTSTRUCT_INT_S *)a)->val - ((QSORTSTRUCT_INT_S
*)b)->val );
}

int main (int argc, char **argv)
{
    int nbr = 0;
    int i = 0;
    QSORTSTRUCT_INT_S *sort_arr = 0;
    if (1 == argc) {
        printf("forgot amount argument\n");
        exit(1);
    }
    nbr = atoi (argv[1]);
    if (0 == (sort_arr = malloc (sizeof(QSORTSTRUCT_INT_S) * nbr))) {
        printf("cannot alloc\n");
        exit(1);
    }
    srand(123);
    for (i=0; i<nbr; i++) {
        sort_arr[i].val = rand();
    }
    qsort(sort_arr, nbr, sizeof(QSORTSTRUCT_INT_S),sortstruct_int_compare);
    return 0;
}


Re: speeding up a query

From
Marcus Engene
Date:
Tom Lane skrev:
> Marcus Engene <mengpg2@engene.se> writes:
>
>> Should it take 2.5s to sort these 442 rows?
>>
>
>
>>  Limit  (cost=54.40..54.43 rows=12 width=8) (actual
>> time=2650.254..2651.093 rows=442 loops=1)
>>    ->  Sort  (cost=54.40..54.43 rows=12 width=8) (actual
>> time=2650.251..2650.515 rows=442 loops=1)
>>          Sort Key: calc_rating
>>          ->  Index Scan using apa_item_fts on apa_item ai
>> (cost=0.00..54.18 rows=12 width=8) (actual time=61.261..2649.045
>> rows=442 loops=1)
>>                Index Cond: (idxfti @@ '''kca0304'''::tsquery)
>>                Filter: (status = 30)
>>  Total runtime: 2651.659 ms
>>
>
> It's not the sort that's taking 2.5s --- the sort looks to be taking
> about a millisec and a half.  The indexscan is eating the other 2649
> msec.  The question that seems to be interesting is what's the
> difference between the contexts of your two queries, because they
> sure look like the indexscans were the same.  Maybe the second one
> is merely benefiting from the first one having already sucked all the
> data into cache?
>
>             regards, tom lane
>
Yes indeed you are completely right! Both queries take about the same when
run after the other. And I just made a fool of myself with an optimizing
idea I
had...

Sorry for the noise and thanks for your answer!

Best regards,
Marcus