Cursor-based results: bafflingly slow - Mailing list pgsql-jdbc

From Robin Houston
Subject Cursor-based results: bafflingly slow
Date
Msg-id 1b795e7b0907030640m347931c1v64e6039b9b8590b0@mail.gmail.com
Whole thread Raw
In response to Cursor-based results: bafflingly slow  (Robin Houston <robin.houston@gmail.com>)
List pgsql-jdbc
2009/7/3 Oliver Jowett <oliver@opencloud.com>:

> They're not really equivalent. A portal with a limited fetchsize is more
> like DECLARE CURSOR, not LIMIT.
>
> What does the plan look like if you plan it without the LIMIT?

Hmm, well... If I do the same as before, but without the limit, then the plan is:

Sort  (cost=353822.56..353835.72 rows=5264 width=612)
  Sort Key: polypeptide.feature_id

  InitPlan
    ->  Seq Scan on organism  (cost=0.00..1.95 rows=1 width=4)
          Filter: ((common_name)::text = $1)
  ->  Nested Loop  (cost=88664.31..352053.24 rows=5264 width=612)
        ->  Hash Join  (cost=88663.00..351946.66 rows=188 width=96)
              Hash Cond: (featureprop.feature_id = polypeptide.feature_id)
              ->  Hash Join  (cost=2790.36..264059.63 rows=51268 width=96)
                    Hash Cond: (featureprop.type_id = featureprop_type.cvterm_id)
                    ->  Seq Scan on featureprop  (cost=0.00..222284.97 rows=10259097 width=56)
                    ->  Hash  (cost=2785.51..2785.51 rows=388 width=52)
                          ->  Seq Scan on cvterm featureprop_type  (cost=0.00..2785.51 rows=388 width=52)
                                Filter: (cv_id = cv_id)
              ->  Hash  (cost=85344.24..85344.24 rows=32192 width=4)
                    ->  Bitmap Heap Scan on feature polypeptide  (cost=827.57..85344.24 rows=32192 width=4)
                          Recheck Cond: ((organism_id = $0) AND (type_id = 191))
                          ->  Bitmap Index Scan on feature_genedb_idx1  (cost=0.00..819.52 rows=32192 width=0)

                                Index Cond: ((organism_id = $0) AND (type_id = 191))
        ->  Materialize  (cost=1.31..1.59 rows=28 width=516)

              ->  Seq Scan on cv featureprop_type_cv  (cost=0.00..1.28 rows=28 width=516)


Surely the server is not so stupid as to use an unnecessary server-side sort, if I've explicitly asked to fetch the rows 100 at a time?

Is there any easy way to find out what execution plan the server is actually using? I suppose I could ask the DBAs to enable auto_explain on the dev database.

If this *is* the problem, do you know any way round it. Obviously I want the plan that *doesn't* involve sorting everything before returning anything.

If I explicitly create a cursor, with no limit, then it does return the first 100 rows very quickly. On the other hand, I have to hard-code the parameter this way (because I don't know any way, in psql, to use a bind variable with an explicit cursor.)

begin;
declare polypeptide_props_c no scroll cursor for [...query...]
fetch forward 100 from polypeptide_props_c;
rollback;

This fetch completes in 31.313 ms.

Thanks for your help so far!

Robin

pgsql-jdbc by date:

Previous
From: Oliver Jowett
Date:
Subject: Re: Cursor-based results: bafflingly slow
Next
From: Oliver Jowett
Date:
Subject: Re: Cursor-based results: bafflingly slow