BUG #6587: Limit on a query is mis-documented - Mailing list pgsql-bugs

From mboldi@prospectiv.com
Subject BUG #6587: Limit on a query is mis-documented
Date
Msg-id E1SImMF-0001YR-06@wrigleys.postgresql.org
Whole thread Raw
Responses Re: BUG #6587: Limit on a query is mis-documented  (Greg Stark <stark@mit.edu>)
Re: BUG #6587: Limit on a query is mis-documented  (John R Pierce <pierce@hogranch.com>)
List pgsql-bugs
The following bug has been logged on the website:

Bug reference:      6587
Logged by:          Mike Boldi
Email address:      mboldi@prospectiv.com
PostgreSQL version: 9.0.3
Operating system:   Linux/Redhat
Description:=20=20=20=20=20=20=20=20

in the manual it says
If a limit count is given, no more than that many rows will be returned (but
possibly less, if the query itself yields less rows)
BUT !!  limit caps the largest internal spool. not the rows returned.
I am doing a large query so I limit the query to 5500000 but I get 88 rows
returned. I increase this to 9000000 and get 136 rows returned ..
explain Plan ...
Aggregate  (cost=3D1689951.65..1689951.66 rows=3D1 width=3D0)
   ->  Hash Join  (cost=3D644.62..1687656.17 rows=3D918194 width=3D0)
         Hash Cond: (a.email_a =3D o.email)
         ->  Subquery Scan on a  (cost=3D0.00..1654854.18 rows=3D9149 width=
=3D64)
               Filter: ((a.memberid_a > a.memberid_b) AND (a.email_a =3D
a.email_b) AND (length(a.email_a) > 0))
               ->  Limit  (cost=3D0.00..1325501.10 rows=3D16467654 width=3D=
690)
                     ->  Nested Loop  (cost=3D0.00..1325501.10 rows=3D16467=
654
width=3D690)
                           ->  Seq Scan on lgen_fraud_members12282=20
(cost=3D0.00..1597.14 rows=3D5381 width=3D345)
                                 Filter: ((regcomplete <> (-8)) AND (dateid
>=3D 5938))
                           ->  Index Scan using lgah11_12282 on
lgen_fraud_members12282  (cost=3D0.00..123.62 rows=3D3060 width=3D345)
                                 Index Cond:
((staging.lgen_fraud_members12282.dateid >=3D
staging.lgen_fraud_members12282.dateid) AND
(staging.lgen_fraud_members12282.memberid >
staging.lgen_fraud_members12282.memberid))
         ->  Hash  (cost=3D393.72..393.72 rows=3D20072 width=3D32)
               ->  Seq Scan on lgen_fraud_email o  (cost=3D0.00..393.72
rows=3D20072 width=3D32)

pgsql-bugs by date:

Previous
From: Robert Haas
Date:
Subject: Re: BUG #6572: The example of SPI_execute is bogus
Next
From: Greg Stark
Date:
Subject: Re: BUG #6587: Limit on a query is mis-documented