Thread: BUG #2243: Postgresql fails to finish some queries

BUG #2243: Postgresql fails to finish some queries

From
"Matej Rizman"
Date:
The following bug has been logged online:

Bug reference:      2243
Logged by:          Matej Rizman
Email address:      matej.rizman@gmail.com
PostgreSQL version: 8.0 and 8.1
Operating system:   Linux Debian, kernel 2.6.12-1-k7
Description:        Postgresql fails to finish some queries
Details:

Execute the following script:

CREATE TABLE a (
num int4
);

CREATE TABLE b (
num int4
);

CREATE UNIQUE INDEX ix_a_num ON a(num);
CREATE INDEX ix_b_num ON b(num);

COPY a FROM '/tmp/a_30000.txt';
COPY b FROM '/tmp/b_30000.txt';

SELECT * FROM b WHERE num NOT IN (SELECT num FROM a);

Files a_30000.txt and b_30000.txt contain 30000 numbers each.

The last query (SELECT) is executed on my machine in 125ms.

If I load data from files a_100000.txt and b_100000.txt that contain 100000
numbers each, the last SELECT does not finish in more than ten minutes (the
real-world sample hasn't finished in more than an hour).

The similar real-world sample does not even return the results of EXPLAIN
statement. However, this real-world sample is quite complex and I am not
posting it there. In the case I provided the EXPLAIN statement works fine.

This behaviour has been observed on postgresql 8.0 and on postgresql 8.1. It
has also been tested on two different computers, both running debian linux.

I can provide files a_* and b_* if you want. However, they are quite large
and unsuitable for mailing list attachments.

Best regards,
Matej Rizman

Re: BUG #2243: Postgresql fails to finish some queries

From
Bruno Wolff III
Date:
On Tue, Feb 07, 2006 at 02:41:03 +0000,
  Matej Rizman <matej.rizman@gmail.com> wrote:
>
> Files a_30000.txt and b_30000.txt contain 30000 numbers each.
>
> The last query (SELECT) is executed on my machine in 125ms.
>
> If I load data from files a_100000.txt and b_100000.txt that contain 100000
> numbers each, the last SELECT does not finish in more than ten minutes (the
> real-world sample hasn't finished in more than an hour).

Did you do an ANALYZE after doing the load?

Re: BUG #2243: Postgresql fails to finish some queries

From
Matej Rizman
Date:
MjAwNi8yLzgsIEJydW5vIFdvbGZmIElJSSA8YnJ1bm9Ad29sZmYudG8+Ogo+
IE9uIFR1ZSwgRmViIDA3LCAyMDA2IGF0IDAyOjQxOjAzICswMDAwLAo+ICAg
TWF0ZWogUml6bWFuIDxtYXRlai5yaXptYW5AZ21haWwuY29tPiB3cm90ZToK
PiA+Cj4gPiBGaWxlcyBhXzMwMDAwLnR4dCBhbmQgYl8zMDAwMC50eHQgY29u
dGFpbiAzMDAwMCBudW1iZXJzIGVhY2guCj4gPgo+ID4gVGhlIGxhc3QgcXVl
cnkgKFNFTEVDVCkgaXMgZXhlY3V0ZWQgb24gbXkgbWFjaGluZSBpbiAxMjVt
cy4KPiA+Cj4gPiBJZiBJIGxvYWQgZGF0YSBmcm9tIGZpbGVzIGFfMTAwMDAw
LnR4dCBhbmQgYl8xMDAwMDAudHh0IHRoYXQgY29udGFpbiAxMDAwMDAKPiA+
IG51bWJlcnMgZWFjaCwgdGhlIGxhc3QgU0VMRUNUIGRvZXMgbm90IGZpbmlz
aCBpbiBtb3JlIHRoYW4gdGVuIG1pbnV0ZXMgKHRoZQo+ID4gcmVhbC13b3Js
ZCBzYW1wbGUgaGFzbid0IGZpbmlzaGVkIGluIG1vcmUgdGhhbiBhbiBob3Vy
KS4KPgo+IERpZCB5b3UgZG8gYW4gQU5BTFlaRSBhZnRlciBkb2luZyB0aGUg
bG9hZD8KPgoKTm8uCgpCdXQgdGhlIHNpbWlsYXIgcHJvYmxlbSBmaXJzdCBh
cHBlYXJlZCBvbiB0YWJsZXMgdGhhdCBhcmUgb24KcHJvZHVjdGlvbiBzZXJ2
ZXJzIGFuZCBWQUNVVU1lZCByZWd1bGFybHkgKGluIGFkZGl0aW9uIHRvIGF1
dG92YWN1dW0pLgoKVGhlIHNhbWUgcHJvYmxlbSBhcHBlYXJzIGlmIHRhYmxl
cyBhcmUgY3JlYXRlZCB3aXRoIFNFTEVDVCBJTlRPIGFuZAp3aXRoIGluZGlj
ZXMgY3JlYXRlZCBsYXR0ZXIuIENPUFkgaXMgb25seSB1c2VkIGluIHRoaXMg
YnVnIHJlcG9ydCBzbwp0aGF0IEkgY2FuIGZpbGwgdGFibGVzIHdpdGggc29t
ZSBudW1iZXJzLgoKRGlkIHlvdSBnZXQgbWVzc2FnZSBpbiB3aGljaCBJIGZv
dW5kIGEgc29sdXRpb24/IFBhcmFtZXRlciB3b3JrX21lbQpoYXMgdG8gYmUg
Y2hhbmdlZCB0byAxNjM4NCBhbmQgdGhlbiBwb3N0Z3Jlc3FsIGZpbmlzaGVz
IHF1ZXJ5IGluIDItMwpzZWNvbmRzLiBJZiB0aGlzIHBhcmFtZXRlciBpcyBz
ZXQgdG8gZGVmYXVsdCB2YWx1ZSwgaXQgdGFrZXMgYWJvdXQgdHdvCmhvdXJz
IHRvIGZpbmlzaCB0aGlzIHF1ZXJ5LgoKSSB0aG91Z2ggdGhhdCBpZiB3b3Jr
X21lbSBwYXJhbWV0ZXIgd2FzIHRvbyBzbWFsbCwgcG9zdGdyZXNxbCB3b3Vs
ZApleHRlbnNpdmVseSB1c2UgZGlzay4gSG93ZXZlciwgdGhpcyBkaWRuJ3Qg
aGFwcGVuIGluIG15IGNhc2UgLSBkaXNrCkxFRHMgYmxpbmtlZCBvbmx5IGZy
b20gdGltZSB0byB0aW1lIGFzIHVuZGVyIG5vIGxvYWQuCgpUaGFuayB5b3Ug
Zm9yIHlvdXIgYW5zd2VyLApNYXRlaiBSaXptYW4K

Re: BUG #2243: Postgresql fails to finish some queries

From
Andrew - Supernews
Date:
On 2006-02-08, Matej Rizman <matej.rizman@gmail.com> wrote:
> No.
>
> But the similar problem first appeared on tables that are on
> production servers and VACUUMed regularly (in addition to autovacuum).
>
> The same problem appears if tables are created with SELECT INTO and
> with indices created latter. COPY is only used in this bug report so
> that I can fill tables with some numbers.
>
> Did you get message in which I found a solution? Parameter work_mem
> has to be changed to 16384 and then postgresql finishes query in 2-3
> seconds. If this parameter is set to default value, it takes about two
> hours to finish this query.

Any time you use a NOT IN (subselect) query, you are pretty much setting
yourself up for performance problems; pg currently doesn't have any way
to plan these queries as joins, so it will plan either as a plain subplan
(in which case the subselect is run once to completion for _every row_ of
the outer query) or as a hashed subplan (in which case the subselect is
run once, stored in a hashtable which is then consulted for each row).

> I though that if work_mem parameter was too small, postgresql would
> extensively use disk. However, this didn't happen in my case - disk
> LEDs blinked only from time to time as under no load.

The "hashed subplan" can only be used if the _estimated_ size of the
subquery result is small enough that the hashtable will fit within work_mem.
If the estimate is larger than this, a hashed subplan will not be used since
it does not spill to disk; instead it will use a plain subplan.

Rewrite the query as an outer join and you will be much better off.

--
Andrew, Supernews
http://www.supernews.com - individual and corporate NNTP services