Thread: BUG #2243: Postgresql fails to finish some queries
The following bug has been logged online:

Bug reference:      2243
Logged by:          Matej Rizman
Email address:      matej.rizman@gmail.com
PostgreSQL version: 8.0 and 8.1
Operating system:   Linux Debian, kernel 2.6.12-1-k7
Description:        Postgresql fails to finish some queries
Details:

Execute the following script:

CREATE TABLE a ( num int4 );
CREATE TABLE b ( num int4 );

CREATE UNIQUE INDEX ix_a_num ON a(num);
CREATE INDEX ix_b_num ON b(num);

COPY a FROM '/tmp/a_30000.txt';
COPY b FROM '/tmp/b_30000.txt';

SELECT * FROM b WHERE num NOT IN (SELECT num FROM a);

Files a_30000.txt and b_30000.txt contain 30000 numbers each. The last
query (the SELECT) executes on my machine in 125ms.

If I instead load data from files a_100000.txt and b_100000.txt, which
contain 100000 numbers each, the last SELECT does not finish within ten
minutes (the real-world sample hasn't finished in more than an hour).

A similar real-world sample does not even return the results of an
EXPLAIN statement. However, that real-world sample is quite complex, so
I am not posting it here. In the case provided above, EXPLAIN works
fine.

This behaviour has been observed on PostgreSQL 8.0 and 8.1, and has
been tested on two different computers, both running Debian Linux.

I can provide the a_* and b_* files if you want. However, they are
quite large and unsuitable as mailing list attachments.

Best regards,
Matej Rizman
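(For anyone reproducing this without the data files, something like the
following should stand in for the two COPY commands. The particular
ranges are an assumption - any two large, overlapping sets of integers
will do; generate_series is available on 8.0 and 8.1.)

-- Hypothetical substitute for the two COPY steps:
INSERT INTO a SELECT g FROM generate_series(1, 100000) AS g;
INSERT INTO b SELECT g FROM generate_series(50001, 150000) AS g;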
On Tue, Feb 07, 2006 at 02:41:03 +0000,
  Matej Rizman <matej.rizman@gmail.com> wrote:
>
> Files a_30000.txt and b_30000.txt contain 30000 numbers each.
>
> The last query (the SELECT) executes on my machine in 125ms.
>
> If I instead load data from files a_100000.txt and b_100000.txt, which
> contain 100000 numbers each, the last SELECT does not finish within ten
> minutes (the real-world sample hasn't finished in more than an hour).

Did you do an ANALYZE after doing the load?
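(For context: the planner relies on statistics gathered by ANALYZE, and
a freshly loaded table has none, so the size of the subselect's result
can be badly misestimated. A minimal sketch of the step being asked
about, using the tables from the report:)

-- Refresh planner statistics after the bulk load, then check the plan.
ANALYZE a;
ANALYZE b;
EXPLAIN SELECT * FROM b WHERE num NOT IN (SELECT num FROM a);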
2006/2/8, Bruno Wolff III <bruno@wolff.to>:
> On Tue, Feb 07, 2006 at 02:41:03 +0000,
>   Matej Rizman <matej.rizman@gmail.com> wrote:
> >
> > Files a_30000.txt and b_30000.txt contain 30000 numbers each.
> >
> > The last query (the SELECT) executes on my machine in 125ms.
> >
> > If I instead load data from files a_100000.txt and b_100000.txt, which
> > contain 100000 numbers each, the last SELECT does not finish within ten
> > minutes (the real-world sample hasn't finished in more than an hour).
>
> Did you do an ANALYZE after doing the load?

No.

But a similar problem first appeared on tables that are on production
servers and VACUUMed regularly (in addition to autovacuum).

The same problem appears if the tables are created with SELECT INTO and
the indices are created later. COPY is used in this bug report only so
that I can fill the tables with some numbers.

Did you get the message in which I found a solution? The work_mem
parameter has to be changed to 16384, and then PostgreSQL finishes the
query in 2-3 seconds. If this parameter is left at its default value,
the query takes about two hours to finish.

I thought that if work_mem was too small, PostgreSQL would make
extensive use of the disk. However, this didn't happen in my case - the
disk LEDs blinked only from time to time, as under no load.

Thank you for your answer,
Matej Rizman
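(For reference, the change described above does not have to be made
globally in postgresql.conf; work_mem can be raised for a single
session. A minimal sketch - work_mem is specified in kilobytes by
default, so 16384 is 16MB:)

SET work_mem = 16384;   -- 16384 kB = 16MB, this session only
SELECT * FROM b WHERE num NOT IN (SELECT num FROM a);
RESET work_mem;         -- return to the server default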
On 2006-02-08, Matej Rizman <matej.rizman@gmail.com> wrote:
> No.
>
> But a similar problem first appeared on tables that are on production
> servers and VACUUMed regularly (in addition to autovacuum).
>
> The same problem appears if the tables are created with SELECT INTO and
> the indices are created later. COPY is used in this bug report only so
> that I can fill the tables with some numbers.
>
> Did you get the message in which I found a solution? The work_mem
> parameter has to be changed to 16384, and then PostgreSQL finishes the
> query in 2-3 seconds. If this parameter is left at its default value,
> the query takes about two hours to finish.

Any time you use a NOT IN (subselect) query, you are pretty much
setting yourself up for performance problems; pg currently doesn't have
any way to plan these queries as joins, so it will plan either a plain
subplan (in which case the subselect is run once to completion for
_every row_ of the outer query) or a hashed subplan (in which case the
subselect is run once and stored in a hashtable, which is then
consulted for each row).

> I thought that if work_mem was too small, PostgreSQL would make
> extensive use of the disk. However, this didn't happen in my case - the
> disk LEDs blinked only from time to time, as under no load.

The "hashed subplan" can only be used if the _estimated_ size of the
subquery result is small enough that the hashtable will fit within
work_mem. If the estimate is larger than that, a hashed subplan will
not be used, since it does not spill to disk; a plain subplan will be
used instead.

Rewrite the query as an outer join and you will be much better off.

-- 
Andrew, Supernews
http://www.supernews.com - individual and corporate NNTP services
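(A minimal sketch of the rewrite Andrew suggests, using the tables from
the original report. Note the two forms are only equivalent when a.num
contains no NULLs: NOT IN yields no rows at all if the subselect
produces a NULL.)

-- Anti-join: keep only the rows of b with no matching row in a.
SELECT b.*
FROM b LEFT JOIN a ON a.num = b.num
WHERE a.num IS NULL;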