Thread: CTTAS w/ DISTINCT ON crashes backend
Hello. I subscribed to pgsql-bugs and submitted a bug report about 10 hours ago, but have yet to see my post go through, so I thought I'd throw this out on -general.

In a nutshell:

CREATE TEMPORARY TABLE foo AS
SELECT DISTINCT ON (x, y, z) *
FROM bar;

crashes the backend and screws up data pages associated with the catalog under 7.4.1. It worked fine under 7.3.2. It always works when there isn't any data. With my data, however, it crashes the backend every time. Just to be sure, I fsck'ed w/badblocks (-c -c) and ran memtest86 - no errors. I reloaded the entire cluster from backup cleanly and executed the same query again, and it crashes at precisely the same place using the above construct. No core images were left behind. If I get enough time, I'll attach gdb and generate a backtrace.

Rewriting the query as:

CREATE TEMPORARY TABLE foo AS
SELECT * FROM bar LIMIT 0;

INSERT INTO foo
SELECT DISTINCT ON (x, y, z) * FROM bar;

does not crash the backend and works as expected.

'bar', btw, is also a temporary table...

Mike Mascari
Tom Lane wrote:

>>In a nutshell:
>>
>>CREATE TEMPORARY TABLE foo AS
>>SELECT DISTINCT ON (x, y, z) *
>>FROM bar;
>>
>>crashes the backend and screws up data pages associated with the catalog under
>>7.4.1.
>
>Works for me ...
>
> ...
>Perhaps providing a specific test case would help.

Could you give me a bit of direction? I dumped the data associated with the tables involved from database "Development", loaded them into a new database "Test" and ran the script which causes the backend to crash, and it worked fine, no errors. This was on the same machine. I then ran the same script against the database "Development" (from which I had just dumped the data and relevant schema for "Test") and I got a crashed backend.

Here's a backtrace:

Program received signal SIGSEGV, Segmentation fault.
0x0806fb52 in nocachegetattr ()
#0 0x0806fb52 in nocachegetattr ()
#1 0x0810186e in execTuplesMatch ()
#2 0x081115dc in ExecUnique ()
#3 0x08104cf8 in ExecProcNode ()
#4 0x0810356d in ExecutePlan ()
#5 0x08102968 in ExecutorRun ()
#6 0x08178682 in ProcessQuery ()
#7 0x08179144 in PortalRunMulti ()
#8 0x08178afb in PortalRun ()
#9 0x08175545 in exec_simple_query ()
#10 0x08177c09 in PostgresMain ()
#11 0x08151b9b in BackendFork ()
#12 0x081515a3 in BackendStartup ()
#13 0x0814faa8 in ServerLoop ()
#14 0x0814f171 in PostmasterMain ()
#15 0x0811f5b5 in main ()
#16 0x42015967 in __libc_start_main () from /lib/i686/libc.so.6
(gdb)

I don't get it. I had received this error in the "Development" database while running the application. I thought perhaps it was bad blocks or flaky RAM. So I *wiped out* the database cluster after running fsck and restored from the "Production" database dump copied from another machine. I ran the application again and it crashed at the exact same place.
I can send you the query and the schema, but as I've said, when I load the schema & data associated with the tables and views involved with this query from "Development" into a new "Test" database in the same cluster, it executes fine???

I'll try to dump the entire database and restore it on a third machine and see if the query crashes that backend as well. But it will take a bit of time. If it does crash, what does that mean? If not, what does that mean?

Mike Mascari
Mike Mascari <mascarm@mascari.com> writes:
> Could you give me a bit of direction?
> [ same query works in one DB and crashes in another ]

I have a feeling this is a problem with an incorrect plan --- possibly the same thing I just fixed a few days ago,
http://archives.postgresql.org/pgsql-committers/2004-01/msg00134.php
or perhaps another bug. Look at EXPLAIN output and see if the two databases are generating different plans for the query. If so, perhaps ANALYZE in the test database is needed? In any case, don't ANALYZE in the development DB, for fear of moving the stats enough to make the problem go away ...

regards, tom lane
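
The plan comparison suggested above could look roughly like the following. This is only a sketch: bar, x, y and z are the placeholder names from the original report, and it explains the bare SELECT rather than the full CREATE TEMPORARY TABLE ... AS statement, on the assumption that both get the same plan. That assumption, and whether EXPLAIN accepts the CREATE TABLE AS form directly on 7.4.1, has not been confirmed in this thread.

```sql
-- Run this in BOTH databases ("Development" and "Test"), in a session
-- where the temporary table bar has already been created and loaded.
-- bar, x, y, z are the placeholder names from the original report.
EXPLAIN
SELECT DISTINCT ON (x, y, z) *
FROM bar;

-- Only if the two plans differ, and only in "Test":
-- refresh the planner statistics, then look at the plan again.
-- Do NOT run ANALYZE in "Development", so that the statistics there
-- stay as they are and the crash remains reproducible.
ANALYZE bar;

EXPLAIN
SELECT DISTINCT ON (x, y, z) *
FROM bar;
```

If "Test" switches to the plan that "Development" shows once its statistics are refreshed, and the query then crashes there too, that would point toward the incorrect-plan explanation rather than damaged data.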