Thread: PANIC: unexpected hash relation size
Hi, I'm using Postgres 8.2.2 on Windows. I frequently get this error message:

PANIC: unexpected hash relation size: X, should be Y

There is no specific point where I get this error; it can occur any time after a SELECT/INSERT/UPDATE. I've never gotten it after a DELETE so far. When it occurs, the server needs to be restarted (otherwise pgAdmin disconnects after each access attempt to any DB object). After a restart, any query I run on the same table I was querying before (sometimes also on other tables) results in the same error with different values (X, Y). The only way I could get rid of this error was to dump the database, drop it, then restore it. After a few minutes/hours/days the error occurred again, however.

My tables are small (a few thousand records max), and I rarely use indexes. I searched Google for the error message but found nothing, only some Postgres source code. Has anyone ever received this error? What could be the cause? Is it a bug, or am I doing something wrong?

Thanks in advance,
Csaba
Csaba Kalman <csaba818@freemail.hu> writes:
> I frequently get this error message:
> PANIC: unexpected hash relation size: X, should be Y
> There is no specific point where I get this error; it can occur any time
> after a SELECT/INSERT/UPDATE. I've never gotten it after a DELETE so far.

AFAICS this should only happen in INSERT/UPDATE, since it must be coming from an attempt to insert an entry in a hash index. What's the exact schema of this table, including indexes?

> Is it a bug, or am I doing something wrong?

It's a bug, but you've not provided enough information to let anyone reproduce or fix it.

			regards, tom lane
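PS: if it's easier than hand-copying the definitions, the pg_indexes system view already carries complete CREATE INDEX statements; a query along these lines should dump them (substitute the real table name for the placeholder):

    SELECT indexdef
    FROM pg_indexes                  -- standard system view
    WHERE tablename = 'your_table';  -- placeholder name

That output, plus the table's CREATE script, is all we need.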
I completed another migration from 8.1.3 to 8.2.3, and again the problem with "unexpected data beyond EOF" appeared, exactly twice like before, but in two tables different from the last time. The kernel bug hypothesis seems a strong one. I told Unix Ops about the possible bug, and one of the guys said 2.6.5-7.244 was well known to be trouble on our hardware. We should be upgrading soon. I'll try to get more info regarding the particular Linux bug.

Thanks

Tom Lane wrote:
> I did some idle searching when we were looking at the previous report but
> couldn't find any specific kernel patches about this. It's presumably known
> on the kernel side, but maybe not under the specific "lseek reports stale
> EOF position" guise that we know it as. If you manage to track it down,
> please do post back here, because we'd all like to know.
>
> 			regards, tom lane
Jaime Silvela wrote:
> I completed another migration from 8.1.3 to 8.2.3, and again the problem
> with "unexpected data beyond EOF" appeared, exactly twice like before, but
> in two tables different from the last time.
> The kernel bug hypothesis seems a strong one. I told Unix Ops about the
> possible bug, and one of the guys said 2.6.5-7.244 was well known to be
> trouble on our hardware. We should be upgrading soon. I'll try to get
> more info regarding the particular Linux bug.

2.6.5-7.244 looks like an older SLES kernel, which makes this bug look suspiciously like one we got reported a while ago:
http://archives.postgresql.org/pgsql-admin/2006-09/msg00092.php

and Tom's analysis:
http://archives.postgresql.org/pgsql-hackers/2006-09/msg01899.php

Stefan
Stefan Kaltenbrunner <stefan@kaltenbrunner.cc> writes:
> 2.6.5-7.244 looks like an older SLES kernel, which makes this bug look
> suspiciously like one we got reported a while ago:
> http://archives.postgresql.org/pgsql-admin/2006-09/msg00092.php
> and Tom's analysis:
> http://archives.postgresql.org/pgsql-hackers/2006-09/msg01899.php

Just for context: the error message you are seeing is the one I added pursuant to the second message above. If you'd been running this with an older PG version, you'd have been experiencing silent data loss instead of the error :-(

			regards, tom lane
csaba <csaba818@freemail.hu> writes:
> Here is the CREATE script of the table which is most likely to be one
> source of the error. The last error happened after an insert into this
> table.

Hm, what can you say about the data that goes into the three columns that have hash indexes? I'm going to try testing this with random data, but it'd likely be more useful if I could reproduce your data distribution.

			regards, tom lane
Tom Lane writes:
> Csaba Kalman <csaba818@freemail.hu> writes:
>> I frequently get this error message:
>> PANIC: unexpected hash relation size: X, should be Y
>
> AFAICS this should only happen in INSERT/UPDATE, since it must be coming
> from an attempt to insert an entry in a hash index. What's the exact
> schema of this table, including indexes?
>
>> Is it a bug, or am I doing something wrong?
>
> It's a bug, but you've not provided enough information to let anyone
> reproduce or fix it.

Sorry for not being specific. Here is the CREATE script of the table which is most likely to be one source of the error. The last error happened after an insert into this table.

CREATE TABLE munkalap (
  pk character(12) NOT NULL,
  statusz_pk character(12) NOT NULL,
  ugyfel_pk character(12) NOT NULL,
  ellenor_pk character(12),
  szamla_pk character(12),
  cim_pk character(12),
  napi_munka_pk character(12),
  vonalkod character varying(10),
  teljesites_datum date,
  archivalas_datum timestamp without time zone,
  visszavetel_megjegyzes character varying(600),
  nyomtatva bit(1) NOT NULL,
  csere_oka character varying(100),
  operator_pk_rogzitette character(12) NOT NULL,
  kapcsolat_nev character varying(100),
  kapcsolat_tel character varying(20),
  vizmero_tipus character(1),
  operator_megjegyzes character varying(600),
  kiszallas_napszak character(1),
  rogzites_datum timestamp without time zone,
  merok_szama smallint,
  merok_atmero character varying(50),
  visszavetel_idopont timestamp without time zone,
  leadhatosag_pk character(12),
  leadas_datum date,
  CONSTRAINT munkalap_tabla_pk PRIMARY KEY (pk),
  CONSTRAINT munkalap_cim_pk_fkey FOREIGN KEY (cim_pk) REFERENCES cim(pk),
  CONSTRAINT munkalap_ellenor_pk_fkey FOREIGN KEY (ellenor_pk) REFERENCES alkalmazott(pk) ON DELETE SET NULL,
  CONSTRAINT munkalap_leadhatosag_statusz_pk_fkey FOREIGN KEY (leadhatosag_pk) REFERENCES munkalap_leadhatosag_statusz(pk) ON DELETE RESTRICT,
  CONSTRAINT munkalap_napi_munka_pk_fkey FOREIGN KEY (napi_munka_pk) REFERENCES napi_munka(pk) ON DELETE SET NULL,
  CONSTRAINT munkalap_rogzito_fkey FOREIGN KEY (operator_pk_rogzitette) REFERENCES alkalmazott(pk),
  CONSTRAINT munkalap_statusz_pk_fkey FOREIGN KEY (statusz_pk) REFERENCES munkalap_statusz(pk),
  CONSTRAINT munkalap_szamla_pk_fkey FOREIGN KEY (szamla_pk) REFERENCES szamla(pk),
  CONSTRAINT munkalap_ugyfel_pk_fkey FOREIGN KEY (ugyfel_pk) REFERENCES ugyfel(pk)
);

CREATE INDEX munkalap_cim_pk_index ON munkalap USING btree (cim_pk);
CREATE INDEX munkalap_ellenor_pk_index ON munkalap USING btree (ellenor_pk);
CREATE INDEX munkalap_napi_munka_pk_index ON munkalap USING hash (napi_munka_pk);
CREATE INDEX munkalap_statusz_pk_index ON munkalap USING hash (statusz_pk);
CREATE INDEX munkalap_szamla_pk_index ON munkalap USING btree (szamla_pk);
CREATE INDEX munkalap_teljesites_index ON munkalap USING hash (teljesites_datum);
CREATE INDEX munkalap_ugyfel_pk_index ON munkalap USING btree (ugyfel_pk);

(Funny thing is, while copy-pasting this script, I noticed that I had created hash indexes. That was never my intention; I always use B-tree. I must have accidentally clicked on "hash" when I was creating the indexes with the wizard. Shame on me. I did not understand why the error said unexpected _hash_ relation size, because I was sure there were no hash indexes. I know hash index use is discouraged. I'm going to check all my indexes and correct these mistakes if there are more.)
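In case it helps anyone else double-check their own schema: as far as I can tell, a catalog query along these lines lists every hash index in the database (pg_index, pg_class and pg_am are the standard system catalogs, so this should work on 8.2):

    SELECT t.relname AS table_name,
           c.relname AS index_name
    FROM pg_index i
    JOIN pg_class c ON c.oid = i.indexrelid   -- the index itself
    JOIN pg_class t ON t.oid = i.indrelid     -- the table it belongs to
    JOIN pg_am a    ON a.oid = c.relam        -- its access method
    WHERE a.amname = 'hash';

Each hit can then be rebuilt as B-tree with a plain drop-and-recreate, for example:

    DROP INDEX munkalap_statusz_pk_index;
    CREATE INDEX munkalap_statusz_pk_index ON munkalap USING btree (statusz_pk);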
Anyway, I can't put together a list of steps that reliably reproduces the error. Once I inserted 100,000 rows and it succeeded, yet another time I inserted fewer than a hundred (into an empty table) and got the error.

Thanks,
Csaba
Tom Lane writes:
> csaba <csaba818@freemail.hu> writes:
>> Here is the CREATE script of the table which is most likely to be one
>> source of the error. The last error happened after an insert into this
>> table.
>
> Hm, what can you say about the data that goes into the three columns
> that have hash indexes? I'm going to try testing this with random data,
> but it'd likely be more useful if I could reproduce your data
> distribution.

A row in table munkalap represents a job an employee has to do.

teljesites_datum: About 200-300 rows have the same date.

statusz_pk: A reference to another table which describes the different states a job can be in. There are 10 of them. Contains only numeric values. When a row is inserted, its value is always '000000100000'. Possible values are 0 through 9 in the last digit. In the long run most rows will have '000000100004' in this column (that is the final state).

napi_munka_pk: A reference to a table which contains data about grouping the jobs. Contains only numeric values. When a row is inserted it is always NULL. Groups are created at a later stage; then the munkalap table is updated. About 20-30 rows have the same value when it is not null, and there are always a few hundred rows with NULL.

A few sample rows:

INSERT INTO munkalap VALUES ('100000003176', '000000100000', '100000003170', NULL, '100000003171', '900000016024', NULL, 'M07I000005', '2007-03-29', NULL, NULL, B'0', 'igen', '100000003122', 'Nagy Pistike', '723-5467', 'a', '', '0', '2007-03-28 18:19:31.828', 4, '', NULL, NULL, NULL);
INSERT INTO munkalap VALUES ('100000003646', '000000100000', '100000003640', NULL, '100000003641', '900000027306', NULL, 'M07I000052', '2007-04-04', NULL, NULL, B'0', 'igen', '100000003123', 'Kis Moricka', '946-4861', 'a', '', '0', '2007-03-28 19:07:25.546', 2, '', NULL, NULL, NULL);
INSERT INTO munkalap VALUES ('100000003761', '000000100001', '100000002783', '100000005939', '100000003756', '900000161044', '100000007745', 'M07I000064', '2007-03-30', NULL, NULL, B'0', 'nem', '100000003122', 'Toth Geza', '864-1135', 'a', '', '1', '2007-03-28 19:16:04.187', 2, '', NULL, NULL, NULL);
INSERT INTO munkalap VALUES ('100000004314', '000000100001', '100000002637', '100000005914', '100000004309', '900000097696', '100000007740', 'M07I000127', '2007-03-30', NULL, NULL, B'0', 'nem', '100000003122', 'Rizi Rozi', '955-1919', 'a', '', '1', '2007-03-28 19:56:01', 1, '13', NULL, NULL, NULL);

Thank you,
Csaba
Csaba Kalman <csaba818@freemail.hu> writes:
> Tom Lane writes:
>> Hm, what can you say about the data that goes into the three columns
>> that have hash indexes? I'm going to try testing this with random data,
>> but it'd likely be more useful if I could reproduce your data
>> distribution.
> [ some info ]

Thanks, but I've been unable to reproduce the problem with either random data or data tailored to match yours. I'm a bit worried that the issue may only appear on Windows --- hash indexes currently assume that the filesystem will react in a particular way to a write that's well beyond the current EOF, and I wouldn't put it past Microsoft to get that wrong.

Can you put together a self-contained script that causes the error for you? It's okay if it only fails once every ten or even 100 tries. What we need is something that other people can run and say fairly definitively "it happens here" or "it doesn't".

			regards, tom lane
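PS: as a sketch of the kind of script I mean (all names here are made up, and the value distributions only approximate what you described), something along these lines, rerun from scratch each time, would do:

    -- standalone repro attempt: three hash-indexed columns, no FKs needed
    CREATE TABLE hash_repro (
      statusz    character(12),
      napi_munka character(12),
      teljesites date
    );
    CREATE INDEX hash_repro_statusz_idx ON hash_repro USING hash (statusz);
    CREATE INDEX hash_repro_napi_idx    ON hash_repro USING hash (napi_munka);
    CREATE INDEX hash_repro_telj_idx    ON hash_repro USING hash (teljesites);

    -- ~10 distinct states, groups of ~25 with some NULLs, ~250 rows per date
    INSERT INTO hash_repro
    SELECT lpad((100000 + g % 10)::text, 12, '0'),
           CASE WHEN g % 5 = 0 THEN NULL
                ELSE lpad((g / 25)::text, 12, '0') END,
           DATE '2007-03-01' + (g / 250)
    FROM generate_series(1, 100000) AS g;

If a script in that style fails for you even one time in a hundred, that's enough for others to test with.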