Re: [HACKERS] emergency outage requiring database restart - Mailing list pgsql-hackers
From | Merlin Moncure |
---|---|
Subject | Re: [HACKERS] emergency outage requiring database restart |
Date | |
Msg-id | CAHyXU0w43BkN9R3FTSatRwGR0Qjjuy8m909yGtKtnUncrxq98Q@mail.gmail.com Whole thread Raw |
In response to | Re: [HACKERS] emergency outage requiring database restart (Peter Eisentraut <peter.eisentraut@2ndquadrant.com>) |
List | pgsql-hackers |
On Tue, Jan 3, 2017 at 1:05 PM, Peter Eisentraut <peter.eisentraut@2ndquadrant.com> wrote: > On 11/7/16 5:31 PM, Merlin Moncure wrote: >> Regardless, it seems like you might be on to something, and I'm >> inclined to patch your change, test it, and roll it out to production. >> If it helps or at least narrows the problem down, we ought to give it >> consideration for inclusion (unless someone else can think of a good >> reason not to do that, heh!). > > Any results yet? Not yet unfortunately. I compiled the server with the change, but was not able get $libdir working so that I could just do a binary swap over my pgdg compiled package. If anyone has some pointers on how to do that, I'd appreciated it. Still getting checksum failures. Over the last 30 days, I see the following. Since enabling checksums FWICT none of the damage is permanent and rolls back with the transaction. So creepy! [root@rcdylsdbmpf001 pg_log]# cat *.log | grep "page verification failed" 2016-12-05 10:17:48 CST [postgres@castaging]: WARNING: page verification failed, calculated checksum 61797 but expected 61798 2016-12-05 11:15:31 CST [postgres@castaging]: WARNING: page verification failed, calculated checksum 37750 but expected 37749 2016-12-05 11:15:58 CST [postgres@castaging]: WARNING: page verification failed, calculated checksum 44483 but expected 44482 2016-12-05 11:16:33 CST [postgres@castaging]: WARNING: page verification failed, calculated checksum 58926 but expected 58925 2016-12-05 11:17:08 CST [postgres@castaging]: WARNING: page verification failed, calculated checksum 38527 but expected 38528 2016-12-05 11:18:34 CST [postgres@castaging]: WARNING: page verification failed, calculated checksum 61932 but expected 61933 2016-12-05 11:18:55 CST [postgres@castaging]: WARNING: page verification failed, calculated checksum 23757 but expected 23758 2016-12-05 12:13:48 CST [rms@mpf2]: WARNING: page verification failed, calculated checksum 44192 but expected 44225 at character 417 2016-12-08 14:18:37 CST [postgres@castaging]: WARNING: page verification failed, calculated checksum 36083 but expected 36082 2016-12-08 15:52:31 CST [postgres@castaging]: WARNING: page verification failed, calculated checksum 63414 but expected 63415 at character 1096 2016-12-09 09:12:21 CST [postgres@castaging]: WARNING: page verification failed, calculated checksum 25781 but expected 25780 2016-12-09 09:13:20 CST [postgres@castaging]: WARNING: page verification failed, calculated checksum 63043 but expected 63044 at character 4230 2016-12-12 08:57:45 CST [postgres@castaging]: WARNING: page verification failed, calculated checksum 31775 but expected 31771 2016-12-13 09:47:11 CST [postgres@castaging]: WARNING: page verification failed, calculated checksum 40802 but expected 40806 2016-12-15 12:49:04 CST [rms@mpf2]: WARNING: page verification failed, calculated checksum 11625 but expected 11592 at character 417 2016-12-15 12:51:08 CST [rms@mpf2]: WARNING: page verification failed, calculated checksum 51017 but expected 51018 2016-12-15 12:52:36 CST [rms@mpf2]: WARNING: page verification failed, calculated checksum 51017 but expected 51018 at character 417 2016-12-16 12:16:31 CST [rms@mpf2]: WARNING: page verification failed, calculated checksum 23580 but expected 23576 2016-12-20 13:59:33 CST [postgres@castaging]: WARNING: page verification failed, calculated checksum 45273 but expected 45285 2016-12-20 14:00:22 CST [postgres@castaging]: WARNING: page verification failed, calculated checksum 10524 but expected 10525 note second database 'mpf2'. This is a new development. Example of query that is jacking things is this: 2016-12-15 12:51:08 CST [rms@mpf2]: WARNING: page verification failed, calculated checksum 51017 but expected 51018 2016-12-15 12:51:08 CST [rms@mpf2]: CONTEXT: SQL statement " COPY ( SELECT 'DELETE FROM tblNAptCommonSample WHEREReportPeriod = 201612;' UNION ALL SELECT format( 'INSERT INTO tblNAptCommonSample(' 'ReportPeriod,Period, AdjustmentType, PlanType, MSA, MSASubMkt, ' 'Sample, Occupancy, OccupancyChange, AverageRent, AverageRentChange, RentSF, ' 'RentSFChange)' 'VALUES(' '%s, %s, ''%s'', ''%s'', ''%s'', %s,' '%s, %s, %s, %s,%s, %s,' '%s)', ReportPeriod, Period, AdjustmentType, PlanType, MSA, MSASubMkt, c(Sample), c(Occupancy), c(OccupancyChange),c(AverageRent), c(AverageRentChange), c(RentSF), c(RentSFChange)) FROM tblNAptCommonSample WHERE Period = 201612 AND MSA !='5610' UNION ALL SELECT 'go' ) TO '/tmp/tblnaptcommonsample.sql'; " PL/pgSQL function writempf1history(integer)line 75 at EXECUTE or this: 2016-12-15 12:52:36 CST [rms@mpf2]: WARNING: page verification failed, calculated checksum 51017 but expected 51018 at character 417 2016-12-15 12:52:36 CST [rms@mpf2]: QUERY: COPY ( SELECT 'DELETE FROM tbltwrexistingunits WHERE ReportPeriod =201612;' UNION ALL SELECT format( 'INSERT INTO tbltwrexistingunits(' 'ReportPeriod, market, submarketnum,yr_qtr, cmpltns, deletions, ' 'existing, unadjexisting)' 'VALUES(' '%s, ''%s'', %s, ''%s'', %s, %s,' '%s, %s)', ReportPeriod, market, submarket, yr_qtr, c(cmpltns), c(deletions), c(existing), c(unadjexisting)) FROM tbltwrexistingunits WHERE ReportPeriod = 201612 AND market != '5610' UNION ALL SELECT'go' ) TO '/tmp/tbltwrexistingunits.sql'; 2016-12-15 12:52:36 CST [rms@mpf2]: CONTEXT: PL/pgSQL function writempf1history(integer) line 109 at EXECUTE This is another (much simpler) routine that: 1. writes out data to scratch file with COPY 2. uses pl/sh to pipe to sqsh merlin
pgsql-hackers by date: