RE: [HACKERS] Recovery on incomplete write - Mailing list pgsql-hackers
From | Hiroshi Inoue |
---|---|
Subject | RE: [HACKERS] Recovery on incomplete write |
Date | |
Msg-id | 000701bf0f13$9c0790c0$2801007e@cadzone.tpf.co.jp |
List | pgsql-hackers |
>
> > -----Original Message-----
> > From: Bruce Momjian [mailto:maillist@candle.pha.pa.us]
> > Sent: Tuesday, September 28, 1999 11:54 PM
> > To: Tom Lane
> > Cc: Hiroshi Inoue; pgsql-hackers
> > Subject: Re: [HACKERS] Recovery on incomplete write
> >
> >
> > > "Hiroshi Inoue" <Inoue@tpf.co.jp> writes:
> > > > I have wondered that md.c handles incomplete block(page)s
> > > > correctly.
> > > > Am I mistaken ?
> > >
> > > I think you are right, and there may be some other trouble spots in that
> > > file too. I remember thinking that the code depended heavily on never
> > > having a partial block at the end of the file.
> > >
> > > But is it worth fixing? The only way I can see for the file length
> > > to become funny is if we run out of disk space part way through writing
> > > a page, which seems unlikely...
> > >
> >
> > That is how he got started, the TODO item about running out of disk
> > space causing corrupted databases. I think it needs a fix, if we can.
> >
>
> Maybe it isn't so difficult to fix.
> I would provide a patch.
>

Here is a patch.

1) mdnblocks() ignores a partial block at the end of relation files.
2) mdread() ignores a partial block of block number 0.
3) mdextend() adjusts its position to a multiple of BLCKSZ before writing.
4) mdextend() truncates extra bytes in case of incomplete write.

If there's no objection, I would commit this change to the current tree.

Regards.

Hiroshi Inoue
Inoue@tpf.co.jp

*** storage/smgr/md.c.orig Thu Sep 30 10:50:58 1999
--- storage/smgr/md.c Tue Oct 5 13:30:55 1999
***************
*** 233,239 ****
  int
  mdextend(Relation reln, char *buffer)
  {
!     long pos;
      int nblocks;
      MdfdVec *v;

--- 233,239 ----
  int
  mdextend(Relation reln, char *buffer)
  {
!     long pos, nbytes;
      int nblocks;
      MdfdVec *v;

***************
*** 243,250 ****
      if ((pos = FileSeek(v->mdfd_vfd, 0L, SEEK_END)) < 0)
          return SM_FAIL;

!     if (FileWrite(v->mdfd_vfd, buffer, BLCKSZ) != BLCKSZ)
          return SM_FAIL;

      /* remember that we did a write, so we can sync at xact commit */
      v->mdfd_flags |= MDFD_DIRTY;
--- 243,264 ----
      if ((pos = FileSeek(v->mdfd_vfd, 0L, SEEK_END)) < 0)
          return SM_FAIL;

!     if (pos % BLCKSZ != 0) /* the last block is incomplete */
!     {
!         pos = BLCKSZ * (long)(pos / BLCKSZ);
!         if (FileSeek(v->mdfd_vfd, pos, SEEK_SET) < 0)
!             return SM_FAIL;
!     }
!
!     if ((nbytes = FileWrite(v->mdfd_vfd, buffer, BLCKSZ)) != BLCKSZ)
!     {
!         if (nbytes > 0)
!         {
!             FileTruncate(v->mdfd_vfd, pos);
!             FileSeek(v->mdfd_vfd, pos, SEEK_SET);
!         }
          return SM_FAIL;
+     }

      /* remember that we did a write, so we can sync at xact commit */
      v->mdfd_flags |= MDFD_DIRTY;
***************
*** 432,437 ****
--- 446,453 ----
      {
          if (nbytes == 0)
              MemSet(buffer, 0, BLCKSZ);
+         else if (blocknum == 0 && nbytes > 0 && mdnblocks(reln) == 0)
+             MemSet(buffer, 0, BLCKSZ);
          else
              status = SM_FAIL;
      }
***************
*** 1067,1072 ****
  {
      long len;

!     len = FileSeek(file, 0L, SEEK_END) - 1;
!     return (BlockNumber) ((len < 0) ? 0 : 1 + len / blcksz);
  }
--- 1083,1088 ----
  {
      long len;

!     len = FileSeek(file, 0L, SEEK_END);
!     return (BlockNumber) (len / blcksz);
  }
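
For readers who want to try the idea outside the backend, items 1, 3 and 4 of the patch can be approximated with plain POSIX calls. This is only a minimal sketch under stated assumptions, not the md.c code: sketch_nblocks(), sketch_extend() and SKETCH_BLCKSZ are hypothetical names, and lseek(), write() and ftruncate() stand in for FileSeek(), FileWrite() and FileTruncate().

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical stand-in for BLCKSZ; the real value is a build-time setting. */
#define SKETCH_BLCKSZ 8192

/*
 * Count only complete blocks, so a partial block left at the end of the
 * file by an incomplete write is ignored (item 1 of the patch).
 * Returns -1 if the seek fails.
 */
static long
sketch_nblocks(int fd)
{
    off_t len = lseek(fd, 0, SEEK_END);

    if (len < 0)
        return -1;
    return (long) (len / SKETCH_BLCKSZ);
}

/*
 * Append one block. Returns 0 on success, -1 on failure.
 * Assumes buffer points to at least SKETCH_BLCKSZ bytes.
 */
static int
sketch_extend(int fd, const char *buffer)
{
    off_t   pos;
    ssize_t nbytes;

    assert(buffer != NULL);

    pos = lseek(fd, 0, SEEK_END);
    if (pos < 0)
        return -1;

    /* Item 3: if the last block is incomplete, back up to its start. */
    if (pos % SKETCH_BLCKSZ != 0)
    {
        pos -= pos % SKETCH_BLCKSZ;
        if (lseek(fd, pos, SEEK_SET) < 0)
            return -1;
    }

    nbytes = write(fd, buffer, SKETCH_BLCKSZ);
    if (nbytes != SKETCH_BLCKSZ)
    {
        /* Item 4: drop whatever part of the block did get written. */
        if (nbytes > 0)
        {
            if (ftruncate(fd, pos) != 0)
                return -1;
            (void) lseek(fd, pos, SEEK_SET);
        }
        return -1;
    }
    return 0;
}
```

With these helpers, a file that holds 100 stray bytes reports zero blocks, and the next sketch_extend() overwrites those bytes so the file ends up exactly one block long. Unlike the patch, the sketch returns early if the truncate itself fails; that is a choice made here for the illustration.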