On 05.05.2016 7:16, Amit Kapila wrote:
> On Wed, May 4, 2016 at 8:03 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> >
> > Amit Kapila <amit.kapila16@gmail.com> writes:
> > > On Wed, May 4, 2016 at 4:02 PM, Alex Ignatov <a.ignatov@postgrespro.ru> wrote:
> > >> On 03.05.2016 2:17, Tom Lane wrote:
> > >>> Writing a single sector ought to be atomic too.
> >
> > >> pg_control is 8k long (I think it is the length of one page in default PG
> > >> compile settings).
> >
> > > The actual data written is always sizeof(ControlFileData), which should be
> > > less than one sector.
> >
> > Yes. We don't care what happens to the rest of the file as long as the
> > first sector's worth is updated atomically. See the comments for
> > PG_CONTROL_SIZE and the code in ReadControlFile/WriteControlFile.
> >
> > We could change to a different PG_CONTROL_SIZE pretty easily, and there's
> > certainly room to argue that reducing it to 512 or 1024 would be more
> > efficient. I think the motivation for setting it at 8K was basically
> > "we're already assuming that 8K writes are efficient, so let's assume
> > it here too". But since the file is only written once per checkpoint,
> > efficiency is not really a key selling point anyway. If you could make
> > an argument that some other size would reduce the risk of failures,
> > it would be interesting --- but I suspect any such argument would be
> > very dependent on the quirks of a specific file system.
> >
>
> How about using 512 bytes as the write size and performing direct writes
> rather than going via the OS buffer cache for the control file? Alex, is the
> issue reproducible (to ensure that if we try to solve it in some way, we
> have a way to test it as well)?
>
> >
> > One point worth considering is that on most file systems, rewriting
> > a fraction of a page is *less* efficient than rewriting a full page,
> > because the kernel first has to read in the old contents to fill
> > the disk buffer it's going to partially overwrite with new data.
> > This motivates against trying to reduce the write size too much.
> >
>
> Yes, you are very much right and I have observed that recently during my
> work on WAL Re-Writes [1]. However, I think that won't be the issue if
> we use direct writes for control file.
>
>
> [1] -
> http://www.postgresql.org/message-id/CAA4eK1+=O33dZZ=jBtjXBFyD67R5dLcqFyOMj4f-qmFXBP1OOQ@mail.gmail.com
>
> With Regards,
> Amit Kapila.
> EnterpriseDB: http://www.enterprisedb.com
Hi!
No, the issue happened only once. Also, all attempts to reproduce it have
been unsuccessful so far.
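
For reference, the layout discussed in the quoted text (a small record that
must fit in the first sector, followed by padding nobody cares about) can be
sketched roughly as below. This is only an illustration in Python, not the
PostgreSQL code: SECTOR_SIZE = 512 is an assumption, the function names are
hypothetical, and zlib.crc32 merely stands in for whatever checksum the real
ReadControlFile/WriteControlFile use.

```python
import struct
import zlib

SECTOR_SIZE = 512        # assumption: smallest unit the disk writes atomically
PG_CONTROL_SIZE = 8192   # the 8K file size mentioned in the thread


def build_control_image(payload: bytes) -> bytes:
    """Hypothetical helper: payload + CRC in the first sector, zeros after."""
    record = payload + struct.pack("<I", zlib.crc32(payload))
    if len(record) > SECTOR_SIZE:
        # If the record spilled into a second sector, a torn write could
        # leave half-old, half-new data behind.
        raise ValueError("record must fit in one sector")
    return record + b"\0" * (PG_CONTROL_SIZE - len(record))


def read_control_image(image: bytes, payload_len: int) -> bytes:
    """Hypothetical helper: validate the CRC and ignore the padding."""
    payload = image[:payload_len]
    (stored_crc,) = struct.unpack_from("<I", image, payload_len)
    if zlib.crc32(payload) != stored_crc:
        raise ValueError("checksum mismatch in first sector")
    return payload
```

With this layout, garbage in the padding area does not affect the read, while
any damage inside the record itself is caught by the checksum.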