Re: WIP: Avoid creation of the free space map for small tables - Mailing list pgsql-hackers

From Masahiko Sawada
Subject Re: WIP: Avoid creation of the free space map for small tables
Date
Msg-id CAD21AoBWxiGSStZGWkaBMcK4bPvekcoHMaTU661piEVwiJYTzw@mail.gmail.com
Whole thread Raw
In response to Re: WIP: Avoid creation of the free space map for small tables  (Amit Kapila <amit.kapila16@gmail.com>)
Responses Re: WIP: Avoid creation of the free space map for small tables  (Amit Kapila <amit.kapila16@gmail.com>)
List pgsql-hackers
On Tue, Jan 29, 2019 at 9:29 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Tue, Jan 29, 2019 at 12:37 AM John Naylor
> <john.naylor@2ndquadrant.com> wrote:
> >
> > On Mon, Jan 28, 2019 at 12:10 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > > 2.
> > > @@ -15,13 +15,9 @@
> > >  SELECT octet_length(get_raw_page('test_rel_forks', 'main', 100)) AS main_100;
> > >  ERROR:  block number 100 is out of range for relation "test_rel_forks"
> > >  SELECT octet_length(get_raw_page('test_rel_forks', 'fsm', 0)) AS fsm_0;
> > > - fsm_0
> > > --------
> > > -  8192
> > > -(1 row)
> > > -
> > > +ERROR:  could not open file "base/50769/50798_fsm": No such file or directory
> > >  SELECT octet_length(get_raw_page('test_rel_forks', 'fsm', 10)) AS fsm_10;
> > > -ERROR:  block number 10 is out of range for relation "test_rel_forks"
> > > +ERROR:  could not open file "base/50769/50798_fsm": No such file or directory
> > >
> > > This indicates that even though the Vacuum is executed, but the FSM
> > > doesn't get created.  This could be due to different BLCKSZ, but the
> > > failed machines don't seem to have a non-default value of it.  I am
> > > not sure why this could happen, maybe we need to check once in the
> > > failed regression database to see the size of relation?
> >
> > I'm also having a hard time imagining why this failed. Just in case,
> > we could return ctid in a plpgsql loop and stop as soon as we see the
> > 5th block. I've done that for some tests during development and is a
> > safer method anyway.
> >
>
> I think we can devise some concrete way, but it is better first we try
> to understand why it failed, otherwise there is always a chance that
> we will repeat the mistake in some other case.  I think we have no
> other choice, but to request the buildfarm owners to either give us
> the access to see what happens or help us in investigating the
> problem. The four buildfarms where it failed were lapwing, locust,
> dromedary, prairiedog.   Among these, the owner of last two is Tom
> Lane and others I don't recognize.  Tom, Andrew, can you help us in
> getting the access of one of those four?  Yet another alternative is
> the owner can apply the patch attached (this is same what got
> committed) or reset to commit ac88d2962a and execute below statements
> and share the results:
>
> CREATE EXTENSION pageinspect;
>
> CREATE TABLE test_rel_forks (a int);
> INSERT INTO test_rel_forks SELECT i from generate_series(1,1000) i;
> VACUUM test_rel_forks;
> SELECT octet_length(get_raw_page('test_rel_forks', 'main', 0)) AS main_0;
> SELECT octet_length(get_raw_page('test_rel_forks', 'main', 100)) AS main_100;
>
> SELECT octet_length(get_raw_page('test_rel_forks', 'fsm', 0)) AS fsm_0;
> SELECT octet_length(get_raw_page('test_rel_forks', 'fsm', 10)) AS fsm_10;
>
> SELECT octet_length(get_raw_page('test_rel_forks', 'vm', 0)) AS vm_0;
> SELECT octet_length(get_raw_page('test_rel_forks', 'vm', 1)) AS vm_1;
>
> If the above statements give error: "ERROR:  could not open file ...", then run:
> Analyze test_rel_forks;
> Select oid, relname, relpages, reltuples from pg_class where relname
> like 'test%';
>
> The result of the above tests will tell us whether there are 5 pages
> in the table or not.  If the table contains 5 pages and throws an
> error, then there is some bug in our code, otherwise, there is
> something specific to those systems where the above insert doesn't
> result in 5 pages.

I'd suspect the alignment of integer. In my environemnt, the tuple
actual size is 28 bytes but the aligned size is 32 bytes (=
MAXALIGN(28)), so we can store 226 tuples to single page. But if
MAXALIGN(28) = 28 then we can store 255 tuples and 1000 tuples fits
within 4 pages. The MAXALIGN of four buildfarms seem 4 accroding to
the configure script so MAXALIGN(28) might be 28 on these buildfarms.

configure:16816: checking alignment of short
configure:16839: result: 2
configure:16851: checking alignment of int
configure:16874: result: 4
configure:16886: checking alignment of long
configure:16909: result: 4
configure:16922: checking alignment of long long int
configure:16945: result: 4

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center


pgsql-hackers by date:

Previous
From: Kyotaro HORIGUCHI
Date:
Subject: ALTER SESSION
Next
From: Evgeniy Efimkin
Date:
Subject: Re: [WIP] CREATE SUBSCRIPTION with FOR TABLES clause (table filter)