Thread: Test "tablespace" fails during `make installcheck` on master-replica setup
Test "tablespace" fails during `make installcheck` on master-replica setup
From
Aleksander Alekseev
Date:
Hello.

I noticed that `make installcheck` fails on my laptop with the following
errors:

http://afiskon.ru/s/98/6f94ce2cfa_regression.out.txt
http://afiskon.ru/s/b3/d0da05597e_regression.diffs.txt

My first idea was to use `git bisect`. It turned out that this issue
reproduces on commits dating back to 2015 as well (older versions don't
compile on my laptop). However, it reproduces rarely and with different
errors:

http://afiskon.ru/s/8e/1ad2c8ed2b_regression.diffs-8c48375.txt

Here are the scripts I use to compile and test PostgreSQL:

https://github.com/afiskon/pgscripts

The exact steps to reproduce are:

```
./quick-build.sh && ./install.sh && make installcheck
```

Completely removing all `configure` flags doesn't make any difference.
The issue reproduces only on a master-replica setup, i.e. if you run
./single-install.sh instead of install.sh, all tests pass.

I'm using Arch Linux and GCC 6.2.1. Any ideas what can cause this issue?

--
Best regards,
Aleksander Alekseev
Re: Test "tablespace" fails during `make installcheck` on master-replica setup
From
Michael Paquier
Date:
On Wed, Dec 07, 2016 at 03:18:59PM +0300, Aleksander Alekseev wrote:
> I noticed that `make installcheck` fails on my laptop with the following
> errors:
>
> http://afiskon.ru/s/98/6f94ce2cfa_regression.out.txt
> http://afiskon.ru/s/b3/d0da05597e_regression.diffs.txt

The interesting bit for the archives:

*** /home/eax/work/postgrespro/postgresql-src/src/test/regress/expected/tablespace.out 2016-12-07 13:53:44.000728436 +0300
--- /home/eax/work/postgrespro/postgresql-src/src/test/regress/results/tablespace.out 2016-12-07 13:53:46.150728558 +0300
***************
*** 66,71 ****
--- 66,72 ----
  INSERT INTO testschema.test_default_tab VALUES (1);
  CREATE INDEX test_index1 on testschema.test_default_tab (id);
  CREATE INDEX test_index2 on testschema.test_default_tab (id) TABLESPACE regress_tblspace;
+ ERROR: could not open file "pg_tblspc/16395/PG_10_201612061/16393/16407": No such file or directory
  \d testschema.test_index1

> Any ideas what can cause this issue?

On the same host, primary and standby will try to use the tablespace
in the same path. That's the origin of this breakage.
--
Michael
Re: Test "tablespace" fails during `make installcheck` on master-replica setup
From
Aleksander Alekseev
Date:
> On the same host, primary and standby will try to use the tablespace
> in the same path. That's the origin of this breakage.

Sorry, I don't follow. Don't master and replica use different
directories to store _all_ data? Particularly in my case:

```
$ find path/to/postgresql-install/ -type d -name pg_tblspc
/home/eax/work/postgrespro/postgresql-install/data-slave/pg_tblspc
/home/eax/work/postgrespro/postgresql-install/data-master/pg_tblspc
```

Where exactly does the collision happen?

On Wed, Dec 07, 2016 at 09:39:20PM +0900, Michael Paquier wrote:
> On Wed, Dec 07, 2016 at 03:18:59PM +0300, Aleksander Alekseev wrote:
> > I noticed that `make installcheck` fails on my laptop with the following
> > errors:
> >
> > http://afiskon.ru/s/98/6f94ce2cfa_regression.out.txt
> > http://afiskon.ru/s/b3/d0da05597e_regression.diffs.txt
>
> The interesting bit for the archives:
>
> *** /home/eax/work/postgrespro/postgresql-src/src/test/regress/expected/tablespace.out 2016-12-07 13:53:44.000728436+0300
> --- /home/eax/work/postgrespro/postgresql-src/src/test/regress/results/tablespace.out 2016-12-07 13:53:46.150728558+0300
> ***************
> *** 66,71 ****
> --- 66,72 ----
>   INSERT INTO testschema.test_default_tab VALUES (1);
>   CREATE INDEX test_index1 on testschema.test_default_tab (id);
>   CREATE INDEX test_index2 on testschema.test_default_tab (id) TABLESPACE regress_tblspace;
> + ERROR: could not open file "pg_tblspc/16395/PG_10_201612061/16393/16407": No such file or directory
>   \d testschema.test_index1
>
> > Any ideas what can cause this issue?
>
> On the same host, primary and standby will try to use the tablespace
> in the same path. That's the origin of this breakage.
> --
> Michael

--
Best regards,
Aleksander Alekseev
Re: Test "tablespace" fails during `make installcheck` on master-replica setup
From
Michael Paquier
Date:
On Wed, Dec 07, 2016 at 03:42:53PM +0300, Aleksander Alekseev wrote:
> > On the same host, primary and standby will try to use the tablespace
> > in the same path. That's the origin of this breakage.
>
> Sorry, I don't follow. Don't master and replica use different
> directories to store _all_ data? Particularly in my case:
>
> ```
> $ find path/to/postgresql-install/ -type d -name pg_tblspc
> /home/eax/work/postgrespro/postgresql-install/data-slave/pg_tblspc
> /home/eax/work/postgrespro/postgresql-install/data-master/pg_tblspc
> ```
>
> Where exactly does the collision happen?

At the location of the tablespaces: pg_tblspc just stores symlinks to
the place where data is stored, and both point to the same path, the same
path being streamed to the standby when replaying the create tablespace
record.
--
Michael
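A minimal sketch of how one might observe the collision described above; it is not part of the original thread. The Python script below resolves every symlink under `pg_tblspc` in each data directory and reports any target shared by more than one cluster. The helper names (`tablespace_targets`, `find_collisions`, `demo`) and the throwaway directory layout are assumptions made for illustration; only the `data-master`/`data-slave` names and the OID 16395 are taken from the messages above.

```python
import os
import tempfile
from collections import defaultdict


def tablespace_targets(pgdata):
    """Map each symlink in <pgdata>/pg_tblspc to the real path it resolves to."""
    tblspc_dir = os.path.join(pgdata, "pg_tblspc")
    targets = {}
    for name in sorted(os.listdir(tblspc_dir)):
        link = os.path.join(tblspc_dir, name)
        if os.path.islink(link):
            targets[name] = os.path.realpath(link)
    return targets


def find_collisions(*pgdata_dirs):
    """Return {target: [data directories]} for targets used by more than one cluster."""
    users = defaultdict(list)
    for pgdata in pgdata_dirs:
        # A set, so a cluster with two links to one target is counted once.
        for target in set(tablespace_targets(pgdata).values()):
            users[target].append(pgdata)
    return {target: dirs for target, dirs in users.items() if len(dirs) > 1}


def demo():
    """Build a throwaway layout that mimics a primary and a standby on one host."""
    root = os.path.realpath(tempfile.mkdtemp())
    shared = os.path.join(root, "regress_tblspace")  # hypothetical location
    os.mkdir(shared)
    clusters = []
    for name in ("data-master", "data-slave"):
        pgdata = os.path.join(root, name)
        os.makedirs(os.path.join(pgdata, "pg_tblspc"))
        # 16395 is the tablespace OID that appears in the error message.
        os.symlink(shared, os.path.join(pgdata, "pg_tblspc", "16395"))
        clusters.append(pgdata)
    return shared, clusters, find_collisions(*clusters)


if __name__ == "__main__":
    shared, clusters, collisions = demo()
    for target, dirs in collisions.items():
        print(f"{target} is used by: {', '.join(dirs)}")
```

Against a real installation, one would call `find_collisions` with the two actual data directories instead of `demo()`; the two `pg_tblspc` directories differ, but the symlinks inside them resolve to the same place.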
Re: Test "tablespace" fails during `make installcheck` on master-replica setup
From
Stephen Frost
Date:
Michael, all,

* Michael Paquier (michael.paquier@gmail.com) wrote:
> On Wed, Dec 07, 2016 at 03:42:53PM +0300, Aleksander Alekseev wrote:
> > > On the same host, primary and standby will try to use the tablespace
> > > in the same path. That's the origin of this breakage.
> >
> > Sorry, I don't follow. Don't master and replica use different
> > directories to store _all_ data? Particularly in my case:
> >
> > ```
> > $ find path/to/postgresql-install/ -type d -name pg_tblspc
> > /home/eax/work/postgrespro/postgresql-install/data-slave/pg_tblspc
> > /home/eax/work/postgrespro/postgresql-install/data-master/pg_tblspc
> > ```
> >
> > Where exactly does the collision happen?
>
> At the location of the tablespaces: pg_tblspc just stores symlinks to
> the place where data is stored, and both point to the same path, the same
> path being streamed to the standby when replaying the create tablespace
> record.

It would be really nice if we would detect that some other postmaster is
already using a given tablespace directory and throw an error and
complain rather than starting up thinking everything is fine. We do that
already for $PGDATA, of course, but not tablespaces.

Thanks!

Stephen
Re: Test "tablespace" fails during `make installcheck` on master-replica setup
From
Tom Lane
Date:
Stephen Frost <sfrost@snowman.net> writes:
> It would be really nice if we would detect that some other postmaster is
> already using a given tablespace directory and throw an error and
> complain rather than starting up thinking everything is fine.

In principle, we could have the postmaster run through $PGDATA/pg_tblspc
and drop a lockfile into each referenced directory. But the devil is in
the details --- in particular, not sure how to get the right thing to
happen during a CREATE TABLESPACE. Also, I kinda doubt that this is going
to fix anything for the replica-on-same-machine problem.

regards, tom lane
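The lockfile idea can be sketched roughly as follows. This is an illustration only, not PostgreSQL code and not something proposed in the thread in this form: the lockfile name `tablespace.lock`, the function names, and the PIDs are all hypothetical. The sketch walks `pg_tblspc`, creates a lockfile in each referenced directory with an exclusive create, and refuses to continue if another owner got there first. It ignores stale locks left by a crash and the CREATE TABLESPACE case raised above.

```python
import os
import tempfile

LOCKFILE_NAME = "tablespace.lock"  # hypothetical; PostgreSQL has no such file


def lock_tablespaces(pgdata, owner_pid):
    """Claim every directory referenced from <pgdata>/pg_tblspc.

    Returns the lockfiles created. Raises RuntimeError if a directory is
    already claimed; locks taken so far are released in that case.
    """
    tblspc_dir = os.path.join(pgdata, "pg_tblspc")
    created = []
    try:
        for name in sorted(os.listdir(tblspc_dir)):
            target = os.path.realpath(os.path.join(tblspc_dir, name))
            lock_path = os.path.join(target, LOCKFILE_NAME)
            try:
                # O_EXCL makes the create fail if the file already exists.
                fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY, 0o600)
            except FileExistsError:
                with open(lock_path) as f:
                    holder = f.read().strip()
                raise RuntimeError(
                    f"tablespace directory {target} is already in use by {holder}"
                ) from None
            with os.fdopen(fd, "w") as f:
                f.write(f"{owner_pid} {pgdata}\n")
            created.append(lock_path)
    except Exception:
        for path in created:
            os.remove(path)
        raise
    return created


def demo():
    """Two clusters on one host whose pg_tblspc links point to the same directory."""
    root = os.path.realpath(tempfile.mkdtemp())
    shared = os.path.join(root, "regress_tblspace")  # hypothetical location
    os.mkdir(shared)
    for name in ("data-master", "data-slave"):
        os.makedirs(os.path.join(root, name, "pg_tblspc"))
        os.symlink(shared, os.path.join(root, name, "pg_tblspc", "16395"))
    lock_tablespaces(os.path.join(root, "data-master"), owner_pid=1001)
    try:
        lock_tablespaces(os.path.join(root, "data-slave"), owner_pid=1002)
    except RuntimeError as err:
        return str(err)
    return None


if __name__ == "__main__":
    print(demo())
```

In the demo the second cluster is rejected, which matches the detection being asked for. As noted above, detection alone would not make a replica on the same machine work; it would only turn a confusing failure into an explicit one.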
Re: Test "tablespace" fails during `make installcheck` on master-replica setup
From
Michael Paquier
Date:
On Thu, Dec 8, 2016 at 12:06 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Stephen Frost <sfrost@snowman.net> writes:
>> It would be really nice if we would detect that some other postmaster is
>> already using a given tablespace directory and throw an error and
>> complain rather than starting up thinking everything is fine.
>
> In principle, we could have the postmaster run through $PGDATA/pg_tblspc
> and drop a lockfile into each referenced directory. But the devil is in
> the details --- in particular, not sure how to get the right thing to
> happen during a CREATE TABLESPACE. Also, I kinda doubt that this is going
> to fix anything for the replica-on-same-machine problem.

That's where having a node-based ID would become helpful, which is
different from the global system ID. Ages ago when working on
Postgres-XC, we took care of this problem by appending a suffix based on
the node name to the tablespace folder name, the one prefixed with PGXX.
When applying this concept to PG, we could have standbys set up this
node ID each time recovery is done using a backup_label. This won't solve
the problem of tablespaces already created; that should be handled by
users when taking the base backup, by remapping them. But it would address
the problem for newly-created ones.
--
Michael
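To illustrate the node-suffix idea, here is a small sketch under stated assumptions; it reflects neither PostgreSQL nor the actual Postgres-XC naming scheme. The base name `PG_10_201612061` and the OIDs 16393 and 16407 come from the error message earlier in the thread. The underscore-separated suffix, the node names, the tablespace location, and both function names are hypothetical.

```python
import os


def version_directory(major_version, catalog_version, node_name=None):
    """Build the per-version subdirectory name inside a tablespace location.

    Without a node name this gives the PG_10_201612061 form seen in the
    thread; with one, a node suffix is appended (assumed format).
    """
    base = f"PG_{major_version}_{catalog_version}"
    if node_name is None:
        return base
    return f"{base}_{node_name}"


def relation_path(location, db_oid, relfilenode, node_name=None,
                  major_version=10, catalog_version=201612061):
    """Path of a relation file under a tablespace location."""
    subdir = version_directory(major_version, catalog_version, node_name)
    return os.path.join(location, subdir, str(db_oid), str(relfilenode))


if __name__ == "__main__":
    location = "/path/to/regress_tblspace"  # hypothetical shared location
    for node in ("primary", "standby1"):    # hypothetical node names
        print(relation_path(location, 16393, 16407, node_name=node))
```

With distinct suffixes, two nodes sharing one tablespace location would write under different subdirectories and stop touching each other's files. For tablespaces that already exist, the remapping mentioned above is available today through the `--tablespace-mapping=OLDDIR=NEWDIR` option of `pg_basebackup`.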