Thread: Fail to create PK or index for large table in Windows

Fail to create PK or index for large table in Windows

From
Pavel
Date:
PostgreSQL 11.0, 11.1
OS: Windows 7 x64
RAM: 16GB

Database location was created with initdb.
Scheme sc.

Table contains 600 000 000 rows with structure:
table sc.address
(
  id_address integer not null,
  base varchar not null,
  raw bytea not null,
  key bytea,
  id_key integer
)

Server start command:
pg_ctl -D g:\PostgreSQL11\data -l g:\PostgreSQL11\log\log.txt -o "-p 5426" start

Bug:
>psql -d dbname -p 5426
psql (11.1)
dbname =# \set VERBOSITY verbose
dbname =# alter table sc.address add primary key (id_address);
ERROR:  58P01: could not determine size of temporary file "0"
LOCATION:  ltsConcatWorkerTapes,
d:\pginstaller.auto\postgres.windows-x64\src\backend\utils\sort\logtape.c:439
dbname=#

Log file (command "alter..." requested at 2018-11-12 20:21:37):
2018-11-12 20:21:37.640 MSK [6848] ERROR:  canceling autovacuum task
2018-11-12 20:21:37.640 MSK [6848] CONTEXT:  automatic vacuum of table
"dbname.sc.address"
2018-11-12 21:31:05.182 MSK [6672] ERROR:  could not determine size of
temporary file "0"
2018-11-12 21:31:05.182 MSK [6672] STATEMENT:  alter table sc.address
add primary key (id_address);
2018-11-12 21:31:05.889 MSK [7008] LOG:  could not rmdir directory
"base/pgsql_tmp/pgsql_tmp6672.0.sharedfileset": Directory not empty

After error the directory
"base/pgsql_tmp/pgsql_tmp6672.0.sharedfileset" is empty.

In the PostgreSQL 10 with the same database (copied via pg_dump and
psql) there is no problems - index/key created successfully.

It looks like this bug is disallow to index any large tables under Windows.


Best regards,
Pavel Oskin


Re: Fail to create PK or index for large table in Windows

From
Thomas Munro
Date:
On Tue, Nov 13, 2018 at 10:22 AM Pavel <ospavelmail@gmail.com> wrote:
> PostgreSQL 11.0, 11.1
> OS: Windows 7 x64
> RAM: 16GB

Thanks for the report.  This is the same issue as reported in bug #15460:

https://www.postgresql.org/message-id/flat/15460-b6db80de822fa0ad%40postgresql.org

We are still trying to figure out what causes this, and there have
been no similar reports on Unix.  For a workaround, you can disable
parallel index building with SET max_parallel_maintenance_workers = 0.

> 2018-11-12 21:31:05.182 MSK [6672] ERROR:  could not determine size of
> temporary file "0"
> 2018-11-12 21:31:05.182 MSK [6672] STATEMENT:  alter table sc.address
> add primary key (id_address);

This is the thing we haven't understood yet.

> 2018-11-12 21:31:05.889 MSK [7008] LOG:  could not rmdir directory
> "base/pgsql_tmp/pgsql_tmp6672.0.sharedfileset": Directory not empty
>
> After error the directory
> "base/pgsql_tmp/pgsql_tmp6672.0.sharedfileset" is empty.

For the later "could not rmdir directory" error, I am fairly sure I
understand what's happening there (see the other bug thread) but
that's only a LOG message and an empty directory left behind after an
error is raised, it's not the root cause.

-- 
Thomas Munro
http://www.enterprisedb.com


Re: Fail to create PK or index for large table in Windows

From
Pavel
Date:
Thank you for your work!

You can log contents of directory before rmdir.
If it empty then the reason should be in multiworkers: looks like
Windows disallow to delete directory if it (or files in it) used by
somebody else (like other workers) and maybe some workers didn't
release it before rmdir. Linux allows as I know.

Best regards,
Pavel Oskin

вт, 13 нояб. 2018 г. в 12:01, Thomas Munro:
>
> On Tue, Nov 13, 2018 at 10:22 AM Pavel wrote:
> > PostgreSQL 11.0, 11.1
> > OS: Windows 7 x64
> > RAM: 16GB
>
> Thanks for the report.  This is the same issue as reported in bug #15460:
>
> https://www.postgresql.org/message-id/flat/15460-b6db80de822fa0ad%40postgresql.org
>
> We are still trying to figure out what causes this, and there have
> been no similar reports on Unix.  For a workaround, you can disable
> parallel index building with SET max_parallel_maintenance_workers = 0.
>
> > 2018-11-12 21:31:05.182 MSK [6672] ERROR:  could not determine size of
> > temporary file "0"
> > 2018-11-12 21:31:05.182 MSK [6672] STATEMENT:  alter table sc.address
> > add primary key (id_address);
>
> This is the thing we haven't understood yet.
>
> > 2018-11-12 21:31:05.889 MSK [7008] LOG:  could not rmdir directory
> > "base/pgsql_tmp/pgsql_tmp6672.0.sharedfileset": Directory not empty
> >
> > After error the directory
> > "base/pgsql_tmp/pgsql_tmp6672.0.sharedfileset" is empty.
>
> For the later "could not rmdir directory" error, I am fairly sure I
> understand what's happening there (see the other bug thread) but
> that's only a LOG message and an empty directory left behind after an
> error is raised, it's not the root cause.
>
> --
> Thomas Munro
> http://www.enterprisedb.com


Re: Fail to create PK or index for large table in Windows

From
Thomas Munro
Date:
On Tue, Nov 13, 2018 at 10:22 PM Pavel <ospavelmail@gmail.com> wrote:
> You can log contents of directory before rmdir.
> If it empty then the reason should be in multiworkers: looks like
> Windows disallow to delete directory if it (or files in it) used by
> somebody else (like other workers) and maybe some workers didn't
> release it before rmdir. Linux allows as I know.

Yeah, its "unlink" operation seems to be an incomplete illusion.  The
fix for that is simply to reorder some cleanup operations in our error
path.

Oh, I think I just understood the root bug.  It looks like a stupid
"large file" problem, 32 bit off_t being overflowed.  I'll keep the
discussion off that over on the other thread.  I posted a patch on
that thread; if you're able to recompile PostgreSQL, it'd be great if
you could test it.  Otherwise, no worries.

-- 
Thomas Munro
http://www.enterprisedb.com


Re: Fail to create PK or index for large table in Windows

From
Thomas Munro
Date:
On Tue, Nov 13, 2018 at 10:57 PM Thomas Munro
<thomas.munro@enterprisedb.com> wrote:
> Oh, I think I just understood the root bug.  It looks like a stupid
> "large file" problem, 32 bit off_t being overflowed.  I'll keep the
> discussion off that over on the other thread.  I posted a patch on
> that thread; if you're able to recompile PostgreSQL, it'd be great if
> you could test it.  Otherwise, no worries.

Pavel offered privately to test the change if I could supply a build.
I tried asking AppVeyor to build the current master branch as of a few
minutes ago and consider the "Release" directory to be an "artifact",
and amazingly it spat out a .zip file available for 6 months:

https://ci.appveyor.com/project/macdice/postgres/builds/20341645/artifacts

I'm not sure how easy this will be to work with if you're used to one
of those installers.  Hopefully all the libraries needed are present
and you can just unzip it and run "initdb.exe -D pgdata",
"postgres.exe -D pgdata" -- but I have no personal experience of
PostgreSQL on Windows so I'm guessing here.  Pavel, if you have time
to try testing the bug fix that was committed, that'd be great.  If it
goes well, perhaps we could consider making bleeding edge builds
available all the time.

-- 
Thomas Munro
http://www.enterprisedb.com


Re: Fail to create PK or index for large table in Windows

From
Thomas Munro
Date:
On Fri, Nov 16, 2018 at 2:10 PM Thomas Munro
<thomas.munro@enterprisedb.com> wrote:
> <thomas.munro@enterprisedb.com> wrote:
> https://ci.appveyor.com/project/macdice/postgres/builds/20341645/artifacts
>
> I'm not sure how easy this will be to work with if you're used to one
> of those installers.  Hopefully all the libraries needed are present
> and you can just unzip it and run "initdb.exe -D pgdata",
> "postgres.exe -D pgdata" -- but I have no personal experience of
> PostgreSQL on Windows so I'm guessing here.  Pavel, if you have time
> to try testing the bug fix that was committed, that'd be great.  If it
> goes well, perhaps we could consider making bleeding edge builds
> available all the time.

A couple of people wrote to me off-list to say they'd like to test the
11 stable branch, not the master branch, and would like a build.
Thanks for testing!  Presumably they might want to test against their
existing pgdata where they've seen the problem.  Fair enough -- but
please do use a *copy* of your pgdata directory, not your production
system, if it's a database you care about!  These builds are for
testing only, though the the stable 11 branch offers a preview of what
will probably ship in 11.2.

Here's the REL_11_STABLE branch built for Windows as of a couple minutes ago:

https://ci.appveyor.com/project/macdice/postgres/builds/20412803/artifacts

-- 
Thomas Munro
http://www.enterprisedb.com


Re: Fail to create PK or index for large table in Windows

From
Thomas Munro
Date:
On Tue, Nov 20, 2018 at 8:51 AM Thomas Munro
<thomas.munro@enterprisedb.com> wrote:
> Here's the REL_11_STABLE branch built for Windows as of a couple minutes ago:
>
> https://ci.appveyor.com/project/macdice/postgres/builds/20412803/artifacts

I received an off-list confirmation that a large CREATE INDEX could
complete normally on a database that previously showed the error,
using that snapshot of 11 stable.

I'm really pleased with this experiment: it seems there are plenty of
people willing to test on Windows if we can supply builds, in
particular people who've run into a bug.  This makes me think that I
should figure out how to get cfbot to spit out a builds for all
commitfest entries so that proposed fixes and features are
automatically available in a format usable by anyone with Windows and
no special developer tools.  The only problem is the space
requirement; if I just turn that on naively it'll eat 2-3TB of
appveyor's archive space and they might hate me, so I'll need to come
up with a way to push the zip files to external storage and keep just
the latest one, and make sure it's clear which master commit + patches
went into the build.

-- 
Thomas Munro
http://www.enterprisedb.com