Thread: High table creation rate results in “File exists” error
Hi,
We occasionally run into an internal error when creating tables. The error message is «could not create file "base/17006/2851173576": File exists» with SQLSTATE 58P02. The attachment is included with this email contains the PostgreSQL logs. Note that some identifiers are renamed in the logs because they contain sensitive information, but identifier names are unlikely the source of the error.
An old mailing list thread [1] suggests a problem related to OID wraparound. Our use case involves creating and dropping thousands of tables every day. The CREATE TABLE statements for these tables are similar to the one in the attached logs, but with different table names and columns. Nonetheless, the error is difficult to reproduce as it happens only sporadically. Retrying the CREATE TABLE statement fixes the problem temporarily.
Some more information about our PostgreSQL instance and usage:
- PostgreSQL version: PostgreSQL 10.6 (Ubuntu 10.6-1.pgdg16.04+1) on x86_64-pc-linux-gnu, compiled by gcc (Ubuntu 5.4.0-6ubuntu1~16.04.10) 5.4.0 20160609, 64-bit.
- The DBMS did not crash or abruptly shut down recently.
- The file base/17006/2851173576 was actually in use by an index, as indicated by pg_class.relfilenode.
- Autovacuum is disabled for these tables. Logging is disabled for these tables upon creation, but enabled after filling them with data.
- The query “SELECT max(oid) FROM pg_class” reports 4,294,603,797, which is close to 2³² - 1.
Kind regards,
Radek Slupik
Channable
Attachment
On Tue, Jan 21, 2020 at 02:04:24PM +0100, Radek Slupik wrote: >Hi, > >We occasionally run into an internal error when creating tables. The error >message is «could not create file "base/17006/2851173576": File exists» >with SQLSTATE 58P02. The attachment is included with this email contains >the PostgreSQL logs. Note that some identifiers are renamed in the logs >because they contain sensitive information, but identifier names are >unlikely the source of the error. > >An old mailing list thread [1] suggests a problem related to OID >wraparound. Our use case involves creating and dropping thousands of tables >every day. The CREATE TABLE statements for these tables are similar to the >one in the attached logs, but with different table names and columns. >Nonetheless, the error is difficult to reproduce as it happens only >sporadically. Retrying the CREATE TABLE statement fixes the problem >temporarily. > >Some more information about our PostgreSQL instance and usage: > > - PostgreSQL version: PostgreSQL 10.6 (Ubuntu 10.6-1.pgdg16.04+1) on > x86_64-pc-linux-gnu, compiled by gcc (Ubuntu 5.4.0-6ubuntu1~16.04.10) 5.4.0 > 20160609, 64-bit. > - The DBMS did not crash or abruptly shut down recently. > - The file base/17006/2851173576 was actually in use by an index, as > indicated by pg_class.relfilenode. > - Autovacuum is disabled for these tables. Logging is disabled for these > tables upon creation, but enabled after filling them with data. > - The query “SELECT max(oid) FROM pg_class” reports 4,294,603,797, which > is close to 2³² - 1. > So what was the actual rate of table creation? How long did it take to get to this state? What kernel/filesystem was used? What did lsof say? regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Hi, thanks for looking into this! On Wed, 22 Jan 2020 at 18:07, Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote: > > So what was the actual rate of table creation? How long did it take to > get to this state? Currently we create tables at about 1.8 Hz. We drop them at the same rate, almost all tables are short-lived (between 1d and 1h). We started seeing this error roughly 1.5 years after we started using this system, but in the beginning the rate was not as high as it is now. I don’t have all historical occurrences at hand, but in the past 14 days, the error has occurred twice. > What kernel/filesystem was used? What did lsof say? We use kernel 4.15.0-1036-gcp and the file system is ext4. Lsof does not currently list the file in the most recent error message, but I don't know what the state was at the time of the error. Best, Ruud