General Bug Report: Files greater than 1 GB are created while sorting

From:    Unprivileged user
Date:
============================================================================
                        POSTGRESQL BUG REPORT TEMPLATE
============================================================================


Your name               : Doug Mitchell
Your email address      : doug@mitchcraft.com

Category                : runtime: back-end
Severity                : serious

Summary: Files greater than 1 GB are created while sorting

System Configuration
--------------------
  Operating System   : Linux 2.2.5 ELF (RedHat 6.0)

  PostgreSQL version : 6.5

  Compiler used      : cc version egcs-2.91.66 19990314/Linux (egcs-1.1.2 release)

Hardware:
---------
PII-300 with 384 MB RAM

Versions of other tools:
------------------------
$ rpm -q make flex
make-3.77-6
flex-2.5.4a-6


--------------------------------------------------------------------------

Problem Description:
--------------------

The backend is creating files bigger than 1 GB when sorting
and it will break when the file gets to 2 GB.

Here are the biggest files:

1049604 -rw-------   1 postgres postgres 1073741824 Jun 30 19:10 bigtable_pkey
1049604 -rw-------   1 postgres postgres 1073741824 Jun 30 19:36 pg_temp.2446.0
1049604 -rw-------   1 postgres postgres 1073741824 Jun 30 19:55 bigtable
1122136 -rw-------   1 postgres postgres 1147937412 Jun 30 21:39 pg_temp2769.3
1148484 -rw-------   1 postgres postgres 1174890288 Jun 30 21:26 pg_temp2769.4

I also have some smaller ".1" files that are the rest of the above
files along with everything else you might expect to find in a
PG database directory.  It's those two last big ones that are
troublesome.

Tables and indices are segmenting just fine at 1 GB, but
some sort files just keep growing.  I did actually get a
back-end error one time when one exceeded 2 GB.

Thanks,
Doug


--------------------------------------------------------------------------

Test Case:
----------
Just do:
mydb=> select * into bigtable2 from bigtable order by custno;

You might want to decrease RELSEGSZ to see it faster.
Mail me back if you can't reproduce it.
(and please make the bug report form boxes bigger!)

--------------------------------------------------------------------------

Solution:
---------
Something is not using the Magnetic Disk Storage Manager,
but is writing a temp file out on its own during the sort.
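
For reference, a minimal sketch of the segment arithmetic the storage
manager applies to tables and indexes, which is why those files stop at
1 GB.  BLCKSZ and RELSEG_SIZE are the real backend constants (values
assume the default 8K block size); block_to_segment is a hypothetical
helper, not the actual md.c code.

    #include <stdio.h>

    #define BLCKSZ      8192                      /* bytes per disk block    */
    #define RELSEG_SIZE (0x40000000 / BLCKSZ)     /* blocks per 1 GB segment */

    /* Which ".N" file, and what byte offset within it, holds block blkno? */
    static void
    block_to_segment(unsigned int blkno, unsigned int *segno, long *seekpos)
    {
        *segno   = blkno / RELSEG_SIZE;
        *seekpos = (long) (blkno % RELSEG_SIZE) * BLCKSZ;
    }

    int
    main(void)
    {
        unsigned int segno;
        long         seekpos;

        /* block 150000 of bigtable lands in segment 1, i.e. "bigtable.1" */
        block_to_segment(150000, &segno, &seekpos);
        printf("segment %u, offset %ld\n", segno, seekpos);
        return 0;
    }

The temp files written during the sort never go through this mapping,
which is why they keep growing past the segment boundary.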



--------------------------------------------------------------------------



Nice analysis of the problem.  The psort code is probably not using those
routines because we never expected sorts to get over 2 gigs.


> The backend is creating files bigger than 1 GB when sorting
> and it will break when the file gets to 2 GB.
>
> Here are the biggest files:
>
> 1049604 -rw-------   1 postgres postgres 1073741824 Jun 30 19:10 bigtable_pkey
> 1049604 -rw-------   1 postgres postgres 1073741824 Jun 30 19:36 pg_temp.2446.0
> 1049604 -rw-------   1 postgres postgres 1073741824 Jun 30 19:55 bigtable
> 1122136 -rw-------   1 postgres postgres 1147937412 Jun 30 21:39 pg_temp2769.3
> 1148484 -rw-------   1 postgres postgres 1174890288 Jun 30 21:26 pg_temp2769.4
>
> I also have some smaller ".1" files that are the rest of the above
> files along with everything else you might expect to find in a
> PG database directory.  It's those two last big ones that are
> troublesome.
>
> Tables and indices are segmenting just fine at 1 GB, but
> some sort files just keep growing.  I did actually get a
> back-end error one time when one exceeded 2 GB.
>
> Thanks,
> Doug
>
>
> --------------------------------------------------------------------------
>
> Test Case:
> ----------
> Just do:
> mydb=> select * into bigtable2 from bigtable order by custno;
>
> You might want to decrease RELSEGSZ to see it faster.
> Mail me back if you can't reproduce it.
> (and please make the bug report form boxes bigger!)
>
> --------------------------------------------------------------------------
>
> Solution:
> ---------
> Something is not using the Magnetic Disk Storage Manager,
> but is writing a temp file out on its own during the sort.

--
  Bruce Momjian                        |  http://www.op.net/~candle
  maillist@candle.pha.pa.us            |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026

I have added this to the TODO list:

    * Allow creation of sort tables > 1 Gig

> Problem Description:
> --------------------
>
> The backend is creating files bigger than 1 GB when sorting
> and it will break when the file gets to 2 GB.
>
> Here are the biggest files:
>
> 1049604 -rw-------   1 postgres postgres 1073741824 Jun 30 19:10 bigtable_pkey
> 1049604 -rw-------   1 postgres postgres 1073741824 Jun 30 19:36 pg_temp.2446.0
> 1049604 -rw-------   1 postgres postgres 1073741824 Jun 30 19:55 bigtable
> 1122136 -rw-------   1 postgres postgres 1147937412 Jun 30 21:39 pg_temp2769.3
> 1148484 -rw-------   1 postgres postgres 1174890288 Jun 30 21:26 pg_temp2769.4

I have renamed these sort temp tables to pg_sorttemp so they will not be
confused with actual temp tables.  Because people installing 6.5.1 will
have stopped their postmasters, that will not cause any problems.

You are safe up to 2 gigs, and at that point the OS can cause a
problem.  The new naming should make the cause clearer.  Don't know if
we can get this done in 6.5.1 because the change to segment these
requires some work.  Looks like the psort code goes right to fd/*,
bypassing the storage manager.

--
  Bruce Momjian                        |  http://www.op.net/~candle
  maillist@candle.pha.pa.us            |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026

Bruce Momjian <maillist@candle.pha.pa.us> writes:
> I have renamed these sort temp tables to pg_sorttemp so they will not be
> confused with actual temp tables.

I didn't realize that the names generated for temp tables were so close
to those generated for temp files.  Changing one or the other does seem
like a good idea.  But I do not like "pg_sorttemp" because fd.c's
temporary-file mechanism is used for more things than just sorting.
Hash joins, for example.  Can we think of a better name?

Alternatively, how about including the user-given name for a temp table
into its real name?  That would be helpful for debugging, I'm sure.
I'm thinking of something like

    snprintf(newrelname, NAMEDATALEN, "pg_temp.%d.%u.%s",
         (int) MyProcPid, uniqueId++, userrelname);

(relying on snprintf to truncate the user name if too long, here).
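
For what it's worth, a standalone sketch of what that naming scheme
produces.  NAMEDATALEN and MyProcPid are the real backend symbols,
stubbed here with sample values so the example compiles on its own;
"bigtable2" is just an illustrative relation name.

    #include <stdio.h>

    #define NAMEDATALEN 32
    static int          MyProcPid = 2769;   /* normally the backend's PID     */
    static unsigned int uniqueId  = 0;      /* per-backend temp-table counter */

    int
    main(void)
    {
        char        newrelname[NAMEDATALEN];
        const char *userrelname = "bigtable2";

        /* snprintf silently truncates if the result exceeds NAMEDATALEN */
        snprintf(newrelname, NAMEDATALEN, "pg_temp.%d.%u.%s",
                 (int) MyProcPid, uniqueId++, userrelname);

        printf("%s\n", newrelname);          /* e.g. pg_temp.2769.0.bigtable2 */
        return 0;
    }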


> You are safe up to 2 gigs, and at that point the OS can cause a
> problem.  The new naming should make the cause clearer.  Don't know if
> we can get this done in 6.5.1 because the change to segment these
> requires some work.  Looks like the psort code goes right to fd/*,
> bypassing the storage manager.

Yes, it will take some thought to figure out how to handle multi-segment
temp files without cluttering the code too badly.  I think it can be
handled inside fd.c, though.
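
One possible shape for that, as a rough sketch only: a small wrapper
that rolls over to a new physical file once 1 GB has been written to
the current one.  The SegTempFile type and the seg_* names below are
hypothetical, not the fd.c API, and plain stdio is used just to keep
the example self-contained.

    #include <stdio.h>
    #include <string.h>

    #define TEMP_SEG_SIZE 0x40000000UL      /* 1 GB per physical file */

    typedef struct SegTempFile
    {
        char          basename[64];         /* e.g. "pg_sorttemp.2769.3" */
        int           cur_seg;              /* segment currently open    */
        unsigned long cur_written;          /* bytes written to that seg */
        FILE         *fp;
    } SegTempFile;

    static int
    seg_open_current(SegTempFile *tf)
    {
        char path[96];

        snprintf(path, sizeof(path), "%s.%d", tf->basename, tf->cur_seg);
        tf->fp = fopen(path, "wb");
        tf->cur_written = 0;
        return tf->fp ? 0 : -1;
    }

    /* Write, rolling over to a new segment file at the 1 GB boundary. */
    static int
    seg_temp_write(SegTempFile *tf, const char *buf, size_t len)
    {
        if (tf->cur_written + len > TEMP_SEG_SIZE)
        {
            fclose(tf->fp);
            tf->cur_seg++;
            if (seg_open_current(tf) < 0)
                return -1;
        }
        if (fwrite(buf, 1, len, tf->fp) != len)
            return -1;
        tf->cur_written += len;
        return 0;
    }

    int
    main(void)
    {
        SegTempFile tf = {"pg_sorttemp.2769.0", 0, 0, NULL};
        char        buf[8192];

        memset(buf, 0, sizeof(buf));
        if (seg_open_current(&tf) == 0)
        {
            seg_temp_write(&tf, buf, sizeof(buf));  /* rolls over at 1 GB */
            fclose(tf.fp);
        }
        return 0;
    }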

Note that under ordinary circumstances, the data being processed by a
sort or hash join will be written into several temp files that each get
just a fraction of the data; so you would not actually see a problem
until you got to several-times-2-Gig total data volume.

            regards, tom lane

> Bruce Momjian <maillist@candle.pha.pa.us> writes:
> > I have renamed these sort temp tables to pg_sorttemp so they will not be
> > confused with actual temp tables.
>
> I didn't realize that the names generated for temp tables were so close
> to those generated for temp files.  Changing one or the other does seem
> like a good idea.  But I do not like "pg_sorttemp" because fd.c's
> temporary-file mechanism is used for more things than just sorting.
> Hash joins, for example.  Can we think of a better name?

I am open to other names.  It is really for joining and sorting.
Suggestions?

>
> Alternatively, how about including the user-given name for a temp table
> into its real name?  That would be helpful for debugging, I'm sure.
> I'm thinking of something like
>
>     snprintf(newrelname, NAMEDATALEN, "pg_temp.%d.%u.%s",
>          (int) MyProcPid, uniqueId++, userrelname);
>
> (relying on snprintf to truncate the user name if too long, here).

You can only truncate the user-supplied part, not the actual numbers.
I guess we could do that.  I wanted the names to be uniform, I guess.


> Note that under ordinary circumstances, the data being processed by a
> sort or hash join will be written into several temp files that each get
> just a fraction of the data; so you would not actually see a problem
> until you got to several-times-2-Gig total data volume.

He already has 1.4 gig sort files.

--
  Bruce Momjian                        |  http://www.op.net/~candle
  maillist@candle.pha.pa.us            |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026