Thread: OS File Size > 1GB

OS File Size > 1GB

From
Chris Ruprecht
Date:
Hi all,

The default size of a Postgres file seems to be 1 GB. I know I can increase
that by modifying <can't remember what it was I did here - that parameter
which gives you the file size, once you divide it by the blocksize>
My question is: is it safe to do so?
My motivation: I symlink the files across different drives, but when Postgres
decides to create a new file (.1, .2, .3 ...), it is created in the
data/base/nnnn directory and not where all its little brothers and sisters
live and where they SHOULD be.

Best regards,
Chris

Re: OS File Size > 1GB

From
Tom Lane
Date:
Chris Ruprecht <chrup@earthlink.net> writes:
> The default size of a Postgres file seems to be 1 GB. I know, I can increase
> that by modifying <can't remember what it was I did here - that parameter
> which gives you the file size, once you divide it by the blocksize>
> My question is: is it safe to do so?

Yes, *if* your OS supports large files.

It might fail at 2GB, and definitely will fail beyond 4GB, because we
use 32-bit arithmetic to compute file offsets.  You could possibly fix
this with some fairly localized hacking in fd.c, which AFAIK is pretty
much the only place that actually deals in byte offsets rather than
block numbers.  If you were to make that code talk to a 64-bit-offset
fseek call, you could probably disable segment splitting entirely (look
in md.c to see the #ifdef for that).

If you try this, let us know how it works.  That code hasn't been
touched recently, but I think it would be cool if there were a
compile option for 64-bit file offsets in place of segment splitting.

            regards, tom lane

Re: OS File Size > 1GB

From
Chris Ruprecht
Date:
On Thursday 25 July 2002 01:31 pm, Tom Lane wrote:
> Chris Ruprecht <chrup@earthlink.net> writes:
> > The default size of a Postgres file seems to be 1 GB. I know, I can
> > increase that by modifying <can't remember what it was I did here - that
> > parameter which gives you the file size, once you divide it by the
> > blocksize> My question is: is it safe to do so?
>
> Yes, *if* your OS supports large files.
>

No problem on the Linux side. I'm not planning to run this on anything else.

> It might fail at 2GB, and definitely will fail beyond 4GB, because we

It will probably fail at 2 GB, since the seek position is signed.

> use 32-bit arithmetic to compute file offsets.  You could possibly fix
> this with some fairly localized hacking in fd.c, which AFAIK is pretty
> much the only place that actually deals in byte offsets rather than
> block numbers.  If you were to make that code talk to a 64-bit-offset
> fseek call, you could probably disable segment splitting entirely (look
> in md.c to see the #ifdef for that).
>

I have had a look at fd.c and md.c and it doesn't look too bad. I think I can
get away with modifying seekpos from type "long" to "unsigned long long",
which would give me the full 64 bits.
The most complicated issue, I would think, is to make the thing define
USE_64_BIT_SEEK in the './configure' process ;-). I'll take a shot at it and
see how it goes. Right now, my largest table (22 million records) is nearing
the 4 GB boundary: 3 x 1 GB plus about 700 MB.
I know Linux can handle sizes > 2 GB, I do that all the time with pg_dumpall,
although pg_dumpall chokes when it hits the 2 GB boundary. To circumvent
that, I pipe its output to cat and redirect cat's output to the final output
file (I'll have to look at pg_dump too, some time, to find out why this is
happening).

> If you try this, let us know how it works.  That code hasn't been
> touched recently, but I think it would be cool if there were a
> compile option for 64-bit file offsets in place of segment splitting.

I'm working on it - but don't hold your breath. I will need this working when
I have my application done, and right now I will do both in parallel, with
priority on the app. I'm not good with C, so I might miss a few things here
or there, but I'm sure we can get this working.

Best regards,
Chris

Re: OS File Size > 1GB

From
Bruce Momjian
Date:
Chris Ruprecht wrote:
> I know Linux can handle sizes > 2 GB, I do that all the time with pg_dumpall,
> although pg_dumpall chokes when it hits the 2 GB boundary. To circumvent
> that, I pipe its output to cat and redirect cat's output to the final output
> file (I'll have to look at pg_dump too, some time, to find out why this is
> happening).

How exactly does pg_dumpall choke on a 2GB boundary?

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026