
Re: pg_dump / copy bugs with "big lines" ? - Mailing list pgsql-hackers

From	Daniel Verite
Subject	Re: pg_dump / copy bugs with "big lines" ?
Date	March 23, 2016 17:14:28
Msg-id	d3fe524a-1c78-4cb3-9814-849cd4f43fe6@mm
In response to	Re: pg_dump / copy bugs with "big lines" ? (Alvaro Herrera <alvherre@2ndquadrant.com>)
Responses	Re: pg_dump / copy bugs with "big lines" ?
List	pgsql-hackers


Alvaro Herrera wrote:

> >   tuple = (HeapTuple) palloc0(HEAPTUPLESIZE + len);
> >
> > which fails because (HEAPTUPLESIZE + len) is again considered
> > an invalid size, the size being 1468006476 in my test.
>
> Um, it seems reasonable to make this one be a huge-zero-alloc:
>
>     MemoryContextAllocExtended(CurrentMemoryContext,
>                  HEAPTUPLESIZE + len,
>        MCXT_ALLOC_HUGE | MCXT_ALLOC_ZERO)

Good, this allows the tests to go to completion! The tests in question
are dump/reload of a row with several fields totalling 1.4GB (deflated),
with COPY TO/FROM file and psql's \copy in both directions, as well as
pg_dump followed by pg_restore|psql.
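
For readers following along, here is a minimal standalone sketch of why
the plain palloc0() call rejects that size while the MCXT_ALLOC_HUGE
variant accepts it. The two limits are copied from memory of
PostgreSQL's memutils.h (MaxAllocSize and MaxAllocHugeSize); the macro
and function names below are made up for this illustration and are not
the backend's own.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Assumed to match PostgreSQL's MaxAllocSize and MaxAllocHugeSize.
 * They are redefined here under hypothetical names so that the sketch
 * compiles without the backend headers.
 */
#define SKETCH_MAX_ALLOC_SIZE       ((size_t) 0x3fffffff)  /* 1GB - 1 */
#define SKETCH_MAX_ALLOC_HUGE_SIZE  (SIZE_MAX / 2)

/* Size check applied to an ordinary palloc()-style request. */
static bool
alloc_size_is_valid(size_t size)
{
    return size <= SKETCH_MAX_ALLOC_SIZE;
}

/* Size check applied when the caller asks for a "huge" allocation. */
static bool
alloc_huge_size_is_valid(size_t size)
{
    return size <= SKETCH_MAX_ALLOC_HUGE_SIZE;
}
```

With the size from the test, alloc_size_is_valid((size_t) 1468006476)
is false and alloc_huge_size_is_valid((size_t) 1468006476) is true,
which is consistent with the failure and the fix discussed above.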

The modified patch is attached.

It provides a useful mitigation for dumping and reloading databases
that have rows in the 1GB-2GB range, but only within these limits:

- no single field has a text representation exceeding 1GB.
- no row as text exceeds 2GB (\copy from fails beyond that. AFAICS we
  could push this to 4GB with limited changes to libpq, by
  interpreting the Int32 field in the CopyData message as unsigned).
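
To illustrate the second limit, here is a small sketch of how the same
four length bytes read differently as a signed or an unsigned 32-bit
value. This is not libpq code: the helper names are hypothetical, and
the only assumption taken from the protocol is that the length word is
sent as four bytes in network (big-endian) order.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assemble the 4-byte network-order length word of a message. */
static uint32_t
read_be32(const unsigned char *buf)
{
    assert(buf != NULL);
    return ((uint32_t) buf[0] << 24) |
           ((uint32_t) buf[1] << 16) |
           ((uint32_t) buf[2] << 8) |
           (uint32_t) buf[3];
}

/* Signed reading: any length of 2GB or more comes out negative. */
static int64_t
length_as_signed(const unsigned char *buf)
{
    uint32_t raw = read_be32(buf);

    /* Two's-complement reinterpretation done with explicit arithmetic. */
    if (raw >= UINT32_C(0x80000000))
        return (int64_t) raw - INT64_C(4294967296);
    return (int64_t) raw;
}

/* Unsigned reading: the same bytes cover lengths up to 4GB - 1. */
static int64_t
length_as_unsigned(const unsigned char *buf)
{
    return (int64_t) read_be32(buf);
}
```

For the bytes C0 00 00 00, the signed reading gives -1073741824 while
the unsigned reading gives 3221225472 (3GB), which is the headroom the
4GB remark refers to.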

It's also possible to go beyond 4GB per row with this patch, but
only when the data does not go through the client/server protocol.
I've managed to get a 5.6GB single-row file with COPY TO file. That
doesn't help with pg_dump, but it might be useful in other situations.

In StringInfo, I've changed int64 to Size, because on 32-bit platforms
the downcast from int64 to Size is problematic, and as the rest of the
allocation routines seem to favor Size, it seems more consistent
anyway.
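
The downcast problem can be sketched as follows. Since Size is 64 bits
wide on a 64-bit build, the sketch uses a hypothetical Size32 typedef
as a stand-in for what Size would be on a 32-bit platform; none of
these names come from the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for Size as it would be on a 32-bit platform. */
typedef uint32_t Size32;

/*
 * Narrowing an int64 length to a 32-bit unsigned size silently keeps
 * only the low 32 bits (the conversion is modulo 2^32).
 */
static Size32
downcast_to_size32(int64_t len)
{
    return (Size32) len;
}

/* Explicit range check that a caller would need before narrowing. */
static bool
fits_in_size32(int64_t len)
{
    return len >= 0 && (uint64_t) len <= UINT32_MAX;
}
```

For example, a length of 5GB + 16 bytes (5368709136) narrows to
1073741840, so an allocation sized from the narrowed value would be
about 4GB too small without any error being raised.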

I couldn't test on 32-bit though, as I never seem to have enough
free contiguous memory available on a 32-bit VM to handle
that kind of data.

Best regards,
--
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite

Attachment

huge-stringinfo-v2.diff
