Re: logrep stuck with 'ERROR: int2vector has too many elements' - Mailing list pgsql-hackers

From Andres Freund
Subject Re: logrep stuck with 'ERROR: int2vector has too many elements'
Date
Msg-id 20230115203631.gqespegsyphzjebx@awork3.anarazel.de
Whole thread Raw
In response to Re: logrep stuck with 'ERROR: int2vector has too many elements'  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: logrep stuck with 'ERROR: int2vector has too many elements'
List pgsql-hackers
Hi,

On 2023-01-15 15:17:16 -0500, Tom Lane wrote:
> Andres Freund <andres@anarazel.de> writes:
> > On 2023-01-15 14:39:41 -0500, Tom Lane wrote:
> >> But I suppose we are stuck with that, seeing that this datatype choice
> >> is effectively part of the logrep protocol now.  I think the only
> >> reasonable solution is to get rid of the FUNC_MAX_ARGS restriction
> >> in int2vectorin.  We probably ought to back-patch that as far as
> >> pg_publication_rel.prattrs exists, too.
> 
> > Are you thinking of introducing another, or just "rely" on too long arrays to
> > trigger errors when forming tuples?
> 
> There's enough protections already, eg repalloc will complain if you
> try to go past 1GB.

We'll practically error out at a much lower limit than that, due to, due to
reaching the max length of a row and int2vector not being toastable. So I'm
just wondering if we want to set the limit to something that'll commonly avoid
erroring out out with something like
  ERROR:  54000: row is too big: size 10048, maximum size 8160

For the purpose here a limit of MaxTupleAttributeNumber or such instead of
FUNC_MAX_ARGS would do the trick, I think?


> diff --git a/src/backend/utils/adt/int.c b/src/backend/utils/adt/int.c
> index e47c15a54f..44d1c7ad0c 100644
> --- a/src/backend/utils/adt/int.c
> +++ b/src/backend/utils/adt/int.c
> @@ -143,11 +143,13 @@ int2vectorin(PG_FUNCTION_ARGS)
>      char       *intString = PG_GETARG_CSTRING(0);
>      Node       *escontext = fcinfo->context;
>      int2vector *result;
> +    int            nalloc;
>      int            n;
>  
> -    result = (int2vector *) palloc0(Int2VectorSize(FUNC_MAX_ARGS));
> +    nalloc = 32;                /* arbitrary initial size guess */
> +    result = (int2vector *) palloc0(Int2VectorSize(nalloc));
>  
> -    for (n = 0; n < FUNC_MAX_ARGS; n++)
> +    for (n = 0;; n++)
>      {
>          long        l;
>          char       *endp;
> @@ -157,6 +159,12 @@ int2vectorin(PG_FUNCTION_ARGS)
>          if (*intString == '\0')
>              break;
>  
> +        if (n >= nalloc)
> +        {
> +            nalloc *= 2;
> +            result = (int2vector *) repalloc(result, Int2VectorSize(nalloc));
> +        }

Should this be repalloc0? I don't know if the palloc0 above was just used with
the goal of initializing the "header" fields, or also to avoid trailing
uninitialized bytes?

Greetings,

Andres Freund



pgsql-hackers by date:

Previous
From: Andres Freund
Date:
Subject: Re: Sampling-based timing for EXPLAIN ANALYZE
Next
From: Tom Lane
Date:
Subject: Re: logrep stuck with 'ERROR: int2vector has too many elements'