Thread: BUG #16302: too many range table entries - when count partition table(65538 childs)

From: PG Bug reporting form <noreply@postgresql.org>

The following bug has been logged on the website:

Bug reference:      16302
Logged by:          Zhou Digoal
Email address:      digoal@126.com
PostgreSQL version: 12.2
Operating system:   CentOS 7.7 x64
Description:

Hi,
  When I count a partition table with many children, I get the error below:

```
postgres=> select count(*) from p;
ERROR:  54000: too many range table entries
LOCATION:  add_rte_to_flat_rtable, setrefs.c:425
```
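
The error comes from the range-table size check in add_rte_to_flat_rtable;
from memory, the PG 12 code looks roughly like this (paraphrased, not
copied verbatim):

```
/* Paraphrased from src/backend/optimizer/plan/setrefs.c (PG 12).
 * RT indexes at or above INNER_VAR (65000) would collide with the
 * special varnos, so the flattened range table is capped there. */
glob->finalrtable = lappend(glob->finalrtable, newrte);

if (IS_SPECIAL_VARNO(list_length(glob->finalrtable)))
    ereport(ERROR,
            (errcode(ERRCODE_PROGRAM_LIMIT_EXCEEDED),
             errmsg("too many range table entries")));
```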

To reproduce:

```
do language plpgsql $$
begin
  drop table if exists p;
  create table p (id int, info text, crt_time timestamp)
    partition by range (crt_time);
  create table p2020 partition of p for values from ('2020-01-01') to ('2021-01-01')
    partition by hash (id);
  create table p2021 partition of p for values from ('2021-01-01') to ('2022-01-01')
    partition by hash (id);
  for i in 0..30000 loop
    execute format('create table p2020_%s partition of p2020 for values with (modulus 65537, remainder %s)', i, i);
    execute format('alter table p2020_%s set (parallel_workers = 52)', i);
  end loop;
end;
$$;

do language plpgsql $$
begin
  for i in 30001..60000 loop
    execute format('create table p2020_%s partition of p2020 for values with (modulus 65537, remainder %s)', i, i);
    execute format('alter table p2020_%s set (parallel_workers = 52)', i);
  end loop;
end;
$$;

do language plpgsql $$
begin
  for i in 60001..65536 loop
    execute format('create table p2020_%s partition of p2020 for values with (modulus 65537, remainder %s)', i, i);
    execute format('alter table p2020_%s set (parallel_workers = 52)', i);
  end loop;
end;
$$;
```

best regards, 
digoal


PG Bug reporting form <noreply@postgresql.org> writes:
>   When I count a partition table with many children, I get the error below:

> postgres=> select count(*) from p;
> ERROR:  54000: too many range table entries
> LOCATION:  add_rte_to_flat_rtable, setrefs.c:425

This hardly seems like a bug.  We do not support an infinite number of
partitions --- and in the real world, performance would have tanked
long before you got to this many partitions.

            regards, tom lane



On Sun, Mar 15, 2020 at 12:03 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> PG Bug reporting form <noreply@postgresql.org> writes:
> >   When I count a partition table with many children, I get the error below:
>
> > postgres=> select count(*) from p;
> > ERROR:  54000: too many range table entries
> > LOCATION:  add_rte_to_flat_rtable, setrefs.c:425
>
> This hardly seems like a bug.  We do not support an infinite number of
> partitions --- and in the real world, performance would have tanked
> long before you got to this many partitions.

Would it make sense to document this hard upper bound on the number of
relations that can be handled by a query?

--
Thank you,
Amit



Amit Langote <amitlangote09@gmail.com> writes:
> On Sun, Mar 15, 2020 at 12:03 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> PG Bug reporting form <noreply@postgresql.org> writes:
>>> ERROR:  54000: too many range table entries

>> This hardly seems like a bug.  We do not support an infinite number of
>> partitions --- and in the real world, performance would have tanked
>> long before you got to this many partitions.

> Would it make sense to document this hard upper bound on the number of
> relations that can be handled by a query?

Well, if we say "PG can handle up to 64K relations in a query",
I think people would read that as meaning that you actually get
usable performance with up to 64K relations.  Which is a long
way away, even if certain specific cases might work acceptably.
The existing docs discourage using more than a few thousand
partitions, IIRC, and that seems like sufficient guidance for now.

            regards, tom lane



On Sun, Mar 15, 2020 at 10:38 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Well, if we say "PG can handle up to 64K relations in a query",
> I think people would read that as meaning that you actually get
> usable performance with up to 64K relations.  Which is a long
> way away, even if certain specific cases might work acceptably.
> The existing docs discourage using more than a few thousand
> partitions, IIRC, and that seems like sufficient guidance for now.

What would be the downside to raising INNER_VAR to, say, a billion? I
mean, I get that queries involving many tables aren't necessarily
going to work very well, but due to all of the work that has been done
in this area over the last couple of years, they have a much better
chance of working well now than they did before, and I think there is
some appetite to continue fixing things that would let them work even
better in the future.
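
For reference, the special varnos are currently defined about like this
(quoting primnodes.h from memory, so the comments may not be exact):

```
/* src/include/nodes/primnodes.h (PG 12), roughly */
#define INNER_VAR   65000   /* reference to inner subplan */
#define OUTER_VAR   65001   /* reference to outer subplan */
#define INDEX_VAR   65002   /* reference to index column */

#define IS_SPECIAL_VARNO(varno)   ((varno) >= INNER_VAR)
```

Raising those constants would raise the cap enforced in
add_rte_to_flat_rtable by the same amount, which is why this looks
cheap at first glance.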

Incidentally, this is the second complaint I've seen about this
problem in the last week, so it seems pretty clear that the
partitioning work is actually moving the needle in this area. We ought
to be happy, not sad, that people are trying to use more partitions;
we ought to be receptive, not resistant, to their reports of trouble;
and if we decide that a particular problem is absolutely impossible to
fix, we should certainly not resist documenting those restrictions
which are unavoidable. Refusing to give people either a fix or any
useful information about the limits seems very unhelpful.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Robert Haas <robertmhaas@gmail.com> writes:
> On Sun, Mar 15, 2020 at 10:38 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Well, if we say "PG can handle up to 64K relations in a query",
>> I think people would read that as meaning that you actually get
>> usable performance with up to 64K relations.  Which is a long
>> way away, even if certain specific cases might work acceptably.
>> The existing docs discourage using more than a few thousand
>> partitions, IIRC, and that seems like sufficient guidance for now.

> What would be the downside to raising INNER_VAR to, say, a billion?

We'd have to widen AttrNumber to int32 and find all the places that
assume it's only 16 bits.  (Good luck testing your way to any
confidence of having found them all; I'm not sure exactly how
to acquire such confidence.)  Maybe someday that will be a profitable
use of developer effort, but I have to say that I think that day is
a long way off.
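
(For reference, AttrNumber today is just this, per
src/include/access/attnum.h:

```
typedef int16 AttrNumber;

#define InvalidAttrNumber   0
#define MaxAttrNumber       32767
```

so every one of its users is a potential truncation site.)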

            regards, tom lane



On Tue, Mar 17, 2020 at 9:49 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
> > On Sun, Mar 15, 2020 at 10:38 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> >> Well, if we say "PG can handle up to 64K relations in a query",
> >> I think people would read that as meaning that you actually get
> >> usable performance with up to 64K relations.  Which is a long
> >> way away, even if certain specific cases might work acceptably.
> >> The existing docs discourage using more than a few thousand
> >> partitions, IIRC, and that seems like sufficient guidance for now.
>
> > What would be the downside to raising INNER_VAR to, say, a billion?
>
> We'd have to widen AttrNumber to int32 and find all the places that
> assume it's only 16 bits.  (Good luck testing your way to any
> confidence of having found them all; I'm not sure exactly how
> to acquire such confidence.)  Maybe someday that will be a profitable
> use of developer effort, but I have to say that I think that day is
> a long way off.

That sounds like a lot of work, but I think AttrNumber is used to
store varattno, and I think varno is typically stored using either
Index or int, both of which are 32 bits. Am I confused?

(It doesn't seem like we're all that careful about whether we're using
int, which is signed, or Index, which is unsigned, and we might want
to go around and tighten that up. But if we adopted my suggested value
of a billion it still wouldn't make any difference, because 10^9 < 2^31.
I feel fairly safe saying that queries involving more than a billion
relations will be impractical for other reasons for many years yet.)
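
For reference, the fields in question look about like this (primnodes.h,
abbreviated from memory):

```
/* src/include/nodes/primnodes.h (PG 12), abbreviated */
typedef struct Var
{
    Expr        xpr;
    Index       varno;      /* index of this var's relation in the range
                             * table, or INNER_VAR/OUTER_VAR/INDEX_VAR */
    AttrNumber  varattno;   /* attribute number of this var, or zero for
                             * all attrs ("whole-row Var") */
    /* ... type, collation, level, and location fields omitted ... */
} Var;
```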

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Robert Haas <robertmhaas@gmail.com> writes:
> On Tue, Mar 17, 2020 at 9:49 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> We'd have to widen AttrNumber to int32 and find all the places that
>> assume it's only 16 bits.  (Good luck testing your way to any
>> confidence of having found them all; I'm not sure exactly how
>> to acquire such confidence.)  Maybe someday that will be a profitable
>> use of developer effort, but I have to say that I think that day is
>> a long way off.

> That sounds like a lot of work, but I think AttrNumber is used to
> store varattno, and I think varno is typically stored using either
> Index or int, both of which are 32 bits. Am I confused?

Argh, -ENOCAFFEINE.  You're right, AttrNumber is not the relevant
type here.  There's still an issue of whether anyplace has tried
to store a varno in less-than-int, but it seems less likely than
I was thinking earlier.
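
Hypothetically, the kind of hazard to grep for would be something like
this (an illustration, not a known instance):

```
/* hypothetical illustration, not a known instance: a varno stored
 * in 16 bits would silently truncate once RT indexes pass 65535 */
uint16      stored_varno = (uint16) var->varno;
```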

> (It doesn't seem like we're all that careful about whether we're using
> int, which is signed, or Index, which is unsigned, and we might want
> to go around and tighten that up.

Yeah, I'd be leery of trying to use that last bit.  The cost-benefit
ratio on that is *definitely* not attractive.

            regards, tom lane



On Tue, Mar 17, 2020 at 10:42 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Argh, -ENOCAFFEINE.  You're right, AttrNumber is not the relevant
> type here.  There's still an issue of whether anyplace has tried
> to store a varno in less-than-int, but it seems less likely than
> I was thinking earlier.

That seems pretty unlikely to me, but even if it does happen, it's
just a bug which we could simply fix when we find it. I mean, I don't
want to have bugs, but there's no patch that is free of all risk. If
we go through and look around at how this is handled in various
places, and check as carefully as we can for possible problems, I
don't think it would be a particularly high-risk patch.

> > (It doesn't seem like we're all that careful about whether we're using
> > int, which is signed, or Index, which is unsigned, and we might want
> > to go around and tighten that up.
>
> Yeah, I'd be leery of trying to use that last bit.  The cost-benefit
> ratio on that is *definitely* not attractive.

That seems clear, but trying to be consistent about the types we use
for things is a good idea on general principle, even if we have no
intention of doing anything that really relies on it any time in the
foreseeable future. If nothing else, it might avoid some compiler
warnings.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



On 2020-03-17 16:04, Robert Haas wrote:
>>> (It doesn't seem like we're all that careful about whether we're using
>>> int, which is signed, or Index, which is unsigned, and we might want
>>> to go around and tighten that up.
>>
>> Yeah, I'd be leery of trying to use that last bit.  The cost-benefit
>> ratio on that is *definitely* not attractive.
> 
> That seems clear, but trying to be consistent about the types we use
> for things is a good idea on general principle, even if we have no
> intention of doing anything that really relies on it any time in the
> foreseeable future. If nothing else, it might avoid some compiler
> warnings.

You could make INNER_VAR -1 and so on.  Then it would probably break 
quite clearly if it's not signedness clean.
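
Something like this, as an untested sketch:

```
/* untested sketch: negative special varnos, so code that isn't
 * signedness-clean about varnos breaks visibly instead of silently */
#define INNER_VAR   (-1)    /* reference to inner subplan */
#define OUTER_VAR   (-2)    /* reference to outer subplan */
#define INDEX_VAR   (-3)    /* reference to index column */

#define IS_SPECIAL_VARNO(varno)   ((varno) < 0)
```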

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Peter Eisentraut <peter.eisentraut@2ndquadrant.com> writes:
> You could make INNER_VAR -1 and so on.  Then it would probably break 
> quite clearly if it's not signedness clean.

I kinda like this, because it'd avoid having random large constants
in the code.

We'd have to change Var.varno and some other places from Index to
plain int, or else invent a typedef --- but the number of places
currently using plain int for this sort of purpose is very large,
so probably a new typedef is not worth the hassle.

            regards, tom lane



On Thu, Mar 19, 2020 at 9:33 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Peter Eisentraut <peter.eisentraut@2ndquadrant.com> writes:
> > You could make INNER_VAR -1 and so on.  Then it would probably break
> > quite clearly if it's not signedness clean.
>
> I kinda like this, because it'd avoid having random large constants
> in the code.
>
> We'd have to change Var.varno and some other places from Index to
> plain int, or else invent a typedef --- but the number of places
> currently using plain int for this sort of purpose is very large,
> so probably a new typedef is not worth the hassle.

Honestly, having typedef int RangeTableIndex seems pretty appealing to
me. OK, maybe there aren't that many places, but it would make it a
lot more greppable, which seems pretty nice.
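
Something along these lines, as a hypothetical sketch:

```
/* hypothetical: a dedicated typedef for range table indexes --
 * plain signed int underneath, but far easier to grep for than
 * "Index" or bare "int" */
typedef int RangeTableIndex;

typedef struct Var
{
    Expr            xpr;
    RangeTableIndex varno;      /* RT index, or a special varno */
    AttrNumber      varattno;   /* attribute number */
    /* ... remaining fields as today ... */
} Var;
```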

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company