Re: pg_class.relpages/allvisible probably shouldn't be a int4 - Mailing list pgsql-hackers

From Andres Freund
Subject Re: pg_class.relpages/allvisible probably shouldn't be a int4
Date
Msg-id 20140511213037.GA9586@awork2.anarazel.de
Whole thread Raw
In response to Re: pg_class.relpages/allvisible probably shouldn't be a int4  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: pg_class.relpages/allvisible probably shouldn't be a int4
Re: pg_class.relpages/allvisible probably shouldn't be a int4
List pgsql-hackers
On 2014-05-11 12:24:30 -0400, Tom Lane wrote:
> Andres Freund <andres@2ndquadrant.com> writes:
> > On 2014-05-10 23:21:34 -0700, Peter Geoghegan wrote:
> >> On Fri, May 9, 2014 at 1:50 PM, Andres Freund <andres@2ndquadrant.com> wrote:
> >>> And adding a proper unsigned type doesn't sound like a small amount of work.
> 
> >> Perhaps not, but it's overdue. We ought to have one.
> 
> > Maybe. But there's so many things to decide around it that I don't think
> > it's a good prerequisite for not showing essentially corrupted values in
> > a supported scenario.
> 
> It's a lot harder than it sounds at first; see past discussions about how
> we could shoehorn one into the numeric type hierarchy.  And think about
> how C expressions that mix signed and unsigned inputs tend to give
> surprising results :-(

Yea. I really don't like to take on such a major project to solve a
minor problem.
What I am thinking about right now is to expose a 'pg_blocknumber'
type. That only does very basic operations and implicitly casts to
int64. That's probably a much more handleable problem and it also might
give us some more experience with unsigned types. Comments?

> The bigger picture though is that it's hard to get excited about this
> particular scenario, because if you are up to the point where your table
> size overflows int32, you are less than a factor of 2 away from the hard
> limit of BlockNumber, and will therefore soon have much bigger problems to
> worry about than whether pg_class.relpages prints confusingly.

Well. It takes several years to get to 16TB, so properly supporting >16TB &&
< 32TB is a pretty good step. It'll allow a couple of years continued operation.

> My advice
> to anyone who reported this from the field would certainly be "time to
> think about partitioning that table".

That definitely got to be the mid to longterm plan. The problem is that
our partitioning sucks. Majorly. My hope is that by allowing that large
tables to work more properly we have time to improve partitioning to the
level that it doesn't basically requires to remove all nontrivial
constraints.

> So if I were to take Andres'
> complaint seriously at all, I'd be thinking in terms of "do we need to
> widen BlockNumber to int64?", not "how do we make this print as
> unsigned?".  But I doubt such a proposal would fly, because of the
> negative impact on index sizes.

Yea, I am not wild for that either. I guess migrating to a postgres with
a larger blocksize is the next step.

> A possible cosmetic fix is to just clamp the value vacuum/analyze
> will store into relpages at INT_MAX, while scaling reltuples so that
> the all-important reltuples/relpages ratio stays at reality.  But
> that might be harder than it sounds too, because of the moving
> average tracking behavior for reltuples.  And it'd be a code path
> that gets no meaningful testing :-(

I am absolutely not a fan of that either :(.

Greetings,

Andres Freund

-- Andres Freund                       http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training &
Services



pgsql-hackers by date:

Previous
From: Magnus Hagander
Date:
Subject: Re: imprecise pg_basebackup documentation about excluded files
Next
From: Andres Freund
Date:
Subject: Re: 9.5: UPDATE/DELETE .. ORDER BY .. LIMIT ..