Re: Millions of tables - Mailing list pgsql-performance

From Tom Lane
Subject Re: Millions of tables
Date 2016-09-26 16:52:39
Msg-id 22569.1474908759@sss.pgh.pa.us
In response to Re: Millions of tables  (Jeff Janes <jeff.janes@gmail.com>)
List pgsql-performance
Jeff Janes <jeff.janes@gmail.com> writes:
> A problem is that those statistics are stored in one file (per database; it
> used to be one file per cluster).  With 8 million tables, that is going to
> be a pretty big file.  But the code pretty much assumes the file is going
> to be pretty small, and so it has no compunction about commanding that it
> be read and written, in its entirety, quite often.
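
(For scale: each table's entry in that file, a PgStat_StatTabEntry, runs to
something like 175 bytes of counters, so 8 million tables would mean a stats
file somewhere north of a gigabyte, re-read and rewritten in full.)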

I don't know that anyone ever believed it would be small.  But at the
time the pgstats code was written, there was no good alternative to
passing the data through files.  (And I'm not sure we envisioned
applications that would be demanding fresh data constantly, anyway.)

Now that the DSM stuff exists and has been more or less shaken out,
I wonder how practical it'd be to use a DSM segment to make the stats
collector's data available to backends.  You'd need a workaround for
the fact that not all the DSM implementations support resize (although
given the lack of callers of dsm_resize, one could be forgiven for
wondering whether any of that code has been tested at all).  But you
could imagine abandoning one DSM segment and creating a new one of
double the size anytime the hash tables got too big.
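
To make the abandon-and-double idea concrete, here's a rough C sketch.  The
dsm_create / dsm_segment_address / dsm_segment_handle / dsm_detach calls are
the real dynamic-shared-memory API; the StatsControl struct, the generation
counter, the function name, and the flat position-independent entry layout
are all invented for illustration, and locking is omitted.

/*
 * Sketch only, not PostgreSQL source: swap the stats data into a new
 * DSM segment of double the size instead of resizing in place.
 */
#include "postgres.h"
#include "storage/dsm.h"

/*
 * Small header in ordinary (fixed-size) shared memory, so backends
 * can always find the current segment.
 */
typedef struct StatsControl
{
    dsm_handle  handle;         /* handle of the live segment */
    uint32      generation;     /* bumped on every swap */
} StatsControl;

extern StatsControl *statsCtl;      /* assumed set up at startup */

static dsm_segment *cur_seg = NULL; /* collector's own mapping */
static Size cur_size = 0;

/*
 * Abandon the current segment and publish one twice as big, copying
 * nbytes of entry data across.  This only works if the data is
 * position-independent (offsets, not raw pointers), since the new
 * mapping can land at a different address.
 */
static void
stats_grow_segment(Size nbytes)
{
    Size        new_size = cur_size * 2;
    dsm_segment *new_seg = dsm_create(new_size, 0);

    memcpy(dsm_segment_address(new_seg),
           dsm_segment_address(cur_seg),
           nbytes);

    /* Tell backends where to look; they re-attach on generation bump. */
    statsCtl->handle = dsm_segment_handle(new_seg);
    statsCtl->generation++;

    /*
     * Detach the old segment.  DSM segments are reference-counted, so
     * it lives on until the last backend still attached lets go.
     */
    dsm_detach(cur_seg);
    cur_seg = new_seg;
    cur_size = new_size;
}

The useful property here is that reference counting makes the abandoned
segment disappear on its own once the last straggler detaches; no working
dsm_resize is needed at all.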

            regards, tom lane

