On Mon, Feb 07, 2022 at 10:10:53AM -0700, David G. Johnston wrote:
> On Mon, Feb 7, 2022 at 9:58 AM Esteban Zimanyi <estebanzimanyi@gmail.com>
> wrote:
>
> >
> > As suggested by David, this goes beyond the "traditional" usage of
> > PostgreSQL. Therefore my questions are
> > * What is the suggested strategy to splitting these 2K attributes into
> > vertically partitioned tables where the tables are linked by the primary
> > key (e.g. trip number in the example above). Are there any limitations/best
> > practices in the number/size of TOASTED attributes that a table should
> > contain.
> > * In each partitioned table containing N TOASTED attributes, given the
> > above requirements, are there any limitations/best practices in storing
> > them using extended storage or an alternative one such as external.
> >
> >
> Frankly, the best practice is "don't have that many columns". Since you
> do, I posit that you are just going to have to make decisions (possibly
> with experimentation) on your own. Or maybe ask around on a MobilityDB
> forum what people using that tool and having these kinds of data structures
> do. From a core PostgreSQL perspective you've already deviated from the
> model structures that it was designed with in mind.
> I'm really confused that you'd want the data value itself to contain a
> timestamp that, on a per-row basis, should be the same timestamp that every
> other value on the row has. Extracting the timestamp to it own column and
> using simpler and atomic data types is how core PostgreSQL and the
> relational model normalization recommend dealing with this situation. Then
> you just break up the attributes of a similar nature into their own tables
> based upon their shared nature. In almost all cases relying on "main"
> storage.
Actually looking at the original example:
> CREATE TYPE tint (
> internallength = variable,
> [...]
> storage = extended,
> alignment = double,
> [...]
> );
I'm wondering if it's just some miscommunication here. If the tint data type
only needs to hold a timestamp and an int, I don't see why it would be
varlerna at all.
So if a single tint can internally hold thousands of (int, timestamptz), a bit
like pgpointcloud, then having it by default external (so both possibly
out-of-line and compressed) seems like a good idea, as you can definitely hit
the 8k boundary, it should compress nicely and you also avoid some quite high
tuple header overhead.