Questions about Partitioning - Mailing list pgsql-general

From Phoenix Kiula
Subject Questions about Partitioning
Date
Msg-id BANLkTikEB1rVYiBG9jTRLxkesXKq2wBXHA@mail.gmail.com
Whole thread Raw
Responses Re: Questions about Partitioning
Re: Questions about Partitioning
List pgsql-general
While I fix some bigger DB woes, I have learned a lesson. Huge indexes
and tables are a pain.

Which makes me doubly keen on looking at partitioning.

Most examples I see online are partitioned by date. As in months, or
quarter, and so on. This doesn't work for me as I don't have too much
logic required based on time.

The biggest, highest volume SELECT in my database happens through an
"alias" column. This is an alphanumeric column. The second-biggest
SELECT happens through the "userid" column -- because many users check
their account every day.

A rough table definition can be considered as follows:

   CREATE TABLE maintable
          id    SERIAL  primary key
          alias  VARCHAR(42)
          ...
          user_id  VARCHAR(30)
          user_registered   BOOLEAN
          status    VARCHAR(1)


 My questions:

1. Which column should I partition by -- the "alias" because it's the
largest contributor of queries? This should be OK, but my concern is
that when user_id queries are happening, then the data for the same
user will come through many subtables that are partitioned by "alias"
-- will this happen automatically (presuming "constraint exclusion" is
on)? How does partitioning by one column affect queries on others.
Will there be subtable-by-subtable indexes on both alias and

2. How does SERIAL type work with partitions? Will INSERT data go into
the respective partitions and yet maintain an overall sequence -- I
mean, the *same* overall sequence for the parent table distributed
automagically across subtables?

3. If I partition using "a%", "b%" etc up to "z%" as the partition
condition, is this an issue -- are about 26 subtables too many
partitions? Mine are static partitions as in they will be the same
forever, unlike data-based partitions. And each partition will
continue to grow. If I include that "alias"es can begin with numbers
and allowed symbols too, then this may be 45 partitions? What's the
limit of partitions -- not only official limit, but practical limit in
terms of performance?

4. Given that it's a wildcard LIKE condition (with a "%") will this
affect the index and subsequent SELECT speed? Are partition conditions
recommended to be "=" or "<" type operators only or is LIKE ok??

5. Does partitioning need to happen only through one column? Can I
have a condition containing two columns instead?

   CREATE TABLE subtable_a (
    PRIMARY KEY (id)
    CHECK ( user_id LIKE 'a%' and user_registered IS TRUE)
    ) INHERITS (maintable);

   CREATE TABLE subtable_b (
    PRIMARY KEY (id),
    CHECK ( user_id LIKE 'b%' and user_registered IS TRUE)
    ) INHERITS (maintable);

   ......etc....


6. Triggers - how do they affect speed? Everything, insert, update,
select will happen through this conditional trigger. I will likely be
writing this in PLSQL, but I read in several websites that C triggers
are much faster than PLSQL triggers. Is this a concern?

7. "Constraint exclusion" - is it recommended to have this in the
pg.conf, or will I need to do this before every SQL? I prefer the
pg.conf way, but want to confirm that there are no downsides for other
regular SQL operations with this setting?

8. How will JOIN work? I have different tables JOINing with the parent
table now. With partitioned subtables, will constraint exclusion
automatically do what's needed and my SQL does not need to change? Or
will there be triggers required for each and every query I currently
have?


Eight questions is enough for my first post in this partitioning thread   :)

Thanks much!

pgsql-general by date:

Previous
From: Taras Kopets
Date:
Subject: Re: how to force an insert before Raise Exception?
Next
From: Benjamin Smith
Date:
Subject: Re: SSDs with Postgresql?