Questions about Partitioning - Mailing list pgsql-general
From | Phoenix Kiula |
---|---|
Subject | Questions about Partitioning |
Date | |
Msg-id | BANLkTikEB1rVYiBG9jTRLxkesXKq2wBXHA@mail.gmail.com Whole thread Raw |
Responses |
Re: Questions about Partitioning
Re: Questions about Partitioning |
List | pgsql-general |
While I fix some bigger DB woes, I have learned a lesson. Huge indexes and tables are a pain. Which makes me doubly keen on looking at partitioning. Most examples I see online are partitioned by date. As in months, or quarter, and so on. This doesn't work for me as I don't have too much logic required based on time. The biggest, highest volume SELECT in my database happens through an "alias" column. This is an alphanumeric column. The second-biggest SELECT happens through the "userid" column -- because many users check their account every day. A rough table definition can be considered as follows: CREATE TABLE maintable id SERIAL primary key alias VARCHAR(42) ... user_id VARCHAR(30) user_registered BOOLEAN status VARCHAR(1) My questions: 1. Which column should I partition by -- the "alias" because it's the largest contributor of queries? This should be OK, but my concern is that when user_id queries are happening, then the data for the same user will come through many subtables that are partitioned by "alias" -- will this happen automatically (presuming "constraint exclusion" is on)? How does partitioning by one column affect queries on others. Will there be subtable-by-subtable indexes on both alias and 2. How does SERIAL type work with partitions? Will INSERT data go into the respective partitions and yet maintain an overall sequence -- I mean, the *same* overall sequence for the parent table distributed automagically across subtables? 3. If I partition using "a%", "b%" etc up to "z%" as the partition condition, is this an issue -- are about 26 subtables too many partitions? Mine are static partitions as in they will be the same forever, unlike data-based partitions. And each partition will continue to grow. If I include that "alias"es can begin with numbers and allowed symbols too, then this may be 45 partitions? What's the limit of partitions -- not only official limit, but practical limit in terms of performance? 4. Given that it's a wildcard LIKE condition (with a "%") will this affect the index and subsequent SELECT speed? Are partition conditions recommended to be "=" or "<" type operators only or is LIKE ok?? 5. Does partitioning need to happen only through one column? Can I have a condition containing two columns instead? CREATE TABLE subtable_a ( PRIMARY KEY (id) CHECK ( user_id LIKE 'a%' and user_registered IS TRUE) ) INHERITS (maintable); CREATE TABLE subtable_b ( PRIMARY KEY (id), CHECK ( user_id LIKE 'b%' and user_registered IS TRUE) ) INHERITS (maintable); ......etc.... 6. Triggers - how do they affect speed? Everything, insert, update, select will happen through this conditional trigger. I will likely be writing this in PLSQL, but I read in several websites that C triggers are much faster than PLSQL triggers. Is this a concern? 7. "Constraint exclusion" - is it recommended to have this in the pg.conf, or will I need to do this before every SQL? I prefer the pg.conf way, but want to confirm that there are no downsides for other regular SQL operations with this setting? 8. How will JOIN work? I have different tables JOINing with the parent table now. With partitioned subtables, will constraint exclusion automatically do what's needed and my SQL does not need to change? Or will there be triggers required for each and every query I currently have? Eight questions is enough for my first post in this partitioning thread :) Thanks much!
pgsql-general by date: