Thread: Partitioned Database and Choosing Subtables
I've got a ver 8.4.5 partitioned data base with records organized by US state, so the partitions are set up by state. When I query this database and include the key field that tells postgres what partition you , everything works as I expect. It searches only the specified partition, and it's fast . But that's only if I use a constant, like this: SELECT lions, tigers, bears FROM WildLife WHERE state_pt = 'CO'; What I want to be able to do is put this key value in a table and PG look in whatever partition the column specifies, like so: SELECT lions, tigers, bears, statecode FROM WildLife WHERE state_pt = statecode; However when I try anything other than a constant, it search EVERY partition, sequentially, which is not what I want it to do. So is there any way to specify the partition to search using a variable/column name? -- *Bill Thoen* GISnet - www.gisnet.com 303-786-9961
On Mar 14, 2011, at 8:30 PM, Bill Thoen wrote: > I've got a ver 8.4.5 partitioned data base with records organized by US state, so the partitions are set up by state. WhenI query this database and include the key field that tells postgres what partition you , everything works as I expect.It searches only the specified partition, and it's fast . But that's only if I use a constant, like this: > > SELECT lions, tigers, bears FROM WildLife > WHERE state_pt = 'CO'; > > What I want to be able to do is put this key value in a table and PG look in whatever partition the column specifies, likeso: > > SELECT lions, tigers, bears, statecode FROM WildLife > WHERE state_pt = statecode; > > However when I try anything other than a constant, it search EVERY partition, sequentially, which is not what I want itto do. So is there any way to specify the partition to search using a variable/column name? To answer the specific question you ask, you can always tell Postgres to search a particular child table: SELECT lions, tigers, bears FROM Wildlife_CA ... But your example seems somewhat unclear to me. Are both "state_pt" and "statecode" columns in Wildlife? If so, Postgresis going to have to search every partition, because it can't just from the partition constraint know which entrieswill match and which do not until it looks inside every record. Or did you mean 'statecode' to be a column in a different table, on which you're joining? -- -- Christophe Pettus xof@thebuild.com
> -----Original Message----- > From: Bill Thoen [mailto:bthoen@gisnet.com] > Sent: Monday, March 14, 2011 11:31 PM > To: pgsql-general@postgresql.org > Subject: Partitioned Database and Choosing Subtables > > I've got a ver 8.4.5 partitioned data base with records > organized by US state, so the partitions are set up by state. > When I query this database and include the key field that > tells postgres what partition you , everything works as I > expect. It searches only the specified partition, and it's > fast . But that's only if I use a constant, like this: > > SELECT lions, tigers, bears FROM WildLife WHERE state_pt = 'CO'; > > What I want to be able to do is put this key value in a table > and PG look in whatever partition the column specifies, like so: > > SELECT lions, tigers, bears, statecode FROM WildLife WHERE > state_pt = statecode; > > However when I try anything other than a constant, it search > EVERY partition, sequentially, which is not what I want it to > do. So is there any way to specify the partition to search > using a variable/column name? > > -- > *Bill Thoen* > GISnet - www.gisnet.com > 303-786-9961 > Try using "dynamic" sql: EXECUTE 'SELECT lions, tigers, bears, statecode FROM WildLife WHERE state_pt = ' || statecode INTO ...; See: http://www.postgresql.org/docs/8.4/interactive/plpgsql-statements.html Regards, Igor Neyman
On 3/15/2011 12:02 PM, Igor Neyman wrote: > > >> -----Original Message----- >> From: Bill Thoen [mailto:bthoen@gisnet.com] >> Sent: Monday, March 14, 2011 11:31 PM >> To: pgsql-general@postgresql.org >> Subject: Partitioned Database and Choosing Subtables >> >> I've got a ver 8.4.5 partitioned data base with records >> organized by US state, so the partitions are set up by state. >> When I query this database and include the key field that >> tells postgres what partition you , everything works as I >> expect. It searches only the specified partition, and it's >> fast . But that's only if I use a constant, like this: >> >> SELECT lions, tigers, bears FROM WildLife WHERE state_pt = 'CO'; >> >> What I want to be able to do is put this key value in a table >> and PG look in whatever partition the column specifies, like so: >> >> SELECT lions, tigers, bears, statecode FROM WildLife WHERE >> state_pt = statecode; >> >> However when I try anything other than a constant, it search >> EVERY partition, sequentially, which is not what I want it to >> do. So is there any way to specify the partition to search >> using a variable/column name? >> >> -- >> *Bill Thoen* >> GISnet - www.gisnet.com >> 303-786-9961 >> > Try using "dynamic" sql: > > EXECUTE 'SELECT lions, tigers, bears, statecode FROM WildLife WHERE > state_pt = ' || statecode INTO ...; > > See: > http://www.postgresql.org/docs/8.4/interactive/plpgsql-statements.html > Thanks Igor. It was a nice try -- and I thought it would work, but the Planner had other plans. Basically, I tried the interactive method using a PREPARE statement with one text parammeter followed by an EXECUTE statement. Unfortunately, the query still went rummaging across the entire database sequentially. To recap, (I still hope there's a chance for a workable answer) here's a simple contrived scenario Incidents Table ----------- total integer, -- total number of human:critter encounters statecode char(2), -- by state outcome char(20) -- outcome of incident {fatal, serious, minor, close call} Wildlife Table -- the Partitioned table. Uses state_pt to separate data tables. table 1: state_pt='AK' table 2: state_pt='AL' ... table 50: state_pt='WY' ----------------- lions integer, -- total population of lions in the state tigers integer, -- total tigers bears integer, -- total bears outcome char(20), -- outcome of incident {fatal, serious, minor, close call} ... more attributes... state_pt -- key index that separates the sub tables inthe partitioned datbase Some incidents table data: total | statecode | outcome ------+--------------+------------- 7 | CA | fatal 12 | CA | close call 3 | CO | fatal 16 | CO | close call 4 | WY | minor If I do the following it finds the data and goes directly to the right table. SELECT lions, tigers, bears FROM Wildlife WHERE state_pt = 'CO'; Or this: \set theState '\''CO'\'' SELECT lions, tigers, bears FROM Wildlife WHERE state_pt = :theState ; It works perfectly. But when I tried this: PREPARE foo (text) AS SELECT lions, tigers, bears FROM Wildlife WHERE state_pt = $1; EXECUTE foo( 'CO' ) It sequentially scanned everything, so I didn't even try the scenario I'm trying to create, which is using the results of a select to supply the key code to the Planner as to what partition it should search for reach record. SELECT lions, tigers , bears FROM Wildlife a JOIN incidents b ON a.outcome=b.outcome WHERE state_pt=statecode; Apparently, the Planner is pretty fussy about how you refer to one of your subtables, but I almost always use queries that involve multiple states,and the knowledge of what state table a particular record needs to access is carried as one of that record's attributes. I'm surprised that there's so little info on this. I tseems like an obvious application of partitioned databases, but maybe I just don't understand how it should be used? It seems that the only variables it likes are host variables, and there maybe something down that road... Anywya the quuestion is still open if anyone has some good ideas.
>> Try using "dynamic" sql: >> >> EXECUTE 'SELECT lions, tigers, bears, statecode FROM WildLife WHERE >> state_pt = ' || statecode INTO ...; > Thanks Igor. It was a nice try -- and I thought it would work, but the Planner had other plans. Basically, I tried theinteractive method using a PREPARE statement with one text parammeter followed by an EXECUTE statement. Unfortunately,the query still went rummaging across the entire database sequentially. That's not the same as using dynamic SQL. In fact, that would behave just like your previous, problematic, query. Dynamic SQL makes that the planner sees a new query each time, that needs to be planned from scratch. OTOH, using prepared statements the query plan is stored with placeholders for the parameter values at the moment the PREPAREexecutes. That query plan is already set in stone - it will be a generic plan that can't be optimised for differentvalues of parameters anymore. Not what you want. Alban Hertroys -- If you can't see the forest for the trees, cut the trees and you'll see there is no forest. !DSPAM:737,4d8069c0235881427912932!
On 3/16/2011 12:40 AM, Alban Hertroys wrote: >>> Try using "dynamic" sql: >>> >>> EXECUTE 'SELECT lions, tigers, bears, statecode FROM WildLife WHERE >>> state_pt = ' || statecode INTO ...; >> Thanks Igor. It was a nice try -- and I thought it would work, but the Planner had other plans. Basically, I tried theinteractive method using a PREPARE statement with one text parammeter followed by an EXECUTE statement. Unfortunately,the query still went rummaging across the entire database sequentially. > > That's not the same as using dynamic SQL. In fact, that would behave just like your previous, problematic, query. > > Dynamic SQL makes that the planner sees a new query each time, that needs to be planned from scratch. > Thanks for explaining this. I think I can just see what you mean. I can also see that I'm way out of my depth here. It's just so much fun when you mix deadlines with ignorance. > If you can't see the forest for the trees, > cut the trees and you'll see there is no forest. Good point. I'm feeling a bit chainsaw-ish right now. -- *Bill Thoen* GISnet - www.gisnet.com 303-786-9961