Thread: Data Warehouse Reevaluation - MySQL vs Postgres
Hi all,
I had a difficult time deciding which list to post this to, so please forgive me if this list doesn't perfectly match my questions. My decision will not be based solely on performance, but performance is the primary concern. I would be very appreciative if you all could comment on my test plan. Real-world examples of a Postgres implementation of >=600G with a web front-end would be great, or any data warehouse with some size to it.

The dilemma:
The time has come to reevaluate/rearchitect an application which I built about 3 years ago. There are no performance concerns with MySQL, but it would benefit greatly from stored procedures, views, etc. It is a very large rolling data warehouse that inserts about 4.5 million rows every 2 hours and subsequently rolls this data off the back end of a 90-day window. A web interface has been designed for querying the data warehouse.

Migration planning is much easier with views and stored procedures, and this is my primary reason for evaluating Postgres once again. As the application grows I want to have the ability to provide backward-compatible views for those who are accustomed to the current structure. This is not possible in MySQL. Some of the mining that we do could benefit from stored procedures as well. MySQL may have these in the works, but we won't be able to move to a version of MySQL that supports stored procs for another year or two.

Requirements:
- A merge table definition equivalent. We use these extensively.
- The merge table equivalent, with all tables containing over 100M rows (and about 40 columns, some quite wide), will need to do index scans in at most 5 seconds (MySQL currently does 2, but we can live with 5) and return ~200 rows.
- Um, gonna sound silly, but the web interface has to remain "snappy" under load. I don't see this as a major concern since you don't require table locking.
- If business logic is moved to the database (likely with Postgres), performance for inserting with light logic on each insert has to keep up with the 4.5M inserts per 2 hours (which MySQL completes in ~35 min currently). Acceptable numbers for this aggregation would be 45-55 min using stored procedures.

About 3 years ago I did some performance characterizations of Postgres vs. MySQL and didn't feel Postgres was the best solution. 3 years later we've won runner-up for MySQL application of the year (behind Sabre). Oddly enough, this reevaluation of database strategy is right on the coattails of that award. I'll begin writing my business logic within the next week and start migrating test data shortly thereafter. Case studies would be very beneficial as I put together my analysis.

Also, this is for a Fortune 500 company that uses this data warehouse extensively. It is an internal application that is widely used and gets about 4 hits per employee per day. Much of customer care, data engineering, plant engineering (it's a cable company), and marketing use the interface. I've done a great deal of press for MySQL and would be equally willing to tout the benefits of Postgres to trade rags, magazines, etc., provided the results are favorable.

Here's our case study if you're interested . . .
http://www.mysql.com/customers/customer.php?id=16

Thoughts, suggestions?

'njoy,
Mark
On Sat, 11 Sep 2004, Mark Cotner wrote:

> There are no performance concerns with MySQL, but it would benefit
> greatly from stored procedures, views, etc. It is a very large rolling
> data warehouse that inserts about 4.5 million rows every 2 hours and
> subsequently rolls this data off the back end of a 90-day window.

While it is impossible to know without testing, PostgreSQL has the benefit of readers and writers that do not block each other. So in situations where you do lots of concurrent inserts and selects, PostgreSQL should behave well.

> A merge table definition equivalent. We use these extensively.

As far as I can tell, a merge table in MySQL is the same as a view over a number of unions of other tables, possibly plus a rule that defines how inserts are handled if you insert into the merged table.

> The merge table equivalent, with all tables containing over 100M rows
> (and about 40 columns, some quite wide), will need to do index scans
> in at most 5 seconds (MySQL currently does 2, but we can live with 5)
> and return ~200 rows.

Since each table that is merged will have its own index, the speed should be proportional to the number of tables. Index scans in themselves are very fast, but if you have 30 tables you need 30 index scans.

Also, are you sure you really need merge tables? With PG having row locks and MVCC, maybe you could go for a simpler model with just one big table. Often you can also combine that with partial indexes to get a smaller index to use for lots of your queries.

> Thoughts, suggestions?

I see nothing in what you have written that indicates that PG cannot do the job, and do it well. It's however very hard to know exactly what the bottleneck is before one tries. There are lots of cases where people have converted MySQL applications to PostgreSQL and have gotten a massive speedup. You could be lucky and have such a case, who knows...

I spend some time each day supporting people using PostgreSQL in the #postgresql IRC channel (on the freenode.net network). There I talk to people doing both small and big conversions, and the majority are very happy with PostgreSQL's performance. PostgreSQL has gotten faster and faster with each release, and while speed was a fair argument a number of years ago, it's not like that today. That said, in the end it depends on the application.

We are all interested in how it goes (well, at least me :-), so feel free to send more mails keeping us posted. Good luck.

--
/Dennis Björklund
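For reference, the partial index mentioned above is just an index with a WHERE clause. A tiny sketch, with invented table and column names:

-- index only the slice of one big table that most queries touch
create index logdata_recent_idx on logdata (sample_time)
    where sample_time >= '2004-09-01';

Queries whose WHERE clause implies the index predicate (for example sample_time >= '2004-09-10') can use the much smaller index.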
Mark Cotner wrote:
> Requirements:
> A merge table definition equivalent. We use these extensively.

What do you mean by "merge table definition equivalent"?

Regards
Gaetano Mendola
The world rejoiced as Mischa Sandberg <ischamay.andbergsay@activestateway.com> wrote:
> Mark Cotner wrote:
>> Requirements:
>> A merge table definition equivalent. We use these
>> extensively.

> Looked all over mysql.com etc, and afaics merge table is indeed
> exactly a view of a union-all. Is that right?

> PG supports views, of course, as well (now) as tablespaces, allowing
> you to split tables/tablesets across multiple disk systems. PG is
> also pretty efficient in query plans on such views, where (say) you
> make one column a constant (identifier, sort of) per input table.

The thing that _doesn't_ work well with these sorts of UNION views is when you do self-joins. Supposing you have 10 members, a self-join leads to a 100-way join, which is not particularly pretty.

I'm quite curious as to how MySQL(tm) copes with this, although the situation may never arise; they may not support that at all...

>> Um, gonna sound silly, but the web interface has to remain "snappy"
>> under load. I don't see this as a major concern since you don't
>> require table locking.

> Agreed. It's more in your warehouse design, and intelligent bounding
> of queries. I'd say PG's query analyzer is a few years ahead of
> MySQL for large and complex queries.

The challenge comes in if the application has had enormous amounts of effort put into it to attune it exactly to MySQL(tm)'s feature set. The guys working on RT/3 have found this a challenge; they had rather a lot of dependencies on its case-insensitive string comparisons, causing considerable grief.

> On the other hand, if you do warehouse-style loading (Insert, or PG
> COPY, into a temp table; and then 'upsert' into the perm table), I
> can guarantee 2500 inserts/sec is no problem.

The big wins are thus:

1. Group plenty of INSERTs into a single transaction.

2. Better still, use COPY to cut parsing costs plenty more.

3. Adding indexes _after_ the COPY is a further win.

Another possibility is to do clever things with stored procs: load incoming data using the above optimizations, and then run stored procedures that apply some more or less fancy logic to put the data where it's ultimately supposed to be. Having the logic run inside the engine is the big optimization.

--
wm(X,Y):-write(X),write('@'),write(Y). wm('cbbrowne','linuxfinances.info').
http://linuxfinances.info/info/spreadsheets.html
Rules of the Evil Overlord #198. "I will remember that any vulnerabilities I have are to be revealed strictly on a need-to-know basis. I will also remember that no one needs to know." <http://www.eviloverlord.com/>
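To make those three wins concrete, here is a rough sketch of such a load path. Every name is hypothetical, and COPY FROM a file assumes the file is readable by the server (psql's \copy does the same thing from the client side):

begin;

create temp table staging (
    ts        timestamp,
    node_id   integer,
    bytes_in  bigint,
    bytes_out bigint
);

-- one COPY instead of millions of individual INSERTs
copy staging from '/tmp/samples.dat';

-- set-oriented move/transform into the permanent table
insert into samples
select ts, node_id, bytes_in, bytes_out from staging;

commit;

-- for initial or partition-sized loads, build indexes afterwards,
-- then refresh the planner statistics
create index samples_ts_idx on samples (ts);
analyze samples;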
See comments . . . thanks for the feedback.

'njoy,
Mark

--- Christopher Browne <cbbrowne@acm.org> wrote:
> The world rejoiced as Mischa Sandberg
> <ischamay.andbergsay@activestateway.com> wrote:
> > Mark Cotner wrote:
> >> Requirements:
> >> A merge table definition equivalent. We use these
> >> extensively.
>
> > Looked all over mysql.com etc, and afaics merge table is indeed
> > exactly a view of a union-all. Is that right?
>
> > PG supports views, of course, as well (now) as tablespaces, allowing
> > you to split tables/tablesets across multiple disk systems. PG is
> > also pretty efficient in query plans on such views, where (say) you
> > make one column a constant (identifier, sort of) per input table.
>
> The thing that _doesn't_ work well with these sorts of UNION views is
> when you do self-joins. Supposing you have 10 members, a self-join
> leads to a 100-way join, which is not particularly pretty.
>
> I'm quite curious as to how MySQL(tm) copes with this, although the
> situation may never arise; they may not support that at all...
>
> >> Um, gonna sound silly, but the web interface has to remain "snappy"
> >> under load. I don't see this as a major concern since you don't
> >> require table locking.
>
> > Agreed. It's more in your warehouse design, and intelligent bounding
> > of queries. I'd say PG's query analyzer is a few years ahead of
> > MySQL for large and complex queries.
>
> The challenge comes in if the application has had enormous amounts of
> effort put into it to attune it exactly to MySQL(tm)'s feature set.
>
> The guys working on RT/3 have found this a challenge; they had rather
> a lot of dependencies on its case-insensitive string comparisons,
> causing considerable grief.

Not so much, I've tried to be as agnostic as possible. Much of the more advanced mining that I've written is kinda MySQL specific, but needs to be rewritten as stored procedures anyway.

> > On the other hand, if you do warehouse-style loading (Insert, or PG
> > COPY, into a temp table; and then 'upsert' into the perm table), I
> > can guarantee 2500 inserts/sec is no problem.
>
> The big wins are thus:
>
> 1. Group plenty of INSERTs into a single transaction.
>
> 2. Better still, use COPY to cut parsing costs plenty more.
>
> 3. Adding indexes _after_ the COPY is a further win.
>
> Another possibility is to do clever things with stored procs: load
> incoming data using the above optimizations, and then run stored
> procedures that apply some more or less fancy logic to put the data
> where it's ultimately supposed to be. Having the logic run inside the
> engine is the big optimization.

Agreed, I did some preliminary testing today and am very impressed. I wasn't used to running analyze after a data load, but once I did that everything was snappy.

My best results from MySQL bulk inserts was around 36k rows per second on a fairly wide table. Today I got 42k using the COPY command, but with the analyze post insert the results were similar. These are excellent numbers. It basically means we could have our cake (great features) and eat it too (performance that's good enough to run the app).

Queries from my test views were equally pleasing. I won't bore you with the details just yet, but PostgreSQL is doing great. Not that you all are surprised. ;)
A long time ago, in a galaxy far, far away, mcotner@yahoo.com (Mark Cotner) wrote:
> Agreed, I did some preliminary testing today and am very impressed.
> I wasn't used to running analyze after a data load, but once I did
> that everything was snappy.

Something worth observing is that this is true for _any_ of the database systems supporting a "cost-based" optimization system, including Oracle and DB2.

When working with SAP R/3 Payroll, on one project, we found that when the system was empty of data, the first few employee creates were quick enough, but it almost immediately got excruciatingly slow. One of the DBAs told the Oracle instance underneath to collect statistics on the main table, and things _immediately_ got snappy again. But it didn't get snappy until the conversion folk had run the conversion process for several minutes, to the point to which it would get painfully slow :-(. There, with MILLIONS of dollars worth of license fees being paid, across the various vendors, it still took a fair bit of manual fiddling.

MySQL(tm) is just starting to get into cost-based optimization; in that area, they're moving from where the "big DBs" were about 10 years ago. It was either version 7 or 8 where Oracle started moving to cost-based optimization, and (as with the anecdote above) it took a release or two for people to get accustomed to the need to 'feed' the optimizer with statistics. This is a "growing pain" that bites users with any database where this optimization gets introduced. It's worthwhile, but is certainly not costless.

I expect some foreseeable surprises will be forthcoming for MySQL AB's customers in this regard...

> My best results from MySQL bulk inserts was around 36k rows per
> second on a fairly wide table. Today I got 42k using the COPY
> command, but with the analyze post insert the results were similar.
> These are excellent numbers. It basically means we could have our
> cake (great features) and eat it too (performance that's good enough
> to run the app).

In the end, performance for inserts is always fundamentally based on how much disk I/O there is, and so it should come as no shock that when roughly the same amount of data is getting laid down on disk, performance won't differ much on these sorts of essentials.

There are a few places where there's some need for cleverness; if you see particular queries running unusually slowly, it's worth doing an EXPLAIN or EXPLAIN ANALYZE on them, to see how the query plans are being generated. There's some collected wisdom out here on how to encourage the right plans.

There are also unexpected results that are OK. We did a system upgrade a few days ago that led to one of the tables starting out totally empty. A summary report that looks at that table wound up with a pretty wacky looking query plan (compared to what's usual) because the postmaster knew that the query would be reading in essentially the entire table. You'd normally expect an index scan, looking for data for particular dates. In this case, it did a "scan the whole table; filter out a few irrelevant entries" plan.

It looked wacky, compared to what's usual, but it ran in about 2 seconds, which was way FASTER than what's usual. So the plan was exactly the right one.

Telling the difference between the right plan and a poor one is a bit of an art; we quite regularly take a look at query plans on this list to figure out what might be not quite right.
If you find slow ones, make sure you have run ANALYZE on the tables recently, to be sure that the plans are sane, and you may want to consider posting some of them to see if others can point to improvements that can be made. -- If this was helpful, <http://svcs.affero.net/rm.php?r=cbbrowne> rate me http://linuxfinances.info/info/linuxdistributions.html "I can't believe my room doesn't have Ethernet! Why wasn't it wired when the house was built?" "The house was built in 1576." -- Alex Kamilewicz on the Oxford breed of `conference American.'
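To make the workflow above concrete, a minimal sketch; the table name is invented, and note that EXPLAIN ANALYZE really executes the query:

-- refresh statistics on a hypothetical logging table, then inspect the plan
analyze logdata;
explain analyze select * from logdata where sample_date = '2004-09-13';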
Mark,

I thought some additional comments on top of Christopher's excellent notes might help you.

> Christopher Browne
> The world rejoiced as Mischa Sandberg
> <ischamay.andbergsay@activestateway.com> wrote:
> > Mark Cotner wrote:
> >> Requirements:
> >> A merge table definition equivalent. We use these
> >> extensively.
> > Looked all over mysql.com etc, and afaics merge table is indeed
> > exactly a view of a union-all. Is that right?

PostgreSQL's functionality is in many ways similar to Oracle Partitioning. Load up your data in many similar tables, then create a view like:

CREATE VIEW BIGTABLE (idate, col1, col2, col3...) AS
SELECT 200409130800, col1, col2, col3... FROM table200409130800
UNION ALL
SELECT 200409131000, col1, col2, col3... FROM table200409131000
UNION ALL
SELECT 200409131200, col1, col2, col3... FROM table200409131200
...etc...

This will allow the PostgreSQL optimizer to eliminate partitions from the query when you run queries which include a predicate on the partitioning_col, e.g.

select count(*) from bigtable where idate >= 200409131000

will scan the last two partitions only...

There are a few other ways of creating the view that return the same answer, but only using constants in that way allows the partitions to be eliminated from the query; without that, queries will run for much longer.

So you can give different VIEWS to different user groups, have different indexes on different tables, etc.

However, I haven't managed to get this technique to work when performing a star join to a TIME dimension table, since the partition elimination relies on comparison of constant expressions. You'll need to check out each main join type to make sure it works for you in your environment.

> > PG supports views, of course, as well (now) as tablespaces, allowing
> > you to split tables/tablesets across multiple disk systems. PG is
> > also pretty efficient in query plans on such views, where (say) you
> > make one column a constant (identifier, sort of) per input table.
>
> The thing that _doesn't_ work well with these sorts of UNION views is
> when you do self-joins. Supposing you have 10 members, a self-join
> leads to a 100-way join, which is not particularly pretty.

Well, that only happens when you forget to include the partitioning constant in the self join, e.g.

select count(*) from bigtable a, bigtable b where a.idate = b.idate;  -- works just fine

The optimizer really is smart enough to handle that too, but I'm sure such large self-joins aren't common for you anyhow.

> I'm quite curious as to how MySQL(tm) copes with this, although the
> situation may never arise; they may not support that at all...

It doesn't, AFAIK.

> Christopher Browne wrote
> A long time ago, in a galaxy far, far away, mcotner@yahoo.com
> (Mark Cotner) wrote:
> > Agreed, I did some preliminary testing today and am very impressed.
> > I wasn't used to running analyze after a data load, but once I did
> > that everything was snappy.
>
> Something worth observing is that this is true for _any_ of the
> database systems supporting a "cost-based" optimization system,
> including Oracle and DB2.

Agreed.

You can reduce the time for the ANALYZE by ignoring some of the (measures) columns not used in WHERE clauses. Also, if you're sure that each load is very similar to the last, you might even consider directly updating pg_statistic rows with the statistical values produced from an earlier ANALYZE... scary, but it can work.

To create a set of tables of > 600GB, you will benefit from creating each table WITHOUT OIDS.
Hope some of that helps you... Best Regards, Simon Riggs
Mark Cotner wrote:
> Hi all,
> I had a difficult time deciding which list to post
> this to, so please forgive me if this list doesn't
> perfectly match my questions. My decision will not
> be based solely on performance, but performance is
> the primary concern. I would be very appreciative
> if you all could comment on my test plan. Real-world
> examples of a Postgres implementation of >=600G with
> a web front-end would be great, or any data warehouse
> with some size to it.

I'm only in the 30GB range of database, in case that's a consideration for my comments that follow.

At this time, I'm working out the best ROLAP options for our PG transaction store. The transaction store is highly volatile -- longest a txn stays in it is 15 days ... so you imagine the need for historic summaries :-)

I've also implemented multiple data servers, including a federated server that had to make the best of existing engines (like MySQL, PG and everything from MSJet to RedBrick in the commercial world).

> The time has come to reevaluate/rearchitect an
> application which I built about 3 years ago. There
> are no performance concerns with MySQL, but it would
> benefit greatly from stored procedures, views, etc.

If your company is currently happy with MySQL, there probably are other (nontechnical) reasons to stick with it. I'm impressed that you'd consider reconsidering PG.

> Some of the mining that we do could benefit from
> stored procedures as well. MySQL may have these in
> the works, but we won't be able to move to a version
> of MySQL that supports stored procs for another year
> or two.

And PG lets you back-end with some powerful pattern- and aggregate-handling languages, like Perl. This was definitely a plus for data mining of web traffic, for example. The power of server-side extensibility for bailing you out of a design dead-end is not inconsequential.

PG doesn't have PIVOT operators (qv Oracle and MSSQL), but it makes the translation from data to column fairly painless otherwise.

> Requirements:
> A merge table definition equivalent. We use these
> extensively.

Looked all over mysql.com etc, and afaics merge table is indeed exactly a view of a union-all. Is that right?

PG supports views, of course, as well (now) as tablespaces, allowing you to split tables/tablesets across multiple disk systems. PG is also pretty efficient in query plans on such views, where (say) you make one column a constant (identifier, sort of) per input table.

> The merge table equivalent, with all tables containing
> over 100M rows (and about 40 columns, some quite wide),
> will need to do index scans in at most 5 seconds (MySQL
> currently does 2, but we can live with 5) and return
> ~200 rows.

PG has TOAST for handling REALLY BIG columns, and the generic TEXT type is as efficient as any size-specific VARCHAR() type ... should make things easier for you.

> Um, gonna sound silly, but the web interface has to
> remain "snappy" under load. I don't see this as a
> major concern since you don't require table locking.

Agreed. It's more in your warehouse design, and intelligent bounding of queries. I'd say PG's query analyzer is a few years ahead of MySQL for large and complex queries.

> If business logic is moved to the database (likely with
> Postgres), performance for inserting with light logic
> on each insert has to keep up with the 4.5M inserts
> per 2 hours (which MySQL completes in ~35 min
> currently). Acceptable numbers for this aggregation
> would be 45-55 min using stored procedures.

Again, it's a matter of pipeline design. The tools for creating an efficient pipeline are at least as good in PG as MySQL.

If you try to insert and postprocess information one row at a time, procedures or no, there's no offhand way to guarantee your performance without a test/prototype.

On the other hand, if you do warehouse-style loading (Insert, or PG COPY, into a temp table; and then 'upsert' into the perm table), I can guarantee 2500 inserts/sec is no problem.

> Here's our case study if you're interested . . .
> http://www.mysql.com/customers/customer.php?id=16
You all have been so very helpful so far and I really appreciate it.

The data in these tables is thankfully static since they are logging tables, and an analyze only takes about 4 minutes for the largest of them.

I've finished porting the schema and am importing the data now. My estimate for just two-thirds (60 of the 90 days) of one of our 30 cable systems (MySQL dbs) is that it will take about 16 hours. This may seem like a lot, but I'm satisfied with the performance. I've created a slightly normalized version and some stored procedures to help me normalize the data. When this finishes I'm going to query the data as is with the views as you suggested, and I'm going to create views for the normalized version to test that as well. This will then be contrasted with the MySQL query results, and I plan to write a white paper of my findings.

I don't have any concerns that Postgres will do fine, but if I run into any performance problems I'll be sure and post them here first.

It should be noted that our development life cycle is currently severely hindered by lack of features in MySQL like views and stored procedures. Frankly, I've implemented some pretty ugly SQL using as many as 5 temp tables to generate a result set with MySQL. Having stored procedures and views is going to help us tremendously. This performance evaluation is to verify that Postgres can handle what we're going to throw at it, not to find out if it's faster in milliseconds than MySQL. We love the speed and ease of maintenance with MySQL, but have simply outgrown it. This will be reflected in the white paper.

I have already imported our customer tables, which aren't too small (2.4M rows x 3 tables), and stuck a view in front of them. The view queried faster than MySQL would query a pre-joined flat table.

Getting carried away . . . needless to say I'm really excited about the possibility of Postgres, but I won't bore you with the details just yet. I'll send the link out to the white paper so you all can review it before I send it anywhere else. If anything could have been optimized more, please let me know and I'll see that it gets updated before it's widely published.

Thanks again for all the great feedback!

'njoy,
Mark
Re: Data Warehouse Reevaluation - MySQL vs Postgres -- merge tables
From: Pierre-Frédéric Caillaud
Performance hint:

For static data, do not normalize too much. For instance, if you have a row which can be linked to several other rows, you can do this:

create table parents (
    id serial primary key,
    values...
);

create table children (
    id serial primary key,
    parent_id integer references parents(id),
    slave_value integer
);

Or you can do this, using an array:

create table everything (
    id serial primary key,
    children_values integer[],
    values...
);

Pros:
No joins. Getting the list of children_values from table everything is just a select. On an application with several million rows, a query lasting 150 ms with a join takes 30 ms with an array. You can build the arrays from normalized tables by using an aggregate function. You can index the array elements with a GiST index...

Cons:
No joins, thus your queries are a little bit limited; problems if the array is too long;
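The aggregate function mentioned above is not built in; the usual recipe is a small polymorphic aggregate over array_append. A sketch using the hypothetical parents/children tables from this post (newer releases ship an equivalent array_agg, and GiST indexing of integer arrays comes from the contrib intarray module):

create aggregate array_accum (
    basetype = anyelement,
    sfunc    = array_append,
    stype    = anyarray,
    initcond = '{}'
);

-- build the denormalized array column from the normalized children table
select parent_id, array_accum(slave_value) as children_values
from children
group by parent_id;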
Hi, Mischa,

On Sun, 12 Sep 2004 20:47:17 GMT
Mischa Sandberg <ischamay.andbergsay@activestateway.com> wrote:

> On the other hand, if you do warehouse-style loading (Insert, or PG
> COPY, into a temp table; and then 'upsert' into the perm table), I can
> guarantee 2500 inserts/sec is no problem.

As we can foresee that we'll have similar insert rates to cope with in the not-so-far future, what do you mean by 'upsert'? Do you mean a stored procedure that iterates over the temp table?

Generally, what is the fastest way of doing bulk processing of update-if-primary-key-matches-and-insert-otherwise operations?

Thanks,
Markus Schaber

--
markus schaber | dipl. informatiker
logi-track ag | rennweg 14-16 | ch 8001 zürich
phone +41-43-888 62 52 | fax +41-43-888 62 53
mailto:schabios@logi-track.com | www.logi-track.com
> Mark Cotner wrote:
> > The time has come to reevaluate/rearchitect an
> > application which I built about 3 years ago. There
> > are no performance concerns with MySQL, but it would
> > benefit greatly from stored procedures, views, etc.

> From: "Mischa Sandberg" <ischamay.andbergsay@activestateway.com>
> If your company is currently happy with MySQL, there probably are other
> (nontechnical) reasons to stick with it. I'm impressed that you'd
> consider reconsidering PG.

I'd like to second Mischa on that issue.

In general, if you migrate an *existing* application from one RDBMS to another, you should expect performance to decrease significantly. This is always true in a well performing system, even if the replacement technology is more sophisticated. This is because of several factors.

Even if you try to develop in totally agnostic generic SQL, you are always customizing to a feature set, namely the one in the current system. Any existing application has had substantial tuning and tweaking, and the new one is at a disadvantage. Moreover, an existing system is a Skinnerian reward/punishment system for the developers and DBAs, rewarding or punishing them for very environment-specific choices, resulting in an application, DBMS, OS, and platform that are both explicitly and unconsciously customized to work together in a particular manner.

The net effect is a rule of thumb that I use: NEVER reimplement an existing system unless the project includes substantial functional improvement.

Every time I've broken that rule, I've found that users' expectations, based on the application they are used to, are locked in. Any place where the new system is slower, the users are dissatisfied; where it exceeds expectations it isn't appreciated: the users are used to the old system's quirks, and the improvements only leave them uncomfortable since the system "acts differently". (I've broken the rule on occasion for standardization conversions.)

My expectation is that pg will not get a fair shake here. If you do it - I'd like to see the results anyway.

/Aaron
* Markus Schaber (schabios@logi-track.com) wrote:
> Generally, what is the fastest way of doing bulk processing of
> update-if-primary-key-matches-and-insert-otherwise operations?

This is a very good question, and I haven't seen much of an answer to it yet. I'm curious about the answer myself, actually. In the more recent SQL specs, from what I understand, this is essentially what the 'MERGE' command is for. This was recently added and unfortunately is not yet supported in Postgres. Hopefully it will be added soon.

Otherwise, what I've done is basically an update followed by an insert using outer joins. If there's something better, I'd love to hear about it. The statements look something like:

update X
   set colA = a.colA,
       colB = a.colB
  from Y a
 where keyA = a.keyA
   and keyB = a.keyB;

insert into X
select a.keyA, a.keyB, a.colA, a.colB
  from Y a
  left join X b using (keyA, keyB)
 where b.keyA is NULL and b.keyB is NULL;

With the appropriate indexes, this is pretty fast but I think a merge would be much faster.

Thanks,
Stephen
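Worth adding: if readers should never see the table with only half of that applied, the two statements can simply run in one transaction (same hypothetical X and Y as above). With genuinely concurrent writers you would still want a unique constraint on the key columns to catch races:

begin;

update X
   set colA = a.colA,
       colB = a.colB
  from Y a
 where keyA = a.keyA
   and keyB = a.keyB;

insert into X
select a.keyA, a.keyB, a.colA, a.colB
  from Y a
  left join X b using (keyA, keyB)
 where b.keyA is NULL;

commit;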
From: "Harald Lau (Sector-X)"
> From: "Mischa Sandberg" <ischamay.andbergsay@activestateway.com>
> > If your company is currently happy with MySQL, there probably are
> > other (nontechnical) reasons to stick with it. I'm impressed that
> > you'd consider reconsidering PG.
>
> I'd like to second Mischa on that issue.

Though both of you are right from my point of view, I don't think it's very useful to discuss this item here.

Having once migrated a MySQL DB to PG, I can confirm that chances are in fact good you will be unhappy if you adopt the MySQL data model and the SQL 1:1. PG also has to be configured and optimized much more than MySQL. And the client application will probably have to be modified to a certain extent, particularly if you want to move some -or some more- business logic from the client into the database.

But, from what Mark has stated so far, I'm sure he is not going to migrate his app just for fun, or without having considered this.

> NEVER reimplement an existing system unless the project includes
> substantial functional improvement.

or monetary issues

I know one big database that was migrated from Oracle to PG and another from SQLServer to PG because of licence costs. Definitely there are some more. That applies to MySQL, too; its licence policy is somewhat obscure to me, but under certain circumstances you have to pay.

regards
Harald
>>>>> "MC" == Mark Cotner <mcotner@yahoo.com> writes:

MC> I've finished porting the schema and am importing the
MC> data now. My estimates for just two-thirds (60 of the
MC> 90 days) of one of our 30 cable systems (MySQL dbs) is
MC> about 16 hours. This may seem like a lot, but I'm
MC> satisfied with the performance.

Be sure to load your data without indexes defined for your initial import.

Check your logs to see if increasing checkpoint_segments is recommended. I found that bumping it up to 50 helped speed up my data loads (restore from dump) significantly.

--
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Vivek Khera, Ph.D.                Khera Communications, Inc.
Internet: khera@kciLink.com       Rockville, MD  +1-301-869-4449 x806
AIM: vivekkhera Y!: vivek_khera   http://www.khera.org/~vivek/
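For reference, that is a single line in postgresql.conf on the versions of that era (the value is just the one reported above, not a general recommendation; the setting needs a config reload or restart to take effect):

# allow more WAL segments between checkpoints, so bulk loads
# spend less time checkpointing
checkpoint_segments = 50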
> Stephen Frost > * Markus Schaber (schabios@logi-track.com) wrote: > > Generally, what is the fastest way for doing bulk processing of > > update-if-primary-key-matches-and-insert-otherwise operations? > > This is a very good question, and I havn't seen much of an answer to it > yet. I'm curious about the answer myself, actually. In the more recent > SQL specs, from what I understand, this is essentially what the 'MERGE' > command is for. This was recently added and unfortunately is not yet > supported in Postgres. Hopefully it will be added soon. > Yes, I think it is an important feature for both Data Warehousing (used in set-operation mode for bulk processing) and OLTP (saves a round-trip to the database, so faster on single rows also). It's in my top 10 for 2005. Best Regards, Simon Riggs
On Mon, Sep 13, 2004 at 11:07:35PM +0100, Simon Riggs wrote: > PostgreSQL's functionality is in many ways similar to Oracle Partitioning. > > Loading up your data in many similar tables, then creating a view like: > > CREATE VIEW BIGTABLE (idate, col1, col2, col3...) AS > SELECT 200409130800, col1, col2, col3... FROM table200409130800 > UNION ALL > SELECT 200409131000, col1, col2, col3... FROM table200409131000 > UNION ALL > SELECT 200409131200, col1, col2, col3... FROM table200409131200 > ...etc... > > will allow the PostgreSQL optimizer to eliminate partitions from the query > when you run queries which include a predicate on the partitioning_col, e.g. > > select count(*) from bigtable where idate >= 200409131000 > > will scan the last two partitions only... > > There are a few other ways of creating the view that return the same answer, > but only using constants in that way will allow the partitions to be > eliminated from the query, and so run for much longer. Is there by any chance a set of functions to manage adding and removing partitions? Certainly this can be done by hand, but having a set of tools would make life much easier. I just looked but didn't see anything on GBorg. -- Jim C. Nasby, Database Consultant decibel@decibel.org Give your computer some brain candy! www.distributed.net Team #1828 Windows: "Where do you want to go today?" Linux: "Where do you want to go tomorrow?" FreeBSD: "Are you guys coming, or what?"
> Jim C. Nasby > On Mon, Sep 13, 2004 at 11:07:35PM +0100, Simon Riggs wrote: > > PostgreSQL's functionality is in many ways similar to Oracle > Partitioning. > > > > Loading up your data in many similar tables, then creating a view like: > > > > CREATE VIEW BIGTABLE (idate, col1, col2, col3...) AS > > SELECT 200409130800, col1, col2, col3... FROM table200409130800 > > UNION ALL > > SELECT 200409131000, col1, col2, col3... FROM table200409131000 > > UNION ALL > > SELECT 200409131200, col1, col2, col3... FROM table200409131200 > > ...etc... > > > > will allow the PostgreSQL optimizer to eliminate partitions > from the query > > when you run queries which include a predicate on the > partitioning_col, e.g. > > > > select count(*) from bigtable where idate >= 200409131000 > > > > will scan the last two partitions only... > > > > There are a few other ways of creating the view that return the > same answer, > > but only using constants in that way will allow the partitions to be > > eliminated from the query, and so run for much longer. > > Is there by any chance a set of functions to manage adding and removing > partitions? Certainly this can be done by hand, but having a set of > tools would make life much easier. I just looked but didn't see anything > on GBorg. Well, its fairly straightforward to auto-generate the UNION ALL view, and important as well, since it needs to be re-specified each time a new partition is loaded or an old one is cleared down. The main point is that the constant placed in front of each table must in some way relate to the data, to make it useful in querying. If it is just a unique constant, chosen at random, it won't do much for partition elimination. So, that tends to make the creation of the UNION ALL view an application/data specific thing. The "partitions" are just tables, so no need for other management tools. Oracle treats the partitions as sub-tables, so you need a range of commands to add, swap etc the partitions of the main table. I guess a set of tools that emulates that functionality would be generically a good thing, if you can see a way to do that. Oracle partitions were restricted in only allowing a single load statement into a single partition at any time, whereas multiple COPY statements can access a single partition table on PostgreSQL. BTW, multi-dimensional partitioning is also possible using the same general scheme.... Best Regards, Simon Riggs
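There does not seem to be a packaged tool for this, but the view regeneration described above is easy to script. A rough PL/pgSQL sketch; the naming scheme ("table" followed by twelve digits), the column list, and the view name are all invented, it assumes at least one partition exists, and it uses the old single-quoted function-body style of that era:

create or replace function rebuild_bigtable_view() returns text as '
declare
    part record;
    sql  text := '''';
begin
    -- collect all partitions matching the hypothetical naming scheme
    for part in
        select tablename from pg_tables
        where tablename ~ ''^table[0-9]{12}$''
        order by tablename
    loop
        if sql <> '''' then
            sql := sql || '' UNION ALL '';
        end if;
        -- the leading constant comes from the table name; that constant
        -- is what makes partition elimination possible
        sql := sql || ''SELECT '' || substr(part.tablename, 6)
                   || ''::bigint AS idate, col1, col2, col3 FROM ''
                   || part.tablename;
    end loop;
    execute ''CREATE OR REPLACE VIEW bigtable AS '' || sql;
    return sql;
end;
' language plpgsql;

-- rerun after each partition is loaded or cleared down
select rebuild_bigtable_view();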
On Sep 15, 2004, at 8:32 AM, Simon Riggs wrote: > The "partitions" are just tables, so no need for other management > tools. > Oracle treats the partitions as sub-tables, so you need a range of > commands > to add, swap etc the partitions of the main table. > > I guess a set of tools that emulates that functionality would be > generically > a good thing, if you can see a way to do that. > > Oracle partitions were restricted in only allowing a single load > statement > into a single partition at any time, whereas multiple COPY statements > can > access a single partition table on PostgreSQL. How does this compare to DB2 partitioning? Michael Glaesemann grzm myrealbox com
simon@2ndquadrant.com ("Simon Riggs") wrote: > The main point is that the constant placed in front of each table > must in some way relate to the data, to make it useful in > querying. If it is just a unique constant, chosen at random, it > won't do much for partition elimination. It just struck me - this is much the same notion as that of "cutting planes" used in Integer Programming. The approach, there, is that you take a linear program, which can give fractional results, and throw on as many additional constraints as you need in order to force the likelihood of particular variable falling on integer values. The constraints may appear redundant, but declaring them allows the answers to be pushed in the right directions. In this particular case, the (arguably redundant) constraints let the query optimizer have criteria for throwing out unnecessary tables. Thanks for pointing this out; it may turn a fowl into a feature, when I can get some "round tuits" :-). That should allow me to turn an 81-way evil join into something that's 4-way at the worst. Cheers! -- "cbbrowne","@","linuxfinances.info" http://linuxfinances.info/info/nonrdbms.html Implementing systems is 95% boredom and 5% sheer terror.
simon@2ndquadrant.com ("Simon Riggs") writes: > Well, its fairly straightforward to auto-generate the UNION ALL view, and > important as well, since it needs to be re-specified each time a new > partition is loaded or an old one is cleared down. The main point is that > the constant placed in front of each table must in some way relate to the > data, to make it useful in querying. If it is just a unique constant, chosen > at random, it won't do much for partition elimination. So, that tends to > make the creation of the UNION ALL view an application/data specific thing. Ah, that's probably a good thought. When we used big "UNION ALL" views, it was with logging tables, where there wasn't really any meaningful distinction between partitions. So you say that if the VIEW contains, within it, meaningful constraint information, that can get applied to chop out irrelevant bits? That suggests a way of resurrecting the idea... Might we set up the view as: create view combination_of_logs as select * from table_1 where txn_date between 'this' and 'that' union all select * from table_2 where txn_date between 'this2' and 'that2' union all select * from table_3 where txn_date between 'this3' and 'that3' union all select * from table_4 where txn_date between 'this4' and 'that4' union all ... ad infinitum union all select * from table_n where txn_date > 'start_of_partition_n'; and expect that to help, as long as the query that hooks up to this has date constraints? We'd have to regenerate the view with new fixed constants each time we set up the tables, but that sounds like it could work... -- "cbbrowne","@","acm.org" http://www3.sympatico.ca/cbbrowne/x.html But what can you do with it? -- ubiquitous cry from Linux-user partner. -- Andy Pearce, <ajp@hpopd.pwd.hp.com>
Chris Browne wrote: > Might we set up the view as: > > create view combination_of_logs as > select * from table_1 where txn_date between 'this' and 'that' > union all > select * from table_2 where txn_date between 'this2' and 'that2' > union all > select * from table_3 where txn_date between 'this3' and 'that3' > union all > select * from table_4 where txn_date between 'this4' and 'that4' > union all > ... ad infinitum > union all > select * from table_n where txn_date > 'start_of_partition_n'; > > and expect that to help, as long as the query that hooks up to this > has date constraints? > > We'd have to regenerate the view with new fixed constants each time we > set up the tables, but that sounds like it could work... That's exactly what we're doing, but using inherited tables instead of a union view. With inheritance, there is no need to rebuild the view each time a table is added or removed. Basically, in our application, tables are partitioned by either month or week, depending on the type of data involved, and queries are normally date qualified. We're not completely done with our data conversion (from a commercial RDBMSi), but so far the results have been excellent. Similar to what others have said in this thread, the conversion involved restructuring the data to better suit Postgres, and the application (data analysis/mining vs. the source system which is operational). As a result we've compressed a > 1TB database down to ~0.4TB, and seen at least one typical query reduced from ~9 minutes down to ~40 seconds. Joe
Hi Joe,

> That's exactly what we're doing, but using inherited tables instead of a
> union view. With inheritance, there is no need to rebuild the view each
> time a table is added or removed. Basically, in our application, tables
> are partitioned by either month or week, depending on the type of data
> involved, and queries are normally date qualified.

That sounds interesting. I have to admit that I haven't touched inheritance in pg at all yet, so I find it hard to imagine how this would work. If you have a chance, would you mind elaborating on it just a little?

Regards
Iain
On Tue, Sep 14, 2004 at 05:33:33PM -0500, Jim C. Nasby wrote: > On Mon, Sep 13, 2004 at 11:07:35PM +0100, Simon Riggs wrote: > > PostgreSQL's functionality is in many ways similar to Oracle Partitioning. > > > > Loading up your data in many similar tables, then creating a view like: > > > > CREATE VIEW BIGTABLE (idate, col1, col2, col3...) AS > > SELECT 200409130800, col1, col2, col3... FROM table200409130800 > > UNION ALL > > SELECT 200409131000, col1, col2, col3... FROM table200409131000 > > UNION ALL > > SELECT 200409131200, col1, col2, col3... FROM table200409131200 > > ...etc... [...] > > Is there by any chance a set of functions to manage adding and removing > partitions? Certainly this can be done by hand, but having a set of > tools would make life much easier. I just looked but didn't see anything > on GBorg. I've done a similar thing with time-segregated data by inheriting all the partition tables from an (empty) parent table. Adding a new partition is just a "create table tablefoo () inherits(bigtable)" and removing a partition just "drop table tablefoo". Cheers, Steve
In the last exciting episode, mail@joeconway.com (Joe Conway) wrote: > That's exactly what we're doing, but using inherited tables instead of > a union view. With inheritance, there is no need to rebuild the view > each time a table is added or removed. Basically, in our application, > tables are partitioned by either month or week, depending on the type > of data involved, and queries are normally date qualified. Sounds interesting, and possibly usable. Where does the constraint come in that'll allow most of the data to be excluded? Or is this just that the entries are all part of "bigtable" so that the self join is only 2-way? -- let name="cbbrowne" and tld="acm.org" in name ^ "@" ^ tld;; http://linuxfinances.info/info/advocacy.html "Be humble. A lot happened before you were born." - Life's Little Instruction Book
Hi, On Tue, 14 Sep 2004 22:10:04 -0700 Steve Atkins <steve@blighty.com> wrote: > > Is there by any chance a set of functions to manage adding and removing > > partitions? Certainly this can be done by hand, but having a set of > > tools would make life much easier. I just looked but didn't see anything > > on GBorg. > > I've done a similar thing with time-segregated data by inheriting > all the partition tables from an (empty) parent table. > > Adding a new partition is just a "create table tablefoo () inherits(bigtable)" > and removing a partition just "drop table tablefoo". But you have to add table constraints restricting the time after adding the partition? Thanks, Markus -- markus schaber | dipl. informatiker logi-track ag | rennweg 14-16 | ch 8001 zürich phone +41-43-888 62 52 | fax +41-43-888 62 53 mailto:schabios@logi-track.com | www.logi-track.com
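For what it's worth, declaring such a constraint on a child table is straightforward (the names and dates here are invented). Note that the planner of that era did not use CHECK constraints to skip child tables; constraint-based exclusion arrived in a later release, so a constraint like this mainly documents and enforces which rows belong in which partition:

alter table tablefoo
    add constraint tablefoo_ts_range
    check (ts >= '2004-09-14' and ts < '2004-09-15');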
On Tue, 2004-09-14 at 21:30, Joe Conway wrote:
> That's exactly what we're doing, but using inherited tables instead of a
> union view. With inheritance, there is no need to rebuild the view each
> time a table is added or removed. Basically, in our application, tables
> are partitioned by either month or week, depending on the type of data
> involved, and queries are normally date qualified.

We do something very similar, also using table inheritance and a lot of triggers to automatically generate partitions and so forth. It works pretty well, but it is a custom job every time I want to implement a partitioned table. You can save a lot on speed and space if you use it to break up large tables with composite indexes, since you can drop columns from the table depending on how you use it.

A big part of performance gain is that the resulting partitions end up being more well-ordered than the non-partitioned version, since inserts are hashed to different partition according to the key and hash function. It is kind of like a cheap and dirty real-time CLUSTER operation. It also lets you truncate, lock, and generally be heavy-handed with subsets of the table without affecting the rest of the table.

I think generic table partitioning could pretty much be built on top of existing capabilities with a small number of tweaks. The main difference would be the ability to associate a partitioning hash function with a table (probably defined inline at CREATE TABLE time). Something with syntax like:

    ...PARTITION ON 'date_trunc(''hour'',ts)'...

There would also probably need to be some type of metadata table to associate specific hashes with partition table names. Other than that, the capabilities largely already exist, and managing the partition hashing and association is the ugly part when rolling your own. Intercepting DML when necessary and making it behave correctly is already pretty easy, but could probably be streamlined.

j. andrew rogers
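As an illustration of the trigger-based routing described above, here is a minimal sketch rather than the poster's actual setup: table, column, and child names are invented, the "hash" is simply the day of a timestamp, the children are assumed to be created in advance by a maintenance job, and the function body uses the single-quoted style of that era:

create table measurement (
    ts     timestamp not null,
    device integer   not null,
    value  float8    not null  -- not null keeps the string building below simple
);

-- one child per day, created ahead of time
create table measurement_2004_09_14 () inherits (measurement);
create table measurement_2004_09_15 () inherits (measurement);

create or replace function measurement_route_insert() returns trigger as '
begin
    -- route the row into the child named after its day
    execute ''INSERT INTO measurement_'' || to_char(new.ts, ''YYYY_MM_DD'')
         || '' VALUES ('' || quote_literal(new.ts)
         || '', '' || new.device
         || '', '' || new.value || '')'';
    return null;  -- suppress the insert into the parent itself
end;
' language plpgsql;

create trigger measurement_partition
    before insert on measurement
    for each row execute procedure measurement_route_insert();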
"J. Andrew Rogers" <jrogers@neopolitan.com> writes: > We do something very similar, also using table inheritance I have a suspicion postgres's table inheritance will end up serving as a good base for a partitioned table feature. Is it currently possible to query which subtable a record came from though? > A big part of performance gain is that the resulting partitions end up being > more well-ordered than the non-partitioned version, since inserts are hashed > to different partition according to the key and hash function. It is kind of > like a cheap and dirty real-time CLUSTER operation. There is also one particular performance gain that cannot be obtained via other means: A query that accesses a large percentage of a single partition can use a sequential table scan of just that partition. This can be several times faster than using an index scan which is the only option if all the data is stored in a single large table. This isn't an uncommon occurrence. Consider an accounting table partitioned by accounting period. Any aggregate reports for a single accounting period fall into this category. If you define your partitions well that can often by most or all of your reports. Of course this applies equally if the query is accessing a small number of partitions. A further refinement is to leverage the partitioning in GROUP BY or ORDER BY clauses. If you're grouping by the partition key you can avoid a large sort without having to resort to an index scan or even a hash. And of course it's tempting to think about parallelization of such queries, especially if the partitions are stored in separate table spaces on different drives. > It also lets you truncate, lock, and generally be heavy-handed with subsets > of the table without affecting the rest of the table. The biggest benefit by far is this management ability of being able to swap in and out partitions in a single atomic transaction that doesn't require extensive i/o. In the application we used them on Oracle 8i they were an absolute life-saver. They took a huge batch job that took several days to run in off-peak hours and turned it into a single quick cron job that could run at peak hours. We were able to cut the delay for our OLTP data showing up in the data warehouse from about a week after extensive manual work to hours after a daily cron job. > ...PARTITION ON 'date_trunc(''hour'',ts)'... > > There would also probably need to be some type of metadata table to > associate specific hashes with partition table names. Other than that, > the capabilities largely already exist, and managing the partition > hashing and association is the ugly part when rolling your own. > Intercepting DML when necessary and making it behave correctly is > already pretty easy, but could probably be streamlined. I would suggest you look at the Oracle syntax to handle this. They've already gone through several iterations of implementations. The original Oracle 7 implementation was much as people have been describing where you had to define a big UNION ALL view and enable an option to have the optimizer look for such views and attempt to eliminate partitions. In Oracle 8i they introduced first class partitions with commands to define and manipulate them. You defined a high bound for each partition. In Oracle 9 (or thereabouts, sometime after 8i at any rate) they introduced a new form where you specify a specific constant value for each partition. This seems to be more akin to how you're thinking about things. 
The optimizer has several plan nodes specific for partitioned tables. It can select a single known partition based on information present in the query. It can also detect cases where it can be sure the query will only access a single partition but won't be able to determine which until execution time based on placeholder parameters for cases like "WHERE partition_key = ?". It can also detect cases like "WHERE partition_key between ? and ?" and "WHERE partition_key IN (?,?,?)" Or join clauses on partitions. It can also do some magic things with "GROUP BY partition_key" and "ORDER BY partition_key". The work in the optimizer will be the most challenging part. In an ideal world if the optimizer is very solid it will be possible to bring some partitions to slow or even near-line storage media. As long as no live queries accidentally access the wrong partitions the rest of the database need never know that the data isn't readily available. -- greg
Iain wrote:
>>That's exactly what we're doing, but using inherited tables instead of a
>>union view. With inheritance, there is no need to rebuild the view each
>>time a table is added or removed. Basically, in our application, tables
>>are partitioned by either month or week, depending on the type of data
>>involved, and queries are normally date qualified.
>
> That sounds interesting. I have to admit that I haven't touched inheritance in
> pg at all yet so I find it hard to imagine how this would work. If you have
> a chance, would you mind elaborating on it just a little?

OK, see below:
=====================

create table foo(f1 int, f2 date, f3 float8);

create table foo_2004_01() inherits (foo);
create table foo_2004_02() inherits (foo);
create table foo_2004_03() inherits (foo);

create index foo_2004_01_idx1 on foo_2004_01(f2);
create index foo_2004_02_idx1 on foo_2004_02(f2);
create index foo_2004_03_idx1 on foo_2004_03(f2);

insert into foo_2004_02 values(1,'2004-feb-15',3.14);

-- needed just for illustration since these are toy tables
set enable_seqscan to false;
explain analyze select * from foo where f2 = '2004-feb-15';

                                   QUERY PLAN
------------------------------------------------------------------------------------------------
 Result  (cost=100000000.00..100000061.32 rows=16 width=16) (actual time=0.224..0.310 rows=1 loops=1)
   ->  Append  (cost=100000000.00..100000061.32 rows=16 width=16) (actual time=0.214..0.294 rows=1 loops=1)
         ->  Seq Scan on foo  (cost=100000000.00..100000022.50 rows=5 width=16) (actual time=0.004..0.004 rows=0 loops=1)
               Filter: (f2 = '2004-02-15'::date)
         ->  Index Scan using foo_2004_01_idx1 on foo_2004_01 foo  (cost=0.00..17.07 rows=5 width=16) (actual time=0.101..0.101 rows=0 loops=1)
               Index Cond: (f2 = '2004-02-15'::date)
         ->  Index Scan using foo_2004_02_idx1 on foo_2004_02 foo  (cost=0.00..4.68 rows=1 width=16) (actual time=0.095..0.101 rows=1 loops=1)
               Index Cond: (f2 = '2004-02-15'::date)
         ->  Index Scan using foo_2004_03_idx1 on foo_2004_03 foo  (cost=0.00..17.07 rows=5 width=16) (actual time=0.066..0.066 rows=0 loops=1)
               Index Cond: (f2 = '2004-02-15'::date)
 Total runtime: 0.582 ms
(11 rows)

create table foo_2004_04() inherits (foo);
create index foo_2004_04_idx1 on foo_2004_04(f2);

explain analyze select * from foo where f2 = '2004-feb-15';

                                   QUERY PLAN
------------------------------------------------------------------------------------------------
 Result  (cost=100000000.00..100000078.38 rows=21 width=16) (actual time=0.052..0.176 rows=1 loops=1)
   ->  Append  (cost=100000000.00..100000078.38 rows=21 width=16) (actual time=0.041..0.159 rows=1 loops=1)
         ->  Seq Scan on foo  (cost=100000000.00..100000022.50 rows=5 width=16) (actual time=0.004..0.004 rows=0 loops=1)
               Filter: (f2 = '2004-02-15'::date)
         ->  Index Scan using foo_2004_01_idx1 on foo_2004_01 foo  (cost=0.00..17.07 rows=5 width=16) (actual time=0.012..0.012 rows=0 loops=1)
               Index Cond: (f2 = '2004-02-15'::date)
         ->  Index Scan using foo_2004_02_idx1 on foo_2004_02 foo  (cost=0.00..4.68 rows=1 width=16) (actual time=0.016..0.022 rows=1 loops=1)
               Index Cond: (f2 = '2004-02-15'::date)
         ->  Index Scan using foo_2004_03_idx1 on foo_2004_03 foo  (cost=0.00..17.07 rows=5 width=16) (actual time=0.008..0.008 rows=0 loops=1)
               Index Cond: (f2 = '2004-02-15'::date)
         ->  Index Scan using foo_2004_04_idx1 on foo_2004_04 foo  (cost=0.00..17.07 rows=5 width=16) (actual time=0.095..0.095 rows=0 loops=1)
               Index Cond: (f2 = '2004-02-15'::date)
 Total runtime: 0.443 ms
(13 rows)

For loading data, we COPY into foo, and have a trigger that redirects the rows to the appropriate partition.

Notice that the partitions which do not contain any data of interest are still probed for data, but since they have none it is very quick. In a real-life example I got the following results just this afternoon:

- aggregate row count = 471,849,665
- total number of inherited tables = 216 (many are future dated and therefore contain no data)
- select one month's worth of data for one piece of equipment by serial number (49,257 rows) = 526.015 ms

Not too bad -- quick enough for my needs. BTW, this is using NFS mounted storage (NetApp NAS).

Joe
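The redirect trigger isn't shown above; a minimal sketch of that kind of trigger on the toy tables might look like the following (the function and trigger names are made up, and a production version would presumably build the target table name dynamically and EXECUTE it, and raise an error for dates matching no partition, rather than hard-coding one branch per child):

create or replace function foo_insert_redirect() returns trigger as '
begin
    -- route each row to the child table covering its date range
    if new.f2 >= date ''2004-02-01'' and new.f2 < date ''2004-03-01'' then
        insert into foo_2004_02 values (new.f1, new.f2, new.f3);
    elsif new.f2 >= date ''2004-03-01'' and new.f2 < date ''2004-04-01'' then
        insert into foo_2004_03 values (new.f1, new.f2, new.f3);
    -- ... one branch per partition ...
    end if;
    return null;  -- suppress the row in the parent table itself
end;
' language plpgsql;

create trigger foo_redirect before insert on foo
    for each row execute procedure foo_insert_redirect();

Since per-row triggers fire for COPY as well as for INSERT, loading through the parent table still lands the rows in the right children.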
On Wed, Sep 15, 2004 at 11:16:44AM +0200, Markus Schaber wrote:
> Hi,
>
> On Tue, 14 Sep 2004 22:10:04 -0700
> Steve Atkins <steve@blighty.com> wrote:
>
> > > Is there by any chance a set of functions to manage adding and removing
> > > partitions? Certainly this can be done by hand, but having a set of
> > > tools would make life much easier. I just looked but didn't see anything
> > > on GBorg.
> >
> > I've done a similar thing with time-segregated data by inheriting
> > all the partition tables from an (empty) parent table.
> >
> > Adding a new partition is just a "create table tablefoo () inherits(bigtable)"
> > and removing a partition just "drop table tablefoo".
>
> But you have to add table constraints restricting the time after adding
> the partition?

Uhm... unless I'm confused that's not a meaningful thing in this context. There's no rule that's putting insertions into an inherited table - the decision of which inherited table to insert into is made at application level.

As I was using it to segregate data based on creation timestamp, the application just inserts into the 'newest' inherited table until it's full, then creates a new inherited table.

I've no doubt you could set up rules to scatter inserted data across a number of tables, but that's not something that's been applicable for the problems I tend to work with, so I've not looked at it.

Cheers,
Steve
Googling 'upsert' (an Oraclism, I believe) will get you hits on Oracle and DB2's implementation of MERGE, which does what amounts to what is described below (one mass UPDATE...FROM, one mass INSERT...WHERE NOT EXISTS).

No, you shouldn't iterate row-by-row through the temp table. Whenever possible, try to do updates in one single (mass) operation. Doing it that way gives the optimizer the best chance at amortizing fixed costs and batching operations.

---------

In any database other than Postgres, I would recommend doing the INSERT /followed by/ the UPDATE. That order looks wonky --- your update ends up pointlessly operating on the rows just INSERTed. The trick is that UPDATE acquires and holds write locks (the rows were previously visible to other processes), while INSERT's write locks refer to rows that no other process could try to lock.

Stephen Frost wrote:
> * Markus Schaber (schabios@logi-track.com) wrote:
>> Generally, what is the fastest way for doing bulk processing of
>> update-if-primary-key-matches-and-insert-otherwise operations?
>
> This is a very good question, and I haven't seen much of an answer to it
> yet. I'm curious about the answer myself, actually. In the more recent
> SQL specs, from what I understand, this is essentially what the 'MERGE'
> command is for. This was recently added and unfortunately is not yet
> supported in Postgres. Hopefully it will be added soon.
>
> Otherwise, what I've done is basically an update followed by an insert
> using outer joins. If there's something better, I'd love to hear about
> it. The statements look something like:
>
> update X
>    set colA = a.colA,
>        colB = a.colB
>   from Y a
>  where keyA = a.keyA and
>        keyB = a.keyB;
>
> insert into X
> select a.keyA,
>        a.keyB,
>        a.colA,
>        a.colB
>   from Y a left join X b
>        using (keyA, keyB)
>  where b.keyA is NULL and
>        b.keyB is NULL;
>
> With the appropriate indexes, this is pretty fast but I think a merge
> would be much faster.
>
> Thanks,
>
> Stephen
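For reference, the MERGE statement Mischa and Stephen are referring to looks roughly like this in Oracle 9i and DB2 (reusing the X/Y/keyA/colA names from Stephen's example; it was not supported by PostgreSQL at the time):

MERGE INTO X
USING Y
   ON (X.keyA = Y.keyA AND X.keyB = Y.keyB)
WHEN MATCHED THEN
    UPDATE SET colA = Y.colA, colB = Y.colB
WHEN NOT MATCHED THEN
    INSERT (keyA, keyB, colA, colB)
    VALUES (Y.keyA, Y.keyB, Y.colA, Y.colB);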
Simon Riggs wrote:
>> Jim C. Nasby
>> On Mon, Sep 13, 2004 at 11:07:35PM +0100, Simon Riggs wrote:
>>
>>> PostgreSQL's functionality is in many ways similar to Oracle Partitioning.
>>>
>>> Loading up your data in many similar tables, then creating a view like:
>>>
>>> CREATE VIEW BIGTABLE (idate, col1, col2, col3...) AS
>>> SELECT 200409130800, col1, col2, col3... FROM table200409130800
>>> UNION ALL
>>> SELECT 200409131000, col1, col2, col3... FROM table200409131000
>>> UNION ALL
>>> SELECT 200409131200, col1, col2, col3... FROM table200409131200
>>> ...etc...
>>>
>>> will allow the PostgreSQL optimizer to eliminate partitions from the query
>>> when you run queries which include a predicate on the partitioning_col, e.g.
>>>
>>> select count(*) from bigtable where idate >= 200409131000
>
> The "partitions" are just tables, so no need for other management tools.
> Oracle treats the partitions as sub-tables, so you need a range of commands
> to add, swap etc the partitions of the main table.

A few years ago I wrote a federated query engine (wrapped as an ODBC driver) that had to handle thousands of contributors (partitions) to a pseudotable / VIEWofUNIONs. Joins did require some special handling in the optimizer, because of the huge number of cross products between different tables. It was definitely worth the effort at the time, because you need different strategies for: joining a partition to another partition on the same subserver; joining two large partitions on different servers; and joining a large partition on one server to a small one on another.

The differences may not be so great for a solitary server, but they're still there, because of disparity in subtable sizes. The simplistic query plans tend to let you down when you're dealing with honking warehouses.

I'm guessing that Oracle keeps per-subtable AND cross-all-subtables statistics, rather than building the latter from scratch in the course of evaluating the query plan. That's the one limitation I see in emulating their partitioned tables with views.
>> insert into X
>> select a.keyA,
>>        a.keyB,
>>        a.colA,
>>        a.colB
>>   from Y a left join X b
>>        using (keyA, keyB)
>>  where b.keyA is NULL and
>>        b.keyB is NULL;
>>
>> With the appropriate indexes, this is pretty fast but I think a merge
>> would be much faster.

Problem is, it's subject to race conditions if another process is inserting stuff at the same time...

Chris
Hi Joe,

You went to quite a bit of effort, thanks; I have the picture now. Using inheritance seems to be a useful refinement on top of the earlier outlined approach using the UNION ALL view with appropriate predicates on the column used for the partitioning. Having the individual partitions derived from a parent table makes a lot of sense.

regards
Iain
Christopher Browne wrote: > In the last exciting episode, mail@joeconway.com (Joe Conway) wrote: >>That's exactly what we're doing, but using inherited tables instead of >>a union view. With inheritance, there is no need to rebuild the view >>each time a table is added or removed. Basically, in our application, >>tables are partitioned by either month or week, depending on the type >>of data involved, and queries are normally date qualified. > Where does the constraint come in that'll allow most of the data to be > excluded? Not sure I follow this. > Or is this just that the entries are all part of "bigtable" so that > the self join is only 2-way? We don't have a need for self-joins in our application. We do use a crosstab function to materialize some transposed views of the data, however. That allows us to avoid self-joins in the cases where we might otherwise need them. Joe
Joe, Christopher,

Joe's example wasn't excluding partitions, as he didn't use a predicated UNION ALL view to select from. His queries use an indexed column that allows the various partitions to be probed at low cost, and he was satisfied with that.

My point in my previous post was that you could still do all of that if you wanted to, by building the predicated view with UNION ALL of each of the child tables.

regards
Iain
Hi, Mischa,

On Tue, 14 Sep 2004 22:58:20 GMT
Mischa Sandberg <ischamay.andbergsay@activestateway.com> wrote:

> Googling 'upsert' (an Oraclism, I believe) will get you hits on Oracle
> and DB2's implementation of MERGE, which does what amounts to what is
> described below (one mass UPDATE...FROM, one mass INSERT...WHERE NOT
> EXISTS).
>
> No, you shouldn't iterate row-by-row through the temp table.
> Whenever possible, try to do updates in one single (mass) operation.
> Doing it that way gives the optimizer the best chance at amortizing
> fixed costs and batching operations.

But when every updated row has a different value for the column(s) to be updated, don't I still have to use one UPDATE statement per row? I would expect that to be faster when done via a stored procedure than paying the whole client-server round trip, including parsing, every time. Or did I miss some nice SQL statement?

Have a nice day,
Markus

-- 
markus schaber | dipl. informatiker
logi-track ag | rennweg 14-16 | ch 8001 zürich
phone +41-43-888 62 52 | fax +41-43-888 62 53
mailto:schabios@logi-track.com | www.logi-track.com
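For what it's worth, a single set-based UPDATE can still apply a different value to every target row if the incoming values are first loaded into a temp/staging table: the per-row values travel in the joined table rather than in per-row statements. A minimal sketch, reusing the X/keyA/colA names from Stephen's example (the staging table and the column types are only illustrative):

-- bulk-load the new values, e.g. with COPY, one row per key to be updated
create temp table staging (keyA int, keyB int, colA text, colB text);

-- one statement, but each matched row in X gets its own row's values from staging
update X
   set colA = s.colA,
       colB = s.colB
  from staging s
 where X.keyA = s.keyA
   and X.keyB = s.keyB;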
On 15 Sep 2004 23:55:24 -0400, Greg Stark <gsstark@mit.edu> wrote:
>
> "J. Andrew Rogers" <jrogers@neopolitan.com> writes:
>
> > We do something very similar, also using table inheritance
>
> I have a suspicion postgres's table inheritance will end up serving as a good
> base for a partitioned table feature. Is it currently possible to query which
> subtable a record came from though?

From the docs on http://www.postgresql.org/docs/7.4/static/ddl-inherit.html :

... In some cases you may wish to know which table a particular row originated from. There is a system column called TABLEOID in each table which can tell you the originating table:

SELECT c.tableoid, c.name, c.altitude
FROM cities c
WHERE c.altitude > 500;

which returns:

 tableoid |   name    | altitude
----------+-----------+----------
   139793 | Las Vegas |     2174
   139793 | Mariposa  |     1953
   139798 | Madison   |      845

(If you try to reproduce this example, you will probably get different numeric OIDs.)

By doing a join with pg_class you can see the actual table names:

SELECT p.relname, c.name, c.altitude
FROM cities c, pg_class p
WHERE c.altitude > 500 and c.tableoid = p.oid;

which returns:

 relname  |   name    | altitude
----------+-----------+----------
 cities   | Las Vegas |     2174
 cities   | Mariposa  |     1953
 capitals | Madison   |      845

--miker
Hi, Steve,

On Wed, 15 Sep 2004 21:17:03 -0700
Steve Atkins <steve@blighty.com> wrote:

> On Wed, Sep 15, 2004 at 11:16:44AM +0200, Markus Schaber wrote:
> > But you have to add table constraints restricting the time after adding
> > the partition?
>
> Uhm... unless I'm confused that's not a meaningful thing in this context.
> There's no rule that's putting insertions into an inherited table - the
> decision of which inherited table to insert into is made at application
> level.

I was thinking of the query optimizer: it could use the table constraints to drop tables when creating the union. But now I think that an index gives enough of a win, because the tree-based indices are rather quick at returning zero rows when the queried value is out of the indexed range.

Greetings,
Markus

-- 
markus schaber | dipl. informatiker
logi-track ag | rennweg 14-16 | ch 8001 zürich
phone +41-43-888 62 52 | fax +41-43-888 62 53
mailto:schabios@logi-track.com | www.logi-track.com
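What Markus describes (child tables carrying CHECK constraints on the partition key, so that a sufficiently smart planner could skip partitions whose range contradicts the query) can at least be declared today, even though the current optimizer does not yet use the constraints that way. A sketch, reusing Steve's "bigtable" name (the column names are made up):

create table bigtable (ts timestamptz, payload text);

-- each partition declares the time range it is allowed to hold
create table bigtable_2004_09 (
    check (ts >= '2004-09-01' and ts < '2004-10-01')
) inherits (bigtable);
create index bigtable_2004_09_ts on bigtable_2004_09 (ts);

-- rolling the window forward is still just a CREATE (and later a DROP) TABLE
create table bigtable_2004_10 (
    check (ts >= '2004-10-01' and ts < '2004-11-01')
) inherits (bigtable);
create index bigtable_2004_10_ts on bigtable_2004_10 (ts);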
Iain wrote:
> Joe's example wasn't excluding partitions, as he didn't use a predicated UNION
> ALL view to select from. His queries use an indexed column that allows the
> various partitions to be probed at low cost, and he was satisfied with that.

Right.

> My point in my previous post was that you could still do all of that if
> you wanted to, by building the predicated view with UNION ALL of each of the
> child tables.

Right. It doesn't look that much different:

create or replace view foo_vw as
select * from foo_2004_01 where f2 >= '2004-jan-01' and f2 <= '2004-jan-31'
union all
select * from foo_2004_02 where f2 >= '2004-feb-01' and f2 <= '2004-feb-29'
union all
select * from foo_2004_03 where f2 >= '2004-mar-01' and f2 <= '2004-mar-31'
;

-- needed just for illustration since these are toy tables
set enable_seqscan to false;

explain analyze select * from foo_vw where f2 = '2004-feb-15';

                                   QUERY PLAN
----------------------------------------------------------------------------------
 Subquery Scan foo_vw  (cost=0.00..14.54 rows=3 width=16) (actual time=0.022..0.027 rows=1 loops=1)
   ->  Append  (cost=0.00..14.51 rows=3 width=16) (actual time=0.019..0.022 rows=1 loops=1)
         ->  Subquery Scan "*SELECT* 1"  (cost=0.00..4.84 rows=1 width=16) (actual time=0.004..0.004 rows=0 loops=1)
               ->  Index Scan using foo_2004_01_idx2 on foo_2004_01  (cost=0.00..4.83 rows=1 width=16) (actual time=0.003..0.003 rows=0 loops=1)
                     Index Cond: ((f2 >= '2004-01-01'::date) AND (f2 <= '2004-01-31'::date) AND (f2 = '2004-02-15'::date))
         ->  Subquery Scan "*SELECT* 2"  (cost=0.00..4.84 rows=1 width=16) (actual time=0.013..0.015 rows=1 loops=1)
               ->  Index Scan using foo_2004_02_idx2 on foo_2004_02  (cost=0.00..4.83 rows=1 width=16) (actual time=0.009..0.010 rows=1 loops=1)
                     Index Cond: ((f2 >= '2004-02-01'::date) AND (f2 <= '2004-02-29'::date) AND (f2 = '2004-02-15'::date))
         ->  Subquery Scan "*SELECT* 3"  (cost=0.00..4.84 rows=1 width=16) (actual time=0.001..0.001 rows=0 loops=1)
               ->  Index Scan using foo_2004_03_idx2 on foo_2004_03  (cost=0.00..4.83 rows=1 width=16) (actual time=0.001..0.001 rows=0 loops=1)
                     Index Cond: ((f2 >= '2004-03-01'::date) AND (f2 <= '2004-03-31'::date) AND (f2 = '2004-02-15'::date))
 Total runtime: 0.188 ms
(12 rows)

regression=# explain analyze select * from foo where f2 = '2004-feb-15';

                                   QUERY PLAN
----------------------------------------------------------------------------------
 Result  (cost=100000000.00..100000073.70 rows=20 width=16) (actual time=0.059..0.091 rows=1 loops=1)
   ->  Append  (cost=100000000.00..100000073.70 rows=20 width=16) (actual time=0.055..0.086 rows=1 loops=1)
         ->  Seq Scan on foo  (cost=100000000.00..100000022.50 rows=5 width=16) (actual time=0.001..0.001 rows=0 loops=1)
               Filter: (f2 = '2004-02-15'::date)
         ->  Index Scan using foo_2004_01_idx2 on foo_2004_01 foo  (cost=0.00..17.07 rows=5 width=16) (actual time=0.045..0.045 rows=0 loops=1)
               Index Cond: (f2 = '2004-02-15'::date)
         ->  Index Scan using foo_2004_02_idx2 on foo_2004_02 foo  (cost=0.00..17.07 rows=5 width=16) (actual time=0.008..0.009 rows=1 loops=1)
               Index Cond: (f2 = '2004-02-15'::date)
         ->  Index Scan using foo_2004_03_idx2 on foo_2004_03 foo  (cost=0.00..17.07 rows=5 width=16) (actual time=0.029..0.029 rows=0 loops=1)
               Index Cond: (f2 = '2004-02-15'::date)
 Total runtime: 0.191 ms
(11 rows)

The main difference being that the view needs to be recreated every time a table is added or dropped, whereas with the inherited tables method that isn't needed.

Joe
On Wed, Sep 15, 2004 at 02:09:31PM -0700, J. Andrew Rogers wrote:
> On Tue, 2004-09-14 at 21:30, Joe Conway wrote:
> > That's exactly what we're doing, but using inherited tables instead of a
> > union view. With inheritance, there is no need to rebuild the view each
> > time a table is added or removed. Basically, in our application, tables
> > are partitioned by either month or week, depending on the type of data
> > involved, and queries are normally date qualified.
>
> We do something very similar, also using table inheritance and a lot of
> triggers to automatically generate partitions and so forth. It works
> pretty well, but it is a custom job every time I want to implement a
> partitioned table. You can save a lot on speed and space if you use it
> to break up large tables with composite indexes, since you can drop
> columns from the table depending on how you use it. A big part of

Forgive my ignorance, but I didn't think a table that inherits from a parent could avoid having all of the parent's columns. Or is that not what you mean by 'you can drop columns from the table...'?

This is one advantage I see to a big UNION ALL view; if you're doing partitioning based on unique values, you don't actually have to store that value in the partition tables. For example, http://stats.distributed.net has a table that details how much work each participant did each day for each project. Storing project_id in that table is an extra 4 bytes... doesn't sound like much until you consider that the table has over 130M rows right now. So it would be nice to have an easy way to partition the table based on unique project_id's and not waste space in the partition tables on a field that will be the same for every row (in each partition).

-- 
Jim C. Nasby, Database Consultant decibel@decibel.org
Give your computer some brain candy! www.distributed.net Team #1828

Windows: "Where do you want to go today?"
Linux: "Where do you want to go tomorrow?"
FreeBSD: "Are you guys coming, or what?"
On Thu, 2004-09-16 at 13:39, Jim C. Nasby wrote:
> Forgive my ignorance, but I didn't think a table that inherits from a
> parent could avoid having all of the parent's columns. Or is that not
> what you mean by 'you can drop columns from the table...'?
>
> This is one advantage I see to a big UNION ALL view; if you're doing
> partitioning based on unique values, you don't actually have to store
> that value in the partition tables. For example,
> http://stats.distributed.net has a table that details how much work each
> participant did each day for each project. Storing project_id in that
> table is an extra 4 bytes... doesn't sound like much until you consider
> that the table has over 130M rows right now. So it would be nice to have
> an easy way to partition the table based on unique project_id's and not
> waste space in the partition tables on a field that will be the same for
> every row (in each partition).

Yeah, it is harder to do this automagically, though in theory it should be possible. Since we have to roll our own partitioning anyway, we've broken up composite primary keys so that one of the key columns hashes to a partition, using the key itself in the partition table name rather than replicating that value several million times. Ugly as sin, but you can make it work in some cases.

I do just enough work for our queries to behave correctly, and a lot of times I actually hide the base table and its descendants underneath a sort of metadata table that is grafted to the base tables by a lot of rules/triggers/functions/etc, and then do queries against that or a view of that. As I said, ugly as sin and probably not universal, but you need a lot of abstraction to make it look halfway normal. I'm going to think about this some more and see if I can't construct a generic solution.

cheers,
j. andrew rogers
> Iain
> Joe's example wasn't excluding partitions, as he didn't use a
> predicated UNION ALL view to select from. His queries use an indexed
> column that allows the various partitions to be probed at low cost,
> and he was satisfied with that.

Agreed - very, very interesting design though.

> My point in my previous post was that you could still do all of that if
> you wanted to, by building the predicated view with UNION ALL of each of
> the child tables.

AFAICS, of all the designs proposed there is still only one *using current PostgreSQL* that allows partitions to be excluded from queries as a way of speeding up queries against very large tables: UNION ALL with appended constants.

Best Regards, Simon Riggs
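To make that last design concrete, here is a minimal sketch of "UNION ALL with appended constants", with made-up table and column names along the lines of Jim's stats.distributed.net example. The per-project tables do not store project_id at all; the view supplies it as a constant, and that constant is what a predicate on project_id can be checked against to exclude branches, as Simon described up-thread:

create table email_contrib_8  (participant_id int, day date, work bigint);
create table email_contrib_25 (participant_id int, day date, work bigint);

create view email_contrib as
select 8::int  as project_id, participant_id, day, work from email_contrib_8
union all
select 25::int as project_id, participant_id, day, work from email_contrib_25;

-- per Simon's earlier post, branches whose constant contradicts the predicate
-- can be dropped from the plan, so only email_contrib_25 should be scanned
select participant_id, sum(work)
  from email_contrib
 where project_id = 25
 group by participant_id;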