Re: Constraint Exclusion + Joins? - Mailing list pgsql-hackers
From | Brandon Black
---|---
Subject | Re: Constraint Exclusion + Joins?
Date |
Msg-id | 84621a60605020553y219a85d3pa3d1734f20b70ba@mail.gmail.com
In response to | Constraint Exclusion + Joins? ("Brandon Black" <blblack@gmail.com>)
List | pgsql-hackers
On 5/2/06, Simon Riggs <simon@2ndquadrant.com> wrote:
> On Tue, 2006-05-02 at 00:27 -0500, Brandon Black wrote:
>
> > I tried things like the above with small test data sets against cvs
> > just now on my home machine, and constraint exclusion doesn't seem to
> > apply here (even if all of the joined jstuff rows have ids which only
> > match the constraint for basic_sub3, all basic_subX's seem to get
> > scanned, as is the case I'm seeing in my real code against 8.1.3). Is
> > this sort of dynamic constraint exclusion on the radar?
>
> (You should use life size data for your testing.)
>
> If you think the plan you get can be improved, post the EXPLAIN (maybe
> EXPLAIN ANALYZE) and say how you think it can be improved. It's a lot
> easier to consider concrete requirements.
>

Perhaps I should rewind a bit here then, since clearly, as Tom pointed out, dynamic constraint exclusion is fantasy-land :)

My real life-size situation is a bit more complicated, hence my trying to break things down into a simple example in the earlier email - I didn't want to bog you down reading all the details. It's really easier to explain in English than to post my pages upon pages of DDL and EXPLAIN ANALYZE output (but I can try to summarize some good example cases from work later, if necessary):

In my real code, I've got a number of separate tables logging time-series numerical statistics of various kinds. The PKs on these tables are normally composed of two fields: a resource identifier (there are about 4,500 resources) and a timestamp (the statistics come in once per minute). I'm using inheritance-based partitioning to split each stat table up into 1-hour chunks, and I keep about 72 hours of "current" data around (dropping old subtables and adding new ones as appropriate), which means there are roughly 270k rows per subtable, ~19.4 million for the whole set at any given time. A few of the (top-level, separate) tables in this scheme have a third PK component, a sub-resource identifier, of which there might be ~2-6 per resource, multiplying the row counts further in those cases. The timestamp field I'm partitioning on is called t_stamp, and it's just an integer (unix epoch time, easier for my application).

For a lot of my queries on these tables, I already have an explicit constant time-span to work with, so constraint exclusion works well. For certain queries, however, what I really want is to join into these statistical tables for a number of resource_ids, and for only the rows that are at the max(t_stamp) for each resource - the last thing inserted for each resource (they aren't necessarily in perfect time sync with each other). My (quite possibly flawed) current solution for this is to maintain a separate singular "resource_status" table which tracks the max timestamp for each resource as the data rows are inserted (it gets updated with the newly inserted timestamp if that's greater than the max_t_stamp recorded earlier - sometimes data arrives out of timestamp order), so that I can quickly determine the current per-resource max t_stamp.

I'm happy with the design overall, as my problematically huge insert (and transaction) rate is humming along nicely, and most of my queries (some of which are very complex) are running in excellent time. The only big performance issue I face now is the case described above, where I need to get one datum per resource from a stat table, for a number of resources, using each individual resource's max t_stamp.
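To make the shape of that concrete, here's a rough sketch of the schema I'm describing - heavily simplified, with made-up table and column names rather than my real DDL:

```sql
-- One top-level stat table per kind of statistic, split into 1-hour children
-- on t_stamp via inheritance (with constraint_exclusion = on).
CREATE TABLE stats (
    resource_id integer NOT NULL,
    t_stamp     integer NOT NULL,  -- unix epoch seconds
    value       numeric,
    PRIMARY KEY (resource_id, t_stamp)
);

-- A child covering one hour; the CHECK constraint is what lets the planner
-- skip this table when a query supplies a constant t_stamp outside the range.
-- (Each child also gets its own PK/index, since those don't inherit.)
CREATE TABLE stats_2006050200 (
    CHECK (t_stamp >= 1146528000 AND t_stamp < 1146531600)
) INHERITS (stats);

-- The singular status table, maintained on insert, tracking the latest
-- t_stamp seen so far for each resource.
CREATE TABLE resource_status (
    resource_id integer PRIMARY KEY,
    max_t_stamp integer NOT NULL
);
```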
When I do this by joining from the table where I recorded the max t_stamps into the top-level stat table, the plan does index scans of every child (obviously) to match the stamps, even though in the most likely case all of the stamps fall in a narrow window within just 1-2 child tables. By manually rewriting the query as a loop in plpgsql which first extracts the max t_stamp per resource from my "status" table, then inside the loop runs an individual select statement per resource with the t_stamp as a constant, I see huge performance increases over the join (in the neighborhood of an order of magnitude, though it varies up or down in different cases), to the point where things are acceptably fast (a rough sketch of that loop is at the bottom of this mail). As I mentioned, though, the plpgsql looping construct is difficult to work with for dynamically constructed user queries (in terms of further where-clause constraints, joins to other tables afterwards, limit/offset counts, ordering, etc.), whereas plugging those things in is easy if the client code is executing a single literal dynamic SQL statement for the whole thing. There aren't any ugly sequential scans going on in the join case or anything like that; it appears to be just the difference between checking ~72 indexes versus checking 1.

I could break the tables up into larger chunks: if I did 12 hours instead of 1 hour, I'd be looking at more like 6 subtables/indexes, but each growing to the vicinity of 1.5-9 million (in some rare cases, perhaps 20 million) rows per table. It is quite difficult and time-consuming to experiment with this, but clearly it's an avenue I need to look into.

Another option I considered was that on each statistic insert into the big partitioned history table, I would also update a single row per resource in a separate table (or in the top-level table) which tracks only the most recent data. This decreased my insertion-transaction performance substantially, of course (I tried a few different logical ways of doing it), so I've been working with the max-t_stamp-storing method instead lately. It could still turn out that "update a table of latest data when doing the inserts" is the least painful way to pay for these queries.

And finally, a relatively easy option I'm going to try out later today is to have the client code do a dynamic "create function" to make a custom plpgsql loop for itself, use it once, and toss it. For the current schema design this may turn out to be the best combination of fast and flexible, even though it's a bit ugly.

Anyway, thanks for your time and efforts on postgres in general - I love it, and I don't mean this to sound as "complaining" as it probably does. I'm just finding it hard to come up with a way out of this latest corner I've painted myself into :)

-- Brandon
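P.S. In case it helps make the above concrete, here's roughly the shape of that hand-written plpgsql loop - a simplified sketch using the made-up names from earlier, not my actual code:

```sql
-- Sketch: pull the per-resource max t_stamp from resource_status, then issue
-- one statement per resource via EXECUTE so the stamp appears as a literal
-- constant and constraint exclusion only touches the matching child table(s).
CREATE OR REPLACE FUNCTION latest_stats(res_ids integer[])
RETURNS SETOF stats AS $$
DECLARE
    rs resource_status%ROWTYPE;
    r  stats%ROWTYPE;
BEGIN
    FOR rs IN
        SELECT * FROM resource_status WHERE resource_id = ANY (res_ids)
    LOOP
        FOR r IN EXECUTE
            'SELECT * FROM stats WHERE resource_id = ' || rs.resource_id::text
            || ' AND t_stamp = ' || rs.max_t_stamp::text
        LOOP
            RETURN NEXT r;
        END LOOP;
    END LOOP;
    RETURN;
END;
$$ LANGUAGE plpgsql;

-- e.g.  SELECT * FROM latest_stats(ARRAY[101, 102, 103]);
```

The hard part, as described above, is bolting arbitrary extra where-clause constraints, joins, ordering, and limit/offset onto something shaped like this, which is why I'm considering having the client code generate and drop a throwaway function like this per query.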