Thread: Finding latest record for a number of groups in an INSERT-only table
This is basically exactly the same as
http://archives.postgresql.org/pgsql-sql/2008-10/msg00009.php; I'm
just asking again, to see if thinking on the problem has changed.

The basic problem, restated, is one has a relation with tuples like this:

  (key, recency_stamp, other_columns*)

Some example data may be:

  'ice cream', 0, 'vanilla'
  'ice cream', 1, 'chocolate'
  'beer', 0, 'ale'
  'beer', 3, 'lager'

The goal is to extract from this relation the following:

  'ice cream', 1, 'chocolate'
  'beer', 3, 'lager'

In my dataset, the distinct number of keys is << the number of rows; in
fact, that ratio grows ever smaller until the data is truncated, with a
convergence point probably resting at a negligible fraction of a
percent -- definitely less than 0.1% in short order, easily getting to
0.01% and 0.001% in not a vast amount of time. There is a standard
btree index on (key, recency_stamp), which ideally is exploitable to
compute the 'most recent' relation very quickly. Approaches I've
investigated include DISTINCT ON (this may be the cleanest), windowing
functions (rank, and a predicate), and a GROUP BY with a self-join, but
none of them avoid a full relation scan, which is just crippling
performance-wise.

The only way I could find to trick the optimizer into doing something
non-crippling was to use http://wiki.postgresql.org/wiki/Loose_indexscan
to find distinct values quickly, and subsequently a subquery in the
target list that the optimizer decides to run over every distinct value
(it estimates a very high cost for this, even though it is actually
rather fast, and the ridiculous costing can skew subsequent joins
against the relation in unfortunate ways). This mechanism is obscure
enough that I may just write a plpgsql function as a workaround, as
that may well be more lucid.

I am using PostgreSQL 9.

Does anyone have fresh thoughts or suggestions for dealing with
INSERT-mostly tables conceived in this manner?

--
fdr
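For concreteness, the DISTINCT ON form I mean looks roughly like this, with "t" and "flavour" standing in for the real relation and payload column:

```sql
-- Sketch of the DISTINCT ON approach against the example data.
-- DISTINCT ON (key) keeps the first row per key in ORDER BY order,
-- so sorting by recency_stamp DESC yields the latest row per key.
SELECT DISTINCT ON (key)
       key, recency_stamp, flavour
FROM t
ORDER BY key, recency_stamp DESC;
```

It is the tersest of the forms, but as noted it still plans as a scan and sort over the whole relation rather than skipping between keys in the index.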
On Jul 4, 2011, at 19:48, Daniel Farina <daniel@heroku.com> wrote:

> This is basically exactly the same as
> http://archives.postgresql.org/pgsql-sql/2008-10/msg00009.php; I'm
> just asking again, to see if thinking on the problem has changed:
>
> [snip]
>
> Does anyone have fresh thoughts or suggestions for dealing with
> INSERT-mostly tables conceived in this manner?

Setup a materialized view.

David J.
On 5 Jul 2011, at 3:23, David Johnston wrote:

>> Does anyone have fresh thoughts or suggestions for dealing with
>> INSERT-mostly tables conceived in this manner?

You're struggling with read-performance in an INSERT-mostly table?
Where are your performance priorities, on INSERT or on SELECT?

> Setup a materialized view.

Basically that means moving the tracking of the last row to the INSERT
phase. If you go this way, a few approaches to keeping a "latest
records" table are:

1. Replace the record with the same key with the latest one from your
   INSERT-mostly table, or
2. Update a foreign key reference to the latest record in your
   INSERT-mostly table.

The new table should probably have a UNIQUE constraint on the key field.

Alban Hertroys

--
If you can't see the forest for the trees, cut the trees and you'll see there is no forest.
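A rough sketch of option 1, with made-up table, column, and trigger names; note that concurrent inserts of a brand-new key can still race, so a production version would need a retry loop around the upsert logic:

```sql
-- Side table holding only the latest row per key, maintained by trigger.
-- "events" and "latest_events" are hypothetical names for illustration.
CREATE TABLE latest_events (
    key           text PRIMARY KEY,
    recency_stamp int  NOT NULL,
    payload       text
);

CREATE OR REPLACE FUNCTION track_latest() RETURNS trigger AS
$$
BEGIN
    -- Overwrite the cached row only if the incoming row is newer.
    UPDATE latest_events
       SET recency_stamp = NEW.recency_stamp, payload = NEW.payload
     WHERE key = NEW.key AND recency_stamp < NEW.recency_stamp;
    IF NOT FOUND THEN
        -- Either no row exists for this key yet, or the cached row is
        -- already newer; only insert in the former case.
        INSERT INTO latest_events (key, recency_stamp, payload)
        SELECT NEW.key, NEW.recency_stamp, NEW.payload
        WHERE NOT EXISTS (SELECT 1 FROM latest_events
                           WHERE key = NEW.key);
    END IF;
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER events_track_latest
    AFTER INSERT ON events
    FOR EACH ROW EXECUTE PROCEDURE track_latest();
```

Since latest_events stays small (one row per distinct key), the per-insert overhead is a single index probe plus a heap update on a compact table.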
On Mon, Jul 4, 2011 at 11:55 PM, Alban Hertroys
<dalroi@solfertje.student.utwente.nl> wrote:
> You're struggling with read-performance in an INSERT-mostly table?
> Where are your performance priorities, on INSERT or on SELECT?

Principally on INSERT, but I don't want SELECT to be several orders of
magnitude slower than it needs to be (and getting slower, as ndistinct
vanishes into a tiny fraction of all records). With procedural code, or
by tricking the optimizer, I can convince it to do something
reasonable, even if the constants are much higher than they need to be
(from things like the index scan having to be restarted all the time,
for example).

There will be many, many records that are *never* fetched/joined. In
this case, this is a monitoring application that is appending large
amounts of data, but it needs to join back recent values only when an
operator/human being wants to take a look at what is happening right
now.

> Setup a materialized view.

This rather defeats the point AFAIK, because keeping the materialized
view up to date (being more than thirty seconds out of date is not
desirable) will be expensive. Maintaining the index on the (key,
recency) pair is, in effect, a materialization of sorts already.

In any case, as I was saying: there are terrible workarounds for this,
but I think this is a rather common problem with INSERT-mostly
relations that effectively want row-versioning of a sort, so I was
hoping that lucid solutions to this issue have grown since 2008, when
the thread I linked to transpired.

--
fdr
On Tue, Jul 5, 2011 at 12:48 AM, Daniel Farina <daniel@heroku.com> wrote:
> This is basically exactly the same as
> http://archives.postgresql.org/pgsql-sql/2008-10/msg00009.php; I'm
> just asking again, to see if thinking on the problem has changed:
>
> [snip]
>
> The only way I could find to trick the optimizer into doing something
> non-crippling was to trick it with
> http://wiki.postgresql.org/wiki/Loose_indexscan to find distinct
> values quickly and subsequently using a subquery in the target list
> that the optimizer decides to run over every distinct value. This
> mechanism is obscure enough that I may just write a plpgsql function
> as a workaround, as that may well be more lucid.
I think it's a pretty common requirement, and we should be looking to
optimize it if it isn't handled well.

The only problem is that there are a few ways of writing that in SQL,
and we would probably need to recognise how to transform between query
types to achieve a common form. For example, the post you cite uses a
correlated subquery, whereas it's possible to write it using an IN
subselect.

--
Simon Riggs                   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
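To make the two forms concrete, on a hypothetical table t(key, recency_stamp, flavour):

```sql
-- Correlated-subquery form, as in the cited post: for each row, ask
-- whether its stamp is the maximum for its key.
SELECT t.key, t.recency_stamp, t.flavour
FROM t
WHERE t.recency_stamp = (SELECT max(i.recency_stamp)
                           FROM t i
                          WHERE i.key = t.key);

-- Equivalent IN-subselect form: compute the (key, max stamp) pairs
-- once, then filter the relation against them.
SELECT t.key, t.recency_stamp, t.flavour
FROM t
WHERE (t.key, t.recency_stamp) IN (SELECT key, max(recency_stamp)
                                     FROM t
                                    GROUP BY key);
```

Both express the same result, but the planner does not currently recognise them as the same shape, which is exactly the canonicalization problem described above.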
On Tue, Jul 5, 2011 at 12:32 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
> I think it's a pretty common requirement and we should be looking to
> optimize it if it isn't handled well.

I agree; although I wanted to be sure that it is not in fact handled
well by some mechanism I haven't seen yet.

> The only problem is that there are a few ways of writing that in SQL
> and we would probably need to recognise how to transform between query
> types to achieve a common form.

I think that'd be good for completeness, although I also think that
having a 'design pattern' of sorts for dealing with this workload would
be totally acceptable for quite some time, even if canonicalization
from other forms is technically possible.

> For example, the post you cite uses a correlated subquery whereas it's
> possible to write it using an IN subselect.

I liked the brevity of the DISTINCT ON form, even if semantically it
feels a little weird; yet I think the biggest bang for the buck is
looking at the "row_number() = N" form of the problem: I've seen people
do this a number of times for different reasons. In my case N = 1, but
if I wanted to take an average of a metric over the last three records
or so, then one could easily imagine wanting row_number() <= 3,
followed by aggregations. I've seen some other real-world use cases fly
by where people wanted some N where N != 1, too.

When such a predicate is around, one can be sure that no more than N
rows are generated per partition, and if there's an appropriate index
then some kind of costing can take place to see if it's worth using.

Interestingly, this may also be a way to (in a degenerate case) acquire
a path to enable skip scan, but last I checked skip scan was
problematic somehow for other reasons, which makes me wonder how
feasible it'd be to embark on this optimizer enhancement, should it
prove necessary.

--
fdr
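Spelled out, with t(key, recency_stamp, flavour) standing in for the real relation:

```sql
-- The "row_number() = N" form: rank rows within each key by recency,
-- then keep only the top-ranked row.  With rn <= 3 instead, the three
-- most recent rows per key survive for subsequent aggregation.
SELECT key, recency_stamp, flavour
FROM (SELECT key, recency_stamp, flavour,
             row_number() OVER (PARTITION BY key
                                ORDER BY recency_stamp DESC) AS rn
        FROM t) ranked
WHERE rn = 1;
```

The window function's partition and ordering match the (key, recency_stamp) index exactly, which is what makes an index-driven plan conceivable; today the predicate on rn is applied only after the full scan.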
On 05/07/11 11:48, Daniel Farina wrote:
> This is basically exactly the same as
> http://archives.postgresql.org/pgsql-sql/2008-10/msg00009.php; I'm
> just asking again, to see if thinking on the problem has changed:
>
> The basic problem, restated, is one has a relation with tuples like this:
>
> (key, recency_stamp, other_columns*)
>
> Some example data may be:
>
> 'ice cream', 0, 'vanilla'
> 'ice cream', 1, 'chocolate'
> 'beer', 0, 'ale'
> 'beer', 3, 'lager'
>
> The goal is to extract from this relation the following:
>
> 'ice cream', 1, 'chocolate'
> 'beer', 3, 'lager'
>
> [snip]
> Does anyone have fresh thoughts or suggestions for dealing with
> INSERT-mostly tables conceived in this manner?

My solution, which is not especially optimised for an 'insert only'
table, is:

DROP TABLE IF EXISTS T;

CREATE TABLE T
(
    type    text,
    id      int,
    flavour text,
    PRIMARY KEY (type, id)
);

INSERT INTO T (type, id, flavour) VALUES
    ('ice cream', 0, 'vanilla'),
    ('ice cream', 1, 'chocolate'),
    ('beer', 0, 'ale'),
    ('beer', 3, 'lager');

WITH M (type, id) AS
(
    SELECT type, max(id)
    FROM T
    GROUP BY type
)
SELECT T.type, T.id, T.flavour
FROM M, T
WHERE T.type = M.type AND T.id = M.id;

Cheers,
Gavin
On 5 Jul 2011, at 9:13, Daniel Farina wrote:

>> Setup a materialized view.
>
> This rather defeats the point AFAIK, because keeping the materialized
> view up to date (being more than thirty seconds out of date is not
> desirable) will be expensive. Maintaining the index on the (key,
> recency) pair is, in effect, a materialization of sorts already.

Except that you can't query an index directly, and you can't create an
index that only contains the latest rows without marking them as such,
which is practically what a materialized view does.

Your concerns about the overhead of the materialized view seem rather
pessimistic. The materialization adds some overhead to every
insert/update/delete to test whether the current record in the
materialized view should be replaced or kept, but since it's a much
smaller table, that should be quite fast.

It's basically just a set of insert/update/delete triggers that fire
once for every row, or once for every statement (that can be more
efficient with insert-heavy tables, as long as you group your inserts
into a single statement). Whether you can handle the extra overhead
depends on your insert-load.

You could also go with an intermediate solution, where you cache the
latest record per key in the new inserts in a batch and send those to a
different table once the batch finishes processing. The insert-load on
that table will be much lower, so there will be more time to execute
triggers of some kind to update the materialized view. Of course, you
also introduce some lag there with respect to the visibility of those
rows.

> In any case, as I was saying: there are terrible workarounds for this,
> but I think this is a rather common problem with INSERT-mostly
> relations that effectively want row-versioning of a sort, so I was
> hoping that lucid solutions to this issue have grown since 2008, when
> the thread I linked to transpired.

You really only want to know that a row is the latest in a group of
similar rows. With MVCC, you can only really achieve that by updating
the latest row and the last latest row to reflect their status (latest
or not). That means you do two DELETE and INSERT operations every time
you want to update what the latest row is. That's hardly different from
using triggers anyway.

The alternatives are the various SELECT queries that you've already
seen.

Alban Hertroys

--
If you can't see the forest for the trees, cut the trees and you'll see there is no forest.
On Tue, Jul 5, 2011 at 9:49 AM, Alban Hertroys
<dalroi@solfertje.student.utwente.nl> wrote:
> Except that you can't query an index directly and you can't create an
> index that only contains the latest rows without marking them as such,
> which is practically what a materialized view does.

It doesn't need to contain only the latest rows (although that'd be an
interesting index type, come to think of it), but the speed of adding
things to the index is greater than having to find the original row via
UPDATE, and the index is considerably more compact than the rest of
this fairly-fat row. Ergo, it's not "practically" doing the work of a
materialized view.

> Your concerns about the overhead of the materialized view seem rather
> pessimistic. The materialization adds some overhead to every
> insert/update/delete to test whether the current record in the
> materialized view should be replaced or kept, but since it's a much
> smaller table, that should be quite fast.

It *might* be fast enough, but the point of this thread was to see if I
could write a lucid query less contorted than the one I have come up
with (emulated skip-scan + SELECT per record). Right now the
application is better off doing a thousand-plus queries (Cn+1) because
the optimizer knows how to deal with finding the latest record given a
specific group, but not how to find the latest records over an entire
relation. By wiring this approach into the backend I can avoid the
painful network overhead.

We also have other (denormalization-based) approaches in mind that are
likely faster than even proper use of the index, but it's unfortunate
we have to take that step, as otherwise the latest-record index
scanning approach would likely be fast enough. But as long as we're
doing contortions, we may as well try the one that we think will result
in maximum read performance.

> You really only want to know that a row is the latest in a group of
> similar rows. With MVCC, you can only really achieve that by updating
> the latest row and the last latest row to reflect their status (latest
> or not). That means you do two DELETE and INSERT operations every time
> you want to update what the latest row is.

What you say about MVCC does not follow from the problem.

--
fdr
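For reference, the emulated skip-scan looks roughly like this; a sketch against a hypothetical t(key, recency_stamp, ...), where both the distinct-key walk and the per-key lookup are satisfiable by the (key, recency_stamp) index:

```sql
-- Loose-indexscan emulation (per the PostgreSQL wiki): the recursive
-- CTE hops from one distinct key to the next via min(key) probes, and
-- the target-list subquery fetches the newest stamp for each key.
WITH RECURSIVE dk AS (
    SELECT min(key) AS key FROM t
    UNION ALL
    SELECT (SELECT min(key) FROM t WHERE key > dk.key)
      FROM dk
     WHERE dk.key IS NOT NULL
)
SELECT dk.key,
       (SELECT t.recency_stamp
          FROM t
         WHERE t.key = dk.key
         ORDER BY t.recency_stamp DESC
         LIMIT 1) AS latest_stamp
FROM dk
WHERE dk.key IS NOT NULL;
```

Pulling additional payload columns requires either a composite-returning subquery or a join back against the resulting (key, latest_stamp) pairs, which is where the optimizer's exaggerated cost estimate can distort the surrounding plan.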
It seems to me one solution is to alter your table topology by
partitioning your table by the keys you need to query on, and then
using simple aggregates. You'd have to set up ON INSERT DO INSTEAD
rules, and you might get a performance hit...

Another solution might be to break up the query into several pieces,
running smaller queries aimed at retrieving individual rows. This could
be done inside a stored proc. Looking into how we did this with some
queries in LedgerSMB...

Here's a stored procedure we used in LedgerSMB to pull distinct years
from a table with maybe 10M rows in a timely fashion. Something similar
might be doable for you, with modifications of course:

CREATE OR REPLACE FUNCTION date_get_all_years() RETURNS SETOF int AS
$$
DECLARE
    next_record int;
BEGIN
    SELECT MIN(EXTRACT('YEAR' FROM transdate))::int
      INTO next_record
      FROM acc_trans;
    LOOP
        EXIT WHEN next_record IS NULL;
        RETURN NEXT next_record;
        SELECT MIN(EXTRACT('YEAR' FROM transdate))::int AS year
          INTO next_record
          FROM acc_trans
         WHERE EXTRACT('YEAR' FROM transdate) > next_record;
    END LOOP;
END;
$$ LANGUAGE plpgsql;
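For instance, a hypothetical adaptation to the latest-per-key problem, assuming a table t with the (key, recency_stamp) btree index (untested sketch):

```sql
-- Walk the distinct keys via min(key) index probes, returning the
-- newest full row for each key.  Every iteration is two cheap index
-- lookups, so the cost scales with ndistinct rather than total rows.
CREATE OR REPLACE FUNCTION latest_per_key() RETURNS SETOF t AS
$$
DECLARE
    cur_key text;
    r       t%ROWTYPE;
BEGIN
    SELECT min(key) INTO cur_key FROM t;
    WHILE cur_key IS NOT NULL LOOP
        SELECT * INTO r
          FROM t
         WHERE key = cur_key
         ORDER BY recency_stamp DESC
         LIMIT 1;
        RETURN NEXT r;
        -- Jump to the next distinct key in btree order.
        SELECT min(key) INTO cur_key FROM t WHERE key > cur_key;
    END LOOP;
END;
$$ LANGUAGE plpgsql;
```

Each probe is a plan the optimizer already handles well (latest record for one given key), which sidesteps the whole-relation planning problem at the cost of procedural plumbing.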