Thread: count(*) performance improvement ideas
I am reading discussion about improving count(*) performance. I have also seen a TODO for this. Many people have suggested TRIGGER based solution to the slow count(*) problem. I looked at the following link which presents the solution neatly. http://www.varlena.com/GeneralBits/120.php But how does that really work for SERIALIZABLE transactions ? If two concurrent transactions INSERT/DELETE rows from a table, the trigger execution of one of the transactions is bound to fail because of concurrent access. Even for READ COMMITTED transactions, the trigger execution would wait if the other transaction has executed the trigger on the same table. Well, I think the READ COMMITTED case can be handled with DEFERRED triggers, but that may require queuing up too many triggers if there are many inserts/deletes in a transaction. Running trigger for every insert/delete seems too expensive. I wonder if we can have a separate "counter" table (as suggested in the TRIGGER based solution) and track total number of tuples inserted and deleted in a transaction (and all the committed subtransactions). We then execute a single UPDATE at the end of the transaction. With HOT, updating the "counter" table should not be a big pain since all these updates can potentially be HOT updates. Also since the update of the "counter" table happens at the commit time, other transactions inserting/deleting from the same user table may need to wait for a very small period on the "counter" table tuple. This still doesn't solve the serializable transaction problem though. But I am sure we can figure out some solution for that case as well if we agree on the general approach. I am sure this must have been discussed before. So what are the objections ? Thanks, Pavan -- Pavan Deolasee EnterpriseDB http://www.enterprisedb.com
"Pavan Deolasee" <pavan.deolasee@gmail.com> writes: > I am sure this must have been discussed before. Indeed. Apparently you didn't find the threads in which the idea of having transactions enter "delta" entries was discussed? Solves both the locking and the MVCC problems, at the cost that you need to make cleanup passes over the counter table fairly often. I don't see this as material for the core DB but it would make a great contrib module. regards, tom lane
Tom Lane wrote: > "Pavan Deolasee" <pavan.deolasee@gmail.com> writes: > > I am sure this must have been discussed before. > > Indeed. Apparently you didn't find the threads in which the idea of > having transactions enter "delta" entries was discussed? Solves both > the locking and the MVCC problems, at the cost that you need to make > cleanup passes over the counter table fairly often. > > I don't see this as material for the core DB but it would make a great > contrib module. The TODO list has lots of details on this. -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://postgres.enterprisedb.com + If your life is a hard drive, Christ can be your backup. +
On Wed, Mar 12, 2008 at 9:01 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > "Pavan Deolasee" <pavan.deolasee@gmail.com> writes: > > I am sure this must have been discussed before. > > Indeed. Apparently you didn't find the threads in which the idea of > having transactions enter "delta" entries was discussed? Solves both > the locking and the MVCC problems, at the cost that you need to make > cleanup passes over the counter table fairly often. > Ok. I would surely look at those threads. Hopefully HOT will considerably solve the counter table cleanup issue. > I don't see this as material for the core DB but it would make a great > contrib module. > I guess we will need some backend hooks to make it really work, no ? At the minimum we need to track the "deltas" at the transaction level and the ability to do some extra processing at the commit time. May be I should first read those threads and I will find the answers. Thanks, Pavan -- Pavan Deolasee EnterpriseDB http://www.enterprisedb.com
Pavan Deolasee wrote: > I am reading discussion about improving count(*) performance. I have > also seen a TODO for this. > > Many people have suggested TRIGGER based solution to the slow count(*) > problem. I looked at the following link which presents the solution > neatly. > > http://www.varlena.com/GeneralBits/120.php > > But how does that really work for SERIALIZABLE transactions ? If > two concurrent transactions INSERT/DELETE rows from a table, > the trigger execution of one of the transactions is bound to fail > because of concurrent access. Even for READ COMMITTED transactions, > the trigger execution would wait if the other transaction has executed > the trigger on the same table. Well, I think the READ COMMITTED case > can be handled with DEFERRED triggers, but that may require queuing up > too many triggers if there are many inserts/deletes in a transaction. > > Running trigger for every insert/delete seems too expensive. I wonder > if we can have a separate "counter" table (as suggested in the TRIGGER > based solution) and track total number of tuples inserted and deleted > in a transaction (and all the committed subtransactions). We then > execute a single UPDATE at the end of the transaction. With HOT, > updating the "counter" table should not be a big pain since all these > updates can potentially be HOT updates. Also since the update of > the "counter" table happens at the commit time, other transactions > inserting/deleting from the same user table may need to wait for a > very small period on the "counter" table tuple. > > This still doesn't solve the serializable transaction problem > though. But I am sure we can figure out some solution for that case > as well if we agree on the general approach. > > I am sure this must have been discussed before. So what are the > objections If you are talking about automatically doing this for every table - I have an objection that the performance impact seems unwarranted against the gain. We are still talking about every insert or update updating some counter table, with the only mitigating factor being that the trigger would be coded deeper into PostgreSQL theoretically making it cheaper? You can already today create a trigger on insert that will append to a summary table of some sort, whose only purpose is to maintain counts. At the simplest, it is as somebody else suggested where you might have the other table only store the primary keys with foreign key references back to the main table for handling deletes and updates. Storing transaction numbers and such might allow the data to be reduced in terms of size (and therefore elapsed time to scan), but it seems complex. If this really is a problem that must be solved - I prefer the suggestion from the past of keeping track of live rows per block for a certain transaction range, and any that fall within this range can check off this block quickly with an exact count, then the exceptional blocks (the ones being changed) can be scanned to be sure. But, it's still pretty complicated to implement right and maintain, for what is probably limited gain. I don't personally buy into the need to do exact count(*) on a whole table quickly. I know people ask for it - but I find these same people either confused, or trying to use this functionality to accomplish some other end, under the assumption that because they can get counts faster from other databases, therefore PostgreSQL should do it as well. I sometimes wonder whether these people would even notice if PostgreSQL translated count(*) on the whole table to query reltuples. :-) Cheers, mark -- Mark Mielke <mark@mielke.cc>
On Wed, Mar 12, 2008 at 9:14 PM, Mark Mielke <mark@mark.mielke.cc> wrote: > > If you are talking about automatically doing this for every table - I > have an objection that the performance impact seems unwarranted against > the gain. We are still talking about every insert or update updating > some counter table, with the only mitigating factor being that the > trigger would be coded deeper into PostgreSQL theoretically making it > cheaper? > No, I am not suggesting that. If you read proposal carefully, its one UPDATE per transaction. With HOT, I am hoping that the counter table may be completely cached in memory and won't bloat much. Also, we can always have a GUC (like pgstats) to control the overhead. Thanks, Pavan -- Pavan Deolasee EnterpriseDB http://www.enterprisedb.com
Pavan Deolasee wrote: <blockquote cite="mid:2e78013d0803120852h11a1022fw952900d925405294@mail.gmail.com" type="cite"><prewrap="">On Wed, Mar 12, 2008 at 9:14 PM, Mark Mielke <a class="moz-txt-link-rfc2396E" href="mailto:mark@mark.mielke.cc"><mark@mark.mielke.cc></a>wrote: </pre><blockquote type="cite"><pre wrap=""> If youare talking about automatically doing this for every table - Ihave an objection that the performance impact seems unwarrantedagainstthe gain. We are still talking about every insert or update updatingsome counter table, with the only mitigatingfactor being that thetrigger would be coded deeper into PostgreSQL theoretically making itcheaper? </pre></blockquote><prewrap=""> No, I am not suggesting that. If you read proposal carefully, its one UPDATE per transaction. With HOT, I am hoping that the counter table may be completely cached in memory and won't bloat much. Also, we can always have a GUC (like pgstats) to control the overhead. </pre></blockquote><br /> Fine - once per transactioninstead of once per insert. Still, if there is overhead to this (updating a secondary summary table), does itreally make sense to have it for every table? Most of my tables do not require count(*) on the whole table (actually -none of them do). For the same reason as I don't want oid, I don't think I would want "fast count" capabilities to impactmy regular queries. Again, I don't think count(*) on the whole table is a particularly useful case. count(*) on particularsubsets of the data may be, but of the whole table?<br /><br /> If you can make a secondary summary table to beused for count(*) optional, great. If using HOT makes the secondary table more efficient, great.<br /><br /> Cheers,<br/> mark<br /><br /><pre class="moz-signature" cols="72">-- Mark Mielke <a class="moz-txt-link-rfc2396E" href="mailto:mark@mielke.cc"><mark@mielke.cc></a> </pre>
Mark Mielke <mark@mark.mielke.cc> writes: > Fine - once per transaction instead of once per insert. Still, if there > is overhead to this (updating a secondary summary table), does it really > make sense to have it for every table? We certainly wouldn't accept a patch that imposed this overhead on every table. One of the advantages of framing it as an explicit set of triggers is that then you have a natural way of indicating which table(s) you want the feature for (and are willing to pay the overhead to get it). regards, tom lane
On Wed, Mar 12, 2008 at 9:53 PM, Mark Mielke <mark@mark.mielke.cc> wrote: > > > > Fine - once per transaction instead of once per insert. Still, if there is > overhead to this (updating a secondary summary table), does it really make > sense to have it for every table? Most of my tables do not require count(*) > on the whole table (actually - none of them do). For the same reason as I > don't want oid, I don't think I would want "fast count" capabilities to > impact my regular queries. Again, I don't think count(*) on the whole table > is a particularly useful case. count(*) on particular subsets of the data > may be, but of the whole table? > ISTM that you are complaining because we never had an *fast* count(*) and adding that now comes at a cost. Had it been there from day one with the same overhead as we are talking about now, nobody would have complained :-) Anyways, your point is taken and it would be great if can make it configurable, if not table level then at least globally. Thanks, Pavan -- Pavan Deolasee EnterpriseDB http://www.enterprisedb.com
Pavan Deolasee wrote: <blockquote cite="mid:2e78013d0803120959r515667ecl4edb3bd3afbfbdfe@mail.gmail.com" type="cite"><prewrap="">On Wed, Mar 12, 2008 at 9:53 PM, Mark Mielke <a class="moz-txt-link-rfc2396E" href="mailto:mark@mark.mielke.cc"><mark@mark.mielke.cc></a>wrote: </pre><blockquote type="cite"><pre wrap=""> Fine - once per transaction instead of once per insert. Still, if there is overhead to this (updating a secondary summary table), does it really make sense to have it for every table? Most of my tables do not require count(*) on the whole table (actually - none of them do). For the same reason as I don't want oid, I don't think I would want "fast count" capabilities to impact my regular queries. Again, I don't think count(*) on the whole table is a particularly useful case. count(*) on particular subsets of the data may be, but of the whole table? </pre></blockquote><pre wrap=""> ISTM that you are complaining because we never had an *fast* count(*) and adding that now comes at a cost. Had it been there from day one with the same overhead as we are talking about now, nobody would have complained :-) </pre></blockquote><br /> Quite possibly - but, similarly, if PostgreSQL was generally slower due to suchthings, I might not have chosen PostgreSQL as my favoured db. :-) I think there is a reason PostgreSQL doesn't come witha fast count(*), and it's that the developers leading up to this point shared the opinion that this feature was not critical.<br/><br /> I somewhat disagree about my personal reaction, though. I complain about many things, even configurablethings, such as LISTEN/NOTIFY. I occasionally wish I had such a function, but given it's implementation, I wouldchoose to use a non-PostgreSQL mechanism for implementation if available. Luckily, it's also "only suffer the performancecost if you CHOOSE to use it." Seems like a historically valuable trend.<br /><br /><blockquote cite="mid:2e78013d0803120959r515667ecl4edb3bd3afbfbdfe@mail.gmail.com"type="cite"><pre wrap="">Anyways, your point is takenand it would be great if can make it configurable, if not table level then at least globally</pre></blockquote><br /> If table level, I might even try it out. If global level,I will not be trying it out. I'm only one person - but perhaps this will help point you in a direction favourable tomany?<br /><br /> If you are REALLY feeling clever, the queries that I would find it to benefit the MOST on, would includeWHERE conditions. Just like indexes with WHERE conditions. If you get to this point, I think you are reaching somethingthat will have far more universal benefit to existing applications. CREATE COUNT INDEX blah ON blah WHERE ... Iwould use this in many places.<br /><br /> Cheers,<br /> mark<br /><br /><pre class="moz-signature" cols="72">-- Mark Mielke <a class="moz-txt-link-rfc2396E" href="mailto:mark@mielke.cc"><mark@mielke.cc></a> </pre>
"Tom Lane" <tgl@sss.pgh.pa.us> writes: > Mark Mielke <mark@mark.mielke.cc> writes: >> Fine - once per transaction instead of once per insert. Still, if there >> is overhead to this (updating a secondary summary table), does it really >> make sense to have it for every table? > > We certainly wouldn't accept a patch that imposed this overhead on every > table. If you look at this at the right angle it's actually a degenerate case of materialized views. I think think it would be more useful to approach it from that direction even if it only supported a very limited set of expressions. In an ideal world I would love to be able to do something like: CREATE MATERIALIZED VIEW foo AS (select count(*) from bar) WITH INCREMENTAL UPDATES; and have that automatically create both a heap to store the count and another to store the incremental changes. To do this would require some magic to know what "incremental changes" means for each aggregate where it's meaningful though. Then it would require some magic in the optimizer to recognize when piece of the query can be satisfied by a materialized view. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com Ask me about EnterpriseDB's Slony Replication support!
Gregory Stark wrote: > In an ideal world I would love to be able to do something like: > > CREATE MATERIALIZED VIEW foo AS (select count(*) from bar) WITH INCREMENTAL UPDATES; > > and have that automatically create both a heap to store the count and another > to store the incremental changes. To do this would require some magic to know > what "incremental changes" means for each aggregate where it's meaningful > though. > > Then it would require some magic in the optimizer to recognize when piece of > the query can be satisfied by a materialized view. > Hehe... +5 if you think it can be done in the next 12 - 24 months. :-) Cheers, mark -- Mark Mielke <mark@mielke.cc>
Tom Lane wrote > "Pavan Deolasee" <pavan.deolasee@gmail.com> writes: > > I am sure this must have been discussed before. > > Indeed. Apparently you didn't find the threads in which the idea of > having transactions enter "delta" entries was discussed? Solves both > the locking and the MVCC problems, at the cost that you need to make > cleanup passes over the counter table fairly often. I can't find any posts that directly address what I was looking for. In my situation I have a small number of concurrent transactions with each transaction running a large number of single rowinserts or deletes. However I'm not after a fast count(*) from table, but more like a fastselect grouping_id, count(*) from my_table group bygrouping_id I initially thought that what you meant by "having transactions enter "delta" entries" was that I have a trigger that wouldcreate a row each time it was called, writing how many records where inserted or deleted. I didn't understand how thiswould be much of an improvement, as each of my rows would contain either +1 or -1. But I just realised you might mean to include a txid row in my table of deltas, and in my trigger insert or update that rowwhere txid = txid_current() (per grouping_id) Is that what is recommended? No locking problems as each transaction is only updating its own rows. Can you clarify the lack of MVCC problems? Do new versions of rows get created if the original version of the row was created in the current transaction? Does this idea apply with the same efficiency in pre 8.3, non-HOT implementations? Any advice on creating or avoiding indexes on the tables in question? I can think of two options for a performing the cleanup passes using current functionality: 1) Using Listen/Notify Issue a notify whenever a new row is inserted for a transaction. They get delivered post transaction commit don't they? Andnot at all on rollback? Have an application listening for them, performing the aggregation & cleanup work. 2) Use a schedule, based on local knowledge of expected number of transactions over time. So I'd rephrase Pavan's suggestion as a request to have post-transaction commit triggers that have access to (at least) thetxid of the transaction that was committed. Suggested syntax is to add the option "TRANSACTION" (or perhaps "COMMIT") to the CREATE TRIGGER statement: CREATE TRIGGER name AFTER INSERT OR UPDATE OR DELETE ON table FOR EACH TRANSACTION EXECUTE PROCEDURE funcname ( arguments); Any of the triggering actions on the specified table ensure that the function is called once if the transaction commits. Requires a new TG_LEVEL. TG_OP could be the first action triggered. Would implementing this be extremely difficult due to transferring information from within the transaction to outside thetransaction? If so, perhaps I'd get the same benefit from having a trigger set up to fire pre-commit (or pre-prepare), and be a part ofthe transaction. Would the locking difficulties be reduced as the lock would not be required till late in the game, and not held for long? Regards, Stephen Denne. Disclaimer: At the Datamail Group we value team commitment, respect, achievement, customer focus, and courage. This email with any attachmentsis confidential and may be subject to legal privilege. If it is not intended for you please advise by reply immediately,destroy it and do not copy, disclose or use it in any way. __________________________________________________________________ This email has been scanned by the DMZGlobal BusinessQuality Electronic Messaging Suite. Please see http://www.dmzglobal.com/services/bqem.htm for details. __________________________________________________________________
"Stephen Denne" <Stephen.Denne@datamail.co.nz> writes: > I initially thought that what you meant by "having transactions enter "delta" entries" was that I have a trigger that wouldcreate a row each time it was called, writing how many records where inserted or deleted. I didn't understand how thiswould be much of an improvement, as each of my rows would contain either +1 or -1. Well, ideally you'd aggregate all the deltas caused by a particular transaction into one entry in the counting table. Whether or not that happens, though, the point of the concept is that some background task aggregates all the deltas from long-gone transactions into just one base row, and then deletes the old delta entries. To get a valid value of COUNT(*), what onlookers must do is SUM() the base row and delta records from all transactions that they can "see" under MVCC rules. The amount of work involved is proportional to the number of recent updates, not the total size of the underlying table. > However I'm not after a fast count(*) from table, but more like a fast > select grouping_id, count(*) from my_table group by grouping_id You could apply the same technique across each group id, though this certainly is getting beyond what any built-in feature might offer. > Can you clarify the lack of MVCC problems? The point there is that the "right answer" actually depends on the observer, since each observer might have a different snapshot and therefore be able to "see" a different set of committed rows in the underlying table. The multiple-delta table handles this automatically, because you can "see" a delta entry if and only if you could "see" the underlying-table changes it represents. > Does this idea apply with the same efficiency in pre 8.3, non-HOT implementations? I didn't claim it was amazingly efficient in any implementation ;-). HOT in particular is nearly useless since most rows in the count table will never be updated, only inserted and eventually deleted. You might get some mileage on the base row, but that'd be about it. The count table will need frequent vacuums as well as frequent aggregation scans. It should beat scanning a large underlying table, but it's hardly gonna be free. regards, tom lane
"Tom Lane" <tgl@sss.pgh.pa.us> writes: > "Stephen Denne" <Stephen.Denne@datamail.co.nz> writes: > >> However I'm not after a fast count(*) from table, but more like a fast >> select grouping_id, count(*) from my_table group by grouping_id > > You could apply the same technique across each group id, though this > certainly is getting beyond what any built-in feature might offer. At that point you're talking about materialized views. Which makes it a whole lot more interesting imho. >> Does this idea apply with the same efficiency in pre 8.3, non-HOT implementations? > > I didn't claim it was amazingly efficient in any implementation ;-). > HOT in particular is nearly useless since most rows in the count > table will never be updated, only inserted and eventually deleted. > You might get some mileage on the base row, but that'd be about it. > The count table will need frequent vacuums as well as frequent > aggregation scans. It might be better not to update this delta table in normal transactional updates. After all the tuples you're deleting are precisely the ones that nobody should be interested in any more. If you locked the table and magically deleted those tuples and updated the master tuple using the global xmin instead of your real xid people would get the same result and you could reclaim the space much much sooner. Locking the table kind of sucks though. And crash recovery would be a problem. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com Ask me about EnterpriseDB's PostGIS support!
(Sorry for the repeat email Tom, I forgot the cc's) Tom Lane wrote: > "Stephen Denne" <Stephen.Denne@datamail.co.nz> writes: > > I initially thought that what you meant by "having > > transactions enter "delta" entries" was that I have a trigger > > that would create a row each time it was called, writing how > > many records where inserted or deleted. I didn't understand > > how this would be much of an improvement, as each of my rows > > would contain either +1 or -1. > > Well, ideally you'd aggregate all the deltas caused by a particular > transaction into one entry in the counting table. Yes, that's what I was attempting to do, but without changing the application code. Using txid_current() can do that, so each of a large number of individual inserts or deletes within a transaction updatesthe same delta row for that transaction. I haven't found any references to this being a solution, and thought it wasworth recording. > > However I'm not after a fast count(*) from table, but more > like a fast > > select grouping_id, count(*) from my_table group by grouping_id > > You could apply the same technique across each group id, though this > certainly is getting beyond what any built-in feature might offer. Agreed. I've tried it out now, and am fairly happy with what I've got. > > Can you clarify the lack of MVCC problems? > > The point there is that the "right answer" actually depends on the > observer, since each observer might have a different snapshot and > therefore be able to "see" a different set of committed rows in the > underlying table. The multiple-delta table handles this > automatically, > because you can "see" a delta entry if and only if you could "see" > the underlying-table changes it represents. > > > Does this idea apply with the same efficiency in pre 8.3, > non-HOT implementations? > > I didn't claim it was amazingly efficient in any implementation ;-). > HOT in particular is nearly useless since most rows in the count > table will never be updated, only inserted and eventually deleted. > You might get some mileage on the base row, but that'd be about it. > The count table will need frequent vacuums as well as frequent > aggregation scans. > > It should beat scanning a large underlying table, but it's hardly > gonna be free. What I was asking about with those questions, is if a single row is inserted in transaction A, and updated 1000 times stillwithin transaction A, then transaction A commits... does a single row version get written, or 1001, 1000 of which areremovable? I tested this with a small summary table freshly vacuum full'ed. 10 transactions, one after the other, each transaction inserted three delta rows, and updates one of them three times, andthe other two five times each. So 3 inserts and 13 updates per transaction. The updates only affect non-indexed fields in rows created in the same transaction. The next vacuum full found 52 removable row versions. I repeated the test, and got 13 removable row versions. I repeated the test again, and got 13 removable row versions. I repeated just one of the ten transactions, 13 removable row versions. All inserts and updates are probably in the one page that has a fair bit of free space. Is it possible to update the HOT code to re-use row versions on the same page if they were created in the same transaction? Conclusion: making use of txid_current(), I can get single delta rows with deltas of 10000, but doing so creates 10000 deadrow versions. Regards, Stephen Denne. Disclaimer: At the Datamail Group we value team commitment, respect, achievement, customer focus, and courage. This email with any attachmentsis confidential and may be subject to legal privilege. If it is not intended for you please advise by reply immediately,destroy it and do not copy, disclose or use it in any way. __________________________________________________________________ This email has been scanned by the DMZGlobal BusinessQuality Electronic Messaging Suite. Please see http://www.dmzglobal.com/dmzmessaging.htm for details. __________________________________________________________________
Tom Lane wrote > "Stephen Denne" <Stephen.Denne@datamail.co.nz> writes: > > What I was asking about with those questions, is if a > > single row is inserted in transaction A, and updated 1000 > > times still within transaction A, then transaction A > > commits... does a single row version get written, or 1001, > > 1000 of which are removable? > > Umm ... AFAICS there is no need for an UPDATE to touch the count table > at all. You'd only need ON INSERT and ON DELETE triggers. I'm not referring to updates of my base table... the single row inserted was referring to the delta row... I'm trying to group together a large number of +1 or -1 deltas into a single delta per transaction. A cut down example: CREATE TABLE document_type_summary_detail ( document_type_id integer NOT NULL, txid bigint NOT NULL DEFAULT 0, documents bigint NOT NULL DEFAULT 0, CONSTRAINT pk_document_type_summaryPRIMARY KEY (document_type_id, txid) ); CREATE OR REPLACE FUNCTION document_count_trig() RETURNS TRIGGER AS $$ DECLARE BEGIN IF TG_OP = 'INSERT' THEN UPDATE document_type_summary_detail set documents=documents+1 where document_type_id= NEW.document_type_id and txid=txid_current(); IF NOT FOUND THEN INSERT INTO document_type_summary_detail(document_type_id,documents,txid) VALUES(NEW.document_type_id,1,txid_current()); END IF; RETURN NEW; ELSIF TG_OP = 'DELETE' THEN UPDATE document_type_summary_detail set documents=documents-1 where document_type_id= OLD.document_type_id and txid=txid_current(); IF NOT FOUND THEN INSERT INTO document_type_summary_detail(document_type_id,documents,txid) VALUES(OLD.document_type_id,-1,txid_current()); END IF; RETURN OLD; END IF; END; $$ LANGUAGE 'plpgsql'; create trigger document_count_trig before insert or delete on document for each row execute procedure document_count_trig(); --one off setup: insert into document_type_summary_detail (document_type_id,txid,documents) select dt.id, 0, count(d.id) from document_type dt left join document d on d.document_type_id = dt.id group by 1,2; --useful view: CREATE OR REPLACE VIEW document_type_summary AS SELECT document_type_id, sum(documents) AS documents FROM document_type_summary_detailGROUP BY document_type_id; --scheduled cleanup: CREATE OR REPLACE FUNCTION document_type_summary_aggregate() RETURNS void AS $$ BEGIN INSERT INTO document_type_summary_detail(document_type_id) select distinct document_type_id from document_type_summary_detailexcept select document_type_id from document_type_summary_detail where txid=0; UPDATE document_type_summary_detailset documents=v.documents from document_type_summary as v where document_type_summary_detail.document_type_id= v.document_type_id and document_type_summary_detail.txid=0 and exists (select1 from document_type_summary_detail ss where ss.document_type_id = document_type_summary_detail.document_type_id andss.txid <> 0); DELETE FROM document_type_summary_detail where txid <> 0; END; $$ LANGUAGE 'plpgsql'; My assumption is that this solves the "locking causes serialisation of transactions" problem as the only rows updated arethose inserted by the same transaction. Regards, Stephen Denne. Disclaimer: At the Datamail Group we value team commitment, respect, achievement, customer focus, and courage. This email with any attachmentsis confidential and may be subject to legal privilege. If it is not intended for you please advise by reply immediately,destroy it and do not copy, disclose or use it in any way. __________________________________________________________________ This email has been scanned by the DMZGlobal BusinessQuality Electronic Messaging Suite. Please see http://www.dmzglobal.com/dmzmessaging.htm for details. __________________________________________________________________
"Stephen Denne" <Stephen.Denne@datamail.co.nz> writes: > Tom Lane wrote >> Umm ... AFAICS there is no need for an UPDATE to touch the count table >> at all. You'd only need ON INSERT and ON DELETE triggers. > I'm not referring to updates of my base table... the single row inserted was referring to the delta row... > I'm trying to group together a large number of +1 or -1 deltas into a single delta per transaction. Oh. You can't realistically do that from the level of individual ON INSERT/DELETE triggers --- as you've found out, you end up with about the same number of dead rows anyway. Collapsing a transaction's changes into a single row would require keeping some intra-transaction state, which is do-able at the C-code level but not from plpgsql. regards, tom lane
"Stephen Denne" <Stephen.Denne@datamail.co.nz> writes: > What I was asking about with those questions, is if a single row is inserted in transaction A, and updated 1000 times stillwithin transaction A, then transaction A commits... does a single row version get written, or 1001, 1000 of which areremovable? Umm ... AFAICS there is no need for an UPDATE to touch the count table at all. You'd only need ON INSERT and ON DELETE triggers. regards, tom lane
Tom Lane wrote: <blockquote cite="mid:20927.1205895835@sss.pgh.pa.us" type="cite"><pre wrap="">"Stephen Denne" <a class="moz-txt-link-rfc2396E"href="mailto:Stephen.Denne@datamail.co.nz"><Stephen.Denne@datamail.co.nz></a> writes:</pre><blockquote type="cite"><pre wrap="">What I was asking about with those questions, is if a single row is insertedin transaction A, and updated 1000 times still within transaction A, then transaction A commits... does a singlerow version get written, or 1001, 1000 of which are removable? </pre></blockquote><pre wrap=""> Umm ... AFAICS there is no need for an UPDATE to touch the count table at all. You'd only need ON INSERT and ON DELETE triggers. </pre></blockquote><br /> This returns to the question of whethercount of the whole table is useful, or whether count of a GROUP BY or WHERE is useful. If GROUP BY or WHERE is useful,then trigger on UPDATE becomes necessary.<br /><br /> What is the direction here? Is it count of the whole table only?(<-- not interesting to me)<br /><br /> Or count of more practical real life examples, which I completely agree withGreg, that this gets into the materialized view realm, and becomes very interesting.<br /><br /> In my current db project,I never count all of the rows in a table. However, I do use count(*) with GROUP BY and WHERE.<br /><br /> Cheers,<br/> mark<br /><br /><pre class="moz-signature" cols="72">-- Mark Mielke <a class="moz-txt-link-rfc2396E" href="mailto:mark@mielke.cc"><mark@mielke.cc></a> </pre>
Mark Mielke wrote > This returns to the question of whether count of the whole table is useful, or whether > count of a GROUP BY or WHERE is useful. > If GROUP BY or WHERE is useful, then trigger on UPDATE becomes necessary. True... for the example I gave I should have had an update trigger on my table. I neglected it as I based the example of a more complex multi-table example from my application, where I know that the valueI'm grouping by doesn't change. > What is the direction here? Is it count of the whole table only? (<-- not interesting to me) > Or count of more practical real life examples, which I completely agree with Greg, > that this gets into the materialized view realm, and becomes very interesting. > In my current db project, I never count all of the rows in a table. However, I do use count(*) with GROUP BY and WHERE. I'm trying to figure out how close I can come to a useful efficient materialized view with current production builds of postgresql, and identifying areas where changes to postgresql could make it easier. Currently I can see three areas of concern: 1) Turning it on My query to initially populate the materialized view table takes 6 hours on my (limited hardware) development system, whichhighlights the problem of when do you turn on the triggers. An outage is one way to guarantee that there are neither missing details nor double counted details. Would turning on the triggers and then running my initial population query in the same transaction work? 2) Table bloat I'm trying to group together a large number of +1 or -1 deltas into a single delta per transaction. This creates as many dead rows as there are updates. This is also a problem with Pavan's suggestion of maintaining a counter table. They can all be HOT updates in 8.3, but they still all create new row versions. Tom says "Collapsing a transaction's changes into a single row would require keeping some intra-transaction state, which is do-able at the C-code level but not from plpgsql." So can I hope that this problem is solvable as a contrib module that will work with at least 8.3? I'd still want to write plpgsql trigger function myself, as I know the aggregation rules, and call contributed functionsto integrate with the collapsing of the transaction's changes into a single row. (Expect I'd need a small numberof rows per transaction, as I need to create delta rows for each value of my grouping field involved in the transaction. 3) How to incorporate the deltas. With my technique, if the transaction rolls back the delta record becomes a dead row, if it commits, the delta is then visibleto transaction started after this time. I need to have a regular process run to sum and remove the deltas, rewriting the summary rows. I'd like to be able to write an after-commit trigger that fires after changes in dependent tables that I identify, whichcan add the transaction's deltas to the summary table. I would want it to effectively be a new, small transaction. I think that if these three areas are addressed, then before even considering writing code to automatically convert any givenview into a materialized view. 2) and 3) could perhaps be implemented with a per transaction map from my_custom_key to a prepared statement and a list ofparameter values. Provide access to the values, making them updateable within the transaction. Have the statement automatically executed oncommit. Pavan also refers to deferred triggers, which has got me thinking about another possible solution: Instead of inserting a delta row, that will be updated a lot of times, create an on commit drop temp table named after thetxid and the grouping value (which is an integer in my case, perhaps hash it if you're grouping by something that doesn'teasily convert to part of a table name), create an after insert initially deferred constraint to call a function which will move the 'at commit' values of the rowsin the temp table to the summary table. The temp table will only have one row inserted, updated many times, then on commit the trigger is fired once, and the temptable is dropped. Does anyone think this will or won't work for some reason? Regards, Stephen Denne. Disclaimer: At the Datamail Group we value team commitment, respect, achievement, customer focus, and courage. This email with any attachmentsis confidential and may be subject to legal privilege. If it is not intended for you please advise by reply immediately,destroy it and do not copy, disclose or use it in any way. __________________________________________________________________ This email has been scanned by the DMZGlobal BusinessQuality Electronic Messaging Suite. Please see http://www.dmzglobal.com/dmzmessaging.htm for details. __________________________________________________________________
On Thu, Mar 20, 2008 at 3:24 AM, Stephen Denne <Stephen.Denne@datamail.co.nz> wrote: > > Pavan also refers to deferred triggers, which has got me thinking about another possible solution: > > Instead of inserting a delta row, that will be updated a lot of times, create an on commit drop temp table named afterthe txid and the grouping value (which is an integer in my case, perhaps hash it if you're grouping by something thatdoesn't easily convert to part of a table name), > create an after insert initially deferred constraint to call a function which will move the 'at commit' values of therows in the temp table to the summary table. > The temp table will only have one row inserted, updated many times, then on commit the trigger is fired once, and thetemp table is dropped. > > Does anyone think this will or won't work for some reason? I think this should work, although you may need to take some extra steps to manage the summary table. Also, I think a single temp table per transaction should suffice. The temp table would have one row per "group by" or "where" condition on which you want to track the count. The corresponding row will be updated as and when the corresponding count changes. You would need INSERT/DELETE/UPDATE triggers to do that. If there are any subtransaction aborts, that will be taken care by MVCC. As you said, a single deferred trigger would then merge the temp table with the summary table. Here we need to be extra careful because serializable transactions may fail to update the same row in the summary table. One solution I can think of is (and I haven't checked the archives, so somebody might have already suggested this before): Each summary table will have one summary row per "group by" or "where" condition (same as temp table). In addition to that, it can have zero or more temporary rows for the conditions. - select summary_rows from summary table for update nowait; - if lock is not available, insert our deltas into the summarytable as a new row. These rows will be deleted as soon as some other transaction gets lock on the summary rowsand merge our deltas with them - if lock is available - merge our deltas with the summary rows - check for othertemporary deltas and merge them with the summary rows and delete those temporary rows I guess we can write the generic triggers as contrib module. What needs to done is to let user specify the tables and the conditions on which they want to track count(*) and then apply those conditions in the generic triggers. Thanks, Pavan -- Pavan Deolasee EnterpriseDB http://www.enterprisedb.com
Pavan Deolasee wrote > On Thu, Mar 20, 2008 at 3:24 AM, Stephen Denne > <Stephen.Denne@datamail.co.nz> wrote: > > > > > Pavan also refers to deferred triggers, which has got me > thinking about another possible solution: > > > > Instead of inserting a delta row, that will be updated a > lot of times, create an on commit drop temp table named after > the txid and the grouping value (which is an integer in my > case, perhaps hash it if you're grouping by something that > doesn't easily convert to part of a table name), > > create an after insert initially deferred constraint to > call a function which will move the 'at commit' values of the > rows in the temp table to the summary table. > > The temp table will only have one row inserted, updated > many times, then on commit the trigger is fired once, and the > temp table is dropped. > > > > Does anyone think this will or won't work for some reason? > > > I think this should work, although you may need to take some > extra steps > to manage the summary table. Also, I think a single temp > table per transaction > should suffice. The temp table would have one row per "group > by" or "where" > condition on which you want to track the count. The > corresponding row will > be updated as and when the corresponding count changes. You would need > INSERT/DELETE/UPDATE triggers to do that. If there are any > subtransaction > aborts, that will be taken care by MVCC. Thanks for that. I had gone ahead and tried out the idea, and it was working 'ok'. Using one table per transaction has the benefit of less temp tables (but the same number of triggers waiting to run). Italso removes the grouping key from the table name. I was using a single table per grouping key, with a single updated row in it. The benefit was simpler queries, and I could create an ON INSERT trigger that would be triggered only once when the temptable was created, and a 'zero' row was inserted, thereby separating the setup of the trigger from the maintenance ofthe delta. I haven't explored the transactional implications of updating vs inserting delta rows in the summary table at the time oftransaction commit. The code below updates the summary table, which I think could lead to a large delay or deadlocks ifthere are other styles of updates on that table (other than on-commit triggers)? I also hadn't considered sub-transactions. Below is a cut-down version of what I tried out. I was summarizing more than just the number of documents shown below, I was storing a sum, and two maximums of timestamps(using the 'greatest' function for aggregating each record). These were extra fields in both the summary tableand the temp tables. This is able to be made more generic by changing get_temp_table_name() to take an additional couple of parameters specifyingthe name of the function to run at commit, and an identifying key (eg 'doc_count' vs 'doc_size_sum'), or perhapsincluding the delta value too, (which looks like it would simplify the triggers on the tables whose changes we wishto summarize, except that it doesn't cater for storing greatest or least aggregates.) I took a wild guess at a way of finding out whether the temp table already exists:not exists(select tablename from pg_catalog.pg_tableswhere tablename=temp_delta_txid_group) Is there a better/safer way? Here's part of the code I've got at the moment (edited here to cut it down to the core example, so it may contain errors): CREATE TABLE doc_type_summary ( document_type_id integer NOT NULL, documents bigint NOT NULL DEFAULT 0, CONSTRAINT pk_doc_type_summaryPRIMARY KEY (document_type_id) ); CREATE OR REPLACE FUNCTION process_delta() RETURNS TRIGGER AS $$ BEGIN EXECUTE 'UPDATE doc_type_summary set documents=doc_type_summary.documents+d.documents from ' || TG_ARGV[0]|| ' as d where document_type_id = ' || TG_ARGV[1]; RETURN NULL; END; $$ LANGUAGE 'plpgsql'; CREATE OR REPLACE FUNCTION get_temp_table_name(document_type_id integer) RETURNS text AS $$ DECLARE temp_delta_txid_group text; BEGIN temp_delta_txid_group := 'temp_delta_' || txid_current() || '_' || document_type_id; IF not exists(select tablenamefrom pg_catalog.pg_tables where tablename=temp_delta_txid_group) THEN EXECUTE 'CREATE TEMP TABLE ' || temp_delta_txid_group|| '(documents bigint NOT NULL DEFAULT 0) ON COMMIT DROP'; EXECUTE 'CREATE CONSTRAINT TRIGGER '|| temp_delta_txid_group || '_trig AFTER INSERT ON ' || temp_delta_txid_group || ' DEFERRABLE INITIALLY DEFERRED FOR EACHROW EXECUTE PROCEDURE process_delta ("' || temp_delta_txid_group || '", ' || document_type_id || ')'; EXECUTE 'INSERTINTO ' || temp_delta_txid_group || ' DEFAULT VALUES'; END IF; RETURN temp_delta_txid_group; END; $$ LANGUAGE 'plpgsql'; CREATE OR REPLACE FUNCTION document_count_trig() RETURNS TRIGGER AS $$ DECLARE temp_delta_txid_group text; BEGIN IF TG_OP = 'INSERT' THEN temp_delta_txid_group := get_temp_table_name(NEW.document_type_id); EXECUTE 'UPDATE' || temp_delta_txid_group || ' set documents=documents+1'; RETURN NEW; ELSIF TG_OP = 'DELETE' THEN temp_delta_txid_group:= get_temp_table_name(OLD.document_type_id); EXECUTE 'UPDATE ' || temp_delta_txid_group || ' setdocuments=documents-1'; RETURN OLD; END IF; END; $$ LANGUAGE 'plpgsql'; -- Regards, Stephen Denne. Disclaimer: At the Datamail Group we value team commitment, respect, achievement, customer focus, and courage. This email with any attachmentsis confidential and may be subject to legal privilege. If it is not intended for you please advise by reply immediately,destroy it and do not copy, disclose or use it in any way. __________________________________________________________________ This email has been scanned by the DMZGlobal BusinessQuality Electronic Messaging Suite. Please see http://www.dmzglobal.com/dmzmessaging.htm for details. __________________________________________________________________
(There is a possible performance bug mentioned at the end of the email, the rest is further discussion regarding materialisedviews) I wrote > Pavan Deolasee wrote > > On Thu, Mar 20, 2008 at 3:24 AM, Stephen Denne > > <Stephen.Denne@datamail.co.nz> wrote: > > > > > > > > Pavan also refers to deferred triggers, which has got me > > thinking about another possible solution: > > > > > > Instead of inserting a delta row, that will be updated a > > lot of times, create an on commit drop temp table named after > > the txid and the grouping value (which is an integer in my > > case, perhaps hash it if you're grouping by something that > > doesn't easily convert to part of a table name), > > > create an after insert initially deferred constraint to > > call a function which will move the 'at commit' values of the > > rows in the temp table to the summary table. > > > The temp table will only have one row inserted, updated > > many times, then on commit the trigger is fired once, and the > > temp table is dropped. > > > > > > Does anyone think this will or won't work for some reason? > > > > > > I think this should work, although you may need to take some > > extra steps > > to manage the summary table. Also, I think a single temp > > table per transaction > > should suffice. The temp table would have one row per "group > > by" or "where" > > condition on which you want to track the count. The > > corresponding row will > > be updated as and when the corresponding count changes. You > would need > > INSERT/DELETE/UPDATE triggers to do that. If there are any > > subtransaction > > aborts, that will be taken care by MVCC. > > Thanks for that. I had gone ahead and tried out the idea, and > it was working 'ok'. > > Using one table per transaction has the benefit of less temp > tables (but the same number of triggers waiting to run). It > also removes the grouping key from the table name. > > I was using a single table per grouping key, with a single > updated row in it. > The benefit was simpler queries, and I could create an ON > INSERT trigger that would be triggered only once when the > temp table was created, and a 'zero' row was inserted, > thereby separating the setup of the trigger from the > maintenance of the delta. One temp table per grouping key would presumably allocate at least one disk page per grouping key. This might result in pretty poor performance. Though if the number of updates per grouping key is large, HOT would have plentyof room on the page to write new row versions. Does creation & dropping of temp tables result in system catalog bloat? > I took a wild guess at a way of finding out whether the temp > table already exists: > not exists(select tablename from pg_catalog.pg_tables > where tablename=temp_delta_txid_group) > Is there a better/safer way? Answering my own question: There is a better way to do what I was doing, (not sure about a better way to check existence of a temp table though)... A custom variable class can be set up and used to record whether the transaction in question has been set up. (Thanks toAndreas Kretschmer for pointing those out in another thread on -general) Alter this setting within postgresql.conf to add 'mv': custom_variable_classes = 'mv' Add this setting to postgresql.conf: mv.initialized = 'false' Then only set it to true local to the transaction. Usage within a trigger: IF NOT (current_setting('mv.initialized')::boolean) THEN -- trigger a deferred constraint function: INSERT INTOmv_txid_doc_type_summary VALUES (txid_current()); PERFORM set_config('mv.initialized', 'true', true); END IF; The custom variable class can also be used to aggregate the deltas within a transaction, though there are some minor difficulties: 1) They only store text 2) You can only find out that a value has not been set by catching an exception 3) There is no way to list the settings. The main benefit is that changing a variable's setting does not write a new row version. Creating new transactionally scoped variables seems to take around 150 to 200 bytes of the process ram per variable, (dependingon the size of the value stored). The time to create them for the first time for a connection seems to vary a bit, between 200 and 1000 per minute. No IO isbeing performed, CPU is at 100% These statistics gathered when trying variations of this query: select count(*) from (select set_config('mv.test11.' || s1.s1, s2.s2::text, true) from (select s1 from generate_series(1,1000)as s1) as s1, (select s2 from generate_series(1,1000) as s2) as s2) as s3; Which sets each one of 1000 variables to 1000 different values. This can take a few minutes, but strangely a second connectionstarting the same query just after the first one, can finish in seconds. Regards, Stephen Denne. Disclaimer: At the Datamail Group we value team commitment, respect, achievement, customer focus, and courage. This email with any attachmentsis confidential and may be subject to legal privilege. If it is not intended for you please advise by reply immediately,destroy it and do not copy, disclose or use it in any way. __________________________________________________________________ This email has been scanned by the DMZGlobal BusinessQuality Electronic Messaging Suite. Please see http://www.dmzglobal.com/dmzmessaging.htm for details. __________________________________________________________________
"Stephen Denne" <Stephen.Denne@datamail.co.nz> writes: > The custom variable class can also be used to aggregate the deltas within a transaction, though there are some minor difficulties: > 1) They only store text > 2) You can only find out that a value has not been set by catching an exception > 3) There is no way to list the settings. As for 2) and 3), can't you look into the pg_settings view? > The time to create them for the first time for a connection seems to vary a bit, between 200 and 1000 per minute. Yeah, that path is not optimized at all because it was never considered performance-critical. Updating an existing variable should be cheaper. regards, tom lane
<div dir="ltr" id="idOWAReplyText1271"><div dir="ltr"><font color="#000000" face="Arial" size="2">(apologies for the emailformat & top posting, I've had to temporarily switch to using a web interface from home that doesn't seem to knowwhat plain text is)</font></div><div dir="ltr"><font face="Arial" size="2"></font> </div><div dir="ltr"><font face="Arial"size="2">pg_settings view doesn't contain custom variables created on the fly, (nor, from memory, ones definedin postgresql.conf. I'm not able to check and confirm that at the moment).</font></div><div dir="ltr"><font face="Arial"size="2">Fixing that would satisfy 2 & 3 nicely.</font></div><div dir="ltr"><font face="Arial" size="2">Thedocs on them say the ones in postgresql.conf are to be used by modules when they initialize, as the values touse when setting up actual real server variables, (able to store more types than just text), which I presume would appearin pg_settings.</font></div><div dir="ltr"><font face="Arial" size="2"></font> </div><div dir="ltr"><font face="Arial"size="2">Updating existing variables is much quicker, over 100000 updates per second.</font></div><div dir="ltr"><fontface="Arial" size="2"></font> </div><div dir="ltr"><font face="Arial" size="2">Regards,</font></div><div dir="ltr"><fontface="Arial" size="2">Stephen Denne.</font></div></div><div dir="ltr"><br /></div><div dir="ltr"><hr tabindex="-1"/></div><div dir="ltr"><font face="Tahoma" size="2"><b>From:</b> Tom Lane [mailto:tgl@sss.pgh.pa.us]<br /><b>Sent:</b>Wed 16/04/2008 1:55 p.m.<br /></font><font size="2"></font></div><div dir="ltr"><font size="2">"Stephen Denne"<Stephen.Denne@datamail.co.nz> writes:<br />> The custom variable class can also be used to aggregate thedeltas within a transaction, though there are some minor difficulties:<br /><br />> 1) They only store text<br />>2) You can only find out that a value has not been set by catching an exception<br />> 3) There is no way to listthe settings.<br /><br />As for 2) and 3), can't you look into the pg_settings view?<br /><br />> The time to createthem for the first time for a connection seems to<br />vary a bit, between 200 and 1000 per minute.<br /><br />Yeah,that path is not optimized at all because it was never considered<br />performance-critical. Updating an existingvariable should be cheaper.<br /><br /> regards, tom lane<br /></font></div><pre> Disclaimer: At the Datamail Group we value team commitment, respect, achievement, customer focus, and courage. This email with any attachmentsis confidential and may be subject to legal privilege. If it is not intended for you please advise by reply immediately,destroy it and do not copy, disclose or use it in any way. __________________________________________________________________ This email has been scanned by the DMZGlobal BusinessQuality Electronic Messaging Suite. Please see http://www.dmzglobal.com/dmzmessaging.htm for details. __________________________________________________________________ </pre>
"Stephen Denne" <Stephen.Denne@datamail.co.nz> writes: > From: Tom Lane [mailto:tgl@sss.pgh.pa.us] >> As for 2) and 3), can't you look into the pg_settings view? > pg_settings view doesn't contain custom variables created on the fly, Really? [ pokes around ... ] Hm, you're right, because add_placeholder_variable() sets the GUC_NO_SHOW_ALL flag, and in this usage it'll never be cleared. I wonder if we should change that. The whole thing is a bit of an abuse of what the mechanism was intended for, and so I'm not sure we should rejigger GUC's behavior to make it more pleasant, but on the other hand if we're not ready to provide a better substitute ... regards, tom lane
Tom Lane wrote: > "Stephen Denne" <Stephen.Denne@datamail.co.nz> writes: > > From: Tom Lane [mailto:tgl@sss.pgh.pa.us] > >> As for 2) and 3), can't you look into the pg_settings view? > > > pg_settings view doesn't contain custom variables created on the > > fly, > > Really? [ pokes around ... ] Hm, you're right, because > add_placeholder_variable() sets the GUC_NO_SHOW_ALL flag, and in this > usage it'll never be cleared. I wonder if we should change that. > > The whole thing is a bit of an abuse of what the mechanism was > intended for, and so I'm not sure we should rejigger GUC's behavior > to make it more pleasant, but on the other hand if we're not ready to > provide a better substitute ... While I agree with that part, is there any actual *reason* why we shouldn't have the custom variables included in pg_settings? ISTM that it would be usable in cases that aren't an abuse as well... //Magnus
Magnus Hagander <magnus@hagander.net> writes: > Tom Lane wrote: >> Really? [ pokes around ... ] Hm, you're right, because >> add_placeholder_variable() sets the GUC_NO_SHOW_ALL flag, and in this >> usage it'll never be cleared. I wonder if we should change that. >> >> The whole thing is a bit of an abuse of what the mechanism was >> intended for, and so I'm not sure we should rejigger GUC's behavior >> to make it more pleasant, but on the other hand if we're not ready to >> provide a better substitute ... > While I agree with that part, is there any actual *reason* why we > shouldn't have the custom variables included in pg_settings? IIRC, the motivation for doing that was to not expose a completely bogus set of attributes for a variable whose defining C-module hadn't been loaded yet. I thought about this in the shower just now, and ISTM that if we want to turn this into an actual feature rather than a kluge, there needs to be some sort of "define variable" command that sets up a custom variable and specifies its type (and whatever other properties seem worth setting). IOW expose the DefineCustomFooVariable functions to SQL users. I'd be a bit inclined to restrict the namespace that can be set up that way, eg allow only "local." or "session." as the prefix. Maybe that's just being too anal, but we could guarantee not to introduce colliding built-in GUCs in future releases, whereas people trying to define variables with any random name would definitely be at risk. Comments? regards, tom lane
Tom Lane wrote: > Magnus Hagander <magnus@hagander.net> writes: >> Tom Lane wrote: >>> Really? [ pokes around ... ] Hm, you're right, because >>> add_placeholder_variable() sets the GUC_NO_SHOW_ALL flag, and in this >>> usage it'll never be cleared. I wonder if we should change that. >>> >>> The whole thing is a bit of an abuse of what the mechanism was >>> intended for, and so I'm not sure we should rejigger GUC's behavior >>> to make it more pleasant, but on the other hand if we're not ready to >>> provide a better substitute ... > >> While I agree with that part, is there any actual *reason* why we >> shouldn't have the custom variables included in pg_settings? I've needed it myself before -- I think it is a good idea. > IIRC, the motivation for doing that was to not expose a completely bogus > set of attributes for a variable whose defining C-module hadn't been > loaded yet. > > I thought about this in the shower just now, and ISTM that if we want to > turn this into an actual feature rather than a kluge, there needs to be > some sort of "define variable" command that sets up a custom variable > and specifies its type (and whatever other properties seem worth > setting). IOW expose the DefineCustomFooVariable functions to SQL users. > > I'd be a bit inclined to restrict the namespace that can be set up that > way, eg allow only "local." or "session." as the prefix. Maybe > that's just being too anal, but we could guarantee not to introduce > colliding built-in GUCs in future releases, whereas people trying to > define variables with any random name would definitely be at risk. > > Comments? Would it make sense to have built-in GUCs belong to "pg_catalog." and user defined GUCs default to "public."? Joe
Tom Lane wrote > "Stephen Denne" <Stephen.Denne@datamail.co.nz> writes: > > From: Tom Lane [mailto:tgl@sss.pgh.pa.us] > >> As for 2) and 3), can't you look into the pg_settings view? > > > pg_settings view doesn't contain custom variables created > on the fly, > > Really? [ pokes around ... ] Hm, you're right, because > add_placeholder_variable() sets the GUC_NO_SHOW_ALL flag, and in this > usage it'll never be cleared. I wonder if we should change that. > > The whole thing is a bit of an abuse of what the mechanism > was intended > for, and so I'm not sure we should rejigger GUC's behavior to make it > more pleasant, but on the other hand if we're not ready to provide a > better substitute ... In my experiments with materialized views, I identified these problems as "minor" difficulties. Resolving them would allowfurther abuse ;) Aside: It is currently more cumbersome to get a function to run, if needed, at commit. Ideal solution would be somethinglike "EXECUTE ON COMMIT my_function()" or maybe "SAVEPOINT my_name ON COMMIT my_function()", but these suggestionsare made without investigating what provision the SQL standard has made to address this need. My use of mv.initialized means I can create variables when initializing a transaction, and afterwards know that they havevalues, but what I can't easily do is use those variables to identify which grouping keys have been updated. To do thatI select & conditionally insert to a table for that explicit purpose. If select doesn't find the key, then I create variablesnamed after that key, with zero values. Performance and efficiency-wise.... which would be better way of keeping track of grouping keys used in a transaction?: 1) Create a temp table, on commit drop, for the transaction, storing grouping keys affected. 2) Use a persistent table, storing txid and grouping keys affected, deleting txid rows at commit. 3) Use pg_settings, storing tx local grouping keys affected, existence check via catching an exception, listing via checkingexistence for all possible values (a possibility in my scenario). Speed is my priority, low disk IO is a probable means to that end, which is why I investigated using variables. Basically, (3) isn't a viable option, so what are the trade-offs between creating a temporary table per transaction, or usingrows in a permanent table with a txid column? Here are some more plpgsql code fragments: mv := 'mv.' || view_name || '.' || key_value || '.'; When recording a grouping key as being affected by the transaction, create the variables with zeroes: PERFORM set_config(mv||'documents', '0', true); PERFORM set_config(mv||'last_addition', 'null', true); In an insert trigger: PERFORM set_config(mv||'documents', (current_setting(mv||'documents')::bigint + 1)::text, true); PERFORM set_config(mv||'last_addition',now()::text, true); In the defferred till commit trigger: UPDATE materialized_view set documents=documents+current_setting(mv||'documents')::bigint, last_addition=greatest(last_addition,nullif(current_setting(mv||'last_addition'),'null')::timestamp) where group_id = key_values.key_value; Regards, Stephen Denne. Disclaimer: At the Datamail Group we value team commitment, respect, achievement, customer focus, and courage. This email with any attachmentsis confidential and may be subject to legal privilege. If it is not intended for you please advise by reply immediately,destroy it and do not copy, disclose or use it in any way. __________________________________________________________________ This email has been scanned by the DMZGlobal BusinessQuality Electronic Messaging Suite. Please see http://www.dmzglobal.com/dmzmessaging.htm for details. __________________________________________________________________
"Stephen Denne" <Stephen.Denne@datamail.co.nz> writes: > Aside: It is currently more cumbersome to get a function to run, if needed, at commit. Ideal solution would be somethinglike "EXECUTE ON COMMIT my_function()" or maybe "SAVEPOINT my_name ON COMMIT my_function()", but these suggestionsare made without investigating what provision the SQL standard has made to address this need. There is none, and the reason seems pretty obvious to me. What if your "on commit" function fails? Or if you have two, and the second one fails? Or even more to the point, the second one does something that the first one expected to see the effects of? Transaction commit is an exceedingly subtle and carefully structured thing. Throwing random user-defined code into it ain't gonna happen. regards, tom lane
>> The whole thing is a bit of an abuse of what the mechanism >> was intended >> for, and so I'm not sure we should rejigger GUC's behavior to make it >> more pleasant, but on the other hand if we're not ready to provide a >> better substitute ... > > In my experiments with materialized views, I identified these problems > as "minor" difficulties. Resolving them would allow further abuse ;) Let's try this quick & dirty implementation of a local count-delta cache using a local in-memory hashtable (ie. {}). Writing the results to stable storage in an ON COMMIT trigger is left as an exercise to the reader ;) Performance isn't that bad, calling the trigger takes about 50 us. Oldskool implementation with a table is at the end, it's about 10x slower. Example : INSERT INTO victim1 (key) VALUES ('one'),('two'),('two'); INSERT 0 3 Temps : 1,320 ms test=# SELECT * FROM get_count(); key | cnt -----+----- two | 2 one | 1 CREATE OR REPLACE FUNCTION clear_count( ) RETURNS VOID AS $$ GD.clear() $$ LANGUAGE plpythonu; CREATE OR REPLACE FUNCTION update_count( key TEXT, delta INTEGER ) RETURNS INTEGER AS $$ if key in GD: GD[key] += delta else: GD[key] = delta return GD[key] $$ LANGUAGE plpythonu; CREATE TYPE count_data AS ( key TEXT, cnt INTEGER ); CREATE OR REPLACE FUNCTION get_count( ) RETURNS SETOF count_data AS $$ return GD.iteritems() $$ LANGUAGE plpythonu; CREATE TABLE victim( id SERIAL PRIMARY KEY, key TEXT NOT NULL ); INSERT INTO victim (key) SELECT (random() * 300)::INTEGER::TEXT FROM generate_series( 1,100000 ); CREATE TABLE victim1( id SERIAL PRIMARY KEY, key TEXT NOT NULL ); \timing INSERT INTO victim1 SELECT * FROM victim; TRUNCATE TABLE victim1; SELECT clear_count(); INSERT INTO victim1 SELECT * FROM victim RETURNING update_count( key, 1 ); SELECT * FROM get_count(); TRUNCATE TABLE victim1; CREATE OR REPLACE FUNCTION counter_trigger_f() RETURNS TRIGGER LANGUAGE plpgsql AS $$ DECLARE BEGIN IF TG_OP = 'INSERT' THEN PERFORM update_count( NEW.key, 1 ); RETURN NEW; ELSEIF TG_OP = 'UPDATE'THEN -- update topic IF NEW.key != OLD.key THEN PERFORM update_count( OLD.key, -1 ), update_count(NEW.key, 1 ); END IF; RETURN NEW; ELSE -- DELETE PERFORM update_count( OLD.key, -1 ); RETURN OLD; END IF; END; $$; CREATE TRIGGER count_trigger BEFORE INSERT OR UPDATE OR DELETE ON victim1 FOR EACH ROW EXECUTE PROCEDURE counter_trigger_f(); SELECT clear_count(); INSERT INTO victim1 SELECT * FROM victim; SELECT * FROM get_count(); SELECT clear_count(); TRUNCATE TABLE victim1; INSERT INTO victim1 (key) VALUES ('one'),('two'),('two'); SELECT * FROM get_count(); DELETE FROM victim1 WHERE key='two'; SELECT * FROM get_count(); UPDATE victim1 SET key='three' WHERE key='one'; SELECT * FROM get_count(); DELETE FROM victim1; SELECT * FROM get_count(); CREATE TABLE counts( key TEXT PRIMARY KEY, total INTEGER NOT NULL DEFAULT 0 ); CREATE OR REPLACE FUNCTION table_counter_trigger_f() RETURNS TRIGGER LANGUAGE plpgsql AS $$ DECLARE BEGIN IF TG_OP = 'INSERT' THEN UPDATE counts SET total=total+1 WHERE key=NEW.key; IF NOT FOUND THEN INSERTINTO counts (key,total) VALUES (NEW.key,1); END IF; RETURN NEW; ELSEIF TG_OP = 'UPDATE' THEN -- update topic IF NEW.key != OLD.keyTHEN UPDATE counts SET total=total-1 WHERE key=OLD.key; UPDATE counts SET total=total+1 WHEREkey=NEW.key; IF NOT FOUND THEN INSERT INTO counts (key,total) VALUES (NEW.key,1); END IF; END IF; RETURN NEW; ELSE -- DELETE UPDATE counts SET total=total-1 WHEREkey=OLD.key; RETURN OLD; END IF; END; $$; CREATE TABLE victim2( id SERIAL PRIMARY KEY, key TEXT NOT NULL ); CREATE TRIGGER table_count_trigger BEFORE INSERT OR UPDATE OR DELETE ON victim2 FOR EACH ROW EXECUTE PROCEDURE table_counter_trigger_f(); SELECT * FROM counts; TRUNCATE TABLE victim2; INSERT INTO victim2 SELECT * FROM victim;
Tom Lane wrote > Transaction commit is an exceedingly subtle and carefully structured > thing. Throwing random user-defined code into it ain't gonna happen. Deferred constraint triggers currently run random user-defined code. This'll do me. Regards, Stephen Denne. Disclaimer: At the Datamail Group we value team commitment, respect, achievement, customer focus, and courage. This email with any attachmentsis confidential and may be subject to legal privilege. If it is not intended for you please advise by reply immediately,destroy it and do not copy, disclose or use it in any way. __________________________________________________________________ This email has been scanned by the DMZGlobal BusinessQuality Electronic Messaging Suite. Please see http://www.dmzglobal.com/dmzmessaging.htm for details. __________________________________________________________________
PFC wrote: > Let's try this quick & dirty implementation of a local > count-delta cache > using a local in-memory hashtable (ie. {}). > CREATE OR REPLACE FUNCTION update_count( key TEXT, delta INTEGER ) > RETURNS INTEGER > AS $$ > if key in GD: > GD[key] += delta > else: > GD[key] = delta > return GD[key] > $$ LANGUAGE plpythonu; Thanks for the code, this seems to be very much what I was looking for. I don't know plpythonu (nor python), just read a few docs now: "The global dictionary SD is available to store data between function calls. This variable is private static data. The globaldictionary GD is public data, available to all Python functions within a session. Use with care." Does session == transaction or connection? I don't understand the difference between SD and GD, private and public. Where are the context boundaries? Regards, Stephen Denne. Disclaimer: At the Datamail Group we value team commitment, respect, achievement, customer focus, and courage. This email with any attachmentsis confidential and may be subject to legal privilege. If it is not intended for you please advise by reply immediately,destroy it and do not copy, disclose or use it in any way. __________________________________________________________________ This email has been scanned by the DMZGlobal BusinessQuality Electronic Messaging Suite. Please see http://www.dmzglobal.com/dmzmessaging.htm for details. __________________________________________________________________
On Thu, 17 Apr 2008 02:48:37 +0200, Stephen Denne <Stephen.Denne@datamail.co.nz> wrote: > PFC wrote: >> Let's try this quick & dirty implementation of a local >> count-delta cache >> using a local in-memory hashtable (ie. {}). > >> CREATE OR REPLACE FUNCTION update_count( key TEXT, delta INTEGER ) >> RETURNS INTEGER >> AS $$ >> if key in GD: >> GD[key] += delta >> else: >> GD[key] = delta >> return GD[key] >> $$ LANGUAGE plpythonu; > > Thanks for the code, this seems to be very much what I was looking for. > > I don't know plpythonu (nor python), just read a few docs now: Learn Python, it is a really useful language ;) > "The global dictionary SD is available to store data between function > calls. This variable is private static data. The global dictionary GD is > public data, available to all Python functions within a session. Use > with care." > > Does session == transaction or connection? > I don't understand the difference between SD and GD, private and public. > Where are the context boundaries? There is no sharing between processes, so- both SD and GD are limited to the current session (connection, postgres process), no shared memory is involved- GD is global between all python functions (global)- SD is specific to each pythonfunction (static) The big gotcha is that these are all non-transactional : if you rollback, GD and SD stay the same, and when you issue a query, you can assume the state of SD and GD is random (due to previous queries) unless you initialize them to a known value.
Joe Conway <mail@joeconway.com> writes: > Tom Lane wrote: >> I thought about this in the shower just now, and ISTM that if we want to >> turn this into an actual feature rather than a kluge, there needs to be >> some sort of "define variable" command that sets up a custom variable >> and specifies its type (and whatever other properties seem worth >> setting). IOW expose the DefineCustomFooVariable functions to SQL users. >> >> I'd be a bit inclined to restrict the namespace that can be set up that >> way, eg allow only "local." or "session." as the prefix. Maybe >> that's just being too anal, but we could guarantee not to introduce >> colliding built-in GUCs in future releases, whereas people trying to >> define variables with any random name would definitely be at risk. > Would it make sense to have built-in GUCs belong to "pg_catalog." and > user defined GUCs default to "public."? [ after a bit of reflection... ] I don't think that we want to tie GUC names to schemas, especially not schemas that might not be there (remember public is droppable). The existing scheme for qualified GUC names considers that the prefix is the name of a loadable module, which isn't typically tied to any particular schema. I kinda like "session" as the prefix since it helps remind people that these things will have session lifespan. OTOH, there's a possibility for confusion with the SET SESSION syntax ("hm, did you mean SET SESSION foo = ... or SET session.foo = ...?"). "local" has got the same issue. Maybe "temp"? regards, tom lane
PFC wrote > <Stephen.Denne@datamail.co.nz> wrote: > > I don't know plpythonu (nor python), just read a few docs now: > > Learn Python, it is a really useful language ;) My wife has a snake phobia, besides, I've just started learning Scala. > There is no sharing between processes, so > - both SD and GD are limited to the current session > (connection, postgres > process), no shared memory is involved > - GD is global between all python functions (global) > - SD is specific to each python function (static) Thanks. > The big gotcha is that these are all non-transactional > : if you rollback, > GD and SD stay the same, and when you issue a query, you can > assume the > state of SD and GD is random (due to previous queries) unless you > initialize them to a known value. Using txid_current() as a key should alleviate that. Regards, Stephen Denne. Disclaimer: At the Datamail Group we value team commitment, respect, achievement, customer focus, and courage. This email with any attachmentsis confidential and may be subject to legal privilege. If it is not intended for you please advise by reply immediately,destroy it and do not copy, disclose or use it in any way. __________________________________________________________________ This email has been scanned by the DMZGlobal BusinessQuality Electronic Messaging Suite. Please see http://www.dmzglobal.com/dmzmessaging.htm for details. __________________________________________________________________
> > The big gotcha is that these are all non-transactional > > : if you rollback, > > GD and SD stay the same, and when you issue a query, you can > > assume the > > state of SD and GD is random (due to previous queries) unless you > > initialize them to a known value. > > Using txid_current() as a key should alleviate that. No... hold on, it is per session, and a session can't have two or more transactions active at once can it? (Though I hear of things called sub-transactions) So the problem is that other functions may be using GD themselves, and your own code is at the mercy of the other functions.Conversely you shouldn't clear GD, as some other function may be using it. So you're better off using a single function for everything, and using SD within it? There isn't any way of telling whether the function is being called for the first time in a transaction. You don't know whento clear it. Regards, Stephen Denne. Disclaimer: At the Datamail Group we value team commitment, respect, achievement, customer focus, and courage. This email with any attachmentsis confidential and may be subject to legal privilege. If it is not intended for you please advise by reply immediately,destroy it and do not copy, disclose or use it in any way. __________________________________________________________________ This email has been scanned by the DMZGlobal BusinessQuality Electronic Messaging Suite. Please see http://www.dmzglobal.com/dmzmessaging.htm for details. __________________________________________________________________
> My wife has a snake phobia, besides, I've just started learning Scala. Just had a look at Scala, it looks nice. Slightly Lispish (like all good languages)... > txid_current() > No... hold on, it is per session, and a session can't have two or more > transactions active at once can it? It could be used to detect rollback. > So the problem is that other functions may be using GD themselves, and > your own code is at the mercy of the other functions. Conversely you > shouldn't clear GD, as some other function may be using it. Exactly. > So you're better off using a single function for everything, and using > SD within it? Since the purpose is to store counts for rows matching a certain criteria in a set of tables, you could build a hashtable of hashtables, like : GD[table name][criteria name][criteria value] = count This would add complexity, about half a line of code. But you'd have to create lots of plpgsql trigger functions to wrap it. > There isn't any way of telling whether the function is being called for > the first time in a transaction. You don't know when to clear it. The first time in a session, GD will be empty.Clearing it at the start of a transaction would not be useful (clearing it at ROLLBACK would).It is updating the "real" summary table with the contents of this hash that is the problem, also. So, basically, if you connect, do one insert, and disconnect, this would be useless.But, if you do a zillion inserts, caching the counts deltas in RAM would be faster.And if you use persistent connections, you could update the counts in the real table only every N minutes, for instance, but this would need some complicity from the backend. > Regards, > Stephen Denne. > > Disclaimer: > At the Datamail Group we value team commitment, respect, achievement, > customer focus, and courage. This email with any attachments is > confidential and may be subject to legal privilege. If it is not > intended for you please advise by reply immediately, destroy it and do > not copy, disclose or use it in any way. > __________________________________________________________________ > This email has been scanned by the DMZGlobal Business Quality > Electronic Messaging Suite. > Please see http://www.dmzglobal.com/dmzmessaging.htm for details. > __________________________________________________________________ > >
Added to TODO: * Allow custom variables to appear in pg_settings() --------------------------------------------------------------------------- Tom Lane wrote: > Magnus Hagander <magnus@hagander.net> writes: > > Tom Lane wrote: > >> Really? [ pokes around ... ] Hm, you're right, because > >> add_placeholder_variable() sets the GUC_NO_SHOW_ALL flag, and in this > >> usage it'll never be cleared. I wonder if we should change that. > >> > >> The whole thing is a bit of an abuse of what the mechanism was > >> intended for, and so I'm not sure we should rejigger GUC's behavior > >> to make it more pleasant, but on the other hand if we're not ready to > >> provide a better substitute ... > > > While I agree with that part, is there any actual *reason* why we > > shouldn't have the custom variables included in pg_settings? > > IIRC, the motivation for doing that was to not expose a completely bogus > set of attributes for a variable whose defining C-module hadn't been > loaded yet. > > I thought about this in the shower just now, and ISTM that if we want to > turn this into an actual feature rather than a kluge, there needs to be > some sort of "define variable" command that sets up a custom variable > and specifies its type (and whatever other properties seem worth > setting). IOW expose the DefineCustomFooVariable functions to SQL users. > > I'd be a bit inclined to restrict the namespace that can be set up that > way, eg allow only "local." or "session." as the prefix. Maybe > that's just being too anal, but we could guarantee not to introduce > colliding built-in GUCs in future releases, whereas people trying to > define variables with any random name would definitely be at risk. > > Comments? > > regards, tom lane > > -- > Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-hackers -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + If your life is a hard drive, Christ can be your backup. +