Re: count(*) performance improvement ideas - Mailing list pgsql-hackers

From: Stephen Denne
Subject: Re: count(*) performance improvement ideas
Msg-id: F0238EBA67824444BC1CB4700960CB48051D5FCC@dmpeints002.isotach.com
In response to: Re: count(*) performance improvement ideas (Tom Lane <tgl@sss.pgh.pa.us>)
Responses: Re: count(*) performance improvement ideas (Tom Lane <tgl@sss.pgh.pa.us>)
           Re: count(*) performance improvement ideas (PFC <lists@peufeu.com>)
List: pgsql-hackers
Tom Lane wrote:
> "Stephen Denne" <Stephen.Denne@datamail.co.nz> writes:
> > From: Tom Lane [mailto:tgl@sss.pgh.pa.us]
> >> As for 2) and 3), can't you look into the pg_settings view?
>
> > pg_settings view doesn't contain custom variables created
> on the fly,
>
> Really?  [ pokes around ... ]  Hm, you're right, because
> add_placeholder_variable() sets the GUC_NO_SHOW_ALL flag, and in this
> usage it'll never be cleared.  I wonder if we should change that.
>
> The whole thing is a bit of an abuse of what the mechanism
> was intended
> for, and so I'm not sure we should rejigger GUC's behavior to make it
> more pleasant, but on the other hand if we're not ready to provide a
> better substitute ...

In my experiments with materialized views, I identified these problems as "minor" difficulties. Resolving them would allow further abuse ;)

Aside: it is currently rather cumbersome to get a function to run at commit when needed. An ideal solution would be something like "EXECUTE ON COMMIT my_function()" or perhaps "SAVEPOINT my_name ON COMMIT my_function()", though these suggestions are made without investigating what provision the SQL standard has made to address this need.
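For context, the cumbersome route available today is a constraint trigger deferred until commit, roughly like this sketch (the table and function names here are hypothetical):

  -- Sketch of the current workaround: a deferred constraint trigger.
  CREATE FUNCTION my_function() RETURNS trigger AS $$
  BEGIN
      -- work that should happen at commit time goes here
      RETURN NULL;
  END;
  $$ LANGUAGE plpgsql;

  CREATE CONSTRAINT TRIGGER run_my_function_at_commit
      AFTER INSERT ON some_table
      DEFERRABLE INITIALLY DEFERRED
      FOR EACH ROW EXECUTE PROCEDURE my_function();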

My use of mv.initialized means I can create variables when initializing a transaction and afterwards know that they have values, but what I can't easily do is use those variables to identify which grouping keys have been updated. To do that, I select from, and conditionally insert into, a table kept for that explicit purpose. If the select doesn't find the key, I create variables named after that key, with zero values.
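That check-and-create step looks roughly like this (a sketch only; the affected_keys table and its columns are hypothetical names):

  -- Sketch of the select & conditional insert per grouping key.
  PERFORM 1 FROM affected_keys
      WHERE txid = txid_current() AND key_value = NEW.group_id;
  IF NOT FOUND THEN
      INSERT INTO affected_keys (txid, key_value)
          VALUES (txid_current(), NEW.group_id);
      -- first touch of this key in this transaction: zero the variables
      PERFORM set_config(mv||'documents', '0', true);
      PERFORM set_config(mv||'last_addition', 'null', true);
  END IF;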

Performance- and efficiency-wise, which would be the better way of keeping track of the grouping keys used in a transaction?
1) Create a temp table for the transaction, ON COMMIT DROP, storing the grouping keys affected.
2) Use a persistent table, storing txid and the grouping keys affected, deleting that txid's rows at commit.
3) Use pg_settings, storing transaction-local grouping keys affected, with existence checked by catching an exception, and listing done by checking existence for all possible values (a possibility in my scenario).

Speed is my priority, and low disk IO is a probable means to that end, which is why I investigated using variables.

Basically, (3) isn't a viable option, so what are the trade-offs between creating a temporary table per transaction and using rows in a permanent table with a txid column?
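For illustration, the two candidates amount to something like the following sketches (table names and key types are hypothetical):

  -- (1) a temp table scoped to the transaction:
  CREATE TEMP TABLE affected_keys (key_value text PRIMARY KEY)
      ON COMMIT DROP;

  -- (2) a permanent table keyed by txid, cleaned up at commit:
  CREATE TABLE affected_keys (
      txid bigint,
      key_value text,
      PRIMARY KEY (txid, key_value)
  );
  -- and in the deferred trigger, once the deltas are applied:
  DELETE FROM affected_keys WHERE txid = txid_current();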

Here are some more plpgsql code fragments:
  mv := 'mv.' || view_name || '.' || key_value || '.';

When recording a grouping key as being affected by the transaction, create the variables with zeroes:
  PERFORM set_config(mv||'documents', '0', true);
  PERFORM set_config(mv||'last_addition', 'null', true);

In an insert trigger:
  PERFORM set_config(mv||'documents', (current_setting(mv||'documents')::bigint + 1)::text, true);
  PERFORM set_config(mv||'last_addition', now()::text, true);

In the deferred-till-commit trigger:
     UPDATE materialized_view SET
         documents = documents + current_setting(mv||'documents')::bigint,
         last_addition = greatest(last_addition,
             nullif(current_setting(mv||'last_addition'), 'null')::timestamp)
     WHERE group_id = key_values.key_value;
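
Put together, the deferred trigger body would loop over the keys recorded for the transaction, roughly like this sketch (assuming a per-txid affected_keys table as in option 2 above; key_values is a record variable):

  -- Sketch: apply the per-transaction deltas for each affected key.
  FOR key_values IN
      SELECT key_value FROM affected_keys WHERE txid = txid_current()
  LOOP
      mv := 'mv.' || view_name || '.' || key_values.key_value || '.';
      UPDATE materialized_view SET
          documents = documents + current_setting(mv||'documents')::bigint,
          last_addition = greatest(last_addition,
              nullif(current_setting(mv||'last_addition'), 'null')::timestamp)
      WHERE group_id = key_values.key_value;
  END LOOP;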


Regards,
Stephen Denne.




