Re: Is there a faster way to do this? - Mailing list pgsql-sql

From Greg Sabino Mullane
Subject Re: Is there a faster way to do this?
Date
Msg-id a2a9e91124547cf39eefa84ee9b6dca0@biglumber.com
Whole thread Raw
In response to Is there a faster way to do this?  (Patrick Hatcher <PHatcher@macys.com>)
List pgsql-sql
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1 
Hard to imagine it taking that long, even for a table that large,
but we don't know the details of your system. I got a sample
table with 800,000 records down to 92 seconds using the function
below. See how it does for you. Notes follow. 
CREATE OR REPLACE FUNCTION percentile_calc(INTEGER) RETURNS text AS 'DECLARE v_tile ALIAS FOR $1; v_interval integer;
v_recordrecord; v_rowcount integer := 0; v_percentile integer := 1; BEGIN  SELECT COUNT(*)/v_tile FROM cdm_indiv_mast
WHEREval_purch_com > 0   INTO v_interval;  FOR v_record IN   SELECT ctid FROM cdm_indiv_mast WHERE val_purch_com>0
ORDERBY val_purch_com DESC LOOP   v_rowcount := v_rowcount + 1;   UPDATE cdm_indiv_mast SET percentiler=v_percentile
WHEREctid = v_record.ctid;   IF v_rowcount >= v_interval THEN     v_percentile := v_percentile + 1;     v_rowcount :=
0;  END IF; END LOOP;  RETURN  \'DONE\';
 
END;
' LANGUAGE plpgsql STABLE STRICT; 
CREATE OR REPLACE FUNCTION percentile_calc() RETURNS text AS 'SELECT new10(100);' LANGUAGE SQL; 
Notes: 
Since 100 seemed to be hard-coded into the original function, there was
no need for v_count. Instead, I made "v_tile" a variable, with a default
of "100" if the function is called with no argument. 
There may be a false assumption here. If the values of val_purch_com are
not unique, then two items with the same val_purch_com may have different
percentiler values. If this is the case, you may want to at least enforce
some ordering of these values by adding more to the ORDER BY clause.
Without knowing the full table structure, I can't recommend what columns
to add there. 
To really speed this up, make sure that you do not have any indexes on
the table. By using tids, we neatly avoid having to use any indexes in the
function itself. Unless you are using oids and really need them (highly
unlikely because of the "indiv_key" column), you should remove them: 
ALTER TABLE cdm_indiv_mast SET WITHOUT OIDS; 
Of course, vacuuming completely and often is recommended for a table this
size as well, especially when updating this many rows at once. I'd
recommend a VACUUM FULL immediately before running it.
- --
Greg Sabino Mullane greg@turnstep.com
PGP Key: 0x14964AC8 200406162303
-----BEGIN PGP SIGNATURE-----
iD8DBQFA0QpdvJuQZxSWSsgRAk88AKDtGrs6+/ypaaNU6DWqvhCgtsrM0gCg+2ve
J8JKOPgxp42c54Nx/rzHdxs=
=sNFW
-----END PGP SIGNATURE-----




pgsql-sql by date:

Previous
From: "Jie Liang"
Date:
Subject: Re: [JDBC] Prepare Statement
Next
From: Hannes Korte
Date:
Subject: use of a composite type in CREATE TABLE?