Thread: Slow update query

Slow update query

From
"elias ghanem"
Date:

Hi,

Thanks for your help, here’s more details as you requested:

-The version of postgres is 8.4 (by the way select pg_version() is not working but let’s concentrate on the query issue)

Here’s the full definition of the table with it’s indices:

-- Table: in_sortie

 

-- DROP TABLE in_sortie;

 

CREATE TABLE in_sortie

(

  "type" character(1),

  site_id character varying(100),

  fiche_produit_id character varying(100),

  numero_commande character varying(100),

  ligne_commande integer,

  date_sortie date,

  quantite_sortie numeric(15,2),

  date_livraison_souhaitee date,

  quantite_souhaitee numeric(15,2),

  client_ref character varying(100),

  valeur numeric(15,2),

  type_mouvement character varying(100),

  etat_sortie_annulation integer,

  etat_sortie_prevision integer,

  etat_sortie_taux_service integer,

  date_commande date,

  valide character varying(1)

)

WITH (

  OIDS=FALSE

)

TABLESPACE "AG_INTERFACE";

 

-- Index: idx_in_sortie

 

-- DROP INDEX idx_in_sortie;

 

CREATE INDEX idx_in_sortie

  ON in_sortie

  USING btree

  (site_id, fiche_produit_id);

 

-- Index: idx_in_sortie_fp

 

-- DROP INDEX idx_in_sortie_fp;

 

CREATE INDEX idx_in_sortie_fp

  ON in_sortie

  USING btree

  (fiche_produit_id);

 

-- Index: idx_in_sortie_site

 

-- DROP INDEX idx_in_sortie_site;

 

CREATE INDEX idx_in_sortie_site

  ON in_sortie

  USING btree

  (site_id);

 

-Concerning the postgresql.conf file I’ve tried to changed the default values such as: shared_buffers and effective_cache_size… but this did not change the result.

 

-The WAL IS NOT ON DIFFERENT DISK, THEY ARE ON THE SAME DISK WHER THE DB IS (for the moment I don’t have the possibility of moving them to another disk but maybe “just for testing” you can tell me how I can totally disable WAL if possible).

 

I’m using postgresql 8.4 on Linux machine with 1.5 GB RAM, and I’m issuing an update query with a where clause that updates approximately 100 000 rows in a table containing approximately 3 200 000 rows.

The update query is very simple: UPDATE IN_SORTIE SET VALIDE = VALIDE WHERE VALEUR < 0.83 (the where clause is used to limit the affected rows to ~ 100 000, and the “SET VALIDE = VALIDE” is only on purpose to keep the data of the table unchanged).

Actually this query is inside a function and this function is called from a .sh file using the following syntax: psql -h $DB_HOST -p $DB_PORT -d $DB_NAME -U $DB_USER -c "SELECT testupdate()"

 (the function is called 100 times with a vacuum analyze after each call for the table).

So the average execution time of the function is around 2.5 mins, meaning that the update query (+ the vacuum) takes 2.5 mins to execute. So is this a normal behavior? (The same function in oracle with the same environment (with our vacuum obviously) is executed in 11 second).

 

Thanks for your help.

 

Re: Slow update query

From
"Kevin Grittner"
Date:
"elias ghanem" <e.ghanem@acteos.com> wrote:

> here's more details as you requested

You didn't include an EXPLAIN ANALYZE of the UPDATE statement.

> -The version of postgres is 8.4 (by the way select pg_version() is
> not working but let's concentrate on the query issue)

As far as I know, there is no pg_version() function; try

SELECT version();

Sometimes the exact version is relevant to a performance issue, but
there aren't many of fixes for performance regression in 8.4 minor
releases, so it might not matter in this particular case.

> -Concerning the postgresql.conf file I've tried to changed the
> default values such as: shared_buffers and effective_cache_size.
> but this did not change the result.

Perhaps not, but other settings might help performance.  Am I to
understand that you're running an "out of the box" configuration,
with no tuning yet?

> -The WAL IS NOT ON DIFFERENT DISK, THEY ARE ON THE SAME DISK WHER
> THE DB IS (for the moment I don't have the possibility of moving
> them to another disk but maybe "just for testing" you can tell me
> how I can totally disable WAL if possible).

You can't totally disable it, as it is there primarily to ensure
database integrity.  There are several ways to tune it, based on the
number of WAL segments, the WAL buffers, the background writer
aggressiveness, various delays, etc.  Understanding the workload is
key to appropriate tuning.

> I'm using postgresql 8.4 on Linux machine with 1.5 GB RAM, and I'm
> issuing an update query with a where clause that updates
> approximately 100 000 rows in a table containing approximately
> 3 200 000 rows.

This is not a use case where PostgreSQL shines; it is, however, a
rather unusual use case in normal operations.  I'm curious why
you're testing this -- if we understood the real problem behind the
test we might be able to provide more useful advice.  "Teaching to
the test" has its limitations.

> The update query is very simple: UPDATE IN_SORTIE SET VALIDE =
> VALIDE WHERE VALEUR < 0.83 (the where clause is used to limit the
> affected rows to ~ 100 000, and the "SET VALIDE = VALIDE" is only
> on purpose to keep the data of the table unchanged).

So you want to optimize a query which does absolutely nothing to the
data.  It's not hard to make that particular case *much* faster,
which again leads one to wonder what you're *really* trying to
optimize.  If we knew that, it might open up options not applicable
to the synthetic case.

> (the function is called 100 times with a vacuum analyze after
> each call for the table).
>
> So the average execution time of the function is around 2.5 mins,
> meaning that the update query (+ the vacuum) takes 2.5 mins to
> execute.

Vacuuming normally happens as a background or off-hours process, so
as not to slow down user queries.  Now, running ten million updates
against a table with 3.2 million rows without vacuuming would cause
its own set of problems; so we're back to the question of -- if you
really don't want to do ten million updates to a three million row
table to make no changes, what is it that you *do* want to do for
which you're using this test to optimize?  Any advice given without
knowing that would be a shot in the dark.

-Kevin

Re: Slow update query

From
Craig Ringer
Date:
elias ghanem wrote:

> Actually this query is inside a function and this function is called
> from a .sh file using the following syntax: psql -h $DB_HOST -p $DB_PORT
> -d $DB_NAME -U $DB_USER -c "SELECT testupdate()"
>
>  (the function is called 100 times with a vacuum analyze after each call
> for the table).
>
> So the average execution time of the function is around 2.5 mins,
> meaning that the update query (+ the vacuum) takes 2.5 mins to execute.
> So is this a normal behavior? (The same function in oracle with the same
> environment (with our vacuum obviously) is executed in 11 second).

It might be worth measuring using psql's \timing to see how long the
update and the vacuum take individually.

--
Craig Ringer

Re: Slow update query

From
Robert Haas
Date:
On Thu, Jan 21, 2010 at 11:14 AM, elias ghanem <e.ghanem@acteos.com> wrote:
> So the average execution time of the function is around 2.5 mins, meaning
> that the update query (+ the vacuum) takes 2.5 mins to execute. So is this a
> normal behavior? (The same function in oracle with the same environment
> (with our vacuum obviously) is executed in 11 second).

Does Oracle get slower if you actually change something?

...Robert