Re: Adding TEXT columns tanks performance? - Mailing list pgsql-general

From Arturo Perez
Subject Re: Adding TEXT columns tanks performance?
Date
Msg-id C7C8048D-6322-485F-B038-8918DCD51210@hayesinc.com
Whole thread Raw
In response to Re: Adding TEXT columns tanks performance?  ("Merlin Moncure" <mmoncure@gmail.com>)
Responses Re: Adding TEXT columns tanks performance?  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-general
On Feb 9, 2007, at 11:43 AM, Merlin Moncure wrote:

> On 2/8/07, Arturo Perez <aperez@hayesinc.com> wrote:
>> Hi all,
>>
>> Saturday I changed a table to add a varchar(24) and a TEXT column.
>> It's used for some reporting purposes (small potatoe stuff really)
>> and the TEXT column remains mostly empty.  However, this week
>> performance has gotten terrible.  Queries joining against the
>> aforementioned table have gone from 40s to 1500s.  The schema change
>> is the only explanation I have for the 30x slower queries.  The
>> queries have definitely gotten disk-bound (I can see the connection
>> process sitting in the D state for several minutes).
>>
>> This is pg8.1.4 on Linux RedHat. Would adding a TEXT column impact
>> things that much?  The change to the schema was
>>
>> alter table foo add column title text;
>
> explain analyze please.
>
> no reason for adding text column to do that.  especially since you
> didn't default the column which would effectively update the entire
> table.
>
> merlin


Here's the explain analyze.  Note that this query was never very fast
but's it has literally gotten two orders of magnitude slower. This is
with 8.1.4 on linux with nothing special in terms of disks (I think
it's a mirrored system drive).  It's a hosted environment kind of thing.

Shared buffers is 160MB (20000) and effective cache is 1GB.  The
user_tracking
table has about 2M rows and the extended_user table has about 6K.
I'm not very
good at reading these but it looks like sort memory might be too
low?  work_mem is 1024,
the default.  I did this "set session work_mem to 100000;" in psql
but it still takes
quite a while.

iht=> explain analyze SELECT session_id, action, count(ACTION) as hits
iht->                                 FROM extended_user LEFT JOIN
user_tracking USING (user_id)
iht->                                 WHERE subscription_id = 1147
iht->                                 GROUP BY session_id, action
iht->                                 HAVING count(ACTION) > 0;

              QUERY PLAN
------------------------------------------------------------------------
------------------------------------------------------------------------
----------------------------------
GroupAggregate  (cost=172717.49..173695.46 rows=32599 width=60)
(actual time=411713.041..411761.857 rows=7309 loops=1)
    Filter: (count("action") > 0)
    ->  Sort  (cost=172717.49..172798.99 rows=32599 width=60) (actual
time=411712.907..411732.032 rows=16012 loops=1)
          Sort Key: user_tracking.session_id, user_tracking."action"
          ->  Merge Left Join  (cost=0.00..169571.78 rows=32599
width=60) (actual time=147593.828..411070.706 rows=16012 loops=1)
                Merge Cond: ("outer".user_id = "inner".user_id)
                ->  Index Scan using extended_user_pkey on
extended_user  (cost=0.00..236.92 rows=117 width=4) (actual
time=1.627..154.499 rows=119 loops=1)
                      Filter: (subscription_id = 1147)
                ->  Index Scan using user_tracking_user_id_idx on
user_tracking  (cost=0.00..164008.04 rows=2000218 width=64) (actual
time=0.010..408731.064 rows=2000620 loops=1)
Total runtime: 411781.174 ms
(10 rows)



pgsql-general by date:

Previous
From: Kenneth Downs
Date:
Subject: Trouble w/plperl sproc on red hat 9
Next
From: "Anton Melser"
Date:
Subject: drop table if exists mytable;