Thread: Help speeding up delete

Help speeding up delete

From
Steve Wampler
Date:
We've got an older system in production (PG 7.2.4).  Recently
one of the users has wanted to implement a selective delete,
but is finding that the time it appears to take exceeds her
patience factor by several orders of magnitude.  Here's
a synopsis of her report.  It appears that the "WHERE
id IN ..." is resulting in a seq scan that is causing
the problem, but we're not SQL expert enough to know
what to do about it.

Can someone point out what we're doing wrong, or how we
could get a (much) faster delete?  Thanks!

Report:
============================================================
This command yields results in only a few seconds:

# SELECT at.id FROM "tmp_table2" at, "tmp_tabl2e" a
#   WHERE at.id=a.id and a.name='obsid' and a.value='oid080505';

However, the following command does not seem to want to ever
complete (the person running it killed it after 1/2 hour).

# DELETE FROM "tmp_table2" WHERE id IN
# (SELECT at.id FROM "tmp_table2" at, "tmp_table2" a
#   WHERE at.id=a.id and a.name='obsid' and a.value='oid080505');

==============================================================

The table has four columns. There are 6175 rows satisfying the condition
given, and the table itself has 1539688 entries.  Layout is:

lab.devel.configdb=# \d tmp_table2
              Table "tmp_table2"
 Column |           Type           | Modifiers
--------+--------------------------+-----------
 id     | character varying(64)    |
 name   | character varying(64)    |
 units  | character varying(32)    |
 value  | text                     |
 time   | timestamp with time zone |

==============================================================

lab.devel.configdb=# EXPLAIN DELETE FROM "tmp_table2" WHERE id IN
lab.devel.configdb-#   (SELECT at.id FROM "tmp_table2" at, "tmp_table2" a
lab.devel.configdb(#     WHERE at.id=a.id AND a.name='obsid' AND a.value='oid080505');
NOTICE:  QUERY PLAN:

Seq Scan on tmp_table2  (cost=0.00..154893452082.10 rows=769844 width=6)
  SubPlan
    ->  Materialize  (cost=100600.52..100600.52 rows=296330 width=100)
          ->  Hash Join  (cost=42674.42..100600.52 rows=296330 width=100)
                ->  Seq Scan on tmp_table2 at  (cost=0.00..34975.88 rows=1539688 width=50)
                ->  Hash  (cost=42674.32..42674.32 rows=38 width=50)
                      ->  Seq Scan on tmp_table2 a  (cost=0.00..42674.32 rows=38 width=50)
EXPLAIN

lab.devel.configdb=# EXPLAIN  (SELECT at.id FROM "tmp_table2" at, "tmp_table2" a
lab.devel.configdb(#    WHERE at.id=a.id AND a.name='obsid' AND a.value='oid080505');
NOTICE:  QUERY PLAN:

Hash Join  (cost=42674.42..100600.52 rows=296330 width=100)
  ->  Seq Scan on tmp_table2 at  (cost=0.00..34975.88 rows=1539688 width=50)
  ->  Hash  (cost=42674.32..42674.32 rows=38 width=50)
        ->  Seq Scan on tmp_table2 a  (cost=0.00..42674.32 rows=38 width=50)

EXPLAIN

--
Steve Wampler -- swampler@noao.edu
The gods that smiled on your birth are now laughing out loud.

Re: Help speeding up delete

From
Scott Lamb
Date:
On Nov 14, 2005, at 2:07 PM, Steve Wampler wrote:
> # SELECT at.id FROM "tmp_table2" at, "tmp_tabl2e" a
> #   WHERE at.id=a.id and a.name='obsid' and a.value='oid080505';

Isn't this equivalent?

select id from tmp_table2 where name = 'obsid' and value = 'oid080505';

> # DELETE FROM "tmp_table2" WHERE id IN
> # (SELECT at.id FROM "tmp_table2" at, "tmp_table2" a
> #   WHERE at.id=a.id and a.name='obsid' and a.value='oid080505');

and this?

delete from tmp_table2 where name = 'obsid' and value = 'oid080505';

Why are you doing a self-join using id, which I assume is a primary key?

--
Scott Lamb <http://www.slamb.org/>



Re: Help speeding up delete

From
Tom Lane
Date:
Steve Wampler <swampler@noao.edu> writes:
> We've got an older system in production (PG 7.2.4).  Recently
> one of the users has wanted to implement a selective delete,
> but is finding that the time it appears to take exceeds her
> patience factor by several orders of magnitude.  Here's
> a synopsis of her report.  It appears that the "WHERE
> id IN ..." is resulting in a seq scan that is causing
> the problem, but we're not SQL expert enough to know
> what to do about it.

> Can someone point out what we're doing wrong, or how we
> could get a (much) faster delete?  Thanks!

Update to 7.4 or later ;-)

Quite seriously, if you're still using 7.2.4 for production purposes
you could justifiably be accused of negligence.  There are three or four
data-loss-grade bugs fixed in the later 7.2.x releases, not to mention
security holes; and that was before we abandoned support for 7.2.
You *really* need to be thinking about an update.

            regards, tom lane

Re: Help speeding up delete

From
Steve Wampler
Date:
Scott Lamb wrote:
> On Nov 14, 2005, at 2:07 PM, Steve Wampler wrote:
>
>> # SELECT at.id FROM "tmp_table2" at, "tmp_tabl2e" a
>> #   WHERE at.id=a.id and a.name='obsid' and a.value='oid080505';
>
>
> Isn't this equivalent?
>
> select id from tmp_table2 where name = 'obsid' and value = 'oid080505';

Probably.  The user based the above on a query designed to find
all rows with the same id as those rows that have a.name='obsid' and
a.value='oid080505'.  However, I think the above would work to locate
all the ids, which is all we need for the delete (see below).

>> # DELETE FROM "tmp_table2" WHERE id IN
>> # (SELECT at.id FROM "tmp_table2" at, "tmp_table2" a
>> #   WHERE at.id=a.id and a.name='obsid' and a.value='oid080505');
>
>
> and this?
>
> delete from tmp_table2 where name = 'obsid' and value = 'oid080505';
>
> Why are you doing a self-join using id, which I assume is a primary key?

Because I think we need to.  The above would only delete rows that have
name = 'obsid' and value = 'oid080505'.  We need to delete all rows that
have the same ids as those rows.  However, from what you note, I bet
we could do:

   DELETE FROM "tmp_table2" WHERE id IN
      (SELECT id FROM "tmp_table2" WHERE name = 'obsid' and value = 'oid080505');

However, even that seems to have a much higher cost than I'd expect:

   lab.devel.configdb=# explain delete from "tmp_table2" where id in
        (select id from tmp_table2 where name='obsid' and value = 'oid080505');
   NOTICE:  QUERY PLAN:

   Seq Scan on tmp_table2  (cost=0.00..65705177237.26 rows=769844 width=6)
     SubPlan
       ->  Materialize  (cost=42674.32..42674.32 rows=38 width=50)
             ->  Seq Scan on tmp_table2  (cost=0.00..42674.32 rows=38 width=50)

   EXPLAIN

And, sure enough, it is taking an extraordinarily long time to run (more
than 10 minutes so far, compared to < 10 seconds for the select).  Is this
really typical of deletes?  It appears (to me) to be the Seq Scan on
tmp_table2 that is the killer here.  If we put an index on, would it help?
(The user claims she tried that and its EXPLAIN cost went even higher, but
I haven't checked that...)

Thanks!
--
Steve Wampler -- swampler@noao.edu
The gods that smiled on your birth are now laughing out loud.

Re: Help speeding up delete

From
Steve Wampler
Date:
Tom Lane wrote:
> Steve Wampler <swampler@noao.edu> writes:
>
>>We've got an older system in production (PG 7.2.4).  Recently
>>one of the users has wanted to implement a selective delete,
>>but is finding that the time it appears to take exceeds her
>>patience factor by several orders of magnitude.  Here's
>>a synopsis of her report.  It appears that the "WHERE
>>id IN ..." is resulting in a seq scan that is causing
>>the problem, but we're not SQL expert enough to know
>>what to do about it.
>
>
>>Can someone point out what we're doing wrong, or how we
>>could get a (much) faster delete?  Thanks!
>
>
> Update to 7.4 or later ;-)

I was afraid you'd say that :-)  I'm not officially involved in
this project anymore and was hoping for a fix that wouldn't drag
me back in.  The security issues aren't a concern because this
DB is *well* hidden from the outside world (it's part of a telescope
control system behind several firewalls with no outside access).
However, the data-loss-grade bugs issue *is* important.  We'll
try to do the upgrade as soon as we get some cloudy days to
actually do it!

Is the performance behavior that we're experiencing a known
problem with 7.2 that has been addressed in 7.4?  Or will the
upgrade fix other problems while leaving this one?

> Quite seriously, if you're still using 7.2.4 for production purposes
> you could justifiably be accused of negligence.  There are three or four
> data-loss-grade bugs fixed in the later 7.2.x releases, not to mention
> security holes; and that was before we abandoned support for 7.2.
> You *really* need to be thinking about an update.

Thanks!
Steve
--
Steve Wampler -- swampler@noao.edu
The gods that smiled on your birth are now laughing out loud.

Re: Help speeding up delete

From
Joshua Marsh
Date:


On 11/14/05, Steve Wampler <swampler@noao.edu> wrote:
However, even that seems to have a much higher cost than I'd expect:

   lab.devel.configdb=# explain delete from "tmp_table2" where id in
        (select id from tmp_table2 where name='obsid' and value = 'oid080505');
   NOTICE:  QUERY PLAN:

   Seq Scan on tmp_table2  (cost=0.00..65705177237.26 rows=769844 width=6)
     SubPlan
       ->  Materialize  (cost=42674.32..42674.32 rows=38 width=50)
             ->  Seq Scan on tmp_table2  (cost=0.00..42674.32 rows=38 width=50)

For one reason or another, the planner thinks a sequential scan is the best solution. Try turning off sequential scans before the query and see if it changes the plan (SET enable_seqscan TO off;).

I've seen this problem with sub queries and that usually solves it.
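
In a psql session, that experiment looks roughly like this (a sketch, using the table and predicate from the thread; enable_seqscan only discourages seq scans by inflating their cost, it doesn't forbid them):

```sql
-- Discourage sequential scans for this session only
SET enable_seqscan TO off;

-- See whether the plan changes
EXPLAIN DELETE FROM tmp_table2 WHERE id IN
  (SELECT id FROM tmp_table2 WHERE name='obsid' AND value='oid080505');

-- Restore the default afterwards
SET enable_seqscan TO on;
```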


Re: Help speeding up delete

From
Steve Wampler
Date:
Joshua Marsh wrote:
>
>
> On 11/14/05, *Steve Wampler* <swampler@noao.edu
> <mailto:swampler@noao.edu>> wrote:
>
>     However, even that seems to have a much higher cost than I'd expect:
>
>        lab.devel.configdb=# explain delete from "tmp_table2" where id in
>             (select id from tmp_table2 where name='obsid' and value =
>     'oid080505');
>        NOTICE:  QUERY PLAN:
>
>        Seq Scan on tmp_table2  (cost=0.00..65705177237.26 rows=769844
>     width=6)
>          SubPlan
>            ->  Materialize  (cost=42674.32..42674.32 rows=38 width=50)
>                  ->  Seq Scan on tmp_table2  (cost=0.00..42674.32
>     rows=38 width=50)
>
>
> For one reason or another, the planner thinks a sequential scan is the
> best solution. Try turning off sequential scans before the query and see
> if it changes the plan (SET enable_seqscan TO off;).
>
> I've seen this problem with sub queries and that usually solves it.
>

Hmmm, not only does it still use sequential scans, it thinks it'll take
even longer:

   set enable_seqscan to off;
   SET VARIABLE
   explain delete from "tmp_table2" where id in
     (select id from tmp_table2 where name='obsid' and value = 'oid080505');
   NOTICE:  QUERY PLAN:

   Seq Scan on tmp_table2  (cost=100000000.00..160237039405992.50 rows=800836 width=6)
     SubPlan
       ->  Materialize  (cost=100043604.06..100043604.06 rows=45 width=26)
             ->  Seq Scan on tmp_table2  (cost=100000000.00..100043604.06 rows=45 width=26)

   EXPLAIN

But the advice sounds like it *should* have helped...

--
Steve Wampler -- swampler@noao.edu
The gods that smiled on your birth are now laughing out loud.

Re: Help speeding up delete

From
Scott Lamb
Date:
On Nov 14, 2005, at 3:52 PM, Steve Wampler wrote:
> Scott Lamb wrote:
>> On Nov 14, 2005, at 2:07 PM, Steve Wampler wrote:
>>
>>> # SELECT at.id FROM "tmp_table2" at, "tmp_tabl2e" a
>>> #   WHERE at.id=a.id and a.name='obsid' and a.value='oid080505';
>>
>>
>> Isn't this equivalent?
>>
>> select id from tmp_table2 where name = 'obsid' and value =
>> 'oid080505';
>
> Probably, the user based the above on a query designed to find
> all rows with the same id as those rows that have a.name='obsid' and
> a.value='oid080505'.

Well, this indirection is only significant if those two sets can
differ. If (A) you meant "tmp_table2" when you wrote "tmp_tabl2e", so
this is a self-join, and (B) there is a primary key on "id", I don't
think that can ever happen.

> It appears (to me) to be the Seq Scan on tmp_table2
> that is the killer here.  If we put an index on, would it help?

On...tmp_table2.id? If it is a primary key, there already is one. If
not, yeah, I expect it would help.

--
Scott Lamb <http://www.slamb.org/>



Re: Help speeding up delete

From
Steve Wampler
Date:
Scott Lamb wrote:
> On Nov 14, 2005, at 3:52 PM, Steve Wampler wrote:
>
>> Scott Lamb wrote:
>>
>>> On Nov 14, 2005, at 2:07 PM, Steve Wampler wrote:
>>>
>>>> # SELECT at.id FROM "tmp_table2" at, "tmp_tabl2e" a
>>>> #   WHERE at.id=a.id and a.name='obsid' and a.value='oid080505';
>>>
>>>
>>>
>>> Isn't this equivalent?
>>>
>>> select id from tmp_table2 where name = 'obsid' and value =  'oid080505';
>>
>>
>> Probably, the user based the above on a query designed to find
>> all rows with the same id as those rows that have a.name='obsid' and
>> a.value='oid080505'.
>
>
> Well, this indirection is only significant if those two sets can
> differ. If (A) you meant "tmp_table2" when you wrote "tmp_tabl2e", so
> this is a self-join, and (B) there is a primary key on "id", I don't
> think that can ever happen.

I wasn't clear.  The original query was:

   SELECT at.* FROM "tmp_table2" at, "tmp_table2" a
       WHERE at.id=a.id and a.name='obsid' and a.value='oid080505';

which is significantly different than:

   SELECT * FROM "tmp_table2" WHERE name='obsid' and value='oid080505';

The user had adapted that query for her needs, but it would have been
better to just use the query that you suggested (as the subselect in
the DELETE FROM...).  Unfortunately, that only improves performance
slightly - it is still way too slow on deletes.

--
Steve Wampler -- swampler@noao.edu
The gods that smiled on your birth are now laughing out loud.

Re: Help speeding up delete

From
"Magnus Hagander"
Date:
> Because I think we need to.  The above would only delete rows
> that have name = 'obsid' and value = 'oid080505'.  We need to
> delete all rows that have the same ids as those rows.
> However, from what you note, I bet we could do:
>
>    DELETE FROM "tmp_table2" WHERE id IN
>       (SELECT id FROM "temp_table2" WHERE name = 'obsid' and
> value= 'oid080505');
>
> However, even that seems to have a much higher cost than I'd expect:
>
>    lab.devel.configdb=# explain delete from "tmp_table2" where id in
>         (select id from tmp_table2 where name='obsid' and
> value = 'oid080505');
>    NOTICE:  QUERY PLAN:
>
>    Seq Scan on tmp_table2  (cost=0.00..65705177237.26
> rows=769844 width=6)
>      SubPlan
>        ->  Materialize  (cost=42674.32..42674.32 rows=38 width=50)
>              ->  Seq Scan on tmp_table2  (cost=0.00..42674.32
> rows=38 width=50)
>
>    EXPLAIN
>
> And, sure enough, is taking an extrordinarily long time to
> run (more than 10 minutes so far, compared to < 10seconds for
> the select).  Is this really typical of deletes?  It appears
> (to me) to be the Seq Scan on tmp_table2 that is the killer
> here.  If we put an index on, would it help?  (The user
> claims she tried that and it's EXPLAIN cost went even higher,
> but I haven't checked that...)


Earlier pg versions have always been bad at dealing with IN subqueries.
Try rewriting it as follows (fix any broken syntax yourself; I'm not
actually testing this :P):

DELETE FROM tmp_table2 WHERE EXISTS
  (SELECT * FROM tmp_table2 t2
    WHERE t2.id=tmp_table2.id
      AND t2.name='obsid' AND t2.value='oid080505');


I assume you do have an index on tmp_table2.id :-) And that it's
non-unique? (If it was unique, the previous simplification of the query
really should've worked..)

Do you also have an index on "name,value" or something like that, so you
get an index scan from it?
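
For reference, the indexes Magnus is asking about could be created like this (a sketch; the index names are made up, and the column types come from the \d output earlier in the thread):

```sql
-- Index so the outer DELETE can look up rows by id cheaply
CREATE INDEX tmp_table2_id_idx ON tmp_table2 (id);

-- Index so the subquery can find matching rows without a seq scan
CREATE INDEX tmp_table2_name_value_idx ON tmp_table2 (name, value);
```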

//Magnus

Re: Help speeding up delete

From
Steve Wampler
Date:
Magnus Hagander wrote:
>>Because I think we need to.  The above would only delete rows
>>that have name = 'obsid' and value = 'oid080505'.  We need to
>>delete all rows that have the same ids as those rows.
>>However, from what you note, I bet we could do:
>>
>>   DELETE FROM "tmp_table2" WHERE id IN
>>      (SELECT id FROM "temp_table2" WHERE name = 'obsid' and
>>value= 'oid080505');
>>
>>However, even that seems to have a much higher cost than I'd expect:
>>
>>   lab.devel.configdb=# explain delete from "tmp_table2" where id in
>>        (select id from tmp_table2 where name='obsid' and
>>value = 'oid080505');
>>   NOTICE:  QUERY PLAN:
>>
>>   Seq Scan on tmp_table2  (cost=0.00..65705177237.26
>>rows=769844 width=6)
>>     SubPlan
>>       ->  Materialize  (cost=42674.32..42674.32 rows=38 width=50)
>>             ->  Seq Scan on tmp_table2  (cost=0.00..42674.32
>>rows=38 width=50)
>>
>>   EXPLAIN
...
>
> Earlier pg versions have always been bad at dealing with IN subqueries.
> Try rewriting it as (with fixing any broken syntax, I'm not actually
> testing this :P)
>
> DELETE FROM tmp_table2 WHERE EXISTS
>  (SELECT * FROM tmp_table2 t2 WHERE t2.id=tmp_table2.id AND
> t2.name='obsid' AND t2.value='oid080505')

Thanks - that looks *significantly* better:

   lab.devel.configdb=# explain delete from tmp_table2 where exists
                           (select 1 from tmp_table2 t2 where
                           t2.id=tmp_table2.id and
                           t2.name='obsid' and t2.value='oid080505');
   NOTICE:  QUERY PLAN:

   Seq Scan on tmp_table2  (cost=0.00..9297614.80 rows=769844 width=6)
     SubPlan
       ->  Index Scan using inv_index_2 on tmp_table2 t2  (cost=0.00..6.02 rows=1 width=0)

   EXPLAIN

(This is after putting an index on the (id,name,value) tuple.)  That outer seq scan
is still annoying, but maybe this will be fast enough.

I've passed this on, along with the (strong) recommendation that they
upgrade PG.

Thanks!!

--
Steve Wampler -- swampler@noao.edu
The gods that smiled on your birth are now laughing out loud.

Re: Help speeding up delete

From
Simon Riggs
Date:
On Mon, 2005-11-14 at 18:42 -0500, Tom Lane wrote:
> Steve Wampler <swampler@noao.edu> writes:
> > We've got an older system in production (PG 7.2.4).

>
> Update to 7.4 or later ;-)
>
> Quite seriously, if you're still using 7.2.4 for production purposes
> you could justifiably be accused of negligence.  There are three or four
> data-loss-grade bugs fixed in the later 7.2.x releases, not to mention
> security holes; and that was before we abandoned support for 7.2.
> You *really* need to be thinking about an update.

Perhaps we should put a link on the home page underneath LATEST RELEASEs
saying
    7.2: de-supported

with a link to a scary note along the lines of the above.

ISTM that there are still too many people on older releases.

We probably need an explanation of why we support so many releases (in
comparison to licensed software) and a note that this does not imply that
the latest releases are not yet production quality (in comparison to
MySQL or Sybase, which have been in beta for a very long time).

Best Regards, Simon Riggs



Re: Help speeding up delete

From
Steve Wampler
Date:
Arjen van der Meijden wrote:
> On 15-11-2005 15:18, Steve Wampler wrote:
>
>> Magnus Hagander wrote:
>> (This is after putting an index on the (id,name,value) tuple.)  That
>> outer seq scan
>> is still annoying, but maybe this will be fast enough.
>>
>> I've passed this on, along with the (strong) recommendation that they
>> upgrade PG.
>
>
> Have you tried with an index on (name,value) and of course one on id ?

Yes, although not with a unique index on (name,value) [uniqueness is
possible there, but not on the just-id index].  Anyway, it turns out the
latest incarnation is 'fast enough' for the user's needs, so she's not
doing any more with it until after an upgrade.


--
Steve Wampler -- swampler@noao.edu
The gods that smiled on your birth are now laughing out loud.

Re: Help speeding up delete

From
Arjen van der Meijden
Date:
On 15-11-2005 15:18, Steve Wampler wrote:
> Magnus Hagander wrote:
> (This is after putting an index on the (id,name,value) tuple.)  That outer seq scan
> is still annoying, but maybe this will be fast enough.
>
> I've passed this on, along with the (strong) recommendation that they
> upgrade PG.

Have you tried with an index on (name,value) and of course one on id ?

Best regards,

Arjen

Re: Help speeding up delete

From
Christopher Kings-Lynne
Date:
>>Update to 7.4 or later ;-)
>>
>>Quite seriously, if you're still using 7.2.4 for production purposes
>>you could justifiably be accused of negligence.  There are three or four
>>data-loss-grade bugs fixed in the later 7.2.x releases, not to mention
>>security holes; and that was before we abandoned support for 7.2.
>>You *really* need to be thinking about an update.
>
>
> Perhaps we should put a link on the home page underneath LATEST RELEASEs
> saying
>     7.2: de-supported
>
> with a link to a scary note along the lines of the above.

I strongly support an explicit desupported notice for 7.2 and below on
the website...

Chris


Re: Help speeding up delete

From
Christopher Kings-Lynne
Date:
> Perhaps we should put a link on the home page underneath LATEST RELEASEs
> saying
>     7.2: de-supported
>
> with a link to a scary note along the lines of the above.
>
> ISTM that there are still too many people on older releases.
>
> We probably need an explanation of why we support so many releases (in
> comparison to licenced software) and a note that this does not imply the
> latest releases are not yet production (in comparison to MySQL or Sybase
> who have been in beta for a very long time).

By the way, is anyone interested in creating some sort of online
repository on pgsql.org or pgfoundry where we can keep statically
compiled pg_dump/all for several platforms for 8.1?

That way if someone wanted to upgrade from 7.2 to 8.1, they can just
grab the latest dumper from the website, dump their old database, then
upgrade easily.

In my experience not many pgsql admins have test servers or the skills
to build up test machines with the latest pg_dump, etc.  (Seriously.)
In fact, few realise at all that they should use the 8.1 dumper.
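
The workflow Chris describes, using the newer dumper against the old server, looks roughly like this (a sketch; the host names and database name are made up):

```shell
# Dump the old 7.2 server using the *new* 8.1 pg_dump,
# connecting to the old server over the network
pg_dump -h oldhost -p 5432 -Fc mydb > mydb.dump

# Restore into the freshly installed 8.1 server
pg_restore -h newhost -d mydb mydb.dump
```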

Chris


Re: Help speeding up delete

From
"Steinar H. Gunderson"
Date:
On Thu, Nov 17, 2005 at 09:40:42AM +0800, Christopher Kings-Lynne wrote:
> In my experience not many pgsql admins have test servers or the skills
> to build up test machines with the latest pg_dump, etc.  (Seriously.)
> In fact, few realise at all that they should use the 8.1 dumper.

Isn't your distribution supposed to do this for you? Mine does these days...

/* Steinar */
--
Homepage: http://www.sesse.net/

Re: Help speeding up delete

From
Christopher Kings-Lynne
Date:
> Isn't your distribution supposed to do this for you? Mine does these days...

A distribution that tries to automatically do a major postgresql update
is doomed to fail - spectacularly...

Chris

Re: Help speeding up delete

From
Scott Marlowe
Date:
On Wed, 2005-11-16 at 19:40, Christopher Kings-Lynne wrote:
> > Perhaps we should put a link on the home page underneath LATEST RELEASEs
> > saying
> >     7.2: de-supported
> >
> > with a link to a scary note along the lines of the above.
> >
> > ISTM that there are still too many people on older releases.
> >
> > We probably need an explanation of why we support so many releases (in
> > comparison to licenced software) and a note that this does not imply the
> > latest releases are not yet production (in comparison to MySQL or Sybase
> > who have been in beta for a very long time).
>
> By the way, is anyone interested in creating some sort of online
> repository on pgsql.org or pgfoundry where we can keep statically
> compiled pg_dump/all for several platforms for 8.1?
>
> That way if someone wanted to upgrade from 7.2 to 8.1, they can just
> grab the latest dumper from the website, dump their old database, then
> upgrade easily.
>
> In my experience not many pgsql admins have test servers or the skills
> to build up test machines with the latest pg_dump, etc.  (Seriously.)
> In fact, few realise at all that they should use the 8.1 dumper.

I would especially like such a thing available as an RPM.  A
pgsql-8.1-clienttools.rpm or something like that, with psql, pg_dump,
pg_restore, and what other command line tools you can think of that
would help.

Re: Help speeding up delete

From
"Steinar H. Gunderson"
Date:
On Thu, Nov 17, 2005 at 11:07:42PM +0800, Christopher Kings-Lynne wrote:
>> Isn't your distribution supposed to do this for you? Mine does these
>> days...
> A distribution that tries to automatically do a major postgresql update
> is doomed to fail - spectacularly...

Automatically? Well, you can install the two versions side-by-side, and do
pg_upgradecluster, which ports your configuration to the new version and does
a pg_dump between the two versions; exactly what a system administrator would
do. Of course, stuff _can_ fail, but it works for the simple cases, and a
great deal of the not-so-simple cases. I did this for our cluster the other
day (130 wildly different databases, from 7.4 to 8.1) and it worked
flawlessly.
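
On a Debian-style system, the procedure Steinar describes boils down to something like this (a sketch; assumes both versions' packages are installed and the old cluster is named "main"):

```shell
# Migrate the old cluster's data and configuration to the new
# version; this dumps from the old server and reloads into the new one
pg_upgradecluster 7.4 main
```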

I do not really see why all the distributions could not do something like
this, instead of mucking around with special statically compiled pg_dumps
and the like...

/* Steinar */
--
Homepage: http://www.sesse.net/

Re: Help speeding up delete

From
Ron Mayer
Date:
Christopher Kings-Lynne wrote:
>>>
>>> Quite seriously, if you're still using 7.2.4 for production purposes
>>> you could justifiably be accused of negligence....
>>
>> Perhaps we should put a link on the home page underneath LATEST RELEASEs
>> saying
>>     7.2: de-supported
>> with a link to a scary note along the lines of the above.
>
> I strongly support an explicit desupported notice for 7.2 and below on
> the website...


I'd go so far as to say that the version #s of supported versions
is one of the pieces of information I'd most expect to see on
the main support page ( http://www.postgresql.org/support/ ).

Perhaps it'd be nice to even show a table like
    Version    Released On     Support Ends
    7.1        4 BC            Sep 3 1752
    7.2        Feb 31 1900     Jan 0 2000
    7.4        2003-11-17      At least 2005-x-x
    8.0        2005-01-19      At least 2006-x-x
with a footnote saying that only the most recent dot release
of each family is considered supported.

It also might be nice to have a footnote saying that any
of the commercial support companies might support the older
versions for longer periods of time.

Re: Help speeding up delete

From
Christopher Kings-Lynne
Date:
> I do not really see why all the distributions could do something like this,
> instead of mucking around with special statically compiled pg_dumps and the
> like...

Contrib modules and tablespaces.

Plus, no version of pg_dump before 8.0 is able to actually perform such
reliable dumps and reloads (due to bugs).  However, that's probably moot
these days.

Chris


Re: Help speeding up delete

From
Leigh Dyer
Date:
Steve Wampler wrote:
>
> Is the performance behavior that we're experiencing a known
> problem with 7.2 that has been addressed in 7.4?  Or will the
> upgrade fix other problems while leaving this one?

I'm pretty sure that in versions earlier than 7.4, IN clauses that use a
subquery will always use a seqscan, regardless of what indexes are
available. If you try an IN using explicit values though, it should use
the index.
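
So a workaround on 7.2 would be to fetch the ids first and feed them back in as explicit values (a sketch based on the thread's query; the literal ids in step 2 are made up, and on later versions the EXISTS form discussed earlier is cleaner):

```sql
-- Step 1: fetch the matching ids (fast, per the thread)
SELECT id FROM tmp_table2 WHERE name='obsid' AND value='oid080505';

-- Step 2: delete using the explicit list returned above
DELETE FROM tmp_table2 WHERE id IN ('id1', 'id2', 'id3');
```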

Thanks
Leigh