Home > mailing lists

Re: How to do faster DML - Mailing list pgsql-general

From	Peter J. Holzer
Subject	Re: How to do faster DML
Date	February 3, 2024 20:29:47
Msg-id	20240203202947.u45hpyiwzuodbav2@hjp.at Whole thread Raw
In response to	Re: How to do faster DML (Lok P <loknath.73@gmail.com>)
List	pgsql-general

Tree view

On 2024-02-03 19:25:12 +0530, Lok P wrote:
> Apology. One correction, the query is like below. I. E filter will be on on
> ctid which I believe is equivalent of rowid in oracle and we will not need the
> index on Id column then. 
>
>  But, it still runs long, so thinking any other way to make the duplicate
> removal faster? 
>
> Also wondering , the index creation which took ~2.5hrs+ , would that have been
> made faster any possible way by allowing more db resource through some session
> level db parameter setting?
>
> create table TAB1_New
> as
> SELECT  * from TAB1 A
> where CTID in
>       (select min(CTID) from TAB1
>       group by ID having count(ID)>=1 );

That »having count(ID)>=1« seems redundant to me. Surely every id which
occurs in the table occurs at least once?

Since you want ID to be unique I assume that it is already almost
unique - so only a small fraction of the ids will be duplicates. So I
would start with creating a list of duplicates:

create table tab1_dups as
select id, count(*) from tab1 group by id having count(*) > 1;

This will still take some time because it needs to build a temporary
structure large enough to hold a count for each individual id. But at
least then you'll have a much smaller table to use for further cleanup.

        hp

--
   _  | Peter J. Holzer    | Story must make more sense than reality.
|_|_) |                    |
| |   | hjp@hjp.at         |    -- Charles Stross, "Creative writing
__/   | http://www.hjp.at/ |       challenge!"

Attachment

signature.asc

pgsql-general by date:

From: Francisco Olarte
Date: 03 February 2024, 19:20:01
Subject: Re: How to do faster DML

From: Lok P
Date: 03 February 2024, 20:44:20
Subject: Re: How to do faster DML

Re: How to do faster DML - Mailing list pgsql-general

Attachment

Previous

Next