On 2024-02-03 19:25:12 +0530, Lok P wrote:
> Apology. One correction, the query is like below. I. E filter will be on on
> ctid which I believe is equivalent of rowid in oracle and we will not need the
> index on Id column then.
>
> But, it still runs long, so thinking any other way to make the duplicate
> removal faster?
>
> Also wondering , the index creation which took ~2.5hrs+ , would that have been
> made faster any possible way by allowing more db resource through some session
> level db parameter setting?
>
> create table TAB1_New
> as
> SELECT * from TAB1 A
> where CTID in
> (select min(CTID) from TAB1
> group by ID having count(ID)>=1 );
That »having count(ID)>=1« seems redundant to me. Surely every id which
occurs in the table occurs at least once?
Since you want ID to be unique I assume that it is already almost
unique - so only a small fraction of the ids will be duplicates. So I
would start with creating a list of duplicates:
create table tab1_dups as
select id, count(*) from tab1 group by id having count(*) > 1;
This will still take some time because it needs to build a temporary
structure large enough to hold a count for each individual id. But at
least then you'll have a much smaller table to use for further cleanup.
hp
--
_ | Peter J. Holzer | Story must make more sense than reality.
|_|_) | |
| | | hjp@hjp.at | -- Charles Stross, "Creative writing
__/ | http://www.hjp.at/ | challenge!"