Re: Merging records in a table with 2-columns primary key - Mailing list pgsql-general

From Francisco Olarte
Subject Re: Merging records in a table with 2-columns primary key
Date
Msg-id CA+bJJbwDn=F8egmFcDXOP6mfpw6LFG9EBYsZVO-5RVbs38zCkQ@mail.gmail.com
Whole thread Raw
In response to Merging records in a table with 2-columns primary key  (Alexander Farber <alexander.farber@gmail.com>)
List pgsql-general
Alexander:

On Sun, Apr 2, 2017 at 4:26 PM, Alexander Farber
<alexander.farber@gmail.com> wrote:
> The purpose of the function is to merge several user records to one (with
> the lowest uid).

It looks complicated ( more on this later )

> While merging the reviews records I delete all self-reviews and try to copy
> over as many remaining reviews as possible.
...
> test=> SELECT out_uid FROM merge_users(ARRAY[1,2,3,4]);
> ERROR:  new row for relation "reviews" violates check constraint
> "reviews_check"
> DETAIL:  Failing row contains (1, 1, User 4 says: 3 is ugly).

mmm, Maybe this is related to constrint immediateness, but anyway I
would suggest another approach for calculating the set of new reviews.

If you just delete every review for the set and capture the
not-self-referential:

with deleted as ( delete from reviews where uid in in_uids returning *)
, candidates as ( select * from deleted where author not in in_uids )

You can then generate a new review-set from it with some criteria:

, cleaned as (select author, min(review) as review from candidates group by 1)

And then insert them back

insert into reviews select $out_id, author, review from cleaned;

If I were more fluent with the window functions I would recommend
ordering the cleaned query by uid=$out_id DESC and getting the first
row with one of them ( the DESC order puts true first, IIRC, so it
favors keeping the original review for $out_id), or using string_agg
to try to keep all the texts ).

I've found that on many of this "merging" problems it's easier to
extract all, clean them, reinsert. Normally my data sets are big so I
just delete to a temporary ( not in sql way, just one which I create
and then drop ) table, clean on it and reimport them. It also makes
debugging the code much easier ( as the temp table can be cloned to
test easily ). For encapsulation "with" helps a lot, or, in a
function, you can use an real temporary table.

Francisco Olarte


pgsql-general by date:

Previous
From: Alexander Farber
Date:
Subject: Re: Merging records in a table with 2-columns primary key
Next
From: Francisco Olarte
Date:
Subject: Re: Merging records in a table with 2-columns primary key