I am using a query like this to try to normalize a table:
WITH nd AS (
    SELECT * FROM sales ORDER BY id LIMIT 100
),
people_update AS (
    UPDATE people p
    SET first_name = nd.first_name
    FROM nd
    WHERE p.email = nd.email
    RETURNING nd.id
)
INSERT INTO people (first_name, email, created_at, updated_at)
SELECT first_name, email, now(), now()
FROM nd
LEFT JOIN people_update USING (id)
WHERE people_update.id IS NULL;
This works pretty well except when the top 100 records contain
duplicate email addresses (two sales for the same email address).
I am wondering what the best strategy is for dealing with this
scenario. Processing the records one at a time would work, but it
would obviously be much slower. There are no other columns I can
rely on to make a record unique either.
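One idea I considered is deduplicating the batch first with DISTINCT ON, so only one row per email ever reaches the update/insert. A sketch (this assumes keeping the sale with the highest id, i.e. the latest one, is acceptable and the older duplicate can be dropped, which I'm not sure about):

```sql
WITH batch AS (
    SELECT * FROM sales ORDER BY id LIMIT 100
),
nd AS (
    -- DISTINCT ON (email) keeps the first row per email in the
    -- ORDER BY, so "email, id DESC" keeps the latest sale per email
    SELECT DISTINCT ON (email) *
    FROM batch
    ORDER BY email, id DESC
),
people_update AS (
    UPDATE people p
    SET first_name = nd.first_name
    FROM nd
    WHERE p.email = nd.email
    RETURNING nd.id
)
INSERT INTO people (first_name, email, created_at, updated_at)
SELECT first_name, email, now(), now()
FROM nd
LEFT JOIN people_update USING (id)
WHERE people_update.id IS NULL;
```

With that, duplicate emails inside one batch can no longer cause the same people row to be updated or inserted twice, but I don't know if discarding the older duplicate is the best strategy.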