From: Sherman Willden <operasopranos@gmail.com> Date: Sunday, 31 December 2017 at 18:19 To: <pgsql-general@postgresql.org> Subject: Find duplicates in a column then print Information to a file
Development Platform: Ubuntu 17.10 mainly command line work
Tools: perl 5.26 and postgresql 9.6
Goal: Display duplicate aria titles on screen and to a local file
Database name: arias
Table name: aria_precis
csv delimiter: the # symbol
arias=# \d aria_precis Table "public.aria_precis" Column | Type | Modifiers -------------+------+----------- id | text | not null aria | text | artist | text | a_artist | text | album_title | text | Indexes: "aria_precis_pkey" PRIMARY KEY, btree (id)
Problems: 1. Can't connect aria title to id 2. Can't write discovered information to file
What I know displays the 46 duplicate aria titles of 413 entries but not the id or artist: SELECT aria FROM aria_precis WHERE aria IN (SELECT aria FROM aria_precis GROUP BY aria HAVING COUNT(aria)>1);
When I perform the following I get (0 rows): SELECT aria FROM aria_precis WHERE aria IN (SELECT aria FROM aria_precis GROUP BY id, aria HAVING COUNT(aria)>1); aria ------ (0 rows)
After I get the above information how do I write that information to a file? The only thing I know writes the entire database to a file: \COPY aria_precis TO '/home/sherman/aria_precis_2.csv' WITH DELIMITER '#,' CSV HEADER;
Thank you;
Sherman
========================================
On the cmdline just do:
psql arias <pwd> -c ‘SELECT aria FROM aria_precis WHERE aria IN (SELECT aria FROM aria_precis GROUP BY aria HAVING COUNT(aria)>1)‘ > outfile.txt