Thread: "Unvisible" duplicates in SELECT [...] HAVING count(attrib) > 1
============================================================================ POSTGRESQL BUG REPORT TEMPLATE ============================================================================ Your name : Krzysztof Gajdemski Your email address : songo@vi.pl System Configuration --------------------- Architecture (example: Intel Pentium) : Intel Pentium Operating System (example: Linux 2.0.26 ELF) : 2.2.19 SMP PostgreSQL version (example: PostgreSQL-7.1): PostgreSQL-7.1rc4 Compiler used (example: gcc 2.8.0) : gcc 2.95.3 Please enter a FULL description of your problem: ------------------------------------------------ SELECT attrib FROM table GROUP BY attrib HAVING count(attrib) > 1; query doesn't produce reliable results for some specific circumstances (I suppose that number of records and/or strange ASCII chars in varchar fields are one of required conditions to reproduce this behaviour, but I can't define additional for now). Table which I use in following example is available at: http://i.use.vi.pl/postgres/dup_example.sql.gz (size 400KB, this is why I don't place it here). Please describe a way to repeat the problem. Please try to provide a concise reproducible example, if at all possible: ---------------------------------------------------------------------- songo=> SELECT version(); version ---------------------------------------------------------------- PostgreSQL 7.1rc4 on i686-pc-linux-gnu, compiled by GCC 2.95.3 (1 row) songo=> \i dup_example.sql psql:dup_example.sql:14: NOTICE: CREATE TABLE/PRIMARY KEY will create implicit index 'user_id_pkey' for table 'user_id' CREATE CHANGE CHANGE songo=> SELECT nick FROM user_id GROUP BY nick HAVING count(nick) > 1; nick ------------ sexylady\@ (1 row) songo=> DELETE FROM user_id WHERE nick IN ( songo(> SELECT nick FROM user_id GROUP BY nick HAVING count(nick) > 1); DELETE 2 songo=> SELECT nick FROM user_id GROUP BY nick HAVING count(nick) > 1; nick ------ (0 rows) songo=> CREATE UNIQUE INDEX nick_idx ON user_id(nick); ERROR: Cannot create unique index. Table contains non-unique values songo=> SELECT nick FROM user_id WHERE nick LIKE 'dr dre%'; nick -------------------- dr dre\$\$\$\$\$\$ dr dre\$\$\$\$\$\$ (2 rows) songo=> SELECT nick FROM user_id GROUP BY nick HAVING count(nick) > 1; nick ------ (0 rows) If you know how this problem might be fixed, list the solution below: --------------------------------------------------------------------- k. -- - - - - - - Krzysztof Gajdemski | songo @ vi.pl | KG4751-RIPE Registered Linux User # 133457 | BLUG Registered Member # 0005 PGP public keys at: keyserver.linux.bydg.org * KeyID: D336705B ,,Szanujê was wszystkich, którzy pozostajecie w cieniu'' SNERG
Krzysztof Gajdemski <songo@vi.pl> writes: > PostgreSQL version (example: PostgreSQL-7.1): PostgreSQL-7.1rc4 > query doesn't produce reliable results for some specific circumstances I tried your example in current sources and get reasonable-looking behavior: regression=# SELECT nick FROM user_id GROUP BY nick HAVING count(nick) > 1; nick -------------------- dr dre\$\$\$\$\$\$ sexylady\@ (2 rows) regression=# DELETE FROM user_id WHERE nick IN ( regression(# SELECT nick FROM user_id GROUP BY nick HAVING count(nick) > 1); DELETE 4 regression=# SELECT nick FROM user_id GROUP BY nick HAVING count(nick) > 1; nick ------ (0 rows) regression=# CREATE UNIQUE INDEX nick_idx ON user_id(nick); CREATE regression=# SELECT nick FROM user_id WHERE nick LIKE 'dr dre%'; nick ------ (0 rows) regression=# SELECT nick FROM user_id GROUP BY nick HAVING count(nick) > 1; nick ------ (0 rows) regression=# It could be that there's a locale dependency here (I did this in a non-locale-enabled compilation), but I think it's more likely that you've run into one of the complex-query bugs we corrected since rc4. Please update to 7.1.1 (or better 7.1.2, which should be out very soon) and see if you still see a problem. regards, tom lane
18.05.2001, 11:12:27, Tom Lane wrote: > Krzysztof Gajdemski <songo@vi.pl> writes: > > PostgreSQL version (example: PostgreSQL-7.1): PostgreSQL-7.1rc4 > > > query doesn't produce reliable results for some specific circumstances > > I tried your example in current sources and get reasonable-looking > behavior: > > regression=# SELECT nick FROM user_id GROUP BY nick HAVING count(nick) > 1; > nick > -------------------- > dr dre\$\$\$\$\$\$ > sexylady\@ > (2 rows) [ cut ] > It could be that there's a locale dependency here (I did this in a > non-locale-enabled compilation), I've used binary PostgreSQL package from Debian/unstable distribution, so could it be, that this is Debian specific problem if you can't reproduce it. > but I think it's more likely that > you've run into one of the complex-query bugs we corrected since rc4. > Please update to 7.1.1 (or better 7.1.2, which should be out very soon) > and see if you still see a problem. Yes, it is a good idea. I've to try compile it directly from source (with locale support enabled). k. -- - - - - - - Krzysztof Gajdemski | songo @ vi.pl | KG4751-RIPE Registered Linux User # 133457 | BLUG Registered Member # 0005 PGP public keys at: keyserver.linux.bydg.org * KeyID: D336705B ,,Szanujê was wszystkich, którzy pozostajecie w cieniu'' SNERG