Thread: duplicate rows mystery
I'm trying to remove duplicate records from a table. The best way I could think of was to select into a new table using DISTINCT. I issued the following command: select distinct on (symbol, date) * into price from pricebackup; It did indeed remove several rows that had duplicate items. I then tried to build a unique index on the columns (symbol, date) in pricebackup with: create unique index price_symbol_date on pricebackup (symbol, date); and got the error: ERROR: could not create unique index DETAIL: Table contains duplicated values. How is it possible that I could have duplicate values in that table when I just selected them into the table as distinct? What am I missing? Any help is appreciated. Thanks, Gerry
On Thu, Apr 14, 2005 at 03:26:16AM -0600, Gerry Jensen wrote: > > select distinct on (symbol, date) * into price from pricebackup; > > It did indeed remove several rows that had duplicate items. I then tried > to build a unique index on the columns (symbol, date) in pricebackup with: > > create unique index price_symbol_date on pricebackup (symbol, date); > > and got the error: > > ERROR: could not create unique index > DETAIL: Table contains duplicated values. If the SELECT and CREATE statements are what you really did, then you tried to create an index on the original table (pricebackup), not the table you SELECTed into (price). Are you sure you have the right table names in the right places? -- Michael Fuhr http://www.fuhr.org/~mfuhr/
Gerry Jensen <gerry@xmission.com> writes: > I issued the following command[s]: > select distinct on (symbol, date) * into price from pricebackup; > create unique index price_symbol_date on pricebackup (symbol, date); > and got the error: > ERROR: could not create unique index > DETAIL: Table contains duplicated values. Um, isn't pricebackup the one that still has the dup rows? You should have created the index on the new table. regards, tom lane
Sorry. That was a typo in my post. I switched the table names in my SELECT INTO line in my post (I tested it again to make sure that's not what I did in psql and my results were the same). price was the original table with duplicate rows. What I really did was: select distinct on (symbol, date) * into pricebackup from price; then: create unique index price_symbol_date on pricebackup (symbol, date); Yet it still gave me the duplicate value errors. How is this possible? Thanks, Gerry On Thu, 14 Apr 2005, Michael Fuhr wrote: > On Thu, Apr 14, 2005 at 03:26:16AM -0600, Gerry Jensen wrote: >> >> select distinct on (symbol, date) * into price from pricebackup; >> >> It did indeed remove several rows that had duplicate items. I then tried >> to build a unique index on the columns (symbol, date) in pricebackup with: >> >> create unique index price_symbol_date on pricebackup (symbol, date); >> >> and got the error: >> >> ERROR: could not create unique index >> DETAIL: Table contains duplicated values. > > If the SELECT and CREATE statements are what you really did, then > you tried to create an index on the original table (pricebackup), > not the table you SELECTed into (price). Are you sure you have the > right table names in the right places? > > -- > Michael Fuhr > http://www.fuhr.org/~mfuhr/ >
How about the simple case that pricebackup is not empty before the select? You say it is a new table, but did you do a test run or anything like that before the first select? How about dropping or truncating the table and starting again? On Thu, 2005-04-14 at 18:09, Gerry Jensen wrote: > Sorry. That was a typo in my post. I switched the table names in my > SELECT INTO line in my post (I tested it again to make sure that's not > what I did in psql and my results were the same). price was the original > table with duplicate rows. What I really did was: > > select distinct on (symbol, date) * into pricebackup from price; > > then: > > create unique index price_symbol_date on pricebackup (symbol, date); > > Yet it still gave me the duplicate value errors. How is this possible? > > Thanks, > > Gerry > > On Thu, 14 Apr 2005, Michael Fuhr wrote: > > > On Thu, Apr 14, 2005 at 03:26:16AM -0600, Gerry Jensen wrote: > >> > >> select distinct on (symbol, date) * into price from pricebackup; > >> > >> It did indeed remove several rows that had duplicate items. I then tried > >> to build a unique index on the columns (symbol, date) in pricebackup with: > >> > >> create unique index price_symbol_date on pricebackup (symbol, date); > >> > >> and got the error: > >> > >> ERROR: could not create unique index > >> DETAIL: Table contains duplicated values. > > > > If the SELECT and CREATE statements are what you really did, then > > you tried to create an index on the original table (pricebackup), > > not the table you SELECTed into (price). Are you sure you have the > > right table names in the right places? > > > > -- > > Michael Fuhr > > http://www.fuhr.org/~mfuhr/ > > > > ---------------------------(end of broadcast)--------------------------- > TIP 6: Have you searched our list archives? > > http://archives.postgresql.org -- Regards, Steve Tucknott ReTSol Ltd DDI 01903 828769 MOBILE 07736715772
Yes. I tried it several times, dropping the newly created table each time. I'm wondering if my database is just corrupt. My indexes in my original table were definitely corrupt (I originally had a unique (symbol, date) index). I discovered this when certain queries failed to return certain rows (e.g. selecting on a specific date), but different queries did return those rows (e.g. selecting a range of dates that included the missing date). When I dropped the indexes, I was able to again see the missing rows with the specific date query. Apparently the index corruption had allowed the duplicate rows to be inserted in the first place. So perhaps there's something seriously wrong with my database and I need to start over from scratch. The main reason for my post is that I wanted to make sure I was understanding "DISTINCT ON" correctly and there was nothing wrong with my logic. If those things look fine and logical to everyone, then my posting here has been beneficial to me. Anyway, thanks for the suggestion. Gerry On Thu, 14 Apr 2005, Steve Tucknott wrote: > How about the simple case that pricebackup is not empty before the > select? You say it is a new table, but did you do a test run or anything > like that before the first select? How about dropping or truncating the > table and starting again? > > > On Thu, 2005-04-14 at 18:09, Gerry Jensen wrote: >> Sorry. That was a typo in my post. I switched the table names in my >> SELECT INTO line in my post (I tested it again to make sure that's not >> what I did in psql and my results were the same). price was the original >> table with duplicate rows. What I really did was: >> >> select distinct on (symbol, date) * into pricebackup from price; >> >> then: >> >> create unique index price_symbol_date on pricebackup (symbol, date); >> >> Yet it still gave me the duplicate value errors. How is this possible? >> >> Thanks, >> >> Gerry >> >> On Thu, 14 Apr 2005, Michael Fuhr wrote: >> >>> On Thu, Apr 14, 2005 at 03:26:16AM -0600, Gerry Jensen wrote: >>>> >>>> select distinct on (symbol, date) * into price from pricebackup; >>>> >>>> It did indeed remove several rows that had duplicate items. I then tried >>>> to build a unique index on the columns (symbol, date) in pricebackup with: >>>> >>>> create unique index price_symbol_date on pricebackup (symbol, date); >>>> >>>> and got the error: >>>> >>>> ERROR: could not create unique index >>>> DETAIL: Table contains duplicated values. >>> >>> If the SELECT and CREATE statements are what you really did, then >>> you tried to create an index on the original table (pricebackup), >>> not the table you SELECTed into (price). Are you sure you have the >>> right table names in the right places? >>> >>> -- >>> Michael Fuhr >>> http://www.fuhr.org/~mfuhr/ >>> >> >> ---------------------------(end of broadcast)--------------------------- >> TIP 6: Have you searched our list archives? >> >> http://archives.postgresql.org > -- > > > Regards, > > Steve Tucknott > ReTSol Ltd > > DDI 01903 828769 > MOBILE 07736715772 > > > > ---------------------------(end of broadcast)--------------------------- > TIP 4: Don't 'kill -9' the postmaster >