Searching for Duplicates and Hosed the System - Mailing list pgsql-general

From Bill Thoen
Subject Searching for Duplicates and Hosed the System
Msg-id 20070819164450.GA15623@www.gisnet.com
Responses Re: Searching for Duplicates and Hosed the System  (Bill Moran <wmoran@potentialtech.com>)
Re: Searching for Duplicates and Hosed the System  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-general
I'm new to PostgreSQL and I ran into a problem I don't want to repeat. I have
a database with a little more than 18 million records that takes up about
3GB. I need to check to see if there are duplicate records, so I tried a
command like this:

SELECT count(*) AS count, fld1, fld2, fld3, fld4 FROM MyTable
  GROUP BY fld1, fld2, fld3, fld4
  ORDER BY 1 DESC;

I knew this would take some time, but what I didn't expect was that about
an hour into the select, my mouse and keyboard locked up, and I couldn't
log in from another computer via SSH either. This is a Linux machine
running Fedora Core 6, and PostgreSQL is 8.1.4. There's about 50GB free on
the disk too.

I finally had to shut the power off and reboot to regain control of my
computer (that wasn't a good idea, either, but eventually I got everything
working again).

Is this normal behavior by PG with large databases? Did I misconfigure
something? Does anyone know what might be wrong?

- Bill Thoen
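
A minimal sketch of a narrower form of the duplicate check in the post, reusing
its placeholder names (MyTable, fld1 through fld4). Two things differ from the
original query, and both are assumptions about what is useful here rather than
a diagnosis of the lockup: a HAVING clause keeps only the groups that occur
more than once, and EXPLAIN prints the plan the server would use without
actually running the statement.

```sql
-- Sketch only: MyTable and fld1..fld4 are the placeholder names from the post.

-- Step 1: show the planner's chosen strategy without executing the query.
EXPLAIN
SELECT count(*) AS count, fld1, fld2, fld3, fld4 FROM MyTable
  GROUP BY fld1, fld2, fld3, fld4
  HAVING count(*) > 1
  ORDER BY 1 DESC;

-- Step 2: the same statement without EXPLAIN returns only the duplicated
-- combinations, most frequent first.
SELECT count(*) AS count, fld1, fld2, fld3, fld4 FROM MyTable
  GROUP BY fld1, fld2, fld3, fld4
  HAVING count(*) > 1
  ORDER BY 1 DESC;
```

The HAVING filter only reduces the number of rows returned and sorted. The
grouping step still has to process every row, so this variant is not
guaranteed to avoid the behavior described above.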

