Home > mailing lists

median query causes disk to fill up - Mailing list pgsql-general

From	richyen3@gmail.com
Subject	median query causes disk to fill up
Date	April 14, 2007 19:44:25
Msg-id	1176481529.620598.150230@p77g2000hsh.googlegroups.com Whole thread Raw
List	pgsql-general

Tree view

Hi,

I'm trying to run a query to find the median value, organized by
date.  However, when I run the query, it runs for about 4 hours, and
then quits with the following message:

> ERROR:  could not write block 10447102 of temporary file: No space left on device
> HINT:  Perhaps out of disk space?

My table has 512327 rows and is the following:

>                 Table "public.m_uop_times"
>      Column     |            Type             | Modifiers
> ----------------+-----------------------------+-----------
>  objectid       | integer                     |
>  date_part      | double precision            |
>  date_start     | timestamp without time zone |
>  date_processed | timestamp without time zone |
>  gen_time       | integer                     |
> Indexes:
>     "m_uop_date_idx" btree (date_processed)
>     "m_uop_epoch_idx" btree (date_part)
>     "m_uop_gen_idx" btree (gen_time)
>     "m_uop_objectid_idx" btree (objectid)
>     "m_uop_start_idx" btree (date_start)

The date_part column is actually simply "EXTRACT (EPOCH FROM
date_start::date)" so that I could put an index on that date, and the
gen_time column is actually "date_processed-date_start" so that there
could be an index on that difference as well.

My median query is copied from
http://book.itzero.com/read/others/0602/OReilly.SQL.Cookbook.Dec.2005_html/0596009763/sqlckbk-CHP-7-SECT-10.html

Here it is:
> select date_start, avg(gen_time)
>   from (
> select a.date_start::date, a.gen_time
>   from m_uop_times a, m_uop_times b
>  where a.date_part = b.date_part
>  group by a.date_start::date, a.gen_time
> having sum(case when a.gen_time=b.gen_time then 1 else 0 end)
>                           >= abs(sum(sign(a.gen_time - b.gen_time)))
>        ) as foo
> group by date_start;

Basically, I want to find the median gen_time for each day.

Would anyone know a better way to do this, or have suggestions on how
I can make this work without dying?

Any help appreciated!
--Richard

pgsql-general by date:

From: Steve P Shiflett
Date: 14 April 2007, 19:44:23
Subject: display cursor contents

From: "Anton Melser"
Date: 14 April 2007, 20:01:51
Subject: Re: postgresql 8.1.4 to 8.2.3

median query causes disk to fill up - Mailing list pgsql-general

Previous

Next