Re: Optimize date query for large child tables: GiST or GIN? - Mailing list pgsql-performance

From David Jarvis
Subject Re: Optimize date query for large child tables: GiST or GIN?
Date
Msg-id AANLkTikl8wlHPJaArMWO9XWgG0vk9tununw3eUAYh1IC@mail.gmail.com
Whole thread Raw
In response to Re: Optimize date query for large child tables: GiST or GIN?  (Thom Brown <thombrown@gmail.com>)
Responses Re: Optimize date query for large child tables: GiST or GIN?
List pgsql-performance
When using MySQL, the performance was okay (~5 seconds per query) using:

  date( concat_ws( '-', y.year, m.month, d.day ) ) between
    -- Start date.
    date( concat_ws( '-', y.year, $P{Month1}, $P{Day1} ) ) AND
    -- End date. Calculated by checking to see if the end date wraps
    -- into the next year. If it does, then add 1 to the current year.
    --
    date(
      concat_ws( '-',
        y.year + greatest( -1 *
          sign(
            datediff(
              date(
                concat_ws('-', y.year, $P{Month2}, $P{Day2} )
              ),
              date(
                concat_ws('-', y.year, $P{Month1}, $P{Day1} )
              )
            )
          ), 0
        ), $P{Month2}, $P{Day2}
      )
    )

This calculated the correct start days and end days, including leap years.

With MySQL, I "normalized" the date into three different tables: year references, month references, and day references. The days contained only the day (of the month) the measurement was made and the measured value. The month references contained the month number for the measurement. The year references had the years and station. Each table had its own index on the year, month, or day.

When I had proposed that solution to the mailing list, I was introduced to a more PostgreSQL-way, which was to use indexes on the date field.

In PostgreSQL, I have a single "measurement" table for the data (divided into 72 child tables), which includes the date and station. I like this because it feels clean and it is easier to understand. So far, however, it has not been fast.

I was thinking that I could add three more columns to the measurement table:

year_taken, month_taken, day_taken

Then index those. That should allow me to avoid extracting years, months, and days from the m.taken date column.

What do you think?

Thanks again!
Dave

pgsql-performance by date:

Previous
From: Tom Lane
Date:
Subject: Re: [BUGS] Query causing explosion of temp space with join involving partitioning
Next
From: Tom Lane
Date:
Subject: Re: Optimize date query for large child tables: GiST or GIN?