Re: Optimize date query for large child tables: GiST or GIN? - Mailing list pgsql-performance

From David Jarvis
Subject Re: Optimize date query for large child tables: GiST or GIN?
Date
Msg-id AANLkTinFXWorUTT9iqagOKbsid_ojdniMC0e41ZK_mB7@mail.gmail.com
Whole thread Raw
In response to Re: Optimize date query for large child tables: GiST or GIN?  (Yeb Havinga <yebhavinga@gmail.com>)
Responses Re: Optimize date query for large child tables: GiST or GIN?
List pgsql-performance
Hi, Yeb.

This is starting to go back to the design I used with MySQL:
  • YEAR_REF - Has year and station
  • MONTH_REF - Has month, category, and yea referencer
  • MEASUREMENT - Has month reference, amount, and day
Normalizing by date parts was fast. Partitioning the tables by year won't do much good -- users will probably choose 1900 to 2009, predominately.

I thought about splitting the data by station by category, but that's ~73000 tables. My understanding is that PostgreSQL uses files per index, which would be messy at the OS level (Linux 2.6.31). Even by station alone is 12139 tables, which might be tolerable for now, but with an order of magnitude more stations on the distant horizon, it will not scale.

I also thought about splitting the data by station district by category -- there are 79 districts, yielding 474 child tables, which is ~575000 rows per child table. Most of the time I'd imagine only one or two districts would be selected. (Again, hard to know exactly.)

Dave

pgsql-performance by date:

Previous
From: Matthew Wakeling
Date:
Subject: Re: Optimize date query for large child tables: GiST or GIN?
Next
From: Yeb Havinga
Date:
Subject: Re: Optimize date query for large child tables: GiST or GIN?