Home > mailing lists

Re: Database design - best practice - Mailing list pgsql-performance

From	Kevin Grittner
Subject	Re: Database design - best practice
Date	November 28, 2012 17:40:17
Msg-id	20121128144006.69300@gmx.com Whole thread Raw
In response to	Database design - best practice (Niels Kristian Schjødt <nielskristian@autouncle.com>)
List	pgsql-performance

Tree view

Niels Kristian Schjødt wrote:

> So my main concern is actually about the cars table, since this
> one currently has a lot of columns (151 - I expect thats quite a
> lot?),

That's pretty wide, but not outrageous.

> and a lot of data (4 mil. rows, and growing).

That's not a big deal. It's not unusual to have hundreds of
millions of rows in a PostgreSQL table. Properly indexed, that
should perform fine on queries. Sometimes partitioning rows into
sub-tables helps, but you didn't really mention anything which
suggests that would be helpful for you.

> Now you might start by thinking, this could sound like a regular
> need for some normalization

On the contrary, what you describe sounds well normalized. Breaking
off attributes of a car into separate tables would not advance
that.

> The columns in this table is for the most very short stings,
> integers, decimals or booleans. So take for an example
> has_automatic_transmission (boolean) I can't see why it would
> make sense to put that into a separate table and join in the
> values. Or the milage or the price as another example. The cars
> table used for search is indexed quite a lot.

On the face of it, it sounds like you should have some one-column
indexes on the columns most useful for selection (based on
frequency of use and how selective a selection on the column tends
to be).

You might benefit from a technique called "vertical partitioning"
-- where you split off less frequently referenced column and/or
columns which are updated more often into "sibling" tables, with
the same primary key as the car table. That can sometimes buy some
performance at the expense of programming complexity and more
difficulty maintaining data integrity. I wouldn't go there without
evidence that your performance is not adequate without it.

-Kevin

pgsql-performance by date:

From: Marcin Mirosław
Date: 28 November 2012, 17:39:59
Subject: Re: Optimize update query

From: Vitalii Tymchyshyn
Date: 28 November 2012, 17:41:24
Subject: Re: Database design - best practice

Re: Database design - best practice - Mailing list pgsql-performance

Previous

Next