Re: Database design - best practice - Mailing list pgsql-performance

From Kevin Grittner
Subject Re: Database design - best practice
Date
Msg-id 20121128144006.69300@gmx.com
Whole thread Raw
In response to Database design - best practice  (Niels Kristian Schjødt <nielskristian@autouncle.com>)
List pgsql-performance
Niels Kristian Schjødt wrote:

> So my main concern is actually about the cars table, since this
> one currently has a lot of columns (151 - I expect thats quite a
> lot?),

That's pretty wide, but not outrageous.

> and a lot of data (4 mil. rows, and growing).

That's not a big deal. It's not unusual to have hundreds of
millions of rows in a PostgreSQL table. Properly indexed, that
should perform fine on queries. Sometimes partitioning rows into
sub-tables helps, but you didn't really mention anything which
suggests that would be helpful for you.

> Now you might start by thinking, this could sound like a regular
> need for some normalization

On the contrary, what you describe sounds well normalized. Breaking
off attributes of a car into separate tables would not advance
that.

> The columns in this table is for the most very short stings,
> integers, decimals or booleans. So take for an example
> has_automatic_transmission (boolean) I can't see why it would
> make sense to put that into a separate table and join in the
> values. Or the milage or the price as another example. The cars
> table used for search is indexed quite a lot.

On the face of it, it sounds like you should have some one-column
indexes on the columns most useful for selection (based on
frequency of use and how selective a selection on the column tends
to be).

You might benefit from a technique called "vertical partitioning"
-- where you split off less frequently referenced column and/or
columns which are updated more often into "sibling" tables, with
the same primary key as the car table. That can sometimes buy some
performance at the expense of programming complexity and more
difficulty maintaining data integrity. I wouldn't go there without
evidence that your performance is not adequate without it.

-Kevin


pgsql-performance by date:

Previous
From: Marcin Mirosław
Date:
Subject: Re: Optimize update query
Next
From: Vitalii Tymchyshyn
Date:
Subject: Re: Database design - best practice