Niels Kristian Schjødt wrote:
> So my main concern is actually about the cars table, since this
> one currently has a lot of columns (151 - I expect that's quite a
> lot?),
That's pretty wide, but not outrageous.
> and a lot of data (4 mil. rows, and growing).
That's not a big deal. It's not unusual to have hundreds of
millions of rows in a PostgreSQL table. Properly indexed, that
should perform fine on queries. Sometimes partitioning rows into
sub-tables helps, but you didn't really mention anything which
suggests that would be helpful for you.
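Just so you can see the shape of it in case you ever do need it:
newer PostgreSQL releases support declarative partitioning (older
ones use table inheritance plus triggers). A rough sketch, where the
created_at column and the date ranges are invented for illustration
rather than taken from your schema:

    -- parent table, split into child tables by a range of dates
    CREATE TABLE cars (
        id         bigint  NOT NULL,
        created_at date    NOT NULL,
        price      numeric,
        mileage    integer,
        -- ... remaining columns ...
        PRIMARY KEY (id, created_at)
    ) PARTITION BY RANGE (created_at);

    -- one partition per year; queries filtering on created_at
    -- only have to scan the partitions that can match
    CREATE TABLE cars_2012 PARTITION OF cars
        FOR VALUES FROM ('2012-01-01') TO ('2013-01-01');

But again, nothing you've described so far suggests you need it.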
> Now you might start by thinking, this could sound like a regular
> need for some normalization
On the contrary, what you describe sounds well normalized. Breaking
off attributes of a car into separate tables would not advance
that.
> The columns in this table are for the most part very short strings,
> integers, decimals, or booleans. So take, for example,
> has_automatic_transmission (boolean): I can't see why it would
> make sense to put that into a separate table and join in the
> values. Or the mileage or the price, as another example. The cars
> table used for search is indexed quite a lot.
On the face of it, it sounds like you should have some one-column
indexes on the columns most useful for selection (based on how
often they are used and how selective a condition on the column
tends to be).
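For example (the column names here are just guesses based on what
you mentioned; adjust to your actual schema):

    -- b-tree indexes on the columns most often used to narrow a search
    CREATE INDEX cars_price_idx   ON cars (price);
    CREATE INDEX cars_mileage_idx ON cars (mileage);

    -- a boolean by itself is rarely selective enough to be worth
    -- indexing, but a partial index can pay off when one value is
    -- uncommon and frequently queried
    CREATE INDEX cars_automatic_price_idx ON cars (price)
        WHERE has_automatic_transmission;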
You might benefit from a technique called "vertical partitioning"
-- where you split less frequently referenced columns, and/or
columns which are updated more often, into "sibling" tables with
the same primary key as the cars table. That can sometimes buy some
performance at the expense of programming complexity and more
difficulty maintaining data integrity. I wouldn't go there without
evidence that your performance is not adequate without it.
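If you ever do go that way, the shape is roughly this (table and
column names invented purely for illustration):

    -- "hot" columns used for search stay in the main table
    CREATE TABLE cars (
        id                         bigint PRIMARY KEY,
        price                      numeric,
        mileage                    integer,
        has_automatic_transmission boolean
        -- ... other frequently searched columns ...
    );

    -- rarely read or frequently updated columns move to a sibling
    -- table sharing the same primary key
    CREATE TABLE cars_extra (
        id               bigint PRIMARY KEY REFERENCES cars (id),
        long_description text,
        view_count       integer
    );

    -- queries needing both halves join on the key
    SELECT c.price, e.long_description
      FROM cars c
      JOIN cars_extra e ON e.id = c.id
     WHERE c.id = 12345;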
-Kevin