Re: Questions about indexes? - Mailing list pgsql-performance

From Josh Berkus
Subject Re: Questions about indexes?
Date
Msg-id 200302180937.21001.josh@agliodbs.com
Whole thread Raw
Responses Performance Baseline Script
Re: Questions about indexes?
List pgsql-performance
Ryan,

I am crossing this discussion to the PGSQL-PERFORMANCE list, which is the
proper place for it.   Anyone interested, please follow us there!

>>>Ryan Bradetich said:
 > the table would look like:
 > 1 | Mon Feb 17 00:34:24 MST 2003 | p101 | user x has an invalid shell.
 > 1 | Mon Feb 17 00:34:24 MST 2003 | p101 | user y has an invalid shell.
 > 1 | Mon Feb 17 00:34:24 MST 2003 | p101 | user y has expired password.
 > 2 | Mon Feb 17 00:34:24 MST 2003 | f101 | file /foo has improper owner.
 > etc...
 >
 > So I do not need the anomaly to be part of the index, I only need it to
 >
 > I agree with you, that I would not normally add the anomally to the
 > index, except for the unique row requirement.  Thinking about it now,
 > maybe I should guarentee unique rows via a check constraint...
 >
 > Thanks for making me think about this in a different way!

First off, I'm not clear on why a duplicate anominaly would be necessarily
invalid, given the above.  Not useful, certainly, but legitimate real data.

I realize that you have performance considerations to work with.  However, I
must point out that your database design is not *at all* normalized ... and
that normalizing it might solve some of your indexing problems.

A normalized structure would look something like:

TABLE operations
    id serial not null primary key,
    host_id int not null,
    timeoccurred timestamp not null default now(),
    category varchar(5) not null,
    constraint operations_unq unique (host_id, timeoccurred, category)

TABLE anominalies
    id serial not null primary key,
    operation_id int not null references operations(id) on delete cascade,
    anominaly text

This structure would have two advantages for you:
1) the operations table would be *much* smaller than what you have now, as you
would not be repeating rows for each anominaly.
2) In neither table would you be including the anominaly text in an index ...
thus reducing index size tremendously.

As a warning, though:  you may find that for insert speed the referential
integrity key is better maintained at the middleware layer.   We've had some
complaints about slow FK enforcement in Postgres.

--
Josh Berkus
Aglio Database Solutions
San Francisco

pgsql-performance by date:

Previous
From: Manfred Koizar
Date:
Subject: Re: cost and actual time
Next
From: "Keith Bottner"
Date:
Subject: Performance Baseline Script