Thread: Building a DB with performance in mind

Building a DB with performance in mind

From

Alexander Ranaldi

Date:

17 March 2005, 15:56:16

Greetings everyone,

I am about to migrate to Postgres from MySQL. My DB isn't enormous (<
1gb), consists mostly of just text, but is accessed quite heavily.
Because size isn't a huge issue, but performance is, I am willing to
normalize as necessary.

Currently I have a table "Entries" containing 500k rows. The table
contains many text columns, and a few others:
EntryID (unique, indexed)
UserID (references "Users" table, indexed)
Private (boolean. indexed)

Most of my queries return rows based on UserID, and also only if
Private is FALSE. Would it be in the interest of best performance to
split this table into two tables: "EntriesPrivate",
"EntriesNotPrivate" and remove the "Private" column?

I appreciate any feedback. I'm certainly not a DB design expert. :)

Thanks,
Alex

Re: Building a DB with performance in mind

From

"Steinar H. Gunderson"

Date:

17 March 2005, 16:09:47

On Thu, Mar 17, 2005 at 10:56:10AM -0500, Alexander Ranaldi wrote:
> Most of my queries return rows based on UserID, and also only if
> Private is FALSE. Would it be in the interest of best performance to
> split this table into two tables: "EntriesPrivate",
> "EntriesNotPrivate" and remove the "Private" column?

You could do a partial index if you'd like (ie. one only indexing rows where
Private=FALSE), but I'm not sure if it's the best solution.

/* Steinar */
--
Homepage: http://www.sesse.net/

Re: Building a DB with performance in mind

From

John Arbash Meinel

Date:

17 March 2005, 16:21:48

Alexander Ranaldi wrote:

>Greetings everyone,
>
>I am about to migrate to Postgres from MySQL. My DB isn't enormous (<
>1gb), consists mostly of just text, but is accessed quite heavily.
>Because size isn't a huge issue, but performance is, I am willing to
>normalize as necessary.
>
>Currently I have a table "Entries" containing 500k rows. The table
>contains many text columns, and a few others:
>EntryID (unique, indexed)
>UserID (references "Users" table, indexed)
>Private (boolean. indexed)
>
>Most of my queries return rows based on UserID, and also only if
>Private is FALSE. Would it be in the interest of best performance to
>split this table into two tables: "EntriesPrivate",
>"EntriesNotPrivate" and remove the "Private" column?
>
>
Perhaps. You might also consider creating a multi-column index on
(UserID, Private).
However, in a more conceptual idea, separating the tables may help you
with preventing accidental queries. It's pretty easy to forget to add
"... AND Private = False". It is much harder to accidentally add "...
JOIN EntriesPrivate ON ..."

>I appreciate any feedback. I'm certainly not a DB design expert. :)
>
>
>
It shouldn't be very hard to test which one works better for you:

\timing
CREATE INDEX entries_user_private_idx ON Entries(UserID, Private);

SELECT * FROM Entries WHERE ... AND Private = False;

CREATE TABLE EntriesPrivate AS SELECT * FROM Entries WHERE Private=True;
CREATE TABLE EntriesPublic AS SELECT * FROM Entries WHERE Private=False;
ALTER TABLE EntriesPrivate DROP COLUMN Private;
ALTER TABLE EntriesPrivate ADD PRIMARY KEY (EntriesID);
ALTER TABLE EntriesPrivate ALTER COLUMN SET
DEFAULT=nextval('Entries_...EntryId');
-- Make sure you don't have duplicate entries. This could also be done
with a foreign key to some
-- other entries table
ALTER TABLE EntriesPrivate ADD CONSTRAINT EntriesID NOT in (SELECT
EntriesId FROM EntriesPublic);
CREATE INDEX entriesprivate_userid_idx ON EntriesPrivate(UserID);

-- Do the same thing for EntriesPublic
ALTER TABLE EntriesPublic DROP COLUMN Private;

These queries have not been tested, but they should give you a decent
starting point to creating 2 tables, and running a bunch of test queries
on them.
I think the biggest difficulty is making sure that you don't get
duplicate EntriesID values, assuming that is important to you.
Also, if you have foreign key references, this won't work. You'll have
to create a new table (it can have just 1 column) containing EntriesID,
and then you can reference that column from both of these tables.

John
=:->

>Thanks,
>Alex
>
>---------------------------(end of broadcast)---------------------------
>TIP 8: explain analyze is your friend
>
>

Attachment

signature.asc