Thread: table design strategy

table design strategy

From
"Johnson, Shaunn"
Date:
Howdy:
 
Running Postgres 7.1.3 on RedHat 7.2, kernel 2.4.7  rel. 10.
 
My question is about table design strategies ...
 
I have a imported a table, broken up in such a fashion that
there are now 8 tables on Postgres.  For example: table_north,
table_south, table_east and table_west originally comes from
another source on another database called 'table_direction'.
From that, the tables span years, so, I have tables called
'table_north_2000' and 'table_north_2001', 'table_south_2000'
and table_south_2001' and so on ...
 
Now ... I was thinking that now that I have all 8 parts, I'd like
to:
 
* create indices on the similar names in each table
* create a view that joins all 8 tables into ONE table again
 
I don't want to join all of the tables back into ONE table
because every month or so, I'll have to update  and import
a fresh copy of the 2001 year tables ... and because they are
pretty big, pulling it all at once is totally not an option. Of course,
I'll have to start pulling 2002 data soon ...
 
I'm looking for creative suggestions.  Otherwise, I think I'll have to
go the route I stated above ... it just seems like it's going to be
painful ...
 
Thanks!
 
PS: Has anyone had a chance to test a Data Model / database
structure modeling tool (for creating pretty pictures and relational
info / documentation)?
 
-X
-----Original Message-----
From: Johnson, Shaunn
Sent: Monday, March 25, 2002 4:15 PM
To: pgsql-general@postgresql.org
Subject: Re: [GENERAL] file size issue?

--I think you've answered at least 1/2 of my question,
Andrew.
 
--I'd like to figure out if Postgres reaches a point where
it will no longer index or vacuum a table based on its size (your answer
tells me 'No' -  it will continue until it is done, splitting each
table on 1Gig increments).
 
--And if THAT is true, then why am I getting failures when
I'm vacuuming or indexing a table just after reaching 2 Gig?
 
--And if it's an OS (or any other) problem, how can I factor
out Postgres?
 
--Thanks!
 
-X
 
 
[snip]
 
 
> Has anyone seen if it is a problem with the OS or with the way
> Postgres handles large files (or, if I should compile it again
> with some new options).
 
 
What do you mean "postgres handles large files"? The filesize
problem isn't related to the size of your table, because postgres
splits files at 1 Gig.
If it's an output problem, you could see something, but you said you
were vacuuming.
 
 
A

[snip]

Re: table design strategy

From
"Johnson, Shaunn"
Date:

-- Thanks for responding:

-- Regarding the trigger suggestion, I'm not sure
-- I follow  (translation: I wouldn't know a trigger from the gun).

-- I'm looking at my PostgreSQL book and looking that
-- feature up ...

-- While I do, what about this:

-- * Append all of the 2000 and 2001 tables into one table
-- * Make index of columns in the one table
-- * For updates and new imports, create a
-- method to only append new data / updates
-- (I imagine this is where the trigger function comes
-- in, right?) when I have to pull data (the source tables
-- will be much smaller ... hopefully ...

-- I have quite a lot to research ... any suggestions
-- will be appreciated.

-- Thanks again!

-X

-----Original Message-----
From: Andrew Sullivan [mailto:andrew@libertyrms.info]

> Howdy:

> Running Postgres 7.1.3 on RedHat 7.2, kernel 2.4.7  rel. 10.

> My question is about table design strategies ...

> I have a imported a table, broken up in such a fashion that
> there are now 8 tables on Postgres.  For example: table_north,
> table_south, table_east and table_west originally comes from
> another source on another database called 'table_direction'.
> >From that, the tables span years, so, I have tables called
> 'table_north_2000' and 'table_north_2001', 'table_south_2000'
> and table_south_2001' and so on ...

> Now ... I was thinking that now that I have all 8 parts, I'd like
> to:

> * create indices on the similar names in each table
> * create a view that joins all 8 tables into ONE table again

That view is going to be _painful_ in performance.  What if you wrote
a trigger that inserts/deletes/updates in table_all when any of the
other tables are touched?  It's a lot of disk, but disk is cheap.

> PS: Has anyone had a chance to test a Data Model / database
> structure modeling tool (for creating pretty pictures and relational
> info / documentation)?

DbVisualiser turns out to be much better than I expected.  You can
also get a program called postgres_autodoc.pl if you just want to
create pretty UML diagrams.  The latter requires dia, the former Java
2.  (I don't have references handy, but a google search should get
you there.)

A

Re: table design strategy

From
Andrew Sullivan
Date:
On Thu, Mar 28, 2002 at 12:28:40PM -0500, Johnson, Shaunn wrote:
> -- While I do, what about this:
>
> -- * Append all of the 2000 and 2001 tables into one table
> -- * Make index of columns in the one table
> -- * For updates and new imports, create a
> -- method to only append new data / updates
> -- (I imagine this is where the trigger function comes
> -- in, right?) when I have to pull data (the source tables
> -- will be much smaller ... hopefully ...

Right, the trigger will only fire on insert/update/delete, so you'd
have to populate the big table manually.  But after that, with a
trigger on all the source tables, any change you make there will also
be reflected in table_all.

Note that doing it this way is a very un-SQL-ish thing to do, in that
you'll have redundant data all over the place.  Still, given what you
already have, doing it this way might be a quick and dirty answer for
looking at the big table.  Of course, if you hardly ever reference
the big table, maybe it isn't worth the storage cost.

A
--
----
Andrew Sullivan                               87 Mowat Avenue
Liberty RMS                           Toronto, Ontario Canada
<andrew@libertyrms.info>                              M6K 3E3
                                         +1 416 646 3304 x110