improvement suggestions for performance design - Mailing list pgsql-performance

From tfinneid@ifi.uio.no
Subject improvement suggestions for performance design
Date
Msg-id 40362.134.32.140.234.1183637748.squirrel@webmail.uio.no
Whole thread Raw
Responses Re: improvement suggestions for performance design
List pgsql-performance
Hi

I have the following scenario for a database that I need to design, and
would like some hints on what to improve or do differently to achieve the
desired performance goal, disregarding hardware and postgres tuning.

The premise is an attribute database that stores about 100 different
attribute types as attribute values. Every X seconds, Y number of new
attribute values are stored in the database. X is constant and currently
between 6 and 20 seconds, depending on the setup. In the future X could
become as low as 3 seconds. Y can, within the next 5-10 years, become as
high as 200 000.

That means that for example, every 6 seconds 100 000 attributes needs to
be written to the database.

At the same time, somewhere between 5-20 users needs to read parts of
those newly written attributes, maybe in total 30 000 attributes.

This continues for the duration of the field operation, which could be
18hrs a day for 6 weeks. So the total db size is up towards 200 gigs.

Now here is how I suggest doing this:

1- the tables

table attribute_values:
    id         int
    attr_type     int  ( references attribute_types(id) )
    posX        int
    posY        int
    data_type    int
    value        varchar(50)

table attribute_types:
    id        int
    name        varchar(200);



2- function

   a function that receives an array of data and inserts each attribute.
   perhaps one array per attribute data (type, posX, posY, data_type,
   value) so five arrays as in parameters ot the function

3- java client

   the client receives the data from a corba request, and splits it
   into, say 4 equally sized blocks and executes 4 threads that insert
   each block (this seems to be more efficient than just using one
   thread.)

Now I am wondering if this is the most efficient way of doing it?

- I know that I could group the attributes so that each type of attribute
gets its own table with all attributes in one row. But I am not sure if
that is any more efficient than ont attribute per row since I pass
everything to the function as an array.
With the above design a change in attribute types only requires changing
the data in a table instead of having to modify the client, the function
and the tables.

- I am also wondering if writing the client and function in C would create
a more efficient solution.

any comments?

ps, I am currently running postgres 8.1, but could probably use 8.2 if it
is needed for functionality or performance reasons. It will run on a sparc
machine with solaris 10 and perhaps 4-6 processors, as many GB of RAM as
necessary and SCSI disks ( perhaps in raid 0 ).

regards

thomas




pgsql-performance by date:

Previous
From: Dimitri
Date:
Subject: Re: Filesystem Direct I/O and WAL sync option
Next
From: Heikki Linnakangas
Date:
Subject: Re: improvement suggestions for performance design