Home > mailing lists

Thread: Table Partitioning in Postgres:

Table Partitioning in Postgres:

From

"Bodanapu, Sravan"

Date:

11 February 2003, 11:30:55

Hi,

We are trying to migrate a database from Oracle to Postgres which is about 150Gig.

How do you setup and maintain Big tables having around 20-30 million rows ?

Is there a way to setup table partitioning ? How can I improve the Postgres Database performance for such a bid database ?

Please advice.

Thanks,

- Sravan.

Re: Table Partitioning in Postgres:

From

Curt Sampson

Date:

13 February 2003, 08:25:23

On Tue, 11 Feb 2003, Bodanapu, Sravan wrote:

> We are trying to migrate a database from Oracle to Postgres which is about
> 150Gig.
> How do you setup and maintain Big tables having around 20-30 million rows ?
> Is there a way to setup table partitioning ? How can I improve the Postgres
> Database performance for such a bid database ?

I've set up tables with 500 million or more rows just as I would with
any other table. There is no table partitioning per se in postgres, but
you can always modify your application to use separate tables (which I
have also done for some large ones).

As for performance, that is soooo application dependent that you really
probably want to hire a consultant to help you out if you don't have time
to spend studying it yourself.

At the very least, for anything big like this, you'd want to spend
a week or two playing around with your database and application on
postgres before you even think about whether you want to convert or not.

cjs
--
Curt Sampson  <cjs@cynic.net>   +81 90 7737 2974   http://www.netbsd.org
    Don't you know, in this new Dark Age, we're all light.  --XTC

Re: Table Partitioning in Postgres:

From

"Bodanapu, Sravan"

Date:

17 February 2003, 15:42:25

Thanks Curt!!! The data was actually taken out of Oracle database and then dumped into Postgres database
using bulk copy. Most of the tables were very large ( having around 20-30 million rows and around
200-300 columns in each ). In Oracle, these tables were partitioned into chunks to get maximum performance.

1. When a table is created in postgres, it will always create the datafile in /pgdata/base/16975 or 16976 directory.

What does 16975 and 16976 mean ? Is there a way that the datafile(for table/data/index) gets generated
in different directories instead of one. If yes, how ?

2. Is there a way to limit a datafile size ( say 3GB ) ? This is a concept in Ingres that you can span the data
across different files.

3. Please suggest us some tips for setting up a big database to acheive maximum performance ?

Thanks and Regards,

- Sravan.

-----Original Message-----
From: Curt Sampson [mailto:cjs@cynic.net]
Sent: Thursday, February 13, 2003 7:25 AM
To: Bodanapu, Sravan
Cc: PGSQL General (E-mail)
Subject: Re: [GENERAL] Table Partitioning in Postgres:

On Tue, 11 Feb 2003, Bodanapu, Sravan wrote:

> We are trying to migrate a database from Oracle to Postgres which is about
> 150Gig.
> How do you setup and maintain Big tables having around 20-30 million rows ?
> Is there a way to setup table partitioning ? How can I improve the Postgres
> Database performance for such a bid database ?

I've set up tables with 500 million or more rows just as I would with
any other table. There is no table partitioning per se in postgres, but
you can always modify your application to use separate tables (which I
have also done for some large ones).

As for performance, that is soooo application dependent that you really
probably want to hire a consultant to help you out if you don't have time
to spend studying it yourself.

At the very least, for anything big like this, you'd want to spend
a week or two playing around with your database and application on
postgres before you even think about whether you want to convert or not.

cjs
--
Curt Sampson <cjs@cynic.net> +81 90 7737 2974 http://www.netbsd.org
Don't you know, in this new Dark Age, we're all light. --XTC

Re: Table Partitioning in Postgres:

From

Jonathan Bartlett

Date:

17 February 2003, 16:41:19

> 1.    When a table is created in postgres, it will always create the
> datafile in /pgdata/base/16975 or 16976 directory.
>     What does 16975 and 16976 mean ? Is there a way that the
> datafile(for table/data/index) gets generated
>     in different directories instead of one. If yes, how ?

The numbers are the object IDs, I believe.  After a table has been
created, it can be moved anywhere on the hard drive while PostgreSQL is
stopped - just symlink the old location to the new one before starting
back up.

> 2.    Is there a way to limit a datafile size ( say 3GB ) ? This is a
> concept in Ingres that you can span the data
>     across different files.

Not that I know of

> 3.    Please suggest us some tips for setting up a big database to acheive
> maximum performance ?

Choose your disk setup carefully.  Put the transaction logs on a different
RAID set than your data.  Keep indexes on a different RAID set than the
main tables.

You may even be able to do traditional partitioning using rewrite rules,
although I've never tried that.

Jon


>
>
> Thanks and Regards,
>
> - Sravan.
>
>
> -----Original Message-----
> From: Curt Sampson [mailto:cjs@cynic.net]
> Sent: Thursday, February 13, 2003 7:25 AM
> To: Bodanapu, Sravan
> Cc: PGSQL General (E-mail)
> Subject: Re: [GENERAL] Table Partitioning in Postgres:
>
>
> On Tue, 11 Feb 2003, Bodanapu, Sravan wrote:
>
> > We are trying to migrate a database from Oracle to Postgres which is about
> > 150Gig.
> > How do you setup and maintain Big tables having around 20-30 million rows
> ?
> > Is there a way to setup table partitioning ? How can I improve the
> Postgres
> > Database performance for such a bid database ?
>
> I've set up tables with 500 million or more rows just as I would with
> any other table. There is no table partitioning per se in postgres, but
> you can always modify your application to use separate tables (which I
> have also done for some large ones).
>
> As for performance, that is soooo application dependent that you really
> probably want to hire a consultant to help you out if you don't have time
> to spend studying it yourself.
>
> At the very least, for anything big like this, you'd want to spend
> a week or two playing around with your database and application on
> postgres before you even think about whether you want to convert or not.
>
> cjs
> --
> Curt Sampson  <cjs@cynic.net>   +81 90 7737 2974   http://www.netbsd.org
>     Don't you know, in this new Dark Age, we're all light.  --XTC
>

Re: Table Partitioning in Postgres:

From

Curt Sampson

Date:

17 February 2003, 22:23:12

On Mon, 17 Feb 2003, Bodanapu, Sravan wrote:

> Thanks Curt!!! The data was actually taken out of Oracle database and then
> dumped into Postgres database
> using bulk copy. Most of the tables were very large ( having around 20-30
> million rows and around
> 200-300 columns in each ). In Oracle, these tables were partitioned into
> chunks to get maximum performance.

Oh, well that is indeed a reasonably large database, from the sound of
it. (Most people here who say "large" about large databases have these
little toy two-million row things like the little test one I keep on my
laptop. :-))

> 1. When a table is created in postgres, it will always create the
> datafile in /pgdata/base/16975 or 16976 directory. What does 16975 and
> 16976 mean ? Is there a way that the datafile(for table/data/index)
> gets generated in different directories instead of one. If yes, how ?

The 16975 and 16976 are the OIDs (Object IDs) of a couple of databases.
You can install and run the contrib/oid2name program to easily work out
which number corresponds to which database. In each of those directories,
you'll find files with similar numbers; those contain table data and index
data. The oid2name program will also help you work that out.

> 2.    Is there a way to limit a datafile size ( say 3GB ) ? This is a
> concept in Ingres that you can span the data
>     across different files.

In fact, you'll find that postgres limits data file size to 1 GB, and
creates new files for every 1 GB chunk of a table. So you might see a
pair of files So you might see a pair of files called 43561 and 43561.1
which are two parts of that table.

So if you wanted to move things around, you could always shut down
pgsql, move files off to another filesystem, and add symlinks pointing
to them. Note that doing a re-index on an index, or clustering a table,
will end up moving it back to the main directory, usually. The symlink
thing is really just a bit of a kluge that you can use until tablespaces
come.

> 3.    Please suggest us some tips for setting up a big database to acheive
> maximum performance ?

It's really very application dependent. Get a good understanding of
how disks work, how operating systems use disks, and how postgres uses
files, analyze your application to see just what it's likely to be doing
in the way of disk I/O (sizes, locations and proportions of reads and
writes on the physical disk) and then experiment with some tests to see
how you can do it better. I know this is all very vague, but there's no
"generic" advice that will get you anything near the best performance
for your particular application.

About a year ago, after not having touched postgresql much in about
three or four years, I did this for a large application. I'd already
come in with some database knowledge and a fair bit of OS knowledge,
and I found it took me about three weeks of work and experimentation
to figure out how to get good performance out of postgresql for the
application in question.

Note also that you can also learn a lot from studying the documentation
on other database systems, and comparing that with how postgresql works.
I found, for example, _Inside SQL Server 7_ to be very useful for
learning more about database performance in general. (I happened to be
running a large app on that in my last job.)

cjs
--
Curt Sampson  <cjs@cynic.net>   +81 90 7737 2974   http://www.netbsd.org
    Don't you know, in this new Dark Age, we're all light.  --XTC

Re: Table Partitioning in Postgres:

From

"Shridhar Daithankar"

Date:

18 February 2003, 00:33:26

On Tuesday 18 Feb 2003 2:12 am, you wrote:
> 3.    Please suggest us some tips for setting up a big database to acheive
> maximum performance ?

well, typically postgresql tries to do things that nobody else in the supply
chain will do. If an OS would take care of mirroring of drives, then
postgresql won't do it for example.

Large tables/database are limited by IO performance. So you need to have
maximum IO throughput available for postgresql to perform best.

I suggest you look at RAID configurations. SCSI if better. If not, then even
software RAID on some OS's like linux should be a good enough option. Of
course, put two disk on two channels but that's about it.

Basically, postgresql won't do anything out of box to support disk
partitioning because that's is an OS's job. symlinking is a poor solution
because say if you recreate an index, postgresql will drop the symlink and
recreate a file instead.

HTH

 Shridhar

Re: Table Partitioning in Postgres:

From

Jean-Luc Lachance

Date:

18 February 2003, 13:54:27

Shridhar,

I must disagree with "that's is an OS's job".
OSs try to be generic.  With databases, we know more about the data
structure.

If a large table could be split (partitioned) based on specific key, we
could expect huge improvements for agregates queries for example when
that key is involved.

Also, DBA must be able to place table on different file systems. They
know more about the application than the OS does.

Even Postgresql has to be told to perform vaccum and analyze.
If the OS had enough intelligence we could trust it to do a good job,
but until then ...

JLL

"Shridhar Daithankar" wrote:
>
> On Tuesday 18 Feb 2003 2:12 am, you wrote:
> > 3.    Please suggest us some tips for setting up a big database to acheive
> > maximum performance ?
>
> well, typically postgresql tries to do things that nobody else in the supply
> chain will do. If an OS would take care of mirroring of drives, then
> postgresql won't do it for example.
>
> Large tables/database are limited by IO performance. So you need to have
> maximum IO throughput available for postgresql to perform best.
>
> I suggest you look at RAID configurations. SCSI if better. If not, then even
> software RAID on some OS's like linux should be a good enough option. Of
> course, put two disk on two channels but that's about it.
>
> Basically, postgresql won't do anything out of box to support disk
> partitioning because that's is an OS's job. symlinking is a poor solution
> because say if you recreate an index, postgresql will drop the symlink and
> recreate a file instead.
>
> HTH
>
>  Shridhar
>
> ---------------------------(end of broadcast)---------------------------
> TIP 1: subscribe and unsubscribe commands go to majordomo@postgresql.org

Re: Table Partitioning in Postgres:

From

"Dann Corbit"

Date:

18 February 2003, 14:13:25

> -----Original Message-----
> From: Jean-Luc Lachance [mailto:jllachan@nsd.ca]
> Sent: Tuesday, February 18, 2003 10:54 AM
> To: Shridhar Daithankar<shridhar_daithankar@persistent.co.in>
> Cc: PGSQL General (E-mail)
> Subject: Re: [GENERAL] Table Partitioning in Postgres:
>
>
> Shridhar,
>
> I must disagree with "that's is an OS's job".
> OSs try to be generic.  With databases, we know more about
> the data structure.
>
> If a large table could be split (partitioned) based on
> specific key, we could expect huge improvements for agregates
> queries for example when that key is involved.
>
> Also, DBA must be able to place table on different file
> systems. They know more about the application than the OS does.
>
> Even Postgresql has to be told to perform vaccum and analyze.
> If the OS had enough intelligence we could trust it to do a
> good job, but until then ...

Oracle's Rdb has a nice syntax for index creation that takes into
account separate areas.  This will probably look horrible, but here is
the ASCII representation of the syntax grammar:

CREATE INDEX Subtopic? form

CREATE

  INDEX

    Format


       CREATE - ----------- -> INDEX <index-name> ------------------
                -> UNIQUE -
        ------------------------------------------------------------
        -- ----------------------------------- -> ON <table-name> --
           -> STORED NAME IS <stored-name> ---
        ----------------------------- <-----------------------------
        > (  > <column-name> --
                 --------------
                 - -------------  ----------------------------- - - ) -
                   > ASCENDING -  > SIZE IS <n> ---------------
                   > DESCENDING   > MAPPING VALUES <l> TO <h> -
             ---------------------- , <--------------------------
        ------------------------------ <-------------------------------
         ---------------- - ----------------------- -
         -> type-clause -   -> compression-clause -
        ---------------------------------------------
         ----------------------- -------------------------------------->
         -> index-store-clause -



       type-clause =

       -> TYPE IS  > HASHED  -------------
------------------------------ >
                             > ORDERED ---

                             > SCATTERED -

                   > SORTED  ----------------------------------------- -

                             > RANKED - -----------------------------

                                        > DUPLICATES ARE COMPRESSED -

                           ---------------------------------------------

                           - -----------------------
--------------------
                             > sorted-index-clause -



       sorted-index-clause =

       -- - -> NODE SIZE <number-bytes> ---------- - -->
            -> PERCENT FILL <percentage> ---------
            -> USAGE - -> UPDATE - ---------------
                       -> QUERY --
          ------------------- <---------------------



       compression-clause =

       - -> ENABLE COMPRESSION - ------------------------------ - ->
                                 > ( MINIMUM RUN LENGTH <n> ) -
         -> DISABLE COMPRESSION ---------------------------------



       index-store-clause =

       STORE ----------------------
        ---------------------------
         -> IN <area-name> - ---------------------------------- ------ >
                             -> ( -> threshold-clause -> ) ----
         -> USING -> ( - --> <column-name> - -> ) ----------
                         ------- , <--------
         ---------------------------------------------------
          > IN <area-name> - ------------------------------- -
                             -> ( -> threshold-clause -> ) -
            --------------------------------------------------
            -> WITH LIMIT OF -> ( - --> <literal> - -> ) --- ---
                                    ------ , <-----
          -----------------------<--------------------------
        --------------------------------------------------------
         ------------------------------------------------------------
         -> OTHERWISE IN <area-name>  ------------------------------
                                      > ( -> threshold-clause -> ) -



       threshold-clause =

       -- -> THRESHOLD - -> IS - -> ( --> <val1> --> ) --------- ->
                         -> OF -

          -> THRESHOLDS - -> ARE - -----------
                          -> OF  -
            ----------------------------------
            -> ( --> <val1> - --------------------------- -> ) -
                              -> , <val2> - -------------
                                            -> , <val3> -



CREATE INDEX Subtopic?
[snip]

Re: Table Partitioning in Postgres:

From

Jean-Luc Lachance

Date:

18 February 2003, 14:42:00

How about:

PARTITION table ON atribute;

for now?


Dann Corbit wrote:
>
> > -----Original Message-----
> > From: Jean-Luc Lachance [mailto:jllachan@nsd.ca]
> > Sent: Tuesday, February 18, 2003 10:54 AM
> > To: Shridhar Daithankar<shridhar_daithankar@persistent.co.in>
> > Cc: PGSQL General (E-mail)
> > Subject: Re: [GENERAL] Table Partitioning in Postgres:
> >
> >
> > Shridhar,
> >
> > I must disagree with "that's is an OS's job".
> > OSs try to be generic.  With databases, we know more about
> > the data structure.
> >
> > If a large table could be split (partitioned) based on
> > specific key, we could expect huge improvements for agregates
> > queries for example when that key is involved.
> >
> > Also, DBA must be able to place table on different file
> > systems. They know more about the application than the OS does.
> >
> > Even Postgresql has to be told to perform vaccum and analyze.
> > If the OS had enough intelligence we could trust it to do a
> > good job, but until then ...
>
> Oracle's Rdb has a nice syntax for index creation that takes into
> account separate areas.  This will probably look horrible, but here is
> the ASCII representation of the syntax grammar:
>
> CREATE INDEX Subtopic? form
>
> CREATE
>
>   INDEX
>
>     Format
>
>        CREATE - ----------- -> INDEX <index-name> ------------------
>                 -> UNIQUE -
>         ------------------------------------------------------------
>         -- ----------------------------------- -> ON <table-name> --
>            -> STORED NAME IS <stored-name> ---
>         ----------------------------- <-----------------------------
>         > (  > <column-name> --
>                  --------------
>                  - -------------  ----------------------------- - - ) -
>                    > ASCENDING -  > SIZE IS <n> ---------------
>                    > DESCENDING   > MAPPING VALUES <l> TO <h> -
>              ---------------------- , <--------------------------
>         ------------------------------ <-------------------------------
>          ---------------- - ----------------------- -
>          -> type-clause -   -> compression-clause -
>         ---------------------------------------------
>          ----------------------- -------------------------------------->
>          -> index-store-clause -
>
>        type-clause =
>
>        -> TYPE IS  > HASHED  -------------
> ------------------------------ >
>                              > ORDERED ---
>
>                              > SCATTERED -
>
>                    > SORTED  ----------------------------------------- -
>
>                              > RANKED - -----------------------------
>
>                                         > DUPLICATES ARE COMPRESSED -
>
>                            ---------------------------------------------
>
>                            - -----------------------
> --------------------
>                              > sorted-index-clause -
>
>        sorted-index-clause =
>
>        -- - -> NODE SIZE <number-bytes> ---------- - -->
>             -> PERCENT FILL <percentage> ---------
>             -> USAGE - -> UPDATE - ---------------
>                        -> QUERY --
>           ------------------- <---------------------
>
>        compression-clause =
>
>        - -> ENABLE COMPRESSION - ------------------------------ - ->
>                                  > ( MINIMUM RUN LENGTH <n> ) -
>          -> DISABLE COMPRESSION ---------------------------------
>
>        index-store-clause =
>
>        STORE ----------------------
>         ---------------------------
>          -> IN <area-name> - ---------------------------------- ------ >
>                              -> ( -> threshold-clause -> ) ----
>          -> USING -> ( - --> <column-name> - -> ) ----------
>                          ------- , <--------
>          ---------------------------------------------------
>           > IN <area-name> - ------------------------------- -
>                              -> ( -> threshold-clause -> ) -
>             --------------------------------------------------
>             -> WITH LIMIT OF -> ( - --> <literal> - -> ) --- ---
>                                     ------ , <-----
>           -----------------------<--------------------------
>         --------------------------------------------------------
>          ------------------------------------------------------------
>          -> OTHERWISE IN <area-name>  ------------------------------
>                                       > ( -> threshold-clause -> ) -
>
>        threshold-clause =
>
>        -- -> THRESHOLD - -> IS - -> ( --> <val1> --> ) --------- ->
>                          -> OF -
>
>           -> THRESHOLDS - -> ARE - -----------
>                           -> OF  -
>             ----------------------------------
>             -> ( --> <val1> - --------------------------- -> ) -
>                               -> , <val2> - -------------
>                                             -> , <val3> -
>
> CREATE INDEX Subtopic?
> [snip]

Re: Table Partitioning in Postgres:

From

"Shridhar Daithankar"

Date:

19 February 2003, 03:43:36

On Wednesday 19 Feb 2003 12:24 am, you wrote:
> Shridhar,
>
> I must disagree with "that's is an OS's job".
> OSs try to be generic.  With databases, we know more about the data
> structure.
>
> If a large table could be split (partitioned) based on specific key, we
> could expect huge improvements for agregates queries for example when
> that key is involved.

Well, you can create a base table and inherit four or five of them. But that
does not solve the problem really, because postgresql as such does not
support different disk storage area that granularly.

> Also, DBA must be able to place table on different file systems. They
> know more about the application than the OS does.

Certainly. But the advantage will not be visible unless you put it on a disk
that is on separate IDE or SCSI channel. Now that you have a large database,
are you using more than one SCSI channel? Otherwise just putting on different
disks will not help as much.

> Even Postgresql has to be told to perform vaccum and analyze.
> If the OS had enough intelligence we could trust it to do a good job,
> but until then ...

Partially true. Postgresql could have done vacuum at runtime at the cost od
performance. So developers delegated the task to admin.

Looking for a solution in problem, the real benefits won't be visible unless
you put it on a different disk channel. Otherwise RAID is your best bait now
as OS can handle it intelligently and it enhances the IO bandwidth immensely.

Other than that you can not do much with postgresql right now.

 Shridhar

Re: Table Partitioning in Postgres:

From

Jonathan Bartlett

Date:

19 February 2003, 07:02:19

> Certainly. But the advantage will not be visible unless you put it on a disk
> that is on separate IDE or SCSI channel. Now that you have a large database,
> are you using more than one SCSI channel? Otherwise just putting on different
> disks will not help as much.

This is quite untrue.  Even going over a single channel, it is highly
unlikely that a single disk or RAID set is going to saturate the channel,
because of disk seek times.  The purpose of putting tables on different
disks than the indexes is so that the disks don't have to keep seeking
back-and-forth between table, index, table, index, table, index, etc.
Likewise with transaction logs, if they have their own disk, the
read/write heads can pretty much stay in the same place with transaaction
log updates.  Using these benefits you are more likely to make full use of
a single channel than otherwise.

Jon


>
>
> > Even Postgresql has to be told to perform vaccum and analyze.
> > If the OS had enough intelligence we could trust it to do a good job,
> > but until then ...
>
> Partially true. Postgresql could have done vacuum at runtime at the cost od
> performance. So developers delegated the task to admin.
>
> Looking for a solution in problem, the real benefits won't be visible unless
> you put it on a different disk channel. Otherwise RAID is your best bait now
> as OS can handle it intelligently and it enhances the IO bandwidth immensely.
>
> Other than that you can not do much with postgresql right now.
>
>  Shridhar
>
> ---------------------------(end of broadcast)---------------------------
> TIP 1: subscribe and unsubscribe commands go to majordomo@postgresql.org
>

Re: Table Partitioning in Postgres:

From

Greg Copeland

Date:

19 February 2003, 11:11:42

On Wed, 2003-02-19 at 05:43, Jonathan Bartlett wrote:
> > Certainly. But the advantage will not be visible unless you put it on a disk
> > that is on separate IDE or SCSI channel. Now that you have a large database,
> > are you using more than one SCSI channel? Otherwise just putting on different
> > disks will not help as much.
>
> This is quite untrue.  Even going over a single channel, it is highly
> unlikely that a single disk or RAID set is going to saturate the channel,
> because of disk seek times.  The purpose of putting tables on different
> disks than the indexes is so that the disks don't have to keep seeking
> back-and-forth between table, index, table, index, table, index, etc.
> Likewise with transaction logs, if they have their own disk, the
> read/write heads can pretty much stay in the same place with transaaction
> log updates.  Using these benefits you are more likely to make full use of
> a single channel than otherwise.
>
> Jon
>

While your statement is correct I did want to clarify that IDE and SCSI
were lumped together and they should not be.  SCSI and IDE performance
expectations differ because their bus technologies are dramatically
different.  IDE has some serious performance issues with multiple disks
per channel.  A single disk can effectively tie up an IDE channel for
the duration of an outstanding I/O operation, unlike what you would
expect with SCSI.  As such, it is highly recommended to have as many
channels as disks where you expect concurrent disk I/O on said disks.

In the case of IDE, you should see significant boosts in performance by
following this tip.  Even when sharing a bus with something as simple as
a CDROM, it can significantly and negatively impact IDE bus performance;
reflective of both throughput and latency.  As such, I highly recommend
that you use a channel per disk where you expect high rates of
concurrent disk I/O.  As a rule of thumb, you should use a channel per
disk with IDE where throughput and latency are of concern.

Regards,

    Greg

>
> >
> >
> > > Even Postgresql has to be told to perform vaccum and analyze.
> > > If the OS had enough intelligence we could trust it to do a good job,
> > > but until then ...
> >
> > Partially true. Postgresql could have done vacuum at runtime at the cost od
> > performance. So developers delegated the task to admin.
> >
> > Looking for a solution in problem, the real benefits won't be visible unless
> > you put it on a different disk channel. Otherwise RAID is your best bait now
> > as OS can handle it intelligently and it enhances the IO bandwidth immensely.
> >
> > Other than that you can not do much with postgresql right now.
> >
> >  Shridhar
> >
> > ---------------------------(end of broadcast)---------------------------
> > TIP 1: subscribe and unsubscribe commands go to majordomo@postgresql.org
> >
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 5: Have you checked our extensive FAQ?
>
> http://www.postgresql.org/users-lounge/docs/faq.html
--
Greg Copeland <greg@copelandconsulting.net>
Copeland Computer Consulting

Re: Table Partitioning in Postgres:

From

Jonathan Bartlett

Date:

19 February 2003, 11:31:50

> While your statement is correct I did want to clarify that IDE and SCSI
> were lumped together and they should not be.  SCSI and IDE performance
> expectations differ because their bus technologies are dramatically
> different.  IDE has some serious performance issues with multiple disks
> per channel.  A single disk can effectively tie up an IDE channel for

Actually, IDE has performance issues even with only 1 disk per channel.
The SCSI command set allows disconnected operations, so that the kernel
can send commands (get block xxx, get block yyy, get block zzz) and keep
sending commands while the drive processes the answers.  With IDE, it is a
synchronous transmission - get blox xxx, wait until disk processes,
receive block xxx, get block yyy, wait until disk processes, receive block
yyy.  SCSI disks can also reorder the requests and service them based on
how quickly it can get to each one.

Even on one-channel-per-disk, SCSI wins out.

Jon


> the duration of an outstanding I/O operation, unlike what you would
> expect with SCSI.  As such, it is highly recommended to have as many
> channels as disks where you expect concurrent disk I/O on said disks.
>
> In the case of IDE, you should see significant boosts in performance by
> following this tip.  Even when sharing a bus with something as simple as
> a CDROM, it can significantly and negatively impact IDE bus performance;
> reflective of both throughput and latency.  As such, I highly recommend
> that you use a channel per disk where you expect high rates of
> concurrent disk I/O.  As a rule of thumb, you should use a channel per
> disk with IDE where throughput and latency are of concern.
>
>
> Regards,
>
>     Greg
>
>
> >
> > >
> > >
> > > > Even Postgresql has to be told to perform vaccum and analyze.
> > > > If the OS had enough intelligence we could trust it to do a good job,
> > > > but until then ...
> > >
> > > Partially true. Postgresql could have done vacuum at runtime at the cost od
> > > performance. So developers delegated the task to admin.
> > >
> > > Looking for a solution in problem, the real benefits won't be visible unless
> > > you put it on a different disk channel. Otherwise RAID is your best bait now
> > > as OS can handle it intelligently and it enhances the IO bandwidth immensely.
> > >
> > > Other than that you can not do much with postgresql right now.
> > >
> > >  Shridhar
> > >
> > > ---------------------------(end of broadcast)---------------------------
> > > TIP 1: subscribe and unsubscribe commands go to majordomo@postgresql.org
> > >
> >
> >
> > ---------------------------(end of broadcast)---------------------------
> > TIP 5: Have you checked our extensive FAQ?
> >
> > http://www.postgresql.org/users-lounge/docs/faq.html
> --
> Greg Copeland <greg@copelandconsulting.net>
> Copeland Computer Consulting
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 1: subscribe and unsubscribe commands go to majordomo@postgresql.org
>

Re: Table Partitioning in Postgres:

From

Greg Copeland

Date:

19 February 2003, 12:01:38

On Wed, 2003-02-19 at 10:31, Jonathan Bartlett wrote:
> > While your statement is correct I did want to clarify that IDE and SCSI
> > were lumped together and they should not be.  SCSI and IDE performance
> > expectations differ because their bus technologies are dramatically
> > different.  IDE has some serious performance issues with multiple disks
> > per channel.  A single disk can effectively tie up an IDE channel for
>
> Actually, IDE has performance issues even with only 1 disk per channel.
> The SCSI command set allows disconnected operations, so that the kernel
> can send commands (get block xxx, get block yyy, get block zzz) and keep
> sending commands while the drive processes the answers.  With IDE, it is a
> synchronous transmission - get blox xxx, wait until disk processes,
> receive block xxx, get block yyy, wait until disk processes, receive block
> yyy.  SCSI disks can also reorder the requests and service them based on
> how quickly it can get to each one.
>
> Even on one-channel-per-disk, SCSI wins out.
>
> Jon
>

Agreed.  I think we are saying the same thing.  You just happen to go
into more detail.  :P  My point being, if you use IDE, you should be
attempting to use a disk per channel.  BTW, on a different note, IBM
created some IDE drives which allow for command tagging which allows for
multiple outstanding IDE commands, however, I have no idea what the
availability of said drives and drivers are.  I'm actually fairly
ignorant on the exact device details.  You wouldn't happen to have the
skinny of those things would ya?  They still being made?

Your comments really serve to enforce that IDE stinks and stresses that
IDE should not be used where serious database performance is needed.
Needless to say, I think we all already understood that.  ;)

Regards,

--
Greg Copeland <greg@copelandconsulting.net>
Copeland Computer Consulting

Re: Table Partitioning in Postgres:

From

Jonathan Bartlett

Date:

19 February 2003, 12:31:19

> ignorant on the exact device details.  You wouldn't happen to have the
> skinny of those things would ya?  They still being made?

I wish, especially if they are the same price as regular IDE disks and the
Linux kernel supports them!

> Your comments really serve to enforce that IDE stinks and stresses that
> IDE should not be used where serious database performance is needed.
> Needless to say, I think we all already understood that.  ;)

Even more so, it shows the difference between server-clas computer
components and consumer-class computer components.  It's sometimes wearing
on the mind to get the finance guy at my company to understand why a
server with the same "specs" (using the term loosely) as a desktop machine
costs thousands more.  After long discussions extolling the virtues of ECC
RAM, redundant hot-swappable power supplies, SCSI hard disks, RAID-1, and
cooling requirements, I can sometimes convince him that there is a real
reason for the price difference.

Jon

>
> Regards,
>
>
> --
> Greg Copeland <greg@copelandconsulting.net>
> Copeland Computer Consulting
>

Re: Table Partitioning in Postgres:

From

Peter Childs

Date:

19 February 2003, 12:39:37

On Wed, 19 Feb 2003, Jonathan Bartlett wrote:

> > ignorant on the exact device details.  You wouldn't happen to have the
> > skinny of those things would ya?  They still being made?
>
> I wish, especially if they are the same price as regular IDE disks and the
> Linux kernel supports them!
>
> > Your comments really serve to enforce that IDE stinks and stresses that
> > IDE should not be used where serious database performance is needed.
> > Needless to say, I think we all already understood that.  ;)
>
> Even more so, it shows the difference between server-clas computer
> components and consumer-class computer components.  It's sometimes wearing
> on the mind to get the finance guy at my company to understand why a
> server with the same "specs" (using the term loosely) as a desktop machine
> costs thousands more.  After long discussions extolling the virtues of ECC
> RAM, redundant hot-swappable power supplies, SCSI hard disks, RAID-1, and
> cooling requirements, I can sometimes convince him that there is a real
> reason for the price difference.
>

    So what about Serial ATA that new standard, does that improve
things when it finally come into use?

Peter Childs

Re: Table Partitioning in Postgres:

From

Greg Copeland

Date:

19 February 2003, 14:08:21

On Wed, 2003-02-19 at 11:39, Peter Childs wrote:
> On Wed, 19 Feb 2003, Jonathan Bartlett wrote:
>
> > > ignorant on the exact device details.  You wouldn't happen to have the
> > > skinny of those things would ya?  They still being made?
> >
> > I wish, especially if they are the same price as regular IDE disks and the
> > Linux kernel supports them!
> >
> > > Your comments really serve to enforce that IDE stinks and stresses that
> > > IDE should not be used where serious database performance is needed.
> > > Needless to say, I think we all already understood that.  ;)
> >
> > Even more so, it shows the difference between server-clas computer
> > components and consumer-class computer components.  It's sometimes wearing
> > on the mind to get the finance guy at my company to understand why a
> > server with the same "specs" (using the term loosely) as a desktop machine
> > costs thousands more.  After long discussions extolling the virtues of ECC
> > RAM, redundant hot-swappable power supplies, SCSI hard disks, RAID-1, and
> > cooling requirements, I can sometimes convince him that there is a real
> > reason for the price difference.
> >
>
>     So what about Serial ATA that new standard, does that improve
> things when it finally come into use?
>
> Peter Childs

Peter,

While I have read that many expect serial ATA to seriously challenge
SCSI I honestly have no idea where the rhetoric stops and reality
begins.  I'd hazard a guess we'll really not know the whole truth until
samples become widely available from multiple sources including drives,
drivers, and host interfaces.

Regards,

--
Greg Copeland <greg@copelandconsulting.net>
Copeland Computer Consulting

Re: Table Partitioning in Postgres:

From

Jean-Luc Lachance

Date:

19 February 2003, 18:27:39

Shridhar,

I think you missed the most important point.

Regardless of the hardware, having a large table partitioned (not
necessarely split across multiple discs) on a particular keyfield will
improve performance when that keyfield is involved in the query.




"Shridhar Daithankar" wrote:
>
> On Wednesday 19 Feb 2003 12:24 am, you wrote:
> > Shridhar,
> >
> > I must disagree with "that's is an OS's job".
> > OSs try to be generic.  With databases, we know more about the data
> > structure.
> >
> > If a large table could be split (partitioned) based on specific key, we
> > could expect huge improvements for agregates queries for example when
> > that key is involved.
>
> Well, you can create a base table and inherit four or five of them. But that
> does not solve the problem really, because postgresql as such does not
> support different disk storage area that granularly.
>
> > Also, DBA must be able to place table on different file systems. They
> > know more about the application than the OS does.
>
> Certainly. But the advantage will not be visible unless you put it on a disk
> that is on separate IDE or SCSI channel. Now that you have a large database,
> are you using more than one SCSI channel? Otherwise just putting on different
> disks will not help as much.
>
> > Even Postgresql has to be told to perform vaccum and analyze.
> > If the OS had enough intelligence we could trust it to do a good job,
> > but until then ...
>
> Partially true. Postgresql could have done vacuum at runtime at the cost od
> performance. So developers delegated the task to admin.
>
> Looking for a solution in problem, the real benefits won't be visible unless
> you put it on a different disk channel. Otherwise RAID is your best bait now
> as OS can handle it intelligently and it enhances the IO bandwidth immensely.
>
> Other than that you can not do much with postgresql right now.
>
>  Shridhar
>
> ---------------------------(end of broadcast)---------------------------
> TIP 1: subscribe and unsubscribe commands go to majordomo@postgresql.org

Re: Table Partitioning in Postgres:

From

Christopher Browne

Date:

19 February 2003, 23:17:29

In an attempt to throw the authorities off his trail, greg@CopelandConsulting.Net (Greg Copeland) transmitted:
> While I have read that many expect serial ATA to seriously challenge
> SCSI I honestly have no idea where the rhetoric stops and reality
> begins.  I'd hazard a guess we'll really not know the whole truth until
> samples become widely available from multiple sources including drives,
> drivers, and host interfaces.

I'd see Serial ATA being a "serious challenger" if it allowed you to
cheaply build some "embarrassingly-parallel" RAID servers where you
were able to get 16 drives hooked up, each on its own ATA 'bus' for
about the price of 4-way SCSI.

But the intended market is doubtless the super-price-conscious market,
which naturally leads to the overall quality of the results being
pretty compromised.
--
output = ("cbbrowne" "@ntlug.org")
http://www3.sympatico.ca/cbbrowne/sgml.html
Rules of the  Evil Overlord #43. "I will maintain  a healthy amount of
skepticism when  I capture the beautiful  rebel and she  claims she is
attracted  to my  power  and good  looks  and will  gladly betray  her
companions if I just let her in on my plans."
<http://www.eviloverlord.com/>

Re: Table Partitioning in Postgres:

From

"scott.marlowe"

Date:

20 February 2003, 11:15:38

On Wed, 19 Feb 2003, Christopher Browne wrote:

> In an attempt to throw the authorities off his trail, greg@CopelandConsulting.Net (Greg Copeland) transmitted:
> > While I have read that many expect serial ATA to seriously challenge
> > SCSI I honestly have no idea where the rhetoric stops and reality
> > begins.  I'd hazard a guess we'll really not know the whole truth until
> > samples become widely available from multiple sources including drives,
> > drivers, and host interfaces.
>
> I'd see Serial ATA being a "serious challenger" if it allowed you to
> cheaply build some "embarrassingly-parallel" RAID servers where you
> were able to get 16 drives hooked up, each on its own ATA 'bus' for
> about the price of 4-way SCSI.
>
> But the intended market is doubtless the super-price-conscious market,
> which naturally leads to the overall quality of the results being
> pretty compromised.

These guys:

http://www.ide-raid.com/

will probably make use of serial ata, since they're already making some
very nice, very fast IDE based RAID devices, and very dense (12 disk 3U
units.)

Re: Table Partitioning in Postgres: [Viruschecked]

From

"Patric Bechtel"

Date:

20 February 2003, 12:07:37

On Thu, 20 Feb 2003 09:02:06 -0700 (MST), scott.marlowe wrote:

Hi,

I doubt that it will be for "super-price-conscious" only. I recently
saw Western Digital annoucing a 10.000 RPM drive with 36 Gig for
serial ATA, <5.2ms access time, about 55 MB/s, for 30% than the SCSI price.
That's serious. Together with 5 years warranty and explicit statement
about being 24x7x365 capable (which most ATA-drives aren't!).

0.02$...

Patric

>> I'd see Serial ATA being a "serious challenger" if it allowed you to
>> cheaply build some "embarrassingly-parallel" RAID servers where you
>> were able to get 16 drives hooked up, each on its own ATA 'bus' for
>> about the price of 4-way SCSI.
>>
>> But the intended market is doubtless the super-price-conscious market,
>> which naturally leads to the overall quality of the results being
>> pretty compromised.

>These guys:

>http://www.ide-raid.com/

>will probably make use of serial ata, since they're already making some
>very nice, very fast IDE based RAID devices, and very dense (12 disk 3U
>units.)

PGP Public Key Fingerprint: 2636 F26E F523 7D62  4377 D206 7C68 06BB

Re: Table Partitioning in Postgres:

From

Dennis Gearon

Date:

20 February 2003, 13:41:57

Most of the specs I've seen haven't shown IDE raid to be that much faster.


2/20/2003 8:02:06 AM, "scott.marlowe" <scott.marlowe@ihs.com> wrote:

>On Wed, 19 Feb 2003, Christopher Browne wrote:
>
>> In an attempt to throw the authorities off his trail, greg@CopelandConsulting.Net (Greg
Copeland) transmitted:
>> > While I have read that many expect serial ATA to seriously challenge
>> > SCSI I honestly have no idea where the rhetoric stops and reality
>> > begins.  I'd hazard a guess we'll really not know the whole truth until
>> > samples become widely available from multiple sources including drives,
>> > drivers, and host interfaces.
>>
>> I'd see Serial ATA being a "serious challenger" if it allowed you to
>> cheaply build some "embarrassingly-parallel" RAID servers where you
>> were able to get 16 drives hooked up, each on its own ATA 'bus' for
>> about the price of 4-way SCSI.
>>
>> But the intended market is doubtless the super-price-conscious market,
>> which naturally leads to the overall quality of the results being
>> pretty compromised.
>
>These guys:
>
>http://www.ide-raid.com/
>
>will probably make use of serial ata, since they're already making some
>very nice, very fast IDE based RAID devices, and very dense (12 disk 3U
>units.)
>
>
>---------------------------(end of broadcast)---------------------------
>TIP 5: Have you checked our extensive FAQ?
>
>http://www.postgresql.org/users-lounge/docs/faq.html
>

Re: Table Partitioning in Postgres:

From

Martijn van Oosterhout

Date:

20 February 2003, 15:21:57

On Wed, Feb 19, 2003 at 11:00:58AM -0600, Greg Copeland wrote:
> On Wed, 2003-02-19 at 10:31, Jonathan Bartlett wrote:
> > Actually, IDE has performance issues even with only 1 disk per channel.
> > The SCSI command set allows disconnected operations, so that the kernel
> > can send commands (get block xxx, get block yyy, get block zzz) and keep
> > sending commands while the drive processes the answers.  With IDE, it is a
> > synchronous transmission - get blox xxx, wait until disk processes,
> > receive block xxx, get block yyy, wait until disk processes, receive block
> > yyy.  SCSI disks can also reorder the requests and service them based on
> > how quickly it can get to each one.
>
> Agreed.  I think we are saying the same thing.  You just happen to go
> into more detail.  :P  My point being, if you use IDE, you should be
> attempting to use a disk per channel.  BTW, on a different note, IBM
> created some IDE drives which allow for command tagging which allows for
> multiple outstanding IDE commands, however, I have no idea what the
> availability of said drives and drivers are.  I'm actually fairly
> ignorant on the exact device details.  You wouldn't happen to have the
> skinny of those things would ya?  They still being made?

For reference it's called Tagged Command Queueing (TCQ). It appears to be
supported by the IBM Deskstar drive only so far. There is some support in
Linux 2.5 AFAICS. It doesn't work with all controllers either apparently.
Bit of a mish-mash really.

http://lwn.net/2002/0411/a/ah-tcq.php3
--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> Support bacteria! They're the only culture some people have.

Attachment

msg-32091-55370.dat