Thread: Index location patch for review

Index location patch for review

From
"Jim Buttafuoco"
Date:
Hi all,

Attached is a patch that adds support for specifying a location  for
indexes via the "create database" command.

I believe this patch is complete, but it is my first .


Thanks
Jim



Attachment

Re: Index location patch for review

From
Bruce Momjian
Date:
> Hi all,
> 
> Attached is a patch that adds support for specifying a location  for
> indexes via the "create database" command.
> 
> I believe this patch is complete, but it is my first .

This patch allows index locations to be specified as different from data
locations.  Is this a feature direction we want to go in?  Comments?

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
853-3000+  If your life is a hard drive,     |  830 Blythe Avenue +  Christ can be your backup.        |  Drexel Hill,
Pennsylvania19026
 


Re: Index location patch for review

From
Thomas Lockhart
Date:
> > Attached is a patch that adds support for specifying a location  for
> > indexes via the "create database" command.
> This patch allows index locations to be specified as different from data
> locations.  Is this a feature direction we want to go in?  Comments?

I have not looked at the patch, either implementation or proposed
syntax, but in general we certainly want to head in a direction where we
have full control over placement of storage resources.
                     - Thomas


Re: Index location patch for review

From
Stefan Rindeskar
Date:
I am very new to this mailinglist so I apologize if I start talking early but
I've been working as a sysadmin and that kind of problems for a long while
now and my suggestion is that it is a start but I think that we should aim a
little higher than this and use something more like the Oracle approach
instead. Where they introduce an abstraction layer in the form of a
tablespace. And this tablespace is then referenced from the create table or
create index instead.
eg:
table -> tablespace -> path to physical storage
index -> tablespace -> path to physical storage

Advantages:
Changes can be done to storage whithout need to change create scripts for db,
tables and so on.
Designers can specify in which tablespace tables/indexes should reside based
on usage.
Sysadmins can work with tablespaces and change paths without changing
anything in the database/table/index definitions.

The alternative is symlinks to distribute the load and that is not a pretty
sight dba-wise.

Hope you can bare with me on this, since I think it is an very important
issue.
I'm unfortunately not a fast coder yet (but I'm getting faster :-) ). But I
could start writing a spec if someone is interrested.

Bruce Momjian wrote:

> > Hi all,
> >
> > Attached is a patch that adds support for specifying a location  for
> > indexes via the "create database" command.
> >
> > I believe this patch is complete, but it is my first .
>
> This patch allows index locations to be specified as different from data
> locations.  Is this a feature direction we want to go in?  Comments?
>
> --
>   Bruce Momjian                        |  http://candle.pha.pa.us
>   pgman@candle.pha.pa.us               |  (610) 853-3000
>   +  If your life is a hard drive,     |  830 Blythe Avenue
>   +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026
>
> ---------------------------(end of broadcast)---------------------------
> TIP 6: Have you searched our list archives?
>
> http://www.postgresql.org/search.mpl



Re: Index location patch for review

From
"Mikheev, Vadim"
Date:
> > Attached is a patch that adds support for specifying a
> > location  for indexes via the "create database" command.
> > 
> > I believe this patch is complete, but it is my first .
> 
> This patch allows index locations to be specified as
> different from data locations. Is this a feature direction
> we want to go in?  Comments?

The more general and "standard" way to go are TABLESPACEs.
But probably proposed feature will be compatible with
tablespaces, when we'll got them: we could use new "create
database" syntax to specify default tablespace for indices.
Unfortunately I removed message with patch, can you send it
to me, Bruce?

Vadim


Re: Index location patch for review

From
Tom Lane
Date:
"Mikheev, Vadim" <vmikheev@SECTORBASE.COM> writes:
> The more general and "standard" way to go are TABLESPACEs.
> But probably proposed feature will be compatible with
> tablespaces, when we'll got them:

Will it be?  I'm afraid of creating a backwards-compatibility
problem for ourselves when it comes time to implement tablespaces.

At the very least I'd like to see some information demonstrating
how much benefit there is to this proposed patch, before we
consider whether to adopt it.  If there's a significant performance
benefit to splitting a PG database along the table-vs-index divide,
then it's interesting as a short-term improvement ... but Jim didn't
even make that assertion, let alone provide evidence to back it up.
        regards, tom lane


Re: Index location patch for review

From
Bruce Momjian
Date:
> "Mikheev, Vadim" <vmikheev@SECTORBASE.COM> writes:
> > The more general and "standard" way to go are TABLESPACEs.
> > But probably proposed feature will be compatible with
> > tablespaces, when we'll got them:
> 
> Will it be?  I'm afraid of creating a backwards-compatibility
> problem for ourselves when it comes time to implement tablespaces.
> 
> At the very least I'd like to see some information demonstrating
> how much benefit there is to this proposed patch, before we
> consider whether to adopt it.  If there's a significant performance
> benefit to splitting a PG database along the table-vs-index divide,
> then it's interesting as a short-term improvement ... but Jim didn't
> even make that assertion, let alone provide evidence to back it up.

If that is your only concern, I can tell you for sure that if the
locations are on different drives, there will be a performance benefit. 
It is standard database practice to put indexes on different drives than
data.  In fact, sometimes you want to put two tables that are frequently
joined on separate drives.

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
853-3000+  If your life is a hard drive,     |  830 Blythe Avenue +  Christ can be your backup.        |  Drexel Hill,
Pennsylvania19026
 


Re: Index location patch for review

From
Thomas Lockhart
Date:
...
> At the very least I'd like to see some information demonstrating
> how much benefit there is to this proposed patch, before we
> consider whether to adopt it.  If there's a significant performance
> benefit to splitting a PG database along the table-vs-index divide,
> then it's interesting as a short-term improvement ... but Jim didn't
> even make that assertion, let alone provide evidence to back it up.

Clearly there can be a *storage management* benefit to having control
over what gets put where, so this does not need to be justified strictly
on a performance basis.

For features like this, we will feel free to evolve them or
revolutionize them with further development, so I'm not worried about
the backward compatibility issue for cases like this.

Comments?
                   - Thomas


Re: Index location patch for review

From
Bruce Momjian
Date:
> ...
> > At the very least I'd like to see some information demonstrating
> > how much benefit there is to this proposed patch, before we
> > consider whether to adopt it.  If there's a significant performance
> > benefit to splitting a PG database along the table-vs-index divide,
> > then it's interesting as a short-term improvement ... but Jim didn't
> > even make that assertion, let alone provide evidence to back it up.
> 
> Clearly there can be a *storage management* benefit to having control
> over what gets put where, so this does not need to be justified strictly
> on a performance basis.
> 
> For features like this, we will feel free to evolve them or
> revolutionize them with further development, so I'm not worried about
> the backward compatibility issue for cases like this.
> 
> Comments?

Agreed.

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
853-3000+  If your life is a hard drive,     |  830 Blythe Avenue +  Christ can be your backup.        |  Drexel Hill,
Pennsylvania19026
 


Re: Index location patch for review

From
"Mikheev, Vadim"
Date:
> > The more general and "standard" way to go are TABLESPACEs.
> > But probably proposed feature will be compatible with
> > tablespaces, when we'll got them:
> 
> Will it be?  I'm afraid of creating a backwards-compatibility
> problem for ourselves when it comes time to implement tablespaces.

As I said, INDEX_LOCATION in CREATE DATABASE could mean location
of default tablespace for indices in future and one will be able
to override tablespace for particular index with TABLESPACE
clause in CREATE INDEX command.

> At the very least I'd like to see some information demonstrating
> how much benefit there is to this proposed patch, before we
> consider whether to adopt it.  If there's a significant performance
> benefit to splitting a PG database along the table-vs-index divide,
> then it's interesting as a short-term improvement ... but Jim didn't
> even make that assertion, let alone provide evidence to back it up.

Agreed. He mentioned significant performance difference but it would
be great to see results of pgbench tests with scaling factor of >= 10.
Jim?

Also, after reviewing patch I have to say that it will NOT work
with WAL. Jim, please do not name index' dir as "<TBL_NODE>_index".
Instead, just use different TBL_NODE for indices (different number).
It's not good to put if(reln->rd_rel->relkind == RELKIND_INDEX)
stuff into storage manager - only two numbers (tblnode & relnode)
must be used to identify file, no any other logical information
totally unrelated to storage issues.

Vadim


Re: Index location patch for review

From
"Jim Buttafuoco"
Date:
just change the work tablespace below to location and that is exactly
what this patch is trying to do.  You can think of the LOCATION and
INDEX_LOCATION provided to the create database command as the default
storage locations for these objects.  In the future, I want to enable
the DBA to specify LOCATIONS any object just like Oracle.  I am also
planning on a pg_locations table and "create location" command which
will do what the current initlocation script does and more.


Jim


> 
> I am very new to this mailinglist so I apologize if I start talking
early but
> I've been working as a sysadmin and that kind of problems for a long
while
> now and my suggestion is that it is a start but I think that we should
aim a
> little higher than this and use something more like the Oracle
approach
> instead. Where they introduce an abstraction layer in the form of a
> tablespace. And this tablespace is then referenced from the create
table or
> create index instead.
> eg:
> table -> tablespace -> path to physical storage
> index -> tablespace -> path to physical storage
> 
> Advantages:
> Changes can be done to storage whithout need to change create scripts
for db,
> tables and so on.
> Designers can specify in which tablespace tables/indexes should reside
based
> on usage.
> Sysadmins can work with tablespaces and change paths without changing
> anything in the database/table/index definitions.
> 
> The alternative is symlinks to distribute the load and that is not a
pretty
> sight dba-wise.
> 
> Hope you can bare with me on this, since I think it is an very
important
> issue.
> I'm unfortunately not a fast coder yet (but I'm getting faster :-) ).
But I
> could start writing a spec if someone is interrested.
> 
> Bruce Momjian wrote:
> 
> > > Hi all,
> > >
> > > Attached is a patch that adds support for specifying a location 
for
> > > indexes via the "create database" command.
> > >
> > > I believe this patch is complete, but it is my first .
> >
> > This patch allows index locations to be specified as different from
data
> > locations.  Is this a feature direction we want to go in?  Comments?
> >
> > --
> >   Bruce Momjian                        |  http://candle.pha.pa.us
> >   pgman@candle.pha.pa.us               |  (610) 853-3000
> >   +  If your life is a hard drive,     |  830 Blythe Avenue
> >   +  Christ can be your backup.        |  Drexel Hill, Pennsylvania
19026
> >
> > ---------------------------(end of
broadcast)---------------------------
> > TIP 6: Have you searched our list archives?
> >
> > http://www.postgresql.org/search.mpl
> 
> 
> ---------------------------(end of
broadcast)---------------------------
> TIP 2: you can get off all lists at once with the unregister command
>     (send "unregister YourEmailAddressHere" to
majordomo@postgresql.org)
> 
> 




Re: Index location patch for review

From
"Jim Buttafuoco"
Date:
I agree that groups of objects in separate data storage areas are needed
and that is what I am trying to get to.  Don't you think that Postgresql
with locations/files is the same as Oracle tablespaces.  I don't think
we want to invent our own filesystem (which is what a tablespace really
is...).

Jim



> > > Attached is a patch that adds support for specifying a
> > > location  for indexes via the "create database" command.
> > > 
> > > I believe this patch is complete, but it is my first .
> > 
> > This patch allows index locations to be specified as
> > different from data locations. Is this a feature direction
> > we want to go in?  Comments?
> 
> The more general and "standard" way to go are TABLESPACEs.
> But probably proposed feature will be compatible with
> tablespaces, when we'll got them: we could use new "create
> database" syntax to specify default tablespace for indices.
> Unfortunately I removed message with patch, can you send it
> to me, Bruce?
> 
> Vadim
> 
> 




Re: Index location patch for review

From
"Jim Buttafuoco"
Date:
Vadim,

I don't understand the WAL issue below, can you explain.  The dir name
is the same name as the database with _index added to it.  This is how
the current datpath stuff works.  I really just copied the datpath code
to get this patch to work...

Also I have been running this patch (both 7.1.3 and 7.2devel) against
some of my companies applications.  I have loaded a small database 10G
data and 15G indexes both with and without the patch.  There seems to be
between 5% and 10% performance gain doing most common db commands
(selects, selects with joins and inserts).  The system is a DUAL P3 733
with 3 IDE disks.  One for PGDATA, second for APPDATA and third for
APPIDX.  As you can see I have seperated WAL files, GLOBAL, Application
data and application indexes over 3 disks.  Our production systems have
around 50k queries/day ( not including data loads), so I believe that
when this patch get put into production,  with 20 disks and 10 database
the performance increase should go up.


I should also add, that I have been working on the second part of this
patch, which will allow tables and indexes to be put into LOCATIONS
also.  I am going planning on having a PG_LOCATIONS table and
CREATE|DROP|ALTER location SQL command instead of the initlocation shell
script we currently have. The only thing stopping me now is 7.2 testing
I am planning on doing once the beta begins and problems adding a
location column to the pg_class table with the necessary support code in
heap.c...



Thanks for all the comments (keep them comming)

Jim

> > > The more general and "standard" way to go are TABLESPACEs.
> > > But probably proposed feature will be compatible with
> > > tablespaces, when we'll got them:
> > 
> > Will it be?  I'm afraid of creating a backwards-compatibility
> > problem for ourselves when it comes time to implement tablespaces.
> 
> As I said, INDEX_LOCATION in CREATE DATABASE could mean location
> of default tablespace for indices in future and one will be able
> to override tablespace for particular index with TABLESPACE
> clause in CREATE INDEX command.
> 
> > At the very least I'd like to see some information demonstrating
> > how much benefit there is to this proposed patch, before we
> > consider whether to adopt it.  If there's a significant performance
> > benefit to splitting a PG database along the table-vs-index divide,
> > then it's interesting as a short-term improvement ... but Jim didn't
> > even make that assertion, let alone provide evidence to back it up.
> 
> Agreed. He mentioned significant performance difference but it would
> be great to see results of pgbench tests with scaling factor of >= 10.
> Jim?
> 
> Also, after reviewing patch I have to say that it will NOT work
> with WAL. Jim, please do not name index' dir as "<TBL_NODE>_index".
> Instead, just use different TBL_NODE for indices (different number).
> It's not good to put if(reln->rd_rel->relkind == RELKIND_INDEX)
> stuff into storage manager - only two numbers (tblnode & relnode)
> must be used to identify file, no any other logical information
> totally unrelated to storage issues.
> 
> Vadim
> 
> ---------------------------(end of
broadcast)---------------------------
> TIP 5: Have you checked our extensive FAQ?
> 
> http://www.postgresql.org/users-lounge/docs/faq.html
> 
> 




Re: Index location patch for review

From
"Mikheev, Vadim"
Date:
> I don't understand the WAL issue below, can you explain. The dir name
> is the same name as the database with _index added to it. This is how
> the current datpath stuff works.  I really just copied the datpath
> code to get this patch to work...

At the time of after crash recovery WAL is not able to read relation
description from catalog and so only relfilenode is provided for
storage manager in relation structure (look backend/access/transam/
xlogutils.c:XLogOpenRelation). Well, we could add Index/Table
file type identifier to RmgrData (rmgr.c in the same dir) to set
relkind in relation structure, but I don't see any reason to
do so when we can just use different tblnode number for indices and
name index dirs just like other dirs under 'base' named - ie
only tblnode number is used for dir names, without any additions
unrelated to storage issues.

Vadim


Re: Index location patch for review

From
"Mikheev, Vadim"
Date:
> Also I have been running this patch (both 7.1.3 and 7.2devel) against
> some of my companies applications.  I have loaded a small database 10G

We are not familiar with your applications. It would be better to see
results of test suit available to the community. pgbench is first to
come in mind. Such tests would be more valuable.

Vadim


Re: Index location patch for review

From
"Jim Buttafuoco"
Date:
I could also symlink all index files back to the tblnode directory?



> > I don't understand the WAL issue below, can you explain. The dir
name
> > is the same name as the database with _index added to it. This is
how
> > the current datpath stuff works.  I really just copied the datpath
> > code to get this patch to work...
> 
> At the time of after crash recovery WAL is not able to read relation
> description from catalog and so only relfilenode is provided for
> storage manager in relation structure (look backend/access/transam/
> xlogutils.c:XLogOpenRelation). Well, we could add Index/Table
> file type identifier to RmgrData (rmgr.c in the same dir) to set
> relkind in relation structure, but I don't see any reason to
> do so when we can just use different tblnode number for indices and
> name index dirs just like other dirs under 'base' named - ie
> only tblnode number is used for dir names, without any additions
> unrelated to storage issues.
> 
> Vadim
> 
> ---------------------------(end of
broadcast)---------------------------
> TIP 1: subscribe and unsubscribe commands go to
majordomo@postgresql.org
> 
> 




Re: Index location patch for review

From
"Jim Buttafuoco"
Date:
Here is my pgbench results.  As you can see the I am getting 2X tps with
the 2 directories.  I believe this is a BIG win for Postgresql if we can
figure out the WAL recovery issues.


Can someone other than me apply the patch and verify the pgbench
results.


My hardward setup is a dual processor P3/733 running Redhat 7.1 with 512
megs of memory. The postgresql.conf file is the installed version with
NO changes.

Jim


template1=# create database one_dir with location='PGDATA1';
template1=# create database two_dir with location='PGDATA1'
index_location='PGIDX1';
for X in 1 2 3 4 5 6 7 8 9 10 
dopgbench -i -s 10 one_dir >>one_dir.logpgbench -i -s 10 two_dir >>two_dir.log
done

bash-2.04$ grep 'excluding' one_dir.log
tps = 44.319306(excluding connections establishing)
tps = 34.641020(excluding connections establishing)
tps = 50.516889(excluding connections establishing)
tps = 52.747039(excluding connections establishing)
tps = 16.203821(excluding connections establishing)
tps = 36.902861(excluding connections establishing)
tps = 52.511769(excluding connections establishing)
tps = 53.479882(excluding connections establishing)
tps = 54.599429(excluding connections establishing)
tps = 36.780419(excluding connections establishing)
tps = 48.048279(excluding connections establishing)

bash-2.04$ grep 'excluding' two_dir.log
tps = 58.739049(excluding connections establishing)
tps = 100.259270(excluding connections establishing)
tps = 103.156166(excluding connections establishing)
tps = 110.829358(excluding connections establishing)
tps = 111.929690(excluding connections establishing)
tps = 106.840118(excluding connections establishing)
tps = 101.563159(excluding connections establishing)
tps = 102.877060(excluding connections establishing)
tps = 103.784717(excluding connections establishing)
tps = 53.056309(excluding connections establishing)
tps = 73.842428(excluding connections establishing)


> > Also I have been running this patch (both 7.1.3 and 7.2devel)
against
> > some of my companies applications.  I have loaded a small database
10G
> 
> We are not familiar with your applications. It would be better to see
> results of test suit available to the community. pgbench is first to
> come in mind. Such tests would be more valuable.
> 
> Vadim
> 
> ---------------------------(end of
broadcast)---------------------------
> TIP 4: Don't 'kill -9' the postmaster
> 
> 




Re: Index location patch for review

From
Darren King
Date:
> > Attached is a patch that adds support for specifying a location  for
> > indexes via the "create database" command.
> > 
> > I believe this patch is complete, but it is my first .
> 
> This patch allows index locations to be specified as 
> different from data locations.  Is this a feature direction
> we want to go in?  Comments?

Having the table and index on separate drives can do wonders for i/o
performance. :)

darrenk