Thread: Multiple Indexing, performance impact

Multiple Indexing, performance impact

From
Daniel Åkerud
Date:
I just made i simple little test application inserting 50'000 'pwgen 8' data into a table with only a primary key id and a text column.
 
In every run, it is all deleted and the tables are vacuumed.
 
Having one separate index on name it took 36 seconds
Having an additional index, also on name, it took 69 seconds.
Furthermore:
3 indexes: 97 seconds
4 indexes: 131 seconds
5 indexes: 163 seconds
6 indexes: 210 seconds
7 indexes: 319 seconds
8 indexes: 572 seconds
9 indexes: 831 seconds
10 indexes: 1219 seconds
 
Anyone know what causes the signifacant performance decrease after 7 indexes?
 
Daniel Åkerud
 
 

Re: Multiple Indexing, performance impact

From
Tom Lane
Date:
=?iso-8859-1?Q?Daniel_=C5kerud?= <zilch@home.se> writes:
> Anyone know what causes the signifacant performance decrease after 7 indexe=
> s?

I'd bet that somewhere around there, you are starting to see thrashing
of the buffer pool due to needing to touch too many different pages to
insert each tuple.  What is your -B setting?  If you increase it,
does the performance improve?

            regards, tom lane

Re: Multiple Indexing, performance impact

From
Daniel Åkerud
Date:
I did a ps ax | postmaster but found no -B, and concluded that it uses the
value specified in /etc/postgrelsql/postgresql.conf on shared_buffers (I
saw -B was shared buffer doing a man postmaster). I'll change this to 256
and rerun the test!

Will post the results here later. Please tell if this was a too puny
increase!

Daniel Åkerud

> =?iso-8859-1?Q?Daniel_=C5kerud?= <zilch@home.se> writes:
> > Anyone know what causes the signifacant performance decrease after 7
indexe=
> > s?
>
> I'd bet that somewhere around there, you are starting to see thrashing
> of the buffer pool due to needing to touch too many different pages to
> insert each tuple.  What is your -B setting?  If you increase it,
> does the performance improve?
>
> regards, tom lane
>


Re: Multiple Indexing, performance impact

From
Tom Lane
Date:
=?iso-8859-1?Q?Daniel_=C5kerud?= <zilch@home.se> writes:
> I did a ps ax | postmaster but found no -B, and concluded that it uses the
> value specified in /etc/postgrelsql/postgresql.conf on shared_buffers (I
> saw -B was shared buffer doing a man postmaster). I'll change this to 256
> and rerun the test!

> Will post the results here later. Please tell if this was a too puny
> increase!

That should be enough to see if there's a performance change, but for
future reference, yes you should go higher.  On modern machines with
many megs of RAM, you should probably be using -B on the order of a few
thousand, at least for production installations.  The reason the default
is so low is that we hope the system will still be able to fire up on
machines where the kernel enforces a SHMMAX limit of only a meg or so.
This hope is possibly in vain anymore anyway, since the system's
non-buffer shared-memory usage keeps creeping up; I think 7.1 is well
past 1MB shmem even with 64 buffers...

            regards, tom lane

Re: Multiple Indexing, performance impact

From
Daniel Åkerud
Date:
Holy ultra-violet-active macaronies :)

First I changed it to 256, then I changed it to 1024.

-B 128 is A
-B 256 is B
-B 1024 is C

New multiple-index performance data):

1.    A: 36    B: 32    C: 35
2.    A: 69    B: 53    C: 38
3.    A: 97    B: 79    C: 40
4.    A: 131    B: 98    C: 48
5.    A: 163    B: 124    C: 52
6.    A: 210    B: 146    C: 66
7.    A: 319    B: 233    C: 149
8.    A: 572    B: 438    C: 268
9.    A: 831    B: 655    C:
10.    A: 1219    B: 896    C:

The last test hasn't finished yet, but THANKS! I know the reson now, at
least... i'll try
2048 also.

-B equals --brutal-performance ? ;)

Thanks,
Daniel Åkerud

> That should be enough to see if there's a performance change, but for
> future reference, yes you should go higher.  On modern machines with
> many megs of RAM, you should probably be using -B on the order of a few
> thousand, at least for production installations.  The reason the default
> is so low is that we hope the system will still be able to fire up on
> machines where the kernel enforces a SHMMAX limit of only a meg or so.
> This hope is possibly in vain anymore anyway, since the system's
> non-buffer shared-memory usage keeps creeping up; I think 7.1 is well
> past 1MB shmem even with 64 buffers...
>
> regards, tom lane
>


Re: Multiple Indexing, performance impact

From
Bruce Momjian
Date:
> Holy ultra-violet-active macaronies :)
>
> First I changed it to 256, then I changed it to 1024.
>
> -B 128 is A
> -B 256 is B
> -B 1024 is C
>
> New multiple-index performance data):
>
> 1.    A: 36    B: 32    C: 35
> 2.    A: 69    B: 53    C: 38
> 3.    A: 97    B: 79    C: 40
> 4.    A: 131    B: 98    C: 48
> 5.    A: 163    B: 124    C: 52
> 6.    A: 210    B: 146    C: 66
> 7.    A: 319    B: 233    C: 149
> 8.    A: 572    B: 438    C: 268
> 9.    A: 831    B: 655    C:
> 10.    A: 1219    B: 896    C:
>
> The last test hasn't finished yet, but THANKS! I know the reson now, at
> least... i'll try
> 2048 also.

Strange that even at 1024 performance still drops off at 7.  Seems it
may be more than buffer thrashing.


> -B equals --brutal-performance ? ;)

See my performance article on techdocs.postgresql.org.

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026

Re: Multiple Indexing, performance impact

From
Tom Lane
Date:
Bruce Momjian <pgman@candle.pha.pa.us> writes:
>> This does remind me that I'd been thinking of suggesting that we
>> raise the default -B to something more reasonable, maybe 1000 or so
>> (yielding an 8-meg-plus shared memory area).

> BSD/OS has a 4MB max but we document how to increase it by recompiling
> the kernel.

Hmm.  Anyone like the idea of a platform-specific default established
by configure?  We could set it in the template file on platforms where
the default SHMMAX is too small to allow 1000 buffers.

            regards, tom lane

Re: Multiple Indexing, performance impact

From
Bruce Momjian
Date:
> Bruce Momjian <pgman@candle.pha.pa.us> writes:
> > Strange that even at 1024 performance still drops off at 7.  Seems it
> > may be more than buffer thrashing.
>
> Yeah, if anything the knee in the curve seems to be worse at 1024
> buffers.  Curious.  Deserves more investigation, perhaps.
>
> This does remind me that I'd been thinking of suggesting that we
> raise the default -B to something more reasonable, maybe 1000 or so
> (yielding an 8-meg-plus shared memory area).  This wouldn't prevent
> people from setting it small if they have a small SHMMAX, but it's
> probably time to stop letting that case drive our default setting.
> Since 64 is already too much to let 7.1 fit in SHMMAX = 1MB, I think
> the original rationale for using 64 is looking pretty broken anyway.
> Comments?

BSD/OS has a 4MB max but we document how to increase it by recompiling
the kernel.  Maybe if we fail the startup we can tell them how to
decrease the buffers in postgresql.conf file.  Seems quite clear.

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026

Re: Multiple Indexing, performance impact

From
Tom Lane
Date:
Bruce Momjian <pgman@candle.pha.pa.us> writes:
> Strange that even at 1024 performance still drops off at 7.  Seems it
> may be more than buffer thrashing.

Yeah, if anything the knee in the curve seems to be worse at 1024
buffers.  Curious.  Deserves more investigation, perhaps.

This does remind me that I'd been thinking of suggesting that we
raise the default -B to something more reasonable, maybe 1000 or so
(yielding an 8-meg-plus shared memory area).  This wouldn't prevent
people from setting it small if they have a small SHMMAX, but it's
probably time to stop letting that case drive our default setting.
Since 64 is already too much to let 7.1 fit in SHMMAX = 1MB, I think
the original rationale for using 64 is looking pretty broken anyway.
Comments?

            regards, tom lane

Re: Multiple Indexing, performance impact

From
Bruce Momjian
Date:
> Bruce Momjian <pgman@candle.pha.pa.us> writes:
> >> This does remind me that I'd been thinking of suggesting that we
> >> raise the default -B to something more reasonable, maybe 1000 or so
> >> (yielding an 8-meg-plus shared memory area).
>
> > BSD/OS has a 4MB max but we document how to increase it by recompiling
> > the kernel.
>
> Hmm.  Anyone like the idea of a platform-specific default established
> by configure?  We could set it in the template file on platforms where
> the default SHMMAX is too small to allow 1000 buffers.

Template file seems like a good idea for platforms that can't handle the
default.  I don't think configure should be doing such tests because the
target could be a different kernel.

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026

Re: Multiple Indexing, performance impact

From
Tom Lane
Date:
Bruce Momjian <pgman@candle.pha.pa.us> writes:
>> Hmm.  Anyone like the idea of a platform-specific default established
>> by configure?  We could set it in the template file on platforms where
>> the default SHMMAX is too small to allow 1000 buffers.

> Template file seems like a good idea for platforms that can't handle the
> default.  I don't think configure should be doing such tests because the
> target could be a different kernel.

Right, I wasn't thinking of an actual run-time test in configure, just
that we could use it to let the OS-specific template file override the
normal default.

We could offer a --with switch to manually choose the default, too.

            regards, tom lane

Re: Multiple Indexing, performance impact

From
Bruce Momjian
Date:
> Bruce Momjian <pgman@candle.pha.pa.us> writes:
> >> Hmm.  Anyone like the idea of a platform-specific default established
> >> by configure?  We could set it in the template file on platforms where
> >> the default SHMMAX is too small to allow 1000 buffers.
>
> > Template file seems like a good idea for platforms that can't handle the
> > default.  I don't think configure should be doing such tests because the
> > target could be a different kernel.
>
> Right, I wasn't thinking of an actual run-time test in configure, just
> that we could use it to let the OS-specific template file override the
> normal default.
>
> We could offer a --with switch to manually choose the default, too.

Good idea, yes.  Not sure if we need a --with switch because they can
just edit the postgresql.conf or postgresql.conf.sample file.

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026

Re: Multiple Indexing, performance impact

From
Daniel Åkerud
Date:
Tried with 2048 also, it complete took away the strange steep after 7:

D is now 2048

1.    A: 36    B: 32    C: 35    D: 31
2.    A: 69    B: 53    C: 38    D: 38
3.    A: 97    B: 79    C: 40    D: 40
4.    A: 131    B: 98    C: 48    D: 43
5.    A: 163    B: 124    C: 52    D: 49
6.    A: 210    B: 146    C: 66    D: 50
7.    A: 319    B: 233    C: 149    D: 58
8.    A: 572    B: 438    C: 268    D: 65
9.    A: 831    B: 655    C: 437    D: 76
10.    A: 1219    B: 896    C: 583    D: 79

What is the program called that flushes the buffers every 30 seconds on a
linux 2.2.x system?

Daniel Åkerud

> > Holy ultra-violet-active macaronies :)
> >
> > First I changed it to 256, then I changed it to 1024.
> >
> > -B 128 is A
> > -B 256 is B
> > -B 1024 is C
> >
> > New multiple-index performance data):
> >
> > 1.    A: 36    B: 32    C: 35
> > 2.    A: 69    B: 53    C: 38
> > 3.    A: 97    B: 79    C: 40
> > 4.    A: 131    B: 98    C: 48
> > 5.    A: 163    B: 124    C: 52
> > 6.    A: 210    B: 146    C: 66
> > 7.    A: 319    B: 233    C: 149
> > 8.    A: 572    B: 438    C: 268
> > 9.    A: 831    B: 655    C:
> > 10.    A: 1219    B: 896    C:
> >
> > The last test hasn't finished yet, but THANKS! I know the reson now, at
> > least... i'll try
> > 2048 also.
>
> Strange that even at 1024 performance still drops off at 7.  Seems it
> may be more than buffer thrashing.
>
>
> > -B equals --brutal-performance ? ;)
>
> See my performance article on techdocs.postgresql.org.
>
> --
>   Bruce Momjian                        |  http://candle.pha.pa.us
>   pgman@candle.pha.pa.us               |  (610) 853-3000
>   +  If your life is a hard drive,     |  830 Blythe Avenue
>   +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026
>


Re: Multiple Indexing, performance impact

From
Tom Lane
Date:
Bruce Momjian <pgman@candle.pha.pa.us> writes:
>> We could offer a --with switch to manually choose the default, too.

> Good idea, yes.  Not sure if we need a --with switch because they can
> just edit the postgresql.conf or postgresql.conf.sample file.

Well, we have a --with switch for DEF_MAXBACKENDS, so one for the
default number of buffers doesn't seem too unreasonable.  I wouldn't
bother with it if configure didn't have to touch the value anyway...
but it's just another line or two in configure.in...

            regards, tom lane

Re: Multiple Indexing, performance impact

From
Peter Eisentraut
Date:
Tom Lane writes:

> This does remind me that I'd been thinking of suggesting that we
> raise the default -B to something more reasonable, maybe 1000 or so
> (yielding an 8-meg-plus shared memory area).

On Modern(tm) systems, 8 MB is just as arbitrary and undersized as 1 MB.
So while for real use, manual tuning will still be necessary, on test
systems we'd use significant amounts of memory for nothing, or not start
up at all.

Maybe we could look around what the default limit is these days, but
raising it to arbitrary values will just paint over the fact that user
intervention is still required and that there is almost no documentation
for this.

--
Peter Eisentraut   peter_e@gmx.net   http://funkturm.homeip.net/~peter


Re: Multiple Indexing, performance impact

From
Tom Lane
Date:
Peter Eisentraut <peter_e@gmx.net> writes:
> Tom Lane writes:
>> This does remind me that I'd been thinking of suggesting that we
>> raise the default -B to something more reasonable, maybe 1000 or so
>> (yielding an 8-meg-plus shared memory area).

> On Modern(tm) systems, 8 MB is just as arbitrary and undersized as 1 MB.

A fair complaint, but at least it's within an order of magnitude of
being reasonable; you don't *have* to tune it before you get something
approaching reasonable performance.  64 is two or more orders of
magnitude off.

> So while for real use, manual tuning will still be necessary, on test
> systems we'd use significant amounts of memory for nothing, or not start
> up at all.

The thought of test postmasters was what kept me from proposing
something even higher than 1000.  8Mb is small enough that you can
still expect to run several postmasters without problems, on most
machines where you might contemplate the idea of multiple postmasters
at all.

Would you suggest that we have no default at all, and make users pick
something?

> Maybe we could look around what the default limit is these days, but
> raising it to arbitrary values will just paint over the fact that user
> intervention is still required and that there is almost no documentation
> for this.

We do need to have a section in the administrator's guide about tuning.

            regards, tom lane

Re: [HACKERS] Re: Multiple Indexing, performance impact

From
Bruce Momjian
Date:
> Bruce Momjian <pgman@candle.pha.pa.us> writes:
> >> We could offer a --with switch to manually choose the default, too.
>
> > Good idea, yes.  Not sure if we need a --with switch because they can
> > just edit the postgresql.conf or postgresql.conf.sample file.
>
> Well, we have a --with switch for DEF_MAXBACKENDS, so one for the
> default number of buffers doesn't seem too unreasonable.  I wouldn't
> bother with it if configure didn't have to touch the value anyway...
> but it's just another line or two in configure.in...

Yes, we could add that too, but now that we have postgresql.conf should
we even be mentioning stuff like that in configure.  In the old days we
had a compiled-in limit but that is not true anymore, right?

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026

Re: Multiple Indexing, performance impact

From
Lincoln Yeoh
Date:
At 05:56 PM 22-06-2001 -0400, Bruce Momjian wrote:
>> Bruce Momjian <pgman@candle.pha.pa.us> writes:
>> Since 64 is already too much to let 7.1 fit in SHMMAX = 1MB, I think
>> the original rationale for using 64 is looking pretty broken anyway.
>> Comments?
>
>BSD/OS has a 4MB max but we document how to increase it by recompiling
>the kernel.  Maybe if we fail the startup we can tell them how to
>decrease the buffers in postgresql.conf file.  Seems quite clear.
>

Why is SHMMAX so low on some O/Ses? What are the advantages?

My guess is it's a minimum vs median/popular situation. Get the same thing
looking at the default www.kernel.org linux kernel settings vs the Redhat
kernel settings.

I'd personally prefer the popular situation. But would that mean the
minimum case can't even boot up to recompile? Maybe the BSD guys should
ship with two kernels then. FreeBSD esp, since it's easy to recompile the
kernel, just do two, during installation default to "Regular", with an
option for "Tiny".

It's more fair that the people trying the extraordinary (16MB 386) should
be the ones doing the extra work.

Cheerio,
Link.


Re: Multiple Indexing, performance impact

From
Tom Lane
Date:
Lincoln Yeoh <lyeoh@pop.jaring.my> writes:
> Why is SHMMAX so low on some O/Ses?

Historical artifact, I think: the SysV IPC code was developed on
machines that were tiny by current standards.  Unfortunately, vendors
haven't stopped to review their kernel parameters and scale them up
appropriately.

            regards, tom lane

Re: Re: Multiple Indexing, performance impact

From
Bruce Momjian
Date:
> At 05:56 PM 22-06-2001 -0400, Bruce Momjian wrote:
> >> Bruce Momjian <pgman@candle.pha.pa.us> writes:
> >> Since 64 is already too much to let 7.1 fit in SHMMAX = 1MB, I think
> >> the original rationale for using 64 is looking pretty broken anyway.
> >> Comments?
> >
> >BSD/OS has a 4MB max but we document how to increase it by recompiling
> >the kernel.  Maybe if we fail the startup we can tell them how to
> >decrease the buffers in postgresql.conf file.  Seems quite clear.
> >
>
> Why is SHMMAX so low on some O/Ses? What are the advantages?
>
> My guess is it's a minimum vs median/popular situation. Get the same thing
> looking at the default www.kernel.org linux kernel settings vs the Redhat
> kernel settings.
>
> I'd personally prefer the popular situation. But would that mean the
> minimum case can't even boot up to recompile? Maybe the BSD guys should
> ship with two kernels then. FreeBSD esp, since it's easy to recompile the
> kernel, just do two, during installation default to "Regular", with an
> option for "Tiny".
>
> It's more fair that the people trying the extraordinary (16MB 386) should
> be the ones doing the extra work.

I think the problem is that with a default-sized kernel, the little guys
couldn't even boot the OS.  Also, some of the OS's hard-wire things into
the kernel for performance reasons.

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026