Thread: Priority to a mission critical transaction
Hi,
We have an application that is mission critical, normally very fast, but when an I/O or CPU bound transaction appears, the mission critical application suffers. Is there a way to give some kind of priority to this kind of application?
Reimer
On Tue, 2006-11-21 at 21:43 -0200, Carlos H. Reimer wrote:
> Hi,
>
> We have an application that is mission critical, normally very fast,
> but when an I/O or CPU bound transaction appears, the mission critical
> application suffers. Is there a way to give some kind of priority to
> this kind of application?
> Reimer

Not that I'm aware of. Depending on what the problem transactions are,
setting up a replica on a separate machine and running those
transactions against the replica might be the solution.

--
Brad Nicholson  416-673-4106
Database Administrator, Afilias Canada Corp.
On Thu, Nov 23, 2006 at 03:40:15PM -0500, Brad Nicholson wrote:
> On Tue, 2006-11-21 at 21:43 -0200, Carlos H. Reimer wrote:
> > Hi,
> >
> > We have an application that is mission critical, normally very fast,
> > but when an I/O or CPU bound transaction appears, the mission critical
> > application suffers. Is there a way to give some kind of priority to
> > this kind of application?
> > Reimer
>
> Not that I'm aware of. Depending on what the problem transactions are,
> setting up a replica on a separate machine and running those
> transactions against the replica might be the solution.

The BizGres project has been working on resource quotas, which might
eventually evolve to what you're looking for.

--
Jim Nasby   jim@nasby.net
EnterpriseDB   http://enterprisedb.com   512.569.9461 (cell)
Hi,

There is an article about "Lowering the priority of a PostgreSQL query"
(http://weblog.bignerdranch.com/?p=11) that explains how to use
setpriority() to lower the priority of PostgreSQL processes.

I'm wondering how effective it would be for I/O bound systems.

Will the setpriority() system call affect the I/O queue too?

Reimer

> -----Original Message-----
> From: Jim C. Nasby [mailto:jim@nasby.net]
> Sent: Sunday, November 26, 2006 22:52
> To: Brad Nicholson
> Cc: carlos.reimer@opendb.com.br; pgsql-performance@postgresql.org
> Subject: Re: [PERFORM] Priority to a mission critical transaction
>
> On Thu, Nov 23, 2006 at 03:40:15PM -0500, Brad Nicholson wrote:
> > On Tue, 2006-11-21 at 21:43 -0200, Carlos H. Reimer wrote:
> > > Hi,
> > >
> > > We have an application that is mission critical, normally very fast,
> > > but when an I/O or CPU bound transaction appears, the mission critical
> > > application suffers. Is there a way to give some kind of priority to
> > > this kind of application?
> > > Reimer
> >
> > Not that I'm aware of. Depending on what the problem transactions are,
> > setting up a replica on a separate machine and running those
> > transactions against the replica might be the solution.
>
> The BizGres project has been working on resource quotas, which might
> eventually evolve to what you're looking for.
> --
> Jim Nasby   jim@nasby.net
> EnterpriseDB   http://enterprisedb.com   512.569.9461 (cell)
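For concreteness, the mechanism the article describes is just the
setpriority(2) CPU-scheduling call aimed at one backend's PID. A
minimal sketch, under a couple of assumptions: the PID was obtained
separately (e.g. with "SELECT pg_backend_pid();" in the session to be
slowed down), and the program runs as a user allowed to renice that
process:

    /* renice_backend.c - sketch of the approach from the article:
     * lower the CPU priority of a single PostgreSQL backend.
     * Note: this adjusts CPU scheduling only; whether the I/O queue
     * is affected, directly or indirectly, is exactly what the rest
     * of this thread argues about.
     */
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <errno.h>
    #include <sys/time.h>
    #include <sys/resource.h>

    int main(int argc, char **argv)
    {
        if (argc != 3) {
            fprintf(stderr, "usage: %s <backend-pid> <nice-value>\n",
                    argv[0]);
            return 1;
        }
        int pid  = atoi(argv[1]);
        int prio = atoi(argv[2]);   /* -20 (favored) .. 19 (starved) */

        /* PRIO_PROCESS: "who" names one process, here the backend. */
        if (setpriority(PRIO_PROCESS, pid, prio) == -1) {
            fprintf(stderr, "setpriority(%d): %s\n", pid,
                    strerror(errno));
            return 1;
        }
        return 0;
    }

The same effect is available from a shell with renice(8); the question
in the thread is whether doing it at all is wise.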
"Carlos H. Reimer" <carlos.reimer@opendb.com.br> writes: > There is an article about "Lowering the priority of a PostgreSQL query" > (http://weblog.bignerdranch.com/?p=11) that explains how to use the > setpriority() to lower PostgreSQL processes. > I?m wondering how much effective it would be for i/o bound systems. That article isn't worth the electrons it's written on. Aside from the I/O point, there's a little problem called "priority inversion". See the archives for (many) past discussions of nice'ing backends. regards, tom lane
* Carlos H. Reimer <carlos.reimer@opendb.com.br> [061128 20:02]:
> Hi,
>
> There is an article about "Lowering the priority of a PostgreSQL query"
> (http://weblog.bignerdranch.com/?p=11) that explains how to use
> setpriority() to lower the priority of PostgreSQL processes.
>
> I'm wondering how effective it would be for I/O bound systems.
>
> Will the setpriority() system call affect the I/O queue too?

Nope, and in fact the article shows the way not to do it. See
http://en.wikipedia.org/wiki/Priority_inversion

Basically, lowering the priority of one backend in PostgreSQL can lead
to reduced performance of all of them, especially also the backends
with higher priorities. (Think of priority inversion as a timed soft
deadlock. It will eventually resolve, because it's not a real
deadlock, but it might mean halting important stuff for quite some
time.)

Taking the example above, consider the following processes and nice
values:

19x backends A,  nice = 0
1x  backend  B,  nice = 10 (doing maintenance work)
1x  updatedb,    nice = 5  (running as a cronjob at night)

Now it is possible (the probability depends upon your specific
situation) that backend B grabs some internal lock that is needed, and
then gets preempted by higher priority stuff. Well, the A backends
need that lock too, so they cannot run; instead we wait till updatedb
(which updates the locate search db, and goes through the whole
filesystem of the server) is finished.

Luckily, most if not all of these processes are disk I/O bound, so
they get interrupted anyway, and low priority processes don't starve.
Well, replace updatedb with something hogging the CPU, and rethink the
situation.

Andreas
All,

The Bizgres project is working on resource management for PostgreSQL.
So far, however, they have been able to come up with schemes that work
for BI/DW at the expense of OLTP. Because of O^N lock checking issues,
resource management for OLTP which doesn't greatly reduce overall
performance seems a near-impossible task.

--
Josh Berkus
PostgreSQL @ Sun
San Francisco
Short summary:

* Papers studying priority inversion issues with databases including
  PostgreSQL and realistic workloads conclude setpriority() helps even
  in the presence of priority inversion issues for TPC-C and TPC-W
  like workloads.

* Avoiding priority inversion with priority inheritance will further
  help some workloads (TPC-C) more than others (TPC-W), but even
  without such schedulers priority inversion does not cause as much
  harm as the benefit you get from indirectly scheduling I/O through
  setpriority() in any paper I've seen.

Andreas Kostyrka wrote:
> * Carlos H. Reimer <carlos.reimer@opendb.com.br> [061128 20:02]:
>> Will the setpriority() system call affect the I/O queue too?
>
> Nope, and in fact the article shows the way not to do it.

Actually *YES*, setpriority() does have an indirect effect on the I/O
queue.

This paper: http://www.cs.cmu.edu/~bianca/icde04.pdf studies
setpriority() with non-trivial (TPC-W and TPC-C) workloads on a
variety of databases and shows that setpriority() is *extremely*
effective for PostgreSQL.

  "For TPC-C on MVCC DBMS, and in particular PostgreSQL, CPU
  scheduling is most effective, due to its ability to indirectly
  schedule the I/O bottleneck. For TPC-C running on PostgreSQL, the
  simplest CPU scheduling policy (CPU-Prio) provides a factor of 2
  improvement for high-priority transactions, while adding priority
  inheritance (CPU-Prio-Inherit) provides a factor of 6 improvement
  while hardly penalizing low-priority transactions. Preemption
  (P-CPU) provides no appreciable benefit over CPU-Prio-Inherit."

> See http://en.wikipedia.org/wiki/Priority_inversion

Priority inversion is a well studied problem, and it depends on both
the workload and the database. In particular, TPC-W workloads have
been studied on a variety of databases including PostgreSQL. Again,
from http://www.cs.cmu.edu/~bianca/icde04.pdf:

They observe that avoiding priority inversion issues by enabling
priority inheritance with PostgreSQL has a negligible effect on TPC-W
like workloads, but a significant improvement on TPC-C like workloads.

  "Recall from Section 5.3 that CPU scheduling (CPUPrio) is more
  effective than NP-LQ for TPC-W. Thus Figure 8 compares the policies
  CPU-Prio-Inherit to CPU-Prio for the TPC-W workload on PostgreSQL.
  We find that there is no improvement for CPU-Prio-Inherit over
  CPU-Prio. This is to be expected given the low data contention found
  in the TPC-W workload; priority inversions can only occur during
  data contention. Results for low-priority transactions are not
  shown, but as in Figure 4, low-priority transactions are only
  negligibly penalized on average."

Yes, theoretically priority inversion can have pathologically bad
effects (not unlike qsort), and it affects some workloads more than
others. But in particular, their paper concludes that PostgreSQL with
TPC-C and TPC-W like workloads gains significant benefits and no
drawbacks from indirectly tuning I/O scheduling with setpriority().

If anyone has references to papers or studies that suggest that
priority inversion actually is a problem with RDBMSs - and PostgreSQL
on Linux in particular - I'd be very interested. Otherwise it seems to
me that existing research points to significant benefits with only
theoretical drawbacks in pathological cases.
Someone should ask them to remove the article.

---------------------------------------------------------------------------

Tom Lane wrote:
> "Carlos H. Reimer" <carlos.reimer@opendb.com.br> writes:
> > There is an article about "Lowering the priority of a PostgreSQL query"
> > (http://weblog.bignerdranch.com/?p=11) that explains how to use
> > setpriority() to lower the priority of PostgreSQL processes.
>
> > I'm wondering how effective it would be for I/O bound systems.
>
> That article isn't worth the electrons it's written on. Aside from the
> I/O point, there's a little problem called "priority inversion". See
> the archives for (many) past discussions of nice'ing backends.
>
> 			regards, tom lane

--
Bruce Momjian   bruce@momjian.us
EnterpriseDB    http://www.enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +
Before asking them to remove it, are we sure priority inversion is
really a problem?

I thought this paper: http://www.cs.cmu.edu/~bianca/icde04.pdf did a
pretty good job of studying priority inversion on RDBMSs including
PostgreSQL on various workloads (TPC-W and TPC-C) and found that the
benefits of setting priorities vastly outweighed the penalties of
priority inversion across all the databases and all the workloads they
tested.

Bruce Momjian wrote:
> Someone should ask them to remove the article.
>
> Tom Lane wrote:
>> "Carlos H. Reimer" <carlos.reimer@opendb.com.br> writes:
>>> There is an article about "Lowering the priority of a PostgreSQL query"
>>> (http://weblog.bignerdranch.com/?p=11) that explains how to use
>>> setpriority() to lower the priority of PostgreSQL processes.
>>> I'm wondering how effective it would be for I/O bound systems.
>>
>> That article isn't worth the electrons it's written on. Aside from the
>> I/O point, there's a little problem called "priority inversion". See
>> the archives for (many) past discussions of nice'ing backends.
>>
>> 			regards, tom lane
Josh Berkus wrote:
> All,
>
> The Bizgres project is working on resource management for PostgreSQL.
> So far, however, they have been able to come up with schemes that
> work for BI/DW at the expense of OLTP. Because of O^N lock checking
> issues, resource management for OLTP which doesn't greatly reduce
> overall performance seems a near-impossible task.

Right - I guess it is probably more correct to say that the
implementation used in Bizgres is specifically targeted at BI/DW
workloads rather than OLTP.

At this point we have not measured its impact on concurrency in
anything other than a handwaving manner - e.g. pgbench on an older SMP
system showed what looked like about a 10% hit. However, the noise
level for pgbench is typically >10%, so a better benchmark on better
hardware is needed.

Cheers

Mark
Ron Mayer wrote:
> Short summary:
>
> * Papers studying priority inversion issues with databases including
>   PostgreSQL and realistic workloads conclude setpriority() helps
>   even in the presence of priority inversion issues for TPC-C and
>   TPC-W like workloads.
>
> * Avoiding priority inversion with priority inheritance will further
>   help some workloads (TPC-C) more than others (TPC-W), but even
>   without such schedulers priority inversion does not cause as much
>   harm as the benefit you get from indirectly scheduling I/O through
>   setpriority() in any paper I've seen.
>
> Andreas Kostyrka wrote:
>> * Carlos H. Reimer <carlos.reimer@opendb.com.br> [061128 20:02]:
>>> Will the setpriority() system call affect the I/O queue too?
>>
>> Nope, and in fact the article shows the way not to do it.
>
> Actually *YES*, setpriority() does have an indirect effect on the I/O
> queue.

While I was at Greenplum a related point was made to me:

For a TPC-H/BI type workload on a well configured box the I/O
subsystem can be fast enough so that CPU is the bottleneck for much of
the time - so being able to use setpriority() as a resource controller
makes sense.

Also, with such a workload being mainly SELECT type queries, the
dangers connected with priority inversion are considerably reduced.

Cheers

Mark
Mark Kirkwood wrote:
> Ron Mayer wrote:
>> Actually *YES*, setpriority() does have an indirect effect on the
>> I/O queue.
>
> While I was at Greenplum a related point was made to me:
>
> For a TPC-H/BI type workload on a well configured box the I/O
> subsystem can be fast enough so that CPU is the bottleneck for much
> of the time - so being able to use setpriority() as a resource
> controller makes sense.

Perhaps - but section 4 of the paper in question (pages 3 through 6 of
the 12 pages at http://www.cs.cmu.edu/~bianca/icde04.pdf) goes to
great lengths to identify the bottlenecks for each workload and each
RDBMS. Indeed, for TPC-W on PostgreSQL and DB2, CPU was a bottleneck,
but not so for TPC-C - which had primarily I/O contention on
PostgreSQL and lock contention on DB2.

  "for TPC-C ... The main result shown in Figure 1 is that locks are
  the bottleneck resource for both Shore and DB2 (rows 1 and 2), while
  I/O tends to be the bottleneck resource for PostgreSQL (row 3). We
  now discuss these in more detail. ... Thus, CPU is the bottleneck
  resource for TPC-W 1."

> Also, with such a workload being mainly SELECT type queries, the
> dangers connected with priority inversion are considerably reduced.

And indeed the TPC-W benchmark did not show further improvement for
high priority transactions with priority inheritance enabled in the
scheduler (which mitigates the priority inversion problem) - but the
TPC-C benchmark did show further improvement - which agrees with
Mark's observation.

However, even with priority inversion problems, the indirect benefits
of setpriority() on I/O scheduling outweighed the penalties of
priority inversion in each of their test cases.
Ron Mayer wrote:
> Before asking them to remove it, are we sure priority inversion is
> really a problem?
>
> I thought this paper: http://www.cs.cmu.edu/~bianca/icde04.pdf did a
> pretty good job of studying priority inversion on RDBMSs including
> PostgreSQL on various workloads (TPC-W and TPC-C) and found that the
> benefits of setting priorities vastly outweighed the penalties of
> priority inversion across all the databases and all the workloads
> they tested.

I have the same question. I've done some embedded real-time
programming, so my innate reaction to priority inversions is that
they're evil. But, especially given priority inheritance, is there any
situation where priority inversion provides *worse* performance than
running everything at the same priority? I can easily come up with
situations where it devolves to that case - where all processes get
promoted to the same high priority. But I can't think of one where
using priorities makes things worse, and I can think of plenty where
it makes things better.

Brian
On Wed, 2006-11-29 at 08:25 -0500, Brian Hurt wrote:
> I have the same question. I've done some embedded real-time
> programming, so my innate reaction to priority inversions is that
> they're evil. But, especially given priority inheritance, is there
> any situation where priority inversion provides *worse* performance
> than running everything at the same priority? I can easily come up
> with situations where it devolves to that case - where all processes
> get promoted to the same high priority. But I can't think of one
> where using priorities makes things worse, and I can think of plenty
> where it makes things better.

It can make things worse when there are at least 3 priority levels
involved. The canonical sequence looks as follows:

LOW:  Acquire a lock
MED:  Start a long-running batch job that hogs CPU
HIGH: Wait on lock held by LOW task

At this point, the HIGH task can't run until the LOW task releases its
lock, but the LOW task can't run to completion and release its lock
until the MED job completes.

(random musing): I wonder if PG could efficiently be made to
temporarily raise the priority of any task holding a lock that a high
priority task waits on. I guess that would just make it so that
instead of HIGH tasks being effectively reduced to LOW, LOW tasks
could be promoted to HIGH.

--
Mark Lewis
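Mark's "random musing" is essentially the priority-inheritance
protocol that POSIX threads expose. As a point of reference only -
PostgreSQL does not use pthread mutexes for its locks - here is a
sketch of how a program opts into it, assuming a platform with
_POSIX_THREAD_PRIO_INHERIT support (on Linux this needs the PI-futex
work mentioned later in the thread):

    /* pi_mutex.c - sketch: a mutex with priority inheritance.
     * While a high-priority thread blocks on "lock", the kernel
     * temporarily boosts the current holder (the LOW task above),
     * so a CPU-hogging MED task can no longer keep it off the CPU.
     * Compile with -pthread.
     */
    #include <pthread.h>

    pthread_mutex_t lock;

    int init_pi_mutex(void)
    {
        pthread_mutexattr_t attr;
        int err;

        if ((err = pthread_mutexattr_init(&attr)) != 0)
            return err;
        /* The one line that makes it a PI mutex. */
        if ((err = pthread_mutexattr_setprotocol(&attr,
                                                 PTHREAD_PRIO_INHERIT)) != 0)
            return err;
        err = pthread_mutex_init(&lock, &attr);
        pthread_mutexattr_destroy(&attr);
        return err;
    }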
Mark Lewis wrote:
> It can make things worse when there are at least 3 priority levels
> involved. The canonical sequence looks as follows:
>
> LOW:  Acquire a lock
> MED:  Start a long-running batch job that hogs CPU
> HIGH: Wait on lock held by LOW task
>
> At this point, the HIGH task can't run until the LOW task releases
> its lock, but the LOW task can't run to completion and release its
> lock until the MED job completes.
>
> (random musing): I wonder if PG could efficiently be made to
> temporarily raise the priority of any task holding a lock that a
> high priority task waits on.

I thought that was what priority inheritance did - once HIGH blocks on
a lock held by LOW, LOW gets its priority raised to that of HIGH. Then
LOW takes precedence over MED. If LOW blocks on a lock held by MED
when it has the same priority as HIGH, MED gets its priority raised to
HIGH. Note that now all three processes are running with HIGH
priority - but is this any different from the default case of running
them at the same priority? This is what I was talking about when I
said I could imagine priority inheritance "devolving" to the single
priority case.

Of course, this is a little tricky to implement. I haven't looked at
how difficult it'd be within Postgres.

Brian
Brian Hurt wrote:
> Mark Lewis wrote:
>> On Wed, 2006-11-29 at 08:25 -0500, Brian Hurt wrote:
>>> But, especially given priority inheritance, is there any
>>> situation where priority inversion provides *worse* performance
>>> than running everything at the same priority?

Yes, there are certainly cases where a single high priority
transaction will suffer far worse than it otherwise would have.

Apparently there are plenty of papers stating that priority inversion
is a major problem in RDBMSs for problems that require that specific
deadlines have to be met (such as in real time systems). However, the
papers using the much weaker criteria of "most high priority things
finish faster than they would have otherwise, and the others aren't
hurt too bad" suggest that it's not as much of a problem. Two of the
articles referenced by the paper being discussed here apparently go
into these cases.

The question in my mind is whether overall the benefits outweigh the
penalties - in much the same way that qsort can have O(n^2) behavior
but in practice outweighs the penalties of many alternatives.

>> It can make things worse when there are at least 3 priority levels
>> involved. The canonical sequence looks as follows:
>>
>> LOW:  Acquire a lock
>> MED:  Start a long-running batch job that hogs CPU
>> HIGH: Wait on lock held by LOW task
>>
>> At this point, the HIGH task can't run until the LOW task releases
>> its lock, but the LOW task can't run to completion and release its
>> lock until the MED job completes.

Don't many OSs dynamically tweak priorities such that processes that
don't use most of their timeslice (like LOW) get priority boosts and
those that do use a lot of CPU (like MED) get penalized - which may
help protect against this particular sequence if you don't set LOW and
MED too far apart?

>> (random musing): I wonder if PG could efficiently be made to
>> temporarily raise the priority of any task holding a lock that a
>> high priority task waits on. ...
>
> I thought that was what priority inheritance did -

Yes, me too.

> Of course, this is a little tricky to implement. I haven't looked at
> how difficult it'd be within Postgres.

ISTM that it would be rather OS-dependent anyway. Different OSs have
different (or no) hooks - heck, even different 2.6.* Linuxes (pre
2.6.18 vs post) have different hooks for priority inheritance - so I
wouldn't really expect to see CPU scheduling policy details like that
merged with PostgreSQL, except maybe in a patched version from an RTOS
vendor.
Ron Mayer wrote:
> Brian Hurt wrote:
>> Mark Lewis wrote:
>>> On Wed, 2006-11-29 at 08:25 -0500, Brian Hurt wrote:
>>>> But, especially given priority inheritance, is there any
>>>> situation where priority inversion provides *worse* performance
>>>> than running everything at the same priority?
>
> Yes, there are certainly cases where a single high priority
> transaction will suffer far worse than it otherwise would have.

OK.

> Apparently there are plenty of papers stating that priority inversion
> is a major problem in RDBMSs for problems that require that specific
> deadlines have to be met (such as in real time systems).

It's definitely a problem in realtime systems, not just realtime
DBMSs. In this case, running everything at the same priority doesn't
work and isn't an option.

> The question in my mind is whether overall the benefits outweigh the
> penalties - in much the same way that qsort can have O(n^2) behavior
> but in practice outweighs the penalties of many alternatives.

Also, careful choice of pivot values, and switching to other sorting
methods like heapsort when you detect you're in a pathological case,
help. Make the common case fast and the pathological case not
something that causes the database to fall over.

Although I'm tempted to make the issue more complex by throwing
Software Transactional Memory into the mix:

http://citeseer.ist.psu.edu/shavit95software.html
http://citeseer.ist.psu.edu/anderson95realtime.html

That second paper is interesting in that it says that STM solves the
priority inversion problem. Basically, the higher priority process
forces the lower priority process to abort its transaction and retry
it (a toy sketch of the idea follows this message).

Is it possible to recast Postgres' use of locks to use STM instead?
How would STM interact with Postgres' existing transactions? I don't
know. This would almost certainly require Postgres to write its own
locking, with all the problems that entails (does the source currently
use inline assembly anywhere? I'd guess not.).

Setting priorities would be a solution to a problem I haven't hit yet,
but can see myself needing to deal with. Which is why I'm interested
in this issue. If it's a case of "setting priorities can make things
better, and doesn't make things worse", great. If it's a case of
"setting priorities can make things better, but occasionally makes
things much worse", that's a problem.

>> Of course, this is a little tricky to implement. I haven't looked
>> at how difficult it'd be within Postgres.
>
> ISTM that it would be rather OS-dependent anyway. Different OSs have
> different (or no) hooks - heck, even different 2.6.* Linuxes (pre
> 2.6.18 vs post) have different hooks for priority inheritance - so I
> wouldn't really expect to see CPU scheduling policy details like
> that merged with PostgreSQL, except maybe in a patched version from
> an RTOS vendor.

Hmm. I was thinking of Posix.4's setpriority() call.

Brian
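To make the STM idea concrete: its core is optimistic execution with
abort-and-retry instead of blocking, so a high-priority task never
sleeps on a lock held by a preempted low-priority task. A toy sketch
in C11 - purely illustrative, and emphatically not how Postgres
manages shared state:

    /* stm_sketch.c - the abort-and-retry idea in miniature.
     * A "transaction" snapshots the shared value, computes a new one,
     * and commits with compare-and-swap. If another thread committed
     * first, the CAS fails: that is our abort, and we retry. No one
     * ever blocks while holding a lock, so the LOW/MED/HIGH inversion
     * above cannot arise. Compile with -std=c11.
     */
    #include <stdatomic.h>

    static _Atomic long balance = 1000;

    void deposit(long amount)
    {
        long old = atomic_load(&balance);
        for (;;) {
            long new = old + amount;   /* compute against snapshot */
            /* Commit iff balance is still 'old'; on failure the CAS
             * reloads 'old' with the current value and we retry.   */
            if (atomic_compare_exchange_weak(&balance, &old, new))
                break;
        }
    }

A real STM adds read/write sets and conflict detection; the second
paper's point is that the scheduler can always pick the lower-priority
transaction as the one to abort.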
Brian Hurt wrote:
> Ron Mayer wrote:
>> Brian Hurt wrote:
>>> Mark Lewis wrote:
>>>> On Wed, 2006-11-29 at 08:25 -0500, Brian Hurt wrote:
>>>>> But, especially given priority inheritance, is there any
>
> That second paper is interesting in that it says that STM solves the
> priority inversion problem. Basically, the higher priority process
> forces the lower priority process to abort its transaction and retry
> it.
>
> Is it possible to recast Postgres' use of locks to use STM instead?

If I read the CMU paper right (http://www.cs.cmu.edu/~bianca/icde04.pdf),
that's equivalent to what they call "preemptive abort scheduling",
which they tested as well. They compared it against priority
inheritance on PostgreSQL and found the two about equivalent:

  "Preemptive scheduling (P-LQ and P-CPU) attempts to eliminate the
  wait excess for high-priority transactions by preempting
  low-priority lock holders in the way of high-priority transactions.
  We find that preemptive policies provide little benefit ... TPC-C
  running on PostgreSQL ... Preemption (P-CPU) provides no appreciable
  benefit over CPU-Prio-Inherit."

> Setting priorities would be a solution to a problem I haven't hit
> yet, but can see myself needing to deal with. Which is why I'm
> interested in this issue. If it's a case of "setting priorities can
> make things better, and doesn't make things worse", great. If it's a
> case of "setting priorities can make things better, but occasionally
> makes things much worse", that's a problem.

From the papers, it seems to depend quite a bit on the workload. Every
actual experiment I've seen published suggests that on average the
higher-priority transactions will do better - but that there is the
risk of specific individual high priority transactions being slower
than they would otherwise have been. I have yet to see a case where
anyone measured anything getting "much" worse, though.

>>> Of course, this is a little tricky to implement. I haven't looked
>>> at how difficult it'd be within Postgres.
>>
>> ISTM that it would be rather OS-dependent anyway. Different OSs
>> have different (or no) hooks - heck, even different 2.6.* Linuxes
>> (pre 2.6.18 vs post) have different hooks for priority inheritance
>> - so I wouldn't really expect to see CPU scheduling policy details
>> like that merged with PostgreSQL, except maybe in a patched version
>> from an RTOS vendor.
>
> Hmm. I was thinking of Posix.4's setpriority() call.

Hmm - I thought you were thinking the priority inheritance would be
using something like the priority-inheriting futexes that were added
to the Linux kernel in 2.6.18.

http://lwn.net/Articles/178253/
http://www.linuxhq.com/kernel/v2.6/18/Documentation/pi-futex.txt
Bruce,

> Someone should ask them to remove the article.

"Someone". Um, *who* taught for Big Nerd Ranch for several years,
Bruce?

--
Josh Berkus
PostgreSQL @ Sun
San Francisco
Thread: Postgres and really huge tables

Is there any experience with Postgresql and really huge tables? I'm
talking about terabytes (plural) here in a single table. Obviously the
table will be partitioned, and probably spread among several different
file systems. Any other tricks I should know about?

We have a problem of that form here. When I asked why postgres wasn't
being used, the opinion that postgres would "just <expletive> die" was
given. Personally, I'd bet money postgres could handle the problem
(and better than the ad-hoc solution we're currently using). But I'd
like a couple of replies of the form "yeah, we do that here - no
problem" to wave around.

Brian
Brian Hurt wrote:
> Is there any experience with Postgresql and really huge tables? I'm
> talking about terabytes (plural) here in a single table. Obviously
> the table will be partitioned, and probably spread among several
> different file systems. Any other tricks I should know about?
>
> We have a problem of that form here. When I asked why postgres wasn't
> being used, the opinion that postgres would "just <expletive> die"
> was given. Personally, I'd bet money postgres could handle the
> problem (and better than the ad-hoc solution we're currently using).
> But I'd like a couple of replies of the form "yeah, we do that here -
> no problem" to wave around.

It entirely depends on the machine and how things are accessed. In
theory you could have a multi-terabyte table, but my question of
course is: why in the world would you do that? That is what
partitioning is for.

Regardless, appropriate use of things like partial indexes should make
it possible.

Joshua D. Drake

--
=== The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
Providing the most comprehensive PostgreSQL solutions since 1997
http://www.commandprompt.com/

Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate
PostgreSQL Replication: http://www.commandprompt.com/products/
On Thu, 2007-01-18 at 14:31, Brian Hurt wrote:
> Is there any experience with Postgresql and really huge tables? I'm
> talking about terabytes (plural) here in a single table. Obviously
> the table will be partitioned, and probably spread among several
> different file systems. Any other tricks I should know about?
>
> We have a problem of that form here. When I asked why postgres wasn't
> being used, the opinion that postgres would "just <expletive> die"
> was given. Personally, I'd bet money postgres could handle the
> problem (and better than the ad-hoc solution we're currently using).
> But I'd like a couple of replies of the form "yeah, we do that here -
> no problem" to wave around.

It really depends on what you're doing. Are you updating every row by
a single user every hour, or are you updating dozens of rows by
hundreds of users at the same time?

PostgreSQL probably wouldn't die, but it may well be that for certain
batch processing operations it's a poorer choice than awk/sed or perl.

If you do want to tackle it with PostgreSQL, you'll likely want to
build a truly fast drive subsystem. Something like dozens to hundreds
of drives in a RAID-10 setup with battery backed cache, and a main
server with lots of memory on board.

But, really, it depends on what you're doing to the data.
> Is there any experience with Postgresql and really huge tables? I'm
> talking about terabytes (plural) here in a single table. Obviously
> the table will be partitioned, and probably spread among several
> different file systems. Any other tricks I should know about?
>
> We have a problem of that form here. When I asked why postgres wasn't
> being used, the opinion that postgres would "just <expletive> die"
> was given. Personally, I'd bet money postgres could handle the
> problem (and better than the ad-hoc solution we're currently using).
> But I'd like a couple of replies of the form "yeah, we do that here -
> no problem" to wave around.

I've done a project using 8.1 on Solaris that had a table that was
close to 2TB. The funny thing is that it just worked fine even without
partitioning.

But, then again: the size of a single record was huge too: ~50K. So
there were not insanely many records: "just" something in the order of
tens of millions.

The queries were just done on some int fields, so the index of the
whole thing fit into RAM.

A lot of data, but not a lot of records... I don't know if that's
valid. I guess the people at Greenplum and/or Sun have more exciting
stories ;)

Bye,
Chris.
Brian Hurt <bhurt@janestcapital.com> writes:
> Is there any experience with Postgresql and really huge tables? I'm
> talking about terabytes (plural) here in a single table.

The 2MASS sky survey point-source catalog
http://www.ipac.caltech.edu/2mass/releases/allsky/doc/sec2_2a.html
is 470 million rows by 60 columns; I don't have it loaded up but a
very conservative estimate would be a quarter terabyte. (I've got a
copy of the data ... 5 double-sided DVDs, gzipped ...)

I haven't heard from Rae Stiening recently but I know he's been using
Postgres to whack that data around since about 2001 (PG 7.1 or so,
which is positively medieval compared to current releases). So at
least for static data, it's certainly possible to get useful results.
What are your processing requirements?

			regards, tom lane
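(Checking that estimate: 0.25 TB over 470 million rows comes to
roughly 530 bytes per row, i.e. under 9 bytes per column across 60
columns - plausible for a mostly numeric catalog even before
row-header and index overhead, so "very conservative" looks right.)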
Chris,

On 1/18/07 1:42 PM, "Chris Mair" <chris@1006.org> wrote:
> A lot of data, but not a lot of records... I don't know if that's
> valid. I guess the people at Greenplum and/or Sun have more exciting
> stories ;)

You guess correctly :-)

Given that we're Postgres 8.2, etc. compatible, that might answer
Brian's coworker's question. Soon we will be able to see that
Greenplum/Postgres are handling the world's largest databases both in
record count and size.

While the parallel scaling technology we employ is closed source, we
are still contributing scaling technology to the community
(partitioning, bitmap index, sort improvements, resource management,
more to come), so Postgres as a "bet" is likely safer and better than
a completely closed source commercial product.

- Luke
Hi Brian,

On Thu, 18 Jan 2007, Brian Hurt wrote:
> Is there any experience with Postgresql and really huge tables? I'm
> talking about terabytes (plural) here in a single table. Obviously
> the table will be partitioned, and probably spread among several
> different file systems. Any other tricks I should know about?

Here is a blog post from a user who is in the multi-TB range:
http://www.lethargy.org/~jesus/archives/49-PostreSQL-swelling.html

I think Theo sums up some of the pros and cons well. Your best bet is
a test at scale. Be sure to get our feedback if you encounter issues.

Gavin
On Thu, 18 Jan 2007, Tom Lane wrote:
> Brian Hurt <bhurt@janestcapital.com> writes:
>> Is there any experience with Postgresql and really huge tables? I'm
>> talking about terabytes (plural) here in a single table.
>
> The 2MASS sky survey point-source catalog
> http://www.ipac.caltech.edu/2mass/releases/allsky/doc/sec2_2a.html
> is 470 million rows by 60 columns; I don't have it loaded up but a
> very conservative estimate would be a quarter terabyte. ...
> What are your processing requirements?

We are working in production with 2MASS and other catalogues, and
2MASS is not the biggest. The NOMAD catalog has more than a billion
records. You can query them online at
http://vo.astronet.ru/cas/conesearch.php

Everything is in PostgreSQL 8.1.5, and at present we are migrating to
8.2.1, which is very slow because of the slow COPY.

The hardware we use is HP rx1620, dual Itanium2, MSA 20, currently
4.5 TB.

Regards,
Oleg
_____________________________________________________________
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
Sternberg Astronomical Institute, Moscow University, Russia
Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83
On 1/18/07, Brian Hurt <bhurt@janestcapital.com> wrote:
> Is there any experience with Postgresql and really huge tables? I'm
> talking about terabytes (plural) here in a single table. Obviously
> the table will be partitioned, and probably spread among several
> different file systems. Any other tricks I should know about?

A pretty effective partitioning strategy that works in some cases is
to identify a criterion in your dataset that isolates your data on a
session basis. For example, if you have a company_id that divides up
your company data and a session only needs to deal with one
company_id, you can separate out all your tables based on company_id
into different schemas and have the session set the search_path
variable when it logs in (see the sketch after this message). Data
that does not partition on your criterion sits in public schemas that
all the companies can see.

This takes advantage of a special property of stored procedures: they
do not attach to tables until the first time they are executed in a
session - keeping you from having to make a function for each schema.
(Note: views do not have this property.) You can still cross-query
using views and the like, or hand-rolled SQL.

I would call this type of partitioning logical partitioning, since you
are leveraging logical divisions in your data. It obviously doesn't
work in all cases, but when it does it works great.

> We have a problem of that form here. When I asked why postgres wasn't
> being used, the opinion that postgres would "just <expletive> die"
> was given. Personally, I'd bet money postgres could handle the
> problem (and better than the ad-hoc solution we're currently using).
> But I'd like a couple of replies of the form "yeah, we do that here -
> no problem" to wave around.

pg will of course not die when your dataset hits a certain threshold.
It will become slower based on well-known mathematical patterns that
grow with your working set size. One of the few things that gets to be
a pain with large tables is vacuum - since you can't vacuum a piece of
a table, and there are certain annoyances with having a long-running
vacuum, this is something to think about.

Speaking broadly about table partitioning: it optimizes one case at
the expense of another. Your focus (IMO) should be on reducing your
working set size under certain conditions - not the physical file
size. If you have a properly laid out and logical dataset and can
identify special cases where you need some information and not other
information, the partitioning strategy should fall into place, whether
it is to do nothing, isolate data into separate schemas/tables/files,
or use the built-in table partitioning feature (which to be honest I
am not crazy about).

merlin
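A minimal sketch of Merlin's session-based schema trick as seen from a
libpq client; the database name "app" and schema "company_42" are
made-up placeholders:

    /* set_search_path.c - pick one company's schema for this session.
     * Unqualified table names then resolve to company_42.* when such
     * a table exists, falling back to the shared tables in public.
     * Compile with: cc set_search_path.c -lpq
     */
    #include <stdio.h>
    #include <libpq-fe.h>

    int main(void)
    {
        PGconn *conn = PQconnectdb("dbname=app");
        if (PQstatus(conn) != CONNECTION_OK) {
            fprintf(stderr, "%s", PQerrorMessage(conn));
            return 1;
        }

        /* Session-level partitioning choice: one SET at login. */
        PGresult *res = PQexec(conn,
                               "SET search_path TO company_42, public");
        if (PQresultStatus(res) != PGRES_COMMAND_OK)
            fprintf(stderr, "%s", PQerrorMessage(conn));
        PQclear(res);

        /* ... normal queries against unqualified names go here ... */

        PQfinish(conn);
        return 0;
    }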
> A lot of data, but not a lot of records... I don't know if that's
> valid. I guess the people at Greenplum and/or Sun have more exciting
> stories ;)

Not really. Pretty much multi-terabyte tables are fine on vanilla
PostgreSQL if you can stick to partitioned and/or indexed access. If
you need to do unindexed fishing expeditions on 5TB of data, then talk
to Greenplum.

http://www.powerpostgresql.com/Downloads/terabytes_osc2005.pdf

--
--Josh

Josh Berkus
PostgreSQL @ Sun
San Francisco