Thread: Maximum number of sequences that can be created

Maximum number of sequences that can be created

From: Vidhya Bondre
Hi All,

Is there any maximum limit on the number of sequences that can be created in a database? Also, if we create millions of sequences in a single DB, what are the downsides?

Regards
Vidhya


Re: Maximum number of sequences that can be created

From: Robert Klemme
On Fri, May 11, 2012 at 12:50 PM, Vidhya Bondre <meetvbondre@gmail.com> wrote:
> Is there any maximum limit on the number of sequences that can be created in
> a database? Also, if we create millions of sequences in a single DB, what
> are the downsides?

On the contrary: what would be the /advantage/ of being able to create
millions of sequences?  What's the use case?

Cheers

robert

--
remember.guy do |as, often| as.you_can - without end
http://blog.rubybestpractices.com/

Re: Maximum number of sequences that can be created

From: Віталій Тимчишин


2012/5/11 Robert Klemme <shortcutter@googlemail.com>
> On Fri, May 11, 2012 at 12:50 PM, Vidhya Bondre <meetvbondre@gmail.com> wrote:
>> Is there any maximum limit on the number of sequences that can be created
>> in a database? Also, if we create millions of sequences in a single DB,
>> what are the downsides?

Sequences are, AFAIK, accounted as relations. A large list of relations may slow down various system utilities such as vacuuming (or may not, depending on the queries and the indexes on pg_class).

> On the contrary: what would be the /advantage/ of being able to create
> millions of sequences?  What's the use case?


We are using sequences as statistics counters - they produce almost no performance impact, and we can tolerate their non-transactional nature. I can imagine someone wanting a sequence per user or per row of some other relation.
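
A minimal sketch of such a counter (the sequence name is hypothetical):

CREATE SEQUENCE stats_page_views;

-- Each event bumps the counter; nextval() does not block other sessions
-- and is not rolled back with the surrounding transaction:
SELECT nextval('stats_page_views');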

--
Best regards,
 Vitalii Tymchyshyn

Re: Maximum number of sequences that can be created

From: Robert Klemme
On Sun, May 13, 2012 at 10:12 AM, Віталій Тимчишин <tivv00@gmail.com> wrote:
> 2012/5/11 Robert Klemme <shortcutter@googlemail.com>

>> On the contrary: what would be the /advantage/ of being able to create
>> millions of sequences?  What's the use case?
>
> We are using sequences as statistics counters - they produce almost no
> performance impact, and we can tolerate their non-transactional nature.
> I can imagine someone wanting a sequence per user or per row of some
> other relation.

I can almost see the point. But my natural choice in that case would
be a table with two columns.  Would that actually be so much less
efficient? Of course you'd have fully transactional behavior and thus
locking.
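
Something like this, for comparison (a sketch, names hypothetical):

CREATE TABLE counters (
    counter_id bigint PRIMARY KEY,
    value      bigint NOT NULL DEFAULT 0
);

-- Fully transactional, but concurrent writers queue behind the row lock
-- until the updating transaction commits or rolls back:
UPDATE counters SET value = value + 1 WHERE counter_id = 1;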

Kind regards

robert


--
remember.guy do |as, often| as.you_can - without end
http://blog.rubybestpractices.com/

Re: Maximum number of sequences that can be created

From: Craig James


On Sun, May 13, 2012 at 1:12 AM, Віталій Тимчишин <tivv00@gmail.com> wrote:

> Sequences are, AFAIK, accounted as relations. A large list of relations
> may slow down various system utilities such as vacuuming (or may not,
> depending on the queries and the indexes on pg_class).

Not "may slow down."  Change that to "will slow down and possibly corrupt" your system.

In my experience (PG 8.4.x), the system can handle in the neighborhood of 100,000 relations pretty well.  Somewhere over 1,000,000 relations, the system becomes unusable.  It's not that it stops working -- day-to-day operations such as querying your tables and running your applications continue to work.  But system operations that have to scan for table information seem to freeze (maybe they run out of memory, or hit an O(N^2) operation, and simply never complete).

For example, pg_dump fails altogether.  After 24 hours, it won't even start writing to its output file.  Auto-completion of table and column names in psql freezes the session.  It takes minutes to drop one table.  Stuff like that.  You'll have a system that works, but can't be backed up, dumped, repaired or managed.
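
For a rough census of what a given database holds (pg_class is per-database), something like:

-- Count relations of each kind (table, index, sequence, TOAST table, ...)
-- in the current database:
SELECT relkind, count(*)
FROM pg_class
GROUP BY relkind
ORDER BY count(*) DESC;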

As I said, this was 8.4.x. Things may have changed in 9.x.

Craig

Re: Maximum number of sequences that can be created

From: Jeff Janes
On Sun, May 13, 2012 at 9:01 AM, Craig James <cjames@emolecules.com> wrote:
>
> In my experience (PG 8.4.x), the system can handle in the neighborhood of
> 100,000 relations pretty well.  Somewhere over 1,000,000 relations, the
> system becomes unusable.  It's not that it stops working -- day-to-day
> operations such as querying your tables and running your applications
> continue to work.  But system operations that have to scan for table
> information seem to freeze (maybe they run out of memory, or hit an
> O(N^2) operation, and simply never complete).
>
> For example, pg_dump fails altogether.  After 24 hours, it won't even start
> writing to its output file.  Auto-completion of table and column names in
> psql freezes the session.  It takes minutes to drop one table.  Stuff like
> that.  You'll have a system that works, but can't be backed up, dumped,
> repaired or managed.
>
> As I said, this was 8.4.x.  Things may have changed in 9.x.

I think some of those things might have improved, but enough of them
have not, or not by enough.

So I agree with your assessment: under 9.2, having millions of
sequences might technically work, but it would render the database
virtually unmanageable.

Cheers,

Jeff

Re: Maximum number of sequences that can be created

From: Віталій Тимчишин


2012/5/13 Robert Klemme <shortcutter@googlemail.com>
> On Sun, May 13, 2012 at 10:12 AM, Віталій Тимчишин <tivv00@gmail.com> wrote:
>> We are using sequences as statistics counters - they produce almost no
>> performance impact, and we can tolerate their non-transactional nature.
>> I can imagine someone wanting a sequence per user or per row of some
>> other relation.
>
> I can almost see the point. But my natural choice in that case would
> be a table with two columns.  Would that actually be so much less
> efficient? Of course you'd have fully transactional behavior and thus
> locking.

We've had concurrency problems with the table solution (a counter updated by many concurrent queries), so we traded transactionality for speed. We actually use this data to draw pretty graphs in Nagios, so it's quite OK. But we have only ~10 sequences, not millions :)
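
(For the graphs, the current value can be read back with a plain SELECT, since a sequence is readable like a one-row relation; sequence name hypothetical:)

-- Possibly slightly stale, which is fine for graphing:
SELECT last_value FROM requests_served;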

--
Best regards,
 Vitalii Tymchyshyn

Re: Maximum number of sequences that can be created

From: Andres Freund
On Tuesday, May 15, 2012 08:29:11 AM Віталій Тимчишин wrote:
> 2012/5/13 Robert Klemme <shortcutter@googlemail.com>
>
> > On Sun, May 13, 2012 at 10:12 AM, Віталій Тимчишин <tivv00@gmail.com> wrote:
> > > 2012/5/11 Robert Klemme <shortcutter@googlemail.com>
> > >
> > >> On the contrary: what would be the /advantage/ of being able to create
> > >> millions of sequences?  What's the use case?
> > >
> > > We are using sequences as statistics counters - they produce almost no
> > > performance impact, and we can tolerate their non-transactional nature.
> > > I can imagine someone wanting a sequence per user or per row of some
> > > other relation.
> >
> > I can almost see the point. But my natural choice in that case would
> > be a table with two columns.  Would that actually be so much less
> > efficient? Of course you'd have fully transactional behavior and thus
> > locking.
>
> We've had concurrency problems with the table solution (a counter updated
> by many concurrent queries), so we traded transactionality for speed. We
> actually use this data to draw pretty graphs in Nagios, so it's quite OK.
> But we have only ~10 sequences, not millions :)
I would rather suggest going with a summing table if you need to do something
like that:

sequence_id | value
1 | 3434334
1 | 1
1 | -1
1 | 1
1 | 1
...

You can then get the current value with SELECT SUM(value) FROM <table> WHERE
sequence_id = 1. For garbage collection you can delete those rows and insert
the newly summed-up value again.
That solution won't ever block if done right.
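
A sketch of that scheme (names hypothetical; the writable CTE used for garbage collection needs 9.1+):

CREATE TABLE counter_deltas (
    sequence_id integer NOT NULL,
    value       bigint  NOT NULL
);
CREATE INDEX counter_deltas_seq_idx ON counter_deltas (sequence_id);

-- Every increment is a plain INSERT, so writers never block each other:
INSERT INTO counter_deltas VALUES (1, 1);

-- Current value:
SELECT sum(value) FROM counter_deltas WHERE sequence_id = 1;

-- Garbage collection: fold all deltas into a single row. The DELETE and
-- the re-INSERT happen in one statement, against one snapshot, so no
-- delta committed in between can be lost:
WITH folded AS (
    DELETE FROM counter_deltas WHERE sequence_id = 1 RETURNING value
)
INSERT INTO counter_deltas
SELECT 1, coalesce(sum(value), 0) FROM folded;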

Andres

Re: Maximum number of sequences that can be created

From: Robert Klemme
Hi,

On Tue, May 15, 2012 at 12:57 PM, Andres Freund <andres@anarazel.de> wrote:

> I would rather suggest going with a summing table if you need to do something
> like that:
>
> sequence_id | value
> 1 | 3434334
> 1 | 1
> 1 | -1
> 1 | 1
> 1 | 1
> ...
>
> You can then get the current value with SELECT SUM(value) FROM <table>
> WHERE sequence_id = 1. For garbage collection you can delete those rows
> and insert the newly summed-up value again.
> That solution won't ever block if done right.

I was going to suggest another variant which would not need GC and would
also increase concurrency:

sequence_id | hash | value
1 | 0 | 3
1 | 1 | 9
1 | 2 | 0
1 | 3 | 2
...

with PK = (sequence_id, hash) and hash in a fixed range (say 0..15).

Value would be obtained the same way, i.e. via
SELECT SUM(value) FROM T WHERE sequence_id = 1

The hash value would have to be calculated

 - at session start time (cheap but might reduce concurrency due to
small number of changes) or
 - at TX start time (more expensive but probably better concurrency
due to higher change rate)
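
For instance, with the backend PID as a cheap per-session hash (a sketch, names hypothetical):

CREATE TABLE counter_slots (
    sequence_id integer NOT NULL,
    hash        integer NOT NULL,
    value       bigint  NOT NULL DEFAULT 0,
    PRIMARY KEY (sequence_id, hash)
);

-- Pre-populate the 16 slots for counter 1:
INSERT INTO counter_slots (sequence_id, hash)
SELECT 1, s FROM generate_series(0, 15) AS s;

-- A session only contends with sessions that hash to the same slot:
UPDATE counter_slots
   SET value = value + 1
 WHERE sequence_id = 1 AND hash = pg_backend_pid() % 16;

-- Current value, as above:
SELECT sum(value) FROM counter_slots WHERE sequence_id = 1;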

Kind regards

robert

--
remember.guy do |as, often| as.you_can - without end
http://blog.rubybestpractices.com/

Re: Maximum number of sequences that can be created

From: Greg Spiegelberg
On Sun, May 13, 2012 at 10:01 AM, Craig James <cjames@emolecules.com> wrote:
> On Sun, May 13, 2012 at 1:12 AM, Віталій Тимчишин <tivv00@gmail.com> wrote:
>> Sequences are, AFAIK, accounted as relations. A large list of relations
>> may slow down various system utilities such as vacuuming (or may not,
>> depending on the queries and the indexes on pg_class).
>
> Not "may slow down."  Change that to "will slow down and possibly corrupt"
> your system.
>
> In my experience (PG 8.4.x), the system can handle in the neighborhood of
> 100,000 relations pretty well.  Somewhere over 1,000,000 relations, the
> system becomes unusable. [...]

Glad I found this thread.

Is this 1M-relation mark for the whole database cluster, or just for a single database within the cluster?

Thanks,
-Greg

Re: Maximum number of sequences that can be created

From: Craig James
On Fri, May 25, 2012 at 4:58 AM, Greg Spiegelberg <gspiegelberg@gmail.com> wrote:
> On Sun, May 13, 2012 at 10:01 AM, Craig James <cjames@emolecules.com> wrote:
>> In my experience (PG 8.4.x), the system can handle in the neighborhood of
>> 100,000 relations pretty well.  Somewhere over 1,000,000 relations, the
>> system becomes unusable. [...]
>
> Glad I found this thread.
>
> Is this 1M-relation mark for the whole database cluster, or just for a
> single database within the cluster?

I don't know.  When I discovered this, our system only had a few dozen databases, and I never conducted any experiments.  We had to write our own version of pg_dump to get the data out of the damaged system, and then reload from scratch.  And it's not a "hard" number.  Even at a million relations things work ... they just bog down dramatically.  By the time I got to 5 million relations (a rogue script was creating 50,000 tables per day and not cleaning up), the system was effectively unusable.

Craig


