Thread: Maximum number of sequences that can be created
Hi All,
Is there any max limit set on the number of sequences that can be created in a database? Also I would like to know: if we create millions of sequences in a single db, what is the downside of it?
Regards
Vidhya
On Fri, May 11, 2012 at 12:50 PM, Vidhya Bondre <meetvbondre@gmail.com> wrote:
> Is there any max limit set on the number of sequences that can be created
> in a database? Also I would like to know: if we create millions of
> sequences in a single db, what is the downside of it?

On the contrary: what would be the /advantage/ of being able to create
millions of sequences? What's the use case?

Cheers

robert

--
remember.guy do |as, often| as.you_can - without end
http://blog.rubybestpractices.com/
2012/5/11 Robert Klemme <shortcutter@googlemail.com>
> On Fri, May 11, 2012 at 12:50 PM, Vidhya Bondre <meetvbondre@gmail.com> wrote:
>> Is there any max limit set on the number of sequences that can be created
>> in a database? Also I would like to know: if we create millions of
>> sequences in a single db, what is the downside of it?

Sequences AFAIK are accounted as relations. A large list of relations may slow down various system utilities such as vacuuming (or may not, depending on the queries and the indexes on pg_class).

> On the contrary: what would be the /advantage/ of being able to create
> millions of sequences? What's the use case?

We are using sequences as statistics counters - they produce almost no performance impact and we can tolerate their non-transactional nature. I can imagine someone wanting to have a sequence per user or per row of some other relation.
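For what it's worth, a minimal sketch of the sequence-as-counter idea (the sequence name here is made up purely for illustration):

  -- one counter per metric; nextval() is the increment
  CREATE SEQUENCE stats_page_views;

  -- bump the counter (non-transactional: the increment survives a rollback)
  SELECT nextval('stats_page_views');

  -- read the current value without incrementing it
  SELECT last_value FROM stats_page_views;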
Best regards,
Vitalii Tymchyshyn
On Sun, May 13, 2012 at 10:12 AM, Віталій Тимчишин <tivv00@gmail.com> wrote:
> 2012/5/11 Robert Klemme <shortcutter@googlemail.com>
>> On the contrary: what would be the /advantage/ of being able to create
>> millions of sequences? What's the use case?
>
> We are using sequences as statistics counters - they produce almost no
> performance impact and we can tolerate their non-transactional nature. I
> can imagine someone wanting to have a sequence per user or per row of some
> other relation.

I can almost see the point. But my natural choice in that case would
be a table with two columns. Would that actually be so much less
efficient? Of course you'd have fully transactional behavior and thus
locking.

Kind regards

robert

--
remember.guy do |as, often| as.you_can - without end
http://blog.rubybestpractices.com/
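A two-column counter table of the sort described above might look roughly like this (table and column names are invented for illustration):

  CREATE TABLE counters (
      name  text PRIMARY KEY,
      value bigint NOT NULL DEFAULT 0
  );

  -- fully transactional increment; concurrent writers to the same row
  -- queue up on the row lock until the earlier transaction commits
  UPDATE counters SET value = value + 1 WHERE name = 'page_views';

That row-level locking is the contention trade-off discussed later in the thread.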
On Sun, May 13, 2012 at 1:12 AM, Віталій Тимчишин <tivv00@gmail.com> wrote:
> Sequences AFAIK are accounted as relations. A large list of relations may
> slow down various system utilities such as vacuuming (or may not, depending
> on the queries and the indexes on pg_class).
Not "may slow down." Change that to "will slow down and possibly corrupt" your system.
In my experience (PG 8.4.x), the system can handle in the neighborhood of 100,000 relations pretty well. Somewhere over 1,000,000 relations, the system becomes unusable. It's not that it stops working -- day-to-day operations such as querying your tables and running your applications continue to work. But system operations that have to scan for table information seem to freeze (maybe they run out of memory, or run into an O(N^2) operation, and simply cease to complete).
For example, pg_dump fails altogether. After 24 hours, it won't even have started writing to its output file. Auto-completion of table and column names in psql freezes the system. It takes minutes to drop one table. Stuff like that. You'll have a system that works, but can't be backed up, dumped, repaired or managed.
As I said, this was 8.4.x. Things may have changed in 9.x.
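One simple way to see how many relations a database is carrying (an illustrative catalog query, not something from the original report):

  -- total relations (tables, indexes, sequences, TOAST tables, ...) in the current database
  SELECT count(*) FROM pg_class;

  -- broken down by kind: r = table, i = index, S = sequence, t = TOAST, ...
  SELECT relkind, count(*) FROM pg_class GROUP BY relkind ORDER BY count(*) DESC;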
Craig
On Sun, May 13, 2012 at 9:01 AM, Craig James <cjames@emolecules.com> wrote:
>
> In my experience (PG 8.4.x), the system can handle in the neighborhood of
> 100,000 relations pretty well. Somewhere over 1,000,000 relations, the
> system becomes unusable. It's not that it stops working -- day-to-day
> operations such as querying your tables and running your applications
> continue to work. But system operations that have to scan for table
> information seem to freeze (maybe they run out of memory, or run into an
> O(N^2) operation, and simply cease to complete).
>
> For example, pg_dump fails altogether. After 24 hours, it won't even have
> started writing to its output file. Auto-completion of table and column
> names in psql freezes the system. It takes minutes to drop one table. Stuff
> like that. You'll have a system that works, but can't be backed up, dumped,
> repaired or managed.
>
> As I said, this was 8.4.x. Things may have changed in 9.x.

I think some of those things might have improved, but enough of them
have not improved, or not by enough. So I agree with your assessment:
under 9.2, having millions of sequences might technically work, but
would render the database virtually unmanageable.

Cheers,

Jeff
2012/5/13 Robert Klemme <shortcutter@googlemail.com>
> On Sun, May 13, 2012 at 10:12 AM, Віталій Тимчишин <tivv00@gmail.com> wrote:
>> 2012/5/11 Robert Klemme <shortcutter@googlemail.com>
>>> On the contrary: what would be the /advantage/ of being able to create
>>> millions of sequences? What's the use case?
>>
>> We are using sequences as statistics counters - they produce almost no
>> performance impact and we can tolerate their non-transactional nature. I
>> can imagine someone wanting to have a sequence per user or per row of
>> some other relation.
>
> I can almost see the point. But my natural choice in that case would
> be a table with two columns. Would that actually be so much less
> efficient? Of course you'd have fully transactional behavior and thus
> locking.
We've had concurrency problems with the table solution (a counter that is updated by many concurrent queries), so we traded transactionality for speed. We are actually using this data to draw pretty graphs in Nagios, so it's quite OK. But we have only ~10 sequences, not millions :)
Best regards,
Vitalii Tymchyshyn
On Tuesday, May 15, 2012 08:29:11 AM Віталій Тимчишин wrote:
> 2012/5/13 Robert Klemme <shortcutter@googlemail.com>
> > On Sun, May 13, 2012 at 10:12 AM, Віталій Тимчишин <tivv00@gmail.com> wrote:
> > > 2012/5/11 Robert Klemme <shortcutter@googlemail.com>
> > >> On the contrary: what would be the /advantage/ of being able to create
> > >> millions of sequences? What's the use case?
> > >
> > > We are using sequences as statistics counters - they produce almost no
> > > performance impact and we can tolerate their non-transactional nature. I
> > > can imagine someone wanting to have a sequence per user or per row of
> > > some other relation.
> >
> > I can almost see the point. But my natural choice in that case would
> > be a table with two columns. Would that actually be so much less
> > efficient? Of course you'd have fully transactional behavior and thus
> > locking.
>
> We've had concurrency problems with the table solution (a counter that is
> updated by many concurrent queries), so we traded transactionality for
> speed. We are actually using this data to draw pretty graphs in Nagios, so
> it's quite OK. But we have only ~10 sequences, not millions :)

I would rather suggest going with a summing table if you need to do something
like that:

sequence_id | value
1 | 3434334
1 | 1
1 | -1
1 | 1
1 | 1
...

You then can get the current value with SELECT SUM(value) WHERE sequence_id =
1. For garbage collection you can delete those values and insert the newly
summed-up value again.

That solution won't ever block if done right.

Andres
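A sketch of that summing-table approach in SQL (the DDL and the writable-CTE garbage collection, which needs PostgreSQL 9.1 or later, are illustrative assumptions rather than anything specified above):

  CREATE TABLE counter_deltas (
      sequence_id int    NOT NULL,
      value       bigint NOT NULL
  );
  CREATE INDEX ON counter_deltas (sequence_id);

  -- increment: plain INSERTs never block one another
  INSERT INTO counter_deltas (sequence_id, value) VALUES (1, 1);

  -- current value
  SELECT SUM(value) FROM counter_deltas WHERE sequence_id = 1;

  -- garbage collection: collapse the accumulated rows into a single row
  WITH gone AS (
      DELETE FROM counter_deltas WHERE sequence_id = 1 RETURNING value
  )
  INSERT INTO counter_deltas (sequence_id, value)
      SELECT 1, COALESCE(SUM(value), 0) FROM gone;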
Hi,

On Tue, May 15, 2012 at 12:57 PM, Andres Freund <andres@anarazel.de> wrote:
> I would rather suggest going with a summing table if you need to do something
> like that:
>
> sequence_id | value
> 1 | 3434334
> 1 | 1
> 1 | -1
> 1 | 1
> 1 | 1
> ...
>
> You then can get the current value with SELECT SUM(value) WHERE sequence_id =
> 1. For garbage collection you can delete those values and insert the newly
> summed-up value again.
>
> That solution won't ever block if done right.

I was going to suggest another variant which would not need GC but
would also increase concurrency:

sequence_id | hash | value
1 | 0 | 3
1 | 1 | 9
1 | 2 | 0
1 | 3 | 2
...

with PK = (sequence_id, hash) and hash in a fixed range (say 0..15).
The value would be obtained the same way, i.e. via

SELECT SUM(value) FROM T WHERE sequence_id = 1

The hash value would have to be calculated

- at session start time (cheap, but might reduce concurrency due to the
  small number of changes), or
- at TX start time (more expensive, but probably better concurrency due
  to the higher change rate).

Kind regards

robert

--
remember.guy do |as, often| as.you_can - without end
http://blog.rubybestpractices.com/
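A rough sketch of that hashed-slot variant (the DDL and the pg_backend_pid()-based slot choice are illustrative assumptions, not something spelled out in the post):

  CREATE TABLE counter_slots (
      sequence_id int    NOT NULL,
      hash        int    NOT NULL CHECK (hash BETWEEN 0 AND 15),
      value       bigint NOT NULL DEFAULT 0,
      PRIMARY KEY (sequence_id, hash)
  );

  -- pre-create the 16 slots for counter 1, so no garbage collection is needed
  INSERT INTO counter_slots (sequence_id, hash)
      SELECT 1, g FROM generate_series(0, 15) AS g;

  -- increment: each backend lands on "its" slot, so concurrent writers
  -- rarely contend for the same row
  UPDATE counter_slots
     SET value = value + 1
   WHERE sequence_id = 1
     AND hash = pg_backend_pid() % 16;

  -- read the current value
  SELECT SUM(value) FROM counter_slots WHERE sequence_id = 1;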
On Sun, May 13, 2012 at 10:01 AM, Craig James <cjames@emolecules.com> wrote:
> On Sun, May 13, 2012 at 1:12 AM, Віталій Тимчишин <tivv00@gmail.com> wrote:
>> Sequences AFAIK are accounted as relations. A large list of relations may
>> slow down various system utilities such as vacuuming (or may not, depending
>> on the queries and the indexes on pg_class).
>
> Not "may slow down." Change that to "will slow down and possibly corrupt"
> your system.
>
> In my experience (PG 8.4.x), the system can handle in the neighborhood of
> 100,000 relations pretty well. Somewhere over 1,000,000 relations, the
> system becomes unusable. It's not that it stops working -- day-to-day
> operations such as querying your tables and running your applications
> continue to work. But system operations that have to scan for table
> information seem to freeze (maybe they run out of memory, or run into an
> O(N^2) operation, and simply cease to complete).
Glad I found this thread.
Is this 1M relation mark for the whole database cluster or just for a single database within the cluster?
Thanks,
-Greg
On Fri, May 25, 2012 at 4:58 AM, Greg Spiegelberg <gspiegelberg@gmail.com> wrote:
> Glad I found this thread.
>
> Is this 1M relation mark for the whole database cluster or just for a
> single database within the cluster?

I don't know. When I discovered this, our system only had a few dozen databases, and I never conducted any experiments. We had to write our own version of pg_dump to get the data out of the damaged system, and then reload from scratch. And it's not a "hard" number. Even at a million relations things work ... they just bog down dramatically. By the time I got to 5 million relations (a rogue script was creating 50,000 tables per day and not cleaning up), the system was effectively unusable.
Craig