Re: typical active table count? - Mailing list pgsql-general
From | Jeremy Schneider
Subject | Re: typical active table count?
Date |
Msg-id | 03a01fb1-e0be-24f2-eadb-8e1a4b4981bf@ardentperf.com
In response to | Re: typical active table count? (Ron <ronljohnsonjr@gmail.com>)
List | pgsql-general
On 6/27/23 12:08 PM, Ron wrote:
> On 6/27/23 13:47, Jeremy Schneider wrote:
>> On 6/27/23 9:32 AM, Ben Chobot wrote:
>>> We certainly have databases where far more than 100 tables are
>>> updated within a 10 second period. Is there a specific concern you
>>> have?
>>>
>> Thanks Ben, not a concern but I'm trying to better understand how
>> common this might be. And I think sharing general statistics about
>> how people use PostgreSQL is a great help to the developers who build
>> and maintain it.
>>
>> One really nice thing about PostgreSQL is that with two quick copies
>> of pg_stat_all_tables you can easily see this sort of info.
>>
>> If you have a database where more than 100 tables are updated within
>> a 10 second period - this seems really uncommon to me - I'm very
>> curious about the workload.
>
> 100 table updates just means /possibly complicated schema/, not
> necessarily high volume.
>
> ...
>
> And honestly, 100 tables in 10 seconds is 10 tables/second. If each
> gets one insert, that's a laughably slow transaction rate. (Unless of
> course there's 85 indices per table, and foreign keys don't have
> supporting indices.)

I don't think the math actually works this way on highly concurrent
systems. In fact, this morning I connected with a coworker who works on
Amazon fulfillment center DBs, and there was almost no difference in the
number of tables with insert/update/delete activity whether you looked
at a 10 second window or a 2 second window.

I was also able to chat with another coworker at Amazon who got numbers
from a couple of their PG databases, connected with one person on Slack
at a different company who passed along numbers, and got a few emails
from Oracle folks. The numbers reported back to me ranged from 29 to
over a hundred.

Obviously there are also lots of small databases behind WordPress
websites with much less activity, but I found this to be an interesting
measure of some respectably busy systems.

The original context was a conversation related to logical replication
of DB changes. But then I got interested in the general question and
topic - and someone on the Oracle side mentioned system tables, which is
a really good point that hadn't occurred to me yet. The original
conversation was concerned with user tables and not system ones, but
there would be a fair amount of ongoing system table activity too.

Besides partitioning, another interesting dimension of the conversation
has been thinking about different categories of workloads: for example,
SaaS or multitenant applications with many copies of a similar schema,
ISVs, ERPs, or large enterprise databases with lots of development
history. All of these categories can easily ramp up the counts.

I'm still interested in more data - if anyone reading this can grab a
couple of snapshots of pg_stat_all_tables and report back numbers for a
10 second window and a 2 second window, that would be amazing!

-Jeremy

--
http://about.me/jeremy_schneider
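
P.S. For anyone willing to grab those snapshots, here's a rough sketch
of the kind of snapshot-and-diff I have in mind (untested as written;
the temp table name "snap" is just a placeholder, and you could query
pg_stat_user_tables instead if you only care about user tables):

  -- first snapshot of the cumulative per-table row-change counters
  create temp table snap as
    select relid, n_tup_ins, n_tup_upd, n_tup_del
    from pg_stat_all_tables;

  -- wait out the measurement window (use 2 for the shorter window)
  select pg_sleep(10);

  -- count tables that saw any insert/update/delete during the window
  -- (tables created during the window won't match the first snapshot)
  select count(*)
    from pg_stat_all_tables cur
    join snap s using (relid)
   where cur.n_tup_ins <> s.n_tup_ins
      or cur.n_tup_upd <> s.n_tup_upd
      or cur.n_tup_del <> s.n_tup_del;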