Re: Millions of tables - Mailing list pgsql-performance

From: Greg Spiegelberg
Subject: Re: Millions of tables
Msg-id: CAEtnbpU5cLV7eKXdTUZdS5zRhN+FErEp9WyMPLEqKoCNAhK-Mg@mail.gmail.com
In response to: Re: Millions of tables ("Mike Sofen" <msofen@runbox.com>)
List: pgsql-performance
On Mon, Sep 26, 2016 at 7:05 AM, Mike Sofen <msofen@runbox.com> wrote:

From: Rick Otten   Sent: Monday, September 26, 2016 3:24 AM
Are the tables constantly being written to, or is this a mostly read scenario?

With regard to consistent query performance, I think you need to get out of AWS.  That environment is terrible if you are going for consistency unless you buy dedicated hardware, and then you are paying so much money it is ridiculous.

Also, I think having 10M rows in a table is not a problem for the query times you are referring to.  So instead of millions of tables, unless I'm doing my math wrong, you probably only need thousands of tables.

----------

Excellent thoughts:  the read/write behavior will/should drive a lot of the design;  AWS does not guarantee consistency or latency;  and 10M rows is nothing to PG.

Re AWS:  we’re on it, at least for now.  In my profiling of our performance there, I consistently get low latencies; I know there will be random higher latencies, but the statistical average will be low.  I just ran a quick test against a modest-sized table on a modest-sized EC2 instance (m4.xlarge: 4 cores / 16 GB RAM, 3 TB SSD).  The table has 15M rows but is huge (it represents nearly 500M rows compressed into jsonb documents), with 5 indexed key columns and 12 columns in total.  I queried for a single, non-PK, indexed value using “select *” (so it included the json) and it took 22 ms; without the json it took 11 ms.  Especially with the db/memory-optimized EC2 instances now available (with guaranteed IOPS), performance against even 100M-row tables should still stay within your requirements.
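
A minimal sketch of that comparison, with hypothetical table and column names standing in for the real schema:

-- Hypothetical shape: several indexed keys plus one wide jsonb column.
CREATE TABLE docs (
    id        bigserial PRIMARY KEY,
    doc_key   bigint      NOT NULL,   -- the non-PK indexed value queried below
    recorded  timestamptz NOT NULL,
    payload   jsonb       NOT NULL    -- the large compressed document data
);
CREATE INDEX docs_doc_key_idx ON docs (doc_key);

-- Including the jsonb makes PostgreSQL detoast the wide column (the ~22 ms case):
EXPLAIN (ANALYZE, BUFFERS)
SELECT * FROM docs WHERE doc_key = 12345;

-- Leaving it out of the select list avoids that cost (the ~11 ms case):
EXPLAIN (ANALYZE, BUFFERS)
SELECT id, doc_key, recorded FROM docs WHERE doc_key = 12345;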

So Rick’s point about not needing millions of tables is right on.  If there’s a way to create table “clumps”, at least you’ll have a more modest table count.
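
One hypothetical shape for such a clump scheme (the names and the bucket count below are made up for illustration): hash each logical group into a fixed set of physical tables and keep the group id as the leading index column.

-- Sketch only: 4096 physical tables instead of millions of logical ones.
CREATE TABLE data_0 (
    group_id  bigint      NOT NULL,   -- which original "table" a row belongs to
    ts        timestamptz NOT NULL,
    payload   jsonb
);
CREATE INDEX data_0_group_ts_idx ON data_0 (group_id, ts);
-- ... likewise data_1 through data_4095 ...

-- Route a group to its clump; hashtext() is PostgreSQL's built-in text hash,
-- and the bitmask keeps the bucket in 0..4095:
SELECT 'data_' || (hashtext('group-123456') & 4095) AS target_table;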


Absolutely!  The 8M tables do "belong" to larger groups, and reducing the 8M tables to ~4000 is an option; however, the problem then becomes that instead of an anticipated 140K records per table, each table would hold 140M to 500M records.  I'm concerned read access times will go out the window.  It is on the docket to test.
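
A rough sketch of that test (all names and constants below are placeholders): load one consolidated table to the anticipated size, then time the per-group reads.

-- ~280M rows = ~2000 original groups x ~140K records each (slow to load):
CREATE TABLE clump_test (
    group_id  bigint      NOT NULL,
    ts        timestamptz NOT NULL DEFAULT now(),
    payload   jsonb
);

INSERT INTO clump_test (group_id, payload)
SELECT g % 2000, '{"k":"v"}'::jsonb
FROM generate_series(1, 280000000) AS g;

CREATE INDEX clump_test_idx ON clump_test (group_id, ts);
ANALYZE clump_test;

-- A per-group read should stay a narrow b-tree index scan even at this
-- scale; the test is whether that holds in practice:
EXPLAIN (ANALYZE, BUFFERS)
SELECT * FROM clump_test WHERE group_id = 42 ORDER BY ts DESC LIMIT 100;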

-Greg
