Re: Millions of tables - Mailing list pgsql-performance

From	julyanto SUTANDANG
Subject	Re: Millions of tables
Date	September 26, 2016 06:28:17
Msg-id	CAGu3fETDSVOMmuv4SiK-hMzoSd22Q_wNbj5B4iF+tF5KJ6RViQ@mail.gmail.com
In response to	Re: Millions of tables (Greg Spiegelberg <gspiegelberg@gmail.com>)
List	pgsql-performance

-sorry for my last email, which was also not bottom-posted-

Hi Greg,

On Mon, Sep 26, 2016 at 11:19 AM, Greg Spiegelberg <gspiegelberg@gmail.com> wrote:

I did look at PostgresXL and CitusDB. Both are admirable however neither could support the need to read a random record consistently under 30ms. It's a similar problem Cassandra and others have: network latency. At this scale, to provide the ability to access any given record amongst trillions it is imperative to know precisely where it is stored (system & database) and read a relatively small index. I have other requirements that prohibit use of any technology that is eventually consistent.

You can then get below 30ms, but how many concurrent processes will you need?

This is something you should consider: a single machine has fewer than 50 hardware threads (HT) on Intel, or 192 on Power8, which is still far below the number of tables (8 million).
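To put rough numbers on that gap, here is a small Python sketch. The thread counts are just the approximate per-machine figures mentioned above, treated as assumptions rather than measured values, and the function name is made up for illustration.

```python
TABLES = 8_000_000  # table count discussed in this thread

# Approximate hardware-thread counts per machine (assumptions, not measurements).
HW_THREADS = {"intel": 50, "power8": 192}


def tables_per_thread(tables: int, threads: int) -> float:
    """Return how many tables each hardware thread would have to cover."""
    return tables / threads


if __name__ == "__main__":
    for name, threads in HW_THREADS.items():
        ratio = tables_per_thread(TABLES, threads)
        print(f"{name}: {ratio:,.0f} tables per hardware thread")
```

Either way, each hardware thread would have to cover tens of thousands of tables or more, so only a tiny fraction of the tables can be actively served at any one moment.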

If you use indexes correctly, you would not need a sequential scan, since the scan runs in memory (the index is loaded into memory).
Do you plan to query through the master table of the partition? That is actually quite slow, considering the millions of rules to check for every query.
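As a rough illustration of the alternative, addressing the child table directly instead of going through the master table, here is a minimal Python sketch. The table naming scheme (`data_0000000`), the CRC32-modulo mapping, the `id` column, and the function names are all hypothetical and not taken from Greg's design. The only point is that the application computes the table name itself, so the planner never has to test millions of partition constraints.

```python
import zlib

NUM_TABLES = 8_000_000  # table count discussed in this thread


def table_for_key(key: str, num_tables: int = NUM_TABLES) -> str:
    """Map a record key to a child table name (hypothetical naming scheme)."""
    # crc32 is stable across processes and machines, unlike Python's built-in hash().
    bucket = zlib.crc32(key.encode("utf-8")) % num_tables
    return f"data_{bucket:07d}"


def build_lookup(key: str) -> tuple:
    """Return (sql, params) for a lookup aimed at exactly one child table."""
    # The table name is generated by our own code, never taken from user input;
    # the key itself is passed as a bind parameter.
    sql = f"SELECT * FROM {table_for_key(key)} WHERE id = %s"
    return sql, (key,)


if __name__ == "__main__":
    sql, params = build_lookup("customer-42")
    print(sql, params)
```

The returned pair could be handed to any driver that uses `%s` placeholders. Whether a hash, a range, or a lookup table is the right mapping depends on the data, which this sketch does not try to decide.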

With 8 million tables of data, you would certainly need very large storage, and it would not fit mounted on a single machine unless you plan to use IBM z machines.

I liken the problem to fishing. To find a particular fish of length, size, color &c in a data lake you must accept the possibility of scanning the entire lake. However, if all fish were in barrels where each barrel had a particular kind of fish of specific length, size, color &c then the problem is far simpler.
