Re: Millions of tables - Mailing list pgsql-performance

From julyanto SUTANDANG
Subject Re: Millions of tables
Date
Msg-id CAGu3fETDSVOMmuv4SiK-hMzoSd22Q_wNbj5B4iF+tF5KJ6RViQ@mail.gmail.com
In response to Re: Millions of tables  (Greg Spiegelberg <gspiegelberg@gmail.com>)
List pgsql-performance
(Sorry for my last email, which was also not bottom-posted.)

Hi Greg, 
On Mon, Sep 26, 2016 at 11:19 AM, Greg Spiegelberg <gspiegelberg@gmail.com> wrote:
I did look at PostgresXL and CitusDB.  Both are admirable however neither could support the need to read a random record consistently under 30ms.  It's a similar problem Cassandra and others have: network latency.  At this scale, to provide the ability to access any given record amongst trillions it is imperative to know precisely where it is stored (system & database) and read a relatively small index.  I have other requirements that prohibit use of any technology that is eventually consistent.
Then you can get below 30ms, but how many processes will you need to run concurrently?
This is something you should consider: a single machine has fewer than 50 hardware threads (HT) on Intel, or 192 HT on Power8, and either figure is still far below the number of tables (8 million).
If you use indexes correctly, you will not need sequential scans, since the lookup runs in memory (the index is loaded into memory).
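
To make the index point concrete, here is a minimal sketch. It assumes SQLite (via Python's standard sqlite3 module) as a stand-in for PostgreSQL, and the table and index names are made up for illustration. It only shows the planner switching from a full scan to an index search once an index exists; it says nothing about PostgreSQL's actual plans or about what is cached in memory.

```python
import sqlite3

def plan_for(conn, sql, params=()):
    """Return the 'detail' text of each EXPLAIN QUERY PLAN row, joined into one string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " | ".join(row[-1] for row in rows)  # 'detail' is the last column

# Hypothetical table; SQLite is used here only because it ships with Python.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE readings (device_id INTEGER, payload TEXT)")
conn.executemany(
    "INSERT INTO readings VALUES (?, ?)",
    ((i, f"p{i}") for i in range(10_000)),
)

query = "SELECT payload FROM readings WHERE device_id = ?"

# Without an index the only option is to scan the whole table.
plan_without_index = plan_for(conn, query, (4242,))

# With an index on the lookup column the planner can search instead of scan.
conn.execute("CREATE INDEX readings_device_idx ON readings (device_id)")
plan_with_index = plan_for(conn, query, (4242,))

print(plan_without_index)
print(plan_with_index)
```

The exact wording of the plan text differs between SQLite versions, so treat the printed lines as indicative only.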
Do you plan to query through the master table of the partition? That is actually quite slow, considering the millions of rules that have to be checked for every query.
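
A rough way to picture that cost is the toy model below. It is not how PostgreSQL is implemented; it only assumes that going through the master table means examining one range rule per child table, while an application that already knows the layout can compute the child table name directly. The table naming scheme and the fixed rows-per-table layout are hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Child:
    name: str
    lo: int  # inclusive lower bound of the key range
    hi: int  # exclusive upper bound of the key range

def make_children(n_tables, rows_per_table):
    """Build hypothetical child tables, each covering a contiguous key range."""
    return [
        Child(f"data_{i:07d}", i * rows_per_table, (i + 1) * rows_per_table)
        for i in range(n_tables)
    ]

def find_via_master(children, key):
    """Examine every child's range rule; return (matching table, rules checked)."""
    match = None
    checks = 0
    for child in children:
        checks += 1  # one rule examined per child, whether it matches or not
        if child.lo <= key < child.hi:
            match = child.name
    if match is None:
        raise KeyError(key)
    return match, checks

def find_direct(key, rows_per_table):
    """Compute the child table name from the key, with no rules to examine."""
    return f"data_{key // rows_per_table:07d}"

children = make_children(1000, 100)
print(find_via_master(children, 12345))
print(find_direct(12345, 100))
```

In this model the work through the master table grows with the number of child tables, while the direct lookup stays constant, which is why I ask how you plan to query.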

With 8 million tables of data you will certainly need very large storage, and it will not fit on a single machine unless you are planning to use IBM z machines.


I liken the problem to fishing.  To find a particular fish of length, size, color &c in a data lake you must accept the possibility of scanning the entire lake.  However, if all fish were in barrels where each barrel had a particular kind of fish of specific length, size, color &c then the problem is far simpler.
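
The lake-versus-barrels picture can be sketched in a few lines. This is only an illustration of the analogy, with made-up fish records and attribute names; it is not a model of any particular database layout.

```python
from collections import defaultdict

# Hypothetical fish records: (fish_id, kind, length_cm, color)
lake = [
    (1, "trout", 30, "brown"),
    (2, "perch", 20, "green"),
    (3, "trout", 30, "brown"),
    (4, "pike", 80, "green"),
]

def scan_lake(lake, kind, length_cm, color):
    """Examine every fish in the lake; return (matching ids, fish examined)."""
    matches = [fish[0] for fish in lake if fish[1:] == (kind, length_cm, color)]
    return matches, len(lake)

def fill_barrels(lake):
    """Sort the fish into barrels keyed by (kind, length_cm, color)."""
    barrels = defaultdict(list)
    for fish_id, kind, length_cm, color in lake:
        barrels[(kind, length_cm, color)].append(fish_id)
    return barrels

def look_in_barrel(barrels, kind, length_cm, color):
    """Open only the one barrel that can hold the fish; return (ids, fish examined)."""
    matches = barrels.get((kind, length_cm, color), [])
    return list(matches), len(matches)

barrels = fill_barrels(lake)
print(scan_lake(lake, "trout", 30, "brown"))
print(look_in_barrel(barrels, "trout", 30, "brown"))
```

The scan touches every fish in the lake, while the barrel lookup touches only the fish in the one barrel that matches.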
