Re: Millions of tables - Mailing list pgsql-performance

From Simon Riggs
Subject Re: Millions of tables
Date
Msg-id CANP8+jKE0A1s6UK3F+EmPEmv9b_zWJ6T-qsMGvUJyEfLCZ7HTg@mail.gmail.com
In response to Re: Millions of tables  (Greg Spiegelberg <gspiegelberg@gmail.com>)
List pgsql-performance
On 26 September 2016 at 05:19, Greg Spiegelberg <gspiegelberg@gmail.com> wrote:
> I did look at PostgresXL and CitusDB.  Both are admirable however neither
> could support the need to read a random record consistently under 30ms.
> It's a similar problem Cassandra and others have: network latency.  At this
> scale, to provide the ability to access any given record amongst trillions
> it is imperative to know precisely where it is stored (system & database)
> and read a relatively small index.  I have other requirements that prohibit
> use of any technology that is eventually consistent.

Then XL is exactly what you need, since it does allow you to calculate
exactly where the record is via hash and then access it, which makes
the request just a single datanode task.

XL is not the same as CitusDB.
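A minimal sketch of what that looks like in Postgres-XL DDL (the table and column names here are illustrative, not from the thread):

```sql
-- Rows are spread across datanodes by a hash of the key column.
-- A point lookup on "id" hashes the value, identifies the single
-- owning datanode, and executes there alone -- no cluster-wide scan.
CREATE TABLE records (
    id      bigint PRIMARY KEY,
    payload text
) DISTRIBUTE BY HASH (id);

-- Routed to exactly one datanode by the coordinator:
SELECT payload FROM records WHERE id = 12345;
```

Because the coordinator can compute the owning node from the key alone, the per-request cost is one small index read on one node, which is what keeps point lookups inside a tight latency budget.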

> I liken the problem to fishing.  To find a particular fish of length, size,
> color &c in a data lake you must accept the possibility of scanning the
> entire lake.  However, if all fish were in barrels where each barrel had a
> particular kind of fish of specific length, size, color &c then the problem
> is far simpler.

The task of putting the fish in the appropriate barrel is quite hard.

--
Simon Riggs                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

