RE: [EXTERNAL] Re: performance expectations for table(s) with 2B recs - Mailing list pgsql-general

From Godfrin, Philippe E
Subject RE: [EXTERNAL] Re: performance expectations for table(s) with 2B recs
Date
Msg-id SA0PR15MB393324C6A5522A80913DC42582719@SA0PR15MB3933.namprd15.prod.outlook.com
In response to Re: performance expectations for table(s) with 2B recs  ("Peter J. Holzer" <hjp-pgsql@hjp.at>)
Responses Re: [EXTERNAL] Re: performance expectations for table(s) with 2B recs  ("Peter J. Holzer" <hjp-pgsql@hjp.at>)
List pgsql-general

 

On 2021-12-08 14:44:47 -0500, David Gauthier wrote:
> So far, the tables I have in my DB have relatively low numbers of records (most
> are < 10K, all are < 10M).  Things have been running great in terms of
> performance.  But a project is being brainstormed which may require some tables
> to contain a couple billion records.
[...]
> What else should I be worried about ?
>
> I suspect that part of why things are running really well so far is that the
> relatively small amounts of data in these tables ends up in the DB cache and
> disk I/O is kept at a minimum.  Will that no longer be the case once queries
> start running on these big tables ?

>

>Depends a lot on how good the locality of your queries is. If most read
>only the same parts of the same indexes, those will still be in the
>cache. If they are all over the place or if you have queries which need
>to read large parts of your tables, cache misses will make your
>performance a lot less predictable, yes. That stuff is also hard to
>test, because when you are testing a query twice in a row, the second
>time it will likely hit the cache and be quite fast.
>
>But in my experience the biggest problem with large tables are unstable
>execution plans - for most of the parameters the optimizer will choose
>to use an index, but for some it will erroneously think that a full
>table scan is faster. That can lead to a situation where a query
>normally takes less than a second, but sometimes (seemingly at random)
>it takes several minutes - users will understandably be upset about such
>behaviour. It is in any case a good idea to monitor execution times to
>find such problems (ideally before users complain), but each needs to be
>treated on an individual basis, and sometimes there seems to be no good
>solution.

 

To the OP, that is a tall order to answer – basically that’s why DBAs still have jobs…
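
As a side note on the caching point above: EXPLAIN (ANALYZE, BUFFERS) makes the effect Peter describes visible. Run the same query twice and compare "shared read" (blocks fetched from storage) against "shared hit" (blocks already in shared_buffers). A rough sketch against a made-up table, names purely illustrative:

    -- hypothetical large table, for illustration only
    CREATE TABLE events (
        id         bigserial PRIMARY KEY,
        device_id  integer NOT NULL,
        created_at timestamptz NOT NULL,
        payload    text
    );
    CREATE INDEX events_device_id_idx ON events (device_id);

    -- first run: the plan shows "shared read=..." for blocks pulled in from disk
    EXPLAIN (ANALYZE, BUFFERS)
    SELECT count(*) FROM events WHERE device_id = 42;

    -- second run: the same blocks show up as "shared hit=..." and the timing
    -- improves, which is why back-to-back test runs can be misleading
    EXPLAIN (ANALYZE, BUFFERS)
    SELECT count(*) FROM events WHERE device_id = 42;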
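
And the plan flips Peter mentions usually appear where the planner's row estimate crosses the point at which a sequential scan looks cheaper than an index scan; a very common value (or stale statistics) is enough. Sketched against the same illustrative table, plus one way to spot slow outliers if the pg_stat_statements extension is installed (column names as of PostgreSQL 13):

    -- a selective value: few estimated rows, the planner picks the index
    EXPLAIN SELECT * FROM events WHERE device_id = 42;
    --   Index Scan using events_device_id_idx on events ...

    -- a very common value: the planner expects a large fraction of the table
    -- to match and flips to a full scan
    EXPLAIN SELECT * FROM events WHERE device_id = 1;
    --   Seq Scan on events  Filter: (device_id = 1)

    -- keeping statistics current reduces (but does not eliminate) such flips
    ANALYZE events;

    -- one way to monitor for the slow outliers in production
    SELECT query, calls, mean_exec_time, max_exec_time
      FROM pg_stat_statements
     ORDER BY max_exec_time DESC
     LIMIT 10;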

 

For Peter I have a question: what exactly causes ‘unstable execution plans’?

 

Besides not using bind variables and bad statistics, would you elaborate on what would contribute to that instability?
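
(On the bind variable side, for what it's worth, PostgreSQL's own prepared statements can be one such source: after a handful of executions the planner may switch from per-value custom plans to a generic plan, and EXPLAIN EXECUTE shows which one you got. A rough illustration, assuming PostgreSQL 12 or later for the plan_cache_mode setting and the same made-up table as above:)

    -- parameterised statement, analogous to using bind variables
    PREPARE by_device(int) AS
        SELECT * FROM events WHERE device_id = $1;

    EXPLAIN EXECUTE by_device(42);   -- custom plan for this specific value

    -- after several executions the planner may settle on a generic plan that
    -- ignores the actual parameter value; forcing custom plans avoids that
    SET plan_cache_mode = force_custom_plan;   -- PostgreSQL 12+
    EXPLAIN EXECUTE by_device(42);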



