Home > mailing lists

Re: Many DataFileRead - IO waits - Mailing list pgsql-performance

From	Ben Snaidero
Subject	Re: Many DataFileRead - IO waits
Date	February 28, 2020 17:01:50
Msg-id	CAEPE5bOtH=8dwFyu8MSOiksK56jAH7j_-i6c-_JatFx=6UofYg@mail.gmail.com Whole thread Raw
In response to	Re: Many DataFileRead - IO waits (Michael Lewis <mlewis@entrata.com>)
Responses	Re: Many DataFileRead - IO waits (Michael Lewis <mlewis@entrata.com>)
List	pgsql-performance

Tree view

On Thu, Feb 27, 2020 at 11:54 AM Michael Lewis <mlewis@entrata.com> wrote:

How big is ix_tabledata_intid_timestampdate_intotherid3_intotherid2 on disk? If you create another index with same fields, how much space does it take? Real question- are you vacuuming aggressively enough for your workload? Your index name seems to indicate that intotherid3 would be the third key, and yet the planner chose not to scan that deep and instead filtered after it found the relevant tuples based on intid and timestampdate. That seems peculiar to me.

The documentation discourages multi-column indexes because they have limited application unless the same fields are always used. Personally, I don't love reviewing the stats of indexscans or how many tuples were fetched and having to guess how deeply the index was scanned for the various queries involved.

I'd wonder if an index on only intid_timestampdate would be both much smaller and also have a more right-leaning pattern of information being added and accessed in terms of keeping frequently needing blocks in shared buffers.

As a side note, that planning time seems high to me for such a simple query. Have you increased default_statistics_target significantly perhaps?

In this case the index is quite large ~400GB but as you can see from the explain plan it's doing a backward scan and not accessing that many buffers. Other servers with this issue are much smaller. We have autovacuum set to the default setting but this table does not get any deletes so I don't think that is the problem. I think the reason it does not go deeper into the index keys is because it's just looking for the first occurence based on date (limit 1) not all of them although even if looking for all of them I think it would still scan in the same way since there would be other intotherid3 values between the ones in this search key

In regards to default_statistics_target I have not increased this value at all.

All this said regarding statistics and vacuum/bloat we restored a two day old copy of the database (on one of the servers experiencing the issue) and the issue was still present. These systems are all on cloud infrastructure so I am leaning towards it being something hardware related (especially as it's only happening on a few servers) but our cloud provider says nothing has changed in this respect.

pgsql-performance by date:

From: Michael Lewis
Date: 27 February 2020, 19:54:04
Subject: Re: Many DataFileRead - IO waits

From: Michael Lewis
Date: 28 February 2020, 19:40:46
Subject: Re: Many DataFileRead - IO waits

Re: Many DataFileRead - IO waits - Mailing list pgsql-performance

Previous

Next