Thread: Too many blocks read
Hello everybody,

I'm doing some tests with a large table of about 7 million tuples. I need to retrieve only the last value for some key. That key has about 20,000 tuples in this table.

SELECT field1
FROM table_7milions
WHERE field1 = 'my_key'
ORDER BY field1 DESC
LIMIT 1

The statistics tables show that Postgres read about 782,656 blocks from disk for the table plus another 24,704 blocks from disk for the index. A simple SELECT is reading about 98 MB from disk and putting it into shared memory.

So I did a test like this:

-- I created a partial table for that key
SELECT * INTO part_table
FROM table_7milions
WHERE field1 = 'my_key'

-- Now I run the same select over the same 20,000 tuples, but on part_table
SELECT field1
FROM part_table
WHERE field1 = 'my_key'
ORDER BY field1 DESC
LIMIT 1

Now the statistics show that Postgres read 54,016 blocks from disk, all for the table, because it has no index. The same select reads about 6 MB from disk into shared memory.

I think this happens because in the 7-million-row table the same 8 kB block holds records with different keys, so only a few records with 'my_key' are retrieved each time I read an 8 kB block. In part_table, all the records stored in an 8 kB block have 'my_key', so it is much better optimized.

My question: is there a way to defragment my 7-million-row table so that all records with the same key end up in the same 8 kB blocks? How can I do that? If there isn't, I think it would be a good feature for a future version.

Thank you,

Waldomiro Caraiani
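For reference, one way to observe the block counts the post describes is the standard statistics view pg_statio_user_tables (a sketch; the table name is taken from the post, and pg_stat_reset() clears all counters for the database, not just this table):

```sql
-- Zero the statistics counters, run the query, then read the counters back
SELECT pg_stat_reset();

SELECT field1
FROM table_7milions
WHERE field1 = 'my_key'
ORDER BY field1 DESC
LIMIT 1;

-- heap_blks_read / idx_blks_read count blocks read from disk;
-- heap_blks_hit / idx_blks_hit count blocks found in shared buffers
SELECT heap_blks_read, heap_blks_hit, idx_blks_read, idx_blks_hit
FROM pg_statio_user_tables
WHERE relname = 'table_7milions';
```

On sufficiently recent PostgreSQL versions, EXPLAIN (ANALYZE, BUFFERS) gives the same information per query without resetting global counters.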
On Wed, 18 Nov 2009, Waldomiro wrote:

> So, I need to retrieve only the last value for some key. That key has about
> 20,000 tuples in this table.
>
> SELECT field1
> FROM table_7milions
> WHERE field1 = 'my_key'
> ORDER BY field1 DESC
> LIMIT 1

What's the point of this query? You are forcing PostgreSQL to read in all the rows where field1 = 'my_key' so that they can be sorted, but the sort order will be completely unpredictable because all the values will be the same. If you just want to grab any matching row, remove the ORDER BY and it will return the first one it finds.

Matthew

--
The best way to accelerate a Microsoft product is at 9.8 metres per second per second.
        - Anonymous
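Matthew's suggestion amounts to dropping the redundant sort, for example:

```sql
-- Any matching row will do: field1 is the only selected column,
-- and every match holds the same value, so no ordering is needed
SELECT field1
FROM table_7milions
WHERE field1 = 'my_key'
LIMIT 1;
```

This lets the planner stop after the first matching row instead of fetching and sorting all 20,000 of them.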
Waldomiro wrote:
> ...
> I think this happens because in the 7-million-row table the same 8k
> block has different records with different keys, so only a few records
> with 'my_key' are retrieved when I read an 8k block.
> In part_table, all records stored in an 8k block have 'my_key', so
> it's much better optimized.
>
> My question: is there a way to defragment my 7-million-row table to put all
> records with the same key in the same 8k block?

Read about the "CLUSTER ON index-name" SQL command. It does exactly what you're asking.

Craig
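A sketch of Craig's suggestion, assuming an index on field1 exists (the index name here is hypothetical):

```sql
-- Rewrite the table physically ordered by the index on field1,
-- so rows sharing a key end up packed into the same 8 kB blocks
CLUSTER table_7milions USING table_7milions_field1_idx;

-- On older releases, the equivalent form is:
-- CLUSTER table_7milions_field1_idx ON table_7milions;

-- Refresh planner statistics after the rewrite
ANALYZE table_7milions;
```

Note that CLUSTER takes an exclusive lock on the table while it rewrites it, and the ordering is not maintained for rows inserted afterwards, so it needs to be rerun periodically on tables with heavy write traffic.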
In response to Waldomiro:

> I'm thinking it happens because in the 7-million-row table, the same 8k
> block has different records with different keys, so only a few records
> with 'my_key' are retrieved when I read an 8k block.
> In part_table, all records stored in an 8k block have 'my_key', so
> it's much better optimized.
>
> My question: is there a way to defragment my 7-million-row table to put all
> records with the same key in the same 8k block?
>
> How can I do that?

CLUSTER your table:
http://www.postgresql.org/docs/current/static/sql-cluster.html

Andreas
--
Andreas Kretschmer
Kontakt:  Heynitz: 035242/47150,   D1: 0160/7141639 (mehr: -> Header)
GnuPG: 0x31720C99, 1006 CCB4 A326 1D42 6431  2EB0 389D 1DC2 3172 0C99