Re: Question about disk IO an index use and seeking advice - Mailing list pgsql-performance

From PFC
Subject Re: Question about disk IO an index use and seeking advice
Date
Msg-id op.t94smqkmcigqcu@apollo13.peufeu.com
Whole thread Raw
In response to Re: Question about disk IO an index use and seeking advice  ("Nikolas Everett" <nik9000@gmail.com>)
List pgsql-performance
> An index scan looks through the index and pulls in each pages as it sees
> it.
> A bitmap index scan looks through the index and makes a sorted list of
> all
> the pages it needs and then the bitmap heap scan reads all the pages.
> If your data is scattered then you may as well do the index scan, but if
> your data is sequential-ish then you should do the bitmap index scan.
>
> Is that right?  Where can I learn more?  I've read

    That's about it, yes.
    If your bitmap has large holes, it will seek, but if it has little holes,
readahead will work. Hence, fast, and good.
    On indexscan, readahead doesn't help since the hits are pretty random. If
you have N rows in the index with the same date, in which order whill they
get scanned ? There is no way to know that, and no way to be sure this
order corresponds to physical order on disk.

> About clustering:  I know that CLUSTER takes an exclusive lock on the
> table.  At present, users can query the table at any time, so I'm not
> allowed to take an exclusive lock for more than a few seconds.

    Then, CLUSTER is out.

> Could I
> achieve the same thing by creating a second copy of the table and then
> swapping the first copy out for the second?  I think something like that
> would fit in my time frames

    If the archive table is read-only, then yes, you can do this.
.
> About partitioning:  I can definitely see how having the data in more
> manageable chunks would allow me to do things like clustering.  It will
> definitely make vacuuming easier.
>
> About IO speeds:  The db is always under some kind of load.  I actually
> get
> scared if the load average isn't at least 2.  Could I try to run
> something
> like bonnie++ to get some real load numbers?  I'm sure that would cripple
> the system while it is running, but if it only takes a few seconds that
> would be ok.
>
> There were updates running while I was running the test.  The WAL log is
> on
> the hardware raid 10.  Moving it from the software raid 5 almost doubled
> our
> insert performance.

    Normal ; fsync on a RAID5-6 is bad, bad.
    You have battery backed up cache ?

> Thanks again,
>
> --Nik



pgsql-performance by date:

Previous
From: "Nikolas Everett"
Date:
Subject: Re: Question about disk IO an index use and seeking advice
Next
From: Dave Cramer
Date:
Subject: Planner won't use composite index if there is an order by ????