Hello all,
(7.2.1 on Debian Woody, so it's using backported 7.2.3 patches.)
I've found several threads on this in the past, so I've done a bit of
homework. (Questions at end.)
Most of you who follow this know I put out a request for commercial help a
few days ago. I'm swamped with doing workarounds so I haven't been able to
follow up yet (some of my engineers are on vacation). However, one
immediate suggestion was for me to put the indices on another disk. (The
only other disk I have right now is the transaction log disk, so it's not
necessarily viable, but I can relatively quickly add an external RAID
enclosure.)
That's easy enough to do: find the index in pg_class, get the filename, and
move the corresponding files to another disk.
However, these indexes are fairly large: almost 6 gigs in one case (my slow
VARCHAR index). This causes PostgreSQL to create multiple index files:
6402181, 6402181.1, 6402181.2, etc. It's obvious that as the index grows,
it will need more files.
So, say I symlink the existing ones to another drive (while the database is
offline, of course). Now the index expands to another file. That will
obviously be on the "wrong" disk now.
What would happen if I did something like "touch 6402181.7" on the next few
file IDs, move the empty files over, and then put in symlinks? Will that
screw anything up? What will the database server do?
Does 7.3 have any provision to say "create all index files in such-and-such
directory" and so on?
Final miscellaneous questions:
1) Does using those "conditional indexes" speed up indexing? For example,
on one table, I have two indexes, one on columns (type, col1) and another
on (type, col2) - all INTEGER columns. The index on (type, col1) is used
for some values of type, and the index on (type, col2) is used for other
values of type. Should I turn those into conditional indexes which list the
exact values of type which they should be used for? Should I have a
separate conditional index for each possible type value instead? (I'd end
up with 20-odd indices, but I don't think that's a big deal.)
2) How badly does turning on stats_row_level and stats_block_level impact
performance?
a) Which one(s) do I need to get useful data out of
pg_stat(io)_user_indexes and such, so I can see where most of my reads and
writes are going to?
b) What's the difference between *_blks_hit and *_blks_read?
c) What are toast_blks? tidx_blocks? (It refers to a TOAST table but not
sure what that is.)
d) Is there a reference for how to interpret this data? (Beyond the
current 7.2.1 manual "monitoring-stats" section)
3) It says I'm using client/server encoding SQL_ASCII. Is this the fastest
one? I know I'm supposed to use the "C" locale, but I'm not sure if that
locale is SQL_ASCII.
4) Does anyone have any positive or negative feelings on Hyperthreading?
It's enabled on my dual P4 Xeon, but I'm considering disabling it, as there
are really only 2 CPUs and PostgreSQL might be blocking on locks thinking
there are 4, or the same thing at the O/S level.
Thanks,
Doug