Thread: indexing words slow

From: Bruce Momjian
>
> Hi,
>
> I have done a little more testing, and the performance bottleneck
> seems definitely to be memory related. Note that it does not really seem
> to be dependent on buffer settings, but really on disk caches.
>
> additional info:
>     the index on this table is around 155 Megs big
>
> Now, if I do a count(*) on '^rol', after the second query, this takes
> around 1 second, and returns 2528.
>
> On the other hand, if I do a count(*) on '^ric', this consistently takes
> around 1:30 mins, no matter how often I run it. This returns 7866.
>
> A search on count(*) of '^lling' and '^tones' takes around 2.5 secs after
> running it several times.

Wow, this makes no sense to me.  How can this happen?  'rol' is fast, but
'ric', which returns three times as many rows, takes 1.5 minutes no matter
how often it is run?  And 64MB is a lot of memory.

Does profiling show that 'rol' and 'ric' take the same amount of CPU
time in the backend?  Is EXPLAIN identical for these two?  If so, there
must be something different about the I/O demands of the two queries,
but I can't really understand what that difference could be.  If I had
to take a guess, it is that 'ric' is somehow causing a cache flush for
each row it retrieves, while 'rol' fits in the cache.

Several million rows is a good sized table, but certainly not too big
for PostgreSQL.  I am sure people have larger tables.

I am kind of tempted to test it on my machine here.  I have BSD/OS with
a PP200 and 64MB RAM.  Can you send me something I can recreate here?
Perhaps put it on our ftp site?

>
> Running different queries in between affects these times.
>
> My computer has 64 Megs of RAM, and I'm running X (linux 2.0.30)
>
> So, it seems to me that with this amount of memory, my system is usable
> only for smaller tables (ie. no 550,000 rows in the main table, and no
> 4,500,000 rows in the 'index' table). If I want better performance for this
> setup, I need faster disks and more (how much more?) memory. Wish I could
> test this somehow.... Maybe I can ask my sysop at the university if I may
> test this on the dual PPro with 256 megs, but I don't think so....
>
> Maarten


--
Bruce Momjian                          |  830 Blythe Avenue
maillist@candle.pha.pa.us              |  Drexel Hill, Pennsylvania 19026
  +  If your life is a hard drive,     |  (610) 353-9879(w)
  +  Christ can be your backup.        |  (610) 853-3000(h)

Re: indexing words slow

From: Maarten Boekhold
On Wed, 11 Mar 1998, Bruce Momjian wrote:

> >
> > Hi,
> >
> > I have done a little more testing, and the performance bottleneck
> > seems definitely to be memory related. Note that it does not really seem
> > to be dependent on buffer settings, but really on disk caches.
> >
> > additional info:
> >     the index on this table is around 155 Megs big
> >
> > Now, if I do a count(*) on '^rol', after the second query, this takes
> > around 1 second, and returns 2528.
> >
> > On the other hand, if I do a count(*) on '^ric', this consistently takes
> > around 1:30 mins, no matter how often I run it. This returns 7866.
> >
> > A search on count(*) of '^lling' and '^tones' takes around 2.5 secs after
> > running it several times.

btw. to make clearer what I mean with '^lling' and '^tones': I really
mean "'^lling' *AND* '^tones'", i.e. a join :) Actually pretty good,
don't ya think? :)
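(The trigger code that produces these fragments isn't shown in this thread;
here is a minimal sketch of the kind of suffix slicing the fragments
suggest. The function name and the minimum fragment length are my own
assumptions, not the actual implementation.)

```python
# Hypothetical sketch of the word-slicing idea behind the fragment table;
# min_len=3 is a guess, not the real trigger's setting.
def slice_word(word, min_len=3):
    """Return every suffix of the word down to min_len characters."""
    word = word.lower()
    return [word[i:] for i in range(len(word) - min_len + 1)]

# 'rolling' yields 'lling' and 'stones' yields 'tones', so a row for
# "Rolling Stones" is findable via both '^lling' and '^tones'.
print(slice_word("rolling"))  # ['rolling', 'olling', 'lling', 'ling', 'ing']
print(slice_word("stones"))   # ['stones', 'tones', 'ones', 'nes']
```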

> Wow, this makes no sense to me.  How can this happen?  'rol' is fast, but
> 'ric' which returns 3 times as many rows takes 1.5 minutes, no matter how
> often it is run?  And 64MB is a lot of memory.
>
> Does profiling show that 'rol' and 'ric' take the same amount of CPU
> time in the backend?  Is EXPLAIN identical for these two?  If so, there

Profiling shows different times (I guess you're talking about the
cumulative time for the query?). 'rol' takes 0.68 secs cumulative, 'ric'
takes 1.69 secs cumulative. The EXPLAIN outputs are identical.

> I am kind of tempted to test it on my machine here.  I have BSD/OS with
> a PP200 and 64MB RAM.  Can you send me something I can recreate here?
> Perhaps put it on our ftp site?

I *can* put something up through http on my university machine, I guess.
It'll be a pg_dump of that table. Note however that this is generated
from a commercial database, so please promise me to delete it after
testing :) I don't wanna get into any legal trouble (ah well, who's gonna
find out :).

I'll prepare a mail for you (i.e. Bruce) either tonight or tomorrow
morning (CET), with the source and a 'psql script' to set things up, as a
tar file. The data itself you'll have to get through the web; it's much
too large to mail (over 10 megs with gzip -9).

Maarten

_____________________________________________________________________________
| TU Delft, The Netherlands, Faculty of Information Technology and Systems  |
|                   Department of Electrical Engineering                    |
|           Computer Architecture and Digital Technique section             |
|                          M.Boekhold@et.tudelft.nl                         |
-----------------------------------------------------------------------------



Re: [HACKERS] Re: indexing words slow

From: ocie@paracel.com
Maarten Boekhold wrote:
>
> On Wed, 11 Mar 1998, Bruce Momjian wrote:
>
> > >
> > > Hi,
> > >
> > > I have done a little more testing, and the performance bottleneck
> > > seems definitely to be memory related. Note that it does not really seem
> > > to be dependent on buffer settings, but really on disk caches.
> > >
> > > additional info:
> > >     the index on this table is around 155 Megs big
> > >
> > > Now, if I do a count(*) on '^rol', after the second query, this takes
> > > around 1 second, and returns 2528.
> > >
> > > On the other hand, if I do a count(*) on '^ric', this consistently takes
> > > around 1:30 mins, no matter how often I run it. This returns 7866.
> > >
> > > A search on count(*) of '^lling' and '^tones' takes around 2.5 secs after
> > > running it several times.
>
> btw. to make things clearer on what I mean with '^lling' and '^tones', I
> really mean "'^lling' *AND* '^tones'", ie. a join :) actually pretty good
> don't ya think? :)

This sounds like an unsatisfiable query; perhaps if the database
figured this out, it could return zero rows without even hitting the
index.  If the first pattern matches, the first character is an 'l'; if
the second matches, a 't'.  It can't be both an 'l' and a 't'!

We could also do this for some other queries, like:

select ... from ... where a<b and b<c and c<d

If a and d are parameters to the query (constants), and d<a, then
there can never be a row returned.
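A sketch of how such a check might work (my own illustration, not anything
existing in PostgreSQL): treat each strict inequality "x < y" as a directed
edge x -> y; over an ordered domain, the conjunction is unsatisfiable
exactly when those edges form a cycle.

```python
from collections import defaultdict

def unsatisfiable(inequalities):
    """inequalities: list of (lo, hi) pairs, each meaning lo < hi.
    The conjunction is unsatisfiable iff the lo -> hi edges contain a cycle,
    since a cycle forces some value to be strictly less than itself."""
    graph = defaultdict(list)
    nodes = set()
    for lo, hi in inequalities:
        graph[lo].append(hi)
        nodes.update((lo, hi))
    WHITE, GREY, BLACK = 0, 1, 2
    color = dict.fromkeys(nodes, WHITE)

    def dfs(n):
        color[n] = GREY              # on the current DFS path
        for m in graph[n]:
            if color[m] == GREY or (color[m] == WHITE and dfs(m)):
                return True          # back-edge found: cycle
        color[n] = BLACK             # fully explored, no cycle through n
        return False

    return any(dfs(n) for n in nodes if color[n] == WHITE)

# a < b, b < c, c < d is fine on its own, but adding the constant
# fact d < a closes a cycle, so no row could ever match:
print(unsatisfiable([("a", "b"), ("b", "c"), ("c", "d")]))              # False
print(unsatisfiable([("a", "b"), ("b", "c"), ("c", "d"), ("d", "a")]))  # True
```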

Ocie

Re: [HACKERS] Re: indexing words slow

From: Bruce Momjian
> > btw. to make things clearer on what I mean with '^lling' and '^tones', I
> > really mean "'^lling' *AND* '^tones'", ie. a join :) actually pretty good
> > don't ya think? :)
>
> This sounds like an unsatisfiable query, perhaps if the database
> figured this out, it could return zero rows without even hitting the
> index.  If the first item matched, the first character is an 'l', if
> the second matches, a 't'.  It can't be both an 'l' and a 't'!

He has two instances of the table, and is looking for index entries
matching both words.  We have only been sharing part of the conversation.


Re: [HACKERS] Re: indexing words slow

From: Maarten Boekhold
On Wed, 11 Mar 1998 ocie@paracel.com wrote:

> Maarten Boekhold wrote:
> >
> > On Wed, 11 Mar 1998, Bruce Momjian wrote:
> >
> > > >
> > > > Hi,
> > > >
> > > > I have done a little more testing, and the performance bottleneck
> > > > seems definitely to be memory related. Note that it does not really seem
> > > > to be dependent on buffer settings, but really on disk caches.
> > > >
> > > > additional info:
> > > >     the index on this table is around 155 Megs big
> > > >
> > > > Now, if I do a count(*) on '^rol', after the second query, this takes
> > > > around 1 second, and returns 2528.
> > > >
> > > > On the other hand, if I do a count(*) on '^ric', this consistently takes
> > > > around 1:30 mins, no matter how often I run it. This returns 7866.
> > > >
> > > > A search on count(*) of '^lling' and '^tones' takes around 2.5 secs after
> > > > running it several times.
> >
> > btw. to make things clearer on what I mean with '^lling' and '^tones', I
> > really mean "'^lling' *AND* '^tones'", ie. a join :) actually pretty good
> > don't ya think? :)
>
> This sounds like an unsatisfiable query, perhaps if the database
> figured this out, it could return zero rows without even hitting the
> index.  If the first item matched, the first character is an 'l', if
> the second matches, a 't'.  It can't be both an 'l' and a 't'!

OK, I wasn't clear enough:

select count(*) from table t1, table t2 where t1.string ~ '^lling' and
t2.string ~ '^tones';

Of course this can be satisfied (note that this is a table with
4,500,000 rows).

Maarten



Re: indexing words slow

From: Bruce Momjian
>
> On Wed, 11 Mar 1998, Bruce Momjian wrote:
>
> > >
> > > Hi,
> > >
> > > I have done a little more testing, and the performance bottleneck
> > > seems definitely to be memory related. Note that it does not really seem
> > > to be dependent on buffer settings, but really on disk caches.
> > >
> > > additional info:
> > >     the index on this table is around 155 Megs big
> > >
> > > Now, if I do a count(*) on '^rol', after the second query, this takes
> > > around 1 second, and returns 2528.
> > >
> > > On the other hand, if I do a count(*) on '^ric', this consistently takes
> > > around 1:30 mins, no matter how often I run it. This returns 7866.
> > >
> > > A search on count(*) of '^lling' and '^tones' takes around 2.5 secs after
> > > running it several times.

OK, I have the artist_fti table with 4.5 million rows, with an index
artist_fti_idx on artist_fti.string.  I don't have a 'product' table
because I don't have the disk space, but that is not really a problem
for testing.

I used the product table to populate artist_fti, then deleted the
table so I could create the artist_fti_idx index.  Single table, no
join.

I am running on BSD/OS 3.1 PP200 with SCSI Ultra Barracuda disks.

I am seeing the same thing you are.  'lling' and 'tones' take 11-22
seconds on the first few runs, then return almost immediately.  If I do
another query and come back to the first one, I have the same slowness,
with the disk light being on almost constantly.  Certain queries seem to
lend themselves to speeding up, while 'rol' never seems to get really
fast.

I have to conclude that because of the way this table is created by
slicing words, its layout is almost random.  The table is 272MB, and
with 8k/page, that is about 34,000 pages.  If we are going for 2,500
rows, odds are that each row is in a separate page.  So, to do the
query, we are retrieving 2,500 8k pages, or 20MB of random data on the
disk.  How long does it take to issue 2,500 disk requests that are
scattered all over the disk?  Probably 11-22 seconds.
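The back-of-the-envelope numbers check out; a quick sketch (the ~8ms
average random access time is my assumption for a disk of that era, not a
measured figure):

```python
table_bytes = 272 * 1024 * 1024    # 272MB table
page_bytes = 8 * 1024              # 8k pages
pages = table_bytes // page_bytes
print(pages)                       # 34816, i.e. roughly 34,000 pages

rows = 2500                        # matching rows, ~1 per page
access_ms = 8                      # assumed avg. seek + rotation per random read
print(rows * access_ms / 1000)     # 20.0 seconds, within the observed 11-22s
```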

My OS only lets me have 400 8k buffers, or 3.2MB of buffer.  As we are
transferring 2,500 8k pages or 20MB of data, it is no surprise that the
buffer cache doesn't help much.  Sometimes the data is grouped together
on the disk, and that is why some queries are fast, but often it is not,
and in a situation where you are looking for two words to appear on the
same row, the pages are certainly not adjacent.

Just started running CLUSTER, and it appears to be taking forever.  Does
not seem to be using the disk, but a lot of CPU.  It appears to be
caught in an endless loop.

I can attach to a running process, so when I attach to the backend, I
see:

$ gdb /u/pg/bin/postgres 29755
GDB 4.16 (i386-unknown-bsdi3.0),
Copyright 1996 Free Software Foundation, Inc...
GDB is free software and you are welcome to distribute copies of it
under certain conditions; type "show copying" to see the conditions.
There is absolutely no warranty for GDB; type "show warranty" for details.

/usr/local/src/pgsql/pgsql/src/backend/commands/29755: No such file or
directory.
Attaching to program `/u/pg/bin/postgres', process 29755
0x94999 in WaitIO (buf=0x4c0ccc8, spinlock=0) at bufmgr.c:1107
1107            S_LOCK(&(buf->io_in_progress_lock));


---------------------------------------------------------------------------

Stepping to the next statement shows it is just hung here.

The backtrace shows:

---------------------------------------------------------------------------

#0  0x9499b in WaitIO (buf=0x4c0ccc8, spinlock=0) at bufmgr.c:1107
#1  0x94109 in BufferAlloc (reln=0x19df90, blockNum=11324,
    foundPtr=0xefbfb903 "\001\030����=\t", bufferLockHeld=0) at bufmgr.c:400
#2  0x93e91 in ReadBufferWithBufferLock (reln=0x19df90, blockNum=11324,
    bufferLockHeld=0) at bufmgr.c:255
#3  0x93dee in ReadBuffer (reln=0x19df90, blockNum=11324) at bufmgr.c:174
#4  0xb28f in heap_fetch (relation=0x19df90, seeself=0 '\000', tid=0x1b2256,
    b=0xefbfb974) at heapam.c:1050
#5  0x303d5 in rebuildheap (OIDNewHeap=11466400, OIDOldHeap=6424406,
    OIDOldIndex=11466368) at cluster.c:357
#6  0x30111 in cluster (oldrelname=0x112790 "artist_fti",
    oldindexname=0x10cdd0 "artist_fti_idx") at cluster.c:160
#7  0xa29c1 in ProcessUtility (parsetree=0x1127b0, dest=Remote)
    at utility.c:626


---------------------------------------------------------------------------

Nothing else is happening on my machine, so it must be locked against
itself.  I will check into this.

CLUSTER may be misbehaving on your machine too.


Re: indexing words slow

From: Bruce Momjian
> OK, I have the artist_fti table with 4.5 million rows, with an index
> artist_fti_idx on artist_fti.string.  I don't have a 'product' table
> because I don't have the disk space, but that is not really a problem
> for testing.
>
> I used the product table to populate artist_fti, then deleted the
> table so I could create the artist_fti_idx index.  Single table, no
> join.
>
> I am running on BSD/OS 3.1 PP200 with SCSI Ultra Barracuda disks.
>
> I am seeing the same thing you are.  'lling' and 'tones' take 11-22
> seconds on the first few runs, then return almost immediately.  If I do
> another query and come back to the first one, I have the same slowness,
> with the disk light being on almost constantly.  Certain queries seem to
> lend themselves to speeding up, while 'rol' never seems to get really
> fast.
>
> I have to conclude that because of the way this table is created by
> slicing words, its layout is almost random.  The table is 272MB, and
> with 8k/page, that is about 34,000 pages.  If we are going for 2,500
> rows, odds are that each row is in a separate page.  So, to do the
> query, we are retrieving 2,500 8k pages, or 20MB of random data on the
> disk.  How long does it take to issue 2,500 disk requests that are
> scattered all over the disk?  Probably 11-22 seconds.
>
> My OS only lets me have 400 8k buffers, or 3.2MB of buffer.  As we are
> transferring 2,500 8k pages or 20MB of data, it is no surprise that the
> buffer cache doesn't help much.  Sometimes the data is grouped together
> on the disk, and that is why some queries are fast, but often it is not,
> and in a situation where you are looking for two words to appear on the
> same row, the pages are certainly not adjacent.
>
> Just started running CLUSTER, and it appears to be taking forever.  Does
> not seem to be using the disk, but a lot of CPU.  It appears to be
> caught in an endless loop.

OK, I have an answer for you.

First, the CLUSTER problem I was having was some fluke, probably
something else that got stuck somewhere.  Not a bug.

Second, CLUSTER takes forever because it is moving all over the disk
retrieving each row in order.

Third, I should have been able to solve this for you sooner.  The issues
of slowness you are seeing are the exact issues I dealt with five years
ago when I designed this fragmentation system.  I was so excited that
you could develop triggers to slice the words, I had forgotten the other
issues.

OK, here it is.  As part of this fragmentation job, EVERY NIGHT, I
re-slice the user strings and dump them into a flat file.  I then load
them into Ingres as a heap table, then modify the table to ISAM.

You may ask, why every night?  Well, the problem is that I load the data
at 100% fill-factor, and ISAM rapidly becomes fragmented with
insertions/deletions.

Btree does not become fragmented, but the problem there is that btree
use was very slow.  I believe this is because ISAM is basically a SORTED
HEAP with an index, so everything is close and packed tight.  Btree
doesn't seem to do that as well; it is more spread out.

Users complain about the nightly re-index, but I tell them, "Hey, you
are searching for fragments of words in 200k entries in 6 seconds.  I
can design it without the night job, but your searches all day will take
much longer."  That stops the conversation.

So, to your solution.  CLUSTER is too slow.  The disk is going crazy
moving single rows into the temp table.  I recommend doing a COPY of
artist_fti to a flat file, doing a Unix 'sort' on the flat file,
re-loading the data into artist_fti, and then putting the index on the
table and running VACUUM.
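The sort step itself is trivial; here is the idea as a Python sketch (the
filenames are hypothetical, and in practice COPY handles the dump/reload
while Unix sort(1) does the sorting):

```python
def sort_fragment_file(src_path, dst_path):
    """Rewrite a COPY'd flat file in sorted order, so that rows sharing a
    fragment prefix end up on adjacent pages after re-loading."""
    with open(src_path) as src:
        lines = src.readlines()
    lines.sort()  # same effect as running Unix sort(1) on the whole line
    with open(dst_path, "w") as dst:
        dst.writelines(lines)
```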

I have done this, and now all searches are instantaneous THE FIRST TIME
and every time.

With this change, I am anxious to hear how fast you can now do your
multi-word searches.  Daily changes will not really impact performance
because they are a small part of the total search, but this process of
COPY/sort/reload/reindex will need to be done on a regular basis to
maintain performance.



Re: [HACKERS] Re: indexing words slow

From: dg@illustra.com (David Gould)
Bruce Momjian writes:
> Just started running CLUSTER, and it appears to be taking forever.  Does
> not seem to be using the disk, but a lot of CPU.  It appears to be
> caught in an endless loop.
...
> Attaching to program `/u/pg/bin/postgres', process 29755
> 0x94999 in WaitIO (buf=0x4c0ccc8, spinlock=0) at bufmgr.c:1107
> 1107            S_LOCK(&(buf->io_in_progress_lock));
>
> ---------------------------------------------------------------------------
>
> Stepping to the next statement shows it is just hung here.
> The backtrace shows:
>
> ---------------------------------------------------------------------------
>
> #0  0x9499b in WaitIO (buf=0x4c0ccc8, spinlock=0) at bufmgr.c:1107
> #1  0x94109 in BufferAlloc (reln=0x19df90, blockNum=11324,
>     foundPtr=0xefbfb903 "\001\030����=\t", bufferLockHeld=0) at bufmgr.c:400
> #2  0x93e91 in ReadBufferWithBufferLock (reln=0x19df90, blockNum=11324,
>     bufferLockHeld=0) at bufmgr.c:255
> #3  0x93dee in ReadBuffer (reln=0x19df90, blockNum=11324) at bufmgr.c:174
> #4  0xb28f in heap_fetch (relation=0x19df90, seeself=0 '\000', tid=0x1b2256,
>     b=0xefbfb974) at heapam.c:1050
> #5  0x303d5 in rebuildheap (OIDNewHeap=11466400, OIDOldHeap=6424406,
>     OIDOldIndex=11466368) at cluster.c:357
> #6  0x30111 in cluster (oldrelname=0x112790 "artist_fti",
>     oldindexname=0x10cdd0 "artist_fti_idx") at cluster.c:160
> #7  0xa29c1 in ProcessUtility (parsetree=0x1127b0, dest=Remote)
>     at utility.c:626

This is all just speculation from memory as I do not have the code in front
of me, but here is what I think might be happening:

heap_fetch -> ReadBuffer -> ReadBufferWithBufferLock
-- trying to fetch the next row, needed to read a new page

 -> BufferAlloc
    -- to read a new page, we need a buffer, so grab one
    -- or possibly we found the buffer we were looking for

    -> WaitIO
       -- but the target buffer was already being read or written
          (by another process???)

       -> S_LOCK(&(buf->io_in_progress_lock))
          -- so spin on the buffer in_progress lock waiting for the I/O to
             complete and clear the lock

             Apparently, the I/O completion is never seen so the lock is
             never cleared so this will wait forever.

Since the system is stuck at this point, somebody dropped the ball on
clearing the io_in_progress_lock spinlock. I can't tell without looking
at the code (and probably not even then) if this is I/O initiated by the
stuck process and just forgotten about, or if some other process was involved.

Were there any other active backends while this was running?

If so, did any of them exit abnormally (ie without cleaning up)?

Is the stuck process holding any other spinlocks (like the bufmgr spinlock
for instance)? (There is an array of held spinlocks in the Proc structure
that gets set by SpinAcquire to record spinlocks owned by the current
process).

If so, is it possible that another process is somehow deadlocked
with this one?

  eg: A has bufmgr spinlock, wants io_in_progress_lock spinlock
      B has io_in_progress_lock spinlock, wants bufmgr spinlock

I hope this is useful...

-dg

ps: what is CLUSTER anyhow? I have a guess, but...

David Gould            dg@illustra.com           510.628.3783 or 510.305.9468
Informix Software  (No, really)         300 Lakeside Drive  Oakland, CA 94612
 - I realize now that irony has no place in business communications.

Re: [HACKERS] Re: indexing words slow

From: Bruce Momjian
> Second, CLUSTER takes forever because it is moving all over the disk
> retrieving each row in order.
>

> So, to your solution.  CLUSTER is too slow.  The disk is going crazy
> moving single rows into the temp table.  I recommend doing a COPY of
> artist_fti to a flat file, doing a Unix 'sort' on the flat file, then
> re-loading the data into the artist_fti, and then putting the index on
> the table and vacuum.
>
> I have done this, and now all searches are instantaneous THE FIRST TIME
> and every time.
>
> With this change, I am anxious to hear how fast you can now do your
> multi-word searches.  Daily changes will not really impact performance
> because they are a small part of the total search, but this process of
> COPY/sort/reload/reindex will need to be done on a regular basis to
> maintain performance.

One more piece of good news.  The reason CLUSTER was so slow is because
you loaded massive unordered amounts of data into the system.  Once you
do the COPY out/reload, subsequent clusters will run very quickly,
because 99% of the data is already ordered.  Only the new/changed data
is unordered, so you should be able to rapidly run CLUSTER from then on
to keep good performance.

I think a user module allowing this word fragment searching will be a
big hit with users.


Re: indexing words slow

From: Maarten Boekhold
> I have to conclude that because of the way this table is created by
> slicing words, its layout is almost random.  The table is 272MB, and
> with 8k/page, that is about 34,000 pages.  If we are going for 2,500
> rows, odds are that each row is in a separate page.  So, to do the
> query, we are retrieving 2,500 8k pages, or 20MB of random data on the
> disk.  How long does it take to issue 2,500 disk requests that are
> scattered all over the disk?  Probably 11-22 seconds.

> My OS only lets me have 400 8k buffers, or 3.2MB of buffer.  As we are
> transferring 2,500 8k pages or 20MB of data, it is no surprise that the
> buffer cache doesn't help much.  Sometimes the data is grouped together
> on the disk, and that is why some queries are fast, but often it is not,
> and in a situation where you are looking for two words to appear on the
> same row, the pages are certainly not adjacent.

Thanks Bruce, this is a good report :) It confirms what I thought myself
too, btw.

> Just started running CLUSTER, and it appears to be taking forever.  Does
> not seem to be using the disk, but a lot of CPU.  It appears to be
> caught in an endless loop.

Note that CLUSTER actually *did* do something on my system.  It created a
temp table, which was populated, but very very slowly.  I also had
disk activity, but maybe you don't notice much of it because of your SCSI
disks....

Maarten
