Thread: More full text index..

More full text index..

From
"Mitch Vincent"
Date:
I hadn't concentrated on the INSERT/UPDATE/DELETE speed of this until today
and I find that it's amazingly slow. Of course the time it takes is relative
to the size of the text but still, almost a minute to delete one record on a
Dual celeron 600 with 256 Megs of RAM and an ATA/66 7200 RPM 30 GIG hard
drive... INSERTs seem to be quite a bit faster (which puzzles me) but
they're still 10-20 seconds for a single record... UPDATEs seems very fast
(a few seconds).

I do have a lot of stop works in fti.c, however when I imported the 10,000
text files into the data base it was super fast (before I created indexes)
so I'm assuming that the indexes are slowing down the INSERTs UPDATEs and
DELETEs, which is expected I think? The database is VACUUMed on a regular
basis (and VACUUM ANALYZEed as well).

I'd rather have the fast search than the fast data entry, I just want to be
absolutely sure that I can't do anything to speed things along..

If I run PGOPTIONS="-d2 -s" psql databasename

I get this in the logs on an INSERT -- it doesn't appear to give any stats
on the queries that the function called by the fti trigger is doing..


--Here is my insert query (20k of text) --
query: INSERT INTO resumes_fti (string, id) VALUES ($1, $2)
! system usage stats:
!       0.644167 elapsed 0.380151 user 0.126785 system sec
!       [0.387579 user 0.149069 sys total]
!       9/2 [13/2] filesystem blocks in/out
!       0/2228 [0/2459] page faults/reclaims, 0 [0] swaps
!       0 [0] signals rcvd, 0/0 [3/3] messages rcvd/sent
!       9/4 [16/7] voluntary/involuntary context switches
! postgres usage stats:
!       Shared blocks:         20 read,          0 written, buffer hit rate
= 99.77%
!       Local  blocks:          0 read,          0 written, buffer hit rate
= 0.00%
!       Direct blocks:          0 read,          0 written
CommitTransactionCommand
proc_exit(0)

Like I said, I just need to know if this is expected or if there might be
something (anything) I can do to speed it up.. It's going to be running on a
damn fast machine so I'm sure that these times are going to get smaller, if
not from just brute force.

Thanks guys!

-Mitch




Re: More full text index..

From
Bruce Momjian
Date:
I would check with EXPLAIN to see when indexes are being used.


[ Charset ISO-8859-1 unsupported, converting... ]
> I hadn't concentrated on the INSERT/UPDATE/DELETE speed of this until today
> and I find that it's amazingly slow. Of course the time it takes is relative
> to the size of the text but still, almost a minute to delete one record on a
> Dual celeron 600 with 256 Megs of RAM and an ATA/66 7200 RPM 30 GIG hard
> drive... INSERTs seem to be quite a bit faster (which puzzles me) but
> they're still 10-20 seconds for a single record... UPDATEs seems very fast
> (a few seconds).
> 
> I do have a lot of stop works in fti.c, however when I imported the 10,000
> text files into the data base it was super fast (before I created indexes)
> so I'm assuming that the indexes are slowing down the INSERTs UPDATEs and
> DELETEs, which is expected I think? The database is VACUUMed on a regular
> basis (and VACUUM ANALYZEed as well).
> 
> I'd rather have the fast search than the fast data entry, I just want to be
> absolutely sure that I can't do anything to speed things along..
> 
> If I run PGOPTIONS="-d2 -s" psql databasename
> 
> I get this in the logs on an INSERT -- it doesn't appear to give any stats
> on the queries that the function called by the fti trigger is doing..
> 
> 
> --Here is my insert query (20k of text) --
> query: INSERT INTO resumes_fti (string, id) VALUES ($1, $2)
> ! system usage stats:
> !       0.644167 elapsed 0.380151 user 0.126785 system sec
> !       [0.387579 user 0.149069 sys total]
> !       9/2 [13/2] filesystem blocks in/out
> !       0/2228 [0/2459] page faults/reclaims, 0 [0] swaps
> !       0 [0] signals rcvd, 0/0 [3/3] messages rcvd/sent
> !       9/4 [16/7] voluntary/involuntary context switches
> ! postgres usage stats:
> !       Shared blocks:         20 read,          0 written, buffer hit rate
> = 99.77%
> !       Local  blocks:          0 read,          0 written, buffer hit rate
> = 0.00%
> !       Direct blocks:          0 read,          0 written
> CommitTransactionCommand
> proc_exit(0)
> 
> Like I said, I just need to know if this is expected or if there might be
> something (anything) I can do to speed it up.. It's going to be running on a
> damn fast machine so I'm sure that these times are going to get smaller, if
> not from just brute force.
> 
> Thanks guys!
> 
> -Mitch
> 
> 
> 


--  Bruce Momjian                        |  http://www.op.net/~candle pgman@candle.pha.pa.us               |  (610)
853-3000+  If your life is a hard drive,     |  830 Blythe Avenue +  Christ can be your backup.        |  Drexel Hill,
Pennsylvania19026
 


Re: More full text index..

From
"Mitch Vincent"
Date:
EXPLAIN on a delete isn't very interesting..

databasename=# explain delete from applicants_resumes where app_id=62908;
NOTICE:  QUERY PLAN:

Index Scan using app_resume_app_id_index on applicants_resumes
(cost=0.00..3.70 rows=1 width=6)

EXPLAIN

I'm not thinking that is what's taking so long though, I think it's the fti
trigger. There is another table resumes_fti that has individual words (over
20 million rows) on delete in the applicants_resumes table it searches
through and deletes out of that table as well, evidently that's where it's
taking forever.. In fit.c I can see the delete query generated, it's as
straight forward as they come (DELETE from resumes_fti WHERE ID=<whatever>)

Check this out..

databasename=# explain delete from resumes_fti where id=86370016;
NOTICE:  QUERY PLAN:

Seq Scan on resumes_fti  (cost=0.00..394577.18 rows=1956 width=6)

EXPLAIN

Ouch :-)

Now this :

query: delete from resumes_fti where id=86370016;
ProcessQuery
! system usage stats:
!       94.297058 elapsed 66.381692 user 24.776035 system sec
!       [66.399740 user 24.785696 sys total]
!       10926/8 [10926/8] filesystem blocks in/out
!       0/30789 [0/31005] page faults/reclaims, 0 [0] swaps
!       0 [0] signals rcvd, 0/0 [3/3] messages rcvd/sent
!       186/1493 [189/1496] voluntary/involuntary context switches
! postgres usage stats:
!       Shared blocks:      45945 read,         32 written, buffer hit rate
= 3.24%
!       Local  blocks:          0 read,          0 written, buffer hit rate
= 0.00%
!       Direct blocks:          0 read,          0 written
CommitTransactionCommand
proc_exit(0)
shmem_exit(0)
exit(0)

Most of that is greek to me -- speaking of which, is there any place where
these stats are explained a bit?

Anyway, do you see anything that could be correctable?

Thanks!!

-Mitch
----- Original Message -----
From: Bruce Momjian <pgman@candle.pha.pa.us>
To: Mitch Vincent <mitch@venux.net>
Cc: <pgsql-sql@postgresql.org>
Sent: Saturday, June 24, 2000 2:33 PM
Subject: Re: [SQL] More full text index..


> I would check with EXPLAIN to see when indexes are being used.
>
>
> [ Charset ISO-8859-1 unsupported, converting... ]
> > I hadn't concentrated on the INSERT/UPDATE/DELETE speed of this until
today
> > and I find that it's amazingly slow. Of course the time it takes is
relative
> > to the size of the text but still, almost a minute to delete one record
on a
> > Dual celeron 600 with 256 Megs of RAM and an ATA/66 7200 RPM 30 GIG hard
> > drive... INSERTs seem to be quite a bit faster (which puzzles me) but
> > they're still 10-20 seconds for a single record... UPDATEs seems very
fast
> > (a few seconds).
> >
> > I do have a lot of stop works in fti.c, however when I imported the
10,000
> > text files into the data base it was super fast (before I created
indexes)
> > so I'm assuming that the indexes are slowing down the INSERTs UPDATEs
and
> > DELETEs, which is expected I think? The database is VACUUMed on a
regular
> > basis (and VACUUM ANALYZEed as well).
> >
> > I'd rather have the fast search than the fast data entry, I just want to
be
> > absolutely sure that I can't do anything to speed things along..
> >
> > If I run PGOPTIONS="-d2 -s" psql databasename
> >
> > I get this in the logs on an INSERT -- it doesn't appear to give any
stats
> > on the queries that the function called by the fti trigger is doing..
> >
> >
> > --Here is my insert query (20k of text) --
> > query: INSERT INTO resumes_fti (string, id) VALUES ($1, $2)
> > ! system usage stats:
> > !       0.644167 elapsed 0.380151 user 0.126785 system sec
> > !       [0.387579 user 0.149069 sys total]
> > !       9/2 [13/2] filesystem blocks in/out
> > !       0/2228 [0/2459] page faults/reclaims, 0 [0] swaps
> > !       0 [0] signals rcvd, 0/0 [3/3] messages rcvd/sent
> > !       9/4 [16/7] voluntary/involuntary context switches
> > ! postgres usage stats:
> > !       Shared blocks:         20 read,          0 written, buffer hit
rate
> > = 99.77%
> > !       Local  blocks:          0 read,          0 written, buffer hit
rate
> > = 0.00%
> > !       Direct blocks:          0 read,          0 written
> > CommitTransactionCommand
> > proc_exit(0)
> >
> > Like I said, I just need to know if this is expected or if there might
be
> > something (anything) I can do to speed it up.. It's going to be running
on a
> > damn fast machine so I'm sure that these times are going to get smaller,
if
> > not from just brute force.
> >
> > Thanks guys!
> >
> > -Mitch
> >
> >
> >
>
>
> --
>   Bruce Momjian                        |  http://www.op.net/~candle
>   pgman@candle.pha.pa.us               |  (610) 853-3000
>   +  If your life is a hard drive,     |  830 Blythe Avenue
>   +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026
>



Re: More full text index..

From
Bruce Momjian
Date:
Check this out..
> 
> databasename=# explain delete from resumes_fti where id=86370016;
> NOTICE:  QUERY PLAN:
> 
> Seq Scan on resumes_fti  (cost=0.00..394577.18 rows=1956 width=6)
> 
> EXPLAIN
> 
> Ouch :-)

Bingo.  Is there an index on id?

--  Bruce Momjian                        |  http://www.op.net/~candle pgman@candle.pha.pa.us               |  (610)
853-3000+  If your life is a hard drive,     |  830 Blythe Avenue +  Christ can be your backup.        |  Drexel Hill,
Pennsylvania19026
 


Re: More full text index..

From
Tom Lane
Date:
"Mitch Vincent" <mitch@venux.net> writes:
> EXPLAIN on a delete isn't very interesting..
> databasename=# explain delete from applicants_resumes where app_id=62908;
> NOTICE:  QUERY PLAN:

> Index Scan using app_resume_app_id_index on applicants_resumes
> (cost=0.00..3.70 rows=1 width=6)

I believe that doesn't tell you anything about triggers that might be
triggered during execution of the statement.  I think you are right that
the deletes issued by the trigger are the problem...

> Now this :

> query: delete from resumes_fti where id=86370016;
> ProcessQuery
> ! system usage stats:
> !       94.297058 elapsed 66.381692 user 24.776035 system sec
> !       [66.399740 user 24.785696 sys total]
> !       10926/8 [10926/8] filesystem blocks in/out
> !       0/30789 [0/31005] page faults/reclaims, 0 [0] swaps
> !       0 [0] signals rcvd, 0/0 [3/3] messages rcvd/sent
> !       186/1493 [189/1496] voluntary/involuntary context switches
> ! postgres usage stats:
> !       Shared blocks:      45945 read,         32 written, buffer hit rate
> = 3.24%
> !       Local  blocks:          0 read,          0 written, buffer hit rate
> = 0.00%
> !       Direct blocks:          0 read,          0 written

> Most of that is greek to me -- speaking of which, is there any place where
> these stats are explained a bit?

For the "system usage" stats, see man getrusage(2).  I think the other
thing you need to know is that the numbers in square brackets are total
getrusage for the current backend, the numbers before brackets are the
incremental usage for the current command.  These do count absolutely
everything including trigger activity.

The "postgres usage" stats are just I/O block request counts for the
shared and transaction-local buffer caches respectively ... but I forget
whether they are kernel I/O requests or logical I/O requests, ie which
side of the buffer cache they are counted on.
        regards, tom lane