Thread: Index speeds up one row table (why)?
version: 7.3.2

OK, not really sure if this is a bug per se, but it's non-intuitive, and it goes against the advice stated in the user guide (page 150, "...there is no plan that can beat sequentially fetching 1 page...").

I have an application that performs many inserts per second (it's doing real-time data collection from other hardware). In the process of these inserts, it is sometimes necessary to run the following:

select next_id from unique_ids where name=whatever for update;
update unique_ids set next_id=next_id+1 where name=whatever;
pass on value of old next_id to other code...

where unique_ids is:

create table unique_ids (
    name text not null,
    next_id bigint not null
) without oids;

Currently this table has one row in it, where name is 15 unicode characters long. It would seem that there would be no need for an index on name. However, doing:

create index unique_ids__name on unique_ids(name);

resulted in literally an order-of-magnitude increase in the speed of the application. (It went from 10-20 seconds to handle approximately 30 records, to 1/2-3/4 second, and this was the only change.) Presumably I would never have discovered this had I remembered to declare name as a primary key, which would have created the index. Experimenting around, doing a vacuum full without the index didn't make any difference (I suspected that perhaps the seq scan had to go through a bunch of "dead" records). For some reason, postgresql is significantly slower doing the sequential scan than the index scan (I checked with explain, and it is using the index when it's present) in spite of there being only one row.

So, it appears that there is some performance problem in the seq scan logic, or in caching (like, maybe it's willing to cache an index, but it always goes to the disk for a seq scan? Even so, I would think the OS would cache it), or something really non-intuitive is happening that should be documented (the present documentation implied that I should *not* create that index, but doing so was a significant improvement).

P.S. You may be wondering why I'm not using serial or sequences. I need this application to be database agnostic.
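For reference, here is the allocation above written out as one explicit transaction, a minimal sketch only: the 'CONNECTION_DATA' literal is an illustrative stand-in for the "whatever" key in the post.

BEGIN;
-- lock the counter row so concurrent allocators queue up behind it
SELECT next_id FROM unique_ids WHERE name = 'CONNECTION_DATA' FOR UPDATE;
-- bump the counter; the old next_id value is handed on to the application
UPDATE unique_ids SET next_id = next_id + 1 WHERE name = 'CONNECTION_DATA';
COMMIT;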
On Sat, May 31, 2003 at 00:14:18 -0600, Dave E Martin XXIII <postgresql-to.dave@dave.to> wrote:

> version: 7.3.2
>
> Currently this table has one row in it, where name is 15 unicode
> characters long. It would seem that there would be no need for an index
> on name. However, doing:

It probably has one visible row in it. If it gets changed a lot, there may be lots of deleted tuples in the table. That would explain why an index scan speeds things up.
On Sat, 31 May 2003, Dave E Martin XXIII wrote:

> select next_id from unique_ids where name=whatever for update;
> update unique_ids set next_id=next_id+1 where name=whatever;
> pass on value of old next_id to other code...
>
> where unique_ids is:
>
> create table unique_ids (
>     name text not null,
>     next_id bigint not null
> ) without oids;
>
> Currently this table has one row in it, where name is 15 unicode
> characters long. It would seem that there would be no need for an index
> on name. However, doing:
>
> create index unique_ids__name on unique_ids(name);
>
> resulted in literally an order-of-magnitude increase in the speed of the
> application. (It went from 10-20 seconds to handle approximately 30
> records, to 1/2-3/4 second, and this was the only change.) Presumably I
> would never have discovered this had I remembered to declare name as a
> primary key, which would have created the index. Experimenting around,
> doing a vacuum full without the index didn't make any difference (I
> suspected that perhaps the seq scan had to go through a bunch of "dead"
> records). For some reason, postgresql is significantly slower doing the
> sequential scan than the index scan (I checked with explain, and it is
> using the index when it's present) in spite of there being only one row.

It may just be a question of plan choice, but we'd need to see explain analyze output to really make a reasonable guess.
Bruno Wolff III <bruno@wolff.to> writes:
> It probably has one visible row in it. If it gets changed a lot, there
> may be lots of deleted tuples in the table. That would explain why an
> index scan speeds things up.

Right, every UPDATE on unique_ids generates a dead row, and a seqscan has no alternative but to wade through them all. When a unique index is present, the indexscan code knows that after it's fetched one live tuple there can be no more matching the same index key, and so it need not keep examining index entries. Furthermore, due to the way that btree handles equal keys, it is likely (not certain, just likely) that more-recent and hence more-likely-to-be-live tuples will be seen first.

However, the above-described optimization for unique keys is new in 7.3.*, and it's buggy. It's disabled as of 7.3.3, so the performance improvement you're seeing will go away as soon as you update (which you should). There's a fresh try at it in 7.4 CVS.

More-frequent vacuums would be a much more reliable solution, in any case. If you are updating the single row once a second, then a cron job to vacuum (not full, just plain "vacuum") that particular table every couple of minutes would not be a bad idea. A hundred dead rows will still fit in one disk block (unless there's lots more in the row than you've mentioned), and as long as you can keep the table to one disk block you shouldn't notice any performance degradation.

You might care to use contrib/pgstattuple to check out the contents of the table, but I'm pretty sure what you'll find ...

			regards, tom lane
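A minimal sketch of the maintenance suggested here, to be run every couple of minutes against the table from the original post. The pgstattuple call is shown in its modern form; the contrib module's call syntax and output columns have varied across PostgreSQL versions, so treat that line as illustrative.

-- plain VACUUM, not VACUUM FULL: marks dead row versions reusable within the table
VACUUM unique_ids;
-- optional check on dead-tuple buildup using contrib/pgstattuple
SELECT * FROM pgstattuple('unique_ids');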
Tom Lane wrote:

> Bruno Wolff III <bruno@wolff.to> writes:
>
>> It probably has one visible row in it. If it gets changed a lot, there
>> may be lots of deleted tuples in the table. That would explain why an
>> index scan speeds things up.
>
> Right, every UPDATE on unique_ids generates a dead row, and a seqscan
> has no alternative but to wade through them all. When a unique index is

Speaking of which, since the row is locked with select for update (so it can't be involved in any other transactions anyway) and the change doesn't change the length of the row, can't it just be updated in place, or would that break something else? (Pardon if this is answered already; methinks it's time to go reread the todo's and the architecture documents...)
Tom Lane writes:

> Bruno Wolff III <bruno@wolff.to> writes:
>> It probably has one visible row in it. If it gets changed a lot, there
>> may be lots of deleted tuples in the table. That would explain why an
>> index scan speeds things up.
>
> Right, every UPDATE on unique_ids generates a dead row, and a seqscan
> has no alternative but to wade through them all. When a unique index is
> present, the indexscan code knows that after it's fetched one live tuple
...
> More-frequent vacuums would be a much more reliable solution,

The index I created wasn't unique (though it should have been), but perhaps much of the same reasoning still applies. Also, I could have sworn I tried a vacuum and it didn't make a difference, although experimenting just now, it did. The data collection rate is considerably slower at the moment though, so perhaps last time the table simply got "inefficient" again very quickly during or immediately after the vacuum (or I wasn't where I thought I was when I vacuumed). I'll have to experiment with this a bit more when the data generation is high again.

(OK, experimented a bit more just now.) Hm, it appears that degradation occurs with the index as well; I guess at the time I created the index, it just initially did better because it got to skip all the already-dead rows at creation time. But this is disturbing: I do a vacuum, and the access times are better, but still horrible:

explain analyze select next_id from bigint_unique_ids where table_name='CONNECTION_DATA';
 Index Scan using bigint_unique_ids__table_name on bigint_unique_ids  (cost=0.00..8.01 rows=1 width=8) (actual time=13.77..844.14 rows=1 loops=1)
   Index Cond: (table_name = 'CONNECTION_DATA'::text)
 Total runtime: 844.36 msec
(3 rows)

vacuum;  -- takes about 10 minutes
VACUUM

explain analyze select next_id from bigint_unique_ids where table_name='CONNECTION_DATA';
 Index Scan using bigint_unique_ids__table_name on bigint_unique_ids  (cost=0.00..84.01 rows=1 width=8) (actual time=0.17..99.94 rows=1 loops=1)
   Index Cond: (table_name = 'CONNECTION_DATA'::text)
 Total runtime: 100.09 msec

vacuum;  -- takes about 2 minutes
 Index Scan using bigint_unique_ids__table_name on bigint_unique_ids  (cost=0.00..179.01 rows=1 width=8) (actual time=0.45..219.05 rows=1 loops=1)
   Index Cond: (table_name = 'CONNECTION_DATA'::text)
 Total runtime: 219.20 msec
-- ACK, worse; ran twice more, got 212.5 ms and 394.39 ms

vacuum bigint_unique_ids;  -- try specific table only, takes about 5 seconds
 Index Scan using bigint_unique_ids__table_name on bigint_unique_ids  (cost=0.00..163.01 rows=1 width=8) (actual time=0.23..143.59 rows=1 loops=1)
   Index Cond: (table_name = 'CONNECTION_DATA'::text)
 Total runtime: 143.72 msec

vacuum full bigint_unique_ids;  -- try full, takes about 3 seconds
 Seq Scan on bigint_unique_ids  (cost=0.00..1.02 rows=1 width=8) (actual time=0.10..0.10 rows=1 loops=1)
   Filter: (table_name = 'CONNECTION_DATA'::text)
 Total runtime: 0.25 msec
-- ah, much much much better

So apparently vacuum by itself isn't going to be sufficient; I'm going to need vacuum fulls? Or if I do vacuums often enough (that should allow old rows to be overwritten?) will that do it? I'm a bit hazy on why vacuum isn't doing just as well as vacuum full; I thought the only difference was that full released space back to the operating system (and presumably defragments existing data, but for one row this shouldn't matter?).
Wait several minutes:

 Seq Scan on bigint_unique_ids  (cost=0.00..39.01 rows=1 width=8) (actual time=2.97..2.98 rows=1 loops=1)
   Filter: (table_name = 'CONNECTION_DATA'::text)
 Total runtime: 3.13 msec

reindex index bigint_unique_ids__table_name;
REINDEX

 Index Scan using bigint_unique_ids__table_name on bigint_unique_ids  (cost=0.00..5.97 rows=1 width=8) (actual time=0.11..0.20 rows=1 loops=1)
   Index Cond: (table_name = 'CONNECTION_DATA'::text)
 Total runtime: 0.30 msec

It appears reindex has the same speed-up effect (and in this case made it switch back from a seq scan to an index scan).

Let me throw in this too, if it's helpful:

vacuum verbose bigint_unique_ids;
INFO:  --Relation public.bigint_unique_ids--
INFO:  Index bigint_unique_ids__table_name: Pages 29; Tuples 1: Deleted 5354.
        CPU 0.01s/0.04u sec elapsed 0.05 sec.
INFO:  Removed 11348 tuples in 79 pages.
        CPU 0.00s/0.02u sec elapsed 0.02 sec.
INFO:  Pages 79: Changed 1, Empty 0; Tup 1: Vac 11348, Keep 0, UnUsed 0.
        Total CPU 0.03s/0.06u sec elapsed 0.14 sec.
INFO:  --Relation pg_toast.pg_toast_21592--
INFO:  Pages 0: Changed 0, Empty 0; Tup 0: Vac 0, Keep 0, UnUsed 0.
        Total CPU 0.00s/0.00u sec elapsed 0.00 sec.
VACUUM

vacuum full verbose bigint_unique_ids;
INFO:  --Relation public.bigint_unique_ids--
INFO:  Pages 79: Changed 1, reaped 79, Empty 0, New 0; Tup 1: Vac 297, Keep/VTL 0/0, UnUsed 11157, MinLen 52, MaxLen 52; Re-using: Free/Avail. Space 599716/22724; EndEmpty/Avail. Pages 76/3.
        CPU 0.01s/0.00u sec elapsed 0.01 sec.
INFO:  Index bigint_unique_ids__table_name: Pages 29; Tuples 1: Deleted 297.
        CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO:  Rel bigint_unique_ids: Pages: 79 --> 1; Tuple(s) moved: 1.
        CPU 0.00s/0.00u sec elapsed 0.02 sec.
INFO:  Index bigint_unique_ids__table_name: Pages 29; Tuples 1: Deleted 1.
        CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO:  --Relation pg_toast.pg_toast_21592--
INFO:  Pages 0: Changed 0, reaped 0, Empty 0, New 0; Tup 0: Vac 0, Keep/VTL 0/0, UnUsed 0, MinLen 0, MaxLen 0; Re-using: Free/Avail. Space 0/0; EndEmpty/Avail. Pages 0/0.
        CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO:  Index pg_toast_21592_index: Pages 1; Tuples 0.
        CPU 0.00s/0.00u sec elapsed 0.01 sec.
VACUUM
> vacuum verbose bigint_unique_ids;
> INFO:  --Relation public.bigint_unique_ids--
> INFO:  Index bigint_unique_ids__table_name: Pages 29; Tuples 1: Deleted 5354.
>         CPU 0.01s/0.04u sec elapsed 0.05 sec.
> INFO:  Removed 11348 tuples in 79 pages.
>         CPU 0.00s/0.02u sec elapsed 0.02 sec.
> INFO:  Pages 79: Changed 1, Empty 0; Tup 1: Vac 11348, Keep 0, UnUsed 0.
>         Total CPU 0.03s/0.06u sec elapsed 0.14 sec.

Vacuum (regular, not full) frequently enough that the 'Pages' value doesn't increase past 1 and you'll be fine. A sequential scan on a very small table is what you want to have.

In this particular case, vacuum removed over 11000 dead versions of the tuple. An 8k page will hold approx 140 tuples based on your structure. So, for every ~100 updates you'll want to run vacuum (regular, not full) on the table.

-- 
Rod Taylor <rbt@rbt.ca>

PGP Key: http://www.rbt.ca/rbtpub.asc
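One way to watch whether the table is in fact staying at a single page between vacuums is to check the planner statistics in pg_class, a minimal sketch using the table name from the thread; note that relpages and reltuples are only refreshed by VACUUM or ANALYZE.

-- should report relpages = 1 for a well-vacuumed one-row table
SELECT relname, relpages, reltuples
FROM pg_class
WHERE relname = 'bigint_unique_ids';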
On Sat, May 31, 2003 at 16:56:56 -0600, Dave E Martin XXIII <postgresql-to.dave@dave.to> wrote:
>
> (OK, experimented a bit more just now.) Hm, it appears that degradation
> occurs with the index as well; I guess at the time I created the index,
> it just initially did better because it got to skip all the already-dead
> rows at creation time. But this is disturbing: I do a vacuum, and the
> access times are better, but still horrible:

You really don't want to use an index, so this probably doesn't matter for the current application. The problem is that when data is inserted into an index that just increases (or decreases) in value, space from deleted entries doesn't get reused. I believe this is fixed in 7.4. This case would apply to indexes based on counters, dates, or times, where new values are added and old values get deleted.
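Until that fix, rebuilding the index reclaims the space held by dead entries, as the REINDEX earlier in the thread showed; a minimal sketch, again using the index name from the thread:

-- rebuild the bloated index; its size in pages can be checked before and after
REINDEX INDEX bigint_unique_ids__table_name;
SELECT relpages FROM pg_class WHERE relname = 'bigint_unique_ids__table_name';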
On Sat, May 31, 2003 at 17:17:38 -0600, Dave E Martin XXIII <postgresql-to.dave@dave.to> wrote:
> Speaking of which, since the row is locked with select for update (so it
> can't be involved in any other transactions anyway) and the change
> doesn't change the length of the row, can't it just be updated in place,
> or would that break something else? (Pardon if this is answered already;
> methinks it's time to go reread the todo's and the architecture
> documents...)

No. Select for update only blocks writers, not readers. This has important performance advantages. You might want to read the documentation on MVCC. Tom Lane also has a copy of a presentation he made on the web somewhere; I haven't read it, but it has gotten favorable mention on the lists before.
Rod Taylor wrote:

> An 8k page will hold approx 140 tuples based on your structure. So,
> for every ~100 updates you'll want to run vacuum (regular, not full) on
> the table.

Alas, for this application, that means a vacuum once every 5 seconds or so. I'll see if I can set up a separate little task to do that (I assume at this rate, it's better to just keep a connection open than set up/tear down). I don't suppose there is a way to get a trigger to do a vacuum (which doesn't want to be in a transaction)? (Thinking it could check for id mod 100 = 0 or something.) I also assume a few pages isn't going to be that bad (just don't let it get to 11000 8).
On Sun, Jun 01, 2003 at 01:20:03 -0600, Dave E Martin XXIII <postgresql-to.dave@dave.to> wrote:
> Rod Taylor wrote:
>
>> An 8k page will hold approx 140 tuples based on your structure. So,
>> for every ~100 updates you'll want to run vacuum (regular, not full) on
>> the table.
>
> Alas, for this application, that means a vacuum once every 5 seconds or
> so. I'll see if I can set up a separate little task to do that (I assume
> at this rate, it's better to just keep a connection open than set
> up/tear down). I don't suppose there is a way to get a trigger to do a
> vacuum (which doesn't want to be in a transaction)? (Thinking it could
> check for id mod 100 = 0 or something.) I also assume a few pages isn't
> going to be that bad (just don't let it get to 11000 8).

Maybe you should reconsider how badly you want the app to be totally database agnostic? Using a sequence might be less of a contortion than using vacuum a few times a minute. You are likely to have similar performance issues with other databases, so this section of code may not turn out to be very portable in any case.
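For comparison, the PostgreSQL-specific path suggested here is a sequence, which produces ids without creating dead rows in any table; a minimal sketch with an illustrative sequence name:

CREATE SEQUENCE connection_data_next_id;
-- each call hands back the next value; no table row is updated, so nothing to vacuum
SELECT nextval('connection_data_next_id');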
Bruno Wolff III wrote:

> Maybe you should reconsider how badly you want the app to be totally database
> agnostic? Using a sequence might be less of a contortion than using vacuum
> a few times a minute. You are likely to have similar performance issues
> with other databases, so this section of code may not turn out to be very
> portable in any case.

Maybe I can further abstract out the unique-id-generation portion, since unique-id generation does seem to be a pretty common database extension (for some reason...), and then provide a generic schema definition and a postgresql-specific one (along with whatever others I can drum up). The generic one will rely on the software to come up with the unique id in the fashion I'm currently doing.

Speaking of which, is there a better way than what I'm currently doing (when the database doesn't have any such support)? I've heard of one method based on something like "select max(id)+1 from table", but this seems error prone; at the very least, you'd have to have a unique index and be prepared to redo on failure, which could get messy if it's a big transaction, and frequent if there is a lot of concurrent inserting going on.
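A rough sketch of the "select max(id)+1" approach mentioned above, with the caveats the post itself raises: the unique constraint is what catches collisions, and the application must retry on a duplicate-key failure. The table and column names here are hypothetical, for illustration only.

-- hypothetical target table
CREATE TABLE connection_data (
    id   bigint NOT NULL UNIQUE,
    data text
);
-- allocate the id and insert in one statement; under concurrent inserts this
-- can fail with a duplicate-key error, in which case the application retries
INSERT INTO connection_data (id, data)
SELECT COALESCE(MAX(id), 0) + 1, 'example row' FROM connection_data;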
On Mon, 2 Jun 2003, Dave E Martin XXIII wrote:

> Bruno Wolff III wrote:
>
>> Maybe you should reconsider how badly you want the app to be totally database
>> agnostic? Using a sequence might be less of a contortion than using vacuum
>> a few times a minute. You are likely to have similar performance issues
>> with other databases, so this section of code may not turn out to be very
>> portable in any case.
>
> Maybe I can further abstract out the unique-id-generation portion, since
> unique-id generation does seem to be a pretty common database extension
> (for some reason...), and then provide a generic schema definition and
> a postgresql-specific one (along with whatever others I can drum up).
> The generic one will rely on the software to come up with the unique id
> in the fashion I'm currently doing.
>
> Speaking of which, is there a better way than what I'm currently doing
> (when the database doesn't have any such support)? I've heard of one
> method based on something like "select max(id)+1 from table", but this
> seems error prone; at the very least, you'd have to have a unique index
> and be prepared to redo on failure, which could get messy if it's a big
> transaction, and frequent if there is a lot of concurrent inserting
> going on.

For a generic solution you could have an extra table that feeds you ids, and update it every time you take a value. (Maybe a trigger could be used?) Because the counter row stays locked until the transaction commits, no two concurrent requests would get the same answer. Implementation could be interesting but relatively simple:

BEGIN;
SELECT id FROM unqid WHERE name = 'seq_name' FOR UPDATE;
UPDATE unqid SET id = id + 1 WHERE name = 'seq_name';
COMMIT;

Peter Childs
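A minimal sketch of the counter table that snippet assumes, with one row per named "sequence". The column types are inferred from the rest of the thread, and making name the primary key gives it the index discussed earlier.

CREATE TABLE unqid (
    name text PRIMARY KEY,
    id   bigint NOT NULL
);
-- seed one counter row per logical sequence
INSERT INTO unqid (name, id) VALUES ('seq_name', 1);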