Re: crash-safe visibility map, take four - Mailing list pgsql-hackers
| From | Jesper Krogh |
|---|---|
| Subject | Re: crash-safe visibility map, take four |
| Date | |
| Msg-id | 4D89902C.7000004@krogh.cc |
| In response to | crash-safe visibility map, take four (Robert Haas <robertmhaas@gmail.com>) |
| Responses | Re: crash-safe visibility map, take four |
| List | pgsql-hackers |
On 2011-03-22 21:43, Robert Haas wrote:

> I took a crack at implementing the first approach described above,
> which seems to be by far the simplest idea we've come up with to date.
> Patch attached. It doesn't seem to be that complicated, which could
> mean either that it's not that complicated or that I'm missing
> something. Feel free to point and snicker in the latter case.

Looks simple, but there is no benefit on the usage side in the patch yet, so it isn't really "testable"? I would love to spend some time testing when it's doable (even with rough corners).

I'm still a bit puzzled about how a page-level visibility map bit would end up working for index scans. There is a clear drop-off in usability as the change rate of the table goes up, which may or may not be relevant, but I cannot really judge, since I haven't even got a ballpark figure for how much table churn would disable, say, 50% of the usage.

= Really naive suggestion approaching =

Another layout might be to simply drag out t_xmin and t_xmax per row (8 bytes) into a table by itself. This table would be way bigger than the one-bit-per-page map, but could be WAL-logged like any other change in the system? It would, by definition, make the visibility testing work (way faster than today), no matter how fast the underlying table changes.

State of today (PG 8.4) is that a query like this:

testdb=# select count(id) from testdb.seq where fts @@ to_tsquery('tag');
 count
-------
 69753
(1 row)

Time: 5863.600 ms

testdb=# select count(id) from testdb.seq where fts @@ to_tsquery('tag');
 count
-------
 69753
(1 row)

Time: 659.832 ms

testdb=# select count(id) from testdb.seq where fts @@ to_tsquery('tag');
 count
-------
 69753
(1 row)

Time: 1005.765 ms

Somewhere around 15ns / tuple (not bad at all).
(The first run was probably "half warm".)

The "average" number of tuples per page is somewhere between 4 and 8 for this table. Assuming 8, and that the 69K matching rows are randomly distributed among the 16M other tuples (a fair assumption in this case), the 600-1000 ms for the fresh-cache runs is the time it takes to drag 69753 * 8192 (page size) = 571MB into memory for visibility testing alone; on a warm cache all pages are already in main memory.

Packing 16M tuples at 8 bytes / tuple into a map would be around 128MB. Given 8 bytes/row and a random distribution of data, that would require us to read all 128MB, so a speedup of x4 on this example, but it would roughly let us count the entire table in the same time.

With regard to disk vs. memory hotness: those 128MB compare to a table size of 32GB (with a toast table next to it of 64GB), but that shouldn't be touched by the above query. The ns/tuple number (today) on a "thin" table in my system is approaching 1ns / tuple.

If the page-level bitmap would get set "quite fast" on a fairly busy system anyway, then the above is just noise in the air, but I currently have no feel for it, and there is some math in there that I have trouble putting reliable ballpark numbers on.

By all approaches, there is room for significant improvement in visibility testing for a huge range of installations. Can I drag out numbers on the "frozenness of tuples" from my current systems to fill in the discussion? (How?)

Jesper

--
Jesper
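The cold-vs-proposed I/O argument above can be checked with a few lines of arithmetic. This is only a back-of-the-envelope sketch using the numbers quoted in the mail (69753 matching rows, 8192-byte pages, ~16M tuples, 8 bytes of xmin/xmax per tuple), with 1 MB taken as 10^6 bytes:

```python
# Back-of-the-envelope check of the sizing math in the mail.
matching_rows = 69_753       # rows returned by the tsquery
page_size = 8_192            # heap page size in bytes
total_tuples = 16_000_000    # ~16M tuples in the table

# Worst case with randomly distributed matches: one full heap page
# must be read per matching row just to test visibility.
heap_bytes = matching_rows * page_size
print(f"heap read: {heap_bytes / 1e6:.0f} MB")      # ~571 MB

# Proposed side table: t_xmin + t_xmax = 8 bytes per tuple, read in full.
map_bytes = total_tuples * 8
print(f"xmin/xmax map: {map_bytes / 1e6:.0f} MB")   # 128 MB

# Ratio of bytes read: the "speedup of x4" claim for this query.
print(f"I/O ratio: {heap_bytes / map_bytes:.1f}x")  # ~4.5x
```

Note the ratio only bounds the I/O saved on this particular query; as the mail says, reading the whole 128MB map is roughly the cost of visibility-testing the entire table, not just the 69K matches.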