Thread: Trading off large objects (arrays, large strings, large tables) for timeseries
From
Antonios Christofides
Date:
My questions briefly:

(1) I made experiments with large (millions of rows/elements) arrays of text (text[], each element is 20-30 characters). On 7.4 (Debian Sarge prepackaged), inserting such an array takes forever (10 thousand elements per minute), but accessing an element, writing an element, or appending an element is done instantly. On 8.0.1 (compiled by me), inserting is very fast (two million per minute, on the same machine of course), but selecting any element takes more than a second, and updating any element, or appending, takes 25 seconds. Why 25 seconds for appending an element? Does it rewrite the entire array?

(2) I also tried using a large (80M) text instead (i.e. instead of storing an array of lines, I store a huge plain text file). What surprised me is that I can get the 'tail' of the file (using substring) in only around one second, although it is transparently compressed (to 17M). It doesn't decompress the entire string, does it? Does it store it somehow chunked? It also takes 25 seconds to append something (update ... set string=string||'...'). Is there any way to do clever, faster appending, like I can in a text file?

What I'm trying to do is find a good way to store timeseries. A timeseries is essentially a series of (date, value) pairs; more specifically, it is an array of records, each record consisting of three items: date TIMESTAMP, value DOUBLE PRECISION, flags TEXT. The flags value is null in more than 99% of the records, but occasionally it contains flags or short comments. My experiments above are with comma-separated values in plain ASCII; I haven't experimented with an array of a composite type yet.

The most important operations are:

(1) Retrieving or inserting/replacing the entire timeseries
(2) Selecting the last record
(3) Appending a record

I'm not interested in selecting part of a timeseries, or updating a record in the middle. Such operations do happen, but you can do them by selecting/replacing the entire timeseries instead.
But (2) and (3) will be very frequent, and I don't want to do them through (1).

I'm also considering a table, of course, where each timeseries record will be one row. I have experimented only a little with that (id integer, date timestamp, value double precision, flags text, primary key(id, date)). It appears to be fast enough; a problem is that it appears to consume much disk space (about 4 times more than an array of text, which in turn takes about 4 times more than a single huge text).
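A sketch of the row-per-record layout the message above considers, with queries for operations (1)-(3). Only the column list comes from the message; the exact DDL and the sample values are assumptions:

```sql
-- One timeseries record per row, as described above (DDL is a guess).
CREATE TABLE timeseries_record (
    id    integer,                  -- which timeseries the record belongs to
    date  timestamp,
    value double precision,
    flags text,                     -- null in more than 99% of records
    PRIMARY KEY (id, date)
);

-- (2) Selecting the last record: the (id, date) primary-key index
-- lets this stop after one row.
SELECT date, value, flags
FROM timeseries_record
WHERE id = 1
ORDER BY id DESC, date DESC
LIMIT 1;

-- (3) Appending a record is a single-row insert.
INSERT INTO timeseries_record (id, date, value, flags)
VALUES (1, '2005-02-15 12:00', 42.0, NULL);

-- (1) Retrieving the entire timeseries.
SELECT date, value, flags
FROM timeseries_record
WHERE id = 1
ORDER BY date;
```

With this layout (2) and (3) never touch more than one row, at the cost of the per-row overhead the message complains about.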
Re: Trading off large objects (arrays, large strings, large tables) for timeseries
From
Tom Lane
Date:
Antonios Christofides <anthony@itia.ntua.gr> writes:
> Why 25 seconds for appending an element?

Would you give us a specific test case, rather than a vague description of what you're doing?

> (2) I also tried using a large (80M) text instead (i.e. instead of
> storing an array of lines, I store a huge plain text file). What
> surprised me is that I can get the 'tail' of the file (using
> substring) in only around one second, although it is transparently
> compressed (to 17M). It doesn't decompress the entire string, does
> it? Does it store it somehow chunked?

http://www.postgresql.org/docs/8.0/static/storage-toast.html

> What I'm trying to do is find a good way to store timeseries. A
> timeseries is essentially a series of (date, value) pairs, and more
> specifically it is an array of records, each record consisting of
> three items: date TIMESTAMP, value DOUBLE PRECISION, flags TEXT.

In practically every case, the answer is to use a table with rows of that form. SQL just isn't designed to make it easy to do something else.

			regards, tom lane
Re: Trading off large objects (arrays, large strings, large tables) for timeseries
From
Shridhar Daithankar
Date:
On Tuesday 15 Feb 2005 8:03 pm, Antonios Christofides wrote:
> I'm also considering a table, of course, where each timeseries record
> will be one row. I have experimented only a little with that (id
> integer, date timestamp, value double precision, flags text, primary
> key(id, date)). It appears to be fast alright, a problem being that it
> appears to consume much disk space (about 4 times more than an array
> of text, which is about 4 times more than a single huge text).

Perhaps you could attempt to store a fixed small number of records per row, say 4-6? Or maybe a smaller fixed-size array. That should make the row overhead less intrusive...

As far as replacing the time series is concerned, you can just drop the table, which should be pretty fast...

Just a thought...

Shridhar
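A hypothetical sketch of the chunked layout Shridhar suggests (all names are invented here): several records share one row, so the per-row overhead is amortized:

```sql
-- Hypothetical chunked layout: a fixed small number of records per row.
CREATE TABLE timeseries_chunk (
    id       integer,               -- which timeseries
    chunk_no integer,               -- ordinal position of this chunk
    records  text[],                -- e.g. 4-6 'date,value,flags' strings
    PRIMARY KEY (id, chunk_no)
);

-- The most recent records live in the highest-numbered chunk,
-- so the tail of the series is one small row.
SELECT records
FROM timeseries_chunk
WHERE id = 1
ORDER BY chunk_no DESC
LIMIT 1;
```

Appending would mean updating (or, when full, adding) only the last small chunk, never rewriting the whole series.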
Re: Trading off large objects (arrays, large strings, large tables) for timeseries
From
Pavel Stehule
Date:
Hello,

I did a package for packing and unpacking large time series (in text format) into a binary object. You can use it. It's very fast: storing 3 cols and 90000 rows takes ~10 sec (P160).

testdb011=# select * FROM time_series_unpack(
testdb011(# time_series_pack(
testdb011(# '10.23, 10.21, 10.222'||chr(10)||
testdb011(# '1.1, 2.5,3.10 '||chr(10)||
testdb011(# '1.1, 1.2,1.5'),1);
 row |   f
-----+-------
   1 | 10.23
   2 |   1.1
   3 |   1.1
(3 rows)

testdb011=# select * FROM time_series_unpack(
testdb011(# time_series_pack(
testdb011(# '10.23, 10.21, 10.222'||chr(10)||
testdb011(# '1.1, 2.5,3.10 '||chr(10)||
testdb011(# '1.1, 1.2,1.5'),3);
 row |   f
-----+--------
   1 | 10.222
   2 |    3.1
   3 |    1.5
(3 rows)

testdb011=# select avg(f), min(f), max(f), count(f) from (
testdb011(# select * FROM time_series_unpack(
testdb011(# time_series_pack(
testdb011(# '10.23, 10.21, 10.222'||chr(10)||
testdb011(# '1.1,2.5,3.10 '||chr(10)||
testdb011(# '1.1, 1.2, 1.5'),3)) d;
       avg        | min |  max   | count
------------------+-----+--------+-------
 4.94066667556763 | 1.5 | 10.222 |     3
(1 row)

Regards
Pavel Stehule
Attachment
Re: Trading off large objects (arrays, large strings, large tables) for timeseries
From
Antonios Christofides
Date:
Tom Lane wrote:
> Antonios Christofides <anthony@itia.ntua.gr> writes:
> > Why 25 seconds for appending an element?
>
> Would you give us a specific test case, rather than a vague description
> of what you're doing?

OK, sorry, here it is (on another machine, thus times are different; 8.0.1 on a PIV 1.6GHz, 512 MB RAM, Debian woody, kernel 2.4.18):

CREATE TABLE test(id integer not null primary key, records text[]);

INSERT INTO test(id, records) VALUES (1,
  '{"1993-09-30 13:20,182,", "1993-09-30 13:30,208,",
    "1993-09-30 13:51,203,", [snipping around 2 million rows]
    "2057-02-13 02:31,155,", "2099-12-08 10:39,198,"}');
[Took 60 seconds]

SELECT array_dims(records) FROM test;
 array_dims
-------------
 [1:2000006]
(1 row)

UPDATE test SET records[2000007] = 'hello, world!';
[11 seconds]

UPDATE test SET records[1000000] = 'hello, world!';
[15 seconds (but the difference may be because of system load - I don't have a completely idle machine available right now)]

I thought the two above UPDATE commands would be instant.
Re: Trading off large objects (arrays, large strings, large tables) for timeseries
From
Antonios Christofides
Date:
Shridhar Daithankar wrote:
> Perhaps you could attempt to store a fix small number of records per row,
> say 4-6? Or may be a smaller fixed size array, That should make the row
> overhead less intrusive...

Thanks; I didn't like your idea, but it helped me come up with another one:

(timeseries_id integer, top text, middle text, bottom text);

The entire timeseries is the concatenation of 'top' (a few records), 'middle' (millions of records), and 'bottom' (a few records). To get the last record, or to append a record, you only read/write 'bottom', which is very fast. Whenever the entire timeseries is written (a less frequent operation), the division into these three parts is redone, keeping 'bottom' small.
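The three-part layout above could look like this; only the column list comes from the message, and the record format and queries are assumptions:

```sql
-- Split each series so the frequently-touched tail is a small value.
CREATE TABLE timeseries (
    timeseries_id integer PRIMARY KEY,
    top    text,   -- first few records
    middle text,   -- the bulk: millions of records, rarely rewritten
    bottom text    -- last few records, kept small
);

-- Appending a record rewrites only the small 'bottom' value, so TOAST
-- never has to rebuild the huge 'middle'.
UPDATE timeseries
SET bottom = bottom || '2005-02-15 12:00,42.0,' || chr(10)
WHERE timeseries_id = 1;

-- The last record is the last line of 'bottom' (parsing is left to
-- the client, which only receives the short tail).
SELECT bottom FROM timeseries WHERE timeseries_id = 1;
```

The whole-series read (operation (1)) would then be the concatenation top || middle || bottom, and rewriting the series is the moment to move records out of 'bottom' back into 'middle'.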
Re: Trading off large objects (arrays, large strings, large tables) for timeseries
From
Tom Lane
Date:
Antonios Christofides <anthony@itia.ntua.gr> writes:
> CREATE TABLE test(id integer not null primary key, records text[]);
> UPDATE test SET records[2000007] = 'hello, world!';
> [11 seconds]
> UPDATE test SET records[1000000] = 'hello, world!';
> [15 seconds (but the difference may be because of system load - I
> don't have a completely idle machine available right now)]
> I thought the two above UPDATE commands would be instant.

Hardly likely, seeing that text[] has variable-length array entries; the only way to isolate and replace the 2000007'th entry is to scan through all the ones before it.

However, the fact that the two cases seem to be about the same speed suggests to me that the bulk of the time is going into loading/decompressing/compressing/storing the array datum. You might try experimenting with the column storage option (see ALTER TABLE) --- using EXTERNAL instead of the default EXTENDED would suppress the compression/decompression step. I suspect that will be a net loss because it will eliminate CPU overhead by trading it off for more I/O ... but it would be worth doing the experiment to find out.

On the whole, though, I think you're going to have to abandon this approach. The TOAST mechanism really isn't designed to support partial updates of huge fields efficiently. It forces any change in the value to be an update of the whole value.

			regards, tom lane
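The storage-option experiment Tom suggests would look something like the following; the table and column names are taken from the test case earlier in the thread:

```sql
-- EXTERNAL = out-of-line TOAST storage without compression, versus
-- the default EXTENDED (out-of-line, compressed). This only governs
-- how new values are stored.
ALTER TABLE test ALTER COLUMN records SET STORAGE EXTERNAL;

-- Rewrite the value so it is re-stored under the new strategy;
-- values written before the ALTER keep their old representation.
UPDATE test SET records = records WHERE id = 1;
```

This trades the compression/decompression CPU cost for reading and writing the full 80M instead of 17M, which is why Tom suspects it will be a net loss.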