Re: Unlogged vs. In-Memory - Mailing list pgsql-advocacy

From: Robert Haas
Subject: Re: Unlogged vs. In-Memory
Msg-id: BANLkTinj-k3_gW73_0GRyJ4vUSGBhZyqrA@mail.gmail.com
In response to: Re: Unlogged vs. In-Memory (Simon Riggs <simon@2ndQuadrant.com>)
List: pgsql-advocacy

On Wed, May 4, 2011 at 4:19 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
>> Well, the _init fork can go arbitrarily long without being used, so
>> you can't put any unfrozen tuples in there.  There may be some game
>> that can be played here, but it's not simple, especially since the
>> heap and indices have to stay in sync.
>
> I don't think that's a sufficient response. It's clear that people
> expect unlogged tables would be used in conjunction with RAM disks,
> but they clearly don't work in that situation.
>
> That is exactly the main use case of "cache tables".

I think it's a bit harsh to say that they "don't work".  As I
understand it, the use case that Rob is seeking here is the ability to
create a tablespace on a RAM disk and put unlogged tables (only) into
it and have everything continue to work after a reboot obliterates the
contents of the RAM disk.  Fair enough - I can understand why that
would be useful, but I don't think we've ever promised anyone that
blowing away a tablespace was a safe operation.  It might actually be
safe if only temporary tables are involved... assuming that the mount
point was the PG_<version>_<catversion> directory, rather than the
tablespace directory proper... but I doubt that we've ever documented
that anywhere, or promised that it would continue working in future
releases.  It's a new idea to me, anyhow.

>> I actually think there is very little low-hanging fruit to be found in
>> terms of improving unlogged tables.
>
> Solving Rob's complaint seems very easy to me.

Maybe not.  I think what you're proposing would essentially amount to
always storing the init forks in $PGDATA, even if the actual
tablespace is elsewhere.  I agree that would solve Rob's problem, but
I'm not sure that it's the behavior that everyone wants in general.

>> The things that I didn't tackle
>> got skipped because they were really hard or low-value or had
>> significant downsides or all three.
>
>> We're not going to find a general
>> solution to this problem that is cheaper than WAL-logging everything;
>> that's why WAL-logging is basically the only form of crash-safety used
>> by any modern database product.
>
> That's not accurate. Many products provide a means to load bulk data
> without hitting the transaction log, without the need to truncate the
> table.

I agree that a bulk loading path that bypasses WAL is useful.
I'm not sure whether we want to try to grow unlogged tables into a
solution to that problem, or tackle it in some other way.

>> I think that the solution to the
>> problem of "I don't want to lose the whole table when the database
>> crashes" is going to involve partitioning - have a logged partition
>> and an unlogged partition, and periodically move stuff over.  Even if we
>> ultimately provide some automated way to have that happen under the
>> covers, I think that's still what it's going to be doing.  I might be
>> all wet, of course, but that's what I think.
>
> That's very roughly what NOLOGGING hint does on an Oracle table, but
> without the partitioning.

How does that work?
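
For readers following the thread, the logged-plus-unlogged partitioning
idea quoted above (not Oracle's NOLOGGING, which is the open question)
can be illustrated with a toy model. This is only a sketch in plain
Python, not PostgreSQL code: the class TwoTierTable and its methods
insert, move_over, crash, and get are hypothetical names invented for
illustration, and the two dictionaries merely stand in for a WAL-logged
partition and an unlogged partition.

```python
class TwoTierTable:
    """Toy model: a crash-safe 'logged' tier plus a fast 'unlogged' tier."""

    def __init__(self):
        # Stands in for a WAL-logged partition: contents survive a crash.
        self.logged = {}
        # Stands in for an unlogged partition: contents are lost on a crash.
        self.unlogged = {}

    def insert(self, key, value):
        # New rows land in the cheap, non-durable tier first.
        self.unlogged[key] = value

    def move_over(self):
        # Periodic job: copy rows into the durable tier, then empty the
        # volatile tier. Returns the number of rows moved.
        moved = len(self.unlogged)
        self.logged.update(self.unlogged)
        self.unlogged.clear()
        return moved

    def crash(self):
        # Assumption of this model: a crash empties the unlogged tier
        # entirely and leaves the logged tier untouched.
        self.unlogged.clear()

    def get(self, key):
        # Reads consult the newest data first, then the durable tier.
        if key in self.unlogged:
            return self.unlogged[key]
        return self.logged.get(key)


t = TwoTierTable()
t.insert("a", 1)
t.insert("b", 2)
t.move_over()        # "a" and "b" are now in the durable tier
t.insert("c", 3)     # not yet moved over
t.crash()            # "c" is lost; "a" and "b" remain
survivors = sorted(t.logged)
```

The point of the model is the trade-off: only rows inserted since the
last move_over are exposed to loss, so the interval between moves bounds
how much data a crash can take with it.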

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
