Re: Postgres as In-Memory Database? - Mailing list pgsql-general

From Stefan Keller
Subject Re: Postgres as In-Memory Database?
Date
Msg-id CAFcOn2-neuGxxKNoni1d2o4zrF1fMTDZpjw+sjpPNwwt0yrReg@mail.gmail.com
In response to Re: Postgres as In-Memory Database?  (Hadi Moshayedi <hadi@citusdata.com>)
Responses Re: Postgres as In-Memory Database?  (Andrew Sullivan <ajs@crankycanuck.ca>)
List pgsql-general
Hi Hadi, hi all

It makes sense to me to design cstore_fdw for data volumes that are larger than main memory.

Coming back to my original thread, I'd like to ponder further on what makes in-memory special - and how to configure or extend Postgres to implement that.

For example, I found some brand-new features of SQL Server called "Memory-optimized tables", which "fully reside in memory and can’t be paged out", are garbage collected, have special indexes, persist changes using the transaction log and checkpoint streams, and are monitored so that they do not run out of memory [1][2]. That is pretty much what has been discussed here, although a little bit reluctantly :-)

Yours, Stefan

[1] "SQL Server In-Memory OLTP Internals Overview for CTP2" (PDF) http://t.co/T6zToWc6y6
[2] "SQL Server 2014 In-Memory OLTP: Memory Management for Memory-Optimized Tables"



2014-04-07 17:40 GMT+02:00 Hadi Moshayedi <hadi@citusdata.com>:
Hey Stefan,

@Hadi: Can you say something about usage of cstore FDW in-memory?


We designed cstore_fdw with applications in mind where the volume of data is much larger than main memory. In general, columnar stores bring two benefits:

1. Doing less disk I/O than row stores. We can skip reading entire columns or column blocks that are not relevant to the given query. This is effective when (a) the volume of data is larger than main memory, so the OS cannot cache the whole dataset, and (b) most queries require only a small subset of columns to complete.

2. Vector processing and making better use of the CPU. This usually helps most when the data is in memory. If the data is on disk and not cached, the I/O cost is usually higher than the CPU cost, and vector processing may not help much.
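
To illustrate benefit #1, here is a minimal sketch in Python. It is a toy model only: plain lists stand in for on-disk storage, the table and its column names (id, country, amount) are made up for illustration, and nothing here reflects cstore_fdw's actual file format. It counts how many stored values a "sum one column" query has to touch in each layout.

```python
# Toy model only: Python lists stand in for on-disk storage.
# The table and its columns are hypothetical, not taken from cstore_fdw.

rows = [
    {"id": i, "country": "CH" if i % 2 else "DE", "amount": i * 10}
    for i in range(1000)
]

# Row layout: all values of a row are stored together.
row_store = [tuple(r.values()) for r in rows]

# Column layout: each column is stored as its own contiguous block.
column_store = {name: [r[name] for r in rows] for name in rows[0]}


def sum_amount_row_store(store):
    """Sum 'amount' from the row layout; returns (total, values_read)."""
    total, values_read = 0, 0
    for row in store:
        values_read += len(row)  # the whole row has to be read
        total += row[2]          # 'amount' is the third field
    return total, values_read


def sum_amount_column_store(store):
    """Sum 'amount' from the column layout; returns (total, values_read)."""
    column = store["amount"]     # 'id' and 'country' are never touched
    return sum(column), len(column)


row_total, row_reads = sum_amount_row_store(row_store)
col_total, col_reads = sum_amount_column_store(column_store)

print(f"row store:    total={row_total}, values read={row_reads}")
print(f"column store: total={col_total}, values read={col_reads}")
```

Both layouts return the same total, but the column layout reads one third of the values because the query needs one of three columns. In a real system the saving shows up as fewer disk pages read, which only matters while the data does not already sit in cache.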

cstore_fdw tries to optimize for #1. Also note that because we use compression, more data can be cached in memory and the chance of hitting disk decreases.
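
The effect of compression on caching can be sketched roughly as follows. This is an assumption-laden illustration: zlib is only a stand-in codec (no claim is made about which codec cstore_fdw uses), the column contents are invented, and the cache budget is an arbitrary number chosen for the example.

```python
import zlib

# Hypothetical column block: 100,000 country codes with few distinct values.
column = ("CH,DE,AT,FR," * 25000).encode("ascii")

# zlib is a stand-in here; the real codec may differ.
compressed = zlib.compress(column)

# Assumed cache budget in bytes, picked only for illustration.
cache_bytes = 100_000

fits_uncompressed = len(column) <= cache_bytes
fits_compressed = len(compressed) <= cache_bytes
ratio = len(column) / len(compressed)

print(f"uncompressed: {len(column)} bytes, fits in cache: {fits_uncompressed}")
print(f"compressed:   {len(compressed)} bytes, fits in cache: {fits_compressed}")
print(f"compression ratio: {ratio:.1f}x")
```

Columns with few distinct values, such as this one, compress very well, so a block that would not fit in the cache uncompressed can fit once compressed. Real data will compress less dramatically than this repetitive example.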

But we don't do vector processing yet, and it is not on our three-month timeline.

If you want to be able to use more CPU cores in PostgreSQL, you can have a look at CitusDB [1], which is built upon PostgreSQL and distributes queries to use all CPU cores on one or more machines.


-- Hadi

