
Re: Postgres as In-Memory Database? - Mailing list pgsql-general

From	Stefan Keller
Subject	Re: Postgres as In-Memory Database?
Date	April 7, 2014 23:44:04
Msg-id	CAFcOn2-neuGxxKNoni1d2o4zrF1fMTDZpjw+sjpPNwwt0yrReg@mail.gmail.com
In response to	Re: Postgres as In-Memory Database? (Hadi Moshayedi <hadi@citusdata.com>)
Responses	Re: Postgres as In-Memory Database? (Andrew Sullivan <ajs@crankycanuck.ca>)
List	pgsql-general


Hi Hadi, hi all

It makes sense to me to design cstore_fdw for data volumes that are larger than main memory.

Coming back to my original thread, I'd like to ponder further what makes in-memory databases special, and how Postgres could be configured or extended to implement that.

For example, I found some brand-new SQL Server features called "Memory-optimized tables", which "fully reside in memory and can’t be paged out", are garbage collected, have special indexes, persist changes using the transaction log and checkpoint streams, and are monitored so that they do not run out of memory [1][2]. That is pretty much what has been discussed here, although a little bit reluctantly :-)

Yours, Stefan

[1] "SQL Server In-Memory OLTP Internals Overview for CTP2" (PDF) http://t.co/T6zToWc6y6

[2] "SQL Server 2014 In-Memory OLTP: Memory Management for Memory-Optimized Tables" http://blogs.technet.com/b/dataplatforminsider/archive/2013/11/14/sql-server-2014-in-memory-oltp-memory-management-for-memory-optimized-tables.aspx

2014-04-07 17:40 GMT+02:00 Hadi Moshayedi <hadi@citusdata.com>:

Hey Stefan,

@Hadi: Can you say something about usage of cstore FDW in-memory?

We designed cstore_fdw with applications in mind where the volume of data is much larger than main memory. In general, columnar stores bring two benefits:

1. Doing less disk I/O than row stores. We can skip reading entire columns or column blocks that are not relevant to the given query. This is effective when (a) the volume of data is larger than main memory, so the OS cannot cache the whole dataset, and (b) most queries require only a small subset of columns to complete.

2. Vector processing and making better use of the CPU. This usually helps most when the data is in memory. If the data is on disk and not cached, I/O cost is usually higher than CPU cost, and vector processing may not help much.
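
To make the two kinds of skipping in point 1 concrete, here is a minimal Python sketch. It is not cstore_fdw's actual storage format or code: the toy table, `BLOCK_SIZE`, the per-block `block_max_id` metadata, and both function names are assumptions made up for illustration. It only counts how many values each layout has to touch to answer "sum of price where id >= min_id".

```python
from array import array

NUM_ROWS = 1000
BLOCK_SIZE = 100  # hypothetical block size, chosen only for this sketch

# Row layout: every row carries all four columns together.
row_store = [(i, i * 1.5, i % 7, 0.1) for i in range(NUM_ROWS)]

# Column layout: one contiguous array per column.
column_store = {
    "id": array("q", (r[0] for r in row_store)),
    "price": array("d", (r[1] for r in row_store)),
    "qty": array("q", (r[2] for r in row_store)),
    "discount": array("d", (r[3] for r in row_store)),
}

# Assumed per-block metadata: the largest id in each block, built at load time.
block_max_id = [
    max(column_store["id"][start:start + BLOCK_SIZE])
    for start in range(0, NUM_ROWS, BLOCK_SIZE)
]


def sum_price_rows(rows, min_id):
    """Row store: all columns of every row are read, needed or not."""
    values_read, total = 0, 0.0
    for row in rows:
        values_read += len(row)
        if row[0] >= min_id:
            total += row[1]
    return total, values_read


def sum_price_columns(cols, max_ids, min_id):
    """Column store: read only 'id' and 'price', and skip whole blocks
    whose maximum id shows they cannot contain a matching row."""
    ids, prices = cols["id"], cols["price"]
    values_read, total = 0, 0.0
    for block_no, start in enumerate(range(0, len(ids), BLOCK_SIZE)):
        if max_ids[block_no] < min_id:
            continue  # block skipped using metadata only
        end = start + BLOCK_SIZE
        for row_id, price in zip(ids[start:end], prices[start:end]):
            values_read += 2  # one id value and one price value
            if row_id >= min_id:
                total += price
    return total, values_read
```

Calling both functions with the same `min_id` returns the same total, but the columnar version reports far fewer values read, because it ignores the `qty` and `discount` columns and skips blocks that cannot match. In a real system that difference shows up as disk I/O only when the data is not already cached, which is the condition described in (a).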

cstore_fdw tries to optimize for #1. Also note that because we use compression, more data can be cached in memory and the chance of hitting disk decreases.

But we don't do vector processing yet, and it is not on our three-month timeline.

If you want to be able to use more CPU cores in PostgreSQL, you can have a look at CitusDB [1], which is built upon PostgreSQL and distributes queries to use all CPU cores on one or more machines.

[1] http://citusdata.com/

-- Hadi
