Thread: Re: PostgreSQL as a local in-memory cache
"jgardner@jonathangardner.net" <jgardner@jonathangardner.net> writes: > My question is how can I configure the database to run as quickly as > possible if I don't care about data consistency or durability? That > is, the data is updated so often and it can be reproduced fairly > rapidly so that if there is a server crash or random particles from > space mess up memory we'd just restart the machine and move on. For such a scenario, I'd suggest you: - Set up a filesystem that is memory-backed. On Linux, RamFS or TmpFS are reasonable options for this. - The complication would be that your "restart the machine and move on" needs to consist of quite a few steps: - recreating the filesystem - fixing permissions as needed - running initdb to set up new PG instance - automating any needful fiddling with postgresql.conf, pg_hba.conf - starting up that PG instance - creating users, databases, schemas, ... When my desktop machine's not dead [as it is now :-(], I frequently use this very kind of configuration to host databases where I'm doing functionality testing on continually-freshly-created DBs and therefore don't actually care if they get thrown away. I have set up an "init.d"-style script which has an extra target to do the database "init" in order to make the last few steps mentioned as quick as possible. ~/dbs/pgsql-head.sh init goes an extra mile, using sed to rewrite postgresql.conf to change defaults. I expect that, if running on a ramdisk, you'd want to fiddle some of the disk performance parameters in postgresql.conf. It's certainly worth trying out the ramdisk to see if it helps with this case. Note that all you'll lose is durability under conditions of hardware outage - PostgreSQL will still care as much as always about data consistency. [Thinking about wilder possibilities...] I wonder if this kind of installation "comes into its own" for more realistic scenarios in the presence of streaming replication. If you know the WAL files have gotten to disk on another server, that's a pretty good guarantee :-). -- select 'cbbrowne' || '@' || 'cbbrowne.com'; http://cbbrowne.com/info/internet.html "MS apparently now has a team dedicated to tracking problems with Linux and publicizing them. I guess eventually they'll figure out this back fires... ;)" -- William Burrow <aa126@DELETE.fan.nb.ca>
Chris Browne wrote:
> "jgardner@jonathangardner.net" <jgardner@jonathangardner.net> writes:
>> My question is how can I configure the database to run as quickly as
>> possible if I don't care about data consistency or durability? That
>> is, the data is updated so often and it can be reproduced fairly
>> rapidly so that if there is a server crash or random particles from
>> space mess up memory we'd just restart the machine and move on.
>
> For such a scenario, I'd suggest you:
>
> - Set up a filesystem that is memory-backed. On Linux, RamFS or TmpFS
>   are reasonable options for this.
>
> - The complication would be that your "restart the machine and move
>   on" needs to consist of quite a few steps:
>
>   - recreating the filesystem
>   - fixing permissions as needed
>   - running initdb to set up new PG instance
>   - automating any needful fiddling with postgresql.conf, pg_hba.conf
>   - starting up that PG instance
>   - creating users, databases, schemas, ...

Doesn't PG now support putting both WAL and user table files onto file
systems other than the one holding the PG config files and PG 'admin'
tables? Wouldn't doing so simplify the above considerably by allowing
just the WAL and user tables on the memory-backed file systems? I
wouldn't think the performance impact of leaving the rest of the stuff
on disk would be that large.

Or does losing WAL files mandate a new initdb?

--
Steve Wampler -- swampler@noao.edu
The gods that smiled on your birth are now laughing out loud.
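[Editorial aside: for the user-table half of that question, tablespaces are
the standard mechanism; the sketch below is illustrative only, with an
assumed tmpfs path and names. WAL itself is not placed via tablespaces --
the usual trick in this era is to symlink $PGDATA/pg_xlog to the desired
filesystem instead.]

    import psycopg2
    import psycopg2.extensions

    conn = psycopg2.connect(database='postgres')
    # CREATE TABLESPACE / CREATE DATABASE cannot run inside a transaction block
    conn.set_isolation_level(psycopg2.extensions.ISOLATION_LEVEL_AUTOCOMMIT)
    cur = conn.cursor()

    # directory must already exist and be owned by the postgres OS user
    cur.execute("CREATE TABLESPACE ramspace LOCATION '/mnt/pg_tmpfs/tablespace'")
    cur.execute("CREATE DATABASE cache TABLESPACE ramspace")
    # or per table: CREATE TABLE test (data int not null) TABLESPACE ramspace;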
[oops, didn't hit "reply to list" first time, resending...]

On 6/15/10 9:02 AM, Steve Wampler wrote:
> Chris Browne wrote:
>> [...]
>> - The complication would be that your "restart the machine and move
>>   on" needs to consist of quite a few steps:
>>
>>   - recreating the filesystem
>>   - fixing permissions as needed
>>   - running initdb to set up new PG instance
>>   - automating any needful fiddling with postgresql.conf, pg_hba.conf
>>   - starting up that PG instance
>>   - creating users, databases, schemas, ...

How about this: Set up a database entirely on a RAM disk, then install
a WAL-logging warm standby. If the production computer goes down, you
bring the warm standby online, shut it down, and use tar(1) to recreate
the database on the production server when you bring it back online.
You have speed and you have near-100% backup.

Craig
On Jun 15, 8:47 am, Chris Browne <cbbro...@acm.org> wrote:
> "jgard...@jonathangardner.net" <jgard...@jonathangardner.net> writes:
> > My question is how can I configure the database to run as quickly as
> > possible if I don't care about data consistency or durability? That
> > is, the data is updated so often and it can be reproduced fairly
> > rapidly so that if there is a server crash or random particles from
> > space mess up memory we'd just restart the machine and move on.
>
> For such a scenario, I'd suggest you:
>
> - Set up a filesystem that is memory-backed. On Linux, RamFS or TmpFS
>   are reasonable options for this.

I had forgotten about this. I will try this out.

> - The complication would be that your "restart the machine and move
>   on" needs to consist of quite a few steps:
>
>   - recreating the filesystem
>   - fixing permissions as needed
>   - running initdb to set up new PG instance
>   - automating any needful fiddling with postgresql.conf, pg_hba.conf
>   - starting up that PG instance
>   - creating users, databases, schemas, ...

I'm going to have a system in place to create these databases when I
restart the service.

> ...
>
> I wonder if this kind of installation "comes into its own" for more
> realistic scenarios in the presence of streaming replication. If you
> know the WAL files have gotten to disk on another server, that's a
> pretty good guarantee :-).

I have found that pre-computing and storing values in a general
relational-type database without durability is an ideal use case to
help improve services that need to return calculated results quickly.
A simple hash lookup is no longer sufficient. Perhaps PostgreSQL
running in this mode will be the ideal solution.

Nowadays, no one is really surprised that it takes 30 seconds or so to
replicate your data everywhere, but they do detest not getting answers
to their complicated queries immediately.
swampler@noao.edu (Steve Wampler) writes:
> Or does losing WAL files mandate a new initdb?

Losing WAL would mandate initdb, so I'd think this all fits into the
set of stuff worth putting onto ramfs/tmpfs. Certainly it'll all be
significant to the performance focus.

--
select 'cbbrowne' || '@' || 'cbbrowne.com';
http://cbbrowne.com/info/internet.html
"MS apparently now has a team dedicated to tracking problems with
Linux and publicizing them. I guess eventually they'll figure out this
back fires... ;)" -- William Burrow <aa126@DELETE.fan.nb.ca>
On Tue, Jun 15, 2010 at 12:37 PM, Chris Browne <cbbrowne@acm.org> wrote:
> swampler@noao.edu (Steve Wampler) writes:
>> Or does losing WAL files mandate a new initdb?
>
> Losing WAL would mandate initdb, so I'd think this all fits into the
> set of stuff worth putting onto ramfs/tmpfs. Certainly it'll all be
> significant to the performance focus.

Why is that? Isn't simply executing pg_resetxlog enough? Especially
since the OP doesn't care about losing some transactions.

--
Jaime Casanova         www.2ndQuadrant.com
PostgreSQL support and training
On 6/15/10 10:37 AM, Chris Browne wrote:
> swampler@noao.edu (Steve Wampler) writes:
>> Or does losing WAL files mandate a new initdb?
>
> Losing WAL would mandate initdb, so I'd think this all fits into the
> set of stuff worth putting onto ramfs/tmpfs. Certainly it'll all be
> significant to the performance focus.

I'd like to see some figures about WAL on RAMfs vs. simply turning off
fsync and full_page_writes. Per Gavin's tests, PostgreSQL is already
close to TokyoCabinet/MongoDB performance just with those turned off; I
wonder if actually having the WAL on a memory partition would make any
real difference in throughput.

I've seen a lot of call for this recently, especially since PostgreSQL
seems to be increasingly in use as a reporting server for Hadoop. Might
be worth experimenting with just making wal writing a no-op. We'd also
want to disable checkpointing, of course.

--
-- Josh Berkus
   PostgreSQL Experts Inc.
   http://www.pgexperts.com
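[Editorial aside: since every combination under discussion is just a handful
of settings, a small helper along these lines (purely illustrative, not part
of the thread) can record which ones were active for a given run; the
parameter list below is an assumption about what matters for this comparison.]

    import psycopg2

    conn = psycopg2.connect(database='postgres')
    cur = conn.cursor()
    # print the durability-related settings before a benchmark run
    for guc in ('fsync', 'synchronous_commit', 'full_page_writes',
                'wal_buffers', 'checkpoint_segments'):
        cur.execute("SHOW " + guc)
        print("%s = %s" % (guc, cur.fetchone()[0]))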
On Jun 15, 4:18 pm, j...@agliodbs.com (Josh Berkus) wrote:
> On 6/15/10 10:37 AM, Chris Browne wrote:
>
> I'd like to see some figures about WAL on RAMfs vs. simply turning off
> fsync and full_page_writes. Per Gavin's tests, PostgreSQL is already
> close to TokyoCabinet/MongoDB performance just with those turned off; I
> wonder if actually having the WAL on a memory partition would make any
> real difference in throughput.
>
> I've seen a lot of call for this recently, especially since PostgreSQL
> seems to be increasingly in use as a reporting server for Hadoop. Might
> be worth experimenting with just making wal writing a no-op. We'd also
> want to disable checkpointing, of course.

My back-of-the-envelope experiment: Inserting single integers into a
table without indexes using a prepared query via psycopg2.

Python Script:

import psycopg2
from time import time

conn = psycopg2.connect(database='jgardner')
cursor = conn.cursor()

cursor.execute("CREATE TABLE test (data int not null)")
conn.commit()

cursor.execute("PREPARE ins AS INSERT INTO test VALUES ($1)")
conn.commit()

start = time()
tx = 0
while time() - start < 1.0:
    cursor.execute("EXECUTE ins(%s)", (tx,))
    conn.commit()
    tx += 1
print tx

cursor.execute("DROP TABLE test")
conn.commit()

Local disk, WAL on same FS:
* Default config => 90
* full_page_writes=off => 90
* synchronous_commit=off => 4,500
* fsync=off => 5,100
* fsync=off and synchronous_commit=off => 5,500
* fsync=off and full_page_writes=off => 5,150
* fsync=off, synchronous_commit=off and full_page_writes=off => 5,500

tmpfs, WAL on same tmpfs:
* Default config: 5,200
* full_page_writes=off => 5,200
* fsync=off => 5,250
* synchronous_commit=off => 5,200
* fsync=off and synchronous_commit=off => 5,450
* fsync=off and full_page_writes=off => 5,250
* fsync=off, synchronous_commit=off and full_page_writes=off => 5,500

NOTE: If I do one giant commit instead of lots of littler ones, I get
much better speeds for the slower cases, but I never exceed 5,500
which appears to be some kind of wall I can't break through.

If there's anything else I should tinker with, I'm all ears.
On 16/06/10 18:30, jgardner@jonathangardner.net wrote:
> My back-of-the-envelope experiment: Inserting single integers into a
> table without indexes using a prepared query via psycopg2.
>
> [script and results snipped]
>
> NOTE: If I do one giant commit instead of lots of littler ones, I get
> much better speeds for the slower cases, but I never exceed 5,500
> which appears to be some kind of wall I can't break through.
>
> If there's anything else I should tinker with, I'm all ears.

Seeing some profiler output (e.g. oprofile) for the fastest case (and
maybe 'em all later) might be informative about what limit is being hit
here.

regards

Mark
Have you tried connecting using a UNIX socket instead of a TCP socket on localhost ? On such very short queries, the TCP overhead is significant.
> Have you tried connecting using a UNIX socket instead of a TCP socket
> on localhost ? On such very short queries, the TCP overhead is
> significant.

Actually UNIX sockets are the default for psycopg2, had forgotten that.

I get 7400 using UNIX sockets and 3000 using TCP (host="localhost")
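[Editorial aside: for reference, the difference between the two connection
styles in psycopg2 comes down to the host parameter; a small illustrative
sketch, with the database name and socket directory as assumed values.]

    import psycopg2

    # TCP to localhost: goes through the loopback TCP/IP stack
    tcp_conn = psycopg2.connect(database='jgardner', host='localhost', port=5432)

    # UNIX-domain socket: omit host entirely (the libpq/psycopg2 default),
    # or point host at the directory holding the .s.PGSQL.5432 socket file
    unix_conn = psycopg2.connect(database='jgardner')
    # unix_conn = psycopg2.connect(database='jgardner', host='/var/run/postgresql')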
jgardner@jonathangardner.net wrote:
> NOTE: If I do one giant commit instead of lots of littler ones, I get
> much better speeds for the slower cases, but I never exceed 5,500
> which appears to be some kind of wall I can't break through.

That's usually about where I run into the upper limit on how many
statements Python can execute against the database per second. Between
that and the GIL preventing better multi-core use, once you pull the
disk out and get CPU bound it's hard to use Python for load testing of
small statements and bottleneck anywhere except in Python itself. I
normally just write little performance test cases in the pgbench
scripting language, then I get multiple clients and (in 9.0) multiple
driver threads all for free.

--
Greg Smith  2ndQuadrant US  Baltimore, MD
PostgreSQL Training, Services and Support
greg@2ndQuadrant.com   www.2ndQuadrant.us
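[Editorial aside: a hedged sketch, not from the thread, of driving the same
single-row INSERT through pgbench so that client count and (in 9.0) driver
threads come for free. The script syntax and flags are the 9.0-era ones;
the client/thread counts and database name are assumptions, and the "test"
table from the earlier script is assumed to already exist.]

    import subprocess

    # custom pgbench script: pick a random value, insert one row
    script = "\\setrandom n 1 100000\nINSERT INTO test (data) VALUES (:n);\n"

    with open("insert_test.sql", "w") as f:
        f.write(script)

    # -n: skip vacuuming pgbench's own tables (we use our own script)
    # -f: custom script, -c: client connections, -j: worker threads, -T: seconds
    subprocess.check_call([
        "pgbench", "-n", "-f", "insert_test.sql",
        "-c", "8", "-j", "4", "-T", "30", "jgardner",
    ])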
FYI I've tweaked this program a bit:

import psycopg2
from time import time

conn = psycopg2.connect(database='peufeu')
cursor = conn.cursor()

cursor.execute("CREATE TEMPORARY TABLE test (data int not null)")
conn.commit()

cursor.execute("PREPARE ins AS INSERT INTO test VALUES ($1)")
cursor.execute("PREPARE sel AS SELECT 1")
conn.commit()

start = time()
tx = 0
N = 100
d = 0
while d < 10:
    for n in xrange(N):
        cursor.execute("EXECUTE ins(%s)", (tx,))
        #~ conn.commit()
        #~ cursor.execute("EXECUTE sel")
    conn.commit()
    d = time() - start
    tx += N
print "result : %d tps" % (tx / d)

cursor.execute("DROP TABLE test")
conn.commit()

Results (Core 2 quad, Ubuntu 10.04 64 bits):

SELECT 1 : 21000 queries/s (I'd say 50 us per query isn't bad!)
INSERT with commit every 100 inserts : 17800 inserts/s
INSERT with commit every INSERT : 7650 tps

fsync is on but not synchronous_commit.
Excerpts from jgardner@jonathangardner.net's message of Wed Jun 16 02:30:30 -0400 2010:
> NOTE: If I do one giant commit instead of lots of littler ones, I get
> much better speeds for the slower cases, but I never exceed 5,500
> which appears to be some kind of wall I can't break through.
>
> If there's anything else I should tinker with, I'm all ears.

increase wal_buffers?

--
Álvaro Herrera <alvherre@commandprompt.com>
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support
> * fsync=off => 5,100
> * fsync=off and synchronous_commit=off => 5,500

Now, this *is* interesting ... why should synch_commit make a
difference if fsync is off?

Anyone have any ideas?

> tmpfs, WAL on same tmpfs:
> * Default config: 5,200
> * full_page_writes=off => 5,200
> * fsync=off => 5,250
> * synchronous_commit=off => 5,200
> * fsync=off and synchronous_commit=off => 5,450
> * fsync=off and full_page_writes=off => 5,250
> * fsync=off, synchronous_commit=off and full_page_writes=off => 5,500

So, in this test, it seems like having WAL on tmpfs doesn't make a
significant difference for everything == off.

I'll try running some tests on Amazon when I have a chance. It would
be worthwhile to get figures without Python's "ceiling".

--
-- Josh Berkus
   PostgreSQL Experts Inc.
   http://www.pgexperts.com
On Wed, Jun 16, 2010 at 1:27 AM, Greg Smith <greg@2ndquadrant.com> wrote:
> I normally just write little performance test cases in the pgbench
> scripting language, then I get multiple clients and (in 9.0) multiple
> driver threads all for free.

See, this is why I love these mailing lists. I totally forgot about
pgbench. I'm going to dump my cheesy python script and play with that
for a while.

--
Jonathan Gardner
jgardner@jonathangardner.net
On Wed, Jun 16, 2010 at 4:22 AM, Pierre C <lists@peufeu.com> wrote:
> [tweaked benchmark script snipped]

I'm not surprised that Python add is so slow, but I am surprised that
I didn't remember it was... ;-)

--
Jonathan Gardner
jgardner@jonathangardner.net
On Wed, Jun 16, 2010 at 12:00 PM, Josh Berkus <josh@agliodbs.com> wrote:
>> * fsync=off => 5,100
>> * fsync=off and synchronous_commit=off => 5,500
>
> Now, this *is* interesting ... why should synch_commit make a difference
> if fsync is off?
>
> Anyone have any ideas?

I may have stumbled upon this by my ignorance, but I thought I read
that synchronous_commit controlled whether it tries to line up commits
or has a more free-for-all that may cause some intermediate weirdness.

--
Jonathan Gardner
jgardner@jonathangardner.net
http://www.postgresql.org/docs/current/static/wal-async-commit.html
> On Wed, Jun 16, 2010 at 12:00 PM, Josh Berkus <josh@agliodbs.com> wrote:
> >
> >> * fsync=off => 5,100
> >> * fsync=off and synchronous_commit=off => 5,500
> >
> > Now, this *is* interesting ... why should synch_commit make a difference
> > if fsync is off?
> >
> > Anyone have any ideas?
> >
>
> I may have stumbled upon this by my ignorance, but I thought I read
> that synchronous_commit controlled whether it tries to line up commits
> or has a more free-for-all that may cause some intermediate weirdness.
" the server waits for the transaction's WAL records to be flushed to permanent storage before returning a success indication to the client."
I think with fsync=off, whether WAL gets written to disk or not is still controlled by the synchronous_commit parameter. Guessing here...
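[Editorial aside: one way to poke at that guess, shown here as an
illustrative sketch rather than anything from the thread, is that
synchronous_commit is settable per session, so the two cases can be compared
from the same connection without touching postgresql.conf (unlike fsync,
which is a server-wide setting). The database name below is an assumption.]

    import psycopg2

    conn = psycopg2.connect(database='jgardner')   # assumed database name
    cursor = conn.cursor()

    cursor.execute("SET synchronous_commit TO off")  # affects this session only
    # ... run the timed INSERT loop from the earlier script here ...

    cursor.execute("SET synchronous_commit TO on")   # back to the default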
On Wed, Jun 16, 2010 at 12:51 AM, Pierre C <lists@peufeu.com> wrote:
> Have you tried connecting using a UNIX socket instead of a TCP socket
> on localhost ? On such very short queries, the TCP overhead is
> significant.

Unfortunately, this isn't an option for my use case. Carbonado only
supports TCP connections.

--
Jonathan Gardner
jgardner@jonathangardner.net
On 6/16/10 12:00 PM, Josh Berkus wrote:
>> * fsync=off => 5,100
>> * fsync=off and synchronous_commit=off => 5,500
>
> Now, this *is* interesting ... why should synch_commit make a difference
> if fsync is off?
>
> Anyone have any ideas?

I found that pgbench has "noise" of about 20% (I posted about this a
couple days ago using data from 1000 identical pgbench runs). Unless
you make a bunch of runs and average them, a difference of 5,100 to
5,500 appears to be meaningless.

Craig
> I'm not surprised that Python add is so slow, but I am surprised that
> I didn't remember it was... ;-)

it's not the add(), it's the time.time()...