Re: proposal: CREATE DATABASE vs. (partial) CHECKPOINT - Mailing list pgsql-hackers
From: Tomas Vondra
Subject: Re: proposal: CREATE DATABASE vs. (partial) CHECKPOINT
Msg-id: 9ca60241895f4cede3269ea33a70d9bd.squirrel@2.emaily.eu
In response to: Re: proposal: CREATE DATABASE vs. (partial) CHECKPOINT (Atri Sharma <atri.jiit@gmail.com>)
List: pgsql-hackers
On 27 October 2014, 13:50, Atri Sharma wrote:

>> IMHO writing all the data into WAL would be the cleanest solution.
>>
>> Also, what is a small database? I don't think a static value will work,
>> because the sweet spot between the current approach (forcing two
>> checkpoints) and writing everything into WAL depends on the amount of
>> dirty buffers that need to be checkpointed, which is mostly driven by
>> the size of shared buffers and the write activity - for small shared
>> buffers and/or a mostly-read workload, checkpoints are cheap, so the
>> 'small database' threshold (when choosing the WAL approach) is much
>> higher than for large shared buffers or write-heavy workloads.
>
> So are you proposing a heuristic based on the amount of data in shared
> buffers and the write activity? Do you have something in mind that works
> for general workloads as well?
>
>> So maybe if we could determine the amount of data to be checkpointed,
>> and then base the decision on that, that'd work better? This would also
>> have to take into account that writing into WAL is sequential, while
>> checkpoints usually cause random writes all over the datafiles (which
>> is more expensive).
>>
>> Another option might be forcing just a "spread" checkpoint, not the
>> immediate one (which is what we do now). That would not fix the CREATE
>> DATABASE duration (actually, it would make it longer), but it would
>> lower the impact on other activity on the machine.
>
> I believe this to be the cleanest way to reduce the amount of I/O
> generated. If I understand correctly, the original problem you mentioned
> was not the time CREATE DATABASE takes, but rather the amount of I/O
> each one generates.

Not exactly. There are two related issues, both caused by the I/O activity
from the CHECKPOINT:

(a) an I/O spike, because of checkpointing everything
(b) long CREATE DATABASE durations

This "spread CREATE DATABASE" only fixes (a), and makes (b) a bit worse.
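The trade-off described above (a sequential WAL write of the template vs. random checkpoint writes of all dirty buffers) can be sketched as a toy cost model. This is purely illustrative - the constants and the function name are invented, not actual PostgreSQL code:

```python
# Toy cost model for the decision discussed above: WAL-logging the
# template copy costs roughly one sequential write of the template,
# while the current approach costs an immediate checkpoint, i.e.
# random writes of all dirty shared buffers. Constants are made up.

SEQ_WRITE_COST = 1.0   # relative cost per MB written sequentially (WAL)
RAND_WRITE_COST = 4.0  # relative cost per MB written randomly (checkpoint)

def prefer_wal_copy(template_mb, dirty_buffers_mb):
    """Return True if WAL-logging the copy looks cheaper than forcing
    an immediate checkpoint of all currently dirty buffers."""
    wal_cost = template_mb * SEQ_WRITE_COST
    checkpoint_cost = dirty_buffers_mb * RAND_WRITE_COST
    return wal_cost < checkpoint_cost

# Small template + write-heavy cache: the WAL approach wins.
print(prefer_wal_copy(template_mb=10, dirty_buffers_mb=2048))   # True
# Huge template + mostly-clean cache: the checkpoint approach wins.
print(prefer_wal_copy(template_mb=5000, dirty_buffers_mb=100))  # False
```

This is also why a static "small database" threshold fails: the break-even point moves with the amount of dirty data, not with the template size alone.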
It does, however, make the 'create databases in advance' approach a bit
more flexible, because it allows more frequent runs of the cron job that
actually creates them. Right now we're forced to schedule it rather rarely
(say, once a day) to minimize the impact, which has downsides (a higher
probability of running out of spare dbs, a slow response to an updated
template, etc.). If we can fix both (a) and (b), that'd be great. If we
can fix only (a), so be it.

> This also leads me to wonder whether it makes sense to explore group
> commits around the creation of files for a new database (for the same
> backend, of course). This might be opt-in, for when the user knows
> he/she is going to create a lot of databases in the near future and is
> fine with a large spike in I/O in one go. Again, it might be even more
> broken than the current scenario, but it depends on what the user
> wants...

That seems much more complex than what I proposed to do, but maybe I'm
wrong. If we could get rid of the "checkpoint everything" altogether, we
don't really need the group commit, no?

regards
Tomas
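[Editorial aside: the "create databases in advance" cron job mentioned above could be sketched as a top-up step like the one below. The pool naming scheme (spare_NNN) and the template name app_template are invented for illustration, and actually issuing the DDL via psql is left out.]

```python
# Sketch of a cron-able "top up the spare database pool" step, as
# described in the message above. Names (spare_NNN, app_template) are
# hypothetical; a real job would run the returned statements via psql.

def topup_ddl(existing, pool_size, template="app_template"):
    """Return the CREATE DATABASE statements needed to refill the pool
    of spare databases up to pool_size."""
    stmts = []
    n = 0
    while len(existing) + len(stmts) < pool_size:
        name = "spare_%03d" % n
        n += 1
        if name in existing:
            continue  # this spare already exists, try the next name
        stmts.append("CREATE DATABASE %s TEMPLATE %s;" % (name, template))
    return stmts

# With one spare left and a target pool of three, two clones are needed.
for stmt in topup_ddl({"spare_000"}, 3):
    print(stmt)
```

Running the job more often (which a spread checkpoint would make cheaper) simply means each run has fewer statements to issue.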