Thread: Tuning scenarios (was Changing the default configuration)
Tom Lane wrote:
> If I thought that pgbench was representative of anything, or even
> capable of reliably producing repeatable numbers, then I might
> subscribe to results derived this way.  But I have little or no
> confidence in pgbench.  Certainly I don't see how you'd use it to
> produce recommendations for a range of application scenarios, when
> it's only one very narrow scenario itself.

So let's say you were designing a tool to help someone get reasonable
performance out of a PostgreSQL installation.  What scenarios would you
include in such a tool, and what information would you want out of it?

You don't have any real confidence in pgbench.  Fair enough.  What
*would* you have confidence in?

--
Kevin Brown                                       kevin@sysexperts.com
Kevin Brown <kevin@sysexperts.com> writes:
> You don't have any real confidence in pgbench.  Fair enough.  What
> *would* you have confidence in?

Measurements on your actual application?

In fairness to pgbench, most of its problems come from people running
it at tiny scale factors, where it reduces to an exercise in how many
angels can dance on the head of a pin (or, how many backends can
contend to update the same row).  And in that regime it runs into two
or three different Postgres limitations that might or might not have
any relevance to your real-world application --- dead-index-row
references used to be the worst, but I think probably aren't anymore
in 7.3.  But those same limitations cause the results to be unstable
from run to run, which is why I don't have a lot of faith in reports
of pgbench numbers.  You need to work quite hard to get reproducible
numbers out of it.

No, I don't have a better benchmark in my pocket :-(

			regards, tom lane
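To make the scale-factor and repeatability point above concrete, a
rough sketch of a harness that runs pgbench at a non-trivial scale and
reports the run-to-run spread might look like the following.  The
database name, scale factor, and client counts are arbitrary, and the
parsing assumes pgbench's usual "tps = ..." output line:

    import re
    import statistics
    import subprocess

    DB = "bench"       # scratch database, assumed to already exist
    SCALE = 10         # non-trivial scale; tiny scales mostly measure
                       # contention on the same branch/teller rows
    RUNS = 5

    # Populate the pgbench tables once at the chosen scale factor.
    subprocess.run(["pgbench", "-i", "-s", str(SCALE), DB], check=True)

    tps_results = []
    for _ in range(RUNS):
        out = subprocess.run(["pgbench", "-c", "10", "-t", "1000", DB],
                             check=True, capture_output=True,
                             text=True).stdout
        m = re.search(r"tps = ([0-9.]+)", out)
        if m:
            tps_results.append(float(m.group(1)))

    # A wide spread here is exactly the run-to-run instability that
    # makes isolated pgbench numbers hard to trust.
    print("tps per run:", tps_results)
    if len(tps_results) > 1:
        print("mean %.1f  stdev %.1f" % (statistics.mean(tps_results),
                                         statistics.stdev(tps_results)))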
Tom Lane wrote:
> Kevin Brown <kevin@sysexperts.com> writes:
> > You don't have any real confidence in pgbench.  Fair enough.  What
> > *would* you have confidence in?
>
> Measurements on your actual application?

That unfortunately doesn't help us a whole lot in figuring out defaults
that will perform reasonably well under broad conditions, unless
there's some way to determine a reasonably consistent pattern (or set
of patterns) amongst a lot of those applications.

> In fairness to pgbench, most of its problems come from people running
> it at tiny scale factors, where it reduces to an exercise in how many
> angels can dance on the head of a pin (or, how many backends can
> contend to update the same row).

This isn't easy to fix, but I don't think it's impossible either.  It's
probably sufficient to make the defaults dependent on information
gathered about the system.  I'd think total system memory would be the
primary thing to consider, since most database engines are pretty fast
once all the data and indexes are cached.  :-)

> And in that regime it runs into two or three different Postgres
> limitations that might or might not have any relevance to your
> real-world application --- dead-index-row references used to be the
> worst, but I think probably aren't anymore in 7.3.  But those same
> limitations cause the results to be unstable from run to run, which
> is why I don't have a lot of faith in reports of pgbench numbers.
> You need to work quite hard to get reproducible numbers out of it.

The interesting question is whether that's more an indictment of how
PG does things or how pgbench does things.

I imagine it's probably difficult to get truly reproducible numbers out
of pretty much any benchmark coupled with pretty much any database
engine.  There are simply far too many parameters to tweak on any but
the simplest database engines, and we haven't even started talking
about tuning the OS around the database...

And benchmarks (as well as real-world applications) will always run
into limitations of the database (locking mechanisms, IPC limits,
etc.).  In fact, that's another useful purpose: to see where the limits
of the database are.

Despite the limits, it's probably better to have a benchmark that only
gives you an order-of-magnitude idea of what to expect than to not have
anything at all.

And thus we're more or less right back where we started: what kinds of
benchmarking tests should go into a benchmark for the purposes of
tuning a database system (PG in particular, but the answer might easily
apply to others as well) so that it will perform decently, if not
optimally, under the most likely loads?

I think we might be able to come up with some reasonable answers to
that, as long as we don't expect too much out of the resulting
benchmark.  The right people to ask are probably the people who are
actually running production databases.

Anyone wanna chime in here with some opinions and perspectives?

--
Kevin Brown                                       kevin@sysexperts.com
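The idea of deriving defaults from information gathered about the
system can be sketched very simply.  The ratios below (a quarter of RAM
for shared_buffers, and so on) are illustrative assumptions only, not
recommendations made anywhere in this thread, and the sysconf calls are
POSIX/Linux-specific:

    import os

    # Total physical memory in bytes (POSIX systems).
    total_ram = os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES")

    BLOCK_SIZE = 8192  # default PostgreSQL page size

    # Illustrative rules of thumb -- justifying the exact ratios with
    # benchmark data is precisely what a tuning tool would have to do.
    shared_buffers_pages = (total_ram // 4) // BLOCK_SIZE
    effective_cache_pages = (total_ram // 2) // BLOCK_SIZE
    sort_mem_kb = max(1024, total_ram // (1024 * 200))  # per-sort memory

    print("# suggested starting points, to be validated by benchmarking")
    print("shared_buffers = %d" % shared_buffers_pages)
    print("effective_cache_size = %d" % effective_cache_pages)
    print("sort_mem = %d" % sort_mem_kb)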
Kevin Brown <kevin@sysexperts.com> writes:
> Tom Lane wrote:
>> ... But those same
>> limitations cause the results to be unstable from run to run, which
>> is why I don't have a lot of faith in reports of pgbench numbers.
>> You need to work quite hard to get reproducible numbers out of it.

> The interesting question is whether that's more an indictment of how
> PG does things or how pgbench does things.

I didn't draw a conclusion on that ;-).  I merely pointed out that the
numbers are unstable, and therefore not to be trusted without a great
deal of context ...

			regards, tom lane
On Fri, Feb 14, 2003 at 03:48:43AM -0800, Kevin Brown wrote:
> That unfortunately doesn't help us a whole lot in figuring out
> defaults that will perform reasonably well under broad conditions,
> unless there's some way to determine a reasonably consistent pattern
> (or set of patterns) amongst a lot of those applications.

When moving to a new DB or DB box, we always run a series of benchmarks
to make sure there aren't any surprises performance-wise.  Our database
activity, and thus our benchmarks, are broken up into roughly three
different patterns:

1 - Transaction processing: small number of arbitrary small
    (single-row) selects intermixed with large numbers of small inserts
    and updates.

2 - Reporting: large reads joining 6-12 tables, usually involving
    calculations and/or aggregation.

3 - Application (object retrieval): large numbers of arbitrary,
    single-row selects and updates, with smaller numbers of single-row
    inserts.

We use our own application code to do our benchmarks, so they're not
general enough for your use, but it might be worthwhile to profile each
of those different patterns, or allow DB admins to limit the test to a
relevant subset.

Other patterns I can think of include logging (large numbers of
single-row inserts, no updates, occasional large, simple (1-3 table)
selects), mining (complicated selects over 10 or more tables),
automated (small inserts/updates, with triggers cascading everywhere),
etc.

The problem becomes dealing with the large amounts of data necessary to
frame all of these patterns.  An additional wrinkle is accommodating
both columns with well-distributed data and columns that are top-heavy
or which only have one of a small number of values.  Plus indexed vs.
unindexed columns.

Or, somewhat orthogonally, you could allow pgbench to take a workload
of different SQL statements (with frequencies), and execute those
statements instead of the built-in transaction.  Then it would be easy
enough to contribute a library of pattern workloads, or for the DBA to
write one herself.

Just my two cents.

-johnnnnnnnnnn
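The "workload of SQL statements with frequencies" idea is easy to
sketch.  The statement pool, weights, and stub executor below are all
invented for illustration; a real driver would send the statements
through an actual database connection rather than the sleep stub:

    import random
    import time

    # A hypothetical pattern workload: (relative weight, SQL statement).
    # In a real file format this could be one weighted statement per line.
    WORKLOAD = [
        (70, "SELECT * FROM accounts WHERE aid = 42"),
        (20, "UPDATE accounts SET abalance = abalance + 1 WHERE aid = 42"),
        (10, "INSERT INTO history (aid, delta) VALUES (42, 1)"),
    ]

    def run_statement(sql):
        # Stub: a real driver would execute the statement over a live
        # connection and return only after the server replies.
        time.sleep(0.001)

    def run_workload(seconds=10):
        statements = [sql for _, sql in WORKLOAD]
        weights = [w for w, _ in WORKLOAD]
        counts = {sql: 0 for sql in statements}
        deadline = time.time() + seconds
        while time.time() < deadline:
            # Draw each statement with probability proportional to its weight.
            sql = random.choices(statements, weights=weights, k=1)[0]
            run_statement(sql)
            counts[sql] += 1
        return counts

    if __name__ == "__main__":
        for sql, n in run_workload(2).items():
            print("%6d  %s" % (n, sql))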
Kevin,

> I think we might be able to come up with some reasonable answers to
> that, as long as we don't expect too much out of the resulting
> benchmark.  The right people to ask are probably the people who are
> actually running production databases.
>
> Anyone wanna chime in here with some opinions and perspectives?

<grin> I thought you'd *never* ask.

(For background: I'm a consultant, and I administer 6 PostgreSQL
databases for 5 different clients.)

First off, one can't do a useful performance test on the sort of random
data which can be generated by a script.  The only really useful tests
come from testing on a copy of the user's own database(s), or on a real
database of some sort.  For new installations, we'd need to make a copy
of a public-domain or OSS database a part of our performance testing
tool.  This database would need at least 10 tables, some of them quite
large, with FK relationships.

Second, there are five kinds of query tests relevant to performance:

A) Rapid-fire simple select queries.
B) Large complex select queries, combining at least 2 of: aggregates,
   sub-selects, unions, unindexed text searches, and outer joins.
C) Rapid-fire small (<10 rows) update/insert/delete queries.
D) Large update queries (>10,000 rows, possibly in more than one table).
E) Long-running PL procedures.

Testing on these five types of operations gives an all-around test of
server performance.  Fortunately, for many installations, not all tests
are relevant; in fact, for many, only 2 of the 5 above are relevant.
For example, for a PHP-Nuke installation, you'd only need to test on A
and C.  As another example, an OLAP reporting server would only need to
test on B.

Unfortunately, for any real production server, you need to test all the
different operations concurrently at the appropriate multi-user level.
Meaning that for one of my servers (a company-wide calendaring tool)
I'd need to run tests on A, B, C, and E all simultaneously ... for that
matter, A and C by themselves would require multiple connections.

So, once again, if we're talking about a testing database, we would
need twenty examples of A and C, ten each of B and D, and at least 3 of
E that we could run.  For real production databases, the user could
supply "pools" of the 5 types of operations from their real query base.

Thirdly, we're up against the problem that there are several factors
which can have a much more profound effect on database performance than
*any* amount of tuning postgresql.conf, even given a particular
hardware platform.  In my experience, these factors include (in no
particular order):

1) Location of the pg_xlog for heavy-update databases.
2) Location of other files on multi-disk systems.
3) Choice of RAID and controller for RAID systems.
4) Filesystem choice and parameters.
5) VACUUM/FULL/ANALYZE/REINDEX frequency and strategy.
6) Database schema design.
7) Indexing.

Thus the user would have to be somehow informed that they need to
examine all of the above, possibly before running the tuning utility.

Therefore, any tuning utility would have to:

1) Inform the user about the other factors affecting performance and
   notify them that they have to deal with these.

2) Ask the user for all of the following data:
   a) How much RAM does your system have?
   b) How many concurrent users, typically?
   c) How often do you run VACUUM/FULL/ANALYZE?
   d) Which of the Five Basic Operations does your database perform
      frequently?  (This question could be reduced to "What kind of
      database do you have?"
         web site database = A and C
         reporting database = A and B
         transaction processing = A, C, D and possibly E
      etc.)
   e) For each of the 5 operations, how many times per minute is it
      run?
   f) Do you care about crash recovery?  (If not, we can turn off
      fsync.)
   g) (For users testing on their own database) Please make a copy of
      your database, and provide 5 pools of operation examples.

3) Then the script would need to draw random operations from the pools,
   with the operation type randomly drawn weighted by the relative
   frequency for that type of operation.  Each operation would be timed
   and scores kept per type of operation.

4) Based on the above scores, the tuning tool could adjust the
   following parameters:
   For A)        shared_buffers
   For B)        shared_buffers and sort_mem (and Tom's new JOIN
                 COLLAPSE settings)
   For C) and D) WAL settings and FSM settings
   For E)        shared_buffers, WAL, and FSM

5) Then run 3) again.

The problem is that the above process becomes insurmountably complex
when we are testing for several types of operations simultaneously.
For example, if operation D is slow, we might dramatically increase
FSM, but that could take away memory needed for op. B, making op. B run
slower.  So if we're running concurrently, we could find the
adjustments made for each type of operation contradictory, and the
script would be more likely to end up in an endless loop than at a
balance.

If we don't run the different types of operations simultaneously, then
it's not a good test; the optimal settings for op. B, for example, may
make ops. A and C slow down, and vice versa.  So we'd actually need to
run an optimization for each type of desired operation separately, then
compare settings, adjust to a balance (weighted according to the
expected relative frequency), and re-test concurrently.  Aieee!

Personally, I think this is a project in and of itself.  GBorg, anyone?

--
Josh Berkus
Aglio Database Solutions
San Francisco
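Steps 3) and 4) above can be sketched as a simple scoring loop.  The
mapping from slow operation types to postgresql.conf parameters follows
the outline in the message, but the pools, weights, and helper names
below are invented for illustration (sleep stubs stand in for real
queries against the server):

    import random
    import time

    # Pools of example operations per type, as in step 2(g); each
    # callable stands in for running one real query.
    POOLS = {
        "A": [lambda: time.sleep(0.001)] * 20,  # rapid-fire simple selects
        "B": [lambda: time.sleep(0.050)] * 10,  # large complex selects
        "C": [lambda: time.sleep(0.002)] * 20,  # small insert/update/delete
        "D": [lambda: time.sleep(0.200)] * 10,  # large updates
    }

    # Expected executions per minute, from question 2(e).
    FREQUENCY = {"A": 600, "B": 10, "C": 300, "D": 2}

    # Step 4's mapping from a slow operation type to the knobs to revisit.
    KNOBS = {
        "A": ["shared_buffers"],
        "B": ["shared_buffers", "sort_mem"],
        "C": ["WAL settings", "FSM settings"],
        "D": ["WAL settings", "FSM settings"],
    }

    def run_mix(iterations=200):
        # Step 3: draw operations weighted by frequency, score each type.
        types = list(FREQUENCY)
        weights = [FREQUENCY[t] for t in types]
        totals = {t: 0.0 for t in types}
        counts = {t: 0 for t in types}
        for _ in range(iterations):
            t = random.choices(types, weights=weights, k=1)[0]
            op = random.choice(POOLS[t])
            start = time.time()
            op()
            totals[t] += time.time() - start
            counts[t] += 1
        return {t: totals[t] / counts[t] for t in types if counts[t]}

    scores = run_mix()
    slowest = max(scores, key=scores.get)
    print("average seconds per operation type:", scores)
    print("slowest type %s -> consider adjusting %s" % (slowest, KNOBS[slowest]))

As the rest of the message points out, the hard part is not this loop
but deciding how to reconcile the adjustments it suggests when the
operation types are run concurrently.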
Folks,

I forgot one question:

> 2) Ask the user for all of the following data:
>    a) How much RAM does your system have?
>    b) How many concurrent users, typically?
>    c) How often do you run VACUUM/FULL/ANALYZE?
>    d) Which of the Five Basic Operations does your database perform
>       frequently?  (This question could be reduced to "What kind of
>       database do you have?"
>          web site database = A and C
>          reporting database = A and B
>          transaction processing = A, C, D and possibly E
>       etc.)
>    e) For each of the 5 operations, how many times per minute is it
>       run?
>    f) Do you care about crash recovery?  (If not, we can turn off
>       fsync.)
>    g) (For users testing on their own database) Please make a copy of
>       your database, and provide 5 pools of operation examples.

h) (For users using the test database) How large do you expect your
   main tables to be in your database?  (Then the test database would
   need to have its tables trimmed to match this estimate.)

--
Josh Berkus
Aglio Database Solutions
San Francisco
Robert,

> > 1) Location of the pg_xlog for heavy-update databases.
>
> I see you put this up pretty high on the list.  Do you feel this is
> the most important thing you can do?  For example, if you had a
> two-drive installation, would you load the OS and main database files
> on 1 disk and put the pg_xlog on the second disk above all other
> configurations?

Yes, actually.  On machines with 2 IDE disks, I've found that this can
make as much as a 30% difference in the speed of serial/large UPDATE
statements.

> Ideally I recommend 3 disks, one for OS, one for data, one for xlog;
> but if you only had 2, would the added speed benefits be worth the
> additional recovery complexity?  (If your data/xlog are on the same
> disk, you have 1 point of failure, one disk for backing up.)

On the other hand, with the xlog on a separate disk, the xlog and the
database disks are unlikely to fail at the same time.  So I don't
personally see it as a recovery problem, but a benefit.

--
-Josh Berkus
 Aglio Database Solutions
 San Francisco
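For readers unfamiliar with the mechanics under discussion, relocating
the WAL directory is just a move plus a symlink performed while the
server is cleanly shut down.  The paths below are placeholders, and the
directory name is pg_xlog in releases of that era (pg_wal in current
ones):

    import os
    import shutil

    # Placeholder paths -- adjust for the actual cluster layout.
    data_dir = "/var/lib/pgsql/data"
    xlog_dir = os.path.join(data_dir, "pg_xlog")  # pg_wal on modern releases
    new_home = "/mnt/disk2/pg_xlog"               # directory on the second disk

    # The server must be stopped before moving these files.
    shutil.move(xlog_dir, new_home)
    os.symlink(new_home, xlog_dir)
    print("WAL now lives on", new_home)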
On Thu, 2003-02-20 at 17:33, Josh Berkus wrote:
> Robert,
>
> > > 1) Location of the pg_xlog for heavy-update databases.
> >
> > I see you put this up pretty high on the list.  Do you feel this is
> > the most important thing you can do?  For example, if you had a
> > two-drive installation, would you load the OS and main database
> > files on 1 disk and put the pg_xlog on the second disk above all
> > other configurations?
>
> Yes, actually.  On machines with 2 IDE disks, I've found that this
> can make as much as a 30% difference in the speed of serial/large
> UPDATE statements.

Do you know how well those numbers hold up on SCSI and/or RAID-based
systems?  (I'd assume anyone doing serious work would run SCSI.)

> > Ideally I recommend 3 disks, one for OS, one for data, one for
> > xlog; but if you only had 2, would the added speed benefits be
> > worth the additional recovery complexity?  (If your data/xlog are
> > on the same disk, you have 1 point of failure, one disk for backing
> > up.)
>
> On the other hand, with the xlog on a separate disk, the xlog and the
> database disks are unlikely to fail at the same time.  So I don't
> personally see it as a recovery problem, but a benefit.

OK (playing a bit of devil's advocate here), but you have two possible
points of failure, the data disk and the xlog disk.  If either one
goes, you're in trouble.  OTOH, if you put the OS on one drive and it
goes, your database and xlog are still safe on the other drive.

Robert Treat
On Thu, Feb 20, 2003 at 06:35:44PM -0500, Robert Treat wrote:
> Do you know how well those numbers hold up on SCSI and/or RAID-based
> systems?  (I'd assume anyone doing serious work would run SCSI.)

On some Sun E450s we have used, the machines are unusable for any load
with the xlog on the same disk (in the case I'm remembering, these are
older 5400 RPM drives).  Moving the xlog changed us from
<hazymemory>something like 10 tps to something like 30 tps</hazymemory>
in one seat-of-the-pants case.  Sorry I can't be more specific.

> OK (playing a bit of devil's advocate here), but you have two
> possible points of failure, the data disk and the xlog disk.  If
> either one goes, you're in trouble.  OTOH, if you put the OS on one
> drive and it goes, your database and xlog are still safe on the other
> drive.

If you're really worried about that, use RAID 1+0.

A

--
----
Andrew Sullivan                         204-4141 Yonge Street
Liberty RMS                           Toronto, Ontario Canada
<andrew@libertyrms.info>                              M2P 2A8
                                         +1 416 646 3304 x110