Re: Tuning scenarios (was Changing the default configuration) - Mailing list pgsql-performance
| From | Kevin Brown |
|---|---|
| Subject | Re: Tuning scenarios (was Changing the default configuration) |
| Date | |
| Msg-id | 20030214114843.GB1847@filer |
| In response to | Re: Tuning scenarios (was Changing the default configuration) (Tom Lane <tgl@sss.pgh.pa.us>) |
| Responses | Re: Tuning scenarios (was Changing the default configuration) |
| | Re: Tuning scenarios (was Changing the default configuration) |
| | Re: Tuning scenarios (was Changing the default configuration) |
| List | pgsql-performance |
Tom Lane wrote:
> Kevin Brown <kevin@sysexperts.com> writes:
> > You don't have any real confidence in pgbench.  Fair enough.  What
> > *would* you have confidence in?
>
> Measurements on your actual application?

That unfortunately doesn't help us a whole lot in figuring out defaults
that will perform reasonably well under broad conditions, unless there's
some way to determine a reasonably consistent pattern (or set of
patterns) amongst a lot of those applications.

> In fairness to pgbench, most of its problems come from people running
> it at tiny scale factors, where it reduces to an exercise in how many
> angels can dance on the same pin (or, how many backends can contend to
> update the same row).

This isn't easy to fix, but I don't think it's impossible either.  It's
probably sufficient to make the defaults dependent on information
gathered about the system.  I'd think total system memory would be the
primary thing to consider, since most database engines are pretty fast
once all the data and indexes are cached.  :-)  (A rough sketch of the
kind of thing I mean is tacked onto the end of this message.)

> And in that regime it runs into two or three different Postgres
> limitations that might or might not have any relevance to your
> real-world application --- dead-index-row references used to be the
> worst, but I think probably aren't anymore in 7.3.  But those same
> limitations cause the results to be unstable from run to run, which
> is why I don't have a lot of faith in reports of pgbench numbers.
> You need to work quite hard to get reproducible numbers out of it.

The interesting question is whether that's more an indictment of how PG
does things or of how pgbench does things.  I imagine it's difficult to
get truly reproducible numbers out of pretty much any benchmark coupled
with pretty much any database engine.  There are simply far too many
parameters to tweak on any but the simplest database engines, and we
haven't even started talking about tuning the OS around the database...

And benchmarks (as well as real-world applications) will always run into
limitations of the database (locking mechanisms, IPC limits, etc.).  In
fact, that's another useful purpose a benchmark serves: showing where
the limits of the database are.

Despite those limits, it's probably better to have a benchmark that
gives you only an order-of-magnitude idea of what to expect than to
have nothing at all.

And thus we're more or less right back where we started: what kinds of
tests should go into a benchmark intended for tuning a database system
(PG in particular, though the answer might easily apply to others as
well) so that it will perform decently, if not optimally, under the most
likely loads?

I think we might be able to come up with some reasonable answers to
that, as long as we don't expect too much out of the resulting
benchmark.  The right people to ask are probably the people who are
actually running production databases.

Anyone wanna chime in here with some opinions and perspectives?


--
Kevin Brown                                          kevin@sysexperts.com
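P.S.  Just to make the "base the defaults on total memory" idea a bit
more concrete, here's a rough sketch in Python.  The ratios (a quarter
of RAM for shared_buffers, half for effective_cache_size) are nothing
more than my own guesses, the sysconf() memory probe only works where
SC_PAGE_SIZE and SC_PHYS_PAGES exist (e.g. Linux), and the "MB" unit
syntax is the modern one rather than the raw page counts 7.3 actually
expects.  Treat it as an illustration of the approach, not a
recommendation.

    #!/usr/bin/env python
    # Rough sketch: derive candidate PostgreSQL memory settings from
    # total RAM.  The ratios below are guesses for illustration only.

    import os

    def total_ram_bytes():
        """Total physical memory via POSIX sysconf (Linux and some others)."""
        return os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES")

    def suggest_settings(ram_bytes):
        """Return a dict of guessed postgresql.conf values (hypothetical ratios)."""
        mb = ram_bytes // (1024 * 1024)
        return {
            # Assume ~25% of RAM for PostgreSQL's own buffer cache...
            "shared_buffers": "%dMB" % max(mb // 4, 8),
            # ...and tell the planner the OS will cache roughly half of RAM.
            "effective_cache_size": "%dMB" % max(mb // 2, 16),
        }

    if __name__ == "__main__":
        settings = suggest_settings(total_ram_bytes())
        for name, value in sorted(settings.items()):
            print("%s = %s" % (name, value))

On a box with 1GB of RAM this would print shared_buffers = 256MB and
effective_cache_size = 512MB; the point is only that the numbers come
from the machine rather than from a one-size-fits-all default.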