Thread: young guy wanting (Postgres DBA) ammo
Hiya list, A friend recently told me that, among other things, the institutions for which he works tend to choose MySQL or MSSQL over Postgres because the latter requires a dedicated DBA while the former do not. When they do spring for a DBA, they go with Oracle. As a developer/end-user in/of the MySQL/Postgres/Oracle environments, I can say hands down that I prefer Postgres and I have a list of reasons why to which I can readily refer. However, I'm not a DBA and only minimally know what's involved in doing the job, so I don't have "ammo" to defend (or agree?) with my friend when he says that "Postgres requires a DBA and MySQL doesn't so that's why they choose the latter." Basically, I have so far not run across the need to mess with any of the tunables. (I am naive, I know.) Clearly, one should generally pick the best tool for the job, so I'm no pundit that Postgres is *always* the right answer, but I'd like to be informed. My questions: - With 8.2, and the almost-out 8.3, what kinds of responsibilities should a Postgres DBA expect to have? - More in line with the conversation with my friend, what/why is it that Postgres needs a DBA while MySQL doesn't? I highly suspect that the assumption that MySQL doesn't need a DBA is incorrect, but that's what was posed to me and I couldn't agree or disagree. I am familiar with Greg Smith et al's page: http://www.postgresql.org/docs/techdocs.83 Thoughts? Thanks in advance, Kevin P.S. BTW Greg et al, thank you a *ton* for putting that document together. I learned a lot watching the discussion on the list and reading the final version.
Kevin Hunter <hunteke@earlham.edu> writes: > However, I'm not a DBA and only minimally know what's involved in doing > the job, so I don't have "ammo" to defend (or agree?) with my friend > when he says that "Postgres requires a DBA and MySQL doesn't so that's > why they choose the latter. He's full of it ... mysql is not any easier to run or tune. regards, tom lane
At 12:35a -0400 on 02 Nov 2007, Tom Lane wrote: > Kevin Hunter <hunteke@earlham.edu> writes: >> However, I'm not a DBA and only minimally know what's involved in doing >> the job, so I don't have "ammo" to defend (or agree?) with my friend >> when he says that "Postgres requires a DBA and MySQL doesn't so that's >> why they choose the latter. > > He's full of it ... mysql is not any easier to run or tune. I expected as much, but would you give me something more than "Because Tom says so!" Good enough for me, but not for a non-Postgres-indoctrinated person, I fear. ;-) Thanks, Kevin
Kevin Hunter <hunteke@earlham.edu> writes: > At 12:35a -0400 on 02 Nov 2007, Tom Lane wrote: >> He's full of it ... mysql is not any easier to run or tune. > I expected as much, but would you give me something more than "Because > Tom says so!" Good enough for me, but not for a > non-Postgres-indoctrinated person, I fear. ;-) Well, let's see: * Installation: on practically any platform that "I don't want to be a DBA" people would use, it's a wash. You download a package, you do the local equivalent of "sudo service <foo> start", and there you are. I'm only really familiar personally with the details of this for Red Hat systems, but being personally responsible for RH's packaging of both postgres and mysql, I can tell you they are isomorphic. * Initial configuration: well, both DBs are a bit unfriendly when it comes to setting up some initial database users and configuring a sane security policy. This is actually a hard problem since "sane" means different things to different people, so a one-size-fits-all preconfigured solution doesn't work. IMHO, for both DBs you're in for some manual-reading whether you like it or not. Red Hat's packages of the two are not isomorphic on this point --- RH forces ident authorization by default for PG, while mysql is left with its default behavior which is comparable to PG's "trust". (Don't ask me why, those decisions were made before I got here.) We see the blow-back from the forced ident authorization on our lists, in that people can't figure out how to get into the DB initially, and I'm sure there are people out there who don't get past that hurdle and just give up. It would be interesting to troll the mysql lists for evidence of the downside of their default ... which'd be along the line of "someone broke into my completely insecure database and stole/destroyed all my data" ... * Tuning: there used to be some truth to the "PG is hard" meme here, in that the out-of-the-box defaults for PG were designed for what we'd now consider laughably small boxes. PG is now cranked up to the point of being sanely configured for merely small boxes, which is about where the out-of-the-box mysql configuration is too. If you want you can award mysql a few points for providing sample config files for larger configurations, but even knowing that one of those should be installed is knowledge that an "I don't want to be a DBA" person wouldn't have. I think that in either case you're going to end up fooling with the configuration parameters if you're doing anything much more demanding than cataloging your baseball cards. * Routine backups: seems about the same to me; in both cases you gotta do something along the lines of setting up a cron job to call a DB-supplied program. * Replication: I'll grant that mysql has got built-in replication that is easier to set up than any PG alternative I know about. But is the "I don't want to be a DBA" crowd planning to run replication, or likely to get it right without reading any manuals? Certainly there is a perception out there that mysql is easier, but I think it's based on out-of-date information. You might also care to read Greg Smith's take on the matter: http://www.postgresql.org/docs/techdocs.83.html He points out that each DB's developer community has been working over the past few years to ameliorate their respective perceived disadvantages ... regards, tom lane
Kevin Hunter wrote: > At 12:35a -0400 on 02 Nov 2007, Tom Lane wrote: >> Kevin Hunter <hunteke@earlham.edu> writes: >>> However, I'm not a DBA and only minimally know what's involved in doing >>> the job, so I don't have "ammo" to defend (or agree?) with my friend >>> when he says that "Postgres requires a DBA and MySQL doesn't so that's >>> why they choose the latter. >> He's full of it ... mysql is not any easier to run or tune. > > I expected as much, but would you give me something more than "Because > Tom says so!" Good enough for me, but not for a > non-Postgres-indoctrinated person, I fear. ;-) MySQL comprises at least three different database engines, one of which does not support relational integrity. Where I used to work, we developed a MySQL-based solution that required foreign keys, so we used one of the engines that did support that. The "DBA" for the production system forgot that instruction, and didn't use our scripts, I guess, because they configured the production system with the version that didn't support foreign keys. Whoops. MySQL's configuration contains similar tuning parameters to PG's. All you need to do to gather "ammo" is to visit the respective web sites and read up on the configuration parameters for both. By "MSSQL", what do you mean? SQL Server? That also needs some tuning. Tuning, of course, is only one chore for a DBA. Designing and maintaining the dataspace, performing backups without sacrificing (too much) availability, managing indexes, perhaps writing and maintaining stored procedures, allocating usernames and passwords, creating and configuring schemas (or whatever they're called in the particular product) are all part of DBA work. Does MySQL even support stored procedures? PG surely doesn't need a DBA for small data stores, any more than MySQL does. No DBMS will survive a heavy production environment for long without someone keeping an eye on it, particularly with large data sets. Then you get into issues of RAID storage, clustering, failover and business continuity, data striping, segmenting the database so you can drop or maintain parts of it while leaving others in service, and much more are all part of any high-volume DBMS if you want it reliable. Anybody who promulgates the idea that MySQL or SQL Server (assuming that's the one you meant) do not need a DBA simply have their head up their ass. Someone has to handle these tasks, and if the workload is high enough, that needs to be someone's primary duty. Unless, of course, you simply don't care about your data. The lifeblood of your enterprise. -- Lew
On 11/1/07, Kevin Hunter <hunteke@earlham.edu> wrote: > Hiya list, > > A friend recently told me that, among other things, the institutions for > which he works tend to choose MySQL or MSSQL over Postgres because the > latter requires a dedicated DBA while the former do not. When they do > spring for a DBA, they go with Oracle. As a developer/end-user in/of > the MySQL/Postgres/Oracle environments, I can say hands down that I > prefer Postgres and I have a list of reasons why to which I can readily > refer. Anyone who thinks MSSQL server doesn't need a DBA is an idiot. Period. I've worked with it in the past, and it needed at least as much DBA hand holding as PostgreSQL 7.4 did. Now, as to the difference between PostgreSQL and MySQL re: DBA being needed. The main difference between the two databases is one of behavior which has trained people to think MySQL is so simple you don't need a DBA. But MySQL will gladly corrupt your data silently until it's far too late to get any of it back. This is true of myisam files getting corrupted and is also true of it's rather poorly behaved out of the box replication, which is documented, even on the MySQL site, as quietly failing and losing sync. Main server fails, you switch to the backup, find it's 8 weeks out of date, or missing 1 in 100 rows, and it never told you. PostgreSQL, OTOH, throws fits the second anything goes wrong and often refuses to come up or keep running until you fix it. It lets you know something is wrong before you lose data. Neither one will run particularly fast or well without SOMEONE who knows how to keep them happy. Neither one takes backups automatically at midnight and ships them offsite for safe keeping. Neither one automatically finds slow running queries and analyzes them to find out what you need to do to make them run faster. The fact is that you can hire a part time DBA from one of the wonderful PostgreSQL support companies that frequent this list who will come in once a month or so and make sure your db is running smooth. And Oracle does NOT require a DBA. Oracle requires an entire entourage of DBAs to keep happy. It is a ravenous beast filled with arcane switches which requires a lot of manual labor to keep it running smooth. And not just for big transactional sites. For damned near anything you throw at it. Admittedly, when properly tuned and maintained it's an impressive database. So, in my opinion, MySQL is good for data you don't care about or can recreate or can lose a days worth of with no big loss. Like bulletin board systems, or content management. Take a backup every night or too, use rdiff-backup on it, and you're set for when it eats your data one day. which is will if you're running isam tables. MSSQL server is good for fast Windows based development, because it integrates well with .net et. al. PostgreSQL is good if you need to save money on licensing and want a DB that can rival Oracle for the 80/20 split. It's still not in the same realm for really really big transactional sites, but man is it geting close fast. Faster than Oracle is improving in that realm. Oracle is one of the big ticket DBs and needs big ticket budget and hardware to perform. They ALL NEED A DBA. Just for different things.
On Fri, 2 Nov 2007, Kevin Hunter wrote: > I don't have "ammo" to defend (or agree?) with my friend when he says > that "Postgres requires a DBA and MySQL doesn't so that's why they > choose the latter." A statement like this suggests a fundamental misunderstanding of what a DBA does, and unfortunately for you that means you're stuck with educating them as to why they don't even understand the concept--which is particularly tough when you're not a DBA yourself. The job of a DBA is to make sure the data you're storing in the database is safe and that the system as a whole performs fast enough to keep up with demand. If your data is so trivial that it doesn't really matter whether the data stays intact or gets corrupted, and there are no performance requirements to meet, then you don't need someone operating as a DBA; in every other case, you do. It's simple to setup MySQL with the default configuration running such trivial workloads, giving the impression you've built a system that works fine. There are a number of ways this default setup can end up with corrupted data one day. As mentioned in the paper you've already read, it's possible to setup recent MySQL versions to run in the new strict modes with the right type of engine such that it has reasonable standards for data integrity. Actually doing that work _correctly_ will require a DBA, but since it's possible not to do it at all and have things appear to work, many people walk away thinking they didn't need someone acting in that role at all. PostgreSQL defaults to high standards for data integrity and as a result you can't avoid being exposed to some amount of fighting with the inevitable ramifications of that. An example already thrown out here is that you must do some amount of initially frustrating configuration in order to even get users to login the way people expect. Another one on the performance side is that you'll be forced to understand the trade-offs in how vacuuming works in PostgreSQL in order to keep your system running acceptably. It's not possible to run a secure database on a larger scale without going through these sort of exercises. But if you don't care about security and never reach a large scale, you could get the impression that this work was a waste of time, and that the database that forced you to go through it was unreasonably difficult to setup without a DBA. To step back for a second, the software industry as a whole is going through this phase right now where programmers are more empowered than ever to run complicated database-driven designs without actually having to be DBAs. It used to be that you "needed a DBA" for every job like this because they were the only people who knew how to setup the database tables at all, and once they were involved they also (if they were any good) did higher-level design planning, with scalabilty in mind, and worried about data integrity issues. Software frameworks like Ruby on Rails and Hibernate have made it simple for programmers to churn out code that operates on databases without having the slightest idea what is going on under the hood. From a programmer's perspective, the "better" database is the one that requires the least work to get running. This leads to projects where a system that worked fine "in development" crashes and burns once it reaches a non-trivial workload, because if you don't design databases with an eye towards scalability and integrity you don't magically get either. The sad part is that it's nearly impossible to educate people going through this process what they're doing wrong. Human nature is such that until you've had a day where sloppy setup caused you to lose a gigantic amount of data, spending some time with that sick feeling in your stomach that everyone who has been through this knows, it's hard to ever reach the level of paranoid necessary to be a successful DBA. Until you've fought to try and speed up a database application where data normalization is the only way to solve the fundamental problem causing the slowdown, it's impossible to truly appreciate why you should consider design tradeoffs in that area from day one. Can you build a database without someone who has been through these experiences? Sure. That doesn't mean it's a good idea. -- * Greg Smith gsmith@gregsmith.com http://www.gregsmith.com Baltimore, MD
You know the old saying, tell a lie often enough and it becomes the truth. There are perceptions about databases out there that may not stand the test of analysis. But that really doesn't matter. If you want to bring down the perception, you need to use a different tact. And that has nothing to do with engineering. It's why successful companies have marketing and sales groups. -----Original Message----- From: pgsql-general-owner@postgresql.org [mailto:pgsql-general-owner@postgresql.org] On Behalf Of Lew Sent: Friday, November 02, 2007 1:50 AM To: pgsql-general@postgresql.org Subject: Re: [GENERAL] young guy wanting (Postgres DBA) ammo Kevin Hunter wrote: > At 12:35a -0400 on 02 Nov 2007, Tom Lane wrote: >> Kevin Hunter <hunteke@earlham.edu> writes: >>> However, I'm not a DBA and only minimally know what's involved in doing >>> the job, so I don't have "ammo" to defend (or agree?) with my friend >>> when he says that "Postgres requires a DBA and MySQL doesn't so that's >>> why they choose the latter. >> He's full of it ... mysql is not any easier to run or tune. > > I expected as much, but would you give me something more than "Because > Tom says so!" Good enough for me, but not for a > non-Postgres-indoctrinated person, I fear. ;-) MySQL comprises at least three different database engines, one of which does not support relational integrity. Where I used to work, we developed a MySQL-based solution that required foreign keys, so we used one of the engines that did support that. The "DBA" for the production system forgot that instruction, and didn't use our scripts, I guess, because they configured the production system with the version that didn't support foreign keys. Whoops. MySQL's configuration contains similar tuning parameters to PG's. All you need to do to gather "ammo" is to visit the respective web sites and read up on the configuration parameters for both. By "MSSQL", what do you mean? SQL Server? That also needs some tuning. Tuning, of course, is only one chore for a DBA. Designing and maintaining the dataspace, performing backups without sacrificing (too much) availability, managing indexes, perhaps writing and maintaining stored procedures, allocating usernames and passwords, creating and configuring schemas (or whatever they're called in the particular product) are all part of DBA work. Does MySQL even support stored procedures? PG surely doesn't need a DBA for small data stores, any more than MySQL does. No DBMS will survive a heavy production environment for long without someone keeping an eye on it, particularly with large data sets. Then you get into issues of RAID storage, clustering, failover and business continuity, data striping, segmenting the database so you can drop or maintain parts of it while leaving others in service, and much more are all part of any high-volume DBMS if you want it reliable. Anybody who promulgates the idea that MySQL or SQL Server (assuming that's the one you meant) do not need a DBA simply have their head up their ass. Someone has to handle these tasks, and if the workload is high enough, that needs to be someone's primary duty. Unless, of course, you simply don't care about your data. The lifeblood of your enterprise. -- Lew ---------------------------(end of broadcast)--------------------------- TIP 2: Don't 'kill -9' the postmaster
Tom Lane wrote: > Kevin Hunter <hunteke@earlham.edu> writes: > >> However, I'm not a DBA and only minimally know what's involved in doing >> the job, so I don't have "ammo" to defend (or agree?) with my friend >> when he says that "Postgres requires a DBA and MySQL doesn't so that's >> why they choose the latter. >> > > He's full of it ... mysql is not any easier to run or tune. > Kevin, If you work at an edu and want to talk to our DBA team at a large edu around operational DBA issues with MySQL, Postgres, Oracle and SQL Server, feel free to contact me off-list. My long-winded version of Tom's succinctness: Our shop supports all four. I am not a fanboi of any. Postgres continues to impress our shop with how reliable the core engine is, and how concerned with documented behavior and how concerned with following standards the project is. I don't want to just rip on MySQL, as there are some things it does do well, but the perceived "it's so easy" is a total illusion. I personally have gone on rescue missions to various departments around our University to rescue data (sometimes very important research data, upon which future grants depend) from the clutches of a dead or dying MySQL installations that were "just set up so easily" some time before. Projects where no one knows the database engine where their data is stored always end badly. The commercial database platforms and mysql continue to pitch how easy their engine is, how it tunes itself, etc, in order to compete in the marketing arena of the perception of total cost of ownership. Less DBA time is cheaper, goes the thinking, and so the smart manager avoids strategic decisions that carry larger fixed overhead costs. It makes for colorful glossy brochures. It does not really match reality, though, because how well and how many projects a team of X DBAs can support is more a function of how far the projects push the envelop with the database engine. And this pushing can happen in a lot of directions: what tools are being used, how large are the datasets, how demanding are the uptime requirements and performance requirements, how many features of the engine does the project exploit, how are releases done, etc etc. This stuff never factors into the marketing hype, but this is where it gets real. If your shop must meet any formal audit standards, you will be hard-pressed to do this without a DBA. If you *are* able to meet audit, then some other group(s) must be doing this work. A rose by another other name costs just as much. There are other reasons that make sense for a shop to decide what RDBMS is best for them, but the alleged reason of "MySQL requires less time" is definitely not one of them. HTH, Paul
On Fri, 2007-11-02 at 00:03 -0400, Kevin Hunter wrote: > - More in line with the conversation with my friend, what/why is it that > Postgres needs a DBA while MySQL doesn't? I highly suspect that the > assumption that MySQL doesn't need a DBA is incorrect, but that's what > was posed to me and I couldn't agree or disagree. Before I was a DBA I worked as a developer in shop that had both PG and MySQL running. We had no DBA, or anyone at that time that really understood databases from a DBA side. The stuff that we had running in PG just worked. Period, no problems (and this was on PG 7.0). The stuff we had in MySQL, well, that "mysteriously" ground to a halt every night at the same time, making several customers applications unavailable. Without anyone on staff that could actually diagnose the issue, the only soution that they came up with (and I emphasise the word they, as I had no part in this :-)) was a cron job was that restarted the MySQL server every night. -- Brad Nicholson 416-673-4106 Database Administrator, Afilias Canada Corp.
On Friday 02 November 2007 00:03, Kevin Hunter wrote: > However, I'm not a DBA and only minimally know what's involved in doing > the job, so I don't have "ammo" to defend (or agree?) with my friend > when he says that "Postgres requires a DBA and MySQL doesn't so that's > why they choose the latter." Basically, I have so far not run across > the need to mess with any of the tunables. (I am naive, I know.) > Clearly, one should generally pick the best tool for the job, so I'm no > pundit that Postgres is *always* the right answer, but I'd like to be > informed. My questions: > > - With 8.2, and the almost-out 8.3, what kinds of responsibilities > should a Postgres DBA expect to have? > For smaller workloads, the day to day responsabilities are approaching zero; just turn on autovacuum and postgresql will take care of itself. There are some other responsibilties someone should be cognisent of, for example you'll probably want to do daily backups (cron job a pg_dump is probably sufficient, though our need for both pg_dump/pg_dumpall tools does complicate life for the SA doing DBA style work). The types of jobs I am talking about would be things like running an internal bulletin board, powering a cacti instance, as a backing to a jabber server, maybe running nagios against it, or similar types of workloads. > - More in line with the conversation with my friend, what/why is it that > Postgres needs a DBA while MySQL doesn't? I highly suspect that the > assumption that MySQL doesn't need a DBA is incorrect, but that's what > was posed to me and I couldn't agree or disagree. > I think the perception comes from the typical workloads each database has to deal with. At OmniTI we run a lot of MySQL servers for small workloads like the nagios/cacti type stuff, and we also use it for dumb data stores, where we might dump a bunch of log data into a database every day and then do a couple selects against it to pull out some data and aggregate it in perl. Many of these instance still run on mysql 4, and many I have never seen; our SA team handles them and they need zero administration because the jobs they do are just not complicated, the data they hold is fairly replaceable, and the applications that run against them have compensated for a lack of features in thier design up front. Now, we also run a couple of Postgres installations in that capacity as well, and honestly my involvement in those database is pretty much near zero too. I generally take an initial swing through their postgresql.conf, making sure autovacuum is on, and then I don't go back. But, speaking in generalities, MySQL is the defacto choice for those types of workloads, and has been filling that role for years, so the idea of a DBA-less DB has perpetuated. Note people also used to make this claim about BerkelyDB back in the day. Now, we also run a bunch of those large, TB size instances and 100-1000 tps workloads, both in PostgreSQL, MySQL, (and Oracle too actually), and trust me, all of those systems do have a DBA looking at them with frequency (along with bunches of graphs and custom alerts written to notify when things go wrong). The PostgreSQL ones tend to be more complicated, because PostgreSQL has a mature feature set with views/triggers/procedures and very complex SQL that our developers take full advantage of. On the MySQL side, the developers tend to push more of that logic into the application because the tools aren't as mature in the database. This requires less interaction with a DBA, at the cost of making applications harder to write and maintain. Now, if your talking about small departmental databases, then you probably arn't going to have to worry about this stuff, but since PostgreSQL's typical usage has leaned more toward solving larger/more complex problems, the idea that you would have a DBA standing behind it generally has been true in far more cases. -- Robert Treat Build A Brighter LAMP :: Linux Apache {middleware} PostgreSQL
On 11/1/07, Kevin Hunter <hunteke@earlham.edu> wrote: > - More in line with the conversation with my friend, what/why is it that > Postgres needs a DBA while MySQL doesn't? I highly suspect that the > assumption that MySQL doesn't need a DBA is incorrect, but that's what > was posed to me and I couldn't agree or disagree. I'm going to say something that might be surprising to some since I'm one of the biggest pg cheerleaders out there, but there are some particular challenges that affect people trying to jump into the database that can make things difficult for newcomers. These issues have been mostly addressed in recent releases but there is some legitimacy some complaints in the past, including: * difficulty dealing with vacuum issues in 24/7 environments (pre autovacuum) * relative scarcity of information or lack of community consensus on memory settings * dealing with relatively slow update performance (both mysql and mssql are historically non-mvcc databases) * grappling how best to apply the many features to best use (should i use functions, or views?) * a lot of other things related to mvcc that surprise users, slow count(*), etc etc A lot of those issues have been mitigated or eliminated but the relative complexity of setting up postgresql for various workloads has been a fair analysis. Having a DBA, or at least someone who understands databases (I don't like overly broad labels), is not necessarily a bad idea, either. A competent DBA, by the way, can keep your programmers under control and can make your team much stronger. All this being said, PostgreSQL is a rich and robust system that can redefine your idea of information architecture as you pour over the manual asking your self again and again (aha! I've always thought this was good...where have you been all my life!). Once over the initial hump where learning new things feels awkward and scary, it will slowly dawn on you that you have enormous power at your disposal to solve all kinds of crazy problems quickly and efficiently. I would tell your management to take a chance on the database, and to take a chance on you, and if your earnestness and abilities are up to par (posting on this list is already a very positive sign), then you won't be disappointed. merlin
--- Greg Smith <gsmith@gregsmith.com> wrote: > On Fri, 2 Nov 2007, Kevin Hunter wrote: > > > I don't have "ammo" to defend (or agree?) with my > friend when he says > > that "Postgres requires a DBA and MySQL doesn't so > that's why they > > choose the latter." > > [snip] > > To step back for a second, the software industry as > a whole is going > through this phase right now where programmers are > more empowered than > ever to run complicated database-driven designs > without actually having to > be DBAs. It used to be that you "needed a DBA" for > every job like this > because they were the only people who knew how to > setup the database > tables at all, and once they were involved they also > (if they were any > good) did higher-level design planning, with > scalabilty in mind, and > worried about data integrity issues. > > Software frameworks like Ruby on Rails and Hibernate > have made it simple > for programmers to churn out code that operates on > databases without > having the slightest idea what is going on under the > hood. From a > programmer's perspective, the "better" database is > the one that requires > the least work to get running. This leads to > projects where a system that > worked fine "in development" crashes and burns once > it reaches a > non-trivial workload, because if you don't design > databases with an eye > towards scalability and integrity you don't > magically get either. > As one of these programmers, where is the best place to find the information I need to get it right. Finding information, and finding good information is not the same thing, and I am wary of 99% of what I find using google. Since you know what a DBA needs to know, I ask you where I can learn what you believe a good DBA needs to know. Or am I OK just relying on the documentation that comes with a given RDBMS (Postgres, MySQL, MS SQL, &c.)? Ted
On Fri, Nov 02, 2007 at 01:26:23AM -0400, Tom Lane wrote: > out there who don't get past that hurdle and just give up. It would > be interesting to troll the mysql lists for evidence of the downside > of their default ... which'd be along the line of "someone broke into > my completely insecure database and stole/destroyed all my data" ... Not "stole", surely? If they don't know they need a DBA, then they presumably also don't know that someone's been into the system and taken stuff. To respong to the original question, I'd also point out that sites that thought, "We can't afford a sysadmin," have usually found out the hard way that they were mistaken. Even for simple Windows networks of a few machines, you need someone to look after it. Refusing to hire a DBA for data you actually care about is like refusing to take your car to the mechanic at regular service intervals, because "there's nothing wrong with it." Supposing you don't need a DBA for MySQL or MS SQL Server or any other such system is a dangerous delusion. A -- Andrew Sullivan | ajs@crankycanuck.ca
Yes..I concur that every business should retain a dedicated DBA with the caveat that the DBA's expertise states a bit more than 'changed the DBA password' M-- ----- Original Message ----- Wrom: HDMNNSKVFVWRKJVZCMHVIBGDADRZFSQHYUCDDJB To: <pgsql-general@postgresql.org> Sent: Saturday, November 03, 2007 11:40 AM Subject: Re: [GENERAL] young guy wanting (Postgres DBA) ammo > On Fri, Nov 02, 2007 at 01:26:23AM -0400, Tom Lane wrote: > > out there who don't get past that hurdle and just give up. It would > > be interesting to troll the mysql lists for evidence of the downside > > of their default ... which'd be along the line of "someone broke into > > my completely insecure database and stole/destroyed all my data" ... > > Not "stole", surely? If they don't know they need a DBA, then they > presumably also don't know that someone's been into the system and > taken stuff. > > To respong to the original question, I'd also point out that sites > that thought, "We can't afford a sysadmin," have usually found out > the hard way that they were mistaken. Even for simple Windows > networks of a few machines, you need someone to look after it. > > Refusing to hire a DBA for data you actually care about is like > refusing to take your car to the mechanic at regular service > intervals, because "there's nothing wrong with it." Supposing you > don't need a DBA for MySQL or MS SQL Server or any other such system > is a dangerous delusion. > > A > > -- > Andrew Sullivan | ajs@crankycanuck.ca > > ---------------------------(end of broadcast)--------------------------- > TIP 3: Have you checked our extensive FAQ? > > http://www.postgresql.org/docs/faq >
Thank you to all for your thoughts and responses. Kevin
On Sat, 3 Nov 2007, Ted Byers wrote: > As one of these programmers, where is the best place to find the > information I need to get it right...I ask you where I can learn what > you believe a good DBA needs to know. What a DBA should know in general is a little different from the question I think you want an answer to, which is "what should a programmer know so that they can effectively work like/without a DBA?" There's an academic answer to that question. I could tell you to learn something about data normalization, indexing, what happens on the server when you join two tables, and how cursors work in your language of choice. But without some practice, I don't know how much of that would stick. The most valuable exercise I think someone with a good programming background, but is relatively new to databases, can go through is to work on a dramatically larger data set than you would normally encounter. The main thing I've seen developers do wrong is writing code or designing tables that don't scale well. Since coding works better when you can get quick feedback after changes, it's very easy to settle into working with only small test cases, and that can turn into a serious problem when such code runs into the real world. The only way to really understand how to think more like a DBA is to try and write something that works well against a big pile of data. To throw out some simple guidelines, you want to be working with a database that's at least 10X as big as the amount of RAM on your system, and if you do something that scans the full table like "select * from x" that should take at least a couple of minutes to complete. Now, try to build a program that operates quickly on subsets of this data. Working on this scale will let you discover very quickly if you've made any of the blatant mistakes that programmers unfamiliar with working on full-size data sets tend to make. Most importantly, you don't ever want to write queries that assume it's OK to send all the data to the client to sort through, rather than pushing as much as possible toward the database server. Second, you'll end up needing to construct indexes properly to optimize the server side. And if there's more than one big table involved, you'll be forced to come to grips with how joins work and probably learn something about normalization. You'll know you're learning something valuable whenver you run something that you expect to return almost instantly, but instead it churns away for minutes before finishing. Resist the urge to stop it too quickly, and instead spend that time thinking about what's gone wrong, or monitoring the client and/or server for clues. -- * Greg Smith gsmith@gregsmith.com http://www.gregsmith.com Baltimore, MD
Scott Marlowe wrote: > PostgreSQL is ... still not in the > same realm for really really big transactional sites, but man is it > geting close fast. I beg to differ. All anecdotal evidence, and also Sun benchmarks with Postgres, show that it is not only "in the same realm" but can actually come out ahead. <http://www.spec.org/jAppServer2004/results/res2007q3/jAppServer2004-20070703-00073.html> <http://blogs.ittoolbox.com/database/soup/archives/postgresql-publishes-first-real-benchmark-17470> > This publication shows that a properly tuned > PostgreSQL is not only as fast or faster than MySQL, > but almost as fast as Oracle (since the hardware platforms > are different, it's hard to compare directly). > This is something we've been saying for the last 2 years, and now we can prove it. There are links to the SPEC site which shows the tuning parameters. -- Lew
On 11/2/07, Lew <lew@lwsc.ehost-services.com> wrote: > Scott Marlowe wrote: > > PostgreSQL is ... still not in the > > same realm for really really big transactional sites, but man is it > > geting close fast. > > I beg to differ. All anecdotal evidence, and also Sun benchmarks with > Postgres, show that it is not only "in the same realm" but can actually come > out ahead. > <http://www.spec.org/jAppServer2004/results/res2007q3/jAppServer2004-20070703-00073.html> > > <http://blogs.ittoolbox.com/database/soup/archives/postgresql-publishes-first-real-benchmark-17470> I'll point out that the very article you're pointing me to, which I've read before, btw, has this in it: "Why pay more? As I said, almost as fast as Oracle. While the list of Spec publications on affordable commodity hardware is sparse, there are some. For example, the 874 JOpS@Standard on Oracle 10+Itanium+HP-UX. That's less than 15% faster than our PostgreSQL publication. " On commodity hardware, single image machines, PostgreSQL is a match for Oracle. On clustered hardware, Oracle still has a lead.