Thread: young guy wanting (Postgres DBA) ammo

young guy wanting (Postgres DBA) ammo

From
Kevin Hunter
Date:
Hiya list,

A friend recently told me that, among other things, the institutions for
which he works tend to choose MySQL or MSSQL over Postgres because the
latter requires a dedicated DBA while the former do not.  When they do
spring for a DBA, they go with Oracle.  As a developer/end-user in/of
the MySQL/Postgres/Oracle environments, I can say hands down that I
prefer Postgres and I have a list of reasons why to which I can readily
refer.

However, I'm not a DBA and only minimally know what's involved in doing
the job, so I don't have "ammo" to defend (or agree?) with my friend
when he says that "Postgres requires a DBA and MySQL doesn't so that's
why they choose the latter."  Basically, I have so far not run across
the need to mess with any of the tunables.  (I am naive, I know.)
Clearly, one should generally pick the best tool for the job, so I'm no
pundit that Postgres is *always* the right answer, but I'd like to be
informed.  My questions:

- With 8.2, and the almost-out 8.3, what kinds of responsibilities
should a Postgres DBA expect to have?

- More in line with the conversation with my friend, what/why is it that
Postgres needs a DBA while MySQL doesn't?  I highly suspect that the
assumption that MySQL doesn't need a DBA is incorrect, but that's what
was posed to me and I couldn't agree or disagree.

I am familiar with Greg Smith et al's page:
http://www.postgresql.org/docs/techdocs.83

Thoughts?  Thanks in advance,

Kevin

P.S. BTW Greg et al, thank you a *ton* for putting that document
together.  I learned a lot watching the discussion on the list and
reading the final version.

Re: young guy wanting (Postgres DBA) ammo

From
Tom Lane
Date:
Kevin Hunter <hunteke@earlham.edu> writes:
> However, I'm not a DBA and only minimally know what's involved in doing
> the job, so I don't have "ammo" to defend (or agree?) with my friend
> when he says that "Postgres requires a DBA and MySQL doesn't so that's
> why they choose the latter.

He's full of it ... mysql is not any easier to run or tune.

            regards, tom lane

Re: young guy wanting (Postgres DBA) ammo

From
Kevin Hunter
Date:
At 12:35a -0400 on 02 Nov 2007, Tom Lane wrote:
> Kevin Hunter <hunteke@earlham.edu> writes:
>> However, I'm not a DBA and only minimally know what's involved in doing
>> the job, so I don't have "ammo" to defend (or agree?) with my friend
>> when he says that "Postgres requires a DBA and MySQL doesn't so that's
>> why they choose the latter.
>
> He's full of it ... mysql is not any easier to run or tune.

I expected as much, but would you give me something more than "Because
Tom says so!"  Good enough for me, but not for a
non-Postgres-indoctrinated person, I fear.  ;-)

Thanks,

Kevin

Re: young guy wanting (Postgres DBA) ammo

From
Tom Lane
Date:
Kevin Hunter <hunteke@earlham.edu> writes:
> At 12:35a -0400 on 02 Nov 2007, Tom Lane wrote:
>> He's full of it ... mysql is not any easier to run or tune.

> I expected as much, but would you give me something more than "Because
> Tom says so!"  Good enough for me, but not for a
> non-Postgres-indoctrinated person, I fear.  ;-)

Well, let's see:

* Installation: on practically any platform that "I don't want to be a
DBA" people would use, it's a wash.  You download a package, you do
the local equivalent of "sudo service <foo> start", and there you
are.  I'm only really familiar personally with the details of this for
Red Hat systems, but being personally responsible for RH's packaging of
both postgres and mysql, I can tell you they are isomorphic.

* Initial configuration: well, both DBs are a bit unfriendly when it
comes to setting up some initial database users and configuring a
sane security policy.  This is actually a hard problem since "sane"
means different things to different people, so a one-size-fits-all
preconfigured solution doesn't work.  IMHO, for both DBs you're
in for some manual-reading whether you like it or not.  Red Hat's
packages of the two are not isomorphic on this point --- RH forces ident
authorization by default for PG, while mysql is left with its default
behavior which is comparable to PG's "trust".  (Don't ask me why, those
decisions were made before I got here.)  We see the blow-back from the
forced ident authorization on our lists, in that people can't figure
out how to get into the DB initially, and I'm sure there are people
out there who don't get past that hurdle and just give up.  It would
be interesting to troll the mysql lists for evidence of the downside
of their default ... which'd be along the line of "someone broke into
my completely insecure database and stole/destroyed all my data" ...

* Tuning: there used to be some truth to the "PG is hard" meme here,
in that the out-of-the-box defaults for PG were designed for what we'd
now consider laughably small boxes.  PG is now cranked up to the point
of being sanely configured for merely small boxes, which is about where
the out-of-the-box mysql configuration is too.  If you want you can
award mysql a few points for providing sample config files for larger
configurations, but even knowing that one of those should be installed
is knowledge that an "I don't want to be a DBA" person wouldn't have.
I think that in either case you're going to end up fooling with the
configuration parameters if you're doing anything much more demanding
than cataloging your baseball cards.

* Routine backups: seems about the same to me; in both cases you
gotta do something along the lines of setting up a cron job to call
a DB-supplied program.

* Replication: I'll grant that mysql has got built-in replication that
is easier to set up than any PG alternative I know about.  But is the
"I don't want to be a DBA" crowd planning to run replication, or likely
to get it right without reading any manuals?

Certainly there is a perception out there that mysql is easier,
but I think it's based on out-of-date information.  You might also
care to read Greg Smith's take on the matter:
http://www.postgresql.org/docs/techdocs.83.html
He points out that each DB's developer community has been working
over the past few years to ameliorate their respective perceived
disadvantages ...

            regards, tom lane

Re: young guy wanting (Postgres DBA) ammo

From
Lew
Date:
Kevin Hunter wrote:
> At 12:35a -0400 on 02 Nov 2007, Tom Lane wrote:
>> Kevin Hunter <hunteke@earlham.edu> writes:
>>> However, I'm not a DBA and only minimally know what's involved in doing
>>> the job, so I don't have "ammo" to defend (or agree?) with my friend
>>> when he says that "Postgres requires a DBA and MySQL doesn't so that's
>>> why they choose the latter.
>> He's full of it ... mysql is not any easier to run or tune.
>
> I expected as much, but would you give me something more than "Because
> Tom says so!"  Good enough for me, but not for a
> non-Postgres-indoctrinated person, I fear.  ;-)

MySQL comprises at least three different database engines, one of which does
not support relational integrity.

Where I used to work, we developed a MySQL-based solution that required
foreign keys, so we used one of the engines that did support that.  The "DBA"
for the production system forgot that instruction, and didn't use our scripts,
I guess, because they configured the production system with the version that
didn't support foreign keys.  Whoops.

MySQL's configuration contains similar tuning parameters to PG's.  All you
need to do to gather "ammo" is to visit the respective web sites and read up
on the configuration parameters for both.

By "MSSQL", what do you mean?  SQL Server?  That also needs some tuning.

Tuning, of course, is only one chore for a DBA.  Designing and maintaining the
dataspace, performing backups without sacrificing (too much) availability,
managing indexes, perhaps writing and maintaining stored procedures,
allocating usernames and passwords, creating and configuring schemas (or
whatever they're called in the particular product) are all part of DBA work.

Does MySQL even support stored procedures?

PG surely doesn't need a DBA for small data stores, any more than MySQL does.
  No DBMS will survive a heavy production environment for long without someone
keeping an eye on it, particularly with large data sets.  Then you get into
issues of RAID storage, clustering, failover and business continuity, data
striping, segmenting the database so you can drop or maintain parts of it
while leaving others in service, and much more are all part of any high-volume
DBMS if you want it reliable.

Anybody who promulgates the idea that MySQL or SQL Server (assuming that's the
one you meant) do not need a DBA simply have their head up their ass.  Someone
has to handle these tasks, and if the workload is high enough, that needs to
be someone's primary duty.

Unless, of course, you simply don't care about your data.  The lifeblood of
your enterprise.

--
Lew

Re: young guy wanting (Postgres DBA) ammo

From
"Scott Marlowe"
Date:
On 11/1/07, Kevin Hunter <hunteke@earlham.edu> wrote:
> Hiya list,
>
> A friend recently told me that, among other things, the institutions for
> which he works tend to choose MySQL or MSSQL over Postgres because the
> latter requires a dedicated DBA while the former do not.  When they do
> spring for a DBA, they go with Oracle.  As a developer/end-user in/of
> the MySQL/Postgres/Oracle environments, I can say hands down that I
> prefer Postgres and I have a list of reasons why to which I can readily
> refer.

Anyone who thinks MSSQL server doesn't need a DBA is an idiot.
Period.  I've worked with it in the past, and it needed at least as
much DBA hand holding as PostgreSQL 7.4 did.

Now, as to the difference between PostgreSQL and MySQL re: DBA being
needed.  The main difference between the two databases is one of
behavior which has trained people to think MySQL is so simple you
don't need a DBA.  But MySQL will gladly corrupt your data silently
until it's far too late to get any of it back.  This is true of myisam
files getting corrupted and is also true of it's rather poorly behaved
out of the box replication, which is documented, even on the MySQL
site, as quietly failing and losing sync.  Main server fails, you
switch to the backup, find it's 8 weeks out of date, or missing 1 in
100 rows, and it never told you.

PostgreSQL, OTOH, throws fits the second anything goes wrong and often
refuses to come up or keep running until you fix it.  It lets you know
something is wrong before you lose data.

Neither one will run particularly fast or well without SOMEONE who
knows how to keep them happy.  Neither one takes backups automatically
at midnight and ships them offsite for safe keeping.  Neither one
automatically finds slow running queries and analyzes them to find out
what you need to do to make them run faster.

The fact is that you can hire a part time DBA from one of the
wonderful PostgreSQL support companies that frequent this list who
will come in once a month or so and make sure your db is running
smooth.

And Oracle does NOT require a DBA.  Oracle requires an entire
entourage of DBAs to keep happy.  It is a ravenous beast filled with
arcane switches which requires a lot of manual labor to keep it
running smooth.  And not just for big transactional sites.  For damned
near anything you throw at it.  Admittedly, when properly tuned and
maintained it's an impressive database.

So, in my opinion, MySQL is good for data you don't care about or can
recreate or can lose a days worth of with no big loss.  Like bulletin
board systems, or content management.  Take a backup every night or
too, use rdiff-backup on it, and you're set for when it eats your data
one day.  which is will if you're running isam tables.

MSSQL server is good for fast Windows based development, because it
integrates well with .net et. al.

PostgreSQL is good if you need to save money on licensing and want a
DB that can rival Oracle for the 80/20 split.  It's still not in the
same realm for really really big transactional sites, but man is it
geting close fast.  Faster than Oracle is improving in that realm.

Oracle is one of the big ticket DBs and needs big ticket budget and
hardware to perform.

They ALL NEED A DBA.  Just for different things.

Re: young guy wanting (Postgres DBA) ammo

From
Greg Smith
Date:
On Fri, 2 Nov 2007, Kevin Hunter wrote:

> I don't have "ammo" to defend (or agree?) with my friend when he says
> that "Postgres requires a DBA and MySQL doesn't so that's why they
> choose the latter."

A statement like this suggests a fundamental misunderstanding of what a
DBA does, and unfortunately for you that means you're stuck with educating
them as to why they don't even understand the concept--which is
particularly tough when you're not a DBA yourself.

The job of a DBA is to make sure the data you're storing in the database
is safe and that the system as a whole performs fast enough to keep up
with demand.  If your data is so trivial that it doesn't really matter
whether the data stays intact or gets corrupted, and there are no
performance requirements to meet, then you don't need someone operating as
a DBA; in every other case, you do.

It's simple to setup MySQL with the default configuration running such
trivial workloads, giving the impression you've built a system that works
fine.  There are a number of ways this default setup can end up with
corrupted data one day.  As mentioned in the paper you've already read,
it's possible to setup recent MySQL versions to run in the new strict
modes with the right type of engine such that it has reasonable standards
for data integrity.  Actually doing that work _correctly_ will require a
DBA, but since it's possible not to do it at all and have things appear to
work, many people walk away thinking they didn't need someone acting in
that role at all.

PostgreSQL defaults to high standards for data integrity and as a result
you can't avoid being exposed to some amount of fighting with the
inevitable ramifications of that.  An example already thrown out here is
that you must do some amount of initially frustrating configuration in
order to even get users to login the way people expect.  Another one on
the performance side is that you'll be forced to understand the trade-offs
in how vacuuming works in PostgreSQL in order to keep your system running
acceptably.  It's not possible to run a secure database on a larger scale
without going through these sort of exercises.  But if you don't care
about security and never reach a large scale, you could get the impression
that this work was a waste of time, and that the database that forced you
to go through it was unreasonably difficult to setup without a DBA.

To step back for a second, the software industry as a whole is going
through this phase right now where programmers are more empowered than
ever to run complicated database-driven designs without actually having to
be DBAs.  It used to be that you "needed a DBA" for every job like this
because they were the only people who knew how to setup the database
tables at all, and once they were involved they also (if they were any
good) did higher-level design planning, with scalabilty in mind, and
worried about data integrity issues.

Software frameworks like Ruby on Rails and Hibernate have made it simple
for programmers to churn out code that operates on databases without
having the slightest idea what is going on under the hood.  From a
programmer's perspective, the "better" database is the one that requires
the least work to get running.  This leads to projects where a system that
worked fine "in development" crashes and burns once it reaches a
non-trivial workload, because if you don't design databases with an eye
towards scalability and integrity you don't magically get either.

The sad part is that it's nearly impossible to educate people going
through this process what they're doing wrong.  Human nature is such that
until you've had a day where sloppy setup caused you to lose a gigantic
amount of data, spending some time with that sick feeling in your stomach
that everyone who has been through this knows, it's hard to ever reach the
level of paranoid necessary to be a successful DBA.  Until you've fought
to try and speed up a database application where data normalization is the
only way to solve the fundamental problem causing the slowdown, it's
impossible to truly appreciate why you should consider design tradeoffs in
that area from day one.  Can you build a database without someone who has
been through these experiences?  Sure.  That doesn't mean it's a good
idea.

--
* Greg Smith gsmith@gregsmith.com http://www.gregsmith.com Baltimore, MD

Re: young guy wanting (Postgres DBA) ammo

From
"Gauthier, Dave"
Date:
You know the old saying, tell a lie often enough and it becomes the
truth.

There are perceptions about databases out there that may not stand the
test of analysis.  But that really doesn't matter.  If you want to bring
down the perception, you need to use a different tact.  And that has
nothing to do with engineering. It's why successful companies have
marketing and sales groups.





-----Original Message-----
From: pgsql-general-owner@postgresql.org
[mailto:pgsql-general-owner@postgresql.org] On Behalf Of Lew
Sent: Friday, November 02, 2007 1:50 AM
To: pgsql-general@postgresql.org
Subject: Re: [GENERAL] young guy wanting (Postgres DBA) ammo

Kevin Hunter wrote:
> At 12:35a -0400 on 02 Nov 2007, Tom Lane wrote:
>> Kevin Hunter <hunteke@earlham.edu> writes:
>>> However, I'm not a DBA and only minimally know what's involved in
doing
>>> the job, so I don't have "ammo" to defend (or agree?) with my friend
>>> when he says that "Postgres requires a DBA and MySQL doesn't so
that's
>>> why they choose the latter.
>> He's full of it ... mysql is not any easier to run or tune.
>
> I expected as much, but would you give me something more than "Because
> Tom says so!"  Good enough for me, but not for a
> non-Postgres-indoctrinated person, I fear.  ;-)

MySQL comprises at least three different database engines, one of which
does
not support relational integrity.

Where I used to work, we developed a MySQL-based solution that required
foreign keys, so we used one of the engines that did support that.  The
"DBA"
for the production system forgot that instruction, and didn't use our
scripts,
I guess, because they configured the production system with the version
that
didn't support foreign keys.  Whoops.

MySQL's configuration contains similar tuning parameters to PG's.  All
you
need to do to gather "ammo" is to visit the respective web sites and
read up
on the configuration parameters for both.

By "MSSQL", what do you mean?  SQL Server?  That also needs some tuning.

Tuning, of course, is only one chore for a DBA.  Designing and
maintaining the
dataspace, performing backups without sacrificing (too much)
availability,
managing indexes, perhaps writing and maintaining stored procedures,
allocating usernames and passwords, creating and configuring schemas (or

whatever they're called in the particular product) are all part of DBA
work.

Does MySQL even support stored procedures?

PG surely doesn't need a DBA for small data stores, any more than MySQL
does.
  No DBMS will survive a heavy production environment for long without
someone
keeping an eye on it, particularly with large data sets.  Then you get
into
issues of RAID storage, clustering, failover and business continuity,
data
striping, segmenting the database so you can drop or maintain parts of
it
while leaving others in service, and much more are all part of any
high-volume
DBMS if you want it reliable.

Anybody who promulgates the idea that MySQL or SQL Server (assuming
that's the
one you meant) do not need a DBA simply have their head up their ass.
Someone
has to handle these tasks, and if the workload is high enough, that
needs to
be someone's primary duty.

Unless, of course, you simply don't care about your data.  The lifeblood
of
your enterprise.

--
Lew

---------------------------(end of broadcast)---------------------------
TIP 2: Don't 'kill -9' the postmaster

Re: young guy wanting (Postgres DBA) ammo

From
paul rivers
Date:
Tom Lane wrote:
> Kevin Hunter <hunteke@earlham.edu> writes:
>
>> However, I'm not a DBA and only minimally know what's involved in doing
>> the job, so I don't have "ammo" to defend (or agree?) with my friend
>> when he says that "Postgres requires a DBA and MySQL doesn't so that's
>> why they choose the latter.
>>
>
> He's full of it ... mysql is not any easier to run or tune.
>
Kevin,

If you work at an edu and want to talk to our DBA team at a large edu
around operational DBA issues with MySQL, Postgres, Oracle and SQL
Server, feel free to contact me off-list.

My long-winded version of Tom's succinctness:

Our shop supports all four.  I am not a fanboi of any.  Postgres
continues to impress our shop with how reliable the core engine is, and
how concerned with documented behavior and how concerned with following
standards the project is.

I don't want to just rip on MySQL, as there are some things it does do
well, but the perceived "it's so easy" is a total illusion.  I
personally have gone on rescue missions to various departments around
our University to rescue data (sometimes very important research data,
upon which future grants depend) from the clutches of a dead or dying
MySQL installations that were "just set up so easily" some time
before.   Projects where no one knows the database engine where their
data is stored always end badly.

The commercial database platforms and mysql continue to pitch how easy
their engine is, how it tunes itself, etc, in order to compete in the
marketing arena of the perception of total cost of ownership.  Less DBA
time is cheaper, goes the thinking, and so the smart manager avoids
strategic decisions that carry larger fixed overhead costs.  It makes
for  colorful glossy  brochures.

It does not really match reality, though, because how well and how many
projects a team of X DBAs can support is more a function of how far the
projects push the envelop with the database engine.   And this pushing
can happen in a lot of directions: what tools are being used, how large
are the datasets, how demanding are the uptime requirements and
performance requirements, how many features of the engine does the
project exploit, how are releases done, etc etc.  This stuff never
factors into the marketing hype, but this is where it gets real.

If your shop must meet any formal audit standards, you will be
hard-pressed to do this without a DBA.  If you *are* able to meet audit,
then some other group(s) must be doing this work.  A rose by another
other name costs just as much.

There are other reasons that make sense for a shop to decide what RDBMS
is best for them, but the alleged reason of "MySQL requires less time"
is definitely not one of them.

HTH,
Paul







Re: young guy wanting (Postgres DBA) ammo

From
Brad Nicholson
Date:
On Fri, 2007-11-02 at 00:03 -0400, Kevin Hunter wrote:
> - More in line with the conversation with my friend, what/why is it that
> Postgres needs a DBA while MySQL doesn't?  I highly suspect that the
> assumption that MySQL doesn't need a DBA is incorrect, but that's what
> was posed to me and I couldn't agree or disagree.

Before I was a DBA I worked as a developer in shop that had both PG and
MySQL running.  We had no DBA, or anyone at that time that really
understood databases from a DBA side.

The stuff that we had running in PG just worked.  Period, no problems
(and this was on PG 7.0).  The stuff we had in MySQL, well, that
"mysteriously" ground to a halt every night at the same time, making
several customers applications unavailable.  Without anyone on staff
that could actually diagnose the issue, the only soution that they came
up with (and I emphasise the word they, as I had no part in this :-))
was a cron job was that restarted the MySQL server every night.

--
Brad Nicholson  416-673-4106
Database Administrator, Afilias Canada Corp.



Re: young guy wanting (Postgres DBA) ammo

From
Robert Treat
Date:
On Friday 02 November 2007 00:03, Kevin Hunter wrote:
> However, I'm not a DBA and only minimally know what's involved in doing
> the job, so I don't have "ammo" to defend (or agree?) with my friend
> when he says that "Postgres requires a DBA and MySQL doesn't so that's
> why they choose the latter."  Basically, I have so far not run across
> the need to mess with any of the tunables.  (I am naive, I know.)
> Clearly, one should generally pick the best tool for the job, so I'm no
> pundit that Postgres is *always* the right answer, but I'd like to be
> informed.  My questions:
>
> - With 8.2, and the almost-out 8.3, what kinds of responsibilities
> should a Postgres DBA expect to have?
>

For smaller workloads, the day to day responsabilities are approaching zero;
just turn on autovacuum and postgresql will take care of itself.  There are
some other responsibilties someone should be cognisent of, for example you'll
probably want to do daily backups (cron job a pg_dump is probably sufficient,
though our need for both pg_dump/pg_dumpall tools does complicate life for
the SA doing DBA style work).  The types of jobs I am talking about would be
things like running an internal bulletin board, powering a cacti instance, as
a backing to a jabber server, maybe running nagios against it, or similar
types of workloads.

> - More in line with the conversation with my friend, what/why is it that
> Postgres needs a DBA while MySQL doesn't?  I highly suspect that the
> assumption that MySQL doesn't need a DBA is incorrect, but that's what
> was posed to me and I couldn't agree or disagree.
>

I think the perception comes from the typical workloads each database has to
deal with.  At OmniTI we run a lot of MySQL servers for small workloads like
the nagios/cacti type stuff, and we also use it for dumb data stores, where
we might dump a bunch of log data into a database every day and then do a
couple selects against it to pull out some data and aggregate it in perl.
Many of these instance still run on mysql 4, and many I have never seen; our
SA team handles them and they need zero administration because the jobs they
do are just not complicated, the data they hold is fairly replaceable, and
the applications that run against them have compensated for a lack of
features in thier design up front.  Now, we also run a couple of Postgres
installations in that capacity as well, and honestly my involvement in those
database is pretty much near zero too. I generally take an initial swing
through their postgresql.conf, making sure autovacuum is on, and then I don't
go back. But, speaking in generalities, MySQL is the defacto choice for those
types of workloads, and has been filling that role for years, so the idea of
a DBA-less DB has perpetuated. Note people also used to make this claim about
BerkelyDB back in the day.

Now, we also run a bunch of those large, TB size instances and 100-1000 tps
workloads, both in PostgreSQL, MySQL, (and Oracle too actually), and trust
me, all of those systems do have a DBA looking at them with frequency (along
with bunches of graphs and custom alerts written to notify when things go
wrong).  The PostgreSQL ones tend to be more complicated, because PostgreSQL
has a mature feature set with views/triggers/procedures and very complex SQL
that our developers take full advantage of. On the MySQL side, the developers
tend to push more of that logic into the application because the tools aren't
as mature in the database. This requires less interaction with a DBA, at the
cost of making applications harder to write and maintain.  Now, if your
talking about small departmental databases, then you probably arn't going to
have to worry about this stuff, but since PostgreSQL's typical usage has
leaned more toward solving larger/more complex problems, the idea that you
would have a DBA standing behind it generally has been true in far more
cases.

--
Robert Treat
Build A Brighter LAMP :: Linux Apache {middleware} PostgreSQL

Re: young guy wanting (Postgres DBA) ammo

From
"Merlin Moncure"
Date:
On 11/1/07, Kevin Hunter <hunteke@earlham.edu> wrote:
> - More in line with the conversation with my friend, what/why is it that
> Postgres needs a DBA while MySQL doesn't?  I highly suspect that the
> assumption that MySQL doesn't need a DBA is incorrect, but that's what
> was posed to me and I couldn't agree or disagree.

I'm going to say something that might be surprising to some since I'm
one of the biggest pg cheerleaders out there, but there are some
particular challenges that affect people trying to jump into the
database that can make things difficult for newcomers.  These issues
have been mostly addressed in recent releases but there is some
legitimacy some complaints in the past, including:

* difficulty dealing with vacuum issues in 24/7 environments (pre autovacuum)
* relative scarcity of information or lack of community consensus on
memory settings
* dealing with relatively slow update performance (both mysql and
mssql are historically non-mvcc databases)
* grappling how best to apply the many features to best use (should i
use functions, or views?)
* a lot of other things related to mvcc that surprise users, slow
count(*), etc etc

A lot of those issues have been mitigated or eliminated but the
relative complexity of setting up postgresql for various workloads has
been a fair analysis.  Having a DBA, or at least someone who
understands databases (I don't like overly broad labels), is not
necessarily a bad idea, either.  A competent DBA, by the way, can keep
your programmers under control and can make your team much stronger.

All this being said, PostgreSQL is a rich and robust system that can
redefine your idea of information architecture as you pour over the
manual asking your self again and again (aha! I've always thought this
was good...where have you been all my life!).  Once over the initial
hump where learning new things feels awkward and scary, it will slowly
dawn on you that you have enormous power at your disposal to solve all
kinds of crazy problems quickly and efficiently.

I would tell your management to take a chance on the database, and to
take a chance on you, and if your earnestness and abilities are up to
par (posting on this list is already a very positive sign), then you
won't be disappointed.

merlin

Re: young guy wanting (Postgres DBA) ammo

From
Ted Byers
Date:
--- Greg Smith <gsmith@gregsmith.com> wrote:

> On Fri, 2 Nov 2007, Kevin Hunter wrote:
>
> > I don't have "ammo" to defend (or agree?) with my
> friend when he says
> > that "Postgres requires a DBA and MySQL doesn't so
> that's why they
> > choose the latter."
>
> [snip]
>
> To step back for a second, the software industry as
> a whole is going
> through this phase right now where programmers are
> more empowered than
> ever to run complicated database-driven designs
> without actually having to
> be DBAs.  It used to be that you "needed a DBA" for
> every job like this
> because they were the only people who knew how to
> setup the database
> tables at all, and once they were involved they also
> (if they were any
> good) did higher-level design planning, with
> scalabilty in mind, and
> worried about data integrity issues.
>
> Software frameworks like Ruby on Rails and Hibernate
> have made it simple
> for programmers to churn out code that operates on
> databases without
> having the slightest idea what is going on under the
> hood.  From a
> programmer's perspective, the "better" database is
> the one that requires
> the least work to get running.  This leads to
> projects where a system that
> worked fine "in development" crashes and burns once
> it reaches a
> non-trivial workload, because if you don't design
> databases with an eye
> towards scalability and integrity you don't
> magically get either.
>
As one of these programmers, where is the best place
to find the information I need to get it right.
Finding information, and finding good information is
not the same thing, and I am wary of 99% of what I
find using google.  Since you know what a DBA needs to
know, I ask you where I can learn what you believe a
good DBA needs to know.  Or am I OK just relying on
the documentation that comes with a given RDBMS
(Postgres, MySQL, MS SQL, &c.)?

Ted

Re: young guy wanting (Postgres DBA) ammo

From
Andrew Sullivan
Date:
On Fri, Nov 02, 2007 at 01:26:23AM -0400, Tom Lane wrote:
> out there who don't get past that hurdle and just give up.  It would
> be interesting to troll the mysql lists for evidence of the downside
> of their default ... which'd be along the line of "someone broke into
> my completely insecure database and stole/destroyed all my data" ...

Not "stole", surely?  If they don't know they need a DBA, then they
presumably also don't know that someone's been into the system and
taken stuff.

To respong to the original question, I'd also point out that sites
that thought, "We can't afford a sysadmin," have usually found out
the hard way that they were mistaken.  Even for simple Windows
networks of a few machines, you need someone to look after it.

Refusing to hire a DBA for data you actually care about is like
refusing to take your car to the mechanic at regular service
intervals, because "there's nothing wrong with it."  Supposing you
don't need a DBA for MySQL or MS SQL Server or any other such system
is a dangerous delusion.

A

--
Andrew Sullivan  | ajs@crankycanuck.ca

Re: young guy wanting (Postgres DBA) ammo

From
mgainty@hotmail.com
Date:
Yes..I concur that every business should retain a dedicated DBA with the
caveat that the DBA's expertise states a bit more than
'changed the DBA password'

M--
----- Original Message -----
Wrom: HDMNNSKVFVWRKJVZCMHVIBGDADRZFSQHYUCDDJB
To: <pgsql-general@postgresql.org>
Sent: Saturday, November 03, 2007 11:40 AM
Subject: Re: [GENERAL] young guy wanting (Postgres DBA) ammo


> On Fri, Nov 02, 2007 at 01:26:23AM -0400, Tom Lane wrote:
> > out there who don't get past that hurdle and just give up.  It would
> > be interesting to troll the mysql lists for evidence of the downside
> > of their default ... which'd be along the line of "someone broke into
> > my completely insecure database and stole/destroyed all my data" ...
>
> Not "stole", surely?  If they don't know they need a DBA, then they
> presumably also don't know that someone's been into the system and
> taken stuff.
>
> To respong to the original question, I'd also point out that sites
> that thought, "We can't afford a sysadmin," have usually found out
> the hard way that they were mistaken.  Even for simple Windows
> networks of a few machines, you need someone to look after it.
>
> Refusing to hire a DBA for data you actually care about is like
> refusing to take your car to the mechanic at regular service
> intervals, because "there's nothing wrong with it."  Supposing you
> don't need a DBA for MySQL or MS SQL Server or any other such system
> is a dangerous delusion.
>
> A
>
> --
> Andrew Sullivan  | ajs@crankycanuck.ca
>
> ---------------------------(end of broadcast)---------------------------
> TIP 3: Have you checked our extensive FAQ?
>
>                http://www.postgresql.org/docs/faq
>


Re: young guy wanting (Postgres DBA) ammo

From
Kevin Hunter
Date:
Thank you to all for your thoughts and responses.

Kevin

Re: young guy wanting (Postgres DBA) ammo

From
Greg Smith
Date:
On Sat, 3 Nov 2007, Ted Byers wrote:

> As one of these programmers, where is the best place to find the
> information I need to get it right...I ask you where I can learn what
> you believe a good DBA needs to know.

What a DBA should know in general is a little different from the question
I think you want an answer to, which is "what should a programmer know so
that they can effectively work like/without a DBA?"

There's an academic answer to that question.  I could tell you to learn
something about data normalization, indexing, what happens on the server
when you join two tables, and how cursors work in your language of choice.
But without some practice, I don't know how much of that would stick.

The most valuable exercise I think someone with a good programming
background, but is relatively new to databases, can go through is to work
on a dramatically larger data set than you would normally encounter.  The
main thing I've seen developers do wrong is writing code or designing
tables that don't scale well.  Since coding works better when you can get
quick feedback after changes, it's very easy to settle into working with
only small test cases, and that can turn into a serious problem when such
code runs into the real world.

The only way to really understand how to think more like a DBA is to try
and write something that works well against a big pile of data.  To throw
out some simple guidelines, you want to be working with a database that's
at least 10X as big as the amount of RAM on your system, and if you do
something that scans the full table like "select * from x" that should
take at least a couple of minutes to complete.

Now, try to build a program that operates quickly on subsets of this data.
Working on this scale will let you discover very quickly if you've made
any of the blatant mistakes that programmers unfamiliar with working on
full-size data sets tend to make.  Most importantly, you don't ever want
to write queries that assume it's OK to send all the data to the client to
sort through, rather than pushing as much as possible toward the database
server.  Second, you'll end up needing to construct indexes properly to
optimize the server side.  And if there's more than one big table
involved, you'll be forced to come to grips with how joins work and
probably learn something about normalization.

You'll know you're learning something valuable whenver you run something
that you expect to return almost instantly, but instead it churns away for
minutes before finishing.  Resist the urge to stop it too quickly, and
instead spend that time thinking about what's gone wrong, or monitoring
the client and/or server for clues.

--
* Greg Smith gsmith@gregsmith.com http://www.gregsmith.com Baltimore, MD

Re: young guy wanting (Postgres DBA) ammo

From
Lew
Date:
Scott Marlowe wrote:
> PostgreSQL is ... still not in the
> same realm for really really big transactional sites, but man is it
> geting close fast.

I beg to differ.  All anecdotal evidence, and also Sun benchmarks with
Postgres, show that it is not only "in the same realm" but can actually come
out ahead.
<http://www.spec.org/jAppServer2004/results/res2007q3/jAppServer2004-20070703-00073.html>

<http://blogs.ittoolbox.com/database/soup/archives/postgresql-publishes-first-real-benchmark-17470>

> This publication shows that a properly tuned
> PostgreSQL is not only as fast or faster than MySQL,
> but almost as fast as Oracle (since the hardware platforms
> are different, it's hard to compare directly).
> This is something we've been saying for the last 2 years, and now we can prove it.

There are links to the SPEC site which shows the tuning parameters.

--
Lew

Re: young guy wanting (Postgres DBA) ammo

From
"Scott Marlowe"
Date:
On 11/2/07, Lew <lew@lwsc.ehost-services.com> wrote:
> Scott Marlowe wrote:
> > PostgreSQL is ... still not in the
> > same realm for really really big transactional sites, but man is it
> > geting close fast.
>
> I beg to differ.  All anecdotal evidence, and also Sun benchmarks with
> Postgres, show that it is not only "in the same realm" but can actually come
> out ahead.
> <http://www.spec.org/jAppServer2004/results/res2007q3/jAppServer2004-20070703-00073.html>
>
> <http://blogs.ittoolbox.com/database/soup/archives/postgresql-publishes-first-real-benchmark-17470>

I'll point out that the very article you're pointing me to, which I've
read before, btw, has this in it:

"Why pay more? As I said, almost as fast as Oracle. While the list of
Spec publications on affordable commodity hardware is sparse, there
are some. For example, the 874 JOpS@Standard on Oracle
10+Itanium+HP-UX. That's less than 15% faster than our PostgreSQL
publication. "

On commodity hardware, single image machines, PostgreSQL is a match
for Oracle.  On clustered hardware, Oracle still has a lead.