Re: (another ;-)) PostgreSQL-derived project ... - Mailing list pgsql-general

From Uwe Schroeder
Subject Re: (another ;-)) PostgreSQL-derived project ...
Date
Msg-id 201109251948.58500.uwe@oss4u.com
Whole thread Raw
In response to Re: (another ;-)) PostgreSQL-derived project ...  (Albretch Mueller <lbrtchx@gmail.com>)
Responses Re: (another ;-)) PostgreSQL-derived project ...
List pgsql-general
>  First, I wonder what kind of technical person would say there are
> "de-facto truth(s)". I thought only politicians would talk like that.

Well, politicians and Microsoft, Oracle etc.  :-)

> Now, in a sense you are right, I am talking from the background of my
> own experiences (and so are you). When I have developed relatively
> complicated and heavily accessed websites I only use DBMS when I need
> to actually persist any data. For example there are options in Tomcat
> (the java-based web serverrr) to offload session handling to a DBMS
> (which is great when you need to stat(istically trace and infer users'
> navigation) and establish transactions offloading a timed-out session
> to an actual database hitting thhard drivevee (some user got
> distracted ...) ...) and that sounds great, but anything dealing with
> I/O would invariably slow your apps, so what I do is use in-mem (lite)
> DBMS such as Hypersonic SQL (hsqldb.org) to keep sessions off the I/O
> subsystem and the speed increase is --very-- noticeable
> ~
>  Since any piece of code engaging the I/O such as database access code
> should be as fast and simple as possible; yes, I would design, say,
> java code wrappers doing anything that is not strictly INSERT and
> SELECT raw data ... let me deal with the "semantics" and "business
> intelligence" of the data myself

So you're keeping a lot in memory, which to me suggests plenty of hardware is
available.  One of my current apps chews up 8Gb of memory just for the app and
I can't afford to get a 64Gb or more server. If I wanted to keep permanently
accessed data in memory, I'd need somewhere around 1/2 a terrabyte of memory -
so obviously not an option (or maybe really bad database design :-) )

That said, just considering the cost/effort it takes to strip Postgresql down,
why don't you go with a server that has 1TB of solid state discs? That strips
down the I/O bottleneck considerably without any effort.


> > Data types aren't stored in the database as character strings (unless you
> > define your columns as text, of course).
>
> ~
>  I was talking about text and any formatting option in NUMERIC or DATE data

In my experience "data formatting" goes both ways, in and out. Out is
obviously not a major issue because errors don't cause data corruption. In,
however, is a different issue. Errors in "inwards" conversion will cause data
corruption. So unless you have an application that does very little "in" and a
lot of "out", you still have to code a lot of data conversion which otherwise
someone else (the postgresql developers) have already done for you.


> > Take dates for example: you'd have to code very carefully to catch all
> > the different ways dates are represented on this planet.
>
> ~
>  Yeah! And java does an exceptionally good job at that (including
> internationalization)

Maybe it does. I never coded Java because I don't like to use technology where
Oracle can come sue me :-) I do know however that a lot of languages have
quirks with dates and internationalization (python you mentioned earlier being
one of them)

>  http://download.oracle.com/javase/7/docs/api/java/ {util/Date.html,
> text/DateFormat.html}
> ~
>  So, you would ultimately just have to store a long value into the DBMS

Yes, a long value - which can represent pretty much any valid and invalid date
ever devised, so again you don't really know what's in the database when you
leave the validation to the application.


>  This is something I would do with wrapping code using input and
> output bound command objects which are indexed after the same column
> index the DBMS uses

Which still depends on your use case. Your assumption is that every piece of
code is coded in Java - which is fine if that's what your application calls
for. It's going to be a major hassle when you ever have to re-code in a
different language though.

> > To me, that's moving data integrity into the application.
>
> ~
>  Not exactly! Integrity is still a matter of the DBMS which can now
> handle it in an easier way in someone write a date in bangla and
> somebody else in Ukranian this is still the same date/time value
> ultimately determined by the rotation of our planet around the sun ...
> and all we need for that is a long value. Now, aren't we easying
> things for the DBMS?

I agree to disagree on this one. The date value the database stores in this
case is a long. Any "long" can be converted into a valid date - but is it
really the date that was entered in the first place? If I give a date
representation, i.e. 12/31/2010 to the database, I personally don't really
care how the database stores the date underneath. All that interests me is
that the next time I ask for that field I get 12/31/2010 back. There is no
error that can be made other than user error if you ask the database to store
a specific date representation. There are errors you can make in your own
conversion code which can lead to a different "long" stored than intended. So
again data integrity is at least partially in the application and not the
database.


> > Yes, I have, as have many others.  Simple example: program a website
> > like, say
> > Facebook. So you have thousands of users from all over the world. Your
> > website
> > code handles all the data conversions. Now Apple comes along and sells an
> > iPhone which silly enough a lot of people like and try to use to access
> > your website. You now face the problem that you need a second website
> > doing the same thing as the first website except solely made for
> > touch-screen devices. You will be forced to rewrite a lot of your code
> > because all the data conversion is in the code.
>
> ~
>  Well, the code you will have to write either way, regardless of where
> you keep it and in order to not even have to restart the application
> server cold I would code command objects (like function pointers in C)
> to handle those cases. It is ultimately a strings of characters you
> are dealing with

With the right design, you will have to rewrite the visual layer, not the
application logic. Errors in the visual layer are of little consequence
(except disgruntled users). So yes, if you use some kind of middleware that
does all the converting and validating for you,  the difference is negligible.
But then, why write your own when the database already provides that
functionality?

>  You see this is just an implementation issue I would rather try to do
> validation as close to the client as possible and do it in an
> incremental way if necessary

This one I can agree on. My background is government and financial industry and
neither use a system with client side validation - at least not that I have
seen. Actually I've seen military systems which handle 98% of the
"application" inside the database and just import/export text files without any
GUI whatsoever.
As I've said earlier, it all depends on what you're trying to do, what the
requirements are and how sensitive the data is. In my way of thinking - being
a bit paranoid as it is - I rather have nothing in memory and everything on at
least 10 different computer systems spread out over the planet, just so that
one of the systems survives the next asteroid impact with all my data intact
:-) (i.e. one of my webservers doesn't even have the session id in memory -
everything is in postgresql and replicated to different servers. Given, a big
I/O hog because there's multiple records for every page ... but any cleaning
woman can pull the plug and nothing will happen because the hot standbys
someplace else will simply take over - people don't even have to log in again,
the session is still valid on the standby)







pgsql-general by date:

Previous
From: Eduardo Morras
Date:
Subject: Re: looking for a faster way to do that
Next
From: John R Pierce
Date:
Subject: Re: (another ;-)) PostgreSQL-derived project ...