Thread: Proper relational database?
Anyone familiar with the issue would have to say that the tech world would be a significantly better place if IBM had developeda real relational database with an elegant query language rather than the awful camel of a thing that is SQL. If I had a few $million to spend in a philanthropical manner, I would hire some of the best PG devs to develop a proper relationaldatabase server. Probably a query language that expressed the relational algebra in a scheme-like syntax, and thestorage model would be properly relational (eg no duplicate rows). It's an enormous tragedy that all the development effort that has gone into NoSQL database has pretty much all gotten itwrong: by all means throw out SQL, but not the relational model with it. They're all just rehashing the debate over hierarchicalstorage from the 70s. Comp Sci courses should feature a history class. It's a bit odd to me that someone isn't working on such a thing. Just curious what folks here have to say…
On Thu, Apr 21, 2016 at 1:36 PM, Guyren Howe <guyren@gmail.com> wrote: > If I had a few $million to spend in a philanthropical manner, I would hire some of the best PG devs to develop a properrelational database server. Probably a query language that expressed the relational algebra in a scheme-like syntax,and the storage model would be properly relational (eg no duplicate rows). Have you heard of QUEL? See https://en.wikipedia.org/wiki/QUEL_query_languages -- Peter Geoghegan
Anyone familiar with the issue would have to say that the tech world would be a significantly better place if IBM had developed a real relational database with an elegant query language rather than the awful camel of a thing that is SQL.
If I had a few $million to spend in a philanthropical manner, I would hire some of the best PG devs to develop a proper relational database server. Probably a query language that expressed the relational algebra in a scheme-like syntax, and the storage model would be properly relational (eg no duplicate rows).
It's an enormous tragedy that all the development effort that has gone into NoSQL database has pretty much all gotten it wrong: by all means throw out SQL, but not the relational model with it. They're all just rehashing the debate over hierarchical storage from the 70s. Comp Sci courses should feature a history class.
It's a bit odd to me that someone isn't working on such a thing.
Just curious what folks here have to say…
Anyone familiar with the issue would have to say that the tech world would be a significantly better place if IBM had developed a real relational database with an elegant query language rather than the awful camel of a thing that is SQL.
If I had a few $million to spend in a philanthropical manner, I would hire some of the best PG devs to develop a proper relational database server. Probably a query language that expressed the relational algebra in a scheme-like syntax, and the storage model would be properly relational (eg no duplicate rows).
It's an enormous tragedy that all the development effort that has gone into NoSQL database has pretty much all gotten it wrong: by all means throw out SQL, but not the relational model with it. They're all just rehashing the debate over hierarchical storage from the 70s. Comp Sci courses should feature a history class.
It's a bit odd to me that someone isn't working on such a thing.
Just curious what folks here have to say…
--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general
Maranatha! <><
John McKown
Well, this hits very close to my feelings in several respects. I don't often bring this up, because I don't generally feel like "I loathe SQL" is quite the thing to say in a community called "PostgreSQL". :-) Or, "I really love this project... can we change its direction entirely?" But yeah, that's in my heart. And I'm quite in agreement on NoSQL. I don't think a lot of people realize there *is* a difference between "relational" and SQL, so the baby goes out with the bathwater. As it just so happens, I actually hacked a Scheme interpreter into the PostgreSQL parser, and got it to interpret incoming messages, with virtually the same idea. I started messing with generating SQL from S-expressions, with the idea to perhaps call query-building functions later on. I started off with Chibi Scheme, and moved to Chicken later on, mostly because the folks in the Chicken IRC channel are so helpful. I was wondering whether PostgreSQL might consider supporting alternative query languages, similar to the way it supports multiple languages for stored procedures. Ideally, it'd be nice to get the query results as S-expressions also. On Thu, Apr 21, 2016 at 4:36 PM, Guyren Howe <guyren@gmail.com> wrote: > Anyone familiar with the issue would have to say that the tech world would be a significantly better place if IBM had developeda real relational database with an elegant query language rather than the awful camel of a thing that is SQL. > > If I had a few $million to spend in a philanthropical manner, I would hire some of the best PG devs to develop a properrelational database server. Probably a query language that expressed the relational algebra in a scheme-like syntax,and the storage model would be properly relational (eg no duplicate rows). > > It's an enormous tragedy that all the development effort that has gone into NoSQL database has pretty much all gotten itwrong: by all means throw out SQL, but not the relational model with it. They're all just rehashing the debate over hierarchicalstorage from the 70s. Comp Sci courses should feature a history class. > > It's a bit odd to me that someone isn't working on such a thing. > > Just curious what folks here have to say… > > -- > Sent via pgsql-general mailing list (pgsql-general@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-general -- Ray Brinzer
Just as a curiosity, what do you think of ANDL?The developer has been posting some questions here about interfacing it to PostgreSQL. But he doesn't just want to do a "translate the ANDL language to SQL language"
On Thursday 21 April 2016 13:36:54 Guyren Howe wrote: > Anyone familiar with the issue would have to say that the tech world would > be a significantly better place if IBM had developed a real relational > database with an elegant query language rather than the awful camel of a > thing that is SQL. > > If I had a few $million to spend in a philanthropical manner, I would hire > some of the best PG devs to develop a proper relational database server. > Probably a query language that expressed the relational algebra in a > scheme-like syntax, and the storage model would be properly relational (eg > no duplicate rows). > > It's an enormous tragedy that all the development effort that has gone into > NoSQL database has pretty much all gotten it wrong: by all means throw out > SQL, but not the relational model with it. They're all just rehashing the > debate over hierarchical storage from the 70s. Comp Sci courses should > feature a history class. > > It's a bit odd to me that someone isn't working on such a thing. > > Just curious what folks here have to say… Well when IBM were first developing relational databases there were two different teams. One in California which produced System-R which became what we now know as DB2 and spawned SQL, and the other in Peterlee in the UK which was called PRTV (the Peterlee Relational Test Vehicle). PRTV rather died but bits of it survived. In particular it was the first to system to include a relational optimiser. You can find some details on the PRTV page in Wikipedia. It was written in PL/1, although it also used some modified microcode and therefore some assembler. It never appeared as a product, but there was a geographical system which built on top of it which was if I recall corrected used by the Greater London Council and Central Region Scotland, which did something of what postgis does for PostgreSQL. According to the Wikipedia page it did have a language (ISBL) but from what I recall (and it was nearly 40 years ago) there were a series of PL/1 function calls we used rather than encoding the request as a string as SQL systems require. The IBM centre in Peterlee was closed, and the lab moved to Winchester where I think it still resides. David
On 22 April 2016 at 07:05, Guyren Howe <guyren@gmail.com> wrote: > As I say, it amazes and somewhat depresses me that someone isn't doing this. > The NoSQL movement shows that the world is ready for change. Someone should > be offering folks something better than bloody MongoDB. > > Please don't get me wrong. I *adore* Postgres. It is for most projects > hands-down the best data store available. It's just tragic that this amazing > project is so wedded to the awfulness that is SQL. Can I make a counter-argument? SQL is excellent for beginners and adequate for most users. The basic syntax of SQL (enough that people can produce useful queries with it) can be presented and understood by a novice in an afternoon. I would balk at the idea of trying to present the sort of syntax that appears on the ANDL website to people who aren't programmers, which (I'd be tempted to suggest) is a significant proportion of the userbase of SQL. The fact that ORMs are horrible and involve far too much work to maintain isn't the fault of SQL, it's the fault of the people who believe that they have to have their data fed to them by an ORM, because of this idea (that has sadly propagated widely) that the separation of code from the data is somehow helpful (as is probably obvious I'm yet to be convinced of that!). The world is not "ready for a change". I think of NoSQL as being like Kim Kardashian: it gets an awful lot of publicity without providing much justification for it; it brings a lot of column inches without giving anything of substance in return and a lot of people talk about it an awful lot without really knowing much about it at all. There's a (very) small set of users for whom NoSQL makes a lot of sense. Most of those are large corporations with huge budgets for development, or academics who can afford to spend many hours tweaking and figuring out their optimum storage requirements. The average SQL user, on the other hand, just wants something that brings consistent data storage for their database that probably numbers in the tens of thousands of records at most (and if it scales to tens of millions then great). Geoff
A brief review of it says it would be better than SQL, but then almost anything would be. But the syntax looks a bit… baroque. Quell similarly.If I had the time and money to put together a team to do this, I would start with the lower-level guts of either Postgres or SQLite (or, heck, MySQL) so you had a thing that did BTrees and other data structures on disk and indexes, and provide access to that from a high level, portable and efficient language. Perhaps Scheme.Then you could write a high-level relational logic engine on top of that, in the high level language, perhaps with the odd bit of C or D or Go for anything really critical.I don't know if Postgres exposes the lower-level stuff to plugins or not — it would be nice if this could be an alternative query language for Postgres itself, but the assumptions about the two worlds (SQL vs a properly relational store) are probably too different.As I say, it amazes and somewhat depresses me that someone isn't doing this. The NoSQL movement shows that the world is ready for change. Someone should be offering folks something better than bloody MongoDB.And the project could adopt the spirit of the good parts of the NoSQL movement. I should be able to have a lightweight, distributed schema-on-demand, eventually consistent etc etc *relational* data store.Please don't get me wrong. I *adore* Postgres. It is for most projects hands-down the best data store available. It's just tragic that this amazing project is so wedded to the awfulness that is SQL.I wrote about such issues at a bit more length at http://relevantlogic.com/2015/11/04/no-sql-is-fixing-the-wrong-problem.html
Maranatha! <><
John McKown
On Fri, Apr 22, 2016 at 2:05 AM, Guyren Howe <guyren@gmail.com> wrote: > If I had the time and money to put together a team to do this, I would start > with the lower-level guts of either Postgres or SQLite (or, heck, MySQL) so > you had a thing that did BTrees and other data structures on disk and > indexes, and provide access to that from a high level, portable and > efficient language. Perhaps Scheme. > > Then you could write a high-level relational logic engine on top of that, in > the high level language, perhaps with the odd bit of C or D or Go for > anything really critical. Why is starting at a low level important? A database is truly relational to the extent that it implements the relational model. If you don't want the database to allow tables without keys, or to allow null values, don't let people create them. If the underlying machinery allows them, that seems like a mere performance issue; worrying about that from the outset seems like a perfect example of premature optimization. If PostgreSQL's performance is acceptable now, why wouldn't it be acceptable with a different interface language? There are other aspects of what would make a truly relational database, of course. Codd's 0th rule, for instance, that the "system must be able to manage data bases entirely through its relational capabilities" to me says that there should be no data definition language, except as syntactic sugar for relational operations. So you'd create users (thousands in one command, if you liked) by adding tuples to a base relation. But which things are important? I think a good many of the things one might would be lower-hanging fruit than that. Just having a clean query language would alleviate a lot of (my) discomfort. > I don't know if Postgres exposes the lower-level stuff to plugins or not — > it would be nice if this could be an alternative query language for Postgres > itself, Well, the parser doesn't, but as best I can tell it's also somewhat loosely coupled from the system. It doesn't do table access, for instance. It builds and returns a parse tree. There's no reason you couldn't parse a different language and return a tree of the same type. Or you could just translate your input language into SQL, and pass it along to the existing parser. > but the assumptions about the two worlds (SQL vs a properly > relational store) are probably too different. Are there relational algebra expressions, or other operations necessary to a truly relational database, which cannot be translated into SQL? I'm not aware that there are, but I'd be interested to hear of it. If there were, there's a good chance you wouldn't be able to translate them into the parse tree, either. -- Ray Brinzer
On 04/22/2016 06:21 AM, David Goodenough wrote: > On Thursday 21 April 2016 13:36:54 Guyren Howe wrote: >> Anyone familiar with the issue would have to say that the tech world would >> be a significantly better place if IBM had developed a real relational >> database with an elegant query language rather than the awful camel of a >> thing that is SQL. >> >> If I had a few $million to spend in a philanthropical manner, I would hire >> some of the best PG devs to develop a proper relational database server. >> Probably a query language that expressed the relational algebra in a >> scheme-like syntax, and the storage model would be properly relational (eg >> no duplicate rows). >> >> It's an enormous tragedy that all the development effort that has gone into >> NoSQL database has pretty much all gotten it wrong: by all means throw out >> SQL, but not the relational model with it. They're all just rehashing the >> debate over hierarchical storage from the 70s. Comp Sci courses should >> feature a history class. >> >> It's a bit odd to me that someone isn't working on such a thing. >> >> Just curious what folks here have to say… > Well when IBM were first developing relational databases there were two > different teams. One in California which produced System-R which became > what we now know as DB2 and spawned SQL, and the other in Peterlee in > the UK which was called PRTV (the Peterlee Relational Test Vehicle). PRTV > rather died but bits of it survived. In particular it was the first to system > to include a relational optimiser. You can find some details on the PRTV > page in Wikipedia. > > It was written in PL/1, although it also used some modified microcode > and therefore some assembler. > > It never appeared as a product, but there was a geographical system > which built on top of it which was if I recall corrected used by the Greater > London Council and Central Region Scotland, which did something of > what postgis does for PostgreSQL. > > According to the Wikipedia page it did have a language (ISBL) but from what > I recall (and it was nearly 40 years ago) there were a series of PL/1 > function calls we used rather than encoding the request as a string > as SQL systems require. > > The IBM centre in Peterlee was closed, and the lab moved to Winchester > where I think it still resides. One of the people involved in that was Hugh Darwen, who is one of the authors of The Third Manifesto, which is an attempt to define what a properly relational language and system should look like. So you could say the experience of ISBL vs SQL has been folded into that effort.
On Apr 22, 2016, at 10:45 , Raymond Brinzer <ray.brinzer@gmail.com> wrote: > > Are there relational algebra expressions, or other operations > necessary to a truly relational database, which cannot be translated > into SQL? I'm not aware that there are, but I'd be interested to hear > of it. If there were, there's a good chance you wouldn't be able to > translate them into the parse tree, either. The fundamental storage model needs to at least be a bit different. In particular, relations can't allow duplicates. Youcould have nulls (Codd proposed two different forms of null IIRC: a single null value and two different null values),although they should be more principled than the mess they are in SQL. I am no expert on database optimization, but I understand that it is significantly easier to do query optimization in a properlyrelational database, as it forms a reasonably simple algebra, which can be optimized much as you would optimize evaluationof a numeric expression. Major gains from a proper relational store would be: - a better language, easier to parse, read and generate. Perhaps multiple equivalent query languages; - other storage models (distributed and eventually consistent, say); - simpler (in implementation and use); We may also get some degree of faster and other good things. It also might be implemented in such a way that it can run asa server or more like SQLite.
So, let's just flat-out ask. Dear Important People: would the PostgreSQL project consider supporting other query languages? Or creating a plug-in mechanism for them, so that alternative interface languages could be added without changing the base code? On Thu, Apr 21, 2016 at 4:36 PM, Guyren Howe <guyren@gmail.com> wrote: > Anyone familiar with the issue would have to say that the tech world would be a significantly better place if IBM had developeda real relational database with an elegant query language rather than the awful camel of a thing that is SQL. > > If I had a few $million to spend in a philanthropical manner, I would hire some of the best PG devs to develop a properrelational database server. Probably a query language that expressed the relational algebra in a scheme-like syntax,and the storage model would be properly relational (eg no duplicate rows). > > It's an enormous tragedy that all the development effort that has gone into NoSQL database has pretty much all gotten itwrong: by all means throw out SQL, but not the relational model with it. They're all just rehashing the debate over hierarchicalstorage from the 70s. Comp Sci courses should feature a history class. > > It's a bit odd to me that someone isn't working on such a thing. > > Just curious what folks here have to say… > > -- > Sent via pgsql-general mailing list (pgsql-general@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-general -- Ray Brinzer
On 04/21/2016 01:36 PM, Guyren Howe wrote: > Anyone familiar with the issue would have to say that the tech world > would be a significantly better place if IBM had developed a real > relational database with an elegant query language I'm surprised no one yet has mentioned Tutorial D by C. J. Date. His book _Database in Depth_ is pretty much an extended argument for how superior it is to SQL. RelDB is apparently an open source implementation of it, and D4 is a commercial one. That's about all I know in terms of practically using it for something. But Date & Tutorial D seems like a good place to start if SQL isn't relational enough for you. The book I mentioned is short and easy to read. Paul
So, let's just flat-out ask.
Dear Important People: would the PostgreSQL project consider
supporting other query languages? Or creating a plug-in mechanism for
them, so that alternative interface languages could be added without
changing the base code?
On Apr 22, 2016, at 12:40 , David G. Johnston <david.g.johnston@gmail.com> wrote:On Fri, Apr 22, 2016 at 12:25 PM, Raymond Brinzer <ray.brinzer@gmail.com> wrote:So, let's just flat-out ask.
Dear Important People: would the PostgreSQL project consider
supporting other query languages? Or creating a plug-in mechanism for
them, so that alternative interface languages could be added without
changing the base code?If by important you mean possessing a commit-bit then I don't count...but for me, such a project would have to gain significant adoption as a fork of the PostgreSQL code base before it would ever be considered for take-over by the mainline project.David J.
On 04/22/2016 12:25 PM, Raymond Brinzer wrote: > So, let's just flat-out ask. > > Dear Important People: would the PostgreSQL project consider > supporting other query languages? Or creating a plug-in mechanism for > them, so that alternative interface languages could be added without > changing the base code? Probably not considering our mission. JD -- Command Prompt, Inc. http://the.postgres.company/ +1-503-667-4564 PostgreSQL Centered full stack support, consulting and development. Everyone appreciates your honesty, until you are honest with them.
The SQL language is terrible but we can live with it.
But the answer to "Are there any relational data stores that offer eventual consistency, easy distribution, schema-on-demand or any such things a large modern application can use?" appears to be no. And that's just awful.
On Fri, Apr 22, 2016 at 3:07 PM, Paul Jungwirth <pj@illuminatedcomputing.com> wrote: > I'm surprised no one yet has mentioned Tutorial D by C. J. Date. His book > _Database in Depth_ is pretty much an extended argument for how superior it > is to SQL. RelDB is apparently an open source implementation of it, and D4 > is a commercial one. That's about all I know in terms of practically using > it for something. But Date & Tutorial D seems like a good place to start if > SQL isn't relational enough for you. The book I mentioned is short and easy > to read. This is a relevant project: https://github.com/agentm/project-m36
Andl is a "proper relational database" language. Andl is an original implementation of the language D described in The Third Manifesto. http://www.dcs.warwick.ac.uk/~hugh/TTM/. I have a working implementation of Andl on Postgres. See http://www.andl.org/2016/04/postgres-meet-andl/. This version isnot ready for release, but there is a download that supports Sqlite, Thrift and REST. The syntax of Andl is quite different from Tutorial D, or SQL. It does what SQL does, including any relational query youcan think of, and a few you haven't! Check out the web site for some code samples. Regards David M Bennett FACS Andl - A New Database Language - andl.org > -----Original Message----- > From: pgsql-general-owner@postgresql.org [mailto:pgsql-general- > owner@postgresql.org] On Behalf Of Paul Jungwirth > Sent: Saturday, 23 April 2016 5:38 AM > To: pgsql-general@postgresql.org > Subject: Re: [GENERAL] Proper relational database? > > On 04/21/2016 01:36 PM, Guyren Howe wrote: > > Anyone familiar with the issue would have to say that the tech world > > would be a significantly better place if IBM had developed a real > > relational database with an elegant query language > > I'm surprised no one yet has mentioned Tutorial D by C. J. Date. His book > _Database in Depth_ is pretty much an extended argument for how superior it > is to SQL. RelDB is apparently an open source implementation of it, and D4 is > a commercial one. That's about all I know in terms of practically using it > for something. But Date & Tutorial D seems like a good place to start if SQL > isn't relational enough for you. The book I mentioned is short and easy to > read. > > Paul > > > > > > > > -- > Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make > changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-general
Eventual consistency is not part of the language, so outside the scope of Andl.
Easy distribution depends on a standardised language. SQL is a definite fail. There is only one Andl and it works identically on all platforms. That should help.
Why schema-on-demand? Can you explain what you mean by that?
From: pgsql-general-owner@postgresql.org [mailto:pgsql-general-owner@postgresql.org] On Behalf Of Guyren Howe
Sent: Saturday, 23 April 2016 5:54 AM
To: PostgreSQL General <pgsql-general@postgresql.org>
Subject: Re: [GENERAL] Proper relational database?
The SQL language is terrible but we can live with it.
But the answer to "Are there any relational data stores that offer eventual consistency, easy distribution, schema-on-demand or any such things a large modern application can use?" appears to be no. And that's just awful.
On Apr 22, 2016, at 12:40 , David G. Johnston <david.g.johnston@gmail.com> wrote:
On Fri, Apr 22, 2016 at 12:25 PM, Raymond Brinzer <ray.brinzer@gmail.com> wrote:
So, let's just flat-out ask.
Dear Important People: would the PostgreSQL project consider
supporting other query languages? Or creating a plug-in mechanism for
them, so that alternative interface languages could be added without
changing the base code?
If by important you mean possessing a commit-bit then I don't count...but for me, such a project would have to gain significant adoption as a fork of the PostgreSQL code base before it would ever be considered for take-over by the mainline project.
David J.
> Why is starting at a low level important? A database is truly relational to > the extent that it implements the relational model. If you don't want the > database to allow tables without keys, or to allow null values, don't let > people create them. If the underlying machinery allows them, that seems like > a mere performance issue; worrying about that from the outset seems like a > perfect example of premature optimization. If PostgreSQL's performance is > acceptable now, why wouldn't it be acceptable with a different interface > language? Agreed. > There are other aspects of what would make a truly relational database, of > course. Codd's 0th rule, for instance, that the "system must be able to > manage data bases entirely through its relational capabilities" to me says > that there should be no data definition language, except as syntactic sugar > for relational operations. So you'd create users (thousands in one command, > if you liked) by adding tuples to a base relation. Yes, maybe, but that makes it not part of the language. You can't apply rule 0 to things like creating a type or operatorin a language. > But which things are important? I think a good many of the things one might > would be lower-hanging fruit than that. Just having a clean query language > would alleviate a lot of (my) discomfort. Andl is that. > > I don't know if Postgres exposes the lower-level stuff to plugins or > > not — it would be nice if this could be an alternative query language > > for Postgres itself, > > Well, the parser doesn't, but as best I can tell it's also somewhat loosely > coupled from the system. It doesn't do table access, for instance. It builds > and returns a parse tree. There's no reason you couldn't parse a different > language and return a tree of the same type. Or you could just translate > your input language into SQL, and pass it along to the existing parser. I looked into that, and it's too hard as a starting place. There is too much of the tree and the subsequent query planningthat is hooked into specific features of SQL. Instead, Andl generates a small subset of SQL. Once each query hasbeen generated and parsed, the prepared statements can be cached and you get most of the benefits. > > but the assumptions about the two worlds (SQL vs a properly relational > > store) are probably too different. > Are there relational algebra expressions, or other operations necessary to a > truly relational database, which cannot be translated into SQL? I'm not > aware that there are, but I'd be interested to hear of it. If there were, > there's a good chance you wouldn't be able to translate them into the parse > tree, either. Absolutely not. SQL is a (nearly) full implementation of the relational algebra, plus other non-relational stuff. The onlything it really can't handle is a table with no columns! (I have to fake that) Regards David M Bennett FACS Andl - A New Database Language - andl.org
Absolutely not. SQL is a (nearly) full implementation of the relational algebra, plus other non-relational stuff. The only thing it really can't handle is a table with no columns! (I have to fake that)
a table with no columns would have no primary key... doesn't that violate one of the fundamental tenets of the relational model ?
-- john r pierce, recycling bits in santa cruz
On Fri, Apr 22, 2016 at 10:45 PM, John R Pierce <pierce@hogranch.com> wrote: > a table with no columns would have no primary key... doesn't that violate > one of the fundamental tenets of the relational model ? Not as I understand it. A relation must have at least one candidate key. That will be the set of all the fields, if no proper subset qualifies. Calling one key "primary" is merely convention, so far as I am aware (talking relational theory, here, not how databases regard primary keys). In a table with no columns, the only candidate key is the set of all fields, which is the empty set. If you want to call that the primary key, it shouldn't be a problem. The tuples (all 0 of them) are guaranteed to be unique. -- Ray Brinzer
> On Fri, Apr 22, 2016 at 10:45 PM, John R Pierce <pierce@hogranch.com> wrote: > > a table with no columns would have no primary key... doesn't that > > violate one of the fundamental tenets of the relational model ? > > Not as I understand it. A relation must have at least one candidate key. > That will be the set of all the fields, if no proper subset qualifies. > Calling one key "primary" is merely convention, so far as I am aware (talking > relational theory, here, not how databases regard primary keys). This is a 'soft' requirement. If there is no other key, then the set of all attributes is the key. > In a table with no columns, the only candidate key is the set of all fields, > which is the empty set. If you want to call that the primary key, it > shouldn't be a problem. The tuples (all 0 of them) are guaranteed to be > unique. The relation with no attributes may have a tuple, which itself has no attributes. Thus there are two such relations, oneempty and one of degree one. They can be referred to as DUM and DEE, or as false and true. See http://c2.com/cgi/wiki?TableDumfor example. Many experienced users of SQL are aware of situations where they are useful. In Andl they are automatically available asliterals. Regards David M Bennett FACS Andl - A New Database Language - andl.org
On Sat, Apr 23, 2016 at 2:04 PM, <david@andl.org> wrote: >> Why is starting at a low level important? A database is truly relational to >> the extent that it implements the relational model. If you don't want the >> database to allow tables without keys, or to allow null values, don't let >> people create them. If the underlying machinery allows them, that seems like >> a mere performance issue; worrying about that from the outset seems like a >> perfect example of premature optimization. If PostgreSQL's performance is >> acceptable now, why wouldn't it be acceptable with a different interface >> language? > > Agreed. > >> There are other aspects of what would make a truly relational database, of >> course. Codd's 0th rule, for instance, that the "system must be able to >> manage data bases entirely through its relational capabilities" to me says >> that there should be no data definition language, except as syntactic sugar >> for relational operations. So you'd create users (thousands in one command, >> if you liked) by adding tuples to a base relation. > > Yes, maybe, but that makes it not part of the language. You can't apply rule 0 to things like creating a type or operatorin a language. > >> But which things are important? I think a good many of the things one might >> would be lower-hanging fruit than that. Just having a clean query language >> would alleviate a lot of (my) discomfort. > > Andl is that. > >> > I don't know if Postgres exposes the lower-level stuff to plugins or >> > not — it would be nice if this could be an alternative query language >> > for Postgres itself, >> >> Well, the parser doesn't, but as best I can tell it's also somewhat loosely >> coupled from the system. It doesn't do table access, for instance. It builds >> and returns a parse tree. There's no reason you couldn't parse a different >> language and return a tree of the same type. Or you could just translate >> your input language into SQL, and pass it along to the existing parser. > > I looked into that, and it's too hard as a starting place. There is too much of the tree and the subsequent query planningthat is hooked into specific features of SQL. Instead, Andl generates a small subset of SQL. Once each query hasbeen generated and parsed, the prepared statements can be cached and you get most of the benefits. > >> > but the assumptions about the two worlds (SQL vs a properly relational >> > store) are probably too different. > >> Are there relational algebra expressions, or other operations necessary to a >> truly relational database, which cannot be translated into SQL? I'm not >> aware that there are, but I'd be interested to hear of it. If there were, >> there's a good chance you wouldn't be able to translate them into the parse >> tree, either. > > Absolutely not. SQL is a (nearly) full implementation of the relational algebra, plus other non-relational stuff. Theonly thing it really can't handle is a table with no columns! (I have to fake that) FWIW standard SQL may not allow it but Postgres does, and it's even possible to exclude duplicates by using an expression that references the whole row. postgres=# select; ┌──┐ ├──┤ └──┘ (1 row) postgres=# create table dum (); CREATE TABLE postgres=# select * from dum; ┌──┐ ├──┤ └──┘ (0 rows) postgres=# create unique index dum_unique on dum((dum)); CREATE INDEX postgres=# insert into dum select; INSERT 0 1 postgres=# select * from dum; ┌──┐ ├──┤ └──┘ (1 row) postgres=# insert into dum select; ERROR: duplicate key value violates unique constraint "dum_unique" DETAIL: Key ((dum.*))=(()) already exists. -- Thomas Munro http://www.enterprisedb.com
> So, let's just flat-out ask. > > Dear Important People: would the PostgreSQL project consider supporting > other query languages? Or creating a plug-in mechanism for them, so that > alternative interface languages could be added without changing the base > code? I very much doubt it. The use case has to be established first. As it happens, the existing PL extension capability provides enough to get an alternative query language (such as Andl) towork. That's why I chose Postgres. Making it a 'native' would be not so hard if there is real demand. Regards David M Bennett FACS Andl - A New Database Language - andl.org
On Sat, Apr 23, 2016 at 12:05 AM, Thomas Munro <thomas.munro@enterprisedb.com> wrote: > FWIW standard SQL may not allow it but Postgres does, and it's even > possible to exclude duplicates by using an expression that references > the whole row. Indeed, but its semantics can be rather wonky. Witness: postgres=# select 1; ?column? ---------- 1 (1 row) postgres=# select 1 union select 1; ?column? ---------- 1 (1 row) postgres=# select; -- (1 row) postgres=# select union select; -- (2 rows) postgres=# select 1 intersect select 1; ?column? ---------- 1 (1 row) postgres=# select intersect select; -- (2 rows)
Why schema-on-demand? Can you explain what you mean by that?
On Sat, Apr 23, 2016 at 1:53 AM, Guyren Howe <guyren@gmail.com> wrote: > On Apr 22, 2016, at 18:56 , <david@andl.org> <david@andl.org> wrote: > Why schema-on-demand? Can you explain what you mean by that? > > Something that is attractive, for beginners or perhaps when prototyping is > that you don't have to declare a table. You can just insert tuples into a > predicate whose name you provide and they go in and you've defined a > relation just by using it. test=# \d No relations found. test=# select * into people from (values (1,'Fred'), (2, 'Bob')) x(id, name); SELECT 2 test=# select * from people; id | name ----+------ 1 | Fred 2 | Bob (2 rows) test=# \d List of relations Schema | Name | Type | Owner --------+--------+-------+--------- public | people | table | kgrittn (1 row) -- Kevin Grittner EDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
> owner@postgresql.org] On Behalf Of Guyren Howe > Sent: Saturday, 23 April 2016 4:04 AM > To: Raymond Brinzer <ray.brinzer@gmail.com> > Subject: Re: [GENERAL] Proper relational database? > > On Apr 22, 2016, at 10:45 , Raymond Brinzer <ray.brinzer@gmail.com> wrote: > The fundamental storage model needs to at least be a bit different. In > particular, relations can't allow duplicates. You could have nulls (Codd > proposed two different forms of null IIRC: a single null value and two > different null values), although they should be more principled than the mess > they are in SQL. Andl has no nulls. I have read Codd's later work, but IMHO the consequences of multi-valued logic do not justify that conclusion. The standard storage engines used for SQL can handle tables with no nulls perfectly well. > I am no expert on database optimization, but I understand that it is > significantly easier to do query optimization in a properly relational > database, as it forms a reasonably simple algebra, which can be optimized > much as you would optimize evaluation of a numeric expression. I would venture a guess that advanced query planners already take into account whether columns have nulls or not. Whatever can be done for a 'pure' RA can already be done as a special case for SQL, and probably has been. > Major gains from a proper relational store would be: > > - a better language, easier to parse, read and generate. Perhaps multiple > equivalent query languages; Check. Andl is that, and I know several others. > - other storage models (distributed and eventually consistent, say); > - simpler (in implementation and use); Not sure whether this is a reasonable consequence. > > We may also get some degree of faster and other good things. It also might be > implemented in such a way that it can run as a server or more like SQLite. Andl does that. It provides 3 native servers: Thrift, Web API and REST. Regards David M Bennett FACS Andl - A New Database Language - andl.org
> This is a relevant project: https://github.com/agentm/project-m36 Thanks -- I didn't know about that one. I'll add it to my list. It's quite unlike other implementations. I have some reading to do. Regards David M Bennett FACS Andl - A New Database Language - andl.org
> From: Thomas Munro [mailto:thomas.munro@enterprisedb.com] > FWIW standard SQL may not allow it but Postgres does, and it's even possible > to exclude duplicates by using an expression that references the whole row. Thank you. I didn't know that. I'll use it if I can verify it works right. It's not that important -- Andl can emulate it quite easily. Regards David M Bennett FACS Andl - A New Database Language - andl.org
On 23 April 2016 at 07:08, Manuel Gómez <targen@gmail.com> wrote: > but its semantics can be rather wonky. Witness: > > postgres=# select 1; > ?column? > ---------- > 1 > (1 row) > > postgres=# select 1 union select 1; > ?column? > ---------- > 1 > (1 row) Exactly what you would expect. Use UNION ALL to get two rows. > postgres=# select; > -- > (1 row) > > postgres=# select union select; > -- > (2 rows) SELECT with no values is selecting a NULL, and two NULLs do not equate, so you would expect two rows. > postgres=# select 1 intersect select 1; > ?column? > ---------- > 1 > (1 row) > > postgres=# select intersect select; > -- > (2 rows) See above. Geoff
On 04/23/2016 08:09 AM, Geoff Winkless wrote: > On 23 April 2016 at 07:08, Manuel Gómez <targen@gmail.com> wrote: >> but its semantics can be rather wonky. Witness: >> >> postgres=# select 1; >> ?column? >> ---------- >> 1 >> (1 row) >> >> postgres=# select 1 union select 1; >> ?column? >> ---------- >> 1 >> (1 row) > > > Exactly what you would expect. Use UNION ALL to get two rows. Exactly, a simple reading the docs will explain this: http://www.postgresql.org/docs/9.5/static/queries-union.html JD -- Command Prompt, Inc. http://the.postgres.company/ +1-503-667-4564 PostgreSQL Centered full stack support, consulting and development. Everyone appreciates your honesty, until you are honest with them.
On Sat, Apr 23, 2016 at 10:39 AM, Geoff Winkless <pgsqladmin@geoff.dj> wrote: > SELECT with no values is selecting a NULL, and two NULLs do not > equate, so you would expect two rows. This is precisely what I mean by wonky semantics: it makes no sense for a nullary relation to be interpreted as selecting a NULL. A row with no columns should have different semantics from a row with one NULL column. Indeed it does in other contexts; note its correct behavior as the neutral for the cross join operator: postgres=# select * from (values (42)) as t(n) cross join (select) as dee; n ---- 42 (1 row) postgres=# select * from (values (42)) as t(n) cross join (select NULL) as not_dee; n | ?column? ----+---------- 42 | (1 row) The empty tuple should equal itself and its semantics should have nothing at all to do with NULL, and thence I expect one row. I've even been tempted to file a bug report, but I fear it may be brushed off as pedantry.
On Thu, Apr 21, 2016 at 01:36:54PM -0700, Guyren Howe wrote: > It's an enormous tragedy that all the development effort that has > gone into NoSQL database has pretty much all gotten it wrong: by all > means throw out SQL, but not the relational model with it. They're > all just rehashing the debate over hierarchical storage from the > 70s. Comp Sci courses should feature a history class. This turns out to be true in many areas of language design, mutli-user system security, virtually everything to do with networking, and application deployment. I was at an IETF meeting some years ago where someone talking about "Internet of Things" stuff was going on at length about how nobody around the IETF really understood constrained systems. Standing behind him at the mic was an assortment of grey-bearded men who'd worked directly on the original IMPs (which were 16-bit Honeywells that ran at like 5MHz and had IIRC 16Kwords of memory). It's also true that crappy interfaces that are good enough stick around anyway. The UNIX Haters' Handbook is full of evidence of how much less good UNIX was, but even Apple gave in. Also, many of the historical compromises turn out, once you start to try to make different ones, to be more obviously defensible. Most of the NoSQL trend was not a hatred of SQL the language but a carelessness about the relational syntax or a view that promises about consistency are dumb. Then the first credit card number gets lost in an eventually-consistent system, and people suddenly understand viscerally why transactions semantics are so hard. Best regards, A -- Andrew Sullivan ajs@crankycanuck.ca
Absolutely not. SQL is a (nearly) full implementation of the relational algebra, plus other non-relational stuff. The only thing it really can't handle is a table with no columns! (I have to fake that)
a table with no columns would have no primary key... doesn't that violate one of the fundamental tenets of the relational model ?
john r pierce, recycling bits in santa cruz
The relational model requires a key, but the key can be empty (no attributes). Such a relation can itself be empty, or it can have a single tuple as its body. The maths requires it, and it works just fine.
Regards
David M Bennett FACS
Andl - A New Database Language - andl.org
From: pgsql-general-owner@postgresql.org [mailto:pgsql-general-owner@postgresql.org] On Behalf Of Guyren Howe
Why schema-on-demand? Can you explain what you mean by that?
Something that is attractive, for beginners or perhaps when prototyping is that you don't have to declare a table. You can just insert tuples into a predicate whose name you provide and they go in and you've defined a relation just by using it.
The problem is which side of the fence you’re writing code. If you write code on the app side, it’s a hassle and every table is a cost. If you write on the SQL side, you can create temp tables with little effort.
So the answer to that one is: in a relational language (like Andl) it’s so easy to create a relation (table) you just do it.
Much of my point in raising this discussion is that there are features that the NoSQL folks are implementing that are useful in some cases. Things they are doing like eventually consistent distributed stores are really required at sufficient scale, but there are other great ideas. SQL's storage model is not the only way, nor should it be. We shouldn't have to abandon the relational model to get such features, but we *do* have to abandon SQL to get them. And good riddance.
Agreed.
I would like to have relational stores providing such features before some monstrosity like Mongo or CouchDB becomes so entrenched we'll never be rid of it.
They do provide some genuinely useful non-relational features, but yes. Friendly relational is what I’m working on.
> From: pgsql-general-owner@postgresql.org [mailto:pgsql-general- > This turns out to be true in many areas of language design, mutli-user system > security, virtually everything to do with networking, and application > deployment. I was at an IETF meeting some years ago where someone talking > about "Internet of Things" stuff was going on at length about how nobody > around the IETF really understood constrained systems. Standing behind him > at the mic was an assortment of grey-bearded men who'd worked directly on the > original IMPs (which were 16-bit Honeywells that ran at like 5MHz and had > IIRC 16Kwords of memory). Amen to that. I started on embedded systems that ran >1 usec cycle time and 16kb memory. Machine code, no assembler. But I never want to do that again -- be pleased someone does! > It's also true that crappy interfaces that are good enough stick around > anyway. The UNIX Haters' Handbook is full of evidence of how much less good > UNIX was, but even Apple gave in. Also, many of the historical compromises > turn out, once you start to try to make different ones, to be more obviously > defensible. Amen to that. Replacing SQL is easy when you look at SQL's faults, but not so easy when you realise its strengths. > Most of the NoSQL trend was not a hatred of SQL the language but > a carelessness about the relational syntax or a view that promises about > consistency are dumb. Then the first credit card number gets lost in an > eventually-consistent system, and people suddenly understand viscerally why > transactions semantics are so hard. But there is goodness there, and NoSQL is now just as hard to replace. Regards David M Bennett FACS Andl - A New Database Language - andl.org
> owner@postgresql.org] On Behalf Of Eric Schwarzenbach > >> If I had a few $million to spend in a philanthropical manner, I would > >> hire some of the best PG devs to develop a proper relational database > server. > >> Probably a query language that expressed the relational algebra in a > >> scheme-like syntax, and the storage model would be properly > >> relational (eg no duplicate rows). If there were someone to pay the bills, would you work on it? > >> It's an enormous tragedy that all the development effort that has > >> gone into NoSQL database has pretty much all gotten it wrong: by all > >> means throw out SQL, but not the relational model with it. They're > >> all just rehashing the debate over hierarchical storage from the 70s. > >> Comp Sci courses should feature a history class. > >> > >> It's a bit odd to me that someone isn't working on such a thing. Several people are, but without the few $million... > > Well when IBM were first developing relational databases there were > > two different teams. One in California which produced System-R which > > became what we now know as DB2 and spawned SQL, and the other in > > Peterlee in the UK which was called PRTV (the Peterlee Relational Test > > Vehicle). PRTV rather died but bits of it survived. And many of the people who worked on it are still around. > > According to the Wikipedia page it did have a language (ISBL) but from > > what I recall (and it was nearly 40 years ago) there were a series of > > PL/1 function calls we used rather than encoding the request as a > > string as SQL systems require. Ditto. Including Hugh Darwen. > One of the people involved in that was Hugh Darwen, who is one of the authors > of The Third Manifesto, which is an attempt to define what a properly > relational language and system should look like. So you could say the > experience of ISBL vs SQL has been folded into that effort. See http://www.thethirdmanifesto.com/. Hugh worked for some years for IBM on the SQL Committee, but eventually left over a major disagreement in direction. TTMis based on the work he's done since (with Chris Date). Andl derives from that. I would say that very little of PRTV/ISBL experience was added to SQL once it had been standardised, even with Hugh doinghis best. Regards David M Bennett FACS Andl - A New Database Language - andl.org
Hi, On Sun, Apr 24, 2016 at 12:55:48PM +1000, david@andl.org wrote: > But there is goodness there, and NoSQL is now just as hard to replace. Indeed, I wasn't trying to make some point-and-laugh argument about NoSQL. I was just observing that, as with many new techniques, some of the uses haven't really been thought out carefully. (It doesn't help that at least one of the early "successes" worked way better in theory than in practice, and that whole businesses have been taken down by the failure modes. "Oooh, can't resolve conflict! Oh well, throw it all away!" is not a great way to store critical business data.) New technologies are hard. Some regard Brooks's _The Mythical Man-Month_ and Spolsky's "Things You Should Never Do, Part I" as saying different things. But I think they're in deep agreement on a key point: understanding why the old approach is there is way harder than figuring out that old approach; so there's a natural tendency to replace rather than to understand and build further. In Brooks, this leads to the communications death, which is one of the ways that adding more people to a late project makes it later. In Spolsky, it yields the straightforward observation that reading code is harder than writing it. In both cases, though, the point is that careful software development management is considerably harder than it seems. I think that those two works -- along with _Soul of a New Machine_ -- impart certain basic things you really need to internalise to see why so many large software projects are more about people's egos than about actually making stuff better. None of them says, "Don't do new things." But all militate towards understanding what you're throwing away before you start work. In I think 2003 or 2004 I read an article in _CACM_[1] that said (in my reading) that Google proved CAP was true and that we had to get over ourselves (I'm exaggerating for effect). As a data guy, I found this both troubling and influential, and I've thought about it a lot since. The thing I found compelling about it was the observation that Google's approach to consistency was way better than good enough, so one shouldn't care too much about durability or consistency. The thing that bothered me was the obvious counter-examples. I came to believe that the point I understood was obviously true in its domain, and dangerously false in other cases. In retrospect, is is obviously true that, if you understand your domain well enough, many data handling techniques could be appropriate. But that's also _only_ true if you understand your domain well enough: applying the wrong techniques to your data can be seriously harmful, too. This explains why various NoSQL techniques are so powerful in some ways and yet often so frustrating to data people. It explains why the most successful distributed database ever is the DNS, which is the wrong tool for nearly every job yet fabulously successful in its job. And it's an excellent way to organise thinking about how to pick the right technology for a given data situation. For if you pick the wrong one, you might find you've left a lot of the value in a data set practically inaccessible. You don't need perfect foresight. But attending a little to what value there is in your data can yield great dividends. We shape our tools and then our tools shape us [2]. But in the software world, we must be more mindful than ever that we understand our tools -- the shapes that they take and that they make. Historicism in software is no vice. It is the path by which we learn to make new mistakes, as opposed to the same mistake over again. [1] Darned if I can find the article, but I confess some scepticism that my original reading was what the authors intended. Doesn't matter for these purposes! :) [2] Apparently, Marshall McLuhan didn't say this; instead, his tribune John Culkin, SJ said it. It's still an excellent point, whoever made it. Best regards, A -- Andrew Sullivan ajs@crankycanuck.ca
> owner@postgresql.org] On Behalf Of Andrew Sullivan > Indeed, I wasn't trying to make some point-and-laugh argument about NoSQL. I > was just observing that, as with many new techniques, some of the uses > haven't really been thought out carefully. (It doesn't help that at least > one of the early "successes" worked way better in theory than in practice, > and that whole businesses have been taken down by the failure modes. "Oooh, > can't resolve conflict! Oh well, throw it all away!" is not a great way to > store critical business > data.) I can think of quite a few of those. Starting with JavaScript... > New technologies are hard. Some regard Brooks's _The Mythical Man-Month_ and > Spolsky's "Things You Should Never Do, Part I" as saying different things. > But I think they're in deep agreement on a key point: understanding why the > old approach is there is way harder than figuring out that old approach; so > there's a natural tendency to replace rather than to understand and build > further. I agree. Microsoft completely replaced their original Basic/GWBASIC/Bascom code base with VB, but they did both at the same time. And current versions of Word, Excel and MSVC are built directly from the original Windows code base -- no rewrites. The Edge browser is new, but IE goes on... > In Brooks, this leads to the communications death, which is one of the ways > that adding more people to a late project makes it later. In Spolsky, it > yields the straightforward observation that reading code is harder than > writing it. In both cases, though, the point is that careful software > development management is considerably harder than it seems. I think that > those two works -- along with _Soul of a New Machine_ -- impart certain basic > things you really need to internalise to see why so many large software > projects are more about people's egos than about actually making stuff > better. None of them says, "Don't do new things." But all militate towards > understanding what you're throwing away before you start work. I agree. I know those works well. I am the proud possessor of 500KLOC of 25yo C/C++ code and you can trust it! > > In I think 2003 or 2004 I read an article in _CACM_[1] that said (in my > reading) that Google proved CAP was true and that we had to get over > ourselves (I'm exaggerating for effect). As a data guy, I found this both > troubling and influential, and I've thought about it a lot since. The thing > I found compelling about it was the observation that Google's approach to > consistency was way better than good enough, so one shouldn't care too much > about durability or consistency. The thing that bothered me was the obvious > counter-examples. I came to believe that the point I understood was > obviously true in its domain, and dangerously false in other cases. I think CAP is true enough, but financial transactions and comments on a Facebook page occupy different places on the spectrum. > > In retrospect, is is obviously true that, if you understand your domain well > enough, many data handling techniques could be appropriate. But that's also > _only_ true if you understand your domain well enough: applying the wrong > techniques to your data can be seriously harmful, too. This explains why > various NoSQL techniques are so powerful in some ways and yet often so > frustrating to data people. It explains why the most successful distributed > database ever is the DNS, which is the wrong tool for nearly every job yet > fabulously successful in its job. And it's an excellent way to organise > thinking about how to pick the right technology for a given data situation. > For if you pick the wrong one, you might find you've left a lot of the value > in a data set practically inaccessible. You don't need perfect foresight. > But attending a little to what value there is in your data can yield great > dividends. Writing Andl from scratch has been fun -- there was no other way to find out what it should do. But there is a lot to say for actually producing a new dialect of SQL instead (built on the same foundations as the old one). It's just that SQL is such a damned ugly language! The reason I'm here and engaging in this discussion on this list is that Postgres is virtually the only candidate for hosting a new industrial grade relational language. It has the credos, the extensibility and the licensing to make it possible. > > We shape our tools and then our tools shape us [2]. But in the software > world, we must be more mindful than ever that we understand our tools -- the > shapes that they take and that they make. > Historicism in software is no vice. It is the path by which we learn to make > new mistakes, as opposed to the same mistake over again. Most people do very little tool shaping. Most people will use a hammer to drive a screw if that's what they see other people doing. I'm a toolmaker and I think we can do better than SQL, but it sure is hard to get there. Regards David M Bennett FACS Andl - A New Database Language - andl.org
On Sun, Apr 24, 2016 at 2:56 PM, <david@andl.org> wrote: >> One of the people involved in that was Hugh Darwen, who is one of the authors >> of The Third Manifesto, which is an attempt to define what a properly >> relational language and system should look like. So you could say the >> experience of ISBL vs SQL has been folded into that effort. > > See http://www.thethirdmanifesto.com/. So what incremental improvements could we steal from "properly relational" query languages? Here's one I've thought about, trivial as it may be. I noticed that Tutorial D (and apparently Andl too) includes a [NOT] MATCHING operator (alternatively spelled SEMIJOIN and SEMIMINUS) corresponding to the ⋉ (semi-join) and ▷ (anti-join) operators from relational algebra. In SQL you write [NOT] EXISTS or [NOT] IN in the WHERE clause, rather than something explicit in a <joined table> clause, though experienced users often talk explicitly about semi- and anti-joins, both because of the theory and because the terms show up in query plans. A recent blog post[1] argues that SQL should have a SQL92-style JOIN syntax for this and points at a couple of products that have added one[2][3]. I guess it might be hard to convince the Postgres community to add support for a non-standard syntax that doesn't give you anything you can't already do, but as an idea I find it interesting and it seems to be in the spirit of the part of the Third Manifesto that says: "support[ing] the usual operators of the relational algebra [..]. All such operators shall be expressible without excessive circumlocution." For example, say we want all students who have one or more exam today: SELECT s.student_id, s.name FROM student s WHERE EXISTS (SELECT 1 FROM exam e WHERE e.student_id = s.student_id AND e.exam_date = CURRENT_DATE) I don't know Tutorial D, but I think it might express that with something like: ( student MATCHING exam WHERE exam_date = CURRENT_DATE ) { student_id, name } With 10 minutes of prototype hacking I convinced Postgres to accept SEMI and ANTI like this: SELECT s.student_id, s.name FROM student s SEMI JOIN exam e USING (student_id) WHERE e.exam_date = CURRENT_DATE I guess a real version should accept (or require?) LEFT or RIGHT before SEMI/ANTI. When using this hypothetical syntax I think you should be allowed to refer to e.exam_date in the WHERE clause but not in the SELECT list (Impala apparently does allow you to see data from exam, and returns values from an arbitrary matching row, but that doesn't seem right to me). But the EXISTS syntax is correspondingly strange in that it requires you to provide a SELECT list which is entirely discarded, so people often write "*" or "1" (erm, OK, I guess you can use an empty select list in recent Postgres). [1] https://blog.jooq.org/2015/10/13/semi-join-and-anti-join-should-have-its-own-syntax-in-sql/ [2] http://www.cloudera.com/documentation/archive/impala/2-x/2-0-x/topics/impala_joins.html [3] https://cwiki.apache.org/confluence/display/Hive/LanguageManual+Joins -- Thomas Munro http://www.enterprisedb.com
On 24 April 2016 at 08:36, Thomas Munro <thomas.munro@enterprisedb.com> wrote: > For example, say we want all students who have one or more exam today: > > SELECT s.student_id, s.name > FROM student s > WHERE EXISTS (SELECT 1 > FROM exam e > WHERE e.student_id = s.student_id > AND e.exam_date = CURRENT_DATE) To be fair to SQL a far simpler version would be SELECT DISTINCT student_id, name FROM student INNER JOIN exam USING(student_id) WHERE exam_date=CURRENT_DATE; To find students with no exam today (the other point of your argument): SELECT student_id, name FROM student LEFT JOIN exam USING(student_id) WHERE exam_date=CURRENT_DATE AND exam.student_id IS NULL; Not really sure what the issue is with either of those. Once learned, they're both very easy and straightforward. Geoff
> owner@postgresql.org] On Behalf Of Thomas Munro > So what incremental improvements could we steal from "properly relational" > query languages? Here is my list of deficiencies in some or all dialects of SQL. -Relation and tuple as data types -Globally exclude NULLs and duplicate columns -Relation with no columns, null key -Natural antijoin -Tuple join operations -Tuple tests for equality, superset, subset -Tuple test for set membership of relation -Relation tests for equality, superset, subset -Relation set operations include symmetric difference -User-defined functions of arbitrary complexity -User-defined aggregation functions of arbitrary complexity -User-defined ordered queries of arbitrary complexity -Iterative/recursive queries (when) > Here's one I've thought about, trivial as it may be. I noticed that Tutorial > D (and apparently Andl too) includes a [NOT] MATCHING operator (alternatively > spelled SEMIJOIN and SEMIMINUS) corresponding to the ⋉ (semi-join) and ▷ > (anti-join) operators from relational algebra. In SQL you write [NOT] EXISTS > or [NOT] IN in the WHERE clause, rather than something explicit in a <joined > table> clause, though experienced users often talk explicitly about semi- and > anti-joins, both because of the theory and because the terms show up in query > plans. Yes, there are two joins (join and antijoin). Semijoin is just one of a number of projections following a join, but antijoinis a quite different algorithm. Antijoin is quite hard to write in SQL in such a way that the query planner will do the right thing. There is a lot of variationbetween dialects. > A recent blog post[1] argues that SQL should have a SQL92-style JOIN syntax > for this and points at a couple of products that have added one[2][3]. I > guess it might be hard to convince the Postgres community to add support for > a non-standard syntax that doesn't give you anything you can't already do, > but as an idea I find it interesting and it seems to be in the spirit of the > part of the Third Manifesto that says: "support[ing] the usual operators of > the relational algebra [..]. All such operators shall be expressible without > excessive circumlocution." The purpose is simply that explicit syntax allows for explicit query optimisation. > > For example, say we want all students who have one or more exam today: > > SELECT s.student_id, s.name > FROM student s > WHERE EXISTS (SELECT 1 > FROM exam e > WHERE e.student_id = s.student_id > AND e.exam_date = CURRENT_DATE) > > I don't know Tutorial D, but I think it might express that with something > like: > > ( student MATCHING exam > WHERE exam_date = CURRENT_DATE ) > { student_id, name } > > With 10 minutes of prototype hacking I convinced Postgres to accept SEMI and > ANTI like this: > > SELECT s.student_id, s.name > FROM student s SEMI JOIN exam e USING (student_id) > WHERE e.exam_date = CURRENT_DATE > > I guess a real version should accept (or require?) LEFT or RIGHT before > SEMIANTI. Andl has semijoin, rsemijoin, ajoin and rajoin. They are trivial to add once you have the basic algorithm, but Tutorial Ddoes not. When using this hypothetical syntax I think you should be allowed > to refer to e.exam_date in the WHERE clause but not in the SELECT list > (Impala apparently does allow you to see data from exam, and returns values > from an arbitrary matching row, but that doesn't seem right to me). But the > EXISTS syntax is correspondingly strange in that it requires you to provide a > SELECT list which is entirely discarded, so people often write "*" or "1" > (erm, OK, I guess you can use an empty select list in recent Postgres). SQL has an implicit ordering of query evaluation -- you will often need to write a nested subquery or correlated query forwhat should be very straightforward situations. That's another thing that's easy to fix, if allowed. Regards David M Bennett FACS Andl - A New Database Language - andl.org
On 24 April 2016 at 12:29, Geoff Winkless <pgsqladmin@geoff.dj> wrote: > To find students with no exam today (the other point of your argument): > > SELECT student_id, name > FROM student > LEFT JOIN exam USING(student_id) > WHERE exam_date=CURRENT_DATE AND exam.student_id IS NULL; *sigh* problem with writing emails in a rush. Of course I meant SELECT student.student_id, name FROM student LEFT JOIN exam ON exam.student_id=student.student_id AND exam_date=CURRENT_DATE WHERE exam.student_id IS NULL; Geoff
> From: Thomas Munro [mailto:thomas.munro@enterprisedb.com] > FWIW standard SQL may not allow it but Postgres does, and it's even possible > to exclude duplicates by using an expression that references the whole row. Thank you. I didn't know that. I'll use it if I can verify it works right. It's not that important -- Andl can emulate it quite easily. Regards David M Bennett FACS Andl - A New Database Language - andl.org
> owner@postgresql.org] On Behalf Of Eric Schwarzenbach > >> If I had a few $million to spend in a philanthropical manner, I would > >> hire some of the best PG devs to develop a proper relational database > server. > >> Probably a query language that expressed the relational algebra in a > >> scheme-like syntax, and the storage model would be properly > >> relational (eg no duplicate rows). If there were someone to pay the bills, would you work on it? > >> It's an enormous tragedy that all the development effort that has > >> gone into NoSQL database has pretty much all gotten it wrong: by all > >> means throw out SQL, but not the relational model with it. They're > >> all just rehashing the debate over hierarchical storage from the 70s. > >> Comp Sci courses should feature a history class. > >> > >> It's a bit odd to me that someone isn't working on such a thing. Several people are, but without the few $million... > > Well when IBM were first developing relational databases there were > > two different teams. One in California which produced System-R which > > became what we now know as DB2 and spawned SQL, and the other in > > Peterlee in the UK which was called PRTV (the Peterlee Relational Test > > Vehicle). PRTV rather died but bits of it survived. And many of the people who worked on it are still around. > > According to the Wikipedia page it did have a language (ISBL) but from > > what I recall (and it was nearly 40 years ago) there were a series of > > PL/1 function calls we used rather than encoding the request as a > > string as SQL systems require. Ditto. Including Hugh Darwen. > One of the people involved in that was Hugh Darwen, who is one of the authors > of The Third Manifesto, which is an attempt to define what a properly > relational language and system should look like. So you could say the > experience of ISBL vs SQL has been folded into that effort. See http://www.thethirdmanifesto.com/. Hugh worked for some years for IBM on the SQL Committee, but eventually left over a major disagreement in direction. TTMis based on the work he's done since (with Chris Date). Andl derives from that. I would say that very little of PRTV/ISBL experience was added to SQL once it had been standardised, even with Hugh doinghis best. Regards David M Bennett FACS Andl - A New Database Language - andl.org
On Apr 23, 2016, at 19:43 , David Bennett <davidb@pfxcorp.com> wrote: > owner@postgresql.org] On Behalf Of Eric Schwarzenbach > >>>> If I had a few $million to spend in a philanthropical manner, I would >>>> hire some of the best PG devs to develop a proper relational database >> server. >>>> Probably a query language that expressed the relational algebra in a >>>> scheme-like syntax, and the storage model would be properly >>>> relational (eg no duplicate rows). > > If there were someone to pay the bills, would you work on it? Yes, to the extent that my skills were useful (I'm good at logic, but I've always gravitated to higher-level languages, soif it was C, I'd write documentation or something). >>>> It's an enormous tragedy that all the development effort that has >>>> gone into NoSQL database has pretty much all gotten it wrong: by all >>>> means throw out SQL, but not the relational model with it. They're >>>> all just rehashing the debate over hierarchical storage from the 70s. >>>> Comp Sci courses should feature a history class. >>>> >>>> It's a bit odd to me that someone isn't working on such a thing. > > Several people are, but without the few $million… I hear you. I started this discussion because much as I utterly adore what the miracle workers behind Postgres have managedto create from the sow's ear that is SQL, I'm always aware of just how much better the world could be. And data storageis so central to everything that happens. Now is the time, with this NoSQL movement afoot, to provide something thatbreaks from SQL in the *right* way. I would love to be able to choose an eventually-consistent store that syncs with minimal effort with a local store in a handheldapp, or any of these great storage options that the NoSQL folks are doing, but giving up relations is *such* a highprice to pay, particularly when I know that it just isn't necessary. If someone was working on such a thing in a language that is less tedious than C (Go or Scheme or Haskell or Clojure, anyone?),I would happily contribute. My ideal data store is something like a top-level relational engine with pluggable stores, pluggable synchronization etcoptions, pluggable languages. Pretty much the philosophy that has driven Postgres from the beginning. If such a thingcould be produced and became popular, it would almost add measurably to GDP, I think, such could be its impact if doneright. Anyway, I hope everyone has found this as interesting a discussion as I have. I'll dig more deeply into Andl and watch itsprogress. And if anyone here knows of a non-SQL relational database project worth noting, please do so here.