Thread: Major features for 9.1

From:
Josh Berkus
Date:

All,

Here's a list of the features which seem to me to be major enough to be
headliners for 9.1.  Please mark which ones you think are major/not
major/not major and anything you think I've forgotten.

The tough part is going to be picking 5-6 items from the below.

* Synchronous Replication
Reason: obvious

* wCTE: Writeable Common Table Expressions
Reason: replace large amounts of app code with single queries.
Also, we may be the first DB to have this.  Anyone know?

* Per-column collations
Reason: fully multilingual databases now possible.
This has been a TODO for a decade.

* SSI: Serializable Snapshot Isolation
True serializability without locking.  Eliminate a lot of SELECT FOR
UPDATE code.
This might also be a PostgreSQL first.

* Unlogged Tables
Reason: allows using Postgres for not-valuable, high-performance data
like session tables instead of using a caching database

* SQL/MED with FILE_FDW
Reason: allows using PostgreSQL to query any kind of data source you can
roll a driver for.  First one shipping reads CSV files, eliminating the
need to COPY them into the database in many cases.

* KNN-GiST: K-Nearest Neighbor Indexing
Permits indexed searches of "what's near X", greatly improving the
responsiveness of PostGIS applications and allowing new application
types to be built.

* Single-command cloning, new admin tools for replication
Not sure about this one, should probably be buried in text somewhere.

* PL/Python Overhaul
Not sure what went into this, can anyone give me details?

* SEPostgres
SE-Postgres has been integrated into PostgreSQL through generic security
hooks and a contrib module.  Allows "military" level security/control
over database.

* Extensions
Optional PostgreSQL code, such as extra data types, GIS, languages, etc,
can now be packaged as pluggable extensions, greatly simplifying
installing and upgrading them.  In addition, the PGXN network will
become an online repository of these extensions.

Ones I don't think are major features but are cool anyway:

* Valid-on-creation FKs
* Checkpoint Goggles
* Extensible ENUMs
* Triggers on Views
* New Trigram implementation (text is similar to ...)
* Reduced NUMERIC size

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com

From:
Thom Brown
Date:

On 1 April 2011 20:21, Josh Berkus <> wrote:
> All,
>
> Here's a list of the features which seem to me to be major enough to be
> headliners for 9.1.  Please mark which ones you think are major/not
> major/not major and anything you think I've forgotten.
>
> The tough part is going to be picking 5-6 items from the below.
>
> * Synchronous Replication
> Reason: obvious
>
> * wCTE: Writeable Common Table Expressions
> Reason: replace large amounts of app code with single queries.
> Also, we may be the first DB to have this.  Anyone know?
>
> * Per-column collations
> Reason: fully multilingual databases now possible.
> This has been a TODO for a decade.
>
> * SSI: Serializable Snapshot Isolation
> True serializability without locking.  Eliminate a lot of SELECT FOR
> UPDATE code.
> This might also be a PostgreSQL first.
>
> * Unlogged Tables
> Reason: allows using Postgres for not-valuable, high-performance data
> like session tables instead of using a caching database
>
> * SQL/MED with FILE_FDW
> Reason: allows using PostgreSQL to query any kind of data source you can
> roll a driver for.  First one shipping reads CSV files, eliminating the
> need to COPY them into the database in many cases.
>
> * KNN-GiST: K-Nearest Neighbor Indexing
> Permits indexed searches of "what's near X", greatly improving the
> responsiveness of PostGIS applications and allowing new application
> types to be built.
>
> * Single-command cloning, new admin tools for replication
> Not sure about this one, should probably be buried in text somewhere.
>
> * PL/Python Overhaul
> Not sure what went into this, can anyone give me details?
>
> * SEPostgres
> SE-Postgres has been integrated into PostgreSQL through generic security
> hooks and a contrib module.  Allows "military" level security/control
> over database.
>
> * Extensions
> Optional PostgreSQL code, such as extra data types, GIS, languages, etc,
> can now be packaged as pluggable extensions, greatly simplifying
> installing and upgrading them.  In addition, the PGXN network will
> become an online repository of these extensions.
>
> Ones I don't think are major features but are cool anyway:
>
> * Valid-on-creation FKs
> * Checkpoint Goggles
> * Extensible ENUMs
> * Triggers on Views
> * New Trigram implementation (text is similar to ...)
> * Reduced NUMERIC size

What about column data type alterations for populated tables which
don't require rewrites if they're binary-coercible?

--
Thom Brown
Twitter: @darkixion
IRC (freenode): dark_ixion
Registered Linux user: #516935

EnterpriseDB UK: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

From:
"Joshua D. Drake"
Date:

On Fri, 2011-04-01 at 12:21 -0700, Josh Berkus wrote:
> All,
>
> Here's a list of the features which seem to me to be major enough to be
> headliners for 9.1.  Please mark which ones you think are major/not
> major/not major and anything you think I've forgotten.
>
> The tough part is going to be picking 5-6 items from the below.
>
> * Synchronous Replication
> Reason: obvious

+1

>
> * wCTE: Writeable Common Table Expressions
> Reason: replace large amounts of app code with single queries.
> Also, we may be the first DB to have this.  Anyone know?
>
> * Per-column collations
> Reason: fully multilingual databases now possible.
> This has been a TODO for a decade.
>
> * SSI: Serializable Snapshot Isolation
> True serializability without locking.  Eliminate a lot of SELECT FOR
> UPDATE code.
> This might also be a PostgreSQL first.

-1 (it is a great feature but doesn't mean anything to the wider
"press")

>
> * Unlogged Tables
> Reason: allows using Postgres for not-valuable, high-performance data
> like session tables instead of using a caching database

+1

>
> * SQL/MED with FILE_FDW
> Reason: allows using PostgreSQL to query any kind of data source you can
> roll a driver for.  First one shipping reads CSV files, eliminating the
> need to COPY them into the database in many cases.
>

+1

> * KNN-GiST: K-Nearest Neighbor Indexing
> Permits indexed searches of "what's near X", greatly improving the
> responsiveness of PostGIS applications and allowing new application
> types to be built.

~ +/- 1

>
> * Single-command cloning, new admin tools for replication
> Not sure about this one, should probably be buried in text somewhere.
>

-1

> * PL/Python Overhaul
> Not sure what went into this, can anyone give me details?
>
-1

> * SEPostgres
> SE-Postgres has been integrated into PostgreSQL through generic security
> hooks and a contrib module.  Allows "military" level security/control
> over database.
>

+1

> * Extensions
> Optional PostgreSQL code, such as extra data types, GIS, languages, etc,
> can now be packaged as pluggable extensions, greatly simplifying
> installing and upgrading them.  In addition, the PGXN network will
> become an online repository of these extensions.
>


+1

JD
--
PostgreSQL.org Major Contributor
Command Prompt, Inc: http://www.commandprompt.com/ - 509.416.6579
Consulting, Training, Support, Custom Development, Engineering
http://twitter.com/cmdpromptinc | http://identi.ca/commandprompt


From:
Pavel Stehule
Date:

Hello

2011/4/1 Josh Berkus <>:
> All,
>
> Here's a list of the features which seem to me to be major enough to be
> headliners for 9.1.  Please mark which ones you think are major/not
> major/not major and anything you think I've forgotten.
>
> The tough part is going to be picking 5-6 items from the below.
>
> * Synchronous Replication
> Reason: obvious
>
> * wCTE: Writeable Common Table Expressions
> Reason: replace large amounts of app code with single queries.
> Also, we may be the first DB to have this.  Anyone know?
>

+1

> * Per-column collations
> Reason: fully multilingual databases now possible.
> This has been a TODO for a decade.
>

+1

> * SSI: Serializable Snapshot Isolation
> True serializability without locking.  Eliminate a lot of SELECT FOR
> UPDATE code.
> This might also be a PostgreSQL first.
>

+1

> * Unlogged Tables
> Reason: allows using Postgres for not-valuable, high-performance data
> like session tables instead of using a caching database
>

+1

> * SQL/MED with FILE_FDW
> Reason: allows using PostgreSQL to query any kind of data source you can
> roll a driver for.  First one shipping reads CSV files, eliminating the
> need to COPY them into the database in many cases.
>

-1

> * KNN-GiST: K-Nearest Neighbor Indexing
> Permits indexed searches of "what's near X", greatly improving the
> responsiveness of PostGIS applications and allowing new application
> types to be built.
>

+1
> * Single-command cloning, new admin tools for replication
> Not sure about this one, should probably be buried in text somewhere.
>

-1
> * PL/Python Overhaul
> Not sure what went into this, can anyone give me details?
>

-1

> * SEPostgres
> SE-Postgres has been integrated into PostgreSQL through generic security
> hooks and a contrib module.  Allows "military" level security/control
> over database.

-1

>
> * Extensions
> Optional PostgreSQL code, such as extra data types, GIS, languages, etc,
> can now be packaged as pluggable extensions, greatly simplifying
> installing and upgrading them.  In addition, the PGXN network will
> become an online repository of these extensions.
>

-1

I am thinking so 9.1 is little bit blurred - The most cool feature
highly depends on user group - PostGIS (KNN-GiST), ETL (unlogged
tables), in europe we like collations, developers will like wCTE, and
Java and Hibernate people will like SSI. A border is too wide. I like
SQL/MED but I am thinking, so this needs more time and as minimim is
real support for other SQL servers - this feature isn't completed now,
SEPostgres needs lot of love from people who work on distributions. I
don't know anybody, who can use this feature.


> Ones I don't think are major features but are cool anyway:
>
> * Valid-on-creation FKs
> * Checkpoint Goggles
> * Extensible ENUMs
> * Triggers on Views

-1

> * New Trigram implementation (text is similar to ...)

it is good marketing feature :) - same is functional dependency for
GROUP BY clause.



> * Reduced NUMERIC size

-1

I like all features - but not all can be mayor :)

Pavel

>
> --
> Josh Berkus
> PostgreSQL Experts Inc.
> http://pgexperts.com
>
> --
> Sent via pgsql-advocacy mailing list ()
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-advocacy
>

From:
Guillaume Lelarge
Date:

Le 01/04/2011 21:21, Josh Berkus a écrit :
> All,
>
> Here's a list of the features which seem to me to be major enough to be
> headliners for 9.1.  Please mark which ones you think are major/not
> major/not major and anything you think I've forgotten.
>
> The tough part is going to be picking 5-6 items from the below.
>
> * Synchronous Replication
> Reason: obvious
>

+1
Not my personal favorite feature in 9.1, but it's one that people will
notice.

> * wCTE: Writeable Common Table Expressions
> Reason: replace large amounts of app code with single queries.
> Also, we may be the first DB to have this.  Anyone know?
>
> * Per-column collations
> Reason: fully multilingual databases now possible.
> This has been a TODO for a decade.
>

+1

> * SSI: Serializable Snapshot Isolation
> True serializability without locking.  Eliminate a lot of SELECT FOR
> UPDATE code.
> This might also be a PostgreSQL first.
>
> * Unlogged Tables
> Reason: allows using Postgres for not-valuable, high-performance data
> like session tables instead of using a caching database
>

+1

> * SQL/MED with FILE_FDW
> Reason: allows using PostgreSQL to query any kind of data source you can
> roll a driver for.  First one shipping reads CSV files, eliminating the
> need to COPY them into the database in many cases.
>

+1
One of my two personal favorites features in 9.1.

> * KNN-GiST: K-Nearest Neighbor Indexing
> Permits indexed searches of "what's near X", greatly improving the
> responsiveness of PostGIS applications and allowing new application
> types to be built.
>
> * Single-command cloning, new admin tools for replication
> Not sure about this one, should probably be buried in text somewhere.
>

+1 (but I would better say "Better administration and monitoring
features for replication")

> * PL/Python Overhaul
> Not sure what went into this, can anyone give me details?
>
> * SEPostgres
> SE-Postgres has been integrated into PostgreSQL through generic security
> hooks and a contrib module.  Allows "military" level security/control
> over database.
>
> * Extensions
> Optional PostgreSQL code, such as extra data types, GIS, languages, etc,
> can now be packaged as pluggable extensions, greatly simplifying
> installing and upgrading them.  In addition, the PGXN network will
> become an online repository of these extensions.
>

+1
The other one of my two personal favorites features in 9.1.

> Ones I don't think are major features but are cool anyway:
>
> * Valid-on-creation FKs
> * Checkpoint Goggles
> * Extensible ENUMs
> * Triggers on Views
> * New Trigram implementation (text is similar to ...)
> * Reduced NUMERIC size
>


--
Guillaume
 http://www.postgresql.fr
 http://dalibo.com

From:
"Kevin Grittner"
Date:

Josh Berkus <> wrote:

> * Synchronous Replication
> Reason: obvious

Gotta mention this, based on constantly expressed interest.

+1

> * wCTE: Writeable Common Table Expressions
> Reason: replace large amounts of app code with single queries.
> Also, we may be the first DB to have this.  Anyone know?

This is very cool, but I don't know how many people know enough
about CTEs to be impressed.

> * Per-column collations
> Reason: fully multilingual databases now possible.
> This has been a TODO for a decade.

Not big for me, but I'll bet in certain countries or application
types it is huge.

> * SSI: Serializable Snapshot Isolation
> True serializability without locking.  Eliminate a lot of SELECT
> FOR UPDATE code.
> This might also be a PostgreSQL first.

Obviously *I* think it's big.  Besides the on-list posts, I've
gotten quite a few off-list emails and comments at conferences which
make me think there are a lot of others interested.  Notice the
aggressive testing it got once it hit alpha.  I'm pretty sure I'd
have heard about it or run across it in web searches if anybody else
had beat us to this.  (I would say that "without blocking" would be
more technically accurate than "without locking".)

+1

> * Unlogged Tables
> Reason: allows using Postgres for not-valuable, high-performance
> data like session tables instead of using a caching database

This will be very useful for our shop, but like CTEs, I don't know
how many people in the wider audience will be as impressed by this
as the other items.

> * SQL/MED with FILE_FDW
> Reason: allows using PostgreSQL to query any kind of data source
> you can roll a driver for.  First one shipping reads CSV files,
> eliminating the need to COPY them into the database in many cases.

I have no idea how big the audience is for this one.

> * KNN-GiST: K-Nearest Neighbor Indexing
> Permits indexed searches of "what's near X", greatly improving the
> responsiveness of PostGIS applications and allowing new
> application types to be built.

This is very cool, although I don't think I'll have any use for it.
The audience may be smaller than for other features.

> * Single-command cloning, new admin tools for replication
> Not sure about this one, should probably be buried in text
> somewhere.

Agree on including in text but not major feature point.

> * PL/Python Overhaul
> Not sure what went into this, can anyone give me details?

As valuable as this is, an overhaul doesn't sound like a major
feature, on the face of it.

-1

> * SEPostgres
> SE-Postgres has been integrated into PostgreSQL through generic
> security hooks and a contrib module.  Allows "military" level
> security/control over database.

Clearly there is a lot of interest in certain quarters.  My concern
on pushing it as a major feature is that, given the nature of the
community which needs this, I suspect that those who really care
already know it's coming in 9.1, and those who don't already know
won't care so much.  (I could be wrong about that, though.)

> * Extensions
> Optional PostgreSQL code, such as extra data types, GIS,
> languages, etc, can now be packaged as pluggable extensions,
> greatly simplifying installing and upgrading them.  In addition,
> the PGXN network will become an online repository of these
> extensions.

That's big.

+1

-Kevin

From:
phb07
Date:

Hi all,

I realize that by chance my choices are exactly the same as Guillaume's list !
So no need to repeat  ;-) . Just count +2 !
And I fully agree with his comments too.

Regards.
Philippe Beaudoin.



Guillaume Lelarge a écrit :
> Le 01/04/2011 21:21, Josh Berkus a écrit :
>
>> All,
>>
>> Here's a list of the features which seem to me to be major enough to be
>> headliners for 9.1.  Please mark which ones you think are major/not
>> major/not major and anything you think I've forgotten.
>>
>> The tough part is going to be picking 5-6 items from the below.
>>
>> * Synchronous Replication
>> Reason: obvious
>>
>>
>
> +1
> Not my personal favorite feature in 9.1, but it's one that people will
> notice.
>
>
>> * wCTE: Writeable Common Table Expressions
>> Reason: replace large amounts of app code with single queries.
>> Also, we may be the first DB to have this.  Anyone know?
>>
>> * Per-column collations
>> Reason: fully multilingual databases now possible.
>> This has been a TODO for a decade.
>>
>>
>
> +1
>
>
>> * SSI: Serializable Snapshot Isolation
>> True serializability without locking.  Eliminate a lot of SELECT FOR
>> UPDATE code.
>> This might also be a PostgreSQL first.
>>
>> * Unlogged Tables
>> Reason: allows using Postgres for not-valuable, high-performance data
>> like session tables instead of using a caching database
>>
>>
>
> +1
>
>
>> * SQL/MED with FILE_FDW
>> Reason: allows using PostgreSQL to query any kind of data source you can
>> roll a driver for.  First one shipping reads CSV files, eliminating the
>> need to COPY them into the database in many cases.
>>
>>
>
> +1
> One of my two personal favorites features in 9.1.
>
>
>> * KNN-GiST: K-Nearest Neighbor Indexing
>> Permits indexed searches of "what's near X", greatly improving the
>> responsiveness of PostGIS applications and allowing new application
>> types to be built.
>>
>> * Single-command cloning, new admin tools for replication
>> Not sure about this one, should probably be buried in text somewhere.
>>
>>
>
> +1 (but I would better say "Better administration and monitoring
> features for replication")
>
>
>> * PL/Python Overhaul
>> Not sure what went into this, can anyone give me details?
>>
>> * SEPostgres
>> SE-Postgres has been integrated into PostgreSQL through generic security
>> hooks and a contrib module.  Allows "military" level security/control
>> over database.
>>
>> * Extensions
>> Optional PostgreSQL code, such as extra data types, GIS, languages, etc,
>> can now be packaged as pluggable extensions, greatly simplifying
>> installing and upgrading them.  In addition, the PGXN network will
>> become an online repository of these extensions.
>>
>>
>
> +1
> The other one of my two personal favorites features in 9.1.
>
>
>> Ones I don't think are major features but are cool anyway:
>>
>> * Valid-on-creation FKs
>> * Checkpoint Goggles
>> * Extensible ENUMs
>> * Triggers on Views
>> * New Trigram implementation (text is similar to ...)
>> * Reduced NUMERIC size
>>
>>
>
>
>

From:
Robert Bernier
Date:

Here's my thoughts...

> Here's a list of the features which seem to me to be major enough to be
> headliners for 9.1.  Please mark which ones you think are major/not
> major/not major and anything you think I've forgotten.
>
> ... picking 5-6 items ...


>  Synchronous Replication

+1

it's the flavour of the day


> * wCTE: Writeable Common Table Expressions

+1

I'm finding myself using a lot of Common Table Expressions it's more intuitive
than the more traditional BEGIN .. COMMIT paradigm especially for
programmers/developers


> * Per-column collations

-1

>
> * SSI: Serializable Snapshot Isolation
> True serializability without locking.

-1

You have to explain this otherwise there's no point in promoting this

> * Unlogged Tables
> Reason: allows using Postgres for not-valuable, high-performance data
> like session tables instead of using a caching database


+1

This is a sleeper. Drop the apology that it's for 'not-valuable' data. I can
see a lot of places that can allow pg to enter the big-data / cassandra /
hbase  world


> * SQL/MED with FILE_FDW

+1


> * KNN-GiST: K-Nearest Neighbor Indexing
> Permits indexed searches of "what's near X", greatly improving the
> responsiveness of PostGIS applications and allowing new application
> types to be built.

-1


> * Single-command cloning, new admin tools for replication
> Not sure about this one, should probably be buried in text somewhere.

-1

Should be massaged as part of the continuing evolution of replication support,
the first point.


> * PL/Python Overhaul

-1

need more details to vote on it


> SEPostgres

-1

requires a user-case i.e. show how a past high profile exploit could have been
stopped with this mechanism in place.


> Extensions

+1

I suggest a user-case explanation of how would it be used



From:
Simon Riggs
Date:

On Fri, Apr 1, 2011 at 8:21 PM, Josh Berkus <> wrote:
> All,
>
> Here's a list of the features which seem to me to be major enough to be
> headliners for 9.1.  Please mark which ones you think are major/not
> major/not major and anything you think I've forgotten.
>
> The tough part is going to be picking 5-6 items from the below.
>
> * Synchronous Replication
> Reason: obvious
>
> * wCTE: Writeable Common Table Expressions
> Reason: replace large amounts of app code with single queries.
> Also, we may be the first DB to have this.  Anyone know?
>
> * Per-column collations
> Reason: fully multilingual databases now possible.
> This has been a TODO for a decade.
>
> * SSI: Serializable Snapshot Isolation
> True serializability without locking.  Eliminate a lot of SELECT FOR
> UPDATE code.
> This might also be a PostgreSQL first.

Perhaps we can outline Major Features and Innovations separately. This
release is notable for the number of genuinely new ideas implemented,
so that's worth making something of even if the advanced features
themselves need a little more explanation. Or Perhaps have
"Innovations" as a major feature and then a comma list of the names of
them.

--
 Simon Riggs                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

From:
Josh Berkus
Date:

> Perhaps we can outline Major Features and Innovations separately. This
> release is notable for the number of genuinely new ideas implemented,
> so that's worth making something of even if the advanced features
> themselves need a little more explanation. Or Perhaps have
> "Innovations" as a major feature and then a comma list of the names of
> them.

Yeah, that sounds like the way to go.

So for Innovations, we'd have SSI, SE/Postgres, KNN-GiST, and wCTE, no?
 Or does someone else have wCTEs?  I vaguely remember something about
SQL Server ...

There's also another obvious grouping: Extensibility.  This would
include both Extensions and SQL/MED.

So we could have:

Features
    Sync Rep
    Per-col collation
    Unlogged tables

Innovations
    SSI
    KNN-GiST
    SE/Postgres
    wCTE

Extensibility
    Extensions
    SQL/MED

I like that, it means we can list everything but without having a long
laundry list which nobody will read.

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com

From:
Simon Riggs
Date:

On Sat, Apr 2, 2011 at 11:32 PM, Josh Berkus <> wrote:
>
>> Perhaps we can outline Major Features and Innovations separately. This
>> release is notable for the number of genuinely new ideas implemented,
>> so that's worth making something of even if the advanced features
>> themselves need a little more explanation. Or Perhaps have
>> "Innovations" as a major feature and then a comma list of the names of
>> them.
>
> Yeah, that sounds like the way to go.
>
> So for Innovations, we'd have SSI, SE/Postgres, KNN-GiST, and wCTE, no?
>  Or does someone else have wCTEs?  I vaguely remember something about
> SQL Server ...

wCTEs are definitely in SQLServer 2008

I would add Sync Rep also, since the transaction-controlled part is
new and original.

> There's also another obvious grouping: Extensibility.  This would
> include both Extensions and SQL/MED.
>
> So we could have:
>
> Features
>        Sync Rep
>        Per-col collation
>        Unlogged tables
>
> Innovations
>        SSI
>        KNN-GiST
>        SE/Postgres
>        wCTE
>
> Extensibility
>        Extensions
>        SQL/MED
>
> I like that, it means we can list everything but without having a long
> laundry list which nobody will read.

Key Features
Innovations
Extensions


--
 Simon Riggs                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

From:
Peter Eisentraut
Date:

On lör, 2011-04-02 at 15:32 -0700, Josh Berkus wrote:
>  Or does someone else have wCTEs?  I vaguely remember something about
> SQL Server ...

Since they're in the SQL:2011 standards draft, they must have been
implemented somewhere before.


From:
Peter Eisentraut
Date:

On lör, 2011-04-02 at 15:32 -0700, Josh Berkus wrote:
> Innovations
>         SSI
>         KNN-GiST
>         SE/Postgres
>         wCTE

I assume you mean by "innovation" that these are things where PostgreSQL
was the first DBMS to implement them.  The risk with that is, unless you
do very careful research, someone might debunk each of these within 24
hours after the release, making the whole thing look foolish.

Moreover, the fact that something might have been innovated doesn't
explain how it helps the user.



From:
Florian Weimer
Date:

* Kevin Grittner:

>> * SSI: Serializable Snapshot Isolation
>> True serializability without locking.  Eliminate a lot of SELECT
>> FOR UPDATE code.
>> This might also be a PostgreSQL first.
>
> Obviously *I* think it's big.  Besides the on-list posts, I've
> gotten quite a few off-list emails and comments at conferences which
> make me think there are a lot of others interested.  Notice the
> aggressive testing it got once it hit alpha.  I'm pretty sure I'd
> have heard about it or run across it in web searches if anybody else
> had beat us to this.  (I would say that "without blocking" would be
> more technically accurate than "without locking".)
>
> +1

Does SSI address the UPSERT race?  If yes, it's going to help a lot of
people indeed.

--
Florian Weimer                <>
BFK edv-consulting GmbH       http://www.bfk.de/
Kriegsstraße 100              tel: +49-721-96201-1
D-76133 Karlsruhe             fax: +49-721-96201-99

From:
Emanuel Calvo
Date:

>
> * Synchronous Replication
> Reason: obvious
>

+1

> * wCTE: Writeable Common Table Expressions
> Reason: replace large amounts of app code with single queries.
> Also, we may be the first DB to have this.  Anyone know?
>

-/+1

> * Per-column collations
> Reason: fully multilingual databases now possible.
> This has been a TODO for a decade.
>
> * SSI: Serializable Snapshot Isolation
> True serializability without locking.  Eliminate a lot of SELECT FOR
> UPDATE code.
> This might also be a PostgreSQL first.
>

+1

> * Unlogged Tables
> Reason: allows using Postgres for not-valuable, high-performance data
> like session tables instead of using a caching database
>

+1

> * SQL/MED with FILE_FDW
> Reason: allows using PostgreSQL to query any kind of data source you can
> roll a driver for.  First one shipping reads CSV files, eliminating the
> need to COPY them into the database in many cases.
>

+1

> * KNN-GiST: K-Nearest Neighbor Indexing
> Permits indexed searches of "what's near X", greatly improving the
> responsiveness of PostGIS applications and allowing new application
> types to be built.
>
> * Single-command cloning, new admin tools for replication
> Not sure about this one, should probably be buried in text somewhere.
>

+1

> * PL/Python Overhaul
> Not sure what went into this, can anyone give me details?
>
> * SEPostgres
> SE-Postgres has been integrated into PostgreSQL through generic security
> hooks and a contrib module.  Allows "military" level security/control
> over database.
>
> * Extensions
> Optional PostgreSQL code, such as extra data types, GIS, languages, etc,
> can now be packaged as pluggable extensions, greatly simplifying
> installing and upgrading them.  In addition, the PGXN network will
> become an online repository of these extensions.
>
> Ones I don't think are major features but are cool anyway:
>
> * Valid-on-creation FKs
> * Checkpoint Goggles
> * Extensible ENUMs

+1

> * Triggers on Views
> * New Trigram implementation (text is similar to ...)
> * Reduced NUMERIC size

+1


--
--
              Emanuel Calvo
              Helpame.com

From:
"Kevin Grittner"
Date:

Florian Weimer  wrote:

> Does SSI address the UPSERT race?

Yes, but not necessarily in the way that you would like.  If all
transactions were serializable it would ensure that the UPSERT would
not commit with results inconsistent with some order of running the
UPSERT and other transactions one-at-a-time.  The problem is that
when a conflict situation developed SSI would prevent an anomaly from
being persisted by killing one of the transactions involved -- and it
would likely be the transaction running the UPSERT.  I get the
impression that most people interested in UPSERT would prefer a
solution which caused blocking on particular rows so that the UPSERT
would eventually make it through successfully after the other
transactions committed or rolled back.

Perhaps the predicate locking logic can be extended to allow lock
types which introduce blocking in some way which could provide the
behavior I've heard people discuss on the UPSERT threads, but the
non-blocking nature of the SIRREAD locks used by SSI only really
allows transaction cancellation as a conflict resolution technique.

-Kevin

From:
Joshua Berkus
Date:


> I assume you mean by "innovation" that these are things where
> PostgreSQL
> was the first DBMS to implement them. The risk with that is, unless
> you
> do very careful research, someone might debunk each of these within 24
> hours after the release, making the whole thing look foolish.

Well, we can hedge a bit.  But, better, can someone (preferably several someones) on this list volunteer to do
research? Anyone? 

>
> Moreover, the fact that something might have been innovated doesn't
> explain how it helps the user.

Truthfully, for a press release, it doesn't matter.  Many reporters/bloggers will happily quote that we're the first
databasewith SSI without having a clue what it means.  In general, most reporters who cover databases don't really know
know(or care) much about databases regardless (or "NoSQL" wouldn't have the press it does).   

From a PR perspective, the important thing to get across is that the PostgreSQL project is innovating, implementing new
databasetech before anyone else.  This is critically important during a period where SQL-RDBMSes are being portrayed as
"oldgrandfathers" in a press more enchanted with "the new hotness" of recently invented DBMSes. 

Now, for our users and for serious database geeks, we *do* need an elevator-pitch explanation of several features which
tellsthem why they should care.  Several features in particular are not immediately obvious: 

* SSI
* wCTE
* KNN-GiST
* SE-Postgres
* Transaction-controlled Synch Rep

I would like to have volunteers from the advocacy list commit to taking on one of each of these features.  For each one
weneed: 

a) a two-line explanation of what the feature is and why it's valuable (for the release notes, etc.)
b) a wiki page with a more detailed explaination and examples oriented towards the beginning-to-intermediate PostgreSQL
user.

Volunteers?


--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com
San Francisco

From:
Andrew Lardinois
Date:

I'm starting research on 'innovation' in database technologies. To be sure I'm on the right track, this thread is about serializable  architecture, or, transaction locking. 
Furthermore, upserts are an umbrella term for 'insert' and 'update'. 
So far so good, but when it comes to SSI I am not finding any explanations for this acronym. 

Searching for the predicate of a real database: ...
Andrew Lardinois

On Sun, Apr 3, 2011 at 11:18 AM, Joshua Berkus <> wrote:


> I assume you mean by "innovation" that these are things where
> PostgreSQL
> was the first DBMS to implement them. The risk with that is, unless
> you
> do very careful research, someone might debunk each of these within 24
> hours after the release, making the whole thing look foolish.

Well, we can hedge a bit.  But, better, can someone (preferably several someones) on this list volunteer to do research?  Anyone?

>
> Moreover, the fact that something might have been innovated doesn't
> explain how it helps the user.

Truthfully, for a press release, it doesn't matter.  Many reporters/bloggers will happily quote that we're the first database with SSI without having a clue what it means.  In general, most reporters who cover databases don't really know know (or care) much about databases regardless (or "NoSQL" wouldn't have the press it does).

From a PR perspective, the important thing to get across is that the PostgreSQL project is innovating, implementing new database tech before anyone else.  This is critically important during a period where SQL-RDBMSes are being portrayed as "old grandfathers" in a press more enchanted with "the new hotness" of recently invented DBMSes.

Now, for our users and for serious database geeks, we *do* need an elevator-pitch explanation of several features which tells them why they should care.  Several features in particular are not immediately obvious:

* SSI
* wCTE
* KNN-GiST
* SE-Postgres
* Transaction-controlled Synch Rep

I would like to have volunteers from the advocacy list commit to taking on one of each of these features.  For each one we need:

a) a two-line explanation of what the feature is and why it's valuable (for the release notes, etc.)
b) a wiki page with a more detailed explaination and examples oriented towards the beginning-to-intermediate PostgreSQL user.

Volunteers?


--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com
San Francisco

--
Sent via pgsql-advocacy mailing list ()
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-advocacy

From:
"Kevin Grittner"
Date:

Joshua Berkus  wrote:

> Now, for our users and for serious database geeks, we *do* need an
> elevator-pitch explanation of several features which tells them why
> they should care. Several features in particular are not
> immediately obvious:
>
> * SSI
> * wCTE
> * KNN-GiST
> * SE-Postgres
> * Transaction-controlled Synch Rep
>
> I would like to have volunteers from the advocacy list commit to
> taking on one of each of these features. For each one we need:
>
> a) a two-line explanation of what the feature is and why it's
> valuable (for the release notes, etc.)
> b) a wiki page with a more detailed explaination and examples
> oriented towards the beginning-to-intermediate PostgreSQL user.
>
> Volunteers?

I volunteer for SSI.

-Kevin

From:
Gilberto Castillo Martínez
Date:


El dom, 03-04-2011 a las 09:32 +0100, Simon Riggs escribió:
> On Sat, Apr 2, 2011 at 11:32 PM, Josh Berkus <> wrote:
> >

>
> Key Features
> Innovations
> Extensions
>
+1
--
Saludos,
Gilberto Castillo
Edificio Beijing. Miramar Trade Center. Etecsa.
Miramar, La Habana.Cuba.
---
This message was processed by Kaspersky Mail Gateway 5.6.28/RELEASE running at host imx3.etecsa.cu
Visit our web-site: <http://www.kaspersky.com>, <http://www.viruslist.com>

From:
Gilberto Castillo Martínez
Date:


El dom, 03-04-2011 a las 12:43 +0300, Peter Eisentraut escribió:
> On lör, 2011-04-02 at 15:32 -0700, Josh Berkus wrote:
> > Innovations
> >         SSI
> >         KNN-GiST
> >         SE/Postgres
> >         wCTE
>
> I assume you mean by "innovation" that these are things where PostgreSQL
> was the first DBMS to implement them.  The risk with that is, unless you
> do very careful research, someone might debunk each of these within 24
> hours after the release, making the whole thing look foolish.
>
> Moreover, the fact that something might have been innovated doesn't
> explain how it helps the user.
>

Placing a possible example was used for each.

--
Saludos,
Gilberto Castillo
Edificio Beijing. Miramar Trade Center. Etecsa.
Miramar, La Habana.Cuba.
---
This message was processed by Kaspersky Mail Gateway 5.6.28/RELEASE running at host imx2.etecsa.cu
Visit our web-site: <http://www.kaspersky.com>, <http://www.viruslist.com>

From:
"Kevin Grittner"
Date:

Andrew Lardinois  wrote:

> I'm starting research on 'innovation' in database technologies. To
> be sure I'm on the right track, this thread is about serializable
> architecture, or, transaction locking.

This thread has wandered into a discussion about innovative features
of the upcoming 9,1 release of PostgreSQL, of which there are
several.

> when it comes to SSI I am not finding any explanations
> for this acronym.

http://wiki.postgresql.org/wiki/Serializable

It's about a technique for serializable transactions which doesn't
add any blocking beyond what is present in snapshot isolation.  In
contrast, most serializable implementations which provide true
serializability do this by blocking conflicting transactions.  A
description of the SSI technique was first published in the 2008 ACM
SIGMOD, with independent confirmation of some of the published
performance improvements compared to traditional S2PL serializability
by an ACM committee.  While two prototype implementations were
developed to support academic papers on the topic, we believe this is
the first production-quality implementation.  It includes many
innovations which were not described in any academic papers on the
topic.

-Kevin

From:
Gilberto Castillo Martínez
Date:


>
> > when it comes to SSI I am not finding any explanations
> > for this acronym.
>
> http://wiki.postgresql.org/wiki/Serializable
>
> It's about a technique for serializable transactions which doesn't
> add any blocking beyond what is present in snapshot isolation.  In
> contrast, most serializable implementations which provide true
> serializability do this by blocking conflicting transactions.  A
> description of the SSI technique was first published in the 2008 ACM
> SIGMOD, with independent confirmation of some of the published
> performance improvements compared to traditional S2PL serializability
> by an ACM committee.  While two prototype implementations were
> developed to support academic papers on the topic, we believe this is
> the first production-quality implementation.  It includes many
> innovations which were not described in any academic papers on the
> topic.

Excelent!1

The News:
It's about a technique for serializable transactions which doesn't
add any blocking beyond what is present in snapshot isolation.It includes many
innovations which were not described in any academic papers on the
topic.

--
Saludos,
Gilberto Castillo
Edificio Beijing. Miramar Trade Center. Etecsa.
Miramar, La Habana.Cuba.
---
This message was processed by Kaspersky Mail Gateway 5.6.28/RELEASE running at host imx3.etecsa.cu
Visit our web-site: <http://www.kaspersky.com>, <http://www.viruslist.com>

From:
"Kevin Grittner"
Date:

Gilberto Castillo Martínez<> wrote:

>> > when it comes to SSI I am not finding any explanations
>> > for this acronym.
>>
>> http://wiki.postgresql.org/wiki/Serializable
>>
>> It's about a technique for serializable transactions which
>> doesn't add any blocking beyond what is present in snapshot
>> isolation. In contrast, most serializable implementations which
>> provide true serializability do this by blocking conflicting
>> transactions.  A description of the SSI technique was first
>> published in the 2008 ACM SIGMOD, with independent confirmation
>> of some of the published performance improvements compared to
>> traditional S2PL serializability by an ACM committee.  While two
>> prototype implementations were developed to support academic
>> papers on the topic, we believe this is the first production-
>> quality implementation.  It includes many innovations which were
>> not described in any academic papers on the topic.
>
> Excelent!
>
> The News:
> It's about a technique for serializable transactions which doesn't
> add any blocking beyond what is present in snapshot isolation. It
> includes many innovations which were not described in any academic
> papers on the topic.

I didn't intend for that to be my recommended "elevator talk"
summary; I was just trying to quickly explain the feature to someone
who knew nothing of it.  [pause]  Oh.  [pause]  I guess maybe this
should be my starting point, although I usually like to harp on the
fact that it allows programmers to enforce business rules within
triggers or even application code without worrying about explicit
locking and how this can boost programming productivity, especially
in large shops.

I'm not sure I can hit both angles in one two-sentence summary.
Which angle is better?  Maybe I could creep to three sentences to
allow both points?

-Kevin

From:
Andrew Lardinois
Date:

This must be the article in ACM-SIGMOD:
Which has a nice abstract, but does anybody on the list have the complete article?

There was also a paper published all the way back in 1997 by the Information Processing Society of Japan entitled:
"A concurrency control algorithm using serialization graph testing with write deferring"
Interestingly a Fujii, M is one (of three) authors. I'm wondering if that is Fujii, Masao of JPug? I'm off to write them now to confirm.

Andrew Lardinois


On Mon, Apr 4, 2011 at 9:20 AM, Kevin Grittner <> wrote:
Gilberto Castillo Martínez<> wrote:

>> > when it comes to SSI I am not finding any explanations
>> > for this acronym.
>>
>> http://wiki.postgresql.org/wiki/Serializable
>>
>> It's about a technique for serializable transactions which
>> doesn't add any blocking beyond what is present in snapshot
>> isolation. In contrast, most serializable implementations which
>> provide true serializability do this by blocking conflicting
>> transactions.  A description of the SSI technique was first
>> published in the 2008 ACM SIGMOD, with independent confirmation
>> of some of the published performance improvements compared to
>> traditional S2PL serializability by an ACM committee.  While two
>> prototype implementations were developed to support academic
>> papers on the topic, we believe this is the first production-
>> quality implementation.  It includes many innovations which were
>> not described in any academic papers on the topic.
>
> Excelent!
>
> The News:
> It's about a technique for serializable transactions which doesn't
> add any blocking beyond what is present in snapshot isolation. It
> includes many innovations which were not described in any academic
> papers on the topic.

I didn't intend for that to be my recommended "elevator talk"
summary; I was just trying to quickly explain the feature to someone
who knew nothing of it.  [pause]  Oh.  [pause]  I guess maybe this
should be my starting point, although I usually like to harp on the
fact that it allows programmers to enforce business rules within
triggers or even application code without worrying about explicit
locking and how this can boost programming productivity, especially
in large shops.

I'm not sure I can hit both angles in one two-sentence summary.
Which angle is better?  Maybe I could creep to three sentences to
allow both points?

-Kevin

From:
"Kevin Grittner"
Date:

Andrew Lardinois <> wrote:

> This must be the article in ACM-SIGMOD:
>
http://portal.acm.org/citation.cfm?id=1376616.1376690&coll=DL&dl=GUIDE&CFID=16503930&CFTOKEN=26926199
> Which has a nice abstract, but does anybody on the list have the
> complete article?

I did pay for and download the article, but it is a copyrighted
work, so I can't share it.  In an email exchange with Michael Cahill
he gave permission for me to share the URL of his personal copy of
the work with the list.  See my post in the archives here:

http://archives.postgresql.org/message-id/

Please note you are not allowed to distribute copies you download
from that URL either; it is restricted to the personal use of
whoever downloads it.

On the other hand, there is no such restriction on Michael Cahill's
later doctoral thesis, which covers everything in the ACM SIGMOD
paper and more.

> There was also a paper published all the way back in 1997 by the
> Information Processing Society of Japan entitled:
> "A concurrency control algorithm using serialization graph testing
> with write deferring"

Do you have a URL?  I haven't read that one, but I remember seeing
references to attempts to do full graph testing which failed due to
the high cost of doing so.  The main point of the SSI technique is
that it keys on a "dangerous structure" always found in cycles which
can cause anomalies (and from what I've seen rarely present without
causing such anomalies).  Testing for this two to three transaction
dangerous structure is much less expensive.

-Kevin

From:
Simon Riggs
Date:

On Mon, Apr 4, 2011 at 8:36 PM, Andrew Lardinois <> wrote:
> This must be the article in ACM-SIGMOD:
> http://portal.acm.org/citation.cfm?id=1376616.1376690&coll=DL&dl=GUIDE&CFID=16503930&CFTOKEN=26926199
> Which has a nice abstract, but does anybody on the list have the complete
> article?
> There was also a paper published all the way back in 1997 by the Information
> Processing Society of Japan entitled:
> "A concurrency control algorithm using serialization graph testing with
> write deferring"
> Interestingly a Fujii, M is one (of three) authors. I'm wondering if that is
> Fujii, Masao of JPug? I'm off to write them now to confirm.

It's not. He's been busy elsewhere.

--
 Simon Riggs                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

From:
"Kevin Grittner"
Date:

"Kevin Grittner" <> wrote:

> On the other hand, there is no such restriction on Michael
> Cahill's later doctoral thesis, which covers everything in the ACM
> SIGMOD paper and more.

Correction, it is copyrighted by the author with "All rights
reserved."  It is available for download without paying a fee,
though.

-Kevin

From:
Andrew Lardinois
Date:

Super thanks!

I have not found  "A concurrency control algorithm using serialization graph testing with write deferring" yet, but I did run into two other articles on the IPSJ site relating to serialization: 


the second of which is regarding bus serialization, so may not relate to databases specifically
http://www.jstage.jst.go.jp/article/ipsjdc/2/0/2_165/_article

Andrew Lardinois


On Mon, Apr 4, 2011 at 1:04 PM, Kevin Grittner <> wrote:
"Kevin Grittner" <> wrote:

> On the other hand, there is no such restriction on Michael
> Cahill's later doctoral thesis, which covers everything in the ACM
> SIGMOD paper and more.

Correction, it is copyrighted by the author with "All rights
reserved."  It is available for download without paying a fee,
though.

-Kevin

From:
"Kevin Grittner"
Date:

Andrew Lardinois <> wrote:

> I have not found  "A concurrency control algorithm using
> serialization graph testing with write deferring" yet, but I did
> run into two other articles on the IPSJ site relating to
> serialization:
>
> http://www.jstage.jst.go.jp/article/ipsjdc/2/0/2_759/_article

On a quick skim I don't think this is related to serializable
database transactions.  It might be useful for PostgreSQL in terms
of more efficient lightweight locks within the server memory space,
but that's an internal implementation issue, not anything which is
user-visible beyond its performance characteristics.  I'll give it a
closer read when I get a chance.

> the second of which is regarding bus serialization, so may not
> relate to databases specifically

Probably not.  This is getting really close to the metal, and
because of portability concerns PostgreSQL doesn't have a lot of
that.

-Kevin

From:
Jeff Davis
Date:

On Mon, 2011-04-04 at 07:20 -0500, Kevin Grittner wrote:
> > a) a two-line explanation of what the feature is and why it's
> > valuable (for the release notes, etc.)
> > b) a wiki page with a more detailed explaination and examples
> > oriented towards the beginning-to-intermediate PostgreSQL user.
> >
> > Volunteers?
>
> I volunteer for SSI.

The way I think about SSI is that it automatically detects live race
conditions in your SQL transactions at runtime; and protects you by
safely rolling some of them back (which can be retried safely).

Maybe something along those lines?

Regards,
    Jeff Davis


From:
"Kevin Grittner"
Date:

Jeff Davis <> wrote:
> On Mon, 2011-04-04 at 07:20 -0500, Kevin Grittner wrote:
>>> a) a two-line explanation of what the feature is and why it's
>>> valuable (for the release notes, etc.)
>>> b) a wiki page with a more detailed explaination and examples
>>> oriented towards the beginning-to-intermediate PostgreSQL user.
>>>
>>> Volunteers?
>>
>> I volunteer for SSI.
>
> The way I think about SSI is that it automatically detects live
> race conditions in your SQL transactions at runtime; and protects
> you by safely rolling some of them back (which can be retried
> safely).
>
> Maybe something along those lines?

That is very accessible to an IT manager who's not a database
expert, and conceptually very clean.  It doesn't directly address
what's innovative about it, though, nor does it directly address why
it's valuable -- although one could infer both through a close
reading.

To extend that such that the innovative nature and benefits are more
explicitly stated, perhaps:

SSI allows you to enforce arbitrarily complex user-defined business
rules within the database without blocking, by automatically
detecting live race conditions in your SQL transactions at runtime.
It protects you by safely rolling some of them back (which can be
retried safely).

It's more than two lines, but I'm not immediately able to see what
to cut.

By the way, I did some google searches to try to find a prior
production quality implementation, and have so far not found any.  I
have discovered that SSI is a very popular TLA, including its use
for something called Server Side Includes (SSI.php) which is a
popular enough technology with databases for people to be listing
years of experience with it on their LinkedIn pages.  I don't know
how much of a problem that might be.  I guess at a minimum we should
add Serializable Snapshot Isolation to the Wikipedia SSI page:

http://en.wikipedia.org/wiki/SSI

Maybe we shouldn't use SSI without spelling out Serializable
Snapshot Isolation, to avoid confusion.

-Kevin

From:
Aidan Van Dyk
Date:

Kevin Grittner wrote:

>> * wCTE: Writeable Common Table Expressions
>> Reason: replace large amounts of app code with single queries.
>> Also, we may be the first DB to have this.  Anyone know?
>
> This is very cool, but I don't know how many people know enough
> about CTEs to be impressed.

But those of us who do, are *itching* to use them in production applications
instead of having to pull everything into application code (or pl/*
functions) just to push into another query ;-)


>> * SQL/MED with FILE_FDW
>> Reason: allows using PostgreSQL to query any kind of data source
>> you can roll a driver for.  First one shipping reads CSV files,
>> eliminating the need to COPY them into the database in many cases.
>
> I have no idea how big the audience is for this one.

I *hope* this will really grow though, through adoption of extensions, and
PGXN...

Man 9.1 is packed with good stuff ;-)

a.

From:
Andrew Lardinois
Date:

I could go on finding random articles with some content about serialization all night. Instead I had a peek at what some other databases have to say about serialization / SSI:
SEARCHING MYSQL.COM ORACLE.COM COUCHDB
SSI                        X 60 X
SERIALIZATION                X lots 3
UPSERT                X lots X

http://labs.oracle.com/techrep/1996/smli_tr-96-57.pdf#search="SSI" see page 2, but this paper is titled 
"Solaris MC File System Framework" which makes me think it is operating system specific.

Regarding oracle on serialization, search results seem to be about specific products, i.e. Oracle Pedigree, or just object oriented serialization. Not database transactions specifically.

NoSQL... 
There is a 'load a serialized object in hadoop' thread in hadoop 

There is also an apache project called avro which is 'a data serialization system'
This does not seem to be a database specifically, but avro joined apache in 2009, so presumably has been around for pior to that.
It is worth noting that avro is built with JSON.

Seems like the the most confusion about serialization in the database would come from Amazon, the cloud, and Amazon's simpledb in particular. 
There is documentation that uses serialization:
"Policy -- the JSON serialization of the topic's access control policy" at:

Should note too that the couchdb project uses native JSON serialization.

It is surprising how much JSON "an ideal data-interchange language" (from json.org) turns up in other database technologies. 
This sounds a lot more like XML to me than how relational database store and manage data.

Valorizing knowledge capital, 
Andrew Lardinois
From:
Oleg Bartunov
Date:

Yes, SSI confuses me too. I used to use server sides includes in the eve
of web. I think our SSI is a cool technology and wiki page is must to have.
But I'd like to see useful examples.

Oleg

On Mon, 4 Apr 2011, Kevin Grittner wrote:

> Jeff Davis <> wrote:
>> On Mon, 2011-04-04 at 07:20 -0500, Kevin Grittner wrote:
>>>> a) a two-line explanation of what the feature is and why it's
>>>> valuable (for the release notes, etc.)
>>>> b) a wiki page with a more detailed explaination and examples
>>>> oriented towards the beginning-to-intermediate PostgreSQL user.
>>>>
>>>> Volunteers?
>>>
>>> I volunteer for SSI.
>>
>> The way I think about SSI is that it automatically detects live
>> race conditions in your SQL transactions at runtime; and protects
>> you by safely rolling some of them back (which can be retried
>> safely).
>>
>> Maybe something along those lines?
>
> That is very accessible to an IT manager who's not a database
> expert, and conceptually very clean.  It doesn't directly address
> what's innovative about it, though, nor does it directly address why
> it's valuable -- although one could infer both through a close
> reading.
>
> To extend that such that the innovative nature and benefits are more
> explicitly stated, perhaps:
>
> SSI allows you to enforce arbitrarily complex user-defined business
> rules within the database without blocking, by automatically
> detecting live race conditions in your SQL transactions at runtime.
> It protects you by safely rolling some of them back (which can be
> retried safely).
>
> It's more than two lines, but I'm not immediately able to see what
> to cut.
>
> By the way, I did some google searches to try to find a prior
> production quality implementation, and have so far not found any.  I
> have discovered that SSI is a very popular TLA, including its use
> for something called Server Side Includes (SSI.php) which is a
> popular enough technology with databases for people to be listing
> years of experience with it on their LinkedIn pages.  I don't know
> how much of a problem that might be.  I guess at a minimum we should
> add Serializable Snapshot Isolation to the Wikipedia SSI page:
>
> http://en.wikipedia.org/wiki/SSI
>
> Maybe we shouldn't use SSI without spelling out Serializable
> Snapshot Isolation, to avoid confusion.
>
> -Kevin
>
>

     Regards,
         Oleg
_____________________________________________________________
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
Sternberg Astronomical Institute, Moscow University, Russia
Internet: , http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83

From:
Gilberto Castillo Martínez
Date:


El mar, 05-04-2011 a las 13:36 +0400, Oleg Bartunov escribió:
> Yes, SSI confuses me too. I used to use server sides includes in the eve
> of web. I think our SSI is a cool technology and wiki page is must to have.
> But I'd like to see useful examples.

All rigth, it's good.
--
Saludos,
Gilberto Castillo
Edificio Beijing. Miramar Trade Center. Etecsa.
Miramar, La Habana.Cuba.
---
This message was processed by Kaspersky Mail Gateway 5.6.28/RELEASE running at host imx3.etecsa.cu
Visit our web-site: <http://www.kaspersky.com>, <http://www.viruslist.com>

From:
"Kevin Grittner"
Date:

> Andrew Lardinois  wrote:

> I could go on finding random articles with some content about
> serialization all night.

> [all sorts of articles about data serialization and Server Side
> Include technology]

But we're talking about *SERIALIZABLE transactions*, and you have yet
to cite an article which has anything to do with that.  Please don't
post more of this until you read the wiki page I cited and understand
the issue.  If you find an article related to *that* topic I'm all
ears.  Random articles about unrelated topics which happen to contain
similar words are not helpful.

-Kevin

From:
"Kevin Grittner"
Date:

Oleg Bartunov  wrote:

> Yes, SSI confuses me too. I used to use server sides includes in
> the eve of web. I think our SSI is a cool technology and wiki page
> is must to have. But I'd like to see useful examples.

I guess the question is whether we want to use a name for the
technology that is different from what those who first developed the
concepts used in their published papers.  I've tried to respect their
work by using their technology.

The current wiki page was used for organizing work and communicating
with the community during development.  I think it might be best to
keep it that way and start a new page for the user perspective and
examples (with links back and forth, of course).  I will be working
on that.

There is an example or two in the docs.  Basically you can take any
example from the last ten years of where you need to use table locks
or SELECT FOR UPDATE and it now doesn't need that.  Those make great
examples.  ;-)

-Kevin

From:
Nikolas Everett
Date:

On Tue, Apr 5, 2011 at 8:22 AM, Kevin Grittner <> wrote:
> Andrew Lardinois  wrote:

> I could go on finding random articles with some content about
> serialization all night.

> [all sorts of articles about data serialization and Server Side
> Include technology]

But we're talking about *SERIALIZABLE transactions*, and you have yet
to cite an article which has anything to do with that.  Please don't
post more of this until you read the wiki page I cited and understand
the issue.  If you find an article related to *that* topic I'm all
ears.  Random articles about unrelated topics which happen to contain
similar words are not helpful.

It is rather unfortunate that serialize means both things.  It is also unfortunate that SSI also stands for Server Side Include.   This confusion is going to be common.  It isn't the end of the world but it is certainly going to have to be addressed.

I think it is saying something that one of the most exciting features in 9.1 would be lost on a huge portion of the DB using public.  I've interviewed DBAs who don't know what a transaction isolation level is.  Most of the developers I've interviewed don't know.

Nik
From:
Josh Berkus
Date:

Kevin,

> But we're talking about *SERIALIZABLE transactions*, and you have yet
> to cite an article which has anything to do with that.  Please don't
> post more of this until you read the wiki page I cited and understand
> the issue.  If you find an article related to *that* topic I'm all
> ears.  Random articles about unrelated topics which happen to contain
> similar words are not helpful.

Calm down, please.  I think that Andrew was just pointing out that it's
a difficult topic to research, and why.  Keyword searches are useless,
which leaves you forced to read every single paper on transaction
isolation in the ACM.


--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com

From:
"Kevin Grittner"
Date:

Josh Berkus <> wrote:

> Calm down, please.  I think that Andrew was just pointing out that
> it's a difficult topic to research, and why.  Keyword searches are
> useless, which leaves you forced to read every single paper on
> transaction isolation in the ACM.

I apparently mistook his point entirely.  And in any event I could
have been more diplomatic.  Apologies.

-Kevin

From:
Jeff Davis
Date:

On Mon, 2011-04-04 at 16:56 -0500, Kevin Grittner wrote:
> SSI allows you to enforce arbitrarily complex user-defined business
> rules within the database without blocking, by automatically
> detecting live race conditions in your SQL transactions at runtime.
> It protects you by safely rolling some of them back (which can be
> retried safely).

"SSI allows you to enforce arbitrarily complex user-defined business
rules within the database without blocking..."

If I heard that statement and nothing else, and then encountered a
problem where SSI was a good solution, I'm not sure I'd make the
connection. It sounds a little vague (most people think SQL already
magically does that), and "complex" is probably a bad word to use
(people often read "complex" as "doesn't apply to me").

Sometimes I try to reverse the order and see if it sounds a little
better. Maybe something like:

"SSI automatically detects any race conditions among concurrent
transactions, and safely rolls some transactions back. That means you
can enforce arbitrary business rules and mix transactions safely without
blocking; and simply retry any failed transactions."

Still a little long, but I like the order a little more. At least for
me, "automatically detects race conditions" seems to have the most
punch, so it makes sense to put it first.

Ordinarily it's good to put a business need first, so I see why you
wrote it that way originally. But in this case, I think that "safety"
and "error detection" can be seen as the business needs; because most
people probably think they are enforcing their business rules already.
If they happen to be doing so safely already, then perhaps you can drill
down into the penalties they are paying in performance (due to blocking)
and/or development time (because getting it right is not easy).

Just my thoughts. It depends on the audience, of course.

Regards,
    Jeff Davis


From:
"Kevin Grittner"
Date:

Jeff Davis <> wrote:

> SSI automatically detects any race conditions among concurrent
> transactions, and safely rolls some transactions back. That means
> you can enforce arbitrary business rules and mix transactions
> safely without blocking; and simply retry any failed transactions.

I like it.  Josh, do we need to cut this to something shorter, or is
this OK?  You know, you can fit a lot on two lines in the right
font....  ;-)

-Kevin

From:
"Kevin Grittner"
Date:

Joshua Berkus <> wrote:

> I would like to have volunteers from the advocacy list commit to
> taking on one of each of these features.  For each one we need:

> b) a wiki page with a more detailed explaination and examples
> oriented towards the beginning-to-intermediate PostgreSQL user.

I've gotten a start on a wiki page with one example so far:

http://wiki.postgresql.org/wiki/SSI

Does this general format look OK for showing the SSI examples?

Is it at the desired technical level?

-Kevin

From:
Roy Hann
Date:

Nikolas Everett wrote:

> I think it is saying something that one of the most exciting features in 9.1
> would be lost on a huge portion of the DB using public.  I've interviewed
> DBAs who don't know what a transaction isolation level is.  Most of the
> developers I've interviewed don't know.

A DBA doesn't need to know what an isolation level is.  Programmers need
to know.  (But they don't.)

--
Roy


From:
Peter Eisentraut
Date:

On fre, 2011-04-01 at 12:21 -0700, Josh Berkus wrote:
> Here's a list of the features which seem to me to be major enough to
> be
> headliners for 9.1.  Please mark which ones you think are major/not
> major/not major and anything you think I've forgotten.

I guess it's not really a big feature, but the functional dependency
tracking feature that allows trimming down GROUP BY lists will
significantly cut down on questions and "bug reports" from especially
MySQL converts.


From:
Magnus Hagander
Date:

On Wed, Apr 6, 2011 at 19:10, Peter Eisentraut <> wrote:
> On fre, 2011-04-01 at 12:21 -0700, Josh Berkus wrote:
>> Here's a list of the features which seem to me to be major enough to
>> be
>> headliners for 9.1.  Please mark which ones you think are major/not
>> major/not major and anything you think I've forgotten.
>
> I guess it's not really a big feature, but the functional dependency
> tracking feature that allows trimming down GROUP BY lists will
> significantly cut down on questions and "bug reports" from especially
> MySQL converts.

Are you sure? Many of the cases I've come across aren't covered by it,
because they're not actually functional dependencies... They're just
using the "give me whatever row you can think of feature" in mysql. So
while this certainly helps, I'm ont sure it will make a significant
cut in the number of questions...

--
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/

From:
Peter Eisentraut
Date:

On ons, 2011-04-06 at 19:13 +0200, Magnus Hagander wrote:
> On Wed, Apr 6, 2011 at 19:10, Peter Eisentraut <> wrote:
> > On fre, 2011-04-01 at 12:21 -0700, Josh Berkus wrote:
> >> Here's a list of the features which seem to me to be major enough to
> >> be
> >> headliners for 9.1.  Please mark which ones you think are major/not
> >> major/not major and anything you think I've forgotten.
> >
> > I guess it's not really a big feature, but the functional dependency
> > tracking feature that allows trimming down GROUP BY lists will
> > significantly cut down on questions and "bug reports" from especially
> > MySQL converts.
>
> Are you sure? Many of the cases I've come across aren't covered by it,
> because they're not actually functional dependencies... They're just
> using the "give me whatever row you can think of feature" in mysql. So
> while this certainly helps, I'm ont sure it will make a significant
> cut in the number of questions...

Yes, many cases will still not work, but many will.


From:
Jeff Davis
Date:

On Tue, 2011-04-05 at 14:57 +0000, Roy Hann wrote:
> Nikolas Everett wrote:
>
> > I think it is saying something that one of the most exciting features in 9.1
> > would be lost on a huge portion of the DB using public.  I've interviewed
> > DBAs who don't know what a transaction isolation level is.  Most of the
> > developers I've interviewed don't know.
>
> A DBA doesn't need to know what an isolation level is.  Programmers need
> to know.  (But they don't.)

One of the use cases of SSI is to enforce constraints, which are
certainly of interest to DBAs.

Regards,
    Jeff Davis


From:
Peter Eisentraut
Date:

On ons, 2011-04-06 at 11:01 -0700, Jeff Davis wrote:
> On Tue, 2011-04-05 at 14:57 +0000, Roy Hann wrote:
> > Nikolas Everett wrote:
> >
> > > I think it is saying something that one of the most exciting features in 9.1
> > > would be lost on a huge portion of the DB using public.  I've interviewed
> > > DBAs who don't know what a transaction isolation level is.  Most of the
> > > developers I've interviewed don't know.
> >
> > A DBA doesn't need to know what an isolation level is.  Programmers need
> > to know.  (But they don't.)
>
> One of the use cases of SSI is to enforce constraints, which are
> certainly of interest to DBAs.

Well, but users can freely change the isolation level, so it would not
really be an effective constraint mechanism.




From:
"Kevin Grittner"
Date:

Peter Eisentraut <> wrote:
> On ons, 2011-04-06 at 11:01 -0700, Jeff Davis wrote:

>> One of the use cases of SSI is to enforce constraints, which are
>> certainly of interest to DBAs.
>
> Well, but users can freely change the isolation level, so it would
> not really be an effective constraint mechanism.

In our shop we plan to have a trigger check to ensure the isolation
level is serializable for all production tables.  In the long run,
it sure would be convenient to have a GUC setting to prohibit
explicitly setting the isolation level to something less strict than
the default, but it didn't seem wise to try to do that in the
initial release with the feature. People are probably going to want
a little wiggle room until the technique has some miles on it.

So, sure -- in our shop a user can change the isolation level, but
if they try to modify production data they will get an error with
such a setting, and if they view data from a read only transaction
in a state where a transient anomaly is visible to them, I guess
that's on them.

-Kevin

From:
David Fetter
Date:

On Sun, Apr 03, 2011 at 01:18:35PM -0500, Josh Berkus wrote:
>
>
> > I assume you mean by "innovation" that these are things where
> > PostgreSQL
> > was the first DBMS to implement them. The risk with that is, unless
> > you
> > do very careful research, someone might debunk each of these within 24
> > hours after the release, making the whole thing look foolish.
>
> Well, we can hedge a bit.  But, better, can someone (preferably several someones) on this list volunteer to do
research? Anyone? 
>
> >
> > Moreover, the fact that something might have been innovated doesn't
> > explain how it helps the user.
>
> Truthfully, for a press release, it doesn't matter.  Many reporters/bloggers will happily quote that we're the first
databasewith SSI without having a clue what it means.  In general, most reporters who cover databases don't really know
know(or care) much about databases regardless (or "NoSQL" wouldn't have the press it does).   
>
> >From a PR perspective, the important thing to get across is that the PostgreSQL project is innovating, implementing
newdatabase tech before anyone else.  This is critically important during a period where SQL-RDBMSes are being
portrayedas "old grandfathers" in a press more enchanted with "the new hotness" of recently invented DBMSes. 
>
> Now, for our users and for serious database geeks, we *do* need an elevator-pitch explanation of several features
whichtells them why they should care.  Several features in particular are not immediately obvious: 
>
> * SSI
> * wCTE
> * KNN-GiST
> * SE-Postgres
> * Transaction-controlled Synch Rep
>
> I would like to have volunteers from the advocacy list commit to taking on one of each of these features.  For each
onewe need: 
>
> a) a two-line explanation of what the feature is and why it's valuable (for the release notes, etc.)
> b) a wiki page with a more detailed explaination and examples oriented towards the beginning-to-intermediate
PostgreSQLuser. 

I'll take a whack at wCTE :)

Cheers,
David.
--
David Fetter <> http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter      XMPP: 
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

From:
David Fetter
Date:

On Sun, Apr 03, 2011 at 12:38:02PM +0300, Peter Eisentraut wrote:
> On lör, 2011-04-02 at 15:32 -0700, Josh Berkus wrote:
> >  Or does someone else have wCTEs?  I vaguely remember something
> >  about SQL Server ...
>
> Since they're in the SQL:2011 standards draft, they must have been
> implemented somewhere before.

What's in the SQL:2011 standards draft is a long way behind even our
current wCTE implementation[1], and it corresponds to something in DB2.

Cheers,
David.

[1] I have Evil Plans™ for further expansion.
--
David Fetter <> http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter      XMPP: 
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

From:
David Fetter
Date:

On Sun, Apr 03, 2011 at 09:32:27AM +0100, Simon Riggs wrote:
> On Sat, Apr 2, 2011 at 11:32 PM, Josh Berkus <> wrote:
> >
> >> Perhaps we can outline Major Features and Innovations separately.
> >> This release is notable for the number of genuinely new ideas
> >> implemented, so that's worth making something of even if the
> >> advanced features themselves need a little more explanation. Or
> >> Perhaps have "Innovations" as a major feature and then a comma
> >> list of the names of them.
> >
> > Yeah, that sounds like the way to go.
> >
> > So for Innovations, we'd have SSI, SE/Postgres, KNN-GiST, and
> > wCTE, no?   Or does someone else have wCTEs?  I vaguely remember
> > something about SQL Server ...
>
> wCTEs are definitely in SQLServer 2008

I haven't found them there (or anywhere else to date).  Got a
reference I can check?

Cheers,
David.
--
David Fetter <> http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter      XMPP: 
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

From:
David Fetter
Date:

On Sat, Apr 02, 2011 at 10:39:51AM -0700, Robert Bernier wrote:
> > * wCTE: Writeable Common Table Expressions
>
> +1
>
> I'm finding myself using a lot of Common Table Expressions it's more intuitive
> than the more traditional BEGIN .. COMMIT paradigm especially for
> programmers/developers

Are you already using the data-changing part? :)

Cheers,
David.
--
David Fetter <> http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter      XMPP: 
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

From:
Roy Hann
Date:

Peter Eisentraut wrote:

> On ons, 2011-04-06 at 11:01 -0700, Jeff Davis wrote:
>> On Tue, 2011-04-05 at 14:57 +0000, Roy Hann wrote:
>> > Nikolas Everett wrote:
>> >
>> > > I think it is saying something that one of the most exciting features in 9.1
>> > > would be lost on a huge portion of the DB using public.  I've interviewed
>> > > DBAs who don't know what a transaction isolation level is.  Most of the
>> > > developers I've interviewed don't know.
>> >
>> > A DBA doesn't need to know what an isolation level is.  Programmers need
>> > to know.  (But they don't.)
>>
>> One of the use cases of SSI is to enforce constraints, which are
>> certainly of interest to DBAs.
>
> Well, but users can freely change the isolation level, so it would not
> really be an effective constraint mechanism.

We're getting O/T, but...

The checking of constraints is expected to be done using
serializable isolation regardless of the isolation level of the
transation that triggered it. See section 4.35.4 of the latest SQL
standard:

"Regardless of the isolation level of the SQL-transaction, phenomena
P1, P2, and P3 shall not occur during the implied reading of schema
definitions performed on behalf of executing an SQL-statement, the
checking of integrity constraints, and the execution of referential actions
associated with referential constraints."

A friend pointed this section out to me just a couple of days ago.

--
Roy


From:
"Kevin Grittner"
Date:

Roy Hann <> wrote:
> Peter Eisentraut wrote:
>> On ons, 2011-04-06 at 11:01 -0700, Jeff Davis wrote:

>>> One of the use cases of SSI is to enforce constraints, which are
>>> certainly of interest to DBAs.
>>
>> Well, but users can freely change the isolation level, so it
>> would not really be an effective constraint mechanism.

> The checking of constraints is expected to be done using
> serializable isolation regardless of the isolation level of the
> transation that triggered it. See section 4.35.4 of the latest SQL
> standard:
>
> "Regardless of the isolation level of the SQL-transaction,
> phenomena P1, P2, and P3 shall not occur during the implied
> reading of schema definitions performed on behalf of executing an
> SQL-statement, the checking of integrity constraints, and the
> execution of referential actions associated with referential
> constraints."

I'm pretty sure Jeff was talking about enforcing business rules too
complex to be covered by the declarative constraints.  While Jeff's
exclusion constraints greatly expanded the types of constraints on
the data which can be defined declaratively and enforced
independently of transaction isolation level, there will always be
constraints people wish to apply to the data which are too complex
for such declarations to cover, which will require enforcement
through triggers or application code. These are much more easily
developed under consistent use of the serializable transaction level
than less strict levels.

Find any presentation from before SSI on why SERIALIZABLE isn't
truly serializable and how to work around that.  (One such example,
more or less at random, is the "sum of accounts for a person must
meet some minimum balance, which was a fairly popular such example.)
Now remove the workaround to get back to what was the original
failing case and use serializable isolation under 9.1 for all
transactions involved in the example.  It now "just works" without
taking table locks, SELECT FOR UPDATE, or using any other tricks.

http://wiki.postgresql.org/wiki/SSI#Overdraft_Protection

For some shops that has a lot of value.  The question is how to
punch that in release notes.  I expect that this feature will be
most useful, and most likely to be used by, "big" shops.  If you've
got over a dozen programmers actively working on thousands of
transaction types against a schema with several hundred tables,
you're probably going to like this feature a lot.  I don't know if
there's a particular venue where we could reach those people in
particular, but that is the audience which is most likely to be
willing to read a longer explanation of this feature and be
motivated to use it.

-Kevin

From:
Jeff Davis
Date:

On Thu, 2011-04-07 at 09:35 -0500, Kevin Grittner wrote:
> I'm pretty sure Jeff was talking about enforcing business rules too
> complex to be covered by the declarative constraints.

Exactly. SQL offers a limited set of declarative constraints that do
specific things, aside from ASSERT (aside: maybe we can implement ASSERT
now using SSI; but it would require all transactions to be in SSI mode).
Outside of that limited set, triggers and SSI are probably the best bet.

Even setting aside the issue of constraints, different shops have
different roles for DBAs. I generally think that a DBA has some
responsibility over the types of transactions that are permitted in the
system, what kinds of locks or other resources that those transactions
might hold, etc.; SSI is clearly of interest to such DBAs. And beyond
that, the DBA is often the expert, and may be able to show programmers
how to avoid race conditions inside the DBMS effectively (and SSI may be
one solution).

Regards,
    Jeff Davis


From:
"Kevin Grittner"
Date:

Jeff Davis <> wrote:

> (aside: maybe we can implement ASSERT now using SSI; but it would
> require all transactions to be in SSI mode).

I see all kinds of cool things which could build on the SSI
technology or the predicate locks developed to support it, but I've
been trying to hold back on such discussions as much as possible
until 9.1 is stabilized.  I just can't let this comment go without
observing that many of the cool uses would be a lot easier if one
could set a GUC which would require all transactions to be
serializable.  Now isn't the time to discuss what it would be called
or what its settings would be (much less what corner cases might be
problems with such a restriction), but assuming it is possible does
facilitate daydreaming about such things as SQL ASSERT constraints.
;-)

-Kevin

From:
Joshua Berkus
Date:

All,

> Yeah, that sounds like the way to go.
>
> So for Innovations, we'd have SSI, SE/Postgres, KNN-GiST, and wCTE,
> no?
> Or does someone else have wCTEs? I vaguely remember something about
> SQL Server ...

So, the research so far seems to indicate that:

SSI: PostgreSQL first
KNN-GiST: PostgreSQL first, but just barely: SQL Server will have this in November release.
wCTE: our "full implementation" is the first, version is SQL standard is much more limited.  How to word this though?

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com
San Francisco

From:
Peter Eisentraut
Date:

On lör, 2011-04-09 at 12:39 -0500, Joshua Berkus wrote:
> wCTE: our "full implementation" is the first, version is SQL standard
> is much more limited.  How to word this though?

"much more limited" is fuzzy and unsubstantiated.  Maybe something like
a "powerful and flexible implementation of this concept".


From:
Peter Eisentraut
Date:

On ons, 2011-04-06 at 17:41 -0700, David Fetter wrote:
> What's in the SQL:2011 standards draft is a long way behind even our
> current wCTE implementation[1], and it corresponds to something in
> DB2.

Could you give an example of that?


From:
Oleg Bartunov
Date:

On Sat, 9 Apr 2011, Joshua Berkus wrote:

> All,
>
>> Yeah, that sounds like the way to go.
>>
>> So for Innovations, we'd have SSI, SE/Postgres, KNN-GiST, and wCTE,
>> no?
>> Or does someone else have wCTEs? I vaguely remember something about
>> SQL Server ...
>
> So, the research so far seems to indicate that:
>
> SSI: PostgreSQL first
> KNN-GiST: PostgreSQL first, but just barely: SQL Server will have this in November release.

their solution is slower than custom 2008 based solution and has very limited
(I see only spatial datatype).

> wCTE: our "full implementation" is the first, version is SQL standard is much more limited.  How to word this though?
>
>

     Regards,
         Oleg
_____________________________________________________________
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
Sternberg Astronomical Institute, Moscow University, Russia
Internet: , http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83

From:
"Kevin Grittner"
Date:

Joshua Berkus  wrote:

> SSI: PostgreSQL first

First *production* implementation.  The 2008 ACM SIGMOD paper
included a prototype implementation in Oracle Berkeley DB, and
Michael Cahill's 2009 doctoral thesis included a prototype
implementation in InnoDB.  Even though Cahill was hired by Oracle in
2009, I haven't been able to find any indication that Oracle has
included this technology in any production release of either of these
products, or any other.  I've checked their documentations without
finding references to anything but S2PL based SERIALIZABLE
implementations.

Having also tediously gone through every result page in in several
google searches using likely words, and diving into all pages which
look like they *could* refer to SSI, I haven't found any that
actually *do*.

I just wanted to provide enough detail to fashion a strong statement
that someone couldn't pick apart on the basis of the prototypes from
the academic papers.

-Kevin

From:
"Joshua D. Drake"
Date:

On Sun, 10 Apr 2011 11:12:24 -0500, "Kevin Grittner"
<> wrote:
> Joshua Berkus  wrote:
>
>> SSI: PostgreSQL first
>
> First *production* implementation.  The 2008 ACM SIGMOD paper
> included a prototype implementation in Oracle Berkeley DB, and
> Michael Cahill's 2009 doctoral thesis included a prototype
> implementation in InnoDB.  Even though Cahill was hired by Oracle in
> 2009, I haven't been able to find any indication that Oracle has
> included this technology in any production release of either of these
> products, or any other.  I've checked their documentations without
> finding references to anything but S2PL based SERIALIZABLE
> implementations.
>
> Having also tediously gone through every result page in in several
> google searches using likely words, and diving into all pages which
> look like they *could* refer to SSI, I haven't found any that
> actually *do*.
>
> I just wanted to provide enough detail to fashion a strong statement
> that someone couldn't pick apart on the basis of the prototypes from
> the academic papers.

I think you made our own argument though. We are releasing a "production"
version of SSI. Nobody else has done that.

JD


>
> -Kevin

--
PostgreSQL - XMPP: jdrake(at)jabber(dot)postgresql(dot)org
   Consulting, Development, Support, Training
   503-667-4564 - http://www.commandprompt.com/
   The PostgreSQL Company, serving since 1997

From:
Greg Smith
Date:

On 04/10/2011 12:12 PM, Kevin Grittner wrote:
> Having also tediously gone through every result page in in several
> google searches using likely words, and diving into all pages which
> look like they *could* refer to SSI, I haven't found any that
> actually *do*.
>

I just fixed http://en.wikipedia.org/wiki/SSI to include this acronym,
and the page I found to point it toward:
http://en.wikipedia.org/wiki/Snapshot_isolation#Making_Snapshot_Isolation_Serializable
doesn't suggest it's anything other than a theoretical concept so far.
I think there's a very real academic possibility for your work here
too.  I'd be writing to the ACM about your upcoming paper "Implementing
Serializable Snapshot Isolation in a MVCC Database" if I were you.
After 9.1 ships, of course.  (Be good to update the Wikipedia pages on
this at that point, too, once there's a long-term documentation URL for 9.1)

--
Greg Smith   2ndQuadrant US       Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support  www.2ndQuadrant.us



From:
Greg Smith
Date:

On 04/03/2011 02:18 PM, Joshua Berkus wrote:
> * Transaction-controlled Synch Rep
> I would like to have volunteers from the advocacy list commit to taking on one of each of these features.  For each
onewe need: 
>
> a) a two-line explanation of what the feature is and why it's valuable (for the release notes, etc.)
> b) a wiki page with a more detailed explaination and examples oriented towards the beginning-to-intermediate
PostgreSQLuser. 
>

We already have http://wiki.postgresql.org/wiki/Synchronous_replication
for sync rep, and I just started changing that around so that it
reflects the code committed into 9.1.  I got my first set of questions
today suggesting our internal work on documenting this from the ground
up is moving along.  Over the next month we'll have at least two people
chugging away at making that targeted more toward beginners (or split
into something that is).

I've been doing the elevator pitch for sync rep for a while now; here's
a first draft description for the release notes:

Transaction-controlled Synchronous Replication:  When replicating to
multiple nodes, customize every database transaction for its individual
speed and durability needs.  Options range from only committing to
memory on the master up to the new synchronous standby mode, where data
must be stored on multiple servers to be considered safe.

The fact that several of the modes alluded to there were already
available on a per-transaction basis isn't new, but I think it's worth
being explicit about anyway because it's not really appreciated the way
it should be.  Combine this with a pitch for unlogged tables and there's
an interesting angle to complete with NoSQL...wait, I'm feeling some
more ad copy:

PostgreSQL 9.1 lets you pick exactly the level of commit guarantee your
data requires.  Whether you want unlogged tables optimized only for
speed, or you need durable synchronous replication to multiple servers,
you're covered--all in one database.

Need to take a shower to wash the stench of marketing off now.

--
Greg Smith   2ndQuadrant US       Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support  www.2ndQuadrant.us



From:
"Kevin Grittner"
Date:

Greg Smith <> wrote:

> I just fixed http://en.wikipedia.org/wiki/SSI to include this
> acronym, and the page I found to point it toward:
>
http://en.wikipedia.org/wiki/Snapshot_isolation#Making_Snapshot_Isolation_Serializable
> doesn't suggest it's anything other than a theoretical concept so
> far.

Thanks for taking care of the SSI page.  I've had my eye on that,
the page you pointed it to, and this one:

http://en.wikipedia.org/wiki/Isolation_%28database_systems%29

That last is pretty SS2PL-oriented, although it has a token
reference to snapshot isolation.  Those two pages both probably need
significant work to eliminate that bias and properly reference the
fact that SSI (or as Wikipedia refers to it, SerializableSI) has
moved from an academic concept to a production product.

I've been a little concerned about updating those pages because I
know they have been whacked around to support different biases over
the last few years, and I didn't want to get into some battle over
it; so on the 12th of April I tried to initiate a discussion on the
topic on the Discussion tab of the page you linked to.  So far there
has been no response, so perhaps those who were so eager to battle
it out on those pages have moved on and won't raise a fuss.  At a
minimum they should be able to see that I made the effort to discuss
it before making the changes.

> I think there's a very real academic possibility for your work
> here too.  I'd be writing to the ACM about your upcoming paper
> "Implementing Serializable Snapshot Isolation in a MVCC Database"
> if I were you.

I'm leaving that to Dan.  I'm not in a "publish or perish"
environment, nor at this point even looking to build cred for
consulting gigs.  Academic papers are not really my forte anyway.

> (Be good to update the Wikipedia pages on this at that point, too,
> once there's a long-term documentation URL for 9.1)

Yeah, the link to the PostgreSQL docs on the page you pointed to
still references the 8.2 docs.  I wonder whether it's better form to
link to the 9.1 docs once they're out, or to link to the "current"
URL, so that it's always automatically pointing to the latest
version.

-Kevin

From:
Joshua Berkus
Date:

Greg,

> Over the next month we'll have at least two people
> chugging away at making that targeted more toward beginners (or split
> into something that is).

Might I suggest something in the format of the Binary Replication Tutorial I started on the wiki?  I have plans to
updatethat for 9.1 and finish it, and of course would welcome expansion/help. 

One thing I'm unexpectedly pleased about is that, after 1000 e-mails of design arguments, the Synch Rep we ended up
withis simple and easy to understand.  I was able to explain it in a LWN article in 200 words. How unlike this project.
;-) 

I'm working with GoGrid on getting some PostgreSQL 9.1 Replication images up for use of all GoGrid users, as well.
Someoneelse could replicate my work at Rackspace and Amazon. 

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com
San Francisco

From:
Greg Smith
Date:

On 04/30/2011 11:52 AM, Kevin Grittner wrote:
> Yeah, the link to the PostgreSQL docs on the page you pointed to
> still references the 8.2 docs.  I wonder whether it's better form to
> link to the 9.1 docs once they're out, or to link to the "current"
> URL, so that it's always automatically pointing to the latest
> version.
>

The 9.1 ones once they are out.  Having a URL whose content changes
significantly over time drives some of the Wikipedia crowd crazy.

I wouldn't even get into arguing over the main page.  I'd just add
"PostgreSQL 9.1 includes a true serialization implementation based on
the SerializableSI approach", add a footnote linking to the PostgreSQL
docs and your design page on the wiki, and call it a day.  Let the
revert geeks fight over how the tone of the page needs to be adjusted to
reflect this new referenced fact.

--
Greg Smith   2ndQuadrant US       Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support  www.2ndQuadrant.us



From:
Simon Riggs
Date:

On Fri, Apr 1, 2011 at 8:21 PM, Josh Berkus <> wrote:

> * SQL/MED with FILE_FDW
> Reason: allows using PostgreSQL to query any kind of data source you can
> roll a driver for.  First one shipping reads CSV files, eliminating the
> need to COPY them into the database in many cases.

> * SEPostgres
> SE-Postgres has been integrated into PostgreSQL through generic security
> hooks and a contrib module.  Allows "military" level security/control
> over database.

Having just read the Beta announcement (well done!) I think we need to
explain what some of these features are a little more for the main
release.

I'm betting that only about 10% of people that care about databases
would know what SQL/MED is. Much better to say "Distributed Database
(Read-only)", or similar.

SE-Postgres isn't known to most people, and similarly, not everybody
has heard of SE-Linux or knows what its for.
How about "Generic Label Security, with integration to Security
Enhanced Linux (SELinux)"

Doesn't need to be more "marketingy" just more descriptive, fewer acronyms.

--
 Simon Riggs                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

From:
"Kevin Grittner"
Date:

Greg Smith <> wrote:

> Having a URL whose content changes significantly over time drives
> some of the Wikipedia crowd crazy.

While I can understand the reason, there is a certain irony in that.

Thanks for the Wikipedia advice.

-Kevin

From:
Joshua Tolley
Date:

On Mon, May 02, 2011 at 05:02:21PM +0100, Simon Riggs wrote:
> I'm betting that only about 10% of people that care about databases
> would know what SQL/MED is. Much better to say "Distributed Database
> (Read-only)", or similar.

At first glance, "distributed database" makes me think only of two-phase
commit, rather than SQL/MED. Perhaps it would be better to use words like
"management of external data" or "foreign data sources".

--
Joshua Tolley / eggyknap
End Point Corporation
http://www.endpoint.com

Attachment
From:
Gilberto Castillo Martínez
Date:


El mié, 04-05-2011 a las 07:02 -0600, Joshua Tolley escribió:
>
> At first glance, "distributed database" makes me think only of
> two-phase
> commit, rather than SQL/MED. Perhaps it would be better to use words
> like
> "management of external data" or "foreign data sources".

+1
--
Saludos,
Gilberto Castillo
Edificio Beijing. Miramar Trade Center. Etecsa.
Miramar, La Habana.Cuba.
---
This message was processed by Kaspersky Mail Gateway 5.6.28/RELEASE running at host imx3.etecsa.cu
Visit our web-site: <http://www.kaspersky.com>, <http://www.viruslist.com>

From:
Thomas Kellerer
Date:

Joshua Tolley, 04.05.2011 15:02:
> On Mon, May 02, 2011 at 05:02:21PM +0100, Simon Riggs wrote:
>> I'm betting that only about 10% of people that care about databases
>> would know what SQL/MED is. Much better to say "Distributed Database
>> (Read-only)", or similar.
>
> At first glance, "distributed database" makes me think only of two-phase
> commit, rather than SQL/MED. Perhaps it would be better to use words like
> "management of external data" or "foreign data sources".

Oracle and Firebird call this "External tables".

Regards
Thomas



From:
Bruce Momjian
Date:

Greg Smith wrote:
> On 04/03/2011 02:18 PM, Joshua Berkus wrote:
> > * Transaction-controlled Synch Rep
> > I would like to have volunteers from the advocacy list commit to taking on one of each of these features.  For each
onewe need: 
> >
> > a) a two-line explanation of what the feature is and why it's valuable (for the release notes, etc.)
> > b) a wiki page with a more detailed explaination and examples oriented towards the beginning-to-intermediate
PostgreSQLuser. 
> >
>
> We already have http://wiki.postgresql.org/wiki/Synchronous_replication
> for sync rep, and I just started changing that around so that it
> reflects the code committed into 9.1.  I got my first set of questions
> today suggesting our internal work on documenting this from the ground
> up is moving along.  Over the next month we'll have at least two people
> chugging away at making that targeted more toward beginners (or split
> into something that is).
>
> I've been doing the elevator pitch for sync rep for a while now; here's
> a first draft description for the release notes:
>
> Transaction-controlled Synchronous Replication:  When replicating to
> multiple nodes, customize every database transaction for its individual
> speed and durability needs.  Options range from only committing to
> memory on the master up to the new synchronous standby mode, where data
> must be stored on multiple servers to be considered safe.

...

> PostgreSQL 9.1 lets you pick exactly the level of commit guarantee your
> data requires.  Whether you want unlogged tables optimized only for
> speed, or you need durable synchronous replication to multiple servers,
> you're covered--all in one database.

I think the "users get control" montra could be a theme for this
release;  I think it would apply to all of these:

    * Unlogged tables to better handle some NoSQL workloads
    * Synchronous replication for greater reliability
    * SQL/MED (Management of External Data) (flat files, other databases)
    * Per-column collation support
    * Security Label, including SE-Linux integration
    * True serializable isolation with predicate locking (already had snapshot isolation)

> Need to take a shower to wash the stench of marketing off now.

LOL

--
  Bruce Momjian  <>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

  + It's impossible for everything to be true. +