Thread: Reasons to reorder results *within* a transaction?

Reasons to reorder results *within* a transaction?

From
Evan Jones
Date:
I *know* that without an ORDER BY clause, the database is free to reorder results in any way it likes. However, I
recentlyran into a case where the *SAME* query was returning results in a different order *within* a single
transaction,which surprised me (we fixed it by adding the missing ORDER BY). I would assume that once a transaction
obtainsa snapshot, all its read operations would return the same results. 

Could concurrent updates in other transactions "move" tuples in the underlying heap files? Could the query optimizer
decideto execute a query two different ways for some reason (e.g. statistics collected after the first query?). Clearly
theway Postgres works internally is a bit different from what I assumed. Any references to docs I should read would be
appreciated.


Roughly speaking, the schema is something like:

create table group_record (id integer primary key, group_id integer, data text);
(plus other tables)


The transaction is something like:

begin;
select * from group_record where group_id = x;

… reads and writes to/from other tables …

select * from group_record where group_id = x; -- returns results in a different order from first time
commit;


Needless to say, this is one of those fun rare bugs that appeared occasionally in the logs in our production server,
andwe couldn't seem to reproduce it in development. 

Thanks!

Evan Jones

--
Work: https://www.mitro.co/    Personal: http://evanjones.ca/



Re: Reasons to reorder results *within* a transaction?

From
Kevin Grittner
Date:
Evan Jones <ej@evanjones.ca> wrote:

> I *know* that without an ORDER BY clause, the database is free to reorder
> results in any way it likes. However, I recently ran into a case where the
> *SAME* query was returning results in a different order *within* a single
> transaction, which surprised me (we fixed it by adding the missing ORDER BY). I
> would assume that once a transaction obtains a snapshot, all its read operations
> would return the same results.

That is not a valid assumption.  For one thing, the default
transaction isolation level is read committed, and at that
isolation level you are not guaranteed to even get the same *rows*
running the same query twice within the same transaction, much less
in the same order.  At any isolation level statistics could change,
resulting in a different plan on two successive executions.  Even
running the same plan using the same snapshot you could get a
different order if you have not specified one with ORDER BY.  As
one example, a sequential scan of a table won't necessarily start
at the beginning of the heap -- if there is already a sequential
scan in progress for another process, the new one will start at the
point the other one is at, and "wrap around".  This can save a lot
of physical disk access, resulting in better performance.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: Reasons to reorder results *within* a transaction?

From
Evan Jones
Date:
On Oct 4, 2013, at 13:03 , Kevin Grittner <kgrittn@ymail.com> wrote:
> That is not a valid assumption.  For one thing, the default
> transaction isolation level is read committed, and at that
> isolation level you are not guaranteed to even get the same *rows*
> running the same query twice within the same transaction, much less
> in the same order.

I guess I should have mentioned that we are using serializable snapshot isolation (thanks for that, BTW!)

> if there is already a sequential
> scan in progress for another process, the new one will start at the
> point the other one is at, and "wrap around".  This can save a lot
> of physical disk access, resulting in better performance.

OH! This totally, totally makes sense. This is *exactly* the kind of thing I was looking for, and I'll bet that is
exactlywhat was happening in our case. The table is pretty small, so Postgres explain says it is doing a full table
scanfor this query. Thanks for the speedy insightful answer! 

This is yet another example of something that when tracking down the bug, we knew immediately it was incorrect and
probablywrong, but sometimes you don't notice these things the first time. The joys of software. 

Evan

--
Work: https://www.mitro.co/    Personal: http://evanjones.ca/



Re: Reasons to reorder results *within* a transaction?

From
Alvaro Herrera
Date:
Evan Jones wrote:
> I *know* that without an ORDER BY clause, the database is free to
> reorder results in any way it likes. However, I recently ran into a
> case where the *SAME* query was returning results in a different order
> *within* a single transaction, which surprised me (we fixed it by
> adding the missing ORDER BY). I would assume that once a transaction
> obtains a snapshot, all its read operations would return the same
> results.

Yes, the same results -- not necessarily in the same order.  For
instance, a synchronized scan might start at a different point of the
table.

--
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services