Thread: AW: Plans for solving the VACUUM problem

AW: Plans for solving the VACUUM problem

From
Zeugswetter Andreas SB
Date:
> > You mean it is restored in session that is running the transaction ?

Depends on what you mean by restored. It first reads the heap page,
sees that it needs an older version and thus reads it from the "rollback segment".
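
A minimal sketch of that read path, in Python for brevity. The dictionary layout, the `can_see` callback and every name in it are assumptions made for illustration; none of this is PostgreSQL code.

```python
def read_tuple(heap, rollback_segment, tuple_id, can_see):
    """Return the data of the newest version of tuple_id the reader may see."""
    # Step 1: read the current version from the heap page.
    current = heap[tuple_id]
    if can_see(current):
        return current["data"]
    # Step 2: the reader needs an older version, so walk the
    # "rollback segment" entries for this tuple, newest first.
    for old in rollback_segment.get(tuple_id, []):
        if can_see(old):
            return old["data"]
    # No version is visible to this reader.
    return None


# Hypothetical data: the heap holds the version written by xid 30,
# the rollback segment holds the versions it replaced.
heap = {1: {"xid": 30, "data": "new"}}
rollback_segment = {1: [{"xid": 20, "data": "old"},
                        {"xid": 10, "data": "older"}]}


def sees_up_to_25(version):
    # Stand-in visibility rule: the reader sees changes made by xid <= 25.
    return version["xid"] <= 25


result = read_tuple(heap, rollback_segment, 1, sees_up_to_25)
```

A reader that can see the heap version never touches the rollback segment; only readers that need older data pay for the extra lookup.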

> > 
> > I guess that it could be slower than our current way of doing it.
> 
> Yes, for older transactions which *really* need *particular*
> old data, but not for newer ones. Look - now transactions have to read
> dead data again and again, even if some of them (newer) do not need to see
> that data at all, and we keep dead data as long as required for other
> old transactions *just in case* they look there.
> But who knows?! Maybe those old transactions will not read from a table
> with a big amount of dead data at all! So - why keep dead data in datafiles
> for a long time? This obviously affects overall system performance.

Yes, that is a good description. An old version is only required in the following
two cases:

1. the txn that modified this tuple is still open (reader in default committed read)
2. reader is in serializable transaction isolation and has earlier xtid

Seems overwrite smgr has mainly advantages in terms of speed for operations
other than rollback.
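
The two cases can be written as a small predicate. This is only a sketch of the rule as stated above: the `Reader` type, its field names and the comparison of transaction ids are assumptions for illustration, and a reader looking at its own changes is not handled.

```python
from dataclasses import dataclass

READ_COMMITTED = "read committed"
SERIALIZABLE = "serializable"


@dataclass
class Reader:
    xid: int         # hypothetical transaction id of the reader
    isolation: str   # READ_COMMITTED or SERIALIZABLE


def needs_old_version(reader, writer_xid, writer_is_open):
    """True if the reader must be given the version before the writer's change."""
    # Case 1: the transaction that modified the tuple is still open,
    # so its change must not be shown to other readers yet.
    if writer_is_open:
        return True
    # Case 2: a serializable reader that started before the writer
    # (earlier transaction id) keeps seeing the old data.
    if reader.isolation == SERIALIZABLE and reader.xid < writer_xid:
        return True
    # Otherwise the current heap version is the right one.
    return False
```

In every other situation the reader can use the version found in the heap page directly.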

Andreas


Re: AW: Plans for solving the VACUUM problem

From
Hannu Krosing
Date:
Zeugswetter Andreas SB wrote:
> 
> > > You mean it is restored in session that is running the transaction ?
> 
> Depends on what you mean by restored. It first reads the heap page,
> sees that it needs an older version and thus reads it from the "rollback segment".

So are whole pages stored in rollback segments or just the modified data?

Storing whole pages could be very wasteful for tables with small records
that are often modified.
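
A rough back-of-the-envelope sketch of that concern, in Python. The 8192-byte page matches PostgreSQL's default block size; the record size and update count are made-up numbers, and the calculation ignores per-record headers and any sharing of one page image between several updates.

```python
PAGE_SIZE = 8192  # bytes; PostgreSQL's default block size


def undo_bytes(updates, record_size, whole_pages):
    """Bytes written to the rollback segment for a number of single-row updates."""
    # Either a full page image or just the old row is saved per update.
    per_update = PAGE_SIZE if whole_pages else record_size
    return updates * per_update


updates = 1000     # hypothetical number of row updates
record_size = 50   # hypothetical size of one small record, in bytes

page_level = undo_bytes(updates, record_size, whole_pages=True)
row_level = undo_bytes(updates, record_size, whole_pages=False)
ratio = page_level / row_level
```

With these assumed numbers the whole-page variant writes 8,192,000 bytes against 50,000 bytes for modified data only, so the gap grows as records get smaller.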

---------------
Hannu


Re: Plans for solving the VACUUM problem

From
"Vadim Mikheev"
Date:
> Yes, that is a good description. An old version is only required in the following
> two cases:
> 
> 1. the txn that modified this tuple is still open (reader in default committed read)
> 2. reader is in serializable transaction isolation and has earlier xtid
> 
> Seems overwrite smgr has mainly advantages in terms of speed for operations
> other than rollback.

... And rollback is required for < 5% transactions ...

Vadim




Re: Plans for solving the VACUUM problem

From
Hannu Krosing
Date:
Vadim Mikheev wrote:
> 
> > Yes, that is a good description. An old version is only required in the following
> > two cases:
> >
> > 1. the txn that modified this tuple is still open (reader in default committed read)
> > 2. reader is in serializable transaction isolation and has earlier xtid
> >
> > Seems overwrite smgr has mainly advantages in terms of speed for operations
> > other than rollback.
> 
> ... And rollback is required for < 5% transactions ...

This obviously depends on the application.

I know people who roll back most of their transactions (actually they use
it to emulate temp tables when reporting).

OTOH it is possible to do without rolling back at all, as MySQL folks
have shown us ;)

Also, IIRC, pgbench does no rollbacks. I think that we have no
performance test that does.

-----------------
Hannu


Re: AW: Plans for solving the VACUUM problem

From
"Vadim Mikheev"
Date:
> > > > You mean it is restored in session that is running the transaction ?
> > 
> > Depends on what you mean by restored. It first reads the heap page,
> > sees that it needs an older version and thus reads it from the "rollback segment".
> 
> So are whole pages stored in rollback segments or just the modified data?

This is implementation dependent. Storing whole pages is much easier to do,
but obviously it's better to store just the modified data.

Vadim




RE: AW: Plans for solving the VACUUM problem

From
"Mikheev, Vadim"
Date:
> > > So are whole pages stored in rollback segments or just
> > > the modified data?
> > 
> > This is implementation dependent. Storing whole pages is
> > much easier to do, but obviously it's better to store just
> > modified data.
> 
> I am not sure it is necessarily better. Seems to be a tradeoff here.
> pros of whole pages:
>     a possible merge with physical log (for first
>     modification of a page after checkpoint
>     there would be no overhead compared to current
>     since it is already written now)

Using WAL as RS data storage is questionable.

>     in a clever implementation a page already in the
>     "rollback segment" might satisfy the
>     modification of another row on that page, and
>     thus would not need any additional io.

This would be possible only if there was no commit (same SCN)
between two modifications.
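
A minimal sketch of that condition, in Python. It assumes a single counter that stands in for the SCN and advances on every commit; the class and method names are hypothetical.

```python
class RollbackSegment:
    """Toy model: saved page images are keyed by (page id, SCN)."""

    def __init__(self):
        self.scn = 0       # assumed to advance on every commit
        self.images = {}   # (page_id, scn) -> saved page image
        self.writes = 0    # page images actually written

    def commit(self):
        # Any commit starts a new SCN, so older images cannot be reused.
        self.scn += 1

    def before_modify(self, page_id, page_image):
        """Save the page before a change; return True if a write was needed."""
        key = (page_id, self.scn)
        if key in self.images:
            # Same page, same SCN: the saved image also covers this change.
            return False
        self.images[key] = bytes(page_image)
        self.writes += 1
        return True


rs = RollbackSegment()
first = rs.before_modify(7, b"page image v0")    # first change to page 7
second = rs.before_modify(7, b"page image v0")   # another row, no commit yet
rs.commit()
third = rs.before_modify(7, b"page image v1")    # change after a commit
```

Only the second modification avoids the extra write; once a commit falls between two changes, a new image has to be saved.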

But aren't we getting too deep into the overwriting smgr (O-smgr) implementation?
It's doable. It has advantages in terms of the IO that active transactions
must do to follow MVCC. It has a drawback in terms of required
disk space (and, oh yeah, it's not easy to implement -:)).
So, any other opinions about the value of O-smgr?

Vadim