Hey,
I'm trying to read the rows of a table in chunks to process them in a background worker.
I want to ensure that each row is processed only once.
I was thinking of using `SELECT * ... OFFSET {offset_size} LIMIT {limit_size}` for this, but I'm running into issues.
Some approaches I had in mind that aren't working out:
- Try to use the transaction id to query rows created since the last processed transaction id
  - It seems Postgres does not expose row transaction ids, so this approach is not feasible
- Rely on the OFFSET / LIMIT combination to query the next chunk of data
  - SELECT * does not guarantee ordering of rows, so it's possible older rows repeat or newer rows are missed in a chunk
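
For concreteness, here is roughly the OFFSET / LIMIT loop I have in mind. This is only a sketch: it uses Python's built-in sqlite3 as a stand-in for Postgres so that it runs on its own, and the table name `events` and the helper `read_chunks` are made up for illustration.

```python
import sqlite3


def read_chunks(conn, limit_size):
    """Yield successive chunks of rows using OFFSET / LIMIT (hypothetical helper)."""
    offset = 0
    while True:
        # No ORDER BY here, so the row order (and therefore the chunk
        # boundaries) is not guaranteed to be stable between queries.
        rows = conn.execute(
            "SELECT * FROM events LIMIT ? OFFSET ?", (limit_size, offset)
        ).fetchall()
        if not rows:
            break
        yield rows
        offset += len(rows)


# Stand-in database; the real table lives in Postgres.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (id INTEGER PRIMARY KEY, payload TEXT)")
conn.executemany(
    "INSERT INTO events (payload) VALUES (?)",
    [(f"row {i}",) for i in range(5)],
)

chunk_sizes = [len(chunk) for chunk in read_chunks(conn, 2)]
```

The loop itself works, but nothing in it prevents a row from landing in two chunks or in none if the order changes or rows are inserted between queries, which is the problem described above.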
Can you please suggest an alternative way to periodically read rows from a table in chunks while processing each row exactly once?
Thanks,
Sushrut