Basically, I want to do something like this.
I have N rows in table rawData.
I have to create batches from these N rows using tables batches (which has a serial id column and some additional data
columns)and batchContents (which references id in batches), where there will be M rows in batchContent for each row in
tablebatches.
Example (N=12, M=5, meaning take 12 rows, and create as many batches as needed with a maximum of 5 batchContent rows
perbatch)
rawData
r01
r02
r03
r04
r05
r06
r07
r08
r09
r10
r11
r12
Expected result:
batches:
id post
5001 5
5002 5
5003 5
batchContents:
batch_id datum
5001 r01
5001 r02
5001 r03
5001 r04
5001 r05
5002 r06
5002 r07
5002 r08
5002 r09
5002 r10
5003 r11
5003 r12
The order in which the data are distributed between the batches is not important, but I need to have M data in each
batchexcept the last.
My starting point was a statement for insertion into batches. If I know what N and M are, I know how many batches I'll
need(B=ceil(N/M)), so I thought of writing
INSERT INTO batches(post)
SELECT 5 -- All the extra data, like the "post" field, is inserted as literals here
FROM generate_series(1,B)
RETURNING id
This will give me the proper rows in batches, but I don't know how to incorporate this into an insert/select from
rawDatainto batchContent, assuming that I don't want to keep the data programatically and do repeated SELECTs with
OFFSETand LIMIT for each id returned. Maybe there is some elegant solution with window functions? Can there be any sort
ofjoin over a window or something like that?
TIA,
Herouth