Thread: PL/PGSQL + inserts+updates+limit - Postgres 9.3
Hi guys,
I need a function ( PL/PGSQL ) to perform a deletion of some BLOBS...
- original_table1_b = Original table, where the BLOBS are
- table1_n_b = Table where everything related to the BLOBS is stored (file_id, account_id, note_id, etc)
- table2_y_b = Table BACKUP - The blobs+data will be copied to here before being deleted
- table3_n_b = On table1_n_b, each blob is related to a note_id, and each note_id has three different file_ids. I want to delete just the greatest one, so this table3_n_b table stores the greatest file_id (by size) for each note_id
How is the table3_n_b table created:
SELECT * INTO table3_n_b FROM
(
    SELECT account_id, note_id, st_ino, size FROM
    (
        SELECT DISTINCT ON (note_id) note_id, MAX(size), file_id, id
        FROM table1_n_b
        GROUP BY note_id, size, file_id, id
        ORDER BY note_id, size DESC
    ) AS r1
) AS r2;
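(An aside: the MAX()/GROUP BY wrapper above is redundant once DISTINCT ON is used; DISTINCT ON alone can pick the largest file per note_id. A minimal, untested sketch using the same column names:)

```sql
-- One row per note_id: the row with the largest size wins.
-- DISTINCT ON keeps the first row of each note_id group, and the
-- ORDER BY makes that first row the one with the greatest size.
SELECT DISTINCT ON (note_id)
       account_id, note_id, file_id, size
INTO   table3_n_b
FROM   table1_n_b
ORDER  BY note_id, size DESC;
```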
The function must perform the following:
1 - Copy note_id + size + file_id + full_path from the table1_n_b table into the new table2_y_b one, but only for those file_ids that are greatest, so here we use the table created above, table3_n_b:
- Something like this?
INSERT INTO table2_y_b (note_id, size, file_id, full_path)
(
SELECT
t1.note_id,
t1.size,
t1.file_id,
t1.full_path
INTO
table2_y_b
FROM
table1_n_b t1
JOIN
table3_n_b t3 ON t3.file_id = t1.file_id
)
2 - Once the blob's metadata is inside the table2_y_b table, we can copy the blobs themselves into the same table.
- something like this?
INSERT INTO table2_y_b (data)
(
SELECT
o1.data
FROM
original_table1_b o1
JOIN
table3_n_b t3 ON t3.file_id = o1.file_id
)
3 - Changing the table2_y_b.migrated column from 0 to 1 (1 means the blob has been already copied):
FOR crtRow IN execute
'UPDATE table2_y_b SET migrated = 1 WHERE file_id = crtRow.file_id AND migrated = 0 ' || $1 ||' offset '||
4 - After we have a backup of the blobs+data, we can now delete the blob (setting the column as NULL)
FOR crtRow IN execute
'UPDATE original_table1_b SET data = NULL WHERE file_id = crtRow.file_id ' || $1 ||' offset '||
This is what I've done so far:
CREATE or REPLACE FUNCTION function_1_name(rows integer)
RETURNS INTEGER AS $$

declare
completed integer;
crtRow record;

BEGIN
offset_num = 0;

-- Copiyng the data into the table which will store the data+blobs
FOR crtRow IN execute
'INSERT INTO table2_y_b (note_id, size, file_id, full_path)
(
SELECT
t1.note_id,
t1.size,
t1.file_id,
t1.full_path
INTO
table2_y_b
FROM
table1_n_b t1
JOIN
table3_n_b t3 ON t3.file_id = t1.file_id
) ' || $1 ||' offset '||

-- Copying the BLOBS
FOR crtRow IN execute
'INSERT INTO table2_y_b (data)
(
SELECT
o1.data
FROM
original_table1_b o1
JOIN
table3_n_b t3 ON t3.file_id = o1.file_id
JOIN
table2_y_b t2 ON t2.file_id = o1.file_id
WHERE
t2.migrated = 0
) ' || $1 ||' offset '||

-- Update the migrated column from 0 to 1, for those rows that have been modified/copied.
FOR crtRow IN execute
'UPDATE table2_y_b SET migrated = 1 WHERE file_id = crtRow.file_id AND migrated = 0 ' || $1 ||' offset '||

FOR crtRow IN execute
'UPDATE original_table1_b SET data = NULL WHERE file_id = crtRow.file_id ' || $1 ||' offset '||

RETURN file_id;
END
$$ language 'plpgsql';
Am I doing right?
When I call the function — select function_1_name(5000) or select function_1_name(15000) — will it respect the limit on the number of rows?
I'd suggest you set up a test environment with some unimportant data on a non-production machine and try it yourself.
David J.
Thanks.. but if I'm asking the list that's because I'm already testing it and it's not working... ;)
Patrick
On Wednesday, June 1, 2016, Patrick Baker <patrickbakerbr@gmail.com> wrote:
Are you getting errors? A quick look suggested it shouldn't even run due to syntax problems.
David J.
On 06/01/2016 05:10 PM, Patrick Baker wrote:
[...]

I may be missing it, but I see no LIMIT in the function.

I do see OFFSET and it looks backwards to me?:

|| $1 ||' offset '||

https://www.postgresql.org/docs/9.5/static/sql-select.html

LIMIT Clause

The LIMIT clause consists of two independent sub-clauses:

LIMIT { count | ALL }
OFFSET start

Also I'm not sure what offset_num is supposed to do; it is declared but not used?

-- Adrian Klaver adrian.klaver@aklaver.com
I may be missing it, but I see no LIMIT in the function.
I do see OFFSET and it looks backwards to me?:
|| $1 ||' offset '||
https://www.postgresql.org/docs/9.5/static/sql-select.html
LIMIT Clause
The LIMIT clause consists of two independent sub-clauses:
LIMIT { count | ALL }
OFFSET start
Also I'm not sure what offset_num is supposed to do; it is declared but not used?
Yep.. it's declared but it's not used..
Hmm.. interesting that about the LIMIT clause
However.. I had to change the function...
Here is the new one:
CREATE or REPLACE FUNCTION function_data_1(rows integer)
RETURNS INTEGER AS $$

declare
completed integer;
offset_num integer;
crtRow record;

BEGIN
offset_num = 0;

INSERT INTO table2_y_b (note_id, size, file_id, full_path)
(
SELECT
t1.note_id,
t1.size,
t1.file_id,
t1.full_path
FROM
table1_n_b t1
JOIN
table3_n_b t3 ON t3.file_id = t1.file_id
);

UPDATE table2_y_b t2 SET segment_data =
(
SELECT
o1.data
FROM
original_table1_b o1
JOIN
table3_n_b t3 ON t3.file_id = o1.file_id
WHERE
t2.migrated = 0
AND
t2.file_id = o1.file_id
);

UPDATE table2_y_b SET migrated = 1 WHERE file_id = crtRow.file_id AND migrated = 0;

UPDATE original_table1_b SET data = NULL WHERE file_id = crtRow.file_id;

END
$$ language 'plpgsql';
It's all working, except the LIMIT... if possible can you please give me an example of that LIMIT in some of those queries?
Thanks
On 06/01/2016 10:04 PM, Patrick Baker wrote:
[...]

I do not know what you are trying to LIMIT/OFFSET, so I have no idea where to place the LIMIT/OFFSET. Maybe an example query showing what you are trying to do will help?

-- Adrian Klaver adrian.klaver@aklaver.com
It's all working, except the LIMIT... if possible can you please give me an example of that LIMIT in some of those queries?
You also should use ORDER BY when using LIMIT and OFFSET, though depending on the setup it could be omitted. Usually, as long as the second execution cannot select any of the records the first execution touched, you can choose a random quantity. But if you want random, then using OFFSET is pointless.
SELECT *
FROM generate_series(1, 10)
ORDER BY 1
LIMIT 5
OFFSET 3
generate_series
----------------------
4
5
6
7
8
You are going to have difficulty finding people willing to help when you cannot put together a self-contained and syntax-error-free example (I think the last one is...) of what you want to do. The PostgreSQL parser is very good at reading code and telling you what it doesn't like. I'm not inclined to spend time reading queries that obviously cannot run and point out those same problems. If you get a particular error you don't understand I'll be happy to try and explain what it is trying to tell you.
You probably need to reformulate your update to read:
UPDATE tbl
FROM (
SELECT 50 RECORDS
) src
WHERE src = tbl;
And ensure that the 50 being selected each time through are a different 50.
Writeable CTEs will probably help here.
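A concrete shape of that UPDATE ... FROM pattern — untested, and using the thread's table and column names only as an assumption:

```sql
-- Sketch: restrict the UPDATE to a 50-row slice chosen in a subquery,
-- joined back by key, so each call touches a bounded batch.
UPDATE table2_y_b AS tbl
SET    migrated = 1
FROM  (
        SELECT file_id
        FROM   table2_y_b
        WHERE  migrated = 0      -- only rows not yet processed
        ORDER  BY file_id        -- deterministic batch selection
        LIMIT  50
      ) AS src
WHERE  src.file_id = tbl.file_id;
```

Because the subquery excludes already-migrated rows, each execution naturally picks a different 50.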
David J.
2016-06-03 2:10 GMT+12:00 David G. Johnston <david.g.johnston@gmail.com>:
Hi David.
The SQL statements inside the function work.... I'm just having a problem limiting the query to the number of rows I want, and also teaching the update SQL to only touch the records the other SQL statements inside the function have touched.
This is the function updated:
CREATE or REPLACE FUNCTION function_data_1(rows integer)
RETURNS INTEGER AS $$

declare
completed integer;
offset_num integer;
crtRow record;

BEGIN
offset_num = 0;

INSERT INTO table2_y_b (note_id, size, file_id, full_path)
(
SELECT
t1.note_id,
t1.size,
t1.file_id,
t1.full_path
FROM
table1_n_b t1
JOIN
table3_n_b t3 ON t3.file_id = t1.file_id
);

UPDATE table2_y_b t2 SET segment_data =
(
SELECT
o1.data
FROM
original_table1_b o1
JOIN
table3_n_b t3 ON t3.file_id = o1.file_id
WHERE
t2.migrated = 0
AND
t2.file_id = o1.file_id
);

UPDATE table2_y_b SET migrated = 1 WHERE file_id = crtRow.file_id AND migrated = 0;

UPDATE original_table1_b SET data = NULL WHERE file_id = crtRow.file_id;

END
$$ language 'plpgsql';
- As you can see, the first insert inserts data into a new table from a select. This query must be limited by the number of rows I'll provide when calling the function; example:
select function_data_1(5000);
select function_data_1(60000);
select function_data_1(15000);
- The first update copies the BLOBS from the original_table1_b table into the new one (as above). Here, I also need the query to only touch those records that have been touched by the query above.
- The second update sets the table2_y_b.migrated column from 0 to 1, telling me that that record has been touched by the query, so the next call ( select function_data_1(60000); ) will already know that it does not need to touch that record; example:
WHERE
t2.migrated = 0
- The third and last update deletes the blobs (sets the blob column to NULL) that have already been touched by the queries above.... Still.. I don't know how to tell Postgres to only touch the rows that have been touched by the queries above....
Really? You seem to have lost your FOR loop for starters, and your RETURN statement, and a semi-colon after END, and I doubt crtRow.file_id works; should I go on... So, yes, you can run the four individual SQL statements correctly, but the function itself is bogus.
Here's a fish - though you will still need to clean it.
This is not tested, and I haven't ever built this exact query for real, but it should work in theory...
--assumes that to be migrated records have previously had their migrated flag set to 0
function name (number_of_rows_to_process integer)
LANGUAGE sql -- this no longer requires procedural logic so no need for plpgsql
RETURNS SETOF bigint --returns the affected ids
AS $$
WITH the_records_I_want_to_affect AS (
-- pick N records to process
SELECT id, ...
FROM source_tbl
WHERE migrated = 0
ORDER BY ...
LIMIT number_of_rows_to_process -- your function argument goes here
FOR UPDATE
),
migrate_the_data AS (
-- place a copy of them into the archive table
INSERT INTO migration_table
SELECT id, ...
FROM the_records_I_want_to_affect
RETURNING *
),
mark_as_migrated AS (
-- mark them as having been archived and nullify the blob data
UPDATE source_tbl
SET migrated = 1, data = null
FROM migrate_the_data recs
WHERE recs.id = source_tbl.id
RETURNING source_tbl.id
)
SELECT id FROM mark_as_migrated;
$$
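One possible cleanup of this sketch into a complete function — untested, with hypothetical table and column names carried over from the sketch (source_tbl with bigint id, data, and migrated columns; migration_table with compatible columns):

```sql
-- Batch-archive N rows via writable CTEs and return the migrated ids.
CREATE OR REPLACE FUNCTION migrate_batch(number_of_rows_to_process integer)
RETURNS SETOF bigint
LANGUAGE sql
AS $func$
WITH to_migrate AS (
    -- pick N unmigrated records and lock them
    SELECT id, data
    FROM   source_tbl
    WHERE  migrated = 0
    ORDER  BY id
    LIMIT  number_of_rows_to_process
    FOR UPDATE
),
copied AS (
    -- place a copy into the archive table
    INSERT INTO migration_table (id, data)
    SELECT id, data FROM to_migrate
    RETURNING id
)
-- mark as migrated and nullify the blob data
UPDATE source_tbl
SET    migrated = 1, data = NULL
FROM   copied
WHERE  copied.id = source_tbl.id
RETURNING source_tbl.id;
$func$;
```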
I am sure a fully working version of this idiom is present in one or more places on the internet. Feel free to search out fully working examples with additional commentary.
You can make a FOR loop version of this work, as one had to many years ago before writable CTEs were implemented.
David J.
On 06/02/2016 02:03 PM, Patrick Baker wrote:
[...]

See notes inline.

Why are you joining to table3_n_b? You do not use any fields from it. How do you know what data in table1_n_b to get? I see this grabbing the same information over and over again.

All the above would seem to be handled in a LOOP. Grab the data from:

SELECT t1.note_id, t1.size, t1.file_id, t1.full_path FROM table1_n_b t1

with a suitable WHERE clause, and use:

https://www.postgresql.org/docs/9.3/static/plpgsql-control-structures.html#PLPGSQL-RECORDS-ITERATING

to iterate over the results. As part of the iteration do your INSERT and UPDATE using the RECORD.file_id. This includes setting migrated=1 and data=NULL.

-- Adrian Klaver adrian.klaver@aklaver.com
Why are you joining to table3_nb?
You do not use any fields from it.
How do you know what data in table1_n_b to get?
I see this grabbing the same information over and over again.
SELECT * INTO table3_n_b FROM
( SELECT account_id, note_id, file_id FROM ( SELECT DISTINCT ON (note_id) note_id, MAX(size), file_id, company_id FROM table1_n_b GROUP BY note_id, size, file_id, company_id ORDER BY note_id, size desc ) AS r1
) AS r2;
Because I just wanna touch the greatest file_id ( by size ) of each note_id
And the file_id I must change is into the table3
That's why:
table3_n_b t3 ON t3.file_id = t1.file_id
UPDATE table2_y_b t2 SET segment_data =
(
SELECT
o1.data
FROM
original_table1_b o1
JOIN
table3_n_b t3 ON t3.file_id = o1.file_id
WHERE
t2.migrated = 0
AND
t2.file_id = o1.file_id
);
UPDATE table2_y_b SET migrated = 1 WHERE file_id =
crtRow.file_id AND migrated = 0;
UPDATE original_table1_b SET data = NULL WHERE file_id =
crtRow.file_id;
All the above would seem to be handled in a LOOP.
Grab the data from:
SELECT
t1.note_id,
t1.size,
t1.file_id,
t1.full_path
FROM
table1_n_b t1
with suitable WHERE clause and use:
https://www.postgresql.org/docs/9.3/static/plpgsql-control-structures.html#PLPGSQL-RECORDS-ITERATING
Hmm ok... but...
INSERT INTO table2_y_b (note_id, size, file_id, full_path)
(
SELECT
t1.note_id,
t1.size,
t1.file_id,
t1.full_path
FROM
table1_n_b t1
JOIN
table3_n_b t3 ON t3.file_id = t1.file_id
);
I don't need anything else on the WHERE clause , as the ON t3.file_id = t1.file_id is already doing what I need.... ( and it works.. I tested it )
to iterate over the results. As part of the iteration do your INSERT and UPDATE using the RECORD.file_id. This includes setting migrated=1 and data=NULL.
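A minimal, untested skeleton of that iteration pattern, using the thread's table names and assuming (as in a later version of the function) that the migrated flag lives on table3_n_b:

```sql
-- Sketch of the suggested LOOP: iterate over a limited set of rows,
-- doing the per-row INSERT/UPDATEs inside the loop body.
CREATE OR REPLACE FUNCTION migrate_rows(max_rows integer)
RETURNS integer AS $func$
DECLARE
    rec record;
    n integer := 0;
BEGIN
    FOR rec IN
        SELECT t1.note_id, t1.size, t1.file_id, t1.full_path
        FROM   table1_n_b t1
        JOIN   table3_n_b t3 ON t3.file_id = t1.file_id
        WHERE  t3.migrated = 0          -- skip already-processed rows
        ORDER  BY t1.file_id
        LIMIT  max_rows
    LOOP
        -- back up the metadata
        INSERT INTO table2_y_b (note_id, size, file_id, full_path)
        VALUES (rec.note_id, rec.size, rec.file_id, rec.full_path);

        -- back up the blob itself
        UPDATE table2_y_b t2
        SET    data = (SELECT o1.data FROM original_table1_b o1
                       WHERE  o1.file_id = rec.file_id)
        WHERE  t2.file_id = rec.file_id;

        -- mark as migrated and clear the original blob
        UPDATE table3_n_b SET migrated = 1 WHERE file_id = rec.file_id;
        UPDATE original_table1_b SET data = NULL WHERE file_id = rec.file_id;

        n := n + 1;
    END LOOP;
    RETURN n;  -- number of rows processed this call
END;
$func$ LANGUAGE plpgsql;
```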
Yep.. that's the way I started by doing this...
Can you please tell me if this would be right?
CREATE or REPLACE FUNCTION function_data_1()
RETURNS INTEGER AS $$

declare
row record;

BEGIN

-- copying the data to the backup table (not the blobs)
-- Limiting in 5000 rows each call
FOR row IN EXECUTE 'INSERT INTO table2_y_b (note_id, size, file_id, full_path)
(
SELECT
t1.note_id,
t1.size,
t1.file_id,
t1.full_path
FROM
table1_n_b t1
JOIN
table3_n_b t3 ON t3.file_id = t1.file_id
ORDER BY 1
LIMIT 5000
)'
LOOP

-- copying the blobs to the table above
UPDATE table2_y_b t2 SET segment_data =
(
SELECT
o1.data
FROM
original_table1_b o1
JOIN
table3_n_b t3 ON t3.file_id = o1.file_id
WHERE
t2.migrated = 0
AND
t2.file_id = o1.file_id
)
WHERE t2.file_id = row.file_id
END LOOP;

-- updating the migrated column from 0 to 1
LOOP
UPDATE table2_y_b t2 SET migrated = 1
WHERE t2.file_id = row.file_id AND migrated = 0
END LOOP;

LOOP
UPDATE original_table1_b o1 SET data = NULL
WHERE o1.file_id = row.file_id;
END LOOP;

END
$$ language 'plpgsql';
I did:
CREATE or REPLACE FUNCTION function_1_data()
RETURNS INTEGER AS $$

declare
row record;

BEGIN

-- copying the data to the backup table (not the blobs)
FOR row IN EXECUTE 'SELECT
t1.file_id
FROM
table1_n_b t1
JOIN
table3_n_b t3 ON t3.file_id = t1.file_id
ORDER BY 1 LIMIT 3'
LOOP

-- Creating the backup table with the essential data
EXECUTE 'INSERT INTO table2_y_b (note_id, size, file_id, full_path)
(
SELECT
t1.note_id,
t1.size,
t1.file_id,
t1.full_path
FROM
table1_n_b t1
JOIN
table3_n_b t3 ON t3.file_id = t1.file_id
) ';

-- copying the blobs to the table above
EXECUTE 'UPDATE table2_y_b t2 SET data =
(
SELECT
o1.data
FROM
original_table1_b o1
JOIN
table3_n_b t3 ON t3.file_id = o1.file_id
WHERE
t3.migrated = 0
AND
t2.file_id = o1.file_id
)
WHERE t2.file_id = row.file_id ';

-- updating the migrated column from 0 to 1
EXECUTE 'UPDATE table2_y_b t2 SET migrated = 1
WHERE t2.file_id = row.file_id AND migrated = 0 ';

-- setting the blob as null
EXECUTE 'UPDATE original_table1_b o1 SET data = NULL
WHERE o1.file_id = row.file_id ';

END LOOP;

return row.file_id;

END
$$ language 'plpgsql';
And I'm getting the error:
missing FROM-clause entry for table "row"
WHERE t2.st_ino = row.st_ino
Why does that happen?
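(For context, not from the thread: a string passed to EXECUTE is parsed as standalone SQL, so it cannot see PL/pgSQL variables like row.file_id — hence the "missing FROM-clause entry" error. Values have to be passed in, for example with USING. A minimal sketch using names from the thread:)

```sql
-- Inside the loop body: pass the variable as a parameter instead of
-- embedding its name in the dynamic SQL string.
EXECUTE 'UPDATE table2_y_b t2
         SET migrated = 1
         WHERE t2.file_id = $1 AND migrated = 0'
USING row.file_id;
```

In this case none of the statements need EXECUTE at all; plain SQL written directly in the function body can reference row.file_id.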
Hi guys,
The function works... All the data is updated as expected. However, when I call the function for the second time, it touches the rows that had already been touched by the previous call....
It triplicates ( LIMIT 3 ) the records.
Question:
How can I make the function to gets the next 3 rows and not use the same rows that have been used before?
Function updated:
CREATE or REPLACE FUNCTION function_data_1()
RETURNS SETOF bigint AS $$
declare row record;
BEGIN
-- copying the data to the backup table (not the blobs)
-- Limiting in 5000 rows each call
FOR row IN EXECUTE ' SELECT t1.file_id FROM table1 t1 JOIN table3 t3 ON t3.file_id = t1.file_id ORDER BY 1 LIMIT 3 '
LOOP
-- Creating the backup table with the essential data
-- Creating the backup table with the essential data
INSERT INTO table2 (note_id, size, file_id, full_path) ( SELECT t1.note_id, t1.size, t1.file_id, t1.full_path FROM table1 t1 JOIN table3 t3 ON t3.file_id = t1.file_id WHERE t1.file_id = row.file_id );

-- copying the blobs to the table above table2
UPDATE junk.table2 t2 SET data = ( SELECT o1.data FROM original_table1_b o1 JOIN table3 t3 ON t3.file_id = o1.file_id WHERE t3.migrated = 0 AND t2.file_id = o1.file_id AND o1.file_id = row.file_id ) WHERE t2.file_id = row.file_id;

-- updating the migrated column from 0 to 1
UPDATE table3 t2 SET migrated = 1 WHERE t2.file_id = row.file_id AND migrated = 0;

-- set the blobs as null
UPDATE original_table1_b o1 SET data = NULL WHERE o1.file_id = row.file_id;
END LOOP;
END
$$ language 'plpgsql';
How can I make the function to gets the next 3 rows and not use the same rows that have been used before?
WHERE migrated = 0
David J.
2016-06-03 15:50 GMT+12:00 David G. Johnston <david.g.johnston@gmail.com>:
How can I make the function to gets the next 3 rows and not use the same rows that have been used before?
WHERE migrated = 0
David J.
lol... that's right David J. Thanks for that! it's working... ;)
Last thing.. how to select the number of rows that have been modified?
I mean.. when doing: select function_data_1():
I want to get back the number of rows that have been touched..
do you know how?
thanks again
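(An aside not answered in the thread: PL/pgSQL can report this via GET DIAGNOSTICS, which reads the row count of the most recently executed SQL statement. A minimal, untested sketch of a function body:)

```sql
-- Excerpt of a plpgsql function body: capture how many rows the
-- previous statement affected and return that count to the caller.
DECLARE
    touched integer;
BEGIN
    UPDATE table2_y_b SET migrated = 1 WHERE migrated = 0;
    GET DIAGNOSTICS touched = ROW_COUNT;
    RETURN touched;
END;
```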
On 06/02/2016 08:37 PM, Patrick Baker wrote:
[...]

Are you not repeating yourself, why not?:

CREATE or REPLACE FUNCTION function_data_1()
RETURNS SETOF bigint AS $$
declare
row record;
BEGIN
-- copying the data to the backup table (not the blobs)
-- Limiting in 5000 rows each call
FOR row IN EXECUTE ' SELECT t1.note_id, t1.size, t1.file_id, t1.full_path FROM table1 t1 JOIN table3 t3 ON t3.file_id = t1.file_id ORDER BY 1 LIMIT 3 '
LOOP
-- Creating the backup table with the essential data
INSERT INTO table2 (row.note_id, row.size, row.file_id, row.full_path)
.....

Still not seeing what the JOIN to table3 t3 gets you?

Any way the function works.

-- Adrian Klaver adrian.klaver@aklaver.com
> -- Creating the backup table with the essential data
> INSERT INTO table2 (row.note_id, row.size, row.file_id, row.full_path)
> .....
> Still not seeing what the JOIN to table3 t3 gets you?
> Any way the function works.
I changed the function to use row.note_id, row.size, etc... I think it's cleaner that way! :)
Is there any way to create another function to restore the data back?
Example:
select function_data_1_restore(123414);
Where 123414 = file_id
How can I make the function use the file_id that I pass in when I call it?
Can you please guys tell me?
cheers
On 06/03/2016 12:23 AM, Patrick Baker wrote:
>> -- Creating the backup table with the essential data
>> INSERT INTO table2 (row.note_id, row.size, row.file_id, row.full_path)
>> .....
>> Still not seeing what the JOIN to table3 t3 gets you?
>> Any way the function works.
>
> I changed the function to use row.note_id, row.size, etc... think it's more intelligent that way! :)
>
> Is there any way to create another function to restore the data back?

I am sure there is, but it will probably be more difficult than copying the data in the first place. From your previous function there seem to be a lot of moving parts. Unwinding those tables, and any other data that is dependent on them, could be a chore.

> Example:
>
> select function_data_1_restore(123414);
>
> Where 123414 = file_id
>
> How can I tell the function to get the file_id that I'll insert into the call?

That would depend on why and what you want to restore. The function is going to need some sort of prompting from the user about what criteria to use to determine the records to select and restore.

> Can you please guys tell me?

My help would be to say: first sit down and draw out the dependencies you have between the data and the various tables. Then work out an outline of how to walk the data back from those tables into its original location(s).

> cheers

-- Adrian Klaver adrian.klaver@aklaver.com
> On 06/03/2016 12:23 AM, Patrick Baker wrote:
>> -- Creating the backup table with the essential data
>> INSERT INTO table2 (row.note_id, row.size, row.file_id, row.full_path)
>> .....
>> Still not seeing what the JOIN to table3 t3 gets you?
>> Any way the function works.
>> I changed the function to use row.note_id, row.size, etc... think it's more intelligent that way! :)
>> Is there any way to create another function to restore the data back?
> I am sure there is, but it will probably be more difficult than copying the data in the first place. From your previous function there seem to be a lot of moving parts. Unwinding those tables and any other data that is dependent on them could be a chore.
Adrian,
I don't think it's that bad. All that was done was updating a bytea (or text...) field to NULL after saving the original contents elsewhere. Restoring should be as simple as:
UPDATE tbl SET data = archived_data
FROM archive_tbl
WHERE tbl.file_id = archive_tbl.file_id
AND tbl.file_id = <user input>;
Updating the migrated flag in the main table, and cleaning up the now-extraneous entries in the archive, would be equally simple. No rows in the main tables were added or removed.
Patrick,
You already wrote the archive function; you should be capable of at least attempting to write its inverse.
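For what it's worth, here is a minimal sketch of what that inverse might look like, building on the UPDATE ... FROM above. Table and column names (table2 as the archive, original_table1_b, table3.migrated) are taken from the earlier function; the function name and the p_file_id parameter are just placeholders:

```sql
CREATE OR REPLACE FUNCTION function_data_1_restore(p_file_id bigint)
RETURNS integer AS $$
DECLARE
    restored integer;
BEGIN
    -- Copy the archived blob back into the original table
    UPDATE original_table1_b o1
    SET data = t2.data
    FROM table2 t2
    WHERE o1.file_id = t2.file_id
      AND o1.file_id = p_file_id;

    -- How many rows the UPDATE touched (0 if the file_id was not archived)
    GET DIAGNOSTICS restored = ROW_COUNT;

    -- Flip the migrated flag back so the row is eligible again
    UPDATE table3 t3
    SET migrated = 0
    WHERE t3.file_id = p_file_id;

    RETURN restored;
END;
$$ LANGUAGE plpgsql;
```

Called as `select function_data_1_restore(123414);`, it returns the number of blobs put back, which doubles as a sanity check that the file_id existed in the archive.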
If you are wondering how to pass the value 123414 in:
select function_data_1_restore(123414);
That would depend on the client. In psql you'd just type it in. In Java you probably do something like:
stmt = conn.prepareStatement("SELECT function_data_1_restore(?)");
stmt.setInt(1, 123414);
stmt.execute();
David J.
2016-06-03 15:50 GMT+12:00 David G. Johnston <david.g.johnston@gmail.com>:
>> How can I make the function get the next 3 rows and not use the same rows that have been used before?
> WHERE migrated = 0
> David J.

lol... that's right, David J. Thanks for that! It's working... ;)

Last thing: how do I select the number of rows that have been modified? I mean, when doing:

select function_data_1();

I want to get back the number of rows that have been touched. Do you know how?

thanks again
Add a counter variable, increment it within the loop, and return it.
David J.
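In outline, that suggestion applied to the earlier function might look like the sketch below (only the counter lines are new; the per-row INSERT/UPDATE work is elided, and the return type changes from SETOF bigint to plain bigint so the count comes back as a single value):

```sql
CREATE OR REPLACE FUNCTION function_data_1()
RETURNS bigint AS $$
DECLARE
    row record;
    touched bigint := 0;   -- counter, declared outside the loop
BEGIN
    FOR row IN EXECUTE '
        SELECT t1.file_id
        FROM table1 t1
        JOIN table3 t3 ON t3.file_id = t1.file_id
        WHERE t3.migrated = 0
        ORDER BY 1 LIMIT 3 '
    LOOP
        -- ... the INSERT/UPDATE work from the original function ...
        touched := touched + 1;   -- increment once per processed row
    END LOOP;
    RETURN touched;   -- select function_data_1() now returns the row count
END;
$$ LANGUAGE plpgsql;
```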
Ok so guys....
CREATE or REPLACE FUNCTION function_data_1()
RETURNS SETOF bigint AS $$
declare
    row record;
BEGIN

[...]

FOR row IN EXECUTE '
    SELECT t1.file_id, t1.path, t1.account_id
    FROM table1 t1
    JOIN table3 t3 ON t3.file_id = t1.file_id
    WHERE t3.migrated = 0
      AND t3.account_id = 1112
    ORDER BY 1 LIMIT 30 '

[...]
How can I make the function work with account_id?
Example: select function_data_1(1112)
and then it will do all the work just for that specific account_id?
Could you guys please show me how to do that?
thanks
Patrick
On 06/06/2016 09:01 PM, Patrick B wrote:
> Ok so guys....
>
> CREATE or REPLACE FUNCTION function_data_1()
> RETURNS SETOF bigint AS $$
> declare
>     row record;
> BEGIN
> [...]
> FOR row IN EXECUTE '
>     SELECT t1.file_id, t1.path, t1.account_id
>     FROM table1 t1
>     JOIN table3 t3 ON t3.file_Id = t1.file_id
>     WHERE t3.migrated = 0
>       AND t3.account_id = 1112
>     ORDER BY 1 LIMIT 30 '
> [...]
>
> How can I make the function work with account_id?
>
> Example: select function_data_1(1112)
>
> and then it will do all the work just for that specific account_id?

https://www.postgresql.org/docs/9.5/static/plpgsql-declarations.html#PLPGSQL-DECLARATION-PARAMETERS

> If you guys please could give me the way to do that..
> thanks
> Patrick

-- Adrian Klaver adrian.klaver@aklaver.com
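Following that documentation link, the change amounts to declaring a parameter and referencing it from the dynamic query. A sketch under the assumption that the rest of the body stays as before (the parameter name p_account_id is arbitrary; EXECUTE ... USING passes the value safely instead of concatenating it into the query string):

```sql
CREATE OR REPLACE FUNCTION function_data_1(p_account_id bigint)
RETURNS SETOF bigint AS $$
DECLARE
    row record;
BEGIN
    -- $1 inside the dynamic query is bound by the USING clause below
    FOR row IN EXECUTE '
        SELECT t1.file_id, t1.path, t1.account_id
        FROM table1 t1
        JOIN table3 t3 ON t3.file_id = t1.file_id
        WHERE t3.migrated = 0
          AND t3.account_id = $1
        ORDER BY 1 LIMIT 30 '
    USING p_account_id
    LOOP
        -- ... same per-row work as before ...
        NULL;
    END LOOP;
END;
$$ LANGUAGE plpgsql;
```

Called as `select function_data_1(1112);`, it then processes only that account's rows.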
Thanks Adrian... it's working ;)