Thread: [GENERAL] Function with limit and offset - PostgreSQL 9.3

[GENERAL] Function with limit and offset - PostgreSQL 9.3

From
marcinha rocha
Date:

Hi guys! I have the following function, which basically selects data, inserts it into a new table, and updates a column on the original table.


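CREATE OR REPLACE FUNCTION migrate_data()
RETURNS integer AS $$
DECLARE
    row record;
    numrows integer := 0;  -- declared so the function compiles; not counted yet
BEGIN
    -- Select the rows that have not been migrated yet
    FOR row IN EXECUTE 'SELECT id FROM tablea WHERE migrated = false'
    LOOP
        -- Copy the row to the new table, then flag it on the original table
        INSERT INTO tableb (id)
        VALUES (row.id);

        UPDATE tablea a SET migrated = true WHERE a.id = row.id;
    END LOOP;

    RETURN numrows; -- I want it to return the number of processed rows
END;
$$ LANGUAGE plpgsql;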

When I call the function, it must process 2000 rows and then stop. Then, when I call it again, it must process rows 2001 to 4000, and so on.


How can I do that? I couldn't find a solution for this.. 



Thanks!
Marcia

Re: [GENERAL] Function with limit and offset - PostgreSQL 9.3

From
"David G. Johnston"
Date:
On Thursday, June 8, 2017, marcinha rocha <marciaestefanidarocha@hotmail.com> wrote:

> When I call the function, it must process 2000 rows and then stop. Then, when I call it again, it must process rows 2001 to 4000, and so on.

You can do this with plain SQL with the help of a CTE. Put an INSERT INTO ... SELECT ... LIMIT 2000 RETURNING id in a CTE; that does the migration. Then, in the outer query, perform the UPDATE by referencing the rows returned from the CTE.
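
A minimal sketch of the CTE approach described above, assuming tablea has columns id and migrated (boolean) and tableb has a column id, as in the original post. The CTE name moved is made up for illustration, and the ORDER BY is an added assumption so that each batch is picked in a predictable order.

```sql
-- Assumed schema (from the original post): tablea(id, migrated boolean), tableb(id).
-- "moved" is an illustrative CTE name.
WITH moved AS (
    -- Copy up to 2000 not-yet-migrated ids and hand them back to the outer query
    INSERT INTO tableb (id)
    SELECT id
    FROM tablea
    WHERE migrated = false
    ORDER BY id
    LIMIT 2000
    RETURNING id
)
-- Flag exactly the rows that were just copied
UPDATE tablea a
SET migrated = true
FROM moved
WHERE a.id = moved.id;
```

Each run handles the next batch, because rows flagged by an earlier run no longer match migrated = false.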

David J.

Re: [GENERAL] Function with limit and offset - PostgreSQL 9.3

From
John R Pierce
Date:
On 6/8/2017 5:53 PM, marcinha rocha wrote:
> Hi guys! I have the following queries, which will basically select
> data, insert it into a new table and update a column on the original
> table.


I'm sure your example is a gross simplification of what you're really
doing, but if that's really all you're doing, why not do it all at once,
instead of row at a time?


BEGIN;
     insert into tableb (id) select id from tablea;
     update tablea set migrated=true;
COMMIT;


That's far more efficient than the row-at-a-time iterative solution you
showed.

--
john r pierce, recycling bits in santa cruz



Re: [GENERAL] Function with limit and offset - PostgreSQL 9.3

From
marcinha rocha
Date:

> On 6/8/2017 5:53 PM, marcinha rocha wrote:
>
> > Hi guys! I have the following queries, which will basically select
> > data, insert it into a new table and update a column on the original
> > table.
>
> I'm sure your example is a gross simplification of what you're really
> doing, but if that's really all you're doing, why not do it all at once,
> instead of row at a time?
>
> BEGIN;
>      insert into tableb (id) select id from tablea;
>      update tablea set migrated=true;
> COMMIT;
>
> That's far more efficient than the row-at-a-time iterative solution you
> showed.


You're right, that is just an example.

I'm basically using a CTE to select the data and then inserting some rows into a new table.

I just don't know how to tell my function to process 2000 records at a time, and then, when I call it again, have it "know" where to start from.

Maybe I already have everything I need?

UPDATE tablea a SET migrated = true WHERE a.id = row.id;
On my original select, the row will have migrated = false. Maybe all I need is to add a LIMIT 2000 and the query will do the rest?

Example:
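CREATE OR REPLACE FUNCTION migrate_data()
RETURNS integer AS $$
DECLARE
    row record;
    num_rows integer := 0;  -- declared so the function compiles; not counted yet
BEGIN
    -- Select the rows that have not been migrated yet
    FOR row IN EXECUTE 'SELECT id FROM tablea WHERE migrated = false'
    LOOP
        -- Copy the row to the new table, then flag it on the original table
        INSERT INTO tableb (id)
        VALUES (row.id);

        UPDATE tablea a SET migrated = true WHERE a.id = row.id;
    END LOOP;

    RETURN num_rows; -- I want it to return the number of processed rows
END;
$$ LANGUAGE plpgsql;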



Re: [GENERAL] Function with limit and offset - PostgreSQL 9.3

From
"David G. Johnston"
Date:
On Thursday, June 8, 2017, marcinha rocha <marciaestefanidarocha@hotmail.com> wrote:

> On my original select, the row will have migrated = false. Maybe all I need is to add a LIMIT 2000 and the query will do the rest?


You should try to avoid the for loop, but yes, a limit 2000 on the for loop query should work since the migrated flag will ensure the same rows aren't selected again.

David J. 

Re: [GENERAL] Function with limit and offset - PostgreSQL 9.3

From
marcinha rocha
Date:

> On Thursday, June 8, 2017, marcinha rocha <marciaestefanidarocha@hotmail.com> wrote:
>
> > On my original select, the row will have migrated = false. Maybe all I need is to add a LIMIT 2000 and the query will do the rest?
>
> You should try to avoid the for loop,

Why?

> but yes, a limit 2000 on the for loop query should work since the migrated flag will ensure the same rows aren't selected again.
>
> David J.


Ok, cool!

Now, how do I tell the function to return the number of touched rows? In this case, it should always be 2000.

Thanks!

Re: [GENERAL] Function with limit and offset - PostgreSQL 9.3

From
"David G. Johnston"
Date:
On Thursday, June 8, 2017, marcinha rocha <marciaestefanidarocha@hotmail.com> wrote:

> > On Thursday, June 8, 2017, marcinha rocha <marciaestefanidarocha@hotmail.com> wrote:
> >
> > > On my original select, the row will have migrated = false. Maybe all I need is to add a LIMIT 2000 and the query will do the rest?
> >
> > You should try to avoid the for loop,
>
> Why?

Mainly expected performance concerns. The engine is designed to handle result sets as opposed to iterating over single rows. Whether it's true in your case I don't know, but I would assume that operating on sets would be faster.


> Ok, cool!
>
> Now, how do I tell the function to return the number of touched rows? In this case, it should always be 2000.


Unless there are fewer rows to process.  You could always just do i = i + 1 in the loop.
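
A rough sketch of the loop variant with a counter, assuming the same tablea(id, migrated boolean) and tableb(id) as in the original post. The variable names rec and processed are made up here, and the ORDER BY id is an added assumption so the batch is chosen in a predictable order.

```sql
-- Sketch only: rec and processed are illustrative names.
CREATE OR REPLACE FUNCTION migrate_data()
RETURNS integer AS $$
DECLARE
    rec record;
    processed integer := 0;  -- number of rows handled in this call
BEGIN
    FOR rec IN
        SELECT id
        FROM tablea
        WHERE migrated = false
        ORDER BY id
        LIMIT 2000
    LOOP
        INSERT INTO tableb (id) VALUES (rec.id);
        UPDATE tablea SET migrated = true WHERE id = rec.id;
        processed := processed + 1;
    END LOOP;

    -- 2000 for a full batch, fewer when fewer rows remain
    RETURN processed;
END;
$$ LANGUAGE plpgsql;
```

Calling SELECT migrate_data(); repeatedly would then work through the table one batch at a time, returning 0 once nothing is left to migrate.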

David J. 

Re: [GENERAL] Function with limit and offset - PostgreSQL 9.3

From
John R Pierce
Date:
On 6/8/2017 6:36 PM, marcinha rocha wrote:
> UPDATE tablea a SET migrated = true WHERE a.id = row.id;
> On my original select, the row will have migrated = false. Maybe all I need is to add a LIMIT 2000 and the query will do the rest?

SELECT does not return data in any determinate order unless you use an ORDER BY, so LIMIT 2000 would return some 2000 elements, not necessarily the 'first' 2000 elements, unless you order them by however you feel 'first' is defined.


    WITH ids AS (INSERT INTO tableb (id) SELECT id FROM tablea WHERE migrated=FALSE ORDER BY id LIMIT 2000 RETURNING id)
        UPDATE tablea a SET migrated=TRUE FROM ids WHERE a.id = ids.id RETURNING a.id;



You can't do UPDATE .... RETURNING COUNT(...), since aggregates aren't allowed in RETURNING. To get the count, the UPDATE ... RETURNING would go in a second CTE, with a SELECT COUNT(*) over its rows.
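
One way to get the count in a single statement, sketched under the same assumptions (tablea(id, migrated boolean), tableb(id)). The CTE name updated and the output column processed are made up for illustration.

```sql
-- Sketch only: "updated" and "processed" are illustrative names.
WITH ids AS (
    -- Copy the next batch of ids into tableb
    INSERT INTO tableb (id)
    SELECT id
    FROM tablea
    WHERE migrated = false
    ORDER BY id
    LIMIT 2000
    RETURNING id
), updated AS (
    -- Flag the copied rows and return them so they can be counted
    UPDATE tablea a
    SET migrated = true
    FROM ids
    WHERE a.id = ids.id
    RETURNING a.id
)
SELECT count(*) AS processed
FROM updated;
```

The result is the number of rows migrated by this run: at most 2000, and 0 once every row has been flagged.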


-- 
john r pierce, recycling bits in santa cruz