Function with limit and offset - PostgreSQL 9.3
Hi guys! I have the following function, which basically selects data, inserts it into a new table, and updates a column on the original table.
CREATE or REPLACE FUNCTION migrate_data()
RETURNS integer AS $$
declare
row record;
BEGIN
FOR row IN EXECUTE '
SELECT
id
FROM
tablea
WHERE
mig = true
'
LOOP
INSERT INTO tableb (id)
VALUES (row.id);
UPDATE tablea a SET migrated = true WHERE a.id = row.id;
END LOOP;
RETURN numrows; -- I want it to return the number of processed rows
END
$$ language 'plpgsql';
When I call the function, it must process 2000 rows and then stop. When I call it again, it must process rows 2001 to 4000, and so on.
How can I do that? I couldn't find a solution for this.
Thanks!
Marcia
On Thursday, June 8, 2017, marcinha rocha <marciaestefanidarocha@hotmail.com>
wrote:
When I call the function, it must execute 2000 rows and then stop. Then
when calling it again, it must start from 2001 to 4000, and so on
You can do this with plain SQL with the help of a CTE: INSERT INTO ...
SELECT ... LIMIT 2000 RETURNING id. Migration done. Put that in a CTE.
In the outer query, perform the update by referencing the returned rows from
the CTE.
David J.
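A minimal sketch of the CTE approach described above. It assumes tablea has a boolean column named migrated (the name used in the UPDATE of the original function) and that ordering by id is an acceptable way to pick the next batch; the CTE name moved is hypothetical.

```sql
-- Copy one batch into tableb and flag the same rows in tablea,
-- all in a single statement.
WITH moved AS (
    INSERT INTO tableb (id)
    SELECT id
    FROM tablea
    WHERE migrated = false   -- only rows not yet migrated
    ORDER BY id              -- assumption: id defines the batch order
    LIMIT 2000
    RETURNING id
)
UPDATE tablea a
SET migrated = true
FROM moved
WHERE a.id = moved.id;
```

Each run picks up where the previous one stopped, because rows flagged by an earlier run no longer match migrated = false.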
On 6/8/2017 5:53 PM, marcinha rocha wrote:
Hi guys! I have the following queries, which will basically select
data, insert it onto a new table and update a column on the original
table.
I'm sure your example is a gross simplification of what you're really
doing, but if that's really all you're doing, why not do it all at once,
instead of row at a time?
BEGIN;
insert into tableb (id) select id from tablea;
update tablea set migrated=true;
COMMIT;
That's far more efficient than the row-at-a-time iterative solution you
showed.
--
john r pierce, recycling bits in santa cruz
You're right, that is just an example.
I'm basically using a CTE to select the data and then inserting some rows into a new table.
I just don't know how to tell my function to process 2000 records at a time, so that when I call it again it will "know" where to start from.
Maybe I already have everything I need?
UPDATE tablea a SET migrated = true WHERE a.id = row.id;
In my original select, the rows will have migrated = false. Maybe all I need to add is a LIMIT 2000 and the query will do the rest?
Example:
CREATE or REPLACE FUNCTION migrate_data()
RETURNS integer AS $$
declare
row record;
BEGIN
FOR row IN EXECUTE '
SELECT
id
FROM
tablea
WHERE
migrated = false
'
LOOP
INSERT INTO tableb (id)
VALUES (row.id);
UPDATE tablea a SET migrated = true WHERE a.id = row.id;
END LOOP;
RETURN num_rows; -- I want it to return the number of processed rows
END
$$ language 'plpgsql';
On Thursday, June 8, 2017, marcinha rocha <marciaestefanidarocha@hotmail.com>
wrote:
On my original select, the row will have migrated = false. Maybe All I
need to put is a limit 2000 and the query will do the rest?
You should try to avoid the for loop, but yes, a LIMIT 2000 on the for loop
query should work since the migrated flag will ensure the same rows aren't
selected again.
David J.
On Thursday, June 8, 2017, marcinha rocha <marciaestefanidarocha@hotmail.com> wrote:
On my original select, the row will have migrated = false. Maybe All I need to put is a limit 2000 and the query will do the rest?
You should try to avoid the for loop,
Why?
but yes a limit 2000 on the for loop query should work since the migrated flag will ensure the same rows aren't selected again.
David J.
Ok, cool!
Now, how do I tell the function to return the number of touched rows? In this case, it should always be 2000.
Thanks!
On Thursday, June 8, 2017, marcinha rocha <marciaestefanidarocha@hotmail.com>
wrote:
On my original select, the row will have migrated = false. Maybe All I
need to put is a limit 2000 and the query will do the rest?
You should try to avoid the for loop,
Why?
Mainly expected performance concerns. The engine is designed to handle
result sets as opposed to single-row iteration. Whether that holds in your
case I don't know, but I would assume that operating on sets would be faster.
Ok, cool!
Now, how do I tell the function to return the number of touched rows? In
this case, it should always be 2000.
Unless there are fewer rows to process. You could always just do i = i + 1
in the loop.
David J.
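A sketch of the loop-based function with the two suggestions from this exchange applied: a LIMIT 2000 on the loop query and a counter incremented once per row. The variable names rec and i are hypothetical, migrated is assumed to be a boolean column, and the ORDER BY id is an added assumption to make each batch predictable.

```sql
CREATE OR REPLACE FUNCTION migrate_data()
RETURNS integer AS $$
DECLARE
    rec record;
    i   integer := 0;   -- rows processed in this call
BEGIN
    FOR rec IN
        SELECT id
        FROM tablea
        WHERE migrated = false
        ORDER BY id      -- assumption: batches are taken in id order
        LIMIT 2000
    LOOP
        INSERT INTO tableb (id) VALUES (rec.id);
        UPDATE tablea SET migrated = true WHERE id = rec.id;
        i := i + 1;
    END LOOP;
    RETURN i;            -- 2000, or fewer when fewer rows remain
END;
$$ LANGUAGE plpgsql;
```

Calling SELECT migrate_data(); repeatedly processes one batch per call and returns 0 once no unmigrated rows are left.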
On 6/8/2017 6:36 PM, marcinha rocha wrote:
UPDATE tablea a SET migrated = true WHERE a.id = row.id;
On my original select, the row will have migrated = false. Maybe All I
need to put is a limit 2000 and the query will do the rest?
SELECT does not return data in any determinate order unless you use an
ORDER BY, so LIMIT 2000 would return some 2000 elements, not
necessarily the 'first' 2000 elements, unless you order them by
however you feel 'first' is defined.
WITH ids AS (INSERT INTO tableb (id) SELECT id FROM tablea WHERE
migrated=FALSE ORDER BY id LIMIT 2000 RETURNING id)
UPDATE tablea a SET migrated=TRUE FROM ids WHERE a.id = ids.id
RETURNING COUNT(a.id);
I'm not 100% sure you can do UPDATE ... RETURNING COUNT(...); worst
case, the UPDATE ... RETURNING would be a subquery of a SELECT COUNT()...
--
john r pierce, recycling bits in santa cruz
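PostgreSQL does not allow aggregate functions in a RETURNING clause, and a data-modifying statement can only appear inside WITH, not as an ordinary subquery. One way to write the fallback described above is therefore to make the UPDATE a second CTE and count its returned rows in the outer SELECT. This is a sketch; the names updated and migrated_rows are hypothetical.

```sql
WITH ids AS (
    INSERT INTO tableb (id)
    SELECT id
    FROM tablea
    WHERE migrated = FALSE
    ORDER BY id
    LIMIT 2000
    RETURNING id
), updated AS (
    UPDATE tablea a
    SET migrated = TRUE
    FROM ids
    WHERE a.id = ids.id
    RETURNING a.id
)
-- Number of rows flagged as migrated by this statement.
SELECT count(*) AS migrated_rows FROM updated;
```

The statement can be run as is, or used as the body of a function declared with RETURNS bigint, since count(*) yields a bigint.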