Function with limit and offset - PostgreSQL 9.3

Started by marcinha rocha almost 9 years ago | 8 messages | general
#1 marcinha rocha
marciaestefanidarocha@hotmail.com

Hi guys! I have the following queries, which basically select data, insert it into a new table and update a column on the original table.

CREATE or REPLACE FUNCTION migrate_data()
RETURNS integer AS $$

declare
row record;

BEGIN

FOR row IN EXECUTE '
SELECT
id
FROM
tablea
WHERE
mig = true
'
LOOP

INSERT INTO tableb (id)
VALUES (row.id);

UPDATE tablea a SET migrated = true WHERE a.id = row.id;

END LOOP;

RETURN numrows; -- I want it to return the number of processed rows

END

$$ language 'plpgsql';

When I call the function, it must execute 2000 rows and then stop. Then when calling it again, it must start from 2001 to 4000, and so on.

How can I do that? I couldn't find a solution for this..

Thanks!
Marcia

#2 David G. Johnston
david.g.johnston@gmail.com
In reply to: marcinha rocha (#1)
Re: Function with limit and offset - PostgreSQL 9.3

On Thursday, June 8, 2017, marcinha rocha <marciaestefanidarocha@hotmail.com>
wrote:

When I call the function, it must execute 2000 rows and then stop. Then
when calling it again, it must start from 2001 to 4000, and so on

You can do this with plain SQL with the help of a CTE:
INSERT INTO ... SELECT ... LIMIT 2000 RETURNING id. Migration done.
Put that in a CTE. In the outer query, perform the update by referencing
the returned rows from the CTE.

David J.
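
A minimal sketch of the CTE approach described above, assuming tablea has a boolean column migrated that is false for rows not yet copied (the later messages in this thread use that column). The CTE name moved is made up for illustration:

```sql
-- Copy one batch into tableb and flag the same rows in tablea,
-- all in a single statement.
WITH moved AS (
    INSERT INTO tableb (id)
    SELECT id
    FROM tablea
    WHERE migrated = false
    ORDER BY id          -- makes the choice of batch predictable
    LIMIT 2000
    RETURNING id         -- hand the copied ids to the outer query
)
UPDATE tablea a
SET migrated = true
FROM moved
WHERE a.id = moved.id;
```

Each run copies up to 2000 not-yet-migrated rows and flags them, so repeated runs walk through the table in batches without needing OFFSET.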

#3 John R Pierce
pierce@hogranch.com
In reply to: marcinha rocha (#1)
Re: Function with limit and offset - PostgreSQL 9.3

On 6/8/2017 5:53 PM, marcinha rocha wrote:

Hi guys! I have the following queries, which will basically select
data, insert it onto a new table and update a column on the original
table.

I'm sure your example is a gross simplification of what you're really
doing, but if that's really all you're doing, why not do it all at once,
instead of row at a time?

BEGIN;
insert into tableb (id) select id from tablea;
update tablea set migrated=true;
COMMIT;

That's far more efficient than the row-at-a-time iterative solution you
showed.

--
john r pierce, recycling bits in santa cruz


#4 marcinha rocha
marciaestefanidarocha@hotmail.com
In reply to: marcinha rocha (#1)
Re: Function with limit and offset - PostgreSQL 9.3

On 6/8/2017 5:53 PM, marcinha rocha wrote:

Hi guys! I have the following queries, which will basically select
data, insert it onto a new table and update a column on the original
table.

I'm sure your example is a gross simplification of what you're really
doing, but if that's really all you're doing, why not do it all at once,
instead of row at a time?

BEGIN;
insert into tableb (id) select id from tablea;
update tablea set migrated=true;
COMMIT;

That's far more efficient than the row-at-a-time iterative solution you
showed.

You're right, that is just an example.

I'm basically using a CTE to select the data and then inserting some rows into a new table.

I just don't know how to tell my function to process 2000 records at once, and then, when it is called again, to "know" where to start from.

Maybe I already have everything I need?

UPDATE tablea a SET migrated = true WHERE a.id = row.id;

In my original select, the row will have migrated = false. Maybe all I need to add is a LIMIT 2000 and the query will do the rest?

Example:

CREATE or REPLACE FUNCTION migrate_data()
RETURNS integer AS $$

declare
row record;

BEGIN

FOR row IN EXECUTE '
SELECT
id
FROM
tablea
WHERE
migrated = false
'
LOOP

INSERT INTO tableb (id)
VALUES (row.id);

UPDATE tablea a SET migrated = true WHERE a.id = row.id;

END LOOP;

RETURN num_rows; -- I want it to return the number of processed rows

END

$$ language 'plpgsql';

#5 David G. Johnston
david.g.johnston@gmail.com
In reply to: marcinha rocha (#4)
Re: Function with limit and offset - PostgreSQL 9.3

On Thursday, June 8, 2017, marcinha rocha <marciaestefanidarocha@hotmail.com>
wrote:

On my original select, the row will have migrated = false. Maybe All I
need to put is a limit 2000 and the query will do the rest?

You should try to avoid the for loop, but yes, a LIMIT 2000 on the for loop
query should work, since the migrated flag will ensure the same rows aren't
selected again.

David J.

#6 marcinha rocha
marciaestefanidarocha@hotmail.com
In reply to: David G. Johnston (#5)
Re: Function with limit and offset - PostgreSQL 9.3

On Thursday, June 8, 2017, marcinha rocha <marciaestefanidarocha@hotmail.com> wrote:

On my original select, the row will have migrated = false. Maybe All I need to put is a limit 2000 and the query will do the rest?

You should try to avoid the for loop,

Why?

but yes a limit 2000 on the for loop query should work since the migrated flag will ensure the same rows aren't selected again.

David J.

Ok, cool!

Now, how do I tell the function to return the number of touched rows? In this case, it should always be 2000.

Thanks!

#7 David G. Johnston
david.g.johnston@gmail.com
In reply to: marcinha rocha (#6)
Re: Function with limit and offset - PostgreSQL 9.3

On Thursday, June 8, 2017, marcinha rocha <marciaestefanidarocha@hotmail.com>
wrote:

On my original select, the row will have migrated = false. Maybe All I
need to put is a limit 2000 and the query will do the rest?

You should try to avoid the for loop,

Why?

Mainly expected performance concerns. The engine is designed to handle
result sets as opposed to single-row iteration. Whether it's true in your
case I don't know, but I would assume that operating on sets would be faster.

Ok, cool!

Now, how do I tell the function to return the number of touched rows? In
this case, it should always be 2000.

Unless there are fewer rows to process. You could always just do i = i + 1
in the loop.

David J.
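
A sketch of the counter idea, assuming the loop is kept and a LIMIT 2000 is added to its query. The variable names r and processed are hypothetical, and a static query stands in for the EXECUTE string since nothing in it is dynamic:

```sql
CREATE OR REPLACE FUNCTION migrate_data()
RETURNS integer AS $$
DECLARE
    r record;
    processed integer := 0;   -- rows handled in this call
BEGIN
    FOR r IN
        SELECT id
        FROM tablea
        WHERE migrated = false
        ORDER BY id
        LIMIT 2000
    LOOP
        INSERT INTO tableb (id) VALUES (r.id);
        UPDATE tablea SET migrated = true WHERE id = r.id;
        processed := processed + 1;   -- the "i = i + 1" step
    END LOOP;

    RETURN processed;   -- 2000, or fewer once the table runs out
END;
$$ LANGUAGE plpgsql;
```

Calling SELECT migrate_data(); repeatedly then works through the table one batch at a time, because rows already flagged as migrated no longer match the WHERE clause.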

#8 John R Pierce
pierce@hogranch.com
In reply to: marcinha rocha (#4)
Re: Function with limit and offset - PostgreSQL 9.3

On 6/8/2017 6:36 PM, marcinha rocha wrote:

UPDATE tablea a SET migrated = true WHERE a.id = row.id;
On my original select, the row will have migrated = false. Maybe All I
need to put is a limit 2000 and the query will do the rest?

SELECT does not return data in any determinate order unless you use an
ORDER BY, so LIMIT 2000 would return some 2000 elements, not
necessarily the 'first' 2000 elements, unless you order them by
however you feel 'first' is defined.

WITH ids AS (INSERT INTO tableb (id) SELECT id FROM tablea WHERE
migrated=FALSE ORDER BY id LIMIT 2000 RETURNING id),
upd AS (UPDATE tablea a SET migrated=TRUE FROM ids WHERE a.id = ids.id
RETURNING a.id)
SELECT COUNT(*) FROM upd;

You can't do UPDATE .... RETURNING COUNT(...), since aggregates aren't
allowed in RETURNING, so the UPDATE RETURNING goes in a second CTE and
the outer SELECT COUNT(*) counts the rows it returns.

--
john r pierce, recycling bits in santa cruz
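
One way to package this as a function that returns the count, sketched under the same assumptions about tablea and tableb. The function name migrate_batch and the batch_size parameter are hypothetical, and the return type is bigint because that is what count(*) produces:

```sql
CREATE OR REPLACE FUNCTION migrate_batch(batch_size integer DEFAULT 2000)
RETURNS bigint AS $$
    WITH ids AS (
        -- copy one batch of not-yet-migrated rows
        INSERT INTO tableb (id)
        SELECT id
        FROM tablea
        WHERE migrated = false
        ORDER BY id
        LIMIT batch_size
        RETURNING id
    ),
    upd AS (
        -- flag exactly the rows that were copied
        UPDATE tablea a
        SET migrated = true
        FROM ids
        WHERE a.id = ids.id
        RETURNING a.id
    )
    SELECT count(*) FROM upd;   -- number of rows processed in this call
$$ LANGUAGE sql;

-- Usage: SELECT migrate_batch();  or  SELECT migrate_batch(500);
```

The result drops below batch_size once the remaining rows run out, and is 0 when nothing is left to migrate.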