PL/PGSQL + inserts+updates+limit - Postgres 9.3

Started by Patrick B almost 10 years ago · 24 messages · general
#1Patrick B
patrickbakerbr@gmail.com

Hi guys,

I need a function ( PL/PGSQL ) to perform a deletion of some BLOBS...

I have four tables:

*- original_table1_b =* Original table, where the BLOBS are

*- table1_n_b =* Table where everything related to the BLOBS is stored
(file_id, account_id, note_id, etc)
*- table2_y_b =* Table BACKUP - The blobs+data will be copied to here
before being deleted
*- table3_n_b =* On the *table1_n_b*, each blob is related to a note_id.
Each note_id has three different file_id. I want to delete just the
greatest one. So in this *table3_n_b* table I'm storing the greatest
file_id (by size)

How is the *table3_n_b* table created:

SELECT * INTO table3_n_b FROM (
    SELECT account_id, note_id, st_ino, size FROM
    (
        SELECT DISTINCT ON (note_id)
            note_id,
            MAX(size),
            file_id,
            id
        FROM table1_n_b
        GROUP BY note_id, size, file_id, id
        ORDER BY note_id, size DESC
    ) AS r1
) AS r2;
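As an aside, the DISTINCT ON / MAX / GROUP BY combination above is redundant: grouping by size makes the MAX a no-op. Assuming the goal is one row per note_id carrying its largest file, a plain DISTINCT ON is enough (a sketch using the column names above):

    -- One row per note_id: DISTINCT ON keeps the first row of each
    -- note_id group, and the ORDER BY makes that first row the largest.
    SELECT DISTINCT ON (note_id)
           account_id, note_id, file_id, size
    INTO   table3_n_b
    FROM   table1_n_b
    ORDER  BY note_id, size DESC;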

The function must perform the following:

1 - Copy *note_id + size + file_id + full_path* from the *table1_n_b* table
into the new *table2_y_b* one, but only those file_id that are the
greatest, so here we use the table created above, *table3_n_b*:

- Something like this?

INSERT INTO table2_y_b (note_id, size, file_id, full_path)
(
    SELECT
        t1.note_id,
        t1.size,
        t1.file_id,
        t1.full_path
    INTO
        table2_y_b
    FROM
        table1_n_b t1
    JOIN
        table3_n_b t3 ON t3.file_id = t1.file_id
)

2 - Once the blobs' metadata is inside the *table2_y_b* table, we can now
copy the blobs themselves into the same table.

- something like this?

INSERT INTO table2_y_b (data)

(
SELECT
o1.data
FROM
original_table1_b o1
JOIN
table3_n_b t3 ON t3.file_id = o1.file_id
)

3 - Changing the table2_y_b.migrated column from 0 to 1 (1 means the blob
has already been copied):

FOR crtRow IN execute
'UPDATE table2_y_b SET migrated = 1 WHERE file_id = crtRow.file_id AND
migrated = 0 ' || $1 ||' offset '||

4 - After we have a backup of the blobs+data, we can now delete the blob
(setting the column to NULL)

FOR crtRow IN execute

'UPDATE original_table1_b SET data = NULL WHERE file_id = crtRow.file_id '

|| $1 ||' offset '||

*This is what I've done so far:*

CREATE or REPLACE FUNCTION function_1_name(rows integer)
RETURNS INTEGER AS $$
declare
    completed integer;
    crtRow record;
BEGIN
    offset_num = 0;

    -- Copying the data into the table which will store the data+blobs
    FOR crtRow IN execute
    'INSERT INTO table2_y_b (note_id, size, file_id, full_path)
    (
        SELECT
            t1.note_id,
            t1.size,
            t1.file_id,
            t1.full_path
        INTO
            table2_y_b
        FROM
            table1_n_b t1
        JOIN
            table3_n_b t3 ON t3.file_id = t1.file_id
    ) ' || $1 ||' offset '||

    -- Copying the BLOBS
    FOR crtRow IN execute
    'INSERT INTO table2_y_b (data)
    (
        SELECT
            o1.data
        FROM
            original_table1_b o1
        JOIN
            table3_n_b t3 ON t3.file_id = o1.file_id
        JOIN
            table2_y_b t2 ON t2.file_id = o1.file_id
        WHERE
            t2.migrated = 0
    ) ' || $1 ||' offset '||

    -- Update the migrated column from 0 to 1, for those rows that have
    -- been modified/copied.
    FOR crtRow IN execute
    'UPDATE table2_y_b SET migrated = 1 WHERE file_id = crtRow.file_id AND
    migrated = 0 ' || $1 ||' offset '||

    FOR crtRow IN execute
    'UPDATE original_table1_b SET data = NULL WHERE file_id = crtRow.file_id '
    || $1 ||' offset '||

    RETURN file_id;
END
$$ language 'plpgsql';

Am I doing this right?
When I call the function, e.g. *select function_1_name(5000)* or *select
function_1_name(15000)*, will it respect the limit on the number of rows?

#2David G. Johnston
david.g.johnston@gmail.com
In reply to: Patrick B (#1)
Re: PL/PGSQL + inserts+updates+limit - Postgres 9.3

On Wed, Jun 1, 2016 at 8:10 PM, Patrick Baker <patrickbakerbr@gmail.com>
wrote:

Hi guys,

I need a function ( PL/PGSQL ) to perform a deletion of some BLOBS...

​[...]

When I will call the function: *select function_1_name(5000) or **select
function_1_name(15000)* will it respect the limited by the rows?

​I'd suggest you setup a test environment with some unimportant data on a
non-production machine and try it yourself.

​​

David J.

#3Patrick B
patrickbakerbr@gmail.com
In reply to: David G. Johnston (#2)
Re: PL/PGSQL + inserts+updates+limit - Postgres 9.3

​I'd suggest you setup a test environment with some unimportant data on a
non-production machine and try it yourself.

​​

David J.

Thanks.. but if I'm asking the list, that's because I'm already testing it
and it's not working... ;)

Patrick

#4David G. Johnston
david.g.johnston@gmail.com
In reply to: Patrick B (#3)
Re: PL/PGSQL + inserts+updates+limit - Postgres 9.3

On Wednesday, June 1, 2016, Patrick Baker <patrickbakerbr@gmail.com> wrote:

​I'd suggest you setup a test environment with some unimportant data on a
non-production machine and try it yourself.

​​

David J.

Thanks.. but if I'm asking the list that's because I'm already testing it
and it's not working... ;)

Are you getting errors? A quick look seemed like it shouldn't even run due
to syntax problems.

David J.

#5Adrian Klaver
adrian.klaver@aklaver.com
In reply to: Patrick B (#1)
Re: PL/PGSQL + inserts+updates+limit - Postgres 9.3

On 06/01/2016 05:10 PM, Patrick Baker wrote:

Hi guys,

I need a function ( PL/PGSQL ) to perform a deletion of some BLOBS...

[...]

Am I doing this right?
When I call the function, e.g. *select function_1_name(5000)* or *select
function_1_name(15000)*, will it respect the limit on the number of rows?

I may be missing it, but I see no LIMIT in the function.

I do see OFFSET and it looks backwards to me:

|| $1 ||' offset '||

https://www.postgresql.org/docs/9.5/static/sql-select.html

LIMIT Clause

The LIMIT clause consists of two independent sub-clauses:

LIMIT { count | ALL }
OFFSET start

Also I'm not sure what offset_num is supposed to do; it is declared but
not used?
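For reference, in the dynamic query the row count would follow LIMIT and the starting position would follow OFFSET, and the clause belongs in a SELECT, not in a bare UPDATE. A minimal sketch of the intended shape, using the variables from the snippets above:

    -- Count after LIMIT, start position after OFFSET, inside a SELECT:
    FOR crtRow IN EXECUTE
        'SELECT file_id
         FROM table1_n_b
         ORDER BY file_id
         LIMIT ' || $1 || ' OFFSET ' || offset_num
    LOOP
        -- per-row work goes here, keyed on crtRow.file_id
    END LOOP;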

--
Adrian Klaver
adrian.klaver@aklaver.com

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

#6Patrick B
patrickbakerbr@gmail.com
In reply to: Adrian Klaver (#5)
Re: PL/PGSQL + inserts+updates+limit - Postgres 9.3

I may be missing it, but I see no LIMIT in the function.

[...]

Also I'm not sure what offset_num is supposed to do; it is declared but
not used?

Yep.. it's declared but it's not used..

Hmm.. interesting, that point about the LIMIT clause.
However.. I had to change the function...

Here is the new one:

CREATE or REPLACE FUNCTION function_data_1(rows integer)
RETURNS INTEGER AS $$
declare
    completed integer;
    offset_num integer;
    crtRow record;
BEGIN
    offset_num = 0;

    INSERT INTO table2_y_b (note_id, size, file_id, full_path)
    (
        SELECT
            t1.note_id,
            t1.size,
            t1.file_id,
            t1.full_path
        FROM
            table1_n_b t1
        JOIN
            table3_n_b t3 ON t3.file_id = t1.file_id
    );

    UPDATE table2_y_b t2 SET segment_data =
    (
        SELECT
            o1.data
        FROM
            original_table1_b o1
        JOIN
            table3_n_b t3 ON t3.file_id = o1.file_id
        WHERE
            t2.migrated = 0
        AND
            t2.file_id = o1.file_id
    );

    UPDATE table2_y_b SET migrated = 1 WHERE file_id = crtRow.file_id AND
    migrated = 0;

    UPDATE original_table1_b SET data = NULL WHERE file_id = crtRow.file_id;
END
$$ language 'plpgsql';

It's all working except the LIMIT... if possible, can you please give me
an example of that LIMIT in some of those queries?

Thanks

#7Adrian Klaver
adrian.klaver@aklaver.com
In reply to: Patrick B (#6)
Re: PL/PGSQL + inserts+updates+limit - Postgres 9.3

On 06/01/2016 10:04 PM, Patrick Baker wrote:

I may be missing it, but I see no LIMIT in the function.

[...]

Here is the new one:

[...]

It's all working except the LIMIT... if possible, can you please give
me an example of that LIMIT in some of those queries?

I do not know what you are trying to LIMIT/OFFSET, so I have no idea where
to place the LIMIT/OFFSET.

Maybe an example query showing what you are trying to do will help?


--
Adrian Klaver
adrian.klaver@aklaver.com


#8David G. Johnston
david.g.johnston@gmail.com
In reply to: Patrick B (#6)
Re: PL/PGSQL + inserts+updates+limit - Postgres 9.3

On Thu, Jun 2, 2016 at 1:04 AM, Patrick Baker <patrickbakerbr@gmail.com>
wrote:

It's all working, except the LIMIT... if possible can you please give

me an example of that LIMIT in some of those queries?

​​

You should also use ORDER BY when using LIMIT and OFFSET, though depending
on the setup it could be omitted. Usually, as long as the second execution
cannot select any of the records the first execution touched, you can choose
a random quantity. But if you want random then using OFFSET is pointless.

​SELECT *
FROM generate_series(1, 10)
ORDER BY 1
LIMIT 5
OFFSET 3

generate_series
----------------------
4
5
6
7
8

You are going to have difficulty finding people willing to help when you
cannot put together a self-contained and syntax-error-free example (I think
the last one is...) of what you want to do. The PostgreSQL parser is very
good at reading code and telling you what it doesn't like. I'm not
inclined to spend time reading queries that obviously cannot run and point
out those same problems. If you get a particular error you don't
understand, I'll be happy to try and explain what it is trying to tell you.

​You probably need to reformulate your update to read:

UPDATE tbl
FROM (
SELECT 50 RECORDS
)​ src
WHERE src = tbl;

​And ensure that the 50 being selected each time through are a different 50.

Writeable CTEs will probably help here.

https://www.postgresql.org/docs/current/static/queries-with.html

​David J.

#9Patrick B
patrickbakerbr@gmail.com
In reply to: David G. Johnston (#8)
Re: PL/PGSQL + inserts+updates+limit - Postgres 9.3

2016-06-03 2:10 GMT+12:00 David G. Johnston <david.g.johnston@gmail.com>:

You should also use ORDER BY when using LIMIT and OFFSET [...]

Writeable CTEs will probably help here.

https://www.postgresql.org/docs/current/static/queries-with.html

David J.

Hi David.

The SQL inside the function works.... I'm just having a problem limiting
the query to the number of rows I want, and also teaching the update SQL
to touch only the records the other statements inside the function have
touched.

This is the function updated:

CREATE or REPLACE FUNCTION function_data_1(rows integer)
RETURNS INTEGER AS $$
declare
    completed integer;
    offset_num integer;
    crtRow record;
BEGIN
    offset_num = 0;

    INSERT INTO table2_y_b (note_id, size, file_id, full_path)
    (
        SELECT
            t1.note_id,
            t1.size,
            t1.file_id,
            t1.full_path
        FROM
            table1_n_b t1
        JOIN
            table3_n_b t3 ON t3.file_id = t1.file_id
    );

    UPDATE table2_y_b t2 SET segment_data =
    (
        SELECT
            o1.data
        FROM
            original_table1_b o1
        JOIN
            table3_n_b t3 ON t3.file_id = o1.file_id
        WHERE
            t2.migrated = 0
        AND
            t2.file_id = o1.file_id
    );

    UPDATE table2_y_b SET migrated = 1 WHERE file_id = crtRow.file_id AND
    migrated = 0;

    UPDATE original_table1_b SET data = NULL WHERE file_id = crtRow.file_id;
END
$$ language 'plpgsql';

- As you can see, the first *insert* inserts data into a new table from
another select. This query must be limited by the number of rows I'll
provide when calling the function; example:

select function_data_1(5000);
select function_data_1(60000);
select function_data_1(15000);

- The first *update* copies the BLOBS from the original_table1_b table
into the new one (as above). Here, I also need the query to know to touch
only those records that have been touched by the query above.

- The second *update* sets the table2_y_b.migrated column from 0 to 1,
telling me that that record has been touched by the query, so the next
call ( select function_data_1(60000); ) will already know that it does not
need to touch that record; example:

WHERE
    t2.migrated = 0

- The third and last *update* deletes the blobs (sets the blob column to
NULL) that have already been touched by the queries above.... Still, I
don't know how to tell Postgres to touch only the rows that have been
touched by the queries above....

#10David G. Johnston
david.g.johnston@gmail.com
In reply to: Patrick B (#9)
Re: PL/PGSQL + inserts+updates+limit - Postgres 9.3

On Thu, Jun 2, 2016 at 5:03 PM, Patrick Baker <patrickbakerbr@gmail.com>
wrote:

2016-06-03 2:10 GMT+12:00 David G. Johnston <david.g.johnston@gmail.com>:

[...]

Hi David.

The SQL inside the function works....

Really? You seem to have lost your FOR loop for starters, and your
RETURN statement, and a semi-colon after END, and I doubt crtRow.file_id
works; should I go on? So, yes, you can run the four individual SQL
statements correctly, but the function itself is bogus.

I'm just having a problem limiting the query to the number of rows I
want, and also teaching the update SQL to touch only the records the other
statements inside the function have touched.

[...]

- The third and last *update* deletes the blobs (sets the blob column to
NULL) that have already been touched by the queries above.... Still, I
don't know how to tell Postgres to touch only the rows that have been
touched by the queries above....

​Here's a fish - though you will still need to clean it.​

This is not tested, and I haven't ever built this exact query for real,
but it should work in theory...

--assumes that to be migrated records have previously had their migrated
flag set to 0

function name (number_of_rows_to_process integer)
LANGUAGE sql -- this no longer requires procedural logic so no need for
plpgsql
RETURNS SETOF bigint --returns the affected ids
AS $$
WITH the_records_I_want_to_affect AS (
-- pick N records to process
SELECT id, ...
FROM source_tbl
WHERE migrated = 0
​ORDER BY ...
LIMIT number_of_rows_to_process -- your function argument goes here
FOR UPDATE
)​,
migrate_the_data AS (
-- place a copy of them into the archive table
INSERT INTO migration_table
SELECT id, ...
FROM the_records_I_want_to_affect
RETURNING *
),
mark_as_migrated AS (
-- mark them as having been archived and nullify the blob data
UPDATE source_tbl
SET migrated = 1, data = null
FROM migrate_the_data recs
WHERE recs.id = source_tbl.id
RETURNING source_tbl.id
)
SELECT id FROM mark_as_migrated;
$$

I am sure a fully working version of this idiom is present in one or more
places on the internet. Feel free to search out fully working examples
with additional commentary.

You can make a FOR loop version of this work; people had to, many years
ago, before writable CTEs were implemented.

David J.

#11Adrian Klaver
adrian.klaver@aklaver.com
In reply to: Patrick B (#9)
Re: PL/PGSQL + inserts+updates+limit - Postgres 9.3

On 06/02/2016 02:03 PM, Patrick Baker wrote:

2016-06-03 2:10 GMT+12:00 David G. Johnston <david.g.johnston@gmail.com
<mailto:david.g.johnston@gmail.com>>:

Hi David.

The SQLs inside the function works.... I'm just having problem about
limiting the query to the number of rows I want, and also, to teach the
update SQL to only touch the records the other SQLs inside the function
have touched.

See notes inline.

This is the function updated:

CREATE or REPLACE FUNCTION function_data_1(rows integer)

RETURNS INTEGER AS $$

declare

completed integer;

offset_num integer;

crtRow record;

BEGIN

offset_num = 0;

INSERT INTO table2_y_b (note_id, size, file_id, full_path)
(
    SELECT
        t1.note_id,
        t1.size,
        t1.file_id,
        t1.full_path
    FROM
        table1_n_b t1
    JOIN
        table3_n_b t3 ON t3.file_id = t1.file_id
);

Why are you joining to table3_nb?
You do not use any fields from it.

How do you know what data in table1_n_b to get?
I see this grabbing the same information over and over again.

UPDATE table2_y_b t2 SET segment_data =
(
    SELECT
        o1.data
    FROM
        original_table1_b o1
    JOIN
        table3_n_b t3 ON t3.file_id = o1.file_id
    WHERE
        t2.migrated = 0
    AND
        t2.file_id = o1.file_id
);

UPDATE table2_y_b SET migrated = 1 WHERE file_id = crtRow.file_id AND
migrated = 0;

UPDATE original_table1_b SET data = NULL WHERE file_id = crtRow.file_id;

All the above would seem to be handled in a LOOP.
Grab the data from:

SELECT
    t1.note_id,
    t1.size,
    t1.file_id,
    t1.full_path
FROM
    table1_n_b t1

with suitable WHERE clause and use:

https://www.postgresql.org/docs/9.3/static/plpgsql-control-structures.html#PLPGSQL-RECORDS-ITERATING

to iterate over the results. As part of the iteration do your INSERT and
UPDATE using the RECORD.file_id. This includes setting migrated=1 and
data=NULL.
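A rough sketch of that loop, for illustration only (the table and column names are taken from the thread; placing the migrated flag on table3_n_b, the batch ordering, and the function name are assumptions here):

    CREATE OR REPLACE FUNCTION migrate_batch(batch_size integer)
    RETURNS integer AS $$
    DECLARE
        rec       record;
        processed integer := 0;
    BEGIN
        FOR rec IN
            SELECT t1.note_id, t1.size, t1.file_id, t1.full_path
            FROM table1_n_b t1
            JOIN table3_n_b t3 ON t3.file_id = t1.file_id
            WHERE t3.migrated = 0        -- skip rows handled by earlier calls
            ORDER BY t1.file_id
            LIMIT batch_size             -- honour the caller's row count
        LOOP
            -- back up the metadata and the blob in one statement
            INSERT INTO table2_y_b (note_id, size, file_id, full_path, data, migrated)
            SELECT rec.note_id, rec.size, rec.file_id, rec.full_path, o1.data, 1
            FROM original_table1_b o1
            WHERE o1.file_id = rec.file_id;

            -- mark the row done and null out the original blob
            UPDATE table3_n_b SET migrated = 1 WHERE file_id = rec.file_id;
            UPDATE original_table1_b SET data = NULL WHERE file_id = rec.file_id;

            processed := processed + 1;
        END LOOP;
        RETURN processed;
    END;
    $$ LANGUAGE plpgsql;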

[...]

--
Adrian Klaver
adrian.klaver@aklaver.com


#12Patrick B
patrickbakerbr@gmail.com
In reply to: Adrian Klaver (#11)
Re: PL/PGSQL + inserts+updates+limit - Postgres 9.3

Why are you joining to table3_nb?
You do not use any fields from it.

How do you know what data in table1_n_b to get?
I see this grabbing the same information over and over again.

SELECT * INTO table3_n_b FROM (
    SELECT account_id, note_id, file_id FROM
    (
        SELECT DISTINCT ON (note_id)
            note_id,
            MAX(size),
            file_id,
            company_id
        FROM table1_n_b
        GROUP BY note_id, size, file_id, company_id
        ORDER BY note_id, size DESC
    ) AS r1
) AS r2;

Because I just wanna touch the greatest file_id ( by size ) of each
note_id, and the file_id I must change is in table3.

That's why:

table3_n_b t3 ON t3.file_id = t1.file_id

[...]

All the above would seem to be handled in a LOOP.
Grab the data from:

[...]

with suitable WHERE clause and use:

https://www.postgresql.org/docs/9.3/static/plpgsql-control-structures.html#PLPGSQL-RECORDS-ITERATING

Hmm ok... but...

INSERT INTO table2_y_b (note_id, size, file_id, full_path)
(
    SELECT
        t1.note_id,
        t1.size,
        t1.file_id,
        t1.full_path
    FROM
        table1_n_b t1
    JOIN
        table3_n_b t3 ON t3.file_id = t1.file_id
);

I don't need anything else in the WHERE clause, as the *ON t3.file_id =
t1.file_id* is already doing what I need.... ( and it works.. I tested it )

to iterate over the results. As part of the iteration do your INSERT and
UPDATE using the RECORD.file_id. This includes setting migrated=1 and
data=NULL.

Yep.. that's the way I started doing this...

Can you please tell me if this would be right?

CREATE or REPLACE FUNCTION function_data_1()
RETURNS INTEGER AS $$
declare
    row record;
BEGIN
    -- copying the data to the backup table (not the blobs)
    -- Limiting in 5000 rows each call
    FOR row IN EXECUTE '
        INSERT INTO table2_y_b (note_id, size, file_id, full_path)
        (
            SELECT
                t1.note_id,
                t1.size,
                t1.file_id,
                t1.full_path
            FROM
                table1_n_b t1
            JOIN
                table3_n_b t3 ON t3.file_id = t1.file_id
            ORDER BY 1
            LIMIT 5000
        )'
    LOOP
        -- copying the blobs to the table above
        UPDATE table2_y_b t2 SET segment_data =
        (
            SELECT
                o1.data
            FROM
                original_table1_b o1
            JOIN
                table3_n_b t3 ON t3.file_id = o1.file_id
            WHERE
                t2.migrated = 0
            AND
                t2.file_id = o1.file_id
        )
        WHERE t2.file_id = row.file_id
    END LOOP;

    -- updating the migrated column from 0 to 1
    LOOP
        UPDATE table2_y_b t2
        SET migrated = 1
        WHERE t2.file_id = row.file_id
        AND migrated = 0
    END LOOP;

    LOOP
        UPDATE original_table1_b o1
        SET data = NULL
        WHERE o1.file_id = row.file_id;
    END LOOP;
END
$$ language 'plpgsql';


#13Patrick B
patrickbakerbr@gmail.com
In reply to: Patrick B (#12)
Re: PL/PGSQL + inserts+updates+limit - Postgres 9.3

I did:

CREATE or REPLACE FUNCTION function_1_data()
RETURNS INTEGER AS $$
declare
    row record;
BEGIN
    -- copying the data to the backup table (not the blobs)
    FOR row IN EXECUTE '
        SELECT
            t1.file_id
        FROM
            table1_n_b t1
        JOIN
            table3_n_b t3 ON t3.file_id = t1.file_id
        ORDER BY 1
        LIMIT 3'
    LOOP
        -- Creating the backup table with the essential data
        EXECUTE '
            INSERT INTO table2_y_b (note_id, size, file_id, full_path)
            (
                SELECT
                    t1.note_id,
                    t1.size,
                    t1.file_id,
                    t1.full_path
                FROM
                    table1_n_b t1
                JOIN
                    table3_n_b t3 ON t3.file_id = t1.file_id
            ) ';

        -- copying the blobs to the table above
        EXECUTE '
            UPDATE table2_y_b t2 SET data =
            (
                SELECT
                    o1.data
                FROM
                    original_table1_b o1
                JOIN
                    table3_n_b t3 ON t3.file_id = o1.file_id
                WHERE
                    t3.migrated = 0
                AND
                    t2.file_id = o1.file_id
            )
            WHERE t2.file_id = row.file_id ';

        -- updating the migrated column from 0 to 1
        EXECUTE '
            UPDATE table2_y_b t2
            SET migrated = 1
            WHERE t2.file_id = row.file_id
            AND migrated = 0 ';

        -- setting the blob as null
        EXECUTE '
            UPDATE original_table1_b o1
            SET data = NULL
            WHERE o1.file_id = row.file_id ';
    END LOOP;

    return row.file_id;
END
$$ language 'plpgsql';

*And I'm getting the error:*

missing FROM-clause entry for table "row"
WHERE t2.st_ino = row.st_ino

Why does that happen?
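For what it's worth, that error is expected: PL/pgSQL variables such as row.file_id are not substituted inside an EXECUTE string, so the server parses row as an unknown table. With dynamic SQL the value has to be passed explicitly, for example with USING (a sketch using one of the statements above):

    -- $1 in the string is bound to row.file_id by the USING clause
    EXECUTE 'UPDATE table2_y_b t2
             SET migrated = 1
             WHERE t2.file_id = $1
               AND migrated = 0'
    USING row.file_id;

Alternatively, since nothing in these statements is actually dynamic, dropping EXECUTE and writing the UPDATE directly in the loop body lets PL/pgSQL substitute row.file_id as usual.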

#14Patrick B
patrickbakerbr@gmail.com
In reply to: Patrick B (#13)
Re: PL/PGSQL + inserts+updates+limit - Postgres 9.3

Hi guys,

- The function works... All the data is updated as expected. However, when
I call the function for the second time, it touches the rows that had
already been touched by the previous call....

- It triplicates ( LIMIT 3 ) the records.

*Question:*

How can I make the function get the next 3 rows and not use the same
rows that have been used before?

Function updated:

CREATE or REPLACE FUNCTION function_data_1()
RETURNS SETOF bigint AS $$
declare
row record;
BEGIN
-- copying the data to the backup table (not the blobs)-- Limiting in
5000 rows each callFOR row IN EXECUTE '
SELECT
t1.file_id
FROM
table1 t1
JOIN
table3 t3 ON t3.file_id = t1.file_id
ORDER BY 1 LIMIT 3 '
LOOP
-- Creating the backup table with the essential dataINSERT INTO table2
(note_id, size, file_id, full_path)
(
SELECT
t1.note_id,
t1.size,
t1.file_id,
t1.full_path
FROM
table1 t1
JOIN
table3 t3 ON t3.file_id = t1.file_id
WHERE
t1.file_id = row.file_id
);
-- copying the blobs to the table above table2
UPDATE junk.table2 t2 SET data =
(
SELECT
o1.data
FROM
original_table1_b o1
JOIN
table3 t3 ON t3.file_id = o1.file_id
WHERE
t3.migrated = 0
AND
t2.file_id = o1.file_id
AND
o1.file_id = row.file_id
)
WHERE t2.file_id = row.file_id;
-- updating the migrated column from 0 to 1
UPDATE
table3 t2
SET
migrated = 1
WHERE
t2.file_id = row.file_id
AND
migrated = 0;
-- set the blobs as null
UPDATE
original_table1_b o1
SET
data = NULL
WHERE
o1.file_id = row.file_id;
END LOOP;
END
$$ language 'plpgsql';

#15David G. Johnston
david.g.johnston@gmail.com
In reply to: Patrick B (#14)
Re: PL/PGSQL + inserts+updates+limit - Postgres 9.3

On Thu, Jun 2, 2016 at 11:37 PM, Patrick Baker <patrickbakerbr@gmail.com>
wrote:

How can I make the function to gets the next 3 rows and not use the same
rows that have been used before?

WHERE migrated = 0

David J.
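David's one-line hint belongs in the FOR query: filtering on the migrated flag makes each call skip rows a previous call already archived. A sketch against the table names used in the function above:

```sql
FOR row IN
    SELECT t1.file_id
    FROM table1 t1
    JOIN table3 t3 ON t3.file_id = t1.file_id
    WHERE t3.migrated = 0   -- only rows not yet archived
    ORDER BY 1
    LIMIT 3
LOOP
    -- archive, copy blob, flag, null out, as before
END LOOP;
```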

#16Patrick B
patrickbakerbr@gmail.com
In reply to: David G. Johnston (#15)
Re: PL/PGSQL + inserts+updates+limit - Postgres 9.3

2016-06-03 15:50 GMT+12:00 David G. Johnston <david.g.johnston@gmail.com>:

On Thu, Jun 2, 2016 at 11:37 PM, Patrick Baker <patrickbakerbr@gmail.com>
wrote:

How can I make the function to gets the next 3 rows and not use the same
rows that have been used before?

WHERE migrated = 0

David J.

lol... that's right, David J. Thanks for that! It's working... ;)

One last thing: how do I select the number of rows that have been modified?

I mean, when doing: select function_data_1();

I want to get back the number of rows that have been touched.

Do you know how?

thanks again
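One common pattern (a sketch, not the thread's final answer): keep a counter in the function and return it, reading ROW_COUNT after each statement with GET DIAGNOSTICS. Note this requires changing the function to RETURNS bigint rather than SETOF bigint:

```sql
DECLARE
    touched bigint := 0;
    n bigint;
BEGIN
    -- ... inside the loop, after each UPDATE/INSERT:
    UPDATE table3 SET migrated = 1
    WHERE file_id = row.file_id AND migrated = 0;
    GET DIAGNOSTICS n = ROW_COUNT;  -- rows affected by the last statement
    touched := touched + n;
    -- ... after END LOOP:
    RETURN touched;
END;
```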

#17Adrian Klaver
adrian.klaver@aklaver.com
In reply to: Patrick B (#14)
Re: PL/PGSQL + inserts+updates+limit - Postgres 9.3

On 06/02/2016 08:37 PM, Patrick Baker wrote:

Hi guys,

- The function works... All the data is updated as expected. However,
when I call the function for the second time, it touches the rows
that had already been touched by the previous call....

- It triplicates ( LIMIT 3 ) the records.

*Question:*

How can I make the function to gets the next 3 rows and not use the same
rows that have been used before?

Function updated:


"
CREATE or REPLACE FUNCTION function_data_1()
RETURNS SETOF bigint AS $$

declare
row record;

BEGIN

-- copying the data to the backup table (not the blobs)
-- Limiting in 5000 rows each call
FOR row IN EXECUTE '
SELECT
t1.file_id
FROM
table1 t1
JOIN
table3 t3 ON t3.file_id = t1.file_id
ORDER BY 1 LIMIT 3 '
LOOP

-- Creating the backup table with the essential data
INSERT INTO table2 (note_id, size, file_id, full_path)
(
SELECT
t1.note_id,
t1.size,
t1.file_id,
t1.full_path
FROM
table1 t1
JOIN
table3 t3 ON t3.file_id = t1.file_id
WHERE
t1.file_id = row.file_id
);

......."

Aren't you repeating yourself? Why not:

CREATE or REPLACE FUNCTION function_data_1()
RETURNS SETOF bigint AS $$

declare
row record;

BEGIN

-- copying the data to the backup table (not the blobs)
-- Limiting in 5000 rows each call
FOR row IN EXECUTE '
SELECT
t1.note_id,
t1.size,
t1.file_id,
t1.full_path
FROM
table1 t1
JOIN
table3 t3 ON t3.file_id = t1.file_id
ORDER BY 1 LIMIT 3 '
LOOP

-- Creating the backup table with the essential data
INSERT INTO table2 (note_id, size, file_id, full_path)
VALUES (row.note_id, row.size, row.file_id, row.full_path);

.....

Still not seeing what the JOIN to table3 t3 gets you?

Anyway, the function works.

--
Adrian Klaver
adrian.klaver@aklaver.com

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

#18Patrick B
patrickbakerbr@gmail.com
In reply to: Adrian Klaver (#17)
Re: PL/PGSQL + inserts+updates+limit - Postgres 9.3

-- Creating the backup table with the essential data
INSERT INTO table2 (note_id, size, file_id, full_path)
VALUES (row.note_id, row.size, row.file_id, row.full_path);

.....

Still not seeing what the JOIN to table3 t3 gets you?

Any way the function works.

I changed the function to use row.note_id, row.size, etc... think it's more
intelligent that way! :)

Is there any way to create another function to restore the data?

Example:

select function_data_1_restore(123414);

Where 123414 = file_id

How can I tell the function to get the file_id that I'll insert into the
call?
Can you guys please tell me?

cheers

#19Adrian Klaver
adrian.klaver@aklaver.com
In reply to: Patrick B (#18)
Re: PL/PGSQL + inserts+updates+limit - Postgres 9.3

On 06/03/2016 12:23 AM, Patrick Baker wrote:

-- Creating the backup table with the essential data
INSERT INTO table2 (note_id, size, file_id, full_path)
VALUES (row.note_id, row.size, row.file_id, row.full_path);

.....

Still not seeing what the JOIN to table3 t3 gets you?

Any way the function works.

I changed the function to use row.note_id, row.size, etc... think it's
more intelligent that way! :)

Is there any way to create another function to restore the data back?

I am sure there is, but it will probably be more difficult than copying
the data in the first place. From your previous function there seem to
be a lot of moving parts. Unwinding those tables and any other data that
is dependent on those tables could be a chore.

Example:

select function_data_1_restore(123414);

Where 123414 = file_id

How can I tell the function to get the file_id that I'll insert into the
call?

That would depend on why and what you want to restore. The function is
going to need some sort of prompting from the user on what criteria to
use to determine the records to select and restore.

Can you please guys tell me?

My help would be to say, first sit down and draw out the dependencies
you have between the data and the various tables. Then work out an
outline form of how to walk the data back from those tables into its
original location(s).

cheers

--
Adrian Klaver
adrian.klaver@aklaver.com


#20David G. Johnston
david.g.johnston@gmail.com
In reply to: Adrian Klaver (#19)
Re: PL/PGSQL + inserts+updates+limit - Postgres 9.3

On Fri, Jun 3, 2016 at 3:16 PM, Adrian Klaver <adrian.klaver@aklaver.com>
wrote:

On 06/03/2016 12:23 AM, Patrick Baker wrote:

-- Creating the backup table with the essential data
INSERT INTO table2 (note_id, size, file_id, full_path)
VALUES (row.note_id, row.size, row.file_id, row.full_path);

.....

Still not seeing what the JOIN to table3 t3 gets you?

Any way the function works.

I changed the function to use row.note_id, row.size, etc... think it's
more intelligent that way! :)

Is there any way to create another function to restore the data back?

I am sure there is, but it will probably be more difficult then copying
that data in the first place. From your previous function there seems to be
lot of moving parts. Unwinding those tables and any other data that is
dependent on those tables could be a chore.

Adrian,

I don't think it's that bad. All that was done is updating a bytea (or
text...) field to NULL after saving the original contents elsewhere.
Restoring should be as simple as:

UPDATE tbl SET data = archived_data
FROM archive_tbl
WHERE tbl.file_id = archive_tbl.file_id
AND tbl.file_id = <user input>;

Updating the main migrated flag and cleaning up extraneous entries in the
archive would be simple. No rows in the main tables were added or removed.

Patrick,

You already wrote the archive function; you should be capable of at least
attempting to write its inverse.

If you are wondering how to pass the value 123414 in:

select function_data_1_restore(123414);

That would depend on the client. In psql you'd just type it in. In Java
you probably do something like:

stmt = conn.prepareStatement("SELECT function_data_1_restore(?)");
stmt.setInt(1, 123414);
stmt.execute();

David J.
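Putting David's UPDATE into function form gives the inverse Patrick asked about. This is only a sketch: table and column names follow the thread's examples, and p_file_id is an illustrative parameter name:

```sql
CREATE OR REPLACE FUNCTION function_data_1_restore(p_file_id bigint)
RETURNS void AS $$
BEGIN
    -- copy the archived blob back into the original table
    UPDATE original_table1_b o1
    SET data = t2.data
    FROM table2 t2
    WHERE t2.file_id = o1.file_id
      AND o1.file_id = p_file_id;

    -- reset the flag so the archiver may pick the row up again
    UPDATE table3 SET migrated = 0 WHERE file_id = p_file_id;
END
$$ LANGUAGE plpgsql;
```

Called as `select function_data_1_restore(123414);`, matching the example earlier in the thread.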
