Improving performance of merging data between tables

Started by Pawel Veselov · over 11 years ago · 9 messages · general
#1 Pawel Veselov
pawel.veselov@gmail.com

Hi.

I was wondering if anybody would have any ideas on how to improve certain
operations that we are having.

PostgreSQL 9.3.3. Table layout:

main_table: PK, N key columns, M data columns. The key columns are bound in
a unique key together. PK is pure sequence number. There are few separate
non-unique indices on some of the key columns, to aid in querying the table
data.

second_table: PK, main_table_PK_REF (declared as foreign key), Na key
columns, Ma data columns. There are 0-inf (typically 0-10) entries in
second_table that reference a single entry in main_table. PK is pure
sequence number, and unique key for that table is (main_table_PK_REF + key
columns). The only non-obvious thing here is that some of the unique
indexes involve coalesce() function:

"uq_brkioevent" UNIQUE, btree (mainid, blockid, name, subid,
(COALESCE(cname, ''::character varying)))

Those are there because we need to write NULLs into the tables, but unique
indexes treat NULLs as distinct, so a plain unique index would not prevent
duplicate rows that differ only in a NULL column.
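A minimal illustration of that behavior (table and names here are invented
for the example, not the poster's schema):

```sql
-- Under a plain unique index, NULLs are never considered equal,
-- so these two rows both insert successfully:
CREATE TABLE t (a int, cname varchar);
CREATE UNIQUE INDEX uq_plain ON t (a, cname);
INSERT INTO t VALUES (1, NULL);
INSERT INTO t VALUES (1, NULL);  -- no unique_violation

-- Wrapping the nullable column in coalesce() maps NULL to '' for
-- uniqueness purposes, so the duplicate is now rejected:
DROP INDEX uq_plain;
DELETE FROM t;
CREATE UNIQUE INDEX uq_coalesce ON t (a, (coalesce(cname, '')));
INSERT INTO t VALUES (1, NULL);
INSERT INTO t VALUES (1, NULL);  -- ERROR: duplicate key value
```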

There is one main_table, and 10 second_tables.

The main table currently has ~1.1M rows; the second tables have about 1M-2M
rows each. The main table grows by a fixed number of rows (~10) per hour.

Multiple (web) application nodes need to write data into this table. Most
of the operations modify the data columns rather than insert new data. We
had serious contention problems when we let all the application nodes write
directly into the table: the writes use prepared transactions, a prepared
transaction can only be reaped after other data stores are written to, and
there is a very high chance that nodes will need to modify the same rows,
so the locking was taking too long.

To relieve the contention, we have allocated each application node its own
set of tables that are structurally identical to the main/second tables. An
application node opens a transaction, writes all the data into its own
tables, free of locks, and then calls a pgsql function to merge the data
from its tables into the common tables. There is typically relatively
little data in the node tables (within 100 rows in any table) before it's
merged into the common tables. Nodes dump their data whenever there is
something to dump, which can be a few times a second.

Recently, the operation that moves the data from the node tables into the
common tables started being a real drain on the PostgreSQL server CPU. I
assume this is probably due to the data set size reaching some critical
mass. Things really got out of hand when we had to double the number of
application nodes to accommodate a surge in application use.

The merging is done in the following manner.
Each main/second table has an associated PL/pgSQL function (merge_xxx) that
takes key and data values as arguments. In an endless loop, it first tries
an UPDATE statement (using math operations to combine the existing and
input data values, with the key data in the WHERE clause). If the UPDATE
sets "found", the function exits. Otherwise, the function tries to INSERT
the key/data values. If that succeeds, the function exits; if
unique_violation is thrown, the loop continues.
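A sketch of such a merge_xxx function, following the classic pre-9.5 upsert
loop pattern from the PostgreSQL documentation (table and column names here
are placeholders, not the poster's actual schema):

```sql
CREATE OR REPLACE FUNCTION merge_xxx(p_key int, p_delta bigint)
RETURNS void AS $$
BEGIN
    LOOP
        -- first try to update an existing row, combining old and new data
        UPDATE main_table SET data = data + p_delta WHERE key = p_key;
        IF found THEN
            RETURN;
        END IF;
        -- no row yet: try to insert; if another session inserts the same
        -- key concurrently, catch the error and retry the UPDATE
        BEGIN
            INSERT INTO main_table (key, data) VALUES (p_key, p_delta);
            RETURN;
        EXCEPTION WHEN unique_violation THEN
            -- do nothing, and loop to try the UPDATE again
        END;
    END LOOP;
END;
$$ LANGUAGE plpgsql;
```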

On top of these individual functions, there is another PL/pgSQL function
(merge_all). It uses a "for ROW in select * from MAIN_NODE" outer loop;
within that loop it calls merge_xxx for the main table, and then, for each
secondary table, runs a similar "for ROWx in select * from SECOND_NODE"
loop with a WHERE clause that only picks up entries corresponding to the
main_node entry currently being processed, calling merge_xxx for that
secondary table. At the end of the outer loop, all data from the node
tables is removed (using DELETE). I will gladly provide pseudo-code, or
even the function body, if my explanation is unclear.

Besides "can somebody please look at this and let me know if I'm doing
something utterly stupid", here are my questions.

1) How do I find out what exactly is consuming the CPU in a PL/pgSQL
function? All I see is that the calls to merge_all() function take long
time, and the CPU is high while this is going on.

2) Is there a better way to merge individual rows, except doing
UPDATE/INSERT in a loop, and would that be CPU expensive?

3) Is there a better way to merge whole tables? However, note that I need
to translate primary keys from node main table into the common main table,
as they are used as foreign keys, hence the loops. I suspect the looping is
CPU intensive.

Thank you,
Pawel.

#2 Andy Colson
andy@squeakycode.net
In reply to: Pawel Veselov (#1)
Re: Improving performance of merging data between tables

On 12/28/2014 3:49 PM, Pawel Veselov wrote:

Hi.

I was wondering if anybody would have any ideas on how to improve
certain operations that we are having.

<SNIP>

Besides "can somebody please look at this and let me know if I'm doing
something utterly stupid", here are my questions.

1) How do I find out what exactly is consuming the CPU in a PL/pgSQL
function? All I see is that the calls to merge_all() function take long
time, and the CPU is high while this is going on.

First, I'll admit I didn't read your entire post.

I can think of a couple methods:

1) try each of the statements in merge_all by hand with an "explain
analyze" in front to see which is slow. Look for things that hit big
tables without an index. Check that fk lookups are indexed.

2) try pg_stat_statements, setting "pg_stat_statements.track = all". see:
http://www.postgresql.org/docs/9.4/static/pgstatstatements.html

I have used this to profile some functions, and it worked pretty well.
Mostly I use it on a test box, but I once ran it on the live server, which
was scary, but worked great.

3) try auto-explain:
http://www.postgresql.org/docs/9.4/static/auto-explain.html

I've never used it, so I don't know if it'll show each statement inside a
function. It dumps stuff to the log AFAIK, so you'll have to dig out the
info by hand.

2) Is there a better way to merge individual rows, except doing
UPDATE/INSERT in a loop, and would that be CPU expensive?

Not that I know of. I use pretty much the same thing. Soon! we will
have merge/upsert support. Hopefully it'll be fast.

3) Is there a better way to merge whole tables? However, note that I
need to translate primary keys from node main table into the common main
table, as they are used as foreign keys, hence the loops. I suspect the
looping is CPU intensive.

Avoiding loops and doing things as sets is the best way. If possible.
The only loop I saw was looping over the merge_xxx tables, which is
probably the only way.

If possible (if you haven't already) you could add an extra column to
your secondary table that you can set to the main table's key.

bulk insert into second;
update second set magic = (select key from main where ... );

Then, maybe, you can do two ops in batch:

update main (where key exists in main)
insert into main (where key not exists in main)
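Spelled out as SQL, those two batch operations might look roughly like this
(table/column names and the combining function f() are placeholders, not
the poster's actual schema):

```sql
-- batch update of rows whose key already exists in main
UPDATE main m
SET    val = f(m.val, s.val)
FROM   second s
WHERE  s.magic = m.key;

-- batch insert of the remaining rows
INSERT INTO main (key, val)
SELECT s.key, s.val
FROM   second s
WHERE  NOT EXISTS (SELECT 1 FROM main m WHERE m.key = s.key);
```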

-Andy

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

#3 Pawel Veselov
pawel.veselov@gmail.com
In reply to: Andy Colson (#2)
Re: Improving performance of merging data between tables

Andy,

thanks for looking into this.

On Mon, Dec 29, 2014 at 9:00 AM, Andy Colson <andy@squeakycode.net> wrote:

On 12/28/2014 3:49 PM, Pawel Veselov wrote:

Hi.

I was wondering if anybody would have any ideas on how to improve
certain operations that we are having.

<SNIP>

Besides "can somebody please look at this and let me know if I'm doing
something utterly stupid", here are my questions.

1) How do I find out what exactly is consuming the CPU in a PL/pgSQL
function? All I see is that the calls to merge_all() function take long
time, and the CPU is high while this is going on.

First, I'll admit I didn't read your entire post.

I can think of a couple methods:

1) try each of the statements in merge_all by hand with an "explain
analyze" in front to see which is slow. Look for things that hit big
tables without an index. Check that fk lookups are indexes.

If I didn't miss anything, that seems to be OK, even on function-based
queries.

2) try pg_stat_statements, setting "pg_stat_statements.track = all". see:
http://www.postgresql.org/docs/9.4/static/pgstatstatements.html

I have used this to profile some functions, and it worked pretty well.
Mostly I use it on a test box, but once ran it on the live, which was
scary, but worked great.

That looks promising. Turned it on, waiting for when I can restart the
server at the next "quiet time".

3) try auto-explain:
http://www.postgresql.org/docs/9.4/static/auto-explain.html

I've never used it, so don't know if it'll show each statement inside a
function. Dumps stuff to the log AFAIK, so you'll have to dig out the info
by hand.

2) Is there a better way to merge individual rows, except doing
UPDATE/INSERT in a loop, and would that be CPU expensive?

Not that I know of. I use pretty much the same thing. Soon! we will
have merge/upsert support. Hopefully it'll be fast.

Well, anytime I cancelled the PID that was executing this whole mess, it
would always stop at UPDATE ... SET ... WHERE on the main table. Which does
make me believe that bulk update would really help.

3) Is there a better way to merge whole tables? However, note that I
need to translate primary keys from node main table into the common main
table, as they are used as foreign keys, hence the loops. I suspect the
looping is CPU intensive.

Avoiding loops and doing things as sets is the best way. If possible. The
only loop I saw was looping over the merge_xxx tables, which is probably
the only way.

There is an endless loop that is just a device for merging, but then there
are loops going over each record in all the tables that are being merged,
feeding them into the function that actually does the merge. That table
iteration is what I want to eliminate (especially if I knew it would help
:) )

If possible (if you haven't already) you could add and extra column to
your secondary table that you can set as the main table's key.

bulk insert into second;
update second set magic = (select key from main where ... );

Then, maybe, you can do two ops in batch:

update main (where key exists in main)
insert into main (where key not exists in main)

I was thinking along the same lines. I can't really do a bulk insert at any
point, because any key can be inserted by another process at any time, and
with a good probability. However, there will be a lot fewer inserts than
updates. So, in general, I'm making it do this:

with pivot as ( select main_table.id as main_id, node_table.id as node_id
from node_table left join main_table using (key fields) )
update node_table set translate_id = pivot.main_id from pivot
where node_table.id = pivot.node_id;

(missing is a cursor: select from node_table where translate_id is null)

for row in missing loop
-- merge_function will return the PK of either the updated or inserted
record.
-- use (0) data values, so that it's an identity update if the merge
-- results in an update, or "empty" data if not.
select merge_function(missing.key_fields, 0) into use_id;
update node_table set translate_id = use_id where current of missing;
end loop

At this point, I have a guarantee that I can update all records, and there
is nothing to insert.
So,

with new as ( select * from node_table )
update main_table old
set val = f(old.val, new.val)
from new
where new.translate_id = old.id
So, I don't need full key matching anymore, I can use PKs instead.

#4 Andy Colson
andy@squeakycode.net
In reply to: Pawel Veselov (#3)
Re: Improving performance of merging data between tables

On 12/29/2014 11:29 PM, Pawel Veselov wrote:

<SNIP>
I was thinking along the same lines. I can't really do bulk insert, at any point, because any key can be inserted by another process at any time, and with a good probability. However, there will be a lot less inserts than updates. So, in general, I'm making it do this:

What about transactions? I assume you do something like:

begin;
merge_all;
commit;

Depending on your transaction isolation level, then you could ensure that nobody could insert while you are inserting. I've never used the different isolation levels, so not 100% sure. Even then, maybe a lock on the table itself, like:

begin;
lock;
merge_all;
unlock;
commit;

This way only one process at a time can do the insert/update, but you can do them in batch and not in a loop. It might be faster that way. Other processes might wait for the lock a little bit, but if merge_all was faster in general, the time waiting for the lock would be less than the entire merge process itself.
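In actual PostgreSQL syntax the sketch above would be something like the
following; note there is no separate "unlock" statement, since the lock is
held until COMMIT (the lock mode chosen here is an assumption):

```sql
BEGIN;
-- block other writers (including other merge_all callers),
-- but still allow plain SELECTs against the table
LOCK TABLE main_table IN SHARE ROW EXCLUSIVE MODE;
SELECT merge_all();
COMMIT;  -- the lock is released here
```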

I'm totally guessing here.

-Andy


#5 Pawel Veselov
pawel.veselov@gmail.com
In reply to: Pawel Veselov (#3)
Re: Improving performance of merging data between tables

On Mon, Dec 29, 2014 at 9:29 PM, Pawel Veselov <pawel.veselov@gmail.com>
wrote:

[skipped]

1) How do I find out what exactly is consuming the CPU in a PL/pgSQL
function? All I see is that the calls to merge_all() function take long
time, and the CPU is high while this is going on.

[skipped]

2) try pg_stat_statements, setting "pg_stat_statements.track = all". see:

http://www.postgresql.org/docs/9.4/static/pgstatstatements.html

I have used this to profile some functions, and it worked pretty well.
Mostly I use it on a test box, but once ran it on the live, which was
scary, but worked great.

That looks promising. Turned it on, waiting for when I can turn the server
at the next "quiet time".

I have to say this turned out to be a bit of a disappointment for this use
case. It only measures total time spent in a call, so operations that
waited a long time on some lock float to the top. It's good, but it would
be great if total_time were provided along with wait_time (and maybe
io_time as well, since I also see operations that just naturally have to
fetch a lot of data)

[skipped]

#6 Maxim Boguk
maxim.boguk@gmail.com
In reply to: Pawel Veselov (#5)
Re: Improving performance of merging data between tables

On Wed, Dec 31, 2014 at 11:10 AM, Pawel Veselov <pawel.veselov@gmail.com>
wrote:

[skipped]

2) try pg_stat_statements, setting "pg_stat_statements.track = all". see:

http://www.postgresql.org/docs/9.4/static/pgstatstatements.html

I have used this to profile some functions, and it worked pretty well.
Mostly I use it on a test box, but once ran it on the live, which was
scary, but worked great.

That looks promising. Turned it on, waiting for when I can turn the
server at the next "quiet time".

I have to say this turned out into a bit of a disappointment for this use
case. It only measures total time spent in a call. So, it sends up
operations that waited a lot on some lock. It's good, but it would be great
if total_time was provided along with wait_time (and io_time may be as
well, since I also see operations that just naturally have to fetch a lot
of data)

1) pg_stat_statements provides information about the io_time of each
statement, but you need track_io_timing enabled for that.

2) About locking: I suggest enabling log_lock_waits and setting
deadlock_timeout to, say, 100ms (just for testing purposes), and then any
lock wait longer than 100ms will be logged with some useful additional info.
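The corresponding postgresql.conf fragment (values as suggested above;
track_io_timing from point 1 included for completeness):

```ini
log_lock_waits = on       # log any lock wait longer than deadlock_timeout
deadlock_timeout = 100ms  # testing value only; the default is 1s
track_io_timing = on      # needed for the io timing columns in pg_stat_statements
```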

PS: your setup looks pretty complicated and hard to analyze without seeing
all involved table structures, the transaction/query flow, and (especially)
the involved procedures' source code.

PPS: btw, please check the database logs for deadlock messages; the part of
your setup around "and then call a pgsql function to merge the data from
its tables into the common tables" could easily be deadlock-prone.

PPPS: and the last suggestion: after you have finished with the "write all
the data into its own tables" step, the application should perform an
ANALYZE of those tables (or you could get weird/inefficient plans during
the last stage).

--
Maxim Boguk
Senior Postgresql DBA
http://www.postgresql-consulting.ru/ <http://www.postgresql-consulting.com/&gt;

Phone RU: +7 910 405 4718
Phone AU: +61 45 218 5678

LinkedIn: http://www.linkedin.com/pub/maksym-boguk/80/b99/b1b
Skype: maxim.boguk
Jabber: maxim.boguk@gmail.com
МойКруг: http://mboguk.moikrug.ru/

"People problems are solved with people.
If people cannot solve the problem, try technology.
People will then wish they'd listened at the first stage."

#7 Pawel Veselov
pawel.veselov@gmail.com
In reply to: Maxim Boguk (#6)
Re: Improving performance of merging data between tables

On Tue, Dec 30, 2014 at 7:25 PM, Maxim Boguk <maxim.boguk@gmail.com> wrote:

On Wed, Dec 31, 2014 at 11:10 AM, Pawel Veselov <pawel.veselov@gmail.com>
wrote

[skipped]

2) try pg_stat_statements, setting "pg_stat_statements.track = all". see:

http://www.postgresql.org/docs/9.4/static/pgstatstatements.html

I have used this to profile some functions, and it worked pretty well.
Mostly I use it on a test box, but once ran it on the live, which was
scary, but worked great.

That looks promising. Turned it on, waiting for when I can turn the
server at the next "quiet time".

I have to say this turned out into a bit of a disappointment for this use
case. It only measures total time spent in a call. So, it sends up
operations that waited a lot on some lock. It's good, but it would be great
if total_time was provided along with wait_time (and io_time may be as
well, since I also see operations that just naturally have to fetch a lot
of data)

1) pg_stat_statements provide an information about io_time of each
statement but you should have track_io_timing enabled for that.

Enabled that now. Still the top winners are the functions that probably
lock for a long (relatively) time. This did help me find some crap that
either was missing an index, or used an unreasonable join, and just needed
re-writing. One entry that doesn't make sense to me is:

total_time - io_time = 1,366,773
calls = 666,542
query = SELECT * FROM q_SCHEDULER_STATE WHERE SCHED_NAME = ?

The table only has 18 rows; there is an index, but the planner chooses to
ignore it, which is right, since the sched_name column has the same value
for all rows, so all rows are returned by the SELECT. The time to run that
query under database load varies from 0.09 to 70ms.
This is a distraction from the main topic, though, but it does stand out as
odd.

2) About locking I suggest enable log_lock_waits and set deadlock_timeout
to say 100ms (just for testing purposes), and than any lock waiting more
than 100ms will be logged with some useful additional info.

PPS: btw, please check the database logs for deadlocks messages, your
setup around "and then call a pgsql function to merge the data from its
tables into the common tables" part could be easily deadlock prone.

I don't think I have an abnormal problem with locking. I wanted to
eliminate locking time from pg_stat_statements, to first address queries
that aren't waiting on disk and/or locks, as my problem is high CPU, not
specific query performance. I don't have deadlocks for sure -- I had them
before, and I would normally get an error if there was a deadlock. We
process all the records in exactly the same order of keys to avoid
deadlocks.

PPPS: and the last suggestion, after you finished with the "write all the

data into its own tables", then application should perform analyze of these
own tables (or you could have weird/inefficient plans during last stage).

Any references to back this up? I don't particularly mind doing it, but I
wonder if the analysis can be more expensive than the processing. These
tables get a few hundred records inserted/updated, then are entirely
processed (with expected full scans), and then deleted...

PS: your setup look pretty complicated and hard to analyze without seeing
all involved table structures, transaction/query flow, and (especially)
involved procedures source code.

Sure :) At this point, I've put together the "bulk merge" code as well. I
can't quite see much of a difference, actually, but it's hard to trust the
execution times, as on the same amount of data they vary from, say, 0.5s to
2s, and the sample data is not stepping on any other locks. In general, I'm
afraid all those left joins and multiple scans, even over a small amount of
data, are nullifying any positive effect.

primary table: http://pastebin.com/gE2TjZd3
secondary table(s): http://pastebin.com/aDVakUkp
There are actually 10 secondary tables, but they are more or less of the
same structure.
The node tables have identical structure to the main tables.

First stage, which I don't particularly question (though perhaps wrongly
so); the application:
- creates data suitable for the rows in the primary/secondary tables
- starts a transaction
- merges data into its own node tables (using merge_xxx PL/pgSQL functions)
(<100 rows in the primary table)
- prepares the transaction
- deals with other data sources
- commits/rolls back the prepared transaction depending on the success of
the previous step.

An example of a merge_xxx function: http://pastebin.com/6YYm8BVM

Second stage is really:
- start transaction
- call PL/pgSQL merge_all()
- commit

2 reasons for the 2 stages:
- if stage#2 fails, the data will be merged during the next iteration
- the lock time on the shared tables is minimized

It's possible that an external process may take over writing data for a
certain key subset (a combination of (tagid, blockid)). To make sure there
is no race condition with such a process, those key pairs are "locked";
that's what the whole r_locks table and get_r_lock() are about. This makes
it a bit more cumbersome for the bulk merge. Here are the r_lock related
pieces: http://pastebin.com/Y9NCemLV

This is the "old" code for merge_all function():
http://pastebin.com/5dn7WsvV
And this is the "new" code that I haven't finished testing, or put into
service, but that's the gist of the change I'm considering:
http://pastebin.com/XmgB5U0f

As always, any suggestions or pointers are greatly appreciated :)

#8 Maxim Boguk
maxim.boguk@gmail.com
In reply to: Pawel Veselov (#7)
Re: Improving performance of merging data between tables

On Wed, Jan 7, 2015 at 8:49 PM, Pawel Veselov <pawel.veselov@gmail.com>
wrote:

PPPS: and the last suggestion, after you finished with the "write all the

data into its own tables", then application should perform analyze of these
own tables (or you could have weird/inefficient plans during last stage).

Any references to back this up? I don't particularly mind doing it, but I
wonder if analysis can be more expensive the processing. These tables get a
few hundreds of records inserted/updated, then are entirely processed (with
expected full scans), and then deleted...


If these "own tables" are used only in full-table selects, but never in
joins, then there should be no issues.
However, once you start joining these tables with anything else, you could
get very inefficient/weird plans, because the database doesn't know
(without ANALYZE) how many rows these tables contain.

PS: your setup look pretty complicated and hard to analyze without seeing
all involved table structures, transaction/query flow, and (especially)
involved procedures source code.

Sure :) At this point, I've put together the "bulk merge" code as well. I
can't quite see much of a difference, actually, but it's hard to trust the
execution times, as on the same amount of data they vary from, say, 0.5s to
2s, and the sample data is not stepping on any other locks. In general, I'm
afraid all those left joins and multiple scans, even over small amount of
data, is nullifying any positive effect.


Now some ideas to check.
High CPU usage usually isn't related to locking, but to seq scans, wrong
plans, or simply inefficient PL/pgSQL code; locked processes usually don't
use much CPU.

1) on the test database, perform select pg_stat_reset(); then perform a
full round of merges, then check
select * from pg_stat_user_tables where seq_scan>0 order by seq_tup_read;
and if you find a lot of seq_scan and seq_tup_read on a particular table,
try to find where they are coming from (it could be the reason for the high
CPU usage).

2) enable track_functions in postgresql.conf and perform the same sequence
(select pg_stat_reset() + a full round of merges), then check
select * FROM pg_stat_user_functions order by self_time desc;
and see which function is using the most time.

3) old/lost prepared transactions can have a deadly effect on database
performance as a whole. So check select * from pg_prepared_xacts; and
verify that you don't have hours- (or weeks-) old prepared xacts lying
around.

PS: btw, I still don't fully understand the relation between:
"
- merges data into its own node tables (using merge_xxx PL/pgSQL functions)
"
and the provided code for public."merge_all02-9A-46-8B-C1-DD" and
PUBLIC.merge_agrio.
As far as I can see, public."merge_all02-9A-46-8B-C1-DD" calls
PUBLIC.merge_agrio, and PUBLIC.merge_agrio updates the global table R_AGRIO
(but not the "own node table").

I think the best implementation of such a task is asynchronous processing
of these changes via a background process. The application only inserts
events into a queue table (a lockless process), and some background process
reads the data from the queue table and merges it into the main table
(again lockless, because it is single-threaded, so there are no concurrent
writes), and then deletes the merged data from the queue table.
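A minimal sketch of that queue approach (all names invented; the
insert-if-missing path is omitted here and would still use the earlier
upsert logic):

```sql
-- applications only append; no row contention between writers
CREATE TABLE merge_queue (
    id    bigserial PRIMARY KEY,
    key   int    NOT NULL,
    delta bigint NOT NULL
);

-- single-threaded background merger: consume the queued rows, fold
-- them per key, and apply them to the main table in one statement
WITH batch AS (
    DELETE FROM merge_queue
    RETURNING key, delta
), folded AS (
    SELECT key, sum(delta) AS delta
    FROM batch
    GROUP BY key
)
UPDATE main_table m
SET    data = m.data + f.delta
FROM   folded f
WHERE  m.key = f.key;
```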


#9 Pawel Veselov
pawel.veselov@gmail.com
In reply to: Maxim Boguk (#8)
Re: Improving performance of merging data between tables

Sorry, it took me a while to respond, but I re-factored all of this process
according to the suggestions.

On Wed, Jan 7, 2015 at 7:49 PM, Maxim Boguk <maxim.boguk@gmail.com> wrote:

On Wed, Jan 7, 2015 at 8:49 PM, Pawel Veselov <pawel.veselov@gmail.com>
wrote:

PS: your setup look pretty complicated and hard to analyze without seeing

all involved table structures, transaction/query flow, and (especially)
involved procedures source code.

Sure :) At this point, I've put together the "bulk merge" code as well. I
can't quite see much of a difference, actually, but it's hard to trust the
execution times, as on the same amount of data they vary from, say, 0.5s to
2s, and the sample data is not stepping on any other locks. In general, I'm
afraid all those left joins and multiple scans, even over small amount of
data, is nullifying any positive effect.

Now some ideas to check.
The high CPU usage usually isn't related to locking, but related to seq
scan or wrong plans or simple inefficient pl/pgsql code, locked processes
usually doesn't use too much cpu.

1)on the test database perform select pg_stat_reset(); then perform full
round of merges, then check
select * from pg_stat_user_tables where seq_scan>0 order by seq_tup_read;
and if you find a lot of seq_scan and seq_tuple_reads on the particular
table try find where they coming from (it could be reason for high CPU
usage).

2)enable track_functions in postgresql.conf and perform the same sequence
(select pg_stat_reset() + full round of merges
) then check
select * FROM pg_stat_user_functions order by self_time desc;
and check which function using the most time.

These are good pointers; if the new process has the same sort of problems,
this will come in handy for figuring out where they are coming from, thank
you.

3) old/lost prepared transactions can have a deadly effect on database
performance as a whole. So check select * from pg_prepared_xacts; and
verify that you don't have hours- (or weeks-) old prepared xacts lying
around.

If there are "lost" prepared transactions, they will prevent a particular
instance from being able to write into its table data, so it will just
stall that node. But it does happen, and we have an application mechanism
to find and delete those.

PS: btw I still don't fully understood relation between the:
"
- merges data into its own node tables (using merge_xxx PL/pgSQL functions)
"
and provided code for the public."merge_all02-9A-46-8B-C1-DD" and
PUBLIC.merge_agrio.
As I see
public."merge_all02-9A-46-8B-C1-DD" calling PUBLIC.merge_agrio, and the
PUBLIC.merge_agrio updates a global table R_AGRIO (but not the "own node
table").

It's a bit irrelevant at this point, but: the merge_all02-9A-46-8B-C1-DD()
function takes all the data for the 02-9A-46-8B-C1-DD node and moves it
into the master table. There is an analogous merge_02-9A-46-8B-C1-DD()
function that takes data from the application and writes it into the tables
for the 02-9A-46-8B-C1-DD node. The process of moving data node
tables->main tables and application->node tables is nearly identical, hence
I only provided the body once. The big difference is that when merging into
the master, there is a lot more data to look through, as the node tables
only contain data that has not yet been merged into the master.

I think the best implementation of such task is asynchronous processing of
this changes via background process. An application only inserts events
into queue table (it lockless process), and some background process read
these data from queue table and merge it into main table (again lockless
because it single thread so no concurrent writes), and then delete the
merged data from queue table.

Well, that was a really good suggestion, thank you. Some weeks later, I've
put it together. This hasn't hit production yet, so I'm yet to see the
overall improvement effect. Along with turning it into a queue, I've added
provisions to try to combine as much data as possible before writing it out
into the database tables, and merged all of the satellite tables with the
main data.

Before, I had:
r_agrio
r_brk_xxx (multiple entries reference rows in r_agrio)

Now, I have:
r_agrio_daily
r_agrio_total
r_agrio_hourly

All the data that was in the r_brk_xxx tables is now in columns of the
r_agrio* tables. To get around the fact that there are potentially multiple
BRK records for each AGR record, the data is now stored as a JSON object.
The primary key used for the BRK tables is turned into a string that serves
as a key in a top-level JSON object. This should help me tremendously on
the side that needs to read the data, as I previously had to join or left
join the BRK tables.
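For illustration, the consolidated layout might look something like this
(all names invented; on 9.3 the column type would be json, since jsonb
only arrived in 9.4):

```sql
CREATE TABLE r_agrio_hourly (
    id      bigserial PRIMARY KEY,
    tagid   int  NOT NULL,
    blockid int  NOT NULL,
    -- former r_brk_xxx rows, keyed by a string built from their old
    -- unique-key columns, e.g.:
    -- {"name|subid|cname": {"count": 5, "value": 123}, ...}
    breaks  json NOT NULL DEFAULT '{}',
    UNIQUE (tagid, blockid)
);
```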

Splitting this into 3 tables may come back to bite me, since it's two more
inserts and corresponding lookups, but it seriously helps me on the reading
side of things.

The code that aggregates the JSON data is still done in PL/pgSQL, which is
probably a bad idea, considering that PL/pgSQL doesn't have good ways of
manipulating JSON data in place (I can't run PL/V8, or untrusted
languages). I should probably move this logic to Java, and use updatable
result sets to modify the data in place. The reason I left it in PL/pgSQL
is that I didn't want to do a select and then an update, making it two
lookups per each update. Another concern about moving it to the application
side is the network round-trips.