background triggers?
Is there any way to write a statement trigger that runs in the
background? In my testing, when I have an After Insert For Each
Statement trigger, the function call does not end until the trigger is
finished processing.
What I would like to do, and please let me know if there is a better way
to do this, is to have an after-event statement trigger run in a
separate process, so that the function call returns immediately and the
statement trigger gets executed in its own time.
The use that I have for this at the moment, and I can think of many
other uses, is that I want to populate a statistics table each time that
a table is updated. But the code to populate the table takes 10 seconds
to run. I don't want the user to have to wait 10 seconds to add a record.
Sim Zacks <sim@compulab.co.il> writes:
Is there any way to write a statement trigger that runs in the
background? In my testing, when I have an After Insert For Each
Statement trigger, the function call does not end until the trigger is
finished processing.
What I would like to do, and please let me know if there is a better
way to do this, is to have an after event statement trigger run in a
separate process so the function call ends when it ends and the
statement trigger gets executed on its own time.
The use that I have for this at the moment, and I can think of many
other uses, is that I want to populate a statistics table each time
that a table is updated. But the code to populate the table takes 10
seconds to run. I don't want the user to have to wait 10 seconds to
add a record.
This seems a case for using NOTIFY/LISTEN.
- You have a process connected to the database that runs LISTEN,
causing it to listen for a particular message.
LISTEN regen_statistics;
- Your trigger submits a notification:
NOTIFY regen_statistics;
The notification doesn't get committed to the internal table
pg_catalog.pg_listener until the transaction doing the writing
completes its COMMIT, so the listener won't "hear" anything until
then...
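A minimal sketch of the trigger side of this scheme, assuming the regen_statistics channel above; the function, trigger, and table names are placeholders, not from the thread:
```sql
-- Sketch only: the trigger merely signals; the heavy statistics work is
-- left to whichever session ran LISTEN regen_statistics.
CREATE OR REPLACE FUNCTION signal_stats() RETURNS trigger AS $$
BEGIN
    NOTIFY regen_statistics;
    RETURN NULL;  -- the return value is ignored for AFTER triggers
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER mytable_signal_stats
    AFTER INSERT ON mytable
    FOR EACH STATEMENT
    EXECUTE PROCEDURE signal_stats();
```
Since the notification is only delivered on COMMIT, the listener always sees the committed rows when it wakes up.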
Per the docs:
The method a frontend application must use to detect notify events
depends on which Postgres application programming interface it
uses. With the basic libpq library, the application issues LISTEN as
an ordinary SQL command, and then must periodically call the routine
PQnotifies to find out whether any notify events have been
received. Other interfaces such as libpgtcl provide higher-level
methods for handling notify events; indeed, with libpgtcl the
application programmer should not even issue LISTEN or UNLISTEN
directly. See the documentation for the library you are using for
more details.
--
select 'cbbrowne' || '@' || 'ntlug.org';
http://cbbrowne.com/info/multiplexor.html
"I am a bomb technician. If you see me running, try to keep up..."
On Tue, 2006-05-23 at 15:56 +0000, Chris Browne wrote:
The use that I have for this at the moment, and I can think of many
other uses, is that I want to populate a statistics table each time
that a table is updated. But the code to populate the table takes 10
seconds to run. I don't want the user to have to wait 10 seconds to
add a record.
This seems a case for using NOTIFY/LISTEN.
- You have a process connected to the database that runs LISTEN,
causing it to listen for a particular message.
LISTEN regen_statistics;
- Your trigger submits a notification:
NOTIFY regen_statistics;
Some time ago I ran into quite a similar problem, but LISTEN/NOTIFY was
not a solution.
In that case I needed to *disconnect* and never bother about the outcome
of a long-running background trigger.
So if I may re-phrase the question: "is there a way to have a trigger
that, when launched, can check if it's already running in the background
for some other INSERT, return immediately if so, but turn into the
background for a long-lasting job if not?"
The functionality required is "lazy" update of aggregate tables, which
need not be 100% accurate, but should be kept in sync whenever possible.
-R
Rafal Pietrak wrote:
So if I may re-phrase the question: "is there a way to have a trigger
that, when launched, can check if it's already running in the background
for some other INSERT, return immediately if so, but turn into the
background for a long-lasting job if not?"
Rafal, I'm wondering why you want to do this. You may be "fighting the
framework".
If you are trying to do something that is totally unsupported, it is
probably for a pretty good reason, usually dealing with security or data
loss. You can probably get what you want by supported methods, but it
may require looking at the problem in a different way.
What is it you are trying to accomplish? Is it just performance?
On 5/23/06, Rafal Pietrak <rafal@zorro.isa-geek.com> wrote:
On Tue, 2006-05-23 at 15:56 +0000, Chris Browne wrote:
The use that I have for this at the moment, and I can think of many
other uses, is that I want to populate a statistics table each time
that a table is updated. But the code to populate the table takes 10
seconds to run. I don't want the user to have to wait 10 seconds to
add a record.
This seems a case for using NOTIFY/LISTEN.
- You have a process connected to the database that runs LISTEN,
causing it to listen for a particular message.
LISTEN regen_statistics;
- Your trigger submits a notification:
NOTIFY regen_statistics;
Some time ago I ran into quite a similar problem, but LISTEN/NOTIFY was
not a solution. In that case I needed to *disconnect* and never bother
about the outcome of a long-running background trigger.
The idea is that you *disconnect* and you have a daemon running on the
server side, which will handle LISTEN efficiently. The daemon can be
quickly written in Perl, and it can use the select(2) call to wait for
incoming notifications. Fast, efficient and powerful.
So if I may re-phrase the question: "is there a way to have a trigger
that, when launched, can check if it's already running in the background
for some other INSERT, return immediately if so, but turn into the
background for a long-lasting job if not?"
It is also possible. You probably want to check locks in your
trigger (I'm writing from memory, so forgive me syntax errors, if any).
A trigger might be, say:
CREATE OR REPLACE FUNCTION foo_trigger() RETURNS trigger AS $$
BEGIN
    -- This will either lock the row with "key", or raise an error
    -- immediately if it has already been locked by another transaction:
    PERFORM key FROM foo_stats WHERE key = NEW.key FOR UPDATE NOWAIT;
    UPDATE foo_stats
       SET count = (SELECT count(*) FROM foo WHERE key = NEW.key)
     WHERE key = NEW.key;
    RETURN NEW;
EXCEPTION
    WHEN lock_not_available THEN
        -- the stats row is already being updated; skip it
        RETURN NEW;
END;
$$ LANGUAGE plpgsql;
Most likely there are better ways to accomplish your goal.
Regards,
Dawid
Sim Zacks wrote:
Is there any way to write a statement trigger that runs in the
background? In my testing, when I have an After Insert For Each
Statement trigger, the function call does not end until the trigger is
finished processing.
What I would like to do, and please let me know if there is a better way
to do this, is to have an after event statement trigger run in a
separate process so the function call ends when it ends and the
statement trigger gets executed on its own time.
Create some queue table, and let the after-statement trigger insert a
record into the queue.
Some daemon, or a cron-job, can then periodically check the queue, and
perform the action you want to do asynchronously.
If you don't want to use polling to find new queue entries, you can use
notify/listen to inform the daemon of new queue entries.
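The queue-plus-daemon scheme above might be sketched like this; all names are illustrative, not from the thread:
```sql
-- Illustrative queue table; the daemon processes rows and deletes them.
CREATE TABLE work_queue (
    id        serial PRIMARY KEY,
    queued_at timestamp NOT NULL DEFAULT now()
);

-- After-statement trigger function: just record that work is pending.
CREATE OR REPLACE FUNCTION enqueue_work() RETURNS trigger AS $$
BEGIN
    INSERT INTO work_queue DEFAULT VALUES;
    NOTIFY queue_changed;  -- optional: wake the daemon instead of polling
    RETURN NULL;
END;
$$ LANGUAGE plpgsql;
```
The statement that fires the trigger only pays for one small INSERT (and an optional NOTIFY); the 10-second job runs in the daemon's own session.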
greetings, Florian Pflug
Rafal Pietrak wrote:
On Tue, 2006-05-23 at 15:56 +0000, Chris Browne wrote:
The use that I have for this at the moment, and I can think of many
other uses, is that I want to populate a statistics table each time
that a table is updated. But the code to populate the table takes 10
seconds to run. I don't want the user to have to wait 10 seconds to
add a record.
This seems a case for using NOTIFY/LISTEN.
- You have a process connected to the database that runs LISTEN,
causing it to listen for a particular message.
LISTEN regen_statistics;
- Your trigger submits a notification:
NOTIFY regen_statistics;
Some time ago I ran into quite a similar problem, but LISTEN/NOTIFY was
not a solution. In that case I needed to *disconnect* and never bother
about the outcome of a long-running background trigger.
You can "disconnect" after doing NOTIFY, as long as you commit the transaction
you called NOTIFY in. Otherwise your notification gets rolled back, just like
your other database updates. But since the updates never happened in that case,
your aggregate table won't need rebuilding either...
So if I may re-phrase the question: "is there a way to have a trigger
that, when launched, can check if it's already running in the background
for some other INSERT, return immediately if so, but turn into the
background for a long-lasting job if not?"
Not easily, because according to the transaction isolation rules, one
transaction has no way of finding out what another concurrent transaction
might be doing.
But your background daemon can check for this. Before starting an
"aggregate run", it would need to store the currently visible transaction
ids. If a new queue entry is created while the first job is still running,
it either starts a new job directly after the first one finishes (if the
transaction that created the queue entry wasn't visible at the time the
first job was started), or just deletes the new queue entry (if the
previous run already saw the changes introduced by the transaction that
created the queue entry).
The functionality required is "lazy" update of aggregate tables, which
need not be 100% accurate, but should be kept in sync whenever possible.
Why isn't the solution outlined above sufficient?
greetings, Florian Pflug
On Tue, 2006-05-23 at 13:01 -0400, Kenneth Downs wrote:
Rafal Pietrak wrote:
some other INSERT, return immediately if so, but turn into the background for
a long-lasting job if not?"
Rafal, I'm wondering why you want to do this. You may be "fighting the
framework".
Yes, most probably. I'm afraid of that :(
Still...
If you are trying to do something that is totally unsupported, it is
probably for a pretty good reason, usually dealing with security or data
loss. You can probably get what you want by supported methods, but it
may require looking at the problem in a different way.
What is it you are trying to accomplish? Is it just performance?
OK, here is a 'real life' example. It works more like a post office. Now
and then a 'delivery man' arrives with a bunch of post to deliver. The
post office takes the batch, checks and stamps each and every item, and
hands over a receipt. But the actual binning (into delivery channels)
and routing (dispatch) is handled without the delivery man standing and
waiting for the process to end.
In my *real*life* case I have a file with hundreds of thousands of
'transactions' uploaded by an operator (several times a day, and in
fact, by 10-30 operators) - those are checked at INSERT time. But after
that check and INSERT, I'd like to say to the operator: "OK, your job is
done, don't worry about the rest".
But there is more work to do with the batch.
I have to 'route the transactions' to their relevant accounts, see
how those change the 'status' of those accounts, and consequently store
the updated status within the account itself. This is tedious and time
consuming. But it has to be done, because 'customers' query account
status for that 'aggregate status information' and it would be *very*
heavy for the database if those queries required browsing the entire
'transaction log'. The ratio of 'transactions' to accounts is
like millions to thousands.
A plain INSERT of a batch takes 5-10 minutes on desktop PostgreSQL (800MHz
machine, ATA disks). When I attach a trigger (a *very* simple function) to
update the accounts, the INSERT takes hours (2-4). But when I make just
one single update of all accounts at the end of the batch insert, it
takes 20-30 minutes.
So my solution was 1) to have an additional table "TABLE dirty_log(time
timestamp, who text)", which gets a row inserted *after* a batch of
INSERTs, and 2) a 'server side process', run every 5-10 minutes, which
performs the accounts update and which: A) does NOT launch when another
such process is currently running; B) purges the DIRTY_LOG table after
it's done.
This is quite obviously a 'user space implementation' of the 'background
triggers' I needed.
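The dirty_log scheme described here might look roughly as follows; the dirty_log columns follow the description above, while the function name and the aggregate step are invented for illustration:
```sql
-- Batch loaders insert one row here after each batch of INSERTs.
CREATE TABLE dirty_log (
    time timestamp NOT NULL DEFAULT now(),
    who  text
);

-- Run from cron every 5-10 minutes, e.g.:
--   psql -c "SELECT refresh_accounts();"
CREATE OR REPLACE FUNCTION refresh_accounts() RETURNS void AS $$
BEGIN
    IF EXISTS (SELECT 1 FROM dirty_log) THEN
        -- ... recompute the per-account aggregate status here ...
        DELETE FROM dirty_log;  -- purge once the update is done
    END IF;
END;
$$ LANGUAGE plpgsql;
```
Preventing two such jobs from overlapping is handled outside the database in this scheme, e.g. with a lock file in the cron wrapper script.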
Naturally, having this 'background trigger' I lose accuracy of the
account information. But I gain on system efficiency - this is an
engineering decision. When we have to make those (like the
implementation above), it's good to have 'system tools' (like
'background triggers') that support us.
But of course I may be wrong altogether. I'd really like to know the
technique, 'along the lines' of RDBMS design, which serves
that same purpose.
Is there a better solution?
NB: the 'batch INSERT' I mentioned above is done by the www server. It's
quite vital to have the server process terminate (meaning: not keep
it waiting for a LISTEN/NOTIFY event - meaning, not use LISTEN/NOTIFY), as
apache will keep the connection open until the process ends.
In 'real life', this scenario is applicable also to a 'service network':
1. Say, you have a network of 100,000 - 1,000,000 vending machines (VMs).
2. Each is loaded with ca. 100 item types (300 types in the entire
network).
3. Each VM dispenses an item every 2-3 minutes, which puts the overall
'transaction traffic' at the level of hundreds per second.
4. Assume that for 'customer management' you need to store quite a bit
of data with each item-dispense 'transaction'. Meaning: transactions are
not very light, and their details have to be kept for a long time.
5. Obviously, you need to manage your stock (each of the 300
item types): you keep the VMs loaded and keep some stock at a central store.
(ATMs are a good example of such a network)
So:
1. The 'transaction traffic' is so significant that you really have to
'optimise for that'.
2. You don't really have to know *exactly* when you run out of stock,
because each VM has a significant local item store, so if you get
notified that a particular VM is getting close to the bottom with a
particular item, you may dispatch a reload in 10 minutes, but it's also
OK to dispatch it in 2 hours - meaning, the 'account information' does
not have to be 'immediately accurate'. Far more important is 'dispense
transaction' performance.
3. Normally, you 'keep an eye' on your VM network - meaning, you issue a
'statistics' query quite frequently. If that was a 'heavy query' it
would degrade your database performance quite significantly - we really
need the 'aggregate information' stored within 'item accounts'.
Is there a clean, 'along the framework' design that serves this reality?
-R
On Wed, 2006-05-24 at 13:36 +0200, Sim Zacks wrote:
My understanding of Listen/Notify is that it is a completely
disconnected process running on the database server.
Yes. But in my particular case (and I presume the intention of
'background triggers' is that) a programmer (like myself) is not really
interested in the outcome of that trigger process. So there is no
'technical' need for him/her to create the server side process, *provided*
he/she can set up a job *within* the database server itself and just go
away.
That's the idea of 'background triggers'.
Surely, there are workarounds. Like the LISTEN/NOTIFY server (not a
database server, but a system server) daemon that takes database server
notifications. Or even a system server daemon that simply uses
synchronous database communication (like I did in my case). The problem
is, I have this 'itching desire' to have such technical issues
supported 'within the framework of RDBMS architecture'.
That's why I keep thinking that the solution I used is actually
'fighting the fences', while gurus do it some other, *better* way. But if
not, I think that 'background triggers' could help here.
I may not have understood exactly what you are trying to do, but from
what I understood, this will solve your problem.
I think you did. I just feel that 'background triggers' are a 'real life'
engineering issue, so they should get some backing from the RDBMS.
just my 2c.
-R
My understanding of Listen/Notify is that it is a completely
disconnected process running on the database server.
It should be run as a daemon (or service in Windows) application. The
trigger on the table calls the NOTIFY SQL command and then the trigger,
and thereby the statement, ends.
The notify then tells this daemon application to wake up and start
processing.
To make sure that a second process does not start while the first
process is running, you should have a running-processes table which gets
a row inserted when a process starts and updated when it ends. That way
your process can check whether one is currently running or not.
Another way of doing it is to have a cron job check every X minutes for
records in the table. When they are there, it should run the process. It
can also have a flag that says don't run another process until this one
is finished.
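A sketch of the bookkeeping table described above; the table and column names are illustrative:
```sql
-- The daemon inserts a row when it starts and stamps ended_at when it
-- finishes; a NULL ended_at marks a run still in flight.
CREATE TABLE running_processes (
    job        text NOT NULL,
    started_at timestamp NOT NULL DEFAULT now(),
    ended_at   timestamp
);

-- Daemon start-up check: skip this run if the previous one is unfinished.
SELECT count(*)
  FROM running_processes
 WHERE job = 'stats' AND ended_at IS NULL;
```
A cron job can run the same check every X minutes, which is the second alternative described above.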
I may not have understood exactly what you are trying to do, but from
what I understood, this will solve your problem.
Sim
---------------------------(end of broadcast)---------------------------
TIP 6: explain analyze is your friend
Rafal Pietrak wrote:
A plain INSERT of a batch takes 5-10 minutes on desktop PostgreSQL (800MHz
machine, ATA disks). When I attach a trigger (a *very* simple function) to
update the accounts, the INSERT takes hours (2-4). But when I make just
one single update of all accounts at the end of the batch insert, it
takes 20-30 minutes.
Why not have the INSERT go to an "inbox" table, a table whose only job
is to receive the data for future processing.
Your client code should mark all rows with a batch number as they go
in. Then when the batch is loaded, simply invoke a stored procedure to
process them. Pass the stored procedure the batch number.
IOW, have your "background trigger" be a stored procedure that is
invoked by the client, instead of trying to get the server to do it.
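A rough sketch of this inbox pattern; the column names and the aggregation inside process_batch are invented for illustration, not taken from the thread:
```sql
-- Rows land here first, tagged with the batch they belong to.
CREATE TABLE inbox (
    batch_no integer NOT NULL,
    account  integer NOT NULL,
    amount   numeric NOT NULL
);

-- Called by the client once the whole batch is loaded.
CREATE OR REPLACE FUNCTION process_batch(p_batch integer) RETURNS void AS $$
BEGIN
    UPDATE accounts a
       SET balance = a.balance + s.total
      FROM (SELECT account, sum(amount) AS total
              FROM inbox
             WHERE batch_no = p_batch
             GROUP BY account) s
     WHERE a.id = s.account;
    DELETE FROM inbox WHERE batch_no = p_batch;
END;
$$ LANGUAGE plpgsql;
```
Note that SELECT process_batch(n) still blocks the calling session until the function returns, which is exactly the point questioned in the next reply.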
On Wed, 2006-05-24 at 07:41 -0400, Kenneth Downs wrote:
Why not have the INSERT go to an "inbox" table, a table whose only job
is to receive the data for future processing.
Actually, it 'sort of' works that way.
Your client code should mark all rows with a batch number as they go
in. Then when the batch is loaded, simply invoke a stored procedure to
process them. Pass the stored procedure the batch number.
If I have that stored procedure, and if I issue a command that would launch
that stored procedure from the "psql>" prompt: how long will I have to wait
for another prompt? 1) until the procedure ends its job, or 2) I get it
right away, and the procedure does its job unattended 'in the background'?
My impression was that I get the next prompt after the procedure
finishes, so it wouldn't be a solution. But if (2) applies, that is
really it... Frankly, it would take me some time to get back to those
sources (and generate simulation data) - so does anybody know the answer?
-R
Rafal Pietrak wrote:
On Wed, 2006-05-24 at 07:41 -0400, Kenneth Downs wrote:
Why not have the INSERT go to an "inbox" table, a table whose only job
is to receive the data for future processing.
Actually, it 'sort of' works that way.
Your client code should mark all rows with a batch number as they go
in. Then when the batch is loaded, simply invoke a stored procedure to
process them. Pass the stored procedure the batch number.
If I have that stored procedure, and if I issue a command that would launch
such stored procedure from "psql>" prompt: how long will I have to wait
for another prompt? 1) until the procedure ends its job. 2) right away,
the procedure does its job unabidedly 'in the background'.
What web server and OS are you using? In linux/apache you can fork off
a process that runs the SP and then detach from it.
I agree with you that it would be better to have a background trigger
run by the database server. That was exactly what I asked. The answer
was that there is no current way to do a background trigger and to
accomplish what I want to do I need to write a daemon that calls Listen
and then on the Notify run my function.
If background triggers were a possibility that would make the whole
thing so much easier.
Sim Zacks wrote:
The problem with client code processing a function is that unless you
are using threads (my client application is not in a multi-threaded
environment), the client has to wait for the server to return from the
end of the function. I don't want the client to wait and the result
doesn't affect the user at all, so there is no reason why he should wait.
No reason for him to wait. Even Windows supports background processing
(geez I think so anyway).
If you are on Linux spawning a background process is a no-brainer,
there's plenty of ways to do it. There ought to be a way on windows I
would think.
The world rejoiced as rafal@zorro.isa-geek.com (Rafal Pietrak) wrote:
On Wed, 2006-05-24 at 07:41 -0400, Kenneth Downs wrote:
Why not have the INSERT go to an "inbox" table, a table whose only job
is to receive the data for future processing.
Actually, it 'sort of' works that way.
Your client code should mark all rows with a batch number as they go
in. Then when the batch is loaded, simply invoke a stored procedure to
process them. Pass the stored procedure the batch number.
If I have that stored procedure, and if I issue a command that would launch
such stored procedure from "psql>" prompt: how long will I have to wait
for another prompt? 1) until the procedure ends its job. 2) right away,
the procedure does its job unattended 'in the background'.
My impression was that I get the next prompt after the procedure
finishes, so it wouldn't be a solution. But if (2) applies, that is
really it.... Frankly, it would take me some time to get back to those
sources (and generate simulation data) - so anybody knows the answer?
It depends on something else...
Is the process that is starting up the "background batch process"
making other changes to the data?
If it is, then the "foreground" process MUST submit a COMMIT
in order to finalize the batch before kicking off anything in the
background. Otherwise, the background process won't see the batch in
its "full glory." You can never read uncommitted data, after all.
That being the case (that you MUST have a COMMIT), I don't see any
reason to consider LISTEN/NOTIFY to be an insufficient mechanism.
It would only be insufficient if you don't submit a COMMIT.
But the background process won't see the right data unless you COMMIT.
So you *MUST* submit a COMMIT, making LISTEN/NOTIFY quite adequate to
the task...
--
let name="cbbrowne" and tld="gmail.com" in String.concat "@" [name;tld];;
http://linuxdatabases.info/info/x.html
You know how most packages say "Open here". What is the protocol if
the package says, "Open somewhere else"?
The problem with client code processing a function is that unless you
are using threads (my client application is not in a multi-threaded
environment), the client has to wait for the server to return from the
end of the function. I don't want the client to wait and the result
doesn't affect the user at all, so there is no reason why he should wait.
Sim Zacks <sim@compulab.co.il> writes:
The reason why the client application would have to wait is because
the front-end is written in MSAccess. While Windows supports multi
threading, Access does not support multi threading.
(Multi threading allows background processing, IYDK)
Well, you *clearly* need to have some sort of "daemon" running in
order to do this.
I expect it will in effect be a LISTEN process that waits for clients
to submit NOTIFY requests.
Even if you don't actually choose to use NOTIFY/LISTEN, per se, you'll
doubtless wind up creating an ad hoc, informally-specified
implementation of part of it...
--
let name="cbbrowne" and tld="ntlug.org" in name ^ "@" ^ tld;;
http://www.ntlug.org/~cbbrowne/unix.html
"Any sufficiently complicated C or Fortran program contains an ad hoc
informally-specified bug-ridden slow implementation of half of Common
Lisp." -- Philip Greenspun
The reason why the client application would have to wait is because the
front-end is written in MSAccess. While Windows supports multi
threading, Access does not support multi threading.
(Multi threading allows background processing, IYDK)
On Wed, 2006-05-24 at 08:38 -0400, Kenneth Downs wrote:
What web server and OS are you using? In linux/apache you can fork
off a process that runs the SP and then detach from it.
Actually it is linux/apache. Yes, I could do it that way, but eventually
I've settled for a cron job that looks up a single 'dirty_log' table
and does the housekeeping if necessary.
Still, from my point of view, this is a workaround. Not the elegant
'database' solution I was looking for.
-R