scheduler in core

Started by Jaime Casanova | about 16 years ago | 44 messages | hackers
#1Jaime Casanova
jcasanov@systemguards.com.ec

Hi,

I'm trying to figure out how difficult this is.

What we need:
- a shared catalog
- an API for filling the catalog
- a scheduler daemon
- pg_dump support

A shared catalog
-------------------------
Why shared? Obviously because we don't want to scan every database's
pg_job every time the daemon wakes up.
Maybe something like:

pg_job (
    oid           -- use the oid as pk
    jobname
    jobdatoid     -- job database oid
    jobowner      -- for permission checking
    jobstarttime  -- year to minute
    jobfrequency  -- an interval?
    jobnexttime or joblasttime
    jobtype       -- if we are going to allow plain sql or
                  -- executable/shell job types
    jobexecute or jobscript
)

comments about the catalog?

An API for filling the catalog
-----------------------------------------
Do we want a CREATE JOB SQL syntax? FWIW, Oracle uses functions to
create/remove jobs.

A scheduler daemon
--------------------------------
I think we can use 8.3's autovacuum daemon as a reference for this...
AFAIK, it's a child of the postmaster that sleeps for $naptime, then
looks for something to do (it also looks in a catalog) and sends a
worker to do it. That's what we need to do, but...

For the $naptime, I think we can autoconfigure it: when we execute a
job, look for the next job in the queue and sleep until it is time to
execute it.

I don't think we need a max_worker parameter; it should launch as many
workers as it needs.
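
As a rough illustration of the self-tuning nap time described above, here is a minimal Python sketch (a real daemon would be C code running under the postmaster). The Job class, seconds_until_next_job() and run_due_jobs() are hypothetical names; only jobname, jobnexttime and jobfrequency come from the proposed catalog, and the 60-second fallback is an assumption.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List


@dataclass
class Job:
    # Field names mirror the proposed pg_job columns; the class itself is hypothetical.
    jobname: str
    jobnexttime: datetime    # next scheduled run
    jobfrequency: timedelta  # repeat interval


def seconds_until_next_job(jobs: List[Job], now: datetime, fallback: float = 60.0) -> float:
    """Return how long the daemon should sleep before the earliest job is due.

    With no jobs at all, sleep for an assumed fallback period and check again.
    """
    if not jobs:
        return fallback
    earliest = min(job.jobnexttime for job in jobs)
    # A job that is already overdue means "do not sleep at all".
    return max(0.0, (earliest - now).total_seconds())


def run_due_jobs(jobs: List[Job], now: datetime) -> List[str]:
    """Return the names of due jobs and advance each one's jobnexttime past now.

    Launching a worker is out of scope here; the caller would do that per name.
    """
    launched = []
    for job in jobs:
        if job.jobfrequency <= timedelta(0):
            raise ValueError(f"job {job.jobname!r} needs a positive jobfrequency")
        if job.jobnexttime <= now:
            launched.append(job.jobname)
            # Skip any runs that were missed while the daemon was asleep.
            while job.jobnexttime <= now:
                job.jobnexttime += job.jobfrequency
    return launched
```

The daemon loop would then alternate between run_due_jobs() and sleeping for seconds_until_next_job(); whether missed runs should be skipped or replayed is a policy choice this sketch decides arbitrarily.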

pg_dump support
--------------------------
Dump every entry of the pg_job catalog as a CREATE JOB SQL statement
or a create_job() function call, depending on what we decide.
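
To make the function-based variant concrete, here is a minimal Python sketch of how one catalog row might be rendered for a dump. The create_job() argument list is purely an assumption, since no signature has been proposed; the "database" key stands in for a database name, because the jobdatoid value would not survive a restore. Quoting assumes standard_conforming_strings is on.

```python
def quote_literal(value: str) -> str:
    """Quote a string as a SQL literal by doubling embedded single quotes."""
    return "'" + value.replace("'", "''") + "'"


def dump_job(job: dict) -> str:
    """Render one job row as a call to a hypothetical create_job() function."""
    # The column order below is an assumed signature, not an agreed API.
    keys = ("jobname", "database", "jobfrequency", "jobexecute")
    args = ", ".join(quote_literal(str(job[key])) for key in keys)
    return f"SELECT create_job({args});"
```

For example, a job named nightly on database sales with a frequency of 1 day would be emitted as a single SELECT create_job(...) line that can be replayed at restore time.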

ideas? comments?

--
Regards,
Jaime Casanova
PostgreSQL support and training
Systems consulting and development
Guayaquil - Ecuador
Cel. +59387171157

#2Dave Page
dpage@pgadmin.org
In reply to: Jaime Casanova (#1)
Re: scheduler in core

On Sat, Feb 20, 2010 at 9:33 PM, Jaime Casanova
<jcasanov@systemguards.com.ec> wrote:

Hi,

I'm trying to figure out how difficult is this

Why not just use pgAgent? It's far more flexible than the design
you've suggested, and already exists.

--
Dave Page
EnterpriseDB UK: http://www.enterprisedb.com

#3Merlin Moncure
mmoncure@gmail.com
In reply to: Jaime Casanova (#1)
Re: scheduler in core

IMNSHO, an 'in core' scheduler would be useful. However, I think
before you tackle a scheduler, we need proper stored procedures. Our
existing functions don't cut it because you can't manage the transaction
state yourself.

merlin

#4Dimitri Fontaine
dimitri@2ndQuadrant.fr
In reply to: Dave Page (#2)
Re: scheduler in core

Dave Page <dpage@pgadmin.org> writes:

Why not just use pgAgent? It's far more flexible than the design
you've suggested, and already exists.

What would it take to have it included in core, so that it's not a
separate install to do? I'd love to have some support for running my
maintenance pl functions directly from the database. I mean without
installing, running and monitoring another (set of) process.

Main advantage over cron or another scheduler being that it'd be part of
my transactional backups, of course.

Use cases, in case they're needed already, include creating new partitions,
materializing views at known intervals, more general maintenance like
vacuum and cluster operations, some reporting that could be done in the
database itself, etc.

Regards,
--
dim

#5Pavel Stehule
pavel.stehule@gmail.com
In reply to: Jaime Casanova (#1)
Re: scheduler in core

pg_job (
   oid                -- use the oid as pk
   jobname
   jobdatoid       -- job database oid
   jobowner       -- for permission's checking
   jobstarttime   -- year to minute
   jobfrequency  -- an interval?
   jobnexttime or joblasttime
   jobtype          -- if we are going to allow plain sql or
executable/shell job types
   jobexecute or jobscript
)

comments about the catalog?

+ success_action
+ failure_action

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#6Bruce Momjian
bruce@momjian.us
In reply to: Dimitri Fontaine (#4)
Re: scheduler in core

On Sat, Feb 20, 2010 at 10:03 PM, Dimitri Fontaine
<dfontaine@hi-media.com> wrote:

What would it take to have it included in core, so that it's not a
separate install to do? I'd love to have some support for running my
maintenance pl functions directly from the database. I mean without
installing, running and monitoring another (set of) process.

It'll always be another (set of) processes even if it's "in core". All
it means to be "in core" is that it will be harder to make
modifications and you'll be tied to the Postgres release cycle.

Main advantage over cron or another scheduler being that it'd be part of
my transactional backups, of course.

All you need for that is to store the schedule in a database table.
This has nothing to do with where the scheduler code lives.
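
A minimal sketch of that idea, using Python's sqlite3 module as a stand-in for a PostgreSQL connection: the schedule lives in an ordinary table, so it is covered by normal backups, while the scheduler process that polls it lives outside the database. The job_schedule table and its columns are hypothetical.

```python
import sqlite3


def create_schedule_table(conn: sqlite3.Connection) -> None:
    """Create a plain table holding the schedule (hypothetical layout)."""
    # next_run holds an ISO 8601 timestamp string, which sorts chronologically.
    conn.execute(
        """
        CREATE TABLE job_schedule (
            jobname  TEXT PRIMARY KEY,
            next_run TEXT NOT NULL,
            command  TEXT NOT NULL
        )
        """
    )


def due_jobs(conn: sqlite3.Connection, now: str) -> list:
    """Return (jobname, command) pairs whose next_run is at or before now."""
    rows = conn.execute(
        "SELECT jobname, command FROM job_schedule "
        "WHERE next_run <= ? ORDER BY next_run, jobname",
        (now,),
    )
    return rows.fetchall()
```

An external scheduler, whether cron-driven or a long-running agent, would call due_jobs() on each wake-up and execute whatever it returns.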

--
greg

#7Tom Lane
tgl@sss.pgh.pa.us
In reply to: Dimitri Fontaine (#4)
Re: scheduler in core

Dimitri Fontaine <dfontaine@hi-media.com> writes:

Dave Page <dpage@pgadmin.org> writes:

Why not just use pgAgent? It's far more flexible than the design
you've suggested, and already exists.

What would it take to have it included in core,

I don't think this really makes sense. There's basically no argument
for having it in core other than "I'm too lazy to install a separate
package". Unlike the case for autovacuum, there isn't anything an
in-core implementation could do that an external one doesn't do as well
or better. So I'm not eager to take on additional maintenance burden
for such a thing.

regards, tom lane

#8Lucas
lucas75@gmail.com
In reply to: Tom Lane (#7)
Re: scheduler in core

Tom,

I believe that "in core" could mean "installed by default" in the case of
pgAgent or a similar solution...

Many big companies do not allow developers to configure and install
components... we need to request everything on forms in 10 copies...

Making it "in core" or "installed by default" means there is a better
chance that the db scheduler would be widely adopted...

And more importantly... we would not have to check its availability during
setup and provide an alternate scheduler if the database scheduler is off...

I believe that a database scheduler would allow me to drop 20 thousand lines
of java code in my server...

--
Lucas

#9Jaime Casanova
jcasanov@systemguards.com.ec
In reply to: Dave Page (#2)
Re: scheduler in core

On Sat, Feb 20, 2010 at 4:37 PM, Dave Page <dpage@pgadmin.org> wrote:

On Sat, Feb 20, 2010 at 9:33 PM, Jaime Casanova
<jcasanov@systemguards.com.ec> wrote:

Hi,

I'm trying to figure out how difficult is this

Why not just use pgAgent? It's far more flexible than the design
you've suggested, and already exists.

- it's not that easy if you don't have pgadmin
- I need to back up the postgres database to back up the schedules
- pgagent is not widely used here, and the few people I know who tried
it gave up because, they said, it "did not always execute the jobs"...
I don't have any real evidence of that, and probably what happened was
that the pgagent daemon wasn't running (error prone), but having it
started by the postmaster would get rid of that problem...

The first one could be solved with a set of functions in pgagent and
clear docs...

I can live with the other two to some degree... but getting rid of
the third one would be nice :)

--
Regards,
Jaime Casanova
PostgreSQL support and training
Systems consulting and development
Guayaquil - Ecuador
Cel. +59387171157

#10Dave Page
dpage@pgadmin.org
In reply to: Jaime Casanova (#9)
Re: scheduler in core

On Sun, Feb 21, 2010 at 12:03 AM, Jaime Casanova
<jcasanov@systemguards.com.ec> wrote:

On Sat, Feb 20, 2010 at 4:37 PM, Dave Page <dpage@pgadmin.org> wrote:

On Sat, Feb 20, 2010 at 9:33 PM, Jaime Casanova
<jcasanov@systemguards.com.ec> wrote:

Hi,

I'm trying to figure out how difficult is this

Why not just use pgAgent? It's far more flexible than the design
you've suggested, and already exists.

- it's not that easy if you don't have pgadmin

That's easily changed. EDB's Advanced Server emulates Oracle's DBMS_JOB
interface with it, for example.

- i need to backup postgres database to backup the schedules

Only if you put the control schema in that database. If you don't want
to do that, stick it somewhere else. With your proposed scheme, you'd
probably have to use pg_dumpall --globals-only.

- the use pgagent here is not very extended but the few a know have
tried desisted because they
said: "not always executed the jobs"... i don't have any real evidence
of that and probably what happens
was that the pgagent daemon wasn't working (error prone), but being it
started by the postmaster get rid of that
problem...

No one has ever reported such a bug that I'm aware of.

--
Dave Page
EnterpriseDB UK: http://www.enterprisedb.com

#11Dave Page
dpage@pgadmin.org
In reply to: Lucas (#8)
Re: scheduler in core

On Sat, Feb 20, 2010 at 11:55 PM, Lucas <lucas75@gmail.com> wrote:

I believe that a database scheduler would allow me to drop 20 thousand lines
of java code in my server...

How does that work? If you don't have a scheduler in the database, or
pgAgent, why aren't you using cron or Windows Task Scheduler? Neither
of those would require 20K lines of Java code.

--
Dave Page
EnterpriseDB UK: http://www.enterprisedb.com

#12Jaime Casanova
jcasanov@systemguards.com.ec
In reply to: Dave Page (#10)
Re: scheduler in core

On Sat, Feb 20, 2010 at 7:32 PM, Dave Page <dpage@pgadmin.org> wrote:

On Sun, Feb 21, 2010 at 12:03 AM, Jaime Casanova
<jcasanov@systemguards.com.ec> wrote:

On Sat, Feb 20, 2010 at 4:37 PM, Dave Page <dpage@pgadmin.org> wrote:

On Sat, Feb 20, 2010 at 9:33 PM, Jaime Casanova
<jcasanov@systemguards.com.ec> wrote:

Hi,

I'm trying to figure out how difficult is this

Why not just use pgAgent? It's far more flexible than the design
you've suggested, and already exists.

- it's not that easy if you don't have pgadmin

That's easily changed. EDB's Advanced Server emulates Oracles DBMS_JOB
interface with it for example.

maybe i can work on that, then

--
Regards,
Jaime Casanova
PostgreSQL support and training
Systems consulting and development
Guayaquil - Ecuador
Cel. +59387171157

#13Dave Page
dpage@pgadmin.org
In reply to: Dimitri Fontaine (#4)
Re: scheduler in core

On Sat, Feb 20, 2010 at 10:03 PM, Dimitri Fontaine
<dfontaine@hi-media.com> wrote:

Dave Page <dpage@pgadmin.org> writes:

Why not just use pgAgent? It's far more flexible than the design
you've suggested, and already exists.

What would it take to have it included in core, so that it's not a
separate install to do? I'd love to have some support for running my
maintenance pl functions directly from the database. I mean without
installing, running and monitoring another (set of) process.

It's currently written in C++ and PL/pgSQL and uses wxWidgets, all of
which could be changed with a little work. Having it in core will
almost certainly result in reduced functionality though - there are
use cases in which you may have multiple agents running against one
control database, or executing jobs on remote databases for example.

We originally wrote the code such that it might be easily included in
core in the future, but every time this topic comes up in -hackers,
there are a significant number of people who don't think a scheduler
should be tied to the core code so we stopped assuming it ever would
be.

--
Dave Page
EnterpriseDB UK: http://www.enterprisedb.com

#14Andrew Dunstan
andrew@dunslane.net
In reply to: Lucas (#8)
Re: scheduler in core

Lucas wrote:

Tom,

I believe that "in core" may be "installed by default" in case of
the pgAgent or similar solution...

Many big companies does not allow the developers to configure and
install components.... we need to request everthing in 10 copies
of forms...

By making it "in core" or "installed by default" means that we
have more chance that the db scheduler would be widely accepted...

This reasoning just doesn't fly in the PostgreSQL world. PostgreSQL is
designed to be extensible, not a monolithic product. We're not going to
change that because some companies have insane corporate policies. The
answer, as Jefferson said in another context, is to "inform their
ignorance."

That isn't to say that there isn't a case for an in core scheduler, but
this at least isn't a good reason for it.

cheers

andrew

#15Dave Page
dpage@pgadmin.org
In reply to: Jaime Casanova (#12)
Re: scheduler in core

On Sun, Feb 21, 2010 at 12:38 AM, Jaime Casanova
<jcasanov@systemguards.com.ec> wrote:

On Sat, Feb 20, 2010 at 7:32 PM, Dave Page <dpage@pgadmin.org> wrote:

On Sun, Feb 21, 2010 at 12:03 AM, Jaime Casanova
<jcasanov@systemguards.com.ec> wrote:

On Sat, Feb 20, 2010 at 4:37 PM, Dave Page <dpage@pgadmin.org> wrote:

On Sat, Feb 20, 2010 at 9:33 PM, Jaime Casanova
<jcasanov@systemguards.com.ec> wrote:

Hi,

I'm trying to figure out how difficult is this

Why not just use pgAgent? It's far more flexible than the design
you've suggested, and already exists.

- it's not that easy if you don't have pgadmin

That's easily changed. EDB's Advanced Server emulates Oracles DBMS_JOB
interface with it for example.

maybe i can work on that, then

I'd love to add a management API to pgAgent if you'd like to work on it.

--
Dave Page
EnterpriseDB UK: http://www.enterprisedb.com

#16Joshua D. Drake
jd@commandprompt.com
In reply to: Tom Lane (#7)
Re: scheduler in core

On Sat, 2010-02-20 at 18:19 -0500, Tom Lane wrote:

Dimitri Fontaine <dfontaine@hi-media.com> writes:

Dave Page <dpage@pgadmin.org> writes:

Why not just use pgAgent? It's far more flexible than the design
you've suggested, and already exists.

What would it take to have it included in core,

I don't think this really makes sense. There's basically no argument
for having it in core other than "I'm too lazy to install a separate
package". Unlike the case for autovacuum, there isn't anything an
in-core implementation could do that an external one doesn't do as well
or better. So I'm not eager to take on additional maintenance burden
for such a thing.

There is zero technical reason for this to be in core.

That doesn't mean it isn't a really good idea. It would be nice to have
a comprehensive job scheduling solution that allows me to continue to
abstract away from external solutions and operating system dependencies.

Joshua D. Drake

--
PostgreSQL.org Major Contributor
Command Prompt, Inc: http://www.commandprompt.com/ - 503.667.4564
Consulting, Training, Support, Custom Development, Engineering
Respect is earned, not gained through arbitrary and repetitive use or Mr. or Sir.

#17Robert Haas
robertmhaas@gmail.com
In reply to: Joshua D. Drake (#16)
Re: scheduler in core

On Feb 20, 2010, at 8:06 PM, "Joshua D. Drake" <jd@commandprompt.com>
wrote:

There is zero technical reason for this to be in core.

That doesn't mean it isn't a really good idea. It would be nice to have
a comprehensive job scheduling solution that allows me to continue
abstract away from external solutions and operating system dependencies.

Well put. That pretty much sums up my feelings on this perfectly.

...Robert

#18Jaime Casanova
jcasanov@systemguards.com.ec
In reply to: Dave Page (#13)
Re: scheduler in core

Ah! wxWidgets... Yes, I knew there was something I didn't like about
pgAgent. So it's not as simple as just installing it.

--
Sent from my mobile device

Regards,
Jaime Casanova
PostgreSQL support and training
Systems consulting and development
Guayaquil - Ecuador
Cel. +59387171157

#19Pavel Stehule
pavel.stehule@gmail.com
In reply to: Andrew Dunstan (#14)
Re: scheduler in core

2010/2/21 Andrew Dunstan <andrew@dunslane.net>:

Lucas wrote:

Tom,

   I believe that "in core" may be "installed by default" in case of
   the pgAgent or similar solution...

   Many big companies does not allow the developers to configure and
   install components.... we need to request everthing in 10 copies
   of forms...

   By making it "in core" or "installed by default" means that we
   have more chance that the db scheduler would be widely accepted...

This reasoning just doesn't fly in the PostgreSQL world. PostgreSQL is
designed to be extensible, not a monolithic product. We're not going to
change that because some companies have insane corporate policies.  The
answer, as Jefferson said in another context, is to "inform their
ignorance."

That isn't to say that there isn't a case for an in core scheduler, but this
at least isn't a good reason for it.

As I remember, this is exactly the same discussion we had about
replication three years ago.

The first strategy was "we don't need it in core";
the result was that we were the last to have replication.

Regards
Pavel Stehule

#20Dimitri Fontaine
dimitri@2ndQuadrant.fr
In reply to: Joshua D. Drake (#16)
Re: scheduler in core

"Joshua D. Drake" <jd@commandprompt.com> writes:

On Sat, 2010-02-20 at 18:19 -0500, Tom Lane wrote:

Dimitri Fontaine <dfontaine@hi-media.com> writes:

What would it take to have it included in core,

I don't think this really makes sense. There's basically no argument
for having it in core other than "I'm too lazy to install a separate
package". Unlike the case for autovacuum, there isn't anything an
in-core implementation could do that an external one doesn't do as well
or better. So I'm not eager to take on additional maintenance burden
for such a thing.

There is zero technical reason for this to be in core.

That doesn't mean it isn't a really good idea. It would be nice to have
a comprehensive job scheduling solution that allows me to continue
abstract away from external solutions and operating system dependencies.

Maybe what we need, on the technical level, is a way to distribute this
code with the main product but without draining too much effort from
core members there. Like we do with contribs I guess, but on a larger
scale.

I guess git submodules, PGAN, extensions and all that jazz are going to
help. Meanwhile I'll have to learn enough of pgAgent to figure out how
much it's tied to pgadmin, and we'll have to make those other facilities
something real.

Regards,
--
dim

#21Dimitri Fontaine
dimitri@2ndQuadrant.fr
In reply to: Bruce Momjian (#6)
#22Andrew Dunstan
andrew@dunslane.net
In reply to: Pavel Stehule (#19)
#23Bruce Momjian
bruce@momjian.us
In reply to: Pavel Stehule (#19)
#24Lucas
lucas75@gmail.com
In reply to: Andrew Dunstan (#14)
#25Ron Mayer
rm_pg@cheapcomplexdevices.com
In reply to: Lucas (#8)
#26Tom Lane
tgl@sss.pgh.pa.us
In reply to: Ron Mayer (#25)
#27Robert Haas
robertmhaas@gmail.com
In reply to: Tom Lane (#26)
#28Robert Haas
robertmhaas@gmail.com
In reply to: Lucas (#24)
#29Tom Lane
tgl@sss.pgh.pa.us
In reply to: Robert Haas (#28)
#30Simon Riggs
simon@2ndQuadrant.com
In reply to: Tom Lane (#7)
#31Robert Haas
robertmhaas@gmail.com
In reply to: Tom Lane (#29)
#32Robert Haas
robertmhaas@gmail.com
In reply to: Simon Riggs (#30)
#33Dimitri Fontaine
dimitri@2ndQuadrant.fr
In reply to: Simon Riggs (#30)
#34Simon Riggs
simon@2ndQuadrant.com
In reply to: Dimitri Fontaine (#33)
#35Jaime Casanova
jcasanov@systemguards.com.ec
In reply to: Simon Riggs (#30)
#36Heikki Linnakangas
heikki.linnakangas@enterprisedb.com
In reply to: Jaime Casanova (#35)
#37Pavel Stehule
pavel.stehule@gmail.com
In reply to: Heikki Linnakangas (#36)
#38Merlin Moncure
mmoncure@gmail.com
In reply to: Joshua D. Drake (#16)
#39Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Merlin Moncure (#38)
#40Merlin Moncure
mmoncure@gmail.com
In reply to: Alvaro Herrera (#39)
#41Robert Haas
robertmhaas@gmail.com
In reply to: Merlin Moncure (#3)
#42Pavel Stehule
pavel.stehule@gmail.com
In reply to: Robert Haas (#41)
#43Merlin Moncure
mmoncure@gmail.com
In reply to: Robert Haas (#41)
#44Tom Lane
tgl@sss.pgh.pa.us
In reply to: Robert Haas (#41)