Proposal: Job Scheduler

Started by Wang Cheng, over 1 year ago, 13 messages
#1Wang Cheng
348448708@qq.com

Hackers,

We are the PostgreSQL team at Tencent. We have recently developed a job scheduler that runs inside the database to schedule and manage jobs, similar to Oracle's DBMS_JOB package, and we would like to contribute this feature to the community.

Similar to autovacuum, the job scheduler consists of 2 parts: the job launcher and the job worker. The job launcher periodically scans a metadata table and signals the postmaster to start new workers if needed.
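
The launcher/worker split described above can be sketched roughly as follows. This is a minimal illustration, not the proposed patch: the `Job` fields, `launcher_tick`, and the `start_worker` callback are hypothetical names, and an in-memory list stands in for the metadata table.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class Job:
    # Hypothetical row of the scheduler's metadata table.
    job_id: int
    command: str
    next_run: datetime
    interval: Optional[timedelta]  # None marks a one-off job


def launcher_tick(jobs, now, start_worker):
    """Scan the job list once and start a worker for every due job.

    Returns the jobs that should remain in the metadata table.
    """
    remaining = []
    for job in jobs:
        if job.next_run > now:
            remaining.append(job)  # not due yet
            continue
        # Stand-in for signalling the postmaster to start a job worker.
        start_worker(job)
        if job.interval is not None:
            # Recurring job: reschedule relative to the planned run time.
            job.next_run = job.next_run + job.interval
            remaining.append(job)
        # One-off jobs are dropped once they have been dispatched.
    return remaining
```

A real launcher would call something like `launcher_tick` periodically from its main loop and hand each due job to a separate worker process rather than a callback.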

As far as we know, there are currently two open-source job scheduling extensions for PostgreSQL: pg_cron (https://github.com/citusdata/pg_cron/) and pg_dbms_job (https://github.com/MigOpsRepos/pg_dbms_job/tree/main). However, the cron-based syntax is not easy to use and has some limitations, for example with one-off commands. The pg_dbms_job extension is difficult to manage and operate because it runs as a standalone process.
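
One way to read the one-off limitation: a standard five-field cron expression has no year field, so even a fully pinned date fires again every year. The sketch below uses a deliberately simplified matcher (each field is either `*` or a single integer); `cron_matches` is a hypothetical helper, not part of pg_cron.

```python
from datetime import datetime


def cron_matches(expr, when):
    """Return True if a five-field cron expression matches the given minute.

    Simplified on purpose: each field is either "*" or a single integer,
    so ranges, lists and steps are not handled.
    """
    minute, hour, dom, month, dow = expr.split()
    # Cron numbers weekdays from Sunday = 0; isoweekday() gives Sunday = 7.
    actual = (when.minute, when.hour, when.day, when.month,
              when.isoweekday() % 7)
    fields = (minute, hour, dom, month, dow)
    return all(f == "*" or int(f) == a for f, a in zip(fields, actual))


# "30 2 6 6 *" pins minute, hour, day and month, yet it matches in 2024
# and again in 2025, because the expression cannot name a year.
once_intended = "30 2 6 6 *"
fires_2024 = cron_matches(once_intended, datetime(2024, 6, 6, 2, 30))
fires_2025 = cron_matches(once_intended, datetime(2025, 6, 6, 2, 30))
```

A truly one-off job therefore needs extra bookkeeping, such as unscheduling the entry after its first run.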

That's why we have developed the job scheduler that runs as a process inside the database just like autovacuum.

We can start sending the patch if this idea makes sense to you. Thanks for your time.

Regards,
Cheng

 

#2Laurenz Albe
laurenz.albe@cybertec.at
In reply to: Wang Cheng (#1)
Re: Proposal: Job Scheduler

On Thu, 2024-06-06 at 16:27 +0800, Wang Cheng wrote:

We are the PostgreSQL team in Tencent. We have recently developed a job scheduler
that runs inside the database to schedules and manages jobs similar to Oracle
DBMS_JOB package, and we would like to contribute this feature to the community.

As far as we know, there are currently two open-sourced job scheduling extensions
for PostgreSQL: pg_cron (https://github.com/citusdata/pg_cron/) and pg_dbms_job
(https://github.com/MigOpsRepos/pg_dbms_job/tree/main). However, the cron-based
syntax is not easy to use and suffers some limitations like one-off commands.
The pg_dbms_job extension is difficult to manage and operate because it runs as
a standalone process .

There is also pg_timetable:
https://github.com/cybertec-postgresql/pg_timetable

That's why we have developed the job scheduler that runs as a process inside the
database just like autovacuum.

We can start to send the patch if this idea makes sense to the you.

Perhaps your job scheduler is much better than all the existing ones.
But what would be a compelling reason to keep it in the PostgreSQL source tree?
With PostgreSQL's extensibility features, it should be possible to write your
job scheduler as an extension and maintain it outside the PostgreSQL source.

I am sure that the PostgreSQL community will be happy to use the extension
if it is any good.

Yours,
Laurenz Albe

#3Dave Page
dpage@pgadmin.org
In reply to: Laurenz Albe (#2)
Re: Proposal: Job Scheduler

On Thu, 6 Jun 2024 at 09:47, Laurenz Albe <laurenz.albe@cybertec.at> wrote:

On Thu, 2024-06-06 at 16:27 +0800, Wang Cheng wrote:

We are the PostgreSQL team in Tencent. We have recently developed a job scheduler
that runs inside the database to schedules and manages jobs similar to Oracle
DBMS_JOB package, and we would like to contribute this feature to the community.

As far as we know, there are currently two open-sourced job scheduling extensions
for PostgreSQL: pg_cron (https://github.com/citusdata/pg_cron/) and pg_dbms_job
(https://github.com/MigOpsRepos/pg_dbms_job/tree/main). However, the cron-based
syntax is not easy to use and suffers some limitations like one-off commands.
The pg_dbms_job extension is difficult to manage and operate because it runs as
a standalone process .

There is also pg_timetable:
https://github.com/cybertec-postgresql/pg_timetable

And probably the oldest of them all, pgAgent:
https://www.pgadmin.org/docs/pgadmin4/8.7/pgagent.html

That's why we have developed the job scheduler that runs as a process inside the
database just like autovacuum.

We can start to send the patch if this idea makes sense to the you.

Perhaps your job scheduler is much better than all the existing ones.
But what would be a compelling reason to keep it in the PostgreSQL source tree?
With PostgreSQL's extensibility features, it should be possible to write your
job scheduler as an extension and maintain it outside the PostgreSQL source.

I am sure that the PostgreSQL community will be happy to use the extension
if it is any good.

I agree. This is an area in which there are lots of options at the moment,
with compelling reasons to choose one or another of them depending on your
needs.

It's this kind of choice that means it's unlikely we'd include any one
option in PostgreSQL, much like various other tools such as failover
managers or poolers.

--
Dave Page
pgAdmin: https://www.pgadmin.org
PostgreSQL: https://www.postgresql.org
EDB: https://www.enterprisedb.com

#4Wang Cheng
348448708@qq.com
In reply to: Dave Page (#3)
Re: Proposal: Job Scheduler

Noted. Thanks for suggestions. We will open-source it as an extension.

Regards,
Cheng

------------------ Original ------------------
From: "Dave Page" <dpage@pgadmin.org>;
Date: Thu, Jun 6, 2024 04:59 PM
To: "Laurenz Albe" <laurenz.albe@cybertec.at>;
Cc: "Wang Cheng" <348448708@qq.com>; "pgsql-hackers" <pgsql-hackers@lists.postgresql.org>;
Subject: Re: Proposal: Job Scheduler

On Thu, 6 Jun 2024 at 09:47, Laurenz Albe <laurenz.albe@cybertec.at> wrote:

On Thu, 2024-06-06 at 16:27 +0800, Wang Cheng wrote:
> We are the PostgreSQL team in Tencent. We have recently developed a job scheduler
> that runs inside the database to schedules and manages jobs similar to Oracle
> DBMS_JOB package, and we would like to contribute this feature to the community.
>
> As far as we know, there are currently two open-sourced job scheduling extensions
> for PostgreSQL: pg_cron (https://github.com/citusdata/pg_cron/) and pg_dbms_job
> (https://github.com/MigOpsRepos/pg_dbms_job/tree/main). However, the cron-based
> syntax is not easy to use and suffers some limitations like one-off commands.
> The pg_dbms_job extension is difficult to manage and operate because it runs as
> a standalone process .

There is also pg_timetable:
https://github.com/cybertec-postgresql/pg_timetable

And probably the oldest of them all, pgAgent:
https://www.pgadmin.org/docs/pgadmin4/8.7/pgagent.html

> That's why we have developed the job scheduler that runs as a process inside the
> database just like autovacuum.
>
> We can start to send the patch if this idea makes sense to the you.

Perhaps your job scheduler is much better than all the existing ones.
But what would be a compelling reason to keep it in the PostgreSQL source tree?
With PostgreSQL's extensibility features, it should be possible to write your
job scheduler as an extension and maintain it outside the PostgreSQL source.

I am sure that the PostgreSQL community will be happy to use the extension
if it is any good.

I agree. This is an area in which there are lots of options at the moment, with compelling reasons to choose from various of them depending on your needs.

It's this kind of choice that means it's unlikely we'd include any one option in PostgreSQL, much like various other tools such as failover managers or poolers.

--
Dave Page
pgAdmin: https://www.pgadmin.org
PostgreSQL: https://www.postgresql.org
EDB: https://www.enterprisedb.com

#5Andrei Lepikhov
lepihov@gmail.com
In reply to: Wang Cheng (#4)
Re: Proposal: Job Scheduler

On 6/6/2024 16:04, Wang Cheng wrote:

Noted. Thanks for suggestions. We will open-source it as an extension.

That would be nice! For me, it doesn't matter where the contribution goes: to
PostgreSQL core or to an extension, as long as it is published under a BSD license.

--
regards, Andrei Lepikhov

#6Alvaro Herrera
alvherre@alvh.no-ip.org
In reply to: Dave Page (#3)
Re: Proposal: Job Scheduler

On 2024-Jun-06, Dave Page wrote:

It's this kind of choice that means it's unlikely we'd include any one
option in PostgreSQL, much like various other tools such as failover
managers or poolers.

TBH I see that more as a bug than as a feature, and I see the fact that
there are so many schedulers as a process failure. If we could have
_one_ scheduler in core that encompassed all the important features of
all the independent ones we have, with hooks or whatever to allow the
user to add any fringe features they need, that would probably lead to
less duplicative code and divergent UIs, and would be better for users
overall.

That's, of course, just my personal opinion.

--
Álvaro Herrera PostgreSQL Developer — https://www.EnterpriseDB.com/

#7Dmitry Dolgov
9erthalion6@gmail.com
In reply to: Alvaro Herrera (#6)
Re: Proposal: Job Scheduler

On Thu, Jun 06, 2024 at 12:53:38PM GMT, Alvaro Herrera wrote:
On 2024-Jun-06, Dave Page wrote:

It's this kind of choice that means it's unlikely we'd include any one
option in PostgreSQL, much like various other tools such as failover
managers or poolers.

TBH I see that more as a bug than as a feature, and I see the fact that
there are so many schedulers as a process failure. If we could have
_one_ scheduler in core that encompassed all the important features of
all the independent ones we have, with hooks or whatever to allow the
user to add any fringe features they need, that would probably lead to
less duplicative code and divergent UIs, and would be better for users
overall.

That's, of course, just my personal opinion.

+1. The PostgreSQL ecosystem is surprisingly fragmented, when it comes
to quite essential components that happen to be outside of the core. But
of course it doesn't mean that there should be _one_ component of every
kind in core, more like it makes sense to have _one_ component available
out of the box (where the box is whatever form of PostgreSQL that gets
delivered to users, e.g. a distro package, container, etc.).

#8Nikolay Samokhvalov
nik@postgres.ai
In reply to: Dmitry Dolgov (#7)
Re: Proposal: Job Scheduler

On Thu, Jun 6, 2024 at 5:31 AM Dmitry Dolgov <9erthalion6@gmail.com> wrote:

On Thu, Jun 06, 2024 at 12:53:38PM GMT, Alvaro Herrera wrote:
On 2024-Jun-06, Dave Page wrote:

It's this kind of choice that means it's unlikely we'd include any one
option in PostgreSQL, much like various other tools such as failover
managers or poolers.

TBH I see that more as a bug than as a feature, and I see the fact that
there are so many schedulers as a process failure. If we could have
_one_ scheduler in core that encompassed all the important features of
all the independent ones we have, with hooks or whatever to allow the
user to add any fringe features they need, that would probably lead to
less duplicative code and divergent UIs, and would be better for users
overall.

That's, of course, just my personal opinion.

+1. The PostgreSQL ecosystem is surprisingly fragmented, when it comes
to quite essential components that happen to be outside of the core. But
of course it doesn't mean that there should be _one_ component of every
kind in core, more like it makes sense to have _one_ component available
out of the box (where the box is whatever form of PostgreSQL that gets
delivered to users, e.g. a distro package, container, etc.).

+1 too.

There is a huge reason to have a job scheduler in core – new partition
creation.

In my opinion, partitioning in Postgres needs more automation, and new
partition creation is a big missing piece. And it does require a scheduler.

I like pg_timetable a lot, but it's written in Go;

pg_cron is written in Go, and it's already present in most managed Postgres
platforms. Why not bring it into Postgres core so we could then use it to
improve the developer experience of dealing with partitioning?
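
As a rough illustration of what a scheduled partition-creation job has to do, the sketch below builds the DDL for next month's range partition. The function name, the `events` parent table, and the `<parent>_YYYY_MM` naming scheme are assumptions made for the example; identifiers are interpolated without quoting, so this is only suitable for trusted names.

```python
from datetime import date


def next_month_partition_ddl(parent, today):
    """Build the DDL for the monthly range partition that follows `today`.

    `parent` is assumed to be a table partitioned BY RANGE on a date or
    timestamp column; the name is not quoted or escaped here.
    """
    # First day of next month, rolling over the year in December.
    start = date(today.year + today.month // 12, today.month % 12 + 1, 1)
    # First day of the month after that: the exclusive upper bound.
    end = date(start.year + start.month // 12, start.month % 12 + 1, 1)
    name = f"{parent}_{start:%Y_%m}"
    return (
        f"CREATE TABLE IF NOT EXISTS {name} PARTITION OF {parent} "
        f"FOR VALUES FROM ('{start}') TO ('{end}');"
    )


# Hypothetical parent table named "events".
ddl = next_month_partition_ddl("events", date(2025, 5, 30))
```

A scheduler would run a statement like this ahead of each month boundary; `IF NOT EXISTS` keeps a repeated run harmless.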

#9Nikolay Samokhvalov
nik@postgres.ai
In reply to: Nikolay Samokhvalov (#8)
Re: Proposal: Job Scheduler

On Thu, May 29, 2025 at 6:17 PM Nikolay Samokhvalov <nik@postgres.ai> wrote:

pg_cron is written in Go, and it's already present in most managed
Postgres platforms. Why not to bring it to Postgres core so we could then
use it to improve developer experience of dealing with partitioning?

I mean, in C, of course.

#10Andrei Lepikhov
lepihov@gmail.com
In reply to: Nikolay Samokhvalov (#8)
Re: Proposal: Job Scheduler

On 5/30/25 03:17, Nikolay Samokhvalov wrote:

On Thu, Jun 6, 2024 at 5:31 AM Dmitry Dolgov <9erthalion6@gmail.com> wrote:

+1. The PostgreSQL ecosystem is surprisingly fragmented, when it comes
to quite essential components that happen to be outside of the core. But
of course it doesn't mean that there should be _one_ component of every
kind in core, more like it makes sense to have _one_ component available
out of the box (where the box is whatever form of PostgreSQL that gets
delivered to users, e.g. a distro package, container, etc.).

+1 too.

There is a huge reason to have a job scheduler in core – new partition
creation.

In my opinion, partitioning in Postgres needs more automation, and new
partition creation is a big missing piece. And it does require a scheduler.

I like pg_timetable a lot, but it's written in Go;

pg_cron is written in Go, and it's already present in most managed
Postgres platforms. Why not to bring it to Postgres core so we could
then use it to improve developer experience of dealing with partitioning?

I would say you should provide a reason why it is too difficult to stay
outside the core, such as pg_hint_plan or a similar feature.
In my opinion, the main reason to push an extension into contrib is if
it has a strong connection with the core API. But the scheduler seems as
far from the volatile API features as possible.
What's more, contrib extensions have an inherent priority over external
solutions and reduce the development impetus in the area.

--
regards, Andrei Lepikhov

#11Nikolay Samokhvalov
nik@postgres.ai
In reply to: Andrei Lepikhov (#10)
Re: Proposal: Job Scheduler

On Fri, May 30, 2025 at 02:22 Andrei Lepikhov <lepihov@gmail.com> wrote:

On 5/30/25 03:17, Nikolay Samokhvalov wrote:

On Thu, Jun 6, 2024 at 5:31 AM Dmitry Dolgov <9erthalion6@gmail.com> wrote:

+1. The PostgreSQL ecosystem is surprisingly fragmented, when it comes
to quite essential components that happen to be outside of the core. But
of course it doesn't mean that there should be _one_ component of every
kind in core, more like it makes sense to have _one_ component available
out of the box (where the box is whatever form of PostgreSQL that gets
delivered to users, e.g. a distro package, container, etc.).

+1 too.

There is a huge reason to have a job scheduler in core – new partition
creation.

In my opinion, partitioning in Postgres needs more automation, and new
partition creation is a big missing piece. And it does require a scheduler.

I like pg_timetable a lot, but it's written in Go;

pg_cron is written in Go, and it's already present in most managed
Postgres platforms. Why not to bring it to Postgres core so we could
then use it to improve developer experience of dealing with partitioning?

I would say you should provide a reason why it is too difficult to stay
outside the core, such as pg_hint_plan or a similar feature.
In my opinion, the main reason to push an extension into contrib is if
it has a strong connection with the core API. But the scheduler seems as
far from the volatile API features as possible.
That's more, contrib extensions have essential priority to external
solutions and reduce development impulse in the area.

I'm not proposing to include it as a contrib module -- I propose to include
it in the core code base and then use it to implement automated partition creation.

#12Adam Brusselback
adambrusselback@gmail.com
In reply to: Nikolay Samokhvalov (#11)
Re: Proposal: Job Scheduler

Add me to the +1 for having a built-in scheduler. It's useful for plenty of
things like automated partition creation (as noted), scheduling backups,
index maintenance, batch processing jobs, etc...

I wrote jpgAgent (compatible with pgAgent) ~10 years ago because pgAgent
was too unstable (and the other scheduling tools hadn't come out yet), but
I really wish I didn't have to deal with external tooling for features like
this at all.

#13Florents Tselai
florents.tselai@gmail.com
In reply to: Adam Brusselback (#12)
Re: Proposal: Job Scheduler

On 30 May 2025, at 5:24 PM, Adam Brusselback <adambrusselback@gmail.com> wrote:

Add me to the +1 for having a built-in scheduler. It's useful for plenty of things like automated partition creation (as noted), scheduling backups, index maintenance, batch processing jobs, etc...

I wrote jpgAgent (compatible with pgAgent) ~10 years ago because pgAgent was too unstable (and the other scheduling tools hadn't come out yet), but I really wish I didn't have to deal with external tooling for features like this at all.

I could see an argument for adopting pg_cron as a contrib/ module;
not only because it’s become the standard, but also because it’s a nice example of an extension using bgworker.

But having it in core? I don’t think so. It’s way beyond the scope of an RDBMS,
which already has transactions and SKIP LOCKED, and I think that’s as far as it should go.
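
For context, the transactions-plus-SKIP LOCKED approach usually means an external worker that claims jobs from a queue table. A minimal sketch follows, assuming a hypothetical `job_queue` table with `job_id`, `command`, `status`, `run_at` and `started_at` columns, and any DB-API style cursor connected to PostgreSQL.

```python
# Claim one due job. Rows already locked by another worker's transaction
# are skipped instead of waited on, so workers do not block each other.
CLAIM_JOB_SQL = """
UPDATE job_queue
   SET status = 'running', started_at = now()
 WHERE job_id = (
         SELECT job_id
           FROM job_queue
          WHERE status = 'pending' AND run_at <= now()
          ORDER BY run_at
          LIMIT 1
            FOR UPDATE SKIP LOCKED
       )
RETURNING job_id, command;
"""


def claim_next_job(cursor):
    """Claim one due job and return (job_id, command), or None if none is free.

    `cursor` is any DB-API cursor; the caller is responsible for
    committing or rolling back the surrounding transaction.
    """
    cursor.execute(CLAIM_JOB_SQL)
    return cursor.fetchone()
```

Something still has to invoke the worker on a schedule, which is the part the other posters in this thread would like to see inside the database.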