Help with writing a generate_series(tsmultirange, interval)

Started by François Beausoleilover 4 years ago6 messagesgeneral
Jump to latest
#1François Beausoleil
francois@teksol.info

Hello all!

I’m excited for multi ranges, as they fit nicely into a scheduling app. What I’m trying to express is something along the lines of « Every weekday from 2021-08-01 and 2021-10-01, from 9 AM to 8 PM, every 90 minutes ». You can think of public transit for the model.

Initially, I was going to create a table with every departure recorded, and was going to refresh the table on every write to the parent table, but that means maintaining many rows for every change to the schedule. Then, I remembered multi ranges in PG14, and they fit nicely with what I had in mind.

Now that I can store the info I want, I’d like to iterate over the ranges, so I can generate the exact departure times, something similar to this:

SELECT instant
FROM generate_series(
'{["2021-08-02 08:00:00","2021-08-02 11:00:00"),["2021-08-03 08:00:00","2021-08-03 20:00:00"]’}
, interval ’90 minutes’) as instant;

2021-08-02 08:00
2021-08-02 09:30
— 2021-08-02 11:00 excluded as the range excludes its upper bound
2021-08-03 08:00:00
2021-08-03 09:30:00
2021-08-03 11:00:00
2021-08-03 12:30:00
2021-08-03 14:00:00
2021-08-03 15:30:00
2021-08-03 17:00:00
2021-08-03 18:30:00
2021-08-03 20:00:00 — included, as the upper bound is inclusive

That function doesn’t exist, and I can’t seem to find a function to iterate over a multi range either. Does such a function exist? I’m specifically looking at https://www.postgresql.org/docs/14/functions-range.html.

This is a toy application, a spike to see what’s possible.

I wanted to avoid maintaining a table with hundreds of rows per route, if a route runs frequently enough (every 10 minutes, 8 AM to 8 PM, over a year). Of course, I can avoid refreshing the departures table if the schedule hasn’t changed, but still, preparing this table will not take a constant amount of time; e.g. it will depend on the schedule’s size.

Any tips appreciated!
François

#2Adrian Klaver
adrian.klaver@aklaver.com
In reply to: François Beausoleil (#1)
Re: Help with writing a generate_series(tsmultirange, interval)

On 7/31/21 11:59 AM, François Beausoleil wrote:

Hello all!

I’m excited for multi ranges, as they fit nicely into a scheduling app. What I’m trying to express is something along the lines of « Every weekday from 2021-08-01 and 2021-10-01, from 9 AM to 8 PM, every 90 minutes ». You can think of public transit for the model.

Initially, I was going to create a table with every departure recorded, and was going to refresh the table on every write to the parent table, but that means maintaining many rows for every change to the schedule. Then, I remembered multi ranges in PG14, and they fit nicely with what I had in mind.

Now that I can store the info I want, I’d like to iterate over the ranges, so I can generate the exact departure times, something similar to this:

SELECT instant
FROM generate_series(
'{["2021-08-02 08:00:00","2021-08-02 11:00:00"),["2021-08-03 08:00:00","2021-08-03 20:00:00"]’}
, interval ’90 minutes’) as instant;

2021-08-02 08:00
2021-08-02 09:30
— 2021-08-02 11:00 excluded as the range excludes its upper bound
2021-08-03 08:00:00
2021-08-03 09:30:00
2021-08-03 11:00:00
2021-08-03 12:30:00
2021-08-03 14:00:00
2021-08-03 15:30:00
2021-08-03 17:00:00
2021-08-03 18:30:00
2021-08-03 20:00:00 — included, as the upper bound is inclusive

That function doesn’t exist, and I can’t seem to find a function to iterate over a multi range either. Does such a function exist? I’m specifically looking at https://www.postgresql.org/docs/14/functions-range.html.

This is a toy application, a spike to see what’s possible.

I wanted to avoid maintaining a table with hundreds of rows per route, if a route runs frequently enough (every 10 minutes, 8 AM to 8 PM, over a year). Of course, I can avoid refreshing the departures table if the schedule hasn’t changed, but still, preparing this table will not take a constant amount of time; e.g. it will depend on the schedule’s size.

Any tips appreciated!

How about:

SELECT
*
FROM
generate_series ('2021-08-02 08:00:00'::timestamp, '2021-08-02
10:59:00'::timestamp, interval '90 minutes') AS instant
UNION
SELECT
*
FROM
generate_series ('2021-08-03 08:00:00'::timestamp, '2021-08-03
20:00:00'::timestamp, interval '90 minutes') AS instant
ORDER BY instant;

instant
---------------------
2021-08-02 08:00:00
2021-08-02 09:30:00
2021-08-03 08:00:00
2021-08-03 09:30:00
2021-08-03 11:00:00
2021-08-03 12:30:00
2021-08-03 14:00:00
2021-08-03 15:30:00
2021-08-03 17:00:00
2021-08-03 18:30:00
2021-08-03 20:00:00

François

--
Adrian Klaver
adrian.klaver@aklaver.com

#3François Beausoleil
francois@teksol.info
In reply to: Adrian Klaver (#2)
Re: Help with writing a generate_series(tsmultirange, interval)

Hello Adrian,

Le 31 juill. 2021 à 15:49, Adrian Klaver <adrian.klaver@aklaver.com> a écrit :

On 7/31/21 11:59 AM, François Beausoleil wrote:

Hello all!
I’m excited for multi ranges, as they fit nicely into a scheduling app. What I’m trying to express is something along the lines of « Every weekday from 2021-08-01 and 2021-10-01, from 9 AM to 8 PM, every 90 minutes ». You can think of public transit for the model.
Initially, I was going to create a table with every departure recorded, and was going to refresh the table on every write to the parent table, but that means maintaining many rows for every change to the schedule. Then, I remembered multi ranges in PG14, and they fit nicely with what I had in mind.
Now that I can store the info I want, I’d like to iterate over the ranges, so I can generate the exact departure times, something similar to this:
SELECT instant
FROM generate_series(
'{["2021-08-02 08:00:00","2021-08-02 11:00:00"),["2021-08-03 08:00:00","2021-08-03 20:00:00"]’}
, interval ’90 minutes’) as instant;
2021-08-02 08:00
2021-08-02 09:30
— 2021-08-02 11:00 excluded as the range excludes its upper bound
2021-08-03 08:00:00
2021-08-03 09:30:00
2021-08-03 11:00:00
2021-08-03 12:30:00
2021-08-03 14:00:00
2021-08-03 15:30:00
2021-08-03 17:00:00
2021-08-03 18:30:00
2021-08-03 20:00:00 — included, as the upper bound is inclusive
That function doesn’t exist, and I can’t seem to find a function to iterate over a multi range either. Does such a function exist? I’m specifically looking at https://www.postgresql.org/docs/14/functions-range.html.
This is a toy application, a spike to see what’s possible.
I wanted to avoid maintaining a table with hundreds of rows per route, if a route runs frequently enough (every 10 minutes, 8 AM to 8 PM, over a year). Of course, I can avoid refreshing the departures table if the schedule hasn’t changed, but still, preparing this table will not take a constant amount of time; e.g. it will depend on the schedule’s size.
Any tips appreciated!

How about:

SELECT
*
FROM
generate_series ('2021-08-02 08:00:00'::timestamp, '2021-08-02 10:59:00'::timestamp, interval '90 minutes') AS instant
UNION
SELECT
*
FROM
generate_series ('2021-08-03 08:00:00'::timestamp, '2021-08-03 20:00:00'::timestamp, interval '90 minutes') AS instant
ORDER BY instant;

Yes, in fact, I wrote the following:

--------------------------------------------------------------------------------------------------------------------------

CREATE FUNCTION generate_series(tstzrange, interval) RETURNS SETOF timestamp with time zone AS $$
SELECT n
FROM generate_series(lower($1), upper($1), $2) AS t0(n)
WHERE $1 @> n
$$ LANGUAGE sql immutable;

CREATE FUNCTION generate_series(tsrange, interval) RETURNS SETOF timestamp without time zone AS $$
SELECT n
FROM generate_series(lower($1), upper($1), $2) AS t0(n)
WHERE $1 @> n
$$ LANGUAGE sql immutable;

That was the easy part. My end goal is to iterate over a tsmultirange: I would like to get each individual range from a given multi range. Ideally, I’d like to do that without parsing the textual version of the multi range.

While mowing the lawn, I thought that since the syntax of multi ranges is similar to arrays, maybe I could use unnest(), but sadly, that was not to be the case:

# select unnest('{[2021-08-02,2021-08-04],[2021-08-07,2021-08-09)}'::tsmultirange);
ERROR: function unnest(tsmultirange) does not exist
LINE 1: select unnest('{[2021-08-02,2021-08-04],[2021-08-07,2021-08-...

Apparently, PG can accept multi range values, but can’t do much with them at the time, except to check for inclusion/exclusion.

Thanks for your time!
François

Show quoted text

François

--
Adrian Klaver
adrian.klaver@aklaver.com <mailto:adrian.klaver@aklaver.com>

#4Adrian Klaver
adrian.klaver@aklaver.com
In reply to: François Beausoleil (#3)
Re: Help with writing a generate_series(tsmultirange, interval)

On 7/31/21 5:16 PM, François Beausoleil wrote:

Hello Adrian,

Yes, in fact, I wrote the following:

--------------------------------------------------------------------------------------------------------------------------

CREATE FUNCTION generate_series(tstzrange, interval) RETURNS SETOF
timestamp with time zone AS $$
  SELECT n
  FROM generate_series(lower($1), upper($1), $2) AS t0(n)
  WHERE $1 @> n
$$ LANGUAGE sql immutable;

CREATE FUNCTION generate_series(tsrange, interval) RETURNS SETOF
timestamp without time zone AS $$
  SELECT n
  FROM generate_series(lower($1), upper($1), $2) AS t0(n)
  WHERE $1 @> n
$$ LANGUAGE sql immutable;

That was the easy part. My end goal is to iterate over a tsmultirange: I
would like to get each individual range from a given multi range.
Ideally, I’d like to do that without parsing the textual version of the
multi range.

While mowing the lawn, I thought that since the syntax of multi ranges
is similar to arrays, maybe I could use unnest(), but sadly, that was
not to be the case:

# select
unnest('{[2021-08-02,2021-08-04],[2021-08-07,2021-08-09)}'::tsmultirange);
ERROR:  function unnest(tsmultirange) does not exist
LINE 1: select unnest('{[2021-08-02,2021-08-04],[2021-08-07,2021-08-...

Apparently, PG can accept multi range values, but can’t do much with
them at the time, except to check for inclusion/exclusion.

I see your mowing the lawn and raise walking the dog. This rang some
bells and then I remembered when in doubt consult depesz:

https://www.depesz.com/2021/07/15/how-to-get-list-of-elements-from-multiranges/

/messages/by-id/20210715121508.GA30348@depesz.com

Thanks for your time!
François

François

--
Adrian Klaver
adrian.klaver@aklaver.com <mailto:adrian.klaver@aklaver.com>

--
Adrian Klaver
adrian.klaver@aklaver.com

#5Tom Lane
tgl@sss.pgh.pa.us
In reply to: François Beausoleil (#3)
Re: Help with writing a generate_series(tsmultirange, interval)

=?utf-8?Q?Fran=C3=A7ois_Beausoleil?= <francois@teksol.info> writes:

While mowing the lawn, I thought that since the syntax of multi ranges is similar to arrays, maybe I could use unnest(), but sadly, that was not to be the case:
# select unnest('{[2021-08-02,2021-08-04],[2021-08-07,2021-08-09)}'::tsmultirange);
ERROR: function unnest(tsmultirange) does not exist

That's fixed for beta3:

regression=# select unnest('{[2021-08-02,2021-08-04],[2021-08-07,2021-08-09)}'::tsmultirange);
unnest
-----------------------------------------------
["2021-08-02 00:00:00","2021-08-04 00:00:00"]
["2021-08-07 00:00:00","2021-08-09 00:00:00")
(2 rows)

regards, tom lane

#6Alban Hertroys
haramrae@gmail.com
In reply to: Tom Lane (#5)
Re: Help with writing a generate_series(tsmultirange, interval)

On 1 Aug 2021, at 3:30, Tom Lane <tgl@sss.pgh.pa.us> wrote:

=?utf-8?Q?Fran=C3=A7ois_Beausoleil?= <francois@teksol.info> writes:

While mowing the lawn, I thought that since the syntax of multi ranges is similar to arrays, maybe I could use unnest(), but sadly, that was not to be the case:
# select unnest('{[2021-08-02,2021-08-04],[2021-08-07,2021-08-09)}'::tsmultirange);
ERROR: function unnest(tsmultirange) does not exist

That's fixed for beta3:

regression=# select unnest('{[2021-08-02,2021-08-04],[2021-08-07,2021-08-09)}'::tsmultirange);
unnest
-----------------------------------------------
["2021-08-02 00:00:00","2021-08-04 00:00:00"]
["2021-08-07 00:00:00","2021-08-09 00:00:00")
(2 rows)

regards, tom lane

If what you need is behaving similar to arrays, perhaps arrays of ranges suit your problem?

development=> select unnest(array['[2021-08-02,2021-08-04]'::tsrange,'[2021-08-07,2021-08-09)'::tsrange]);
unnest
-----------------------------------------------
["2021-08-02 00:00:00","2021-08-04 00:00:00"]
["2021-08-07 00:00:00","2021-08-09 00:00:00")
(2 rows)

The drawback of that approach is probably with the operators and functions you have to your avail. It seems to me though that several of those not available for arrays could be emulated using array functions such as array_position(…) for the contains operator, unnest with tsrange functions for others, etc.

Another approach could be to store the “rules” of the schedule and generate the relevant portion of the multirange as a set of tsrange rows on-the-fly. That may well perform better than storing the entire range in a table of tsrange records.

I’ve done something like that for a hierarchical query on versioned items where I had to base how to slice through the hierarchy on a reference timestamp. That performed adequately on a production data warehouse, as long as you sufficiently constrained the inputs. You can join such a function (laterally) to some other data set too.

Regards,

Alban Hertroys
--
There is always an exception to always.