Is partition pruning impacted by data type
Hi,
We are designing one application which is currently restricted to one time
zone users but has the possibility to go global in future. Some of the
transaction tables are going to be daily range partitioned on the
transaction_create_date column. But the "date" data type will have no time
component in it, so we are thinking to make it as timestamp data
type(timestamptz(6)), so that it will help us in us two ways,
firstly , though current use cases in which the majority of the queries are
going to happen on a day or multiple days of transactions. But if we have
any use case which needs further lower granularity like in hourly duration
, then having "timestamp" data type with an index created on it will help.
And in future , if we plan to partition it based on further lower
granularity like hourly , that can be accommodated easily with a
"timestamp" data type.
However the question we have is ,
1)If there is any downside of having the partition key with "timestamp with
timezone" type? Will it impact the partition pruning of the queries anyway
by appending any run time "time zone" conversion function during the query
planning/execution phase?
2) As it will take the default server times , so during daylight saving
the server time will change, so in that case, can it cause any unforeseen
issue?
3)Will this cause the data to be spread unevenly across partitions and make
the partitions unevenly sized? If will go for UTC/GMT as db time, the
user's one day transaction might span across two daily partitions.
Thanks and Regards
Sud
Hello,
Has anybody got experience of using a range partitioning table using
timestamptz or "timestamp with no timezone" Column and saw any of such
known issues in pruning?
Show quoted text
On Tue, 5 Mar, 2024, 1:09 am sud, <suds1434@gmail.com> wrote:
Hi,
We are designing one application which is currently restricted to one
time zone users but has the possibility to go global in future. Some of the
transaction tables are going to be daily range partitioned on the
transaction_create_date column. But the "date" data type will have no time
component in it, so we are thinking to make it as timestamp data
type(timestamptz(6)), so that it will help us in us two ways,firstly , though current use cases in which the majority of the queries
are going to happen on a day or multiple days of transactions. But if we
have any use case which needs further lower granularity like in hourly
duration , then having "timestamp" data type with an index created on it
will help. And in future , if we plan to partition it based on further
lower granularity like hourly , that can be accommodated easily with a
"timestamp" data type.However the question we have is ,
*1)If there is any downside of having the partition key with "timestamp
with timezone" type? Will it impact the partition pruning of the queries
anyway by appending any run time "time zone" conversion function during the
query planning/execution phase? **2) As it will take the default server times , so during daylight saving
the server time will change, so in that case, can it cause any unforeseen
issue?**3)Will this cause the data to be spread unevenly across partitions and
make the partitions unevenly sized? If will go for UTC/GMT as db time, the
user's one day transaction might span across two daily partitions. *Thanks and Regards
Sud
Import Notes
Reply to msg id not found: CAD=mzVWHH4ojSw5d+yyOe7bR6XiqfhnX1kx07a+mDM8Z169nkg@mail.gmail.com
On Tue, Mar 5, 2024 at 1:09 AM sud <suds1434@gmail.com> wrote:
However the question we have is ,
1)If there is any downside of having the partition key with "timestamp
with timezone" type? Will it impact the partition pruning of the queries
anyway by appending any run time "time zone" conversion function during the
query planning/execution phase?
2) As it will take the default server times , so during daylight saving
the server time will change, so in that case, can it cause any unforeseen
issue?
3)Will this cause the data to be spread unevenly across partitions and
make the partitions unevenly sized? If will go for UTC/GMT as db time, the
user's one day transaction might span across two daily partitions.
My 2 cents.
We have cases which use the "timestamp with timezone" column as partition
key and the partition pruning happens for the read queries without any
issue, so we don't see any conversion functions applied to the predicate as
such which is partition key. I think if the users go global it's better to
have the database time in UTC time zone. and it's obvious that, In case of
global users the data ought to be span across multiple days as the days
won't be as per the users time zone rather UTC.
Thank you.
Yes, I tried creating a table manually with column timestamptz(6) type and
partitioned on that and then executed select query with the filter on that
column and I do see partition pruning happening. Not able to visualize any
other issues though, however some teammates say it may have a negative
impact on aggregation type queries , not sure how but will try to test it.
Thanks again for the response.
On Wed, Mar 6, 2024 at 12:35 AM Lok P <loknath.73@gmail.com> wrote:
Show quoted text
On Tue, Mar 5, 2024 at 1:09 AM sud <suds1434@gmail.com> wrote:
However the question we have is ,
1)If there is any downside of having the partition key with "timestamp
with timezone" type? Will it impact the partition pruning of the queries
anyway by appending any run time "time zone" conversion function during the
query planning/execution phase?
2) As it will take the default server times , so during daylight saving
the server time will change, so in that case, can it cause any unforeseen
issue?
3)Will this cause the data to be spread unevenly across partitions and
make the partitions unevenly sized? If will go for UTC/GMT as db time, the
user's one day transaction might span across two daily partitions.My 2 cents.
We have cases which use the "timestamp with timezone" column as partition
key and the partition pruning happens for the read queries without any
issue, so we don't see any conversion functions applied to the predicate as
such which is partition key. I think if the users go global it's better to
have the database time in UTC time zone. and it's obvious that, In case of
global users the data ought to be span across multiple days as the days
won't be as per the users time zone rather UTC.
Something interesting and not sure if expected behaviour is as below. We
are confused here.
In the below example we created two partitioned tables on timestamptz type
columns with different time zones and the child partitions are created
appropriately with boundaries as one mid night to next mid night and so
on.But when we change the time zone and query the data dictionary views
again, it shows the start and end of the partition boundary as not
midnights but different values.
So I was wondering if this can cause us any unforeseen issues in the long
run while creating the partitions though partman or while persisting the
data into the tables from the end users? or should we always set the local
timezone as UTC always before running or calling the pg_partman/pg_cron
process which creates the partitions? Mainly in a database which serves
global users sitting across multiple timezones. And same thing while
inserting data into the table, we should use UTC timezone conversion
function.
And while checking the timezone using the "show timezone" function it shows
the local timezone, so is there any way to see postgres DB the server
timezone?
SET SESSION TIME ZONE 'UTC';
CREATE TABLE test_timestamp (
ts TIMESTAMP,
tstz TIMESTAMPTZ) PARTITION BY RANGE (tstz);
SELECT partman.create_parent(
p_parent_table := 'public.test_timestamp',
p_control := 'tstz',
p_type := 'native',
p_interval := '1 day',
p_premake := 4,
p_start_partition => '2024-03-07 00:00:00'
);
UPDATE partman.part_config SET infinite_time_partitions = 'true' WHERE
parent_table = 'public.test_timestamp';
with recursive inh as (
select i.inhrelid, null::text as parent
from pg_catalog.pg_inherits i
join pg_catalog.pg_class cl on i.inhparent = cl.oid
join pg_catalog.pg_namespace nsp on cl.relnamespace = nsp.oid
where nsp.nspname = 'public'
and cl.relname = 'test_timestamp2'
union all
select i.inhrelid, (i.inhparent::regclass)::text
from inh
join pg_catalog.pg_inherits i on (inh.inhrelid = i.inhparent)
)
select c.relname as partition_name,
pg_get_expr(c.relpartbound, c.oid, true) as partition_expression
from inh
join pg_catalog.pg_class c on inh.inhrelid = c.oid
join pg_catalog.pg_namespace n on c.relnamespace = n.oid
left join pg_partitioned_table p on p.partrelid = c.oid
order by n.nspname, c.relname;
test_timestamp_default DEFAULT
test_timestamp_p2024_03_07 FOR VALUES FROM ('2024-03-07 00:00:00+00') TO
('2024-03-08 00:00:00+00')
test_timestamp_p2024_03_08 FOR VALUES FROM ('2024-03-08 00:00:00+00') TO
('2024-03-09 00:00:00+00')
test_timestamp_p2024_03_09 FOR VALUES FROM ('2024-03-09 00:00:00+00') TO
('2024-03-10 00:00:00+00')
test_timestamp_p2024_03_10 FOR VALUES FROM ('2024-03-10 00:00:00+00') TO
('2024-03-11 00:00:00+00')
test_timestamp_p2024_03_11 FOR VALUES FROM ('2024-03-11 00:00:00+00') TO
('2024-03-12 00:00:00+00')
SET SESSION TIME ZONE 'EST';
test_timestamp_default DEFAULT
test_timestamp_p2024_03_07 FOR VALUES FROM ('2024-03-06 *19:00:00-05*') TO
('2024-03-07 19:00:00-05')
test_timestamp_p2024_03_08 FOR VALUES FROM ('2024-03-07 *19:00:00-05*') TO
('2024-03-08 19:00:00-05')
test_timestamp_p2024_03_09 FOR VALUES FROM ('2024-03-08 *19:00:00-05*') TO
('2024-03-09 19:00:00-05')
test_timestamp_p2024_03_10 FOR VALUES FROM ('2024-03-09 *19:00:00-05*') TO
('2024-03-10 19:00:00-05')
test_timestamp_p2024_03_11 FOR VALUES FROM ('2024-03-10 *19:00:00-05*') TO
('2024-03-11 19:00:00-05')
***********************
SET SESSION TIME ZONE 'EST';
CREATE TABLE test_timestamp2 (
ts TIMESTAMP,
tstz TIMESTAMPTZ) PARTITION BY RANGE (tstz);
SELECT partman.create_parent(
p_parent_table := 'public.test_timestamp2',
p_control := 'tstz',
p_type := 'native',
p_interval := '1 day',
p_premake := 4,
p_start_partition => '2024-03-07 00:00:00'
);
UPDATE partman.part_config SET infinite_time_partitions = 'true' WHERE
parent_table = 'public.test_timestamp2';
with recursive inh as (
select i.inhrelid, null::text as parent
from pg_catalog.pg_inherits i
join pg_catalog.pg_class cl on i.inhparent = cl.oid
join pg_catalog.pg_namespace nsp on cl.relnamespace = nsp.oid
where nsp.nspname = 'public'
and cl.relname = 'test_timestamp2'
union all
select i.inhrelid, (i.inhparent::regclass)::text
from inh
join pg_catalog.pg_inherits i on (inh.inhrelid = i.inhparent)
)
select c.relname as partition_name,
pg_get_expr(c.relpartbound, c.oid, true) as partition_expression
from inh
join pg_catalog.pg_class c on inh.inhrelid = c.oid
join pg_catalog.pg_namespace n on c.relnamespace = n.oid
left join pg_partitioned_table p on p.partrelid = c.oid
order by n.nspname, c.relname;
test_timestamp2_default DEFAULT
test_timestamp2_p2024_03_07 FOR VALUES FROM ('2024-03-07 00:00:00-05') TO
('2024-03-08 00:00:00-05')
test_timestamp2_p2024_03_08 FOR VALUES FROM ('2024-03-08 00:00:00-05') TO
('2024-03-09 00:00:00-05')
test_timestamp2_p2024_03_09 FOR VALUES FROM ('2024-03-09 00:00:00-05') TO
('2024-03-10 00:00:00-05')
test_timestamp2_p2024_03_10 FOR VALUES FROM ('2024-03-10 00:00:00-05') TO
('2024-03-11 00:00:00-05')
test_timestamp2_p2024_03_11 FOR VALUES FROM ('2024-03-11 00:00:00-05') TO
('2024-03-12 00:00:00-05')
SET SESSION TIME ZONE 'UTC';
test_timestamp2_default DEFAULT
test_timestamp2_p2024_03_07 FOR VALUES FROM ('2024-03-07 *05:00:00+00*') TO
('2024-03-08 05:00:00+00')
test_timestamp2_p2024_03_08 FOR VALUES FROM ('2024-03-08 *05:00:00+00*') TO
('2024-03-09 05:00:00+00')
test_timestamp2_p2024_03_09 FOR VALUES FROM ('2024-03-09 *05:00:00+00*') TO
('2024-03-10 05:00:00+00')
test_timestamp2_p2024_03_10 FOR VALUES FROM ('2024-03-10 *05:00:00+00*') TO
('2024-03-11 05:00:00+00')
test_timestamp2_p2024_03_11 FOR VALUES FROM ('2024-03-11 *05:00:00+00*') TO
('2024-03-12 05:00:00+00')
Regards
Sud