Added prosupport function for estimating numeric generate_series rows

Started by songjinzhou over 1 year ago, 11 messages, hackers
#1 songjinzhou
tsinghualucky912@foxmail.com

Hello hackers, I saw a recent submission: Teach planner how to estimate rows for timestamp generate_series. I provide a patch for the numeric type here, and a simple test is as follows:

postgres=# explain SELECT * FROM generate_series(-25.0, -1.0, 2.0);
                              QUERY PLAN                              
----------------------------------------------------------------------
 Function Scan on generate_series  (cost=0.00..0.13 rows=13 width=32)
(1 row)

postgres=# explain SELECT * FROM generate_series(-25.0, -1.0);    
                              QUERY PLAN                              
----------------------------------------------------------------------
 Function Scan on generate_series  (cost=0.00..0.25 rows=25 width=32)
(1 row)

postgres=#

I really want to know your thoughts, please give me feedback. Thank you.
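
The row counts in the two plans above are consistent with the estimate floor((stop - start) / step) + 1. A minimal sketch of that arithmetic follows, with C doubles standing in for PostgreSQL's numeric type. The function name `series_rows_sketch` is hypothetical and is not taken from the patch, and the sketch assumes a non-zero step that points from start towards stop.

```c
/*
 * Hypothetical helper, not the patch's code: estimate how many values
 * generate_series(start, stop, step) yields. Assumes step is non-zero
 * and has the same direction as (stop - start).
 */
double
series_rows_sketch(double start, double stop, double step)
{
    /* The quotient is non-negative here, so truncation equals floor. */
    long long   whole_steps = (long long) ((stop - start) / step);

    return (double) whole_steps + 1.0;
}
```

With the inputs from the EXPLAIN output, (-25.0, -1.0, 2.0) gives 13 and (-25.0, -1.0, 1.0) gives 25, matching rows=13 and rows=25.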

Attachments:

0001-Added-prosupport-functions-for-estimating-numeric-ge.patch (application/octet-stream; charset=utf-8), +110 -3

#2 Dean Rasheed
dean.a.rasheed@gmail.com
In reply to: songjinzhou (#1)
Re: Added prosupport function for estimating numeric generate_series rows

On Thu, 28 Nov 2024 at 07:47, 孤傲小二~阿沐 <tsinghualucky912@foxmail.com> wrote:

> Hello hackers, I saw a recent submission: Teach planner how to estimate rows for timestamp generate_series. I provide a patch for the numeric type here, and a simple test is as follows:
>
> I really want to know your thoughts, please give me feedback. Thank you.

Good idea.

Some random review comments:

This should test for special inputs, NaN and infinity (it doesn't make
sense to convert those to NumericVars). generate_series() produces an
error for all such inputs, so the support function can just not
produce an estimate for these cases (the same as when the step size is
zero).

NumericVars initialised using init_var() should be freed using
free_var(). That can be avoided for the 3 inputs, by using
init_var_from_num(), rather than set_var_from_num(), which saves
copying digit arrays. It should then be possible to write this using a
single additional allocated NumericVar and one init_var()/free_var()
pair.

There's no need to use floor(), since the div_var() call already
produces a floored integer result.

It could use some regression test cases.
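
A rough sketch of the special-input check suggested in the first comment above, again with C doubles standing in for numeric values. The function `can_estimate_rows` is hypothetical; a real support function would test the numeric inputs themselves (for example with the NUMERIC_IS_SPECIAL macro) rather than call isnan() or isinf().

```c
#include <math.h>
#include <stdbool.h>

/*
 * Hypothetical stand-in: report whether a row estimate should be
 * attempted at all. generate_series() raises an error for NaN, infinite,
 * or zero-step inputs, so no estimate is produced for them.
 */
bool
can_estimate_rows(double start, double stop, double step)
{
    if (isnan(start) || isnan(stop) || isnan(step))
        return false;
    if (isinf(start) || isinf(stop) || isinf(step))
        return false;
    if (step == 0.0)
        return false;
    return true;
}
```

Returning false here corresponds to the support function simply not producing an estimate, which leaves the planner with its default guess.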

Regards,
Dean

#3 songjinzhou
tsinghualucky912@foxmail.com
In reply to: Dean Rasheed (#2)
Re: Added prosupport function for estimating numeric generate_series rows

Hello, thank you very much for your attention and guidance. I have modified and improved the problem you mentioned. The patch of version v2 is attached below.

For the regression tests, I reused some of the facilities added earlier for generate_series_timestamp_support. Everything passes in the Cirrus CI run.

Looking forward to your reply, thank you very much.

Original message

From: "Dean Rasheed" <dean.a.rasheed@gmail.com>
Sent: 28 November 2024, 21:56
To: "孤傲小二~阿沐" <tsinghualucky912@foxmail.com>
Cc: "pgsql-hackers" <pgsql-hackers@lists.postgresql.org>, "japinli" <japinli@hotmail.com>, "jian.universality" <jian.universality@gmail.com>
Subject: Re: Added prosupport function for estimating numeric generate_series rows

On Thu, 28 Nov 2024 at 07:47, 孤傲小二~阿沐 wrote:
>
> Hello hackers, I saw a recent submission: Teach planner how to estimate rows for timestamp generate_series. I provide a patch for the numeric type here, and a simple test is as follows:
>
> I really want to know your thoughts, please give me feedback. Thank you.
>

Attachments:

v2_0001-Added-prosupport-function-for-estimating-numeric-gen.patch (application/octet-stream; charset=utf-8), +209 -3

#4 David Rowley
dgrowleyml@gmail.com
In reply to: songjinzhou (#3)
Re: Added prosupport function for estimating numeric generate_series rows

On Fri, 29 Nov 2024 at 06:25, 孤傲小二~阿沐 <tsinghualucky912@foxmail.com> wrote:

> Hello, thank you very much for your attention and guidance. I have modified and improved the problem you mentioned. The patch of version v2 is attached below.

I've only had a quick look at the patch.

* The following needs a DatumGetFloat8():

+ req->rows = numericvar_to_double_no_overflow(&q) + 1;

* It would also be good to see a comment explaining the following line:

+ if(nstep.sign != var_diff.sign)

Something like: /* When the sign of the step size and the series range
don't match, there are no rows in the series. */

* You should add one test for the generate_series(numeric, numeric).
You've only got tests for the 3 arg version.

Also a few minor things:

* Missing space after "if"

+ if(arg3)

* We always have an empty line after variable declarations, that's missing in:

+ NumericVar q;
+ init_var(&q);

* No need for the braces in:

+ if (NUMERIC_IS_SPECIAL(step))
+ {
+ goto cleanup;
+ }
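
The sign check that the second comment above asks to have documented can be illustrated with a small sketch. It uses C doubles in place of NumericVar, and the function name `series_is_empty` is hypothetical. One assumption worth flagging: the sketch treats start equal to stop as a non-empty series whatever the step's direction, which is how generate_series() behaves; how the patch's sign comparison handles a zero difference is not shown in the thread.

```c
#include <stdbool.h>

/*
 * Hypothetical illustration: a series is empty when the step points away
 * from stop, i.e. the step and (stop - start) have opposite signs.
 */
bool
series_is_empty(double start, double stop, double step)
{
    double      diff = stop - start;

    /* Step and range point in opposite directions: no rows. */
    return (step > 0.0 && diff < 0.0) || (step < 0.0 && diff > 0.0);
}
```

For example, counting from 1 to 10 with a step of -1 never reaches the stop value, so the series has no rows.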

David

#5 Tom Lane
tgl@sss.pgh.pa.us
In reply to: David Rowley (#4)
Re: Added prosupport function for estimating numeric generate_series rows

David Rowley <dgrowleyml@gmail.com> writes:

> Also a few minor things:
> * Missing space after "if"
> + if(arg3)
> * We always have an empty line after variable declarations, that's missing in:
> + NumericVar q;
> + init_var(&q);

This sort of stuff is best addressed by running the code through
pgindent, rather than fixing it manually. Usually we don't insist
on submitters getting it right; the committer should pgindent it.

regards, tom lane

#6 songjinzhou
tsinghualucky912@foxmail.com
In reply to: Tom Lane (#5)
Re: Added prosupport function for estimating numeric generate_series rows

> This sort of stuff is best addressed by running the code through
> pgindent, rather than fixing it manually. Usually we don't insist
> on submitters getting it right; the committer should pgindent it.

Hello, thank you and David Rowley for your comments.

I have used pgindent to adjust the code format and added comments and missing regression test cases. Here is the patch of version v3.

I look forward to your reply. Thank you very much!

Regards, Song Jinzhou

Attachments:

v3_0001-Added-prosupport-function-for-estimating-numeric-gen.patch (application/octet-stream; charset=utf-8), +222 -3

#7 David Rowley
dgrowleyml@gmail.com
In reply to: songjinzhou (#6)
Re: Added prosupport function for estimating numeric generate_series rows

On Fri, 29 Nov 2024 at 18:50, songjinzhou <tsinghualucky912@foxmail.com> wrote:

> Hello, thank you and David Rowley for your comments.
>
> I have used pgindent to adjust the code format and added comments and missing regression test cases. Here is the patch of version v3.

It looks fine to me. The only things I'd adjust are stylistic,
namely; 1) remove two tabs before the goto label, 2) remove redundant
braces around the goto cleanup, 3) rename the variable "q" to
something slightly more meaningful, maybe "res" or "rows".

I'll defer to Dean.

David

#8 Dean Rasheed
dean.a.rasheed@gmail.com
In reply to: David Rowley (#7)
Re: Added prosupport function for estimating numeric generate_series rows

On Fri, 29 Nov 2024, 12:01 David Rowley, <dgrowleyml@gmail.com> wrote:

> On Fri, 29 Nov 2024 at 18:50, songjinzhou <tsinghualucky912@foxmail.com> wrote:
>
>> Hello, thank you and David Rowley for your comments.
>>
>> I have used pgindent to adjust the code format and added comments and
>> missing regression test cases. Here is the patch of version v3.
>
> It looks fine to me. The only things I'd adjust are stylistic,
> namely; 1) remove two tabs before the goto label, 2) remove redundant
> braces around the goto cleanup, 3) rename the variable "q" to
> something slightly more meaningful, maybe "res" or "rows".
>
> I'll defer to Dean.

There are a couple more things that I think need tidying up. I'll post an
update when I get back to my computer.

Regards,
Dean

#9 Dean Rasheed
dean.a.rasheed@gmail.com
In reply to: Dean Rasheed (#8)
Re: Added prosupport function for estimating numeric generate_series rows

On Fri, 29 Nov 2024 at 13:10, Dean Rasheed <dean.a.rasheed@gmail.com> wrote:

> There are a couple more things that I think need tidying up. I'll post an update when I get back to my computer.

Here's an update with some cosmetic tidying up, plus a couple of
not-so-cosmetic changes:

The new #include wasn't in the right place alphabetically (the same is
true for the recent timestamp equivalent function).

It's not necessary to call init_var() for a variable that you're going
to initialise with init_var_from_num(), and it's then not necessary to
call free_var() for that variable.

It's not necessary to have separate NumericVars for the difference and
quotient -- the same variable can be reused.

Doing both those things means that there's only one variable to free
after the computation, and it can be kept local to if-step-not-zero
code block, so there's no need for the "goto cleanup" stuff.

It seems worth avoiding div_var() in the 2-argument case, when step = 1.
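
The restructuring described above might look roughly like the following sketch. It is not the committed code: doubles replace NumericVar, `estimate_rows_sketch` and its `has_step` flag are hypothetical names, truncation towards zero stands in for the integer result that div_var() is said to produce, and NaN or infinite inputs are assumed to have been rejected beforehand. It shows one reused result variable, no goto, and no division in the 2-argument case.

```c
#include <stdbool.h>

/*
 * Hypothetical sketch. Returns true and sets *rows when an estimate is
 * possible; returns false (no estimate) when the step is zero. has_step
 * is false for the 2-argument form, where the step is implicitly 1.
 */
bool
estimate_rows_sketch(double start, double stop, bool has_step,
                     double step, double *rows)
{
    if (!has_step)
        step = 1.0;

    if (step != 0.0)
    {
        double      res;    /* reused for the difference and the quotient */

        res = stop - start;

        if ((step > 0.0 && res < 0.0) || (step < 0.0 && res > 0.0))
            *rows = 0.0;    /* step points away from stop */
        else
        {
            if (has_step)
                res = res / step;   /* divide only in the 3-argument form */

            /* res is non-negative here, so truncation acts as floor */
            *rows = (double) (long long) res + 1.0;
        }
        return true;
    }

    return false;
}
```

Because the only scratch variable lives inside the step-not-zero block, there is nothing left to clean up on the early-exit path, which is what makes the goto unnecessary.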

Regards,
Dean

Attachments:

v4-0001-Add-a-planner-support-function-for-numeric-genera.patch (text/x-patch; charset=US-ASCII), +231 -3

#10 songjinzhou
tsinghualucky912@foxmail.com
In reply to: songjinzhou (#3)
Re: Re: Added prosupport function for estimating numeric generate_series rows

Dear Dean Rasheed, I have reviewed the v4 patch and it is very thoughtful and reasonable, with a very clever attention to detail (plus I am very happy that we can get rid of the goto, which I was not a big fan of).

This patch looks very good and I have no complaints about it. Thanks again for your help from beginning to end!

Regards, Song Jinzhou

#11 Dean Rasheed
dean.a.rasheed@gmail.com
In reply to: songjinzhou (#10)
Re: Re: Added prosupport function for estimating numeric generate_series rows

On Sat, 30 Nov 2024 at 00:38, tsinghualucky912@foxmail.com
<tsinghualucky912@foxmail.com> wrote:

> Dear Dean Rasheed, I have reviewed the v4 patch and it is very thoughtful and reasonable, with a very clever attention to detail (plus I am very happy that we can get rid of the goto, which I was not a big fan of).
>
> This patch looks very good and I have no complaints about it. Thanks again for your help from beginning to end!

Cool. Patch committed.

Regards,
Dean