reorganizing partitioning code (was: Re: [HACKERS] path toward faster partition pruning)

Started by Amit Langoteabout 8 years ago20 messageshackers
Jump to latest
#1Amit Langote
Langote_Amit_f8@lab.ntt.co.jp

Re-posting my earlier email to start a new thread.

On 2018/02/09 2:58, Alvaro Herrera wrote:> Robert Haas wrote:

On Wed, Feb 7, 2018 at 3:42 AM, Ashutosh Bapat
<ashutosh.bapat@enterprisedb.com> wrote:

partition.c seems to have two kinds of functions 1. that build and
manage relcache, creates quals from bounds etc. which are metadata
management kind 2. partition bound comparison functions, and other
optimizer related functions. May be we should divide the file that
way. The first category code remains in catalog/ as it is today. The
second catagory functions move to optimizer/.

It would be sensible to separate functions that build and manage data
in the relcache from other functions. I think we should consider
moving the existing functions of that type from partition.c to
src/backend/utils/cache/partcache.c.

FWIW I've been thinking that perhaps we need some other separation of
code better than statu quo. The current partition.c file includes stuff
for several modules and ISTM all these new patches are making more and
more of a mess. So +1 to the general idea of splitting things up.
Maybe partcache.c is not ambitious enough, but it seems a good first
step.

Agree with the proposed reorganizing and adding a partcache.c, which I
tried to do in the attached patch.

* The new src/backend/utils/cache/partcache.c contains functions that
initialize relcache's partitioning related fields. Various partition
bound comparison and search functions (and then some) that work off of the
cached information are moved. Also, since we cache partition qual,
interface functions RelationGetPartitioQual(Relation) and
get_partition_qual_relid(Oid) are moved too.

* The new src/include/utils/partcache.h contains various struct
definitions that are moved from backend/catalog/partition.c,
include/catalog/partition.h, and include/utils/rel.h. Also, declarations
of interface functions of partcache.c.

Thoughts?

Thanks,
Amit

Attachments:

v1-0001-Reorganize-partitioning-code.patchtext/plain; charset=UTF-8; name=v1-0001-Reorganize-partitioning-code.patchDownload+3301-3243
#2Amit Langote
Langote_Amit_f8@lab.ntt.co.jp
In reply to: Amit Langote (#1)
Re: reorganizing partitioning code (was: Re: [HACKERS] path toward faster partition pruning)

On 2018/02/13 23:08, Ashutosh Bapat wrote:

On Tue, Feb 13, 2018 at 2:17 PM, Amit Langote
<Langote_Amit_f8@lab.ntt.co.jp> wrote:

Agree with the proposed reorganizing and adding a partcache.c, which I
tried to do in the attached patch.

* The new src/backend/utils/cache/partcache.c contains functions that
initialize relcache's partitioning related fields. Various partition
bound comparison and search functions (and then some) that work off of the
cached information are moved.

Are you moving partition bound comparison functions to partcache.c?
They will also used by optimizer, so may be leave them out of
partcache.c?

Yes, I moved the partition bound comparison and search functions, because
I thought that they should live with the other code that manages the
cached information. So, I moved not only the code that reads the catalogs
and builds the partition key, partition (bound) descriptor, and partition
qual (for individual partitions), but also the code that uses those
structures.

So, with the new arrangement, optimizer will include utils/partcache.h,
instead of catalog/partition.h.

Thanks,
Amit

#3Amit Langote
Langote_Amit_f8@lab.ntt.co.jp
In reply to: Amit Langote (#1)
Re: reorganizing partitioning code (was: Re: [HACKERS] path toward faster partition pruning)

On 2018/02/14 10:00, Amit Langote wrote:

Agree with the proposed reorganizing and adding a partcache.c, which I
tried to do in the attached patch.

* The new src/backend/utils/cache/partcache.c contains functions that
initialize relcache's partitioning related fields. Various partition
bound comparison and search functions (and then some) that work off of the
cached information are moved. Also, since we cache partition qual,
interface functions RelationGetPartitioQual(Relation) and
get_partition_qual_relid(Oid) are moved too.

* The new src/include/utils/partcache.h contains various struct
definitions that are moved from backend/catalog/partition.c,
include/catalog/partition.h, and include/utils/rel.h. Also, declarations
of interface functions of partcache.c.

Attached updated version, where I removed #include "catalog/partition.h"
from even more places and also moved map_partition_varattnos() next to
map_variable_attnos() in rewriteManip.c.

Now, after applying the patch -

Files #including partition.h

File Line
0 src/backend/catalog/heap.c 44 #include "catalog/partition.h"
1 src/backend/catalog/partition.c 22 #include "catalog/partition.h"
2 src/backend/commands/indexcmds.c 26 #include "catalog/partition.h"
3 src/backend/commands/tablecmds.c 33 #include "catalog/partition.h"
4 src/backend/utils/cache/partcache.c 23 #include "catalog/partition.h"

Files #including partcache.h

File Line
0 src/backend/optimizer/path/joinrels.c 24 #include "utils/partcache.h"
1 src/backend/optimizer/prep/prepunion.c 51 #include "utils/partcache.h"
2 src/backend/optimizer/util/relnode.c 30 #include "utils/partcache.h"
3 src/backend/utils/cache/partcache.c 37 #include "utils/partcache.h"
4 src/backend/utils/cache/relcache.c 83 #include "utils/partcache.h"
5 src/include/executor/execPartition.h 19 #include "utils/partcache.h"
6 src/include/utils/rel.h 27 #include "utils/partcache.h"

Thanks,
Amit

Attachments:

v2-0001-Reorganize-partitioning-code.patchtext/plain; charset=UTF-8; name=v2-0001-Reorganize-partitioning-code.patchDownload+3322-3261
#4Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Amit Langote (#3)
Re: reorganizing partitioning code (was: Re: [HACKERS] path toward faster partition pruning)

This is looking attractive.

Please don't #include postgres.h in partcache.h. That's per policy.

Why do you need to #include parsenodes.h in partcache.h?

I think rewriteManip.h can do with just relcache.h rather than rel.h
(probably partition.h can do likewise)

thanks

--
�lvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#5Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Amit Langote (#3)
Re: reorganizing partitioning code (was: Re: [HACKERS] path toward faster partition pruning)

BTW may I suggest that

git config diff.algorithm=histogram

results in better diffs?

--
�lvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#6Amit Langote
Langote_Amit_f8@lab.ntt.co.jp
In reply to: Alvaro Herrera (#5)
Re: reorganizing partitioning code

On 2018/02/15 5:30, Alvaro Herrera wrote:

BTW may I suggest that

git config diff.algorithm=histogram

results in better diffs?

Aha! That's much better.

Thanks,
Amit

#7Amit Langote
Langote_Amit_f8@lab.ntt.co.jp
In reply to: Alvaro Herrera (#4)
Re: reorganizing partitioning code

Thanks for the review.

On 2018/02/15 5:25, Alvaro Herrera wrote:

This is looking attractive.

Please don't #include postgres.h in partcache.h. That's per policy.

Oops, fixed.

Why do you need to #include parsenodes.h in partcache.h?

I thought it was needed because there was this:

extern void check_new_partition_bound(char *relname, Relation parent,
PartitionBoundSpec *spec);

in partcache.h and PartitionBoundSpec is defined in parsenodes.h.
Removing the #include turned out to be fine, that is, after I put it in
partition.h instead.

I think rewriteManip.h can do with just relcache.h rather than rel.h
(probably partition.h can do likewise)

Hmm. map_partition_varattnos() that I moved to rewriteManip.c wants to
use RelationGetDescr(), but partition.h was fine.

Attached updated patch.

Thanks,
Amit

Attachments:

v3-0001-Reorganize-partitioning-code.patchtext/plain; charset=UTF-8; name=v3-0001-Reorganize-partitioning-code.patchDownload+2373-2313
#8Amit Langote
Langote_Amit_f8@lab.ntt.co.jp
In reply to: Amit Langote (#7)
Re: reorganizing partitioning code

Added to CF here: https://commitfest.postgresql.org/17/1520/

Thanks,
Amit

#9Kyotaro Horiguchi
horikyota.ntt@gmail.com
In reply to: Amit Langote (#8)
Re: reorganizing partitioning code

Hello. I'd like to make a humble comment.

At Thu, 15 Feb 2018 19:31:47 +0900, Amit Langote <Langote_Amit_f8@lab.ntt.co.jp> wrote in <8906c861-ea47-caee-c860-eff8d7e1dbc0@lab.ntt.co.jp>

Added to CF here: https://commitfest.postgresql.org/17/1520/

The reorganization adds/removes header files to/from .[ch]
files. That can easily leave useless includes and they're hardly
noticed. Following files touched in this patch have such useless
includes after this patch.

On the other hand, naive decision of this kind of cleanup can
lead to curruption. [1]/messages/by-id/6748.1518711125@sss.pgh.pa.us So, I don't insisit that the all of the
following *should* amended, especially for rel.h.

[1]: /messages/by-id/6748.1518711125@sss.pgh.pa.us

==== nodeModifyTable.c:

+#include "rewrite/rewriteManip.h"

rewriteManip.h is changed to include rel.h so rel.h is no longer
need to be included in the file. (It is also included in lmgr.h
so it was needless since before this patch, though.)

==== hba.c:

+#include "catalog/objectaddress.h"

objectaddress.h includes acl.h so acl.h is no longer useful.

==== joinrels.c:

+#include "utils/partcache.h"

partcache.h includes lsyscache.h.

==== prepunion.c:

+#include "utils/partcache.h"

partcache.h includes lsyscache.h and partcache.h is included in
rel.h. So partcache.h and lsyscache.h are not required.

==== utility.c:

+#include "utils/rel.h"

rel.h includes xlog.h (and xlog.h includes fd.h). The last two
are removable.

==== partcache.c:
parsenodes.h is included from some other files.
rel.h is included from rewriteManip.h.
partcache.h is included from rel.h
As the result, parsenodes.h, rel.h, partcache.h are not required.

==== relcache.c:

+#include "utils/partcache.h"

lsyscache.h is included by partcache.h.

==== rel.h:

+#include "utils/partcache.h"

partcache.h includes fmgr.h and relcache.h so the last two are
no longer useful.

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center

#10Amit Langote
Langote_Amit_f8@lab.ntt.co.jp
In reply to: Kyotaro Horiguchi (#9)
Re: reorganizing partitioning code

Horiguchi-san,

On 2018/02/16 14:07, Kyotaro HORIGUCHI wrote:

Hello. I'd like to make a humble comment.

Thanks a lot for taking a look.

The reorganization adds/removes header files to/from .[ch]
files. That can easily leave useless includes and they're hardly
noticed. Following files touched in this patch have such useless
includes after this patch.

Yes, I agree.

On the other hand, naive decision of this kind of cleanup can
lead to curruption. [1] So, I don't insisit that the all of the
following *should* amended, especially for rel.h.

[1] /messages/by-id/6748.1518711125@sss.pgh.pa.us

I was initially trying limit this patch's #include churn so that it only
touches .[ch] files that are somehow involved with partitioning, but ended
up with a lot of files touched. To prevent confusion and concerns, I have
updated the patch to keep the churn to minimum as best as I could.

So, while the v3 patch looked like this:

24 files changed, 2373 insertions(+), 2312 deletions(-)
create mode 100644 src/backend/utils/cache/partcache.c
create mode 100644 src/include/utils/partcache.h

v4 looks like this:

15 files changed, 2314 insertions(+), 2263 deletions(-)
create mode 100644 src/backend/utils/cache/partcache.c
create mode 100644 src/include/utils/partcache.h

==== nodeModifyTable.c:

+#include "rewrite/rewriteManip.h"

rewriteManip.h is changed to include rel.h so rel.h is no longer
need to be included in the file. (It is also included in lmgr.h
so it was needless since before this patch, though.)

On second thought, I decided to hold back on moving
map_partition_varattnos() to rewriteManip.c, because its interface is
unlike any other functions in that file, requiring us to include rel.h in
rewriteManip.h. So, I removed this #include from nodeModifyTable.c.

==== hba.c:

+#include "catalog/objectaddress.h"

objectaddress.h includes acl.h so acl.h is no longer useful.

This and a few others were necessitated by removing partition.h from
executor.h. For now, I'm putting partition.h back into executor.h,
although it would be nice to remove it eventually.

==== joinrels.c:

+#include "utils/partcache.h"

partcache.h includes lsyscache.h.

Moved lsyscache.h from partcache.h to partcache.c.

==== prepunion.c:

+#include "utils/partcache.h"

partcache.h includes lsyscache.h and partcache.h is included in
rel.h. So partcache.h and lsyscache.h are not required.

Oops, why did I even include partcache.h in prepunion.c!

Removed.

==== utility.c:

+#include "utils/rel.h"

rel.h includes xlog.h (and xlog.h includes fd.h). The last two
are removable.

I've reverted the change that necessitated this and so this one.

==== partcache.c:
parsenodes.h is included from some other files.
rel.h is included from rewriteManip.h.
partcache.h is included from rel.h
As the result, parsenodes.h, rel.h, partcache.h are not required.

Removed.

==== relcache.c:

+#include "utils/partcache.h"

lsyscache.h is included by partcache.h.

lsyscache.h was moved from partcache.h per above, so keeping here.

==== rel.h:

+#include "utils/partcache.h"

partcache.h includes fmgr.h and relcache.h so the last two are
no longer useful.

Removed both.

Attached updated version.

Thanks,
Amit

Attachments:

v4-0001-Reorganize-partitioning-code.patchtext/plain; charset=UTF-8; name=v4-0001-Reorganize-partitioning-code.patchDownload+2314-2264
#11David Steele
david@pgmasters.net
In reply to: Amit Langote (#10)
Re: Re: reorganizing partitioning code

Hi Amit,

On 2/16/18 3:36 AM, Amit Langote wrote:

Attached updated version.

This patch no longer applies and the conflicts do not appear to be trivial.

I'm a bit confused about your comment in [1]/messages/by-id/33098109-9ef1-9594-e7d5-0977a50f8cfa@lab.ntt.co.jp:

I gave up on rebasing this patch yesterday as I couldn't finish it in
5 minutes, but maybe I will try later this month. Gotta focus on
thefaster pruning stuff for now...

How much later are we talking about?

Marked Waiting on Author.

--
-David
david@pgmasters.net

[1]: /messages/by-id/33098109-9ef1-9594-e7d5-0977a50f8cfa@lab.ntt.co.jp
/messages/by-id/33098109-9ef1-9594-e7d5-0977a50f8cfa@lab.ntt.co.jp

#12Amit Langote
Langote_Amit_f8@lab.ntt.co.jp
In reply to: David Steele (#11)
Re: Re: reorganizing partitioning code

Hi David.

On Fri, Mar 2, 2018 at 11:53 PM, David Steele <david@pgmasters.net> wrote:

Hi Amit,

On 2/16/18 3:36 AM, Amit Langote wrote:

Attached updated version.

This patch no longer applies and the conflicts do not appear to be trivial.

I'm a bit confused about your comment in [1]:

I gave up on rebasing this patch yesterday as I couldn't finish it in
5 minutes, but maybe I will try later this month. Gotta focus on
thefaster pruning stuff for now...

How much later are we talking about?

Sorry about the confusing comment. It could be sometime later half of
the next week.

Thanks,
Amit

#13David Steele
david@pgmasters.net
In reply to: Amit Langote (#12)
Re: Re: Re: reorganizing partitioning code

Hi Amit,

On 3/2/18 11:17 AM, Amit Langote wrote:

On Fri, Mar 2, 2018 at 11:53 PM, David Steele <david@pgmasters.net> wrote:

Hi Amit,

On 2/16/18 3:36 AM, Amit Langote wrote:

Attached updated version.

This patch no longer applies and the conflicts do not appear to be trivial.

I'm a bit confused about your comment in [1]:

I gave up on rebasing this patch yesterday as I couldn't finish it in
5 minutes, but maybe I will try later this month. Gotta focus on
thefaster pruning stuff for now...

How much later are we talking about?

Sorry about the confusing comment. It could be sometime later half of
the next week.

We are now three weeks into the CF with no new patch.

Are you planning to update this patch? If not, I think it should be
marked as Returned with Feedback and submitted to the next CF once it
has been updated.

Regards,
--
-David
david@pgmasters.net

#14Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: David Steele (#13)
Re: Re: Re: reorganizing partitioning code

David Steele wrote:

Sorry about the confusing comment. It could be sometime later half of
the next week.

We are now three weeks into the CF with no new patch.

Are you planning to update this patch? If not, I think it should be
marked as Returned with Feedback and submitted to the next CF once it
has been updated.

This is no new development, only code movement. I think it would be
worse to have three different branches of partitioning code, v10
"basic", v11 "powerful but not reorganized", v12 "reorganized". I'd
rather have only v10 "basic" and v11+ "powerful".

Let's keep this entry open till the last minute.

--
�lvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#15Tom Lane
tgl@sss.pgh.pa.us
In reply to: Alvaro Herrera (#14)
Re: reorganizing partitioning code

Alvaro Herrera <alvherre@alvh.no-ip.org> writes:

David Steele wrote:

Are you planning to update this patch? If not, I think it should be
marked as Returned with Feedback and submitted to the next CF once it
has been updated.

This is no new development, only code movement. I think it would be
worse to have three different branches of partitioning code, v10
"basic", v11 "powerful but not reorganized", v12 "reorganized". I'd
rather have only v10 "basic" and v11+ "powerful".

Let's keep this entry open till the last minute.

Nonetheless, it's March 21. David's point is that it's time to get a
move on.

regards, tom lane

#16Amit Langote
Langote_Amit_f8@lab.ntt.co.jp
In reply to: Tom Lane (#15)
Re: reorganizing partitioning code

On 2018/03/22 2:33, Tom Lane wrote:

Alvaro Herrera <alvherre@alvh.no-ip.org> writes:

David Steele wrote:

Are you planning to update this patch? If not, I think it should be
marked as Returned with Feedback and submitted to the next CF once it
has been updated.

This is no new development, only code movement. I think it would be
worse to have three different branches of partitioning code, v10
"basic", v11 "powerful but not reorganized", v12 "reorganized". I'd
rather have only v10 "basic" and v11+ "powerful".

Let's keep this entry open till the last minute.

Nonetheless, it's March 21. David's point is that it's time to get a
move on.

I'm sorry it took me a while to reply on this thread.

Due to quite a few changes to the partitioning-related code recently and
also considering some pending patches which might touch the code moved
around by this patch, I'd been putting off rebasing this patch. Although,
I should have said that before without waiting until today to do so. Sorry.

FWIW, I did manage to rebase it this morning and posting it here.

Thanks,
Amit

Attachments:

v5-0001-Reorganize-partitioning-code.patchtext/plain; charset=UTF-8; name=v5-0001-Reorganize-partitioning-code.patchDownload+2343-2288
#17Robert Haas
robertmhaas@gmail.com
In reply to: Alvaro Herrera (#14)
Re: Re: Re: reorganizing partitioning code

On Wed, Mar 21, 2018 at 10:20 AM, Alvaro Herrera
<alvherre@alvh.no-ip.org> wrote:

Let's keep this entry open till the last minute.

Ugh, I don't like keeping things open till the last minute all that
much, especially if they're not being updated. But since this has
been updated I guess it's somewhat moot.

Are you going to pick this up RSN?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#18Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Robert Haas (#17)
Re: Re: Re: reorganizing partitioning code

Robert Haas wrote:

On Wed, Mar 21, 2018 at 10:20 AM, Alvaro Herrera
<alvherre@alvh.no-ip.org> wrote:

Let's keep this entry open till the last minute.

Ugh, I don't like keeping things open till the last minute all that
much, especially if they're not being updated. But since this has
been updated I guess it's somewhat moot.

Are you going to pick this up RSN?

If during next week qualifies as RSN, then yes.

--
�lvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#19Amit Langote
Langote_Amit_f8@lab.ntt.co.jp
In reply to: Amit Langote (#16)
Re: reorganizing partitioning code

On 2018/03/22 11:45, Amit Langote wrote:

FWIW, I did manage to rebase it this morning and posting it here.

Rebased again.

I started wondering if we should separate out stuff related to partition
bounds. That is create a utils/partbound.h and put PartitionBoundInfo and
related comparison and search functions into a utils/adt/partbound.c. I
had started thinking about that when I looked at the code added by the
patch submitted on the "advanced partition matching algorithm for
partition-wise join" thread [1]https://commitfest.postgresql.org/17/1553/. I haven't done anything about that though.

Thanks,
Amit

[1]: https://commitfest.postgresql.org/17/1553/

Attachments:

v6-0001-Reorganize-partitioning-code.patchtext/plain; charset=UTF-8; name=v6-0001-Reorganize-partitioning-code.patchDownload+2698-2640
#20Robert Haas
robertmhaas@gmail.com
In reply to: Amit Langote (#19)
Re: reorganizing partitioning code

On Wed, Mar 28, 2018 at 12:07 AM, Amit Langote
<Langote_Amit_f8@lab.ntt.co.jp> wrote:

On 2018/03/22 11:45, Amit Langote wrote:

FWIW, I did manage to rebase it this morning and posting it here.

Rebased again.

I started wondering if we should separate out stuff related to partition
bounds. That is create a utils/partbound.h and put PartitionBoundInfo and
related comparison and search functions into a utils/adt/partbound.c. I
had started thinking about that when I looked at the code added by the
patch submitted on the "advanced partition matching algorithm for
partition-wise join" thread [1]. I haven't done anything about that though.

adt = Abstract Data Type, which I think we've interpreted up until now
to mean an SQL-visible data type, so that seems like an odd choice.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company