Refactor textToQualifiedNameList()

Started by Yugo Nagataover 7 years ago13 messages
#1Yugo Nagata
nagata@sraoss.co.jp
1 attachment(s)

Hi,

When working on other patch[1], I found there are almost same
functions, texttoQualifiedNameList() and stringToQualifiedNameList().
The only difference is the argument type, text or char*. I don't know
why these functions are defined seperately, but I think the former
function can be rewritten using the latter code as the attached patch.
Is this reasonable fix?

Regards,
--
Yugo Nagata <nagata@sraoss.co.jp>

Attachments:

refactor_textToQualifiedNameList.patchtext/x-diff; name=refactor_textToQualifiedNameList.patchDownload
diff --git a/src/backend/utils/adt/varlena.c b/src/backend/utils/adt/varlena.c
index a5e812d026..2ef1a1e330 100644
--- a/src/backend/utils/adt/varlena.c
+++ b/src/backend/utils/adt/varlena.c
@@ -34,6 +34,7 @@
 #include "utils/memutils.h"
 #include "utils/pg_locale.h"
 #include "utils/sortsupport.h"
+#include "utils/regproc.h"
 #include "utils/varlena.h"
 
 
@@ -3233,34 +3234,12 @@ textToQualifiedNameList(text *textval)
 {
 	char	   *rawname;
 	List	   *result = NIL;
-	List	   *namelist;
-	ListCell   *l;
 
 	/* Convert to C string (handles possible detoasting). */
 	/* Note we rely on being able to modify rawname below. */
 	rawname = text_to_cstring(textval);
 
-	if (!SplitIdentifierString(rawname, '.', &namelist))
-		ereport(ERROR,
-				(errcode(ERRCODE_INVALID_NAME),
-				 errmsg("invalid name syntax")));
-
-	if (namelist == NIL)
-		ereport(ERROR,
-				(errcode(ERRCODE_INVALID_NAME),
-				 errmsg("invalid name syntax")));
-
-	foreach(l, namelist)
-	{
-		char	   *curname = (char *) lfirst(l);
-
-		result = lappend(result, makeString(pstrdup(curname)));
-	}
-
-	pfree(rawname);
-	list_free(namelist);
-
-	return result;
+	return stringToQualifiedNameList(rawname);
 }
 
 /*
#2Yugo Nagata
nagata@sraoss.co.jp
In reply to: Yugo Nagata (#1)
Re: Refactor textToQualifiedNameList()

On Fri, 24 Aug 2018 20:44:12 +0900
Yugo Nagata <nagata@sraoss.co.jp> wrote:

Hi,

When working on other patch[1], I found there are almost same
functions, texttoQualifiedNameList() and stringToQualifiedNameList().
The only difference is the argument type, text or char*. I don't know
why these functions are defined seperately, but I think the former
function can be rewritten using the latter code as the attached patch.
Is this reasonable fix?

I forgot to add a link to the referred thread.

[1]: /messages/by-id/20180817075100.bc99378255943d3c3951ad63@sraoss.co.jp

Regards,
--
Yugo Nagata <nagata@sraoss.co.jp>

--
Yugo Nagata <nagata@sraoss.co.jp>

#3Kyotaro HORIGUCHI
horiguchi.kyotaro@lab.ntt.co.jp
In reply to: Yugo Nagata (#1)
Re: Refactor textToQualifiedNameList()

Hello.

At Fri, 24 Aug 2018 20:44:12 +0900, Yugo Nagata <nagata@sraoss.co.jp> wrote in <20180824204412.150979ae6b283ddb639f93f6@sraoss.co.jp>

When working on other patch[1], I found there are almost same
functions, texttoQualifiedNameList() and stringToQualifiedNameList().
The only difference is the argument type, text or char*. I don't know
why these functions are defined seperately, but I think the former
function can be rewritten using the latter code as the attached patch.
Is this reasonable fix?

The functions were introduced within a month for different
objectives in March and April, 2002. I supppose that they are
intentionally added as separate functions for simplicitly since
the second one is apparent CnP'ed from the first one.

commit 5f4745adf4fb2a1f933b25d7a2bc72b39fa9edfd
commit 52200befd04b9fa71da83231c808764867079226

Returning to the patch, the downside of it is that textToQNL
makes an extra and unused copy of the parameter string. (It's a
kind of bug that it is forgetting to free rawname.)

Maybe we can separate them into three functions (or one function
and two macros) to get rid of the duplication but I'm not sure
it's worth doing..

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center

#4Yugo Nagata
nagata@sraoss.co.jp
In reply to: Kyotaro HORIGUCHI (#3)
1 attachment(s)
Re: Refactor textToQualifiedNameList()

On Tue, 28 Aug 2018 11:49:26 +0900 (Tokyo Standard Time)
Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp> wrote:

At Fri, 24 Aug 2018 20:44:12 +0900, Yugo Nagata <nagata@sraoss.co.jp> wrote in <20180824204412.150979ae6b283ddb639f93f6@sraoss.co.jp>

When working on other patch[1], I found there are almost same
functions, texttoQualifiedNameList() and stringToQualifiedNameList().
The only difference is the argument type, text or char*. I don't know
why these functions are defined seperately, but I think the former
function can be rewritten using the latter code as the attached patch.
Is this reasonable fix?

The functions were introduced within a month for different
objectives in March and April, 2002. I supppose that they are
intentionally added as separate functions for simplicitly since
the second one is apparent CnP'ed from the first one.

commit 5f4745adf4fb2a1f933b25d7a2bc72b39fa9edfd
commit 52200befd04b9fa71da83231c808764867079226

Thank you for your comments. I also looked at the commit history.
stringToQNL is added after textToQNL as a static function originally,
but this is now public and reffered from other modules, too. So, I
propose to mote this from regproc.c to verlena.c and rewrite textToQNL
to call stringToQNL. This is just for reducing redundancy of the code.
Attached is the updated patch.

Returning to the patch, the downside of it is that textToQNL
makes an extra and unused copy of the parameter string. (It's a
kind of bug that it is forgetting to free rawname.)

I also fixed to free rawname in the texttoQNL.

Regards,

--
Yugo Nagata <nagata@sraoss.co.jp>

Attachments:

refactor_textToQualifiedNameList_v2.patchtext/x-diff; name=refactor_textToQualifiedNameList_v2.patchDownload
diff --git a/src/backend/utils/adt/regproc.c b/src/backend/utils/adt/regproc.c
index a0079821fe..ebc087ed3f 100644
--- a/src/backend/utils/adt/regproc.c
+++ b/src/backend/utils/adt/regproc.c
@@ -1680,43 +1680,6 @@ text_regclass(PG_FUNCTION_ARGS)
 }
 
 
-/*
- * Given a C string, parse it into a qualified-name list.
- */
-List *
-stringToQualifiedNameList(const char *string)
-{
-	char	   *rawname;
-	List	   *result = NIL;
-	List	   *namelist;
-	ListCell   *l;
-
-	/* We need a modifiable copy of the input string. */
-	rawname = pstrdup(string);
-
-	if (!SplitIdentifierString(rawname, '.', &namelist))
-		ereport(ERROR,
-				(errcode(ERRCODE_INVALID_NAME),
-				 errmsg("invalid name syntax")));
-
-	if (namelist == NIL)
-		ereport(ERROR,
-				(errcode(ERRCODE_INVALID_NAME),
-				 errmsg("invalid name syntax")));
-
-	foreach(l, namelist)
-	{
-		char	   *curname = (char *) lfirst(l);
-
-		result = lappend(result, makeString(pstrdup(curname)));
-	}
-
-	pfree(rawname);
-	list_free(namelist);
-
-	return result;
-}
-
 /*****************************************************************************
  *	 SUPPORT ROUTINES														 *
  *****************************************************************************/
diff --git a/src/backend/utils/adt/varlena.c b/src/backend/utils/adt/varlena.c
index a5e812d026..1846ddcae8 100644
--- a/src/backend/utils/adt/varlena.c
+++ b/src/backend/utils/adt/varlena.c
@@ -3219,26 +3219,19 @@ name_text(PG_FUNCTION_ARGS)
 	PG_RETURN_TEXT_P(cstring_to_text(NameStr(*s)));
 }
 
-
 /*
- * textToQualifiedNameList - convert a text object to list of names
- *
- * This implements the input parsing needed by nextval() and other
- * functions that take a text parameter representing a qualified name.
- * We split the name at dots, downcase if not double-quoted, and
- * truncate names if they're too long.
+ * Given a C string, parse it into a qualified-name list.
  */
 List *
-textToQualifiedNameList(text *textval)
+stringToQualifiedNameList(const char *string)
 {
 	char	   *rawname;
 	List	   *result = NIL;
 	List	   *namelist;
 	ListCell   *l;
 
-	/* Convert to C string (handles possible detoasting). */
-	/* Note we rely on being able to modify rawname below. */
-	rawname = text_to_cstring(textval);
+	/* We need a modifiable copy of the input string. */
+	rawname = pstrdup(string);
 
 	if (!SplitIdentifierString(rawname, '.', &namelist))
 		ereport(ERROR,
@@ -3263,6 +3256,31 @@ textToQualifiedNameList(text *textval)
 	return result;
 }
 
+
+/*
+ * textToQualifiedNameList - convert a text object to list of names
+ *
+ * This implements the input parsing needed by nextval() and other
+ * functions that take a text parameter representing a qualified name.
+ * We split the name at dots, downcase if not double-quoted, and
+ * truncate names if they're too long.
+ */
+List *
+textToQualifiedNameList(text *textval)
+{
+	char	   *rawname;
+	List	   *result = NIL;
+
+	/* Convert to C string (handles possible detoasting). */
+	/* Note we rely on being able to modify rawname below. */
+	rawname = text_to_cstring(textval);
+
+	result = stringToQualifiedNameList(rawname);
+	pfree(rawname);
+
+	return result;
+}
+
 /*
  * SplitIdentifierString --- parse a string containing identifiers
  *
diff --git a/src/include/utils/regproc.h b/src/include/utils/regproc.h
index 5b9a8cbee8..ace74ed49f 100644
--- a/src/include/utils/regproc.h
+++ b/src/include/utils/regproc.h
@@ -15,7 +15,6 @@
 
 #include "nodes/pg_list.h"
 
-extern List *stringToQualifiedNameList(const char *string);
 extern char *format_procedure(Oid procedure_oid);
 extern char *format_procedure_qualified(Oid procedure_oid);
 extern void format_procedure_parts(Oid operator_oid, List **objnames,
diff --git a/src/include/utils/varlena.h b/src/include/utils/varlena.h
index c776931bc4..a51c84beb3 100644
--- a/src/include/utils/varlena.h
+++ b/src/include/utils/varlena.h
@@ -26,6 +26,7 @@ extern int varstr_levenshtein_less_equal(const char *source, int slen,
 							  const char *target, int tlen,
 							  int ins_c, int del_c, int sub_c,
 							  int max_d, bool trusted);
+extern List *stringToQualifiedNameList(const char *string);
 extern List *textToQualifiedNameList(text *textval);
 extern bool SplitIdentifierString(char *rawstring, char separator,
 					  List **namelist);
#5Pavel Stehule
pavel.stehule@gmail.com
In reply to: Yugo Nagata (#4)
Re: Refactor textToQualifiedNameList()

The following review has been posted through the commitfest application:
make installcheck-world: tested, failed
Implements feature: tested, failed
Spec compliant: tested, failed
Documentation: tested, failed

Hi

I tested this patch. This patch removes some duplicate rows, what is good - on second hand, after this patch, the textToQualifiedNameList does one more copy of input string more. I looked where this function is used, and I don't think so it is a issue.

This patch is trivial - all tests passed and I don't see any problem. I'll mark as ready for commit.

The new status of this patch is: Ready for Committer

#6Michael Paquier
michael@paquier.xyz
In reply to: Pavel Stehule (#5)
Re: Refactor textToQualifiedNameList()

On Mon, Oct 08, 2018 at 03:22:55PM +0000, Pavel Stehule wrote:

I tested this patch. This patch removes some duplicate rows, what is
good - on second hand, after this patch, the textToQualifiedNameList
does one more copy of input string more. I looked where this function
is used, and I don't think so it is a issue.

This patch is trivial - all tests passed and I don't see any
problem. I'll mark as ready for commit.

The new status of this patch is: Ready for Committer

Are you sure that there is no performance impact? Say implement a
simple SQL-callable function which does textToQualifiedNameList on the
same string N times, and then compare the time it takes to run HEAD and
the refactored implementation. I may buy that an extra palloc call is
not something to worry about, but in my experience it can be a problem.

This patch also changes stringToQualifiedNameList from regproc.h to
varlena.h, which would break extensions calling it. That's not nice
either.
--
Michael

#7Pavel Stehule
pavel.stehule@gmail.com
In reply to: Michael Paquier (#6)
Re: Refactor textToQualifiedNameList()

Hi

út 9. 10. 2018 v 5:28 odesílatel Michael Paquier <michael@paquier.xyz>
napsal:

On Mon, Oct 08, 2018 at 03:22:55PM +0000, Pavel Stehule wrote:

I tested this patch. This patch removes some duplicate rows, what is
good - on second hand, after this patch, the textToQualifiedNameList
does one more copy of input string more. I looked where this function
is used, and I don't think so it is a issue.

This patch is trivial - all tests passed and I don't see any
problem. I'll mark as ready for commit.

The new status of this patch is: Ready for Committer

Are you sure that there is no performance impact? Say implement a
simple SQL-callable function which does textToQualifiedNameList on the
same string N times, and then compare the time it takes to run HEAD and
the refactored implementation. I may buy that an extra palloc call is
not something to worry about, but in my experience it can be a problem.

I try to find some worst case where this function is used and I expect so
most critical usage is in "nextval" function.

I disabled fsync and debug assertions and created sequence by command

create sequence pretty_long_schema_name.pretty_long_sequence_name cache
1000;

And I tested

explain analyze select
nextval('pretty_long_schema_name.pretty_long_sequence_name') from
generate_series(1,10000000);

The difference on 10M calls is about 300ms - it is about 6%.

It is probably the cost of one palloc and one free more.

I am not able to decide if it is too much. If it is too much, then probably
is better do nothing. Introducing another third function we will have more
lines than before.

This patch also changes stringToQualifiedNameList from regproc.h to
varlena.h, which would break extensions calling it. That's not nice
either.

It is not nice, but this is not serious problem for extensions's authors. I
remember more similar issues when I worked on Orafce extension.

My opinion is neutral - this patch reduces some lines, but not too much,
and has some overhead, but not too significant. Maybe more rich comment or
notes can be best solution.

Regards

Pavel

Show quoted text

--
Michael

#8Michael Paquier
michael@paquier.xyz
In reply to: Pavel Stehule (#7)
Re: Refactor textToQualifiedNameList()

On Tue, Oct 09, 2018 at 10:47:48AM +0200, Pavel Stehule wrote:

The difference on 10M calls is about 300ms - it is about 6%.

This number gives a good argument for rejecting this patch. I am not
usually against code beautification, but that's a high price to pay for
just some refactoring. On top of that, this potentially breaks
extension compilation.
--
Michael

#9Pavel Stehule
pavel.stehule@gmail.com
In reply to: Michael Paquier (#8)
Re: Refactor textToQualifiedNameList()

út 9. 10. 2018 v 13:20 odesílatel Michael Paquier <michael@paquier.xyz>
napsal:

On Tue, Oct 09, 2018 at 10:47:48AM +0200, Pavel Stehule wrote:

The difference on 10M calls is about 300ms - it is about 6%.

This number gives a good argument for rejecting this patch. I am not
usually against code beautification, but that's a high price to pay for
just some refactoring. On top of that, this potentially breaks
extension compilation.

ok

Regards

Pavel

--

Show quoted text

Michael

#10Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Michael Paquier (#8)
Re: Refactor textToQualifiedNameList()

On 2018-Oct-09, Michael Paquier wrote:

On Tue, Oct 09, 2018 at 10:47:48AM +0200, Pavel Stehule wrote:

The difference on 10M calls is about 300ms - it is about 6%.

This number gives a good argument for rejecting this patch. I am not
usually against code beautification, but that's a high price to pay for
just some refactoring. On top of that, this potentially breaks
extension compilation.

One thing I do like about this patch is that it takes
stringToQualifiedNameList out of regproc.c, where it was put as static
by commit 52200befd04b ("Implement types regprocedure, regoper,
regoperator, regclass, regtype" April 2002); made extern by ba790a5608ea
("Here is a patch for Composite and Set returning function support."
June 2002) in what appears to have been a careless change. The function
would end up in a place where it more reasonably belongs into,
varlena.c, next to its sibling textToQualifiedNameList.

The committer of such a change will get a lot of flak for changing the
#include requirements for code that calls that function, though.

--
�lvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#11Michael Paquier
michael@paquier.xyz
In reply to: Alvaro Herrera (#10)
Re: Refactor textToQualifiedNameList()

On Tue, Oct 09, 2018 at 09:54:12AM -0300, Alvaro Herrera wrote:

The committer of such a change will get a lot of flak for changing the
#include requirements for code that calls that function, though.

So the patch has been switched to rejected...
--
Michael

#12Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Michael Paquier (#11)
Re: Refactor textToQualifiedNameList()

On 2018-Oct-10, Michael Paquier wrote:

On Tue, Oct 09, 2018 at 09:54:12AM -0300, Alvaro Herrera wrote:

The committer of such a change will get a lot of flak for changing the
#include requirements for code that calls that function, though.

So the patch has been switched to rejected...

I know -- I'm just disagreeing with that.

--
�lvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#13Michael Paquier
michael@paquier.xyz
In reply to: Alvaro Herrera (#12)
Re: Refactor textToQualifiedNameList()

On Wed, Oct 10, 2018 at 09:45:22AM -0300, Alvaro Herrera wrote:

On 2018-Oct-10, Michael Paquier wrote:

On Tue, Oct 09, 2018 at 09:54:12AM -0300, Alvaro Herrera wrote:

The committer of such a change will get a lot of flak for changing the
#include requirements for code that calls that function, though.

So the patch has been switched to rejected...

I know -- I'm just disagreeing with that.

The point of moving stringToQualifiedNameList out of regproc.c to
varlena.c is actually a good one, if we don't add the extra palloc
overhead. Now I suspect that any change of header makes plugin
maintainers unhappy, and I can see this routine extensively used on
github, like pipelineDB or such.
--
Michael