[Fwd: Re: dblink patches for comment]

Started by Joe Conwayalmost 17 years ago12 messageshackers
Jump to latest
#1Joe Conway
mail@joeconway.com

Based on Tom's post today about RC1, it sounds like I need to get this
committed very soon. Any complaints?

Joe

-------- Original Message --------
Subject: Re: [HACKERS] dblink patches for comment
Date: Tue, 02 Jun 2009 16:08:18 -0700
From: Joe Conway <mail@joeconway.com>

Tom Lane wrote:

The docs patch looks okay, except this comment is a bit hazy:

+ -- Note: local connection must require authentication for this to work properly

I think what it means is

+ -- Note: local connection must require password authentication for this to work properly

If not, please clarify some other way. It might also be good to be a
bit more clear about what "fail to work properly" might entail.

OK, hopefully the attached is more clear.

As far as the code goes, hopefully Peter will take a look since he's
spent more time on the SQL/MED code than I have. The only thing I can
see that looks bogus is that get_connect_string() is failing to handle
any quoting/escaping that might be needed for the values to be inserted
into the connection string. I don't recall offhand what rules libpq
has for that, but I hope it at least implements doubled single quotes...

Added quote_literal_cstr() around the connection string params. Also
found I needed to restrict the servername string length to NAMEDATALEN
in order to avoid an assert if a full connection string is passed to
dblink_connect().

Other comments?

Thanks,

Joe

Attachments:

dblink.2009.06.02.02-sqlmed.difftext/x-patch; name=dblink.2009.06.02.02-sqlmed.diffDownload+145-10
dblink.2009.06.02.02-sqlmed_doc.difftext/x-patch; name=dblink.2009.06.02.02-sqlmed_doc.diffDownload+59-0
Attached Message Parttext/plain; name="Attached Message Part"Download
#2Tom Lane
tgl@sss.pgh.pa.us
In reply to: Joe Conway (#1)
Re: [Fwd: Re: dblink patches for comment]

Joe Conway <mail@joeconway.com> writes:

Based on Tom's post today about RC1, it sounds like I need to get this
committed very soon. Any complaints?

The quoting logic is still completely the wrong thing :-(. For one
thing, quote_literal will try to generate E'' syntax in some cases.
But more to the point, quote_literal's quoting rules don't match
what is needed. A look at libpq's conninfo_parse says that what it
accepts is single-quoted strings in which backslash quotes the next
character. It does not recognize doubled single quotes. I think
you will need to whip up a special-purpose quoting subroutine.

One other really minor point is that the pstrdup here:

+ return pstrdup(buf->data);

is a waste of time. The StringInfo's buffer is already palloc'd.

regards, tom lane

#3Joe Conway
mail@joeconway.com
In reply to: Tom Lane (#2)
Re: [Fwd: Re: dblink patches for comment]

Tom Lane wrote:

The quoting logic is still completely the wrong thing :-(. For one
thing, quote_literal will try to generate E'' syntax in some cases.
But more to the point, quote_literal's quoting rules don't match
what is needed. A look at libpq's conninfo_parse says that what it
accepts is single-quoted strings in which backslash quotes the next
character. It does not recognize doubled single quotes. I think
you will need to whip up a special-purpose quoting subroutine.

OK, I see that. I assume I need to care for encoding issues? If so, do I
assume server encoding or client encoding?

+ return pstrdup(buf->data);

is a waste of time. The StringInfo's buffer is already palloc'd.

Thanks -- will fix.

Joe

#4Tom Lane
tgl@sss.pgh.pa.us
In reply to: Joe Conway (#3)
Re: [Fwd: Re: dblink patches for comment]

Joe Conway <mail@joeconway.com> writes:

Tom Lane wrote:

you will need to whip up a special-purpose quoting subroutine.

OK, I see that. I assume I need to care for encoding issues? If so, do I
assume server encoding or client encoding?

Hoo, good point. You can assume the database (server) encoding, because
that's what the local encoding is from the point of view of libpq ---
and the code in conninfo_parse knows nothing of encodings anyway. So
that's a no-op as far as the quoting itself goes. But that reminds me,
weren't you going to add something to force libpq to set client_encoding
to the database encoding?

regards, tom lane

#5Joe Conway
mail@joeconway.com
In reply to: Tom Lane (#4)
Re: [Fwd: Re: dblink patches for comment]

Tom Lane wrote:

Joe Conway <mail@joeconway.com> writes:

Tom Lane wrote:

you will need to whip up a special-purpose quoting subroutine.

OK, I see that. I assume I need to care for encoding issues? If so, do I
assume server encoding or client encoding?

Hoo, good point. You can assume the database (server) encoding, because
that's what the local encoding is from the point of view of libpq ---
and the code in conninfo_parse knows nothing of encodings anyway. So
that's a no-op as far as the quoting itself goes.

OK, got it. I think the attached is what you're looking for, although I
have not yet tested beyond "it compiles" and "it passes make installcheck".

But that reminds me, weren't you going to add something to force libpq to set client_encoding
to the database encoding?

Yes, I was going to work that next. I assume it is pretty
straightforward, but I've never been particularly strong on the nuances
of encodings...

Joe

Attachments:

dblink.2009.06.06.01-sqlmed.difftext/x-patch; name=dblink.2009.06.06.01-sqlmed.diffDownload+235-10
#6Tom Lane
tgl@sss.pgh.pa.us
In reply to: Joe Conway (#5)
Re: [Fwd: Re: dblink patches for comment]

Joe Conway <mail@joeconway.com> writes:

OK, got it. I think the attached is what you're looking for, although I
have not yet tested beyond "it compiles" and "it passes make installcheck".

You're making it vastly overcomplicated. Just do something like

for (cp = str; *cp; cp++)
{
if (*cp == '\\' || *cp == '\'')
AppendStringInfoChar(buf, '\\');
AppendStringInfoChar(buf, *cp);
}

Since you're working in a server-safe encoding, there is no need to
worry about multibyte characters --- the tests will never match
any byte of a multibyte char.

regards, tom lane

#7Joe Conway
mail@joeconway.com
In reply to: Tom Lane (#6)
Re: [Fwd: Re: dblink patches for comment]

Tom Lane wrote:

Joe Conway <mail@joeconway.com> writes:

OK, got it. I think the attached is what you're looking for, although I
have not yet tested beyond "it compiles" and "it passes make installcheck".

You're making it vastly overcomplicated. Just do something like

for (cp = str; *cp; cp++)
{
if (*cp == '\\' || *cp == '\'')
AppendStringInfoChar(buf, '\\');
AppendStringInfoChar(buf, *cp);
}

Since you're working in a server-safe encoding, there is no need to
worry about multibyte characters --- the tests will never match
any byte of a multibyte char.

It wasn't clear to me that would be safe, but thanks for your patience :-)

This version is clearly simpler. For the record I've included the doc
patch again here. Will commit shortly...

Joe

Attachments:

dblink.2009.06.06.02-sqlmed.difftext/x-patch; name=dblink.2009.06.06.02-sqlmed.diffDownload+170-10
dblink.2009.06.02.02-sqlmed_doc.difftext/x-patch; name=dblink.2009.06.02.02-sqlmed_doc.diffDownload+59-0
#8Joe Conway
mail@joeconway.com
In reply to: Tom Lane (#4)
Re: [Fwd: Re: dblink patches for comment]

Tom Lane wrote:

But that reminds me, weren't you going to add something to force
libpq to set client_encoding to the database encoding?

I think the attached is what you had in mind. But I don't know right off
how to trigger the failure (and therefore how to test the solution). A
naive test with two databases, one LATIN2, the other UTF8 does not
produce the error with simple text literals. Any guidance on specific
literals that would trigger the problem?

Thanks,

Joe

Attachments:

dblink.2009.06.06.01-encoding.difftext/x-patch; name=dblink.2009.06.06.01-encoding.diffDownload+5-0
#9Tom Lane
tgl@sss.pgh.pa.us
In reply to: Joe Conway (#8)
Re: [Fwd: Re: dblink patches for comment]

Joe Conway <mail@joeconway.com> writes:

Tom Lane wrote:

But that reminds me, weren't you going to add something to force
libpq to set client_encoding to the database encoding?

I think the attached is what you had in mind.

Looks plausible to me.

But I don't know right off
how to trigger the failure (and therefore how to test the solution). A
naive test with two databases, one LATIN2, the other UTF8 does not
produce the error with simple text literals. Any guidance on specific
literals that would trigger the problem?

Hmm, sending some non-ASCII characters from the LATIN2 end to the UTF8
end should do it, I would think. The other direction would probably
not show any immediate error.

regards, tom lane

#10Joe Conway
mail@joeconway.com
In reply to: Tom Lane (#9)
Re: [Fwd: Re: dblink patches for comment]

Tom Lane wrote:

Joe Conway <mail@joeconway.com> writes:

how to trigger the failure (and therefore how to test the solution). A
naive test with two databases, one LATIN2, the other UTF8 does not
produce the error with simple text literals. Any guidance on specific
literals that would trigger the problem?

Hmm, sending some non-ASCII characters from the LATIN2 end to the UTF8
end should do it, I would think. The other direction would probably
not show any immediate error.

I tried some random high-bit characters on the LATIN2 side, but was not
able to stumble upon the right combination of characters to trigger a
complaint. I've contacted Ruzsinszky Attila off-list and he said he will
get me a self contained test case.

Joe

#11Tom Lane
tgl@sss.pgh.pa.us
In reply to: Joe Conway (#8)
Re: [Fwd: Re: dblink patches for comment]

Joe Conway <mail@joeconway.com> writes:

I think the attached is what you had in mind. But I don't know right off
how to trigger the failure (and therefore how to test the solution). A
naive test with two databases, one LATIN2, the other UTF8 does not
produce the error with simple text literals.

I can reproduce an error (and verify the patch corrects it) using this
test case:

select '�x�y'::text as x;

select * from dblink('dbname = u8', $$select '�x�y'::text$$)
as t1 (x text);

(The two non-ASCII characters are octal 340 and 367, if they don't come
through properly in your mail.) Execute in a LATIN1 database (being sure
client_encoding is also LATIN1), connecting to a database with encoding
UTF8. With the patch, both commands give the same results; without,
I get

ERROR: invalid byte sequence for encoding "UTF8": 0xe078f7
HINT: This error can also happen if the byte sequence does not match the encoding expected by the server, which is controlled by "client_encoding".
CONTEXT: Error occurred on dblink connection named "unnamed": could not execute query.

Please get this committed soon, we have other stuff to get done
(like a pgindent run).

regards, tom lane

#12Joe Conway
mail@joeconway.com
In reply to: Tom Lane (#11)
Re: [Fwd: Re: dblink patches for comment]

Tom Lane wrote:

Please get this committed soon, we have other stuff to get done
(like a pgindent run).

Thanks -- committed.

Joe