UTF-8 on Postgres wire protocol

Started by Rui Pacheco over 9 years ago, 2 messages, general
#1 Rui Pacheco
rui.pacheco@gmail.com

I’m toying around with the wire protocol and came across something I don’t understand.

I created a table with two columns, one called “id” and one called “señor”. When I select from that table I get the list of columns, and while it’s fairly easy to identify the column with the name “id”, I’m not sure how to identify the other column:

So this would be the ID column:

[…]
[7]: = 0x69
[8]: = 0x64
[9]: = 0x00
[10]: = 0x00
[11]: = 0x00
[12]: = 0x4f
[13]: = 0x08
[14]: = 0x00
[15]: = 0x01
[16]: = 0x00
[17]: = 0x00
[18]: = 0x00
[19]: = 0x17
[20]: = 0x00
[21]: = 0x04
[22]: = 0xff
[23]: = 0xff
[24]: = 0xff
[25]: = 0xff
[26]: = 0x00
[27]: = 0x00 […]
[…]

And this señor:
[47]: = 0x01
[48]: = 0x03
[49]: = 0x00
[50]: = 0x00
[51]: = 0x73
[52]: = 0x65
[53]: = 0xc3
[54]: = 0xb1
[55]: = 0x6f
[56]: = 0x72
[57]: = 0x00
[58]: = 0x00
[59]: = 0x00
[60]: = 0x4f
[61]: = 0x08
[62]: = 0x00
[63]: = 0x08
[64]: = 0x00
[65]: = 0x00
[66]: = 0x04
[67]: = 0x13
[68]: = 0xff
[69]: = 0xff
[70]: = 0x00
[71]: = 0x00 […]
[…]

What are the 4 bytes that precede the word señor? In other words, if I were to parse this, how would I know where the column name begins and ends?

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

#2 Michael Paquier
michael@paquier.xyz
In reply to: Rui Pacheco (#1)
Re: UTF-8 on Postgres wire protocol

On Thu, Dec 22, 2016 at 8:25 AM, Rui Pacheco <rui.pacheco@gmail.com> wrote:

I’m toying around with the wire protocol and came across something I don’t understand.

I created a table with two columns, one called “id” and one called “señor”. When I select from that table I get the list of columns, and while it’s fairly easy to identify the column with the name “id”, I’m not sure how to identify the other column:

So this would be the ID column:

[…]
[7] = 0x69
[8] = 0x64

Yes, this one maps to "id".

And this señor:
[47] = 0x01
[48] = 0x03
[49] = 0x00
[50] = 0x00

The string is from here...

[51] = 0x73
[52] = 0x65
[53] = 0xc3
[54] = 0xb1
[55] = 0x6f
[56] = 0x72

To here. And then señor ends.
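
As a quick check of the span marked above, here is a minimal Python sketch that decodes bytes [51] to [57] from the dump. It assumes the client encoding is UTF-8 and that the name is terminated by the first 0x00 byte, as the protocol documentation describes for string fields. The variable names are illustrative only.

```python
# Bytes [51]..[57] from the dump: the column name plus its 0x00 terminator.
raw = bytes([0x73, 0x65, 0xC3, 0xB1, 0x6F, 0x72, 0x00])

# The name ends at the first zero byte; UTF-8 never uses 0x00 inside
# a multi-byte sequence, so scanning for it is safe.
end = raw.index(b"\x00")
name = raw[:end].decode("utf-8")

print(name, end, len(name))
```

The name occupies six bytes but decodes to five characters, because "ñ" is encoded as the two bytes 0xc3 0xb1.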

What are the 4 bytes that precede the word señor? In other words, if I were to parse this, how would I know where the column name begins and ends?

I am not sure which message you used to query them, but the answer you
are looking for is most likely here:
https://www.postgresql.org/docs/9.6/static/protocol-message-formats.html
https://www.postgresql.org/docs/9.6/static/protocol-message-types.html
If you are looking for a reliable way to re-implement the frontend side
of the protocol, parsing the data according to those docs is the way to
go.
--
Michael
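
For readers following along, here is a hedged sketch of parsing a RowDescription ('T') message in Python, using the field layout given on the message-formats page linked above. Each field is a null-terminated name followed by 18 fixed bytes: table OID (int32), column attribute number (int16), type OID (int32), type size (int16), type modifier (int32), format code (int16). The names `Field` and `parse_row_description` are hypothetical. The sketch assumes the buffer holds exactly one complete message and that names are UTF-8 encoded.

```python
import struct
from typing import List, NamedTuple


class Field(NamedTuple):
    name: str
    table_oid: int
    column_attr: int
    type_oid: int
    type_size: int
    type_modifier: int
    format_code: int


# Fixed-size part of each field descriptor, in network byte order (18 bytes).
_FIXED = struct.Struct("!IhIhih")


def parse_row_description(msg: bytes, encoding: str = "utf-8") -> List[Field]:
    """Parse one RowDescription message that starts at msg[0]."""
    if msg[0:1] != b"T":
        raise ValueError("not a RowDescription message")
    # Int32 length (counts itself but not the type byte), then Int16 field count.
    _length, nfields = struct.unpack_from("!ih", msg, 1)
    pos = 7
    fields = []
    for _ in range(nfields):
        # The column name runs up to the next zero byte.
        end = msg.index(b"\x00", pos)
        name = msg[pos:end].decode(encoding)
        pos = end + 1
        fields.append(Field(name, *_FIXED.unpack_from(msg, pos)))
        pos += _FIXED.size
    return fields
```

Under this layout, the bytes immediately before any column name other than the first are the tail of the previous field's descriptor: the end of its type modifier followed by its two-byte format code. That would account for the 0x01 0x03 0x00 0x00 in front of "señor", although this is a reading of the elided dump and not something confirmed in the thread.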
