pageinspect patch, for showing tuple data
Hi!
I've created a patch for pageinspect that allows to see data stored in the
tuple.
This patch has two main purposes:
1. Practical: Make manual DB recovery more simple
2. Educational: Seeing what data is actually stored in tuple, allows to get
better understanding of how does postgres actually works.
This patch adds several new functions, available from SQL queries. All these
functions are based on heap_page_items, but accept slightly different
arguments and has one additional column at the result set:
heap_page_tuples - accepts relation name, and bulkno, and returns usual
heap_page_items set with additional column that contain snapshot of tuple data
area stored in bytea.
heap_page_tuples_attributes - same as heap_page_tuples, but instead of single
tuple data bytea snapshot, it has array of bytea values, that were splitted
into attributes as they would be spitted by nocachegetattr function (I
actually reimplemented this function main algorithm to get this done)
heap_page_tuples_attrs_detoasted - same as heap_page_tuples_attrs, but all
varlen attributes values that were compressed or TOASTed, are replaced with
unTOASTed and uncompressed values.
There is also one strange function: _heap_page_items it is useless for
practical purposes. As heap_page_items it accepts page data as bytea, but also
get a relation name. It tries to apply tuple descriptor of that relation to
that page data.
This would allow you to try to read page data from one table using tuple
descriptor from anther. A strange idea, one should say. But this will allow
you: a) See how wrong data can be interpreted (educational purpose).
b) I have plenty of sanity check while reading parsing that tuple, for this
function I've changed error level from ERROR to WARNING. This function will
allow to write proper tests that all these checks work as they were designed
(I hope to write these tests sooner or later)
I've also added raw tuple data output to original heap_page_items function,
thought I am not sure if it is good idea. I just can add it there so I did it.
May be it would be better to change it back for better backward compatibility.
Attached patched is in "almost ready" state. It has some formatting issues.
I'd like to hear HACKER's opinion before finishing it and sending to
commitfest.
--
Nikolay Shaplov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company
Attachments:
pageinspect.difftext/x-patch; charset=UTF-8; name=pageinspect.diffDownload+720-239
On Mon, Aug 3, 2015 at 1:03 AM, Nikolay Shaplov
<n.shaplov@postgrespro.ru> wrote:
Hi!
I've created a patch for pageinspect that allows to see data stored in the
tuple.This patch has two main purposes:
1. Practical: Make manual DB recovery more simple
To what are you referring to in this case? Manual manipulation of
on-disk data manually?
b) I have plenty of sanity check while reading parsing that tuple, for this
function I've changed error level from ERROR to WARNING. This function will
allow to write proper tests that all these checks work as they were designed
(I hope to write these tests sooner or later)
+ ereport(inter_call_data->error_level,
+ (errcode(ERRCODE_DATA_CORRUPTED),
+ errmsg("Data corruption: Iterating over
tuple data reached out of actual tuple size")));
I don't think that the possibility to raise a WARNING is a good thing
in those code paths. If data is corrupted this may crash, and I am not
sure that anybody would want that even for educational purposes.
--
Michael
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
В письме от 3 августа 2015 14:30:46 пользователь Michael Paquier написал:
On Mon, Aug 3, 2015 at 1:03 AM, Nikolay Shaplov
<n.shaplov@postgrespro.ru> wrote:
Hi!
I've created a patch for pageinspect that allows to see data stored in the
tuple.This patch has two main purposes:
1. Practical: Make manual DB recovery more simple
To what are you referring to in this case? Manual manipulation of
on-disk data manually?
Yes, when DB is broken for example
b) I have plenty of sanity check while reading parsing that tuple, for
this
function I've changed error level from ERROR to WARNING. This function
will
allow to write proper tests that all these checks work as they were
designed (I hope to write these tests sooner or later)+ ereport(inter_call_data->error_level, + (errcode(ERRCODE_DATA_CORRUPTED), + errmsg("Data corruption: Iterating over tuple data reached out of actual tuple size"))); I don't think that the possibility to raise a WARNING is a good thing in those code paths. If data is corrupted this may crash, and I am not sure that anybody would want that even for educational purposes.
Hm... I considered _heap_page_items really very dangerous function, with big
red "Do not call it if you not sure what are you doing" warning. Reading data
with not proper attribute descriptor is dangerous any way. But when I wrote
that code, I did not have that checks at first, and it was really interesting
to subst one data with another and see what will happen. And I thought that
may be other explorers will like to do the same. And it is really possible
only in warning mode. So I left warnings only in _heap_page_items, and set
errors for all other cases.
--
Nikolay Shaplov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Nikolay Shaplov wrote:
This patch adds several new functions, available from SQL queries. All these
functions are based on heap_page_items, but accept slightly different
arguments and has one additional column at the result set:heap_page_tuples - accepts relation name, and bulkno, and returns usual
heap_page_items set with additional column that contain snapshot of tuple data
area stored in bytea.
I think the API around get_raw_page is more useful generally. You might
have table blocks stored in a bytea column for instance, because you
extracted from some remote server and inserted into a local table for
examination. If you only accept relname/blkno, there's no way to
examine data other than blocks directly in the database dir, which is
limiting.
There is also one strange function: _heap_page_items it is useless for
practical purposes. As heap_page_items it accepts page data as bytea, but also
get a relation name. It tries to apply tuple descriptor of that relation to
that page data.
For BRIN, I added something similar (brin_page_items), but it receives
the index OID rather than name, and constructs a tuple descriptor to
read the data. I think OID is better than name generally. (You can
cast the relation name to regclass).
It's easy to misuse, but these functions are superuser-only, so there
should be no security issue at least. The possibility of a crash
remains real, though, so maybe we should try to come up with a way to
harden that.
--
�lvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
В письме от 3 августа 2015 15:35:23 Вы написали:
Nikolay Shaplov wrote:
This patch adds several new functions, available from SQL queries. All
these functions are based on heap_page_items, but accept slightly
different arguments and has one additional column at the result set:heap_page_tuples - accepts relation name, and bulkno, and returns usual
heap_page_items set with additional column that contain snapshot of tuple
data area stored in bytea.I think the API around get_raw_page is more useful generally. You might
have table blocks stored in a bytea column for instance, because you
extracted from some remote server and inserted into a local table for
examination. If you only accept relname/blkno, there's no way to
examine data other than blocks directly in the database dir, which is
limiting.There is also one strange function: _heap_page_items it is useless for
practical purposes. As heap_page_items it accepts page data as bytea, but
also get a relation name. It tries to apply tuple descriptor of that
relation to that page data.For BRIN, I added something similar (brin_page_items), but it receives
the index OID rather than name, and constructs a tuple descriptor to
read the data. I think OID is better than name generally. (You can
cast the relation name to regclass).It's easy to misuse, but these functions are superuser-only, so there
should be no security issue at least. The possibility of a crash
remains real, though, so maybe we should try to come up with a way to
harden that.
Hmm... may be you are right. I'll try to change it would take raw page and
OID.
--
Nikolay Shaplov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
В письме от 3 августа 2015 15:35:23 пользователь Alvaro Herrera написал:
Nikolay Shaplov wrote:
This patch adds several new functions, available from SQL queries. All
these functions are based on heap_page_items, but accept slightly
different arguments and has one additional column at the result set:heap_page_tuples - accepts relation name, and bulkno, and returns usual
heap_page_items set with additional column that contain snapshot of tuple
data area stored in bytea.I think the API around get_raw_page is more useful generally. You might
have table blocks stored in a bytea column for instance, because you
extracted from some remote server and inserted into a local table for
examination. If you only accept relname/blkno, there's no way to
examine data other than blocks directly in the database dir, which is
limiting.There is also one strange function: _heap_page_items it is useless for
practical purposes. As heap_page_items it accepts page data as bytea, but
also get a relation name. It tries to apply tuple descriptor of that
relation to that page data.For BRIN, I added something similar (brin_page_items), but it receives
the index OID rather than name, and constructs a tuple descriptor to
read the data. I think OID is better than name generally. (You can
cast the relation name to regclass).It's easy to misuse, but these functions are superuser-only, so there
should be no security issue at least. The possibility of a crash
remains real, though, so maybe we should try to come up with a way to
harden that.
I've done as you've said: Now patch adds two functions heap_page_item_attrs
and heap_page_item_detoast_attrs and these function takes raw_page and
relation OID as arguments. They also have third optional parameter that allows
to change error mode from ERROR to WARNING.
Is it ok now?
--
Nikolay Shaplov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company
Attachments:
pageinspect_show_tuple_data.difftext/x-patch; charset=UTF-8; name=pageinspect_show_tuple_data.diffDownload+725-239
On Wed, Sep 2, 2015 at 6:58 PM, Nikolay Shaplov <n.shaplov@postgrespro.ru>
wrote:
В письме от 3 августа 2015 15:35:23 пользователь Alvaro Herrera написал:
Nikolay Shaplov wrote:
This patch adds several new functions, available from SQL queries. All
these functions are based on heap_page_items, but accept slightly
different arguments and has one additional column at the result set:heap_page_tuples - accepts relation name, and bulkno, and returns usual
heap_page_items set with additional column that contain snapshot oftuple
data area stored in bytea.
I think the API around get_raw_page is more useful generally. You might
have table blocks stored in a bytea column for instance, because you
extracted from some remote server and inserted into a local table for
examination. If you only accept relname/blkno, there's no way to
examine data other than blocks directly in the database dir, which is
limiting.There is also one strange function: _heap_page_items it is useless for
practical purposes. As heap_page_items it accepts page data as bytea,but
also get a relation name. It tries to apply tuple descriptor of that
relation to that page data.For BRIN, I added something similar (brin_page_items), but it receives
the index OID rather than name, and constructs a tuple descriptor to
read the data. I think OID is better than name generally. (You can
cast the relation name to regclass).It's easy to misuse, but these functions are superuser-only, so there
should be no security issue at least. The possibility of a crash
remains real, though, so maybe we should try to come up with a way to
harden that.I've done as you've said: Now patch adds two functions heap_page_item_attrs
and heap_page_item_detoast_attrs and these function takes raw_page and
relation OID as arguments. They also have third optional parameter that
allows
to change error mode from ERROR to WARNING.Is it ok now?
Yeah, I think that's acceptable to have a switch, defaulting to ERROR if
caller specifies nothing.
Here are some observations after looking at the code, no functional testing.
+ int error_mode = PG_NARGS()==3 ? PG_GETARG_BOOL(2)
:NULL;
When handling additional arguments, it seems to me that it is more readable
to use something like that:
if (PG_NARGS >= 3)
{
arg = PG_GETARG_XXX(2);
//etc.
}
+ error_mode = error_mode?WARNING:ERROR;
This generates warnings at compilation.
+ if (SRF_IS_FIRSTCALL() && (error_mode == WARNING))
+ {
+ ereport(WARNING,
+ (errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
+ (errmsg("Runnung heap_page_item_attrs in
WARNING mode."))));
+ }
I am not convinced that this is necessary.
+ inter_call_data->raw_page_size = VARSIZE(raw_page) -
VARHDRSZ;
+ if (inter_call_data->raw_page_size < SizeOfPageHeaderData)
Including raw_page_size in the status data does not seem necessary to me.
The page data is already available for each loop so you could just extract
it from it.
+ ereport(inter_call_data->error_level,
+ (errcode(ERRCODE_DATA_CORRUPTED),
+ errmsg("Data corruption: number of
attributes in tuple header is greater than number of attributes in tuple
descripor")));
I'd rather switch the error reports related to data corruption from ereport
to elog, which is more suited for internal errors, and it seems to me that
it is the case here.
heapfuncs.c:370:7: warning: variable 'raw_attr' is used uninitialized
whenever 'if' condition is false [-Wsometimes-uninitialized]
if (!is_null)
^~~~~~~~
heapfuncs.c:419:43: note: uninitialized use occurs here
raw_attrs = accumArrayResult(raw_attrs, raw_attr, is_null,
BYTEAOID, fctx->multi_call_memory_ctx);
^~~~~~~~
heapfuncs.c:370:3: note: remove the 'if' if its condition is always true
if (!is_null)
^~~~~~~~~~~~~
heapfuncs.c:357:17: note: initialize the variable 'raw_attr' to silence
this warning
Datum raw_attr;
My compiler complains about uninitialized variables.
+--
+-- _heap_page_items()
+--
+CREATE FUNCTION _heap_page_items(IN page bytea, IN relname text,
I am not sure why this is necessary and visibly it overlaps with the
existing declaration of heap_page_items. The last argument is different
though, and I recall that we decided not to use anymore the relation name
specified as text, but its OID instead in this module.
Documentation is missing, that would be good to have to understand what
each function is intended to do. Code has some whitespaces.
--
Michael
В письме от 4 сентября 2015 14:58:29 пользователь Michael Paquier написал:
Yeah, I think that's acceptable to have a switch, defaulting to ERROR if
caller specifies nothing.Here are some observations after looking at the code, no functional testing.
+ int error_mode = PG_NARGS()==3 ? PG_GETARG_BOOL(2)
:NULL;
When handling additional arguments, it seems to me that it is more readable
to use something like that:
if (PG_NARGS >= 3)
{
arg = PG_GETARG_XXX(2);
//etc.
}+ error_mode = error_mode?WARNING:ERROR;
This generates warnings at compilation.
yeah, what I have done here is too complex with no reason. I've simplified this
code now.
+ if (SRF_IS_FIRSTCALL() && (error_mode == WARNING)) + { + ereport(WARNING, + (errcode(ERRCODE_INSUFFICIENT_PRIVILEGE), + (errmsg("Runnung heap_page_item_attrs in WARNING mode.")))); + }I am not convinced that this is necessary.
I've removed it.
+ inter_call_data->raw_page_size = VARSIZE(raw_page) - VARHDRSZ; + if (inter_call_data->raw_page_size < SizeOfPageHeaderData) Including raw_page_size in the status data does not seem necessary to me. The page data is already available for each loop so you could just extract it from it.+ ereport(inter_call_data->error_level, + (errcode(ERRCODE_DATA_CORRUPTED), + errmsg("Data corruption: number of attributes in tuple header is greater than number of attributes in tuple descripor"))); I'd rather switch the error reports related to data corruption from ereport to elog, which is more suited for internal errors, and it seems to me that it is the case here.
Hm... First, since we have proper error code, do not see why not give it.
Second, this is not exactly internal error, in some cases user tries to parse
corrupted data on purpose. So for user it is not an internal error, error from
deep below, but error on the level he is currently working at. So I would not
call it internal error.
heapfuncs.c:370:7: warning: variable 'raw_attr' is used uninitialized
whenever 'if' condition is false [-Wsometimes-uninitialized]
if (!is_null)
^~~~~~~~
heapfuncs.c:419:43: note: uninitialized use occurs here
raw_attrs = accumArrayResult(raw_attrs, raw_attr, is_null,
BYTEAOID, fctx->multi_call_memory_ctx);
^~~~~~~~
heapfuncs.c:370:3: note: remove the 'if' if its condition is always true
if (!is_null)
^~~~~~~~~~~~~
heapfuncs.c:357:17: note: initialize the variable 'raw_attr' to silence
this warning
Datum raw_attr;
My compiler complains about uninitialized variables.
Fixed it
+-- +-- _heap_page_items() +-- +CREATE FUNCTION _heap_page_items(IN page bytea, IN relname text, I am not sure why this is necessary and visibly it overlaps with the existing declaration of heap_page_items. The last argument is different though, and I recall that we decided not to use anymore the relation name specified as text, but its OID instead in this module.
Oh! This should not go to the final patch :-( Sorry. Removed it.
Documentation is missing, that would be good to have to understand what
each function is intended to do.
I were going to add documentation when this patch is commited, or at least
approved for commit. So I would know for sure that function definition would
not change, so had not to rewrite it again and again.
So if it is possible I would write documentation and test when this patch is
already approved.
Code has some whitespaces.
I've found and removed some. Hope that was all of them...
--
Nikolay Shaplov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company
Attachments:
pageinspect_show_tuple_data_v3.difftext/x-patch; charset=utf-8; name=pageinspect_show_tuple_data_v3.diffDownload+692-239
On Sat, Sep 5, 2015 at 1:05 AM, Nikolay Shaplov wrote:
В письме от 4 сентября 2015 14:58:29 пользователь Michael Paquier написал:
Documentation is missing, that would be good to have to understand what
each function is intended to do.I were going to add documentation when this patch is commited, or at least
approved for commit. So I would know for sure that function definition
would
not change, so had not to rewrite it again and again.
I doubt that this patch would be committed without documentation, this is a
requirement for any patch.
So if it is possible I would write documentation and test when this patch
is
already approved.
I'm fine with that. Let's discuss its shape and come up with an agreement.
It would have been good to post test cases of what this stuff actually does
though, people reviewing this thread are limited to guess on what is tried
to be achieved just by reading the code. That's actually where
documentation, even a draft of documentation helps a lot in the review to
see if what is expected by the developer matches what the code actually
does.
Code has some whitespaces.
I've found and removed some. Hope that was all of them...
Yeah, it looks that you took completely rid of them.
In details, this patch introduces two independent concepts:
- add tuple data as a new bytea column of heap_page_items. This is indeed
where it should be, and note that this is where the several corruption
checks are done by checking the state of the tuple data.
- add heap_page_item_attrs and heap_page_item_detoast_attrs, which is very
similar to heap_page_items, at the difference that it can take an OID to be
able to use a TupleDesc and return a bytea array with the data of each
attribute.
Honestly, heap_page_item_attrs and heap_page_item_detoast_attrs are way too
similar to what heap_page_items does, leading to a code maze that is going
to make future extensions more difficult, which is what lead to the
refactoring your patch does.
Hence, instead of all those complications, why not simply introducing two
functions that take as input the tuple data and the OID of the relation,
returning those bytea arrays? It seems to be that this would be a more
handy interface, and as this is for educational purposes I guess that the
addition of the overhead induced by the extra function call would be
acceptable.
Regards,
--
Michael
On Tue, Sep 8, 2015 at 11:53 AM, Michael Paquier wrote:
Honestly, heap_page_item_attrs and heap_page_item_detoast_attrs are way too
similar to what heap_page_items does, leading to a code maze that is going
to make future extensions more difficult, which is what lead to the
refactoring your patch does.
Hence, instead of all those complications, why not simply introducing two
functions that take as input the tuple data and the OID of the relation,
returning those bytea arrays? It seems to be that this would be a more handy
interface, and as this is for educational purposes I guess that the addition
of the overhead induced by the extra function call would be acceptable.
Actually not two functions, but just one, with an extra flag to be
able to enforce detoasting on the field values where this can be done.
--
Michael
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
В письме от 8 сентября 2015 11:53:24 Вы написали:
On Sat, Sep 5, 2015 at 1:05 AM, Nikolay Shaplov wrote:
В письме от 4 сентября 2015 14:58:29 пользователь Michael Paquier написал:
Documentation is missing, that would be good to have to understand what
each function is intended to do.I were going to add documentation when this patch is commited, or at least
approved for commit. So I would know for sure that function definition
would
not change, so had not to rewrite it again and again.I doubt that this patch would be committed without documentation, this is a
requirement for any patch.
Ok. I can do it.
So if it is possible I would write documentation and test when this patch
is
already approved.I'm fine with that. Let's discuss its shape and come up with an agreement.
It would have been good to post test cases of what this stuff actually does
though, people reviewing this thread are limited to guess on what is tried
to be achieved just by reading the code.
To test non detoasted functions one should do:
CREATE EXTENSION pageinspect;
create table test (a int, b smallint,c varchar);
insert into test VALUES
(-1,2,'111'),
(2,null,'222'),
(3,8,'333'),
(null,16,null);
Then
# select * from heap_page_items(get_raw_page('test', 0));
lp | lp_off | lp_flags | lp_len | t_xmin | t_xmax | t_field3 | t_ctid | t_infomask2 | t_infomask | t_hoff | t_bits | t_oid | t_data
----+--------+----------+--------+--------+--------+----------+--------+-------------+------------+--------+----------+-------+------------------------
1 | 8152 | 1 | 34 | 763 | 0 | 0 | (0,1) | 3 | 2050 | 24 | | | \xffffffff020009313131
2 | 8120 | 1 | 32 | 763 | 0 | 0 | (0,2) | 3 | 2051 | 24 | 10100000 | | \x0200000009323232
3 | 8080 | 1 | 34 | 763 | 0 | 0 | (0,3) | 3 | 2050 | 24 | | | \x03000000080009333333
4 | 8048 | 1 | 26 | 763 | 0 | 0 | (0,4) | 3 | 2049 | 24 | 01000000 | | \x1000
or
# select * from heap_page_item_attrs(get_raw_page('test', 0),'test'::regclass);
lp | lp_off | lp_flags | lp_len | t_xmin | t_xmax | t_field3 | t_ctid | t_infomask2 | t_infomask | t_hoff | t_bits | t_oid | t_attrs
----+--------+----------+--------+--------+--------+----------+--------+-------------+------------+--------+----------+-------+-----------------------------------------
1 | 8152 | 1 | 34 | 763 | 0 | 0 | (0,1) | 3 | 2050 | 24 | | | {"\\xffffffff","\\x0200","\\x09313131"}
2 | 8120 | 1 | 32 | 763 | 0 | 0 | (0,2) | 3 | 2051 | 24 | 10100000 | | {"\\x02000000",NULL,"\\x09323232"}
3 | 8080 | 1 | 34 | 763 | 0 | 0 | (0,3) | 3 | 2050 | 24 | | | {"\\x03000000","\\x0800","\\x09333333"}
4 | 8048 | 1 | 26 | 763 | 0 | 0 | (0,4) | 3 | 2049 | 24 | 01000000 | | {NULL,"\\x1000",NULL}
(4 rows)
For detoasted function you should do something like this:
CREATE EXTENSION pageinspect;
create table test2 (c varchar);
insert into test2 VALUES ('++');
insert into test2 VALUES (repeat('+',2100));
Then heap_page_item_attrs will show you compressed attr data:
# select * from heap_page_item_attrs(get_raw_page('test2', 0),'test2'::regclass);
lp | lp_off | lp_flags | lp_len | t_xmin | t_xmax | t_field3 | t_ctid | t_infomask2 | t_infomask | t_hoff | t_bits | t_oid | t_attrs
----+--------+----------+--------+--------+--------+----------+--------+-------------+------------+--------+--------+-------+-------------------------------------------------------------------------------
1 | 8160 | 1 | 27 | 768 | 0 | 0 | (0,1) | 1 | 2050 | 24 | | | {"\\x072b2b"}
2 | 8096 | 1 | 59 | 769 | 0 | 0 | (0,2) | 1 | 2050 | 24 | | | {"\\x8e00000034080000fe2b0f01ff0f01ff0f01ff0f01ff0f01ff0f01ff0f01ff010f01aa"}
(2 rows)
and heap_page_item_detoast_attrs will show you original data
# select * from heap_page_item_detoast_attrs(get_raw_page('test2', 0),'test2'::regclass);
[will not paste output here it is too long, see it for yourself]
That's actually where
documentation, even a draft of documentation helps a lot in the review to
see if what is expected by the developer matches what the code actually
does.Code has some whitespaces.
I've found and removed some. Hope that was all of them...Yeah, it looks that you took completely rid of them.
In details, this patch introduces two independent concepts:
- add tuple data as a new bytea column of heap_page_items. This is indeed
where it should be, and note that this is where the several corruption
checks are done by checking the state of the tuple data.
Sorry do not understand what do you want to say in the second part of the last
sentence. About corruption checks.
- add heap_page_item_attrs and heap_page_item_detoast_attrs, which is very
similar to heap_page_items, at the difference that it can take an OID to be
able to use a TupleDesc and return a bytea array with the data of each
attribute.
That's right.
And more:
- heap_page_item_attrs and heap_page_item_detoast_attrs has third optional
attribute, set it to true if you want these functions to report about problems in
WARNING mode. This will allow to force attribute parsing even if some consistency
checks are failed. This is necessary for educational purposes, so everyone can
make fake data and see how postgres will try to parse it. Also this will be needed
for test. So we can test that pageinspect will is properly doing all checks. And it is
much easier to do this when it is done in warning mode.
Honestly, heap_page_item_attrs and heap_page_item_detoast_attrs are way too
similar to what heap_page_items does, leading to a code maze that is going
to make future extensions more difficult, which is what lead to the
refactoring your patch does.
Hence, instead of all those complications, why not simply introducing two
functions that take as input the tuple data and the OID of the relation,
returning those bytea arrays? It seems to be that this would be a more
handy interface, and as this is for educational purposes I guess that the
addition of the overhead induced by the extra function call would be
acceptable.
When I looked at this task I considered doing the same thing. Bun unfortunately it is
not possible. (Or if be more correct it is possible, but if I do so, it would be hard to us
e it)
The thing is, that to properly split tuple data into attributes, we need some values from
tuple header:
t_infomask2: where postgres store actual number of stored attributes, that may differ
from one in tuple data. That will allow to properly parse tuples after
ALTER TABLE ADD COLUMN when it was done without SET DEFAULT option
t_infomask: has bit that indicates that there is some null attributes in tuple.
t_bits: has a bit mask that shows what attributes were set to null.
So if move tuple data parsing into separate function, then we have to pass these
values alongside the tuple data. This is not convenient at all.
So I did it in a way you see in the patch.
Actually not two functions, but just one, with an extra flag to be
able to enforce detoasting on the field values where this can be done.
Yeah, I also thought about it. But did not come to any final conclusion. Should
we add forth argument, that enables detoasting, instead of adding separate
function?
--
Nikolay Shaplov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Wed, Sep 9, 2015 at 8:39 PM, Nikolay Shaplov
<n.shaplov@postgrespro.ru> wrote:
В письме от 8 сентября 2015 11:53:24 Вы написали:
Hence, instead of all those complications, why not simply introducing two
functions that take as input the tuple data and the OID of the relation,
returning those bytea arrays? It seems to be that this would be a more
handy interface, and as this is for educational purposes I guess that the
addition of the overhead induced by the extra function call would be
acceptable.When I looked at this task I considered doing the same thing. Bun unfortunately it is
not possible. (Or if be more correct it is possible, but if I do so, it would be hard to us
e it)
The thing is, that to properly split tuple data into attributes, we need some values from
tuple header:
t_infomask2: where postgres store actual number of stored attributes, that may differ
from one in tuple data. That will allow to properly parse tuples after
ALTER TABLE ADD COLUMN when it was done without SET DEFAULT option
t_infomask: has bit that indicates that there is some null attributes in tuple.
t_bits: has a bit mask that shows what attributes were set to null.So if move tuple data parsing into separate function, then we have to pass these
values alongside the tuple data. This is not convenient at all.
So I did it in a way you see in the patch.
Why is it not convenient at all? Yes, you have a point, we need those
fields to be able to parse the t_data properly. Still the possibility
to show individual fields of a tuple as a bytea array either with
toasted or detoasted values is a concept completely different from
simply showing the page items, which is what, it seems to me,
heap_page_items is aimed to only do. Hence, As t_infomask2, t_infomask
and t_bits are already available as return fields of heap_page_items,
we should simply add a function like that:
heap_page_item_parse(Oid relid, bytea data, t_infomask2 int,
t_infomask int, t_bits int, bool force_detoast, warning_mode bool)
returns bytea[]
Note that the data corruption checks apply only to this function as
far as I understand, so I think that things could be actually split
into two independent patches:
1) Upgrade heap_page_items to add the tuple data as bytea.
2) Add the new function able to parse those fields appropriately.
As this code, as you justly mentioned, is aimed mainly for educational
purposes to understand a page structure, we should definitely make it
as simple as possible at code level, and it seems to me that this
comes with a separate SQL interface to control tuple data parsing as a
bytea[]. We are doing no favor to our users to complicate the code of
pageinspect.c as this patch is doing in its current state, especially
if beginners want to have a look at it.
Actually not two functions, but just one, with an extra flag to be
able to enforce detoasting on the field values where this can be done.Yeah, I also thought about it. But did not come to any final conclusion. Should
we add forth argument, that enables detoasting, instead of adding separate
function?
This is definitely something you want to control with a switch.
--
Michael
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
В письме от 10 сентября 2015 15:46:25 пользователь Michael Paquier написал:
So if move tuple data parsing into separate function, then we have to pass
these values alongside the tuple data. This is not convenient at all.
So I did it in a way you see in the patch.Why is it not convenient at all? Yes, you have a point, we need those
fields to be able to parse the t_data properly. Still the possibility
to show individual fields of a tuple as a bytea array either with
toasted or detoasted values is a concept completely different from
simply showing the page items, which is what, it seems to me,
heap_page_items is aimed to only do. Hence, As t_infomask2, t_infomask
and t_bits are already available as return fields of heap_page_items,
we should simply add a function like that:
heap_page_item_parse(Oid relid, bytea data, t_infomask2 int,
t_infomask int, t_bits int, bool force_detoast, warning_mode bool)
returns bytea[]
Just compare two expressions:
select * from heap_page_item_attrs(get_raw_page('test', 0),'test'::regclass);
and
select lp, lp_off, lp_flags, lp_len, t_xmin, t_xmax, t_field3, t_ctid,
t_infomask2, t_infomask, t_hoff, t_bits, t_oid, tuple_data_parse (
t_data, t_infomask, t_infomask2, t_bits, 'test'::regclass, true, false)
from heap_page_item_attrs(get_raw_page('test', 0));
The second variant is a total mess and almost unsable...
Though I've discussed this issue with friedns, and we came to conclusion that
it might be good to implement tuple_data_parse and then implement
easy to use heap_page_item_attrs in pure SQL, using heap_page_items and
tuple_data_parse.
That would keep usage simplicity, and make code more simple as you offer.
The only one argument against it is that in internal representation t_bist is
binary, and in sql output it is string with '0' and '1' characters. We will
have to convert it back to binary mode. This is not difficult, but just useless
convertations there and back again.
What do you think about this solution?
Note that the data corruption checks apply only to this function as
far as I understand, so I think that things could be actually split
into two independent patches:
1) Upgrade heap_page_items to add the tuple data as bytea.
2) Add the new function able to parse those fields appropriately.As this code, as you justly mentioned, is aimed mainly for educational
purposes to understand a page structure, we should definitely make it
as simple as possible at code level, and it seems to me that this
comes with a separate SQL interface to control tuple data parsing as a
bytea[]. We are doing no favor to our users to complicate the code of
pageinspect.c as this patch is doing in its current state, especially
if beginners want to have a look at it.Actually not two functions, but just one, with an extra flag to be
able to enforce detoasting on the field values where this can be done.Yeah, I also thought about it. But did not come to any final conclusion.
Should we add forth argument, that enables detoasting, instead of adding
separate function?This is definitely something you want to control with a switch.
Ok.Let's come to the final decision with tuple_data_parse, and i will add this
switch there and to pure sql heap_page_item_attrs
--
Nikolay Shaplov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Thu, Sep 10, 2015 at 03:46:25PM +0900, Michael Paquier wrote:
Why is it not convenient at all? Yes, you have a point, we need those
fields to be able to parse the t_data properly. Still the possibility
to show individual fields of a tuple as a bytea array either with
toasted or detoasted values is a concept completely different from
simply showing the page items, which is what, it seems to me,
heap_page_items is aimed to only do. Hence, As t_infomask2, t_infomask
and t_bits are already available as return fields of heap_page_items,
we should simply add a function like that:
heap_page_item_parse(Oid relid, bytea data, t_infomask2 int,
t_infomask int, t_bits int, bool force_detoast, warning_mode bool)
returns bytea[]
Should pageinspect create a table that contains some of the constants
used to interpret infomask?
--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com
+ Everyone has their own god. +
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Fri, Sep 11, 2015 at 12:08 AM, Nikolay Shaplov
<n.shaplov@postgrespro.ru> wrote:
В письме от 10 сентября 2015 15:46:25 пользователь Michael Paquier написал:
So if move tuple data parsing into separate function, then we have to pass
these values alongside the tuple data. This is not convenient at all.
So I did it in a way you see in the patch.Why is it not convenient at all? Yes, you have a point, we need those
fields to be able to parse the t_data properly. Still the possibility
to show individual fields of a tuple as a bytea array either with
toasted or detoasted values is a concept completely different from
simply showing the page items, which is what, it seems to me,
heap_page_items is aimed to only do. Hence, As t_infomask2, t_infomask
and t_bits are already available as return fields of heap_page_items,
we should simply add a function like that:
heap_page_item_parse(Oid relid, bytea data, t_infomask2 int,
t_infomask int, t_bits int, bool force_detoast, warning_mode bool)
returns bytea[]Just compare two expressions:
select * from heap_page_item_attrs(get_raw_page('test', 0),'test'::regclass);
and
select lp, lp_off, lp_flags, lp_len, t_xmin, t_xmax, t_field3, t_ctid,
t_infomask2, t_infomask, t_hoff, t_bits, t_oid, tuple_data_parse (
t_data, t_infomask, t_infomask2, t_bits, 'test'::regclass, true, false)
from heap_page_item_attrs(get_raw_page('test', 0));The second variant is a total mess and almost unsable...
It is hard to believe as well that any sane application would use * as
well in a SELECT query. Users would though and we are talking about
user's education here :)
Though I've discussed this issue with friedns, and we came to conclusion that
it might be good to implement tuple_data_parse and then implement
easy to use heap_page_item_attrs in pure SQL, using heap_page_items and
tuple_data_parse.
That would keep usage simplicity, and make code more simple as you offer.
Yep. That's doable with a simple SQL function. I am not sure that it
is worth including in pageinspect though.
The only one argument against it is that in internal representation t_bits is
binary, and in sql output it is string with '0' and '1' characters. We will
have to convert it back to binary mode. This is not difficult, but just useless
convertations there and back again.
The reason why this is visibly converted from bit to text is that the
in-core bit data type has a fixed length, and in the case of
HeapTupleHeaderData there is a variable length.
What do you think about this solution?
For code simplicity's sake this seems worth the cost.
Note that the data corruption checks apply only to this function as
far as I understand, so I think that things could be actually split
into two independent patches:
1) Upgrade heap_page_items to add the tuple data as bytea.
2) Add the new function able to parse those fields appropriately.As this code, as you justly mentioned, is aimed mainly for educational
purposes to understand a page structure, we should definitely make it
as simple as possible at code level, and it seems to me that this
comes with a separate SQL interface to control tuple data parsing as a
bytea[]. We are doing no favor to our users to complicate the code of
pageinspect.c as this patch is doing in its current state, especially
if beginners want to have a look at it.Actually not two functions, but just one, with an extra flag to be
able to enforce detoasting on the field values where this can be done.Yeah, I also thought about it. But did not come to any final conclusion.
Should we add forth argument, that enables detoasting, instead of adding
separate function?This is definitely something you want to control with a switch.
Ok.Let's come to the final decision with tuple_data_parse, and i will add this
switch there and to pure sql heap_page_item_attrs
Fine for me.
--
Michael
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Fri, Sep 11, 2015 at 12:12 AM, Bruce Momjian <bruce@momjian.us> wrote:
On Thu, Sep 10, 2015 at 03:46:25PM +0900, Michael Paquier wrote:
Why is it not convenient at all? Yes, you have a point, we need those
fields to be able to parse the t_data properly. Still the possibility
to show individual fields of a tuple as a bytea array either with
toasted or detoasted values is a concept completely different from
simply showing the page items, which is what, it seems to me,
heap_page_items is aimed to only do. Hence, As t_infomask2, t_infomask
and t_bits are already available as return fields of heap_page_items,
we should simply add a function like that:
heap_page_item_parse(Oid relid, bytea data, t_infomask2 int,
t_infomask int, t_bits int, bool force_detoast, warning_mode bool)
returns bytea[]Should pageinspect create a table that contains some of the constants
used to interpret infomask?
Interesting idea. It may be indeed useful to show to a user mappings
between t_infomask flags <=> textual meaning. I guess that we could
have an SRF function with a view on top of it that returns such a
list. The same can apply to t_infomask2.
--
Michael
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
В письме от 11 сентября 2015 15:12:04 Вы написали:
Ok.Let's come to the final decision with tuple_data_parse, and i will add
this switch there and to pure sql heap_page_item_attrsFine for me.
So I've modified the code, now we have:
heap_page_items - have a column with raw tuple data
tuple_data_split - takes oid, raw tuple data, infomask, infomask2 and bits
parsed as string and returns bytea[] with attribute raw values. It also have
two optional arguments do_detoast that forces function to detoast attribute,
and warning_mode that allows to set this function to warning mode, and do not
stop working if some inconsistency were found.
there is also pure sql function heap_page_item_attrs that takes page data, and
table oid, and returns same data as heap_page_items but bytea[] of attributes
instead of one whole piece of raw data. It also has optional argument
do_detoast that allows to get bytea[] of detoasted attribute data.
I've decided that there is no real need in warning_mode in
heap_page_item_attrs so there is no such argument there.
So now it is still RFC. Final patch with documentation will come soon
--
Nikolay Shaplov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company
Attachments:
pageinspect_show_tuple_data_v4a.difftext/x-patch; charset=utf-8; name=pageinspect_show_tuple_data_v4a.diffDownload+563-148
В письме от 11 сентября 2015 15:12:04 пользователь Michael Paquier написал:
Ok.Let's come to the final decision with tuple_data_parse, and i will add
this switch there and to pure sql heap_page_item_attrsFine for me.
Here is final version with documentation.
Hope it will be the last one. :-)
--
Nikolay Shaplov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company
Attachments:
pageinspect_show_tuple_data_v5.difftext/x-patch; charset=UTF-8; name=pageinspect_show_tuple_data_v5.diffDownload+906-155
On Fri, Sep 25, 2015 at 8:30 PM, Nikolay Shaplov wrote:
Here is final version with documentation.
Thanks! I just had a short look at it:
- I am not convinced that it is worth declaring 3 versions of tuple_data_split.
- The patch does not respect the project code style, particularly
one-line "if condition {foo;}" are not adapted, code line is limited
to 80 characters, etc. The list is basically here:
http://www.postgresql.org/docs/current/static/source.html
- Be aware of typos: s/whitch/which is one.
+ <entry><structfield>t_infomask2</structfield></entry>
+ <entry><type>integer</type></entry>
+ <entry>stores number of attributes (11 bits of the word). The
rest are used for flag bits:
+<screen>
+0x2000 - tuple was updated and key cols modified, or tuple deleted
+0x4000 - tuple was HOT-updated
+0x8000 - this is heap-only tuple
+0xE000 - visibility-related bits (so called "hint bits")
+</screen>
This large chunk of documentation is a duplicate of storage.sgml. If
that's really necessary, it looks adapted to me to have more detailed
comments at code level directly in heapfuncs.c.
The example of tuple_data_split in the docs would be more interesting
if embedded with a call to heap_page_items.
Hope it will be the last one. :-)
Unfortunately not :) But this is definitely going in the right
direction thinking that this code is mainly targeted for educational
purposes.
--
Michael
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
В письме от 25 сентября 2015 20:59:29 пользователь Michael Paquier написал:
Here is final version with documentation.
Thanks! I just had a short look at it:
- I am not convinced that it is worth declaring 3 versions of
tuple_data_split.
How which of them should we leave?
- The patch does not respect the project code style,
particularly one-line "if condition {foo;}" are not adapted, code line is
limited
to 80 characters, etc. The list is basically here:
http://www.postgresql.org/docs/current/static/source.html
I did my best. Results are attached.
- Be aware of typos: s/whitch/which is one.
I've run spellchecker on all comments. Hope that I removed most of the
mistakes. But as I am not native speaker, I will not be able to eliminate them
all. I will need help here.
+ <entry><structfield>t_infomask2</structfield></entry> + <entry><type>integer</type></entry> + <entry>stores number of attributes (11 bits of the word). The rest are used for flag bits: +<screen> +0x2000 - tuple was updated and key cols modified, or tuple deleted +0x4000 - tuple was HOT-updated +0x8000 - this is heap-only tuple +0xE000 - visibility-related bits (so called "hint bits") +</screen> This large chunk of documentation is a duplicate of storage.sgml. If that's really necessary, it looks adapted to me to have more detailed comments at code level directly in heapfuncs.c.
Hm... There is no explanation of t_infomask/t_infomask2 bits in storage.sgml.
If there is no source of information other then source code, then the
documentation is not good.
If there were information about t_infomask/t_infomask2 in storage.sgml, then I
would add "See storage.sgml for more info" into pageinspect doc, and thats
all. But since there is no such information there, I think that the best
thing is to quote comments from source code there, so you can get all
information from documentation, not looking for it in the code.
So I would consider two options: Either move t_infomask/t_infomask2 details
into storage.sgml or leave as it is.
I am lazy, and does not feel confidence about touching main documentation, so I
would prefer to leave as it is.
The example of tuple_data_split in the docs would be more interesting
if embedded with a call to heap_page_items.
This example would be almost not readable. May be I should add one more praise
explaining where did we take arguments for tuple_data_split
--
Nikolay Shaplov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company