BUG #14164: Postgres allow to insert more data into field than this field allow
The following bug has been logged on the website:
Bug reference: 14164
Logged by: Eugen Konkov
Email address: kes-kes@yandex.ru
PostgreSQL version: 9.5.2
Operating system: x86_64-pc-linux-gnu, (Ubuntu 4.8.2-19ubuntu1) 64bi
Description:
# \d+ files2_102009; Table
"public.files2_102009"
Column | Type | Modifiers
| Storage | Stats target | Description
-------------+-----------------------------+----------------------------------------+----------+--------------+-------------
filename | character varying(255) | default NULL::character varying
| extended | |
# select filename from files2_102009 where id = '057cbd75c160a38e';
filename
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
20-%D0%A1%D0%B2%D0%B8%D0%BD%D0%BD%D0%B0%D1%8F%20%D1%80%D1%83%D0%BB%D1%8C%D0%BA%D0%B0%28%D0%A1%D0%B2%D0%B8%D0%BD%D0%B8%D0%BD%D0%B0%2C%20%D0%BA%D0%B5%D0%B4%D1%80%D0%BE%D0%B2%D1%8B%D0%B9%20%D0%BE%D1%80%D0%B5%D1%85%2C%20%D1%81%D0%BE%D1%83%D1%81%29620%D1%80.jpg
(1 row)
How to reproduce:
We need mysql server, mysql_fdw, postgres;
1) create mysql table:
CREATE TABLE `files2` (
...
`filename` text
);
2) create extension mysql_fdw and foreign table:
# \d+ files2_mysql
Foreign table "public.files2_mysql"
Column | Type | Modifiers
| FDW Options | Storage | Stats target | Description
...
filename | character varying(255) | default NULL::character varying
| | extended | |
3) create postgres table:
# \d+ files2_102009; Table
"public.files2_102009"
Column | Type | Modifiers
| Storage | Stats target | Description
-------------+-----------------------------+----------------------------------------+----------+--------------+-------------
filename | character varying(255) | default NULL::character varying
| extended | |
4)
INSERT INTO files2_102009 SELECT * FROM files2_mysql;
Because of type of column of source and destination tables are equal
postgres does not check actual data so very long data at text field of mysql
database is inserted into limited field at postgres DB.
EXPECTED: postgres should not allow to insert into field more data than
field allows.
--
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs
kes-kes@yandex.ru writes:
Because of type of column of source and destination tables are equal
postgres does not check actual data so very long data at text field of mysql
database is inserted into limited field at postgres DB.
I would say this is a bug in mysql_fdw, not Postgres proper. In general
it's the responsibility of an FDW to ensure that data it passes back
meets the constraints of the foreign table declaration.
regards, tom lane
--
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs
On Mon, May 30, 2016 at 1:10 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
kes-kes@yandex.ru writes:
Because of type of column of source and destination tables are equal
postgres does not check actual data so very long data at text field ofmysql
database is inserted into limited field at postgres DB.
I would say this is a bug in mysql_fdw, not Postgres proper. In general
it's the responsibility of an FDW to ensure that data it passes back
meets the constraints of the foreign table declaration.
K>>
4)
K>>
INSERT INTO files2_102009 SELECT * FROM files2_mysql;
I'm surprised that the definition of the object in the FROM clause affects
behavior in this manner...
For the uninitiated while we do allow text type definitions to include
length restrictions the underlying storage model is the same for any and
all text variations you can legally describe (char, varchar, text,
varchar(n)) though some special behaviors occur during input/output
depending upon the type name and optional attribute.
I'd be more inclined to play with this if it didn't require such a specific
combination of technologies to readily reproduce.
I'd at least like to understand the implications this would have on our
system if it is allowed to stand. The specific situation I'm thinking of
is a table that didn't require a toast but is now faced with a text value
too large to fit onto a page.
At some point in this SELECT->INSERT process there has to be a recognition
that the data we are seeing is external and needs to be validated against
the rules and assumptions of PostgreSQL. It's not our fault that mysql_fdw
is broken but it is our fault that we allowed its brokenness to result is
non-conforming persisted data.
Either the SELECT should fail because a PostgreSQL varchar(255) cannot hold
longer data (probably this) or the INSERT should fail so that at least the
impact of the broken varchar is limited to an active query and doesn't make
it onto disk.
David J.
To my mind, postgres should not rely on third party code and write broken data to the disk.
I suppose this may override other records (even those the user have no access) which are stored close to this broken one if data in the 'text' field of mysql are large enough.
And, yes, mysql_fdw, should also do checks for data constistency
30.05.2016, 20:10, "Tom Lane" <tgl@sss.pgh.pa.us>:
kes-kes@yandex.ru writes:
О©╫Because of type of column of source and destination tables are equal
О©╫postgres does not check actual data so very long data at text field of mysql
О©╫database is inserted into limited field at postgres DB.I would say this is a bug in mysql_fdw, not Postgres proper. In general
it's the responsibility of an FDW to ensure that data it passes back
meets the constraints of the foreign table declaration.О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫О©╫regards, tom lane
--
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs
"David G. Johnston" <david.g.johnston@gmail.com> writes:
On Mon, May 30, 2016 at 1:10 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
I would say this is a bug in mysql_fdw, not Postgres proper. In general
it's the responsibility of an FDW to ensure that data it passes back
meets the constraints of the foreign table declaration.
At some point in this SELECT->INSERT process there has to be a recognition
that the data we are seeing is external and needs to be validated against
the rules and assumptions of PostgreSQL. It's not our fault that mysql_fdw
is broken but it is our fault that we allowed its brokenness to result is
non-conforming persisted data.
Either the SELECT should fail because a PostgreSQL varchar(255) cannot hold
longer data (probably this) or the INSERT should fail so that at least the
impact of the broken varchar is limited to an active query and doesn't make
it onto disk.
Indeed, the SELECT should have failed, and it's mysql_fdw that should have
issued the error. We cannot expect to protect users against every sort of
malfeasance or misfeasance that might occur in extension C code. Consider
for example that the Datum mysql_fdw is handing back and claiming to be
varchar(255) might not be textual at all, or it might be in the wrong
encoding, etc etc. It wouldn't even be possible for the core code to
detect some of those cases, and it certainly would impose a lot of
overhead to add checking that should be redundant.
(BTW, I checked this and verified that an equivalent case in postgres_fdw
does throw an error; so it's *possible* for an FDW to get this right.
mysql_fdw simply doesn't.)
regards, tom lane
--
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs
The convention here is to inline or bottom post.
30.05.2016, 20:10, "Tom Lane" <tgl@sss.pgh.pa.us>:
kes-kes@yandex.ru writes:
Because of type of column of source and destination tables are equal
postgres does not check actual data so very long data at text field ofmysql
database is inserted into limited field at postgres DB.
I would say this is a bug in mysql_fdw, not Postgres proper. In general
it's the responsibility of an FDW to ensure that data it passes back
meets the constraints of the foreign table declaration.
On Mon, May 30, 2016 at 2:28 PM, KES <kes-kes@yandex.ru> wrote:
To my mind, postgres should not rely on third party code and write broken
data to the disk.
It isn't quite this cut-and-dry. The user takes some responsibility for
the stuff they load into their database.
https://www.postgresql.org/docs/9.6/static/sql-createforeigntable.html
"""
Although PostgreSQL does not attempt to enforce constraints on foreign
tables, it does assume that they are correct for purposes of query
optimization. If there are rows visible in the foreign table that do not
satisfy a declared constraint, queries on the table might produce incorrect
answers. It is the user's responsibility to ensure that the constraint
definition matches reality.
"""
While this isn't a table constraint it seems the same provision applies to
data types and their attributes.
I suspect that the only reason this example gets through whatever checks
are present is because of the fact that varchar(n) is just, basically, a
domain over text.
I suppose this may override other records (even those the user have no
access) which are stored close to this broken one if data in the 'text'
field of mysql are large enough.
Just because PostgreSQL doesn't validate that the length is within the
constrained limit doesn't impact how it gets stored. PostgreSQL will
measure the length as it writes the data and request the necessary amount
of free memory/space to hold it. That is what I was trying to point out
with my brief description of the implementation of the text data type.
I'll add my last comment to Tom's most recent reply.
David J.
On Mon, May 30, 2016 at 3:01 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
"David G. Johnston" <david.g.johnston@gmail.com> writes:
On Mon, May 30, 2016 at 1:10 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
I would say this is a bug in mysql_fdw, not Postgres proper. In general
it's the responsibility of an FDW to ensure that data it passes back
meets the constraints of the foreign table declaration.At some point in this SELECT->INSERT process there has to be a
recognition
that the data we are seeing is external and needs to be validated against
the rules and assumptions of PostgreSQL. It's not our fault thatmysql_fdw
is broken but it is our fault that we allowed its brokenness to result is
non-conforming persisted data.Either the SELECT should fail because a PostgreSQL varchar(255) cannot
hold
longer data (probably this) or the INSERT should fail so that at least
the
impact of the broken varchar is limited to an active query and doesn't
make
it onto disk.
Indeed, the SELECT should have failed, and it's mysql_fdw that should have
issued the error. We cannot expect to protect users against every sort of
malfeasance or misfeasance that might occur in extension C code.
Ok, but do you really not want to handle (or at least explore) this
particular situation that will result in a dump-restore hazard - while
appearing as fully functioning in all other aspects.
I suspect that these records are not checked on COPY TO reading either but
that they are checked when going through COPY FROM - at which point they
will invoke an error.
Is there maybe some way to mark the datum coming from FDW as being dirty
which would preempt optimizations (or force some kind of cleaning) when
writing to a permanent table? We could supply "cleaning" functions to
remove the dirty bit for the included types.
David J.