Bug #652: NAMEDATALEN limitations
Erik Erkelens (Erik_Erkelens@yahoo.com) reports a bug with a severity of 3
The lower the number, the more severe it is.
Short Description
NAMEDATALEN limitations
Long Description
If NAMEDATALEN is given a value of 45 or 61, initdb -d will fail with the error "relation pg_proc does not exist".
I'd appreciate a comment in e.g. postgres_ext.h telling me, for example, that it should be a power of two, or even, or something like that.
Sample Code
No file was uploaded with this report
pgsql-bugs@postgresql.org writes:
If NAMEDATALEN is given a value of 45 or 61, initdb -d will fail with the error "relation pg_proc does not exist".
Did you try to track down why?
Although it's inefficient to declare NAMEDATALEN as not a multiple of 4
(because of alignment considerations --- the space will just be wasted
as pad bytes, so you might as well use it), I don't offhand know why it
wouldn't work. I don't *think* there is any assumption that it's a
power of 2 ... but the well-tested cases are all powers of 2, so ...
I'd appreciate a comment in e.g. postgres_ext.h telling me, for example, that it should be a power of two, or even, or something like that.
To do that, we'd need to know what the constraint actually is. Do you
care enough to do the research to find out?
From my perspective it'd be even better to remove whatever the
constraint is, if it turns out to be a localized fix. But not knowing
what's causing the failure, it's hard to guess.
regards, tom lane
I said:
Although it's inefficient to declare NAMEDATALEN as not a multiple of 4
(because of alignment considerations --- the space will just be wasted
as pad bytes, so you might as well use it), I don't offhand know why it
wouldn't work.
One possible theory is that if NAMEDATALEN isn't a multiple of
sizeof(int), the compiler's idea of sizeof(NameData) will probably be
NAMEDATALEN rounded up to the next multiple of sizeof(int). However,
I still don't see exactly how that breaks anything, with the possible
exception of pg_language tuple layout --- but pg_language layout
problems wouldn't give rise to a failure during bootstrap AFAICS.
So I still don't know what the constraint mechanism really is.
BTW, I'm assuming here that alignof(int) is 4 on your platform; is it?
regards, tom lane
I said:
One possible theory is that if NAMEDATALEN isn't a multiple of
sizeof(int), the compiler's idea of sizeof(NameData) will probably be
NAMEDATALEN rounded up to the next multiple of sizeof(int).
For the record, this does indeed seem to be the root cause for Erik's
complaint. relcache.c declares the key size for its relation name
cache index as sizeof(NameData) --- so if that's larger than NAMEDATALEN
the hashtable code will end up trying to compare pad bytes that
probably haven't been zeroed out.
It looks to me like it would not really be practical to expect the
system to work when sizeof(NameData) is different from NAMEDATALEN;
we could maybe eliminate the existing discrepancies but more would
surely creep in. So I've added comments to document that NAMEDATALEN
must be a multiple of sizeof(int).
BTW, I suspect that Names used as hashtable keys may explain the
residual speed differences that people have been reporting for large
values of NAMEDATALEN. The dynahash.c code assumes fixed-length keys
and will compare all bytes out to the specified length, no matter what
--- so the extra cycles may all be spent in key comparisons for
dynahash tables. Perhaps this could be fixed.
regards, tom lane
I am updating a large (1M+ row) table from a join. There is no question the
table has to be sequentially scanned. But I am surprised that Postgres
would perform the join in order to do the update.
My take is that it should look up the "rs" table for a match and then,
from this reduced set, look up the "routes" table.
Since it is performing an update, it is fair to assume that there will
be only one record from "routes" that will match the where clause. So,
why waste resources performing the join first?
Or maybe I am not reading the query plan correctly...
THX
nsd=# explain update c set newroute = r.route, route = r.routeno, street
= trim( lpad( r.seqno, 4, '0')), exchangeno = r.exchangeno
nsd-# from routes r, rs s where ( c.stname, c.municipality) =
(s.oldstname, s.oldmuni) and
nsd-# (s.exchangeno, s.stname, s.municipality) = ( r.exchangeno,
r.street, r.municipality) and
nsd-# r.fromno <= c.civic and c.civic <= r.tono and (r.parity =
c.civic%2 or r.parity = -1);
NOTICE: QUERY PLAN:
Hash Join  (cost=109.44..118460.53 rows=1 width=857)
  ->  Seq Scan on c  (cost=0.00..112409.04 rows=1156604 width=519)
  ->  Hash  (cost=109.44..109.44 rows=1 width=338)
        ->  Merge Join  (cost=0.00..109.44 rows=1 width=338)
              ->  Index Scan using routes_str_mun on routes r  (cost=0.00..52.00 rows=1000 width=152)
              ->  Index Scan using rs_stname on rs s  (cost=0.00..52.00 rows=1000 width=186)
EXPLAIN
Jean-Luc Lachance <jllachan@nsd.ca> writes:
Hash Join  (cost=109.44..118460.53 rows=1 width=857)
  ->  Seq Scan on c  (cost=0.00..112409.04 rows=1156604 width=519)
  ->  Hash  (cost=109.44..109.44 rows=1 width=338)
        ->  Merge Join  (cost=0.00..109.44 rows=1 width=338)
              ->  Index Scan using routes_str_mun on routes r  (cost=0.00..52.00 rows=1000 width=152)
              ->  Index Scan using rs_stname on rs s  (cost=0.00..52.00 rows=1000 width=186)
Have you vacuum analyzed routes and rs? I always disbelieve any plan
with 'rows=1000' in it, because I know that's the default assumption
when no stats are available...
regards, tom lane
Fair enough.
nsd=# explain update c set newroute = r.route, route = r.routeno, street
= trim( lpad( r.seqno, 4, '0')), exchangeno = r.exchangeno
nsd-# from routes r, rs s where ( c.stname, c.municipality) =
(s.oldstname, s.oldmuni) and
nsd-# (s.exchangeno, s.stname, s.municipality) = ( r.exchangeno,
r.street, r.municipality) and
nsd-# r.fromno <= c.civic and c.civic <= r.tono and (r.parity =
c.civic%2 or r.parity = -1);
NOTICE: QUERY PLAN:
Hash Join  (cost=1943.04..136718.39 rows=100 width=660)
  ->  Seq Scan on c  (cost=0.00..112409.04 rows=1156604 width=519)
  ->  Hash  (cost=1939.63..1939.63 rows=1365 width=141)
        ->  Merge Join  (cost=0.00..1939.63 rows=1365 width=141)
              ->  Index Scan using routes_str_mun on routes r  (cost=0.00..1053.46 rows=13928 width=77)
              ->  Index Scan using rs_stname on rs s  (cost=0.00..704.11 rows=10942 width=64)
EXPLAIN
I also have the following indexes, which are not being used:
create index routes_ex_str_mun on routes( exchangeno, street,
municipality);
create index rs_ostr_omun on rs( oldstname, oldmuni);
Jean-Luc Lachance <jllachan@nsd.ca> writes:
Hash Join  (cost=1943.04..136718.39 rows=100 width=660)
  ->  Seq Scan on c  (cost=0.00..112409.04 rows=1156604 width=519)
  ->  Hash  (cost=1939.63..1939.63 rows=1365 width=141)
        ->  Merge Join  (cost=0.00..1939.63 rows=1365 width=141)
              ->  Index Scan using routes_str_mun on routes r  (cost=0.00..1053.46 rows=13928 width=77)
              ->  Index Scan using rs_stname on rs s  (cost=0.00..704.11 rows=10942 width=64)
Okay, those numbers look more believable.
Offhand this seems like a perfectly fine plan to me; computing the
r/s join once and forming it into an in-memory hashtable seems better
than probing the r and s tables again for each of 1M+ rows of c.
If the planner is way off about the size of that join (ie, it's not
1365 rows but many more) then maybe this isn't a good plan --- but you
haven't given us any info about that.
I also have the following indexes, which are not being used:
create index routes_ex_str_mun on routes( exchangeno, street,
municipality);
create index rs_ostr_omun on rs( oldstname, oldmuni);
That routes index could be used for the mergejoin if you had a
corresponding index on rs (ie, one on exchangeno,stname,municipality).
Not sure that this would make any significant improvement though.
Merging on street name is probably plenty good enough.
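For the record, the corresponding rs index Tom suggests would look something like this (the index name is made up, in the style of the existing ones):

```sql
create index rs_ex_str_mun on rs( exchangeno, stname, municipality);
```

With both three-column indexes in place, the planner could mergejoin routes and rs on all three columns instead of on street name alone, though as noted above this may not buy much.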
My thought is that the plan is fine, and if you are having a performance
problem with the update, it's more likely due to operations triggered by
the update rather than anything the planner can alter. Do you have
rules, triggers, foreign keys on the c table, foreign keys pointing to
that table, a large number of indexes for that table?
regards, tom lane
Tom,
There is no index on c right now, and there aren't any triggers, rules or
foreign keys on any of the tables.
"rs" has 10941 rows and "routes" has 13928.
JLL