pg_class -> reltuples?
Hi all,
Is there a reason why the reltuples column of pg_class is stored as a
"real", rather than one of the integer data types? Are there any
situations in which there will be a non-integer value stored in this
column?
Cheers,
Neil
P.S. I tried to search the archives, but archives.postgresql.org is so
slow, it's basically unusable. So my apologies if this has already been
discussed...
--
Neil Conway <neilconway@rogers.com>
PGP Key ID: DB3C29FC
Neil Conway wrote:
Hi all,
Is there a reason why the reltuples column of pg_class is stored as a
"real", rather than one of the integer data types? Are there any
situations in which there will be a non-integer value stored in this
column?
That is an excellent question. I assume it is related to having > 4
billion rows, but we have int8 for that. The value is used mostly by
the optimizer, which does most of its calcultions using float8 (real),
so that may be why.
P.S. I tried to search the archives, but archives.postgresql.org is so
slow, it's basically unusable. So my apologies if this has already been
discussed...
Yes, it is hampering me from researching some of these patches too, and
fts is completely down. If I could just get a web page of all the
threads (forget searching), I would be happy. The archives site contents
hasn't been updated since Feb 28.
--
Bruce Momjian | http://candle.pha.pa.us
pgman@candle.pha.pa.us | (610) 853-3000
+ If your life is a hard drive, | 830 Blythe Avenue
+ Christ can be your backup. | Drexel Hill, Pennsylvania 19026
Bruce Momjian <pgman@candle.pha.pa.us> writes:
Neil Conway wrote:
Is there a reason why the reltuples column of pg_class is stored as a
"real", rather than one of the integer data types?
That is an excellent question. I assume it is related to having > 4
billion rows, but we have int8 for that.
1. We support tables > 4G rows.
2. int8 is not available on all platforms.
3. The only use for reltuples is in the optimizer, which is perfectly
content with approximate values.
regards, tom lane
On Thursday 07 March 2002 23:11, Bruce Momjian wrote:
Neil Conway wrote:
P.S. I tried to search the archives, but archives.postgresql.org is so
slow, it's basically unusable. So my apologies if this has already been
discussed...Yes, it is hampering me from researching some of these patches too, and
fts is completely down. If I could just get a web page of all the
threads (forget searching), I would be happy. The archives site contents
hasn't been updated since Feb 28.
maybe google?
http://groups.google.com/groups?hl=en&group=comp.databases.postgresql.hackers
Though a cursory glance shows some mails which went over the
list aren't there, particularly the most recent threads are pretty patchy, pun
unintended..
Ian Barwick
maybe google?
http://groups.google.com/groups?hl=en&group=comp.databases.postgresql.hackers
Though a cursory glance shows some mails which went over the
list aren't there, particularly the most recent threads are pretty patchy, pun
unintended..
Thanks. That is a huge help. In fact, this lists all the groups:
http://groups.google.com/groups?hl=en&group=comp.databases.postgresql
--
Bruce Momjian | http://candle.pha.pa.us
pgman@candle.pha.pa.us | (610) 853-3000
+ If your life is a hard drive, | 830 Blythe Avenue
+ Christ can be your backup. | Drexel Hill, Pennsylvania 19026
On Thu, 2002-03-07 at 17:51, Tom Lane wrote:
Bruce Momjian <pgman@candle.pha.pa.us> writes:
Neil Conway wrote:
Is there a reason why the reltuples column of pg_class is stored as a
"real", rather than one of the integer data types?That is an excellent question. I assume it is related to having > 4
billion rows, but we have int8 for that.1. We support tables > 4G rows.
I agree we should try to support very large tables -- so why waste space
on storing floating point? And am I missing something, or is a "real"
only 4 bytes?
2. int8 is not available on all platforms.
I have no problem making restrictions on data types for portability, but
at least we should be consistent:
% grep -rI 'long long' * | wc -l
37
% grep -rI 'int64' * | wc -l
191
On all the platforms I tested (x86, SPARC, PPC, PA-RISC, Alpha), a 'long
long' is supported, and is 8 bytes. Which platforms don't have this, and
are we actively supporting them?
Cheers,
Neil
--
Neil Conway <neilconway@rogers.com>
PGP Key ID: DB3C29FC
Neil Conway <nconway@klamath.dyndns.org> writes:
I have no problem making restrictions on data types for portability, but
at least we should be consistent:
We *are* consistent. int8 is not used in the system catalogs, and where
it is used, the system will continue to function if it's implemented as
a 32-bit datatype. (At least, things still worked the last time I tried
turning off HAVE_LONG_LONG_INT. If someone broke it since then, it
needs to be fixed.)
regards, tom lane
On Thu, 2002-03-07 at 19:54, Tom Lane wrote:
Neil Conway <nconway@klamath.dyndns.org> writes:
I have no problem making restrictions on data types for portability, but
at least we should be consistent:We *are* consistent. int8 is not used in the system catalogs, and where
it is used, the system will continue to function if it's implemented as
a 32-bit datatype. (At least, things still worked the last time I tried
turning off HAVE_LONG_LONG_INT. If someone broke it since then, it
needs to be fixed.)
9 regression tests fail without HAVE_LONG_LONG_INT on a 32-bit machine
(int8, constraints, select_implicit, select_having, subselect, union,
aggregates, misc, rules). It's pretty obvious that int8 should fail, but
the others look like bugs.
As for the original question, maybe I'm missing something obvious, but
is there a reason why reltuples can't be an int8? (which is already
typedef'ed to a int4 on broken machines/compilers) This would mean that
on machines without a 64-bit int type, tables greater than 2^32 rows
can't be stored (or at least, reltuples breaks). But I'm inclined to
dismiss those platforms as broken, anyway...
In any case, I think the current situation is the wrong way around:
we're using a workaround on _all_ platforms, just to avoid breaking a
few old systems. Wouldn't it make more sense to use an int8 by default,
and fall back to a floating-point workaround if the default, optimal
solution isn't available?
Cheers,
Neil
--
Neil Conway <neilconway@rogers.com>
PGP Key ID: DB3C29FC
Neil Conway <nconway@klamath.dyndns.org> writes:
9 regression tests fail without HAVE_LONG_LONG_INT on a 32-bit machine
(int8, constraints, select_implicit, select_having, subselect, union,
aggregates, misc, rules). It's pretty obvious that int8 should fail, but
the others look like bugs.
I think int8_tbl may be used in some of the other tests, so diffs there
are not necessarily a big deal. Did you examine the diffs closely?
As for the original question, maybe I'm missing something obvious, but
is there a reason why reltuples can't be an int8? (which is already
typedef'ed to a int4 on broken machines/compilers)
Yes: it won't work. If reltuples is construed to be 8 bytes by some
compilers and 4 bytes by others, then the struct definition will fail to
overlay onto the storage as seen by the general-purpose tuple access
routines. (We could maybe fix that by having pg_type.h and some other
places conditionally compile the declared size of type int8, but it
ain't worth the trouble.)
This would mean that
on machines without a 64-bit int type, tables greater than 2^32 rows
can't be stored (or at least, reltuples breaks). But I'm inclined to
dismiss those platforms as broken, anyway...
Sorry, but I have very little patience for arguments that "if it works
on all the machines I use, it's good enough". Especially for a case
like this, where there is zero advantage to using int8 anyway.
Using a float here is not a "workaround", it's the right thing to do.
(The optimizer would only have to convert it to float anyway for its
internal calculations.)
Wouldn't it make more sense to use an int8 by default,
and fall back to a floating-point workaround if the default, optimal
solution isn't available?
So the user-visible column types of pg_class would vary depending on
this implementation detail? Not a good idea IMHO.
regards, tom lane