Performance implications of numeric?

Started by Wells Oliverover 13 years ago4 messagesgeneral
Jump to latest
#1Wells Oliver
wellsoliver@gmail.com

We have a lot of tables which store numeric data. These tables all use the
numeric type, where the values are 95% integer values. We used numeric
because it eliminated the need for casting during division to yield a
floating point value.

I'm curious as to whether this would have performance and/or disk size
implications. Would converting these columns to integer (or double
precision on the handful that require the precision) and forcing developers
to use explicit casting be worth the time?

Thanks for any clarification.

--
Wells Oliver
wellsoliver@gmail.com

#2Craig Ringer
craig@2ndquadrant.com
In reply to: Wells Oliver (#1)
Re: Performance implications of numeric?

On 08/22/2012 12:27 PM, Wells Oliver wrote:

We have a lot of tables which store numeric data. These tables all use
the numeric type, where the values are 95% integer values. We used
numeric because it eliminated the need for casting during division to
yield a floating point value.

I'm curious as to whether this would have performance and/or disk size
implications.

Yes, and yes, though the gap seems to have shrunk a lot since I first
started using Pg.

It's easy to concoct fairly meaningless micro-benchmarks, but you should
really try it with some real queries you run on your real schema. Take a
copy of your data, convert it, and run some tests. Use
`pg_total_relation_size` to compare the numeric and int versions of the
relations after `CLUSTER`ing them to debloat and reindex them.

Would converting these columns to integer (or double
precision on the handful that require the precision) and forcing
developers to use explicit casting be worth the time?

Without knowing your workload and your constraints, that's a "how blue
is the sky" question.

--
Craig Ringer

#3Craig Ringer
craig@2ndquadrant.com
In reply to: Wells Oliver (#1)
Re: Performance implications of numeric?

On 08/23/2012 12:48 AM, Wells Oliver wrote:

Hey, thanks for your feedback. Just to clarify: pg_total_relation_size
returns bytes, correct?

Please reply to the list, not directly to me.

Yes, pg_total_relation_size returns bytes. The documentation
(http://www.postgresql.org/docs/9.1/static/functions-admin.html) doesn't
seem to explicitly say that for pg_total_relation_size though it does
for pg_relation_size and other functions.

Use pg_size_pretty to convert bytes to "human" values for display.

--
Craig Ringer

#4Merlin Moncure
mmoncure@gmail.com
In reply to: Wells Oliver (#1)
Re: Performance implications of numeric?

On Tue, Aug 21, 2012 at 11:27 PM, Wells Oliver <wellsoliver@gmail.com> wrote:

We have a lot of tables which store numeric data. These tables all use the
numeric type, where the values are 95% integer values. We used numeric
because it eliminated the need for casting during division to yield a
floating point value.

I'm curious as to whether this would have performance and/or disk size
implications. Would converting these columns to integer (or double precision
on the handful that require the precision) and forcing developers to use
explicit casting be worth the time?

Thanks for any clarification.

Calculations against numeric are several orders of magnitude slower
than native binary operations. Fortunately the time the database
spends doing these types of calculations is often a tiny fraction of
overall execution time and I advise giving numeric a whirl unless you
measure a big performance hit. Let's put it this way: native binary
types are a performance hack that come with all kinds of weird baggage
that percolate up and uglify your code: your example given is a
classic case in point. Database "integer" types are not in fact
integers but a physically constrained approximation of them. Floating
point types are even worse.

Another example: I just found out for the first time (after many years
of programming professionally) that -2147483648 / -1 raises a hardware
exception: this is exactly the kind of thing that makes me think that
rote use of hardware integer types is a terribly bad practice.

merlin