Primary Key

sam@samason.me.uk

over 18 years ago

In reply to: João Paulo Zavanela (#1)

Re: Primary Key

On Fri, Nov 16, 2007 at 03:50:20PM -0300, Jo??o Paulo Zavanela wrote:

How many fields is recomended to create a primary key?
I'm thinking to create one with 6 fields, is much?

Normally a primary key would just be a single column. When you start
going to that many I'd probably have a serial column as the primary key,
and a UNIQUE index on those six fields. Depends on what you're doing,
though unless you've got a few years experience I'd be tempted to stay
away from primary keys of more than a single column.

Sam

jd@commandprompt.com

over 18 years ago

In reply to: Sam Mason (#2)

Re: Primary Key

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On Fri, 16 Nov 2007 18:54:22 +0000
Sam Mason <sam@samason.me.uk> wrote:

On Fri, Nov 16, 2007 at 03:50:20PM -0300, Jo??o Paulo Zavanela wrote:

How many fields is recomended to create a primary key?
I'm thinking to create one with 6 fields, is much?

Normally a primary key would just be a single column.

Uhmm no. Normally a primary key is defined by the number of columns it
takes to determine naturally distinct values.

I would agree that if he is looking at 6 then he probably needs to
normalize further.

Sincerely,

Joshua D. Drake

- --

=== The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564 24x7/Emergency: +1.800.492.2240
PostgreSQL solutions since 1997 http://www.commandprompt.com/
UNIQUE NOT NULL
Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate
PostgreSQL Replication: http://www.commandprompt.com/products/

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)

iD8DBQFHPej7ATb/zqfZUUQRAjnAAJ4p8mH685V/ysBDd4sPLoUkDRSzfACffFhs
AkUg+htYb3Nv2wpN/0ecod8=
=hnSU
-----END PGP SIGNATURE-----

ron.l.johnson@cox.net

over 18 years ago

In reply to: João Paulo Zavanela (#1)

Re: Primary Key

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 11/16/07 12:50, Joï¿œo Paulo Zavanela wrote:

Hello,

How many fields is recomended to create a primary key?
I'm thinking to create one with 6 fields, is much?

The number of recommended fields is the *minimum* number required
for uniqueness. 1 or 6 or 24. Doesn't matter.

- --
Ron Johnson, Jr.
Jefferson LA USA

%SYSTEM-F-FISH, my hovercraft is full of eels
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)

iD8DBQFHPej/S9HxQb37XmcRAo5mAJoDwp1E+aL2M/oTWhOsR5XYJi0AhgCgvUgU
/ZB7nP+K6j0WW8vNn5Q8tFI=
=s+zS
-----END PGP SIGNATURE-----

ron.l.johnson@cox.net

over 18 years ago

In reply to: Sam Mason (#2)

Re: Primary Key

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 11/16/07 12:54, Sam Mason wrote:

On Fri, Nov 16, 2007 at 03:50:20PM -0300, Jo??o Paulo Zavanela wrote:

How many fields is recomended to create a primary key?
I'm thinking to create one with 6 fields, is much?

Normally a primary key would just be a single column. When you start
going to that many I'd probably have a serial column as the primary key,
and a UNIQUE index on those six fields. Depends on what you're doing,
though unless you've got a few years experience I'd be tempted to stay
away from primary keys of more than a single column.

Fie on you evil synthetic key lovers. Long live the Natural Key!

- --
Ron Johnson, Jr.
Jefferson LA USA

%SYSTEM-F-FISH, my hovercraft is full of eels
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)

iD8DBQFHPel7S9HxQb37XmcRAmPZAJ9mml45XtPRSCrAiQ7K+LCwWf5J5QCgvaSp
Zs5F1SU5CNQvnoofaxkAoIY=
=9ipR
-----END PGP SIGNATURE-----

jd@commandprompt.com

over 18 years ago

In reply to: Ron Johnson (#5)

Re: Primary Key

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On Fri, 16 Nov 2007 13:03:23 -0600
Ron Johnson <ron.l.johnson@cox.net> wrote:

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 11/16/07 12:54, Sam Mason wrote:

On Fri, Nov 16, 2007 at 03:50:20PM -0300, Jo??o Paulo Zavanela
wrote:

How many fields is recomended to create a primary key?
I'm thinking to create one with 6 fields, is much?

Normally a primary key would just be a single column. When you
start going to that many I'd probably have a serial column as the
primary key, and a UNIQUE index on those six fields. Depends on
what you're doing, though unless you've got a few years experience
I'd be tempted to stay away from primary keys of more than a single
column.

Fie on you evil synthetic key lovers. Long live the Natural Key!

Right with you there buddy. Let's get the pitchforks!

Joshua D. Drake

- --
Ron Johnson, Jr.
Jefferson LA USA

%SYSTEM-F-FISH, my hovercraft is full of eels
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)

iD8DBQFHPel7S9HxQb37XmcRAmPZAJ9mml45XtPRSCrAiQ7K+LCwWf5J5QCgvaSp
Zs5F1SU5CNQvnoofaxkAoIY=
=9ipR
-----END PGP SIGNATURE-----

---------------------------(end of
broadcast)--------------------------- TIP 2: Don't 'kill -9' the
postmaster

- --

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)

iD8DBQFHPetVATb/zqfZUUQRAp29AJ49UhU3LWyaIJq+F/vCkrZIMBIi8QCdEesA
EqTIjelyYrqR8+JN1Y3sPwM=
=AlLT
-----END PGP SIGNATURE-----

sam@samason.me.uk

over 18 years ago

In reply to: Ron Johnson (#5)

Re: Primary Key

( Fi Fie Foe Fum, I smell the blood of a religious war )

On Fri, Nov 16, 2007 at 01:03:23PM -0600, Ron Johnson wrote:

On 11/16/07 12:54, Sam Mason wrote:

On Fri, Nov 16, 2007 at 03:50:20PM -0300, Jo??o Paulo Zavanela wrote:

How many fields is recomended to create a primary key?
I'm thinking to create one with 6 fields, is much?

Normally a primary key would just be a single column. When you start
going to that many I'd probably have a serial column as the primary key,
and a UNIQUE index on those six fields. Depends on what you're doing,
though unless you've got a few years experience I'd be tempted to stay
away from primary keys of more than a single column.

Fie on you evil synthetic key lovers. Long live the Natural Key!

Really? I started off with everything using sequences and everything
was good. Then I found I wanted to do more complicated things so I
started to transition to natural keys and things were better. Then
I took things too far and wanted something artificial back in my
life. I'm back to almost never using natural keys now, mainly because
interfacing with the outside world gets too complicated. When I'm just
doing stuff inside the database then I can use natural keys and all is
good, otherwise things just get too complicated.

I'll probably look back in another few years and remember how young and
naive I was back now.

Sam

jd@commandprompt.com

over 18 years ago

In reply to: Sam Mason (#7)

Re: Primary Key

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On Fri, 16 Nov 2007 20:00:29 +0000
Sam Mason <sam@samason.me.uk> wrote:

Normally a primary key would just be a single column. When you
start going to that many I'd probably have a serial column as the
primary key, and a UNIQUE index on those six fields. Depends on
what you're doing, though unless you've got a few years
experience I'd be tempted to stay away from primary keys of more
than a single column.

Fie on you evil synthetic key lovers. Long live the Natural Key!

Really? I started off with everything using sequences and everything
was good. Then I found I wanted to do more complicated things so I
started to transition to natural keys and things were better. Then
I took things too far and wanted something artificial back in my

When that is needed I do this:

create table foo(id serial unique, a text, b text, primary (a,b));

Joshua D. Drake

life. I'm back to almost never using natural keys now, mainly because
interfacing with the outside world gets too complicated. When I'm
just doing stuff inside the database then I can use natural keys and
all is good, otherwise things just get too complicated.

I'll probably look back in another few years and remember how young
and naive I was back now.

Sam

---------------------------(end of
broadcast)--------------------------- TIP 6: explain analyze is your
friend

- --

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)

iD8DBQFHPfhWATb/zqfZUUQRArjCAJ90cXYwuw/A/yojQanj1gv+RJqJnQCdFvPv
6nGph8K57KcKtk1rTgfFSFg=
=fQ5l
-----END PGP SIGNATURE-----

sam@samason.me.uk

over 18 years ago

In reply to: Joshua D. Drake (#8)

Re: Primary Key

On Fri, Nov 16, 2007 at 12:06:46PM -0800, Joshua D. Drake wrote:

When that is needed I do this:

create table foo(id serial unique, a text, b text, primary (a,b));

Humm, so the other way around from what I've ended up doing. I'll need
to think about the implications of changing things around like this.
There are lots of things that seems as though they'll be pretty awkard
to do, I'm sure it's just because I haven't thought about it enough.

Sam

#10

Merlin Moncure

mmoncure@gmail.com

over 18 years ago

In reply to: Sam Mason (#9)

Re: Primary Key

On Nov 16, 2007 3:21 PM, Sam Mason <sam@samason.me.uk> wrote:

On Fri, Nov 16, 2007 at 12:06:46PM -0800, Joshua D. Drake wrote:

When that is needed I do this:

create table foo(id serial unique, a text, b text, primary (a,b));

Humm, so the other way around from what I've ended up doing. I'll need
to think about the implications of changing things around like this.
There are lots of things that seems as though they'll be pretty awkard
to do, I'm sure it's just because I haven't thought about it enough.

there is a lot of nuance to this debate and tons of hyperbole on both
sides. There are many side effects, pro and con, about choosing
'natural' keys vs. surrogates. josh's suggestion is the most
reasonable compromise, because it allows you to get the performance
benefits (which are sometimes overrated) when you need it, but still
forces you to think about how your data is _truly_ organized and what
makes each record unique. the sad fact is that sequences have made
developers lazy, not giving much thought to proper normalization
strategies which in turn often produces lousy databases. if you know
how to do things properly, you will know what we mean.

merlin

#11

David Fetter

david@fetter.org

over 18 years ago

In reply to: Joshua D. Drake (#8)

Re: Primary Key

On Fri, Nov 16, 2007 at 12:06:46PM -0800, Joshua D. Drake wrote:

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On Fri, 16 Nov 2007 20:00:29 +0000
Sam Mason <sam@samason.me.uk> wrote:

Normally a primary key would just be a single column. When
you start going to that many I'd probably have a serial column
as the primary key, and a UNIQUE index on those six fields.
Depends on what you're doing, though unless you've got a few
years experience I'd be tempted to stay away from primary keys
of more than a single column.

Fie on you evil synthetic key lovers. Long live the Natural
Key!

Really? I started off with everything using sequences and
everything was good. Then I found I wanted to do more complicated
things so I started to transition to natural keys and things were
better. Then I took things too far and wanted something
artificial back in my

When that is needed I do this:

create table foo(id serial unique, a text, b text, primary (a,b));

By itself, this insufficiently restricts what's going on in the table.
I'd recommend a TRIGGER that disallows updating the synthetic key.

Cheers,
David.
--
David Fetter <david@fetter.org> http://fetter.org/
Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter
Skype: davidfetter XMPP: david.fetter@gmail.com

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

#12

sam@samason.me.uk

over 18 years ago

In reply to: Merlin Moncure (#10)

Re: Primary Key

On Fri, Nov 16, 2007 at 04:41:42PM -0500, Merlin Moncure wrote:

On Nov 16, 2007 3:21 PM, Sam Mason <sam@samason.me.uk> wrote:

On Fri, Nov 16, 2007 at 12:06:46PM -0800, Joshua D. Drake wrote:

When that is needed I do this:

create table foo(id serial unique, a text, b text, primary (a,b));

Humm, so the other way around from what I've ended up doing. I'll need
to think about the implications of changing things around like this.
There are lots of things that seems as though they'll be pretty awkard
to do, I'm sure it's just because I haven't thought about it enough.

there is a lot of nuance to this debate and tons of hyperbole on both
sides. There are many side effects, pro and con, about choosing
'natural' keys vs. surrogates. josh's suggestion is the most
reasonable compromise, because it allows you to get the performance
benefits (which are sometimes overrated) when you need it,

I'm not sure if performance has ever really come into the decision about
whether to use natural/surrogate keys with me. The main reason for
using a surrogate key is simplicity; I don't trust myself to maintain a
large database where every relationship is composed of multiple columns.

If I could say somewhere that I want a set of properties to hold (i.e.
there is a 1-to-1 relationship between these tables, there's at most one
row in this table for each of these, etc) and then these constraints
were checked when I actually wrote my queries I'd be much happier.

For example, given the tables:

CREATE TABLE foo (
id INTEGER NOT NULL PRIMARY KEY,
description TEXT );
CREATE TABLE bar (
key INTEGER NOT NULL PRIMARY KEY,
fooid INTEGER NOT NULL REFERENCES foo );

I'd like to be able to write the query:

SELECT b.key, f.description
FROM bar b, foo f
WHERE b.fooid = f.id;

And be able to say that I expect exactly one row for each bar.key.
The database would be able to go back over the definitions and prove
that this constraint holds (because bar.key and foo.id are UNIQUE, the
FOREIGN KEY constraint has checked that bar.fooid always references a
valid foo.id, and that bar.fooid can never be NULL).

Or is this the sort of thing that materialised views are good for, and
I've always just been thinking about them as a performance hack.

the sad fact is that sequences have made
developers lazy, not giving much thought to proper normalization
strategies which in turn often produces lousy databases. if you know
how to do things properly, you will know what we mean.

"Properly" is very open ended. Most people will try to do their best
job (given various external constraints) and we've all experienced bad
design, if only from stuff that we did while learning. I think I've
experienced this, but you've probably got a very different idea about
what "properly" means than I do.

Sam

#13

Bruce Momjian

bruce@momjian.us

over 18 years ago

In reply to: Ron Johnson (#4)

Re: Primary Key

"Ron Johnson" <ron.l.johnson@cox.net> writes:

On 11/16/07 12:50, João Paulo Zavanela wrote:

Hello,

How many fields is recomended to create a primary key?
I'm thinking to create one with 6 fields, is much?

The number of recommended fields is the *minimum* number required
for uniqueness. 1 or 6 or 24. Doesn't matter.

Unless of course you care about every other table being 24x larger and slower
due to having all these copies of the 24 fields. And of course unless you care
about being able to handle the inevitable day when it turns out the 24 fields
aren't unique and you need to consider adding a 25th column to the table *and
every table referencing it* as well as changing every line of application code
to use the new column.

--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com
Get trained by Bruce Momjian - ask me about EnterpriseDB's PostgreSQL training!

#14

Greg Smith

gsmith@gregsmith.com

over 18 years ago

In reply to: Merlin Moncure (#10)

Re: Primary Key

On Fri, 16 Nov 2007, Merlin Moncure wrote:

the sad fact is that sequences have made developers lazy

Nah, developers were lazy long before that. If you ask Larry Wall it's a
virtue.

I gave up on this argument ten years ago after a long battle with
well-known natural key zealot Joe Celko wore me out. He published one of
his many articles making a case for them using an example from the
automotive industry. Only problem was, the unique identifier he suggested
wasn't. At the auto parts company I worked for, I had just spent many
monotonous days contorting keys to work around a problem caused by the
original designer there, who misunderstood some nuances of how the "Big
Three" auto manufacturers assigned part numbers the same way Celko did.

He doesn't use that example anymore but still misses the point I tried to
make. The ability of the world to invalidate the assumptions that go into
natural key assignment are really impressive. I particularly enjoy that
so many systems are built presuming that the Social Security number for a
person is involatile that this topic comes up in their FAQ about identify
theft: http://www.socialsecurity.gov/pubs/10064.html

--
* Greg Smith gsmith@gregsmith.com http://www.gregsmith.com Baltimore, MD

#15

Ted Byers

r.ted.byers@rogers.com

over 18 years ago

In reply to: Sam Mason (#12)

Need help with complicated SQL statement

Please consider the following statement (it becomes
obvious if you remember the important thing about the
table is that it has columns for each of stock_id,
price_date, and price).

(SELECT * FROM stockprices WHERE stock_id = 1 ORDER BY
price_date DESC LIMIT 1)
UNION
(SELECT * FROM (SELECT * FROM stockprices WHERE
stock_id = 1 ORDER BY price_date DESC LIMIT 22) AS T2
ORDER BY T2.price_date ASC LIMIT 1)
UNION
(SELECT * FROM (SELECT * FROM stockprices WHERE
stock_id = 1 ORDER BY price_date DESC LIMIT 66) AS T3
ORDER BY T3.price_date ASC LIMIT 1)
UNION
(SELECT * FROM (SELECT * FROM stockprices WHERE
stock_id = 1 ORDER BY price_date DESC LIMIT 132) AS T4
ORDER BY T4.price_date ASC LIMIT 1)
UNION
(SELECT * FROM (SELECT * FROM stockprices WHERE
stock_id = 1 ORDER BY price_date DESC LIMIT 264) AS T5
ORDER BY T5.price_date ASC LIMIT 1);

This statement works flawlessly, and is blindingly
fast relative to everything else I have tried. But I
am stuck.

First, while this statement gets me the correct data,
I need to obtain a single record with stock_id,
current price (that obtained from the first select
statement in the union, and each of the prices
returned by the subsequent select statements as a the
current price minus the price at the previous date,
and the result divided by the price at the previous
date, expressed as a percentage. I do not yet know
how to do this using SQL (it would be trivial if I
exported the data to Java or C++ - but it isn't clear
how to do it within SQL).

To make things more difficult, suppose I have another
select statement that returns a set of stock_ids. How
do I apply the SQL logic I require to only those
stocks in the set returned by a statement like SELECT
stock_id FROM someTable WHERE ... The result of this
extension would be that I have one record for each
stock in the selected set of stocks.

I do NOT want to have to recompute the set of stocks
for each of the select statements in the above union
(since that would be a waste because the resulting set
of stocks would always be the same for the given
criteria). Nor do I want to apply the SQL logic I
need for the prices to all the stocks in the database.
There could be thousands, or even tens of thousands,
of stocks represented in the database and I'd need the
gain/loss logic only for a few dozen at any given
time!

How do I make the two extensions I require?

I expect the SQL I get to be eventually placed in a
stored procedure, which may then be used to construct
a view, but that is the easy part.

Maybe I have been staring at this for too long to see
the obvious solution, but I am exhausted and am not
seeing the next step. If there IS an obvious next
step, please at least give me a hint.

Thanks

Ted

#16

Merlin Moncure

mmoncure@gmail.com

over 18 years ago

In reply to: Greg Smith (#14)

Re: Primary Key

On Nov 16, 2007 9:50 PM, Greg Smith <gsmith@gregsmith.com> wrote:

On Fri, 16 Nov 2007, Merlin Moncure wrote:

the sad fact is that sequences have made developers lazy

Nah, developers were lazy long before that. If you ask Larry Wall it's a
virtue.

well, 'lazy' in the sense that it encourages easy to solutions to
difficult problems is arguably virtuous. intellectual laziness (which
i most certainly am not accusing you [or the OP] of) is another
matter. long years of wrestling with you and many other less talented
individuals on this particular topic has imparted to me a little bit
of weariness as well. furthermore, i have myself surrogated a
database to victory on various occasions, although usually for
performance reasons...so i'm hardly a zealot. i do however think that
being able to separate data into tables using unambiguous keys lifted
directly from the data is a critical skill.

I gave up on this argument ten years ago after a long battle with
well-known natural key zealot Joe Celko wore me out. He published one of
his many articles making a case for them using an example from the
automotive industry. Only problem was, the unique identifier he suggested
wasn't. At the auto parts company I worked for, I had just spent many
monotonous days contorting keys to work around a problem caused by the
original designer there, who misunderstood some nuances of how the "Big
Three" auto manufacturers assigned part numbers the same way Celko did.

well, nobody's perfect...

He doesn't use that example anymore but still misses the point I tried to
make. The ability of the world to invalidate the assumptions that go into
natural key assignment are really impressive. I particularly enjoy that
so many systems are built presuming that the Social Security number for a
person is involatile that this topic comes up in their FAQ about identify
theft: http://www.socialsecurity.gov/pubs/10064.html

that just means that the SSN is only part of the key that
unambiguously defines a person, should that be a requirement :)

database design, like many engineering disciplines, is a series of
trade-offs mixed in with a couple of helpings of artistry and the few
bits of theory that the sql standards committee was was not able to
snuff out. like i said in my opening remarks, the issues at play are
nuanced without clear cut answers.

merlin

p.s. no compilation of 80's albums is complete without 'full moon fever'...

#17

ron.l.johnson@cox.net

over 18 years ago

In reply to: Greg Smith (#14)

Re: Primary Key

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 11/16/07 20:50, Greg Smith wrote:
[snip]

He doesn't use that example anymore but still misses the point I tried
to make. The ability of the world to invalidate the assumptions that go
into natural key assignment are really impressive. I particularly enjoy
that so many systems are built presuming that the Social Security number
for a person is involatile that this topic comes up in their FAQ about
identify theft: http://www.socialsecurity.gov/pubs/10064.html

Natural PKs are *not* set in stone, and only stubborn fools won't
admit that they can't divine all situations. So, you add a new
column to the PK and keep on going.

But still, there *are* some circumstances where natural PKs just
don't work. After all, SSNs and credit card numbers are synthetic
(just not generated sequential by the RDBMS).

- --
Ron Johnson, Jr.
Jefferson LA USA

%SYSTEM-F-FISH, my hovercraft is full of eels
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)

iD8DBQFHPnLPS9HxQb37XmcRAggNAKCL8UDTQ3238mbYIiV32AUAnOs+aQCgkhTP
yr+t6pT5loh7PBUc3QPljD4=
=O/Eb
-----END PGP SIGNATURE-----

#18

pgsql@Sheeky.Biz

over 18 years ago

In reply to: Ted Byers (#15)

Re: Need help with complicated SQL statement

Ted Byers wrote:

Please consider the following statement (it becomes
obvious if you remember the important thing about the
table is that it has columns for each of stock_id,
price_date, and price).

(SELECT * FROM stockprices WHERE stock_id = 1 ORDER BY
price_date DESC LIMIT 1)
UNION
(SELECT * FROM (SELECT * FROM stockprices WHERE
stock_id = 1 ORDER BY price_date DESC LIMIT 22) AS T2
ORDER BY T2.price_date ASC LIMIT 1)
UNION
(SELECT * FROM (SELECT * FROM stockprices WHERE
stock_id = 1 ORDER BY price_date DESC LIMIT 66) AS T3
ORDER BY T3.price_date ASC LIMIT 1)
UNION
(SELECT * FROM (SELECT * FROM stockprices WHERE
stock_id = 1 ORDER BY price_date DESC LIMIT 132) AS T4
ORDER BY T4.price_date ASC LIMIT 1)
UNION
(SELECT * FROM (SELECT * FROM stockprices WHERE
stock_id = 1 ORDER BY price_date DESC LIMIT 264) AS T5
ORDER BY T5.price_date ASC LIMIT 1);

This statement works flawlessly, and is blindingly
fast relative to everything else I have tried. But I
am stuck.

I would have these subselects as -

UNION
(SELECT * FROM stockprices WHERE
stock_id = 1 ORDER BY price_date DESC LIMIT 1 OFFSET 264)

I would expect that to give the same result but make the query plan a
bit simpler and quicker using less memory.

First, while this statement gets me the correct data,
I need to obtain a single record with stock_id,
current price (that obtained from the first select
statement in the union, and each of the prices
returned by the subsequent select statements as a the
current price minus the price at the previous date,
and the result divided by the price at the previous
date, expressed as a percentage. I do not yet know
how to do this using SQL (it would be trivial if I
exported the data to Java or C++ - but it isn't clear
how to do it within SQL).

I haven't tested this but I would start with -

CREATE VIEW stock_price_combined AS
SELECT
stock_id

, (SELECT stock_price FROM stockprices WHERE stock_id = OT.stock_id
ORDER BY price_date DESC LIMIT 1) as orig_price

, (SELECT stock_price FROM stockprices WHERE stock_id = OT.stock_id
ORDER BY price_date DESC LIMIT 1 OFFSET 22) as price_two

, (SELECT stock_price FROM stockprices WHERE stock_id = OT.stock_id
ORDER BY price_date DESC LIMIT 1 OFFSET 66) as price_three

, (SELECT stock_price FROM stockprices WHERE stock_id = OT.stock_id
ORDER BY price_date DESC LIMIT 1 OFFSET 132) as price_four

, (SELECT stock_price FROM stockprices WHERE stock_id = OT.stock_id
ORDER BY price_date DESC LIMIT 1 OFFSET 264) as price_five

FROM stock_prices OT;

Then you can -

SELECT
orig_price
, (orig_price - price_two) as price_increase
, ((orig_price - price_two)/price_two) as percentile
...
...

FROM stock_price_combined

WHERE stock_id in (SELECT stock_id FROM someTable WHERE ...)

To make things more difficult, suppose I have another
select statement that returns a set of stock_ids. How
do I apply the SQL logic I require to only those
stocks in the set returned by a statement like SELECT
stock_id FROM someTable WHERE ... The result of this
extension would be that I have one record for each
stock in the selected set of stocks.

SELECT * from stockprices WHERE stock_id in (SELECT stock_id FROM
someTable WHERE ...)

If that isn't the answer you want I hope it points you in the right
direction...

Shane Ambler
pgSQL@Sheeky.Biz

Get Sheeky @ http://Sheeky.Biz

#19

Ted Byers

r.ted.byers@rogers.com

over 18 years ago

In reply to: Shane Ambler (#18)

Re: Need help with complicated SQL statement

--- Shane Ambler <pgsql@Sheeky.Biz> wrote:

Ted Byers wrote:

Please consider the following statement (it

becomes

obvious if you remember the important thing about

the

table is that it has columns for each of stock_id,
price_date, and price).

(SELECT * FROM stockprices WHERE stock_id = 1

ORDER BY

price_date DESC LIMIT 1)
UNION
(SELECT * FROM (SELECT * FROM stockprices WHERE
stock_id = 1 ORDER BY price_date DESC LIMIT 22) AS

T2

ORDER BY T2.price_date ASC LIMIT 1)
UNION
(SELECT * FROM (SELECT * FROM stockprices WHERE
stock_id = 1 ORDER BY price_date DESC LIMIT 66) AS

T3

ORDER BY T3.price_date ASC LIMIT 1)
UNION
(SELECT * FROM (SELECT * FROM stockprices WHERE
stock_id = 1 ORDER BY price_date DESC LIMIT 132)

AS T4

ORDER BY T4.price_date ASC LIMIT 1)
UNION
(SELECT * FROM (SELECT * FROM stockprices WHERE
stock_id = 1 ORDER BY price_date DESC LIMIT 264)

AS T5

ORDER BY T5.price_date ASC LIMIT 1);

This statement works flawlessly, and is blindingly
fast relative to everything else I have tried.

But I

am stuck.

I would have these subselects as -

UNION
(SELECT * FROM stockprices WHERE
stock_id = 1 ORDER BY price_date DESC LIMIT 1
OFFSET 264)

I would expect that to give the same result but make
the query plan a
bit simpler and quicker using less memory.

It gave apparently correct values, but for some
reason, it insisted on returning thousands upon
thousands of identical record. There is something
awry there, but I can't place what. Yes, I know I
could use SELECT DISTINCT, but I worry that it may be
doing a full table scan, as opposed to the relatively
direct lookup I came up with after looking at your
statement. I don't yet know how long it would take
because it is the slowest option I tied, and I gave up
after it had returned over 10,000 rows and still
showed no signs of finishing. I don't understand this
as explain returned apparently much better results for
yours than it did for mine.

My latest is as follows:

SELECT A1.stock_id,
A1.price_date,
A1.adjusted,
A2.price_date AS pd22,
100.0 * (A1.adjusted - A2.adjusted)/A2.adjusted
AS gl22pc,
A3.price_date AS pd66,
100.0 * (A1.adjusted - A3.adjusted)/A3.adjusted
AS gl66pc,
A4.price_date AS pd132,
100.0 * (A1.adjusted - A4.adjusted)/A4.adjusted
AS gl132pc,
A5.price_date AS pd264,
100.0 * (A1.adjusted - A5.adjusted)/A5.adjusted
AS gl264pc
FROM
(SELECT * FROM stockprices WHERE stock_id = 1 ORDER BY
price_date DESC LIMIT 1) AS A1
LEFT JOIN
(SELECT * FROM stockprices WHERE stock_id = 1 ORDER BY
price_date DESC LIMIT 1 OFFSET 22) AS A2
ON A1.stock_id = A2.stock_id
LEFT JOIN
(SELECT * FROM stockprices WHERE stock_id = 1 ORDER BY
price_date DESC LIMIT 1 OFFSET 66) AS A3
ON A1.stock_id = A3.stock_id
LEFT JOIN
(SELECT * FROM stockprices WHERE stock_id = 1 ORDER BY
price_date DESC LIMIT 1 OFFSET 132) AS A4
ON A1.stock_id = A4.stock_id
LEFT JOIN
(SELECT * FROM stockprices WHERE stock_id = 1 ORDER BY
price_date DESC LIMIT 1 OFFSET 264) AS A5
ON A1.stock_id = A5.stock_id;

This still gives me the correct answer, but is faster
still than anything I came up with before.

Now that I have the correct result for one stock, I
need to adapt it to apply to each stock individually,
in some small selection from a large number of stocks.

Thanks again

Ted

#20

ron.l.johnson@cox.net

over 18 years ago

In reply to: Bruce Momjian (#13)

Re: Primary Key

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 11/17/07 01:21, Gregory Stark wrote:

"Ron Johnson" <ron.l.johnson@cox.net> writes:

On 11/16/07 12:50, Joï¿½o Paulo Zavanela wrote:

Hello,

How many fields is recomended to create a primary key?
I'm thinking to create one with 6 fields, is much?

The number of recommended fields is the *minimum* number required
for uniqueness. 1 or 6 or 24. Doesn't matter.

Unless of course you care about every other table being 24x larger and slower
due to having all these copies of the 24 fields. And of course unless you care
about being able to handle the inevitable day when it turns out the 24 fields
aren't unique and you need to consider adding a 25th column to the table *and
every table referencing it* as well as changing every line of application code
to use the new column.

What's got to be done has got to be done.

On one of our systems, the natural PK of an electronic road toll is:
ETC_ACCOUNT_ID INTEGER
FISCAL_PERIOD INTEGER
LANE_TX_ID BIGINT
TX_TYPE_ID CHAR(1)
TX_SUBTYPE_IND CHAR(1)

On another, it's:
ETC_ACCOUNT_ID INTEGER
FISCAL_PERIOD INTEGER
LANE_TX_ID BIGINT
DEVICE_NO CHAR(12) <<<< added column
TX_TYPE_ID CHAR(1)
TX_SUBTYPE_IND CHAR(1)

If the PK was synthetic and generated by the engine, then a (buggy)
app could insert duplicate tolls and the system wouldn't utter a
peep. But the customer sure would when he saw the duplicate entries.

Note the seemingly *synthetic* field LANE_TX_ID.

Records coming in from the lane are inserted into the T_LANE_TX
table which has the PK of LANE_TX_ID. However, that table also has
a "natural" unique index of LANE_ID, TX_DATE, TX_TIME, TX_SEQ_NUMBER

Likewise, T_LANE has the synthetic PK of LANE_ID, but it
back-stopped by a natural unique index PLAZA_ID, EXTERN_LANE_ID.

And... T_PLAZA has the PK of PLAZA_ID and is back-stopped by the
natural unique index AGENCY_ID, EXTERN_PLAZA_ID.

Breaking the chain, T_AGENCY only has the synthetic key AGENCY_ID.

But it only has 27 rows.

- --
Ron Johnson, Jr.
Jefferson LA USA

%SYSTEM-F-FISH, my hovercraft is full of eels
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)

iD8DBQFHP701S9HxQb37XmcRAk1cAKCFnNraM2Z1s8M8j8sl8Gotxz1r0QCglEfJ
sUCabkDaZTQVc/kCyHGewhQ=
=b9ii
-----END PGP SIGNATURE-----

#21

pgsql@Sheeky.Biz

over 18 years ago

In reply to: Ted Byers (#19)

#22

sam@samason.me.uk

over 18 years ago

In reply to: Ron Johnson (#20)

#23

pgsql@Sheeky.Biz

over 18 years ago

In reply to: Shane Ambler (#21)

#24

Ted Byers

r.ted.byers@rogers.com

over 18 years ago

In reply to: Shane Ambler (#23)

#25

pgsql@Sheeky.Biz

over 18 years ago

In reply to: Ted Byers (#24)

#26

Sascha Bohnenkamp

asbohnenkamp@gmx.de

over 18 years ago

In reply to: Ron Johnson (#5)

#27

Peter Childs

peterachilds@gmail.com

over 18 years ago

In reply to: Sascha Bohnenkamp (#26)

#28

Martijn van Oosterhout

kleptog@svana.org

over 18 years ago

In reply to: Peter Childs (#27)

#29

Csaba Nagy

nagy@ecircle-ag.com

over 18 years ago

In reply to: Martijn van Oosterhout (#28)

#30

sam@samason.me.uk

over 18 years ago

In reply to: Csaba Nagy (#29)

#31

ron.l.johnson@cox.net

over 18 years ago

In reply to: Peter Childs (#27)

#32

Tom Lane

tgl@sss.pgh.pa.us

over 18 years ago

In reply to: Ron Johnson (#31)

#33

James B. Byrne

byrnejb@harte-lyne.ca

over 18 years ago

In reply to: Tom Lane (#32)

#34

Michael Glaesemann

grzm@seespotcode.net

over 18 years ago

In reply to: James B. Byrne (#33)

#35

Steve Crawford

scrawford@pinpointresearch.com

over 18 years ago

In reply to: Martijn van Oosterhout (#28)

#36

jd@commandprompt.com

over 18 years ago

In reply to: Steve Crawford (#35)

#37

Richard Broersma Jr

rabroersma@yahoo.com

over 18 years ago

In reply to: Joshua D. Drake (#36)

#38

Garber, Mikhail

mgarber@amazon.com

over 18 years ago

In reply to: Richard Broersma Jr (#37)

#39

jd@commandprompt.com

over 18 years ago

In reply to: Richard Broersma Jr (#37)

#40

Scott Ribe

scott_ribe@killerbytes.com

over 18 years ago

In reply to: Steve Crawford (#35)

#41

Steve Crawford

scrawford@pinpointresearch.com

over 18 years ago

In reply to: Joshua D. Drake (#36)

#42

Scott Marlowe

scott.marlowe@gmail.com

over 18 years ago

In reply to: Steve Crawford (#41)

#43

Ron Mayer

rm_pg@cheapcomplexdevices.com

over 18 years ago

In reply to: Joshua D. Drake (#39)

#44