caching subtotals: update vs sum -- aaugh!

Started by will trillichabout 24 years ago7 messagesgeneral
Jump to latest
#1will trillich
will@serensoft.com

to save a few cycles, i'm hoping to cache subtotals of
subsidiary records into the corresponding parent records -- but
i can't figure out how to update the parent table with the sums
of the child table fields:

create table invoice (
id serial,
bal numeric(8,2)
cost numeric(8,2),
charge numeric(8,2),
);
create table line_item (
id serial,
item_id integer references invoice ( id ),
bal numeric(8,2)
cost numeric(8,2),
charge numeric(8,2),
);

selecting the sums is easy:

select
item_id,
sum(bal) as bal,
sum(cost) as cost,
sum(charge) as charge
from
line_item
group by
item_id;

but i can't seem to get by cerebellum around how to flow all the
subtotals upstream -- these don't work:

-- all totals wind up in ONE invoice record
update invoice set
bal = sum(line_item.bal),
cost = sum(line_item.cost),
charge = sum(line_item.charge)
where line_item.item_id = set.id;

-- syntax error at 'group'
update invoice set
bal = sum(line_item.bal),
cost = sum(line_item.cost),
charge = sum(line_item.charge)
from line_item
where line_item.item_id = set.id GROUP BY line_item.item_id;

the next one works, but not even within earshot of 'elegant'.
there's GOTTA be a slicker way, right?

-- ridiculosity of redundancy but it limps into the right result
update invoice set
bal = (select sum(line_item.bal) where item_id = invoice.id),
cost = (select sum(line_item.cost) where item_id = invoice.id),
charge = (select sum(line_item.charge) where item_id = invoice.id)
from line_item
where line_item.item_id = set.id GROUP BY line_item.item_id;

or should i settle for something like

select
item_id,
sum(bal) as bal,
sum(cost) as cost,
sum(charge) as charge
into
fooey
from
line_item
group by
item_id
;
update invoice set
bal = fooey.bal,
cost = fooey.cost,
charge = fooey.charge
where fooey.item_id = id
;
drop table fooey
;

...? seems a bit of the old "long-way-around"... surely there's
a way--

--
DEBIAN NEWBIE TIP #110 from Dimitri Maziuk <dmaziuk@yola.bmrb.wisc.edu>
:
Here's how to TUNNEL SECURE X11 CONNECTIONS THROUGH SSH: on the
client, do this:
client# export DISPLAY=client:0.0
client# ssh -X server
then once you're logged in at the server, do:
server# netscape &
The environment created at the server will include the DISPLAY
variable, so netscape (or whatever) will dialogue with the
client machine. (See "man ssh" for more.)

Also see http://newbieDoc.sourceForge.net/ ...

#2Alaric B. Snell
abs@frontwire.com
In reply to: will trillich (#1)
Re: caching subtotals: update vs sum -- aaugh!

On Thu, 10 Jan 2002, will trillich wrote:

or should i settle for something like

select
item_id,
sum(bal) as bal,
sum(cost) as cost,
sum(charge) as charge
into
fooey
from
line_item
group by
item_id
;
update invoice set
bal = fooey.bal,
cost = fooey.cost,
charge = fooey.charge
where fooey.item_id = id
;
drop table fooey
;

...? seems a bit of the old "long-way-around"... surely there's
a way--

A stored procedure sounds the way ahead to me. DO it with a trigger if you
want updates to happen automagically whenever the line_items are changed:

http://www.postgresql.org/idocs/index.php?plpgsql-trigger.html
http://www.postgresql.org/idocs/index.php?sql-createtrigger.html

--
Alaric B. Snell, Developer
abs@frontwire.com

#3Andrew Gould
andrewgould@yahoo.com
In reply to: will trillich (#1)
Re: caching subtotals: update vs sum -- aaugh!

Will,

Where does 'set.id' come from? If this was a typo,
was it in the sql query you executed? I would think
that your first attempt should have worked if 'set.id'
was replaced with 'invoice.id'.

Best of luck,

Andrew Gould

--- will trillich <will@serensoft.com> wrote:

to save a few cycles, i'm hoping to cache subtotals
of
subsidiary records into the corresponding parent
records -- but
i can't figure out how to update the parent table
with the sums
of the child table fields:

create table invoice (
id serial,
bal numeric(8,2)
cost numeric(8,2),
charge numeric(8,2),
);
create table line_item (
id serial,
item_id integer references invoice ( id ),
bal numeric(8,2)
cost numeric(8,2),
charge numeric(8,2),
);

selecting the sums is easy:

select
item_id,
sum(bal) as bal,
sum(cost) as cost,
sum(charge) as charge
from
line_item
group by
item_id;

but i can't seem to get by cerebellum around how to
flow all the
subtotals upstream -- these don't work:

-- all totals wind up in ONE invoice record
update invoice set
bal = sum(line_item.bal),
cost = sum(line_item.cost),
charge = sum(line_item.charge)
where line_item.item_id = set.id;

-- syntax error at 'group'
update invoice set
bal = sum(line_item.bal),
cost = sum(line_item.cost),
charge = sum(line_item.charge)
from line_item
where line_item.item_id = set.id GROUP BY
line_item.item_id;

the next one works, but not even within earshot of
'elegant'.
there's GOTTA be a slicker way, right?

-- ridiculosity of redundancy but it limps into the
right result
update invoice set
bal = (select sum(line_item.bal) where item_id =
invoice.id),
cost = (select sum(line_item.cost) where item_id
= invoice.id),
charge = (select sum(line_item.charge) where
item_id = invoice.id)
from line_item
where line_item.item_id = set.id GROUP BY
line_item.item_id;

or should i settle for something like

select
item_id,
sum(bal) as bal,
sum(cost) as cost,
sum(charge) as charge
into
fooey
from
line_item
group by
item_id
;
update invoice set
bal = fooey.bal,
cost = fooey.cost,
charge = fooey.charge
where fooey.item_id = id
;
drop table fooey
;

...? seems a bit of the old "long-way-around"...
surely there's
a way--

--
DEBIAN NEWBIE TIP #110 from Dimitri Maziuk
<dmaziuk@yola.bmrb.wisc.edu>
:
Here's how to TUNNEL SECURE X11 CONNECTIONS THROUGH
SSH: on the
client, do this:
client# export DISPLAY=client:0.0
client# ssh -X server
then once you're logged in at the server, do:
server# netscape &
The environment created at the server will include
the DISPLAY
variable, so netscape (or whatever) will dialogue
with the
client machine. (See "man ssh" for more.)

Also see http://newbieDoc.sourceForge.net/ ...

---------------------------(end of
broadcast)---------------------------
TIP 4: Don't 'kill -9' the postmaster

__________________________________________________
Do You Yahoo!?
Send FREE video emails in Yahoo! Mail!
http://promo.yahoo.com/videomail/

#4Tom Lane
tgl@sss.pgh.pa.us
In reply to: will trillich (#1)
Re: caching subtotals: update vs sum -- aaugh!

will trillich <will@serensoft.com> writes:

but i can't seem to get by cerebellum around how to flow all the
subtotals upstream -- these don't work:

-- all totals wind up in ONE invoice record
update invoice set
bal = sum(line_item.bal),
cost = sum(line_item.cost),
charge = sum(line_item.charge)
where line_item.item_id = set.id;

UPDATEs containing top-level aggregate functions don't really work
correctly. SQL92 forbids such things entirely, suggesting that they
think it's not well-defined. Postgres doesn't currently reject the
query, but the behavior is rather broken IMHO. See past discussions
in the archives about whether we should reject this, and what it should
mean if we don't.

As for your example with UPDATE ... GROUP BY, I don't believe that that
will get past the parser.

The temp table is probably the right way to go, ugly as it is.
The only standards-conformant way to do it in one query would be three
independent subselects:

update invoice set
bal = (select sum(line_item.bal) from line_item where item_id = invoice.id),
cost = (select sum(line_item.cost) from line_item where item_id = invoice.id),
charge = (select sum(line_item.charge) from line_item where item_id = invoice.id);

and the amount of extra computation needed to do it that way is large.

Or ... wait a second. How about

update invoice set
bal = ss.bal,
cost = ss.cost,
charge = ss.charge
from
(select
item_id,
sum(bal) as bal,
sum(cost) as cost,
sum(charge) as charge
from
line_item
group by
item_id) ss
where ss.item_id = invoice.id;

I haven't tried this but it seems like it should work. Better check the
quality of the generated plan though. The temp table might be faster.

regards, tom lane

#5will trillich
will@serensoft.com
In reply to: Andrew Gould (#3)
Re: caching subtotals: update vs sum -- aaugh!

subtotalling child records into a parent field -- HOWTO?

--- will trillich <will@serensoft.com> wrote:

create table invoice (
id serial,
bal numeric(8,2)
cost numeric(8,2),
charge numeric(8,2),
);
create table line_item (
id serial,
item_id integer references invoice ( id ),
bal numeric(8,2)
cost numeric(8,2),
charge numeric(8,2),
);

i can't seem to get by cerebellum around how to
flow all the
subtotals upstream -- these don't work:

-- all totals wind up in ONE invoice record
update invoice set
bal = sum(line_item.bal),
cost = sum(line_item.cost),
charge = sum(line_item.charge)
where line_item.item_id = INVOICE.id;

-- syntax error at 'group'
update invoice set
bal = sum(line_item.bal),
cost = sum(line_item.cost),
charge = sum(line_item.charge)
from line_item
where line_item.item_id = INVOICE.id GROUP BY
line_item.item_id;

On Thu, Jan 10, 2002 at 05:06:01AM -0800, Andrew Gould wrote:

Where does 'set.id' come from? If this was a typo,
was it in the sql query you executed? I would think
that your first attempt should have worked if 'set.id'
was replaced with 'invoice.id'.

yes, that was a typo (which i fixed in my quoted sql above). and
no, it doesn't do what i want. (it DOES if i have just ONE
single invoice record in the table -- all totals wind up in just
one invoice record.) if you have an example that DOES work i'd
love to see it!

i'm guessing that this isn't really an obscure task
(subtotalling child records into a parent field) but getting it
to work just ain't happenin' fer me yet. i'm SURE there's gotta
be an sql-friendly way to do this... ?

-- 
DEBIAN NEWBIE TIP #119 from Jonathan D. Proulx <jon@ai.mit.edu>
:
Having trouble RUNNING REMOTE X APPLICATIONS?  You've tried "xhost
+<host>", set the DISPLAY variable on the remote session, and
checked that the "-nolisten tcp" flag is *not* being sent at X
startup, right?
  Verify that X is really listening: "netstat -tl" will show
all listening tcp ports; you should see port 6000 open if
display :0 is listening (6001 for :1 etc.)
  If it is listening, I'd start wondering about packet filtering
rules. Check ipchains or iptables...

Also see http://newbieDoc.sourceForge.net/ ...

#6will trillich
will@serensoft.com
In reply to: Tom Lane (#4)
Re: caching subtotals: update vs sum -- aaugh!

On Thu, Jan 10, 2002 at 10:41:50AM -0500, Tom Lane wrote:

Or ... wait a second. How about

update invoice set
bal = ss.bal,
cost = ss.cost,
charge = ss.charge
from
(select
item_id,
sum(bal) as bal,
sum(cost) as cost,
sum(charge) as charge
from
line_item
group by
item_id) ss
where ss.item_id = invoice.id;

I haven't tried this but it seems like it should work. Better check the
quality of the generated plan though. The temp table might be faster.

this is why we like having you around, Tom. you're sneaky in all
the right places. :)

i knew there was a solution in there somewhere...

UPDATE _acct SET
cost = ppp.cost,
charge = ppp.charge
FROM
(SELECT
acct_id,
SUM(cost) AS cost,
SUM(charge) AS charge
FROM
_prop
GROUP BY
acct_id) ppp
WHERE
acct_id = ppp.acct_id;

this seems like it'd properly update all _acct records with
appropriate subtotals from child _prop records. BUT-- after
doing just that (above), given this data in then _prop table,

db=# select id,acct_id,charge,cost from _prop;
id | acct_id | charge | cost
----+---------+--------+-------
3 | 4 | 0.00 | 0.00
4 | 3 | 0.00 | 0.00
5 | 2 | 210.98 | 7.25
2 | 2 | 384.95 | 40.00
(4 rows)

(then i do the update, above) i wind up with the following in
the _acct table:

db=# select id,charge,cost from _acct;
id | charge | cost
----+--------+-------
3 | 595.93 | 47.25
4 | 595.93 | 47.25
2 | 595.93 | 47.25
(3 rows)

the amounts are right for _acct.id = 2, but should be zero for
both of the others. arggh!

for completeness, i tried aliasing the main table, to no avail:

UPDATE _acct a SET -- alias for subquery?
cost = ppp.cost,
charge = ppp.charge
FROM
(SELECT
acct_id,
SUM(cost) AS cost,
SUM(charge) AS charge
FROM
_prop
WHERE -- hmm?
acct_id = a.id
GROUP BY
acct_id) ppp
WHERE
acct_id = ppp.acct_id;

ERROR: parser: parse error at or near "a"

i'm hoping to add this to a rule:

ON UPDATE TO acct DO INSTEAD (
UPDATE _acct SET
f1 = NEW.f1,
f2 = NEW.f2,
-- yada yada
fN = NEW.fN
WHERE
id = OLD.id
;
UPDATE _acct SET
cost = ppp.cost,
charge = ppp.charge
FROM
(SELECT
acct_id,
SUM(cost) AS cost,
SUM(charge) AS charge
FROM
_prop
-- WHERE
-- acct_id = OLD.acct_id
GROUP BY
acct_id
) ppp
WHERE
id = OLD.acct_id;
);

i'm not having any luck with this, though. i guess it's okay to
do the CREATE TEMP TABLE... DROP TABLE... within a rule, right?
:(

--
DEBIAN NEWBIE TIP #36 from Sean Quinlan <smq@gmx.co.uk>
:
Looking to CHANGE THE DEFAULT LS COLORS? It's simple: first,
dircolors -p >~/.dircolors
and then edit the results to suit your tastes; finally, insert
eval `dircolors -b ~/.dircolors`
in your ~/.bashrc. Next time you log in (or source ~/.bashrc)
your new colors will take effect.

Also see http://newbieDoc.sourceForge.net/ ...

#7will trillich
will@serensoft.com
In reply to: will trillich (#6)
Re: caching subtotals: update vs sum -- aaugh!

On Fri, Jan 11, 2002 at 03:11:16PM -0600, i whined about:

the amounts are right for _acct.id = 2, but should be zero for
both of the others. arggh!

then Tom Lane <tgl@sss.pgh.pa.us> replied

[ scratches head ] This may represent a bug. I'm too lazy today to
whip up a test case based on your emails --- could I trouble you for
a script that creates and loads the test tables?

here i had my hopes up. alas, it was only me and my bungles.

i think i found it-- this finally works (sure would like to have
the subquery use the WHERE, though...)

create rule acct_edit as
on update to acct
do instead (
update _acct set
code = NEW.code,
charge = p.charge,
cost = p.cost
from (
select
sum(charge) as charge,
sum(cost ) as cost,
acct_id
from
_prop
-- where
-- acct_id = OLD.id -- can't see *OLD* record here
group by
acct_id
) p
where
id = OLD.id
and
p.acct_id = OLD.id;
);

i also tried creating a temp table within the rule, but didn't
get very far. (and BOY is it crucial to have an index on
_prop(acct_id)!)

--

as for where the problem sat:

UPDATE _acct SET
cost = ppp.cost,
charge = ppp.charge
FROM
(SELECT
acct_id,
SUM(cost) AS cost,
SUM(charge) AS charge
FROM
_prop
GROUP BY
acct_id) ppp
WHERE
acct_id = ppp.acct_id;

the WHERE was wrong -- shoulda been

WHERE
id -- _acct.id, in top query
=
ppp.acct_id -- subquery from _prop.acct_id
;

--
DEBIAN NEWBIE TIP #60 from Vineet Kumar <debian-user@virtual.doorstop.net>
:
Been hoping to find A FEATURE-PACKED MUTT CONFIG FILE? Check
out the ones at Sven Guckes' site:
http://www.fefe.de/muttfaq/muttrc
There's also some great vimrc ideas there, too.

Also see http://newbieDoc.sourceForge.net/ ...