Writing most code in Stored Procedures

Started by RPKover 18 years ago21 messagesgeneral
Jump to latest
#1RPK
rpk.general@gmail.com

I have few queries regarding the use of Stored Procedures, Functions
and Triggers in an RDBMS.

(1) When to use Stored Procedure? Writing an INSERT query in a Stored
Procedure is better or firing it from the application level?

(2) Can a Trigger call a Stored Procedure?

(3) What type of code must reside in application and what type in
RDBMS?

(4) Is it faster to work at application level or at the database level?

#2Richard Huxton
dev@archonet.com
In reply to: RPK (#1)
Re: Writing most code in Stored Procedures

Rohit wrote:

I have few queries regarding the use of Stored Procedures, Functions
and Triggers in an RDBMS.

These are all easy questions to answer: "it depends".

OK, so you might want some reasons...

(1) When to use Stored Procedure? Writing an INSERT query in a Stored
Procedure is better or firing it from the application level?

Some people like to have all access to the database through stored
procedures. This can make it simpler to control access to the data and
seems popular with MSSQL developers.

Personally, I use SQL as my interface - that's its purpose. Actually,
that's not quite true, I use various wrappers that generate SQL for most
queries.

(2) Can a Trigger call a Stored Procedure?

It has to, although they're just called functions in PostgreSQL, can be
written in a range of procedural languages and take place within a
transaction the same as any other function-call.

(3) What type of code must reside in application and what type in
RDBMS?

Anything that's to do with data integrity I put in the database.
Anything that's to do with process I put in the application (or
middle-ware layer).
The tricky bit is deciding what to do when you have something that's a
bit of both.

(4) Is it faster to work at application level or at the database level?

Probably faster in the database, assuming you have only one machine. If
you have more than one machine then you can have each machine designed
for its purpose. Of course, faster to run might be at the cost of more
difficult to write. It might also mean your process code can crash the
database server.

--
Richard Huxton
Archonet Ltd

#3Lew
lew@lewscanon.com
In reply to: Richard Huxton (#2)
Re: Writing most code in Stored Procedures

Rohit wrote:

(4) Is it faster to work at application level or at the database level?

Richard Huxton wrote:

Probably faster in the database, assuming you have only one machine. If
you have more than one machine then you can have each machine designed
for its purpose. Of course, faster to run might be at the cost of more
difficult to write. It might also mean your process code can crash the
database server.

Which would you rather have, a wrong answer really quickly, a correct answer
slowly, or no answer at all?

Be clear on the metric for "faster" before devising strategies. Personally I
prefer "more correct" first. As Richard hinted, reliability and stability are
usually important also, I'd say more important.

"Faster to run" also depends on the degree of client concurrency. Sometimes
middleware can scale logic better than the database engine can. Systems can
achieve higher aggregate throughput with perhaps a reduction in individual
response time.

If you put logic where it "belongs" that can help scalability. The database
tends to be a central resource for all concurrent clients, so logic in that
layer is potentially a bottleneck. Danger in that layer is danger to everyone.

Business logic that relies on data but performs session-specific processing
should live in the middleware in many cases. One can throw extra hardware at
that layer to parallelize client services, and errors tend to not escape a
specific client session.

Business logic that is actually data-integrity logic could break the system if
it were in the middleware layer. Deadlocks, race conditions and other
nastiness argue that that type of logic belongs in a more central location,
with the data. One can then exploit the database engine capabilities for this
stuff.

As Richard metioned, there are plenty of boundary cases that require a
judgment call. Try to analyze which approach will have more risk in such
cases; sometimes that helps discriminate. It ain't always easy; that's why
they pay us the big bucks.

--
Lew

#4Joshua Tolley
eggyknap@gmail.com
In reply to: RPK (#1)
Re: Writing most code in Stored Procedures

On 8/15/07, Rohit <rpk.general@gmail.com> wrote:

I have few queries regarding the use of Stored Procedures, Functions
and Triggers in an RDBMS.

(1) When to use Stored Procedure? Writing an INSERT query in a Stored
Procedure is better or firing it from the application level?

(2) Can a Trigger call a Stored Procedure?

(3) What type of code must reside in application and what type in
RDBMS?

(4) Is it faster to work at application level or at the database level?

One of the major advantages of stored procedures over application code
is that stored procedures can get to the data much more quickly than
an application can, in general. An application needs to talk to
PostgreSQL through some sort of driver (JDBC, libpq, etc.) and
typically data need to traverse a network to get from the database
machine to the application server. Stored procedures, on the other
hand, don't have the overhead of either the network or the driver.
However, stored procedures can be more difficult to debug, profile,
etc., so they might not be the best for really complex logic. They
tend to be really useful if you're doing something that requires lots
and lots of queries to the database, and don't need anything else
(data from other sources, user interaction, etc.), because that takes
greatest advantage of their quick connection to the data. Richard
Huxton's point that stored procedures are typically best for data
integrity types of functions, whereas business logic should often be
in application code is an excellent rule of thumb.

-Josh

#5Steve Manes
smanes@magpie.com
In reply to: Joshua Tolley (#4)
Re: Writing most code in Stored Procedures

On 8/15/07, Rohit <rpk.general@gmail.com> wrote:

I have few queries regarding the use of Stored Procedures, Functions
and Triggers in an RDBMS.

(1) When to use Stored Procedure? Writing an INSERT query in a Stored
Procedure is better or firing it from the application level?

(2) Can a Trigger call a Stored Procedure?

(3) What type of code must reside in application and what type in
RDBMS?

(4) Is it faster to work at application level or at the database level?

I'm fairly hardcore about keeping as much business logic as I can in the
database. In fact, I only do SELECTs from the application, and usually
via Views. All inserts, updates and deletes are via procs. I'm a
proponent of separating application code from presentation and
application code from database code. And HTML from layout style, for
that matter.

In addition to the other reasons you've gotten:

It lets me blackbox commonly used functions, such as a function to
insert a normalized demographic record for a customer, an organization,
a user, a guarantor, a physician, etc.

It isolates database logic so it can be debugged separately from the
application.

It reduces the application's vulnerability to SQL injection, especially
if another developer (never me <g>) forgets to massage potentially
tainted user input.

Another is because I typically do my web application programming in PHP5
but the offline scripts in Perl. Both can call the same stored
procedures so I don't have multiple copies of database code to maintain
and debug.

Another is because I want transactions to start and end in the database,
not in external application code which might crash before a COMMIT.

Another is because I'm a freelancer and generally hand off my
applications to the client's tech department, which is often a junior
level grasshopper who knows just enough SQL to be dangerous. Using
stored procedures raises the bar on what they need to know about RDMBSes
before they start hacking working code.

And, yes, it's faster. Particularly if business logic decisions have to
be made in context with a transaction.

#6Trevor Talbot
quension@gmail.com
In reply to: Steve Manes (#5)
Re: Writing most code in Stored Procedures

On 8/15/07, Steve Manes <smanes@magpie.com> wrote:

I'm fairly hardcore about keeping as much business logic as I can in the
database. In fact, I only do SELECTs from the application, and usually
via Views. All inserts, updates and deletes are via procs. I'm a
proponent of separating application code from presentation and
application code from database code. And HTML from layout style, for
that matter.

In addition to the other reasons you've gotten:

Another is because I want transactions to start and end in the database,
not in external application code which might crash before a COMMIT.

Hmm, how do you handle this logically? Do your applications never
need to submit chunks of work at once? Or do you do something like
fill in a temporary table, and have a proc work from that?

Another is because I'm a freelancer and generally hand off my
applications to the client's tech department, which is often a junior
level grasshopper who knows just enough SQL to be dangerous. Using
stored procedures raises the bar on what they need to know about RDMBSes
before they start hacking working code.

...although that helps explain "why" a bit more: if you're stuck with
people who can't use transactions properly, it makes sense to keep
them out of reach...

In reply to: Steve Manes (#5)
Re: Writing most code in Stored Procedures

On 8/16/07, Steve Manes <smanes@magpie.com> wrote:

On 8/15/07, Rohit <rpk.general@gmail.com> wrote:

Another is because I typically do my web application programming in PHP5
but the offline scripts in Perl. Both can call the same stored
procedures so I don't have multiple copies of database code to maintain
and debug.

I'm also thinking the same thing once before. Doing as much logic in
the database and at the application layer most of it would be just a
"SELECT .." call. It allows me to experiment with different kind of
langguage (PHP, Python, Ruby) without having to rewrite the business
logic of the application. But using this kind of approach make it hard
to debug and maintain. Changing simple thing would mean updating the
database schema and it would always frigthened me some silly mistake
can affect the integrity of the data. In the end, I slowly move the
logic out of the db and decided on one platform for the application
(PHP).

--
kamal, www.k4ml.com

#8Steve Manes
smanes@magpie.com
In reply to: Trevor Talbot (#6)
Re: Writing most code in Stored Procedures

Trevor Talbot wrote:

Another is because I want transactions to start and end in the database,
not in external application code which might crash before a COMMIT.

Hmm, how do you handle this logically? Do your applications never
need to submit chunks of work at once? Or do you do something like
fill in a temporary table, and have a proc work from that?

Of course not "never" but the goal is one logical database transaction
per page invocation. It's a discipline I got into when I was working on
a site that clocked 3/4 billion page views month on a severely overtaxed
Sun Oracle box. A lot of it involves caching on the web server. For
instance, static data like pulldown data gets cached and refreshed
either by expiry or by the application blowing the cache after an
update. If I have a thousand users pulling the same list of physicians
a hundred times a day, I cache it. If users need to view a complex
report where the dependent data only changes once every 24 hours (like
an adserver pull), I cache it and have an offline job rebuild it when
fresh data becomes available.

Result sets get cached either in the user's session or in hidden DIVs to
reduce redundant database calls.

I build database-intensive applications starting at the database,
beginning with an ERD, then the proc primitives. Then I wireframe the
application from the perspective of the database, building macros of
procedures. After testing, these go into the application's API. The
brass ring is being able to take a validated, prepared and filtered POST
array and send it to a stored procedure. And rather than building a
200+ line SELECT statement in the application, building a view to
encapsulate it.

I'm not saying that's the best way to build an application. It's just
what I'm most comfortable doing and it works for me.

#9Guy Rouillier
guyr-ml1@burntmail.com
In reply to: Steve Manes (#5)
Re: Writing most code in Stored Procedures

Steve Manes wrote:

I'm fairly hardcore about keeping as much business logic as I can in the
database. In fact, I only do SELECTs from the application, and usually
via Views. All inserts, updates and deletes are via procs.

...

And, yes, it's faster. Particularly if business logic decisions have to
be made in context with a transaction.

I have a thread I started ages ago over on the PERFORM list that I'm
sadly just now being able to provide some insight on. I'll be replying
on that thread in more detail, but the short of it turns out to be that
at least in this one application, using stored procs for inserts is
slowing down the app considerably. The app does high volume inserts and
updates, about 16 million rows a day. By switching from stored procs to
inline inserts, elapsed time dropped from 2595 seconds to 991 seconds
for a test run.

So the moral of the story is that, as anyone who has worked
professionally for a number of years knows, no magic bullets exist.
General guidelines can be helpful, but each scenario must be
individually investigated.

--
Guy Rouillier

#10Kenneth Downs
ken@secdat.com
In reply to: Guy Rouillier (#9)
Re: Writing most code in Stored Procedures

Guy Rouillier wrote:

Steve Manes wrote:

I'm fairly hardcore about keeping as much business logic as I can in
the database. In fact, I only do SELECTs from the application, and
usually via Views. All inserts, updates and deletes are via procs.

...

And, yes, it's faster. Particularly if business logic decisions have
to be made in context with a transaction.

I have a thread I started ages ago over on the PERFORM list that I'm
sadly just now being able to provide some insight on. I'll be
replying on that thread in more detail, but the short of it turns out
to be that at least in this one application, using stored procs for
inserts is slowing down the app considerably. The app does high
volume inserts and updates, about 16 million rows a day. By switching
from stored procs to inline inserts, elapsed time dropped from 2595
seconds to 991 seconds for a test run.

So the moral of the story is that, as anyone who has worked
professionally for a number of years knows, no magic bullets exist.
General guidelines can be helpful, but each scenario must be
individually investigated.

I would suggest separating the strategy of server-side biz rules from
the implementation method of stored procedures. You can also implement
biz rules as triggers and stick with inline inserts, updates and
deletes, which is what we do in my shop.

That being said, the enforcement of the biz rules has to be done
somewhere, and no matter where it is done it will involve disk reads and
validations. It always *seemed* to me that putting the code on a table
trigger involves the least possible overhead for doing these things. It
provides complete real-time enforcement within the transaction with no
network overhead, and has the simplest possible interface: plain old SQL.

--
Kenneth Downs
Secure Data Software, Inc.
www.secdat.com www.andromeda-project.org
631-689-7200 Fax: 631-689-0527
cell: 631-379-0010

#11Steve Manes
smanes@magpie.com
In reply to: Guy Rouillier (#9)
Re: Writing most code in Stored Procedures

Guy Rouillier wrote:

I have a thread I started ages ago over on the PERFORM list that I'm
sadly just now being able to provide some insight on. I'll be replying
on that thread in more detail, but the short of it turns out to be that
at least in this one application, using stored procs for inserts is
slowing down the app considerably. The app does high volume inserts and
updates, about 16 million rows a day. By switching from stored procs to
inline inserts, elapsed time dropped from 2595 seconds to 991 seconds
for a test run.

So the moral of the story is that, as anyone who has worked
professionally for a number of years knows, no magic bullets exist.
General guidelines can be helpful, but each scenario must be
individually investigated.

Absolutely. You can't assume that every application is going to fit
neatly into the same development jig. Hope the MVC frameworks zealots
don't read that. <g>

I worked on a fairly high volume site using PG, an ad trafficking
workflow application, which imported ~2 million placements daily from
DoubleClick, OAS and Accipiter. Everything had to be imported and the
reports run and cached by 8am so the clients stare blankly at 10,000 row
Excel charts over their morning coffee.

Moving all the application-bound inserts into stored procedures didn't
achieve nearly the performance enhancement I'd assumed I'd get, which I
figured was due to the overhead of the procs themselves.

#12Ron Johnson
ron.l.johnson@cox.net
In reply to: Steve Manes (#11)
Re: Writing most code in Stored Procedures

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 08/17/07 18:00, Steve Manes wrote:

Guy Rouillier wrote:

I have a thread I started ages ago over on the PERFORM list that I'm
sadly just now being able to provide some insight on. I'll be
replying on that thread in more detail, but the short of it turns out
to be that at least in this one application, using stored procs for
inserts is slowing down the app considerably. The app does high
volume inserts and updates, about 16 million rows a day. By switching
from stored procs to inline inserts, elapsed time dropped from 2595
seconds to 991 seconds for a test run.

So the moral of the story is that, as anyone who has worked
professionally for a number of years knows, no magic bullets exist.
General guidelines can be helpful, but each scenario must be
individually investigated.

Absolutely. You can't assume that every application is going to fit
neatly into the same development jig. Hope the MVC frameworks zealots
don't read that. <g>

I worked on a fairly high volume site using PG, an ad trafficking
workflow application, which imported ~2 million placements daily from
DoubleClick, OAS and Accipiter. Everything had to be imported and the
reports run and cached by 8am so the clients stare blankly at 10,000 row
Excel charts over their morning coffee.

Moving all the application-bound inserts into stored procedures didn't
achieve nearly the performance enhancement I'd assumed I'd get, which I
figured was due to the overhead of the procs themselves.

Would that be because the original app was written in a compiled
language, but the SPs in an interpreted language?

- --
Ron Johnson, Jr.
Jefferson LA USA

Give a man a fish, and he eats for a day.
Hit him with a fish, and he goes away for good!

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)

iD8DBQFGxjK1S9HxQb37XmcRAnYWAKCRV+INrpvl83lhnt4iadIMrBNIRgCgr8J2
UK3F87ji/24mrISLl+WmLnY=
=5csM
-----END PGP SIGNATURE-----

#13Steve Manes
smanes@magpie.com
In reply to: Ron Johnson (#12)
Re: Writing most code in Stored Procedures

Ron Johnson wrote:

Moving all the application-bound inserts into stored procedures didn't
achieve nearly the performance enhancement I'd assumed I'd get, which I
figured was due to the overhead of the procs themselves.

Would that be because the original app was written in a compiled
language, but the SPs in an interpreted language?

No, because the application language was Perl5 for both. I think it was
just the overhead of 2 million inserts via procs versus 2 million inline
inserts (without the proc overhead).

#14Guy Rouillier
guyr-ml1@burntmail.com
In reply to: Steve Manes (#11)
Re: Writing most code in Stored Procedures

Steve Manes wrote:

Moving all the application-bound inserts into stored procedures didn't
achieve nearly the performance enhancement I'd assumed I'd get, which I
figured was due to the overhead of the procs themselves.

That's the conclusion I'm coming to as well for my app with very high
insert/update rates. Is there a PG knowledgebase where this kind of
information might be useful?

--
Guy Rouillier

#15Ron Johnson
ron.l.johnson@cox.net
In reply to: Steve Manes (#13)
Re: Writing most code in Stored Procedures

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 08/17/07 21:45, Steve Manes wrote:

Ron Johnson wrote:

Moving all the application-bound inserts into stored procedures didn't
achieve nearly the performance enhancement I'd assumed I'd get, which I
figured was due to the overhead of the procs themselves.

Would that be because the original app was written in a compiled
language, but the SPs in an interpreted language?

No, because the application language was Perl5 for both. I think it was
just the overhead of 2 million inserts via procs versus 2 million inline
inserts (without the proc overhead).

Interesting. Does PG have to initiate the Perl interpreter every
time you call a Perl-written SP?

- --
Ron Johnson, Jr.
Jefferson LA USA

Give a man a fish, and he eats for a day.
Hit him with a fish, and he goes away for good!

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)

iD8DBQFGxwiAS9HxQb37XmcRAgi2AJ9Yq2drImecZVTbZR0Wo4VKlpaiXwCgpjHo
8KcDWqDpW6BWNWCj+ZUQFU4=
=Qlpg
-----END PGP SIGNATURE-----

#16Joshua Tolley
eggyknap@gmail.com
In reply to: Ron Johnson (#15)
Re: Writing most code in Stored Procedures

On 8/18/07, Ron Johnson <ron.l.johnson@cox.net> wrote:

Interesting. Does PG have to initiate the Perl interpreter every
time you call a Perl-written SP?

IIRC PostgreSQL should only load the perl interpreter once per session.

- Josh

#17Joshua D. Drake
jd@commandprompt.com
In reply to: Joshua Tolley (#16)
Re: Writing most code in Stored Procedures

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Josh Tolley wrote:

On 8/18/07, Ron Johnson <ron.l.johnson@cox.net> wrote:

Interesting. Does PG have to initiate the Perl interpreter every
time you call a Perl-written SP?

IIRC PostgreSQL should only load the perl interpreter once per session.

Right.

Joshua D. Drake

- Josh

---------------------------(end of broadcast)---------------------------
TIP 1: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to majordomo@postgresql.org so that your
message can get through to the mailing list cleanly

- --

=== The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564 24x7/Emergency: +1.800.492.2240
PostgreSQL solutions since 1997 http://www.commandprompt.com/
UNIQUE NOT NULL
Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate
PostgreSQL Replication: http://www.commandprompt.com/products/

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFGxxmFATb/zqfZUUQRAoL+AJ0cyZ3FaB82PGbGvC/fPYWPo9UcjwCgjiDo
U5XCNKFZhhdiJ3Lf5x68lKQ=
=4HMG
-----END PGP SIGNATURE-----

#18Ron Johnson
ron.l.johnson@cox.net
In reply to: Joshua D. Drake (#17)
Re: Writing most code in Stored Procedures

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 08/18/07 11:08, Joshua D. Drake wrote:

Josh Tolley wrote:

On 8/18/07, Ron Johnson <ron.l.johnson@cox.net> wrote:

Interesting. Does PG have to initiate the Perl interpreter every
time you call a Perl-written SP?

IIRC PostgreSQL should only load the perl interpreter once per session.

Right.

I'd have been stunned if you did it any other way.

So why is Perl-SP-INSERT so much slower than Perl-SQL-INSERT?

(I can imagine that the SP code path would be longer, but since IO
is the slowest part of the system, I'm surprised that it's *that*
much slower.)

- --
Ron Johnson, Jr.
Jefferson LA USA

Give a man a fish, and he eats for a day.
Hit him with a fish, and he goes away for good!

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)

iD8DBQFGxx5jS9HxQb37XmcRAn1vAKCEWTovgcj/w/uFVK0nankGdbOFuACg6HSq
dUnMN0dPsdQ8NKSDW0EahcU=
=P8gN
-----END PGP SIGNATURE-----

#19Joshua D. Drake
jd@commandprompt.com
In reply to: Ron Johnson (#18)
Re: Writing most code in Stored Procedures

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Ron Johnson wrote:

On 08/18/07 11:08, Joshua D. Drake wrote:

Josh Tolley wrote:

On 8/18/07, Ron Johnson <ron.l.johnson@cox.net> wrote:

Interesting. Does PG have to initiate the Perl interpreter every
time you call a Perl-written SP?

IIRC PostgreSQL should only load the perl interpreter once per session.

Right.

I'd have been stunned if you did it any other way.

So why is Perl-SP-INSERT so much slower than Perl-SQL-INSERT?

(I can imagine that the SP code path would be longer, but since IO
is the slowest part of the system, I'm surprised that it's *that*
much slower.)

We would have to see his script as there are several variables at hand here.

1. Is he using do or execute with his perl-sql? If he is using execute
then he is getting the advantage of prepared query inserts.

2. Did he use a single transaction? If so then he is not paying for 2
million commits, which he would be with the procedure.

Sincerely,

Joshua D. Drake

- ---------------------------(end of broadcast)---------------------------
TIP 3: Have you checked our extensive FAQ?

http://www.postgresql.org/docs/faq

- --

=== The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564 24x7/Emergency: +1.800.492.2240
PostgreSQL solutions since 1997 http://www.commandprompt.com/
UNIQUE NOT NULL
Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate
PostgreSQL Replication: http://www.commandprompt.com/products/

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFGxzFYATb/zqfZUUQRAvUWAKCHm4EOtmrblRcnoVLVSEA0/Hd1JQCfTbgu
4tRGmCvqZSuBsGvQyUOQBzU=
=edfC
-----END PGP SIGNATURE-----

#20Steve Manes
smanes@magpie.com
In reply to: Ron Johnson (#15)
Re: Writing most code in Stored Procedures

Ron Johnson wrote:

Interesting. Does PG have to initiate the Perl interpreter every
time you call a Perl-written SP?

I mean the *application* language was Perl for both the inline insert
and the proc call. The proc was written in plpgsql.

#21Guy Rouillier
guyr-ml1@burntmail.com
In reply to: Ron Johnson (#18)