views, queries, and locks

Started by Jon Nelsonabout 14 years ago22 messagesgeneral
Jump to latest
#1Jon Nelson
jnelson+pgsql@jamponi.net

I have a situation that I'd like some help resolving.
Using PostgreSQL 8.4.<reasonably recent> on Linux, I have three things
coming together that cause me pain. I have a VIEW used by a bunch of
queries. Usually, these queries are fairly short (subsecond) but
sometimes they can be very long (days). I also update this view with
CREATE OR REPLACE VIEW every 15-30 minutes. What I observe, sometimes,
is this:

1. query A comes in. It's a big one.
2. another process comes along, needs to update the view definition.
It issues create or replace view. It blocks on [1].
3. queries B through N come in, are blocked by [2], which is blocked by [1].
4. pandemonium!

I can reduce (some) but not eliminate the need to update the view
multiple times a day. What might be some good ways to prevent queries
B through N blocking?

Addendum: I can work around the issue by timing out and failing the
CREATE OR REPLACE VIEW (by canceling the query) after a short
duration, but is there a better way?

--
Jon

#2Merlin Moncure
mmoncure@gmail.com
In reply to: Jon Nelson (#1)
Re: views, queries, and locks

On Tue, Apr 3, 2012 at 12:01 PM, Jon Nelson <jnelson+pgsql@jamponi.net> wrote:

I have a situation that I'd like some help resolving.
Using PostgreSQL 8.4.<reasonably recent> on Linux, I have three things
coming together that cause me pain. I have a VIEW used by a bunch of
queries. Usually, these queries are fairly short (subsecond) but
sometimes they can be very long (days). I also update this view with
CREATE OR REPLACE VIEW every 15-30 minutes. What I observe, sometimes,
is this:

1. query A comes in. It's a big one.
2. another process comes along, needs to update the view definition.
It issues create or replace view. It blocks on [1].
3. queries B through N come in, are blocked by [2], which is blocked by [1].
4. pandemonium!

I can reduce (some) but not eliminate the need to update the view
multiple times a day. What might be some good ways to prevent queries
B through N blocking?

Addendum: I can work around the issue by timing out and failing the
CREATE OR REPLACE VIEW (by canceling the query) after a short
duration, but is there a better way?

Yeah -- this is just asking for trouble. Why do you have to replace
the view every 30 minutes? Your solution is probably going to involve
not doing that.

merlin

#3Jon Nelson
jnelson+pgsql@jamponi.net
In reply to: Merlin Moncure (#2)
Re: views, queries, and locks

On Tue, Apr 3, 2012 at 12:16 PM, Merlin Moncure <mmoncure@gmail.com> wrote:

On Tue, Apr 3, 2012 at 12:01 PM, Jon Nelson <jnelson+pgsql@jamponi.net> wrote:

I have a situation that I'd like some help resolving.
Using PostgreSQL 8.4.<reasonably recent> on Linux, I have three things
coming together that cause me pain. I have a VIEW used by a bunch of
queries. Usually, these queries are fairly short (subsecond) but
sometimes they can be very long (days). I also update this view with
CREATE OR REPLACE VIEW every 15-30 minutes. What I observe, sometimes,
is this:

1. query A comes in. It's a big one.
2. another process comes along, needs to update the view definition.
It issues create or replace view. It blocks on [1].
3. queries B through N come in, are blocked by [2], which is blocked by [1].
4. pandemonium!

I can reduce (some) but not eliminate the need to update the view
multiple times a day. What might be some good ways to prevent queries
B through N blocking?

Addendum: I can work around the issue by timing out and failing the
CREATE OR REPLACE VIEW (by canceling the query) after a short
duration, but is there a better way?

Yeah -- this is just asking for trouble.  Why do you have to replace
the view every 30 minutes?  Your solution is probably going to involve
not doing that.

As I said, I can reduce the frequency, but not eliminate it. I'm
curious - if views are little more than sql macros, by the time the
query has begun to execute why is a lock still needed on the view
definition?

--
Jon

#4Thomas Kellerer
spam_eater@gmx.net
In reply to: Jon Nelson (#1)
Re: views, queries, and locks

Jon Nelson wrote on 03.04.2012 19:01:

I also update this view with CREATE OR REPLACE VIEW every 15-30 minutes

That is a highly questionable approach.

What real problem are you trying to solve with that?
Maybe there is a better solution that does not require changing the view.

#5Merlin Moncure
mmoncure@gmail.com
In reply to: Jon Nelson (#3)
Re: views, queries, and locks

On Tue, Apr 3, 2012 at 12:30 PM, Jon Nelson <jnelson+pgsql@jamponi.net> wrote:

On Tue, Apr 3, 2012 at 12:16 PM, Merlin Moncure <mmoncure@gmail.com> wrote:

On Tue, Apr 3, 2012 at 12:01 PM, Jon Nelson <jnelson+pgsql@jamponi.net> wrote:

I have a situation that I'd like some help resolving.
Using PostgreSQL 8.4.<reasonably recent> on Linux, I have three things
coming together that cause me pain. I have a VIEW used by a bunch of
queries. Usually, these queries are fairly short (subsecond) but
sometimes they can be very long (days). I also update this view with
CREATE OR REPLACE VIEW every 15-30 minutes. What I observe, sometimes,
is this:

1. query A comes in. It's a big one.
2. another process comes along, needs to update the view definition.
It issues create or replace view. It blocks on [1].
3. queries B through N come in, are blocked by [2], which is blocked by [1].
4. pandemonium!

I can reduce (some) but not eliminate the need to update the view
multiple times a day. What might be some good ways to prevent queries
B through N blocking?

Addendum: I can work around the issue by timing out and failing the
CREATE OR REPLACE VIEW (by canceling the query) after a short
duration, but is there a better way?

Yeah -- this is just asking for trouble.  Why do you have to replace
the view every 30 minutes?  Your solution is probably going to involve
not doing that.

As I said, I can reduce the frequency, but not eliminate it. I'm
curious - if views are little more than sql macros, by the time the
query has begun to execute why is a lock still needed on the view
definition?

Generally speaking, in SQL, locks are held until the transaction
commits; there are tons of reasons why things have to work that way.
Anyways, I'm betting your requirement to have to re-CREATE the view
can be abstracted out somehow. I'm guessing you have some type of
table rotation going on?

merlin

#6Jon Nelson
jnelson+pgsql@jamponi.net
In reply to: Merlin Moncure (#5)
Re: views, queries, and locks

On Tue, Apr 3, 2012 at 1:36 PM, Merlin Moncure <mmoncure@gmail.com> wrote:

On Tue, Apr 3, 2012 at 12:30 PM, Jon Nelson <jnelson+pgsql@jamponi.net> wrote:

On Tue, Apr 3, 2012 at 12:16 PM, Merlin Moncure <mmoncure@gmail.com> wrote:

On Tue, Apr 3, 2012 at 12:01 PM, Jon Nelson <jnelson+pgsql@jamponi.net> wrote:

I have a situation that I'd like some help resolving.
Using PostgreSQL 8.4.<reasonably recent> on Linux, I have three things
coming together that cause me pain. I have a VIEW used by a bunch of
queries. Usually, these queries are fairly short (subsecond) but
sometimes they can be very long (days). I also update this view with
CREATE OR REPLACE VIEW every 15-30 minutes. What I observe, sometimes,
is this:

1. query A comes in. It's a big one.
2. another process comes along, needs to update the view definition.
It issues create or replace view. It blocks on [1].
3. queries B through N come in, are blocked by [2], which is blocked by [1].
4. pandemonium!

I can reduce (some) but not eliminate the need to update the view
multiple times a day. What might be some good ways to prevent queries
B through N blocking?

Addendum: I can work around the issue by timing out and failing the
CREATE OR REPLACE VIEW (by canceling the query) after a short
duration, but is there a better way?

Yeah -- this is just asking for trouble.  Why do you have to replace
the view every 30 minutes?  Your solution is probably going to involve
not doing that.

As I said, I can reduce the frequency, but not eliminate it. I'm
curious - if views are little more than sql macros, by the time the
query has begun to execute why is a lock still needed on the view
definition?

Generally speaking, in SQL, locks are held until the transaction
commits; there are tons of reasons why things have to work that way.
Anyways, I'm betting your requirement to have to re-CREATE the view
can be abstracted out somehow.  I'm guessing you have some type of
table rotation going on?

Close, but not quite. It's not rotation but every N minutes a
newly-built table appears. I'd like that table to appear as part of
the view as soon as possible.

Regarding locks: I know that's how locks usually work - but what I'm
asking is if holding on to the lock for the view, once the view has
been 'expanded', is necessary at all.

--
Jon

#7Merlin Moncure
mmoncure@gmail.com
In reply to: Jon Nelson (#6)
Re: views, queries, and locks

On Tue, Apr 3, 2012 at 1:41 PM, Jon Nelson <jnelson+pgsql@jamponi.net> wrote:

On Tue, Apr 3, 2012 at 1:36 PM, Merlin Moncure <mmoncure@gmail.com> wrote:

Generally speaking, in SQL, locks are held until the transaction
commits; there are tons of reasons why things have to work that way.
Anyways, I'm betting your requirement to have to re-CREATE the view
can be abstracted out somehow.  I'm guessing you have some type of
table rotation going on?

Close, but not quite. It's not rotation but every N minutes a
newly-built table appears. I'd like that table to appear as part of
the view as soon as possible.

How sophisticated are the queries that are touching this view? How
much data in the tables? If you don't need to push quals down into
the view, we can do a view wrapping function that can pick up the new
tables.

Regarding locks: I know that's how locks usually work - but what I'm
asking is if holding on to the lock for the view, once the view has
been 'expanded', is necessary at all.

Unfortunately, it is.

merlin

#8Thomas Kellerer
spam_eater@gmx.net
In reply to: Jon Nelson (#6)
Re: views, queries, and locks

Jon Nelson wrote on 03.04.2012 20:41:

Close, but not quite. It's not rotation but every N minutes a
newly-built table appears. I'd like that table to appear as part of
the view as soon as possible.

Can't you use table inheritance for that?

#9Scott Marlowe
scott.marlowe@gmail.com
In reply to: Thomas Kellerer (#8)
Re: views, queries, and locks

On Tue, Apr 3, 2012 at 1:45 PM, Thomas Kellerer <spam_eater@gmx.net> wrote:

Jon Nelson wrote on 03.04.2012 20:41:

Close, but not quite. It's not rotation but every N minutes a
newly-built table appears. I'd like that table to appear as part of
the view as soon as possible.

Can't you use table inheritance for that?

It could well be that inherited tables are or at least were slower
than a view of individual tables.

#10Jon Nelson
jnelson+pgsql@jamponi.net
In reply to: Thomas Kellerer (#8)
Re: views, queries, and locks

On Tue, Apr 3, 2012 at 2:45 PM, Thomas Kellerer <spam_eater@gmx.net> wrote:

Jon Nelson wrote on 03.04.2012 20:41:

Close, but not quite. It's not rotation but every N minutes a
newly-built table appears. I'd like that table to appear as part of
the view as soon as possible.

Can't you use table inheritance for that?

Not efficiently. My view looks like this:

select <bunch of stuff from table A>, DATE 'date string here' as
some_date_column
UNION ALL
select <bunch of stuff from table B>, DATE 'date string here' as
some_date_column
....

for lots and lots of tables. Storing that DATE would be cost
prohibitive and inefficient, since the same value would be used
throughout each 'sub' table. This let's me do queries that involve
'some_date_column' and the query optimizer will remove the tables that
don't apply, etc.

--
Jon

#11Scott Marlowe
scott.marlowe@gmail.com
In reply to: Jon Nelson (#10)
Re: views, queries, and locks

On Tue, Apr 3, 2012 at 7:21 PM, Jon Nelson <jnelson+pgsql@jamponi.net> wrote:

On Tue, Apr 3, 2012 at 2:45 PM, Thomas Kellerer <spam_eater@gmx.net> wrote:

Jon Nelson wrote on 03.04.2012 20:41:

Close, but not quite. It's not rotation but every N minutes a
newly-built table appears. I'd like that table to appear as part of
the view as soon as possible.

Can't you use table inheritance for that?

Not efficiently. My view looks like this:

select <bunch of stuff from table A>, DATE 'date string here' as
some_date_column
UNION ALL
select <bunch of stuff from table B>, DATE 'date string here' as
some_date_column
....

for lots and lots of tables. Storing that DATE would be cost
prohibitive and inefficient, since the same value would be used
throughout each 'sub' table.This let's me do queries that involve
'some_date_column' and the query optimizer will remove the tables that
don't apply, etc.

I was thinking it was something like that. Have you thought of using
a pl/pgsql function with a built up and executed query to accomplish
this? That way you'd get both the efficiency of your current method
without having to rebuild views all the time.

#12Jon Nelson
jnelson+pgsql@jamponi.net
In reply to: Scott Marlowe (#11)
Re: views, queries, and locks

On Tue, Apr 3, 2012 at 8:58 PM, Scott Marlowe <scott.marlowe@gmail.com> wrote:

On Tue, Apr 3, 2012 at 7:21 PM, Jon Nelson <jnelson+pgsql@jamponi.net> wrote:

On Tue, Apr 3, 2012 at 2:45 PM, Thomas Kellerer <spam_eater@gmx.net> wrote:

Jon Nelson wrote on 03.04.2012 20:41:

Close, but not quite. It's not rotation but every N minutes a
newly-built table appears. I'd like that table to appear as part of
the view as soon as possible.

Can't you use table inheritance for that?

Not efficiently. My view looks like this:

select <bunch of stuff from table A>, DATE 'date string here' as
some_date_column
UNION ALL
select <bunch of stuff from table B>, DATE 'date string here' as
some_date_column
....

for lots and lots of tables. Storing that DATE would be cost
prohibitive and inefficient, since the same value would be used
throughout each 'sub' table.This let's me do queries that involve
'some_date_column' and the query optimizer will remove the tables that
don't apply, etc.

I was thinking it was something like that.  Have you thought of using
a pl/pgsql function with a built up and executed query to accomplish
this?  That way you'd get both the efficiency of your current method
without having to rebuild views all the time.

I need to have something table-like from the client's perspective for
a bunch of reasons.
For now, assume that I want to keep using the view and that I'd like
to find better ways to address my concerns.

--
Jon

#13Merlin Moncure
mmoncure@gmail.com
In reply to: Jon Nelson (#12)
Re: views, queries, and locks

On Wed, Apr 4, 2012 at 8:50 AM, Jon Nelson <jnelson+pgsql@jamponi.net> wrote:

I need to have something table-like from the client's perspective for
a bunch of reasons.
For now, assume that I want to keep using the view and that I'd like
to find better ways to address my concerns.

hence my question upthread: "how sophisticated are the queries that
touch this, and how much data?" you can wrap the function output in a
view and even do other tricks to parameterize the function through the
view if necessary.

merlin

#14Tom Lane
tgl@sss.pgh.pa.us
In reply to: Jon Nelson (#12)
Re: views, queries, and locks

Jon Nelson <jnelson+pgsql@jamponi.net> writes:

I need to have something table-like from the client's perspective for
a bunch of reasons.
For now, assume that I want to keep using the view and that I'd like
to find better ways to address my concerns.

Why aren't you using a standard partitioned table, cf
http://www.postgresql.org/docs/9.1/static/ddl-partitioning.html

The stuff associated with table inheritance is actually designed for the
use-case of adding and dropping child tables, and we might consider any
problems therein as something to be fixed. Whereas the fact that
changing a view locks it is unlikely to change.

regards, tom lane

#15Jon Nelson
jnelson+pgsql@jamponi.net
In reply to: Tom Lane (#14)
Re: views, queries, and locks

On Wed, Apr 4, 2012 at 9:01 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Why aren't you using a standard partitioned table, cf
http://www.postgresql.org/docs/9.1/static/ddl-partitioning.html

Because I'm adding "scalar" (constant-value) columns to the view like this:
SELECT * from tableA, DATE 'date string here' as date_column
UNION ALL
SELECT * from tableB, DATE 'date string here' as date_column

for hundreds or even thousands of tables.

The stuff associated with table inheritance is actually designed for the
use-case of adding and dropping child tables, and we might consider any
problems therein as something to be fixed.  Whereas the fact that
changing a view locks it is unlikely to change.

I'm not asking for "don't lock views when changing them" I'm asking
"does the lock on the view still have to be held after the query
rewrite takes place" (since views are little more than rules?).

--
Jon

#16Tom Lane
tgl@sss.pgh.pa.us
In reply to: Jon Nelson (#15)
Re: views, queries, and locks

Jon Nelson <jnelson+pgsql@jamponi.net> writes:

On Wed, Apr 4, 2012 at 9:01 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Why aren't you using a standard partitioned table, cf
http://www.postgresql.org/docs/9.1/static/ddl-partitioning.html

Because I'm adding "scalar" (constant-value) columns to the view like this:
SELECT * from tableA, DATE 'date string here' as date_column
UNION ALL
SELECT * from tableB, DATE 'date string here' as date_column

for hundreds or even thousands of tables.

[ yawn... ] Premature micro-optimization is the root of all evil.
The actual advantage to what you are doing is not scanning irrelevant
partitions, which constraint exclusion handles perfectly fine. Not
storing the date column is unlikely to be saving anything meaningful.
(How wide are those table rows, anyway?)

More generally, partitioning "hundreds or even thousands" of ways is
costly overkill. Realistically, do you need to manage your data in
a way that allows you to drop less than perhaps 10% at once? I think
the usefulness threshold is probably a lot closer to 10% than 0.01%.

regards, tom lane

#17Jon Nelson
jnelson+pgsql@jamponi.net
In reply to: Tom Lane (#16)
Re: views, queries, and locks

On Wed, Apr 4, 2012 at 10:43 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Jon Nelson <jnelson+pgsql@jamponi.net> writes:

On Wed, Apr 4, 2012 at 9:01 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Why aren't you using a standard partitioned table, cf
http://www.postgresql.org/docs/9.1/static/ddl-partitioning.html

Because I'm adding "scalar" (constant-value) columns to the view like this:
SELECT * from tableA, DATE 'date string here' as date_column
UNION ALL
SELECT * from tableB, DATE 'date string here' as date_column

for hundreds or even thousands of tables.

[ yawn... ]  Premature micro-optimization is the root of all evil.
The actual advantage to what you are doing is not scanning irrelevant
partitions, which constraint exclusion handles perfectly fine.  Not
storing the date column is unlikely to be saving anything meaningful.
(How wide are those table rows, anyway?)

I agree, generally, however as with a great many things in life, what
it does now what it was designed to do are two different things. Quite
frankly, it's a testament to PostgreSQL that it handles this situation
(which is many times greater than the original design) as well as it
does.

Regarding the storage costs for adding a column:
A quick back-of-the-napkin means the table size increase is roughly 5%.
I'll have to determine if the size tradeoff (+ table inheritance) is
worth it versus using the view.

Thanks for the advice, everyone.

--
Jon

#18Thomas Kellerer
spam_eater@gmx.net
In reply to: Jon Nelson (#12)
Re: views, queries, and locks

Jon Nelson wrote on 04.04.2012 15:50:

I need to have something table-like from the client's perspective for
a bunch of reasons.
For now, assume that I want to keep using the view and that I'd like
to find better ways to address my concerns.

What about a set-returning function that builds the query dynamically and wrapping that into a view?

That way the view would never change and client would still have the perspective of a view/table

Your function could pick up the changes automatically e.g. by looking at information_schema.tables

#19Jon Nelson
jnelson+pgsql@jamponi.net
In reply to: Thomas Kellerer (#18)
Re: views, queries, and locks

On Wed, Apr 4, 2012 at 11:22 AM, Thomas Kellerer <spam_eater@gmx.net> wrote:

Jon Nelson wrote on 04.04.2012 15:50:

I need to have something table-like from the client's perspective for
a bunch of reasons.
For now, assume that I want to keep using the view and that I'd like
to find better ways to address my concerns.

What about a set-returning function that builds the query dynamically and
wrapping that into a view?

That way the view would never change and client would still have the
perspective of a view/table

Your function could pick up the changes automatically e.g. by looking at
information_schema.tables

That sounds pretty cool, but that is beyond my understanding.
I can write a set-returning function easily enough (call it
build_the_huge_table for "clarity"), but the "wrap that into a view"
part throws me.
Could you give me an example?

--
Jon

#20Merlin Moncure
mmoncure@gmail.com
In reply to: Jon Nelson (#19)
Re: views, queries, and locks

On Wed, Apr 4, 2012 at 12:47 PM, Jon Nelson <jnelson+pgsql@jamponi.net> wrote:

On Wed, Apr 4, 2012 at 11:22 AM, Thomas Kellerer <spam_eater@gmx.net> wrote:

Jon Nelson wrote on 04.04.2012 15:50:

I need to have something table-like from the client's perspective for
a bunch of reasons.
For now, assume that I want to keep using the view and that I'd like
to find better ways to address my concerns.

What about a set-returning function that builds the query dynamically and
wrapping that into a view?

That way the view would never change and client would still have the
perspective of a view/table

Your function could pick up the changes automatically e.g. by looking at
information_schema.tables

That sounds pretty cool, but that is beyond my understanding.
I can write a set-returning function easily enough (call it
build_the_huge_table for "clarity"), but the  "wrap that into a view"
part throws me.
Could you give me an example?

it's trivial:

create view stuff as select * from function();

it's less trivial if you need to pass arguments to the function, but
can be doable depending.

merlin

#21Thomas Kellerer
spam_eater@gmx.net
In reply to: Jon Nelson (#19)
#22Jon Nelson
jnelson+pgsql@jamponi.net
In reply to: Thomas Kellerer (#21)