Do all Postgres queries touch Shared_Buffers at some point?

Started by Shiv Sharmaover 12 years ago3 messagesgeneral
Jump to latest
#1Shiv Sharma
shiv.sharma.1835@gmail.com

I am puzzled about the extent to which shared_bufferes is used for
different queries. Do _all_ queries "touch" shared buffers at some point
of their execution?

Many of our warehouse queries are seq_scan followed by HASH. I know
work_mem is assigned for HASH joins: but does this mean that these queries
never touch shared buffers at *all* during their execution? Perhaps they
are read into shared_buffers and then passed into work_mem HASH areas???

What about updates on big tables? What about inserts on big tables? What
about append-inserts?

I think I could get these answers from Explain Analyze Buffers but I am on
8.2 :-(

Please tell me which queries use/touch shared_buffers in general terms, or
please point me to documentation.

Shiv

#2Michael Paquier
michael@paquier.xyz
In reply to: Shiv Sharma (#1)
Re: Do all Postgres queries touch Shared_Buffers at some point?

On Sun, Dec 29, 2013 at 9:05 PM, Shiv Sharma <shiv.sharma.1835@gmail.com> wrote:

I am puzzled about the extent to which shared_bufferes is used for different
queries. Do _all_ queries "touch" shared buffers at some point of their
execution?

Many of our warehouse queries are seq_scan followed by HASH. I know
work_mem is assigned for HASH joins: but does this mean that these queries
never touch shared buffers at *all* during their execution? Perhaps they
are read into shared_buffers and then passed into work_mem HASH areas???

What about updates on big tables? What about inserts on big tables? What
about append-inserts?

I think I could get these answers from Explain Analyze Buffers but I am on
8.2 :-(

Please tell me which queries use/touch shared_buffers in general terms, or
please point me to documentation.

All your queries that interacts with relations.

shared_buffers is used for data caching across all the backends of the
server, to put it simply pages of the relation involved. Such data can
be relation data, like data of a table you defined yourself, index
data, or some system catalog data, containing definitions of the
database objects. So simply everything that is a relation and contains
physical data might be in shared buffers. Views for example do not
enter in this category.

You could for example use pg_buffercache to have a look at what
contains the shared buffers:
http://www.postgresql.org/docs/devel/static/pgbuffercache.html

Regards,
--
Michael

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

#3Shiv Sharma
shiv.sharma.1835@gmail.com
In reply to: Michael Paquier (#2)
Re: Do all Postgres queries touch Shared_Buffers at some point?

Thanks. We are on Greenplum GP 4.2 (Postgres 8.2). As per GP suggestions,
we have 6 primary/6 mirror instances on each server. The server has 64 G
RAM, and shared_buffers is at ...125 MB :-). I suppose the idea is for the
OS buffer cache to do the legwork.

But still...performance is at least "not bad". If all HASH JOIN queries
touch shared_buffers in some way, I find it non-intuitive that we can have
concurrent hash queries involving big tables (100M+ joined with say 100K),
all apparently using the 125MB shared_buffers in some way, and yet giving
reasonable performance.

Basically what is the anatomy of a hash join involving large tables?
Disk->Shared_Buffers->Hash Join areas? something like that?

On Sun, Dec 29, 2013 at 9:18 AM, Michael Paquier
<michael.paquier@gmail.com>wrote:

Show quoted text

On Sun, Dec 29, 2013 at 9:05 PM, Shiv Sharma <shiv.sharma.1835@gmail.com>
wrote:

I am puzzled about the extent to which shared_bufferes is used for

different

queries. Do _all_ queries "touch" shared buffers at some point of their
execution?

Many of our warehouse queries are seq_scan followed by HASH. I know
work_mem is assigned for HASH joins: but does this mean that these

queries

never touch shared buffers at *all* during their execution? Perhaps they
are read into shared_buffers and then passed into work_mem HASH areas???

What about updates on big tables? What about inserts on big tables? What
about append-inserts?

I think I could get these answers from Explain Analyze Buffers but I am

on

8.2 :-(

Please tell me which queries use/touch shared_buffers in general terms,

or

please point me to documentation.

All your queries that interacts with relations.

shared_buffers is used for data caching across all the backends of the
server, to put it simply pages of the relation involved. Such data can
be relation data, like data of a table you defined yourself, index
data, or some system catalog data, containing definitions of the
database objects. So simply everything that is a relation and contains
physical data might be in shared buffers. Views for example do not
enter in this category.

You could for example use pg_buffercache to have a look at what
contains the shared buffers:
http://www.postgresql.org/docs/devel/static/pgbuffercache.html

Regards,
--
Michael