How many Mandatory Process are there in Postgres

Started by Tadipathri Raghuabout 16 years ago10 messagesgeneral

traghu.dba@gmail.com

about 16 years ago

Hi All,

When we start the postgres server, the writer process, wal process,
postmaster, autovacuum ( if autovacuum is on), stats collector will come
into picture as mandotory process. My question is, is there any processes
apart from these process, what are the mandotory process come along with the
postgres server and how many sleeping processes are there.

Few important question about Postgres Architecture
======================================
1. When does temp_buffer comes into existence in a database session ( like
when shared_buffer completely filled or any wal_buffer filled) ?
2. What is process array in shared memory?
3. maintenance_work_mem is used for vacuuming(does this memory allocated if
autovacuum is off)?
4. As oracle, Postgres doesnt have any undo_tablespace, for rollback
transaction. But it is handled in BEGIN /END block. So question is where the
transaction data is stored.

Thanks in Advance

Regards
Raghavendra

Tomas Vondra

tomas.vondra@2ndquadrant.com

about 16 years ago

In reply to: Tadipathri Raghu (#1)

Re: How many Mandatory Process are there in Postgres

Hi All,

When we start the postgres server, the writer process, wal process,
postmaster, autovacuum ( if autovacuum is on), stats collector will come
into picture as mandotory process. My question is, is there any processes
apart from these process, what are the mandotory process come along with
the
postgres server and how many sleeping processes are there.

Well, there is a bunch of processes started at the beginning, and then
there is one backend process for each connection (see the postgresql.conf
how many connections are allowed in your case).

Few important question about Postgres Architecture
======================================
1. When does temp_buffer comes into existence in a database session ( like
when shared_buffer completely filled or any wal_buffer filled) ?

AFAIK Those are completely different buffers I.

Shared buffers are used as a "shared cache" for all the backends /
sessions. Wal_buffers are used when writing data to the write-ahead-log
(transaction log) and are shared by all backends just like the shared
buffers.

Temp_buffers on the other hand are used to access temporary tables - so
they serve s a completely different purpose and are session-specific. Each
session may consume the given amount of memory.

2. What is process array in shared memory?

No, idea (not a PostgreSQL hacker) but I'd guess it has something to do
with the backends (list of backends).

3. maintenance_work_mem is used for vacuuming(does this memory allocated
if
autovacuum is off)?

Yes, it's used for vacuuming (and many other things related to
maintenance). AFAIK the amount of memory is 'upper limit' and does not
mean the autovacuum will consume that.

Anyway disabling the autovacuum is a bad idea, just as using not enough
memory.

4. As oracle, Postgres doesnt have any undo_tablespace, for rollback
transaction. But it is handled in BEGIN /END block. So question is where
the
transaction data is stored.

Oracle uses a completely different implementation of MVCC architecture. It
overwrites the data and then uses rollback segments to provide 'previous
versions' to running transactions etc.

PostgreSQL does not overwrite the data - it just creates a copy of the row
and then decides which version should each session see (depending on the
transaction IDs etc.). So it does not need to do rollbacks the way Oracle
does, but it has to remove stale copies of the rows (such that no running
transaction can see) - that's why there is VACUUM.

Regards
Tomas

Tadipathri Raghu

traghu.dba@gmail.com

about 16 years ago

In reply to: Tomas Vondra (#2)

Re: How many Mandatory Process are there in Postgres

Hi Tomas,

Thank you for the reply.

Well, there is a bunch of processes started at the beginning, and then
there is one backend process for each connection (see the postgresql.conf
how many connections are allowed in your case).

I do agree with you, that there would be bunch of process. Can you describe
any?

Temp_buffers on the other hand are used to access temporary tables - so
they serve s a completely different purpose and are session-specific. Each
session may consume the given amount of memory.

So, what happen when the shared_buffer is filled completely, where it does
its operation.

Yes, it's used for vacuuming (and many other things related to
maintenance). AFAIK the amount of memory is 'upper limit' and does not
mean the autovacuum will consume that.

Anyway disabling the autovacuum is a bad idea, just as using not enough
memory.

My question is does maintenance_work_mem is occupied or allocated, even the
autovacuum is off.

Oracle uses a completely different implementation of MVCC architecture. It
overwrites the data and then uses rollback segments to provide 'previous
versions' to running transactions etc.

PostgreSQL does not overwrite the data - it just creates a copy of the row
and then decides which version should each session see (depending on the
transaction IDs etc.). So it does not need to do rollbacks the way Oracle
does, but it has to remove stale copies of the rows (such that no running
transaction can see) - that's why there is VACUUM.

Here, if you have issued a command pg_start_backup() at that time the
cluster is freezed, and if any transaction takes place before the
pg_stop_backup() issued at that time where the transaction data will be kept
if the undo's are not there.

Regards
Raghav

2010/3/25 <tv@fuzzy.cz>

Show quoted text

Hi All,

When we start the postgres server, the writer process, wal process,
postmaster, autovacuum ( if autovacuum is on), stats collector will come
into picture as mandotory process. My question is, is there any processes
apart from these process, what are the mandotory process come along with
the
postgres server and how many sleeping processes are there.

Well, there is a bunch of processes started at the beginning, and then
there is one backend process for each connection (see the postgresql.conf
how many connections are allowed in your case).

Few important question about Postgres Architecture
======================================
1. When does temp_buffer comes into existence in a database session (

like

when shared_buffer completely filled or any wal_buffer filled) ?

AFAIK Those are completely different buffers I.

Shared buffers are used as a "shared cache" for all the backends /
sessions. Wal_buffers are used when writing data to the write-ahead-log
(transaction log) and are shared by all backends just like the shared
buffers.

Temp_buffers on the other hand are used to access temporary tables - so
they serve s a completely different purpose and are session-specific. Each
session may consume the given amount of memory.

2. What is process array in shared memory?

No, idea (not a PostgreSQL hacker) but I'd guess it has something to do
with the backends (list of backends).

3. maintenance_work_mem is used for vacuuming(does this memory allocated
if
autovacuum is off)?

Yes, it's used for vacuuming (and many other things related to
maintenance). AFAIK the amount of memory is 'upper limit' and does not
mean the autovacuum will consume that.

Anyway disabling the autovacuum is a bad idea, just as using not enough
memory.

4. As oracle, Postgres doesnt have any undo_tablespace, for rollback
transaction. But it is handled in BEGIN /END block. So question is where
the
transaction data is stored.

Oracle uses a completely different implementation of MVCC architecture. It
overwrites the data and then uses rollback segments to provide 'previous
versions' to running transactions etc.

PostgreSQL does not overwrite the data - it just creates a copy of the row
and then decides which version should each session see (depending on the
transaction IDs etc.). So it does not need to do rollbacks the way Oracle
does, but it has to remove stale copies of the rows (such that no running
transaction can see) - that's why there is VACUUM.

Regards
Tomas

Tomas Vondra

tomas.vondra@2ndquadrant.com

about 16 years ago

In reply to: Tadipathri Raghu (#3)

Re: How many Mandatory Process are there in Postgres

Hi Tomas,

Thank you for the reply.

Well, there is a bunch of processes started at the beginning, and then
there is one backend process for each connection (see the
postgresql.conf
how many connections are allowed in your case).

I do agree with you, that there would be bunch of process. Can you
describe
any?

The names of the processes are quite descriptive I guess - just start the
database and use 'ps ax' command. And then use the names to search the
docs, there's a plenty of details available there.

Temp_buffers on the other hand are used to access temporary tables - so
they serve s a completely different purpose and are session-specific.
Each
session may consume the given amount of memory.

So, what happen when the shared_buffer is filled completely, where it does
its operation.

See this:
http://wiki.postgresql.org/wiki/User:Gsmith#How_is_the_shared_buffer_cache_organized

Shared buffers is a LRU cache, i.e. when it's full and a new buffer is
requested, some of the 'oldest' buffers are removed.

Yes, it's used for vacuuming (and many other things related to
maintenance). AFAIK the amount of memory is 'upper limit' and does not
mean the autovacuum will consume that.

Anyway disabling the autovacuum is a bad idea, just as using not enough
memory.

My question is does maintenance_work_mem is occupied or allocated, even
the
autovacuum is off.

No. As I wrote the maintenance_work_mem is the maximum amount of memory
the vacuum may consume, so if it is not running the memory is not
allocated. BTW you could find this out simply by starting the server with
enabled / disabled autovacuum and comparing the amount of memory consumed.

Oracle uses a completely different implementation of MVCC architecture.
It
overwrites the data and then uses rollback segments to provide 'previous
versions' to running transactions etc.

PostgreSQL does not overwrite the data - it just creates a copy of the
row
and then decides which version should each session see (depending on the
transaction IDs etc.). So it does not need to do rollbacks the way
Oracle
does, but it has to remove stale copies of the rows (such that no
running
transaction can see) - that's why there is VACUUM.

Here, if you have issued a command pg_start_backup() at that time the
cluster is freezed, and if any transaction takes place before the
pg_stop_backup() issued at that time where the transaction data will be
kept
if the undo's are not there.

What do you mean by 'freezed'? The cluster operates normally, the
pg_start_backup() just creates a backup label (and performs a checkpoint),
but that's not a problem. OK, there could be a performance decrease
because of full page writes, but the data will be processed as if there is
no backup running. PostgreSQL does not need the checkpoints to perform
backup.

Tomas

Scott Mead

scott.lists@enterprisedb.com

about 16 years ago

In reply to: Tomas Vondra (#4)

Re: How many Mandatory Process are there in Postgres

2010/3/25 <tv@fuzzy.cz>

<Snip>

Oracle uses a completely different implementation of MVCC architecture.
It
overwrites the data and then uses rollback segments to provide 'previous
versions' to running transactions etc.

PostgreSQL does not overwrite the data - it just creates a copy of the
row
and then decides which version should each session see (depending on the
transaction IDs etc.). So it does not need to do rollbacks the way
Oracle
does, but it has to remove stale copies of the rows (such that no
running
transaction can see) - that's why there is VACUUM.

Here, if you have issued a command pg_start_backup() at that time the
cluster is freezed, and if any transaction takes place before the
pg_stop_backup() issued at that time where the transaction data will be
kept
if the undo's are not there.

What do you mean by 'freezed'? The cluster operates normally, the
pg_start_backup() just creates a backup label (and performs a checkpoint),
but that's not a problem. OK, there could be a performance decrease
because of full page writes, but the data will be processed as if there is
no backup running. PostgreSQL does not need the checkpoints to perform
backup.

The 'UNDO' data is just kept in the main data files. Then, based on what
your xid is, and the xmin / xmax on each row, it's either visible or not.
That's what vacuum does, clean up the rows that are for 'undo' and can
never been seen anymore. The REDO itself is kept in the transaction logs.

Like Thomas says, the pg_start_backup() just creates a label that tells the
restoring database what pg_xlog record to start with when you 'restore' your
data. The 'UNDO' type of data (just the old rows actually) is just kept in
the main table until vacuum nukes 'em.

--Scott

Tadipathri Raghu

traghu.dba@gmail.com

about 16 years ago

In reply to: Scott Mead (#5)

Re: How many Mandatory Process are there in Postgres

Hi Scott, Thomas,

Thank you for the update.

Oracle uses a completely different implementation of MVCC

architecture.

It
overwrites the data and then uses rollback segments to provide 'previous
versions' to running transactions etc.

PostgreSQL does not overwrite the data - it just creates a copy of the
row
and then decides which version should each session see (depending on the
transaction IDs etc.). So it does not need to do rollbacks the way
Oracle
does, but it has to remove stale copies of the rows (such that no
running
transaction can see) - that's why there is VACUUM.

Here, if you have issued a command pg_start_backup() at that time the
cluster is freezed, and if any transaction takes place before the
pg_stop_backup() issued at that time where the transaction data will be
kept
if the undo's are not there.

What do you mean by 'freezed'? The cluster operates normally, the
pg_start_backup() just creates a backup label (and performs a checkpoint),
but that's not a problem. OK, there could be a performance decrease
because of full page writes, but the data will be processed as if there is
no backup running. PostgreSQL does not need the checkpoints to perform
backup.

The 'UNDO' data is just kept in the main data files. Then, based on what

your xid is, and the xmin / xmax on each row, it's either visible or not.
That's what vacuum does, clean up the rows that are for 'undo' and can
never been seen anymore. The REDO itself is kept in the transaction logs.

Like Thomas says, the pg_start_backup() just creates a label that tells the
restoring database what pg_xlog record to start with when you 'restore' your
data. The 'UNDO' type of data (just the old rows actually) is just kept in
the main table until vacuum nukes 'em.

--Scott
Q1. Does vacuum process come into existence even you turn off the
autovacuum.? What is the main work for vacuum process, to the clean the
buffers or work only when autovacuum in on and to clean up the rows that are
'undo'
Q2. Do you mean to say there is a vacuum process which will clean up the
buffers that are for 'undo'?
Q3. Fine, if the vacuum is off then your data files will hold the undo data
also in it?

See this:

http://wiki.postgresql.org/wiki/User:Gsmith#How_is_the_shared_buffer_cache_organized

Shared buffers is a LRU cache, i.e. when it's full and a new buffer is

requested, some of the 'oldest' buffers are removed.

Suppose, if there is a big transaction going on , in between of the
transaction shared_buffer got filled even after oldest buffers are removed,
at that time where does operation performs to finish up the transaction. As
it is a shared buffer every clients will be accessing this and even they
demand for the space for there operation too.

Regards
Raghav

On Thu, Mar 25, 2010 at 10:49 PM, Scott Mead
<scott.lists@enterprisedb.com>wrote:

Show quoted text

2010/3/25 <tv@fuzzy.cz>

<Snip>

Oracle uses a completely different implementation of MVCC architecture.
It
overwrites the data and then uses rollback segments to provide

'previous

versions' to running transactions etc.

PostgreSQL does not overwrite the data - it just creates a copy of the
row
and then decides which version should each session see (depending on

the

transaction IDs etc.). So it does not need to do rollbacks the way
Oracle
does, but it has to remove stale copies of the rows (such that no
running
transaction can see) - that's why there is VACUUM.

Here, if you have issued a command pg_start_backup() at that time the
cluster is freezed, and if any transaction takes place before the
pg_stop_backup() issued at that time where the transaction data will be
kept
if the undo's are not there.

What do you mean by 'freezed'? The cluster operates normally, the
pg_start_backup() just creates a backup label (and performs a checkpoint),
but that's not a problem. OK, there could be a performance decrease
because of full page writes, but the data will be processed as if there is
no backup running. PostgreSQL does not need the checkpoints to perform
backup.

The 'UNDO' data is just kept in the main data files. Then, based on what
your xid is, and the xmin / xmax on each row, it's either visible or not.
That's what vacuum does, clean up the rows that are for 'undo' and can
never been seen anymore. The REDO itself is kept in the transaction logs.

Like Thomas says, the pg_start_backup() just creates a label that tells the
restoring database what pg_xlog record to start with when you 'restore' your
data. The 'UNDO' type of data (just the old rows actually) is just kept in
the main table until vacuum nukes 'em.

--Scott

Scott Marlowe

scott.marlowe@gmail.com

about 16 years ago

In reply to: Tadipathri Raghu (#6)

Re: How many Mandatory Process are there in Postgres

On Thu, Mar 25, 2010 at 9:03 PM, Tadipathri Raghu <traghu.dba@gmail.com> wrote:

Hi Scott, Thomas,

Thank you for the update.

Oracle uses a completely different implementation of MVCC architecture.
It
overwrites the data and then uses rollback segments to provide
'previous
versions' to running transactions etc.

PostgreSQL does not overwrite the data - it just creates a copy of the
row
and then decides which version should each session see (depending on
the
transaction IDs etc.). So it does not need to do rollbacks the way
Oracle
does, but it has to remove stale copies of the rows (such that no
running
transaction can see) - that's why there is VACUUM.

Here, if you have issued a command pg_start_backup() at that time the
cluster is freezed, and if any transaction takes place before the
pg_stop_backup() issued at that time where the transaction data will be
kept
if the undo's are not there.

What do you mean by 'freezed'? The cluster operates normally, the
pg_start_backup() just creates a backup label (and performs a checkpoint),
but that's not a problem. OK, there could be a performance decrease
because of full page writes, but the data will be processed as if there is
no backup running. PostgreSQL does not need the checkpoints to perform
backup.

The 'UNDO' data is just kept in the main data files. Then, based on what
your xid is, and the xmin / xmax on each row, it's either visible or not.
That's what vacuum does, clean up the rows that are for 'undo' and can
never been seen anymore. The REDO itself is kept in the transaction logs.

Like Thomas says, the pg_start_backup() just creates a label that tells
the restoring database what pg_xlog record to start with when you 'restore'
your data. The 'UNDO' type of data (just the old rows actually) is just
kept in the main table until vacuum nukes 'em.

--Scott
Q1. Does vacuum process come into existence even you turn off the
autovacuum.? What is the main work for vacuum process, to the clean the
buffers or work only when autovacuum in on and to clean up the rows that are
'undo'

1a: Yes, if you are approaching transaction id wraparound (txid wrap)
then autovacuum will crank up a vacuum to fix that situation whether
you like it or not. The alternative is db shutdown.
1b: The main job of vacuum is to free dead tuples, which postgresql
accumulates as it updates or deletes tuples. When you have several old
versions of a tuple to collect, vacuum recycles the space and makes it
available to other updates to now use instead of allocating on the end
of the relation.

Q2. Do you mean to say there is a vacuum process which will clean up the
buffers that are for 'undo'?

No buffers for undo. It's all stored on disc, ready for instant
access. Try it on a test db. begin; load a table with 1M rows;
rollback; sub second response. A million changes just disappeared
instantly. And now you've got 1M dead tuples in that table. Vacuum's
job is to make them available for re-use. But they're all in one big
consecutive file so the performance is actually pretty good as the
table repopulates, especially if it'll get back up to 1M records again
in the near future. As long as it got vacuumed.

Q3. Fine, if the vacuum is off then your data files will hold the undo data
also in it?

Yep. Until a forced vacuum due to txid wrap occurs.

Tadipathri Raghu

traghu.dba@gmail.com

about 16 years ago

In reply to: Scott Marlowe (#7)

Re: How many Mandatory Process are there in Postgres

Hi All,

Thank you for the prompt reply on this.

Please find the output of the top command and the process availabe. Could
explain what logger process is here for

top - 12:41:57 up 17:51, 3 users, load average: 0.00, 0.04, 0.01
Tasks: 141 total, 1 running, 139 sleeping, 0 stopped, 1 zombie
Cpu(s): 0.3%us, 1.4%sy, 0.0%ni, 98.3%id, 0.0%wa, 0.0%hi, 0.0%si,
0.0%st
Mem: 1417256k total, 1300888k used, 116368k free, 41468k buffers
Swap: 2097144k total, 22156k used, 2074988k free, 1017980k cached
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
23466 postgres 15 0 4664 1508 1228 S 0.0 0.1 0:00.10 -bash
25983 postgres 15 0 44620 3032 2680 S 0.0 0.2 0:00.43
/usr/local/pgsql/bin/postgres
25984 postgres 18 0 8608 632 296 S 0.0 0.0 0:00.04 postgres:
logger process
25986 postgres 15 0 44604 1656 1292 S 0.0 0.1 0:00.10 postgres:
writer process
25987 postgres 15 0 44604 932 588 S 0.0 0.1 0:00.11 postgres: wal
writer process
25988 postgres 15 0 44756 1232 680 S 0.0 0.1 0:00.17 postgres:
autovacuum launcher process
25989 postgres 18 0 8604 648 304 S 0.0 0.0 0:00.04 postgres:
archiver process
25990 postgres 15 0 8740 900 408 S 0.0 0.1 0:00.31 postgres: stats
collector process
25993 postgres 15 0 4964 1464 1196 S 0.0 0.1 0:00.17 psql
25994 postgres 15 0 46264 6212 4672 S 0.0 0.4 0:00.23 postgres:
postgres postgres [local] idle

Regards
Raghavendra

On Fri, Mar 26, 2010 at 9:22 AM, Scott Marlowe <scott.marlowe@gmail.com>wrote:

Show quoted text

On Thu, Mar 25, 2010 at 9:03 PM, Tadipathri Raghu <traghu.dba@gmail.com>
wrote:

Hi Scott, Thomas,

Thank you for the update.

Oracle uses a completely different implementation of MVCC

architecture.

It
overwrites the data and then uses rollback segments to provide
'previous
versions' to running transactions etc.

PostgreSQL does not overwrite the data - it just creates a copy of

the

row
and then decides which version should each session see (depending on
the
transaction IDs etc.). So it does not need to do rollbacks the way
Oracle
does, but it has to remove stale copies of the rows (such that no
running
transaction can see) - that's why there is VACUUM.

Here, if you have issued a command pg_start_backup() at that time the
cluster is freezed, and if any transaction takes place before the
pg_stop_backup() issued at that time where the transaction data will

be

kept
if the undo's are not there.

What do you mean by 'freezed'? The cluster operates normally, the
pg_start_backup() just creates a backup label (and performs a

checkpoint),

but that's not a problem. OK, there could be a performance decrease
because of full page writes, but the data will be processed as if there

is

no backup running. PostgreSQL does not need the checkpoints to perform
backup.

The 'UNDO' data is just kept in the main data files. Then, based on

what

your xid is, and the xmin / xmax on each row, it's either visible or

not.

That's what vacuum does, clean up the rows that are for 'undo' and can
never been seen anymore. The REDO itself is kept in the transaction

logs.

Like Thomas says, the pg_start_backup() just creates a label that tells
the restoring database what pg_xlog record to start with when you

'restore'

your data. The 'UNDO' type of data (just the old rows actually) is just
kept in the main table until vacuum nukes 'em.

--Scott
Q1. Does vacuum process come into existence even you turn off the
autovacuum.? What is the main work for vacuum process, to the clean the
buffers or work only when autovacuum in on and to clean up the rows that

are

'undo'

1a: Yes, if you are approaching transaction id wraparound (txid wrap)
then autovacuum will crank up a vacuum to fix that situation whether
you like it or not. The alternative is db shutdown.
1b: The main job of vacuum is to free dead tuples, which postgresql
accumulates as it updates or deletes tuples. When you have several old
versions of a tuple to collect, vacuum recycles the space and makes it
available to other updates to now use instead of allocating on the end
of the relation.

Q2. Do you mean to say there is a vacuum process which will clean up the
buffers that are for 'undo'?

No buffers for undo. It's all stored on disc, ready for instant
access. Try it on a test db. begin; load a table with 1M rows;
rollback; sub second response. A million changes just disappeared
instantly. And now you've got 1M dead tuples in that table. Vacuum's
job is to make them available for re-use. But they're all in one big
consecutive file so the performance is actually pretty good as the
table repopulates, especially if it'll get back up to 1M records again
in the near future. As long as it got vacuumed.

Q3. Fine, if the vacuum is off then your data files will hold the undo

data

also in it?

Yep. Until a forced vacuum due to txid wrap occurs.

Scott Marlowe

scott.marlowe@gmail.com

about 16 years ago

In reply to: Tadipathri Raghu (#8)

Re: How many Mandatory Process are there in Postgres

On Sun, Mar 28, 2010 at 1:14 AM, Tadipathri Raghu <traghu.dba@gmail.com> wrote:

Hi All,

Thank you for the prompt reply on this.

Please find the output of the top command and the process availabe. Could
explain what logger process is here for

Logging? I'm just guessing there. My machines don't have it and I'm
guessing it's the process that starts up if you're using syslog as
opposed to local logging.

#10

Guillaume Lelarge

guillaume@lelarge.info

about 16 years ago

In reply to: Scott Marlowe (#9)

Re: How many Mandatory Process are there in Postgres

Le 28/03/2010 19:30, Scott Marlowe a ï¿½crit :

On Sun, Mar 28, 2010 at 1:14 AM, Tadipathri Raghu <traghu.dba@gmail.com> wrote:

Hi All,

Thank you for the prompt reply on this.

Please find the output of the top command and the process availabe. Could
explain what logger process is here for

Logging? I'm just guessing there. My machines don't have it and I'm
guessing it's the process that starts up if you're using syslog as
opposed to local logging.

That's the other way around. The logger process is launched when you
enable the logging_collector (or redirect_stderr in 8.2 and older releases).

--
Guillaume.
http://www.postgresqlfr.org
http://dalibo.com