Point-in-time data recovery - v.7.4
Hello
I am trying to find out when the 'Point-in-time data recovery'
functionality will be available in PostgreSQL, but I cannot find any
concrete information about this.
References on the mailing lists talk about version 7.4, and in the
TODO list it is under the 'urgent' section.
Does anybody know when this functionality will be available at
production quality?
--
With regards
Rafael Martinez
USIT, University of Oslo
I think it was delayed until 7.5... same for the win32 port.
Here is Bruce's message talking about both topics:
http://archives.postgresql.org/pgsql-hackers/2003-07/msg00284.php
On Tue, 2003-11-18 at 11:01, Rafael Martinez Guerrero wrote:
Hello
I am trying to find out when the 'Point-in-time data recovery'
functionality will be available in PostgreSQL, but I cannot find any
concrete information about this.
References on the mailing lists talk about version 7.4, and in the
TODO list it is under the 'urgent' section.
Does anybody know when this functionality will be available at
production quality?
Rafael Martinez Guerrero <r.m.guerrero@usit.uio.no> writes:
Anybody knows when this functionality will be available with
prod-quality?
It's likely to show up in 7.5, which it is hoped will be released around
the middle of next year. However, as this is free software, dependent on
volunteers and other people's priorities, neither the feature set of the
next release nor the exact release date is promised.
--
greg
It was planned for 7.4 but got bumped since the work was not finished.
There are patches floating around and people are putting some effort
into it, so hopefully we will see something in 7.5, but it is dependent
on the code being finished before the end of the 7.5 development cycle.
If you want to help code PITR, please send a message to the hackers list.
Robert Treat
On Tue, 2003-11-18 at 09:01, Rafael Martinez Guerrero wrote:
Hello
I am trying to find out when the 'Point-in-time data recovery'
functionality will be available in PostgreSQL, but I cannot find any
concrete information about this.
References on the mailing lists talk about version 7.4, and in the
TODO list it is under the 'urgent' section.
Does anybody know when this functionality will be available at
production quality?
--
With regards
Rafael Martinez
USIT, University of Oslo
---------------------------(end of broadcast)---------------------------
TIP 1: subscribe and unsubscribe commands go to majordomo@postgresql.org
--
Build A Brighter Lamp :: Linux Apache {middleware} PostgreSQL
On Tue, Nov 18, 2003 at 03:01:15PM +0100, Rafael Martinez Guerrero wrote:
References in mailinglists talk about version 7.4 and in the TODO list
is under the section 'urgent'.
It didn't get done.
Anybody knows when this functionality will be available with
prod-quality?
I don't think so.
Jan Wieck has a proposal for a new replication system which will
offer a trick for producing point-in-time recovery as a side benefit.
We are aiming to have that software in use sooner rather than later,
but it hasn't been written yet.
A
--
----
Andrew Sullivan 204-4141 Yonge Street
Afilias Canada Toronto, Ontario Canada
<andrew@libertyrms.info> M2P 2A8
+1 416 646 3304 x110
On Tue, 2003-11-18 at 08:29, Andrew Sullivan wrote:
On Tue, Nov 18, 2003 at 03:01:15PM +0100, Rafael Martinez Guerrero wrote:
References on the mailing lists talk about version 7.4, and in the
TODO list it is under the 'urgent' section.
It didn't get done.
Does anybody know when this functionality will be available at
production quality?
I don't think so.
Jan Wieck has a proposal for a new replication system which will
offer a trick for producing point-in-time recovery as a side benefit.
We are aiming to have that software in use sooner rather than later,
but it hasn't been written yet.
Supplanting PITR with replication? If so, I don't think that's
a good idea, since you'd either have to buy a new server or double
your disk capacity, both of which can cost noticeable amounts of money
if SCSI disks are used.
--
-----------------------------------------------------------------
Ron Johnson, Jr. ron.l.johnson@cox.net
Jefferson, LA USA
When Swedes start committing terrorism, I'll become suspicious of
Scandinavians.
What is the best method for storing files in postgres? Is it better to use
the large object functions or to just encode the data and store it in a
regular text or data field?
On Tue, Nov 18, 2003 at 12:39:40PM -0600, Ron Johnson wrote:
Supplanting PITR with Replication? If so, I don't think that's
No, I think the idea is that if you're already using the replication
system, you can get this for nothing along with it. Yes, it'll cost
you in hardware.
A
--
----
Andrew Sullivan 204-4141 Yonge Street
Afilias Canada Toronto, Ontario Canada
<andrew@libertyrms.info> M2P 2A8
+1 416 646 3304 x110
"Rick Gigger" <rick@alpinenetworking.com> writes:
What is the best method for storing files in postgres? Is it better to use
the large object functions or to just encode the data and store it in a
regular text or data field?
Depends on whether you need "random access" to the contents. You can
lo_seek() inside a large object and retrieve parts of the data with
lo_read(), while 'text' and 'bytea' currently require fetching the
whole field.
-Doug
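Doug's lo_seek()/lo_read() point can also be exercised from SQL using the server-side large object functions; a minimal sketch, with an invented table and a made-up descriptor value for illustration:

```sql
-- Hypothetical schema: the file contents live in a large object,
-- referenced from the table by an OID column.
CREATE TABLE files (id integer PRIMARY KEY, name text, content oid);

-- Large object descriptors are only valid inside a transaction.
BEGIN;
SELECT lo_open(content, 262144) FROM files WHERE id = 1;  -- 262144 = INV_READ
-- Assuming the call above returned descriptor 0:
SELECT lo_lseek(0, 1000, 0);   -- seek to byte offset 1000 (0 = SEEK_SET)
SELECT loread(0, 500);         -- read 500 bytes from there
COMMIT;
```

A bytea column, by contrast, is returned in full by an ordinary SELECT of that column.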
Quoting Rick Gigger <rick@alpinenetworking.com>:
What is the best method for storing files in postgres? Is it better to use
the large object functions or to just encode the data and store it in a
regular text or data field?
Rick,
This has been discussed recently, so you might want to dig through the
archives, but one thing to be aware of is that with large objects you
have to dump your database in a different, non-text format. This is
less portable than storing files as bytea, which can be dumped in both
the text and non-text formats. Arguably, if all you use is PostgreSQL,
then this might not be a big deal.
Performance-wise, I'm not qualified to say which is "better". As
always, it is going to depend on your specific application and
environment.
--
Keith C. Perry, MS E.E.
Director of Networks & Applications
VCSN, Inc.
http://vcsn.com
____________________________________
This email account is being hosted by:
VCSN, Inc : http://vcsn.com
I will search the archives but does anyone know off the top of their head
which performs better?
----- Original Message -----
From: "Keith C. Perry" <netadmin@vcsn.com>
To: "Rick Gigger" <rick@alpinenetworking.com>
Cc: "PgSQL General ML" <pgsql-general@postgresql.org>
Sent: Tuesday, November 18, 2003 12:25 PM
Subject: Re: [GENERAL] uploading files
"Rick Gigger" <rick@alpinenetworking.com> writes:
I will search the archives but does anyone know off the top of their head
which performs better?
It really depends on what you are trying to do. If you are always
fetching the entire file from the database, 'bytea' *might* have a
little less overhead, but it's not clearcut.
Both of them will ultimately be limited by how fast you can get data
off the disk and pump it through your client connection.
-Doug
Depends on whether you need "random access" to the contents. You can
lo_seek() inside a large object and retrieve parts of the data with
lo_read(), while 'text' and 'bytea' currently require fetching the
whole field.
Not so, unless I misunderstand. We use substr() on bytea for
chunked access to large files on OSes with inferior TCP/IP buffer
implementations. That's probably suboptimal performance-wise,
though.
Karsten
--
GPG key ID E4071346 @ wwwkeys.pgp.net
E167 67FD A291 2BEA 73BD 4537 78B9 A9F9 E407 1346
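The substr()-on-bytea chunking Karsten describes might look like the following sketch (table and column names are invented; note that bytea offsets in substr() are 1-based):

```sql
-- Hypothetical table with the file stored inline as bytea.
CREATE TABLE files (id integer PRIMARY KEY, name text, data bytea);

-- Pull the file down in 64 KB slices instead of one big fetch.
SELECT substr(data, 1, 65536)     FROM files WHERE id = 1;  -- bytes 1..65536
SELECT substr(data, 65537, 65536) FROM files WHERE id = 1;  -- bytes 65537..131072
-- ...and so on, until substr() returns an empty result.
```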
Karsten Hilbert <Karsten.Hilbert@gmx.net> writes:
Depends on whether you need "random access" to the contents. You can
lo_seek() inside a large object and retrieve parts of the data with
lo_read(), while 'text' and 'bytea' currently require fetching the
whole field.
Not so, unless I misunderstand. We use substr() on bytea for
chunked access to large files on OSes with inferior TCP/IP buffer
implementations. That's probably suboptimal performance-wise,
though.
I think the field will still be completely loaded into memory on the
server side, though, while LOs are stored in "chunks" and can
theoretically be streamed to the client. I'm not really a definitive
authority, though...
-Doug
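The chunked storage Doug mentions is visible in the pg_largeobject system catalog, where each large object is split into page-sized rows; a small sketch (the OID 12345 is a placeholder for a real large object's OID):

```sql
-- Each large object is stored as rows of (loid, pageno, data).
SELECT pageno, length(data) AS chunk_bytes
FROM pg_largeobject
WHERE loid = 12345
ORDER BY pageno;
```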
I think the field will still be completely loaded into memory on the
server side, though, while LOs are stored in "chunks" and can
theoretically be streamed to the client. I'm not really a definitive
authority, though...
Ah! Sounds about right! Something new to learn every day :-)
Karsten
--
GPG key ID E4071346 @ wwwkeys.pgp.net
E167 67FD A291 2BEA 73BD 4537 78B9 A9F9 E407 1346
I am currently trying to import a text data file with about 45,000
records. At the end of the import it does an update on each of the
45,000 records. Doing all of the inserts completes in a fairly short
amount of time (about 2 1/2 minutes). Once it gets to the updates,
though, it slows to a crawl. After about 10 minutes it's only done
about 3000 records.
Is that normal? Is it because it's inside such a large transaction?
Is there anything I can do to speed it up? It seems awfully slow to me.
I didn't think that giving it more shared buffers would help but I tried
anyway. It didn't help.
I tried doing a full vacuum with analyze (vacuumdb -z -f) and it
cleaned up a lot of stuff, but it didn't speed up the updates at all.
I am using a dual 800MHz Xeon box with 2 GB of RAM. I've tried
anywhere from about 16,000 to 65,000 shared buffers.
What other factors are involved here?
Rick Gigger wrote:
I am currently trying to import a text data file with about 45,000
records. At the end of the import it does an update on each of the
45,000 records. Doing all of the inserts completes in a fairly short
amount of time (about 2 1/2 minutes). Once it gets to the updates,
though, it slows to a crawl. After about 10 minutes it's only done
about 3000 records.
Is that normal? Is it because it's inside such a large transaction?
Is there anything I can do to speed it up? It seems awfully slow to me.
I didn't think that giving it more shared buffers would help but I
tried anyway. It didn't help.
I tried doing a full vacuum with analyze (vacuumdb -z -f) and it
cleaned up a lot of stuff, but it didn't speed up the updates at all.
I am using a dual 800MHz Xeon box with 2 GB of RAM. I've tried
anywhere from about 16,000 to 65,000 shared buffers.
What other factors are involved here?
It is difficult to say without knowing either the definition of the
relation(s) or the update queries involved. Are there indexes being
created after the import that would allow PostgreSQL to locate the
rows being updated quickly, or is the update an unqualified update (no
WHERE clause) that affects all tuples?
EXPLAIN ANALYZE is your friend...
Mike Mascari
mascarm@mascari.com
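Following Mike's suggestion, a sketch of what to look for (the table and column names here are made up, since the real schema was never posted):

```sql
-- If the plan shows "Seq Scan", every row of the table is read to find
-- the one being updated; an "Index Scan" locates the row directly.
EXPLAIN ANALYZE
UPDATE import_data SET status = 'done' WHERE record_id = 42;
```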
-----Original Message-----
From: Rick Gigger [mailto:rick@alpinenetworking.com]
Sent: Tuesday, November 18, 2003 12:43 PM
To: PgSQL General ML
Subject: [GENERAL] performance problem
I am currently trying to import a text data file with about 45,000
records. At the end of the import it does an update on each of the
45,000 records. Doing all of the inserts completes in a fairly short
amount of time (about 2 1/2 minutes). Once it gets to the updates,
though, it slows to a crawl. After about 10 minutes it's only done
about 3000 records. Is that normal?
No way to know, from what you have told us.
Is it because it's inside such a large transaction? Is there anything
I can do to speed it up? It seems awfully slow to me. I didn't think
that giving it more shared buffers would help, but I tried anyway. It
didn't help. I tried doing a full vacuum with analyze (vacuumdb -z -f)
and it cleaned up a lot of stuff, but it didn't speed up the updates
at all. I am using a dual 800MHz Xeon box with 2 GB of RAM. I've tried
anywhere from about 16,000 to 65,000 shared buffers. What other
factors are involved here?
What is the structure of the tables? What sort of indexes are involved?
Is the query using a prepared statement? Can you perform the entire
update in a single SQL statement? What is the cardinality of the tables
involved?
Provide the exact code that you use to perform the update.
Provide the SQL schema for every table used in the queries.
Run EXPLAIN ANALYZE on the slow query. What does it say?
On Tue, 18 Nov 2003, Rick Gigger wrote:
I am currently trying to import a text data file without about 45,000
records. At the end of the import it does an update on each of the 45,000
records. Doing all of the inserts completes in a fairly short amount of
time (about 2 1/2 minutes). Once it gets to the the updates though it slows
to a craw. After about 10 minutes it's only done about 3000 records.
Are you doing your updates like this:
update table1 set field1='1' where id=1;
update table1 set field2=4 where id=1;
...
update table1 set field10='something else' where id=1;
update table1 set field1='3' where id=2;
...
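If the updates do look like that, they can usually be collapsed into one statement per row, saving a round trip and a row rewrite for every column touched; for example:

```sql
-- One statement doing the work of the field-by-field updates above:
UPDATE table1
SET field1 = '1',
    field2 = 4,
    field10 = 'something else'
WHERE id = 1;
```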
Maybe an analyze after the import and before the updates would help, as
might a vacuum [full]. If the table isn't updated by other processes,
probably not.
Maybe you've got a foreign key mismatch going on and a lot of
sequential scanning?
Is that normal? Is it because it's inside such a large transaction? Is
there anything I can do to speed that up. It seems awfully slow to me.
Possibly. If you are creating a lot of dead tuples, then the operations
can get slower and slower. Have you checked your FSM settings, etc.?
I didn't think that giving it more shared buffers would help but I tried
anyway. It didn't help.
Usually doesn't. More sort_mem might, though. Make it something like
16384 or 32768 (it's measured in kilobytes).
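On the 7.x releases, sort_mem can be raised for just the importing session rather than server-wide; a sketch:

```sql
-- sort_mem is measured in kilobytes; this raises it to 32 MB for the
-- current session only, leaving the server-wide default untouched.
SET sort_mem = 32768;
```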
I tried doing a analyze full on it (vacuumdb -z -f) and it cleaned up a lot
of stuff but it didn't speed up the updates at all.
You need to probably do the analyze between the import and the update.
I am using a dual 800mhz xeon box with 2 gb of ram. I've tried anywhere
from about 16,000 to 65000 shared buffers.
That's VERY high. When PostgreSQL has to manage a lot of buffers it is
actually slower than letting the kernel in Linux or BSD do it for you.
What other factors are involved here?
Not sure. More concrete examples would help. Have you run your queries
with EXPLAIN ANALYZE at the front and looked for differences in the
number of rows / loops? Those are the dead giveaways.
Take a look here:
http://www.varlena.com/varlena/GeneralBits/Tidbits/perf.html
Uh, I feel a little silly now. I had an index on the field in question
(needed to locate the row to update) but later recreated the table and
forgot to re-add it. I had assumed that it was there, but I double-checked
just now and it was gone. I then re-added the index and it finished in a
few minutes.
Sorry about that one. Thanks for the help.
rg
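For the record, the missing piece was an index of roughly this shape (the names here are invented, since the real schema was never posted):

```sql
-- Recreate the lookup index the update depends on, then refresh
-- the planner statistics so it gets used.
CREATE INDEX import_data_record_id_idx ON import_data (record_id);
ANALYZE import_data;
```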
----- Original Message -----
From: "Mike Mascari" <mascarm@mascari.com>
To: "Rick Gigger" <rick@alpinenetworking.com>
Cc: "PgSQL General ML" <pgsql-general@postgresql.org>
Sent: Tuesday, November 18, 2003 2:03 PM
Subject: Re: [GENERAL] performance problem