Re: where should I stick that backup?

Started by Stephen Frost, almost 6 years ago. 5 messages
#1 Stephen Frost
sfrost@snowman.net

Greetings,

* Robert Haas (robertmhaas@gmail.com) wrote:

On Mon, Apr 6, 2020 at 2:23 PM Stephen Frost <sfrost@snowman.net> wrote:

So, instead of talking about 'bzip2 > %f.bz2', and then writing into our
documentation that that's how this feature can be used, what about
proposing something that would actually work reliably with this
interface? Which properly fsync's everything, has good retry logic for
when failures happen, is able to actually detect when a failure
happened, how to restore from a backup taken this way, and it'd probably
be good to show how pg_verifybackup could be used to make sure the
backup is actually correct and valid too.

I don't really understand the problem here. Suppose I do:

mkdir ~/my-brand-new-empty-directory
cd ~/my-brand-new-empty-directory
pg_basebackup -Ft --pipe-output 'bzip2 > %f.bz2'
initdb -S --dont-expect-that-this-is-a-data-directory . # because
right now it would complain about pg_wal and pg_tblspc being missing

I think if all that works, my backup should be good and durably on
disk. If it's not, then either pg_basebackup or bzip2 or initdb didn't
report errors that they should have reported. If you're worried about
that, say because you suspect those programs are buggy or because you
think the kernel may not be reporting errors properly, you can use tar
-jxvf + pg_validatebackup to check.

What *exactly* do you think can go wrong here?

What if %f.bz2 already exists? How about if %f has a space in it? What
about if I'd like to verify that the backup looks reasonably valid
without having to find space to store it entirely decompressed?

Also, this argument feels disingenuous to me. That isn't the only use
you're promoting for this feature, as you say below. If the
only thing this feature is *actually* intended for is to add bzip2
support, then we should just add bzip2 support directly and call it a
day, but what you're really talking about here is a generic interface
that you'll want to push users to for things like "how do I back up to
s3" or "how do I back up to GCS" and so we should be thinking about
those cases and not just a relatively simple use case.

This is the same kind of slippery slope that our archive_command is
built on: sure, if everything "works" then it's "fine", even with our
documented example, but we know that not everything works in the real
world, and just throwing an 'initdb -S' in there isn't a general
solution because users want to do things like send WAL to s3 or GCS or
such.

I don't think there's any doubt that there'll be no shortage of shell
scripts and various other things that'll be used with this. Yes,
they'll be provided by our users, and therefore we can blame them for
doing it wrong, but then they'll complain on our lists and we'll spend
time educating them on how to write proper software for this, or
pointing them to a solution that someone writes specifically for it.
I don't view that as, ultimately, a good solution.

* Robert Haas (robertmhaas@gmail.com) wrote:

On Mon, Apr 6, 2020 at 1:32 PM Magnus Hagander <magnus@hagander.net> wrote:

Now, if we were just talking about compression, it would actually be
interesting to implement some sort of "postgres compression API" if
you will, that is implemented by a shared library. This library could
then be used from pg_basebackup or from anything else that needs
compression. And anybody who wants could then do a "<compression X>
for PostgreSQL" module, removing the need for us to carry such code
upstream.

I think it could be more general than a compression library. It could
be a store-my-stuff-and-give-it-back-to-me library, which might do
compression or encryption or cloud storage or any combination of the
three, and probably other stuff too. Imagine that you first call an
init function with a namespace that is basically a string provided by
the user. Then you open a file either for read or for write (but not
both). Then you read or write a series of chunks (depending on the
file mode). Then you close the file. Then you can do the same with
more files. Finally at the end you close the namespace. You don't
really need to care where or how the functions you are calling store
the data. You just need them to return proper error indicators if by
chance they fail.

Yes, having a storage layer makes a lot of sense here, with features
that are understood by the core system and which each driver
understands, and then having a filter system which is also pluggable and
can support things like compression and hashing for this would also be
great.

I can point you to examples of all of the above, already implemented, in
C, all OSS. Sure seems like a pretty good and reasonable approach to
take when other folks are doing it.
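
For concreteness, here's a rough sketch of the kind of driver interface
being described; all of the type and function names below are
hypothetical, purely for illustration, and not taken from any existing
implementation:

    /* Hypothetical storage-driver interface, for illustration only. */
    #include <stdbool.h>
    #include <stddef.h>

    typedef struct StorageDriver StorageDriver;   /* opaque driver state */
    typedef struct StorageFile StorageFile;       /* opaque per-file handle */

    typedef struct StorageDriverOps
    {
        /* Open the "namespace": a directory, a bucket, a key prefix, ... */
        StorageDriver *(*open_namespace) (const char *uri, char **errmsg);

        /* Open a named file for reading or for writing, never both. */
        StorageFile *(*open_file) (StorageDriver *drv, const char *name,
                                   bool for_write, char **errmsg);

        /* Transfer one chunk; return false and set *errmsg on failure. */
        bool        (*read_chunk) (StorageFile *f, void *buf, size_t len,
                                   size_t *nread, char **errmsg);
        bool        (*write_chunk) (StorageFile *f, const void *buf,
                                    size_t len, char **errmsg);

        /* Close the file; a writer must have made it durable on success. */
        bool        (*close_file) (StorageFile *f, char **errmsg);

        /* Tear down the namespace and release driver state. */
        bool        (*close_namespace) (StorageDriver *drv, char **errmsg);
    } StorageDriverOps;

A compression or hashing filter could then present these same
operations while wrapping another driver underneath it, which is
roughly the layering being described here.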

As compared with my previous proposal, this would work much better for
pg_basebackup -Fp, because you wouldn't launch a new bzip2 process for
every file. You'd just bzopen(), which is presumably quite lightweight
by comparison. The reasons I didn't propose it are:

1. Running bzip2 on every file in a plain-format backup seems a lot
sillier than running it on every tar file in a tar-format backup.

This is circular, isn't it? It's silly because you're launching new
bzip2 processes for every file, but if you were using bzopen() then you
wouldn't have that issue and therefore compressing every file in a
plain-format backup would be entirely reasonable.
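
For what it's worth, libbz2's high-level API (bzopen(), spelled
BZ2_bzopen() in current bzlib.h) makes in-process, per-file compression
quite cheap. A minimal sketch follows; real code would also need to
fsync the result, and would likely want the lower-level
BZ2_bzWriteClose() interface for proper error reporting, since the
high-level close returns void:

    /* Sketch: compress one file in-process with libbz2 rather than
     * spawning a bzip2 process per file.  Build with: cc ... -lbz2
     */
    #include <stdio.h>
    #include <bzlib.h>

    static int
    compress_one_file(const char *src_path, const char *dst_path)
    {
        FILE   *in;
        BZFILE *out;
        char    buf[65536];
        size_t  n;

        in = fopen(src_path, "rb");
        if (in == NULL)
            return -1;

        out = BZ2_bzopen(dst_path, "wb9");      /* write, compression level 9 */
        if (out == NULL)
        {
            fclose(in);
            return -1;
        }

        while ((n = fread(buf, 1, sizeof(buf), in)) > 0)
        {
            if (BZ2_bzwrite(out, buf, (int) n) < 0)
            {
                BZ2_bzclose(out);
                fclose(in);
                return -1;
            }
        }

        /* BZ2_bzclose() flushes and finalizes the stream (returns void). */
        BZ2_bzclose(out);
        if (ferror(in))
        {
            fclose(in);
            return -1;
        }
        return fclose(in) == 0 ? 0 : -1;
    }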

2. I'm not confident that the command specified here actually needs to
be anything very complicated (unlike archive_command).

This.. just doesn't make sense to me. The above talks about pushing
things to cloud storage and such, which is definitely much more
complicated than what had really been contemplated when archive_command
was introduced.

3. The barrier to entry for a loadable module is a lot higher than for
a shell command.

Sure.

4. I think that all of our existing infrastructure for loadable
modules is backend-only.

That certainly doesn't seem like a terrible hurdle, but I'm not convinced
that we'd actually need or want this to be done through loadable
modules. I'd argue that we should, instead, be thinking about building a
system where we could accept patches that add new drivers and new
filters to core, where they're reviewed and well written.

Now all of these are up for discussion. I am sure we can make the
loadable module stuff work in frontend code; it would just take some
work. A C interface for extensibility is very significantly harder to
use than a shell interface, but it's still way better than no
interface. The idea that this shell command can be something simple is
my current belief, but it may turn out to be wrong. And I'm sure
somebody can propose a good reason to do something with every file in
a plain-format backup rather than using tar format.

I've already tried to point out that the shell command you're talking
about isn't going to be able to just be a simple command if the idea is
that it'd be used to send things to s3 or gcs or anything like that.
*Maybe* it could be simple if the only thing it's used for is a simple
compression filter (though we'd have to deal with the whole fsync thing,
as discussed), but it seems very likely that everyone would be a lot
happier if we just built in support for bzip2, lz4, gzip, whatever, and
that certainly doesn't strike me as a large ask in terms of code
complexity or level of effort.
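
To spell out "the whole fsync thing": making the compressed output
durable means flushing both the file and its containing directory
entry, which a pipe to 'bzip2 > %f.bz2' never does. An abbreviated,
illustrative sketch of what a filter in core would have to do (in
practice the frontend fsync helpers in src/common/file_utils.c exist
for exactly this, and also handle platforms that want the file opened
for writing):

    /* Abbreviated sketch of the durability dance needed after writing
     * a backup file; error handling is intentionally minimal.
     */
    #include <fcntl.h>
    #include <libgen.h>
    #include <string.h>
    #include <unistd.h>

    static int
    fsync_file_and_dir(const char *path)
    {
        char    dirbuf[4096];
        int     fd;

        /* Flush the file's contents. */
        fd = open(path, O_RDONLY);
        if (fd < 0 || fsync(fd) != 0)
        {
            if (fd >= 0)
                close(fd);
            return -1;
        }
        close(fd);

        /* Flush the directory entry too, so the name survives a crash. */
        strncpy(dirbuf, path, sizeof(dirbuf) - 1);
        dirbuf[sizeof(dirbuf) - 1] = '\0';
        fd = open(dirname(dirbuf), O_RDONLY);   /* dirname() modifies its arg */
        if (fd < 0 || fsync(fd) != 0)
        {
            if (fd >= 0)
                close(fd);
            return -1;
        }
        close(fd);
        return 0;
    }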

All that being said, I still find it hard to believe that we will want
to add dependencies for libraries that we'd need to do encryption or
S3 cloud storage to PostgreSQL itself. So if we go with this more
integrated approach we should consider the possibility that, when the
dust settles, PostgreSQL will only have pg_basebackup
--output-plugin=lz4 and Aurora will also have pg_basebackup
--output-plugin=s3. From my point of view, that would be less than
ideal.

We already have libraries for encryption and we do not have to add
libraries for s3 storage to support it as an option, as I mentioned
up-thread. I don't find the argument that someone else might extend
pg_basebackup (or whatever) to add on new features to be one that
concerns me terribly much, provided we give people the opportunity to
add those same features into core if they're willing to put in the
effort to make it happen. I'm quite concerned that this generic "you
can just write a shell script to do it" approach will be used, over
and over again, as an argument against, or at least a deterrent to,
having something proper in core, and will ultimately result in us not
having any good solution in core for the very common use cases that
our users have today.

That certainly seems like what's happened with archive_command.

Thanks,

Stephen

#2 Robert Haas
robertmhaas@gmail.com
In reply to: Stephen Frost (#1)

On Wed, Apr 8, 2020 at 1:05 PM Stephen Frost <sfrost@snowman.net> wrote:

What if %f.bz2 already exists?

That cannot occur in the scenario I described.

How about if %f has a space in it?

For a tar-format backup I don't think that can happen, because the
file names will be base.tar and ${tablespace_oid}.tar. For a plain
format backup it's a potential issue.

What
about if I'd like to verify that the backup looks reasonably valid
without having to find space to store it entirely decompressed?

Then we need to make pg_validatebackup better.

Also, this argument feels disingenuous to me.
[ lots more stuff ]

This all just sounds like fearmongering to me. "archive_command
doesn't work very well, so maybe your thing won't either." Maybe it
won't, but the fact that archive_command doesn't isn't a reason.

Yes, having a storage layer makes a lot of sense here, with features
that are understood by the core system and which each driver
understands, and then having a filter system which is also pluggable and
can support things like compression and hashing for this would also be
great.

It's good to know that you prefer a C interface to one based on shell
scripting. I hope that we will also get some other opinions on that
question, as my own feelings are somewhat divided (but with some bias
toward trying to make the shell scripting thing work, because I
believe it gives a lot more practical flexibility).

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#3 Stephen Frost
sfrost@snowman.net
In reply to: Robert Haas (#2)

Greetings,

* Robert Haas (robertmhaas@gmail.com) wrote:

On Wed, Apr 8, 2020 at 1:05 PM Stephen Frost <sfrost@snowman.net> wrote:

What if %f.bz2 already exists?

That cannot occur in the scenario I described.

Of course it can.

How about if %f has a space in it?

For a tar-format backup I don't think that can happen, because the
file names will be base.tar and ${tablespace_oid}.tar. For a plain
format backup it's a potential issue.

I agree that it might not be an issue for tar-format.

What
about if I'd like to verify that the backup looks reasonably valid
without having to find space to store it entirely decompressed?

Then we need to make pg_validatebackup better.

Sure, but shouldn't the design be contemplating how these various tools
will work together?

Also, this argument feels disingenuous to me.
[ lots more stuff ]

This all just sounds like fearmongering to me. "archive_command
doesn't work very well, so maybe your thing won't either." Maybe it
won't, but the fact that archive_command doesn't isn't a reason.

I was trying to explain that we have literally gone down exactly this
path before and it's not been a good result, hence we should be really
careful before going down it again. I don't consider that to be
fearmongering, nor that we should be dismissing that concern out of
hand.

Yes, having a storage layer makes a lot of sense here, with features
that are understood by the core system and which each driver
understands, and then having a filter system which is also pluggable and
can support things like compression and hashing for this would also be
great.

It's good to know that you prefer a C interface to one based on shell
scripting. I hope that we will also get some other opinions on that
question, as my own feelings are somewhat divided (but with some bias
toward trying to make the shell scripting thing work, because I
believe it gives a lot more practical flexibility).

Yes, I do prefer a C interface. One might even say "been there, done
that." Hopefully sharing such experience is still useful to do on these
lists.

Thanks,

Stephen

#4 Robert Haas
robertmhaas@gmail.com
In reply to: Stephen Frost (#3)

On Wed, Apr 8, 2020 at 2:06 PM Stephen Frost <sfrost@snowman.net> wrote:

* Robert Haas (robertmhaas@gmail.com) wrote:

On Wed, Apr 8, 2020 at 1:05 PM Stephen Frost <sfrost@snowman.net> wrote:

What if %f.bz2 already exists?

That cannot occur in the scenario I described.

Of course it can.

Not really. The steps I described involved creating a new directory.
Yeah, in theory, somebody could inject a file into that directory
after we created it and before bzip2 writes any files into it, but
pg_basebackup already has the exact same race condition.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#5 Stephen Frost
sfrost@snowman.net
In reply to: Robert Haas (#4)

Greetings,

* Robert Haas (robertmhaas@gmail.com) wrote:

On Wed, Apr 8, 2020 at 2:06 PM Stephen Frost <sfrost@snowman.net> wrote:

* Robert Haas (robertmhaas@gmail.com) wrote:

On Wed, Apr 8, 2020 at 1:05 PM Stephen Frost <sfrost@snowman.net> wrote:

What if %f.bz2 already exists?

That cannot occur in the scenario I described.

Of course it can.

Not really. The steps I described involved creating a new directory.
Yeah, in theory, somebody could inject a file into that directory
after we created it and before bzip2 writes any files into it, but
pg_basebackup already has the exact same race condition.

With pg_basebackup, at least we could reasonably fix that race
condition.
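
For example, since pg_basebackup creates the output files itself, it
could open them with O_EXCL and refuse to clobber anything injected
into the directory, which a '%f'-style shell command can't promise
(a plain '>' redirection happily truncates or reuses an existing
file). A sketch, for illustration only:

    /* Sketch: refuse to reuse an output file that already exists. */
    #include <errno.h>
    #include <fcntl.h>
    #include <stdio.h>

    static int
    create_backup_file(const char *path)
    {
        int fd = open(path, O_WRONLY | O_CREAT | O_EXCL, 0600);

        if (fd < 0)
        {
            if (errno == EEXIST)
                fprintf(stderr, "output file \"%s\" already exists\n", path);
            else
                perror(path);
        }
        return fd;      /* -1 on failure */
    }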

Thanks,

Stephen