PGSQL or other DB?
Dear PG Users!
I don't want to start a heated debate
with this subject, but I want to ask about your experiences, because we
need to build a new, specialized site and we want to know which DB is
the best for this...
This website will support up to 200 corporations.
The functions are similar, but because these corporations differ in
many ways (what data they keep, how they work with it), we want to
create many databases (one DB per corporation).
The site uses one main database that handles all public and shared
services and stores the links to the sub-databases.
The website will run on apache/mod_python, and each corporation's
services are handled by a separate python module (backed by a separate
database). The main structure of the source code is already written.
But: we need to decide which database we will use for this project in
the future.
The main viewpoints:
- quick (re)connect - because mod_python basically does not store
database connections persistently
- fast queries
- an easy-to-use IDE (like pgadmin)
- the db server can handle many databases (webdb_nnn where nnn is an
integer)
- I can add/modify a table, or a field of a table, without a "full lock"
on the table (like the DBISAM restructure). Like in FireBird, where "add
field" changes only the table description. I don't know whether PG
supports this way of modifying the DB.
- a quick and easy backup/restore system
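(To show what I mean by webdb_nnn, a small Python sketch; the helper
name and the zero-padding are only illustrative, not real code from
our site:)

```python
def corp_database_name(corp_id):
    """Map a corporation id to its webdb_nnn database name.

    The three-digit zero-padded suffix is an assumption; adjust it
    to however the corporations are actually numbered.
    """
    if not 0 <= corp_id <= 999:
        raise ValueError("corp_id out of range for a three-digit suffix")
    return "webdb_%03d" % corp_id

# e.g. corp_database_name(7) gives 'webdb_007'
```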
Another important thing that I don't understand (from what I saw) is
that PostgreSQL stores the databases in one inseparable file set, in a
directory named data.
In other databases, like DBISAM, FireBird and MySQL, the databases are
separated into their own directories/files.
This "one datadir" seems not too good for us. We used DBISAM at our
clients, and many times when we got some filesystem error, we could
simply recover the tables - from the files.
When we want to back up or restore one database, we can do it in two
ways: a.) archive all the files, b.) make an sql dump of the database.
If a file gets corrupted in a database, then we can restore the data
from the files, and the filesystem error causes problems only for that
database, not for all of them.
I am very afraid of keeping all databases in one place, because if
corruption occurs, possibly more of them get damaged (if they are not
separated).
I cannot make a filesystem-based (hard) copy of one db (only an SQL
dump is possible).
Ok, I saw that pgsql supports "tablespaces", but as far as I saw, this
feature can only hold table data in another directory; a full database
cannot be separated out with it.
Because I have not used one PGSQL instance with many databases (up to
200), I don't know what happens and which cases are possible. But I
think you have a lot of experience with it. Please share it with me!
Please help me, because we need to decide which DB to use.
I started using PG last month, and I liked it except for the way the
data is stored (one data dir).
I tried MySQL before I tried PG. InnoDB seems to be a "transaction
system forced onto MyISAM". And the boolean data type is missing
(solved with enum?).
I don't like it at all, but it seems to be fast with small tables, and
it separates the database files into their own directories, which I
like. Possibly it has many limitations that I didn't see at first.
Please help me decide which DB is good for us, and how to configure
and use PGSQL with this set of databases that we need.
Thanks for your help:
dd
On Fri, Jan 30, 2009 at 08:37:02PM +0100, durumdara wrote:
This website will support up to 200 corporations.
The functions are similar, but because these corporations differ in
many ways (what data they keep, how they work with it), we want to
create many databases (one DB per corporation).
Databases or schemas? They are different in PG and they have very
different behaviors which suit different use cases.
The site uses one main database that handles all public and shared
services and stores the links to the sub-databases.
Sounds like you want to use schemas.
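A rough sketch of what the schema approach could look like; all names
and DDL below are invented for illustration, not taken from your
design, and the helper just builds the statements:

```python
def schema_ddl(corp_id):
    # One schema per corporation inside a single database; shared
    # tables stay in the default "public" schema.
    schema = "corp_%03d" % corp_id
    return [
        "CREATE SCHEMA %s;" % schema,
        # Per-corp tables then live under that schema:
        "CREATE TABLE %s.invoices (id serial PRIMARY KEY, total numeric);" % schema,
        # A session for this corp just sets the search path:
        "SET search_path TO %s, public;" % schema,
    ]
```

One dump, one backup, one set of shared tables, but each corporation's
objects stay namespaced.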
The main viewpoints:
- quick (re)connect - because mod_python basically does not store
database connections persistently
?? mod_python can be used to do persistent connections as well as
connection pooling.
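If you don't want a full pooling package, the idea itself is simple
enough to sketch; here a dummy factory stands in for a real
psycopg2.connect call, so treat it as an illustration only:

```python
import queue

class SimplePool:
    """Tiny fixed-size connection pool; get() blocks when empty."""

    def __init__(self, factory, size=5):
        self._pool = queue.Queue(maxsize=size)
        # Open all connections up front; checkout/checkin just moves
        # them in and out of the queue, avoiding reconnect cost.
        for _ in range(size):
            self._pool.put(factory())

    def get(self):
        return self._pool.get()

    def put(self, conn):
        self._pool.put(conn)

# In a real app the factory would be something like
# lambda: psycopg2.connect(dsn); a plain object stands in here.
pool = SimplePool(factory=object, size=2)
conn = pool.get()
# ... run queries with conn ...
pool.put(conn)
```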
- fast queries
They're as fast as you write them! PG generally does better with higher
concurrent loads than MySQL (never used DBISAM or FireBird).
- the db server can handle many databases (webdb_nnn where nnn is an
integer)
There's a summary of limits in the about page:
http://www.postgresql.org/about/
- I can add/modify a table, or a field of a table, without a "full lock"
on the table (like the DBISAM restructure). Like in FireBird, where "add
field" changes only the table description. I don't know whether PG
supports this way of modifying the DB.
Yes; have a read about MVCC:
http://www.postgresql.org/docs/current/static/mvcc.html
- Quick and easy backup/restore system
pg_dump works well; how quick it is depends on the amount of data you
have, obviously
Another important thing that I don't understand (from what I saw) is
that PostgreSQL stores the databases in one inseparable file set, in a
directory named data.
In other databases, like DBISAM, FireBird and MySQL, the databases are
separated into their own directories/files.
PG's scheme works well; one file per table doesn't work very well in
practice. There will always be links between tables (foreign keys being
the obvious one), and hence selectively restoring individual tables on
a file-by-file basis seems like an *amazing* way of killing your database.
This "one datadir" seems not too good for us. We used DBISAM at our
clients, and many times when we got some filesystem error, we could
simply recover the tables - from the files.
Use a proper backup system that takes consistent snapshots of your data;
anything else will come back and bite you when you really don't want it
to.
I am very afraid of keeping all databases in one place, because if
corruption occurs, possibly more of them get damaged (if they are not
separated).
I cannot make a filesystem-based (hard) copy of one db (only an SQL
dump is possible).
There are much better tools available to solve things than treating
tables as files! As far as backups go, there are several options
of varying complexity and resilience. pg_dump is simple, you can do
filesystem level backups with a bit of fiddling, you can keep another
database fed with data from the main one with varying levels of lag.
http://www.postgresql.org/docs/current/static/backup.html
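For example, dumping all 200 databases is easy to script. This sketch
only builds the command lines (the names are invented); you'd actually
run them with subprocess or from cron:

```python
def dump_command(dbname, outfile):
    # -Fc = custom format: compressed, and usable later for
    # selective restore of individual tables with pg_restore.
    return ["pg_dump", "-Fc", "-f", outfile, dbname]

# One dump file per corporation database.
commands = [dump_command("webdb_%03d" % n, "webdb_%03d.dump" % n)
            for n in range(1, 201)]
```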
--
Sam http://samason.me.uk/
This "one datadir" seems not too good for us. We used DBISAM at our
clients, and many times when we got some filesystem error, we could
simply recover the tables - from the files.
pg_dump files with the custom format can be used to selectively restore
tables. It's really easy and a lot more reliable than filesystem-level
backups, which require snapshotting, shutting down the database, or
in-database support to get clean backups.
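For illustration (file, table and database names invented), the
selective restore boils down to a pg_restore invocation like this; the
sketch just assembles the arguments:

```python
def restore_table_command(dumpfile, table, dbname):
    # pg_restore can pull a single table out of a custom-format
    # (-Fc) dump; no need to touch files under the data directory.
    return ["pg_restore", "-d", dbname, "-t", table, dumpfile]

cmd = restore_table_command("webdb_007.dump", "invoices", "webdb_007")
```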
I am very afraid of keeping all databases in one place, because if
corruption occurs, possibly more of them get damaged (if they are not
separated). I cannot make a filesystem-based (hard) copy of one db
(only an SQL dump is possible).
If you consistently have hardware problems that cause filesystem corruption,
you should probably consider fixing that before worrying too much about
which software is on top of it being corrupted.
And, honestly, if keeping all the files in one place is a major problem
for you, then don't use PostgreSQL. It's not like it's going to be
changed in this regard.
--
Alan
On Jan 30, 2009, at 2:37 PM, durumdara wrote:
Dear PG Users!
I don't want to start a heated debate with this subject, but I want to
ask about your experiences, because we need to build a new, specialized
site and we want to know which DB is the best for this...
This website will support up to 200 corporations.
The functions are similar, but because these corporations differ in
many ways (what data they keep, how they work with it), we want to
create many databases (one DB per corporation).
The site uses one main database that handles all public and shared
services and stores the links to the sub-databases.
The website will run on apache/mod_python, and each corporation's
services are handled by a separate python module (backed by a separate
database). The main structure of the source code is already written.
But: we need to decide which database we will use for this project.
The main viewpoints:
- quick (re)connect - because mod_python basically does not store
database connections persistently
I don't know much about Python except the animal, but you could use an
SQL proxy to solve this issue.
- fast queries
Define fast, but remember a DB was never designed to be fast. However,
it sounds like for the couple of hundred companies you wouldn't have
too many problems there.
- an easy-to-use IDE (like pgadmin)
- the db server can handle many databases (webdb_nnn where nnn is
an integer)
I don't have experience with PG and hundreds of DBs, but I think it
shouldn't be a problem.
- I can add/modify a table, or a field of a table, without a "full
lock" on the table (like the DBISAM restructure). Like in FireBird,
where "add field" changes only the table description. I don't know
whether PG supports this way of modifying the DB.
How big are these tables and how frequently do they change? If they
change really frequently then you have a problem with your design in
general.
- Quick and easy backup/restore system
pg_dump/pg_restore are your friends. webmin makes this also easy if
you want a point and click interface.
Another important thing that I don't understand (from what I saw) is
that PostgreSQL stores the databases in one inseparable file set, in a
directory named data.
In other databases, like DBISAM, FireBird and MySQL, the databases are
separated into their own directories/files.
This "one datadir" seems not too good for us. We used DBISAM at our
clients, and many times when we got some filesystem error, we could
simply recover the tables - from the files.
When we want to back up or restore one database, we can do it in two
ways: a.) archive all the files, b.) make an sql dump of the database.
If a file gets corrupted in a database, then we can restore the data
from the files, and the filesystem error causes problems only for that
database, not for all of them.
I am very afraid of keeping all databases in one place, because if
corruption occurs, possibly more of them get damaged (if they are not
separated).
I cannot make a filesystem-based (hard) copy of one db (only an SQL
dump is possible).
Why would just one file get corrupt?? You can store your
tables/indexes in different places if you want to, but if one gets
corrupt you need to change/check/replace hardware anyway. PostgreSQL
doesn't behave much like MySQL, where all of a sudden tables get
corrupted and you need to repair them.
Ok, I saw that pgsql supports "tablespaces", but as far as I saw, this
feature can only hold table data in another directory; a full database
cannot be separated out with it.
Correct...
Because I have not used one PGSQL instance with many databases (up to
200), I don't know what happens and which cases are possible. But I
think you have a lot of experience with it. Please share it with me!
Don't expect to just put one table and/or DB back from files. This is
plain wrong, and you need to make appropriate backups. If you really
want to restore a DB as a single file then use sqlite, but that defeats
your requirement of 'fast'.
Please help me, because we need to determine which DB to use.
I think PG will do very well in your situation, but you have to get
your mind off using files.
I started using PG last month, and I liked it except for the way the
data is stored (one data dir).
I don't know any DB (except maybe some object DBs and sqlite) where
you can do that properly, officially, and in an ACID-compliant way.
I tried MySQL before I tried PG. InnoDB seems to be a "transaction
system forced onto MyISAM". And the boolean data type is missing
(solved with enum?).
I don't like it at all, but it seems to be fast with small tables, and
it separates the database files into their own directories, which I
like. Possibly it has many limitations that I didn't see at first.
Don't want to debate MySQL, but PG is a better option IMHO; it recovers
VERY well from the crashes you are worried about (I know, I live in
Ecuador where the power goes down every week or so, and it has always
recovered perfectly).
Please help me decide which DB is good for us, and how to configure
and use PGSQL with this set of databases that we need.
PostgreSQL is good for you as long as you set your mind away from
restoring a DB by replacing a fileset. pg_dump/pg_restore are your
friends. If you do care about restoring a DB up to a point in time you
can do WAL shipping.
Thanks for your help:
dd
regards, Ries van Twisk
-------------------------------------------------------------------------------------------------
Ries van Twisk
tags: Freelance TYPO3 Glassfish JasperReports JasperETL Flex Blaze-DS
WebORB PostgreSQL DB-Architect
email: ries@vantwisk.nl
web: http://www.rvantwisk.nl/
skype: callto://r.vantwisk
Phone: +1-810-476-4196
SIP: +1-747-690-5133
durumdara wrote:
Another important thing that I don't understand (from what I saw) is
that PostgreSQL stores the databases in one inseparable file set, in a
directory named data.
In other databases, like DBISAM, FireBird and MySQL, the databases are
separated into their own directories/files.
This "one datadir" seems not too good for us. We used DBISAM at our
clients, and many times when we got some filesystem error, we could
simply recover the tables - from the files.
When we want to back up or restore one database, we can do it in two
ways: a.) archive all the files, b.) make an sql dump of the database.
Postgresql uses a database cluster. In the data directory each database
goes into a directory identified by the database OID, and in this
directory the database resides in 2 GB chunks (as far as I remember).
You can easily back up the cluster using the file system; you just have
to make sure you stop the postmaster, and then you can back up the
entire data dir or an individual database dir. You have to restore this
to the same version of PG, though.
It's the same kind of deal with Firebird: you can't back up a running
database with the file system, and even Firebird must be stopped in
order to do a file system copy of the database file.
Firebird also stores all its transaction data in the same file, so you
end up with HUGE files unless you do a backup and restore to shrink them
down. You don't have this problem with PostgreSQL, as it stores that
info in the WAL.
PostgreSQL will give you the best solution for your project, hands down;
just give it a chance and don't worry about what you did with Firebird
or MySQL.
I have never had a PostgreSQL database or dump file become corrupt, but
on Firebird it happens fairly often and it's part of the reason why I
eventually dumped it and moved to PostgreSQL.
Hope that helps you out some.
Tony Caduto
AM Software Design
Home of Lightning Admin for PostgreSQL
http://www.amsoftwaredesign.com
durumdara wrote:
- quick (re)connect - because mod_python basically does not store
database connections persistently
You should consider using a Python connection pool with something like
Pylons (http://pylonshq.com), I use dbutils:
http://www.webwareforpython.org/DBUtils/Docs/UsersGuide.html
You can scale PostgreSQL and Python web apps really well with that as
long as you are
not using CGI. I use Pylons combined with mod_wsgi, but you can use it
with mod_python as well.
Tony Caduto
AM Software Design
Home of Lightning Admin for PostgreSQL
http://www.amsoftwaredesign.com
On Fri, 30 Jan 2009, durumdara wrote:
- quick (re)connect - because mod_python basically does not store
database connections persistently
If this turns out to be a problem, you may want to use some sort of
connection pooling software in order to reduce this overhead.
http://wiki.postgresql.org/wiki/Replication%2C_Clustering%2C_and_Connection_Pooling#Connection_Pooling_and_Acceleration
gives a quick intro to this topic with links to some good solutions here.
This "one datadir" seems not too good for us. We used DBISAM at our
clients, and many times when we got some filesystem error, we could
simply recover the tables - from the files.
It's quite easy for ISAM solutions to get corrupted. Barring a bad
hardware failure, this shouldn't ever happen to a PostgreSQL database.
The write-ahead log implementation never leaves you in a state where the
database can't recover itself after a crash or filesystem error. You
should consider this a positive thing--this whole class of problems that
you think you need a solution to, needing to fix corrupted database files
regularly, shouldn't ever happen in a proper database implementation.
The first two sections of
http://www.postgresql.org/docs/8.3/static/wal.html give a good
introduction to this topic.
Rolling back to a backup should only be necessary after a serious hardware
disaster--the sort of situation where it's unlikely you'll be able to
trust any of your individual tables even if you had the option of
repairing them one at a time.
I tried MySQL before I tried PG. InnoDB seems to be a "transaction
system forced onto MyISAM". And the boolean data type is missing
(solved with enum?).
I don't like it at all, but it seems to be fast with small tables, and
it separates the database files into their own directories, which I
like. Possibly it has many limitations that I didn't see at first.
http://wiki.postgresql.org/wiki/Why_PostgreSQL_Instead_of_MySQL:_Comparing_Reliability_and_Speed_in_2007
has a lot of commentary in this area. MySQL+MyISAM will have the same
sort of problems you should be trying to escape. MySQL+InnoDB would be
much better. In addition to missing features, the other thing you should
be careful about is making speed measurements with a low client
count--make sure you evaluate any database you're considering with as
many clients going at once as you can simulate.
--
* Greg Smith gsmith@gregsmith.com http://www.gregsmith.com Baltimore, MD
Alan Hodgson wrote:
And, honestly, if keeping all the files in one place is a major problem
for you, then don't use PostgreSQL. It's not like it's going to be
changed in this regard.
Um, I think it already did:
http://www.postgresql.org/docs/8.3/interactive/sql-createtablespace.html
Cheers,
Steve
On Fri, Jan 30, 2009 at 1:04 PM, Sam Mason <sam@samason.me.uk> wrote:
On Fri, Jan 30, 2009 at 08:37:02PM +0100, durumdara wrote:
- I can add/modify a table, or a field of a table, without a "full
lock" on the table (like the DBISAM restructure). Like in FireBird,
where "add field" changes only the table description. I don't know
whether PG supports this way of modifying the DB.
Yes; have a read about MVCC:
Just because one uses MVCC doesn't mean it's free to add a new column
to a table. Adding a nullable column in pgsql is trivial. Adding an
index can be done concurrently. InnoDB supports MVCC, but MySQL's way
of doing things results in innodb tables being rewritten from scratch
for both of those for varying reasons, resulting in a locked table the
whole time.
Create index concurrently is a treasure to behold on large production
systems. I can't stress how important it is that I can create an
index on demand in the middle of the heaviest user activity on my
system and nothing hangs or locks from the user side waiting on it.
Same for adding a column.
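For concreteness, the two online operations look like this in SQL; the
sketch below only assembles the statements (table, column, and index
names are invented), since running them needs a live server, and note
that CREATE INDEX CONCURRENTLY can't run inside a transaction block:

```python
def online_change_statements(table, column, coltype, index_name):
    return [
        # Adding a nullable column only touches the system catalog;
        # the table's data is not rewritten.
        "ALTER TABLE %s ADD COLUMN %s %s;" % (table, column, coltype),
        # Builds the index without taking a write-blocking lock.
        "CREATE INDEX CONCURRENTLY %s ON %s (%s);"
        % (index_name, table, column),
    ]
```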
- Quick and easy backup/restore system
pg_dump works well; how quick it is depends on the amount of data you
have, obviously
More important than quick and easy is reliable, predictable, and
capable of being run on a live, working database (the backup, that is).
pg_dump definitely is all of those things.
This "one datadir" seems not too good for us. We used DBISAM at our
clients, and many times when we got some filesystem error, we could
simply recover the tables - from the files.
Do you get a lot of filesystem errors? Or corrupted table errors?
Having table corruption be a common problem means something somewhere
is broken, whether it be your hardware or the db you're running. I've
been running pgsql for a long time, going back to the tail end of the
6.x era. I've never once had a corrupt table on a production server.
I am not alone.
Use a proper backup system that takes consistent snapshots of your data;
anything else will come back and bite you when you really don't want it
to.
I feel like I'm bathed with light and getting religious, but yes, this.
I am very afraid of keeping all databases in one place, because if
corruption occurs, possibly more of them get damaged (if they are not
separated).
I cannot make a filesystem-based (hard) copy of one db (only an SQL
dump is possible).
You need to get out of the habit of playing in your databases files.
Any database that requires you to routinely do that is broken by
design.
I get the feeling you've learned how to use a hammer (the file system
level manipulation thing) to fix a problem that's very nail like (file
corruption in mysql) and are looking for a place to use your hammer in
pgsql. Pgsql has no such nails sticking out of it.
On Jan 30, 2009, at 11:37 AM, durumdara wrote:
The main viewpoints:
- quick (re)connect - because mod_python basically does not store
database connections persistently
mod_python is not a database adaptor. Put another way, mod_python
does not make database connections. If you're using Python then to
connect to Postgres you'll most likely be using psycopg2 to connect to
and query your database. I'm not sure what your previous database
experience is but it sounds like you may be thinking in terms of PHP's
persistent connections and on that I'll just say that PHP's persistent
connections are not true persistent connections from your top-level
application's perspective. If you want connection pooling then there
are options available, pgpool-II and pgbouncer are probably the two
most commonly used ones today.
- fast queries
Stop and think about that one for a moment. Do you really expect to
go to a specific database's mailing list(s) with that requirement
expecting anything other than "We got you covered!"
- an easy-to-use IDE (like pgadmin)
I think you just answered your own question.
- the db server can handle many databases (webdb_nnn where nnn is
an integer)
I know people hate hearing this answer but that's the wrong question.
What matters isn't how many databases can be run (and, if you ever run
into a database server in this day and age where it does, run) but
rather how many concurrent connections you expect as well as both the
total data set size and how much of that data set will be constantly
worked with, i.e. needs to be in memory. Let's say you have 10 5G
databases with each averaging 10% of their total size in use (in
memory, hopefully) at any given moment. That's entirely different
from 10 10G databases with each averaging 10% in memory and both are
different from running 200 databases each with 5% in use at any given
moment. All of that being said, a properly configured Postgres
installation, provisioned for the workload and built with a little
knowledge, will scale with the best of 'em.
- I can add/modify a table, or a field of a table, without a "full
lock" on the table (like the DBISAM restructure). Like in FireBird,
where "add field" changes only the table description. I don't know
whether PG supports this way of modifying the DB.
Nope. PostgreSQL is an all or nothing transactional database. I'd
never heard of DBISAM before you mentioned it and have never used
Firebird. After doing a little reading it turns out that if you plan
to use transactions at all (which is very likely given even just the
little you've described about the applications you're building) then
you should realize that altering tables is not compatible with
transactions and doing so will automatically commit any open
transactions on the table. Looking into Firebird, I couldn't find how
it handles (or doesn't handle) that at all, but I did see that it will
happily let you add a new not null column with no default to a table
by writing nulls for the new attribute for any existing columns. That
already makes me queasy.
- Quick and easy backup/restore system
Well, again, that depends. I believe pgAdmin has some nice menu
options to handle simple cases (others can either add to or correct me
there) but as to the core Postgres distribution there's still a bit of
ambiguity arising from cases of "there's more than one way to do it"
even for basic tasks as well as some inconsistent UI aspects to the
existing tools (see the pg_dump/pg_restore/psql comments in the recent
Pet Peeves thread).
Another important thing that I don't understand (from what I saw) is
that PostgreSQL stores the databases in one inseparable file set, in a
directory named data.
In other databases, like DBISAM, FireBird and MySQL, the databases are
separated into their own directories/files.
This "one datadir" seems not too good for us. We used DBISAM at our
clients, and many times when we got some filesystem error, we could
simply recover the tables - from the files.
Way off. While I didn't look into the storage layouts for DBISAM or
Firebird, I can tell you that in any MySQL server instance all of the
tables will be under (to some depth) the same top level data directory,
with each database in its own directory under that and the table files
for each database under those. I think you may be able to use symlinks
for MyISAM table files, but InnoDB tables will croak if you try to do
that. What's more, by default InnoDB does *not* use separate files per
table and instead stores all InnoDB tables in one file with its own
internal file system. Admittedly, I think most people turn on the
innodb_file_per_table option such that each table will get its own data
and index files, but even then all of a given table's data and indexes
will reside in the same file. Each of those InnoDB table data+index
files is referred to as an InnoDB tablespace (but that's really
different from what most people expect a tablespace to be, see next
paragraph). One consequence of that is that adding an index to a table
requires that the entire table and all of its indexes be rewritten.
PostgreSQL also follows the one-directory-per-database setup. Each
database directory is located directly underneath the base/ directory
in the main Postgres data directory. One big difference is that table
data and index data are in separate files, and multiple indexes for a
given table are all in separate files. This can be viewed as the
low-level embodiment of the fact that in PostgreSQL indexes are not
part of the tables they are for but rather separate, but tightly
coupled, database objects. With this scheme, adding a new index (which
doesn't even require the table to be locked any more) only needs to
write the new index file, not rewrite the table data and previously
existing indexes as well. What's more, Postgres allows you to create
real tablespaces so that you can place individual persistent database
objects (databases, tables, indexes, and some constraints) on separate
storage.
When we want to back up or restore one database, we can do it in two
ways: a.) archive all the files, b.) make an sql dump of the database.
Both options are available with Postgres although, as has been noted
by others, option 'a' requires other tools in order to back up a
consistent data set.
If a file gets corrupted in a database, then we can restore the data
from the files, and the filesystem error causes problems only for that
database, not for all of them.
I am very afraid of keeping all databases in one place, because if
corruption occurs, possibly more of them get damaged (if they are not
separated).
I cannot make a filesystem-based (hard) copy of one db (only an SQL
dump is possible).
Again, I'm not sure what you mean. There are different kinds of
corruption that can be found in files. Database files could be
considered corrupt if the database screws something up and writes bad
data to a file. That would be localized to the files it wrote to in
the bad operation (and is extremely rare with Postgres), whereas what I
think of as filesystem corruption often doesn't see file boundaries.
Ok, I saw that pgsql supports "tablespaces", but as far as I saw, this
feature can only hold table data in another directory; a full database
cannot be separated out with it.
No, you can put entire databases in/on different tablespaces.
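For example (the names and path are invented), a whole database can be
created in its own tablespace; this sketch only builds the statements:

```python
def tablespace_statements(space_name, location, dbname):
    return [
        # The location must be an empty directory owned by the
        # postgres user.
        "CREATE TABLESPACE %s LOCATION '%s';" % (space_name, location),
        # The entire database then lives under that directory.
        "CREATE DATABASE %s TABLESPACE %s;" % (dbname, space_name),
    ]
```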
Because I have not used one PGSQL instance with many databases (up to
200), I don't know what happens and which cases are possible. But I
think you have a lot of experience with it. Please share it with me!
Please help me, because we need to decide which DB to use.
I started using PG last month, and I liked it except for the way the
data is stored (one data dir).
I tried MySQL before I tried PG. InnoDB seems to be a "transaction
system forced onto MyISAM". And the boolean data type is missing
(solved with enum?).
I don't like it at all, but it seems to be fast with small tables, and
it separates the database files into their own directories, which I
like. Possibly it has many limitations that I didn't see at first.
Well, if you don't want transactions then stay away from Postgres, and
realize that Postgres actually gives you more built-in support for
placing your data at different locations than MySQL.
Please help me decide which DB is good for us, and how to configure
and use PGSQL with this set of databases that we need.
Just realize that when choosing a complex technology solution like a
database there really is no "yes" or "no" answer or "here's the right
way to do it" and "here's the wrong way". You need to define your
requirements and part of that is asking questions. For example, 'how
to configure..' -- if you seriously ask a question like that here
you're going to get more questions (what kind of IO subsystem? how
many concurrent connections do you need to support? how much memory do
you have? how large is your data set? etc...) than you will answers.
Erik Jones, Database Administrator
Engine Yard
Support, Scalability, Reliability
866.518.9273 x 260
Location: US/Pacific
IRC: mage2k
On Sat, Jan 31, 2009 at 2:13 AM, Erik Jones <ejones@engineyard.com> wrote:
On Jan 30, 2009, at 11:37 AM, durumdara wrote:
- I can add/modify a table, or a field of a table, without a "full
lock" on the table (like the DBISAM restructure). Like in FireBird,
where "add field" changes only the table description. I don't know
whether PG supports this way of modifying the DB.
Nope. PostgreSQL is an all or nothing transactional database. I'd never
heard of DBISAM before you mentioned it and have never used Firebird.
After doing a little reading, it turns out that if you plan to use
transactions at all (which is very likely given even just the little
you've described about the applications you're building) then you
should realize that altering tables is not compatible with
transactions, and doing so will automatically commit any open
transactions on the table.
Are you talking about pgsql or some other database? Everything in pgsql
can be done in a transaction, except create / drop database /
tablespace.
Looking into Firebird, I couldn't find how it handles (or doesn't
handle) that at all, but I did see that it will happily let you add a
new not null column with no default to a table by writing nulls for
the new attribute for any existing columns. That already makes me
queasy.
That's pretty much what pgsql does. Why does it make you queasy?
Scott Marlowe wrote:
On Sat, Jan 31, 2009 at 2:13 AM, Erik Jones <ejones@engineyard.com> wrote:
On Jan 30, 2009, at 11:37 AM, durumdara wrote:
- I can add/modify a table, or a field of a table, without a "full
lock" on the table (like the DBISAM restructure). Like in FireBird,
where "add field" changes only the table description. I don't know
whether PG supports this way of modifying the DB.
Nope. PostgreSQL is an all or nothing transactional database. I'd never
heard of DBISAM before you mentioned it and have never used Firebird.
After doing a little reading, it turns out that if you plan to use
transactions at all (which is very likely given even just the little
you've described about the applications you're building) then you
should realize that altering tables is not compatible with
transactions, and doing so will automatically commit any open
transactions on the table.
Are you talking about pgsql or some other database? Everything in pgsql
can be done in a transaction, except create / drop database /
tablespace.
Looking into Firebird, I couldn't find how it handles (or doesn't
handle) that at all, but I did see that it will happily let you add a
new not null column with no default to a table by writing nulls for
the new attribute for any existing columns. That already makes me
queasy.
That's pretty much what pgsql does. Why does it make you queasy?
I think the key is that the new column is NOT NULL, so defaulting the
new column's values to NULL results in immediate data integrity
inconsistency.
If I remember rightly, PG doesn't allow this: you have to create the
column as NULL, UPDATE and then add the NOT NULL constraint, or perhaps
(I haven't tried this) create the column with a default and then remove
it immediately afterwards.
On Sun, Feb 1, 2009 at 7:33 AM, Russ Brown <pickscrape@gmail.com> wrote:
Scott Marlowe wrote:
On Sat, Jan 31, 2009 at 2:13 AM, Erik Jones <ejones@engineyard.com> wrote:
On Jan 30, 2009, at 11:37 AM, durumdara wrote:
Looking into Firebird I couldn't find how it handles (or doesn't handle) that at all, but I did see that it will happily let you add a new not null column with no default to a table by writing nulls for the new attribute in any existing rows. That already makes me queasy.

That's pretty much what pgsql does. Why does it make you queasy?

I think the key is that the new column is NOT NULL, so defaulting the new column's values to NULL results in immediate data integrity inconsistency.

If I remember rightly, PG doesn't allow this: you have to create the column as NULL, UPDATE and then add the NOT NULL constraint, or perhaps (I haven't tried this) create the column with a default and then remove it immediately afterwards.
OK, I completely misunderstood what the other poster meant. Pgsql
does NOT allow creating the not null column with nulls in place. That
would make me queasy too. Creating it with a default works in
postgresql.
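As a sketch of the two approaches discussed above (table, column, and default value are illustrative):

```sql
-- Disallowed on a non-empty table: existing rows would violate the constraint
-- ALTER TABLE any_table ADD COLUMN c integer NOT NULL;

-- Works: every existing row gets the default value
ALTER TABLE any_table ADD COLUMN c integer NOT NULL DEFAULT 0;

-- Also works: add as nullable, backfill, then add the constraint
ALTER TABLE any_table ADD COLUMN d integer;
UPDATE any_table SET d = 0;
ALTER TABLE any_table ALTER COLUMN d SET NOT NULL;
```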
On Jan 31, 2009, at 9:36 AM, Scott Marlowe wrote:
On Sat, Jan 31, 2009 at 2:13 AM, Erik Jones <ejones@engineyard.com> wrote:
On Jan 30, 2009, at 11:37 AM, durumdara wrote:
- I can add/modify a table, or a field of a table, without a "full lock" on the table (like a DBISAM restructure). In FireBird, for example, "add field" changes only the table description. I don't know whether PG supports this way of modifying the DB.

Nope. PostgreSQL is an all or nothing transactional database. I'd never heard of DBISAM before you mentioned it and have never used Firebird. After doing a little reading it turns out that if you plan to use transactions at all (which is very likely given even just the little you've described about the applications you're building) then you should realize that altering tables is not compatible with transactions and doing so will automatically commit any open transactions on the table.

Are we talking about pgsql or some other database? Everything in pgsql can be done in a transaction, except create / drop database / tablespace.

I was referring to DBISAM there.

Looking into Firebird I couldn't find how it handles (or doesn't handle) that at all, but I did see that it will happily let you add a new not null column with no default to a table by writing nulls for the new attribute in any existing rows. That already makes me queasy.

That's pretty much what pgsql does. Why does it make you queasy?
Another poster already beat me to answering this one so I'll not
repeat what they said.
Hi!
On 2009.01.31 at 10:13, Erik Jones wrote:
On Jan 30, 2009, at 11:37 AM, durumdara wrote:
The main viewpoints:
- quick (re)connect - because mod_python basically does not store the database connections persistently

mod_python is not a database adaptor.

Yes, I know. But because mod_python (apache) is commonly configured as
"1 interpreter / 1 request", I need to reconnect with the adapter.
But I tested it, and it seems to have enough speed. If not, I must
use some proxy server to keep the connections alive.
rewrite the table data and previously existing indexes as well.
What's more, Postgres allows you to create real tablespaces so that
you can place individual persistent database objects (databases,
tables, indexes, and some constraints) on separate storage.
I want to ask: if I create a database in another tablespace, is the
database name still stored in the "main" data storage (like global metadata)?
And if I make a table or other object in this database, do I need to
define the tablespace name, or is it taken from the database's
tablespace?
For example:
create db anything tablespace "new2"
Are these SQLs different?
create table any_table (a int)
or
create table any_table (a int) tablespace "new2"
So: do I need to give a tablespace clause in every table/object creation,
or not?
And: if I store this database on another drive, and it is unmounted
(for any reason - like a failure, etc.), does that cause any problems
for the postgresql main databases (for example: does the service stop)?
I am asking about "after restarting the pg service", not online usage.
So if I remove some tablespace path (directory, drive) at the filesystem
level, can the pg service come up and serve the main databases that it can access?
Or does it die completely on this problem?
Thanks for your help:
dd
On Feb 2, 2009, at 12:23 AM, durumdara wrote:
Hi!
On 2009.01.31 at 10:13, Erik Jones wrote:
On Jan 30, 2009, at 11:37 AM, durumdara wrote:
The main viewpoints:
- quick (re)connect - because mod_python basically does not store the
database connections persistently

mod_python is not a database adaptor.

Yes, I know. But because mod_python (apache) is commonly configured
as "1 interpreter / 1 request", I need to reconnect with the adapter.
But I tested it, and it seems to have enough speed. If not, I
must use some proxy server to keep the connections alive.

There are various connection pooling packages available for that,
such as pgpool or pgbouncer.
rewrite the table data and previously existing indexes as well.
What's more, Postgres allows you to create real tablespaces so that
you can place individual persistent database objects (databases,
tables, indexes, and some constraints) on separate storage.

I want to ask: if I create a database in another tablespace, is
the database name still stored in the "main" data storage (like global
metadata)?
And if I make a table or other object in this database, do I need to
define the tablespace name, or is it taken from the database's
tablespace?
For example:
create db anything tablespace "new2"
Are these SQLs different?
create table any_table (a int)
or
create table any_table (a int) tablespace "new2"
So: do I need to give a tablespace clause in every table/object
creation, or not?
No, if you create a database in a given tablespace then all
tables created in that database will go in that tablespace unless you
specify otherwise.
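A sketch of the above using the names from the example (the directory path is an assumption; it must already exist, be empty, and be owned by the postgres user):

```sql
-- the tablespace is defined cluster-wide, pointing at a directory:
CREATE TABLESPACE new2 LOCATION '/mnt/disk2/pg_tblspc';
CREATE DATABASE anything TABLESPACE new2;

-- connected to "anything": objects land in new2 by default
CREATE TABLE any_table (a int);

-- an explicit clause still overrides the database default:
CREATE TABLE other_table (a int) TABLESPACE pg_default;
```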
And: if I store this database on another drive, and it is unmounted
(for any reason - like a failure, etc.), does that cause any
problems for the postgresql main databases (for example: does the
service stop)? I am asking about "after restarting the pg service", not online usage.
So if I remove some tablespace path (directory, drive) at the
filesystem level, can the pg service come up and serve the main
databases that it can access?
Or does it die completely on this problem?
IIRC, the database should run fine and will just throw errors if you
try to access a database or table in a tablespace that isn't mounted.
Thanks for your help:
dd
You're welcome :)