Notes on converting from MySQL 5.0.x to PostgreSQL 8.1.4

Started by Jason McManusalmost 20 years ago38 messagesgeneral

mcmanus.jason@gmail.com

almost 20 years ago

On Converting from MySQL 5.0.x to PostgreSQL 8.1.x
--------------------------------------------------

I am in the process of converting a couple of major sites from MySQL 5.0.22
to PostgreSQL 8.1.4, and I thought I would share some of my observations on
this process and the two database systems in general. I feel I am in a good
position to do this, as I am fairly familiar in-depth with MySQL's modern
features, am relatively technology-agnostic, and having done this since
the MySQL project has matured substantially over the last couple of years, I
feel I can give insight into the features of both systems from a current
perspective.

I will attempt to do this from an objective standpoint, as I still see
benefits and drawbacks to both systems (no software will ever meet the needs
of every situation). If you are looking for a document that mocks one system
or another, this is not it; there are countless results for "X sucks" on
$your_favorite_search_engine.

My reasons for making this switch are primarily due to having a bit of spare
time, wanting to expand my horizons and familiarize myself with another
well-respected open source project, some of the nice in-built procedural
language features of PostgreSQL, and basically wanting to form my own
opinion of the features of both systems. That being said, I have really
been impressed so far with the features I am discovering, and becoming
happier each day.

It is by no means comprehensive; I have just highlighted some of the
more notable and obvious differences as I discovered them. Most of the
information in here is pretty basic for those of you very familiar with
both systems. I also only highlight the differences in the Unix/POSIX
versions of these programs. (I don't run Windows as a server, and I don't
care to. My technology agnosticism stops at inadequate systems.)

All comments are current as of the time of writing (Spring/Summer 2006).
Error corrections are very welcome. Flames can go to /dev/null. Don't care.
Oh, and since I wrote this in vim, it is best read with a monospace font. :^)

Major differences I have noted:
-------------------------------

MySQL 5.0.x:
* Multiple storage engines with different features.
* Supports multi-insert syntax (INSERT INTO foo VALUES (1,2), (3,4) ...)
* A few more access controls on features built-in to the GRANT tables. Many
of these are still present, but implemented in other ways in PostgreSQL.
* Single AUTO_INCREMENT column allowed per table.
* Easy, built-in and extensive replication support.
* Single datastore location per server.
* ALL Stored Procedures are kept in the mysql system database.

PostgreSQL 8.1.x:
* Embedded procedures in multiple native languages (stored procedures and
functions can be written in native Perl, Python, Tcl, SQL, PL/PgSQL)
* Extensive and versatile procedural language functionality.
* User-definable data types and operators.
* Multiple sequence generators allowed per table.
* Replication support still rudimentary.
* Stored procedures are kept (somewhat more logically, imho) in the
corresponding databases.
* Multiple datastore locations possible using tablespaces concept.
(For the record, MySQL will have tablespaces when 5.1.x is stabilized.)
* Most system variables, "built-in" types and features configurable as they
are just kept in a system catalog.
* Allows deletions and subselects to specify the same table (e.g. DELETE
FROM foo WHERE id IN (SELECT id FROM foo WHERE...) ). MySQL does not
allow this as of 5.0.22.
* Copious documentation on the database internals, for extending the
database itself.

Pointers, tips, quick facts and gotchas for other people converting:
--------------------------------------------------------------------

* Don't bother using an old version, just go for 8.1.4 (or whatever is new
at the time of your conversion. This should be common sense.)
* Since Pg uses a full transactional storage engine, the speed is roughly
comparable to InnoDB, rather than the stock MyISAM format.
* PostgreSQL's TCP port is 5432 by default.
* The main server process on PostgreSQL is 'postmaster'.
* 'postmaster' can be controlled via the 'pg_ctl' command.
* The administrative user is called 'postgres' by default.
* Like MySQL, Pg uses the system user as default, if no username is
specified when connecting.
* The command-line client is called 'psql'.
* PostgreSQL by default comes configured to disallow network connections.
To enable these, you must follow these steps:
1. Edit $DATADIR/pg_hba.conf and add access permissions.
2. Edit $DATADIR/postgresql.conf and uncomment the listen_addresses
line, setting it to something reasonable.
3. Restart postmaster.
* PostgreSQL relies extensively upon quick aliases for common features within
the CLI shell. MySQL offers many similar features, but they aren't used
as much from what I have observed.
* MySQL combines the concepts of 'database' and 'schema' into one. PostgreSQL
differentiates the two. While the hierarchy in MySQL is
database.table.field, PostgreSQL is roughly: database.schema.table.field.
A schema is a 'logically grouped set of tables but still kept within a
particular database.' This could allow separate applications to be built
that still rely upon the same database, but can be kept somewhat logically
separated. The default schema in each database is called 'public', and is
the one referred to if no others are specified. This can be modified with
'SET search_path TO ...'.
* Pg uses a 'template1' pseudo-database that can be tailored to provide
default objects for new database creation, if you should desire. It
obviously also offers a 'template0' database that is read-only and
offers a barebones database, more equivalent to the empty db created with
mysql's CREATE DATABASE statement.
* Pg's ROLEs can specify a single user or a group, and be nested to contain
multiple users.
* Pg's default character set (in 8.1.4) is UTF8.
* Pg uses the 'serial' column type instead of AUTO_INCREMENT. This allows
more than one independent sequence to be specified per table (though the
utility of this may be of dubious value). These are closer to Oracle's
concept of sequence generators, and they can be manipulated with the
currval(), nextval(), setval(), and lastval() functions.
* Pg requires its tables and databases be 'vacuumed' regularly to remove
completed transaction snapshots and optimize the tables on disk. It is
necessary because the way that PostgreSQL implements true MVCC is by
writing all temporary transactions to disk and setting a visibility
flag for the record. Vacuuming can be performed automatically, and in
a deferred manner by using vacuum_cost settings to limit it to low-load
periods or based upon numerous other criteria. See the manual for more
information.
* Kept internally in Pg, there is a concept called the OID, which is a
continuously incremented number used to assign unique IDs to system
objects. This allows the database to store and refer uniquely to user
operators, new databases, basically anything that the system needs to
refer to in the 'data directory', regardless of user-defined names.
* Most administrative procedures will refuse to run as root, and require you
to su to the 'postgres' system user to perform the action.
* PgAdminIII gives you a great overview of the hierarchy of system objects
throughout the server. Even though you may administrate your server
primarily via the CLI, as I do, it is still valuable during the learning
process to use this tool to browse around the various objects, to learn
the system setup and hierarchy visually.
* While MySQL supports transactions with the InnoDB databases, many MySQL
users generally do not use them extensively enough. With Pg, due to the
behaviour of the server in attempting to ensure data integrity in a
variety of situations (client disconnection, network trouble, server
crashes, etc.), it is highly advisable to become familiar and utilize
transactions a lot more, to ensure your DATA is left in a consistent state
before and after every change you wish to make.
* There is a conversion utility called 'mysql2pgsql' that will convert
dump files from the mysqldump format, to a format that psql can
understand. It is available at:
- http://gborg.postgresql.org/project/mysql2psql/projdisplay.php
* To turn on query time output, similar to the mysql CLI, use the '\timing'
command from psql. (Note that the time is displayed in milliseconds,
whereas in the mysql client it is displayed in seconds.)

Common equivalents:
-------------------

MySQL PostgreSQL
----- -----------
mysql database system tables
mysql psql
mysqld postmaster
mysqladmin initdb/dropdb/createuser/dropuser/
createlang/droplang/vacuumdb
mysqldump pg_dump/pg_dumpall/pg_restore
-nothing- ecpg
SHOW DATABASES; \l
SHOW GRANTS; \du
SHOW TABLES; \dt
DESC tblname; \d foo
USE dbname; \c dbname
ALTER TABLE foo AUTO_INCREMENT = n; SELECT setval('seq_name',n);
SHOW PROCESSLIST; SELECT * FROM pg_stat_activity;
OPTIMIZE TABLE ... VACUUM ...

Final thoughts:
---------------
Overall, I have been happy and very impressed with the features offered by
PostgreSQL 8.1.4, and believe that I will be using it for the majority of my
future projects. There are still some niches where I see utility for the
pluggable storage engines and raw speed of MySQL. I will be keeping my eye
on that project closely, also, as I want to test out the new Scheduled
Events feature from the 5.1 series once it is stabilized.

I have also been using Apress' "Beginning Databases with PostgreSQL, 2nd
Edition" (2005) as my learning material. I briefly examined Sams'
"PostgreSQL - The Comprehensive Guide, 2nd Ed." (2005) and Pearson's
"PostgreSQL: Introduction and Concepts" (2001), but found the Apress book
the best of the 3 that I had access to. YMMV. More info in the book
reviews linked below.

Further information:
--------------------
PostgreSQL manual: http://www.postgresql.org/docs/manuals/
PgAdmin3: http://www.pgadmin.org/
PostgreSQL book reviews:
http://techdocs.postgresql.org/techdocs/bookreviews.php
Community Support Channel: irc.freenode.net in #postgresql
Varlena Consulting's General Bits archives:
http://www.varlena.com/GeneralBits/

---------

Thank you, and I hope that these notes prove helpful to others!

Jason McManus

Tom Lane

tgl@sss.pgh.pa.us

almost 20 years ago

In reply to: Jason McManus (#1)

Re: Notes on converting from MySQL 5.0.x to PostgreSQL 8.1.4

"Jason McManus" <mcmanus.jason@gmail.com> writes:

I am in the process of converting a couple of major sites from MySQL 5.0.22
to PostgreSQL 8.1.4, and I thought I would share some of my observations on
this process and the two database systems in general.

Nice notes! I see only one small error:

* Pg's default character set (in 8.1.4) is UTF8.

I don't believe there is any fixed "default character set". Each
installation will have a default locale and encoding, but these depend
on the locale under which initdb was run. From the above comment I
surmise that you initdb'd under some UTF8-using locale ...

Thank you, and I hope that these notes prove helpful to others!

Perhaps they should go on the project website somewhere?

regards, tom lane

Jason McManus

mcmanus.jason@gmail.com

almost 20 years ago

In reply to: Tom Lane (#2)

Re: Notes on converting from MySQL 5.0.x to PostgreSQL 8.1.4

I don't believe there is any fixed "default character set". Each
installation will have a default locale and encoding, but these depend
on the locale under which initdb was run. From the above comment I
surmise that you initdb'd under some UTF8-using locale ...

Ah, great. Thank you for the clarification. I did indeed run initdb
on different systems, but they were all under UTF-8, so I guess I had
drawn the wrong conclusion.

Perhaps they should go on the project website somewhere?

I was thinking so as well, but I'm not sure where, or who to submit to
if there is no wiki or other external editing method. Any pointers
for who to contact/where to submit?

Thanks,
Jason

Tom Lane

tgl@sss.pgh.pa.us

almost 20 years ago

In reply to: Jason McManus (#3)

Re: Notes on converting from MySQL 5.0.x to PostgreSQL 8.1.4

"Jason McManus" <mcmanus.jason@gmail.com> writes:

Perhaps they should go on the project website somewhere?

I was thinking so as well, but I'm not sure where, or who to submit to
if there is no wiki or other external editing method. Any pointers
for who to contact/where to submit?

Not my department, but if none of the project's web people follow up
in the next few hours, try inquiring on pgsql-docs or pgsql-www.

regards, tom lane

Scott Marlowe

smarlowe@g2switchworks.com

almost 20 years ago

In reply to: Jason McManus (#1)

Re: Notes on converting from MySQL 5.0.x to PostgreSQL

I agree with Tom, nice notes. I noted a few minor issues that seem to
derive from a familiarity with MySQL. I'll put my corrections below...

On Fri, 2006-06-30 at 08:17, Jason McManus wrote:

On Converting from MySQL 5.0.x to PostgreSQL 8.1.x
--------------------------------------------------
Major differences I have noted:
-------------------------------

MySQL 5.0.x:

* Easy, built-in and extensive replication support.

Not sure how extensive it is. It's basically synchronous single master
single slave, right? It is quite easy though.

PostgreSQL 8.1.x:
* Embedded procedures in multiple native languages (stored procedures and
functions can be written in native Perl, Python, Tcl, SQL, PL/PgSQL)

Note that there are a dozen or more other languages as well. Just FYI.
Off the top of my head, plPHP, plJ (java there's two different java
implementations, I think) and plR (R is the open source equivalent of
the S statistics language)

* Replication support still rudimentary.

Hmmmm. I think that's an overly simplistic evaluation. The slony
replication engine is actually VERY advanced, but the administrative
tools consist mostly of "your brain". hehe. That said, once you've
learned how to drive it, it's quite amazing. Keep in mind, slony can be
applied to a living database while it's running, and can run between
different major versions of postgresql. That's a pretty advanced
feature. Plus, if the replication daemons die (kill -9ed or whatever)
you can restart replication and slony will come right back where it was
and catch up.

Pointers, tips, quick facts and gotchas for other people converting:
--------------------------------------------------------------------

* MySQL combines the concepts of 'database' and 'schema' into one. PostgreSQL
differentiates the two. While the hierarchy in MySQL is
database.table.field, PostgreSQL is roughly: database.schema.table.field.
A schema is a 'logically grouped set of tables but still kept within a
particular database.' This could allow separate applications to be built
that still rely upon the same database, but can be kept somewhat logically
separated. The default schema in each database is called 'public', and is
the one referred to if no others are specified. This can be modified with
'SET search_path TO ...'.

This is a VERY good analysis of the difference between the two
databases.

* Pg uses a 'template1' pseudo-database that can be tailored to provide
default objects for new database creation, if you should desire. It
obviously also offers a 'template0' database that is read-only and
offers a barebones database, more equivalent to the empty db created with
mysql's CREATE DATABASE statement.

This isn't quite right.

template0 is a locked and "pure" copy of the template database. It's
there for "break glass in case of emergency" use. :)

template1, when you first initdb, is exactly the same as template0, but
you can connect to it, and alter it. Both of these are "real"
postgresql databases. template1 is the database that gets copied by
default when you do "create database". Note that you can also define a
different template database when running create database, which lets you
easily clone any database on your machine. "create database newdb with
template olddb"

* Pg uses the 'serial' column type instead of AUTO_INCREMENT. This allows
more than one independent sequence to be specified per table (though the
utility of this may be of dubious value). These are closer to Oracle's
concept of sequence generators, and they can be manipulated with the
currval(), nextval(), setval(), and lastval() functions.

Don't forget 64bit bigserials too.

* Pg requires its tables and databases be 'vacuumed' regularly to remove
completed transaction snapshots and optimize the tables on disk. It is
necessary because the way that PostgreSQL implements true MVCC is by
writing all temporary transactions to disk and setting a visibility
flag for the record. Vacuuming can be performed automatically, and in
a deferred manner by using vacuum_cost settings to limit it to low-load
periods or based upon numerous other criteria. See the manual for more
information.

Interestingly enough, MySQL's innodb tables do almost the exact same
thing, but their vacuum process is wholly automated. Generally, this
means fewer issues pop up for the new dba, but when they do, they can be
a little harder to deal with. It's about a wash. Of course, as you
mentioned earlier, most mysql folks aren't using innodb.

* While MySQL supports transactions with the InnoDB databases, many MySQL
users generally do not use them extensively enough. With Pg, due to the
behaviour of the server in attempting to ensure data integrity in a
variety of situations (client disconnection, network trouble, server
crashes, etc.), it is highly advisable to become familiar and utilize
transactions a lot more, to ensure your DATA is left in a consistent state
before and after every change you wish to make.

A point you might want to throw in here is that EVERYTHING in postgresql
is a transaction. If you don't issue a begin statement, then postgresql
runs each statement you type in inside its own transaction.

This means that inserting 10,000 rows without wrapping them inside an
explicit transaction results in 10,000 individual transactions.

However, the more interesting thing here, is that every statement,
including DDL is transactable, except for a couple of big odd ones, like
create database. So, in postgresql, you can do:

begin;
create table xyz...
alter table abc...
insert into abc select * from iii
update iii...;
drop table iii;
(oops, I messed up something)
rollback;

and there's no change and no lost data. Quite impressive actually.

Common equivalents:
-------------------

MySQL PostgreSQL
----- -----------
OPTIMIZE TABLE ... VACUUM ...

vacuum and analyze for optimize I think. Also, possibly reindex,
although nominally that's the "sledge hammer" of optimization.

One last thing I'd mention that I REALLY like about PostgreSQL over any
other database I've used is that the psql interface has a complete
syntax lookup feature that is WAY cool. \h brings it up, and \h COMMAND
where COMMAND is the command you want to look up will bring up the
syntax for your command.

And, I hate the fact that CTRL-C in the mysql command line tool exits
the tool instead of interrupting the current query. In PostgreSQL it
interrupts the current query. CTRL-\ will kill the client if you need
to.

Overall, a great review. Thanks.

Dave Page

dpage@pgadmin.org

almost 20 years ago

In reply to: Tom Lane (#4)

Re: Notes on converting from MySQL 5.0.x to PostgreSQL 8.1.4

-----Original Message-----
From: pgsql-general-owner@postgresql.org
[mailto:pgsql-general-owner@postgresql.org] On Behalf Of Tom Lane
Sent: 30 June 2006 15:47
To: Jason McManus
Cc: pgsql-general@postgresql.org
Subject: Re: [GENERAL] Notes on converting from MySQL 5.0.x
to PostgreSQL 8.1.4

"Jason McManus" <mcmanus.jason@gmail.com> writes:

Perhaps they should go on the project website somewhere?

I was thinking so as well, but I'm not sure where, or who

to submit to

if there is no wiki or other external editing method. Any pointers
for who to contact/where to submit?

Not my department, but if none of the project's web people follow up
in the next few hours, try inquiring on pgsql-docs or pgsql-www.

Documentation such as this can be added to the new techdocs area on the
main site at http://www.postgresql.org/docs/techdocs under the relevant
section (probably http://www.postgresql.org/docs/techdocs.3 in this
case).

Please note that the editting interface is still new and may still have
a quirk or two...

Interesting notes BTW Jason - thanks for sharing.

Regards, Dave.

Ron Johnson

ron.l.johnson@cox.net

almost 20 years ago

In reply to: Scott Marlowe (#5)

Re: Notes on converting from MySQL 5.0.x to PostgreSQL

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Scott Marlowe wrote:
[snip]

However, the more interesting thing here, is that every
statement, including DDL is transactable, except for a couple of
big odd ones, like create database. So, in postgresql, you can do:

begin;
create table xyz...
alter table abc...
insert into abc select * from iii
update iii...;
drop table iii;
(oops, I messed up something)
rollback;

But isn't that what it means to be "transactional"? Or am I spoiled
by my "big, expensive enterprise database"?

- --
Ron Johnson, Jr.
Jefferson LA USA

Is "common sense" really valid?
For example, it is "common sense" to white-power racists that
whites are superior to blacks, and that those with brown skins
are mud people.
However, that "common sense" is obviously wrong.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.3 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFEpUGPS9HxQb37XmcRAiJwAJ9/A/N/OgmslveSsX3Xym2QnDQz1gCghPD0
YX882Kv81hzZ4AKjaIVKHg8=
=Gsml
-----END PGP SIGNATURE-----

Dave Page

dpage@pgadmin.org

almost 20 years ago

In reply to: Scott Marlowe (#5)

Re: Notes on converting from MySQL 5.0.x to PostgreSQL

-----Original Message-----
From: pgsql-general-owner@postgresql.org
[mailto:pgsql-general-owner@postgresql.org] On Behalf Of Scott Marlowe
Sent: 30 June 2006 16:12
To: Jason McManus
Cc: pgsql general
Subject: Re: [GENERAL] Notes on converting from MySQL 5.0.x
to PostgreSQL

* Replication support still rudimentary.

Hmmmm. I think that's an overly simplistic evaluation. The slony
replication engine is actually VERY advanced, but the administrative
tools consist mostly of "your brain". hehe. That said, once you've
learned how to drive it, it's quite amazing.

I'm not sure that many people necessarily realise it, but you can also
drive Slony directly from pgAdmin 1.4+ if slonik scripts give you a
headache.

Regards, Dave.

Alban Hertroys

alban@magproductions.nl

almost 20 years ago

In reply to: Ron Johnson (#7)

Re: Notes on converting from MySQL 5.0.x to PostgreSQL

Ron Johnson wrote:

begin;
create table xyz...
alter table abc...
insert into abc select * from iii
update iii...;
drop table iii;
(oops, I messed up something)
rollback;

But isn't that what it means to be "transactional"? Or am I spoiled
by my "big, expensive enterprise database"?

Well, according to my colleague here this wasn't possible until now
(partially!) in Oracle 10. Meaning it's not common-place even among
enterprise db's.

--
Alban Hertroys
alban@magproductions.nl

magproductions b.v.

T: ++31(0)534346874
F: ++31(0)534346876
M:
I: www.magproductions.nl
A: Postbus 416
7500 AK Enschede

// Integrate Your World //

#10

Tom Lane

tgl@sss.pgh.pa.us

almost 20 years ago

In reply to: Scott Marlowe (#5)

Re: Notes on converting from MySQL 5.0.x to PostgreSQL

Scott Marlowe <smarlowe@g2switchworks.com> writes:

On Fri, 2006-06-30 at 08:17, Jason McManus wrote:

* Replication support still rudimentary.

Hmmmm. I think that's an overly simplistic evaluation. The slony
replication engine is actually VERY advanced, but the administrative
tools consist mostly of "your brain". hehe. That said, once you've
learned how to drive it, it's quite amazing. Keep in mind, slony can be
applied to a living database while it's running, and can run between
different major versions of postgresql. That's a pretty advanced
feature. Plus, if the replication daemons die (kill -9ed or whatever)
you can restart replication and slony will come right back where it was
and catch up.

It might be worth pointing out that mysql's replication falls over
if you so much as look at it crosseyed. I have not had to use it
for production purposes, but I can tell you that the mysql replication
regression tests fail ... irreproducibly of course ... almost one time
in two in Red Hat's build environment. I've been able to trace a few of
these failures to quirks of the build environment, like trying to build
x86 and x86_64 at the same time in different chroots of the same machine
(must take care not to use same TCP port numbers for tests), but it
still seems flaky as hell.

regards, tom lane

#11

Leif B. Kristensen

leif@solumslekt.org

almost 20 years ago

In reply to: Scott Marlowe (#5)

Re: Notes on converting from MySQL 5.0.x to PostgreSQL

On Friday 30. June 2006 17:12, Scott Marlowe wrote:

And, I hate the fact that CTRL-C in the mysql command line tool exits
the tool instead of interrupting the current query.

I agree, it's a nuisance.

In PostgreSQL it
interrupts the current query. CTRL-\ will kill the client if you need
to.

Or Ctrl-D, which also is a common way of terminating a command-line
interface, like the Python interpreter. It's much easier to remember
than the Ctrl-\ .
--
Leif Biberg Kristensen | Registered Linux User #338009
http://solumslekt.org/ | Cruising with Gentoo/KDE

#12

paul rivers

rivers.paul@gmail.com

almost 20 years ago

In reply to: Ron Johnson (#7)

Re: Notes on converting from MySQL 5.0.x to PostgreSQL

Out of curiosity, which "big, expensive enterprise database" are you spoiled
by? Many that I support do not allow DDL within an transaction, or if they
allow it, there are many caveats and rules.

-----Original Message-----
From: pgsql-general-owner@postgresql.org
[mailto:pgsql-general-owner@postgresql.org] On Behalf Of Ron Johnson
Sent: Friday, June 30, 2006 8:22 AM
To: pgsql general
Subject: Re: [GENERAL] Notes on converting from MySQL 5.0.x to PostgreSQL

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Scott Marlowe wrote:
[snip]

However, the more interesting thing here, is that every
statement, including DDL is transactable, except for a couple of
big odd ones, like create database. So, in postgresql, you can do:

begin;
create table xyz...
alter table abc...
insert into abc select * from iii
update iii...;
drop table iii;
(oops, I messed up something)
rollback;

But isn't that what it means to be "transactional"? Or am I spoiled
by my "big, expensive enterprise database"?

- --
Ron Johnson, Jr.
Jefferson LA USA

iD8DBQFEpUGPS9HxQb37XmcRAiJwAJ9/A/N/OgmslveSsX3Xym2QnDQz1gCghPD0
YX882Kv81hzZ4AKjaIVKHg8=
=Gsml
-----END PGP SIGNATURE-----

---------------------------(end of broadcast)---------------------------
TIP 2: Don't 'kill -9' the postmaster

#13

Tom Lane

tgl@sss.pgh.pa.us

almost 20 years ago

In reply to: Ron Johnson (#7)

Re: Notes on converting from MySQL 5.0.x to PostgreSQL

Ron Johnson <ron.l.johnson@cox.net> writes:

Scott Marlowe wrote:

However, the more interesting thing here, is that every
statement, including DDL is transactable, except for a couple of
big odd ones, like create database. So, in postgresql, you can do:

But isn't that what it means to be "transactional"? Or am I spoiled
by my "big, expensive enterprise database"?

Being able to roll back DDL (table-schema modifications) isn't that
common. Since PG keeps most of its schema information in tables,
we have it easier than some other systems supporting DDL rollback,
but it's still tricky. As an example, a long time ago we used to
name table files after the table and database directories after the
database, which made it easy to see what was what under $PGDATA,
but prevented a lot of DDL from being transactional. For instance

BEGIN;
DROP TABLE foo;
CREATE TABLE foo (some-new-definition);
ROLLBACK;

couldn't work because there would need to be two physical files named
foo in the interim until you commit or roll back. ALTER TABLE RENAME
had some related problems. Now we name all the filesystem objects using
OIDs that can be chosen to never collide, even if they belong to
database objects with similar names.

Last I checked, mysql was still using table names for file names,
so they're on the wrong side of this.

regards, tom lane

#14

Ron Johnson

ron.l.johnson@cox.net

almost 20 years ago

In reply to: paul rivers (#12)

Re: Notes on converting from MySQL 5.0.x to PostgreSQL

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

paul rivers wrote:

Out of curiosity, which "big, expensive enterprise database" are
you spoiled by? Many that I support do not allow DDL within an
transaction, or if they allow it, there are many caveats and
rules.

Oracle Rdb. Built by DEC back in the early 1980s. It's had
tablespaces (Storage Areas in Rdb parlance) since the late 80s.

Tables (including the system catalog, which itself is a set of
tables) all go in a Storage Area. If you don't specify one, it does
in the default: RDB$SYSTEM.

Creating a table is no more than inserting records into a few system
tables, and allocating a few pages in the relevant Storage Area.
Thus, rolling back most all DDL is built deep into the engine.

http://www.oracle.com/technology/products/rdb/index.html

- --
Ron Johnson, Jr.
Jefferson LA USA

iD8DBQFEpVAZS9HxQb37XmcRArUOAKDFNtLVqr9BeYi7k6nhp/GnVI7M6QCfV7hJ
wNUUCx2sGUmRklxtwu6hoUA=
=CLgH
-----END PGP SIGNATURE-----

#15

David Fetter

david@fetter.org

almost 20 years ago

In reply to: Dave Page (#8)

Re: Notes on converting from MySQL 5.0.x to PostgreSQL

On Fri, Jun 30, 2006 at 04:22:28PM +0100, Dave Page wrote:

* Replication support still rudimentary.

Hmmmm. I think that's an overly simplistic evaluation. The slony
replication engine is actually VERY advanced, but the
administrative tools consist mostly of "your brain". hehe. That
said, once you've learned how to drive it, it's quite amazing.

I'm not sure that many people necessarily realise it, but you can
also drive Slony directly from pgAdmin 1.4+ if slonik scripts give
you a headache.

Last I checked, pgAdmin 1.4 doesn't help setting up clusters, which is
one of the major headaches of a Slony-I setup. I also noticed that
pgAdmin 1.6-to-be has at least some of those hooks. Any ETA on that?

Cheers,
D
--
David Fetter <david@fetter.org> http://fetter.org/
phone: +1 415 235 3778 AIM: dfetter666
Skype: davidfetter

Remember to vote!

#16

David Fetter

david@fetter.org

almost 20 years ago

In reply to: Tom Lane (#10)

Re: Notes on converting from MySQL 5.0.x to PostgreSQL

On Fri, Jun 30, 2006 at 11:39:04AM -0400, Tom Lane wrote:

Scott Marlowe <smarlowe@g2switchworks.com> writes:

On Fri, 2006-06-30 at 08:17, Jason McManus wrote:

* Replication support still rudimentary.

It might be worth pointing out that mysql's replication falls over
if you so much as look at it crosseyed. I have not had to use it
for production purposes, but I can tell you that the mysql
replication regression tests fail ... irreproducibly of course ...
almost one time in two in Red Hat's build environment.

Are those tests, or at least descriptions of them, available?

Cheers,
D
--
David Fetter <david@fetter.org> http://fetter.org/
phone: +1 415 235 3778 AIM: dfetter666
Skype: davidfetter

Remember to vote!

#17

Tom Lane

tgl@sss.pgh.pa.us

almost 20 years ago

In reply to: David Fetter (#16)

Re: Notes on converting from MySQL 5.0.x to PostgreSQL

David Fetter <david@fetter.org> writes:

On Fri, Jun 30, 2006 at 11:39:04AM -0400, Tom Lane wrote:

It might be worth pointing out that mysql's replication falls over
if you so much as look at it crosseyed. I have not had to use it
for production purposes, but I can tell you that the mysql
replication regression tests fail ... irreproducibly of course ...
almost one time in two in Red Hat's build environment.

Are those tests, or at least descriptions of them, available?

Sure, it's just the standard "make test" sequence in mysql's source.
If you want to do exactly what I'm talking about, grab the latest
mysql SRPM off the Fedora download server and "rpmbuild --rebuild" it.
There's a very long regression test suite (much larger than ours :-()
and when it fails, it's invariably in one of the replication-related
tests.

regards, tom lane

#18

David Fetter

david@fetter.org

almost 20 years ago

In reply to: Tom Lane (#17)

Re: Notes on converting from MySQL 5.0.x to PostgreSQL

On Fri, Jun 30, 2006 at 01:41:53PM -0400, Tom Lane wrote:

David Fetter <david@fetter.org> writes:

On Fri, Jun 30, 2006 at 11:39:04AM -0400, Tom Lane wrote:

It might be worth pointing out that mysql's replication falls
over if you so much as look at it crosseyed. I have not had to
use it for production purposes, but I can tell you that the mysql
replication regression tests fail ... irreproducibly of course
... almost one time in two in Red Hat's build environment.

Are those tests, or at least descriptions of them, available?

Sure, it's just the standard "make test" sequence in mysql's source.

Uh oh. I'm a little worried about writing tests based on GPLed code
for Slony-I or other replication systems. Might these need to be
clean-roomed?

Cheers,
D
--
David Fetter <david@fetter.org> http://fetter.org/
phone: +1 415 235 3778 AIM: dfetter666
Skype: davidfetter

Remember to vote!

#19

Tom Lane

tgl@sss.pgh.pa.us

almost 20 years ago

In reply to: David Fetter (#18)

Re: Notes on converting from MySQL 5.0.x to PostgreSQL

David Fetter <david@fetter.org> writes:

On Fri, Jun 30, 2006 at 01:41:53PM -0400, Tom Lane wrote:

Sure, it's just the standard "make test" sequence in mysql's source.

Uh oh. I'm a little worried about writing tests based on GPLed code
for Slony-I or other replication systems. Might these need to be
clean-roomed?

Oh, is that what you wanted 'em for? Probably. My recollection from
the few that I've really looked at is that they're pretty mysql-specific
anyway (eg, some of them are actual "regression" tests to catch
reappearance of old mysql bugs).

regards, tom lane

#20

Chris Browne

cbbrowne@acm.org

almost 20 years ago

In reply to: Jason McManus (#1)

Re: Notes on converting from MySQL 5.0.x to PostgreSQL

smarlowe@g2switchworks.com (Scott Marlowe) writes:

I agree with Tom, nice notes. I noted a few minor issues that seem to
derive from a familiarity with MySQL. I'll put my corrections below...

On Fri, 2006-06-30 at 08:17, Jason McManus wrote:

On Converting from MySQL 5.0.x to PostgreSQL 8.1.x
--------------------------------------------------
Major differences I have noted:
-------------------------------

MySQL 5.0.x:

* Easy, built-in and extensive replication support.

Not sure how extensive it is. It's basically synchronous single master
single slave, right? It is quite easy though.

And it's statement-based, is it not?

Indications are that MySQL replication is quite non-deterministic, as
a result; if you use SYSDATE() in INSERT/UPDATE queries to set
timestamps, replicas will get the wrong time.

It looks like anything that is dynamically evaluated will be processed
incorrectly on replicas, such as timezones.

It is possible for the data on the master and slave to become
different if a statement is designed in such a way that the data
modification is non-deterministic; that is, left to the will of the
query optimizer.

It's multi-slave, mind you...

* Replication support still rudimentary.

Hmmmm. I think that's an overly simplistic evaluation. The slony
replication engine is actually VERY advanced, but the administrative
tools consist mostly of "your brain". hehe. That said, once you've
learned how to drive it, it's quite amazing. Keep in mind, slony
can be applied to a living database while it's running, and can run
between different major versions of postgresql. That's a pretty
advanced feature. Plus, if the replication daemons die (kill -9ed
or whatever) you can restart replication and slony will come right
back where it was and catch up.

And you can trust that the data that is replicated will actually be
faithfully replicated, even in the presence of timestamps, triggers,
and other things that challenge determinism...

* Pg uses a 'template1' pseudo-database that can be tailored to provide
default objects for new database creation, if you should desire. It
obviously also offers a 'template0' database that is read-only and
offers a barebones database, more equivalent to the empty db created with
mysql's CREATE DATABASE statement.

This isn't quite right.

template0 is a locked and "pure" copy of the template database. It's
there for "break glass in case of emergency" use. :)

template1, when you first initdb, is exactly the same as template0, but
you can connect to it, and alter it. Both of these are "real"
postgresql databases. template1 is the database that gets copied by
default when you do "create database". Note that you can also define a
different template database when running create database, which lets you
easily clone any database on your machine. "create database newdb with
template olddb"

In the last few weeks, we've had fun using "createdb --template=" to
create test copies of production databases (well, replicas thereof...).

Creating a replica via Slony-I takes several hours, for large
databases, as it has to load data into tables, then generate indexes.

We've used "createdb" on such databases; the longest it took to set up
an "extra duplicate" was something like 8 minutes, and that gave our
sysadmins full copies of the production databases that could be used
for testing... The speed was *stunning*...
--
(format nil "~S@~S" "cbbrowne" "cbbrowne.com")
http://cbbrowne.com/info/multiplexor.html
"How much more helpful could I be than to provide you with the
appropriate e-mail address? I could engrave it on a clue-by-four and
deliver it to you in Chicago, I suppose." -- Seen on Slashdot...

#21

Chris Browne

cbbrowne@acm.org

almost 20 years ago

In reply to: Jason McManus (#1)

#22

Scott Marlowe

smarlowe@g2switchworks.com

almost 20 years ago

In reply to: Chris Browne (#20)

#23