UNICODE

Started by Jean-Michel POURE, about 24 years ago, 29 messages

jm.poure@freesurf.fr

about 24 years ago

Dear all,

I am running PostgreSQL 7.1.2 with UNICODE support in production.
Maybe I am missing something about UNICODE:

CREATE TABLE "test" (
"source_oid" serial,
"source_timestamp" timestamp,
"source_creation" date DEFAULT 'now',
"source_modification" date DEFAULT 'now',
"source_content" text
);

INSERT INTO test (source_content) VALUES ('Photocopie du permis de
construire accepté.');

Now, when trying:
SELECT * FROM test WHERE source_content ILIKE '%accept%'; ---> returns the
record;
SELECT * FROM test WHERE source_content ILIKE '%accepté%' ---> returns nothing
SELECT * FROM test WHERE source_content ILIKE '%accepte%' ---> returns nothing

The same happens from ODBC, PHP and psql. Can you reproduce this?

I have tried
Best regards,
Jean-Michel POURE

Marko Kreen

marko@l-t.ee

about 24 years ago

In reply to: Jean-Michel POURE (#1)

Re: UNICODE

On Sun, Oct 28, 2001 at 09:22:24AM +0100, Jean-Michel POURE wrote:

Dear all,

I am running PostgreSQL 7.1.2 with UNICODE support in production.
Maybe I am missing something about UNICODE:

SELECT * FROM test WHERE source_content ILIKE '%accepté%' ---> returns
nothing
SELECT * FROM test WHERE source_content ILIKE '%accepte%' ---> returns
nothing

The same happens from ODBC, PHP and psql. Can you reproduce this?

1) Did you compile PostgreSQL with --enable-locale
2) Did you set correct locale for postmaster (LANG=xxx)

--
marko

Jean-Michel POURE

jm.poure@freesurf.fr

about 24 years ago

In reply to: Marko Kreen (#2)

Re: UNICODE

1) Did you compile PostgreSQL with --enable-locale

Yes.

2) Did you set correct locale for postmaster (LANG=xxx)

Database was created using CREATE db WITH ENCODING='UNICODE'.
psql: \encoding returns UNICODE.

The db stores multiple languages (French, English, Japanese).
Why should I define a *single* locale for postmaster?
Am I missing something?

Best regards,
Jean-Michel POURE

Marko Kreen

marko@l-t.ee

about 24 years ago

In reply to: Jean-Michel POURE (#1)

Re: UNICODE

On Sun, Oct 28, 2001 at 09:22:24AM +0100, Jean-Michel POURE wrote:

I am running PostgreSQL 7.1.2 with UNICODE support in production.
Maybe I am missing something about UNICODE:

CREATE TABLE "test" (
"source_oid" serial,
"source_timestamp" timestamp,
"source_creation" date DEFAULT 'now',
"source_modification" date DEFAULT 'now',
"source_content" text
);

INSERT INTO test (source_content) VALUES ('Photocopie du permis de
construire accepté.');

Now, when trying:
SELECT * FROM test WHERE source_content ILIKE '%accept%'; ---> returns the
record;
SELECT * FROM test WHERE source_content ILIKE '%accepté%' ---> returns
nothing
SELECT * FROM test WHERE source_content ILIKE '%accepte%' ---> returns
nothing

The same happens from ODBC, PHP and psql. Can you reproduce this?

Sorry, I misinterpreted what your problem is. I somehow thought
you wanted 'é' and 'e' to produce the same result. For that you need
to mess with locale, but LIKE does not use locale anyway...

Now I have reread your message, and here's a hint:

* If client_encoding == server_encoding, the bytes are put into
DB as-is - no conversion is done.

So are you absolutely sure you have UTF8 strings on the client side?
Unfortunately you can't use client_encoding=latin1, as PostgreSQL
refuses to do the conversion between them. (I am on 7.1.3)

Eg. I did the following:

* created db with encoding = UNICODE
* Put your example into test.sql
* iconv -f latin1 -t utf8 test.sql > test2.sql
* psql < test2.sql

and it worked as it should...
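
For readers without iconv at hand, the transcoding step in the list above can be sketched in Python. This is only an illustration under stated assumptions: the function name latin1_to_utf8 and the sample statement are made up for the example and are not part of the original test.

```python
def latin1_to_utf8(raw: bytes) -> bytes:
    """Re-encode ISO-8859-1 bytes as UTF-8, similar to `iconv -f latin1 -t utf8`."""
    return raw.decode("iso-8859-1").encode("utf-8")


# Hypothetical SQL text as a Latin-1 editor would save it:
# the accented 'e' is the single byte 0xE9.
latin1_sql = b"INSERT INTO test (source_content) VALUES ('accept\xe9');"

utf8_sql = latin1_to_utf8(latin1_sql)

# In UTF-8 the same character takes two bytes, 0xC3 0xA9.
print(utf8_sql)
```

The converted statement is one byte longer than the input, because the single Latin-1 byte became a two-byte UTF-8 sequence.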

--
marko

Jean-Michel POURE

jm.poure@freesurf.fr

about 24 years ago

In reply to: Marko Kreen (#4)

Re: UNICODE

I only want this query to work under Unicode:
SELECT * FROM test WHERE source_content ILIKE '%accepté%'.

* If client_encoding == server_encoding, the bytes are put into
DB as-is - no conversion is done.

So are you absolutely sure you have on client side UTF8 strings?

PostgreSQL is compiled with UNICODE and LOCALE support.
Unicode is used on both ends (PostgreSQL and psql).

Unfortunately you can't use client_encoding=latin1, as PostgreSQL
refuses to do the conversion between them. (I am on 7.1.3)

According to the on-line manual, only MULE provides instant transcoding.

Eg. I did the following:

* created db with encoding = UNICODE
* Put your example into test.sql
* iconv -f latin1 -t utf8 test.sql > test2.sql
* psql < test2.sql

and it worked as it should...

Nice to hear it works when transcoding files to UTF-8. It shows it is not a
back-end problem.

As for me, I typed INSERT INTO source_content VALUES ('Permis de conduire
accepté') in Psql.
Psql does not insert the data and I have to kill it manually. Can you
reproduce this?

Best regards,
Jean-Michel POURE

Marko Kreen

marko@l-t.ee

about 24 years ago

In reply to: Jean-Michel POURE (#5)

Re: UNICODE

On Sun, Oct 28, 2001 at 12:44:26PM +0100, Jean-Michel POURE wrote:

I only want this query to work under Unicode:
SELECT * FROM test WHERE source_content ILIKE '%accepté%'.

As I showed, it works if the data in the db is in UTF-8 and the query
string 'accepté' is in UTF-8.
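
Why the encoding of the query string matters can be illustrated with a byte-level comparison. The sketch below uses a plain substring test as a stand-in for pattern matching; that is a simplifying assumption, not how PostgreSQL implements ILIKE, and the variable names are invented for the example.

```python
# Stored row, assumed to be correctly encoded as UTF-8.
stored_utf8 = "Photocopie du permis de construire accept\u00e9.".encode("utf-8")

# The same search word, typed on a UTF-8 terminal and on a Latin-1 terminal.
pattern_utf8 = "accept\u00e9".encode("utf-8")         # ends in 0xC3 0xA9
pattern_latin1 = "accept\u00e9".encode("iso-8859-1")  # ends in 0xE9

# Only the pattern in the same encoding as the data can match.
print(pattern_utf8 in stored_utf8)    # True
print(pattern_latin1 in stored_utf8)  # False
```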

* If client_encoding == server_encoding, the bytes are put into
DB as-is - no conversion is done.

So are you absolutely sure you have on client side UTF8 strings?

PostgreSQL is compiled with UNICODE and LOCALE support.
Unicode is used on both ends (PostgreSQL and psql).

psql uses your input literally - so is your console/xterm in
UNICODE/UTF8?

Eg. I did the following:

* created db with encoding = UNICODE
* Put your example into test.sql
* iconv -f latin1 -t utf8 test.sql > test2.sql
* psql < test2.sql

and it worked as it should...

Nice to hear it works when transcoding files to UTF-8. It shows it is not a
back-end problem.

As for me, I typed INSERT INTO source_content VALUES ('Permis de conduire
accepté') in Psql.

As I said - psql does not do any conversion.

Psql does not insert the data and I have to kill it manually. Can you
reproduce this?

No. If it hangs, this is a serious problem. Or did you simply
forget the final ';'? By the way, it does not seem valid SQL to me,
considering the table structure you provided previously.

In the end: are the strings/queries you give to psql/pg_exec
UTF-8? This is now the main thing, as you have _configured_
everything correctly.

--
marko

Jean-Michel POURE

jm.poure@freesurf.fr

about 24 years ago

In reply to: Marko Kreen (#6)

Re: UNICODE

psql uses your input literally - so is your console/xterm in
UNICODE/UTF8?

Client: \encoding returns 'UNICODE'.
Server: \list show databases. All databases are UNICODE (except TEMPLATE0
and TEMPLATE1 which are ASCII of course). I use a Mandrake 8.1 distribution
and think my console is UNICODE.

As for me, I typed INSERT INTO source_content VALUES ('Permis de conduire
accepté') in Psql.

As I said - psql does not do any conversion.

The faulty query is: INSERT INTO test (source_content) VALUES ('Permis de
conduire accepté');

I just can't believe that Psql is not UTF-8 compatible. It seems unreal, as
Psql is PostgreSQL's #1 helper application. Should I use the PostgreSQL MULE
encoding to have automatic transcoding? What are the guidelines? I am
completely lost.

Psql does not insert the data and I have to kill it manually. Can you
reproduce this?

No. If it hangs, this is a serious problem. Or did you simply
forget the final ';'? By the way, it does not seem valid SQL to me,
considering the table structure you provided previously.

Is it possible that my database is corrupted? I have used pg_dump several
times to dump data from production server to development servers and
conversely. Does pg_dump produce UTF8 output? What are the guidelines when
using UTF-8: forget psql and pg_dump?

In the end: are the strings/queries you give to psql/pg_exec
UTF-8? This is now the main thing, as you have _configured_
everything correctly.

Everything is configured correctly server-side (PostgreSQL, Psql).

Thank you very much for your support Marko,
Best regards,
Jean-Michel

Marko Kreen

marko@l-t.ee

about 24 years ago

In reply to: Jean-Michel POURE (#7)

Re: UNICODE

On Sun, Oct 28, 2001 at 02:34:49PM +0100, Jean-Michel POURE wrote:

psql uses your input literally - so is your console/xterm in
UNICODE/UTF8?

Client: \encoding returns 'UNICODE'.
Server: \list show databases. All databases are UNICODE (except TEMPLATE0
and TEMPLATE1 which are ASCII of course). I use a Mandrake 8.1 distribution
and think my console is UNICODE.

You think? Try this:

$ echo "accepté" | od -c

If your term is in utf you should get:

0000000 a c c e p t 303 251 \n
0000011

If in iso-8859-1:

0000000 a c c e p t 351 \n
0000010

It may be in some other 8bit encoding too, then the last number
may be different.
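
The two od dumps above can be cross-checked without a terminal. The following Python sketch prints the octal value of each byte of the accented character in both encodings; the helper name octal_bytes is an assumption introduced for this illustration.

```python
def octal_bytes(text: str, encoding: str) -> list:
    """Return each byte of `text` in `encoding` as a 3-digit octal string."""
    return [format(b, "03o") for b in text.encode(encoding)]


# U+00E9 is LATIN SMALL LETTER E WITH ACUTE.
print(octal_bytes("\u00e9", "utf-8"))       # ['303', '251']
print(octal_bytes("\u00e9", "iso-8859-1"))  # ['351']
```

The values agree with the od -c output: two bytes in UTF-8, one byte in ISO-8859-1.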

As for me, I typed INSERT INTO source_content VALUES ('Permis de conduire
accepté') in Psql.

As I said - psql does not do any conversion.

The faulty query is: INSERT INTO test (source_content) VALUES ('Permis de
conduire accepté');

Hmm. It may be a bug in the input routines. You give PostgreSQL a
1-byte 'é', it expects a 2-byte char and overflows somewhere. Can
you reproduce it on 7.1.3? Maybe it's fixed there, I can't
reproduce it.

I just can't believe that Psql is not UTF-8 compatible. It seems unreal, as
Psql is PostgreSQL's #1 helper application. Should I use the PostgreSQL MULE
encoding to have automatic transcoding? What are the guidelines? I am
completely lost.

psql & pg_dump are fine. Your problem is that you don't give utf-8
strings to psql and pg_exec/PHP, but some iso-8859-* ones.

Psql does not insert the data and I have to kill it manually. Can you
reproduce this?

No. If it hangs, this is a serious problem. Or did you simply
forget the final ';'? By the way, it does not seem valid SQL to me,
considering the table structure you provided previously.

Is it possible that my database is corrupted? I have used pg_dump several
times to dump data from production server to development servers and
conversely. Does pg_dump produce UTF8 output? What are the guidelines when
using UTF-8: forget psql and pg_dump?

As I said, psql & pg_dump are fine, they do not touch your data
when it passes through them.

It may be that all of your database is in latin1, as you
inserted strings in this encoding, not utf8. Basically, the
PostgreSQL server also does not touch your data; only its
compare functions do not work, as the strings are not in the
encoding you say they are.

The solution to this is to dump your data, use the iconv utility
to convert it to utf8 and reload.

To see this you should do:

$ psql -c "SELECT source_content FROM test WHERE ..." \
| od -c

And then look whether the weird characters are represented in
1 or 2 bytes.
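
The same check can be automated on a dump file or on bytes fetched from the database. The sketch below is a rough heuristic under stated assumptions: the function name looks_like_utf8 is invented here, and a successful decode only shows the bytes are valid UTF-8, not that they were meant to be.

```python
def looks_like_utf8(raw: bytes) -> bool:
    """True if `raw` decodes cleanly as UTF-8 (pure ASCII also passes)."""
    try:
        raw.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# Accented character stored in two bytes (UTF-8).
print(looks_like_utf8(b"accept\xc3\xa9."))  # True
# Accented character stored in one byte (Latin-1): invalid as UTF-8.
print(looks_like_utf8(b"accept\xe9."))      # False
```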

--
marko

Jean-Michel POURE

jm.poure@freesurf.fr

about 24 years ago

In reply to: Marko Kreen (#8)

Re: UNICODE

At 17:09 28/10/01 +0200, you wrote:

On Sun, Oct 28, 2001 at 02:34:49PM +0100, Jean-Michel POURE wrote:

psql uses your input literally - so is your console/xterm in
UNICODE/UTF8?

Client: \encoding returns 'UNICODE'.
Server: \list show databases. All databases are UNICODE (except TEMPLATE0
and TEMPLATE1 which are ASCII of course). I use a Mandrake 8.1 distribution
and think my console is UNICODE.

You think? Try this:

$ echo "accepté" | od -c

If your term is in utf you should get:

0000000 a c c e p t 303 251 \n
0000011

If in iso-8859-1:

0000000 a c c e p t 351 \n
0000010

It may be in some other 8bit encoding too, then the last number
may be different.

It is:
0000000 a c c e p t é \n
0000010

Hmm. It may be a bug in the input routines. You give PostgreSQL a
1-byte 'é', it expects a 2-byte char and overflows somewhere. Can
you reproduce it on 7.1.3? Maybe it's fixed there, I can't
reproduce it.

I noticed some longer routines with "é" worked without any problem.
I cannot reproduce it as I converted my database to plain ASCII.
Will try UNICODE on 7.2 beta when adding Japanese text to my database.

Thank you very much for your help.
Best regards, Jean-Michel POURE

#10

Marko Kreen

marko@l-t.ee

about 24 years ago

In reply to: Jean-Michel POURE (#9)

Re: UNICODE

On Sun, Oct 28, 2001 at 04:37:48PM +0100, Jean-Michel POURE wrote:

At 17:09 28/10/01 +0200, you wrote:

$ echo "accepté" | od -c

It is:
0000000 a c c e p t é \n
0000010

Huh. Then try 'od -t x1'. Also, what does the command 'locale'
print?

Hmm. It may be a bug in the input routines. You give PostgreSQL a
1-byte 'é', it expects a 2-byte char and overflows somewhere. Can
you reproduce it on 7.1.3? Maybe it's fixed there, I can't
reproduce it.

I noticed some longer routines with "é" worked without any problem.
I cannot reproduce it as I converted my database to plain ASCII.
Will try UNICODE on 7.2 beta when adding Japanese text to my database.

Ok. I still suggest you try to understand what was going on,
otherwise you will be in trouble again. The logic around
encodings will be the same in 7.2.

--
marko

#11

Mike Rogers

temp6453@hotmail.com

about 24 years ago

In reply to: Jean-Michel POURE (#7)

Ultimate DB Server

I'm questioning whether anyone has done benchmarks on various hardware for
PGSQL and MySQL. I'm either thinking dual P3-866's, Dual AMD-1200's, etc.
I'm looking for benchmarks of large queries on striped -vs- non-striped
volumes, different processor speeds, etc.

Any thoughts people?

#12

Todd Williamsen

todd@williamsen.net

about 24 years ago

In reply to: Mike Rogers (#11)

Re: Ultimate DB Server

RAM is a big factor in queries; most queries are stored in RAM. It also
depends on the platform.

Thank you,

Todd Williamsen, MCSE
home: 847.265.4692
Cell: 847.867.9427

-----Original Message-----
From: Mike Rogers [mailto:temp6453@hotmail.com]
Sent: Sunday, October 28, 2001 11:08 AM
To: mysql@lists.mysql.com; pgsql-hackers@postgresql.org;
pgsql-admin@postgresql.org
Subject: Ultimate DB Server

I'm questioning whether anyone has done benchmarks on various hardware
for PGSQL and MySQL. I'm either thinking dual P3-866's, Dual
AMD-1200's, etc. I'm looking for benchmarks of large queries on striped
-vs- non-striped volumes, different processor speeds, etc.

Any thoughts people?

#13

Patrice Hédé

phede-ml@islande.org

about 24 years ago

In reply to: Jean-Michel POURE (#7)

Re: UNICODE

Hi Jean-Michel,

* Jean-Michel POURE <jm.poure@freesurf.fr> [011028 18:23]:

psql uses your input literally - so is your console/xterm in
UNICODE/UTF8?

Client: \encoding returns 'UNICODE'.
Server: \list show databases. All databases are UNICODE (except
TEMPLATE0 and TEMPLATE1 which are ASCII of course). I use a Mandrake
8.1 distribution and think my console is UNICODE.

I don't know the details for the Mandrake distribution, but I would
rather think the default terminal to be iso-8859-15 or iso-8859-1
encoded (I use myself a linux debian sid, customised to be mixed
iso-8859-15/utf-8 :) ).

In that case, it's likely to cause problems.
One thing is to check your current locale (before running psql), by
typing "locale charmap" on your terminal :

Unicode :

asterix:~$ locale charmap
UTF-8

latin-9 (fr_FR@euro) :

asterix:~$ locale charmap
ISO-8859-15

Then, if you really have a Unicode term, then you may run into other
problems. Psql uses readline, and readline is not yet "utf-8" enabled
by default. There are patches for that, but I don't know why they
don't integrate the support into the code... whatever the reason, it
means that for example Backspace won't work over characters with more
than one byte, and that includes everything which is not ASCII.
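
The Backspace problem can be pictured with a few lines of Python. This is a simplified model, not readline's actual code: it assumes a byte-oriented editor that deletes exactly one byte per Backspace, and the variable names are invented for the example.

```python
# A line typed on a UTF-8 terminal; the accented 'e' occupies two bytes.
line = "accept\u00e9".encode("utf-8")

# Byte-oriented Backspace: drops only the last byte (0xA9)
# and leaves a dangling lead byte (0xC3) in the input buffer.
byte_backspace = line[:-1]

# Character-aware Backspace: drops the whole two-byte character.
char_backspace = line.decode("utf-8")[:-1].encode("utf-8")

print(byte_backspace)  # b'accept\xc3'
print(char_backspace)  # b'accept'
```

The buffer left behind by the byte-oriented version is no longer valid UTF-8, which is one way the input sent to psql can end up mangled.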

So, if while typing in psql, you try to do some text editing over the
"é", then it's likely to mangle your input to psql (without this
necessarily being visible in your terminal), with anything from a bad
command line to psql waiting for more input... When you've finished
typing your line, check if the psql prompt is displaying an "=" sign:

tests=#

Third, depending on how your data is entered vs queried, there may be
some differences. For example, if you use an application which
converts UTF-8 data to D-normalisation before submitting to
PostgreSQL, then the "é" will be stored as "e"+"combining mark acute
accent". Then, when you do your query, you have to submit in the same
format, as "é" (directly typed from the keyboard) and "e"+"comb.acute
accent" are two different things (I plan to add support in PostgreSQL
for this kind of stuff for 7.3, if I manage to go a bit faster on my
other projects...).
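
The difference between the two forms can be shown with Python's standard unicodedata module. This is a small sketch of the normalisation issue only; it says nothing about how PostgreSQL compares strings, and the variable names are invented for the example.

```python
import unicodedata

# Precomposed form, as typed directly from the keyboard (NFC).
composed = "accept\u00e9"

# D-normalised form: "e" followed by U+0301 COMBINING ACUTE ACCENT (NFD).
decomposed = unicodedata.normalize("NFD", composed)

print(len(composed), len(decomposed))  # 7 8
print(composed == decomposed)          # False

# Normalising both sides to the same form makes them comparable again.
print(unicodedata.normalize("NFC", decomposed) == composed)  # True
```

Both strings look identical on screen, yet they differ code point by code point, so data and queries need to be in the same normalisation form to match.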

Anyway, I have been trying a query like yours, using a UTF-8 xterm,
with a UNICODE encoding, both psql and database :

my table :

tests=# insert into matable values ('un texte accentué', 12);
INSERT 70197 1
tests=# insert into matable values ('ça accentue le problème', 14);
INSERT 70198 1

tests=# select * from matable;
         montext         | valeur
-------------------------+--------
 un texte accentué       |     12
 ça accentue le problème |     14
(2 rows)

[note that the "é", "ç" and "è" are not combining forms here...]

tests=# select * from matable where montext ilike '%accentué%';
      montext      | valeur
-------------------+--------
 un texte accentué |     12
(1 row)

It works fine for me.

As for me, I typed INSERT INTO source_content VALUES ('Permis de
conduire accepté') in Psql.

As I said - psql does not do any conversion.

The faulty query is: INSERT INTO test (source_content) VALUES
('Permis de conduire accepté');

I just can't believe that Psql is not UTF-8 compatible. It seems
unreal, as Psql is PostgreSQL's #1 helper application. Should I use
the PostgreSQL MULE encoding to have automatic transcoding? What are
the guidelines? I am completely lost.

Psql is UTF-8 compatible. However, the terminal support of UTF-8 may
be a little shaky for now (no dead keys, no compose key) and that will
be fixed in Xfree-4.2, and readline support of UTF-8 is deficient (as
is bash's, where readline comes from). I don't know when *that* will
be fixed. I know http://www.li18nux.org/ has some patches, but I
haven't tried them yet.

Psql does not insert the data and I have to kill it manually. Can
you reproduce this?

No. If it hangs, this is a serious problem. Or did you simply forget
the final ';'? By the way, it does not seem valid SQL to me,
considering the table structure you provided previously.

Is it possible that my database is corrupted? I have used pg_dump
several times to dump data from production server to development
servers and conversely. Does pg_dump produce UTF8 output? What are
the guidelines when using UTF-8: forget psql and pg_dump?

One thing you really have to be careful about is the locale you're
running your terminal in (cf above with "locale charmap"). A lot of
tools are sensitive to that, as soon as they set the locale, and also
the terminal itself is sensitive to that (if you run an xterm, a
gnome-terminal or other, make sure they are started themselves with
the correct locale, rather than the locale being set by a .bashrc or
.profile AFTER the xterm is launched. One way to be sure is to launch
an Xterm from the command line in an other xterm ;) ).

In the end: are the strings/queries you give to psql/pg_exec UTF-8?
This is now the main thing, as you have _configured_ everything
correctly.

Everything is configured correctly server-side (PostgreSQL, Psql).

Thank you very much for your support Marko,
Best regards,
Jean-Michel

It's possible to work with psql and UTF-8, I'm using it :) But support
for utf-8 is not complete yet, and it's not seamless. Also, support in
Postgresql is not yet complete for UTF-8 (normalisation forms,
collation, regexes...), but it'll come :)

Patrice.

--
Patrice Hédé
email: patrice hede à islande org
www : http://www.islande.org/

#14

Jean-Michel POURE

jm.poure@freesurf.fr

about 24 years ago

In reply to: Mike Rogers (#11)

Re: Ultimate DB Server

At 13:07 28/10/01 -0400, you wrote:

I'm questioning whether anyone has done benchmarks on various hardware for
PGSQL and MySQL. I'm either thinking dual P3-866's, Dual AMD-1200's, etc.
I'm looking for benchmarks of large queries on striped -vs- non-striped
volumes, different processor speeds, etc.

Hello Mike,

IMHO, you should consider *simple* software optimization first.

Hardware can bring a 2x gain whereas software optimization can boost an
application by 10x. Until now, I never heard or read about a real *software
optimization* benchmark between MySQL and PostgreSQL.

Software optimization includes the use of views, triggers, rules, PL/pgSQL
server side programming. By definition, it is hard to compare MySQL with
PostgreSQL because MySQL *does not include* these important features (and
probably never will).

I see at least two easy cases where PostgreSQL beats MySQL:
1) Create a simple relational DB with triggers storing values instead of
performing LEFT JOINS. Increase the number of simultaneous queries. MySQL
will die at x queries and PostgreSQL will still be working at 5x queries.
2) Use PL/pgSQL to perform complex jobs normally devoted to an application
server (Java, PHP) on a separate platform. In some cases (recursive loops
for example), network traffic can be divided by 100. As a result,
PostgreSQL can be 10x faster because everything is performed server-side.

This is to say that, in some circumstances, PostgreSQL running on an i586
with IDE drive beats MySQL on a double Pentium. In real life, applications
are always optimized at software level first before hardware level. This is
why PostgreSQL is *by nature* better than MySQL.

Unless MySQL gets better, there is no real challenge in comparing both systems.

Cheers,
Jean-Michel POURE

#15

mlw

markw@mohawksoft.com

about 24 years ago

In reply to: Jean-Michel POURE (#7)

Re: Ultimate DB Server

Hardware can bring a 2x gain whereas software optimization can boost an
application by 10x. Until now, I never heard or read about a real *software
optimization* benchmark between MySQL and PostgreSQL.

It has been my experience that a knowledgeable, SQL savvy engineer can not use
MySQL. You have to have no basic knowledge of SQL to be able to work within its
limitations. Every project with which I have tried MySQL, I have always found
myself trying to work around what I can't do with it. In that respect, it is
like working on Windows.

I see at least two easy cases where PostgreSQL beats MySQL:
1) Create a simple relational DB with triggers storing values instead of
performing LEFT JOINS. Increase the number of simultaneous queries. MySQL
will die at x queries and PostgreSQL will still be working at 5x queries.
2) Use PL/pgSQL to perform complex jobs normally devoted to an application
server (Java, PHP) on a separate platform. In some cases (recursive loops
for example), network traffic can be divided by 100. As a result,
PostgreSQL can be 10x faster because everything is performed server-side.

Server side programming is a double edged sword. PostgreSQL is not a
distributed database, thus you are limited to the throughput of a single
system. Moving processing off to PHP or Java on a different system can reduce
the load on your server by distributing processing to other systems. If you can
cut query execution time by moving work off to other systems, you can
effectively increase the capacity of your database server.

Typically, on a heavily used database, you should try to limit server side
programming to that which reduces the database work load. If you are moving
work, which can be done on the client, back to the server, you will bottleneck
at the server while the client is sitting idle.

This is to say that, in some circumstances, PostgreSQL running on an i586
with IDE drive beats MySQL on a double Pentium. In real life, applications
are always optimized at software level first before hardware level. This is
why PostgreSQL is *by nature* better than MySQL.

One of the reasons why PostgreSQL beats MySQL, IMHO, is that it has the SQL
features that allow you to control and reduce the database work load by doing
things smarter.

Unless MySQL gets better, there is no real challenge in comparing both systems.

It is funny, I know guys that love MySQL. Even when I show them the cool things
they can do with Postgres, they just don't seem to get it. It is sort of like
talking to an Amiga user.

#16

Mike Rogers

temp6453@hotmail.com

about 24 years ago

In reply to: Jean-Michel POURE (#7)

Re: Ultimate DB Server

MySQL and PostgreSQL are starting to move together as far as I can see.
MySQL has the _option_ of transactional database formats (you can use both
normal MyISAM tables and transactional tables). MySQL 4.0 has all those
various features you speak of. On all too many applications, MySQL kicks
ass. Admittedly, if you do massive complex database applications, PostgreSQL
can smoke it when done right, but MySQL works great for most tasks. It's
not even a matter of which is better or how to compare them. It is a
question of 'what is your purpose for the database' and then deciding based
on the intended purpose.
I did mention that it would be running BOTH MySQL and PostgreSQL, and
not just one. I use them both for various purposes, depending on the need
and am trying to move it to a separate server to increase the speed of
queries on BOTH database systems. It's not a question of which is better,
but a question of what will maximize output for cost.
I think you may have misinterpreted the question.
--
Mike

----- Original Message -----
From: "Jean-Michel POURE" <jm.poure@freesurf.fr>
To: <pgsql-hackers@postgresql.org>
Cc: "Mike Rogers" <temp6453@hotmail.com>
Sent: Sunday, October 28, 2001 3:18 PM
Subject: Re: [HACKERS] Ultimate DB Server

At 13:07 28/10/01 -0400, you wrote:

I'm questioning whether anyone has done benchmarks on various hardware for
PGSQL and MySQL. I'm either thinking dual P3-866's, Dual AMD-1200's, etc.
I'm looking for benchmarks of large queries on striped -vs- non-striped
volumes, different processor speeds, etc.

Hello Mike,

IMHO, you should consider *simple* software optimization first.

Hardware can bring a 2x gain whereas software optimization can boost an
application by 10x. Until now, I never heard or read about a real *software
optimization* benchmark between MySQL and PostgreSQL.

Software optimization includes the use of views, triggers, rules, PL/pgSQL
server side programming. By definition, it is hard to compare MySQL with
PostgreSQL because MySQL *does not include* these important features (and
probably never will).

I see at least two easy cases where PostgreSQL beats MySQL:
1) Create a simple relational DB with triggers storing values instead of
performing LEFT JOINS. Increase the number of simultaneous queries. MySQL
will die at x queries and PostgreSQL will still be working at 5x queries.
2) Use PL/pgSQL to perform complex jobs normally devoted to an application
server (Java, PHP) on a separate platform. In some cases (recursive loops
for example), network traffic can be divided by 100. As a result,
PostgreSQL can be 10x faster because everything is performed server-side.

This is to say that, in some circumstances, PostgreSQL running on an i586
with IDE drive beats MySQL on a double Pentium. In real life, applications
are always optimized at software level first before hardware level. This is
why PostgreSQL is *by nature* better than MySQL.

Unless MySQL gets better, there is no real challenge in comparing both systems.

Cheers,
Jean-Michel POURE

#17

Mike Mascari

mascarm@mascari.com

about 24 years ago

In reply to: Jean-Michel POURE (#7)

Re: Ultimate DB Server

mlw wrote:

It is funny, I know guys that love MySQL. Even when I show them the cool things
they can do with Postgres, they just don't seem to get it. It is sort of like
talking to an Amiga user.

Hey. As someone who learned 68000 assembly on the Amiga back in '86,
I take that personally. There's nothing like writing a pixel editor
in 4096-color HAM mode in 68000 off of a floppy-based Commodore
Macro Assembler. Sadly, however, I don't yet see the Amiga 1000 on
the PostgreSQL ports list. ;-)

Mike Mascari
mascarm@mascari.com

#18

mlw

markw@mohawksoft.com

about 24 years ago

In reply to: Jean-Michel POURE (#7)

Re: Ultimate DB Server

Mike Mascari wrote:

mlw wrote:

..

It is funny, I know guys that love MySQL. Even when I show them the cool things
they can do with Postgres, they just don't seem to get it. It is sort of like
talking to an Amiga user.

Hey. As someone who learned 68000 assembly on the Amiga back in '86,
I take that personally. There's nothing like writing a pixel editor
in 4096-color HAM mode in 68000 off of a floppy-based Commodore
Macro Assembler. Sadly, however, I don't yet see the Amiga 1000 on
the PostgreSQL ports list. ;-)

Sorry, I like to needle Amiga users. One of my closest friends is, what can
only be described as, a complete Amiga zealot. Most of the time it is pretty
fun to get him going, I hope you know it is all good natured fun.

I built my first robot using an RCA 1802 back in the late '70s. I still think
the P.C. was the worst computer design. Going from any platform to the
8080~8088 was such a let down. If someone had ported CP/M to the RCA 1802 back
in the '70s, computers may have been different today.

#19

Tatsuo Ishii

t-ishii@sra.co.jp

about 24 years ago

In reply to: Jean-Michel POURE (#1)

Re: UNICODE

I'm not sure what you are expecting but...

SELECT * FROM test WHERE source_content ILIKE '%accept%';

---> returns a record

SELECT * FROM test WHERE source_content ILIKE '%accepté%'

---> returns a record

SELECT * FROM test WHERE source_content ILIKE '%accepte%'

---> returns 0 record

So all of above seem to work fine for me.

$ pg_config --configure
--prefix=/usr/local/pgsql --enable-multibyte=EUC_JP --enable-unicode-conversion --with-tcl --with-perl --enable-syslog --enable-debug --with-CXX --with-java
$ pg_config --version
PostgreSQL 7.1.3
--
Tatsuo Ishii

#20

Tatsuo Ishii

t-ishii@sra.co.jp

about 24 years ago

In reply to: Patrice Hédé (#13)

Re: UNICODE

In that case, it's likely to cause problems.
One thing is to check your current locale (before running psql), by
typing "locale charmap" on your terminal :

Unicode :

asterix:~$ locale charmap
UTF-8

Just curious. Is there any working charmap for UTF-8? I mean, one where
the charmap contains not only ISO 8859-* but also the other languages
defined in UNICODE 2.0 at least. I couldn't find such a thing around me.
Also, does it handle Unicode combined characters?
--
Tatsuo Ishii

#21

Christopher Kings-Lynne

chriskl@familyhealth.com.au

about 24 years ago

In reply to: Mike Rogers (#16)

Re: Ultimate DB Server

MySQL and PostgreSQL are starting to move together as far as I can see.
MySQL has the _option_ of transactional database formats (you can use both
normal MyISAM tables and transactional tables). MySQL 4.0 has all those
various features you speak of.

No, it doesn't.

It supports the UNION statement (thank god!)

And this is for 4.1:

-------
MySQL 4.1, the following development release

Internally, through a new .frm file format for table definitions, MySQL 4.0
lays the foundation for the new features of MySQL 4.1, such as nested
subqueries, stored procedures, and foreign key integrity rules, which form
the top of the wish list for many of our customers. Along with those, we
will also include simpler additions, such as multi-table UPDATE statements.

After those additions, critics of MySQL have to be more imaginative than
ever in pointing out deficiencies in the MySQL Database Management System.
For long already known for its stability, speed, and ease of use, MySQL will
then match the requirement checklist of very demanding buyers.
--------

I don't get how you can have different tables being transactional in your
database??

ie. What on earth does this do? (pseudo)

create table blah not_transactional;
create table hum not_transactional;

begin;
insert into blah values (1);
insert into hum values (2);
rollback;

?????

On all too many applications, MySQL kicks
ass. Admittedly, if you do massive complex database applications,
PostgreSQL can smoke it when done right, but MySQL works great for most
tasks. It's not even a matter of which is better or how to compare them. It
is a question of 'what is your purpose for the database' and then deciding
based on the intended purpose.
I did mention that it would be running BOTH MySQL and PostgreSQL, and
not just one. I use them both for various purposes, depending on the need
and am trying to move it to a separate server to increase the speed of
queries on BOTH database systems. It's not a question of which is better,
but a question of what will maximize output for cost.
I think you may have misinterpreted the question.
--
Mike

----- Original Message -----
From: "Jean-Michel POURE" <jm.poure@freesurf.fr>
To: <pgsql-hackers@postgresql.org>
Cc: "Mike Rogers" <temp6453@hotmail.com>
Sent: Sunday, October 28, 2001 3:18 PM
Subject: Re: [HACKERS] Ultimate DB Server

At 13:07 28/10/01 -0400, you wrote:

I'm questioning whether anyone has done benchmarks on various hardware for
PGSQL and MySQL. I'm either thinking dual P3-866's, Dual AMD-1200's, etc.
I'm looking for benchmarks of large queries on striped -vs- non-striped
volumes, different processor speeds, etc.

Hello Mike,

IMHO, you should consider *simple* software optimization first.

Hardware can bring a 2x gain whereas software optimization can boost an
application by 10x. Until now, I never heard or read about a real *software
optimization* benchmark between MySQL and PostgreSQL.

Software optimization includes the use of views, triggers, rules, PL/pgSQL
server side programming. By definition, it is hard to compare MySQL with
PostgreSQL because MySQL *does not include* these important features (and
probably will never do).

I see at least two easy cases where PostgreSQL beats MySQL:
1) Create a simple relational DB with triggers storing values instead of
performing LEFT JOINS. Increase the number of simultaneous queries. MySQL
will die at x queries and PostgreSQL will still be working at 5x queries.
2) Use PL/pgSQL to perform complex jobs normally devoted to an application
server (Java, PHP) on a separate platform. In some cases (recursive loops
for example), network traffic can be divided by 100. As a result,
PostgreSQL can be 10x faster because everything is performed server-side.

This is to say that, in some circumstances, PostgreSQL running on an i586
with IDE drive beats MySQL on a double Pentium. In real life, applications
are always optimized at software level first before hardware level. This is
why PostgreSQL is *by nature* better than MySQL.

Unless MySQL gets better, there is no real challenge in comparing both
systems.

Cheers,
Jean-Michel POURE

---------------------------(end of broadcast)---------------------------
TIP 4: Don't 'kill -9' the postmaster

#22

Christopher Kings-Lynne

chriskl@familyhealth.com.au

about 24 years ago

In reply to: Christopher Kings-Lynne (#21)

Re: Ultimate DB Server

Doh! I messed up my example!

The first table was supposed to be transactional.

I don't get how you can have different tables being transactional in your
database??

ie. What on earth does this do? (pseudo)

create table blah transactional;

create table hum not_transactional;

begin;
insert into blah values (1);
insert into hum values (2);
rollback;

?????

Chris

#23

bpalmer

bpalmer@crimelabs.net

about 24 years ago

In reply to: Christopher Kings-Lynne (#21)

Re: Ultimate DB Server

No, it doesn't.

It supports the UNION statement (thank god!)

And this is for 4.1:

This crap shouldn't be on the hackers list, please take it elsewhere.
The hackers list is for people developing postgresql, not for people
arguing about the merits of postgresql vs mysql.

Please go elsewhere.

- Brandon

----------------------------------------------------------------------------
c: 646-456-5455 h: 201-798-4983
b. palmer, bpalmer@crimelabs.net pgp:crimelabs.net/bpalmer.pgp5

#24

Mike Rogers

temp6453@hotmail.com

about 24 years ago

In reply to: Christopher Kings-Lynne (#21)

Re: Ultimate DB Server

What that does is very simple: it rolls back the one that is keeping track
of its transactions. Think of the overhead if someone doesn't have
transactional statements. The idea is, in PGSQL, all inserts and updates
are essentially logged so that they can be rolled back. Here is the MySQL
concept:
Have a log table that logs all transactions (let's say, failed or not)
1. begin transaction
2. insert into non-transactional table 'user did this, status -
unprocessed'
3. insert into payment table
4. insert into product table
5. update to processed
6. insert into shipping
7. update to 'pending shipping'
Perfectly common transaction that happens. Now! What if you want the
log entry inserted and kept as a record of the status and of what happened,
and you don't want all the evidence of that to disappear when you hit
rollback? It means you can have some things roll back while others don't.
In PGSQL, that would have to be begin/rollback for only transactional
entries.
--
Mike

#25

Lincoln Yeoh

lyeoh@pop.jaring.my

about 24 years ago

In reply to: Mike Rogers (#24)

Re: Ultimate DB Server

(reply- s/hackers/general/)

Well I put that sort of thing into my application logs, personal preference
for now.

It seems to me mixed transaction tables are likely to create an error prone
environment for little real gain.

No transactions doesn't necessarily mean faster (or slower). IMO MySQL has
that feature for backwards compatibility. Little to do with performance.

It would be good for claims regarding overheads/performance issues to be
backed by reproducible experiments, or at least interesting
statements/arguments.

I've used MySQL and it was far better than Postgresql when Postgresql was
Postgres95 (yuckyuckyuck!). While Postgresql has really gone a long way,
MySQL seems to have been stuck in denial till recently.

Will be interesting to see if MySQL can pull off the Windows vs Mac trick :).

I hope it will be a good clean fight :).

Cheerio,
Link.

At 11:05 PM 28-10-2001 -0400, Mike Rogers wrote:

What that does is very simple: it rolls back the one that is keeping track
of its transactions. Think of the overhead if someone doesn't have
transactional statements. The idea is, in PGSQL, all inserts and updates
are essentially logged so that they can be rolled back. Here is the MySQL
concept:
Have a log table that logs all transactions (let's say, failed or not)
1. begin transaction
2. insert into non-transactional table 'user did this, status-
unprocessed'
3. insert into payment table
4. insert into product table
5. update to processed
6. insert into shipping
7. update to 'pending shipping'
Perfectly common transaction that happens. Now! What if you want the
entry inserted and dealt with as a status and what happens, but you don't
want all the evidence of that to disappear when you hit rollback. It means
you can have some things roll back and others don't. In PGSQL, that would
have to be begin/rollback for only transactional entries.

#26

Hannu Krosing

hannu@tm.ee

about 24 years ago

In reply to: Christopher Kings-Lynne (#21)

Re: Ultimate DB Server

Mike Rogers wrote:

What that does is very simple: it rolls back the one that is keeping track
of its transactions. Think of the overhead if someone doesn't have
transactional statements. The idea is, in PGSQL, all inserts and updates
are essentially logged so that they can be rolled back. Here is the MySQL
concept:
Have a log table that logs all transactions (let's say, failed or not)
1. begin transaction
2. insert into non-transactional table 'user did this,
status - unprocessed'
3. insert into payment table
4. insert into product table
5. update to processed
6. insert into shipping
7. update to 'pending shipping'
Perfectly common transaction that happens. Now! What if you want the
entry inserted and dealt with as a status and what happens, but you don't
want all the evidence of that to disappear when you hit rollback.
It means you can have some things roll back and others don't. In PGSQL,
that would have to be begin/rollback for only transactional entries.

Or you would run two parallel transactions (currently you need two
connections for this) - one for logging and one for work.

I agree that having non_transactional (i.e. logging) tables may be
sometimes desirable. I've been told that some of Oracle's debugging/logging
facilities are almost useless due to the fact that they disappear at
rollback.

------------------
Hannu

#27

Jean-Michel POURE

jm.poure@freesurf.fr

about 24 years ago

In reply to: bpalmer (#23)

Re: Ultimate DB Server

This crap shouldn't be on the hackers list, please take it elsewhere.
The hackers list is for people developing postgresql, not for people
arguing about the merits of postgresql vs mysql.

Agreed. Should be on pgsql-general@postgresql.org.

#28

Jean-Michel POURE

jm.poure@freesurf.fr

about 24 years ago

In reply to: mlw (#15)

Re: [HACKERS] Ultimate DB Server

If you answer, please email to pgsql-general@postgresql.org

****************************************************************************

Server side programming is a double edged sword. PostgreSQL is not a
distributed database, thus you are limited to the throughput of a single
system. Moving processing off to PHP or Java on a different system can reduce
the load on your server by distributing processing to other systems. If you
can cut query execution time by moving work off to other systems, you can
effectively increase the capacity of your database server.

Yes, but for the Web, SQL queries are SELECT with LEFT JOINS to get display
values of OIDs.
If you store LEFT JOIN results using triggers, you divide complexity by a
factor of 10.

MySQL
A simple example would be:
SELECT customer_name, category_name FROM customer_table
LEFT JOIN customer_category ON customer_oidcategory = category_oid
WHERE customer_oid = xxx;

PostgreSQL
Because categories do not change a lot, it is possible to create a
customer_category_tg field in table customer_table and store the value using
a trigger. As UPDATEs account for 5% of all queries, it is not a real overhead.

To maintain consistency, you also add a customer_timestamp to
customer_table. When Category value changes all you need to do is: UPDATE
customer_table SET customer_timestamp = 'now' WHERE customer_oidcategory = yyy;

Under PostgreSQL, your query becomes
SELECT customer_name, customer_category_tg FROM customer_table WHERE
customer_oid = xxx
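
The trigger-based denormalization described above can be sketched as follows.
This is only an illustration under stated assumptions: it uses Python's
standard sqlite3 module with SQLite triggers as a stand-in for PostgreSQL
triggers, the trigger names customer_fill_tg and customer_refresh_tg and the
functions build_demo and lookup are hypothetical, and the sample rows are made
up. The column names follow the ones used in the message.

```python
import sqlite3

SCHEMA = """
CREATE TABLE customer_category (
    category_oid INTEGER PRIMARY KEY,
    category_name TEXT
);
CREATE TABLE customer_table (
    customer_oid INTEGER PRIMARY KEY,
    customer_name TEXT,
    customer_oidcategory INTEGER,
    customer_category_tg TEXT,   -- stored copy of category_name
    customer_timestamp TEXT
);
-- Fill the stored category name when a customer row is inserted.
CREATE TRIGGER customer_fill_tg AFTER INSERT ON customer_table
BEGIN
    UPDATE customer_table
    SET customer_category_tg = (SELECT category_name FROM customer_category
                                WHERE category_oid = NEW.customer_oidcategory)
    WHERE customer_oid = NEW.customer_oid;
END;
-- Refresh the stored name whenever customer_timestamp is touched.
CREATE TRIGGER customer_refresh_tg
AFTER UPDATE OF customer_timestamp ON customer_table
BEGIN
    UPDATE customer_table
    SET customer_category_tg = (SELECT category_name FROM customer_category
                                WHERE category_oid = NEW.customer_oidcategory)
    WHERE customer_oid = NEW.customer_oid;
END;
"""


def build_demo():
    """Create an in-memory database with one category and one customer."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.execute("INSERT INTO customer_category VALUES (1, 'Retail')")
    conn.execute(
        "INSERT INTO customer_table "
        "(customer_oid, customer_name, customer_oidcategory) "
        "VALUES (1, 'Alice', 1)"
    )
    return conn


def lookup(conn, customer_oid):
    """Read the display values without any join."""
    return conn.execute(
        "SELECT customer_name, customer_category_tg "
        "FROM customer_table WHERE customer_oid = ?",
        (customer_oid,),
    ).fetchone()
```

After a category is renamed, the stored copy stays stale until the
customer_timestamp of the affected rows is updated, which is the consistency
step mentioned in the message. The trade-off is an extra write per change in
exchange for join-free reads.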

Typically, on a heavily used database, you should try to limit server side
programming to that which reduces the database work load. If you are moving
work, which can be done on the client, back to the server, you will bottleneck
at the server while the client is sitting idle.

I do not always agree that server-side programming should be limited. It
depends. In some cases yes, in some cases no. Optimization is a progressive
task, where you start with basic things and end up with a more complex
architecture. From what I have noticed, 95% of applications were not truly
optimized.

This is to say that, in some circumstances, PostgreSQL running on an i586
with IDE drive beats MySQL on a double Pentium. In real life, applications
are always optimized at software level first before hardware level. This is
why PostgreSQL is *by nature* better than MySQL.

One of the reasons why PostgreSQL beats MySQL, IMHO, is that it has the SQL
features that allow you to control and reduce the database work load by doing
things smarter.

Agreed. This is what I meant when I said PostgreSQL beats MySQL.

Unless MySQL gets better, there is no real challenge in comparing both
systems.

It is funny, I know guys that love MySQL. Even when I show them the cool
things they can do with Postgres, they just don't seem to get it. It is sort
of like talking to an Amiga user.

On heavy workload systems MySQL cannot compare to PostgreSQL. It's funny to
read these mails of people doing benchmarks.

#29

George Eric R Contr AFSPC/CVYZ

Eric.George@PETERSON.af.mil

about 24 years ago

In reply to: Jean-Michel POURE (#28)

Re: Ultimate DB Server

Anandtech did an article on the dual athlon box that runs their forum.
It's not MySQL or PgSQL, but you might check it out.

http://www.anandtech.com/showdoc.html?i=1514

-----Original Message-----
From: Mike Rogers [mailto:temp6453@hotmail.com]
Sent: Sunday, October 28, 2001 10:08 AM
To: mysql@lists.mysql.com; pgsql-hackers@postgresql.org;
pgsql-admin@postgresql.org
Subject: Ultimate DB Server

Any thoughts people?

