Trying to load MySQL data

Started by garrettmoore@gmail.comabout 19 years ago10 messagesgeneral
Jump to latest
#1garrettmoore@gmail.com
garrettmoore@gmail.com

Hello,

I am working on a project where we are converting from MySQL to
Postgres. I figured the easiest way would be to export the MySQL data
as CSV.

I'm having a problem importing some of the data. What I have done is
exported the MySQL data and then modified it so that all single quotes
(a ' quote) are doubled, and null values are replaced with an empty
value in the CSV.

Our data, for example, looks like this:

2628,'Poster,5,'255.255.18.138',,'[quote=gz]I''m curious, why not allow
users, then to freely edit topics?[/quote]
We could either go with "it''s a minor annoyance that is one measure
that helps keep DB load lower," or "we do it just to piss you off."
You choose.',0,1124498148,,,376,0,0,,

This row causes an error. The error is that:

The value "why not allow users" is not valid for column 'x'.

Column x is the first column after the long section of text, with a
value of '0'.

It appears that the quote in "I'm", which has been doubled quoted to
'', is not being properly skipped over, and COPY thinks that the next
comma (after "curious") is a new column, and tries to start inserting
data there. At least that's what I have come up with.

Why is this happening? I've used this method before and I didn't have
any trouble, when loading a bunch of Wikipedia test data (which has all
manner of quotes, commas, and apostrophes in it).

This is line 39150 in the file; all previous lines import fine, but
this kills the COPY process and all of the previous inserts are rolled
back. I need to get this data loaded intact.

My copy command is
COPY posts FROM '/tmp/posts.txt' CSV QUOTE $$'$$;

#2Scott Marlowe
smarlowe@g2switchworks.com
In reply to: garrettmoore@gmail.com (#1)
Re: Trying to load MySQL data

On Tue, 2007-01-09 at 19:54, garrettmoore@gmail.com wrote:

Hello,

I am working on a project where we are converting from MySQL to
Postgres. I figured the easiest way would be to export the MySQL data
as CSV.

I'm having a problem importing some of the data. What I have done is
exported the MySQL data and then modified it so that all single quotes
(a ' quote) are doubled, and null values are replaced with an empty
value in the CSV.

Our data, for example, looks like this:

2628,'Poster,5,'255.255.18.138',,'[quote=gz]I''m curious, why not allow
users, then to freely edit topics?[/quote]
We could either go with "it''s a minor annoyance that is one measure
that helps keep DB load lower," or "we do it just to piss you off."
You choose.',0,1124498148,,,376,0,0,,

If that is indeed the line, then this part:

2628,'Poster,5,'255.255.18.138',

is misformed.

I'm assuming you really need:

2628,'Poster',5,'255.255.18.138',

#3brian
brian@zijn-digital.com
In reply to: garrettmoore@gmail.com (#1)
Re: Trying to load MySQL data

garrettmoore@gmail.com wrote:

Hello,

I am working on a project where we are converting from MySQL to
Postgres. I figured the easiest way would be to export the MySQL data
as CSV.

I'm having a problem importing some of the data. What I have done is
exported the MySQL data and then modified it so that all single quotes
(a ' quote) are doubled, and null values are replaced with an empty
value in the CSV.

Our data, for example, looks like this:

2628,'Poster,5,'255.255.18.138',,'[quote=gz]I''m curious, why not allow
users, then to freely edit topics?[/quote]
We could either go with "it''s a minor annoyance that is one measure
that helps keep DB load lower," or "we do it just to piss you off."
You choose.',0,1124498148,,,376,0,0,,

This row causes an error. The error is that:

The value "why not allow users" is not valid for column 'x'.

Column x is the first column after the long section of text, with a
value of '0'.

It appears that the quote in "I'm", which has been doubled quoted to
'', is not being properly skipped over, and COPY thinks that the next
comma (after "curious") is a new column, and tries to start inserting
data there. At least that's what I have come up with.

Why is this happening? I've used this method before and I didn't have
any trouble, when loading a bunch of Wikipedia test data (which has all
manner of quotes, commas, and apostrophes in it).

This is line 39150 in the file; all previous lines import fine, but
this kills the COPY process and all of the previous inserts are rolled
back. I need to get this data loaded intact.

My copy command is
COPY posts FROM '/tmp/posts.txt' CSV QUOTE $$'$$;

It appears that you're missing a quote after the word 'Poster':

2628,'Poster,5,'255.255.18.138',

brian

#4Walter Vaughan
wvaughan@steelerubber.com
In reply to: garrettmoore@gmail.com (#1)
Re: Trying to load MySQL data

garrettmoore@gmail.com wrote:

Hello,

I am working on a project where we are converting from MySQL to
Postgres. I figured the easiest way would be to export the MySQL data
as CSV.

I dunno, but unless you don't really care about your data, I'd use something
that you have no chance of in your data. Things like | and ~ are pretty standard
field separators.

So you'd use things like FIELDS TERMINATED BY '|' in your MySQL export
and WITH DELIMITER '|' NULL '' in your postgresql import

Also remember that mySQL nulls are nothing like postgresql nulls.

--
Walter

#5garrettmoore@gmail.com
garrettmoore@gmail.com
In reply to: Walter Vaughan (#4)
Re: Trying to load MySQL data

The missing quote after Poster is a mistake I made when sanitzing the
data for posting here. That error is NOT present in the actual data.
There is a quote where needed in the data. So, with that in mind, why
am I still getting the error?

Also, there is no symbol we can expect to not be in the data. This data
is from several sources, including a message board, and there could be
tildes, pipes, or any other symbol in discussion fields.

Also since it's CSV we just have null represented by lack of any value
between two commas, so: a,b,,d represents a row with values a, b, NULL,
d. This works fine in general.

Walter Vaughan wrote:

Show quoted text

garrettmoore@gmail.com wrote:

Hello,

I am working on a project where we are converting from MySQL to
Postgres. I figured the easiest way would be to export the MySQL data
as CSV.

I dunno, but unless you don't really care about your data, I'd use something
that you have no chance of in your data. Things like | and ~ are pretty standard
field separators.

So you'd use things like FIELDS TERMINATED BY '|' in your MySQL export
and WITH DELIMITER '|' NULL '' in your postgresql import

Also remember that mySQL nulls are nothing like postgresql nulls.

--
Walter

---------------------------(end of broadcast)---------------------------
TIP 4: Have you searched our list archives?

http://archives.postgresql.org/

#6Dimitri Fontaine
dimitri@2ndQuadrant.fr
In reply to: garrettmoore@gmail.com (#1)
Re: Trying to load MySQL data

Hi,

Le mercredi 10 janvier 2007 02:54, garrettmoore@gmail.com a écrit :

I am working on a project where we are converting from MySQL to
Postgres. I figured the easiest way would be to export the MySQL data
as CSV.

You could also give pgloader a try.
It uses COPY but allows you to load good data even in the presence of errors,
and have a reject file containing erroneous data lines, to replay insertion
later.
It even allows you to reorder data for matching your columns definition, but
as of now suffer from a psycopg2 limitation : you have to provide all table
columns into your data file.

http://pgfoundry.org/projects/pgloader/
http://debian.dalibo.org/unstable/
http://debian.dalibo.org/unstable/pgloader_2.0.2.tar.gz

Hope this helps,
--
Dimitri Fontaine
http://www.dalibo.com/

#7Merlin Moncure
mmoncure@gmail.com
In reply to: Dimitri Fontaine (#6)
Re: Trying to load MySQL data

On 1/10/07, Dimitri Fontaine <dim@dalibo.com> wrote:

Hi,

Le mercredi 10 janvier 2007 02:54, garrettmoore@gmail.com a écrit:

I am working on a project where we are converting from MySQL to
Postgres. I figured the easiest way would be to export the MySQL data
as CSV.

If you are using pg 8.2+, I've had good luck with the following:

1. create pgsql schema by hand or using some method, so they match mysql
2. mysqldump --compatible=postgresql [yadda] | grep ^INSERT | psql [yadda]

this will work for most data types. as of 8.2, postgresql supports
multiple record inserts, which while not as fast as copy, is pretty
close. if mysqldump is dumping single line inserts, change it to
multiple with -e switch iiirc.

merlin

#8Bruce Momjian
bruce@momjian.us
In reply to: Merlin Moncure (#7)
Re: Trying to load MySQL data

Merlin Moncure wrote:

On 1/10/07, Dimitri Fontaine <dim@dalibo.com> wrote:

Hi,

Le mercredi 10 janvier 2007 02:54, garrettmoore@gmail.com a ?crit:

I am working on a project where we are converting from MySQL to
Postgres. I figured the easiest way would be to export the MySQL data
as CSV.

If you are using pg 8.2+, I've had good luck with the following:

1. create pgsql schema by hand or using some method, so they match mysql
2. mysqldump --compatible=postgresql [yadda] | grep ^INSERT | psql [yadda]

Wow, mysqldump has a postgresql compatibility mode? Intersting.

--
Bruce Momjian bruce@momjian.us
EnterpriseDB http://www.enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +

#9Merlin Moncure
mmoncure@gmail.com
In reply to: Bruce Momjian (#8)
Re: Trying to load MySQL data

On 1/11/07, Bruce Momjian <bruce@momjian.us> wrote:

Merlin Moncure wrote:

On 1/10/07, Dimitri Fontaine <dim@dalibo.com> wrote:

Hi,

Le mercredi 10 janvier 2007 02:54, garrettmoore@gmail.com a ?crit:

I am working on a project where we are converting from MySQL to
Postgres. I figured the easiest way would be to export the MySQL data
as CSV.

If you are using pg 8.2+, I've had good luck with the following:

1. create pgsql schema by hand or using some method, so they match mysql
2. mysqldump --compatible=postgresql [yadda] | grep ^INSERT | psql [yadda]

Wow, mysqldump has a postgresql compatibility mode? Intersting.

It does (had it for years), but it doesn't do very much...fixes the
quotes and a couple of other things. In particular I know of no easy
ways to convert the table schemas without use of external tools.

merlin

#10Scott Marlowe
smarlowe@g2switchworks.com
In reply to: garrettmoore@gmail.com (#5)
Re: Trying to load MySQL data

On Wed, 2007-01-10 at 16:51, garrettmoore@gmail.com wrote:

The missing quote after Poster is a mistake I made when sanitzing the
data for posting here. That error is NOT present in the actual data.
There is a quote where needed in the data. So, with that in mind, why
am I still getting the error?

Also, there is no symbol we can expect to not be in the data. This data
is from several sources, including a message board, and there could be
tildes, pipes, or any other symbol in discussion fields.

Also since it's CSV we just have null represented by lack of any value
between two commas, so: a,b,,d represents a row with values a, b, NULL,
d. This works fine in general.

Can you make a sanitized test case, complete unto itself, and post that?

The data to look for are generally \ and '