Adapter update.

Started by Murali Maddali over 18 years ago · 7 messages · general
#1 Murali Maddali
murali.maddali@uai.com

Hello Group,

I have already asked this question on the Npgsql forum but haven't received a
response so far. Sorry for cross-posting, but I want to check whether anyone
has suggestions for my problem.

I am trying to do my updates through NpgsqlDataAdapter (I also tried the
Odbc driver, with no luck) by passing in a DataTable with changes in it, and
the updates take forever.

This is what I am doing: I am reading the data from SQL Server 2005 and
dumping it out to a PostgreSQL 8.2 database.

using (SqlCommand cmd = new SqlCommand(t.SourceSelect, conn))
{
    using (SqlDataReader r = cmd.ExecuteReader())
    {
        DataSet ds = new DataSet("postgis");
        NpgsqlDataAdapter adp = new NpgsqlDataAdapter(t.DestinationSelect, destConn);
        NpgsqlCommandBuilder cmdBld = new NpgsqlCommandBuilder(adp);
        adp.Fill(ds, t.DestinationTable);
        DataTable destTbl = ds.Tables[t.DestinationTable];

        DataTable srcTblSchema = r.GetSchemaTable();
        adp.FillSchema(ds, SchemaType.Mapped, t.DestinationTable);

        // My save method checks whether the row exists and adds to or
        // updates the DataTable (destTbl) accordingly. The whole
        // comparison finishes in under 2 minutes on 60,000 records.
        while (r.Read())
            _save(r, srcTblSchema, destTbl, destConn);

        r.Close();

        // This is where my application goes into la-la land. If I call
        // this update in my while loop above, it takes about two hours
        // to process the whole thing.
        adp.Update(destTbl);
    }
}

I have around 60,000 records. I also have a geometry field on my table.

I have a couple of questions.

1) What can I do to speed up the process? Any database configuration changes,
connection properties, ...?
2) When I call adapter.Update, does NpgsqlDataAdapter check whether a column
value actually changed? I believe SqlDataAdapter does this validation before
it writes to the database.

Any suggestions and comments are greatly appreciated. Right now I am dead in
the water and can't get it to work on large datasets.

Thank you all.

Regards,
Murali K. Maddali
UAI, Inc.
murali.maddali@uai.com <mailto:murali.maddali@uai.com>

"Always bear in mind that your own resolution to succeed is more important
than any one thing." - Abraham Lincoln

This email and any files transmitted with it are confidential and intended
solely for the use of the individual or entity to whom they are addressed.
If you have received this email in error please notify the sender. This
message contains confidential information and is intended only for the
individual named. If you are not the named addressee you should not
disseminate, distribute or copy this e-mail.

#2 Richard Huxton
dev@archonet.com
In reply to: Murali Maddali (#1)
Re: Adapter update.

Murali Maddali wrote:

This is what I am doing, I am reading the data from SQL Server 2005 and
dumping to out to Postgresql 8.2 database.

while (r.Read())
_save(r, srcTblSchema, destTbl, destConn);

r.Close();

// This is where my application goes into la-la land. If I call this
// update in my while loop above, it took about two hours to process
// the whole thing.
adp.Update(destTbl);

That's probably because it was doing each update in its own transaction.
That'll require committing each row to disk.

I have around 60000 records. I also have a geometry field on my table.

I have couple of questions.

1) What do I do to speed up the process? Any database configuration changes,
connection properties, ....

Well, if you do it all inside a single transaction it should be fairly
quick.
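Richard's single-transaction point can be sketched as follows, reusing the names from the original post. This is only a sketch against Npgsql's standard ADO.NET surface; whether the generated commands need the Transaction property set explicitly varies by provider, so it is set defensively here.

```csharp
// Sketch: commit all 60,000 rows once, instead of once per row.
// t.DestinationSelect, destConn and destTbl are from the original post.
using (NpgsqlTransaction tx = destConn.BeginTransaction())
{
    NpgsqlDataAdapter adp = new NpgsqlDataAdapter(t.DestinationSelect, destConn);
    NpgsqlCommandBuilder cmdBld = new NpgsqlCommandBuilder(adp);

    // Materialize the auto-generated commands and attach the transaction;
    // some ADO.NET providers refuse to execute a command without one while
    // a transaction is open on the connection.
    adp.InsertCommand = cmdBld.GetInsertCommand();
    adp.UpdateCommand = cmdBld.GetUpdateCommand();
    adp.DeleteCommand = cmdBld.GetDeleteCommand();
    adp.InsertCommand.Transaction = tx;
    adp.UpdateCommand.Transaction = tx;
    adp.DeleteCommand.Transaction = tx;

    adp.Update(destTbl);  // all row changes flushed inside one transaction
    tx.Commit();          // a single commit, a single sync to disk
}
```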

You might also find the DBI-link project useful, if you know any Perl.
That would let you reach out directly from PG to the SQL-Server database.
http://pgfoundry.org/projects/dbi-link/

2) When I call the adapter.update does NpgsqlDataAdapter checks to see if
the column value really changed or not? I believe SQLDataAdapter does this
validation before it actually writes to the database.

Sorry, don't know - but you have the source; it should be easy enough to
check. If not, I'm sure the npgsql people would be happy to get a patch.

Any suggestions and comments are greatly appreciated. Right now I am in dead
waters and can't get it to work on large datasets.

The fastest way to load data into PG is via COPY; I don't know if the npgsql
driver supports that. If not, you'd have to go via a text file.
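For the record, Npgsql of that era may not have exposed COPY, but later versions (3.0+) do, via NpgsqlConnection.BeginTextImport. A minimal sketch, assuming that newer API; the staging table import_tbl(id integer, name text) is a made-up example, not from this thread:

```csharp
// Text-mode COPY sketch; assumes an Npgsql version with BeginTextImport.
// import_tbl(id integer, name text) is a hypothetical staging table.
using (TextWriter writer = conn.BeginTextImport(
           "COPY import_tbl (id, name) FROM STDIN"))
{
    // COPY text format: tab-separated columns, one line per row.
    writer.Write("1\tfirst row\n");
    writer.Write("2\tsecond row\n");
}   // disposing the writer completes the COPY operation
```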

Load the data into an import table (TEMPORARY table probably) and then
just use three queries to handle deletion, update and insertion.
Comparing one row at a time is adding a lot of overhead.
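The three set-based queries described above might look like this sketch; every table and column name in it (dest, import_tbl, id, val) is illustrative only, and the staging table is assumed to be keyed on id:

```csharp
// Sketch of the set-based merge: one DELETE, one UPDATE and one INSERT
// replace 60,000 per-row statements. Names are illustrative, not from
// the thread.
string[] merge = {
    // rows that no longer exist in the source
    "DELETE FROM dest WHERE id NOT IN (SELECT id FROM import_tbl)",
    // rows present in both: overwrite from the staging table
    "UPDATE dest SET val = i.val FROM import_tbl i WHERE dest.id = i.id",
    // brand-new rows
    "INSERT INTO dest (id, val) " +
    "SELECT i.id, i.val FROM import_tbl i " +
    "WHERE NOT EXISTS (SELECT 1 FROM dest d WHERE d.id = i.id)"
};

using (NpgsqlTransaction tx = destConn.BeginTransaction())
{
    foreach (string sql in merge)
        using (NpgsqlCommand cmd = new NpgsqlCommand(sql, destConn, tx))
            cmd.ExecuteNonQuery();
    tx.Commit();
}
```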

--
Richard Huxton
Archonet Ltd

#3 Murali Maddali
murali.maddali@uai.com
In reply to: Richard Huxton (#2)
Re: Adapter update.

Richard,

I have added a transaction to my code and it took about two and a half hours
to process around 48,000 records. Again, all this time is spent in the Update
method on the adapter.

I don't know Perl, so I can't set up the database link to SQL Server 2005,
and I also don't have permission to write the data to files. Are there any
other options, such as a different driver or stored procedures? I have to
compare each column in each row before doing the update.

Your suggestions and comments are greatly appreciated.

Thank you,
Murali K. Maddali
256-705-5191
murali.maddali@uai.com


#4 Joshua D. Drake
jd@commandprompt.com
In reply to: Murali Maddali (#3)
Re: Adapter update.


Murali Maddali wrote:

Richard,

I have added transaction to my code and it took about 2 and half hours to
process around 48,000 records. Again all this time is taken by update method
on the adapter.

I don't know Perl to setup the database link to SQL Server 2005 and also I
don't have permission to write the data to files. Are there any other
options like a different driver I can use or through stored procedures. I
have to compare each column in each row before doing the update.

This is probably where your time is spent, not the actual commit of the
data. 48k records is nothing.

Joshua D. Drake


--

=== The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564 24x7/Emergency: +1.800.492.2240
PostgreSQL solutions since 1997 http://www.commandprompt.com/
UNIQUE NOT NULL
Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate
PostgreSQL Replication: http://www.commandprompt.com/products/


#5 Richard Huxton
dev@archonet.com
In reply to: Joshua D. Drake (#4)
Re: Adapter update.

Joshua D. Drake wrote:

I have added transaction to my code and it took about 2 and half hours to
process around 48,000 records. Again all this time is taken by update method
on the adapter.

I don't know Perl to setup the database link to SQL Server 2005 and also I
don't have permission to write the data to files. Are there any other
options like a different driver I can use or through stored procedures. I
have to compare each column in each row before doing the update.

This is probably where your time is spent, not the actual commit of the
data. 48k records is nothing.

Ditto what Joshua says. Loading that many records should take minutes,
not hours.

Try this last bit of my first reply.

Load the data into an import table (TEMPORARY table probably) and then just
use three queries to handle deletion, update and insertion.
Comparing one row at a time is adding a lot of overhead.

--
Richard Huxton
Archonet Ltd

#6 Lincoln Yeoh
lyeoh@pop.jaring.my
In reply to: Murali Maddali (#3)
Re: Adapter update.

At 01:30 AM 8/24/2007, Murali Maddali wrote:

options like a different driver I can use or through stored procedures. I
have to compare each column in each row before doing the update.

Do you have to compare with all rows, or just one? Can your
comparison make use of an index?

Link.

#7 Ow Mun Heng
Ow.Mun.Heng@wdc.com
In reply to: Richard Huxton (#2)
Re: Adapter update.

On Wed, 2007-08-22 at 20:41 +0100, Richard Huxton wrote:

Murali Maddali wrote:

This is what I am doing, I am reading the data from SQL Server 2005 and
dumping to out to Postgresql 8.2 database.

My 2 cents: I'm doing roughly the same thing, but I'm using Perl and DBI
to do it.

Fastest way to load data into PG is via COPY, don't know if npgsql
driver supports that. If not, you'd have to go via a text-file.

Load the data into an import table (TEMPORARY table probably) and then
just use three queries to handle deletion, update and insertion.
Comparing one row at a time is adding a lot of overhead.

My way of doing it:

1. Pull from SQL Server via DBI to a temp CSV file.
2. Import via \copy into PG, into a temp table.
Begin transaction:
3. Delete duplicate pkey entries in the actual table.
4. Insert new entries into the actual table.
5. Truncate the temp table.
6. Update a log file.
End transaction.

Works great.

Note on [3]: all the data are new, so instead of doing an update I resorted
to a delete, like MySQL's mysqlimport --replace command (my choice).
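Steps 2 through 6 above boil down to something like this psql sketch; actual_tbl and temp_tbl are made-up names, and step 1 (the DBI extract) happens outside the database. Note that TRUNCATE is transactional in PostgreSQL, so step 5 can sit inside the same transaction:

```sql
-- Step 2: client-side bulk load into the staging table (psql meta-command).
\copy temp_tbl FROM 'extract.csv' WITH CSV

BEGIN;
-- Step 3: drop rows whose primary key is about to be re-imported.
DELETE FROM actual_tbl a USING temp_tbl t WHERE a.id = t.id;
-- Step 4: bring in the fresh rows.
INSERT INTO actual_tbl SELECT * FROM temp_tbl;
-- Step 5: empty the staging table for the next run.
TRUNCATE temp_tbl;
COMMIT;
```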