Thread: Adapter update.
Hello Group,
I already asked this question on the Npgsql forum, but haven't had a response so far. Sorry for cross-posting, but I wanted to check whether anyone here has suggestions for my problem.
I am trying to do my updates through NpgsqlDataAdapter (I also tried the ODBC driver, with no luck) by passing in a DataTable with changes in it, but the updates take forever.
This is what I am doing: I am reading the data from SQL Server 2005 and dumping it out to a PostgreSQL 8.2 database.
using (SqlCommand cmd = new SqlCommand(t.SourceSelect, conn))
{
    using (SqlDataReader r = cmd.ExecuteReader())
    {
        DataSet ds = new DataSet("postgis");

        // Fill the destination table and its schema so the command
        // builder can generate the INSERT/UPDATE/DELETE statements.
        NpgsqlDataAdapter adp = new NpgsqlDataAdapter(t.DestinationSelect, destConn);
        NpgsqlCommandBuilder cmdBld = new NpgsqlCommandBuilder(adp);
        adp.Fill(ds, t.DestinationTable);
        DataTable destTbl = ds.Tables[t.DestinationTable];
        DataTable srcTblSchema = r.GetSchemaTable();
        adp.FillSchema(ds, SchemaType.Mapped, t.DestinationTable);
        // My save method checks whether the row already exists and adds or
        // updates it accordingly in the DataTable (destTbl). The whole
        // comparison is done in under 2 minutes on 60,000 records.
        while (r.Read())
            _save(r, srcTblSchema, destTbl, destConn);

        r.Close();
        // This is where my application goes into la-la land. When I called
        // this update inside my while loop above instead, it took about two
        // hours to process the whole thing.
        adp.Update(destTbl);
    }
}
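The _save method itself wasn't posted; a minimal sketch of what such a helper might look like follows (hypothetical: the single-column primary key and the matching source/destination column names are assumptions, not the actual code):

private static void _save(IDataRecord src, DataTable srcSchema,
                          DataTable dest, NpgsqlConnection destConn)
{
    // srcSchema and destConn are kept only to match the call above.
    // Assumes FillSchema gave dest a single-column primary key and that
    // source and destination column names match (both assumptions).
    string key = dest.PrimaryKey[0].ColumnName;
    DataRow existing = dest.Rows.Find(src[key]);

    if (existing == null)
    {
        // New row: copy every source field into a fresh DataRow.
        DataRow added = dest.NewRow();
        for (int i = 0; i < src.FieldCount; i++)
            added[src.GetName(i)] = src[i];
        dest.Rows.Add(added);
    }
    else
    {
        // Existing row: only assign fields that actually differ, so
        // unchanged rows keep RowState.Unchanged.
        for (int i = 0; i < src.FieldCount; i++)
            if (!Equals(existing[src.GetName(i)], src[i]))
                existing[src.GetName(i)] = src[i];
    }
}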
I have around 60,000 records. I also have a geometry field on my table.
I have a couple of questions:
1) What can I do to speed up the process? Any database configuration changes, connection properties, ...?
2) When I call adapter.Update, does NpgsqlDataAdapter check whether the column value really changed? I believe SqlDataAdapter does this validation before it actually writes to the database.
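One way to make sure of that check, regardless of what the driver does, is to reject unchanged rows before calling Update. A sketch, assuming the destTbl filled above (plain ADO.NET, nothing Npgsql-specific):

foreach (DataRow row in destTbl.Rows)
{
    // Only rows flagged Modified are candidates for the UPDATE pass.
    if (row.RowState != DataRowState.Modified)
        continue;

    bool changed = false;
    foreach (DataColumn col in destTbl.Columns)
    {
        if (!Equals(row[col, DataRowVersion.Original],
                    row[col, DataRowVersion.Current]))
        {
            changed = true;
            break;
        }
    }

    if (!changed)
        row.RejectChanges(); // drops the row from the next Update() pass
}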
Any suggestions and comments are greatly appreciated. Right now I am dead in the water and can't get it to work on large datasets.
Thank you all.
Regards,
Murali K. Maddali
UAI, Inc.
"Always bear in mind that your own resolution to succeed is more important than any one thing." - Abraham Lincoln
Richard,

I have added a transaction to my code and it took about two and a half hours to process around 48,000 records. Again, all of this time is taken by the update method on the adapter.

I don't know Perl, so I can't set up the database link to SQL Server 2005, and I also don't have permission to write the data to files. Are there any other options, like a different driver or stored procedures? I have to compare each column in each row before doing the update.

Your suggestions and comments are greatly appreciated.

Thank you,
Murali K. Maddali
256-705-5191
murali.maddali@uai.com

-----Original Message-----
From: Richard Huxton [mailto:dev@archonet.com]
Sent: Wednesday, August 22, 2007 2:41 PM
To: Murali Maddali
Cc: pgsql-general@postgresql.org
Subject: Re: [GENERAL] Adapter update.

Murali Maddali wrote:
> This is what I am doing: I am reading the data from SQL Server 2005
> and dumping it out to a PostgreSQL 8.2 database.
>
> while (r.Read())
>     _save(r, srcTblSchema, destTbl, destConn);
>
> r.Close();
>
> // This is where my application goes into la-la land. When I called
> // this update inside my while loop above instead, it took about two
> // hours to process the whole thing.
> adp.Update(destTbl);

That's probably because it was doing each update in its own transaction. That'll require committing each row to disk.

> I have around 60,000 records. I also have a geometry field on my table.
>
> I have a couple of questions:
>
> 1) What can I do to speed up the process? Any database configuration
> changes, connection properties, ...?

Well, if you're doing it all in one transaction it should be fairly quick.

You might also find the DBI-link project useful, if you know any Perl. That would let you reach out directly from PG to the SQL Server database.
http://pgfoundry.org/projects/dbi-link/

> 2) When I call adapter.Update, does NpgsqlDataAdapter check whether the
> column value really changed? I believe SqlDataAdapter does this
> validation before it actually writes to the database.

Sorry, don't know - but you have the source, so it should be easy enough to check. If not, I'm sure the npgsql people would be happy to get a patch.

> Any suggestions and comments are greatly appreciated. Right now I am
> dead in the water and can't get it to work on large datasets.

The fastest way to load data into PG is via COPY; I don't know if the npgsql driver supports that. If not, you'd have to go via a text file.

Load the data into an import table (probably a TEMPORARY table) and then use just three queries to handle deletion, update and insertion. Comparing one row at a time adds a lot of overhead.

--
Richard Huxton
Archonet Ltd
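For reference, wrapping the adapter update in a single transaction looks roughly like this (a sketch assuming the adp, cmdBld, destTbl and destConn objects from the snippet above; not necessarily the exact change made here):

// Generate the commands once, attach them all to one transaction, and
// commit after Update() so rows are not flushed to disk one at a time.
using (NpgsqlTransaction tx = destConn.BeginTransaction())
{
    adp.InsertCommand = cmdBld.GetInsertCommand();
    adp.UpdateCommand = cmdBld.GetUpdateCommand();
    adp.DeleteCommand = cmdBld.GetDeleteCommand();

    adp.InsertCommand.Transaction = tx;
    adp.UpdateCommand.Transaction = tx;
    adp.DeleteCommand.Transaction = tx;

    adp.Update(destTbl);
    tx.Commit();
}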
Murali Maddali wrote:
> Richard,
>
> I have added a transaction to my code and it took about two and a half
> hours to process around 48,000 records. Again, all of this time is taken
> by the update method on the adapter.
>
> I don't know Perl, so I can't set up the database link to SQL Server 2005,
> and I also don't have permission to write the data to files. Are there any
> other options, like a different driver or stored procedures? I have to
> compare each column in each row before doing the update.

This is probably where your time is spent, not the actual commit of the data. 48k records is nothing.

Joshua D. Drake

--
=== The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564   24x7/Emergency: +1.800.492.2240
PostgreSQL solutions since 1997  http://www.commandprompt.com/
UNIQUE NOT NULL
Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate
PostgreSQL Replication: http://www.commandprompt.com/products/
Joshua D. Drake wrote:
>> I have added a transaction to my code and it took about two and a half
>> hours to process around 48,000 records. Again, all of this time is taken
>> by the update method on the adapter.
>>
>> I don't know Perl, so I can't set up the database link to SQL Server 2005,
>> and I also don't have permission to write the data to files. Are there any
>> other options, like a different driver or stored procedures? I have to
>> compare each column in each row before doing the update.
>
> This is probably where your time is spent, not the actual commit of the
> data. 48k records is nothing.

Ditto what Joshua says. Loading that many records should take minutes, not hours. Try this last bit of my first reply:

>> Load the data into an import table (probably a TEMPORARY table) and then
>> use just three queries to handle deletion, update and insertion.
>> Comparing one row at a time adds a lot of overhead.

--
Richard Huxton
Archonet Ltd
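A sketch of that staging-table approach from C#, with placeholder names ("import" for the staging table; "dest", "id" and "geom" standing in for the real table and columns):

// Hypothetical reconciliation: assumes the source rows were already
// bulk-loaded into a staging table named "import". All table and column
// names here are placeholders for the real schema.
using (NpgsqlCommand cmd = destConn.CreateCommand())
{
    // 1) delete destination rows that no longer exist in the source
    cmd.CommandText =
        "DELETE FROM dest WHERE id NOT IN (SELECT id FROM import)";
    cmd.ExecuteNonQuery();

    // 2) update rows present in both (PostgreSQL's UPDATE ... FROM)
    cmd.CommandText =
        "UPDATE dest SET geom = i.geom FROM import i WHERE dest.id = i.id";
    cmd.ExecuteNonQuery();

    // 3) insert rows that are new in the source
    cmd.CommandText =
        "INSERT INTO dest (id, geom) " +
        "SELECT i.id, i.geom FROM import i " +
        "WHERE NOT EXISTS (SELECT 1 FROM dest d WHERE d.id = i.id)";
    cmd.ExecuteNonQuery();
}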
At 01:30 AM 8/24/2007, Murali Maddali wrote:
> Are there any other options, like a different driver or stored procedures?
> I have to compare each column in each row before doing the update.

Do you have to compare with all rows, or just one? Can your comparison make use of an index?

Link.