Thread: SQLBulkOperations Segfault

SQLBulkOperations Segfault

From

Jeremy Faith

Date:

18 April 2008, 16:06:50

Hi,

I have been experimenting with SQLBulkOperations to see if it is faster
at inserting rows than using individual insert statements.
But when I run the attached(bo.c) example program it segfaults(note,
this program works correctly with an Oracle ODBC connection).

The program requires the following table.
  create table customer
  (
    cust_num integer not null,
    first_name char(20),
    last_name char(20)
  );

Also the odb_db_open function call uses a hard coded DSN=pg_cdc_w,
username=username and password=password.

I am using the latest versions of pgsql+postgres on Linux i.e.
  psqlodbc-08.03.0100
  postgresql-8.3.1

I turned on the ODBC debug log and added some log output to the driver
and found that the segfault is occurring in the results.c:SC_pos_add
function.

In particular the log shows
  POS ADD fi=(nil) ti=(nil)

and the segfault occurs on this line:-
                        if (*used != SQL_IGNORE && fi[i]->updatable)
So fi is NULL but is being referenced.

I did a bit more checking and found that parse_statement sets up
stmt->ird->irdopts.fi i.e.
        if (SC_update_not_ready(stmt))
                parse_statement(s.stmt, TRUE);  /* not preferable */
so adding
  fi=stmt->ird->irdopts.fi;
after parse_statement gets past this segfault.

But then another segfault occurs in the results.c:positioned_load
function as stmt->load_stmt is NULL
  so strlen(stmt->load_stmt) segfaults.

The code is complicated and I have not been able to determine how to fix
this, it seems likely to me that even the fi fix may be covering up an
earlier problem somewhere else in the code.

Does anybody use SQLBulkOperations to load data?
If so is it faster than individual inserts?

The postgres driver included with unixODBC fails as well but it at least
return a standard ODBC error so I suppose it doesn't support
SQLBulkOperations, I understand the unixODBC driver is defunct anyway.

I have noticed that the unixODBC postgres driver is significantly
quicker when using a prepared insert to insert 100,000 rows.
Specifically psqlodbc takes about 75% longer, but this is reduced to
about 45% longer when 'UseServerSidePrepare = 1' is set. Indeed it was
this speed difference that prompted me to try SQLBulkOperation in the
first place.
Are there any other setting that may improve the speed of the psqlodbc
driver?
Any ideas why the old unixODBC driver is so much faster?

Regards,
Jeremy Faith
#include <ctype.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

#include <sqlext.h>

#ifndef TRUE
  #define TRUE 1
#endif

#ifndef FALSE
  #define FALSE 0
#endif

#define ELOG );
#define LOGT fprintf(stderr,
#define LOGE fprintf(stderr,

struct odb_con_struct
{
  SQLHENV sqlenv;
  SQLHDBC sqlcon;
}*con;

void odb_err(SQLHSTMT stmt,int doexit,char *sql,char *msg);

SQLHSTMT odb_stmt(void)
{
  SQLHSTMT stmt;

  if(con==NULL||con->sqlcon==NULL)
  {
    LOGE "odb_stmt:no DB connection\n" ELOG
    exit(1);
  }
  if(SQLAllocStmt(con->sqlcon,&stmt)!=SQL_SUCCESS)
    odb_err(stmt,TRUE,NULL,"SQLAllocStmt failed");
  return stmt;
}

void odb_stmt_free(SQLHSTMT stmt)
{
  if(stmt!=NULL) SQLFreeHandle(SQL_HANDLE_STMT,stmt);
}

void odb_db_close()
{
  SQLRETURN ret;
  LOGT "odb_db_close:\n" ELOG
  if(con==NULL) return;
  if(con->sqlcon!=NULL)
  {
    ret=SQLDisconnect(con->sqlcon);
    if(ret!=SQL_SUCCESS&&ret!=SQL_SUCCESS_WITH_INFO)
      odb_err(NULL,FALSE,NULL,"odb_db_close");
    SQLFreeHandle(SQL_HANDLE_DBC,con->sqlcon);
  }
  if(con->sqlenv!=NULL) SQLFreeHandle(SQL_HANDLE_ENV,con->sqlenv);
  free(con);
  con=NULL;
}

void odb_cleanup(int doexit,SQLHSTMT stmt)
{
  odb_stmt_free(stmt);
  odb_db_close();
  if(doexit) exit(13);
}

void odb_err(SQLHSTMT stmt,int doexit,char *sql,char *msg)
{
  SQLCHAR err[501];
  SQLCHAR state[10];
  SQLINTEGER native_err;
  SQLSMALLINT msg_len;

  LOGE "odb_err:%s\n%s\n",(sql==NULL)?"":sql,msg ELOG
  if(con==NULL)
  {
    LOGE "odb_err:no DB connection\n" ELOG
    exit(1);
  }
  if(stmt!=NULL)
    while(SQLError(con->sqlenv,con->sqlcon,stmt,state,&native_err,err,
           sizeof(err)-1,&msg_len)==SQL_SUCCESS)
    {
      LOGE "stmt native_err=%ld [%s]%s\n",native_err,state,err ELOG
    }
  if(con->sqlcon!=NULL)
    while(SQLError(con->sqlenv,con->sqlcon,NULL,state,&native_err,err,
      sizeof(err)-1,&msg_len)==SQL_SUCCESS)
    {
      LOGE "con native_err=%ld [%s]%s\n",native_err,state,err ELOG
    }
  if(con->sqlenv!=NULL)
    while(SQLError(con->sqlenv,NULL,NULL,state,&native_err,err,sizeof(err)-1,
      &msg_len)==SQL_SUCCESS)
    {
      LOGE "env native_err=%ld [%s]%s\n",native_err,state,err ELOG
    }
  if(doexit) odb_cleanup(TRUE,stmt);
}


void odb_db_open(char *dsn,char *uid,char *pwd)
{
  if(SQLAllocEnv(&con->sqlenv)!=SQL_SUCCESS)
    odb_err(NULL,TRUE,NULL,"odb_db_open:SQLAllocEnv failed");
  if(SQLAllocConnect(con->sqlenv,&con->sqlcon)!=SQL_SUCCESS)
    odb_err(NULL,TRUE,NULL,"SQLAllocConnect failed");
  if(!SQL_SUCCEEDED(SQLConnect(con->sqlcon,(SQLCHAR *)dsn,SQL_NTS,
      (SQLCHAR *)uid,SQL_NTS,(SQLCHAR *)pwd,SQL_NTS)))
    odb_err(NULL,TRUE,NULL,"SQLConnect failed");
  LOGT "odb_db_open:SQLconnect ok\n" ELOG
}



void bulk_load()
{
  #define ROWSET_SIZE 100
  SQLCHAR sqlstmt[] =
    "select cust_num,first_name,last_name from customer";
  struct cust_struct
  {
    SQLINTEGER cust_num;
    SQLINTEGER cust_num_i;
    SQLCHAR first_name[21];
    SQLINTEGER first_name_i;
    SQLCHAR last_name[21];
    SQLINTEGER last_name_i;
  }cust[ROWSET_SIZE];
  SQLUSMALLINT status[ROWSET_SIZE];
  SQLHSTMT stmt=odb_stmt();
  int i;
  SQLRETURN ret;

  ret=SQLExecDirect(stmt,sqlstmt,SQL_NTS);
  if(ret!=SQL_SUCCESS)
    odb_err(stmt,TRUE,NULL,"SQLExecDirect");

  //set size of one row
  ret=SQLSetStmtAttr(stmt,SQL_ATTR_ROW_BIND_TYPE,
    (SQLPOINTER)sizeof(struct cust_struct),0);
  if(ret!=SQL_SUCCESS)
    odb_err(stmt,TRUE,NULL,"SQL_ATTR_ROW_BIND_TYPE");
  //set pointer to row status array
  ret=SQLSetStmtAttr(stmt,SQL_ATTR_ROW_STATUS_PTR,(SQLPOINTER)status,0);
  if(ret!=SQL_SUCCESS)
    odb_err(stmt,TRUE,NULL,"SQL_ATTR_ROW_STATUS_PTR");

  //bind data
  SQLBindCol(stmt,1,SQL_C_LONG,(SQLPOINTER)&cust[0].cust_num,
    (SQLINTEGER)sizeof(SQLINTEGER),&cust[0].cust_num_i);
  SQLBindCol(stmt,2,SQL_C_CHAR,(SQLPOINTER)cust[0].first_name,
    (SQLINTEGER)21,&cust[0].first_name_i);
  SQLBindCol(stmt,3,SQL_C_CHAR,(SQLPOINTER)cust[0].last_name,
    (SQLINTEGER)21,&cust[0].last_name_i);
  //make data
  for(i=0;i<ROWSET_SIZE;i++)
  {
    cust[i].cust_num=i;
    cust[i].cust_num_i=0;
    sprintf(cust[i].first_name,"FN%d",i);
    cust[i].first_name_i=SQL_NTS;
    sprintf(cust[i].last_name,"LN%d",i);
    cust[i].last_name_i=SQL_NTS;
  }
  //set number of rows to insert
  ret=SQLSetStmtAttr(stmt,SQL_ATTR_ROW_ARRAY_SIZE,(SQLPOINTER)ROWSET_SIZE,0);
  if(ret!=SQL_SUCCESS)
    odb_err(stmt,TRUE,NULL,"SQL_ATTR_ROW_ARRAY_SIZE");
  LOGT "Before SQLBulkOperations\n" ELOG
  ret=SQLBulkOperations(stmt,SQL_ADD);
  LOGT "SQLBulkOperations ret=%d\n",ret ELOG
  if(ret!=SQL_SUCCESS)
    odb_err(stmt,TRUE,NULL,"SQLBulkOperations Failed");
}

int main(int argc,char **argv)
{
  con=malloc(sizeof(*con));
  odb_db_open("pg_cdc_w","username","password");
//  odb_db_open("ora_cdc","cem","cem");
  bulk_load();
  odb_cleanup(FALSE,NULL);
  return 0;
}

/*
  DB must have following table
  create table customer
  (
    cust_num integer not null,
    first_name char(20),
    last_name char(20)
  );
*/

Re: SQLBulkOperations Segfault

From

Hiroshi Inoue

Date:

20 April 2008, 03:19:27

Jeremy Faith wrote:
> Hi,
>
> I have been experimenting with SQLBulkOperations to see if it is faster
> at inserting rows than using individual insert statements.
> But when I run the attached(bo.c) example program it segfaults(note,
> this program works correctly with an Oracle ODBC connection).
>
> The program requires the following table.
>  create table customer
>  (
>    cust_num integer not null,
>    first_name char(20),
>    last_name char(20)
>  );
>
> Also the odb_db_open function call uses a hard coded DSN=pg_cdc_w,
> username=username and password=password.
>
> I am using the latest versions of pgsql+postgres on Linux i.e.
>  psqlodbc-08.03.0100
>  postgresql-8.3.1
>
> I turned on the ODBC debug log and added some log output to the driver
> and found that the segfault is occurring in the results.c:SC_pos_add
> function.
>
> In particular the log shows
>  POS ADD fi=(nil) ti=(nil)
>
> and the segfault occurs on this line:-
>                        if (*used != SQL_IGNORE && fi[i]->updatable)
> So fi is NULL but is being referenced.
>
> I did a bit more checking and found that parse_statement sets up
> stmt->ird->irdopts.fi i.e.
>        if (SC_update_not_ready(stmt))
>                parse_statement(s.stmt, TRUE);  /* not preferable */
> so adding
>  fi=stmt->ird->irdopts.fi;
> after parse_statement gets past this segfault.
>
> But then another segfault occurs in the results.c:positioned_load
> function as stmt->load_stmt is NULL
>  so strlen(stmt->load_stmt) segfaults.
>
> The code is complicated and I have not been able to determine how to fix
> this, it seems likely to me that even the fi fix may be covering up an
> earlier problem somewhere else in the code.

Before calling SQLExecDirect(), please set the SQL_ATTR_CURSOR_TYPE
to SQL_CURSOR_STATIC or SQL_CURSOR_KEYSET_DRIVER and also set the
SQL_ATTR_CONCURRENCY to SQL_CONCUR_ROWVER.

regards,
Hiroshi Inoue

Re: SQLBulkOperations Segfault

From

Jeremy Faith

Date:

21 April 2008, 10:20:44

Hiroshi Inoue wrote:
> Jeremy Faith wrote:
>> Hi,
>>
>> I have been experimenting with SQLBulkOperations to see if it is
>> faster at inserting rows than using individual insert statements.
>> But when I run the attached(bo.c) example program it segfaults(note,
>> this program works correctly with an Oracle ODBC connection).
>>
>> The program requires the following table.
>>  create table customer
>>  (
>>    cust_num integer not null,
>>    first_name char(20),
>>    last_name char(20)
>>  );
>>
>> Also the odb_db_open function call uses a hard coded DSN=pg_cdc_w,
>> username=username and password=password.
>>
>> I am using the latest versions of pgsql+postgres on Linux i.e.
>>  psqlodbc-08.03.0100
>>  postgresql-8.3.1
>>
>> I turned on the ODBC debug log and added some log output to the
>> driver and found that the segfault is occurring in the
>> results.c:SC_pos_add function.
>>
>> In particular the log shows
>>  POS ADD fi=(nil) ti=(nil)
>>
>> and the segfault occurs on this line:-
>>                        if (*used != SQL_IGNORE && fi[i]->updatable)
>> So fi is NULL but is being referenced.
>>
>> I did a bit more checking and found that parse_statement sets up
>> stmt->ird->irdopts.fi i.e.
>>        if (SC_update_not_ready(stmt))
>>                parse_statement(s.stmt, TRUE);  /* not preferable */
>> so adding
>>  fi=stmt->ird->irdopts.fi;
>> after parse_statement gets past this segfault.
>>
>> But then another segfault occurs in the results.c:positioned_load
>> function as stmt->load_stmt is NULL
>>  so strlen(stmt->load_stmt) segfaults.
>>
>> The code is complicated and I have not been able to determine how to
>> fix this, it seems likely to me that even the fi fix may be covering
>> up an earlier problem somewhere else in the code.
>
> Before calling SQLExecDirect(), please set the SQL_ATTR_CURSOR_TYPE
> to SQL_CURSOR_STATIC or SQL_CURSOR_KEYSET_DRIVER and also set the
> SQL_ATTR_CONCURRENCY to SQL_CONCUR_ROWVER.
>
That fixes it, thanks.

Perhaps SQLBulkOperations could check that these attributes are set
correctly and return an ODBC error if not. It is not clear from any ODBC
documentation that I have read that these need to be set this way to do
an SQL_ADD.

I have done a speed test and have found that the SQLBulkOperations
method takes about twice as long as using a prepared insert. So as it is
faster and easier the prepared insert seems to be the way to go.

Using the prepared insert method, the old driver included with unixODBC
is considerably faster than psqlodbcw, any idea why?
Is there any problem with using the old driver?

Regards,
Jeremy Faith