Thread: Performance problem with large insert statements

Performance problem with large insert statements

From
Barry Lind
Date:
In looking at some performance issues (I was trying to look at the
overhead of toast) I found that large insert statements were very slow.

My test case involved reading a file (6M file in my tests) and inserting
  it into the database into a "largeobject" like table defined as follows:

create table tblob1 (filename text, lastbyte integer, data bytea);

The first test program read the file 8000 bytes at a time and inserted
them into the above table until the entire file was inserted.  This test
program used a regular insert statement to do the inserting: (insert
into tblob1 values (?,?,?))

For three runs of this test the average time to insert the 6M file into
the database in 8000 byte rows (which ended up being 801 rows inserted
into the table) was: 17.803 seconds

The second test read the file in 8000 byte chucks just like the first
program but it used a function to do the insert and called the function
via the FastPath API.  The function was:

CREATE FUNCTION BYTEA_WRITE (TEXT, INTEGER, BYTEA) RETURNS INTEGER
as '
BEGIN
   INSERT INTO TBLOB1 VALUES ($1, $2, $3);
RETURN 1;
END;'
language 'plpgsql'

For three runs of this test the average time to insert the 6M file into
the database in 8000 byte parts was:  2.645


Thus using the insert statement was almost an order of magnitude slower
that using the function (17.803 sec vs. 2.645 sec).

Reading the data back from the server via a standard select statement
takes on average: 1.674 seconds.

I tried to run gprof to see where the time was going, but for some
reason the gprof output on my gmon.out file doesn't have any timing
information (all times are reported as 0.0) and I haven't been able to
figure out why yet.  So I don't know what is taking up the bulk of the
time  (I suspect it is either the decoding of the bytea data which the
Fastpath function call avoids, or the parser which needs to parse 801 8K
SQL statements vs. the function which has to parse one 100 byte statement.)

I have attached the two test programs (they are in java and use jdbc)
and a SQL script that creates the table and function.

thanks,
--Barry
import java.io.*;
import java.util.*;
import java.sql.*;


public class BlobTestInsert {

    private Connection con = null;

    public BlobTestInsert() {
        try {
            Class.forName("org.postgresql.Driver");
            con = DriverManager.getConnection(
                    "jdbc:postgresql://localhost:5432/files", "blind", "");
            if (con != null) {
               System.out.println("Connected to database system.\n");
            }
        } catch (SQLException e) {
            System.out.println(e.getMessage());
            System.out.println("Could not connect to database system.\n");
        } catch (ClassNotFoundException e) {
            System.out.println("Class not found...:-(\n");
        }
    }

    private void store(String filename) {
        PreparedStatement ps = null;
        ResultSet r = null;
        File file = new File(filename);
        byte[] l_buf = new byte[8000];
        int l_size = 0;
        System.out.println("store start " + System.currentTimeMillis() + " file.name:" + file.getName() +
"file.length:"+file.length());
        try {
            FileInputStream fis = new FileInputStream(file);
            con.setAutoCommit(false);
            ps = con.prepareStatement(
                "INSERT INTO tblob1 VALUES (?, ?, ?);");
            while (l_size < file.length()) {
            fis.read(l_buf);
              ps.setString(1, filename);
              ps.setInt(2,l_size);
              ps.setBytes(3,l_buf);
              l_size += 8000;
              ps.executeUpdate();
            }
            ps.close();
            fis.close();
            con.commit();
            System.out.println("store end   " + System.currentTimeMillis());
        } catch (SQLException sqle) {
            System.err.println("Store content: " + sqle.getMessage());
        } catch (IOException ioe) {
        }
    }

    private void retrieve(String filename) {
        Statement s = null;
        ResultSet r = null;
        int byteSum = 0;
        byte[] buffer = new byte[8000];
        try {
            System.out.println("read start  " + System.currentTimeMillis() + " Trying to write: " +filename + "test");
            FileOutputStream out = new FileOutputStream(filename + "test");
            con.setAutoCommit(false);
            s = con.createStatement();
            r = s.executeQuery("SELECT data FROM tblob1 WHERE filename = '"
                               + filename +"' order by lastbyte");
            if (r != null) {
                while (r.next()) {
                    buffer = r.getBytes(1);
                    out.write(buffer);
                    byteSum += buffer.length;
                }
            }
            out.close();
            System.out.println("read end    " + System.currentTimeMillis() + " bytes written: " + byteSum);
            con.commit();
        } catch (SQLException sqle) {
            System.err.println("Retrieve content: " + sqle.getMessage());
        } catch (Exception ioe) {
            System.err.println("Writing stuff: " + ioe.getMessage());
        }
    }

    public static void main(String[] args) {
        BlobTestInsert bt = new BlobTestInsert();
        bt.store(args[0]);
        bt.retrieve(args[0]);
    }

}
import java.io.*;
import java.util.*;
import java.sql.*;
import org.postgresql.fastpath.Fastpath;
import org.postgresql.fastpath.FastpathArg;


public class BlobTestFunction {

    private Connection con = null;

    public BlobTestFunction() {
        try {
            Class.forName("org.postgresql.Driver");
            con = DriverManager.getConnection(
                    "jdbc:postgresql://localhost:5432/files", "blind", "");
            if (con != null) {
               System.out.println("Connected to database system.\n");
            }
        } catch (SQLException e) {
            System.out.println(e.getMessage());
            System.out.println("Could not connect to database system.\n");
        } catch (ClassNotFoundException e) {
            System.out.println("Class not found...:-(\n");
        }
    }

    private void store(String filename) {
        File file = new File(filename);
        byte[] l_buf = new byte[8000];
        int l_size = 0;
        System.out.println("store start " + System.currentTimeMillis() + " file.name:" + file.getName() + "
file.length:"+file.length());
        try {
            FileInputStream fis = new FileInputStream(file);
            con.setAutoCommit(false);
            Fastpath l_fp = ((org.postgresql.Connection)con).getFastpathAPI();
           // Now get the function oid's for the api
           Statement l_stmt = null;
           ResultSet l_rset = null;
           try {
             try {
               l_stmt = con.createStatement();
               l_rset = l_stmt.executeQuery("SELECT PRONAME, OID FROM PG_PROC " +
                                            "WHERE PRONAME = 'bytea_write'");
               if(l_rset == null) {
                 System.err.println("Missing funciton bytea_write");
               }
               l_fp.addFunctions(l_rset);
             } finally {
               if (l_rset != null) {
                 l_rset.close();
               }
               if (l_stmt != null) {
                 l_stmt.close();
               }
             }
           } catch (SQLException l_se) {
             System.err.println("Store content: " + l_se.getMessage());
           }

           FastpathArg args[] = new FastpathArg[3];
           while ( l_size < file.length()) {
         fis.read(l_buf);
             args[0] = new FastpathArg(filename);
             args[1] = new FastpathArg(l_size);
             args[2] = new FastpathArg(l_buf, 0, 8000);
             l_size += 8000;
             l_fp.fastpath("bytea_write", true, args);
           }
           fis.close();
           con.commit();
           System.out.println("store end   " + System.currentTimeMillis());
        } catch (SQLException sqle) {
            System.err.println("Store content: " + sqle.getMessage());
        } catch (IOException ioe) {
            System.err.println("Store content: " + ioe.getMessage());
        }
    }

    private void retrieve(String filename) {
        Statement s = null;
        ResultSet r = null;
        int byteSum = 0;
        byte[] buffer;
        try {
            System.out.println("read start  " + System.currentTimeMillis() + " Trying to write: " +filename + ".test");
            FileOutputStream out = new FileOutputStream(filename + ".test");
            s = con.createStatement();
            r = s.executeQuery("SELECT data FROM tblob1 WHERE filename = '"
                               + filename +"' order by lastbyte");
            if (r != null) {
                while (r.next()) {
                    buffer = r.getBytes(1);
                    out.write(buffer);
                    byteSum += buffer.length;
                }
            }
            out.close();
            System.out.println("read end    " + System.currentTimeMillis() + " bytes written: " + byteSum);
            con.commit();
        } catch (SQLException sqle) {
            System.err.println("Retrieve content: " + sqle.getMessage());
        } catch (Exception ioe) {
            System.err.println("Writing stuff: " + ioe.getMessage());
        }
    }

    public static void main(String[] args) {
        BlobTestFunction bt = new BlobTestFunction();
        bt.store(args[0]);
        bt.retrieve(args[0]);
    }

}
DROP TABLE TBLOB1;

CREATE TABLE TBLOB1 (FILENAME TEXT, LASTBYTE INTEGER, DATA BYTEA);


DROP FUNCTION BYTEA_WRITE(TEXT, INTEGER, BYTEA);

CREATE FUNCTION BYTEA_WRITE (TEXT, INTEGER, BYTEA) RETURNS INTEGER
as '
BEGIN
  INSERT INTO TBLOB1 VALUES ($1, $2, $3);
RETURN 1;
END;'
language 'plpgsql'

Re: Performance problem with large insert statements

From
Tom Lane
Date:
Barry Lind <barry@xythos.com> writes:
> In looking at some performance issues (I was trying to look at the 
> overhead of toast) I found that large insert statements were very slow.
> ...
> I tried to run gprof to see where the time was going, but for some 
> reason the gprof output on my gmon.out file doesn't have any timing 
> information (all times are reported as 0.0) and I haven't been able to 
> figure out why yet.

That seems to be a common disease among Linuxen; dunno why.  gprof
works fine for me on HPUX.  I got around to reproducing this today,
and what I find is that the majority of the backend time is going into
simple scanning of the input statement:

Each sample counts as 0.01 seconds. %   cumulative   self              self     total           time   seconds
seconds   calls  ms/call  ms/call  name    31.24     11.90    11.90                             _mcount19.51     19.33
  7.43    10097     0.74     1.06  base_yylex 7.48     22.18     2.85 21953666     0.00     0.00  appendStringInfoChar
5.88    24.42     2.24      776     2.89     2.89  pglz_compress 4.36     26.08     1.66 21954441     0.00     0.00
pq_getbyte3.57     27.44     1.36  7852141     0.00     0.00  addlit 3.26     28.68     1.24     1552     0.80     0.81
scanstr 2.84     29.76     1.08      779     1.39     7.18  pq_getstring 2.31     30.64     0.88    10171     0.09
0.09 _doprnt 2.26     31.50     0.86      776     1.11     1.11  byteain 2.07     32.29     0.79
    msquadloop 1.60     32.90     0.61  7931430     0.00     0.00  memcpy 1.18     33.35     0.45
     chunks 1.08     33.76     0.41    46160     0.01     0.01  strlen 1.08     34.17     0.41
  encore 1.05     34.57     0.40     8541     0.05     0.05  XLogInsert 0.89     34.91     0.34
   appendStringInfo
 

60% of the call graph time is accounted for by these two areas:

index % time    self  children    called     name               7.43    3.32   10097/10097       yylex [14]
[13]    41.0    7.43    3.32   10097         base_yylex [13]               1.36    0.61 7852141/7852141     addlit [28]
             1.24    0.01    1552/1552        scanstr [30]               0.02    0.03    3108/3108
ScanKeywordLookup[99]               0.00    0.02    2335/2335        yy_get_next_buffer [144]               0.02
0.00    776/781         strtol [155]               0.00    0.01     777/3920        MemoryContextStrdup [108]
   0.00    0.00       1/1           base_yy_create_buffer [560]               0.00    0.00    4675/17091       isupper
[617]              0.00    0.00    1556/1556        yy_get_previous_state [671]               0.00    0.00     779/779
      yywrap [706]               0.00    0.00       1/2337        base_yy_load_buffer_state [654]
 
-----------------------------------------------               1.08    4.51     779/779         pq_getstr [17]
[18]    21.4    1.08    4.51     779         pq_getstring [18]               2.85    0.00 21953662/21953666
appendStringInfoChar[20]               1.66    0.00 21954441/21954441     pq_getbyte [29]
 
-----------------------------------------------

While we could probably do a little bit to speed up pg_getstring and its
children, it's not clear that we can do anything about yylex, which is
flex output code not handmade code, and is probably well-tuned already.

Bottom line: feeding huge strings through the lexer is slow.
        regards, tom lane


Re: Performance problem with large insert statements

From
Barry Lind
Date:
Tom,

Thanks for looking into this.  What you found is about what I expected.  What this means to me is that there should be
anew todo item along 
 
the lines:  * allow binding query args over FE/BE protocol

Generally what makes the statement large is the values being 
inserted/updated or being specified in the where clause.  If it were 
possible from the client declare a cursor with bind variables (i.e. $1, 
$2, ...), then later bind in the values and execute the cursor, it 
should be possible to work around this problem.

For example the Oracle SQL*Net protocol (Oracle's equivalent to 
postgres' FE/BE) has seven distinct operations that can be done on a 
cursor from the client:  open, bind, describe, parse, execute, fetch, 
close.  This perhaps is a little bit more control than you need, but you 
get the general idea.

Currently the FE/BE protocol has two modes for executing queries:  1) 
send the entire query as a string and get back the entire result set in 
one go, or 2) use cursors in sql, which still sends the entire query in 
one string, but allows you to get back a subset of the results.  It 
would be nice to have even more control than either of these two options 
offer.

It seems to me that the server has all the functionality to do this, 
because plpgsql supports it in 7.2.  It just isn't can't be done from 
the client.

I have been contemplating creating some SPI functions that could be 
called from the client that would implement this type of functionality 
and then enhancing the JDBC driver to use them instead of the regular 
query execution of the FE/BE protocol.  (i.e. one function to "declare" 
a cursor with $1 placeholders, a second function to bind values and 
execute the cursor, and then using standard fetch sql statement to get 
results, therefore it could be done without changing the actual 
protocol).  This could also support the client explicitly "caching" 
commonly used cursors and just rebinding/reexecuting them to avoid 
having to reparse/replan commonly used queries.

If this all isn't too off the wall, I would like to see something along 
these lines added to the todo list and perhapes this email thread along 
with.  At a minimum I would like to hear others opinions on the subject.

thanks,
--Barry





Tom Lane wrote:

> Barry Lind <barry@xythos.com> writes:
> 
>>In looking at some performance issues (I was trying to look at the 
>>overhead of toast) I found that large insert statements were very slow.
>>...
>>I tried to run gprof to see where the time was going, but for some 
>>reason the gprof output on my gmon.out file doesn't have any timing 
>>information (all times are reported as 0.0) and I haven't been able to 
>>figure out why yet.
>>
> 
> That seems to be a common disease among Linuxen; dunno why.  gprof
> works fine for me on HPUX.  I got around to reproducing this today,
> and what I find is that the majority of the backend time is going into
> simple scanning of the input statement:
> 
> Each sample counts as 0.01 seconds.
>   %   cumulative   self              self     total           
>  time   seconds   seconds    calls  ms/call  ms/call  name    
>  31.24     11.90    11.90                             _mcount
>  19.51     19.33     7.43    10097     0.74     1.06  base_yylex
>   7.48     22.18     2.85 21953666     0.00     0.00  appendStringInfoChar
>   5.88     24.42     2.24      776     2.89     2.89  pglz_compress
>   4.36     26.08     1.66 21954441     0.00     0.00  pq_getbyte
>   3.57     27.44     1.36  7852141     0.00     0.00  addlit
>   3.26     28.68     1.24     1552     0.80     0.81  scanstr
>   2.84     29.76     1.08      779     1.39     7.18  pq_getstring
>   2.31     30.64     0.88    10171     0.09     0.09  _doprnt
>   2.26     31.50     0.86      776     1.11     1.11  byteain
>   2.07     32.29     0.79                             msquadloop
>   1.60     32.90     0.61  7931430     0.00     0.00  memcpy
>   1.18     33.35     0.45                             chunks
>   1.08     33.76     0.41    46160     0.01     0.01  strlen
>   1.08     34.17     0.41                             encore
>   1.05     34.57     0.40     8541     0.05     0.05  XLogInsert
>   0.89     34.91     0.34                             appendStringInfo
> 
> 60% of the call graph time is accounted for by these two areas:
> 
> index % time    self  children    called     name
>                 7.43    3.32   10097/10097       yylex [14]
> [13]    41.0    7.43    3.32   10097         base_yylex [13]
>                 1.36    0.61 7852141/7852141     addlit [28]
>                 1.24    0.01    1552/1552        scanstr [30]
>                 0.02    0.03    3108/3108        ScanKeywordLookup [99]
>                 0.00    0.02    2335/2335        yy_get_next_buffer [144]
>                 0.02    0.00     776/781         strtol [155]
>                 0.00    0.01     777/3920        MemoryContextStrdup [108]
>                 0.00    0.00       1/1           base_yy_create_buffer [560]
>                 0.00    0.00    4675/17091       isupper [617]
>                 0.00    0.00    1556/1556        yy_get_previous_state [671]
>                 0.00    0.00     779/779         yywrap [706]
>                 0.00    0.00       1/2337        base_yy_load_buffer_state [654]
> -----------------------------------------------
>                 1.08    4.51     779/779         pq_getstr [17]
> [18]    21.4    1.08    4.51     779         pq_getstring [18]
>                 2.85    0.00 21953662/21953666     appendStringInfoChar [20]
>                 1.66    0.00 21954441/21954441     pq_getbyte [29]
> -----------------------------------------------
> 
> While we could probably do a little bit to speed up pg_getstring and its
> children, it's not clear that we can do anything about yylex, which is
> flex output code not handmade code, and is probably well-tuned already.
> 
> Bottom line: feeding huge strings through the lexer is slow.
> 
>             regards, tom lane
> 
> 




Re: Performance problem with large insert statements

From
Tom Lane
Date:
Barry Lind <barry@xythos.com> writes:
> It seems to me that the server has all the functionality to do this, 
> because plpgsql supports it in 7.2.  It just isn't can't be done from 
> the client.

True.  Basically it's a shortcoming in the FE/BE protocol.  There's
some other work that'd have to be done, but extending the protocol
would be the main bit.

> I have been contemplating creating some SPI functions that could be 
> called from the client that would implement this type of functionality 
> and then enhancing the JDBC driver to use them instead of the regular 
> query execution of the FE/BE protocol.

Interesting as a proof-of-concept hack, but I'd sure not want to see
it shipped as a production solution.  For one thing, the existing
fastpath protocol is itself too broken to encourage more widespread
use of.  (See the various comments in fastpath.c.)
        regards, tom lane


Re: Performance problem with large insert statements

From
Bruce Momjian
Date:
> Tom,
> 
> Thanks for looking into this.  What you found is about what I expected. 
>   What this means to me is that there should be a new todo item along 
> the lines:
>    * allow binding query args over FE/BE protocol

OK, I have added this to the TODO list in the cache section because it
is really just a special case of cached query plans where some constants
change.  I know it is a litte more than that, but the basic issues are
the same.

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
853-3000+  If your life is a hard drive,     |  830 Blythe Avenue +  Christ can be your backup.        |  Drexel Hill,
Pennsylvania19026