Thread: performance modality in 7.1 for large text attributes?

performance modality in 7.1 for large text attributes?

From
Paul A Vixie
Date:
(plz cc me on your replies, i'm not on pgsql-hackers for some reason.)

http://www.vix.com/~vixie/results-psql.png shows a gnuplot of the wall time
of 70K executions of "pgcat" (shown below) using a CIDR key and TEXT value.
(this is for storing the MAPS RSS, which we presently have in flat files.)

i've benchmarked this against a flat directory with IP addresses as filenames,
and against a deep directory with squid/netnews style hashing (127/0/0/1.txt)
and while it's way more predictable than either of those, there's nothing in
my test framework which explains the 1.5s mode shown in the above *.png file.

anybody know what i could be doing wrong?  (i'm also wondering why SELECT
takes ~250ms whereas INSERT takes ~70ms... seems counterintuitive, unless
TOAST is doing a LOT better than i think.)

furthermore, are there any plans to offer a better libpq interface to INSERT?
the things i'm doing now to quote the text, and the extra copy i'm maintaining,
are painful.  arbitrary-sized "text" attributes are a huge boon -- we would
never have considered using postgres for MAPS RSS (or RBL) with "large
objects".  (kudos to all who were involved, with both WAL and TOAST!)

here's the test jig -- please don't redistribute it yet since there's no man
page and i want to try binary cursors and other things to try to speed it up
or clean it up or both.  but if someone can look at my code (which i'm running
against the 7.1 bits at the head of the pgsql cvs tree) and at the *.png file
and help me enumerate the sources of my stupidity, i will be forever grateful.

# This is a shell archive.  Save it in a file, remove anything before
# this line, and then unpack it by entering "sh file".  Note, it may
# create directories; files and directories will be owned by you and
# have default permissions.
#
# This archive contains:
#
#    Makefile
#    pgcat.c
#
echo x - Makefile
sed 's/^X//' >Makefile << 'END-of-Makefile'
X## Copyright (c) 2000 by Mail Abuse Prevention System LLC
X##
X## Permission to use, copy, modify, and distribute this software for any
X## purpose with or without fee is hereby granted, provided that the above
X## copyright notice and this permission notice appear in all copies.
X##
X## THE SOFTWARE IS PROVIDED "AS IS" AND INTERNET SOFTWARE CONSORTIUM DISCLAIMS
X## ALL WARRANTIES WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES
X## OF MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL INTERNET SOFTWARE
X## CONSORTIUM BE LIABLE FOR ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL
X## DAMAGES OR ANY DAMAGES WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR
X## PROFITS, WHETHER IN AN ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS
X## ACTION, ARISING OUT OF OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS
X## SOFTWARE.
X
X# $Id: Makefile,v 1.1.1.1 2000/12/19 04:49:51 vixie Exp $
X
XCC= gcc -Wall
XALL= pgcat
X
XLDFLAGS= -L/usr/local/pgsql/lib -L/usr/local/krb5/lib
XCFLAGS= -I/usr/local/pgsql/include
XLIBS= -lpq -lcom_err
X
Xall: $(ALL)
X
Xkit:; shar Makefile pgcat.c >kit
X
Xclean:; rm -f $(ALL) kit; rm -f *.o
X
Xpgcat: pgcat.o Makefile
X    $(CC) $(LDFLAGS) -o pgcat pgcat.o $(LIBS)
X
Xpgcat.o: pgcat.c Makefile
END-of-Makefile
echo x - pgcat.c
sed 's/^X//' >pgcat.c << 'END-of-pgcat.c'
X/*
X * Copyright (c) 2000 by Mail Abuse Prevention System LLC
X *
X * Permission to use, copy, modify, and distribute this software for any
X * purpose with or without fee is hereby granted, provided that the above
X * copyright notice and this permission notice appear in all copies.
X *
X * THE SOFTWARE IS PROVIDED "AS IS" AND INTERNET SOFTWARE CONSORTIUM DISCLAIMS
X * ALL WARRANTIES WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES
X * OF MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL INTERNET SOFTWARE
X * CONSORTIUM BE LIABLE FOR ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL
X * DAMAGES OR ANY DAMAGES WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR
X * PROFITS, WHETHER IN AN ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS
X * ACTION, ARISING OUT OF OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS
X * SOFTWARE.
X */
X
X#ifndef LINT
Xstatic const char rcsid[] = "$Id: pgcat.c,v 1.1.1.1 2000/12/19 04:49:50 vixie Exp $";
X#endif
X
X#include <sys/param.h>
X#include <sys/types.h>
X#include <sys/stat.h>
X
X#include <stdio.h>
X#include <stdlib.h>
X#include <string.h>
X#include <unistd.h>
X
X#include <libpq-fe.h>
X
Xstatic const char tmp_template[] = "/tmp/pgcat.XXXXXX";
Xstatic const char *progname = "amnesia";
X
Xstatic int get(PGconn *, const char *, const char *, const char *,
X           const char *, const char *);
Xstatic int put(PGconn *, const char *, const char *, const char *,
X           const char *, const char *);
X
Xstatic void
Xusage(const char *msg) {
X    fprintf(stderr, "%s: usage error (%s)\n", progname, msg);
X    fprintf(stderr,
X     "usage: %s get|put <dbname> <table> <key> <value> <text> [<file>]\n",
X        progname);
X    exit(1);
X}
X
Xint
Xmain(int argc, char *argv[]) {
X    const char *pghost = NULL, *pgport = NULL, *pgoptions = NULL,
X        *pgtty = NULL;
X    const char *op, *dbname, *table, *key, *value, *text, *file;
X    PGconn *conn;
X    int status;
X
X    if ((progname = strrchr(argv[0], '/')) != NULL)
X        progname++;
X    else
X        progname = argv[0];
X    if (argc < 7)
X        usage("too few arguments");
X    op = argv[1];
X    dbname = argv[2];
X    table = argv[3];
X    key = argv[4];
X    value = argv[5];
X    text = argv[6];
X    if (argc > 8)
X        usage("too many arguments");
X    else if (argc == 8)
X        file = argv[7];
X    else
X        file = NULL;
X    if (strcmp(op, "get") != 0 && strcmp(op, "put") != 0)
X        usage("operation must be 'get' or 'put'");
X    conn = PQsetdb(pghost, pgport, pgoptions, pgtty, dbname);
X    if (PQstatus(conn) == CONNECTION_BAD) {
X        fprintf(stderr, "%s: \"%s\": %s", progname, dbname,
X            PQerrorMessage(conn));
X        status = 1;
X    } else if (strcmp(op, "get") == 0) {
X        status = get(conn, table, key, value, text, file);
X    } else {
X        status = put(conn, table, key, value, text, file);
X    }
X    PQfinish(conn);
X    return (status);
X}
X
Xstatic int
Xget(PGconn *conn, const char *table, const char *key, const char *value,
X    const char *text, const char *file)
X{
X    char cmd[999], ch, pch;
X    const char *p;
X    PGresult *res = NULL;
X    int status = 0;
X    FILE *fp = stdout;
X
X    /* Open the output file if there is one. */
X    if (file != NULL) {
X        fp = fopen(file, "w");
X        if (fp == NULL) {
X            perror(file);
X            status = 1;
X            goto done;
X        }
X    }
X
X    /* Quote the lookup value if nec'y. */
X    if (strchr(value, '\'') != NULL || strchr(value, ':') != NULL)
X        p = "";
X    else
X        p = "'";
X        
X    /* Send the query. */
X    if (snprintf(cmd, sizeof cmd, "SELECT %s FROM %s WHERE %s = %s%s%s",
X             text, table, key, p, value, p) >= sizeof cmd) {
X        fprintf(stderr, "%s: snprintf overflow\n", progname);
X        status = 1;
X        goto done;
X    }
X    res = PQexec(conn, cmd);
X    if (PQresultStatus(res) != PGRES_TUPLES_OK) {
X        fprintf(stderr, "%s: \"%s\": %s", progname, cmd,
X            PQresultErrorMessage(res));
X        status = 1;
X        goto done;
X    }
X    if (PQnfields(res) != 1) {
X        fprintf(stderr, "%s: \"%s\": %d fields?\n",
X            progname, cmd, PQnfields(res));
X        status = 1;
X        goto done;
X    }
X    if (PQntuples(res) != 1) {
X        fprintf(stderr, "%s: \"%s\": %d tuples?\n",
X            progname, cmd, PQntuples(res));
X        status = 1;
X        goto done;
X    }
X
X    /* Output the result. */
X    pch = '\0';
X    for (p = PQgetvalue(res, 0, 0), ch = '\0'; (ch = *p) != '\0'; p++) {
X        putc(ch, fp);
X        pch = ch;
X    }
X    if (pch != '\n')
X        putc('\n', fp);
X done:
X    if (fp != NULL && fp != stdout)
X        fclose(fp);
X    if (res != NULL)
X        PQclear(res);
X    return (status);
X}
X
Xstatic int
Xput(PGconn *conn, const char *table, const char *key, const char *value,
X    const char *text, const char *file)
X{
X    char *t, *tp, cmd[999];
X    const char *p;
X    PGresult *res = NULL;
X    int status = 0, ch, n;
X    FILE *fp = stdin, *copy = NULL;
X    struct stat sb;
X    size_t size;
X
X    /* Open the file if there is one. */
X    if (file != NULL) {
X        fp = fopen(file, "r");
X        if (fp == NULL) {
X            perror(file);
X            status = 1;
X            goto done;
X        }
X    }
X
X    /*
X     * Read the file to find out how large it will be when quoted.
X     * If it's not a regular file, make a copy while reading, then switch.
X     */
X    if (fstat(fileno(fp), &sb) < 0) {
X        perror("stat");
X        status = 1;
X        goto done;
X    }
X    if ((sb.st_mode & S_IFMT) != S_IFREG) {
X        char tmpname[MAXPATHLEN];
X        int fd;
X
X        strcpy(tmpname, tmp_template);
X        fd = mkstemp(tmpname);
X        if (fd < 0) {
X            perror("mkstemp");
X            status = 1;
X            goto done;
X        }
X        copy = fdopen(fd, "r+");
X        unlink(tmpname);
X    }
X    size = 0;
X    while ((ch = getc(fp)) != EOF) {
X        if (ch == '\\' || ch == '\'')
X            size++;
X        size++;
X        if (copy)
X            putc(ch, copy);
X    }
X    if (ferror(fp)) {
X        perror("fread");
X        status = 1;
X        goto done;
X    }
X    if (copy) {
X        if (fp != stdin)
X            fclose(fp);
X        fp = copy;
X        copy = NULL;
X    }
X    rewind(fp);
X
X    /* Quote the lookup value if nec'y. */
X    if (strchr(value, '\'') != NULL || strchr(value, ':') != NULL)
X        p = "";
X    else
X        p = "'";
X        
X    /* Construct the INSERT command. */
X    n = snprintf(cmd, sizeof cmd,
X             "INSERT INTO %s ( %s, %s ) VALUES ( %s%s%s, '",
X             table, key, text, p, value, p);
X    if (n >= sizeof cmd) {
X        fprintf(stderr, "%s: snprintf overflow\n", progname);
X        status = 1;
X        goto done;
X    }
X    t = malloc(n + size + sizeof "');");
X    if (t == NULL) {
X        perror("malloc");
X        status = 1;
X        goto done;
X    }
X    strcpy(t, cmd);
X    tp = t + n;
X    while ((ch = getc(fp)) != EOF) {
X        if (ch == '\\' || ch == '\'')
X            *tp++ = '\\';
X        *tp++ = ch;
X    }
X    *tp++ = '\'';
X    *tp++ = ')';
X    *tp++ = ';';
X    *tp++ = '\0';
X
X    /* Send the command. */
X    res = PQexec(conn, t);
X    if (PQresultStatus(res) != PGRES_COMMAND_OK) {
X        fprintf(stderr, "%s: \"%s\": %s", progname, t,
X            PQresultErrorMessage(res));
X        status = 1;
X        goto done;
X    }
X    if (strcmp(PQcmdTuples(res), "1") != 0) {
X        fprintf(stderr, "%s: \"%s...\": '%s' tuples? (%s)\n",
X            progname, cmd, PQcmdTuples(res), PQcmdStatus(res));
X        status = 1;
X        goto done;
X    }
X
X done:
X    if (fp != NULL && fp != stdin)
X        fclose(fp);
X    if (res != NULL)
X        PQclear(res);
X    return (status);
X}
END-of-pgcat.c
exit



Re: performance modality in 7.1 for large text attributes?

From
Thomas Lockhart
Date:
> anybody know what i could be doing wrong?  (i'm also wondering why SELECT
> takes ~250ms whereas INSERT takes ~70ms... seems counterintuitive, unless
> TOAST is doing a LOT better than i think.)

I would think that this is entirely due to planning the query. An INSERT
has no decisions to make, whereas a SELECT must decide among a variety
of possible plans. To hand-optimize selects, you can set some parameters
to force only some kinds of plans (such as index scan) but in general
you will need to remember to unset them afterwards or you run the risk
of bizarrely inappropriate plans for other queries in the same session.

> furthermore, are there any plans to offer a better libpq interface to INSERT?
> the things i'm doing now to quote the text, and the extra copy i'm maintaining,
> are painful.

What exactly are you looking for in "better"? Is it just the quoting
issue (a longstanding problem which persists for historical reasons :(

> ... but if someone can look at my code (which i'm running
> against the 7.1 bits at the head of the pgsql cvs tree) and at the *.png file
> and help me enumerate the sources of my stupidity, i will be forever grateful.

Possible causes of the 1.5s "mode" (at least as a starting point):

o task scheduling on your test machine (not likely??)

o swapping/thrashing on your test machine (not likely??)

o WAL fsync() log commits and cleanup (aggregate throughput is great,
but every once in a while someone waits while the paperwork gets done.
Waiting may be due to processor resource competition)

o Underlying file system bookkeeping from the kernel. e.g. flushing
buffers to disk etc etc.
                     - Thomas


Re: performance modality in 7.1 for large text attributes?

From
Alex Pilosov
Date:
Paul,

1) Have you ran vacuum analyze after all these inserts to update database
statistics? :) Without vacuum, pgsql will opt to table scan even when
there's an index.

2) I'm not sure if you are executing pgcat 70k times or executing inner
loop in pgcat 70k times. Postgres connection establishment is expensive.

3) Postgres INSERT is not very efficient if you are doing a bulk load of
data (it has to reparse the statement every time). If you want to delete
everything and load new data, use "COPY", which is about 5 times faster.
Also, there's a patch by someone to do following: INSERT INTO (fields...)
VALUES (...), (...), (...), which results in parsing the statement only
once.

Oh...And since I have your attention, could you please resolve
long-standing discussion between me and Tom Lane? :) 

Question is whether proper (standard/most-commonly-used) format for
printing CIDR network address is 10/8 or 10.0.0.0/8 (i.e. should all
octets be printed even if they are 0). After search of RFCs, there's
nothing that specifies the standard, but 10.0.0.0/8 is used more often in
examples than 10/8 form.

Postgres uses 10/8 form, and I'm saying that 10.0.0.0/8 is more accepted
by everyone else. (I.E. all software can deal with that, but not all
software accepts 10/8).

-alex

On Mon, 18 Dec 2000, Paul A Vixie wrote:

> (plz cc me on your replies, i'm not on pgsql-hackers for some reason.)
> 
> http://www.vix.com/~vixie/results-psql.png shows a gnuplot of the wall time
> of 70K executions of "pgcat" (shown below) using a CIDR key and TEXT value.
> (this is for storing the MAPS RSS, which we presently have in flat files.)
> 
> i've benchmarked this against a flat directory with IP addresses as filenames,
> and against a deep directory with squid/netnews style hashing (127/0/0/1.txt)
> and while it's way more predictable than either of those, there's nothing in
> my test framework which explains the 1.5s mode shown in the above *.png file.
> 
> anybody know what i could be doing wrong?  (i'm also wondering why SELECT
> takes ~250ms whereas INSERT takes ~70ms... seems counterintuitive, unless
> TOAST is doing a LOT better than i think.)
> 
> furthermore, are there any plans to offer a better libpq interface to INSERT?
> the things i'm doing now to quote the text, and the extra copy i'm maintaining,
> are painful.  arbitrary-sized "text" attributes are a huge boon -- we would
> never have considered using postgres for MAPS RSS (or RBL) with "large
> objects".  (kudos to all who were involved, with both WAL and TOAST!)
> 
> here's the test jig -- please don't redistribute it yet since there's no man
> page and i want to try binary cursors and other things to try to speed it up
> or clean it up or both.  but if someone can look at my code (which i'm running
> against the 7.1 bits at the head of the pgsql cvs tree) and at the *.png file
> and help me enumerate the sources of my stupidity, i will be forever grateful.








Re: performance modality in 7.1 for large text attributes?

From
Christopher Masto
Date:
On Tue, Dec 19, 2000 at 03:03:43PM +0000, Thomas Lockhart wrote:
> o WAL fsync() log commits and cleanup (aggregate throughput is great,
> but every once in a while someone waits while the paperwork gets done.
> Waiting may be due to processor resource competition)
> 
> o Underlying file system bookkeeping from the kernel. e.g. flushing
> buffers to disk etc etc.

I was going to suggest the same, but it's interesting that it happens
on reads as well.  I can't tell for sure from the graph, but it looks
like it happens fairly consistently - every Nth time.  I'd be curious
to see how this changes if you artificially slow down your loop, or
adjust your OS's filesystem parameters.  It may give some more clues.
-- 
Christopher Masto         Senior Network Monkey      NetMonger Communications
chris@netmonger.net        info@netmonger.net        http://www.netmonger.net

Free yourself, free your machine, free the daemon -- http://www.freebsd.org/


Re: performance modality in 7.1 for large text attributes?

From
Bruce Momjian
Date:
> furthermore, are there any plans to offer a better libpq interface to INSERT?
> the things i'm doing now to quote the text, and the extra copy i'm maintaining,
> are painful.  arbitrary-sized "text" attributes are a huge boon -- we would
> never have considered using postgres for MAPS RSS (or RBL) with "large
> objects".  (kudos to all who were involved, with both WAL and TOAST!)

If you are asking for a binary interface to TOAST values, I really wish
we had that in 7.1.  It never got finished for 7.1.

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
853-3000+  If your life is a hard drive,     |  830 Blythe Avenue +  Christ can be your backup.        |  Drexel Hill,
Pennsylvania19026
 


Re: performance modality in 7.1 for large text attributes?

From
Paul A Vixie
Date:
> 1) Have you ran vacuum analyze after all these inserts to update database
> statistics? :) Without vacuum, pgsql will opt to table scan even when
> there's an index.

i hadn't, but i did, and it didn't make that particular difference:
vixie=# explain select file from rss where addr = '127.0.0.2';NOTICE:  QUERY PLAN:
Seq Scan on rss  (cost=0.00..0.00 rows=1 width=12)
EXPLAIN

that sounded bad, so i
vixie=# vacuum analyze rss;VACUUM

but when i reran the explain, it still said it was doing it sequentially:
vixie=# explain select file from rss where addr = '127.0.0.2';NOTICE:  QUERY PLAN:
Seq Scan on rss  (cost=0.00..1685.10 rows=1 width=12)
EXPLAIN

i'll try remaking the table with "addr" as a unique key and see if that helps.

> 2) I'm not sure if you are executing pgcat 70k times or executing inner
> loop in pgcat 70k times. Postgres connection establishment is expensive.

it was 70K invocations, but connection establishment ought to be the same
for both "pgcat get" and "pgcat put" so this doesn't explain the difference
in the graphs.

> 3) Postgres INSERT is not very efficient if you are doing a bulk load of
> data (it has to reparse the statement every time). If you want to delete
> everything and load new data, use "COPY", which is about 5 times faster.

well, that doesn't help in my application.  i'm trying to find out whether
pgsql can be used as the generic backend for MAPS RSS, and the only time i
expect to be doing bulk loads is during benchmarking and during transition.
so, the speed of a "pgcat get" really matters if i want the web server to
go fast when it gets hit by a lot of simultaneous lookups.  so, even though
there are faster ways to do bulk loading, the current benchmark is accurate
for the real application's workload, which isn't about bulk loading.

> Oh...And since I have your attention, could you please resolve
> long-standing discussion between me and Tom Lane? :) 
> 
> Question is whether proper (standard/most-commonly-used) format for
> printing CIDR network address is 10/8 or 10.0.0.0/8 (i.e. should all
> octets be printed even if they are 0). After search of RFCs, there's
> nothing that specifies the standard, but 10.0.0.0/8 is used more often in
> examples than 10/8 form.
> 
> Postgres uses 10/8 form, and I'm saying that 10.0.0.0/8 is more accepted
> by everyone else. (I.E. all software can deal with that, but not all
> software accepts 10/8).

cisco IOS just won't take 10/8 and insists on 10.0.0.0/8.  you will never,
ever go wrong if you try to use 10.0.0.0/8, since everything that understands
CIDR understands that.  10/8 is a pleasant-appearing alternative format, but
it is not universally accepted and i recommend against it.  (i'm not sure if
my original CIDR type implementation for pgsql output the shorthand or not;
if it did, then i apologize to one and all.)


Re: performance modality in 7.1 for large text attributes?

From
Paul A Vixie
Date:
> > anybody know what i could be doing wrong?  (i'm also wondering why SELECT
> > takes ~250ms whereas INSERT takes ~70ms... seems counterintuitive, unless
> > TOAST is doing a LOT better than i think.)
> 
> I would think that this is entirely due to planning the query. An INSERT
> has no decisions to make, whereas a SELECT must decide among a variety
> of possible plans. To hand-optimize selects, you can set some parameters
> to force only some kinds of plans (such as index scan) but in general
> you will need to remember to unset them afterwards or you run the risk
> of bizarrely inappropriate plans for other queries in the same session.

since every "pgcat" invocation is its own sql session, i have no worries
about that.  what i don't know, is how to set these options.  i'm rerunning
my test with PRIMARY KEY on the thing i'm searching on, and will report
results here soon.  it appears that 60ms is still the average INSERT time
(which is fine, btw) but that PRIMARY KEY just about doubles the amount of
disk I/O per INSERT, and the postgres server process is using 4% of the CPU
rather than the 0.5% it had used without PRIMARY KEY.

> > furthermore, are there any plans to offer a better libpq interface to
> > INSERT?  the things i'm doing now to quote the text, and the extra copy
> > i'm maintaining, are painful.
> 
> What exactly are you looking for in "better"? Is it just the quoting
> issue (a longstanding problem which persists for historical reasons :(

well, the programmatic interface to SELECT is just about perfect.  i can
construct the command and send it over, then check the result to see how
many tuples and fields i got back, and then i can get the value in its
native form as a big block of goo.
       /* Send the query. */       if (snprintf(cmd, sizeof cmd, "SELECT %s FROM %s WHERE %s = %s%s%s",
  text, table, key, p, value, p) >= sizeof cmd) {               fprintf(stderr, "%s: snprintf overflow\n", progname);
           goto done;       }       res = PQexec(conn, cmd);       if (PQresultStatus(res) != PGRES_TUPLES_OK) {
      fprintf(stderr, "%s: \"%s\": %s", progname, cmd,                       PQresultErrorMessage(res));
gotodone;       }       if (PQnfields(res) != 1) {               fprintf(stderr, "%s: \"%s\": %d fields?\n",
          progname, cmd, PQnfields(res));               goto done;       }       if (PQntuples(res) != 1) {
 fprintf(stderr, "%s: \"%s\": %d tuples?\n",                       progname, cmd, PQntuples(res));               goto
done;      }
 
       /* Output the result. */       pch = '\0';       for (p = PQgetvalue(res, 0, 0), ch = '\0'; (ch = *p) != '\0';
p++){               putc(ch, fp);               pch = ch;       }       if (pch != '\n')               putc('\n', fp);
    status = 0;
 

for INSERT, though, there is no analogue.  i guess i'm looking for functions
which might have names like PQinsert() and PQaddtuple().  instead, i've got
       /*        * Read the file to find out how large it will be when quoted.        * If it's not a regular file,
makea copy while reading, then switch.        */...       /* Construct the INSERT command. */       n = snprintf(cmd,
sizeofcmd,                    "INSERT INTO %s ( %s, %s ) VALUES ( %s%s%s, '",                    table, key, text, p,
value,p);       if (n >= sizeof cmd) {               fprintf(stderr, "%s: snprintf overflow\n", progname);
goto done;       }       t = malloc(n + size + sizeof "');");       if (t == NULL) {               perror("malloc");
          goto done;       }       strcpy(t, cmd);       tp = t + n;       while ((ch = getc(fp)) != EOF) {
 if (ch == '\\' || ch == '\'')                       *tp++ = '\\';               *tp++ = ch;       }       *tp++ =
'\'';      *tp++ = ')';       *tp++ = ';';       *tp++ = '\0';
 
       /* Send the command. */       res = PQexec(conn, t);       if (PQresultStatus(res) != PGRES_COMMAND_OK) {
      fprintf(stderr, "%s: \"%s\": %s", progname, t,                       PQresultErrorMessage(res));
gotodone;       }       if (strcmp(PQcmdTuples(res), "1") != 0) {               fprintf(stderr, "%s: \"%s...\": '%s'
tuples?(%s)\n",                       progname, cmd, PQcmdTuples(res), PQcmdStatus(res));               goto done;
}       status = 0;
 

which is really, really painful.  the large "text" is a great idea, but the
old "lo_" API actually had some things going for it.

> Possible causes of the 1.5s "mode" (at least as a starting point):
> 
> o task scheduling on your test machine (not likely??)
> 
> o swapping/thrashing on your test machine (not likely??)

the machine was idle other than for this test.  it's a two processor
freebsd machine.

> o WAL fsync() log commits and cleanup (aggregate throughput is great,
> but every once in a while someone waits while the paperwork gets done.
> Waiting may be due to processor resource competition)
> 
> o Underlying file system bookkeeping from the kernel. e.g. flushing
> buffers to disk etc etc.

i'm going to make a 500MB MFS partition for /usr/local/pgsql/data/base if
the PRIMARY KEY idea doesn't work out, just to rule out actuator noise.


CIDR output format

From
Tom Lane
Date:
Paul A Vixie <vixie@mfnx.net> writes:
> cisco IOS just won't take 10/8 and insists on 10.0.0.0/8.  you will never,
> ever go wrong if you try to use 10.0.0.0/8, since everything that understands
> CIDR understands that.  10/8 is a pleasant-appearing alternative format, but
> it is not universally accepted and i recommend against it.  (i'm not sure if
> my original CIDR type implementation for pgsql output the shorthand or not;
> if it did, then i apologize to one and all.)

Well, that's an earful.  Faced with this authoritative opinion, I
withdraw my previous objections to changing the output format for CIDR.

It would seem that the appropriate behavior would be to make the default
display format for CIDR be like "10.0.0.0/8".  Now the text() conversion
function already produces this same format.  I'd be inclined to leave
text() as-is and add a new conversion function with some other name
(suggestions anyone?) that produces the shorthand form "10/8" as text,
for those who prefer it.

Comments?
        regards, tom lane


Re: performance modality in 7.1 for large text attributes?

From
Tom Lane
Date:
Paul A Vixie <vixie@mfnx.net> writes:
> http://www.vix.com/~vixie/results-psql.png shows a gnuplot of the wall time
> of 70K executions of "pgcat" (shown below) using a CIDR key and TEXT value.

I get a 404 on that URL :-(

> anybody know what i could be doing wrong?  (i'm also wondering why SELECT
> takes ~250ms whereas INSERT takes ~70ms... seems counterintuitive, unless
> TOAST is doing a LOT better than i think.)

Given your later post, the problem is evidently that the thing is
failing to use the index for the SELECT.  I am not sure why, especially
since it clearly does know (after vacuuming) that the index would
retrieve just a single row.  May we see the exact declaration of the
table --- preferably via "pg_dump -s -t TABLENAME DBNAME" ?

> furthermore, are there any plans to offer a better libpq interface to INSERT?

Consider using COPY if you don't want to quote the data.
COPY rss FROM stdin;values heremore values here\.

(If you don't like tab as column delimiter, you can specify another in
the copy command.)  The libpq interface to this is relatively
straightforward IIRC.
        regards, tom lane


Re: performance modality in 7.1 for large text attributes?

From
Larry Rosenman
Date:
* Paul A Vixie <vixie@mfnx.net> [001220 10:28]:
> > Question is whether proper (standard/most-commonly-used) format for
> > printing CIDR network address is 10/8 or 10.0.0.0/8 (i.e. should all
> > octets be printed even if they are 0). After search of RFCs, there's
> > nothing that specifies the standard, but 10.0.0.0/8 is used more often in
> > examples than 10/8 form.
> > 
> > Postgres uses 10/8 form, and I'm saying that 10.0.0.0/8 is more accepted
> > by everyone else. (I.E. all software can deal with that, but not all
> > software accepts 10/8).
> 
> cisco IOS just won't take 10/8 and insists on 10.0.0.0/8.  you will never,
> ever go wrong if you try to use 10.0.0.0/8, since everything that understands
> CIDR understands that.  10/8 is a pleasant-appearing alternative format, but
> it is not universally accepted and i recommend against it.  (i'm not sure if
> my original CIDR type implementation for pgsql output the shorthand or not;
> if it did, then i apologize to one and all.)
There was no way, prior to 7.1, to get all 4 octets printed using the
original code. 

Thanks for clearing up the info. 

Larry Rosenman
-- 
Larry Rosenman                     http://www.lerctr.org/~ler
Phone: +1 972-414-9812                 E-Mail: ler@lerctr.org
US Mail: 1905 Steamboat Springs Drive, Garland, TX 75044-6749


Re: CIDR output format

From
Larry Rosenman
Date:
* Tom Lane <tgl@sss.pgh.pa.us> [001220 13:02]:
> Paul A Vixie <vixie@mfnx.net> writes:
> > cisco IOS just won't take 10/8 and insists on 10.0.0.0/8.  you will never,
> > ever go wrong if you try to use 10.0.0.0/8, since everything that understands
> > CIDR understands that.  10/8 is a pleasant-appearing alternative format, but
> > it is not universally accepted and i recommend against it.  (i'm not sure if
> > my original CIDR type implementation for pgsql output the shorthand or not;
> > if it did, then i apologize to one and all.)
> 
> Well, that's an earful.  Faced with this authoritative opinion, I
> withdraw my previous objections to changing the output format for CIDR.
> 
> It would seem that the appropriate behavior would be to make the default
> display format for CIDR be like "10.0.0.0/8".  Now the text() conversion
> function already produces this same format.  I'd be inclined to leave
> text() as-is and add a new conversion function with some other name
> (suggestions anyone?) that produces the shorthand form "10/8" as text,
> for those who prefer it.
I would call it cidrshort(). 

I assume this also is true for INET? 

Thanks!

LER
> 
> Comments?
> 
>             regards, tom lane
-- 
Larry Rosenman                     http://www.lerctr.org/~ler
Phone: +1 972-414-9812                 E-Mail: ler@lerctr.org
US Mail: 1905 Steamboat Springs Drive, Garland, TX 75044-6749


Re: CIDR output format

From
Tom Lane
Date:
Larry Rosenman <ler@lerctr.org> writes:
>> It would seem that the appropriate behavior would be to make the default
>> display format for CIDR be like "10.0.0.0/8".  Now the text() conversion
>> function already produces this same format.  I'd be inclined to leave
>> text() as-is and add a new conversion function with some other name
>> (suggestions anyone?) that produces the shorthand form "10/8" as text,
>> for those who prefer it.

> I would call it cidrshort(). 

I was thinking something like abbrev().  There is no need to put the
type name in the function; that's what function overloading is for.

> I assume this also is true for INET? 

INET doesn't use abbreviation of the address part anyway.  The only
display shortcut it has is to suppress "/32" when the netmask is 32.
I figured that text() could produce an un-abbreviated result for an
INET input (as it does now), and abbrev() could produce one with
/32 suppression.  In short:

Value            Default output    text()        abbrev()

'127.0.0.1/32'::inet    127.0.0.1    127.0.0.1/32    127.0.0.1
'127.0.0.1/32'::cidr    127.0.0.1/32    127.0.0.1/32    127.0.0.1/32
'127/8'::cidr        127.0.0.0/8    127.0.0.0/8    127/8

This would be a little bit inconsistent, because the default output
format would match text() for CIDR values but abbrev() for INET values.
But that seems like the most useful behavior to me.  Possibly others
will disagree ;-)
        regards, tom lane


Re: CIDR output format

From
Larry Rosenman
Date:
* Tom Lane <tgl@sss.pgh.pa.us> [001221 09:49]:
> Larry Rosenman <ler@lerctr.org> writes:
> >> It would seem that the appropriate behavior would be to make the default
> >> display format for CIDR be like "10.0.0.0/8".  Now the text() conversion
> >> function already produces this same format.  I'd be inclined to leave
> >> text() as-is and add a new conversion function with some other name
> >> (suggestions anyone?) that produces the shorthand form "10/8" as text,
> >> for those who prefer it.
> 
> > I would call it cidrshort(). 
> 
> I was thinking something like abbrev().  There is no need to put the
> type name in the function; that's what function overloading is for.
> 
> > I assume this also is true for INET? 
> 
> INET doesn't use abbreviation of the address part anyway.  The only
> display shortcut it has is to suppress "/32" when the netmask is 32.
> I figured that text() could produce an un-abbreviated result for an
> INET input (as it does now), and abbrev() could produce one with
> /32 suppression.  In short:
> 
> Value            Default output    text()        abbrev()
> 
> '127.0.0.1/32'::inet    127.0.0.1    127.0.0.1/32    127.0.0.1
> '127.0.0.1/32'::cidr    127.0.0.1/32    127.0.0.1/32    127.0.0.1/32
> '127/8'::cidr        127.0.0.0/8    127.0.0.0/8    127/8
> 
> This would be a little bit inconsistent, because the default output
> format would match text() for CIDR values but abbrev() for INET values.
> But that seems like the most useful behavior to me.  Possibly others
> will disagree ;-)
I'm fine with it.  IIRC, you fixed it so we can cast from INET to CIDR
and back?  

Thanks!

> 
>             regards, tom lane
-- 
Larry Rosenman                     http://www.lerctr.org/~ler
Phone: +1 972-414-9812                 E-Mail: ler@lerctr.org
US Mail: 1905 Steamboat Springs Drive, Garland, TX 75044-6749


Re: CIDR output format

From
Alex Pilosov
Date:
On Thu, 21 Dec 2000, Tom Lane wrote:

> Value            Default output    text()        abbrev()
> 
> '127.0.0.1/32'::inet    127.0.0.1    127.0.0.1/32    127.0.0.1
> '127.0.0.1/32'::cidr    127.0.0.1/32    127.0.0.1/32    127.0.0.1/32
> '127/8'::cidr        127.0.0.0/8    127.0.0.0/8    127/8
> 
> This would be a little bit inconsistent, because the default output
> format would match text() for CIDR values but abbrev() for INET values.
> But that seems like the most useful behavior to me.  Possibly others
> will disagree ;-)
I think it makes sense.