Re: performance modality in 7.1 for large text attributes? - Mailing list pgsql-hackers
From: Paul A Vixie
Subject: Re: performance modality in 7.1 for large text attributes?
Date:
Msg-id: 200012200043.QAA34847@redpaul.mfnx.net
In response to: Re: performance modality in 7.1 for large text attributes? (Thomas Lockhart <lockhart@alumni.caltech.edu>)
List: pgsql-hackers
> > anybody know what i could be doing wrong?  (i'm also wondering why SELECT
> > takes ~250ms whereas INSERT takes ~70ms... seems counterintuitive, unless
> > TOAST is doing a LOT better than i think.)
>
> I would think that this is entirely due to planning the query. An INSERT
> has no decisions to make, whereas a SELECT must decide among a variety
> of possible plans. To hand-optimize selects, you can set some parameters
> to force only some kinds of plans (such as index scan) but in general
> you will need to remember to unset them afterwards or you run the risk
> of bizarrely inappropriate plans for other queries in the same session.

since every "pgcat" invocation is its own sql session, i have no worries
about that.  what i don't know is how to set these options.

i'm rerunning my test with PRIMARY KEY on the thing i'm searching on, and
will report results here soon.  it appears that 60ms is still the average
INSERT time (which is fine, btw) but that PRIMARY KEY just about doubles
the amount of disk I/O per INSERT, and the postgres server process is
using 4% of the CPU rather than the 0.5% it had used without PRIMARY KEY.

> > furthermore, are there any plans to offer a better libpq interface to
> > INSERT?  the things i'm doing now to quote the text, and the extra copy
> > i'm maintaining, are painful.
>
> What exactly are you looking for in "better"? Is it just the quoting
> issue (a longstanding problem which persists for historical reasons :(

well, the programmatic interface to SELECT is just about perfect.  i can
construct the command and send it over, then check the result to see how
many tuples and fields i got back, and then i can get the value in its
native form as a big block of goo.

	/* Send the query. */
	if (snprintf(cmd, sizeof cmd,
		     "SELECT %s FROM %s WHERE %s = %s%s%s",
		     text, table, key, p, value, p) >= sizeof cmd) {
		fprintf(stderr, "%s: snprintf overflow\n", progname);
		goto done;
	}
	res = PQexec(conn, cmd);
	if (PQresultStatus(res) != PGRES_TUPLES_OK) {
		fprintf(stderr, "%s: \"%s\": %s", progname, cmd,
			PQresultErrorMessage(res));
		goto done;
	}
	if (PQnfields(res) != 1) {
		fprintf(stderr, "%s: \"%s\": %d fields?\n", progname, cmd,
			PQnfields(res));
		goto done;
	}
	if (PQntuples(res) != 1) {
		fprintf(stderr, "%s: \"%s\": %d tuples?\n", progname, cmd,
			PQntuples(res));
		goto done;
	}

	/* Output the result. */
	pch = '\0';
	for (p = PQgetvalue(res, 0, 0), ch = '\0'; (ch = *p) != '\0'; p++) {
		putc(ch, fp);
		pch = ch;
	}
	if (pch != '\n')
		putc('\n', fp);
	status = 0;

for INSERT, though, there is no analogue.  i guess i'm looking for
functions which might have names like PQinsert() and PQaddtuple().
instead, i've got

	/*
	 * Read the file to find out how large it will be when quoted.
	 * If it's not a regular file, make a copy while reading, then switch.
	 */
	...

	/* Construct the INSERT command. */
	n = snprintf(cmd, sizeof cmd,
		     "INSERT INTO %s ( %s, %s ) VALUES ( %s%s%s, '",
		     table, key, text, p, value, p);
	if (n >= sizeof cmd) {
		fprintf(stderr, "%s: snprintf overflow\n", progname);
		goto done;
	}
	t = malloc(n + size + sizeof "');");
	if (t == NULL) {
		perror("malloc");
		goto done;
	}
	strcpy(t, cmd);
	tp = t + n;
	while ((ch = getc(fp)) != EOF) {
		if (ch == '\\' || ch == '\'')
			*tp++ = '\\';
		*tp++ = ch;
	}
	*tp++ = '\'';
	*tp++ = ')';
	*tp++ = ';';
	*tp++ = '\0';

	/* Send the command. */
	res = PQexec(conn, t);
	if (PQresultStatus(res) != PGRES_COMMAND_OK) {
		fprintf(stderr, "%s: \"%s\": %s", progname, t,
			PQresultErrorMessage(res));
		goto done;
	}
	if (strcmp(PQcmdTuples(res), "1") != 0) {
		fprintf(stderr, "%s: \"%s...\": '%s' tuples? (%s)\n",
			progname, cmd, PQcmdTuples(res), PQcmdStatus(res));
		goto done;
	}
	status = 0;

which is really, really painful.
the large "text" is a great idea, but the old "lo_" API actually had some
things going for it.

> Possible causes of the 1.5s "mode" (at least as a starting point):
>
>  o task scheduling on your test machine (not likely??)
>
>  o swapping/thrashing on your test machine (not likely??)

the machine was idle other than for this test.  it's a two processor
freebsd machine.

>  o WAL fsync() log commits and cleanup (aggregate throughput is great,
>    but every once in a while someone waits while the paperwork gets done.
>    Waiting may be due to processor resource competition)
>
>  o Underlying file system bookkeeping from the kernel. e.g. flushing
>    buffers to disk etc etc.

i'm going to make a 500MB MFS partition for /usr/local/pgsql/data/base
if the PRIMARY KEY idea doesn't work out, just to rule out actuator noise.
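(for the record, the plan-forcing parameters Thomas mentions are the
enable_* run-time settings; a per-session sequence would look something
like the following — table and column names are made up for illustration:)

```sql
-- discourage sequential scans so the planner prefers the index
SET enable_seqscan TO off;
SELECT text_col FROM big_table WHERE key_col = 'some-key';
-- restore the default for the rest of the session
SET enable_seqscan TO on;
```

(since each pgcat invocation is its own session, the restore step is
belt-and-suspenders here, but matters in long-lived sessions.)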