Re: [HACKERS] Another nasty cache problem - Mailing list pgsql-hackers
From | Patrick Welche |
---|---|
Subject | Re: [HACKERS] Another nasty cache problem |
Date | |
Msg-id | 20000203112434.B1509@quartz.newn.cam.ac.uk Whole thread Raw |
In response to | Re: [HACKERS] Another nasty cache problem (Tom Lane <tgl@sss.pgh.pa.us>) |
Responses |
Re: [HACKERS] Another nasty cache problem
Re: [HACKERS] Another nasty cache problem Re: [HACKERS] Another nasty cache problem |
List | pgsql-hackers |
On Mon, Jan 31, 2000 at 09:02:30PM -0500, Tom Lane wrote: > Patrick Welche <prlw1@newn.cam.ac.uk> writes: > > Tom Lane wrote: > > There are cache-flush-related bugs still left to deal with, but they > seem to be far lower in probability than the ones squashed so far. > I'm finding that even with MAXNUMMESSAGES set to 8, the parallel tests > usually pass; so it seems we need some other way of testing to nail down > the remaining problems. > > > I also tried that nonsensical join from the other day, and it failed in > > the same way again: > > newnham=# select * from crsids,"tblPerson" where > > newnham-# crsids.crsid != "tblPerson"."CRSID"; > > Backend sent B message without prior T > > Hmm. Can you provide a self-contained test case (a script to build the > failing tables, preferably)? It seems this is a memory exhaustion thing: I have 128Mb real memory. Attached below is the C program used to create some random data in tables test and test2 of database test (which needs to exist). Executing the non-sensical query select * from test,test2 where test.i!=test2.i; should result in 2600*599=1557400 (ie lots of) rows to be returned. The process's memory consumption during this select grows to 128Mb, and after a moment or two: Backend sent D message without prior T Backend sent D message without prior T ... Which isn't quite the same message as before, but is of the same type. 59 processes: 2 running, 57 sleeping CPU states: 2.3% user, 86.4% nice, 9.3% system, 0.0% interrupt, 1.9% idle Memory: 74M Act, 37M Inact, 184K Wired, 364K Free, 95M Swap, 262M Swap free PID USERNAME PRI NICE SIZE RES STATE TIME WCPU CPU COMMAND1547 prlw1 50 0 128M 516K run 1:28 59.28%59.28% psql1552 postgres 50 0 1920K 632K run 1:37 24.32% 24.32% postgres later, while the "Backend sent..." messages appear 1547 prlw1 -5 0 128M 68M sleep 1:41 23.00% 23.00% psql1552 postgres 2 0 1920K 4K sleep 1:41 141.00% 6.88% <postgres> Note that there is still plenty of swap space. The 128Mb number seems to be more than a coincidence (how to prove?) So, is this only happening to me? How can lack of real memory affect timing of interprocess communication? Cheers, Patrick ========================================================================== #include <ctype.h> #include <stdio.h> #include <stdlib.h> #include "libpq-fe.h" const char *progname; PGresult *send_query(PGconn *db, const char *query) { PGresult *res; res=PQexec(db,query); switch(PQresultStatus(res)) { case PGRES_EMPTY_QUERY: printf("PGRES_EMPTY_QUERY: %s\n",query); break; case PGRES_COMMAND_OK: printf("PGRES_COMMAND_OK: %s\n",query); break; casePGRES_TUPLES_OK: printf("PGRES_TUPLES_OK: %s\n",query); break; case PGRES_COPY_OUT: printf("PGRES_COPY_OUT:%s\n",query); break; case PGRES_COPY_IN: printf("PGRES_COPY_IN: %s\n",query); break; case PGRES_BAD_RESPONSE: printf("PGRES_BAD_RESPONSE: %s\n",query); exit(1); break; casePGRES_NONFATAL_ERROR: printf("PGRES_NONFATAL_ERROR: %s\n",query); break; case PGRES_FATAL_ERROR: printf("PGRES_FATAL_ERROR: %s\n",query); exit(1); break; default: fprintf(stderr,"Error from %s:Unknown response from "\ "PQresultStatus()\n",progname); exit(1); break; } return res; } char get_letter(void) { int c; do c=(int)random()%128; while(!(isascii(c)&&isalpha(c))); return (char)tolower(c); } unsigned int get_num(void) { return random()%100; } int main(int argc, char* argv[]) { char id[7],query[2048]; int i; PGconn *db; PGresult *res; progname=argv[0]; srandom(42); /* same data each time hopefully */ db=PQconnectdb("dbname=test"); if(PQstatus(db)==CONNECTION_BAD) { fprintf(stderr,"Error from %s: Unable to connectto database \"test\".\n", progname); exit(1); } res=send_query(db,"create table test (txt text,var varchar(7),i integer)"); PQclear(res); res=send_query(db,"create tabletest2(txt text,var varchar(7),i integer)"); PQclear(res); for(i=1;i<=2600;++i) { sprintf(id,"%c%c%c%c%03u",get_letter(),get_letter(),get_letter(), get_letter(),get_num()); sprintf(query,"insert into test values ('%s','%s','%i')",id,id,i); res=send_query(db,query); PQclear(res); } for(i=1;i<=600;++i) { sprintf(id,"%c%c%c%c%03u",get_letter(),get_letter(),get_letter(), get_letter(),get_num()); sprintf(query,"insert into test2 values ('%s','%s','%i')",id,id,i); res=send_query(db,query); PQclear(res); } PQfinish(db); return 0; }
pgsql-hackers by date: