Thread: Re: serverlog rotation/functions
Andreas Pflug wrote:
> Bruce Momjian wrote:
> > This seems quite involved. Can we get the basic functionality I
> > described first?
>
> Current workable patch.
>
> Some questions/limitations:
> - What's the official way to restrict pg_* functions to superuser only?

Very crudely :-)

    static int
    pg_signal_backend(int pid, int sig)
    {
        if (!superuser())
            ereport(ERROR,
                    (errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
                     errmsg("only superuser can signal other backends")));
        ...

> - I've restricted pg_file_read to 50k max. What's a reasonable limit for
>   a generic function?

Uh, that seems fine. You already check to see it is within the limit.
I think a bigger question is whether we should limit it at all. Do we
limit pg_largeobject? Is that similar?

> - pg_file_read and pg_file_write read/write text only; should have
>   binary versions too.

I guess we could, but no one is asking for that yet, so I would leave it
for later.

> Very open question:
> - How should a backend know the logger's pid if it's not in shmem? Write
>   a magic string to the pipe?

I think it has to, and in fact the pid is being written by the
postmaster, not by the logger process, so that should be OK. The issue
is that the logger shouldn't _attach_ to shared memory unless it has to.

As far as recording the current log timestamp, I think that will be a
problem. I would much rather see us forget about doing timestamp
processing with these log files, keep it simple at this point, and see
what needs we have for 7.6.

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  + If your life is a hard drive,      |  13 Roberts Road
  + Christ can be your backup.         |  Newtown Square, Pennsylvania 19073
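For reference, a minimal sketch of the same superuser check applied to a
pg_file_read-style function, with a hard cap on the bytes returned. The
function name, signature, and the MAX_FILE_READ constant are assumptions
based on the discussion above, not the actual patch.

    /*
     * Sketch only: superuser-only file read with an upper bound on the
     * number of bytes returned.  Names and signature are assumptions.
     */
    #include "postgres.h"
    #include "fmgr.h"
    #include "miscadmin.h"
    #include "utils/builtins.h"

    #define MAX_FILE_READ   (50 * 1024)

    PG_FUNCTION_INFO_V1(pg_file_read);

    Datum
    pg_file_read(PG_FUNCTION_ARGS)
    {
        text       *filename_t = PG_GETARG_TEXT_PP(0);
        int64       seek = PG_GETARG_INT64(1);
        int64       len = PG_GETARG_INT64(2);
        char       *filename;
        char       *buf;
        size_t      nread;
        FILE       *f;

        if (!superuser())
            ereport(ERROR,
                    (errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
                     errmsg("only superuser can read server files")));

        if (len <= 0 || len > MAX_FILE_READ)
            ereport(ERROR,
                    (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
                     errmsg("requested length must be between 1 and %d bytes",
                            MAX_FILE_READ)));

        filename = text_to_cstring(filename_t);

        if ((f = fopen(filename, "r")) == NULL)
            ereport(ERROR,
                    (errcode_for_file_access(),
                     errmsg("could not open file \"%s\": %m", filename)));

        fseeko(f, (off_t) seek, SEEK_SET);
        buf = palloc(len);
        nread = fread(buf, 1, len, f);
        fclose(f);

        PG_RETURN_TEXT_P(cstring_to_text_with_len(buf, nread));
    }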
Bruce Momjian wrote:
>> - What's the official way to restrict pg_* functions to superuser only?
>
> Very crudely :-)

Got it.

Another question: Is reading the logfile a task that should be allowed
to superusers only? I don't think so, though ACLs might apply.

> Uh, that seems fine. You already check to see it is within the limit.
> I think a bigger question is whether we should limit it at all. Do we
> limit pg_largeobject? Is that similar?

Ok, no limit (but a default maximum of 50k remains). And since it's
superuser only, he hopefully knows what he's doing.

>> Very open question:
>> - How should a backend know the logger's pid if it's not in shmem? Write
>>   a magic string to the pipe?
>
> I think it has to, and in fact the pid is being written by the
> postmaster, not by the logger process, so that should be OK. The issue
> is that the logger shouldn't _attach_ to shared memory unless it has to.

It doesn't. It inherits the unnamed shared mem segment from the
postmaster, as all subprocesses do.

> As far as recording the current log timestamp, I think that will be a
> problem. I would much rather see us forget about doing timestamp
> processing with these log files, keep it simple at this point, and see
> what needs we have for 7.6.

I have to insist on this point a bit. Remember, this all started from the
attempt to display the serverlog on the client side. To do this, I need
a way to retrieve the current logfile properties (size, and in case of
rotation the timestamp too) in a low-overhead way, or at least to learn
that something has changed. Scanning a whole directory and interpreting
the data isn't low overhead any more.

There's no locking on the shmem, and the only dependence on shmem is the
existence of it at the time of rotation. If the shmem is gone, the
postmaster is probably dead anyway.

Regards,
Andreas
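Roughly the kind of shared-memory record being discussed here: a logger
pid plus the current logfile's start time that backends could read
directly. The struct, field, and function names below are hypothetical
illustrations, not taken from the patch.

    /*
     * Sketch only: the shared-memory record under discussion.  All names
     * here are hypothetical, not the actual patch.
     */
    #include "postgres.h"
    #include "fmgr.h"
    #include "pgtime.h"
    #include "utils/timestamp.h"

    typedef struct SysLoggerShmemStruct
    {
        pid_t       syslogger_pid;      /* pid of the log collector process */
        pg_time_t   current_log_start;  /* start time of the current logfile */
    } SysLoggerShmemStruct;

    static SysLoggerShmemStruct *SysLoggerShmem = NULL;

    PG_FUNCTION_INFO_V1(pg_logfile_timestamp);

    /*
     * Backend-callable accessor: return the start time of the current
     * logfile without scanning the log directory.
     */
    Datum
    pg_logfile_timestamp(PG_FUNCTION_ARGS)
    {
        if (SysLoggerShmem == NULL)
            ereport(ERROR,
                    (errmsg("log collector shared state is not available")));

        PG_RETURN_TIMESTAMPTZ(
            time_t_to_timestamptz(SysLoggerShmem->current_log_start));
    }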
Andreas Pflug wrote:
> Bruce Momjian wrote:
> >> - What's the official way to restrict pg_* functions to superuser only?
> >
> > Very crudely :-)
>
> Got it.
>
> Another question: Is reading the logfile a task that should be allowed
> to superusers only? I don't think so, though ACLs might apply.

Yes, the log file might contain SQL queries issued by others. It is a
superuser-only capability.

> > Uh, that seems fine. You already check to see it is within the limit.
> > I think a bigger question is whether we should limit it at all. Do we
> > limit pg_largeobject? Is that similar?
>
> Ok, no limit (but a default maximum of 50k remains). And since it's
> superuser only, he hopefully knows what he's doing.

Huh? Why have a default maximum?

> >> Very open question:
> >> - How should a backend know the logger's pid if it's not in shmem? Write
> >>   a magic string to the pipe?
> >
> > I think it has to, and in fact the pid is being written by the
> > postmaster, not by the logger process, so that should be OK. The issue
> > is that the logger shouldn't _attach_ to shared memory unless it has to.
>
> It doesn't. It inherits the unnamed shared mem segment from the
> postmaster, as all subprocesses do.

Ah, I think it needs to close that as soon as it starts. Don't other
subprocesses do that? That shared memory is very fragile and we don't
want an errant pointer poking in there.

> > As far as recording the current log timestamp, I think that will be a
> > problem. I would much rather see us forget about doing timestamp
> > processing with these log files, keep it simple at this point, and see
> > what needs we have for 7.6.
>
> I have to insist on this point a bit. Remember, this all started from the
> attempt to display the serverlog on the client side. To do this, I need
> a way to retrieve the current logfile properties (size, and in case of
> rotation the timestamp too) in a low-overhead way, or at least to learn
> that something has changed. Scanning a whole directory and interpreting
> the data isn't low overhead any more.

This seems clean and fast enough to me:

    SELECT filename
    FROM pg_dir_ls('/var/log')
    ORDER BY 1 DESC
    LIMIT 1

Considering that any query from a client is going to have to go through
the parser and be executed, an 'ls' in a directory just isn't a
measurable performance hit.

If you want, run a test that does an 'ls' and one that doesn't to see
that there is no measurable performance difference.

I would not worry about the clock going backward. PostgreSQL would have
enough problems with timestamp columns moving backward that the file log
times are the least of our problems.

> There's no locking on the shmem, and the only dependence on shmem is the
> existence of it at the time of rotation. If the shmem is gone, the
> postmaster is probably dead anyway.

You can't know that you aren't reading corrupt data if you read shared
memory without a lock. What if the write is happening as you read?

The only clean solution I can think of is to write an operating system
file that contains the current log filename and read from that. I
believe such writes are atomic. But again, this seems like overkill to
me.
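As an illustration of that last idea, a minimal sketch of publishing the
current logfile name through an ordinary file: write to a temporary file,
then rename() it into place, so a reader never sees a half-written name.
The path names and the helper function are assumptions for illustration
only, not part of any patch.

    /*
     * Sketch only: publish the current logfile name atomically.  Readers
     * of <logdir>/current_logfile never see a partially written name,
     * because rename() replaces the file in one step on POSIX systems.
     * File and function names are illustrative assumptions.
     */
    #include <stdio.h>

    static int
    publish_current_logfile(const char *logdir, const char *logfilename)
    {
        char    tmppath[1024];
        char    finalpath[1024];
        FILE   *f;

        snprintf(tmppath, sizeof(tmppath), "%s/current_logfile.tmp", logdir);
        snprintf(finalpath, sizeof(finalpath), "%s/current_logfile", logdir);

        if ((f = fopen(tmppath, "w")) == NULL)
            return -1;
        fprintf(f, "%s\n", logfilename);
        if (fclose(f) != 0)
            return -1;

        /* atomically replace any previous contents */
        if (rename(tmppath, finalpath) != 0)
            return -1;
        return 0;
    }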
Bruce Momjian wrote:
>> Ok, no limit (but a default maximum of 50k remains). And since it's
>> superuser only, he hopefully knows what he's doing.
>
> Huh? Why have a default maximum?

Just for convenience. Both start and size are optional parameters, with
defaults start=0 and size=50000. Well, it's a very special function
anyway, so we could require the user to supply all parameters. I'll
remove it.

> Ah, I think it needs to close that as soon as it starts. Don't other
> subprocesses do that? That shared memory is very fragile and we don't
> want an errant pointer poking in there.

The result of an errant pointer writing to that shared mem would be

1) a wrong pid for SysLogger, so it can't be signalled to rotate from
   backends
2) a wrong timestamp, so backends don't know the latest logfile.

Nothing particularly crash prone really.

> This seems clean and fast enough to me:
>
>     SELECT filename
>     FROM pg_dir_ls('/var/log')
>     ORDER BY 1 DESC
>     LIMIT 1

For a logfile listing function, this would look like:

    SELECT MAX(startdate)
    FROM pg_logfile_ls()

> Considering that any query from a client is going to have to go through
> the parser and be executed, an 'ls' in a directory just isn't a
> measurable performance hit.
>
> If you want, run a test that does an 'ls' and one that doesn't to see
> that there is no measurable performance difference.

So while a simple PG_RETURN_TIMESTAMP(logfiletimestamp) is drastically
faster than a lengthy set-returning function, the difference appears
much less drastic once parser overhead is included.

> I would not worry about the clock going backward. PostgreSQL would have
> enough problems with timestamp columns moving backward that the file log
> times are the least of our problems.

I see, so the admin is in trouble anyway (what about PITR? Data column
deviations appear harmless compared to a restore based on timestamps).

> You can't know that you aren't reading corrupt data if you read shared
> memory without a lock. What if the write is happening as you read?

I thought about this quite a while.

If the shmem fields aren't written atomically (one is 32-bit, one 64-bit,
probably on dword boundaries, so writes will happen at least at processor
bus width; do we support any 16-bit processors?), the corruption
consequences above apply. In the case of the timestamp, the high word
will rarely change anyway, only every 2^32 seconds...

Concurrent access to the logger pid would mean calling
pg_logfile_rotate() while a killed logger is being restarted, which
creates a new logfile anyway. This could send a SIGINT into outer space,
maybe to the bgwriter, triggering a checkpoint, or to the postmaster,
shutting it down (gracefully, but still unwanted).

BTW, the consequences of a trigger flag in shmem would be smaller,
because all that could happen is a log rotation (which appends to
existing files, just in case the syslogger died in the milliseconds
after a rotation).

> The only clean solution I can think of is to write an operating system
> file that contains the current log filename and read from that. I
> believe such writes are atomic. But again, this seems like overkill to
> me.

Ah wait.
Digging further behind SIGUSR1 I now *do* see a solution without pid in
shmem, using SendPostmasterSignal. Well, a little hint from the gurus
would have helped...

I'll convert to this, *dropping* all shmem.

Regards,
Andreas
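For illustration, a minimal sketch of the mechanism being converted to:
the backend asks the postmaster, which forwards the request to the
syslogger child, so no logger pid is needed in shared memory. This
assumes a PMSIGNAL_ROTATE_LOGFILE reason code added to pmsignal.h; it is
a sketch of the idea, not the actual patch.

    /*
     * Sketch only: backend-callable rotation via the postmaster.
     * PMSIGNAL_ROTATE_LOGFILE is an assumed addition to pmsignal.h.
     */
    #include "postgres.h"
    #include "fmgr.h"
    #include "miscadmin.h"
    #include "storage/pmsignal.h"

    PG_FUNCTION_INFO_V1(pg_logfile_rotate);

    Datum
    pg_logfile_rotate(PG_FUNCTION_ARGS)
    {
        if (!superuser())
            ereport(ERROR,
                    (errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
                     errmsg("only superuser can rotate log files")));

        /* ask the postmaster; it signals the syslogger child */
        SendPostmasterSignal(PMSIGNAL_ROTATE_LOGFILE);

        PG_RETURN_BOOL(true);
    }

    /*
     * And, roughly, in the postmaster's SIGUSR1 handler:
     *
     *     if (CheckPostmasterSignal(PMSIGNAL_ROTATE_LOGFILE) &&
     *         SysLoggerPID != 0)
     *         kill(SysLoggerPID, SIGUSR1);
     */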
Andreas Pflug wrote:
> Ah wait.
> Digging further behind SIGUSR1 I now *do* see a solution without pid in
> shmem, using SendPostmasterSignal. Well, a little hint from the gurus
> would have helped...

Oops, SendPostmasterSignal uses shmem....

At least, this enables syslogger.c to be free from shmem stuff, except
for PGSharedMemoryDetach.

Regards,
Andreas
Andreas Pflug wrote:
> Bruce Momjian wrote:
> >> Ok, no limit (but a default maximum of 50k remains). And since it's
> >> superuser only, he hopefully knows what he's doing.
> >
> > Huh? Why have a default maximum?
>
> Just for convenience. Both start and size are optional parameters, with
> defaults start=0 and size=50000. Well, it's a very special function
> anyway, so we could require the user to supply all parameters. I'll
> remove it.

Agreed, and maybe a zero value gets the entire file.

> > Ah, I think it needs to close that as soon as it starts. Don't other
> > subprocesses do that? That shared memory is very fragile and we don't
> > want an errant pointer poking in there.
>
> The result of an errant pointer writing to that shared mem would be
>
> 1) a wrong pid for SysLogger, so it can't be signalled to rotate from
>    backends
> 2) a wrong timestamp, so backends don't know the latest logfile.
>
> Nothing particularly crash prone really.

No, I am thinking of the case where the program goes crazy and writes
everywhere.

> > This seems clean and fast enough to me:
> >
> >     SELECT filename
> >     FROM pg_dir_ls('/var/log')
> >     ORDER BY 1 DESC
> >     LIMIT 1
>
> For a logfile listing function, this would look like:
>
>     SELECT MAX(startdate)
>     FROM pg_logfile_ls()
>
> > Considering that any query from a client is going to have to go through
> > the parser and be executed, an 'ls' in a directory just isn't a
> > measurable performance hit.
> >
> > If you want, run a test that does an 'ls' and one that doesn't to see
> > that there is no measurable performance difference.
>
> So while a simple PG_RETURN_TIMESTAMP(logfiletimestamp) is drastically
> faster than a lengthy set-returning function, the difference appears
> much less drastic once parser overhead is included.
>
> > I would not worry about the clock going backward. PostgreSQL would have
> > enough problems with timestamp columns moving backward that the file log
> > times are the least of our problems.
>
> I see, so the admin is in trouble anyway (what about PITR? Data column
> deviations appear harmless compared to a restore based on timestamps).

PITR uses WAL numbering so it would be fine, but the timestamps on the
commit records would have problems.

> > You can't know that you aren't reading corrupt data if you read shared
> > memory without a lock. What if the write is happening as you read?
>
> I thought about this quite a while.
>
> If the shmem fields aren't written atomically (one is 32-bit, one 64-bit,
> probably on dword boundaries, so writes will happen at least at processor
> bus width; do we support any 16-bit processors?), the corruption
> consequences above apply. In the case of the timestamp, the high word
> will rarely change anyway, only every 2^32 seconds...
>
> Concurrent access to the logger pid would mean calling
> pg_logfile_rotate() while a killed logger is being restarted, which
> creates a new logfile anyway. This could send a SIGINT into outer space,
> maybe to the bgwriter, triggering a checkpoint, or to the postmaster,
> shutting it down (gracefully, but still unwanted).
>
> BTW, the consequences of a trigger flag in shmem would be smaller,
> because all that could happen is a log rotation (which appends to
> existing files, just in case the syslogger died in the milliseconds
> after a rotation).
>
> > The only clean solution I can think of is to write an operating system
> > file that contains the current log filename and read from that. I
> > believe such writes are atomic. But again, this seems like overkill to
> > me.
>
> Ah wait.
> Digging further behind SIGUSR1 I now *do* see a solution without pid in
> shmem, using SendPostmasterSignal. Well, a little hint from the gurus
> would have helped...
>
> I'll convert to this, *dropping* all shmem.

Yes, that is the usual method. We signal the postmaster and it then
does the signalling to the logger. I thought you had looked at other
backend signalling examples, so I didn't explain it.

One really good efficiency gain would be to use LISTEN/NOTIFY so clients
could know when new data has appeared in the log, or the log file has
been rotated. Now that's efficiency! However, let's get this
infrastructure completed first. One wacky idea would be for the clients
to LISTEN on 'pg_new_logfile' and have the logger do
system('psql -c "NOTIFY pg_new_logfile" template1') or something like
that.
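For illustration, the client side of that LISTEN idea with libpq, using
the pg_new_logfile channel name from the message above. This is only a
sketch of standard LISTEN handling, independent of how the NOTIFY would
actually be sent.

    /* Sketch only: a libpq client waiting for pg_new_logfile notifications. */
    #include <stdio.h>
    #include <sys/select.h>
    #include "libpq-fe.h"

    int
    main(void)
    {
        PGconn     *conn = PQconnectdb("dbname=template1");
        PGresult   *res;
        PGnotify   *note;

        if (PQstatus(conn) != CONNECTION_OK)
        {
            fprintf(stderr, "connection failed: %s", PQerrorMessage(conn));
            return 1;
        }

        res = PQexec(conn, "LISTEN pg_new_logfile");
        PQclear(res);

        for (;;)
        {
            int     sock = PQsocket(conn);
            fd_set  input_mask;

            FD_ZERO(&input_mask);
            FD_SET(sock, &input_mask);

            /* block until something arrives on the connection */
            if (select(sock + 1, &input_mask, NULL, NULL, NULL) < 0)
                break;

            PQconsumeInput(conn);
            while ((note = PQnotifies(conn)) != NULL)
            {
                printf("log file changed (notified by pid %d)\n", note->be_pid);
                PQfreemem(note);
            }
        }

        PQfinish(conn);
        return 0;
    }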
Andreas Pflug wrote:
> Andreas Pflug wrote:
> > Ah wait.
> > Digging further behind SIGUSR1 I now *do* see a solution without pid in
> > shmem, using SendPostmasterSignal. Well, a little hint from the gurus
> > would have helped...
>
> Oops, SendPostmasterSignal uses shmem....
>
> At least, this enables syslogger.c to be free from shmem stuff, except
> for PGSharedMemoryDetach.

Right. We already have to use shared mem for the backends and
postmaster. It is the logger we are worried about.

Tom brought up the point that if the logger used shared memory, we would
have to kill/restart it if we need to reinitialize shared memory, meaning
we would lose logging info at a time we really need it --- again a good
reason not to use shared memory in the logger.
Bruce Momjian wrote:
> Andreas Pflug wrote:
>
> Right. We already have to use shared mem for the backends and
> postmaster. It is the logger we are worried about.
>
> Tom brought up the point that if the logger used shared memory, we would
> have to kill/restart it if we need to reinitialize shared memory,

I don't know why that particular segment should ever be renewed. Anyway,
it's gone.

Regards,
Andreas
Andreas Pflug wrote:
> Bruce Momjian wrote:
> > Andreas Pflug wrote:
> >
> > Right. We already have to use shared mem for the backends and
> > postmaster. It is the logger we are worried about.
> >
> > Tom brought up the point that if the logger used shared memory, we would
> > have to kill/restart it if we need to reinitialize shared memory,
>
> I don't know why that particular segment should ever be renewed. Anyway,
> it's gone.

As I remember, we have one big shared memory segment. Were you creating
a special one just for this timestamp? If you were, I see why your
approach was safer, but as you said, it doesn't buy us much anyway.
Bruce Momjian wrote:
> Andreas Pflug wrote:
>> Just for convenience. Both start and size are optional parameters, with
>> defaults start=0 and size=50000. Well, it's a very special function
>> anyway, so we could require the user to supply all parameters. I'll
>> remove it.
>
> Agreed, and maybe a zero value gets the entire file.

Which would be a default param again, maybe on a 100MB file? Better not.
Let's leave it to the admin to do sick stuff like
pg_read_file('base/5000/5002', 0, 100000000) ...

> No, I am thinking of the case where the program goes crazy and writes
> everywhere.

What I described was just that situation.

> Yes, that is the usual method. We signal the postmaster and it then
> does the signalling to the logger. I thought you had looked at other
> backend signalling examples, so I didn't explain it.

Well, if you know the places where backends signal stuff to the
postmaster... Still, somebody could have yelled "use the standard way
before reinventing the wheel".

> One really good efficiency gain would be to use LISTEN/NOTIFY so clients
> could know when new data has appeared in the log, or the log file has
> been rotated. Now that's efficiency! However, let's get this
> infrastructure completed first. One wacky idea would be for the clients
> to LISTEN on 'pg_new_logfile' and have the logger do
> system('psql -c "NOTIFY pg_new_logfile" template1') or something like
> that.

No, certainly not. This would mean that every time something is logged,
psql is fired up. Tom wouldn't accept this as KISS, I believe. And h*ll,
that would cause traffic (just imagine a single log message on client
startup...)

What you saw at LinuxTag was pgAdmin3 polling once a second to see
whether the logfile length had changed, which is the fastest setting
possible.

Regards,
Andreas
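For comparison, a rough sketch of that once-a-second polling approach in
libpq. pg_logfile_length() is a hypothetical function name standing in
for whatever the final interface exposes for the logfile size.

    /*
     * Sketch only: poll the server once a second for the current logfile
     * length.  pg_logfile_length() is hypothetical, used for illustration.
     */
    #include <stdio.h>
    #include <stdlib.h>
    #include <unistd.h>
    #include "libpq-fe.h"

    int
    main(void)
    {
        PGconn *conn = PQconnectdb("dbname=template1");
        long    last_len = -1;

        if (PQstatus(conn) != CONNECTION_OK)
        {
            fprintf(stderr, "connection failed: %s", PQerrorMessage(conn));
            return 1;
        }

        for (;;)
        {
            PGresult *res;

            if (PQstatus(conn) != CONNECTION_OK)
                break;

            res = PQexec(conn, "SELECT pg_logfile_length()");
            if (PQresultStatus(res) == PGRES_TUPLES_OK)
            {
                long    len = atol(PQgetvalue(res, 0, 0));

                if (len != last_len)
                {
                    printf("logfile changed: %ld bytes\n", len);
                    last_len = len;
                }
            }
            PQclear(res);

            sleep(1);           /* the once-a-second poll described above */
        }

        PQfinish(conn);
        return 0;
    }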