Re: Improve WALRead() to suck data directly from WAL buffers when possible - Mailing list pgsql-hackers
From | Nathan Bossart |
---|---|
Subject | Re: Improve WALRead() to suck data directly from WAL buffers when possible |
Date | |
Msg-id | 20230306220027.GA3122138@nathanxps13 Whole thread Raw |
In response to | Re: Improve WALRead() to suck data directly from WAL buffers when possible (Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>) |
Responses |
Re: Improve WALRead() to suck data directly from WAL buffers when possible
|
List | pgsql-hackers |
+void +XLogReadFromBuffers(XLogRecPtr startptr, + TimeLineID tli, + Size count, + char *buf, + Size *read_bytes) Since this function presently doesn't return anything, can we have it return the number of bytes read instead of storing it in a pointer variable? + ptr = startptr; + nbytes = count; + dst = buf; These variables seem superfluous. + /* + * Requested WAL isn't available in WAL buffers, so return with + * what we have read so far. + */ + break; nitpick: I'd move this to the top so that you can save a level of indentation. if (expectedEndPtr != endptr) break; ... logic for when the data is found in the WAL buffers ... + /* + * All the bytes are not in one page. Read available bytes on + * the current page, copy them over to output buffer and + * continue to read remaining bytes. + */ Is it possible to memcpy more than a page at a time? + /* + * The fact that we acquire WALBufMappingLock while reading the WAL + * buffer page itself guarantees that no one else initializes it or + * makes it ready for next use in AdvanceXLInsertBuffer(). + * + * However, we perform basic page header checks for ensuring that + * we are not reading a page that just got initialized. Callers + * will anyway perform extensive page-level and record-level + * checks. + */ Hm. I wonder if we should make these assertions instead. + elog(DEBUG1, "read %zu bytes out of %zu bytes from WAL buffers for given LSN %X/%X, Timeline ID %u", + *read_bytes, count, LSN_FORMAT_ARGS(startptr), tli); I definitely don't think we should put an elog() in this code path. Perhaps this should be guarded behind WAL_DEBUG. + /* + * Check if we have read fully (hit), partially (partial hit) or + * nothing (miss) from WAL buffers. If we have read either partially or + * nothing, then continue to read the remaining bytes the usual way, + * that is, read from WAL file. + */ + if (count == read_bytes) + { + /* Buffer hit, so return. */ + return true; + } + else if (read_bytes > 0 && count > read_bytes) + { + /* + * Buffer partial hit, so reset the state to count the read bytes + * and continue. + */ + buf += read_bytes; + startptr += read_bytes; + count -= read_bytes; + } + + /* Buffer miss i.e., read_bytes = 0, so continue */ I think we can simplify this. We effectively take the same action any time "count" doesn't equal "read_bytes", so there's no need for the "else if". if (count == read_bytes) return true; buf += read_bytes; startptr += read_bytes; count -= read_bytes; -- Nathan Bossart Amazon Web Services: https://aws.amazon.com
pgsql-hackers by date: