Thread: [PERFORMANCE] Insights: fseek OR read_cluster?
Hello, I am interested in how PostgreSQL reads a table file. Do its indexes store and use the locations of the filesystem clusters that contain the required data (so that it can issue a low-level cluster read), or does it just fseek inside the file? Thank you
On 24/09/2011 2:49 PM, Antonio Rodriges wrote: > Hello, > > It is interesting how PostgreSQL reads the table file. > Whether its indexes store/use filesystem cluster locations containing > required data (so it can issue a low level cluster read) or it just > fseeks inside a file? What is read_cluster()? Are you talking about some kind of async and/or direct I/O? If so, PostgreSQL is not designed for direct I/O; it benefits from using the OS's buffer cache, I/O scheduler, etc. IIRC Pg uses pread() to read from its data files, but I didn't double-check the sources to make sure. -- Craig Ringer
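For readers unfamiliar with pread(), here is a minimal sketch of a positioned read on a POSIX system. It illustrates the system call only and is not taken from the PostgreSQL sources; the helper name `read_at` is hypothetical.

```c
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper (not from PostgreSQL): read up to `len` bytes
 * starting at byte `offset`, without moving the descriptor's file position.
 * Returns the number of bytes read (short only at end of file), or -1 on error. */
ssize_t read_at(int fd, void *buf, size_t len, off_t offset)
{
    size_t done = 0;

    while (done < len) {
        /* pread() takes the offset as an argument, so no separate seek is needed. */
        ssize_t n = pread(fd, (char *)buf + done, len - done,
                          offset + (off_t)done);
        if (n < 0)
            return -1;          /* I/O error */
        if (n == 0)
            break;              /* end of file */
        done += (size_t)n;
    }
    return (ssize_t)done;
}
```

Either way, the read goes through the file system and the OS buffer cache; the caller names a byte offset in a file, never a disk cluster.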
Thank you, Craig, your answers are always insightful. > What is read_cluster() ? Are you talking about some kind of async and/or > direct I/O? I meant that, to read a chunk of data from a file, you might (1) skip the traditional fseek and instead remember the hard drive cluster numbers to speed up disk seeks, and (2) read the disk cluster directly.
On Mon, Sep 26, 2011 at 15:51, Antonio Rodriges <antonio.rrz@gmail.com> wrote: >> What is read_cluster() ? Are you talking about some kind of async and/or > > I meant that if you want to read a chunk of data from file you (1) > might not call traditional fseek but rather memorize hard drive > cluster numbers to boost disk seeks and, (2) perform the read of disk > cluster directly. PostgreSQL accesses regular files on a file system via lseek(), read() and write() calls; there is no magic. In modern extent-based file systems, mapping a file offset to a physical disk sector is very fast compared to the time it takes to actually access the disk. I can't see how direct cluster access would even work, unless you gave the database direct access to a raw partition, in which case Postgres would effectively have to implement its own file system. The gains are simply not worth it for Postgres; our developer resources are better spent elsewhere. Regards, Marti
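The lseek() plus read() pattern described above can be sketched as follows. This is an illustration under stated assumptions, not PostgreSQL's actual code: the helper name `read_block_seek` is hypothetical, and `BLOCK_SIZE` is set to 8192 because that is PostgreSQL's default page size. The point is that a block number is turned into a plain byte offset within the file, and the file system does the mapping to disk sectors.

```c
#include <stdint.h>
#include <sys/types.h>
#include <unistd.h>

#define BLOCK_SIZE 8192   /* assumption: PostgreSQL's default page size */

/* Hypothetical helper: read block number `blkno` from an open file
 * using lseek() followed by read().
 * Returns 0 on success, -1 on error or if the block is incomplete. */
int read_block_seek(int fd, uint32_t blkno, void *buf)
{
    /* A block number maps to a byte offset in the file, not to a disk cluster. */
    off_t offset = (off_t)blkno * BLOCK_SIZE;
    size_t done = 0;

    if (lseek(fd, offset, SEEK_SET) == (off_t)-1)
        return -1;

    while (done < BLOCK_SIZE) {
        ssize_t n = read(fd, (char *)buf + done, BLOCK_SIZE - done);
        if (n <= 0)
            return -1;      /* error, or end of file before a full block */
        done += (size_t)n;
    }
    return 0;
}
```

A caller would open the file with open(), pass the descriptor and a block number, and receive one full block in `buf`.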
Thank you, Marti. Is there any comprehensive survey of (at least most, if not all) modern features of operating systems, for example I/O scheduling, extent-based filesystems, etc.?