Re: browsing table with 2 million records - Mailing list pgsql-performance
From | PFC |
---|---|
Subject | Re: browsing table with 2 million records |
Date | |
Msg-id | op.sza6qpz4th1vuj@localhost Whole thread Raw |
In response to | Re: browsing table with 2 million records ("Merlin Moncure" <merlin.moncure@rcsonline.com>) |
List | pgsql-performance |
> I've done it... > First of all I totally agree with PFC's rant regarding absolute > positioning while browsing datasets. Among other things, it has serious > problems if you have multiple updating your table. Also it's kind of > silly to be doing this in a set based data paradigm. Recently I've been browsing some site and it had this problem : as users kept adding new entries as I was browsing the list page after page, when I hit "next page" I got on the next page half of what I already saw on the previous page. Of course the webmaster has set the "visited links" color the same as "unvisited links", so I couldn't tell, and had to use my brain, which was quite upsetting XDDD And bookmarking a page to resume browsing at some later time does not work either, because if I bookmark page 15, then when I come back, users have added 10 pages of content and what I bookmarked is now on page 25... >> All very well and good, but now do it generically... Hehe. I like ranting... It is not possible to do it in a generic way that works in all cases. For instance : Forum topic case : - posts are added at the bottom and not at the top - page number is relevant and meaningful However, in most cases, you can use a multipart key and get it right. Suppose that, for instance, you have a base of several million records, organized according to : - date (like the original poster) or : - country, region, city, customer last name, first name. You could ask for the first three, but then you'll get 50000 Smiths in New York and 1 Van Bliezinsky. Or you could precalculate, once a week, a key interval distribution that creates reasonable sized intervals (for instance, 100 values in each), maybe asking that each interval should only contain only one city. So, you would get : Country Region City LastName FirstName USA NYC NY Smith, '' USA NYC NY Smith, Albert USA NYC NY Smith, Bernard ..... USA NYC NY Smith, William ... USA NYC NY Von Braun ... So you'd predetermine your "page breaks" ahead of time, and recompute them once in a while. You won't get identically sized pages, but if the statistical distribution of the data plays nice, you should get evenly sized pages. The interesting part is that you can present the user with a selector which presents meaningful and useful data, AND is fast to compute, AND is fast to use. In this case, it would amount to "Select country, region, city", then, display a list like this : Smith, ...Albert Smith, Albus...Bernard ... Smith, William... ... Von Braun...Von Schwarts ... So Jeannette Smith would be easy to find, being in the link "Smith, Jean...John" for instance. If the aim is to quickly locate a particular record, I like javascript-powered autocompletion better ; but for browsing, this pagination method is cool. Regards !
pgsql-performance by date: