Thread: How would you store read/unread topic status?
Hi all,

I'm running quite a large website which has its own forums. They are currently heavily used and I'm getting performance issues. Most of them are due to repeated UPDATE queries on a "flags" table.

This "flags" table has more or less the following fields:

UserID - TopicID - LastReadAnswerID

The flags table keeps track of every topic a member has visited and remembers the last answer which had been posted at that moment. It allows the user to come back a few days later and immediately jump to the last answer he has not read.

My problem is that every time a user READS a topic, it UPDATES this flags table to remember he has read it. This leads to multiple simultaneous updates on the same table, and an update can take a few seconds. This is not acceptable for my users.

Question: what is the general rule of thumb here? How would you store this information?

Thanks a lot in advance.
Mathieu.
On 06/23/2009 01:12 PM, Mathieu Nebra wrote:
> I'm running quite a large website which has its own forums. They are
> currently heavily used and I'm getting performance issues. Most of them
> are due to repeated UPDATE queries on a "flags" table.
>
> This "flags" table has more or less the following fields:
>
> UserID - TopicID - LastReadAnswerID
>
> My problem is that every time a user READS a topic, it UPDATES this flags
> table to remember he has read it. This leads to multiple updates at the
> same time on the same table, and an update can take a few seconds. This
> is not acceptable for my users.

Have you analyzed why it takes that long? Determining that is the first step of improving the current situation...

My first guess would be that your disks cannot keep up with the number of synchronous writes/second. Do you know how many transactions with write access you have? Guessing from your description, you do at least one write for every page hit on your forum.

With the default settings every transaction needs to wait for I/O at the end - to ensure transactional semantics. Depending on your disk, the number of possible writes/second is quite low - a normal SATA disk with 7200rpm can satisfy something around 130 synchronous writes per second, which is the upper limit on writing transactions per second. What disks do you have?

On which OS are you? If you are on Linux you could use iostat to get some relevant statistics, like:

    iostat -x /path/to/device/the/database/resides/on 2 10

That gives you 10 statistics over periods of 2 seconds.

Depending on those results there are numerous solutions to that problem...

> Question: what is the general rule of thumb here? How would you store
> this information?

The problem here is that every read access writes to disk - that is not going to scale very well. One possible solution is to use something like memcached to store the last read post in memory and periodically write it into the database.

Which pg version are you using?

Andres
On Tue, Jun 23, 2009 at 1:12 PM, Mathieu Nebra <mateo21@siteduzero.com> wrote:
> This "flags" table has more or less the following fields:
>
> UserID - TopicID - LastReadAnswerID

We are doing pretty much the same thing.

> My problem is that every time a user READS a topic, it UPDATES this flags
> table to remember he has read it. This leads to multiple updates at the
> same time on the same table, and an update can take a few seconds. This
> is not acceptable for my users.

First of all, and I'm sure you thought of this, an update isn't needed every time a user reads a topic; only when there are new answers that need to be marked as read. So an "update ... where last_read_answer_id < ?" should avoid the need for an update.

(That said, I believe PostgreSQL diffs tuple updates, so in practice PostgreSQL might not be writing anything if you run an "update" with the same value. I will let someone more intimate with the internal details of updates comment on this.)

Secondly, an update should not take "a few seconds". You might want to investigate this part before you turn to further optimizations.

In our application we defer the updates to a separate asynchronous process using a simple queue mechanism, but in our case we found that the updates are fast enough (on the order of a few milliseconds) not to warrant batching them into single transactions.

A.
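For illustration, a minimal sketch of that conditional update, with hypothetical column names (the real table later in this thread is prj_frm_flg, with different names):

    -- matches no row, and therefore writes nothing, when the user
    -- has already read everything up to answer 1042:
    UPDATE flags
       SET last_read_answer_id = 1042
     WHERE user_id = 42
       AND topic_id = 1337
       AND last_read_answer_id < 1042;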
On 06/23/2009 02:37 PM, Alexander Staubo wrote:
> (That said, I believe PostgreSQL diffs tuple updates, so in practice
> PostgreSQL might not be writing anything if you run an "update" with
> the same value. I will let someone more intimate with the internal
> details of updates comment on this.)

No, it does not do that by default. You can write a trigger to do it, though - and one is packaged with core in the upcoming 8.4 release.

Andres
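For reference, the 8.4 trigger Andres mentions is suppress_redundant_updates_trigger(); a sketch of attaching it (the table name is illustrative):

    CREATE TRIGGER flags_suppress_redundant
        BEFORE UPDATE ON flags
        FOR EACH ROW
        EXECUTE PROCEDURE suppress_redundant_updates_trigger();
    -- UPDATEs that would leave the row unchanged are silently skipped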
Mathieu Nebra wrote:
> Hi all,
>
> I'm running quite a large website which has its own forums. They are
> currently heavily used and I'm getting performance issues. Most of them
> are due to repeated UPDATE queries on a "flags" table.
> [...]
>
> Question: what is the general rule of thumb here? How would you store
> this information?

Sounds like the server is getting I/O bound: checkpoints forcing flushes to disk can saturate the disks. There are some 8.0-8.2 tuning ideas in this link:

http://www.westnet.com/~gsmith/content/postgresql/chkp-bgw-83.htm

Yes, this is an acceptable way to store such information.

What is the PG version? Performance tuning options differ depending on the version.

http://wiki.postgresql.org/wiki/Performance_Optimization
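As a hedged illustration of the kind of checkpoint smoothing those links discuss, 8.2-era settings might look like this (values are examples, not recommendations):

    # postgresql.conf
    checkpoint_segments = 16    # fewer, larger checkpoints (default 3)
    checkpoint_timeout = 5min
    bgwriter_delay = 200ms      # let the background writer spread writes out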
Alexander Staubo wrote:
> In our application we defer the updates to a separate asynchronous
> process using a simple queue mechanism, but in our case, we found that
> the updates are fast enough (in the order of a few milliseconds) not
> to warrant batching them into single transactions.
We do a very similar trick for another sort of data and it's worked wonders for performance. We had more frequent updates to fewer rows, though.

If you happen to be using Java, HashMap and TreeMap are perfect for this because they are reentrant, so you don't have to worry about synchronizing your sweeper with your web page activities.

As an added bonus, when you do this trick you don't have to query this information from the database unless you have a cache miss.
On Tue, 23 Jun 2009, Nikolas Everett wrote:
> If you happen to be using Java, HashMap and TreeMap are perfect for this
> because they are reentrant so you don't have to worry about
> synchronizing your sweeper with your web page activities.

See the note in http://java.sun.com/javase/6/docs/api/java/util/TreeMap.html:

"Note that this implementation is not synchronized."

If you have multiple threads accessing a TreeMap or HashMap, then they must be synchronised to ensure that only one thread at a time is accessing it. Otherwise, you may suffer severe data loss and possibly even JVM crashes. Perhaps you meant java.util.concurrent.ConcurrentHashMap?

Be very careful.

Matthew

--
Now, you would have thought these coefficients would be integers, given that we're working out integer results. Using a fraction would seem really stupid. Well, I'm quite willing to be stupid here - in fact, I'm going to use complex numbers.
-- Computer Science Lecturer
Andres Freund a écrit :
> On 06/23/2009 01:12 PM, Mathieu Nebra wrote:
>> I'm running quite a large website which has its own forums. They are
>> currently heavily used and I'm getting performance issues. Most of them
>> are due to repeated UPDATE queries on a "flags" table.
>> [...]
>
> Have you analyzed why it takes that long? Determining that is the first
> step of improving the current situation...
>
> My first guess would be, that your disks cannot keep up with the number
> of synchronous writes/second. Do you know how many transactions with
> write access you have? Guessing from your description you do at least
> one write for every page hit on your forum.

I don't know how many writes/s Pgsql can handle on my server, but I first suspected that it was good practice to avoid unnecessary writes.

I do 1 write/page for every connected user on the forums. I do the same on another part of my website to increment the number of page views (this was not part of my initial question but it is very close).

> With the default settings every transaction needs to wait for io at the
> end - to ensure transactional semantics.
> Depending on your disk the number of possible writes/second is quite low
> - a normal SATA disk with 7200rpm can satisfy something around 130
> synchronous writes per second. Which is the upper limit on writing
> transactions per second.
> What disks do you have?

We have 2 SAS RAID 0 15000rpm disks.

> On which OS are you? If you are on linux you could use iostat to get
> some relevant statistics like:
> iostat -x /path/to/device/the/database/resides/on 2 10
>
> That gives you 10 statistics over periods of 2 seconds.
>
> Depending on those results there are numerous solutions to that problem...

Here it is:

$ iostat -x /dev/sda 2 10
Linux 2.6.18-6-amd64 (scratchy)     23.06.2009

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
          18,02    0,00   12,87   13,13    0,00   55,98

Device:  rrqm/s  wrqm/s    r/s     w/s   rsec/s   wsec/s avgrq-sz avgqu-sz   await  svctm  %util
sda        0,94  328,98  29,62  103,06   736,58  6091,14    51,46     0,04    0,25   0,04   0,51

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
          39,65    0,00   48,38    2,00    0,00    9,98

Device:  rrqm/s  wrqm/s    r/s     w/s   rsec/s   wsec/s avgrq-sz avgqu-sz   await  svctm  %util
sda        0,00    0,00  10,00   78,00   516,00  1928,00    27,77     6,44   73,20   2,75  24,20

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
          40,15    0,00   48,13    2,24    0,00    9,48

Device:  rrqm/s  wrqm/s    r/s     w/s   rsec/s   wsec/s avgrq-sz avgqu-sz   await  svctm  %util
sda        0,00    0,00   6,47  100,50   585,07  2288,56    26,87    13,00  121,56   3,00  32,04

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
          45,14    0,00   45,64    6,73    0,00    2,49

Device:  rrqm/s  wrqm/s    r/s     w/s   rsec/s   wsec/s avgrq-sz avgqu-sz   await  svctm  %util
sda        1,00    0,00  34,00  157,50  1232,00  3904,00    26,82    26,64  139,09   3,03  58,00

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
          46,25    0,00   49,25    3,50    0,00    1,00

Device:  rrqm/s  wrqm/s    r/s     w/s   rsec/s   wsec/s avgrq-sz avgqu-sz   await  svctm  %util
sda        0,00    0,00  27,00  173,00   884,00  4224,00    25,54    24,46  122,32   3,00  60,00

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
          44,42    0,00   47,64    2,23    0,00    5,71

Device:  rrqm/s  wrqm/s    r/s     w/s   rsec/s   wsec/s avgrq-sz avgqu-sz   await  svctm  %util
sda        0,00    0,00  15,42  140,30   700,50  3275,62    25,53    17,94  115,21   2,81  43,78

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
          41,75    0,00   48,50    2,50    0,00    7,25

Device:  rrqm/s  wrqm/s    r/s     w/s   rsec/s   wsec/s avgrq-sz avgqu-sz   await  svctm  %util
sda        0,50    0,00  21,11  116,08   888,44  2472,36    24,50    12,62   91,99   2,55  34,97

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
          44,03    0,00   46,27    2,99    0,00    6,72

Device:  rrqm/s  wrqm/s    r/s     w/s   rsec/s   wsec/s avgrq-sz avgqu-sz   await  svctm  %util
sda        9,00    0,00  10,00  119,00   484,00  2728,00    24,90    15,15  117,47   2,70  34,80

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
          36,91    0,00   51,37    2,49    0,00    9,23

Device:  rrqm/s  wrqm/s    r/s     w/s   rsec/s   wsec/s avgrq-sz avgqu-sz   await  svctm  %util
sda        0,99    0,00  14,78  136,45   390,15  2825,62    21,26    21,86  144,52   2,58  39,01

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
          38,75    0,00   48,75    1,00    0,00   11,50

Device:  rrqm/s  wrqm/s    r/s     w/s   rsec/s   wsec/s avgrq-sz avgqu-sz   await  svctm  %util
sda        0,00    0,00   7,54   67,34   377,89  1764,82    28,62     5,38   71,89   2,95  22,11

>> Question: what is the general rule of thumb here? How would you store
>> this information?
>
> The problem here is, that every read access writes to disk - that is not
> going to scale very well.

That's what I thought.

> One possible solution is to use something like memcached to store the
> last read post in memory and periodically write it into the database.

We're starting to use memcached. But how would you "periodically" write that to the database?

> Which pg version are you using?

I should have mentioned that before, sorry: PostgreSQL 8.2.

Thanks a lot!
Alexander Staubo a écrit :
> On Tue, Jun 23, 2009 at 1:12 PM, Mathieu Nebra <mateo21@siteduzero.com> wrote:
>> This "flags" table has more or less the following fields:
>>
>> UserID - TopicID - LastReadAnswerID
>
> We are doing pretty much the same thing.
>
>> My problem is that every time a user READS a topic, it UPDATES this flags
>> table to remember he has read it. This leads to multiple updates at the
>> same time on the same table, and an update can take a few seconds. This
>> is not acceptable for my users.
>
> First of all, and I'm sure you thought of this, an update isn't needed
> every time a user reads a topic; only when there are new answers that
> need to be marked as read. So an "update ... where last_read_answer_id
> < ?" should avoid the need for an update.

We don't work that way. We just "remember" he has read these answers and then we can tell him "there are no new messages for you to read". So we just need to write what he has read when he reads it.

> (That said, I believe PostgreSQL diffs tuple updates, so in practice
> PostgreSQL might not be writing anything if you run an "update" with
> the same value. I will let someone more intimate with the internal
> details of updates comment on this.)
>
> Secondly, an update should not take "a few seconds". You might want to
> investigate this part before you turn to further optimizations.

Yes, I know there is a problem, but I don't know if I am competent enough to tune PostgreSQL for that. It can take a while to understand the problem, and I'm not sure I'll have the time for that. I am, however, open to suggestions. Maybe I'm doing something wrong somewhere.

> In our application we defer the updates to a separate asynchronous
> process using a simple queue mechanism, but in our case, we found that
> the updates are fast enough (in the order of a few milliseconds) not
> to warrant batching them into single transactions.

A few milliseconds would be cool. In fact, deferring to another process is a good idea, but I'm not sure it is easy to implement. It would be great to have some sort of UPDATE ... LOW PRIORITY to make the request non-blocking.

Thanks.
>>> Which pg version are you using?
>
> I should have mentioned that before, sorry: PostgreSQL 8.2

I think there is an awful lot of speculation on this thread about what your problem is without anywhere near enough investigation. A couple of seconds for an update is a really long time, unless your server is absolutely slammed, in which case probably everything is taking a long time. We need to get some more information on what is happening here.

Approximately how many requests per second are you servicing? Also, can you:

1. Run EXPLAIN ANALYZE on a representative UPDATE statement and post the exact query and the output.

2. Run VACUUM VERBOSE on your database and send the last 10 lines or so of the output.

3. Try your UPDATE statement at a low-traffic time of day and see whether it's faster than it is at a high-traffic time of day, and by how much. Or dump your database and reload it on a dev server and see how fast it runs there.

...Robert
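For instance, point 1 might look like this (a sketch; flg_last_read is a hypothetical column name, while the key columns appear later in this thread):

    EXPLAIN ANALYZE
    UPDATE prj_frm_flg
       SET flg_last_read = 123456   -- hypothetical column
     WHERE flg_mid = 3 AND flg_sid = 123764;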
Mathieu Nebra <mateo21 'at' siteduzero.com> writes:
>> Secondly, an update should not take "a few seconds". You might want to
>> investigate this part before you turn to further optimizations.
>
> Yes, I know there is a problem but I don't know if I am competent enough
> to tune PostgreSQL for that. It can take a while to understand the
> problem, and I'm not sure I'll have the time for that.

Short story: run the query in psql, prepending EXPLAIN ANALYZE in front of it, and copy-paste the output in reply to the list.

Long story: there is a lot of interesting material in the PG official documentation about optimization. It is very much worth a read, but it's longer than a short story. In my experience, database performance can be degraded orders of magnitude if not configured properly.

> I am, however, open to suggestions. Maybe I'm doing something wrong
> somewhere.
>
>> In our application we defer the updates to a separate asynchronous
>> process using a simple queue mechanism, but in our case, we found that
>> the updates are fast enough (in the order of a few milliseconds) not
>> to warrant batching them into single transactions.
>
> A few milliseconds would be cool.

That also depends on the query. If your update selects rows not according to an index, you're going to be in trouble if the table hosts a lot of data - but that's fair. So you might just need an index. It might also be related to row bloat. Your query with EXPLAIN ANALYZE would tell what postgres does (whether it uses an index or not).

> In fact, defering to another process is a good idea, but I'm not sure if
> it is easy to implement. It would be great to have some sort of UPDATE

No article on the Site du Zéro explaining how to implement producer-consumers? :) But that must really be thought through before implementing. It's not worth piling queries up in memory, because it will create other problems if queries are produced faster than they are consumed in the long run.

--
Guillaume Cottenceau
On 06/23/2009 04:54 PM, Mathieu Nebra wrote:
>>> On 06/23/2009 01:12 PM, Mathieu Nebra wrote:
>>>>> I'm running a quite large website which has its own forums. They
>>>>> are currently heavily used and I'm getting performance issues.
>>>>> Most of them are due to repeated UPDATE queries on a "flags"
>>>>> table. [...]
>>>
>>> Have you analyzed why it takes that long? Determining that is the
>>> first step of improving the current situation...
>>>
>>> My first guess would be, that your disks cannot keep up with the
>>> number of synchronous writes/second. Do you know how many
>>> transactions with write access you have? Guessing from your
>>> description you do at least one write for every page hit on your
>>> forum.
>
> I don't know how many writes/s Pgsql can handle on my server, but I
> first suspected that it was good practice to avoid unnecessary
> writes.

It surely is.

> I do 1 write/page for every connected user on the forums. I do the
> same on another part of my website to increment the number of page
> views (this was not part of my initial question but it is very
> close).

That calls even more for some in-memory caching.

>>> On which OS are you? If you are on linux you could use iostat to
>>> get some relevant statistics like:
>>> iostat -x /path/to/device/the/database/resides/on 2 10
>>>
>>> That gives you 10 statistics over periods of 2 seconds.
>>>
>>> Depending on those results there are numerous solutions to that
>>> problem...
>
> Here it is:
>
> $ iostat -x /dev/sda 2 10
> Linux 2.6.18-6-amd64 (scratchy)     23.06.2009
> [... full iostat output quoted from the previous message ...]

You see that your average wait time ('await') is quite high. That indicates some contention. You have somewhere between 50-200 writes/second, so you may be maxing out your disk (depending on your config, those writes may mainly go to one disk at a time).

>>> One possible solution is to use something like memcached to store
>>> the last read post in memory and periodically write it into the
>>> database.
>
> We're starting to use memcached. But how would you "periodically"
> write that to the database?

Where do you see the problem?

>>> Which pg version are you using?
>
> I should have mentioned that before, sorry: PostgreSQL 8.2

I definitely would consider upgrading to 8.3 - even without any config changes it might bring quite some improvement. But mainly it would allow you to use "asynchronous commit", which could possibly increase your throughput tremendously.

It has the drawback that you can possibly lose asynchronously committed transactions in case of a crash - but that doesn't sound too bad for your use case (use it only in the transactions where it makes sense).

But all of that does not explain the issue sufficiently - you should not be getting such slow updates. I would suggest you configure "log_min_duration_statement" to catch the slower queries. You should then run those slow statements through EXPLAIN ANALYZE to see where the time is spent.

How are you vacuuming?

Andres
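A hedged postgresql.conf sketch of that logging setup (the threshold is an arbitrary example):

    # postgresql.conf
    log_min_duration_statement = 200   # in milliseconds; logs statements slower than this
    # re-run any statement it logs under EXPLAIN ANALYZE to see where the time goes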
>>>> Which pg version are you using?
>>
>> I should have mentioned that before, sorry: PostgreSQL 8.2
>
> I definitely would consider upgrading to 8.3 - even without any config
> changes it might bring quite some improvement.
>
> But mainly it would allow you to use "asynchronous commit" - which could
> possibly increase your throughput tremendously.

HOT can potentially help a lot for this workload, too, if the columns being updated are not indexed.

...Robert
Wouldn't it be better just to store the last time the user visited the topic (or the forum in general) and compare that?
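A sketch of that alternative, with illustrative table and column names (none of these appear in the original schema):

    -- store one timestamp per (user, topic) instead of a last-read answer id
    UPDATE topic_visits
       SET last_visit = now()
     WHERE user_id = 42 AND topic_id = 1337;

    -- "any unread answers?" becomes a timestamp comparison
    SELECT count(*)
      FROM answers a
     WHERE a.topic_id = 1337
       AND a.posted_at > (SELECT v.last_visit
                            FROM topic_visits v
                           WHERE v.user_id = 42
                             AND v.topic_id = 1337);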
Robert Haas a écrit :
>>>> Which pg version are you using?
>> I should have mentioned that before, sorry: PostgreSQL 8.2
>
> I think there is an awful lot of speculation on this thread about what
> your problem is without anywhere near enough investigation. A couple
> of seconds for an update is a really long time, unless your server is
> absolutely slammed, in which case probably everything is taking a long
> time. We need to get some more information on what is happening here.

You're right, I'll give you the information you need.

> Approximately how many requests per second are you servicing? Also,

How can I extract this information from the database? I know how to use pg_stat_user_tables. My table has:

    seq_tup_read     133793491714
    idx_scan         12408612540
    idx_tup_fetch    41041660903
    n_tup_ins        14700038
    n_tup_upd        6698236
    n_tup_del        15990670

> can you:
>
> 1. Run EXPLAIN ANALYZE on a representative UPDATE statement and post
> the exact query and the output.

"Index Scan using prj_frm_flg_pkey on prj_frm_flg (cost=0.00..8.58 rows=1 width=18)"
"  Index Cond: ((flg_mid = 3) AND (flg_sid = 123764))"

This time it only took 54ms, but maybe that's already a lot.

> 2. Run VACUUM VERBOSE on your database and send the last 10 lines or
> so of the output.

It's not very long, I can give you the whole log:

INFO:  vacuuming "public.prj_frm_flg"
INFO:  scanned index "prj_frm_flg_pkey" to remove 74091 row versions
DETAIL:  CPU 0.15s/0.47u sec elapsed 53.10 sec.
INFO:  scanned index "flg_fav" to remove 74091 row versions
DETAIL:  CPU 0.28s/0.31u sec elapsed 91.82 sec.
INFO:  scanned index "flg_notif" to remove 74091 row versions
DETAIL:  CPU 0.36s/0.37u sec elapsed 80.75 sec.
INFO:  scanned index "flg_post" to remove 74091 row versions
DETAIL:  CPU 0.31s/0.37u sec elapsed 115.86 sec.
INFO:  scanned index "flg_no_inter" to remove 74091 row versions
DETAIL:  CPU 0.34s/0.33u sec elapsed 68.96 sec.
INFO:  "prj_frm_flg": removed 74091 row versions in 5979 pages
DETAIL:  CPU 0.29s/0.34u sec elapsed 100.37 sec.
INFO:  index "prj_frm_flg_pkey" now contains 1315895 row versions in 7716 pages
DETAIL:  63153 index row versions were removed.
672 index pages have been deleted, 639 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO:  index "flg_fav" now contains 1315895 row versions in 18228 pages
DETAIL:  73628 index row versions were removed.
21 index pages have been deleted, 16 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO:  index "flg_notif" now contains 1315895 row versions in 18179 pages
DETAIL:  73468 index row versions were removed.
22 index pages have been deleted, 13 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO:  index "flg_post" now contains 1315895 row versions in 18194 pages
DETAIL:  73628 index row versions were removed.
30 index pages have been deleted, 23 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO:  index "flg_no_inter" now contains 1315895 row versions in 8596 pages
DETAIL:  73628 index row versions were removed.
13 index pages have been deleted, 8 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO:  "prj_frm_flg": found 74091 removable, 1315895 nonremovable row versions in 10485 pages
DETAIL:  326 dead row versions cannot be removed yet.
There were 253639 unused item pointers.
10431 pages contain useful free space.
0 pages are entirely empty.
CPU 1.91s/2.28u sec elapsed 542.75 sec.

Total: 542877 ms.

> 3. Try your UPDATE statement at a low-traffic time of day and see
> whether it's faster than it is at a high-traffic time of day, and by
> how much. Or dump your database and reload it on a dev server and see
> how fast it runs there.

It took 4ms.
On Tue, Jun 23, 2009 at 11:50 AM, Mathieu Nebra <mateo21@siteduzero.com> wrote:
>> Approximately how many requests per second are you servicing? Also,
>
> How can I extract this information from the database? I know how to use
> pg_stat_user_tables.

I was thinking you might look at your httpd logs. Not sure how to get it otherwise.

>> can you:
>>
>> 1. Run EXPLAIN ANALYZE on a representative UPDATE statement and post
>> the exact query and the output.
>
> "Index Scan using prj_frm_flg_pkey on prj_frm_flg (cost=0.00..8.58
> rows=1 width=18)"
> "  Index Cond: ((flg_mid = 3) AND (flg_sid = 123764))"
>
> This time it only took 54ms, but maybe that's already a lot.

That looks like EXPLAIN, not EXPLAIN ANALYZE. And can we also have the query?

>> 2. Run VACUUM VERBOSE on your database and send the last 10 lines or
>> so of the output.
>
> It's not very long, I can give you the whole log:
>
> [VACUUM VERBOSE output quoted in full above]

Is that just for the one table? I meant a database-wide VACUUM VERBOSE, so you can see if you've blown out your free-space map.

>> 3. Try your UPDATE statement at a low-traffic time of day and see
>> whether it's faster than it is at a high-traffic time of day, and by
>> how much. Or dump your database and reload it on a dev server and see
>> how fast it runs there.
>
> It took 4ms.

Was that at a low-traffic time of day, or on a different server?

...Robert
So your update doesn't take long to run during off-peak times, so basically your options are:

1. Optimize your postgresql.conf settings or upgrade to the latest version of PostgreSQL.
2. Redesign your forum code so it can scale better.
3. Upgrade your server's hardware, as it may be overloaded.

I would probably attack those in the order I described.

As far as redesigning your forum code goes, keep in mind that in PostgreSQL an update is basically a select, delete, insert in a single statement: first it needs to find the rows to update, then it marks the rows for deletion (which vacuum later cleans up) and inserts a new row. So updates can be quite expensive.

In SOME situations, it can be faster to do inserts only, and modify your select query to get just the data you need. For example, rather than an update like this:

    update <table> set LastReadAnswerID = <value>
    where UserID = <value> AND TopicID = <value>

You could do this instead:

    insert into <table> VALUES (<user_id>, <topic_id>, <last_read_answer_id>)

Then just modify your select statement slightly to get the last inserted row:

    select * from <table>
    where user_id = <value> AND topic_id = <value>
    order by LastReadAnswerID DESC LIMIT 1

This makes your select statement slightly more expensive, but your insert statement pretty much as cheap as possible. Since it's much easier to cache select results, you could easily wrap some caching mechanism around your select query to reduce the load there too.

Then, using a task scheduler like cron, simply clear out old rows from the table you insert into every minute, 5 minutes, hour, day, whatever makes most sense to keep the select queries fast (see the sketch after this message).

A memcached solution would probably be much better, but it's also likely much more involved to do.

On Tue, 23 Jun 2009 17:50:50 +0200 Mathieu Nebra <mateo21@siteduzero.com> wrote:
> Robert Haas a écrit :
> [...]
> You're right, I'll give you the information you need.
> [pg_stat_user_tables figures, EXPLAIN output and VACUUM VERBOSE log
> quoted in full above]
>
> > 3. Try your UPDATE statement at a low-traffic time of day and see
> > whether it's faster than it is at a high-traffic time of day, and by
> > how much. Or dump your database and reload it on a dev server and
> > see how fast it runs there.
>
> It took 4ms.
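A hedged sketch of that cron-driven cleanup, assuming an insert-only table named flags_log with the columns used above:

    -- keep only the newest row per (user_id, topic_id); run periodically
    DELETE FROM flags_log f
     WHERE f.last_read_answer_id <
           (SELECT max(g.last_read_answer_id)
              FROM flags_log g
             WHERE g.user_id = f.user_id
               AND g.topic_id = f.topic_id);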
You're holding this behavior to far too strict a transactional guarantee. The client software can cache a set of recent views and send updates in bulk every 1 or 2 seconds. Worst case, if your client crashes you lose a second's worth of user metadata updates on last-accessed and view counts. This isn't a financial transaction; don't build the app like one. (A sketch of such a batched flush follows this message.)

The same facility can serve as a read cache for other bits that don't need to be 'perfect' in the transactional sense -- counts of the number of views / posts of a topic, etc. Using the db to store and retrieve such counts synchronously is, frankly, a bad application design.

The tricky part with the above is twofold: you need client software capable of a thread-safe shared cache, and the clients will have to have sticky sessions if you are load balancing. Corner cases such as a server going down and a user switching servers will need to be worked out.

On 6/23/09 4:12 AM, "Mathieu Nebra" <mateo21@siteduzero.com> wrote:
> Hi all,
>
> I'm running quite a large website which has its own forums. They are
> currently heavily used and I'm getting performance issues. Most of them
> are due to repeated UPDATE queries on a "flags" table.
> [...]
>
> Question: what is the general rule of thumb here? How would you store
> this information?
>
> Thanks a lot in advance.
> Mathieu.
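As a sketch of what such a bulk flush could send (a hedged example; flg_last_read is a hypothetical column name), batching many cached updates into one transaction pays the commit fsync once instead of once per page view:

    BEGIN;
    UPDATE prj_frm_flg SET flg_last_read = 1001
     WHERE flg_mid = 3 AND flg_sid = 123764;
    UPDATE prj_frm_flg SET flg_last_read = 2002
     WHERE flg_mid = 17 AND flg_sid = 99412;
    -- ... one statement per entry cached in the last 1-2 seconds ...
    COMMIT;   -- a single fsync covers the whole batch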
On 6/23/09 7:54 AM, "Mathieu Nebra" <mateo21@siteduzero.com> wrote:
>>> My first guess would be, that your disks cannot keep up with the number
>>> of synchronous writes/second. Do you know how many transactions with
>>> write access you have? Guessing from your description you do at least
>>> one write for every page hit on your forum.
>
> I don't know how many writes/s Pgsql can handle on my server, but I
> first suspected that it was good practice to avoid unnecessary writes.
>
> I do 1 write/page for every connected user on the forums.
> I do the same on another part of my website to increment the number of
> page views (this was not part of my initial question but it is very close).
>
>>> What disks do you have?
>
> We have 2 SAS RAID 0 15000rpm disks.
>
> Here it is:
>
> $ iostat -x /dev/sda 2 10
> Linux 2.6.18-6-amd64 (scratchy)     23.06.2009
> [... full iostat output quoted from Mathieu's earlier message ...]

I see a lot of io wait time there. My guess is that your DB is flooded with synchronous writes. If you want to optimize the hardware for this, you have a couple of options. I'm assuming your RAID 0 is not hardware RAID.

1. Use 8.3+ and asynchronous commit (set synchronous_commit = off). This is safe data-wise, but if your DB crashes you might lose the last second or so of transactions that the app thought were committed. For a DB forum, this is probably very acceptable. Performance should gain significantly, as the writes/sec will go down a lot.

2. Put your data on one partition and your WAL log on another.

3. Get a battery-backed hardware RAID with write-back caching.

4. If you are using ext3 on Linux, make sure you mount with data=writeback on the file system that your WAL logs are on. data=ordered will cause the WHOLE file system to be flushed for each fsync, not just the tiny bit of WAL log.

In short, if you combine 1, 2, and 4, you'll probably have significantly more capacity on the same server. So make sure your WAL log is on a different file system from your OS and data, mount it optimally, and consider turning synchronous_commit off. If you're using RAID 0, I doubt the data is so precious that synchronous_commit being true is important at all.

>>>>> Question: what is the general rule of thumb here? How would you store
>>>>> this information?
>>> The problem here is, that every read access writes to disk - that is not
>>> going to scale very well.
>
> That's what I thought.
>
>>> One possible solution is to use something like memcached to store the
>>> last read post in memory and periodically write it into the database.
>
> We're starting to use memcached. But how would you "periodically" write
> that to the database?
>
>>> Which pg version are you using?
>
> I should have mentioned that before, sorry: PostgreSQL 8.2
>
> Thanks a lot!
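A hedged sketch of points 1 and 4 (devices and paths are illustrative, not a recipe):

    # postgresql.conf, 8.3+ (point 1)
    synchronous_commit = off   # commits return before the WAL fsync;
                               # a crash may lose the last ~1s of commits

    # /etc/fstab, ext3 partition dedicated to the WAL (point 4)
    /dev/sdb1  /var/lib/postgresql/wal  ext3  defaults,data=writeback  0 0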
All the other comments are accurate, though it does seem like something the database ought to be able to handle.

The other thing which hasn't been mentioned is that you have a lot of indexes. Updates require maintaining all those indexes. Are all of these indexes really necessary? Do you have routine queries which look up users based on their flags? Or are all your OLTP transactions for specific userids, in which case you probably just need the index on userid?

You'll probably find 8.3 helps this workload more than any tuning you can do in the database though, especially if you can reduce the number of indexes and avoid an index on any flags that are being updated.
Greg Stark a écrit :
> All the other comments are accurate, though it does seem like
> something the database ought to be able to handle.
>
> The other thing which hasn't been mentioned is that you have a lot of
> indexes. Updates require maintaining all those indexes. Are all of
> these indexes really necessary? Do you have routine queries which look
> up users based on their flags? Or are all your OLTP transactions for
> specific userids, in which case you probably just need the index on
> userid?

We are using these indexes, but I can't be sure whether we _really_ need them or not.

I can go into detail. We have:

UserID - TopicID - LastReadAnswerID - WrittenStatus - IsFavorite

So basically, we toggle the boolean flag WrittenStatus when the user has written in that topic. The same goes for IsFavorite. We have indexes on them, so we can SELECT every topic WHERE the user has written. Is this a good way of doing it?

Oh, I made a mistake before: we have RAID 1 disks, not RAID 0.

> You'll probably find 8.3 helps this workload more than any tuning you
> can do in the database though. Especially if you can reduce the number
> of indexes and avoid an index on any flags that are being updated.

I'll start this way, thanks. First 8.3, then I'll check my flags. I have a lot of ways to investigate, and I would like to thank every contributor here. I might come back with more precise information.

Thanks.
On Tue, Jun 23, 2009 at 1:12 PM, Mathieu Nebra<mateo21@siteduzero.com> wrote:
> The flags table keeps track of every topic a member has visited and
> remembers the last answer which was posted at this moment. It allows the
> user to come back a few days after and immediately jump to the last
> answer he has not read.

I forgot to mention that we speed up our queries by caching the "last
read" ID in Memcached. This is the kind of thing that Memcached is ideal
for.

For example, we show the list of the most recent posts, along with a
comment count, e.g. "42 comments (6 new)". We found that joining posts
against the last-read table is expensive, so instead, for every post in
the list, we read its unread count from Memcached. We use the thread's
"last commented at" timestamp as part of the cache key, so that when
somebody posts a new comment, every user's cached unread count is
effectively invalidated; it is automatically recalculated the next time
they view the post.

A.
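[To make the cache-miss recalculation concrete, a sketch of the query
such a setup might run. The answers table and answerid column are
assumed names for illustration, not taken from the thread:]

    -- Recompute the "(6 new)" count for user 42 in topic 7 when the
    -- Memcached entry is missing, then store it under a key such as
    --   unread:42:7:<last_commented_at>
    -- so that any new comment changes the key and orphans stale entries.
    SELECT count(*) AS unread
      FROM answers a
      LEFT JOIN flags f
             ON f.topicid = a.topicid AND f.userid = 42
     WHERE a.topicid = 7
       AND a.answerid > COALESCE(f.lastreadanswerid, 0);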
Mathieu Nebra wrote:
> Greg Stark a écrit :
>> All the other comments are accurate, though it does seem like
>> something the database ought to be able to handle.
>>
>> The other thing which hasn't been mentioned is that you have a lot of
>> indexes. Updates require maintaining all those indexes. Are all of
>> these indexes really necessary? Do you have routine queries which look
>> up users based on their flags? Or are all your OLTP transactions for
>> specific user ids, in which case you probably just need the index on
>> userid?
>
> We are using these indexes, but I can't be sure whether we _really_ need
> them or not.
>
> I can go into detail. We have:
>
> UserID - TopicID - LastReadAnswerID - WrittenStatus - IsFavorite
>
> So basically, we toggle the boolean flag WrittenStatus when the user has
> written in that topic. The same goes for IsFavorite.

Do those last two columns hold much data? Another thing to consider is
to split this into two tables:

UserID - TopicID - LastReadAnswerID
UserID - TopicID - WrittenStatus - IsFavorite

As others have pointed out, an UPDATE in Postgres is a
select/delete/insert, and if you're updating just the LastReadAnswerID
all the time, you're wasting time deleting and re-inserting a lot of
data that never changes (assuming they're not trivially small columns).

This might also solve the problem of too many indexes -- the table
that's updated frequently would only have an index on (UserID, TopicID),
so the update only affects one index.

Then to minimize the impact on your app, create a view that looks like
the original table for read-only apps.

Craig
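[A sketch of that split, reusing the assumed column types from the
earlier sketch; the original table would be renamed or dropped first so
the view can keep the old name:]

    -- Hot table: rewritten on every topic view; its only index is the PK.
    CREATE TABLE flags_lastread (
        userid           integer NOT NULL,
        topicid          integer NOT NULL,
        lastreadanswerid integer NOT NULL,
        PRIMARY KEY (userid, topicid)
    );

    -- Cold table: only touched when the user posts in or favorites a topic.
    CREATE TABLE flags_status (
        userid        integer NOT NULL,
        topicid       integer NOT NULL,
        writtenstatus boolean NOT NULL DEFAULT false,
        isfavorite    boolean NOT NULL DEFAULT false,
        PRIMARY KEY (userid, topicid)
    );

    -- Read-only code keeps seeing the old one-table shape.
    CREATE VIEW flags AS
        SELECT l.userid, l.topicid, l.lastreadanswerid,
               s.writtenstatus, s.isfavorite
          FROM flags_lastread l
          JOIN flags_status   s USING (userid, topicid);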
On Tue, Jun 23, 2009 at 9:04 PM, Mathieu Nebra<mateo21@siteduzero.com> wrote: > We have indexes on them, so we can SELECT every topic WHERE the user has > written. Is it the good way of doing this? I'm kind of skeptical that a simple index on userid,topic isn't sufficient to handle this case. But you would have to test it on actual data to be sure. It depends whether you have enough topics and enough userid,topic records for a given userid that scanning all the topics for a given user is actually too slow. Even if it's necessary you might consider having a "partial" index on user,topic WHERE writtenstatus instead of having a three-column index. -- greg http://mit.edu/~gsstark/resume.pdf
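[A sketch of Greg's partial index, against the single flags table
sketched earlier rather than the split variant:]

    -- Only rows with writtenstatus = true appear in the index, so it
    -- stays small, and updates to rows where the flag is false never
    -- have to maintain it.
    CREATE INDEX flags_written_idx
        ON flags (userid, topicid)
        WHERE writtenstatus;

    -- This query can be answered from the partial index:
    SELECT topicid FROM flags WHERE userid = 42 AND writtenstatus;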
Alexander Staubo a écrit :
> On Tue, Jun 23, 2009 at 1:12 PM, Mathieu Nebra<mateo21@siteduzero.com> wrote:
>> This "flags" table has more or less the following fields:
>>
>> UserID - TopicID - LastReadAnswerID
>
> We are doing pretty much same thing.
>
>> My problem is that everytime a user READS a topic, it UPDATES this flags
>> table to remember he has read it. This leads to multiple updates at the
>> same time on the same table, and an update can take a few seconds. This
>> is not acceptable for my users.
>
> First of all, and I'm sure you thought of this, an update isn't needed
> every time a user reads a topic; only when there are new answers that
> need to be marked as read. So an "update ... where last_read_answer_id
> < ?" should avoid the need for an update.

We don't work that way. We just "remember" he has read these answers and
then we can tell him "there are no new messages for you to read". So we
just need to write what he has read when he reads it.

> (That said, I believe PostgreSQL diffs tuple updates, so in practice
> PostgreSQL might not be writing anything if you run an "update" with
> the same value. I will let someone more intimate with the internal
> details of updates to comment on this.)
>
> Secondly, an update should not take "a few seconds". You might want to
> investigate this part before you turn to further optimizations.

Yes, I know there is a problem, but I don't know if I am competent enough
to tune PostgreSQL for that. It can take a while to understand the
problem, and I'm not sure I'll have the time for that.

I am, however, open to suggestions. Maybe I'm doing something wrong
somewhere.

> In our application we defer the updates to a separate asynchronous
> process using a simple queue mechanism, but in our case, we found that
> the updates are fast enough (in the order of a few milliseconds) not
> to warrant batching them into single transactions.

A few milliseconds would be cool. In fact, deferring to another process
is a good idea, but I'm not sure it is easy to implement. It would be
great to have some sort of UPDATE ... LOW PRIORITY to make the request
non-blocking.

Thanks.

Mathieu Nebra wrote:
> A few milliseconds would be cool. In fact, deferring to another process
> is a good idea, but I'm not sure it is easy to implement. It would be
> great to have some sort of UPDATE ... LOW PRIORITY to make the request
> non-blocking.

I use pg_send_query()
<http://ca2.php.net/manual/en/function.pg-send-query.php> in php to
achieve this for a views counter. "Script execution is not blocked while
the queries are executing."

It looks like this may just be a direct translation of PQsendQuery()
from libpq. Your preferred language may have a function like this.
Craig James a écrit :
> Mathieu Nebra wrote:
>> Greg Stark a écrit :
>>> All the other comments are accurate, though it does seem like
>>> something the database ought to be able to handle.
>>>
>>> The other thing which hasn't been mentioned is that you have a lot of
>>> indexes. Updates require maintaining all those indexes. Are all of
>>> these indexes really necessary? Do you have routine queries which look
>>> up users based on their flags? Or are all your OLTP transactions for
>>> specific user ids, in which case you probably just need the index on
>>> userid?
>>
>> We are using these indexes, but I can't be sure whether we _really_ need
>> them or not.
>>
>> I can go into detail. We have:
>>
>> UserID - TopicID - LastReadAnswerID - WrittenStatus - IsFavorite
>>
>> So basically, we toggle the boolean flag WrittenStatus when the user has
>> written in that topic. The same goes for IsFavorite.
>
> Do those last two columns hold much data? Another thing to consider is
> to split this into two tables:

The last two columns only store TRUE or FALSE; they're booleans. So
you're saying that an index on them might be useless? We're retrieving
1000-2000 rows max, and we need to extract only those that have TRUE in
the last column, for example.

> UserID - TopicID - LastReadAnswerID
> UserID - TopicID - WrittenStatus - IsFavorite
>
> As others have pointed out, an UPDATE in Postgres is a
> select/delete/insert, and if you're updating just the LastReadAnswerID
> all the time, you're wasting time deleting and re-inserting a lot of
> data that never changes (assuming they're not trivially small columns).

They are trivially small columns.

> This might also solve the problem of too many indexes -- the table
> that's updated frequently would only have an index on (UserID, TopicID),
> so the update only affects one index.

I'll investigate in that direction.

> Then to minimize the impact on your app, create a view that looks like
> the original table for read-only apps.

Good idea, thanks again.
>>> In our application we defer the updates to a separate asynchronous
>>> process using a simple queue mechanism, but in our case, we found that
>>> the updates are fast enough (in the order of a few milliseconds) not
>>> to warrant batching them into single transactions.
>>
>> A few milliseconds would be cool.
>> In fact, deferring to another process is a good idea, but I'm not sure
>> it is easy to implement. It would be great to have some sort of UPDATE
>> ... LOW PRIORITY to make the request non-blocking.
>>
>> Thanks.
>
> I use pg_send_query()
> <http://ca2.php.net/manual/en/function.pg-send-query.php> in php to
> achieve this for a views counter. "Script execution is not blocked while
> the queries are executing."
>
> It looks like this may just be a direct translation of PQsendQuery()
> from libpq. Your preferred language may have a function like this.

I am using PHP. That was one of the things I was looking for, thank you! :)

We'll combine this with a memcached solution, so we only write updates
every 1000 views, for example.
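[To close the loop on batching: a minimal sketch, with illustrative IDs,
of what such a periodic flush from memcached might execute. The point is
that one transaction, and therefore one synchronous WAL flush, covers
many page views instead of one per view:]

    BEGIN;
    -- Last-read positions accumulated in memcached since the last flush:
    UPDATE flags SET lastreadanswerid = 123456 WHERE userid = 42 AND topicid = 7;
    UPDATE flags SET lastreadanswerid = 123789 WHERE userid = 43 AND topicid = 9;
    -- ... the rest of the batch ...
    COMMIT;  -- a single commit covers the whole batch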