Re: vacuum, performance, and MVCC - Mailing list pgsql-hackers

From Mark Woodward
Subject Re: vacuum, performance, and MVCC
Date
Msg-id 18382.24.91.171.78.1151002846.squirrel@mail.mohawksoft.com
Whole thread Raw
In response to Re: vacuum, performance, and MVCC  (PFC <lists@peufeu.com>)
Responses Re: vacuum, performance, and MVCC
Re: vacuum, performance, and MVCC
List pgsql-hackers
>
>> As you can see, in about a minute at high load, this very simple table
>> lost about 10% of its performance, and I've seen worse based on update
>> frequency.  Before you say this is an obscure problem, I can tell you it
>> isn't. I have worked with more than a few projects that had to switch
>> away
>> from PostgreSQL because of this behavior.
>>
>> Obviously this is not a problem with small sites, but this is a real
>> problem with an enterprise level web site with millions of visitors and
>> actions a day. Quite frankly it is a classic example of something that
>> does not scale. The more and more updates there are, the higher the load
>> becomes. You can see it on "top" as the footest program runs.
>
>
>     I believe sessions should not be stored in a SQL database.
Alas, this is a long debate, and while I fundimentally agree with this
position, there is an inconvenient truth that it is often nessisary.

http://www.mohawksoft.org/?q=node/8


>     It makes no sense. Updates and Inserts to the database should only be
> done where there is an interesting thing to record, when the user does an
> action like posting to a forum, making a purchase, sending a message, etc.

Again, preaching to the chior.

>
>     I believe sessions should be stored in the memory of the application
> server, as native objects of the whatever language the application is
> written in. This way, sessions incur no serializing overhead and can be
> quite large and complex, which allows storage of interesting things, like
> the result of a complicated search query which is to be later paginated,
> for instance. It really makes sense to use native language objects too, as
> these have a lot more power and versatility than a database row. Think
> about rights management, for instance.

What you seem not to grasp at this point is a large web-farm, about 10 or
more servers running PHP, Java, ASP, or even perl. The database is usually
the most convenient and, aside from the particular issue we are talking
about, best suited.

>
>     When the framework used lacks this power (most do and this is sad), then
> sessions incur serializing overhead ; but they should be serialized to
> filesystem files, or better, to memory using memcached, for instance.

I actually have a good number of years of experience in this topic, and
memcached or file system files are NOT the best solutions for a server
farm.

> It
> makes no sense to pay the performance penalty of a COMMIT (disk seek delay
> etc) and the database overhead for sessions, which are by nature volatile
> data changing all the time.

Very true.

>
>     I don't think postgres should be tweaked to allow better handling of
> this. It would only make a better foot-gun.

I think the debate is over, it may be a bad use of a database, but there
are few alternatives, and SQL databases have become the defacto
methodology for dealing with this type of problem.

>



pgsql-hackers by date:

Previous
From: Josh Berkus
Date:
Subject: Re: [CORE] GPL Source and Copyright Questions
Next
From: "Bort, Paul"
Date:
Subject: Re: [CORE] GPL Source and Copyright Questions