Thread: nooby Q: temp tables good for web apps?

nooby Q: temp tables good for web apps?

From
Kenneth Tilton
Date:
I am porting a datamining web app to postgres from a non-sql datastore
and plan to use temporary tables quite a bit, to manage collections the
user will be massaging interactively. They might search and find
anywhere from 50 to 50k items, then filter that, unfilter, sort, etc.

Currently I manage those collections in the server application, meaning
everything gets pulled from the datastore into RAM. I see postgres
temporary tables and postgres features in general can greatly simplify
my code because so much of what I do can be expressedin postgres-ese. Yayyy.

Some on the team think I am nuts, but one reason given was the absence
of indices and I see (a) temporary tables *can* be indexed and (b)
postgres does not even use an index for small sets, and many collections
will be relatively small (as a design goal in fact--we hope to make
search smarter and return fewer hits).

I thought it would not hurt to check with the gurus before spending a
week on the wrong code, so... dumb idea?

kenny


Re: nooby Q: temp tables good for web apps?

From
Scott Marlowe
Date:
On Tue, Apr 7, 2009 at 3:11 PM, Kenneth Tilton <kentilton@gmail.com> wrote:
> I am porting a datamining web app to postgres from a non-sql datastore and
> plan to use temporary tables quite a bit, to manage collections the user
> will be massaging interactively. They might search and find anywhere from 50
> to 50k items, then filter that, unfilter, sort, etc.
>
> Currently I manage those collections in the server application, meaning
> everything gets pulled from the datastore into RAM. I see postgres temporary
> tables and postgres features in general can greatly simplify my code because
> so much of what I do can be expressedin postgres-ese. Yayyy.
>
> Some on the team think I am nuts,

People are often resistant to new ideas, even good ones.

> but one reason given was the absence of
> indices and I see (a) temporary tables *can* be indexed

Correct

> and (b) postgres
> does not even use an index for small sets, and many collections will be
> relatively small (as a design goal in fact--we hope to make search smarter
> and return fewer hits).

Correct again.

> I thought it would not hurt to check with the gurus before spending a week
> on the wrong code, so... dumb idea?

Good idea to at least come up with some tests to prove (or disprove)
your point.  We can wait for the gurus to check in later...

Re: nooby Q: temp tables good for web apps?

From
Tom Lane
Date:
Kenneth Tilton <kentilton@gmail.com> writes:
> I am porting a datamining web app to postgres from a non-sql datastore
> and plan to use temporary tables quite a bit, to manage collections the
> user will be massaging interactively. They might search and find
> anywhere from 50 to 50k items, then filter that, unfilter, sort, etc.

The main issue you should think about is whether the required lifespan
of the temp tables matches up with your application's use of database
connections.  If you are going through a connection pooler, for example,
it can be pretty awkward to hold onto the connection that has got the
temp table instances you need.  Web apps in general tend to have a hard
time maintaining such state across successive page references, so I'm
afraid this could be a show-stopper for you.

> Some on the team think I am nuts, but one reason given was the absence
> of indices and I see (a) temporary tables *can* be indexed

Yeah, whoever claimed that is simply uninformed, or at least is
well-informed about some other database.

            regards, tom lane

Re: nooby Q: temp tables good for web apps?

From
Greg Smith
Date:
Temp tables can be great for simplifying your code into more logical
sections.  When making a case for using them, make sure to point out that
using them more aggressively can cut down on the amount of indexing you
need on the big tables, which has positive implications in terms of
getting simpler and robust query plans and cutting down on insertion
overhead.

You should be sure to turn on log_temp_files (which is handy in general,
that's not specific to temp tables).  One specific thing to look for to
support your case is that sorts that used to execute in RAM and spill to
disk when they exceed work_mem might instead execute with less memory
usage; you'll be doing the final sort/filter steps using the temp tables
instead.  If that is already happening, the overhead of using the temp
table can end up looking pretty good.

One thing I like doing when in the early development stages is to create a
seperate disk partition for the temporary tables, turn that into a
tablespace, and then use temp_tablespaces to point the temp tables toward
it.  The idea is to separate out I/O to the temp tables so that you can
measure it to see how significant it is.

--
* Greg Smith gsmith@gregsmith.com http://www.gregsmith.com Baltimore, MD

Re: nooby Q: temp tables good for web apps?

From
Kenneth Tilton
Date:

Tom Lane wrote:
> Kenneth Tilton <kentilton@gmail.com> writes:
>> I am porting a datamining web app to postgres from a non-sql datastore
>> and plan to use temporary tables quite a bit, to manage collections the
>> user will be massaging interactively. They might search and find
>> anywhere from 50 to 50k items, then filter that, unfilter, sort, etc.
>
> The main issue you should think about is whether the required lifespan
> of the temp tables matches up with your application's use of database
> connections.  If you are going through a connection pooler, for example,
> it can be pretty awkward to hold onto the connection that has got the
> temp table instances you need.  Web apps in general tend to have a hard
> time maintaining such state across successive page references, so I'm
> afraid this could be a show-stopper for you.

Ah, I should disclosed I am a relative web application nooby as well.
But our design depends anyway on the same server process handling a web
session from start to finish, and I thought this was doable with
sufficient effort. I mean, I asked and people said it could be arranged.
Not so? Yes, I am doomed. But if we can do that, I already have a ton of
logic for keeping sessions separate and for hanging onto a connection
for the life of a session (most requests use a PG connection pooler, the
ones that need to see the temp tables use a dedicated connection (or two
I think I might need). Keep sessions separate by working the session key
into the temp table name...well, that's the plan anyway.

>
>> Some on the team think I am nuts, but one reason given was the absence
>> of indices and I see (a) temporary tables *can* be indexed
>
> Yeah, whoever claimed that is simply uninformed, or at least is
> well-informed about some other database.

I think I misrepresented their position. They did not say it, but I
think they were referring to some hairy freetext indexing they did on
the permanent tables. And that's OK, we do not need that on the temp tables.

thx for the input, I will study up on the viability of getting a session
managed by the same process throughout.

ken

Re: nooby Q: temp tables good for web apps?

From
Kenneth Tilton
Date:

Greg Smith wrote:
> Temp tables can be great for simplifying your code into more logical
> sections.  When making a case for using them, make sure to point out
> that using them more aggressively can cut down on the amount of indexing
> you need on the big tables, which has positive implications in terms of
> getting simpler and robust query plans and cutting down on insertion
> overhead.
>
> You should be sure to turn on log_temp_files (which is handy in general,
> that's not specific to temp tables).  One specific thing to look for to
> support your case is that sorts that used to execute in RAM and spill to
> disk when they exceed work_mem might instead execute with less memory
> usage; you'll be doing the final sort/filter steps using the temp tables
> instead.  If that is already happening, the overhead of using the temp
> table can end up looking pretty good.
>
> One thing I like doing when in the early development stages is to create
> a seperate disk partition for the temporary tables, turn that into a
> tablespace, and then use temp_tablespaces to point the temp tables
> toward it.  The idea is to separate out I/O to the temp tables so that
> you can measure it to see how significant it is.

Thx, I will keep that in mind as a good way of really seeing what is
going on. I did notice the tablespace feature but wasn't sure how to
leverage it. Mgmt has been lusting after those new solid-state memory
disks (SSDs?), this could be a good excuse for a PO. We are a skunkworks
project getting as much praise so far for the speed of the web app as
anything else so we don't want to give up this plus.

ken


Re: nooby Q: temp tables good for web apps?

From
Scott Marlowe
Date:
On Tue, Apr 7, 2009 at 5:05 PM, Kenneth Tilton <kentilton@gmail.com> wrote:
>
>
> Greg Smith wrote:
>>
>> Temp tables can be great for simplifying your code into more logical
>> sections.  When making a case for using them, make sure to point out that
>> using them more aggressively can cut down on the amount of indexing you need
>> on the big tables, which has positive implications in terms of getting
>> simpler and robust query plans and cutting down on insertion overhead.
>>
>> You should be sure to turn on log_temp_files (which is handy in general,
>> that's not specific to temp tables).  One specific thing to look for to
>> support your case is that sorts that used to execute in RAM and spill to
>> disk when they exceed work_mem might instead execute with less memory usage;
>> you'll be doing the final sort/filter steps using the temp tables instead.
>>  If that is already happening, the overhead of using the temp table can end
>> up looking pretty good.
>>
>> One thing I like doing when in the early development stages is to create a
>> seperate disk partition for the temporary tables, turn that into a
>> tablespace, and then use temp_tablespaces to point the temp tables toward
>> it.  The idea is to separate out I/O to the temp tables so that you can
>> measure it to see how significant it is.
>
> Thx, I will keep that in mind as a good way of really seeing what is going
> on. I did notice the tablespace feature but wasn't sure how to leverage it.
> Mgmt has been lusting after those new solid-state memory disks (SSDs?), this
> could be a good excuse for a PO. We are a skunkworks project getting as much
> praise so far for the speed of the web app as anything else so we don't want
> to give up this plus.

Make sure the newer generation like Intel's that are fast under
concurrent access.  Most of the older SSDs are horrificall slow when
handling multiple random accesses.

You can use a different method if you need a table available to the
same session.  Create a schema based on the session id, and put your
temp tables there, only don't call them temp tables.  You'll either
need to make sure you always clean up your temp schema your session
created or come up with a daemon that comes along every hour or so and
kills off old schemas that aren't in use anymore.

Re: nooby Q: temp tables good for web apps?

From
John Cheng
Date:
Hi Kenneth,

One concern I have with SSD drives is that the performance degrades over time. If you were not familiar with this issue
already,take a look at the following article. 

http://www.anandtech.com/storage/showdoc.aspx?i=3531

It is not a huge problem and I have faith in Intel to come up with a good solution fairly quickly, but it is worth
noting.Given the cost of SSD, it does make me think that perhaps a more cost effective solution is to have plenty of
RAMon the box.  

----
John L. Cheng



----- Original Message ----
> From: Scott Marlowe <scott.marlowe@gmail.com>
> To: Kenneth Tilton <kentilton@gmail.com>
> Cc: pgsql-general@postgresql.org
> Sent: Tuesday, April 7, 2009 4:47:17 PM
> Subject: Re: [GENERAL] nooby Q: temp tables good for web apps?
>
> On Tue, Apr 7, 2009 at 5:05 PM, Kenneth Tilton wrote:
> >
> >
> > Greg Smith wrote:
> >>
> >> Temp tables can be great for simplifying your code into more logical
> >> sections.  When making a case for using them, make sure to point out that
> >> using them more aggressively can cut down on the amount of indexing you need
> >> on the big tables, which has positive implications in terms of getting
> >> simpler and robust query plans and cutting down on insertion overhead.
> >>
> >> You should be sure to turn on log_temp_files (which is handy in general,
> >> that's not specific to temp tables).  One specific thing to look for to
> >> support your case is that sorts that used to execute in RAM and spill to
> >> disk when they exceed work_mem might instead execute with less memory usage;
> >> you'll be doing the final sort/filter steps using the temp tables instead.
> >>  If that is already happening, the overhead of using the temp table can end
> >> up looking pretty good.
> >>
> >> One thing I like doing when in the early development stages is to create a
> >> seperate disk partition for the temporary tables, turn that into a
> >> tablespace, and then use temp_tablespaces to point the temp tables toward
> >> it.  The idea is to separate out I/O to the temp tables so that you can
> >> measure it to see how significant it is.
> >
> > Thx, I will keep that in mind as a good way of really seeing what is going
> > on. I did notice the tablespace feature but wasn't sure how to leverage it.
> > Mgmt has been lusting after those new solid-state memory disks (SSDs?), this
> > could be a good excuse for a PO. We are a skunkworks project getting as much
> > praise so far for the speed of the web app as anything else so we don't want
> > to give up this plus.
>
> Make sure the newer generation like Intel's that are fast under
> concurrent access.  Most of the older SSDs are horrificall slow when
> handling multiple random accesses.
>
> You can use a different method if you need a table available to the
> same session.  Create a schema based on the session id, and put your
> temp tables there, only don't call them temp tables.  You'll either
> need to make sure you always clean up your temp schema your session
> created or come up with a daemon that comes along every hour or so and
> kills off old schemas that aren't in use anymore.
>
> --
> Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-general






Re: nooby Q: temp tables good for web apps?

From
Kenneth Tilton
Date:

Scott Marlowe wrote:
> You can use a different method if you need a table available to the
> same session.  Create a schema based on the session id, and put your
> temp tables there, only don't call them temp tables.  You'll either
> need to make sure you always clean up your temp schema your session
> created or come up with a daemon that comes along every hour or so and
> kills off old schemas that aren't in use anymore.

I am LMAO because Lisp (my server-side lang) does this to noobs, too:
three (at least) ways to do everything. Well, if all things are equal
dropping one schema and not kludging up mangled table names has a lot of
appeal. Thx.

ken


Re: nooby Q: temp tables good for web apps?

From
Scott Marlowe
Date:
On Tue, Apr 7, 2009 at 7:12 PM, Kenneth Tilton <kentilton@gmail.com> wrote:
>
>
> Scott Marlowe wrote:
>>
>> You can use a different method if you need a table available to the
>> same session.  Create a schema based on the session id, and put your
>> temp tables there, only don't call them temp tables.  You'll either
>> need to make sure you always clean up your temp schema your session
>> created or come up with a daemon that comes along every hour or so and
>> kills off old schemas that aren't in use anymore.
>
> I am LMAO because Lisp (my server-side lang) does this to noobs, too: three
> (at least) ways to do everything. Well, if all things are equal dropping one
> schema and not kludging up mangled table names has a lot of appeal. Thx.

Schemas, search_path and views together can let you do some pretty
cool things in terms of integrating external postgresql based apps
with each other.

Re: nooby Q: temp tables good for web apps?

From
Kenneth Tilton
Date:

Scott Marlowe wrote:
> On Tue, Apr 7, 2009 at 7:12 PM, Kenneth Tilton <kentilton@gmail.com> wrote:
>>
>> Scott Marlowe wrote:
>>> You can use a different method if you need a table available to the
>>> same session.  Create a schema based on the session id, and put your
>>> temp tables there, only don't call them temp tables.  You'll either
>>> need to make sure you always clean up your temp schema your session
>>> created or come up with a daemon that comes along every hour or so and
>>> kills off old schemas that aren't in use anymore.
>> I am LMAO because Lisp (my server-side lang) does this to noobs, too: three
>> (at least) ways to do everything. Well, if all things are equal dropping one
>> schema and not kludging up mangled table names has a lot of appeal. Thx.
>
> Schemas, search_path and views together can let you do some pretty
> cool things in terms of integrating external postgresql based apps
> with each other.

Or between XHRs? It just occurred to me that if I go with a schema
instead of temp tables then I do not need to worry about hanging on to a
connection/pgsession, or even worry about routing a web session to the
same process if all state is stored in pg under the session id.

ken *coming up to speed slowly, going to look up search_path*

Re: nooby Q: temp tables good for web apps?

From
Greg Smith
Date:
On Tue, 7 Apr 2009, John Cheng wrote:

> One concern I have with SSD drives is that the performance degrades over
> time.

The bigger concern I have with them is that even the Intel drives have a
volatile write cache in them.  You have either turn off the write cache
(which degrades performance substantially and might even have a longevity
impact) or use a battery-backed disk controller for them to be safe
database storage.  There's a good article about this at
http://www.mysqlperformanceblog.com/2009/03/02/ssd-xfs-lvm-fsync-write-cache-barrier-and-lost-transactions/

If there's a disk controller with a write cache involved, that narrows the
gap between SDD and regular drives quite a bit.

--
* Greg Smith gsmith@gregsmith.com http://www.gregsmith.com Baltimore, MD

Re: nooby Q: temp tables good for web apps?

From
Erik Jones
Date:
On Apr 7, 2009, at 7:32 PM, Kenneth Tilton wrote:

> Scott Marlowe wrote:
>> On Tue, Apr 7, 2009 at 7:12 PM, Kenneth Tilton
>> <kentilton@gmail.com> wrote:
>>>
>>> Scott Marlowe wrote:
>>>> You can use a different method if you need a table available to the
>>>> same session.  Create a schema based on the session id, and put
>>>> your
>>>> temp tables there, only don't call them temp tables.  You'll either
>>>> need to make sure you always clean up your temp schema your session
>>>> created or come up with a daemon that comes along every hour or
>>>> so and
>>>> kills off old schemas that aren't in use anymore.
>>> I am LMAO because Lisp (my server-side lang) does this to noobs,
>>> too: three
>>> (at least) ways to do everything. Well, if all things are equal
>>> dropping one
>>> schema and not kludging up mangled table names has a lot of
>>> appeal. Thx.
>> Schemas, search_path and views together can let you do some pretty
>> cool things in terms of integrating external postgresql based apps
>> with each other.
>
> Or between XHRs? It just occurred to me that if I go with a schema
> instead of temp tables then I do not need to worry about hanging on
> to a connection/pgsession, or even worry about routing a web session
> to the same process if all state is stored in pg under the session id.
>
> ken *coming up to speed slowly, going to look up search_path*

If you're using pg_dump for backups then you'll probably want at least
a standard prefix on your "temp" schemas so that you can easily have
pg_dump ignore them when doing backups with it's -N flag.

Erik Jones, Database Administrator
Engine Yard
Support, Scalability, Reliability
866.518.9273 x 260
Location: US/Pacific
IRC: mage2k