Thread: cache problem (v2)

cache problem (v2)

From

De Leeuw Guy

Date:

16 July 2007, 11:41:03

Hello all

I encoured a problem with specific kind of commands

a table :
CREATE TABLE test (
  code                int8 NOT NULL,
  qte                 float8,
  CONSTRAINT test_pkey PRIMARY KEY (code)
 )
CREATE TRIGGER trig_update_sum BEFORE INSERT OR UPDATE OR DELETE ON
table_test
  FOR EACH ROW EXECUTE PROCEDURE gd_trigfunc_before_buildsum();
EOF

code maybe 1,2,3
3 = sum of 1 & 2

the trigger make the sum :
if event = INSERT code = 3 ==> skip
if event = INSERT code = 1 or 2
   if select code 3 exist ? yes = update qte code 3 + qte code x
                            no  = insert qte to code 3
finally process the initial command

pgsql test :
INSERT INTO test VALUES (1, 50);
INSERT INTO test VALUES (3, 60);
ok code 3 = 110

psql : delete from test;
code 1 & 2 deleted
code 3 = 50

a flat file :
1;50
2;60

COPY test FROM '/var/lib/postgresql/flatfile' WITH DELIMITER ';';
code 1 & 2 ok
code 3 = 60


maybe this is because these commands start a transaction with a cache
and each time that I request a select for a sum code that are on the
cache the value returned are not the real new value.

two questions :

It is possible to work around this ?
More it is possible to optimize postgres to query the cache before
querying the row on disk ?

Thanks in advance for your attention
(sorry my english are poor)

Guy

Re: cache problem (v2)

From

Andrew Sullivan

Date:

16 July 2007, 14:10:17

I tried to answer the previous version of this mail, but I don't
understand it.  Let me see if my questions trigger something for you,
though.

On Mon, Jul 16, 2007 at 12:32:07PM +0200, De Leeuw Guy wrote:
>
> a table :
> CREATE TABLE test (
>   code                int8 NOT NULL,
>   qte                 float8,
>   CONSTRAINT test_pkey PRIMARY KEY (code)
>  )
> CREATE TRIGGER trig_update_sum BEFORE INSERT OR UPDATE OR DELETE ON
> table_test
>   FOR EACH ROW EXECUTE PROCEDURE gd_trigfunc_before_buildsum();
> EOF
>
> code maybe 1,2,3
> 3 = sum of 1 & 2
>
> the trigger make the sum :

> if event = INSERT code = 3 ==> skip
> if event = INSERT code = 1 or 2
>    if select code 3 exist ? yes = update qte code 3 + qte code x
>                             no  = insert qte to code 3

So if I am reading you correctly, IF code IN (1,2) THEN {do
something}?  What do you do then?  I don't get this part.

> INSERT INTO test VALUES (1, 50);
> INSERT INTO test VALUES (3, 60);
> ok code 3 = 110
>
> psql : delete from test;
> code 1 & 2 deleted
> code 3 = 50

I don't see how this is possible either.  You just deleted from test;
how can you hae anything left in there?

> maybe this is because these commands start a transaction with a cache
> and each time that I request a select for a sum code that are on the
> cache the value returned are not the real new value.

No, but a trigger that executes SQL can cause the trigger to fire
again.  Are you sure that's not happening?

A

--
Andrew Sullivan  | ajs@crankycanuck.ca
Everything that happens in the world happens at some place.
        --Jane Jacobs

Re: cache problem (v2)

From

De Leeuw Guy

Date:

16 July 2007, 15:34:10

Hello Andrew,

Thanks for your attention

Andrew Sullivan a écrit :
> I tried to answer the previous version of this mail, but I don't
> understand it.  Let me see if my questions trigger something for you,
> though.
>
> So if I am reading you correctly, IF code IN (1,2) THEN {do
> something}?  What do you do then?  I don't get this part.
>
I record a row with code = 3 and the value = the sum of each values in
code 1 & 2
> I don't see how this is possible either.  You just deleted from test;
> how can you hae anything left in there?
>
I skip the delete action when the code is a sum code
>
>> maybe this is because these commands start a transaction with a cache
>> and each time that I request a select for a sum code that are on the
>> cache the value returned are not the real new value.
>>
>
> No, but a trigger that executes SQL can cause the trigger to fire
> again.  Are you sure that's not happening?
>
>

Yes I test this situation on my trigger.
I sure about 99% that my problem come from the cache.

Regards

Guy

Re: cache problem (v2)

From

Andrew Sullivan

Date:

16 July 2007, 15:48:57

On Mon, Jul 16, 2007 at 07:47:09PM +0200, De Leeuw Guy wrote:
> >
> > So if I am reading you correctly, IF code IN (1,2) THEN {do
> > something}?  What do you do then?  I don't get this part.
> >
> I record a row with code = 3 and the value = the sum of each values in
> code 1 & 2

This is also opaque to me now, because that doesn't seem to be what
your example said.  Perhaps you should post your function body?

> > I don't see how this is possible either.  You just deleted from test;
> > how can you hae anything left in there?
> >
> I skip the delete action when the code is a sum code

Well, that's not what your example said either.  You need to show us
actually what you are doing, rather than describing it in terms that
leave out significant parts.

> Yes I test this situation on my trigger.
> I sure about 99% that my problem come from the cache.

There is nothing in the cache that isn't "in the database", as it
were, but there are visibility rules that might be affecting you.

A

--
Andrew Sullivan  | ajs@crankycanuck.ca
When my information changes, I alter my conclusions.  What do you do sir?
        --attr. John Maynard Keynes

Re: cache problem (v2)

From

Andrew Sullivan

Date:

16 July 2007, 17:37:11

On Mon, Jul 16, 2007 at 10:22:33PM +0200, De Leeuw Guy wrote:
> >
> Yes, but maybe the post are too big, can I send it to your address ?
> It's written in C

Well, sending it to me will do you zero good, as my C skills are
awful.  Assuming it's a reasonable size, I think the list will take
it.

> > There is nothing in the cache that isn't "in the database", as it
> > were, but there are visibility rules that might be affecting you.
> >
> visibility rules ?

You can see things that happened in your own transaction, but nobody
else can.  If you're working READ COMMITTED, you can also see work
that other transactions commit while you were working.  Could any of
that be affecting you?  Unless your own function is doing something
with some cache, I am positive there's no cache issue here as you
describe it.

A

--
Andrew Sullivan  | ajs@crankycanuck.ca
Unfortunately reformatting the Internet is a little more painful
than reformatting your hard drive when it gets out of whack.
        --Scott Morris

Re: cache problem (v2)

From

De Leeuw Guy

Date:

16 July 2007, 17:40:14

> This is also opaque to me now, because that doesn't seem to be what
> your example said.  Perhaps you should post your function body?
>
Yes, but maybe the post are too big, can I send it to your address ?
It's written in C
> There is nothing in the cache that isn't "in the database", as it
> were, but there are visibility rules that might be affecting you.
>
visibility rules ?

regards
Guy

Re: cache problem (v2)

From

De Leeuw Guy

Date:

16 July 2007, 17:59:22

> Well, sending it to me will do you zero good, as my C skills are
> awful.  Assuming it's a reasonable size, I think the list will take
> it.
No the complete example need also db schema and data.
I try to reproduce with a small program.
>
>
>
>>> There is nothing in the cache that isn't "in the database", as it
>>> were, but there are visibility rules that might be affecting you.
>>>
>>>
>> visibility rules ?
>>
>
> You can see things that happened in your own transaction, but nobody
> else can.  If you're working READ COMMITTED, you can also see work
> that other transactions commit while you were working.  Could any of
> that be affecting you?  Unless your own function is doing something
> with some cache, I am positive there's no cache issue here as you
> describe it.
>
> A
>
>
I check that tomorrow (I'm tired, 22h49h in Belgium) :-)
thanks for your help
Guy

Re: cache problem (v2)

From

De Leeuw Guy

Date:

17 July 2007, 09:08:47

Hello Andrew

> You can see things that happened in your own transaction, but nobody
> else can.  If you're working READ COMMITTED, you can also see work
> that other transactions commit while you were working.  Could any of
> that be affecting you?  Unless your own function is doing something
> with some cache, I am positive there's no cache issue here as you
> describe it.
>
>
I read the documentation :
http://www.postgresql.org/docs/8.2/interactive/transaction-iso.html#XACT-SERIALIZABLE
and it's maybe my problem

the COPY start a transaction ? if yes each time my trigger select a data
and update then the value returned by the read are the value before the
transaction. It's true ? if yes my question are : also in the same
transaction (like my trigger) ?

Great thanks

Guy

Re: cache problem (v2)

From

Andrew Sullivan

Date:

17 July 2007, 12:28:59

On Tue, Jul 17, 2007 at 07:59:32AM +0200, De Leeuw Guy wrote:
> the COPY start a transaction

Yes.  _Everything_ is in a transaction in PostgreSQL.  If you don't
explicitly do BEGIN. . .COMMIT, then the system does it implicitly
for each statement.

> if yes each time my trigger select a data
> and update then the value returned by the read are the value before the
> transaction. It's true ? if yes my question are : also in the same
> transaction (like my trigger) ?

Your trigger function runs inside the transaction of the calling
statement, unless you have explicitly started a transaction.

I'm starting to get the impression you have a misunderstanding of how
PostgreSQL works.  I think you maybe need to spend some more time
with the manual.  If you find the English version puzzling, you can
also work in French (at http://docs.postgresqlfr.org/) or German (I
see a translation at
http://www.postgresql.org/docs/books/pghandbuch.html.de; I dunno if
there's a more recent one).  Sorry, AFAIK there isn't one in Dutch.

A

--
Andrew Sullivan  | ajs@crankycanuck.ca
The whole tendency of modern prose is away from concreteness.
        --George Orwell

Re: cache problem (v2)

From

De Leeuw Guy

Date:

17 July 2007, 13:10:46

> Yes.  _Everything_ is in a transaction in PostgreSQL.  If you don't
> explicitly do BEGIN. . .COMMIT, then the system does it implicitly
> for each statement.
>
>> if yes each time my trigger select a data
>> and update then the value returned by the read are the value before the
>> transaction. It's true ? if yes my question are : also in the same
>> transaction (like my trigger) ?
>>
>
> Your trigger function runs inside the transaction of the calling
> statement, unless you have explicitly started a transaction.
>
Not possible from a trigger

> I'm starting to get the impression you have a misunderstanding of how
> PostgreSQL works.  I think you maybe need to spend some more time
> with the manual.  If you find the English version puzzling, you can
> also work in French (at http://docs.postgresqlfr.org/) or German (I
> see a translation at
> http://www.postgresql.org/docs/books/pghandbuch.html.de; I dunno if
> there's a more recent one).  Sorry, AFAIK there isn't one in Dutch.
>
Yes true, It's my first step with postgres but also with a relationnal
database.
But I'm sure that with COPY the data are not correctly updated and
if I edit the file loaded by the COPY command and adapt it with each
line with a command INSERT all work fine.

Guy

Re: cache problem (v2)

From

Andrew Sullivan

Date:

17 July 2007, 13:23:17

On Tue, Jul 17, 2007 at 05:49:15PM +0200, De Leeuw Guy wrote:

> > Your trigger function runs inside the transaction of the calling
> > statement, unless you have explicitly started a transaction.
> >
> Not possible from a trigger

Yes, sorry, I phrased that wrong.  Let me put it differently: your
trigger runs only inside the transaction of the calling statement,
unless that statement itself is inside a longer explicitly-called
transaction.  For example:

t1                        t2

BEGIN
UPDATE table_with_trigger
SELECT something             SELECT ...FROM trigger_effect
INSERT something else
COMMIT

In this case, t2 does _not_ see the effects of the trigger in t1,
because those effects are not visible until the COMMIT.  But

t1                        t2

UPDATE table_with_trigger
SELECT something            SELECT...FROM trigger_effect
INSERT something else

in this case, t2 _does_ see the effects, because the trigger's
effects are COMMITted implicitly after the UPDATE statement.

> Yes true, It's my first step with postgres but also with a relationnal
> database.
> But I'm sure that with COPY the data are not correctly updated and
> if I edit the file loaded by the COPY command and adapt it with each
> line with a command INSERT all work fine.

If I read that right, you admit that you are inexperienced with the
concepts and the software, and you are unable to show us all the
relevant code or send us a precise description of what you are
doing; but, you are convinced nevertheless that the problem is a bug
or deficiency in PostgreSQL that nobody else seems to be having, and
not a problem with your approach?  I suggest you think again.

A

--
Andrew Sullivan  | ajs@crankycanuck.ca
However important originality may be in some fields, restraint and
adherence to procedure emerge as the more significant virtues in a
great many others.   --Alain de Botton

Re: cache problem (v2)

From

De Leeuw Guy

Date:

17 July 2007, 16:00:08

> Yes, sorry, I phrased that wrong.  Let me put it differently: your
> trigger runs only inside the transaction of the calling statement,
> unless that statement itself is inside a longer explicitly-called
> transaction.  For example:
>
> t1                        t2
>
> BEGIN
> UPDATE table_with_trigger
> SELECT something             SELECT ...FROM trigger_effect
> INSERT something else
> COMMIT
>
> In this case, t2 does _not_ see the effects of the trigger in t1,
> because those effects are not visible until the COMMIT.  But
>
> t1                        t2
>
> UPDATE table_with_trigger
> SELECT something            SELECT...FROM trigger_effect
> INSERT something else
>
> in this case, t2 _does_ see the effects, because the trigger's
> effects are COMMITted implicitly after the UPDATE statement.
>
Yes, that I understand, but it's not my case
I have :
table test ( int code, int qte);
t1
INSERT test  values(1, 150)
   call my trigger that SELECT WHERE code=3 (does not exist) and INSERT
INTO test (code=3, qte=150)
INSERT test  values(2, 450)
    call my trigger that SELECT WHERE code = 3 (exist) and UPDATE test
(code=3, qte=600)
....

Ok all work fine

Now I have a flat file :
1,150
2,450

COPY .... path_to_this_flat_file

code=3,qte=450
Why ?
Another error also is "duplicate key"
It's like that, into my trigger, the second SELECT return 0 rows
(SPI_processed = 0) and in this case the trigger try to INSERT instead
of UPDATE
 I loose one week with this problem.
> If I read that right, you admit that you are inexperienced with the
> concepts and the software, and you are unable to show us all the
> relevant code or send us a precise description of what you are
> doing; but, you are convinced nevertheless that the problem is a bug
> or deficiency in PostgreSQL that nobody else seems to be having, and
> not a problem with your approach?  I suggest you think again.
>
I read the documentation from postgres not all. But sufficiently to
start a test  of a trigger.
but sure, I'm not a veteran :-)
I never say that it's a bug, I say that with COPY the trigger does not
work like with INSERT.
The event received by the trigger are the same no ?
And it's what  I try to understand why the comportment of my trigger
change when the call
come from COPY  or from a lot of INSERT it's all.
You give me a way with the COMMITTED READ.

I try to explain more : (tomorrow I try to put on our web site a full
example and data)
I work on a statics project, before this project are builded in 1990
with CIsam from informix.
In 2004 I migrate the development from CIsam to  berkeleyDB.

Now to allow more possibilities for our users  (like connection from OOo
base, calc and so on)
I try to migrate the model to postgres

The problem is that I receive about 4 millions of data by month.
and to speed up the major type of query called by our users I build a
sum of different  items.
example :

Origins (company a, company b,....) I build a "Total Eurofer"
Market (France, Belgium, ....) I build a "Total All Markets"
Products (product a, product b) total product a+b

In real world : for the Import data : 30 Origins, 249 Markets, 1472
products, 360 periods (like 1999-01)
The sum are about 3 Origins, 28 Markets, 158 products

This is the job of my trigger : build the sum code to speed up the
standard query of our users.
for each origins
  insert/update the base code
  check if this code update a sum code if yes insert/update the sum code
    for each market
      insert/update the base code
        check if this code update a sum code if yes do it
          ...

finally from about 4 millions of input data I output about 16 millions
of records

Sorry if I disturb you.
Regards

Guy

Re: cache problem (v2)

From

Andrew Sullivan

Date:

17 July 2007, 16:22:07

On Tue, Jul 17, 2007 at 08:45:13PM +0200, De Leeuw Guy wrote:
> I have :
> table test ( int code, int qte);

Right, I remember that part.

> t1

Is that BEGIN;?  If not, this is _not_ one transaction.  It's two.

> INSERT test  values(1, 150)
>    call my trigger that SELECT WHERE code=3 (does not exist) and INSERT

You're not actually _calling_ the trigger, right?  It just happens
automatically?  Also, I don't have the slightest clue how this code=3
(does not exist) works.  AFAICT code=1.  So code=3 is never true, no?

> Now I have a flat file :
> 1,150
> 2,450
>
> COPY .... path_to_this_flat_file
>
> code=3,qte=450
> Why ?

Because it's all one transaction, and because there's more than one
state your variables could resolve to, only one of them actually
applies is my _guess_.  It's hard for me to say with any more detail
without the code and the schema.  Anyway, you have two transactions
in your first example, it appears.  The COPY statement is only one.

> Another error also is "duplicate key"

This is a different problem.  Where is it coming from?  Anyway, you
have some sort of collision there, as the result I assume of your
modifications of the data.  Does it only happen with the COPY case?
If so, that's another clue that the trigger function is not doing
what you think it is.

> I read the documentation from postgres not all. But sufficiently to
> start a test  of a trigger.

But apparently not a successful one ;-)  I'm just saying, it seems to
me that you have a deep misunderstanding of the way transaction scope
works.  I think you need to have another look at that.  I also think
you need to look a little harder at how COPY works as compared to
INSERT.

> I never say that it's a bug, I say that with COPY the trigger does not
> work like with INSERT.

I think it does, and I think your problem is coming from your
misunderstanding of how the trigger is working.  But I still don't
have a clear handle on what you're trying to do.  One way to try this
is to put your two INSERT statements into one transaction.  If things
are really as you seem to have described them (two insert statements,
two lines in the COPY file), then if you do BEGIN;INSERT..;INSERT..;
COMMIT you should get the same problem you do with COPY.

> and to speed up the major type of query called by our users I build a
> sum of different  items.

I can see why you'd do that (although it seems like a
pre-optimisation to me -- you might find that PostgreSQL is fast
enough to do this without the precompiled number).

> This is the job of my trigger : build the sum code to speed up the
> standard query of our users.

Right.  Like I said, I think you have a problem, likely in your
function code, that is causing only one criterion to evaluate to TRUE
when you think two of them should.

A

--
Andrew Sullivan  | ajs@crankycanuck.ca
A certain description of men are for getting out of debt, yet are
against all taxes for raising money to pay it off.
        --Alexander Hamilton

Re: cache problem (v2)

From

De Leeuw Guy

Date:

17 July 2007, 17:43:19

> Is that BEGIN;?  If not, this is _not_ one transaction.  It's two.
>
Ok
> You're not actually _calling_ the trigger, right?  It just happens
> automatically?  Also, I don't have the slightest clue how this code=3
> (does not exist) works.  AFAICT code=1.  So code=3 is never true, no?
>
Yes it's a case study but my trigger consider  this possibility.
>
>> Now I have a flat file :
>> 1,150
>> 2,450
>>
>> COPY .... path_to_this_flat_file
>>
>> code=3,qte=450
>> Why ?
>>
>
> Because it's all one transaction, and because there's more than one
> state your variables could resolve to, only one of them actually
> applies is my _guess_.  It's hard for me to say with any more detail
> without the code and the schema.  Anyway, you have two transactions
> in your first example, it appears.  The COPY statement is only one.
>
Ok, but strange for me ... the transaction is a guarantee of visibility
inter process but in the same also ?
If yes it' is possible to change that ? I see on the documentation (yes,
yes I read them :-) ) that the UNCOMMITED_READ are equal to COMMITED
READ in postgres
>
>> Another error also is "duplicate key"
>>
>
> This is a different problem.  Where is it coming from?  Anyway, you
> have some sort of collision there, as the result I assume of your
> modifications of the data.  Does it only happen with the COPY case?
> If so, that's another clue that the trigger function is not doing
> what you think it is.
>
Yes I agree, originally same problem
>
>> I read the documentation from postgres not all. But sufficiently to
>> start a test  of a trigger.
>>
>
> But apparently not a successful one ;-)  I'm just saying, it seems to
> me that you have a deep misunderstanding of the way transaction scope
> works.  I think you need to have another look at that.  I also think
> you need to look a little harder at how COPY works as compared to
> INSERT.
>
Yes for me in the same transaction I see my update but is not the case.
But I agree I take more time to understand the transactional part :-)

Regards

Guy