Thread: I want your opinion on how to do something.

I want your opinion on how to do something.

From
Herouth Maoz
Date:
Basically, I have several production databases with various data, and I have a reports database that grabs all necessary data once a day.

Now, there is is a new requirement to have some of the data available in the reports database as soon as it is inserted in the production database. Specifically, the data in question is in two particular tables. However, in addition to just shipping the tables in, I also need to perform a bit of processing on the data as it comes. Basically, each transaction in production is represented by 1-3 rows in the two tables. One of the fields contains urlencoded data, which I need to decode and concatenate, so that the rows are represented by one row in another table.

E.g.

Table A brought from production:
key field 1 | key field 2 | num of segments | segment num | segment id |
------------+-------------+-----------------+-------------+------------|
abcde       | 134         | 3               | 1           | 999000     |
abcde       | 567         | 3               | 3           | 999000     |
abcde       | 890         | 3               | 2           | 999000     |
fghij       | 987         | 2               | 1           | 999001     |
fghij       | 654         | 2               | 2           | 999001     |
abcde       | -11         | 1               | 1           | 999003     |

Table B from production

key field 1 | key field 2 | urlencoded data |
------------+-------------+-----------------+
abcde       | 134         | AAA             |
abcde       | 567         | CCC             |
abcde       | 890         | BBB             |
fghij       | 987         | fff             |
fghij       | 654         | ggg             |
abcde       | -11         | XXX             |

Here we have basically three transactions - one with three segments, one with two, and one with a single segment. The data that identifies that certain rows belong to the same transation and what the order is is in table A. The actual data to decode is in table B.

The result I need to produce is a new table like this:

key field 1 | key field 2 | segment id | decoded concatenated data|
------------+-------------+------------+--------------------------|
abcde       | 134         | 999000     | AAABBBCCC                |
fghij       | 987         | 999001     | fffggg                   |
abcde       | -11         | 999003     | XXX                      |

Basically, a single row for each transaction, with the key fields taken from the original's first segment, and the data decoded and concatenated. But I need this to be done in (near) real-time - as the rows are added. Because of the decoding and concatenation requirements, this can't be a view, because there is no SQL function that will do the required processing, and I'm not sure a view would be efficient for running reports anyway. So a new table it will be.

Basically, I can replicate these two tables from production to the reports database using Londiste, which allows me to replicate single tables and is also supposed to allow me to run triggers on the tables.

What I thought of doing was create a trigger on one of the tables, such that when data is inserted, it will insert data into the "result" table, and write this trigger using PL/Perl or PL/Python which I'm guessing will allow me to do the string manipulation required.

The issue is that when an insert or an update is fired, I can't say whether all the segments of the same transaction have been written yet, and if only some of them were written, there is no guarantee on the order in which they are written. The question is whether such a trigger can be created at all. It needs to verify that all parts are in the database - run a query on the same table at the time the trigger is running. If not all data is there, it shouldn't insert anything. Is this feasible at all? How would you achieve it?

Thank you,
Herouth

Re: Maintaining a materialized view only on a replica

From
Craig Ringer
Date:
Subject changed to describe the problem. Reply in-line.

On 09/04/2012 07:57 PM, Herouth Maoz wrote:

> The issue is that when an insert or an update is fired, I can't say
> whether all the segments of the same transaction have been written yet,
> and if only some of them were written, there is no guarantee on the
> order in which they are written.

Does Slony-I provide stronger guarantees? If your replication doesn't
guarantee ordering then you're going to have a very hard time doing this.

> Is this
> feasible at all? How would you achieve it?

I'd try to find a replication system that guaranteed ordering if at all
possible.

--
Craig Ringer


Re: Maintaining a materialized view only on a replica

From
Herouth Maoz
Date:
It's not an issue with the replication software. The reason the parts of the transaction are written out of order is that the original system that writes them in the first place makes no guarantees as to the order of writing.

So basically my question is whether a trigger that runs a full aggregate SQL query on the table that triggered it, joining with another table, checking the rows returned and doing the insert in the second table only when the data is complete is feasible, because that's basically what I need to do.

Herouth


On 05/09/2012, at 00:52, Craig Ringer wrote:

Subject changed to describe the problem. Reply in-line.

On 09/04/2012 07:57 PM, Herouth Maoz wrote:

The issue is that when an insert or an update is fired, I can't say
whether all the segments of the same transaction have been written yet,
and if only some of them were written, there is no guarantee on the
order in which they are written.

Does Slony-I provide stronger guarantees? If your replication doesn't guarantee ordering then you're going to have a very hard time doing this.

Is this
feasible at all? How would you achieve it?

I'd try to find a replication system that guaranteed ordering if at all possible.

--
Craig Ringer


--
חרות מעוז
יוניסל פתרונות סלולריים מתקדמים
☎ 03-5181717 שלוחה 742