Re: DB design advice - Mailing list pgsql-novice

From Sergey Konoplev
Subject Re: DB design advice
Date
Msg-id CAL_0b1tJSCMJVWKgx-QgbndMZi8ikBdpo7zx53tHB+Zi3FLY4g@mail.gmail.com
Whole thread Raw
In response to Re: DB design advice  (Toni Alfirević <toni.alfirevic@gmail.com>)
Responses Re: DB design advice  (Toni Alfirević <toni.alfirevic@gmail.com>)
List pgsql-novice
On Thu, Jun 6, 2013 at 11:43 PM, Toni Alfirević
<toni.alfirevic@gmail.com> wrote:
> It will most likely return subsets of data just like you said (pagination).

Okay. What do we have?

Table log (id bigserial primary key, ts timestamptz, value text,
user_id bigint) with (user_id) foreign key .

Table tag (id bigserial primary key, value text).

Table log_tag (log_id bigint, log_ts timestamptz, log_user_id bigint,
tag_id bigint) with foreign keys (log_id, log_ts, log_user_id) and
(tag_id).

Index log_tag_pk (log_user_id, tag_id, log_ts, log_id) on log_tag.

It it is not so critical, you can make it without foreign keys to gain
some more performance of course, but I wont recommend it on early
stages of your software until it is mature and covered by tests good
enough, so you could believe that it wont break your data integrity.
You can always do this later.

We also have a plpgsql stored function get_log_by_tags(i_log_user_id
bigint, i_tag_ids bigint[], i_ts timestamptz, i_limit) returning set
of log.

This function initiates an empty resulting array _result_logs log[].

Then it takes each _tag_id from i_tag_igs with in a for loop.

And selects all log entries for i_log_user_id and _tag_id inserted
after or before i_ts, depending on the order you want to output your
log records by, and limiting it by i_limit. Then it concatenates these
entries with the resulting array, orders the result by ts and limits
it with i_limit.

This query is supposed to be in the for loop and represent the
paragraph above. I will use descending order as an example.

select into _result_logs array(
    select ll from (
        (
            select log.* from log_tag join log on log.id = log_id
            where log_user_id = i_log_user_id
            and tag_id = _tag_id and log_ts > i_ts
            order by log_ts desc limit i_limit
        ) union (
            select (l::log).* from unnest(_result_logs) as l
        )
    ) as ll
    order by ts limit i_limit
)

I did not check this query so it might contains some typos, etc.

Finally it returns the result of unnesting _result_logs.

select (l::log).* from unnest(_result_logs) as l

If you need to be able to go back and forth you can send, for example,
add an i_back boolean parameter to the function, and according to its
value choose either log_ts > i_ts with desc or log_ts < i_ts with asc
queries, using if/else.

That is is.

> But there will be a situation where all data will need to be analyzed for, let's call it', a report. On the other
handsthose reports are not that sensitive when it comes to performance. It won't be an issue if it takes a minute, two
orthree to generate it. 

Then you can use simple queries like this one here.

select count(1) from log_tag
where
    log_user_id = 987 and
    tag_id in (123, 456) and
    log_ts between now() and now() - '1 week'::interval

It will be okay for kind of reports in most cases.

--
Kind regards,
Sergey Konoplev
PostgreSQL Consultant and DBA

Profile: http://www.linkedin.com/in/grayhemp
Phone: USA +1 (415) 867-9984, Russia +7 (901) 903-0499, +7 (988) 888-1979
Skype: gray-hemp
Jabber: gray.ru@gmail.com


pgsql-novice by date:

Previous
From: Aditya Rastogi
Date:
Subject: PostgreSQL 8.3.6 creating empty log files.
Next
From: Tom Lane
Date:
Subject: Re: PostgreSQL 8.3.6 creating empty log files.