Re: Which replication is the best for our case ? - Mailing list pgsql-general

From John R Pierce
Subject Re: Which replication is the best for our case ?
Msg-id 55942A53.6010307@hogranch.com
In response to Re: Which replication is the best for our case ?  ("ben.play" <benjamin.cohen@playrion.com>)
List pgsql-general
On 7/1/2015 3:08 AM, ben.play wrote:
> In fact, the cron job will:
> -> select about 10,000 lines from a big table (>100 GB of data). Each user has
> about 10 lines.
> -> run each line through an algorithm
> -> after each line, update a few parameters for the
> user (add some points, for example)
> -> then insert a line in another table to record each transaction for the
> user.
>
> All of these updates and inserts are made ONLY by the cron job ...
> Therefore ... the merge can be done easily: no one else can update this new
> data.
>
> But ... how can big companies like Facebook or YouTube run calculations on (a)
> dedicated server(s) without impacting users?

that sort of batch processing is not normally done in database-centric
systems; rather, databases are usually updated continuously, in real time,
as events come in via transactions.

your cron task is undoubtedly single-threaded, which means it runs on one
core only, so the whole system ends up waiting on a single task
crunching massive amounts of data while your other processor cores have
nothing to do.
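
if the batch job has to stay, one possible way to use the idle cores is to group the selected lines by user and hand the groups to a pool of worker processes. this is only a sketch under the assumption that each user's lines can be scored independently; score_line is a hypothetical placeholder for the real algorithm, and the (user_id, line) row shape is made up for illustration.

```python
from collections import defaultdict
from concurrent.futures import ProcessPoolExecutor


def score_line(line):
    # Hypothetical stand-in for the real per-line algorithm.
    return len(line)


def group_by_user(rows):
    """Group (user_id, line) pairs so each user is handled by one worker."""
    grouped = defaultdict(list)
    for user_id, line in rows:
        grouped[user_id].append(line)
    return grouped


def score_user(item):
    """Compute the total points delta for a single user's lines."""
    user_id, lines = item
    return user_id, sum(score_line(line) for line in lines)


def score_all(rows, executor_cls=ProcessPoolExecutor, workers=4):
    """Score every user's lines in parallel and return {user_id: delta}."""
    grouped = group_by_user(rows)
    with executor_cls(max_workers=workers) as pool:
        return dict(pool.map(score_user, grouped.items()))
```

call score_all(rows) from under an `if __name__ == "__main__":` guard so the worker processes can import the module cleanly. the resulting deltas would still need to be written back to the database, ideally one short transaction per user, with each worker using its own connection.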

it sounds to me like whoever designed this system didn't have a solid
grip on transactional database processing.



--
john r pierce, recycling bits in santa cruz


