Re: Best system for a data warehouse application - Mailing list pgsql-admin

From Steven Chang
Subject Re: Best system for a data warehouse application
Date
Msg-id CAEJt7k3+VKQti-C7_m6BcnC1zK528jBsRt6yGen6Nds0cRPKvg@mail.gmail.com
Whole thread Raw
In response to Re: [ADMIN] Best system for a data warehouse application  (Jaime Soler <jaime.soler@gmail.com>)
List pgsql-admin
hello Jaim,

        Regarding cache-line-contention issue , I think you can refer this page
        Once I resized my shared_buffer from 15GB to 45GB,  AP team reported performance issue after 2 days ,
        and we found swap in/out activities increased vastly
        At that time, the only thing I can do was returning to the original shared_buffer setting -- 15GB, 
        and performance recovered to normal.
        I consulted EDB for memory issue, because their slides introduced at most 8GB for shared_buffer.
        And  I got positive response from them at that time.
       Now I think it maybe an numa issue.
       I will try the same evualtion size of  data buffer size as Oracle's SGA in my future projects.
       Or anybody can share his/her experiences with large shared_buffer size with postgres db.

Regards,
Steven

2017-02-22 22:23 GMT+08:00 Jaime Soler <jaime.soler@gmail.com>:
Well I prefer 1 choice, but first of choose a cpu model, I would check how many concurrent connections use your DW system. Also if you prefer a many-cores cpu, please check some previous errors related with cache line conetention in large numa system:
https://www.enterprisedb.com/blog/solving-cache-line-contention-large-numa-systems

2017-02-22 3:15 GMT+01:00 Fernando Hevia <fhevia@gmail.com>:
Option 1 is best if you will handle but a few connections to postgres (~<=4), since the higher cpu clock will prevail in this scenario.
But if you expect having more than a few simultaneous connections go with option 2 without a doubt.
Check how many simultaneous connections your DW system typically uses and go from there. Although more cores would in general be the correct choice, keep in mind that common DW in-database computations will greatly profit from the extra cpu cycles of the 3.5 GHz cpu.

Cheers.


El 21 feb. 2017 2:39 PM, "John Scalia" <jayknowsunix@gmail.com> escribió:
Hi all,

One of my system admins has approached me about replacing our production data warehouse system. I believe I know the answer, but would just like more opinions. He's giving me two options so far:

1) CPU = Intel e5-2637 3.5GHz 4 core with hyperthreading

vs.

2) CPU = Intel e5-2650v4 2.2GHz 14 core with hyperthreading

The rest of the specs are identical with 512Gb RAM and more than 2 Tb SSD drives. Our data warehouse is just shy of 2Tb now. I suspect the the first option would be more optimal as I don't think PostgreSQL can make effective use of a lot of processor cores. So fewer cores would be better optimized with the O/S. Can anyone weigh in on whether my assumptions are correct?
--
Jay


pgsql-admin by date:

Previous
From: John Scalia
Date:
Subject: Re: [ADMIN] Replication mode Master-Slave - maintenance question.
Next
From: Steven Chang
Date:
Subject: Re: The same role for multiple databases