Thread: RE: Handling time series data with PostgreSQL

RE: Handling time series data with PostgreSQL

From
Jayaram
Date:
Dear All,

I'm Jayaram S, oracle DBA. Currently we are planning to develop a stock market based application which deals 80% of data with time data. We are in the process of choosing the right database for the requirement especially for time series data. After all multiple investigations, I found PostgreSQL with timescaleDB works better than other DBs.

But still I'm new to PGSQL and we wanted only open source technology to handle our requirements. It will be helpful to me if anyone can suggest implementing the time series concepts in PostgreSQL database. 
It's better if I can get proper docs or links with explanation.

Thanks in advance.,

--
Thanks & Regards,
Jayaram S,
Banglore.
India.

Re: Handling time series data with PostgreSQL

From
Adalberto Caccia
Date:
Hi,
TimescaleDB as a Postgresql extension has been used in my firm for two years now, I've recently managed to upgrade it from pg10 to pg12 and from discrete VM's to Kubernetes as well.
Frankly speaking, being new to TimescaleDB at that time, I've found it easy to manage, easy to scale (it's 100% compatible with pg replication, unfortunately not the logical one, yet...), easy to install, easy to upgrade... what else?
From a developer's perspective, it just adds "superpowers" to ordinary PG tables, all under the hood. On disk, it features a "chunked" layout, where each chunk belongs to a definite "time" range; and of course the "time" column on which to index time data is just passed as a parameter to the call to TimescaleDB, for each table on which you need such power.
At the moment, we're also using it for time aggregate calculations, but only for the coarse ones (30m --> 1h and 1h --> 1 day), while we're still handling the finer ones (1s --> 1m and so on) in Kafka+Flink, which is a common scenario for a streaming data platform, anyway.

Regards,
Adalberto


Il giorno mar 6 ott 2020 alle ore 11:47 Jayaram <jairamcbe@gmail.com> ha scritto:
Dear All,

I'm Jayaram S, oracle DBA. Currently we are planning to develop a stock market based application which deals 80% of data with time data. We are in the process of choosing the right database for the requirement especially for time series data. After all multiple investigations, I found PostgreSQL with timescaleDB works better than other DBs.

But still I'm new to PGSQL and we wanted only open source technology to handle our requirements. It will be helpful to me if anyone can suggest implementing the time series concepts in PostgreSQL database. 
It's better if I can get proper docs or links with explanation.

Thanks in advance.,

--
Thanks & Regards,
Jayaram S,
Banglore.
India.

Re: Handling time series data with PostgreSQL

From
Jayaram
Date:
Hi Adalberto,

Awesome.!! Thanks for your reply. 

So, Do we need the timescaleDB as mandatory to handle time series data? Is there any way to handle hourly to days,months,yearly data with PGSQL alone without timescale addon?
Ours is a new project and we are unsure about whether we should have both timescaleDB and PGSQL or PGSQL alone is capable of handling this time series data by tuning the right indexes.etc..

What we are planning is to start with PGSQL alone (For handling hourly,daily,monthly,weekly,yearly calculations) and later when we add seconds and minutes, we will include the timescaleDB. Is this the right approach?
Is PGSQL alone capable of meeting this requirement?

Thank you in advance.

Regards,
Jayaram S.

On Wed, Oct 7, 2020 at 1:34 AM Adalberto Caccia <adacaccia@gmail.com> wrote:
Hi,
TimescaleDB as a Postgresql extension has been used in my firm for two years now, I've recently managed to upgrade it from pg10 to pg12 and from discrete VM's to Kubernetes as well.
Frankly speaking, being new to TimescaleDB at that time, I've found it easy to manage, easy to scale (it's 100% compatible with pg replication, unfortunately not the logical one, yet...), easy to install, easy to upgrade... what else?
From a developer's perspective, it just adds "superpowers" to ordinary PG tables, all under the hood. On disk, it features a "chunked" layout, where each chunk belongs to a definite "time" range; and of course the "time" column on which to index time data is just passed as a parameter to the call to TimescaleDB, for each table on which you need such power.
At the moment, we're also using it for time aggregate calculations, but only for the coarse ones (30m --> 1h and 1h --> 1 day), while we're still handling the finer ones (1s --> 1m and so on) in Kafka+Flink, which is a common scenario for a streaming data platform, anyway.

Regards,
Adalberto


Il giorno mar 6 ott 2020 alle ore 11:47 Jayaram <jairamcbe@gmail.com> ha scritto:
Dear All,

I'm Jayaram S, oracle DBA. Currently we are planning to develop a stock market based application which deals 80% of data with time data. We are in the process of choosing the right database for the requirement especially for time series data. After all multiple investigations, I found PostgreSQL with timescaleDB works better than other DBs.

But still I'm new to PGSQL and we wanted only open source technology to handle our requirements. It will be helpful to me if anyone can suggest implementing the time series concepts in PostgreSQL database. 
It's better if I can get proper docs or links with explanation.

Thanks in advance.,

--
Thanks & Regards,
Jayaram S,
Banglore.
India.


--
Thanks & Regards,
Jayaram S,
Banglore.
Mobile: 91-7760951366.

Re: Handling time series data with PostgreSQL

From
Stephen Frost
Date:
Greetings,

* Jayaram (jairamcbe@gmail.com) wrote:
> So, Do we need the timescaleDB as mandatory to handle time series data? Is
> there any way to handle hourly to days,months,yearly data with PGSQL alone
> without timescale addon?

Certainly there is and a lot of people do it- what isn't clear is what
it is you feel is missing from PG when it comes to handling time series
data..?  Generally speaking there's concerns about PG's ability to
handle lots of partitions (which comes from there being very large
amounts of data being stored), but v12 and v13 have made great
improvements in that area and it's not nearly an issue any longer (and
performs better in quite a few cases than extensions).

> Ours is a new project and we are unsure about whether we should have both
> timescaleDB and PGSQL or PGSQL alone is capable of handling this time
> series data by tuning the right indexes.etc..

Partitioning and index tuning in PG (look at using BRIN if you haven't
already...) is important when you get to larger data volumes.

Thanks,

Stephen

Attachment