Thread: [GENERAL] Log storage
Hello everyone, An inaugural poster here, sorry if I misidentified a list for my question. I am planning to use PostgreSQL as a storage for application logs (lines of text) with the following properties: - Ingest logs at high rate: 3K lines per second minimum, but the more the better as it would mean we could use one Postgres instance for more than one app. - Only store logs for a short while: days, may be weeks. - Efficiently query logs by an arbitrary time period. - A "live feed" output, akin to `tail -f` on a file. For context, I only used Postgres for a bog standard read-heavy web apps, so I'm completely out of expertise for such a case. Here are my questions: - Is it even possible/advisable to use an actual ACID RDBMS for such a load? Or put another way, can Postgres be tuned to achieve the required write throughput on some mid-level hardware on AWS? May be at the expense of sacrificing transaction isolation or something… - Is there an efficient kind of index that would allow me to do `where 'time' between ... ` on a constantly updated table? - Is there such a thing as a "live cursor" in Postgres for doing the `tail -f` like output, or I should just query it in a loop (and skip records if the client can't keep up)? Thanks in advance for all the answers! -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
What is the (min, max, avg) size of the inserted text?

-----
PAscal
SQLeo projection manager
Senior Oracle dba migrating towards PostgreSQL

--
Sent from: http://www.postgresql-archive.org/PostgreSQL-general-f1843780.html
Can't get to hard data right now, but these are app logs that try to be no more than ~100 characters long for readability, plus HTTP logs with long-ish request lines, which might put them in the neighborhood of 2K characters.

On 10/18/2017 02:30 AM, legrand legrand wrote:
> What is the (min, max, avg) size of the inserted text?
I saw a similar project on Oracle that was storing (long) messages (CLOBs). Partitioning by creation date was in place, as were B-tree indexes to access data per id. It was working fine for inserts, as for selects, but purges (deletes) were not freeing space. In fact, retention was not the same for all records. We changed the partitioning key to deletion date and replaced deletes with partition drops.

Maybe, if all your records have the same retention, partitioning per creation date can help with purging. A local index on creation date would also help.

PAscal
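A minimal sketch of the partition-per-day idea described above, using Postgres 10+ declarative partitioning; the table and column names are illustrative assumptions, not part of any actual setup:

```sql
-- Hypothetical log table, partitioned by ingest time (Postgres 10+ syntax).
CREATE TABLE app_log (
    created_at  timestamptz NOT NULL,
    line        text        NOT NULL
) PARTITION BY RANGE (created_at);

-- One partition per day; these would be created ahead of time (e.g. by a cron job).
CREATE TABLE app_log_2017_10_18 PARTITION OF app_log
    FOR VALUES FROM ('2017-10-18') TO ('2017-10-19');

-- A local index on each partition keeps "WHERE created_at BETWEEN ..." fast.
CREATE INDEX ON app_log_2017_10_18 (created_at);

-- Purging a whole day is then a cheap partition drop,
-- not a slow DELETE that leaves dead space behind.
DROP TABLE app_log_2017_10_18;
```

For append-only, time-ordered data it may also be worth looking at a BRIN index on the timestamp column: it stays tiny and works well when physical row order correlates with time.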
Friendly greetings !
You may want to take a look at a postgresql "fork" called pipelinedb : https://www.pipelinedb.com/
I'm not working for them and not using it, but I happen to know it exists :)
*hugs*
Laurent "ker2x" Laborde
On 18-10-2017 09:18, Ivan Sagalaev wrote:
> [snip]

Hello,

Not much on the topic, but I had the same problem and solved it by using a Redis server (memory is cheap and fast) to store the logs for an hour/day, depending on the load average, and then dumping them into a CSV or SQL file and inserting that into the PostgreSQL database. My Redis records are structured so that I can review the current actions of each user, like `tail -f`.
The hardware is not much: a Redis server with a lot of memory and a cheap server for the database that stores the logs. I am now even trying a different approach that removes the database server, because I store each day as a separate gzipped log file for backup.

Regards,
Hristo S
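A minimal sketch of the buffer-and-batch idea from this reply, with the Redis and Postgres sides stubbed out as comments; the function name, field layout, and sample entries are illustrative assumptions, not Hristo's actual setup:

```python
import csv
import io


def logs_to_csv(entries):
    """Serialize buffered (timestamp, line) log entries into a single CSV
    payload, suitable for one bulk load (e.g. Postgres COPY ... FROM STDIN
    WITH (FORMAT csv)) instead of thousands of row-by-row INSERTs."""
    buf = io.StringIO()
    writer = csv.writer(buf)
    for created_at, line in entries:
        writer.writerow([created_at, line])
    return buf.getvalue()


# In production the batch would be drained from Redis once per hour/day
# (e.g. LRANGE followed by DEL, or RPOPLPUSH per item); here we just use
# an in-memory list so the sketch is self-contained.
batch = [
    ("2017-10-18T09:18:00Z", "GET /healthz 200"),
    ("2017-10-18T09:18:01Z", 'user "bob" logged in'),
]
payload = logs_to_csv(batch)
```

The CSV writer takes care of quoting embedded quotes and commas, so arbitrary log lines survive the round trip into the bulk-load format.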