Thread: [GENERAL] Log storage

[GENERAL] Log storage

From
Ivan Sagalaev
Date:
Hello everyone,

An inaugural poster here; sorry if I've misidentified the list for my question.

I am planning to use PostgreSQL as storage for application logs (lines 
of text) with the following properties:

- Ingest logs at high rate: 3K lines per second minimum, but the more 
the better as it would mean we could use one Postgres instance for more 
than one app.

- Only store logs for a short while: days, maybe weeks.

- Efficiently query logs by an arbitrary time period.

- A "live feed" output, akin to `tail -f` on a file.

For context, I've only used Postgres for bog-standard read-heavy web 
apps, so I'm completely out of my depth for a case like this. Here are 
my questions:

- Is it even possible/advisable to use an actual ACID RDBMS for such a 
load? Or, put another way, can Postgres be tuned to achieve the required 
write throughput on mid-level AWS hardware? Maybe at the expense of 
sacrificing transaction isolation or something…

- Is there an efficient kind of index that would allow me to do `where 
"time" between ...` on a constantly updated table?

- Is there such a thing as a "live cursor" in Postgres for doing the 
`tail -f` like output, or I should just query it in a loop (and skip 
records if the client can't keep up)?

Thanks in advance for all the answers!


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Re: [GENERAL] Log storage

From
legrand legrand
Date:
What is the (min, max, avg) size of the inserted text?



-----
PAscal
SQLeo projection manager
Senior Oracle dba migrating towards PostgreSQL 
--
Sent from: http://www.postgresql-archive.org/PostgreSQL-general-f1843780.html



Re: [GENERAL] Log storage

From
Ivan Sagalaev
Date:
I can't get to hard data right now, but these are app logs that try to 
stay under ~100 characters per line for readability, plus HTTP logs 
with long-ish request lines that might land in the neighborhood of 2K 
characters.

On 10/18/2017 02:30 AM, legrand legrand wrote:
> What is the (min, max, avg) size of the inserted text?



Re: [GENERAL] Log storage

From
legrand legrand
Date:
I saw a similar project on Oracle that was storing (long) messages (CLOBs).
Partitioning by creation date was in place, with B-tree indexes to access
data by id. It worked fine for inserts and selects, but purges (deletes)
were not freeing space. In fact, retention was not the same for all
records. We changed the partitioning key to deletion date and replaced
deletes with partition drops.

Maybe, if all your records have the same retention, partitioning by
creation date can help with purging. A local index on creation date would
also help.
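
For the record, PostgreSQL 10 added declarative partitioning, which makes
this pattern fairly direct. A minimal sketch, with made-up table and
partition names (one partition per day; the per-partition index serves
time-range queries, and dropping a partition purges a whole day without
DELETE bloat or VACUUM pressure):

```sql
-- Hypothetical names; declarative partitioning needs PostgreSQL 10+.
CREATE TABLE app_log (
    ts   timestamptz NOT NULL,
    line text        NOT NULL
) PARTITION BY RANGE (ts);

-- Create each day's partition ahead of time (e.g. from a cron job).
CREATE TABLE app_log_2017_10_18 PARTITION OF app_log
    FOR VALUES FROM ('2017-10-18') TO ('2017-10-19');

-- A local index per partition supports WHERE ts BETWEEN ... AND ...
CREATE INDEX ON app_log_2017_10_18 (ts);

-- Purge a whole day's logs instantly, with no DELETE:
DROP TABLE app_log_2017_10_18;
```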

PAscal






Re: [GENERAL] Log storage

From
Laurent Laborde
Date:
Friendly greetings ! 

You may want to take a look at a postgresql "fork" called pipelinedb : https://www.pipelinedb.com/

I'm not working for them, and I'm not using it, but I happen to know it exists :)

*hugs*

--
Laurent "ker2x" Laborde

Re: [GENERAL] Log storage

From
Condor
Date:
On 18-10-2017 09:18, Ivan Sagalaev wrote:
> Hello everyone,
> 
> I am planning to use PostgreSQL as a storage for application logs
> (lines of text) with the following properties: [...]
> 
> Thanks in advance for all the answers!


Hello,

Not much on the topic, but I had the same problem and solved it by using 
a Redis server (memory is cheap and fast) to buffer the logs for an 
hour or a day, depending on the load, then dumping them to a CSV or SQL 
file and inserting that into the PostgreSQL database.
My Redis records are structured so that I can review the current actions 
of each user, like `tail -f`.
The hardware is nothing special: a Redis server with a lot of memory and 
a cheap database server to store the logs. I'm now even trying a 
different approach that removes the database server entirely, because I 
already keep each day as a separate gzipped log file for backup.
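
The buffer-then-bulk-load idea can be sketched in a few lines. This is a
hypothetical Python illustration, not my actual setup: the class, its
thresholds, and the sample log lines are all made up. It accumulates log
lines in memory and flushes them as CSV text, which is the shape
PostgreSQL's `COPY ... FROM STDIN WITH (FORMAT csv)` expects:

```python
import csv
import io
import time

class LogBuffer:
    """Accumulate (timestamp, line) rows in memory; flush them as CSV
    text suitable for a bulk COPY into PostgreSQL. Hypothetical sketch."""

    def __init__(self, max_lines=10000, max_age_seconds=3600):
        self.max_lines = max_lines
        self.max_age = max_age_seconds
        self._rows = []
        self._started = time.monotonic()

    def add(self, ts, line):
        self._rows.append((ts, line))

    def should_flush(self):
        # Flush on either a size threshold or an age threshold.
        return (len(self._rows) >= self.max_lines
                or time.monotonic() - self._started >= self.max_age)

    def flush(self):
        """Return the buffered rows as CSV text and reset the buffer."""
        out = io.StringIO()
        csv.writer(out).writerows(self._rows)
        self._rows = []
        self._started = time.monotonic()
        return out.getvalue()

buf = LogBuffer(max_lines=2)
buf.add("2017-10-18 09:18:00", "GET /health 200")
buf.add("2017-10-18 09:18:01", "GET /login 302")
csv_chunk = buf.flush() if buf.should_flush() else ""
```

In a real setup the buffer would live in Redis rather than process
memory, and `csv_chunk` would be streamed to the database in one COPY
instead of thousands of single-row INSERTs.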

Regards,
Hristo S

