Thread: how does licensing work for non-profit companies?
A non-profit organization is interested in a new data application that would use a SQL storage system. I'm interested to know how non-profit companies that are not selling products are considered for licensing. Can they use PostgreSQL just like anyone else, or do non-profits qualify as commercial use and need to consider other licensing?

D. Stimits, stimits AT comcast DOT net
"D. Stimits" <stimits@comcast.net> wrote: > A non-profit organization is interested in a new data application that > would use a SQL storage system. I'm interested to know how non-profit > companies that are not selling products are considered for licensing. > Can they use PostgreSQL just like anyone else or do non-profits qualify > as commercial use and need to consider other licensing? I am not a lawyer. The BSD license (under which PostgreSQL is distributed) does not differentiate between different types or classes of users. You are free to use PostgreSQL for whatever you need, for no fee, as long as you don't violate the terms of the license. -- Bill Moran Potential Technologies http://www.potentialtech.com
On Tue, 2004-06-15 at 08:20, D. Stimits wrote:
> A non-profit organization is interested in a new data application that
> would use a SQL storage system. I'm interested to know how non-profit
> companies that are not selling products are considered for licensing.
> Can they use PostgreSQL just like anyone else, or do non-profits qualify
> as commercial use and need to consider other licensing?
>
> D. Stimits, stimits AT comcast DOT net

The basic philosophy of the PostgreSQL Global Development Group seems to be: do what you want with it, just don't sue us.
On Tue, 15 Jun 2004 14:20:21 +0000 "D. Stimits" <stimits@comcast.net> wrote:
> A non-profit organization is interested in a new data application that
> would use a SQL storage system. I'm interested to know how non-profit
> companies that are not selling products are considered for licensing.
> Can they use PostgreSQL just like anyone else, or do non-profits qualify
> as commercial use and need to consider other licensing?

Umm, go read the license. The BSD license is quite simple and quite clear: nobody has to pay. Ever.

richard
--
Richard Welty                                      rwelty@averillpark.net
Averill Park Networking                                      518-573-7592
Java, PHP, PostgreSQL, Unix, Linux, IP Network Engineering, Security
On Tue, 2004-06-15 at 22:38, Richard Welty wrote:
> On Tue, 15 Jun 2004 14:20:21 +0000 "D. Stimits" <stimits@comcast.net> wrote:
>
> > A non-profit organization is interested in a new data application that
> > would use a SQL storage system. I'm interested to know how non-profit
> > companies that are not selling products are considered for licensing.
> > Can they use PostgreSQL just like anyone else, or do non-profits qualify
> > as commercial use and need to consider other licensing?
>
> Umm, go read the license. The BSD license is quite simple and quite
> clear: nobody has to pay. Ever.

Doesn't the open source community just ROCK?! It's like Findhorn for Techies.

--
Andrew Kelly
Online Products Developer
Transparency International e.V.
Otto-Suhr-Allee 97/99
10585 Berlin, Germany
Tel: +4930 3530 5701
Fax: +4930 3530 5702
mailto:devel@corisweb.org
http://www.corisweb.org
http://www.transparency.org
Hi,

We've been working with Postgres for 4 years on an e-CRM web-based application that deals with emails.

As our customer base keeps growing (more than one million rows in several tables) we started to face performance issues with 7.2, and a month ago we decided to upgrade to 7.4.

The upgrade process was OK, but we still face performance problems.

We decided to import the database into SQL Server to run some benchmarks between the two, with the same hardware and the same data.

Please find below the main query we're testing, the explain plan, and the list of indexes.

Here is the result:

Total running time, Postgres:   10-11 s
Total running time, SQL Server: < 1 s

We must admit that we're quite disappointed with the result and currently think there is something wrong in the way we're working with Postgres.

What do you think? Is there something wrong with the configuration? With the way we're building the query?

Thanks a lot for your answers.

Christophe Musielak
Christine Bruzaud
Akio Software

PS: Concerning the seq_scan shown in the explain plan: as there are indexes on the tables, as shown below, we think Postgres is choosing seq_scan over index_scan to improve performance. We tried to force index_scan with SET enable_seqscan = false, but there was no improvement in the total duration of the query.

------------------------------------------------------------------------

We're running our tests on:

Table threads   row count: 125 000
Table emails    row count: 650 000
Table operators row count: 50
Total rows returned by the SELECT: 116 000 (without the LIMIT)

explain SELECT t.*, substring(t.subject::text, 0, 30) AS subject,
       e.email,
       o.lastname AS "operator"
FROM threads t JOIN emails e ON (e.id = t.client_email_id)
JOIN operators o ON (o.id = t.operator_id)
WHERE t.mailbox_id IN ( 2,3,5,20,21,13,22,23,24,25,26,19 )
  AND t.desktop = 2
ORDER BY t.date_last asc, t.id asc
LIMIT 16 OFFSET 0;

                            QUERY PLAN
------------------------------------------------------------------
Limit  (cost=93539.02..93539.06 rows=16 width=619)
  ->  Sort  (cost=93539.02..93745.69 rows=82669 width=619)
        Sort Key: t.date_last, t.id
        ->  Hash Join  (cost=26186.03..73789.00 rows=82669 width=619)
              Hash Cond: ("outer".operator_id = "inner".id)
              ->  Merge Join  (cost=26183.07..69487.26 rows=82668 width=609)
                    Merge Cond: ("outer".id = "inner".client_email_id)
                    ->  Index Scan using emails_pkey on emails e
                        (cost=0.00..37782.57 rows=654511 width=30)
                    ->  Sort  (cost=26183.07..26389.74 rows=82667 width=583)
                          Sort Key: t.client_email_id
                          ->  Seq Scan on threads t
                              (cost=0.00..19431.23 rows=82667 width=583)
                                Filter: (((mailbox_id = 2) OR (mailbox_id = 3)
                                OR (mailbox_id = 5) OR (mailbox_id = 20)
                                OR (mailbox_id = 21) OR (mailbox_id = 13)
                                OR (mailbox_id = 22) OR (mailbox_id = 23)
                                OR (mailbox_id = 24) OR (mailbox_id = 25)
                                OR (mailbox_id = 26) OR (mailbox_id = 19))
                                AND (desktop = 2))
              ->  Hash  (cost=2.85..2.85 rows=44 width=14)
                    ->  Seq Scan on operators o
                        (cost=0.00..2.85 rows=44 width=14)
------------------------------------------------------------------

Indexes on threads:

threads_pkey PRIMARY KEY (id)
CREATE INDEX threads_date_last_idx ON threads (date_last);
CREATE INDEX threads_desktop_idx ON threads (desktop);
CREATE INDEX threads_operator_id_idx ON threads (operator_id);
CREATE UNIQUE INDEX threads_pri_dlast_id_idx ON threads (priority, date_last, id);

Indexes on emails:

emails_pkey PRIMARY KEY (id)

Indexes on operators:

operators_pkey PRIMARY KEY (id)

-----------------------------------------------------------------
postgresql.conf modified parameters:

shared_buffers = 14000        # = 112 MB, with 10 MB used by wal_buffers
sort_mem = 51200              # = 50 MB
vacuum_mem = 102400           # = 100 MB
                              # Recommended: 64 MB for 1-2 GB RAM
wal_buffers = 1280            # = 10240 KB = 10 MB
effective_cache_size = 65536  # Choice: 50% of RAM
                              # <=> 1 * 1024 * 1024 * .50 / 8 = 65536
random_page_cost = 2          # make the planner favor index scans
cpu_tuple_cost = 0.042        # got the planner to choose the index

stats_command_string = true
stats_block_level = true
stats_row_level = true

datestyle = 'iso, dmy'
------------------------------------------------------------------
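A minimal psql session for the seq-scan vs. index-scan comparison described in the PS might look like the sketch below. Table and column names are taken from the message above; the query is reduced to the threads filter for illustration. EXPLAIN ANALYZE (available in 7.4) reports actual execution times, not just planner estimates:

-- Baseline: let the planner choose its preferred plan (the seq scan on threads).
EXPLAIN ANALYZE
SELECT t.id
FROM threads t
WHERE t.mailbox_id IN (2,3,5,20,21,13,22,23,24,25,26,19)
  AND t.desktop = 2;

-- Discourage sequential scans for this session only, then re-run the same query.
SET enable_seqscan = false;
EXPLAIN ANALYZE
SELECT t.id
FROM threads t
WHERE t.mailbox_id IN (2,3,5,20,21,13,22,23,24,25,26,19)
  AND t.desktop = 2;

-- Restore the default before further testing.
SET enable_seqscan = true;

If both variants take about the same time, as reported in the PS, the planner's cost estimates agree with reality and forcing the index is not the fix.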
Christophe Musielak wrote:
> Hi,
>
> We've been working with Postgres for 4 years on an e-CRM web-based
> application that deals with emails.

A few immediate observations:

1. Try not to reply to an existing message when posting a new question; it can mess up threaded views.
2. The performance list is probably the best place for this.

> Total running time, Postgres:   10-11 s
> Total running time, SQL Server: < 1 s
>
> We must admit that we're quite disappointed with the result and
> currently think there is something wrong in the way we're working with
> Postgres.

Hmm - it's not common to get that sort of difference.

> PS: Concerning the seq_scan shown in the explain plan: as there are
> indexes on the tables, as shown below, we think Postgres is choosing
> seq_scan over index_scan to improve performance. We tried to force
> index_scan with SET enable_seqscan = false, but there was no
> improvement in the total duration of the query.

Which suggests PG is getting it right.

> explain SELECT t.*, substring(t.subject::text, 0, 30) AS subject,
>        e.email,
>        o.lastname AS "operator"
> FROM threads t JOIN emails e ON (e.id = t.client_email_id)
> JOIN operators o ON (o.id = t.operator_id)
> WHERE t.mailbox_id IN ( 2,3,5,20,21,13,22,23,24,25,26,19 )
>   AND t.desktop = 2
> ORDER BY t.date_last asc, t.id asc
> LIMIT 16 OFFSET 0;

OK, so you're filtering mostly on desktop and mailbox_id.

Could you post the output of EXPLAIN ANALYSE rather than just EXPLAIN? That'll show us not just what PG thinks should happen, but what actually did happen too. From the explain, the greatest cost seems to be the join of threads to emails, where it's using the index scan on emails_pkey.

>                             QUERY PLAN
> ------------------------------------------------------------------
> Limit  (cost=93539.02..93539.06 rows=16 width=619)
>   ->  Sort  (cost=93539.02..93745.69 rows=82669 width=619)
>         Sort Key: t.date_last, t.id
>         ->  Hash Join  (cost=26186.03..73789.00 rows=82669 width=619)
>               Hash Cond: ("outer".operator_id = "inner".id)
>               ->  Merge Join  (cost=26183.07..69487.26 rows=82668 width=609)
>                     Merge Cond: ("outer".id = "inner".client_email_id)
>                     ->  Index Scan using emails_pkey on emails e
>                         (cost=0.00..37782.57 rows=654511 width=30)
>                     ->  Sort  (cost=26183.07..26389.74 rows=82667 width=583)
>                           Sort Key: t.client_email_id
>                           ->  Seq Scan on threads t
>                               (cost=0.00..19431.23 rows=82667 width=583)
>                                 Filter: (((mailbox_id = 2) OR (mailbox_id = 3)
>                                 OR (mailbox_id = 5) OR (mailbox_id = 20)
>                                 OR (mailbox_id = 21) OR (mailbox_id = 13)
>                                 OR (mailbox_id = 22) OR (mailbox_id = 23)
>                                 OR (mailbox_id = 24) OR (mailbox_id = 25)
>                                 OR (mailbox_id = 26) OR (mailbox_id = 19))
>                                 AND (desktop = 2))
>               ->  Hash  (cost=2.85..2.85 rows=44 width=14)
>                     ->  Seq Scan on operators o
>                         (cost=0.00..2.85 rows=44 width=14)
> ------------------------------------------------------------------
>
> Indexes on threads:
>
> threads_pkey PRIMARY KEY (id)
> CREATE INDEX threads_date_last_idx ON threads (date_last);
> CREATE INDEX threads_desktop_idx ON threads (desktop);
> CREATE INDEX threads_operator_id_idx ON threads (operator_id);
> CREATE UNIQUE INDEX threads_pri_dlast_id_idx ON threads (priority,
> date_last, id);

None of these indexes are going to be much use here - you're mostly filtering on mailbox_id. Unless "desktop" is very selective, a sequential scan seems sensible.

I'd recommend running an EXPLAIN ANALYSE and posting that to the performance list. You might want to include your postgresql.conf settings along with some more details on your hardware.

--
Richard Huxton
Archonet Ltd
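One way to test the observation that the existing indexes don't match the filter is the sketch below. The composite index is not something Richard literally proposed; its name is invented here, and whether it wins depends on how selective desktop and mailbox_id actually are:

-- Hypothetical composite index matching the WHERE clause:
-- equality column (desktop) first, then the IN-list column (mailbox_id).
CREATE INDEX threads_desktop_mailbox_idx ON threads (desktop, mailbox_id);

-- Refresh statistics so the planner can cost the new index realistically.
ANALYZE threads;

-- Then re-run the problem query under EXPLAIN ANALYZE and compare actual
-- times; if the planner still prefers the seq scan, its estimates may
-- simply be right for this data distribution.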
Just to cover the obvious solution... How often do you run "vacuum analyze" on this database? If the data has been around for a long time, have you considered using CLUSTER to physically reorder the tables?

Christophe Musielak <cmusielak@akio-software.com> wrote:
> Hi,
>
> We've been working with Postgres for 4 years on an e-CRM web-based
> application that deals with emails.
>
> As our customer base keeps growing (more than one million rows in
> several tables) we started to face performance issues with 7.2, and a
> month ago we decided to upgrade to 7.4.
>
> The upgrade process was OK, but we still face performance problems.
>
> We decided to import the database into SQL Server to run some
> benchmarks between the two, with the same hardware and the same data.
>
> Please find below the main query we're testing, the explain plan, and
> the list of indexes.
>
> Here is the result:
>
> Total running time, Postgres:   10-11 s
> Total running time, SQL Server: < 1 s
>
> We must admit that we're quite disappointed with the result and
> currently think there is something wrong in the way we're working with
> Postgres.
>
> What do you think? Is there something wrong with the configuration?
> With the way we're building the query?
>
> Thanks a lot for your answers.
>
> Christophe Musielak
> Christine Bruzaud
> Akio Software
>
> PS: Concerning the seq_scan shown in the explain plan: as there are
> indexes on the tables, as shown below, we think Postgres is choosing
> seq_scan over index_scan to improve performance. We tried to force
> index_scan with SET enable_seqscan = false, but there was no
> improvement in the total duration of the query.
>
> [snip query, explain plan, index list, and postgresql.conf settings,
> quoted in full above]

--
Bill Moran
Potential Technologies
http://www.potentialtech.com
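In 7.4 syntax, the maintenance Bill is describing might look like the sketch below. The choice of index for CLUSTER is purely illustrative; note that CLUSTER takes an exclusive lock and rewrites the table, so it is effectively an offline operation:

-- Reclaim dead row versions and refresh planner statistics.
VACUUM ANALYZE threads;
VACUUM ANALYZE emails;

-- Physically reorder threads by one of its indexes
-- (7.4 syntax: CLUSTER indexname ON tablename).
CLUSTER threads_date_last_idx ON threads;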
Christophe Musielak <cmusielak@akio-software.com> writes:
>                           ->  Seq Scan on threads t
>                               (cost=0.00..19431.23 rows=82667 width=583)
>                                 Filter: (((mailbox_id = 2) OR (mailbox_id = 3)
>                                 OR (mailbox_id = 5) OR (mailbox_id = 20)
>                                 OR (mailbox_id = 21) OR (mailbox_id = 13)
>                                 OR (mailbox_id = 22) OR (mailbox_id = 23)
>                                 OR (mailbox_id = 24) OR (mailbox_id = 25)
>                                 OR (mailbox_id = 26) OR (mailbox_id = 19))
>                                 AND (desktop = 2))

Although it's hard to be sure without EXPLAIN ANALYZE output, it seems like an index on threads(mailbox_id) might help here. You have a bunch of other indexes on threads that are useless for this query :-( ... I hope they are some good for other queries, else they're just slowing down updates.

Also a larger value of sort_mem might help. At least by the planner's estimates, the other big hit is the use of a merge join between emails and threads. I think a hash join would work better, but it's not going to pick that unless the inner relation will fit in sort_mem.

			regards, tom lane
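A minimal sketch of the two changes Tom suggests; the index name is invented here, and the sort_mem value (in KB, applied per sort or hash operation) is only an example to experiment with:

-- Index on the column the query actually filters by.
CREATE INDEX threads_mailbox_id_idx ON threads (mailbox_id);
ANALYZE threads;

-- Raise per-sort memory for this session; enough room may let the planner
-- switch the merge join to a hash join with threads as the inner relation.
SET sort_mem = 102400;   -- 100 MB, up from the 50 MB set in postgresql.conf

-- Then re-run the query under EXPLAIN ANALYZE and compare the join strategy.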