Thread: Request for download stats on release

Request for download stats on release

From
Josh Berkus
Date:
Folks,

Is there a way we can get word to all the mirrors, asking them to compile
download counts for the week after 8.0 final release?  I'd really like to
have some stats to give reporters, and we had trouble collecting these last
time, so I'm asking way in advance this time.

--
-Josh Berkus
 "A developer of Very Little Brain"
 Aglio Database Solutions
 San Francisco


Re: Request for download stats on release

From
"Dave Page"
Date:

> -----Original Message-----
> From: pgsql-www-owner@postgresql.org
> [mailto:pgsql-www-owner@postgresql.org] On Behalf Of Josh Berkus
> Sent: 10 August 2004 19:05
> To: PostgreSQL www
> Subject: [pgsql-www] Request for download stats on release
>
> Folks,
>
> Is there a way we can get word to all the mirrors, asking
> them to compile download counts for the week after 8.0 final
> release?  I'd really like to have some stats to give
> reporters, and we had trouble collecting these last time, so
> I'm asking way in advance this time.

Hi Josh,

Last time round we did tell them that this wasn't going to become a
regular request. We can certainly email them, however bear in mind that
it tends to be the smaller mirrors that help, so the stats are not going
to be that accurate.

Regards, Dave

Re: Request for download stats on release

From
"Dave Page"
Date:

> -----Original Message-----
> From: Josh Berkus [mailto:josh@agliodbs.com]
> Sent: 11 August 2004 17:07
> To: Dave Page; PostgreSQL www
> Subject: Re: [pgsql-www] Request for download stats on release
>
> Dave,
>
> > Last time round we did tell them that this wasn't going to become a
> > regular request.
>
> Yes, but last time round we also didn't get the figures; many
> mirrors never responded, and others did for the week *after*
> release week.
>
> > We can certainly email them, however bear in mind that it tends to be
> > the smaller mirrors that help, so the stats are not going to be that
> > accurate.
>
> Oh.  Then there isn't any way to get a reasonably accurate
> download count?

Not that I know of. The larger sites (the likes of sunsite,
mirror.ac.uk, heanet.ie etc.) are unlikely to start grepping their ftp
logs for us. The smaller ones, run by community members, are far more
likely to, but then how representative are their stats?

I think the only way you'll get anything more than 'it's at least x' is
by hiding the downloads behind a php script that logs each download, no
matter which mirror it's from - and even then you'll still miss those
users that use ftp the 'proper way'.

Regards, Dave.

Re: Request for download stats on release

From
Josh Berkus
Date:
Dave,

> Last time round we did tell them that this wasn't going to become a
> regular request.

Yes, but last time round we also didn't get the figures; many mirrors never
responded, and others did for the week *after* release week.

> We can certainly email them, however bear in mind that
> it tends to be the smaller mirrors that help, so the stats are not going
> to be that accurate.

Oh.  Then there isn't any way to get a reasonably accurate download count?

--
Josh Berkus
Aglio Database Solutions
San Francisco

Re: Request for download stats on release

From
Peter Eisentraut
Date:
On Wednesday, 11 August 2004 18:06, Josh Berkus wrote:
> Oh.  Then there isn't any way to get a reasonably accurate download count?

These download counts would be completely random anyway.  Most people get
their dose of PostgreSQL via an OS distributor.  (Just look how often someone
asks a question about PostgreSQL 7.2.1; these are the people stuck on Debian
"stable".)  And even they are unable to count the downloads, because they
have mirrors, and people mirror the mirrors, and put mirrors on their
laptops.

Additionally, PostgreSQL is in or near the default installation in many
distributions, so even if you knew the install count, you don't know anything
about who actually uses it.

If you want some amusing (nonabsolute) usage statistics, I recommend
<http://popcon.debian.org/>.

--
Peter Eisentraut
http://developer.postgresql.org/~petere/

Re: Request for download stats on release

From
Tom Lane
Date:
Peter Eisentraut <peter_e@gmx.net> writes:
> These download counts would be completely random anyway.  Most people get
> their dose of PostgreSQL via an OS distributor.

Sure, but the download counts just following a new release would give us
a reading on enthusiastic real users --- ie, people who are interested
enough to download a new major release as soon as it's available.

However, this is probably moot given the point that we won't get
cooperation from the larger mirror sites.

            regards, tom lane

Re: Request for download stats on release

From
Joe Conway
Date:
Tom Lane wrote:
> Sure, but the download counts just following a new release would give us
> a reading on enthusiastic real users --- ie, people who are interested
> enough to download a new major release as soon as it's available.
>
> However, this is probably moot given the point that we won't get
> cooperation from the larger mirror sites.

How difficult would it be to use a page similar to sourceforge's
download page -- i.e. display and pick from available files on the main
site, and then redirect the actual download to a selected mirror?
Wouldn't that allow centralized collection of download stats?
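
As a rough illustration of the redirect idea, a central script could count each request and then hand the client off to a mirror. The following is only a minimal sketch under stated assumptions: the mirror URLs, the round-robin selection, and the names `redirect_download` and `app` are hypothetical and do not describe anything the project actually runs.

```python
import itertools
from collections import Counter

# Hypothetical mirror base URLs; the thread does not list real ones.
MIRRORS = [
    "ftp://mirror-a.example.org/pub/postgresql",
    "ftp://mirror-b.example.org/pub/postgresql",
]

download_counts = Counter()                # requested path -> number of requests
_mirror_cycle = itertools.cycle(MIRRORS)   # assumption: plain round-robin choice


def redirect_download(path):
    """Record one request for `path` and return the mirror URL to send it to."""
    download_counts[path] += 1
    return next(_mirror_cycle) + path


def app(environ, start_response):
    """Minimal WSGI callable: count the request, then answer with a 302."""
    location = redirect_download(environ.get("PATH_INFO", "/"))
    start_response("302 Found", [("Location", location)])
    return [b""]
```

Because every request passes through the central script first, the counts no longer depend on mirror operators sending in their logs. Users who go straight to a mirror over ftp would still be missed.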

Joe

Re: Request for download stats on release

From
"Marc G. Fournier"
Date:
On Wed, 11 Aug 2004, Tom Lane wrote:

> Peter Eisentraut <peter_e@gmx.net> writes:
>> These download counts would be completely random anyway.  Most people get
>> their dose of PostgreSQL via an OS distributor.
>
> Sure, but the download counts just following a new release would give us
> a reading on enthusiastic real users --- ie, people who are interested
> enough to download a new major release as soon as it's available.
>
> However, this is probably moot given the point that we won't get
> cooperation from the larger mirror sites.

Why not add something like what pine does?  When you start up a new
release of pine for the first time, there is an option to send an email
letting them count # of users ... it's totally optional ...

Maybe have an extension on initdb so that if run from a terminal, it
prompts the installer for whether they want to send in a simple email
letting us know it's being used ... ?

Or something like that ... something both visible and that defaults to
'opt-out', not opt-in ...

----
Marc G. Fournier           Hub.Org Networking Services (http://www.hub.org)
Email: scrappy@hub.org           Yahoo!: yscrappy              ICQ: 7615664

Re: Request for download stats on release

From
Tom Lane
Date:
"Marc G. Fournier" <scrappy@postgresql.org> writes:
> Maybe have an extension on initdb so that if run from a terminal, it
> prompts the installer for whether they want to send in a simple email
> letting us know it's being used ... ?

I think this is a bad idea.  People will find it invasive of their
privacy, and even with it we'd still have no idea of real usage counts,
considering how many people would opt out --- or never see the prompt
in the first place, because they're using an RPM or .deb installation
that's going to run initdb non-interactively.

We used to have a request to send email in the message that got printed
by 'make install'.  How many people responded to that?  Darn few that
I can recall.

            regards, tom lane

Re: Request for download stats on release

From
Bruce Momjian
Date:
The issue is how hard is it to get the information, and how useful is it
to us.  It is pretty hard to get, and if we knew a number, what would we
do with it that would help PostgreSQL?  Not much, I think.

---------------------------------------------------------------------------

Tom Lane wrote:
> "Marc G. Fournier" <scrappy@postgresql.org> writes:
> > Maybe have an extension on initdb so that if run from a terminal, it
> > prompts the installer for whether they want to send in a simple email
> > letting us know it's being used ... ?
>
> I think this is a bad idea.  People will find it invasive of their
> privacy, and even with it we'd still have no idea of real usage counts,
> considering how many people would opt out --- or never see the prompt
> in the first place, because they're using an RPM or .deb installation
> that's going to run initdb non-interactively.
>
> We used to have a request to send email in the message that got printed
> by 'make install'.  How many people responded to that?  Darn few that
> I can recall.
>
>             regards, tom lane
>
> ---------------------------(end of broadcast)---------------------------
> TIP 9: the planner will ignore your desire to choose an index scan if your
>       joining column's datatypes do not match
>

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073

Re: Request for download stats on release

From
Josh Berkus
Date:
Bruce,

> The issue is how hard is it to get the information, and how useful is it
> to us.  It is pretty hard to get, and if we knew a number, what would we
> do with it that would help PostgreSQL?  Not much, I think.

Well, it's a moot point, I think, since we can't get it.  But this is
something which reporters ask me for all the time.

--
Josh Berkus
Aglio Database Solutions
San Francisco

Re: Request for download stats on release

From
Tom Lane
Date:
Josh Berkus <josh@agliodbs.com> writes:
> Bruce,
>> The issue is how hard is it to get the information, and how useful is it
>> to us.  It is pretty hard to get, and if we knew a number, what would we
>> do with it that would help PostgreSQL?  Not much, I think.

> Well, it's a moot point, I think, since we can't get it.  But this is
> something which reporters ask me for all the time.

Well, sure they ask that, because they don't understand what open source
is.  View it as an opportunity to educate them.   Commercial software
companies know exactly how many licenses they've sold, so (discounting
pirates) they have a good idea how many users are out there.  The
Postgres project does not control distribution of our software and so
we have no reasonable way to measure the number of users.   From our
perspective this is not a problem.

            regards, tom lane

Re: Request for download stats on release

From
Peter Eisentraut
Date:
Josh Berkus wrote:
> Well, it's a moot point, I think, since we can't get it.  But this is
> something which reporters ask me for all the time.

You might remember, a great(?) while ago, the MySQL folks kept
repeating that they had 4 million downloads, period.  Apparently,
people just stopped downloading MySQL at some point, I guess.  With a
bit of research and imagination you can compute that, even back then,
PostgreSQL must have had a lot more than 4 million downloads in total.
So in the end that was just a cute figure that told us nothing.

--
Peter Eisentraut
http://developer.postgresql.org/~petere/


Re: Request for download stats on release

From
Josh Berkus
Date:
Peter,

> I guess you could compare the 8.0 download counts to the 7.4 and earlier
> counts on a few cooperating sites to get a growth curve.  That should
> tell you more than absolute numbers anyway.

Yeah, I'd like to.  Unfortunately, we don't have numbers for 7.4.

--
Josh Berkus
Aglio Database Solutions
San Francisco

Re: Request for download stats on release

From
Peter Eisentraut
Date:
Tom Lane wrote:
> Sure, but the download counts just following a new release would give
> us a reading on enthusiastic real users --- ie, people who are
> interested enough to download a new major release as soon as it's
> available.

I guess you could compare the 8.0 download counts to the 7.4 and earlier
counts on a few cooperating sites to get a growth curve.  That should
tell you more than absolute numbers anyway.

--
Peter Eisentraut
http://developer.postgresql.org/~petere/


Re: Request for download stats on release

From
"Marc G. Fournier"
Date:
On Thu, 12 Aug 2004, Josh Berkus wrote:

> Peter,
>
>> I guess you could compare the 8.0 download counts to the 7.4 and earlier
>> counts on a few cooperating sites to get a growth curve.  That should
>> tell you more than absolute numbers anyway.
>
> Yeah, I'd like to.  Unfortunately, we don't have numbers for 7.4.

Does this help:

     http://developer.postgresql.org/xferstats

Note that I have xferlog back to June 4th of last year ... 7.4 was
released Nov 23rd of last year ...

----
Marc G. Fournier           Hub.Org Networking Services (http://www.hub.org)
Email: scrappy@hub.org           Yahoo!: yscrappy              ICQ: 7615664

Re: Request for download stats on release

From
Josh Berkus
Date:
Marc,

>      http://developer.postgresql.org/xferstats

Not really.  What we'd need is to get stats on one or two of those big
mirrors.

--
-Josh Berkus
 "A developer of Very Little Brain"
 Aglio Database Solutions
 San Francisco


Re: Request for download stats on release

From
"Dave Page"
Date:

> -----Original Message-----
> From: Josh Berkus [mailto:josh@agliodbs.com]
> Sent: 12 August 2004 20:58
> To: Dave Page
> Cc: PostgreSQL www
> Subject: Re: [pgsql-www] Request for download stats on release
>
> Dave,
>
> > ?? I sent you numbers for 7.4....
>
> Yes, but per previous discussion:
> 1) The numbers for 7.4 omitted the largest traffic mirrors;
> 2) For several of the smaller mirrors, those numbers were for
> the week *after* the week of release, also limiting the data.

Yeah, but I think what Peter was suggesting was that if you got the same
figure from the same mirrors for 8.0, then at least you could say 'well,
it looks like a 175% increase over last time from these 10 mirrors,
let's assume the rest are the same'
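
The comparison described here is simple arithmetic over the mirrors that reported for both releases. A minimal sketch follows; the function name and the idea of keying counts by mirror name are assumptions made for illustration.

```python
def percent_increase(old_counts, new_counts):
    """Percent change in total downloads, using only mirrors present in both.

    Both arguments map a mirror name to its download count for one release.
    Mirrors that reported for only one release are ignored, so the figure
    compares like with like.
    """
    shared = old_counts.keys() & new_counts.keys()
    old_total = sum(old_counts[m] for m in shared)
    new_total = sum(new_counts[m] for m in shared)
    if old_total == 0:
        raise ValueError("no baseline downloads on the shared mirrors")
    return 100.0 * (new_total - old_total) / old_total
```

With made-up counts, a set of shared mirrors going from 100 downloads in total to 275 would give a 175% increase. The result says nothing about mirrors outside the shared set.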

/D

Re: Request for download stats on release

From
Josh Berkus
Date:
Dave,

> ?? I sent you numbers for 7.4....

Yes, but per previous discussion:
1) The numbers for 7.4 omitted the largest traffic mirrors;
2) For several of the smaller mirrors, those numbers were for the week *after*
the week of release, also limiting the data.

--
-Josh Berkus
 "A developer of Very Little Brain"
 Aglio Database Solutions
 San Francisco


Re: Request for download stats on release

From
"Dave Page"
Date:

> -----Original Message-----
> From: pgsql-www-owner@postgresql.org
> [mailto:pgsql-www-owner@postgresql.org] On Behalf Of Josh Berkus
> Sent: 12 August 2004 18:03
> To: Peter Eisentraut
> Cc: PostgreSQL www
> Subject: Re: [pgsql-www] Request for download stats on release
>
> Peter,
>
> > I guess you could compare the 8.0 download counts to the 7.4 and
> > earlier counts on a few cooperating sites to get a growth curve.  That
> > should tell you more than absolute numbers anyway.
>
> Yeah, I'd like to.  Unfortunately, we don't have numbers for 7.4.

?? I sent you numbers for 7.4....

/D

Re: Request for download stats on release

From
Josh Berkus
Date:
Dave,

> Yeah, but I think what Peter was suggesting was that if you got the same
> figure from the same mirrors for 8.0, then at least you could say 'well,
> it looks like a 175% increase over last time from these 10 mirrors,
> let's assume the rest are the same'

Yeah, that would be good.  Do you think the mirrors would go for it?

--
-Josh Berkus
 "A developer of Very Little Brain"
 Aglio Database Solutions
 San Francisco


Re: Request for download stats on release

From
"Dave Page"
Date:

> -----Original Message-----
> From: Josh Berkus [mailto:josh@agliodbs.com]
> Sent: 12 August 2004 22:06
> To: Dave Page
> Cc: PostgreSQL www
> Subject: Re: [pgsql-www] Request for download stats on release
>
> Dave,
>
> > Yeah, but I think what Peter was suggesting was that if you got the
> > same figure from the same mirrors for 8.0, then at least you could say
> > 'well, it looks like a 175% increase over last time from these 10
> > mirrors, let's assume the rest are the same'
>
> Yeah, that would be good.  Do you think the mirrors would go for it?

We can only ask. I'll see if I can find the script I got them to run
last time round tomorrow.

Regards, Dave.

Re: Request for download stats on release

From
"Marc G. Fournier"
Date:
On Thu, 12 Aug 2004, Josh Berkus wrote:

> Dave,
>
>> ?? I sent you numbers for 7.4....
>
> Yes, but per previous discussion:
> 1) The numbers for 7.4 omitted the largest traffic mirrors;
> 2) For several of the smaller mirrors, those numbers were for the week *after*
> the week of release, also limiting the data.

/var/log/xferlog on developer.postgresql.org contains everything from June
4th of last year until ftp.postgresql.org was redirected to a different VM
(I can give you those too) ... that is the main ftp site only, but again,
the idea is to get an idea of % increase from release to release ... you
won't get an exact #, but at least you can start to get approx values ...
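
For per-release counts out of such a log, something along these lines might do. It is a sketch that assumes the usual wu-ftpd xferlog layout, where the ninth whitespace-separated field is the file path and the twelfth is the transfer direction (`o` for outgoing). The function name and the suffix argument are hypothetical, and paths containing spaces are not handled.

```python
from collections import Counter


def count_downloads(lines, wanted_suffixes):
    """Count outgoing transfers per file for paths ending in a wanted suffix.

    `lines` is any iterable of xferlog lines; `wanted_suffixes` is a tuple of
    file-name endings such as ("postgresql-7.4.tar.gz",).
    """
    counts = Counter()
    for line in lines:
        fields = line.split()
        if len(fields) < 12:
            continue                      # skip short or malformed lines
        filename, direction = fields[8], fields[11]
        if direction != "o":
            continue                      # only count downloads, not uploads
        if filename.endswith(wanted_suffixes):
            counts[filename] += 1         # .md5 files do not match the suffix
    return counts
```

Running it once per release over the same log, for example with `open("/var/log/xferlog")` as the `lines` argument, gives two totals that can be compared as a percentage rather than read as absolute user numbers.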




----
Marc G. Fournier           Hub.Org Networking Services (http://www.hub.org)
Email: scrappy@hub.org           Yahoo!: yscrappy              ICQ: 7615664

Re: Request for download stats on release

From
"Dave Page"
Date:

> -----Original Message-----
> From: pgsql-www-owner@postgresql.org
> [mailto:pgsql-www-owner@postgresql.org] On Behalf Of Dave Page
> Sent: 12 August 2004 22:59
> To: josh@agliodbs.com
> Cc: PostgreSQL www
> Subject: Re: [pgsql-www] Request for download stats on release
>
> We can only ask. I'll see if I can find the script I got them
> to run last time round tomorrow.
>

OK, found it. This is the email I sent to all the ftp mirrors last time.
Do you want to handle this the same way again? Any changes to the
message (other than the obvious!)?

Regards, Dave


Hello,

You have received this email because your email address is registered as
the contact address for one of the ftp.postgresql.org mirror sites. If
this is incorrect, please contact me and I will amend our records.

As you may be aware, on the 17th of November the PostgreSQL Global
Development Group announced the release of PostgreSQL 7.4. In order to
try to gauge the usage and take-up of the new release, we would
appreciate it if you could find a few minutes to provide us with some
statistics. The data we are looking for is simply the number of
downloads for each of four files:

/pub/source/v7.4/postgresql-7.4.tar.gz
/pub/source/v7.4/postgresql-7.4.tar.bz2
/pub/source/v7.4/postgresql-base-7.4.tar.gz
/pub/source/v7.4/postgresql-base-7.4.tar.bz2

This information may be obtained from standard ftp server transfer logs
on a Unix system using a command similar to the following:

grep /var/spool/ftp/pub/source/v7.4/ /var/log/xferlog | \
  awk '{print $9}' | \
  egrep "postgresql-7.4.tar.bz2|postgresql-7.4.tar.gz|postgresql-base-7.4.tar.bz2|postgresql-base-7.4.tar.gz" | \
  egrep -v "md5" | \
  sort | \
  uniq -c | \
  sort -nr

Note that the directory and logfilename may need to be adjusted for your
particular system. This command should produce output similar to:

 210 /var/spool/ftp/pub/source/v7.4/postgresql-7.4.tar.gz
 111 /var/spool/ftp/pub/source/v7.4/postgresql-7.4.tar.bz2
  20 /var/spool/ftp/pub/source/v7.4/postgresql-base-7.4.tar.bz2
  19 /var/spool/ftp/pub/source/v7.4/postgresql-base-7.4.tar.gz

Please email the output from your system to me, and thank you for
helping support PostgreSQL!

Regards, Dave.

--
Dave Page
PostgreSQL Web Team