Thread: Premature timeout on git.postgresql.org?

Premature timeout on git.postgresql.org?

From
Tom Lane
Date:
Trying to check out from the git mirror on a machine with a
slow internet connection, I repeatably get

$ time git clone https://git.postgresql.org/git/postgresql.git pgsql
Cloning into 'pgsql'...
remote: Enumerating objects: 14523, done.
remote: Counting objects: 100% (14523/14523), done.
remote: Compressing objects: 100% (9482/9482), done.
error: RPC failed; curl 92 HTTP/2 stream 3 was not closed cleanly before end of the underlying stream
fetch-pack: unexpected disconnect while reading sideband packet
fatal: early EOF
fatal: fetch-pack: invalid index-pack output
      364.36 real         0.02 user         0.08 sys

It gets almost but not quite to the end of the data-transfer stage
before failing.  I think I've seen this symptom before, and IIRC
we traced it to a poorly-chosen timeout somewhere in the PG web
infrastructure.  I can't find a record of that in the archives,
though; the closest thing is [1] which is a somewhat different
symptom.

            regards, tom lane

[1] https://www.postgresql.org/message-id/flat/CAGuFTBUrqO8kK_J2URKpWwathQSunZU6VtqMSrJnEt8F0qZaTg%40mail.gmail.com



Re: Premature timeout on git.postgresql.org?

From
Bruce Momjian
Date:
On Sat, Feb 19, 2022 at 01:36:58PM -0500, Tom Lane wrote:
> Trying to check out from the git mirror on a machine with a
> slow internet connection, I repeatably get
> 
> $ time git clone https://git.postgresql.org/git/postgresql.git pgsql
> Cloning into 'pgsql'...
> remote: Enumerating objects: 14523, done.
> remote: Counting objects: 100% (14523/14523), done.
> remote: Compressing objects: 100% (9482/9482), done.
> error: RPC failed; curl 92 HTTP/2 stream 3 was not closed cleanly before end of the underlying stream
> fetch-pack: unexpected disconnect while reading sideband packet
> fatal: early EOF
> fatal: fetch-pack: invalid index-pack output
>       364.36 real         0.02 user         0.08 sys
> 
> It gets almost but not quite to the end of the data-transfer stage
> before failing.  I think I've seen this symptom before, and IIRC
> we traced it to a poorly-chosen timeout somewhere in the PG web
> infrastructure.  I can't find a record of that in the archives,
> though; the closest thing is [1] which is a somewhat different
> symptom.

I did a git clone from https://git.postgresql.org/git/postgresql.git on
Friday and just now and was only seeing 1.15 MiB/s, which seemed low on
my 100Mb connection.

-- 
  Bruce Momjian  <bruce@momjian.us>        https://momjian.us
  EDB                                      https://enterprisedb.com

  If only the physical world exists, free will is an illusion.




Re: Premature timeout on git.postgresql.org?

From
Magnus Hagander
Date:
On Sun, Feb 20, 2022 at 3:52 PM Bruce Momjian <bruce@momjian.us> wrote:
>
> On Sat, Feb 19, 2022 at 01:36:58PM -0500, Tom Lane wrote:
> > Trying to check out from the git mirror on a machine with a
> > slow internet connection, I repeatably get
> >
> > $ time git clone https://git.postgresql.org/git/postgresql.git pgsql
> > Cloning into 'pgsql'...
> > remote: Enumerating objects: 14523, done.
> > remote: Counting objects: 100% (14523/14523), done.
> > remote: Compressing objects: 100% (9482/9482), done.
> > error: RPC failed; curl 92 HTTP/2 stream 3 was not closed cleanly before end of the underlying stream
> > fetch-pack: unexpected disconnect while reading sideband packet
> > fatal: early EOF
> > fatal: fetch-pack: invalid index-pack output
> >       364.36 real         0.02 user         0.08 sys
> >
> > It gets almost but not quite to the end of the data-transfer stage
> > before failing.  I think I've seen this symptom before, and IIRC
> > we traced it to a poorly-chosen timeout somewhere in the PG web
> > infrastructure.  I can't find a record of that in the archives,
> > though; the closest thing is [1] which is a somewhat different
> > symptom.

Hmm. That's interesting. Do you see it repeatable now? Or is it
possible it was a temporary network flakiness?

How long (ish) does it take from the start of the clone until it fails?


> I did a git clone from https://git.postgresql.org/git/postgresql.git on
> Friday and just now and was only seeing 1.15 MiB/s, which seemed low on
> my 100Mb connection.

I get about 50MB/500Mbit per second when I clone from it. The git
server is physically located in Amsterdam, maybe you have a limited
transit performance there (or it has to you).

Can you, as well, try again now and see if you see the same level of
performance now?

-- 
 Magnus Hagander
 Me: https://www.hagander.net/
 Work: https://www.redpill-linpro.com/



Re: Premature timeout on git.postgresql.org?

From
Adrian Klaver
Date:
On 2/20/22 07:07, Magnus Hagander wrote:
> On Sun, Feb 20, 2022 at 3:52 PM Bruce Momjian <bruce@momjian.us> wrote:
>>

> 
> I get about 50MB/500Mbit per second when I clone from it. The git
> server is physically located in Amsterdam, maybe you have a limited
> transit performance there (or it has to you).
> 
> Can you, as well, try again now and see if you see the same level of
> performance now?
> 

For a point of reference I just cloned on a Linode instance in Fremont 
CA. For the Postgres repo I got 2.25 MiB/s. Trying some other project 
repos I got 25-27 MiB/s


-- 
Adrian Klaver
adrian.klaver@aklaver.com



Re: Premature timeout on git.postgresql.org?

From
Tom Lane
Date:
Bruce Momjian <bruce@momjian.us> writes:
> On Sun, Feb 20, 2022 at 08:31:23AM -0800, Adrian Klaver wrote:
>> For a point of reference I just cloned on a Linode instance in Fremont CA.
>> For the Postgres repo I got 2.25 MiB/s. Trying some other project repos I
>> got 25-27 MiB/s

> My transfer number fluctuated from 800kB to 2.2MB.

Yeah, I'd supposed that the el-cheapo ethernet dongle I'm using on
florican's host was the source of the crummy performance, but it is not,
or at least not all of it: there is something rotten between here and
Amsterdam.  Trying it now on my primary workstation, I can clone from
the github mirror at a more or less respectable speed:

$ time git clone https://github.com/postgres/postgres.git pgsql-github
Cloning into 'pgsql-github'...
remote: Enumerating objects: 879421, done.
remote: Counting objects: 100% (240/240), done.
remote: Compressing objects: 100% (185/185), done.
remote: Total 879421 (delta 111), reused 82 (delta 55), pack-reused 879181
Receiving objects: 100% (879421/879421), 616.24 MiB | 33.59 MiB/s, done.
Resolving deltas: 100% (721439/721439), done.

real    0m36.548s
user    0m59.798s
sys     0m2.830s

but not so much from our official repo:

$ time git clone https://git.postgresql.org/git/postgresql.git pgsql-pg
Cloning into 'pgsql-pg'...
remote: Enumerating objects: 14782, done.
remote: Counting objects: 100% (14782/14782), done.
remote: Compressing objects: 100% (9581/9581), done.
remote: Total 879421 (delta 9965), reused 6933 (delta 5125), pack-reused 864639
Receiving objects: 100% (879421/879421), 289.26 MiB | 1.40 MiB/s, done.
Resolving deltas: 100% (754313/754313), done.

real    3m48.432s
user    0m54.379s
sys     0m4.514s

The 1.4MB/s average seen here hides a very variable speed: like
Bruce, I saw a speed around 700/800 kB/s to start, and then it
gradually ramped up to something over 2MB/s.

Now in fairness, the ping time to github.com from here is circa 10ms
while the ping time to gothos.postgresql.org is circa 100ms, so
I'd not really expect equivalent performance ... but this seems
seriously bad.

Trying it again on the machine where I originally saw the problem,
cloning from github works:

$ time git clone https://github.com/postgres/postgres.git pgsql-github
Cloning into 'pgsql-github'...
remote: Enumerating objects: 879421, done.
remote: Counting objects: 100% (240/240), done.
remote: Compressing objects: 100% (129/129), done.
remote: Total 879421 (delta 111), reused 235 (delta 111), pack-reused 879181
Receiving objects: 100% (879421/879421), 616.37 MiB | 4.00 MiB/s, done.
Resolving deltas: 100% (721488/721488), done.
Updating files: 100% (6058/6058), done.
      318.52 real       332.96 user        26.12 sys

It's slow, but I expected that.  I then tried twice to clone from
git.postgresql.org.  The first try succeeded:

$ time git clone https://git.postgresql.org/git/postgresql.git pgsql-pg
Cloning into 'pgsql-pg'...
remote: Enumerating objects: 14782, done.
remote: Counting objects: 100% (14782/14782), done.
remote: Compressing objects: 100% (9581/9581), done.
remote: Total 879421 (delta 9966), reused 6933 (delta 5125), pack-reused 864639
Receiving objects: 100% (879421/879421), 289.20 MiB | 862.00 KiB/s, done.
Resolving deltas: 100% (754314/754314), done.
Updating files: 100% (6058/6058), done.
      502.96 real       307.15 user        15.66 sys

The second try got the same results as yesterday:

$ time git clone https://git.postgresql.org/git/postgresql.git pgsql-pg
Cloning into 'pgsql-pg'...
remote: Enumerating objects: 14782, done.
remote: Counting objects: 100% (14782/14782), done.
remote: Compressing objects: 100% (9581/9581), done.
[ I hit return a few extra times to capture state, last printout was this: ]
Receiving objects:  99% (876395/879421), 285.62 MiB | 1.19 MiB/s
error: RPC failed; curl 92 HTTP/2 stream 3 was not closed cleanly before end of the underlying stream
fetch-pack: unexpected disconnect while reading sideband packet
fatal: early EOF
fatal: fetch-pack: invalid index-pack output
      323.67 real         0.00 user         0.11 sys

(The instantaneous transfer speed varied from circa 1.3MB/s to
as low as a couple hundred kB/s.)

I doubt there's much we can do at the project level about the
poor transfer speed of the transatlantic link.  However, the
fact that it's failing outright looks to me to be due to a
timeout of 5 or possibly 6 minutes that's breaking the "sideband"
connection.  That probably is within our control.

            regards, tom lane