git instructions

Started by Magnus Hagander almost 8 years ago, 6 messages
#1 Magnus Hagander
magnus@hagander.net
1 attachment(s)

Since some time back, we have deployed the git server side http handler on
git.postgresql.org, so the instructions currently on the site are incorrect
in saying that git:// is faster than https://. In fact, we have some
reports and testing that https:// can be significantly faster (due to other
reasons). https is also a protocol that's a lot less likely to run into
firewall issues etc.

Based on that, I propose the following patch which both moves https:// up
to being the first-hand choice for protocol, and removes the references to
it being slower.

Thoughts?

--
Magnus Hagander
Me: https://www.hagander.net/ <http://www.hagander.net/>
Work: https://www.redpill-linpro.com/ <http://www.redpill-linpro.com/>

Attachments:

git_instructions.patch (text/x-patch; charset=US-ASCII)
diff --git a/doc/src/sgml/sourcerepo.sgml b/doc/src/sgml/sourcerepo.sgml
index 8729a8450d..c3582ec3e9 100644
--- a/doc/src/sgml/sourcerepo.sgml
+++ b/doc/src/sgml/sourcerepo.sgml
@@ -52,7 +52,7 @@
      To begin using the Git repository, make a clone of the official mirror:
 
 <programlisting>
-git clone git://git.postgresql.org/git/postgresql.git
+git clone https://git.postgresql.org/git/postgresql.git
 </programlisting>
 
      This will copy the full repository to your local machine, so it may take
@@ -62,16 +62,13 @@ git clone git://git.postgresql.org/git/postgresql.git
     </para>
 
     <para>
-     The Git mirror can also be reached via the HTTP protocol, if for example
-     a firewall is blocking access to the Git protocol. Just change the URL
-     prefix to <literal>https</literal>, as in:
+     The Git mirror can also be reached via the older Git protocol. Just change the URL
+     prefix to <literal>git</literal>, as in:
 
 <programlisting>
-git clone https://git.postgresql.org/git/postgresql.git
+git clone git://git.postgresql.org/git/postgresql.git
 </programlisting>
 
-     The HTTP protocol is less efficient than the Git protocol, so it will be
-     slower to use.
     </para>
    </step>
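
For readers who already have a clone made with the git:// URL, the protocol switch proposed in the patch can be applied to an existing remote as well. The following is a minimal sketch, not part of the patch: the helper name `to_https` is hypothetical, and it assumes the remote is called `origin`.

```shell
# Hypothetical helper (not part of the patch): rewrite a git:// URL to
# https://, leaving any other URL unchanged.
to_https() {
    printf '%s\n' "$1" | sed 's|^git://|https://|'
}

# Show what the old clone URL from the documentation would become.
old_url='git://git.postgresql.org/git/postgresql.git'
new_url=$(to_https "$old_url")
echo "$new_url"

# Inside an existing clone, the remote could then be updated with
# (assuming the remote is named "origin"):
#   git remote set-url origin "$new_url"
```

Only the URL scheme changes; the host and repository path stay the same as in the patch.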
 
#2 Chapman Flack
chap@anastigmatix.net
In reply to: Magnus Hagander (#1)
Re: git instructions

On 02/01/2018 10:54 AM, Magnus Hagander wrote:

in saying that git:// is faster than https://. In fact, we have some
reports and testing that https:// can be significantly faster (due to other
reasons).

Can you elaborate on the other reasons? It occurs to me that there
might be cases in which each way works better.

From an experience about 3½ years ago[1], I drew a conclusion
(which may have been erroneous, or may have changed in newer
git releases) that the http protocol handler was not as bidirectional:
the client was less able to negotiate with the server exactly which
objects it already had and which were wanted, leaving the server to
send a needlessly large mass of stuff by default, whereas git-over-ssh
was able to negotiate a tiny minimal pack file to transfer.

My experience was in the context of keeping a local clone that was
shallow (the project repo had enormous history going back aeons,
of no use for me to test small patches on HEAD), and it seemed
possible that the cutoff points for the shallow history were among
the information that did not get effectively conveyed to the server
over http.
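
As an illustration of the kind of shallow clone described above, here is a small sketch that runs entirely offline. It builds a throwaway repository under a temporary directory (the names `upstream` and `shallow` are made up for the example) and clones it with `--depth 1`; it does not reproduce the http behaviour in question, only the shallow setup.

```shell
work=$(mktemp -d)

# Build a small throwaway repository with three commits to clone from.
git init -q "$work/upstream"
for n in 1 2 3; do
    git -C "$work/upstream" \
        -c user.name=example -c user.email=example@example.invalid \
        commit -q --allow-empty -m "commit $n"
done

# --depth is ignored for plain local paths, so use a file:// URL to get
# a real shallow clone containing only the newest commit.
git clone -q --depth 1 "file://$work/upstream" "$work/shallow"

# Count the commits reachable from HEAD in the shallow clone.
depth=$(git -C "$work/shallow" rev-list --count HEAD)
echo "commits in shallow clone: $depth"
```

A later `git fetch --depth 1` inside the shallow clone would keep it at the same cutoff while picking up new commits.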

I have not tested that again lately, or with the postgresql repo.
I guess I could, without much trouble.

-Chap

[1]: https://stackoverflow.com/questions/25954622/a-way-to-keep-a-shallow-git-clone-just-minimally-up-to-date

#3 Magnus Hagander
magnus@hagander.net
In reply to: Chapman Flack (#2)
Re: git instructions

On Thu, Feb 1, 2018 at 5:20 PM, Chapman Flack <chap@anastigmatix.net> wrote:

On 02/01/2018 10:54 AM, Magnus Hagander wrote:

in saying that git:// is faster than https://. In fact, we have some
reports and testing that https:// can be significantly faster (due to other
reasons).

Can you elaborate on the other reasons? It occurs to me that there
might be cases in which each way works better.

Those aren't protocol based, they're deployment based.

For example, for the https we have a fast cache, for the git:// stuff it
reloads things all the time. The git daemon also has no proper way to limit
or handle concurrency, so tends to get hit much harder there where the http
cache can take care of much of that. Things like that, not the protocol
itself.

From an experience about 3½ years ago[1], I drew a conclusion
(which may have been erroneous, or may have changed in newer
git releases) that the http protocol handler was not as bidirectional:
the client was less able to negotiate with the server exactly which
objects it already had and which were wanted, leaving the server to
send a needlessly large mass of stuff by default, whereas git-over-ssh
was able to negotiate a tiny minimal pack file to transfer.

Yes, this used to be the case, and is the reason behind the original
recommendation. It's what they call the "dumb HTTP protocol" in the docs.
This is not the case when you use git-http-backend, which is the change we
made a few months back.
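
One way to tell the two HTTP variants apart is the Content-Type a server returns for the `info/refs` request: per the Git HTTP protocol documentation, a smart server answers a `service=git-upload-pack` request with `application/x-git-upload-pack-advertisement`. The sketch below is only an illustration; the function name `classify_git_http` is hypothetical, and the network call is left as a comment so the block runs offline.

```shell
# Hypothetical helper: classify a Git HTTP server from the Content-Type
# of its info/refs response.
classify_git_http() {
    case "$1" in
        application/x-git-upload-pack-advertisement*) echo "smart" ;;
        *) echo "dumb or unknown" ;;
    esac
}

# Offline demonstration with two sample header values.
classify_git_http "application/x-git-upload-pack-advertisement"
classify_git_http "text/plain; charset=utf-8"

# Against a live server (requires network access and curl):
#   ct=$(curl -s -o /dev/null -w '%{content_type}' \
#       "https://git.postgresql.org/git/postgresql.git/info/refs?service=git-upload-pack")
#   classify_git_http "$ct"
```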

--
Magnus Hagander
Me: https://www.hagander.net/ <http://www.hagander.net/>
Work: https://www.redpill-linpro.com/ <http://www.redpill-linpro.com/>

#4 Stefan Kaltenbrunner
stefan@kaltenbrunner.cc
In reply to: Magnus Hagander (#3)
Re: git instructions

On 02/01/2018 05:35 PM, Magnus Hagander wrote:

On Thu, Feb 1, 2018 at 5:20 PM, Chapman Flack <chap@anastigmatix.net> wrote:

On 02/01/2018 10:54 AM, Magnus Hagander wrote:

in saying that git:// is faster than https://. In fact, we have some
reports and testing that https:// can be significantly faster (due to other
reasons).

Can you elaborate on the other reasons? It occurs to me that there
might be cases in which each way works better.

Those aren't protocol based, they're deployment based.

For example, for the https we have a fast cache, for the git:// stuff it
reloads things all the time. The git daemon also has no proper way to
limit or handle concurrency, so tends to get hit much harder there where
the http cache can take care of much of that. Things like that, not the
protocol itself.

Yeah, from an infrastructure perspective http(s) is much much nicer and
provides us with way more control over limiting (or accelerating) access
- so a big +1 from my side for changing the default and the
recommendation to https.

From an experience about 3½ years ago[1], I drew a conclusion
(which may have been erroneous, or may have changed in newer
git releases) that the http protocol handler was not as bidirectional:
the client was less able to negotiate with the server exactly which
objects it already had and which were wanted, leaving the server to
send a needlessly large mass of stuff by default, whereas git-over-ssh
was able to negotiate a tiny minimal pack file to transfer.

Yes, this used to be the case, and is the reason behind the original
recommendation. It's what they call the "dumb HTTP protocol" in the
docs. This is not the case when you use git-http-backend, which is the
change we made a few months back.

Agreed - wrt the actual patch - not sure it is accurate to classify the
current way as the "older git protocol" as I cannot find that wording
used in the git docs - maybe "classic"? - we also might want to change
the url for http://git-scm.com/ to https://git-scm.com/ while we are
changing that page.

We also should double-check that the docs on
https://git.postgresql.org/adm/help/ match what we have in the main
source docs.

Stefan

#5 David G. Johnston
david.g.johnston@gmail.com
In reply to: Stefan Kaltenbrunner (#4)
Re: git instructions

On Tue, Feb 6, 2018 at 1:46 PM, Stefan Kaltenbrunner <stefan@kaltenbrunner.cc> wrote:

Yes, this used to be the case, and is the reason behind the original
recommendation. It's what they call the "dumb HTTP protocol" in the
docs. This is not the case when you use git-http-backend, which is the
change we made a few months back.

Agreed - wrt the actual patch - not sure it is accurate to classify the
current way as the "older git protocol" as I cannot find that wording
used in the git docs - maybe "classic"?

Neither "older" nor "classic" appeals to me. If you want to convey an
opinion of quality I'd say something like "the more limited git protocol",
otherwise it's just "the git protocol" and we can explain the pros and cons
between the http and git protocols. Noting the improvement of the http
protocol from its former "dumb" version, early on so people have the new
paradigm in their head when they get to the quality comparison, will be
worthwhile for some period of time.

David J.

#6 Magnus Hagander
magnus@hagander.net
In reply to: David G. Johnston (#5)
Re: git instructions

On Tue, Feb 6, 2018 at 9:59 PM, David G. Johnston <david.g.johnston@gmail.com> wrote:

On Tue, Feb 6, 2018 at 1:46 PM, Stefan Kaltenbrunner <stefan@kaltenbrunner.cc> wrote:

Yes, this used to be the case, and is the reason behind the original
recommendation. It's what they call the "dumb HTTP protocol" in the
docs. This is not the case when you use git-http-backend, which is the
change we made a few months back.

Agreed - wrt the actual patch - not sure it is accurate to classify the
current way as the "older git protocol" as I cannot find that wording
used in the git docs - maybe "classic"?

Neither "older" nor "classic" appeals to me. If you want to convey an
opinion of quality I'd say something like "the more limited git protocol",
otherwise it's just "the git protocol" and we can explain the pros and cons
between the http and git protocols. Noting the improvement of the http
protocol from its former "dumb" version, early on so people have the new
paradigm in their head when they get to the quality comparison, will be
worthwhile for some period of time.

Just "the git protocol" is probably best here, so I changed it to that. I also
changed the http URLs to https, per Stefan's suggestion.

--
Magnus Hagander
Me: https://www.hagander.net/ <http://www.hagander.net/>
Work: https://www.redpill-linpro.com/ <http://www.redpill-linpro.com/>