Ticket #667 (closed enhancement: fixed)

Opened 3 years ago

Last modified 3 years ago

Git: Initial clone should use "--depth 1" option

Reported by: tfogal
Owned by:
Priority: minor
Milestone: 0.8.+
Version: 0.7.11
Keywords: git
Cc:

Description

The git repositories that buildbot needs could be considered "leaf" repositories. That is, one doesn't expect that the repository buildbot checks out would be used as an upstream for anybody else. As such, in the common case, buildbot only needs "the latest" revision in the repository.

To reduce disk load and processing time, but mostly network bandwidth, it would be nice if buildbot only grabbed a "shallow" clone. Such clones are not useful for browsing history, but buildbot does not do that regularly.

This all breaks down if the user requests a specific revision. Presumably, we could fetch only that revision and everything after it, but working out how far back to fetch seems difficult.
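A minimal sketch of the idea, not the attached patch: build the clone command line with "--depth 1" when only the latest revision is wanted, and fall back to a full clone when a specific revision is requested. The function name initial_clone_argv and its parameters are hypothetical; "--depth" and "--branch" are standard git clone options.

```python
from typing import List, Optional


def initial_clone_argv(repourl: str, workdir: str,
                       branch: Optional[str] = None,
                       revision: Optional[str] = None) -> List[str]:
    """Build the argv for the first clone of a build directory (hypothetical helper)."""
    argv = ["git", "clone"]
    if revision is None:
        # Only the tip is needed, so truncate history to a single commit.
        argv += ["--depth", "1"]
    # Otherwise the pinned revision may lie deeper than the tip, so keep
    # full history rather than guess how deep to fetch.
    if branch is not None:
        argv += ["--branch", branch]
    argv += [repourl, workdir]
    return argv
```

For example, initial_clone_argv("git://example.org/proj.git", "build") yields a shallow clone command, while passing revision="abc123" drops the depth limit so the requested commit is certain to be present.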

Attachments

bb-shallow.diff (1.3 KB) - added by tfogal 3 years ago.
(Incomplete?) patch to implement shallow clones.

Change History

Changed 3 years ago by tfogal

(Incomplete?) patch to implement shallow clones.

comment:1 Changed 3 years ago by tfogal

The attached patch is the start of an implementation. Unfortunately, it suffers from an issue: the initial clone implicitly does a git checkout, and the "git reset --hard FETCH_HEAD" that runs afterwards ends up doing another checkout. This might be avoidable if we switch to "git clone ... ; git checkout -f branch; git clean ..." instead of the fetch business that is going on now.

Then again, it might just be an artifact of my test repository. "git clone" on this repo does not give the branch I specify in master.cfg, so it could be that this works fine and only my configuration is suboptimal.
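One possible way around the double checkout, which is neither the patch nor the alternative proposed above, is git clone's "--no-checkout" option: the clone then populates only .git, and the later reset performs the single checkout. A rough sketch, assuming the fetch-then-reset flow described in this comment; the helper name shallow_checkout_steps is hypothetical.

```python
from typing import List


def shallow_checkout_steps(repourl: str, workdir: str, branch: str) -> List[List[str]]:
    """Return git command lines for a shallow clone with one checkout (hypothetical helper).

    The first command runs in the parent directory; the remaining ones are
    meant to run with workdir as the current directory.
    """
    return [
        # Download the branch tip only, and leave the working tree empty.
        ["git", "clone", "--depth", "1", "--no-checkout",
         "--branch", branch, repourl, workdir],
        # Record the branch tip in FETCH_HEAD, still limited to one commit.
        ["git", "fetch", "--depth", "1", "origin", branch],
        # The only step that writes files into the working tree.
        ["git", "reset", "--hard", "FETCH_HEAD"],
    ]
```

Whether this fits the existing VC modes is a separate question; the sketch only shows that the implicit checkout can be skipped.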

comment:2 Changed 3 years ago by dustin

If I understand your worry correctly, it's that 'git fetch' is happening twice. But it isn't downloading the commits twice, right? I'm worried about changing the behavior of any of the VC modes - a git clean is anathema to mode=update, for example.

comment:3 Changed 3 years ago by marcusl

IIUC, this is similar to a discussion we had on Mercurial, where we decided to keep the full repo on disk between updates but, for clobber mode, to retrieve only the specified revision. For git, cloning shallow makes sense in clobber mode; in update mode, not as much.
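The per-mode policy suggested here could be expressed roughly as below. This is only an illustration: the function name clone_depth_args is hypothetical, and treating every mode other than clobber as needing full history is a conservative assumption, not something decided in this ticket.

```python
from typing import List


def clone_depth_args(mode: str) -> List[str]:
    """Extra 'git clone' arguments for a checkout mode (hypothetical helper)."""
    if mode == "clobber":
        # The build dir is deleted and re-cloned for every build, so
        # history beyond the tip is never reused.
        return ["--depth", "1"]
    # In update mode (and, by assumption, any other mode) the repository is
    # kept between builds, so keep the full history.
    return []
```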

(I also think that it'd be cool to write something that would allow a buildslave to cache DVCS repositories locally, outside the build dir, then do a hard-link clone into the build dir. That would reduce bandwidth usage between slave and repo servers, while still having access to the full history locally.)
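For git, such a cache might look roughly like this: keep a mirror outside the build dir, refresh it over the network, then clone from it with "--local", which hard-links the object files when both directories are on the same filesystem. The helper name cached_clone_steps and the directory layout are assumptions made for illustration.

```python
import os
from typing import List


def cached_clone_steps(repourl: str, cache_dir: str, build_dir: str) -> List[List[str]]:
    """Return git command lines that clone via a local mirror (hypothetical helper)."""
    steps = []
    if os.path.isdir(cache_dir):
        # The mirror already exists: only new objects cross the network.
        steps.append(["git", "--git-dir=" + cache_dir, "fetch", "--prune", "origin"])
    else:
        # First use on this slave: create a bare mirror of the upstream repo.
        steps.append(["git", "clone", "--mirror", repourl, cache_dir])
    # Clone from the mirror without touching the network; objects are
    # hard-linked where the filesystem allows it.
    steps.append(["git", "clone", "--local", cache_dir, build_dir])
    return steps
```

The build dir can then be clobbered freely, since re-creating it costs no bandwidth between slave and repo server.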

comment:4 Changed 3 years ago by dustin

  • Milestone changed from undecided to 0.8.0

comment:5 Changed 3 years ago by dustin

  • Milestone changed from 0.8.0 to 0.8.+

comment:6 Changed 3 years ago by tfogal

  • Status changed from new to closed
  • Resolution set to fixed

This was actually fixed in:

5c269c04ab20ad82b3ca9921d34f7a6a1fe6baf0 , 8b6e8c3436762b9d45f6a1f749f180ed2553b44e , and 7d2ccda6386f7f0cce0bc7dc101621ba114e7f1c

(sort of: it is user-configurable, not the default), but the bug was never closed.
