When cloning a Git repository, you can limit how much history your clone will contain. If you pass the option --depth 1, you get the minimum amount of history, and you create a shallow clone.
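
As a minimal sketch of the option described above, the following Python helper builds (and can optionally run) a shallow clone command. The function names, the destination argument and the example URL are hypothetical and were not part of the original experiment; only git clone and its --depth option come from Git itself.

```python
import subprocess


def shallow_clone_command(url, dest, depth=1):
    """Build the argument list for a shallow `git clone`."""
    if depth < 1:
        raise ValueError("depth must be a positive integer")
    # --depth N truncates the history to the N most recent commits.
    return ["git", "clone", "--depth", str(depth), url, dest]


def shallow_clone(url, dest, depth=1):
    """Run the clone; raises CalledProcessError if git exits non-zero."""
    subprocess.run(shallow_clone_command(url, dest, depth), check=True)


# Hypothetical URL, for illustration only; this line only prints the command.
print(" ".join(shallow_clone_command("git://git.example.org/module", "module")))
```

Calling shallow_clone() with a real repository URL would perform the actual clone; leaving out --depth altogether gives the usual full clone.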

The git clone man page says that you cannot push your commits if you have a shallow clone. Apparently, though, there is no error message when you actually push your commits, so this could silently cause problems in the repository later on.

Lacking more details on whether pushing commits from shallow clones is bad for the repository, let’s measure whether there is anything to gain by opting for shallow clones.

Module (gnome-2-26)     Full clone (MB)  Shallow clone (MB)
evolution               204              189
gtk+                    193              172
nautilus                139              108
gnome-games             127              120
gnome-applets           110              98
gnome-user-docs         108              102
evolution-data-server   84               77
anjuta                  76               66
libgweather             69               68
gnome-panel             68               60
ekiga                   61               49
dasher                  58               49
orca                    55               47
gnome-utils             53               48
gnome-icon-theme        51               49
gedit                   49               45
epiphany                48               42
gnome-control-center    46               40
gdm                     43               38
glib                    42               37
gnome-system-tools      33               29
gnome-media             33               30
totem                   31               27
gnome-power-manager     31               27
gnome-backgrounds       31               30
brasero                 31               29
metacity                29               27
gnome-desktop           28               24
tomboy                  27               25
seahorse                24               22
gnome-terminal          23               21
gnome-session           23               20
gucharmap               22               19
gnome-vfs               22               19
glade3                  21               19
gconf                   21               20
eog                     21               18
gcalctool               19               17
libgnomeui              18               15
gtkhtml                 18               16
evince                  18               15
gnome-themes            17               16
cheese                  17               15
file-roller             16               14
empathy                 16               15
gok                     14               13
gtksourceview           13               12
gnome-keyring           13               12
gnome-doc-utils         13               13
bug-buddy               13               11
zenity                  12               11
yelp                    12               11
sound-juicer            12               11
libgnome                12               11
gvfs                    12               9.9
gnome-system-monitor    12               11
deskbar-applet          12               9.5
libbonobo               11               8.8
gnome-settings-daemon   11               11
gnome-devel-docs        11               11
evolution-exchange      9.9              9.3
gnome-screensaver       9                8.3
vte                     8.7              7.5
libbonoboui             8.7              7.4
libgtop                 8.4              6.9
libgnomeprintui         8.4              7.1
gconf-editor            8.4              7.9
libgnomeprint           8.1              7
vinagre                 7.3              6
libwnck                 6.6              5.9
accerciser              6.6              6.3
gtk-engines             6.4              5.4
sabayon                 5.8              5.2
vino                    5.7              5.3
gnome-nettool           5.3              4.9
mousetweaks             5                4.7
totem-pl-parser         4.6              4.5
at-spi                  4.5              3.9
libgnomecanvas          4.3              3.7
atk                     4.2              3.7
gnome-netstatus         4.1              3.8
devhelp                 3.9              3.2
gdl                     3.5              3.2
gnome-mag               3.2              2.9
gnome-menus             3                2.6
hamster-applet          2.8              2.2
gnome-user-share        2.6              2.5
evolution-mapi          2.2              2.1
libgnomekbd             1.8              1.7
alacarte                1.6              1.4
pessulus                1.5              1.3
evolution-webcal        1.4              1.3
swfdec-gnome            1.1              0.94
Total (MB)              2625.6           2349.24
Time (min)              52               37

The Git repositories for all modules of gnome-2-26 weigh 2.6GB, while their shallow clones weigh 2.3GB. That is a difference of less than 300MB.

In terms of time, cloning all GNOME 2.26 repositories took 52 minutes, and the shallow clones saved 15 minutes. The speed reported by git clone was about 1.4MB/s in this experiment.
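
The arithmetic behind these figures can be checked with a short sketch. The totals are taken from the table; the helper name saving is an assumption made for illustration.

```python
# Totals from the last two rows of the table.
full_mb, shallow_mb = 2625.6, 2349.24
full_min, shallow_min = 52, 37


def saving(full, shallow):
    """Return the absolute saving and the saving as a percentage of `full`."""
    absolute = full - shallow
    return absolute, 100 * absolute / full


size_saved, size_pct = saving(full_mb, shallow_mb)
time_saved, time_pct = saving(full_min, shallow_min)

print(f"disk: {size_saved:.1f} MB saved ({size_pct:.1f}%)")  # disk: 276.4 MB saved (10.5%)
print(f"time: {time_saved} min saved ({time_pct:.1f}%)")     # time: 15 min saved (28.8%)
```

In relative terms, the shallow clones save roughly a tenth of the disk space and a bit under a third of the cloning time.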

Cloning is bound by both your bandwidth and your CPU (especially when resolving deltas). It would be interesting to evaluate whether daily tarballs of anonymous clones of the modules would bring benefits (for the load on git.gnome.org and for the speed of cloning), so that one could download a tarball over HTTP, then simply add their account details and update with git pull --rebase.

Given the above, it makes sense to avoid making shallow clones, especially when you intend to push your changes. Instead, set aside at least 2.6GB for the full repositories, and keep them.

intltool-manage-vcs was used to retrieve the repositories.

Update: The GNOME 2.26 modules (2.6GB in size for all their repositories) compress down to 1.6GB (.tar.bz2).

8 Responses to “Git clones vs Shallow Git clones”

  1. foo Says:

    Yes, daily tarballs of git clones would be nice. You could also make them bare repositories (so they are smaller) and add a README.GNOME.git file to make sure people do git checkout master in them.


  2. Time (min) 52:19.49s 37:41.48s
    Time in (min*s)? Does that make it super-time or something? Nice post for the rest of it.

  3. Simos Says:

    @Michael: The system that I was running the test on has a very fast connection to the Internet, between 20-30Mbps.
    For a typical case with slow home broadband, the speed will definitely be slower. It should take about 3-5 hours.


  4. Maybe I wasn’t clear enough… The unit of time is either minutes, or seconds, not (minutes * seconds), because that would make it (time ^ 2).

  5. Simos Says:

    @Michael: Ok, now I see. It was a copy-paste typo.

  6. Steven Walter Says:

    If you’re getting that much compression from bzip2, then it sounds like your repositories aren’t packed very efficiently. Are you tarring only the .git directory, or the .git directory and working copy? The latter is pointless, as the .git directory contains all the information needed to restore the working copy (provided you have no uncommitted changes). Running “git reset --hard” will do this.

  7. Simos Says:

    @Steven: Indeed, I have been tarring the working copy as well.
    I just measured the size of the .git/ directories only; they are 1.124 GB.

    Thanks for the tip, it will be very helpful.

  8. dr.t.vasudevan Says:

    quite informative. so apparently nothing to be gained much with shallow repo.
    thanks

