Jun 6, 2016 by Katsutoshi Seki
Tags: english git github

As GitHub introduced unlimited private repositories, I uploaded gigabytes of data to a GitHub repository. Here are some technical notes for uploading many files to a GitHub repository.

Storage and bandwidth limit of large files

Git Large File Storage (LFS) is required to push a file larger than 100 MB to a GitHub repository. LFS has storage and bandwidth limits, and if you exceed them you need to buy extra storage and bandwidth.

You can either use LFS to handle the large files or simply ignore them. Both methods are described in this article.

Create a repository

Sign in to GitHub and create a repository with the New repository button (don't check Initialize this repository with a README). If there is no README.md in your local directory, create a tentative file with

echo "# test" >> README.md

and run the following commands to initialize the repository and add README.md (replace USER and REP with your user name and repository name).

git init
git add README.md
git commit -m "First commit"
git remote add origin git@github.com:USER/REP.git
git push -u origin master

Remove spaces from the file names

This command replaces spaces " " with underscores "_" in the file names under the current directory.

for A in $(find . | grep " " | sed -e s/" "/x3Exe/g) ; do mv "$(echo $A | sed -e s/x3Exe/' '/g)" "$(echo $A | sed -e s/x3Exe/'_'/g)"; done

If a directory name contains spaces, errors may arise, because the directory is renamed before the files inside it and their old paths no longer exist. Just repeat this command until no error is shown.
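As an alternative that should not need repeating, here is a minimal sketch that renames children before their parent directories by using find's -depth option. The function name rename_spaces is hypothetical, and the sketch assumes that skipping a rename is preferable to overwriting an existing file.

```shell
#!/bin/sh
# Replace spaces with underscores in every file and directory name under a
# directory (default: current directory). rename_spaces is a hypothetical
# helper name, not part of the original article.
rename_spaces() {
    # -depth visits a directory's contents before the directory itself,
    # so a child is renamed while its parent path is still valid.
    find "${1:-.}" -depth -name '* *' ! -path '*/.git/*' -exec sh -c '
        for path in "$@"; do
            dir=$(dirname "$path")
            base=$(basename "$path")
            new=$(printf "%s" "$base" | tr " " "_")
            # Refuse to overwrite an existing name.
            if [ -e "$dir/$new" ]; then
                echo "skip: $dir/$new already exists" >&2
            else
                mv -- "$path" "$dir/$new"
            fi
        done
    ' sh {} +
}
```

Run rename_spaces with no argument to process the current directory, or pass a directory as the first argument.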

Manage files with LFS

Skip this step if you are not using LFS. To use LFS, install it first, then set it up for Git with

git lfs install

Files larger than 100 MB can be listed with

find . -type f -size +100M -exec du -sh {} +

From this list, choose the files to manage with LFS. For example, to track files with the .psd extension:

git lfs track "*.psd"
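To decide which patterns to track, it can help to see which extensions occur among the large files. The following is a minimal sketch under that assumption; large_file_patterns is a hypothetical helper, and its optional second argument uses the same syntax as find's -size option.

```shell
#!/bin/sh
# Print one "*.ext" pattern per distinct extension found among large files.
# large_file_patterns is a hypothetical helper; SIZE uses find's -size syntax.
large_file_patterns() {
    dir=${1:-.}
    size=${2:-+100M}
    find "$dir" -type f -size "$size" ! -path '*/.git/*' |
        while IFS= read -r path; do
            base=${path##*/}
            case $base in
                # Only names with a non-empty extension yield a pattern.
                ?*.?*) printf '*.%s\n' "${base##*.}" ;;
            esac
        done | sort -u
}

# Example: review the patterns, then pass the ones you want to git lfs track.
# large_file_patterns . | while IFS= read -r p; do git lfs track "$p"; done
```

git lfs track records each pattern in .gitattributes, so that file should be added and committed together with the tracked files.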

Ignore large files

When not using LFS, large files can be ignored with a .gitignore file.

To add all the files larger than 100 MB to .gitignore:

find . -size +100M | sed -e 's/^\.\///' >> .gitignore
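Running the command above twice appends the same paths twice, and it also looks inside the .git directory. A minimal sketch that avoids both, assuming a hypothetical helper named ignore_large_files whose optional argument uses find's -size syntax:

```shell
#!/bin/sh
# Append large files to .gitignore without duplicating existing entries.
# ignore_large_files is a hypothetical helper; SIZE uses find's -size syntax.
ignore_large_files() {
    size=${1:-+100M}
    touch .gitignore
    find . -type f -size "$size" ! -path './.git/*' | sed -e 's/^\.\///' |
        while IFS= read -r path; do
            # -x: whole-line match, -F: fixed string, -q: quiet.
            grep -qxF -- "$path" .gitignore || printf '%s\n' "$path" >> .gitignore
        done
}
```

Entries are written in the same form as the original command, so a top-level file name without a slash is treated by Git as a pattern that matches at any depth.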

Increase the HTTP post buffer size

When pushing large files, an error like the following may arise:

packet_write_wait: Connection to 192.30.252.123: Broken pipe
fatal: The remote end hung up unexpectedly
error: failed to push some refs to 'git@github.com:USER/REP.git'

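To avoid this, increasing the HTTP post buffer size is recommended. Note that the http.postBuffer setting is used by Git's HTTP transport, so it matters for remotes accessed over HTTP(S). To increase the buffer size to 50 MB: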

git config http.postBuffer 52428800
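The value is given in bytes: 50 × 1024 × 1024 = 52428800. A small sketch of that arithmetic, with mb_to_bytes as a hypothetical helper name:

```shell
#!/bin/sh
# Convert a size in MB (counted as 1024 * 1024 bytes) to bytes,
# for settings such as http.postBuffer.
# mb_to_bytes is a hypothetical helper name.
mb_to_bytes() {
    echo $(( $1 * 1024 * 1024 ))
}

mb_to_bytes 50    # prints 52428800
# Example: git config http.postBuffer "$(mb_to_bytes 50)"
```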

Adding files to the repository

The standard way to add all the files to a repository is git add -A; git commit; git push, but it does not succeed when adding gigabytes of files: a fatal: The remote end hung up unexpectedly error arises even when the HTTP buffer size is increased. Therefore I made a shell script, gitadd, to add all the files in the current directory step by step.

If git add -A; git commit; git push fails, you can reset the commit and index with git reset HEAD~ and then run gitadd.
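As an illustration of the step-by-step idea (a minimal sketch, not the gitadd script itself), the function below makes one commit and one push per top-level entry of the current directory. The name gitadd_stepwise, the run wrapper, and the DRY_RUN switch are all hypothetical, and the sketch assumes that each top-level entry is small enough to push on its own.

```shell
#!/bin/sh
# Sketch of a step-by-step add; NOT the author's gitadd script.
# Assumption: one commit and one push per top-level entry keeps each push small.
# DRY_RUN=1 (hypothetical switch) prints the git commands instead of running them.

run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

gitadd_stepwise() {
    for entry in ./* ./.[!.]*; do
        # Skip unmatched glob patterns and the repository metadata itself.
        if [ ! -e "$entry" ] || [ "$entry" = "./.git" ]; then
            continue
        fi
        run git add -- "$entry" || return 1
        # Commit and push only when something was actually staged.
        if [ "${DRY_RUN:-0}" = "1" ] || ! git diff --cached --quiet; then
            run git commit -m "Add $entry" || return 1
            run git push || return 1
        fi
    done
}
```

Setting DRY_RUN=1 before calling gitadd_stepwise shows the sequence of commands without touching the repository. If a single entry is still too large to push, the same loop can be run from inside that directory.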