Uploading many files to GitHub repository
Jun 6, 2016 by Katsutoshi Seki
Tags: english, git, github
As GitHub introduced unlimited private repositories, I uploaded gigabytes of data to a GitHub repository. Here are some technical notes on uploading many files to a GitHub repository.
Storage and bandwidth limit of large files
Git Large File Storage (LFS) is required to push a file larger than 100 MB to a GitHub repository. LFS has storage and bandwidth limits, and if you exceed them you need to buy extra storage and bandwidth.
You may want to use LFS to handle the large files, or you may just want to ignore the large files. Both methods are described in this article.
Create a repository
Sign in to GitHub and create a repository with the New repository button (do not check Initialize this repository with a README). If there is no README.md, create a tentative file with
echo "# test" >> README.md
and run the following commands to initialize the repository and add README.md (replace USER and REP with your user name and repository name).
git init
git add README.md
git commit -m "First commit"
git remote add origin git@github.com:USER/REP.git
git push -u origin master
Remove spaces from the file names
This command replaces spaces " " with underscores "_" in the file names under the current directory.
for A in $(find . | grep " " | sed -e s/" "/x3Exe/g) ; do mv "$(echo $A | sed -e s/x3Exe/' '/g)" "$(echo $A | sed -e s/x3Exe/'_'/g)"; done
If a directory name contains spaces, an error may arise. Just repeat the command until no error is shown.
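As an alternative that should not need repeating, the renaming can be sketched with find -depth, which visits the contents of a directory before the directory itself. The function name rename_spaces is hypothetical and not part of the original notes; the sketch assumes file names without newlines and leaves a file alone when the underscore name is already taken.

```shell
# Hypothetical helper (not from the original article): replace spaces with
# underscores in every file and directory name below a directory.
# Usage: rename_spaces [dir]   (default: current directory)
rename_spaces() {
    # -depth lists a directory's contents before the directory itself, so
    # renaming a directory never invalidates paths that are still to come.
    find "${1:-.}" -depth -name '* *' ! -path '*/.git/*' |
    while IFS= read -r path; do
        dir=$(dirname "$path")
        base=$(basename "$path")
        new=$(printf '%s' "$base" | tr ' ' '_')
        if [ -e "$dir/$new" ]; then
            # Do not overwrite an existing file; report and move on.
            echo "skipped (target exists): $path" >&2
        else
            mv -- "$path" "$dir/$new"
        fi
    done
}
```

Running rename_spaces with no argument in the repository root processes the whole tree in one pass.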
Manage files with LFS
Skip this step when not using LFS. To use LFS, install it first. Then set up LFS for Git with
git lfs install
Files larger than 100 MB can be listed with
find . -size +100M | xargs du -sh
From this list, designate the files to manage with LFS. For example, to track files with the .psd extension:
git lfs track "*.psd"
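To help choose the patterns to track, the list of large files can be reduced to its distinct extensions. The following is a minimal sketch; the function name large_file_extensions and its optional arguments are assumptions for illustration. Note that git lfs track records the patterns in .gitattributes, which has to be added and committed as well.

```shell
# Hypothetical helper (not from the original article): print the distinct
# extensions of files above a size threshold.
# Usage: large_file_extensions [dir] [size]   (defaults: . and 100M)
large_file_extensions() {
    find "${1:-.}" -type f -size "+${2:-100M}" ! -path '*/.git/*' |
        # Keep only what follows the last dot of the file name, if any.
        sed -n 's/.*\.\([^./]*\)$/\1/p' |
        sort -u
}

# Possible follow-up inside the repository, one pattern per extension:
#   git lfs track "*.psd"
#   git add .gitattributes
```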
Ignore large files
When not using LFS, large files can be ignored with a .gitignore file.
To add all the files larger than 100 MB to .gitignore:
find . -size +100M | sed -e 's/^\.\///' >> .gitignore
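Running the command above twice appends the same entries twice, and an entry without a slash (a large file in the top directory) also matches files of the same name in subdirectories. A possible variant, sketched under the assumption that it is run in the repository root, anchors each entry with a leading "/" and skips entries already present. The function name ignore_large_files is hypothetical, and file names containing gitignore pattern characters such as * or [ are not escaped.

```shell
# Hypothetical helper (not from the original article): append files above a
# size threshold to .gitignore, anchored with "/" and without duplicates.
# Usage: ignore_large_files [size]   (default: 100M)
ignore_large_files() {
    touch .gitignore
    find . -type f -size "+${1:-100M}" ! -path './.git/*' |
        # "./dir/file" becomes "/dir/file", relative to the .gitignore location.
        sed -e 's/^\.//' |
        while IFS= read -r entry; do
            # -x matches whole lines, -F treats the entry as a fixed string.
            grep -qxF -- "$entry" .gitignore ||
                printf '%s\n' "$entry" >> .gitignore
        done
}
```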
Increase the HTTP post buffer size
When pushing large files, an error like the following may arise:
packet_write_wait: Connection to 192.30.252.123: Broken pipe
fatal: The remote end hung up unexpectedly
error: failed to push some refs to 'git@github.com:USER/REP.git'
To avoid this, increasing the HTTP post buffer size is recommended. To increase the buffer size to 50 MB:
git config http.postBuffer 52428800
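The value is given in bytes, and 52428800 is 50 × 1024 × 1024. A small sketch for computing other sizes follows; mib_to_bytes is a hypothetical helper name. Keep in mind that Git documents http.postBuffer as a setting of the smart HTTP transport, so it is only expected to matter for remotes accessed over HTTP or HTTPS.

```shell
# Hypothetical helper (not from the original article): convert MiB to bytes.
mib_to_bytes() {
    echo $(( $1 * 1024 * 1024 ))
}

# Inside a repository, the value could then be set and read back with:
#   git config http.postBuffer "$(mib_to_bytes 50)"
#   git config --get http.postBuffer
```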
Adding files to repository
The standard way to add all the files to the repository is git add -A; git commit; git push, but it does not succeed when trying to add gigabytes of files; the error fatal: The remote end hung up unexpectedly arises even when the HTTP buffer size is increased. Therefore I made the following shell script, gitadd, to add all the files in the current directory step by step.
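The idea of adding files step by step can be illustrated as follows. This is a rough sketch under assumptions, not the original gitadd script: it adds, commits, and pushes one top-level entry of the current directory at a time, and the function name gitadd_sketch, the commit message, and the default remote and branch (origin, master) are chosen for illustration.

```shell
# Hypothetical sketch (not the original gitadd script): add, commit, and
# push one top-level entry at a time so that each push stays small.
# Usage: gitadd_sketch [remote] [branch]   (defaults: origin master)
gitadd_sketch() {
    remote=${1:-origin}
    branch=${2:-master}
    for entry in ./* ./.[!.]*; do
        # Unmatched glob patterns stay literal; skip them and the .git directory.
        [ -e "$entry" ] || continue
        [ "$entry" = "./.git" ] && continue
        # git add fails for an entry that is itself ignored; move on.
        git add -- "$entry" 2>/dev/null || continue
        # Commit and push only when this entry actually staged something.
        if ! git diff --cached --quiet; then
            git commit -q -m "Add $entry" || return 1
            git push -q "$remote" "$branch" || return 1
        fi
    done
}
```

If a single top-level directory is itself too large to push at once, the same loop could be run from inside that directory.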
If git add -A; git commit; git push fails with this error, you can reset the commit and index with git reset HEAD~ and then run gitadd.