Secure your code (and hard work): A guide to backing up your GitHub repositories
Intro
Concerned that you may eventually lose access to GitHub and all of your repositories? Having an external backup of your work is important, and it should follow the 3-2-1 backup rule. In the case below, I’ve used the python-github-backup pip package to back up all of my repositories to an external source. The code backup strategy I follow is:
- Commit code to GitHub, etc. regularly (particularly private repositories)
- Once a quarter, I run python-github-backup to back this up to my QNAP NAS (Network Attached Storage)
- The directory is backed up nightly to Backblaze B2 (I also use Amazon S3/Glacier, but not for this use case)
Are there possibilities for improvement? Always! I plan to publish a follow-up blog post on using an AWS Lambda function, invoked by a time-based Amazon EventBridge event, to automate this task.
I’ll go into what GitHub is and why it’s used, but if you’re more interested in the pip package and its execution, jump to the “Backing up your code” section.
What is GitHub?
In the world of software development, GitHub has become one of the most popular platforms for managing and storing source code. GitHub is a web-based hosting service that provides version control using Git. With GitHub, developers can store their code in repositories, collaborate with others, and track changes to their code over time. However, while GitHub is a great tool for version control, it’s important to remember that it’s not a backup solution. Why isn’t GitHub a one-stop shop for code storage? Let’s dive in.
Protection against data loss
GitHub itself offers a high degree of redundancy and backup, but there’s always the possibility that your data could be lost due to a server failure or other catastrophic event. Backing up your GitHub repositories to an external source ensures that you have a copy of your code in case anything happens to the GitHub servers. By keeping a backup copy of your repositories, you can quickly restore your code and continue working without losing any progress.
Collaboration
GitHub is an excellent platform for collaborating with other developers on a project. However, when you’re working with others, there’s always the possibility that someone could accidentally delete or overwrite a critical piece of code. By backing up your GitHub repositories to an external source, you can quickly restore any lost code and keep your project moving forward.
Control over versioning
One of the benefits of using GitHub is that it keeps a history of all changes made to your code, allowing you to easily roll back to previous versions if necessary. However, if you don’t have a backup of your repositories, you’re relying solely on GitHub to maintain this version history. By backing up your repositories to an external source, you can have greater control over versioning and ensure that you always have access to previous versions of your code.
Compliance with regulatory requirements
If you’re working on a project that’s subject to regulatory requirements, you may be required to keep backups of your code in a separate location. By backing up your GitHub repositories to an external source, you can ensure that you’re meeting these requirements and avoiding any potential compliance issues.
Peace of mind
Perhaps the most important reason to back up your GitHub repositories to an external source is for your peace of mind. Knowing that your code is backed up in a separate location can give you the confidence to work on your projects without worrying about losing your work. By taking this simple step, you can avoid the stress and frustration of losing hours, days, or even weeks of work due to a data loss event.
Why back it up?
While GitHub is an excellent platform for managing and storing your code, it’s essential to back up your repositories to an external source to protect against data loss, facilitate collaboration, maintain version control, comply with regulatory requirements, and provide peace of mind. By taking this simple step, you can ensure that your code is always safe and secure, no matter what happens to GitHub or to your ability to log in.
Backing up your code
1. Install pip if you don’t already have it.
2. Install python3.x, or ensure it’s readily available, then install the package with pip install github-backup (the PyPI name of python-github-backup).
3. Create a GitHub access token (a personal access token (classic) will suffice).
4. Specify repo access for the scope, and set the expiration to the shortest duration that works for you.
5. Export the GitHub access token you created in step #3 to your session:
   export ACCESS_TOKEN=SOME-GITHUB-TOKEN
6. cd to the directory in which you want to store all of your repositories.
7. Execute the command (replace YOUR_GITHUB_USERNAME with your GitHub username):
   github-backup YOUR_GITHUB_USERNAME --token "$ACCESS_TOKEN" --output-directory ./ --repositories --private
8. You should see output similar to the following:
   2023-03-08T00:20:11.153: Backing up user YOUR_GITHUB_USERNAME to /mnt/y/coderepo/example
   2023-03-08T00:20:11.154: Requesting https://api.github.com/user?per_page=100&page=1
   2023-03-08T00:20:11.336: Retrieving repositories
   2023-03-08T00:20:11.336: Requesting https://api.github.com/user/repos?per_page=100&page=1
   2023-03-08T00:20:13.869: Requesting https://api.github.com/user/repos?per_page=100&page=2
   2023-03-08T00:20:14.180: Filtering repositories
   2023-03-08T00:20:14.181: Backing up repositories
   2023-03-08T00:20:14.458: Cloning 0x4447_product_maintenance repository from https://*****:x-oauth-basic@github.com/YOUR_GITHUB_USERNAME/example.git to /mnt/c/coderepo/gh_backup/repositories/example/repository
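If you run these steps every quarter, it may be handy to wrap them in a small shell function. The following is only a sketch, not part of python-github-backup: it assumes github-backup is already installed and on your PATH, and that ACCESS_TOKEN is exported as described above. The function name run_github_backup and the DRY_RUN switch are hypothetical names made up for this illustration; the github-backup flags are the same ones used in the command above.

```shell
#!/usr/bin/env bash
# Sketch: wrap the manual backup steps in one function.
# Assumptions: github-backup is on PATH and ACCESS_TOKEN is exported.
# run_github_backup and DRY_RUN are hypothetical names for this example.

run_github_backup() {
  local username="${1:-}"
  local output_dir="${2:-.}"

  if [ -z "$username" ]; then
    echo "usage: run_github_backup GITHUB_USERNAME [OUTPUT_DIR]" >&2
    return 2
  fi

  # Refuse to run without a token rather than failing halfway through.
  if [ -z "${ACCESS_TOKEN:-}" ]; then
    echo "ACCESS_TOKEN is not set; export it first" >&2
    return 1
  fi

  mkdir -p "$output_dir" || return 1

  # With DRY_RUN=1, print the command (token masked) instead of running it.
  if [ "${DRY_RUN:-0}" = "1" ]; then
    printf 'github-backup %s --token ***** --output-directory %s --repositories --private\n' \
      "$username" "$output_dir"
    return 0
  fi

  # Same flags as the manual command shown in the steps above.
  github-backup "$username" --token "$ACCESS_TOKEN" \
    --output-directory "$output_dir" --repositories --private
}
```

After sourcing the file, calling run_github_backup YOUR_GITHUB_USERNAME ./ should behave like the manual command; setting DRY_RUN=1 first lets you check what would be executed without contacting GitHub.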
Conclusion
You’ve now safely stored your GitHub repositories locally, and are hopefully following a 3-2-1 backup strategy. Huzzah!
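A backup is only useful if it can be read back, so it may be worth checking the cloned repositories from time to time. The sketch below is one possible way to do that with plain git commands. It assumes the directory layout visible in the log output above (OUTPUT_DIR/repositories/NAME/repository), which may differ for other options or versions of the tool. The function name verify_backups is hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: run an integrity check on every backed-up repository.
# Assumption: repositories live under <backup_root>/repositories/<name>/repository,
# as seen in the log output above. verify_backups is a hypothetical name.

verify_backups() {
  local backup_root="${1:-.}"
  local failed=0
  local repo
  local commits

  for repo in "$backup_root"/repositories/*/repository; do
    # Skip the unexpanded pattern when nothing matched.
    [ -d "$repo" ] || continue

    # git fsck verifies the connectivity and validity of the stored objects.
    if git -C "$repo" fsck >/dev/null 2>&1; then
      # Count commits reachable from any ref, to confirm history is present.
      commits=$(git -C "$repo" rev-list --count --all)
      echo "OK   $repo (commits: $commits)"
    else
      echo "FAIL $repo"
      failed=1
    fi
  done

  # Non-zero exit status if any repository failed the check.
  return "$failed"
}
```

Running verify_backups ./ from the backup directory prints one line per repository and returns a non-zero status if any check fails, which makes it easy to call from a scheduled job.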