Git Clone Command vs. GitHub Backup - Best Practices


DZone DevOps, November 20, 2021, Marta Przybylska


Cloning is a popular theme in science fiction movies and literature; Star Wars: Attack of the Clones is just one example. But it is not science fiction at all: in the real world, probably everyone has heard of Dolly the sheep, the first mammal cloned from an adult cell. Since then, scientists have managed to clone horses, pigs, and dogs, among others. Wait, we are interested in the IT world, right? The world where over 87.2% of programmers use the Git version control system, with 60M GitHub users, 10M Bitbucket teams, and over 30M GitLab enthusiasts. So let’s take an in-depth look at the git clone command. Do we have your attention?

What Is a Git Clone?

To work with Git, we need a copy of the repo on our device. In the event of a failure and a lack of backups, we can restore the entire repository from such a copy. So what is a clone? It is a complete copy of the repository, including the entire history of changes. But clone is also the name of the Git command that creates this copy.
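
As a minimal sketch of what "a complete copy with the entire history" means, the commands below build a throwaway repository in a temporary directory and clone it. The paths, file name, and commit messages are illustrative assumptions; in practice the source would be a URL from your hosting service.

```shell
# Build a small throwaway repository that plays the role of the "remote".
tmp=$(mktemp -d)
git init -q "$tmp/origin"
for n in 1 2 3; do
  echo "change $n" >> "$tmp/origin/README.md"
  git -C "$tmp/origin" add README.md
  git -C "$tmp/origin" -c user.name=demo -c user.email=demo@example.com \
    commit -q -m "commit $n"
done

# Clone it: the copy carries the whole history, not just the latest files.
git clone -q "$tmp/origin" "$tmp/copy"
commits=$(git -C "$tmp/copy" rev-list --count HEAD)
echo "commits in clone: $commits"
```

The clone reports the same number of commits as the source, which is why a full local clone can serve as a last-resort restore point.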


How To Git Clone a Branch?

One of the options of the clone command is --branch. By default, clone fetches all branches and checks out only the remote’s default branch (usually main). With --branch we can have it check out a branch we specify instead, but that does not change the fact that all branches are still fetched.
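
The difference can be seen in a small experiment. This is a sketch against a throwaway local repository; the branch name feature and the paths are assumptions made for the demo. It also uses --single-branch, an additional git clone option that restricts the fetch itself to one branch.

```shell
# Throwaway "remote" with two branches: main and feature.
tmp=$(mktemp -d)
git init -q "$tmp/origin"
git -C "$tmp/origin" symbolic-ref HEAD refs/heads/main   # name the default branch
echo "hello" > "$tmp/origin/README.md"
git -C "$tmp/origin" add README.md
git -C "$tmp/origin" -c user.name=demo -c user.email=demo@example.com \
  commit -q -m "initial"
git -C "$tmp/origin" branch feature

# --branch only chooses what gets checked out; every branch is still fetched.
git clone -q --branch feature "$tmp/origin" "$tmp/full"
git -C "$tmp/full" branch -r

# --single-branch additionally limits the fetch to that one branch.
git clone -q --branch feature --single-branch "$tmp/origin" "$tmp/single"
git -C "$tmp/single" branch -r
```

The first listing includes origin/main alongside origin/feature; the second does not, so the second clone cannot be used to recover main.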

Now we know what git clone and --branch do. Such options give us more control over what we do, but they also have consequences. We mentioned that a local copy of the repo, in extreme cases, allows you to restore the project, so each local clone works a little bit like a backup of the base repository. The problem appears when this copy contains only a single branch: then, of course, we do not have the entire repository. And the risk grows with the scale of the company. Thus, having a reliable repository and metadata backup in place is important, and it should never rely on local copies, because the options of the clone command allow you to filter out many items and we can never be sure how our copy differs from the remote repository.

Git Clone With SSH 

SSH is a network protocol that enables a remote terminal connection, for example with a server or another computer. Importantly, such connections are encrypted. To establish one, we need a pair of keys: a private key (saved on our hard drive) and a public key, shared with the service. We can quite easily set up such a connection in Bitbucket, GitHub, and GitLab. Why should we do this? It limits the risk of data being intercepted by unauthorized persons, because authentication relies on the key pair rather than on a password that has to be typed and transmitted. SSH keys are also unlikely to be changed often, and certainly not as often as passwords, so the private key deserves careful protection.
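
A minimal sketch of the key-pair setup follows. It assumes OpenSSH's ssh-keygen is installed and writes the keys to a scratch directory so that nothing in ~/.ssh is touched; the comment string and OWNER/REPO are placeholders. The empty passphrase is used only to keep the demo non-interactive; a real key should have one.

```shell
# Generate a dedicated Ed25519 key pair in a scratch directory
# (illustrative location; real keys usually live in ~/.ssh).
keydir=$(mktemp -d)
ssh-keygen -q -t ed25519 -N "" -C "git-clone-demo" -f "$keydir/id_ed25519"

# The .pub file is the half you register with GitHub, GitLab, or Bitbucket.
cat "$keydir/id_ed25519.pub"

# Once the public key is registered, a clone over SSH would look like this.
# OWNER/REPO is a placeholder, so the command is left commented out:
# GIT_SSH_COMMAND="ssh -i $keydir/id_ed25519 -o IdentitiesOnly=yes" \
#   git clone git@github.com:OWNER/REPO.git
```

The private key file (without the .pub suffix) stays on the local machine and is never uploaded.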

Git Clone With HTTPS

This is the default method of cloning on the most popular platforms, such as GitHub, Bitbucket, and GitLab. It needs no dedicated configuration: we only need an account with a login and a password (or, on GitHub, a personal access token, which has replaced account passwords for Git operations over HTTPS since August 2021). Of course, when it comes to security, we start to notice some problems, but convenience and ease of use are its great advantages. The risk of losing the login and password is high, so it is worth taking steps to increase security. Best practice? Use two-factor authentication and have a third-party Bitbucket, GitLab, or GitHub backup in place, in case an attacker takes over our credentials and removes or encrypts all our code in order to demand a ransom.
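
Which transport Git uses is decided purely by the form of the clone URL. As a small illustration, the helper below rewrites an HTTPS clone URL into its SSH counterpart. Note that https_to_ssh is a hypothetical function written for this example, not a Git command, and OWNER/REPO is a placeholder.

```shell
# https_to_ssh is a hypothetical helper, not part of Git.
# It rewrites https://HOST/PATH into the scp-like form git@HOST:PATH.
https_to_ssh() {
  rest=${1#https://}        # drop the scheme, leaving HOST/PATH
  host=${rest%%/*}          # text before the first slash
  repo_path=${rest#*/}      # text after the first slash
  printf 'git@%s:%s\n' "$host" "$repo_path"
}

https_url="https://github.com/OWNER/REPO.git"   # placeholder repository
ssh_url=$(https_to_ssh "$https_url")
echo "HTTPS: $https_url"
echo "SSH:   $ssh_url"
```

An existing clone can be switched from one transport to the other with git remote set-url origin followed by the new URL.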

I Have My Backup Script, Do I Need a GitHub Backup?

We already know how the clone command works. We have tools that we can use to create copies, but just imagine doing it manually every time! Impossible, huh? Thus, we usually prepare scripts that execute the right commands at the right time. We also need some external storage to keep these copies. It is simple in theory, but in practice these scripts have to be maintained continuously, which can be problematic, expensive, and time-consuming. To name just a few situations that require changes: adding a new repository, changing the hosting provider, or closing or archiving old projects.
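
For illustration, a home-grown script of this kind might look like the sketch below. The function name backup_repo, the directory layout, and the demo repository are all assumptions; it relies on git clone --mirror, which copies every ref rather than only the checked-out branch. Scheduling, storage, and the list of repositories are exactly the parts that would still need ongoing maintenance.

```shell
# backup_repo is a hypothetical helper: mirror URL into DEST on the
# first run, and refresh the existing mirror on later runs.
backup_repo() {
  url=$1
  dest=$2
  if [ -d "$dest" ]; then
    git -C "$dest" fetch -q --prune origin
  else
    mkdir -p "$(dirname "$dest")"
    git clone -q --mirror "$url" "$dest"
  fi
}

# Demo against a throwaway local repository standing in for the remote.
tmp=$(mktemp -d)
git init -q "$tmp/origin"
git -C "$tmp/origin" symbolic-ref HEAD refs/heads/main
demo_commit() {  # demo-only helper that adds one commit to the source
  echo "$1" >> "$tmp/origin/notes.txt"
  git -C "$tmp/origin" add notes.txt
  git -C "$tmp/origin" -c user.name=demo -c user.email=demo@example.com \
    commit -q -m "$1"
}
demo_commit "first"
git -C "$tmp/origin" branch feature
backup_repo "$tmp/origin" "$tmp/backup/origin.git"   # first run: clone
demo_commit "second"
backup_repo "$tmp/origin" "$tmp/backup/origin.git"   # later runs: fetch
git -C "$tmp/backup/origin.git" for-each-ref --format='%(refname)'
```

Every new repository, renamed project, or hosting change means editing the list of backup_repo calls by hand, which is where the maintenance cost comes from.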

There is another huge disadvantage of that approach. Even if your script creates a copy, how do you restore the data? With another script? That means more maintenance work and time. And are you sure that the copy even works? Creating a copy of your data is a very good idea, but it is not enough to call it a proper backup. You should be able to create copies automatically, check their correctness, version them, restore them if needed, and manage them easily.
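
To make the "does this copy even work?" question concrete, here is a sketch of the two extra steps a script would need: an integrity check with git fsck and a trial restore. The paths and the demo repository are assumptions. It covers Git data only; issues, pull requests, and other hosting metadata are not stored in the repository and are not captured this way.

```shell
# Throwaway source repository and a mirror copy of it.
tmp=$(mktemp -d)
git init -q "$tmp/origin"
echo "important code" > "$tmp/origin/app.txt"
git -C "$tmp/origin" add app.txt
git -C "$tmp/origin" -c user.name=demo -c user.email=demo@example.com \
  commit -q -m "initial"
git clone -q --mirror "$tmp/origin" "$tmp/backup.git"

# 1. Verify: fsck checks the object database and fails on corruption.
if git -C "$tmp/backup.git" fsck --full >/dev/null 2>&1; then
  verified=yes
else
  verified=no
fi
echo "backup verified: $verified"

# 2. Restore: clone from the mirror to get a working copy back.
git clone -q "$tmp/backup.git" "$tmp/restored"
restored_head=$(git -C "$tmp/restored" rev-parse HEAD)

# To repopulate an empty remote instead, one could push every ref
# (NEW_URL is a placeholder, so the command is left commented out):
# git -C "$tmp/backup.git" push --mirror NEW_URL
```

Even this small sketch adds a second code path that has to be tested and kept in sync with the backup step.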

Conclusion 

To sum up, there is a huge difference between a ‘copy’ and a ‘backup’. A copy is fine for daily work, but it is not enough for real protection of your source code. A complete backup requires control over encryption, versioning, long-term data retention, flexible restore options, and so on, so that you are prepared for unexpected failures.

According to official numbers, there are over 56M developers on GitHub, and 60M new repositories were created in the last year alone. Bitbucket is used by over 12 million users, and GitLab has an estimated 30 million registered users!

Nowadays, developers are the ones who drive the digital revolution. Since most code is created and hosted within a version control system, you need to have a Bitbucket, GitLab, and GitHub backup in place.

Source: DZone DevOps

