What is Git?

Distributed Version Control. (Series 2)

Shubham Singh
5 min read · Feb 29, 2020

In this series, I want to explain what Distributed Version Control means so we can understand why this is such an important feature of Git.

In the previous part of this series, where we looked at the history of version control systems, we talked about SCCS, RCS, CVS, and SVN, four of the most popular version control systems of the past. All four use a central code repository model: there is one central place where you store the master copy of your code, and when you work with the code you check out a copy from that master repository. You make your changes, and then you submit them back to the central repository. Other users can also work from that repository, submitting their own changes. It’s up to us as users to keep up-to-date with whatever is happening in that central repository, and to make sure that we pull down any changes that other people have made.

Git doesn’t work that way. Git is Distributed Version Control: different users, or teams of users, each maintain their own repositories instead of working from a central repository. The changes are stored as changesets or patches, and we’re focused on tracking changes, not the versions of the document.

Now that’s a subtle difference. You may think, well, CVS and SVN track changes too! They don’t. They track the changes that it takes to get from version to version of each of the different files, or of the different states of a directory. Git doesn’t work that way. Git focuses on changesets, encapsulating each changeset as a discrete unit, and those changesets can then be exchanged between repositories. We’re not trying to keep up-to-date with the latest version of something; instead the question is, do we have a given changeset applied or not? So there is no single master repository, just many working copies, each with its own combination of changesets.

Let me give an illustration to make this clear. Imagine that we have changes to a single document as changesets A, B, C, D, E, and F. The letter names are arbitrary; they just help us see what is going on.

REPO 1: A, B, C, D, E, F (all six changesets)
REPO 2: A, B, D, F (only four of them)
REPO 3: A, B, C, E
REPO 4: A, B, E, F

We could have a first repository that has all six of those changesets in it. We can have repository 2 that only has four of those changes in it. It’s not that it’s behind repository 1 or that it needs to be brought up-to-date; it simply doesn’t have the same changesets. We can have repository 3 that has A, B, C, and E, and repository 4 that has A, B, E, and F.
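To make the illustration concrete, here is a minimal sketch in Python that treats each repository as nothing more than a set of changeset letters. This is a toy model of the idea above, not how Git stores data internally, and the helper name `missing_from` is made up for this example.

```python
# Toy model: each repository is just the set of changesets it contains.
# The repository names and letters come from the illustration above.
repos = {
    "REPO 1": {"A", "B", "C", "D", "E", "F"},
    "REPO 2": {"A", "B", "D", "F"},
    "REPO 3": {"A", "B", "C", "E"},
    "REPO 4": {"A", "B", "E", "F"},
}


def missing_from(target, source):
    """Hypothetical helper: changesets that `source` has and `target` lacks."""
    return sorted(repos[source] - repos[target])


# Changesets that every repository happens to share.
shared = sorted(set.intersection(*repos.values()))

print("shared by all:", shared)
print("REPO 2 lacks from REPO 1:", missing_from("REPO 2", "REPO 1"))
print("REPO 3 lacks from REPO 4:", missing_from("REPO 3", "REPO 4"))
print("REPO 4 lacks from REPO 3:", missing_from("REPO 4", "REPO 3"))
```

Notice that repositories 3 and 4 each have something the other lacks, so neither one is simply "behind" the other. The only question the model can answer is whether a given changeset is present or not.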

No one of these is right and no one of these is wrong. It’s not that one of them is the master repository and the others are somehow out-of-date or out of sync with it. They are all just different repositories that happen to have different changes in them. We could just as easily add changeset G to repository 3 and then share it with any of the other repositories directly, whereas with CVS and SVN, for example, you would need to submit those changes to the central server and then people would need to pull down those changes to update their versions of the file.
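The changeset G example can be sketched the same way. The `Repo` class below, with its `commit` and `pull_from` methods, is a hypothetical stand-in written for this post and not a real Git API; it only shows that two repositories can swap a changeset with each other without any central server in between.

```python
class Repo:
    """Hypothetical stand-in for a repository: a name plus a set of changesets."""

    def __init__(self, name, changesets=()):
        self.name = name
        self.changesets = set(changesets)

    def commit(self, changeset):
        # Recording a change touches only this repository; nothing else is contacted.
        self.changesets.add(changeset)

    def pull_from(self, other):
        # Take whatever the other repository has that this one lacks.
        new = other.changesets - self.changesets
        self.changesets |= new
        return sorted(new)


repo1 = Repo("REPO 1", "ABCDEF")
repo3 = Repo("REPO 3", "ABCE")

repo3.commit("G")                  # G exists only in repository 3
received = repo1.pull_from(repo3)  # repository 1 takes it directly from repository 3

print("REPO 1 received:", received)
print("REPO 3 still lacks:", sorted(repo1.changesets - repo3.changesets))
```

After the exchange, repository 1 has G, while repository 3 still doesn't have D or F, and that is fine. In this model neither side had to be "brought up-to-date" before they could share work.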

Now by convention, we often do designate one repository as the master repository, but that’s not built into Git and it’s not part of the Git architecture. It’s just a convention: we say, okay, this is the master one, everyone is going to submit their changes to it, and we’re all going to try to stay in sync with it. Or we don’t have to. We can actually have three or four different master repositories with different versions and different features in them, and we could all contribute to those equally and just swap changes between them.

Because Git is distributed, it has a couple of advantages. There is no need to communicate with a central server, which makes things faster, and no network access is required to commit our changes, so we can work on an airplane.
There is no single point of failure. With CVS and SVN, if something goes wrong with the central repository, that can be a real showstopper for everyone who is working off of it.
With Git we don’t have that problem. Everyone can keep working, because each person has their own repository to work from, not just a copy that they’re trying to keep in sync with a central repository.

It also encourages participation and the forking of projects, which is important in the open-source community. Because developers can work independently, they can make changes, bug fixes, and feature improvements, and then submit those back to the project for either inclusion or rejection. And if you decide you don’t like the way an open-source project is going, you can fork it and take it in a completely different direction. You can say, you know what, I’m going to make a clean break and make my repository the one that I work from. All of my changes will be submitted there, and I can still pull changesets from the master one into my project whenever I want. But I don’t have to; I can go my own way. That becomes a really powerful and flexible feature that’s well suited to collaboration between teams, especially loose groups of distributed developers like you have in the open-source world.

Distributed version control is an important part of the Git architecture that you need to keep in mind, especially if you have previous experience with another version control system like CVS or SVN. We’ll talk a lot more about how Git tracks and merges these changesets as we go forward.

For now, just make sure that you understand that there is no central repository that we work from. All repositories are considered equal by Git; it’s just a matter of whether a repository has a given changeset in it or doesn’t.

Use this link to visualize git: https://git-school.github.io/visualizing-git/

Please refer to this video: https://www.linkedin.com/learning/git-essential-training-2012/about-distributed-version-control
