Version Control#

Notes on Version Control Systems (VCS)#

A Version Control System (VCS) is a software tool that helps developers manage changes to their code over time. As you develop software, you make many changes to your codebase. Sometimes, you might need to revisit an earlier version of your code, whether to understand the history of the project, to fix a bug, or to revert an unhelpful change. This is where version control systems come in. They allow you to track the history of your project, understand what changes were made, by whom, and why.

There are several reasons why version control systems are crucial in software (and web) development:

  • Collaboration: Multiple developers can work on the same project without stepping on each other’s toes. You can work on your part of the code, and others can work on theirs, without fear of overwriting each other’s changes.

  • Versioning: You can keep a historical record of all changes made to the project. If you introduce a bug, or if a feature doesn’t work as expected, you can always go back to a previous, working version of the code.

  • Backup: All the versions of your project are stored safely. If something happens to your local code, you can always retrieve the code from the VCS.

  • Documentation: When committing changes to the VCS, developers provide messages explaining why the change was made. These messages serve as a form of documentation that helps others understand the project history and make future decisions.

There are three types of version control systems:

  1. Local Version Control Systems: These systems have a simple database that keeps all the changes to files under revision control.

  2. Centralized Version Control Systems (CVCS): These systems contain a single, central repository. Developers get the latest version from the central repository and work on it. Once their changes are complete, they commit the changes to the central repository.

  3. Distributed Version Control Systems (DVCS): In this system, every developer has a complete copy of the entire project. This includes all the files, history, and versions. Developers work on their local repository and then push their changes to a shared remote repository when they are ready.

Git#

Git is one of the most popular Distributed Version Control Systems (DVCS) in use today. Created by Linus Torvalds, the creator of the Linux operating system, Git is a free and open-source tool that is widely used by both individual developers and large corporations. As a DVCS, Git allows every developer to have a complete copy of the entire project, including its history and all versions of every file. This design allows for powerful collaboration features while also providing the ability to work offline.

Some of the key features of Git are:

  • Efficiency and Speed: Git is designed to be fast. No matter the size of your project, you can expect quick operations for committing, branching, merging, and comparing past versions.

  • Data Integrity: Git uses a data model that ensures the cryptographic integrity of your projects. Every file and commit is checksummed and retrieved by its checksum when checked back out, which ensures that what you put into Git is precisely what you get out of it.

  • Non-linear Development: Git supports rapid branching and merging, and includes specific tools for visualizing and navigating a non-linear development history.

  • Fully Distributed: Every Git directory on every computer is a full-fledged repository with a complete history and full version-tracking capabilities, independent of network access or a central server.
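The data-integrity point above can be tried directly. The following is a minimal sketch, assuming Git is installed and on your PATH; the sample strings are made up for illustration. It uses git hash-object, which prints the checksum Git would use to name a piece of content, without storing anything.

```shell
# Git names every object by a hash of its content.
# `git hash-object --stdin` prints that name without storing anything.
hash_a=$(printf 'Hello, world!\n' | git hash-object --stdin)
hash_b=$(printf 'Hello, world!\n' | git hash-object --stdin)   # same bytes
hash_c=$(printf 'Hello, World!\n' | git hash-object --stdin)   # one letter changed

echo "same content:    $hash_a"
echo "same content:    $hash_b"
echo "changed content: $hash_c"
```

Identical content always gets the same name, and changing a single character gives a different one, which is how Git can notice corrupted or altered data.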

Explanation of Git’s Distributed Version Control System#

In a DVCS like Git, instead of downloading the latest version of the files in the project, you fully clone the repository. This means that if the server containing the central repository goes down, any of the client repositories can be copied back up to the server to restore it. Every clone is a full backup of all the data.

Furthermore, many operations are local, so they are fast. You have the entire project history, so you don’t need to communicate with a server to get the project’s history. And if you’re working offline or on a plane, you can still commit your changes locally and then upload them to the server when you have connectivity.
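To see that a clone really carries the full history, here is a small sketch that can be run in a scratch directory. The directory names (server, laptop), the file name, and the identity values are hypothetical; a local folder stands in for the server.

```shell
# Scratch directory so the sketch does not touch any real project.
work=$(mktemp -d)

# A stand-in for the "server" copy, with two commits.
git init -q "$work/server"
cd "$work/server"
git config user.name "Example User"
git config user.email "user@example.com"
echo "v1" > notes.txt && git add notes.txt && git commit -q -m "First version"
echo "v2" > notes.txt && git commit -q -am "Second version"

# Cloning copies the whole history, not just the latest files.
git clone -q "$work/server" "$work/laptop"
cd "$work/laptop"
clone_commits=$(git rev-list --count HEAD)
echo "commits available in the clone: $clone_commits"

# The history is still readable after the original is gone.
rm -rf "$work/server"
git log --oneline
```

The final git log runs after the original repository has been deleted, which illustrates both the backup property and the fact that reading history needs no server.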

Installing Git and Setting Up the Development Environment#

To start using Git, you must first install it on your computer. The process of installing Git will depend on your operating system:

  • For Windows, you can download Git from Git for Windows and then install it. This package also provides the Git Bash command line experience.

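  • For macOS, Git is included with Apple’s Xcode Command Line Tools (running git in the built-in Terminal prompts you to install them if they are missing), or you can install Git with Homebrew by running the command brew install git.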

  • For Linux, Git is usually available via your distribution’s package manager. For example, on Ubuntu or Debian, you can install Git with sudo apt-get install git.

After installing Git, you should configure it with your personal information. This is important because every Git commit uses this information:

git config --global user.name "Your Name"
git config --global user.email "youremail@example.com"
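To check what Git will record in your commits, you can read the values back. The sketch below is only an illustration: it uses a scratch repository and repository-local settings (no --global) so that it does not modify your real configuration, and the name and email are placeholders.

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q

# Without --global, the setting applies to this repository only.
git config user.name "Your Name"
git config user.email "youremail@example.com"

# Read the values back; a repository-local value overrides a global one.
configured_name=$(git config user.name)
configured_email=$(git config user.email)
echo "$configured_name <$configured_email>"
```

Running git config user.name with no value, as above, is also a quick way to confirm that your global settings were saved.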

Initializing a New Repository (git init)#

To start using Git, you need to initialize a Git repository in your project’s directory. This creates a new subdirectory named .git that contains all of your necessary repository files — a Git repository skeleton.

You can do this by navigating to your project’s directory in your terminal and running the following command:

git init
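As a quick check of what git init does, the following sketch initializes a repository in a temporary directory (a stand-in for your project’s directory) and looks for the .git subdirectory.

```shell
project=$(mktemp -d)     # stand-in for your project's directory
cd "$project"
git init -q

# The repository lives in the hidden .git subdirectory.
ls -a

# Ask Git whether the current directory is inside a working tree.
inside=$(git rev-parse --is-inside-work-tree)
echo "inside a Git working tree: $inside"
```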

Making Changes and Tracking Them (git add, git commit)#

Once you’ve initialized a repository, you can start tracking changes to your project. Let’s say you’ve just created a new file named index.html for your web development project. To start tracking changes to this file, you need to add it to Git using the git add command:

git add index.html

The git add command takes a path name for either a file or a directory; if it’s a directory, the command recursively adds all the files in that directory to the staging area, the set of changes that will go into your next commit.

After you’ve added your changes, you need to commit them to the Git repository. Commits are like snapshots of your project — they record the state of your project at a specific point in time. To commit your changes, you can use the git commit command:

git commit -m "Initial commit"

The -m option allows you to add a message to your commit, which should describe the changes you’ve made.
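The two-step add and commit cycle is easier to follow when you look at the repository status between the steps. This is a sketch in a scratch repository with a placeholder identity; it uses git status --porcelain, the script-friendly form of the short status output.

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

echo "<h1>Hello, world!</h1>" > index.html

before_add=$(git status --porcelain)    # "??" marks an untracked file
git add index.html
after_add=$(git status --porcelain)     # "A" marks a file staged for the next commit
git commit -q -m "Initial commit"
after_commit=$(git status --porcelain)  # empty: nothing left to commit

echo "before add:   $before_add"
echo "after add:    $after_add"
echo "after commit: $after_commit"
```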

Viewing the Change History (git log)#

Git keeps a history of all the changes you’ve committed. You can view this history using the git log command:

git log

This will display a list of all the commits you’ve made to the repository, starting with the most recent. For each commit, Git displays the commit hash, the author, the date, and the commit message.
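The full git log output can get long, so a few options are worth knowing. The sketch below builds a two-commit history in a scratch repository (file contents and identity are placeholders) and then shows a compact view.

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

echo "<h1>Hello</h1>" > index.html
git add index.html
git commit -q -m "Create index.html"
echo "<p>More content</p>" >> index.html
git commit -q -am "Add a paragraph"

# One line per commit: abbreviated hash and message, newest first.
git --no-pager log --oneline

# Only the most recent commit, printing just its message subject.
latest_subject=$(git log -1 --format=%s)
commit_count=$(git rev-list --count HEAD)
echo "latest: $latest_subject ($commit_count commits in total)"
```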

Practical Example: Developing a Simple Web Page and Tracking Changes with Git#

Let’s walk through a practical example. You’re developing a simple web page. You start by creating an index.html file:

<!DOCTYPE html>
<html>
<head>
    <title>My First Web Page</title>
</head>
<body>
    <h1>Hello, world!</h1>
</body>
</html>

After saving the file, you decide to track it with Git. You initialize a new Git repository, add your index.html file to it, and then make your first commit:

git init
git add index.html
git commit -m "Created index.html with a greeting"

Congratulations! You’ve just made your first commit with Git. As you continue to develop your web page, you can continue to make commits whenever you reach a point that you want to save.

In the next section, we’ll explore how you can use Git’s powerful branching features to safely experiment with new ideas.

Working with Branches in Git#

One of the powerful features of Git is its support for branches. In Git, a branch is a named, independent line of development: a series of commits with its own name (internally, just a movable pointer to the latest commit in that series). Each repository can have one or more branches. This allows you to move in different directions and explore different ideas concurrently.

When you create a branch in your project, you’re creating an environment where you can try out new ideas. Changes you make on a branch don’t affect the master branch (called main in many newer repositories) or any other branches, so you’re free to experiment and commit changes, safe in the knowledge that your branch won’t be merged until it’s ready.

To create a new branch, you use the command git branch followed by the name of the new branch:

git branch my-new-branch

To switch to an existing branch, you use the command git checkout followed by the name of the branch (Git 2.23 and later also offer git switch for the same purpose):

git checkout my-new-branch

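To delete a branch, you use the command git branch -d followed by the name of the branch. Git refuses to delete a branch that has unmerged commits unless you use -D instead, and you cannot delete the branch you are currently on: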

git branch -d my-old-branch
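The three branch commands above can be tried together in a scratch repository. This is a sketch with a placeholder identity; because the initial branch may be called master or main depending on your Git configuration, the script reads the name instead of assuming it.

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"
echo "hello" > index.html
git add index.html
git commit -q -m "Initial commit"

# The initial branch is "master" or "main" depending on configuration.
default_branch=$(git symbolic-ref --short HEAD)

git branch my-new-branch            # create; you stay on the current branch
git checkout -q my-new-branch       # switch to it
current=$(git symbolic-ref --short HEAD)
echo "now on: $current"

git checkout -q "$default_branch"   # a branch cannot be deleted while checked out
git branch -d my-new-branch         # succeeds: it has no unmerged commits
remaining=$(git branch --format='%(refname:short)')
echo "remaining branches: $remaining"
```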

Merging Branches#

When you’ve finished working on a branch and your changes are ready to be integrated into the main project, you need to merge your branch into the master branch (or any other target branch).

The command to merge branches is git merge followed by the name of the branch you want to merge:

git checkout master
git merge my-new-branch

Resolving Merge Conflicts#

Sometimes when you merge branches, Git won’t be able to reconcile the changes. This results in a merge conflict. Git will give you a message indicating that a conflict has occurred and marking the conflicting code.

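You’ll need to manually resolve these conflicts: open each conflicting file, decide which content to keep between the conflict markers (<<<<<<<, =======, and >>>>>>>), and delete the markers. Once you’ve resolved all the conflicts, you can finish the merge with git add and git commit.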
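A conflict is easy to provoke on purpose, which makes it a safe way to practice. The sketch below changes the same line on two branches in a scratch repository, lets the merge fail, and then resolves it. File contents, branch name, and identity are placeholders.

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"
echo "<h1>Hello</h1>" > index.html
git add index.html
git commit -q -m "Initial commit"
base=$(git symbolic-ref --short HEAD)   # "master" or "main"

# Change the same line differently on two branches.
git checkout -q -b my-new-branch
echo "<h1>Hello from the branch</h1>" > index.html
git commit -q -am "Change heading on branch"
git checkout -q "$base"
echo "<h1>Hello from $base</h1>" > index.html
git commit -q -am "Change heading on $base"

# The merge stops and reports a conflict (non-zero exit status).
git merge my-new-branch || echo "merge stopped with a conflict"
conflicted=$(git diff --name-only --diff-filter=U)
cat index.html        # shows the <<<<<<<, =======, >>>>>>> markers

# Resolve by writing the final content, then stage and commit.
echo "<h1>Hello from both</h1>" > index.html
git add index.html
git commit -q -m "Merge my-new-branch"
git --no-pager log --oneline --graph
```

If a merge goes wrong before you have committed the resolution, git merge --abort returns the repository to the state it was in before the merge started.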

Practical Example: Implementing a New Feature on a Separate Branch#

Imagine you’re developing a website and you want to experiment with adding a new feature. You don’t want your experiment to interfere with the main website, so you create a new branch for your feature:

git branch feature-branch
git checkout feature-branch

Now you can work on your new feature, make commits, and even push this branch to GitHub or any other remote repository.

Once the feature is ready, you can merge it back into the master branch:

git checkout master
git merge feature-branch

This workflow allows you to work on multiple features simultaneously, each on its own branch, and merge them back into the main project when they’re ready.

Reverting and Resetting Changes#

Git provides two main ways to undo changes: git revert and git reset.

  • git revert: The git revert command creates a new commit that undoes the changes made in a previous commit. This is the safe option for shared branches, because it adds to the commit history instead of rewriting it.

git revert [commit_hash]

  • git reset: The git reset command, on the other hand, moves the current branch back to an earlier commit, which discards the commits made after it. With --hard, it also throws away any uncommitted changes in your working directory. Be careful with this command: it rewrites your commit history, so it is best kept to private branches that you have not pushed.

git reset --hard [commit_hash]
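The difference between the two commands shows up in the commit count. The following sketch, run in a scratch repository with a placeholder identity and file, reverts the latest commit and then resets back to the very first one.

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"
echo "one" > file.txt && git add file.txt && git commit -q -m "Add line one"
echo "two" >> file.txt && git commit -q -am "Add line two"

# revert: history grows by one commit that undoes "Add line two".
git revert --no-edit HEAD
after_revert=$(git rev-list --count HEAD)     # 3 commits
content_after_revert=$(cat file.txt)          # back to "one"

# reset --hard: move the branch back to the first commit; the two later
# commits drop out of the branch, and uncommitted work would be lost.
first_commit=$(git rev-list --max-parents=0 HEAD)
git reset -q --hard "$first_commit"
after_reset=$(git rev-list --count HEAD)      # 1 commit

echo "after revert: $after_revert commits, after reset: $after_reset commit"
```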

Using .gitignore#

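A .gitignore file is a text file that tells Git which files or folders to ignore in a project. This can be useful when you have files that you don’t want Git to track or include in your repository, such as log files, dependencies, system files, or IDE configuration files. Note that .gitignore only affects untracked files: a file that Git already tracks stays tracked until you remove it from the index with git rm --cached.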

To create a .gitignore file, simply create a new file named .gitignore in your project’s root directory. Then, in this file, you can list the files or directories that you want Git to ignore. For example:

# .gitignore file
node_modules/
log.txt
.DS_Store

This will tell Git to ignore the node_modules directory, the log.txt file, and the .DS_Store file.
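To confirm that the patterns behave as expected, you can ask Git directly. This sketch recreates the example .gitignore in a scratch repository with a few made-up files, then uses git status and git check-ignore to see what is ignored.

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q

# Same patterns as the example above.
printf 'node_modules/\nlog.txt\n.DS_Store\n' > .gitignore

mkdir node_modules
echo "dependency" > node_modules/lib.js
echo "debug output" > log.txt
echo "<h1>Hello</h1>" > index.html

# Ignored paths do not appear as untracked files.
untracked=$(git status --porcelain)
echo "$untracked"

# check-ignore reports which pattern matches a given path.
git check-ignore -v log.txt node_modules/lib.js
```

Only .gitignore itself and index.html show up as untracked; the .gitignore file is normally committed so that everyone working on the project shares the same rules.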

In the next section, we’ll move from Git to GitHub, a platform that brings the power of Git to the cloud, allowing for collaboration and sharing of code in a way that’s accessible to the entire development community.

GitHub#

While Git is a command-line tool that tracks changes to files in a local repository on your computer, GitHub is a web-based platform that takes Git’s functionality and extends it to provide additional features that facilitate collaboration among developers.

In a nutshell, GitHub is a Git repository hosting service with a web-based graphical interface. It also provides access control and several collaboration features, such as wikis and basic task management tools for every project.

GitHub can also be thought of as a social network for developers. You can make your projects public and allow others to contribute to them, and you can contribute to other public projects. On GitHub, you can see what changes others have made and offer feedback. You can also make your own changes and submit them for review. This is done through a process called a “pull request”, which we will cover later.

It’s important to understand that while Git and GitHub are related, they are not the same thing. Git is a version control system that lets you manage and keep track of your source code history, and GitHub is a hosting service for Git repositories. So they are designed to work together, but they can also be used independently.

Importance of GitHub in Open Source and Collaborative Projects#

GitHub is particularly important in the open-source community as it hosts over 100 million repositories, many of which are open-source projects. Open-source projects allow anyone in the world to view, use, modify, and distribute the project’s source code.

Some of the largest and most influential open-source projects are hosted on GitHub. Developers from all over the world can contribute to these projects, making them better for everyone.

Creating an Account and Setting Up a Repository on GitHub#

To start using GitHub, you first need to create a GitHub account:

  1. Go to the GitHub homepage.

  2. Click on the ‘Sign Up’ button and fill in your details.

After you’ve set up your account, you can create a new repository:

  1. Click on the ‘+’ icon at the top right of the GitHub interface and select ‘New repository’.

  2. Name your repository, provide a short description, choose to make it public or private, and click on ‘Create repository’.

Cloning a Repository#

To get a copy of a repository from GitHub to your local machine, you use the git clone command followed by the URL of the repository. This creates a directory with your project’s name, sets up a .git directory inside it, pulls down all the data for that repository, and checks out a working copy of the latest version.

For example, to clone a repository named “my_project”:

git clone https://github.com/username/my_project.git

Pushing and Pulling Changes to/from GitHub#

When you make changes to your local repository and want to update your remote repository on GitHub, you use the git push command.

Before you can push your changes, you need to stage and commit them:

git add .
git commit -m "Add a descriptive message for the changes made"

Then you can push your changes:

git push origin main

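Here, “origin” is the default name Git gives to the remote repository you cloned from, and “main” is the name of the branch you are pushing (use master instead if that is what your repository’s default branch is called).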

To update your local version with the latest changes made to the repository on GitHub, you use the git pull command:

git pull origin main
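You can rehearse the push and pull cycle without a GitHub account. In the sketch below, a bare repository on disk stands in for the GitHub remote, and two clones (named alice and bob purely for illustration) stand in for two developers’ machines. The branch name is read from Git because it may be main or master.

```shell
work=$(mktemp -d)
cd "$work"

# A bare repository stands in for the GitHub remote in this sketch.
git init -q --bare remote.git

# First clone: commit and push.
# (Git warns that the cloned repository is empty; that is expected here.)
git clone -q remote.git alice
cd alice
git config user.name "Alice Example"
git config user.email "alice@example.com"
branch=$(git symbolic-ref --short HEAD)     # "main" or "master"
echo "<h1>Hello</h1>" > index.html
git add index.html
git commit -q -m "Add index.html"
git push -q origin "$branch"

# Second clone: starts with that commit, later pulls a newer one.
cd "$work"
git clone -q remote.git bob
cd "$work/alice"
echo "<p>Update</p>" >> index.html
git commit -q -am "Add a paragraph"
git push -q origin "$branch"

cd "$work/bob"
git pull -q --ff-only origin "$branch"
bob_commits=$(git rev-list --count HEAD)
echo "bob now has $bob_commits commits"
```

The --ff-only option makes the pull succeed only when your local branch can simply be moved forward, which is a cautious default while you are learning.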

Understanding GitHub Issues and Pull Requests#

Issues in GitHub are a great way to keep track of tasks, enhancements, and bugs for your projects. They’re a place to start a conversation about enhancements or bugs, or other project-related ideas.

A pull request (PR) is a method of submitting contributions to a software project. It occurs when a developer asks that changes they’ve made to a piece of code be considered for inclusion in the main project’s codebase.

After you’ve made changes or additions on a branch in your repo, you can ask the project maintainer to pull in your contribution, hence the term “pull request.”

Practical Example: Collaborating on a Web Project Using GitHub#

Suppose you and a friend are working on a web project. You could start by creating a new repository on GitHub. Then, both you and your friend can clone the repository to your local machines. You can each work on different features in different branches.

When you complete your feature, you commit your changes, push the feature branch to GitHub, and create a pull request. Your friend can then review your changes, provide comments, and finally merge your changes into the main branch. Similarly, you can review and merge your friend’s changes.
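The command-line half of this workflow can be sketched as follows. A bare repository on disk stands in for the shared GitHub repository, and the branch name contact-page, the file names, and the identity are hypothetical. Opening and merging the pull request itself happens in the GitHub web interface and is not shown.

```shell
work=$(mktemp -d)
cd "$work"
git init -q --bare shared.git            # stand-in for the GitHub repository
git clone -q shared.git me               # warns that the repository is empty
cd me
git config user.name "Example User"
git config user.email "user@example.com"
main_branch=$(git symbolic-ref --short HEAD)
echo "<h1>Site</h1>" > index.html
git add index.html
git commit -q -m "Initial commit"
git push -q origin "$main_branch"

# Work on a feature branch and publish it.
git checkout -q -b contact-page
echo "<h1>Contact</h1>" > contact.html
git add contact.html
git commit -q -m "Add contact page"
git push -q -u origin contact-page       # -u records origin/contact-page as upstream

# The branch now exists on the remote, ready for a pull request.
remote_heads=$(git ls-remote --heads origin)
echo "$remote_heads"
```

Because of the -u option, later pushes from this branch need only git push, with no remote or branch name.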

GitHub Advanced Features#

GitHub is more than just a platform for hosting Git repositories. It also offers a range of advanced features that can significantly improve your development workflows. In this section, we’ll explore some of these features and see how they can be used in web development.

Forking a Repository#

Forking is a feature of hosting platforms like GitHub rather than of Git itself. When you “fork” a repository, you’re creating a copy of the repository under your own GitHub account. This allows you to freely experiment with changes without affecting the original project.

Forking is often used to propose changes to someone else’s project. You can make your changes in your fork and then create a pull request in the original repository to propose your changes.

To fork a repository, simply click the “Fork” button in the upper right corner of the repository page on GitHub.

Using GitHub Actions for Continuous Integration/Continuous Deployment (CI/CD)#

GitHub Actions is a feature that allows you to automate your software development workflows directly in your GitHub repository. You can write individual tasks, called “actions,” and combine them to create a custom workflow. Workflows are custom automated processes that you can set up in your repository to build, test, package, release, or deploy any project on GitHub.

For example, you can set up a workflow that runs every time someone pushes a commit to your repository. This workflow could run a series of tests and then automatically deploy your web application to a live server if the tests pass.

To create a GitHub Actions workflow, you need to create a workflow file in your repository. This file is stored in the .github/workflows directory and is written in YAML. A simple workflow file might look like this:

name: Deploy to Live Server

on: [push]

jobs:
  build-and-deploy:
    runs-on: ubuntu-latest
    steps:
    - name: Checkout code
      uses: actions/checkout@v4
    - name: Install dependencies
      run: npm install
    - name: Run tests
      run: npm run test
    - name: Deploy to server
      run: ./deploy.sh

This is a basic example. GitHub Actions is a powerful tool, and you can create complex workflows that fit your project’s needs.

Using GitHub Pages for Web Hosting#

GitHub Pages is a static site hosting service that takes HTML, CSS, and JavaScript files straight from a repository on GitHub and publishes them as a website. It’s an excellent tool for hosting your project’s documentation—or for that matter, any other static content.

To create a GitHub Pages site, you need a repository (new or existing) that contains your site’s files. A common convention is to create a branch named gh-pages and push your site’s files to that branch; the branch that GitHub publishes from can be chosen in the repository’s Pages settings. Within a few minutes, your site will be accessible at https://<yourgithubusername>.github.io/<repositoryname>.

GitHub Pages supports Jekyll, a static site generator, out of the box, allowing you to automatically create websites from Markdown files. It’s also compatible with other static site generators and plain HTML/CSS/JavaScript projects.

Practical Example: Deploying a Web Project Using GitHub Pages and Maintaining It Through CI/CD#

In the following chapters, we’ll walk through a practical example where we’ll create a small web project, deploy it using GitHub Pages, and set up a simple CI/CD workflow to automatically test and update the live site whenever we push changes to the repository.

Other Version Control Systems#

(OPTIONAL READING, this section will not be on exams)

While Git is the most popular version control system, it’s not the only one. There are several other version control systems that developers use, and it can be helpful to know about them and understand the differences.

Mercurial#

Mercurial is another distributed version control system that’s similar to Git. It’s known for its simplicity and ease of use. Mercurial’s commands and operation are often simpler than Git’s, which makes it a good choice for beginners or for projects that don’t require Git’s more advanced features.

However, Mercurial isn’t as widely adopted as Git, and it doesn’t have a centralized hosting service as popular as GitHub. This means that while using Mercurial might be easier, collaborating with others can be more challenging if they’re using Git and GitHub.

Subversion#

Subversion, often abbreviated as SVN, is a centralized version control system. Unlike Git and Mercurial, which allow every developer to have a complete copy of the repository, Subversion requires a connection to the central repository for most operations.

Subversion is older than Git and was widely used before Git’s release. It’s known for its simplicity, especially in linear workflows. However, it lacks Git’s flexibility when it comes to branching and merging. SVN can be a good choice for projects with a simple, linear development workflow, but for complex projects with many developers and branches, Git is usually the better choice.

Comparing and Contrasting with Git#

Both Mercurial and Subversion have their strengths and weaknesses compared to Git. Here are some key differences:

  • Ease of use: Both Mercurial and Subversion are often considered easier to learn and use than Git. This is because they have simpler commands and fewer “gotchas.” However, this simplicity comes at the cost of features.

  • Features: Git has more features than either Mercurial or Subversion. This includes advanced features like staging areas, multiple workflows, local branching, and more. However, these features can make Git more complex and harder to learn.

  • Popularity: Git is the most popular version control system and is becoming even more so. This means that most new projects are likely to use Git. It also means that there are more resources for learning Git and more tools that integrate with Git.

In the end, the best version control system depends on your needs, the needs of your project, and your personal preference. While Git is the most popular choice, Mercurial and Subversion are both excellent tools that might be the perfect fit for certain situations.