Git introduction#

A version control system lets you keep a record of every change made to a given set of files, and also encourages and eases collaboration. It works best with plain text files (source code, LaTeX, txt, etc.).

In this module we will learn the basics of version control by using git. At the end, you will be able to use git for source code version control with a bitbucket repository.

Introduction#

Git:

  • is a decentralized version control system. Every contributor has a full copy of the repository and makes changes locally. The contributor then sends those changes to the shared (master) repository; if no conflicts appear, the changes are merged and can be propagated to the other developers.

  • was started by Linus Torvalds to manage Linux kernel development, after the license of the system used until then, BitKeeper, became too restrictive.

  • is very fast, and branching is cheap and a natural part of the workflow.

  • works like a small file system that stores snapshots of your files at each commit.

  • identifies every commit with a SHA-1 hash such as 24b9da6552252987aa493b52f8696cd6d3b00373.

  • has a typical workflow (inside an existing git repo) of edit, stage, commit.
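
As a quick preview of the edit, stage, commit cycle and of the commit hash, here is a minimal sketch. It assumes git is installed and works in a throwaway directory; the identity and the file content are placeholders, and every command is explained in the sections below.

```shell
set -e
repo=$(mktemp -d)                      # throwaway directory (assumption)
cd "$repo"
git init -q
git config user.name "Example User"    # placeholder identity
git config user.email "user@example.com"

echo "Hello World" > Readme.txt        # edit
git add Readme.txt                     # stage
git commit -q -m "First commit"        # commit

commit_hash=$(git rev-parse HEAD)      # identifier of the commit just created
echo "$commit_hash"
echo "${#commit_hash} hexadecimal characters"
```

The hash you get will differ from the one shown in the list, since it depends on the content, author, and date of the commit.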

NOTE: Getting help. If you want to read the manual for a given git command, such as commit, use

    git help commit

The first repo#

Open the terminal. Create a directory for this first example repository:

    $ mkdir myfirstrepo
    $ cd myfirstrepo

Now just initialize your git repo:

    $ git init

That’s all. You now have a git repository ready to be used. What is inside the repo?

    $ ls

Nothing? What about

    $ ls -a

There should be a sub-directory called .git, which holds all the information about your repo. You are not supposed to directly modify any content of that hidden directory.

What does this mean?

    $ git init <directory>

You can also clone an existing repository with

    $ git clone https://github.com/path/to/my-project.git

This will clone the my-project.git repo into the my-project directory. Cloning is the most natural way to start working locally with a remote repository. For example, assume you have a local repository at the University. You commit your work and send it to a remote repository hosted, for example, at bitbucket or github. Then you go home, where you want a copy of the remote repository to keep working on it: you clone it. After a successful clone you will have three copies of your repository: one at the University, one at home, and one at the remote location. You can now make some changes at home, send them to the remote repository (with git push), and later, back at the University, get those updates from the remote (with git pull). At that point all the repositories are in sync. In this way you can keep several copies of your remote repository, all of them synced. Do not forget to start your work with a git pull and end it with a git push.
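
The University/home scenario can be tried out on a single machine. The sketch below is only an illustration under some assumptions: a local bare repository stands in for the bitbucket or github remote, the directory names university and home are made up, and the identity is a placeholder.

```shell
set -e
work=$(mktemp -d)                               # throwaway sandbox directory
git init -q --bare "$work/remote.git"           # local stand-in for the hosted remote

# At the "University": clone, commit, and push
git clone -q "$work/remote.git" "$work/university" 2>/dev/null
cd "$work/university"
git config user.name "Example User"             # placeholder identity
git config user.email "user@example.com"
echo "Written at the University" > Readme.txt
git add Readme.txt
git commit -q -m "Add Readme.txt"
branch=$(git rev-parse --abbrev-ref HEAD)       # master or main, depending on your setup
git push -q origin "$branch"

# At "home": clone the remote, change something, push it back
git clone -q "$work/remote.git" "$work/home"
cd "$work/home"
git config user.name "Example User"
git config user.email "user@example.com"
echo "Edited at home" >> Readme.txt
git commit -q -a -m "Edit Readme.txt at home"
git push -q origin "$branch"

# Back at the "University": start the session with a pull
cd "$work/university"
git pull -q origin "$branch"
last_line=$(tail -n 1 Readme.txt)
echo "$last_line"                               # the line written at home
```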

Configuring git (individual repo or installation)#

The git config command lets you configure the preferences of an individual repository or the global options of the installation. To read the help, please write

    $ git help config

Some examples:

  • Configure the user name for the current repository:

    $ git config user.name <name>

  • Or configure it for the current user (globally, not just for this repo):

    $ git config --global user.name <name>

  • Configure the email:

    $ git config --global user.email <email>

  • Open the global config options in the default editor:

    $ git config --global --edit

NOTE : Local config is stored at .git/config. Global config for the user is stored at ~/.gitconfig. Finally, the file $(prefix)/etc/gitconfig stores the system-wide settings.

NOTE : The file .gitignore lets you specify files, or patterns of files, to be ignored by git.

NOTE : All these files are plain text files. You can edit them directly (not recommended) or use the config command to edit them.
Examples:

    $ # Tell Git who you are
    $ git config --global user.name "John Smith"
    $ git config --global user.email john@example.com

    $ # Select your favorite text editor
    $ git config --global core.editor vim

    $ # Add some SVN-like aliases
    $ git config --global alias.st status
    $ git config --global alias.co checkout
    $ git config --global alias.br branch
    $ git config --global alias.up rebase
    $ git config --global alias.ci commit

A configuration file typically looks like this (DO NOT EDIT IT by hand):

    [user] 
    name = John Smith
    email = john@example.com
    [alias]
    st = status
    co = checkout
    br = branch
    up = rebase
    ci = commit
    [core]
    editor = vim
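
If you want to see where a setting ends up without touching your global configuration, the following sketch writes two options at the repository level (no --global flag) and reads them back. The throwaway directory and the values are placeholders.

```shell
set -e
repo=$(mktemp -d)                     # throwaway directory (assumption)
cd "$repo"
git init -q

git config user.name "John Smith"     # no --global: stored in this repo only
git config alias.st status

local_name=$(git config --local --get user.name)
echo "$local_name"
cat .git/config                       # the [user] and [alias] sections now appear here
```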

Committing files to the repo#

The typical pattern for working inside a repo is edit/stage/commit. You make some edits, select those that are logically related and working, put them in the staging area, and finally commit them to the repository. Files can therefore be in three different states: edited but not staged, staged but not committed, and committed. The command git add puts files into the staging area; in other words, it prepares them to be committed.

    $ git add <file>     # adds file <file> to the staging area; it is not yet committed

or

    $ git add <directory>    # adds all files inside <directory> to the staging area

For this example, create a new file called Readme.txt with any editor (the command below uses Visual Studio Code) and add some content to it, such as “Hello World”:

    $ code Readme.txt

Currently, Readme.txt is just a file inside our working directory; it is not yet tracked by the repository. Let’s check the repository status

    $ git status 

Read the output, what does it mean?

Now let’s add the file to the repo:

    $ git add Readme.txt

Now the file is in the staging area. We have marked it to be added to the repository, but it is still not in the repo. We need to commit it.

    $ git commit -m "First commit. Adding Readme.txt file"

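This command commits the file Readme.txt to the repo with the given log message (the text after the -m flag). With the -a flag, git commit first stages every modified or deleted file that is already tracked, so those changes are committed without a separate git add; new files must still be added explicitly. Check git help commit.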
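
The three states can be followed step by step with the short form of git status. This is a sketch in a throwaway directory with a placeholder identity; the two-letter codes in the comments are the ones printed by git status --short.

```shell
set -e
repo=$(mktemp -d)                         # throwaway directory (assumption)
cd "$repo"
git init -q
git config user.name "Example User"       # placeholder identity
git config user.email "user@example.com"

echo "Hello World" > Readme.txt
before_add=$(git status --short)          # "?? Readme.txt": untracked
git add Readme.txt
after_add=$(git status --short)           # "A  Readme.txt": staged, not committed
git commit -q -m "First commit. Adding Readme.txt file"
after_commit=$(git status --short)        # empty: nothing left to commit

echo "$before_add"
echo "$after_add"
```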

TIP: Always make small and frequent commits. Do not make large commits with many changes, since that makes it harder to revert or track down a problematic change, among other difficulties. Commit as often as possible.

NOTE : The staging area lets you group related changes, possibly spread across several files, into a single commit. It is a kind of buffer between your edits and the repository.

NOTE : Git takes snapshots when it commits. That is, it stores the full files, not just file diffs.

NOTE : Check the output of git log. Read its manual. What is it useful for?

Exercise > Make several changes to the Readme.txt file, and commit each change with an appropriate log message. Possibly, add another file and then commit it to the repository.

The git log command#

The git log command lets you explore the history of commits. In contrast, git status gives information about the current status of the repo: staged files, changes, new files, etc.

Usage:

    $ git log # Normal usage

    $ git log -n <limit>   # Print only the last <limit> commits 

    $ git log --oneline  # Condense each commit to a single line

    $ git log --grep="<pattern>"  # search for commits with pattern <pattern>
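
To try these options on a known history, the sketch below creates three commits in a throwaway repository and counts what each form of git log returns. The commit messages and the identity are placeholders.

```shell
set -e
repo=$(mktemp -d)                         # throwaway directory (assumption)
cd "$repo"
git init -q
git config user.name "Example User"       # placeholder identity
git config user.email "user@example.com"

for n in 1 2 3; do
    echo "change $n" >> Readme.txt
    git add Readme.txt
    git commit -q -m "Change number $n"
done

git log --oneline                                              # one line per commit
total=$(git log --oneline | grep -c .)                         # 3
last_two=$(git log -n 2 --oneline | grep -c .)                 # 2
matching=$(git log --oneline --grep="number 2" | grep -c .)    # 1
echo "$total $last_two $matching"
```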

Undoing changes#

One of the advantages of a version control system is the possibility to go back to a previous version of a file if a commit is found to be faulty. There are several ways to do that. For example, to get a specific revision of a file, we could use

    git checkout <commit> <file> # overwrites <file> in the working directory and staging area with its version from <commit>

To check out a specific revision of the whole repository, use

    git checkout <commit>

This last one puts you in a read-only view of the old revision (a "detached HEAD") and does not change the history of any branch. You can go back to the latest state with git checkout master, which moves you to the master branch.
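
Both forms of git checkout can be compared in a small experiment. This is a sketch under assumptions: a throwaway repository, a placeholder identity, and two made-up versions of Readme.txt. The branch name is read from git, since it may be master or main depending on your setup.

```shell
set -e
repo=$(mktemp -d)                         # throwaway directory (assumption)
cd "$repo"
git init -q
git config user.name "Example User"       # placeholder identity
git config user.email "user@example.com"

echo "version 1" > Readme.txt
git add Readme.txt
git commit -q -m "Version 1"
first=$(git rev-parse HEAD)               # hash of the first commit
echo "version 2" > Readme.txt
git commit -q -a -m "Version 2"
branch=$(git rev-parse --abbrev-ref HEAD)

# Form 1: bring back one file from an old commit (the change is staged)
git checkout -q "$first" -- Readme.txt
restored=$(cat Readme.txt)                # version 1
git checkout -q "$branch" -- Readme.txt   # undo: take the file from the branch tip again

# Form 2: look at the whole old revision (detached HEAD)
git checkout -q "$first"
where=$(git rev-parse --abbrev-ref HEAD)  # "HEAD": not on any branch
git checkout -q "$branch"                 # back to the latest state
latest=$(cat Readme.txt)                  # version 2
echo "$restored / $where / $latest"
```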

If you want to undo a specific commit, use

    git revert <commit>

This generates a new commit which undoes all changes introduced in commit <commit>, and applies it to the current branch. Note that this can produce a conflict if you revert an older commit whose changes were modified again by later commits.
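
A minimal sketch of a revert, assuming a throwaway repository with a placeholder identity and two made-up commits. The --no-edit flag accepts the default log message instead of opening the editor.

```shell
set -e
repo=$(mktemp -d)                         # throwaway directory (assumption)
cd "$repo"
git init -q
git config user.name "Example User"       # placeholder identity
git config user.email "user@example.com"

echo "line 1" > Readme.txt
git add Readme.txt
git commit -q -m "Add line 1"
echo "line 2" >> Readme.txt
git commit -q -a -m "Add line 2"

git revert --no-edit HEAD > /dev/null     # new commit that undoes "Add line 2"

content=$(cat Readme.txt)                 # only "line 1" is left
commits=$(git rev-list --count HEAD)      # 3: history grows, nothing is deleted
git log --oneline
```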

NOTE: Another option, not advised, is to use the git reset <commit> command. It moves the current branch back to revision <commit>, dropping ALL later commits from the branch history, including possibly correct ones. In contrast, git revert <commit> preserves the history and the commits between the current revision and the one being reverted. Git reset, in turn, also has very useful applications: git reset --hard discards all uncommitted modifications to tracked files and returns the working directory to the state of the last commit, and git reset <file> removes a specific file from the staging area without touching its contents.
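
The two safe uses of git reset (unstaging a file and discarding uncommitted edits) can be sketched as follows. The repository is a throwaway one, and the file names and identity are placeholders.

```shell
set -e
repo=$(mktemp -d)                         # throwaway directory (assumption)
cd "$repo"
git init -q
git config user.name "Example User"       # placeholder identity
git config user.email "user@example.com"
echo "Hello World" > Readme.txt
git add Readme.txt
git commit -q -m "First commit"

# Remove a file from the staging area; the file itself stays on disk
echo "draft" > notes.txt
git add notes.txt
git reset -q -- notes.txt
status_after_reset=$(git status --short)  # "?? notes.txt": untracked again

# Discard uncommitted edits to tracked files (this cannot be undone)
echo "unwanted edit" >> Readme.txt
git reset -q --hard
content=$(cat Readme.txt)                 # back to "Hello World"
echo "$status_after_reset / $content"
```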

To remove untracked files from the current working directory, you can use the git clean command (git clean -n lists what would be deleted, git clean -f actually deletes). This cannot be undone: the untracked files are deleted forever.
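
Since git clean is destructive, it is worth doing a dry run first. A small sketch, assuming a throwaway repository and a made-up untracked file called scratch.tmp:

```shell
set -e
repo=$(mktemp -d)                         # throwaway directory (assumption)
cd "$repo"
git init -q
git config user.name "Example User"       # placeholder identity
git config user.email "user@example.com"
echo "Hello World" > Readme.txt
git add Readme.txt
git commit -q -m "First commit"

touch scratch.tmp                         # untracked file
preview=$(git clean -n)                   # dry run: nothing is deleted yet
echo "$preview"
git clean -f -q                           # now the untracked file is really deleted
ls
```

Tracked files such as Readme.txt are not touched by git clean.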

Exercise > After performing several commits, revert the last commit. What happened to the other files (if you added other ones)? Check the log. You can also go back to the first revision of that file.

Git branches#

Branches in git are a natural part of the daily job. They let you keep several development paths that can be merged into the main branch at any time, and are very handy for experimental testing and for developing new features. Commits and history are independent for each branch until the branches are merged. Three basic commands are needed here: git branch, git checkout, and git merge.

Examples :

    $ git branch    # lists all branches in the repository
    $ git branch <branchname>  # creates a branch called <branchname>
    $ git branch -d <branch>   # deletes a specific branch
    $ git branch -m <branch>   # renames the current branch to <branch>

To switch to a specific branch, use the command git checkout <branch>. Other uses:

    $ git checkout -b <new-branch>    # creates and checks out the new branch
    $ git checkout -b <new-branch> <existing-branch>  # creates and checks out the new branch, starting from <existing-branch> instead of the current branch

Example/Exercise:

    $ git branch new-feature
    $ git checkout new-feature
    $ # Edit some files
    $ git add <file>
    $ git commit -m "Started work on a new feature"
    $ # Repeat
    $ git checkout master

After you have decided that the work in an experimental branch is good enough to be used, you can merge the changes from that branch into any branch of reference (like the master one). To do so, check out the reference branch and use

    git merge <branch>

This will merge the changes from branch <branch> into the current one. Hopefully, this won't generate any inconsistencies. Otherwise, you will be forced to solve them manually. Merges can be fast-forward (if the current branch has no new commits since <branch> was created, so the history is linear) or three-way (if both branches have new commits).

Resolving conflicts: Sometimes the same part of a file is edited by two commits on different branches. This generates a merge conflict. Since git cannot decide which version to keep, it stops and the conflict must be resolved manually by editing the file. After you finish the modification, use git add on the conflicting file to tell git you have resolved the problem. Then run a normal git commit.
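
A conflict is easy to provoke on purpose, which is a good way to practice resolving one. The sketch below assumes a throwaway repository, a placeholder identity, and made-up file contents; the "manual" resolution is simulated by overwriting the file.

```shell
set -e
repo=$(mktemp -d)                               # throwaway directory (assumption)
cd "$repo"
git init -q
git config user.name "Example User"             # placeholder identity
git config user.email "user@example.com"
echo "original line" > Readme.txt
git add Readme.txt
git commit -q -m "Base version"
main_branch=$(git rev-parse --abbrev-ref HEAD)  # master or main

git checkout -q -b new-feature
echo "feature line" > Readme.txt
git commit -q -a -m "Edit on new-feature"

git checkout -q "$main_branch"
echo "main line" > Readme.txt
git commit -q -a -m "Edit on the main branch"

# Both branches changed the same line, so the merge stops with a conflict
git merge new-feature > /dev/null || true
conflicted=$(git diff --name-only --diff-filter=U)   # files still in conflict
echo "$conflicted"

# Resolve by hand (simulated here), then add and commit
echo "main line and feature line" > Readme.txt
git add Readme.txt
git commit -q -m "Merge new-feature, resolving the conflict"
git log --oneline
```

Before you resolve it, the conflicting file contains markers (<<<<<<<, =======, >>>>>>>) around the two competing versions.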

Example for a fast-forward merge:

    # Start a new feature
    git checkout -b new-feature master

    # Edit some files
    git add <file>
    git commit -m "Start a feature"

    # Edit some files
    git add <file>
    git commit -m "Finish a feature"

    # Merge in the new-feature branch
    git checkout master
    git merge new-feature
    git branch -d new-feature

Example for a three-way merge:

    # Start a new feature
    git checkout -b new-feature master

    # Edit some files
    git add <file>
    git commit -m "Start a feature"

    # Edit some files
    git add <file>
    git commit -m "Finish a feature"

    # Develop the master branch
    git checkout master

    # Edit some files
    git add <file>
    git commit -m "Make some super-stable changes to master"

    # Merge in the new-feature branch
    git merge new-feature
    git branch -d new-feature
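
One way to see the difference between the two kinds of merge is to count merge commits afterwards: a fast-forward merge creates none, a three-way merge creates one. The following is a sketch in a throwaway repository; the branch names feature-a and feature-b, the files, and the identity are made up.

```shell
set -e
repo=$(mktemp -d)                               # throwaway directory (assumption)
cd "$repo"
git init -q
git config user.name "Example User"             # placeholder identity
git config user.email "user@example.com"
echo "base" > Readme.txt
git add Readme.txt
git commit -q -m "Base version"
main_branch=$(git rev-parse --abbrev-ref HEAD)  # master or main

# Fast-forward: the main branch did not move while feature-a was developed
git checkout -q -b feature-a
echo "a" > a.txt
git add a.txt
git commit -q -m "Add a.txt"
git checkout -q "$main_branch"
git merge -q feature-a
merges_after_ff=$(git rev-list --merges --count HEAD)      # 0

# Three-way: both branches get new commits before the merge
git checkout -q -b feature-b
echo "b" > b.txt
git add b.txt
git commit -q -m "Add b.txt"
git checkout -q "$main_branch"
echo "c" > c.txt
git add c.txt
git commit -q -m "Add c.txt on the main branch"
git merge -q --no-edit feature-b
merges_after_3way=$(git rev-list --merges --count HEAD)    # 1

echo "$merges_after_ff $merges_after_3way"
git log --oneline --graph
```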

TIP : There are several graphical clients that let you follow the branch commits visually. One example is gitk, and there are several others. But you should be able to manage from the plain command line.

TIP : Another visual option is to use the bitbucket page for your repo, where all the code, commits, branches, etc. can be visualized.

Modifying the history#

It is possible to amend the last commit with the command git commit --amend. This lets you either combine the staged changes with the previous commit or simply modify its log message. An amended commit is a completely new commit with a new hash; it replaces the original commit in the branch history. Other commands to check: git rebase, git rebase -i, git reflog.
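
The effect of an amend on the hash can be checked directly. This sketch assumes a throwaway repository and a placeholder identity, and fixes a deliberately misspelled log message.

```shell
set -e
repo=$(mktemp -d)                         # throwaway directory (assumption)
cd "$repo"
git init -q
git config user.name "Example User"       # placeholder identity
git config user.email "user@example.com"

echo "Hello World" > Readme.txt
git add Readme.txt
git commit -q -m "Frist commit"           # typo on purpose
old_hash=$(git rev-parse HEAD)

git commit -q --amend -m "First commit"   # fix the log message
new_hash=$(git rev-parse HEAD)

count=$(git rev-list --count HEAD)        # still 1 commit in the branch history
old_type=$(git cat-file -t "$old_hash")   # the old commit object still exists for now
echo "$old_hash -> $new_hash"
git reflog                                # lists both the original and the amended commit
```

Because the hash changes, avoid amending commits that you have already pushed to a shared remote.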

Remote repositories#

Git's decentralized approach lets you have your own copy of the repository, where you can develop new commits and/or branches and then push them to remote repositories, while also pulling the changes other developers have made.

In this regard, the git remote command lets you create, view, and delete connections to other repositories. First, let's check the help

    git help remote

The command git remote alone will list all connections you have in your repo to other repos. The command

    git remote add <name> <url>

will create a new connection to another repo located at <url>, and allows you to use <name> as a convenient shortcut.
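
As an illustration, the sketch below connects a local repository to a second repository on the same machine. The remote name team and the paths are made up; with a hosted service the URL would be an https or ssh address instead of a local path.

```shell
set -e
work=$(mktemp -d)                         # throwaway sandbox directory (assumption)
git init -q --bare "$work/shared.git"     # local stand-in for a hosted repository
git init -q "$work/local"
cd "$work/local"

git remote add team "$work/shared.git"    # "team" is now a shortcut for that location

remotes=$(git remote)                     # names of all connections
url=$(git config --get remote.team.url)   # where the shortcut points to
git remote -v                             # names together with their URLs
```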

What do the following commands do?

    git remote rm <name>
    git remote rename <old-name> <new-name>

NOTE : When you clone a remote repository, it automatically creates a remote connection called origin which points to the remote repository.

Fetching updates#

If you want to get the changes from the remote repo but do not yet want to merge them into your local repo, you can use

    git fetch <remote>

or

    git fetch <remote> <branch>

The latter will fetch only a specific branch. If you think that the remote changes are appropriate, you can merge them into your local repository with git merge.
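
Fetching first gives you the chance to inspect what arrived before merging it. The sketch below is an illustration under assumptions: a local bare repository plays the remote, and the two clones alice and bob (made-up names) play two developers.

```shell
set -e
work=$(mktemp -d)                               # throwaway sandbox directory
git init -q --bare "$work/remote.git"           # local stand-in for the remote

git clone -q "$work/remote.git" "$work/alice" 2>/dev/null
cd "$work/alice"
git config user.name "Example User"             # placeholder identity
git config user.email "user@example.com"
echo "first" > Readme.txt
git add Readme.txt
git commit -q -m "First commit"
branch=$(git rev-parse --abbrev-ref HEAD)       # master or main
git push -q origin "$branch"

git clone -q "$work/remote.git" "$work/bob"     # bob starts from the first commit

echo "second" >> Readme.txt                     # alice keeps working and pushes
git commit -q -a -m "Second commit"
git push -q origin "$branch"

cd "$work/bob"
git fetch -q origin                             # download, but do not merge yet
incoming=$(git rev-list --count "HEAD..origin/$branch")   # 1 commit waiting
git log --oneline "HEAD..origin/$branch"        # inspect what would be merged
before=$(tail -n 1 Readme.txt)                  # still "first"
git merge -q "origin/$branch"                   # now bring the changes in
after=$(tail -n 1 Readme.txt)                   # "second"
echo "$incoming: $before -> $after"
```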

Pulling updates#

You can also fetch and merge at the same time with a simple

    git pull <remote>

which is equivalent to git fetch <remote> followed by git merge <remote>/<current-branch>.

Pushing updates#

The inverse process, where you publish your result to the remote repo, is done by

    git push <remote> <branch>

From the above, you can see that if you already have a local repo connected to a remote one, the basic work loop is: pull (to update your local repo), edit, add, commit, edit, add, commit, …, edit, add, commit, and push.