Git introduction#
A version control system keeps a record of every change made to a given set of files, and it also encourages and eases collaboration. It works best for plain-text files (source code, LaTeX, txt, etc.).
In this module we will learn the basics of version control by using git. At the end, you will be able to use git for source code version control with a Bitbucket repository.
Introduction#
Git:
is a decentralized (distributed) version control system. Every contributor has a full copy of the repository and commits changes to it locally. The contributor then pushes those commits to a shared repository; if no conflicts appear, the changes are merged and can be propagated to the other developers.
was started by Linus Torvalds in 2005 to manage Linux kernel development, after the free license for the previous system, BitKeeper, was revoked.
is very fast, and branching is cheap and a natural part of everyday work.
behaves somewhat like a file system that stores snapshots of the tracked files at each commit.
Every commit is identified by a SHA-1 hash such as 24b9da6552252987aa493b52f8696cd6d3b00373 .
The typical workflow (inside an existing git repository) is: edit, stage, commit.
NOTE: Getting help. If you want to read the manual for a given git command, like commit, you should use
git help commit
The first repo#
Open the terminal. Create a directory for this first example repository:
$ mkdir myfirstrepo
$ cd myfirstrepo
Now just start your git repo
$ git init
That’s all. You now have a git repository ready to be used. What is inside the repo?
$ ls
Nothing? What about
$ ls -a
There should be a sub-directory called .git, which holds all the information about your repo. You are not supposed to directly modify any content of that hidden directory.
What does this mean?
$ git init <directory>
You can also clone an existing repository with
$ git clone https://github.com/path/to/my-project.git
This will clone the my-project.git repo into the my-project directory.
Cloning a repository is the most natural way to start using a remote
repository locally. For example, assume you have a local repository at
the University. You commit your work and send it to a remote repository
stored, for example, at Bitbucket or GitHub. Then you go home. At home,
you want a copy of the remote repository to keep working on it: you
clone it. After a successful clone, you will have three copies of your
repository: one at the University, one at home, and one at the remote
location. You could then make some changes at home, send them to the
remote repository (with git push), and later, back at the University,
get those updates from the remote (with git pull). At that point, all
repositories will be in sync. This way you can keep several copies of
your remote repository, all of them synced. Do not forget to start your
work with a git pull and end it with a git push.
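The University/home cycle described above can be tried out on a single machine. The following is only a sketch under some assumptions: a local bare repository stands in for Bitbucket or GitHub, and the directory names (`remote.git`, `university`, `home`) and the user identity are placeholders chosen for this illustration.

```shell
# A bare repository in a temporary directory plays the role of the remote.
work=$(mktemp -d)
git init -q --bare "$work/remote.git"

# "University" copy: clone, commit, and push.
git clone -q "$work/remote.git" "$work/university"
cd "$work/university"
git config user.name "Example User"          # placeholder identity
git config user.email "user@example.com"
echo "Hello World" > Readme.txt
git add Readme.txt
git commit -q -m "First commit. Adding Readme.txt file"
branch=$(git symbolic-ref --short HEAD)      # master or main, depending on your setup
git push -q origin "$branch"

# "Home" copy: clone, edit, commit, and push.
git clone -q "$work/remote.git" "$work/home"
cd "$work/home"
git config user.name "Example User"
git config user.email "user@example.com"
echo "Edited at home" >> Readme.txt
git commit -q -am "Edit Readme.txt at home"
git push -q origin "$branch"

# Back at the "University": pull the changes made at home.
cd "$work/university"
git pull -q origin "$branch"
cat Readme.txt                               # now contains both lines
```

After the final pull, the two working copies and the bare repository all point at the same commit.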
Configuring git (individual repo or installation)#
The git config command allows you to configure the preferences of an
individual repository or the global options for the installation. To
read the help, please write
$ git help config
Some examples:
Configures the user name for the current repository:
$ git config user.name <name>
Or for the current user (globally, not just this repo)
$ git config --global user.name <name>
NOTE : Local config is stored at .git/config. Global config for the user is stored at ~/.gitconfig. Finally, the file $(prefix)/etc/gitconfig stores the system-wide settings.
NOTE : The file .gitignore allows you to specify files, or patterns of files, to be ignored.
NOTE : All these configuration files are plain text files. You can edit them directly (not recommended) or use the git config command to edit them.
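As a small illustration of the .gitignore note, the sketch below creates a throwaway repository and ignores two patterns. The file names (`notes.txt`, `debug.log`, `build/output.bin`) and the patterns themselves are assumptions made up for this example.

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q

# Ignore every file ending in .log and everything under build/
printf '%s\n' '*.log' 'build/' > .gitignore

mkdir build
touch notes.txt debug.log build/output.bin

# Only files that are not ignored show up as untracked
status=$(git status --porcelain)
echo "$status"

# git check-ignore reports which of the given paths match an ignore rule
ignored=$(git check-ignore debug.log build/output.bin)
echo "$ignored"
```

Note that .gitignore itself is an ordinary file: it shows up as untracked until you add and commit it.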
Configure the email
$ git config --global user.email <email>
Opens the global config options in the default editor
$ git config --global --edit
Examples:
$ # Tell Git who you are
$ git config --global user.name "John Smith"
$ git config --global user.email john@example.com
$ # Select your favorite text editor
$ git config --global core.editor vim
$ # Add some SVN-like aliases
$ git config --global alias.st status
$ git config --global alias.co checkout
$ git config --global alias.br branch
$ git config --global alias.up rebase
$ git config --global alias.ci commit
The typical appearance of a configuration file is (DO NOT EDIT IT)
[user]
name = John Smith
email = john@example.com
[alias]
st = status
co = checkout
br = branch
up = rebase
ci = commit
[core]
editor = vim
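Instead of opening the configuration file, the stored values can be read back with git config itself. This is a minimal sketch that only touches the local configuration of a throwaway repository; the name and alias are the same illustrative values used in the examples above.

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q

# Without --global, the values are written to this repository's .git/config
git config user.name "John Smith"
git config alias.st status

name=$(git config --get user.name)   # read a single value back
echo "$name"
git config --local --list            # list everything stored in .git/config
git st                               # the alias now runs `git status`
```

A value set locally takes precedence over a global value with the same key.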
Committing files to the repo#
The typical pattern for working inside a repo is edit/stage/commit. You
perform some edits, then you select those that are logically related
and working and put them into the staging area, and finally you commit
them to the repository. Therefore, a file can be in one of three
states: edited but not staged, staged but not committed, and committed.
The command git add puts files into the staging area, in other words,
it prepares them to be committed.
$ git add <file> # adds file <file> to the staging area; it is not yet committed
or
$ git add <directory> # adds all files inside <directory> to the staging area
For this case, create a new file called Readme.txt, and add any content to it (like “Hello World” ;)
$ code Readme.txt
Currently, Readme.txt is just a file inside our working directory; it is
not yet tracked by the repository itself. Let’s check the repository status
$ git status
Read the output, what does it mean?
Now let’s add the file to the repo:
$ git add Readme.txt
Now the file is in the staging area. We have marked it to be added to the repository, but it is still not in the repo. We need to commit it.
$ git commit -m "First commit. Adding Readme.txt file"
This command commits the file Readme.txt to the repo with the log
message given after the -m flag. You can also use the -a flag to
automatically stage every modified or deleted tracked file before
committing (new, untracked files still need git add). Check
git help commit.
TIP: Always make small and frequent commits. Large commits with many changes make it difficult to revert or track down a problematic change, among other possible difficulties. Commit as often as possible.
NOTE : The staging area allows you to group related changes, possibly distributed across several files, into a single commit. The staging area is a kind of buffer in the space of changes.
NOTE : Git takes snapshots when it commits. That is, it stores the full files, not just file diffs.
NOTE : Check the output of git log. Read its manual. What is it useful for?
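The three file states of the edit/stage/commit cycle can be observed with the short form of git status. The sketch below runs in a throwaway repository; the identity is a placeholder, and the two-letter codes in the comments are what `git status --short` prints for each state.

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.com"

echo "Hello World" > Readme.txt
untracked=$(git status --short)   # "?? Readme.txt": the file is not yet known to git

git add Readme.txt
staged=$(git status --short)      # "A  Readme.txt": staged, but not committed

git commit -q -m "First commit. Adding Readme.txt file"
echo "More text" >> Readme.txt
modified=$(git status --short)    # " M Readme.txt": committed, then edited again

printf '%s\n' "$untracked" "$staged" "$modified"
```

The first column of the code refers to the staging area and the second to the working directory, which is why a staged new file reads `A ` and an unstaged edit reads ` M`.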
Exercise > Make several changes to the Readme.txt file, and commit each change with an appropriate log message. Optionally, add another file and commit it to the repository.
The git log command#
git log allows you to explore the history of commits. In contrast,
git status gives information about the current status of the repo:
staged files, changes, new files, etc.
Usage:
$ git log # Normal usage
$ git log -n <limit> # Print only the last <limit> commits
$ git log --oneline # Condense each commit to a single line
$ git log --grep="<pattern>" # search for commits with pattern <pattern>
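To see these options at work, the following sketch builds a small history of three commits in a throwaway repository and then queries it. The commit messages and identity are invented for the example, and `--format=%s` (print only the subject line) is used here just to make the result easy to read.

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.com"

# Create three small commits
for n in 1 2 3; do
    echo "change $n" >> Readme.txt
    git add Readme.txt
    git commit -q -m "Change number $n"
done

git log --oneline                                   # one line per commit, newest first
last_two=$(git log --oneline -n 2 | wc -l)          # limit the output to two commits
match=$(git log --format=%s --grep="number 2")      # search the log messages
echo "$match"
```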
Undoing changes#
One of the advantages of a version control system is the possibility of reverting to a previous version of a file if a commit is found to be faulty. There are several ways to do that. For example, to get a specific revision of a file, we could use
git checkout <commit> <file> # this affects the repo: <file> is restored from <commit> in the working directory and the staging area
To check out a specific revision, use
git checkout <commit>
This last form lets you inspect the old revision without affecting any
branch (git calls this a detached HEAD state). You can return to the
latest state with git checkout master, which moves you back to the
master branch.
If you want to undo a specific commit, use
git revert <commit>
This generates a new commit which undoes all changes introduced in
commit <commit>, and applies it to the current branch. Note that this
can produce a conflict if you are not undoing the most recent commit
but an earlier one whose changes later commits build on.
NOTE: Another option, not advised, is to use the git reset <commit> command. This moves the current branch back to revision <commit> and drops ALL later commits from its history, including changes that may have been correct. In contrast, git revert <commit> preserves the history and the commits between the current revision and the one being reverted. git reset does, in turn, have some very useful applications: git reset --hard discards all uncommitted modifications to tracked files and returns the working directory to the state of the last commit, and git reset <file> removes a specific file from the staging area.
To remove untracked files from the working directory, you can use the
git clean command. This cannot be undone: the untracked files will be
deleted forever.
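The sketch below tries the safer undo commands in a throwaway repository: git revert on the last commit, git reset on a staged file, and a dry run of git clean. File names, messages, and identity are assumptions for the example; the `-n` flag makes git clean only report what it would delete.

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.com"

echo "good line" > Readme.txt
git add Readme.txt
git commit -q -m "Add Readme.txt"
echo "faulty line" >> Readme.txt
git commit -q -am "Add a faulty line"

# Undo the last commit with a NEW commit; the faulty commit stays in the history
git revert --no-edit HEAD
count=$(git rev-list --count HEAD)   # original commit, faulty commit, revert commit
content=$(cat Readme.txt)            # the faulty line is gone

# Stage a new file, then take it out of the staging area again
echo "scratch" > scratch.txt
git add scratch.txt
git reset -q scratch.txt

# scratch.txt is untracked again; preview what git clean would delete
preview=$(git clean -n)
echo "$preview"
```

To actually delete the untracked file you would run `git clean -f`, which is the step that cannot be undone.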
Exercise > After performing several commits, revert the last commit. What happened to the other files (if you added other ones)? Check the log. You can also revert to the first revision of that file.
Git branches#
Branches in git are a natural part of the daily job. They allow you to
keep several development paths that can be merged into the main branch
at any time, and they are very handy for experimental testing and the
development of new features. Each branch has its own commits and
history until it is merged. Three basic commands are needed here:
git branch, git checkout, and git merge.
Examples :
$ git branch # lists all branches in the repository
$ git branch <branchname> # creates a branch called <branchname>
$ git branch -d <branch> # deletes a specific branch
$ git branch -m <branch> # renames the current branch to <branch>
To select a specific branch, use the command git checkout <branch>.
Other uses:
$ git checkout -b <new-branch> # create and check out the new branch
$ git checkout -b <new-branch> <existing-branch> # create and check out <new-branch>, starting from <existing-branch> instead of the current branch
Example/Exercise:
$ git branch new-feature
$ git checkout new-feature
$ # Edit some files
$ git add <file>
$ git commit -m "Started work on a new feature"
$ # Repeat
$ git checkout master
After you have decided that the work in an experimental branch is good enough to be used, you can merge the changes from that branch into any reference branch (like master). To do so, check out the reference branch and use
git merge <branch>
This will merge the changes from branch <branch> into the current one.
Hopefully, this won't generate any conflicts. Otherwise, you will have
to solve them manually. Merges can be fast-forward (if the current
branch has no new commits since <branch> was created, so the history is
linear) or three-way (if both branches have new commits).
**Resolving conflicts** Sometimes the same part of a file is edited by
commits on both branches. This generates a merge conflict. Since git
cannot figure out which version to keep, it stops and the conflict must
be resolved manually by editing the file. After you finish the
modification, use git add on the conflicting file to tell git you have
resolved the problem. Then run a normal git commit.
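A conflict is easy to provoke on purpose, which is a good way to get used to resolving one. The following sketch edits the same line on two branches of a throwaway repository; the branch name, file contents, identity, and the "keep both lines" resolution are all choices made for this illustration.

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.com"
base=$(git symbolic-ref --short HEAD)      # master or main, depending on your setup

echo "original line" > Readme.txt
git add Readme.txt
git commit -q -m "Add Readme.txt"

# Change the line on a new branch ...
git checkout -q -b new-feature
echo "line from new-feature" > Readme.txt
git commit -q -am "Edit Readme.txt on new-feature"

# ... and change the same line on the base branch
git checkout -q "$base"
echo "line from $base" > Readme.txt
git commit -q -am "Edit Readme.txt on $base"

# The merge stops and reports a conflict (non-zero exit status)
git merge new-feature || echo "merge stopped: conflict in Readme.txt"
cat Readme.txt     # shows the <<<<<<<, =======, >>>>>>> conflict markers

# Resolve by hand (here: keep both lines), then add and commit
printf '%s\n' "line from $base" "line from new-feature" > Readme.txt
git add Readme.txt
git commit -q -m "Merge new-feature, keeping both lines"
git log --oneline --graph
```

The final commit has two parents, one from each branch, which is what makes it a merge commit.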
Example for a fast-forward merge:
# Start a new feature
git checkout -b new-feature master
# Edit some files
git add <file>
git commit -m "Start a feature"
# Edit some files
git add <file>
git commit -m "Finish a feature"
# Merge in the new-feature branch
git checkout master
git merge new-feature
git branch -d new-feature
Example for a three-way merge:
# Start a new feature
git checkout -b new-feature master
# Edit some files
git add <file>
git commit -m "Start a feature"
# Edit some files
git add <file>
git commit -m "Finish a feature"
# Develop the master branch
git checkout master
# Edit some files
git add <file>
git commit -m "Make some super-stable changes to master"
# Merge in the new-feature branch
git merge new-feature
git branch -d new-feature
TIP : There are several graphical clients which allow you to follow the branch commits visually. One example is gitk; there are several others. But you should be able to manage from the plain command line.
TIP : Another visual option is to use the Bitbucket page for your repo, where all the code, commits, branches, etc. can be visualized.
Modifying the history#
It is possible to amend a commit by using the command
git commit --amend. This allows you either to combine the staged
changes with the previous commit or simply to modify the log message.
An amended commit is a completely new commit with its own hash; the
original commit is no longer part of the branch history. For that
reason, avoid amending commits that you have already pushed to a shared
repository. Other commands to check: git rebase, git rebase -i,
git reflog.
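That an amended commit really is a new commit can be checked by comparing hashes before and after. This is a minimal sketch in a throwaway repository, with an invented typo in the first log message and a placeholder identity.

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.com"

echo "Hello World" > Readme.txt
git add Readme.txt
git commit -q -m "Frist commit"            # typo in the log message
before=$(git rev-parse HEAD)

git commit -q --amend -m "First commit"    # replace the message
after=$(git rev-parse HEAD)

echo "$before -> $after"                   # two different hashes
git log --oneline                          # only the amended commit is listed
git reflog                                 # the original commit is still recorded here
```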
Remote repositories#
Git's decentralized approach allows you to have your own copy of the repository, where you can develop new commits and/or branches and then push them to remote repositories, while also allowing you to pull changes other developers have made.
In this regard, the git remote command allows you to create, view, and
delete connections to other repositories. First, let's check the help
git help remote
The command git remote alone will list all connections you have in
your repo to other repos. The command
git remote add <name> <url>
will create a new connection to another repo located at <url>, and
allows you to use <name> as a convenient shortcut.
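As a quick sketch of adding and listing a connection, the commands below register a remote in a throwaway repository. The URL is a made-up placeholder; git remote add only stores it and does not contact the server.

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q

# Register a connection called "origin" (hypothetical URL)
git remote add origin https://example.com/path/to/my-project.git

remotes=$(git remote)                  # names of all configured remotes
git remote -v                          # names together with their URLs
url=$(git remote get-url origin)       # URL stored for one remote
echo "$remotes -> $url"
```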
What does the following do?
git remote rm <name>
git remote rename <old-name> <new-name>
NOTE : When you clone a remote repository, it automatically creates a remote connection called origin which points to the remote repository.
Fetching updates#
If you want to get the changes from the remote repo but do not yet want to merge them into your local branches, you can use
git fetch <remote>
or
git fetch <remote> <branch>
The latter fetches only a specific branch. If you think that the
remote changes are appropriate, you can merge them into your local
branch with git merge <remote>/<branch>.
Pulling updates#
You can also fetch and merge at the same time with a simple
git pull <remote>
which is equivalent to git fetch <remote> followed by
git merge <remote>/<current-branch>.
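The difference between fetching and pulling is easier to see when the two steps are run separately. In the sketch below, a local bare repository stands in for the remote, and the two working copies `alice` and `bob` are hypothetical names; the identity is a placeholder as well.

```shell
work=$(mktemp -d)
git init -q --bare "$work/remote.git"

# First developer: create the initial commit and push it
git clone -q "$work/remote.git" "$work/alice"
cd "$work/alice"
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.com"
echo "Hello World" > Readme.txt
git add Readme.txt
git commit -q -m "Add Readme.txt"
branch=$(git symbolic-ref --short HEAD)    # master or main, depending on your setup
git push -q origin "$branch"

# Second developer clones now and therefore has the first commit
git clone -q "$work/remote.git" "$work/bob"

# First developer publishes one more commit
echo "Second line" >> Readme.txt
git commit -q -am "Add a second line"
git push -q origin "$branch"

# Second developer: fetch downloads the commit but leaves the local branch alone
cd "$work/bob"
git fetch -q origin
behind=$(git rev-list --count "HEAD..origin/$branch")   # commits not yet merged
git log --oneline "HEAD..origin/$branch"                # review them before merging

# Merge once the changes look fine (a fast-forward here)
git merge -q "origin/$branch"
```

Running `git pull origin "$branch"` in the second working copy would have performed the fetch and the merge in one step, without the chance to review in between.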
Pushing updates#
The inverse process, where you publish your result to the remote repo, is done by
git push <remote> <branch>
From the above, you can see that if you already have a local repo connected to a remote one, the basic work loop is: pull (to update your local repo), edit, add, commit, edit, add, commit, …, edit, add, commit, and push.