2012-07-29

Introduction

We have long been advocates of using SVN – but times have changed and so has the style of the way we work – which is what makes Git such an appealing choice for us. So if you’re coming from SVN too, some things worth knowing are:

Repositories are de-centralised – With SVN, you have 1 master repository in a central location and everything is checked in/out of this location; with Git, its different. Each copy of the project tree (ie. your working copy) has its own repository – the .git sub-directory of the project tree root.

Revisions are no longer decimal numbers – With SVN, your revisions are numbered sequentially with an integer. Due to the distributed nature of Git, and its potential to scaling to hundreds of thousands of revisions, the revisions are identified by a SHA1 hash. You can still short-cut your way through the tree though, HEAD (the latest revision), HEAD^ (the latest revision’s parent), HEAD^^ or HEAD~2 (the latest revision’s parent’s parent); etc.

Hierarchy – Perhaps the biggest adjustment you’ll make is that SVN has a folder based hierarchy, a trunk isn’t anything special, its just a folder, the same with your tags or branches. In fact, with SVN you can check-out a single folder – which gives it a great advantage. Where Git, is just a URL – which identifies its repositories location (be it local or remote), within it, it automatically contains your master, branches and tags (not your local branches, but we’ll come back to this later).

NB. Its worth noting – this guide is not to be a fully fledged Git tutorial – there are plenty of those around and we’ve even listed the ones we prefer in the resources section below. This guide will serve as an insight into Git, traversing from SVN to Git and if you ever work with Sonassi as your development team – our rulebook and work-flow for collaborative development.

Contents

Getting started

Comparable Git:SVN functions

The rules

The Hierarchy

The Work-flow

Resources

Getting started with Git

The best way to start anything is to introduce yourself, so be sure to set your name and email

And its always nice to have colours enabled

At this stage, we’ll assume you’ve already got Git installed (if not read, How do I install Git?) – and we’ll follow the same practice of setting up a repository like we did in our Magento SVN guide. For the purpose of simplicity, we’ll also assume you are using a remote repository, like BitBucket, GitHub or Springloops.

So start by entering your root directory for your project,

If it is an empty repository, Git will set up the current directory. If the repository contains files, it will download them to a sub-directory (named from the repository name). Eg. ./sonassi. It is safe to move the contents of this directory to your current directory (including the .git directory it will create).

Lets start by ignoring the non-essential files

Then add the files we want to version control

That's it! We've created the repository, committed the initial files and pushed that to our remote repository. Now we've nailed the basics, we can start to understand how Git should fall into your daily work-flow.

Comparable SVN:Git Functions

If you're coming from SVN, this look-up table will serve as a helpful reference to show you the equivalent Git command.

Commiting

git clone url

svn checkout url

git pull

svn update

git pull

svn update

--

--

git init

git add .

svnadmin create repo

git commit

svn import file://repo

--

--

git diff

svn diff | less

git diff rev path

svn diff -r rev path

--

--

git apply

patch -p0

--

--

git status

svn status

git checkout path

svn revert path

--

--

git add file

svn add file

git rm file

svn rm file

git mv file

svn mv file

--

--

git commit -a

svn commit

Branching

git branch branch

svn copy path/trunk/ path/branches/branch

git checkout branch

svn switch path/branches/branch

git checkout rev

svn update -r rev

Merging

git merge branch

svn merge -r rev:HEAD path/branches/branch

Resetting

git reset --hard origin master

svn checkout -r rev path/to/branch

Rules

The rule that we follow; to ensure a clean, stable and scalable development, for anything other than short-lived changes, is to branch early - merge & commit often. Most changes, should be branched off into a local branch, the change made, a commit made and when complete - can be merged back into the master and pushed back to the repository. The exclusion being when you need to make a quick single-file change (eg. a minor CSS edit).

Merge from origin/master at the start of the work day

Commit your edits to your local repository at the end of the day

The database name must be suffixed with _branch_purpose/name, Eg. _live or _stag

Never copy/rename (Eg. creating a .bak file) a file to create a restore point for temporary edits

The staging site must only be used for short-lived changes and previews of branch releases

Live/production site pulls must be authorised by the relevant party only after the staging release has been approved

Always give descriptive commit comments with ticket number/bug ID references where necessary

If possible, use a hub system to isolate live pulls

The staging site must never be left in an inconsistent state - all changes should either be committed or reset

When you are finished with a branch, delete the branch, its files, its database and associated users

Never push to the remote repository master, unless you have sufficient time to test and approve the changes for live use

When creating a branch, do not give it an arbitrary name - ether name it after a specific ticket that is to be resolved (eg. bugfix1999) or if it is a long-term personal branch, use your name-companyname (eg. ben-sonassi)

Hierarchy

In practice, our development, staging and live Magento VCS environment would be set up like this.



You'll notice that the staging site and live site actually are both master, we define staging as being the final pre-live test environment. So after all your local branch testing, you test once more on the staging site; once that is approved; you can then pull on the production/live site. Remember rule #5.

To create the hierarchy is very straightforward, it begins by creating quick clones of the live site; we've wrote a few scripts to automate the process, so that's what we'll use.

Creating/refreshing the staging site

We clone the live site to create the mirrored staging environment. But we deliberately exclude some of the content (logs, sessions etc.) and we create a symbolic link for the media directory (to save on disk space usage). Just remember, with this set up, if you delete an image/product via the admin on the staging site - it will remove the file from the live site.

Then edit the database connection details to suit in ./subdomains/staging/app/etc/local.xml. Then we'll dump the live database ready for import on the staging site - we have wrote a script which can dump the live database faster than the normal process and without causing table-level locks (ie. without impacting the live site at all - read the full guide here). After the dump, we also used sed to find and replace all he URLs to be the new staging URL.

Now, we've got an operational staging site, which is a clone of the live site, and also a working directory for master. This now gives us the final preview point before any changes are made live - and a little environment for tiny changes to be made.

Work Flow



Short lived or tiny changes

The only exception to making direct edits to the staging site is when you have a change that can be executed very quickly and you know the definitive output. A good example is if you need to make a quick CSS correction and you know the exact code (or close enough) to make the change. So you edit the relevant file(s), test the output, commit your changes with an appropriate comment, then push the changes to origin master. After approval from the live site maintainer, you can then perform a pull on the live site.

In practice, the code execution would be like this:

You shouldn't have to reset the live site's working directory - but we just want to be extra cautious to remove any temporary edits some naughty developers may have done.

So in summary, make sure you remember:

NEVER edit the live site files directly - ever!

NEVER make a change on the staging site, unless it takes less than a few minutes to complete

Branching out & merging



Branching

Branching out is a very straightforward practice, the first step is to set up the new branch environment. Thankfully, the process is near identical to creating a staging environment, with two additional commands at the end:

This command then changes the current working directory to be an instance of your new branch. You can make changes in here, commit as frequently as you desire, then whenever necessary, either merge master into your branch - or if you are complete, fold your branch into master.

Remember, that branches are local to your own repository - so no-one else will be able to use, see or edit your branch unless you push the branch to the origin repository. But, unless you have a compelling reason to share a branch, its unlikely you'll need to do this.

Merging

Merging, assuming no conflicts happen, are very painless to carry out. There are two ways you can go about a merge,

Merge another branch into your branch

Merge your branch into another branch

It doesn't actually matter which you are doing, but sometimes one approach may seem more appropriate over another.

For example, if you have an actively developed branch, and you need to bring it up to date with the master - then you would execute

Or, as another example, you have finished with your branch and intend to fold it into the master - then you would execute

Conflict resolution

Git is very sophisticated with auto-merging and will only fall back to typical conflict resolution as a last resort, but once you get your head around the <<<<<<<< notation - it will soon make sense, and you'll no longer have that sinking feeling when you've hit a conflict. A common file for conflicts to exist is likely to be a stylesheet, so we'll use style.css as our example.

First thing to do is make sure you don't panic, resolving a conflict manually isn't any where near as hard as you might think it is. The second step is to open the conflicted file and scroll down to the conflict, you'll notice that Git has added a number of left chevrons to denote where the conflict(s) has happened.

HEAD represents the current working directory/branch you have selected - ie. your code.

master represents the current remove directory/branch - ie. someone else's code (similarly, this could be a branch or anywhere else you are merging from)

So you have to make the educated decision as to what the correct output should be; which may be ...

Entirely the other code

Entirely your code

A combination of the two.

The key here is to make a decision and edit the file accordingly - making sure you remove the conflict notation (<<<<<<<<,======= and >>>>>>>>) as necessary. So in our example, we've decided that we actually need a bit of both, leaving:

Then, after resolving your conflict(s), its worth committing those changes before actually continuing with your work.

Your daily routine

Getting into the habit of pulling down the changes from the remote repository in the morning and pushing your changes (or at the very least, committing your changes) in the evening is important to ensure you have a (relatively) conflict free and up-to-date working copy of the site. The last thing you want to do is go 3 days without a pull, suddenly finding dozens of conflicts that you've got to fiddle your way through.

So, print this PDF off, stick it on your wall - and live by it - the Git daily work-flow guide.

If you already having an existing branch that you are working on, then the first thing to do is merge the latest changes from master into your branch. You shouldn't have to commit any changes in your branch because you should have done that the night before!

If you have any conflicts, work through them, if not - you can carry on with the development for the day. Be sure to regularly commit your changes throughout the day (with meaningful commits). Then as it comes to the end of the day, you'll need to make sure any new files you've added to the repository have been added to version control and then make a final commit for the end of the day.

Do not merge your changes to the remote repository's master unless you have sufficient time to test and gain approval on the staging site.

Undoing a file deletion from a historical commit

Find the last commit that affected the given path. As the file isn't in the HEAD commit, this commit must have deleted it; then checkout the version at the commit before.

Or in one command, if $file is the file in question.

Quick and dirty restore points

The Git index is a great resource for making checkpoints in your code, that don't really require a commit exactly. If you are not familiar with the Git index, it is the "placeholder" in which changes are stored after a git add and before a git commit (see diagram). Normally, there isn't any time between add and commit - but you can actually use the index to your advantage.

The index can be used exactly the same way you would rely on the undo history in your PHP editor to provide a way to restore back to a previous point in the code. So if you are ultimately quite happy with your progress on a specific file - or you want to try some slightly different/risky code, then before you proceed you can quickly add that file to the index.

Then if you do happen to make a change and you want to restore back to the previous "working" version of the file quickly, you can just checkout the file.

This way, you can add files to the index repeatedly throughout the day without muddying your commit history.

Amending previous commits

One unmentioned commit management feature is git commit –amend which would allow you to update the last commit with new edits. If you’re familiar with git rebase -i squashing, then this is like squashing your index into the last commit. You could also amend with the working files by using git commit –amend -a or providing specific files on the command line.

Rebase-ing

In Git, there are two main ways to integrate changes from one branch into another - the merge and the rebase. Rather than clone someone else's article, there is a fantastic and clear explanation of rebasing - what it does, when to do it and importantly, when not to do it. You can find the article here

Resources

This article provided an insight into Git management and work-flow - but is only a taster for what can be done with Git. It also wouldn't have been possible to write it without the excellent resources that people have taken the time to write; so here are some great websites that you can also use to supplement your knowledge.

The Simple Guide Guide - http://rogerdudler.github.com/git-guide/

Git Crash Course - http://git.or.cz/course/svn.html

The Definitive Git Website - http://git-scm.com/

Kent Nguyen's Git Practices - http://kentnguyen.com/development/visualized-git-practices-for-team/

Oliver Steel's Git Workflow - http://osteele.com/archives/2008/05/my-git-workflow

Git for Beginners - The definitive practical guide - http://stackoverflow.com/questions/315911/git-for-beginners-the-definitive-practical-guide

Joe Maller's Web Focused Git Work Flow - http://joemaller.com/990/a-web-focused-git-workflow/

Vincent Driessen's Successful Branching Model - http://nvie.com/posts/a-successful-git-branching-model/

Naked Startup's Simple Daily Git Work Flow - http://nakedstartup.com/2010/04/simple-daily-git-workflow

Show more