2013-10-08

I am working on some client modifications to a module within a PHP web application. I have access to the upstream source in a git repo, and I would like to track our changes in our own git repo.

The full source repo is huge (in comparison to the module), so I would prefer to extract just the module (contained in two subdirectories) into our repo... but I would still like to be able to merge in any upstream changes in the future.

If I were in control of the upstream repo I would spin the module off into a subtree. But I'm not, so I can't!

My best solution so far

Initialise my own repo:
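
Roughly like this; the directory name `my-module` is just a placeholder for illustration:

```shell
# "my-module" is a placeholder directory name.
git init my-module
```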

Then add the upstream repo as a remote:
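
A sketch of this step. The remote name `upstream` matches the description above, while the URL and the `my-module` directory are placeholders, not the real locations:

```shell
# Placeholder names: "my-module" for my repo, example.com for the upstream URL.
git init -q my-module   # harmless if the repo already exists
git -C my-module remote add upstream https://example.com/upstream/app.git
git -C my-module remote -v   # lists the fetch and push URLs for "upstream"
```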

Track (and checkout for good measure) the upstream code as a branch:
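
Something along these lines. So that the sketch runs without network access, the first few lines build a throwaway stand-in for upstream (`upstream-demo`, with an invented `modules/foo` directory); the branch name `upstream_master` and the assumption that upstream's main branch is `master` are mine as well:

```shell
# --- Stand-in upstream (illustration only; use the real remote instead) ---
git init -q upstream-demo
git -C upstream-demo symbolic-ref HEAD refs/heads/master
mkdir -p upstream-demo/modules/foo
echo '<?php // placeholder module file' > upstream-demo/modules/foo/foo.php
git -C upstream-demo add .
git -C upstream-demo -c user.name=Demo -c user.email=demo@example.com \
    commit -q -m "Upstream import"

# --- The actual step, run against my own repo ---
git init -q my-module
git -C my-module remote add upstream "$PWD/upstream-demo"
git -C my-module fetch -q upstream
# Create a local branch that tracks upstream/master, and check it out.
git -C my-module checkout -q -b upstream_master upstream/master
```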

Merge the upstream branch, as a subtree, into my master branch:
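
A sketch of the subtree merge, following the recipe in git's own subtree-merge howto (`merge -s ours` followed by `read-tree --prefix`). The `vendor/` prefix comes from the description further down; the stand-in upstream, the `my-module` directory, the demo identity and the empty initial commit on `master` are assumptions made only so the example is self-contained. `--allow-unrelated-histories` needs git 2.9 or newer:

```shell
# --- Stand-in upstream (illustration only; use the real remote instead) ---
git init -q upstream-demo
git -C upstream-demo symbolic-ref HEAD refs/heads/master
mkdir -p upstream-demo/modules/foo
echo '<?php // placeholder module file' > upstream-demo/modules/foo/foo.php
git -C upstream-demo add .
git -C upstream-demo -c user.name=Demo -c user.email=demo@example.com \
    commit -q -m "Upstream import"

# --- My repo: master with one commit, plus the upstream branch ---
git init -q my-module
git -C my-module symbolic-ref HEAD refs/heads/master
git -C my-module config user.name "Demo"              # demo identity, local to this repo
git -C my-module config user.email "demo@example.com"
git -C my-module commit -q --allow-empty -m "Initial commit"
git -C my-module remote add upstream "$PWD/upstream-demo"
git -C my-module fetch -q upstream
git -C my-module branch upstream_master upstream/master

# --- The subtree merge itself ---
# Record the merge in history without touching my files yet...
git -C my-module merge -s ours --no-commit --allow-unrelated-histories upstream_master
# ...then read the upstream tree into vendor/ and update the working tree.
git -C my-module read-tree --prefix=vendor/ -u upstream_master
git -C my-module commit -q -m "Merge upstream into vendor/ as a subtree"
```

According to the same howto, later upstream changes can then be pulled in with the subtree strategy, e.g. `git pull -s subtree upstream master`.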

Then create my own src directory with symlinks to the upstream directories I want to work on:
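
For example, run from the top of my repo. The two paths under `vendor/` are hypothetical stand-ins for the real module directories:

```shell
mkdir -p src
# Relative targets, so the links still resolve in any clone of the repo.
# "path/to/module_a" and "path/to/module_b" are hypothetical paths.
ln -s ../vendor/path/to/module_a src/module_a
ln -s ../vendor/path/to/module_b src/module_b
```

Git stores each symlink as a link (just the target path), so committing `src` does not add a second copy of the module files.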

Why do it this way?

I now have a src directory which just represents the module I want to work on. I can commit my changes to my repo, yet still merge in upstream changes (by refreshing the vendor subtree).

However, this does not seem very elegant. I am carrying the entire upstream codebase in my repo even though I only care about two subdirectories.

What else have I tried?

I initially tried using sparse checkout (core.sparseCheckout) to restrict the directories pulled in from upstream. It appeared to do exactly what I wanted in the working tree on my box, but the repo itself still contained all of the upstream code. What is more, the sparse checkout settings live in .git/config and .git/info/sparse-checkout, so they are not replicated when the repo is cloned.
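
A small reproduction of that behaviour, using a throwaway stand-in for upstream (the `upstream-demo` and `sparse-demo` names and the `modules/foo` and `core` directories are invented for illustration):

```shell
# --- Stand-in upstream with one wanted and one unwanted directory ---
git init -q upstream-demo
mkdir -p upstream-demo/modules/foo upstream-demo/core
echo '<?php // wanted' > upstream-demo/modules/foo/foo.php
echo '<?php // unwanted' > upstream-demo/core/app.php
git -C upstream-demo add .
git -C upstream-demo -c user.name=Demo -c user.email=demo@example.com \
    commit -q -m "Upstream import"

# --- Clone it, then restrict the working tree to modules/foo ---
git clone -q upstream-demo sparse-demo
git -C sparse-demo config core.sparseCheckout true
mkdir -p sparse-demo/.git/info
echo 'modules/foo/' > sparse-demo/.git/info/sparse-checkout
git -C sparse-demo read-tree -mu HEAD   # re-apply the index with the sparse patterns

ls sparse-demo                                   # only "modules" is left on disk
git -C sparse-demo cat-file -e HEAD:core/app.php # ...but the object is still in the repo
```

Both the `core.sparseCheckout` setting and the pattern file sit inside `.git`, which is why a fresh clone of `sparse-demo` starts with a full checkout again.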

What I want

Is there a better/more elegant solution to this problem? I realise the ideal is to spin out a sub-repo upstream, but I can't do that. I don't really need to track upstream version history... apart from the fact I want to easily merge in upstream changes.
