I am working on some client modifications to a module within a PHP web application. I have access to the upstream source in a git repo, and I would like to track our changes in our own git repo.
The full source repo is huge (in comparison to the module), so I would prefer to extract just the module (contained in two subdirectories) into our repo. However, I would still like to be able to merge in any upstream changes in the future.
If I were in control of the upstream repo I would spin the module off into a subtree. But I'm not, so I can't!
My best solution so far
Initialise my own repo:
Then add the upstream repo as a remote:
Track (and checkout for good measure) the upstream code as a branch:
Merge the upstream branch, as a subtree, into my master branch:
Then create my own src directory with symlinks to the upstream directories I want to work on:
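For reference, the five steps above might look roughly like the script below. It is a sketch under assumptions, not a record of the exact commands: the upstream is a throwaway local repo so the script runs offline, and the paths `app/modules/foo` and `app/themes/foo`, the branch name `upstream_master`, and the link names `src/module` and `src/theme` are all placeholders. The subtree merge uses the `merge -s ours` plus `read-tree --prefix` recipe, which needs Git 2.9 or later for `--allow-unrelated-histories`.

```shell
#!/bin/sh
set -eu

work=$(mktemp -d)

# Hypothetical stand-in for the real (huge) upstream repo.
git init -q "$work/upstream"
cd "$work/upstream"
mkdir -p app/modules/foo app/themes/foo app/core
echo '<?php // module code' > app/modules/foo/foo.php
echo '/* module theme */'   > app/themes/foo/foo.css
echo '<?php // unrelated'   > app/core/core.php
git add .
git -c user.name=Demo -c user.email=demo@example.com commit -q -m "Upstream snapshot"
git branch -M master

# 1. Initialise my own repo (an initial commit gives master something to merge into).
git init -q "$work/mine"
cd "$work/mine"
git config user.name "Demo"
git config user.email "demo@example.com"
git commit -q --allow-empty -m "Initial commit"
git branch -M master

# 2. Add the upstream repo as a remote (a real URL would go here).
git remote add upstream "$work/upstream"

# 3. Track the upstream code as a local branch, then return to master.
git fetch -q upstream
git checkout -q -b upstream_master upstream/master
git checkout -q master

# 4. Merge the upstream branch into master as a subtree under vendor/.
git merge -q -s ours --no-commit --allow-unrelated-histories upstream_master
git read-tree --prefix=vendor/ -u upstream_master
git commit -q -m "Merge upstream as subtree under vendor/"

# 5. Create src/ with symlinks to the two upstream directories of interest.
mkdir src
ln -s ../vendor/app/modules/foo src/module
ln -s ../vendor/app/themes/foo  src/theme
git add src
git commit -q -m "Add src symlinks into vendor/"
```

Refreshing the vendor subtree later would then be a `git fetch upstream` followed by a subtree-aware merge such as `git merge -X subtree=vendor upstream/master`. Note that `vendor/app/core` is still present in the result, which is exactly the part I would like to avoid.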
Why do it this way?
I now have a src directory that contains only the module I want to work on. I can commit my changes to my repo, yet still merge in upstream changes (by refreshing the vendor subtree).
However, this does not seem very elegant. I am carrying the entire upstream codebase in my repo even though I only care about two subdirectories.
What else have I tried?
I initially tried using sparseCheckout to restrict the directories pulled in from upstream. It appeared to do exactly what I wanted in my working tree, but the repo itself still contained all of the upstream code. What is more, the sparseCheckout settings are local to my repo and are not carried over when the repo is cloned.
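The sparse checkout behaviour can be reproduced with something like the following. Again this is only a sketch: the upstream is a small local stand-in, and the path `app/modules/foo` is a placeholder for one of the module directories. It uses the manual `core.sparseCheckout` setting together with the `.git/info/sparse-checkout` file.

```shell
#!/bin/sh
set -eu

work=$(mktemp -d)

# Hypothetical stand-in for the real upstream repo.
git init -q "$work/upstream"
cd "$work/upstream"
mkdir -p app/modules/foo app/core
echo '<?php // module code' > app/modules/foo/foo.php
echo '<?php // unrelated'   > app/core/core.php
git add .
git -c user.name=Demo -c user.email=demo@example.com commit -q -m "Upstream snapshot"
git branch -M master

# Enable sparse checkout before pulling, limited to the module directory.
git init -q "$work/sparse"
cd "$work/sparse"
git remote add upstream "$work/upstream"
git config core.sparseCheckout true
mkdir -p .git/info
echo "app/modules/foo/" > .git/info/sparse-checkout
git pull -q upstream master

# The working tree only shows the module directory...
ls app

# ...but the object database still holds the rest of upstream.
git cat-file -e HEAD:app/core/core.php && echo "core.php is still stored in the repo"

# A clone does not inherit the sparse settings, so it checks out everything.
git clone -q "$work/sparse" "$work/clone"
ls "$work/clone/app"
```

Both the config value and the `sparse-checkout` file live under `.git/`, which is why neither travels with a clone.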
What I want
Is there a better or more elegant solution to this problem? I realise the ideal is to spin out a sub-repo upstream, but I can't do that. I don't really need to track upstream version history, except to the extent that it lets me easily merge in upstream changes.