Creating a two-way SVN <> GitHub sync
Intro
When migrating PDCLib from Bitbucket / Mercurial to a local Subversion repository, I wanted to provide the advocates of Distributed VCS with a way to get the PDCLib sources “their way”.
As Erin Shepherd pointed out in one of the many emails we exchanged, Git seems to have pretty much won the Version Control battle. I still much prefer Subversion, but I realize that being present on GitHub would certainly not hurt the project.
But that meant keeping the Subversion repo and the GitHub repo in sync… and Erin was quite certain that git-svn was the worst of both worlds combined. So it had better be a real Git repository, and not some SVN / Git chimera – I wanted to provide “real” Git access to the PDCLib sources.
There were multiple how-tos available online on how to achieve a two-way sync, but none of them really worked the way they were supposed to. Ben Lobaugh's article came really close, and was the most helpful.
In the end, I figured a write-up that is not missing one or two crucial steps would be nice, so I did one.
Initial Setup
So I had the SVN repo at svn://rootdirectory.ddns.net/pdclib. To complicate matters, that repo had two major branches in it, trunk and shepherd, which I would have to synchronize with two just-as-separate Git branches.
I set up a GitHub account and a PDCLib project, and initialized the project with a .gitignore. It would have been “cleaner” to start with an empty repository, but I wanted to showcase how to deal with this two-way sync if the Git repository is not empty.
Get a local clone
This is easy:
git clone git@github.com:/DevSolar/pdclib
You need to have copied your SSH pubkey to your GitHub account for this kind of authorization to work (click on your avatar in the top-right corner, select “Settings” from the drop-down, then “SSH and GPG keys” in the left sidebar).
Change into the freshly cloned Git Directory:
cd pdclib
Tell Git about SVN
Now we tell Git that there is another repository which we want to fetch data from (a.k.a. “setting up a remote” in Git parlance). As it is a Subversion repo, we need to use git svn for that:
git svn init -s svn://rootdirectory.ddns.net/pdclib --prefix=svn/
The -s option tells git svn that the repository is using the trunk / tags / branches setup common for SVN, and --prefix=svn/ selects a name for the remote: the SVN branches will show up in Git as remotes/svn/trunk and so on. See the documentation for git svn for other options.
Map the authors
Subversion logs commits with the user's login name. For Git, the form Firstname Lastname <email@example.com> is usually preferred. You can create an “authors file” that does this mapping. My authors.txt looks like this:
solar = Martin Baute <solar@rootdirectory.de>
erin = Erin Shepherd <erin.shepherd@e43.eu>
cycl0ne = cycl0ne <claus@poweros.de>
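If the repository has many committers, typing the file by hand gets tedious. The following is a minimal sketch, not part of my original setup, that turns the output of svn log --quiet into a skeleton authors file. The function name authors_skeleton and the example.invalid placeholder address are made up for illustration; the real names and addresses still have to be filled in by hand.

```shell
#!/bin/sh
# Sketch: read `svn log --quiet` output on stdin and print one
# "login = login <login@example.invalid>" line per distinct committer.
# The function name and the placeholder address are hypothetical.
authors_skeleton() {
    # Revision lines look like: r992 | solar | 2021-02-01 21:15:12 +0000 (...)
    awk -F ' [|] ' '/^r[0-9]+ [|] / { print $2 }' \
        | sort -u \
        | while IFS= read -r login; do
              printf '%s = %s <%s@example.invalid>\n' "$login" "$login" "$login"
          done
}

# Possible usage (talks to the SVN server, so it is left commented out):
#   svn log --quiet svn://rootdirectory.ddns.net/pdclib | authors_skeleton > authors.txt
```

After generating the skeleton, replace each right-hand side with the proper name and email address before fetching.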
Fetch the data
Now we can fetch all the revision data from Subversion. For this, we need git svn again because we are talking to an SVN repo:
git svn fetch --authors-file=authors.txt
If git svn finds a user name not mapped in authors.txt, it will stop with an error message.
After this step is complete (which might take a while), the latest Subversion information is available to Git.
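Since a missing mapping interrupts a fetch that might already have been running for a while, it can be worth checking the authors file beforehand. This is a rough sketch of such a pre-flight check, assuming the authors file uses the "login = Name <email>" layout shown above; the function name unmapped_authors is hypothetical.

```shell
#!/bin/sh
# Sketch: read `svn log --quiet` output on stdin and print every committer
# login that has no entry in the given authors file.
# The function name is hypothetical.
unmapped_authors() {
    authors_file=$1
    # Logins already mapped: everything left of the "=" on each line.
    known=$(sed 's/[[:space:]]*=.*//' "$authors_file")
    # Logins seen in the SVN log, minus the known ones (exact, whole-line match).
    awk -F ' [|] ' '/^r[0-9]+ [|] / { print $2 }' \
        | sort -u \
        | grep -Fxv -e "$known" || true
}

# Possible usage (talks to the SVN server, so it is left commented out):
#   svn log --quiet svn://rootdirectory.ddns.net/pdclib | unmapped_authors authors.txt
```

An empty output means every committer in the log has a mapping.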
Git Branches
Let's set up the shepherd branch in Git.
git branch --no-track shepherd
I selected --no-track because, for all practical purposes, the shepherd branch is a separate project. That might not apply to your project, or be a dumb idea outright, but it's what I did.
Sync Branches
Now here comes the trick. Git users do not really like git svn, as it is clunky to use. Ideally, a Git user would not “see” the SVN plumbing at all – and that is exactly what we will do here.
We set up two additional branches for the sole purpose of synchronizing SVN <> GIT:
git branch --no-track trunksvn
git branch --no-track shepherdsvn
Then we link up both of these branches to their respective SVN remote. (This is where Ben Lobaugh's tutorial missed a step, the checkout of the sync branch):
git checkout trunksvn
git reset --hard remotes/svn/trunk
If you look at the contents of your directory, you will now see the trunk version of your project.
Make sure that everything is up-to-date (which it will be anyway at this point, but get into the habit early). We are talking to Subversion again, so git svn it is:
git svn rebase
Now switch to the target branch, and merge in the sync branch. We need to tell Git explicitly that it is OK to merge master (which contains nothing but .gitignore at this time) with the sync branch despite the two having nothing in common (yet). This is where an empty Git repo would have been easier, but I wanted to show you the option to make it work anyway:
git checkout master
git merge trunksvn --allow-unrelated-histories
We do the same thing for the second branch-to-be-synced:
git checkout shepherdsvn
git reset --hard remotes/svn/shepherd
git svn rebase
git checkout shepherd
git merge shepherdsvn --allow-unrelated-histories
Push Branches
We can now push the shepherd branch to upstream / origin (i.e. GitHub), marking our local branch as “tracking” that upstream branch in the process:
git push --set-upstream origin shepherd
Then we switch to our local master branch (which is already tracking upstream / origin), and push that as well:
git checkout master
git push
Any Git user cloning our GitHub repo now will see only master and shepherd. Neither the two sync branches nor the SVN remote will be visible to them, nor do they need to touch git svn, which is as it should be.
Sync SVN -> Git
To update the Git repo with changes made to SVN, we follow these steps (in our local clone directory which does have the SVN remote and the sync branches):
git checkout trunksvn
git svn rebase
git checkout master
git merge trunksvn
git push origin master
Or, for the shepherd branch:
git checkout shepherdsvn
git svn rebase
git checkout shepherd
git merge shepherdsvn
git push origin shepherd
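As the two command sequences differ only in the branch names, they could be wrapped in a small shell function. This is just a sketch of how I might do it, not something the setup requires; the function name sync_svn_to_git and its argument order (sync branch first, target branch second) are made up for illustration.

```shell
#!/bin/sh
# Sketch: pull new SVN revisions into a sync branch, merge them into the
# matching Git branch, and push that branch to origin.
# Usage: sync_svn_to_git SYNC_BRANCH TARGET_BRANCH
# The function name and argument order are hypothetical.
sync_svn_to_git() {
    sync_branch=$1
    target_branch=$2
    # Chained with && so that a failing step (e.g. a merge conflict)
    # stops the sequence instead of pushing a half-finished state.
    git checkout "$sync_branch" &&
    git svn rebase &&
    git checkout "$target_branch" &&
    git merge "$sync_branch" &&
    git push origin "$target_branch"
}
```

Calling sync_svn_to_git trunksvn master or sync_svn_to_git shepherdsvn shepherd from inside the local clone would then run the same steps as listed above.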
Sync Git -> SVN
To update the SVN repo with changes made to Git, we follow these steps (in our local clone directory which does have the SVN remote and the sync branches):
git checkout master
git pull origin master
git checkout trunksvn
git svn rebase
git merge --no-ff master
git commit
git svn dcommit
Or, for the shepherd branch:
git checkout shepherd
git pull origin shepherd
git checkout shepherdsvn
git svn rebase
git merge --no-ff shepherd
git commit
git svn dcommit
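This direction can be wrapped in a function in the same way. Again a sketch only, with a hypothetical function name sync_git_to_svn and argument order (Git branch first, sync branch second). One deliberate difference from the list above: git merge --no-ff already creates the merge commit when there are no conflicts, and a git commit with nothing to commit exits with an error, so the sketch leaves that step out. If the merge stops on a conflict, the chain ends there, and you resolve, commit and dcommit by hand.

```shell
#!/bin/sh
# Sketch: pull the latest Git changes from origin, merge them into the
# sync branch, and commit the result to SVN.
# Usage: sync_git_to_svn GIT_BRANCH SYNC_BRANCH
# The function name and argument order are hypothetical.
sync_git_to_svn() {
    git_branch=$1
    sync_branch=$2
    # Chained with && so that nothing is dcommitted after a failed step.
    git checkout "$git_branch" &&
    git pull origin "$git_branch" &&
    git checkout "$sync_branch" &&
    git svn rebase &&
    git merge --no-ff "$git_branch" &&
    git svn dcommit
}
```

Calling sync_git_to_svn master trunksvn or sync_git_to_svn shepherd shepherdsvn from inside the local clone covers the two cases above.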
Re-Building
If, for some reason, you lose that Git setup created above, you need to re-build it:
git clone git@github.com:/DevSolar/pdclib
cd pdclib
git svn init -s svn://rootdirectory.ddns.net/pdclib --prefix=svn/
Now you need to re-build the authors.txt file (as above). Then comes the tricky part: If your SVN repository has moved ahead of your Git repository (i.e. you sync Git → SVN), you need to re-build the setup with the SVN revision you left off with. You can find the revision number of each synced SVN commit in git log, in the git-svn-id:
commit 5950958ff57391789d9a164a56cd1ed87dedaa12 (HEAD -> master, origin/master, origin/HEAD)
Merge: 02e56d5 ec5835f
Author: Martin Baute <solar@rootdirectory.de>
Date:   Tue Feb 2 10:59:34 2021 +0100

    Merge branch 'trunksvn'

commit ec5835f129d8f9629d334657d4c31b40d6190724
Author: solar <solar@bcf39385-58cc-4174-9fcf-14f50f90dd47>
Date:   Mon Feb 1 21:15:12 2021 +0000

    git-svn-id: https://srv183.svn-repos.de/dev34/pdclib/trunk@992 bcf39385-58cc-4174-9fcf-14f50f90dd47
                                                               ^^^
A bit easier is to have the computer extract the number for you:
REVISION=$(git log | grep git-svn-id | head -n1 | sed -e "s/.*@//" -e "s/ .*//")
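For what it is worth, here is a slightly stricter variant of that extraction as a sketch. It only accepts lines that actually start with git-svn-id (after the indentation git log adds) and only captures digits after the @, so an @ elsewhere in a commit message cannot confuse it. The function name last_svn_revision is hypothetical.

```shell
#!/bin/sh
# Sketch: read `git log` output on stdin and print the SVN revision number
# from the first (i.e. most recent) git-svn-id line.
# The function name is hypothetical.
last_svn_revision() {
    # A git-svn-id line looks like:
    #     git-svn-id: <url>@<revision> <repository-uuid>
    sed -n 's/^ *git-svn-id: .*@\([0-9][0-9]*\) .*/\1/p' | head -n 1
}

# Possible usage inside the clone (left commented out):
#   REVISION=$(git log | last_svn_revision)
```

Fed the log excerpt shown above, this prints 992.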
Now fetch everything from your SVN repository up to that revision:
git svn fetch -r0:$REVISION --authors-file=authors.txt
Set up the sync branch, and link it to the SVN remote:
git branch --no-track trunksvn
git checkout trunksvn
git reset --hard remotes/svn/trunk
Now rebase the branch to what you already fetched. The --local option keeps git svn from connecting to the repository (which would fetch the revisions by which SVN is ahead, and we do not want that at this point).
git svn rebase --local
Now we merge the sync branch to our master. This is basically a no-op, but it sets the merge point from which we will proceed.
git checkout master
git merge trunksvn --allow-unrelated-histories
Now we are set up again, and can sync the SVN revisions we left out previously by the “normal” procedure.
git checkout trunksvn
git svn rebase
git checkout master
git merge trunksvn
git push origin master
Conclusion
I hope this little how-to helps you settle the holy VCS war.