====== Creating a two-way SVN <> GitHub sync ======
===== Intro =====
When migrating [[pdclib:start | PDCLib]] from Bitbucket / Mercurial to a local Subversion, I wanted to provide the advocates of Distributed VCS with a way to get the PDCLib sources "their way".
As Erin Shepherd pointed out in one of the many emails we exchanged, Git seems to have pretty much won the Version Control battle. I still much prefer Subversion, but I realize that being present on GitHub would certainly not hurt the project.
But that meant keeping the Subversion repo and the GitHub repo in sync... and Erin was quite certain that ''git-svn'' was the worst of both worlds combined. So it'd better be a //real// Git repository, and not some SVN / Git chimera -- I wanted to provide "real" Git access to the PDCLib sources.
There were multiple how-tos available online on how to achieve a two-way sync, but none of them really worked the way they were supposed to. [[https://ben.lobaugh.net/blog/147853/creating-a-two-way-sync-between-a-github-repository-and-subversion | Ben Lobaugh's article]] came really close, and was the most helpful.
In the end, I figured a write-up that is //not// missing one or two crucial steps would be nice, so I did one.
===== Initial Setup =====
So I had the SVN repo at ''[[svn://rootdirectory.ddns.net/pdclib | svn://rootdirectory.ddns.net/pdclib]]''. To complicate matters, that repo had two major branches in it, ''trunk'' and ''shepherd'', which I would have to synchronize with two just-as-separate Git branches.
I set up a GitHub account and a [[https://github.com/DevSolar/pdclib | PDCLib project]], and initialized the project with a ''.gitignore''. It would have been "cleaner" to start with an empty repository, but I wanted to showcase how to deal with this two-way sync //if the Git repository is not empty.//
===== Get a local clone =====
This is easy:
git clone git@github.com:/DevSolar/pdclib
You need to have copied your SSH pubkey to your GitHub account for this kind of authorization to work (click on your avatar in the top-right corner, select "Settings" from the drop-down, then "SSH and GPG keys" in the left sidebar).
Change into the freshly cloned Git directory:
cd pdclib
===== Tell Git about SVN =====
Now we tell Git that there is another repository which we want to fetch data from (a.k.a. "setting up a remote" in Git parlance). As it is a Subversion repo, we need to use ''git svn'' for that:
git svn init -s svn://rootdirectory.ddns.net/pdclib --prefix=svn/
The ''-s'' option tells ''git svn'' that the repository is using the ''trunk / tags / branches'' setup common for SVN, and ''--prefix=svn/'' selects a name for the remote. See the documentation for ''git svn'' for other options.
===== Map the authors =====
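Subversion logs commits with the user's login name. For Git, ''Firstname Lastname <email>'' is usually preferred. You can create an "authors file" that does this mapping, with one ''login = Firstname Lastname <email>'' entry per line. My ''authors.txt'' looks like this (e-mail addresses not shown):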
solar = Martin Baute
erin = Erin Shepherd
cycl0ne = cycl0ne
===== Fetch the data =====
Now we can fetch all the revision data from Subversion. For this, we need ''git svn'' again because we are talking to an SVN repo:
git svn fetch --authors-file=authors.txt
If ''git svn'' finds a user name not mapped in ''authors.txt'', it will abort with an error message.
After this step is complete (which might take a while), the latest Subversion information is available to Git.
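Since an unmapped login aborts the fetch, it can save time to check ''authors.txt'' beforehand. The following is a minimal sketch, not something ''git svn'' provides: both function names are made up for illustration, and the first one assumes the usual ''r<number> | <login> | <date>'' line format of ''svn log --quiet''.

```shell
# Hypothetical helpers, not part of Git or Subversion.

# Reads `svn log --quiet` output on stdin and prints each committer login once.
svn_logins() {
    awk -F ' [|] ' '/^r[0-9]+ [|] / { print $2 }' | sort -u
}

# Reads logins on stdin and prints those that have no "login = ..." line
# in the authors file given as first argument.
missing_authors() {
    while IFS= read -r login; do
        awk -F ' *= *' -v l="$login" \
            '$1 == l { found = 1 } END { exit !found }' "$1" ||
            printf '%s\n' "$login"
    done
}
```

Run as ''svn log --quiet svn://rootdirectory.ddns.net/pdclib | svn_logins | missing_authors authors.txt''. If nothing is printed, every committer has an entry.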
===== Git Branches =====
Let's set up the ''shepherd'' branch in Git.
git branch --no-track shepherd
I selected ''--no-track'' because, for all practical purposes, the ''shepherd'' branch is a separate project. That might not apply to your project, or be a dumb idea outright, but it's what I did.
===== Sync Branches =====
Now here comes the trick. Git users do not really like ''git svn'', as it is clunky to use. Ideally, a Git user would not "see" the SVN plumbing at all -- and that is exactly what we will do here.
We set up //two additional branches// for the //sole purpose// of synchronizing SVN <> Git:
git branch --no-track trunksvn
git branch --no-track shepherdsvn
Then we link up both of these branches to their respective SVN remote. (This is where Ben Lobaugh's tutorial missed a step, the ''checkout'' of the sync branch):
git checkout trunksvn
git reset --hard remotes/svn/trunk
If you look at the contents of your directory, you will now see the ''trunk'' version of your project.
Make sure that everything is up-to-date (which it will be anyway at this point, but get into the habit early). We are talking to Subversion again, so ''git svn'' it is:
git svn rebase
Now switch to the //target// branch, and merge in the sync branch. We need to tell Git explicitly that it is OK to merge ''master'' (which contains nothing but ''.gitignore'' at this time) with the sync branch despite the two having nothing in common (yet). This is where an empty Git repo would have been easier, but I wanted to show you the option to make it work anyway:
git checkout master
git merge trunksvn --allow-unrelated-histories
We do the same thing for the second branch-to-be-synced:
git checkout shepherdsvn
git reset --hard remotes/svn/shepherd
git svn rebase
git checkout shepherd
git merge shepherdsvn --allow-unrelated-histories
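As the link-up is the same for both branches, it could be wrapped in a small shell function. This is only a sketch: ''link_sync_branch'' is a name I made up, and it assumes that both the target branch and the sync branch have already been created with ''git branch --no-track'' as shown above.

```shell
# Hypothetical helper. Usage: link_sync_branch <git-branch> <sync-branch> <svn-ref>
# Stops at the first command that fails.
link_sync_branch() {
    git checkout "$2" &&
    git reset --hard "$3" &&        # point the sync branch at the SVN remote
    git svn rebase &&               # make sure it is up-to-date with SVN
    git checkout "$1" &&
    git merge "$2" --allow-unrelated-histories
}
```

The two sequences above would then read ''link_sync_branch master trunksvn remotes/svn/trunk'' and ''link_sync_branch shepherd shepherdsvn remotes/svn/shepherd''.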
===== Push Branches =====
We can now push the ''shepherd'' branch to upstream / origin (i.e. GitHub), marking our local branch as "tracking" that upstream branch in the process:
git push --set-upstream origin shepherd
Then we switch to our local ''master'' branch (which is already tracking upstream / origin), and push that as well:
git checkout master
git push
Any Git user cloning our GitHub repo now will see //only// ''master'' and ''shepherd''. Neither the two sync branches nor the SVN remote will be visible to them, nor do they need to touch ''git svn'', which is as it should be.
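One way to double-check this from the local clone is to list the branches that have no upstream configured. A small sketch, with a function name of my own choosing; in the setup described here I would expect it to print exactly ''shepherdsvn'' and ''trunksvn'':

```shell
# Hypothetical helper: prints local branches that have no upstream configured.
local_only_branches() {
    # One line per local branch: "<branch> <upstream>", upstream empty if unset.
    git for-each-ref --format='%(refname:short) %(upstream:short)' refs/heads |
        awk 'NF == 1 { print $1 }'
}
```

Note that this only looks at the upstream configuration; it does not ask GitHub which branches actually exist there.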
===== Sync SVN -> Git =====
To update the Git repo with changes made to SVN, we follow these steps (in our local clone directory which **does** have the SVN remote and the sync branches):
git checkout trunksvn
git svn rebase
git checkout master
git merge trunksvn
git push origin master
Or, for the ''shepherd'' branch:
git checkout shepherdsvn
git svn rebase
git checkout shepherd
git merge shepherdsvn
git push origin shepherd
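Since the two sequences differ only in the branch names, they could be folded into one shell function. A minimal sketch (the name ''sync_svn_to_git'' is made up), which stops at the first failing step so that a conflicted merge is never pushed:

```shell
# Hypothetical helper. Usage: sync_svn_to_git <git-branch> <sync-branch>
sync_svn_to_git() {
    git checkout "$2" &&
    git svn rebase &&          # pull new SVN revisions into the sync branch
    git checkout "$1" &&
    git merge "$2" &&          # bring them over to the "real" Git branch
    git push origin "$1"
}
```

Called as ''sync_svn_to_git master trunksvn'' or ''sync_svn_to_git shepherd shepherdsvn''.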
===== Sync Git -> SVN =====
To update the SVN repo with changes made to Git, we follow these steps (in our local clone directory which **does** have the SVN remote and the sync branches):
git checkout master
git pull origin master
git checkout trunksvn
git svn rebase
git merge --no-ff master
git commit
git svn dcommit
Or, for the ''shepherd'' branch:
git checkout shepherd
git pull origin shepherd
git checkout shepherdsvn
git svn rebase
git merge --no-ff shepherd
git commit
git svn dcommit
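The same steps as a shell function, again only as a sketch with a made-up name. It deviates from the listing above in one point: it leaves out the explicit ''git commit'', on the assumption that the merge goes through without conflicts and therefore records the merge commit itself. If the merge does conflict, the function stops before ''dcommit'', and you resolve and commit by hand.

```shell
# Hypothetical helper. Usage: sync_git_to_svn <git-branch> <sync-branch>
sync_git_to_svn() {
    git checkout "$1" &&
    git pull origin "$1" &&        # get the latest state from GitHub
    git checkout "$2" &&
    git svn rebase &&              # get the latest state from SVN
    git merge --no-ff "$1" &&      # assumption: no conflicts, merge commits itself
    git svn dcommit                # send the result to SVN
}
```

Called as ''sync_git_to_svn master trunksvn'' or ''sync_git_to_svn shepherd shepherdsvn''.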
===== Re-Building =====
If, for some reason, you lose that Git setup created above, you need to re-build it:
git clone git@github.com:/DevSolar/pdclib
cd pdclib
git svn init -s svn://rootdirectory.ddns.net/pdclib --prefix=svn/
Now you need to re-build the ''authors.txt'' file (as above). Then comes the tricky part: If your SVN repository has moved //ahead// of your Git repository (i.e. you sync Git -> SVN), you need to re-build the setup //with the SVN revision you left off with//. You can find the revision number of each synced SVN commit in ''git log'', in the ''git-svn-id'':
commit 5950958ff57391789d9a164a56cd1ed87dedaa12 (HEAD -> master, origin/master, origin/HEAD)
Merge: 02e56d5 ec5835f
Author: Martin Baute
Date: Tue Feb 2 10:59:34 2021 +0100
Merge branch 'trunksvn'
commit ec5835f129d8f9629d334657d4c31b40d6190724
Author: solar
Date: Mon Feb 1 21:15:12 2021 +0000
git-svn-id: https://srv183.svn-repos.de/dev34/pdclib/trunk@992 bcf39385-58cc-4174-9fcf-14f50f90dd47
The revision is the number following the ''@'' in the ''git-svn-id'' line, ''992'' in this example.
A bit easier is to have the computer extract the number for you:
REVISION=$(git log | grep git-svn-id | head -n1 | sed -e "s/.*@//" -e "s/ .*//")
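If no commit in the log carries a ''git-svn-id'', the one-liner above leaves ''REVISION'' empty, which you would want to notice before fetching. A slightly more defensive variant, as a sketch; the function name is my own, and it assumes the ''git-svn-id: <url>@<revision> <uuid>'' layout shown in the log excerpt:

```shell
# Hypothetical helper: reads `git log` output on stdin and prints the SVN
# revision of the first (i.e. most recent) git-svn-id line.
# Returns non-zero if no such line is found.
last_synced_revision() {
    rev=$(sed -n 's/^ *git-svn-id: [^ ]*@\([0-9][0-9]*\) .*/\1/p' | head -n 1)
    [ -n "$rev" ] && printf '%s\n' "$rev"
}
```

Used as ''REVISION=$(git log | last_synced_revision) || echo "no git-svn-id found"''.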
Now fetch everything from your SVN repository //up to that revision//:
git svn fetch -r0:$REVISION --authors-file=authors.txt
Set up the sync branch, and link it to the SVN remote:
git branch --no-track trunksvn
git checkout trunksvn
git reset --hard remotes/svn/trunk
Now rebase the branch //to what you already fetched//. The ''--local'' option keeps ''git svn'' from connecting to the repository (which would fetch the revisions by which SVN is ahead, which we do not want at this point).
git svn rebase --local
Now we merge the sync branch to our master. This is basically a no-op, but it sets the merge point from which we will proceed.
git checkout master
git merge trunksvn --allow-unrelated-histories
Now we are set up again, and can sync the SVN revisions we left out previously by the "normal" procedure.
git checkout trunksvn
git svn rebase
git checkout master
git merge trunksvn
git push origin master
===== Conclusion =====
I hope this little how-to helps you settle the holy VCS war.