SVN Users and Git Authors
Both Subversion and Git record an author for every commit, but they represent that author differently.
In SVN, the author is stored in the unversioned revision property svn:author. Every time a Subversion user makes a commit, SVN creates a new revision and sets its svn:author property to that user's name, for example, johndoe:
------------------------------------------------------------------------
r163 | johndoe | 2017-06-07 20:22:15 +0500 (Wed, 07 Jun 2017) | 1 line
Changed paths:
   A /project
   A /project/branches
   A /project/tags
   A /project/trunk

initial layout for the project
------------------------------------------------------------------------

$ svn proplist -v --revprop --revision 163
Unversioned properties on revision 163:
  svn:author
    johndoe
  svn:date
    2017-06-07T15:22:15.655243Z
  svn:log
    initial layout for the project
Git also stores an author name with each commit, but it differs from the SVN one: whereas SVN stores the actual username, a Git user identity consists of a name and an email address:
Git User <gituser@domain.com>
This name and email are not related to the username used to log in to the Git repository; they are set in the Git configuration, for example with the commands below:
$ git config --global user.name "John Doe"
$ git config --global user.email johndoe@example.com
The Git user John Doe is then referred to as
John Doe <johndoe@example.com>
This exact line then appears as the author name in every commit John Doe makes.
It is worth mentioning that Git stores not only an author name but also a committer name:
$ git cat-file commit HEAD
tree 905df23db37b33320483fc6676bfc684078ed248
parent 4a0cf06baa9aefaa20a13820265ef401d7b1c2b6
author John Doe <johndoe@example.com> 1496849115 +0000
committer Jane Doe <janedoe@example.com> 1496849115 +0000
The Pro Git book describes the difference between those names as follows: the author is the person who originally wrote the work, whereas the committer is the person who last applied the work. So, if you send in a patch to a project and one of the core members applies the patch, both of you get credit: you as the author, and the core member as the committer.
SubGit translates the Git author name, so the committer name matters little for the SVN-to-Git translation process.
Authors mapping affects licensing
Note that SubGit uses author names to count licensed users; see the Licensing manual for details.
Configuration options
There are two configuration options that relate to authors:
core.authorsFile
This option specifies the path to an authors mapping file or an authors mapping helper program. The path can be absolute or relative to the Git repository. The authors mapping file is a text file that lists pairs of SVN usernames and Git authors; see the Authors File chapter below. The authors mapping helper program is a script or binary executable that provides author data in a specific format; see the Scriptable Authors Mapping chapter below. Note that more than one authorsFile option may be set in the configuration file, e.g.:

[core]
    authorsFile = subgit/authors.txt
    authorsFile = /etc/authors.txt

The contents of all listed files are merged into one list, with one caveat: if an SVN username appears more than once, only its first occurrence is applied. For example, if the SVN username johndoe appears both in subgit/authors.txt:

johndoe = John Doe <johndoe@example.com>

and in /etc/authors.txt:

johndoe = John M. Doe <john_doe@example.com>

then the mapping from subgit/authors.txt is applied, since that file appears before /etc/authors.txt in the list.
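The first-occurrence rule described above can be previewed with a small shell sketch. This is not how SubGit itself is implemented; the function name merge_authors_files is hypothetical, and the sketch assumes every mapping line has the form "svnuser = Git Name <email>".

```shell
# Hypothetical helper: print the effective mapping for several authors files,
# passed in the same order as the authorsFile options in the configuration.
# For each SVN username only the first occurrence is kept.
merge_authors_files() {
    awk '
        /^[ \t]*$/ { next }                 # skip blank lines
        {
            pos = index($0, "=")
            if (pos == 0) next              # skip lines without "="
            key = substr($0, 1, pos - 1)    # SVN username is left of "="
            gsub(/^[ \t]+|[ \t]+$/, "", key)
            if (!(key in seen)) {           # keep only the first occurrence
                seen[key] = 1
                print
            }
        }
    ' "$@"
}
```

Calling merge_authors_files subgit/authors.txt /etc/authors.txt with the two files from the example above would print the johndoe line from subgit/authors.txt only.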
core.defaultDomain
This option provides a domain name that is appended to the username to form an email address in Git when automatic authors mapping is used. SubGit automatically sets this option to the hostname when the subgit configure command is invoked. If the option is not set or is omitted from the configuration file, SubGit does not generate an email address for Git commits, and the author's email appears empty in the commit (just a pair of angle brackets with nothing in between):

$ git log -v
commit d5d46afc3aa33240de8b5200e72611d4e0d72a99
Author: john_doe <>
Date: Tue Jun 6 10:25:02 2017 +0200

minor changes
These are the two author-related SubGit options, but they are not the only configuration that authors mapping may need to work correctly: an additional setting may be required on the SVN side, depending on how SubGit logs in to the SVN repository.
There are two possibilities: SubGit can use one dedicated SVN account to log in to the SVN repository, or it can use several different accounts. The subgit/passwd file is intended to store the list of SVN accounts that SubGit can use to authenticate. When SubGit translates a Git commit into an SVN revision (when a mirror is established), it searches the authors file for the commit author. If there is a match, SubGit then searches the passwd file for that exact SVN username. If the password for that account is found, SubGit uses that username to log in to SVN and create a new revision. In this case, the correct revision author is set automatically, since SubGit is logged in with the correct account.
If SubGit uses one dedicated SVN account (when SVN credentials are cached, when only one SVN account is provided, or when no matching SVN account is found in subgit/passwd), it works a little differently. It connects to SVN and creates a new revision, whose author is set to the SVN username used to log in. The problem is that this username is usually not the correct author name: it might be, but commonly it differs. So SubGit then connects to the SVN server a second time and changes the svn:author property of the newly created revision to the correct author name.
Some additional configuration may be needed here, namely:
- if SVN server 1.7.20, 1.8.12, 1.9.0 or later is used and it is accessed over the http(s):// protocol
- or if the SVN server is accessed over the svn:// protocol
then the pre-revprop-change hook has to be enabled in the SVN repository. This requirement is imposed by SVN itself, which is why changes are needed on the SVN side.
The hook itself is pretty simple: it is just an executable file, script or binary, that may even do nothing but start and exit. So you can create a script as simple as this:
Linux and OS X:
#!/bin/sh
exit 0
Windows:
@echo off
exit 0
place it into the hooks directory of the SVN repository:
SVN_REPOSITORY/
hooks/
pre-revprop-change # for Linux and OS X
pre-revprop-change.bat # for Windows
make the file executable on Linux and OS X:
chmod +x pre-revprop-change
and that's it!
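The steps above can be combined into one small shell function for Linux and OS X. This is only a sketch: the function name install_revprop_hook is hypothetical, and it assumes its argument is the path to the repository directory on the SVN server. It refuses to overwrite an existing hook, because a repository may already have its own pre-revprop-change logic.

```shell
# Install a no-op pre-revprop-change hook into an SVN repository.
# $1: path to the SVN repository directory on the server (assumption).
install_revprop_hook() {
    repo=$1
    hook="$repo/hooks/pre-revprop-change"
    if [ ! -d "$repo/hooks" ]; then
        echo "no hooks directory in $repo" >&2
        return 1
    fi
    if [ -e "$hook" ]; then
        echo "$hook already exists, leaving it untouched" >&2
        return 1
    fi
    # The hook does nothing: it starts and exits successfully.
    printf '#!/bin/sh\nexit 0\n' > "$hook"
    chmod +x "$hook"
}
```

Note that a hook which always exits with 0 allows every revision property change, so a stricter script may be preferable on repositories where that matters.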
Automatic Authors Mapping
When SubGit starts translation between SVN and Git, it looks for authors mapping files or authors helper programs. If none are present, it generates the mapping automatically, following these rules:
- Subversion svnusername is translated to svnusername <svnusername@defaultDomain> in Git
- Git Author Name <email@domain.com> is translated to Author Name in Subversion
Here defaultDomain stands for the core.defaultDomain SubGit configuration option. SubGit sets it to the hostname during subgit configure, but it can be changed later. Also, if subgit configure is invoked with the --layout auto option, SubGit fills the authors file with an automatically generated mapping: it connects to SVN, walks through the project history and records all the SVN users found there. SubGit then generates Git names and emails from those SVN usernames according to the rules above and writes the resulting mapping to the authors file.
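The two automatic mapping rules can be expressed as a pair of shell functions. This is an illustrative sketch rather than SubGit's own code; the function names are hypothetical, and the second argument of svn_to_git_author stands in for the core.defaultDomain value.

```shell
# SVN username -> Git author.
# $1: SVN username, $2: value of core.defaultDomain (may be empty).
svn_to_git_author() {
    if [ -n "$2" ]; then
        printf '%s <%s@%s>\n' "$1" "$1" "$2"
    else
        # No default domain: the email part stays empty.
        printf '%s <>\n' "$1"
    fi
}

# Git author "Author Name <email>" -> SVN author: drop the email part.
git_to_svn_author() {
    printf '%s\n' "$1" | sed 's/ *<[^>]*> *$//'
}
```

For instance, svn_to_git_author john_doe git.example.com yields john_doe <john_doe@git.example.com>, and git_to_svn_author 'John Doe <johndoe@example.com>' yields John Doe.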
Say a user makes commits as the SVN user john_doe; an SVN revision made by that user may look like:
------------------------------------------------------------------------
r167 | john_doe | 2017-06-06 10:25:02 +0200 (Tue, 06 Jun 2017) | 1 line
Changed paths:
M /project/trunk/foo.c
minor changes
------------------------------------------------------------------------
At some point, the SVN project is translated to Git. If no explicit authors mapping is provided, SubGit creates an automatic mapping according to the rules above, so revision 167 shown above looks like this in Git:
$ git log -v
commit d5d46afc3aa33240de8b5200e72611d4e0d72a99
Author: john_doe <john_doe@git.example.com>
Date: Tue Jun 6 10:25:02 2017 +0200
minor changes
assuming the Git machine has the hostname 'git.example.com'.
And vice versa, if the user John Doe <johndoe@example.com> makes a commit to the Git repository:
commit 7faaf52c41a0325d4686f2a6f2851dc3e3739136
Author: John Doe <johndoe@example.com>
Date: Thu Jun 8 20:06:31 2017 +0200
minor changes to bar.c
once mirrored to SVN, it looks like:
------------------------------------------------------------------------
r173 | John Doe | 2017-06-08 20:06:31 +0200 (Thu, 08 Jun 2017) | 1 line
Changed paths:
M /project/trunk/bar.c
minor changes to bar.c
------------------------------------------------------------------------
Note that since the SVN username and the Git user.name commonly differ, the licensed committers counter might be affected; see the details in chapter 2.
Authors File
The authors mapping file is a plain text file that contains pairs of SVN usernames and Git authors. Each pair maps an SVN username to a Git author like this:
svn_user_name = Git Name <gitname@domain.name>
e.g. for a user named John Doe, the mapping can be set as:
john_doe = John Doe <john_doe@example.com>
Every SVN or Git user who commits either to the SVN project or to the mirrored Git repository is supposed to have a mapping pair in the authors file, and each pair must reside on its own line.
During SVN-to-Git translation, SubGit takes the author name of an SVN revision and searches the authors file for a match. If there is a matching line, SubGit uses the corresponding Git author to create the commit in Git; otherwise, SubGit constructs the Git commit author using automatic mapping. And vice versa: when translating a Git commit to an SVN revision, SubGit searches the file and uses the corresponding SVN username to create the SVN revision; if there is no matching pair in the file, automatic mapping is used.
Note that it is possible to map two different SVN usernames to the same Git author, say, when one team member commits under two identities or an SVN username was renamed at some point. In such a case the configuration might look like this:
john_doe = John Doe <john_doe@example.com>
johndoe = John Doe <john_doe@example.com>
Every revision made by john_doe or johndoe is translated to a Git commit with the author John Doe <john_doe@example.com>. But note that when this Git user makes a commit in Git and the commit is translated to SVN, the author on the SVN side is set to the first SVN username that matches that Git author in the authors file. That is, if those two mapping lines appear in the authors file in that exact order (john_doe first, then johndoe), the SVN revision author is always set to john_doe when John Doe's Git commits are translated to SVN.
Similarly, one SVN user can be mapped to different Git authors, e.g.:
jdoe = John Doe <john_doe@example.com>
jdoe = Jane Doe <jane_doe@example.com>
jdoe = James Doe <james_doe@example.com>
Again, every Git commit made by those authors is translated to SVN with the revision author set to jdoe; but SVN revisions made by the jdoe SVN user are always translated with the first matching Git author in the authors file, John Doe <john_doe@example.com> in this particular case.
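The "first match wins" behavior in both directions can be illustrated with a short shell sketch. The function lookup_author is hypothetical and is not part of SubGit; it assumes that a Git author is matched by the exact "Name <email>" string, which may differ from how SubGit actually compares authors.

```shell
# Look up an author in an authors file; the first matching line wins.
# usage: lookup_author svn SVN_USERNAME AUTHORS_FILE   -> prints Git author
#        lookup_author git 'Name <email>' AUTHORS_FILE -> prints SVN username
# Returns non-zero when no line matches.
lookup_author() {
    awk -v mode="$1" -v want="$2" '
        {
            pos = index($0, "=")
            if (pos == 0) next                  # not a mapping line
            svn = substr($0, 1, pos - 1)        # left of "=": SVN username
            git = substr($0, pos + 1)           # right of "=": Git author
            gsub(/^[ \t]+|[ \t]+$/, "", svn)
            gsub(/^[ \t]+|[ \t]+$/, "", git)
            if (mode == "svn" && svn == want) { print git; found = 1; exit }
            if (mode == "git" && git == want) { print svn; found = 1; exit }
        }
        END { exit !found }
    ' "$3"
}
```

With the jdoe example above, lookup_author svn jdoe on that file prints John Doe <john_doe@example.com>, while looking up any of the three Git authors in git mode prints jdoe.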
Scriptable Authors Mapping
In addition to the authors file, there is another way to provide SVN-to-Git authors mapping: an authors helper program. The authors helper is an executable, script or binary, that reads data from standard input and writes its result to standard output. Both the input the helper reads and the output it produces must follow a certain format:
for Git to Subversion mapping:
INPUT:
  Author Name
  author email
OUTPUT:
  Subversion username
for Subversion to Git mapping:
INPUT:
  Subversion username
OUTPUT:
  Author Name
  author email
Every time SubGit finds an author name during translation, it invokes the authors mapping helper program, passes the name to it and expects the helper to answer with the matching author name.
The authors helper program can be especially useful when you have many authors and the list is constantly changing: new users are added, names and emails change, and so on. If you use a directory service to store accounts (LDAP, Active Directory, OpenID and so forth), you can create a script that gathers the needed information from it and provides it to SubGit.
During the configuration or installation phase, SubGit places a simple authors.sh script into the subgit/samples directory. This script does not do anything useful by itself; it is just a proof of concept that demonstrates how input data is read and how output data is provided. The script is as simple as:
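#!/bin/sh
# Read up to two lines from standard input: a name and, optionally, an email.
while read -r input
do
    if [ -z "$name" ]; then
        name="$input"
    elif [ -z "$email" ]; then
        email="$input"
    fi
done

if [ -z "$email" ]; then
    # One line received (an SVN username): reply with Git name and email.
    echo "Full Name"
    echo "author@email.com"
else
    # Two lines received (Git name and email): reply with the SVN username.
    echo "shortSvnUserName"
fi
exit 0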
Depending on what was sent to its input, the script returns either a Git author name and email or a short SVN username. It can be extended to, say, fetch the data from a directory or database, thereby simplifying the authors mapping.
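As one possible extension, the sketch below replaces the hard-coded answers with a lookup in a flat table. Everything specific here is an assumption made for illustration: the AUTHORS_TABLE variable, its default path, and the "svn_user:Full Name:email" record format are hypothetical, and how SubGit reacts to empty output or a non-zero exit status is not covered by this text.

```shell
# Hypothetical lookup table: one "svn_user:Full Name:email" record per line.
AUTHORS_TABLE=${AUTHORS_TABLE:-/etc/subgit-authors.table}

authors_helper() {
    name=
    email=
    # Same input protocol as the sample script: one or two lines on stdin.
    while IFS= read -r input; do
        if [ -z "$name" ]; then
            name=$input
        elif [ -z "$email" ]; then
            email=$input
        fi
    done

    if [ -z "$email" ]; then
        # One input line: SVN username -> Git name and email on two lines.
        awk -F: -v u="$name" \
            '$1 == u { print $2; print $3; found = 1; exit } END { exit !found }' \
            "$AUTHORS_TABLE"
    else
        # Two input lines: Git name and email -> SVN username.
        awk -F: -v n="$name" -v e="$email" \
            '$2 == n && $3 == e { print $1; found = 1; exit } END { exit !found }' \
            "$AUTHORS_TABLE"
    fi
}

# When saved as a standalone helper script, the last line would simply be:
#   authors_helper
```

The table lookup could be swapped for a query against LDAP or a database without changing the input and output handling.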