adrian_broher wrote: The simplest solution I can offer is to write a migration script that fixes the inconsistencies.
The major issue is how to get all the branches and tags imported when we point the importer to <repo-url>/trunk/FreeOrion. A script that manages this doesn't sound trivial to code.
After looking further into the matter I doubt that the repository can be used without a scrubbing/cleanup.
Why? The test runs I did so far produced perfectly usable git repos. I guess it depends on what we want to do:
Serious issues I found so far:
* Some branches were committed with the FreeOrion root dir, some without. This leads to a disconnected history and will cause trouble when merging.
These branches are, without exception, not active anymore. All of them were merged prior to the migration (except the 0.4.4 release branch, which will neither receive further updates nor get merged), so none of them is ever going to be merged in git. So this shouldn't be an issue.
If there were active branches we intend to continue to work on and merge after the migration, then you're right, this would not be possible without scrubbing/cleanup.
Because we didn't want to open that can of worms, we merged all active branches to trunk before we considered the switch to git/github.
The point of including the branches and tags in the migration is to preserve the commit history, nothing else. In fact, I strongly recommend deleting the branches after the import (which will just remove the branch markers pointing to the latest commits of the branches).
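As a rough sketch of that cleanup step: assuming the importer leaves the old branches behind as local git branches, the markers could be dropped with `git branch -D`. The helper below only builds the command lines and does not run them; the function name, the kept branch `master`, and the example branch names are made up for illustration.

```python
def branch_delete_commands(branches, keep=("master",)):
    """Build one `git branch -D <name>` command per imported branch.

    `keep` lists branches that must survive (assumed here: "master").
    Deleting a branch only removes the marker; commits that were
    merged stay reachable from the branch they were merged into.
    """
    return [["git", "branch", "-D", name]
            for name in branches
            if name not in keep]


if __name__ == "__main__":
    # Hypothetical branch names, purely for illustration.
    for cmd in branch_delete_commands(["master", "old-feature"]):
        print(" ".join(cmd))
```

If the importer stores the branches as remote-tracking refs instead, the commands would need to be adjusted accordingly.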
* Some tags are tagged with the FreeOrion root dir, some without. This leads to a disconnected history.
Yeah, I've already observed that, not ideal, but IMO bearable.
* There are some big binary files deep within the history, which will bloat the repository download unnecessarily.
How big of a difference will it really be if we remove these big binary files? Currently the entire repo is at roughly 770MB. That's not nothing, but manageable I guess. After all, you don't clone the repo every day. And 100MB more or less doesn't make that much of a difference.
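To put a number on that, one could list the largest blobs in the converted repo. This is only a sketch: it assumes the input lines come from `git rev-list --objects --all | git cat-file --batch-check='%(objecttype) %(objectname) %(objectsize) %(rest)'`, and the function name and the 10 MiB threshold are arbitrary choices for illustration.

```python
def largest_blobs(batch_check_lines, min_bytes=10 * 1024 * 1024):
    """Return (size, path) pairs for blobs of at least `min_bytes`, largest first.

    Each input line is expected to look like:
        <objecttype> <objectname> <objectsize> <path>
    """
    found = []
    for line in batch_check_lines:
        parts = line.rstrip("\n").split(" ", 3)
        if len(parts) < 4 or parts[0] != "blob":
            continue  # skip commits, trees, and malformed lines
        size = int(parts[2])
        if size >= min_bytes:
            found.append((size, parts[3]))
    return sorted(found, reverse=True)
```

Summing the sizes it reports gives an upper bound on what removing those files could save; the actual saving will be smaller, since git compresses objects.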
Not so serious issues I found so far:
* Commits without commit message.
* Commits without author.
* Inconsistent user naming.
* Inconsistent commit message style.
I think we can live with that. Fixing that would mean a lot of effort for only limited benefit.
More important is to avoid these inconsistencies in the future, so let's try to adhere to git commit message guidelines. Committing without a message and/or author isn't possible with git by default anyway, AFAIK. What is inconsistent user naming...?
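For what it's worth, a check along those lines could look roughly like this. It is a minimal sketch of the commonly cited convention (short subject line, blank line, then the body); the function name and the 50-character limit are assumptions, not an agreed project rule.

```python
def commit_message_problems(message, max_subject=50):
    """Return a list of style problems found in a commit message."""
    lines = message.splitlines()
    if not lines or not lines[0].strip():
        return ["empty subject line"]
    problems = []
    if len(lines[0]) > max_subject:
        problems.append("subject longer than %d characters" % max_subject)
    # A body, if present, should be separated from the subject by a blank line.
    if len(lines) > 1 and lines[1].strip():
        problems.append("missing blank line after subject")
    return problems
```

Something like this could be wired into a commit hook later, but that's a separate discussion.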