History
History can be preserved for all of these approaches using git svn: http://git-scm.com/book/en/Git-and-Other-Systems-Migrating-to-Git It even remains possible to revert to previous commits.
However, there were suggestions not to preserve the history at all: freeze the svn repository for about six months and let all new history accumulate in the git repository. I disagree with such advice, because history is important for our project, and I bet no one will actually make that call.
Giant Monorepo Approach
- You must clone the entire large tree even if you plan to work in only one subdirectory (the common case)
- Some git commands will be slow (for example git status, which has to examine the whole tree)
- Even if you configure Jenkins to run builds only for certain parts of the repo (this can be done with the "included regions" setting of the Jenkins Git plugin), it still has to check out the entire repo to complete a build. This is unlikely to dominate the overall time, because a "clean" build takes a long time even for small modules.
Concern: with 200+ Dev and QA people in total, I suspect it will be quite difficult to get changes pushed in the end.
- Changes land on the master branch only after the review is approved in Gerrit and the tests have passed, so we won't have a constant push, break, pull, fix-by-pushing churn
- However, Gerrit may reject the merge if the master branch has changed since the commit was pushed to Gerrit; you will then need to click "Rebase" and rerun the tests.
- The Linux kernel lives in a monolithic repo because C/C++ has no dependency management like Java's: building a kernel is nothing like building a WAR against a repository of dependency JARs.
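The rebase-and-retest situation above is easy to reproduce locally. A minimal sketch with a throwaway repo (all file and branch names are invented for illustration): master advances while a change sits on a feature branch, so the change has to be rebased, after which CI must run again.

```shell
#!/bin/sh
set -e
tmp=$(mktemp -d)
git init -q -b master "$tmp/repo"
cd "$tmp/repo"
git config user.email dev@example.com
git config user.name Dev
echo base > file.txt
git add file.txt && git commit -qm "base"
# A change under review lives on a feature branch...
git checkout -q -b feature
echo feature > feature.txt
git add feature.txt && git commit -qm "feature change"
# ...meanwhile master advances, so the change can no longer fast-forward.
git checkout -q master
echo other > other.txt
git add other.txt && git commit -qm "unrelated master change"
# What Gerrit's "Rebase" button effectively does; tests must rerun afterwards.
git checkout -q feature
git rebase -q master
echo "rebased cleanly"
```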
Questions
What are the steps, their cost, and the total cost of migration with this approach?
- git svn clone SVN_URL REPO_NAME
- Jenkins setup
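For reference, a hedged sketch of the clone step, assuming a standard trunk/branches/tags layout and an authors.txt file mapping svn usernames to git identities (both are assumptions about our setup):

```shell
# SVN_URL and REPO_NAME as above; the flags are assumptions about the layout.
git svn clone --stdlayout --authors-file=authors.txt SVN_URL REPO_NAME
```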
How well does it support continuous integration and code review? What changes are needed from the VCS/tools perspective? Assume a full CI run takes 15 minutes.
- Jenkins must have an "include" filter in the SCM trigger to react only to changes in a specific part of the project. This is not that difficult, but it still takes some effort to configure and verify. In the case of "wipe the workspace before build", the entire repo has to be cloned every time. This can increase the overall time from commit to "tests passed", because verification will be rather slow.
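The effect of such an include filter can be sketched with plain git; this is roughly the decision the plugin's path filter makes for us (module names are made up):

```shell
#!/bin/sh
set -e
tmp=$(mktemp -d)
git init -q -b master "$tmp/monorepo"
cd "$tmp/monorepo"
git config user.email dev@example.com
git config user.name Dev
mkdir -p module-a module-b
echo a > module-a/A.java
echo b > module-b/B.java
git add . && git commit -qm "initial"
echo a2 > module-a/A.java
git commit -qam "touch module-a only"
# Decide which module builds the last commit should trigger:
if git diff --name-only HEAD~1 HEAD | grep -q '^module-a/'; then
  echo "build module-a"
fi
if git diff --name-only HEAD~1 HEAD | grep -q '^module-b/'; then
  echo "build module-b"
fi
```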
What do the developer workflows look like?
- Developers use local/remote feature branches
- Push changes to Gerrit
- Gerrit verifies the change by running the tests
- The change is merged into the master branch
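A minimal sketch of this round-trip, assuming the usual Gerrit convention of pushing to refs/for/&lt;branch&gt; (the branch and remote names are assumptions):

```shell
git checkout -b my-feature origin/master
# ...edit files, then:
git commit -am "Describe the change"
git push origin HEAD:refs/for/master   # upload the change to Gerrit for review
# If Gerrit reports the change is outdated ("needs rebase"):
git fetch origin
git rebase origin/master               # then push again as a new patch set
```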
Submodules
Most of the caveats are described here: http://git-scm.com/book/en/Git-Tools-Submodules and here: http://codingkilledthecat.wordpress.com/2012/04/28/why-your-company-shouldnt-use-git-submodules/
The main problem is that you have to commit twice:
- To the submodule itself
- To the aggregating repo, to update the submodule pointer. This makes no sense for us: why would you ever need an aggregating repo at all if dependencies are managed through an artifact repository?
Submodules are really designed for the case where a library is reused across different projects and you want to depend on a specific tag of that library, with the option of updating the reference later. However, we will not tag every commit (we tag only releases), and bumping the versions of (released) dependencies in a WAR is easier than maintaining submodules. Java dependency management already makes this easy.
Pointing a submodule at a branch head is discouraged and leads to problems, so this approach is a dead end for snapshots. And again, we don't need it, because Java dependency management does all of this for us.
Questions
What are the steps, their cost, and the total cost of migration with this approach?
- git svn clone SVN_URL REPO_NAME for each module
- Create the aggregating git repo
- Add the module repositories as submodules of the aggregating repo
How well does it support continuous integration and code review? What changes are needed from the VCS/tools perspective? Assume a full CI run takes 15 minutes.
- Gerrit supports both merges and commits with submodules, so it should cope well.
- Jenkins setup: triggers on submodule changes and on aggregating-repo changes (ugh, the same thing in two places!)
What do the developer workflows look like? (The Gerrit process is omitted.)
- Developer commits to the submodule
- Creates a tag to pin it
- Developer switches to the aggregating repo
- cd's into the submodule and checks out the tag
- Commits the aggregating repo with the updated submodule hash
Or:
- Developer changes the submodule
- Pushes the submodule changes so they are not lost
- Commits the aggregating repo with the updated submodule hash
As you can see, the developer workflow is cumbersome (you always have to update two places) and does not meet our needs.
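The double commit is easy to demonstrate with two throwaway local repos (all names are invented; note that recent git blocks file-protocol submodules by default, hence the protocol.file.allow override):

```shell
#!/bin/sh
set -e
tmp=$(mktemp -d)
# A reusable module and an aggregating repo:
git init -q -b master "$tmp/lib"
cd "$tmp/lib"
git config user.email dev@example.com && git config user.name Dev
echo v1 > lib.txt
git add . && git commit -qm "lib v1"
git init -q -b master "$tmp/aggregate"
cd "$tmp/aggregate"
git config user.email dev@example.com && git config user.name Dev
git -c protocol.file.allow=always submodule --quiet add "$tmp/lib" lib
git commit -qm "add lib submodule"
# Commit #1: change the module itself.
cd "$tmp/lib"
echo v2 > lib.txt
git commit -qam "lib v2"
# Commit #2: move the aggregating repo's submodule pointer.
cd "$tmp/aggregate/lib"
git fetch -q origin
git checkout -q origin/master
cd "$tmp/aggregate"
git commit -qam "bump lib submodule"
```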
Subtrees
The main problem is that you have to commit twice:
- To the subdirectory with the merged tree
- Push the changes back to the original repo
Subtrees are a better alternative to submodules: they are more robust, and they merge the module's source code into the aggregating repo instead of merely referencing it. This makes such an aggregating repo easier to maintain, but subtrees share the same problem as submodules, the double commit. Since you can commit changes in the aggregating repo without ever pushing them back to the original module repo, the two repos can drift out of sync...
The differences are well explained here: http://blogs.atlassian.com/2013/05/alternatives-to-git-submodule-git-subtree/
Questions
What are the steps, their cost, and the total cost of migration with this approach?
- git svn clone SVN_URL REPO_NAME for each module
- Create the aggregating repo
- Perform a subtree merge for each module
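A sketch of the subtree-merge step with throwaway local repos (names are made up). This uses git subtree, which ships in git's contrib directory and is present in most distributions; a manual read-tree merge would achieve the same:

```shell
#!/bin/sh
set -e
tmp=$(mktemp -d)
git init -q -b master "$tmp/module"
cd "$tmp/module"
git config user.email dev@example.com && git config user.name Dev
echo code > Module.java
git add . && git commit -qm "module code"
git init -q -b master "$tmp/aggregate"
cd "$tmp/aggregate"
git config user.email dev@example.com && git config user.name Dev
echo root > README
git add . && git commit -qm "initial"
# Pull the module's history into a subdirectory of the aggregating repo:
git subtree add --prefix=module "$tmp/module" master
test -f module/Module.java && echo "module merged"
```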
How well does it support continuous integration and code review? What changes are needed from the VCS/tools perspective? Assume a full CI run takes 15 minutes.
- Gerrit does not seem to support subtree merges very well ( https://www.google.com/#q=Gerrit+subtrees )
- But we cannot be sure until we try
- Jenkins: triggers on the subtree repositories and on aggregating-repo changes (ugh, the same thing in two places again!)
What do the developer workflows look like? (The Gerrit process is omitted.)
- Developer changes something in a subtree (inside the aggregating repo)
- Developer commits to the aggregating repo
- Developer must not forget to push the change back to the original repo (ugh!)
- Developer must not forget NOT to mix subtree changes with aggregating-repo changes in one commit
Again, as with submodules, it makes no sense to have two places (repos) where the same code and changes live. Not our case.
Separate repositories
Separate repositories look like the best solution and follow git's original intent. The repo granularity may vary. The finest-grained option is one repo per Maven group, but that can lead to too many repositories. We also need to look at how often a single svn transaction touches multiple modules or release groups: if a fix typically touches 3-4 release groups, those groups should form one repo.
I also think it is worth at least decoupling the API modules from the implementation modules.
Questions
What are the steps, their cost, and the total cost of migration with this approach?
- git svn clone SVN_URL REPO_NAME for each more-or-less fine-grained group of modules
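A sketch of this step, assuming hypothetical module names and that each module group lives under its own svn path:

```shell
# Module names are placeholders; SVN_URL as above.
for module in module-a module-b module-c; do
  git svn clone "SVN_URL/$module" "$module"
done
```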
How well does it support continuous integration and code review? What changes are needed from the VCS/tools perspective? Assume a full CI run takes 15 minutes.
- Jenkins works on each repo separately. No "include" filters; just check out, build, deploy.
What do the developer workflows look like?
- Developers use local/remote feature branches in each repo
- Push changes to Gerrit
- Gerrit verifies the change by running the tests
- The change is merged into the master branch