As I see it, you're mostly asking about best practices and tools for release engineering, AKA "releng"; it's important to know the term of art for the subject, since it makes it much easier to find further information.
A configuration management system (CMS, AKA version control system or revision control system) is a must for software development today; if you use one or more IDEs, it also helps to have good integration between them and the CMS, though that matters more for development than for deployment / releng.
From the releng point of view, the key thing about a CMS is that it must have good support for "branching" (whatever name that goes under), since releases must be made from a "release branch", where all the code under development, and all its dependencies (code and data), are in a stable "snapshot" from which the exact, identical configuration can be reproduced at will.
The need for good branching support may be more obvious if you have to maintain multiple branches (configured for different uses, platforms, and so on), but even if your releases always come in a single, strictly linear sequence, releng best practice still dictates making a release branch. "Good branching support" includes easy merging (and "conflict resolution" when the same file has been changed on different branches), "cherry-picking" (taking one patch or changeset from one branch, or from the head / trunk, and applying it to another branch), and so on.
In practice, you start the release process by making the release branch; then you run exhaustive testing on that branch (typically much more than you run every day in your continuous build, including extensive regression testing, integration testing, stress testing, performance testing, and so forth, and possibly even more costly quality-assurance processes, depending on your needs). If and when the exhaustive testing and QA reveal defects in the release candidate (including regressions, performance degradation, and the like), they must be fixed; in a large team, development on the head / trunk can continue while QA is in progress, which is where easy cherry-picking / merging / etc. becomes essential (whether your practice is to make the fixes on the head or on the release branch, they still need to be merged to the other side, too ;-).
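To make the branching mechanics a bit more concrete, here's a minimal sketch assuming a git-based CMS (the advice itself isn't tied to any particular tool); the branch name, the "main" head, and the commit id are purely placeholder examples:

```python
# Minimal sketch: cutting a release branch and cherry-picking a fix onto it,
# assuming a git-based CMS; branch name, "main" head and commit id are placeholders.
import subprocess

def git(*args: str) -> None:
    """Run a git command, raising an error if it fails."""
    subprocess.run(["git", *args], check=True)

def cut_release_branch(version: str) -> None:
    # The release branch is the stable "snapshot" the release gets built from.
    git("checkout", "-b", f"release-{version}", "main")
    git("push", "-u", "origin", f"release-{version}")

def cherry_pick_fix(version: str, commit: str) -> None:
    # A fix developed on the head / trunk gets applied to the release branch too.
    git("checkout", f"release-{version}")
    git("cherry-pick", commit)
    git("push")

if __name__ == "__main__":
    cut_release_branch("1.4")
    cherry_pick_fix("1.4", "abc1234")  # placeholder commit id
```

The same workflow works with any VCS that has decent branching and cherry-picking; only the command names change.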
Last but not least, you DO NOT get full releng value from your CMS unless you somehow also track in it "everything" your releases depend on: the simplest approach would be to keep copies, or hard links, of all the binaries of the tools needed to build the release, and so on, but that may often be impractical; so at the very least track the exact release, version, bugfix, &c numbers of all the tools in use (operating system, compilers, system libraries, tools that preprocess image, sound or video files into their final form, and so on, and so forth). The key is being able, at need, to exactly reproduce the environment required to rebuild the precise version that's proposed for release (otherwise you'll go nuts chasing subtle bugs that may depend on changes in third-party tools as their versions change ;-).
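As a small illustration of the "at least track the exact versions" fallback, here's a sketch that records the version banners of the build tools into a text file meant to be kept under the CMS; the tool list and the file name are assumptions for the example, not a prescription:

```python
# Sketch: record the exact versions of the tools used for a build into a text
# file meant to be kept under the CMS; the tool list and the file name are
# illustrative assumptions.
import subprocess
from pathlib import Path

TOOLS = [
    ["python", "--version"],
    ["gcc", "--version"],
    ["make", "--version"],
]

def snapshot_tool_versions(path: str = "tool_versions.txt") -> None:
    lines = []
    for cmd in TOOLS:
        out = subprocess.run(cmd, capture_output=True, text=True, check=True)
        # Keep just the first line of each tool's version banner (some tools
        # print it on stderr rather than stdout).
        banner = (out.stdout or out.stderr).splitlines()[0]
        lines.append(f"{cmd[0]}: {banner}")
    Path(path).write_text("\n".join(lines) + "\n")

if __name__ == "__main__":
    snapshot_tool_versions()
```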
After the CMS, the second most important tool for releng is a good issue-tracking system, ideally one that integrates well with the CMS. It's also important for the development process (and for other aspects of product management), but from the release-process point of view what matters most is the ability to easily document which bugs have been fixed, which features have been added, removed or changed, and what changes in performance (or other user-observable characteristics) are expected in the new release. For this purpose, a key "best practice" in development is that essentially every changeset committed to the CMS must be connected to one (or more) issue in the issue-tracking system: after all, the change must have some purpose (fix a bug, change a feature, optimize something, or perform some internal refactoring that's supposed to be invisible to the software's user); similarly, every tracked issue that gets marked as "closed" must be connected to one (or more) changesets (unless the closure is of the "won't fix / works as intended" kind; issues related to bugs &c in third-party components, which have been fixed by the third-party supplier, are handled similarly easily if you also track all third-party components in the CMS, see above; if you don't, then at least keep text files under the CMS that document the third-party components and their evolution, again see above, and have them change whenever some tracked issue on a third-party component gets closed).
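One common way to enforce that "every changeset connects to an issue" rule is a small hook in the CMS; here's a sketch assuming a git commit-msg hook, where the PROJ-123 style pattern is just an assumption about how your tracker names issues:

```python
#!/usr/bin/env python3
# Sketch of a git commit-msg hook: reject commits whose message does not
# reference an issue id.  The PROJ-123 style pattern is an assumption about
# how your issue tracker names issues.
import re
import sys

ISSUE_PATTERN = re.compile(r"\b[A-Z]+-\d+\b")  # e.g. PROJ-123

def main() -> int:
    msg_path = sys.argv[1]  # git passes the path of the commit-message file
    message = open(msg_path, encoding="utf-8").read()
    if ISSUE_PATTERN.search(message):
        return 0
    sys.stderr.write("commit rejected: message must reference an issue id\n")
    return 1

if __name__ == "__main__":
    sys.exit(main())
```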
Automating the various releng processes (including build, automated testing, and deployment tasks) is the third top priority: automated processes are far more productive and repeatable than asking some poor individual to manually follow a list of steps (for sufficiently complex tasks, of course, the automation may well need to keep "a human in the loop"). As you surmise, tools such as Ant (and SCons, etc.) can help here, but inevitably (unless you can get away with very simple and straightforward processes) you'll find yourself enriching them with ad-hoc scripts &c (some powerful and flexible scripting language such as perl, python, ruby, &c will help). A "workflow engine" can also be precious when your release workflow is sufficiently complex (for example, when it involves mandatory "sign-offs" by specific individuals or their delegates for QA compliance, legal compliance, compliance with user-interface guidelines, and so on).
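As a tiny illustration of the "ad-hoc scripts" end of the spectrum, here's a sketch in python of a driver that chains build, test, and package steps and stops at the first failure; the concrete commands are placeholders for whatever your project actually uses:

```python
# Tiny ad-hoc release driver: run the build, test and package steps in order,
# stopping at the first failure.  The concrete commands are placeholders.
import subprocess
import sys

STEPS = [
    ("build",   ["make", "all"]),
    ("test",    ["make", "test"]),
    ("package", ["make", "dist"]),
]

def run_release_steps() -> int:
    for name, cmd in STEPS:
        print(f"--- {name}: {' '.join(cmd)}")
        result = subprocess.run(cmd)
        if result.returncode != 0:
            print(f"step '{name}' failed (exit code {result.returncode})")
            return result.returncode
    print("all steps succeeded")
    return 0

if __name__ == "__main__":
    sys.exit(run_release_steps())
```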
Some of the other specific issues you're asking about vary enormously depending on the specifics of your environment. If you can afford scheduled downtime, your life is relatively easy, even with a large database in play, since you can operate sequentially and deterministically: you shut the existing system down gracefully, make sure the current state of the database is saved and backed up (easing rollback, in the hopefully-rare case it's needed), run the one-off scripts for schema migrations or other "irreversible" environment changes, bring the system back up in a mode that's still unreachable by ordinary users, run another extensive suite of automated tests, and finally, if everything has gone smoothly (including the saving and backup of the DB in its new state, if relevant), the system is reopened for public use.
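Since the scheduled-downtime case is just a fixed sequence of steps, it lends itself naturally to the kind of automation mentioned above; here's a sketch in which every helper is a stub standing in for whatever your own environment actually provides:

```python
# Sketch of the planned-downtime upgrade sequence described above.  Every
# helper is a stub standing in for whatever your environment really provides.
def stop_system() -> None: print("shutting the system down gracefully")
def backup_database() -> None: print("saving and backing up the database")
def run_migration_scripts() -> None: print("running one-off schema migrations")
def start_system(restricted: bool) -> None: print(f"starting system (restricted={restricted})")
def run_automated_tests() -> bool: print("running extensive automated tests"); return True
def open_to_public() -> None: print("reopening the system for public use")

def upgrade_with_downtime() -> None:
    stop_system()
    backup_database()              # eases rollback, hopefully rarely needed
    run_migration_scripts()        # "irreversible" environment changes
    start_system(restricted=True)  # up again, but not yet reachable by users
    if not run_automated_tests():
        raise RuntimeError("upgrade tests failed -- time to roll back")
    backup_database()              # snapshot of the new state too, if relevant
    start_system(restricted=False)
    open_to_public()

if __name__ == "__main__":
    upgrade_with_downtime()
```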
If you need to upgrade a live system with no downtime at all, this can range from a minor nuisance to a systematic nightmare. In the best case, transactions are reasonably short, and synchronization of the state they establish can be delayed a bit without harm... and you have abundant resources (CPUs, storage, &c). In that case, you run two systems in parallel, the old one and the new one, and simply make sure all new transactions are routed to the new system while letting old ones finish on the old system. A separate task periodically syncs "new data in the old system" over to the new system as transactions on the old system complete. Eventually you can determine that no transactions are still running on the old system and that all the changes that happened there have been synced over; at that point you can finally shut the old system down. (You should be prepared to "sync in reverse" too, of course, in case the change needs to be rolled back.)
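A very rough sketch of that "separate task periodically syncs" idea, under strong simplifying assumptions: each completed transaction on the old system can be fetched as a record and replayed on the new one, and both helpers below are stubs, not real APIs.

```python
# Rough sketch of the periodic sync task between the old and new systems.
# Strong simplifying assumption: each completed transaction on the old system
# can be fetched as a record and replayed against the new system.  The two
# helpers are stubs for whatever your systems actually expose.
import time

def fetch_completed_since(marker: int) -> list[dict]:
    # Placeholder: would query the old system for transactions completed
    # after `marker`; returns nothing here.
    return []

def apply_to_new_system(record: dict) -> None:
    # Placeholder: would replay one old-system transaction on the new system.
    print("replaying", record)

def sync_loop(poll_seconds: float = 30.0) -> None:
    marker = 0
    while True:
        for record in fetch_completed_since(marker):
            apply_to_new_system(record)
            marker = max(marker, record["id"])  # assumes monotonic record ids
        time.sleep(poll_seconds)

if __name__ == "__main__":
    sync_loop()
```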
That's the "simple, sweet" extreme for live-system updating; at the other extreme, you can find yourself in a situation so excruciatingly difficult that you can prove the task impossible (you just cannot logically meet all the stated requirements with the given resources). Long-lived sessions kept open on the old system that simply cannot be interrupted; scarce resources that make it unfeasible to run two systems in parallel; hard requirements for strict, per-transaction, real-time synchronization; and so on: life can be miserable (and, as I've noted, at the extreme, the stated task may be utterly impossible). The two best things you can do about it: (1) make sure you have abundant resources (this will also save your skin when some server unexpectedly goes belly-up... you'll have another one to turn to, to meet the emergency!-); (2) consider this predicament from the very start, when the overall system is first being architected (e.g.: "prefer short-lived transactions to long-lived sessions that just cannot be snapshotted, shut down, and restarted seamlessly from the snapshot" is one good architectural pointer ;-).