One of the most important factors in planning your servers, AMIs, and infrastructure is the answer to the question: how quickly will I need a new instance to be available?
The answer determines how much you bake into the AMI and how much you configure after launch.
NOTE: My experience is with Chef Server, so I will use Chef terminology, but the concepts are the same for any other configuration management stack.
A general rule is to treat your "infrastructure as code." This means you handle launching instances, creating users on those machines, and managing known_hosts and SSH files the same way you handle your application code. Being able to track infrastructure changes in source control simplifies management, redeployment, and even CI.
A good Chef introduction covers the Chef terminology: cookbooks, recipes, resources, etc. It shows you how to create a simple LAMP stack and how you can easily rebuild it with a single command.
So, given the example in your question, at a high level, I would do the following:
- Launch a base Ubuntu Linux AMI (currently 14.04) with a cloud-init script.
- In the UserData section of the instance configuration, bootstrap the Chef client installation.
- Run the recipe to create a user.
- Run the recipe to create a known_hosts file for the user.
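
The steps above can be sketched as a small Python helper that renders the UserData script. This is only an illustration under several assumptions: the recipe names (`accounts::deploy_user`, `accounts::known_hosts`) are hypothetical, the installer URL is assumed to be Chef's public install script, and the Chef server settings (`/etc/chef/client.rb` and the validation key) are assumed to be delivered some other way.

```python
import json

# Assumption: Chef's public installer script; substitute your own mirror if needed.
CHEF_INSTALL_URL = "https://omnitruck.chef.io/install.sh"


def render_user_data(run_list):
    """Return a bash UserData script that installs chef-client and runs run_list."""
    # chef-client reads its initial run list from a JSON attributes file (-j).
    first_boot = json.dumps({"run_list": list(run_list)}, indent=2)
    return "\n".join([
        "#!/bin/bash",
        "set -euo pipefail",
        "# Install the Chef client (UserData scripts run as root).",
        f"curl -L {CHEF_INSTALL_URL} | bash",
        "mkdir -p /etc/chef",
        "# NOTE: /etc/chef/client.rb and the validation key are not created here;",
        "# how you provision them is outside this sketch.",
        "cat > /etc/chef/first-boot.json <<'EOF'",
        first_boot,
        "EOF",
        "chef-client -j /etc/chef/first-boot.json",
        "",
    ])


if __name__ == "__main__":
    # Hypothetical recipes: one creates the user, one writes its known_hosts file.
    print(render_user_data([
        "recipe[accounts::deploy_user]",
        "recipe[accounts::known_hosts]",
    ]))
```

The rendered text is what you would paste into (or pass as) the UserData field when launching the base Ubuntu AMI. Everything specific to the user and known_hosts stays in the recipes, so the script itself rarely changes.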
Tools like Chef are used because they let you break the infrastructure down into small blocks of code that perform specific functions. There are numerous cookbooks already written and available that provide the basic building blocks for creating services, installing software packages, etc.
All that said, there are times when you need to deviate from best practices in the interest of your specific domain and requirements. Despite all the benefits of managing infrastructure as code, there may be situations in which you still have to bake things into the AMI.
Assume that your application performs image processing and requires ImageMagick, and suppose you need to build ImageMagick from source. If you did this with Chef recipes, compiling ImageMagick alone could add another 7 minutes to the normal instance launch time. If waiting 10-12 minutes for a new instance to come online is too long, you may want to bake your own AMI that has ImageMagick already compiled and installed.
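
To make the arithmetic concrete, here is a minimal sketch of that decision. The 7-minute compile time and the 10-12 minute total come from the example above (which implies a baseline launch of roughly 3-5 minutes); the 8-minute budget and the function names are hypothetical.

```python
def launch_time_minutes(base_boot_min, build_steps_min):
    """Total time until the instance is usable: baseline boot plus every build step."""
    return base_boot_min + sum(build_steps_min)


def should_bake_ami(base_boot_min, build_steps_min, budget_min):
    """True when configuring at launch would exceed the acceptable wait."""
    return launch_time_minutes(base_boot_min, build_steps_min) > budget_min


if __name__ == "__main__":
    imagemagick_compile_min = 7   # from the example above
    budget_min = 8                # hypothetical: longest wait you can tolerate
    for base in (3, 5):           # implied baseline launch time range
        total = launch_time_minutes(base, [imagemagick_compile_min])
        print(base, total, should_bake_ami(base, [imagemagick_compile_min], budget_min))
```

If the total stays within your budget, keeping the build in recipes is usually the simpler choice; baking only pays off once the wait is actually a problem.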
This is an acceptable solution, but keep in mind that managing your own fleet of pre-baked AMIs adds infrastructure overhead. You will need to update your custom AMIs as new base AMIs are released, and as you expand to different instance types and different AWS regions.
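
As a rough illustration of how that overhead grows, the sketch below enumerates the custom AMIs you would have to rebuild each time the base AMI changes. The region names and the `variants` list are placeholders, and the one-AMI-per-combination assumption is a simplification.

```python
from itertools import product


def ami_build_matrix(regions, variants):
    """Return every (region, variant) pair that needs its own baked AMI."""
    return list(product(regions, variants))


if __name__ == "__main__":
    regions = ["us-east-1", "us-west-2", "eu-west-1"]   # placeholder regions
    variants = ["general-purpose", "compute-optimized"]  # hypothetical instance groupings
    builds = ami_build_matrix(regions, variants)
    print(f"{len(builds)} AMIs to rebuild per base AMI release")
```

Every region or variant you add multiplies the number of images to rebuild and test, which is the hidden cost of baking.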