TL;DR: AWS is very reliable if you know what you are doing, and a bad idea if you do not.
Since you are not familiar with the terminology, here is a very quick glossary: an AZ is an Availability Zone, and there are several per region (for example, 3 in Ireland). They are physically isolated data centers with separate networking, flood plains, etc., but with high-quality network links between them. Occasionally a single AZ may become unavailable; I don't think all the AZs in a region have ever been down at once.
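If you want to see the AZs in a region for yourself, here is a minimal boto3 sketch; the Ireland region name below is just an illustrative assumption:

```python
# Minimal sketch: list the Availability Zones in one region with boto3.
# The region name is an illustrative assumption.
import boto3

ec2 = boto3.client("ec2", region_name="eu-west-1")  # Ireland, as an example

zones = ec2.describe_availability_zones(
    Filters=[{"Name": "state", "Values": ["available"]}]
)

for az in zones["AvailabilityZones"]:
    print(az["ZoneName"], az["State"])
```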
EBS / instance store are the two main types of storage available to an instance. The best way to describe them is that the instance store is equivalent to a hard drive connected via SATA to the motherboard: very fast. But what happens if you shut down your instance (or the motherboard dies) and want to carry on immediately on another machine? (Amazon completely hides the physical configuration of the hardware.) Obviously you are not going to wait for an engineer to move the drive from one server to another, so they don't even offer that. The instance store is fast but ephemeral and tied to one physical machine; do NOT store anything important on it.

EBS is the alternative: a very low-latency network drive that any server can attach as if it were local. You can shut down a server, resize it and restart it on a completely different machine on the other side of the data center (again, the physical hardware is hidden from you), and it doesn't matter, because your EBS volume hasn't gone anywhere (by default volumes are also replicated across several physical disks).
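To make the EBS point concrete, here is a rough boto3 sketch of moving a volume from one instance to another; all the IDs are hypothetical placeholders, and the volume and both instances have to be in the same AZ:

```python
# Rough sketch: re-attach an EBS volume to a different instance with boto3.
# Volume and instance IDs are hypothetical placeholders.
import boto3

ec2 = boto3.client("ec2", region_name="eu-west-1")

VOLUME_ID = "vol-0123456789abcdef0"    # the EBS volume with your data
OLD_INSTANCE = "i-0aaaaaaaaaaaaaaaa"   # the server you are shutting down
NEW_INSTANCE = "i-0bbbbbbbbbbbbbbbb"   # the replacement server (same AZ)

# Detach from the old instance and wait until the volume is free...
ec2.detach_volume(VolumeId=VOLUME_ID, InstanceId=OLD_INSTANCE)
ec2.get_waiter("volume_available").wait(VolumeIds=[VOLUME_ID])

# ...then attach it to the new instance as if it were a local disk.
ec2.attach_volume(VolumeId=VOLUME_ID, InstanceId=NEW_INSTANCE, Device="/dev/sdf")
```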
Cloud hardware: my interpretation of all the "cloud hardware fails all the time, it's really dangerous and unreliable" talk is that AWS hardware is not as reliable as enterprise-grade components in a managed data center. That does not mean it is unreliable; it just means you should treat failure as an expected case in your design.
First of all, it is very important to note that when talking about the SLA, Amazon says very clearly that it ONLY applies if one or more entire AZs go down. So if you do not understand how their service works and run a single server in a single AZ, and a generator or a router failure takes it out, that is your own fault.
As for recovery, it depends on whether all your application state is held on a single server; if so, don't bother with the cloud. If you can spread your state across several servers, keep it in RDS or some other durable database, or, if your content changes rarely, get by with periodic copies to S3, you will be fine. Failure strategies (in order of preference) are clustered, failover, or auto-recovery. For the first, you have clustered servers sharing state, so it doesn't matter if you lose a server or an AZ. For the second, you have only one live server, but if it goes down you can switch over to another resource holding the same content. Finally, for auto-recovery there are two situations: if your data sits on a single EBS volume, you can start another instance, attach the same volume and continue. But if the EBS volume or the whole AZ fails, you need to be prepared with a snapshot in S3 from which you can create a new volume and start a completely new instance.
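A rough boto3 sketch of that last auto-recovery case (snapshot ID, AMI, AZ and instance type are all placeholder assumptions, not anything from the original setup) could look like this:

```python
# Rough sketch: rebuild from an EBS snapshot after losing the volume or its AZ.
# Snapshot ID, AMI, AZ and instance type are hypothetical placeholders.
import boto3

ec2 = boto3.client("ec2", region_name="eu-west-1")

SNAPSHOT_ID = "snap-0123456789abcdef0"  # the snapshot you kept (stored in S3)
TARGET_AZ = "eu-west-1b"                # a healthy AZ to recover into
AMI_ID = "ami-0123456789abcdef0"        # image for a fresh app server

# 1. Recreate the data volume in the healthy AZ from the snapshot.
volume = ec2.create_volume(SnapshotId=SNAPSHOT_ID, AvailabilityZone=TARGET_AZ)
ec2.get_waiter("volume_available").wait(VolumeIds=[volume["VolumeId"]])

# 2. Launch a completely new instance in the same AZ.
run = ec2.run_instances(
    ImageId=AMI_ID,
    InstanceType="t3.micro",
    MinCount=1,
    MaxCount=1,
    Placement={"AvailabilityZone": TARGET_AZ},
)
instance_id = run["Instances"][0]["InstanceId"]
ec2.get_waiter("instance_running").wait(InstanceIds=[instance_id])

# 3. Attach the recovered volume and carry on where you left off.
ec2.attach_volume(
    VolumeId=volume["VolumeId"], InstanceId=instance_id, Device="/dev/sdf"
)
```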
Reserved instances are no more reliable; they are the same hardware, you just sign a contract saying "I will run x machines over the next year(s)", which lets AWS plan capacity better, which in turn makes it cheaper for you.