I came here fairly comfortable with these two concepts, but with something about them that was still unclear to me. After reading some of the answers, I think I have a correct and helpful metaphor to describe the difference.
If you think of your individual lines of code as separate but ordered playing cards (stop me if I have to explain how old-school punch cards work), then for each separate procedure you write you will have a unique stack of cards (please, don't copy and paste!), and the difference between what happens during normal code execution and asynchronous execution depends on whether you care or not.
When you run the code, you hand the OS a set of single operations (which your compiler or interpreter has broken your "higher level" code down into) to be passed to the processor. With one processor, only one line of code can be executed at any given time. So, to achieve the illusion of running multiple processes at the same time, the OS uses a technique in which it sends the processor only a few lines from a given process at a time, switching between all the processes as it sees fit. The result is multiple processes appearing to make progress simultaneously to the end user.
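Here's a minimal sketch of that interleaving. I'm using Python's `asyncio` event loop as a stand-in for the OS scheduler (the loop switches cooperatively rather than preemptively, but the effect is the same idea): each "stack" gives up control partway through, and the two make progress in alternation on a single thread.

```python
import asyncio

async def run_stack(name: str) -> None:
    # Each await is a point where the scheduler may pick a different stack.
    for i in range(3):
        print(f"{name}: card {i}")
        await asyncio.sleep(0)  # voluntarily hand control back

async def main() -> None:
    # Two independent card stacks, one processor (one thread).
    await asyncio.gather(run_stack("stack A"), run_stack("stack B"))

asyncio.run(main())
# Output alternates: stack A: card 0, stack B: card 0, stack A: card 1, ...
```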
In terms of our metaphor, the relationship is that the OS always shuffles the cards before sending them to the processor. If your stack of cards does not depend on another stack, you will not notice that your stack stopped being selected for a while and another stack became active. So if you do not care, it does not matter.
However, if you do care (for example, when there are several processes, or stacks of cards, that depend on each other), then the OS's shuffling will ruin your results.
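To make the "ruined results" concrete, here is a classic sketch of two dependent stacks sharing a counter with no coordination. The read-modify-write is split into separate steps so the OS has room to switch threads in the middle; whether you actually observe lost updates depends on your interpreter and scheduler, but the window is real.

```python
import threading

counter = 0  # shared state that both "stacks" depend on

def add_cards(n: int) -> None:
    global counter
    for _ in range(n):
        value = counter   # read
        value += 1        # modify
        counter = value   # write: a switch between read and write loses an update

threads = [threading.Thread(target=add_cards, args=(100_000,)) for _ in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(counter)  # often less than the expected 200000, depending on the scheduler
```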
Writing asynchronous code requires handling the dependencies between executions regardless of what the final ordering turns out to be. This is why constructs such as callbacks are used. They say to the processor: "the next thing to do is to tell the other stack what we did." With such tools you can be sure that the other stack gets notified before the OS is allowed to run any more of its instructions (something like `if call_back == false: send(no_operation)`; I'm not sure this is actually how it is implemented, but it seems mentally consistent).
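As an illustration of the callback idea (the function names here are made up for the example), one stack does its work on another thread and then "tells the other stack what we did" by invoking the function it was handed:

```python
import threading

def fetch_data(on_done) -> None:
    # Stack A: does slow work somewhere else, then notifies its dependent.
    def work() -> None:
        result = sum(range(1_000_000))  # stand-in for the slow part
        on_done(result)                 # "tell the other stack what we did"
    threading.Thread(target=work).start()

def handle_result(result: int) -> None:
    # Stack B: depends on A's result, so it runs only when A says it's done,
    # no matter how the scheduler interleaved everything in the meantime.
    print("dependent stack got:", result)

fetch_data(handle_result)
print("main stack keeps running while the work is in flight")
```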
For parallel processes, the difference is that you have two stacks that do not care about each other and two workers to process them. At the end of the day you may need to combine the results of the two stacks, which is then a matter of synchronization, but for the execution itself neither stack cares about the other.
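A sketch of that, too: two workers each get their own independent stack, and the single synchronization point is combining the results at the end (using Python's `multiprocessing` here; the split bounds are arbitrary):

```python
from multiprocessing import Pool

def partial_sum(bounds) -> int:
    lo, hi = bounds
    return sum(range(lo, hi))  # each worker handles its own stack, no shared state

if __name__ == "__main__":
    with Pool(processes=2) as pool:  # two workers, two stacks
        parts = pool.map(partial_sum, [(0, 500_000), (500_000, 1_000_000)])
    # The only point where the stacks meet: combining the results.
    print(sum(parts))  # 499999500000
```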
Not sure if this helps, but I always find multiple explanations useful. Also, note that asynchronous execution is not restricted to an individual computer and its processors. Generally speaking, it deals with time, or (even more generally) with an ordering of events. So if you send dependent stack A to network node X and its coupled stack B to node Y, correct asynchronous code should be able to account for the situation just as if it were running locally on your laptop.