No, compilers are not bad at all. The amount of optimization that can be gained by hand-writing assembly is negligible for most programs.
How large that amount is depends on how you define a "modern C compiler." A brand-new compiler (for a chip that has just reached the market) can have a lot of inefficiencies that will be ironed out over time. Just compile some simple programs (the string.h functions, for example) and analyze what each line of the generated assembly does. You may be surprised at some of the wasteful things an immature C compiler does, and you can often spot the problem just by reading through the code. A mature, well-tested, carefully optimized compiler (think x86) will do a very good job of generating assembly, and even a new one will still do a decent job.
In no case can C do better than assembly. You can simply benchmark the two, and if your assembly is slower, compile the C with -S and submit the resulting assembly; you are guaranteed at least a tie. C is compiled to assembly, which has a 1:1 correspondence with machine code. The computer cannot do anything that assembly cannot express, provided that the complete instruction set is published.
In some cases, C is not expressive enough to be fully optimized. A programmer may know something about the nature of the data that simply cannot be expressed in C in a way that lets the compiler take advantage of it. Of course, C is expressive, close to the metal, and very well suited to optimization, but full optimization is not always possible.
The compiler cannot define "performance" the way a person can. I understand that you said trivial programs, but even the simplest (useful) algorithms involve a trade-off between size and speed. The compiler cannot make that trade-off on a finer scale than the -Os / -O[1-3] flags, but a person may know what “better” means in the context of the program’s purpose.
Some architecture-specific assembly instructions cannot be expressed in C. This is where asm() statements come in. Sometimes this is not about optimization at all, but simply because there is no way to express in C that this line should use, say, an atomic test-and-set operation, or that we want to issue an SVC interrupt with an encoded parameter X.
The above points notwithstanding, C is an order of magnitude more efficient to program in and to maintain. If performance matters, analyzing the generated assembly is necessary and optimizations will probably be found, but the cost in developer time and effort is rarely worth it for complex programs on a PC. For very simple programs that must be as fast as absolutely possible (an RTOS, for example) or that have severe memory constraints (for example, an ATtiny with 1 KB of non-writable flash memory and 64 bytes of RAM), assembly may be the only way to go.