Edit: This question has recently had quite a few new eyeballs, so here is an update to make the answer more useful.
Alohci's solution is correct, but it may not be entirely clear to more visually inclined readers.
So let me clarify the solution a bit, with pictures.
First, line-height is inherited as its computed value, so although it is specified in em units, children inherit the value in pixels. For example, with a font size of 20px and a line height of 3em, the line height will be 60 pixels, even for descendants with different font sizes (unless they specify their own line heights).
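To make the inheritance rule concrete, here is a small arithmetic sketch in Python. It is not browser code; the function name `resolved_line_height_px` is hypothetical, and the numbers are the ones from the example above.

```python
def resolved_line_height_px(declaring_font_px, line_height_em):
    """An em-based line height is resolved against the font size of the
    element that declares it, and the resulting pixel value is inherited."""
    return declaring_font_px * line_height_em

parent_font_px = 20
parent_line_height_px = resolved_line_height_px(parent_font_px, 3)

# A descendant with a 12px font inherits the pixel value unchanged;
# it does NOT recompute 3em against its own font size (3 * 12 = 36).
child_font_px = 12
child_line_height_px = parent_line_height_px

print(parent_line_height_px, child_line_height_px)  # 60 60
```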
Now suppose the font has a descender of 1/4 of the font size. That is, with a 20px font, the descender is 5 pixels and the ascender is 15 pixels. The remaining line height (in this case 40 pixels) is then divided equally above and below the text: 20 pixels above and 20 pixels below.

For a block with a smaller font (0.6em, or 12 pixels), the remaining line height is 60 - 12 = 48 pixels, which is also divided equally: 24 pixels above and 24 pixels below the text.
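The split for both font sizes can be worked out with a few lines of Python. This is only a sketch under the answer's assumption that the descender is 1/4 of the font size; `inline_box_extents` and `descender_ratio` are hypothetical names introduced for illustration.

```python
def inline_box_extents(font_px, line_height_px, descender_ratio=0.25):
    """Return how far a line extends (above, below) the baseline, in pixels.

    Assumes the descender is `descender_ratio` of the font size and that
    the leftover line height is split evenly above and below the text.
    """
    descent = font_px * descender_ratio
    ascent = font_px - descent
    half_leading = (line_height_px - font_px) / 2
    return ascent + half_leading, descent + half_leading

print(inline_box_extents(20, 60))  # (35.0, 25.0): 15 + 20 above, 5 + 20 below
print(inline_box_extents(12, 60))  # (33.0, 27.0): 9 + 24 above, 3 + 24 below
```

Both results add up to 60 pixels, but the two boxes sit at different positions relative to the baseline.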

Then, if we combine the two fonts on the same baseline, you will see that the two 60-pixel boxes no longer coincide, because each one is positioned differently around the shared baseline. As a result the total height of the containing block increases, even though both line heights are 60 pixels.
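As a rough check of the combined case, the sketch below takes the largest extent above and the largest extent below the shared baseline. It assumes the same 1/4 descender for every font size and ignores everything else a real browser takes into account; `line_box_height` is a hypothetical helper, not a browser API.

```python
def line_box_height(font_sizes_px, line_height_px, descender_ratio=0.25):
    """Height of a line holding text of several font sizes on one baseline.

    Simplified model: each font contributes ascent + half-leading above
    the baseline and descent + half-leading below it; the line has to be
    tall enough for the largest of each.
    """
    above = below = 0.0
    for font_px in font_sizes_px:
        descent = font_px * descender_ratio
        half_leading = (line_height_px - font_px) / 2
        above = max(above, (font_px - descent) + half_leading)
        below = max(below, descent + half_leading)
    return above + below

print(line_box_height([20], 60))      # 60.0
print(line_box_height([20, 12], 60))  # 62.0
```

Under these assumptions the 20px text sets the extent above the baseline (35 pixels) and the 12px text sets the extent below it (27 pixels), so the line ends up 2 pixels taller than either line height on its own.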

Hope this explains things!
Mr lister