It would seem that there are no strict rules. The HTML specification only describes how a browser is expected to behave; it does not set normative requirements that browsers follow the described behavior to the letter. The CSS specification does not touch on this at all.
Section 10.4.2 of W3C HTML5 describes how img elements are expected to be rendered, according to a set of rules that refer to what the img element represents.
Section 4.7.1, further down, describes what the img element represents, depending on its src and alt attributes:
What the img element represents depends on the src attribute and the alt attribute.
[...]
If the src attribute is set and the alt attribute is set to a non-empty value
The image is a key part of the content; the alt attribute gives a textual equivalent or replacement for the image.
If the image is available and the user agent is configured to display that image, then the element represents the element's image data.
Otherwise, the element represents the text given by the alt attribute. User agents may provide the user with a notification that the image is present, but has been omitted from rendering.
So, if the image is not available, the img element represents its alt text (since the image should not be presented!).
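To make the quoted case concrete, here is a minimal sketch of that decision in TypeScript. It models only the one case quoted above (src set, alt non-empty); the names `ImgState`, `Represents` and `whatImgRepresents` are made up for illustration and are not part of any browser API or of the specification.

```typescript
// Hypothetical model of the single quoted case: src is set and alt is non-empty.
// Other combinations of src and alt are elided in the quote and not modelled here.
type Represents =
  | { kind: "image" }               // the element represents its image data
  | { kind: "text"; text: string }; // the element represents its alt text

interface ImgState {
  alt: string;               // assumed to be non-empty
  imageAvailable: boolean;   // whether the image data could be obtained
  uaDisplaysImages: boolean; // whether the user agent is configured to display images
}

function whatImgRepresents(state: ImgState): Represents {
  // Both conditions must hold for the element to represent the image.
  if (state.imageAvailable && state.uaDisplaysImages) {
    return { kind: "image" };
  }
  // Otherwise the element represents the text given by the alt attribute.
  return { kind: "text", text: state.alt };
}
```

Note that both a broken image and a user agent with images turned off end up in the same "text" branch, which is why the two situations are discussed together here.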
So, back to section 10.4.2, the following rule applies:
If the element is an img element that represents some text and the user agent does not expect this to change
The user agent is expected to treat the element as a non-replaced phrasing element whose content is the text, optionally with an icon indicating that an image is missing, so that the user can request the image be displayed or investigate why it is not rendered. In non-graphical contexts, such an icon should be omitted.
It seems that Firefox follows this expectation (note: not a requirement) closely, although I am not sure whether the box it generates is replaced or non-replaced. The same goes for other browsers: the inline-block box they generate could be either replaced or non-replaced. Note that HTML says “phrasing element”, not “inline element” or “inline-block element”, which again adds to the uncertainty of all this.
What Chrome does when images are disabled through user preferences is not what I would call “[providing] the user with a notification that the image is present, but has been omitted from rendering”, but again, that is not a requirement either. Still, I don’t understand why Chrome thinks this is a good idea. What else is alt text for?