Why doesn't PHP use its internal smart string for regular strings?

PHP has an internal data structure called a smart string (smart_str?), which stores both the string length and the size of the allocated buffer. That is, more memory than the string needs is allocated, to improve concatenation performance. Why is this data structure not used for actual PHP strings? Wouldn't that lead to fewer memory allocations and better performance?

1 answer

Standard PHP strings (as of PHP 7) are represented by the zend_string type, which includes both the string length and the character data array. A zend_string is usually allocated to fit its character data exactly (alignment notwithstanding): it leaves no room for appending additional characters.
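As a rough illustration of an exact-fit layout, here is a minimal sketch in C. The names exact_string, exact_string_alloc_size and exact_string_new are hypothetical, and the struct is a simplified stand-in: the real zend_string also carries a reference count and a cached hash, and is allocated through PHP's own allocator rather than malloc.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified stand-in for zend_string: the real type also
   carries a reference count and a cached hash value. */
typedef struct {
    size_t len;  /* length in bytes, excluding the trailing NUL */
    char val[];  /* character data stored inline, right after the header */
} exact_string;

/* Bytes requested for a string of `len` characters: header + data + NUL. */
static size_t exact_string_alloc_size(size_t len) {
    return offsetof(exact_string, val) + len + 1;
}

/* Allocate a string sized to fit `src` exactly; no spare room for appends. */
static exact_string *exact_string_new(const char *src) {
    assert(src != NULL);
    size_t len = strlen(src);
    exact_string *s = malloc(exact_string_alloc_size(len));
    if (s == NULL) {
        return NULL;
    }
    s->len = len;
    memcpy(s->val, src, len + 1); /* copy the data and the trailing NUL */
    return s;
}
```

Because the block is sized for exactly len characters, appending even one byte to such a string means allocating a larger block first.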

The smart_str structure includes a pointer to a zend_string and an allocation size. In this case the zend_string is not allocated to fit exactly. Instead, the allocation is deliberately made too large, so that additional characters can be appended without expensive reallocations.

The reallocation policy for smart_str is as follows: first, it is allocated with a total size of 256 bytes (minus the zend_string header, minus allocator overhead). If this size is exceeded, it is reallocated to 4096 bytes (minus overhead). After that, the size increases in increments of 4096 bytes.
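The policy described above can be sketched roughly as follows. This is a simplified model, not the code from php-src: str_buf, str_builder, builder_total_size and builder_append are hypothetical names, the header is reduced to a single length field, allocator overhead is ignored, and realloc stands in for PHP's allocator.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define BUILDER_START_SIZE 256  /* first total allocation, as described above */
#define BUILDER_PAGE_SIZE 4096  /* later allocations are multiples of this */

/* Hypothetical simplified string header: length plus inline data. */
typedef struct {
    size_t len;
    char val[];
} str_buf;

/* Hypothetical stand-in for smart_str: the string plus its capacity. */
typedef struct {
    str_buf *s;      /* NULL until the first append */
    size_t capacity; /* character bytes available in s->val, excluding NUL */
} str_builder;

/* Bytes that are not character data: header plus trailing NUL.
   Assumption: allocator overhead is ignored in this model. */
#define STR_OVERHEAD (offsetof(str_buf, val) + 1)

/* Total bytes to allocate so that `len` characters fit. */
static size_t builder_total_size(size_t len) {
    size_t needed = len + STR_OVERHEAD;
    if (needed <= BUILDER_START_SIZE) {
        return BUILDER_START_SIZE;
    }
    /* Round up to the next multiple of the page size. */
    return (needed + BUILDER_PAGE_SIZE - 1) / BUILDER_PAGE_SIZE * BUILDER_PAGE_SIZE;
}

/* Append `n` bytes, reallocating only when the capacity is exceeded.
   Returns 0 on success, -1 if the allocation fails. */
static int builder_append(str_builder *b, const char *data, size_t n) {
    assert(b != NULL && data != NULL);
    size_t old_len = b->s ? b->s->len : 0;
    size_t new_len = old_len + n;
    if (b->s == NULL || new_len > b->capacity) {
        size_t total = builder_total_size(new_len);
        str_buf *grown = realloc(b->s, total);
        if (grown == NULL) {
            return -1;
        }
        b->s = grown;
        b->capacity = total - STR_OVERHEAD;
    }
    memcpy(b->s->val + old_len, data, n);
    b->s->len = new_len;
    b->s->val[new_len] = '\0';
    return 0;
}
```

Starting from a zero-initialized str_builder, repeated calls to builder_append touch the allocator only when a threshold is crossed, so most small appends are a plain memcpy into memory that is already reserved.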

Now imagine that we replaced all strings with smart_str. This would mean that even a single-character string would have a minimum allocation size of 256 bytes. Given that most strings in use are small, this is an unacceptable overhead.

So this is essentially a classic performance/memory tradeoff. We use a memory-compact representation by default and switch to a faster but less memory-efficient representation in the cases that benefit most from it, that is, cases where large strings are built from small parts.
