Does PHP silently combine consecutive fseek calls into a single fseek call?

I am running 64-bit Windows 7 with the latest version of XAMPP, which ships a 32-bit build of PHP.

While testing http://php.net/manual/en/function.fseek.php#112647 on a very large file (larger than PHP_INT_MAX, 2147483647), I became convinced that successive fseek calls are summed before being applied to the file pointer.

I have two questions:

  • Can I prevent this summation by reasonable means (or only with the workaround mentioned in the link above)?

  • Is this aggregation happening in PHP (as I assume, although I don't know where in PHP) or in Windows 7?

In reply to a question: I tried both workarounds that use multiple seeks, and neither worked on my system. Instead, they put the file pointer at various positions within the PHP_INT_MAX range. (32-bit PHP can only seek up to PHP_INT_MAX + 8192. Reading onward from there is still possible, but I don't know how far.)

Therefore, the question is moot for my specific case, since 32-bit PHP can only seek up to PHP_INT_MAX + 8192, whatever you do. I am leaving the question up because two people voted for it and may be interested in a general answer.

I filed a bug report here:
https://bugs.php.net/bug.php?id=69213
Result: with a 64-bit PHP build, this might work, but I have not tried.

+7
performance php windows-7 32bit-64bit internal
2 answers

That is not what happens. PHP actually does something even dumber. Here is a snippet from the PHP source code:

    switch(whence) {
        case SEEK_CUR:
            offset = stream->position + offset;
            whence = SEEK_SET;
            break;
    }

This is in the guts of the implementation of PHP's fseek. What happens here: if you tell PHP to seek relative to the current position, it translates that into an "equivalent" seek from the beginning of the file. This only works when the offset calculation does not overflow; if it does, well, offset is a signed integer, so the behavior is undefined.
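To make the overflow concrete, here is a minimal sketch of the same relative-to-absolute translation with an explicit range check. It is not PHP's code: demo_stream and demo_resolve_seek_cur are hypothetical names, and the 32-bit signed offset is an assumption that mirrors a 32-bit PHP build.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a stream that tracks its own position.
 * The 32-bit signed type is an assumption mirroring a 32-bit build. */
typedef struct {
    int32_t position;
} demo_stream;

/* Resolve a SEEK_CUR request into an absolute offset, as the quoted
 * switch does, but do the addition in 64 bits so it cannot overflow.
 * Returns 0 and stores the result in *absolute on success, or -1 when
 * the sum is negative or does not fit in a 32-bit signed integer. */
int demo_resolve_seek_cur(const demo_stream *s, int32_t offset, int32_t *absolute)
{
    assert(s != NULL && absolute != NULL);

    int64_t sum = (int64_t)s->position + (int64_t)offset;
    if (sum < 0 || sum > INT32_MAX) {
        return -1; /* the plain 32-bit addition would have overflowed or gone negative */
    }
    *absolute = (int32_t)sum;
    return 0;
}
```

With the position 10 bytes below INT32_MAX, a relative seek of 10 resolves cleanly, while a relative seek of 11 is rejected instead of wrapping around. The unchecked addition in the quoted snippet has no such guard.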

And, well, this is because PHP buffers streams internally, so they have to do something. But it should not be this.

You are probably best off doing your work in a language that actually does what you tell it.

+1

If aggregation were to occur, it would probably have to be either an opcode-level optimization or something that happens at a low level through a buffer.

I can answer for the low level. fseek() in PHP is implemented using PHP streams. It is declared in ext/standard/file.h and defined in the corresponding .c file. Its implementation calls php_stream_seek(), which calls into _php_stream_seek() in streams.c. The low-level work is done by the plain-files stream wrapper, which ends up in zend_seek or zend_fseek, and those in turn simply map to the 32- or 64-bit C calls such as _seeki64.
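As an illustration of what such a thin mapping onto the C runtime can look like, here is a sketch of a seek helper that always takes a 64-bit offset. It is not PHP's code: demo_seek64 and demo_tell64 are hypothetical names, and the choice of _fseeki64/_ftelli64 on Windows and fseeko/ftello elsewhere is my assumption about a reasonable portable mapping.

```c
#if !defined(_WIN32)
#define _FILE_OFFSET_BITS 64    /* request a 64-bit off_t on 32-bit POSIX builds */
#define _POSIX_C_SOURCE 200809L /* expose fseeko() and ftello() */
#endif

#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: seek with a 64-bit offset regardless of the
 * width of long on the platform. Returns 0 on success. */
int demo_seek64(FILE *fp, int64_t offset, int whence)
{
    assert(fp != NULL);
#if defined(_WIN32)
    return _fseeki64(fp, offset, whence);
#else
    return fseeko(fp, (off_t)offset, whence);
#endif
}

/* Hypothetical helper: report the current position as a 64-bit value,
 * or -1 on error. */
int64_t demo_tell64(FILE *fp)
{
    assert(fp != NULL);
#if defined(_WIN32)
    return _ftelli64(fp);
#else
    return (int64_t)ftello(fp);
#endif
}
```

When the offset type is 64 bits wide all the way down, seeking to a position beyond 2147483647 and then seeking relative to it both behave as expected. A build that passes a 32-bit offset through the same chain cannot represent such a position at all.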

So ... if any aggregation occurs, it would seem to be either in opcode optimization or even lower, in the operating system or the hardware. Hard drives reorder requests to reduce head seek distances, and file system buffering can skip seeks that have no side effects. If you are concerned about disk seek time, the former will handle this automatically. If you are concerned about possibly thrashing memory (seeking over large distances without needing a buffer), you might consider a different approach. See http://www.cs.iit.edu/~cs561/cs450/disksched/disksched.html for more information on how drives avoid wasting time on seeks.

Hope this helps.

0
