Get PHP to stop replacing '.' characters in $_GET or $_POST arrays?

If I pass PHP variables with . in their names via $_GET, PHP automatically replaces the dots with _ characters. For example:

 <?php
 echo "url is ".$_SERVER['REQUEST_URI']."<p>";
 echo "x.y is ".$_GET['x.y'].".<p>";
 echo "x_y is ".$_GET['x_y'].".<p>";

... outputs the following:

 url is /SpShipTool/php/testGetUrl.php?x.y=a.b
 x.y is .
 x_y is a.b.

... my question is: is there any way to stop this? I can't for the life of me figure out what I've done to deserve it.

The PHP version I'm working with is 5.2.4-2ubuntu5.3.

+60
php regex postback
Sep 16 '08 at 1:47
14 answers

Here's PHP.net's explanation of why it does this:

Dots in input variable names

Typically, PHP does not alter the names of variables when they are passed into a script. However, it should be noted that the dot (period, full stop) is not a valid character in a PHP variable name. For the reason, look at this:

 <?php $varname.ext; /* invalid variable name */ ?> 

Now, what the parser sees is a variable named $varname, followed by the string concatenation operator, followed by the bare string (i.e. an unquoted string that does not match any known key or reserved word) 'ext'. Obviously, this does not have the intended result.

For this reason, it is important to note that PHP will automatically replace any dots in incoming variable names with underscores.

That's from http://ca.php.net/variables.external.

Also, according to this comment, these other characters are converted to underscores:

The full list of field-name characters that PHP converts to _ (underscore) is the following (not just dot):

  • chr(32) ( ) (space)
  • chr(46) (.) (dot)
  • chr(91) ([) (open square bracket)
  • chr(128) - chr(159) (various)

So it looks like you're stuck with it, and you'll have to convert the underscores back to dots in your script using dawnerd's suggestion (I'd just use str_replace, though).
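A minimal sketch of the str_replace idea mentioned above. The helper name restoreDots is hypothetical, and the sketch assumes that none of the original field names contained a real underscore, because the conversion is lossy and cannot tell the two apart:

```php
<?php
// Hypothetical helper (not part of PHP): rebuilds the top-level keys of an
// input array by turning every underscore back into a dot.
// Assumption: no original field name contained a literal underscore.
function restoreDots(array $input)
{
    $restored = array();
    foreach ($input as $key => $value) {
        $restored[str_replace('_', '.', (string) $key)] = $value;
    }
    return $restored;
}

// Stand-in for what PHP would have placed in $_GET for ?x.y=a.b
$mangled = array('x_y' => 'a.b');
print_r(restoreDots($mangled));
```

If a form legitimately mixes names such as alpha_beta and alpha.beta, this approach cannot distinguish them, which is why the answers further down read the raw request instead.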

+56
Sep 16 '08 at 2:01

This question was answered long ago, but there is actually a better answer (or workaround). PHP gives you access to the raw input stream, so you can do something like this:

 $query_string = file_get_contents('php://input'); 

which will give you the $_POST data in query-string format, with periods as they should be.

You can then parse it if you need to (as per POSTer's comment):

 <?php
 // Function to fix up PHP messing up input containing dots, etc.
 // `$source` can be either 'POST' or 'GET'
 function getRealInput($source) {
     $pairs = explode("&", $source == 'POST' ? file_get_contents("php://input") : $_SERVER['QUERY_STRING']);
     $vars = array();
     foreach ($pairs as $pair) {
         if ($pair === '') {
             continue; // skip empty segments, e.g. an empty query string
         }
         $nv = explode("=", $pair, 2); // split on the first "=" only
         $name = urldecode($nv[0]);
         $value = isset($nv[1]) ? urldecode($nv[1]) : ''; // tolerate "name" without "=value"
         $vars[$name] = $value;
     }
     return $vars;
 }

 // Wrapper functions specifically for GET and POST:
 function getRealGET() { return getRealInput('GET'); }
 function getRealPOST() { return getRealInput('POST'); }
 ?>

Very useful for OpenID parameters, which contain both "." and "_", each with a specific meaning!

+53
Dec 21 '09 at 12:47

Highlighting an actual answer by Johan in a comment above: I just wrapped my entire post in a top-level array, which completely bypasses the problem without any heavy processing.

In the form you write

 <input name="data[database.username]">
 <input name="data[database.password]">
 <input name="data[something.else.really.deep]">

instead of

 <input name="database.username">
 <input name="database.password">
 <input name="something.else.really.deep">

and in the POST handler, just unwrap it:

 $posdata = $_POST['data']; 

For me, this was a two-line change, since my views were entirely templated.

FYI, I use dots in my field names to edit trees of grouped data.
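As an illustration only, here is one way dotted keys such as those in $_POST['data'] could be expanded into a nested tree. The helper expandDottedKeys and the sample values are assumptions; the answer does not show how its author actually processes the keys:

```php
<?php
// Hypothetical helper: expands keys such as "database.username" into
// nested arrays, one level per dot-separated segment.
function expandDottedKeys(array $flat)
{
    $tree = array();
    foreach ($flat as $dottedKey => $value) {
        $node = &$tree;
        foreach (explode('.', (string) $dottedKey) as $segment) {
            if (!isset($node[$segment]) || !is_array($node[$segment])) {
                $node[$segment] = array();
            }
            $node = &$node[$segment];
        }
        $node = $value;
        unset($node); // break the reference before handling the next key
    }
    return $tree;
}

// Stand-in for $_POST['data'] as submitted by a form wrapped in data[...]
$postData = array(
    'database.username' => 'alice',
    'database.password' => 'secret',
    'something.else.really.deep' => 'x',
);
print_r(expandDottedKeys($postData));
```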

+20
Dec 04 '13 at 1:36

This fix works universally and supports deep arrays, for example a[2][5]=10.

 function fix($source) {
     $source = preg_replace_callback(
         '/(^|(?<=&))[^=[&]+/',
         function($key) { return bin2hex(urldecode($key[0])); },
         $source
     );
     parse_str($source, $post);
     return array_combine(array_map('hex2bin', array_keys($post)), $post);
 }

You can then call this function like this, depending on the source:

 $_POST = fix(file_get_contents('php://input'));
 $_GET = fix($_SERVER['QUERY_STRING']);
 $_COOKIE = fix($_SERVER['HTTP_COOKIE']);

For PHP below 5.4: use base64_encode instead of bin2hex and base64_decode instead of hex2bin.

+16
Aug 13 '13 at 12:59

This happens because a period is an invalid character in a variable name. The reason lies very deep in the implementation of PHP, so there are no easy fixes (yet).

In the meantime, you can work around this problem by:

  • Accessing the raw request data via php://input for POST data or $_SERVER['QUERY_STRING'] for GET data
  • Using a conversion function.

The following conversion function (PHP >= 5.4) encodes the name of each key-value pair into a hexadecimal representation and then performs a regular parse_str(); once done, it converts the hexadecimal names back into their original form:

 function parse_qs($data) {
     $data = preg_replace_callback('/(?:^|(?<=&))[^=[]+/', function($match) {
         return bin2hex(urldecode($match[0]));
     }, $data);
     parse_str($data, $values);
     return array_combine(array_map('hex2bin', array_keys($values)), $values);
 }

 // work with the raw query string
 $data = parse_qs($_SERVER['QUERY_STRING']);

Or:

 // handle posted data (this only works with application/x-www-form-urlencoded)
 $data = parse_qs(file_get_contents('php://input'));

+6
Jan 21 '13 at 4:52

This happens because of PHP's old register_globals functionality. The . character is not a valid character in a variable name, so PHP converts it to an underscore to ensure compatibility.

In short, it's not good practice to use periods in URL variables.

+4
Sep 16 '08 at 1:56

This approach is a modified version of Rok Kralj's, with some tweaks to improve efficiency (it avoids unnecessary callbacks, encoding and decoding on unaffected keys) and to handle array keys correctly.

A gist with tests is available, and any feedback or suggestions are welcome here or there.

 // Written as a class method; drop `public` to use it as a standalone function.
 public function fix(&$target, $source, $keep = false) {
     if (!$source) {
         return;
     }
     $keys = array();

     $source = preg_replace_callback(
         '/
         # Match at start of string or &
         (?:^|(?<=&))
         # Exclude cases where the period is in brackets, e.g. foo[bar.blarg]
         [^=&\[]*
         # Affected cases: periods and spaces
         (?:\.|%20)
         # Keep matching until assignment, next variable, end of string or
         # start of an array
         [^=&\[]*
         /x',
         function ($key) use (&$keys) {
             $keys[] = $key = base64_encode(urldecode($key[0]));
             return urlencode($key);
         },
         $source
     );

     if (!$keep) {
         $target = array();
     }

     parse_str($source, $data);
     foreach ($data as $key => $val) {
         // Only unprocess encoded keys
         if (!in_array($key, $keys)) {
             $target[$key] = $val;
             continue;
         }

         $key = base64_decode($key);
         $target[$key] = $val;

         if ($keep) {
             // Keep a copy in the underscore key version
             $key = preg_replace('/(\.| )/', '_', $key);
             $target[$key] = $val;
         }
     }
 }

+4
Aug 10 '13 at 15:31

If you are looking for any way to literally get PHP to stop replacing '.' characters in the $_GET or $_POST arrays, then one such way is to modify PHP's source (and in this case it is relatively straightforward).

WARNING: modifying PHP's C source is an advanced option!

Also see this PHP bug report, which suggests the same modification.

To explore this, you'll need to:

  • download PHP's C source code
  • disable the . replacement check
  • ./configure, make, and deploy your customized build of PHP

The source change itself is trivial and involves updating just one half of one line in main/php_variables.c:

 ....
 /* ensure that we don't have spaces or dots in the variable name (not binary safe) */
 for (p = var; *p; p++) {
     if (*p == ' ' /*|| *p == '.'*/) {
         *p='_';
 ....

Note: compared to the original, || *p == '.' has been commented out.

Result:

given a QUERY_STRING of a.a[]=bb&a.a[]=BB&c%20c=dd, running <?php print_r($_GET); now produces:

 Array
 (
     [a.a] => Array
         (
             [0] => bb
             [1] => BB
         )

     [c_c] => dd
 )

Notes:

  • this patch addresses the original question only (it stops the replacement of dots, not spaces).
  • running on this patch will be faster than script-level solutions, but those pure-.php answers are still generally preferable (because they avoid changing PHP itself).
  • in theory, a polyfill approach is possible and could combine the approaches: test for the C-level change using parse_str() and, if it is unavailable, fall back to the slower methods.
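A rough sketch of that polyfill idea, under stated assumptions: the function names dotsArePreserved and parseQueryPreservingDots are hypothetical, and the fallback parser only handles flat keys (no a[]=... arrays). It probes parse_str() once to see whether dots survive and picks a strategy accordingly:

```php
<?php
// Returns true when this PHP build keeps dots in parsed variable names
// (for example, a build patched as described above), false otherwise.
function dotsArePreserved()
{
    parse_str('a.b=1', $probe);
    return array_key_exists('a.b', $probe);
}

// Hypothetical wrapper: use native parsing when dots survive, otherwise
// fall back to a slower script-level parser (flat keys only in this sketch).
function parseQueryPreservingDots($query)
{
    if (dotsArePreserved()) {
        parse_str($query, $result);
        return $result;
    }
    $result = array();
    foreach (explode('&', $query) as $pair) {
        if ($pair === '') {
            continue;
        }
        $parts = explode('=', $pair, 2);
        $result[urldecode($parts[0])] = isset($parts[1]) ? urldecode($parts[1]) : '';
    }
    return $result;
}

print_r(parseQueryPreservingDots('a.a=bb&c.d=dd'));
```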
+3
Aug 15 '13 at 0:28

After looking at Rok's solution, I came up with a version that addresses the limitations of my answer below, crb's above, and Rok's solution as well. See my improved version.

@crb's answer above is a good start, but there are a few problems.

  • It reprocesses everything, which is overkill; only fields that have a "." in the name need to be reprocessed.
  • It fails to handle arrays the way native PHP processing does, e.g. for keys like "foo.bar[]".

The solution below addresses both of these problems (note that it has been updated since it was first posted). In my testing it is about 50% faster than my answer above, but it will not handle situations where the data has the same key (or a key that gets extracted the same way, e.g. foo.bar and foo_bar are both extracted as foo_bar).

 <?php
 // Written as a class method; drop `public` to use it as a standalone function.
 public function fix2(&$target, $source, $keep = false) {
     if (!$source) {
         return;
     }
     preg_match_all(
         '/
         # Match at start of string or &
         (?:^|(?<=&))
         # Exclude cases where the period is in brackets, e.g. foo[bar.blarg]
         [^=&\[]*
         # Affected cases: periods and spaces
         (?:\.|%20)
         # Keep matching until assignment, next variable, end of string or
         # start of an array
         [^=&\[]*
         /x',
         $source,
         $matches
     );

     foreach (current($matches) as $key) {
         $key = urldecode($key);
         $badKey = preg_replace('/(\.| )/', '_', $key);

         if (isset($target[$badKey])) {
             // Duplicate values may have already unset this
             $target[$key] = $target[$badKey];

             if (!$keep) {
                 unset($target[$badKey]);
             }
         }
     }
 }

+2
Aug 03 '13 at 1:54

My solution to this problem was quick and dirty, but I still like it. I just wanted to post a list of the file names that were checked on the form. I used base64_encode to encode the file names in the markup, and then simply decoded them with base64_decode before using them.
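The answer does not include its code, so the following is only a sketch of how this could look. It assumes the file names are used as checkbox names, and encodeFieldName and decodeFieldNames are hypothetical helper names. Base64 output contains no spaces, dots or square brackets, so PHP leaves the encoded names untouched:

```php
<?php
// Hypothetical helper: encode a file name so it is safe as a field name.
function encodeFieldName($fileName)
{
    return base64_encode($fileName);
}

// Hypothetical helper: decode the keys of the posted array back to file names.
function decodeFieldNames(array $posted)
{
    $decoded = array();
    foreach ($posted as $encodedName => $value) {
        $name = base64_decode((string) $encodedName, true);
        if ($name !== false) { // ignore keys that are not valid base64
            $decoded[$name] = $value;
        }
    }
    return $decoded;
}

// In the markup, each checkbox would be emitted along these lines:
echo '<input type="checkbox" name="'
    . htmlspecialchars(encodeFieldName('report.final.pdf')) . '">', "\n";

// Stand-in for $_POST after submitting that checkbox.
$posted = array(encodeFieldName('report.final.pdf') => 'on');
print_r(decodeFieldNames($posted));
```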

+1
Feb 07 2018-11-22T00:

Why don't you just convert all the dots to some kind of token, such as (~#~), and then post it? When you receive the vars, you can convert them back. This is because we sometimes need to post underscores, and we would lose them if we converted every "_" back to ".".
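A minimal sketch of this token idea, assuming the token ~#~ never occurs in a real field name. The names DOT_TOKEN, hideDots and restoreDotKeys are hypothetical:

```php
<?php
// Assumption: this token never appears in a genuine field name.
define('DOT_TOKEN', '~#~');

// Use when rendering the form: "alpha.beta" becomes "alpha~#~beta".
function hideDots($fieldName)
{
    return str_replace('.', DOT_TOKEN, $fieldName);
}

// Use when handling the request: turn the token back into dots in each key.
function restoreDotKeys(array $posted)
{
    $restored = array();
    foreach ($posted as $key => $value) {
        $restored[str_replace(DOT_TOKEN, '.', (string) $key)] = $value;
    }
    return $restored;
}

// Stand-in for $_POST: one tokenized name and one genuine underscore name.
$posted = array(hideDots('alpha.beta') => '1', 'alpha_beta' => '2');
print_r(restoreDotKeys($posted));
```

Unlike a blanket str_replace of "_" with ".", genuine underscores survive the round trip because only the token is converted.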

+1
Sep 22 '11 at 19:20

Well, the function I include below, "getRealPostArray()", is not quite a solution, but it handles arrays and supports both names, "alpha_beta" and "alpha.beta":

 <input type='text' value='First-.' name='alpha.beta[ab][]' /><br>
 <input type='text' value='Second-.' name='alpha.beta[ab][]' /><br>
 <input type='text' value='First-_' name='alpha_beta[ab][]' /><br>
 <input type='text' value='Second-_' name='alpha_beta[ab][]' /><br>

whereas var_dump($_POST) produces:

 'alpha_beta' =>
   array (size=1)
     'ab' =>
       array (size=4)
         0 => string 'First-.' (length=7)
         1 => string 'Second-.' (length=8)
         2 => string 'First-_' (length=7)
         3 => string 'Second-_' (length=8)

var_dump(getRealPostArray()) produces:

 'alpha.beta' =>
   array (size=1)
     'ab' =>
       array (size=2)
         0 => string 'First-.' (length=7)
         1 => string 'Second-.' (length=8)
 'alpha_beta' =>
   array (size=1)
     'ab' =>
       array (size=2)
         0 => string 'First-_' (length=7)
         1 => string 'Second-_' (length=8)

The function, for what it's worth:

 function getRealPostArray() {
     if ($_SERVER['REQUEST_METHOD'] !== 'POST') { #Nothing to do
         return null;
     }
     $neverANamePart = '~#~'; #Any arbitrary string never expected in a 'name'
     $postdata = file_get_contents("php://input");
     $post = [];
     $rebuiltpairs = [];
     $postraws = explode('&', $postdata);
     foreach ($postraws as $postraw) { #Each is a string like: 'xxxx=yyyy'
         $keyvalpair = explode('=',$postraw);
         if (empty($keyvalpair[1])) {
             $keyvalpair[1] = '';
         }
         $pos = strpos($keyvalpair[0],'%5B');
         if ($pos !== false) {
             $str1 = substr($keyvalpair[0], 0, $pos);
             $str2 = substr($keyvalpair[0], $pos);
             $str1 = str_replace('.',$neverANamePart,$str1);
             $keyvalpair[0] = $str1.$str2;
         } else {
             $keyvalpair[0] = str_replace('.',$neverANamePart,$keyvalpair[0]);
         }
         $rebuiltpair = implode('=',$keyvalpair);
         $rebuiltpairs[]=$rebuiltpair;
     }
     $rebuiltpostdata = implode('&',$rebuiltpairs);
     parse_str($rebuiltpostdata, $post);
     $fixedpost = [];
     foreach ($post as $key => $val) {
         $fixedpost[str_replace($neverANamePart,'.',$key)] = $val;
     }
     return $fixedpost;
 }

0
Sep 12 '14 at 1:27

Using crb's approach, I wanted to recreate the $_POST array as a whole, but keep in mind that you still need to encode and decode correctly both on the client and on the server. It is important to understand when a character is truly invalid and when it is truly valid. In addition, people should always escape client data before using it with any database command, without exception.

 <?php
 unset($_POST);
 $_POST = array();
 $p0 = explode('&',file_get_contents('php://input'));
 foreach ($p0 as $key => $value) {
     $p1 = explode('=',$value);
     $_POST[$p1[0]] = $p1[1];
     //OR...
     //$_POST[urldecode($p1[0])] = urldecode($p1[1]);
 }
 print_r($_POST);
 ?>

I recommend using this only for individual cases; offhand, I'm not sure of the downsides of putting it at the top of your primary header file.

0
05 Oct '14 at 20:26

My current solution (based on the previous answers in this thread):

 function parseQueryString($data) {
     $data = rawurldecode($data);
     $pattern = '/(?:^|(?<=&))[^=&\[]*[^=&\[]*/';
     $data = preg_replace_callback($pattern, function ($match){
         return bin2hex(urldecode($match[0]));
     }, $data);
     parse_str($data, $values);
     return array_combine(array_map('hex2bin', array_keys($values)), $values);
 }

 $_GET = parseQueryString($_SERVER['QUERY_STRING']);

0
Nov 16 '15 at 12:36