How to handle 404 using regex based routing?

Please consider the following very rudimentary "controllers" (plain functions in this case, for simplicity):

    function Index()
    {
        var_dump(__FUNCTION__); // show the "Index" page
    }

    function Send($n)
    {
        var_dump(__FUNCTION__, func_get_args()); // placeholder controller
    }

    function Receive($n)
    {
        var_dump(__FUNCTION__, func_get_args()); // placeholder controller
    }

    function Not_Found()
    {
        var_dump(__FUNCTION__); // show a "404 - Not Found" page
    }

And the following regex-based Route() function:

    function Route($route, $function = null)
    {
        $result = rtrim(preg_replace('~/+~', '/', substr($_SERVER['PHP_SELF'], strlen($_SERVER['SCRIPT_NAME']))), '/');

        if (preg_match('~' . rtrim(str_replace(array(':any', ':num'), array('[^/]+', '[0-9]+'), $route), '/') . '$~i', $result, $matches) > 0) {
            exit(call_user_func_array($function, array_slice($matches, 1)));
        }

        return false;
    }

Now I want to map the following URLs (trailing slashes are ignored) to the respective "controllers":

    /index.php                 -> Index()
    /index.php/send/:NUM       -> Send()
    /index.php/receive/:NUM    -> Receive()
    /index.php/NON_EXISTENT    -> Not_Found()

This is where things start to get tricky. I have two problems that I cannot solve. I doubt I am the first person to run into them, so someone out there must have a solution.


Catching 404s (solved!)

I cannot find a way to distinguish between requests to the root (index.php) and requests for something that should not exist (such as index.php/notHere). I end up serving the default index.php route for URLs that should instead get the 404 - Not Found error page. How can I solve this?

EDIT - the solution just occurred to me:

    Route('/send/(:num)', 'Send');
    Route('/receive/(:num)', 'Receive');
    Route('/:any', 'Not_Found'); // use :any here, see the problem below
    Route('/', 'Index');

Ordering the Routes

If I configure the routes in a logical order, for example:

    Route('/', 'Index');
    Route('/send/(:num)', 'Send');
    Route('/receive/(:num)', 'Receive');
    Route(':any', 'Not_Found');

Every URL request is caught by the Index() controller, because '/' is reduced to an empty regular expression (remember: trailing slashes are stripped), and an empty expression matches everything. However, if I define the routes in a "hackish" order, for example:
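To see why the logical ordering misroutes every request, here is a small sketch that rebuilds the pattern the same way Route() does. The helper name buildPattern() is hypothetical and exists only for this illustration; the expression inside it is copied from Route().

```php
<?php
// Hypothetical helper: builds the same pattern string that Route() passes to preg_match().
function buildPattern(string $route): string
{
    $regex = str_replace(array(':any', ':num'), array('[^/]+', '[0-9]+'), $route);

    return '~' . rtrim($regex, '/') . '$~i';
}

// '/' loses its only slash, so all that is left is an end-of-string anchor.
var_dump(buildPattern('/'));                                    // "~$~i"

// An end-of-string anchor matches any path at all.
var_dump(preg_match(buildPattern('/'), '/send/42'));            // int(1)

// A route with real content only matches paths that end with it.
var_dump(preg_match(buildPattern('/send/(:num)'), '/send/42')); // int(1)
var_dump(preg_match(buildPattern('/send/(:num)'), '/nope'));    // int(0)
```

Because the pattern is only anchored at the end, whichever route is tried first and happens to be empty wins, which is why the order matters.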

    Route('/send/(:num)', 'Send');
    Route('/receive/(:num)', 'Receive');
    Route('/:any', 'Not_Found');
    Route('/', 'Index');

Everything seems to work as it should. Is there an elegant way to solve this problem?

Routes cannot always be hardcoded (they may be pulled from a database or somewhere else), and I need to make sure that no route is ignored because of the order in which the routes were defined. Any help is appreciated!

+4
3 answers

Well, I know there is more than one way to skin a cat, but why in the world would you do it this way? It looks like a RoR-style approach to something that can be handled easily with mod_rewrite.

That being said, I rewrote your Route function and was able to accomplish your goal. Keep in mind that I added one more condition to catch the index directly: you were stripping out all the slashes, which is why the index matched when you wanted the 404 to match. I also consolidated the four Route() calls into a single foreach().

    function Route()
    {
        $result = rtrim(preg_replace('~/+~', '/', substr($_SERVER['PHP_SELF'], strlen($_SERVER['SCRIPT_NAME']))), '/');
        $matches = array();

        // Tried in order; the empty pattern matches anything, so Not_Found must stay last.
        $routes = array(
            'Send'      => '/send/(:num)',
            'Receive'   => '/receive/(:num)',
            'Index'     => '/',
            'Not_Found' => '', // was null; an empty string avoids passing null to str_replace()
        );

        foreach ($routes as $function => $route) {
            // The index is matched explicitly: '/' only counts when the request path is empty.
            if (($route == '/' && $result == '') || (preg_match('~' . rtrim(str_replace(array(':any', ':num'), array('[^/]+', '[0-9]+'), $route)) . '$~i', $result, $matches) > 0)) {
                exit(call_user_func_array($function, array_slice($matches, 1)));
            }
        }

        return false;
    }

    Route();
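A variation on the same idea, which is not part of the answer above: if the pattern is anchored at both ends, '/' can only match the empty path, so the special case for the index is no longer needed and the 404 handler is simply what remains when nothing matches. The name resolveRoute(), its parameters, and its return shape are assumptions made for this sketch; it returns the handler instead of calling exit() so that it can be tried from the command line.

```php
<?php
// Hypothetical resolver: returns array(handler name, captured arguments) instead of calling exit().
function resolveRoute(array $routes, string $path, string $fallback = 'Not_Found'): array
{
    // Same normalisation as Route(): collapse repeated slashes, drop the trailing one.
    $path = rtrim(preg_replace('~/+~', '/', $path), '/');

    foreach ($routes as $route => $handler) {
        $regex = str_replace(array(':any', ':num'), array('[^/]+', '[0-9]+'), rtrim($route, '/'));

        // Anchoring with ^ and $ means '/' (now an empty pattern) matches only the empty path.
        if (preg_match('~^' . $regex . '$~i', $path, $matches) === 1) {
            return array($handler, array_slice($matches, 1));
        }
    }

    // Nothing matched: hand back the 404 handler with no arguments.
    return array($fallback, array());
}

$routes = array(
    '/'               => 'Index',
    '/send/(:num)'    => 'Send',
    '/receive/(:num)' => 'Receive',
);

list($handler, $args) = resolveRoute($routes, '/send/42/');
// $handler is 'Send' and $args is array('42'); '/notHere' resolves to 'Not_Found'.
```

With full anchoring, the order of the entries in $routes no longer decides between the index and the 404, although overlapping patterns (such as a literal segment versus :any) would still be tried in array order.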

Hurrah!

+1

This is a common problem with MVC web apps, and one that is often solved before it becomes a problem at all.

The easiest and most common way is to use exceptions. Throw a PageNotFound exception if you have no content for the given parameters. At the top level of your application, catch all exceptions, as in this simplified example:

index.php:

    try {
        $controller->method($arg);
    } catch (PageNotFound $e) {
        show404Page($e->getMessage());
    } catch (Exception $e) {
        logFatalError($e->getMessage());
        show500Page();
    }

controller.php:

    function method($arg)
    {
        $obj = findByID($arg);
        if (false === $obj) {
            throw new PageNotFound($arg);
        } else {
            ...
        }
    }
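For readers who want to run the pattern end to end, here is a minimal self-contained sketch. It assumes that PageNotFound is an ordinary Exception subclass; the in-memory $store, the two-argument findByID(), and the showItem() and dispatch() functions are hypothetical stand-ins for the controller and the top-level code shown above.

```php
<?php
// Assumed for illustration: PageNotFound is an ordinary exception subclass.
class PageNotFound extends Exception
{
}

// Hypothetical lookup standing in for findByID(); returns false when nothing exists.
function findByID(array $store, $id)
{
    return isset($store[$id]) ? $store[$id] : false;
}

// Hypothetical controller action: throws when there is no content for the id.
function showItem(array $store, $id): string
{
    $item = findByID($store, $id);
    if (false === $item) {
        throw new PageNotFound("No item with id $id");
    }

    return "200: $item";
}

// Hypothetical top-level dispatcher: turns exceptions into response strings.
function dispatch(array $store, $id): string
{
    try {
        return showItem($store, $id);
    } catch (PageNotFound $e) {
        return '404: ' . $e->getMessage();
    } catch (Exception $e) {
        return '500: internal error';
    }
}

$store = array(1 => 'first item');
echo dispatch($store, 1), "\n"; // 200: first item
echo dispatch($store, 9), "\n"; // 404: No item with id 9
```

The more specific catch block has to come first, because PageNotFound is itself an Exception and would otherwise be swallowed by the generic handler.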

The ordering problem can be solved by sorting the regular expressions so that the most specific one is tried first and the least specific one last. To do this, count the path separators (i.e., slashes) in each regular expression, excluding the one at the beginning. You get the following:

    Regex             Separators
    ----------------------------
    /send/(:num)      1
    /send/8/(:num)    2
    /                 0

Sort them in descending order and process them in that order:

  • /send/8/(:num)
  • /send/(:num)
  • /
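
The counting and sorting described above could be sketched as follows. The function names countSeparators() and sortRoutes() are hypothetical, and the sketch assumes PHP 7 or later for the <=> operator.

```php
<?php
// Hypothetical helper: number of slashes in a route, not counting the leading one.
function countSeparators(string $route): int
{
    return substr_count(ltrim($route, '/'), '/');
}

// Hypothetical helper: most separators first. Routes with equal counts have no guaranteed order.
function sortRoutes(array $routes): array
{
    usort($routes, function (string $a, string $b): int {
        return countSeparators($b) <=> countSeparators($a);
    });

    return $routes;
}

// Prints /send/8/(:num), then /send/(:num), then /
print_r(sortRoutes(array('/send/(:num)', '/send/8/(:num)', '/')));
```

Counting separators is only a rough measure of specificity: two routes with the same depth, such as a literal segment and :any, still need a tie-breaker of your own.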
+1

OK, first of all, something like:

 foo.com/index.php/more/info/to/follow 

works fine; by default it should load index.php with $_SERVER['PATH_INFO'] set to /more/info/to/follow. This is the CGI/1.1 standard. If you do NOT want the server to accept PATH_INFO requests, disable them in your server settings. Under Apache, this is done with:

 AcceptPathInfo Off 

If you set it to Off under Apache2, it will send a 404.
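
If PATH_INFO is left enabled, the extra path can be read directly instead of being derived from PHP_SELF and SCRIPT_NAME. The sketch below is one way to do that; the helper name requestPath() is hypothetical, and it takes a $_SERVER-like array as a parameter so that it can be tried outside a web server.

```php
<?php
// Hypothetical helper: takes a $_SERVER-like array so it can be tried outside a web server.
function requestPath(array $server): string
{
    // PATH_INFO is absent when the script is requested without any extra path.
    $path = isset($server['PATH_INFO']) ? $server['PATH_INFO'] : '';

    // Collapse repeated slashes and drop the trailing one, as Route() does.
    return rtrim(preg_replace('~/+~', '/', $path), '/');
}

var_dump(requestPath(array('PATH_INFO' => '/more/info/to/follow/'))); // "/more/info/to/follow"
var_dump(requestPath(array()));                                       // "" (plain /index.php)
```

In a real request you would call requestPath($_SERVER).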

I'm not sure what the IIS flag is, but I think you can find it.

0

Source: https://habr.com/ru/post/1315694/
