How to handle URLs in PHP

URL handling is a task that comes up regularly in PHP: you may want to record referral sites, write your own spider, or simply retrieve the current URL.

PHP is a language built around the web for web developers, and it ships with the functions you need for these jobs. The PHP documentation groups them in a section on URL functions. Besides a few encoding/decoding functions that are used less often, that section contains the functions you cannot live without:

parse_url

parse_url – a function that parses a URL and returns an associative array containing all the various components of the URL that are present.

A complete link parsed by parse_url should have the following form:

scheme://username:password@host:port/path/?query#fragment

For example the following code:

parse_url('http://username:password@php-html.net:80/tutorials/?arg=value#anchor');

will return the following array:

Array
(
    [scheme] => http
    [host] => php-html.net
    [port] => 80
    [user] => username
    [pass] => password
    [path] => /tutorials/
    [query] => arg=value
    [fragment] => anchor
)

You can pass a second parameter to retrieve only one of the components, as a string (or as an integer for PHP_URL_PORT). The possible values are: PHP_URL_SCHEME, PHP_URL_HOST, PHP_URL_PORT, PHP_URL_USER, PHP_URL_PASS, PHP_URL_PATH, PHP_URL_QUERY or PHP_URL_FRAGMENT:

echo parse_url('http://php-html.net/tutorials/', PHP_URL_HOST); // prints php-html.net
echo parse_url('http://php-html.net/tutorials/', PHP_URL_PATH); // prints /tutorials/
...
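
Recording referral sites, mentioned in the introduction, is a typical use of the single-component form. The following is a minimal sketch, assuming a hypothetical helper named referrer_host() (not a PHP built-in) and PHP 7.1 or later for the nullable type hints. In a real request the argument would come from $_SERVER['HTTP_REFERER'], which the client may omit or fake.

```php
<?php
// Hypothetical helper (not a PHP built-in): extracts the host from a referrer URL.
function referrer_host(?string $referrer): ?string
{
    if ($referrer === null || $referrer === '') {
        return null;
    }
    // With a component constant, parse_url() returns null when that
    // component is absent and false when the URL is seriously malformed.
    $host = parse_url($referrer, PHP_URL_HOST);

    // Host names are case-insensitive, so normalise before storing them.
    return is_string($host) ? strtolower($host) : null;
}

var_dump(referrer_host('http://PHP-HTML.net/tutorials/?arg=value')); // string(12) "php-html.net"
var_dump(referrer_host('/tutorials/'));                              // NULL
```

Inside a page this could be called as referrer_host($_SERVER['HTTP_REFERER'] ?? null).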

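You have to keep in mind that:
– This function may not give correct results for relative URLs; components that are missing, such as the host, are simply absent from the result.
– It should not be used for URL validation; it only parses URLs and splits them into components.
– Partial and invalid URLs are accepted and parsed on a best-effort basis, but a seriously malformed URL makes the function return false.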
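
These caveats can be checked with a short sketch. The input strings are made-up examples, and filter_var() with FILTER_VALIDATE_URL is shown as one possible validation step, not as something parse_url does itself.

```php
<?php
// A relative URL: only the components that are present come back.
$relative = parse_url('/tutorials/?arg=value');
var_dump(isset($relative['host'])); // bool(false)
echo $relative['path'], "\n";       // /tutorials/
echo $relative['query'], "\n";      // arg=value

// A seriously malformed URL makes parse_url() return false.
$malformed = parse_url('http:///example.com');
var_dump($malformed);               // bool(false)

// parse_url() also splits strings that are not URLs at all,
// so validation has to be a separate step.
$loose = parse_url('not a url');    // array with a single 'path' key
$valid = filter_var('not a url', FILTER_VALIDATE_URL);
var_dump($valid);                   // bool(false)
```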

http_build_query

http_build_query – a complementary function that builds a URL-encoded query string from an associative (or indexed) array:

$data = array('foo'=>'bar',
              'baz'=>'boom',
              'cow'=>'milk',
              'php'=>'hypertext processor');

echo http_build_query($data); // foo=bar&baz=boom&cow=milk&php=hypertext+processor
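// The third argument sets the separator explicitly (the default comes from the arg_separator.output setting):
echo http_build_query($data, '', '&'); // foo=bar&baz=boom&cow=milk&php=hypertext+processor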
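
parse_url and http_build_query are often combined to change one parameter of an existing URL. The sketch below assumes a hypothetical helper named set_query_param() (not a PHP built-in) and PHP 7 or later. It uses parse_str() to turn the existing query string into an array, and it deliberately ignores the user and pass components to stay short.

```php
<?php
/**
 * Hypothetical helper (not part of PHP): returns $url with the query
 * parameter $name set to $value. Credentials (user/pass) are dropped.
 */
function set_query_param(string $url, string $name, string $value): string
{
    $parts = parse_url($url);
    if ($parts === false) {
        return $url; // leave seriously malformed input untouched
    }

    // Turn the existing query string into an array, then override one key.
    $params = [];
    parse_str($parts['query'] ?? '', $params);
    $params[$name] = $value;

    // Reassemble the URL from the components that were present.
    $result = '';
    if (isset($parts['scheme'], $parts['host'])) {
        $result .= $parts['scheme'] . '://' . $parts['host'];
        if (isset($parts['port'])) {
            $result .= ':' . $parts['port'];
        }
    }
    $result .= $parts['path'] ?? '';
    $result .= '?' . http_build_query($params, '', '&');
    if (isset($parts['fragment'])) {
        $result .= '#' . $parts['fragment'];
    }
    return $result;
}

echo set_query_param('http://php-html.net/tutorials/?arg=value#anchor', 'page', '2'), "\n";
// http://php-html.net/tutorials/?arg=value&page=2#anchor
```

Passing the separator explicitly keeps the result independent of the arg_separator.output setting.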

Encoding/Decoding urls

Base64 encoding

Base64 encoding transforms arbitrary data into a string that uses only a limited set of ASCII characters, so it can be transmitted through transport layers that are not 8-bit clean, such as mail bodies. The functions base64_encode / base64_decode can be used to encode and decode any string:

$str = 'This is an encoded string';
echo base64_encode($str); // will print VGhpcyBpcyBhbiBlbmNvZGVkIHN0cmluZw==
echo base64_decode(base64_encode($str)); // will print This is an encoded string
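
Standard Base64 output can contain the characters +, / and =, which have their own meaning inside a URL. A common workaround is a URL-safe variant that swaps those characters. The sketch below assumes two hypothetical helpers, base64url_encode() and base64url_decode(); they are not PHP built-ins.

```php
<?php
// Hypothetical helpers (not PHP built-ins) for a URL-safe Base64 variant.
function base64url_encode(string $data): string
{
    // Swap '+' and '/' for '-' and '_', and drop the '=' padding.
    return rtrim(strtr(base64_encode($data), '+/', '-_'), '=');
}

function base64url_decode(string $data): string
{
    // Reverse the character swap; in its default (non-strict) mode
    // base64_decode() accepts input without padding.
    return base64_decode(strtr($data, '-_', '+/'));
}

$token = base64url_encode('This is an encoded string');
echo $token, "\n";                   // VGhpcyBpcyBhbiBlbmNvZGVkIHN0cmluZw
echo base64url_decode($token), "\n"; // This is an encoded string
```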

URL encoding / URL decoding

PHP has two pairs of encoding/decoding functions designed especially for URLs. Both make a string safe to use in a URL, for example in the query part as a convenient way to pass variables to the next page. There is a slight difference between them:

urlencode / urldecode – encodes a string by replacing all non-alphanumeric characters except -_. with a percent (%) sign followed by two hex digits. Spaces are encoded as plus (+) signs. This is the same way posted data from a WWW form is encoded, that is, the application/x-www-form-urlencoded media type.
rawurlencode / rawurldecode – does the same job as the functions above, except that it replaces spaces with a percent sign followed by the hex representation: %20. Since PHP 5.3 it follows RFC 3986 and leaves the tilde (~) unencoded; earlier versions followed RFC 1738.
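
The difference is easiest to see side by side. The sample strings below are made up for illustration; a common convention, assumed here, is to use rawurlencode for path segments and urlencode for query values.

```php
<?php
$value = 'hypertext processor & more~';

echo urlencode($value), "\n";    // hypertext+processor+%26+more%7E
echo rawurlencode($value), "\n"; // hypertext%20processor%20%26%20more~

// Path segments: spaces become %20.
$path = '/docs/' . rawurlencode('my file.pdf');  // /docs/my%20file.pdf

// Query values: spaces become +.
$query = '?title=' . urlencode('a b&c');         // ?title=a+b%26c

// Decoding mirrors the difference: only urldecode() turns '+' into a space.
echo urldecode('a+b'), "\n";     // a b
echo rawurldecode('a+b'), "\n";  // a+b
```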

Getting the Current URL

There are many cases where you need the current URL. Unfortunately PHP does not provide a function dedicated to this, but one can easily be written that builds the current URL string from the server variables.

function current_url() {
    $isHTTPS = ( isset($_SERVER["HTTPS"]) && $_SERVER["HTTPS"] == "on" );
    $isPort = ( isset($_SERVER["SERVER_PORT"]) && ((!$isHTTPS && $_SERVER["SERVER_PORT"] != "80")
                 || ($isHTTPS && $_SERVER["SERVER_PORT"] != "443")));

    // Add the port only when it differs from the default for the scheme
    $port = ( $isPort ) ? ( ':'.$_SERVER["SERVER_PORT"] ) : '';

    $url = ( $isHTTPS ? 'https://' : 'http://')
                .$_SERVER["SERVER_NAME"].$port.$_SERVER["REQUEST_URI"];

    // On some setups like nginx and php-fastcgi, REQUEST_URI already includes
    // the query string; append QUERY_STRING only when it does not.
    if ( strpos($_SERVER['REQUEST_URI'], '?') === false )
    {
        $isQuery = ( isset($_SERVER["QUERY_STRING"]) && $_SERVER["QUERY_STRING"] != '');
        $url .= ( $isQuery ) ? ( '?'.$_SERVER["QUERY_STRING"] ) : '';
    }

    return $url;
}

The function needs little further explanation. It uses the $_SERVER variables to construct the URL, checking whether the page is served over HTTPS and whether a port other than the default one is used. The snippet was improved thanks to the comments from Joseph and ferdhie.


7 Comments

  1. I would also include QUERY_STRING in your function. So I would update it to the following:

    function current_url() {
    $isHTTPS = ( isset($_SERVER["HTTPS"]) && $_SERVER["HTTPS"] == "on" );
    $isPort = ( isset($_SERVER["SERVER_PORT"]) && ((!$isHTTPS && $_SERVER["SERVER_PORT"] != "80")
    || ($isHTTPS && $_SERVER["SERVER_PORT"] != "443")));
    $port = ( $isPort ) ? ( ':'.$_SERVER["SERVER_PORT"] ) : '';
    $isQuery = ( isset($_SERVER["QUERY_STRING"]) && $_SERVER["QUERY_STRING"] != "");
    $query = ( $isQuery ) ? ( '?'.$_SERVER["QUERY_STRING"] ) : '';
    $url = ( $isHTTPS ? 'https://' : 'http://').$_SERVER["SERVER_NAME"].$port.$_SERVER["REQUEST_URI"].$query;

    return $url;
    }

  2. Your RSS link is leading to an error page. The error it’s showing is this:

    This page contains the following errors:

    error on line 265 at column 625: Extra content at the end of the document
    Below is a rendering of the page up to the first error.

  3. On some setups like nginx and php-fastcgi, REQUEST_URI include the query string,
    Here’s my mods

    //strip querystring from request_uri
    if ( ($pos = strpos($_SERVER['REQUEST_URI'], '?')) !== false )
        $_SERVER['REQUEST_URI'] = substr($_SERVER['REQUEST_URI'], 0, $pos);
