Clean input variable PHP

Some time ago I wrote a post about the query string, and I think I covered that topic pretty well; however, I didn't mention one thing: cleaning input variables. In fact, it is much more important to know how to clean the $_POST and $_GET arrays than to know how to handle the query string, because variables sent by the user are the most common way to attack your script. It is that simple: if you take care of input variables, your script is far safer.

The whole idea of cleaning input variables is pretty simple: it comes down to escaping single and double quotes in variables sent by the user and replacing potentially dangerous characters with their HTML entities. There are a few ways to do it.

First we need to set the proper directives in the php.ini file, that is:

magic_quotes_gpc = Off
magic_quotes_runtime = Off

I set both to Off so that PHP won't clean GPC (GET, POST, COOKIE) variables for me by default. At first it may look like a bad idea to switch off magic quotes, but it is not. You know the old saying: if you want something done right, do it yourself. Well, maybe I exaggerated a little, but you get the idea. (Magic quotes were deprecated in PHP 5.3 and removed in PHP 5.4, so on newer versions there is nothing to switch off.)

There are two kinds of variable cleaning: for the database and for output. Good practice is to let the client write whatever he wants into the database, even potentially dangerous HTML code; however, we do not want to display this code in a way that makes it easy for a newbie hacker to attack our site.

A good example of this is a forum, where people are allowed to write whatever they want. For example, they can write a JavaScript script that steals other people's personal information from cookies and sends it to some other website where the thief collects the data. Really, it can all be done with JavaScript, so we need a way to prevent this.

Cleaning Input Variables For Database

Let’s assume that someone sends text by POST. First we want to clean the variable for the database. If you use the procedural mysql extension, this can easily be done with:

// $_POST['data'] = "quote's";
// mysql_real_escape_string() needs an open connection (see mysql_connect())
echo $_POST['data'], "\n";
$cleanedForDb = mysql_real_escape_string($_POST['data']);
echo $cleanedForDb, "\n";

outputs:

quote's
quote\'s

The $cleanedForDb variable can now be safely inserted into a quoted string in an SQL query because all quotes are escaped. On the other hand, if you insert the raw $_POST['data'] variable into a query, then in the best case you will only get an error.

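Keep in mind that escaping is meant for the query itself: if the data still comes back out of the database with slashes in it (which happens when it was escaped twice, for example by magic quotes and again by mysql_real_escape_string), then you must unescape it when you read it. It is more effort than before, but security is worth it.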
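The mysql_* functions were removed in PHP 7, so here is a rough sketch of the same job done with a PDO prepared statement instead. This is a different technique from mysql_real_escape_string: the value is bound as a parameter and sent separately from the SQL text, so nothing is escaped by hand. The sketch assumes the pdo_sqlite extension and an in-memory SQLite database only so that it runs without a server; the posts table and the savePost helper are made up for illustration.

```php
<?php
// Hypothetical helper: stores one text value using a bound parameter.
function savePost(PDO $pdo, $text)
{
    $stmt = $pdo->prepare('INSERT INTO posts (body) VALUES (:body)');
    $stmt->execute(array(':body' => $text));
}

// The demo only runs when the pdo_sqlite extension is loaded.
$stored = null;
if (extension_loaded('pdo_sqlite')) {
    $pdo = new PDO('sqlite::memory:');
    $pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
    $pdo->exec('CREATE TABLE posts (body TEXT)');

    // Stand-in for $_POST['data']
    savePost($pdo, "quote's");

    // Read the value back to see what the database actually holds
    $stored = $pdo->query('SELECT body FROM posts')->fetchColumn();
    echo $stored, "\n";
}
```

With a MySQL database the same pattern should apply with a MySQL DSN in place of sqlite::memory:. The value read back is the original text, quote included, because the bound parameter never had slashes added to it.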

Cleaning Input Variable For Output

We want to display data sent by the user on screen. Remembering that it can contain dangerous code, it is best to clean this data with htmlentities:

$clean = htmlentities($_POST['data'], ENT_QUOTES, 'UTF-8');

The first htmlentities argument here is the text. The second is the constant ENT_QUOTES, which means that both single and double quotes will be converted to their entities. The third is the charset; I used UTF-8, but you should use whatever charset your pages use. The second and third arguments are optional, but it is good to add them anyway.

But we usually want to secure the whole $_POST array, and htmlentities only secures a single variable, so as programmers we need to write our own function:
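To make this concrete, here is a small sketch of what htmlentities does to the kind of forum post described earlier. The $input string is a made-up example, not real user data.

```php
<?php
// Hypothetical forum post containing a script tag and both kinds of quotes.
$input = '<script>alert("it\'s me")</script>';

$clean = htmlentities($input, ENT_QUOTES, 'UTF-8');

echo $clean, "\n";
// prints: &lt;script&gt;alert(&quot;it&#039;s me&quot;)&lt;/script&gt;
```

The browser shows the entities as plain text, so the script is displayed instead of executed.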

function confHtmlEnt($data)
{
    return htmlentities($data, ENT_QUOTES, 'UTF-8');
}
 
$cleanPost = array_map('confHtmlEnt', $_POST);

There, the whole POST array is clean, and the cool part is that we can clean $_GET, $_COOKIE or any other array in the same manner.

Remember when I said we may need to unescape slashes when getting data out of the database? Here is how to do it:
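One thing to watch: array_map with confHtmlEnt assumes every value is a string, but a form field named like tags[] arrives as a nested array. Here is a minimal sketch of a recursive variant; the name confHtmlEntDeep and the $post sample are my own, chosen only for illustration.

```php
<?php
// Hypothetical helper: like confHtmlEnt(), but walks nested arrays too.
function confHtmlEntDeep($data)
{
    if (is_array($data)) {
        // array_map with a single array keeps the original keys
        return array_map('confHtmlEntDeep', $data);
    }
    return htmlentities((string) $data, ENT_QUOTES, 'UTF-8');
}

// Stand-in for $_POST with one field submitted as tags[]
$post = array(
    'name' => 'Tom & "Jerry"',
    'tags' => array('<b>bold</b>', "it's"),
);

$cleanPost = confHtmlEntDeep($post);
print_r($cleanPost);
```

The function calls itself for every nested array and only hands plain values to htmlentities, so the keys and the shape of the array stay the same.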

// $row['column_1'] = "quote\'s";
echo $row['column_1'], "\n";
$unescaped = stripslashes($row['column_1']);
echo $unescaped, "\n";

output:

quote\'s
quote's

or if we want to unescape the whole array:

$unescaped = array_map('stripslashes', $row);

Hmmm, I thought this would be a pretty short post, but it became quite long. Fortunately, this covers the basics you need to make your scripts much safer than before.

Published in General

13 Comments

  1. sector

    Great post about variable cleaning! I’ve been usually cleaning the input already when storing it to a variable, this way I don’t have to worry about it anymore. However, in some cases I might want to be able to see what the user has really entered.

  2. Thanks Greg,
    That is what I was looking for. It's not perfect, but it's enough; now I can use GET and POST with more confidence, doing something simple:

    function confHtmlEnt($data)
    {…}

    $cleanPost = array_map('confHtmlEnt', $_GET);
    $_GET=$cleanPost;
    And avoid analyzing my vars case by case.

  3. @Gustavo and everyone else

    You can simplify: you don’t need to assign the function result to a temporary variable first, i.e. instead of

    $cleanPost = array_map('confHtmlEnt', $_GET);
    $_GET=$cleanPost;

    do

    $_GET = array_map('confHtmlEnt', $_GET);

    A nice function is


    function cleanData($data) {
    $data = trim($data);
    $data = htmlentities($data);
    $data = mysql_real_escape_string($data);
    }

    $_POST = array_map('cleanData', $_POST);

  4. edit: the function is missing a “return $data;” line at the end.

  5. jolly joker

    Charlie's solution is known to work; it never fails.

  6. dave

    Jolly Joker, it does work. Thank you, Charles.

  7. Matt

    This is nice, however you’re missing one important aspect… GET and POST arrays can have arrays as values, so this will fail to clean them (and may in fact throw a PHP error as well).

    Recursive functions come in handy here…
    function __stripslashes($var)
    {
        $var = is_array($var) ? array_map('__stripslashes', $var) : stripslashes($var);

        return $var;
    }
    function __htmlspecialchars($var, $style)
    {
        $var = is_array($var) ? array_map('__htmlspecialchars', $var, array_fill(0, count($var), $style)) : htmlspecialchars($var, $style);

        return $var;
    }
    $_GET = __stripslashes($_GET);
    $_GET = __htmlspecialchars($_GET, ENT_QUOTES);

  8. Superb post. Niftier than the similar post I checked 2 days ago on WordPress. Maintain the good work.

  9. Thanks, Charlie, your script is superb. I also added a stripslashes and strip_tags in there for good measure.

  10. Nice read. I found your blog on bing and i have your page bookmarked on my favorite read list!
    I’m a fan of your blog. Keep up the good work

  11. Hi, thanks. I've been looking for a simple-to-understand and logical way to clean entries going into a database.
    This article was very helpful. Are there any other tips about cleaning code or making it more secure before it gets to a database? I currently have Dreamweaver code with your code slotted in to try and stop attacks.
    Any help would be much appreciated.
    Code:
    if (!function_exists("GetSQLValueString")) {
    function GetSQLValueString($theValue, $theType, $theDefinedValue = "", $theNotDefinedValue = "")
    {
        $theValue = get_magic_quotes_gpc() ? stripslashes($theValue) : $theValue;

        $theValue = function_exists("mysql_real_escape_string") ? mysql_real_escape_string($theValue) : mysql_escape_string($theValue);
        $theValue = htmlentities($theValue, ENT_QUOTES, 'UTF-8');
        trim($theValue);
        switch ($theType) {
            case "text":
                $theValue = ($theValue != "") ? "'" . $theValue . "'" : "NULL";
                break;
            case "long":
            case "int":
                $theValue = ($theValue != "") ? intval($theValue) : "NULL";
                break;
            case "double":
                $theValue = ($theValue != "") ? "'" . doubleval($theValue) . "'" : "NULL";
                break;
            case "date":
                $theValue = ($theValue != "") ? "'" . $theValue . "'" : "NULL";
                break;
            case "defined":
                $theValue = ($theValue != "") ? $theDefinedValue : $theNotDefinedValue;
                break;
        }
        return $theValue;
    }
    }

  12. SMS

    I knew I needed to protect my form processor from the data submitted to it, but wasn’t sure how to do it with PHP. This helps a bunch, thanks!

  13. “Keep in mind that if you escape data before inserting it into database then you must unescape it when you want to get it out of database, a lot more effort then before, but security is worth it.”

    “Remember when i said we need to unescape, slashes when getting data out of database? Here is how to do it:”

    Sorry, but both of those are quite wrong :<

    Escaping data is for the *TRANSPORT* between PHP, and your database – once your database has received that data, it will store it without the escaping slashes. When you SELECT it back out, the slashes won't be there. If you're seeing extra slashes in your data, it means you're escaping TWICE, which is not a good idea, and can actually open some rather nasty security HOLES.

    What you're seeing is because WordPress automatically escapes all inputs, just like magic_quotes_gpc does, and yes, it suffers from the same issues ( as detailed here http://php.net/magic_quotes ).

    If you're using stripslashes at all, you should be using it BEFORE you use your database specific escaping function, like mysql_real_escape_string(). This way mysql_real_escape_string() is fed the actual content you want to store in the database, and it'll work correctly.

    You shouldn't ever need stripslashes() when SELECT'ing data out from your database.
