PHP – Securing Your Web Application: Escape Output

Escaping is a technique that preserves data as it enters another context. PHP is frequently used as a bridge between disparate data sources, and when you send data to a remote source, it’s your responsibility to prepare it properly so that it’s not misinterpreted.

For example, O'Reilly is represented as O\'Reilly when used in an SQL query to be sent to a MySQL database. The backslash before the single quote exists to preserve the single quote in the context of the SQL query. The single quote is part of the data, not part of the query, and the escaping guarantees this interpretation.

The two predominant remote sources to which PHP applications send data are HTTP clients (web browsers) that interpret HTML, JavaScript, and other client-side technologies, and databases that interpret SQL. For the former, PHP provides htmlentities():
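The listing introduced above is not reproduced here. What follows is a minimal sketch of how htmlentities() might be used, assuming a $clean array that already holds filtered input; the username key and the sample value are illustrative only.

```php
<?php
// Assumption: $clean holds input that has already been filtered.
$clean = array('username' => "O'Reilly <admin>");

// $html holds data that has been escaped for use in an HTML context.
$html = array();
$html['username'] = htmlentities($clean['username'], ENT_QUOTES, 'UTF-8');

// The quote and the angle brackets reach the browser as entities, not markup.
echo "<p>Welcome back, {$html['username']}.</p>";
```

Passing ENT_QUOTES and an explicit character set makes the call escape both quote styles and avoids depending on server defaults.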

This example demonstrates the use of another naming convention. The $html array is similar to the $clean array, except that its purpose is to hold data that is safe to be used in the context of HTML.

URLs are sometimes embedded in HTML as links:
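The link listing is also missing at this point. A minimal sketch, assuming a hard-coded alphabetic $value and a placeholder host and script name, might look like this:

```php
<?php
// Assumption: $value is a fixed, purely alphabetic string, so it needs no escaping.
$value = 'foo';

// "host" and "script.php" are placeholders, not a real endpoint.
$link = "<a href=\"http://host/script.php?var={$value}\">Click Here</a>";

echo $link;
```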

In this particular example, $value exists within nested contexts. It’s within the query string of a URL that is embedded in HTML as a link. Because it’s alphabetic in this case, it’s safe to be used in both contexts. However, when $value cannot be guaranteed to be safe in these contexts, it must be escaped twice, first for the URL and then for the HTML that contains it:
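A sketch of the double escaping, assuming an arbitrary sample value and the same placeholder host; the extra page parameter is added only to show the HTML escaping step changing something:

```php
<?php
// Assumption: $value may contain characters that are unsafe in URLs and in HTML.
$value = 'a b&c=<d>';

// Step 1: escape for the URL context (the query string).
$url = array();
$url['value'] = urlencode($value);

$link = "http://host/script.php?var={$url['value']}&page=2";

// Step 2: escape the whole URL for the HTML context (the href attribute).
$html = array();
$html['link'] = htmlentities($link, ENT_QUOTES, 'UTF-8');

echo "<a href=\"{$html['link']}\">Click Here</a>";
```

The order matters: the value is encoded for the innermost context first, and the finished URL is then escaped for the outer HTML context.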

This ensures that the link is safe to be used in the context of HTML, and when it is used as a URL (such as when the user clicks the link), the URL encoding ensures that $value is preserved.

For most databases, there is a native escaping function specific to the database. For example, the MySQLi extension provides mysqli_real_escape_string():
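The original listing is absent here as well. A minimal sketch follows, assuming an open mysqli connection in $db; the helper name, table, and column are hypothetical, and the connection details in the usage comment are placeholders.

```php
<?php
// Hypothetical helper: builds a query with the value escaped for MySQL.
// $db must be an open mysqli connection, because the escaping depends on
// the connection's character set.
function buildProfileQuery(mysqli $db, string $username): string
{
    // $mysql holds data that is safe to use in the context of a MySQL query.
    $mysql = array();
    $mysql['username'] = mysqli_real_escape_string($db, $username);

    return "SELECT * FROM profile WHERE username = '{$mysql['username']}'";
}

// Usage (placeholder credentials):
// $db = mysqli_connect('localhost', 'user', 'password', 'app');
// $sql = buildProfileQuery($db, "O'Reilly");
```

Note that the escaped value must still be wrapped in quotes inside the query; escaping alone does not protect an unquoted value.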

An even safer alternative is to use a database abstraction library that handles the escaping for you. The following illustrates this concept with PEAR::DB:
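The PEAR::DB listing is not reproduced here. A minimal sketch of the idea, assuming $db is an already connected PEAR::DB object (for example, one returned by DB::connect()); the helper name, table, and column are hypothetical:

```php
<?php
// Hypothetical helper. $db is assumed to be a connected PEAR::DB object;
// the caller is expected to check the result with DB::isError().
function insertLastName($db, string $lastName)
{
    // The question mark is a placeholder, not part of the data.
    $sql = 'INSERT INTO users (last_name) VALUES (?)';

    // The second argument supplies the values bound to the placeholders.
    return $db->query($sql, array($lastName));
}
```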

Although this is not a complete example, it highlights the use of a placeholder (the question mark) in the SQL query. PEAR::DB properly quotes and escapes the data according to the requirements of your database.
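The same placeholder technique can be tried without PEAR. The sketch below swaps in PDO, which ships with PHP, and uses an in-memory SQLite database so that it runs without a server; it assumes the pdo_sqlite extension is available, and the table and column names are illustrative.

```php
<?php
// Assumption: pdo_sqlite is installed. The database lives only in memory.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE users (last_name TEXT)');

// The value is bound to the placeholder rather than pasted into the SQL.
$insert = $pdo->prepare('INSERT INTO users (last_name) VALUES (?)');
$insert->execute(array("O'Reilly"));

// The single quote survives the round trip as data.
$stored = $pdo->query('SELECT last_name FROM users')->fetchColumn();
echo $stored;
```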

A more complete output-escaping solution would include context-aware escaping for HTML elements, HTML attributes, JavaScript, CSS, and URL content, and would do so in a Unicode-safe manner. The content-escaping rules defined by the Open Web Application Security Project (OWASP) are a good basis for a class that escapes output in a variety of contexts.
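The sample class itself is not included here. The following is only a rough sketch of what such a class might look like, not the original listing: the class and method names are hypothetical, CSS is not covered, and control characters get no special treatment.

```php
<?php
// Hypothetical sketch: class and method names are illustrative, not a library API.
final class OutputEscaper
{
    // HTML element content: encode &, <, >, and both quote styles.
    public static function html(string $value): string
    {
        return htmlspecialchars($value, ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8');
    }

    // HTML attribute values: hex-encode every ASCII character outside a small
    // safe set. Bytes >= 0x80 are left alone so UTF-8 sequences stay intact.
    public static function attr(string $value): string
    {
        return preg_replace_callback(
            '/[^a-zA-Z0-9,._\-\x80-\xff]/',
            function (array $m): string {
                return '&#x' . strtoupper(dechex(ord($m[0]))) . ';';
            },
            $value
        );
    }

    // JavaScript string literal: JSON encoding with <, >, &, ' and " hex-escaped.
    // The returned value includes the surrounding double quotes.
    public static function js(string $value): string
    {
        $flags = JSON_HEX_TAG | JSON_HEX_AMP | JSON_HEX_APOS | JSON_HEX_QUOT;
        $encoded = json_encode($value, $flags);
        if ($encoded === false) {
            throw new InvalidArgumentException('Value could not be encoded.');
        }
        return $encoded;
    }

    // URL component (for example, a query string value).
    public static function url(string $value): string
    {
        return rawurlencode($value);
    }
}

echo OutputEscaper::html('<b>bold</b>');
```

Keeping one method per context makes the choice of escaping explicit at each point where data is written out.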


Thanks to Kevin Tatroe, Peter MacIntyre, and Rasmus Lerdorf. Special thanks to O’Reilly.