Encode User Input Before Outputting It to a Page - Or Else!
Turn bad user input into good ... and prevent your Web
server from being hacked.
Here's an extraordinarily simple ASP.NET Web page - one that displays
a personalized greeting consisting of "Hello" followed by a name
typed by the user when the Click Me button is clicked. Despite its simplicity,
this page suffers from a potentially fatal security flaw that can put cookies,
your Web server, and even other computers behind your firewall at risk. Can you
spot the flaw?
<html>
  <body>
    <form runat="server">
      <asp:TextBox ID="Input" RunAt="server" />
      <asp:Button Text="Click Me" OnClick="OnSubmit" RunAt="server" />
      <asp:Label ID="Output" RunAt="server" />
    </form>
  </body>
</html>

<script language="C#" runat="server">
  void OnSubmit (Object sender, EventArgs e)
  {
      Output.Text = "Hello, " + Input.Text;
  }
</script>
If you guessed that the flaw has to do with the fact that the
code echoes raw user input to the page, you guessed correctly. A Web page
should never, under ANY CIRCUMSTANCES, echo raw, unfiltered user input to the
page. Why? Because it leaves that page susceptible to cross-site scripting
(XSS) attacks. To demonstrate, run the page and type the following text into
the text box:
<script>alert('Gotcha')</script>
Click the Click Me button and a message box will pop up in
your browser. The problem? When you echoed a script block to the page, the
browser interpreted it as a piece of code that needs to be executed. This
script is benign. But malicious users - read: hackers - can enter scripts that
are far from benign. Even a page as simple as this one can't afford to assume
that all users have honorable intentions. (Note: If you run this page with
ASP.NET 1.1, request validation rejects the input with an error before your
code runs, so you'll need to add an <%@ Page ValidateRequest="false" %>
directive to the top of the page to see the message box.)
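To see why the browser runs the script, it helps to look at the markup the page sends back. The following console sketch is not part of the original page: RenderGreeting is a hypothetical helper that mimics the concatenation in OnSubmit, and it assumes the Label control renders its text inside a span element.

```csharp
using System;

static class RawEchoDemo
{
    // Hypothetical helper: mimics Output.Text = "Hello, " + Input.Text,
    // assuming the Label renders its text inside a <span> element.
    public static string RenderGreeting(string input)
    {
        return "<span id=\"Output\">Hello, " + input + "</span>";
    }

    static void Main()
    {
        // Ordinary input ends up as plain text inside the span.
        Console.WriteLine(RenderGreeting("Alice"));

        // Script input ends up as a <script> element inside the span,
        // which the browser treats as code rather than as text.
        Console.WriteLine(RenderGreeting("<script>alert('Gotcha')</script>"));
    }
}
```

In the second line of output the user's text is indistinguishable from markup the page author wrote, which is exactly what the browser acts on.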
The solution is simple. Before echoing user input to a page,
use Server.HtmlEncode to HTML-encode it. The following page is functionally
equivalent to the previous one, but because it HTML-encodes the text typed by
the user before outputting it to the page - turning < and > characters,
for example, into &lt; and &gt; - it's impervious to the XSS attack shown
above:
<html>
  <body>
    <form runat="server">
      <asp:TextBox ID="Input" RunAt="server" />
      <asp:Button Text="Click Me" OnClick="OnSubmit" RunAt="server" />
      <asp:Label ID="Output" RunAt="server" />
    </form>
  </body>
</html>

<script language="C#" runat="server">
  void OnSubmit (Object sender, EventArgs e)
  {
      Output.Text = "Hello, " + Server.HtmlEncode (Input.Text);
  }
</script>
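The effect of the encoding step can be checked outside a Web page. The console sketch below is an illustration only: BuildGreeting is a hypothetical helper, and it uses System.Net.WebUtility.HtmlEncode (available in .NET Framework 4.0 and later) as a stand-in for Server.HtmlEncode, which requires a running ASP.NET request. The two may differ in minor details, such as how quote characters are handled in older versions.

```csharp
using System;
using System.Net;

static class EncodeDemo
{
    // Hypothetical helper: mirrors "Hello, " + Server.HtmlEncode (Input.Text),
    // with WebUtility.HtmlEncode standing in for Server.HtmlEncode.
    public static string BuildGreeting(string input)
    {
        return "Hello, " + WebUtility.HtmlEncode(input);
    }

    static void Main()
    {
        // The angle brackets are replaced with &lt; and &gt; entities,
        // so no <script> element reaches the browser.
        Console.WriteLine(BuildGreeting("<script>alert('Gotcha')</script>"));

        // Ampersands are encoded too, so user text cannot form entities.
        Console.WriteLine(BuildGreeting("Tom & Jerry"));
    }
}
```

Because the encoded text contains no literal < or > characters, the browser displays what the user typed instead of executing it.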
Remember: user input is inherently evil and should be treated
as such. Server.HtmlEncode turns bad user input into good and might just
prevent your Web server from being hacked - or worse.