This is the fifth or sixth site I’ve had to clean up in the past year or so, and its always a painful job – I’m pretty good at spotting code that shouldn’t be in a page, but with a large website it can be hard to be certain that it has been completely fixed. And there’s no guarantee that the original loophole that was exploited has been removed. Even under the best of circumstances, cleaning up this sort of mess is a painstaking process.
The following is intended for web designers who aren’t coders – but who use scripts that they have located on the web. Some intro level programmers might benefit. Experienced web programmers should go directly to the following link and do some review: http://cwe.mitre.org/top25/
1. Be very careful about downloading “free” scripts off the web. Do yourself a favour and scan the code before using it. If it has been obfuscated, or it looks odd, you probably want to avoid using it. You don’t need to be a programmer to get a feel for nefarious code.
3. Periodically review old websites that you’ve done. Code that used to be fine may no longer be so safe. Also, as you learn from mistakes, you may notice all kinds of things that are dangerous in your code.
4. Its also really worthwhile to look at the Top 25 Dangerous Bugs list, linked above. A periodic review is in order. Speaking of which, I’m adding that to my to do list.
5. Verify ALL inputs to a script. If you think you have verified them, get somebody with a cynical bent to test it. If something is up on the web, it is guaranteed that somebody will try some oddball and highly unexpected inputs just to see if they use your script for their own purposes.
6. Remember at the end of the day that there’s absolutely no such thing as a hacker-proof piece of software or hardware. Make regular backups. Assume you’re going to need them.
I just want to finish with an anecdote.
I used to operate a small hosting company along with some of my other duties at my former company.
One day, one of our servers started broadcasting vast volumes of spam email, to the point that we had to shut down the outgoing email service.
I spent a few hours reading log files, trying to pinpoint what exactly was happening. I finally narrowed it down to a script that had been uploaded a few days prior on one of the client’s accounts.
The script was basically a feeble attempt to try and implement a CMS (content management system). Basically the way it worked was that any GET input to the main script was assumed to be the name of an html fragment file, and was included into the script with no verification whatsoever.
If this means nothing to you, you’ve probably seen websites that have URLs something along these lines: index.php?id=123. The “id=123” part can be parsed out by the script as an input. In this case the links looked like this: index.php?page=contact.html.
The script just assumed that contact.html was a piece of HTML code, and included it in.
It didn’t take long before half the hackers in the world were sending the script stuff like this: index.php?page=path_to_malware_or_spam_script. And our server was running those bits of malware as if they were located locally.