Category Archives: Articles

Longer articles that I have written, mostly on business-related topics.

Why is productivity growth stalling?

There’s been a flurry of press about stalling productivity growth in the West over the past few years. The usual explanations from economists tend to revolve around low levels of capital investment, poor measurement of certain new forms of innovation, or simply stalling levels of innovation.

I’d like to point out a few more possibilities that have received less coverage. The reality is likely some combination of many of these factors. Continue reading

The Millennial Warning Problem

I was thinking about the old problem of how to warn people not to dig open a nuclear waste repository that may be unsafe for an extremely extended period of time. There’s an article on Slate from 2014 here. The problem has been discussed for years though. I remember reading about it when I was a kid.

International radiation warning symbol.

In the past, governments have tried crowd-sourcing a solution for a warning sign that will still be understood thousands, or tens of thousands of years in the future.

The problem specifically is that a sign, whether it consists of iconography or text, may not be understandable even after a few generations.

I can’t think of a specific example right now, but I’ve encountered iconography from less than a hundred years ago that I had to look up. That obviously wouldn’t do for a sign warning of imminent danger. There are worse things than accidentally entering the wrong washroom, after all.

What if, instead of crowd-sourcing the solution, we instead extended the resolution out over time? Time-sourcing it, if you will. Continue reading

Is BitCoin Arbitrage Feasible?

BitCoin has been in the news lately with its rapid rise in exchange value, its huge fluctuations in intra-day value, and the susceptibility of services using it to hacking attacks.

It should be obvious to any observer that a position (in the investment sense, not an opinion) in BTC is speculative in nature, and carries any number of risks that are hard to evaluate.

There may be a way for investors to make money on BTC through arbitrage though – with relatively well-defined and calculable risks. Continue reading

A better display technology?

Display technologies tend to involve trade-offs based on their specific application: contrast, brightness, power consumption, refresh speed.

A technology that is perfectly suited for a television, which is typically plugged into the wall, would work poorly for a mobile device that has limited battery life. Continue reading

Heavy Traffic – Lessons Learned

In the past 15 or 16 years, I’ve worked on a number of websites with fairly significant traffic (mostly measured in unique daily visitors – there are many ways to measure traffic). In one specific case, the traffic on a well-known author’s website spiked significantly (several thousand unique visitors per day) after his appearance on a television talk show. The website, although database driven, primarily consisted of articles, along with a store – and even on shared hosting, this wasn’t a problem.

Recently, my company built an online “live auction” website for a customer, a project which posed a number of interesting challenges and learning experiences (the hard way, of course) regarding how to build a site that handles heavy traffic. In this case, the nature of the website requires that all users see information that is current and accurate – resulting in a need for AJAX calls that run repeatedly, once per second for each user. This is the first project I have worked on that required serious optimization work; typically even the heaviest custom development my team does is focused primarily on business use cases rather than on speed or algorithm design. Not so here.

The “coming soon” page, long before the site was launched, already received several hundred unique visitors per day (based on Google Analytics). The site launched with more than 500 registered users (pre-registration via the coming soon page), and traffic spiked heavily following launch. The initial traffic spike actually forced the site to close for several days, in order for our team to rework code. The re-launch was preceded by several Beta tests that involved registered users. Bear in mind that a registered user on most sites isn’t individually responsible for much server load. On this particular site, each user is receiving at least one update per second, each of which may involve multiple database calls.

The following is a description of some of the issues we encountered, and how they were addressed or mitigated. In some cases, work is ongoing, in order to adapt to continued growth. In many cases, the challenges that we encountered forced me to revise some assumptions I had held about how to approach traffic. Hopefully the following lessons will save a few people the weeks of sleep deprivation that I went through in order to learn them.

Project Description:

  • Penny Auction website
  • Technology: PHP (Zend Framework), Javascript
  • Server: Various VPS packages (so far)
  • Description of traffic: All users receive one data update per second; there are additional data updates every 3 seconds, and once per minute.

1. Don’t Rely Too Much On Your Server

Many web developers write code that simply assumes the server will work properly. The problem is that under heavy load, it isn’t at all uncommon for servers not to behave as expected. Examples include file resources dropping, database calls failing (sometimes without intelligible error codes), and even system time being unreliable. Here are a couple of specific examples we encountered:

a) PHP time() – When developing in PHP, it is very common to rely on function calls such as time() (which returns the system time as a UNIX timestamp) for algorithms to work properly. Our setup involved a VPS with multiple CPUs dedicated to our use, and the ability to “burst” to more CPUs as needed. As it turned out, whenever our server went into burst mode, the additional CPUs reported different system times than “our” CPUs did. This is probably an issue with the underlying VPS software, but we didn’t have the luxury of investigating fully. The result was that rows were frequently (as in: about one quarter of the time) saved to the database in the wrong order – a serious issue for an auction website! When possible, generate the timestamp within the SQL itself (e.g. MySQL’s NOW() function) instead. Fixing the system time on the other VPS partitions wasn’t feasible, since they “belonged” to a different customer.
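To make the point concrete, here is a minimal sketch of the difference; it assumes a PDO connection and a hypothetical bids table, neither of which is from the actual project:

```php
<?php
// Minimal sketch: assumes a PDO connection in $pdo and a hypothetical
// "bids" table; names are illustrative, not from the actual project.

// Fragile under our burst setup: time() reads the web server's clock,
// which differed between CPUs.
// $stmt = $pdo->prepare("INSERT INTO bids (auction_id, user_id, created_at)
//                        VALUES (?, ?, FROM_UNIXTIME(?))");
// $stmt->execute(array($auctionId, $userId, time()));

// Safer: the single database server assigns the timestamp, so row order
// stays consistent no matter which CPU handled the request.
$stmt = $pdo->prepare(
    "INSERT INTO bids (auction_id, user_id, created_at)
     VALUES (?, ?, NOW())"
);
$stmt->execute(array($auctionId, $userId));
```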

b) Not every database call will succeed. Under heavy load, it isn’t at all unusual for a SQL insert or update statement to be dropped. Unless your code checks for errors and handles retries properly, your site will not work.
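A minimal retry sketch follows; it assumes PDO with exceptions enabled, and the attempt counts and back-off times are illustrative rather than tuned values from the project:

```php
<?php
// Minimal retry sketch. Assumes a PDO connection configured with
// PDO::ERRMODE_EXCEPTION; retry counts and back-off are illustrative.
function executeWithRetry(PDO $pdo, $sql, array $params, $maxAttempts = 3)
{
    for ($attempt = 1; $attempt <= $maxAttempts; $attempt++) {
        try {
            $stmt = $pdo->prepare($sql);
            if ($stmt->execute($params)) {
                return true; // statement went through
            }
        } catch (PDOException $e) {
            // Under heavy load a dropped statement is often transient:
            // log it and fall through to another attempt.
            error_log("Attempt $attempt failed: " . $e->getMessage());
        }
        usleep(100000 * $attempt); // brief, increasing back-off (0.1s, 0.2s, ...)
    }
    return false; // caller must handle the persistent failure
}

// Usage: executeWithRetry($pdo, "UPDATE auctions SET ... WHERE id = ?", array($id));
```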

2. Pick Your Hosting Company Wisely

We launched the project on one of our hosting company’s smaller VPS packages. We quickly went to one of the middle-range packages, discovered it was also insufficient, and then switched to the largest package that they offer.

In the process, we also entered a number of second tier or higher tickets into their system, including serious operating system level problems.

Luckily, we chose a hosting company that responds quickly to issues, and whose staff are familiar with the types of issues we encountered.

This isn’t something to take for granted. Not every hosting company has the ability to quickly and seamlessly transition a site through different packages on different servers, nor do they necessarily have tier 3 support staff who can address unusual support requests.

In this case, our conversations with the company seem to indicate that they have never seen a new site with this level of load in the past; they still worked valiantly to assist us in keeping things running.

3. Shared Hosting, VPS, Dedicated, Cloud Hosting?

In our previous experience, when a hosting company sells somebody a dedicated server, the assumption is that the customer knows what they are doing, and can handle most issues themselves. This holds even where an SLA (service level agreement) is in place, and can seriously affect response time for trouble tickets.

As a result, our first inclination was to use a VPS service. Our decision was further supported by the level of backup provided by default with VPS packages at our chosen vendor. A similar backup service on a dedicated server of equivalent specifications appeared to be much more expensive.

One of the larger competitors of our customer’s site currently runs under a cloud hosting system. We are continuing to look at a variety of “grid” and cloud hosting options; the main issue is that it is extremely hard to estimate the monthly costs involved in cloud hosting, without having a good handle on how much traffic a site will receive. It isn’t unusual for hosting costs to scale in such a way as to make an otherwise profitable site lose money. That said, we will likely have to transition over to a cloud hosting service of some kind at some point in time.

4. Database Keys Are Your Friend

At one point, we reduced server load from over 100% down to around 20% by adding three keys (indexes) to the database. This is easy for many web developers to overlook (yes, I know – serious “desktop” application developers are used to thinking about this stuff).
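As a hedged illustration of the shape of the fix (the table and column names are hypothetical, not our schema):

```php
<?php
// Illustrative only: table and column names are hypothetical.
// A query like this, fired once per second per user, will scan the whole
// table if MySQL has no suitable key:
//
//   SELECT * FROM bids WHERE auction_id = ? ORDER BY created_at DESC;
//
// A composite key covering both the filter and the sort fixes that:
$pdo->exec(
    "ALTER TABLE bids ADD INDEX idx_auction_created (auction_id, created_at)"
);

// Run EXPLAIN on the query before and after to confirm the key is used:
//   EXPLAIN SELECT * FROM bids WHERE auction_id = 1 ORDER BY created_at DESC;
```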

5. Zend Framework is Good For Business Logic – But It Isn’t Fast

We initially built the entire code base using Zend Framework 1.10. Using Zend helped build the site in a lot less time than it would otherwise have taken, and it also allows for an extremely maintainable and robust code base. It isn’t particularly fast, however, since there’s significant overhead involved in everything it does.

After some experimentation, we removed any code that supported AJAX calls from Zend, and placed it into a set of “gateway” scripts that were optimized for speed. By building most of the application in Zend, and moving specific pieces of code that need to run quickly out of it, we found a compromise that appears to work – for now.
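The gist of the gateway approach, as a sketch (connection details and schema are illustrative, not our actual code):

```php
<?php
// gateway/auction_status.php – a sketch of the "gateway" idea: answer the
// once-per-second AJAX poll with a bare script instead of dispatching
// through Zend's full MVC stack. Connection details and schema are
// illustrative, not our actual code.
header('Content-Type: application/json');

$pdo = new PDO('mysql:host=localhost;dbname=auction', 'user', 'secret');

// Validate the input before touching the database.
$auctionId = (isset($_GET['id']) && ctype_digit($_GET['id']))
    ? (int) $_GET['id'] : 0;

$stmt = $pdo->prepare(
    'SELECT current_price, status, UNIX_TIMESTAMP(end_time) AS ends_at
       FROM auctions WHERE id = ?'
);
$stmt->execute(array($auctionId));

echo json_encode($stmt->fetch(PDO::FETCH_ASSOC));
```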

The next step appears to be to build some kind of compiled daemon to handle requests that need speed.

6. Javascript

Our mandate was to support several of the more common browsers currently in use (mid-2010), including Firefox, IE7-9, Opera, and – if feasible – Safari.

The site is extremely Javascript-intensive, although the scripting itself isn’t particularly complex.

We used jQuery as the basis for much of the coding, and then created custom code on top of it. Using a library – while not a magic solution in itself – makes cross-browser support much, much easier. We’re not particularly picky about specific libraries, but we have used jQuery on a number of projects over the past couple of years, with generally good results.

Specific issues encountered included IE’s tendency to cache AJAX posts, which we resolved by tacking a randomized variable onto resource URLs; this, unfortunately, doesn’t “play nice” with Google Page Speed (see below).
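A complementary server-side mitigation (a sketch of a standard technique, not necessarily the exact fix we shipped) is to have the gateway script forbid caching outright via response headers:

```php
<?php
// Sketch of a standard server-side mitigation: tell browsers, IE included,
// never to cache the AJAX response, so a stale copy is never replayed.
header('Cache-Control: no-cache, no-store, must-revalidate');
header('Pragma: no-cache');                       // HTTP/1.0 clients and proxies
header('Expires: Thu, 01 Jan 1970 00:00:00 GMT'); // a date firmly in the past
header('Content-Type: application/json');

$response = array('price' => 1.23, 'status' => 'running'); // placeholder payload
echo json_encode($response);
```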

We also had a serious issue with scripts that perform animated transitions, which caused excessive client-side load (and thus poor perceived responsiveness), in addition to intermittently triggering Javascript errors in IE.

Javascript debugging in IE isn’t easy at the best of times, and is made more complex by our use of minify (see below) to compress script size. One tool that occasionally helped was Firebug Lite, which essentially simulates Firefox’s Firebug plugin in other browsers (but which can also sometimes change the behaviour of the scripts being debugged). The underlying issue is that IE does a poor job of pointing coders to exactly where a script crashed, and its error messages tend to be unhelpful. The debugging method in IE basically boils down to: a) downloading a copy of the minified resource in the form the browser sees it, b) using an editor with good row/column reporting (I often use Notepad++) to track down roughly where the error occurs, and c) inserting debug statements to try to isolate the problem. After working with Firebug for a while, this is an unpleasant chore.

7. Testing Server

Long before your site launches, set up a separate testing server with as close to a duplicate of the live environment as possible. Keep the code current (we usually try to use SVN along with some batch scripts to allow quick updating), and test EVERY change on the test site before pushing the code over to the live server. Simple, but frequently overlooked (I’m personally guilty on occasion).

8. CSS

Designers and web developers often think of CSS purely in terms of cross-browser compatibility. That sites should actually work in the major browsers goes without saying – in my experience, CSS issues can lead to a lot of customer support calls (“help, the button is missing”) that could easily be avoided. In the case of this specific project, we actually had to remove or partially degrade some CSS-related features in order to provide a more uniform experience across browsers. Attempting to simulate CSS3 functionality using Javascript is not a solution for a heavy-traffic, speed-intensive site; we tried this, and in many cases had to remove the code due to poor performance.

An often overlooked CSS issue (which Google and Yahoo have started plugging – see below) is render speed. Browsers have to match every CSS selector against the document tree, and inefficiently specified selectors can have a significant effect on the apparent page load time for users. It is well worth spending some time with Google Page Speed (or Yahoo’s YSlow) to optimize the CSS on your site for speed.

9. Why Caching Doesn’t Always Work

Caching technology can be a very useful way to obtain additional performance. Unfortunately, it isn’t a magic bullet, and in some cases (ours specifically) it can not only hurt performance, but actually make a site unreliable.

High traffic websites tend to fall into one of two categories:

On the one hand, there are sites such as Facebook, whose business model is largely based on advertising; what this means is that if user data isn’t completely, totally current and accurate, it is at most an annoyance (“where’s that photo I just uploaded?”). Facebook famously uses a modified version of memcached to handle much of their data, and this kind of caching is probably the only way they can (profitably) serve half a billion customers.

On the other hand, financial websites (think of your bank’s online portal, or a stock trading site) have business models that pertain directly to the user’s pocketbook. This means that – no matter how many users there are, or how large the volume of data – the information shown on the screen has to be both accurate and timely. You would not want to log in to your bank’s site and see an inaccurate account balance, right? In many cases, sites of this nature use a very different type of architecture from “social media” sites. Some banks actually run their websites on supercomputers to accommodate this.

Underlying this dichotomy is the fundamental notion of what caching is all about: “write infrequently, read often”. Caches work best in situations where there are far fewer updates to the data than views of it.

The initial version of our code actually implemented memcached, in an attempt to reduce the number of (relatively expensive) database calls. The problem is that our underlying data changes so rapidly – many times per second, for a relatively small number of resources that are being actively viewed and changed by many users – that the cache had to be rewritten almost constantly. In practice, some users were seeing out-of-date cached data at least some of the time. Abandoning caching in our specific case resolved these issues.
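For readers unfamiliar with the pattern, this is roughly what read-through caching looks like; the Memcached client calls are standard, while the key name, TTL, and loader function are illustrative. The final comment marks exactly where it broke down for us:

```php
<?php
// Read-through caching sketch using the standard Memcached extension.
// Key name, TTL, and the loader function are illustrative.
function loadAuctionStatusFromDb($auctionId)
{
    // Stand-in for the real (relatively expensive) database query.
    return array('price' => 1.23, 'status' => 'running');
}

$mc = new Memcached();
$mc->addServer('127.0.0.1', 11211);

$auctionId = 123; // example
$key = 'auction_status_' . $auctionId;

$status = $mc->get($key);
if ($status === false) {                           // cache miss: go to the database
    $status = loadAuctionStatusFromDb($auctionId);
    $mc->set($key, $status, 1);                    // cache for one second
}

// The catch: when the data changes many times per second, even a 1-second
// TTL guarantees that some users are served stale prices during that
// window, which is exactly the failure mode described above.
```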

10. Speed Optimization

We used Google Page Speed to optimize our project. (Yahoo offers a similar competing tool, YSlow.) These tools provide a wealth of information about how to make websites load faster – in many cases significantly faster.

Among the many changes that we made to the site, based on the information from the tester, were the following:

a) Use minify to combine and compress Javascript and CSS files. No kidding – this works. Not only that, but if you load a large number of CSS files on each page, you can run into odd (and very hard to trace) problems in IE, which appears to be able to handle only around 30 external CSS files per page. Compressing and combining these files using minify and/or the YUI Compressor can save you more than bandwidth.
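To show the idea at its simplest (file names are illustrative, and a real deployment should use an actual tool such as Minify or the YUI Compressor, plus far-future Expires headers), combining and gzipping can be done in a few lines of PHP:

```php
<?php
// Bare-bones illustration of what minify does: serve several stylesheets
// as one gzipped response, turning three HTTP requests into one.
// File names are illustrative.
ob_start('ob_gzhandler');         // gzip the output when the client accepts it
header('Content-Type: text/css');

foreach (array('reset.css', 'layout.css', 'widgets.css') as $file) {
    readfile(__DIR__ . '/css/' . $file);
    echo "\n";                    // keep rule boundaries intact between files
}
```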

b) Use sprites to combine images into larger files. This does not work well in some cases (e.g. certain kinds of buttons), but the technique can save precious seconds of load time. We used a Firefox plugin called SpriteMe to automate this task, although we didn’t follow all of its suggestions.

c) Validate your HTML. Again, a “no brainer”. The load time saved by having valid HTML will surprise many readers. The validation process is a nuisance, particularly if your site serves up dynamic, user-contributed content. Set aside a few hours and just do it, though. It makes a difference.

11. Don’t Forget Algorithms 101

I took several courses on algorithm design at university, and then did nothing with that knowledge for more than a decade. Surprise, surprise – a complex, multi-user site actually needs proper thought in this regard.

One example from our experience: the data tracking the status of an auction (i.e. whether it is currently running, paused, won, etc.) can be “touched” by 9 different pieces of code on the site, including the “gateway” code that responds to users, and background tasks.

It took significant effort to build a reliable algorithm for determining when an auction has actually ended. The task was complicated by the fact that some of the code runs relatively slowly, and it is quite possible for another operation to attempt to modify the underlying data while the first task is still running. Furthermore, “locking” in this case could have negative ramifications for the user experience, since we did not want to unduly reject or delay incoming “bids” from users.
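The pattern we converged on can be sketched as a single guarded UPDATE (the schema here is illustrative): whichever code path runs it first wins the race, without holding locks that would delay bids.

```php
<?php
// Sketch of an atomic "end the auction" step; table and column names are
// illustrative, and $pdo / $auctionId come from the surrounding code.
// The WHERE clause is the guard: no matter how many of the 9 code paths
// race to end the auction, only one UPDATE can match a row that is still
// 'running', so exactly one path wins, with no explicit lock held while
// bids are coming in.
$stmt = $pdo->prepare(
    "UPDATE auctions
        SET status = 'won', winner_id = last_bidder_id
      WHERE id = ?
        AND status = 'running'
        AND end_time <= NOW()"
);
$stmt->execute(array($auctionId));

if ($stmt->rowCount() === 1) {
    // This process ended the auction: safe to notify the winner, etc.
} else {
    // Another code path got there first, or the auction is still live.
}
```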

Conclusions

  1. It is very hard to plan ahead of time for growth in a web environment. Sometimes steps taken specifically to address traffic (e.g. caching, in our case) can actually be detrimental. Adapting to growth can involve a surprising amount of trial-and-error experimentation.
  2. Using frameworks can be very helpful for writing maintainable code. Unfortunately, it’s sometimes necessary to work around them when specific optimization is needed. Proper documentation and comments help – I try to write as if I’m explaining to somebody really dumb, years in the future, what is going on in my code – and then I’m often surprised when I need my own comments later on…
  3. Work with the right people. Not just your internal team, but also your hosting company etc. This can make a big difference when you are under pressure.
  4. Prepare yourself for periods of high stress. There isn’t much you can do about this, unfortunately. In most cases, it is unlikely that you will actually have access to the resources you really need to get the job done. Make sure you schedule breaks too. It’s hard. Burnout is much harder though.

Business Lessons From Farmville – Part 3

Continued from Part 2 – http://lichtman.ca/articles/business-lessons-from-farmville-%E2%80%93-part-2.  In Part 2 we discussed the publicity and communal aspects of building a successful viral application, with examples from Zynga’s Farmville game.

3. Manage Scalability

Like all rapidly growing applications, Farmville has scalability issues – as of this writing, they are working on a serious issue pertaining to how the application loads.

Any rapidly growing application has a tendency to run up against physical limits to both the design of the application and also the underlying hardware on which it is running. The result is that at each order of magnitude of growth in traffic, significant work may be required in order to redesign the application. In the meantime, it isn’t uncommon for these kinds of apps to have outages, extreme slowness, or bizarre and unscripted behaviour.

What Zynga has done is deal appropriately with user expectations: first by explaining what the situation is, and second by compensating users (in this case with free game items) for the irritation. It is surprising how many companies will try to hide or deny outages caused by scalability issues.

4. Make Money

Making money out of a busy website or application is harder than it may seem.

The “throw something up on the web and put ads on it” model hasn’t worked well in a number of years (yes, I also know a few people still successfully doing this – just try duplicating it from scratch now).

This means that a company that is successfully making money online is worth analyzing in more detail. I don’t know how much Zynga is making (they’re a private company), but from what I’ve read they’re profitable.

Say what you want about the morality of selling virtual items via micropayments, the model appears to work well – at least for the most addictive of online games.

Farmville has two types of internal “currency” for purchasing items in the game:

One type of currency is easily earned within the game, and can be used to purchase the most common items.

The “premium” currency – which is usually obtained through a micropayment (although small quantities of it now circulate within the game as well) – can be used to purchase a variety of premium items that either make game play easier, or convey some form of status.

This model allows people to play the game without having to buy anything, while allowing the most enthusiastic players to essentially subsidize everyone else.

This “freemium” business model works at a similar psychological level to the old “shareware” software licenses – people can use it for free, but also feel like there is some level of obligation to pay, based on their own perceived value of the app.

5. Multiply

On a fickle internet, success may be fleeting.

If a website attracts significant traffic – and money – its owners need to immediately start planning the next step.

It appears that Zynga realizes that there may be inherent limits to how big a specific web game can get, and that their approach also includes the notion of continually duplicating their success with new games – particularly using cross-pollination approaches such as using familiar characters.

Duplicating a successful methodology over and over can be a useful means of sustaining viral success – particularly if one is skilled at redirecting existing “customers” to the next big thing. By building all of their games around a standard platform – in this case Facebook – the development time needed to create new apps is reduced. There is also a consistent user interface between the apps, allowing for faster user adoption.

If you play Farmville, you will quickly notice that Zynga has posted advertising at the top of each page for their other games. The objective is to “cross-pollinate” the user list from their various games (my understanding is that they are now running a large number of games on Facebook and other platforms).

One trick – which they don’t seem to be using right now – would be to allow users to utilize the same game currency across all of the games on their platform. I’ve seen this used in practice elsewhere.

Two last things that they appear to be doing well: a) Zynga developed certain iconic characters for Farmville (i.e. the cartoon animals), and they have used these animals (and similar characters) in other apps that are marketed directly to Farmville users. In addition, b) they rapidly create competing products whenever another development team builds a game that could move into their market space – and they make an effort to improve on the original. The second company into a market can do very nicely indeed – just think of the Betamax vs. VHS wars.

In the final part of this article, I will wrap things up and provide some other examples from other businesses. To be continued…

Business Lessons From Farmville – Part 2

Continued from Part 1 – http://lichtman.ca/articles/business-lessons-from-farmville-part-1. In Part 1 we discussed the idea that there are business lessons that can be learned from viral games such as Zynga’s Farmville.

2. Let Everybody Know

If you’re on Facebook, you are familiar with the extent to which Farmville pesters people who aren’t already playing it. I had actually blocked the application at one point, and only logged in after reading on a mainstream press website about how it had attracted 70 million users – which proves the idea that in advertising, repeating your message ad nauseam does pay off. Eventually.

What Zynga have done with Farmville is create a system that provides an immense number of opportunities for people who are already using the game to gain by telling other people about it. In addition to bugging people who aren’t already playing, it also provides – as mentioned above – innumerable ways of reminding people who are already playing it about its existence. This can be irritating, but it is clearly an exceedingly effective methodology for growing traffic.

A small number of the methods that they use to spread the news include:

  • Constantly requesting users to post announcements to their “streams” – every time a user achieves a milestone in the game, no matter how small, Farmville asks whether they want to place a post on their stream (the list of updates shared between “friends” on Facebook). These announcements are essentially sales referrals – if somebody not playing the game sees hundreds of such announcements from their friends, it may well pique their interest.
  • Many posts from Farmville contain image “snapshots” of what a user’s farm looks like. These follow the notion of “show me, don’t tell me” – a picture is worth a whole lot of verbiage.

3. Build a Community

As mentioned briefly in Part 1, the notion of a community is very powerful in social networking applications. If the people that you are friendly with are all involved in a particular community, not only are you more likely to join, but you’re also much less likely to leave. Real world examples include religious institutions, multi-level marketing organizations, social clubs, charities, etc. Many such organizations fulfill a social role in addition to any other role they may play, and for their participants this can be a powerful motivator.

The vast majority of online social applications pay lip service to the communal role – but in actuality they provide little incentive (or supportive functionality for that matter) for people to actually interact with each other.

One of the key reasons why Farmville has been so successful is that the communal aspect has been so well thought out – not only are there endless ways for people to interact in the game, it is difficult to progress without doing so. A few examples (and there are probably dozens of others) follow:

  • Neighbours – the game plays up the folksy notion of farmers chatting over a picket fence. Members can add other players as virtual neighbours in the game, and thereafter the game visually renders the neighbouring farms next to the player’s farm. Players are encouraged to visit their neighbours’ farms, and to participate in building those farms up via simple tasks (fertilizing their crops), for which they score points.
  • Many items in the game cannot be purchased from the “market” directly – they can only be given as gifts. Players are encouraged to give such gifts to their neighbours, and the receipt of such items triggers a polite request to send something back.
  • An interesting recent feature – barn raising. A player wishing to build a “barn” in which to store items can pay for the barn directly – or get it for free if ten of their neighbours are willing to help. The process involves a large amount of voluntary messaging being posted to streams – and to people’s in-boxes.

[To be continued in Part 3…]

Business Lessons From Farmville – Part 1

Day 312/365 - 8 Nov - FarmVille
Image by anshu_si via Flickr

I’m not generally recommending that you drop everything and play Farmville, but there are some interesting business lessons to be learned from the game. Possibly this blog entry will save you massive amounts of time – i.e. you can simply read on, rather than playing.

If you’re on Facebook, you’ve probably already been pestered by notifications from a game called Farmville, created by a company called Zynga. Possibly you’re already playing the game yourself. Over the past few months, the number of people playing the game has exceeded 70 million – for comparison’s sake, this is roughly the same as the total number of people using Twitter. Clearly they’re doing something interesting.

Generally speaking, Farmville falls under the category of “viral applications”. A viral app is one that seems to spread uncontrollably – just like a cold or flu bug does.

The keys to “virality” are well documented elsewhere (just visit your local library or favourite guerrilla marketing blog):

  • Create something that is going to keep people interested (what’s usually called “stickiness”).
  • Make sure that using it works even better if the user tells their friends about it.
  • Seed the application, website, or whatever with an initial set of users – probably friends of the owner.
  • Watch it grow.
  • Deal with scalability issues.
  • Figure out how to make money (!).
  • Reproduce / duplicate the effect elsewhere.

Very few applications make it to the final step, and there is definitely at least some element of luck involved – I’ve seen some great ideas fall flat for no apparent reason.

What Zynga has done though is a very interesting example of a successful viral application, and there are a number of attributes that can be used elsewhere – not necessarily for games either.

1. Keep People Interested

The notion of keeping people constantly interested in an application is very helpful in building a virally marketed website or game. The longer a person’s attention is on something, the more opportunities the makers have to get them to tell other people about it – and the more likely they are to buy something. Keeping people interested is not conceptually hard, but it can be difficult to implement in practice; I’ve seen a great many websites fall short in this regard. “Viral” without “sticky” often equals “flop”.

Zynga have done a few interesting things with regard to holding people’s attention. Some of them are standard plays from the game builder’s playbook, and thus aren’t transferable to all products or services.

Some of the tactics include constantly changing items, seasonal differences in the appearance of the game, new functionality as a user progresses in level, and randomization (things like animals moving around on their own) – all things common to many successful online games. Maintaining a stream of new activity is actually quite difficult to carry out, as I’ve discovered in the past while working on other games. There’s a certain level of perseverance involved, along with rallying the developers (most of whom are probably feeling burned out at this point – again, past experience) and keeping the creative juices flowing.

Other interest-enhancing features include their gift exchange system – I’ll talk more about this in Part 2 – which a) ensures that certain things can only be accomplished with the help of friends, and b) provides a stream of requests to players’ inboxes to entice them to come back repeatedly.

Two last things of note:

By building a community, where players cooperate in longer term development with each other, Farmville makes it less likely that somebody will drop out. Community formation is a powerful tool to keep people coming back over and over.

Farmville also relies on people’s nostalgia for “the simple life” – not that farming is particularly simple in actuality.  The nostalgia factor can be a powerful tool for marketing to particular market segments.

[To Be Continued…]


My first experiences with BuddyPress (open source social platform)

Nathan Bomshteyn discusses his experiences installing and configuring BuddyPress, a social media platform that installs on top of WordPress MU.

Continue reading

How Not To Get Hacked

Image courtesy of "gutter" on Flickr. Creative Commons.

I just spent a chunk of this afternoon fixing up a friend’s website, which had been hacked. The hacker appears to have gained access through a decade-old shopping cart (not in use, just sitting in a folder on the site), and then proceeded to insert obfuscated javascript code into every page on the site (several hundred pages, with the code slightly different on each page).

This is the fifth or sixth site I’ve had to clean up in the past year or so, and it’s always a painful job – I’m pretty good at spotting code that shouldn’t be in a page, but with a large website it can be hard to be certain that everything has been fixed. And there’s no guarantee that the original loophole that was exploited has been removed. Even under the best of circumstances, cleaning up this sort of mess is a painstaking process.

The following is intended for web designers who aren’t coders – but who use scripts that they have located on the web. Some intro level programmers might benefit. Experienced web programmers should go directly to the following link and do some review: http://cwe.mitre.org/top25/

1. Be very careful about downloading “free” scripts off the web. Do yourself a favour and scan the code before using it. If it has been obfuscated, or it looks odd, you probably want to avoid using it. You don’t need to be a programmer to get a feel for nefarious code.

2. When putting together a website that has any kind of dynamic functionality – be it javascript, a php script on the back end, or something else – bear in mind Jeremy’s Addendum to Murphy’s Law: Whatever can be hacked, will be hacked. There are a lot of common loopholes that hackers exploit that could be easily avoided by looking at code with a cynical eye and trying to figure out how it can hurt you.

3. Periodically review old websites that you’ve done. Code that used to be fine may no longer be so safe. Also, as you learn from mistakes, you may notice all kinds of things that are dangerous in your code.

4. It’s also really worthwhile to look at the Top 25 Dangerous Bugs list, linked above. A periodic review is in order. Speaking of which, I’m adding that to my to-do list.

5. Verify ALL inputs to a script. If you think you have verified them, get somebody with a cynical bent to test it. If something is up on the web, it is guaranteed that somebody will try some oddball and highly unexpected inputs, just to see if they can use your script for their own purposes.
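As a minimal sketch of what that verification looks like in PHP (the parameter and table names are illustrative, and a PDO connection is assumed):

```php
<?php
// Minimal input verification sketch for the classic index.php?id=123 case.
// Parameter and table names are illustrative; assumes a PDO connection.
$id = isset($_GET['id']) ? $_GET['id'] : '';

if (!ctype_digit($id)) {           // whitelist: digits only, nothing else
    header('HTTP/1.1 400 Bad Request');
    exit('Invalid request');
}

$stmt = $pdo->prepare('SELECT title, body FROM pages WHERE id = ?');
$stmt->execute(array((int) $id)); // prepared statement: no SQL injection
```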

6. Remember at the end of the day that there’s absolutely no such thing as a hacker-proof piece of software or hardware. Make regular backups. Assume you’re going to need them.

I just want to finish with an anecdote.

I used to operate a small hosting company along with some of my other duties at my former company.

One day, one of our servers started broadcasting vast volumes of spam email, to the point that we had to shut down the outgoing email service.

I spent a few hours reading log files, trying to pinpoint what exactly was happening. I finally narrowed it down to a script that had been uploaded a few days prior on one of the client’s accounts.

The script was a feeble attempt to implement a CMS (content management system). The way it worked was that any GET input to the main script was assumed to be the name of an HTML fragment file, which was included into the script with no verification whatsoever.

If this means nothing to you, you’ve probably seen websites that have URLs something along these lines: index.php?id=123. The “id=123” part can be parsed out by the script as an input. In this case the links looked like this: index.php?page=contact.html.

The script just assumed that contact.html was a piece of HTML code, and included it.

It didn’t take long before half the hackers in the world were sending the script requests like this: index.php?page=path_to_malware_or_spam_script. And our server was running those bits of malware as if they were located locally.
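For the curious, here is roughly what that script should have done instead (a sketch; the file names are illustrative): map the page parameter onto an explicit whitelist, and never feed raw user input to include().

```php
<?php
// Sketch of the fix: an explicit whitelist, so the page parameter can
// never name an arbitrary local file or a remote URL. File names are
// illustrative.
$pages = array(
    'home'    => 'home.html',
    'contact' => 'contact.html',
    'about'   => 'about.html',
);

$page = isset($_GET['page']) ? $_GET['page'] : 'home';

if (!isset($pages[$page])) {       // anything not on the list is rejected
    header('HTTP/1.1 404 Not Found');
    exit('Page not found');
}

// Include only from a fixed local directory, never a user-supplied path.
include __DIR__ . '/fragments/' . $pages[$page];
```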