Archive for the ‘Behind the Scenes’ category

Defining Website Value

January 30th, 2010

Recently I attempted to define in as objective a way as possible how much value my online calculator is providing to its users. The idea is to come up with an equation which gives me a ‘score’ of how much value has been provided to users. With this, it will then be possible to optimise the site to provide as much value as possible. For instance, when considering two possible improvements I can use the equation to estimate the likely increase in value. That then lets me know which one to go ahead with (the one likely to be of most value to the users).

At the end of this post are the notes I made while trying to figure it out. They’re probably not of any use to anyone, but I thought I’d share them. Before that, though, I want to outline the process I went through, as hopefully this will be a bit more useful.

Step 1: Identify the main source of value you provide

The first thing I did was to figure out what user actions provide the main source of value. For me, since my site is a calculator, this was clear. Value is provided when a user enters a calculation and receives an answer. There are other potential sources of value (e.g. someone may find some value in one of my blog posts). But I want to focus on the calculations as my main value proposition. I may include other sources of value later, but for now I’ll focus on just this one.

Step 2: Identify possible outcomes for the value action

When a user takes the value-giving action (e.g. performing a calculation), one of several things can happen. Specifically, value can be delivered or not. In my case I found there were four possible outcomes when a visitor was on the main calculation page:

  • No calculation performed
  • Calculation answered correctly
  • Calculation answered incorrectly
  • Calculation not answered

I initially didn’t include the ‘No calculation performed’ outcome, but then added it in as I realised that a user incurs a time cost in visiting the calculation page and thus if they don’t perform a calculation then the value provided is actually negative (i.e. it cost them and they gained nothing).

Step 3: Identify time costs

Each of these actions incurs a time cost to the user. Like the ‘No calculation performed’ case, all four outcomes incur the cost of the user first visiting the site (though this cost is actually shared among all calculations the user performs in that visit).

The other three also incur the time cost involved in typing in the calculation. This is then followed by the time cost in waiting for a response. Hopefully all these costs should be minimal, but it would be wrong to ignore them.

By analysing web logs, etc, it should be possible to get exact values for this time cost in most cases. The only thing that can’t be exactly pinpointed is the time it takes for the user to find the site as this may include, e.g. searching on a different site such as Google.

Step 4: Identify the user-derived benefit and costs

Assuming the user takes the action required to produce the value, three things can happen. Either the user gets the value they were after, they don’t get the value, or they get a wrong answer which may actually cost them (in that they may act on incorrect information).

In the case that the right answer is provided, we know that the user has received some value. While we don’t know how much, there is one assumption we can make – it should be greater than the time cost the user expected to incur. Otherwise the user wouldn’t have taken the action (assuming rationality). We can figure out the expected time cost in a way similar to the actual time cost calculated in step 3. This just involves estimated averages across all sites (such as average site response time) rather than using data specific to my site. I also made an assumption that a user would expect a return of at least 10% greater than the time invested to make it worthwhile.

In the case that the calculator returns no answer (where I haven’t yet implemented the calculation asked), then the user gets no benefit, but also no real cost other than the time invested.

The final case is that the user gets given the wrong answer. This could potentially have a large cost to the user if they act on the incorrect information. While hard to quantify, one assumption I felt confident making is that a rational user would cap their potential losses in relation to their potential gain. For this I assumed that the user would cap losses at, at most, 10 times their potential gain.

Step 5: Write out value expressions for each outcome

We now have enough information to write out expressions for each outcome, in terms of the total actual time cost (which I called ‘alpha’ for no good reason), and the user’s gain if the answer is correct, which I called G. These came out to be:

  • No calculation performed: -alpha/2
  • Calculation answered correctly: G - alpha
  • Calculation answered incorrectly: -10*G - alpha
  • Calculation not answered: -alpha
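
As a rough illustration, the four expressions above could be written as a small Python function. This is only a sketch: the function name, the outcome labels and the loss_cap parameter are assumptions made for the example, not part of the calculator's actual code.

```python
def outcome_value(outcome, gain, alpha, loss_cap=10):
    """Value delivered to the user for a single outcome.

    gain     -- G, the user's gain when the answer is correct
    alpha    -- total actual time cost of performing a calculation
    loss_cap -- assumed multiple of G lost on a wrong answer (10 in this post)
    """
    if outcome == "no_calculation":
        return -alpha / 2              # rough estimate: half the full time cost
    if outcome == "answered_correctly":
        return gain - alpha
    if outcome == "answered_incorrectly":
        return -loss_cap * gain - alpha
    if outcome == "not_answered":
        return -alpha
    raise ValueError(f"unknown outcome: {outcome}")
```

With hypothetical values of G = 20 and alpha = 15, a correct answer is worth 5, an unanswered calculation -15, and an incorrect answer -215, which shows how heavily wrong answers weigh on the total.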

Step 6: Derive a lower-bound on each expression solely in terms of the gain

We know alpha is the actual time cost the user incurs. This is related to a lower-bound on the gain, G, by the user’s expected time cost (call it alpha_e). With a bit of work (see the notes below for details), we can rewrite the value expressions so that they are expressed in terms of the lower-bound on G. So essentially, we have value expressions written solely in terms of (a lower-bound on) G.

Without getting into too much detail, we can basically total up the value provided to a user when they perform a set of calculations. This will give us a value score for that set in terms of G. While we don’t know what G actually is, it doesn’t matter – as we will be using the value calculation simply to compare one set of calculations to another (separated in time, by user, by calculator version, etc.) we can just drop G entirely.

Conclusion

Now we have an expression for the lower-bound on the value provided by any set of calculations and resulting outcomes. While not perfect (an exact score would be better than a lower-bound), I think it might be the best that can be done without figuring out what users are actually doing with their calculation results (and I think that’s their own business). It will still be very useful to be able to compare the lower-bounds as these are likely to be reasonably well correlated with the actual values. Also, as the aim is to increase value over time, we can be fairly confident that this is occurring if we can see that the lower-bound on the value is increasing over time.

My Rambling Notes (or Stop Here and Don’t Read On)

The main aim is to provide value to users. In order to do so with the greatest impact / least effort, it is necessary to optimise the value provided. In particular, potential projects should be prioritised based on the value they are expected to deliver. However, value is one of those difficult to define things, which makes an objective assessment of the value delivered by projects difficult. Nonetheless, I think it is worth trying to get as good an approximation as possible.

For the time being I’m restricting the value calculation to the core calculation service. There are four basic outcomes which can occur when someone interacts with the main calculator page:

  • No calculation performed
  • Calculation answered correctly
  • Calculation answered incorrectly
  • Calculation not answered

If it is possible to assign a reasonable measure of user value to each item, then it will be possible to combine them into a reasonable total measure of the value provided by the calculator in a given time period, per user, etc.

In general, there will be a fixed time cost involved in performing any calculation (regardless of the result). For the time being, call this alpha. The cost when no calculation is performed will be somewhat less than this. At a rough estimate it will be alpha / 2.

Should a calculation be performed correctly there will be some gain G for the user.

If a calculation is not answered, then this gain G will not be realised. However, should the calculation be incorrectly answered, not only will the gain not be realised, but the user may be significantly affected (e.g. by entering into a bad financial transaction based on the incorrect result). While we can’t say exactly what magnitude of loss this might be, it seems reasonable to expect that any rational user would cap their potential losses at 10 times the possible gain. At the very least, at risks beyond that level, they could be expected to verify the calculation results with other sources. So the maximum loss can, I think, be taken to be -10*G.

Combining these we can get value expressions for each of the possible outcomes:

  • No calculation performed: -alpha/2
  • Calculation answered correctly: G - alpha
  • Calculation answered incorrectly: -10*G - alpha
  • Calculation not answered: -alpha

This will be more useful if we can express it in a single quantity. To do so we need to look at the relationship between alpha and G.

There is a quantity related to alpha which will be useful to us. This is the time cost the user expects to pay, assuming a reasonably responsive system. Depending on the performance of our system, alpha may be greater or smaller than this. We’ll call this quantity alpha_e.

A rational user would not carry out actions which did not at least repay the expected initial investment plus a bit of ‘profit’ to make it worthwhile and absorb associated risks. Thus we can expect G to have a lower-bound of perhaps 1.1 * alpha_e.

That’s a bit better, but still not quite there. We can get the rest of the way by looking at what exactly alpha and alpha_e are. These, as noted above, are the time cost of performing a calculation and the expected time cost of performing one on an averagely responsive system respectively. They have the same formula:

time_to_input_calc + time_to_perform_calc + time_to_reach_calculator / num_calcs_per_user

The only difference is that alpha_e represents the user’s expectation of these values while alpha represents the actual values.

We can assign approximate values to these for alpha_e. Something like this, perhaps:

alpha_e = time_to_input_calc_e + time_to_perform_calc_e + time_to_reach_calculator_e / num_calcs_per_user_e

= 10 + 3 + 10 / 5

= 15 seconds

This gives a lower-bound on G of 16.5 seconds.
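
The arithmetic above can be reproduced with a few lines of Python. This is a minimal sketch; the function and variable names are made up for the example, and the numbers are the rough estimates given above rather than measured data.

```python
def time_cost(time_to_input, time_to_perform, time_to_reach, calcs_per_user):
    """Per-calculation time cost in seconds.

    The cost of reaching the calculator is shared across all the
    calculations a user performs in one visit.
    """
    return time_to_input + time_to_perform + time_to_reach / calcs_per_user


alpha_e = time_cost(10, 3, 10, 5)     # expected cost, using the estimates above
gain_lower_bound = 1.1 * alpha_e      # assumed minimum return of 10% over the cost

print(alpha_e, gain_lower_bound)
```

The same function can be fed measured values from the web logs to obtain alpha, the actual time cost.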

Now, we can perform a similar calculation for alpha. I don’t know exactly, but the result will probably be something similar. Assuming for the time being that it is, we can then say that G >= 1.1 * alpha. We could easily adjust the multiplicative factor should analysis of the data show that alpha and alpha_e differ.

But is relating alpha to a lower-bound on G any use? It’s not precise, but it gives a possible approach.

Totalling up the value for a set of calculations (based on the expressions above) we will get something in terms of G and alpha, e.g.

value = a * G - b * alpha

If we can get an upper-bound on b * alpha in terms of G, then we’ll get a lower-bound on the value in terms of G.

Since G >= 1.1 * alpha, we have alpha <= G / 1.1. As b is a sum of non-negative counts, this gives our upper-bound on b * alpha:

b * alpha <= (b / 1.1) * G

This gives us a lower-bound on the value provided as:

value >= (a - (b / 1.1)) * G

So that is our lower-bound on the value provided to the user by a set of calculations. The a and b variables can easily be calculated by totalling the coefficients of G and alpha for each calculation in the set. The 1.1 factor can be assumed constant (or calculated from the actual data observed). And G is essentially irrelevant to us. As all value calculations will contain a multiple of G, we can simply drop it – it contains no useful information.

So we have, pulling everything together, finally:

value >= (a - (b / c))

where

a = answered_correctly – 10 * answered_incorrectly

b = 0.5 * no_calculation + answered_correctly + answered_incorrectly + not_answered

c = 1.1 * 15 / (time_to_input_calc + time_to_perform_calc + time_to_reach_calculator / num_calcs_per_user)
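
Pulled together, the definitions of a, b and c above might be sketched in Python as follows. The function name and parameter names are assumptions for illustration; the defaults (15 seconds, 1.1 and 10) are the estimates used in these notes, and alpha is the measured per-calculation time cost.

```python
def value_lower_bound(no_calculation, answered_correctly,
                      answered_incorrectly, not_answered,
                      alpha, alpha_e=15.0, min_return=1.1, loss_cap=10):
    """Lower bound on the value delivered, in units of G (so G is dropped).

    The first four arguments are counts of each outcome in the set
    being scored. alpha is the actual time cost per calculation in
    seconds; alpha_e is the expected time cost.
    """
    a = answered_correctly - loss_cap * answered_incorrectly
    b = (0.5 * no_calculation + answered_correctly
         + answered_incorrectly + not_answered)
    c = min_return * alpha_e / alpha
    return a - b / c
```

For a hypothetical set of 20 visits without a calculation, 100 correct answers, 1 incorrect answer and 10 unanswered calculations, this gives about -20 when alpha is 15 seconds and about 35 when alpha is 7.5 seconds, so a faster calculator scores higher on otherwise identical outcomes.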

While not an exact measure (which wouldn’t really be feasible), it does give us a lower-bound on the value provided (given the assumptions made above). This is almost as useful, as we can now operate with this as our measure of value and strive for a constantly increasing lower-bound on the value provided. While the actual value may be somewhat higher than this, improvements in the lower-bound will almost certainly result in the actual value provided being increased (if not perfectly correlated, this should at least happen on average over time).

Note that there are no units – this doesn’t matter as the primary use of this value is to compare two sets of calculations, or options, to determine the relative value provided to the user.

Calcatraz, the Fast Online Calculator

January 23rd, 2010

Recently a number of changes I made to the calculator have been causing a bit of a performance hit. To counteract that I’ve been through and made a few simple changes:

  • Removed some external file includes which were no longer needed
  • Implemented better caching to prevent unnecessary re-requests
  • Modified the core to minimise the amount of work being performed in the most-frequently used code.

The last of these required some profiling to be performed. For this I used webgrind for PHP (which requires xdebug to be installed, which I did on my local testing environment). While this set-up was a bit fiddly to get going, it was worth it as it let me pinpoint exactly where performance was being the most affected. As a result I was able to cut calculation time down from 2.5 to around 0.5 seconds on average. The result is faster page loads every time.

New User Interface Uploaded

November 5th, 2009

As promised in my last post, I’ve now uploaded the latest version of Calcatraz to the live site. The main feature is a new, hopefully more intuitive and consistent user interface. There are also a number of behind the scenes improvements such as general code tidy-ups. One improvement I’m particularly happy with is the integration of javascript and CSS minification into my build process. What this means is that whenever I prepare a new version of the calculator to be uploaded, a bit of code runs automatically reducing the size of my javascript and CSS files. As a result, load times should be slightly reduced as there is now less to be downloaded (which will hopefully further help with the bounce rate). It’s not going to have a huge impact on speed, but by constantly exploiting such opportunities for optimisation the result should be an overall leaner and quicker calculator. If you’re interested in learning more about minification, then check out YUI Compressor from Yahoo! which is the tool I use to perform the minification, and which performs simply and effectively.
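
To give a feel for what minification does, here is a deliberately naive CSS minifier in Python. It is only a sketch of the general idea (strip comments, collapse whitespace) and is not how YUI Compressor works internally; the function name is made up for the example.

```python
import re


def minify_css(css):
    """Naive CSS minifier: strips comments and redundant whitespace.

    Caveat: it does not understand strings or selectors, so whitespace
    that is significant (e.g. inside quoted values) would be damaged.
    """
    css = re.sub(r"/\*.*?\*/", "", css, flags=re.DOTALL)  # remove comments
    css = re.sub(r"\s+", " ", css)                         # collapse whitespace runs
    css = re.sub(r"\s*([{};:,])\s*", r"\1", css)           # trim around punctuation
    return css.strip()


source = "/* logo */\nbody {\n  color: red;\n  margin: 0 auto;\n}\n"
print(len(source), len(minify_css(source)))
```

A real minifier handles many more cases safely, which is why using an established tool in the build process is the more sensible choice.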

I’ll be doing some further refinements on the user interface in the near future to further improve it and in particular clean up the history tab which I’m still not entirely happy with. But I wanted to get the new interface up now to see how well it is received and to identify any sticking points.

You can check out the new interface in action on the main site: www.calcatraz.com.

A few changes on the way

November 1st, 2009

Just a quick update on a few of the things I’ve been working on recently:

  • A new user interface – the old Calcatraz interface has two main pages: the home page with a single calculation entry box, and the main calculation page which shows a history of calculations and other information. One thing I found was that the transition from the home page to the more cluttered calculation page was confusing to users. As a result I’ve been developing a new user interface which has a single screen (which is sort of half-way between the two old pages). This hides less commonly used features and removes the ‘surprise’ element which used to arise when a user was sent to the second page. Hopefully this will make the calculator more user-friendly and consistent to use. It also has the added benefit of reducing the amount of code and effort from that required to maintain two separate pages.
  • A static blog. As a small startup, time is a precious commodity. I can’t really justify the time cost of maintaining an up-to-date WordPress installation. To solve the problem once and for all I’m creating some code which copies each HTML page produced by the blog. The result is a static HTML version of the blog. It has a number of benefits. First is that security problems are virtually eliminated as the only thing that will be placed on the webserver is static HTML. The second is that, as static HTML, pages will load more quickly than interpreted code. This will provide users with a better user experience. The main downside is that I will have to disable comments. However, this is a trade-off I’m willing to make as this site is mainly informational. I may look at reintroducing comments in some other form at a later time.
  • General optimisations. One of my main aims for the calculator is that it should be fast. To this end I’ve been implementing a number of optimisations. For example I’ve modified my build process to minify javascript and css (that is, remove unnecessary comments and characters, to reduce size and therefore load time). This will be applied to the live site as soon as I finish my changes to the user interface and perform a site update.
  • Core algorithm updates. I’ve been making a number of changes to the core algorithms to provide results more quickly. I hope to be able to share these details at some point in the future.
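
The static blog idea in the list above boils down to saving each rendered page under a path that mirrors its URL. The Python sketch below shows one possible way to do that; it is not the code used for this blog, the function names and the example URL are hypothetical, and fetching the rendered HTML is left out.

```python
from pathlib import Path
from urllib.parse import urlparse


def static_path(url, root="static"):
    """Map a blog URL to the file that should hold its static HTML copy."""
    path = urlparse(url).path.strip("/")
    return Path(root) / path / "index.html"


def save_page(url, html, root="static"):
    """Write already-rendered HTML to its static location and return the path."""
    target = static_path(url, root)
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(html, encoding="utf-8")
    return target


# Hypothetical post URL, used only to show the mapping.
print(static_path("http://example.com/blog/2009/11/a-few-changes/"))
```

Writing each page as index.html inside a directory named after the URL path means a plain web server can serve the copies at the same addresses as the originals.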

So while there hasn’t been much in the way of observable changes to the site, there has been a lot going on in the background. When I finish up these loose ends then I’ll update the live site. I hope you like the improvements.

Bounce rate reduced from 70% to 44%

August 30th, 2009

One thing that became quickly obvious when Calcatraz launched was that the site had a high bounce rate. At 70% it seemed to be fairly average for an unoptimised site. But it meant that for every ten people who visited the site, only 3 were actually sticking around.

I decided to try and reduce the bounce rate for the home page, http://www.calcatraz.com. With a few quick, simple changes I was able to reduce the bounce rate to just 44% (meaning for every 10 visitors 5 1/2 actually entered a calculation). This almost doubles the number of visitors who actually choose to interact with the site. These changes were:

  1. Speeding up the load time. First I halved the file size of the logo from 6kb to 3kb. Then I inlined the CSS style definitions into the main file (to remove the need to have separate requests for the CSS). Finally I enabled HTML compression which causes the server to compress the page from 3kb to 1kb before sending it to the user. The end result was a significant decrease in the time it takes the main page to load for the user. This change reduced the bounce rate from 70% to about 55%.
  2. Providing the user with a better idea of how to use the calculator. I added the descriptive phrase ‘The free online calculator’ under the logo, and provided examples of valid calculations under the calculation entry box.
  3. Providing the user with easy ways to interact with the calculator. For users who don’t have a specific calculation in mind, I converted the example calculations into links so they can just click on them and Calcatraz will work it out for them. This reduces the barrier to entry for users coming to the site.
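
The effect of the HTML compression mentioned in change 1 can be illustrated with Python's gzip module. This is only a sketch with a made-up page: on the live site the compression is done by the web server, and the sizes here are not the ones measured for Calcatraz.

```python
import gzip


def compressed_size(html):
    """Return (original, compressed) sizes in bytes for a page body."""
    raw = html.encode("utf-8")
    return len(raw), len(gzip.compress(raw))


# Hypothetical page: repetitive markup tends to compress well.
page = "<html><body>" + "<p class='example'>2 + 2</p>\n" * 100 + "</body></html>"
original, compressed = compressed_size(page)
print(original, compressed)
```

How much is saved depends on the content; small, repetitive HTML pages usually shrink a lot, which is consistent with the 3kb to 1kb reduction described above.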

I’m still testing the changes described in 2 and 3 with Google Website Optimizer, so they don’t currently appear for all users. However, I will very soon be implementing them as the main front page as they are significantly outperforming the alternative which does not have these features. After that I’ll continue to make refinements to improve the bounce rate as I suspect it can be reduced still further.