
Choosing Participants

The ideal users for testing a system are those in the application's target population: people with a full knowledge of the terminology and techniques of their field, and of the tasks that the application will be required to perform. Testing with people from the target population also means that their computing skill level is comparable to that of the eventual end users.

Landay recommends that if such ideal users are unavailable, a "closest approximation" can be used instead. For example, if a system is targeted at doctors and it is not possible to arrange for a practising doctor to test it, then a medical student can do the testing with almost equivalent skill, and provide feedback similar to what a doctor would have given.

The goal of testing this system was to get feedback on pen-based formula entry systems, and on the new interface concepts created. The participants were chosen to represent the people who could be expected to use a formula entry system: mathematicians, physicists, computer scientists, and high-school students.

One important aspect to consider is the number of participants to involve in the user testing. If too few are used, only a small number of the problems in an application will be found. If too many are used, the additional time and effort of organising the extra participants and analysing their results and responses brings minimal returns.


  
Figure 5.1: The proportion of problems with a user interface found as the number of evaluators is increased.

Much of the literature on user testing and usability studies discusses this important issue of how many people to involve. The curve shown in Figure 5.1, taken from Nielsen, is typical of the relationship between the number of evaluators and the proportion of problems found with a user interface.

This suggests that the "ideal" number of people to test or evaluate a system lies between four and ten. Nielsen, for example, believes that evaluation tends to work best with three to five evaluators. Dumas and Redish suggest six to twelve participants for user testing. These figures apply to large systems, such as word processors or email programs. This system is relatively small in comparison.
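The diminishing returns described above can be illustrated with a simple model that is often used in the usability literature: if each evaluator independently finds any given problem with probability p, then n evaluators are expected to find a proportion 1 - (1 - p)^n of the problems. The sketch below is only an illustration under that assumption. The function name proportion_found and the default rate of 0.31 are my own illustrative choices, not values taken from the testing described here, and the curve it produces is not necessarily the one plotted in Figure 5.1.

```python
def proportion_found(n_evaluators: int, detect_rate: float = 0.31) -> float:
    """Expected share of problems found by n independent evaluators.

    Assumes every evaluator finds any given problem with the same
    probability `detect_rate` (an illustrative value, not measured here).
    """
    if n_evaluators < 0:
        raise ValueError("n_evaluators must be non-negative")
    if not 0.0 <= detect_rate <= 1.0:
        raise ValueError("detect_rate must lie between 0 and 1")
    # A problem is missed only if every one of the evaluators misses it.
    return 1.0 - (1.0 - detect_rate) ** n_evaluators


# Print the expected proportion for a few group sizes.
for n in (1, 3, 5, 9, 12):
    print(f"{n:2d} evaluators: {proportion_found(n):.0%} of problems")
```

Under these assumptions each additional evaluator contributes less than the one before, which is consistent with the recommendations of small groups quoted above.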

A total of nine participants took part in the testing, though one of these solely observed another participant using the system and commented on what he saw, without using the system himself. The nine participants consisted of two high-school students, two physics postgraduates, two mathematics postgraduates, and one postgraduate and two undergraduate computer science students.

Participants for user testing can be obtained in a number of ways. You can advertise for participants, then screen them as they apply to see if they are suitable. You can go through a recruitment agency, giving them a list of the criteria that the participants must meet, and they will do the screening for you. It is also considered acceptable to use personal networks to get participants, though this is not ideal. Care has to be taken because such participants are likely to know you personally, so their impartiality may be limited. They may be inclined to be kinder in their responses and opinions, telling you what they think you would prefer to hear rather than being honestly critical. Balancing this is the possibility that friends may be less shy during the test, as they are not apprehensive about criticising something in front of someone they do not know.

The people who tested this system comprised three people I, the observer, had never met before, and six people I already knew, either as personal friends or as friends of friends.

Of the people who used the system, six had not seen it before, two had seen it, and only one had used it before. This meant that there was little bias from participants' previous experience or knowledge of the system. Only two of the users had used a pen and tablet before, so unfortunately the remainder had to become familiar with the pen and tablet as they used the system. This tended to decrease the ease with which they were able to use the system. From my personal experience, the time it takes for a person to become comfortable and accurate with a pen and tablet varies between hours and days.



Steve Smithies
1999-11-13