Caulton: "Relaxing the homogeneity assumption in usability testing"

Caulton, D. (2001). Relaxing the homogeneity assumption in usability testing. Behaviour & Information Technology, 20(1), 1-7. doi:10.1080/01449290010020648



Relaxing the homogeneity assumption in usability testing


Concept of "discount usability" -- cutting costs in testing. One way is by cutting "unnecessary subjects." Cites earlier research supporting a smaller number of subjects -- e.g., curve of diminishing returns (Virzi 1992). Virzi claims 80% of usability probs are detected with 4-5 subjects; additional subjects are less and less likely to reveal new info; the most severe usability issues are likely to be detected in the first few subjects (1-2).


Caulton notes that Lewis (1994) found a need for more users (n=10+). Caulton states that he will examine the 2 assumptions in Virzi's work -- a model assuming that the user base is homogeneous; a test in which the researcher is satisfied with modest claims about the population.


Heterogeneous populations and small numbers of subjects


homogeneity assumption: different users encounter diff usability probs at random times; all subjects equally likely to encounter all problems.


Caulton notes this will not work because programmers' usability issues differ from clerks', which differ from nurses'...


How do you figure out the numbers if the user base is composed of many subgroups (2)?


Binomial Model: two parameters: number of subjects run in the usability test and probability of finding a given problem in one subject. There's a lot of math involved.
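A minimal sketch of the binomial model as I understand it: with n subjects and a per-subject detection probability p, the chance that a given problem shows up at least once is 1 - (1 - p)^n. The function names and the value p = 0.3 are my own illustration, not figures taken from Caulton or Virzi.

```python
def proportion_found(n_subjects: int, p_detect: float) -> float:
    """Probability that a given problem is seen at least once in n subjects.

    Assumes every subject independently has the same chance p_detect of
    hitting the problem (the homogeneity assumption).
    """
    return 1.0 - (1.0 - p_detect) ** n_subjects


def subjects_needed(target: float, p_detect: float, max_subjects: int = 1000) -> int:
    """Smallest n whose expected proportion found reaches the target."""
    for n in range(1, max_subjects + 1):
        if proportion_found(n, p_detect) >= target:
            return n
    raise ValueError("target not reached within max_subjects")


if __name__ == "__main__":
    # p = 0.3 is a hypothetical value chosen only for illustration.
    for n in (1, 2, 5, 10):
        print(n, round(proportion_found(n, 0.3), 3))
    print("subjects for 80%:", subjects_needed(0.80, 0.3))
```

With the illustrative p = 0.3, five subjects reach the 80% mark, which matches the shape of the diminishing-returns claim; a smaller p pushes the required n up quickly.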


Adding heterogeneous subgroups: need two types of usability probs-- shared probs (equal probability for all users) and unique probs (more likely to be in one subgroup than another).


Exploring the Model: more subgroups mean more potential issues, which means more testers are needed.


Simulating a combination of shared and unique problems: works out an equation to determine. (See 4-5)


The effect of heterogeneous groups: heterogeneous subgroups decrease the power of a usability test. Two factors affect the magnitude of the effect: the number of groups, and how distinct the subgroups are from each other.
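A rough sketch of why subgroups reduce power. This is not Caulton's equation (that is on pp. 4-5 of the paper); it is a simplified version under my own assumptions: subjects are sampled at random and equally from every subgroup, shared problems are detected with probability p by anyone, and a unique problem can only be detected by members of its own subgroup (the most extreme case of "distinct" subgroups). Under those assumptions a random subject hits a given unique problem with probability p / groups.

```python
def expected_found(n_subjects: int, p_detect: float,
                   groups: int = 1, shared_fraction: float = 1.0) -> float:
    """Expected proportion of all problems found by n randomly sampled subjects.

    Hypothetical simplification, not the model from the paper:
    - shared_fraction of the problems are shared (any subject, probability p_detect)
    - the rest are unique to exactly one of `groups` equally sized subgroups
      and are invisible to members of other subgroups
    """
    shared = 1.0 - (1.0 - p_detect) ** n_subjects
    unique = 1.0 - (1.0 - p_detect / groups) ** n_subjects
    return shared_fraction * shared + (1.0 - shared_fraction) * unique


if __name__ == "__main__":
    # Illustrative values only: p = 0.3, five subjects, half the problems shared.
    for g in (1, 2, 3, 5):
        print(g, "groups:", round(expected_found(5, 0.3, groups=g, shared_fraction=0.5), 3))
```

Both factors from the notes appear here: raising `groups` lowers the result, and lowering `shared_fraction` (subgroups that have less in common) lowers it further.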


The applicability of small n usability testing to various types of usability research


Be aware that small n methods work, but they're imprecise and don't necessarily indicate the extent to which something will affect the population. Methods not wrong, but be aware of limitations (5).




Argues against Virzi's and Lewis's separate claims as to needing fewer subjects; notes that their models do not adequately address user subgroups.


Argues groups are homogeneous when obviously narrow (e.g., 100% secretaries -- but I wonder whether the NY office and the Bangalore office are "homogeneous" even if same job description). Caulton brings this up -- e.g., relevant previous experience.


Asks what to do/how to deal with subgroups. Depends on the approach to measuring/identifying variables. Argues for running 5 of each experience group, 50 randomly sampled subjects.