UbiComp/ISWC 2013: Bias in sensor-triggered surveys

Surveys on mobile devices are a useful tool for sampling behavioural data from users, but critical design decisions can bias the outcome and produce misleading data. That is the thesis of a UbiComp paper titled Contextual Dissonance: Design Bias in Sensor-Based Experience Sampling Methods, which takes a critical look at the tools researchers use to survey their users. The technique in question is the Experience Sampling Method (ESM), which poses a series of questions to users over the course of a long study in an effort to build a more complete picture of their lives.

The widespread adoption of mobile devices has made it easier than ever for researchers to use this method to ask users questions on their own phones, and to use the increasingly wide array of sensors on those devices to pose questions at moments when users are likely to answer. Response rates can suffer dramatically if users feel harassed by the survey tool, and recall bias can creep in if they are asked about behaviours that occurred too far in the past. Asking timely questions is a key element of ESM, and it usually depends on sensor data to determine the best moment to prompt the user.
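To make that concrete, a sensor-triggered prompt scheduler might look something like the Python sketch below. This is an illustration rather than anything from the paper, and the sensor-check functions are hypothetical stand-ins for real device APIs:

    import random
    import time

    # Hypothetical sensor checks -- stand-ins for real device APIs.
    def accelerometer_is_still():
        return random.random() < 0.3  # pretend: device has been stationary

    def screen_recently_unlocked():
        return random.random() < 0.2  # pretend: screen just turned on

    def should_prompt(last_prompt_time, min_gap_seconds=3600):
        """Fire a survey prompt only when a sensor suggests the user is
        available, and never more often than min_gap_seconds, so the
        tool does not feel harassing."""
        if time.time() - last_prompt_time < min_gap_seconds:
            return False
        return accelerometer_is_still() or screen_recently_unlocked()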

Figure: four charts showing response ratings across trigger groups.

As an example, a researcher may want to ask users questions about their feelings when they are at home, and may use a variety of sensors to try to determine their location. These may include hardware sensors such as the accelerometer and microphone, and software sensors that can detect when the user is sending text messages. It has been typical to assume that the choice of sensor trigger initiating the question has little impact on the results of the survey, but the authors of this paper contend that this is not the case. In one example, they show how the results of a very simple question (whether the user had positive or negative feelings) could be skewed significantly by the choice of trigger, with communication triggers and delayed microphone responses showing significant increases in negativity.
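One way to check for this kind of trigger bias in your own data is to compare the positive/negative split across trigger groups with a standard contingency-table test. The sketch below uses made-up counts purely to show the mechanics; it is not the paper's data or method:

    from scipy.stats import chi2_contingency

    # Made-up counts for illustration only.
    # Rows: trigger type; columns: (positive responses, negative responses).
    observed = [
        [60, 40],  # accelerometer trigger
        [55, 45],  # microphone trigger
        [35, 65],  # communication (SMS) trigger
    ]

    chi2, p, dof, expected = chi2_contingency(observed)
    print(f"chi2={chi2:.2f}, p={p:.4f}")
    # A small p-value suggests the positive/negative split differs across
    # trigger groups, i.e. the trigger choice is biasing the answers.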

The authors don’t go into great detail on remedies for this bias, but they recommend investigating the use of multiple sensors to determine appropriate trigger points more accurately, and the adoption of machine-learning techniques that could adapt to each user over time (a toy sketch of this idea follows below). In the question period after the presentation, the authors also suggested varying the questions to keep users interested and engaged over the course of the study. These findings should be helpful for designers and marketers, and of great interest to anyone considering user research with mobile devices.
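As a closing illustration of the adaptive-trigger suggestion, a per-user model might learn from past prompts which sensor contexts led to a completed survey, and only prompt when a response seems likely. This is a toy sketch under my own assumptions, not the authors' approach; the features and training data are entirely hypothetical:

    import numpy as np
    from sklearn.linear_model import LogisticRegression

    # Hypothetical history of past prompts for one user.
    # Features per prompt: [stationary, screen_on, recent_sms] (0/1 flags).
    X = np.array([
        [1, 1, 0],
        [0, 0, 1],
        [1, 0, 0],
        [0, 1, 1],
        [1, 1, 1],
        [0, 0, 0],
    ])
    y = np.array([1, 0, 1, 0, 1, 0])  # 1 = user answered the prompt

    model = LogisticRegression().fit(X, y)

    def worth_prompting(context, threshold=0.6):
        """Prompt only when the model predicts this user is likely to answer."""
        return model.predict_proba([context])[0][1] >= threshold

    print(worth_prompting([1, 1, 0]))  # stationary, screen on, no recent SMS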