Improving the accuracy and reliability of earwitness evidence

The victim of a crime might not always get a clear view of the perpetrator. In cases such as telephone fraud, blackmail, or a masked attack, the sound of the perpetrator’s voice might provide one of the few clues to their identity.

In a situation such as this, the police have to rely on earwitness, rather than eyewitness, evidence. The earwitness will be required to provide a description of what the voice sounded like, and may be asked to try to identify the suspect from a voice parade (Home Office, 2003).

Worryingly, however, we still know relatively little about the conditions that might affect the accuracy and reliability of voice identification evidence. Over the past century, a large body of psychological research has focused on the accuracy of eyewitness testimony. The outcome of this research has filtered into the legal process, resulting in the Turnbull guidelines, and influencing codes of practice (Code D, 1984). In comparison, earwitness testimony has been largely neglected, and there remain many gaps in our knowledge.

One thing we do know is that memory for voices is error prone. Research has consistently shown that people tend to remember faces much better than they remember voices. This might be because we pay more attention to what is being said, rather than the sound of someone’s voice; we are used to being able to rely on someone’s face for information about his or her identity.

As a result, finding words to accurately describe voices is difficult for listeners who do not have expert knowledge of linguistics. The descriptions that lay (i.e. non-expert) listeners produce have a tendency to be inaccurate, vague, and subjective. This is problematic because the description provided by an earwitness can provide crucial evidence.

The witness description has the potential to help narrow down a list of suspects. Such descriptions can be vitally important in the trial process and can help to eliminate cases of misidentification.

As has been demonstrated in previous cases, voice identification evidence can be decisive in court. For example, in R v Roberts (2000), the victim did not see his attacker, but described him as having a ‘London accent’. Very little detail about the voice was recorded, and the conviction was quashed. In R v Nealon (2013), witnesses described the rapist as having a Scottish accent. The suspect, however, was from Ireland, and was later exonerated by DNA evidence.

Home Office guidelines (2003) recognise the importance of a detailed first description of the perpetrator’s voice. However, they do not explain how it should be obtained.

There is an urgent need for clear guidance, and for the development of a procedure for gathering voice identification evidence. This procedure should be quick and easy to use so that it can be administered at the earliest opportunity, thereby reducing the likelihood that the witness’s memory will have degraded.

Before this can be achieved though, further research is required.

We are an interdisciplinary team of academics drawn from Psychology, Linguistics, and Law. Our work is funded by the British Academy and the Safety and Security fund at Nottingham Trent University, and focuses on two main strands: identity-specific features of voice quality, and accent identification. Our aim is to develop guidelines that the police can use when they are taking a description of the perpetrator’s voice.


Descriptions of voice quality

Voice quality refers to how the voice sounds.

We are currently conducting a series of lab-based experiments to test how the accuracy, detail, and consistency of voice quality descriptions vary according to the type of questions that are asked. In each experiment, the questions are posed in slightly different ways in order to try to reduce the inaccuracy, vagueness, and subjectivity of the descriptions produced.

Initially, we tested how lay listeners perform when they are simply asked to describe a voice. We call this ‘free recall’, because the listener is not given any prompts about how they should answer the question, or what kind of information they should include. We wanted to compare the kind of information people provide spontaneously with the kind of information that will be useful in an enquiry. Gaps in free recall indicate the questions that earwitnesses need to be asked specifically.

However, these results do not indicate how the questions should be asked. This is what we are currently looking at. Based on the results of the initial experiment, we have adapted the procedure to include structured questions. In a further experiment, listeners are asked to describe particular voice features such as pitch and tempo. To try to reduce subjectivity and inaccuracy even more, in the final experiment we are testing listeners using rating scales. They are asked to assess features such as pitch on a numeric scale from 1 to 7 (1 = extremely low pitch, 7 = extremely high pitch).

We will compare responses in these two experiments in order to decide how particular questions should be posed. Based on a thorough analysis of all three experiments and some further research, we will be able to develop an easy-to-administer procedure that optimises earwitness performance.

Accent identification

We are also focusing on accent identification.

Accent can be a fundamentally important aspect of a voice description. When an accent is (or is not) mentioned, this is a factor that a judge must direct a jury to take into account when assessing the accuracy of the identification made (Bench Book 2017). As such, we felt that accent demanded careful attention in a separate but related project.

This project will address how accurately people can identify accents, and what linguistic features people use to make a decision (i.e. what kind of word sounds are particularly helpful to listeners trying to decide where someone is from). The results will indicate whether questions can be posed in a way that maximises the accuracy of accent identification, and will feed into the voice description guidelines we are developing.

Looking to the future

These two strands of research are still at an early stage, and we look forward to being able to share our results with the police community.

The preliminary results are very promising. Our work suggests that by changing the way that questions are asked, it is possible to reduce the subjectivity, vagueness, and inaccuracy of earwitness testimony.

As our work on the projects draws to an end, we will invite police and legal professionals to a workshop held at Nottingham Trent University. The workshop will provide an opportunity for us to disseminate our findings and demonstrate the resulting procedures. The views of police personnel are extremely valuable to us, as they help to maximise the impact of our research. Most crucially, the workshop will help us to learn more about issues that might be relevant in practice so that any recommendations we ultimately make will be both practicable and useful.

If you would like to discuss any aspect of this research with us, we would be very happy to hear from you.

Dr Harriet Smith is an Independent Research Fellow in Psychology at Nottingham Trent University. Harriet has published a range of papers on voice and face perception, eyewitness and earwitness testimony, and verbal descriptions of faces.

Natalie Braber is an Associate Professor in Linguistics at Nottingham Trent University. Natalie has published widely on topics relating to accents, dialects, language and identity in the East Midlands.