Research by Chris Dewberry, published by the Legal Services Board, brings into sharper relief the value of aptitude tests and their application to the legal professions. For me it suggests that the value of aptitude tests is most likely to be felt at the entry-into-University stage (and possibly not even there), but that ensuring they operate fairly poses a major challenge. The uses proposed by the BSB, and being investigated by the Law Society, face significant problems (see also here, here, here, here and here for previous posts and links to Sutton Trust work in particular).
The report begins with a point which gives me the opportunity to stoop low into the gutter of smutty blog titles. That is, the report shows that, for the most part, we should forget about aptitude tests and focus on measures of cognitive ability. Psychologists label this ‘g’, and g is a powerful thing. It exists, and tests that find it can predict degree performance (and so, it is fair to surmise, LPC or BPTC performance) or job performance as well as, or better than, the other common techniques. So far so good for proponents of aptitude tests.
g is a pretty general phenomenon. Those who have high cognitive ability are likely to have good aptitudes generally, and testing for g is generally as good as testing directly for aptitude. This is particularly true, the research shows, for high-complexity jobs: your g-score predicts the likelihood that you will perform well, and looking more closely at aptitude adds little if anything to that. This suggests that g might be a valid predictor of success in the legal profession, and that the Bar and the Law Society are wise to consider aptitude (as a predictor of g) but might be wiser to concentrate their efforts on g more directly. So far, so technical, but the research has some more important and relevant findings which pose significant challenges to the utility of tests for any regulator.
The first is that aptitude and cognitive ability tests work significantly less well where the range of people being tested is narrow. That is, where candidates have already been selected on (say) A-level scores or degree results, the value of such tests is significantly inhibited. Those people are already pretty able, and further testing does not (on current evidence) do much to help distinguish between them. How far the test's value is inhibited is a question for empirical testing. In the context of law, the power of a test to predict over and above A-level results may be modest, pretty weak or non-existent. Interestingly, the Sutton Trust have already considered the power of an aptitude test to predict degree performance (over and above using A-level scores) and, even though they wanted it to succeed, it failed. A-levels were as good a predictor or better and, being data already collected, were a cheaper test of aptitude (or g, if you like).
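The range-restriction effect is easy to see in a toy simulation (hypothetical numbers of my own, not the report's data): generate a latent ability, derive two noisy measures of it, and compare the test's predictive correlation in the full population with the correlation among candidates who have already been selected on the test.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Latent ability ("g") drives both the aptitude test score and later performance.
g = rng.normal(size=n)
test = g + rng.normal(scale=1.0, size=n)         # noisy measure of g
performance = g + rng.normal(scale=1.0, size=n)  # noisy outcome

def corr(a, b):
    return np.corrcoef(a, b)[0, 1]

r_full = corr(test, performance)

# Now restrict the range: keep only candidates above the 70th percentile,
# mimicking a pool already filtered by prior selection.
cutoff = np.quantile(test, 0.70)
kept = test >= cutoff
r_restricted = corr(test[kept], performance[kept])

print(f"full-range validity: r = {r_full:.2f}")
print(f"restricted validity: r = {r_restricted:.2f}")
```

The same test, with the same underlying validity, correlates far more weakly with performance once the weaker candidates have been screened out — which is the report's point about testing an already-selected pool.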
The second problem is the one that is likely to attract the most attention. Aptitude tests (and g-tests) tend to favour white over black students, rich over poor students, and so on. It is worth emphasising that this is true also of A-levels. There are two reasons. One is that the favoured candidates have higher levels of cognitive ability because their environment has allowed them (or encouraged or required them) to develop their abilities more strongly. A second is that this group is more likely to have been coached on the tests. Practice tests and training on tests improve performance; students from certain backgrounds are more likely to get such support and so get better scores. Data from the US suggests that the difference in scoring between sub-groups can be enormous, and Chris Dewberry's initial analysis of available LNAT data from the UK suggests similar problems may exist here: if an LNAT score of 17 were used to gain admission to law schools using the test, then 51% of white candidates would be admitted, 30% of Black African candidates and 27% of Indian and Pakistani candidates. Universities do not use LNAT scores alone, so the figures may not be that bad (equally, they could be worse), but they indicate a potentially significant problem.
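One common way of quantifying this sort of disparity is the US "four-fifths" (80%) guideline: where a group's selection rate falls below 80% of the highest group's rate, adverse impact is flagged. Applied to the LNAT figures quoted above (a rough sketch only — these are admission rates at a single cutoff, not a full analysis):

```python
# Admission rates at an LNAT cutoff of 17, as quoted in the report.
rates = {"White": 0.51, "Black African": 0.30, "Indian/Pakistani": 0.27}

reference = max(rates.values())  # highest-scoring group's rate

for group, rate in rates.items():
    impact_ratio = rate / reference
    # The US "four-fifths" guideline flags ratios below 0.8 as adverse impact.
    flag = "ADVERSE IMPACT" if impact_ratio < 0.8 else "ok"
    print(f"{group:18s} rate={rate:.2f} ratio={impact_ratio:.2f} {flag}")
```

On the quoted figures, both minority groups sit at roughly 0.5 to 0.6 of the white admission rate — well below the 0.8 threshold.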
Most interestingly, those with somewhat lower g-scores from disadvantaged socio-economic groups can and do catch up with, or overtake, those from advantaged groups with higher (or the same) scores. A child from a state school with the same g-score as a child from a public school would be likely to do significantly better in their academic studies and job performance. This might mean disadvantaged candidates should have a different weighting applied to their g-score (or different A-level entry requirements). It is also possible to test for, and minimise, the social discrimination built into tests of g. One US study has produced a test which did not discriminate, and there is a process called differential testing which can be applied to make tests fairer. This leaves open the issue of coaching.
All this suggests to me that if aptitude (or cognitive ability) testing is to be applied usefully anywhere, it is likely to be at the stage of entry into University. Before University the ability range of those to be tested is likely to be widest, and so cognitive ability (or aptitude) tests are likely to work best. This depends, however, on such tests being better than A-levels (alongside, I would argue, contextual admissions data to part-tackle the inbuilt biases in A-level scores).
g-tests might also be used by employers, though the range-narrowing problem referred to above seems to make that unlikely. It is worth noting that cognitive ability tests are as good as structured interviews (and probably significantly cheaper), and significantly better than (for example) assessment centres, at predicting future job performance (though again these findings are generated across occupations and so have probably not been thoroughly road-tested in law). For those of you with great, slightly right-sloping handwriting, sit down and absorb the bad news that graphology as a predictor of job performance is (probably) pants (allegedly [is there a British Association of Graphologists?]).
The report ends with a series of questions that the LSB are likely to pose to regulators seeking to introduce aptitude tests. These involve: being very clear what it is, over and above cognitive ability, they are looking for in such tests; demonstrating robust testing of such additional measures (showing that the tests reliably measure what they set out to measure, and do so accurately, through test-retest work, for example); and, in particular, demonstrating the predictive power of such tests over and above utilising existing information. This is serious work requiring significant care and investment. It is work that has not been carried out to date, in any verifiable way, in Britain, because such tests are run by private organisations who plead commercial sensitivity. That may be fine where the tester is a private business (who can be left to look after their own interests), but where it is a profession seeking to advance the public interest there is a need for transparency.
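The "over and above existing information" question is, in statistical terms, one of incremental validity: does adding the new test to a prediction that already uses A-level scores actually improve it? A toy sketch (entirely hypothetical data, assuming A-levels and the new test are equally noisy proxies for g):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 50_000

# Hypothetical data: both predictors and the outcome are noisy measures of g.
g = rng.normal(size=n)
a_levels = g + rng.normal(scale=1.0, size=n)
new_test = g + rng.normal(scale=1.0, size=n)
outcome = g + rng.normal(scale=1.0, size=n)

def r_squared(predictors, y):
    """R^2 from an ordinary least-squares fit with an intercept."""
    X = np.column_stack([np.ones(len(y))] + list(predictors))
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return 1 - resid.var() / y.var()

r2_old = r_squared([a_levels], outcome)
r2_both = r_squared([a_levels, new_test], outcome)

print(f"A-levels alone:   R^2 = {r2_old:.3f}")
print(f"+ aptitude test:  R^2 = {r2_both:.3f}")
print(f"increment:        {r2_both - r2_old:.3f}")
```

In this contrived setup the new test adds a modest increment because it is an independent second reading of g; the empirical question for regulators is whether the real-world increment is distinguishable from zero once A-levels are already in hand.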
The report also suggests that regulators should identify sub-group differences in test scores on the basis of class, educational background, gender and ethnicity; report those adequately; and consider ways of mitigating those differences. Provision should be made for students to have access to a sufficient number and range of practice tests and, if possible, test-coaching opportunities. Ultimately this is what the progressive case for aptitude tests depends upon: tests which deal fairly with all groups, and for which the coaching advantages enjoyed by (mainly rich) students are removed by the provision of sufficient alternatives to render the benefit of paid coaching illusory. If this could be done, the profession would take a significant step towards a fairer and more meritocratic entry process. Whether such a system can be devised which works better than the current one, however, is very debatable and requires a lot of work.