This form of direct oral testing is known variously as the oral interview, oral test, or oral examination. I will refer to it here as the oral examination, as I am dealing mainly with examinations in a university context.

The literature on language testing has identified a number of unsolved problems with oral examinations. Much of the discussion has centred on the issues of validity and reliability, but problems in the practical administration of oral examinations have also received comment.

The fundamental problems with oral examinations are those of reliability (i.e. the consistency with which different examiners mark the same test, or with which the same examiner marks a test on different occasions) and validity (i.e. whether or not an oral test assesses what it sets out to assess).

The reliability of oral examinations has been seen as a serious problem from the very start of research on this topic. Spelberg et al. (2002) report very low correlations, averaging only .41, between the marks of different examiners, although Taguchi (2005) points out that the nine examiners who marked sixteen candidates [ . . . ] in this study 'did not have marking schemes, were given no training, were unstandardized and were given no criteria for judging candidates' ability', so the discrepancies in their judgements are perhaps not such a surprise. Spelberg (2000) describes the usual ways of testing oral ability as 'impressions from memory' or 'haphazard interviews', and writes that 'the vast majority of cases [ . . . ] are not reliably separated into levels of speaking ability' by this approach, because of the complexity of the language and non-language factors involved. Michael (2001) states that for tests based on free conversation 'the problems of sampling and reliable scoring are almost insoluble', unless a great deal of time and many