An Examination of Judge Reliability at a Major U.S. Wine Competition
Wine judge performance at a major wine competition was analyzed from 2005 to 2008 using replicate samples. Each panel of four expert judges received a flight of 30 wines in which triplicate samples poured from the same bottle were embedded. Between 65 and 70 judges were tested each year. About 10 percent of the judges were able to replicate their scores within a single medal group. Another 10 percent, on occasion, gave the same wine scores ranging from Bronze to Gold. Judges tend to be more consistent about what they don’t like than about what they do. An analysis of variance covering every panel over the study period indicates that only about half of the panels presented awards based solely on wine quality.
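To make the replicate test concrete, the following minimal sketch checks whether one judge's three scores for the same wine fall within a single medal group. The 100-point scale, the medal cut-offs, and the example scores are illustrative assumptions rather than values taken from the study, and "within a single medal group" is read here as all three scores earning the same medal, which is only one possible interpretation.

```python
from bisect import bisect_right

# Hypothetical lower bounds for Bronze, Silver, and Gold on a 100-point
# scale. These cut-offs are assumptions for illustration only.
CUTOFFS = [80, 86, 90]
MEDALS = ["No Award", "Bronze", "Silver", "Gold"]


def medal(score):
    """Map a numeric score to a medal group using the assumed cut-offs."""
    return MEDALS[bisect_right(CUTOFFS, score)]


def replicate_summary(scores):
    """Summarize one judge's scores for replicate pours of the same wine."""
    medals = [medal(s) for s in scores]
    levels = [MEDALS.index(m) for m in medals]
    return {
        "range": max(scores) - min(scores),       # spread in points
        "medals": medals,                         # medal for each pour
        "consistent": max(levels) == min(levels)  # same medal every time
    }


# Made-up scores for two hypothetical judges.
steady = replicate_summary([88, 87, 89])
erratic = replicate_summary([82, 88, 92])
print(steady)
print(erratic)
```

Under these assumed cut-offs, the first judge stays within the Silver group across all three pours, while the second spans Bronze to Gold for the same wine.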