Outcome Scales in Psychiatry

Should we use outcome scales in psychiatric practice? If so, which ones? Which are actually feasible in terms of time and utility?

Relatively few psychiatrists use outcome scales on a routine basis. In one study of 314 U.S. psychiatrists, for example, only 6.5% said they “almost always” used scales, and 61% said they never or rarely used such scales (Zimmerman M, McGlinchey JB. 2008. “Why Don’t Psychiatrists Use Scales To Measure Outcome When Treating Depressed Patients?” Submitted for publication).

In this month’s expert interview, Dr. David Katzelnick discusses a practical depression scale called the PHQ-9. In this article, we’ll review some of the other potentially useful scales available.

Depression Scales

While there are many depression scales out there, not all are equally practical for busy clinicians. The two most widely used scales in the antidepressant research literature are the Hamilton depression scale and the MADRS (Montgomery-Asberg Depression Rating Scale). Both are clinician-administered and require some training to administer well. Other scales, such as the Beck Depression Inventory, are self-administered but are rather long and require clinicians to pay a royalty to use them.

Aside from the PHQ-9 (download at forms/phq_9/), the other prominent free, short, patient-administered scale is the CUDOS, or Clinically Useful Depression Outcome Scale (Zimmerman M, et al., Comprehensive Psychiatry 2008;49(2):131-40). It includes 18 items (double the length of the PHQ-9), but still takes most patients under three minutes to complete. It has been well researched and is both valid and reliable when compared with other measures of depression. It can be downloaded for free at midas/scales/cudosform.pdf.

CUDOS’s greater length results from the fact that it breaks up some of the more involved DSM-IV criteria into components. Thus, to take the example of the sleep criterion, the PHQ-9 has a single item: “Trouble falling/staying asleep, sleeping too much,” while the CUDOS splits this into two items: “I had difficulty sleeping” and “I was sleeping too much.”

Michael Posternak, who is a member of TCPR’s editorial board, and who was a coauthor on some of the CUDOS studies, uses it routinely in his clinical practice and estimates that he has administered it to at least 5000 patients. I asked him for his thoughts about CUDOS, particularly as compared to the PHQ-9, which he reviewed but does not routinely use for his patients.

“While the PHQ-9 seems briefer, the attempt to save time on fewer answers may actually require more follow-up questioning. If a patient answers the PHQ-9 sleep item by saying that they either slept too little or too much, you do not know which one it is and will need to clarify. By dividing up the question ahead of time, you have the information up front and can hone right in. This is especially important on the suicidal ideation (SI) item. If, on the PHQ-9, a patient endorses passive SI or active SI (the PHQ-9 wording is ‘thoughts that you would be better off dead, or of hurting yourself in some way’), you must follow that up. Whereas if these two are divided up (as they are in the CUDOS) and active SI is absent, you can feel comfortable that no active SI is present.”

Regardless of which scale you use, one of its most useful properties is that it allows you to track your patient’s progress over time and across medication trials. For example, Posternak’s patient charts look like this (with each score obtained at a subsequent visit):

Baseline CUDOS 45
Effexor started –> 42, 36, 35, 26, 25
Effexor + Wellbutrin –> 24, 16, 12, 13

Patients who are unsure of how much progress they have made will often become convinced when shown this relatively objective measure.
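For clinicians who keep such charts electronically, the tracking idea above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Trial` class, the `percent_change` and `summarize` helpers, and the choice to report percent reduction from baseline are all hypothetical, not part of the CUDOS or of Posternak's practice. The scores are the ones from the sample chart.

```python
from dataclasses import dataclass, field


@dataclass
class Trial:
    """One medication trial and the scale scores recorded at each visit (hypothetical structure)."""
    label: str
    scores: list = field(default_factory=list)


def percent_change(baseline, score):
    """Percent reduction from baseline; positive values mean the score has dropped."""
    return 100.0 * (baseline - score) / baseline


def summarize(baseline, trials):
    """Return (label, last score, percent reduction from baseline) for each trial."""
    rows = []
    for trial in trials:
        last = trial.scores[-1]
        rows.append((trial.label, last, round(percent_change(baseline, last), 1)))
    return rows


# Scores copied from the sample chart above.
baseline = 45
trials = [
    Trial("Effexor", [42, 36, 35, 26, 25]),
    Trial("Effexor + Wellbutrin", [24, 16, 12, 13]),
]

summary = summarize(baseline, trials)
for label, last, pct in summary:
    print(f"{label}: last score {last}, {pct}% below baseline")
```

With the sample chart, this reports the last Effexor score as about 44% below baseline and the last combination score as about 71% below baseline, which is the kind of figure a patient can be shown alongside the raw scores.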

In my own informal “road-testing” of the PHQ-9 and CUDOS, I found that the PHQ-9 was easier and quicker for patients to fill out and score, but that the CUDOS provided more information. Since they are both free, I suggest you try them both out and decide which works best for your practice.

A General Symptom Scale: The OQ-45

While we usually think of symptom scales in terms of tracking progress, one of the most well-researched scales in the field is most useful in predicting treatment failure. The OQ-45 (Outcome Questionnaire-45) is a 45-item scale developed for psychotherapy patients, and it contains items covering depression, anxiety, substance abuse, interpersonal distress, and difficulties in important life roles. Patients take the OQ-45 at each session, and their change over time is compared to the rate of expected progress, which was based on benchmarking studies of over 11,000 outpatients who completed the same outcome measure.

Since the OQ-45 is geared toward helping us to predict potential treatment failures, an important question is whether it can make this prediction with greater accuracy than therapists using their best judgment after a few sessions. Researchers answered this question by giving the scale to 550 patients over the course of therapy. They found that 40 patients (7.3%) deteriorated by the end of therapy. Therapists were not good at predicting these outcomes, correctly predicting deterioration in only 1 out of 40 patients (a hit rate of 2.5%). But when researchers used an algorithm based on patient responses on the OQ-45, within the first few sessions they accurately predicted treatment failure for every patient who deteriorated (Hannan C et al., J Clin Psychol: In Session 2005;61:155-163).
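The percentages in the study above follow directly from the counts quoted. The short Python sketch below simply reproduces that arithmetic; the variable names and the `pct` helper are illustrative, not taken from the paper. Note that these "hit rates" describe only how many deteriorating patients were caught. The figures quoted here do not say how many on-track patients were flagged in error, so the sketch cannot address false alarms.

```python
def pct(part, whole):
    """Express part/whole as a percentage, rounded to one decimal place."""
    return round(100.0 * part / whole, 1)


# Counts quoted from the study described above.
patients = 550        # patients who completed the scale during therapy
deteriorated = 40     # patients who had deteriorated by the end of therapy
therapist_hits = 1    # deteriorating patients the therapists predicted correctly
algorithm_hits = 40   # "every patient who deteriorated" was flagged by the algorithm

deterioration_rate = pct(deteriorated, patients)      # share of all patients who deteriorated
therapist_hit_rate = pct(therapist_hits, deteriorated)
algorithm_hit_rate = pct(algorithm_hits, deteriorated)

print(f"Deterioration rate: {deterioration_rate}%")
print(f"Therapist hit rate: {therapist_hit_rate}%")
print(f"Algorithm hit rate: {algorithm_hit_rate}%")
```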

But the most intriguing research was geared toward answering more practical questions. If therapists are alerted by the OQ-45 about patients who are “not on track,” can they do something to prevent deterioration? If so, will their patients have better outcomes than patients whose therapists do not receive this feedback? In five studies, a combined total of over 4000 patients were randomly assigned to either OQ-45 feedback or no-feedback conditions (Lambert M, Psychother Res 2007;17:1-14). When therapists were deprived of feedback and had to rely on their own clinical judgment as usual, the deterioration rate for “not on track” patients was 21%; but when therapists received feedback in the form of OQ-45 scores, the deterioration rate decreased to the 5%-13% range, depending on the specificity of the feedback received.
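To get a feel for the size of this effect, the deterioration rates above can be turned into an absolute reduction and a rough "number needed to treat." The Python sketch below is a back-of-the-envelope calculation only: the review is not quoted here as reporting these figures, the function names are hypothetical, and treating pooled percentages as if they came from a single comparison is a simplifying assumption.

```python
def absolute_reduction(control_pct, feedback_pct):
    """Absolute drop in deterioration rate, in percentage points."""
    return control_pct - feedback_pct


def number_needed(control_pct, feedback_pct):
    """Rough count of 'not on track' patients whose therapists would need
    feedback to prevent one deterioration (100 / absolute reduction)."""
    return 100 / absolute_reduction(control_pct, feedback_pct)


no_feedback = 21                     # deterioration rate without feedback (%)
feedback_low, feedback_high = 5, 13  # range of rates with feedback (%)

best_case = number_needed(no_feedback, feedback_low)    # most specific feedback
worst_case = number_needed(no_feedback, feedback_high)  # least specific feedback

print(f"Absolute reduction: {absolute_reduction(no_feedback, feedback_high)}"
      f" to {absolute_reduction(no_feedback, feedback_low)} percentage points")
print(f"Rough number needed: {best_case} to {worst_case} patients")
```

Under these assumptions, feedback would prevent one deterioration for roughly every 6 to 13 "not on track" patients.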

In deciding whether to try implementing the OQ-45 in your practice, you should be aware of some limitations. The patients enrolled in the studies were generally only mildly ill, and nearly all were treated in university counseling centers. They were seeing therapists and not psychopharmacologists, and the OQ-45 has not been researched in psychiatric practices. The feasibility of administering the OQ-45 is questionable. It is most efficiently administered via computer, which would require giving your patients access to a computer in your office. Otherwise, they would take the pencil and paper version, and then your clerical staff would have to input the responses into the computer so that the software can generate feedback. The cost of the software is $150 for installation (one time charge) and $200/year for unlimited uses – not prohibitively expensive if it helps a few patients get better. You can find more information on their website at

