“An Investigation into the ability of the Edinburgh Postnatal Depression Scale to detect depressive symptomatology after childbirth”

By Vera Auerbach


The primary aim of the present study was to investigate the usefulness of the Edinburgh Postnatal Depression Scale (EPDS) in detecting depressive disorders and/or adjustment disorders in the 12 months following childbirth. Specifically, it was hoped to offer a comparison between the EPDS and the Beck Depression Inventory (BDI). The BDI is often criticised for its somatic loading and is thus assumed to be an invalid tool in the puerperium. Various aspects of the BDI were explored in relation to the EPDS. A sample of 193 women participated in the study. The five research groups consisted of women who had not given birth in the last 12 months (with or without children), mothers who had given birth in the last 12 months (but were neither depressed nor anxious), mothers who were diagnosed as suffering from an adjustment disorder or a depressive illness, female clients of various Community Health Centres who were diagnosed as suffering from an adjustment disorder or a depressive illness, and female clients of Community Health Centres who were diagnosed as suffering from an anxiety disorder. Diagnoses were made via the Structured Clinical Interview for DSM-III-R (SCID). The BDI short form was found to be the most reliable predictor of depression in the puerperium. Mean EPDS scores obtained by clients with anxiety disorders were in the clinical range when a cut-off score of >12 was used. The EPDS therefore appears to be more a measure of ‘psychological distress’ (which includes anxiety). It can still be used as a general screening instrument for women in the puerperium, but not necessarily for screening of depression at this time.

A research project submitted in partial fulfilment of the requirements for the award of the degree of MASTER OF ARTS (HONOURS) IN CLINICAL PSYCHOLOGY from THE UNIVERSITY OF WOLLONGONG

By Vera Auerbach, Department of Psychology, 31st May 1995, Copyright © 1995 by Vera Auerbach

Dedication: I would like to dedicate this thesis to David Lonie who helped me develop my capacity to think, as well as putting me in touch with my feelings/emotions. His support, encouragement, time and humanness have provided me with an invaluable life experience that I will never forget. Vielen Dank for bearing with and being there for me!

Acknowledgments: I would like to thank:

  • John Freestone: Firstly, for taking me on as supervisee when no-one else seemed to want to. Secondly for bearing with me and for taking an interest in my research. You provided me with the structure, faith and input I needed. I also enjoyed your great sense of humor! Thank you.

  • Nigel Mackay and Peter Caputi: for their time and input into my thesis, their support when I was considering conversion to a Ph.D., as well as help with writing some of the statistics-programs.

  • Brin Grenyer, Esme Nasser and Stuart Horstman: for help with patient proof-reading and providing me with useful comments

  • Garry Stevens and the ‘Hurstville Gang’: for their support in providing me with subjects, for putting up with me and for the fun we have

  • Gary Vaughan: for listening to me while I was developing my ideas and for religiously scoring all 193 questionnaires

  • All the mothers and women who agreed to participate!

  • Special Thanks and acknowledgment also goes to: Greg Aldridge, Possum Cottage, Jon Rist, Anne Porter, Canterbury Community Health Centre, Sylvia Smith, Andrew Long and Nick Cotsios.


List of Tables:

Table 1. Summary of validation studies of the EPDS

Table 2. Recent studies using the BDI to measure depression (after childbirth)

Table 3. Demographic Characteristics by Research Groups

Table 4. Questionnaire Means and Standard deviations by research group

Table 5. Comparison of sensitivity for BDI and EPDS in identifying Depressive and Anxiety disorders

Table 6. True Positive rates for all questionnaires for CBD women

Table 7. Efficiency of Questionnaires in detecting depression against SCID diagnoses

I. Introduction

Postnatal depression (PND) has recently received much attention in NSW. The NSW Women’s Consultative Committee (1994) and the Minister for Health have targeted it with special funds. Universal screening for PND with the Edinburgh Postnatal Depression Scale (EPDS) of every woman who had recently delivered was introduced in 1994 by the Minister for Health (NSW Health Department, 1994; Milne, 1994). The EPDS has been translated into 11 major ethnic languages (Cox & Holden, 1994) with a view to using it across cultures to identify PND.

The PND Definitional Dilemma:

Much debate exists when it comes to defining PND (Hopkins, Marcus & Campbell, 1984; Whiffen, 1992). The American Psychiatric Association (APA, 1987) did not include ‘postnatal depression’ as an official diagnosis in its classification system, the Diagnostic and Statistical Manual of Mental Disorders, 3rd Edn. – revised (DSM-III-R). However, its newly published DSM-IV (APA, 1994a) introduced the specifier “with Postpartum Onset”, referring to onset of a Depressive, Manic or Psychotic episode within 4 weeks postpartum (APA, 1994b, p.194). The introduction of this specifier was meant to reflect evidence that its “presentation may have implications for prognosis and treatment selection” (APA, 1994a, p.781).

The World Health Organisation, in its classification system the ‘ICD-10 Classification of Mental and Behavioural Disorders: Clinical descriptions and diagnostic guidelines’ (ICD-10), has an actual listing for ‘postnatal depression NOS’ under the umbrella of “F53: Mental and behavioural disorders associated with the puerperium, not elsewhere classified” (WHO, 1992, p.195). The ICD-10, unfortunately, does not list the criteria that must be present for these diagnoses to be made. Thus, understandably, much confusion reigns when it comes to a definition of ‘PND’, as many researchers and clinicians define PND differently.

Depression after Childbirth:

Whiffen and Gotlib (1993) claim that depression after childbirth does not differ qualitatively from that occurring at other times, except that it may be milder. In research and clinical practice, however, a tradition seems to exist that considers depression after childbirth a distinct diagnosis, as is observable by simply scanning titles of publications. Two very distinct schools of thought have emerged. One believes in a distinct diagnosis of puerperal disorders (e.g. Affonso, Lavett, Paul & Sheptak, 1990; C.T. Beck, 1992; Boyce, Stubbs & Todd, 1993; Cox, 1986; Cox, Holden & Sagovsky, 1987; Pitt, 1968). The other is opposed, firstly, to classifying a disorder with respect to its aetiology (e.g. APA, 1987, 1994a, 1994b) and, secondly, holds that PND is no different from depression in general (e.g. APA, 1987, 1994a, 1994b; O’Hara, Zekoski, Philipps & Wright, 1990; Whiffen, 1991, 1992; Whiffen & Gotlib, 1993).

Whiffen and Gotlib (1993) have summarised four main lines of argument that have been put forward to support the stand that PND is qualitatively different from nonpostpartum depression. Firstly, if PND is a distinct diagnosis, then it should be related aetiologically to some variable, e.g. childbirth, that is not present in the development of depression at other times. O’Hara et al. (1990) as well as Gotlib, Whiffen, Wallace and Mount (1991) compared a sample of postpartum depressed women with a sample of nonpostpartum depressed women, and found no difference with regard to depression [defined by Research Diagnostic Criteria (RDC; Spitzer & Endicott, 1978; Spitzer, Endicott & Robins, 1978a, 1978b)]. Whiffen and Gotlib (1993) also found little evidence to distinguish postpartum from nonpostpartum depression apart from differences in symptom severity. Thus, one could conclude that PND is not a distinct disorder but rather a depressive illness that is milder in its presentation. In fact, it may be that half of what are presently called ‘postnatal depressions’ could be diagnosed as Adjustment disorders with depressed, anxious or mixed mood. If DSM-III-R diagnostic criteria are used to determine depression after childbirth, then PND and Major Depressive Episodes (MDE) become one, and no separate aetiological explanations exist. This then becomes a circular argument, as one will only find what is defined by one’s criteria.

Secondly, it is generally claimed by advocates of PND as a distinct diagnosis that this type of depression is more common in the postpartum than at any other time (e.g. Cutrona, 1983; Hopkins, Marcus & Campbell, 1984; O’Hara, Neunaber & Zekoski, 1984; O’Hara, Rehm & Campbell, 1983). Whiffen (1992) recently reviewed the evidence regarding the prevalence of PND by comparison with epidemiological studies of depression occurring in the general population, and concluded that the risk of minor depression (RDC) is elevated in the puerperium. If, however, DSM-III-R criteria were used to define depression at any point in time, this finding would not hold. It may well be that an Adjustment disorder with depressed mood is more common in the puerperium, but research to support this is limited (Terry & Hynes, personal communication, 1994; Whiffen & Gotlib, 1993). If one takes the RDC category of minor depression to equal an Adjustment disorder, then there may be some support. This, however, would take one outside the realm of major mood disorder; there is therefore little support for PND as a distinct mood disorder diagnosis. Rather, more support exists for it to be classed as an Adjustment disorder, which covers a maladaptive reaction to an identifiable psycho-social stressor (childbirth).

The third line of argument comprises Pitt’s (1968) view that PND has a different clinical picture from other depressive episodes. He observed that ‘PND’ was milder, suicidal ideation was less common, and reports of terminal insomnia were less frequent. Furthermore, he noted an increased level of anxiety and irritability. Pitt’s observation that PND is comparatively milder has received indirect support from the fact that many women diagnosed with ‘postnatal depression’ (RDC minor depression) obtained Beck Depression Inventory (BDI; Beck, Ward, Mendelson, Mock & Erbaugh, 1961; Beck & Steer, 1987, 1993) scores below the traditional cut-off score for mild (BDI) depression (Whiffen & Gotlib, 1993). The BDI is seen as a reliable tool for picking up depression in general (Beck, Steer, & Garbin, 1988; Gotlib & Cane, 1989). This again suggests that it is not necessarily MDE we are dealing with, but rather Adjustment disorders or possibly Anxiety disorders. Whiffen and Gotlib (1993) also report that PND is less persistent than is typical of depression. From this research one can conclude that we are looking at a milder version of the ‘same picture’ rather than a different clinical picture.

Lastly, the fourth factor thought to distinguish PND is psychiatric history. Kumar and Robson (1984) and other health professionals appear to believe that PND occurs ‘out of the blue’ after childbirth in women who have previously been emotionally stable (Cox, Kumar, Oates, Foreman & Anderson, 1992). However, others have found that women are more likely to become depressed in the puerperium, if they have previously had emotional problems (O’Hara, 1991; O’Hara, Neunaber, & Zekoski, 1984; Whiffen & Gotlib, 1993). Whiffen (1992) argues that the construct of ‘postnatal depression’ is of limited value, if episodes only occur among women who are vulnerable to depression at other times. She challenges the whole concept of PND because, if postnatal depression occurs only in vulnerable women, then the higher prevalence may simply be due to the added stress of childbirth, not any independent aetiological factor.

In summary, disturbances in psychological functioning after childbirth do exist but may be better referred to as a MDE or adjustment disorder, so that consistent research in this area can take place. To continue to use the term ‘postnatal depression’ perpetuates the use of a construct that cannot be scientifically tested. In fact, many clinicians are also abandoning the construct because it has so many varied meanings, and have started to use the term ‘postnatal distress’ (Cox & Holden, 1994) or ‘dysphoria’ (Green & Murray, 1994).


The aetiology of PND – A Stress-Coping Model?

Many researchers have tried to find an aetiology for PND. Demographic variables such as socioeconomic status (Cox, Connor, & Kendell, 1982; Gotlib, Whiffen, Mount, Milne, & Cordy, 1989; O’Hara et al., 1984), age (Gotlib et al., 1989; Hickey, Boyce, & Ellwood, 1995; Hopkins et al., 1984; O’Hara et al., 1984) and parity (Gotlib et al., 1989; Gotlib et al., 1991; Hickey et al., 1995; Hopkins et al., 1984; Kendell, Rennie, Clark & Dean, 1981) have, in the majority of cases, been found to be unrelated to PND.

Equivocal findings have been reported for variables such as delivery complications (e.g. Cox et al., 1982; Hickey et al., 1995; O’Hara et al., 1984; Terry & Hynes, personal communication, 1994; Warner, 1995; Whiffen, 1988b) and hormonal factors (e.g. Harris, Johns, Fung, Thomas, Walker, Read & Riad-Fahmy, 1989; O’Hara, Schlechte, Lewis & Varner, 1991).

More recent research has found some support for a poor spousal relationship precipitating depression after childbirth (Gotlib et al., 1991; Terry & Hynes, personal communication, 1994; Whiffen & Gotlib, 1993; Whiffen, 1988a). Researchers have also found fairly consistent evidence that the presence of previous emotional problems (especially prior episodes of depression) is predictive of PND (e.g. Gotlib et al., 1991; O’Hara, Schlechte, Lewis, & Varner, 1991; Whiffen & Gotlib, 1993). Others (e.g. Cox et al., 1982; Dalton, 1971; Kumar & Robson, 1984; Pitt, 1968) had no such findings; these earlier studies, however, have been superseded by the more recent findings. Moreover, after controlling for all the above variables, evidence also tends to suggest that other psychosocial variables such as locus of control and social support (O’Hara, 1986; Watson, Elliott, Rugg & Brough, 1984; Whiffen, 1988b) consistently predict PND (Terry & Hynes, personal communication, 1994).

Deborah Terry (Terry, 1991a, 1991b, 1992; Terry & Hynes, personal communication, 1994; Terry, McHugh & Noller, 1991) has been researching the utility of a stress-coping model to explain depression after childbirth. Specifically, she proposes that the level of stress experienced across the puerperium, the coping efforts that new mothers use to deal with the birth of their child, and the extent to which they have access to coping resources will influence the likelihood that they will suffer symptoms of postpartum depression. Similar proposals have been made by Billings and Moos (1985) as well as Barnett and Gotlib (1988). All go a long way in advancing an aetiological model for depressive symptomatology after childbirth.


The Edinburgh Postnatal Depression Scale (EPDS):

John Cox (1986) developed the EPDS specifically to measure PND and to aid detection of postpartum psychiatric disorder (Murray & Carothers, 1990). Cox and others (Boyce et al., 1993; O’Hara et al., 1990) felt that traditional depression inventories (in particular the BDI) were highly loaded on somatic complaints, something seen as a natural part of childbirth. Cox therefore set out to develop his own 13 item scale which was validated on a sample of 63 mothers. Rotated factor analysis of the data suggested that specificity would be increased if 3 items (two irritability items and one item concerning new motherhood) were omitted because they formed a ‘non-depression’ factor. Thus, the EPDS was revised to become a 10 item scale, and was validated on a sample of 84 new mothers, identifying 35 mothers with a possible diagnosis of ‘PND’ by scoring over 12 (Cox et al., 1987; see Appendix I for a copy of the EPDS).

Limitations of Cox’s studies (Cox, 1986; Cox et al., 1987) include a relatively small sample (n = 84), non-random selection of subjects (i.e. women were identified by health professionals as possibly depressed), and the fact that the study was not prospective. Depression was diagnosed by RDC (Spitzer, Endicott & Robins, 1975; Spitzer et al., 1978b), resulting in 21 definite major, 3 probable major and 11 definite minor depressions. Due to the number of false negatives (4 of 11), Cox et al. (1987) suggest that a cut-off score of >9 be used for routine screening purposes.

Cox et al. (1987) go on to suggest that the EPDS may be useful for screening depression in general and suggest that research on other clinical populations should be carried out. Green and Murray (1994) also suggest that the EPDS can be used as a screening instrument for depression generally, moving away from its original justification as a specific depression screening measure in the puerperium. Cox has also suggested that the EPDS may be useful in diagnosing depression in fathers, and goes on to say that “the scale may be renamed for this purpose the … Edinburgh Depression Scale” (Cox & Holden, 1994, p.123). No validation of the EPDS in these different populations has however been carried out. Nor has justification been given why the EPDS may be a better measure of depressive symptomatology than any of the existing depression measures. Another concerning trend has been the exclusive reliance on an EPDS score for the diagnosis of depression, and its use as a measure of treatment outcome (Holden, 1994). Research has not validated the EPDS for such use.


Validation Studies of the EPDS:

Research on the validation of the EPDS has been summarised in Table 1. (All studies have some methodological limitations, but space does not permit a full critique.) The findings of Harris, Huckle, Thomas, Johns and Fung (1989) are difficult to interpret because of methodological limitations, i.e., the sample was not randomly selected, no standard psychiatric interview was used and almost half of the sample were included because of hyperthyroidism, a condition that invariably presents with anxiety and depression (Gelder, Gath & Mayou, 1989). Furthermore, the EPDS was completed only after a full psychiatric interview, which is very likely to have sensitised the women to depressive symptoms that might not otherwise have been acknowledged (Murray & Carothers, 1990). For the above reasons, both Cox’s (Cox, 1986; Cox et al., 1987) and Harris’s (Harris, Huckle, et al., 1989; Harris, Johns, et al., 1989) studies are likely to overestimate the true utility of the EPDS (Murray & Carothers, 1990).

Table 1: Summary of validation studies of the EPDS (cut-off >12)

Specificity = proportion of non-depressed women correctly identified; sensitivity = proportion of depressed women correctly identified; positive predictive value (PPV) = proportion of women identified who are truly depressed.

Boyce, Stubbs & Todd (1993), N = 103
Specificity: 95.7%. Sensitivity: 100%. PPV: 69.2%. Sample: MDE only (MDE n = 9).

O’Hara, Hoffman, Philipps & Wright (1992), N = 266
Specificity: not reported. Sensitivity: not reported. PPV: not reported. Sample: USA, paid; RDC, interview over phone.

Murray & Carothers (1990), N = 646
Specificity: [value missing]. Sensitivity: [value missing] (Major = 81.1, Minor = 52). PPV: [value missing] (Major = 43, Minor = 23.6). Sample: British, married, primiparous, 20-40 yrs, min 37 & max 42 wk pregnancy, birthweight > 2.5 kg.

Harris, Huckle, Thomas, Johns & Fung (1989), N = 147
Specificity: [value missing] (vs. BDI 88%). Sensitivity: [value missing] (vs. BDI 68%). PPV: not given. Sample: British, thyroid study.

Cox, Holden & Sagovsky (1987), N = 84
Specificity: [value missing]. Sensitivity: [value missing]. PPV: [value missing]. Sample: British, preselected.


Murray and Carothers (1990), as well as Carothers and Murray (1990), validated the EPDS on a community sample of 702 women in the United Kingdom at 6 weeks postpartum, using RDC for depression. They only included women who were married or de facto, primiparous and aged 20-40 (with a 37-42 week pregnancy and a birthweight of at least 2.5 kg). It is questionable how representative this sample is of childbearing women, because many women develop PND after their second or third child, some women choose to be single mothers, and more women are having their first child when over 40. Unfortunately, their results can therefore only be applied to women who match the above criteria. The refusal rate was 1.3%. The EPDS was mailed at 6 weeks and 674 women (97.3%) returned it. Of the remaining 646 (92% of the original sample), all those with an EPDS score ≥ 13 were interviewed. A random sample of those scoring 10-12 was also interviewed, using the Standard Psychiatric Interview with additional questions to yield RDC diagnoses. In addition, one in ten of those with a score <10 were interviewed. Interviewers were blind to the interviewees’ EPDS scores.

In Table 1 definitions of specificity (true negatives / [true negatives + false positives]), sensitivity (true positives / [true positives + false negatives]) and positive predictive value [PPV] (true positives / [true positives + false positives]) are offered. It is noteworthy that estimates of those dimensions that are of most relevance to the health professional (sensitivity and positive predictive value) are substantially lower in Murray and Carothers’ (1990) study than in other studies. A PPV, which can be seen as similar to a validity coefficient, of lower than 60% (as is the case for MDE in Table 1, where the PPV is 43%) is generally seen as very low (Gathercole, 1968). Equally, once specificity falls below 85% the test is of little value: with a prevalence of 10-12% for depression after childbirth (Cox et al., 1987; Murray & Carothers, 1990), the specificity should be greater than 88%, because simply classifying every woman as not depressed would already be right about 88% of the time (Gathercole, 1968). From Murray and Carothers’ study one can extrapolate that if one used an EPDS cut-off score of >15 one would come closer to a PPV of 64.3% for a MDE. With a screening test one is, of course, concerned with reducing false negatives, while false positives are of less concern. Murray and Carothers go on to claim that the EPDS is also good at predicting severity of depression, i.e. mothers with high scores on the EPDS were found to be much more likely to suffer from a major depressive episode than were those with low scores.
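The way PPV depends on prevalence and specificity can be illustrated with Bayes' rule. The following is a minimal sketch only: the function name and the 80% sensitivity figure are assumptions chosen for illustration and are not taken from any of the studies cited; only the 10% prevalence and the specificity thresholds echo the discussion above.

```python
def ppv_from_rates(sensitivity: float, specificity: float, prevalence: float) -> float:
    """Positive predictive value from sensitivity, specificity and prevalence (Bayes' rule)."""
    true_positive_rate = sensitivity * prevalence
    false_positive_rate = (1.0 - specificity) * (1.0 - prevalence)
    return true_positive_rate / (true_positive_rate + false_positive_rate)


# Hypothetical test with 80% sensitivity, applied where prevalence is 10%.
prevalence = 0.10
for specificity in (0.85, 0.88, 0.95):
    ppv = ppv_from_rates(0.80, specificity, prevalence)
    print(f"specificity {specificity:.2f} -> PPV {ppv:.3f}")

# Accuracy obtained by calling every woman 'not depressed' at this prevalence.
print(f"baseline accuracy: {1.0 - prevalence:.2f}")
```

Under these assumed figures the PPV rises from roughly 0.37 at a specificity of 85% to 0.64 at 95%, which shows why modest losses in specificity matter so much when the condition is relatively uncommon.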

Boyce, Stubbs and Todd (1993) conducted a study relevant to the use of the EPDS in Australia. They interviewed 103 mothers with the Diagnostic Interview Schedule (DIS) for a DSM-III-R diagnosis of MDE and indicate that a score of 12+ would be the most useful cut-off score for an Australian population (see Table 1), in line with Cox et al. (1987). Their depressed sample was relatively small (n=9, and this was supplemented with women who were presenting for treatment of PND, not just those found through screening), but their results were very similar to those reported in the United Kingdom. No mention is made of adjustment disorders with depressed mood. For the above cut-off score they obtained a specificity of 95.7%, a sensitivity of 100% and a positive predictive value of 69.2% (refer to Table 1).
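As a worked check of how such figures arise from a two-by-two classification table, the sketch below computes the three indices from cell counts. The counts are an assumption: they are not reported in the text, but were chosen because 9 depressed and 94 non-depressed women (103 in total) with 4 false positives reproduce the percentages quoted above. The function name is illustrative.

```python
def screening_metrics(tp: int, fn: int, fp: int, tn: int) -> tuple[float, float, float]:
    """Return (sensitivity, specificity, PPV) from the four cells of a 2x2 table."""
    sensitivity = tp / (tp + fn)  # depressed women correctly identified
    specificity = tn / (tn + fp)  # non-depressed women correctly identified
    ppv = tp / (tp + fp)          # women scoring above cut-off who are truly depressed
    return sensitivity, specificity, ppv


# Assumed counts (not taken from Boyce et al.): 9 depressed, 94 non-depressed.
sens, spec, ppv = screening_metrics(tp=9, fn=0, fp=4, tn=90)
print(f"sensitivity {sens:.1%}, specificity {spec:.1%}, PPV {ppv:.1%}")
# prints: sensitivity 100.0%, specificity 95.7%, PPV 69.2%
```

With only nine depressed women, a single missed case would lower sensitivity from 100% to about 89%, which is one reason small validation samples should be read cautiously.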

It remains questionable whether the EPDS really measures depression in the puerperium or if it is a general reflection of depression and anxiety, better labeled ‘psychological distress’. An aim of this research was to question the specificity of the EPDS.


Importance of Beck Depression Inventory version in research:

A novel aspect of this study was the examination of the usefulness of the BDI 1978 revised version (Beck & Steer, 1987, 1993) in postnatal research, as no study to date had done this (see Table 2). It was felt especially important to use the 1978 version after Sacco (1981) found that 1961 BDI scores represented feelings for the day on which the BDI was administered. In the 1961 version, patients are asked to rate themselves “right now”, whereas in the revised BDI (1978 version) patients are asked to describe themselves for the “past week, including today”. Thus, Sacco (1981) suggests that the original version measured a present ‘state’ component, whereas the revised version assesses a more persistent ‘trait’. Therefore, one could argue that none of the studies in Table 2 can claim that they truly measured depression as a persistent feature or mood. This simple oversight makes the applicability of their results questionable in relation to this area of research.


BDI and Somatic Symptoms in the Puerperium:

The BDI has often been criticised (Boyce et al., 1993; Cox, 1986; O’Hara et al., 1990) for its use in the puerperium because it is argued that it may overstate the severity of depressive symptomatology in childbearing women due to the many normal physiological changes after childbirth said to be similar to symptoms of depression (e.g., loss of sexual interest, appetite change, fatigue and changes in sleep patterns).

Terry and Hynes (1995) tried to overcome this problem by excluding items 17, 19, 20 and 21 and using a cut-off score of ≥ 10. Nevertheless, the BDI has been used in research during pregnancy and the puerperium, especially in the United States of America (e.g. O’Hara, Rehm & Campbell, 1983; O’Hara et al., 1984) and Canada (e.g. Whiffen & Gotlib, 1993), where it has not received the same extent of criticism as in the United Kingdom. An outline of these studies is presented in Table 2.


Specific BDI Norms for Australian Childbearing Women:

Another aim of this study was to answer the call by O’Hara et al. (1984) to provide norms for puerperal women on the BDI, in this case Australian norms, bearing in mind that somatic symptomatology might be naturally elevated. They also advocate recruiting a community sample of nonpregnant women who are more comparable to puerperal subjects than their depressed subjects were. This study, therefore, set out to include women who were depressed (but had not had a child in the last 12 months), women who were not depressed and had not had a baby in the last 12 months, and women who had given birth in the last 12 months.

Lastly, it was hoped that the cognitive-affective subscale (13 items) or the BDI short form (13 items) might offer a more accurate and similarly brief measure of depression in the postpartum compared to the EPDS, if the overall BDI score was not able to. Even though Gotlib et al. (1991) found that dysfunctional cognitions were the only variable that did not differentiate depressed from nondepressed childbearing women, no-one has ever examined the cognitive-affective subscale in comparison to the EPDS. It should clearly have an advantage over the 21-item BDI, as it excludes the somatic components.

Table 2. Recent studies using the BDI to measure depression (after childbirth)

Study, BDI version (1961 or 1978) [entries missing], and details:

Terry & Hynes (1995): N=197, primiparous women only, BDI items 17, 19, 20 and 21 excluded, BDI cut-off ≥ 10, EPDS cut-off ≥ 11, HRSD used as diagnostic interview, average EPDS score 14.

Whiffen & Gotlib (1993): N=158, mean BDI for CB dep 14.7, CB non-dep 5.3, Non-CB dep 21.7 and Non-CB non-dep 5.6; RDC used; found little to distinguish postpartum from nonpostpartum depression except symptom severity. BDI cut-off >9, diagnostic interview by phone.

O’Hara, Hoffman, Philipps & Wright (1992): N=130, BDI used as diagnostic indicator.

O’Hara, Schlechte, Lewis & Varner (1991): N=361, RDC and DSM-III criteria used for diagnosis, prospective design.

Gotlib, Whiffen, Wallace & Mount (1991): N=730, RDC; mean BDI for CB dep 14.3, CB non-dep 5.2; BDI ≥ 10.

Philipps & O’Hara (1991): N=70, RDC.

O’Hara, Zekoski, Philipps & Wright (1990): N=182, compared cognitive-affective subscale to somatic scale.

Harris, Huckle et al. (1989): see earlier description.

Harris, Johns et al. (1989)

O’Hara, Neunaber & Zekoski (1984): N=99, used SADS & RDC; Ss paid, recruited in 2nd trimester; BDI given at 3, 6 and 9 wks and by phone at 6 mths; 8 MDE & 4 minor depression at 9 weeks postpartum. Symptom onset 1st-6th wk postpartum, mean duration 3.3 wks. Mean BDI scores at 3 wks = 6.79, 6 wks = 5.58, 9 wks = 4.43, 6 mths = 4.45. Somatic items significantly higher than cognitive-affective subscale.

Gotlib, Whiffen, Mount, Milne & Cordy (1989): N=295, those with BDI ≥ 10 interviewed.

Whiffen (1988): N=120, MDE n=6 (RDC minor n=4); BDI specificity 86%, sensitivity 48%, false negative 52%, false positive 14%. Concluded BDI not a satisfactory screening instrument for postpartum depression research.



The hypotheses of this study are summarised below:

1) More mothers with children under 12 months who have SCID-diagnosed depressive symptomatology will be correctly identified by the EPDS than by the different versions of the BDI.

2) The EPDS will have a greater sensitivity rate (i.e. proportion of depressed women correctly identified) for both depressed groups than will the following BDI versions: cognitive-affective subscale, BDI (total) and BDI short form.

3) An EPDS score above the cut-off of 12 will have higher sensitivity for depressive disorders than for those diagnosed with Anxiety Disorders.

4) A mean BDI (total) score will differentiate across the depressed clinical group, PND group, anxiety group and control groups. Specifically, the mean BDI score for the depressed group will be greater than that for the PND group, which will be greater than that for the anxiety group, which in turn will be greater than the mean BDI scores for both control groups. Thus, the following ordering of mean BDI scores is predicted:

Depressed > PND > Anxiety > both control groups

This ordering is expected because PND is generally viewed as a milder form of depression, which may not meet stringent diagnostic criteria although some depressive symptomatology is present (thus more like an Adjustment disorder).


II. Method

Research Participants:

Female research participants (n=193) were drawn from a number of different sources to make five groups. Sources included women attending or working in a number of Community Health Centres (Hurstville, Rockdale, Peakhurst, Sutherland, and Canterbury) or Hospitals (St. George & Sutherland), or friends thereof. New mothers attending baby health centres in the St. George Area Health Service and a daystay unit (Possum Cottage) also participated.

Participants were divided into the following five groups:

1. Childbearing Screening Group (CB, n=58): Mothers with at least one infant up to 12 months (may have had older children as well).

2. Non-childbearing Screening Group (NCB, n=62): Women with children over 12 months or without children, who were not pregnant.

3. Childbearing Depressed Group (CBD, n=13): Mothers who were depressed when interviewed, with at least one child under 12 months.

4. Non-childbearing Depressed Group (NCBD, n=14): Women with or without children over 12 months with a diagnosis of Major Depressive Episode, Dysthymia or Adjustment Disorder with depressed/mixed mood as per SCID.

5. Anxiety Disorders Group (NCBA, n=10): Women with or without children over 12 months of age who were diagnosed as suffering from Panic Disorder with or without Agoraphobia, Simple or Social phobia, Generalized Anxiety Disorder or an Adjustment Disorder with anxious mood as per SCID diagnosis.

Thus, there were two groups of women who had children under 12 months [normal (CB), depressed (CBD)] and three groups of women who did not have a child under 12 months [normal (NCB), depressed (NCBD), anxious (NCBA)]; see Table 3. Women were eligible for participation if they were over 18 years of age and spoke and read English fluently.

Statistical analyses reported in the present paper excluded those subjects who scored above the cut-offs and declined to be interviewed (n=30) and subjects who were interviewed and then given a dual diagnosis of anxiety and depressive disorders (n=6). The remaining subjects (n=157) ranged in age from 18 to 62 years (mean = 32 yrs; SD = 7.29).


Depressive Symptomatology:


O’Hara et al. (1992) report that the BDI has good psychometric properties. It has been used frequently in research on general depression (Rehm, 1988) and postpartum depression (Cutrona, 1983; O’Hara et al., 1984; Whiffen & Gotlib, 1993; see Table 2). It is probably the most widely used clinical self-report test of depression (Hersen & Bellack, 1988; Sundberg, 1992), and was originally designed to measure the severity of depression in already diagnosed patients (Conoley, 1992; Sweetland & Keyser, 1986, 1991). The BDI consists of 21 items with four options per item; the reading level is estimated at fifth grade (Conoley, 1992; Sundberg, 1992). The first page of the form contains the 13 items that make up the cognitive-affective subscale, and on the reverse side 8 items form the somatic-performance subscale. Additionally, a short form of the BDI has been developed (Beck, Rial & Rickels, 1974; Reynolds & Gould, 1981). Internal consistency, as measured by Cronbach’s coefficient alpha (Beck et al., 1988; Beutler, Corbishley & Hamblin, 1988; Conoley, 1992), ranged from .73 to .95 across 25 studies. The mean coefficient alpha was .81 for 15 nonpsychiatric populations and .86 for psychiatric populations.

Beck and Steer (1993) suggest a cut-off score for the BDI (total) of ≥ 10 when dealing with depressed outpatients. This should pick up all mild (a score of 10-16), moderate (17-29) and severe (30-63) cases of depression. Stehouwer (1985) suggests that the practitioner who is in need of a simple, quick and helpful tool in gathering information about a person’s depressive state would do well to use the BDI and suggests that in any situation where a screening device is necessary for depression, the BDI “would seem to be the test of choice” (p.85). Steer, Beck, Riskind and Brown (1986) support this view. Considerable debate exists over what cut-off scores to use for detecting depression in adult normal populations. Hersen & Bellack (1988) suggest that scores >18 are indicative of possible depressive symptomatology. Sundberg (1992) suggests that with the general population a score greater than 15 may suggest depression. This view is also supported by Boyce (1995). Terry and Hynes (1995) and most other researchers (e.g. Gotlib et al., 1991; Whiffen & Gotlib, 1993) used a cut-off of 10 or over, which is said to be indicative of mild depression (Spreen & Strauss, 1991). A cut-off score of 8+ is advised on the 13-item Beck short form (Knight, 1984; Spreen & Strauss, 1991; Vredenburg, Krames & Flett, 1985), which has a reported reliability of .93, test-retest reliability of .90, internal consistency of test items of .86 and concurrent validity with the HRSD of .82. On the cognitive-affective subscale a cut-off score of >10 is advised by Beck and Steer (1987).
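The BDI (total) bands quoted above from Beck and Steer (1993) can be expressed as a simple lookup. The following is a minimal sketch only; the function names `bdi_severity` and `screens_positive` and the label "below cut-off" are illustrative assumptions and do not come from the BDI manual. The alternative cut-offs discussed above (e.g. >15 or >18 for general populations) can be passed in through the `cutoff` parameter.

```python
def bdi_severity(total_score: int) -> str:
    """Map a BDI total score (0-63) to the bands quoted from Beck and Steer (1993)."""
    if not 0 <= total_score <= 63:
        raise ValueError("BDI total scores range from 0 to 63")
    if total_score < 10:
        return "below cut-off"  # label is an assumption, not a term from the manual
    if total_score <= 16:
        return "mild"
    if total_score <= 29:
        return "moderate"
    return "severe"


def screens_positive(total_score: int, cutoff: int = 10) -> bool:
    """True when the score is at or above the chosen cut-off (>= 10 by default)."""
    return total_score >= cutoff
```

With these bands a total score of 12, for example, would be classed as mild and would screen positive at the default cut-off, but not at a cut-off of 16.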

The authors of the BDI recognise that depression is common in many disorders but claim that the BDI can differentiate between major depression and anxiety disorders (Beck & Steer, 1987, 1993; Steer et al., 1986). In summary, the BDI is a well researched assessment tool with substantial support for its reliability and validity.



Even though Terry and Hynes (1995) used a cut-off of 11 or above to indicate PND, the general consensus appears to be a cut-off of greater than 12 (Boyce et al., 1993; Cox et al., 1987; Harris, Huckle et al., 1989; Murray & Carothers, 1990). With this cut-off, Cox et al. (1987) cite a validity coefficient of 0.73 and a split-half reliability of 0.88. This, however, was described as inflated by Murray & Carothers (1990), who more thoroughly investigated the EPDS. They cite a validity coefficient of 0.66 for the EPDS (see Table 1). The recommended cut-off score of 12+ (Boyce et al., 1993; Cox et al., 1987; Harris, Huckle et al., 1989; Murray & Carothers, 1990) was used to pick up potential subjects. This cut-off has been used in most studies (Boyce et al., 1993; Cox et al., 1987; Harris, Huckle et al., 1989; Murray & Carothers, 1990), and has been recommended as the cut-off by the Minister for Health (NSW Health Department, 1994; Milne, 1994). It has been reported to have a sensitivity of 100% at best (Boyce et al., 1993) and 67% at worst (Murray & Carothers, 1990).

As an aim of this research was to question the specificity of the EPDS, it was administered to clients with depressive and anxiety disorders. A high score on the EPDS was designed to identify postnatal depression. If, for example, EPDS scores were >12 in a population of depressed or anxious clients who had not given birth in the last 12 months, this would call into question whether the EPDS is measuring a specific syndrome (PND, as claimed) or rather a general level of psychological distress. If, on the other hand, clients with an MDE scored just as high on the EPDS as those with ‘PND’, then it may be concluded that the EPDS measures depression per se. This then further challenges the concept of postnatal depression as a separate diagnostic entity. By similar reasoning the same tests were also given to clients with anxiety disorders. Again, this was a way to question the EPDS’s specificity.
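The reasoning above amounts to comparing, group by group, the proportion of clients who score above the EPDS cut-off. A minimal sketch follows, assuming the >12 cut-off discussed earlier; the function name and all scores shown are hypothetical and are not data from this study.

```python
def proportion_above_cutoff(scores, cutoff=12):
    """Return the fraction of scores strictly greater than the cut-off."""
    if not scores:
        raise ValueError("at least one score is required")
    return sum(score > cutoff for score in scores) / len(scores)


# Hypothetical EPDS scores, for illustration only (not data from this study).
hypothetical_groups = {
    "NCBD": [18, 15, 13, 20, 11],
    "NCBA": [14, 9, 13, 16, 12],
    "NCB": [3, 5, 8, 2, 13],
}

for name, scores in hypothetical_groups.items():
    print(f"{name}: {proportion_above_cutoff(scores):.0%} score above 12")
```

If non-childbearing depressed or anxious groups show proportions similar to those of childbearing depressed women, the scale would seem to be picking up more than a childbirth-specific syndrome.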

The second aim was to examine how well the EPDS and the BDI compare with each other in detecting depressive symptomatology after childbirth. This was measured against a diagnostic interview, namely the RDC successor, the Structured Clinical Interview for the DSM-III-R [SCID] (Spitzer, Williams, Gibbon & First, 1990a, 1990b).

Diagnostic Measure:

SCID: Structured Clinical Interview for the DSM-III-R:

The SCID (Spitzer et al., 1990b) is a semistructured interview for making the major Axis I & II diagnoses (Spitzer, Williams, Gibbon, & First, 1992). It was developed by Spitzer, Williams, Gibbon and First (1990a, 1990b). In the present study only Axis I diagnoses were made due to time constraints and because it was felt Axis II diagnoses were a separate factor to be investigated in relation to depression after childbirth. The SCID provides a current DSM-III-R diagnosis as well as a lifetime diagnosis.

Administration of the SCID was carried out by a registered psychologist who was familiar with the DSM-III-R classification system and diagnostic criteria it uses. The SCID was chosen because it can be applied to individuals with a psychiatric history or members of the community at large, as it has various editions. In this research the SCID-Non Patient (SCID-NP) version was used (Spitzer et al., 1990a, 1990b; Spitzer et al., 1992). It has a psychotic screening section, enabling the exclusion of psychotic disorders. It was used to establish the DSM-III-R diagnoses of depression, anxiety or adjustment disorders.

The SCID was chosen over the SADS (Endicott & Spitzer, 1978) because it has superseded the RDC (Spitzer et al., 1992). Williams et al. (1992) cite a reliability coefficient of .61 obtained from a very good multisite test-retest reliability study. It should be taken into consideration that the SCID is not a fully structured interview and requires clinical judgment on the part of the interviewer; thus the reliability is very much a function of the particular circumstances in which it is used (Spitzer et al., 1990a). See Williams et al. (1992) for further discussion on the limitations of structured clinical interviews. Skre, Onstad, Torgersen and Kringlen (1991) report an interrater reliability of .93 for MDE and .95 for Anxiety disorders on the SCID. As it was necessary to use some structured diagnostic interview in this project, in order to make sure that the population being tested was the one proposed, the SCID was chosen as the diagnostic interview of choice.

The SCID was supplemented with questions in order to distinguish natural physical phenomena occurring after the birth of a child from those of depression. Following the general SCID rule, a symptom that can be accounted for by another condition (e.g. physical illness) was not counted as evidence for a particular syndrome. A number of questions were asked to determine if symptoms were primarily attributable to childbearing; replicated from O’Hara et al. (1990) [see Appendix II].

If a SCID interview was not performed, the case manager’s current DSM-III-R diagnosis was taken as a backstop measure (n = 5). The SCID demographic section had to be adjusted for the Australian context (see Appendix III). The SCID can be a relatively brief but thorough interview (Spitzer et al., 1990a) and took between 30 and 90 minutes to administer.

Even though Terry and Hynes (1995) used the Hamilton Rating Scale for Depression (HRSD, Hamilton, 1960) it was not used in this study because Steer, Beck, Brown and Bernick (1987) report that the BDI and HRSD stress different symptoms and would not be expected to produce equivalent findings. The HRSD emphasizes somatic and behavioural symptoms, whereas the BDI focuses on the subjective experience of depression.



Potential participants were approached by their case manager (clinical populations) or a colleague and invited to participate in a study that was ‘investigating psychological functioning in women’. Women control subjects were invited to take extra questionnaires for other women who they thought would agree to participate. Therefore, the women control groups consisted mainly of health centre staff or their wives (who may have taken a questionnaire at their team meeting) or university postgraduate students (or their wives) who were invited to participate while in various classes (mainly psychology related). Completed questionnaires were scored by an independent scorer (a Clinical Psychologist) who would forward the envelopes to the chief investigator with the following comments on the outside: “scored, participant’s name and phone number” if they were “to be interviewed”, and had “consented”. The chief investigator then, blind to the sample from which the participant came, made contact and gave the SCID (usually within 14 days). Interviews were held at a place most convenient to the research participant, usually their home (75%) or the centre of their choice. If participants were diagnosed with an Axis I disorder they were alerted to their treatment options.

The SCID was routinely administered to all clinical groups (CBD, NCBD, NCBA), while it was only given to control groups (CB, NCB) if they scored above the cut-offs on either the BDI or EPDS. Control groups were also systematically sampled for interview in order to be able to comment on possible false negatives, as well as rates of specificity and positive predictive value. Every tenth participant scoring below the cut-off criteria in the NCB group, and every fifth in the CB group, was administered the SCID. Questionnaires were stapled in reverse order 50% of the time, in order to avoid rank-order effects. Where participants consented to case managers providing more information, the DSM-III-R diagnosis of the case manager/psychiatrist was also recorded.
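The interview selection rule for control participants described above can be sketched as follows. This is an illustration under stated assumptions rather than the procedure actually used: the screening cut-offs (BDI total ≥ 10, EPDS > 12) are taken from the cut-offs discussed earlier, the systematic sample is assumed to start at the tenth (or fifth) below-cut-off participant in order of receipt, and the function names and record layout are hypothetical.

```python
def systematic_sample(participant_ids, step):
    """Return every step-th participant (the step-th, 2*step-th, and so on)."""
    if step < 1:
        raise ValueError("step must be a positive integer")
    return participant_ids[step - 1::step]


def select_for_interview(records, step, bdi_cutoff=10, epds_cutoff=12):
    """Split control participants into those interviewed because they scored
    above a cut-off and those drawn systematically from below the cut-offs.

    records: list of (participant_id, bdi_total, epds_total) tuples in the
    order the questionnaires were received (assumed layout).
    """
    above, below = [], []
    for participant_id, bdi_total, epds_total in records:
        # Assumed rule: BDI total >= 10 or EPDS > 12 counts as "above cut-off".
        if bdi_total >= bdi_cutoff or epds_total > epds_cutoff:
            above.append(participant_id)
        else:
            below.append(participant_id)
    return above, systematic_sample(below, step)
```

Under these assumptions, `step=10` would correspond to the NCB group and `step=5` to the CB group.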


III. Results

The research data of the 157 research participants were initially checked and screened. This involved examining means, variance and standard deviations, as well as inspecting skewness and kurtosis on scatter plots. Bar charts and normality plots for each variable, by groups, were also scrutinized in order to check that a normal distribution was present. No extreme outliers were observed, thus none were excluded.

The demographic characteristics of all research groups, including age, marital status, education, parity and previous psychiatric history, are presented in Table 3 (see Hopkins & Glass, 1978, p.52 for rounding). Non-parametric (and, where appropriate, parametric) analytic strategies were employed because the unequal sample sizes made the data ‘unbalanced’ (Schlotzhauer & Littell, 1991). Differences among the five groups on demographic variables were assessed with Kruskal-Wallis tests (a non-parametric way of conducting an ANOVA) run in the SAS statistical software system (SAS Institute, 1988). The next step was a Bonferroni t-test for unbalanced data (a parametric test) and the Ryan-Einot-Gabriel-Welsch Multiple F test (for unbalanced data) in order to determine which groups differed significantly (Schlotzhauer & Littell, 1991). Both these tests control for the experimentwise error rate, and allow for a smaller α-level. No significant differences were obtained with respect to age. If women’s ages in Table 3 are examined, one can see that the means are all in the early thirties. These ages are slightly older than the mean age usually reported, namely 28 years (e.g. Boyce et al., 1993; O’Hara et al., 1990). Also of note are the differences in standard deviations for age: the NCBA group has the largest spread (SD = 13.2 years), followed by the NCBD and NCB groups (SD = 9.45 and 8.38 respectively), while the childbearing groups have the smallest (SD = 6.49 for the CBD group and 3.74 for the CB group). Here it is of interest that the CBD group varies to a greater extent in age than the CB control group.
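For readers unfamiliar with the Kruskal-Wallis test, the statistic can be computed by pooling all observations, ranking them (tied values share the average rank) and comparing the rank sums of the groups. The sketch below is a plain-Python illustration of the textbook formula with the usual tie correction; it is not the SAS procedure used in this study, and the function names are hypothetical.

```python
from collections import Counter


def average_ranks(values):
    """Return ({value: average rank}, {value: count}) for the pooled values."""
    ordered = sorted(values)
    first_position = {}
    for position, value in enumerate(ordered, start=1):
        first_position.setdefault(value, position)
    counts = Counter(ordered)
    # A value occupying positions p .. p+c-1 receives the rank p + (c - 1) / 2.
    ranks = {v: first_position[v] + (counts[v] - 1) / 2 for v in counts}
    return ranks, counts


def kruskal_wallis_h(groups):
    """Return (H, degrees of freedom) for a list of samples, corrected for ties."""
    if len(groups) < 2 or any(len(g) == 0 for g in groups):
        raise ValueError("need at least two non-empty groups")
    pooled = [x for g in groups for x in g]
    n_total = len(pooled)
    rank_of, counts = average_ranks(pooled)
    rank_term = sum(sum(rank_of[x] for x in g) ** 2 / len(g) for g in groups)
    h = 12 / (n_total * (n_total + 1)) * rank_term - 3 * (n_total + 1)
    tie_term = sum(c ** 3 - c for c in counts.values())
    correction = 1 - tie_term / (n_total ** 3 - n_total)
    if correction == 0:
        raise ValueError("all observations are identical")
    return h / correction, len(groups) - 1
```

H is referred to a chi-square distribution with (number of groups - 1) degrees of freedom; with five groups (df = 4) the critical value at α = .05 is 9.488.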

Participants were also compared on the remaining demographic indexes, including percentage married, education and previous psychiatric history, using the same analytic strategies outlined above. Childbearing groups were also compared for the variable of caesarean section. From an initial perusal of the data it appeared that there may be differences between the percentages married: 85% and 95% for the childbearing groups, and 43%, 48% and 40% for the non-childbearing groups. The percentage of caesarean sections also appeared to differ, as 46% of depressed childbearing women had had a caesarean section while only 21% of the non-depressed childbearing group had had one.

Table 3: Demographic Characteristics by Research Groups

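| | NCBD (n = 14) | CBD (n = 13) | NCB (n = 62) | CB (n = 58) | NCBA (n = 10) |
|---|---|---|---|---|---|
| Mean age (years) | 34.1 | 30.4 | 34.2 | 30.4 | 33.6 |
| SD of age (years) | 9.45 | 6.49 | 8.38 | 3.74 | 13.2 |
| % Married | 43 | 85 | 48 | 95 | 40 |
| Number of children: | | | | | |
| No children | n = 7 (50%) | n/a¹ | n = 32 (52%) | n/a | n = 4 (40%) |
| (label missing) | n = 2 (14%) | n = 8 (62%) | n = 7 (11%) | n = 46 (79%) | n = 2 (20%) |
| (label missing) | n = 5 (35%) | n = 5 (38%) | n = 21 (34%) | n = 12 (21%) | n = 3 (30%) |
| Mean Level of Education | Tertiaryº | Tertiary | Tertiary | Tertiary | Tertiary |
| % caesarean section | n/a | 46 | n/a | 21 | n/a |
| % Prev. Psych History | 8 | 10 | 15 | 10 | 10 |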

Note: NCB = non-childbearing, CB = childbearing, NCBD = non-childbearing depressed, CBD = childbearing depressed, NCBA = non-childbearing anxious, ¹childbearing women naturally did not have a subgroup with no children, º tertiary = thirteen years of education or more, i.e. qualifications beyond the Higher School Certificate

With respect to previous psychiatric history (which included anything from having seen a counsellor in the past to previous psychiatric admissions), a nonparametric analysis of variance (Kruskal-Wallis) revealed significant differences between groups (df = 4, n = 157), p < .05. NCB women differed significantly from all other groups with respect to past psychiatric history. As can be noted in Table 3, 15% of non-childbearing women had sought prior help, compared to 10% for the CBD, CB and NCBA groups, with the NCBD group having the lowest percentage of previous psychiatric history, namely 8%. What is of note is that a higher percentage of CBD women had sought help before compared to the somewhat older NCBD group. No other study has found its control group to have had the highest incidence of previous psychiatric help.

For marital status a Ryan-Einot-Gabriel-Welsch Multiple F test revealed that the two childbearing groups differed significantly, p < .0001, in their percentage married from the three non-childbearing groups (α = 0.05, df = 151). This study differed from O’Hara et al. (1990), where 80% of their NCB group were married, compared to 40-48% in the present study. For the CB groups, 85% and 95% were married, percentages close to the 93% reported by Boyce et al. (1993) for an Australian sample and to O’Hara et al. (1990). The Australian Bureau of Statistics (ABS, 1988) reports that despite a declining trend in marriages (a decrease of 10% in 10 years), in 1987 78% of women in the age range of 30-34 years were married in Australia. One could estimate that by 1994, following the trend, approximately 70% would have been married in the same age range. This percentage is below that of the CB groups, yet still far above the NCB groups.

With respect to Education, groups did not significantly differ, p < .5, when a non-parametric analysis of variance was performed (df =4, n=157). It appeared that on average all samples completed further training after having obtained the Higher School Certificate.

Lastly, the childbearing groups were compared for the percentage of caesarean sections via a Wilcoxon Rank Sum test, i.e. a non-parametric test for comparing two groups. A significant difference between the two groups was found (df = 1, n = 67), p < .05. The depressed childbearing group had a significantly higher rate of caesarean section than the control childbearing group. Similar differences have been reported by Kendell et al. (1981) and Campbell and Cohn (1991).

Agreement between SCID diagnoses and case managers’ diagnoses was r = .88 (calculated via a Pearson product-moment correlation, n = 58) when specific subtypes and levels of severity of disorders were diagnosed. This agreement increased to 100%, r = 1.00, when only distinctions between anxiety, depressive and adjustment disorders were required. This is not surprising, as DSM-III-R criteria were used for case managers’ diagnoses as well as SCID diagnoses.

When questionnaire data were examined for a rank-order effect (i.e. whether the order in which the BDI and EPDS were filled in could have affected results), it was noted that questionnaire order was evenly balanced. Half the sample had the BDI as their first questionnaire while the other half had the EPDS as their first questionnaire. The perfect balance of 0.5 was nearly reached in this sample, namely 0.49. It was, therefore, concluded that the questionnaire order did not influence the results of this study.

Table 4 presents the means and standard deviations obtained on the various questionnaires by each research group. NCBD women had the highest mean scores, but both depressed groups fell into the BDI total moderate range (Beck & Steer, 1993). With a score of 24+ being the start of the severe range, the NCBD group nevertheless came very close with a mean of 23.4. The split between MDE and Adjustment Disorders in the two depressed groups was 9:5 for NCBD women and 7:6 for CBD women respectively. The rate of MDE in CBD women was proportionally higher than in O’Hara et al.’s (1990) study, yet similar to that in Whiffen’s (1988) study; both of these studies used the diagnosis of ‘minor depression’.

The Anxiety disorders group fell into the mild BDI (total) range [10-15] with a mean score of 14.6 (see Table 4). Both control groups were in the minimal/non-depressed classification. They conformed to the statistically ‘average’ level of depression with a BDI score between 4 and 6, as was recommended by Gotlib and Cane (1989).

It is of interest that the BDI somatic subscale is the only scale on which the CBD group had the highest mean score, 9.15. The CB control group’s mean score of 3.53 is considerably lower than that. If somatic symptoms were abnormally represented or endorsed on the BDI for new mothers in general, as argued, one might expect this mean score to be more similar for both CB groups.

Table 4. Questionnaire Means and Standard deviations by research group:

[Table body not available. Columns: NCBD (n = 14), CBD (n = 13), NCB (n = 62), CB (n = 58), NCBA (n = 10). Rows: BDI total, BDI cognitive/affective, BDI somatic, BDI short form, EPDS.]

Note: NCB = Non-childbearing, CB = childbearing, NCBD = Non-childbearing depressed, CBD = childbearing depressed, NCBA = Non-childbearing anxious, BDI = Beck Depression Inventory, EPDS = Edinburgh Postnatal Depression Scale

For the EPDS both depressed groups are above the clinical cut-off, and the Anxiety disorders group fell just above twelve. Boyce et al. (1993) reported slightly higher mean EPDS scores for the CBD group, namely 17.8 (SD = 3). In the present study the CBD group had a mean of 15.8, while the NCBD group, with a mean of 17.4, was closer to Boyce’s finding. This may not be surprising, though, as Boyce did not include Adjustment disorders in his study. Also of note is that the Anxiety disorders group consistently had the largest standard deviation compared to all other groups.


Hypothesis 4:

The hypothesis that a mean BDI (total) score would differentiate across the five groups via a linear relationship, was tested through a Kruskal-Wallis nonparametric one-way analysis of variance (due to unequal sample sizes), and a significant difference at p = .0001 was found between the groups (n= 158). Further analysis via a general linear models procedure of multiple t-tests revealed that all groups differed significantly on the BDI (total) variable, except the two depressed groups, where no significant differences between the means were found. The prediction of the NCBD group having a larger mean (M = 23.4) than the CBD group (M = 21.8) was confirmed but not at a significant level (see Table 4). Despite the CBD group consisting of nearly half of the participants with an Adjustment disorder with depressed mood, their mean score was still relatively high (M = 21.8).

Hypothesis 3:

The hypothesis that an EPDS score above the cut-off of 12 will have higher sensitivity for depressive disorders than for those diagnosed with Anxiety disorders was confirmed (see Table 5). Of interest was the low sensitivity of the BDI total for both Anxiety and Depressive disorders. Even though sensitivity is not an important measure for a screening test (as the number of false positives is not really of interest because one is concerned with eliminating false negatives), it is lower than that of the EPDS. The low sensitivity for detecting anxiety disorders is encouraging as the BDI is not meant to measure anxiety. Likewise the 67% for the EPDS on this category is worth observation, as Murray and Carothers (1990) found the sensitivity of the EPDS for detecting depression to be 67% as well (see Table 1).

Table 5. Comparison of sensitivity for BDI and EPDS in identifying Depressive and Anxiety disorders.

[Table body not available. Columns: BDI (total), EPDS. Rows: Anxiety Disorders, Depressive Disorders.]


Hypothesis 1:

The hypothesis that the true positive rate for the EPDS will be higher than for the BDI (total), the BDI cognitive-affective sub-scale and the BDI short form, for depressed mothers with children under 12 months, was not supported (see Table 6). The EPDS’s true positive rate (63%) was, together with that of the BDI cognitive-affective sub-scale (63%), the lowest in predicting depression. When case managers’ diagnoses were included in order to increase sample size, it is of interest that the true positive rate of the EPDS improved and left the cognitive-affective sub-scale as the worst performer. Both the BDI (total) and short form had a much better true positive rate of 100% each. The BDI total, thus, proved to be superior to the EPDS in detecting depression after childbirth.

Table 6. True Positive rates for all questionnaires for CBD women:

[Table body not available. Columns: SCID diagnosis; Combined SCID and CM diagnosis (n = 13). Rows (true positive rate): EPDS, BDI (total), BDI (cognitive-affective), BDI (short form).]

Note: CBD = Childbearing depressed group, SCID = Structured Clinical Interview for DSM-III-R, CM = case manager

Hypothesis 2:

The hypothesis that the cognitive-affective sub-scale, the BDI total and the BDI short form will have a greater sensitivity rate for both depressed groups than will the EPDS held true only for the Beck cognitive-affective scale (94%) and short form (86%). The BDI total, as reported earlier, had a sensitivity of 58%, which fell below the 84% of the EPDS (see Table 7). The BDI cognitive-affective sub-scale included the least number of false positives, thus its great sensitivity. Yet in screening one is not concerned with getting too many false positives; thus of real interest were the specificity (proportion of non-depressed women correctly identified) and positive predictive value (PPV, proportion of women identified who were truly depressed). The positive predictive values for detecting depression in general were 100% for the BDI total and short form and 84% for the BDI cognitive-affective subscale and EPDS. For detecting depression in the puerperium they changed somewhat, to 100% for the BDI total and short form and 62% for the BDI cognitive-affective sub-scale and EPDS. For specificity the BDI total and BDI short form appear superior, with 100% of non-depressed women correctly identified.

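Table 7. Efficiency of Questionnaires in detecting depression against SCID diagnoses

[Table body not available. Columns: Specificity, Sensitivity and Positive Predictive Value, each reported for all depressed women and for the CBD group. Rows: BDI total, BDI cognitive/affective, BDI short form, EPDS.]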

Note: Specificity = true negatives/true and false negatives; Sensitivity = True positives/ true and false positives; Positive Predictive Value = True positives/ true positives and false negatives, CBD = childbearing depressed
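The ratios defined in the note above can be computed directly from the four cells of a 2 × 2 table (questionnaire result against SCID diagnosis). The sketch below is illustrative only: the counts are hypothetical and the class and function names are assumptions. Because the denominators given in the note differ from the conventional epidemiological definitions (sensitivity = TP/(TP + FN), specificity = TN/(TN + FP), positive predictive value = TP/(TP + FP)), both sets of ratios are computed side by side so that figures can be compared with other studies on a like-for-like basis.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Cells of a 2 x 2 table: questionnaire result versus diagnostic interview."""
    true_pos: int
    false_pos: int
    true_neg: int
    false_neg: int


def _ratio(numerator, denominator):
    """Divide safely, returning NaN when the denominator is zero."""
    return numerator / denominator if denominator else float("nan")


def ratios_as_in_note(c):
    """Ratios with the denominators stated in the note to Table 7."""
    return {
        "specificity": _ratio(c.true_neg, c.true_neg + c.false_neg),
        "sensitivity": _ratio(c.true_pos, c.true_pos + c.false_pos),
        "ppv": _ratio(c.true_pos, c.true_pos + c.false_neg),
    }


def conventional_ratios(c):
    """Ratios with the conventional epidemiological denominators."""
    return {
        "specificity": _ratio(c.true_neg, c.true_neg + c.false_pos),
        "sensitivity": _ratio(c.true_pos, c.true_pos + c.false_neg),
        "ppv": _ratio(c.true_pos, c.true_pos + c.false_pos),
    }


# Hypothetical counts, for illustration only (not data from this study).
example = ConfusionCounts(true_pos=9, false_pos=3, true_neg=27, false_neg=1)
print(ratios_as_in_note(example))
print(conventional_ratios(example))
```

With these hypothetical counts, the quantity called sensitivity in the note (0.75) equals the conventional positive predictive value, and the note's positive predictive value (0.90) equals the conventional sensitivity.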

If the EPDS results of Table 7 are compared to the research cited in Table 1 (bearing in mind that both depressed groups are included in these figures), then the specificity here, at 90%, is somewhat lower than the 93% of Harris, Huckle et al. (1989) and the 95.7% of Murray and Carothers (1990) and Boyce et al. (1993). Sensitivity, at 84%, was above the 67.7% of Murray and Carothers’ (1990) study, yet below Boyce’s finding of 100% and the 86% of Cox et al. (1987). Yet sensitivity of the EPDS in the puerperium was 63%, which comes much closer to that of Murray and Carothers (1990). The positive predictive value in the puerperium of 62% was lower than those reported by Boyce et al. (1993) and Murray and Carothers (1990), being 69.2% and 66.7% respectively.


In summary, then, the PPV for all tests in detecting depression in general appears reasonable, as all are above the recommended 60%. In the puerperium results are less convincing, as the PPV sinks to 62% for the EPDS and BDI cognitive-affective scale. The BDI (total) and short form remain stable at 100%. For specificity, the desired percentages of greater than 85%, in order to eliminate chance, were also met by all tests. If one had to rank order the performance of the four tests, the BDI short form would come first in its ability to detect depression, followed by the BDI cognitive-affective scale, the BDI total and lastly the EPDS.


IV. Discussion

The present study investigated the EPDS’s ability to detect depressive symptomatology and diagnosis after childbirth. Bearing in mind that depressive symptomatology may be more continuous than a dichotomous diagnosis of depression, increasing the difficulty of predicting diagnosis (O’Hara, Schlechte, Lewis and Varner, 1991), it was nevertheless hoped to be able to compare the EPDS’s performance to that of the BDI.

Due to the significant differences in some of the demographic variables these will be commented upon first. In the present study, the number of women who were married differed in the sample groups, as well as from the Australian population average, and may have influenced results. In the childbearing groups 85-95% were married (similar to O’Hara et al., 1990; O’Hara, Schlechte, Lewis & Wright, 1991), compared to 40-48% in the other groups. This higher percentage, compared to the average of 70% (ABS, 1988), may be natural in the context of starting a family. To try and match the depressed group to the CB marital rate, as O’Hara et al. (1990) did, may lead to an unrealistic composition of other depressed cases, as depression may possibly be a part of being unmarried in this sample. However, this does not explain the low percentage of NCB women in the control group. Here one can only wonder if women working in the health profession may be an exaggeration of the trend of marrying later (or divorcing), thus producing a figure of 48% of them being married at the mean age of 34 years. This nevertheless raises serious concerns about the comparability of the sample populations.

Caesarean sections had been excluded by most researchers investigating the EPDS and/or the BDI, until this study. It is of interest that, when included, a significantly higher representation (n = 6) of caesarean sections emerged for the CBD group, though the clinical numbers were small and any conclusions have to bear this in mind. Campbell and Cohn (1991) found no difference in neonatal complications (measured by an observer) but found that depressed women were more likely to report delivery complications. What the subjective experience of having a caesarean means to a mother therefore has to be more thoroughly investigated. This may need to be evaluated in relation to depression after childbirth, where, rather than clinicians simply providing ratings on how traumatic a birth was (an external opinion), the mother’s subjective experience is taken into account, possibly through content analysis of her description of giving birth. It may be that most depressed women felt helpless and out of control when they gave birth, whether via a caesarean or not, and this could be the trigger for depression at a later stage. Or it may simply be their perception, tainted by their current depression, that we are hearing about.

Green and Murray (1994, p.191) seem to be among the few researchers who have attempted to answer this question. They asked women to comment on their “feeling in control of one’s behaviour, in control during contractions, and in control of what staff were doing” during labour and found all of the above to be significantly related to PND. Jane Fisher (1994, 1995) also reported an associated decrease in mood for caesarean sections and forceps-assisted vaginal deliveries. She concluded that these procedures deprive women of mastery and decrease their self-esteem. She also advocates the diagnosis of Post Traumatic Stress disorder in the puerperium rather than PND. No doubt this area needs further investigation in order to conclude whether caesarean sections, or rather a woman’s subjective perception of the birth, play an aetiological role in developing depression after childbirth.

It seemed important, therefore, to include caesarean sections in this research. It is of interest that studies in the United Kingdom have not found caesarean sections to be significantly related to PND (e.g. Warner, 1995) while here in Australia we find the opposite (Fisher, 1994). This may be explained by our unusually high rates of assisted births, not matched anywhere in the world. Fisher reports that private patients and older women in particular have concerningly high rates of caesarean deliveries in Australia (Fisher, 1994).

The high proportion of previous psychiatric history for the women’s control group was again unusual and unexpected. Having used allied health staff and university postgraduate students as the women control group, who operate within a culture where it is acceptable to get help, may have unrealistically inflated this figure. What is of interest is that for the CB women an equal proportion in both groups, namely 10%, had sought prior counselling, whether currently depressed or not. Ironically the ‘most’ depressed group, whom one would have expected to have seen health professionals in the past, had the least percentage with prior psychiatric history. These findings are inconsistent with prior research in this field and need to be addressed in any replication of this study, where hopefully by including larger sample sizes, more conclusive statements can be made.

A brief comment on the differing standard deviations for age may shed some light on the possible nature of depression in the puerperium. It is noteworthy that the CBD group had a standard deviation nearly twice as large as that of the CB control group. It has been hypothesised that older and younger women may be more vulnerable to depression after childbirth. One reason may be that the ‘older’ group, having had an established career, responds with depression to the at least temporary loss of this while having to re-adjust to a life at home with the baby. The younger group may get depressed simply because of the vulnerability youth brings to becoming a mother, when one does not feel ready to mother and has few internal resources to do so. Boyce (1995) also suggests that women having their second child (thus older) are twice as likely to become depressed as first-time mothers.

The main aim of this study was to investigate the EPDS’s ability to detect depressive symptomatology after childbirth. It was compared to existing measures of depression, primarily the BDI. Results in general indicate that even though the EPDS’s sensitivity was greater than that of the BDI total in detecting depressive symptomatology, it is not as efficient in detecting depressive disorders in the puerperium as is the BDI. In particular, although the 1961 version of the BDI had been widely criticized as unsuitable for use in the puerperium (Whiffen, 1988a), chiefly because of its ‘somatic component’, the 1978 version appears to do better than its predecessor was reported to do, even though it remained largely unchanged. It may be that by assessing mood over the last week the BDI 1978 version matches the EPDS more closely, as it too asks women to rate how they have felt over the past 7 days. In this sense the instructions are more similar, which one would predict makes the tests more comparable. Having deleted the first six words from the instructions of the EPDS, in order to make it applicable for all groups, should not have significantly altered the EPDS’s performance.

There may be a number of reasons for the BDI’s improved performance. It could be that somatic symptoms are an integral part of depression in general, as well as of depression after childbirth, and a high somatic score is simply part of the symptom profile. Support for this line of argument could be taken from the fact that the CB control group did not have an elevated somatic score, being only 1 point above the mean of the NCB control group. Secondly, a subjective sensitivity to somatic symptoms may exist in CB depressed women (or all depressed individuals). This may inflate scores but also help to detect these women. The NCBD counterparts were very close to the CBD group in their mean somatic score, possibly refuting the specific argument of increased somatic symptoms in the puerperium.

It is of interest that for the BDI (total) 29% of cases were false positives, giving some credence to the above argument, compared to 2% false positives for the cognitive-affective sub-scale and 6% for the EPDS (BDI short form 6%). It appears that the cognitive-affective sub-scale, by excluding somatic symptoms, is the most effective at excluding false positives (that is, the most specific), and would be the tool to use in order to keep the number of false positives low. Yet when screening one is less concerned with false positives and hopes rather to minimise the number of false negatives, and it is here that the BDI total and short form succeeded rather than the EPDS or the cognitive-affective sub-scale.
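The distinction between false positives and false negatives drawn above can be made concrete with a small sketch. It assumes each test is cross-tabulated against the SCID diagnosis; the class name, function names and counts are hypothetical and are not taken from this study. The denominator for the false positive share (all women screened) is likewise an assumption, as the text does not state it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScreeningCounts:
    """Cross-tabulation of a screening test against the interview diagnosis."""
    true_pos: int   # screen positive, diagnosed depressed
    false_pos: int  # screen positive, not diagnosed depressed
    true_neg: int   # screen negative, not diagnosed depressed
    false_neg: int  # screen negative, diagnosed depressed


def sensitivity(c: ScreeningCounts) -> float:
    """Share of diagnosed cases that the screen flags (few false negatives)."""
    return c.true_pos / (c.true_pos + c.false_neg)


def specificity(c: ScreeningCounts) -> float:
    """Share of non-cases that the screen clears (few false positives)."""
    return c.true_neg / (c.true_neg + c.false_pos)


def false_positive_share(c: ScreeningCounts) -> float:
    """False positives as a share of all screened women (assumed denominator)."""
    total = c.true_pos + c.false_pos + c.true_neg + c.false_neg
    return c.false_pos / total


# Hypothetical counts for illustration only, NOT results from this study.
example = ScreeningCounts(true_pos=18, false_pos=6, true_neg=70, false_neg=6)
print(f"sensitivity = {sensitivity(example):.2f}")
print(f"specificity = {specificity(example):.2f}")
print(f"false positive share = {false_positive_share(example):.2f}")
```

A screening tool is usually judged first on sensitivity, since a missed case receives no follow-up, whereas a false positive is corrected at the diagnostic interview.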

Lastly, the somatic symptoms may pick up an anxiety component that could be more strongly present in depressed women in the puerperium. Gotlib and Cane (1989) have listed distinctive and overlapping items for symptoms of depression, and comment on the resulting overlap between self-report measures of depression and anxiety. It has not been possible to separate the constructs of depression and anxiety in factor analytic studies because of the symptom overlap, and they suggest that future self-report measures should focus on the distinctive symptoms, such as loss of interest and pleasure in depression and excessive worry in anxiety, rather than the overlapping ones.

The EPDS has 3 anxiety items. A higher cut-off, rather than the lower one [>9] proposed by Cox et al. (1987), is suggested and may improve the EPDS’s ability to pick up diagnosable depression while excluding anxious clients. With clients with anxiety disorders scoring around 12, a possible cut-off above 14 may help to avoid too many false positive cases and therefore increase specificity, albeit possibly at some cost to sensitivity. Under its present operational definition (i.e. >12), the EPDS does not appear to be as suitable a tool for detecting depression as the two BDI versions (total & short form), in the puerperium or at other times. One may argue that ‘psychological distress’ is picked up but not necessarily depression. It was also encouraging to notice the low sensitivity the BDI total had in the detection of anxiety disorders, possibly indicating its purer item-constellation for detecting depressive symptomatology only. The role that anxiety plays in depression in the puerperium, and in depression in general, remains unclear and is beyond the scope of discussion in this paper (see Beck, Brown, Steer, Eidelson & Riskind, 1987; Clark & Watson, 1991; Kendall & Watson, 1989; Tanaka-Matsumi & Kameoka, 1986 for further discussion). If women are at an increased risk of anxiety in the puerperium it would make sense to screen them for its presence as well, but possibly with a tool designed to do so, because anxiety could also interfere markedly with the mother’s ability to function, yet treatment would progress along different lines.
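The effect of moving the EPDS cut-off can be explored with a minimal sketch such as the one below. The scores are hypothetical and are not data from this study; the non-depressed records scoring 13 and 14 merely stand in for anxious clients scoring just above 12. The function name `rates_above` is illustrative. The sketch shows how both rates could be tabulated at each candidate cut-off before a new one is adopted.

```python
from typing import List, Tuple

# Each record: (EPDS total score, True if the interview diagnosed depression).
# Hypothetical values for illustration only, NOT data from this study.
Record = Tuple[int, bool]
SAMPLE: List[Record] = [
    (4, False), (7, False), (11, False), (13, False), (14, False),
    (12, True), (15, True), (17, True), (19, True), (22, True),
]


def rates_above(records: List[Record], cutoff: int) -> Tuple[float, float]:
    """Return (sensitivity, specificity) when 'score > cutoff' is a positive screen."""
    cases = [score for score, depressed in records if depressed]
    non_cases = [score for score, depressed in records if not depressed]
    sens = sum(score > cutoff for score in cases) / len(cases)
    spec = sum(score <= cutoff for score in non_cases) / len(non_cases)
    return sens, spec


# Compare the cut-offs discussed in the text: >9, >12 and >14.
for cutoff in (9, 12, 14):
    sens, spec = rates_above(SAMPLE, cutoff)
    print(f"cut-off >{cutoff}: sensitivity={sens:.2f}, specificity={spec:.2f}")
```

With real data the same tabulation would need the interview diagnosis for every woman screened, including those scoring below the cut-off.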

While on this point it is of interest to note that the DSM-IV (APA, 1994a) provides in its Appendix B sets of criteria for further study for the following disorders: Minor depressive disorder, Mixed anxiety-depressive disorder and Depressive personality disorder. It is the diagnosis of a ‘Mixed anxiety-depressive disorder’ that is of particular interest, especially in the light of Boyce’s (1995) statement that he “considers anxiety disorders to be part of ‘postnatal depression’” (personal communication). The lack of an agreed operationalisation of ‘PND’ continues to frustrate research in this area. Issues such as when to measure, and up to what time depression after childbirth is diagnosable, also need urgent attention.

One of the reasons for the EPDS having been hailed as a useful tool is its brevity, yet the BDI short form appears to be a better measure and, with its 13 items, nearly matches the brevity of the EPDS. If the results of this study are replicated, one could suggest that the BDI short form (with a cut-off of ≥ 8) may be a more useful tool for the screening of depression in puerperal women. The EPDS appears less specific and may pick up ‘psychological distress’ rather than depression.

In addition, one should err on the side of caution in response to Cox’s suggestion that the EPDS could be used as an effective depression measure per se (Cox & Holden, 1994). Results suggest that if the NSW government wishes to detect depression after childbirth, it would be better served by using the BDI short form. Furthermore, as was demonstrated in this study, the BDI remains the test of choice for detecting depression. If brevity is the only reason why the EPDS was advocated, then one would do far better using the BDI short form.

Yet the BDI is not free: it costs $2.20 per sheet, and one can start to understand why the EPDS may have been the preferred choice. As the use of the EPDS is not restricted to registered psychologists, and all health professionals can use it, it would be an alternative worth considering if it performed as well as the BDI. Nevertheless, the unrestricted use of the EPDS has also led to its misuse. The danger is that, because of its inviting title ‘postnatal depression questionnaire’, well-intentioned health professionals who lack training in the field rely solely on an EPDS score >12 to diagnose ‘Post Natal Depression’. It is even worse if, as a result, a prescription for antidepressants (often meaning the cessation of breastfeeding) is given to women who may not be depressed but may in fact be suffering from an anxiety disorder or simply be in need of reassurance and supportive counselling.

In relation to the dilemma of whether PND is a distinct diagnosis, Whiffen’s (1992) argument that PND is like depression, only milder, seems to have been supported to some extent by the numbers of diagnosable depressive disorders compared to adjustment disorders in the puerperium. If adjustment disorders are more common than MDE in the puerperium, this would have numerous implications for treatment. One may be dealing with a situation similar to a loss or bereavement, in which natural progression may consist of feelings of depression and loss due to the now irreversible and unavoidable transition to being a mother. Life has changed forever, and mothers may respond to this realisation in different ways. To immerse these mothers in a cognitive-behavioural treatment programme for depression may be inappropriate, as simple strategies that help them come to terms with this life change may meet their needs.

In closing, comments on the limitations of this study are necessary. Firstly, a weakness of the study is the small size of the clinical samples; larger samples would improve on this. Whether the differences found on such small samples are statistically meaningful remains open to further research. The results stand partially in contradiction to earlier findings, especially in relation to the BDI’s usefulness in detecting depression in the puerperium, and will need to be replicated. Another weakness was the absence of an anxiety measure, as its inclusion would have made it possible to comment on the possible overlap between anxiety disorders and depressive disorders in the puerperium. Some evidence of overlap came from the relatively large number of dual diagnoses (n=6) that were excluded from the analysis.

The third major limitation of this study was that not all 193 research participants were interviewed; those scoring under the cut-off were interviewed only if randomly selected. This makes it impossible to comment with certainty on the test’s specificity and false negative rates. Future research should aim to clinically interview a large sample of women, preferably immediately after they have completed the questionnaires. Of concern were also the 30 refusals to be interviewed, and one wonders if depressive symptomatology influenced the decision not to participate further. An over-representation of depressed women in the drop-out category was also suggested by Boyce et al. (1993) in their study, as many women anecdotally comment that “when I was really bad I would not have agreed to see you”. It seems that women agree to participate in research on depression as they are coming out of their depression or while in treatment.

Lastly, the significant demographic variables leave too much room to wonder about the comparability of the research groups in this study, both with one another and with those of previous studies. The high percentage of previous psychiatric history in the control group leaves doubt about comparability, especially given that previous psychiatric history appears to be a predictor of depression, and one wonders how different the NCB control group actually was. The significant difference between the CB groups due to caesarean delivery again complicates interpretation of results, as do the abnormal percentages of being married. One would hope that with larger numbers such differences between groups would be avoided.

One wonders about error in sampling, and in particular whether the NCB group was representative of women in general, which it probably was not. An indication of refusal rates would also have been useful. Last but not least, a prospective design would have yielded more and stronger data for interpretation, yet due to time constraints this was not attempted.

Ethical issues are also of importance, especially when researching depression, and more so depression after childbirth, as one knows that an infant with a depressed mother is ‘out there’. The refusal by (possibly) depressed mothers to be interviewed or to accept help is at times hard to accept. One wonders about the effects on the infant and mother and their relationship. None of the mothers interviewed in this study came under the Mental Health Act of 1990; however, 30 refused to be interviewed. Depression research is quite a challenge, as most people when depressed do not wish to be part of anything, let alone research. Finding willing research participants was therefore much harder than anticipated. A prospective study may be more successful in this respect, as the research team would already have engaged with the mother, and it appears to be the ideal design in this area of research.

This study nevertheless managed to achieve most of its aims quite well. The above results, if taken as a representative sample, leave little doubt that the BDI short form may be better able than the EPDS to detect depressive symptomatology and diagnosable depressive disorders in the puerperium. The current universal screening with the EPDS in NSW may be useful in detecting psychological distress and prompting referral. Mothers who score above 12 on the EPDS may be suffering from depression or anxiety. It is unfortunate that the scale implies depression in its title, ‘Edinburgh Postnatal Depression Scale’, as it may lead the novice to assume a possible diagnosis. A name change may help to avoid this; suggestions include the ‘Edinburgh Scale’ or ‘Edinburgh Postnatal Distress Scale’. If the NSW government wishes to detect depression specifically, due to its effects on the development of infants of depressed mothers (Hammen, Adrian, Gordon, Burge & Jaenicke, 1987; Philipps & O’Hara, 1991; Whiffen & Gotlib, 1989), it may be more useful to use the BDI short form.

In conclusion, the BDI short form, rather than the EPDS, was found to be the most reliable predictor of depression after childbirth. The EPDS was unable to distinguish specifically between anxiety and depressive disorders. One may conclude that the EPDS measures ‘psychological distress’ and will therefore detect distressed mothers. The EPDS is not recommended as a general depression measure.


Affonso, D.D., Lovett, S., Paul, S.M., & Sheptak, S. (1990). A Standardized Interview that Differentiates symptoms from Perinatal Clinical Depression. Birth, 17(3), 121-130.

American Psychiatric Association (1987). Diagnostic and Statistical Manual of Mental Disorders (3rd ed.) Washington, DC: Author.

American Psychiatric Association (1994a). Diagnostic and Statistical Manual of Mental Disorders (4th Edn.) Washington DC: Author.

American Psychiatric Association (1994b). Diagnostic Criteria from DSM-IV. Washington DC: Author.

Australian Bureau of Statistics (1988). Estimated Resident Population by Marital Status, Age and Sex, Australia, June 1976, 1981 to 1987 (Catalogue No. 3220.0). Canberra: Australia ABS.

Barnett, P.A., & Gotlib, I.H. (1988). Psychosocial functioning and depression: Distinguishing among antecedents, concomitants, and consequences. Psychological Bulletin, 104, 97-126.

Beck, A.T., Brown, G., Steer, R.A., Eidelson, J.I., & Riskind, J.H. (1987). Differentiating Anxiety and Depression: A test of the cognitive Content-specificity Hypothesis. Journal of Abnormal Psychology, Vol. 96(3), 179-183.

Beck, A.T., Rial, W.Y., & Rickels, K. (1974). Short Form of Depression Inventory: Cross-Validation. Psychological Reports, Vol. 34, 1184-1186.

Beck, A.T., & Steer, R.A. (1987). Beck Depression Inventory: Manual. New York: The Psychological corporation Harcourt Brace Jovanovich Inc.

Beck, A.T., & Steer, R.A. (1993). Beck Depression Inventory: Manual. New York: The Psychological corporation Harcourt Brace Jovanovich Inc.

Beck, A.T., Steer, R.A., & Garbin, M.G. (1988). Psychometric Properties of the Beck Depression Inventory: Twenty-five years of evaluation. Clinical Psychology Review, 8, 77-100.

Beck, A.T., Ward, C.H., Mendelson, M., Mock, J., & Erbaugh, J. (1961). An Inventory for Measuring Depression. Archives of General Psychiatry, 4: 561-571.

Beck, C.T. (1992). The Lived Experience of Postpartum Depression: A Phenomenological Study. Nursing Research, 41(3). 166-170.

Beutler, L., Corbishley, A., & Hamblin, D. (1988). Reliability and Validity of the Short Form Beck Depression Inventory with older adults. Journal of Clinical Psychology, Vol. 44(6), 853-857.

Billings, A.G., & Moos, R.H. (1985). Psychosocial stressors, coping, and depression. In E.E. Beckham & W.R. Leber (Eds.), Handbook of depression: Treatment, assessment, and research (pp. 940-974). Homewood, IL: Dorsey Press.

Boyce, P. (1995, March). Post-Partum Depression. Paper presented at the meeting of the NSW Institute of Psychotherapy for the Programme on Infant Research and Development, Sydney, Australia.

Boyce, P., Stubbs, J., & Todd, A. (1993). The Edinburgh Postnatal Depression Scale: Validation for an Australian Sample. Australian and New Zealand Journal of Psychiatry, 27, 472-476.

Campbell, S.B., & Cohn, J.F. (1991). Prevalence and Correlates of Postpartum Depression in First-Time Mothers. Journal of Abnormal Psychology, Vol. 100(4), 594-599.

Carothers, A.D., & Murray, L. (1990). Estimating psychiatric morbidity by logistic regression: application to post-natal depression in a community sample. Psychological Medicine, 20, 695-702.

Clark, L.A., & Watson, D. (1991). Tripartite Model of Anxiety and Depression: Psychometric Evidence and Taxonomic Implications. Journal of Abnormal Psychology, Vol. 100 (3), 316-336.

Conoley, C.W. (1992). Review of the Beck Depression Inventory [Revised Edition]. In Mental Measurement Yearbook 11th (Edn.). Buros & Mitchell: Nebraska (pp. 78-79).

Cox, J. L. (1986). Postnatal Depression: A guide for Health Professionals. Edinburgh: Churchill Livingstone.

Cox, J.L., Connor, Y.M., & Kendell, R.E. (1982). Prospective study of the Psychiatric Disorders of Childbirth. British Journal of Psychiatry, 140, 111-117.

Cox, J.L., Kumar, C., Oates, M., Foreman, D., & Anderson, H. (1992). The College: Report of the General Psychiatry Section Working Party on Post-Natal Mental Illness. Psychiatric Bulletin, 16, 519-522.

Cox, J., & Holden, J. (Eds.) (1994). Perinatal Psychiatry; Use and Misuse of the Edinburgh Postnatal Depression Scale. Gaskell: London.

Cox, J.L., Holden, J.M., & Sagovsky, R. (1987). Detection of Postnatal Depression: Development of the 10-item Edinburgh Postnatal Depression Scale. British Journal of Psychiatry, 150, 782-786.

Cutrona, C.E. (1983). Causal attributions and perinatal depression. Journal of Abnormal Psychology, 92, 161-172.

Dalton, K. (1971). Prospective study into puerperal depression. British Journal of Psychiatry, 118, 689-692.

Dobson, K.S., & Breiter, H.J. (1981). Cognitive assessment of depression: Reliability and validity of three measures. Journal of Abnormal Psychology, 92, 107-109.

Endicott, J., & Spitzer, R.L. (1978). A diagnostic interview schedule for affective disorders and schizophrenia. Archives of General Psychiatry, 35, 837-844.

Fisher, J. (1994, September). Obstetric Intervention and Post-Partum Mood. Paper presented at the 29th Annual Conference of the Australian Psychological Society, Wollongong, Australia.

Fisher, J. (1995, April). You’ve got a healthy baby, why fret about a caesarean delivery? Paper presented at the Marcé Society Pacific Rim Conference, Sydney, Australia.

Gathercole, C.E. (1968). Assessment in Clinical Psychology. London: Penguin Books.

Gelder, M., Gath, D. & Mayou, R. (1989). Oxford Textbook of Psychiatry. Oxford: Oxford University Press.

Gotlib, I.H., & Cane, D.B. (1989). Self-report assessment of depression and anxiety. In P.C. Kendall & D. Watson (Eds.). Anxiety and depression, Distinctive and overlapping features (pp. 131-169). Orlando FL: Academic Press.

Gotlib, I.H., Whiffen, V.E., Mount, J.H., Milne, K., & Cordy, N.I. (1989). Prevalence Rates and Demographic Characteristics Associated with Depression in Pregnancy and the Postpartum. Journal of Consulting and Clinical Psychology, Vol. 57 (2), 269-274.

Gotlib, I.H., Whiffen, V.E., Wallace, P.M., & Mount, J.H. (1991). Prospective Investigation of Postpartum Depression: Factors Involved in Onset and Recovery. Journal of Abnormal Psychology, Vol. 100 (2), 122-132.

Green, J.M., & Murray, D. (1994). The Use of the Edinburgh Postnatal Depression Scale in research to explore the relationship between antenatal and postnatal dysphoria. In John Cox and Jeni Holden’s (Edn.) Perinatal Psychiatry: The Use and Misuse of the EPDS(pp. 180-198). London: Gaskell.

Hamilton, M. (1960). A rating scale for depression. Journal of Neurology and Neuropsychological Psychiatry, 23, 56-62.

Hammen, C., Adrian, C., Gordon, D., Burge, D., & Jaenicke, C. (1987). Children of Depressed Mothers: Maternal strain and symptom predictors of dysfunction. Journal of Abnormal Psychology, Vol. 96 (3), 190-198.

Harris, B., Huckle, P., Thomas, R., Johns, S., & Fung, H. (1989). The Use of Rating Scales to Identify Post-natal Depression. British Journal of Psychiatry, 154, 813-817.

Harris, B., Johns, S., Fung, H., Thomas, R. Walker, R., Read, G., & Raid-Fahmy, D. (1989). The Hormonal Environment of Post-natal Depression. British Journal of Psychiatry, 154, 660-667.

Hersen, M., & Bellack, A.S. (Eds.). (1988). Dictionary of Behavioural Assessment Techniques. NY: Pergamon Press.

Hickey, A., Boyce, P. & Ellwood, D. (1995, April). Risk factors for Postnatal Depression: A Prospective study. Paper presented at the Marcé Society Pacific Rim Conference, Sydney, Australia.

Holden, J. (1994). Using the Edinburgh Postnatal Depression Scale in clinical practice. In John Cox and Jeni Holden’s (Edn.) Perinatal Psychiatry: The Use and Misuse of the EPDS (pp. 125-144). London: Gaskell.

Hopkins, K.D., & Glass, G.V. (1978). Basic Statistics for Behavioral Sciences. New Jersey: Prentice Hall.

Hopkins, J., Marcus, M., & Campbell, S.B. (1984). Postpartum Depression: A Critical Review. Psychological Bulletin, Vol. 95 (3), 498-515.

Kendall, P.C. & Watson, D. (Eds.) (1989). Anxiety and depression, Distinctive and overlapping features. Orlando FL: Academic Press.

Kendell, R.E., Rennie, D, Clark, J.A., & Dean, C. (1981). The social and obstetric correlates of psychiatric admission in the puerperium. Psychological Medicine, 11, 351-359.

Knight, R.G. (1984). Some general population norms for the short form Beck Depression Inventory. Journal of Clinical Psychology, Vol. 40 (3), 751-753.

Kumar, R., & Robson, K. (1984). A prospective study of emotional disorders in childbearing women. British Journal of Psychiatry, 144, 35-47.

Milne, P. (1994). The Edinburgh Postnatal Depression Scale guidelines for use in primary health care (State Health Publication No. [PA]94-132). Sydney, NSW: NSW Department of Health, Women’s Health Unit. (ISBN 07310 06283)

Murray, L., & Carothers, A.D. (1990). The Validation of the Edinburgh Post-natal Depression Scale on a Community Sample. British Journal of Psychiatry, 157: 288-290.

NSW Health Department (1994). Postnatal Depression Services Review (State Health Publication No. [PA]94-131). Sydney, NSW: NSW Department of Health, Family and Child Health Unit. (ISBN 07310 062875)

NSW Women’s Consultative Committee (1994). If motherhood is bliss, why do I feel so awful? Community Consultations on Post Natal Stress and Depression in NSW (ISBN 07310 41909). Woolloomooloo, NSW: NSW Ministry for the Status and Advancement of Women.

O’Hara, M.W. (1986). Social Support, Life Events, and Depression During Pregnancy and the Puerperium. Archives of General Psychiatry, Vol. 43, 569-573.

O’Hara, M.W. (1991). Postpartum mental disorders. In J.J. Sciarra (Ed.), Gynecology and Obstetrics. Vol. 6 (pp. 1-17 reprint private communication, Chapter 84). Philadelphia: Harper & Row.

O’Hara, M.W. (1994). Postpartum Depression: Identification and Measurement in a Cross-Cultural Context. In J. Cox and J. Holden (Eds.), Perinatal Psychiatry: Use and Misuse of the Edinburgh Postnatal Depression Scale (pp. 145-168). London: Gaskell.

O’Hara, M.W., Hoffman, J.G., Philipps, L.H.C., & Wright, E.J. (1992). Adjustment in Childbearing women: The Postpartum Adjustment Questionnaire. Psychological Assessment, Vol. 4(2), 160-169.

O’Hara, M.W., Neunaber, D. J., & Zekoski, E.M. (1984). Prospective study of postpartum depression: Prevalence, course, and predictive factors. Journal of Abnormal Psychology, 93(2), 158-171.

O’Hara, M.W., Rehm, L.P., & Campbell, S.B. (1983). Postpartum Depression: A role for social network and life stress variables. Journal of Nervous and Mental Disease, Vol. 171 (6), 336-341.

O’Hara, M.W., Schlechte, J.A., Lewis, D.A., & Varner, M. (1991). Controlled prospective study of postpartum mood disorders: Psychological, environmental, and hormonal variables. Journal of Abnormal Psychology, 100, 63-73.

O’Hara, M.W., Schlechte, J.A., Lewis, D.A., & Wright, E.J. (1991). Prospective study of Postpartum Blues – Biologic and Psychosocial Factors. Archives of General Psychiatry, 48, 801-806.

O’Hara, M.W., Zekoski, E.M., Philipps, L.H., & Wright, E. J. (1990). Controlled Prospective Study of Postpartum Mood Disorders: comparison of Childbearing and Non-childbearing women. Journal of Abnormal Psychology, 99, 3-15.

Pantsos, I. (1993). Postpartum Depression a prospective study: Personality hardiness as a predictor of depression in the postpartum period. Unpublished honours thesis, University of Wollongong, Department of Psychology, Wollongong, Australia.

Philipps, L.H.C., & O’Hara, M. W. (1991). Prospective Study of Postpartum Depression: 4½-Year Follow-up of women and children. Journal of Abnormal Psychology, Vol. 100(2), 151-155.

Pitt, B. (1968). “Atypical” Depression Following Childbirth. British Journal of Psychiatry, 114. 1325-1335.

Rehm, L.P. (1988). Assessment of depression. In M. Hersen & A.S. Bellack (Eds.), Behavioural Assessment: a practical handbook, 2nd Edn., (pp. 246-295). Elmsford, NY: Pergamon Press.

Reynolds, W.M., & Gould, J.W. (1981). A psychometric investigation of the standard and short form Beck Depression Inventory. Journal of Consulting and Clinical Psychology, 49(2), 306-307.

Sacco, W.P. (1981). Invalid use of the Beck Depression Inventory to identify depressed college-student subjects: a methodological comment. Cognitive Therapy and Research, 5, 143-147.

SAS Institute Inc. (1988). SAS/STAT users guide [Computer program]. Cary, NC: Author.

Schlotzhauer, S.D. & Littell, R.C. (1991). SAS System for Elementary Statistical Analysis. SAS Institute: Cary, NC.

Skre, I., Onstad, S., Torgersen, S., & Kringlen, E. (1991). High interrater reliability for the Structured Clinical Interview for DSM-III-R Axis I (SCID-I). Acta Psychiatr Scand, 84, 167-173.

Spitzer, R.L., Endicott, J., & Robins, E. (1975). Research Diagnostic Criteria. Instrument No. 58. New York: New York State Psychiatric Institute.

Spitzer, R.L., & Endicott, J. (1978). A Diagnostic Interview: The Schedule for Affective Disorders and Schizophrenia. Archives of General Psychiatry, Vol. 35, 837-844.

Spitzer, R.L., Endicott, J., & Robins, E. (1978a). Research Diagnostic Criteria: Rationale and Reliability. Archives of General Psychiatry, Vol. 35, 773-782.

Spitzer, R.L., Endicott, J., & Robins, E. (1978b). Research Diagnostic Criteria. In: Biometrics Research, NY State, Department of Mental Hygiene, New York.

Spitzer, R.L., Williams, J.B.W., Gibbon, M., & First M.B. (1990a). SCID User’s Guide for the Structured Clinical Interview for DSM-III-R. Washington, DC: American Psychiatric Press Inc.

Spitzer, R.L., Williams, J.B.W., Gibbon, M., & First M.B. (1990b). Structured Clinical Interview for DSM-III-R – Patient Edition (SCID-P, Version 1.0). Washington, DC: American Psychiatric Press Inc.

Spitzer, R.L., Williams, J.B.W., Gibbon, M., & First M.B. (1992). The Structured Clinical Interview for the DSM-III-R (SCID). I: History, Rationale, and Description. Archives of General Psychiatry, 49, 624-629.

Spreen, O., & Strauss, E. (1991). A compendium of neuropsychological tests: Administration, Norms and Commentary. Oxford: Oxford University Press.

Steer, R.A., Beck, A.T., Brown, G., & Berchick, R.J. (1987). Self-reported depressive symptoms differentiate major depression from dysthymic disorders. Journal of Clinical Psychology, 43, 246-250.

Steer, R.A., Beck, A.T., Riskind, J., & Brown, G. (1986). Differentiation of depressive disorders from generalized anxiety by the Beck Depression Inventory. Journal of Clinical Psychology, 40, 475-478.

Stehouwer, R.S. (1985). Beck Depression Inventory. In D.J. Keyser and R.C. Sweetland (Edn.) Test Critiques. Volume II. (pp. 83-87). Missouri: Test Corporation of America.

Sundberg, N.D. (1992). Review of the Beck Depression Inventory [Revised Edition]. In Mental Measurement Yearbook 11th Edn. Buros & Mitchell: Nebraska (pp. 79-81).

Sweetland, R.C., & Keyser, D. J. (1991). Tests – A Comprehensive Reference for Assessments in Psychology, Education, and Business (3rd Edn.) Texas: Pro-ed.

Sweetland, R.C., & Keyser, D. J. (1986). Tests – A Comprehensive Reference for Assessments in Psychology, Education, and Business (2nd (Edn.)). Texas: Pro-ed.

Tanaka-Matsumi, J., & Kameoka, V.A. (1986). Reliabilities and concurrent validities of popular self-report measures of depression, anxiety, and social desirability. Journal of Consulting and Clinical Psychology, 54, 328-333.

Terry, D.J. (1992). Transition to parenthood. In P. Heaven (Ed.) Lifespan Development (Chap. 8, pp. 184-211). Sydney: HBJ .

Terry, D.J. (1991a). Predictors of Subjective Stress in a Sample of New Parents. Australian Journal of Psychology, Vol. 43(1), 29-36.

Terry, D.J. (1991b). Stress, Coping and Adaptation to new parenthood. Journal of Social and Personal Relationships, Vol. 8, 527-547.

Terry, D.J., McHugh, T.A., & Noller, P. (1991). Role Dissatisfaction and the Decline in marital Quality Across the Transition to Parenthood. Australian Journal Psychology, Vol. 43 (3), 129-132.

Terry, D.J., & Hynes, G.J. (1995). Depressive Symptomatology in New Mothers: A Stress and Coping Perspective. Manuscript submitted for publication.

Vredenburg, K., Krames, L., & Flett, G.L. (1985). Reexamining the Beck Depression Inventory: The long and short of it. Psychological Reports, 57, 767-778.

Warner, R. (1995, April). Risk factors and maternal attitudes in PND: Results from a large urban sample. Paper presented at the Marcé Society Pacific Rim Conference, Sydney, Australia.

Watson, J.P., Elliott, S.A., Rugg, A.J., & Brough, D.I. (1984). Psychiatric disorder in pregnancy and the first postnatal year. British Journal of Psychiatry, 144, 453-462.

Whiffen, V.E. (1988a). Screening for postpartum depression: A methodological note. Journal of Clinical Psychology, May Vol. 44(3), 367-371.

Whiffen, V.E. (1988b). Vulnerability to Postpartum Depression: A Prospective Multivariate Study. Journal of Abnormal Psychology, Vol. 97 (4), 467-474.

Whiffen, V.E. (1991). The comparison of Postpartum with Non-postpartum Depression: A Rose by Another Name. Journal Psychiatr Neurosci, Vol. 16(3), 160-165.

Whiffen, V.E. (1992). Is postpartum depression a distinct diagnosis? Clinical Psychology Review, 12, 485-508.

Whiffen, V.E., & Gotlib, I.H. (1989). Infants of Postpartum Depressed Mothers: Temperament and Cognitive Status. Journal of Abnormal Psychology, Vol. 98 (3), 274-279.

Whiffen, V.E., & Gotlib, I.H. (1993). Comparison of Postpartum and Nonpostpartum Depression: Clinical Presentation, Psychiatric History, and Psychological Functioning. Journal of Consulting and Clinical Psychology, 61(3), 485-494.

Williams, J.B.W., Gibbon, M., First, M.B., Spitzer, R.L., Davies, M., Borus, J., Howes, M.J., Kane, J., Pope, H.G., Rounsaville, B., & Wittchen, H.U. (1992). The Structured Clinical Interview for DSM-III-R (SCID). II. Multisite Test-Retest Reliability. Archives of General Psychiatry, 49, 630-636.

World Health Organization (1992). The ICD-10 Classification of Mental and Behavioural Disorders: Clinical descriptions and diagnostic guidelines. Geneva, Switzerland: Author (p.195)

Appendix 3 – Edinburgh Postnatal Depression Scale:

The first 6 words “As you recently had a baby” were omitted for this research in order to make the questionnaire applicable for all research groups.


Cox, Holden & Sagovsky (1987) version



We would like to know how you are feeling. Please UNDERLINE the answer which comes closest to how you have felt IN THE PAST 7 DAYS, not just how you feel today.


Here is an example, already completed.


I have felt happy:


Yes, all the time

Yes, most of the time

No, not very often

No, not at all

This would mean: “I have felt happy most of the time” during the past week.

Please complete the other questions in the same way.



In the past 7 days:

1. I have been able to laugh and see the funny side of things:

As much as I always could

Not quite so much now

Definitely not so much now

Not at all

2. I have looked forward with enjoyment to things:

As much as I ever did

Rather less than I used to

Definitely less than I used to

Hardly at all


3. I have blamed myself unnecessarily when things went wrong:

Yes, most of the time

Yes, some of the time

Not very often

No, never


4. I have been anxious or worried for no good reason:

No, not at all

Hardly ever

Yes, sometimes

Yes, very often


5. I have felt scared or panicky for no very good reason:

Yes, quite a lot

Yes, sometimes

No, not much

No, not at all


6. Things have been getting on top of me:

Yes, most of the time I haven’t been able to cope at all

Yes, sometimes I haven’t been coping as well as usual

No, most of the time I have coped quite well

No, I have been coping as well as ever


7. I have been so unhappy that I have had difficulty sleeping:

Yes, most of the time

Yes, sometimes

Not very often

No, not at all


8. I have felt sad or miserable:

Yes, most of the time

Yes, quite often

Not very often

No, not at all


9. I have been so unhappy that I have been crying:

Yes, most of the time

Yes, quite often

Only occasionally

No, never


10. The thought of harming myself has occurred to me:

Yes, quite often

Sometimes

Hardly ever

Never
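The scoring of the scale is not reproduced in this appendix. As a minimal sketch, assuming the scoring convention of the published EPDS (each item scored 0 to 3, with items 3 and 5 to 10 printed from the most to the least symptomatic answer and therefore reverse-scored) and the cut-off of >12 used in this study, a total could be computed as follows. The function names and the 0-based option index are hypothetical and chosen only for illustration.

```python
# Illustrative sketch only: helper names are hypothetical, and the scoring
# convention is assumed from the published EPDS, not from this appendix.

# Items whose first printed option is the most symptomatic answer (scored 3).
REVERSE_SCORED_ITEMS = {3, 5, 6, 7, 8, 9, 10}

# Cut-off used in this study: a total above 12 is in the clinical range.
CUT_OFF = 12


def item_score(item, option_index):
    """Score one item from the 0-based position of the underlined answer."""
    if not 1 <= item <= 10:
        raise ValueError("item must be between 1 and 10")
    if not 0 <= option_index <= 3:
        raise ValueError("option_index must be between 0 and 3")
    if item in REVERSE_SCORED_ITEMS:
        return 3 - option_index
    return option_index


def epds_total(option_indices):
    """Sum the ten item scores; option_indices lists items 1 to 10 in order."""
    if len(option_indices) != 10:
        raise ValueError("exactly ten answers are required")
    return sum(item_score(i, idx) for i, idx in enumerate(option_indices, start=1))


def above_cut_off(total, cut_off=CUT_OFF):
    """Return True when the total exceeds the cut-off."""
    return total > cut_off
```

A respondent who underlined the first printed answer of every item would be passed in as `epds_total([0] * 10)`. Keeping the cut-off as a parameter allows other thresholds to be compared without changing the scoring itself.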






Appendix II – Additional Questions to determine correct diagnosis:

Did mood and symptom co-vary? Was the symptom present long before the mood disturbance? If sleep disturbance was evident, we asked whether she was getting up frequently to feed the baby and whether she was able to go back to sleep afterwards. If she woke only to feed the baby, the item was discounted. If, however, she lay ruminating, the item was counted, especially if she was awake before the baby cried at night.
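The rule for the sleep item can be sketched as a small decision function. This is only an illustration of the rule as worded above: the function name, the argument names and the "unclear" outcome for cases the text does not cover are assumptions. Waking before the baby cried is described as strengthening the case for counting the item and is not modelled separately here.

```python
# Illustrative sketch only: names and the "unclear" outcome are hypothetical.

def sleep_item_decision(woke_only_to_feed_baby, lay_ruminating):
    """Decide whether a reported sleep disturbance counts as a symptom.

    Returns "counted", "discounted", or "unclear" when neither condition
    described in the appendix applies.
    """
    if lay_ruminating:
        # Lying awake ruminating was counted as a symptom.
        return "counted"
    if woke_only_to_feed_baby:
        # Waking purely to feed the baby was discounted.
        return "discounted"
    # The appendix gives no rule for other patterns of sleep disturbance.
    return "unclear"
```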

Appendix III – SCID Australian Demographics:

The SCID Ethnicity and Education sections (of the demographic data) were Australianised. For each section, the original SCID categories are listed first, followed by the Australianised categories.

Ethnicity, original SCID categories:

Black, not of Hispanic origin
White, not of Hispanic origin
American Indian or Alaskan native
Asian or Pacific Islander

Ethnicity, Australianised categories:

1. Australian born (not Aboriginal/Torres Straits Islander)
2. Aboriginal Australian
3. Migrated but naturalized (Country of Origin__________________)
4. Permanent Resident (Country of Origin_____________________)
5. Other: _______________________________________

Education, original SCID categories:

Grade 6 or less
Grade 7 to 12 (without graduating high school)
Graduated High School or equivalent
Part College
Graduated 2 year college
Graduated 4 year college
Part graduate/ professional school
Completed graduate/ professional school

Education, Australianised categories:

1. No School Certificate
2. School Certificate or equivalent
3. Did year 11/12 but dropped out
4. High School Certificate or equivalent (TPC)
5. Technical/ College diploma
6. Part of University degree
7. University degree
8. Postgraduate University Degree