pubmed.ncbi.nlm.nih.gov

Reliability of the American Academy of Sleep Medicine Rules for Assessing Sleep Depth in Clinical Practice

  • Mon Jan 01 2018

Magdy Younes et al. J Clin Sleep Med. 2018.

Abstract

Study objectives: The American Academy of Sleep Medicine has published manuals for scoring polysomnograms that recommend time spent in non-rapid eye movement sleep stages (stage N1, N2, and N3 sleep) be reported. Given the well-established large interrater variability in scoring stage N1 and N3 sleep, we determined the range of time in stage N1 and N3 sleep scored by a large number of technologists when compared to reasonably estimated true values.

Methods: Polysomnograms of 70 females were scored by 10 highly trained sleep technologists, two each from five different academic sleep laboratories. Range and confidence interval (CI = difference between the 5th and 95th percentiles) of the 10 times spent in stage N1 and N3 sleep assigned in each polysomnogram were determined. Average values of times spent in stage N1 and N3 sleep generated by the 10 technologists in each polysomnogram were considered representative of the true values for the individual polysomnogram. Accuracy of different technologists in estimating delta wave duration was determined by comparing their scores to digitally determined durations.
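The per-polysomnogram summary described in the Methods can be sketched roughly as follows. This is a minimal illustration, not the authors' code: the function names, the example values, and the choice of linear interpolation for the percentiles are all assumptions, since the abstract does not say how the 5th and 95th percentiles were computed.

```python
from statistics import mean


def percentile(values, p):
    """Percentile by linear interpolation between order statistics.

    The interpolation method is an assumption; the abstract does not specify one.
    """
    ordered = sorted(values)
    if not ordered:
        raise ValueError("values must be non-empty")
    rank = (len(ordered) - 1) * p / 100.0
    lower = int(rank)
    upper = min(lower + 1, len(ordered) - 1)
    frac = rank - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * frac


def summarize_scores(scores):
    """Summarize the stage times assigned to one polysomnogram by several scorers."""
    p5 = percentile(scores, 5)
    p95 = percentile(scores, 95)
    return {
        "range": max(scores) - min(scores),  # full spread of the scores
        "ci": p95 - p5,                      # CI as defined in the Methods
        "reference": mean(scores),           # average used as the reference value
    }


# Made-up stage N1 times (% TST) from ten hypothetical scorers for one recording.
example_n1 = [4.0, 6.0, 7.0, 8.0, 9.0, 10.0, 11.0, 12.0, 14.0, 19.0]
summary = summarize_scores(example_n1)
print(summary)
```

With only ten scores per recording, the percentile convention can shift the CI noticeably, so any reimplementation should state which one it uses.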

Results: The CI of the ten N1 scores ranged from 4 to 39 percent of total sleep time (% TST) across polysomnograms (mean CI ± standard deviation = 11.1 ± 7.1 % TST). The corresponding range for N3 was 1 to 28 % TST (14.4 ± 6.1 % TST). For stage N1 and N3 sleep, very low or very high values were reported for virtually all polysomnograms by different technologists. Technologists varied widely in their assignment of stage N3 sleep, scoring that stage when the digitally determined time of delta waves ranged from 3 to 17 seconds.

Conclusions: Manual scoring of non-rapid eye movement sleep stages is highly unreliable among highly trained, experienced technologists. Measures of sleep continuity and depth that are reliable and clinically relevant should be a focus of clinical research.

Keywords: digital sleep analysis; interrater variability; sleep depth.

© 2018 American Academy of Sleep Medicine

Figures

Figure 1. Frequency of agreement among the 10 technologists when a given stage is scored by at least one of the technologists.

Bars are standard deviations. The continuous line is the cumulative percentage. Note that only 1.5 ± 1.2% of epochs scored by any technologist as stage N1 sleep received a unanimous (10 of 10) N1 score, and a majority N1 score (6 of 10) was reached in only 18% of such cases. A similar pattern was observed for stage N3 sleep.
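The agreement measure in Figure 1 can be illustrated with a short sketch. It assumes each epoch is represented as a tuple of stage labels, one per scorer; the function name `agreement_distribution` and the sample data are hypothetical and are not taken from the study.

```python
from collections import Counter


def agreement_distribution(epoch_scores, stage):
    """Percentage of epochs by number of scorers who assigned `stage`.

    Only epochs where at least one scorer assigned `stage` are counted,
    mirroring the condition stated in the figure caption.
    """
    counts = Counter()
    for labels in epoch_scores:
        n_agree = sum(1 for label in labels if label == stage)
        if n_agree >= 1:
            counts[n_agree] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {n: 100.0 * c / total for n, c in sorted(counts.items())}


# Made-up labels from three hypothetical scorers for five epochs.
epochs = [
    ("N1", "N1", "N1"),
    ("N1", "N2", "N2"),
    ("N2", "N2", "N2"),
    ("N1", "N1", "W"),
    ("N1", "N2", "W"),
]
distribution = agreement_distribution(epochs, "N1")
print(distribution)
```

In this toy data, unanimous N1 agreement occurs in one of the four epochs that any scorer labeled N1. The cumulative line in the figure would be the running sum of these percentages.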

Figure 2. Duration of different sleep stages scored by individual technologists as a function of the average (of 10 scores) duration of the stage in individual PSGs.

Each PSG is represented by 10 points aligned at the average of the 10 durations. Technologists are represented by different symbols. The upper and lower irregular lines join the 95th and 5th percentiles, respectively, of the individual PSGs. The solid diagonal line is the line of identity. Note that the confidence interval (the difference between the upper and lower lines) is quite variable and generally quite wide for stage N1 and stage N3 sleep. Three PSGs are not represented in the N1 panel because their results would nearly double both axes, compressing most of the data into one corner. The averages (confidence intervals) of N1 durations in these three PSGs were 20.7 (13.3, 25.4), 28.0 (15.2, 49.3), and 36.5 (22.9, 45.1). PSG = polysomnogram, TRT = total recording time, TST = total sleep time.

Figure 3. Frequency of scoring stage N3 sleep by the 10 technologists in epochs with different total delta wave duration.

Delta wave duration is the sum of durations of all delta waves identified digitally in each 30-second epoch. Arrows represent ideal scoring; no N3 until delta duration exceeds 6 seconds and N3 scored in all epochs with delta duration > 6 seconds. Numbers in the upper section represent the number of epochs examined within each delta duration range.
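The ideal scoring rule in the Figure 3 caption, and the kind of per-bin N3 frequency the figure plots, can be sketched as below. The 6-second cutoff comes from the caption (6 seconds is 20% of a 30-second epoch). The bin width, the record layout, the function names, and the sample numbers are assumptions made for illustration.

```python
from collections import defaultdict

EPOCH_SECONDS = 30.0
N3_THRESHOLD_SECONDS = 6.0  # cutoff from the caption; 20% of a 30-second epoch


def ideal_n3(delta_seconds):
    """Ideal rule from the caption: N3 only when delta duration exceeds 6 seconds."""
    return delta_seconds > N3_THRESHOLD_SECONDS


def n3_frequency_by_bin(records, bin_width=2.0):
    """Percentage of scorer votes that were N3, grouped by delta-duration bin.

    records: iterable of (delta_seconds, n3_votes, n_scorers) per epoch.
    The bin width is an assumed value, not one taken from the paper.
    """
    scored = defaultdict(int)
    possible = defaultdict(int)
    for delta_seconds, n3_votes, n_scorers in records:
        bin_start = int(delta_seconds // bin_width) * bin_width
        scored[bin_start] += n3_votes
        possible[bin_start] += n_scorers
    return {b: 100.0 * scored[b] / possible[b] for b in sorted(possible)}


# Made-up epochs: (digitally measured delta seconds, N3 votes, number of scorers).
records = [(1.0, 0, 10), (5.0, 3, 10), (5.5, 5, 10), (7.0, 8, 10), (12.0, 10, 10)]
frequencies = n3_frequency_by_bin(records)
print(frequencies)
```

Under ideal scoring, every bin below the threshold would read 0% and every bin above it 100%. Intermediate values such as those in the toy output correspond to the scorer disagreement the figure is meant to show.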
