Author information
- Received December 17, 2018
- Revision received January 23, 2019
- Accepted February 14, 2019
- Published online February 3, 2020.
- Kenya Kusunose, MD, PhDa,∗
- Takashi Abe, MD, PhDb,
- Akihiro Haga, PhDc,
- Daiju Fukuda, MD, PhDa,
- Hirotsugu Yamada, MD, PhDa,
- Masafumi Harada, MD, PhDb and
- Masataka Sata, MD, PhDa
- aDepartment of Cardiovascular Medicine, Tokushima University Hospital, Tokushima, Japan
- bDepartment of Radiology and Radiation Oncology, Graduate School of Biomedical Sciences, Tokushima University, Tokushima, Japan
- cDepartment of Medical Image Informatics, Graduate School of Biomedical Sciences, Tokushima University, Tokushima, Japan
- ∗Address for correspondence:
Dr. Kenya Kusunose, Department of Cardiovascular Medicine, Tokushima University Hospital, 2-50-1 Kuramoto, Tokushima, Japan.
Objectives This study investigated whether a deep convolutional neural network (DCNN) could provide improved detection of regional wall motion abnormalities (RWMAs) and differentiate among groups of coronary infarction territories from conventional 2-dimensional echocardiographic images compared with that of cardiologists, sonographers, and resident readers.
Background An effective intervention for reduction of misreading of RWMAs is needed. The hypothesis was that a DCNN trained using echocardiographic images would provide improved detection of RWMAs in the clinical setting.
Methods A total of 300 patients with a history of myocardial infarction were enrolled. From this cohort, 3 groups of 100 patients each had infarctions of the left anterior descending (LAD) artery, the left circumflex (LCX) branch, and the right coronary artery (RCA). A total of 100 age-matched control patients with normal wall motion were selected from a database. Each case contained cardiac ultrasonographs from short-axis views at end-diastolic, mid-systolic, and end-systolic phases. After the DCNN underwent 100 steps of training, diagnostic accuracies were calculated from the test set. Independently, 10 versions of the same model were trained, and ensemble predictions were performed using those versions.
Results For detection of the presence of WMAs, the area under the receiver-operating characteristic curve (AUC) produced by the deep learning algorithm was similar to that produced by the cardiologists and sonographer readers (0.99 vs. 0.98, respectively; p = 0.15) and significantly higher than the AUC result of the resident readers (0.99 vs. 0.90, respectively; p = 0.002). For detection of territories of WMAs, the AUC by the deep learning algorithm was similar to the AUC by the cardiologist and sonographer readers (0.97 vs. 0.95, respectively; p = 0.61) and significantly higher than the AUC by resident readers (0.97 vs. 0.83, respectively; p = 0.003). From a validation group at an independent site (n = 40), the AUC by the deep learning algorithm was 0.90.
Conclusions The present results support the possibility of using DCNN for automated diagnosis of RWMAs in the field of echocardiography.
Two-dimensional echocardiography is currently the most widely used noninvasive imaging modality for evaluating regional wall motion abnormalities (RWMAs) in patients with coronary artery disease. Assessment of RWMAs by trained echocardiogram technicians in patients with chest pain in the emergency department is a Class I recommendation by the American College of Cardiology/American Heart Association and the European Heart Association (1,2). Identification of patients with RWMAs is useful for detecting significant occult coronary artery disease not evident from symptoms, electrocardiography, or initial cardiac biomarkers. However, conventional assessment of RWMAs, which is based on visual interpretation of endocardial excursion and myocardial thickening, is subjective and depends on experience (3). An effective intervention for reduction of misreading of RWMAs is needed (4–6).
Machine learning helps computers learn and develop rules without requiring human instruction at every stage. Recently, deep learning has become a powerful method for the detection and classification of several diseases in many medical fields (7–12), and it may be a useful artificial intelligence tool for the assessment of cardiovascular disease (13–16). Conventional machine learning usually requires predefined measurements to characterize the information in the input image (17). In contrast, deep learning computes results directly from the raw input, without this predefinition step (7,18). In addition, the deep layers of a convolutional neural network can extract detailed low-level information from the original image and may be useful for detecting echocardiographic abnormalities (19,20). We hypothesized that a deep convolutional neural network (DCNN) trained with echocardiographic images would provide improved detection of RWMAs in the clinical setting. This study sought to demonstrate that a DCNN can automatically provide improved differentiation among groups of coronary infarction territories, using conventional 2-dimensional echocardiographic images, compared with inexperienced and expert readers.
A total of 400 patients who had undergone coronary angiography to evaluate coronary artery disease were retrospectively enrolled. In this cohort, 300 patients had prior myocardial infarctions. Briefly, 100 patients had an anterior infarction (isolated left anterior descending artery [LAD] branch disease), 100 patients had an inferolateral infarction (isolated left circumflex artery [LCX] branch disease), and 100 patients had an inferior infarction (isolated right coronary artery [RCA] disease). The age-matched control group consisted of 100 patients without obstructive coronary artery disease. None of the patients had atrial fibrillation or severe valvular disease. Images with good or adequate acoustic detail were selected on the basis of visualization of the left ventricular (LV) walls and endocardium in order to test the deep learning algorithm using echocardiographic images. To overcome the issue of generalizability, a separate validation group of 40 patients was gathered who were referred for coronary angiography from an independent site (Hoetsu Hospital, Tokushima, Japan). In this cohort, there were 10 patients with LAD artery asynergy, 5 patients with LCX artery asynergy, 9 patients with RCA asynergy, and 16 patients without asynergy. The Institutional Review Board of Tokushima University Hospital approved the study protocol (number 3217).
Echocardiography was performed using a commercially available ultrasonography machine (Vivid E9 and E95, GE Healthcare, Waukesha, Wisconsin). All echocardiographic measurements were obtained according to American Society of Echocardiography recommendations (21). All images were stored digitally for playback and analysis. RWMAs were interpreted visually, by consensus, by 10 cardiologists and sonographers and by 10 resident observers using short-axis views (22). Three territories (the LAD and LCX arteries and the RCA) were evaluated using the coronary angiographs combined with wall motion evaluation on apical and short-axis views. Consensus expert agreement on RWMAs in the echocardiographic images was used as the gold standard (the experts, K.K. and H.Y., have >10 years' experience with echocardiography). These readers were able to take into account additional information available from angiography and ventriculography, and their classifications were blinded to the results of the other analyses. The mid-level short-axis view, which yields a circular cross-section of the LV, was used. The LAD artery feeds segments of the anterior septum and anterior free wall; the RCA feeds segments of the inferior septum and inferior free wall; and the LCX branch feeds segments of the inferolateral and lateral walls.
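The coronary distribution described above can be expressed as a simple lookup table; the following is a sketch in which the segment keys are illustrative shorthand, not the formal segment nomenclature:

```python
# Mid-level short-axis wall-to-territory mapping, following the coronary
# distribution described in the text (keys are illustrative shorthand).
TERRITORY = {
    "anterior_septum": "LAD",
    "anterior_free_wall": "LAD",
    "inferior_septum": "RCA",
    "inferior_free_wall": "RCA",
    "inferolateral_wall": "LCX",
    "lateral_wall": "LCX",
}

def territory_of(segment):
    """Return the coronary territory feeding a given mid-level wall segment."""
    return TERRITORY[segment]
```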
The process for importing data is shown in Figure 1. Each case contained cardiac ultrasonographs from mid-level short-axis views. To adjust for differences in frame rate and heart rate among patients, images at the end-diastolic, mid-systolic, and end-systolic phases were used. All DICOM (Digital Imaging and Communications in Medicine) images were transformed into 128- × 128-pixel portable network graphics images with downsampling. The data from the 3 territories, as well as the control group data, were divided into a training set and a test set (80:20), so that the total of 400 cases with 1,200 images was split into 256 cases (768 images) as the training set, 64 cases (192 images) as the validation set, and 80 cases (240 images) as the test set. Two steps of analysis were performed: step 1 of the protocol detected the presence of RWMAs, and step 2 detected the territory of the RWMAs.
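The case-level split above (400 cases with 3 frames each, divided into 256 training, 64 validation, and 80 test cases) can be sketched as follows; the actual randomization used by the authors is not specified, so the shuffling and seed here are assumptions:

```python
import random

def split_cases(case_ids, seed=0):
    """Split case IDs 80:20 into (train+validation):test, then split the
    larger portion 80:20 again into train:validation, mirroring the
    256/64/80 division of 400 cases described in the text."""
    rng = random.Random(seed)  # assumed seed; not specified in the paper
    ids = list(case_ids)
    rng.shuffle(ids)
    n_test = int(len(ids) * 0.2)       # 80 of 400 cases
    test, rest = ids[:n_test], ids[n_test:]
    n_val = int(len(rest) * 0.2)       # 64 of the remaining 320 cases
    val, train = rest[:n_val], rest[n_val:]
    return train, val, test

train, val, test = split_cases(range(400))
# 256 training cases (768 images), 64 validation (192), 80 test (240)
```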
Deep learning model
The overall process is shown in the Central Illustration. Detection of the presence and territory of RWMAs was accomplished using DCNNs; the ResNet, DenseNet, Inception-ResNet, Inception, and Xception architectures were used (23–25). To produce the final classification, 1 fully connected layer with 50% dropout was added to each model. The DCNN, trained on images assessed by expert cardiologists, was used to estimate the probability of RWMAs in the LAD artery, LCX artery, and RCA territories, and the maximum probability was taken as the probability of disease for the patient. The fully connected layers transformed the image features into the final LAD artery, LCX artery, and RCA scores by adjusting the weights of neuron activations during training. Model training was performed using a graphics processing unit (GeForce GTX 1080 Ti, NVIDIA, Santa Clara, California). During training, the network measured how far its output was from the actual output, and this error was minimized using the cross-entropy loss function (26). The Adam optimization algorithm was used for training. After 100 steps of training, diagnostic accuracy was calculated using the test set. Independently, 10 versions of the same DCNN model were trained, and their outputs on the test set were combined by ensemble prediction. Majority voting, a representative ensemble method that combines the outputs of multiple trained classifiers, was used to score the probability of RWMAs from the 10 versions. These models were trained using the same initialization and learning rate policies. Deep learning was performed using the Python version 3.5 programming language (Python Software Foundation, Beaverton, Oregon) with Keras version 2.1.5 software (GitHub, San Francisco, California). The code has been uploaded to GitHub.
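The majority-voting ensemble over the 10 trained model versions can be illustrated with a minimal sketch (the class labels and votes below are illustrative; the paper's models output per-territory probabilities):

```python
from collections import Counter

def majority_vote(predictions):
    """Combine the predicted labels of several independently trained
    models by majority vote; the most frequent label wins."""
    return Counter(predictions).most_common(1)[0][0]

# e.g., 10 model versions voting on one test case
votes = ["LAD", "LAD", "RCA", "LAD", "normal", "LAD",
         "LAD", "LCX", "LAD", "LAD"]
majority_vote(votes)  # 'LAD'
```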
Data are mean ± SD. Differences between multiple groups were analyzed by ANOVA, followed by Tukey post hoc analysis. The diagnostic performance of the deep learning algorithm and of the observers was evaluated using receiver-operating characteristic (ROC) analysis and pairwise comparisons of the area under the ROC curve (AUC) according to the DeLong method (27). Statistical analysis was performed using commercially available software (SPSS version 21.0, SPSS Inc., Chicago, Illinois; and MedCalc version 17, MedCalc Software, Mariakerke, Belgium). Statistical significance was defined as a p value <0.05.
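The AUCs compared here are empirical AUCs, equivalent to the normalized Mann-Whitney U statistic; a minimal sketch follows (the DeLong covariance estimate used for the pairwise p values is omitted, and the scores are made up):

```python
def empirical_auc(pos_scores, neg_scores):
    """Probability that a randomly chosen positive case receives a higher
    score than a randomly chosen negative case (ties count as 0.5).
    This is the Mann-Whitney U statistic divided by n_pos * n_neg."""
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))

empirical_auc([0.9, 0.8, 0.7], [0.75, 0.2, 0.1])  # = 8/9, about 0.889
```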
Characteristics of the subjects included in this study are shown in Table 1. The study population consisted of 300 patients with coronary artery disease (CAD) and 100 patients without CAD. LV ejection fraction was significantly lower in the LAD artery group than in the other groups, and the wall motion score index was higher in the LAD artery group than in the other groups. Figure 2 shows the value of the loss function in the training and validation sets during training of a DCNN model. As shown in this figure, the model converged near the 100th epoch, and the data distribution ranges were narrow.
Detection of RWMAs
For detection of the presence of wall motion abnormality (wall motion abnormality vs. control), the AUC produced by the deep learning algorithm (ResNet) was similar to that by cardiologist and sonographer readers (0.99 vs. 0.98, respectively; p = 0.15) and significantly higher than the AUC produced by resident readers (0.99 vs. 0.90, respectively; p = 0.002).
Results of the ROC analysis used to assess the diagnostic ability to detect the territories of RWMAs are shown in Figure 3. AUCs were used to compare several deep learning algorithms for detection of the territories of wall motion abnormality (the LAD artery vs. the LCX artery vs. the RCA vs. control subjects). The algorithm with the largest AUC was ResNet (AUC: 0.97), and there were no significant differences among the deep learning algorithms except for the Xception model (ResNet AUC: 0.97; DenseNet AUC: 0.95; Inception-ResNet AUC: 0.89; Inception AUC: 0.90; Xception AUC: 0.85 vs. the other algorithms, p < 0.05). For detection of territories of wall motion abnormality, the AUC produced by the deep learning algorithm (ResNet) was similar to that produced by the cardiologist and sonographer readers (0.97 vs. 0.95, respectively; p = 0.61) and significantly higher than the AUC produced by the resident readers (0.97 vs. 0.83, respectively; p = 0.003) (Supplemental Figure 1). To assess diagnostic performance for each territory separately, ROC curves were added in Supplemental Figure 2; all AUCs were good.
To check the accuracy of RWMA identification for each coronary territory, odds ratios for misclassification by deep learning vs. the cardiologists and sonographers were calculated (Figure 4). Deep learning had a relatively low rate of misclassification for the RCA, except for the Xception model, and also had relatively low rates of misclassification in the control group.
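The misclassification odds ratios in Figure 4 compare two readers' error odds from their wrong/total counts; a sketch with hypothetical counts (the actual per-territory counts appear in the figure):

```python
def misclassification_odds_ratio(wrong_a, total_a, wrong_b, total_b):
    """Odds ratio of misclassification for reader A vs. reader B,
    computed from each reader's wrong and total case counts."""
    odds_a = wrong_a / (total_a - wrong_a)
    odds_b = wrong_b / (total_b - wrong_b)
    return odds_a / odds_b

# hypothetical counts: reader A misreads 5 of 80 cases, reader B 12 of 80
misclassification_odds_ratio(5, 80, 12, 80)  # about 0.378
```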
Moreover, the top 10 cases of RWMAs misclassified by deep learning (ResNet) and by cardiologist and sonographer readers were selected. Interestingly, the cases misclassified by the cardiologist and sonographer readers and by deep learning were very similar: the misclassifications matched in 8 of 10 cases. Thus, it is possible that the deep learning readings resembled those of the cardiologists and sonographers. In addition, the characteristics of patients misclassified by deep learning are shown in Supplemental Table 1. There were no statistically significant differences between the correctly classified and misclassified groups, but LV size (left ventricular end-diastolic volume index [LVEDVi]) was slightly larger in the misclassified group. One possible explanation is that relatively few patients with large LV size were available for development of the deep learning model. Such difficult cases may need to be included when developing deep learning models in future studies.
For detection of territories of wall motion abnormality in the separate validation group of 40 patients from the independent site, the AUC produced by the deep learning algorithm was 0.90, slightly smaller than the AUC in the original cohort. One explanation is that the original cohort had an equal distribution across territories, whereas classification performance was relatively low for LAD artery asynergy, which was the most common finding in the newly gathered data.
Interpretation of wall motion abnormalities with echocardiography is observer-dependent and requires experience. An inexperienced reader sometimes misinterprets a wall motion abnormality, and significant training is required to become an expert. A deep learning algorithm is an objective method with no intraobserver error, and its accuracy is similar to that of visual assessment by experts. This diagnostic system can be a useful tool for classifying RWMAs in clinical evaluation. However, because the number of patients examined was limited, the present study should be considered a proof of concept, and the present authors believe that larger prospective multicenter studies are warranted.
Comparison with previous automated analysis
The use of quantitative assessment was expected to improve the accuracy and objectivity of echocardiographic image analysis, and several methods for measuring cardiac wall motion, strain, and strain rate have been developed for echocardiographic images (4,28,29). However, the reproducibility of quantitative measurements in echocardiography has been limited by interobserver and intraobserver variability. Recently, other groups have developed automated algorithms for the analysis of left ventricular function and endocardial border detection (30,31). However, most of those algorithms remain semiautomatic: observer input is initially needed to manually annotate important landmarks (e.g., mitral plane, apex). Fully automated assessment is needed to obtain quantitative results without any user interaction (e.g., marker positioning, contour drawing or modification). The present results demonstrate that a DCNN can be trained to identify wall motion abnormality on echocardiographic images. The accuracy of the deep learning algorithm was superior to that of inexperienced observers and similar to that of expert observers. The authors believe this study is a milestone signaling future application of deep learning algorithms to the reading of echocardiographic images.
Deep learning for echocardiography
Previous machine learning approaches requiring the extraction and integration of pre-identified imaging measurements have shown the possibility of automated detection of cardiovascular disease (6). This paper describes the development of an objective classification model for RWMAs based on a deep learning algorithm. Although the number of cases with images was relatively small, performance was improved using this algorithm. The trained model showed good agreement between the deep learning diagnosis and expert consensus, and a simple, readily available algorithm achieved high accuracy. Further significant improvements in deep learning could possibly be achieved by integrating additional imaging and clinical data. These encouraging results could help less-experienced observers improve their diagnostic accuracy, because agreement between less-experienced observers and experts is often low. In addition, resident readers had relatively high rates of misclassification compared with the deep learning algorithm (ResNet) for the RCA and LCX arteries (RCA odds ratio: 3.9; LCX artery odds ratio: 2.2). Thus, deep learning algorithms may have the potential to assist diagnosis in the RCA and LCX territories.
Although there are advantages to using deep learning for echocardiography, a major limitation is that echocardiographic images are an unstructured dataset: image quality depends on the machine vendor and software version. In the clinical setting, several vendors with many software versions are in use, so normalization of images across vendors will be required when this algorithm is applied in multivendor clinical settings.
Another limitation of deep learning is that the reasons for differences among deep learning methods, and why they behave differently, are unclear. In the present study, 5 deep learning models were applied to differentiate among echocardiographic images. Clearly, the numbers of parameters and layers differ among the models used in this study. One advantage of a deep learning model over other types of machine learning models is that it automatically constructs appropriate features in its intermediate layers. The extracted features may differ among deep learning models because of their different numbers of parameters and layers; this may also be the reason why one model is superior to another. Unfortunately, this field currently offers no clear explanation, and it remains an open engineering issue.
Interpretation of wall motion abnormalities using echocardiography is observer-dependent and requires experience. Assessment of RWMAs using the deep learning algorithm is an objective method with no intraobserver error, and its accuracy was equal to that of consensus assessments by experts. Artificial intelligence-based echocardiographic assessment may not be necessary for experts; however, quantitative assessment is another advantage of artificial intelligence. In the future, the authors plan to expand the classifications to identify different levels of RWMAs at the segment level and to include images from stress echocardiography. The authors also plan to apply an algorithm to differentiate among several cardiovascular diseases.
This study of deep learning applied to echocardiographic data has several limitations. First, RWMA assessment was based on the results of echocardiography, coronary angiography, and left ventriculography by expert consensus. Second, only echocardiographic images at the mid-level short-axis view were used, acquired in only 1 cycle, to ensure applicability to a simple imaging protocol used in clinical routine; identification of apical abnormalities was not tested, and patients were chosen with infarcts involving the mid segments. A larger set of training data could possibly allow further improvement (32). Third, the echocardiographic images do not consist of structured data and cannot be reconfigured; thus, diagnostic accuracy may be influenced by image quality. Fourth, the patients enrolled in this study had single-vessel disease, so the authors were unable to assess patients with multivessel disease; future big-data studies may include multivessel coronary disease. Fifth, the number of patients was relatively limited; deep learning algorithms generally require thousands of patients and roughly 10 times that number of images. On the other hand, in the present analysis, the diagnostic accuracy of deep learning appeared good in the independent test cohort. The present authors believe that this report can serve as an impetus for a future large multicenter study. The results confirm in principle that a DCNN may be very informative for interpreting regional wall motion abnormalities, but a study with larger numbers of patients should be performed to assess the efficacy of the automatic classification system in the clinical setting.
Our results support the possibility of using a DCNN for automated diagnosis of myocardial ischemia in the field of echocardiography.
COMPETENCY IN MEDICAL KNOWLEDGE: A deep learning algorithm is an objective method with no intraobserver error, and its accuracy seems to be equal or superior to that of visual assessment by experts.
COMPETENCY IN PATIENT CARE AND PROCEDURAL SKILLS: Regional wall motion abnormality should be carefully assessed in the clinical setting. Present results suggest that a deep learning algorithm is a useful method for detecting regional wall motion abnormalities in patients with suspected coronary artery disease.
TRANSLATIONAL OUTLOOK: Although this study suggests the utility of detecting regional wall motion abnormalities by using a deep learning algorithm, this deep learning model should be improved using a larger cohort of coronary artery disease patients.
The authors thank Kathryn Brock, BA, for editing the manuscript. The authors also thank Natsumi Yamaguchi for gathering data and Hoetsu Hospital for providing data.
Partially supported by Japan Society for the Promotion of Science Grants-in-Aid for Scientific Research (KAKENHI) grant 17K09506. The authors have reported that they have no relationships relevant to the contents of this paper to disclose.
- Abbreviations and Acronyms
- DCNN = deep convolutional neural network
- LAD = left anterior descending artery
- LCX = left circumflex artery
- RCA = right coronary artery
- RWMA = regional wall motion abnormality
- 2020 American College of Cardiology Foundation
- Qazi M, Fung G, Krishnan S, et al. Automated heart wall motion abnormality detection from ultrasound images using Bayesian networks. In: Sangal R, Mehta H, Bagga RK, editors. Proceedings of the 20th International Joint Conference on Artificial Intelligence. San Francisco, CA: Morgan Kaufmann Publishers Inc., 2007:519–25.
- Chollet F. Xception: deep learning with depthwise separable convolutions. arXiv preprint 2017;1610.02357.
- Szegedy C, Vanhoucke V, Ioffe S, Shlens J, Wojna Z. Rethinking the inception architecture for computer vision. Presented at: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition; 2016:2818–26.
- Deng J, Dong W, Socher R, Li L-J, Li K, Fei-Fei L. ImageNet: a large-scale hierarchical image database. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR); 2009:248–55.
- Ciresan DC, Meier U, Gambardella LM, Schmidhuber J. Convolutional neural network committees for handwritten character classification. Presented at: International Conference on Document Analysis and Recognition (ICDAR); 2011:1135–9.
- Yang L, Georgescu B, Zheng Y, Foran DJ, Comaniciu D. A fast and accurate tracking algorithm of left ventricles in 3D echocardiography. Proceedings of the IEEE International Symposium on Biomedical Imaging; 2008:221.