Disclosure of Invention
The invention mainly aims to provide a ventricular premature beat identification system and method based on classifier fusion and diagnosis rules, so as to overcome the defects in the prior art.
In order to achieve the purpose, the technical scheme adopted by the invention comprises the following steps:
the embodiment of the invention provides a ventricular premature beat identification system based on classifier fusion and diagnosis rules, which comprises:
the classification unit comprises an LCNN classification module and an RNN classification module, the LCNN classification module and the RNN classification module are used for independently processing electrocardiogram data, the LCNN classification module comprises m first classifiers with different structures, the m first classifiers are at least used for outputting m first classification results, the RNN classification module comprises n second classifiers with different structures, the n second classifiers are at least used for outputting n second classification results, and m and n are positive integers;
the fusion unit is used for performing fusion decision on the m first classification results and the n second classification results output by the classification unit according to a fusion decision rule to obtain a fusion result, wherein the fusion result comprises non-PVC data and PVC data;
and the judging unit is at least used for judging the non-PVC data and the PVC data judged by the fusion unit according to the PVC pathological characteristics to obtain a PVC identification result.
In some exemplary embodiments, the first classifier uses a Lead Convolutional Neural Network (LCNN).
In some exemplary embodiments, the second classifier employs a Recurrent Neural Network (RNN).
In some exemplary embodiments, the ventricular premature beat identification system further comprises: the preprocessing unit is used for preprocessing a raw Electrocardiogram (ECG) signal and inputting the preprocessed signal into the classifying unit.
Further, the preprocessing unit comprises a filter at least for removing baseline drift noise and/or power frequency interference noise.
In some exemplary embodiments, the fusion unit includes:
the first fusion module is used for carrying out fusion decision on the m first classification results output by the m first classifiers according to an addition fusion decision rule;
the second fusion module is used for carrying out fusion decision on the n second classification results output by the n second classifiers according to an addition fusion decision rule;
and the third fusion module is used for performing fusion decision on the fusion results output by the first fusion module and the second fusion module according to a mean fusion decision rule so as to obtain a final fusion result.
The embodiment of the invention also provides a ventricular premature beat identification method based on classifier fusion and diagnosis rules, which comprises the following steps:
processing electrocardiogram data by using m first classifiers with different structures in an LCNN classification module to output m first classification results;
processing the electrocardiogram data by adopting n second classifiers with different structures in the RNN classification module to output n second classification results;
performing fusion decision on the m output first classification results and the n output second classification results according to a fusion decision rule to obtain a fusion result, wherein the fusion result comprises non-PVC data and PVC data;
and judging the non-PVC data and the PVC data judged by the fusion unit according to the PVC pathological characteristics to obtain a PVC identification result.
Compared with the prior art, the invention has the advantages that:
The ventricular premature beat identification system and method based on classifier fusion and diagnosis rules take full account of the respective advantages of the LCNN and the RNN: each is used as a base classifier for ensemble learning, and the classification results of the two classifier types are then fused to obtain a better PVC classification result. In addition, some pathological characteristics of PVC are incorporated, so that machine learning is combined with disease diagnosis rules, which improves both the overall classification performance and the accuracy of PVC identification.
Detailed Description
In view of the deficiencies in the prior art, the inventors arrived at the technical solution of the present invention after extensive research and practice. The principle of the invention is as follows: PVC is first identified by a classifier fusion method, which divides the data into predicted non-PVC and predicted PVC; the two predicted groups are then each judged again using some pathological characteristics of PVC, so that the accuracy of PVC identification is improved. The technical solution, its implementation and its principles are further explained below.
One aspect of the embodiments of the present invention provides a ventricular premature beat identification system based on classifier fusion and diagnosis rules, which includes:
the classification unit comprises an LCNN classification module and an RNN classification module, the LCNN classification module and the RNN classification module are used for independently processing electrocardiogram data, the LCNN classification module comprises m first classifiers with different structures, the m first classifiers are at least used for outputting m first classification results, the RNN classification module comprises n second classifiers with different structures, the n second classifiers are at least used for outputting n second classification results, and m and n are positive integers;
the fusion unit is used for performing fusion decision on the m first classification results and the n second classification results output by the classification unit according to a fusion decision rule to obtain a fusion result, wherein the fusion result comprises non-PVC data and PVC data;
and the judging unit is at least used for judging the non-PVC data and the PVC data judged by the fusion unit according to the PVC pathological characteristics to obtain a PVC identification result.
In some exemplary embodiments, the first classifier employs a lead convolutional neural network.
In some exemplary embodiments, the second classifier employs a recurrent neural network.
In some exemplary embodiments, the ventricular premature beat identification system further comprises a preprocessing unit, which is used for preprocessing the original electrocardiogram signal and inputting the preprocessed signal into the classification unit.
Further, the preprocessing unit comprises a filter, which is used at least for denoising, specifically for removing noise such as baseline drift and power frequency interference.
In some exemplary embodiments, the fusion unit includes:
the first fusion module is used for carrying out fusion decision on the m first classification results output by the m first classifiers according to an addition fusion decision rule;
the second fusion module is used for carrying out fusion decision on the n second classification results output by the n second classifiers according to an addition fusion decision rule;
and the third fusion module is used for performing fusion decision on the fusion results output by the first fusion module and the second fusion module according to a mean fusion decision rule so as to obtain a final fusion result.
In some exemplary embodiments, the first fusion module performs an additive fusion decision rule by using the following formula:
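wherein P_{LCNN-j} represents the fusion result of the i first classification results, i is an integer of 2 or more, t_{mj} represents the probability value of belonging to the j-th class obtained from the m-th first classification result, and j is 0 or 1, wherein 0 represents non-PVC data and 1 represents PVC data.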
Preferably, the formula adopted by the decision rule of the second fusion module for additive fusion is as follows:
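wherein P_{RNN-j} represents the fusion result of the g second classification results, g is an integer of 2 or more, y_{nj} represents the probability value of belonging to the j-th class obtained from the n-th second classification result, and j is 0 or 1, wherein 0 represents non-PVC data and 1 represents PVC data.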
Preferably, the formula adopted by the third fusion module for the mean value fusion decision rule is as follows:
P_j = (1/2) * (P_{LCNN-j} + P_{RNN-j})
wherein P_j is the final fusion result of the first classifiers and the second classifiers, and j is 0 or 1, wherein 0 represents non-PVC data and 1 represents PVC data; if P_j for j = 1 is greater than 0.5, the original electrocardiogram signal is PVC data, otherwise the original electrocardiogram signal is non-PVC data.
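The two-stage fusion described above can be sketched in Python as follows. This is only an illustrative sketch, not the patented implementation. The text does not reproduce the addition-rule formula, so the sketch assumes that the addition rule sums the per-class probabilities of one classifier family and divides by the number of classifiers, which keeps P_j in the range 0 to 1 and is consistent with the 0.5 threshold. The function names and the sample probabilities are hypothetical.

```python
def additive_fusion(probs):
    """Fuse (p_non_pvc, p_pvc) pairs from one classifier family.

    ASSUMPTION: the addition rule is a sum normalised by the number of
    classifiers; the source text does not reproduce the exact formula.
    """
    if not probs:
        raise ValueError("at least one classification result is required")
    count = len(probs)
    return tuple(sum(p[j] for p in probs) / count for j in (0, 1))


def mean_fusion(p_lcnn, p_rnn):
    """Mean rule from the text: P_j = (1/2) * (P_LCNN-j + P_RNN-j)."""
    return tuple(0.5 * (p_lcnn[j] + p_rnn[j]) for j in (0, 1))


def fuse_and_decide(lcnn_probs, rnn_probs, threshold=0.5):
    """Return (label, P) where label 1 means PVC and 0 means non-PVC."""
    fused = mean_fusion(additive_fusion(lcnn_probs), additive_fusion(rnn_probs))
    label = 1 if fused[1] > threshold else 0
    return label, fused


# Hypothetical outputs of 4 LCNN and 6 RNN classifiers for one ECG record.
lcnn_outputs = [(0.40, 0.60), (0.70, 0.30), (0.20, 0.80), (0.45, 0.55)]
rnn_outputs = [(0.60, 0.40), (0.50, 0.50), (0.30, 0.70),
               (0.20, 0.80), (0.55, 0.45), (0.35, 0.65)]
label, fused = fuse_and_decide(lcnn_outputs, rnn_outputs)
print(label, fused)
```

In this made-up record one LCNN and two RNNs would individually vote non-PVC or sit on the threshold, yet the fused probability is decided once, after both families have been combined.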
When the PVC diagnostic rules are applied, the invention extracts electrocardiogram features such as the start point and end point of the QRS wave and the R wave characteristic point; the method used to extract the QRS wave start and end points and the R wave is not limited.
Another aspect of the embodiments of the present invention provides a ventricular premature beat identification method based on classifier fusion and diagnosis rules, which includes:
processing electrocardiogram data by using m first classifiers with different structures in an LCNN classification module to output m first classification results;
processing the electrocardiogram data by adopting n second classifiers with different structures in the RNN classification module to output n second classification results;
performing fusion decision on the m output first classification results and the n output second classification results according to a fusion decision rule to obtain a fusion result, wherein the fusion result comprises non-PVC data and PVC data;
and judging the non-PVC data and the PVC data judged by the fusion unit according to the PVC pathological characteristics to obtain a PVC identification result.
PVC discrimination of the original electrocardiogram signal by the ventricular premature beat identification method is completed in three steps: the first step is denoising preprocessing; the second step is fusion of the LCNN and RNN classifiers; and the third step is judging again, using PVC pathological characteristics, the PVC data and the non-PVC data predicted by the classifier fusion. As shown in FIG. 1, the specific process is as follows:
1. Preprocessing
First, the ECG signal is filtered to remove noise such as baseline drift and power frequency interference.
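As a rough illustration of this preprocessing step, the sketch below cascades a first-order high-pass and a first-order low-pass filter to form a 0.5-40 Hz pass band, the band used in Example 1. The source does not specify the filter design, so the filter order, the function names and the 500 Hz sampling rate are all assumptions. A first-order stage attenuates 50/60 Hz power frequency interference only mildly, so a practical system would likely use a steeper filter or an additional notch.

```python
from math import pi, sin


def high_pass(samples, cutoff_hz, fs_hz):
    """First-order RC high-pass: suppresses slow baseline drift."""
    if not samples:
        return []
    rc = 1.0 / (2.0 * pi * cutoff_hz)
    dt = 1.0 / fs_hz
    alpha = rc / (rc + dt)
    out = [0.0]
    for prev_x, x in zip(samples, samples[1:]):
        out.append(alpha * (out[-1] + x - prev_x))
    return out


def low_pass(samples, cutoff_hz, fs_hz):
    """First-order RC low-pass: attenuates high-frequency noise."""
    if not samples:
        return []
    rc = 1.0 / (2.0 * pi * cutoff_hz)
    dt = 1.0 / fs_hz
    beta = dt / (rc + dt)
    out = []
    state = samples[0]
    for x in samples:
        state += beta * (x - state)
        out.append(state)
    return out


def band_pass(samples, low_hz, high_hz, fs_hz):
    """Cascade of the two stages above (ASSUMED design, not from the source)."""
    return low_pass(high_pass(samples, low_hz, fs_hz), high_hz, fs_hz)


# Synthetic test signal: a 10 Hz component riding on a constant offset
# that stands in for baseline drift. fs = 500 Hz is a hypothetical value.
fs = 500.0
raw = [2.0 + sin(2.0 * pi * 10.0 * n / fs) for n in range(5000)]
filtered = band_pass(raw, 0.5, 40.0, fs)
```

After the initial transient, the constant offset is removed while the 10 Hz component passes with only slight attenuation.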
2. Classifier fusion
The local connection and weight sharing mechanisms of the LCNN effectively reduce the complexity of the network and the number of training parameters, and give the network strong robustness and fault tolerance. The RNN is a deep learning model with a memory function that takes the correlations among samples into account. Neither the LCNN nor the RNN requires manually designed feature extraction: their classification process is fully automatic, they take the original data as input, and the final classification result is obtained through training and testing. As shown in FIG. 1, after preprocessing, the ECG signal is classified by m1 LCNN classifiers and n2 RNN classifiers, all of which have different structures, so that m1 classification results are obtained from the m1 LCNN classifiers and n2 classification results from the n2 RNN classifiers. Each classifier in fact outputs a raw probability value, from which it could already be determined whether the ECG signal is a PVC. However, in order to improve the overall classification performance, the method provided by the invention first applies an addition fusion decision rule separately to the m1 classification results and to the n2 classification results to obtain two fusion results, and then applies a mean fusion decision rule to these two fusion results to obtain the final fusion result. In order to further improve the overall classification performance, the invention also uses PVC pathological characteristics to judge again, separately, the non-PVC data and the PVC data determined by the classifier fusion.
3. PVC diagnostic rules
Some characteristics of the PVC electrocardiogram are as follows: (1) the QRS wave is wide and markedly deformed, and its morphology differs from that of the other, normal QRS waves; (2) the height of the R wave is obviously higher or lower than that of a non-PVC heartbeat; and (3) a premature QRS-T wave appears, so that the RR interval (the distance from the R wave of the current heartbeat to the R wave of the previous heartbeat) is smaller than the average of the preceding RR intervals. FIG. 2 shows a schematic electrocardiogram of a PVC record, wherein N represents a non-PVC heartbeat and V represents a PVC heartbeat. Therefore, the invention selects the RR interval, the QRS wave width, the QRS wave amplitude and the QRS wave similarity as characteristic parameters.
After the fusion decision of the classifiers, both the data predicted to be non-PVC and the data predicted to be PVC are judged again by the PVC diagnostic rules. The data predicted to be non-PVC by the classifier fusion is denoted Enon-PVC, and the data predicted to be PVC is denoted EPVC. The EPVC data and the Enon-PVC data are discriminated again according to the flow charts shown in FIG. 3 and FIG. 4, respectively. The QRS wave similarity is the correlation coefficient between the current QRS wave and a QRS wave of normal morphology. The mean QRS wave width is obtained by measuring the widths of the morphologically normal QRS waves preceding the current heartbeat and taking their mean. If the ECG record meets the conditions of the flow chart in FIG. 3, the ECG is likely to be a PVC.
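The characteristic parameters named above can be computed along the following lines. This is a sketch only: the decision thresholds of FIG. 3 and FIG. 4 are not given in the text and are therefore not modelled. The function names, the `history` window of 8 beats and the sample values are hypothetical, and the Pearson coefficient is assumed as the correlation coefficient.

```python
from math import sqrt


def qrs_similarity(qrs, template):
    """Pearson correlation between the current QRS wave and a normal template."""
    n = len(qrs)
    if n != len(template) or n < 2:
        raise ValueError("segments must have equal length of at least 2")
    mean_q = sum(qrs) / n
    mean_t = sum(template) / n
    cov = sum((q - mean_q) * (t - mean_t) for q, t in zip(qrs, template))
    var_q = sum((q - mean_q) ** 2 for q in qrs)
    var_t = sum((t - mean_t) ** 2 for t in template)
    if var_q == 0 or var_t == 0:
        raise ValueError("a flat segment has no defined correlation")
    return cov / sqrt(var_q * var_t)


def rr_ratio(r_peaks, beat, history=8):
    """Current RR interval divided by the mean of the preceding RR intervals.

    r_peaks holds R-wave positions (in samples); `beat` indexes the current
    beat. `history` is a HYPOTHETICAL window length, not a source value.
    """
    if beat < 2 or beat >= len(r_peaks):
        raise ValueError("need at least two earlier beats")
    rr = [b - a for a, b in zip(r_peaks, r_peaks[1:])]
    current = rr[beat - 1]
    previous = rr[max(0, beat - 1 - history):beat - 1]
    return current / (sum(previous) / len(previous))


def mean_normal_qrs_width(widths, is_normal):
    """Mean width of the morphologically normal QRS waves seen so far."""
    normal = [w for w, ok in zip(widths, is_normal) if ok]
    if not normal:
        raise ValueError("no normal QRS wave available")
    return sum(normal) / len(normal)


# Hypothetical R-peak positions: the last beat arrives early.
ratio = rr_ratio([0, 400, 800, 1200, 1450], beat=4)
```

A ratio below 1 indicates a beat that arrives earlier than the recent average, and a low similarity indicates a QRS morphology unlike the normal template; how these values are thresholded is left to the flow charts.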
It is worth mentioning that, since PVC identification in the invention is a binary classification problem, the specificity (Sp), the sensitivity (Se), the accuracy (Acc) and a composite index, the Youden index γ, can be used to measure the quality of the classification; generally, the larger the Youden index, the better the overall classification performance of the classification system. The confusion matrix for binary classification is shown in Table 1 below:
TABLE 1 confusion matrix
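With TP, TN, FP and FN denoting the numbers of true positives, true negatives, false positives and false negatives, respectively, each index is defined as follows: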
Acc=(TP+TN)/(TP+TN+FP+FN) (1)
Se=TP/(TP+FN) (2)
Sp=TN/(TN+FP) (3)
γ=Se+Sp-1 (4)
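Formulas (1) to (4) translate directly into code. The sketch below applies them to the classifier-fusion counts reported in Example 1 (135212, 458, 3686 and 1690 records). The mapping of these counts to TN, FN, FP and TP is an inference from the statement that the test set contains 2148 PVC records (458 + 1690 = 2148); the function name is hypothetical.

```python
def binary_metrics(tp, tn, fp, fn):
    """Accuracy, sensitivity, specificity and Youden index, formulas (1)-(4)."""
    acc = (tp + tn) / (tp + tn + fp + fn)
    se = tp / (tp + fn)
    sp = tn / (tn + fp)
    return {"Acc": acc, "Se": se, "Sp": sp, "gamma": se + sp - 1.0}


# Counts after classifier fusion in Example 1 (mapping inferred, see above).
fusion_only = binary_metrics(tp=1690, tn=135212, fp=3686, fn=458)
for name, value in fusion_only.items():
    print(f"{name}: {value:.4f}")
```

Because the PVC class is rare in the test set, the accuracy stays high even when many PVC beats are missed, which is why the sensitivity and the Youden index are the more informative figures here.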
The technical solution of the present invention is further described in detail by the following examples. However, the examples are chosen only for the purpose of illustrating the invention and are not to be construed as limiting the scope of the invention.
Example 1
The data used in this example were taken from the Chinese Cardiovascular Disease Database (CCDD database, http://58.210.56.164/CCDD/).
(1) For denoising preprocessing, each ECG record is first band-pass filtered at 0.5-40 Hz;
(2) 35840 preprocessed ECG records (containing 3112 PVC records) were used as training samples, and another 141046 records (containing 2148 PVC records) were used for testing. All training samples were input into 4 LCNNs and 6 RNNs for independent, parallel training. The 4 LCNNs were selected from a larger number of trained LCNN models, choosing those with better results and larger differences between models; similarly, the 6 RNNs were selected from a larger number of trained RNN models with better results and larger differences between models. After training, the test samples were tested independently on the 4 LCNNs and the 6 RNNs to obtain 4 LCNN classification results and 6 RNN classification results. The output values are probability values: t_{mj} denotes the probability value of belonging to the j-th class (j is 0 or 1, where 0 represents non-PVC and 1 represents PVC) obtained from the m-th LCNN classification result, and y_{nj} denotes the probability value of belonging to the j-th class obtained from the n-th RNN classification result. In fact, each classifier alone already yields a decision about the disease: if t_{mj} or y_{nj} for j = 1 is greater than 0.5, the sample is judged to be PVC, otherwise it is non-PVC. However, in order to improve the overall classification result, this embodiment does not make a decision on the output value of each classifier individually; instead, the classification results are fused on the basis of these output values, using an addition fusion decision rule for each of the two classifier types.
Firstly, 4 LCNN classification results are fused by adopting an addition fusion decision rule, and the formula is as follows:
wherein P_{LCNN-j} represents the fusion result of the 4 LCNN classification results, and t_{mj} represents the probability value of belonging to the j-th class obtained from the m-th LCNN classification result, where j is 0 or 1, 0 represents non-PVC data and 1 represents PVC data;
similarly, the formula for fusing the classification results of the 6 RNNs is as follows:
wherein P_{RNN-j} represents the fusion result of the 6 RNN classification results, and y_{nj} represents the probability value of belonging to the j-th class obtained from the n-th RNN classification result, where j is 0 or 1, 0 represents non-PVC data and 1 represents PVC data;
then, the two fusion results P_{LCNN-j} and P_{RNN-j} are fused using the mean fusion decision rule, with the following formula:
P_j = (1/2) * (P_{LCNN-j} + P_{RNN-j})
wherein P_j is the final fusion result of the LCNN classifiers and the RNN classifiers, namely the probability that the sample belongs to the j-th class, and j is 0 or 1, where 0 represents non-PVC data and 1 represents PVC data. If P_j for j = 1 is greater than 0.5, the sample is assigned by the fusion of the two classifier types to class 1, namely PVC data; otherwise, the sample belongs to class 0, namely non-PVC data. The results obtained after fusing the decisions of the two classifier types are shown in Table 2 below:
TABLE 2 PVC identification results of classifier fusion
In the confusion matrix above, the horizontal direction (top) represents the real data and the vertical direction (left) represents the predicted data. As can be seen from Table 2, the sensitivity obtained after the fusion decision of the two classifier types is low, and the composite index γ is not high. In order to further improve the overall classification performance, the inventors use certain pathological characteristics of PVC to judge again the Enon-PVC data and the EPVC data predicted by the classifier fusion.
(3) As can be seen from Table 2, after the classifier fusion the predicted Enon-PVC data comprises 135212+458 records and the predicted EPVC data comprises 3686+1690 records. The EPVC data and the Enon-PVC data are then each judged again using some pathological characteristics of PVC, following the flow charts shown in FIG. 3 and FIG. 4, respectively; the values of the parameters used in these flow charts were determined empirically. Finally, the PVC identification results obtained by combining the LCNN, the RNN and the PVC diagnostic rules are shown in Table 3 below.
TABLE 3 Final PVC identification results
In summary, on the more than 140,000 test records from the CCDD database, the PVC identification system of this embodiment achieves an accuracy of 98.01%, a specificity of 98.04% and a sensitivity of 96.32%. Compared with Table 2, although the accuracy obtained by fusing the LCNN and the RNN alone is already high, the sensitivity and the composite index are low; after the LCNN, the RNN and some basic pathological characteristics of PVC are combined, every index is improved, and the improvement in the sensitivity and the composite index is the largest. As can be seen from Table 3, the overall classification performance of PVC identification is improved by combining classifier fusion with disease diagnosis rules.
It should be understood that the above describes only some embodiments of the present invention and that various other changes and modifications may be effected therein by one of ordinary skill in the related art without departing from the scope or spirit of the invention.