CN113164057B - Machine learning health analysis with mobile device

Info

Publication number: CN113164057B
Application number: CN201980080631.XA
Authority: CN
Inventors: A·V·瓦里斯; F·L·彼得森; C·D·C·盖洛韦; 大卫·E·艾伯特; 拉维·葛巴拉克利希南; L·科尔齐诺夫; 王菲; E·汤姆森; 努珀尔·斯里瓦斯塔瓦; O·达伍德; I·阿布扎伊德
Original assignee: AliveCor Inc
Current assignee: AliveCor Inc
Priority date: 2018-10-05
Filing date: 2019-10-04
Publication date: 2024-08-09
Anticipated expiration: 2039-10-04
Also published as: WO2020073013A1; JP7495398B2; CN113164057A; EP3860436A1; JP2022504288A

Abstract

Devices, systems, methods, and platforms for continuously monitoring a user's health (e.g., heart health) are disclosed. Systems, methods, devices, software, and platforms are described for continuously monitoring health indicator data of a user (such as, but not limited to, PPG signals, heart rate, or blood pressure) from a user device, in conjunction with (temporally) corresponding data related to factors ("other factors") that may affect the health indicator, to determine whether the user's health is normal as judged by, or compared with, for example and without limitation: i) a group of individuals affected by similar other factors; or ii) the user himself or herself affected by similar other factors.

Description

Health analysis using machine learning on mobile devices

CROSS-REFERENCE TO RELATED APPLICATIONS

This application claims priority to U.S. patent application Ser. No. 16/580,574, filed on Sep. 24, 2019, which is a continuation of U.S. application Ser. No. 16/153,403, filed on Oct. 5, 2018, the entire contents of which are incorporated herein by reference.

BACKGROUND ART

Indicators of an individual's physiological health ("health indicators"), such as, but not limited to, heart rate, heart rate variability, blood pressure, and ECG (electrocardiogram), can be measured or calculated at any discrete point or points in time from data collected to measure the health indicators. In many cases, the value of a health indicator at a particular time, or its change over time, provides information about the state of the individual's health. For example, a low or high heart rate or blood pressure, or an ECG that clearly demonstrates myocardial ischemia, may indicate the need for immediate intervention. Note, however, that a reading, a series of readings, or changes in readings over time of these indicators may carry information that the user, or even a health professional, cannot recognize.

Arrhythmias, for example, may occur continuously or intermittently. Continuously occurring arrhythmias can be diagnosed most definitively from an individual's electrocardiogram. Because a continuous arrhythmia is always present, ECG analysis can be applied at any time to diagnose it. An ECG can also be used to diagnose intermittent arrhythmias. However, because intermittent arrhythmias may be asymptomatic and/or are by definition intermittent, diagnosis presents the challenge of applying the diagnostic technique at the time the individual is experiencing the arrhythmia. Actual diagnosis of intermittent arrhythmias is therefore very difficult. This difficulty is compounded by asymptomatic arrhythmias, which account for nearly 40% of arrhythmias in the United States. Boriani G. and Pettorelli D., Atrial Fibrillation Burden and Atrial Fibrillation Type: Clinical Significance and Impact on the Risk of Stroke and Decision Making for Long-term Anticoagulation, Vascul Pharmacol., 83:26-35 (August 2016), page 26.

Sensors and mobile electronic technologies exist that allow health indicators to be monitored and recorded frequently or continuously. However, the capabilities of these sensor platforms often exceed the ability of traditional medical science to interpret the data the sensors produce. The physiological meaning of a health indicator parameter such as heart rate is often well defined only in a specific medical context: for example, heart rate is traditionally evaluated as a single scalar value, out of context from other data/information that may affect the health indicator. A resting heart rate in the range of 60 to 100 beats per minute (BPM) may be considered normal. Users may generally measure their resting heart rate manually once or twice a day.

A mobile sensor platform (e.g., a mobile blood pressure cuff, a mobile heart rate monitor, or a mobile ECG device) may be capable of continuously monitoring a health indicator (e.g., heart rate), for example by producing a measurement every second or every 5 seconds, while also acquiring other data about the user, such as, but not limited to, activity level, body position, and environmental parameters such as air temperature, barometric pressure, and location. Over a 24-hour period, this may result in many thousands of separate health indicator measurements. In contrast to measurements taken once or twice a day, there is relatively little data or medical consensus on what a "normal" sequence of thousands of measurements looks like.

Devices currently used to continuously measure health indicators of users/patients range from bulky, invasive, and inconvenient devices to simple wearable or handheld mobile devices. At present, these devices do not provide the ability to use the data effectively to continuously monitor a person's health. Instead, the user or a health professional is relied upon to evaluate the health indicators in light of other factors that may affect them in order to determine the user's health status.

BRIEF DESCRIPTION OF THE DRAWINGS

Certain features described herein are set forth with particularity in the appended claims. A better understanding of the features and advantages of the disclosed embodiments will be obtained by reference to the following detailed description, which sets forth illustrative embodiments utilizing the principles described herein, and the accompanying drawings, in which:

FIGS. 1A-1B depict convolutional neural networks that may be used according to some embodiments as described herein;

FIGS. 2A-2B depict recurrent neural networks that may be used according to some embodiments as described herein;

FIG. 3 depicts an alternative recurrent neural network that may be used according to some embodiments as described herein;

FIGS. 4A-4C depict hypothetical data plots used to illustrate the application of some embodiments as described herein;

FIGS. 5A-5E depict alternative recurrent neural networks according to some embodiments as described herein and hypothetical plots used to describe some of these embodiments;

FIG. 6 depicts an unfolded recurrent neural network according to some embodiments as described herein;

FIGS. 7A-7B depict systems and devices according to some embodiments as described herein;

FIG. 8 depicts a method according to some embodiments as described herein;

FIGS. 9A-9B depict a method according to some embodiments as described herein and a hypothetical plot of heart rate versus time used to illustrate one or more embodiments;

FIG. 10 depicts a method according to some embodiments as described herein;

FIG. 11 depicts a hypothetical data plot used to illustrate the application of some embodiments as described herein; and

FIG. 12 depicts systems and devices according to some embodiments as described herein.

DETAILED DESCRIPTION

The large volume of data, the complexity of the interactions between health indicators and other factors, and limited clinical guidance may limit the effectiveness of any monitoring system that attempts to detect anomalies in continuous and/or streaming sensor data through specific rules based on traditional medical practice. The embodiments described herein include devices, systems, methods, and platforms that can use a predictive machine learning model to detect anomalies in a time series in an unsupervised manner, based on health indicator data alone or in combination with other-factor data (as defined herein).

Atrial fibrillation (AF or AFib) occurs in 1-2% of the general population, and the presence of AF increases the risk of morbidity and adverse outcomes such as stroke and heart failure. Boriani G. and Pettorelli D., Atrial Fibrillation Burden and Atrial Fibrillation Type: Clinical Significance and Impact on the Risk of Stroke and Decision Making for Long-term Anticoagulation, Vascul Pharmacol., 83:26-35 (August 2016), page 26. In many people (estimated at up to 40% of AF patients), AFib may be asymptomatic, and these asymptomatic patients have risk profiles for stroke and heart failure similar to those of symptomatic patients. See ibid. Symptomatic patients, however, can take active measures (such as taking blood thinners or other medications) to reduce the risk of negative outcomes. Cardiac implantable electrical devices (CIEDs) can be used to detect asymptomatic AF (so-called silent AF, or SAF) and the duration for which a patient is in AF. Ibid. From this information, the time these patients spend in AF, or AF burden, can be determined. Ibid. An AF burden greater than 5-6 minutes, and particularly greater than 1 hour, is associated with a significantly increased risk of stroke and other negative health outcomes. Ibid. Therefore, the ability to measure AF burden in asymptomatic patients may enable earlier intervention and may reduce the risk of negative health outcomes associated with AF. Ibid. Detection of SAF is challenging and typically requires some form of continuous monitoring. Currently, continuous monitoring for AF requires bulky, sometimes invasive, and expensive devices, and such monitoring requires a high level of oversight and review by medical professionals.

Many devices continuously acquire data to provide measurements or calculations of health indicator data; for example, but not limited to, Apple smartphones, tablet computers, and the like fall into the category of wearable and/or mobile devices. Other devices include permanent or semi-permanent devices on or in the user/patient (e.g., Holter monitors), while still others may include larger devices in a hospital that are mobile by virtue of being on a cart. However, little is done with this measurement data beyond periodically observing it on a display or establishing simple data thresholds. Even when observed by trained medical professionals, the data may often appear normal, a major exception being when the user has easily identifiable acute symptoms. It is very difficult, and practically impossible, for medical professionals to monitor health indicators continuously in order to observe anomalies and/or trends in the data that may indicate something more serious.

As used herein, a platform includes one or more custom software applications (or "applications") configured to interact with one another either locally or through a distributed network, including the cloud and the Internet. The applications of a platform as described herein are configured to collect and analyze user data and may include one or more software models. In some embodiments, the platform includes one or more hardware components (e.g., one or more sensing devices, processing devices, or microprocessors). In some embodiments, the platform is configured to operate together with one or more devices and/or one or more systems. That is, in some embodiments a device as described herein is configured to run an application of the platform using a built-in processor, and in some embodiments the platform is used by a system that includes one or more computing devices that interact with or run one or more applications of the platform.

The present invention describes systems, methods, devices, software, and platforms for continuously monitoring user data related to one or more health indicators (for example, but not limited to, PPG signals, heart rate, or blood pressure) from a user device, in combination with (temporally) corresponding data related to factors that may affect the health indicators (referred to herein as "other factors"), to determine whether the user's health is normal as judged by, or compared with, for example and without limitation: i) a group of individuals affected by similar other factors; or ii) the user himself or herself affected by similar other factors. In some embodiments, measured health indicator data, alone or in combination with other-factor data, is input into a trained machine learning model that determines the probability that the user's measured health indicator is within a healthy range, and the user is notified if the measured health indicator is considered not to be within the healthy range. A user who is not within the healthy range may be more likely to be experiencing a health event (such as an arrhythmia, which may be symptomatic or asymptomatic) that warrants high-fidelity information to confirm a diagnosis. The notification may take the form of, for example, a request that the user obtain an ECG. Other high-fidelity measurements (blood pressure, pulse oximetry, etc.) may be requested; ECG is only one example. The high-fidelity measurement (an ECG in this embodiment) can be evaluated by an algorithm and/or a medical professional to produce a notification or diagnosis (collectively referred to herein as a "diagnosis," recognizing that only a physician can make a diagnosis). In the ECG example, the diagnosis may be AFib or any of a number of other well-known conditions that are diagnosed using an ECG.

In further embodiments, the diagnosis is used to label a low-fidelity data sequence (e.g., heart rate or PPG), which may include other-factor data sequences. The low-fidelity data sequences labeled with high-fidelity diagnoses are used to train a high-fidelity machine learning model. In these further embodiments, the high-fidelity machine learning model may be trained by unsupervised learning, or may be updated from time to time with new training examples. In some embodiments, the user's measured low-fidelity health indicator data sequence, and optionally the (temporally) corresponding data sequences of other factors, are input into the trained high-fidelity machine learning model to determine a probability and/or prediction that the user is experiencing or has experienced the diagnosed condition for which the high-fidelity machine learning model was trained. This probability may include probabilities of when an event begins and when the event ends. For example, some embodiments may calculate a user's atrial fibrillation (AF) burden, that is, the amount of time the user experiences AF over time. Previously, AF burden could only be determined using cumbersome and expensive Holter monitors or implantable continuous ECG monitoring devices. Therefore, some embodiments described herein can continuously monitor a user's health status and notify the user of changes in health status by continuously monitoring health indicator data (such as, but not limited to, PPG data, blood pressure data, and heart rate data) obtained from a device worn by the user, either alone or in combination with corresponding data for other factors. "Other factors," as used herein, include any factor that may affect a health indicator and/or may affect the data representing a health indicator (e.g., PPG data). These other factors may include, but are not limited to: air temperature, altitude, exercise level, weight, gender, diet, standing, sitting, falling, lying down, weather, and BMI. In some embodiments, a mathematical or empirical model that is not a machine learning model may be used to determine when to notify the user to obtain a high-fidelity measurement, which can then be analyzed and used to train a high-fidelity machine learning model as described herein.
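By way of illustration only, and not as a description of any claimed implementation, the AF burden calculation mentioned above can be sketched in Python as follows. The sketch assumes that a trained high-fidelity model has already produced one AF probability per fixed-length interval of low-fidelity data; the function names, the 0.5 cutoff, and the 60-second interval are hypothetical choices made for the example.

```python
from typing import List, Sequence, Tuple


def af_episodes(probabilities: Sequence[float],
                interval_s: float,
                cutoff: float = 0.5) -> List[Tuple[float, float]]:
    """Return (start_s, end_s) pairs for consecutive intervals whose
    AF probability is at or above the cutoff (cutoff is an assumption)."""
    episodes = []
    start = None  # index of the first interval of the current episode
    for i, p in enumerate(probabilities):
        if p >= cutoff and start is None:
            start = i  # episode begins
        elif p < cutoff and start is not None:
            episodes.append((start * interval_s, i * interval_s))
            start = None  # episode ends
    if start is not None:  # episode still open at the end of the data
        episodes.append((start * interval_s, len(probabilities) * interval_s))
    return episodes


def af_burden_seconds(episodes: Sequence[Tuple[float, float]]) -> float:
    """Total time spent in AF, summed over all detected episodes."""
    return sum(end - start for start, end in episodes)


# Hypothetical per-minute AF probabilities from a model.
probs = [0.1, 0.2, 0.9, 0.8, 0.7, 0.1, 0.6, 0.1]
episodes = af_episodes(probs, interval_s=60)
burden = af_burden_seconds(episodes)
print(episodes, burden)
```

Reporting episode boundaries as well as the total keeps both quantities the text mentions: when an event begins and ends, and the cumulative time in AF.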

Some embodiments described herein may detect anomalies for a user in an unsupervised manner by: receiving a primary time series of health indicator data; optionally receiving one or more secondary time series of other-factor data that correspond in time to the primary time series, which secondary series may come from sensors or from external data sources (e.g., via a network connection or a computer API); providing the primary and secondary time series to a preprocessor, which may perform operations on the data such as filtering, caching, averaging, time alignment, buffering, upsampling, and downsampling; providing the time series of data to a machine learning model that is trained and/or configured to use the values of the primary and secondary time series to predict the next value of the primary time series at a future time; comparing the predicted primary time series value generated by the machine learning model for a specific time t with the measured value of the primary time series at time t; and alerting the user or prompting the user to take action if the difference between the predicted future time series and the measured time series exceeds a threshold or criterion.
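For illustration only, the predict-compare-alert steps above can be sketched in Python. This is a minimal sketch under stated assumptions: a simple moving-average function stands in for the trained machine learning model, preprocessing is omitted, and all names (`moving_average_predictor`, `detect_anomalies`) and the threshold value are hypothetical.

```python
from typing import Callable, List, Sequence, Tuple

# One time step: (primary value, e.g. heart rate in BPM; secondary value, e.g. activity level)
Sample = Tuple[float, float]


def moving_average_predictor(history: Sequence[Sample]) -> float:
    """Stand-in for a trained model (an assumption, not the patent's model):
    predicts the next primary value as the mean of the last three primary values."""
    recent = [primary for primary, _ in history[-3:]]
    return sum(recent) / len(recent)


def detect_anomalies(primary: Sequence[float],
                     secondary: Sequence[float],
                     predict: Callable[[Sequence[Sample]], float],
                     threshold: float,
                     warmup: int = 3) -> List[int]:
    """Return the time indices t at which |predicted - measured| exceeds the threshold."""
    if len(primary) != len(secondary):
        raise ValueError("primary and secondary series must be time-aligned")
    flagged = []
    for t in range(warmup, len(primary)):
        history = list(zip(primary[:t], secondary[:t]))  # data strictly before t
        predicted = predict(history)
        if abs(predicted - primary[t]) > threshold:
            flagged.append(t)  # here the user would be alerted or prompted
    return flagged


heart_rate = [62, 63, 61, 62, 64, 118, 63, 62]
activity = [0, 0, 0, 0, 0, 0, 0, 0]
print(detect_anomalies(heart_rate, activity, moving_average_predictor, threshold=20.0))
# prints [5]
```

The model is passed in as a callable so that the comparison logic does not depend on how the prediction is produced.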

Thus, some embodiments described herein detect when the behavior of an observed primary sequence of physiological data, relative to the passage of time and/or in response to an observed secondary sequence of data, differs from the behavior that would be expected given the training examples used to train the model. Where the training examples are collected from normal individuals, or from data of a particular user previously classified as normal, the system can serve as an anomaly detector. If the data is simply acquired from a particular user without any other classification, the system can serve as a change detector that detects changes in the health indicator data measured by the primary sequence relative to the time at which the training data was captured.

Described herein are software platforms, systems, devices, and methods for generating a trained machine learning model and using the model to predict or determine the probability that a user's measured health indicator data (the primary sequence), as affected by other factors (the secondary sequences), is outside the normal bounds of a healthy population affected by similar other factors (i.e., a global model), or outside the normal bounds of that particular user affected by similar other factors (i.e., a personalized model), with the user being notified of such a result. In some embodiments, the user may be prompted to obtain additional measured high-fidelity data that can be used to label previously acquired low-fidelity user health indicator data in order to generate a different, trained high-fidelity machine learning model capable of predicting or diagnosing anomalies or events using only low-fidelity health indicator data, where such anomalies are typically identified or diagnosed only with high-fidelity data.

Some embodiments described herein may include inputting a user's health indicator data, and optionally (temporally) corresponding data for other factors, into a trained machine learning model, where the trained machine learning model predicts the user's health indicator data, or a probability distribution of the health indicator data, at a future time step. In some embodiments, the prediction is compared with the user's measured health indicator data at the predicted time step, and if the absolute value of the difference exceeds a threshold, the user is notified that his or her health indicator data is outside the normal range. In some embodiments, the notification may include a diagnosis or an instruction to do something, such as, but not limited to, obtaining additional measurements or contacting a health professional. In some embodiments, the machine learning model is trained using health indicator data from a healthy population together with (temporally) corresponding data for other factors. It should be understood that the other factors in the training examples used to train the machine learning model need not be population averages; rather, the data for each of the other factors corresponds in time to the set of health indicator data of the individual in the training example.
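As one hedged illustration of the probability-distribution variant above, the sketch below assumes the model's predicted distribution at the future time step is Gaussian, summarized by a mean and a standard deviation. The Gaussian form, the function names, and the 0.01 probability cutoff are assumptions made for the example and are not taken from the source.

```python
import math


def tail_probability(measured: float, predicted_mean: float, predicted_std: float) -> float:
    """Two-sided probability, under an assumed Gaussian prediction, of a value
    at least as far from the predicted mean as the measured value."""
    if predicted_std <= 0:
        raise ValueError("predicted_std must be positive")
    z = abs(measured - predicted_mean) / predicted_std
    return math.erfc(z / math.sqrt(2.0))


def should_notify(measured: float, predicted_mean: float, predicted_std: float,
                  min_probability: float = 0.01) -> bool:
    """Notify when the measurement is improbable under the predicted distribution."""
    return tail_probability(measured, predicted_mean, predicted_std) < min_probability


# Hypothetical prediction: heart rate of 70 BPM with a standard deviation of 5 BPM.
print(should_notify(72.0, 70.0, 5.0))   # small deviation
print(should_notify(110.0, 70.0, 5.0))  # large deviation
```

Compared with a fixed absolute-difference threshold, this form lets the tolerated deviation widen or narrow with the model's own predicted uncertainty.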

Some embodiments are described as receiving discrete data points in time, predicting a discrete data point at a future time from the inputs, and then determining whether the loss between the discrete measured input at the future time and the predicted value at the future time exceeds a threshold. Those skilled in the art will readily appreciate that the input data and the output prediction can take forms other than discrete data points or scalars. For example, but not limited to, the health indicator data sequence (also referred to herein as the primary sequence) and the other data sequences (also referred to herein as secondary sequences) can be divided into time segments. Those skilled in the art will recognize that the manner in which the data is segmented is a matter of design choice and can take many different forms.

Some embodiments divide the health indicator data sequence (also referred to herein as the primary sequence) and the other data sequences (also referred to herein as secondary sequences) into two segments: the past, representing all data before a specific time t; and the future, representing all data at or after time t. These embodiments input the health indicator data sequence for the past time segment and all other data sequences for the past time segment into a machine learning model configured to predict the most likely future segment of health indicator data (or a distribution of possible future segments). Alternatively, these embodiments input the health indicator data sequence for the past time segment, all other data sequences for the past time segment, and the other data sequences for the future segment into a machine learning model configured to predict the most likely future segment of health indicator data (or a distribution of possible future segments). The predicted future segment of health indicator data is compared with the user's measured health indicator data for the future segment to determine a loss and whether the loss exceeds a threshold, in which case some action is taken. The action may include, for example, but not limited to: notifying the user to obtain additional data (e.g., an ECG or blood pressure); notifying the user to contact a health professional; or automatically triggering the acquisition of additional data. Automatic acquisition of additional data may include, for example, but not limited to, ECG acquisition via a sensor operatively coupled (by wire or wirelessly) to a computing device worn by the user, or blood pressure acquisition via a mobile cuff placed around the user's wrist or another appropriate body part and coupled to the computing device worn by the user. A data segment may include a single data point, many data points over a period of time, or an average of the data points over that period, where the average may be a true mean, a median, or a mode. In some embodiments, the segments may overlap in time.
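The segmentation described above can be illustrated with a short Python sketch. It is offered only as one possible arrangement: the helper names (`split_past_future`, `overlapping_segments`, `summarize`) are hypothetical, and the series is assumed to be a uniformly sampled list indexed by time step.

```python
from statistics import mean, median, mode
from typing import List, Sequence, Tuple


def split_past_future(series: Sequence[float], t: int) -> Tuple[List[float], List[float]]:
    """Past = all samples before index t; future = all samples at or after t."""
    return list(series[:t]), list(series[t:])


def overlapping_segments(series: Sequence[float], length: int, step: int) -> List[List[float]]:
    """Fixed-length segments; they overlap in time whenever step < length."""
    return [list(series[i:i + length])
            for i in range(0, len(series) - length + 1, step)]


def summarize(segment: Sequence[float], how: str = "mean") -> float:
    """Reduce a segment to one value using a true mean, a median, or a mode."""
    reducers = {"mean": mean, "median": median, "mode": mode}
    if how not in reducers:
        raise ValueError("how must be 'mean', 'median', or 'mode'")
    return reducers[how](segment)


heart_rate = [60, 60, 66, 70, 72]
past, future = split_past_future(heart_rate, t=3)
print(past, future)                                   # [60, 60, 66] [70, 72]
print(overlapping_segments(heart_rate, length=3, step=1))
print(summarize(past, "mean"), summarize(past, "median"), summarize(past, "mode"))
```

The past segment would be supplied to the model and the future segment held back for comparison with the model's prediction.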

These embodiments detect when the behavior or measurements of a health indicator data sequence observed over the passage of time, as influenced by the (temporally) corresponding other-factor data sequences, differ from the behavior or measurements expected from the training examples, where the training examples were collected under similar other factors. If the training examples are collected from healthy individuals under similar other factors, or from data of a specific user previously classified as healthy under similar other factors, these embodiments serve as anomaly detectors relative to the healthy population or the specific user, respectively. If the training examples are simply acquired from a specific user without any other classification, these embodiments serve as change detectors that detect changes in the specific user's health indicator at the time of measurement relative to the time at which the training examples were collected.

Some embodiments described herein utilize machine learning to continuously monitor a person's health indicators under the influence of one or more other factors, and to assess whether the person is healthy relative to a population classified as healthy under the influence of similar other factors. As those skilled in the art will readily appreciate, a variety of machine learning algorithms or models may be used without departing from the scope described herein, including, but not limited to, Bayes, Markov, Gaussian processes, clustering algorithms, generative models, kernel, and neural network algorithms. As understood by those skilled in the art, a typical neural network employs one or more layers of, for example, but not limited to, nonlinear activation functions to predict an output for a received input, and may include one or more hidden layers in addition to the input layer and the output layer. In some of these networks, the output of each hidden layer is used as the input to the next layer in the network. Examples of neural networks include, but are not limited to, generative neural networks, convolutional neural networks, and recurrent neural networks.

Some embodiments of the health monitoring system monitor an individual's heart rate and activity data as low-fidelity data (e.g., heart rate or PPG data) and detect conditions (e.g., AFib) that are usually detected using high-fidelity data (e.g., ECG data). For example, the individual's heart rate may be provided by a sensor continuously or at discrete intervals (such as every five seconds). The heart rate may be determined from a PPG, a pulse oximeter, or another sensor. In some embodiments, the activity data may be generated as a number of steps taken, an amount of sensed movement, or other data points indicating activity level. The low-fidelity (e.g., heart rate) data and the activity data may then be input into a machine learning system to determine a prediction of a high-fidelity result. For example, the machine learning system may use the low-fidelity data to predict an arrhythmia or another indication of the user's heart health. In some embodiments, the machine learning system may take segments of the input data as input to determine a prediction. For example, one hour of activity level data and heart rate data may be input into the machine learning system, which may then use these data to generate a prediction of a condition such as atrial fibrillation. Various embodiments of the invention are discussed in more detail below.

Referring to FIG. 1A, a trained convolutional neural network (CNN) 100 (one example of a feed-forward network) takes input data 102 (e.g., a picture of a boat) into convolutional layers (also called hidden layers) 103 and applies a series of trained weights or filters 104 to the input data 106 in each convolutional layer 103. The output of the first convolutional layer is an activation map (not shown), which is the input to the second convolutional layer, where trained weights or filters (not shown) are applied; the outputs of subsequent convolutional layers are activation maps that represent increasingly complex features of the data input to the first layer. After each convolutional layer, a nonlinear layer (not shown) is applied to introduce nonlinearity; the nonlinear layer may include tanh, sigmoid, or ReLU. In some cases, a pooling layer (not shown), also called a downsampling layer, may be applied after the nonlinear layer. The pooling layer essentially takes a filter and a stride of the same length, applies the filter to the input, and outputs the maximum value in each sub-region the filter covers. Other options for pooling are average pooling and L2-norm pooling. The pooling layer reduces the spatial dimension of the input volume, which reduces computational cost and controls overfitting. The last layer of the network is a fully connected layer, which takes the output of the last convolutional layer and outputs an n-dimensional output vector representing the quantity to be predicted (e.g., image classification probabilities: 20% car, 75% boat, 5% bus, and 0% bicycle), i.e., it yields the predicted output 106 (O*), for example that this is probably a picture of a boat. The output may instead be a scalar data point that the network is predicting, such as a stock price. As described more fully below, the trained weights 104 may differ for each convolutional layer 103. To make such a real-world prediction/detection (e.g., that it is a boat), the neural network must be trained on known data inputs, or training examples, resulting in the trained CNN 100. To train the CNN 100, many different training examples (e.g., many pictures of boats) are input into the model. Those skilled in the art of neural networks will fully appreciate that the above description gives a somewhat simplified view of CNNs in order to provide context for the present discussion, and that any CNN, applied alone or in combination with other neural networks, is equally applicable and within the scope of some embodiments described herein.
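The layer sequence described above (convolution, nonlinearity, pooling, fully connected output) can be sketched on a one-dimensional signal. This is a minimal illustration under stated assumptions, not the patented network: the signal, filter, class weights, and the softmax used to turn scores into probabilities are all hypothetical choices made for the example.

```python
import math

def conv1d(signal, kernel):
    """Slide the filter over the signal (no padding) to build an activation map."""
    k = len(kernel)
    return [sum(signal[i + j] * kernel[j] for j in range(k))
            for i in range(len(signal) - k + 1)]

def relu(values):
    """Nonlinear layer: negative activations become zero."""
    return [max(0.0, v) for v in values]

def max_pool(values, size):
    """Max pooling with window length equal to stride (non-overlapping windows)."""
    return [max(values[i:i + size])
            for i in range(0, len(values) - size + 1, size)]

def dense(values, weights, biases):
    """Fully connected layer: one weighted sum per output class."""
    return [sum(w * v for w, v in zip(row, values)) + b
            for row, b in zip(weights, biases)]

def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical input and weights, chosen only for illustration.
signal = [0.0, 1.0, 3.0, 2.0, 0.0, -1.0, 4.0, 1.0]
kernel = [1.0, -1.0]
activation_map = relu(conv1d(signal, kernel))
pooled = max_pool(activation_map, 2)
class_weights = [[0.5, 1.0, -0.5], [-0.5, 0.2, 1.0]]
class_biases = [0.0, -0.4]
probabilities = softmax(dense(pooled, class_weights, class_biases))
```

In a trained network the filter and class weights would come from training rather than being written by hand; here they only make the data flow visible.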

FIG. 1B illustrates training a CNN 108. In FIG. 1B, the convolutional layers 103 are shown as individual hidden convolutional layers 105, 105' through 105^(n-1), and the final, nth layer is a fully connected layer. It should be understood that the final layer may be more than one fully connected layer. A training example 111 is input into the convolutional layers 103, and nonlinear activation functions (not shown) and weights 110, 110' through 110^n are applied to the training example 111 in succession, where the output of any hidden layer is the input to the next layer, and so on, until the final, nth fully connected layer 105^n produces an output 114. The output or prediction 114 is compared with the training example 111 (e.g., a picture of a boat), yielding a difference 116 between the output or prediction 114 and the training example 111. If the difference or loss 116 is less than some preset loss (e.g., the output or prediction 114 predicts that the object is a boat), the CNN has converged and is considered trained. If the CNN has not yet converged, backpropagation is used to update the weights 110, 110' through 110^n according to how close the prediction is to the known input. Those skilled in the art will appreciate that methods other than backpropagation may be used to adjust the weights. A second training example (e.g., a picture of a different boat) is input, the process is repeated with the updated weights, the weights are updated again, and so on, until the nth training example (e.g., the nth picture of the nth boat) has been input. The process is repeated over the same n training examples until the CNN is trained, that is, until it converges to the correct output for the known inputs. Once the CNN 108 is trained, the weights 110, 110' through 110^n (i.e., the weights 104 depicted in FIG. 1A) are fixed and used in the trained CNN 100. As explained, there are different weights for each convolutional layer 103 and for each fully connected layer. The trained CNN 100, or model, is then fed image data to determine or predict what it was trained to predict/recognize (e.g., a boat), as described above. Any trained model, CNN, RNN, etc. may be trained further, i.e., its weights may be allowed to change, using additional training examples or using predicted data that the model outputs and that is then used as training examples. A machine learning model may be trained "offline", e.g., on a computing platform separate from the platform that uses/executes the trained model, and then transferred to that platform. Alternatively, embodiments described herein may update the machine learning model periodically or continuously based on newly acquired training data. Such update training may take place on a separate computing platform that delivers the updated trained model over a network connection to the platform that uses/executes the retrained model, or the training/retraining/updating may take place on the platform that uses/executes the retrained model itself as new data is acquired. Those skilled in the art will appreciate that CNNs are applicable to data in fixed arrays (e.g., pictures, characters, words, etc.) or to time series of data. For example, a CNN may be used to model sequential health indicator data and other factor data. Some embodiments use a feed-forward CNN with skip connections and a Gaussian mixture model output to determine a probability distribution for a predicted health indicator (e.g., heart rate, PPG, or arrhythmia).
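The training procedure just described (predict, compare with the known answer, update the weights, repeat over the same examples until the loss falls below a preset value) can be sketched with a deliberately tiny model. This is an illustration only: a two-parameter linear model stands in for the CNN, the gradient step is the one-layer special case of backpropagation, and the learning rate, preset loss, and training data are assumptions made for the example.

```python
def train(examples, learning_rate=0.05, preset_loss=1e-6, max_epochs=10_000):
    """Repeat over the same examples until the mean squared loss is below preset_loss."""
    w, b = 0.0, 0.0  # untrained weights
    for epoch in range(max_epochs):
        total_loss = 0.0
        for x, target in examples:
            prediction = w * x + b
            diff = prediction - target       # difference between prediction and known answer
            total_loss += diff * diff
            # Gradient of the squared error with respect to each weight.
            w -= learning_rate * 2.0 * diff * x
            b -= learning_rate * 2.0 * diff
        if total_loss / len(examples) < preset_loss:
            return w, b, epoch + 1           # converged: weights are now fixed
    return w, b, max_epochs                  # stopped without reaching the preset loss

# Hypothetical training examples drawn from y = 2x + 1.
examples = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0), (3.0, 7.0)]
w, b, epochs_used = train(examples)
```

Once `train` returns, `w` and `b` play the role of the fixed, trained weights; calling `train` again on newly collected examples corresponds to the retraining or updating mentioned above.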

Some embodiments may use other types and configurations of neural networks. The number of convolutional layers and the number of fully connected layers may be increased or decreased. In general, the optimal number and ratio of convolutional layers to fully connected layers can be set experimentally, by determining which configuration gives the best performance on a given data set. The number of convolutional layers may be reduced to zero, leaving a fully connected network. The number of convolutional filters and the width of each filter may also be increased or decreased.

The output of the neural network may be a single scalar value corresponding to a point prediction of the primary time series. Alternatively, the output of the neural network may be a logistic regression in which each class corresponds to a particular range or category of primary time series values; these are among any number of alternative outputs readily appreciated by those skilled in the art.

In some embodiments, a Gaussian mixture model output is used to constrain the network to learn well-formed probability distributions and to improve generalization from limited training data. In some embodiments, multiple components are used in the Gaussian mixture model to allow the model to learn multimodal probability distributions. Machine learning models that combine or aggregate the results of different neural networks may also be used.
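To make the Gaussian mixture output concrete, the sketch below evaluates a two-component mixture as a predicted heart-rate distribution. It is a hedged illustration: in an actual network the mixture weights, means, and standard deviations would be produced by the output layer, whereas here they are hypothetical constants, and the negative log-likelihood shown is one common training loss for such outputs, not necessarily the one used in the embodiments.

```python
import math

def gaussian_pdf(x, mean, std):
    """Density of a single Gaussian component at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))

def mixture_pdf(x, weights, means, stds):
    """Density of the mixture: weighted sum of component densities."""
    return sum(w * gaussian_pdf(x, m, s) for w, m, s in zip(weights, means, stds))

def mixture_mean_and_std(weights, means, stds):
    """Overall mean and standard deviation of the mixture distribution."""
    mean = sum(w * m for w, m in zip(weights, means))
    second_moment = sum(w * (s * s + m * m) for w, m, s in zip(weights, means, stds))
    return mean, math.sqrt(second_moment - mean * mean)

def negative_log_likelihood(x, weights, means, stds):
    """Loss for one measured value under the predicted mixture."""
    return -math.log(mixture_pdf(x, weights, means, stds))

# Hypothetical bimodal prediction: mostly resting (about 62 bpm), sometimes active (about 95 bpm).
weights, means, stds = [0.7, 0.3], [62.0, 95.0], [4.0, 8.0]
predicted_mean, predicted_std = mixture_mean_and_std(weights, means, stds)
```

A measured value near either mode yields a low loss, while a value between or far from the modes yields a high one, which is what lets a multi-component output represent a multimodal prediction.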

A machine learning model with an updatable memory or state, carried from previous predictions and applied to subsequent predictions, is another approach to modeling sequential data. In particular, some embodiments described herein use recurrent neural networks. Referring to the example of FIG. 2A, a diagram of a trained recurrent neural network (RNN) 200 is shown. The trained RNN 200 has an updatable state (S) 202 and trained weights (W) 204. Input data 206 is input into the state 202, to which the weights (W) 204 are applied, and a prediction 206 (P*) is output. In contrast to linear neural networks (e.g., CNN 100), the state 202 is updated based on the input data, and so serves as memory from the previous state for the next prediction on the next data in the sequence. Updating the state gives the RNN its circular or looping character. To illustrate this better, FIG. 2B shows the trained RNN 200 unrolled and its applicability to sequential data. When unrolled, the RNN looks similar to a CNN, but in an unrolled RNN each apparently similar layer is in fact a single layer with an updated state, with the same weights applied in each iteration of the loop. Those skilled in the art will appreciate that the single layer may itself have sublayers, but for clarity of explanation a single layer is described here. Input data at time t (I_t) 208 is input into the state at time t (S_t) 210, and the trained weights 204 are applied within the cell at time t (C_t) 212. The output of C_t 212 is the prediction 214 for time step t+1 and the updated state S_{t+1} 216. Similarly, in C_{t+1} 220, I_{t+1} 218 is input into S_{t+1} 216, the same trained weights 204 are applied, and the output of C_{t+1} 220 is prediction 222. As noted above, S_{t+1} is updated from S_t, so S_{t+1} has memory of S_t from the previous time step. For example, and without limitation, this memory may include previous health indicator data or previous other factor data from one or more previous time steps. The process continues for n steps, where I_{t+n} 224 is input into S_{t+n} 226 and the same weights 204 are applied. The output of cell C_{t+n} is a prediction. In particular, the state is updated from the previous time step, giving the RNN the benefit of memory from previous states. For some embodiments, this property makes the RNN an attractive option for making predictions on sequential data. Nevertheless, and as noted above, there are other suitable machine learning techniques for making such predictions on sequential data, including CNNs.
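The unrolled RNN described above can be sketched as a loop in which the same weights are applied at every step and only the state changes. This is a minimal scalar illustration; the weight values, the tanh update, and the function names are assumptions for the example rather than details from the figures.

```python
import math

def rnn_step(state, x, w_state, w_input, w_out):
    """One cell: combine the previous state with the current input.

    Returns the prediction for the next time step and the updated state.
    """
    new_state = math.tanh(w_state * state + w_input * x)
    prediction = w_out * new_state
    return prediction, new_state

def run_rnn(inputs, w_state=0.5, w_input=0.1, w_out=100.0, initial_state=0.0):
    """Unrolled loop: the same weights are reused at each step; only the state is updated."""
    state = initial_state
    predictions = []
    for x in inputs:
        prediction, state = rnn_step(state, x, w_state, w_input, w_out)
        predictions.append(prediction)
    return predictions, state

# Identical inputs give different predictions because the state carries memory.
with_memory, _ = run_rnn([1.0, 1.0, 1.0])
# Setting w_state to zero removes the memory, so identical inputs give identical outputs.
without_memory, _ = run_rnn([1.0, 1.0, 1.0], w_state=0.0)
```

Comparing `with_memory` and `without_memory` shows the effect of the state: with the recurrent weight set to zero the model reduces to a stateless function of the current input.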

Like a CNN, an RNN can process a string of data as input and output a predicted string of data. A simple way to explain this aspect of using an RNN is the example of natural language prediction. Take the phrase: The sky is blue. The string of words (i.e., the data) has context. As the state is updated, the string of data is carried from one iteration to the next, which provides the context for predicting "blue". As just described, an RNN has a memory component that helps it make predictions on sequential data. However, the memory in the updated state of an RNN may be limited in how far back it can look, much like short-term memory. When it is desired to predict sequential data with a longer look-back, more like long-term memory, this can be achieved with a modification of the RNN just described. A sentence in which the word to be predicted is not clear from the immediately preceding or surrounding words is again a simple example: Mary speaks fluent French. From the immediately preceding words it is not clear that "French" is the correct prediction; it is only clear that some language is the correct prediction, but which one? The correct prediction may lie in the context of words separated by a gap larger than a single string of words. Long Short-Term Memory (LSTM) networks are a special kind of RNN capable of learning these longer-term dependencies.

As described above, RNNs have a relatively simple repeating structure; for example, they may contain a single layer with a nonlinear activation function (e.g., tanh or sigmoid). LSTMs likewise have a chain-like structure but have (for example) four neural network layers rather than one. These additional neural network layers give the LSTM the ability to remove information from, or add information to, the state (S) through structures called cell gates. Ibid. FIG. 3 shows a cell 300 for an LSTM RNN. Line 302 represents the cell state (S) and can be thought of as an information highway; it is relatively easy for information to flow along the cell state unchanged. Ibid. Cell gates 304, 306, and 308 determine how much information is allowed through the state, or along the information highway. Cell gate 304, the so-called forget gate layer, first decides how much information to remove from the cell state S_t. Ibid. Next, cell gates 306 and 306' determine what information will be added to the cell state, and cell gates 308 and 308' determine what information will be output from the cell state as the prediction. The information highway, or cell state, is now the updated cell state S_{t+1} for use in the next cell. The LSTM allows an RNN to have more persistent, or longer-term, memory. Compared with simpler RNN structures, the LSTM gives RNN-based machine learning models the additional advantage that the output prediction takes into account context separated from the input data by a greater span of space or time, depending on how the data is sequenced.
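The gating behavior described above can be sketched as a single scalar LSTM step. This follows the standard LSTM formulation, used here as an assumption; the mapping of the forget, input, and output gates to gates 304, 306/306', and 308/308' is an interpretation of the description, and the parameter dictionary and its values are hypothetical.

```python
import math

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

def lstm_step(x, h_prev, c_prev, params):
    """One LSTM cell step with scalar input, output, and cell state.

    params maps a gate name to (input weight, recurrent weight, bias).
    """
    def pre_activation(name):
        w_x, w_h, bias = params[name]
        return w_x * x + w_h * h_prev + bias

    forget = sigmoid(pre_activation("forget"))        # share of the old cell state to keep
    write = sigmoid(pre_activation("input"))          # share of the candidate to add
    candidate = math.tanh(pre_activation("candidate"))
    expose = sigmoid(pre_activation("output"))        # share of the cell state to output

    c_new = forget * c_prev + write * candidate       # updated cell state
    h_new = expose * math.tanh(c_new)                 # output passed on as the prediction
    return h_new, c_new

# Hypothetical parameters: forget gate held open, input gate held shut,
# so the cell state passes along the "highway" unchanged.
keep_everything = {
    "forget": (0.0, 0.0, 50.0),
    "input": (0.0, 0.0, -50.0),
    "candidate": (1.0, 1.0, 0.0),
    "output": (0.0, 0.0, 0.0),
}
h_kept, c_kept = lstm_step(x=0.3, h_prev=0.1, c_prev=0.8, params=keep_everything)

# Forget gate shut as well: the old cell state is discarded.
forget_everything = dict(keep_everything, forget=(0.0, 0.0, -50.0))
h_reset, c_reset = lstm_step(x=0.3, h_prev=0.1, c_prev=0.8, params=forget_everything)
```

With the forget gate open and the input gate shut, the cell state is carried forward essentially unchanged, which is the mechanism behind the longer-term memory attributed to LSTMs.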

In some embodiments that use an RNN, the primary and secondary time series need not be provided to the RNN as vectors at each time step. Instead, the RNN is given only the current values of the primary and secondary time series, together with future values or an aggregate function of the secondary time series over the prediction interval. In this way, the RNN uses its persistent state vector to retain information about previous values for use in making predictions.

Machine learning is well suited to continuously monitoring one or more criteria in order to identify anomalies or trends, large or small, in input data compared with the training examples used to train the model. Accordingly, some embodiments described herein input a user's health indicator data, and optionally other factor data, into a trained machine learning model that predicts what a healthy person's health indicator data would look like at the next time step, and compare the prediction with the user's measured health indicator data at that future time step. If the absolute value of the difference (e.g., the loss described below) exceeds a threshold, the user is notified that his or her health indicator data is not within a normal or healthy range. The threshold is a number set by the designer and, in some embodiments, can be changed by the user to adjust notification sensitivity. The machine learning models of these embodiments may be trained on health indicator data from a healthy population, alone or in combination with (temporally) corresponding other factor data, or may be trained on other training examples to meet the design requirements of the model.

Data from a health indicator (e.g., heart rate data) is sequential data, and more specifically time-series data. Heart rate, for example and without limitation, can be measured in a number of different ways, e.g., from the electrical signal of a chest strap or derived from a PPG signal. Some embodiments obtain a derived heart rate from a device, where each data point (e.g., heart rate) is produced at approximately equal intervals (e.g., 5 seconds). In some cases, however, and in other embodiments, the derived heart rate is not provided at approximately equal time steps, for example because the data needed for the derivation is unreliable (e.g., the PPG signal is unreliable because the device moved or because of light pollution). The same is true of secondary data sequences obtained from motion sensors or other sensors used to collect other factor data.

The raw signal/data (the electrical signal from an ECG or chest strap, or the PPG signal) is itself a time series of data that can be used in accordance with some embodiments. For clarity, and not by way of limitation, this specification uses PPG to refer to data representing a health indicator. Those skilled in the art will readily appreciate that, in accordance with some embodiments described herein, the health indicator data may take the form of raw data, a waveform, or numbers derived from the raw data or waveform.

Machine learning models that may be used with the embodiments described herein include, for example and without limitation, Bayes, Markov, Gaussian processes, clustering algorithms, generative models, kernel and neural network algorithms. Some embodiments use machine learning models based on trained neural networks, other embodiments use recurrent neural networks, and additional embodiments use LSTM RNNs. For clarity, and not by way of limitation, some embodiments of this specification are described using recurrent neural networks.

FIGS. 4A-4C show hypothetical plots of PPG (FIG. 4A), steps taken (FIG. 4B), and air temperature (FIG. 4C) against time. PPG is an example of health indicator data, while steps, activity level, and air temperature are examples of other factor data for other factors that may affect the health indicator data. Those skilled in the art will appreciate that other data may be obtained from any of many known sources, including, but not limited to, accelerometer data, GPS data, a weight scale, user input, etc., and may include, but is not limited to, air temperature, activity (running, walking, sitting, cycling, falling, climbing stairs, steps, etc.), BMI, weight, height, age, etc. The first dashed line, extending vertically across all three plots, marks the time t at which user data is obtained for input into the trained machine learning model (discussed below). The hashed plot line in FIG. 4A represents the predicted or probable output data 402, and the solid line 404 in FIG. 4A represents the measured data. FIG. 4B is a hypothetical plot of the user's step count at various times, and FIG. 4C is a hypothetical plot of the air temperature at various times.

FIGS. 5A-5B depict schematics of a trained recurrent neural network 500 receiving the input data depicted in FIGS. 4A-4C, i.e., PPG (P), steps (R), and air temperature (T). Again, these input data (P, R, and T) are merely examples of health indicator data and other factor data. It should also be understood that data for more than one health indicator may be input and predicted, and that more or fewer than two kinds of other factor data may be used, the choice depending on what the model is designed for. Those skilled in the art will also appreciate that the other factor data is collected so as to correspond in time with the collection or measurement of the health indicator data. In some cases (e.g., weight), the other factor data will remain relatively constant over a given period.

FIG. 5A depicts the trained neural network 500 as a loop. P, T, and R are input into the state 502 of the RNN 500, to which the weights W are applied, and the RNN 500 outputs a predicted PPG 504 (P*). In step 506, the difference P - P* (ΔP*) is calculated, and in step 508 it is determined whether |ΔP*| is greater than a threshold. If so, step 510 notifies/alerts the user that his or her health indicator is outside the bounds/threshold predicted as normal, or predicted for a healthy person. The alert/notification/detection may be, for example and without limitation, a suggestion to see or consult a doctor, a simple notification such as haptic feedback, a request to take an additional measurement such as an ECG, a simple note without any recommendation, or any combination thereof. If |ΔP*| is less than or equal to the threshold, step 512 takes no action. After either step 510 or step 512, the process is repeated at the next time step with new user data. In this embodiment, the state is updated after the predicted data is output, and the predicted data may be used in updating the state.
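The loop of steps 506 through 512 can be sketched as follows. This is a minimal illustration under stated assumptions: the `predict` callable stands in for the trained RNN 500, the last-value predictor is a placeholder rather than a trained model, and the heart-rate-like numbers and threshold are invented for the example.

```python
def monitor(measurements, predict, threshold):
    """At each time step, predict the next value, then compare it with what was measured.

    Returns a list of (time index, difference) pairs for the steps that trigger a notification.
    """
    alerts = []
    for t in range(len(measurements) - 1):
        predicted = predict(measurements[: t + 1])   # P*: prediction for step t + 1
        measured = measurements[t + 1]               # P: value actually measured
        delta = measured - predicted                 # step 506: P - P*
        if abs(delta) > threshold:                   # step 508: compare with the threshold
            alerts.append((t + 1, delta))            # step 510: notify the user
        # step 512: otherwise take no action; either way, continue to the next step
    return alerts

def last_value_predictor(history):
    """Placeholder model (an assumption): predicts that the next value equals the latest one."""
    return history[-1]

# Hypothetical measurements with one abrupt rise and one abrupt fall.
readings = [70, 71, 72, 95, 94, 73]
alerts = monitor(readings, last_value_predictor, threshold=10)
```

Raising or lowering `threshold` corresponds to the notification sensitivity adjustment that some embodiments expose to the user.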

In another embodiment (not shown), a primary sequence of heart rate data (e.g., derived from a PPG signal) and a secondary sequence of other factor data are provided to a trained machine learning model, which may be an RNN, a CNN, another machine learning model, or a combination of such models. In this embodiment, the machine learning model is configured to receive the following as input at a reference time t:

A. a vector (V_H) of length 300 containing the last 300 health indicator samples (e.g., heart rate in beats per minute) up to and including any health indicator data at time t;

B. at least one vector (V_O) of length 300 containing the most recent other factor data (e.g., step count) at the approximate time of each sample in V_H;

C. a vector (V_DT) of length 300 in which the entry V_DT(i) at index i contains the time difference between the timestamp of health indicator sample V_H(i) and the timestamp of V_H(i-1); and

D. a scalar prediction-interval other factor rate O_rate (for example, but not limited to, step rate) representing the average other factor rate (e.g., step rate) measured over the period from t to t+τ, where τ is the future prediction interval and may be, for example, but not limited to, 2.5 minutes.
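The four inputs A through D can be assembled from timestamped samples roughly as sketched below. This is an illustration under stated assumptions: the `(timestamp, value)` tuple layout, the function name, the use of one extra sample to compute the first time difference, and the definition of O_rate as total steps divided by τ are choices made for the example, not details given in the text.

```python
from bisect import bisect_right

def build_model_inputs(hr_samples, step_samples, t, tau, length=300):
    """Build V_H, V_O, V_DT and O_rate for reference time t.

    hr_samples and step_samples are lists of (timestamp, value) sorted by timestamp.
    """
    # Take one extra sample so that the first entry of V_DT has a predecessor.
    past = [(ts, v) for ts, v in hr_samples if ts <= t][-(length + 1):]
    if len(past) < length + 1:
        raise ValueError("not enough health indicator samples before t")
    window = past[1:]

    v_h = [value for _, value in window]                                # input A
    v_dt = [cur_ts - prev_ts
            for (prev_ts, _), (cur_ts, _) in zip(past, window)]         # input C

    step_times = [ts for ts, _ in step_samples]
    v_o = []                                                            # input B
    for ts, _ in window:
        idx = bisect_right(step_times, ts) - 1    # most recent step sample at or before ts
        v_o.append(step_samples[idx][1] if idx >= 0 else 0.0)

    future_steps = [v for ts, v in step_samples if t < ts <= t + tau]
    o_rate = sum(future_steps) / tau                                    # input D
    return v_h, v_o, v_dt, o_rate

# Hypothetical, irregularly spaced samples; length=3 keeps the example readable.
hr = [(0, 60), (5, 61), (11, 63), (15, 64), (20, 66)]
steps = [(4, 2), (12, 5), (22, 6), (28, 4)]
v_h, v_o, v_dt, o_rate = build_model_inputs(hr, steps, t=15, tau=10, length=3)
```

Note that input D uses other factor data from after the reference time t, so at prediction time it is only available once the interval from t to t+τ has elapsed.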

The output of this embodiment may be, for example, a probability distribution characterizing the predicted heart rate measured over the period from t to t+τ. In some embodiments, the machine learning model is trained with training examples containing continuous time series of health indicator data and other factor data. In an alternative embodiment, the notification system assigns the timestamp t+τ/2 to each predicted health indicator (e.g., heart rate) distribution, thereby centering the predicted distribution within the prediction interval (τ). In this embodiment, the notification logic then considers all samples within a sliding window (W) of length W_L = 2τ, or 5 minutes in this example, and computes three parameters:

1. the mean of all health indicator sequence data within the time window;

2. the mean of all model predictions of the health indicator whose prediction timestamps fall within the time window; and

3. the median of the root mean square of each predicted health indicator distribution within the time window.

4. In one embodiment, a notification is generated if the mean measured value lies more than ψ times the median root-mean-square value above or below the mean predicted value, where ψ is a threshold.

In this embodiment, an alert is generated when the measured health indicator values within a particular window W lie more than a certain number of standard deviations from the mean of the predicted health indicator values. The window W may be applied in a sliding manner across the sequences of measured and predicted health indicator values, with each window overlapping the previous window in time by a designer-specified amount (e.g., 0.5 minutes).
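The windowed notification logic can be sketched as below. The exact conditions are not legible in this text, so the comparison used here, |mean measured - mean predicted| > ψ × median spread, is an assumption inferred from the description of an alert at "a certain number of standard deviations"; the tuple layouts, function names, and sample values are likewise hypothetical.

```python
import statistics

def should_notify(measured, predicted, window_start, window_length, psi):
    """Decide whether one window triggers a notification.

    measured:  list of (timestamp, value) health indicator samples
    predicted: list of (timestamp, mean, spread) per predicted distribution,
               where spread is that distribution's root-mean-square deviation
    """
    window_end = window_start + window_length
    values = [v for ts, v in measured if window_start <= ts < window_end]
    preds = [(m, s) for ts, m, s in predicted if window_start <= ts < window_end]
    if not values or not preds:
        return False                                   # nothing to compare in this window

    mean_measured = statistics.fmean(values)                       # parameter 1
    mean_predicted = statistics.fmean([m for m, _ in preds])       # parameter 2
    median_spread = statistics.median([s for _, s in preds])       # parameter 3
    # Assumed condition: too far above or below the prediction, in units of spread.
    return abs(mean_measured - mean_predicted) > psi * median_spread

def sliding_window_starts(first, last, window_length, overlap):
    """Start times of successive windows, each overlapping the previous one by `overlap`."""
    step = window_length - overlap
    starts = []
    start = first
    while start + window_length <= last:
        starts.append(start)
        start += step
    return starts

# Hypothetical data in seconds: a 5-minute window (tau = 150 s, W_L = 2 * tau).
predicted = [(75, 70.0, 5.0), (150, 72.0, 6.0), (225, 71.0, 5.0)]
elevated = [(0, 100.0), (60, 102.0), (120, 104.0)]
typical = [(0, 70.0), (60, 72.0), (120, 74.0)]
alert_elevated = should_notify(elevated, predicted, 0, 300, psi=3.0)
alert_typical = should_notify(typical, predicted, 0, 300, psi=3.0)
starts = sliding_window_starts(0, 900, window_length=300, overlap=30)
```

Using the median of the spreads rather than their mean makes the tolerance less sensitive to a single unusually wide or narrow predicted distribution inside the window.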

The notification may take any number of different forms. For example, and without limitation, the user may be told to obtain an ECG and/or a blood pressure reading, a computing system (e.g., a wearable computing system) may be directed to obtain an ECG or blood pressure reading automatically, the user may be told to see a doctor, or the user may simply be informed that the health indicator data is abnormal.

In this embodiment, V_DT is chosen as a model input so that the model can exploit the information contained in the variable spacing between the health indicator data in V_H, where the variable spacing may arise from an algorithm that derives the health indicator data from less consistent raw data. For example, the Apple Watch algorithm generates heart rate samples only when the raw PPG data is reliable enough to yield a reliable heart rate value, which results in irregular time gaps between heart rate samples. In a similar manner, this embodiment uses a vector of other factor data (V_O), of the same length as the other vectors, to handle the different and irregular sampling rates of the primary sequence (health indicator) and the secondary sequence (other factors). In this embodiment, the secondary sequence is remapped or interpolated onto the same time points as the primary time series.
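A minimal sketch of how the three equal-length input vectors might be assembled is given below. It assumes that V_DT holds the time gap between consecutive primary samples and that linear interpolation is used for the remapping; neither detail is stated in the text, and the function names are hypothetical.

```python
from bisect import bisect_left

def interp_at(t, xs, ys):
    """Linearly interpolate the series (xs, ys) at time t, clamping at the ends.

    xs must be strictly ascending.
    """
    if t <= xs[0]:
        return ys[0]
    if t >= xs[-1]:
        return ys[-1]
    i = bisect_left(xs, t)            # xs[i-1] < t <= xs[i]
    w = (t - xs[i - 1]) / (xs[i] - xs[i - 1])
    return ys[i - 1] + w * (ys[i] - ys[i - 1])

def build_inputs(hr_times, hr_values, other_times, other_values):
    """Return (V_H, V_DT, V_O), all indexed by the primary (heart rate) samples."""
    v_h = list(hr_values)
    # Assumed meaning of V_DT: gap to the previous primary sample (0 for the first).
    v_dt = [0] + [b - a for a, b in zip(hr_times, hr_times[1:])]
    # Secondary series remapped onto the primary time points.
    v_o = [interp_at(t, other_times, other_values) for t in hr_times]
    return v_h, v_dt, v_o
```

For example, heart rate samples at 0, 5, and 12 seconds with step counts reported at 0, 10, and 20 seconds yield three vectors of length three, with the step counts interpolated at the heart rate timestamps.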

In addition, in some embodiments, the configuration of the secondary time series data that is supplied as input to the machine learning model for the future prediction interval (e.g., after t) may be modified. In some embodiments, the single scalar value containing the average other factor data rate over the prediction interval may be replaced by multiple scalar values (e.g., one scalar value per secondary time series). Alternatively, a vector of values over the prediction interval may be used. The prediction interval itself may also be adjusted. For example, a shorter prediction interval can provide a faster response to changes and improved detection of events whose underlying time scale is short, but may also be more sensitive to interference from noise sources (e.g., motion artifacts).

Similarly, the output prediction of the machine learning model itself need not be a scalar. For example, some embodiments may generate a time series of predictions for multiple times within the interval between t and t+τ, and the alert logic may compare each of these predictions with the measured values within the same interval.

In the preceding embodiment, the machine learning model itself may include, for example, a 7-layer feedforward neural network. The first 3 layers may be convolutional layers with 32 kernels each, every kernel having a width of 24 and a stride of 2. The first layer may take the arrays V_H, V_O, and V_TD as input in three channels. The last 4 layers may be fully connected layers, all of which, except the last, use a hyperbolic tangent activation function. The output of the third layer may be flattened into a single array for input to the first fully connected layer. The last layer outputs 30 values, which parameterize a Gaussian mixture model with 10 components (three parameters per component: mean, variance, and weight). The network uses a skip connection between the first and third fully connected layers, so that the output of layer 6 is summed with the output of layer 4 to produce the input of layer 7. Standard batch normalization, with a decay of 0.97, may be used on all layers except the last. Skip connections and batch normalization can improve the ability to propagate gradients through the network.
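The step from 30 raw outputs to a 10-component Gaussian mixture can be illustrated as follows. The ordering of the outputs (means, then variances, then weights) and the use of an exponential and a softmax to keep variances positive and weights normalized are assumptions made for this sketch; the specification states only that each component has a mean, a variance, and a weight.

```python
import math

def split_gmm_params(outputs, k=10):
    """Turn 3*k raw network outputs into (means, variances, weights)."""
    if len(outputs) != 3 * k:
        raise ValueError("expected 3*k outputs")
    means = list(outputs[:k])
    # Assumed: raw values are log-variances, so exp() keeps variances positive.
    variances = [math.exp(v) for v in outputs[k:2 * k]]
    # Assumed: softmax over raw weights so that the weights sum to 1.
    raw = outputs[2 * k:]
    top = max(raw)
    exps = [math.exp(r - top) for r in raw]
    total = sum(exps)
    weights = [e / total for e in exps]
    return means, variances, weights

def gmm_pdf(x, means, variances, weights):
    """Evaluate the mixture density at x."""
    return sum(
        w * math.exp(-(x - mu) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)
        for mu, var, w in zip(means, variances, weights)
    )

def gmm_mean(means, weights):
    """Mean of the mixture, one possible point prediction."""
    return sum(w * mu for mu, w in zip(means, weights))
```

Evaluating `gmm_pdf` at a measured heart rate gives the density the model assigned to that value, which is the quantity a likelihood-based loss would use.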

The choice of machine learning model may affect the performance of the system. Machine learning model configuration can be divided into two types of considerations. The first is the internal architecture of the model, that is, the choice of model type (convolutional neural network, recurrent neural network, random forest, or other generalized nonlinear regression) and the parameters that characterize the implementation of the model (generally the number of parameters and/or the number of layers, the number of decision trees, etc.). The second is the external architecture of the model: the arrangement of the data fed into the model and the specific parameters of the problem the model is asked to solve. The external architecture may be characterized in part by the dimensionality and type of the data provided as input to the model, the time range spanned by that data, and the pre-processing or post-processing applied to the data.

In general, the choice of external architecture is a balance between, on the one hand, increasing the number of parameters and the amount of information provided as input, which can increase the predictive power of the machine learning model, and, on the other hand, the availability of storage and computing power to train and evaluate larger models and of enough data to prevent overfitting.

Many variations of the external architecture of the models discussed in some embodiments are possible. The number of input vectors may be modified, as may their absolute length (number of elements) and the time span they cover. The input vectors need not have the same length or cover the same time span. The data need not be sampled at equal time intervals. For example, and without limitation, a 6-hour history of heart rate data may be provided in which data from less than 1 hour before t is sampled at 1 Hz, data from between 1 and 2 hours before t is sampled at 0.5 Hz, and data from more than 2 hours before t is sampled at 0.1 Hz, where t is a reference time.
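The 6-hour example above can be checked with a short sketch that lists the sample offsets before the reference time t. The band boundaries and rates come from the text; expressing each rate as a whole-second sampling period (1 Hz as 1 s, 0.5 Hz as 2 s, 0.1 Hz as 10 s) and the function name are choices made here for illustration.

```python
HOUR = 3600  # seconds

# (start, end, period): seconds before t, with the sampling period in seconds.
BANDS = [
    (0 * HOUR, 1 * HOUR, 1),    # last hour at 1 Hz
    (1 * HOUR, 2 * HOUR, 2),    # 1 to 2 hours back at 0.5 Hz
    (2 * HOUR, 6 * HOUR, 10),   # 2 to 6 hours back at 0.1 Hz
]

def history_offsets(bands=BANDS):
    """Return the offsets (seconds before t) of every sample, newest first."""
    offsets = []
    for start, end, period in bands:
        offsets.extend(range(start, end, period))
    return offsets

offsets = history_offsets()
print(len(offsets))  # 3600 + 1800 + 1440 samples
```

Under these assumptions the input vector holds 6,840 elements, compared with 21,600 if the whole 6 hours were sampled at 1 Hz.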

FIG. 5B shows the unrolled trained RNN 500. Input data 513 (P_t, R_t, and T_t) is input to the state (S_t) 514 at time t, and the trained weights 516 are applied. The output of neuron (C_t) 518 is the prediction 520 for time t+1 and the updated state S_t+1 522. Similarly, in C_t+1 524, input data (P_t+1, R_t+1, and T_t+1) 513' is input into S_t+1 522, the trained weights 516 are applied, and the output of C_t+1 524 is output 523. As described above, S_t+1 is obtained by updating S_t, so S_t+1 carries memory of S_t from the previous time step, resulting from the operations in neuron (C_t) 518. The process continues for n steps, with input data (P_n, R_n, and T_n) 513'' input into S_n 530 and the trained weights 516 applied. The output of the neuron is the prediction 532. Notably, the trained RNN always applies the same weights but, more importantly, updates the state based on the previous time step, which gives the RNN the benefit of memory from previous time steps. Those skilled in the art will appreciate that the temporal order in which the dependent health indicator data is input may vary and still produce the desired result. For example, measured health indicator data from the previous time step (e.g., P_t-1) and other factor data from the current time step (e.g., R_t and T_t) may be input into the state at the current time step (S_t), in which case the model predicts the health indicator at the current time step, P*_t. As described above, this predicted health indicator is then compared with the measured health indicator data at the current time step to determine whether the user's health indicator is normal or within a healthy range.
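The unrolling described for FIG. 5B, in which the same trained weights are applied at every step while the state carries memory forward, can be illustrated with a toy cell. The scalar state, the tanh update, and the weight names are assumptions for illustration only and are not the network of the specification.

```python
import math

def rnn_step(state, p, r, temp, w):
    """One cell: update the state from (P, R, T), then predict the next P."""
    new_state = math.tanh(
        w["s"] * state + w["p"] * p + w["r"] * r + w["T"] * temp + w["b"]
    )
    prediction = w["out"] * new_state + w["out_b"]
    return prediction, new_state

def run_unrolled(inputs, w, state=0.0):
    """Apply the same weights w at every time step, threading the state through.

    inputs: list of (P_t, R_t, T_t) tuples, oldest first.
    Returns the list of predictions (one per step) and the final state.
    """
    predictions = []
    for p, r, temp in inputs:
        pred, state = rnn_step(state, p, r, temp, w)
        predictions.append(pred)
    return predictions, state
```

Because the state is threaded through the loop, changing only the first input changes later predictions, which is the memory effect described above.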

FIG. 5C shows an alternative embodiment of a trained RNN for determining whether the user's health indicator sequence data (PPG in our example) is within the band or threshold of a healthy person. The input data in this embodiment is a linear combination I_t = α_t P*_t + (1 - α_t) P_t, where P*_t is the predicted health indicator value at time t and P_t is the measured health indicator at time t. In this embodiment, α ranges nonlinearly from 0 to 1 as a function of the loss (L); the loss and α are discussed in more detail below. For now, it is worth noting that when α is close to 0, the measured data P_t is input into the network, whereas when α is close to 1, the predicted data P*_t is input into the network to make the prediction at the next time step. Other factor data at time t (O_t) may optionally also be input.

I_t and O_t are the inputs to state S_t, where, in some embodiments, state S_t outputs a probability distribution (β) of the predicted health indicator data at time step t+1, β_(P*), where β_(P*) is the probability distribution function of the predicted health indicator (P*). In some embodiments, the probability distribution function is sampled to select the predicted health indicator value at t+1. As those skilled in the art will appreciate, β_(P*) may be sampled by different methods depending on the goals of the network designer, including taking the mean of the probability distribution, taking its maximum, or sampling it at random. Evaluating β^t+1 at the measured data for time t+1 gives the probability that state S_t+1 would have predicted for the measured data.

To illustrate this concept, FIG. 5D shows a hypothetical probability distribution over a hypothetical range of health indicator data at time t+1. The function is sampled, for example, at its maximum probability of 0.95 to determine the predicted health indicator at time t+1. The probability distribution (β^t+1) is also evaluated at the measured or actual health indicator data to determine the probability the model would have predicted had the actual data been input into the model. In this example, that probability is 0.85.
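The three ways of sampling the distribution mentioned above (mean, maximum, random) can be sketched over a discretized distribution. The grid of candidate values and the third score are invented for illustration; only the scores 0.95 and 0.85 echo the hypothetical figures in the text, and the scores are treated as unnormalized weights.

```python
import random

def sample_prediction(values, scores, method="max", rng=None):
    """Pick a predicted value from a discretized distribution.

    values: candidate health indicator values
    scores: non-negative weights for those values (need not sum to 1)
    method: "mean", "max", or "random"
    """
    if method == "mean":
        return sum(v * s for v, s in zip(values, scores)) / sum(scores)
    if method == "max":
        return values[scores.index(max(scores))]
    if method == "random":
        chooser = rng if rng is not None else random
        return chooser.choices(values, weights=scores, k=1)[0]
    raise ValueError("unknown method: " + method)

def score_of(measured, values, scores):
    """Evaluate the distribution at the measured value (exact grid match only)."""
    return scores[values.index(measured)]

values = [60, 70, 80]          # hypothetical heart rate grid
scores = [0.20, 0.95, 0.85]    # hypothetical distribution over that grid
predicted = sample_prediction(values, scores, "max")   # picks the 0.95 peak
observed_score = score_of(80, values, scores)          # 0.85 for a measured 80
```

Which sampling method is appropriate depends on the designer's goals: the maximum gives the single most likely value, while random sampling preserves the spread of the distribution.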

A loss may be defined to help determine whether to notify the user that his or her health condition is not within the normal range predicted by the trained machine learning model. The loss is chosen to model how close the predicted data is to the actual or measured data. Those skilled in the art will appreciate that there are many ways to define a loss. In other embodiments described herein, for example, the absolute value of the difference between the predicted data and the actual data (|ΔP*|) is the loss. In some embodiments, the loss (L) may be L = -ln[β_(P)], where L is a measure of how close the predicted data is to the measured or actual data. β_(P) ranges from 0 to 1, where 1 means that the predicted value and the measured value are the same. A low loss therefore indicates that the predicted value is likely the same as, or close to, the measured value; in this context, a low loss means that the measured data looks as though it came from a healthy/normal person. In some embodiments, a threshold is set for L, for example L > 5, above which the user is notified that the health indicator data is outside the range considered healthy. Other embodiments may take the average of the loss over a period of time and compare that average with a threshold. In some embodiments, the threshold itself may be a statistic computed from the predicted values or a function of the average of the predicted values. In some embodiments, the following relation may be used to notify the user that the health indicator is not within a healthy range:

|<P_range> - <P*_range>| > f(σ), where:

<P_range> is determined by averaging the measured health indicator data over a certain time range;

<P*_range> is determined by averaging the predicted health indicator data over the same time range; σ is the median of the sequence of standard deviations obtained from the network over the same time range; and

f(σ) is a function of the standard deviation evaluated at <P*_range>, and may be used as the threshold.

Averaging methods that may be used include, for example and without limitation, the average, arithmetic mean, median, and mode. In some embodiments, outliers are removed so that they do not skew the calculated figures.
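As a worked illustration of the negative-log loss and the example threshold L > 5, the following sketch computes the loss for a given probability and checks an averaged loss against a threshold. The function names are hypothetical, and the default threshold of 5 is simply the example value from the text.

```python
import math
from statistics import mean

def nll_loss(prob):
    """Loss L = -ln(prob), where prob is the model's probability of the measurement."""
    if not 0.0 < prob <= 1.0:
        raise ValueError("prob must be in (0, 1]")
    return -math.log(prob)

def average_loss_exceeds(probs, threshold=5.0):
    """Average the per-step losses over a period and compare with the threshold."""
    return mean(nll_loss(p) for p in probs) > threshold

# A probability of 0.85 gives a small loss (about 0.16), far below 5.
loss_example = nll_loss(0.85)

# L > 5 corresponds to a probability below exp(-5), roughly 0.0067.
prob_at_threshold = math.exp(-5.0)
```

Averaging the loss over several time steps, as some embodiments do, makes the decision less sensitive to a single improbable reading.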

Referring back to the input data I_t of the embodiment depicted in FIG. 5C, α_t is defined as a function of L and ranges from 0 to 1. For example, α(L) may be a linear function or a nonlinear function, or may be linear over one range of L and nonlinear over a separate range of L. In one example, as shown in FIG. 5E, the function α(L) is linear for L between 0 and 3, quadratic for L between 3 and 13, and equal to 1 for L greater than 13. In this embodiment, when L is between 0 and 3 (i.e., when the predicted and measured health indicator data approximately match), α is close to zero and the input data I_t+1 is approximately the measured data P_t+1. When L is large (e.g., greater than 13), α(L) is 1, which makes the input data I_t+1 equal to P*_t+1 (the predicted health indicator at time t+1). When L is between 3 and 13, α(L) varies quadratically, and the relative contributions of the predicted and measured health indicator data to the input data vary accordingly. In this embodiment, the linear combination of predicted and measured health indicator data weighted by α(L) allows the input data at any particular time step to be weighted between the predicted data and the measured data. In all of these examples, the input data may also include other factor data (O_t). This is just one example of self-sampling, in which some combination of predicted data and measured data is used as the input to the trained network. Those skilled in the art will appreciate that other examples may be used.
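One possible shape for α(L), and the resulting blend of measured and predicted data, is sketched below. Only the breakpoints at L = 3 and L = 13 and the linear, quadratic, and constant regimes come from the text. The value reached at L = 3 (here 0.1) and the exact coefficients are assumptions chosen so that the curve is continuous.

```python
ALPHA_AT_3 = 0.1  # assumed value of alpha at L = 3; not given in the text

def alpha(loss):
    """Piecewise weight: linear on [0, 3], quadratic on (3, 13], 1 above 13."""
    if loss <= 0.0:
        return 0.0
    if loss <= 3.0:
        return ALPHA_AT_3 * loss / 3.0
    if loss <= 13.0:
        frac = (loss - 3.0) / 10.0
        return ALPHA_AT_3 + (1.0 - ALPHA_AT_3) * frac * frac
    return 1.0

def mix_input(measured, predicted, loss):
    """Input for the next step: alpha * predicted + (1 - alpha) * measured."""
    a = alpha(loss)
    return a * predicted + (1.0 - a) * measured
```

With a small loss the network is fed almost pure measured data; with a large loss it is fed its own prediction, so that an anomalous stretch of measurements does not drag the network's state away from what is expected for a healthy person.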

The embodiments use a trained machine learning model. In some embodiments, the machine learning model uses a recurrent neural network, which requires a trained RNN. By way of example and not limitation, FIG. 6 depicts an unrolled RNN, according to some embodiments, to illustrate training of the RNN. Neuron 602 has an initial state S₀ 604 and a weight matrix W 606. The step rate data R₀, air temperature data T₀, and initial PPG data P₀ at time step 0 are input into state S₀, the weights W are applied, and neuron 602 outputs the predicted PPG for time step 1 (P*₁), which is used together with the PPG obtained at time step 1 (P₁) to calculate ΔP*₁. Neuron 602 also outputs the updated state 608 (S₁) at time step 1, which passes into neuron 610. The step rate data R₁, air temperature data T₁, and PPG data P₁ at time step 1 are input into S₁, the weights 606 W are applied, and neuron 610 outputs the predicted PPG for time step 2 (P*₂), which is used together with the PPG obtained at time step 2 (P₂) to calculate ΔP*₂. Neuron 610 also outputs the updated state 612 (S₂) at time step 2, which passes into neuron 614. The step rate data R₂, air temperature data T₂, and PPG data P₂ at time step 2 are input into S₂, the weights 606 W are applied, and neuron 614 outputs the predicted PPG for time step 3 (P*₃), which is used together with the PPG obtained at time step 3 (P₃) to calculate ΔP*₃. The process continues until the state 616 at time step n is output and ΔP*ₙ is calculated. As in the training of a convolutional neural network, the ΔP* values are used in backpropagation to adjust the weight matrix. Unlike a convolutional network, however, the recurrent neural network applies the same weight matrix at every iteration; during training, the weight matrix is modified only in backpropagation. Many training examples containing health indicator data and the corresponding other factor data are repeatedly input into the RNN 600 until it converges. As discussed previously, an LSTM RNN may be used in some embodiments, where the state of such a network provides longer-term contextual analysis of the input data and can therefore give better predictions when the network has learned longer-term correlations. As mentioned, and as those skilled in the art will readily appreciate, other machine learning models fall within the scope of the embodiments described herein and may include, for example and without limitation, CNNs or other feedforward networks.

FIG. 7A depicts a system 700 for predicting whether a user's measured health indicator is within or outside a threshold that is normal for a healthy person under similar other factors. The system 700 has a machine learning model 702 and a health detector 704. For example, and without limitation, embodiments of the machine learning model 702 include a trained machine learning model, a trained RNN, a CNN, or another feedforward network. The trained RNN, other network, or combination of networks may be trained with training examples from a healthy population from which health indicator data and (temporally) corresponding other factor data have been collected. Alternatively, the trained RNN, other network, or combination of networks may be trained with training examples from a specific user, making it a personalized trained machine learning model. Those skilled in the art will appreciate that, in general, training examples from different populations may be selected according to the use or design of the trained network and system. Those skilled in the art will also readily appreciate that the health indicator data in this and other embodiments may relate to one or more health indicators. For example, and without limitation, one or more of PPG data, heart rate data, blood pressure data, body temperature data, and blood oxygen concentration data may be used to train the model and predict the user's health. The health detector 704 uses the prediction 708 from the machine learning model 702 and the input data 710 to determine whether the loss, or another metric determined by analyzing the predicted output against the measured data, exceeds a threshold considered normal and is therefore unhealthy. The system 700 then outputs a notification or the user's health status. The notification may take many forms, as discussed herein. The input generator 706 uses a sensor (not shown) to obtain data continuously from a user wearing or in contact with the sensor, where the data represents one or more health indicators of the user. The (temporally) corresponding other factor data may be collected by another sensor or obtained by other means described herein or apparent to those skilled in the art.

The input generator 706 may also collect data for determining or calculating the other factor data. For example, and without limitation, the input generator may include a smartwatch, a wearable or mobile device (e.g., an Apple Watch, smartphone, tablet computer, or laptop computer), a combination of a smartwatch and a mobile device, a surgically implanted device capable of sending data to a mobile device or other portable computing device, or a device on a cart in a medical care facility. Preferably, the user input generator 706 has a sensor (e.g., a PPG sensor, an electrode sensor, etc.) for measuring data related to one or more health indicators. In some embodiments, the smartwatch, tablet computer, mobile phone, or laptop computer may carry the sensor, or the sensor may be located remotely (surgically implanted, in contact with the body away from the mobile device, or on some separate device); in all of these cases, the mobile device communicates with the sensor to collect the health indicator data. In some embodiments, the system 700 may be provided on a mobile device alone, in combination with other mobile devices, or in combination with other computing systems via communication over a network through which these devices can communicate. For example, and without limitation, the system 700 may be a smartwatch or wearable device having the machine learning model 702 and the health detector 704 located on the device (e.g., in the watch's memory or in firmware on the watch). The watch may have the user input generator 706 and may communicate with other computing devices (e.g., mobile phones, tablet computers, laptop computers, or desktop computers) via direct communication, wireless communication (e.g., WiFi, sound, Bluetooth, etc.), a network (e.g., the Internet, an intranet, an extranet, etc.), or a combination thereof, in which case the trained machine learning model 702 and the health detector 704 may be located on the other computing devices. Those skilled in the art will appreciate that any number of configurations of the system 700 may be used without departing from the scope of the embodiments described herein.

Referring to FIG. 7B, a smartwatch 712 according to an embodiment is depicted. The smartwatch 712 includes a watch 714, which contains all of the circuitry, microprocessors, and processing devices (not shown) known to those skilled in the art. The watch 714 also includes a display 716, on which the user's health indicator data 718 (heart rate data in this example) may be displayed. The display 716 may also show the predicted health indicator band 720 for a normal or healthy population. In FIG. 7B, the user's measured heart rate data does not go outside the predicted healthy band, so no notification would be issued in this particular example. The watch 714 may also include a watch band 722 and a high-fidelity sensor 724 (e.g., an ECG sensor). Optionally, the watch band 722 may be an expandable cuff for measuring blood pressure. A low-fidelity sensor 726 (shown shaded) is provided on the back of the watch 714 to collect user health indicator data such as PPG data, which may be used to derive, for example, heart rate data or other data such as blood pressure. Alternatively, as those skilled in the art will appreciate, a fitness band (such as a FitBit or Polar) may be used in some embodiments, where the fitness band has similar processing capabilities and other factor measurement devices (e.g., PPG and an accelerometer).

FIG. 8 depicts an embodiment of a method 800 for continuously monitoring a user's health status. Step 802 receives user input data, which may include data for one or more health indicators (also called the primary data sequence) and (temporally) corresponding data for other factors (also called the secondary data sequence). Step 804 inputs the user data into a trained machine learning model, which may include a trained RNN, a CNN, another feedforward network as described herein, or another neural network known to those skilled in the art. In some embodiments, the health indicator input data may be one of, or a combination of (e.g., a linear combination of), predicted health indicator data and measured health indicator data, as described in some embodiments herein. Step 806 outputs data for one or more predicted health indicators at a given time step, where the output may include, for example and without limitation, a single predicted value or a probability distribution as a function of the predicted value. Step 808 determines a loss based on the predicted health indicator, where, for example and without limitation, the loss may be a simple difference between the predicted and measured health indicators or some other suitably chosen loss function (e.g., the negative logarithm of the probability distribution evaluated at the value of the measured health indicator). Step 810 determines whether the loss exceeds a threshold separating what is considered normal from what is considered unhealthy, where the threshold may be, for example and without limitation, a simple number chosen by the designer or a more complex function of some parameters related to the prediction. If the loss is greater than the threshold, step 812 notifies the user that his or her health indicator exceeds the threshold considered normal or healthy. As described herein, the notification may take many forms. In some embodiments, this information may be made visible to the user. For example, and without limitation, the information may be displayed on a user interface, such as a graph showing (i) the measured health indicator data (e.g., heart rate) and other factor data (e.g., step count) as a function of time and (ii) the distribution of the predicted health indicator data (e.g., predicted heart rate values) generated by the machine learning model. In this way, the user can visually compare the measured data points with the predicted data points and judge by inspection whether, for example, his or her heart rate falls within the range expected by the machine learning model.
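Steps 804 through 812 of method 800 can be sketched end to end with a stand-in model. The Gaussian toy model, its parameters, and all function names below are hypothetical placeholders for the trained network; the negative-log loss is one of the loss choices named in the text.

```python
import math

def gaussian_model(mean_fn, sigma):
    """Build a toy stand-in for the trained model (assumption, not the real network)."""
    def model(health_input, other_factors):
        mu = mean_fn(health_input, other_factors)
        def density(x):
            return math.exp(-((x - mu) ** 2) / (2 * sigma ** 2)) / (
                sigma * math.sqrt(2 * math.pi)
            )
        return mu, density
    return model

def monitor_step(model, health_input, other_factors, measured_next, threshold):
    """Run one pass of steps 804-810 and report whether step 812 should notify."""
    predicted, density = model(health_input, other_factors)   # steps 804 and 806
    # Step 808: negative log of the distribution at the measured value.
    loss = -math.log(max(density(measured_next), 1e-300))
    notify = loss > threshold                                  # step 810
    return predicted, loss, notify                             # caller performs 812
```

For instance, with a toy model that predicts the previous heart rate plus a small step-rate term, a measurement close to the prediction yields a loss below the threshold and no notification, while a measurement far from it yields a large loss and a notification.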

Some of the embodiments described herein refer to the use of a threshold to determine whether to notify the user. In one or more of these embodiments, the user may change the threshold to adjust or fine-tune the system or method so that it more closely matches the user's knowledge of his or her own health. For example, if the physiological indicator used is blood pressure and the user has high blood pressure, an embodiment based on a model trained on a healthy population may frequently warn or notify the user that his or her health indicator is outside the normal or healthy range. Some embodiments therefore allow the user to raise the threshold so that the user is not notified so frequently that his or her health indicator data is outside the range considered normal or healthy.

Some embodiments preferably use raw health indicator data. If the raw data has been processed to derive a specific measurement (e.g., heart rate), the derived data can be used according to the embodiments. In some cases, the provider of the health monitoring device has no control over the raw data; instead, the data received has already been processed into a calculated health indicator (e.g., heart rate or blood pressure). As those skilled in the art will appreciate, the form of the data used to train a machine learning model should match the form of the data collected from the user and input into the trained model; otherwise, the predictions may prove erroneous. For example, the Apple Watch provides heart rate measurements at unequal time steps and does not provide raw PPG data. In this example, a user wears an Apple Watch, which outputs heart rate data at unequal time steps according to Apple's PPG processing algorithm, and a model is trained on that data. If Apple then changes the algorithm it uses to produce heart rate data, a model trained on data from the previous algorithm may become obsolete for input from the new algorithm. To address this potential problem, some embodiments resample irregularly spaced data (heart rate, blood pressure, ECG data, etc.) onto a regularly spaced grid when collecting data to train the model, and sample from that regularly spaced grid. If Apple or another data provider changes its algorithm, the model then only needs to be retrained on newly collected training examples, without having to be redesigned to account for the algorithm change.
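The resampling step described above can be sketched as follows. This is a minimal illustration that assumes linear interpolation and a 5-second grid; the specification does not state which interpolation method or grid spacing is used, and the function name `resample_to_grid` and the sample values are hypothetical.

```python
from bisect import bisect_right

def resample_to_grid(times, values, step):
    """Linearly interpolate irregular samples onto a regular grid.

    times  -- strictly increasing timestamps in seconds (irregular spacing)
    values -- measurement at each timestamp (e.g., heart rate in bpm)
    step   -- spacing of the output grid in seconds (assumed design choice)
    """
    if len(times) != len(values) or len(times) < 2:
        raise ValueError("need at least two (time, value) samples")
    n_points = int((times[-1] - times[0]) // step) + 1
    grid_times, grid_values = [], []
    for k in range(n_points):
        t = times[0] + k * step
        # Index of the first sample strictly after t, clamped to the last segment.
        hi = min(bisect_right(times, t), len(times) - 1)
        lo = hi - 1
        frac = (t - times[lo]) / (times[hi] - times[lo])
        grid_times.append(t)
        grid_values.append(values[lo] + frac * (values[hi] - values[lo]))
    return grid_times, grid_values

# Irregular heart rate readings (seconds, bpm); the numbers are made up.
irregular_t = [0, 4, 11, 15, 20]
irregular_hr = [60.0, 64.0, 71.0, 75.0, 70.0]
grid_t, grid_hr = resample_to_grid(irregular_t, irregular_hr, step=5)
```

Because the model only ever sees values on the fixed grid, a change in the vendor's sampling pattern alters the training examples but not the shape of the model input, which is the property the paragraph above relies on.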

In further embodiments, the trained machine learning model can be trained on the user's own data, resulting in a personalized trained machine learning model. Such a personalized trained machine learning model can be used in place of, or in combination with, the machine learning model trained on a healthy population described herein. If the personalized trained machine learning model is used on its own, the user's data is input into the model, which outputs a prediction of the individual health indicator that is normal for that user at the next time step. The prediction is then compared, in a manner consistent with the embodiments described herein, with the actual or measured data from the next time step to determine whether the user's health indicator differs by some threshold from the health indicator predicted to be normal for that user. In addition, the personalized machine learning model can be used in combination with a machine learning model trained on training examples from a healthy population to generate predictions and related notifications concerning both the health indicator predicted to be normal for the individual user and the health indicator predicted to be normal for the healthy population.

FIG. 9A depicts a method 900 according to another embodiment, and FIG. 9B shows, for purposes of explanation and without limitation, a hypothetical plot 902 of heart rate as a function of time. Step 904 (FIG. 9A) receives the user's heart rate data (or other health indicator data) and, optionally, (temporally) corresponding other factor data, and inputs the data into a personalized trained machine learning model. In some embodiments, as described herein, the personalized trained model is trained on the user's individual health indicator data and, optionally, (temporally) corresponding other data. Thus, in step 906, the personalized trained machine learning model predicts the heart rate data that is normal for that individual user given the other factors, and step 908 compares the user's health indicator data with the health indicator data predicted to be normal for that particular user in order to identify anomalies or abnormalities in the user's health indicator data. As discussed in this specification, some embodiments receive the user's health indicator data from a wearable device on the user (e.g., an Apple Watch, a smartwatch, etc.), or from another mobile device (e.g., a tablet computer, a computer, etc.) that communicates with a sensor on the user (e.g., a strap, a PPG sensor, etc.).

A loss may be defined to help determine, in step 908, whether to notify the user that the user's measured data is abnormal relative to the data predicted to be normal for that particular user. The loss is chosen to model how close the prediction is to the actual or measured data. Those skilled in the art will appreciate that there are many ways to define a loss. In other embodiments described herein, which apply equally here, the absolute value of the difference between the predicted and measured values, |ΔP*|, is one form of loss. In some embodiments, the loss (L) may be L = -ln[β_(P)], where L is generally a measure of how close the predicted data is to the measured data. β_(P) (a probability distribution in this example) ranges from 0 to 1, where 1 means that the predicted data and the measured data are the same. Thus, in some embodiments, a low loss indicates that the predicted data is likely the same as, or close to, the measured data. In some embodiments, a threshold on L is set, for example L > 5, above which the user is notified that an abnormal condition exists relative to what is predicted for that particular user. As described elsewhere herein, such a notification can take many forms. Also as described elsewhere herein, other embodiments may average the loss over a period of time and compare that average with a threshold. In some embodiments, as described in more detail elsewhere herein, the threshold itself may be a function of a statistical calculation on the predicted data or of an average of the predicted data. Losses are described in detail elsewhere herein and, for brevity, are not discussed further here. Those skilled in the art will appreciate that the input and predicted data may be scalar values or segments of data over a period of time. For example, but not by way of limitation, a system designer may be interested in 5-minute data segments; the designer would input all data before time t together with all other data through t + 5 minutes, predict the health indicator data for the segment ending at t + 5 minutes, and determine the loss between the measured and the predicted health indicator data for that segment.
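As an illustration of the loss and threshold described above, the following sketch computes L = -ln(β) for each time step, averages it over a period, and compares the average with the example threshold of 5. It assumes β is a single number in (0, 1] per time step; the function names and sample values are hypothetical and are not taken from the specification.

```python
import math

def negative_log_loss(beta):
    """Return L = -ln(beta) for beta in (0, 1]; beta = 1 gives zero loss."""
    if not 0.0 < beta <= 1.0:
        raise ValueError("beta must lie in (0, 1]")
    return -math.log(beta)

def should_notify(betas, threshold=5.0):
    """Average the per-step losses over a period and compare with a threshold."""
    losses = [negative_log_loss(b) for b in betas]
    mean_loss = sum(losses) / len(losses)
    return mean_loss > threshold, mean_loss

# Measured data close to the prediction: beta near 1, so the loss is small.
normal_flag, normal_loss = should_notify([0.9, 0.8, 0.95])
# Measured data far from the prediction: beta near 0, so the loss is large.
abnormal_flag, abnormal_loss = should_notify([0.001, 0.002, 0.0005])
```

Averaging before thresholding, as some embodiments do, makes a single noisy reading less likely to trigger a notification than comparing each step's loss with the threshold individually.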

Step 908 determines whether an abnormality is present. As discussed, this can be done by determining whether the loss exceeds a threshold. As stated previously, the threshold is set by the designer's choice and based on the purpose of the system being designed. In some embodiments the threshold can be modified by the user, but preferably it is not modified in this embodiment. If no abnormality is present, the process repeats at step 904. If an abnormality is present, step 910 notifies or alerts the user to obtain a high-fidelity measurement, such as, but not limited to, an ECG or blood pressure measurement. In step 912, the high-fidelity data is analyzed by an algorithm, a health professional, or both, and is characterized as normal or abnormal; if abnormal, a diagnosis may be assigned based on the high-fidelity measurement obtained, such as atrial fibrillation (AFib), tachycardia, bradycardia, or high or low blood pressure. For clarity, it should be noted that a notification to record high-fidelity data is equally applicable and possible in other embodiments, including the specific embodiments described above that use a general model. In some embodiments, the high-fidelity measurement can be obtained directly by the user using a mobile monitoring system (such as an ECG or blood pressure system), which in some embodiments may be associated with the wearable device. Optionally, the notification of step 910 causes the high-fidelity measurement to be acquired automatically. For example, the wearable device can communicate with a sensor (by hard wire or via wireless communication) and obtain ECG data, or it can communicate with a blood pressure cuff system (e.g., a wristband or armband cuff of the wearable device) to obtain a blood pressure measurement automatically, or it can communicate with an implanted device such as a pacemaker or ECG electrodes. For example, AliveCor, Inc. provides systems for obtaining an ECG remotely. Such a system includes (but is not limited to) one or more sensors that contact the user at two or more locations; the sensors collect electrocardiographic data that is sent, by wire or wirelessly, to a mobile computing device, where an app generates an ECG strip from the data, and the ECG strip can be analyzed by an algorithm, a medical professional, or both. Alternatively, the sensor may be a blood pressure monitor, with the blood pressure data sent by wire or wirelessly to the mobile computing device. The wearable device itself may be a blood pressure system having a cuff capable of measuring health indicator data and, optionally, an ECG sensor similar to the one described above. The ECG sensor may also include an ECG sensor such as that described in co-owned U.S. Provisional Application No. 61/872,555, the contents of which are incorporated herein by reference. The mobile computing device may be, for example and without limitation, a tablet computer (e.g., an iPad), a smartphone, a wearable device (e.g., an Apple Watch), or a device in a medical care facility (which may be mounted on a cart). In some embodiments, the mobile computing device may be a laptop computer or a computer in communication with some other mobile device. Those skilled in the art will appreciate that a wearable device or smartwatch is also considered a mobile computing device in terms of the capabilities provided in the context of the embodiments described herein. In the case of a wearable device, the sensor may be placed on the band of the wearable device, from which it sends data wirelessly or by wire to the computing device or wearable device; the band may also be a blood pressure monitoring cuff; or both, as described above. In the case of a mobile phone, the sensor may be a pad attached to or remote from the phone, where the pad senses electrocardiographic signals and communicates the data wirelessly or by hard wire to the wearable device or other mobile computing device. More detailed descriptions of some of these systems are provided in one or more of U.S. Patent Nos. 9,420,956, 9,572,499, 9,351,654, 9,247,911, 9,254,095, and 8,509,882 and one or more of U.S. Patent Application Publication Nos. 2015/0018660, 2015/0297134, and 2015/0320328, all of which are incorporated herein for all purposes. Step 912, as described above, analyzes the high-fidelity data and provides a characterization or diagnosis.

In step 914, the diagnosis or classification of the high-fidelity measurement is received by a computing system, which in some embodiments may be the mobile or wearable computing system used to collect the user's heart rate data (or other health indicator data), and in step 916 the low-fidelity health indicator data sequence (heart rate data in this example) is labeled with the diagnosis. In step 918, the labeled low-fidelity data sequences of the user are used to train a high-fidelity machine learning model, and optionally other factor data sequences are also provided to train the model. In some embodiments, the trained high-fidelity machine learning model can receive a measured low-fidelity health indicator data sequence (e.g., heart rate data or PPG data) and, optionally, other factor data, and give the probability that the user is experiencing an event that is normally diagnosed or detected using high-fidelity data, or predict, diagnose, or detect when the user is experiencing such an event. The trained high-fidelity machine learning model is able to do this because it has been trained on user health indicator data (and optionally other factor data) labeled with diagnoses made from high-fidelity data. The trained model can therefore predict when the user is experiencing an event associated with one or more of the labels (e.g., AFib, hypertension, etc.) based solely on a measured low-fidelity health indicator input data sequence (e.g., heart rate or PPG data) (and optionally other factor data). Those skilled in the art will appreciate that training of the high-fidelity model can take place on the user's mobile device, remotely from the user's mobile device, in a combination of both, or in a distributed network. For example, but not by way of limitation, the user's health indicator data can be stored in a cloud system and labeled in the cloud using the diagnosis from step 914. Those skilled in the art will readily appreciate any number of methods and ways to store, label, and access this information. Alternatively, a globally trained high-fidelity model can be used, which is trained on labeled training examples from a population experiencing conditions that are normally diagnosed or detected using high-fidelity measurements. These global training examples provide low-fidelity data sequences (e.g., heart rate) labeled with conditions diagnosed using high-fidelity measurements (e.g., AFib called from an ECG by a medical professional or an algorithm).
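The labeling of step 916 can be pictured with the following sketch, which attaches each high-fidelity diagnosis to the low-fidelity samples recorded around the same time. The window half-width of 150 seconds, the `LabeledSequence` type, and the sample data are assumptions made for illustration; the specification does not prescribe a window size or data layout.

```python
from dataclasses import dataclass

@dataclass
class LabeledSequence:
    start: float        # window start time, seconds
    end: float          # window end time, seconds
    heart_rates: list   # low-fidelity samples inside the window
    label: str          # diagnosis taken from the high-fidelity reading

def label_sequences(hr_samples, diagnoses, half_window=150.0):
    """Attach each high-fidelity diagnosis to nearby low-fidelity samples.

    hr_samples  -- list of (time_s, bpm) tuples
    diagnoses   -- list of (time_s, label) tuples from ECG analysis
    half_window -- hypothetical window half-width in seconds
    """
    labeled = []
    for t_dx, label in diagnoses:
        start, end = t_dx - half_window, t_dx + half_window
        window = [bpm for t, bpm in hr_samples if start <= t <= end]
        if window:  # skip diagnoses with no surrounding low-fidelity data
            labeled.append(LabeledSequence(start, end, window, label))
    return labeled

# One heart rate sample per minute for 30 minutes, elevated from 600 s to 900 s.
hr = [(t, 70 + (40 if 600 <= t <= 900 else 0)) for t in range(0, 1800, 60)]
ecg_dx = [(750.0, "AFib"), (1500.0, "normal")]
training_examples = label_sequences(hr, ecg_dx)
```

Each resulting `LabeledSequence` pairs a low-fidelity input with a high-fidelity label, which is the form of training example that step 918 consumes.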

Referring now to FIG. 9B, plot 902 shows a schematic of heart rate plotted as a function of time. Anomalies 920 relative to the user's normal heart rate data occur at times t₁, t₂, t₃, t₄, t₅, t₆, t₇, and t₈. As described above, normal means that the predicted data for that particular user is within a threshold of the measured data, and an anomaly is outside the threshold. When an anomaly relative to normal occurs, some embodiments prompt the user to obtain a more definitive or high-fidelity reading, such as, but not limited to, the ECG readings identified as ECG₁, ECG₂, ECG₃, ECG₄, ECG₅, ECG₆, ECG₇, and ECG₈. As described above, the high-fidelity reading may be obtained automatically, may be obtained by the user, or may be something other than an ECG, such as blood pressure. The high-fidelity readings are analyzed by an algorithm, a health professional, or both to identify the high-fidelity data as normal or abnormal and to further identify or diagnose the abnormality (such as, but not limited to, AFib). This information is used to label the health indicator data (e.g., heart rate or PPG data) at the anomaly points 920 in the user's serialized data.

The distinction between high-fidelity and low-fidelity data is that high-fidelity data or measurements are typically used to make a determination, detection, or diagnosis, whereas low-fidelity data may not readily be used for that purpose. For example, an ECG scan can be used to identify, detect, or diagnose arrhythmias, whereas heart rate or PPG data generally does not provide that capability. As those skilled in the art will appreciate, the descriptions herein of machine learning algorithms (e.g., Bayes, Markov, Gaussian processes, clustering algorithms, generative models, kernel and neural network algorithms) apply equally to all embodiments described herein.

In some cases, a user remains asymptomatic even though such problems may be present, and even when symptoms are present it may be impractical to obtain the high-fidelity measurement needed to make a diagnosis or detection. For example, but not by way of limitation, arrhythmias, and AF in particular, may produce no symptoms; even when symptoms do occur, recording an ECG at that moment is very difficult, and continuously monitoring the user is very difficult without expensive, bulky, and sometimes invasive monitoring devices. As discussed elsewhere herein, it is important to know when a user is experiencing AF because AF may be a cause of, at least, stroke, among other serious conditions. Similarly, and as discussed elsewhere, AF burden may be of similar importance. Some embodiments allow continuous monitoring for arrhythmias (e.g., AF) or other serious conditions using only continuous monitoring of low-fidelity health indicator data (such as heart rate or PPG) and, optionally, other factor data.

FIG. 10 depicts a method 1000 according to some embodiments of the health monitoring systems and methods. Step 1002 receives measured or actual low-fidelity health indicator data of the user (e.g., heart rate or PPG data from a sensor on a wearable device) and optionally receives (temporally) corresponding other factor data that may affect the health indicator data, as described herein. As discussed elsewhere herein, the low-fidelity health indicator data may be measured by a mobile computing device such as a smartwatch, another wearable device, or a tablet computer. In step 1004, the user's low-fidelity health indicator data (and optionally other factor data) is input into the trained high-fidelity machine learning model, which in step 1006 outputs a predicted identification or diagnosis for the user based on the measured low-fidelity health indicator data (and optionally the (temporally) corresponding other factor data). Step 1008 asks whether the identification or diagnosis is normal and, if so, the process starts again. If the identification or diagnosis is not normal, step 1010 notifies the user of the problem or detection. Optionally, the system, method, or platform can be configured to notify any combination of the user, family, friends, medical care professionals, emergency services (911), and the like. Which of these parties is notified may depend on the identification, detection, or diagnosis. Where the identification, detection, or diagnosis is life-threatening, certain parties may be contacted or notified who would not be notified if the diagnosis were not life-threatening. In addition, in some embodiments, a measured health indicator data sequence is input into the trained high-fidelity machine learning model and the amount of time the user spends experiencing the abnormal event is calculated (e.g., the difference between the predicted start and stop of the abnormal event), allowing a better understanding of the user's abnormality burden. In particular, understanding AF burden can be very important in preventing stroke and other serious conditions. Some embodiments therefore allow continuous monitoring for abnormal events using a mobile computing device, a wearable computing device, or another portable device capable of acquiring only low-fidelity health indicator data and, optionally, other factor data.
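The time spent in abnormal events can be computed from per-sample predictions as sketched below. The sketch assumes the model output has already been reduced to a normal/abnormal flag per timestamp and treats an episode as ending at the first subsequent normal sample; both choices, along with the function names and sample data, are illustrative assumptions rather than details taken from the specification.

```python
def abnormal_episodes(times, is_abnormal):
    """Return (start, stop) pairs for each run of consecutive abnormal predictions."""
    episodes, start = [], None
    for t, flag in zip(times, is_abnormal):
        if flag and start is None:
            start = t                       # episode begins
        elif not flag and start is not None:
            episodes.append((start, t))     # episode ends at first normal sample
            start = None
    if start is not None:                   # episode still open at end of data
        episodes.append((start, times[-1]))
    return episodes

def burden_fraction(times, is_abnormal):
    """Fraction of the monitored period spent in predicted abnormal episodes."""
    total = times[-1] - times[0]
    abnormal = sum(stop - start
                   for start, stop in abnormal_episodes(times, is_abnormal))
    return abnormal / total

minutes = list(range(0, 61, 5))  # one prediction every 5 minutes for an hour
flags = [False, False, True, True, True, False, False,
         False, True, True, False, False, False]
episodes = abnormal_episodes(minutes, flags)
burden = burden_fraction(minutes, flags)
```

Here the two predicted episodes last 15 and 10 minutes, so the burden over the monitored hour is 25/60.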

FIG. 11 depicts example data 1100 analyzed on the basis of low-fidelity data to generate a high-fidelity output prediction or detection, according to some embodiments described herein. Although described with reference to the detection of atrial fibrillation, similar data can be generated for other predictions of high-fidelity diagnoses based on low-fidelity measurements. The first graph 1110 shows the user's calculated heart rate over time. The heart rate may be determined from PPG data or from another heart rate sensor. The second graph 1120 shows the user's activity data over the same time period. For example, the activity data may be determined from step counts or from other measurements of the user's movement. The third graph 1130 shows the classifier output from the machine learning model together with a horizontal threshold indicating when a notification is generated. The machine learning model can generate predictions based on low-fidelity measurements as input. For example, the data in the first graph 1110 and the second graph 1120 can be analyzed by a machine learning system as further described above. The result of the machine learning system's analysis can be provided as the probability of atrial fibrillation shown in graph 1130. When the probability exceeds a threshold (shown in this case as a confidence above 0.6), the health monitoring system can trigger a notification or other alert to the user, a physician, or another person associated with the user.
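A minimal sketch of the threshold test on the classifier output is given below, using the 0.6 threshold shown in the example. Raising one alert per upward crossing, rather than one per sample above the threshold, is an assumption made here to avoid repeated notifications; the function name and probability values are hypothetical.

```python
def alert_times(times, probabilities, threshold=0.6):
    """Return the times at which the classifier output rises above the threshold."""
    alerts, above = [], False
    for t, p in zip(times, probabilities):
        if p > threshold and not above:
            alerts.append(t)    # upward crossing: raise one alert per episode
        above = p > threshold
    return alerts

t_axis = [0, 1, 2, 3, 4, 5, 6, 7]
af_prob = [0.1, 0.2, 0.7, 0.9, 0.4, 0.3, 0.65, 0.8]
alerts = alert_times(t_axis, af_prob)
```

With these values the output crosses the threshold twice, at times 2 and 6, so two alerts would be raised.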

In some embodiments, the data in graphs 1110 and 1120 can be provided to the machine learning system as continuous measurements. For example, heart rate and activity level can be generated as measurements every 5 seconds to provide accurate measurements. A time segment containing multiple measurements can then be input into the machine learning model. For example, the previous hour of data can be used as input to the machine learning model. In some embodiments, a shorter or longer period than one hour can be provided. As shown in FIG. 11, output graph 1130 provides an indication of the time periods during which the user is experiencing an abnormal health event. For example, the health monitoring system can use the time periods during which the prediction is above a certain confidence level to determine atrial fibrillation. This value can then be used to determine the user's atrial fibrillation burden during the measured time period.
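The windowing described above, with one measurement every 5 seconds and one hour of history per model input, can be sketched as follows. Non-overlapping windows and the names used are assumptions for illustration; an implementation could equally use overlapping windows or a different window length.

```python
SAMPLE_PERIOD_S = 5                                  # one measurement every 5 s
WINDOW_S = 3600                                      # one hour per model input
SAMPLES_PER_WINDOW = WINDOW_S // SAMPLE_PERIOD_S     # 720 samples

def hourly_windows(heart_rate, activity, stride=SAMPLES_PER_WINDOW):
    """Yield (heart_rate_window, activity_window) pairs of one hour each.

    stride is the hop between window starts, in samples; non-overlapping
    windows are an assumption made for this sketch.
    """
    n = min(len(heart_rate), len(activity))
    for start in range(0, n - SAMPLES_PER_WINDOW + 1, stride):
        stop = start + SAMPLES_PER_WINDOW
        yield heart_rate[start:stop], activity[start:stop]

# Three hours of synthetic, constant data.
hr_stream = [70.0] * (3 * SAMPLES_PER_WINDOW)
act_stream = [0.0] * (3 * SAMPLES_PER_WINDOW)
windows = list(hourly_windows(hr_stream, act_stream))
```

Each yielded pair holds 720 heart rate samples and the 720 activity samples from the same hour, matching the two input graphs 1110 and 1120.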

In some embodiments, the machine learning model used to generate the prediction output in graph 1130 can be trained on labeled user data. For example, the labeled user data can be provided from high-fidelity data (such as ECG readings) acquired during time periods in which low-fidelity data (e.g., PPG, heart rate, etc.) and other data (e.g., activity level or steps) are also available. In some embodiments, the machine learning model is designed to determine whether atrial fibrillation was likely present during a previous time period. For example, the machine learning model can take one hour of low-fidelity data as input and provide the likelihood that an event occurred. The training data may therefore include many hours of recorded data for a population of individuals. Where a condition is diagnosed from the high-fidelity data, the data can be labeled with the time of the health event. Accordingly, if a time-stamped health event label based on high-fidelity data exists, any one-hour window of low-fidelity data that contains that event should, when input into the untrained machine learning model, yield a prediction of the health event. The untrained machine learning model can then be updated based on a comparison of the prediction with the label. After many iterations are repeated and the machine learning model is determined to have converged, the health monitoring system can use the model to monitor the user for atrial fibrillation based on low-fidelity data. In various embodiments, the low-fidelity data can be used to detect conditions other than atrial fibrillation.
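The train, compare, and update cycle described above can be illustrated with a deliberately small stand-in model. The sketch below uses logistic regression fitted by gradient descent on two hand-picked features of a heart rate window, and it uses short six-sample windows instead of one-hour windows. The specification does not say that this model, these features, or these hyperparameters are used; all of them are assumptions chosen to keep the example self-contained.

```python
import math

def features(window):
    """Two hand-picked summary features of a heart rate window (an assumption)."""
    mean = sum(window) / len(window)
    jitter = sum(abs(a - b) for a, b in zip(window, window[1:])) / (len(window) - 1)
    return [mean / 100.0, jitter / 10.0]    # rough scaling to keep values near 1

def predict(weights, bias, x):
    """Probability of the labeled event under a logistic model."""
    z = bias + sum(w * xi for w, xi in zip(weights, x))
    return 1.0 / (1.0 + math.exp(-z))

def train(windows, labels, lr=0.3, tol=1e-6, max_iter=20000):
    """Update the model until the loss stops improving (treated as convergence)."""
    xs = [features(w) for w in windows]
    weights, bias, prev_loss = [0.0, 0.0], 0.0, float("inf")
    for _ in range(max_iter):
        preds = [predict(weights, bias, x) for x in xs]
        clipped = [min(max(p, 1e-12), 1 - 1e-12) for p in preds]
        # Cross-entropy between predictions and the high-fidelity labels.
        loss = -sum(y * math.log(p) + (1 - y) * math.log(1 - p)
                    for p, y in zip(clipped, labels)) / len(xs)
        if abs(prev_loss - loss) < tol:
            break
        prev_loss = loss
        for j in range(len(weights)):
            grad = sum((p - y) * x[j] for p, y, x in zip(preds, labels, xs)) / len(xs)
            weights[j] -= lr * grad
        bias -= lr * sum(p - y for p, y in zip(preds, labels)) / len(xs)
    return weights, bias

# Made-up windows: steady rhythm labeled 0, irregular rhythm labeled 1 (from ECG).
normal_1 = [70, 71, 70, 72, 71, 70]
normal_2 = [64, 65, 66, 65, 64, 65]
af_1 = [95, 120, 88, 130, 101, 140]
af_2 = [110, 85, 125, 90, 135, 100]
w, b = train([normal_1, normal_2, af_1, af_2], [0, 0, 1, 1])
p_new = predict(w, b, features([100, 130, 92, 128, 99, 135]))
```

After training, `predict` returns a probability that can be compared with a notification threshold such as the 0.6 value shown in graph 1130.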

FIG. 12 shows a schematic representation of a machine in the example form of a computer system 1200 within which a set of instructions may be executed to cause the machine to perform any one or more of the methods discussed herein. In alternative embodiments, the machine may be connected (e.g., networked) to other machines in a local area network (LAN), an intranet, an extranet, or the Internet. The machine may operate in the capacity of a server or a client machine in a client-server network environment, or as a peer machine in a peer-to-peer (or distributed) network environment. The machine may be a personal computer (PC), a tablet PC, a set-top box (STB), a personal digital assistant (PDA), a cellular telephone, a web appliance, a server, a network router, a switch or bridge, a hub, an access point, a network access control device, or any machine capable of executing a set of instructions (sequential or otherwise) that specify actions to be taken by that machine. Further, although only a single machine is illustrated, the term "machine" shall also be taken to include any collection of machines that individually or jointly execute a set (or multiple sets) of instructions to perform any one or more of the methods discussed herein. In one embodiment, computer system 1200 may represent a server, a mobile computing device, a wearable device, or the like configured to perform the health monitoring described herein.

The exemplary computer system 1200 includes a processing device 1202, a main memory 1204 (e.g., read-only memory (ROM), flash memory, dynamic random access memory (DRAM)), a static memory 1206 (e.g., flash memory, static random access memory (SRAM), etc.), and a data storage device 1218, which communicate with each other via a bus 1230. Any of the signals provided over the various buses described herein may be time-multiplexed with other signals and provided over one or more common buses. In addition, the interconnections between circuit components or blocks may be shown as buses or as single signal lines. Each of the buses may alternatively be one or more single signal lines, and each of the single signal lines may alternatively be a bus.

The processing device 1202 represents one or more general-purpose processing devices, such as a microprocessor, a central processing unit, or the like. More specifically, the processing device may be a complex instruction set computing (CISC) microprocessor, a reduced instruction set computer (RISC) microprocessor, a very long instruction word (VLIW) microprocessor, a processor implementing other instruction sets, or a processor implementing a combination of instruction sets. The processing device 1202 may also be one or more special-purpose processing devices, such as an application-specific integrated circuit (ASIC), a field programmable gate array (FPGA), a digital signal processor (DSP), a network processor, or the like. The processing device 1202 is configured to execute processing logic 1226, which may be one example of the health monitor 1250 and related systems, for performing the operations and steps discussed herein.

The data storage device 1218 may include a machine-readable storage medium 1228 on which are stored one or more sets of instructions 1222 (e.g., software) embodying any one or more of the methods or functions described herein, including instructions that cause the processing device 1202 to execute the health monitor 1250 and related processes described herein. The instructions 1222 may also reside, completely or at least partially, within the main memory 1204 or within the processing device 1202 during their execution by the computer system 1200; the main memory 1204 and the processing device 1202 also constitute machine-readable storage media. The instructions 1222 may further be transmitted or received over a network 1220 via the network interface device 1208.

As described herein, the machine-readable storage medium 1228 may also be used to store instructions for performing a method of monitoring a user's health. Although the machine-readable storage medium 1228 is shown in the exemplary embodiment as a single medium, the term "machine-readable storage medium" should be taken to include a single medium or multiple media (e.g., a centralized or distributed database, or associated caches and servers) that store the one or more sets of instructions. A machine-readable medium includes any mechanism for storing information in a form (e.g., software, a processing application) readable by a machine (e.g., a computer). The machine-readable medium may include, but is not limited to, magnetic storage media (e.g., floppy disks), optical storage media (e.g., CD-ROMs), magneto-optical storage media, read-only memory (ROM), random access memory (RAM), erasable programmable memory (e.g., EPROM and EEPROM), flash memory, or other types of media suitable for storing electronic instructions.

The above description sets forth numerous specific details, such as examples of specific systems, components, and methods, in order to provide a good understanding of several embodiments of the present invention. It will be apparent to those skilled in the art, however, that at least some embodiments of the present invention may be practiced without these specific details. In other instances, well-known components or methods are not described in detail, or are presented in simple block diagram format, to avoid unnecessarily obscuring the present invention. The specific details set forth are therefore merely exemplary. Particular embodiments may vary from these exemplary details and still be contemplated to be within the scope of the present invention.

In addition, some embodiments may be practiced in distributed computing environments in which the machine-readable medium is stored on, or executed by, more than one computer system. Furthermore, the information transferred between computer systems may be either pulled or pushed across the communication medium connecting the computer systems.

Embodiments of the claimed subject matter include, but are not limited to, the various operations described herein. These operations may be performed by hardware components, software, firmware, or a combination thereof.

Although the operations of the methods herein are shown and described in a particular order, the order of the operations of each method may be altered so that certain operations are performed in reverse order, or so that certain operations are performed at least in part concurrently with other operations. In another embodiment, instructions or sub-operations of distinct operations may be performed in an intermittent or alternating manner.

The above description of illustrated implementations of the present invention, including what is described in the Abstract, is not intended to be exhaustive or to limit the present invention to the precise forms disclosed. While specific implementations of, and examples for, the present invention are described herein for illustrative purposes, various equivalent modifications are possible within the scope of the present invention, as those skilled in the art will recognize. The words "example" and "exemplary" are used herein to mean serving as an example, instance, or illustration. Any aspect or design described herein as an "example" or "exemplary" is not necessarily to be construed as preferred or advantageous over other aspects or designs. Rather, use of the words "example" or "exemplary" is intended to present concepts in a concrete fashion. As used in this application, the term "or" is intended to mean an inclusive "or" rather than an exclusive "or". That is, unless specified otherwise or clear from context, "X includes A or B" is intended to mean any of the natural inclusive permutations: if X includes A, X includes B, or X includes both A and B, then "X includes A or B" is satisfied in each of those cases. In addition, the articles "a" and "an" as used in this application and the appended claims should generally be construed to mean "one or more" unless specified otherwise or clear from context to be directed to a singular form. Moreover, use of the terms "an embodiment", "one embodiment", "an implementation", or "one implementation" throughout this specification is not intended to mean the same embodiment or implementation unless described as such. Furthermore, the terms "first", "second", "third", "fourth", etc. as used herein are labels that distinguish among different elements and do not necessarily have an ordinal meaning according to their numerical designation.

It should be understood that variants of the above-disclosed and other features and functions, or alternatives thereof, may be combined into other different systems or applications. Various presently unforeseen or unanticipated alternatives, modifications, variations, or improvements may subsequently be made by those skilled in the art, and these are also intended to be encompassed by the following claims. The claims may encompass embodiments in hardware, software, or a combination thereof.

In addition to the embodiments described above, the present invention includes, but is not limited to, the following example implementations.

Some example implementations provide a method of monitoring a user's cardiac health. The method may include: receiving measured health indicator data and other factor data of the user at a first time; inputting, by a processing device, the health indicator data and the other factor data into a machine learning model, wherein the machine learning model generates predicted health indicator data at a next time step; receiving the user's data at the next time step; determining, by the processing device, a loss at the next time step, wherein the loss is a measure between the predicted health indicator data at the next time step and the user's measured health indicator data at the next time step; determining that the loss exceeds a threshold; and outputting a notification to the user in response to determining that the loss exceeds the threshold.
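The predict, compare, and notify loop described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the `Sample` type, the `monitor` function, the `persistence_model` stand-in for the trained model, the use of an absolute-difference loss, and the threshold value are all assumptions introduced here.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence


@dataclass
class Sample:
    """One time step: a measured health indicator plus other-factor data."""
    heart_rate: float      # measured health indicator (beats per minute)
    activity_level: float  # example of "other factor" data


def monitor(
    samples: Sequence[Sample],
    predict_next: Callable[[Sample], float],
    threshold: float,
) -> List[int]:
    """Return the indices of time steps whose loss exceeds the threshold."""
    alerts = []
    for t in range(len(samples) - 1):
        predicted = predict_next(samples[t])   # prediction for step t + 1
        measured = samples[t + 1].heart_rate   # measurement at step t + 1
        loss = abs(predicted - measured)       # absolute-difference loss (assumed)
        if loss > threshold:
            alerts.append(t + 1)               # a notification would be output here
    return alerts


def persistence_model(sample: Sample) -> float:
    """Stand-in model (an assumption): predicts an unchanged heart rate."""
    return sample.heart_rate


readings = [Sample(62, 0.1), Sample(64, 0.1), Sample(118, 0.1), Sample(120, 0.1)]
alert_steps = monitor(readings, persistence_model, threshold=30.0)
```

In this toy run only the jump from 64 to 118 beats per minute produces a loss above the threshold, so a single time step is flagged.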

In some example implementations of any of the example methods, the trained machine learning model is a trained generative neural network. In some example implementations of any of the example methods, the trained machine learning model is a feedforward network. In some example implementations of any of the example methods, the trained machine learning model is an RNN. In some example implementations of any of the example methods, the trained machine learning model is a CNN.

In some example implementations of any of the example methods, the trained machine learning model is trained with training examples from one or more of: a healthy population, a population with heart disease, and the user.

In some example implementations of any of the example methods, the loss at the next time step is the absolute value of the difference between the predicted health indicator data at the next time step and the user's measured health indicator data at the next time step.

In some example implementations of any of the example methods, the predicted health indicator data is a probability distribution, and the predicted health indicator data at the next time step is sampled according to the probability distribution.

In some example implementations of any of the example methods, the predicted health indicator data at the next time step is sampled according to a sampling technique selected from the group consisting of: taking the predicted health indicator data of maximum probability; and randomly sampling the predicted health indicator data according to the probability distribution.
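The two sampling techniques named above can be illustrated with a short sketch. It assumes, purely for illustration, that the predicted distribution is discrete and represented as a mapping from heart-rate bins to probabilities; the function names and the example distribution are hypothetical.

```python
import random
from typing import Dict, Optional


def sample_max_probability(dist: Dict[int, float]) -> int:
    """Return the value with the highest predicted probability."""
    return max(dist, key=dist.get)


def sample_random(dist: Dict[int, float], rng: Optional[random.Random] = None) -> int:
    """Draw one value at random, weighted by its predicted probability."""
    rng = rng or random.Random()
    values = list(dist)
    weights = [dist[v] for v in values]
    return rng.choices(values, weights=weights, k=1)[0]


# Hypothetical predicted distribution over heart-rate bins (bpm -> probability).
predicted = {60: 0.2, 65: 0.5, 70: 0.3}
most_likely = sample_max_probability(predicted)
drawn = sample_random(predicted, random.Random(0))
```

Maximum-probability sampling is deterministic, whereas random sampling preserves the spread of the predicted distribution across repeated draws.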

In some example implementations of any of the example methods, the predicted health indicator data is a probability distribution (β), and the loss is determined based on the negative logarithm of the probability distribution at the next time step, evaluated at the user's measured health indicator at the next time step. In some example implementations of any of the example methods, the method further includes self-sampling of the probability distribution.
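A minimal sketch of the negative-logarithm loss follows, assuming a discrete distribution β over heart-rate bins. The dictionary representation, the function name, and the small `floor` used to avoid taking the logarithm of zero are assumptions made for this example.

```python
import math
from typing import Dict


def negative_log_loss(dist: Dict[int, float], measured: int, floor: float = 1e-9) -> float:
    """Loss = -log(beta(measured)); `floor` avoids log(0) for unseen values."""
    return -math.log(max(dist.get(measured, 0.0), floor))


beta = {60: 0.25, 65: 0.5, 70: 0.25}            # hypothetical predicted distribution
expected_loss = negative_log_loss(beta, 65)     # likely reading, small loss
surprising_loss = negative_log_loss(beta, 110)  # unlikely reading, large loss
```

A measurement that the model considered likely yields a small loss, while a measurement to which the model assigned little or no probability yields a large one, which is what makes this loss usable as an anomaly signal.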

In some example implementations of any of the example methods, the method further includes: averaging the predicted health indicator data over a time period of time steps; averaging the user's measured health indicator data over the time period of time steps; and determining the loss based on the absolute value of the difference between the averaged predicted health indicator data and the averaged measured health indicator data.
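The averaged loss can be sketched as below, assuming the arithmetic mean is used for both windows (the text allows other averaging methods). The function name and sample values are illustrative only.

```python
from statistics import fmean
from typing import Sequence


def windowed_loss(predicted: Sequence[float], measured: Sequence[float]) -> float:
    """Absolute difference between the window means of predictions and measurements."""
    if not predicted or not measured:
        raise ValueError("both windows must be non-empty")
    return abs(fmean(predicted) - fmean(measured))


pred_window = [60.0, 62.0, 64.0]   # predicted heart rate over three time steps
meas_window = [61.0, 90.0, 65.0]   # measured heart rate over the same steps
loss = windowed_loss(pred_window, meas_window)
```

Averaging over a window smooths out single-sample noise, so a brief spike moves the loss less than it would in a per-step comparison.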

In some example implementations of any of the example methods, the measured health indicator data includes PPG data. In some example implementations of any of the example methods, the measured health indicator data includes heart rate data.

In some example implementations of any of the example methods, the method further includes resampling irregularly spaced heart rate data onto a regularly spaced grid, wherein the heart rate data is sampled according to the regularly spaced grid.
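One way to resample irregularly spaced readings onto a regular grid is linear interpolation, sketched below. The specification does not state which interpolation is used, so the linear scheme, the function name, and the sample values are assumptions.

```python
from bisect import bisect_right
from typing import List, Sequence, Tuple


def resample_to_grid(
    times: Sequence[float], values: Sequence[float], step: float
) -> Tuple[List[float], List[float]]:
    """Linearly interpolate (times, values) onto a grid with spacing `step`.

    `times` must be strictly increasing; the grid spans times[0]..times[-1].
    """
    if len(times) != len(values) or len(times) < 2:
        raise ValueError("need at least two samples of equal length")
    n_points = int((times[-1] - times[0]) // step) + 1
    grid = [times[0] + i * step for i in range(n_points)]
    resampled = []
    for t in grid:
        # Index of the last sample at or before t, clamped so lo + 1 stays in range.
        lo = min(bisect_right(times, t) - 1, len(times) - 2)
        hi = lo + 1
        frac = (t - times[lo]) / (times[hi] - times[lo])
        resampled.append(values[lo] + frac * (values[hi] - values[lo]))
    return grid, resampled


# Irregularly spaced heart-rate readings (seconds, bpm); values are made up.
t_irregular = [0.0, 1.0, 4.0]
hr_irregular = [60.0, 62.0, 68.0]
grid, hr_regular = resample_to_grid(t_irregular, hr_irregular, step=1.0)
```

A regular grid lets a sequence model treat every input step as covering the same time interval, which irregular wearable readings do not guarantee.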

In some example implementations of any of the example methods, the measured health indicator data is one or more types of health indicator data selected from the group consisting of: PPG data, heart rate data, pulse oximeter data, ECG data, and blood pressure data.

Some example implementations provide a device comprising a mobile computing device, the mobile computing device comprising: a processing device; a display; a health indicator data sensor; and a memory storing instructions that, when executed by the processing device, cause the processing device to perform the following operations: receiving measured health indicator data from the health indicator data sensor at a first time and other factor data at the first time; inputting the health indicator data and the other factor data into a trained machine learning model, wherein the trained machine learning model generates predicted health indicator data at a next time step; receiving measured health indicator data and other factor data at the next time step; determining a loss at the next time step, wherein the loss is a measure between the predicted health indicator data at the next time step and the measured health indicator data at the next time step; and outputting a notification if the loss at the next time step exceeds a threshold.

In some example implementations of any of the example devices, the trained machine learning model includes a trained generative neural network. In some example implementations of any of the example devices, the trained machine learning model includes a feedforward network. In some example implementations of any of the example devices, the trained machine learning model is an RNN. In some example implementations of any of the example methods, the trained machine learning model is a CNN.

In some example implementations of any of the example devices, the trained machine learning model is trained with training examples from one of the group consisting of: a healthy population, a population with heart disease, and the user.

In some example implementations of any of the example devices, the predicted health indicator data is a point prediction of the user's health indicator at the next time step, and the loss is the absolute value of the difference between the predicted health indicator data at the next time step and the measured health indicator data at the next time step.

In some example implementations of any of the example devices, the predicted health indicator data is sampled according to a probability distribution generated by the machine learning model.

In some example implementations of any of the example devices, the predicted health indicator data is sampled according to a sampling technique selected from the group consisting of: maximum probability; and random sampling according to the probability distribution.

In some example implementations of any of the example devices, the predicted health indicator data is a probability distribution (β), and the loss is determined based on the negative logarithm of β evaluated at the user's measured health indicator at the next time step.

In some example implementations of any of the example devices, the processing device is further configured to define a function α ranging from 0 to 1, wherein I_t comprises a linear combination, as a function of α, of the user's measured health indicator data and the predicted health indicator data.
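The passage above states only that I_t is a linear combination controlled by α. One plausible reading, offered as an assumption rather than as the patent's own formula, writes the quantity at time step t as a blend of the measured value m_t and the model's prediction \hat{m}_t (both symbols are introduced here for illustration):

```latex
I_t = \alpha \, m_t + (1 - \alpha) \, \hat{m}_t, \qquad 0 \le \alpha \le 1
```

Under this reading, α = 1 uses only measured data and α = 0 uses only the model's own predictions; which end of the range corresponds to which source is itself an assumption.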

In some example implementations of any of the example devices, the processing device is further configured to perform self-sampling of the probability distribution.

In some example implementations of any of the example devices, the processing device is further configured to: average, using an averaging method, the predicted health indicator data sampled according to the probability distribution over a time period of time steps; average, using the averaging method, the user's measured health indicator data over the time period of time steps; and define the loss as the absolute value of the difference between the averaged predicted health indicator data and the averaged measured health indicator data.

In some example implementations of any of the example devices, the averaging method includes one or more methods selected from the group consisting of: calculating an average value, calculating an arithmetic mean, calculating a median, and calculating a mode.

In some example implementations of any of the example devices, the measured health indicator data includes PPG data from a PPG signal. In some example implementations of any of the example devices, the measured health indicator data is heart rate data. In some example implementations of any of the example devices, the heart rate data is collected by resampling irregularly spaced heart rate data onto a regularly spaced grid and sampling the heart rate data according to the regularly spaced grid. In some example implementations of any of the example devices, the measured health indicator data is one or more types of health indicator data selected from the group consisting of: PPG data, heart rate data, pulse oximeter data, ECG data, and blood pressure data.

In some example implementations of any of the example devices, the mobile device is selected from the group consisting of: a smart watch; a fitness band; a tablet computer; and a laptop computer.

In some example implementations of any of the example devices, the mobile device further includes a user high-fidelity sensor, wherein the notification requests that the user obtain high-fidelity measurement data, and wherein the processing device is further configured to: receive an analysis of the high-fidelity measurement data; label the user's measured health indicator data with the analysis to generate labeled user health indicator data; and use the labeled user health indicator data as training examples for training the personalized high-fidelity machine learning model.
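The labeling step can be sketched as pairing each low-fidelity segment with the result of the high-fidelity analysis. The `LabeledSegment` type, the `label_segment` helper, the string labels, and the suggestion that the high-fidelity measurement is an ECG are assumptions for illustration.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class LabeledSegment:
    """Low-fidelity data paired with a label derived from high-fidelity analysis."""
    heart_rates: List[float]  # low-fidelity segment (e.g., PPG-derived heart rate)
    label: str                # e.g., "normal" or "abnormal", from the analysis


def label_segment(heart_rates: Sequence[float], analysis: str) -> LabeledSegment:
    """Attach the high-fidelity analysis result to the low-fidelity segment."""
    return LabeledSegment(list(heart_rates), analysis)


training_examples: List[LabeledSegment] = []
# Hypothetical flow: a notification prompted a high-fidelity reading (for
# instance an ECG), whose analysis came back "abnormal".
training_examples.append(label_segment([88.0, 131.0, 97.0], "abnormal"))
training_examples.append(label_segment([61.0, 63.0, 62.0], "normal"))
```

Examples collected this way come from the user's own data, which is what would make a model trained on them personalized.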

In some example implementations of any of the example devices, the trained machine learning model is stored in the memory. In some example implementations of any of the example devices, the trained machine learning model is stored in a remote memory, wherein the remote memory is separate from the computing device, and wherein the mobile computing device is a wearable computing device. In some example implementations of any of the example devices, the trained personalized high-fidelity machine learning model is stored in the memory. In some example implementations of any of the example devices, the trained personalized high-fidelity machine learning model is stored in a remote memory, wherein the remote memory is separate from the computing device, and wherein the mobile computing device is a wearable computing device.

In some example implementations of any of the example devices, the processing device is further configured to predict that the user is experiencing atrial fibrillation and to determine the user's atrial fibrillation burden.
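This passage does not say how the atrial fibrillation burden is computed. A common reading of the term is the fraction of monitored time spent in atrial fibrillation; the sketch below assumes that definition and assumes equal-length segments, each with a boolean model prediction. The function name and the example values are hypothetical.

```python
from typing import Sequence


def af_burden(af_flags: Sequence[bool]) -> float:
    """Fraction of equal-length monitored segments predicted to be in AF."""
    if not af_flags:
        raise ValueError("no monitored segments")
    return sum(af_flags) / len(af_flags)


# Hypothetical per-segment model outputs over eight monitored segments.
predictions = [False, False, True, True, False, False, False, False]
burden = af_burden(predictions)
```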

Some example implementations provide a method of monitoring a user's cardiac health. The method may include: receiving measured low-fidelity user health indicator data and other factor data at a first time; inputting data including the user health indicator data and the other factor data at the first time into a personalized trained high-fidelity machine learning model, wherein the personalized trained high-fidelity machine learning model predicts whether the user's health indicator data is abnormal; and sending a notification that the user's health is abnormal if an abnormality is predicted.

In some example implementations of any of the example methods, the trained personalized high-fidelity machine learning model is trained with measured low-fidelity user health indicator data that is labeled using an analysis of high-fidelity measurement data.

In some example implementations of any of the example methods, the analysis of the high-fidelity measurement data is based on user-specific high-fidelity measurement data.

In some example implementations of any of the example methods, the personalized high-fidelity machine learning model outputs a probability distribution, and the prediction is sampled according to the probability distribution.

In some example implementations of any of the example methods, the prediction is sampled according to a sampling technique selected from the group consisting of: taking the prediction of maximum probability; and sampling the prediction according to the probability distribution.

In some example implementations of any of the example methods, an average prediction is determined by averaging, using an averaging method, the predictions over a time period of time steps, and the average prediction is used to determine whether the user's health indicator data is normal or abnormal.

In some example implementations of any of the example methods, the averaging method includes one or more methods selected from the group consisting of: calculating an average value, calculating an arithmetic mean, calculating a median, and calculating a mode.
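The effect of the chosen averaging method on the normal or abnormal decision can be illustrated as follows. The sketch assumes each time step yields a probability that the data is abnormal and that the averaged value is compared with a cutoff of 0.5; the cutoff, the function name, and the example window are assumptions.

```python
from statistics import mean, median, mode
from typing import Callable, Dict, Sequence

AVERAGERS: Dict[str, Callable[[Sequence[float]], float]] = {
    "mean": mean,
    "median": median,
    "mode": mode,
}


def classify_window(probs: Sequence[float], method: str = "mean", cutoff: float = 0.5) -> str:
    """Average per-step abnormality probabilities, then compare against a cutoff."""
    average = AVERAGERS[method](probs)
    return "abnormal" if average > cutoff else "normal"


# Hypothetical per-time-step probabilities that the data is abnormal.
window = [0.25, 0.25, 0.25, 1.0, 1.0]
by_mean = classify_window(window, "mean")
by_median = classify_window(window, "median")
by_mode = classify_window(window, "mode")
```

For this window the mean is pulled above the cutoff by the two high values, while the median and the mode both stay at 0.25, so the choice of averaging method changes the outcome.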

In some example implementations of any of the example methods, the personalized trained high-fidelity machine learning model is stored in a memory of the user's wearable device. In some example implementations of any of the example methods, the measured health indicator data and the other factor data are time segments of data within a time period.

In some example implementations of any of the example methods, the personalized trained high-fidelity machine learning model is stored in a remote memory, wherein the remote memory is located remotely from the user's wearable computing device.

In some example implementations, a health monitoring device may include a mobile computing device comprising: a microprocessor; a display; a user health indicator data sensor; and a memory storing instructions that, when executed by the microprocessor, cause the processing device to perform the following operations: receiving measured low-fidelity health indicator data and other factor data at a first time, wherein the measured health indicator data is obtained by the user health indicator data sensor; inputting data including the health indicator data and the other factor data at the first time into a trained high-fidelity machine learning model, wherein the trained high-fidelity machine learning model predicts whether the user's health indicator data is normal or abnormal; and, in response to the prediction being abnormal, sending a notification that the user's health is abnormal to at least the user.

In some example implementations of any of the example health monitoring devices, the trained high-fidelity machine learning model is a trained high-fidelity generative neural network. In some example implementations of any of the example health monitoring devices, the trained high-fidelity machine learning model is a trained recurrent neural network (RNN). In some example implementations of any of the example health monitoring devices, the trained high-fidelity machine learning model is a trained feedforward neural network. In some example implementations of any of the example health monitoring devices, the trained high-fidelity machine learning model is a CNN.

In some example implementations of any of the example health monitoring devices, the trained high-fidelity machine learning model is trained with measured user health indicator data that is labeled based on user-specific high-fidelity measurement data.

In some example implementations of any of the example health monitoring devices, the trained high-fidelity machine learning model is trained with low-fidelity health indicator data that is labeled based on high-fidelity measurement data, wherein the low-fidelity health indicator data and the high-fidelity measurement data are from a population of subjects.

In some example implementations of any of the example health monitoring devices, the high-fidelity machine learning model outputs a probability distribution, and the prediction is sampled according to the probability distribution.

In some example implementations of any of the example health monitoring devices, the prediction is sampled according to a sampling technique selected from the group consisting of: taking the prediction of maximum probability; and randomly sampling the prediction according to the probability distribution.

In some example implementations of any of the example health monitoring devices, an average prediction is determined by averaging, using an averaging method, the predictions over a time period of time steps, and the average prediction is used to determine whether the user's health indicator data is normal or abnormal.

In some example implementations of any of the example health monitoring devices, the measured health indicator data and the other factor data are time segments of data within a time period.

In some example implementations of any of the example health monitoring devices, the averaging method includes one or more methods selected from the group consisting of: calculating an average value, calculating an arithmetic mean, calculating a median, and calculating a mode.

In some example implementations of any of the example health monitoring devices, the personalized trained high-fidelity machine learning model is stored in the memory. In some example implementations of any of the example health monitoring devices, the personalized trained high-fidelity machine learning model is stored in a remote memory, wherein the remote memory is located remotely from the wearable computing device. In some example implementations of any of the example health monitoring devices, the mobile device is selected from the group consisting of: a smart watch; a fitness band; a tablet computer; and a laptop computer.

Claims

1. A device for health analysis based on machine learning, comprising:

a processing device;

a health indicator data sensor operably coupled to the processing device; and

a memory having stored thereon instructions which, when executed by the processing device, cause the processing device to:

receive low-fidelity health indicator data and other factor data of a user at a time, wherein the low-fidelity health indicator data is obtained by the health indicator data sensor;

input a data set including the low-fidelity health indicator data and the other factor data into a trained high-fidelity machine learning model, wherein the trained high-fidelity machine learning model is used to generate a prediction of whether the high-fidelity health indicator output of the user is normal or abnormal; and

in response to the prediction being abnormal, send a notification of the user's health abnormality.

2. The apparatus of claim 1, wherein the trained high-fidelity machine learning model comprises one or more of: a trained high-fidelity generative neural network, a trained recurrent neural network (RNN), and a trained feedforward neural network.

3. The apparatus of claim 1, wherein the trained high-fidelity machine learning model is trained by utilizing measured user health indicator data labeled with high-fidelity measurement data of a specific user.

4. The apparatus of claim 1, wherein the trained high-fidelity machine learning model is trained by using low-fidelity health indicator data labeled with high-fidelity measurement data, wherein the low-fidelity health indicator data and the high-fidelity measurement data are from a population of subjects.

5. The apparatus of claim 1, wherein the high-fidelity machine learning model outputs a probability distribution, wherein the predictions are sampled according to the probability distribution.

6. The apparatus of claim 5, wherein the predictions are sampled according to a sampling technique selected from the group consisting of: predictions at maximum probability; and random sampling of the predictions according to the probability distribution.

7. The apparatus of claim 5, wherein an average prediction is determined by averaging the predictions within a time period of time steps using an averaging method, and wherein the average prediction is used to determine whether the low-fidelity health indicator data of the user is normal or abnormal.

8. The apparatus of claim 1, wherein the apparatus is selected from the group consisting of: a smart watch; a fitness band; a tablet computer; and a laptop computer.

9. The apparatus of claim 1, wherein the low-fidelity health indicator data and the other factor data are each time segments of data within a time period.

10. The apparatus of claim 1, wherein the low-fidelity health indicator data comprises a record of heart rate prior to the time, the other factor data comprises a record of activity level, and the prediction comprises a prediction that the user experienced atrial fibrillation during the recording of heart rate prior to the time.

11. The apparatus of claim 1, wherein the processing device is further configured to:

receive a training data set, wherein the training data includes labeled low-fidelity health indicator data from a population of individuals and corresponding other factor data from the population of individuals, wherein the labeled low-fidelity health indicator data is labeled using corresponding high-fidelity data from the population of individuals;

input the labeled low-fidelity health indicator data and corresponding intervals of other factor data into the trained high-fidelity machine learning model; and

update the trained high-fidelity machine learning model by comparing the output from the trained high-fidelity machine learning model with the labeled high-fidelity data.
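The compare-and-update step can be illustrated with a single gradient-descent step. This sketch substitutes a small logistic regression for the neural networks named in the claims, purely to keep the example self-contained; the feature layout (one low-fidelity value, one other-factor value), the 0/1 labels derived from high-fidelity data, and all function names are assumptions.

```python
import math
from typing import List, Sequence, Tuple

# Each example: ((low-fidelity value, other-factor value), high-fidelity label 0 or 1)
Example = Tuple[Sequence[float], int]


def predict(weights: Sequence[float], features: Sequence[float]) -> float:
    """Model output: probability of 'abnormal'. weights[0] is the bias term."""
    z = weights[0] + sum(w * x for w, x in zip(weights[1:], features))
    return 1.0 / (1.0 + math.exp(-z))


def log_loss(weights: Sequence[float], batch: Sequence[Example]) -> float:
    """Mean cross-entropy between model outputs and the high-fidelity labels."""
    eps = 1e-12
    total = 0.0
    for features, label in batch:
        p = min(max(predict(weights, features), eps), 1.0 - eps)
        total -= label * math.log(p) + (1 - label) * math.log(1.0 - p)
    return total / len(batch)


def update(weights: Sequence[float], batch: Sequence[Example],
           lr: float = 0.1) -> List[float]:
    """One gradient step: compare outputs with labels and adjust the weights."""
    grad = [0.0] * len(weights)
    for features, label in batch:
        err = predict(weights, features) - label  # output minus label
        grad[0] += err
        for i, x in enumerate(features):
            grad[i + 1] += err * x
    n = len(batch)
    return [w - lr * g / n for w, g in zip(weights, grad)]
```

Calling `update` repeatedly over batches from the training data set lowers `log_loss`, which is the sense in which the model is updated by comparing its output with the labeled data.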

12. A method for health analysis based on machine learning, comprising:

receiving, by a processing device, measured low-fidelity health indicator data and other factor data of a user at a time, wherein the measured low-fidelity health indicator data is obtained by a user health indicator data sensor;

inputting, by the processing device, data including the low-fidelity health indicator data and the other factor data at the time into a trained high-fidelity machine learning model, wherein the trained high-fidelity machine learning model generates a prediction of whether the high-fidelity health indicator output of the user is normal or abnormal; and

in response to the prediction being abnormal, sending a notification of the user's health abnormality.

13. The method of claim 12, wherein the trained high-fidelity machine learning model comprises one or more of: a trained high-fidelity generative neural network, a trained recurrent neural network (RNN), and a trained feedforward neural network.

14. The method of claim 12, wherein the low-fidelity health indicator data and the other factor data are each time segments of data within a time period.

15. The method of claim 12, wherein the low-fidelity health indicator data includes a record of heart rate before the time, the other factor data includes a record of activity level, the prediction includes a prediction that the user experienced atrial fibrillation during the recording of heart rate before the time, and the notification includes an instruction to take an ECG.

16. The method of claim 12, wherein the trained high-fidelity machine learning model is trained by using low-fidelity health indicator data labeled with high-fidelity measurement data, wherein the low-fidelity health indicator data and the high-fidelity measurement data are from a population of subjects.

17. The method of claim 12, wherein the high-fidelity machine learning model outputs a probability distribution, wherein the predictions are sampled according to the probability distribution.

18. The method of claim 17, wherein the predictions are sampled according to a sampling technique selected from the group consisting of: predictions at maximum probability; and random sampling of the predictions according to the probability distribution.

19. The method of claim 17, wherein an average prediction is determined by averaging, using an averaging method, the predictions over the time steps within a time period, and wherein the average prediction is used to determine whether the low-fidelity health indicator data of the user is normal or abnormal.

20. The method of claim 12, further comprising:

receiving a training data set, wherein the training data comprises labeled low-fidelity health indicator data from a population of individuals and corresponding other factor data from the population of individuals, wherein the labeled low-fidelity health indicator data is labeled using corresponding high-fidelity data from the population of individuals;

inputting the labeled intervals of low-fidelity health indicator data and corresponding other factor data into the trained high-fidelity machine learning model; and