Apparatus and method for recognizing the gaze direction of human eyes, and applications thereof
Technical field
The present invention relates to machine vision and automatic control technology, and in particular to an apparatus and method for recognizing the gaze direction of human eyes and to their application in intelligent control. The invention is a new technique for determining the gaze direction of human eyes from video images. Apart from a single camera, a processor and a visual interface, it requires no additional auxiliary equipment, and the recognition method is simple and fast. In many applications of machine vision and intelligent control it can make a machine's response to people more proactive, friendly and precise.
Background art
In interaction between people, humans are extremely sensitive to facial expressions, and in particular to the direction in which another person's eyes are gazing. Compared with human vision, machine vision is still at a very low level in this respect. Existing gaze direction recognition relies mainly on special auxiliary equipment mounted on the subject's head, such as infrared illuminators, reflected-light detectors and dual video cameras. Although these devices can measure gaze direction with high accuracy, their range of application is very limited.
The control techniques now widely used for computers and household appliances, such as remote controls, keyboards, mice and touch screens, are essentially key-operated contact methods, and such contact has been shown to be an important route of pathogen transmission. In addition, from a "people-oriented" point of view, devices of this kind are in many respects unnatural and inconvenient to use. Intelligent control based on vision and hearing is a non-contact technique that approaches the natural way in which people interact with one another, and it is the direction in which intelligent control technology is developing. It is convenient and natural, avoids the spread of germs by contact, and is also of great significance for the popularization of computers and digital technology.
Summary of the invention
The object of the present invention is to provide an apparatus and method for recognizing the gaze direction of human eyes, and applications thereof. In non-contact intelligent machine control based on machine vision and hearing, the invention uses whether or not the user is gazing at the machine as the signal for starting or shutting down the machine, or as information about the user's needs, so that the machine responds to those needs more proactively, amicably and precisely. Such machines include household appliances, which have a very large market, game machines, medical care instruments, driver safety assistants, intelligent robots, and remote control devices for disabled persons and certain persons with special needs. For example, gazing can be used to switch appliances such as air conditioners or electric fans on or off, or to "wake" a robot on standby or a computer in a dormant state. Gaze direction recognition can also be added to voice-based or gesture-based intelligent control to determine whether a voice or gesture signal is a control instruction that the user has issued intentionally, which can greatly improve the reliability of intelligent control.
The present invention mainly comprises an imaging device (a camera head: a video camera or an image sensor), an image storage device (memory), an image recognition device and an image processing device (a computer, a single-board computer or a DSP processing chip). The imaging device captures images of the scene. The image storage device stores the image information of the captured scene images. The image recognition device detects, in the image information obtained in real time, the image information that satisfies a predetermined condition. The image processing device automatically determines whether the captured scene image information constitutes a control instruction or auxiliary information, in particular whether the gaze direction of the eyes does so. This control instruction or auxiliary information can then be used for non-contact control of electrical appliances.
In the present invention the hardware is installed on the controlled appliance, with the camera pointing toward the place where the user of the appliance is located. The camera captures scene images continuously at regular intervals, and the processing software automatically determines whether a person is present in each captured image. If no one is present, the cycle of scene image capture continues. If someone is present, the software further determines whether the person's eyes are gazing at the camera, and the result is used as a control instruction or as auxiliary information in the non-contact appliance control system.
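The capture-and-decide cycle described above can be sketched as a simple loop. This is a minimal illustration only: the function name `gaze_control_loop`, the two detector callbacks and the three result labels are hypothetical and stand in for the processing steps detailed later in this document.

```python
from typing import Callable, Iterable, Iterator, TypeVar

Frame = TypeVar("Frame")


def gaze_control_loop(
    frames: Iterable[Frame],
    person_present: Callable[[Frame], bool],
    is_gazing: Callable[[Frame], bool],
) -> Iterator[str]:
    """Yield one decision per captured frame.

    The callbacks are placeholders (assumptions) for the person
    detection and gaze discrimination stages.
    """
    for frame in frames:
        if not person_present(frame):
            # Nobody in the scene: keep cycling through new frames.
            yield "no_person"
            continue
        # Someone is present: check whether they are gazing at the camera.
        yield "gazing" if is_gazing(frame) else "not_gazing"
```

In a real controller, the "gazing" result would be passed on as the control instruction or auxiliary information, while the other two results simply let the loop continue.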
Tests of gaze direction discrimination using the present invention show that recognition is little affected by illumination and is convenient and reliable. They also show that the effectiveness of gaze direction recognition depends on the size of the population to be recognized and on the number of training samples. When 126 face image samples collected from 13 different people (63 gaze samples and 63 non-gaze samples) were used for parameter extraction and neural network training, with the training set up to ensure reliable recognition across different people, the recognition rate on the training samples was 98.4%. When 127 non-training samples were then used as test samples, the correct recognition rate reached 84.3%. Reducing the number of people to be recognized or increasing the number of training samples can raise the correct recognition rate further. It should be pointed out that in these results all of the misjudged samples were gaze samples, that is, gazes judged to be non-gazes; every non-gaze sample was recognized correctly. The recognition condition for gaze samples is deliberately strict, so that when gaze direction is used as control information, no false operation is caused by mistaking a non-gaze for a gaze. A gaze that is misjudged as a non-gaze can still be recognized in the next round of processing, at the cost of only a slight delay. At present, each gaze direction recognition takes about 0.1 second. The processing time can be reduced further with faster hardware.
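As a worked check of the figures above, the sample counts implied by the reported rates can be recovered by rounding. The counts are inferred from the percentages and are not stated in the text, so they are an assumption; the helper name `correct_count` is hypothetical.

```python
def correct_count(total: int, rate_percent: float) -> int:
    """Nearest whole number of correctly recognized samples for a reported rate."""
    return round(total * rate_percent / 100)


# Training set: 126 samples at 98.4% (inferred count, not stated in the text).
train_correct = correct_count(126, 98.4)

# Test set: 127 samples at 84.3% (inferred count, not stated in the text).
test_correct = correct_count(127, 84.3)

# The text says every test error is a gaze judged as a non-gaze.
test_missed_gazes = 127 - test_correct
```

Under this reading, 124 of 126 training samples and 107 of 127 test samples are recognized correctly, which leaves 20 missed gazes and no false gaze detections in the test set.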
The present invention can be used in household appliances, game machines, medical care instruments, driver safety assistants, intelligent robots, and non-contact control devices for disabled persons and certain persons with special needs. For example, the gaze direction decision can be used to switch simple appliances such as air conditioners on and off, to "wake" a robot or computer on standby, or to issue a safety warning when a driver's line of sight strays. Gaze direction recognition can also be added to voice-based or gesture-based intelligent control to determine whether a voice or gesture signal is a control instruction that the user has issued intentionally, which can greatly improve the reliability of intelligent control and make the machine's response to people more proactive, friendly and precise. When the technique is applied in a gesture recognition system, for example, it can distinguish intentional from unintentional gestures and so reduce the gesture misjudgment rate. It also allows gesture instructions to be designed more naturally, with no need to prescribe gestures that people rarely use or find hard to make.
Description of the drawings
Fig. 1: Flow chart of image processing and gaze direction discrimination.
Fig. 2: Geometric diagram of the angle Φ between the face plane and the image plane. A is the center of the nose. B and E are the centers of the left and right eyes on the face plane, and ABE forms the face plane. B and C are the centers of the left and right eyes on the image plane, and ABC is the image plane. The angle between the face plane and the image plane is Φ, and AB is the line along which the two planes intersect. On the image plane, θ is the angle between the lines joining the nose center A to the two eye centers B and C.
Fig. 3: Flow chart for determining whether a person is present in the scene.
Fig. 4: Flow chart for face location and segmentation.
Fig. 5: Flow chart for locating the eyes and nose by windowing.
Fig. 6: Flow chart for facial parameter extraction and gaze direction recognition.
Embodiments
The present invention is described in detail below with reference to the drawings.
As shown in Fig. 1, the image acquisition and processing flow of the invention comprises the following five steps:
(1) Motion recognition and determining whether a person is in the scene
Motion is recognized by temporal differencing. A relatively fixed background image is chosen first. At regular intervals a new image is captured and the background image is subtracted from it, and the difference image shows whether a moving object has appeared in the scene. As soon as a moving object is found to have entered the scene, the image of the moving object is segmented and further processing begins; otherwise the next round of motion recognition is carried out. If a moving object has entered the scene, a feed-forward neural network performs skin color recognition on the extracted moving image. When an object with the same color as human skin is found and it satisfies a certain size condition, a person is considered to have entered the scene and the next step begins; otherwise motion recognition starts again with the next round.
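The background subtraction step can be sketched as follows for grayscale images stored as nested lists. The threshold of 30 gray levels and the minimum count of 4 changed pixels are assumed values chosen for illustration, and the function names are hypothetical; the text does not specify them.

```python
from typing import List

Image = List[List[int]]  # grayscale image, values 0..255


def motion_mask(background: Image, frame: Image, threshold: int = 30) -> List[List[bool]]:
    """Mark pixels whose absolute difference from the background exceeds the threshold."""
    return [
        [abs(f - b) > threshold for f, b in zip(frame_row, background_row)]
        for frame_row, background_row in zip(frame, background)
    ]


def has_motion(mask: List[List[bool]], min_pixels: int = 4) -> bool:
    """Report motion when enough pixels have changed (assumed size condition)."""
    changed = sum(sum(row) for row in mask)
    return changed >= min_pixels
```

The pixels flagged in the mask are the ones that would be handed to the skin color stage. A small minimum pixel count keeps isolated noisy pixels from triggering further processing.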
The image acquisition and processing described above uses a neural network for human skin color recognition, with a 4-3-1 structure. Three of the four input layer nodes receive the red, green and blue components of an image pixel, and the fourth node is the bias input. The hidden layer has three neurons, a number determined by repeated trials. The output layer neuron outputs +1 when the input is a skin color and -1 when the input is a non-skin color. The network is trained with the BP (back-propagation) learning algorithm. The training samples are measured skin and non-skin color signals taken from different people under four lighting conditions: fluorescent lamp, incandescent lamp, front-lit sunlight and back-lit sunlight. After training, the network recognizes skin color reliably and generalizes well. When the lighting conditions vary little or the number of users is small, the number of training samples can be reduced. The detailed procedure is shown in Fig. 3.
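A forward pass through a 4-3-1 network of this kind can be sketched as below. Several details are assumptions: the tanh activation (chosen because the stated outputs are +1 and -1), the scaling of color components to the range 0 to 1, and above all the weights, which are hand-picked placeholders rather than values obtained by BP training.

```python
import math
from typing import List, Sequence


def forward(inputs: Sequence[float],
            hidden_weights: Sequence[Sequence[float]],
            output_weights: Sequence[float]) -> float:
    """Two-layer feed-forward pass with tanh units (activation is an assumption)."""
    hidden = [math.tanh(sum(w * x for w, x in zip(row, inputs)))
              for row in hidden_weights]
    return math.tanh(sum(w * h for w, h in zip(output_weights, hidden)))


def is_skin(r: int, g: int, b: int,
            hidden_weights: Sequence[Sequence[float]],
            output_weights: Sequence[float]) -> bool:
    """Classify one pixel: positive output means skin, negative means non-skin."""
    x: List[float] = [r / 255, g / 255, b / 255, 1.0]  # fourth input is the bias
    return forward(x, hidden_weights, output_weights) > 0


# Placeholder weights for illustration only; a real network would be trained.
HIDDEN = [[2.0, -1.0, -1.0, 0.0],
          [1.0, 0.0, -1.0, 0.0],
          [0.0, 0.0, 0.0, 1.0]]
OUTPUT = [1.0, 1.0, -0.5]
```

With these placeholder weights a reddish pixel such as (200, 120, 100) gives a positive output and a bluish pixel such as (50, 100, 200) gives a negative one, which is enough to show the data flow but says nothing about real skin tones.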
(2) Face location and segmentation
In the detected moving image that has the same color as human skin, the skin color image is filled and segmented using the "mountain peak algorithm", and the geometric features of the human face are used to pick out the face image from the segmented regions. For details see the applicant's published paper [Yuan Jing et al., "A kind of ... based on skin color and face recognition", Journal of Optoelectronics · Laser, 2002, Vol. 13, No. 4, pp. 394-397]. If the skin color image contains no face, processing returns to the first step. The detailed procedure is shown in Fig. 4.
(3) Eye location and nose location
The segmented face image is binarized using a dynamic segmentation threshold. Because there are many black pixels at the eyes and nose, scanning the image row by row and column by column, combined with prior knowledge, determines the positions of the eyes and nose, and the image windows containing the eyes and the nose are then segmented. The detailed procedure is shown in Fig. 5.
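The row and column scan can be illustrated with projection profiles of the binarized face image. In this sketch a value of 1 marks a black pixel, and the `min_count` cut-off and the function names are assumptions; the prior knowledge mentioned above (for example, that the eye band lies above the nose band) is left to the caller.

```python
from typing import List, Tuple

Binary = List[List[int]]  # 1 = black pixel, 0 = white pixel


def row_profile(binary: Binary) -> List[int]:
    """Count black pixels in every row."""
    return [sum(row) for row in binary]


def col_profile(binary: Binary) -> List[int]:
    """Count black pixels in every column."""
    return [sum(col) for col in zip(*binary)]


def find_bands(profile: List[int], min_count: int) -> List[Tuple[int, int]]:
    """Return inclusive (start, end) index ranges where the count reaches min_count."""
    bands: List[Tuple[int, int]] = []
    start = None
    for i, count in enumerate(profile):
        if count >= min_count and start is None:
            start = i
        elif count < min_count and start is not None:
            bands.append((start, i - 1))
            start = None
    if start is not None:
        bands.append((start, len(profile) - 1))
    return bands
```

Applied to the row profile, `find_bands` gives candidate eye and nose bands from top to bottom; applied to the column profile of the eye band alone, it separates the two eyes.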
(4) Extraction of the facial geometric feature parameters related to gaze direction
The facial geometric feature parameters are the center of the nose, the centers of the left and right eyes and the centers of the eyeballs, all on the face image plane. The segmented eye windows are binarized again using a sub-region dynamic threshold method. The center of each eye is determined by rectangular frame matching, and the center of the eyeball is then determined by template matching (or by the mountain peak algorithm). Subtracting the eye center from the eyeball center gives the position offset of the eyeball relative to the eye. The nose window is likewise binarized with the sub-region dynamic threshold method, and the center of the nose (that is, the projected position of the nose on the image plane) is determined by template matching. Let B and C be the centers of the left and right eyes on the image plane and A the center of the nose; ABC is then the image plane (see Fig. 2). In Fig. 2, B and E denote the centers of the left and right eyes on the face plane. Suppose that the line along which the face plane and the image plane intersect is the line AB joining the nose center and the left eye center, and that ED is the perpendicular from E to AB. Then ∠EDC is the angle Φ between the face plane and the image plane. Here the right eye center E on the face plane is an assumed point derived from the geometric relationships; it can be determined from the two eye centers B and C on the image plane and the angle θ between the lines joining them to the nose center A. Theoretical derivation yields a relational expression linking the angle Φ to the lines AB and AC on the image plane and the angle θ between them (the expression itself is not reproduced in this text).
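The image-plane quantities named above can be computed directly from the located centers. The sketch below covers only the eyeball offset, the lengths of AB and AC, and the angle θ between them; it does not implement the relation for Φ, which is not given in this text. The function names and the (x, y) pixel coordinate convention are assumptions.

```python
import math
from typing import Tuple

Point = Tuple[float, float]  # (x, y) in image-plane pixels


def eyeball_offset(eyeball_center: Point, eye_center: Point) -> Point:
    """Offset of the eyeball relative to the eye: eyeball center minus eye center."""
    return (eyeball_center[0] - eye_center[0],
            eyeball_center[1] - eye_center[1])


def line_length(p: Point, q: Point) -> float:
    """Euclidean length of the segment pq, e.g. AB or AC."""
    return math.dist(p, q)


def angle_theta(a: Point, b: Point, c: Point) -> float:
    """Angle BAC in radians, between the lines AB and AC on the image plane."""
    abx, aby = b[0] - a[0], b[1] - a[1]
    acx, acy = c[0] - a[0], c[1] - a[1]
    cross = abx * acy - aby * acx
    dot = abx * acx + aby * acy
    return abs(math.atan2(cross, dot))
```

Using `atan2` on the cross and dot products avoids the loss of precision that `acos` suffers when the two lines are nearly parallel.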
(5) Gaze direction recognition by an artificial neural network based on the facial geometric feature parameters
Gaze direction recognition here means determining whether the eyeballs are gazing at the camera lens. The artificial neural network method of the present invention sets up a two-layer feed-forward neural network with a 4-4-1 structure. The input variables are parameters such as the position offsets of the two eyeballs relative to the eyes, calculated above, and the sine sin Φ of the angle between the face plane and the image plane; the fourth input is the bias. The weight parameters of the network are determined by learning from actual gaze and non-gaze image samples of the relevant people whose labels are known. Once training is complete, the network can be used to discriminate the gaze direction of the eyes and to output the result, and this output is used for intelligent control. The detailed procedure is shown in Fig. 6 and in the applicant's published paper [Wang Yong et al., "Discrimination of human eye gaze direction based on parameter extraction", Journal of Optoelectronics · Laser, 2001, Vol. 12, No. 12, pp. 1284-1287].
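The decision stage can be sketched as a forward pass through a 4-4-1 network followed by a strict threshold. Everything numeric here is an assumption: the tanh activation, the use of normalized horizontal offsets as the first two inputs, the threshold of 0.5 (chosen to mirror the deliberately strict gaze condition described earlier), and the weights, which are hand-picked placeholders rather than trained values.

```python
import math
from typing import Sequence


def forward(inputs: Sequence[float],
            hidden_weights: Sequence[Sequence[float]],
            output_weights: Sequence[float]) -> float:
    """Two-layer feed-forward pass with tanh units (activation is an assumption)."""
    hidden = [math.tanh(sum(w * x for w, x in zip(row, inputs)))
              for row in hidden_weights]
    return math.tanh(sum(w * h for w, h in zip(output_weights, hidden)))


# Placeholder 4-4-1 weights for illustration only; not trained values.
GAZE_HIDDEN = [[5.0, 5.0, 5.0, 2.0],
               [5.0, 5.0, 5.0, -2.0],
               [0.0, 0.0, 0.0, 10.0],
               [0.0, 0.0, 0.0, 0.0]]
GAZE_OUTPUT = [2.0, -2.0, -2.0, 0.0]


def is_gazing(left_offset: float, right_offset: float, sin_phi: float,
              threshold: float = 0.5) -> bool:
    """Accept a gaze only when the output clearly exceeds the strict threshold."""
    x = [left_offset, right_offset, sin_phi, 1.0]  # fourth input is the bias
    return forward(x, GAZE_HIDDEN, GAZE_OUTPUT) > threshold
```

Raising the threshold trades missed gazes for fewer false gaze detections, which matches the stated preference for avoiding false operations at the cost of a slight delay.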