CN103440864A - Personality characteristic forecasting method based on voices

Info

Publication number: CN103440864A
Application number: CN2013103292952A
Authority: CN
Inventors: 赵欢; 张希翔; 陈佐; 郑睿
Original assignee: Hunan University
Current assignee: Hunan University
Priority date: 2013-07-31
Filing date: 2013-07-31
Publication date: 2013-12-11

Abstract

The invention discloses a voice-based personality trait prediction method, implemented as follows: a personality assessment is administered to a plurality of reference subjects to obtain score values for multiple personality trait factors; speech segments of the reference subjects are collected, multiple acoustic-prosodic features are extracted from them, and multiple statistical feature values of those features are computed; a voice personality prediction machine learning model is established and trained on each reference subject's personality trait factor score values and the corresponding statistical feature values; a speech segment of the subject under test is collected, its acoustic-prosodic features and statistical features are extracted and input into the voice personality prediction machine learning model to obtain the personality trait factor score values corresponding to each acoustic-prosodic feature, and the score values obtained for all features are combined by weighted summation to obtain and output the personality trait factor score values of the subject under test. The invention has the advantages that prediction material is simple to collect, the prediction process is fast, and the result is objective and accurate.

Description

Voice-based personality trait prediction method

Technical field

The present invention relates to the field of computer application technology, and in particular to a voice-based personality trait prediction method.

Background art

At present, personality prediction on the internet generally takes the form of text questionnaires. Text-questionnaire methods are backed by a substantial body of research, for example the five-factor personality test (Big Five) and the Cattell Sixteen Personality Factor Questionnaire (16PF). However, they require the user to spend a great deal of time answering questions: the time needed for a prediction depends on the number of items and the respondent's answering speed, the procedure is long and tedious, respondents easily become bored and resistant, and the accuracy of the result depends on the respondent's subjective cooperation. Such methods are therefore poorly suited to the simple, convenient "fast food" style of application that the internet favors.

The technical scheme of patent application No. 201010606120.8 discloses a personality testing method and device that uses spoken question-and-answer interaction against a background of multiple dialects. It converts the text question-and-answer format of the personality test into a voice question-and-answer format, which to some extent solves the problems of adaptability and convenience for special populations, but it does not fundamentally address the excessive length of the test process. In addition, the technical scheme of patent application No. 201310059465.X discloses analyzing and predicting personality traits from pictures of the user's handwriting. Although this removes the lengthy question-answering time, handwritten pictures are not widely used in today's mobile and online social networking, so prediction data are difficult to collect. The voice-based personality prediction of the present invention involves few steps and is simple to operate, can be deployed in numerous applications in internet and mobile environments, and can thereby provide users with further social services accurately and efficiently. How to overcome the shortcomings of personality trait prediction on internet and mobile platforms (long time consumption, results affected by subjective factors, and measurement data that are hard to obtain) and provide users with a simple, easy-to-use "fast food" personality prediction approach has therefore become a technical problem urgently requiring a solution.

Summary of the invention

In view of the above technical problems of the prior art, the technical problem to be solved by the present invention is to provide a voice-based personality trait prediction method whose prediction takes little time, whose results are objective and accurate, and whose material is simple and convenient to collect.

To solve the above technical problem, the technical solution adopted by the present invention is:

A voice-based personality trait prediction method, implemented in the following steps:

1) Establishing a voice personality prediction machine learning model: administering a personality assessment to a plurality of selected reference subjects to obtain multiple personality trait factor score values, which serve as the ground-truth benchmark scores of the reference subjects' personality trait factors; collecting a plurality of speech segments of the reference subjects' normal speech, preprocessing the speech segments and extracting multiple acoustic-prosodic features, and extracting multiple statistical feature values of the acoustic-prosodic features; establishing a voice personality prediction machine learning model that contains the mapping from acoustic-prosodic features to personality trait factor score values, and inputting each reference subject's personality trait factor score values, together with the statistical feature values corresponding to each acoustic-prosodic feature of the speech segments, into the voice personality prediction machine learning model for training;

2) Personality trait prediction: collecting the normal speech of the subject under test to obtain a speech segment to be predicted; preprocessing the speech segment and extracting multiple acoustic-prosodic features and the corresponding statistical features; inputting the acoustic-prosodic features and corresponding statistical features into the voice personality prediction machine learning model to perform regression analysis of the personality trait factor score values, obtaining the personality trait factor score values corresponding to each acoustic-prosodic feature and its statistical features; and performing a weighted summation of all the personality trait factor score values corresponding to the respective acoustic-prosodic features, finally obtaining and outputting the multiple personality trait factor score values of the subject under test.

As further improvements of the above technical solution:

In step 1), administering a personality assessment to the plurality of selected reference subjects specifically means administering one of the five-factor personality test, the Minnesota Multiphasic Personality Inventory, or the Cattell 16 personality factor test.

In step 1) and step 2), the detailed steps of preprocessing a speech segment and extracting multiple acoustic-prosodic features are as follows: the speech segment undergoes pre-emphasis, windowing, framing, and endpoint detection to obtain a preprocessed speech segment; from each preprocessed speech segment, multiple acoustic-prosodic features are extracted from among the Mel-frequency cepstral coefficients, linear prediction cepstral coefficients, perceptual linear prediction coefficients, pitch, the first two formants, energy, voiced segment length, unvoiced segment length, short-time zero-crossing rate, harmonics-to-noise ratio, and long-term average spectrum.

In step 1), the multiple statistical feature values extracted from the acoustic-prosodic features specifically refer to several of the maximum, minimum, mean, variance, relative entropy, slope, and difference value of each acoustic-prosodic feature.
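Of the statistics listed, relative entropy is the least self-explanatory, and the text does not say which two distributions are compared. As a minimal sketch, assuming (hypothetically) that a feature contour is turned into a histogram and compared with a reference histogram over the same bins, the Kullback-Leibler divergence could be computed as follows. The function names, the bin count, and the smoothing constant are illustrative assumptions and are not taken from the patent.

```python
import math


def histogram(values, lo, hi, bins=10, smoothing=1e-6):
    """Return a smoothed, normalized histogram of values over [lo, hi].

    Assumption: values outside the range are clamped into the edge bins.
    """
    counts = [smoothing] * bins
    width = (hi - lo) / bins
    for v in values:
        index = int((v - lo) / width)
        index = max(0, min(index, bins - 1))  # clamp so v == hi lands in the last bin
        counts[index] += 1.0
    total = sum(counts)
    return [c / total for c in counts]


def relative_entropy(p, q):
    """Kullback-Leibler divergence D(p || q), in nats, of two discrete distributions.

    q must be positive wherever p is positive; the smoothing above ensures this.
    """
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0.0)
```

The divergence is zero when the two histograms coincide and grows as the segment's distribution departs from the reference; which reference to use is a design choice the source leaves open.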

The voice personality prediction machine learning model in step 1) specifically refers to one of a Gaussian-kernel support vector machine statistical model, a logistic regression model, a decision tree model, a least squares model, a perceptron model, a Boost model, a hidden Markov model, a Gaussian mixture model, a neural network model, or a deep learning model.

The present invention has the following technical effects. Using the voice personality prediction machine learning model established in advance, any speech segment provided by the user can be used to predict personality traits. Statistical learning establishes the mapping between speech features and personality trait factors, from which each personality factor index is predicted. This overcomes the shortcomings of traditional personality prediction, namely long time consumption, results affected by subjective factors, and measurement material that is hard to obtain, and takes full advantage of the ease with which voice material can be obtained in today's online and mobile social networking. The acoustic-prosodic features of any speech segment of the subject under test submitted by the user are extracted, statistical learning is used to compute the multiple personality trait factor score values corresponding to that speech segment, and these score values are combined by weighted summation to obtain the subject's final comprehensive personality trait score, on the basis of which social services are provided to the user. Such services include fast personality prediction applications of a social nature, for example personality-matched dating and friend-making, interpersonal relationship prediction, and career planning. The invention thus has the advantages of short prediction time, objective and accurate results, simple and convenient material collection, and a wide range of applications.

Brief description of the drawings

Fig. 1 is a schematic flow diagram of the method of an embodiment of the present invention.

Fig. 2 is a schematic diagram of the principle of personality trait prediction in an embodiment of the present invention.

Detailed description

As shown in Fig. 1, the voice-based personality trait prediction method of this embodiment is implemented in the following steps:

One: establishing the voice personality prediction machine learning model.

1.1) A personality assessment is administered to a plurality of selected reference subjects to obtain multiple personality trait factor score values, which serve as the ground-truth benchmark scores of the reference subjects' personality trait factors. In this embodiment, the assessment administered to the selected reference subjects is the five-factor personality test (Big Five), which yields five personality trait factor score values for each reference subject: neuroticism, extraversion, openness, agreeableness, and conscientiousness. Alternatively, the Minnesota Multiphasic Personality Inventory, the Cattell 16 personality factor test, or a similar instrument may be used; these likewise yield multiple personality trait factor score values, though the number of score values varies with the assessment method used.

1.2) A plurality of speech segments of the reference subjects' normal speech are collected, and the speech segments are preprocessed and multiple acoustic-prosodic features extracted. In this embodiment, 400 reference subjects were selected in total, and each recorded 10 segments of arbitrary normal speech of about 15 seconds each, giving 4000 speech segments in total. Because an experimental sample of more than 300 is generally sufficient for psychological analysis, the speech segments used in this embodiment to build the voice personality prediction machine learning model meet the relevant sampling standard. About two-thirds of the speech segments are used as the training set and the remaining third as the test set. In this embodiment, the detailed steps of preprocessing a speech segment and obtaining its acoustic-prosodic features are as follows: the speech segment undergoes speech preprocessing (pre-emphasis, windowing, framing, and endpoint detection, in that order) to obtain a plurality of speech segments, and from each of them several of the following are extracted as the acoustic-prosodic features: Mel-frequency cepstral coefficients (MFCC); pitch (the number of vocal fold vibrations per second, related to tone and intonation); the first two formants (first formant F1 and second formant F2); energy; voiced segment length (L0); unvoiced segment length (L1, which in combination with L0 relates to speaking rate); perceptual linear prediction coefficients (Perceptual Linear Predictive); short-time zero-crossing rate; harmonics-to-noise ratio (Harmonics-to-Noise Ratio); and long-term average spectrum (Long-Term Average Spectrum).
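The preprocessing chain named in step 1.2) can be sketched in plain Python as follows. This is only an illustration under stated assumptions: the pre-emphasis coefficient (0.97), the Hamming window, the frame and hop lengths, and the simple energy-threshold endpoint detector are common textbook choices, not values given in the patent, and all function names are hypothetical. The sketch frames the signal before windowing each frame, which is the usual practical order. Two of the simpler listed features, short-time energy and short-time zero-crossing rate, are included because they fall out of the same framing.

```python
import math


def pre_emphasize(samples, coeff=0.97):
    """Boost high frequencies: y[n] = x[n] - coeff * x[n-1]."""
    if not samples:
        return []
    return [samples[0]] + [samples[i] - coeff * samples[i - 1]
                           for i in range(1, len(samples))]


def split_frames(samples, frame_len, hop):
    """Cut the signal into frames; a trailing partial frame is dropped."""
    return [samples[start:start + frame_len]
            for start in range(0, len(samples) - frame_len + 1, hop)]


def hamming_window(frame):
    """Apply a Hamming window to one frame."""
    n = len(frame)
    if n < 2:
        return list(frame)
    return [x * (0.54 - 0.46 * math.cos(2.0 * math.pi * i / (n - 1)))
            for i, x in enumerate(frame)]


def short_time_energy(frame):
    """Sum of squared samples in one frame."""
    return sum(x * x for x in frame)


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    if len(frame) < 2:
        return 0.0
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)


def detect_endpoints(energies, ratio=0.1):
    """Return (first, last) indices of frames with energy above ratio * max energy."""
    if not energies:
        return None
    threshold = ratio * max(energies)
    active = [i for i, e in enumerate(energies) if e > threshold]
    return (active[0], active[-1]) if active else None


def preprocess(samples, frame_len=256, hop=128):
    """Run the chain and return the windowed frames between the detected endpoints."""
    frames = [hamming_window(f)
              for f in split_frames(pre_emphasize(samples), frame_len, hop)]
    endpoints = detect_endpoints([short_time_energy(f) for f in frames])
    if endpoints is None:
        return []
    first, last = endpoints
    return frames[first:last + 1]
```

Features such as MFCC, formants, or the harmonics-to-noise ratio need considerably more machinery and are not attempted here.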

1.3) Multiple statistical feature values of the acoustic-prosodic features are extracted. In this embodiment, these are several of the maximum (Max), minimum (Min), mean (Mean), variance (Stdev), relative entropy (KL), slope, and difference value of each acoustic-prosodic feature.
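A minimal sketch of step 1.3), assuming each acoustic-prosodic feature is available as a per-frame contour (a list of numbers, for example one pitch value per frame). The patent does not define "slope" or "difference value"; here they are assumed to be the least-squares slope of the contour against the frame index and the mean absolute difference between consecutive frames. Relative entropy is omitted because it needs a reference distribution that the text does not specify. The function name and dictionary keys are hypothetical.

```python
from statistics import fmean, pvariance


def contour_statistics(values):
    """Summarize one non-empty feature contour (e.g. pitch per frame)."""
    n = len(values)
    mean = fmean(values)

    # Least-squares slope of the value against the frame index (assumed definition).
    index_mean = (n - 1) / 2
    denom = sum((i - index_mean) ** 2 for i in range(n))
    slope = (sum((i - index_mean) * (v - mean) for i, v in enumerate(values)) / denom
             if denom else 0.0)

    # Mean absolute first difference between consecutive frames (assumed definition).
    diffs = [abs(b - a) for a, b in zip(values, values[1:])]

    return {
        "max": max(values),
        "min": min(values),
        "mean": mean,
        "variance": pvariance(values),  # population variance
        "slope": slope,
        "mean_abs_diff": fmean(diffs) if diffs else 0.0,
    }
```

Applied to every extracted contour, this turns a variable-length speech segment into a fixed-length vector of statistics that a regression model can consume.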

1.4) A voice personality prediction machine learning model containing the mapping from acoustic-prosodic features to personality trait factor score values is established, and each reference subject's personality trait factor score values, together with the statistical feature values corresponding to each acoustic-prosodic feature of the speech segments, are input into the model for training.
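Before training, the embodiment divides the collected speech segments into roughly two-thirds for training and one-third for testing (see step 1.2). A minimal sketch of such a split is shown below. The random shuffle, the fixed seed, and the function name are assumptions; the patent does not say how segments are assigned, nor whether segments from one speaker are kept on the same side of the split.

```python
import random


def split_segments(segment_ids, train_fraction=2 / 3, seed=0):
    """Shuffle segment identifiers and split them into (train, test) lists."""
    shuffled = list(segment_ids)
    random.Random(seed).shuffle(shuffled)  # seeded for a reproducible split
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


# Example with the embodiment's 4000 segments (identifiers are placeholders).
train_ids, test_ids = split_segments(range(4000))
```

If speaker-independent evaluation is wanted, the same function could be applied to the 400 speaker identifiers instead of to individual segments.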

In this embodiment, the voice personality prediction machine learning model that receives each reference subject's personality trait factor score values and the statistical feature values of the acoustic-prosodic features of the corresponding speech segments is a Gaussian-kernel support vector machine (SVM) statistical model. Each acoustic-prosodic feature of each speech segment is associated with five personality trait factor score values: neuroticism, extraversion, openness, agreeableness, and conscientiousness. Other voice personality prediction machine learning models may also be used as needed, including a logistic regression model, decision tree model, least squares model, perceptron model, Boost model, hidden Markov model, Gaussian mixture model, neural network model, or deep learning model. Whichever model is used, its accuracy is related to the number of training samples: the more training samples, the higher the accuracy. In this embodiment, after each reference subject's personality trait factor score values and the corresponding statistical feature values of the acoustic-prosodic features are input, the Gaussian-kernel SVM statistical model is trained, yielding a Gaussian-kernel SVM statistical model that contains the mapping from acoustic-prosodic features to personality trait factor score values; this is the aforementioned voice personality prediction machine learning model. Because this model contains that mapping, any speech segment provided by the user can be used for personality prediction: the mapping between the acoustic-prosodic features of the speech segment and the personality trait factor score values is used to predict each personality factor index, which lays the foundation for personality trait prediction.
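To make the idea of a Gaussian-kernel regressor concrete without depending on an external library, the sketch below fits a kernel ridge regressor with a Gaussian kernel for a single trait. This is a deliberately simpler stand-in and is not the support vector machine of the embodiment: it shares the kernel but replaces the SVM training objective with regularized least squares. All names, the kernel width `gamma`, and the regularization strength `ridge` are assumptions chosen for illustration.

```python
import math


def gaussian_kernel(x, y, gamma=0.5):
    """Gaussian (RBF) kernel: exp(-gamma * ||x - y||^2)."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))


def solve_linear_system(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    m = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]  # augmented copy
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_kernel_regressor(features, scores, gamma=0.5, ridge=1e-3):
    """Return dual weights alpha solving (K + ridge * I) alpha = scores."""
    n = len(features)
    gram = [[gaussian_kernel(features[i], features[j], gamma)
             + (ridge if i == j else 0.0) for j in range(n)] for i in range(n)]
    return solve_linear_system(gram, scores)


def predict_score(features, alphas, x, gamma=0.5):
    """Predict one trait score for the statistical feature vector x."""
    return sum(a * gaussian_kernel(f, x, gamma) for a, f in zip(alphas, features))
```

One regressor of this kind would be fitted per trait, each mapping a vector of statistical feature values to one score; the same `gamma` must be used for fitting and prediction. For the support vector model the embodiment actually names, an established SVM implementation would be the more appropriate choice.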

Two: personality trait prediction.

2.1) The normal speech of the subject under test is collected to obtain a speech segment to be predicted, and the speech segment is preprocessed and the acoustic-prosodic features and corresponding statistical features extracted. Speech segments can be collected in two ways. First, the user may use a mobile phone, computer, tablet, or other electronic device to select a previously recorded speech segment file and submit it over the network to the voice collection interface of a system applying the method of this embodiment. Second, the user may use the real-time recording function of such a system to record a speech segment and submit it to the voice collection interface. In this embodiment, the voice collection interface receives the speech segment audio files submitted by users over the network; the sampling rate is 11025 Hz, and all speech segment audio files are saved in WAV format. The steps for preprocessing the speech segment and obtaining its acoustic-prosodic features are the same as in step 1.2) and are not repeated here.
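The embodiment states that submitted audio is stored as WAV with a sampling rate of 11025 Hz. A minimal sketch of loading such a file with Python's standard `wave` module is given below. The function name, the restriction to 16-bit mono PCM, and the scaling to the range [-1, 1) are assumptions made for illustration, since the patent does not specify bit depth or channel count.

```python
import struct
import wave

EXPECTED_RATE_HZ = 11025  # sampling rate stated in the embodiment


def load_wav_mono16(path, expected_rate=EXPECTED_RATE_HZ):
    """Read a 16-bit mono WAV file and return (samples, rate).

    Samples are floats scaled to [-1, 1). Raises ValueError for other formats
    or for a sampling rate different from expected_rate.
    """
    with wave.open(path, "rb") as wav:
        rate = wav.getframerate()
        if wav.getnchannels() != 1 or wav.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit mono WAV files")
        if rate != expected_rate:
            raise ValueError("unexpected sampling rate: %d Hz" % rate)
        raw = wav.readframes(wav.getnframes())
    count = len(raw) // 2
    samples = struct.unpack("<%dh" % count, raw)  # little-endian signed 16-bit
    return [s / 32768.0 for s in samples], rate
```

A production interface would more likely resample or convert unsupported files than reject them; rejecting keeps the sketch short.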

2.2) The acoustic-prosodic features and corresponding statistical features are input into the voice personality prediction machine learning model, which performs regression analysis of the personality trait factor score values to obtain the score values corresponding to each acoustic-prosodic feature and its statistical features. Because the model contains the mapping from acoustic-prosodic features to personality trait factor score values, this regression analysis yields score values for the five personality trait factors: neuroticism, extraversion, openness, agreeableness, and conscientiousness. The result is that each acoustic-prosodic feature has five corresponding personality trait factor score values.

2.3) A weighted summation is performed over all the personality trait factor score values corresponding to the respective acoustic-prosodic features, finally obtaining and outputting the multiple personality trait factor score values of the subject under test.
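Step 2.3) can be sketched as below, reading the weighted summation as being taken per trait across the acoustic-prosodic features, so that the output still has one score per trait. The patent does not give the weights; the sketch assumes equal weights by default and divides by the total weight so that the result stays on the same scale as the inputs. Both choices, and all names, are illustrative assumptions.

```python
TRAITS = ("neuroticism", "extraversion", "openness",
          "agreeableness", "conscientiousness")


def combine_feature_scores(per_feature_scores, weights=None):
    """Combine per-feature trait scores into one score per trait.

    per_feature_scores: {feature_name: {trait_name: score}}
    weights: optional {feature_name: weight}; equal weights if omitted.
    """
    names = list(per_feature_scores)
    if weights is None:
        weights = {name: 1.0 for name in names}
    total = sum(weights[name] for name in names)
    return {
        trait: sum(weights[name] * per_feature_scores[name][trait]
                   for name in names) / total
        for trait in TRAITS
    }


# Example: two features, with pitch weighted three times as heavily as energy.
example = combine_feature_scores(
    {"pitch": {t: 4.0 for t in TRAITS}, "energy": {t: 2.0 for t in TRAITS}},
    weights={"pitch": 3.0, "energy": 1.0},
)
```

In practice the weights might be tuned on the held-out test segments, for example in proportion to how well each feature predicts each trait.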

As shown in Fig. 2, this embodiment first, in step 2.1), collects a speech segment and extracts the acoustic-prosodic features and statistical features. The acoustic-prosodic features extracted include pitch, formants (the first two formants, F1 and F2), and so on; the statistical features computed include maximum, minimum, mean, variance, relative entropy, and so on. After step 2.2), in which these are input into the voice personality prediction machine learning model for regression analysis of the personality trait factor score values, each acoustic-prosodic feature has five corresponding score values, for neuroticism, extraversion, openness, agreeableness, and conscientiousness. Finally, in step 2.3), the score values for these five traits are combined by weighted summation to give the final scores of the five personality factors; the five predicted personality factor index values of the subject under test are the final personality trait prediction result.

In summary, using the voice personality prediction machine learning model established in advance, this embodiment can use any speech segment provided by the user for personality prediction. Statistical learning establishes the mapping between speech features and personality trait factors, from which each personality factor index is predicted. This overcomes the shortcomings of traditional personality prediction (long time consumption, results affected by subjective factors, and measurement material that is hard to obtain) and takes full advantage of the ease with which voice material can be obtained in today's online and mobile social networking. The acoustic, prosodic, and other features of any speech segment of the subject under test submitted by the user are extracted, statistical learning is used to compute the multiple personality trait factor score values corresponding to that speech segment, and these score values are combined by weighted summation to obtain the subject's final comprehensive personality trait score, on the basis of which social services are provided to the user. In comparison experiments on personality trait prediction accuracy carried out on a plurality of reference subjects, the experimental data show that the a priori accuracy of this embodiment can reach about 67%, which is fairly close to the 75% measured accuracy of manually administered personality assessment in the prior art, and can therefore meet the demand for fast and accurate personality trait prediction. The embodiment can provide users with fast personality prediction applications of a social nature, such as personality-matched dating and friend-making, interpersonal relationship prediction, and career planning, and has the advantages of short prediction time, objective and accurate results, simple and convenient material collection, and a wide range of applications.

The above is only a preferred embodiment of the present invention. The scope of protection of the present invention is not limited to the above embodiment; all technical solutions that fall within the concept of the present invention belong to its scope of protection. It should be noted that, for those of ordinary skill in the art, improvements and modifications made without departing from the principles of the present invention shall also be regarded as falling within the scope of protection of the present invention.

Claims

1. A method for predicting personality traits based on speech, characterized in that the implementation steps are as follows:

1) Establishing a voice personality prediction machine learning model: administering a personality assessment to a plurality of selected reference subjects to obtain multiple personality trait factor score values, which serve as the ground-truth benchmark scores of the reference subjects' personality trait factors; collecting a plurality of speech segments of the reference subjects' normal speech, preprocessing the speech segments and extracting multiple acoustic-prosodic features, and extracting multiple statistical feature values of the acoustic-prosodic features; establishing a voice personality prediction machine learning model that contains the mapping from acoustic-prosodic features to personality trait factor score values, and inputting each reference subject's personality trait factor score values, together with the statistical feature values corresponding to each acoustic-prosodic feature of the speech segments, into the voice personality prediction machine learning model for training;

2) Personality trait prediction: collecting the normal speech of the subject under test to obtain a speech segment to be predicted; preprocessing the speech segment and extracting multiple acoustic-prosodic features and the corresponding statistical features; inputting the acoustic-prosodic features and corresponding statistical features into the voice personality prediction machine learning model to perform regression analysis of the personality trait factor score values, obtaining the personality trait factor score values corresponding to each acoustic-prosodic feature and its statistical features; and performing a weighted summation of all the personality trait factor score values corresponding to the respective acoustic-prosodic features, finally obtaining and outputting the multiple personality trait factor score values of the subject under test.

2. The voice-based personality trait prediction method according to claim 1, characterized in that: in step 1), administering a personality assessment to the plurality of selected reference subjects specifically means administering one of the Big Five personality test, the Minnesota Multiphasic Personality Inventory, or the Cattell 16 personality factor test.

3. The voice-based personality trait prediction method according to claim 2, characterized in that: in step 1) and step 2), the detailed steps of preprocessing a speech segment and extracting multiple acoustic-prosodic features are as follows: the speech segment undergoes pre-emphasis, windowing, framing, and endpoint detection to obtain a preprocessed speech segment; from each preprocessed speech segment, multiple acoustic-prosodic features are extracted from among the Mel-frequency cepstral coefficients, linear prediction cepstral coefficients, perceptual linear prediction coefficients, pitch, the first two formants, energy, voiced segment length, unvoiced segment length, short-time zero-crossing rate, harmonics-to-noise ratio, and long-term average spectrum.

4. The voice-based personality trait prediction method according to claim 3, characterized in that: the multiple statistical feature values of the acoustic-prosodic features extracted in step 1) specifically refer to several of the maximum, minimum, mean, variance, relative entropy, slope, and difference value of each acoustic-prosodic feature.

5. The voice-based personality trait prediction method according to any one of claims 1 to 4, characterized in that: the voice personality prediction machine learning model in step 1) specifically refers to one of a Gaussian-kernel support vector machine statistical model, a logistic regression model, a decision tree model, a least squares model, a perceptron model, a Boost model, a hidden Markov model, a Gaussian mixture model, a neural network model, or a deep learning model.