Python Data Analysis - NumPy - 4. NumPy Convenience Functions

1. Correlation (np.cov(), diagonal(), trace(), np.abs())

This example uses two sample datasets of end-of-day price data that contain at least the closing price. The first company is BHP Billiton (BHP), whose main business is the extraction of oil, metals, and diamonds. The second is Vale (VALE), also a metals and mining company. Because the two businesses partly overlap, their stocks are a natural candidate for correlation analysis.

import numpy as np
from matplotlib.pyplot import plot
from matplotlib.pyplot import show
bhp = np.loadtxt("BHP.csv", delimiter=",", usecols=(6,), unpack=True)
bhp_returns = np.diff(bhp) / bhp[:-1]
vale = np.loadtxt("VALE.csv", delimiter=",", usecols=(6,), unpack=True)
vale_returns = np.diff(vale) / vale[:-1]
# Covariance describes how two variables vary together; it is the correlation coefficient before normalization. Use cov() to compute the covariance matrix of the stock returns.
covariance = np.cov(bhp_returns, vale_returns)
print("Covariance (covariance matrix) :", covariance)
# Use the diagonal() method to view the elements on the diagonal
a = covariance.diagonal()
print("Covariance Diagonal (elements on the diagonal) :", a)
# Use the trace() method to compute the trace of the matrix, i.e. the sum of the diagonal elements
b = covariance.trace()
print("Covariance Trace (trace of matrix) :", b)
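# The correlation coefficient of two vectors is their covariance divided by the product of their standard deviations.
# The covariance is the off-diagonal element of the matrix; ddof=1 matches the N - 1 normalization that np.cov uses by default.
c = covariance[0, 1] / (np.std(bhp_returns, ddof=1) * np.std(vale_returns, ddof=1))
print("Correlation coefficient between the two vectors :", c)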
# Use corrcoef() to compute the correlation coefficient matrix
d = np.corrcoef(bhp_returns, vale_returns)
# The correlation matrix is symmetric about the diagonal: the correlation of BHP with VALE equals the correlation of VALE with BHP.
print("Correlation coefficient (correlation coefficient matrix) :", d)
# Decide whether the two price series are in sync: if the latest difference lies more than two standard deviations from the mean difference, the two stocks are considered out of sync.
# If they are out of sync, a trade can be opened in the expectation that they will return to sync. Compute the difference between the two closing-price series to check.
difference = bhp - vale
avg = np.mean(difference)
dev = np.std(difference)
e = np.abs(difference[-1] - avg) > 2 * dev  # np.abs() returns the element-wise absolute value of an array
print("Out of sync:", e)
t = np.arange(len(bhp_returns))
plot(t, bhp_returns, lw=1)
plot(t, vale_returns, lw=2)
show()
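
One detail is worth checking when the correlation is computed by hand: np.cov() normalizes by N - 1 by default, while np.std() normalizes by N by default, so both need the same ddof for the result to agree with np.corrcoef(). The following is a minimal sketch on synthetic returns (the data and the names x, y, manual, and reference are made up for illustration and do not come from the BHP/VALE files).

```python
import numpy as np

# Synthetic "returns" (made-up data, not the BHP/VALE files).
rng = np.random.default_rng(0)
x = rng.normal(0.0, 0.01, size=50)
y = 0.8 * x + rng.normal(0.0, 0.005, size=50)

cov = np.cov(x, y)                    # 2x2 matrix, normalized by N - 1
std_x = np.std(x, ddof=1)             # ddof=1 matches np.cov's default
std_y = np.std(y, ddof=1)

manual = cov[0, 1] / (std_x * std_y)  # the off-diagonal element holds cov(x, y)
reference = np.corrcoef(x, y)[0, 1]

print("manual:", manual)
print("corrcoef:", reference)
print("match:", np.isclose(manual, reference))
```

With mismatched ddof values the two numbers differ by a factor of N / (N - 1), which is small for long series but noticeable for short ones.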

2. Polynomials (np.polyfit(), np.polyval(), np.roots(), np.polyder())

In calculus, a Taylor expansion represents a sufficiently smooth function as an infinite series. In practice, a function that is differentiable N times can be approximated near a point by a polynomial of degree N, with the higher-order terms treated as negligibly small.
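
As a small illustration of that idea, the sketch below truncates the Taylor series of exp(x) around 0 at increasing degrees and measures the worst-case error on [-1, 1]. The choice of exp and of the interval is an assumption made only for this example.

```python
import math
import numpy as np

x = np.linspace(-1.0, 1.0, 201)
exact = np.exp(x)

max_errors = []
for degree in range(1, 6):
    # Taylor coefficients of exp at 0 are 1/k!; np.polyval expects the highest power first.
    coeffs = [1.0 / math.factorial(k) for k in range(degree, -1, -1)]
    approx = np.polyval(coeffs, x)
    max_errors.append(np.max(np.abs(approx - exact)))
    print("degree", degree, "max error", max_errors[-1])
```

The error shrinks as the degree grows, which is the sense in which the higher-order terms can be neglected.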

# NumPy's polyfit() function fits a polynomial to a series of data points, whether or not those points come from a continuous function
import numpy as np
from matplotlib.pyplot import plot
from matplotlib.pyplot import show
bhp = np.loadtxt("BHP.csv", delimiter=",", usecols=(6,), unpack=True)
vale = np.loadtxt("VALE.csv", delimiter=",", usecols=(6,), unpack=True)
t = np.arange(len(bhp))
poly = np.polyfit(t, bhp - vale, 5)  # fit a degree-5 polynomial to the difference between the two closing-price series
print("Polynomial fit (the fitted result is the coefficient of the polynomial) :", poly)
# Extrapolate to the next value
a = np.polyval(poly, t[-1]+1)
print("Next value (extrapolated) :", a)
# Use roots() to find the points where the fitted polynomial equals 0
b = np.roots(poly)
print("Roots (zeros of the polynomial) :", b)
# Use polyder() to differentiate the polynomial (returns the coefficients of the derivative)
der = np.polyder(poly)
print("Derivative (coefficients of the derivative) :", der)
# Extrema are maxima or minima of a function and lie where its derivative is 0
ext = np.roots(der)
print("Extremas (extreme points) :", ext)
# Check the result by evaluating the polynomial with polyval()
vals = np.polyval(poly, t)
print("maximum value point", np.argmax(vals))
print("minimum value point", np.argmin(vals))
plot(t, bhp - vale)
plot(t, vals)
show()
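
To see how polyfit(), polyder(), roots(), and polyval() fit together, it can help to run them on a polynomial whose answer is known in advance. The sketch below uses a hypothetical cubic, p(x) = x^3 - 6x^2 + 9x + 1, sampled without noise; the second derivative is used to tell maxima from minima, which the listing above does not do.

```python
import numpy as np

# Hypothetical cubic p(x) = x^3 - 6x^2 + 9x + 1, sampled without noise.
true_coeffs = np.array([1.0, -6.0, 9.0, 1.0])
x = np.linspace(-1.0, 5.0, 25)
y = np.polyval(true_coeffs, x)

fit = np.polyfit(x, y, 3)                # coefficients, highest power first
der = np.polyder(fit)                    # first derivative: 3x^2 - 12x + 9
critical = np.sort(np.roots(der).real)   # both roots are real for this cubic
second = np.polyval(np.polyder(der), critical)  # second derivative at each root

for point, curvature in zip(critical, second):
    kind = "minimum" if curvature > 0 else "maximum"
    print(f"x = {point:.3f}: local {kind}")
```

For real price data the roots of the derivative may be complex or fall outside the sampled range, so they are candidates to inspect rather than guaranteed turning points.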

#smooth the data before fitting 
import numpy as np
from matplotlib.pyplot import plot
from matplotlib.pyplot import show
bhp = np.loadtxt("BHP.csv", delimiter=",", usecols=(6,), unpack=True)
vale = np.loadtxt("VALE.csv", delimiter=",", usecols=(6,), unpack=True)
c = bhp-vale
N = 3
weights = np.ones(N) / N
sma = np.convolve(weights, c)[N-1:-N+1]
t = np.arange(len(sma))
poly = np.polyfit(t, sma, 5)  # fit a degree-5 polynomial to the smoothed difference between the two closing-price series
print("Polynomial fit (the fitted result is the coefficient of the polynomial) :", poly)
# Use polyder() to differentiate the polynomial (returns the coefficients of the derivative)
der = np.polyder(poly)
print("Derivative (coefficients of the derivative) :", der)
# Extrema are maxima or minima of a function and lie where its derivative is 0
ext = np.roots(der)
print("Extremas (extreme points) :", ext)
# Check the result by evaluating the polynomial with polyval()
vals = np.polyval(poly, t)
print("maximum value point", np.argmax(vals))
print("minimum value point", np.argmin(vals))
plot(t, sma)
plot(t, vals)
show()
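
The slice [N-1:-N+1] applied to the convolution result keeps only the positions where the window fully overlaps the data. A minimal sketch on a made-up series shows that this matches np.convolve() with mode="valid", which avoids the manual slicing.

```python
import numpy as np

data = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])    # made-up series
N = 3
weights = np.ones(N) / N                            # equal weights: simple moving average

sliced = np.convolve(weights, data)[N - 1:-N + 1]   # trim the partial-overlap edges by hand
valid = np.convolve(weights, data, mode="valid")    # same values, no slicing needed

print(sliced)
print(valid)
print(len(data), "->", len(valid))
```

The smoothed series is N - 1 elements shorter than the input. Note that the slicing form returns an empty array when N is 1, whereas mode="valid" does not.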

3. Net trading volume (np.sign(), np.piecewise(), np.array_equal())

Volume is an important variable in investing because it indicates the magnitude of price movements. On-Balance Volume (OBV) is one of the simplest stock indicators: it is calculated from the current day's closing price, the previous day's closing price, and the current day's trading volume. Each day's volume is multiplied by a sign determined by the change in closing price: positive if the price rose, negative if it fell.

# Load the BHP closing prices and trading volumes into separate arrays
import numpy as np
c, v = np.loadtxt("BHP.csv", delimiter=",", usecols=(6, 7), unpack=True)
# diff() computes the difference between consecutive array elements and returns an array of those differences
change = np.diff(c)
print("Change", change)
print(len(c))
print(len(change))
# 1. sign() returns the sign of each array element: -1 for negative, 1 for positive, and 0 otherwise.
signs = np.sign(change)
print("Signs:", signs)
# 2. piecewise() can produce the same signs: it evaluates a function piece by piece, given a list of conditions and the corresponding return values
pieces = np.piecewise(change, [change < 0, change > 0], [-1, 1])
print("Pieces:", pieces)
# Check that the two outputs agree
equal = np.array_equal(signs, pieces)  # True if the two arrays have the same shape and elements
print("Arrays equal?", equal)
# OBV depends on the previous day's closing price, so no OBV value can be computed for the first day in this example
obv = v[1:] * signs
print("On Balance volume:", obv)
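
A small worked example with made-up prices and volumes makes the sign rule concrete. The last step, a running total with np.cumsum(), is an assumption added here: the listing above stops at the per-day signed volume, while OBV as it is conventionally charted accumulates those values over time.

```python
import numpy as np

# Made-up closing prices and volumes (not from BHP.csv).
close = np.array([10.0, 10.5, 10.5, 10.2, 10.8])
volume = np.array([1000.0, 1200.0, 900.0, 1500.0, 1100.0])

signs = np.sign(np.diff(close))         # +1 when the close rose, -1 when it fell, 0 when unchanged
signed_volume = volume[1:] * signs      # per-day signed volume; the first day has no previous close
running_obv = np.cumsum(signed_volume)  # assumption: conventional OBV keeps a running total

print("Signs:", signs)
print("Signed volume:", signed_volume)
print("Running OBV:", running_obv)
```

The day with an unchanged close contributes nothing, so the running total stays flat on that day.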

4. Data smoothing (np.hanning(), np.polysub(), np.isreal(), np.select(), np.trim_zeros())

import numpy as np
from matplotlib.pyplot import plot
from matplotlib.pyplot import show
N = 8  # window length
weights1 = np.hanning(N)  # hanning() computes the weights of a Hanning window (a weighted cosine window) of length N
weights2 = np.hamming(N)  # Hamming window
weights3 = np.blackman(N)  # Blackman window
weights4 = np.bartlett(N)  # Bartlett window
weights5 = np.kaiser(N, 3)  # Kaiser window (beta = 3)
print("Weights1", weights1)
print("Weights2", weights2)
print("Weights3", weights3)
print("Weights4", weights4)
print("Weights5", weights5)
# Use convolve() to smooth the BHP and VALE stock returns, passing the normalized weights as an argument
bhp = np.loadtxt('BHP.csv', delimiter=',', usecols=(6,), unpack=True)
bhp_returns = np.diff(bhp) / bhp[: -1]
smooth_bhp1 = np.convolve(weights1/weights1.sum(), bhp_returns)[N-1:-N+1]
smooth_bhp2 = np.convolve(weights2/weights2.sum(), bhp_returns)[N-1:-N+1]
smooth_bhp3 = np.convolve(weights3/weights3.sum(), bhp_returns)[N-1:-N+1]
smooth_bhp4 = np.convolve(weights4/weights4.sum(), bhp_returns)[N-1:-N+1]
smooth_bhp5 = np.convolve(weights5/weights5.sum(), bhp_returns)[N-1:-N+1]
vale = np.loadtxt('VALE.csv', delimiter=',', usecols=(6,), unpack=True)
vale_returns = np.diff(vale) / vale[: -1]
smooth_vale1 = np.convolve(weights1/weights1.sum(), vale_returns)[N-1:-N+1]
smooth_vale2 = np.convolve(weights2/weights2.sum(), vale_returns)[N-1:-N+1]
smooth_vale3 = np.convolve(weights3/weights3.sum(), vale_returns)[N-1:-N+1]
smooth_vale4 = np.convolve(weights4/weights4.sum(), vale_returns)[N-1:-N+1]
smooth_vale5 = np.convolve(weights5/weights5.sum(), vale_returns)[N-1:-N+1]
# Fit polynomials to the smoothed data
K = 5
t = np.arange(N - 1, len(bhp_returns))
poly_bhp1 = np.polyfit(t, smooth_bhp1, K)
poly_bhp2 = np.polyfit(t, smooth_bhp2, K)
poly_bhp3 = np.polyfit(t, smooth_bhp3, K)
poly_bhp4 = np.polyfit(t, smooth_bhp4, K)
poly_bhp5 = np.polyfit(t, smooth_bhp5, K)
poly_vale1 = np.polyfit(t, smooth_vale1, K)
poly_vale2 = np.polyfit(t, smooth_vale2, K)
poly_vale3 = np.polyfit(t, smooth_vale3, K)
poly_vale4 = np.polyfit(t, smooth_vale4, K)
poly_vale5 = np.polyfit(t, smooth_vale5, K)
# Find where the two polynomials take equal values, i.e. where they intersect.
# This is equivalent to subtracting one polynomial from the other and finding the roots of the result. Use polysub() to subtract the polynomials.
poly_sub1 = np.polysub(poly_bhp1, poly_vale1)
xpoints1 = np.roots(poly_sub1)
print("Intersection points1", xpoints1)
poly_sub2 = np.polysub(poly_bhp2, poly_vale2)
xpoints2 = np.roots(poly_sub2)
print("Intersection points2", xpoints2)
poly_sub3 = np.polysub(poly_bhp3, poly_vale3)
xpoints3 = np.roots(poly_sub3)
print("Intersection points3", xpoints3)
poly_sub4 = np.polysub(poly_bhp4, poly_vale4)
xpoints4 = np.roots(poly_sub4)
print("Intersection points4", xpoints4)
poly_sub5 = np.polysub(poly_bhp5, poly_vale5)
xpoints5 = np.roots(poly_sub5)
print("Intersection points5", xpoints5)
# Complex numbers in the result are awkward for further processing; use isreal() to test whether each array element is real
reals1 = np.isreal(xpoints1)
print("Real1 number?", reals1)
reals2 = np.isreal(xpoints2)
print("Real2 number?", reals2)
reals3 = np.isreal(xpoints3)
print("Real3 number?", reals3)
reals4 = np.isreal(xpoints4)
print("Real4 number?", reals4)
reals5 = np.isreal(xpoints5)
print("Real5 number?", reals5)
# select() picks elements according to a list of conditions and returns an array; elements that meet no condition are set to 0
# Keep the real intersection points
xpoints1 = np.select([reals1], [xpoints1])
xpoints1 = xpoints1.real    # take the real part
print("Real1 intersection points", xpoints1)
xpoints2 = np.select([reals2], [xpoints2])
xpoints2 = xpoints2.real
print("Real2 intersection points", xpoints2)
xpoints3 = np.select([reals3], [xpoints3])
xpoints3 = xpoints3.real
print("Real3 intersection points", xpoints3)
xpoints4 = np.select([reals4], [xpoints4])
xpoints4 = xpoints4.real
print("Real4 intersection points", xpoints4)
xpoints5 = np.select([reals5], [xpoints5])
xpoints5 = xpoints5.real
print("Real5 intersection points", xpoints5)
# Remove the zeros: trim_zeros() strips leading and trailing zeros from a one-dimensional array
print("Sans1 0s", np.trim_zeros(xpoints1))
print("Sans2 0s", np.trim_zeros(xpoints2))
print("Sans3 0s", np.trim_zeros(xpoints3))
print("Sans4 0s", np.trim_zeros(xpoints4))
print("Sans5 0s", np.trim_zeros(xpoints5))
# The thin lines in the plot are the raw stock returns and the thick lines are the smoothed results. Points where the lines cross may mark turning points in the price trend.
plot(t, bhp_returns[N-1:], lw=1.0)
plot(t, smooth_bhp1, lw=2.0)
plot(t, smooth_bhp2, lw=2.0)
plot(t, smooth_bhp3, lw=2.0)
plot(t, smooth_bhp4, lw=2.0)
plot(t, smooth_bhp5, lw=2.0)
plot(t, vale_returns[N-1:], lw=1.0)
plot(t, smooth_vale1, lw=2.0)
plot(t, smooth_vale2, lw=2.0)
plot(t, smooth_vale3, lw=2.0)
plot(t, smooth_vale4, lw=2.0)
plot(t, smooth_vale5, lw=2.0)
show()
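
The real-intersection step can also be written with a boolean mask. This is a sketch of an alternative, not the approach used in the listing above: np.select() replaces complex roots with 0 and np.trim_zeros() only strips zeros at the two ends of the array, so indexing with the np.isreal() mask drops the complex roots directly. The polynomials p and q are hypothetical and chosen so the roots are known.

```python
import numpy as np

# Hypothetical polynomials: p(x) = x^3 and q(x) = 1, coefficients highest power first.
p = np.array([1.0, 0.0, 0.0, 0.0])
q = np.array([1.0])

diff = np.polysub(p, q)              # x^3 - 1; the shorter input is treated as lower-order terms
roots = np.roots(diff)               # one real root and a complex-conjugate pair
real_mask = np.isreal(roots)         # True where the imaginary part is exactly 0
real_roots = roots[real_mask].real   # boolean indexing keeps only the real roots

print("All roots:", roots)
print("Real intersections:", real_roots)
```

If roots that are real only up to rounding error should also count, a tolerance test such as np.isclose(roots.imag, 0.0) could replace np.isreal(); which threshold is appropriate depends on the data.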

Summary

The correlation between the returns of two stocks is calculated with np.corrcoef(). The diagonal() and trace() methods return the diagonal elements and the trace of a matrix, respectively. np.polyfit() fits a polynomial to a series of data points, np.polyval() evaluates a polynomial, np.roots() finds its roots, and np.polyder() computes its derivative.
