Basic Knowledge of Python Data Analysis (1)

Fundamentals of Data Analysis (1)

NumPy library

Concept and Function

NumPy is an open-source foundation library for scientific computing in Python, and it underlies data processing and scientific computing libraries such as SciPy and pandas. Compared with plain Python, it offers advantages in both performance and storage.

Advantages:

1] Calculations run in precompiled C code, which makes them faster than equivalent pure-Python code.
2] Arrays use a more efficient storage structure, which improves computational efficiency.

Knowledge system

The ndarray data structure and its characteristics

ufunc (universal functions)

A set of functions for working with arrays; each one applies an operation to the elements of a NumPy array one by one.
Similar to Python's built-in math module, they implement basic array operations such as arithmetic and trigonometric functions.
Common functions:

np.add()
np.subtract()
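
As a minimal sketch of how universal functions behave, the example below applies np.add(), np.subtract(), and np.sin() to small arrays. The array names and values are illustrative assumptions, not taken from the notes.

```python
import numpy as np

# Illustrative arrays (assumed values)
a = np.array([1, 2, 3])
b = np.array([10, 20, 30])

total = np.add(a, b)        # element-wise a + b -> 11, 22, 33
diff = np.subtract(b, a)    # element-wise b - a -> 9, 18, 27

# A trigonometric ufunc, applied to every element
sines = np.sin(np.array([0.0, np.pi / 2]))  # approximately 0.0 and 1.0

print(total.tolist())  # [11, 22, 33]
print(diff.tolist())   # [9, 18, 27]
```

The operators + and - on arrays give the same results as np.add() and np.subtract().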

Other submodules
These include the random number module, the linear algebra module, matrix utilities, and others.
Common functions:

np.random.randint(low, high, size)
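
The following sketch shows one way np.random.randint() might be called. The bounds, the output shape, and the seed value are assumptions chosen for illustration; the generated integers fall in the range from low (inclusive) to high (exclusive).

```python
import numpy as np

# Fixing the seed is optional; it only makes the draw repeatable
np.random.seed(0)

# Draw a 2 x 3 array of integers with 1 <= value < 10
samples = np.random.randint(1, 10, size=(2, 3))

print(samples.shape)  # (2, 3)
print(samples.min() >= 1 and samples.max() < 10)  # True
```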

Import

import numpy as np

Understanding the ndarray array

An ndarray is a multidimensional array object.

Dimensions of data

A dimension is a way of organizing a set of data; to some extent it determines how the data is expressed and what it means.

One-dimensional data:

Usually composed of data organized linearly.
Like a Python list, it can be described along only one direction.

Two-dimensional data:

Composed of multiple pieces of one-dimensional data. A table represented by a nested list can be described along two dimensions: rows and columns.

Three-dimensional data:

Formed by extending one- or two-dimensional data along a new dimension, for example by adding another layer of nesting to the list.
Higher-dimensional data is built in the same way.
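
To make the idea of dimensions concrete, here is a small sketch using plain Python nested lists. The variable names and values are assumptions for illustration only.

```python
# One dimension: a flat list, addressed by a single index
one_d = [1, 2, 3]

# Two dimensions: a list of rows, addressed by row and column
two_d = [[1, 2, 3],
         [4, 5, 6]]

# Three dimensions: one more layer of nesting
three_d = [[[1, 2], [3, 4]],
           [[5, 6], [7, 8]]]

print(one_d[1])            # 2
print(two_d[1][2])         # 6 (second row, third column)
print(three_d[1][0][1])    # 6 (second block, first row, second column)
```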

Create an ndarray array

Syntax:

np.array(list/tuple...)

Usage:

Pass Python data such as a list or tuple as the argument to np.array().

Features:

Passing Python data with a given number of dimensions to array() produces an ndarray with the same number of dimensions.

Note: Unlike a Python list, an ndarray requires all elements to have the same type, so the elements of the input data should share a single data type.
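
A minimal sketch of creating arrays with np.array() follows. The variable names and values are assumptions. The last case shows what NumPy does when the input mixes integers and floats: it converts all elements to one common type rather than keeping them separate.

```python
import numpy as np

from_list = np.array([1, 2, 3])             # 1D array from a list
from_tuple = np.array((1.5, 2.5))           # 1D array from a tuple
nested = np.array([[1, 2, 3], [4, 5, 6]])   # 2D array from a nested list

# Mixed int/float input: every element is stored as a float
mixed = np.array([1, 2.5, 3])

print(nested.ndim)   # 2
print(nested.shape)  # (2, 3)
print(mixed.dtype)   # float64
```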

Characteristics of ndarray arrays

Data type of the elements
The data in a single array is homogeneous: all elements have the same data type.

Attribute: dtype holds the data type of the array's elements; access it to check the data type of an ndarray.

Example:

print(array3.dtype)
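
The notes do not show how array3 is defined, so the sketch below assumes a small integer array. It also shows two common ways to control the data type, astype() and the dtype argument of np.array().

```python
import numpy as np

# Assumed contents; the original notes do not define array3
array3 = np.array([1, 2, 3])
print(array3.dtype)  # an integer type; the exact width depends on the platform

# Convert an existing array to another type (returns a new array)
as_float = array3.astype(np.float64)
print(as_float.dtype)  # float64

# Or request the type when the array is created
explicit = np.array([1, 2, 3], dtype=np.float32)
print(explicit.dtype)  # float32
```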

Dimensional information
Axis and rank

Axis: a dimension along which data is stored, numbered from 0. A one-dimensional array has axis=0; a two-dimensional array has axis=0 and axis=1.

Rank: the number of axes (dimensions). A one-dimensional array has rank 1, and a two-dimensional array has rank 2.

Attributes

shape attribute - Represents the shape of the array; returns a tuple.
array_2.shape
A result of (4, 3) indicates an array with 4 rows and 3 columns. Each axis number is an index into this tuple: the length along axis=0 is 4 and the length along axis=1 is 3, ordered from the outermost level to the innermost.

ndim attribute - Represents the number of axes (dimensions) of the array; returns an integer.

array_2.ndim
A result of 2 means the array has 2 axes, that is, two dimensions.

size attribute - Represents the total number of elements in the array; returns an integer.
array_2.size
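
The sketch below assumes array_2 is a 4 x 3 array holding the numbers 1 to 12 (the notes give only its shape). It prints the three attributes and uses sum() along each axis to show which direction each axis number refers to.

```python
import numpy as np

# Assumed contents: 4 rows, 3 columns, values 1..12
array_2 = np.arange(1, 13).reshape(4, 3)

print(array_2.shape)  # (4, 3)
print(array_2.ndim)   # 2
print(array_2.size)   # 12

# Summing along axis=0 collapses the 4 rows, leaving one value per column
col_sums = array_2.sum(axis=0)   # 22, 26, 30
# Summing along axis=1 collapses the 3 columns, leaving one value per row
row_sums = array_2.sum(axis=1)   # 6, 15, 24, 33
```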

Indexing
Function: Retrieve the elements at specific positions in an array.
Indexing a single element in a one-dimensional array

Syntax: ndarray[index]

Usage: Similar to indexing a list, for example ndarray[1].

Indexing a single element in a multidimensional array

Syntax: ndarray[index 0, index 1, ...]
Usage: Give one index per dimension, separated by commas.

ndarray[1, 1]

Example:
Get the element with the value 2 in the array.

array_1[1]

Get the element in the third row and second column of a two-dimensional array.

array_2[2, 1]
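
A runnable version of these lookups might look like the sketch below. The contents of array_1 and array_2 are assumptions, chosen so that the element at index 1 of array_1 is 2 and array_2 has 4 rows and 3 columns.

```python
import numpy as np

# Assumed contents for illustration
array_1 = np.array([1, 2, 3, 4])
array_2 = np.arange(1, 13).reshape(4, 3)   # rows: [1 2 3], [4 5 6], [7 8 9], [10 11 12]

second = array_1[1]       # 2 (indexes start at 0)
last = array_1[-1]        # 4 (negative indexes count from the end)

cell = array_2[2, 1]      # 8 (third row, second column)
third_row = array_2[2]    # a single index on a 2D array returns a whole row

print(second, last, cell)   # 2 4 8
print(third_row.tolist())   # [7, 8, 9]
```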

Slicing, and combining slices with indexes

Usage: A slice can be applied directly to any dimension, and slices can be mixed with single indexes on other dimensions.
Syntax: ndarray[start index:end index:step]
Notes:
Leaving the left side of the colon empty is equivalent to writing 0; leaving the right side empty means the slice runs through the last element.

End index: the element at the end index is not included in the result.

Step: selects every step-th element; it equals the difference between the indexes of consecutive selected elements and defaults to 1.
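
The slicing rules above can be sketched as follows. The arrays are assumed for illustration: array_1 holds 0 to 9 and array_2 is a 4 x 3 array holding 1 to 12.

```python
import numpy as np

array_1 = np.arange(10)                    # 0, 1, ..., 9
array_2 = np.arange(1, 13).reshape(4, 3)   # 4 rows, 3 columns

head = array_1[:3]        # start omitted -> begins at index 0
tail = array_1[7:]        # end omitted -> runs through the last element
stepped = array_1[2:7:2]  # indexes 2, 4, 6; index 7 itself is excluded

# Mixing a slice with a single index on a 2D array
middle_rows_first_col = array_2[1:3, 0]   # rows 1 and 2, column 0
second_column = array_2[:, 1]             # every row, column 1

print(head.tolist())                   # [0, 1, 2]
print(tail.tolist())                   # [7, 8, 9]
print(stepped.tolist())                # [2, 4, 6]
print(middle_rows_first_col.tolist())  # [4, 7]
print(second_column.tolist())          # [2, 5, 8, 11]
```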

Boolean indexing

Apply a logical operation to the target array to find the elements that satisfy a condition; the matching elements are returned as a one-dimensional array.

Syntax: ndarray[Boolean array]

Usage: Index the target array with a Boolean array, for example ndarray[boolean_array].
Example:

boolean_1 = array_1 < 2
array_1[boolean_1]
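
Here is a self-contained sketch of Boolean indexing. The array contents and the name grid are assumptions; the second case shows that filtering a two-dimensional array also returns a one-dimensional result.

```python
import numpy as np

array_1 = np.array([1, 2, 3, 4])   # assumed contents

boolean_1 = array_1 < 2            # one True/False per element
selected = array_1[boolean_1]      # keeps only the positions marked True

print(boolean_1.tolist())  # [True, False, False, False]
print(selected.tolist())   # [1]

# The same idea on a 2D array: the result is flattened to one dimension
grid = np.array([[1, 5], [7, 2]])
large = grid[grid > 3]
print(large.tolist())      # [5, 7]
```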

Vector Operations

1] When two arrays have the same shape, addition, subtraction, multiplication, and division are applied element by element to the elements in corresponding positions.
2] When two arrays have different shapes, the operation succeeds only if, for each pair of corresponding dimensions, the lengths are equal or one of them is 1.
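
Both cases can be sketched with small arrays, as below. All names and values are assumptions for illustration. NumPy compares shapes starting from the last dimension, so a shape (3,) array lines up with the columns of a shape (2, 3) array, while a shape (2,) array does not.

```python
import numpy as np

a = np.array([[1, 2, 3], [4, 5, 6]])         # shape (2, 3)
b = np.array([[10, 20, 30], [40, 50, 60]])   # shape (2, 3)

# Case 1: same shape, element-by-element arithmetic
same_shape_sum = a + b     # [[11, 22, 33], [44, 55, 66]]
same_shape_quot = b / a    # every entry is 10.0

# Case 2: different shapes whose dimensions are equal or 1
row = np.array([10, 20, 30])     # shape (3,): added to every row
col = np.array([[100], [200]])   # shape (2, 1): added to every column
plus_row = a + row   # [[11, 22, 33], [14, 25, 36]]
plus_col = a + col   # [[101, 102, 103], [204, 205, 206]]

# Incompatible shapes: last dimensions are 3 and 2, and neither is 1
bad = np.array([1, 2])
try:
    a + bad
    failed = False
except ValueError:
    failed = True
print(failed)  # True
```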