In recent years, neural networks have become one of the most popular methods for solving various tasks such as image classification, time series forecasting, natural language processing, content generation, and more. They can "learn" to extract features from data and make decisions based on these features, making them particularly useful in the field of artificial intelligence.
Thanks to its simplicity and rich ecosystem of machine learning libraries, Python is one of the most popular programming languages for creating neural networks. In this article, we will provide a step-by-step guide to creating a simple neural network in Python, starting from the basic concepts of neural networks and moving on to the practical creation and training of a model.
To proceed, you should have Python (version 3.5 or later) and pip installed (pip usually comes bundled with Python).
In this section, we will cover the basic information related to neural networks, including:
Neural Network Architecture: We will discuss the main types of neural network architectures such as perceptrons, convolutional, and recurrent neural networks, and their applications.
Weights and Biases: We will explore how neural networks extract features from input data by learning weights and biases that let them make decisions based on those features.
Activation Functions: We will look at various activation functions used in neural networks and their role in controlling the output of neurons.
Loss Functions and Optimization: We will review different loss functions used to measure the error of the neural network, as well as various optimization methods used to update the weights and biases of the neural network during training.
The architecture of a neural network describes its structure and defines how it processes input data and generates output values. There are several types of neural network architectures, each designed to solve specific tasks.
A perceptron is one of the simplest types of neural networks, consisting of one or more layers of neurons. Each neuron in a perceptron has its own weights and a bias, allowing it to process input data and produce an output value.
Perceptrons are often used for classification tasks, such as determining whether an image is of a cat or a dog.
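As a rough sketch of this idea (the weights, bias, and inputs below are arbitrary values chosen only for illustration), a single perceptron neuron combines its inputs with weights and a bias and applies a threshold to the result:
import numpy as np

# A single perceptron neuron: weighted sum of the inputs plus a bias, then a step threshold
weights = np.array([0.7, -0.2, 0.5])
bias = -0.3
inputs = np.array([1.0, 0.0, 1.0])

weighted_sum = np.dot(weights, inputs) + bias
prediction = 1 if weighted_sum > 0 else 0  # e.g. 1 could stand for "cat", 0 for "dog"
print(prediction)  # 1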
Convolutional neural networks are particularly well-suited for image processing. They have multiple layers, including convolutional, pooling, and fully connected layers.
Convolutional layers are used to extract features from images, pooling layers reduce the dimensionality of the output data, and fully connected layers are used to make the final decision based on the extracted features.
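To give a feel for what a convolutional layer computes, here is an illustrative NumPy sketch (not part of the network we build later; the image values and the kernel are made up) of a single convolution over a small grayscale image:
import numpy as np

def convolve2d(image, kernel):
    # Slide the kernel over the image and sum elementwise products ("valid" convolution,
    # written as cross-correlation, which is what CNN layers compute in practice)
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    feature_map = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            feature_map[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return feature_map

image = np.random.rand(5, 5)        # toy 5x5 grayscale image
kernel = np.array([[1, 0, -1],
                   [1, 0, -1],
                   [1, 0, -1]])     # simple vertical-edge detector
print(convolve2d(image, kernel))    # 3x3 feature map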
Recurrent neural networks are a type of neural network used for sequential data, such as audio signals or text data. Recurrent layers in these neural networks allow the network to remember information from previous steps and use it to make decisions at the current step. This enables recurrent neural networks to work with data of varying lengths and predict subsequent values in a sequence.
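As a minimal sketch of one recurrent step (the dimensions and random weights here are arbitrary, purely for illustration), the new hidden state is computed from the current input and the previous hidden state:
import numpy as np

np.random.seed(0)
input_size, hidden_size = 4, 3
W_x = np.random.randn(hidden_size, input_size)   # input-to-hidden weights
W_h = np.random.randn(hidden_size, hidden_size)  # hidden-to-hidden weights
b = np.zeros(hidden_size)

def rnn_step(x_t, h_prev):
    # The new hidden state mixes the current input with the previous state,
    # which is how the network "remembers" earlier steps of the sequence
    return np.tanh(W_x @ x_t + W_h @ h_prev + b)

h = np.zeros(hidden_size)                        # initial hidden state
sequence = [np.random.randn(input_size) for _ in range(5)]
for x_t in sequence:
    h = rnn_step(x_t, h)
print(h)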
When a neural network receives input data, it passes through several layers of neurons. Each neuron processes the data and produces some output, which is passed to the next layer of neurons. For a neural network to work correctly, it must learn to extract features from the data, determining which input values are most important for making a decision.
To achieve this, each neuron in the neural network has its own weights and a bias. The weights determine how much each input contributes to the neuron's output, while the bias lets the neuron shift its output independently of the inputs.
During training, the neural network adjusts the values of weights and biases to minimize the output error. This is done using various optimization methods, such as stochastic gradient descent, and different loss functions that measure the neural network's output error.
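To make this concrete, here is a minimal sketch of what a single neuron computes (the numbers are arbitrary): a weighted sum of its inputs plus a bias, passed through an activation function.
import numpy as np

x = np.array([0.5, -1.2, 3.0])  # input values
w = np.array([0.8, 0.1, -0.4])  # weights: how much each input matters
b = 0.2                         # bias: shifts the neuron's output

z = np.dot(w, x) + b            # weighted sum plus bias
output = 1 / (1 + np.exp(-z))   # sigmoid activation squashes z into (0, 1)
print(output)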
The activation function plays a key role in the operation of the neural network. It is applied to the output of each neuron and determines whether the neuron should be activated and pass its value to the next layer of neurons.
There are several types of activation functions, but one of the most popular is the ReLU (Rectified Linear Unit) function. It has the form f(x) = max(0, x): the neuron passes its value through if it is positive and outputs zero otherwise.
Other activation functions, such as the sigmoid, are also used in neural networks, but the sigmoid saturates for large positive or negative inputs, which makes it less effective than ReLU in deep networks.
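For reference, both functions are one-liners in NumPy (a small standalone illustration, separate from the example we build later):
import numpy as np

def relu(x):
    # ReLU passes positive values through unchanged and zeroes out negative ones
    return np.maximum(0, x)

def sigmoid(x):
    # Sigmoid squashes any value into the range (0, 1)
    return 1 / (1 + np.exp(-x))

x = np.array([-2.0, -0.5, 0.0, 1.5])
print(relu(x))     # [0.  0.  0.  1.5]
print(sigmoid(x))  # values strictly between 0 and 1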
When creating your neural network in Python, you need to choose an activation function appropriate for the task you want to solve. A poor choice can lead to problems such as vanishing gradients.
After choosing an activation function, you need to select a loss function that will measure the neural network's error during training. The loss function also depends on the task: for example, cross-entropy is a common choice for classification, while mean squared error is typical for regression.
Moreover, you need to choose an optimization method for training the neural network. An optimizer is used to change the weights of the neural network during training to minimize the loss function. The stochastic gradient descent (SGD) algorithm is one of the most popular optimizers. It updates the neural network weights in the direction opposite to the gradient of the loss function.
There are other optimization methods, such as Adam and Adagrad, which may be more effective in some cases. When choosing an optimizer, you should also consider the task and characteristics of the data.
Choosing the right loss function and optimizer is an important step in creating a neural network. They should be selected according to the task and data characteristics to ensure the best performance of the neural network.
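As a rough sketch of these ideas (the weights, gradient, and learning rate below are made-up numbers for illustration), a loss function and a single SGD update step look like this:
import numpy as np

def mse_loss(y_true, y_pred):
    # Mean squared error: average squared difference between targets and predictions
    return np.mean((y_true - y_pred) ** 2)

def binary_cross_entropy(y_true, y_pred):
    # Cross-entropy for binary classification (predictions must lie in (0, 1))
    return -np.mean(y_true * np.log(y_pred) + (1 - y_true) * np.log(1 - y_pred))

# One SGD step: move the weights against the gradient of the loss
learning_rate = 0.1
weights = np.array([0.5, -0.3])
gradient = np.array([0.2, -0.1])     # gradient of the loss with respect to the weights
weights -= learning_rate * gradient  # w <- w - lr * dL/dw
print(weights)                       # [ 0.48 -0.29]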
Let's consider creating a simple neural network in Python to solve a specific task. We'll take a table with 4 columns: 3 of them will be inputs, and the last one will be the output.
| First Input | Second Input | Third Input | Result |
|-------------|--------------|-------------|--------|
| 1           | 0            | 0           | 1      |
| 1           | 1            | 1           | 1      |
| 0           | 1            | 1           | 0      |
| 1           | 0            | 1           | 1      |
| 0           | 0            | 1           | 0      |
| 1           | 1            | 1           | ?      |
In this case, the result is always equal to the value of the first input. We will create and train a neural network that will predict the expected result based on three inputs. Our neural network will be a perceptron with the following architecture:
The neural network receives 3 parameters as input. During training, the neural network will choose the optimal weights. In the end, the neuron will process the weighted parameters through an activation function.
Before starting, we need to install the NumPy module. NumPy is a library for mathematical operations, including matrix operations, which are particularly important in neural networks.
Open a terminal in your system or development environment and execute the following command:
pip install numpy
And import the library into the code:
import numpy as np
Let's write the sigmoid activation function:
def sigmoid(x):
    return 1 / (1 + np.exp(-x))
There are various functions, and as mentioned earlier, their selection is important when designing large neural networks. In our case, the choice is not as critical.
Now, let's create an array with the training data:
training_inputs = np.array([[0, 0, 1],
                            [1, 1, 1],
                            [1, 0, 1],
                            [0, 1, 1]])
training_outputs = np.array([[0, 1, 1, 0]]).T
The training_inputs variable stores the input data, and training_outputs stores the corresponding outputs.
Let's choose random weights:
np.random.seed(1)
synaptic_weights = 2 * np.random.random((3, 1)) - 1
And train the neural network:
for i in range(10000):
    input_layer = training_inputs
    # Forward pass: weighted sum of the inputs passed through the sigmoid
    outputs = sigmoid(np.dot(input_layer, synaptic_weights))
    # Error: difference between the expected and the predicted outputs
    err = training_outputs - outputs
    # Adjustments: the error, scaled by the sigmoid derivative, projected back onto the inputs
    adjustments = np.dot(input_layer.T, err * outputs * (1 - outputs))
    synaptic_weights += adjustments

print("Weights after training")
print(synaptic_weights)
Output:
Weights after training
[[14.09344278]
[-0.18776581]
[-4.50486337]]
What happens in this code segment: on each iteration the network computes its outputs, measures the error against the expected results, and adjusts the weights to reduce that error, gradually converging on suitable values.
print("Result")
print(outputs)
Output:
Result
[[0.01093477]
[0.99991734]
[0.99993149]
[0.00907983]]
To verify that the trained network works, let's pass an input to it and look at the prediction:
new_input = np.array([[0, 0, 1]])
print(sigmoid(np.dot(new_input, synaptic_weights)))
Output:
[[0.01093422]]
In this material, we reviewed the basic concepts of neural networks and created a simple neural network in Python. This example demonstrates how easy it is to develop and train neural networks in Python and how you can use them to solve various tasks such as classification, regression, and content generation.
If you want to build a web service using Python, you can rent a cloud server at competitive prices with Hostman.