How to Install and Set Up PyTorch
PyTorch is a free, open-source deep learning library. With its help, a computer can detect objects, classify images, generate text, and perform other complex tasks.
PyTorch also offers a rich ecosystem of tools that support and accelerate AI development and research. In this article, we will cover only the basics: we will learn how to install PyTorch and verify that it works.
To work with PyTorch, you will need:
At least 1 GB of RAM.
Installed Python 3 and pip.
A configured local development environment.
Deep knowledge of machine learning is not required for this tutorial. It is assumed that you are familiar with basic Python terms and concepts.
Installing PyTorch
We will work in a Windows environment, but from the command line. This makes the tutorial nearly universal: most of the same commands work on Linux and macOS as well.
First, create a workspace for your PyTorch project.
Navigate to the directory where you want to place the new folder and create it:
mkdir pytorch
Inside the pytorch directory, create a new virtual environment. This is necessary to isolate projects and, if needed, use different library versions.
python3 -m venv virtualpytorch
To activate the virtual environment, first go to the newly created directory:
cd virtualpytorch
Inside, there is a Scripts folder (on Windows) or bin (on Linux and macOS). Navigate to it:
cd Scripts
Activate the virtual environment by running the activation script in the terminal:
activate.bat
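On Linux and macOS there is no .bat file; the equivalent steps use source. A sketch of the full sequence, assuming python3 is on your PATH:

```shell
# create a project folder and an isolated virtual environment (Linux/macOS)
mkdir -p pytorch
python3 -m venv pytorch/virtualpytorch
# activate it; the prompt changes to show the environment name
source pytorch/virtualpytorch/bin/activate
```

Run deactivate at any time to leave the environment.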
The workspace is now ready. The next step is to install the PyTorch library.
The easiest way to find the installation command is to check the official website. There is a convenient form where you select the required parameters.
As an example, install the stable version for Windows using CPU via pip. Select these parameters in the form, and you will get the necessary command:
pip3 install torch torchvision torchaudio
Copy the command and run it in the Windows command line. Along with torch itself, it installs two companion libraries:
torchvision – contains popular datasets, model architectures, and image transformations for computer vision.
torchaudio – a library for processing audio and signals using PyTorch, providing input/output functions, signal processing, datasets, model implementations, and application components.
This is the standard setup often used when first exploring the library.
The method described above is not the only way to install PyTorch. If Anaconda is installed on Windows, you can use its graphical interface. If your computer has NVIDIA GPUs, you can select the CUDA version instead of CPU. In that case, the installation command will be different.
All possible local installation methods are listed in the official documentation. You can also find commands for installing older versions of the library there. To install them, just select the required version and install it the same way as the current package builds.
You don't need to write a script to check if the library is working. The Python interpreter has enough capabilities to perform basic operations.
If you have successfully installed PyTorch in the previous steps, then launching the Python interpreter won’t be an issue. Run the following command in the command line:
python
Then enter the following code:
import torch
x = torch.rand(5, 3)
print(x)
You should see an output similar to this:
tensor([[0.0925, 0.3696, 0.4949],
        [0.0240, 0.2642, 0.1545],
        [0.7274, 0.4975, 0.0753],
        [0.4438, 0.9685, 0.5022],
        [0.4757, 0.6715, 0.4298]])
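If the tensor prints correctly, you can try a couple of basic operations as a further sanity check. A minimal sketch; the exact numbers will differ because the tensor is random:

```python
import torch

x = torch.rand(5, 3)   # random 5x3 tensor
y = x + 1              # element-wise addition keeps the shape
z = x @ x.T            # matrix product of 5x3 and 3x5 gives 5x5

print(y.shape, z.shape)
```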
Now, you can move on to solving more complex tasks.
PyTorch Usage Example
To make learning basic concepts more engaging, let’s do it in practice. For example, let’s create a neural network using PyTorch that can recognize the digit shown in an image.
Prerequisites
To create a neural network, we need the following imports:
import torch
import torchvision
import torch.nn.functional as F
import matplotlib.pyplot as plt
import torch.nn as nn
import torch.optim as optim
from torchvision import transforms, datasets
All of these are standard PyTorch modules, plus Matplotlib. They handle data loading, image transformations, neural network construction, optimization, and plotting.
Loading and Transforming Data
We will train the neural network on the MNIST dataset, which contains 70,000 images of handwritten digits.
60,000 images will be used for training.
10,000 images will be used for testing.
Each image is 28 × 28 pixels.
Each image has a label representing the digit (e.g., 1, 2, 5, etc.).
train = datasets.MNIST("", train=True, download=True,
                       transform=transforms.Compose([transforms.ToTensor()]))
test = datasets.MNIST("", train=False, download=True,
                      transform=transforms.Compose([transforms.ToTensor()]))

trainset = torch.utils.data.DataLoader(train, batch_size=15, shuffle=True)
testset = torch.utils.data.DataLoader(test, batch_size=15, shuffle=True)
First, we divide the data into training and testing sets by setting train=True/False.
The test set must contain data that the machine has not seen before. Otherwise, the neural network’s performance would be biased.
Setting shuffle=True helps reduce bias and overfitting.
Imagine that the dataset contains many consecutive "1"s. If the machine gets too good at recognizing only the digit 1, it might struggle to recognize other numbers. Shuffling the data prevents the model from overfitting specific patterns and ensures a more generalized learning process.
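What the DataLoader's shuffling and batching do can be illustrated with a small pure-Python sketch (a simplified stand-in for illustration, not the real DataLoader):

```python
import random

def simple_loader(data, batch_size, shuffle=True):
    """Yield the dataset in batches, optionally in random order."""
    indices = list(range(len(data)))
    if shuffle:
        random.shuffle(indices)  # a fresh order each epoch reduces ordering bias
    for start in range(0, len(indices), batch_size):
        yield [data[i] for i in indices[start:start + batch_size]]

samples = list(range(100))  # stand-in for 100 images
batches = list(simple_loader(samples, batch_size=15))
print(len(batches), len(batches[0]), len(batches[-1]))  # 7 15 10
```

With 100 samples and batch_size=15, you get six full batches and a final smaller one of 10; every sample appears exactly once per epoch.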
Definition and Initialization of the Neural Network
The next step is defining the neural network:
class NeuralNetwork(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(784, 86)
        self.fc2 = nn.Linear(86, 86)
        self.fc3 = nn.Linear(86, 86)
        self.fc4 = nn.Linear(86, 10)

    def forward(self, x):
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        x = F.relu(self.fc3(x))
        x = self.fc4(x)
        return F.log_softmax(x, dim=1)

model = NeuralNetwork()
The neural network consists of four layers: one input layer, two hidden layers, and one output layer. nn.Linear is a fully connected (linear) layer, the simplest building block of a neural network.
For each layer, it is necessary to specify the number of inputs and outputs. The output number of one layer becomes the input for the next layer.
The input layer has 784 nodes. This is the result of multiplying 28 × 28 (the image size in pixels).
The first hidden layer has 86 output nodes, so the input of the next layer must be 86 as well. The same logic applies to the subsequent layers. 86 is an arbitrary choice; you can use a different value.
The output layer contains 10 nodes because the images represent digits from 0 to 9.
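These layer sizes fix the number of trainable parameters. A quick sanity check in plain Python (each layer has one weight per input-output pair plus one bias per output node):

```python
# (inputs, outputs) of each nn.Linear layer from the model above
layers = [(784, 86), (86, 86), (86, 86), (86, 10)]

# each layer contributes in*out weights plus `out` biases
total_params = sum(n_in * n_out + n_out for n_in, n_out in layers)
print(total_params)  # 83344
```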
Each time data passes through a layer, it is processed by an activation function.
There are several activation functions. In this example, we use ReLU (Rectified Linear Unit). This function returns 0 if the value is negative or the value itself if it is positive.
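In plain Python, ReLU is just a maximum with zero:

```python
def relu(value):
    """Rectified Linear Unit: negative inputs become 0, positive ones pass through."""
    return max(0.0, value)

print(relu(-2.5), relu(3.0))  # 0.0 3.0
```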
The softmax function is used at the output layer to normalize the raw scores into probabilities that sum to 1 across the 10 classes (our model returns log_softmax, the logarithm of these probabilities, which pairs with the nll_loss used below). For example, the model might assign an 80% probability to the digit 1 and spread the remaining 20% across the other digits. The class with the highest probability is selected as the final prediction.
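A minimal pure-Python softmax shows how raw scores become probabilities:

```python
import math

def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    exps = [math.exp(s) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

probs = softmax([2.0, 1.0, 0.1])
print(probs)       # roughly [0.66, 0.24, 0.10]
print(sum(probs))  # sums to 1 (up to float rounding)
```

The largest score always gets the largest probability, which is why taking argmax of the output picks the predicted digit.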
Training
The next step is training.
optimizer = optim.Adam(model.parameters(), lr=0.001)
EPOCHS = 3

for epoch in range(EPOCHS):
    for data in trainset:
        X, y = data
        model.zero_grad()
        output = model(X.view(-1, 28 * 28))
        loss = F.nll_loss(output, y)
        loss.backward()
        optimizer.step()
    print(loss)
On each step, the model's prediction is compared to the true label to compute the loss, loss.backward() computes the gradients, and the optimizer adjusts the weights to reduce the loss. Repeating this cycle over the whole dataset gradually minimizes the loss.
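The idea behind the optimizer can be sketched with plain-Python gradient descent on a toy quadratic loss (a simplified illustration; Adam additionally adapts the step size per parameter):

```python
# toy problem: find w that minimizes loss(w) = (w - 3)^2
w = 0.0
lr = 0.1  # learning rate, as in optim.Adam(..., lr=...)

for step in range(50):
    grad = 2 * (w - 3)  # derivative of the loss, the role of loss.backward()
    w -= lr * grad      # weight update, the role of optimizer.step()

print(round(w, 4))  # close to 3.0
```

Each update moves the weight a small step against the gradient, so the loss shrinks toward its minimum.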
Training Verification
Here, we compare the true labels with the model's predictions on the test set and compute the share of correct answers. Even this simple network typically reaches well above 90% accuracy on MNIST after a few epochs.
correct = 0
total = 0

with torch.no_grad():
    for data in testset:
        data_input, target = data
        output = model(data_input.view(-1, 784))
        for idx, i in enumerate(output):
            if torch.argmax(i) == target[idx]:
                correct += 1
            total += 1

print('Accuracy: %d %%' % (100 * correct / total))
To verify that the neural network works, pass it an image of a digit from the test set (data_input still holds the last test batch):
plt.imshow(data_input[1].view(28, 28))
plt.show()
print(torch.argmax(model(data_input[1].view(-1, 784))[0]))
The output should display the digit shown in the provided image.
Final Script
Here’s the full script you can run to see how the neural network works:
import torch
import torchvision
import torch.nn.functional as F
import matplotlib.pyplot as plt
import torch.nn as nn
import torch.optim as optim
from torchvision import transforms, datasets

train = datasets.MNIST("", train=True, download=True,
                       transform=transforms.Compose([transforms.ToTensor()]))
test = datasets.MNIST("", train=False, download=True,
                      transform=transforms.Compose([transforms.ToTensor()]))

trainset = torch.utils.data.DataLoader(train, batch_size=15, shuffle=True)
testset = torch.utils.data.DataLoader(test, batch_size=15, shuffle=True)

class NeuralNetwork(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(784, 86)
        self.fc2 = nn.Linear(86, 86)
        self.fc3 = nn.Linear(86, 86)
        self.fc4 = nn.Linear(86, 10)

    def forward(self, x):
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        x = F.relu(self.fc3(x))
        x = self.fc4(x)
        return F.log_softmax(x, dim=1)

model = NeuralNetwork()
optimizer = optim.Adam(model.parameters(), lr=0.001)
EPOCHS = 3

for epoch in range(EPOCHS):
    for data in trainset:
        X, y = data
        model.zero_grad()
        output = model(X.view(-1, 28 * 28))
        loss = F.nll_loss(output, y)
        loss.backward()
        optimizer.step()
    print(loss)

correct = 0
total = 0

with torch.no_grad():
    for data in testset:
        data_input, target = data
        output = model(data_input.view(-1, 784))
        for idx, i in enumerate(output):
            if torch.argmax(i) == target[idx]:
                correct += 1
            total += 1

print('Accuracy: %d %%' % (100 * correct / total))

# show one test image and the model's prediction for it
plt.imshow(data_input[1].view(28, 28))
plt.show()
print(torch.argmax(model(data_input[1].view(-1, 784))[0]))
Each time we run the network, it will take a random image from the test set and analyze the digit depicted on it. After the process is completed, it will display the recognition accuracy in percentage, the image itself, and the digit recognized by the neural network.
Conclusion
PyTorch is a powerful open-source machine learning platform that accelerates the transition from research prototypes to production deployments. With it, you can solve various tasks in the fields of artificial intelligence and neural networks.
You don’t need deep knowledge of machine learning to begin working with PyTorch. It is enough to know the basic concepts to repeat and even modify popular procedures like image recognition to suit your needs. A big advantage of PyTorch is the large user community that writes tutorials and shares examples of using the library.
Object recognition in images is one of the simplest and most popular tasks in PyTorch for beginners. However, the capabilities of the library are not limited to this.
To create powerful neural networks, you need a lot of training data. These can be stored, for example, in an object-based S3 storage such as Hostman, with instant data access via API or web interface. This is an excellent solution for storing large volumes of information.
01 April 2025 · 10 min to read