How to Parse HTML with Python

Hostman Team
Technical writer
Python
11.02.2025
Reading time: 13 min

Parsing is the automatic search for various patterns (based on pre-defined structures) in text data sources to extract specific information.

Although parsing is a broad term, it most commonly refers to the process of collecting and analyzing data from remote web resources.

In the Python programming language, you can create programs for parsing data from external websites using two key tools:

  • Standard HTTP request package
  • External HTML markup processing libraries

However, data processing capabilities are not limited to just HTML documents.

Thanks to a wide range of external libraries in Python, you can organize parsing for documents of any complexity, whether they are arbitrary text, popular markup languages (e.g., XML), or even rare programming languages.

If there is no suitable parsing library available, you can implement it manually using low-level methods that Python provides by default, such as simple string searching or regular expressions. Although, of course, this requires additional skills.
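As a minimal sketch of the manual approach, the snippet below pulls the page title out of an HTML string with a regular expression. The sample markup and the `extract_title` helper are illustrative assumptions, not part of any library. Regular expressions break down quickly on nested or irregular markup, which is why the rest of this guide relies on dedicated parsers.

```python
import re

# Sample markup used only for illustration
html = "<html><head><title>This is the page title</title></head><body></body></html>"

# Non-greedy match of everything between <title> and </title>
TITLE_PATTERN = re.compile(r"<title[^>]*>(.*?)</title>", re.IGNORECASE | re.DOTALL)


def extract_title(markup):
    """Return the text of the first <title> element, or None if there is none."""
    match = TITLE_PATTERN.search(markup)
    return match.group(1).strip() if match else None


print(extract_title(html))  # This is the page title
```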

This guide will cover how to organize parsers in Python. We will focus on extracting data from HTML pages based on specified tags and attributes.

We ran all the examples in this guide using the Python 3.10.12 interpreter on a Hostman cloud server with Ubuntu 22.04 and Pip 22.0.2 as the package manager.

Structure of an HTML Document

Any document written in HTML consists of two types of tags:

  1. Opening: Defined within less-than (<) and greater-than (>) symbols, e.g., <div>.

  2. Closing: Defined within less-than (<) and greater-than (>) symbols with a forward slash (/), e.g., </div>.

Each tag can have various attributes, the values of which are written in quotes after the equal sign. Some commonly used attributes include:

  • href: Link to a resource. E.g., href="https://hostman.com".
  • class: The class of an object. E.g., class="surface panel panel_closed".
  • id: Identifier of an object. E.g., id="menu".

Each tag, with or without attributes, is an element (object) of the so-called DOM (Document Object Model) tree, which is built by practically any HTML interpreter (parser).

This builds a hierarchy of elements, where nested tags are child elements to their parent tags.

For example, in a browser, we access elements and their attributes through JavaScript scripts. In Python, we use separate libraries for this purpose. The difference is that after parsing the HTML document, the browser not only constructs the DOM tree but also displays it on the monitor.

<!DOCTYPE html>

<html>
    <head>
        <title>This is the page title</title>
    </head>

    <body>
        <h1>This is a heading</h1>
        <p>This is a simple text.</p>
    </body>
</html>

The markup of this page is built with tags in a hierarchical structure without specifying any attributes:

  • html
    • head
      • title
    • body
      • h1
      • p
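The hierarchy above can be reproduced with Python's built-in html.parser module. This is only a sketch: the `TreePrinter` class is a name chosen for illustration, and the depth counter assumes every opening tag has a matching closing tag, which holds for this sample but not for void elements such as <br>.

```python
from html.parser import HTMLParser

html = """<!DOCTYPE html>
<html>
    <head>
        <title>This is the page title</title>
    </head>
    <body>
        <h1>This is a heading</h1>
        <p>This is a simple text.</p>
    </body>
</html>"""


class TreePrinter(HTMLParser):
    """Collects tag names, indented by their nesting depth."""

    def __init__(self):
        super().__init__()
        self.depth = 0
        self.lines = []

    def handle_starttag(self, tag, attrs):
        # Record the tag at the current depth, then descend one level
        self.lines.append("  " * self.depth + tag)
        self.depth += 1

    def handle_endtag(self, tag):
        # A closing tag returns us to the parent's level
        self.depth -= 1


printer = TreePrinter()
printer.feed(html)
print("\n".join(printer.lines))
```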

Such a document structure is more than enough to extract information. We can parse the data by reading the data between opening and closing tags.

However, real website tags have additional attributes that specify both the specific function of the element and its special styling (described in separate CSS files):

<!DOCTYPE html>
<html>
    <body>
        <h1 class="h1_bright">This is a heading</h1>
        <p>This is simple text.</p>

        <div class="block" href="https://hostman.com/products/cloud-server">
            <div class="block__title">Cloud Services</div>
            <div class="block__information">Cloud Servers</div>
        </div>

        <div class="block" href="https://hostman.com/products/vps-server-hosting">
            <div class="block__title">VPS Hosting</div>
            <div class="block__information">Cloud Infrastructure</div>
        </div>

        <div class="block" href="https://hostman.com/services/app-platform">
            <div class="block__title">App Platform</div>
            <div class="block__information">Apps in the Cloud</div>
        </div>
    </body>
</html>

Thus, in addition to explicitly specified tags, the required information can be refined with specific attributes, extracting only the necessary elements from the DOM tree.
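A minimal sketch of attribute-based extraction, again using only the built-in html.parser module on a shortened copy of the markup above. The `BlockCollector` class name is an assumption made for illustration; the libraries reviewed later in this guide offer far more convenient selectors.

```python
from html.parser import HTMLParser

html = """
<div class="block" href="https://hostman.com/products/cloud-server">
    <div class="block__title">Cloud Services</div>
    <div class="block__information">Cloud Servers</div>
</div>
<div class="block" href="https://hostman.com/products/vps-server-hosting">
    <div class="block__title">VPS Hosting</div>
    <div class="block__information">Cloud Infrastructure</div>
</div>
"""


class BlockCollector(HTMLParser):
    """Collects the href of each 'block' and the text of each 'block__title'."""

    def __init__(self):
        super().__init__()
        self.in_title = False
        self.titles = []
        self.links = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)  # attrs arrives as a list of (name, value) pairs
        if attributes.get("class") == "block":
            self.links.append(attributes.get("href"))
        self.in_title = attributes.get("class") == "block__title"

    def handle_endtag(self, tag):
        self.in_title = False

    def handle_data(self, data):
        # Keep only non-empty text found directly inside a title element
        if self.in_title and data.strip():
            self.titles.append(data.strip())


collector = BlockCollector()
collector.feed(html)
print(collector.titles)  # ['Cloud Services', 'VPS Hosting']
print(collector.links)
```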

HTML Data Parser Structure

Web pages can be of two types:

  • Static: During the loading and viewing of the site, the HTML markup remains unchanged. Parsing does not require emulating the browser's behavior.

  • Dynamic: During the loading and viewing of the site, the HTML markup is modified using JavaScript, as in a single-page application (SPA). Parsing requires emulating the browser's behavior.

Parsing static websites is relatively simple—after making a remote request, the necessary data is extracted from the received HTML document.

Parsing dynamic websites requires a more complex approach. After making a remote request, both the HTML document itself and the JavaScript scripts controlling it are downloaded to the local machine. These scripts, in turn, usually perform several remote requests automatically, loading additional content and modifying the HTML document while viewing the page.

Because of this, parsing dynamic websites requires emulating the browser’s behavior and user actions on the local machine. Without this, the necessary data simply won’t load.
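To illustrate the problem, here is a small sketch with a hypothetical dynamic page. The `prices` container, the `/api/prices` endpoint, and the `ContainerText` class are all invented for this example. A static parser sees only the initial markup, so the container that JavaScript would fill in a browser stays empty.

```python
from html.parser import HTMLParser

# Hypothetical dynamic page: the <div id="prices"> container is filled in
# by JavaScript only after a browser executes the script.
html = """
<html>
    <body>
        <div id="prices"></div>
        <script>
            fetch("/api/prices")
                .then(r => r.text())
                .then(t => document.getElementById("prices").textContent = t);
        </script>
    </body>
</html>
"""


class ContainerText(HTMLParser):
    """Collects the text found inside the element with the given id."""

    def __init__(self, element_id):
        super().__init__()
        self.element_id = element_id
        self.inside = False
        self.text = []

    def handle_starttag(self, tag, attrs):
        if dict(attrs).get("id") == self.element_id:
            self.inside = True

    def handle_endtag(self, tag):
        # Simplification: any closing tag ends the container (it has no children here)
        self.inside = False

    def handle_data(self, data):
        if self.inside:
            self.text.append(data)


parser = ContainerText("prices")
parser.feed(html)
print(repr("".join(parser.text)))  # '' because the script was never executed
```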

Most modern websites load additional content using JavaScript scripts in one way or another.

The variety of technical implementations of modern websites is so large that they can’t be classified as entirely static or entirely dynamic.

Typically, general information is loaded initially, while specific information is loaded later.

Most HTML parsers are designed for static pages. Systems that emulate browser behavior to generate dynamic content are much less common.

In Python, libraries (packages) intended for analyzing HTML markup can be divided into two groups:

  1. Low-level processors: Compact, but syntactically complex packages with a complicated implementation that parse HTML (or XML) syntax and build a hierarchical tree of elements.

  2. High-level libraries and frameworks: Large, but syntactically concise packages with a wide range of features to extract formalized data from raw HTML documents. This group includes not only compact HTML parsers but also full-fledged systems for data scraping. Often, these packages use low-level parsers (processors) from the first group as their core for parsing.

Several low-level libraries are available for Python:

  • lxml: A low-level XML syntax processor that is also used for HTML parsing. It is based on the popular libxml2 library written in C.

  • html5lib: A Python library for HTML syntax parsing, written according to the HTML specification by WHATWG (The Web Hypertext Application Technology Working Group), which is followed by all modern browsers.

However, using high-level libraries is faster and easier—they have simpler syntax and a wider range of functions:

  • BeautifulSoup: A simple yet flexible library for Python that allows parsing HTML and XML documents by creating a full DOM tree of elements and extracting the necessary data.
  • Scrapy: A full-fledged framework for parsing data from HTML pages, consisting of autonomous “spiders” (web crawlers) with pre-defined instructions.
  • Selectolax: A fast HTML page parser that uses CSS selectors to extract information from tags.
  • Parsel: A Python library with a specific selector syntax that allows you to extract data from HTML, JSON, and XML documents.
  • requests-html: A Python library that combines HTTP requests with HTML parsing, offering jQuery-style CSS selectors and optional JavaScript rendering.

This guide will review several of these high-level libraries.

Installing the pip Package Manager

We can install all parsing libraries (as well as many other packages) in Python through the standard package manager, pip, which needs to be installed separately.

First, update the package index:

sudo apt update

Then, install pip using the APT package manager:

sudo apt install python3-pip -y

The -y flag will automatically confirm all terminal prompts during the installation.

To verify that pip was installed correctly, check its version:

pip3 --version

The terminal will display the pip version and the installation path:

pip 22.0.2 from /usr/lib/python3/dist-packages/pip (python 3.10)

As shown, this guide uses pip version 22.0.2.

Installing the HTTP Requests Package

The Requests package, which allows making requests to remote servers, is not part of Python's standard library, but many Linux distributions, including Ubuntu, often ship it as a system package. We will use it in the examples of this guide.

However, in some cases, it might not be installed. Then, you can manually install requests via pip:

pip install requests

If the system already has it, you will see the following message in the terminal:

Requirement already satisfied: requests in /usr/lib/python3/dist-packages (2.25.1)

Otherwise, the command will add requests to the list of available packages for import in Python scripts.
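You can also check for the package from Python itself. The sketch below uses the standard importlib.metadata module; the `installed_version` helper is a name chosen for illustration.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version of a package, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Prints a version string such as the one pip reports, or None
print(installed_version("requests"))
```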

Using BeautifulSoup

To install BeautifulSoup version 4, use pip:

pip install beautifulsoup4

After this, the library will be available for import in Python scripts. BeautifulSoup can fall back on Python's built-in html.parser, but the examples in this guide use the previously mentioned low-level HTML processors, so install them as well.

First, install lxml:

pip install lxml

Then install html5lib:

pip install html5lib

In the future, you can specify one of these processors as the core parser for BeautifulSoup in your Python code.
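If a script has to run on machines where these processors may be missing, one option is to choose the parser name at runtime. This is a sketch under that assumption: `pick_parser` is a hypothetical helper, while "lxml", "html5lib", and "html.parser" are the parser names BeautifulSoup accepts as its second argument.

```python
import importlib.util


def pick_parser(preferred=("lxml", "html5lib")):
    """Return the first installed parser name, else the built-in 'html.parser'."""
    for name in preferred:
        if importlib.util.find_spec(name) is not None:
            return name
    return "html.parser"


parser_name = pick_parser()
print(parser_name)

# Intended use with BeautifulSoup (requires beautifulsoup4):
# from bs4 import BeautifulSoup
# page = BeautifulSoup("<p>Hello</p>", parser_name)
```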

Create a new file in your home directory:

nano bs.py

Add the following code:

import requests
from bs4 import BeautifulSoup

# Request to the website 'https://hostman.com'
response = requests.get('https://hostman.com')

# Parse the HTML content of the page using 'html5lib' parser
page = BeautifulSoup(response.text, 'html5lib')

# Extract the title of the page
pageTitle = page.find('title')
print(pageTitle)
print(pageTitle.string)

print("")

# Extract all <a> links on the page
pageLinks = page.find_all('a')

# Print the content of the first 3 links (if they exist)
for link in pageLinks[:3]:
    print(link.string)

print("")

# Find all div elements with a class starting with 'socials--'
social_links_containers = page.find_all('div', class_=lambda c: c and c.startswith('socials--'))

# Collect the links from these divs
for container in social_links_containers:
    links = container.find_all('a', href=True)
    for link in links:
        href = link['href']

        # Ignore links related to Cloudflare's email protection
        if href.startswith('/cdn-cgi/l/email-protection'):
            continue

        print(href)

Now run the script:

python3 bs.py

This will produce the following console output:

<title>Hostman - Cloud Service Provider with a Global Cloud Infrastructure</title>
Hostman - Cloud Service Provider with a Global Cloud Infrastructure

Partners
Tutorials
API

https://wa.me/35795959804
https://twitter.com/hostman_com
https://www.facebook.com/profile.php?id=61556075738626
https://github.com/hostman-cloud
https://www.linkedin.com/company/hostman-inc/about/
https://www.reddit.com/r/Hostman_com/

Of course, instead of html5lib, you can specify lxml:

page = BeautifulSoup(response.text, 'lxml')

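Which processor to choose depends on the task. html5lib parses markup the way modern browsers do, following the WHATWG HTML specification, which makes it the most tolerant of malformed pages, but it is also the slowest. lxml is considerably faster and, although it grew out of XML processing, handles most real-world HTML well.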

Although BeautifulSoup has a concise syntax, it does not emulate a browser, so it cannot load content that is generated dynamically by JavaScript.

Using Scrapy

The Scrapy framework is implemented in a more object-oriented manner. In Scrapy, website parsing is based on three core entities:

  • Spiders: Classes that contain information about parsing details for specified websites, including URLs, element selectors (CSS or XPath), and page browsing mechanisms.

  • Items: Variables for storing extracted data, which are more complex forms of Python dictionaries with a special internal structure.

  • Pipelines: Intermediate handlers for extracted data that can modify items and interact with external software (such as databases).
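To make the last two entities more concrete, here is a sketch of an item and a pipeline. `PageItem` and `CleanTitlePipeline` are hypothetical names. The sketch assumes a recent Scrapy version, which accepts dataclass objects as items, and it relies on the fact that a pipeline is a plain Python class with a process_item(item, spider) method.

```python
from dataclasses import dataclass


@dataclass
class PageItem:
    """Hypothetical item holding data extracted from one page."""
    url: str
    title: str


class CleanTitlePipeline:
    """Hypothetical pipeline that normalizes whitespace in the title."""

    def process_item(self, item, spider):
        # Scrapy calls process_item for each item a spider yields
        item.title = " ".join(item.title.split())
        return item


# Stand-alone check outside Scrapy; the spider argument is unused here
item = PageItem(url="https://hostman.com", title="  Hostman \n Cloud  ")
cleaned = CleanTitlePipeline().process_item(item, spider=None)
print(cleaned.title)  # Hostman Cloud
```

In a real project, the pipeline class would also have to be enabled through the ITEM_PIPELINES setting in settings.py.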

You can install Scrapy through the pip package manager:

pip install scrapy

After that, you need to initialize a parser project, which creates a separate directory with its own folder structure and configuration files:

scrapy startproject parser

Now, you can navigate to the newly created directory:

cd parser

Check the contents of the current directory:

ls

It has a general configuration file and a directory with project source files:

parser scrapy.cfg

Move to the source files directory:

cd parser

If you check its contents:

ls

You will see both special Python scripts, each performing its function, and a separate directory for spiders:

__init__.py items.py middlewares.py pipelines.py settings.py spiders

Let's open the settings file:

nano settings.py

By default, most parameters are commented out with the hash symbol (#). For the parser to work correctly, you need to uncomment some of these parameters without changing the default values specified in the file:

  • USER_AGENT
  • ROBOTSTXT_OBEY
  • CONCURRENT_REQUESTS
  • DOWNLOAD_DELAY
  • COOKIES_ENABLED

Each specific project will require a more precise configuration of the framework. You can find all available parameters in the official documentation.
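After uncommenting, the relevant part of settings.py might look like the excerpt below. The values shown are illustrative assumptions, since the generated defaults differ between Scrapy versions; keep whatever values your own file contains.

```python
# Excerpt from settings.py after uncommenting.
# The values are illustrative assumptions, not the defaults of any specific Scrapy version.
USER_AGENT = "parser (+http://www.yourdomain.com)"  # How the crawler identifies itself
ROBOTSTXT_OBEY = True        # Respect the site's robots.txt rules
CONCURRENT_REQUESTS = 16     # Maximum number of simultaneous requests
DOWNLOAD_DELAY = 3           # Delay in seconds between requests to the same site
COOKIES_ENABLED = False      # Do not store or send cookies
```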

After that, you can generate a new spider:

scrapy genspider hostmanspider hostman.com

After running the above command, the console should display a message about the creation of a new spider:

Created spider 'hostmanspider' using template 'basic' in module:
parser.spiders.hostmanspider

Now, if you check the contents of the spiders directory:

ls spiders

You will see the source file for the new spider, pre-filled with a basic template:

__init__.py  __pycache__  hostmanspider.py

Let's open the script file:

nano spiders/hostmanspider.py

And fill it with the following code:

import scrapy  # Package from the Scrapy framework


class HostmanSpider(scrapy.Spider):  # Spider class inherits from scrapy.Spider
    name = 'hostmanspider'  # Name of the spider

    def start_requests(self):
        urls = ["https://hostman.com"]
        for url in urls:
            yield scrapy.Request(url=url, callback=self.parse)

    def parse(self, response):
        # Extract the page title using a CSS selector (empty string if missing)
        dataTitle = response.css("title::text").get(default="")

        # Extract the markup of all <a> elements using a CSS selector
        dataA = response.css("a").getall()

        # Open 'output' in write mode, which creates the file or clears its content
        with open("output", "w") as someFile:
            someFile.write(dataTitle + "\n\n")
            for link in dataA[:3]:  # Write the first 3 links (if they exist)
                someFile.write(link + "\n")

You can now run the created spider with the following command:

scrapy crawl hostmanspider

Running the spider will create an output file in the current directory. To view the contents of this file, you can use:

cat output

The content of this file will look something like this:

Hostman - Cloud Service Provider with a Global Cloud Infrastructure

<a href="/partners/" itemprop="url" class="body4 medium nd-link-primary"><span itemprop="name">Partners</span></a>
<a href="/tutorials/" itemprop="url" class="body4 medium nd-link-primary"><span itemprop="name">Tutorials</span></a>
<a href="/api-docs/" itemprop="url" class="body4 medium nd-link-primary"><span itemprop="name">API</span></a>
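Note that the href values in this output are relative paths. A small sketch of turning them into absolute URLs with the standard urllib.parse module follows; the list of paths is copied from the output above. Inside a spider, response.urljoin() does the same relative to the page URL.

```python
from urllib.parse import urljoin

base_url = "https://hostman.com"
relative_links = ["/partners/", "/tutorials/", "/api-docs/"]

# Resolve each relative path against the site's base URL
absolute_links = [urljoin(base_url, href) for href in relative_links]

for link in absolute_links:
    print(link)
```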

You can find more detailed information on extracting data using selectors (both CSS and XPath) in the official Scrapy documentation.

Conclusion

Data parsing from remote sources in Python is made possible by two main components:

  1. A package for making remote requests
  2. Libraries for parsing data

These libraries can range from simple ones, suitable only for parsing static websites, to more complex ones that can emulate browser behavior and, consequently, parse dynamic websites.

In Python, the most popular libraries for parsing static data are:

  • BeautifulSoup
  • Scrapy

These tools, much like the JavaScript functions available in a browser (e.g., getElementsByClassName(), or querySelectorAll() with CSS selectors), allow us to extract data (attributes and text) from the DOM tree elements of any HTML document.

Similar

Python

Dunder Methods in Python: Purpose and Application

Dunder methods (double underscore methods) are special methods in the Python programming language that are surrounded by two underscores at the beginning and end of their names. This naming convention is intended to prevent name conflicts with other user-defined functions. Each dunder method corresponds to a specific Python language construct that performs a particular data transformation operation. Here are some commonly used dunder methods: __init__(): Initializes an instance of a class, acting as a constructor. __repr__(): Returns a representative value of a variable in Python expression format. __eq__(): Compares two variables. Whenever the Python interpreter encounters any syntactic construct, it implicitly calls the corresponding dunder method with all necessary arguments. For example, when Python encounters the addition symbol in the expression a + b, it implicitly calls the dunder method a.__add__(b), where the addition operation is performed internally. Thus, dunder methods implement the core mechanics of the Python language. Importantly, these methods are accessible not only to the interpreter but also to ordinary users. Moreover, you can override the implementation of each dunder method within custom classes. In this guide, we'll explore all existing dunder methods in Python and provide examples. The demonstrated scripts were run using the Python 3.10.12 interpreter installed on a Hostman cloud server running Ubuntu 22.04. To run examples from this article, you need to place each script in a separate file with a .py extension (e.g., some_script.py). Then you can execute the file with the following command: python some_script.py Creation, Initialization, and Deletion Creation, initialization, and deletion are the main stages of an object's lifecycle in Python. Each of these stages corresponds to a specific dunder method. 
Syntax Dunder Method Result Description a = C(b, c) C.__new__(b, c) C Creation a = C(b, c) C.__init__(b, c) None Initialization del a a.__del__() None Deletion The general algorithm for these methods has specific characteristics: Creation: The __new__() method is called with a set of arguments. The first argument is the class of the object (not the object itself). This name is not regulated and can be arbitrary. The remaining arguments are the parameters specified when creating the object in the calling code. The __new__() method must return a class instance — a new object. Initialization: Immediately after returning a new object from __new__(), the __init__() method is automatically called. Inside this method, the created object is initialized. The first argument is the object itself, passed as self. The remaining arguments are the parameters specified when creating the object. The first argument name is regulated and must be the keyword self. Deletion: Explicit deletion of an object using the del keyword triggers the __del__() method. Its only argument is the object itself, accessed through the self keyword. Thanks to the ability to override dunder methods responsible for an object's lifecycle, you can create unique implementations for custom classes: class Initable: instances = 0 # class variable, not an object variable def __new__(cls, first, second): print(cls.instances) cls.instances += 1 return super().__new__(cls) # call the base class's object creation method with the current class name as an argument def __init__(self, first, second): self.first = first # object variable self.second = second # another object variable def __del__(self): print("Annihilated!") inited = Initable("Initialized", 13) # output: 0 print(inited.first) # output: Initialized print(inited.second) # output: 13 del inited # output: Annihilated! Thanks to these hook-like methods, you can manage not only the internal state of an object but also external resources. 
Comparison Objects created in Python can be compared with one another, yielding either a positive or negative result. Each comparison operator is associated with a corresponding dunder method. Syntax Dunder Method Result Description a == b or a is b a.__eq__(b) bool Equal a != b a.__ne__(b) bool Not equal a > b a.__gt__(b) bool Greater than a < b a.__lt__(b) bool Less than a >= b a.__ge__(b) bool Greater than or equal a <= b a.__le__(b) bool Less than or equal hash(a) a.__hash__() int Hashing In some cases, Python provides multiple syntactic constructs for the same comparison operations. We can replace each of these operations by the corresponding dunder method: a = 5 b = 6 c = "This is a regular string" print(a == b) # Output: False print(a is b) # Output: False print(a.__eq__(b)) # Output: False print(a != b) # Output: True print(a is not b) # Output: True print(a.__ne__(b)) # Output: True print(not a.__eq__(b)) # Output: True print(a > b) # Output: False print(a < b) # Output: True print(a >= b) # Output: False print(a <= b) # Output: True print(a.__gt__(b)) # Output: False print(a.__lt__(b)) # Output: True print(a.__ge__(b)) # Output: False print(a.__le__(b)) # Output: True print(hash(a)) # Output: 5 print(a.__hash__()) # Output: 5 print(c.__hash__()) # Output: 1745008793 The __ne__() method returns the inverted result of the __eq__() method. Because of this, there's often no need to redefine __ne__() since the primary comparison logic is usually implemented in __eq__(). Additionally, some comparison operations implicitly performed by Python when manipulating list elements require hash computation. For this purpose, Python provides the special dunder method __hash__(). 
By default, any user-defined class already implements the methods __eq__(), __ne__(), and __hash__(): class Comparable: def __init__(self, value1, value2): self.value1 = value1 self.value2 = value2 c1 = Comparable(4, 3) c2 = Comparable(7, 9) print(c1 == c1) # Output: True print(c1 != c1) # Output: False print(c1 == c2) # Output: False print(c1 != c2) # Output: True print(c1.__hash__()) # Example output: -2146408067 print(c2.__hash__()) # Example output: 1076316 In this case, the default __eq__() method compares instances without considering their internal variables defined in the __init__() constructor. The same applies to the __hash__() method, whose results vary between calls. Python's mechanics are designed such that overriding the __eq__() method automatically removes the default __hash__() method: class Comparable: def __init__(self, value1, value2): self.value1 = value1 self.value2 = value2 def __eq__(self, other): if isinstance(other, self.__class__): return self.value1 == other.value1 and self.value2 == other.value2 return False c1 = Comparable(4, 3) c2 = Comparable(7, 9) print(c1 == c1) # Output: True print(c1 != c1) # Output: False print(c1 == c2) # Output: False print(c1 != c2) # Output: True print(c1.__hash__()) # ERROR (method not defined) print(c2.__hash__()) # ERROR (method not defined) Therefore, overriding the __eq__() method requires also overriding the __hash__() method with a new hashing algorithm: class Comparable: def __init__(self, value1, value2): self.value1 = value1 self.value2 = value2 def __eq__(self, other): if isinstance(other, self.__class__): return self.value1 == other.value1 and self.value2 == other.value2 return False def __ne__(self, other): return not self.__eq__(other) def __gt__(self, other): return self.value1 + self.value2 > other.value1 + other.value2 def __lt__(self, other): return not self.__gt__(other) def __ge__(self, other): return self.value1 + self.value2 >= other.value1 + other.value2 def __le__(self, other): return 
self.value1 + self.value2 <= other.value1 + other.value2 def __hash__(self): return hash((self.value1, self.value2)) # Returns the hash of a tuple of two numbers c1 = Comparable(4, 3) c2 = Comparable(7, 9) print(c1 == c1) # Output: True print(c1 != c1) # Output: False print(c1 == c2) # Output: False print(c1 != c2) # Output: True print(c1 > c2) # Output: False print(c1 < c2) # Output: True print(c1 >= c2) # Output: False print(c1 <= c2) # Output: True print(c1.__hash__()) # Output: -1441980059 print(c2.__hash__()) # Output: -2113571365 Thus, by overriding comparison methods, you can use standard syntactic constructs for custom classes, similar to built-in data types, regardless of their internal complexity. Conversion in Python In Python, we can convert all built-in types from one to another. Similar conversions can be added to custom classes, considering the specifics of their internal implementation. Syntax Dunder Method Result Description str(a) a.__str__() str String bool(a) a.__bool__() bool Boolean int(a) a.__int__() int Integer float(a) a.__float__() float Floating-point number bytes(a) a.__bytes__() bytes Byte sequence complex(a) a.__complex__() complex Complex number By default, we can only convert a custom class to a few basic types: class Convertible: def __init__(self, value1, value2): self.value1 = value1 self.value2 = value2 some_variable = Convertible(4, 3) print(str(some_variable)) # Example output: <__main__.Convertible object at 0x1229620> print(bool(some_variable)) # Output: True However, by overriding the corresponding dunder methods, you can implement conversions from a custom class to any built-in data type: class Convertible: def __init__(self, value1, value2): self.value1 = value1 self.value2 = value2 def __str__(self): return str(self.value1) + str(self.value2) def __bool__(self): return self.value1 == self.value2 def __int__(self): return self.value1 + self.value2 def __float__(self): return float(self.value1) + float(self.value2) def 
__bytes__(self): return bytes(self.value1) + bytes(self.value2) def __complex__(self): return complex(self.value1) + complex(self.value2) someVariable = Convertible(4, 3) print(str(someVariable)) # output: 43 print(bool(someVariable)) # output: False print(int(someVariable)) # output: 7 print(float(someVariable)) # output: 7.0 print(bytes(someVariable)) # output: b'\x00\x00\x00\x00\x00\x00\x00' print(complex(someVariable)) # output: (7+0j) Thus, implementing dunder methods for conversion allows objects of custom classes to behave like built-in data types, enhancing their completeness and versatility. Element Management in Python Just like lists, we can make any custom class in Python iterable. Python provides corresponding dunder methods for retrieving and manipulating elements. Syntax Dunder Method Description len(a) a.__len__() Length iter(a) or for i in a: a.__iter__() Iterator a[b] a.__getitem__(b) Retrieve element a[b] a.__missing__(b) Retrieve non-existent dictionary item a[b] = c a.__setitem__(b, c) Set element del a[b] a.__delitem__(b) Delete element b in a a.__contains__(b) Check if element exists reversed(a) a.__reversed__() Elements in reverse order next(a) a.__next__() Retrieve next element Even though the internal implementation of an iterable custom class can vary, element management is handled using Python's standard container interface rather than custom methods. 
class Iterable: def __init__(self, e1, e2, e3, e4): self.e1 = e1 self.e2 = e2 self.e3 = e3 self.e4 = e4 self.index = 0 def __len__(self): len = 0 if self.e1: len += 1 if self.e2: len += 1 if self.e3: len += 1 if self.e4: len += 1 return len def __iter__(self): for i in range(0, self.__len__() + 1): if i == 0: yield self.e1 if i == 1: yield self.e2 if i == 2: yield self.e3 if i == 3: yield self.e4 def __getitem__(self, item): if item == 0: return self.e1 elif item == 1: return self.e2 elif item == 2: return self.e3 elif item == 3: return self.e4 else: raise Exception("Out of range") def __setitem__(self, item, value): if item == 0: self.e1 = value elif item == 1: self.e2 = value elif item == 2: self.e3 = value elif item == 3: self.e4 = value else: raise Exception("Out of range") def __delitem__(self, item): if item == 0: self.e1 = None elif item == 1: self.e2 = None elif item == 2: self.e3 = None elif item == 3: self.e4 = None else: raise Exception("Out of range") def __contains__(self, item): if self.e1 == item: return true elif self.e2 == item: return True elif self.e3 == item: return True elif self.e4 == item: return True else: return False def __reversed__(self): return Iterable(self.e4, self.e3, self.e2, self.e1) def __next__(self): if self.index >=4: self.index = 0 if self.index == 0: element = self.e1 if self.index == 1: element = self.e2 if self.index == 2: element = self.e3 if self.index == 3: element = self.e4 self.index += 1 return element someContainer = Iterable(-2, 54, 6, 13) print(someContainer.__len__()) # output: 4 print(someContainer[0]) # output: -2 print(someContainer[1]) # output: 54 print(someContainer[2]) # output: 6 print(someContainer[3]) # output: 13 someContainer[2] = 117 del someContainer[0] print(someContainer[2]) # output: 117 for element in someContainer: print(element) # output: None, 54, 117, 13 print(117 in someContainer) # output: True someContainerReversed = someContainer.__reversed__() for element in someContainerReversed: 
print(element) # output: 13, 117, 54, None print(someContainer.__next__()) # output: None print(someContainer.__next__()) # output: 54 print(someContainer.__next__()) # output: 117 print(someContainer.__next__()) # output: 13 print(someContainer.__next__()) # output: None It’s important to understand the difference between the __iter__() and __next__() methods, which facilitate object iteration. __iter__() iterates the object at a given point. __next__() returns an element considering an internal index. A particularly interesting dunder method is __missing__(), which is only relevant in custom classes inherited from the base dictionary type dict. This method allows you to override the default dict behavior when attempting to retrieve a non-existent element: class dict2(dict): def __missing__(self, item): return "Sorry but I don’t exist..." someDictionary = dict2(item1=10, item2=20, item3=30) print(someDictionary["item1"]) # output: 10 print(someDictionary["item2"]) # output: 20 print(someDictionary["item3"]) # output: 30 print(someDictionary["item4"]) # output: Sorry but I don’t exist... Arithmetic Operations Arithmetic operations are the most common type of data manipulation. Python provides corresponding syntactic constructs for performing addition, subtraction, multiplication, and division. Most often, left-handed methods are used, which perform calculations on behalf of the left operand. Syntax Dunder Method Description a + b a.__add__(b) Addition a - b a.__sub__(b) Subtraction a * b a.__mul__(b) Multiplication a / b a.__truediv__(b) Division a % b a.__mod__(b) Modulus a // b a.__floordiv__(b) Floor division a ** b a.__pow__(b) Exponentiation If the right operand does not know how to perform the operation, Python automatically calls a right-handed method, which calculates the value on behalf of the right operand. However, in this case, the operands must be of different types. 
Syntax    Dunder Method         Description
a + b     b.__radd__(a)         Addition
a - b     b.__rsub__(a)         Subtraction
a * b     b.__rmul__(a)         Multiplication
a / b     b.__rtruediv__(a)     Division
a % b     b.__rmod__(a)         Modulus
a // b    b.__rfloordiv__(a)    Floor division
a ** b    b.__rpow__(a)         Exponentiation

It is also possible to override in-place arithmetic operations. In this case, dunder methods modify the left operand itself and return it instead of creating a new object.

Syntax    Dunder Method         Description
a += b    a.__iadd__(b)         Addition
a -= b    a.__isub__(b)         Subtraction
a *= b    a.__imul__(b)         Multiplication
a /= b    a.__itruediv__(b)     Division
a %= b    a.__imod__(b)         Modulus
a //= b   a.__ifloordiv__(b)    Floor division
a **= b   a.__ipow__(b)         Exponentiation

By overriding these corresponding dunder methods, you can define specific behaviors for your custom class during arithmetic operations:

class Arithmetic:
    def __init__(self, value1, value2):
        self.value1 = value1
        self.value2 = value2

    # Left-handed methods expect another Arithmetic object
    # Right-handed methods expect a plain number on the left
    # In-place methods modify self and return it

    def __add__(self, other):
        return Arithmetic(self.value1 + other.value1, self.value2 + other.value2)

    def __radd__(self, other):
        return Arithmetic(other + self.value1, other + self.value2)

    def __iadd__(self, other):
        self.value1 += other.value1
        self.value2 += other.value2
        return self

    def __sub__(self, other):
        return Arithmetic(self.value1 - other.value1, self.value2 - other.value2)

    def __rsub__(self, other):
        return Arithmetic(other - self.value1, other - self.value2)

    def __isub__(self, other):
        self.value1 -= other.value1
        self.value2 -= other.value2
        return self

    def __mul__(self, other):
        return Arithmetic(self.value1 * other.value1, self.value2 * other.value2)

    def __rmul__(self, other):
        return Arithmetic(other * self.value1, other * self.value2)

    def __imul__(self, other):
        self.value1 *= other.value1
        self.value2 *= other.value2
        return self

    def __truediv__(self, other):
        return Arithmetic(self.value1 / other.value1, self.value2 / other.value2)

    def __rtruediv__(self, other):
        return Arithmetic(other / self.value1, other / self.value2)

    def __itruediv__(self, other):
        self.value1 /= other.value1
        self.value2 /= other.value2
        return self

    def __mod__(self, other):
        return Arithmetic(self.value1 % other.value1, self.value2 % other.value2)

    def __rmod__(self, other):
        return Arithmetic(other % self.value1, other % self.value2)

    def __imod__(self, other):
        self.value1 %= other.value1
        self.value2 %= other.value2
        return self

    def __floordiv__(self, other):
        return Arithmetic(self.value1 // other.value1, self.value2 // other.value2)

    def __rfloordiv__(self, other):
        return Arithmetic(other // self.value1, other // self.value2)

    def __ifloordiv__(self, other):
        self.value1 //= other.value1
        self.value2 //= other.value2
        return self

    def __pow__(self, other):
        return Arithmetic(self.value1 ** other.value1, self.value2 ** other.value2)

    def __rpow__(self, other):
        return Arithmetic(other ** self.value1, other ** self.value2)

    def __ipow__(self, other):
        self.value1 **= other.value1
        self.value2 **= other.value2
        return self

a1 = Arithmetic(4, 6)
a2 = Arithmetic(10, 3)

add = a1 + a2
sub = a1 - a2
mul = a1 * a2
truediv = a1 / a2
mod = a1 % a2
floordiv = a1 // a2
pow = a1 ** a2

radd = 50 + a1
rsub = 50 - a2
rmul = 50 * a1
rtruediv = 50 / a2
rmod = 50 % a1
rfloordiv = 50 // a2
rpow = 50 ** a2

a1 -= a2
a1 *= a2
a1 /= a2
a1 %= a2
a1 //= a2
a1 **= a2

print(add.value1, add.value2) # output: 14 9
print(sub.value1, sub.value2) # output: -6 3
print(mul.value1, mul.value2) # output: 40 18
print(truediv.value1, truediv.value2) # output: 0.4 2.0
print(mod.value1, mod.value2) # output: 4 0
print(floordiv.value1, floordiv.value2) # output: 0 2
print(pow.value1, pow.value2) # output: 1048576 216

print(radd.value1, radd.value2) # output: 54 56
print(rsub.value1, rsub.value2) # output: 40 47
print(rmul.value1, rmul.value2) # output: 200 300
print(rtruediv.value1, rtruediv.value2) # output: 5.0 16.666666666666668
print(rmod.value1, rmod.value2) # output: 2 2
print(rfloordiv.value1, rfloordiv.value2) # output: 5 16
print(rpow.value1, rpow.value2) # output: 97656250000000000 125000

In real-world scenarios, arithmetic dunder methods are among the most frequently overridden. Therefore, it is good practice to implement both left-handed and right-handed methods together.

Bitwise Operations

In addition to standard mathematical operations, Python allows you to override the behavior of custom classes during bitwise transformations.

Syntax    Dunder Method         Description
a & b     a.__and__(b)          Bitwise AND
a | b     a.__or__(b)           Bitwise OR
a ^ b     a.__xor__(b)          Bitwise XOR
a >> b    a.__rshift__(b)       Right shift
a << b    a.__lshift__(b)       Left shift

Similar to arithmetic operations, bitwise transformations can be performed on behalf of the right operand.

Syntax    Dunder Method         Description
a & b     b.__rand__(a)         Bitwise AND
a | b     b.__ror__(a)          Bitwise OR
a ^ b     b.__rxor__(a)         Bitwise XOR
a >> b    b.__rrshift__(a)      Right shift
a << b    b.__rlshift__(a)      Left shift

Naturally, bitwise operations can also be performed in-place, modifying the left operand instead of returning a new object.

Syntax    Dunder Method         Description
a &= b    a.__iand__(b)         Bitwise AND
a |= b    a.__ior__(b)          Bitwise OR
a ^= b    a.__ixor__(b)         Bitwise XOR
a >>= b   a.__irshift__(b)      Right shift
a <<= b   a.__ilshift__(b)      Left shift

By overriding these dunder methods, any custom class can perform familiar bitwise operations on its contents.
class Bitable:
    def __init__(self, value1, value2):
        self.value1 = value1
        self.value2 = value2

    def __and__(self, other):
        return Bitable(self.value1 & other.value1, self.value2 & other.value2)

    def __rand__(self, other):
        return Bitable(other & self.value1, other & self.value2)

    def __iand__(self, other):
        self.value1 &= other.value1
        self.value2 &= other.value2
        return self

    def __or__(self, other):
        return Bitable(self.value1 | other.value1, self.value2 | other.value2)

    def __ror__(self, other):
        return Bitable(other | self.value1, other | self.value2)

    def __ior__(self, other):
        self.value1 |= other.value1
        self.value2 |= other.value2
        return self

    def __xor__(self, other):
        return Bitable(self.value1 ^ other.value1, self.value2 ^ other.value2)

    def __rxor__(self, other):
        return Bitable(other ^ self.value1, other ^ self.value2)

    def __ixor__(self, other):
        # In-place XOR must use ^=, not |=
        self.value1 ^= other.value1
        self.value2 ^= other.value2
        return self

    def __rshift__(self, other):
        return Bitable(self.value1 >> other.value1, self.value2 >> other.value2)

    def __rrshift__(self, other):
        return Bitable(other >> self.value1, other >> self.value2)

    def __irshift__(self, other):
        self.value1 >>= other.value1
        self.value2 >>= other.value2
        return self

    def __lshift__(self, other):
        return Bitable(self.value1 << other.value1, self.value2 << other.value2)

    def __rlshift__(self, other):
        return Bitable(other << self.value1, other << self.value2)

    def __ilshift__(self, other):
        self.value1 <<= other.value1
        self.value2 <<= other.value2
        return self

b1 = Bitable(5, 3)
b2 = Bitable(7, 2)

resultAnd = b1 & b2
resultOr = b1 | b2
resultXor = b1 ^ b2
resultRshift = b1 >> b2
resultLshift = b1 << b2

resultRand = 50 & b1
resultRor = 50 | b2
resultRxor = 50 ^ b1
resultRrshift = 50 >> b2
resultRlshift = 50 << b1

b1 &= b2
b1 |= b2
b1 ^= b2
b1 >>= b2
b1 <<= b2

# Each line prints both fields of its own result object
print(resultAnd.value1, resultAnd.value2) # output: 5 2
print(resultOr.value1, resultOr.value2) # output: 7 3
print(resultXor.value1, resultXor.value2) # output: 2 1
print(resultRshift.value1, resultRshift.value2) # output: 0 0
print(resultLshift.value1, resultLshift.value2) # output: 640 12

print(resultRand.value1, resultRand.value2) # output: 0 2
print(resultRor.value1, resultRor.value2) # output: 55 50
print(resultRxor.value1, resultRxor.value2) # output: 55 49
print(resultRrshift.value1, resultRrshift.value2) # output: 0 12
print(resultRlshift.value1, resultRlshift.value2) # output: 1600 400

In addition to operations involving two operands, Python provides dunder methods for unary operators, which take a single operand.

Syntax    Dunder Method         Description
-a        a.__neg__()           Negation
~a        a.__invert__()        Bitwise inversion
+a        a.__pos__()           Unary plus

For built-in numbers, the + operator does not change the value. A custom class can override __pos__() to perform an alternative transformation.

Object Information Extraction

Python offers several dunder methods to retrieve additional information about an object.

Syntax    Dunder Method         Description
str(a)    a.__str__()           Returns the object's value
repr(a)   a.__repr__()          Returns the object's representation

__str__() returns a user-friendly string representation of the variable’s value.

__repr__() returns a more detailed and often code-like representation of the variable, suitable for recreating the original variable via eval().

So, it is important for a custom class to be able to provide additional information about itself.
class Human:
    def __init__(self, name, age):
        self.name = name
        self.age = age

    def __str__(self):
        return str(self.name + " (" + str(self.age) + " years old)")

    def __repr__(self):
        return "Human(" + repr(self.name) + ", " + str(self.age) + ")"

someHuman = Human("John", 35)
someOtherHuman = eval(repr(someHuman))

print(str(someHuman)) # output: John (35 years old)
print(repr(someHuman)) # output: Human('John', 35)

print(str(someOtherHuman)) # output: John (35 years old)
print(repr(someOtherHuman)) # output: Human('John', 35)

Conclusion

A distinctive feature of Python dunder methods is using two underscore characters at the beginning and end of the name, which prevents naming conflicts with other user-defined functions.

Unlike regular methods, dunder methods allow you to define unique behavior for a custom class when using standard Python operators responsible for:

  • Arithmetic operations
  • Iteration and access to elements
  • Creation, initialization, and deletion of objects

Additional dunder attributes provide auxiliary information about Python program entities, which can simplify the implementation of custom classes.
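The arithmetic examples in this article assume that the other operand is always of the expected kind. As a minimal sketch of how the fallback between left-handed and right-handed methods works in practice, the hypothetical Money class below (not part of the original examples) returns NotImplemented for operands it cannot handle, which lets Python try the other operand's method or raise a TypeError:

```python
class Money:
    """Hypothetical class used only to illustrate operator fallback."""

    def __init__(self, amount):
        self.amount = amount

    def __add__(self, other):
        if isinstance(other, Money):
            return Money(self.amount + other.amount)
        if isinstance(other, (int, float)):
            return Money(self.amount + other)
        # Signal that this operand cannot handle the operation
        return NotImplemented

    def __radd__(self, other):
        # Called for `other + self` when type(other) cannot add a Money
        return self.__add__(other)

    def __repr__(self):
        return f"Money({self.amount!r})"


total = Money(5) + Money(7)   # uses Money.__add__
left = Money(5) + 3           # uses Money.__add__
right = 3 + Money(5)          # int.__add__ fails, so Money.__radd__ runs

# sum() starts from 0, so it relies on __radd__ for the first element
combined = sum([Money(1), Money(2)])

try:
    Money(5) + "text"         # neither operand can handle this pair
except TypeError as error:
    message = str(error)

print(total, left, right, combined)
print(message)
```

Returning NotImplemented rather than raising an exception directly is what gives the right operand a chance to respond.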
10 February 2025 · 21 min to read
Python

How to Use the numpy.where() Method in Python

The numpy.where() method in Python is one of the most powerful and frequently used tools in the NumPy library for the conditional selection of elements from arrays. It provides flexible options for processing and analyzing large datasets, replacing traditional if-else conditional operators and significantly speeding up code execution.

This method builds a new array in which elements that meet a certain condition are replaced with specified values while the other elements are kept unchanged. Unlike regular loops, which can slow down execution when working with large datasets, numpy.where() uses vectorization, making operations faster and more efficient.

Syntax of the where() Method

The numpy.where() method has the following syntax:

numpy.where(condition[, x, y])

Where:

  • condition: the condition or array of conditions to be checked.
  • x: values returned where the condition is True.
  • y: values returned where the condition is False.

The arguments x and y must be passed together. If they are not specified, the method will return the indices of the elements that satisfy the condition.

Main Usage Approaches

Let's move on to practical examples.

Finding Element Indices

It is often necessary to determine the positions of elements that satisfy a certain condition. numpy.where() makes this easy to achieve:

import numpy as np

arr = np.array([1, 2, 3, 4, 5])
indices = np.where(arr > 3)
print(indices)

In this example, we create an array [1, 2, 3, 4, 5]. Then, we use the np.where() function to find the indices of elements greater than 3. Running the code yields (array([3, 4]),), indicating the positions of the numbers 4 and 5 in the original array, as only these numbers satisfy the condition arr > 3.

In this case, the method returns a tuple containing one array of indices per array dimension.
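The tuple returned in the index-only form becomes more useful with arrays of more than one dimension. The following sketch, which uses a small made-up matrix, shows how the index arrays can be unpacked and used to read the matching values back out. np.argwhere() is shown as an alternative that returns the same positions as coordinate pairs:

```python
import numpy as np

# Illustrative data, not taken from the article
matrix = np.array([[1, 8, 3],
                   [9, 2, 7]])

# With only a condition, np.where returns one index array per dimension
rows, cols = np.where(matrix > 5)
print(rows)    # row index of each matching element
print(cols)    # column index of each matching element

# The index arrays can be used directly to select the matching values
values = matrix[rows, cols]
print(values)

# np.argwhere returns the same positions as (row, column) pairs
pairs = np.argwhere(matrix > 5)
print(pairs)
```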
Conditional Element Replacement

The numpy.where() method is widely used for conditionally replacing elements in an array:

import numpy as np

arr = np.array([1, 2, 3, 4, 5])
result = np.where(arr > 3, 100, arr)
print(result)

This code starts by creating an array [1, 2, 3, 4, 5]. The np.where() function is then used to find elements greater than 3. The second argument, 100, is the value used where the condition holds, and the third argument, arr, supplies the original values everywhere else.

The printed output is:

[  1   2   3 100 100]

The elements 4 and 5 have been replaced with 100 because they satisfy the condition arr > 3.

In this case, np.where() returns a new array in which all elements meeting the condition are replaced with the specified value. The original array is not modified.

Working with Multidimensional Arrays

The numpy.where() method also works effectively with multidimensional arrays:

import numpy as np

matrix = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
result = np.where(matrix % 2 == 0, 'even', 'odd')
print(result)

This example creates a matrix [[1, 2, 3], [4, 5, 6], [7, 8, 9]]. The np.where() function is applied to replace elements based on the condition: if the number is even (divisible by 2 without a remainder), it is replaced with the string 'even'; otherwise, it is replaced with 'odd'.

The resulting matrix is printed as:

[['odd' 'even' 'odd']
 ['even' 'odd' 'even']
 ['odd' 'even' 'odd']]

In this example, the method returns a new matrix with strings instead of numbers.

Applying Multiple Conditions

By using logical operators, numpy.where() can handle more complex conditions:

import numpy as np

arr = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9])
result = np.where((arr > 3) & (arr < 7), arr * 2, arr)
print(result)

In this example, an array [1, 2, 3, 4, 5, 6, 7, 8, 9] is created. The np.where() function is used with a combined condition: if the number is greater than 3 and less than 7, it is multiplied by 2; otherwise, it remains unchanged. Each comparison is wrapped in parentheses because & binds more tightly than > and <.

The output is:

[ 1  2  3  8 10 12  7  8  9]

The numbers 4, 5, and 6 are multiplied by 2 as they meet the condition.
In this case, the method returns a new array with updated values based on the condition.

Practical Examples

Working with Temperature Data

Suppose we have an array of temperatures in Celsius, and we want to classify them as "hot" or "comfortable":

import numpy as np

temperatures = np.array([23, 25, 28, 32, 35, 29])
status = np.where(temperatures > 30, 'hot', 'comfortable')
print(status)

In this example, the temperature array [23, 25, 28, 32, 35, 29] is created. The np.where() function is applied to determine comfort levels: if the temperature exceeds 30 degrees, it is labeled as 'hot'; otherwise, it is 'comfortable'.

The output is:

['comfortable' 'comfortable' 'comfortable' 'hot' 'hot' 'comfortable']

Temperatures 32 and 35 degrees are marked as 'hot' because they exceed the threshold.

This method returns a new array with string values reflecting the temperature evaluation.

Handling Missing Values

In datasets, missing values often need to be replaced or handled:

import numpy as np

data = np.array([1, np.nan, 3, np.nan, 5])
cleaned_data = np.where(np.isnan(data), 0, data)
print(cleaned_data)

Here, we create an array with missing values [1, np.nan, 3, np.nan, 5]. The np.where() function is combined with np.isnan() to replace missing values (NaN) with 0.

The result is:

[1. 0. 3. 0. 5.]

The NaN values are replaced with 0, while other elements remain unchanged.

This example demonstrates how to clean data by handling missing values.

Method Comparison Table

Characteristic    numpy.where()    Loops     List Comprehension
Speed             High             Low       Medium
Memory Usage      Medium           High      Medium
Readability       High             Medium    High
Vectorization     Yes              No        Partially
Flexibility       High             High      High

As the table shows, numpy.where() is faster than both traditional loops and list comprehensions and uses less memory than explicit loops, while maintaining high readability and flexibility.

Conclusion

The numpy.where() method is an indispensable tool for efficient data processing and analysis in Python.
Its use allows developers to write more performant, clean, and readable code, especially when working with large datasets and complex conditions. This method simplifies tasks related to replacing array elements based on specified conditions and eliminates the need for bulky loops and checks, making the code more compact and faster. numpy.where() is particularly useful for handling large datasets where high performance and simple conditional operations are crucial. Loops remain a better choice for complex data processing logic or step-by-step operations, especially when working with smaller datasets. On the other hand, list comprehensions are suitable for compact and readable code when dealing with small to medium datasets, provided the operations are not overly complex. Understanding the syntax and capabilities of numpy.where() opens up new approaches for solving various problems in areas such as data analysis, image processing, and financial analysis. The method enables efficient handling of large data volumes and significantly accelerates operations through vectorization, which is particularly important for tasks requiring high performance. Using techniques like vectorization and masks in combination with NumPy functions helps developers optimize code and achieve fast and accurate results. Regardless of your level of experience in Python programming, mastering numpy.where() and understanding its advantages will be a crucial step toward more efficient data handling, improving program performance, and implementing optimal solutions in analytics and information processing.
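The speed comparison between numpy.where(), loops, and list comprehensions can be checked with a rough timing sketch like the one below. The array size, threshold, and function names are arbitrary choices made for illustration, and the measured times will vary between machines, so no specific figures are claimed here:

```python
import timeit

import numpy as np

# Arbitrary test data: 50,000 random integers between 0 and 99
rng = np.random.default_rng(0)
data = rng.integers(0, 100, size=50_000)


def with_where(arr):
    # Vectorized: the condition is evaluated for the whole array at once
    return np.where(arr > 50, 1, 0)


def with_loop(arr):
    # Explicit Python loop over every element
    result = []
    for x in arr:
        if x > 50:
            result.append(1)
        else:
            result.append(0)
    return np.array(result)


def with_comprehension(arr):
    # List comprehension with a conditional expression
    return np.array([1 if x > 50 else 0 for x in arr])


# All three approaches should produce identical results
same = (np.array_equal(with_where(data), with_loop(data))
        and np.array_equal(with_where(data), with_comprehension(data)))
print("Results identical:", same)

for fn in (with_where, with_loop, with_comprehension):
    seconds = timeit.timeit(lambda: fn(data), number=3)
    print(f"{fn.__name__}: {seconds:.4f} s for 3 runs")
```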
06 February 2025 · 6 min to read
Python

How to Set Up Visual Studio Code for Python

Creating and debugging programs in Python is easier when using a specialized Integrated Development Environment (IDE). With an IDE, you can quickly and efficiently develop, test, and debug programs.

Visual Studio Code (VS Code) provides full support for Python and offers a wide range of plugins and extensions.

In this article, we will install Visual Studio Code on three operating systems (Windows, macOS, Linux) and set it up for Python programming, including the use of popular plugins.

Prerequisites

To install and set up Visual Studio Code for Python, we will need the following:

  • A personal or work computer with Windows 10/11, macOS, or Ubuntu Linux distribution version 24.04 pre-installed.
  • Alternatively, you can rent a dedicated server or a virtual machine with Windows Server 2016/2019/2022. If using regular versions of Windows, you can download your own ISO image in advance. You can also rent a server with Ubuntu.

Installing the Python Interpreter

Before installing VS Code, we need to install the Python interpreter on all three operating systems: Windows, macOS, and Linux.

On Windows

Go to the official Python website and download the installer file. In this case, we will be installing Python version 3.13.1.

Run the installer file. You will have two installation options:

  • Install Now: performs a full installation, including documentation files, the package manager pip, the tcl/tk library for graphical interface support, and standard libraries.
  • Customize Installation: allows you to choose which components to install.

We will use the full installation. Make sure to check the box next to the option Add python.exe to PATH and click Install Now.

The installation process will begin. Once the installation is complete, the program will notify you that it has finished.

On macOS

On macOS, the Python interpreter is pre-installed by default.
You can verify this by running the following command in the terminal:

python3 --version

However, the installed version may be outdated. If necessary, you can install a newer version. To do this, we will use the Homebrew package manager.

First, if Homebrew is not installed on your system, you can install it by running the following command:

/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"

Next, you need to check which versions of Python are available for installation. Use the following command:

brew search python

The output lists the Python versions that are available. Install the latest available version, Python 3.13, by running the following command:

brew install python@3.13

Check the Python version again:

python3 --version
python3.13 --version

In our case, running python3 --version still shows the old version (Python 3.9.6). However, the newly installed version (3.13) can be accessed using the command python3.13 --version.

If needed, you can change the default Python version to the newly installed one. To do this, first get the full path to the newly installed Python interpreter using the following command:

brew --prefix python@3.13

Then, check which shell you are using:

echo $SHELL

Depending on the shell used, open the corresponding file for editing.

For bash or sh:

nano ~/.bashrc

For zsh:

nano ~/.zshrc

Add the following line at the end of the file:

export PATH="/opt/homebrew/opt/python@3.13/bin:$PATH"

Save the changes and reload the file (use ~/.bashrc instead if that is the file you edited):

source ~/.zshrc

Now, when you check the Python version, it will display the latest installed version:

python3 --version

On Ubuntu

By default, Python is pre-installed on almost all Linux distributions, including Ubuntu.
In the latest supported versions of Ubuntu, the current version of Python is installed:

python3 --version

However, if Python is not installed for any reason, you can install it by running the following command:

apt -y install python3

Installing Visual Studio Code

You can install Visual Studio Code on your personal computer. You can also rent a dedicated or cloud server with Windows Server or one of the available Linux distributions. If the required distribution is not available in the list of offered images, you can upload your own.

We will cover the installation of Visual Studio Code on three operating systems: Windows, macOS, and Linux (Ubuntu 24.04 distribution).

For Windows

Visual Studio Code supports installation on Windows 10 and Windows 11. It also supports Windows Server distributions, from version 2016 to 2022. We will install it on Windows 10.

Go to the official website and download the installer. This will download an .exe installation file.

Run the installer file.

In the first step, accept the license agreement by selecting the option "I accept the agreement".

Next, the installer will prompt you to choose an installation location. You can choose the default path suggested by the installer or specify your own.

If necessary, you can create a shortcut for the program in the Windows menu. If you don’t want to create a shortcut, select the option "Don’t create a Start Menu folder" at the bottom.

The next step lets you configure additional options by ticking the corresponding checkboxes:

  • Create a desktop icon: creates a shortcut on the desktop for quick access to the program.
  • Add “Open with Code” action to Windows Explorer file context menu: adds the "Open with Code" option to the context menu when right-clicking on a file. This option allows you to quickly open any file directly in Visual Studio Code.
  • Add “Open with Code” action to Windows Explorer directory context menu: similar to the above option but adds the "Open with Code" option to the context menu of directories (folders).
  • Register Code as an editor for supported file types: makes Visual Studio Code the default editor for certain file types (e.g., .c, .cpp, .py, .java, .js, .html files).
  • Add to PATH (requires shell restart): adds Visual Studio Code to the system’s PATH variable so it can be launched from the command line (cmd).

Once all necessary options are set, Visual Studio Code is ready for installation. Click Install.

After the installation is complete, you can launch the program immediately.

For macOS

Go to the official website and download the installer.

After downloading, you will have a ZIP archive. Inside the archive, you will find the application file, which you need to extract to the "Applications" directory.

On the first launch, the system will warn you that this file was downloaded from the internet and ask you to confirm that you want to open it. Click Open to continue.

For Linux (Ubuntu)

Visual Studio Code supports installation on Linux distributions such as Ubuntu, Debian, Red Hat, Fedora, and SUSE.

You need a graphical desktop environment to install Visual Studio Code on Linux (GNOME, KDE, Xfce, etc.).

Let’s consider the installation of Visual Studio Code on Ubuntu 24.04 with the Xfce desktop environment. You can also install Visual Studio Code using Snapcraft.

Go to the official website and download the installer for your Linux distribution. In our case, we need the .deb installer.

Once the file is downloaded, open a terminal (console) and navigate to the directory where the file was downloaded (e.g., /root).

To install, run the following command where code_1.96.2-1734607745_amd64.deb is the name of the downloaded file:

dpkg -i code_1.96.2-1734607745_amd64.deb

During installation, a message will prompt you to add Microsoft repositories to the system.
Select <Yes> and press Enter.

Wait for the installation to complete. Once the installation is finished, you can launch the program from the applications menu (for distributions using Xfce, Visual Studio Code is available in the menu: Applications → Development).

Adding Python Interpreter to PATH Variable in Windows

If you did not check the Add python.exe to PATH checkbox during the Python installation on Windows, you need to manually add the full path to the interpreter to run Python from the command line. To do this:

  1. Press Win+R, type sysdm.cpl in the Run window, and press Enter.
  2. In the window that opens, go to the Advanced tab and click on the Environment Variables button.
  3. To add a user-level variable, select the Path variable under User variables and click on Edit.
  4. Double-click on an empty field or click the New button.
  5. Enter the full path to the directory that contains the Python interpreter. By default, the Python interpreter is located at the following path: C:\Users\<Username>\AppData\Local\Programs\Python\Python313. For example: C:\Users\Administrator\AppData\Local\Programs\Python\Python313
  6. After entering the path, click OK to save the changes.
  7. To verify, open the command prompt and type python. If the path to the interpreter is correctly specified, the Python console will open.

Setting Up Python Interpreter in Visual Studio Code

In Windows

Once Python is installed, you need to connect it to Visual Studio Code. To do this:

Open Visual Studio Code and click the New File... button on the home page to create a new file.

Alternatively, you can create a Python project in Visual Studio Code by clicking on Open Folder…, where you can select the entire project folder containing the files.

Type any name for the file, use the .py extension, and press Enter.

Save the file in any location. Ensure that the file name ends with the .py extension.
Once the file is saved, the interface in Visual Studio Code will display a prompt at the bottom right, suggesting you install the recommended Python extension.

To run Python in Visual Studio Code, you first need to select the Python interpreter. A button will appear at the bottom of the panel with a warning: Select Interpreter. Click on it.

In the menu that appears, select Enter interpreter path… and press Enter.

Specify the full path to the Python interpreter. By default, it is located at:

C:\Users\Administrator\AppData\Local\Programs\Python\Python313

where Administrator is your user account name.

Select the file named python and click on Select Interpreter.

To test it, write a simple program that calculates the square root of a number:

import math

num = float(input("Enter a number to find its square root: "))

if num >= 0:
    sqrt_result = math.sqrt(num)
    print(f"The square root of {num} is {sqrt_result}")
else:
    print("Square root of a negative number is not real.")

In Visual Studio Code, to run the Python code, click the Run Python File button at the top-right.

If the interpreter is set up correctly, the program will run successfully.

In macOS

On macOS operating systems, the Python interpreter is automatically recognized. Simply create a new .py file as described in the Windows section above and run the program directly.

In Ubuntu

Similarly to macOS, in the Ubuntu distribution, VS Code automatically detects the installed Python interpreter in the system. All you need to do is create a new .py file and run the program directly.

Recommended Extensions for Python in Visual Studio Code

VS Code offers a wide range of Python extensions (plugins) that simplify the development process. Here are some of the most popular ones:

Pylance

Pylance provides code analysis, autocompletion, and IntelliSense support, making Python development more efficient and user-friendly. Key features include fast autocompletion, type checking, and IntelliSense support.
Jupyter

The Jupyter extension is a powerful tool for working with interactive notebooks directly in the editor. It’s especially useful for data analysis, machine learning, visualization, and interactive programming tasks.

autoDocstring (Python Docstring Generator)

autoDocstring is a popular extension that helps automatically generate docstrings for Python functions, methods, and classes. Docstrings improve code readability and serve as built-in documentation.

isort

isort is a tool for automatically sorting and organizing imports in Python code. You can configure it in Visual Studio Code to make working with imports easier and to improve code readability.

Conclusion

This article covered the installation and setup of Visual Studio Code for Python development. Visual Studio Code offers full support for Python and provides the ability to extend its functionality through various plugins, making the coding process easier and more efficient.
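When several Python versions are installed, as in the macOS setup described earlier, it can be unclear which interpreter VS Code actually runs. A minimal sketch for checking this, using only the standard library, is to run the following file with the Run Python File button and compare the printed path with the interpreter you selected:

```python
import platform
import sys

# Full path of the interpreter executing this script
print("Interpreter path:", sys.executable)

# Version string in the form major.minor.patch
print("Python version:", platform.python_version())

# In a virtual environment, sys.prefix differs from sys.base_prefix
in_virtualenv = sys.prefix != sys.base_prefix
print("Running inside a virtual environment:", in_virtualenv)
```

If the path or version is not the expected one, selecting a different interpreter in VS Code and running the file again should change the output.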
05 February 2025 · 10 min to read
