How to Parse HTML with Python

Hostman Team
Technical writer
Python
11.02.2025
Reading time: 13 min

Parsing is the automatic search for various patterns (based on pre-defined structures) in text data sources to extract specific information.

Although parsing is a broad term, it most commonly refers to the process of collecting and analyzing data from remote web resources.

In Python, you can create programs that parse data from external websites using two key tools:

  • Standard HTTP request package
  • External HTML markup processing libraries

However, data processing capabilities are not limited to just HTML documents.

Thanks to a wide range of external libraries in Python, you can organize parsing for documents of any complexity, whether they are arbitrary text, popular markup languages (e.g., XML), or even rare programming languages.

If there is no suitable parsing library available, you can implement it manually using low-level methods that Python provides by default, such as simple string searching or regular expressions. Although, of course, this requires additional skills.
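As a rough illustration of the low-level approach, the sketch below uses the standard re module to pull a title and link addresses out of a small HTML string. The markup is a made-up fragment, and the patterns assume simple, well-formed input with double-quoted attributes; regular expressions are fragile on real-world HTML.

```python
import re

# Hypothetical HTML fragment used only for this illustration
html = (
    '<html><head><title>Example page</title></head>'
    '<body><a href="https://hostman.com">Hostman</a> '
    '<a href="/tutorials/">Tutorials</a></body></html>'
)

# Text between <title> and </title>; re.S lets "." match newlines, re.I ignores case
title_match = re.search(r"<title>(.*?)</title>", html, re.S | re.I)
title = title_match.group(1).strip() if title_match else None

# Values of all double-quoted href attributes
links = re.findall(r'href="([^"]*)"', html)

print(title)
print(links)
```

This works for the fragment above, but nested tags, single-quoted attributes, or comments would already require more elaborate patterns, which is why dedicated parsers are usually preferred.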

This guide will cover how to organize parsers in Python. We will focus on extracting data from HTML pages based on specified tags and attributes.

All examples in this guide were run with the Python 3.10.12 interpreter on a Hostman cloud server with Ubuntu 22.04, using pip 22.0.2 as the package manager.

Structure of an HTML Document

Any document written in HTML consists of two types of tags:

  1. Opening: Defined within less-than (<) and greater-than (>) symbols, e.g., <div>.

  2. Closing: Defined within less-than (<) and greater-than (>) symbols with a forward slash (/), e.g., </div>.

Each tag can have various attributes, the values of which are written in quotes after the equal sign. Some commonly used attributes include:

  • href: Link to a resource. E.g., href="https://hostman.com".
  • class: The class of an object. E.g., class="surface panel panel_closed".
  • id: Identifier of an object. E.g., id="menu".

Each tag, with or without attributes, is an element (object) of the so-called DOM (Document Object Model) tree, which is built by practically any HTML interpreter (parser).

This builds a hierarchy of elements, where nested tags are child elements to their parent tags.

For example, in a browser, we access elements and their attributes through JavaScript scripts. In Python, we use separate libraries for this purpose. The difference is that after parsing the HTML document, the browser not only constructs the DOM tree but also displays it on the monitor.

<!DOCTYPE html>

<html>
    <head>
        <title>This is the page title</title>
    </head>

    <body>
        <h1>This is a heading</h1>
        <p>This is a simple text.</p>
    </body>
</html>

The markup of this page is built with tags in a hierarchical structure without specifying any attributes:

  • html
    • head
      • title
    • body
      • h1
      • p

Such a document structure is more than enough to extract information. We can parse the data by reading the data between opening and closing tags.
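Reading the data between opening and closing tags can be sketched with the HTMLParser class from Python's standard library. The TextCollector class is a hypothetical helper written for this illustration; it assumes every opening tag has a matching closing tag, so void elements such as <br> or <img> would need extra handling.

```python
from html.parser import HTMLParser

html = """<!DOCTYPE html>
<html>
    <head>
        <title>This is the page title</title>
    </head>
    <body>
        <h1>This is a heading</h1>
        <p>This is a simple text.</p>
    </body>
</html>"""


class TextCollector(HTMLParser):
    """Records (depth, tag, text) for every element that directly contains text."""

    def __init__(self):
        super().__init__()
        self.stack = []   # currently open tags, outermost first
        self.found = []   # (depth, tag, text) tuples

    def handle_starttag(self, tag, attrs):
        self.stack.append(tag)

    def handle_endtag(self, tag):
        # Assumes well-nested markup: the closing tag matches the innermost open tag
        if self.stack and self.stack[-1] == tag:
            self.stack.pop()

    def handle_data(self, data):
        text = data.strip()
        if text and self.stack:
            self.found.append((len(self.stack) - 1, self.stack[-1], text))


collector = TextCollector()
collector.feed(html)
for depth, tag, text in collector.found:
    print("  " * depth + f"{tag}: {text}")
```

The depth value reflects the position of each tag in the hierarchy listed above: title, h1, and p all sit two levels below html.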

However, real website tags have additional attributes that specify both the specific function of the element and its special styling (described in separate CSS files):

<!DOCTYPE html>
<html>
    <body>
        <h1 class="h1_bright">This is a heading</h1>
        <p>This is simple text.</p>

        <div class="block" href="https://hostman.com/products/cloud-server">
            <div class="block__title">Cloud Services</div>
            <div class="block__information">Cloud Servers</div>
        </div>

        <div class="block" href="https://hostman.com/products/vps-server-hosting">
            <div class="block__title">VPS Hosting</div>
            <div class="block__information">Cloud Infrastructure</div>
        </div>

        <div class="block" href="https://hostman.com/services/app-platform">
            <div class="block__title">App Platform</div>
            <div class="block__information">Apps in the Cloud</div>
        </div>
    </body>
</html>

Thus, in addition to explicitly specified tags, the required information can be refined with specific attributes, extracting only the necessary elements from the DOM tree.
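Filtering by attributes can be sketched in the same way. The example below is a minimal illustration using the standard library's HTMLParser on a shortened copy of the markup above; BlockParser is a hypothetical helper, and it assumes each title div contains plain text only.

```python
from html.parser import HTMLParser

html = """
<div class="block" href="https://hostman.com/products/cloud-server">
    <div class="block__title">Cloud Services</div>
    <div class="block__information">Cloud Servers</div>
</div>
<div class="block" href="https://hostman.com/services/app-platform">
    <div class="block__title">App Platform</div>
    <div class="block__information">Apps in the Cloud</div>
</div>
"""


class BlockParser(HTMLParser):
    """Collects the href of each div.block and the text of its div.block__title."""

    def __init__(self):
        super().__init__()
        self.blocks = []        # one dict per div with class "block"
        self.in_title = False   # True while inside a div with class "block__title"

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        classes = (attrs.get("class") or "").split()
        if tag == "div" and "block" in classes:
            self.blocks.append({"href": attrs.get("href"), "title": None})
        elif tag == "div" and "block__title" in classes:
            self.in_title = True

    def handle_endtag(self, tag):
        # Title divs hold only text here, so any closing tag ends the title
        self.in_title = False

    def handle_data(self, data):
        if self.in_title and self.blocks:
            self.blocks[-1]["title"] = data.strip()


parser = BlockParser()
parser.feed(html)
for block in parser.blocks:
    print(block["title"], "->", block["href"])
```

High-level libraries such as BeautifulSoup hide this bookkeeping behind a single search call, as later sections of this guide show.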

HTML Data Parser Structure

Web pages can be of two types:

  • Static: During the loading and viewing of the site, the HTML markup remains unchanged. Parsing does not require emulating the browser's behavior.

  • Dynamic: During the loading and viewing of the site, the HTML markup is modified using JavaScript, as in a single-page application (SPA). Parsing requires emulating the browser's behavior.

Parsing static websites is relatively simple—after making a remote request, the necessary data is extracted from the received HTML document.

Parsing dynamic websites requires a more complex approach. After making a remote request, both the HTML document itself and the JavaScript scripts controlling it are downloaded to the local machine. These scripts, in turn, usually perform several remote requests automatically, loading additional content and modifying the HTML document while viewing the page.

Because of this, parsing dynamic websites requires emulating the browser’s behavior and user actions on the local machine. Without this, the necessary data simply won’t load.
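The effect can be seen in a small sketch. The page below is a made-up example in which a script writes the visible text into an empty container at run time; a static parser (here the standard library's HTMLParser, with a hypothetical AppTextParser helper) only sees the raw markup and therefore finds nothing inside the container.

```python
from html.parser import HTMLParser

# Hypothetical page: the visible text is written into #app by a script at run time
html = """<html><body>
<div id="app"></div>
<script>document.getElementById("app").textContent = "Loaded by JavaScript";</script>
</body></html>"""


class AppTextParser(HTMLParser):
    """Collects the text found inside <div id="app"> in the raw markup."""

    def __init__(self):
        super().__init__()
        self.inside_app = False
        self.text = ""

    def handle_starttag(self, tag, attrs):
        if tag == "div" and dict(attrs).get("id") == "app":
            self.inside_app = True

    def handle_endtag(self, tag):
        if tag == "div":
            self.inside_app = False

    def handle_data(self, data):
        if self.inside_app:
            self.text += data


parser = AppTextParser()
parser.feed(html)
print(repr(parser.text))  # the container is empty because the script never ran
```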

Most modern websites load additional content using JavaScript scripts in one way or another.

The variety of technical implementations of modern websites is so large that they can’t be classified as entirely static or entirely dynamic.

Typically, general information is loaded initially, while specific information is loaded later.

Most HTML parsers are designed for static pages. Systems that emulate browser behavior to generate dynamic content are much less common.

In Python, libraries (packages) intended for analyzing HTML markup can be divided into two groups:

  1. Low-level processors: Compact, but syntactically complex packages with a complicated implementation that parse HTML (or XML) syntax and build a hierarchical tree of elements.

  2. High-level libraries and frameworks: Large, but syntactically concise packages with a wide range of features to extract formalized data from raw HTML documents. This group includes not only compact HTML parsers but also full-fledged systems for data scraping. Often, these packages use low-level parsers (processors) from the first group as their core for parsing.

Several low-level libraries are available for Python:

  • lxml: A low-level XML syntax processor that is also used for HTML parsing. It is based on the popular libxml2 library written in C.

  • html5lib: A Python library for HTML syntax parsing, written according to the HTML specification by WHATWG (The Web Hypertext Application Technology Working Group), which is followed by all modern browsers.

However, using high-level libraries is faster and easier—they have simpler syntax and a wider range of functions:

  • BeautifulSoup: A simple yet flexible library for Python that allows parsing HTML and XML documents by creating a full DOM tree of elements and extracting the necessary data.
  • Scrapy: A full-fledged framework for parsing data from HTML pages, consisting of autonomous “spiders” (web crawlers) with pre-defined instructions.
  • Selectolax: A fast HTML page parser that uses CSS selectors to extract information from tags.
  • Parsel: A Python library with a specific selector syntax that allows you to extract data from HTML, JSON, and XML documents.
  • requests-html: A Python library that combines HTTP requests with HTML parsing and offers jQuery-style CSS selectors, similar to those used in browser JavaScript.

This guide will review several of these high-level libraries.

Installing the pip Package Manager

We can install all parsing libraries (as well as many other packages) in Python through the standard package manager, pip, which needs to be installed separately.

First, update the package index:

sudo apt update

Then, install pip using the APT package manager:

sudo apt install python3-pip -y

The -y flag will automatically confirm all terminal prompts during the installation.

To verify that pip was installed correctly, check its version:

pip3 --version

The terminal will display the pip version and the installation path:

pip 22.0.2 from /usr/lib/python3/dist-packages/pip (python 3.10)

As shown, this guide uses pip version 22.0.2.

Installing the HTTP Requests Package

Requests is a third-party package for making requests to remote servers. It is not part of Python's standard library, but on Ubuntu it is often already present as a system package. We will use it in the examples of this guide.

However, in some cases, it might not be installed. Then, you can manually install requests via pip:

pip install requests

If the system already has it, you will see the following message in the terminal:

Requirement already satisfied: requests in /usr/lib/python3/dist-packages (2.25.1)

Otherwise, the command will add requests to the list of available packages for import in Python scripts.

Using BeautifulSoup

To install BeautifulSoup version 4, use pip:

pip install beautifulsoup4

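After this, the library will be available for import in Python scripts. BeautifulSoup relies on an underlying parser: it can fall back on Python's built-in html.parser, but this guide uses the previously mentioned low-level HTML processors, which have to be installed separately.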

First, install lxml:

pip install lxml

Then install html5lib:

pip install html5lib

In the future, you can specify one of these processors as the core parser for BeautifulSoup in your Python code.

Create a new file in your home directory:

nano bs.py

Add the following code:

import requests
from bs4 import BeautifulSoup

# Request to the website 'https://hostman.com'
response = requests.get('https://hostman.com')

# Parse the HTML content of the page using 'html5lib' parser
page = BeautifulSoup(response.text, 'html5lib')

# Extract the title of the page
pageTitle = page.find('title')
print(pageTitle)
print(pageTitle.string)

print("")

# Extract all <a> links on the page
pageLinks = page.find_all('a')

# Print the text of the first 3 links (if they exist)
for link in pageLinks[:3]:
    print(link.string)

print("")

# Find all div elements with a class starting with 'socials--'
social_links_containers = page.find_all('div', class_=lambda c: c and c.startswith('socials--'))

# Collect the links from these divs
for container in social_links_containers:
    links = container.find_all('a', href=True)
    for link in links:
        href = link['href']

        # Ignore links related to Cloudflare's email protection
        if href.startswith('/cdn-cgi/l/email-protection'):
            continue

        print(href)

Now run the script:

python3 bs.py

This will produce the following console output:

<title>Hostman - Cloud Service Provider with a Global Cloud Infrastructure</title>
Hostman - Cloud Service Provider with a Global Cloud Infrastructure

Partners
Tutorials
API

https://wa.me/35795959804
https://twitter.com/hostman_com
https://www.facebook.com/profile.php?id=61556075738626
https://github.com/hostman-cloud
https://www.linkedin.com/company/hostman-inc/about/
https://www.reddit.com/r/Hostman_com/

Of course, instead of html5lib, you can specify lxml:

page = BeautifulSoup(response.text, 'lxml')

However, html5lib is the safer choice when fidelity matters: it parses markup the same way modern browsers do, following the WHATWG HTML specification. lxml is built on an XML toolkit and is considerably faster, but it can interpret malformed markup differently.

Although BeautifulSoup has a concise syntax, it does not emulate a browser, so it cannot parse content that is loaded dynamically by JavaScript.

Using Scrapy

The Scrapy framework is implemented in a more object-oriented manner. In Scrapy, website parsing is based on three core entities:

  • Spiders: Classes that contain information about parsing details for specified websites, including URLs, element selectors (CSS or XPath), and page browsing mechanisms.

  • Items: Variables for storing extracted data, which are more complex forms of Python dictionaries with a special internal structure.

  • Pipelines: Intermediate handlers for extracted data that can modify items and interact with external software (such as databases).
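To make the pipeline idea more concrete, here is a minimal sketch. In Scrapy, a pipeline is an ordinary Python class that exposes a process_item(item, spider) method, and items can be plain dictionaries. The class name TitleCleanupPipeline and the "title" field are hypothetical; in a real project the class would also have to be registered in the ITEM_PIPELINES setting.

```python
class TitleCleanupPipeline:
    """Hypothetical pipeline: normalizes whitespace in an item's 'title' field."""

    def process_item(self, item, spider):
        title = item.get("title") or ""
        # Collapse runs of whitespace and trim both ends
        item["title"] = " ".join(title.split())
        return item


# Stand-alone check outside Scrapy; inside a project Scrapy calls process_item itself
item = {"title": "  Hostman -   Cloud Service Provider \n", "url": "https://hostman.com"}
cleaned = TitleCleanupPipeline().process_item(item, spider=None)
print(cleaned["title"])
```

Because the method returns the item, several pipelines can be chained, each receiving the result of the previous one.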

You can install Scrapy through the pip package manager:

pip install scrapy

After that, you need to initialize a parser project, which creates a separate directory with its own folder structure and configuration files:

scrapy startproject parser

Now, you can navigate to the newly created directory:

cd parser

Check the contents of the current directory:

ls

It has a general configuration file and a directory with project source files:

parser scrapy.cfg

Move to the source files directory:

cd parser

If you check its contents:

ls

You will see both special Python scripts, each performing its function, and a separate directory for spiders:

__init__.py items.py middlewares.py pipelines.py settings.py spiders

Let's open the settings file:

nano settings.py

By default, most parameters are commented out with the hash symbol (#). For the parser to work correctly, you need to uncomment some of these parameters without changing the default values specified in the file:

  • USER_AGENT
  • ROBOTSTXT_OBEY
  • CONCURRENT_REQUESTS
  • DOWNLOAD_DELAY
  • COOKIES_ENABLED

Each specific project will require a more precise configuration of the framework. You can find all available parameters in the official documentation.
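After uncommenting, the relevant part of settings.py might look roughly like the fragment below. The values shown are illustrative assumptions; the defaults in the generated file differ between Scrapy versions, so keep whatever your file already contains.

```python
# settings.py (fragment): illustrative values, keep the ones from your generated file
USER_AGENT = "parser (+http://www.yourdomain.com)"  # identifies the crawler to the site
ROBOTSTXT_OBEY = True      # respect the rules in the site's robots.txt
CONCURRENT_REQUESTS = 16   # maximum number of simultaneous requests
DOWNLOAD_DELAY = 3         # delay in seconds between requests to the same site
COOKIES_ENABLED = False    # do not store or send cookies
```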

After that, you can generate a new spider:

scrapy genspider hostmanspider hostman.com

After running the above command, the console should display a message about the creation of a new spider:

Created spider 'hostmanspider' using template 'basic' in module:
parser.spiders.hostmanspider

Now, if you check the contents of the spiders directory:

ls spiders

You will see the source file generated for the new spider:

__init__.py  __pycache__  hostmanspider.py

Let's open the script file:

nano spiders/hostmanspider.py

And fill it with the following code:

import scrapy  # Core package of the Scrapy framework


class HostmanSpider(scrapy.Spider):  # Spider class inherits from scrapy.Spider
    name = 'hostmanspider'  # Name of the spider, used by "scrapy crawl"

    def start_requests(self):
        urls = ["https://hostman.com"]
        for url in urls:
            yield scrapy.Request(url=url, callback=self.parse)

    def parse(self, response):
        # Extract the page title from the server response using a CSS selector
        dataTitle = response.css("title::text").get(default="")

        # Extract all <a> elements (as HTML strings) using a CSS selector
        dataA = response.css("a").getall()

        # Overwrite the 'output' file with the title and the first 3 links (if they exist)
        with open("output", "w") as someFile:
            someFile.write(dataTitle + "\n\n")
            for link in dataA[:3]:
                someFile.write(link + "\n")

You can now run the created spider with the following command:

scrapy crawl hostmanspider

Running the spider will create an output file in the current directory. To view the contents of this file, you can use:

cat output

The content of this file will look something like this:

Hostman - Cloud Service Provider with a Global Cloud Infrastructure

<a href="/partners/" itemprop="url" class="body4 medium nd-link-primary"><span itemprop="name">Partners</span></a>
<a href="/tutorials/" itemprop="url" class="body4 medium nd-link-primary"><span itemprop="name">Tutorials</span></a>
<a href="/api-docs/" itemprop="url" class="body4 medium nd-link-primary"><span itemprop="name">API</span></a>
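The href values in this output are relative to the site root. If absolute URLs are needed, one option is the urljoin function from the standard library, sketched below with the base address and paths taken from the output above. Inside a Scrapy callback, response.urljoin(href) resolves a link against the URL of the current page in the same way.

```python
from urllib.parse import urljoin

base_url = "https://hostman.com"
hrefs = ["/partners/", "/tutorials/", "/api-docs/"]

# Resolve each relative link against the base address
absolute = [urljoin(base_url, href) for href in hrefs]
print(absolute)
```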

You can find more detailed information on extracting data using selectors (both CSS and XPath) in the official Scrapy documentation.

Conclusion

Data parsing from remote sources in Python is made possible by two main components:

  1. A package for making remote requests
  2. Libraries for parsing data

These libraries can range from simple ones, suitable only for parsing static websites, to more complex ones that can emulate browser behavior and, consequently, parse dynamic websites.

In Python, the most popular libraries for parsing static data are:

  • BeautifulSoup
  • Scrapy

These tools, similar to JavaScript DOM methods (e.g., getElementsByClassName() or querySelectorAll() with CSS selectors), allow us to extract data (attributes and text) from the DOM tree elements of any HTML document.

Python
11.02.2025
Reading time: 13 min

Similar

Python

Python Static Method

A static method in Python is bound to the class itself rather than any instance of that class. So, you can call it without first creating an object and without access to instance data (self).  To create a static method we need to use a decorator, specifically @staticmethod. It will tell Python to call the method on the class rather than an instance. Static methods are excellent for utility or helper functions that are logically connected to the class but don't need to access any of its properties.  When To Use & Not to Use a Python Static Method Static methods are frequently used in real-world code for tasks like input validation, data formatting, and calculations—especially when that logic naturally belongs with a class but doesn't need its state. Here's an example from a User class that checks email format: class User: @staticmethod def is_valid_email(email): return "@" in email and "." in email This method doesn't depend on any part of the User instance, but conceptually belongs in the class. It can be used anywhere as User.is_valid_email(email), keeping your code cleaner and more organized. If the logic requires access to or modification of instance attributes or class-level data, avoid using a static method as it won't help here. For instance, if you are working with user settings or need to monitor object creation, you will require a class method or an instance method instead. class Counter: count = 0 @classmethod def increment(cls): cls.count += 1 In this example, using a static method would prevent access to cls.count, making it useless for this kind of task. Python Static Method vs Class Method Though they look similar, class and static methods in Python have different uses; so, let's now quickly review their differences. Defined inside a class, a class method is connected to that class rather than an instance. Conventionally called cls, the class itself is the first parameter; so, it can access and change class-level data. 
Factory patterns, alternate constructors, or any activity applicable to the class as a whole and not individual instances are often implemented via class methods. Conversely, a static method is defined within a class but does not start with either self or cls parameters. It is just a regular function that “lives” inside a class but doesn’t interact with the class or its instances. For utility tasks that are conceptually related to the class but don’t depend on its state, static methods are perfect. Here's a quick breakdown of the Python class/static methods differences: Feature Class Method Static Method Binding Bound to the class Not bound to class or instance First parameter cls (class itself) None (no self or cls) Access to class/instance data Yes No Common use cases Factory methods, class-level behavior Utility/helper functions Decorator @classmethod @staticmethod Python Static Method vs Regular Functions You might ask: why not just define a function outside the class instead of using a static method? The answer is structure. A static method keeps related logic grouped within the class, even if it doesn't interact with the class or its instances. # Regular function def is_even(n): return n % 2 == 0 # Static method inside a class class NumberUtils: @staticmethod def is_even(n): return n % 2 == 0 Both functions do the same thing, but placing is_even inside NumberUtils helps keep utility logic organized and easier to find later. Let’s proceed to the hands-on Python static method examples. Example #1 Imagine that we have a MathUtils class that contains a static method for calculating the factorial: class MathUtils: @staticmethod def factorial(n): if n == 0: return 1 else: return n * MathUtils.factorial(n-1) Next, let's enter: print(MathUtils.factorial(5))120 We get the factorial of 5, which is 120. Here, the factorial static method does not use any attributes of the class instance, only the input argument n. 
And we called it using the MathUtils.factorial(n) syntax without creating an instance of the MathUtils class. In Python, static methods apply not only in classes but also in modules and packages. The @staticmethod decorator marks a function you define inside a class if it does not interact with instance-specific data. The function exists on its own; it is related to the class logically but is independent of its internal state. Managed solution for Backend development Example #2 Let's say we're working with a StringUtils module with a static method for checking if a string is a palindrome. The code will be: def is_palindrome(string):    return string == string[::-1] This function doesn't rely on any instance-specific data — it simply performs a check on the input. That makes it a good candidate for a static method. To organize it within a class and signal that it doesn't depend on the class state, we can use the @staticmethod decorator like this: class StringUtils:    @staticmethod    def is_palindrome(string):       return string == string[::-1] Let's enter for verification: print(StringUtils.is_palindrome("deed"))True print(StringUtils.is_palindrome("deer"))False That's correct, the first word is a palindrome, so the interpreter outputs True, but the second word is not, and we get False. So, we can call the is_palindrome method through the StringUtils class using the StringUtils.is_palindrome(string) syntax instead of importing the is_palindrome function and calling it directly. - Python static method and class instance also differ in that the static cannot affect the state of an instance. Since they do not have access to the instance, they cannot alter attribute values, which makes sense. Instance methods are how one may modify the instance state of a class. Example #3 Let's look at another example. 
Suppose we have a Person class that has an age attribute and a static is_adult method that checks the value against the age of majority: class Person:    def __init__(self, age):        self.age = age    @staticmethod    def is_adult(age):       return age >= 21 Next, let's create an age variable with a value of 24, call the is_adult static method from the Person class with this value and store its result in the is_adult variable, like this: age = 24is_adult = Person.is_adult(age) Now to test this, let's enter: print(is_adult)True Since the age matches the condition specified in the static method, we get True. In the example, the is_adult static method serves as an auxiliary tool—a helper function—accepting the age argument but without access to the age attribute of the Person class instance. Conclusion Static methods improve code readability and make it possible to reuse it. They are also more convenient when compared to standard Python functions. Static methods are convenient as, unlike functions, they do not call for a separate import. Therefore, applying Python class static methods can help you streamline and work with your code greatly. And, as you've probably seen from the examples above, they are quite easy to master. On our app platform you can find Python applications, such as Celery, Django, FastAPI and Flask. 
16 April 2025 · 6 min to read
Python

Input in Python

Python provides interactive capabilities through various tools, one of which is the input() function. Its primary purpose is to receive user input. This function makes Python programs meaningful because without user interaction, applications would have limited utility. How the Python Input Works This function operates as follows: user_name = input('Enter your name: ') user_age = int(input('How old are you? ')) First, the user is asked to enter their name, then their age. Both inputs are captured using a special operator that stores the entered values in the variables user_name and user_age. These values can then be used in the program. For example, we can create an age-based access condition for a website (by converting the age input to an integer using int()) and display a welcome message using the entered name: if user_age < 18: print('Sorry, access is restricted to adults only') else: print('Welcome to the site,', user_name, '!') So, what happens when int() receives an empty value? If the user presses Enter without entering anything, let's see what happens by extending the program: user_name = input('Enter your name: ') user_age = int(input('How old are you? ')) if user_age < 18: print('Sorry, access is restricted to adults only') else: print('Welcome to the site,', user_name, '!') input('Press Enter to go to the menu') print('Welcome to the menu') Pressing Enter moves the program to the next line of code. If there is no next line, the program exits. The last line can be written as: input('Press Enter to exit') If there are no more lines in the program, it will exit. Here is the complete version of the program: user_name = input('Enter your name: ') user_age = int(input('How old are you? 
')) if user_age < 18: print('Sorry, access is restricted to adults only') else: print('Welcome to the site,', user_name, '!') input('Press Enter to go to the menu') print('Welcome to the menu') input('Press Enter to exit') input('Press Enter to exit') If the user enters an acceptable age, they will see the message inside the else block. Otherwise, they will see only the if block message and the final exit prompt. The input() function is used four times in this program, and in the last two cases, it does not store any values but serves to move to the next part of the code or exit the program. input() in the Python Interpreter The above example is a complete program, but you can also execute it line by line in the Python interpreter. However, in this case, you must enter data immediately to continue: >>> user_name = input('Enter your name: ') Enter your name: Jamie >>> user_age = int(input('How old are you? ')) How old are you? 18 The code will still execute, and values will be stored in variables. This method allows testing specific code blocks. However, keep in mind that values are retained only until you exit the interactive mode. It is recommended to save your code in a .py file. Input Conversion Methods: int(), float(), split() Sometimes, we need to convert user input into a specific data type, such as an integer, a floating-point number, or a list. Integer conversion (int()) We've already seen this in a previous example: user_age = int(input('How old are you? ')) The int() function converts input into an integer, allowing Python to process it as a numeric type. By default, numbers entered by users are treated as strings, so Python requires explicit conversion. A more detailed approach would be: user_age = input('How old are you? ') user_age = int(user_age) The first method is shorter and more convenient, but the second method is useful for understanding function behavior. 
Floating-point conversion (float()) To convert user input into a floating-point number, use float(): height = float(input('Enter your height (e.g., 1.72): ')) weight = float(input('Enter your weight (e.g., 80.3): ')) Or using a more detailed approach: height = input('Enter your height (e.g., 1.72): ') height = float(height) weight = input('Enter your weight (e.g., 80.3): ') weight = float(weight) Now, the program can perform calculations with floating-point numbers. Converting Input into a List (split()) The split() method converts input text into a list of words: animals = input('Enter your favorite animals separated by spaces: ').split() print('Here they are as a list:', animals) Example output: Enter your favorite animals separated by spaces: cat dog rabbit fox bear Here they are as a list: ['cat', 'dog', 'rabbit', 'fox', 'bear'] Handling Input Errors Users often make mistakes while entering data or may intentionally enter incorrect characters. In such cases, incorrect input can cause the program to crash: >>> height = float(input('Enter your height (e.g., 1.72): ')) Enter your height (e.g., 1.72): 1m72 Traceback (most recent call last): File "<pyshell#2>", line 1, in <module> height = float(input('Enter your height (e.g., 1.72): ')) ValueError: could not convert string to float: '1m72' The error message indicates that Python cannot convert the string into a float. To prevent such crashes, we use the try-except block: try: height = float(input('Enter your height (e.g., 1.72): ')) except ValueError: height = float(input('Please enter your height in the correct format: ')) We can also modify our initial age-input program to be more robust: try: user_age = int(input('How old are you? ')) except ValueError: user_age = int(input('Please enter a number: ')) However, the program will still crash if the user enters incorrect data again. 
To make it more resilient, we can use a while loop: while True: try: height = float(input('Enter your height (e.g., 1.72): ')) break except ValueError: print('Let’s try again.') continue print('Thank you!') Here, we use a while loop with break and continue. The program works as follows: If the input is correct, the loop breaks, and the program proceeds to the final message: print('Thank you!'). If the program cannot convert input to a float, it catches an exception (ValueError) and displays the message "Let’s try again."  The continue statement prevents the program from crashing and loops back to request input again. Now, the user must enter valid data before proceeding. Here is the complete code for a more resilient program: user_name = input('Enter your name: ') while True: try: user_age = int(input('How old are you? ')) break except ValueError: print('Are you sure?') continue if user_age < 18: print('Sorry, access is restricted to adults only') else: print('Welcome to the site,', user_name, '!') input('Press Enter to go to the menu') print('Welcome to the menu') input('Press Enter to exit') This program still allows unrealistic inputs (e.g., 3 meters tall or 300 years old). To enforce realistic values, additional range checks would be needed, but that is beyond the scope of this article. 
08 April 2025 · 6 min to read
Python

Operators in Python

Python operators are tools used to perform various actions with variables, as well as numerical and other values called operands—objects on which operations are performed. There are several types of Python operators: Arithmetic Comparison Assignment Identity Membership Logical Bitwise This article will examine each of them in detail and provide examples. Arithmetic Operators For addition, subtraction, multiplication, and division, we use the Python operators +, -, *, and / respectively: >>> 24 + 28 52 >>> 24 - 28 -4 >>> 24 * 28 672 >>> 24 / 28 0.8571428571428571 For exponentiation, ** is used: >>> 5 ** 2 25 >>> 5 ** 3 125 >>> 5 ** 4 625 For floor division (integer division without remainder), // is used: >>> 61 // 12 5 >>> 52 // 22 2 >>> 75 // 3 25 >>> 77 // 3 25 The % operator returns the remainder (modulo division): >>> 62 % 6 2 >>> 65 % 9 2 >>> 48 % 5 3 >>> 48 % 12 0 Comparison Operators Python has six comparison operators: >, <, >=, <=, ==, !=. Note that equality in Python is written as ==, because a single = is used for assignment. The != operator is used for "not equal to." When comparing values, Python will return True or False depending on whether the expressions are true or false. >>> 26 > 58 False >>> 26 < 58 True >>> 26 >= 26 True >>> 58 <= 57 False >>> 50 == 50 True >>> 50 != 50 False >>> 50 != 51 True Assignment Operators A single = is used for assigning values to variables: >>> b = 5 >>> variants = 20 Python also provides convenient shorthand operators that combine arithmetic operations with assignment: +=, -=, *=, /=, //=, %=. For example: >>> cars = 5 >>> cars += 7 >>> cars 12 This is equivalent to: >>> cars = cars + 7 >>> cars 12 The first version is more compact. Other assignment operators work similarly: >>> train = 11 >>> train -= 2 >>> train 9 >>> moto = 3 >>> moto *= 7 >>> moto 21 >>> plain = 8 >>> plain /= 4 >>> plain 2.0 Notice that in the last case, the result is a floating-point number (float), not an integer (int). 
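Identity Operators

Python has two identity operators: is and is not. The results are True or False, similar to comparison operators. Unlike ==, however, is does not compare values: it checks whether both operands are the same object in memory.

>>> 55 is 55
True
>>> 55 is 56
False
>>> 55 is not 55
False
>>> 55 is not 56
True
>>> 55 is '55'
False
>>> '55' is "55"
True

In the last two examples:

  • 55 is '55' returned False because an integer and a string are different objects.
  • '55' is "55" returned True because Python does not differentiate between single and double quotes, and the interpreter reused one string object for both identical literals.

This reuse of objects for small integers and short strings is an implementation detail, so do not rely on it. Recent Python versions (3.8 and later) also print a SyntaxWarning when is is used with a literal. Use == to compare values and reserve is for checks such as x is None.

Membership Operators

There are only two membership operators in Python: in and not in. They check whether a certain value exists within a sequence. For example:

>>> wordlist = ('assistant', 'streetcar', 'fraudster', 'dancer', 'heat', 'blank', 'compass', 'commerce', 'judgment', 'approach')
>>> 'house' in wordlist
False
>>> 'assistant' in wordlist
True
>>> 'assistant' and 'streetcar' in wordlist
True

In the last case, a logical operator (and) was used. Note that in binds more tightly than and, so Python evaluates this expression as 'assistant' and ('streetcar' in wordlist). Only 'streetcar' is actually tested for membership; 'assistant' is simply a non-empty string, which counts as true. To test both words, write 'assistant' in wordlist and 'streetcar' in wordlist. This leads us to the next topic.

Logical Operators

Python has three logical operators: and, or, and not.

and gives a true result only if all operands are true. It can chain any number of operands. Using an example from the previous section:

>>> wordlist = ('assistant', 'streetcar', 'fraudster', 'dancer', 'heat', 'blank', 'compass', 'commerce', 'judgment', 'approach')
>>> 'assistant' and 'streetcar' in wordlist
True
>>> 'fraudster' and 'dancer' and 'heat' and 'blank' in wordlist
True
>>> 'fraudster' and 'dancer' and 'heat' and 'blank' and 'house' in wordlist
False

In each expression, only the last word is tested with in; the strings before it are non-empty and therefore true. Since 'house' is not in the sequence, the last result is False. Writing the membership tests out in full makes the intent explicit:

>>> 'blank' in wordlist and 'house' in wordlist
False

and also works with numerical comparisons:

>>> numbers = 54 > 55 and 22 > 21
>>> print(numbers)
False

One of the expressions is false, and and requires all conditions to be true.

or works differently: it gives a true result if at least one operand is true.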
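The same all-or-at-least-one logic applies to membership checks over several words. Because in binds more tightly than and, an expression such as 'a' and 'b' in seq tests only the last word. A minimal sketch of checking every word, or at least one word, explicitly with the built-ins all() and any(); the variable names misleading, required, all_present, and any_present are illustrative.

```python
wordlist = ('assistant', 'streetcar', 'fraudster', 'dancer', 'heat',
            'blank', 'compass', 'commerce', 'judgment', 'approach')

# Parsed as: 'house' and ('assistant' in wordlist).
# A non-empty string is truthy, so only the membership test decides the result.
misleading = 'house' and 'assistant' in wordlist
print(misleading)    # True, although 'house' is not in wordlist

required = ('assistant', 'house')

# all() mirrors `and`: every word must be present.
all_present = all(word in wordlist for word in required)
print(all_present)   # False

# any() mirrors `or`: one present word is enough.
any_present = any(word in wordlist for word in required)
print(any_present)   # True
```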
If we replace and with or in the numbers example, we get:

>>> numbers = 54 > 55 or 22 > 21
>>> print(numbers)
True

Here, 22 > 21 is true, so the overall expression evaluates to True, even though 54 > 55 is false.

not reverses logical values:

>>> first = True
>>> second = False
>>> print(not first)
False
>>> print(not second)
True

As seen in the example, not flips True to False and vice versa.

Bitwise Operators

Bitwise operators are used in Python to manipulate integers at the bit level. There are five of them (the two shift operators count as one type, as they differ only in shift direction):

  • & (AND)
  • | (OR)
  • ^ (XOR)
  • ~ (NOT)
  • << and >> (shift operators)

Bitwise operators are based on Boolean logic principles, applied to each pair of corresponding bits, and work as follows:

& (AND) returns 1 if both operands contain 1; otherwise, it returns 0:

>>> 1 & 1
1
>>> 1 & 0
0
>>> 0 & 1
0
>>> 0 & 0
0

| (OR) returns 1 if at least one operand contains 1, otherwise 0:

>>> 1 | 1
1
>>> 1 | 0
1
>>> 0 | 1
1
>>> 0 | 0
0

^ (XOR) returns 1 if the operands are different and 0 if they are the same:

>>> 1 ^ 1
0
>>> 1 ^ 0
1
>>> 0 ^ 1
1
>>> 0 ^ 0
0

~ (NOT) inverts every bit. For an integer x, the result is -x - 1:

>>> ~5
-6
>>> ~-5
4
>>> ~7
-8
>>> ~-7
6
>>> ~9
-10
>>> ~-9
8

<< and >> shift bits by a specified number of positions:

>>> 1 << 1
2
>>> 1 >> 1
0

To understand shifts, let’s break down values into bits:

0 = 00
1 = 01
2 = 10

Shifting 1 left by one bit gives 2, while shifting right results in 0. What happens if we shift by two positions?

>>> 1 << 2
4
>>> 1 >> 2
0

1 = 001
2 = 010
4 = 100

Shifting 1 two places to the left results in 4 (100 in binary). Shifting 1 to the right results in zero because its only set bit is discarded. Larger values keep their higher bits: for example, 8 >> 2 is 2.

For more details, refer to our article on bitwise operators.

Difference Between Operators and Functions

You may have noticed that we have included no functions in this overview. The confusion between operators and functions arises because both perform similar actions: they transform objects.
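However, they differ in scope:

  • Functions are broader. A function is a named block of code that can take any number of arguments and contain many statements.
  • Operators are fixed pieces of syntax that act on one or two operands.

In Python, a function body can consist of many operator expressions, but an operator has no body of its own, so it cannot contain a function definition.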
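One way to see how closely the two are related: the standard library's operator module exposes most operators as ordinary functions, and a class can define what an operator does through special methods such as __add__. A minimal sketch, assuming a hypothetical Money class that is not part of this article or of Python itself:

```python
import operator
from functools import reduce

# Each operator has a function counterpart in the operator module.
print(operator.add(24, 28))        # 52, same as 24 + 28
print(operator.floordiv(61, 12))   # 5, same as 61 // 12
print(operator.xor(1, 0))          # 1, same as 1 ^ 0

# The function form can be passed as an argument; the symbol * cannot.
product = reduce(operator.mul, [5, 5, 5])
print(product)                     # 125, same as 5 ** 3


class Money:
    """Hypothetical class: the + operator calls its __add__ method."""

    def __init__(self, amount: int) -> None:
        self.amount = amount

    def __add__(self, other: "Money") -> "Money":
        return Money(self.amount + other.amount)


total = Money(5) + Money(7)
print(total.amount)                # 12
```

The function forms are mainly useful where a callable is expected, for example as an argument to reduce() or map().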
08 April 2025 · 6 min to read
