Python Sets and Python Sets Operations

Adnene Mabrouk
Technical writer
Python
25.06.2024
Reading time: 4 min

Sets are a built-in Python data type: an unordered, mutable collection that does not allow duplicate elements. Because they are backed by hash tables, sets offer fast membership tests and direct support for mathematical set operations, which makes them useful for tasks ranging from removing duplicates to comparing the results of database queries.

Importance of Sets in Programming

Sets are crucial in programming for several reasons:

  1. Uniqueness: Ensures all elements are unique, which is useful for removing duplicates.

  2. Efficiency: Set operations like membership tests (checking if an item is in a set) are optimized, offering average-case O(1) time complexity.

  3. Mathematical Operations: Sets support operations such as union, intersection, and difference, aligning with their mathematical counterparts, making them ideal for tasks like Venn diagrams and common item extraction.

  4. Use Cases: Sets are used in applications such as network analysis, database queries, and data cleaning.

Creating Sets in Python

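Creating a set in Python is straightforward. Sets can be created with the set() constructor or by listing elements inside curly braces. Empty braces {} create an empty dictionary, not a set, so use set() when you need an empty set.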

# Using set() constructor
set_a = set([1, 2, 3, 4, 5])

# Using curly braces
set_b = {1, 2, 3, 4, 5}

# Creating an empty set
empty_set = set()
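
A few creation details are easy to trip over. The short sketch below (all variable names are illustrative and not part of the article's examples) shows that empty braces produce a dict, that set() accepts any iterable, that a set comprehension works like a list comprehension, and that elements must be hashable.

```python
# Empty braces create a dict, not a set.
empty_braces = {}
empty_set = set()
print(type(empty_braces).__name__)  # dict
print(type(empty_set).__name__)     # set

# set() accepts any iterable; duplicate items collapse into one.
letters = set("hello")
print(len(letters))  # 4 (h, e, l, o)

# A set comprehension builds a set from an expression.
squares = {n * n for n in range(5)}
print(squares == {0, 1, 4, 9, 16})  # True

# Elements must be hashable, so a list cannot be a set element.
error_name = None
try:
    bad = {[1, 2], [3, 4]}
except TypeError as exc:
    error_name = type(exc).__name__
print(error_name)  # TypeError
```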

Basic Set Operations: Union, Intersection, Difference

Let's take a look at basic set operations.

Union

The union of two sets is a set containing all elements from both sets, without duplicates.

set_a = {1, 2, 3}
set_b = {3, 4, 5}
union_set = set_a | set_b
# union_set is {1, 2, 3, 4, 5}

Intersection

The intersection of two sets is a set containing only the elements that are present in both sets.

intersection_set = set_a & set_b
# intersection_set is {3}

Difference

The difference of two sets is a set containing the elements that are in the first set but not in the second. In other words, it subtracts the elements of one set from another.

difference_set = set_a - set_b
# difference_set is {1, 2}
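
Each of these operators also has a method form: union(), intersection(), and difference(). One practical difference, sketched below with an illustrative list_b variable, is that the methods accept any iterable, while the operators require both operands to be sets.

```python
set_a = {1, 2, 3}
list_b = [3, 4, 5]  # a plain list, not a set

# The method forms accept any iterable as an argument.
union_set = set_a.union(list_b)
intersection_set = set_a.intersection(list_b)
difference_set = set_a.difference(list_b)

print(union_set == {1, 2, 3, 4, 5})  # True
print(intersection_set == {3})       # True
print(difference_set == {1, 2})      # True

# The operator forms raise TypeError when one operand is not a set.
operator_failed = False
try:
    set_a | list_b
except TypeError:
    operator_failed = True
print(operator_failed)  # True
```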

Advanced Set Operations and Methods

Python sets offer various methods for advanced operations:

Symmetric Difference

The symmetric difference returns a set with elements in either of the sets, but not in both.

sym_diff_set = set_a ^ set_b
# sym_diff_set is {1, 2, 4, 5}

Subset and Superset

Use <= to check whether one set is a subset of another, and >= to check whether it is a superset.

set_c = {1, 2}
is_subset = set_c <= set_a  # True
is_superset = set_a >= set_c  # True
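
Related checks are worth knowing as well. The sketch below, which reuses the set names from the example above and adds a few illustrative result variables, shows the strict (proper) subset operator <, the issubset() method, and isdisjoint() for testing that two sets share no elements.

```python
set_a = {1, 2, 3}
set_c = {1, 2}

# < is a proper subset test: subset and not equal.
is_proper_subset = set_c < set_a   # True

# Every set is a subset of itself, but never a proper subset of itself.
equal_is_subset = set_a <= set_a   # True
equal_is_proper = set_a < set_a    # False

# The method form accepts any iterable, such as a list.
via_method = set_c.issubset([1, 2, 3])  # True

# isdisjoint() is True when the two sets have no elements in common.
no_overlap = set_c.isdisjoint({4, 5})   # True

print(is_proper_subset, equal_is_subset, equal_is_proper, via_method, no_overlap)
```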

Updating Sets

Sets can be updated with various methods:

  • add() to add a single element.

  • update() to add multiple elements.

set_a.add(6)
# set_a is now {1, 2, 3, 6}

set_a.update([7, 8])
# set_a is now {1, 2, 3, 6, 7, 8}
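
Sets can also be updated in place with the augmented operators |=, &=, and -=, which correspond to update(), intersection_update(), and difference_update(). A minimal sketch follows; the alias variable is only there to show that the same set object is modified rather than replaced.

```python
set_a = {1, 2, 3}
alias = set_a  # a second name for the same set object

set_a |= {3, 4}        # in-place union: {1, 2, 3, 4}
set_a &= {2, 3, 4, 9}  # in-place intersection: {2, 3, 4}
set_a -= {4}           # in-place difference: {2, 3}

print(set_a == {2, 3})  # True
print(alias is set_a)   # True: the original object was modified
```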

Removing Elements

You can use:

  • remove() to remove a specific element (raises KeyError if not found).

  • discard() to remove a specific element (does nothing if not found).

  • pop() to remove and return an arbitrary element.

set_a.remove(6)
# set_a is now {1, 2, 3, 7, 8}

set_a.discard(7)
# set_a is now {1, 2, 3, 8}

popped_element = set_a.pop()
# popped_element is 1 (example)
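
The difference between remove() and discard() only shows up when the element is missing. The sketch below uses an illustrative colors set to show both behaviours, and also that pop() raises KeyError on an empty set.

```python
colors = {"red", "green"}

# discard() silently ignores a missing element.
colors.discard("blue")

# remove() raises KeyError for a missing element.
remove_raised = False
try:
    colors.remove("blue")
except KeyError:
    remove_raised = True
print(remove_raised)  # True

# pop() removes arbitrary elements until the set is empty.
drained = []
while colors:
    drained.append(colors.pop())
print(sorted(drained))  # ['green', 'red']

# pop() on an empty set raises KeyError.
pop_raised = False
try:
    colors.pop()
except KeyError:
    pop_raised = True
print(pop_raised)  # True
```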

Practical Examples of Set Usage

For better understanding, let's see how to use sets in practice.

Removing Duplicates

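Converting a list to a set and back is an easy way to remove duplicates. Keep in mind that a set does not preserve the original order of the elements, so sort the result if order matters.

numbers = [1, 2, 2, 3, 4, 4, 5]
unique_numbers = sorted(set(numbers))
# unique_numbers is [1, 2, 3, 4, 5]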
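
When the original order of first appearance needs to be kept, a set alone is not enough. One common alternative, shown here as a sketch with illustrative variable names, is dict.fromkeys(), which relies on dictionaries preserving insertion order (guaranteed since Python 3.7).

```python
numbers = [3, 1, 3, 2, 1, 5]

# sorted(set(...)) removes duplicates and returns ascending order.
sorted_unique = sorted(set(numbers))
print(sorted_unique)  # [1, 2, 3, 5]

# dict.fromkeys() removes duplicates and keeps first-appearance order.
ordered_unique = list(dict.fromkeys(numbers))
print(ordered_unique)  # [3, 1, 2, 5]
```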

Membership Testing

Sets offer efficient membership testing, ideal for scenarios where this is frequently needed.

names = {"Alice", "Bob", "Charlie"}
is_present = "Bob" in names  # True

Finding Common Items

Sets can be used to find the elements that several lists have in common. Convert each list to a set, then take the intersection.

list1 = [1, 2, 3]
list2 = [2, 3, 4]
common_elements = set(list1) & set(list2)
# common_elements is {2, 3}
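
The same idea extends to any number of lists. In this sketch (the lists variable is illustrative), intersection() is called on the first list converted to a set, and the remaining lists are unpacked as arguments, which works because the method accepts arbitrary iterables.

```python
lists = [
    [1, 2, 3, 4],
    [2, 3, 4, 5],
    [3, 4, 6],
]

# Convert the first list to a set, then intersect with all the others.
common = set(lists[0]).intersection(*lists[1:])
print(common == {3, 4})  # True
```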

Performance Considerations with Sets

Sets are highly optimized for certain operations:

  1. Membership Testing: Average-case O(1)* time complexity, making them faster than lists and tuples for this purpose.

  2. Insertion and Deletion: Average-case O(1)* time complexity, allowing for quick modifications.

  3. Memory Usage: Sets typically use more memory than lists for storing the same number of elements due to the need to maintain hash tables for fast access.

* Average-case time complexity: on average, the operation takes a constant amount of time regardless of the size of the set. In the worst case, when many elements collide in the hash table, it can degrade to O(n).

Actual performance still depends on the specific application and the size of the data. When choosing a data structure, consider which operations you need most often and how efficiently each candidate performs them.
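
These trade-offs can be checked on your own machine. The sketch below is a rough measurement, not a benchmark: it assumes CPython, picks an arbitrary size of 100,000 integers, and looks up the last element, which is the worst case for a list. Exact timings and byte counts will vary by Python version and platform.

```python
import sys
import timeit

n = 100_000  # arbitrary size chosen for illustration
as_list = list(range(n))
as_set = set(as_list)
target = n - 1  # last element: worst case for a linear list scan

# Time 100 membership tests against each container.
list_time = timeit.timeit(lambda: target in as_list, number=100)
set_time = timeit.timeit(lambda: target in as_set, number=100)
print(f"list: {list_time:.6f} s, set: {set_time:.6f} s")

# Compare the memory used by the containers themselves
# (sys.getsizeof does not count the integer objects they reference).
list_bytes = sys.getsizeof(as_list)
set_bytes = sys.getsizeof(as_set)
print(f"list: {list_bytes} bytes, set: {set_bytes} bytes")
```

On a typical CPython build the set lookup is expected to be far faster and the set container noticeably larger, which matches the points in the list above.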

Conclusion

Python sets are a versatile and efficient tool for managing collections of unique items and performing various mathematical and logical operations. Their unique properties and methods make them indispensable for many programming tasks. Whether you're removing duplicates, testing for membership, or performing complex mathematical operations, Python sets provide a robust and efficient solution. By understanding and utilizing sets, programmers can enhance the performance and functionality of their code, making sets a vital part of the Python programming toolkit.
