Sign In
Sign In

How to Install and Use ClickHouse on Ubuntu

How to Install and Use ClickHouse on Ubuntu
Mohammad Waqas Shahid
Technical writer
ClickHouse
30.05.2024
Reading time: 7 min

ClickHouse is an open-source column-oriented database management system developed by Yandex, designed for real-time analytical processing of large volumes of data. It is specifically optimized for online analytical processing (OLAP) workloads, making it ideal for applications requiring fast analytical queries on vast datasets.

System Requirements

Before proceeding with the installation, ensure that your system meets the following requirements:

  • Ubuntu operating system (latest LTS version recommended for stability)

  • Minimum 2GB RAM (4GB or more recommended)

  • Adequate disk space for data storage

Update Package Lists

Open a terminal window and update the package lists by running the following command:

sudo apt update

Install ClickHouse

Use the following command to install ClickHouse on your Ubuntu system:

sudo apt-get install -y apt-transport-https ca-certificates curl gnupg
curl -fsSL 'https://packages.clickhouse.com/rpm/lts/repodata/repomd.xml.key' | sudo gpg --dearmor -o /usr/share/keyrings/clickhouse-keyring.gpg
echo "deb [signed-by=/usr/share/keyrings/clickhouse-keyring.gpg] https://packages.clickhouse.com/deb stable main" | sudo tee \
   /etc/apt/sources.list.d/clickhouse.list
sudo apt-get update

Start ClickHouse Server

Once the installation is complete, start the ClickHouse server with the following command:

sudo apt-get install -y clickhouse-server clickhouse-client

To verify that ClickHouse has been installed successfully, check its status by executing:

sudo service clickhouse-server start

Configuration and Setup

Launch the ClickHouse client by typing the following command in your terminal:

clickhouse-client

Create a Database

Inside the ClickHouse client, create a new database using the following SQL command:

CREATE DATABASE my_database;

Then use the command use database:

USE my_database;

This command selects my_database as the current database to execute further commands within.

Create a Table

Here's an example of how to create a table within the database. Suppose we want to create a table named my_table with the following columns: id (UInt32), name (String), and age (UInt8).

CREATE TABLE my_table
(
   id UInt32,
   name String,
   age UInt8
)
ENGINE = MergeTree()
ORDER BY id;

This command creates a table named my_table with three columns (id, name, and age). The MergeTree engine is used, which is suitable for large datasets. The ORDER BY id clause specifies that the data will be ordered by the id column, which is often used to optimize query performance.

Basic Usage and Commands

Insert Data

To insert data into a ClickHouse table, use the INSERT INTO SQL statement. For example:

INSERT INTO my_table (column1, column2) VALUES (1, 'value1');

Query Data

Retrieve data from a table using the SELECT SQL statement. Here's an example:

SELECT * FROM my_table;

Deleting Data

To delete rows from a ClickHouse table, use the DELETE SQL statement. When using the DELETE statement in ClickHouse, you need to specify conditions that determine which rows will be deleted. These conditions can be based on various comparisons and logical operations. Here are examples of different types of conditions and the corresponding SQL commands to delete rows from my_table.

Delete Rows Based on Equality

Delete rows where column1 is equal to 1.

ALTER TABLE my_table DELETE WHERE column1 = 1;

Delete Rows Based on Inequality

Delete rows where column1 is not equal to 1.

ALTER TABLE my_table DELETE WHERE column1 != 1;

Delete Rows Based on Greater Than

Delete rows where column1 is greater than 2.

ALTER TABLE my_table DELETE WHERE column1 > 2;

Delete Rows Based on Less Than

Delete rows where column1 is less than 3.

ALTER TABLE my_table DELETE WHERE column1 < 3;

Delete Rows Based on Greater Than or Equal To

Delete rows where column1 is greater than or equal to 3.

ALTER TABLE my_table DELETE WHERE column1 >= 3;

Delete Rows Based on Less Than or Equal To

Delete rows where column1 is less than or equal to 2.

ALTER TABLE my_table DELETE WHERE column1 <= 2;

Delete Rows Based on String Matching

Delete rows where column2 is equal to value1.

ALTER TABLE my_table DELETE WHERE column2 = 'value1';

Delete Rows Based on a Range

Delete rows where column1 is between 2 and 4.

ALTER TABLE my_table DELETE WHERE column1 BETWEEN 2 AND 4;

Delete Rows Based on Logical AND

Delete rows where column1 is greater than 2 and column2 is value3.

ALTER TABLE my_table DELETE WHERE column1 > 2 AND column2 = 'value3';

Delete Rows Based on Logical OR

Delete rows where column1 is less than 2 or column2 is value5.

ALTER TABLE my_table DELETE WHERE column1 < 2 OR column2 = 'value5';

Delete Rows Based on NULL

Delete rows where column2 is NULL (assuming column2 can be NULL).

ALTER TABLE my_table DELETE WHERE column2 IS NULL;

Modifying Data

Update existing records in a ClickHouse table with the UPDATE SQL statement:

UPDATE my_table SET column1 = value1 WHERE condition;

Advanced Features and Functions

Let's expand on each section with detailed explanations and error-free code snippets.

Materialized Views

Implement materialized views for efficient data aggregation. Create a materialized view with the following SQL command:

CREATE MATERIALIZED VIEW my_materialized_view ENGINE = MergeTree() AS SELECT * FROM my_table;

Partitioning

Utilize partitioning to manage and query large datasets effectively. Create a partitioned table using the following SQL syntax:

CREATE TABLE my_partitioned_table
(
    column1 Date,
    column2 String
)
ENGINE = MergeTree()
PARTITION BY toYYYYMM(column1)
ORDER BY column1;

Monitoring and Maintenance

System Monitoring

Monitor ClickHouse performance using built-in metrics and tools. Access the ClickHouse system tables to view performance metrics:

SELECT * FROM system.metrics;

Backup and Restore

To back up ClickHouse data to a local disk, follow these steps:

1. Configure Backup Destination

First, you need to set up the backup destination by creating a configuration file /etc/clickhouse-server/config.d/backup_disk.xml. This file specifies the local storage path for backups. Paste the following into the backup_disk.xml file:

   <clickhouse>
       <storage_configuration>
           <disks>
               <backups>
                   <type>local</type>
                   <path>/backups/</path>
               </backups>
           </disks>
       </storage_configuration>
       <backups>
           <allowed_disk>backups</allowed_disk>
           <allowed_path>/backups/</allowed_path>
       </backups>
 </clickhouse>

2. Backup Data

Use the BACKUP SQL statement to back up a table. Specify the destination as a local disk.

BACKUP TABLE my_table TO Disk('backups', 'backup_name.zip');

3. Restore Data

To restore a table from a backup, use the RESTORE SQL statement.

RESTORE TABLE my_table FROM Disk('backups', 'backup_name.zip');
SETTINGS allow_non_empty_tables=true

In case if the empty tables are needed to be restored then the line SETTINGS allow_non_empty_tables=true is not needed in the command.

4. Incremental Backup

Incremental backups can be performed to save only the changes since the last backup.

BACKUP TABLE my_table TO Disk('backups', 'incremental.zip')
SETTINGS base_backup = Disk('backups', 'base_backup.zip');

5. Password Protection

Protect your backups with a password for added security.

BACKUP TABLE my_table TO Disk('backups', 'protected_backup.zip')
SETTINGS password='your_password';

These steps ensure a reliable and secure backup and restore process using ClickHouse's native tools. For more detailed information, visit the ClickHouse documentation.

Troubleshooting and Issue Resolution

Log Analysis

Analyze ClickHouse logs to identify and troubleshoot issues. Review ClickHouse log files located in /var/log/clickhouse-server/ for error messages and warnings.

Community Support

Seek assistance from the ClickHouse community for resolving complex problems. Join ClickHouse forums and mailing lists to ask questions and seek advice from experienced users.

And do not forget to check ClickHouse Docs for better understanding.

ClickHouse
30.05.2024
Reading time: 7 min

Do you have questions,
comments, or concerns?

Our professionals are available to assist you at any moment,
whether you need help or are just unsure of where to start
Email us