Sign In
Sign In

Creating Tables in ClickHouse

Creating Tables in ClickHouse
Hostman Team
Technical writer
ClickHouse
23.09.2024
Reading time: 6 min

Like other database management systems, ClickHouse offers many functions for working with tables. 

But before working with tables, we must create a database where these tables will be stored.

Creating a Database in ClickHouse

To create a database in ClickHouse, use the CREATE DATABASE command, which creates a new database in ClickHouse's storage. To select an already created database, use the USE command. These commands look like this (database names are just examples):

CREATE DATABASE my_new_database
USE my_old_database

In the Hostman panel, you can create a ClickHouse database in just a few clicks:

  • Go to the "Databases" page and click the "Create" button.

  • On the next page, select ClickHouse in the "Database Type" section, fill in the required parameters below, and click the "Order" button on the right.

Now, we can proceed to work with tables.

How to Create Tables in ClickHouse

To create tables in ClickHouse, use the CREATE TABLE command, followed by the table name and the column parameters. Let's jump into an example and explain other related commands.

Example 1

We'll start with an example of creating a table to store data about the models of a car brand:

CREATE TABLE autobrand_models (
    model_id UInt32,
    model_name String,
    release_year UInt16,
    engine_type String,
    horsepower UInt16,
    price Decimal(10, 2)
) ENGINE = MergeTree()
ORDER BY model_id;

In this example, we create the table autobrand_models with the following fields:

  • model_id: The model identifier, 32-bit unsigned integer (UInt32).

  • model_name: The name of the model (String).

  • release_year: The year the model was released, 16-bit unsigned integer (UInt16).

  • engine_type: The type of engine (String).

  • horsepower: The engine's horsepower, 16-bit unsigned integer (UInt16).

  • price: The price, decimal number with up to 10 digits and 2 decimal places (Decimal(10, 2)).

This table allows storing information about car models, including their ID, name, release year, engine type, horsepower, and price.

Notice the following two lines:

  • ENGINE = MergeTree(): We're using the MergeTree engine, one of the most commonly used engines for data storage in ClickHouse. The engine determines how and where the data is stored, which queries are supported, and whether data replication is enabled. While ClickHouse supports many engines, MergeTree is the obvious choice for simple tables like this one.

  • ORDER BY model_id: This specifies that the data will be ordered by the model_id column.

Example 2

Now let's talk about how to use primary keys. These can be specified in different ways: within or outside the column list.

In the first case, primary keys are written like this:

CREATE TABLE db.new_table
(
   name01 type01, name02 type02, ...,
   PRIMARY KEY(expr01[, expr02,...])
)

If you want to specify them outside the table definition, it looks like this:

CREATE TABLE db.new_table
(
   name01 type01, name02 type02, ...
)
PRIMARY KEY(expr01[, expr02,...]);

As we can see, the difference in the code is the position of the closing parenthesis: in the second case, the primary keys are placed outside the table definition.

Let's look at a specific example of primary key usage:

CREATE TABLE shop.sales_by_category
(
    category_id UInt32,
    product_name String,
    sale_date Date,
    quantity UInt16,
    price Decimal(10, 2),
    revenue Decimal(10, 2),
    PRIMARY KEY(category_id, sale_date)
)
ENGINE = MergeTree()
ORDER BY (category_id, sale_date);

Here:

  • category_id: The category ID, a 32-bit unsigned integer (UInt32).

  • product_name: The product name (String).

  • sale_date: The sale date (Date).

  • quantity: The quantity sold, 16-bit unsigned integer (UInt16).

  • price: The price, decimal with 2 decimal places (Decimal(10, 2)).

  • revenue: The total revenue, also in decimal format (Decimal(10, 2)).

  • PRIMARY KEY: The primary keys are category_id and sale_date, defined within the column list.

  • ORDER BY (category_id, sale_date): This defines the data sorting order in the table.

Adding Comments to Tables

You can add comments to tables using the COMMENT command. Here's an example of adding a comment and then displaying it. To add a comment, you can do something like this:

CREATE TABLE countdown (x String) ENGINE = Memory COMMENT 'Temporary table for report';

To display the comment, use this command:

SELECT name, comment FROM system.tables WHERE name = 'countdown';

The result will look like this:

name         | comment
-------------|------------------------------------------------
countdown    | Temporary table for report

Temporary Tables in ClickHouse

ClickHouse allows you to create temporary tables that exist only for the current query and are created outside of a database. These tables are automatically deleted after the session ends or the connection is closed. Use the CREATE TEMPORARY TABLE command to create a temporary table in ClickHouse. Here's the simplest structure for such a table:

CREATE TEMPORARY TABLE temp_table (
    id UInt32,
    name String
) ENGINE = Memory;

In this example, we create the temporary table temp_table with fields id and name. The engine is set to Memory, meaning the data will be stored in RAM. Temporary tables are useful when you need to quickly store and use data within the current session without keeping it afterward.

Example: Analyzing Sales Data

Let's assume we have a database for an online store and want to analyze sales for a specific period. To do this, we may need to create a temporary table to collect data from multiple tables and perform analytical queries. Here's how it can be done:

CREATE TEMPORARY TABLE temp_sales_data AS
SELECT 
    products.product_id,
    products.product_name,
    sales.sale_date,
    sales.sale_amount
FROM products
JOIN sales ON products.product_id = sales.product_id
WHERE sales.sale_date BETWEEN '2024-08-01' AND '2024-08-31';

In this example, we create a temporary table, temp_sales_data, to place data about products and sales for August 2024. We join the products and sales tables by product ID and select only those records where the sale date falls within the specified period.

After creating the temp_sales_data table, we can perform various analytical queries on this data, such as calculating the total sales for the month or finding the most popular products. Note that the temporary table will automatically be deleted when the session ends, which is convenient for analytics since it doesn't clutter the database with temporary reports.

Modifying Table Structure

ClickHouse allows changing table structures using the REPLACE command. However, keep in mind that this command applies only to existing tables; otherwise, the system will return an error.

Suppose we have a table myBigData with data and want to replace all records where CounterID is less than a certain value. In ClickHouse, this can be done in a single line:

REPLACE TABLE myBigData SELECT * FROM myBigData WHERE CounterID < 5356;

This query replaces all records in the myBigData table where CounterID < 5356 with the result of the selection from the same table. This way, the REPLACE command efficiently updates data in a table based on a query from the same table.

Conclusion

We've explored creating regular and temporary tables in ClickHouse and learned how to modify data and add comments to tables. For more advanced ClickHouse table features, check out the documentation.

ClickHouse
23.09.2024
Reading time: 6 min

Do you have questions,
comments, or concerns?

Our professionals are available to assist you at any moment,
whether you need help or are just unsure of where to start
Email us