DISTINCT Clause
If SELECT DISTINCT is specified, only unique rows remain in the query result: out of each set of fully matching rows, a single row is kept.
You can specify the list of columns that must have unique values: SELECT DISTINCT ON (column1, column2,...). If the columns are not specified, all of them are taken into consideration.
Consider the table:
Using DISTINCT without specifying columns:
Using DISTINCT with specified columns:
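As a minimal sketch of both forms, assume a hypothetical table `distinct_demo` with three columns `a`, `b`, `c`. The table name, the engine, and the rows are illustrative assumptions, not data taken from this page.

```sql
-- Hypothetical demo table; name, engine, and rows are assumptions for illustration.
CREATE TABLE distinct_demo (a UInt8, b UInt8, c UInt8) ENGINE = Memory;

INSERT INTO distinct_demo VALUES
    (1, 1, 1), (1, 1, 1), (2, 2, 2), (2, 2, 2), (1, 1, 2), (1, 2, 2);

-- DISTINCT without specified columns: rows are compared on all of a, b, c,
-- so the exact duplicates of (1, 1, 1) and (2, 2, 2) collapse to one row each.
SELECT DISTINCT * FROM distinct_demo;

-- DISTINCT with specified columns: rows are compared on (a, b) only,
-- so one row is kept per distinct (a, b) pair.
SELECT DISTINCT ON (a, b) * FROM distinct_demo;
```

With this data the first query should leave four rows and the second three, one for each of the pairs (1, 1), (2, 2) and (1, 2). Which value of `c` accompanies the pair (1, 1) depends on which of its rows is encountered first, so it is best not to rely on it.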
DISTINCT and ORDER BY
ClickHouse supports using the DISTINCT and ORDER BY clauses for different columns in one query. The DISTINCT clause is executed before the ORDER BY clause.
Consider the table:
Selecting data:
Selecting data with a different sorting direction:
The row 2, 4 was removed by DISTINCT before sorting took place.
Take this implementation detail into account when writing queries.
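The interaction described above can be sketched with a hypothetical two-column table. The name `distinct_order_demo`, the column names `a` and `b`, the engine, and the rows are assumptions chosen so that the data contains a row 2, 4 whose value of `a` duplicates an earlier row.

```sql
-- Hypothetical demo table; name, engine, and rows are assumptions for illustration.
CREATE TABLE distinct_order_demo (a UInt8, b UInt8) ENGINE = Memory;

INSERT INTO distinct_order_demo VALUES (2, 1), (1, 2), (3, 3), (2, 4);

-- DISTINCT is applied to column a first, and only the surviving rows are sorted by b.
SELECT DISTINCT a FROM distinct_order_demo ORDER BY b ASC;

-- Same query with the opposite direction: the set of surviving rows is the same,
-- so the result is not necessarily what sorting all four rows first would suggest.
SELECT DISTINCT a FROM distinct_order_demo ORDER BY b DESC;
```

If the row (2, 1) is the one kept for `a = 2`, the row (2, 4) never takes part in the sort, so with descending order the value 2 is placed according to `b = 1` rather than `b = 4`.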
Null Processing
DISTINCT works with NULL as if NULL were a specific value and NULL==NULL. In other words, in the DISTINCT results, each combination of values that contains NULL occurs only once. This differs from NULL processing in most other contexts.
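A short sketch of this behavior, assuming a hypothetical table `distinct_null_demo` with a single Nullable column `x` (table name, engine, and rows are illustrative):

```sql
-- Hypothetical demo table; name, engine, and rows are assumptions for illustration.
CREATE TABLE distinct_null_demo (x Nullable(UInt8)) ENGINE = Memory;

INSERT INTO distinct_null_demo VALUES (NULL), (1), (NULL), (1), (2);

-- DISTINCT treats the two NULLs as equal, so NULL appears once in the result
-- alongside the values 1 and 2.
SELECT DISTINCT x FROM distinct_null_demo;

-- For contrast: an ordinary comparison of NULL with NULL yields NULL, not true.
SELECT NULL = NULL;
```

The first query should return three rows. The second illustrates the usual comparison semantics that DISTINCT deliberately departs from.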
Alternatives
It is possible to obtain the same result by applying GROUP BY across the same set of values as specified in the SELECT clause, without using any aggregate functions. However, there are a few differences from the GROUP BY approach: