DBT

This article explores the types of tests in dbt (data build tool), including generic tests, dbt_utils package tests, and custom tests. It provides an overview of each type of test, their purpose, and how to use them effectively to ensure data quality and integrity.

DBT

Types of Tests in dbt (data build tool)

In dbt (data build tool), there are several types of tests that can be utilized to ensure data quality and integrity. These tests can be categorized into three main groups: Generic Tests, dbt_utils Package Tests, and Custom Tests.

Generic Tests

These are the standard tests provided by dbt to validate common data quality issues:

  • not_null: Ensures that there are no null values in a specified column, maintaining data completeness.
SQL
# In your .yml file
version: 2

models:
  - name: your_model_name
    columns:
      - name: column_name
        tests:
          - not_null
  • unique: Verifies that all entries in a specified column are unique, preventing duplicate records.
SQL
# In your .yml file
version: 2
models:
  - name: your_model_name
    columns:
      - name: column_name
        tests:
          - unique
  • relationships: Checks referential integrity by ensuring that foreign keys in one table correctly reference primary keys in another table.
SQL
# In your .yml file
version: 2

models:
  - name: child_table
    columns:
      - name: foreign_key_column
        tests:
          - relationships:
              to: ref('parent_table')
              field: primary_key_column
  • accepted_values: Confirms that all entries in a column fall within a predefined set of acceptable values, enforcing consistency and business rules.
SQL
# In your .yml file
version: 2

models:
  - name: your_model_name
    columns:
      - name: column_name
        tests:
          - accepted_values:
              values: ['value1', 'value2', 'value3']

dbt_utils Package Tests

The dbt_utils package extends the functionality of dbt with additional tests:

  • unique_combination_of_columns: Tests composite keys to ensure that combinations of columns are unique, which is useful for complex uniqueness constraints.
SQL
# In your .yml file
version: 2
models:
  - name: your_model_name
    tests:
      - dbt_utils.unique_combination_of_columns:
          combination_of_columns: ['column1', 'column2']
  • not_constant: Ensures that values in a column vary, which can help detect unexpected uniformity in data.
SQL
# In your .yml file
version: 2
models:
  - name: your_model_name
    columns:
      - name: column_name
        tests:
          - dbt_utils.not_constant
  • sequential_values: Checks if IDs or other numeric columns are sequential, which is important for ordered datasets.
SQL
# In your .yml file
version: 2
models:
  - name: your_model_name
    columns:
      - name: id_column
        tests:
          - dbt_utils.sequential_values

Custom Tests

Custom tests allow you to write specific SQL queries to address unique data validation needs:

  • You can write your own SQL to test specific scenarios or complex business rules that are not covered by the built-in or package tests.
SQL
-- tests/no_large_transactions.sql

with large_transactions as (
    select *
    from {{ ref('your_model_name') }}
    where transaction_amount > 10000
)

select count(*) as errors from large_transactions where errors > 0;
  • Custom tests are flexible and can be tailored to check conditions across columns or entire tables, such as ensuring transactional limits or verifying complex business logic.
SQL
# In your .yml file
version: 2

models:
  - name: your_model_name
    tests:
      - custom_test_name: no_large_transactions.sql

By leveraging these different types of tests, dbt users can ensure robust data quality checks throughout their data transformation processes. These tests help maintain data integrity and reliability, supporting accurate and consistent data analysis.