13 January 2025 / 11:01 AM

Enhancing data privacy: a guide to Dynamic Data Masking in Snowflake

SDG Blog

Protect sensitive information using Snowflake's Dynamic Data Masking.

Welcome to TechStation, SDG's space for exploring cutting-edge trends in data and analytics! In this article, we delve into Snowflake's Dynamic Data Masking feature and how the dbt_snow_mask package simplifies its implementation. Learn how to secure sensitive data dynamically, ensuring robust compliance and streamlined management across your data environments.


Dynamic Data Masking (DDM) in Snowflake is a security feature that lets organizations protect sensitive data by masking it in real time, based on user roles and their access permissions.

This feature ensures that sensitive content, such as personal information or financial data, is hidden from users who lack the required authorization, while authorized users still see the full data.
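As a rough illustration of what such a policy looks like in plain Snowflake SQL, the sketch below defines a masking policy and attaches it to a column. The names mask_email, customers, email and PII_READER are hypothetical placeholders, not objects used elsewhere in this article.

```sql
-- Hypothetical names throughout: mask_email, customers, email, PII_READER.
-- Define a policy: the authorized role sees the real value,
-- every other role sees a fixed mask.
CREATE MASKING POLICY IF NOT EXISTS mask_email AS (val STRING)
  RETURNS STRING ->
    CASE
      WHEN CURRENT_ROLE() IN ('PII_READER') THEN val
      ELSE '**********'
    END;

-- Attach the policy to a column; the masking logic is then evaluated at query time.
ALTER TABLE customers MODIFY COLUMN email SET MASKING POLICY mask_email;
```

The stored data is not modified: the policy only changes what a query returns, depending on the role that runs it.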

This article presents a tool that helps data teams manage DDM policies in their Snowflake environments more efficiently, using dbt.

 

 

Target set-up

Let us start by analyzing the two main components involved: dbt and Snowflake.

dbt (Data Build Tool) is an open-source tool designed for data transformation and modeling within modern data warehouses.

Snowflake is a cloud-based data warehouse platform that provides flexible, scalable, fully managed services for storing and analyzing large volumes of data.

The data transformation and modeling capabilities of dbt combined with the instant scalability and the cloud-based data warehousing offered by Snowflake make it easier for data teams to build, manage and analyze data pipelines.

dbt_snow_mask is a package designed to work with dbt to automate the application of masking policies in Snowflake by exploiting the meta property in dbt models.

In this way, it is easier to manage and apply data masking consistently across Snowflake data warehouses.

 

Key features and benefits of dbt_snow_mask

 

  • Automation, consistency and scalability: dbt_snow_mask automates the application of arbitrarily complex Snowflake masking policies using definitions in dbt's meta configs, reducing manual effort. Managing policies centrally in dbt ensures consistency, and applying them to all relevant data provides scalability.

  • Customized masking policy: the package supports the creation of custom masking policies that can be tailored to specific data fields and business requirements. Through this flexibility, organizations can define how data should be masked based on their unique security needs and compliance restrictions.

  • Simplified data security management: integrating masking criteria into the dbt workflow using this package simplifies the management and implementation of data security measures. In this way, data are constantly protected, reducing the risk of exposure, and organizations can track and verify who has access to sensitive data and under what circumstances.
 
How it works & Example usage

 

This section shows how to apply a masking policy to a model; the same steps apply, with minor changes, to a source or a snapshot.

1. First of all, the package must be installed:

1.a) Add the dbt_snow_mask package to the packages.yml file in the dbt project:

 


packages:
  - package: entechlog/dbt_snow_mask
    version: [latest_version]

 


 

Replace “[latest_version]” with the actual latest version number, which can be found on the dbt Hub.


1.b) Because this package in turn uses the dbt_utils package, dbt_utils must be installed as well. To do so, add the following two lines to the packages.yml file:

 


  - package: dbt-labs/dbt_utils
    version: [latest_version]

 

Here too, replace “[latest_version]” with the actual latest version number, which can be found on the dbt Hub.

1.c) After adding the packages, run the following command to install them:

 


  dbt deps

 

2. By default, the masking policies are created in the database-schema pair associated with the target specified in the profiles.yml file.

This behaviour can be changed by setting variables in the dbt_project.yml file, so that a common database or a common schema is used instead:

2.a) Use a common database.
The following optional parameters change both the database and the schema in which the masking policies are created:

  • use_common_masking_policy_db: flag that enables or disables the use of a common database-schema pair for all masking policies. Valid values are ‘True’ or ‘False’.
  • common_masking_policy_db: the name of the database in which masking policies are created.
  • common_masking_policy_schema: the name of the schema in which masking policies are created.
  • create_masking_policy_schema: flag whose valid values are ‘True’ or ‘False’. Set it to ‘False’ to skip schema creation when the dbt role is not allowed to create schemas. The default value is ‘True’.

Example: vars config in the dbt_project.yml file that enables a common masking policy database, with the database name set to “DB_NAME” and the schema name set to “SCHEMA_NAME”, and skips schema creation because the dbt role is not allowed to create schemas:

 


vars:
  use_common_masking_policy_db: "True"
  common_masking_policy_db: "DB_NAME"
  common_masking_policy_schema: "SCHEMA_NAME"
  create_masking_policy_schema: "False"
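With create_masking_policy_schema set to "False", the schema has to exist before the policies are created, and the dbt role needs sufficient privileges on it. The following is a minimal sketch of what an administrator might run beforehand. DBT_ROLE is a hypothetical role name, and the exact set of grants depends on your account's access model.

```sql
-- Run by a role with sufficient privileges; DBT_ROLE is a hypothetical role name.
CREATE SCHEMA IF NOT EXISTS DB_NAME.SCHEMA_NAME;

-- Let the dbt role reach the schema and create masking policies in it.
GRANT USAGE ON DATABASE DB_NAME TO ROLE DBT_ROLE;
GRANT USAGE ON SCHEMA DB_NAME.SCHEMA_NAME TO ROLE DBT_ROLE;
GRANT CREATE MASKING POLICY ON SCHEMA DB_NAME.SCHEMA_NAME TO ROLE DBT_ROLE;
```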
  

 

2.b) Use a common schema (in the current database).
The following optional parameters change only the schema in which the masking policies are created:

  • use_common_masking_policy_schema_only: flag that enables or disables the use of a common schema in the current database for all masking policies. Valid values are ‘True’ or ‘False’.
  • common_masking_policy_schema: the name of the schema in which masking policies are created.
  • create_masking_policy_schema: flag whose valid values are ‘True’ or ‘False’. Set it to ‘False’ to skip schema creation when the dbt role is not allowed to create schemas. The default value is ‘True’.

Example: vars config in the dbt_project.yml file that enables a common masking policy schema, with the schema name set to “SCHEMA_NAME”, and skips schema creation because the dbt role is not allowed to create schemas:

 


vars:
  use_common_masking_policy_schema_only: "True"
  common_masking_policy_schema: "SCHEMA_NAME"
  create_masking_policy_schema: "False"
  

 

3. Use the meta property in the model.yml file to specify which masking policy to apply. Choose a name for the masking policy and add the key masking_policy to the meta of the column that must be masked.

Example: configuration of the model.yml file to apply the masking policy “MP_NAME” to the column “COLUMN_NAME” of the model “MODEL_NAME”:

 


version: 2
 
models:
  - name: MODEL_NAME
    description: "A model with sensitive information"
    columns:
      - name: COLUMN_NAME
        description: "sensitive information"
        meta:
          masking_policy: MP_NAME
  

 

4. Before a masking policy can be applied to a model, it must be defined and created in Snowflake. To do so, create a macro named create_masking_policy_<masking-policy-name> that contains the SQL defining the masking policy.

Then create the masking policy in Snowflake by running the following command:

 


      dbt run-operation create_masking_policy --args "{resource_type: models}"

 

Example: definition of a masking policy named “MP_NAME” that shows the content of column “COLUMN_NAME” only to the roles “ROLE_1” and “ROLE_2”; all other roles see ‘**********’ instead of the actual value:
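The original listing for this example is not available in this text. A minimal sketch of such a macro could look like the following, assuming the masked column holds string values. The node_database and node_schema arguments follow the macro pattern shown in the package's README and should be checked against the package version you install.

```sql
{% macro create_masking_policy_MP_NAME(node_database, node_schema) %}

{# Sketch only: assumes COLUMN_NAME is a string column. #}
{# ROLE_1 and ROLE_2 see the real value; every other role sees a fixed mask. #}
CREATE MASKING POLICY IF NOT EXISTS {{ node_database }}.{{ node_schema }}.MP_NAME AS (val STRING)
  RETURNS STRING ->
    CASE
      WHEN CURRENT_ROLE() IN ('ROLE_1', 'ROLE_2') THEN val
      ELSE '**********'
    END

{% endmacro %}
```

The macro would be saved as a .sql file in the project's macros directory, with the suffix after create_masking_policy_ matching the masking_policy value used in the model.yml file.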

 


     

 

To create the masking policy “MP_NAME”, taking into account any database and schema settings configured in step 2, run the same command:

 


      dbt run-operation create_masking_policy --args "{resource_type: models}"

 

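5. Apply the masking policy by running the command below. The policy is attached by the package's apply_masking_policy macro, which is typically configured as a post-hook in dbt_project.yml (see the package documentation for the exact hook definition):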

 


      dbt run --select <model-name>
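Once the run has finished, it can be useful to confirm that the policy is attached and behaves as intended. One possible check in Snowflake SQL is sketched below. It assumes that the policy was created in DB_NAME.SCHEMA_NAME, that the session's current database and schema contain MODEL_NAME, and that the roles used can query the model. ROLE_3 is a hypothetical role that is not listed in the policy.

```sql
-- List the objects and columns the policy is attached to.
SELECT policy_name, ref_entity_name, ref_column_name
FROM TABLE(DB_NAME.INFORMATION_SCHEMA.POLICY_REFERENCES(
  POLICY_NAME => 'DB_NAME.SCHEMA_NAME.MP_NAME'
));

-- Query as an authorized role: the real values are expected.
USE ROLE ROLE_1;
SELECT COLUMN_NAME FROM MODEL_NAME;

-- Query as a role not listed in the policy (ROLE_3 is hypothetical):
-- the mask '**********' is expected instead.
USE ROLE ROLE_3;
SELECT COLUMN_NAME FROM MODEL_NAME;
```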

 

6. To remove the applied masking policy, run the following command:

 


      dbt run-operation unapply_masking_policy --args "{resource_type: models}"
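For reference, the plain Snowflake SQL for detaching a policy from a column by hand is sketched below, using the placeholder names from the earlier steps. This is not necessarily what the package executes internally. If the model is materialized as a view, ALTER VIEW would be used instead of ALTER TABLE.

```sql
-- Detach whatever masking policy is currently set on the column.
ALTER TABLE MODEL_NAME MODIFY COLUMN COLUMN_NAME UNSET MASKING POLICY;

-- Optionally drop the policy itself once no column references it.
DROP MASKING POLICY DB_NAME.SCHEMA_NAME.MP_NAME;
```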

 

 

 

Conclusion

In summary, the dbt_snow_mask package is a powerful tool for teams using dbt and Snowflake, allowing for straightforward, consistent, and automated application of Dynamic Data Masking policies across Snowflake data warehouses using the dbt meta property.


Want to find out how Dynamic Data Masking can improve your company's data security and compliance? Book a call to explore tailored solutions and learn how to easily integrate the dbt_snow_mask package into your Snowflake environment!