Masking Joined Tables
When masking a table, you can specify a list of joins (or dictionary mapping keys to joins) that will join the rows of a target table to rows from one or more additional tables, providing you with the additional joined values to use in your masking rules.
Parameters
target_table(required): The name of the new table you wish to join into the masking data. The target table can be prefixed with a schema name to reference a table in another schema.target_key(required): The key ontarget_tableto use when performing the join. This can be specified as a single column name or a list of column names.source_table(required): The name of the table you wish join thetarget_tablewith. This could be the table being masked, or another table earlier in the list ofjoins(allowing you to perform multistep joins). The source table can be prefixed with a schema name to reference a table in another schema.source_key(required): The key onsource_tableto use when performing the join. This can be specified as a single column name or a list of column names.
Example
In this example, we have two tables: Customers and Membership. We would like the mask the first_name and
last_name columns of the Customers table, but only if the user's membership status is Active.
The Customers contains data on customers, including their name and membership ID. The Membership table
contains the status of the membership of each Customer: either Active or Inactive. The membership_ip column
of the Users has a foreign key relation with the id column of the Membership table.
Customers Table
| customer_id | first_name | last_name | membership_id |
|---|---|---|---|
| 1 | Anastasia | Rose | 10001 |
| 2 | Bill | Jones | 10002 |
| 3 | Chris | Yang | 10003 |
| 4 | Judith | Taylor | 10004 |
| 5 | Gordon | Smith | 10005 |
Membership Table
| id | membership_status |
|---|---|
| 10000 | Active |
| 10001 | Active |
| 10002 | Inactive |
| 10003 | Active |
| 10004 | Inactive |

In order to access the membership_status column of the Membership table, we need to define a join in our
ruleset from the Customers table to the Membership table.
version: "1.0"
tasks:
- type: mask_table
table: Customers
key: customer_id
joins:
- target_table: Membership
target_key: id
source_table: Customers
source_key: membership_id
rules:
- if:
- column: '"Membership".membership_status'
equals: Active
rules:
- column: first_name
masks:
- type: from_file
seed_file: DataMasque_firstNames_mixed.csv
seed_column: firstname-mixed
- column: last_name
masks:
- type: from_file
seed_file: DataMasque_lastNames.csv
seed_column: lastnames
After performing the join, this will allow us to reference the membership_status column of the Membership
table in our ruleset. In this example, we can reference the column with Membership.membership_status. Using
this column, we can use Conditional Masking to only mask the rows of Customers where the
status of the membership is 'Active'.
Note: To reference a column in a joined table, the table name of joined table must be added as a prefix to the column name**
The example below utilises the from_file mask type detailed here to select a random
first name from the DataMasque_firstNames-mixed.csv and a random last name from the
DataMasque_lastNames.csv files that can be found on our Supplementary Files user guide. It will
first check if the membership_status for the customer is 'Active', and if so, masks the two name columns: otherwise,
these columns are left unmasked.
version: "1.0"
tasks:
- type: mask_table
table: Customers
key: customer_id
joins:
- target_table: Membership
target_key: id
source_table: Customers
source_key: membership_id
rules:
- if:
- column: '"Membership".membership_status'
equals: Active
rules:
- column: first_name
masks:
- type: from_file
seed_file: DataMasque_firstNames_mixed.csv
seed_column: firstname-mixed
- column: last_name
masks:
- type: from_file
seed_file: DataMasque_lastNames.csv
seed_column: lastnames
This example will produce the following results in the Customers table. The customers with customer_id 3 and 5
are not masked, as the status of their membership is 'Inactive' in the joined Membership table.
| customer_id | first_name | last_name | membership_id |
|---|---|---|---|
| 1 | Tia | Pallin | 10001 |
| 2 | Nikau | Koller | 10002 |
| 3 | Chris | Yang | 10003 |
| 4 | Anika | Thom | 10004 |
| 5 | Gordon | Smith | 10005 |
Note:
For Microsoft SQL Server (Linked Server),
joinsare not currently supported.For Microsoft SQL Server, when using temporary table, the name of the temporary table must be wrapped in quotation marks, as the
#symbol in the YAML editor denotes the beginning of a comment (e.g.target_table: '##my_temporary_table'or'##my_temporary_table.column').To reference a temporary table column (e.g. the
table_filter_columnparameter of thefrom_filemask type or as a part ofhash_columns) you must prefix the column name with its table name (e.g.table.column).Any column name specified without a table prefix is assumed to belong to the table being masked (as specified by the
tableparameter for the task). You cannot specify tables that belong to other schemas.