Preparing for Database Sync

Updated 24 January 2024 15:49

Most organisations will use one or more backend databases to store their application and analytical datasets. Use Database Sync to easily pull this valuable data into Distil, so that you can make use of the powerful insights and actions that Distil provides.

Setting up Database Sync involves three steps covered in this series of articles:

Preparing for Database Sync - the topic of this article
Preparing your tables or views - this is covered in seperate articles for each Data Category:
Creating the Database Connection - creating the Connection in Distil

Remember that you can always reach out if you need help.

We know that databases are critical elements to any solution, and that connecting external systems to your databases requires great care and attention. We have ensured that the Distil Database Sync method treats your infrastructure with respect, and that operations are kept as lightweight as possible. This is something that is very important when dealing with many millions of records.

The following describes some basic database best practices that we recommend when preparing your database and infrastructure to connect to Distil.

Read-only user account

It is best practice to grant any new database only user the minimum permissions needed to complete their tasks. Therefore we recommend that you create a dedicated database user account for your Distil Database Sync, with read-only access to the Distil tables described in this series of articles.

Distil doesn't need access to any other information in your database.

Secure Shell (SSH)

SSH security with a public key provides an additional layer of safety and is supported by most databases. We suggest that you set this up if you are able. Distil can easily connect to your database over SSH.

If you need some guidance on setting this up, we recommend this article.

IP White Listing

Distil Database Sync will always connect from one of three fixed IP addresses. These are:

52.169.82.244
40.127.201.179
20.67.135.65

It's likely that your network security policies will need you to allow connections to these IP addresses. In any case, whitelisting these addresses will help avoid issues when using Database Sync.

Read-Replica Database

We have designed Distil Database Sync to be lightweight and minimise calls to your database. However, if you are concerned about additional load being placed on your database infrastructure, then setting up a separate read-replica database is advisable.

Table or Views?

Once a connection has been established to your database, Distil will interrogate the schema and identify any Tables or Views that match the naming conventions described in the other articles in this section.

Distil is comfortable with either a Table or a View. However it is likely that the Customer data in your database is set to a particular format which will differ from that required by Distil. Therefore it is more efficient to create a database View containing Customer data and joining any other interesting values you wish to be present as attributes in your Distil dataset.

While Distil uses a set of core Attributes, you can also add your own custom attributes as described in the accompanying articles.

If you are concerned with heavy loading of your database, then you may wish to consider copying your Customer data into a table (or if your database supports it, a Materialised View). This may reduce the load on your system during sync, however there is a potential for Distil Database Sync and the table update query within your database to get out of step with each other.

Column Naming

Before the column names in your database tables are matched to Attributes in Distil, first you will need to clean the original column name so that Distil can recognise and map the field. This is described in detail in each of the articles covering Customer data, Product data, Order data and Content data.