- 24 Feb 2023
Databricks
- Updated on 24 Feb 2023
Overview
This article explains how to connect a Databricks database to Preset.
Allowlist Preset IPs
Preset Cloud runs in four regions. For Preset to access your data, first add the Preset IP addresses for your region to your inbound and outbound firewall rules.
us-west-2 (us1a) | us-east-1 (us2a) | eu-north-1 (eu5a) | ap-northeast-1 (ap1a) |
---|---|---|---|
35.161.45.11 | 44.193.153.196 | 13.48.95.3 | 35.74.159.67 |
54.244.23.85 | 52.70.123.52 | 13.51.212.165 | 35.75.171.157 |
52.32.136.34 | 54.83.88.93 | 16.170.49.24 | 52.193.196.211 |
If you are not sure where your Preset workspace is located, check the URL in your browser when accessing Preset. It should look like this: https://xxxxxxxx.us2a.app.preset.io/superset..., where us2a means the workspace is in us-east-1.
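The lookup described above can be sketched in Python. This is only an illustration: the function name `preset_region_for` is hypothetical, and it assumes the workspace host always has the shape `<workspace>.<region code>.app.preset.io`, as in the example URL. The region codes and IP addresses are copied from the table above.

```python
from urllib.parse import urlparse

# Region codes and IP addresses copied from the table above.
PRESET_REGIONS = {
    "us1a": ("us-west-2", ["35.161.45.11", "54.244.23.85", "52.32.136.34"]),
    "us2a": ("us-east-1", ["44.193.153.196", "52.70.123.52", "54.83.88.93"]),
    "eu5a": ("eu-north-1", ["13.48.95.3", "13.51.212.165", "16.170.49.24"]),
    "ap1a": ("ap-northeast-1", ["35.74.159.67", "35.75.171.157", "52.193.196.211"]),
}


def preset_region_for(workspace_url):
    """Return (aws_region, ip_list) for a Preset workspace URL."""
    host = urlparse(workspace_url).hostname or ""
    labels = host.split(".")
    # Assumed host shape: <workspace>.<region code>.app.preset.io
    if len(labels) < 2 or labels[1] not in PRESET_REGIONS:
        raise ValueError(f"cannot determine Preset region from {workspace_url!r}")
    return PRESET_REGIONS[labels[1]]


if __name__ == "__main__":
    region, ips = preset_region_for("https://xxxxxxxx.us2a.app.preset.io/superset/")
    print(region)
    for ip in ips:
        print(ip)
```

The printed addresses are the ones to allowlist for that workspace.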
Retrieve Databricks Information
Step 1: Get Databricks Token
In Databricks, navigate to User Settings → Access Tokens → Generate New Token.
Step 2: Get Host, Port, and HTTP Path
Navigate to Clusters and select your cluster.
...then copy and save:
- Server Hostname
- Port
- HTTP Path
Connect with the Databricks Connector
Start by selecting + Database. Have a look at Connecting your Data if you need help with this step.
Select Databricks.
Fill in the form with information from the Databricks cluster:
- Host with the Server Hostname
- Port with the Port
- Database name with the name of the database to be connected
- Access Token with the generated token
- HTTP Path with the HTTP Path
- Display Name with the name to be used on Preset for this connection
Click on Connect to create the connection.
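Before clicking Connect, it can help to sanity-check the values you copied from Databricks. The sketch below is not part of Preset: `DatabricksFormValues` and `problems` are hypothetical names, the field list simply mirrors the form above, and the checks are basic assumptions about what a well-formed value looks like.

```python
from dataclasses import dataclass


@dataclass
class DatabricksFormValues:
    """Hypothetical container mirroring the fields of the connection form."""

    host: str
    port: int
    database_name: str
    access_token: str
    http_path: str
    display_name: str

    def problems(self):
        """Return a list of readable issues; an empty list means nothing obvious is wrong."""
        issues = []
        # The form expects the bare Server Hostname, not a full URL.
        if "://" in self.host or "/" in self.host:
            issues.append("Host should be the bare Server Hostname, without a scheme or path")
        if not 1 <= self.port <= 65535:
            issues.append("Port must be between 1 and 65535")
        text_fields = [
            ("Host", self.host),
            ("Database name", self.database_name),
            ("Access Token", self.access_token),
            ("HTTP Path", self.http_path),
            ("Display Name", self.display_name),
        ]
        for label, value in text_fields:
            if not value.strip():
                issues.append(f"{label} is empty")
        return issues
```

For example, pasting a full `https://...` URL into Host instead of the Server Hostname would be reported as an issue.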
Configure default Catalog and Schema
By default, the connection uses the default Catalog and Schema specified in your Databricks settings. However, you can set a different default Catalog or Schema for Preset in the Advanced Settings:
- Navigate to Settings > Database Connections to access the list of existing connections on your Workspace.
- Hover the mouse over your Databricks connection and click on the pencil icon under the Actions column to modify it.
- Navigate to the ADVANCED tab, and expand the Other section.
- You should find JSON in the ENGINE PARAMETERS field similar to this:
{
  "connect_args": {
    "http_path": "{{Your HTTP Path}}"
  }
}
- Specify the default Catalog and Schema for the connection under connect_args:
{
  "connect_args": {
    "http_path": "{{Your HTTP Path}}",
    "catalog": "{{Catalog Name}}",
    "schema": "{{Schema Name}}"
  }
}
- Save changes.
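If you prefer to prepare the ENGINE PARAMETERS value outside the UI, the edit above can be sketched in Python. The helper name `with_default_catalog` is hypothetical. It only adds the `catalog` and `schema` keys under `connect_args` and leaves everything else, including `http_path`, untouched. You would still paste the result into the ENGINE PARAMETERS field yourself.

```python
import json


def with_default_catalog(engine_params_json, catalog, schema):
    """Return the engine parameters JSON with catalog and schema set in connect_args."""
    # An empty field is treated as an empty JSON object.
    params = json.loads(engine_params_json) if engine_params_json.strip() else {}
    connect_args = params.setdefault("connect_args", {})
    connect_args["catalog"] = catalog
    connect_args["schema"] = schema
    return json.dumps(params, indent=2)


if __name__ == "__main__":
    current = '{"connect_args": {"http_path": "{{Your HTTP Path}}"}}'
    # "{{Catalog Name}}" and "{{Schema Name}}" are placeholders, as in the steps above.
    print(with_default_catalog(current, "{{Catalog Name}}", "{{Schema Name}}"))
```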
You can also query data from different catalogs in SQL Lab, using the structure below:
SELECT * FROM {{Catalog_Name}}.{{Schema_Name}}.{{Table_Name}}
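If you generate such queries programmatically, a small helper can assemble the three-part name. This is a sketch with hypothetical function names. It wraps each part in backticks, which Databricks SQL accepts as identifier delimiters, so that names containing characters such as hyphens still parse. The quoting is an addition of this example, not something the structure above requires.

```python
def quote_identifier(name):
    """Wrap a name in backticks, doubling any backtick it already contains."""
    return "`" + name.replace("`", "``") + "`"


def select_all(catalog, schema, table):
    """Build SELECT * FROM <catalog>.<schema>.<table> with quoted identifiers."""
    qualified = ".".join(quote_identifier(part) for part in (catalog, schema, table))
    return f"SELECT * FROM {qualified}"


if __name__ == "__main__":
    # The catalog, schema, and table names here are made up for illustration.
    print(select_all("main", "sales", "orders"))
```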