Databricks Integration

Databricks is a unified data and AI platform built on the lakehouse architecture. By connecting WISEPIM to Databricks, you can export your enriched product data directly into Unity Catalog and Delta Lake tables, enabling advanced analytics, machine learning workflows, and enterprise-grade data governance across your entire product catalog.
The Databricks integration is available exclusively on the Enterprise plan. Contact our sales team to learn more about Enterprise features and pricing.

Prerequisites

Before you begin, make sure you have the following:
  • A WISEPIM account on the Enterprise plan
  • A Databricks workspace with Unity Catalog enabled
  • Appropriate permissions to create schemas and tables in your Databricks catalog
  • Your Databricks Server Hostname, HTTP Path, and an authentication method (Personal Access Token or OAuth credentials)

Getting Your Databricks Connection Details

You will need several pieces of information from your Databricks workspace to establish the connection.

Step 1: Log in to your Databricks workspace

Go to your Databricks workspace URL and sign in with your credentials.

Step 2: Locate your Server Hostname and HTTP Path

Navigate to SQL Warehouses (or Compute for clusters):
  1. Select the SQL warehouse or cluster you want WISEPIM to connect to
  2. Click on Connection Details
  3. Copy the Server Hostname (e.g., adb-1234567890.1.azuredatabricks.net)
  4. Copy the HTTP Path (e.g., /sql/1.0/warehouses/abc123def456)

Step 3: Set up authentication

You can authenticate using one of two methods; a quick credential check you can run before entering anything in WISEPIM is sketched after the two options.

Option A: Personal Access Token
  1. Click your username in the top-right corner of the Databricks workspace
  2. Go to Settings then Developer then Access Tokens
  3. Click Generate New Token, give it a description (e.g., “WISEPIM Integration”), and set an expiration
  4. Copy the generated token immediately — it will not be shown again
Option B: OAuth (Service Principal)
  1. In your Databricks account console, create a Service Principal
  2. Generate a Client ID and Client Secret for the service principal
  3. Grant the service principal access to the workspace and the target catalog
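
Before entering these credentials into WISEPIM, you can optionally sanity-check them yourself. Below is a minimal sketch using the databricks-sql-connector Python package and the Personal Access Token method; the hostname, HTTP path, and token are placeholders standing in for the values you gathered in the earlier steps.

```python
# Minimal credential check with databricks-sql-connector
# (pip install databricks-sql-connector). All values are placeholders.
from databricks import sql

with sql.connect(
    server_hostname="adb-1234567890.1.azuredatabricks.net",  # Server Hostname from Connection Details
    http_path="/sql/1.0/warehouses/abc123def456",            # HTTP Path from Connection Details
    access_token="dapi...your-personal-access-token...",     # token generated above
) as connection:
    with connection.cursor() as cursor:
        cursor.execute("SELECT 1")
        print(cursor.fetchone())  # (1,) means the warehouse accepted the query
```

If this fails, fix the hostname, HTTP path, or token before configuring WISEPIM; the same problems would otherwise surface later in the connection test.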

Step 4: Identify your target Catalog and Schema

In the Databricks workspace, go to Data in the sidebar to browse Unity Catalog:
  1. Select or create the Catalog where WISEPIM product data will be stored
  2. Select or create the Schema (database) within that catalog
  3. Note down the catalog and schema names
Personal Access Tokens provide full access to your Databricks workspace based on your user permissions. For production environments, we recommend using OAuth with a Service Principal that has only the minimum required permissions. Rotate credentials regularly and never share them publicly.
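
If the catalog or schema does not exist yet and your account has the privileges to create them, you can do so over the same connection. A minimal sketch, assuming the example names wisepim_data and product_catalog used elsewhere in this guide:

```python
# Create (or verify) the target catalog and schema for WISEPIM data.
# Requires CREATE CATALOG / CREATE SCHEMA privileges; names are examples only.
from databricks import sql

with sql.connect(
    server_hostname="adb-1234567890.1.azuredatabricks.net",
    http_path="/sql/1.0/warehouses/abc123def456",
    access_token="dapi...your-personal-access-token...",
) as connection:
    with connection.cursor() as cursor:
        cursor.execute("CREATE CATALOG IF NOT EXISTS wisepim_data")
        cursor.execute("CREATE SCHEMA IF NOT EXISTS wisepim_data.product_catalog")
        cursor.execute("SHOW SCHEMAS IN wisepim_data")
        print(cursor.fetchall())  # product_catalog should appear in this list
```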

Connecting Databricks to WISEPIM

Once you have gathered your connection details, configure the integration in WISEPIM.

Step 1: Open the Integrations page

Log in to your WISEPIM account and navigate to the Integrations page from the main sidebar.

Step 2: Select Databricks

Find the Databricks tile in the App Marketplace and click on it to open the configuration modal.

Step 3: Enter your connection details

Fill in the following fields:

Connection Settings
  • Server Hostname: Your Databricks workspace hostname
  • HTTP Path: The path to your SQL warehouse or cluster
Authentication (choose one method)
  • Access Token: Your Databricks Personal Access Token
  • Or Client ID and Client Secret: Your OAuth service principal credentials
Data Location
  • Catalog: The Unity Catalog name (e.g., wisepim_data)
  • Schema: The schema/database name within the catalog (e.g., product_catalog)

Step 4: Configure source tables (optional)

If you are importing data from Databricks into WISEPIM, you can specify source table names (a hypothetical table layout is sketched after this list):
  • Attributes Source Table: The table containing attribute definitions
  • Attribute Options Source Table: The table containing attribute option values
  • Products Source Table: The table containing product data
  • Batch Size: The number of rows to process per batch (default: 1000)
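
For illustration only, a hypothetical products source table might look like the sketch below. The column names are assumptions, since the actual column-to-attribute mapping is controlled in WISEPIM's Attribute Mapper (see the import section further down).

```python
# Hypothetical layout for a products source table used in a Databricks-to-WISEPIM import.
# Column names are illustrative only; adjust them to your own pipeline's output.
from databricks import sql

DDL = """
CREATE TABLE IF NOT EXISTS wisepim_data.product_catalog.products_source (
    product_id     STRING,
    sku            STRING,
    ean            STRING,
    name_en        STRING,
    description_en STRING,
    price          DECIMAL(10, 2),
    stock          INT,
    category       STRING
)
"""

with sql.connect(
    server_hostname="adb-1234567890.1.azuredatabricks.net",
    http_path="/sql/1.0/warehouses/abc123def456",
    access_token="dapi...your-personal-access-token...",
) as connection:
    with connection.cursor() as cursor:
        cursor.execute(DDL)
```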

Step 5: Test the connection

Click Test Connection to verify that WISEPIM can reach your Databricks workspace and access the specified catalog and schema.

Step 6: Save your configuration

If the connection test is successful, click Save to store your integration settings.

Data Pipeline Setup

The Databricks integration supports bidirectional data flow between WISEPIM and your data lakehouse.

Exporting Product Data to Databricks

You can push your enriched product data from WISEPIM to Databricks for analytics and ML use cases:
  1. Go to the Products page in WISEPIM
  2. Select the products you want to export (or select all)
  3. Click Export and choose Databricks as the destination
  4. WISEPIM will write the data to Delta Lake tables in your specified catalog and schema
The following data is exported:
  • Product identifiers (IDs, SKUs, EAN/GTIN)
  • Product names and descriptions (all languages)
  • Prices and stock information
  • Category hierarchies
  • Product attributes and custom fields
  • Image URLs and metadata
  • Translation status and quality scores
WISEPIM exports data in Delta Lake format, which provides ACID transactions, schema enforcement, and time travel capabilities. You can query historical versions of your product data at any point.
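
For example, assuming WISEPIM wrote a products table into the catalog and schema you configured (browse Unity Catalog for the exact table names it created), you could inspect the table's history and read an earlier version like this:

```python
# Inspect an exported Delta table's history and read an earlier version (time travel).
# The table name wisepim_data.product_catalog.products is an assumption;
# check Unity Catalog for the tables WISEPIM actually created.
from databricks import sql

with sql.connect(
    server_hostname="adb-1234567890.1.azuredatabricks.net",
    http_path="/sql/1.0/warehouses/abc123def456",
    access_token="dapi...your-personal-access-token...",
) as connection:
    with connection.cursor() as cursor:
        # Each history row includes a version number, timestamp, and operation
        cursor.execute("DESCRIBE HISTORY wisepim_data.product_catalog.products")
        for row in cursor.fetchall():
            print(row)

        # Read the table as it looked at version 0 (the first export)
        cursor.execute(
            "SELECT COUNT(*) FROM wisepim_data.product_catalog.products VERSION AS OF 0"
        )
        print(cursor.fetchone())
```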

Importing Product Data from Databricks

If your product data lives in Databricks (for example, from upstream data pipelines), you can import it into WISEPIM:
  1. Configure the source table names in your integration settings
  2. Click Import on the Products page and select Databricks as the source
  3. WISEPIM will read from your specified tables and map the data to your project attributes
When importing from Databricks, ensure your source tables follow a consistent schema. WISEPIM will attempt to map columns to product attributes automatically, but you can customize the mapping using the Attribute Mapper.

Analytics Use Cases

Once your product data is in Databricks, you can leverage it for a variety of analytics and data science workflows:

Product Performance Analytics

  • Build dashboards to track product performance across channels and markets
  • Analyze which product attributes correlate with higher conversion rates
  • Compare performance across different languages and regions

Machine Learning Pipelines

  • Train product recommendation models using enriched product data
  • Build demand forecasting models with historical product and pricing data
  • Develop pricing optimization algorithms based on market data
  • Use WISEPIM’s product embeddings for similarity search and clustering

Data Governance

  • Track data lineage from source to enrichment to export with Unity Catalog
  • Set up access controls to manage who can read and modify product data
  • Audit all data changes with Delta Lake’s transaction log
Use WISEPIM’s product IDs as the primary key when joining product data with sales, inventory, or customer data in Databricks. This ensures consistent identity mapping across all your datasets.
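
As an illustration of that join (all table and column names below are assumptions; substitute your own), a notebook cell computing revenue per product with PySpark might look like this:

```python
# Example Databricks notebook cell: join exported WISEPIM product data with
# sales data on the product ID. Table and column names are placeholders.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.getOrCreate()  # in a Databricks notebook, `spark` is already defined

products = spark.table("wisepim_data.product_catalog.products")
sales = spark.table("analytics.sales.order_lines")

revenue_per_product = (
    sales.join(products, on="product_id", how="inner")
    .groupBy("product_id", "name_en", "category")
    .agg(F.sum("line_revenue").alias("total_revenue"))
    .orderBy(F.desc("total_revenue"))
)

revenue_per_product.show(20)
```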

Batch Processing Configuration

For large product catalogs, you can configure the batch size to optimize performance:
  • Small catalogs (under 10,000 products): Default batch size of 1,000 works well
  • Medium catalogs (10,000 - 100,000 products): Consider increasing to 5,000 per batch
  • Large catalogs (100,000+ products): Use 10,000 per batch and monitor resource usage
You can adjust the batch size in the integration configuration modal under the Batch Size field.
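
The batch size mainly trades round trips against per-batch load: larger batches mean fewer statements but more memory and longer-running writes, while smaller batches are slower overall but cheaper to retry. A quick back-of-the-envelope sketch:

```python
# Rough estimate of how many batches an export or import will run.
import math

catalog_size = 250_000   # number of products (example value)
batch_size = 10_000      # the Batch Size setting in the integration modal

print(f"{math.ceil(catalog_size / batch_size)} batches")  # 25 batches for this example
```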

Troubleshooting

If you encounter issues with your Databricks integration, try the following:

Connection Errors

  • Verify that the Server Hostname is correct and includes the full domain (e.g., adb-1234567890.1.azuredatabricks.net)
  • Check that the HTTP Path points to an active SQL warehouse or cluster
  • Ensure your SQL warehouse or cluster is running (not in a stopped/terminated state)
  • If using a Personal Access Token, verify it has not expired
  • If using OAuth, confirm the Service Principal has workspace-level access

Authentication Issues

  • Regenerate your access token if you suspect it has been compromised or expired
  • For OAuth, verify the Client ID and Client Secret are correct
  • Ensure the authenticated user or service principal has USE CATALOG and USE SCHEMA privileges on the target catalog and schema
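
If any of these privileges are missing, a catalog administrator can grant them with Unity Catalog SQL. A sketch, with the principal and object names as placeholders (the CREATE TABLE, MODIFY, and SELECT grants also cover the export and import scenarios below):

```python
# Grant the WISEPIM principal the Unity Catalog privileges referenced in this guide.
# Run as a user with sufficient ownership/admin rights; all names are placeholders.
from databricks import sql

GRANTS = [
    "GRANT USE CATALOG ON CATALOG wisepim_data TO `wisepim-service-principal`",
    "GRANT USE SCHEMA ON SCHEMA wisepim_data.product_catalog TO `wisepim-service-principal`",
    # Needed for exports (creating/writing Delta tables) and imports (reading source tables)
    "GRANT CREATE TABLE, MODIFY, SELECT ON SCHEMA wisepim_data.product_catalog TO `wisepim-service-principal`",
]

with sql.connect(
    server_hostname="adb-1234567890.1.azuredatabricks.net",
    http_path="/sql/1.0/warehouses/abc123def456",
    access_token="dapi...your-admin-access-token...",
) as connection:
    with connection.cursor() as cursor:
        for statement in GRANTS:
            cursor.execute(statement)
```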

Data Export Issues

  • Confirm that the authenticated user has CREATE TABLE and MODIFY privileges on the target schema
  • Check that the catalog and schema names are spelled correctly and exist in Unity Catalog
  • If exports are slow, try reducing the batch size or using a larger SQL warehouse
  • Review WISEPIM’s error log for specific error messages from the Databricks API

Data Import Issues

  • Verify that the source table names are correct and the tables exist
  • Confirm that the authenticated user has SELECT privileges on the source tables
  • Check that the source table schemas are compatible with WISEPIM’s expected format
  • If imports time out, try reducing the batch size
Databricks resources (SQL warehouses, clusters) incur costs while running. Make sure to configure auto-stop settings in your Databricks workspace to avoid unexpected charges. WISEPIM will attempt to start your SQL warehouse if it is stopped, but this may add latency to the first operation.

Next Steps

Once your Databricks integration is set up, you can:
  • Import products from your Databricks tables
  • Enrich your products with AI-powered content optimization
  • Export products to Delta Lake tables for analytics
  • Build dashboards and ML models using your enriched product data in Databricks
  • Set up scheduled exports for continuous data synchronization