TableauUsageToDatabricks
Overview
TableauUsageToDatabricks is a .NET application designed to extract Tableau usage data and upload it to Databricks in a structured format. It parses Tableau XML and JSON files, transforms them into models, and writes the results as Parquet files for analytics and reporting in Databricks.
Prerequisites
- For Users:
- Windows 64-bit OS
- PowerShell or CMD (Command Prompt) to run the application
- Access to Tableau Server with a role that can read the required resources
- Databricks workspace and credentials
- No .NET installation required (if using the provided self-contained executable)
- For Developers:
- .NET 8 SDK installed (required to build or publish the application)
How to create tokens and find IDs
- Tableau Server Personal Access Token:
- Log in to Tableau Server.
- Go to your account settings (top right corner).
- Find "Personal Access Tokens" and click "Create a new token".
- Enter a name, generate the token, and copy it. Save the token name and secret for use in appsettings.json.
- Databricks Personal Access Token:
- Log in to your Databricks workspace.
- Click your user icon (top right) and select "User Settings".
- Go to "Access Tokens" and click "Generate New Token".
- Copy the generated token and use it in appsettings.json.
- Databricks SQL Warehouse ID:
- In your Databricks workspace, go to "SQL" from the sidebar.
- Click on "SQL Warehouses".
- Select the warehouse you want to use.
- Copy the "Warehouse ID" from the warehouse details page and use it in appsettings.json.
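Before pasting the new tokens and warehouse ID into appsettings.json, it can help to confirm that each service accepts them. The following is a minimal Python sketch, separate from the application itself, that only builds the two HTTP requests. It assumes the Tableau REST API sign-in endpoint and the Databricks SQL Warehouses endpoint; the function names, the default API version "3.19", and the empty site content URL (the Default site) are illustrative assumptions and should be adjusted to your environment.

```python
import json
import urllib.request


def tableau_signin_request(base_url, token_name, token_secret,
                           api_version="3.19", site_content_url=""):
    """Build (but do not send) a Tableau REST API sign-in request for a PAT.

    api_version and site_content_url are assumptions; use the REST API
    version your Tableau Server supports and your own site's content URL.
    """
    payload = {
        "credentials": {
            "personalAccessTokenName": token_name,
            "personalAccessTokenSecret": token_secret,
            "site": {"contentUrl": site_content_url},
        }
    }
    return urllib.request.Request(
        f"{base_url.rstrip('/')}/api/{api_version}/auth/signin",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json",
                 "Accept": "application/json"},
        method="POST",
    )


def databricks_warehouse_request(host, token, warehouse_id):
    """Build (but do not send) a request that looks up one SQL warehouse."""
    return urllib.request.Request(
        f"{host.rstrip('/')}/api/2.0/sql/warehouses/{warehouse_id}",
        headers={"Authorization": f"Bearer {token}"},
        method="GET",
    )
```

Passing either request to urllib.request.urlopen sends it. A successful response suggests the values are usable, while an HTTP error such as 401 or 404 points at a wrong token or warehouse ID.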
Configuration: appsettings.json
Before running the application, you must create and configure an appsettings.json file. This file contains all necessary settings for Tableau and Databricks integration.
Where to place: Place appsettings.json in the same directory as TableauUsageToDatabricks.exe (the published executable). Users can edit this file at any time to update configuration without modifying the application.
Example appsettings.json
{
  "Tableau": {
    "BaseUrl": "https://your-tableau-server.com",
    "TokenName": "your-token-name",
    "TokenSecret": "your-token-secret"
  },
  "Databricks": {
    "Host": "https://your-databricks-instance.cloud.databricks.com",
    "PersonalAccessToken": "your-databricks-token",
    "WarehouseId": "your-warehouse-id",
    "VolumePath": "/mnt/your-volume",
    "DatabaseName": "your-database-name"
  },
  "Parquet": {
    "DataSourcesFileName": "datasources.parquet",
    "WorkbooksFileName": "workbooks.parquet",
    "ViewsFileName": "views.parquet",
    "WorkbookUsageFileName": "workbook_usage.parquet",
    "ViewUsageFileName": "view_usage.parquet",
    "WorkbookConnectionsFileName": "workbook_connections.parquet",
    "WorkbookDatasourcesFileName": "workbook_datasources.parquet",
    "DataSourceConnectionsFileName": "datasource_connections.parquet"
  }
}
Required fields
- Tableau: Server base URL plus the name and secret of your personal access token (BaseUrl, TokenName, TokenSecret).
- Databricks: Host URL, personal access token, SQL warehouse ID, volume path, and database name.
- Parquet: Output file name for each of the eight exported data sets.
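A quick way to catch a missing or empty setting before launching the executable is to check the file against the key names in the example. The sketch below is a standalone Python helper, not part of the application. The section and key names are copied from the example appsettings.json, while the function names and the rule that every value must be a non-empty string are assumptions made for illustration.

```python
import json

# Section and key names copied from the example appsettings.json.
REQUIRED_KEYS = {
    "Tableau": ["BaseUrl", "TokenName", "TokenSecret"],
    "Databricks": ["Host", "PersonalAccessToken", "WarehouseId",
                   "VolumePath", "DatabaseName"],
    "Parquet": ["DataSourcesFileName", "WorkbooksFileName", "ViewsFileName",
                "WorkbookUsageFileName", "ViewUsageFileName",
                "WorkbookConnectionsFileName", "WorkbookDatasourcesFileName",
                "DataSourceConnectionsFileName"],
}


def find_missing_settings(settings):
    """Return missing entries as 'Section' or 'Section:Key' strings."""
    missing = []
    for section, keys in REQUIRED_KEYS.items():
        values = settings.get(section)
        if not isinstance(values, dict):
            # The whole section is absent or is not a JSON object.
            missing.append(section)
            continue
        for key in keys:
            value = values.get(key)
            # Assumption: every setting must be a non-empty string.
            if not isinstance(value, str) or not value.strip():
                missing.append(f"{section}:{key}")
    return missing


def check_settings_file(path="appsettings.json"):
    """Load the JSON file at path and report its missing settings."""
    with open(path, encoding="utf-8") as handle:
        return find_missing_settings(json.load(handle))
```

Calling check_settings_file() from the directory that holds TableauUsageToDatabricks.exe returns an empty list when every key in the example is present and filled in.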
Usage
- Edit appsettings.json with your Tableau and Databricks configuration.
- The application connects to Tableau Server, collects usage statistics, and saves them as Parquet files in the Temp folder before uploading them to Databricks.
- If Parquet files already exist in the Temp folder, they are reused instead of being regenerated.
- Parquet files created earlier the same day are not overwritten or re-created.
- Run the application from PowerShell or CMD:
.\TableauUsageToDatabricks.exe
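The reuse rule for Parquet files in the Temp folder can be pictured as a small check made before each export. The Python sketch below shows one possible reading of that rule and is not the application's actual code. It assumes that "the same day" is judged by the file's last-modified date in local time, and the function name is hypothetical.

```python
import datetime
import os


def is_reusable_today(parquet_path, today=None):
    """Return True if the file exists and was last modified on `today`.

    Assumption: freshness is decided by the local last-modified date;
    the real application may use a different criterion.
    """
    if not os.path.isfile(parquet_path):
        return False
    if today is None:
        today = datetime.date.today()
    modified = datetime.date.fromtimestamp(os.path.getmtime(parquet_path))
    return modified == today
```

Under this reading, deleting a Parquet file from the Temp folder would force it to be regenerated on the next run.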
Features
- Collects usage statistics from Tableau workbooks, views, and datasources
- Converts data to Parquet format
- Uploads data to Databricks
- Supports configuration via appsettings.json
How to Build and Publish (Self-Contained, Win-x64)
- Restore dependencies:
dotnet restore
- Publish a self-contained release build for Windows 64-bit:
dotnet publish -c Release -r win-x64 --self-contained true /p:PublishSingleFile=true /p:IncludeNativeLibrariesForSelfExtract=true -o publish
Output will be in the publish folder.
- Create a ZIP file of the release:
Compress-Archive -Path publish\* -DestinationPath TableauUsageToDatabricks-win-x64.zip
Publishing to NuGet
- Build the NuGet package:
dotnet pack -c Release
- Publish to NuGet.org:
dotnet nuget push .\bin\Release\TableauUsageToDatabricks.1.0.0.nupkg --api-key <your-nuget-api-key> --source https://api.nuget.org/v3/index.json