SafetyCulture to Databricks

This project is a data integration tool that collects information from the SafetyCulture API v1.0 and inserts it into Databricks Delta tables. It is designed to automate the extraction, transformation, and loading (ETL) of SafetyCulture data for analytics and reporting in Databricks.

Prerequisites

For Users

How to create tokens and find IDs

For Developers

Configuration: appsettings.json

Before running the application, you must create and configure an appsettings.json file. This file contains all necessary settings for SafetyCulture and Databricks integration.

Where to place

Example appsettings.json

{
  "SafetyCulture": {
    "BaseUrl": "https://api.safetyculture.io",
    "TokenSecret": "your-safetyculture-token"
  },
  "Parquet": {
    "InspectionsFileName": "sc_inspections.parquet",
    "InspectionItemsFileName": "sc_inspection_items.parquet",
    "SitesFileName": "sc_sites.parquet",
    "ActionsFileName": "sc_actions.parquet",
    "ActionTimelineItemsFileName": "sc_action_timeline_items.parquet",
    "ActionAssigneesFileName": "sc_action_assignees.parquet",
    "Users": "sc_users.parquet",
    "GroupUsersFileName": "sc_group_users.parquet",
    "SiteMembersFileName": "sc_site_members.parquet",
    "TemplatesFileName": "sc_templates.parquet",
    "TemplatePermissionsFileName": "sc_template_permissions.parquet"
  },
  "Databricks": {
    "Host": "https://your-databricks-instance.databricks.azure.com",
    "PersonalAccessToken": "your-databricks-token",
    "VolumePath": "/Volumes/your_catalog/your_schema/your_volume",
    "DatabaseName": "your_catalog.your_schema",
    "WarehouseId": "your_warehouse_id"
  },
  "Logging": {
    "LogLevel": {
      "Default": "Information",
      "Microsoft": "Warning",
      "Microsoft.Hosting.Lifetime": "Information"
    },
    "Console": {
      "IncludeScopes": false
    }
  }
}
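
A missing key or a leftover placeholder in appsettings.json tends to surface only once the application is running. The following is a minimal pre-flight check, written in Python rather than as part of the application itself. It assumes that the SafetyCulture and Databricks keys shown in the example are all required and that the file sits in the current working directory. `REQUIRED_KEYS` and `find_config_problems` are hypothetical names introduced for illustration.

```python
import json
from pathlib import Path

# Assumption: these sections and keys, copied from the example file, are all required.
REQUIRED_KEYS = {
    "SafetyCulture": ["BaseUrl", "TokenSecret"],
    "Databricks": ["Host", "PersonalAccessToken", "VolumePath", "DatabaseName", "WarehouseId"],
}


def find_config_problems(config: dict) -> list[str]:
    """Return human-readable problems; an empty list means nothing was flagged."""
    problems = []
    for section, keys in REQUIRED_KEYS.items():
        values = config.get(section)
        if not isinstance(values, dict):
            problems.append(f"missing section: {section}")
            continue
        for key in keys:
            value = values.get(key)
            if not isinstance(value, str) or not value.strip():
                problems.append(f"missing value: {section}.{key}")
            elif "your-" in value.lower() or "your_" in value.lower():
                # The example file uses placeholders such as "your-databricks-token".
                problems.append(f"placeholder left in: {section}.{key}")
    return problems


if __name__ == "__main__":
    settings_path = Path("appsettings.json")  # assumed location: current working directory
    if settings_path.exists():
        # utf-8-sig tolerates the byte order mark some Windows editors add.
        loaded = json.loads(settings_path.read_text(encoding="utf-8-sig"))
        for problem in find_config_problems(loaded):
            print(problem)
    else:
        print(f"{settings_path} not found")
```

The check only looks at the presence and shape of values; it does not verify that the tokens are valid or that the warehouse exists.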

Usage

  1. Edit appsettings.json with your SafetyCulture and Databricks configuration.
  2. The application connects to the SafetyCulture API, collects the data, and saves it as Parquet files in temporary folders before uploading them to Databricks.
  3. Run the application from PowerShell or CMD:
    .\SafetyCultureToDatabricks.exe
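
This README does not describe how the upload step is implemented. As a rough illustration of how the Databricks settings could fit together, the Python sketch below builds a Databricks Files API request that would place one Parquet file in the configured volume, plus a payload for the SQL Statement Execution API that would load it with COPY INTO. The choice of these two endpoints and the rule that the table is named after the file are assumptions, not confirmed behavior of the application. `build_upload_request` and `build_load_statement` are hypothetical helpers.

```python
from pathlib import PurePosixPath
from urllib.parse import quote


def build_upload_request(host: str, token: str, volume_path: str, file_name: str) -> dict:
    """Describe a Files API call that would upload one Parquet file into a volume."""
    target = f"{volume_path.rstrip('/')}/{file_name}"
    return {
        "method": "PUT",
        # Databricks Files API: PUT /api/2.0/fs/files{absolute path of the file}
        "url": f"{host.rstrip('/')}/api/2.0/fs/files{quote(target)}?overwrite=true",
        "headers": {"Authorization": f"Bearer {token}"},
    }


def build_load_statement(database_name: str, warehouse_id: str, volume_path: str, file_name: str) -> dict:
    """Describe a payload for POST /api/2.0/sql/statements that loads the uploaded file."""
    # Assumption: the target table is named after the file, e.g. sc_sites.parquet -> sc_sites.
    table = PurePosixPath(file_name).stem
    source = f"{volume_path.rstrip('/')}/{file_name}"
    return {
        "warehouse_id": warehouse_id,
        "statement": f"COPY INTO {database_name}.{table} FROM '{source}' FILEFORMAT = PARQUET",
    }


if __name__ == "__main__":
    # Illustrative values only; real values come from the Databricks section of appsettings.json.
    request = build_upload_request(
        "https://example.cloud.databricks.com", "token", "/Volumes/main/sc/raw", "sc_sites.parquet"
    )
    print(request["method"], request["url"])
    print(build_load_statement("main.sc", "warehouse-id", "/Volumes/main/sc/raw", "sc_sites.parquet"))
```

Both functions only build request descriptions and send nothing over the network. COPY INTO expects the target Delta table to exist already, so a real pipeline would need to create it first.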
    

Features

Project Structure

Prerequisites

Getting Started

  1. Clone the repository.
  2. Update appsettings.json with your SafetyCulture API token and Databricks connection details.
  3. Build the project:
    dotnet build
    
  4. Run the desired job or the main program:
    dotnet run
    

How to Build and Publish (Self-Contained, Win-x64)

  1. Restore dependencies
    dotnet restore
    
  2. Publish a self-contained release build for Windows 64-bit
    dotnet publish -c Release -r win-x64 --self-contained true /p:PublishSingleFile=true /p:IncludeNativeLibrariesForSelfExtract=true -o publish
    
  3. Create a ZIP file of the release
    Compress-Archive -Path publish\* -DestinationPath SafetyCultureToDatabricks-win-x64.zip
    

How to add another table later

  1. Create a model for the new table.
  2. Create an extractor for it that returns the path of the Parquet file it writes.
  3. Add a BuildXJob(...) method modeled on BuildInspectionsJob and append its result to the jobs array.

License

See LICENSE for details.

Thank you for considering contributing to SafetyCultureToDatabricks!