Tableau Online – Setup & Control Using Python/Command Line Interface

What is Tableau Online?

Tableau Online is a comprehensive solution to host your analytics platform in the cloud. Without the need of setting up servers, designing the platform, managing upgrades or investing any setup cost – at the enterprise level this serves as the best solution to host your Tableau Dashboards so that everyone can easily access them via the Desktop Browser or the Mobile App.

With the level of maturity that the Tableau Community has achieved over the years, it comes as no surprise that Tableau has a developer program that can enable all the Tableau evangelists to get their hands on to their own developer website – https://www.tableau.com/developer

Before you proceed, make sure you create your own tableau Online website by clicking on the above link.

Overview

This is a part of my presentation at the DC + Baltimore + NoVA Tableau User Group event – https://www.youtube.com/watch?v=u_iayFA9XuA (Click here to see the entire session)

Now that you have your own Tableau Online website the next logical step would be to learn what you can do with it and whats the best way to administer and interact with the platform. This walk-through will take you through all the nuances that would help you flex your Tableau skills – now on your own Tableau Online platform.

This walk-through primarily leverages Python, Tableau APIs and the Command Line Interface to interact and administer the platform. The reasons for doing that are multifold – a command line interface can help you get a more granular control over your platform, just like AWS, Tableau is no different. In addition, you can also automate site administration tasks, leverage API’s, and connect to peripherals like Postman/IFTTT/Webhooks etc to always be aware of the different moving parts within your Tableau Online instance.

This is part 1 of a three part series on Tableau Online. Specifically this part covers the setup and the following functionalities –

  • Connecting to Tableau Online via Python (Tableau Server Client)
  • Using Hyper API
  • Querying Projects on Tableau Online
  • Publishing Hyper files
  • Querying available data sources on Tableau Online
  • Querying views contained within a specific workbook

Pre-Requisites

  • tabcmd – Tableau provides the tabcmd command-line utility which you can use to automate site administration tasks on your Tableau Server site. For example, creating or deleting users, projects, and groups. The tabcmd utility is included with Tableau Server, however its installer is not included. If you want to run it on a computer other than the initial server node, you need to download the installer from the Tableau website. Use this link to download the latest version of tabcmd – https://downloads.tableau.com/esdalt/2020.3.0/TableauServerTabcmd-64bit-2020-3-0.exe

Walk-Through

Lets dive right into the code. I have used Jupyter Notebook, however, you can opt for any editor of your choice.

Import Libraries
import os
import pandas as pd
import tableauserverclient as TSC
from pandleau import *
from tableauhyperapi import Connection, HyperProcess, SqlType, TableDefinition, escape_string_literal, escape_name, NOT_NULLABLE, Telemetry, Inserter, CreateMode, TableName
import pantab
import subprocess
from tableau_api_lib import TableauServerConnection
from tableau_api_lib.utils.querying import get_workbooks_dataframe, get_datasources_dataframe
  • The Tableau Server Client (TSC) – is a Python library for the Tableau Server REST API. Using the TSC library, you can manage and change many of the Tableau Server and Tableau Online resources programmatically. You can use this library to create your own custom applications.
  • Tableau Hyper API – contains a set of functions you can use to automate your interactions with Tableau extract (.hyper) files. You can use the API to create new extract files, or to open existing files, and then insert, delete, update, or read data from those files.
  • Pandleau – A quick and easy way to convert a Pandas DataFrame to a Tableau .tde or .hyper extract.
  • Pantab – is a Python wrapper around Tableau’s Hyper API which promotes usage of the pandas DataFrame to seamlessly generate and read hyper extracts.

P.S. The last two of the above mentioned libraries are something that we will need in the forthcoming parts of this walk-through.

Storing login credentials in variables

Note: as a best practice it’s recommended that you never hard-code passwords/credentials in your code. To that end – I have stored the credentials as environment variables

username = os.getenv("tableau_user")
password = os.getenv("tableau_password")
site_id = os.getenv("tableau_site_id")
Using the credentials to log into Tableau Online (Tableau Developer Sites use the server – https://10ax.online.tableau.com/)
tableau_auth = TSC.TableauAuth(username,
 password, site_id = site_id)
server = TSC.Server('https://10ax.online.tableau.com/',
use_server_version=True)
server.auth.sign_in(tableau_auth)

For the purpose of this walk-through I created a sample workbook built using a really basic table. Attached both of them here for reference – https://bit.ly/32LW51B

Reading the CSV
file = "path/Sample-Sales.csv"
df = pd.read_csv(file)
Basic pre-processing to demonstrate TSC and Hyper API capabilities
df_north = df[df['Region']=="North"]
df_south = df[df['Region']=="South"]
df_east = df[df['Region']=="East"]
df_west = df[df['Region']=="West"]

df_h = pandleau(df)
df_north_h = pandleau(df_north)
df_south_h = pandleau(df_south)
df_east_h = pandleau(df_east)
df_west_h = pandleau(df_west)
Saving data-frames as .hyper files

df_h.to_tableau('path/fulldata.hyper', add_index=False)
df_north_h.to_tableau('path/northdata.hyper', add_index=False)
df_south_h.to_tableau('path/southdata.hyper', add_index=False)
df_east_h.to_tableau('path/eastdata.hyper', add_index=False)
df_west_h.to_tableau('path/westdata.hyper', add_index=False)

For the purpose of our next steps make sure you create at least one project on your tableau online site –

Querying details of the project of interest (say – “Demo”) and upload the aforementioned workbook (say – “New-Sales”)
#printing project id
projectname = 'Demo'
print([(project.name, project.id) for project in TSC.Pager(server.projects) if project.name.startswith(projectname)])
project_id = [(project.id) for project in TSC.Pager(server.projects) if project.name.startswith(projectname)]
Use the Project-ID from the output above to publish our newly created .hyper files
mydatasourceitem = TSC.DatasourceItem(project_id[0],
name='fulldata Extract')
item = server.datasources.publish(mydatasourceitem,
'path/fulldata.hyper', 'Overwrite')
print("{} published with id: {}".format(item.name, item.id))
Querying Tableau online for all the available data-sources
#printing id for all data sources
with server.auth.sign_in(tableau_auth):
    all_datasources, pagination_item = server.datasources.get()
    print("\nThere are {} datasources on site: ".format(pagination_item.total_available))
    print([(datasource.name,datasource.id) for datasource in all_datasources])
Querying Tableau Online for the workbook ID of the workbook that contains our Sales Report
#printing selected workbook id
workbookname = 'New-Sales'
with server.auth.sign_in(tableau_auth):
    all_workbooks_items, pagination_item = server.workbooks.get()
    # print names of first 100 workbooks
    print([(workbook.id, workbook.name) for workbook in all_workbooks_items if workbook.name.startswith(workbookname)])
    temp = [(workbook.id) for workbook in all_workbooks_items if workbook.name.startswith(workbookname)]
    workbook_id = temp[0]
Obtaining view IDs of all the views contained within our workbook
with server.auth.sign_in(tableau_auth):

    all_workbooks_items, pagination_item = server.workbooks.get()
    for wb in TSC.Pager(server.workbooks):
        wbID = wb.id
        if wb.project_name == "Demo" and wb.name == "New-Sales":  # and wb.name == WorkbookName : #use with line#10. uncomment if list needs to be narrowed down to a specific workbook
            workbook = server.workbooks.get_by_id(wbID)
            server.workbooks.populate_views(workbook)
            print(workbook.project_name)
            print("\t", workbook.name)
            for view in workbook.views:
                print("\t", "\t", view.name + ": " + view.id)
Alternatively to look for all the views on your website
with server.auth.sign_in(tableau_auth):
    all_views, pagination_item = server.views.get()
    print([(view.name,view.id) for view in all_views])

This marks the end of part 1 of the Tableau Online series. I have divided the walk-through based on specific applications that it can be used for. In the forthcoming parts we will dive deeper into leveraging tabcmd via the command line and creating a business use case for distribution of these reports outside your organization via an automation script.

%d bloggers like this: