Integrating Trino with BI Tools — Superset, Tableau, and JDBC/Python Clients
A practical guide to connecting Trino to BI tools and applications: JDBC/ODBC drivers and connection URLs, Superset and Tableau integration, Python (trino, SQLAlchemy) clients, authentication (LDAP/OAuth2), and operational patterns that protect the cluster from dashboard traffic storms.
No matter how well you tune Trino, your users ultimately experience it through BI tools and applications. When connection settings are off, it comes back as complaints — "it's slow", "it keeps disconnecting", "authentication doesn't work". Conversely, once you get the connection, authentication, and operational patterns right, Trino becomes a powerful backend for tools like Superset and Tableau.
This post covers the practical side of connecting Trino to BI tools and clients — JDBC/ODBC drivers, Superset and Tableau integration, Python clients, authentication setup, and operational patterns that shield the cluster from dashboard traffic.
1. Connection Basics — Drivers and URLs
All Trino clients connect to the coordinator's HTTP(S) endpoint.
| Client | Connection method |
|---|---|
| BI tools (Superset, Tableau, DBeaver, etc.) | JDBC or ODBC driver |
| Applications | Python/Go/Node clients, JDBC |
| CLI | trino CLI |
JDBC URL Structure
jdbc:trino://<host>:<port>/<catalog>/<schema>?<properties>
# Plaintext (PoC)
jdbc:trino://trino:8080/iceberg/analytics
# TLS + LDAP authentication (production)
jdbc:trino://trino:8443/iceberg/analytics?SSL=true&user=alice&password=...
| Property | Meaning |
|---|---|
| SSL=true | HTTPS connection |
| user / password | LDAP/file-based password authentication |
| SSLTrustStorePath | Trust your internal CA certificate |
| source | Client identification (important; see Section 6) |
| clientTags | Tags for Resource Group selectors |
Key tip: always set source and clientTags. Resource Group selectors classify queries by these values, so the operational pattern of "isolating dashboard traffic into a dashboard group" starts here. (Resource Groups are covered in a separate post, "Trino Memory Management and Resource Groups".)
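As a rough illustration of the URL structure and the key tip above, the sketch below assembles a JDBC URL with source and clientTags always present. The helper name build_trino_jdbc_url and the tag values are hypothetical; only the property names (SSL, source, clientTags) come from the table above.

```python
from urllib.parse import quote, urlencode


def build_trino_jdbc_url(host, port, catalog, schema, **properties):
    """Assemble a Trino JDBC URL, percent-encoding each property value."""
    base = f"jdbc:trino://{host}:{port}/{quote(catalog)}/{quote(schema)}"
    if not properties:
        return base
    # List values (e.g. clientTags) are flattened to a comma-separated string.
    flat = {
        key: ",".join(value) if isinstance(value, (list, tuple)) else str(value)
        for key, value in properties.items()
    }
    # Keep commas readable; everything else unsafe is percent-encoded.
    return f"{base}?{urlencode(flat, safe=',', quote_via=quote)}"


url = build_trino_jdbc_url(
    "trino", 8443, "iceberg", "analytics",
    SSL="true", source="superset", clientTags=["dashboard", "bi"],
)
print(url)
# → jdbc:trino://trino:8443/iceberg/analytics?SSL=true&source=superset&clientTags=dashboard,bi
```

Credentials are deliberately left out of the URL here. JDBC also lets you supply user and password as driver properties, which keeps them out of logs that record connection strings.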
2. Apache Superset Integration
Superset is built on SQLAlchemy, so you connect using the Trino dialect. The dialect ships with the trino Python package; the older standalone sqlalchemy-trino package has been folded into it.
Connection String (SQLAlchemy URI)
# Basic
trino://alice@trino:8080/iceberg
# TLS + password authentication
trino://alice:password@trino:8443/iceberg?protocol=https
Setup flow:
- Enter the URI above under Database Connections in Superset.
- If you set a catalog like iceberg as the default DB, schemas and tables are discovered automatically.
- To use multiple catalogs, add one DB connection per catalog, or qualify queries with catalog.schema.table.
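One practical snag with URIs like the ones above is a password containing characters such as @, / or :, which must be percent-encoded before it goes into the URI. The sketch below shows one way to do that with the standard library. The function name build_superset_trino_uri and the sample password are hypothetical, and query parameters are left out on purpose.

```python
from urllib.parse import quote_plus


def build_superset_trino_uri(user, host, port, catalog, password=None):
    """Build a trino:// SQLAlchemy URI with percent-encoded credentials."""
    credentials = quote_plus(user)
    if password is not None:
        credentials += ":" + quote_plus(password)
    return f"trino://{credentials}@{host}:{port}/{catalog}"


basic_uri = build_superset_trino_uri("alice", "trino", 8080, "iceberg")
secure_uri = build_superset_trino_uri(
    "alice", "trino", 8443, "iceberg", password="p@ss/w:rd"
)
print(basic_uri)   # → trino://alice@trino:8080/iceberg
print(secure_uri)  # → trino://alice:p%40ss%2Fw%3Ard@trino:8443/iceberg
```

To create one connection per catalog, call the helper once for each catalog name and register each resulting URI as its own Superset database.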
Superset Operational Tips
- Result caching: enable Superset's own cache (Redis, etc.) so re-loading the same chart doesn't hit Trino. Since Trino has no result cache (see the separate post "Trino Caching Strategies"), the BI tool cache effectively serves as your result cache.
- Async queries: configure asynchronous execution with Celery workers so long queries don't tie up web workers.
- Row limit: set sensible default row limits on charts to prevent accidental large fetches.
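The three tips above map onto settings in superset_config.py. The fragment below is a minimal sketch, not a recommended configuration: the Redis address, timeouts, and limits are placeholder assumptions to adapt to your deployment.

```python
# superset_config.py (sketch): all values are illustrative assumptions.
REDIS_URL = "redis://redis:6379/0"  # assumed Redis location

# Result caching: chart data is served from Redis instead of re-querying Trino.
DATA_CACHE_CONFIG = {
    "CACHE_TYPE": "RedisCache",
    "CACHE_DEFAULT_TIMEOUT": 300,  # seconds a cached chart result stays valid
    "CACHE_KEY_PREFIX": "superset_data_",
    "CACHE_REDIS_URL": REDIS_URL,
}


# Async queries: Celery workers run long queries outside the web workers.
class CeleryConfig:
    broker_url = REDIS_URL
    result_backend = REDIS_URL
    imports = ("superset.sql_lab",)


CELERY_CONFIG = CeleryConfig
SQLLAB_ASYNC_TIME_LIMIT_SEC = 60 * 30  # stop async queries after 30 minutes

# Row limits: cap what charts and SQL Lab fetch by default.
ROW_LIMIT = 10000
DEFAULT_SQLLAB_LIMIT = 1000
```

Async execution in SQL Lab typically also needs a results backend and the asynchronous option enabled on the database connection itself; check the Superset documentation for the version you run.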
3. Tableau Integration
Tableau connects via the Starburst/Trino connector or JDBC/ODBC.
- Live vs Extract: a Live connection hits Trino on every interaction. With many dashboard users, the load adds up. Using Extract to keep periodic snapshots on the Tableau server dramatically reduces Trino load.
- Authentication: configure TLS + LDAP/OAuth2 via driver options. In an SSO environment, set up an OAuth2 connection.
- Passing user context: set source/clientTags on the connection to identify and isolate Tableau traffic.
| Mode | Trino load | Freshness | Best for |
|---|---|---|---|
| Live | High (every interaction) | Real time | Few users, real-time needs |
| Extract | Low (periodic extract) | As of extract | Many users, dashboards |
General principle: production dashboards with many users are best served by a pre-aggregated table + Extract combination, while exploratory analysis by a small group of analysts fits Live.
4. Python Clients
For applications and data pipelines, use the official trino Python package.
import trino

conn = trino.dbapi.connect(
    host="trino",
    port=8443,
    user="alice",
    catalog="iceberg",
    schema="analytics",
    http_scheme="https",
    auth=trino.auth.BasicAuthentication("alice", "password"),  # LDAP
    source="etl-pipeline",  # Resource Group identification
    client_tags=["batch"],
)
cur = conn.cursor()
cur.execute("""
    SELECT event_type, count(*) AS cnt
    FROM events
    WHERE event_time >= TIMESTAMP '2026-06-01 00:00:00 UTC'
    GROUP BY event_type
""")
for row in cur.fetchall():
    print(row)
SQLAlchemy / pandas
from sqlalchemy import create_engine
import pandas as pd
engine = create_engine("trino://alice@trino:8443/iceberg?protocol=https")
df = pd.read_sql("SELECT * FROM analytics.daily_active_users", engine)
Caution: don't pull a huge result set into memory wholesale, as in pd.read_sql("SELECT * FROM huge_table"). The rule is to trim with WHERE/LIMIT, or aggregate server-side and fetch only the small result.
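When a large result genuinely has to leave the cluster, reading it in batches keeps client memory bounded. The sketch below uses the DB-API fetchmany call, which the trino cursor also provides. sqlite3 appears only as a stand-in so the example runs without a cluster, and the helper name iter_batches is hypothetical.

```python
import sqlite3


def iter_batches(cursor, batch_size=1000):
    """Yield lists of rows from a DB-API cursor, one batch at a time."""
    while True:
        rows = cursor.fetchmany(batch_size)
        if not rows:  # an empty list means the result set is exhausted
            break
        yield rows


# Stand-in database: with Trino, `conn` would come from trino.dbapi.connect().
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (event_type TEXT)")
conn.executemany(
    "INSERT INTO events VALUES (?)", [("click",)] * 5 + [("view",)] * 2
)

cur = conn.cursor()
cur.execute("SELECT event_type FROM events")
batch_sizes = [len(batch) for batch in iter_batches(cur, batch_size=3)]
print(batch_sizes)  # → [3, 3, 1]
```

pandas users can aim for a similar effect with the chunksize argument of pd.read_sql, which returns an iterator of DataFrames instead of one large frame.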
5. Authentication Setup at a Glance
Configure clients to match whichever authentication method the cluster has enabled (see the separate post "The Complete Trino Security Guide").
| Cluster authentication | JDBC | Python |
|---|---|---|
| LDAP/file (PASSWORD) | user+password, SSL=true | BasicAuthentication |
| OAuth2 / OIDC | externalAuthentication=true (browser flow) or accessToken | OAuth2Authentication |
| Kerberos | KerberosPrincipal, etc. | KerberosAuthentication |
| mTLS (certificates) | Client keystore (SSLKeyStorePath) | CertificateAuthentication |
Since virtually every authentication method assumes TLS (never send plaintext passwords), clients must set SSL=true/http_scheme=https together with the truststore (your internal CA).
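One way to enforce the TLS rule above in application code is a small pre-flight check on the connection arguments before they reach trino.dbapi.connect. This is a sketch: check_tls is a hypothetical helper, it treats a missing http_scheme as plain HTTP (an assumption), and the CA bundle path is a placeholder.

```python
def check_tls(connect_kwargs):
    """Raise if authentication is configured without HTTPS."""
    uses_auth = connect_kwargs.get("auth") is not None
    # Assumption of this sketch: an omitted http_scheme means plain HTTP.
    scheme = connect_kwargs.get("http_scheme", "http")
    if uses_auth and scheme != "https":
        raise ValueError("authentication configured without http_scheme='https'")
    return connect_kwargs


safe_kwargs = check_tls({
    "host": "trino",
    "port": 8443,
    "http_scheme": "https",
    "auth": ("alice", "password"),           # placeholder for an auth object
    "verify": "/etc/ssl/internal-ca.pem",    # hypothetical internal CA bundle
})

try:
    check_tls({"host": "trino", "port": 8080, "auth": ("alice", "password")})
    rejected = False
except ValueError:
    rejected = True
print(rejected)  # → True
```

In real use the auth value would be one of the trino.auth classes from the table, and the validated dictionary would be passed on as trino.dbapi.connect(**safe_kwargs).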
6. Protecting the Cluster from Dashboard Traffic
The most common incident in BI integration is dashboard auto-refresh combined with a surge of concurrent users grinding the cluster to a halt. Defenses:
6.1 Isolation via source/clientTags
Superset → source=superset → Resource Group: global.dashboard
Tableau → source=tableau → Resource Group: global.dashboard
ETL → source=etl-pipeline → Resource Group: global.batch
Analyst CLI → (default) → Resource Group: global.adhoc
With concurrency, memory, and queue limits on the dashboard group, even a flood of dashboard queries cannot encroach on batch and ad-hoc analysis resources.
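The mapping above can be sketched as a resource group definition in the shape of Trino's file-based resource group JSON, built here as a Python dict. The limits are illustrative assumptions, and route is a hypothetical helper that only imitates first-match selector evaluation for illustration; Trino performs the real matching itself.

```python
import json
import re

resource_groups = {
    "rootGroups": [{
        "name": "global",
        "softMemoryLimit": "80%",
        "hardConcurrencyLimit": 100,
        "maxQueued": 1000,
        "subGroups": [
            # Illustrative limits: dashboards are capped well below the total.
            {"name": "dashboard", "softMemoryLimit": "30%",
             "hardConcurrencyLimit": 20, "maxQueued": 200},
            {"name": "batch", "softMemoryLimit": "50%",
             "hardConcurrencyLimit": 10, "maxQueued": 100},
            {"name": "adhoc", "softMemoryLimit": "20%",
             "hardConcurrencyLimit": 10, "maxQueued": 50},
        ],
    }],
    "selectors": [
        {"source": "superset|tableau", "group": "global.dashboard"},
        {"source": "etl-pipeline", "group": "global.batch"},
        {"group": "global.adhoc"},  # catch-all for everything else
    ],
}


def route(source, selectors=resource_groups["selectors"]):
    """Return the group of the first selector whose source pattern matches."""
    for selector in selectors:
        pattern = selector.get("source")
        if pattern is None or re.fullmatch(pattern, source or ""):
            return selector["group"]
    return None


print(route("superset"))      # → global.dashboard
print(route("etl-pipeline"))  # → global.batch
print(route("trino-cli"))     # → global.adhoc
print(json.dumps(resource_groups["selectors"], indent=2))
```

Selector order matters because the first match wins, so the catch-all entry belongs last. A client that never sets source lands in the ad-hoc group, which is why setting it on every client is part of the checklist.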
6.2 BI Tool Caching + Pre-aggregation
- Enable Superset/Tableau caching to absorb repeated identical queries.
- Pre-compute heavy aggregations as Materialized Views or daily-batch summary tables, so dashboards query only small tables. (See the separate post "Trino Caching Strategies".)
6.3 Guardrails
- Cluster side: block accidental massive scans with query.max-scan-physical-bytes.
- BI tool side: set chart row limits and query timeouts.
7. Integration Checklist
- Production uses TLS (SSL=true/https) + internal CA truststore
- Client configuration matches the authentication method (LDAP/OAuth2/Kerberos)
- source/clientTags set on every client
- source-based Resource Group selectors isolating dashboard/batch/adhoc
- BI tool result caching enabled
- Heavy aggregations pre-computed as summary tables/Materialized Views
- Extract mode evaluated for Tableau with many users
- Chart row limits and timeouts, cluster scan guardrails
8. Summary
| Target | Connection | Operational focus |
|---|---|---|
| Superset | SQLAlchemy trino:// | Built-in cache + async + source tag |
| Tableau | Trino connector/JDBC | Extract vs Live, source identification |
| Python apps | trino dbapi / SQLAlchemy | No huge fetches, set source |
| All clients | JDBC/ODBC + TLS | Authentication, truststore, clientTags |
Integrating Trino with BI tools comes down to two things. First, configure TLS + authentication consistently all the way to the clients. Second, identify traffic with source/clientTags and isolate it with Resource Groups, while absorbing cluster load through BI caching and pre-aggregation. With these operational patterns in place, you can scale Trino as a company-wide BI backend — keeping analytics and batch workloads stable even as dashboard users grow.
This post is based on the Trino 440 series. If you need help with BI tool integration or dashboard backend architecture, feel free to reach out.
— Data Dynamics Engineering Team