Argus Catalog
An integrated AI·Data·API metadata platform that governs data, models, APIs, and AI agents in a single catalog. With strong support for air-gapped and on-premises environments, it preserves enterprise-wide data sovereignty because data never has to leave the organization's network.
Concept Diagram

Highlights
Unified governance of data, models, APIs & AI
Brings the data catalog, ML model registry, API catalog, and AI Agent catalog together to deliver an enterprise-wide single source of truth (SSOT).
Auto-sync across 11 data sources
Automatically collects metadata from Hive, Impala, Kudu, Trino, StarRocks, Greenplum, Iceberg REST, PostgreSQL, MySQL, Oracle, and MSSQL, keeping schemas, statistics, and lineage up to date.
Column-level cross-platform lineage
Automatically traces end-to-end lineage at the dataset and column level via SQL parsing, and generates ER diagrams from DDL parsing.
Air-gapped / on-prem + local LLMs
Integrates with OpenAI and Anthropic, and also with local LLMs such as Ollama, so AI governance remains fully available in closed networks without data leaving the environment.
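To make the column-level lineage highlight above more concrete, here is a minimal Python sketch of how lineage can be derived from SQL text. It is not Argus Catalog's parser: the function name `column_lineage` and the table names are hypothetical, and the regex handles only a single `INSERT INTO ... (cols) SELECT ... FROM table` statement with no joins, subqueries, function calls, or commas inside expressions. A production parser would use a full SQL grammar.

```python
import re


def column_lineage(sql: str) -> dict[str, list[str]]:
    """Map each target column to the source columns its expression reads.

    Toy parser (hypothetical, for illustration only). Supports exactly:
        INSERT INTO <target> (<cols>) SELECT <exprs> FROM <source>
    """
    pattern = re.compile(
        r"insert\s+into\s+(\w+)\s*\(([^)]*)\)\s*select\s+(.*?)\s+from\s+(\w+)",
        re.IGNORECASE | re.DOTALL,
    )
    match = pattern.search(sql)
    if match is None:
        raise ValueError("unsupported statement")

    target, target_cols, select_list, source = match.groups()
    targets = [c.strip() for c in target_cols.split(",")]
    exprs = [e.strip() for e in select_list.split(",")]
    if len(targets) != len(exprs):
        raise ValueError("column count mismatch")

    lineage: dict[str, list[str]] = {}
    for col, expr in zip(targets, exprs):
        # Assumption: every identifier in the expression is a source column.
        idents = re.findall(r"[A-Za-z_]\w*", expr)
        lineage[f"{target}.{col}"] = sorted({f"{source}.{i}" for i in idents})
    return lineage


sql = (
    "INSERT INTO sales_summary (order_id, revenue) "
    "SELECT id, price * quantity FROM orders"
)
lineage = column_lineage(sql)
for target_col, sources in lineage.items():
    print(target_col, "<-", ", ".join(sources))
```

Each target column is keyed by `table.column`, so the same structure extends to dataset-level lineage by collapsing the keys to their table prefix.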
Platform Architecture
An end-to-end metadata platform in which the Catalog UI, Server, Extensions, and SDK work together.
Core Capabilities
The five pillars of enterprise metadata management in a single platform: data catalog, data quality, metadata governance, ML model registry, and AI.
Data Catalog
The core for discovering, trusting, and governing datasets.
Data Quality
Profiles source databases directly and validates with rules.
Metadata Governance
Catalogs not just data but APIs and AI agents too.
ML Model Registry
MLflow/OCI-compatible model governance with air-gapped import.
AI
Auto-generates metadata with LLMs and queries the catalog.
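As an illustration of the Data Quality pillar above (profiling a source database directly, then validating with rules), the following Python sketch uses the standard-library `sqlite3` module as a stand-in source database. The helper names `profile_column` and `check_rules`, the `customers` table, and the rule names are all hypothetical and do not reflect Argus Catalog's actual API.

```python
import sqlite3


def profile_column(conn: sqlite3.Connection, table: str, column: str) -> dict:
    """Compute simple profile statistics for one column.

    Identifiers are interpolated into the SQL directly, which is acceptable
    only in a sketch where table and column names are trusted.
    """
    total, non_null, distinct = conn.execute(
        f"SELECT COUNT(*), COUNT({column}), COUNT(DISTINCT {column}) FROM {table}"
    ).fetchone()
    return {"rows": total, "nulls": total - non_null, "distinct": distinct}


def check_rules(profile: dict, rules: dict) -> list[str]:
    """Return the names of the rules that the profile violates."""
    return [name for name, predicate in rules.items() if not predicate(profile)]


# In-memory database standing in for a real source system.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER, email TEXT)")
conn.executemany(
    "INSERT INTO customers VALUES (?, ?)",
    [(1, "a@example.com"), (2, None), (3, "a@example.com")],
)

profile = profile_column(conn, "customers", "email")

# Hypothetical rules expressed as predicates over the profile.
rules = {
    "no_nulls": lambda p: p["nulls"] == 0,
    "has_rows": lambda p: p["rows"] > 0,
}
failed = check_rules(profile, rules)
print(profile, failed)
```

Keeping profiling and rule evaluation separate means the same profile can be stored as catalog statistics and reused by any number of rules.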
An open-source metadata platform
Argus Catalog is open source on GitHub under the Apache License 2.0. Apart from the metadata ingestion connectors, the entire core engine (backend, frontend, SDK, AI agent, and quality batch) is public, so enterprises can inspect the code directly, extend it to fit their environment, and operate it without external data leakage.
- Apache 2.0 with no commercial-use restrictions
- Verify and extend the code yourself
- Self-host in air-gapped or on-premises environments