v2.0.0
Release Changelog
Release Availability Date
25-June-2026
Recommended Versions
- CLI/SDK: 1.6.0.4
- Remote Executor: v2.0.0-cloud, v1.1.3-cloud, v1.0.3-cloud
- On-Prem Versions:
  - Helm: 1.6.173
  - API Gateway: 0.7.3
Tested Remote Executor versions
DataHub Cloud v2.0.0 has been internally tested with the following Remote Executor versions:
| Remote Executor version | Status | Notes |
|---|---|---|
| v2.0.0-cloud | Tested | Recommended. |
| v1.1.3-cloud | Tested | Supported with this release. |
| v1.0.3-cloud | Tested | Supported with this release. |
New Feature Highlights
- Search V2.5 is now the default: cross-entity ranking with name-match boosting and diversity promotion, plus latency optimizations for query understanding and multi-match.
- Bridge-backed semantic search now covers datasets, charts, dashboards, glossary terms, and data products via hidden bridge documents in the document semantic index.
- Action Workflows v2: full backend + frontend rework of the authoring surface around the existing `Filter` (OR-of-ANDs `Criterion[]`) shape and a new composable `DynamicSource(seed + hops + destination)` for graph traversals. Adds expression engine, dynamic actors, cancel/quorum support.
- MCP Audit: new audit surface for MCP tool invocations: GraphQL `mcpAudit` resolver, dedicated audit tab in Settings → AI, KPI charts and history table.
- CDE Steward agent: new 10-star governance agent for Critical Data Element compliance and certification workflows.
- Internationalization (i18n): first-class infrastructure (feature flag + user settings), end-to-end string extraction across the application (entity tabs, search, settings, governance, ingestion source metadata, etc.), and initial DE translations.
- Distributed rate limiting across REST, GraphQL, and OpenAPI with per-endpoint token-bucket controls and configurable jitter.
- Domain propagation: automatic propagation of domain assignment across lineage and containment relationships, with attribution.
- `note_metadata_observation` MCP tool: replaces `register_feedback` and `note_sql_anchor_observation`, and raises ANNOTATE or POST_ATTACHMENT proposals so agent-recorded observations land in the Context Hub inbox for SME review.
- Smart-assertion inference V2: decoupled from `DATAHUB_USE_OBSERVE_MODELS` via the new `DATAHUB_USE_INFERENCE_V2` flag; assertions also record `INIT` during user-defined exclusion windows instead of producing spurious anomalies.
- `SecretService` caller guard: non-system actors (including PATs with `MANAGE_SECRETS`) can no longer decrypt secrets; hardens the credential-access surface.
- New ingestion sources: ThoughtSpot, TimescaleDB, Airbyte; production-ready SAP HANA with calc-view lineage, stored procedures, and query usage.
All changes in https://github.com/datahub-project/datahub/releases/tag/v1.6.0
- Note Breaking Changes: https://datahubproject.io/docs/how/updating-datahub/
Product
- Action Workflows v2 authoring surface: Filter-shaped step/field/entrypoint conditions, composable `DynamicSource(seed + hops + destination)`, regex-only `FieldValidation`. Feature-flag-gated (`actionWorkflowsV2Enabled`). See Data Access Workflows.
- Workflow form requests accept caller-supplied `id`. `createActionWorkflowFormRequest` (and `createActionWorkflowFormRequestV2`) accept an optional `id: String` to produce a stable URN `urn:li:actionRequest:<id>` instead of a UUID.
- Context Generation Settings: domain is now required for enabling Context Generation; end-to-end backend wiring through the curator agent. Tracking events added for context hub features.
- Internationalization — i18n infrastructure with feature flag and user-settings backend; first-class string extraction across entity tabs (schema, queries, observe, validations, incidents, documentation, summary), search, settings, governance (domains, glossary, structured properties), identity / permissions, ingestion source metadata, home pages, and shared components. Initial DE translation for settings pages.
- UI redesign: nav and home hero — fully collapsed nav redesign and home hero toggler.
- External document UI improvements — inline preview, read-only fields, last-synced indicator.
- Document anchor pattern UX — anchor patterns flattened into per-metric cards; sort and cap anchor patterns; inline editing of anchor pattern rendered text.
- Glossary — Redesigned cards and sidebar with semantic tokens. Added support for custom relationships between glossary terms: create a Structured Property and select "Treat as Relationship" to define a new relationship type. Then navigate to the Related Terms tab on any glossary term and choose your custom relationship type when adding a related term.
- Schema field drawer — column statistics shown in full inline on the About tab.
- Dataset Summary — Generate Documentation flow now reachable from the Dataset Summary page.
- Inbox auto-redirect — Task Center inbox auto-redirects to proposals when the tasks tab is empty.
- Tracking events — context hub feature usage tracked for product analytics.
AI / Ask DataHub
- MCP Audit: new GraphQL `mcpAudit` resolver and Settings → AI → MCP Audit tab with feature-flag gate. KPI charts, history table, session and event drawers. `MCP_TELEMETRY_CAPTURE_PAYLOADS` env flag controls payload capture. `McpServerRequest` analytics report script.
- MCP Apps infrastructure: build infrastructure for in-product MCP App surfaces; `mcp/` ↔ `mcp_integration/` boundary enforcement for OSS hygiene.
- `note_metadata_observation` MCP tool. Replaces `register_feedback` and `note_sql_anchor_observation` with a single tool covering both metadata gaps and SQL-anchor quality observations. Update agent prompts that reference the old tool names.
- `note_metadata_observation` raises ANNOTATE proposals on context docs. When called with one or more `urn:li:document:` URNs in `related_objects`, the tool raises one `DOCUMENTS_PROPOSAL` ActionRequest per target with `proposalType: ANNOTATE`. The proposal stages a draft Document carrying an `ENTITY_ANNOUNCEMENT` Post; until an SME accepts it from the Context Hub inbox the target doc is unchanged. On acceptance, only the Post migrates to the target.
- `note_metadata_observation` raises POST_ATTACHMENT_PROPOSAL when the agent cannot pin the gap to an existing doc. Emits two coordinated MCPs: a Post with `postType=AI_OBSERVATION` and `target=null`, plus a new `POST_ATTACHMENT_PROPOSAL` ActionRequest. The Post is invisible across existing UI surfaces by construction; the ActionRequest shows up in the Task Center for SME triage. On accept, the Post becomes a real Comment on the chosen entity.
- `DocumentProposalService.applyDraftDocumentChange` respects `proposalType: ANNOTATE`. ANNOTATE proposals now skip the info / properties copy and only migrate Posts; EDIT, STATE_CHANGE, and CONFLICT acceptance behavior is unchanged.
- `preview_sql_context` / `save_sql_context` use a clean preview-then-commit flow. `preview_sql_context` builds the MCP App render payload without writing anything; `save_sql_context` commits on Approve in a single write. Cancel is handled client-side, so no tombstone cleanup is needed. Draft-based save flow with MCP App preview.
- `find_sql_context` improvements: consumes admin-curated `overrideSql`; deprecates `generate_sql_sketch`. Looker view-text and dbt model-text fallbacks added. SQL-override prop supported on semantic-anchor docs for human edits / proposals.
- Semantic-anchor enrichment: dataset bridge documents now include structured properties, field terms/tags, doc links, and domains. Dialect detection and Bedrock retry resilience.
- Per-pattern question generation — semantic anchors generate distinct questions per pattern; metricKeys lookup property for metric anchors.
- CDE Steward agent — new 10-star governance agent for Critical Data Element compliance + certification.
- Agent authoring foundation: 10-star authoring foundation with curator Cloudsmith wiring and runtime env-flag gating for SYSTEM agents. `Context Drop` and `Context Curator` agents default to off.
- Ingestion-agent tooling: Read/Grep/Glob source-browsing tools for connector troubleshooting from inside the agent.
- LLM telemetry and billing: `LLMCallEvent` per-call telemetry (token-billing primitive), context-local cost accumulation, per-turn usage logging, surface and time-to-first-token captured. New `POST /openapi/v1/billing/usage` endpoint.
- Slack bot service-account mappings. Admins can map a Slack `bot_id` to a DataHub corp user URN under Settings → Platform → AI → Enable Ask DataHub in Slack. When a registered bot @-mentions DataHub, the question is attributed to the mapped service account. Useful for cop-rotation bots, ticketing bots, and other automations.
- Smart search: surface `externalUrl` and inject DataHub URLs into results.
- Confluence connector: HTML → Markdown body conversion via `markdownify`.
Platform
- `SecretService` caller guard. Non-system actors (including human users and PATs with `MANAGE_SECRETS`) can no longer decrypt secrets. Controlled by `SECRET_SERVICE_CALLER_GUARD_MODE` (`ENFORCE` / `AUDIT` / `DISABLED`). Components that fetch secrets at runtime use system credentials in standard deployments and are unaffected.
- Distributed rate limiting across REST, GraphQL, and OpenAPI with per-endpoint token-bucket controls and configurable jitter (`RATE_LIMITS_RETRY_AFTER_JITTER_PERCENT`).
- RFC 8693 token exchange for trusted external issuers (OAuth2 token-exchange grant type).
- Agent lifecycle stages. Disabled SYSTEM agents are now `ARCHIVED` via `Status.lifecycleStage` instead of disappearing from the API. A startup reconciler reads `agent-flags.yaml` and restores the previous non-ARCHIVED stage when the flag flips back on.
- Domain propagation: automatic propagation of domain assignment across lineage / containment relationships, with attribution.
- Bridge-backed semantic search for datasets, charts, dashboards, glossary terms, and data products. Deploy GMS and MAE consumer together so source-entity metadata changes create, update, and delete bridge documents consistently. After enabling for a new entity type, run the bridge-document backfill and generate embeddings.
- Search V2.5 cross-entity ranking — DisMax scoring, name-match signals, diversity promotion across query understanding, multi-match, and focused fields. kNN inner_hits chunk text surfaced for the Cohere reranker.
- Elasticsearch 8.18+ semantic search: DataHub Cloud now supports semantic search on Elasticsearch 8.18+ deployments alongside OpenSearch. GCP deployments use a managed Vertex AI embedding provider (`gemini-embedding-001`); AWS deployments continue to use AWS Bedrock with Cohere Embed v3.
- Per-entity OpenSearch / Elasticsearch mapping ceilings. New `ELASTICSEARCH_INDEX_ENTITYMAPPINGLIMITS_<ENTITY>_<LIMIT>` env vars configure per-entity mapping limits; the configured value is baked into the index settings at creation/reindex time and pushed to existing live indices on every system update run. Use `DEFAULT` as the entity name for a fallback.
- pgQueue alternative messaging transport: pluggable transport abstraction with pgQueue as an alternative to Kafka. Includes `metadata-ingestion` sink and `actions` pg_queue event source. Kafka remains the default; transport selection is configuration-driven.
- Event filtering framework with pre-deserialization MCL optimization: drops uninteresting MCL events before deserialization, reducing GMS/MAE consumer load.
- Retention policies re-applied on system-update: when retention configuration changes, the system-update job re-applies policies cluster-wide.
- Configurable retention policy refresh: `KubernetesController` honors KEDA-aware scaling signals.
- OTel GraphQL operation tracing: OpenTelemetry instrumentation across GraphQL resolvers.
- `RelationshipChange` platform event: emitted on relationship changes for downstream consumers.
- Domain attribution: domain assignment now records the source of the assignment (manual vs. propagated) for audit purposes.
- GraphQL operationContext threading — per-event operationContext now threads through the MCL single-event hook path.
- Assertion ownership — assertions support owners; assignment rule UI aligned with ownership semantics.
- Assertion failure configuration SDK — programmatic configuration of assertion failure behavior.
- Smart-assertion delta-space bounds in predictions; new `DATAHUB_EXECUTOR_ENABLE_DELTA_BOUNDS` gate.
- Severity-escalation broadcast in Slack for incident notifications.
- Docker — published DataHub Postgres extensions image; image tags overhaul (see Breaking Changes); bundled venv symlinks for ingestion source aliases.
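The rate-limiting item above combines a per-endpoint token bucket with a `Retry-After` floor and jitter. The sketch below illustrates that combination in general terms only; it is not DataHub's implementation. The class and function names are hypothetical, and the assumption that jitter adds up to `jitter_percent` on top of the wait (rather than spreading it in both directions) is mine.

```python
import random


class TokenBucket:
    """Hypothetical per-endpoint bucket: `capacity` tokens, refilled continuously."""

    def __init__(self, capacity: int, refill_per_second: float, now: float = 0.0) -> None:
        self.capacity = float(capacity)
        self.refill_per_second = refill_per_second
        self.tokens = float(capacity)
        self.updated_at = now

    def try_acquire(self, now: float) -> float:
        """Return 0.0 if a token was taken, else the seconds until one is available."""
        elapsed = max(0.0, now - self.updated_at)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_per_second)
        self.updated_at = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return 0.0
        return (1.0 - self.tokens) / self.refill_per_second


def retry_after_seconds(
    bucket_wait: float,
    min_retry_after: float,
    jitter_percent: float,
    rng: random.Random,
) -> float:
    """Apply the minimum floor, then add up to `jitter_percent` of extra wait.

    Assumption: jitter is one-sided (it only lengthens the wait).
    """
    base = max(bucket_wait, min_retry_after)
    return base * (1.0 + rng.random() * jitter_percent / 100.0)


if __name__ == "__main__":
    bucket = TokenBucket(capacity=2, refill_per_second=1.0)
    rng = random.Random()
    for _ in range(3):
        wait = bucket.try_acquire(now=0.0)
        if wait > 0.0:
            print("denied, retry after", retry_after_seconds(wait, 5.0, 10.0, rng))
```

With a floor of 5 seconds and 10% jitter, a denial whose bucket wait is 1 second yields a value between 5.0 and 5.5 seconds, while a 30-second bucket wait yields between 30 and 33 seconds, which matches the note that endpoint denials may return a longer wait than the floor.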
Ingestion
New ingestion sources:
- ThoughtSpot — new BI source connector.
- TimescaleDB — new connector supporting self-managed and Tiger Cloud TimescaleDB.
- Airbyte — new connector for Airbyte metadata.
- SAP HANA — production-ready connector with calc-view lineage, stored procedures, and query usage.
Connector improvements:
- Hex: major in-place upgrade with upstream lineage (table-level and column-level), Project → Component links, run history (`lastRefreshed`), and optional AI context documents extracted directly from Hex REST APIs. New `include_lineage`, `use_queried_tables_lineage`, `connection_platform_map`, and `include_context_documents` config options. Hex Components are now ingested as Chart entities (see Breaking Changes).
- Snowflake: Internal Marketplace support; dynamic-table lineage extracted from `DYNAMIC_TABLE_GRAPH_HISTORY`; private-link Snowsight base URL override; Sweden Central Azure region mapping; fix for silently-dropped views in batched `SHOW VIEWS`.
- Databricks Unity Catalog: extract primary key, foreign key, and partition key constraints; opt-in Metric View ingestion; v1.1 composable lineage with agent metadata; default profiling switched to SQLAlchemy (from Great Expectations); ownership and `datasetProperties` emitted as standard MCPWs with incremental config; partner/DataHub user-agent for Databricks telemetry.
- BigQuery: Workload Identity Federation (WIF) auth.
- Tableau — support for virtual connections; Initial SQL ingested as lineage and custom property.
- Glue: PATCH mode for dataset properties; column-level Lake Formation tags by default (`extract_lakeformation_column_tags`); optional propagation of database tags to tables and columns (`propagate_lakeformation_tags`); inherited tags marked with propagation attribution.
- Dremio: query lineage and view-parent lineage respect `schema_pattern` and `dataset_pattern` (and skip the `_accelerator_` reflection schema); platform mappings for `BIGQUERY`, `RESTCATALOG` (Polaris OSS, Nessie, AWS Glue Iceberg REST, S3 Tables, Confluent Tableflow, Microsoft OneLake), `SAPHANA`, `SNOWFLAKEOPENCATALOG`, and `UNITY`; `domain` recipe field now actually emits a `Domains` aspect; stateful incremental ingestion, incremental lineage / properties, profile-skip; synthetic `created = epoch 0` no longer emitted when Dremio doesn't report one; `remove_stale_metadata` and `fail_safe_threshold` exposed.
- Athena: S3 Tables (Iceberg) support.
- dbt: `skip_missing_upstreams_in_lineage` config; column-level lineage restored for two-tier warehouses (catalog-prefixed SQL + v2 schema fieldPaths); test assertion entities emit an `ownership` aspect when the dbt test node has explicit owner metadata.
- GCS: `workload_identity` auth type for GKE Workload Identity; `list_objects_v2` to fix `PaginationError` on Hive-partitioned paths.
- MSSQL: consolidated to a single `SqlParsingAggregator`.
- Teradata: exponential backoff on transient errors; nullable / autoincrement hydration corrected for `CHAR(N)`-padded values.
- Redshift: fixed late-binding view columns silently dropped due to wrong WHERE clause column name.
- PowerBI — paginated reports with embedded RDL datasources emit lineage; CTE alias no longer leaks as upstream in native SQL lineage.
- LookML — graceful handling of git clone failures and configurable clone timeout.
- Mode: `report_pattern` (AllowDenyPattern) config; chart fetch gated on `chart_count` rather than `explorations_count`.
- SAC: query Resources OData endpoint directly instead of via `$metadata`.
- Kafka Connect: fix duplicated schema segment in sink lineage URNs.
- Dataplex: clearer GCP permission error in project-number resolution.
- Confluence: page-body HTML converted to Markdown via `markdownify`.
Ingestion infrastructure:
- Per-connector CLI version matrix with resolution stamp; fall back to default CLI version when the configured version is unset.
- Patch-based writes for user-editable aspects — finer-grained partial updates from ingestion.
- Great Expectations profiler is now optional: default profiling switches to SQLAlchemy. `acryl-datahub` installs the GE extras only when explicitly requested.
- `sqlglot[c]` tokenizer restored on 30.8.0 for performance.
- Two-tier stored procedure ingestion: correct URN format and lineage.
- Skip empty columns for CLL — avoids spurious column-level lineage entries.
Executor:
- Opt-in ingestion-log garbage collector. Setting `DATAHUB_EXECUTOR_LOG_GC_ENABLED=true` makes both the remote executor and the coordinator's embedded worker scan `/tmp/datahub/logs/` on an hourly tick and delete per-execution subdirectories older than 14 days. A size cap (default 10 GB) and a 1-hour in-flight grace window apply. Default is `false` for this release; expect the default to flip in a follow-up release.
- Smart assertions record `INIT` during exclusion windows. Surfaces the window name on `nativeResults` under `Exclusion Window` so planned freezes, migrations, or backfills no longer produce spurious anomalies.
- Delta-space assertion bounds gated behind `DATAHUB_EXECUTOR_ENABLE_DELTA_BOUNDS` (default `false`). Protects workers pre-dating eval-time delta anchoring during mixed-fleet rollouts.
- V1 inference path no longer imports observe-models at module load. When `DATAHUB_USE_OBSERVE_MODELS=false`, slim / stripped / low-RAM executor builds no longer risk an import-time crash loop.
- Custom SQL assertions can optionally allow stored-procedure `CALL` statements. `DATAHUB_EXECUTOR_ALLOW_CALL_STATEMENTS=true` lets a custom SQL assertion's statement be a `CALL my_db.my_schema.my_proc()` in addition to the read-only query shapes always allowed. Off by default; enabling accepts the risk that the procedure may perform mutations.
- mTLS client authentication for outbound HTTPS from the executor.
- Coordinator monitor-request handling — duplicate monitor requests skipped; polling log chatter trimmed.
- Smart-assertion v1 inference preprocessing & training improvements; exclusion window display names include schedule descriptions.
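The ingestion-log garbage collector described above combines three rules: age-based retention, a size cap, and an in-flight grace window. The sketch below shows one plausible way those rules could interact. It is an illustration under assumptions, not the executor's actual code: `RunDir` and `plan_log_gc` are hypothetical names, the cap is assumed to evict oldest-first, directories inside the grace window are assumed to be exempt from both rules, and 1 MB is taken as 1,000,000 bytes. The function only plans deletions; removing the directories is left out.

```python
from dataclasses import dataclass

SECONDS_PER_DAY = 86_400
BYTES_PER_MB = 1_000_000  # assumption: decimal megabytes


@dataclass(frozen=True)
class RunDir:
    """Hypothetical record for one per-execution log subdirectory."""

    name: str
    mtime: float  # last-modified time, seconds since the epoch
    size_bytes: int


def plan_log_gc(
    runs: list[RunDir],
    now: float,
    retention_days: int = 14,
    max_dir_size_mb: int = 10_000,
    in_flight_grace_seconds: int = 3_600,
) -> list[str]:
    """Return the names of subdirectories to delete, oldest first."""
    # Directories touched within the grace window may still be written to.
    eligible = sorted(
        (r for r in runs if now - r.mtime >= in_flight_grace_seconds),
        key=lambda r: r.mtime,
    )
    # Rule 1: anything older than the retention period goes.
    doomed = [r for r in eligible if now - r.mtime > retention_days * SECONDS_PER_DAY]
    doomed_names = {r.name for r in doomed}

    # Rule 2: if the remainder still exceeds the cap, evict oldest-first.
    if max_dir_size_mb > 0:  # 0 disables the size cap
        total = sum(r.size_bytes for r in runs if r.name not in doomed_names)
        cap = max_dir_size_mb * BYTES_PER_MB
        for r in eligible:
            if total <= cap:
                break
            if r.name in doomed_names:
                continue
            doomed.append(r)
            doomed_names.add(r.name)
            total -= r.size_bytes
    return [r.name for r in doomed]
```

Keeping the planner free of filesystem calls makes the retention and cap rules easy to check against the documented defaults before wiring in any deletion step.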
Breaking Changes
- Search V2.5 is now enabled by default. Instances that have not set `SEARCH_VERSION_V2_5_ENABLED` will use V2.5 after upgrading. Existing instances still on legacy V2 may reindex search indices with the V2.5 analyzers during the upgrade. Migration: no action required to use the new default. To temporarily roll back to legacy V2, set `SEARCH_VERSION_V2_5_ENABLED=false` for GMS and the system-update job.
- Dataset semantic search is no longer enabled implicitly. Enabling semantic search with the default `ELASTICSEARCH_SEMANTIC_SEARCH_ENTITIES=document` no longer also bridges datasets; `dataset` must be listed explicitly (`document,dataset`). Action: for instances relying on dataset semantic search, set `ELASTICSEARCH_SEMANTIC_SEARCH_ENTITIES=document,dataset` on GMS and the system-update job.
- GMS rate limiting renamed. `rateLimits.defaultRetryAfterSeconds` / `RATE_LIMITS_DEFAULT_RETRY_AFTER` renamed to `minRetryAfterSeconds` / `RATE_LIMITS_MIN_RETRY_AFTER`. The value is now the minimum `Retry-After` floor; endpoint (token-bucket) denials may return a longer wait. Added `retryAfterJitterPercent` / `RATE_LIMITS_RETRY_AFTER_JITTER_PERCENT` (default `10`) to spread endpoint retry timing.
- Airflow plugin: Airflow 2.x dropped. `acryl-datahub-airflow-plugin` now requires Airflow 3.0+. The plugin always uses `apache-airflow-providers-openlineage` (`>=2.1.0`); drop `openlineage-airflow` from constraints. `[airflow2]` install extra removed; `[airflow3]` retained as a no-op. `taskinstance` URL format and `patch_snowflake_schema` config removed. Pin `acryl-datahub-airflow-plugin <= 1.6.0` for Airflow 2.7–2.10.
- Prefect plugin: Prefect 3.x required (`>=3.0.0,<4.0.0`). Entry point group changed from `prefect.block` to `prefect.collections`. Re-register the DataHub block before upgrading.
- PowerBI Report Server: `chart_pattern` removed. It emits a deprecation warning if set; chart-level filtering is not yet implemented for this connector.
- Hex: Components ingested as Charts instead of Dashboards. A Hex Component defines its own visualisation that importing projects cannot override, so it maps to a Chart (analogous to a Looker Look or PowerBI Tile). Component URNs change entity type, so saved views, glossary/tag/ownership assignments, and policies that targeted the old Dashboard-typed Component URNs must be manually reapplied to the new Chart URNs. Stateful-ingestion stale-removal handles most soft-deletes; component-heavy workspaces may need a one-time bulk cleanup.
- Docker image tags: `:head` removed; `:quickstart` and `:sha-<short>` added. The floating `:head` tag is no longer published. For Compose / local quickstart: use `DATAHUB_VERSION=quickstart`. For Kubernetes / production: pin an immutable tag (release `v*` or `sha-<7-char>`). Bare short-SHA tags (no prefix) are no longer published; switch to the `sha-`-prefixed form.
- Docker / local development: legacy compose files removed, namely `docker/docker-compose*.yml`, `docker/quickstart.sh`, `docker/dev*.sh`, `docker/nuke.sh`, and old quickstart bundles under `docker/quickstart/` (except `docker-compose.quickstart-profile.yml`). Use `datahub docker quickstart`, `./gradlew quickstartDebug`, or `scripts/dev/datahub-dev.sh` instead.
MANAGE_DOCUMENTSprivilege required for creation via RestLI / OpenAPI. Updating and deleting still accept the genericEDIT_ENTITY/DELETE_ENTITYprivileges (orMANAGE_DOCUMENTS). Existing owners and editors retain access. Custom automation creating documents via the API with onlyCREATE_ENTITY/EDIT_ENTITYmust be grantedMANAGE_DOCUMENTS. - Executor coordinator env flags.
DATAHUB_EXECUTOR_MONITORS_ENABLEDandDATAHUB_EXECUTOR_TASKS_ENABLEDare now hard opt-outs (skip subsystem wiring and heavy imports), not fetcher-only toggles. UseDATAHUB_EXECUTOR_INGESTION_PIPELINE_ENABLEDto disable the Kafka /datahub-actionspipeline. getSecretValuesGraphQL query now requires system-level authentication. TheMANAGE_SECRETSprivilege check remains in place, butSecretServicenow also enforces system-actor auth. Components that fetch secrets at runtime use system credentials in standard deployments and are unaffected. Customers who configured these services with a user-issued PAT must migrate to system credentials before upgrading.- Agent env-flag gating now drives
Status.lifecycleStage. A SYSTEM agent whoseAI_AGENT_<NAME>_ENABLEDenv-var resolves to false is now markedARCHIVEDrather than hidden by a read-time filter. Direct URN fetches resolve normally; the existing lifecycle-stage filter keeps ARCHIVED agents out of default search. A startup reconciler restores the previous stage when the flag flips back. Callers that depended on the "null entity" behavior should switch to checkingStatus.lifecycleStage = ARCHIVED. AIAgentInfo.enablementEnvVarandenablementDefaultremoved from PDL. These fields were never released. SYSTEM-agent env-flag config is now declared inagent.toml's[flag]block. Drop any reference to these fields.- Removed
REQUEST_MINIMAL_SLACK_PERMISSIONSfeature flag. Replaced byDATAHUB_SLACK_SERVER_SIDE_HISTORY_ENABLED, which marks the Slack:historyscopes as optional in the install screen viabot_optional. Admins can deselect them per install rather than the choice being baked in at deploy time. - Subscriptions without explicit notification settings now inherit defaults dynamically. GraphQL callers that omit
notificationConfigwhen creating a subscription throughsyncSubscription, or sendnotificationConfigwithoutnotificationSettings, will use the actor's current notification defaults at delivery time. Callers that intentionally want a no-sink subscription should sendnotificationConfig.notificationSettings.sinkTypes: []explicitly. - Smart-assertion inference routing split into
DATAHUB_USE_INFERENCE_V2. PreviouslyDATAHUB_USE_OBSERVE_MODELS=trueboth enabled observe-models and routed to V2 training. These are now decoupled. Action required: executors currently runningDATAHUB_USE_OBSERVE_MODELS=trueto get V2 must also setDATAHUB_USE_INFERENCE_V2=true. - Action Workflows v2 — Filter / DynamicSource model. Feature-flag-gated (
actionWorkflowsV2Enabled). The legacy v1singleFieldValueConditionfield-visibility shape is removed and on-disk values were rewritten toFilterat the model-migration boundary.FieldValidationis reduced to{ pattern, errorMessage }(regex only); the v2-private expression-based validation is removed.
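The Docker image tag change above can be checked mechanically in deployment manifests. The helper below is a minimal sketch for that purpose; `classify_image_tag` is a hypothetical name, and the patterns are assumptions: release tags are taken to start with `v` followed by a digit, and short SHAs are taken to be seven lowercase hex characters.

```python
import re

# Assumed shapes; the changelog only states `v*` and `sha-<7-char>`.
RELEASE_TAG = re.compile(r"^v\d[\w.\-]*$")
PREFIXED_SHA_TAG = re.compile(r"^sha-[0-9a-f]{7}$")
BARE_SHA_TAG = re.compile(r"^[0-9a-f]{7}$")


def classify_image_tag(tag: str) -> str:
    """Classify a tag as 'removed', 'floating', 'immutable', or 'unknown'."""
    if tag == "head":
        return "removed"  # the floating :head tag is no longer published
    if tag == "quickstart":
        return "floating"  # intended for Compose / local quickstart
    if RELEASE_TAG.match(tag) or PREFIXED_SHA_TAG.match(tag):
        return "immutable"  # suitable for Kubernetes / production pinning
    if BARE_SHA_TAG.match(tag):
        return "removed"  # bare short-SHA tags are no longer published
    return "unknown"


if __name__ == "__main__":
    for tag in ("head", "quickstart", "v2.0.0", "sha-1a2b3c4", "1a2b3c4"):
        print(tag, "->", classify_image_tag(tag))
```

A CI step could run a check like this over rendered manifests and fail on any tag classified as `removed`.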
Deprecations
- Removed `ENABLE_BEDROCK_OPTIMIZED_LATENCY`. AWS latency-optimized inference is only available for Claude 3.5 Haiku, not the newer model families (Haiku 4.5, Sonnet 4.x, Opus). The flag was a no-op for those models while inflating every Bedrock cost estimate by 25%. Remove the variable from any deployment config.
- Dynamic ownership reassignment for proposals is now opt-in. Proposals continue to work as expected; existing asset owners still receive and can act on proposals. Workflows that depend on ownership reassignment automatically updating who sees proposals must enable the option in Automations.
- Hex: legacy lineage recipe fields removed. `lineage_start_time`, `lineage_end_time`, and `datahub_page_size` emit a deprecation warning if set. Lineage now comes directly from the Hex REST API. Remove them from your recipe.
Security / Dependencies
- See updating-datahub.md for the full OSS dependency changelog. SaaS-specific notes:
- `SecretService` caller guard (see Platform).
- Removed `REQUEST_MINIMAL_SLACK_PERMISSIONS` in favor of `DATAHUB_SLACK_SERVER_SIDE_HISTORY_ENABLED` for per-install Slack scope opt-out.
- Sanitized API error responses to prevent CWE-200 information leaks.
- CVE coverage: `ujson >= 5.12.1` (CVE-2026-44660); `idna >= 3.15` (CVE-2026-45409); plus ongoing dep-bump coverage in line with the OSS changelog.
Bug Fixes
- Fixed dataset semantic search bridge backfill robustness during staged rollouts.
- Fixed executor read of prefixed `DATAHUB_EXECUTOR_EMBEDDED_WORKER_ENABLED` env var.
- Fixed `AssertionRunEvent`, `MonitorSuiteInfo`, `SubscriptionInfo`, and `AssertionAssignmentRuleInfo` `schemaVersion` bumps for forward compatibility.
- Fixed Context Hub to prevent duplicate publish context-doc proposals on multiple context-generation runs.
- Fixed Action Workflow rendering of human-readable name on selected dynamic-source URN fields.
- Fixed CSS selector escape in SchemaTable to prevent `SyntaxError` on special characters.
- Fixed lifecycle-stage migration on documents.
- Fixed semantic-index marker migration to patch existing embedded dataset bridge chunks without requiring a Python re-embed.
- Fixed display of all group relationships on the user profile page.
- Fixed Ask DataHub bold rendering, mode-selector glow, and follow-up suggestions.
- Fixed built-in agents seeing system-agent backend tools.
- Fixed lifecycle-stage issues across glossary terms and documents.
- Fixed missing `applications` field on glossary term and data product GraphQL queries.
- Fixed wrong namespace in `EntityDropdown` (i18n).
- Fixed executor `WorkEventProducer` fail-fast init when a Kafka channel is required.
- Fixed `@KafkaMessagingEnabled` annotation on `KafkaAdminServiceFactory` for pgQueue compatibility.
Known Issues
- TBD
Environment Variables
- `SECRET_SERVICE_CALLER_GUARD_MODE` (default `ENFORCE`): Controls how `SecretService` responds when a non-system actor attempts to decrypt a secret. Values: `ENFORCE` (throw `SecurityException`, recommended for production), `AUDIT` (allow but log a warning, for staged rollout), `DISABLED` (no enforcement, break-glass).
- `DATAHUB_EXECUTOR_LOG_GC_ENABLED` (default `false`), `DATAHUB_EXECUTOR_LOG_DIR` (default `/tmp/datahub/logs`), `DATAHUB_EXECUTOR_LOG_GC_INTERVAL_SECONDS` (default `3600`), `DATAHUB_EXECUTOR_LOG_GC_RETENTION_DAYS` (default `14`), `DATAHUB_EXECUTOR_LOG_GC_MAX_DIR_SIZE_MB` (default `10000`; `0` to disable the size cap), `DATAHUB_EXECUTOR_LOG_GC_IN_FLIGHT_GRACE_SECONDS` (default `3600`): Opt-in in-process ingestion-log garbage collector on remote executors and the coordinator's embedded worker.
- `DATAHUB_EXECUTOR_ALLOW_CALL_STATEMENTS` (default `false`): When `true`, custom SQL assertions may use a stored-procedure `CALL` statement in addition to read-only queries. Requires an executor restart.
- `DATAHUB_USE_INFERENCE_V2` (default `false`): Routes smart-assertion training to the V2 pipeline. Requires `DATAHUB_USE_OBSERVE_MODELS=true`.
- `DATAHUB_EXECUTOR_ENABLE_DELTA_BOUNDS` (default `false`): Enables differenced (`boundsValueSpace=DELTA`) prediction bounds in the V1 smart-assertion trainer.
- `ELASTICSEARCH_INDEX_ENTITYMAPPINGLIMITS_<ENTITY>_<LIMIT>`: Per-entity OpenSearch / Elasticsearch mapping ceilings. Use `DEFAULT` as the entity name for a fallback applying to all entity indices.
- `MCP_TELEMETRY_CAPTURE_PAYLOADS`: Enables payload capture for MCP tool invocations surfaced in the MCP Audit tab.
- `RATE_LIMITS_MIN_RETRY_AFTER` (replaces `RATE_LIMITS_DEFAULT_RETRY_AFTER`) and `RATE_LIMITS_RETRY_AFTER_JITTER_PERCENT` (default `10`): GMS rate-limit floor and jitter spread.
- `AUTH_GMS_SESSION_COOKIE_NAME` (default `SESSION`): Name of the Spring Security session cookie set by GMS. Set this if your deployment overrides `spring.session.servlet.cookie.name`.
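The three `SECRET_SERVICE_CALLER_GUARD_MODE` values differ only in what happens when a non-system actor asks for a decrypted secret. The sketch below models that decision in Python for illustration; it is not the service's code. `GuardMode`, `guard_mode_from_env`, and `check_decrypt_allowed` are hypothetical names, and `PermissionError` stands in for the `SecurityException` mentioned above.

```python
import logging
from enum import Enum

logger = logging.getLogger("secret_guard_sketch")


class GuardMode(Enum):
    ENFORCE = "ENFORCE"
    AUDIT = "AUDIT"
    DISABLED = "DISABLED"


def guard_mode_from_env(env: dict[str, str]) -> GuardMode:
    """Read the mode, falling back to the documented default of ENFORCE."""
    return GuardMode(env.get("SECRET_SERVICE_CALLER_GUARD_MODE", "ENFORCE"))


def check_decrypt_allowed(mode: GuardMode, is_system_actor: bool, actor: str) -> bool:
    """Return True if decryption may proceed; raise in ENFORCE mode otherwise."""
    if is_system_actor or mode is GuardMode.DISABLED:
        return True
    if mode is GuardMode.AUDIT:
        # Staged rollout: allow the call but leave a trace for review.
        logger.warning("non-system actor %s decrypted a secret", actor)
        return True
    # ENFORCE: stand-in for the SecurityException described in the notes.
    raise PermissionError(f"non-system actor {actor} may not decrypt secrets")
```

Running in `AUDIT` first and reviewing the warnings is one way to find callers that still rely on a user-issued PAT before switching to `ENFORCE`.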