MCP Builder

These classes and methods make it easier to construct MetadataChangeProposals.

MetadataChangeProposalWrapper

class datahub.emitter.mcp.MetadataChangeProposalWrapper(entityType = 'ENTITY_TYPE_UNSET', changeType = 'UPSERT', entityUrn = None, entityKeyAspect = None, auditHeader = None, aspectName = None, aspect = None, systemMetadata = None, headers = None)

Bases: object

  • Parameters:
    • entityType (str)
    • changeType (Union[str, ChangeTypeClass])
    • entityUrn (Optional[str])
    • entityKeyAspect (Optional[_Aspect])
    • auditHeader (Optional[KafkaAuditHeaderClass])
    • aspectName (Optional[str])
    • aspect (Optional[_Aspect])
    • systemMetadata (Optional[SystemMetadataClass])
    • headers (Optional[Dict[str, str]])

as_workunit(*, treat_errors_as_warnings=False, is_primary_source=True)

  • Parameters:
    • treat_errors_as_warnings (bool)
    • is_primary_source (bool)
  • Return type:MetadataWorkUnit

aspect : Optional[_Aspect] = None

aspectName : Optional[str] = None

auditHeader : Optional[KafkaAuditHeaderClass] = None

changeType : Union[str, ChangeTypeClass] = 'UPSERT'

classmethod construct_many(entityUrn, aspects)

entityKeyAspect : Optional[_Aspect] = None

entityType : str = 'ENTITY_TYPE_UNSET'

entityUrn : Optional[str] = None

classmethod from_obj(obj, tuples=False)

Attempts to deserialize the object into a MetadataChangeProposalWrapper (MCPW), falling back to a standard MetadataChangeProposal (MCP) if the generated classes for the entity key or aspect are missing.
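
The typed-then-untyped fallback described above can be sketched with the standard library alone. This is an illustration of the pattern, not DataHub's implementation: TypedProposal, RawProposal, StatusAspect, ASPECT_CLASSES, and parse_proposal are all hypothetical names, and the input dictionary layout is assumed.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, Union


@dataclass
class TypedProposal:
    """Hypothetical stand-in for the typed wrapper."""

    entity_urn: str
    aspect_name: str
    aspect: Any


@dataclass
class RawProposal:
    """Hypothetical stand-in for the untyped proposal."""

    entity_urn: str
    aspect_name: str
    payload: Dict[str, Any]


@dataclass
class StatusAspect:
    removed: bool


# Hypothetical registry of aspect names that have a generated class.
ASPECT_CLASSES: Dict[str, Callable[..., Any]] = {"status": StatusAspect}


def parse_proposal(obj: Dict[str, Any]) -> Union[TypedProposal, RawProposal]:
    aspect_cls = ASPECT_CLASSES.get(obj["aspectName"])
    if aspect_cls is None:
        # No generated class for this aspect: keep the payload untyped.
        return RawProposal(obj["entityUrn"], obj["aspectName"], obj["aspect"])
    # A class is available: build the typed aspect from the payload.
    return TypedProposal(obj["entityUrn"], obj["aspectName"], aspect_cls(**obj["aspect"]))
```

Callers of such a function need to handle both return types, which is presumably why a stricter variant such as from_obj_require_wrapper exists alongside from_obj.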

classmethod from_obj_require_wrapper(obj, tuples=False)

headers : Optional[Dict[str, str]] = None

make_mcp()

systemMetadata : Optional[SystemMetadataClass] = None

to_obj(tuples=False, simplified_structure=False)

  • Parameters:
    • tuples (bool)
    • simplified_structure (bool)
  • Return type:dict

classmethod try_from_mcl(mcl)

classmethod try_from_mcpc(mcpc)

Attempts to create a MetadataChangeProposalWrapper from a MetadataChangeProposalClass. Gracefully handles expected but unsupported cases, such as unknown aspect types or non-JSON content types.

validate()

  • Return type:bool

BigQueryDatasetKey

class datahub.emitter.mcp_builder.BigQueryDatasetKey(**data)

Bases: ProjectIdKey

  • Parameters:
    • data (Any)
    • platform (str)
    • instance (str | None)
    • env (str | None)
    • backcompat_env_as_instance (bool)
    • project_id (str)
    • dataset_id (str)

dataset_id : str

model_config : ClassVar[ConfigDict] = {}

Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

BucketKey

class datahub.emitter.mcp_builder.BucketKey(**data)

Bases: ContainerKey

  • Parameters:
    • data (Any)
    • platform (str)
    • instance (str | None)
    • env (str | None)
    • backcompat_env_as_instance (bool)
    • bucket_name (str)

bucket_name : str

model_config : ClassVar[ConfigDict] = {}

Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

CatalogKey

class datahub.emitter.mcp_builder.CatalogKey(**data)

Bases: ContainerKey

  • Parameters:
    • data (Any)
    • platform (str)
    • instance (str | None)
    • env (str | None)
    • backcompat_env_as_instance (bool)
    • catalog (str)

catalog : str

model_config : ClassVar[ConfigDict] = {}

Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

CatalogKeyWithMetastore

class datahub.emitter.mcp_builder.CatalogKeyWithMetastore(**data)

Bases: MetastoreKey

  • Parameters:
    • data (Any)
    • platform (str)
    • instance (str | None)
    • env (str | None)
    • backcompat_env_as_instance (bool)
    • metastore (str)
    • catalog (str)

catalog : str

model_config : ClassVar[ConfigDict] = {}

Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

ContainerKey

class datahub.emitter.mcp_builder.ContainerKey(**data)

Bases: DatahubKey

Base class for container guid keys. Most users should use one of the subclasses instead.

  • Parameters:
    • data (Any)
    • platform (str)
    • instance (str | None)
    • env (str | None)
    • backcompat_env_as_instance (bool)

as_urn()

  • Return type:str

as_urn_typed()

backcompat_env_as_instance : bool

env : Optional[str]

guid_dict()

  • Return type:Dict[str, str]

instance : Optional[str]

model_config : ClassVar[ConfigDict] = {}

Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

parent_key()

platform : str

property_dict()

  • Return type:Dict[str, str]
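
The guid_dict() and as_urn() methods suggest that a key's fields determine its identifier. A minimal sketch of that idea follows, using only the standard library. DemoDatabaseKey is a hypothetical stand-in rather than the DataHub class, and both the hashing scheme (MD5 over sorted, compact JSON) and the urn:li:container: prefix are assumptions made for illustration.

```python
import hashlib
import json
from dataclasses import asdict, dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class DemoDatabaseKey:
    """Hypothetical stand-in for a container key; not the DataHub class."""

    platform: str
    database: str
    instance: Optional[str] = None
    env: Optional[str] = None

    def guid_dict(self) -> Dict[str, str]:
        # Keep only the fields that are set, so unset optional fields
        # do not influence the identifier.
        return {k: v for k, v in asdict(self).items() if v is not None}

    def guid(self) -> str:
        # Assumption: hash a canonical (sorted-key, compact) JSON encoding.
        payload = json.dumps(self.guid_dict(), sort_keys=True, separators=(",", ":"))
        return hashlib.md5(payload.encode("utf-8")).hexdigest()

    def as_urn(self) -> str:
        # Assumption: container URNs wrap the guid in this prefix.
        return f"urn:li:container:{self.guid()}"


key = DemoDatabaseKey(platform="snowflake", database="analytics")
print(key.as_urn())
```

Because the identifier depends only on the field values, two ingestion runs that build the same key produce the same URN, so they address the same container.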

DataProductKey

class datahub.emitter.mcp_builder.DataProductKey(**data)

Bases: DatahubKey

  • Parameters:
    • data (Any)
    • platform (str)
    • name (str)
    • env (str | None)
    • instance (str | None)

as_urn()

  • Return type:str

env : Optional[str]

instance : Optional[str]

model_config : ClassVar[ConfigDict] = {}

Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

name : str

platform : str

property_dict()

  • Return type:Dict[str, str]

DatabaseKey

class datahub.emitter.mcp_builder.DatabaseKey(**data)

Bases: ContainerKey

  • Parameters:
    • data (Any)
    • platform (str)
    • instance (str | None)
    • env (str | None)
    • backcompat_env_as_instance (bool)
    • database (str)

database : str

model_config : ClassVar[ConfigDict] = {}

Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

DatahubKey

class datahub.emitter.mcp_builder.DatahubKey(**data)

Bases: BaseModel

  • Parameters:data (Any)

guid()

  • Return type:str

guid_dict()

  • Return type:Dict[str, str]

model_config : ClassVar[ConfigDict] = {}

Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

DomainKey

class datahub.emitter.mcp_builder.DomainKey(**data)

Bases: DatahubKey

  • Parameters:
    • data (Any)
    • name (str)
    • platform (str | None)
    • instance (str | None)

as_urn()

  • Return type:str

instance : Optional[str]

model_config : ClassVar[ConfigDict] = {}

Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

name : str

platform : Optional[str]

property_dict()

  • Return type:Dict[str, str]

ExperimentKey

class datahub.emitter.mcp_builder.ExperimentKey(**data)

Bases: ContainerKey

  • Parameters:
    • data (Any)
    • platform (str)
    • instance (str | None)
    • env (str | None)
    • backcompat_env_as_instance (bool)
    • id (str)

id : str

model_config : ClassVar[ConfigDict] = {}

Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

FolderKey

class datahub.emitter.mcp_builder.FolderKey(**data)

Bases: ContainerKey

  • Parameters:
    • data (Any)
    • platform (str)
    • instance (str | None)
    • env (str | None)
    • backcompat_env_as_instance (bool)
    • folder_abs_path (str)

folder_abs_path : str

model_config : ClassVar[ConfigDict] = {}

Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

MetastoreKey

class datahub.emitter.mcp_builder.MetastoreKey(**data)

Bases: ContainerKey

  • Parameters:
    • data (Any)
    • platform (str)
    • instance (str | None)
    • env (str | None)
    • backcompat_env_as_instance (bool)
    • metastore (str)

metastore : str

model_config : ClassVar[ConfigDict] = {}

Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

NamespaceKey

class datahub.emitter.mcp_builder.NamespaceKey(**data)

Bases: ContainerKey

For Iceberg namespaces (databases/schemas)

  • Parameters:
    • data (Any)
    • platform (str)
    • instance (str | None)
    • env (str | None)
    • backcompat_env_as_instance (bool)
    • namespace (str)

model_config : ClassVar[ConfigDict] = {}

Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

namespace : str

NotebookKey

class datahub.emitter.mcp_builder.NotebookKey(**data)

Bases: DatahubKey

  • Parameters:
    • data (Any)
    • notebook_id (int)
    • platform (str)
    • instance (str | None)

as_urn()

  • Return type:str

instance : Optional[str]

model_config : ClassVar[ConfigDict] = {}

Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

notebook_id : int

platform : str

PlatformKey

datahub.emitter.mcp_builder.PlatformKey()

alias of ContainerKey

ProjectIdKey

class datahub.emitter.mcp_builder.ProjectIdKey(**data)

Bases: ContainerKey

  • Parameters:
    • data (Any)
    • platform (str)
    • instance (str | None)
    • env (str | None)
    • backcompat_env_as_instance (bool)
    • project_id (str)

model_config : ClassVar[ConfigDict] = {}

Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

project_id : str

SchemaKey

class datahub.emitter.mcp_builder.SchemaKey(**data)

Bases: DatabaseKey

  • Parameters:
    • data (Any)
    • platform (str)
    • instance (str | None)
    • env (str | None)
    • backcompat_env_as_instance (bool)
    • database (str)
    • schema (str)

db_schema : str

model_config : ClassVar[ConfigDict] = {}

Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.
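
SchemaKey derives from DatabaseKey, which derives from ContainerKey, and ContainerKey exposes parent_key(). One way such a class hierarchy could yield a parent key is sketched below with plain dataclasses. The Hier* classes are hypothetical, and the rule used here (take the immediate base class and keep only its fields) is an assumption about the behavior, not a statement of how DataHub implements it.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass(frozen=True)
class HierContainerKey:
    """Hypothetical root of the key hierarchy."""

    platform: str

    def parent_key(self) -> Optional["HierContainerKey"]:
        parent_cls = type(self).__bases__[0]
        # Keys directly below the root (and the root itself) have no parent.
        if parent_cls is HierContainerKey or parent_cls is object:
            return None
        # Build the parent from the subset of fields it declares.
        names = {f.name for f in fields(parent_cls)}
        return parent_cls(**{k: v for k, v in vars(self).items() if k in names})


@dataclass(frozen=True)
class HierDatabaseKey(HierContainerKey):
    database: str


@dataclass(frozen=True)
class HierSchemaKey(HierDatabaseKey):
    schema: str


schema_key = HierSchemaKey(platform="snowflake", database="analytics", schema="public")
print(schema_key.parent_key())
```

Under this assumption, a schema key's parent is the database key that shares its platform and database values, and a database key has no parent.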

StructuredPropertyWriteMode

class datahub.emitter.mcp_builder.StructuredPropertyWriteMode(value)

Bases: StrEnum

How add_structured_properties_to_entity_wu writes the aspect.

UPSERT replaces the whole structuredProperties aspect on each run, so the recipe is the source of truth. PATCH adds each property individually, so edits made by users, the UI, or other pipelines survive. The cost of PATCH is that removing a property from the recipe no longer propagates; clean those up via the UI or API.

PATCH = 'patch'

UPSERT = 'upsert'
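
The difference between the two modes can be illustrated with plain dictionaries. This is a sketch of the semantics described above, not DataHub code: WriteMode and apply_write are hypothetical names, and the property URNs are made-up examples.

```python
from enum import Enum
from typing import Dict


class WriteMode(str, Enum):
    """Hypothetical mirror of the two documented modes."""

    PATCH = "patch"
    UPSERT = "upsert"


def apply_write(
    stored: Dict[str, str], recipe: Dict[str, str], mode: WriteMode
) -> Dict[str, str]:
    if mode is WriteMode.UPSERT:
        # Whole-aspect replacement: anything absent from the recipe is dropped.
        return dict(recipe)
    # Per-property patch: recipe values are added or overwritten, others are kept.
    return {**stored, **recipe}


stored = {
    "urn:li:structuredProperty:tier": "gold",
    "urn:li:structuredProperty:reviewed": "yes",  # e.g. set by a user in the UI
}
recipe = {"urn:li:structuredProperty:tier": "silver"}

print(apply_write(stored, recipe, WriteMode.UPSERT))  # only "tier" remains
print(apply_write(stored, recipe, WriteMode.PATCH))   # "reviewed" survives
```

With PATCH, dropping "tier" from the recipe on a later run would leave the stored "tier" value in place, which is the trade-off the description warns about.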

UnitySchemaKey

class datahub.emitter.mcp_builder.UnitySchemaKey(**data)

Bases: CatalogKey

  • Parameters:
    • data (Any)
    • platform (str)
    • instance (str | None)
    • env (str | None)
    • backcompat_env_as_instance (bool)
    • catalog (str)
    • unity_schema (str)

model_config : ClassVar[ConfigDict] = {}

Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

unity_schema : str

UnitySchemaKeyWithMetastore

class datahub.emitter.mcp_builder.UnitySchemaKeyWithMetastore(**data)

Bases: CatalogKeyWithMetastore

  • Parameters:
    • data (Any)
    • platform (str)
    • instance (str | None)
    • env (str | None)
    • backcompat_env_as_instance (bool)
    • metastore (str)
    • catalog (str)
    • unity_schema (str)

model_config : ClassVar[ConfigDict] = {}

Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

unity_schema : str

add_dataset_to_container

datahub.emitter.mcp_builder.add_dataset_to_container(container_key, dataset_urn)
  • Parameters:
    • container_key (TypeVar(KeyType, bound= ContainerKey))
    • dataset_urn (str)
  • Return type:Iterable[MetadataWorkUnit]

add_domain_to_entity_wu

datahub.emitter.mcp_builder.add_domain_to_entity_wu(entity_urn, domain_urn)
  • Parameters:
    • entity_urn (str)
    • domain_urn (str)
  • Return type:Iterable[MetadataWorkUnit]

add_entity_to_container

datahub.emitter.mcp_builder.add_entity_to_container(container_key, entity_type, entity_urn)
  • Parameters:
    • container_key (TypeVar(KeyType, bound= ContainerKey))
    • entity_type (str)
    • entity_urn (str)
  • Return type:Iterable[MetadataWorkUnit]

add_owner_to_entity_wu

datahub.emitter.mcp_builder.add_owner_to_entity_wu(entity_type, entity_urn, owner_urn, ownership_type = 'DATAOWNER')
  • Parameters:
    • entity_type (str)
    • entity_urn (str)
    • owner_urn (str)
    • ownership_type (str)
  • Return type:Iterable[MetadataWorkUnit]

add_structured_properties_to_entity_wu

datahub.emitter.mcp_builder.add_structured_properties_to_entity_wu(entity_urn, structured_properties, write_mode = StructuredPropertyWriteMode.UPSERT)

add_tags_to_entity_wu

datahub.emitter.mcp_builder.add_tags_to_entity_wu(entity_type, entity_urn, tags)
  • Parameters:
    • entity_type (str)
    • entity_urn (str)
    • tags (List[str])
  • Return type:Iterable[MetadataWorkUnit]

create_embed_mcp

datahub.emitter.mcp_builder.create_embed_mcp(urn, embed_url)

entity_supports_aspect

datahub.emitter.mcp_builder.entity_supports_aspect(entity_type, aspect_type)
  • Parameters:
    • entity_type (str)
    • aspect_type (Type[TypeVar(Aspect, bound= _Aspect)])
  • Return type:bool

gen_containers

datahub.emitter.mcp_builder.gen_containers(container_key, name, sub_types, parent_container_key = None, extra_properties = None, structured_properties = None, domain_urn = None, description = None, owner_urn = None, ownership_type = None, external_url = None, tags = None, qualified_name = None, created = None, last_modified = None)
  • Parameters:
    • container_key (TypeVar(KeyType, bound= ContainerKey))
    • name (str)
    • sub_types (List[str])
    • parent_container_key (Optional[ContainerKey])
    • extra_properties (Optional[Dict[str, str]])
    • structured_properties (Optional[Dict[StructuredPropertyUrn, str]])
    • domain_urn (Optional[str])
    • description (Optional[str])
    • owner_urn (Optional[str])
    • ownership_type (Optional[str])
    • external_url (Optional[str])
    • tags (Optional[List[str]])
    • qualified_name (Optional[str])
    • created (Optional[int])
    • last_modified (Optional[int])
  • Return type:Iterable[MetadataWorkUnit]

gen_data_product

datahub.emitter.mcp_builder.gen_data_product(data_product_key, name, description = None, external_url = None, custom_properties = None, domain_urn = None, owner_urns = None, ownership_type = 'DATAOWNER', tags = None, structured_properties = None, assets = None)

Generate metadata workunits for a Data Product entity.

  • Parameters:
    • data_product_key (DataProductKey) – Key containing platform, name, env, and instance for generating a deterministic URN
    • name (str) – Display name of the Data Product
    • description (Optional[str]) – Documentation describing the Data Product
    • external_url (Optional[str]) – URL to external documentation or resources
    • custom_properties (Optional[Dict[str, str]]) – Custom key-value metadata properties
    • domain_urn (Optional[str]) – URN of the domain this Data Product belongs to
    • owner_urns (Optional[List[str]]) – List of owner URNs (users or groups)
    • ownership_type (str) – Ownership type for all owners. Can be a string constant (e.g., “TECHNICAL_OWNER”) or a custom ownership type URN (e.g., “urn:li:ownershipType:producer”).
    • tags (Optional[List[str]]) – List of tag names to associate with the Data Product
    • structured_properties (Optional[Dict[StructuredPropertyUrn, str]]) – Structured property URN to value mappings
    • assets (Optional[List[str]]) – List of asset URNs (datasets, dashboards, charts, etc.) that are part of this Data Product. Assets are converted to DataProductAssociation relationships.
  • Yields:MetadataWorkUnit – Workunits for DataProductProperties, Status, Domain, Ownership, Tags, and StructuredProperties aspects
  • Return type:Iterable[MetadataWorkUnit]
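
Because the function yields workunits rather than returning a list, callers consume it lazily. The generator below sketches that shape with the standard library. It is not the DataHub function: sketch_data_product_aspects is a hypothetical name, each workunit is reduced to an (aspect name, payload) tuple, and the rule that optional aspects are emitted only when their argument is supplied is an assumption.

```python
from typing import Iterator, List, Optional, Tuple

# (aspect name, payload): a hypothetical stand-in for a MetadataWorkUnit.
Aspect = Tuple[str, dict]


def sketch_data_product_aspects(
    name: str,
    description: Optional[str] = None,
    domain_urn: Optional[str] = None,
    owner_urns: Optional[List[str]] = None,
    ownership_type: str = "DATAOWNER",
    tags: Optional[List[str]] = None,
) -> Iterator[Aspect]:
    # Always emitted in this sketch: core properties and a not-removed status.
    yield "dataProductProperties", {"name": name, "description": description}
    yield "status", {"removed": False}
    # Assumption: optional aspects are emitted only when the argument is supplied.
    if domain_urn:
        yield "domains", {"domains": [domain_urn]}
    if owner_urns:
        # One ownership type is applied to every owner, as the parameter docs state.
        owners = [{"owner": urn, "type": ownership_type} for urn in owner_urns]
        yield "ownership", {"owners": owners}
    if tags:
        yield "globalTags", {"tags": [f"urn:li:tag:{tag}" for tag in tags]}


for aspect_name, payload in sketch_data_product_aspects(
    name="Sales", domain_urn="urn:li:domain:sales", tags=["pii"]
):
    print(aspect_name, payload)
```

In an ingestion source, the real workunits would typically be re-yielded with `yield from` so that nothing is materialized until the pipeline asks for it.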

mcps_from_mce

datahub.emitter.mcp_builder.mcps_from_mce(mce)