v0.3.7
Release Availability Date
13-Nov-2024
Recommended CLI/SDK
v0.14.1.12
with release notes at https://github.com/datahub/datahub/releases/tag/v0.14.1.12
If you are using an older CLI/SDK version, then please upgrade it. This applies for all CLI/SDK usages, if you are using it through your terminal, GitHub Actions, Airflow, in Python SDK somewhere, Java SDK, etc. This is a strong recommendation to upgrade, as we keep on pushing fixes in the CLI, and it helps us support you better.
Known Issues
v0.3.7.3
- Search page fails to render when filters are applied with a query which returns zero results.
Release Changelog
v0.3.7.8
- [Postgres] Fix regression from MySQL fix in v0.3.7.7
v0.3.7.7
- [UI] Fix bug showing upstream lineage dbt source leaves
- [UI] Show column-level lineage through transformational home node
- [UI] Browse nodes titles expand to full width of panel
- [UI] Data product preview cards display correctly
- [UI] Fix elasticsearch usage sort field names
- [UI] Add structured property display settings feature
- [Executor] Fix false errors on cli ingestions
- [Search] Schema field boost reduced
- [Search] Search usage ranking null_fill fix
- [Search] Single term with underscores by default no longer considered quoted
- [Metadata Tests] Metadata Test shutdown actions flush
- [Metadata Tests] Add deduplicate logic for MCP batches
- [Metadata Tests] Prevent mutation of systemMetadata in patch batches
- [MAE Consumer] Fix graph edge on container delete exception
- [Notifications] Filter out system ingestion source notifications
- [MySQL] Fix index gap lock deadlock
- [API] DataJobInputOutput finegrained lineage fix
v0.3.7.6
- [UI] fix(automations): white screen automations with dbt sync
v0.3.7.5
- [GMS] Fix upstream lineage patching when path contained encoded slash
- [UI] Fix merging siblings schema with v1 and v2 fields
- [UI] Fix display nullable in schema field drawer
- [Ingestion] Reduce Data Product write volume from unset side-effect
v0.3.7.4
- #11935 - Added environment variable for enabling stricter URN validation rules
STRICT_URN_VALIDATION_ENABLED
[1]. - [Automations] Filter out self-nodes in glossary term propagation
- [Remote Executor] Allow dashes in executor ids.
- [Search] Fix Nested Filter Counts in Primary Search
- [Search] Fix white screen of death on empty search result
- [Columns Tab] Support searching nested struct columns correctly in V2 UI.
- [Logo] Fix fit of custom logo for V2 UI nav bar.
- [Structured Properties] Better handling for special characters in structured properties
- [Lineage] Improvements to handling lineage cycles
- [Metadata Tests] Improve Reliability of Metadata Tests Action Application
- [Slack Integration] Minor improvement in authentication redirect to integrate with Slack
- [Columns Tab] Property display nullable status in column sidebar (bug)
- [Columns Tab] Fixing merging of sibling schemas between V2 and V1 field paths.
- [Documentation] Support group authors for institutional memory aspect
v0.3.7
All changes in https://github.com/datahub-project/datahub/releases/tag/v0.14.1
- Note Breaking Changes: https://datahubproject.io/docs/how/updating-datahub/#0141
Breaking Changes
- Authentication & RestAPI Authorization enabled by default (since v0.3.6)
- Helm Chart Requirement: 1.4.137+
- Recommend setting timezone for
datahub-gc
anddatahub-usage-reporting
acryl-datahub:
global:
datahub:
timezone: 'America/Los_Angeles'
- Recommend setting timezone for
- #11486 - Criterion's
value
parameter has been previously deprecated. Use ofvalue
instead ofvalues
is no longer supported and will be completely removed on the next major version. - #10472 -
SANDBOX
added as a FabricType. No rollbacks allowed once metadata with this fabric type is added without manual cleanups in databases. - #11619 - schema field/column paths can no longer be empty strings
- #11619 - schema field/column paths can no longer be duplicated within the schema
- #11570 - The
DatahubClientConfig
's server field no longer defaults tohttp://localhost:8080
. Be sure to explicitly set this. - #11570 - If a
datahub_api
is explicitly passed to a stateful ingestion config provider, it will be used. We previously ignored it if the pipeline context also had a graph object. - #11518 - DataHub Garbage Collection: Various entities that are soft-deleted (after 10d) or are timeseries entities (dataprocess, execution requests) will be removed automatically using logic in the
datahub-gc
ingestion source.
Bug Fixes
- [UI] Fix a bug in displaying the filter value counts when selecting filters on the primary search experience
- [UI] Fix unnecessary horizontal scrolling wide markdown documentation.
- [UI] Fix bug in siblings external URLs. Now showing both the dbt and Snowflake URL as separate, correct URLs.
- [UI] Fix bug on listing data product assets with View applied
- [UI] Fix siblings bug in Schema Field queries tab
- [UI] Handle edge cycles in lineage graph more correctly
- [UI] Hide incorrect "Lineage" sidebar section on sibling pages (incorrect merge)
- [UI] Miscellaneous fixes to Automations forms UI - creating and editing automations.
- [UI] Fix scrolling to the end of the list of tabs on Asset Profiles
- [UI] Fix Compact View preview on Hover Card (looked squished!)
- [UI] Update asset counts on Domain profile pages after adding and removing assets right away
- [UI] Improve support for Compliance Forms and Structured Properties on sibling asset profile pages
- [Automations] Column Description Propagation: Fix Column Description Propagation issue where column description would not propagate if self-lineage was stored in graph index
- [Automations] Snowflake Tag Sync: Fix bug in Snowflake Tag Sync that failed to sync to columns with special characters
Product
- [BETA] Introducing the BigQuery Metadata Sync Automation to sync tags, glossary terms, and descriptions from DataHub to BigQuery. Check out the feature guide for more information. To enable this BETA feature, reach out to your Acryl representative.
- [BETA] Introducing the AI Classification Automation to automatically classify your tables & columns using your organization's custom glossary terms. Check out the feature guide for more information. To enable this BETA feature, reach out to your Acryl representative.
- [BETA] A new way to visualize Column-Level Lineage, focused on a single column. Accessible by clicking on a column name in the column details sidebar or by clicking on the "Explore complete column lineage" button on a column in the regular lineage visualization. This will allow you to view only the upstreams and downstreams of the specific column being viewed. Please reach out to your Acryl representative to enable this feature.
- [BETA] Support running Automations via a Remote Executor using an Executor ID. This is currently in Beta, please reach out to your Acryl representative for more information.
- [BETA] Support plugging in custom Mixpanel or Google Analytics Measurement ID (GA4) to DataHub. Reach out to your Acryl representative for more information.
- Introducing Structured Properties UI. Create and manage custom properties for all asset types via the DataHub UI under Govern > Structured Properties. Feature guide will be coming in v0.3.8 - reach out to your Acryl representative for more information. Requires the
Manage Structured Properties
privilege to edit,View Structured Properties
privilege to view. - Introducing Compliance Forms UI. Create and manage compliance forms to run large-scale metadata collection initiatives inside your organization. Supported for all asset types via the DataHub UI under Govern > Compliance Forms. Feature guide will be coming in v0.3.8 - reach out to your Acryl representative for more information. Requires the
Manage Compliance Forms
privilege to edit,View Compliance Forms
privilege to view. Compliance Forms also support analytics, which are updated once per day by default. - Support adding and removing structured properties from Table & Column Properties Tab
- Support filtering by Structured Properties in the search UI (main search only, not on lists yet)
- Acryl 2.0 is enabled by default for all users who have no explicitly set their display preference via Settings > Appearance.
- Support searching the visible lineage graph by asset name
- Support showing 'all' assets in a downstream or upstream lineage level in one click
- Support searching the assets hidden by a collapsed, "show more" node
- On lineage graph, draw an arrow from a column to the "show more" node if that column has lineage to a hidden node
- On lineage graph, add control to show lineage edges to entities that are deleted / do not exist
- Support deleting Data Product from the Data Product page
- Support viewing & editing documentation in full-screen mode
- Support copying queries for View Definitions (sidebar + tabs)
- Support V2 UI with Chrome Extension, fix miscellaneous bugs related to documentations, glossary, and lineage interactions.
- Minor UX improvements (alignments, etc) to Quality, Assertions tabs.
- Reorder the asset sidebar sections to prioritize documentation & lineage, the most used features. Moved down status, and share related tabs.
- Add "Total Views" and "Recent Views" statistics to Dashboard & Chart asset sidebar header.
- Ingestion UI: Always display the number of assets ingested on "Failure" & "Succeeded With Warnings"
- Permissions: Hide Settings > Access Tokens page if user doesn't have the
Generate Access Tokens
privilege. - Add a Properties tab to Asset sidebar
- Hide the 'notes' icon from the Columns table on Dataset Profiles, only show in the Column sidebar
- Add Properties Count, Column Count, Incident Count to Asset Profile tab names
- Allow resizing of the browse sizebar
- Display custom Assertion Error messages via the UI
- Add sorting to Columns table
- Add description to "hover preview" of assets
- Rename 'Inbox' navigation item to 'Tasks' to align with rebranding as 'Task Center'
- Support viewing correctly merged schema change history for sibling pages
- Minor UX improvements on lineage graph
- Minox UX improvements on Glossary, Search Cards, Home page, Subscriptions tab, and more.
- Improved usage-based search ranking. Please reach out with any questions or concerns
- Improved UX for setting up and managing SSO
Ingestion changes
- In addition to the improvements listed here: https://github.com/acryldata/datahub/releases/tag/v0.14.1.12
- PowerBI: Support for PowerBI Apps and cross-workspace lineage
- Fivetran: Major improvements to configurability and improved reliability with large Fivetran setups
- Snowflake & BigQuery: Improved handling of temporary tables and swap statements when generating lineage
- [Beta] Preset integration
Platform changes
- Added datahub-usage-reporting job to calculate usage metrics for search ranking
- Metadata Test performance improvements: async ingestion & tag patch support
- Authentication & RestAPI Authorization enabled by default
- Added datahub-gc and datahub-usage-reporting SYSTEM ingestion sources
- Added sweeper to executor to cancel duplicate and stale ingestion jobs
- Added soft delete status to edges in graph store
- Added service side options for newer clients with older service
- ALTERNATE_MCP_VALIDATION=true
- MCP_VALIDATION_IGNORE_UNKNOWN=true
- OpenAPIv3
- Added generic entities scroll endpoint
- Added
async
andcreateIfNotExists
on aspect endpoints
- System Operations privilege extended to all system operations
- [BETA] Introduce Entity Change Events Poll API behind permission "Get Platform Events". This enables programmatic access to entity change events in DataHub. Reach out to your Acryl representative for more information.
- (system / internal) Exclude form-prompt tests in live Metadata Tests evaluation
- (system / internal) Exclude form-prompt tests in stored Metadata Test results
- Elasticsearch reindex time limit of 8h removed
- Data Product Properties Unset side effect introduced
- Previously, Data Products could be set as linked to multiple Datasets if modified directly via the REST API rather than linked through the UI or GraphQL. This side effect aligns the REST API behavior with the GraphQL behavior by introducting a side effect that enforces the 1-to-1 constraint between Data Products and Datasets
- NOTE: There is a pathological pattern of writes for Data Products that can introduce issues with write processing that can occur with this side effect. If you are constantly changing all of the Datasets associated with a Data Product back and forth between multiple Data Products it will result in a high volume of writes due to the need to unset previous associations.
Is this page helpful?