v5.5.0

released this 15 Jun 18:49

Release Notes for Data Hub 5.5.0

Release Summary

Data Hub 5.5.0 includes the following new features and changes:
Important: Upgrading to this release triggers a reindexing of the STAGING and FINAL databases. Learn more about how reindexing works and its impact on performance.
Notices

  • QuickStart is now deprecated. Hub Central supports all the same functionality and should be used instead. As of 5.5, match and merge steps can only be configured and run in Hub Central.
  • Custom hooks are now deprecated, and interceptors should be used instead.

General enhancements

  • New option to run a step interceptor before the primary function is invoked (to do this, specify "when" as "beforeMain")
  • Support for testing Data Hub steps with marklogic-unit-test and JUnit 5 (see Testing Data Hub Applications)
  • Ingest and run a Data Hub flow of steps in a single call via a new REST extension or MLCP. When you specify more than one step, the output of one step is the input to the next step. This is a more performant way to ingest and run multiple steps since it involves only one call to MarkLogic (see Run Multiple Steps on Ingest)
  • Ability to facet on structured types in Hub Central's Explore feature
  • When Data Hub writes documents via a step, documents are now written to the user's default collections
  • You can now set a 'quality' value on documents created by a step to control the relevance score of those documents in text searches
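
As an illustration of the "beforeMain" interceptor option above, the following minimal Python sketch builds and prints a JSON fragment of the kind a step might carry. Only the "when": "beforeMain" setting comes from these notes; the "interceptors", "path", and "vars" key names, the module path, and the variable values are assumptions made for the example and should be checked against the interceptor documentation.

```python
import json

# Hypothetical interceptor entry for a step's "interceptors" array.
# Only "when": "beforeMain" is taken from the release notes; the
# other keys and values are illustrative assumptions.
interceptor = {
    "path": "/custom-modules/step-interceptors/addHeaders.sjs",  # hypothetical module path
    "when": "beforeMain",  # run before the step's primary function is invoked
    "vars": {"exampleVariable": "exampleValue"},  # hypothetical values passed to the module
}

# Assumed placement: a fragment to merge into a step's configuration.
step_fragment = {"interceptors": [interceptor]}

print(json.dumps(step_fragment, indent=2))
```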

Mapping enhancements (see Mapping Enhancements)

  • Support for mapping multiple entities from a single source document within the same step
  • A new Attach Source Document field in mapping settings lets you specify whether the source document should be copied into the mapped entity instance.
  • A new URI field in every mapping configuration lets you define a URI template as a mapping expression. A new mapping function called hubURI generates a UUID and prefixes the name of the specified entity type. The function signature is hubURI(entityType).
  • You can define custom parameters that are referenced from a mapping expression
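
To make the behavior described for hubURI(entityType) more concrete, here is a rough Python stand-in that prefixes an entity type name to a freshly generated UUID. It is not the Data Hub implementation: the function name `hub_uri_sketch` is hypothetical, and the "/<type>/<uuid>.json" layout (separators and extension) is an assumption for illustration only.

```python
import uuid


def hub_uri_sketch(entity_type: str) -> str:
    """Hypothetical stand-in for hubURI(entityType).

    Generates a UUID and prefixes it with the entity type name.
    The exact URI layout is an assumption, not taken from the notes.
    """
    if not entity_type:
        raise ValueError("entity_type must be a non-empty string")
    return f"/{entity_type}/{uuid.uuid4()}.json"


uri = hub_uri_sketch("Customer")
print(uri)  # a different UUID is generated on every call
```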


Mastering enhancements

  • Configure match and merge steps in Hub Central for structured properties
  • Test your configuration for matching
    • View total match scores and a breakdown of match contributions for each property
    • View documents side-by-side to compare similarities and differences

Monitoring enhancements

  • Monitor the steps and flows that have been run in the Data Hub via Hub Central, with faceting, filtering, and sorting
  • New jobs REST extension with additional parameters
  • Provenance capture is now turned off by default for newly created steps

Step options added in 5.5

  • Among the new step options in 5.5, writeStepOutput defaults to true. If set to false, the content objects output by a step are not persisted. Typically, this is only useful when running multiple steps on ingest.
  • The new step properties added are listed below:

A step has a "type" that is defined by the step definition with which it is associated. The following properties apply to a step regardless of its type (the "Yes *" for Required means that it is required for every step type except for ingestion steps):

| Property | Value | Required | Description |
|---|---|---|---|
| enableBatchOutput | string | No | New in 5.5. If "never", a Batch document is never created when the step is run. If "onFailure", a Batch document is created only if an error occurs for a batch when the step is run. Otherwise, a Batch document is created for the step, unless "disableJobOutput" is set to "true" as a flow option or a runtime option. |
| targetCollectionsAdditivity | boolean | No | New in 5.5. Defaults to false. If set to true, then for any content object returned by a step that was also an input content object, its original collections are retained. |
| writeStepOutput | boolean | No | New in 5.5. Defaults to true. If set to false, the content objects output by a step are not persisted. Typically only useful when running multiple steps on ingest. |
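
The enableBatchOutput rules above can be read as a small decision function. The Python sketch below is only a restatement of that description, not Data Hub code: the function name and parameters are hypothetical, "disableJobOutput" is modeled as a boolean, and any value other than "never" or "onFailure" is treated as the default case.

```python
from typing import Optional


def creates_batch_document(
    enable_batch_output: Optional[str],
    batch_failed: bool,
    disable_job_output: bool = False,
) -> bool:
    """Return True if a Batch document would be created for a batch.

    Hypothetical helper that mirrors the enableBatchOutput description.
    """
    if enable_batch_output == "never":
        return False  # never create a Batch document
    if enable_batch_output == "onFailure":
        return batch_failed  # only when an error occurs for the batch
    # Any other value: created unless job output is disabled.
    return not disable_job_output


# Example step options using the three properties added in 5.5.
step_options = {
    "enableBatchOutput": "onFailure",
    "targetCollectionsAdditivity": True,
    "writeStepOutput": False,
}

print(creates_batch_document(step_options["enableBatchOutput"], batch_failed=True))
```

The description only mentions "disableJobOutput" for the default case, so the sketch applies it there and nowhere else.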

| Package | Key | Method |
|---|---|---|
| marklogic-data-hub-5.5.0-client.jar | adda0f82ef195b05747dc2b84d2548fabb40725d | SHA1SUM |
| marklogic-data-hub-central-5.5.0.war | 5de0d83916adb8678f23fdb1f4e8d0716a0c84ad | SHA1SUM |