Collibra - Data Catalog Software

Introduction

Data catalog software acts like a dictionary for all data assets in a company. In Collibra, customers store metadata about all databases (including tables and columns), spreadsheets, reports, processes, etc. The goal is to make data management, overall understanding, and ownership of data assets easier. For example, if an employee wants to see where the data in a particular report originates, or whom to ask for changes to a report, they can check in Collibra.

The integration with AtScale should work as follows: on sync, all data objects from an AtScale catalog/project should be transferred to Collibra.

Collibra Login: https://<your_collibra_hostname>.collibra.com/

Documentation: About Collibra

 

Requirements

  1. Works with both Installer and Container versions

  2. AtScale must be publicly accessible to communicate with Collibra.

  3. An AtScale product license with the Data Catalog feature is required.

  4. At least one published project

  5. To run the integration, download the zip file: atscale-to-collibra-integration-1.0.1.zip

How to run

1. Extract the content of the provided zip file

The zip file provided contains all the files needed to run, organized into three folders:

  • bin - contains executable files

  • config - contains the configuration

  • lib - contains all libraries
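As a quick sanity check after extraction, the sketch below confirms that the three folders listed above are present. The function name verify_layout is hypothetical, and the example assumes the archive extracts directly into the directory you pass in; the article does not state whether the zip contains its own top-level folder.

```shell
# Hypothetical helper: check that an extracted directory contains
# the bin, config, and lib folders described above.
verify_layout() {
  local root="$1" dir missing=0
  for dir in bin config lib; do
    if [ ! -d "$root/$dir" ]; then
      echo "missing folder: $root/$dir" >&2
      missing=1
    fi
  done
  # Returns 0 when all three folders exist, 1 otherwise.
  return "$missing"
}

# Example usage (the target directory name is an assumption):
#   unzip atscale-to-collibra-integration-1.0.1.zip -d atscale-to-collibra-integration
#   verify_layout atscale-to-collibra-integration && echo "layout OK"
```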


2. Configure

The provided application.properties file contains the default values for all configurations except usernames, passwords, and the hosts and ports of the services used (AtScale and Collibra).

Each configuration is composed of two parts: a key (defined in the application) and a value (provided by the user), separated by =.

All the configuration values can be modified. If the key is missing, the default value will be used.
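To illustrate the key=value format and the fall-back-to-default behavior described above, here is a minimal lookup sketch. It is not how the application itself reads its configuration; the function name prop_or_default is hypothetical, and it assumes plain key=value lines with no spaces around the = sign.

```shell
# Hypothetical helper: print the value of KEY from a properties file,
# or DEFAULT when the key is absent.
# Assumes plain "key=value" lines with no spaces around "=".
prop_or_default() {
  local file="$1" key="$2" default="$3"
  awk -v k="$key" -v d="$default" '
    /^[ \t]*#/ { next }                      # skip comment lines
    index($0, k "=") == 1 {                  # line starts with "key="
      print substr($0, length(k) + 2)        # everything after the "="
      found = 1
      exit
    }
    END { if (!found) print d }              # key absent: use the default
  ' "$file"
}

# Example usage (the fallback value shown is a placeholder, not a documented default):
#   prop_or_default config/application.properties trigger.scheduler.cron.enabled "<default>"
```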

Values must be provided for at least the following configuration keys, as there are no default values for them:

  • collibra.username - the username used for Collibra import

  • collibra.password - the password for the Collibra user

  • atscale.api.username - the username used to connect to AtScale

  • atscale.api.password - the password for the AtScale user

  • atscale.api.apihost - the host AtScale is running on

  • atscale.api.authhost - the host AtScale authorization is running on; usually the same as atscale.api.apihost

  • trigger.api.username - the username for the atscale-to-collibra-integration utility; set it in the application.properties file

  • trigger.api.password - the password for the atscale-to-collibra-integration utility; set it in the application.properties file

The atscale.api.organization key (the organization for the AtScale installer) can also be changed if your organization is not the default.
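Before starting the integration, it can help to confirm that every key without a default has a value. The following is a minimal sketch; the function name check_required_keys is hypothetical, and it assumes plain key=value lines with no spaces around the = sign.

```shell
# Hypothetical helper: report required keys that are missing or empty.
# Assumes plain "key=value" lines with no spaces around "=".
check_required_keys() {
  local file="$1" key missing=0
  for key in \
    collibra.username collibra.password \
    atscale.api.username atscale.api.password \
    atscale.api.apihost atscale.api.authhost \
    trigger.api.username trigger.api.password
  do
    # A key counts as set only if "key=" is followed by at least one character.
    if ! awk -v k="$key" '
      index($0, k "=") == 1 && length($0) > length(k) + 1 { ok = 1 }
      END { exit !ok }
    ' "$file"; then
      echo "missing or empty: $key" >&2
      missing=1
    fi
  done
  # Returns 0 when all required keys have values, 1 otherwise.
  return "$missing"
}

# Example usage (the path assumes the file sits in the config folder):
#   check_required_keys config/application.properties && echo "configuration OK"
```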

3. Start atscale-to-collibra-integration

To do this, run the following command in a terminal: ./bin/atscale-to-collibra-integration

The process is not expected to exit. It performs synchronizations periodically (if trigger.scheduler.cron.enabled is set to true) at the frequency specified in trigger.scheduler.cron.expression (default: every 2 hours).

Leave this terminal open; here you can see the sync happening.

4. Initial metadata update

In a new terminal, with atscale-to-collibra-integration still running, run: ./bin/update_metadata.sh <trigger.api.username> <trigger.api.password>

Both the username and the password are set in the application.properties file.

This will update and/or create the necessary assets in Collibra and perform an initial synchronization.

It will also create the Business Domain and Data Domain specified in the application.properties file.

Once the update has completed, this terminal can be closed safely.
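Since the trigger credentials are already stored in application.properties, the step above can be wrapped so they do not need to be typed by hand. This is a sketch only: the function name run_metadata_update is hypothetical, and it assumes the properties file sits in the config folder and uses plain key=value lines with no spaces around the = sign.

```shell
# Hypothetical wrapper: read the trigger credentials from the properties
# file and pass them to update_metadata.sh.
# ROOT is the extracted folder (defaults to the current directory).
run_metadata_update() {
  local root="${1:-.}" props user pass
  props="$root/config/application.properties"
  # Print whatever follows "key=" on the first matching line.
  user="$(sed -n 's/^trigger\.api\.username=//p' "$props" | head -n 1)"
  pass="$(sed -n 's/^trigger\.api\.password=//p' "$props" | head -n 1)"
  if [ -z "$user" ] || [ -z "$pass" ]; then
    echo "trigger.api.username/password not set in $props" >&2
    return 1
  fi
  "$root/bin/update_metadata.sh" "$user" "$pass"
}

# Example usage, from inside the extracted folder:
#   run_metadata_update .
```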

5. Sync all data objects

To force the integration to send all objects from AtScale to Collibra immediately (rather than waiting for the scheduled sync), use the following link:

http://localhost:8081/api/sync

Paste it into your browser. Each time you refresh the page, the integration connects to AtScale, retrieves all the metadata, and sends it to Collibra.
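The same request can be sent from a terminal instead of a browser. The sketch below uses curl to issue a plain GET to the sync link; the function names sync_url and trigger_sync are hypothetical. The article does not say whether this endpoint requires the trigger.api credentials. If it does, curl's -u option would be one way to supply them, but that is an assumption.

```shell
# Hypothetical helpers for triggering a sync from the command line.

# Build the sync URL; the defaults match the link in this article.
sync_url() {
  echo "http://${1:-localhost}:${2:-8081}/api/sync"
}

# Send a GET request to the sync endpoint.
# -f: fail on HTTP error responses, -sS: no progress bar but still show errors.
trigger_sync() {
  curl -fsS "$(sync_url "$@")"
}

# Example usage, with the integration running on the same machine:
#   trigger_sync
```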

In Collibra, there is a Community called “Connector Testing”; please use it for testing. To do this, update row 56 of the application.properties file with the correct ID.

Currently it is: atscale.community.id=01961563-2ef8-7732-b33d-f9f667bced5f
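Row numbers can shift if the file is edited, so updating the value by key may be less fragile. The following is a minimal sketch; the function name set_prop is hypothetical, and it assumes plain key=value lines with no spaces around the = sign.

```shell
# Hypothetical helper: set KEY to VALUE in a properties file,
# replacing the existing line or appending one if the key is absent.
# Assumes plain "key=value" lines with no spaces around "=".
set_prop() {
  local file="$1" key="$2" value="$3" tmp rc
  tmp="$(mktemp)" || return 1
  awk -v k="$key" -v v="$value" '
    BEGIN { FS = "=" }
    $1 == k { print k "=" v; found = 1; next }   # replace the existing line
    { print }                                    # keep every other line
    END { if (!found) print k "=" v }            # append if the key was absent
  ' "$file" > "$tmp" && cat "$tmp" > "$file"     # overwrite in place, keeping permissions
  rc=$?
  rm -f "$tmp"
  return "$rc"
}

# Example usage (the path assumes the file sits in the config folder):
#   set_prop config/application.properties atscale.community.id 01961563-2ef8-7732-b33d-f9f667bced5f
```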
