Data Catalog demo


1. Introduction to the test environment

CDP Runtime version: CDP PvC Base 7.1.7 SP2
CM version: Cloudera Manager 7.11.3.2
ECS version: CDP PvC DataServices 1.5.2
OS version: CentOS 7.9
K8S version: RKE 1.25.14
Whether to enable Kerberos: Yes
Whether to enable TLS: Yes
Auto-TLS: Yes
Kerberos: FreeIPA
LDAP: FreeIPA
DB Configuration: Embedded
Vault: Embedded
Docker registry: Embedded
Install Method: Internet

2. Basic Concept

  • Data Catalog is a service within Cloudera Data Platform that enables you to understand, manage, secure, and govern data assets across the enterprise.

3. Prerequisites

  • Navigate to Cloudera Manager > Clusters > Atlas > Configuration, search for conf/atlas-application.properties_role_safety_valve, and enter the following values for the Atlas service:
atlas.proxyuser.dpprofiler.hosts=*
atlas.proxyuser.dpprofiler.users=*
atlas.proxyuser.dpprofiler.groups=*

  • Navigate to Cloudera Manager > Clusters > Ranger > Configuration, search for conf/ranger-admin-site.xml, and enter the following values for the Ranger service:
Name: ranger.proxyuser.dpprofiler.hosts
Value: *

Name: ranger.proxyuser.dpprofiler.users
Value: *

Name: ranger.proxyuser.dpprofiler.groups
Value: *

  • Restart the cluster as prompted by the Cloudera Manager UI after the configuration changes.

  • Go to Cloudera Management Console > User Management > Users and grant the PowerUser role to the user admin.

  • Log in to the Data Catalog service as the user admin from the homepage of your CDP console.

  • Navigate to Cloudera Data Catalog > Search. You should see the message “Setup the Profiler for ecstest-datalake”. Click Get Started.

  • Click Setup Profiler.

  • Navigate to Cloudera Data Catalog > Profilers > Configs. You should see that the three built-in profilers are enabled: Cluster Sensitivity Profiler, Ranger Audit Profiler, and Hive Column Profiler.
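
The proxy-user settings entered for Atlas and Ranger in the first two steps follow a single naming pattern, so they can be generated rather than typed by hand. The following is a minimal Python sketch, not part of any Cloudera tooling: the function names are hypothetical, and the XML rendering assumes the safety valve accepts Hadoop-style <property> elements when edited in its XML view. Only the key names and the value * come from the steps above.

```python
from xml.sax.saxutils import escape

# The three proxy-user suffixes listed in the prerequisite steps.
SUFFIXES = ("hosts", "users", "groups")


def proxyuser_settings(service, user="dpprofiler", value="*"):
    """Return the proxy-user keys for one service, e.g. 'atlas' or 'ranger'."""
    return {f"{service}.proxyuser.{user}.{suffix}": value for suffix in SUFFIXES}


def as_properties(settings):
    """Render settings as key=value lines (the form used for the Atlas safety valve)."""
    return "\n".join(f"{key}={val}" for key, val in settings.items())


def as_xml_properties(settings):
    """Render settings as <property> elements (assumed XML view of the Ranger safety valve)."""
    blocks = []
    for key, val in settings.items():
        blocks.append(
            "<property>\n"
            f"  <name>{escape(key)}</name>\n"
            f"  <value>{escape(val)}</value>\n"
            "</property>"
        )
    return "\n".join(blocks)


if __name__ == "__main__":
    print(as_properties(proxyuser_settings("atlas")))
    print()
    print(as_xml_properties(proxyuser_settings("ranger")))
```

Running the script prints the three Atlas lines in key=value form, followed by the three Ranger entries as name/value pairs, which can then be compared against what was entered in Cloudera Manager.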



All trademarks, logos, service marks, and company names appearing here are the property of their respective owners.