Data Governance and Security for Databricks

Enable consistent data governance and security across your machine learning and artificial intelligence workloads.

Privacera and Databricks


Understand at a glance what sensitive data you have, where it is stored, and who is accessing it with Databricks.


Democratize data by safely making more data available to more Databricks users for machine learning and artificial intelligence.


With Privacera and Databricks, easily achieve and prove compliance with privacy regulations like GDPR and CCPA.

Balance Data Governance and Security with Machine Learning and Artificial Intelligence

Privacera natively integrates with Databricks at the infrastructure level, as well as with Amazon S3, Azure Data Lake Store and other cloud storage services that make data available to Databricks, to provide consistent data governance and security.

Discovery and Classification

Discover and classify sensitive data across the tables and schemas in a catalog to understand where PII and other sensitive data are stored in your Databricks environment. Privacera runs on a Databricks cluster and uses machine learning and rules to accurately identify specific data types and apply tags. Learn more about discovery and classification.
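To make the rules-and-tags idea concrete, here is a minimal sketch of rule-based classification over sampled column values. It is an illustration only, not Privacera's implementation: the rule set, the `classify_column` and `classify_table` helpers, and the 80% match threshold are all assumptions, and the machine learning side of classification is not shown.

```python
import re

# Hypothetical rule set (tag -> pattern); not Privacera's actual rules.
RULES = {
    "EMAIL": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "US_SSN": re.compile(r"\d{3}-\d{2}-\d{4}"),
}


def classify_column(values, threshold=0.8):
    """Return the sorted tags whose pattern fully matches at least
    `threshold` of the non-empty sampled values (threshold is an assumption)."""
    non_empty = [v for v in values if v]
    if not non_empty:
        return []
    tags = []
    for tag, pattern in RULES.items():
        hits = sum(1 for v in non_empty if pattern.fullmatch(v))
        if hits / len(non_empty) >= threshold:
            tags.append(tag)
    return sorted(tags)


def classify_table(sample):
    """Map each column name to its tags, given {column: [sampled values]}."""
    return {col: classify_column(vals) for col, vals in sample.items()}


# Made-up sample values for three columns of one table.
sample = {
    "contact": ["ana@example.com", "bo@example.org", ""],
    "ssn": ["123-45-6789", "987-65-4321"],
    "city": ["Lisbon", "Oslo"],
}
tags = classify_table(sample)
```

In this sketch, `tags` maps `contact` to `EMAIL`, `ssn` to `US_SSN`, and leaves `city` untagged. Tags of this kind are what a tag-based access policy would later key on.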

Centralized Access Management

Centrally define and enforce role-based, row-level, and column-level access controls in Databricks to ensure data is accessed only by authorized users. Privacera leverages Apache Ranger-based plugins to provide column-, row-, and file-level access control across Spark functions. Learn more about centralized access management.
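The effect of combining row-level and column-level controls can be sketched in plain Python. This is a conceptual model under stated assumptions, not the Apache Ranger plugin or any Privacera API: the `Policy` class, the role names, and the `authorized_view` helper are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, FrozenSet


@dataclass(frozen=True)
class Policy:
    """Hypothetical policy: visible columns plus a row-level predicate."""
    columns: FrozenSet[str]
    row_filter: Callable[[dict], bool]


# Hypothetical central policy store, keyed by role.
POLICIES = {
    "analyst_emea": Policy(
        columns=frozenset({"order_id", "region", "amount"}),
        row_filter=lambda row: row["region"] == "EMEA",
    ),
    "auditor": Policy(
        columns=frozenset({"order_id", "region", "amount", "customer_email"}),
        row_filter=lambda row: True,
    ),
}


def authorized_view(role, rows):
    """Return only the rows and columns the role's policy allows."""
    policy = POLICIES.get(role)
    if policy is None:
        return []  # no policy for this role: deny by default
    return [
        {col: val for col, val in row.items() if col in policy.columns}
        for row in rows
        if policy.row_filter(row)
    ]


# Made-up rows for illustration.
orders = [
    {"order_id": 1, "region": "EMEA", "amount": 120, "customer_email": "a@example.com"},
    {"order_id": 2, "region": "APAC", "amount": 75, "customer_email": "b@example.com"},
]
emea_view = authorized_view("analyst_emea", orders)
```

Here the EMEA analyst sees one row without the `customer_email` column, the auditor sees both rows in full, and a role with no policy sees nothing.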

Anonymization and Masking

Privacera enables compliance with privacy and security regulations by anonymizing sensitive data as it is stored in the cloud while preserving the data’s analytical value and usefulness for machine learning and artificial intelligence with Databricks. The data can be de-anonymized for select users based on policies. Learn more about anonymization and masking.
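One way to picture anonymization that keeps analytical value and can be reversed for select users is deterministic tokenization backed by a lookup vault. The sketch below assumes that technique for illustration; the page does not say which scheme Privacera uses, and the `TokenVault` class, the `tok_` prefix, and the role names are hypothetical.

```python
import hashlib
import hmac


class TokenVault:
    """Hypothetical vault: deterministic tokens plus a reverse lookup."""

    def __init__(self, key: bytes):
        self._key = key
        self._reverse = {}  # token -> original value

    def anonymize(self, value: str) -> str:
        # The same input always yields the same token, so joins and
        # group-bys on the anonymized column still work.
        digest = hmac.new(self._key, value.encode("utf-8"), hashlib.sha256)
        token = "tok_" + digest.hexdigest()[:16]
        self._reverse[token] = value
        return token

    def reveal(self, token: str, role: str, allowed_roles) -> str:
        # Only roles named by the policy get the original value back.
        if role not in allowed_roles:
            return token
        return self._reverse.get(token, token)


vault = TokenVault(key=b"demo-key-not-for-production")
t1 = vault.anonymize("ana@example.com")
t2 = vault.anonymize("ana@example.com")
t3 = vault.anonymize("bo@example.org")
revealed = vault.reveal(t1, "privacy_officer", {"privacy_officer"})
still_masked = vault.reveal(t1, "data_scientist", {"privacy_officer"})
```

Because tokens are deterministic, `t1` and `t2` are identical, which is what lets model training and aggregation run on the anonymized column. A production system would also need key management and persistent, access-controlled storage for the vault, both omitted here.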

Frequently asked questions

Does Privacera work with Databricks?

Privacera plugins, based on Apache Ranger, can enforce fine-grained access management in Databricks and Apache Spark. Privacera plugins are automatically initiated when a Databricks cluster is started.

Does Privacera access management add any performance overhead?

Privacera differs from other solutions that try to manage data requests from Apache Spark and access data on behalf of the service. Privacera’s lightweight access enforcement points quickly check a request and let it proceed if there is a corresponding policy granting access.

Is Privacera integrated with the Apache Hive metastore and AWS Glue?

Privacera works across any metadata store for Databricks, including Hive metastores and AWS Glue. Privacera can also enable tag-based access policies based on data classifications.
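A tag-based policy decides access from a column's classification tags instead of its name, so one rule covers every column carrying that tag, whichever metadata store lists it. The following is a minimal sketch under that assumption; the tag names, role names, column identifiers, and the `can_read` helper are hypothetical and do not reflect Privacera's policy model.

```python
# Hypothetical tags per column, e.g. produced by a classification step.
COLUMN_TAGS = {
    "sales.customers.email": {"PII", "EMAIL"},
    "sales.customers.country": set(),
    "sales.orders.amount": {"FINANCIAL"},
}

# Hypothetical policy: the tags each role is cleared to read.
ROLE_ALLOWED_TAGS = {
    "data_scientist": {"FINANCIAL"},
    "privacy_officer": {"PII", "EMAIL", "FINANCIAL"},
}


def can_read(role, column):
    """Allow access only if every tag on the column is cleared for the role."""
    if role not in ROLE_ALLOWED_TAGS or column not in COLUMN_TAGS:
        return False  # unknown role or column: deny by default
    return COLUMN_TAGS[column] <= ROLE_ALLOWED_TAGS[role]


scientist_sees_email = can_read("data_scientist", "sales.customers.email")
scientist_sees_amount = can_read("data_scientist", "sales.orders.amount")
```

In this sketch the data scientist can read `sales.orders.amount` but not `sales.customers.email`. Tagging a new column as `PII` would restrict it immediately, with no new per-column rule.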

Resources & Latest News


Security and Privacy for Modern Data Platforms

Learn how to enable comprehensive security, privacy and governance in big data and cloud environments using Privacera.

Get Started Today

Contact us to learn more about Privacera for Databricks and get a free risk assessment.