Data Quality for Hadoop
BigInsights BigQuality is a data quality solution that provides a rich set of data profiling, cleansing and monitoring capabilities that execute on the data nodes of a Hadoop cluster.
BigInsights BigQuality helps ensure information quality and lets you adapt quickly to strategic business changes through data stewardship, data monitoring and the application of data quality rules to your Hadoop data.
- A massively scalable, shared-nothing, in-memory data profiling and cleansing engine running natively in a Hadoop cluster to help bring enterprise-class data processing capabilities to the data lake.
- A rich set of data profiling capabilities to understand the assets that are moved into Hadoop.
- Metadata management to help make sense of the enormous quantities of information in the data lake.
- Support data privacy, data masking and test data management initiatives by identifying where Personally Identifiable Information (PII), sensitive and other classes of data are stored.
- Fast time to value by identifying the type of data contained within a column using three dozen pre-defined, out-of-the-box data classes, including credit card numbers, taxpayer IDs, US phone numbers and others.
- Data investigation, standardization, matching, survivorship and address verification are now supported directly inside a Hadoop cluster. USAC and AVI address cleansing and validation are also supported on the Hadoop cluster.
- Deliver better big data, faster with a scalable data cleansing platform. You can outperform Hadoop-only distributions, process the right workloads with the right tools and enable data governance using data lineage.
- Enable cloud initiatives, whether you need data integration as part of a private or public cloud, or to integrate on-premises data with a cloud environment.
- Deliver faster time to value by deploying an easy-to-use graphical interface to help you transform information across your enterprise.
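
To make the idea of pre-defined data classes more concrete, the following is a minimal Python sketch of column classification. It is not BigQuality's implementation or API: the class names, regular expressions, the `classify_column` function and the 0.8 match threshold are all assumptions chosen for illustration. A column is assigned a class when a large enough share of its non-empty values match that class's rule.

```python
import re

# Hypothetical rules for three illustrative data classes.
# These patterns are assumptions, not the product's actual definitions.
US_PHONE = re.compile(r"^(?:\+?1[ .-]?)?(?:\(\d{3}\)|\d{3})[ .-]?\d{3}[ .-]?\d{4}$")
US_TAXPAYER_ID = re.compile(r"^\d{3}-\d{2}-\d{4}$")
CARD = re.compile(r"^\d(?:[ -]?\d){12,18}$")  # 13 to 19 digits, optional separators


def luhn_ok(value):
    """Return True if the digits in value pass the Luhn checksum."""
    digits = [int(c) for c in value if c.isdigit()]
    total = 0
    for i, d in enumerate(reversed(digits)):
        if i % 2 == 1:  # double every second digit, counting from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


# Checked in insertion order; the first class that reaches the threshold wins.
CLASSIFIERS = {
    "credit_card": lambda v: bool(CARD.match(v)) and luhn_ok(v),
    "us_taxpayer_id": lambda v: bool(US_TAXPAYER_ID.match(v)),
    "us_phone": lambda v: bool(US_PHONE.match(v)),
}


def classify_column(values, threshold=0.8):
    """Return the name of the first class matched by at least `threshold`
    of the non-empty values, or None if no class qualifies."""
    cleaned = [v.strip() for v in values if v and v.strip()]
    if not cleaned:
        return None
    for name, matches in CLASSIFIERS.items():
        hits = sum(1 for v in cleaned if matches(v))
        if hits / len(cleaned) >= threshold:
            return name
    return None


if __name__ == "__main__":
    print(classify_column(["(212) 555-0142", "212-555-0199", "1-212-555-0100"]))
    print(classify_column(["4111 1111 1111 1111", "5500 0000 0000 0004"]))
```

Using a match threshold rather than requiring every value to match lets a column with a few dirty or placeholder values still be classified, which is usually what a profiling pass over raw data needs.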