
How Data Reliability Engineering Can Boost Data Pipelines


George Philip

Practice Head, Digital Insights, Hitachi Vantara

George Philip has more than 25 years of hands-on experience in managing and delivering measurable business value for large global clients across Advanced Analytics, Data Science, Big Data, Business Intelligence, Information Management, Information Governance, Data Warehousing, Performance Management, Master Data Management, CDI, Data Quality, and Metadata Management programs.

He has extensive experience in program management and solution architecture for large, enterprise-scale data modernization programs across functional areas such as Regulatory, Risk, Fraud, Sales, Marketing, Customer, Finance, and Operations.

He has been an invited speaker at international and national events such as the Gartner BI Summit USA and BI summits by Silicon India, and serves on the advisory boards of various BI conferences, including the National Conference on Business Analytics & Business Intelligence held by The Institute of Public Enterprise.

Educational qualification: PGDBM and B.Tech in Computer Science.


The enterprise has a pipeline problem – a data pipeline problem.

As the amount of data being generated, shared, and stored continues to explode, data-intensive applications like Generative AI and Large Language Models, not to mention everyday business decision-making, are demanding cleaner, more accurate, and more complete datasets than ever before.

Data rushing in from billions of IoT and smart devices, ever-expanding social media platforms, and increasingly complex business processes is stressing data pipelines, making it increasingly difficult, if not downright impossible, for companies to efficiently channel the tsunami into data cleansing and preparation programs. As a result, sludgy, unqualified data slogs through the pipelines and into analytical applications, negatively impacting everything from insights and outcomes to customer experiences and bottom lines.

Enter the Era of DRE

To get control of this issue and start supporting the plethora of data-intensive applications, organizations are beginning to explore an emerging discipline called Data Reliability Engineering (DRE). A subset of Site Reliability Engineering (SRE), DRE is a set of tools and practices designed to help organizations achieve high data quality, freshness, and availability across the data life cycle. DevOps and SRE practices are employed as part of the work, but the main goal of DRE is to make high-quality, reliable data available at any time, from anywhere across the enterprise.

From a DRE perspective, many applications and processes may be generating unqualified data simply because basic checks and balances were never built into them, and it is impractical to go in and change every one of those applications and processes. However, when the data reaches the threshold – the membrane that separates the pipeline from the consumption and analytics space – DRE can encapsulate that data and perform analysis and preparation. The data can be examined and its quality determined immediately, all before it is consumed by the analytical application.
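As a rough sketch of what such a boundary check might look like (the column names and thresholds below are hypothetical, not drawn from any particular DRE toolkit), a minimal quality gate in Python could validate a batch before analytics are allowed to consume it:

```python
from dataclasses import dataclass

@dataclass
class QualityReport:
    passed: bool
    issues: list

# Hypothetical expectations; in practice these would come from agreed data SLOs.
REQUIRED_COLUMNS = {"customer_id", "event_time", "amount"}
MAX_NULL_RATIO = 0.02  # tolerate at most 2% nulls per required column

def quality_gate(rows: list) -> QualityReport:
    """Examine a batch at the pipeline/consumption boundary before
    the analytical application is allowed to read it."""
    issues = []
    if not rows:
        return QualityReport(passed=False, issues=["batch is empty"])

    # Schema check: every required column must be present in the batch.
    missing = REQUIRED_COLUMNS - set(rows[0].keys())
    if missing:
        issues.append(f"missing columns: {sorted(missing)}")

    # Completeness check: the null ratio per required column must stay low.
    for col in REQUIRED_COLUMNS - missing:
        nulls = sum(1 for row in rows if row.get(col) is None)
        if nulls / len(rows) > MAX_NULL_RATIO:
            issues.append(f"column '{col}' exceeds the null threshold")

    return QualityReport(passed=not issues, issues=issues)

# Usage: only clean batches cross the membrane into the analytics space.
report = quality_gate([{"customer_id": 1, "event_time": "2024-01-01T00:00:00Z", "amount": 9.5}])
if report.passed:
    print("publishing batch to the analytics layer")
else:
    print("quarantining batch:", report.issues)
```

The key design point is that the gate sits at the boundary rather than inside the producing applications, so unqualified data can be quarantined without touching upstream systems.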

For our part, Hitachi Vantara is well known for innovations that help organizations manage and protect their growing data volumes. We have embraced DRE fully and recently launched our own Hitachi DRE suite of services, built on proven DataOps methodologies and practices. Among the suite's wide range of capabilities, we enable predictive and preventive approaches to data quality by providing crucial data observability and data pipeline monitoring.

With traditional monitoring, a data problem is identified, but a data engineer must then spend time and energy discovering the root cause and implementing a fix. Data observability tools went a step further: they enabled engineers to observe the data, identify the problem, and receive a recommended fix, which streamlined the process. But engineers still had to go in and remedy the problem themselves.
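To make that distinction concrete, here is a hedged sketch of the kind of check an observability layer might run (the expected load interval, row counts, and remediation text are invented for illustration). Rather than simply raising an alert, it attaches a likely cause and a suggested remedy to each finding:

```python
from datetime import datetime, timedelta, timezone

# Illustrative expectations for a single table; real observability tools
# typically learn these from historical load patterns.
EXPECTED_INTERVAL = timedelta(hours=1)
EXPECTED_MIN_ROWS = 10_000

def observe_table(last_loaded_at: datetime, rows_loaded: int) -> list:
    """Return findings that pair each detected problem with a suggested fix."""
    findings = []
    now = datetime.now(timezone.utc)

    # Freshness: has the table been loaded within the expected cadence?
    if now - last_loaded_at > 2 * EXPECTED_INTERVAL:
        findings.append(
            "STALE: last load is more than two intervals old; "
            "check the upstream ingestion job and re-run the latest partition."
        )

    # Volume: did the last load carry far fewer rows than usual?
    if rows_loaded < 0.5 * EXPECTED_MIN_ROWS:
        findings.append(
            "VOLUME DROP: row count fell below half the expected minimum; "
            "verify the source extract completed and backfill if needed."
        )

    return findings

# Usage: a stale, under-filled table yields actionable findings.
for finding in observe_table(
    last_loaded_at=datetime.now(timezone.utc) - timedelta(hours=5),
    rows_loaded=3_200,
):
    print(finding)
```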

The new Hitachi DRE introduces self-healing engineering, in which the data engineer can initiate automatic reconciliation of the incident, dramatically reducing the time needed for repair and moving data into the analytical applications faster.
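The general pattern behind self-healing can be sketched independently of any specific product. Assuming the pipeline exposes a hook to re-run a failed load and a way to re-execute the quality check that raised the incident (the rerun_partition and recheck callbacks below are hypothetical), an automatic reconciliation loop retries the known remediation, verifies the result, and escalates to a human only if the automatic path fails:

```python
import time
from typing import Callable

def self_heal(
    incident: str,
    rerun_partition: Callable[[], bool],  # hypothetical hook that re-runs the failed load
    recheck: Callable[[], bool],          # re-executes the check that raised the incident
    max_attempts: int = 3,
) -> bool:
    """Attempt automatic reconciliation of a data incident.

    Conceptual sketch only, not any vendor's implementation: retry the known
    remediation, verify it with the original check, and escalate on failure.
    """
    for attempt in range(1, max_attempts + 1):
        print(f"[{incident}] remediation attempt {attempt}")
        if rerun_partition() and recheck():
            print(f"[{incident}] reconciled automatically")
            return True
        time.sleep(2 ** attempt)  # simple backoff between attempts

    print(f"[{incident}] escalating to a data engineer")
    return False

# Usage with stubbed hooks that succeed on the first attempt.
resolved = self_heal(
    incident="orders_table_stale",
    rerun_partition=lambda: True,
    recheck=lambda: True,
)
```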

As the world grows more awash in data, the need for self-healing data pipelines grows more critical. DRE is an exciting new approach that can make it happen.
