
Driving Enterprise Transformation via rationalization of analytics on Data Lake

Updated: Apr 8, 2023


In the Business Intelligence world, Excel sheets are predominant and will likely continue to dominate the strategic and operational reporting needs of the business. Working with various organizations, we have observed a heavy dependency on Excel-driven reporting and a willingness to live with the data quality challenges these reports carry. Divisional heads are increasingly focused on building these reports without truly understanding their consumption patterns or being able to classify them as operational or strategic reporting. C-level executives who receive these reports also struggle to understand the parameters that drive their key performance indicators.

The Datalake is conceptualized to help resolve some of these challenges and give the business the ability to realize the true value of its data. However, technology teams often struggle to create a repeatable design pattern that lets the data warehouse redefine business execution using the data it presents.

In this blog, we present a framework in which the technology team partnered with the business groups to understand and interpret the business requirements, help with rationalization, and create a design pattern that can be applied to similar projects and other domains. It showcases how information management systems can truly redefine how business is conducted and transformed.

1. Current State Challenges

The business analytics team had been using Excel sheets to create reports: pulling in data from various sources, aggregating it through Excel formulae and macros, and presenting it in graphs. These analytics include executive dashboards and operational and strategic reports that depict the current health of the business and its operations. Other critical functions include auditing, reconciliation and loan tracking. Current-state operational reports needed extracts that combined user-managed data held in Access databases and other silos with system-of-record information. With this approach, any minor mistake could cascade into multiple downstream analyses built on the assembled information.

Adding to the complexity, regulatory requirements drove changes to the business processes that generated the data elements feeding the reports. Regulators also required the business to create ad hoc reports that had to reconcile with previously reported data; failure to reconcile could incur penalties.

2. Transformation Approach

A strategic, program-driven approach is required to understand the business data needs and map them to the corresponding business processes. Essentially, the approach should build a factory model comprising various reusable design patterns that maximize the value of the Datalake in serving business needs while enabling scalability and reuse of components for quicker time to market. This approach enables a SINGLE SOURCE OF TRUTH from a systematically managed reporting repository, while giving the business the flexibility to generate custom analytics as needed.

Below is a high-level framework comprising 5 foundational elements that we leveraged as part of the overall approach. Each element is covered in detail in the following sections.

2.1 Understanding what the business needs

Mapping business reporting needs to the corresponding business processes in a rapidly evolving environment is extremely challenging. With properly built questionnaires, we captured the current business reporting, identified rationalization opportunities and mapped the respective reporting elements to the business processes that generate them. We captured these details in a report to business process map document (Document A).

Reviewing the report to business process map document (Document A) with the end consumers and executives helped us significantly in rationalizing the reports and improving their layout. In this exercise we also identified key performance indicators and the elements that drive them. Including the data quality rules around these elements helped build a 360-degree view of the data and its potential issues, and drive a strategy to optimize how data is used.
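To make the rationalization step concrete, here is a minimal sketch of how a report to business process map could be held as data and queried for overlap. The report names, element names and process names are hypothetical and chosen only for illustration; the actual layout of Document A is not described in this blog.

```python
from collections import defaultdict

# Hypothetical slice of a report-to-business-process map:
# report name -> {reporting element: business process that generates it}
REPORT_MAP = {
    "Executive Dashboard": {
        "loan_balance": "Loan Servicing",
        "delinquency_rate": "Collections",
    },
    "Monthly Ops Report": {
        "loan_balance": "Loan Servicing",
        "open_tickets": "Customer Support",
    },
    "Audit Extract": {
        "loan_balance": "Loan Servicing",
        "delinquency_rate": "Collections",
    },
}


def shared_elements(report_map):
    """Return elements used by more than one report, with the reports using them."""
    usage = defaultdict(list)
    for report, elements in report_map.items():
        for element in elements:
            usage[element].append(report)
    return {e: sorted(reports) for e, reports in usage.items() if len(reports) > 1}


def duplicate_reports(report_map):
    """Return groups of reports built from exactly the same set of elements."""
    groups = defaultdict(list)
    for report, elements in report_map.items():
        groups[frozenset(elements)].append(report)
    return [sorted(group) for group in groups.values() if len(group) > 1]
```

Elements that appear in several reports are candidates for a single standardized definition, and reports with identical element sets are candidates for consolidation.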

Another challenge was tracking the ever-changing business processes and reporting needs. We created an iteration-driven model to baseline and continuously enhance the report to business process map document with business stakeholders, who were able to recommend changes to the document. The iterative approach included a PoC that mocked up reporting templates with data, which helped visualize the data and the what-if analysis that could be performed with it. This added to the requirements time but reduced the number of iterations on the data model needed to support the operational and analytical reporting needs.

2.2 Define Technology Platform for sourcing, provisioning & reporting

Depending on the use case, data volume and business domain, the technology team decides on the platform of choice for setting up the sourcing, provisioning and reporting hubs. Sourcing platforms bring data from heterogeneous sources onto the same platform, supporting joins and transformations before the data is brought into the provisioning layer. ETL tools now enable creation of a transient sourcing layer and data masking using custom algorithms.

Driven by the business process map document and architecture standards, important functions such as data quality, reconciliation and reprocessing mechanisms are implemented as the data moves onto the provisioning layers. Data quality and reprocessing mechanisms are closely interlinked and should be established based on the priority chosen by the business, as documented in the business process map (Document A).

Setting up and defining the integration strategy is part of the architectural specifications; however, nimbleness in design should be a key criterion when deciding on the integration patterns and data movement within the Datalake. CTQs (critical-to-quality measures) that help track the business value of the Datalake include:

1) Ability to deliver the data as fast as possible for consumption – streaming vs near real time vs batch

2) Ability to deliver data in the minimum number of hops

3) Ability to deliver data with the quality expected by the business

One of the key factors that drive Datalake adoption is the analytics platform. It is essential that the business and executives are actively involved in the selection of the platform.

2.3 Define Metadata, Data Lineage Strategy

Using the business process map document (Document A), the business analysts can work with the business group and technology partners to define and document metadata and data lineage in the platform of choice, as per the architectural strategy set by the information systems organization. An exhaustive effort to map the source fields to target fields, along with the data transformations, helped shorten the development cycles and gave the integration and reporting teams the clarity to deliver a product of the highest data quality and closest to business expectations.
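As an illustration of what a source-to-target mapping with lineage can look like, the sketch below records each mapping as a small record and answers two questions: where does a target field come from, and which required target fields have no mapping yet. All system, table and field names are hypothetical; a real program would typically keep this in the metadata tool chosen under the architectural strategy.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FieldMapping:
    source_system: str
    source_field: str
    target_table: str
    target_field: str
    transformation: str  # plain-language description of the rule applied


# Hypothetical mappings, for illustration only
MAPPINGS = [
    FieldMapping("loan_sor", "LN_AMT", "loan_fact", "loan_amount", "cast to decimal(18,2)"),
    FieldMapping("loan_sor", "ORIG_DT", "loan_fact", "origination_date", "parse YYYYMMDD to date"),
    FieldMapping("access_db", "Branch", "loan_fact", "branch_code", "trim and upper-case"),
]


def lineage_for(target_table, target_field, mappings=MAPPINGS):
    """Return every (source system, source field, transformation) feeding a target field."""
    return [
        (m.source_system, m.source_field, m.transformation)
        for m in mappings
        if m.target_table == target_table and m.target_field == target_field
    ]


def unmapped_targets(required_fields, mappings=MAPPINGS):
    """Return required (table, field) pairs that no mapping covers yet."""
    mapped = {(m.target_table, m.target_field) for m in mappings}
    return sorted(set(required_fields) - mapped)
```

Running the gap check against the fields a report needs surfaces missing mappings before development starts, which is one way the mapping effort can shorten development cycles.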

2.4 Data Modeling as Service

Organizations often ignore the importance of data modeling; however, it is extremely important for the Datalake to store data in a format that best represents the value of the business data elements. While the data model is required to present the key business elements to the reporting layer, it should also make it possible to add elements as the business requirements change. Transformed or derived data elements that find their way into multiple reports should be part of the data model so that the transformation logic can be standardized. Any ad hoc report created by the business should also be reviewed frequently by the technology team to identify transformed reporting elements that should be incorporated into the data model.

Given below is an approach for standardizing the data modeling process, which enables the creation of an abstraction layer and drives a service model. Inputs for this approach are driven by the business process map document (Document A). The approach was leveraged over the course of the program to rationalize attributes across subject areas, reduce data quality issues, facilitate quicker, prioritized changes as needed and drive a robust data strategy.

Figure 2: Data Modeling as Service – Core Foundation Elements

Further, to enable a robust reporting model that could serve both operational and analytical needs, a logical data model with business process entities and data relationships was defined. This later helped to break down the attributes and group them into a physical model, which went a long way toward arriving at the ideal design for ETL as well as reporting. The most critical aspect of this approach was bringing in a few members of the ETL and reporting teams to understand the requirements, so that all questions around the model were clarified before taking it forward.

Delivering this capability does not require organizations to have a dedicated data modeling team, but they need to be nimble enough to cross-train and move resources between database development and data modeling to create such a service capability.

2.5 Reporting & Analytics – Build out & Adoption Strategy

The reporting platform provides an external view into the business value of the Datalake, and hence extreme care should be taken in defining and selecting it.

The selection criteria should include examination of:

1) Transformation complexity in reporting layer

2) Visual Analytics

3) Predictive Analytics

4) Ease of Adhoc Analytics creation

5) Time to generate reports

6) Ability to go back and retrieve past reports for easy comparison

It is not easy for business users to leave the comfort zone of Excel and move to an alternate reporting platform, but with the right value positioning, the business will move along with IT to empower itself with advanced tools. Hence business participation in the selection of the platform drives adoption. A proof of concept, with the assessment outcome depicted in a Pugh Matrix, is an effective tool for reaching agreement with the business on the reporting layer strategy.
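A weighted Pugh Matrix rates each candidate against a baseline on every criterion as better (+1), same (0) or worse (-1), then sums the weighted ratings. The sketch below applies this to the six selection criteria listed above. The baseline (the current Excel-based reporting), the candidate names, the weights and the ratings are all assumptions made up for illustration, not assessment results.

```python
# Hypothetical weights for the six selection criteria (higher = more important)
CRITERIA_WEIGHTS = {
    "transformation_complexity": 2,
    "visual_analytics": 3,
    "predictive_analytics": 1,
    "adhoc_ease": 3,
    "time_to_generate": 2,
    "historical_retrieval": 2,
}

# Hypothetical ratings versus the baseline: +1 better, 0 same, -1 worse
RATINGS = {
    "Platform A": {
        "transformation_complexity": 1,
        "visual_analytics": 1,
        "predictive_analytics": 0,
        "adhoc_ease": -1,
        "time_to_generate": 1,
        "historical_retrieval": 1,
    },
    "Platform B": {
        "transformation_complexity": 0,
        "visual_analytics": 1,
        "predictive_analytics": 1,
        "adhoc_ease": 0,
        "time_to_generate": -1,
        "historical_retrieval": 1,
    },
}


def pugh_scores(ratings, weights):
    """Return the weighted Pugh score for each candidate."""
    scores = {}
    for candidate, row in ratings.items():
        if set(row) != set(weights):
            raise ValueError(f"{candidate}: ratings must cover every criterion exactly once")
        if any(value not in (-1, 0, 1) for value in row.values()):
            raise ValueError(f"{candidate}: ratings must be -1, 0 or +1")
        scores[candidate] = sum(weights[criterion] * value for criterion, value in row.items())
    return scores
```

With these illustrative numbers, Platform A scores 6 and Platform B scores 4. Agreeing the weights with business stakeholders before scoring keeps the outcome from being driven by IT preferences alone.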

The other challenge with report adoption is often the difference between how the new platform presents the data and how it reconciles with the old reports. There can be differences, and it requires good domain expertise to review and explain them to the business to drive adoption. In our experience, most of the testing effort should be focused on reconciling report data against legacy reports, so that issues are fixed during SIT and fewer surface during UAT or post implementation.
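A reconciliation check of this kind can be sketched as a keyed comparison of the same measure taken from the legacy report and the new report. The keys, values and tolerance below are hypothetical; in practice the extracts would come from the legacy Excel reports and the new reporting layer.

```python
from decimal import Decimal

# Hypothetical extracts of one measure (for example, loan balance by branch)
LEGACY = {
    "branch_01": Decimal("1000.00"),
    "branch_02": Decimal("250.50"),
    "branch_03": Decimal("75.00"),
}
NEW = {
    "branch_01": Decimal("1000.00"),
    "branch_02": Decimal("251.00"),
    "branch_04": Decimal("10.00"),
}


def reconcile(legacy, new, tolerance=Decimal("0.01")):
    """Return (key, issue, legacy_value, new_value) for every difference found."""
    differences = []
    for key in sorted(set(legacy) | set(new)):
        if key not in new:
            differences.append((key, "missing_in_new", legacy[key], None))
        elif key not in legacy:
            differences.append((key, "missing_in_legacy", None, new[key]))
        elif abs(legacy[key] - new[key]) > tolerance:
            differences.append((key, "value_mismatch", legacy[key], new[key]))
    return differences
```

Each returned row is something a domain expert can review and either explain to the business or raise as a defect during SIT.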

Adoption of analytics requires an executive technology & business council that can review the reconciliation findings between new reports & legacy reports and work collaboratively for greater adoption.

3. Converting tactical initiatives into strategic value

Moving key data elements into the Datalake and enabling strategic business value requires a lot of tactical steps, most of which have been described above. While the factory model approach helps complete the build-out of a data mart, we still need to build an integrated view across multiple data marts, driven by various use cases, to create an enterprise view of the data. Connecting the BI initiatives to the strategic goals of the enterprise will drive increased adoption of the system.

4. Quality Control & Risk Management

Creating a standardized approach to risk management is essential for the success of a Business Intelligence program. Risk monitoring should be implemented across the following aspects:

1) Technology

2) Organizational

3) External to Information System

4) Project Management

Determining and monitoring controls over these aspects helps set expectations around delivery timelines and adjust them as they change over the course of the project due to a variety of factors. A Risk Breakdown Structure is an effective tool for keeping track of risks and exposure limits and for limiting impact.
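One simple way to track exposure against limits in a Risk Breakdown Structure is sketched below, using the four aspects listed above as categories. It assumes the common convention of exposure = probability x impact; the individual risks, probabilities, impact scores and the limit are hypothetical.

```python
from collections import defaultdict

# Hypothetical risk register entries: probability in percent, impact on a 1-5 scale
RISKS = [
    {"category": "Technology", "risk": "ETL tool cannot mask data as required",
     "probability_pct": 40, "impact": 5},
    {"category": "Technology", "risk": "Reporting platform upgrade slips",
     "probability_pct": 10, "impact": 4},
    {"category": "Organizational", "risk": "Business SMEs unavailable for reviews",
     "probability_pct": 30, "impact": 3},
    {"category": "External to Information System", "risk": "New regulatory report mandated",
     "probability_pct": 20, "impact": 4},
    {"category": "Project Management", "risk": "Scope added after baseline",
     "probability_pct": 50, "impact": 2},
]


def exposure_by_category(risks):
    """Sum probability x impact for each category of the breakdown structure."""
    totals = defaultdict(int)
    for entry in risks:
        totals[entry["category"]] += entry["probability_pct"] * entry["impact"]
    return dict(totals)


def categories_over_limit(risks, limit):
    """Return categories whose total exposure exceeds the agreed limit."""
    return sorted(
        category
        for category, exposure in exposure_by_category(risks).items()
        if exposure > limit
    )
```

With these illustrative figures and a limit of 150, only the Technology category exceeds its exposure limit.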

5. Cost of Opportunity Analysis

The cost of opportunity plays an important role in determining whether there is true business value in moving to an information system that will drive successful business intelligence. Support from key stakeholders is required to successfully develop a business intelligence program, and factors such as changes in people, process, technology and the political environment are critical in determining the timing of the move from legacy systems. A few techniques that help:

1) Conducting face-to-face interactions between technology, business and executives to inspire confidence in the program objectives

2) Identifying the balance point between the cost of platform setup and skilled resources on one side and business value on the other

3) Convincing the business through ease of report development and enhanced layouts, which is critical to success

4) Sharing the business value gained from reports that are easy to use and give insight into business information

5) Taking a performance-centric approach that empowers users to build some self-serve reports, based on a shared knowledge base on how the reporting model is built

6) Simplifying by creating one system of truth, thereby removing data redundancy and preventing the creation of data silos

6. Summary

The true value of a Datalake is realized when it moves from being a data provisioning / data collection hub to being a platform for strategic analysis. While technology depends extensively on the business to understand data consumption patterns, the Datalake is uniquely positioned to serve as a single source of truth for executives, driving consolidation of business logic across lines of business while giving the business the power to discover data and perform analytics without compromising compliance and governance. All this can be delivered by leveraging the foundational elements for business analytics transformation described in this blog.
