The client wanted to modernize their on-premises data warehouse, which stored transactional data from their point-of-sale (POS) system as well as data from other sources such as HR payroll, Salesforce, and financial flat files. The modernization involved migrating 14 TB of data from the legacy warehouse to a cloud-based solution. This data also had to be kept synchronized with new data fetched from a purpose-built POS system across 2,000+ stores.
To address these challenges, we proposed a two-phase solution for the data warehouse migration.
Phase 1 – Historical data migration:
Historical data migration was initiated using AWS Database Migration Service (DMS) to extract data from the existing on-premises data warehouse into AWS S3 in Parquet format. The data was then transformed and structured using Databricks, which served as the orchestrating layer. During this phase, we streamlined the data model, reducing the number of tables from 250 to 180, which allowed the data platform to respond efficiently to business queries. Further, key datasets like Customer360, Inventory360, and Store360 were established to deliver a comprehensive view of the business.
A vital aspect of this data transformation was ensuring proper synchronization between the legacy and new data warehouses. Databricks jobs orchestrated this synchronization, while alert services were set up to send email notifications to designated recipients in case of any operational failure.
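One way to picture the synchronization check is a reconciliation step that compares per-table row counts between the legacy and new warehouses and raises an alert on any mismatch. The sketch below is a minimal illustration in plain Python, not the project's actual Databricks job: the table names, the counts, and the `find_sync_mismatches` and `build_alert` helpers are all hypothetical, and a real job would query both warehouses and hand the message to an alerting service.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Mismatch:
    table: str
    legacy_rows: Optional[int]  # None: table absent from the legacy warehouse
    cloud_rows: Optional[int]   # None: table absent from the cloud warehouse


def find_sync_mismatches(legacy: Dict[str, int], cloud: Dict[str, int]) -> List[Mismatch]:
    """Return every table whose row count differs between the two warehouses."""
    mismatches = []
    for table in sorted(set(legacy) | set(cloud)):
        legacy_rows = legacy.get(table)
        cloud_rows = cloud.get(table)
        if legacy_rows != cloud_rows:
            mismatches.append(Mismatch(table, legacy_rows, cloud_rows))
    return mismatches


def build_alert(mismatches: List[Mismatch]) -> Optional[str]:
    """Format an alert body, or return None when the warehouses agree."""
    if not mismatches:
        return None
    lines = [f"{m.table}: legacy={m.legacy_rows}, cloud={m.cloud_rows}" for m in mismatches]
    return "Sync check failed for %d table(s):\n%s" % (len(lines), "\n".join(lines))


# Hypothetical row counts; a real job would read these from both warehouses.
legacy_counts = {"sales": 1_000, "inventory": 250, "payroll": 40}
cloud_counts = {"sales": 1_000, "inventory": 248}

alert = build_alert(find_sync_mismatches(legacy_counts, cloud_counts))
if alert:
    print(alert)  # stand-in for notifying the designated email recipients
```

Row counts are only the cheapest possible signal; checksums or key-level comparisons would catch differences that leave the counts unchanged.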
Phase 2 – Event stream ingestion:
In the second phase, the primary focus was on incremental data processing. This was achieved by implementing a medallion architecture that processed records within the data lake. Data was ingested into AWS S3 and subsequently consumed to build a Delta Lake directly within S3. Further, quality checks were executed before loading data into the cloud data platform.
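The layered flow described above can be sketched in a few lines. This is a minimal, plain-Python illustration of the medallion idea (raw "bronze" records, validated "silver" records, aggregated "gold" output), not the Delta Lake pipeline built for the client; the record fields and the rules in `passes_quality_checks` are assumptions made for the example.

```python
from collections import defaultdict
from typing import Dict, Iterable, List

Record = Dict[str, object]


def passes_quality_checks(record: Record) -> bool:
    """Hypothetical checks: required keys present and a non-negative amount."""
    if not record.get("store_id") or not record.get("txn_id"):
        return False
    amount = record.get("amount")
    return isinstance(amount, (int, float)) and amount >= 0


def to_silver(bronze: Iterable[Record]) -> List[Record]:
    """Keep valid records and drop duplicate transactions (first one wins)."""
    seen = set()
    silver = []
    for record in bronze:
        if not passes_quality_checks(record):
            continue
        if record["txn_id"] in seen:
            continue
        seen.add(record["txn_id"])
        silver.append(record)
    return silver


def to_gold(silver: Iterable[Record]) -> Dict[str, float]:
    """Aggregate cleaned transactions into sales totals per store."""
    totals = defaultdict(float)
    for record in silver:
        totals[record["store_id"]] += record["amount"]
    return dict(totals)


# Bronze: raw events exactly as they landed, including bad and duplicate rows.
bronze = [
    {"txn_id": "t1", "store_id": "s1", "amount": 20.0},
    {"txn_id": "t1", "store_id": "s1", "amount": 20.0},  # duplicate
    {"txn_id": "t2", "store_id": "s2", "amount": -5.0},  # fails quality check
    {"txn_id": "t3", "store_id": "s1", "amount": 7.5},
    {"txn_id": "t4", "store_id": None, "amount": 3.0},   # missing store
]

silver = to_silver(bronze)
gold = to_gold(silver)
print(gold)
```

Keeping the bronze layer untouched means rejected rows can be inspected or reprocessed later if the quality rules change.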
We implemented a TriggerOnce mechanism that ran the streaming process at 15-minute intervals. This ensured that the data was available and accessible to all stakeholders, even before it was fully integrated into the data warehouse.
On top of these technical enhancements, the reporting system underwent a substantial transformation. Over 10,000 reports were analyzed and grouped into a concise set of 200+ reports.
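The run-once pattern is easiest to see with a checkpoint: each scheduled run processes whatever has arrived since the previous run as a single batch, records how far it got, and then stops. The sketch below is a plain-Python analogy under that assumption, not the Spark or Databricks API; `RunOnceJob`, its in-memory checkpoint, and the sample events are hypothetical. In PySpark Structured Streaming, the comparable setting is a one-time trigger on the stream writer (`trigger(once=True)`), with the 15-minute cadence supplied by an external job schedule.

```python
from typing import Callable, List, Sequence


class RunOnceJob:
    """Process everything that arrived since the last run, then stop."""

    def __init__(self, sink: Callable[[Sequence[dict]], None]) -> None:
        self._sink = sink
        self._checkpoint = 0  # offset of the next unprocessed event

    def run(self, source: Sequence[dict]) -> int:
        """Handle the backlog as one batch and return how many events it held."""
        batch = source[self._checkpoint:]
        if batch:
            self._sink(batch)
            self._checkpoint += len(batch)  # advance only after the write succeeds
        return len(batch)


delivered: List[dict] = []
job = RunOnceJob(sink=delivered.extend)

events = [{"txn_id": "t1"}, {"txn_id": "t2"}]
first = job.run(events)           # scheduler fires: picks up t1 and t2

events.append({"txn_id": "t3"})   # new data lands before the next run
second = job.run(events)          # next scheduled run: picks up only t3
third = job.run(events)           # nothing new: the run is a no-op

print(first, second, third)
```

Because the job exits between runs, compute is only used while there is a backlog to process, which is the usual reason for choosing this pattern over an always-on stream.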
These refined reports were then burst daily to approximately 3,000 stakeholders, catering to the diverse reporting needs across the organization. The reporting solution was further integrated with Power BI, establishing a direct connection to curated information. This enabled real-time synchronization of dashboards, aligning with the updates in Delta Lake’s data storage.
Benefits: Elevating operations and decision-making with modernized data infrastructure
- 77% enhanced operational efficiency through streamlined data migration and synchronization.
- 86% improved decision-making with real-time dashboard updates aligned with data changes.
- 92% reduction in reporting complexity with consolidated and refined reports.
- 74% increased data accessibility and availability to all stakeholders.
Ready to modernize your retail business?
At zeb, we specialize in transforming retail data infrastructure, enhancing operational efficiency, and elevating decision-making. If you’re looking to modernize your data systems, streamline reporting, and improve data availability, we’re here to help.
Partner with us and embark on a journey towards data-driven success.