The major challenge was fully automated, real-time data harvesting from hundreds of sites, followed by data normalization, versioning, lineage tracking, and aggregation reporting at large scale. Data imputation and outlier elimination were needed to refine the quality of the final dataset. The final visualization represents the depth and breadth of gas fundamentals data across geographic zones, time trends, and predictions.
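To make the imputation and outlier-elimination requirement concrete, here is a minimal sketch using only the Python standard library. The source does not say which methods were used, so the choices here (a median/MAD modified z-score to flag outliers, linear interpolation to fill gaps) and the function names `drop_outliers` and `impute_linear` are illustrative assumptions.

```python
from statistics import median
from typing import List, Optional

Series = List[Optional[float]]


def drop_outliers(values: Series, threshold: float = 3.5) -> Series:
    """Replace points with a large modified z-score by None.

    Assumption: the median/MAD rule is one common choice, not
    necessarily the rule used in the project described here.
    """
    present = [v for v in values if v is not None]
    if len(present) < 3:
        return list(values)
    med = median(present)
    mad = median(abs(v - med) for v in present)
    if mad == 0:
        return list(values)  # no spread, nothing can be flagged
    return [
        None if v is not None and 0.6745 * abs(v - med) / mad > threshold else v
        for v in values
    ]


def impute_linear(values: Series) -> Series:
    """Fill None gaps by linear interpolation between known neighbours.

    Gaps at either edge copy the nearest known value.
    """
    known = [i for i, v in enumerate(values) if v is not None]
    if not known:
        return list(values)
    out = list(values)
    for i in range(len(out)):
        if out[i] is not None:
            continue
        left = max((k for k in known if k < i), default=None)
        right = min((k for k in known if k > i), default=None)
        if left is None:
            out[i] = values[right]
        elif right is None:
            out[i] = values[left]
        else:
            frac = (i - left) / (right - left)
            out[i] = values[left] + frac * (values[right] - values[left])
    return out


# Hypothetical flow readings with one gap and one spike.
readings = [100.0, 102.0, None, 101.0, 950.0, 99.0]
cleaned = impute_linear(drop_outliers(readings))
```

Running the outlier step before imputation keeps the spike from distorting the interpolated values.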
An end-to-end ETL solution was built using Microsoft SQL Server Integration Services (SSIS), with Python scraping bots for data harvesting. Customized bots monitor the target sites and gather data as soon as it is published.
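One way a bot could notice that a site has published something new is to hash each fetched page and compare it with the previous hash. The sketch below assumes that approach; the class name `PublicationWatcher`, the injected `fetch` callable, and the example URL are hypothetical and not taken from the project.

```python
import hashlib
from typing import Callable, Dict, Optional


class PublicationWatcher:
    """Reports a page body only when its content has changed.

    `fetch` is injected so that any HTTP client (or a test stub)
    can be used; it takes a URL and returns the raw page bytes.
    """

    def __init__(self, fetch: Callable[[str], bytes]) -> None:
        self._fetch = fetch
        self._last_digest: Dict[str, str] = {}

    def poll(self, url: str) -> Optional[bytes]:
        """Return the new body if it differs from the last poll, else None."""
        body = self._fetch(url)
        digest = hashlib.sha256(body).hexdigest()
        if self._last_digest.get(url) == digest:
            return None
        self._last_digest[url] = digest
        return body


# Usage with a stubbed site instead of a real HTTP call.
pages = {"https://example.com/flows": b"flow=100"}
watcher = PublicationWatcher(lambda url: pages[url])

first = watcher.poll("https://example.com/flows")   # new content
second = watcher.poll("https://example.com/flows")  # unchanged
pages["https://example.com/flows"] = b"flow=80"
third = watcher.poll("https://example.com/flows")   # changed again
```

A scheduler would call `poll` at a short interval for each monitored source and pass any non-`None` result downstream.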
From the bots, data is moved to Azure Blob Storage, and data pipelines load it into relational tables and apply standardization rules for conversion. ETL routines and normalization jobs normalize time zones and units and establish versions of the data.
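The normalization and versioning steps can be sketched as follows. This is an illustration under stated assumptions, not the project's SSIS logic: the record fields, the choice of UTC and gigajoules as canonical forms, and the names `normalize` and `VersionedStore` are all hypothetical.

```python
from datetime import datetime, timezone
from typing import Dict, List, Tuple

# Assumed canonical unit: gigajoules (1 TJ = 1,000 GJ, 1 PJ = 1,000,000 GJ).
TO_GJ = {"GJ": 1.0, "TJ": 1_000.0, "PJ": 1_000_000.0}


def normalize(raw: dict) -> dict:
    """Convert a scraped record to UTC timestamps and GJ quantities."""
    local = datetime.fromisoformat(raw["timestamp"])
    if local.tzinfo is None:
        raise ValueError("timestamp must carry a UTC offset")
    return {
        "site": raw["site"],
        "observed_at_utc": local.astimezone(timezone.utc),
        "quantity_gj": raw["quantity"] * TO_GJ[raw["unit"]],
    }


class VersionedStore:
    """Keeps every revision of a (site, timestamp) reading."""

    def __init__(self) -> None:
        self._rows: Dict[Tuple[str, datetime], List[dict]] = {}

    def upsert(self, record: dict) -> int:
        """Store the record and return its version number.

        A republished value that is unchanged does not create a new version.
        """
        key = (record["site"], record["observed_at_utc"])
        history = self._rows.setdefault(key, [])
        if history and history[-1]["quantity_gj"] == record["quantity_gj"]:
            return history[-1]["version"]
        versioned = {**record, "version": len(history) + 1}
        history.append(versioned)
        return versioned["version"]


store = VersionedStore()
rec = normalize({"site": "A", "timestamp": "2024-03-01T09:30:00+10:00",
                 "quantity": 1.5, "unit": "TJ"})
v1 = store.upsert(rec)
v2 = store.upsert({**rec, "quantity_gj": 1400.0})  # site revised the figure
```

Keeping old versions rather than overwriting them is what lets later reports show how a published figure was revised over time.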
Data lineage is established and the data is brought into a structured format before being fed into the data lake for reporting.
Tableau-based interactive dashboards showcase the flow of gas, variations in flow, and trends based on past data.
The system helps users identify outages within seconds of their publication on the source site, which supports reporting and alerting of stakeholders.
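The source does not describe how outages are recognized. As one possible illustration, the sketch below flags a reading that falls well below a trailing average; the function name `detect_outage`, the 50% threshold, and the message format are assumptions made for this example.

```python
from statistics import mean
from typing import List, Optional


def detect_outage(history_gj: List[float], latest_gj: float,
                  drop_ratio: float = 0.5) -> Optional[str]:
    """Return an alert message if the latest flow is far below the baseline.

    The baseline is the mean of recent readings; `drop_ratio` is the
    assumed fraction of the baseline below which a reading is suspicious.
    """
    if not history_gj:
        return None
    baseline = mean(history_gj)
    if baseline > 0 and latest_gj < drop_ratio * baseline:
        return (f"Possible outage: flow {latest_gj:.0f} GJ "
                f"vs baseline {baseline:.0f} GJ")
    return None


alert = detect_outage([1000.0, 1020.0, 980.0], 120.0)
no_alert = detect_outage([1000.0, 1020.0, 980.0], 900.0)
```

In practice the returned message would be handed to whatever notification channel the stakeholders use.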
The solution is a highly scalable model: new web sources from different regions can be added, and their data flows through the same workflow.
A variety of reports are generated to support the customer’s business. Some of the reports are:
The entire solution is implemented on cloud infrastructure with a disaster recovery server to ensure 99.9% uptime and business continuity even during disasters.
Content is delivered in multiple formats, such as XML and CSV.
Proper data archival ensures that the customer has access to historical data.
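Delivery of the same records in both CSV and XML, as mentioned above, might look like the following sketch. It uses only the Python standard library; the field names, the `records`/`record` element names, and the helper names `to_csv` and `to_xml` are assumptions for illustration.

```python
import csv
import io
import xml.etree.ElementTree as ET
from typing import Dict, List


def to_csv(rows: List[Dict[str, object]], fields: List[str]) -> str:
    """Serialize rows as CSV text with a header line."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=fields, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


def to_xml(rows: List[Dict[str, object]], fields: List[str]) -> str:
    """Serialize rows as a <records> document, one <record> per row."""
    root = ET.Element("records")
    for row in rows:
        rec = ET.SubElement(root, "record")
        for field in fields:
            ET.SubElement(rec, field).text = str(row[field])
    return ET.tostring(root, encoding="unicode")


rows = [{"site": "A", "quantity_gj": 1500.0}]
fields = ["site", "quantity_gj"]
csv_text = to_csv(rows, fields)
xml_text = to_xml(rows, fields)
```

Driving both writers from one list of fields keeps the two formats consistent when columns are added later.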