A renowned Canada-based software service provider to the petroleum supply and trading sector needed to extract and aggregate about 160 shipping data points from a daily set of 300 cargo inspection reports involving more than 700 file activities. The extracted data would feed the products and platforms the provider offers to shipping and oil refinery organizations, helping them run their business operations effectively and mitigate risks.
The data had to be extracted from mostly unstructured sources: the 300 inspection reports generated each day. Pulling 160 data points from these reports was challenging because they arrived daily in varied formats such as XML, emails and PDFs.
The Mobius technology team reviewed the problem and the business need and arrived at a solution that would extract data from unstructured formats and keep pace with the reports being generated daily. The solution Mobius provided to extract all the required attributes was completely automated.
The solution involved a smart extraction tool with two major components: an OCR component that converted the input PDFs into machine-readable documents, and a data extraction component that extracted the expected attributes from each document, including cargo owner information, inspection company details, cargo transfer date, vessel name, port location, type of cargo handled and report generation date.
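To make the second component more concrete, here is a minimal sketch of attribute extraction from OCR-ed text. It is not the Mobius tool: it swaps in plain regular expressions for illustration, and the field labels ("Vessel Name:", "Port:" and so on), the PATTERNS table and the extract_attributes function are all hypothetical. Real inspection reports would vary far more in layout.

```python
import re
from typing import Dict, Optional

# Hypothetical label patterns, assumed for illustration only.
# Each pattern captures the rest of the line after a "Label:" prefix.
PATTERNS = {
    "vessel_name": re.compile(r"\bVessel(?: Name)?\s*:\s*(.+)", re.IGNORECASE),
    "port_location": re.compile(r"\bPort\s*:\s*(.+)", re.IGNORECASE),
    "cargo_type": re.compile(r"\bCargo(?: Type)?\s*:\s*(.+)", re.IGNORECASE),
    "cargo_transfer_date": re.compile(
        r"\bTransfer Date\s*:\s*(\d{4}-\d{2}-\d{2})", re.IGNORECASE
    ),
}


def extract_attributes(text: str) -> Dict[str, Optional[str]]:
    """Return one value per attribute, or None when the label is absent."""
    result: Dict[str, Optional[str]] = {}
    for name, pattern in PATTERNS.items():
        match = pattern.search(text)
        result[name] = match.group(1).strip() if match else None
    return result


# Made-up sample standing in for the text produced by the OCR component.
SAMPLE = """Vessel Name: MT Example
Port: Sample Harbour
Cargo Type: Crude Oil
Transfer Date: 2020-01-15
"""

record = extract_attributes(SAMPLE)
```

Missing attributes come back as None rather than raising an error, so a downstream step can count how many of the expected data points were actually found in each report.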
To boost the overall productivity of our tool, the data extraction component was reinforced with machine learning, which simplified the task of locating the attributes to be extracted in the reports. The trained machine learning model classified the OCR-ed documents and captured the data from the inspection reports. The extracted data points were then pushed in JSON format to the client’s API.
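The final hand-off step could look roughly like the sketch below, which serializes one extracted record to JSON and prepares an HTTP POST request. The endpoint URL, the field names and the build_request helper are assumptions made for illustration; the client's actual API, schema and authentication are not described here. The request is only constructed, not sent.

```python
import json
import urllib.request


def build_request(record: dict, url: str) -> urllib.request.Request:
    """Wrap an extracted record in a JSON POST request (not yet sent)."""
    body = json.dumps(record, sort_keys=True).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical record and placeholder endpoint, for illustration only.
example_record = {"vessel_name": "MT Example", "port_location": "Sample Harbour"}
request = build_request(example_record, "https://api.example.invalid/inspections")

# Sending would be a separate step, for example:
# with urllib.request.urlopen(request) as response:
#     print(response.status)
```

Keeping request construction separate from sending makes the payload easy to inspect or log before anything reaches the client's API.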
The Mobius solution automated the task of scanning the inspection documents in their various formats and extracting the required information. This automation ensured that the data collected had 98% consistency in output, while the advanced machine learning technique cut turnaround time (TAT) by about 60% and assured an activity coverage of 97%.