
Intelligent Data Extraction (IDX)

An AI-powered cognitive information extraction platform that extracts contextual information from unstructured sources.

Platform Overview

Business growth in the digital world is often slowed by the effort needed to extract and process documents. Enterprises pursuing operational efficiency, productivity and business growth are challenged by the complexity of their domains and processes, and by the sheer variety of document data the business needs to consume. Their digital transformation journeys frequently stall because traditional document processing systems lack the intelligence to handle this work efficiently.

So far, businesses have approached this problem by adding staff to the front, middle and back offices. As a result, people spend considerable time and manual effort on these documents instead of applying their creativity to more valuable tasks.

IDX is an intelligent, contextual, cloud-native, AI-powered, purpose-built platform that addresses real-world challenges in the document-to-data space.

IDX leverages state-of-the-art AI technologies such as computer vision, machine learning, deep learning and Natural Language Processing to transform unstructured data in documents into structured data that integrates seamlessly with existing business processes.

The platform provides the intelligence and flexibility to classify, extract, validate and enrich document data, turning mass and dark data sources into reliable information sets that support insight-based decision making.

Product Highlights

Contextual Extraction:

IDX delivers contextual, meaningful classification and extraction from real-world documents to help automate document-based workflows.

Complete Ownership of Customer Value:

The platform’s capabilities are built from the ground up with no dependencies on third-party OCR tools, providing complete ownership to deliver accuracy and value to customers.

Last Mile Solution:

Build end-to-end solutions for real-world use cases with post-extraction capabilities that can transform existing business processes and drive end-to-end automation.

Cloud-Native Experience:

Built on a best-of-breed architecture with a cloud-native, multi-tenant and API-first approach, so customer data stays secure and isolated while integration remains seamless.

Functional Capabilities

Doc Sources

  • Internal
  • External

Input Types

  • Semi-structured
  • Unstructured

Formats

  • Word
  • Excel
  • PDF
  • Images
  • Text

1. Classification

  • Image-based
  • Text-based
  • Rule-based

2. Extraction

  • Domain-trained NER
  • Pipeline of NLUs
  • Multi-mode OCR
  • ML models, layered on top of OCR

3. Normalisation

  • Input Models
  • Rules and AI-based output model
  • Semantic maps

4. Validation

  • Rule-based
  • External Data

5. Enrichment

  • Data Enrichment via connections
  • APIs

6. Verification

  • Human in the loop to triage and decide on exceptions, errors and approvals

7. Integration

  • Integrate structured output with the target system (e.g., underwriting)

Output Types: JSON/CSV

Product Features

Classification

IDX provides AI model-based Classification: it can identify the different document types in an uploaded file, even when they are combined in a single PDF or TIFF file. It splits the uploaded file by document type, which removes the practical need to order and sort pages in the scanned file before any further processing.
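The splitting step can be pictured with a small sketch. It assumes a classifier has already assigned one label to each page of the uploaded file; the function name, the label values and the rule of grouping consecutive pages with the same label are all illustrative assumptions, not the IDX API or its actual splitting logic. The sketch also does not reorder pages.

```python
from itertools import groupby


def split_by_type(page_labels):
    """Group consecutive pages that share a predicted label into documents.

    page_labels: one label per page, in scan order (hypothetical
    classifier output). Returns (label, [page_numbers]) tuples with
    1-based page numbers.
    """
    documents = []
    numbered = enumerate(page_labels, start=1)
    # groupby only merges adjacent items, so a label that reappears
    # later in the file starts a new document.
    for label, group in groupby(numbered, key=lambda item: item[1]):
        documents.append((label, [page for page, _ in group]))
    return documents


# Hypothetical labels for a five-page scanned file.
labels = ["invoice", "invoice", "claim_form", "id_card", "id_card"]
print(split_by_type(labels))
# [('invoice', [1, 2]), ('claim_form', [3]), ('id_card', [4, 5])]
```

Each resulting group could then be passed to extraction as its own document.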

Extracts

The platform then contextually Extracts data at the field, entity, checkbox, table and section levels, and adapts to variations in incoming documents. This sets it apart from traditional OCR tools in the market, which rely mainly on the position of content on the page.

Normalise

IDX provides post-extraction capabilities to Normalise the output to client-specific schema requirements, so that all extracted data is translated into a standard set of fields. IDX can Validate the extracted data against configurable business rules to identify discrepancies or exceptions within or across documents. It also provides Enrichment by integrating with third-party data sources, which helps address data coverage and data quality gaps in the submitted documents.
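As a rough illustration of normalisation followed by rule-based validation, the sketch below maps document-specific labels onto a standard field set and then checks the result against a list of rules. The field map, the rule list and the function names are hypothetical and chosen only for this example; they are not taken from IDX.

```python
# Hypothetical map from labels seen on documents to standard field names.
FIELD_MAP = {
    "Inv No.": "invoice_number",
    "Invoice #": "invoice_number",
    "Total Due": "total_amount",
    "Amount": "total_amount",
}

# Hypothetical business rules: (description, check) pairs.
RULES = [
    ("invoice_number is required", lambda r: bool(r.get("invoice_number"))),
    ("total_amount must be non-negative", lambda r: r.get("total_amount", 0) >= 0),
]


def normalise(extracted, field_map=FIELD_MAP):
    """Rename extracted labels to standard field names; drop unmapped labels."""
    return {field_map[k]: v for k, v in extracted.items() if k in field_map}


def validate(record, rules=RULES):
    """Return the descriptions of every rule the record fails."""
    return [description for description, check in rules if not check(record)]


record = normalise({"Inv No.": "A-17", "Total Due": -5.0, "Page": 1})
print(record)            # {'invoice_number': 'A-17', 'total_amount': -5.0}
print(validate(record))  # ['total_amount must be non-negative']
```

Keeping the map and the rules as plain data mirrors the idea of configurable business rules: they can change per client without touching the functions.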

Into the loop

An operational user can be brought into the loop through an intuitive point-and-click user interface to handle exceptions where the extraction falls out of the straight-through processing (STP) flow. The user's validation also serves as a feedback loop: the machine learning model is retrained on user inputs to improve accuracy over time.
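One common way to decide when a document leaves the straight-through flow is a per-field confidence threshold. The sketch below assumes that approach; the threshold value, the data shapes and the function names are illustrative assumptions, and the source does not say how IDX makes this decision or how it stores feedback.

```python
def route(fields, threshold=0.9):
    """Decide whether a document can skip human review.

    fields: dict of field name -> (value, confidence), a hypothetical
    extraction result. Returns ("stp", []) when every field meets the
    threshold, else ("review", [names of low-confidence fields]).
    """
    low = [name for name, (_, conf) in fields.items() if conf < threshold]
    return ("review", low) if low else ("stp", [])


def apply_corrections(fields, corrections):
    """Merge reviewer corrections and collect them as feedback samples."""
    updated = dict(fields)
    feedback = []
    for name, new_value in corrections.items():
        old_value, _ = fields[name]
        updated[name] = (new_value, 1.0)  # treat reviewer input as confirmed
        feedback.append(
            {"field": name, "predicted": old_value, "corrected": new_value}
        )
    return updated, feedback


fields = {"name": ("Jane", 0.98), "dob": ("1990-13-01", 0.55)}
print(route(fields))  # ('review', ['dob'])

updated, feedback = apply_corrections(fields, {"dob": "1990-01-13"})
print(route(updated))  # ('stp', [])
```

The feedback list is the kind of record that could later feed retraining; how that retraining is performed is outside this sketch.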

Seamless pluggability

IDX is modular, and its API-driven design allows it to plug seamlessly into upstream and downstream business processes. The transformed data can be consumed in formats such as JSON or CSV.
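For a downstream consumer, the two output formats might look like the following minimal sketch, which serialises the same records to JSON and to CSV using only the Python standard library. The record fields and function names are made up for illustration and do not reflect the actual IDX output schema.

```python
import csv
import io
import json


def to_json(records):
    """Serialise a list of flat records to a JSON string."""
    return json.dumps(records, indent=2)


def to_csv(records):
    """Serialise a list of flat records to CSV text with a header row."""
    # Use the union of all keys so records with missing fields still fit;
    # DictWriter fills missing values with an empty string by default.
    fieldnames = sorted({key for record in records for key in record})
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames)
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


# Hypothetical extracted records.
records = [
    {"invoice_number": "A-17", "total_amount": 120.5},
    {"invoice_number": "A-18"},
]
print(to_json(records))
print(to_csv(records))
```

JSON preserves types and nesting, which suits API integrations; CSV is flatter and easier to load into spreadsheets or batch jobs.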
