ABOUT NORCOM
Manage your large, constantly growing, complex document pool. Bundle all documents in one archive system and make them searchable (including scans). DaSense also shows the quality of the documents in terms of readability and tonality (sentiment analysis).
The task
Invoices must be checked for consistency, and anomalies such as duplicate invoices must be made visible. Because this is a complex manual process, it should be automated as far as possible; with advanced analytics, even rare and complex anomalies should be easy to find.
The challenge
Invoices were available as scans of varying quality. The number and order of pages varied, and the information they contained was both structured (tables) and unstructured (free text), each with strong structural variations.
Our solution
We created a pipeline consisting of OCR, table recognition and information extraction. An integral part of the pipeline was an automatic evaluation of the quality of the extraction results, with the option of controlled optimization. Invoices were merged by detecting near-duplicates, and duplicate entries and other anomalies were made visible using advanced analytics. A scalable architecture makes the pipeline usable even on large data volumes and enables the analysis of statistical anomalies.
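As a rough illustration of the near-duplicate step described above, the following Python sketch compares invoice texts by the overlap of their word shingles (Jaccard similarity). It is not NorCom's implementation: the function names, the sample invoices and the 0.6 threshold are assumptions chosen for this example, and a real pipeline would work on OCR output and would likely need a more scalable comparison scheme.

```python
from itertools import combinations


def shingles(text, k=3):
    """Return the set of k-word shingles of a lower-cased text."""
    words = text.lower().split()
    if len(words) < k:
        return {" ".join(words)} if words else set()
    return {" ".join(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets; two empty sets count as identical."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def near_duplicates(docs, threshold=0.6):
    """Return (name, name, score) for every pair at or above the threshold."""
    sets = {name: shingles(text) for name, text in docs.items()}
    pairs = []
    for (n1, s1), (n2, s2) in combinations(sets.items(), 2):
        score = jaccard(s1, s2)
        if score >= threshold:
            pairs.append((n1, n2, round(score, 2)))
    return pairs


# Hypothetical invoice texts, as they might look after OCR.
invoices = {
    "inv_001": "Invoice 4711 ACME GmbH 10 units widget total 250.00 EUR",
    "inv_002": "Invoice 4711 ACME GmbH 10 units widget total 250.00 EUR page 1",
    "inv_003": "Invoice 9000 Other AG consulting services total 1200.00 EUR",
}
print(near_duplicates(invoices))  # [('inv_001', 'inv_002', 0.8)]
```

Pairs flagged this way would still be candidates for merging or manual review rather than confirmed duplicates; the threshold controls how many of them reach a reviewer.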
The customer benefit
Thanks to automation, only a few invoices still need to be checked manually, which leads to significant time and cost savings. Advanced analytics also significantly increases the detection rate for anomalies.
What sounds trivial at first glance causes problems for many AI users in practice: heterogeneous, distributed data must be made available to the AI system in a form that can be evaluated. Ingest-App reliably takes care of this.
Functions: Recording of all file types, creation date, authors, MDF ingest, preparation for full-text search, deduplication, multidimensional filing, information extraction
Currently in use, e.g., in measurement data analysis in the development department of an automobile manufacturer
Someone keeps track! Labeling thoroughly checks documents and provides them with metadata. In this way, no information is lost and those who search will always find the right thing!
Functions: Weak Learning & Machine Learning, Speech Recognition, Author Recognition, Classification, Named Entity Recognition
Currently in use, e.g., to determine contractual partners
You have many extensive documents and one question: is the text positive or negative overall? Then the Sentiment app can help: it assesses the sentiment of each sentence, calculates a sentiment score and provides the ten most positive and ten most negative sentences per document.
Features: filter on sentiment scores; evaluation as positive, negative or neutral; German-Sentiment-Bert (based on the BERT language model); sentencizer: spaCy; sentiment score
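The per-sentence scoring and ranking described above can be sketched roughly as follows. This is only an illustration: a tiny word list stands in for the actual sentiment model, a regular expression stands in for the sentence splitter, and the document score as the mean of the sentence scores is an assumption. None of the names below come from DaSense.

```python
import re

# Toy lexicon: a stand-in for a real sentiment model, for illustration only.
POSITIVE = {"good", "great", "excellent", "reliable"}
NEGATIVE = {"bad", "poor", "late", "faulty"}


def split_sentences(text):
    """Naive sentence splitter: break after '.', '!' or '?'."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def score_sentence(sentence):
    """Positive word count minus negative word count."""
    words = re.findall(r"[a-z]+", sentence.lower())
    return sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)


def summarize(text, top_n=10):
    """Return (document score, most positive, most negative sentences)."""
    scored = [(score_sentence(s), s) for s in split_sentences(text)]
    # Assumption: the document score is the mean of the sentence scores.
    document_score = sum(score for score, _ in scored) / len(scored) if scored else 0.0
    ranked = sorted(scored, key=lambda pair: pair[0], reverse=True)
    most_positive = [s for score, s in ranked if score > 0][:top_n]
    most_negative = [s for score, s in reversed(ranked) if score < 0][:top_n]
    return document_score, most_positive, most_negative


sample = (
    "The service was excellent and reliable. Delivery was late. "
    "The manual is fine. The device was faulty and the support was poor."
)
doc_score, top_pos, top_neg = summarize(sample)
print(doc_score)
print(top_pos)
print(top_neg)
```

Swapping `score_sentence` for a call to a trained model would leave the ranking logic unchanged, which is the point of keeping scoring and ranking separate in this sketch.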
The Language app is a language talent: it recognizes the language of a document based on the entire text and allows filtering by this characteristic.
Features: language recognition, creation of a filter for the language, confidence of recognition
Your advantages with DaSense
tested
Organization - any ordering dimensions
DaSense offers multidimensional storage structures, so-called facets, which can be combined and filtered as desired. There are also clear annotations for documents and clear versioning.
Features:
- Property facets: Multidimensional filing structure based on document properties such as language, document type, etc.
- Workflow facets: Multidimensional storage structure according to processing status, evaluation, etc.
- Annotations: Linking properties to individual parts of the document, i.e. sentences, sections or images
Advantages
- Supplementing the existing folder structure with practically relevant categories
- Illustration of complex relationships
- Linking multiple facets
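To make the idea of combinable facets more concrete, here is a minimal Python sketch of documents that carry several facets and a filter that links them. The `Document` class, the facet names and the sample files are hypothetical and only illustrate the concept; they do not reflect DaSense's actual data model or API.

```python
from dataclasses import dataclass, field


@dataclass
class Document:
    name: str
    facets: dict = field(default_factory=dict)  # facet name -> value


def filter_by_facets(documents, **criteria):
    """Keep documents whose facets match every given criterion."""
    return [
        doc for doc in documents
        if all(doc.facets.get(facet) == value for facet, value in criteria.items())
    ]


# Hypothetical documents with property facets (language, doc_type)
# and a workflow facet (status).
docs = [
    Document("contract_a.pdf", {"language": "de", "doc_type": "contract", "status": "reviewed"}),
    Document("invoice_b.pdf", {"language": "de", "doc_type": "invoice", "status": "open"}),
    Document("contract_c.pdf", {"language": "en", "doc_type": "contract", "status": "open"}),
]

german_contracts = filter_by_facets(docs, language="de", doc_type="contract")
print([d.name for d in german_contracts])  # ['contract_a.pdf']
```

Because each facet is just another key, property facets and workflow facets can be mixed in one query, for example `filter_by_facets(docs, doc_type="contract", status="open")`.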