Rapid development of custom models using NLP and Deep Learning
GrayBox helps data scientists build and deploy state-of-the-art NLP and Deep Learning models in a fraction of the time it takes with traditional data science methods. GrayBox has three components.
The only end-to-end solution for data scientists in the legal and regulatory space that automatically handles the hard work of data preparation.
Our web-based annotator system manages the whole process: uploading and digitizing documents, annotating them in a few simple steps, and collecting the resulting data samples.
Our web-based annotator system supports both NLP and image annotations for corporate documents with extensive formatting and rich layouts. It provides a multi-user set-up in which people can annotate and validate documents simultaneously, making the whole process easy and productive and yielding high-quality annotations. The annotations can be used internally by our GrayBox training module or by OEM clients that train their own models.
Create annotation projects, define tasks, invite annotators in a fraction of the time, and monitor the progress of your project, all in one intuitive environment.
An intuitive environment that is easy for non-technical people to use, so they can focus on completing their tasks without worrying about technical details.
Upload real corporate documents, which are digitized automatically, without worrying about reformatting, data cleansing or any other preprocessing. Our system takes care of all that, so you can focus on the actual annotation process.
Extract annotation files from any project with a single click, ready to be consumed by GrayBox or any third-party AutoXML software solution.
Train and evaluate your models faster with our configurable and extensible document analysis engine, built on state-of-the-art neural networks.
The engine lets you run complex pipelines of neural networks on documents and extract and structure the results.
Flexible Sampling: entity recognition, document classification, sentence classification and section classification, suited to long corporate documents where the relevant information is a needle in a haystack.
Standard Neural Network Architecture Libraries: We support text classification and sequence tagging with hyperparameter optimisation, so you can reach state-of-the-art models with a few quick configuration steps.
Word Vectors for Sample (Feature) Encoding: Word vectors form the basis of most recent advances in natural language processing. Encode your samples with embedding models such as Word2vec, GloVe and ELMo.
Evaluation: We produce clear reports on the performance of your models to help you to evaluate and understand where to focus in order to improve your model.
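As a rough illustration of the kind of per-label figures an evaluation report typically contains, the sketch below computes precision, recall and F1 from gold and predicted labels. It is not GrayBox's actual report format: the function name `classification_report` and the labels `clause` and `other` are hypothetical and chosen only for this example.

```python
from collections import Counter


def classification_report(gold, predicted):
    """Return {label: (precision, recall, f1)} for two equal-length label lists.

    Hypothetical helper for illustration only; not part of GrayBox.
    """
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")

    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, predicted):
        if g == p:
            tp[g] += 1      # correct prediction for label g
        else:
            fp[p] += 1      # label p was predicted, but wrongly
            fn[g] += 1      # gold label g was missed

    report = {}
    for label in sorted(set(gold) | set(predicted)):
        predicted_count = tp[label] + fp[label]
        gold_count = tp[label] + fn[label]
        precision = tp[label] / predicted_count if predicted_count else 0.0
        recall = tp[label] / gold_count if gold_count else 0.0
        total = precision + recall
        f1 = 2 * precision * recall / total if total else 0.0
        report[label] = (precision, recall, f1)
    return report


# Made-up sentence-classification labels, purely for demonstration.
gold = ["clause", "clause", "other", "other", "other"]
predicted = ["clause", "other", "other", "other", "clause"]

report = classification_report(gold, predicted)
for label, (p, r, f) in report.items():
    print(f"{label:8s} precision={p:.2f} recall={r:.2f} f1={f:.2f}")
```

A label with low recall suggests adding more annotated examples of that label, while low precision suggests reviewing how it is distinguished from similar labels.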
Integrate your model with your application
After training your models, you need to deploy them as part of a document processing pipeline so they can be used by end users or as a component of other systems within your company. We offer an infrastructure for this with the following characteristics.
Ingest: Use our VisionAI technology to ingest scanned corporate documents of different types. The system is designed to recognise the layout of corporate documents with great accuracy.
Real Documents: Send your documents (PDF, scanned PDF, DOC, DOCX) via the API and receive the output of our pre-trained models or your own trained models in JSON, XML or HTML.
Powerful API: Use our API to connect to your models and embed them in your solution.
Scalability: Highly available and horizontally scalable. Deploy on any Docker-based environment or use our high-performance cloud.
What our customers say