Custom scanner guide

The Genius Scan SDK lets application developers add a scanning module built on the same technology that powers the Genius Scan app.

The Core and OCR modules of the SDK let you add a fully customizable scan flow to any native app.

Together, these modules provide:

Document processing algorithms, which apply image treatment to image objects.
A PDF generator to create a PDF document from the processed images.
UI elements that help build the main screens of a scanning application.
Text recognition to extract the text present in the scans.

Prerequisites

This guide assumes that you have followed the Getting Started guide:

You have integrated the gssdk library in your app with Gradle.
You have initialized the SDK with the license key.

Core document processing

Document processing is split between two classes: DocumentDetector, which mainly handles real-time document detection on the camera preview, and ScanProcessor, which applies image processing algorithms such as perspective correction and filters to a single scan.

Document detection

The edge detection takes in an image and returns a quadrangle representing the four corners of the detected document.

Kotlin

val documentDetector = DocumentDetector.create(context)
val quadrangle = documentDetector.detectDocument(imageToAnalyzePath)

Scan processing

The ScanProcessor class takes an image and a list of operations to apply to it. It returns the processed image, along with the treatments it actually performed.

You can apply the following operations to an image:

Perspective correction
Curvature correction
Document enhancement
Rotation
Readability (blur) detection

Kotlin

val scanProcessor = ScanProcessor(context)

// `quadrangle` comes from a previous detection step
// (for example, documentDetector.detectDocument).

// Output the result as a bitmap...
val outputConfiguration = OutputConfiguration.bitmap()
// ...or, alternatively, as a file in a destination folder:
// val outputConfiguration = OutputConfiguration.file(destinationFolder)

val configuration = ScanProcessor.Configuration(
  PerspectiveCorrection.withQuadrangle(quadrangle),
  CurvatureCorrection.automatic(),
  Enhancement.automatic(),
  Rotation.automatic(),
  Readability.enabled(),
  outputConfiguration
)

val result = scanProcessor.process(imageToProcess, configuration)

PDF generation

The PDF generation module provides a few objects to generate a PDF file.

PDF Page

An object wrapping the information to create a PDF page.

Kotlin

val page = PDFPage(imageFile, PDFSize(8.27f, 11.69f))

Java

PDFPage page = new PDFPage(imageFile, new PDFSize(8.27f, 11.69f));
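
The two numbers passed to PDFSize in the example above match an A4 sheet (210 mm × 297 mm) expressed in inches. The sketch below shows how such values can be derived for other paper formats. It assumes that PDFSize expects inches, which this guide does not state explicitly, and PaperSizes is a hypothetical helper that is not part of the SDK.

```java
/** Hypothetical helper, not part of the SDK: converts paper dimensions to inches. */
final class PaperSizes {
    private static final float MM_PER_INCH = 25.4f;

    private PaperSizes() {}

    /** Converts a length in millimeters to inches. */
    static float mmToInches(float millimeters) {
        return millimeters / MM_PER_INCH;
    }

    /** Rounds to two decimals, the precision used in the guide's A4 example. */
    static float roundToHundredths(float value) {
        return Math.round(value * 100f) / 100f;
    }
}
```

For instance, roundToHundredths(mmToInches(210f)) and roundToHundredths(mmToInches(297f)) give the 8.27 and 11.69 used above, and US Letter (215.9 mm × 279.4 mm) comes out as 8.5 × 11.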

PDF Document

A PDF document representing a collection of PDF pages.

Kotlin

val document = PDFDocument(listOf(page1, page2), title, password, keywords, creationDate, updateDate)

Java

PDFDocument document = new PDFDocument(Arrays.asList(page1, page2), title, password, keywords, creationDate, updateDate);

PDF Generator

The generator takes in a PDFDocument object, and writes the corresponding PDF file.

Kotlin

val configuration = DocumentGenerator.Configuration(outputFile)
DocumentGenerator(context).generatePDFDocument(pdfDocument, configuration)

Java

DocumentGenerator.Configuration configuration = new DocumentGenerator.Configuration(outputFile);
new DocumentGenerator(context).generatePDFDocument(pdfDocument, configuration);

UI components

Live Capture Screen

The “capture” view displays a camera preview and takes care of setting up the entire camera stack for you. The view comes without buttons or any other UI element, so you are free to design the screen as you want and need to add these controls in your own implementation.

Include the ScanFragment in an Activity that implements ScanFragment.CameraCallbackProvider. You can refer to the ScanActivity class in the CustomDemo application for an example.

To enable or disable live document detection, use scanFragment.setRealTimeDetectionEnabled. To implement an auto trigger or react to document detection events, set a listener with scanFragment.setBorderDetectorListener and implement its callbacks.
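
One way to build an auto trigger on top of detection events is to fire the capture only once the detected document has stayed in place for a few consecutive frames. The sketch below is plain Java and deliberately independent of the SDK. The AutoTrigger class, its parameters, and the eight-float corner array are assumptions made for illustration, since this guide does not show the callback signatures of the border detector listener.

```java
/**
 * Hypothetical helper, not part of the SDK: decides when to fire an automatic
 * capture based on how stable the detected corners are across frames.
 */
final class AutoTrigger {
    private final int requiredStableFrames;
    private final float tolerance;
    private float[] lastCorners;
    private int stableFrames;

    /**
     * @param requiredStableFrames consecutive similar detections needed before triggering
     * @param tolerance maximum movement allowed per coordinate, in the unit of the corners
     */
    AutoTrigger(int requiredStableFrames, float tolerance) {
        if (requiredStableFrames < 1) {
            throw new IllegalArgumentException("requiredStableFrames must be at least 1");
        }
        this.requiredStableFrames = requiredStableFrames;
        this.tolerance = tolerance;
    }

    /**
     * Call once per detection event with the corners as x0, y0, ..., x3, y3,
     * or with null when no document was found.
     *
     * @return true when the capture should be triggered
     */
    boolean onDetection(float[] corners) {
        if (corners == null || corners.length != 8) {
            reset();
            return false;
        }
        stableFrames = isCloseToLast(corners) ? stableFrames + 1 : 1;
        lastCorners = corners.clone();
        if (stableFrames >= requiredStableFrames) {
            reset(); // start over so the following frames do not trigger again at once
            return true;
        }
        return false;
    }

    private boolean isCloseToLast(float[] corners) {
        if (lastCorners == null) {
            return false;
        }
        for (int i = 0; i < corners.length; i++) {
            if (Math.abs(corners[i] - lastCorners[i]) > tolerance) {
                return false;
            }
        }
        return true;
    }

    private void reset() {
        lastCorners = null;
        stableFrames = 0;
    }
}
```

In an integration, the listener callback would convert whatever quadrangle type it receives into the corner array, call onDetection, and take the picture when it returns true. The frame count and tolerance are tuning values to adjust on real devices.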

Edit Frame Screen

The edit frame screen lets the user adjust the auto-detected edges of a document.

Subclass BorderDetectionImageView and include it in an Activity layout. You can refer to the BorderDetectionActivity class in the CustomDemo application for an example.

Text Recognition (OCR)

The OCR module provides a way to extract text and its layout from scanned images. The extraction outputs the text in two different formats: raw text and XML containing both the text and its layout (also called hOCR). It’s then possible to generate a PDF document using this information to make it searchable and selectable.

Extract text from images

Specify the languages to use for OCR as BCP-47 codes in the languageTags property. We recommend setting as few languages as possible, as the text recognition time increases linearly with the number of requested languages.

All common languages are supported; see the list of supported languages.
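
Since every extra language adds to the recognition time, it can help to clean up the tag list before handing it to the OCR configuration. The sketch below relies only on java.util.Locale from the standard library to bring tags into canonical BCP-47 form, drop duplicates, and reject ill-formed input. LanguageTags is a hypothetical helper that is not part of the SDK, and the sketch does not check whether a tag is in the SDK's list of supported languages.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

/** Hypothetical helper, not part of the SDK: tidies a list of BCP-47 language tags. */
final class LanguageTags {
    private LanguageTags() {}

    /**
     * Returns the tags in canonical BCP-47 form (for example "en-US"), without
     * duplicates and in their original order.
     *
     * @throws java.util.IllformedLocaleException if a tag is not well-formed BCP-47
     */
    static List<String> normalize(List<String> tags) {
        Set<String> unique = new LinkedHashSet<>();
        for (String tag : tags) {
            // Locale.Builder validates the tag strictly, unlike Locale.forLanguageTag,
            // which silently discards the parts it cannot parse.
            Locale locale = new Locale.Builder().setLanguageTag(tag.trim()).build();
            unique.add(locale.toLanguageTag());
        }
        return new ArrayList<>(unique);
    }
}
```

With this helper, a list such as "en-us", "EN-US", "fr" collapses to "en-US", "fr", while a tag written with an underscore, such as "en_US", is rejected.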

Kotlin

val ocrConfiguration = OcrConfiguration(listOf("en-US"))

val result = ocrProcessor.processImage(image, ocrConfiguration, progressListener)

val xmlTextLayout = result.textLayout

Java

OcrConfiguration ocrConfiguration = new OcrConfiguration(Arrays.asList("en-US"));

OcrResult result = ocrProcessor.processImage(image, ocrConfiguration, progressListener);

String xmlTextLayout = result.textLayout;
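
To give an idea of what can be done with the text layout, the sketch below lists the words of an hOCR string together with their bounding boxes. It assumes that the layout follows the usual hOCR conventions, where each word is a span with the class ocrx_word and a title attribute of the form "bbox left top right bottom". This guide does not detail the exact markup produced by the SDK, so check an actual result first. HocrWords is a hypothetical helper that is not part of the SDK.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper, not part of the SDK: lists the words found in an hOCR string. */
final class HocrWords {
    /** A recognized word and its bounding box in image pixels. */
    static final class Word {
        final String text;
        final int left, top, right, bottom;

        Word(String text, int left, int top, int right, int bottom) {
            this.text = text;
            this.left = left;
            this.top = top;
            this.right = right;
            this.bottom = bottom;
        }
    }

    // A <span> without nested tags: attributes in group 1, text content in group 2.
    private static final Pattern SPAN = Pattern.compile("<span([^>]*)>([^<]*)</span>");
    // The box as stored in the title attribute: "bbox left top right bottom".
    private static final Pattern BBOX =
            Pattern.compile("bbox\\s+(\\d+)\\s+(\\d+)\\s+(\\d+)\\s+(\\d+)");

    private HocrWords() {}

    /** Returns the word spans in document order; spans without a bbox are skipped. */
    static List<Word> parse(String hocr) {
        List<Word> words = new ArrayList<>();
        Matcher span = SPAN.matcher(hocr);
        while (span.find()) {
            String attributes = span.group(1);
            if (!attributes.contains("ocrx_word")) {
                continue; // not a word-level element
            }
            Matcher bbox = BBOX.matcher(attributes);
            if (!bbox.find()) {
                continue;
            }
            words.add(new Word(
                    span.group(2).trim(),
                    Integer.parseInt(bbox.group(1)),
                    Integer.parseInt(bbox.group(2)),
                    Integer.parseInt(bbox.group(3)),
                    Integer.parseInt(bbox.group(4))));
        }
        return words;
    }
}
```

The regular expressions keep the sketch short, but they ignore words that contain nested tags and do not decode XML entities such as &amp;amp;. A real XML parser is a safer choice for production code.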

Generate PDF document with text

PDF generation allows a text layout to be provided for each page of the document.

Kotlin

// A4_SIZE stands for a page size such as PDFSize(8.27f, 11.69f)
val page = PDFPage(image.absolutePath, A4_SIZE, xmlTextLayout)
// Generate PDF

Java

// A4_SIZE stands for a page size such as new PDFSize(8.27f, 11.69f)
PDFPage page = new PDFPage(image.getAbsolutePath(), A4_SIZE, xmlTextLayout);
// Generate PDF

Handle characters from various languages

By default, PDF generation uses a standard font that supports the characters of English and Western European languages. If you perform text recognition for another language, you need to specify a font that supports this language’s characters when generating the PDF document.
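
A rough way to decide whether recognized text needs a custom font is to test whether all of its characters fit in a Western European character set. The sketch below uses Windows-1252 as a stand-in for the coverage of the default font. This is an assumption, since the guide does not specify which characters the standard font covers, and FontCoverage is a hypothetical helper that is not part of the SDK.

```java
import java.nio.charset.Charset;

/** Hypothetical helper, not part of the SDK: heuristic check of character coverage. */
final class FontCoverage {
    // Assumption: Windows-1252 approximates "English and Western European" coverage.
    private static final Charset WESTERN = Charset.forName("windows-1252");

    private FontCoverage() {}

    /** Returns true when every character of the text can be encoded in Windows-1252. */
    static boolean fitsWesternCharset(String text) {
        // A fresh encoder per call, because CharsetEncoder instances are not thread-safe.
        return WESTERN.newEncoder().canEncode(text);
    }
}
```

French or Spanish text passes this check, while Cyrillic or Japanese text does not. A failed check suggests providing a font that covers the script before generating the PDF.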
