Custom scanner guide

Introduction

The Genius Scan SDK lets application developers add a scanning module built on the same technology that powers the Genius Scan app.

The Core and OCR modules of the SDK let you build a fully customizable scan flow that can be added to a native app.

The Core module provides:

The SDK includes a licensing system and needs to be initialized with a key to work. Without it, all the methods included in the SDK will fail.

Please refer to the platform-specific API documentation for more details about each method.

Licensing

A license key is needed to initialize the SDK. The key contains the Application ID / Bundle ID of your application and an expiration date. Please contact us if you don’t have a license key yet. If the key is invalid or your license has expired, initialization fails and the SDK does not work. It is good practice to check whether initialization succeeded and, if it did not, to fall back gracefully (e.g. disable the feature or prompt the user to update the application).

iOS

if (![GSK initWithLicenseKey:@"<YOUR LICENSE KEY>"]) {
   // The license is expired or invalid
}

Android

The best place to initialize the SDK is in the main Activity of your application:

try {
   GeniusScanLibrary.init(getApplicationContext(), "<YOUR LICENSE KEY>");
} catch (RuntimeException e) {
   // The license is expired or invalid
}
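
The good practice described above (check whether initialization succeeded, and fall back if it did not) could be sketched as follows. This is a minimal, SDK-independent illustration: ScanFeatureGate and its methods are hypothetical names, not part of the SDK, and the Runnable is a stand-in for the actual call to GeniusScanLibrary.init.

```java
/**
 * Hypothetical helper (not part of the SDK): remembers whether SDK
 * initialization succeeded so the app can hide or disable scanning.
 */
final class ScanFeatureGate {
    private final boolean scanningAvailable;

    /**
     * @param sdkInitializer stand-in for the real initialization call,
     *        e.g. () -> GeniusScanLibrary.init(context, licenseKey)
     */
    ScanFeatureGate(Runnable sdkInitializer) {
        boolean initialized;
        try {
            sdkInitializer.run();
            initialized = true;
        } catch (RuntimeException e) {
            // The license is expired or invalid: keep the app usable without scanning.
            initialized = false;
        }
        this.scanningAvailable = initialized;
    }

    boolean isScanningAvailable() {
        return scanningAvailable;
    }
}
```

An Activity could create the gate once at startup and show its scan button only when isScanningAvailable() returns true.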

Core document processing

We’ve split the document processing operations into two classes: DocumentDetection, which mainly handles the real-time document detection on the camera preview; and ScanProcessing, which applies operations such as perspective correction and filters to a single scan.

Document detection

The edge detection takes in an image and returns a quadrangle representing the four corners of the detected document.

iOS

GSKDocumentDetector *documentDetector = [GSKDocumentDetector new];

NSError *error = nil;
GSKQuadrangleDetectionResult *result = [documentDetector detectQuadrangleFromImage:self.image options:GSKDetectQuadrangleOptionsNone error:&error];
if (!result) {
  NSLog(@"Error while detecting document frame: %@", error);
  return;
}
GSKQuadrangle *quadrangle = result.quadrangle;

Android

DocumentDetection documentDetection;
Quadrangle quadrangle = documentDetection.detectDocument(imageToAnalyzePath);
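
Since detection returns four corners, an app may want to sanity-check a quadrangle (for instance after the user has adjusted it) before using it for perspective correction. The sketch below is an assumption-laden illustration, not SDK code: QuadrangleCheck is a hypothetical helper, corners are plain {x, y} pairs rather than the SDK's Quadrangle type, and they are assumed to be listed in order around the perimeter.

```java
/**
 * Hypothetical helper (not part of the SDK): sanity-checks four corners
 * given in order around the shape, each as {x, y}.
 */
final class QuadrangleCheck {
    private QuadrangleCheck() { }

    /** Returns true if the four corners form a convex, non-degenerate quadrangle. */
    static boolean isConvex(double[][] corners) {
        if (corners == null || corners.length != 4) {
            return false;
        }
        int sign = 0;
        for (int i = 0; i < 4; i++) {
            double[] a = corners[i];
            double[] b = corners[(i + 1) % 4];
            double[] c = corners[(i + 2) % 4];
            // Cross product of edge a->b with edge b->c: its sign gives the turn direction.
            double cross = (b[0] - a[0]) * (c[1] - b[1]) - (b[1] - a[1]) * (c[0] - b[0]);
            if (cross == 0) {
                return false; // three corners on one line
            }
            int turn = cross > 0 ? 1 : -1;
            if (sign == 0) {
                sign = turn;
            } else if (turn != sign) {
                return false; // turn direction changed: concave or self-crossing
            }
        }
        return true;
    }
}
```

If the corners are stored in a different order (for example top-left, top-right, bottom-left, bottom-right), they would need to be reordered before calling isConvex.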

Scan processing

The ScanProcessing class takes an image as input as well as a list of operations to apply to this image. It returns the processed image, as well as updated parameters if some automatic enhancements were involved.

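The operations that can be applied to the image, as the examples below show, include perspective correction, curvature correction, enhancement (filters) and rotation.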

iOS

UIImage *originalImage = 

// If the detection was already applied, we have a quadrangle:
GSKQuadrangle *quadrangle = 

GSKPerspectiveCorrectionConfiguration *perspectiveCorrectionConfiguration = [GSKPerspectiveCorrectionConfiguration perspectiveCorrectionConfigurationWithQuadrangle:quadrangle];
GSKCurvatureCorrectionConfiguration *curvatureCorrectionConfiguration = [GSKCurvatureCorrectionConfiguration curvatureCorrectionConfigurationWithCurvatureCorrection:YES];
GSKEnhancementConfiguration *enhancementConfiguration = [GSKEnhancementConfiguration enhancementConfigurationWithFilter:GSKFilterBlackAndWhite];

NSError *error = nil;
GSKProcessingResult *result = [[GSKDocumentProcessor new] processImage:originalImage
                                    perspectiveCorrectionConfiguration:perspectiveCorrectionConfiguration
                                      curvatureCorrectionConfiguration:curvatureCorrectionConfiguration
                                              enhancementConfiguration:enhancementConfiguration
                                                 rotationConfiguration:nil
                                                   outputConfiguration:[GSKOutputConfiguration defaultConfiguration]
                                                                 error:&error];
if (!result) {
    NSLog(@"Error while processing scan: %@", error);
    return;
}

UIImage *processedImage = [UIImage imageWithContentsOfFile:result.processedImagePath];

Android

ScanProcessor scanProcessor;

// If the detection was already applied, we have a quadrangle:
Quadrangle quadrangle;

Configuration configuration = new Configuration(
  PerspectiveCorrection.withQuadrangle(quadrangle),
  CurvatureCorrection.automatic(),
  Enhancement.automatic()
);

OutputParameters outputParameters = scanProcessor.process(imageToProcessPath, processedImagePath, configuration);

PDF generation

The PDF generation module provides a few objects to generate a PDF file.

PDF Page

An object wrapping the information to create a PDF page.

iOS

GSKPDFPage *page = [[GSKPDFPage alloc] initWithFilePath:imageFilePath inchesSize:[[GSKPDFSize alloc] initWithWidth:8.27 height:11.69] /* size in inches for an A4 sheet */];

Android

PDFPage page = new PDFPage(imageFilePath, new PDFSize(8.27f, 11.69f));
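
The page size above is expressed in inches, while paper formats are often specified in millimeters. The following sketch shows where 8.27 and 11.69 come from for an A4 sheet (210 mm x 297 mm). PageSizes is a hypothetical helper, not part of the SDK.

```java
/** Hypothetical helper (not part of the SDK) for computing page sizes in inches. */
final class PageSizes {
    private static final float MILLIMETERS_PER_INCH = 25.4f;

    private PageSizes() { }

    /** Converts a length in millimeters to inches. */
    static float inchesFromMillimeters(float millimeters) {
        return millimeters / MILLIMETERS_PER_INCH;
    }

    /** Returns {width, height} in inches for an A4 sheet (210 mm x 297 mm). */
    static float[] a4InInches() {
        return new float[] { inchesFromMillimeters(210f), inchesFromMillimeters(297f) };
    }
}
```

The two values returned by a4InInches() round to 8.27 and 11.69, and could be passed as the width and height of the page size.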

PDF Document

A PDF document representing a collection of PDF pages.

iOS

GSKPDFPage *page1 = 
GSKPDFPage *page2 = 

GSKPDFDocument *document = [[GSKPDFDocument alloc] initWithTitle:title password:nil keywords:nil pages:@[ page1, page2 ]];

Android

PDFPage page1 = 
PDFPage page2 = 

PDFDocument document = new PDFDocument(title, password, keywords, Arrays.asList(page1, page2));

PDF Generator

The generator takes in a PDF document and offers the ability to write the PDF file.

iOS

GSKPDFGenerator *generator = [GSKPDFGenerator createWithDocument:document];
[generator generatePDF:outputFilePath];

Android

PDFGenerator generator = PDFGenerator.createWithDocument(document, null, null);
generator.generatePDF(outputFilePath);

UI components

Capture Screen

The “capture” view displays a camera preview and takes care of setting up the entire camera stack for you. The view comes free of buttons or any other UI element, so that you can design it as you want; you need to add these controls yourself in your implementation.

iOS

On iOS, you subclass GSKCameraViewController, and you can customize it as desired. You can refer to the CameraViewController class in the GSSDKDemo for an example.

Android

On Android, include the ScanFragment in an Activity that implements ScanFragment.CameraCallbackProvider. You can refer to the ScanActivity class in the CustomDemo application for an example.

To enable or disable live document detection, use the method scanFragment.setRealTimeDetectionEnabled. To implement auto-trigger or otherwise react to document detection events, set a listener with scanFragment.setBorderDetectorListener and implement its callbacks.
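
One possible auto-trigger policy is to capture only once a document has been detected in several consecutive frames, which avoids firing on a single spurious detection. The sketch below is SDK-independent: AutoTrigger and onFrame are hypothetical names, and the boolean argument is assumed to be derived from whatever the detection listener callbacks report.

```java
/**
 * Hypothetical auto-trigger policy (not part of the SDK): fires once a document
 * has been detected in a given number of consecutive preview frames.
 */
final class AutoTrigger {
    private final int requiredConsecutiveDetections;
    private int consecutiveDetections;

    AutoTrigger(int requiredConsecutiveDetections) {
        if (requiredConsecutiveDetections < 1) {
            throw new IllegalArgumentException("requiredConsecutiveDetections must be >= 1");
        }
        this.requiredConsecutiveDetections = requiredConsecutiveDetections;
    }

    /**
     * Call once per analyzed frame, e.g. from a detection listener callback.
     * @return true when the capture should be triggered
     */
    boolean onFrame(boolean documentDetected) {
        if (!documentDetected) {
            consecutiveDetections = 0; // streak broken
            return false;
        }
        consecutiveDetections++;
        if (consecutiveDetections >= requiredConsecutiveDetections) {
            consecutiveDetections = 0; // start over after triggering
            return true;
        }
        return false;
    }
}
```

The threshold trades responsiveness against accidental captures; a suitable value depends on the frame rate of the preview and would need to be tuned.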

Edit Frame Screen

The edit frame screen lets the user adjust the auto-detected edges of a document.

iOS

On iOS, you subclass GSKEditFrameViewController and you can customize it as desired. You can refer to the EditFrameViewController class in the GSSDKDemo for an example.

Android

On Android, you subclass the BorderDetectionImageView and include it into the layout of an Activity. You can refer to the BorderDetectionActivity class in the CustomDemo application for an example.

Text Recognition (OCR)

The OCR module provides a way to extract text and its layout from scanned images. The extraction outputs the text in two formats: raw text, and XML containing both the text and its layout (also called hOCR). This information can then be used to generate a PDF document that is searchable and selectable.

Extract text from images

Text recognition relies on the Tesseract library and needs training data files to be present on the mobile device. Training data files are specific to each language in which text needs to be recognized and are available on this page. They need to be stored in a directory called tessdata on the device.
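
Because recognition depends on these files being present, it can be useful to verify them before starting OCR. The sketch below assumes the standard Tesseract naming, where the file for a language code such as eng is called eng.traineddata. TessdataCheck is a hypothetical helper, not part of the SDK.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;
import java.util.List;

/** Hypothetical helper (not part of the SDK): finds languages whose training data is absent. */
final class TessdataCheck {
    private TessdataCheck() { }

    /** Returns the language codes with no "<code>.traineddata" entry in fileNames. */
    static List<String> missingLanguages(Collection<String> fileNames, List<String> languageCodes) {
        List<String> missing = new ArrayList<>();
        for (String code : languageCodes) {
            if (!fileNames.contains(code + ".traineddata")) {
                missing.add(code);
            }
        }
        return missing;
    }

    /** Same check against a tessdata directory on disk; a missing directory counts as empty. */
    static List<String> missingLanguages(File tessdataDirectory, List<String> languageCodes) {
        String[] names = tessdataDirectory.list(); // null if the directory does not exist
        List<String> fileNames = names == null ? Collections.<String>emptyList() : Arrays.asList(names);
        return missingLanguages(fileNames, languageCodes);
    }
}
```

An app could call missingLanguages with its tessdata directory and the language codes it intends to use, and copy or download any missing files before running OCR.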

iOS

GSKOCRConfiguration *ocrConfiguration = [GSKOCRConfiguration new];
ocrConfiguration.trainedDataPath = 
ocrConfiguration.languageCodes = @[@"eng"];

NSError *error = nil;
GSKOCRResult *result = [GSKOCR recognizeTextForImageAtPath:filePath ocrConfiguration:ocrConfiguration onProgress:&progress error:&error];

GSKTextLayout *textLayout = result.textLayout;

Android

OcrConfiguration ocrConfiguration = new OcrConfiguration(Arrays.asList("eng"), tessdataDirectory, false);

OcrResult result = ocrProcessor.processImage(image, ocrConfiguration, progressListener);

String xmlTextLayout = result.textLayout;

Generate PDF document with text

When generating a PDF, you can provide a text layout for each page of the document; the recognized text is then embedded in the PDF so that it is searchable and selectable.

iOS

GSKPDFPage *page = [[GSKPDFPage alloc] initWithFilePath:imageFilePath inchesSize:A4Size textLayout:textLayout];
// Generate PDF

Android

PDFPage page = new PDFPage(image.getAbsolutePath(), A4_SIZE, xmlTextLayout);
// Generate PDF

© 2020 The Grizzly Labs, Inc. All rights reserved.