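The OCR module extracts text and layout information from scanned images. It outputs text in two forms: plain text (result.text) and text layout information (result.textLayout), which is used for PDF generation.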
This enables creation of searchable and selectable PDF documents, text extraction for data processing, and content indexing.
Extract text from a processed document image:
iOS (Swift):
import GSSDK
// Configure OCR for English
let ocrConfiguration = GSKOCRConfiguration.configuration(languageTags: ["en-US"])
// Perform OCR
let result = try await GSKOCR.recognizeText(
    forImageAtPath: filePath,
    ocrConfiguration: ocrConfiguration,
    onProgress: { progress in
        // Update UI with progress (0.0 to 1.0)
        print("OCR Progress: \(Int(progress * 100))%")
    }
)
// Access results
let extractedText = result.text
let textLayout = result.textLayout // For PDF generation
Android (Kotlin):
import com.geniusscansdk.ocr.OcrProcessor
// Configure OCR for English
val ocrConfiguration = OCRConfiguration(listOf("en-US"))
// Perform OCR
val progressListener = object : OcrProcessor.ProgressListener {
    override fun onProgressUpdate(progress: Int) {
        Log.d(TAG, "OCR progress: $progress")
    }
}
val ocrProcessor = OcrProcessor(context, ocrConfiguration, progressListener)
val result = ocrProcessor.processImage(imageFile)
// Access results
val extractedText = result.text
val textLayout = result.textLayout // For PDF generation
The SDK supports OCR for multiple languages. For optimal performance, specify only the languages you need.
See the full list of supported languages for complete coverage.
iOS (Swift):
// Configure for multiple languages
let ocrConfiguration = GSKOCRConfiguration.configuration(
    languageTags: ["en-US", "es-ES", "fr-FR"]
)
// The OCR will detect and extract text in any of these languages
let result = try await GSKOCR.recognizeText(
    forImageAtPath: filePath,
    ocrConfiguration: ocrConfiguration,
    onProgress: { progress in
        updateProgress(progress)
    }
)
Android (Kotlin):
// Configure for multiple languages
val ocrConfiguration = OCRConfiguration(
    listOf("en-US", "es-ES", "fr-FR")
)
// The OCR will detect and extract text in any of these languages
val ocrProcessor = OcrProcessor(context, ocrConfiguration, progressListener)
val result = ocrProcessor.processImage(imageFile)
Performance Note: Recognition time increases linearly with the number of languages. For best performance, use the minimum number of languages required.
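One way to keep the language list short is to derive it from the user's preferred languages instead of hard-coding every language the app ships. The following is a minimal sketch of that idea, not part of the SDK: the function name minimalLanguageTags, the supported set, and the limit of two are all assumptions chosen for illustration.

```swift
import Foundation

/// Returns the OCR language tags to request: the preferred languages,
/// filtered to those the app supports, de-duplicated, and capped at `limit`.
/// `supported` and `limit` are illustrative assumptions, not SDK values.
func minimalLanguageTags(preferred: [String],
                         supported: Set<String>,
                         limit: Int = 2) -> [String] {
    guard limit > 0 else { return [] }
    var seen = Set<String>()
    var tags: [String] = []
    for tag in preferred where supported.contains(tag) {
        // insert(_:) reports whether the tag was newly added, which drops duplicates.
        if seen.insert(tag).inserted {
            tags.append(tag)
        }
        if tags.count == limit { break }
    }
    return tags
}

let tags = minimalLanguageTags(
    preferred: ["fr-FR", "en-US", "fr-FR", "de-DE", "es-ES"],
    supported: ["en-US", "es-ES", "fr-FR"]
)
print(tags) // ["fr-FR", "en-US"]
```

The resulting array could then be passed as the languageTags argument shown earlier. On a device, Locale.preferredLanguages would be a natural source for the preferred list, though its tags may need mapping to the exact tags the SDK expects.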
The OCR module returns layout information that can be used to create searchable PDFs.
For best OCR results, check the readability score before performing OCR.
Extract text for searchable document archives:
// Extract and index text for search
let text = ocrResult.text
searchIndex.addDocument(id: documentId, content: text)
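The searchIndex object above is left undefined by the snippet. As a rough illustration of what it might be, here is a small in-memory inverted index that accepts the same addDocument(id:content:) call. The SearchIndex class, its String document IDs, and its search method are hypothetical and not provided by the SDK; a production app would more likely use a database with full-text search.

```swift
import Foundation

/// A tiny in-memory inverted index: lowercase word -> IDs of documents containing it.
/// Hypothetical stand-in for `searchIndex`; not part of the SDK.
final class SearchIndex {
    private var postings: [String: Set<String>] = [:]

    /// Splits text into lowercase alphanumeric words.
    private func tokenize(_ text: String) -> [String] {
        text.lowercased()
            .components(separatedBy: CharacterSet.alphanumerics.inverted)
            .filter { !$0.isEmpty }
    }

    /// Registers every word of the OCR text under the given document ID.
    func addDocument(id: String, content: String) {
        for word in tokenize(content) {
            postings[word, default: []].insert(id)
        }
    }

    /// Returns the IDs of documents that contain every word in the query.
    func search(_ query: String) -> Set<String> {
        let words = tokenize(query)
        guard let first = words.first else { return [] }
        return words.dropFirst().reduce(postings[first] ?? []) { result, word in
            result.intersection(postings[word] ?? [])
        }
    }
}

let index = SearchIndex()
index.addDocument(id: "scan-1", content: "Invoice #1042 from Acme Corp")
index.addDocument(id: "scan-2", content: "Acme Corp delivery receipt")
print(index.search("acme invoice")) // ["scan-1"]
```

Lowercasing both the indexed text and the query makes the lookup case-insensitive, which tends to matter for OCR output where capitalization varies between scans.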
Parse specific information from documents:
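// Extract invoice number from text (capture group 1 holds the digits)
let invoicePattern = #"Invoice\s*#?\s*(\d+)"#
if let regex = try? NSRegularExpression(pattern: invoicePattern),
   let match = regex.firstMatch(in: text, options: [], range: NSRange(text.startIndex..., in: text)),
   let numberRange = Range(match.range(at: 1), in: text) {
    let invoiceNumber = String(text[numberRange]) // Digits only, without the "Invoice #" prefix
}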
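When several fields need to be parsed from the same OCR text, the capture-group lookup can be factored into a small helper. The sketch below uses Foundation's NSRegularExpression. The helper name firstCapture, the sample text, and the "Total due" pattern are assumptions made for illustration, and real documents will need patterns tuned to their layout.

```swift
import Foundation

/// Returns the first capture group of the first match of `pattern` in `text`,
/// or nil when the pattern is invalid, has no capture group, or nothing matches.
func firstCapture(of pattern: String, in text: String) -> String? {
    guard let regex = try? NSRegularExpression(pattern: pattern, options: [.caseInsensitive]) else {
        return nil
    }
    let fullRange = NSRange(text.startIndex..., in: text)
    guard let match = regex.firstMatch(in: text, options: [], range: fullRange),
          match.numberOfRanges > 1,
          let captureRange = Range(match.range(at: 1), in: text) else {
        return nil
    }
    return String(text[captureRange])
}

// Made-up sample standing in for OCR output.
let sampleText = "ACME Corp\nInvoice # 20417\nTotal due: 1,250.00 USD"
let parsedInvoiceNumber = firstCapture(of: #"Invoice\s*#?\s*(\d+)"#, in: sampleText)
let parsedTotal = firstCapture(of: #"Total due:\s*([\d,]+\.\d{2})"#, in: sampleText)
print(parsedInvoiceNumber ?? "none", parsedTotal ?? "none") // 20417 1,250.00
```

Returning an optional keeps the caller in charge of what happens when a field is missing, which is common with OCR text because recognition errors can break a pattern.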
Provide text-to-speech for scanned documents:
// Make document accessible
let utterance = AVSpeechUtterance(string: ocrResult.text)
speechSynthesizer.speak(utterance)
After extracting text, you can generate searchable PDF documents from the text layout.
© 2025 The Grizzly Labs. All rights reserved.