The OCR module extracts text and layout information from scanned images. It runs entirely on-device and returns plain text for indexing or display, plus a text layout object that can be passed to PDF generation to create searchable PDFs.
Use OCR after image processing, once you have a clean document image. For best performance, configure only the languages your app needs.
Run OCR on the processed image file you want to index, display, or use in a searchable PDF. The result contains both plain text and text layout.
// Swift (iOS)
import GSSDK

let ocrConfiguration = GSKOCRConfiguration.configuration(languageTags: ["en-US"])
let result = try await GSKOCR().recognizeText(
    forImageAtPath: filePath,
    configuration: ocrConfiguration,
    onProgress: { progress in
        // Update UI with progress (0.0 to 1.0)
        print("OCR Progress: \(Int(progress * 100))%")
    }
)
let extractedText = result.text
let textLayout = result.textLayout // For PDF generation
// Kotlin (Android)
import android.util.Log
import com.geniusscansdk.ocr.OcrConfiguration
import com.geniusscansdk.ocr.OcrProcessor

val ocrConfiguration = OcrConfiguration(listOf("en-US"))
val progressListener = object : OcrProcessor.ProgressListener {
    override fun onProgressUpdate(progress: Int) {
        Log.d("OCR", "OCR progress: $progress")
    }
}
val ocrProcessor = OcrProcessor(context, ocrConfiguration, progressListener)
// processImage is synchronous, so call it off the main thread
val result = ocrProcessor.processImage(imageFile)
val extractedText = result.text
val textLayout = result.textLayout // For PDF generation
The SDK supports OCR for multiple languages. Recognition time increases with the number of configured languages, so keep the list as short as your workflow allows.
See the full list of supported languages to confirm that each language you need is covered.
Pass multiple BCP 47 language tags when the document may contain more than one language.
// Swift (iOS)
let ocrConfiguration = GSKOCRConfiguration.configuration(
    languageTags: ["en-US", "es-ES", "fr-FR"]
)
let result = try await GSKOCR().recognizeText(
    forImageAtPath: filePath,
    configuration: ocrConfiguration,
    onProgress: { progress in
        updateProgress(progress)
    }
)
// Kotlin (Android)
val ocrConfiguration = OcrConfiguration(
    listOf("en-US", "es-ES", "fr-FR")
)
// Reuses the progressListener defined in the earlier Kotlin example
val ocrProcessor = OcrProcessor(context, ocrConfiguration, progressListener)
val result = ocrProcessor.processImage(imageFile)
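One possible way to keep the language list short is to intersect the languages your app has decided to support with the user's preferred locales, and cap the result. The sketch below is only an illustration: `selectOcrLanguageTags`, its parameters, and the `supported` list are hypothetical and not part of the SDK, and the supported list shown is not the SDK's actual language coverage.

```kotlin
import java.util.Locale

/**
 * Hypothetical helper (not part of the SDK): returns at most [maxLanguages]
 * BCP 47 tags, keeping only preferred locales whose language the app supports.
 */
fun selectOcrLanguageTags(
    preferredTags: List<String>,
    supportedTags: List<String>,
    maxLanguages: Int = 2,
    fallbackTag: String = "en-US"
): List<String> {
    // Index the supported tags by bare language code ("en", "es", ...).
    val supportedByLanguage = supportedTags.associateBy { Locale.forLanguageTag(it).language }

    val selected = preferredTags
        // Map each preferred tag to the supported tag sharing its language, if any.
        .mapNotNull { supportedByLanguage[Locale.forLanguageTag(it).language] }
        .distinct()
        .take(maxLanguages)

    // Fall back to a single default language when nothing matches.
    return selected.ifEmpty { listOf(fallbackTag) }
}

fun main() {
    val supported = listOf("en-US", "es-ES", "fr-FR") // illustrative only
    println(selectOcrLanguageTags(listOf("fr-CA", "de-DE", "en-GB"), supported))
    // prints [fr-FR, en-US]
}
```

The returned list could then be passed to the OCR configuration in place of a hard-coded list of tags.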
The textLayout result preserves word positions. Pass it to PDF generation when you want the generated PDF to be searchable and selectable.
Run OCR after image processing, not on the raw camera capture. If your app already uses readability detection, check the readability result before spending time on OCR. On Android, run OCR off the main thread because processImage is synchronous.
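Because the Android call blocks until recognition finishes, it needs to run on a background thread. The following is a minimal sketch of that pattern using only `java.util.concurrent`. The names `blockingOcr`, `OcrOutcome`, and `submitOcr` are hypothetical stand-ins: `blockingOcr` simulates a synchronous call such as `processImage` and does not use the SDK.

```kotlin
import java.util.concurrent.Callable
import java.util.concurrent.ExecutorService
import java.util.concurrent.Executors
import java.util.concurrent.Future

// Hypothetical result holder; records which thread did the work.
data class OcrOutcome(val text: String, val threadName: String)

// Stand-in for a blocking call such as ocrProcessor.processImage(imageFile).
fun blockingOcr(imagePath: String): String {
    Thread.sleep(50) // Simulate slow recognition work
    return "recognized text from $imagePath"
}

// Submits the blocking work to the executor so the caller's thread stays free.
fun submitOcr(executor: ExecutorService, imagePath: String): Future<OcrOutcome> =
    executor.submit(Callable {
        OcrOutcome(blockingOcr(imagePath), Thread.currentThread().name)
    })

fun main() {
    // A single background thread keeps OCR jobs sequential.
    val ocrExecutor = Executors.newSingleThreadExecutor()
    val pending = submitOcr(ocrExecutor, "page-1.jpg")

    // For demonstration only: a real app would post the result back to the
    // UI thread instead of blocking on get().
    val outcome = pending.get()
    println(outcome.text)
    println("Ran off the calling thread: ${outcome.threadName != Thread.currentThread().name}")

    ocrExecutor.shutdown()
}
```

If the app already uses Kotlin coroutines, wrapping the blocking call in `withContext(Dispatchers.Default)` is a common alternative to a dedicated executor.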
Use the plain text for search, lightweight extraction, accessibility, or any app-specific text workflow.
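// Swift (iOS)
import AVFoundation

// Search indexing
let text = ocrResult.text
searchIndex.addDocument(id: documentId, content: text)

// Lightweight data extraction: capture group 1 holds the digits only
let invoicePattern = #"Invoice\s*#?\s*(\d+)"#
if let regex = try? NSRegularExpression(pattern: invoicePattern),
   let match = regex.firstMatch(in: text, range: NSRange(text.startIndex..., in: text)),
   let numberRange = Range(match.range(at: 1), in: text) {
    let invoiceNumber = String(text[numberRange])
}

// Accessibility
let utterance = AVSpeechUtterance(string: text)
speechSynthesizer.speak(utterance)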
// Kotlin (Android)
import android.speech.tts.TextToSpeech

// Search indexing
val text = ocrResult.text
searchIndex.addDocument(id = documentId, content = text)

// Lightweight data extraction: group 1 holds the digits only
val invoicePattern = Regex("""Invoice\s*#?\s*(\d+)""")
val invoiceNumber = invoicePattern.find(text)?.groupValues?.get(1)

// Accessibility
textToSpeech.speak(
    text,
    TextToSpeech.QUEUE_FLUSH,
    null,
    "ocr-result"
)
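The `searchIndex` object in the snippets above is app-specific and left undefined. As a rough illustration of what it might look like, here is a minimal in-memory inverted index. `SimpleSearchIndex` and its methods are hypothetical and not provided by the SDK. A production app would more likely rely on a database or a platform search facility.

```kotlin
// Hypothetical in-memory index; stands in for the app-specific `searchIndex`.
class SimpleSearchIndex {
    // Maps each lowercase token to the IDs of the documents containing it.
    private val postings = mutableMapOf<String, MutableSet<String>>()

    fun addDocument(id: String, content: String) {
        // Split on anything that is not a letter or digit; lowercase for
        // case-insensitive lookup.
        content.lowercase()
            .split(Regex("""[^\p{L}\p{N}]+"""))
            .filter { it.isNotEmpty() }
            .forEach { token -> postings.getOrPut(token) { mutableSetOf() }.add(id) }
    }

    fun search(term: String): Set<String> = postings[term.lowercase()] ?: emptySet()
}

fun main() {
    val index = SimpleSearchIndex()
    index.addDocument(id = "doc-1", content = "Invoice #1042\nTotal due: 250.00 EUR")
    index.addDocument(id = "doc-2", content = "Delivery note for invoice 1042")

    println(index.search("invoice")) // prints [doc-1, doc-2]
    println(index.search("EUR"))     // prints [doc-1]
}
```

Lowercasing both the indexed tokens and the query keeps lookups case-insensitive, which helps when OCR output varies in capitalization.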
After extracting text, pass the text layout to PDF generation to create searchable PDF documents.
© 2026 The Grizzly Labs. All rights reserved.