OCR (Text Recognition)

Available on: iOS, Android

Overview

The OCR module extracts text and layout information from scanned images. It runs entirely on-device and returns plain text for indexing or display, plus a text layout object that can be passed to PDF generation to create searchable PDFs.

Use OCR after image processing, once you have a clean document image. For best performance, configure only the languages your app needs.

Basic Text Extraction

Run OCR on the processed image file you want to index, display, or use in a searchable PDF. The result contains both plain text and text layout.

iOS:

import GSSDK

let ocrConfiguration = GSKOCRConfiguration.configuration(languageTags: ["en-US"])

let result = try await GSKOCR().recognizeText(
    forImageAtPath: filePath,
    configuration: ocrConfiguration,
    onProgress: { progress in
        // Update UI with progress (0.0 to 1.0)
        print("OCR Progress: \(Int(progress * 100))%")
    }
)

let extractedText = result.text
let textLayout = result.textLayout // For PDF generation

Android:

import android.util.Log
import com.geniusscansdk.ocr.OcrConfiguration
import com.geniusscansdk.ocr.OcrProcessor

val ocrConfiguration = OcrConfiguration(listOf("en-US"))

val progressListener = object : OcrProcessor.ProgressListener {
    override fun onProgressUpdate(progress: Int) {
        Log.d("OCR", "OCR progress: $progress")
    }
}
val ocrProcessor = OcrProcessor(context, ocrConfiguration, progressListener)
// processImage is synchronous: call it from a background thread
val result = ocrProcessor.processImage(imageFile)

val extractedText = result.text
val textLayout = result.textLayout // For PDF generation

Multi-language Support

The SDK supports OCR for multiple languages. Recognition time increases with the number of configured languages, so keep the list as short as your workflow allows.
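
One way to keep the list short is to normalize and filter candidate tags before building the configuration. The sketch below is illustrative only: `ocrLanguageTags`, its parameters, and the default limit of three languages are assumptions, not part of the SDK. It relies on `java.util.Locale` to canonicalize BCP 47 tags.

```kotlin
import java.util.Locale

// Hypothetical helper (not part of the SDK): canonicalizes BCP 47 tags,
// keeps only the tags your app has chosen to support, removes duplicates,
// and caps the list so recognition time stays predictable.
fun ocrLanguageTags(
    candidates: List<String>,
    supported: Set<String>,
    maxLanguages: Int = 3 // Assumed limit; tune it for your workflow
): List<String> =
    candidates
        .map { Locale.forLanguageTag(it).toLanguageTag() } // "en-us" becomes "en-US"
        .filter { it in supported }
        .distinct()
        .take(maxLanguages)
```

The returned list could then be passed to the configuration, for example `OcrConfiguration(ocrLanguageTags(candidateTags, supportedTags))`, where `candidateTags` and `supportedTags` are values your app provides.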

Supported Languages

See the full list of supported languages to check which language tags are available.

Multi-language OCR

Pass multiple BCP 47 language tags when the document may contain more than one language.

iOS:

let ocrConfiguration = GSKOCRConfiguration.configuration(
    languageTags: ["en-US", "es-ES", "fr-FR"]
)

let result = try await GSKOCR().recognizeText(
    forImageAtPath: filePath,
    configuration: ocrConfiguration,
    onProgress: { progress in
        updateProgress(progress)
    }
)

Android:

val ocrConfiguration = OcrConfiguration(
    listOf("en-US", "es-ES", "fr-FR")
)

// Reuses the progressListener defined under Basic Text Extraction
val ocrProcessor = OcrProcessor(context, ocrConfiguration, progressListener)
val result = ocrProcessor.processImage(imageFile)

Text Layout for Searchable PDFs

The textLayout result preserves word positions. Pass it to PDF generation when you want the generated PDF to be searchable and selectable.

Best Practices

Run OCR after image processing, not on the raw camera capture. If your app already uses readability detection, check the readability result before spending time on OCR. On Android, run OCR off the main thread because processImage is synchronous.
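
A minimal sketch of the last two recommendations on Android, using only `java.util.concurrent`. The names `ocrExecutor` and `recognizeInBackground` are hypothetical, and the two lambdas stand in for your own readability check and for the blocking OCR call (for example `ocrProcessor.processImage(imageFile).text`); a coroutine-based approach would work equally well.

```kotlin
import java.util.concurrent.Callable
import java.util.concurrent.ExecutorService
import java.util.concurrent.Executors
import java.util.concurrent.Future

// Single daemon worker thread: OCR jobs run one at a time, off the caller's thread.
val ocrExecutor: ExecutorService = Executors.newSingleThreadExecutor { task ->
    Thread(task, "ocr-worker").apply { isDaemon = true }
}

// Hypothetical helper (not part of the SDK).
// isReadable: placeholder for your app's readability check.
// runOcr: placeholder for the synchronous OCR call.
fun recognizeInBackground(
    isReadable: () -> Boolean,
    runOcr: () -> String
): Future<String?> =
    ocrExecutor.submit(Callable<String?> {
        // Skip OCR entirely when the image is not readable.
        if (isReadable()) runOcr() else null
    })
```

Calling `get()` on the returned `Future` blocks, so in a real app you would hand the result back to the main thread (for example with a `Handler` or `runOnUiThread`) instead of waiting on it there.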

Using the Result

Use the plain text for search, lightweight extraction, accessibility, or any app-specific text workflow.

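iOS:

import AVFoundation

// Search indexing
let text = ocrResult.text
searchIndex.addDocument(id: documentId, content: text)

// Lightweight data extraction: capture group 1 holds only the digits
let invoicePattern = #"Invoice\s*#?\s*(\d+)"#
if let regex = try? NSRegularExpression(pattern: invoicePattern),
   let match = regex.firstMatch(in: text, range: NSRange(text.startIndex..., in: text)),
   let numberRange = Range(match.range(at: 1), in: text) {
    let invoiceNumber = String(text[numberRange])
}

// Accessibility
let utterance = AVSpeechUtterance(string: text)
speechSynthesizer.speak(utterance)

Android:

import android.speech.tts.TextToSpeech

// Search indexing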
val text = ocrResult.text
searchIndex.addDocument(id = documentId, content = text)

// Lightweight data extraction
val invoicePattern = Regex("""Invoice\s*#?\s*(\d+)""")
val invoiceNumber = invoicePattern.find(text)?.groupValues?.get(1)

// Accessibility
textToSpeech.speak(
    text,
    TextToSpeech.QUEUE_FLUSH,
    null,
    "ocr-result"
)
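
The `searchIndex` object in the snippets above is app-specific and not provided by the SDK. As a rough illustration of what it might look like, here is a minimal in-memory sketch; the `SearchIndex` class, its tokenization rule, and its all-words matching behavior are assumptions made for this example.

```kotlin
// Hypothetical in-memory index (not part of the SDK): maps each lowercase
// word to the IDs of the documents whose OCR text contains it.
class SearchIndex {
    private val postings = mutableMapOf<String, MutableSet<String>>()

    fun addDocument(id: String, content: String) {
        tokenize(content).forEach { word ->
            postings.getOrPut(word) { mutableSetOf() }.add(id)
        }
    }

    // Returns the IDs of documents that contain every word of the query.
    fun search(query: String): Set<String> {
        val words = tokenize(query)
        if (words.isEmpty()) return emptySet()
        return words
            .map { postings[it].orEmpty() }
            .reduce { acc, ids -> acc intersect ids }
    }

    // Splits on anything that is not a letter or digit, since OCR output
    // often contains stray punctuation and line breaks.
    private fun tokenize(text: String): List<String> =
        text.lowercase().split(Regex("""[^\p{L}\p{N}]+""")).filter { it.isNotEmpty() }
}
```

A production app would more likely persist the text in a database with full-text search, but the calling pattern stays the same: add the OCR text once per document, then query by word.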

                  

What’s Next?

After extracting text, generate searchable PDF documents:

Next: PDF Generation →
