OCR (Text Recognition)

Available on: iOS Android

Overview

The OCR module extracts text and layout information from scanned images. It outputs text in two formats:

Raw text - Plain text extraction
XML with layout - hOCR format containing text position and formatting

This enables creation of searchable and selectable PDF documents, text extraction for data processing, and content indexing.
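
To make the layout output more concrete, here is a minimal sketch of reading word positions out of hOCR markup. It assumes the layout is available as an hOCR string and that words appear as Tesseract-style `ocrx_word` spans whose `class` attribute precedes a `title` beginning with `bbox x0 y0 x1 y1`. The names `HOCRWord` and `parseHOCRWords` are hypothetical helpers for illustration, not part of the SDK.

```swift
import Foundation

/// A recognized word and its bounding box in image pixels (hypothetical helper type).
struct HOCRWord: Equatable {
    let text: String
    let left: Int, top: Int, right: Int, bottom: Int
}

/// Extracts words and their bounding boxes from hOCR markup.
/// Assumption: words are `ocrx_word` spans with `class` before `title`,
/// and the title starts with `bbox x0 y0 x1 y1`.
func parseHOCRWords(_ hocr: String) -> [HOCRWord] {
    let pattern = #"<span[^>]*class=['"]ocrx_word['"][^>]*title=['"]bbox (\d+) (\d+) (\d+) (\d+)[^'"]*['"][^>]*>([^<]*)</span>"#
    guard let regex = try? NSRegularExpression(pattern: pattern) else { return [] }
    let fullRange = NSRange(hocr.startIndex..., in: hocr)

    return regex.matches(in: hocr, range: fullRange).compactMap { match -> HOCRWord? in
        // Groups 1-4 hold the box coordinates; group 5 holds the word text.
        let groups: [String] = (1...5).compactMap { index in
            Range(match.range(at: index), in: hocr).map { String(hocr[$0]) }
        }
        guard groups.count == 5,
              let x0 = Int(groups[0]), let y0 = Int(groups[1]),
              let x1 = Int(groups[2]), let y1 = Int(groups[3]) else { return nil }
        return HOCRWord(text: groups[4], left: x0, top: y0, right: x1, bottom: y1)
    }
}
```

A regular expression keeps the sketch short, but it skips words that contain nested markup. For production use, a real XML parser is the safer choice.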

Basic Text Extraction

Extract text from a processed document image:

iOS (Swift):

import GSSDK

// Configure OCR for English
let ocrConfiguration = GSKOCRConfiguration.configuration(languageTags: ["en-US"])

// Perform OCR
let result = try await GSKOCR.recognizeText(
    forImageAtPath: filePath,
    ocrConfiguration: ocrConfiguration,
    onProgress: { progress in
        // Update UI with progress (0.0 to 1.0)
        print("OCR Progress: \(Int(progress * 100))%")
    }
)

// Access results
let extractedText = result.text
let textLayout = result.textLayout // For PDF generation

                  

Android (Kotlin):

import android.util.Log
import com.geniusscansdk.ocr.OcrProcessor

// Configure OCR for English
val ocrConfiguration = OCRConfiguration(listOf("en-US"))

// Perform OCR
val progressListener = object : OcrProcessor.ProgressListener {
    override fun onProgressUpdate(progress: Int) {
        Log.d(TAG, "OCR progress: $progress")
    }
}
val ocrProcessor = OcrProcessor(context, ocrConfiguration, progressListener)
val result = ocrProcessor.processImage(imageFile)

// Access results
val extractedText = result.text
val textLayout = result.textLayout // For PDF generation

                  

Multi-language Support

The SDK supports OCR for multiple languages. For optimal performance, specify only the languages you need.

Supported Languages

See the full list of supported languages.

Multi-language OCR

iOS (Swift):

// Configure for multiple languages
let ocrConfiguration = GSKOCRConfiguration.configuration(
    languageTags: ["en-US", "es-ES", "fr-FR"]
)

// The OCR will detect and extract text in any of these languages
let result = try await GSKOCR.recognizeText(
    forImageAtPath: filePath,
    ocrConfiguration: ocrConfiguration,
    onProgress: { progress in
        updateProgress(progress)
    }
)

                  

Android (Kotlin):

// Configure for multiple languages
val ocrConfiguration = OCRConfiguration(
    listOf("en-US", "es-ES", "fr-FR")
)

// The OCR will detect and extract text in any of these languages
val ocrProcessor = OcrProcessor(context, ocrConfiguration, progressListener)
val result = ocrProcessor.processImage(imageFile)

                  

Performance Note: Recognition time increases linearly with the number of languages. For best performance, use the minimum number of languages required.
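
One way to keep the language list short is to derive it from the user's preferred languages instead of hard-coding every language the app supports. The sketch below is an illustration under assumptions: `ocrLanguageTags` is a hypothetical helper, the default cap of two languages and the `en-US` fallback are arbitrary choices, and matching is done on the base language code only.

```swift
import Foundation

/// Picks the OCR languages to request: the user's preferred languages that the app
/// supports, capped at `limit`. Falls back to `fallback` when nothing matches.
/// (Hypothetical helper, not part of the SDK.)
func ocrLanguageTags(preferred: [String],
                     supported: [String],
                     limit: Int = 2,
                     fallback: String = "en-US") -> [String] {
    // Compare on the base language code ("en" from "en-GB") so regional variants match.
    func baseCode(_ tag: String) -> String {
        String(tag.split(separator: "-").first ?? Substring(tag)).lowercased()
    }

    var chosen: [String] = []
    for tag in preferred {
        guard let match = supported.first(where: { baseCode($0) == baseCode(tag) }),
              !chosen.contains(match) else { continue }
        chosen.append(match)
        // Always allow at least one language, even if `limit` is zero or negative.
        if chosen.count >= max(limit, 1) { break }
    }
    return chosen.isEmpty ? [fallback] : chosen
}
```

On iOS, `Locale.preferredLanguages` is a natural source for `preferred`, for example `ocrLanguageTags(preferred: Locale.preferredLanguages, supported: ["en-US", "es-ES", "fr-FR"])`. The returned tags can then be passed to the OCR configuration.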

Text Layout for Searchable PDFs

The OCR module returns layout information that can be used to create searchable PDFs.

Best Practices

Image Quality

For best OCR results, check the readability score before performing OCR.

Performance Optimization

Limit languages - Specify only the languages you need
Background processing - Run OCR on a background thread so the UI stays responsive

Use Cases

Document Search

Extract text for searchable document archives:

// Extract and index text for search
let text = ocrResult.text
searchIndex.addDocument(id: documentId, content: text)
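
The snippet above leaves `searchIndex` undefined. As a rough illustration of what such an object could look like, here is a minimal in-memory inverted index. `SearchIndex` and its `search` method are hypothetical and not part of the SDK. A real app would more likely use a persistent full-text index.

```swift
import Foundation

/// Minimal in-memory inverted index: maps each lowercased word to the
/// identifiers of the documents that contain it. (Hypothetical helper.)
final class SearchIndex {
    private var postings: [String: Set<String>] = [:]

    /// Splits text into lowercased alphanumeric tokens.
    private func tokens(in text: String) -> [String] {
        text.lowercased()
            .components(separatedBy: CharacterSet.alphanumerics.inverted)
            .filter { !$0.isEmpty }
    }

    /// Indexes the OCR text of one document under its identifier.
    func addDocument(id: String, content: String) {
        for token in tokens(in: content) {
            postings[token, default: []].insert(id)
        }
    }

    /// Returns the sorted identifiers of documents containing every word in the query.
    func search(_ query: String) -> [String] {
        let sets = tokens(in: query).map { postings[$0] ?? [] }
        guard var result = sets.first else { return [] }
        for set in sets.dropFirst() { result.formIntersection(set) }
        return result.sorted()
    }
}
```

With this sketch, `index.search("acme invoice")` returns only the documents whose recognized text contains both words.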

            

Data Extraction

Parse specific information from documents:

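import Foundation

// Extract the invoice number (capture group 1) from the OCR text
let invoicePattern = #"Invoice\s*#?\s*(\d+)"#
if let regex = try? NSRegularExpression(pattern: invoicePattern),
   let match = regex.firstMatch(in: text, range: NSRange(text.startIndex..., in: text)),
   let numberRange = Range(match.range(at: 1), in: text) {
    let invoiceNumber = String(text[numberRange]) // digits only, e.g. "1042"
}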

            

Accessibility

Provide text-to-speech for scanned documents:

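import AVFoundation

// Read the recognized text aloud.
// speechSynthesizer is a long-lived AVSpeechSynthesizer owned by your app.
let utterance = AVSpeechUtterance(string: ocrResult.text)
speechSynthesizer.speak(utterance)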

            

What’s Next?

After extracting text, generate searchable PDF documents:

Next: PDF Generation →
