Xiao Ling

Posted on • Originally published at dynamsoft.com

How to Build an Android Document Scanner with Auto-Capture and PDF Export

Scanning physical documents with a phone camera sounds simple until you deal with skewed angles, inconsistent lighting, and multi-page workflows. The Dynamsoft Capture Vision SDK for Android handles real-time document boundary detection, perspective correction, and image normalization, so you can focus on the user experience instead of low-level image processing.

What you'll build: A full-featured Android document scanning app in Java that auto-captures documents via quad stabilization and supports gallery import, quad editing, image filters (color, grayscale, binary), page sorting, rotation, and multi-page PDF and JPEG export, all powered by the Dynamsoft Capture Vision SDK.

Demo Video: Android Document Scanner in Action

Prerequisites

  • Android Studio (Arctic Fox or later)
  • Android SDK with compileSdk 35 and minSdk 23
  • Java 11 (source and target compatibility)
  • Dynamsoft Capture Vision SDK: com.dynamsoft:capturevisionbundle:3.4.1000
  • A physical Android device (camera-based features do not work on the emulator)

Get a 30-day free trial license for Dynamsoft Capture Vision SDK.

Step 1: Add the Dynamsoft Maven Repository and SDK Dependency

Dynamsoft packages are hosted on a custom Maven repository. Add it to your root build.gradle:

allprojects {
    repositories {
        google()
        mavenCentral()
        maven { url "https://download2.dynamsoft.com/maven/aar" }
    }
}

Then declare the SDK dependency in your module-level build.gradle:

dependencies {
    implementation "com.dynamsoft:capturevisionbundle:3.4.1000"

    implementation 'androidx.appcompat:appcompat:1.7.1'
    implementation 'com.google.android.material:material:1.11.0'
    implementation 'androidx.activity:activity:1.8.2'
    implementation 'androidx.viewpager2:viewpager2:1.0.0'
    implementation 'androidx.exifinterface:exifinterface:1.3.7'
}

Step 2: Initialize the License and Set Up the Camera

In your ScannerFragment, initialize the Dynamsoft license in onCreateView (replace the "LICENSE-KEY" placeholder with your own key), then set up CameraEnhancer with CaptureVisionRouter. The router takes the camera as its input source and processes frames using the document detection template.

@Override
public View onCreateView(@NonNull LayoutInflater inflater, @Nullable ViewGroup container, @Nullable Bundle savedInstanceState) {
    PermissionUtil.requestCameraPermission(requireActivity());

    mViewModel = new ViewModelProvider(requireActivity()).get(DocumentScannerViewModel.class);
    mViewModel.actionBarTitle.setValue(requireContext().getString(R.string.scan_page_title));

    if (savedInstanceState == null) {
        LicenseManager.initLicense("LICENSE-KEY", (isSuccess, error) -> {
            if (!isSuccess && error != null) {
                error.printStackTrace();
            }
        });
    }
    return inflater.inflate(R.layout.fragment_scanner, container, false);
}

Wire up the camera and capture vision router in onViewCreated:

CameraView cameraView = view.findViewById(R.id.cameraView);
mCamera = new CameraEnhancer(cameraView, getViewLifecycleOwner());
mRouter = new CaptureVisionRouter();

MultiFrameResultCrossFilter filter = new MultiFrameResultCrossFilter();
filter.enableResultCrossVerification(EnumCapturedResultItemType.CRIT_DESKEWED_IMAGE, true);
mRouter.addResultFilter(filter);

try {
    mRouter.setInput(mCamera);
} catch (CaptureVisionRouterException e) {
    e.printStackTrace();
    return;
}

Start and stop capturing in sync with the fragment lifecycle:

@Override
public void onResume() {
    super.onResume();
    mCamera.open();
    mRouter.startCapturing(EnumPresetTemplate.PT_DETECT_AND_NORMALIZE_DOCUMENT, new CompletionListener() {
        @Override
        public void onSuccess() { }

        @Override
        public void onFailure(int errorCode, String errorString) {
            mViewModel.startCapturingError.postValue(errorString);
        }
    });
}

@Override
public void onPause() {
    super.onPause();
    mCamera.close();
    mRouter.stopCapturing();
}

Step 3: Receive Detection Results and Enable Auto-Capture

Register a CapturedResultReceiver to handle detected documents. If the user has just tapped the capture button, the next frame that contains a deskewed document image is captured immediately. Otherwise, frames that pass cross-verification are fed to a QuadStabilizer, which triggers auto-capture once the document boundary stays stable across consecutive frames.

mRouter.addResultReceiver(new CapturedResultReceiver() {
    @Override
    public void onProcessedDocumentResultReceived(@NonNull ProcessedDocumentResult result) {
        if (result.getDeskewedImageResultItems().length > 0) {
            DeskewedImageResultItem item = result.getDeskewedImageResultItems()[0];

            mLatestDeskewedItem = item;
            mLatestOriginalImageHashId = result.getOriginalImageHashId();

            if (mIsBtnClicked) {
                mIsBtnClicked = false;
                mHandler.removeCallbacks(mCaptureTimeoutRunnable);
                captureResult(item, result.getOriginalImageHashId());
            } else if (item.getCrossVerificationStatus() == EnumCrossVerificationStatus.CVS_PASSED) {
                Quadrilateral quad = item.getSourceDeskewQuad();
                if (quad != null) {
                    mQuadStabilizer.feedQuad(quad);
                }
            }
        }
    }
});

The QuadStabilizer compares consecutive quad detections using IoU (Intersection over Union) and area-delta thresholds. When the boundary stays stable for a configurable number of frames, it triggers auto-capture:

public void feedQuad(Quadrilateral quad) {
    if (!autoCaptureEnabled) {
        return;
    }

    if (previousQuad == null) {
        previousQuad = quad;
        consecutiveStableFrames = 0;
        return;
    }

    float iou = calculateIoU(previousQuad, quad);
    double prevArea = calculateQuadArea(previousQuad);
    double currArea = calculateQuadArea(quad);
    float areaDelta = prevArea > 0 ? (float) Math.abs(currArea - prevArea) / (float) prevArea : 1.0f;

    if (iou >= iouThreshold && areaDelta <= areaDeltaThreshold) {
        consecutiveStableFrames++;
        if (consecutiveStableFrames >= stableFrameCount && callback != null) {
            callback.onStable();
            reset();
        }
    } else {
        consecutiveStableFrames = 0;
    }

    previousQuad = quad;
}
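
The calculateIoU and calculateQuadArea helpers are not shown in this article. As a rough, self-contained sketch of what they could look like, the class below computes area with the shoelace formula and IoU by clipping one convex quad against the other (Sutherland-Hodgman). The class name QuadGeometry, the use of {x, y} double arrays instead of the SDK's Quadrilateral type, and the assumption that both quads are convex are all illustrative choices, not the project's actual code.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of the Dynamsoft SDK or the sample project.
final class QuadGeometry {
    private QuadGeometry() { }

    /** Signed area via the shoelace formula; points are {x, y} pairs in order. */
    static double signedArea(List<double[]> poly) {
        double sum = 0;
        for (int i = 0; i < poly.size(); i++) {
            double[] a = poly.get(i);
            double[] b = poly.get((i + 1) % poly.size());
            sum += a[0] * b[1] - b[0] * a[1];
        }
        return sum / 2.0;
    }

    /** Absolute area; anything with fewer than three points has no area. */
    static double area(List<double[]> poly) {
        return poly.size() < 3 ? 0 : Math.abs(signedArea(poly));
    }

    /** Cross product of (b - a) and (p - a); its sign tells which side of edge a->b p is on. */
    static double cross(double[] a, double[] b, double[] p) {
        return (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0]);
    }

    /** Clips subject against a convex clip polygon (Sutherland-Hodgman). */
    static List<double[]> clip(List<double[]> subject, List<double[]> clip) {
        // The sign makes the inside test work for either winding order.
        double sign = signedArea(clip) >= 0 ? 1 : -1;
        List<double[]> output = new ArrayList<>(subject);
        for (int i = 0; i < clip.size() && !output.isEmpty(); i++) {
            double[] c1 = clip.get(i);
            double[] c2 = clip.get((i + 1) % clip.size());
            List<double[]> input = output;
            output = new ArrayList<>();
            for (int j = 0; j < input.size(); j++) {
                double[] p = input.get(j);
                double[] q = input.get((j + 1) % input.size());
                double dp = sign * cross(c1, c2, p);
                double dq = sign * cross(c1, c2, q);
                if (dp >= 0) {
                    output.add(p); // p is on the inner side of this clip edge
                }
                if ((dp >= 0) != (dq >= 0)) {
                    // Edge p->q crosses the clip edge: keep the crossing point.
                    double t = dp / (dp - dq);
                    output.add(new double[] {
                        p[0] + t * (q[0] - p[0]),
                        p[1] + t * (q[1] - p[1])
                    });
                }
            }
        }
        return output;
    }

    /** Intersection over Union of two convex polygons; 0 when they do not overlap. */
    static double iou(List<double[]> a, List<double[]> b) {
        double inter = area(clip(a, b));
        double union = area(a) + area(b) - inter;
        return union > 0 ? inter / union : 0;
    }
}
```

A cheaper alternative would be to compare axis-aligned bounding boxes, at the cost of overestimating overlap for strongly skewed quads.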

Step 4: Capture and Store Normalized Document Pages

When a document is captured, either manually or via auto-capture, the normalized image, original frame, and detected quad are bundled into a DocumentPage and stored in a shared ViewModel:

private void captureResult(DeskewedImageResultItem item, String originalImageHashId) {
    if (mCooldown) return;
    mCooldown = true;

    ImageData normalizedImage = item.getImageData();
    Quadrilateral quad = item.getSourceDeskewQuad();
    ImageData originalImage = mRouter.getIntermediateResultManager().getOriginalImage(originalImageHashId);

    DocumentPage page = new DocumentPage(originalImage, normalizedImage, quad);

    if (mIsRetakeMode && mRetakeIndex >= 0) {
        mViewModel.replacePage(mRetakeIndex, page);
        mViewModel.retakePageIndex.postValue(-1);
        mHandler.post(() -> requireActivity().getSupportFragmentManager().popBackStack());
    } else {
        mViewModel.addPage(page);
    }

    mQuadStabilizer.reset();
    mHandler.postDelayed(() -> mCooldown = false, 1500);
}
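
DocumentScannerViewModel itself is not shown here. As a minimal sketch of the page-list operations the snippets rely on (addPage, replacePage) plus the reordering mentioned in the feature list, something like the following could sit behind it. The class name PageList and its method names are hypothetical, and a real ViewModel would presumably also publish changes through LiveData.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical in-memory page store; T stands in for DocumentPage.
final class PageList<T> {
    private final List<T> pages = new ArrayList<>();

    /** Appends a newly captured page. */
    void add(T page) {
        pages.add(page);
    }

    /** Replaces the page at index (retake); returns false if the index is out of range. */
    boolean replace(int index, T page) {
        if (index < 0 || index >= pages.size()) {
            return false;
        }
        pages.set(index, page);
        return true;
    }

    /** Moves a page from one position to another (drag-and-drop reorder). */
    boolean move(int from, int to) {
        if (from < 0 || from >= pages.size() || to < 0 || to >= pages.size()) {
            return false;
        }
        pages.add(to, pages.remove(from));
        return true;
    }

    /** Read-only copy so callers cannot mutate the list behind the store's back. */
    List<T> snapshot() {
        return Collections.unmodifiableList(new ArrayList<>(pages));
    }
}
```

Returning a boolean from replace and move, rather than throwing, keeps a stale retake index from crashing the capture callback.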

For manual capture, if no document boundary is detected within 500 ms of the button tap, a raw camera frame is captured as a fallback:

private void captureRawFrame() {
    if (mCooldown) return;
    mCooldown = true;

    try {
        ImageData frame = mCamera.getImage();
        if (frame != null) {
            DocumentPage page = new DocumentPage(frame, frame, null);
            if (mIsRetakeMode && mRetakeIndex >= 0) {
                mViewModel.replacePage(mRetakeIndex, page);
                mViewModel.retakePageIndex.postValue(-1);
                mHandler.post(() -> requireActivity().getSupportFragmentManager().popBackStack());
            } else {
                mViewModel.addPage(page);
            }
        }
    } catch (Exception e) {
        e.printStackTrace();
    }

    mHandler.postDelayed(() -> mCooldown = false, 1500);
}
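
Both capture paths guard against double captures with the same pattern: a boolean flag that a delayed Handler callback clears after 1500 ms. A timestamp-based variant, sketched below, avoids the pending callback altogether. CooldownGate is a hypothetical name, and the caller is assumed to pass in a monotonic clock reading (on Android, SystemClock.elapsedRealtime() would be one option).

```java
// Hypothetical alternative to the mCooldown flag; not taken from the sample project.
final class CooldownGate {
    private final long cooldownMillis;
    private boolean hasFired = false;
    private long lastAcceptedAt = 0L;

    CooldownGate(long cooldownMillis) {
        if (cooldownMillis < 0) {
            throw new IllegalArgumentException("cooldownMillis must be >= 0");
        }
        this.cooldownMillis = cooldownMillis;
    }

    /**
     * Returns true and starts a new cooldown window if enough time has passed
     * since the last accepted call; returns false otherwise.
     */
    synchronized boolean tryAcquire(long nowMillis) {
        if (hasFired && nowMillis - lastAcceptedAt < cooldownMillis) {
            return false; // still cooling down
        }
        hasFired = true;
        lastAcceptedAt = nowMillis;
        return true;
    }
}
```

With this in place, the first line of each capture method could become a single check such as `if (!gate.tryAcquire(now)) return;`, where gate and now are again illustrative names.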

Step 5: Edit Document Boundaries with the Quad Editor

Users can drag the detected document corners to adjust the crop region. The EditFragment uses Dynamsoft's ImageEditorView to display the original image overlaid with a draggable quad. On apply, a perspective transform produces the corrected output:

private void applyEdit() {
    DocumentPage page = mViewModel.getPage(mPageIndex);
    if (page == null || !page.hasOriginalImage()) {
        requireActivity().getSupportFragmentManager().popBackStack();
        return;
    }

    DrawingLayer layer = mEditorView.getDrawingLayer(DrawingLayer.DDN_LAYER_ID);
    List<DrawingItem> items = layer.getDrawingItems();

    Quadrilateral newQuad = null;
    for (DrawingItem item : items) {
        if (item instanceof QuadDrawingItem) {
            newQuad = ((QuadDrawingItem) item).getQuad();
            break;
        }
    }

    if (newQuad == null) {
        requireActivity().getSupportFragmentManager().popBackStack();
        return;
    }

    try {
        Bitmap originalBitmap = page.getOriginalImage().toBitmap();
        Bitmap deskewed = DocumentPage.perspectiveTransform(originalBitmap, newQuad);
        page.updateFromQuadEdit(deskewed, newQuad);
        mViewModel.notifyPagesChanged();
    } catch (CoreException e) {
        e.printStackTrace();
        Toast.makeText(requireContext(), "Failed to apply edit", Toast.LENGTH_SHORT).show();
    }

    requireActivity().getSupportFragmentManager().popBackStack();
}
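
DocumentPage.perspectiveTransform is not listed in this article. One step any such transform needs is choosing the size of the output image from the edited quad. The sketch below takes the longer of each pair of opposite edges, which is a common heuristic rather than the project's confirmed approach. DeskewSize is a hypothetical name, and the corner order (top-left, top-right, bottom-right, bottom-left) is an assumption.

```java
// Hypothetical helper for sizing a deskewed image; not the sample project's code.
final class DeskewSize {
    private DeskewSize() { }

    /**
     * Estimates the output size for a deskewed quad.
     * corners: four {x, y} pairs ordered top-left, top-right, bottom-right, bottom-left.
     * Returns {width, height} in pixels, each at least 1.
     */
    static int[] estimate(double[][] corners) {
        if (corners == null || corners.length != 4) {
            throw new IllegalArgumentException("expected exactly four corners");
        }
        double top = distance(corners[0], corners[1]);
        double right = distance(corners[1], corners[2]);
        double bottom = distance(corners[2], corners[3]);
        double left = distance(corners[3], corners[0]);

        // Use the longer edge of each opposite pair so no detail is lost to downscaling.
        int width = (int) Math.max(1, Math.round(Math.max(top, bottom)));
        int height = (int) Math.max(1, Math.round(Math.max(left, right)));
        return new int[] { width, height };
    }

    /** Euclidean distance between two {x, y} points. */
    static double distance(double[] a, double[] b) {
        return Math.hypot(b[0] - a[0], b[1] - a[1]);
    }
}
```

On Android, the warp itself could then map the four source corners onto a rectangle of that size, for example with android.graphics.Matrix.setPolyToPoly.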

Step 6: Apply Image Filters and Export to PDF or JPEG

Each DocumentPage supports color mode toggling between color, grayscale, and binary. The ResultFragment lets users apply filters, rotate pages, reorder via drag-and-drop, and export:

mBtnFilterColor.setOnClickListener(v -> applyFilter(EnumImageColourMode.ICM_COLOUR));
mBtnFilterGrayscale.setOnClickListener(v -> applyFilter(EnumImageColourMode.ICM_GRAYSCALE));
mBtnFilterBinary.setOnClickListener(v -> applyFilter(EnumImageColourMode.ICM_BINARY));
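
The applyFilter method is not shown, and the app may well rely on the SDK's own image processing for it. To illustrate what the grayscale and binary modes do conceptually, here is a sketch that works on raw ARGB pixels such as those returned by Bitmap.getPixels. The class name PixelFilters, the Rec. 601 luma weights, and the fixed global threshold are all assumptions made for the example.

```java
// Hypothetical pixel-level filters; not the SDK's implementation.
final class PixelFilters {
    private PixelFilters() { }

    /** Converts ARGB pixels to grayscale using Rec. 601 luma weights; alpha is preserved. */
    static int[] toGrayscale(int[] argb) {
        int[] out = new int[argb.length];
        for (int i = 0; i < argb.length; i++) {
            int p = argb[i];
            int a = p >>> 24;
            int r = (p >> 16) & 0xFF;
            int g = (p >> 8) & 0xFF;
            int b = p & 0xFF;
            int y = (int) Math.min(255, Math.round(0.299 * r + 0.587 * g + 0.114 * b));
            out[i] = (a << 24) | (y << 16) | (y << 8) | y;
        }
        return out;
    }

    /** Maps each pixel to black or white by comparing its luma with a global threshold. */
    static int[] toBinary(int[] argb, int threshold) {
        int[] out = toGrayscale(argb);
        for (int i = 0; i < out.length; i++) {
            int alpha = out[i] & 0xFF000000;
            int luma = out[i] & 0xFF;
            out[i] = alpha | (luma >= threshold ? 0x00FFFFFF : 0);
        }
        return out;
    }
}
```

A single global threshold tends to struggle with uneven lighting, which is why document scanners often prefer locally adaptive thresholding for the binary mode.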

PDF export iterates over all captured pages and writes them into an Android PdfDocument, downscaling any page larger than 2480 × 3508 (the pixel dimensions of A4 at 300 DPI) so that it fits within those bounds:

public static Uri exportToPdf(Context context, List<DocumentPage> pages) throws IOException, CoreException {
    if (pages == null || pages.isEmpty()) return null;

    PdfDocument pdfDocument = new PdfDocument();

    try {
        for (int i = 0; i < pages.size(); i++) {
            DocumentPage page = pages.get(i);
            Bitmap bitmap = page.getDisplayBitmap();
            if (bitmap == null) continue;

            int pageWidth = bitmap.getWidth();
            int pageHeight = bitmap.getHeight();

            float scale = 1.0f;
            if (pageWidth > 2480 || pageHeight > 3508) {
                scale = Math.min(2480f / pageWidth, 3508f / pageHeight);
                pageWidth = Math.round(pageWidth * scale);
                pageHeight = Math.round(pageHeight * scale);
            }

            PdfDocument.PageInfo pageInfo = new PdfDocument.PageInfo.Builder(pageWidth, pageHeight, i + 1).create();
            PdfDocument.Page pdfPage = pdfDocument.startPage(pageInfo);

            Canvas canvas = pdfPage.getCanvas();
            if (scale != 1.0f) {
                canvas.scale(scale, scale);
            }
            canvas.drawBitmap(bitmap, 0, 0, null);

            pdfDocument.finishPage(pdfPage);
        }

        String timestamp = new SimpleDateFormat("yyyyMMdd_HHmmss", Locale.getDefault()).format(new Date());
        String fileName = "DocScan_" + timestamp + ".pdf";

        File documentsDir = new File(context.getFilesDir(), "documents");
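        if (!documentsDir.exists() && !documentsDir.mkdirs()) {
            throw new IOException("Could not create directory: " + documentsDir);
        }

        File pdfFile = new File(documentsDir, fileName);
        // try-with-resources closes the stream even if writeTo() throws
        try (FileOutputStream fos = new FileOutputStream(pdfFile)) {
            pdfDocument.writeTo(fos);
        }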

        return FileProvider.getUriForFile(context,
                context.getPackageName() + ".fileprovider", pdfFile);
    } finally {
        pdfDocument.close();
    }
}
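
The size calculation inside the loop can be pulled out and checked in isolation. The sketch below mirrors the same arithmetic (never upscale, keep the aspect ratio, fit within 2480 × 3508). PageFit and its constant names are hypothetical.

```java
// Hypothetical extraction of the page-size logic from exportToPdf.
final class PageFit {
    static final int MAX_WIDTH = 2480;   // A4 width at 300 DPI
    static final int MAX_HEIGHT = 3508;  // A4 height at 300 DPI

    private PageFit() { }

    /** Returns {width, height} scaled down to fit the A4 bounds; smaller images are unchanged. */
    static int[] fit(int width, int height) {
        if (width <= 0 || height <= 0) {
            throw new IllegalArgumentException("width and height must be positive");
        }
        if (width <= MAX_WIDTH && height <= MAX_HEIGHT) {
            return new int[] { width, height };
        }
        // Use the tighter of the two ratios so both dimensions fit.
        float scale = Math.min((float) MAX_WIDTH / width, (float) MAX_HEIGHT / height);
        return new int[] { Math.round(width * scale), Math.round(height * scale) };
    }
}
```

For example, a 4960 × 7016 bitmap comes out as 2480 × 3508, while a 1000 × 1000 bitmap is left alone. Note that PdfDocument.PageInfo measures pages in PostScript points (1/72 inch), so these values set the page's coordinate space rather than a physical A4 sheet.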

Step 7: Detect and Normalize Documents from Gallery Images

Users can import images from the gallery. The app reads the EXIF orientation, applies rotation correction, re-encodes the corrected bitmap as JPEG, and passes it to the capture vision router for document detection. If no document is detected, the whole image is added as a page:

private void processGalleryImage(Uri imageUri) {
    try {
        int exifRotation = 0;
        try (InputStream exifStream = requireContext().getContentResolver().openInputStream(imageUri)) {
            if (exifStream != null) {
                ExifInterface exif = new ExifInterface(exifStream);
                int orientation = exif.getAttributeInt(
                        ExifInterface.TAG_ORIENTATION, ExifInterface.ORIENTATION_NORMAL);
                switch (orientation) {
                    case ExifInterface.ORIENTATION_ROTATE_90:  exifRotation = 90;  break;
                    case ExifInterface.ORIENTATION_ROTATE_180: exifRotation = 180; break;
                    case ExifInterface.ORIENTATION_ROTATE_270: exifRotation = 270; break;
                }
            }
        } catch (IOException ignored) { }

        Bitmap bitmap;
        try (InputStream decodeStream = requireContext().getContentResolver().openInputStream(imageUri)) {
            if (decodeStream == null) return;
            bitmap = BitmapFactory.decodeStream(decodeStream);
        }
        if (bitmap == null) return;

        if (exifRotation != 0) {
            android.graphics.Matrix m = new android.graphics.Matrix();
            m.postRotate(exifRotation);
            Bitmap rotated = Bitmap.createBitmap(bitmap, 0, 0, bitmap.getWidth(), bitmap.getHeight(), m, true);
            if (rotated != bitmap) bitmap.recycle();
            bitmap = rotated;
        }

        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        bitmap.compress(Bitmap.CompressFormat.JPEG, 95, baos);
        byte[] jpegBytes = baos.toByteArray();

        CapturedResult capturedResult = mRouter.capture(jpegBytes, EnumPresetTemplate.PT_DETECT_AND_NORMALIZE_DOCUMENT);
        if (capturedResult != null) {
            ProcessedDocumentResult docResult = capturedResult.getProcessedDocumentResult();
            if (docResult != null && docResult.getDeskewedImageResultItems().length > 0) {
                DeskewedImageResultItem deskewedItem = docResult.getDeskewedImageResultItems()[0];
                ImageData normalizedImage = deskewedItem.getImageData();
                Quadrilateral quad = deskewedItem.getSourceDeskewQuad();
                DocumentPage page = new DocumentPage(null, normalizedImage, quad);
                bitmap.recycle();
                mViewModel.addPage(page);
                return;
            }
        }

        Toast.makeText(requireContext(), R.string.no_document_detected, Toast.LENGTH_SHORT).show();
        DocumentPage page = new DocumentPage(bitmap);
        mViewModel.addPage(page);
    } catch (Exception e) {
        e.printStackTrace();
    }
}
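
The orientation switch above can be isolated into a small pure function, which makes it easy to test without a device. The sketch below uses the numeric values defined by the EXIF orientation tag. The class name ExifRotation is hypothetical. Like the code above, it treats the mirrored orientations (2, 4, 5, and 7) as needing no rotation, which is a simplification.

```java
// Hypothetical helper mirroring the switch in processGalleryImage.
final class ExifRotation {
    // Values defined by the EXIF orientation tag.
    static final int ORIENTATION_NORMAL = 1;
    static final int ORIENTATION_ROTATE_180 = 3;
    static final int ORIENTATION_ROTATE_90 = 6;
    static final int ORIENTATION_ROTATE_270 = 8;

    private ExifRotation() { }

    /** Clockwise rotation in degrees needed to display the image upright. */
    static int degreesFor(int orientation) {
        switch (orientation) {
            case ORIENTATION_ROTATE_90:
                return 90;
            case ORIENTATION_ROTATE_180:
                return 180;
            case ORIENTATION_ROTATE_270:
                return 270;
            default:
                return 0; // normal, unknown, or mirrored orientations
        }
    }

    /** Size of the image after rotation; 90 and 270 degrees swap width and height. */
    static int[] rotatedSize(int width, int height, int degrees) {
        boolean swap = degrees == 90 || degrees == 270;
        return swap ? new int[] { height, width } : new int[] { width, height };
    }
}
```

Knowing the rotated size up front can help when choosing a decode sample size for large gallery images before the bitmap is rotated.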

Source Code

https://github.com/yushulx/android-camera-barcode-mrz-document-scanner/tree/main/examples/DynamsoftDocumentScanner