Scanning physical documents with a phone camera sounds simple until you deal with skewed angles, inconsistent lighting, and multi-page workflows. The Dynamsoft Capture Vision SDK for Android handles real-time document boundary detection, perspective correction, and image normalization — letting you focus on the user experience instead of low-level image processing.
What you'll build: A full-featured Android document scanning app in Java that auto-captures documents via quad stabilization, supports gallery import, quad editing, image filters (color/grayscale/binary), page sorting, rotation, and multi-page PDF and JPEG export — all powered by Dynamsoft Capture Vision SDK.
Demo Video: Android Document Scanner in Action
Prerequisites
- Android Studio (Arctic Fox or later)
- Android SDK with compileSdk 35 and minSdk 23
- Java 11 (source and target compatibility)
- Dynamsoft Capture Vision SDK: com.dynamsoft:capturevisionbundle:3.4.1000
- A physical Android device (camera-based features do not work on the emulator)
Get a 30-day free trial license for Dynamsoft Capture Vision SDK.
Step 1: Add the Dynamsoft Maven Repository and SDK Dependency
Dynamsoft packages are hosted on a custom Maven repository. Add it to your root build.gradle:
allprojects {
    repositories {
        google()
        mavenCentral()
        maven { url "https://download2.dynamsoft.com/maven/aar" }
    }
}
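Note: projects generated by recent versions of Android Studio declare repositories in settings.gradle via dependencyResolutionManagement rather than in the root build.gradle. If that's your setup, add the Dynamsoft repository there instead:

```groovy
// settings.gradle — repository declarations for newer Gradle project layouts
dependencyResolutionManagement {
    repositories {
        google()
        mavenCentral()
        maven { url "https://download2.dynamsoft.com/maven/aar" }
    }
}
```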
Then declare the SDK dependency in your module-level build.gradle:
dependencies {
    implementation "com.dynamsoft:capturevisionbundle:3.4.1000"
    implementation "androidx.appcompat:appcompat:1.7.1"
    implementation "com.google.android.material:material:1.11.0"
    implementation "androidx.activity:activity:1.8.2"
    implementation "androidx.viewpager2:viewpager2:1.0.0"
    implementation "androidx.exifinterface:exifinterface:1.3.7"
}
Step 2: Initialize the License and Set Up the Camera
In your ScannerFragment, initialize the Dynamsoft license and set up CameraEnhancer with CaptureVisionRouter. The router connects to the camera as its input source and starts processing frames using the document detection template.
@Override
public View onCreateView(@NonNull LayoutInflater inflater, @Nullable ViewGroup container,
                         @Nullable Bundle savedInstanceState) {
    PermissionUtil.requestCameraPermission(requireActivity());
    mViewModel = new ViewModelProvider(requireActivity()).get(DocumentScannerViewModel.class);
    mViewModel.actionBarTitle.setValue(requireContext().getString(R.string.scan_page_title));
    if (savedInstanceState == null) {
        LicenseManager.initLicense("LICENSE-KEY", (isSuccess, error) -> {
            if (!isSuccess && error != null) {
                error.printStackTrace();
            }
        });
    }
    return inflater.inflate(R.layout.fragment_scanner, container, false);
}
Wire up the camera and capture vision router in onViewCreated:
CameraView cameraView = view.findViewById(R.id.cameraView);
mCamera = new CameraEnhancer(cameraView, getViewLifecycleOwner());
mRouter = new CaptureVisionRouter();

MultiFrameResultCrossFilter filter = new MultiFrameResultCrossFilter();
filter.enableResultCrossVerification(EnumCapturedResultItemType.CRIT_DESKEWED_IMAGE, true);
mRouter.addResultFilter(filter);

try {
    mRouter.setInput(mCamera);
} catch (CaptureVisionRouterException e) {
    e.printStackTrace();
    return;
}
Start and stop capturing in sync with the fragment lifecycle:
@Override
public void onResume() {
    super.onResume();
    mCamera.open();
    mRouter.startCapturing(EnumPresetTemplate.PT_DETECT_AND_NORMALIZE_DOCUMENT, new CompletionListener() {
        @Override
        public void onSuccess() { }

        @Override
        public void onFailure(int errorCode, String errorString) {
            mViewModel.startCapturingError.postValue(errorString);
        }
    });
}

@Override
public void onPause() {
    super.onPause();
    mCamera.close();
    mRouter.stopCapturing();
}
Step 3: Receive Detection Results and Enable Auto-Capture
Register a CapturedResultReceiver to handle detected documents. Each frame that contains a deskewed document image is either consumed by manual capture or fed to a QuadStabilizer for automatic capture when the document boundary remains stable across consecutive frames.
mRouter.addResultReceiver(new CapturedResultReceiver() {
    @Override
    public void onProcessedDocumentResultReceived(@NonNull ProcessedDocumentResult result) {
        if (result.getDeskewedImageResultItems().length > 0) {
            DeskewedImageResultItem item = result.getDeskewedImageResultItems()[0];
            mLatestDeskewedItem = item;
            mLatestOriginalImageHashId = result.getOriginalImageHashId();
            if (mIsBtnClicked) {
                mIsBtnClicked = false;
                mHandler.removeCallbacks(mCaptureTimeoutRunnable);
                captureResult(item, result.getOriginalImageHashId());
            } else if (item.getCrossVerificationStatus() == EnumCrossVerificationStatus.CVS_PASSED) {
                Quadrilateral quad = item.getSourceDeskewQuad();
                if (quad != null) {
                    mQuadStabilizer.feedQuad(quad);
                }
            }
        }
    }
});
The QuadStabilizer compares consecutive quad detections using IoU (Intersection over Union) and area-delta thresholds. When the boundary stays stable for a configurable number of frames, it triggers auto-capture:
public void feedQuad(Quadrilateral quad) {
    if (!autoCaptureEnabled) {
        return;
    }
    if (previousQuad == null) {
        previousQuad = quad;
        consecutiveStableFrames = 0;
        return;
    }
    float iou = calculateIoU(previousQuad, quad);
    double prevArea = calculateQuadArea(previousQuad);
    double currArea = calculateQuadArea(quad);
    float areaDelta = prevArea > 0 ? (float) Math.abs(currArea - prevArea) / (float) prevArea : 1.0f;
    if (iou >= iouThreshold && areaDelta <= areaDeltaThreshold) {
        consecutiveStableFrames++;
        if (consecutiveStableFrames >= stableFrameCount && callback != null) {
            callback.onStable();
            reset();
        }
    } else {
        consecutiveStableFrames = 0;
    }
    previousQuad = quad;
}
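The helpers calculateQuadArea and calculateIoU are not shown above. As a rough, self-contained sketch of what they compute (using plain float[] corner arrays instead of the SDK's Quadrilateral type, and approximating IoU with axis-aligned bounding boxes rather than exact polygon intersection):

```java
// Illustrative, self-contained versions of the stabilizer's helpers.
// Corners are packed as float[]{x0, y0, x1, y1, ...}.
final class QuadMath {
    // Shoelace formula: area of a simple polygon from its corner coordinates.
    static double quadArea(float[] q) {
        double area = 0;
        int n = q.length / 2;
        for (int i = 0; i < n; i++) {
            int j = (i + 1) % n;
            area += q[2 * i] * q[2 * j + 1] - q[2 * j] * q[2 * i + 1];
        }
        return Math.abs(area) / 2.0;
    }

    // Bounding-box IoU: intersection area over union area of the two boxes.
    static float boundingBoxIoU(float[] a, float[] b) {
        float[] ra = bounds(a), rb = bounds(b);
        float ix = Math.max(0, Math.min(ra[2], rb[2]) - Math.max(ra[0], rb[0]));
        float iy = Math.max(0, Math.min(ra[3], rb[3]) - Math.max(ra[1], rb[1]));
        float inter = ix * iy;
        float union = (ra[2] - ra[0]) * (ra[3] - ra[1])
                    + (rb[2] - rb[0]) * (rb[3] - rb[1]) - inter;
        return union > 0 ? inter / union : 0f;
    }

    // Axis-aligned bounding box {minX, minY, maxX, maxY} of a corner array.
    private static float[] bounds(float[] q) {
        float minX = Float.MAX_VALUE, minY = Float.MAX_VALUE;
        float maxX = -Float.MAX_VALUE, maxY = -Float.MAX_VALUE;
        for (int i = 0; i < q.length; i += 2) {
            minX = Math.min(minX, q[i]);     maxX = Math.max(maxX, q[i]);
            minY = Math.min(minY, q[i + 1]); maxY = Math.max(maxY, q[i + 1]);
        }
        return new float[]{minX, minY, maxX, maxY};
    }
}
```

With an IoU close to 1 and an area delta close to 0 across several consecutive frames, the boundary is considered stable and auto-capture fires.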
Step 4: Capture and Store Normalized Document Pages
When a document is captured — manually or via auto-capture — the normalized image, original frame, and detected quad are bundled into a DocumentPage and stored in a shared ViewModel:
private void captureResult(DeskewedImageResultItem item, String originalImageHashId) {
    if (mCooldown) return;
    mCooldown = true;
    ImageData normalizedImage = item.getImageData();
    Quadrilateral quad = item.getSourceDeskewQuad();
    ImageData originalImage = mRouter.getIntermediateResultManager().getOriginalImage(originalImageHashId);
    DocumentPage page = new DocumentPage(originalImage, normalizedImage, quad);
    if (mIsRetakeMode && mRetakeIndex >= 0) {
        mViewModel.replacePage(mRetakeIndex, page);
        mViewModel.retakePageIndex.postValue(-1);
        mHandler.post(() -> requireActivity().getSupportFragmentManager().popBackStack());
    } else {
        mViewModel.addPage(page);
    }
    mQuadStabilizer.reset();
    mHandler.postDelayed(() -> mCooldown = false, 1500);
}
For manual capture, if no document boundary is detected within 500ms, a raw camera frame is captured as a fallback:
private void captureRawFrame() {
    if (mCooldown) return;
    mCooldown = true;
    try {
        ImageData frame = mCamera.getImage();
        if (frame != null) {
            DocumentPage page = new DocumentPage(frame, frame, null);
            if (mIsRetakeMode && mRetakeIndex >= 0) {
                mViewModel.replacePage(mRetakeIndex, page);
                mViewModel.retakePageIndex.postValue(-1);
                mHandler.post(() -> requireActivity().getSupportFragmentManager().popBackStack());
            } else {
                mViewModel.addPage(page);
            }
        }
    } catch (Exception e) {
        e.printStackTrace();
    }
    mHandler.postDelayed(() -> mCooldown = false, 1500);
}
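The timeout itself is a schedule-then-cancel pattern: pressing the shutter posts a delayed fallback (mCaptureTimeoutRunnable on a Handler in the app), and an arriving detection cancels it before it runs. A minimal plain-Java sketch of the same pattern, using ScheduledExecutorService instead of an Android Handler:

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;

// Schedule-then-cancel: run the fallback only if no detection
// arrives within the timeout window.
class CaptureTimeout {
    private final ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor();
    private ScheduledFuture<?> pending;

    // Called when the shutter button is pressed.
    synchronized void armFallback(Runnable fallback, long timeoutMs) {
        pending = scheduler.schedule(fallback, timeoutMs, TimeUnit.MILLISECONDS);
    }

    // Called when a deskewed result arrives in time; true if the
    // fallback was cancelled before it could run.
    synchronized boolean cancelFallback() {
        return pending != null && pending.cancel(false);
    }

    void shutdown() { scheduler.shutdownNow(); }
}
```

In the app, armFallback corresponds to mHandler.postDelayed(mCaptureTimeoutRunnable, 500) on button press, and cancelFallback to the mHandler.removeCallbacks(mCaptureTimeoutRunnable) call seen in the result receiver.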
Step 5: Edit Document Boundaries with the Quad Editor
Users can drag the detected document corners to adjust the crop region. The EditFragment uses Dynamsoft's ImageEditorView to display the original image overlaid with a draggable quad. On apply, a perspective transform produces the corrected output:
private void applyEdit() {
    DocumentPage page = mViewModel.getPage(mPageIndex);
    if (page == null || !page.hasOriginalImage()) {
        requireActivity().getSupportFragmentManager().popBackStack();
        return;
    }
    DrawingLayer layer = mEditorView.getDrawingLayer(DrawingLayer.DDN_LAYER_ID);
    List<DrawingItem> items = layer.getDrawingItems();
    Quadrilateral newQuad = null;
    for (DrawingItem item : items) {
        if (item instanceof QuadDrawingItem) {
            newQuad = ((QuadDrawingItem) item).getQuad();
            break;
        }
    }
    if (newQuad == null) {
        requireActivity().getSupportFragmentManager().popBackStack();
        return;
    }
    try {
        Bitmap originalBitmap = page.getOriginalImage().toBitmap();
        Bitmap deskewed = DocumentPage.perspectiveTransform(originalBitmap, newQuad);
        page.updateFromQuadEdit(deskewed, newQuad);
        mViewModel.notifyPagesChanged();
    } catch (CoreException e) {
        e.printStackTrace();
        Toast.makeText(requireContext(), "Failed to apply edit", Toast.LENGTH_SHORT).show();
    }
    requireActivity().getSupportFragmentManager().popBackStack();
}
Step 6: Apply Image Filters and Export to PDF or JPEG
Each DocumentPage supports color mode toggling between color, grayscale, and binary. The ResultFragment lets users apply filters, rotate pages, reorder via drag-and-drop, and export:
mBtnFilterColor.setOnClickListener(v -> applyFilter(EnumImageColourMode.ICM_COLOUR));
mBtnFilterGrayscale.setOnClickListener(v -> applyFilter(EnumImageColourMode.ICM_GRAYSCALE));
mBtnFilterBinary.setOnClickListener(v -> applyFilter(EnumImageColourMode.ICM_BINARY));
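Conceptually, the grayscale filter collapses each pixel's RGB channels into a luminance value, and the binary filter thresholds that luminance to pure black or white. The SDK performs these conversions internally via EnumImageColourMode; the following plain-Java snippet is only an illustration of the idea, not the SDK's implementation:

```java
// Illustration of what the colour-mode filters do to packed ARGB pixels.
// Not the SDK's implementation — the SDK applies these modes internally.
class PixelFilters {
    // Rec. 601 luma: weighted average of the R, G, B channels.
    static int toGray(int argb) {
        int r = (argb >> 16) & 0xFF, g = (argb >> 8) & 0xFF, b = argb & 0xFF;
        int y = (int) (0.299 * r + 0.587 * g + 0.114 * b);
        return (argb & 0xFF000000) | (y << 16) | (y << 8) | y;
    }

    // Global threshold: pixels at or above the threshold become white.
    static int toBinary(int argb, int threshold) {
        int y = toGray(argb) & 0xFF;
        int v = y >= threshold ? 0xFF : 0x00;
        return (argb & 0xFF000000) | (v << 16) | (v << 8) | v;
    }
}
```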
PDF export iterates over the captured pages and writes them into an Android PdfDocument, scaling any page down so it fits within A4 at 300 DPI (2480 × 3508 px):
public static Uri exportToPdf(Context context, List<DocumentPage> pages) throws IOException, CoreException {
    if (pages == null || pages.isEmpty()) return null;
    PdfDocument pdfDocument = new PdfDocument();
    try {
        for (int i = 0; i < pages.size(); i++) {
            DocumentPage page = pages.get(i);
            Bitmap bitmap = page.getDisplayBitmap();
            if (bitmap == null) continue;
            int pageWidth = bitmap.getWidth();
            int pageHeight = bitmap.getHeight();
            float scale = 1.0f;
            // Cap page size at A4 @ 300 DPI (2480 x 3508 px)
            if (pageWidth > 2480 || pageHeight > 3508) {
                scale = Math.min(2480f / pageWidth, 3508f / pageHeight);
                pageWidth = Math.round(pageWidth * scale);
                pageHeight = Math.round(pageHeight * scale);
            }
            PdfDocument.PageInfo pageInfo = new PdfDocument.PageInfo.Builder(pageWidth, pageHeight, i + 1).create();
            PdfDocument.Page pdfPage = pdfDocument.startPage(pageInfo);
            Canvas canvas = pdfPage.getCanvas();
            if (scale != 1.0f) {
                canvas.scale(scale, scale);
            }
            canvas.drawBitmap(bitmap, 0, 0, null);
            pdfDocument.finishPage(pdfPage);
        }
        String timestamp = new SimpleDateFormat("yyyyMMdd_HHmmss", Locale.getDefault()).format(new Date());
        String fileName = "DocScan_" + timestamp + ".pdf";
        File documentsDir = new File(context.getFilesDir(), "documents");
        if (!documentsDir.exists()) {
            documentsDir.mkdirs();
        }
        File pdfFile = new File(documentsDir, fileName);
        // try-with-resources closes the stream even if writeTo throws
        try (FileOutputStream fos = new FileOutputStream(pdfFile)) {
            pdfDocument.writeTo(fos);
        }
        return FileProvider.getUriForFile(context,
                context.getPackageName() + ".fileprovider", pdfFile);
    } finally {
        pdfDocument.close();
    }
}
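The size cap in the loop above corresponds to A4 at 300 DPI (2480 × 3508 px). Pulled out as a pure function, the scaling decision looks like this (a sketch mirroring the logic in exportToPdf):

```java
// Downscale factor so a bitmap fits within A4 at 300 DPI (2480 x 3508 px).
// Returns 1.0f when the bitmap already fits.
class A4Fit {
    static final int A4_WIDTH_PX = 2480;   // 8.27 in x 300 DPI
    static final int A4_HEIGHT_PX = 3508;  // 11.69 in x 300 DPI

    static float scaleToFit(int width, int height) {
        if (width <= A4_WIDTH_PX && height <= A4_HEIGHT_PX) return 1.0f;
        // Use the smaller ratio so both dimensions fit after scaling
        return Math.min((float) A4_WIDTH_PX / width, (float) A4_HEIGHT_PX / height);
    }
}
```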
Step 7: Detect and Normalize Documents from Gallery Images
Users can import images from the gallery. The app reads EXIF orientation, applies rotation correction, and feeds the corrected bitmap to the capture vision router for document detection:
private void processGalleryImage(Uri imageUri) {
    try {
        // Read the EXIF orientation so the bitmap can be rotated upright
        int exifRotation = 0;
        try (InputStream exifStream = requireContext().getContentResolver().openInputStream(imageUri)) {
            if (exifStream != null) {
                ExifInterface exif = new ExifInterface(exifStream);
                int orientation = exif.getAttributeInt(
                        ExifInterface.TAG_ORIENTATION, ExifInterface.ORIENTATION_NORMAL);
                switch (orientation) {
                    case ExifInterface.ORIENTATION_ROTATE_90: exifRotation = 90; break;
                    case ExifInterface.ORIENTATION_ROTATE_180: exifRotation = 180; break;
                    case ExifInterface.ORIENTATION_ROTATE_270: exifRotation = 270; break;
                }
            }
        } catch (IOException ignored) { }

        Bitmap bitmap;
        try (InputStream decodeStream = requireContext().getContentResolver().openInputStream(imageUri)) {
            if (decodeStream == null) return;
            bitmap = BitmapFactory.decodeStream(decodeStream);
        }
        if (bitmap == null) return;

        if (exifRotation != 0) {
            android.graphics.Matrix m = new android.graphics.Matrix();
            m.postRotate(exifRotation);
            Bitmap rotated = Bitmap.createBitmap(bitmap, 0, 0, bitmap.getWidth(), bitmap.getHeight(), m, true);
            if (rotated != bitmap) bitmap.recycle();
            bitmap = rotated;
        }

        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        bitmap.compress(Bitmap.CompressFormat.JPEG, 95, baos);
        byte[] jpegBytes = baos.toByteArray();

        CapturedResult capturedResult = mRouter.capture(jpegBytes, EnumPresetTemplate.PT_DETECT_AND_NORMALIZE_DOCUMENT);
        if (capturedResult != null) {
            ProcessedDocumentResult docResult = capturedResult.getProcessedDocumentResult();
            if (docResult != null && docResult.getDeskewedImageResultItems().length > 0) {
                DeskewedImageResultItem deskewedItem = docResult.getDeskewedImageResultItems()[0];
                ImageData normalizedImage = deskewedItem.getImageData();
                Quadrilateral quad = deskewedItem.getSourceDeskewQuad();
                DocumentPage page = new DocumentPage(null, normalizedImage, quad);
                bitmap.recycle();
                mViewModel.addPage(page);
                return;
            }
        }

        // No document detected: keep the raw image as a page
        Toast.makeText(requireContext(), R.string.no_document_detected, Toast.LENGTH_SHORT).show();
        DocumentPage page = new DocumentPage(bitmap);
        mViewModel.addPage(page);
    } catch (Exception e) {
        e.printStackTrace();
    }
}
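The EXIF switch above maps orientation tags to rotation degrees. The same mapping as a standalone helper, written against the raw EXIF tag values (6, 3, 8) that ExifInterface's ORIENTATION_ROTATE_90/180/270 constants stand for:

```java
// EXIF orientation tag -> clockwise rotation in degrees.
// Raw tag values: 6 = rotate 90, 3 = rotate 180, 8 = rotate 270
// (the values behind ExifInterface.ORIENTATION_ROTATE_90/180/270).
class ExifRotation {
    static int degreesFor(int orientationTag) {
        switch (orientationTag) {
            case 6: return 90;
            case 3: return 180;
            case 8: return 270;
            default: return 0; // ORIENTATION_NORMAL; mirrored variants are ignored, as in the app
        }
    }
}
```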


