
Face Detection in Flutter using ML Kit (No Backend Required)


A production-grade guide to building real-time on-device face detection in Flutter using clean architecture and GetX.

Introduction

Face detection is one of the most widely used computer vision capabilities in modern mobile applications. From camera filters and augmented reality to identity verification and accessibility tools, detecting faces in real time is a foundational building block for many advanced features.

While many tutorials demonstrate how to detect a face in Flutter, most stop once the demo works on an emulator. Real applications require much more: stable camera streaming, correct coordinate transformations, efficient frame processing, and a maintainable architecture that can evolve with the application.

In this guide we will build a production-grade Flutter application that performs real-time face detection entirely on the device using Google ML Kit.

The application runs without:

• a backend server
• API keys for inference
• network connectivity

All processing occurs locally on the device.

Along the way we will explore several production concerns:

Structuring ML features with clean architecture

Managing camera streams and throttling frame processing

Mapping camera coordinates to screen coordinates correctly

Avoiding common state-management pitfalls with GetX

Debugging camera applications on physical devices

All code examples in this article were tested on a Samsung Galaxy M13 running Android 13, which revealed several edge cases that do not appear in emulators.

Project Demo

The final application performs real-time face detection directly from the device camera.

Features included in the demo:

Live camera preview

Real-time face detection

Bounding boxes over detected faces

Facial landmarks and contours

Smile and eye-open probability classification

Face tracking across frames

Front and rear camera switching

Because detection runs entirely on-device, the feature works offline with extremely low latency.

Typical performance on a mid-range Android device:

Metric               Observed Value
Detection latency    10–30 ms
Camera preview       30 fps
Detection pipeline   5–8 fps
Network usage        0

Why On-Device Face Detection?

A traditional approach to computer vision involves sending camera frames to a cloud API for processing.

This architecture introduces several drawbacks:

Latency — every frame must travel to a server and back.

Offline failure — the feature stops working without connectivity.

Privacy concerns — captured images are transmitted to third-party infrastructure.

Google ML Kit provides an alternative: on-device machine learning inference.

The face detection model runs locally within the mobile application process. This eliminates network overhead and ensures that user data never leaves the device.

Approach            Latency     Offline   Privacy
Cloud API           80–400 ms   No        Images leave device
ML Kit On-Device    10–30 ms    Yes       Images stay in memory

For camera-based experiences that require real-time interaction, on-device inference is the only practical solution.

Application Architecture

The application follows a three-layer clean architecture where dependencies always flow inward.

Presentation Layer
│
├── FaceDetectionScreen
├── FaceDetectionController (GetX)
└── FaceOverlayPainter

Domain Layer
│
├── Entities
├── Repository Interfaces
└── Use Cases

Data Layer
│
├── CameraDataSource
├── FaceDetectorDataSource
└── FaceDetectionRepositoryImpl

Each layer has a clearly defined responsibility:

Data Layer

Responsible for interacting with external frameworks and services. This is the only layer that imports ML Kit and camera libraries.

Domain Layer

Contains pure Dart business logic including entities, repository interfaces, and use cases. The domain layer has no dependency on Flutter or platform APIs.

Presentation Layer

Responsible for user interaction, UI rendering, and state management through GetX.

This separation ensures the detection pipeline can be tested independently of camera hardware.
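To make this separation concrete, here is a minimal sketch of what the domain layer might look like. The names (`FaceEntity`, `FaceDetectionRepository`, `DetectFacesUseCase`) are illustrative, not taken from the project's repository; note there is not a single Flutter, camera, or ML Kit import.

```dart
// Illustrative domain-layer types; pure Dart, no platform dependencies.
class FaceEntity {
  const FaceEntity({required this.trackingId, this.smilingProbability});
  final int trackingId;
  final double? smilingProbability;
}

// Abstraction the data layer will implement with ML Kit behind the scenes.
abstract class FaceDetectionRepository {
  Future<List<FaceEntity>> detectFaces(Object frame);
}

// Use case: a single callable wrapping one business operation.
class DetectFacesUseCase {
  const DetectFacesUseCase(this.repository);
  final FaceDetectionRepository repository;

  Future<List<FaceEntity>> call(Object frame) => repository.detectFaces(frame);
}
```

Because these types know nothing about cameras or ML Kit, they compile and test on any machine, which is exactly what makes the pipeline testable without hardware.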

Setting Up ML Kit

Dependencies

Add the following packages to your pubspec.yaml.

dependencies:
  google_mlkit_face_detection: ^0.11.0
  camera: ^0.10.5
  permission_handler: ^11.3.0
  get: ^4.6.6
  get_it: ^7.6.7

Android Configuration

ML Kit requires a minimum SDK version of 21 and camera permission.

AndroidManifest.xml

<uses-permission android:name="android.permission.CAMERA" />

build.gradle

android {
    defaultConfig {
        minSdkVersion 21
    }
}

iOS Configuration

Add camera permission to Info.plist.

<key>NSCameraUsageDescription</key>
<string>Camera access is required to detect faces on this device.</string>

Initializing the Face Detector

The ML Kit detector should be created once and reused across frames.

final detector = FaceDetector(
  options: FaceDetectorOptions(
    enableLandmarks: true,
    enableClassification: true,
    enableTracking: true,
    minFaceSize: 0.1,
    performanceMode: FaceDetectorMode.fast,
  ),
);

Enabling tracking allows ML Kit to maintain a stable ID for each detected face across frames.
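With tracking and classification enabled, each detected `Face` exposes a `trackingId`, `smilingProbability`, and `leftEyeOpenProbability` (the latter two are nullable). A small helper like the one below, which is a hypothetical utility rather than part of ML Kit, shows how those nullable values can be turned into a readable summary:

```dart
/// Hypothetical helper (not part of ML Kit) that summarises the nullable
/// classification values returned for a single detected face.
String describeFace({int? trackingId, double? smileProb, double? leftEyeOpenProb}) {
  final parts = <String>[
    'face ${trackingId ?? '?'}',
    if (smileProb != null) 'smiling: ${(smileProb * 100).round()}%',
    if (leftEyeOpenProb != null) 'left eye open: ${(leftEyeOpenProb * 100).round()}%',
  ];
  return parts.join(', ');
}
```

Guarding against the null cases matters: classification values are only populated when `enableClassification` is on and the model is confident enough to report them.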

Camera Setup

The camera stream provides raw frames that are processed by the detection pipeline.

One critical detail is selecting the correct image format depending on the platform.

CameraController(
  camera,
  ResolutionPreset.medium,
  enableAudio: false,
  imageFormatGroup: Platform.isAndroid
      ? ImageFormatGroup.nv21
      : ImageFormatGroup.bgra8888,
);

Using an incorrect format results in zero detections with no visible error, making it one of the most confusing bugs during development.

Converting Camera Frames to InputImage

The camera frame must be converted into an ML Kit InputImage.

InputImage _convertCameraImage(CameraImage image) {
  final WriteBuffer allBytes = WriteBuffer();
  for (final plane in image.planes) {
    allBytes.putUint8List(plane.bytes);
  }
  final bytes = allBytes.done().buffer.asUint8List();

  final Size imageSize = Size(
    image.width.toDouble(),
    image.height.toDouble(),
  );

  final camera = cameras[currentCameraIndex];
  final rotation = InputImageRotationValue.fromRawValue(
          camera.sensorOrientation) ??
      InputImageRotation.rotation0deg;

  final format = InputImageFormatValue.fromRawValue(image.format.raw) ??
      InputImageFormat.nv21;

  final metadata = InputImageMetadata(
    size: imageSize,
    rotation: rotation,
    format: format,
    bytesPerRow: image.planes.first.bytesPerRow,
  );

  return InputImage.fromBytes(bytes: bytes, metadata: metadata);
}

Frame Throttling

Camera streams often deliver frames at 30 frames per second.

Running detection on every frame can overload mid-range devices.

A simple flag ensures only one detection runs at a time.

if (_isProcessing) return;
_isProcessing = true;
try {
  await detectFaces(image);
} finally {
  _isProcessing = false; // reset even if detection throws
}

This maintains smooth camera preview while keeping inference latency under control.
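The same idea can be packaged into a small reusable class that additionally skips frames so detection only runs on every Nth frame. This is a sketch under my own naming (`FrameThrottler` is not from the article's repository):

```dart
/// Minimal frame throttler: runs at most one detection at a time and
/// only processes one out of every `skip` frames.
class FrameThrottler {
  FrameThrottler({this.skip = 3});
  final int skip;
  int _counter = 0;
  bool _busy = false;

  Future<void> maybeProcess(Future<void> Function() process) async {
    _counter = (_counter + 1) % skip;
    if (_busy || _counter != 0) return; // drop the frame
    _busy = true;
    try {
      await process();
    } finally {
      _busy = false; // always reset, even on error
    }
  }
}
```

Dropping frames is safe here because face detection is stateless per frame; the next processed frame simply replaces the previous result.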

The Coordinate Transform Problem

Face coordinates returned by ML Kit are expressed in image space, not screen space.

The overlay must therefore transform coordinates from the camera image buffer to the device display.

Three operations are required:

Rotate coordinates based on the camera sensor orientation.

Scale coordinates to match the screen dimensions.

Mirror the x-axis when using the front camera.

Failing to apply these transformations correctly leads to bounding boxes appearing offset, stretched, or mirrored.

Handling this mapping correctly is essential for building reliable camera overlays.
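Under simplifying assumptions (the preview fills the screen and the frame has already been rotated upright), the remaining scale-and-mirror steps reduce to a small pure function. This is a sketch of the idea, not the project's actual painter code:

```dart
/// Maps one point from image space to screen space.
/// Assumes the preview fills the screen and rotation is already applied,
/// so only scaling and (for the front camera) mirroring remain.
({double x, double y}) imageToScreen({
  required double x,
  required double y,
  required double imageWidth,
  required double imageHeight,
  required double screenWidth,
  required double screenHeight,
  required bool isFrontCamera,
}) {
  final scaleX = screenWidth / imageWidth;
  final scaleY = screenHeight / imageHeight;
  var sx = x * scaleX;
  final sy = y * scaleY;
  // Front-camera previews are mirrored, so flip the x-axis.
  if (isFrontCamera) sx = screenWidth - sx;
  return (x: sx, y: sy);
}
```

In a real app the preview usually letterboxes or crops (`BoxFit.cover`), which adds an offset term to this mapping, but the scale-then-mirror order shown here stays the same.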

State Management with GetX

GetX provides a lightweight approach to managing application state.

However, several pitfalls can cause runtime issues.

Observable Values

Reactive variables must be wrapped using Rx types.

final rawImageSize = Rxn<Size>();

Assignments update the .value property:

rawImageSize.value = Size(width, height);

Using Obx Correctly

Obx widgets rebuild automatically when observables change.

Obx(() {
  final imgSize = controller.rawImageSize.value;
  if (imgSize == null) return const SizedBox.shrink();
  return CustomPaint(
    painter: FaceOverlayPainter(imageSize: imgSize),
  );
})

Async Cleanup

Camera streams must be stopped before disposing the controller.

@override
Future<void> onClose() async {
  await cameraController?.stopImageStream();
  await cameraController?.dispose();
  super.onClose();
}

Failing to await cleanup often causes platform exceptions during hot reload or navigation.

Drawing the Face Overlay

The overlay is rendered using a CustomPainter.

Performance considerations are important when drawing overlays on every frame.

Paint objects should be reused instead of created inside the paint() method to avoid unnecessary garbage collection.

The shouldRepaint method must also compare actual face data rather than list references to ensure updates occur correctly.
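The value-versus-reference distinction is easy to get wrong because the controller may emit a brand-new list instance with identical contents every frame. A pure comparison like the following sketch (using a simplified `FaceBox` stand-in for ML Kit's `Face`, since the real class carries many more fields) captures what `shouldRepaint` needs to check:

```dart
// Simplified stand-in for a detected face's bounding box.
class FaceBox {
  const FaceBox(this.left, this.top, this.right, this.bottom);
  final double left, top, right, bottom;
}

/// Returns true when the two frames differ, i.e. the painter should repaint.
/// Compares values, not list identity: a new list with equal contents
/// should NOT trigger a repaint.
bool facesChanged(List<FaceBox> a, List<FaceBox> b) {
  if (a.length != b.length) return true;
  for (var i = 0; i < a.length; i++) {
    if (a[i].left != b[i].left ||
        a[i].top != b[i].top ||
        a[i].right != b[i].right ||
        a[i].bottom != b[i].bottom) return true;
  }
  return false;
}
```

Inside the painter, `shouldRepaint(oldDelegate)` would simply return `facesChanged(faces, oldDelegate.faces)`.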

Performance on Real Devices

Testing on a Samsung Galaxy M13 (Exynos 850) produced the following measurements:

Metric                   Value
Single-face detection    12–18 ms
Three-face detection     20–28 ms
Detection pipeline       5–8 fps
Camera preview           30 fps

Because detection runs on a background thread internally, the UI thread remains responsive during inference.

For mid-range hardware, ResolutionPreset.medium provides the best balance between image clarity and detection speed.

Testing the Detection Pipeline

The architecture allows the face detection pipeline to be tested without camera hardware.

A mock repository can simulate detection results.

class MockFaceDetectionRepository extends Mock implements FaceDetectionRepository {}

Unit tests can verify controller behavior using predetermined detection outputs.

This significantly improves maintainability and confidence when refactoring.
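Instead of a mocking library, a hand-rolled fake works just as well and keeps the example self-contained. The interface and names below (`FaceRepo`, `FakeFaceRepo`, `DetectedFace`) are illustrative, defined inline rather than taken from the project:

```dart
// Minimal stand-ins so the fake is self-contained.
class DetectedFace {
  const DetectedFace(this.trackingId);
  final int trackingId;
}

abstract class FaceRepo {
  Future<List<DetectedFace>> detectFaces(Object frame);
}

/// Hand-rolled fake: returns canned results, so controller logic can be
/// exercised without camera hardware or ML Kit.
class FakeFaceRepo implements FaceRepo {
  FakeFaceRepo(this.cannedFaces);
  final List<DetectedFace> cannedFaces;

  @override
  Future<List<DetectedFace>> detectFaces(Object frame) async => cannedFaces;
}
```

A test then injects `FakeFaceRepo` where the controller expects its repository and asserts on the resulting state, with no platform channels involved.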

Conclusion

On-device face detection is now practical on modern mobile hardware. Using ML Kit, Flutter applications can implement real-time computer vision features without relying on external infrastructure.

However, building a reliable camera feature requires careful attention to several details: camera image formats, sensor orientation, coordinate transforms, state management, and asynchronous lifecycle handling.

By structuring the application around clean architecture principles and isolating ML dependencies in the data layer, the resulting system becomes easier to maintain, test, and extend.

The patterns presented in this article provide a solid foundation for integrating advanced computer vision capabilities into Flutter applications.

Future Improvements

Possible extensions for this project include:

Face mesh rendering for augmented reality effects

Face recognition using embedding models

Real-time emotion detection

GPU-accelerated inference pipelines

Recording annotated video streams

These additions would transform the demo into a fully featured real-time computer vision toolkit for Flutter applications.

Source Code

The full source code is available in the project repository, accompanying this article.

https://github.com/RitutoshAeologic/face_detection

Thanks for reading this article

If I got something wrong, let me know in the comments; I would love to improve.


Feel free to connect with us, and read more articles from FlutterDevs.com.

FlutterDevs is a team of Flutter developers building high-quality, feature-rich apps. Hire a Flutter developer for your cross-platform Flutter mobile app project, hourly or full-time, as per your requirements! For any Flutter-related queries, you can connect with us on Facebook, GitHub, Twitter, and LinkedIn.

We welcome feedback and hope that you share what you’re working on using #FlutterDevs. We truly enjoy seeing how you use Flutter to build beautiful, interactive web experiences.


Need help building production-grade Flutter apps? FlutterDevs helps teams ship faster with solid architecture, better UX, and practical AI features. Reach us at support@flutterdevs.com.
