Enhancing Deepfake Detection through Hybrid MobileNet-LSTM Model with Real-Time Image and Video Analysis
Abstract
The rapid rise of deepfakes, hyper-realistic media manipulated using artificial intelligence, has introduced significant challenges to information integrity and societal trust. As deepfake generation techniques become more sophisticated, they pose substantial risks across sectors including media, politics, and law enforcement. Existing detection methods, which often rely on analyzing visual artifacts or inconsistencies in facial expressions, are increasingly vulnerable to circumvention by advanced deepfake algorithms. Furthermore, many current solutions are limited in scope, focusing exclusively on either images or videos, which restricts their applicability in real-world scenarios where deepfakes appear in diverse formats. This research proposes a novel hybrid deepfake detection model that combines the strengths of MobileNet, a lightweight convolutional neural network (CNN), and Long Short-Term Memory (LSTM) networks to address these limitations. MobileNet efficiently extracts spatial features from individual frames, identifying subtle visual cues such as texture anomalies and facial inconsistencies. These spatial features are then processed by an LSTM network, which models patterns across frames to detect temporal artifacts, making the system well-suited to video-based deepfake detection. The hybrid MobileNet-LSTM model is trained on an extensive dataset of real and deepfake media encompassing a wide range of deepfake generation techniques; this diverse training keeps the model robust and adaptable to emerging methods. The proposed system supports real-time analysis of both images and videos through a user-friendly interface that lets users upload media files or provide URLs for detection. The system outputs a detection score along with visual explanations of identified artifacts, enhancing transparency and interpretability.
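The spatial-then-temporal pipeline described above can be sketched in PyTorch. This is a minimal illustration, not the authors' implementation: a tiny convolutional stack stands in for MobileNet so the example stays self-contained (in practice a pretrained backbone such as `torchvision.models.mobilenet_v2` would supply the per-frame features), and all layer sizes are assumed for demonstration.

```python
import torch
import torch.nn as nn

class HybridDetector(nn.Module):
    """Sketch of a MobileNet-style CNN feeding an LSTM over video frames."""

    def __init__(self, feat_dim=64, hidden=32):
        super().__init__()
        # Per-frame spatial feature extractor (stand-in for MobileNet).
        self.spatial = nn.Sequential(
            nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, feat_dim, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),  # -> (B*T, feat_dim)
        )
        # Temporal model over the sequence of frame features.
        self.temporal = nn.LSTM(feat_dim, hidden, batch_first=True)
        # Detection head: probability that the clip is a deepfake.
        self.head = nn.Sequential(nn.Linear(hidden, 1), nn.Sigmoid())

    def forward(self, clip):                      # clip: (B, T, 3, H, W)
        b, t = clip.shape[:2]
        feats = self.spatial(clip.flatten(0, 1))  # extract per-frame features
        feats = feats.view(b, t, -1)              # regroup into sequences
        _, (h_n, _) = self.temporal(feats)        # final LSTM hidden state
        return self.head(h_n[-1]).squeeze(-1)     # (B,) detection scores

model = HybridDetector().eval()
with torch.no_grad():
    score = model(torch.rand(2, 8, 3, 64, 64))   # 2 clips of 8 frames each
```

Images can be handled by the same model as single-frame clips (T = 1), which is one way a single architecture can cover both input types as the abstract describes.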
The novelty of this approach lies in its hybrid architecture, combining the complementary strengths of MobileNet for spatial analysis and LSTM for temporal modeling. Additionally, the system's support for both image and video inputs expands its practical applicability, while its lightweight nature enables near real-time analysis. This system has the potential to advance deepfake detection in various domains, such as social media platforms, news organizations, and law enforcement agencies, protecting the integrity of digital media and mitigating the harmful effects of deepfakes.
Article Details

This work is licensed under a Creative Commons Attribution 4.0 International License.