In the era of social networks such as Facebook, Instagram, TikTok, and Twitter (X), people upload hundreds of thousands of items, including images and videos, every single day. This volume makes it difficult to identify the original source of a given video. For example, a user might see a scene on a social network but struggle to find the original video it came from (a movie, series, news broadcast, etc.). A proposed solution is a tool that traces the details of the original video from a given scene. The system would first ingest the original videos from a database and extract the audio and frames from each video file. The frames would be used to train a convolutional neural network (CNN) that predicts which original video a frame belongs to. In parallel, audio processing would generate a fingerprint linked to the original video. Given a sample video, the tool could then find a match and retrieve the original video with its details (title, authors, description, etc.) from the database.
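To make the audio half of this pipeline concrete, the following is a minimal sketch of fingerprint indexing and lookup, not the proposed system itself. Everything in it is an illustrative assumption: the names `fingerprint`, `VideoIndex`, and `synthetic_audio` are hypothetical, the fingerprint is a toy scheme based on rising or falling frame energy, seeded noise stands in for audio extracted from real video files, and the query clip is assumed to start on a frame boundary. The CNN frame classifier is not shown.

```python
import random
from collections import Counter


def fingerprint(samples, frame_size=64, gram=16):
    """Toy audio fingerprint: hashed runs of rising/falling frame energy."""
    # Energy of each non-overlapping frame of `frame_size` samples.
    energies = [
        sum(x * x for x in samples[i:i + frame_size])
        for i in range(0, len(samples) - frame_size + 1, frame_size)
    ]
    # One bit per frame transition: 1 if the energy rose, otherwise 0.
    bits = [1 if b > a else 0 for a, b in zip(energies, energies[1:])]
    # Every window of `gram` consecutive bits is one sub-fingerprint.
    return {tuple(bits[i:i + gram]) for i in range(len(bits) - gram + 1)}


class VideoIndex:
    """Hypothetical in-memory stand-in for the video database."""

    def __init__(self):
        self.metadata = {}  # video id -> details (title, authors, ...)
        self.index = {}     # sub-fingerprint -> set of video ids

    def add(self, video_id, samples, **details):
        """Register a video's audio fingerprint and its details."""
        self.metadata[video_id] = details
        for sub in fingerprint(samples):
            self.index.setdefault(sub, set()).add(video_id)

    def match(self, samples):
        """Return (video_id, details) with the most votes, or None."""
        votes = Counter()
        for sub in fingerprint(samples):
            for video_id in self.index.get(sub, ()):
                votes[video_id] += 1
        if not votes:
            return None
        best, _ = votes.most_common(1)[0]
        return best, self.metadata[best]


def synthetic_audio(seed, n_frames=400, frame_size=64):
    """Stand-in for audio extracted from a video file (seeded noise)."""
    rng = random.Random(seed)
    return [rng.uniform(-1.0, 1.0) for _ in range(n_frames * frame_size)]


# Index two "original videos", then query with an excerpt of the first.
index = VideoIndex()
audio_a = synthetic_audio(seed=1)
audio_b = synthetic_audio(seed=2)
index.add("vid-a", audio_a, title="Example Movie A")
index.add("vid-b", audio_b, title="Example News Clip B")

clip = audio_a[64 * 100:64 * 200]  # frame-aligned excerpt (assumption)
result = index.match(clip)
print(result)
```

Voting over many small sub-fingerprints, rather than comparing one hash per video, lets a short excerpt match its source even though it covers only part of the audio. A practical system would likely replace the energy bits with features that tolerate arbitrary clip offsets, re-encoding, and noise, such as spectral peak pairs, and would combine the audio vote with the CNN's frame-level prediction.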