Create Duplicate File Remover using Python

Simplify your digital life and save space by creating a Duplicate File Remover with Python. In this project, you’ll learn how to build a Python script that identifies and removes duplicate files on your computer or in a specific directory.

Whether you’re running out of storage space or just looking to keep your files organized, this step-by-step guide will help you create a powerful tool to eliminate duplicate content effortlessly. Join us in decluttering your digital world through the magic of Python programming!

import hashlib
import os

# Returns the hash string of the given file name

def hashFile(filename):
    # Read the file in fixed-size blocks; hashing a large file all at
    # once could exhaust memory
    BLOCKSIZE = 65536
    hasher = hashlib.md5()
    with open(filename, 'rb') as file:
        # Read one block at a time and feed it to the hasher
        buf = file.read(BLOCKSIZE)
        while len(buf) > 0:
            hasher.update(buf)
            buf = file.read(BLOCKSIZE)
    return hasher.hexdigest()

if __name__ == "__main__":
    # Dictionary to store the hash and filename
    hashMap = {}

    # List to store deleted files
    deletedFiles = []
    filelist = [f for f in os.listdir() if os.path.isfile(f)]
    for f in filelist:
        key = hashFile(f)
        # If the hash already exists, this file is a duplicate: delete it.
        # Otherwise, record it as the first file seen with this hash.
        if key in hashMap:
            deletedFiles.append(f)
            os.remove(f)
        else:
            hashMap[key] = f
    if len(deletedFiles) != 0:
        print('Deleted Files')
        for i in deletedFiles:
            print(i)
    else:
        print('No duplicate files found')
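The script above only checks the current working directory and deletes duplicates immediately. If you'd rather preview what would be removed, here is a minimal dry-run sketch that walks an entire directory tree with os.walk and merely reports duplicates. The names hash_file and find_duplicates are illustrative helpers, not part of the original script:

```python
import hashlib
import os

def hash_file(path, blocksize=65536):
    # Hash the file contents in blocks to keep memory use constant
    hasher = hashlib.md5()
    with open(path, 'rb') as f:
        buf = f.read(blocksize)
        while buf:
            hasher.update(buf)
            buf = f.read(blocksize)
    return hasher.hexdigest()

def find_duplicates(root):
    # Walk the directory tree and group files by content hash;
    # every file after the first with the same hash is a duplicate
    seen = {}
    duplicates = []
    for dirpath, _, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            key = hash_file(path)
            if key in seen:
                duplicates.append(path)
            else:
                seen[key] = path
    return duplicates
```

Run find_duplicates on a folder first, inspect the returned list, and only then pass those paths to os.remove once you're sure nothing important is on it.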
