Create Duplicate File Remover using Python

Simplify your digital life and save space by creating a Duplicate File Remover with Python. In this project, you’ll learn how to build a Python script that identifies and removes duplicate files on your computer or in a specific directory.

Whether you’re running out of storage space or just looking to keep your files organized, this step-by-step guide will help you create a powerful tool to eliminate duplicate content effortlessly. Join us in decluttering your digital world through the magic of Python programming!

import hashlib
import os

# Returns the hash string of the given file name

def hashFile(filename):
    # Read the file in fixed-size blocks; hashing a large file all at
    # once could exhaust memory
    BLOCKSIZE = 65536
    hasher = hashlib.md5()
    with open(filename, 'rb') as file:
        # Read one block at a time and feed it to the hasher
        buf = file.read(BLOCKSIZE)
        while len(buf) > 0:
            hasher.update(buf)
            buf = file.read(BLOCKSIZE)
    return hasher.hexdigest()

if __name__ == "__main__":
    # Dictionary to store the hash and filename
    hashMap = {}

    # List to store deleted files
    deletedFiles = []
    filelist = [f for f in os.listdir() if os.path.isfile(f)]
    for f in filelist:
        key = hashFile(f)
        # If the hash already exists, this file is a duplicate: delete it.
        # Otherwise, record it as the first file seen with this hash.
        if key in hashMap:
            deletedFiles.append(f)
            os.remove(f)
        else:
            hashMap[key] = f
    if len(deletedFiles) != 0:
        print('Deleted Files')
        for i in deletedFiles:
            print(i)
    else:
        print('No duplicate files found')
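The script above only checks the current working directory and deletes duplicates immediately. If you'd rather preview what would be removed, here is a minimal dry-run sketch that walks an entire directory tree with os.walk and merely reports duplicates. The names hash_file and find_duplicates are illustrative helpers, not part of the original script:

```python
import hashlib
import os

def hash_file(path, blocksize=65536):
    # Hash the file contents in blocks to keep memory use constant
    hasher = hashlib.md5()
    with open(path, 'rb') as f:
        buf = f.read(blocksize)
        while buf:
            hasher.update(buf)
            buf = f.read(blocksize)
    return hasher.hexdigest()

def find_duplicates(root):
    # Walk the directory tree and group files by content hash;
    # every file after the first with the same hash is a duplicate
    seen = {}
    duplicates = []
    for dirpath, _, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            key = hash_file(path)
            if key in seen:
                duplicates.append(path)
            else:
                seen[key] = path
    return duplicates
```

Run find_duplicates on a folder first, inspect the returned list, and only then pass those paths to os.remove once you're sure nothing important is on it.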
