Python Forum
Compare filename with folder name and copy matching files into a particular folder - Printable Version

Compare filename with folder name and copy matching files into a particular folder - shantanu97 - Dec-18-2021

I am trying to write a Python script that uses a regex to compare the names of the files in the input folder with the names of the folders in the output folder. When a name matches, the script copies that file from the input folder into the matching output folder.

Remember, we are comparing the names of the files in the input folder with the names of the folders in the output folder.

For Example:

Input folder: The filename looks like this "A10620.csv"

Output folder 1: There is a folder named "A10620_RD19CAR2000" alongside other folders.

In output folder 1, I need to find the folder whose name matches the filename (comparing the first 6 characters only). If they match, the file is copied into that folder.

I need to search for that filename in two folder locations (Output folder 1 and Output folder 2). If the folder is not found in location 2 either, the file is dumped into the "Files not found" folder at location 3.
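
The comparison described here can be sketched as follows. This is only an illustration: the pattern `^A\d{5}` is the one used in the script in this thread, while the helper names `leading_id` and `same_id` are hypothetical and not part of the original code.

```python
import re

# Six-character ID: the letter A followed by five digits (pattern taken from the thread).
ID_PATTERN = re.compile(r'^A\d{5}', re.IGNORECASE)

def leading_id(name):
    """Return the six-character ID at the start of name (upper-cased), or None."""
    m = ID_PATTERN.match(name)
    return m.group(0).upper() if m else None

def same_id(file_name, folder_name):
    """True only when both names start with the same ID."""
    file_id = leading_id(file_name)
    return file_id is not None and file_id == leading_id(folder_name)

print(same_id("A10620.csv", "A10620_RD19CAR2000"))  # True
print(same_id("A10620.csv", "A10621_RD19CAR2000"))  # False
```

The key point is that the ID extracted from the file is compared with the ID extracted from the folder; checking that each name fits the pattern on its own is not enough.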

Please see attached picture to get an idea of how the folder structure looks.

Here is my python script.
import os
import re
import shutil

# Traverse each file
sourcedir = "C:\\Users\\ShantanuGupta\\Desktop\\OzSc1\\InputFolder"
# Search for folder in Location 1
destinationdir1 = r"C:\Users\ShantanuGupta\Desktop\OzSc1\OutputFolder1"
# Search for folder in Location 2
destinationdir2 = "C:\\Users\\ShantanuGupta\\Desktop\\OzSc1\\OutputFolder2"
# Put files in folder "Folder Not Found"
destinationdir3 = "C:\\Users\\ShantanuGupta\\Desktop\\OzSc1\\OutputFolder3\\FoldersThatCouldNotBeFound"

regex = re.compile(r'^A\d{5}', re.IGNORECASE) #Checking only first six characters

count = 0
for files in os.listdir(sourcedir):  # looping over different files
    if regex.match(files):
        print(files)

    found = False

    # Search for a folder in Location 1
    for folderName in os.listdir(destinationdir1):
        if regex.match(folderName):
            print(folderName)
            # Copy the files from the input folder to output folder
            shutil.copy(sourcedir+'/'+files, destinationdir1+'//'+folderName)
            found = True
            break

    if not found:
        print('folder not found in Location1')
        count = 1

    # Search for a folder in Location 2
    for folderName in os.listdir(destinationdir2):
        if regex.match(folderName):
            #print(folderName)
            # Copy the files from the input folder to output folder
            shutil.copy(sourcedir+'/'+files, destinationdir1+'/'+folderName+'/'+files)
            found =  True
            break

    if not found:
        print('folder not found in Location2')
        count = 2

    # Folder Not Found
    if not found:
        print('copyingfilesinfoldernotfound')
        # Copy the files from the input folder to folder not found
        shutil.copy(sourcedir+'/'+files, destinationdir3+'/'+files)
Problems:

The input folder contains multiple files. I am having difficulty working out the logic to take one filename at a time and search for it in the different folder locations, then move on to the second filename, search the folders again, and so on.

Is there any better way to write this code?

Attached python code and folder structure here


RE: Compare filename with folder name and copy matching files into a particular folder - Jeff900 - Dec-18-2021

Well, of course there will be a better way, or at least a different way. But I think it would be helpful to organize your code a little first. I'm sure your code will work eventually, but putting all those actions, if statements, etc. in just one script makes it hard to read and, more importantly, hard to add new code.

For example, the loops you use to search the folders could be a function that returns a value such as True or False. The function will be reusable as many times as you want, no matter how many folders you want to search.

I do realize this is not a perfect answer to your question. But it is quite difficult to make suggestions for your code without rewriting it from the start. Working with smaller pieces of code will make it easier to offer suggestions.
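
A minimal sketch of that idea, under some assumptions: the names `find_folder` and `route_file` are hypothetical, and the search function returns the matching folder (or None) rather than a bare True/False so the caller knows where to copy. Unlike the original script, which only tests that a folder name fits the pattern, this compares the folder name with the ID taken from the file being processed.

```python
import re
import shutil
from pathlib import Path

# Same six-character ID pattern as in the original script.
ID_PATTERN = re.compile(r'^A\d{5}', re.IGNORECASE)

def find_folder(file_name, location):
    """Return the first sub-folder of location whose name starts with the
    file's ID, or None if the file has no ID or no folder matches."""
    m = ID_PATTERN.match(file_name)
    if m is None:
        return None
    file_id = m.group(0).upper()
    for entry in sorted(Path(location).iterdir()):
        if entry.is_dir() and entry.name.upper().startswith(file_id):
            return entry
    return None

def route_file(file_path, locations, fallback):
    """Copy file_path into the first matching folder found in locations
    (searched in order); if none matches, copy it into fallback."""
    for location in locations:
        target = find_folder(file_path.name, location)
        if target is not None:
            shutil.copy(file_path, target)
            return target
    Path(fallback).mkdir(parents=True, exist_ok=True)
    shutil.copy(file_path, fallback)
    return Path(fallback)

# Example use, one file at a time (folder names here are placeholders):
# for f in Path("InputFolder").iterdir():
#     if f.is_file():
#         route_file(f, ["OutputFolder1", "OutputFolder2"], "FoldersThatCouldNotBeFound")
```

Because each file is routed by a single call, adding a third search location only means adding one more entry to the list.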


RE: Compare filename with folder name and copy matching files into a particular folder - Larz60+ - Dec-18-2021

Is this what you're looking for?
from pathlib import Path
import os
import re
import shutil
import sys


class FileOperations:
    def __init__(self,
        input_folder_name,
        compare_folder_name,
        output1_folder_name,
        output2_folder_name):

        # Note -- You can change directory locations to suit your needs.
        self.regex = re.compile(r'^A\d{5}', re.IGNORECASE)

        self.InputFolder = Path(input_folder_name)
        self.CompareFolder = Path(compare_folder_name)
        self.OutputFolder1 = Path(output1_folder_name)
        self.OutputFolder2 = Path(output2_folder_name)

    def get_dir(self, foldername):
        return [files for files in foldername.iterdir() if files.is_file()]
    
    def copy_matching_files(self):        

        input_folder_files = self.get_dir(self.InputFolder)
        compare_folder_files = self.get_dir(self.CompareFolder)

        for filei in input_folder_files:
            comp_file_equalalent = self.CompareFolder / filei.name
            if re.match(self.regex, str(filei.stem)) and comp_file_equalalent.exists():
                    outfile1 = self.OutputFolder1 / filei.name
                    shutil.copyfile(filei, outfile1)
            else:
                outfile2 = self.OutputFolder2 / filei.name
                shutil.copyfile(filei, outfile2)

    def display_files_in_dir(self, dirname, fullpath=False):
        print(f"\nContets pf {dirname}:")
        for file in self.get_dir(dirname):
            if fullpath:
                print(f"{file.resolve()}")
            else:
                print(f"{file.name}")


class MyProg:
    def __init__(self):
        # ensure the working directory is the script's own directory (change as necessary)
        os.chdir(os.path.abspath(os.path.dirname(__file__)))

        # replace paths with real locations
        self.file_ops = FileOperations(
            './data/input', 
            './data/compare', 
            './data/output1',
            './data/output2')
    
    def move_files(self):
        self.file_ops.copy_matching_files()

        print(f"\nFiles that match:")
        for file in self.file_ops.get_dir(self.file_ops.OutputFolder1):
            print(file.name)
        
        print(f"\nFiles that did not match:")
        for file in self.file_ops.get_dir(self.file_ops.OutputFolder2):
            print(file.name)


def testit():
    mp = MyProg()
    print(f"\nBefore Move:")
    mp.file_ops.display_files_in_dir(mp.file_ops.InputFolder)
    mp.file_ops.display_files_in_dir(mp.file_ops.CompareFolder)
    mp.file_ops.display_files_in_dir(mp.file_ops.OutputFolder1)
    mp.file_ops.display_files_in_dir(mp.file_ops.OutputFolder2)

    mp.move_files()

    print(f"\nAfter Move:")
    mp.file_ops.display_files_in_dir(mp.file_ops.InputFolder)
    mp.file_ops.display_files_in_dir(mp.file_ops.CompareFolder)
    mp.file_ops.display_files_in_dir(mp.file_ops.OutputFolder1)
    mp.file_ops.display_files_in_dir(mp.file_ops.OutputFolder2)

if __name__ == '__main__':
    testit()
Output:
Before Move:

Contets pf data/input:
A10621.sif
B5678.txt
TestData1000.csv
TestData10000.csv
TestData10001.csv
TestData10002.csv
TestData10003.csv
TestData1001.csv
TestData1003.csv
Z011014.sif

Contets pf data/compare:
A10621.sif
B5678.txt
TestData1000.csv
TestData10000.csv
TestData1001.csv
TestData1003.csv

Contets pf data/output1:

Contets pf data/output2:

Files that match:
A10621.sif

Files that did not match:
B5678.txt
TestData1000.csv
TestData10000.csv
TestData10001.csv
TestData10002.csv
TestData10003.csv
TestData1001.csv
TestData1003.csv
Z011014.sif

After Move:

Contets pf data/input:
A10621.sif
B5678.txt
TestData1000.csv
TestData10000.csv
TestData10001.csv
TestData10002.csv
TestData10003.csv
TestData1001.csv
TestData1003.csv
Z011014.sif

Contets pf data/compare:
A10621.sif
B5678.txt
TestData1000.csv
TestData10000.csv
TestData1001.csv
TestData1003.csv

Contets pf data/output1:
A10621.sif

Contets pf data/output2:
B5678.txt
TestData1000.csv
TestData10000.csv
TestData10001.csv
TestData10002.csv
TestData10003.csv
TestData1001.csv
TestData1003.csv
Z011014.sif