Nov-09-2022, 01:39 PM
Hello,
With great support from this forum, I'm working on a tool that detects hangers in a Super 8 film.
I need some help with using multiprocessing.
I create a hash string for each frame in a folder, and I'd like to use multiprocessing for this step to speed it up.
Here is the complete code:
main_short.py:
```python
import hanger_detection
from multiprocessing import Pool
import time

hangers = []
measure_time_start = time.time()
pool = Pool()
pool.map(hanger_detection.create_phash)
pool.close()
frame_hash_list = hanger_detection.create_phash()
hangers = hanger_detection.detect_hangers(frame_hash_list)
measure_time_end = time.time()
number_of_hangers = len(hangers)
hanger_detection.fill_hanger_information_in_excel(hangers)
print("frame_hash_list: " + str(frame_hash_list))
print("hangers: " + str(hangers))
print("Zeit: " + str(measure_time_end - measure_time_start))
print("number_of_hangers: " + str(number_of_hangers))
```

hanger_detection.py:
```python
import os
from PIL import Image
import imagehash
import openpyxl
from itertools import zip_longest


def difference_count(a: str, b: str) -> int:
    """Count differences between a and b"""
    return sum(1 for a, b in zip_longest(a, b) if a != b)


def create_phash():
    frame_hash_list = []
    p = "D:/S8_hanger_finder/neuer_Ansatz/aktueller_Versuch/phash_test/"
    obj = os.scandir(p)
    for entry in obj:
        # load frames
        frame = Image.open(p + str(entry.name))
        # create pHash
        # Compare hashes to determine whether the frames are the same or not
        frame_phash = str(imagehash.phash(frame))
        frame_hash_list.append(frame_phash)
    obj.close()
    return frame_hash_list


def detect_hangers(frame_hash_list, threshold: int = 0, min_count: int = 4):
    """Return list of "hangers" detected in frame_hash_list.
    A "hanger" is consecutive frames that are the same.

    frame_hash_list : list of frame hash strings.  Frames are considered
        same or different by counting the differences in their hash strings.
    threshold : Maximum number of differences allowed for two frames to be
        considered "same".
    min_count : Minimum length of a hanger.  Short hangers aren't noticeable
        and don't have to be removed.
    """
    hangers = []  # List of hanger start, stop frame indexes
    start_index = 0
    start_frame = frame_hash_list[0]
    for index, frame in enumerate(frame_hash_list[1:], start=1):
        # Are frame and start_frame dissimilar enough?
        if difference_count(start_frame, frame) > threshold:
            if index - start_index >= min_count:
                # Add hanger to list
                hangers.append((start_index, index - 1))
            start_frame = frame
            start_index = index
    # Check if we end with a hanger
    if index - start_index > 10:
        hangers.append([start_index, index])
    return hangers


def convert_frame_nr_in_time(d):
    # S8-Movie (avi-file) is checked for hangers
    #####################################################
    # 1 hour contains 72000 frames
    c1 = 72000
    # 1 minute contains 1200 frames
    c2 = 1200
    # 1 second contains 20 frames
    c3 = 20

    def find_even_frame_nr(a, b, c):
        while True:
            if a % c == 0:
                break
            else:
                a -= 1
                b += 1
        return a, b

    frame_nr_full_hour, rest_1 = find_even_frame_nr(d, 0, c1)
    number_of_hours = frame_nr_full_hour / c1
    ###########################################################
    frame_nr_full_minute, rest_2 = find_even_frame_nr(rest_1, 0, c2)
    number_of_minutes = frame_nr_full_minute / c2
    ###########################################################
    frame_nr_full_second, rest_3 = find_even_frame_nr(rest_2, 0, c3)
    number_of_seconds = frame_nr_full_second / c3
    return number_of_hours, number_of_minutes, number_of_seconds


def fill_hanger_information_in_excel(hangers):
    p = "D:/S8_hanger_finder/neuer_Ansatz/aktueller_Versuch/S8-Hanger_Positionen.xlsx"
    fileXLSX = openpyxl.load_workbook(p)
    sheet = fileXLSX["Blatt"]
    # clear old hanger information
    # film doesn't have more than 100 hangers
    r = 5
    c = 2
    for z in range(r, r + 100):
        for s in range(c, c + 2):
            sheet.cell(row=z, column=s).value = None
    # fill in hanger information
    r = 5
    for i in hangers:
        frame_nr_hanger_start = i[0]
        frame_nr_hanger_end = i[1]
        number_of_hours_start, number_of_minutes_start, number_of_seconds_start = convert_frame_nr_in_time(
            frame_nr_hanger_start)
        number_of_hours_end, number_of_minutes_end, number_of_seconds_end = convert_frame_nr_in_time(
            frame_nr_hanger_end)
        number_of_hours_start_int = int(number_of_hours_start)
        number_of_minutes_start_int = int(number_of_minutes_start)
        number_of_seconds_start_int = int(number_of_seconds_start)
        number_of_hours_end_int = int(number_of_hours_end)
        number_of_minutes_end_int = int(number_of_minutes_end)
        number_of_seconds_end_int = int(number_of_seconds_end)
        number_of_hours_start_str = str(number_of_hours_start_int)
        if len(number_of_hours_start_str) == 1:
            number_of_hours_start_str = "0" + number_of_hours_start_str
        number_of_minutes_start_str = str(number_of_minutes_start_int)
        if len(number_of_minutes_start_str) == 1:
            number_of_minutes_start_str = "0" + number_of_minutes_start_str
        number_of_seconds_start_str = str(number_of_seconds_start_int)
        if len(number_of_seconds_start_str) == 1:
            number_of_seconds_start_str = "0" + number_of_seconds_start_str
        number_of_hours_end_str = str(number_of_hours_end_int)
        if len(number_of_hours_end_str) == 1:
            number_of_hours_end_str = "0" + number_of_hours_end_str
        number_of_minutes_end_str = str(number_of_minutes_end_int)
        if len(number_of_minutes_end_str) == 1:
            number_of_minutes_end_str = "0" + number_of_minutes_end_str
        number_of_seconds_end_str = str(number_of_seconds_end_int)
        if len(number_of_seconds_end_str) == 1:
            number_of_seconds_end_str = "0" + number_of_seconds_end_str
        # create timestamp
        timestamp_start_str = number_of_hours_start_str + ":" + number_of_minutes_start_str + ":" + number_of_seconds_start_str
        timestamp_end_str = number_of_hours_end_str + ":" + number_of_minutes_end_str + ":" + number_of_seconds_end_str
        sheet.cell(row=r, column=2).value = timestamp_start_str
        sheet.cell(row=r, column=4).value = timestamp_end_str
        r += 1
    fileXLSX.save(p)
```

In hanger_detection.py (line 17) I loop over all frames. This loop is where multiprocessing should come in somehow, but I don't know how...
The following simple example runs a function 4 times:
```python
from multiprocessing import Pool
import time


def cpu_extensive(i):
    time.sleep(2)
    print(i, "Done")


starttime = time.time()
pool = Pool()
pool.map(cpu_extensive, range(4))
pool.close()
endtime = time.time()
print(f"Time taken {endtime - starttime} seconds.")
```

But I need to process all frames with multiprocessing (so that they are handled in parallel).
So I need to loop over all frames with multiprocessing instead of repeating the whole create_phash() function...
I guess that line 22 of hanger_detection.py, "frame_phash = str(imagehash.phash(frame))", is the part that has to run with multiprocessing...
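To make that guess concrete, here is a minimal sketch of how the per-frame hashing could be handed to a pool, one frame per task. It is only one possible layout, not a confirmed solution. The helper names list_frame_paths, phash_one_frame and hash_frames are made up for illustration and do not exist in the code above. The sketch also assumes that the frame file names sort in film order, and it adds a serial fallback (processes=1) purely for debugging.

```python
from multiprocessing import Pool
from pathlib import Path


def list_frame_paths(folder):
    """Return the frame files in `folder` as strings, sorted by file name.

    Assumption: the file names sort in film order (e.g. zero-padded numbers).
    """
    return sorted(str(p) for p in Path(folder).iterdir() if p.is_file())


def phash_one_frame(path):
    """Worker: compute the pHash string for a single frame file."""
    # Imported inside the worker so this module still loads on machines
    # where Pillow and imagehash are not installed.
    from PIL import Image
    import imagehash

    with Image.open(path) as frame:
        return str(imagehash.phash(frame))


def hash_frames(paths, worker=phash_one_frame, processes=None):
    """Apply `worker` to every path and return the results in input order."""
    paths = list(paths)
    if processes == 1:
        # Serial fallback (an addition for debugging): no child processes.
        return [worker(p) for p in paths]
    with Pool(processes) as pool:
        # map() takes the function plus an iterable of arguments, spreads
        # the calls over the worker processes and keeps the input order.
        return pool.map(worker, paths)


# Usage sketch (hypothetical folder name):
# if __name__ == "__main__":
#     frame_hash_list = hash_frames(list_frame_paths("path/to/frames"))
```

With this layout, create_phash() would shrink to building the list of paths and calling hash_frames() on it. On Windows the code that starts the pool has to run under `if __name__ == "__main__":`, because each worker process imports the main module again.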
Would you please be so kind as to tell me what I should do?
That would be great...
Thanks a lot...