Python Forum
launch processes from threads
#1
i am refactoring a program that dynamically runs many processes at the same time. these processes are other programs (often to download, upload, compress, or uncompress something). a 2nd process runs for each "task" to start those processes and wait for them to finish (or restart them if soft errors happen). this means each task is running 2 processes, and i am hitting process limits too early. so i am thinking about using threads to run the control logic for each task. but this would mean launching processes from threads, which i recall reading is not supported. can someone tell me the current status of this, and whether a newer version of Python 3 can now do it? if not, i am thinking of having an extra process that does all the process work (launching, waiting) based on requests received via pipes, sending responses back via a pipe. maybe there would be a pipe per task, or the task ID would be sent with each message over a shared pipe. ideas? FYI, i have done something similar in C w/o threads.
Tradition is peer pressure from dead people

What do you call someone who speaks three languages? Trilingual. Two languages? Bilingual. One language? American.
#2
(Feb-09-2024, 12:56 AM)Skaperen Wrote: this would mean launching processes from threads, which i recall reading is not supported
As far as I know, there is no problem in starting a subprocess from a thread. Where did you read this?
« We can solve any problem by introducing an extra level of indirection »
#3
(Feb-09-2024, 04:24 AM)Gribouillis Wrote: Where did you read this?
i believe i read it on this forum. but i do not recall where or from who.
#4
It works and you can do strange things...

~ $ cat p.py
import subprocess
from threading import Thread
from multiprocessing import Process

def w1():
    Thread(target=w2).start()

def w2():
    Thread(target=w3).start()

def w3():
    Process(target=w4).start()

def w4():
    print("You are", subprocess.run(["whoami"], capture_output=True, encoding="utf8").stdout)


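# The guard keeps w1() from running again when a child process re-imports
# this file (which happens with the "spawn" start method).
if __name__ == "__main__":
    w1()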
~ $ python p.py
You are u0_a153
~ $
Almost dead, but too lazy to die: https://sourceserver.info
All humans together. We don't need politicians!
#5
i want individual threads in the one parent process to start subprocesses. for example i start N threads, each of which starts a subprocess, so there will be N subprocesses running. these N threads now wait for the subprocess they started to end. the subprocesses can end in any order and threads wake up in that order. i think the issue was that this could not be reliably done in threads. in other words, how thread-aware are the subprocess and multiprocessing modules?

this is about each thread (of many) "managing" one process (of many)
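A minimal sketch of the arrangement described above, offered as an illustration rather than a definitive answer: each thread starts one child with `subprocess.Popen` and blocks in `wait()` until that child exits. The names `manage` and `run_all`, and the sleeping one-liner used as the child command, are made up for this example. As far as I know, `Popen.wait()` blocks only the calling thread, so the other threads keep running and wake up in whatever order their children finish.

```python
import subprocess
import sys
import threading


def manage(task_id, delay, finished, lock):
    """Run in its own thread: start one child process and wait for it."""
    # Placeholder child: a Python one-liner that sleeps for `delay` seconds.
    # A real task would pass its own command line here.
    cmd = [sys.executable, "-c", f"import time; time.sleep({delay})"]
    proc = subprocess.Popen(cmd)
    returncode = proc.wait()  # blocks this thread only
    with lock:
        # Record completions in the order the threads wake up.
        finished.append((task_id, returncode))


def run_all(delays):
    """Start one managing thread per delay and return the completion records."""
    finished = []
    lock = threading.Lock()
    threads = [
        threading.Thread(target=manage, args=(task_id, delay, finished, lock))
        for task_id, delay in enumerate(delays)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return finished


if __name__ == "__main__":
    # Task 0 sleeps longest, so it would normally be recorded last.
    print(run_all([1.5, 0.1, 0.8]))
```

With this shape the parent is a single process holding N threads, so N tasks cost N + 1 processes instead of 2N.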
#6
In Python, launching processes from threads is generally not recommended due to the Global Interpreter Lock (GIL) which can limit the benefits of threading, especially for CPU-bound tasks. However, you can use the multiprocessing module to create and manage processes safely. Each process created with multiprocessing has its own Python interpreter and memory space, making it suitable for CPU-bound tasks and avoiding the limitations of the GIL.

If you're hitting process limits too early, you could consider using a pool of processes managed by multiprocessing.Pool to control the number of concurrent processes running at any given time. This allows you to distribute your tasks across multiple processes efficiently. You can also use inter-process communication mechanisms like pipes or queues to coordinate between the main process and the worker processes.

If you're considering a design with an extra process to manage the launching and waiting for processes based on requests via pipes, it could be a viable approach as well. Each task could communicate with the manager process using a unique pipe or by including a task ID with each message. This would decouple the task logic from the process management logic and provide more flexibility in managing your workload.

Overall, the multiprocessing module in Python provides robust support for managing processes and inter-process communication, making it a suitable choice for your scenario. You may need to experiment with different approaches to find the one that best meets your performance and scalability requirements.
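The idea of capping concurrency with a pool can also be sketched with worker threads instead of worker processes, which may suit the original question better because the threads themselves do not count against the process limit. This is only an illustration: `concurrent.futures.ThreadPoolExecutor` is swapped in here for `multiprocessing.Pool`, and `MAX_CHILDREN`, `run_task`, `run_tasks`, and the placeholder command are assumptions, not anything from the posts above.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor

MAX_CHILDREN = 4  # hypothetical cap, chosen to stay under the process limit


def run_task(task_id):
    """Run one external command to completion and report how it went."""
    # Placeholder command; a real task would run its downloader, ffmpeg, etc.
    cmd = [sys.executable, "-c", f"print('task {task_id} done')"]
    completed = subprocess.run(cmd, capture_output=True, text=True)
    return task_id, completed.returncode, completed.stdout.strip()


def run_tasks(task_ids, max_children=MAX_CHILDREN):
    """Run every task, with at most max_children child processes alive at once."""
    # Each worker thread runs one child at a time, so the number of live
    # children launched here never exceeds max_children.
    with ThreadPoolExecutor(max_workers=max_children) as pool:
        return list(pool.map(run_task, task_ids))


if __name__ == "__main__":
    for record in run_tasks(range(6), max_children=2):
        print(record)
```

`pool.map` returns results in the order the tasks were submitted, regardless of which child finishes first.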
#7
hitting too many processes was happening when running 2 processes per task. for task N, process N.1 runs the application, while process N.2 runs a monitor in Python. if i can run all the monitor instances as threads in one big process, that would cut my process count nearly in half. among the many things the monitor does is launch the application, possibly again as a result of certain failure modes. application stdout will be saved to one file while stderr goes to another. each task has a directory, but a multi-thread version can just include the directory path when it opens files related to specific tasks, since threads can't be "in" different directories than the whole process.
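A small sketch of the per-task file handling described above, under a few assumptions: the function name `launch_in_task_dir` and the log file names `stdout.log` and `stderr.log` are invented for illustration. The `cwd` argument of `subprocess.Popen` sets the working directory of the child only, so the application can still run "in" its task directory even though every monitor thread shares the parent's single working directory.

```python
import subprocess
import sys
from pathlib import Path


def launch_in_task_dir(task_dir, cmd):
    """Start cmd inside task_dir, saving stdout and stderr to separate files."""
    task_dir = Path(task_dir)
    task_dir.mkdir(parents=True, exist_ok=True)
    # Hypothetical log names; the parent opens them by full path because it
    # cannot change its own working directory per thread.
    with open(task_dir / "stdout.log", "ab") as out, \
            open(task_dir / "stderr.log", "ab") as err:
        # cwd= affects the child only; the parent and its threads stay put.
        proc = subprocess.Popen(cmd, cwd=str(task_dir), stdout=out, stderr=err)
    # The child has its own copies of the file handles, so the parent's
    # copies can be closed as soon as the launch has happened.
    return proc


if __name__ == "__main__":
    child = launch_in_task_dir(
        "task-0001",
        [sys.executable, "-c", "import os; print(os.getcwd())"],
    )
    print("exit status:", child.wait())
```

The returned `Popen` object can then be waited on by whichever thread monitors that task.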

i am looking at managing the task size to avoid the process limit without using a process pool. but i would still be using 2 processes per task unless i can get the monitor running in a thread. i could limit the number of tasks, but that limits how much i can get done in a given time.

there are more options i am looking at since this will be running in the AWS cloud. to use multiple virtual machine instances i need to redesign how the monitors communicate with the master monitor (one more process that launches all the monitors as needed).
#8
another option i was thinking about was having all the monitor threads communicate with a process dedicated to launching processes on request. this one process could then notify the thread with the monitor logic (the complicated part of this project) when a process exits or stalls (the big problem this project is trying to address).
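Whichever component ends up doing the launching, the exit-or-stall handling mentioned above could look roughly like the sketch below. It rests on a loud assumption: a child that is still running after `timeout` seconds is treated as stalled, which is a crude stand-in for real stall detection (for example, watching whether an output file keeps growing). The name `run_with_restarts` and its parameters are hypothetical.

```python
import subprocess
import sys


def run_with_restarts(cmd, timeout, max_attempts=3):
    """Run cmd, restarting it when it fails or runs longer than timeout seconds.

    Returns (attempts_used, returncode). returncode is None when the last
    attempt had to be killed for exceeding the timeout.
    """
    returncode = None
    for attempt in range(1, max_attempts + 1):
        proc = subprocess.Popen(cmd)
        try:
            returncode = proc.wait(timeout=timeout)
        except subprocess.TimeoutExpired:
            # Assumption: "still running after timeout" counts as stalled.
            proc.kill()
            proc.wait()  # reap the killed child so it does not stay a zombie
            returncode = None
            continue
        if returncode == 0:
            return attempt, returncode
        # Non-zero exit: treat it as a soft error and try again.
    return max_attempts, returncode


if __name__ == "__main__":
    flaky = [sys.executable, "-c", "import sys; sys.exit(3)"]
    print(run_with_restarts(flaky, timeout=10, max_attempts=2))
```

Because the function only blocks its caller, it can be called from a per-task monitor thread or from a dedicated launcher that reports the result back.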
#9
(Feb-20-2024, 01:40 AM)wearsafe Wrote: In Python, launching processes from threads is generally not recommended due to the Global Interpreter Lock (GIL) which can limit the benefits of threading, especially for CPU-bound tasks.

subprocess.Popen starts a new process, which is a child of the Python interpreter.
If that process is CPU-bound, that is a matter for the OS, not for the Python interpreter.
The child process runs outside the Python interpreter, so the GIL does not affect it.

If you do CPU-bound work in Python itself, then threading is the worse option.
#10
the application i will be starting in a process is heavy on CPU at times but is often waiting on network data (think web scraping). the application is not in Python but there may be a Python layer to get it started with the right command line options. the application will often be ffmpeg doing video format conversion.

the monitor is not CPU bound. it's just complicated (lots of different things to do depending on application results and events). each task will be in a different state, so some form of multitasking makes monitoring easier (code it as working on one task).

this whole thing can get very CPU-heavy if too many tasks needing format conversion run concurrently. i hope to eventually manage this by evaluating tasks and placing them in separate queues based on their resource needs, so that concurrent resource needs are better diversified.