Hi, the current implementation picks jobs by partitioning the id space, using which (the worker) and outof (the total number of workers). If, for whatever reason, one of the workers gets stuck processing a job, all current and future jobs assigned to that worker will stall. Is there any plan (or are there any ideas) to make this more fault tolerant, e.g. by letting a worker pick any job that is scheduled to run now, irrespective of the job's id? This would also make adding and removing workers much easier.
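
To make the idea concrete, here is a rough sketch of what I have in mind. It is not based on the project's actual code: the JobStore class, the lease fields, and every method name are hypothetical, and an in-memory dict guarded by a lock stands in for whatever storage the real implementation uses. Any worker may claim any due job, and a claim carries a lease, so a job held by a stuck worker becomes claimable again once the lease expires.

```python
import threading
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Job:
    id: int
    run_at: float                      # earliest time the job may run
    lease_owner: Optional[str] = None  # worker currently holding the job
    lease_expires: float = 0.0         # time at which the lease lapses
    done: bool = False


class JobStore:
    """Hypothetical in-memory stand-in for the real job storage."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._jobs: Dict[int, Job] = {}

    def add(self, job: Job) -> None:
        with self._lock:
            self._jobs[job.id] = job

    def claim_due_job(self, worker: str, now: float,
                      lease_seconds: float) -> Optional[Job]:
        """Atomically claim any due job that is unleased or whose lease expired."""
        with self._lock:
            # Oldest scheduled job first; the job id is only a tie-breaker.
            for job in sorted(self._jobs.values(),
                              key=lambda j: (j.run_at, j.id)):
                if job.done or job.run_at > now:
                    continue  # finished, or not scheduled to run yet
                if job.lease_owner is not None and job.lease_expires > now:
                    continue  # another worker still holds a live lease
                job.lease_owner = worker
                job.lease_expires = now + lease_seconds
                return job
        return None

    def complete(self, job_id: int, worker: str) -> bool:
        """Mark a job done, but only if this worker still owns the lease."""
        with self._lock:
            job = self._jobs.get(job_id)
            if job is None or job.lease_owner != worker:
                return False
            job.done = True
            return True
```

With something like this, no worker owns a fixed slice of the id space, so workers could be added or removed without repartitioning. The trade-off is that a job may run more than once if a slow (but not dead) worker outlives its lease, so jobs would need to be idempotent or the lease would need to be renewed while the job is running.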