How can I get a list of scheduled tasks from Gearman?

I am currently evaluating Gearman to handle some of the expensive data import tasks in our backend. So far it looks promising. However, there is one thing I simply cannot find information about: how can I get a list of scheduled tasks from Gearman?

I understand that I can use the administrative protocol to get the number of current jobs in the queue for each function, but I need information about the actual jobs. There is also the option of using a persistent queue (e.g. MySQL) and querying the database for tasks, but it feels wrong to go around Gearman for this kind of information. Other than that, I am out of ideas.

Maybe I don't need this at all :) So here is some more background on what I want to do; I am open to better suggestions. Both the client and the worker are written in PHP. In our admin interface, administrators can trigger a new import for a client; since the import takes some time, it is started as a background task. Now the simple questions I want to answer: When was the last import for this client? Is an import already queued for this client (in which case starting a new import should have no effect)? Nice to have: at what position in the queue is this task (so that I can estimate when it will run)?

Thanks!

+4
2 answers

The administrative protocol is what you would usually use, but as you discovered, it will not list the actual tasks in the queue. We solved this by keeping track of the tasks we have started at the application level, with a callback in our worker that tells the application when a task has completed. This lets us perform cleanup, notifications, etc. when the task is done, and lets us keep that logic in the application rather than in the worker.
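To illustrate what the administrative protocol does and does not provide, here is a minimal sketch that sends the `status` command and parses the reply. It is written in Python using only the standard library, although the same text protocol can be spoken from PHP. The reply is one tab-separated line per function (name, total jobs, running jobs, available workers) terminated by a line containing a single dot. The host and port defaults and the function names `parse_status` and `fetch_status` are assumptions made for this example.

```python
import socket


def parse_status(response: str) -> dict:
    """Parse the reply to the admin `status` command.

    Each line is: FUNCTION <tab> TOTAL <tab> RUNNING <tab> AVAILABLE_WORKERS.
    A line containing only "." terminates the reply.
    """
    queues = {}
    for line in response.splitlines():
        if line == ".":
            break
        if not line.strip():
            continue
        name, total, running, workers = line.split("\t")
        queues[name] = {
            "total": int(total),
            "running": int(running),
            "workers": int(workers),
        }
    return queues


def fetch_status(host: str = "127.0.0.1", port: int = 4730, timeout: float = 5.0) -> dict:
    """Ask a gearmand server for per-function counts (assumed host and port)."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(b"status\n")
        buf = b""
        # Read until the terminating ".\n" line has arrived.
        while not (buf == b".\n" or buf.endswith(b"\n.\n")):
            chunk = sock.recv(4096)
            if not chunk:
                break
            buf += chunk
    return parse_status(buf.decode("utf-8", "replace"))
```

Note that the result contains only counts per function; nothing in it identifies an individual job, which is why the tracking has to happen elsewhere.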

Regarding progress, it is best to use the progress-reporting mechanism built into Gearman itself; in the PHP extension you can call $job->sendStatus($percentDone, 100) from the worker. The client can then retrieve this value from the server using the job handle (which is returned when the task is started). This lets you show the current progress to users in your interface.

As long as you track the currently running tasks in your application, you can use that to tell whether a similar task is already running, but you can also use Gearman's built-in coalescing / de-duplication mechanism; see the $unique parameter when adding a task.
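A minimal sketch of such application-level tracking follows, using Python and an SQLite table as a stand-in for whatever database the application already has. The table name `import_jobs`, its columns, and the `submit` callable (which would wrap the real background submission to Gearman) are all hypothetical names introduced for this example.

```python
import sqlite3
import time

# Hypothetical tracking table owned by the application, not by Gearman.
SCHEMA = """
CREATE TABLE IF NOT EXISTS import_jobs (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    client_id INTEGER NOT NULL,
    queued_at REAL NOT NULL,
    finished_at REAL
)
"""


def request_import(db, client_id, submit, now=None):
    """Queue an import unless one is still pending for this client.

    Returns True if a job was submitted, False if the request was a no-op.
    """
    pending = db.execute(
        "SELECT 1 FROM import_jobs WHERE client_id = ? AND finished_at IS NULL",
        (client_id,),
    ).fetchone()
    if pending:
        return False
    submit(client_id)  # assumed to hand the job to Gearman in the background
    db.execute(
        "INSERT INTO import_jobs (client_id, queued_at) VALUES (?, ?)",
        (client_id, time.time() if now is None else now),
    )
    db.commit()
    return True


def mark_finished(db, client_id, now=None):
    """Completion callback: the worker reports back that the import is done."""
    db.execute(
        "UPDATE import_jobs SET finished_at = ? "
        "WHERE client_id = ? AND finished_at IS NULL",
        (time.time() if now is None else now, client_id),
    )
    db.commit()


def last_import(db, client_id):
    """Timestamp of the most recent finished import, or None if there is none."""
    row = db.execute(
        "SELECT MAX(finished_at) FROM import_jobs WHERE client_id = ?",
        (client_id,),
    ).fetchone()
    return row[0]
```

With a table like this, "when was the last import?" and "is one already queued?" become plain queries against the application's own data, independent of how Gearman stores its queue.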

The position in the current queue is not available through Gearman, so you will have to track it in your application. I would stay away from querying Gearman's persistence layer for this information.

+3

You pretty much answered it yourself: use an RDBMS (MySQL or Postgres) as the persistence backend and query the gearman_queue table.

For example, we developed a hybrid solution: when the task starts, we generate a unique identifier for it and pass it as the third parameter of doBackground() ( http://php.net/manual/en/gearmanclient.dobackground.php ).

Then we use this ID to query the gearman_queue table and check the status of the job by looking at the "unique_key" column. You can also get the queue position, since the records are already ordered.
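The lookup described above could be sketched roughly as follows. This example uses Python with an in-memory SQLite table as a stand-in for the MySQL-backed queue; only the table name gearman_queue and the unique_key column come from the answer, while the function_name column and the helper name `queue_position` are assumptions. It also assumes insertion order can be read from SQLite's implicit rowid; a real MySQL table would need its own ordering column (for example an auto-increment ID), since SQL gives no ordering guarantee without one, and job priorities are ignored here.

```python
import sqlite3


def queue_position(db, function_name, unique_key):
    """Return the 1-based position of a job within its function's queue.

    Returns None when no row with that unique_key exists, which here means
    the job is not (or no longer) stored in the queue table.
    """
    row = db.execute(
        "SELECT rowid FROM gearman_queue "
        "WHERE function_name = ? AND unique_key = ?",
        (function_name, unique_key),
    ).fetchone()
    if row is None:
        return None
    # Count the jobs for the same function that were inserted earlier.
    (ahead,) = db.execute(
        "SELECT COUNT(*) FROM gearman_queue "
        "WHERE function_name = ? AND rowid < ?",
        (function_name, row[0]),
    ).fetchone()
    return ahead + 1


def make_demo_db():
    """Build a small stand-in table with three queued jobs for the demo."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE gearman_queue (unique_key TEXT, function_name TEXT, data BLOB)"
    )
    db.executemany(
        "INSERT INTO gearman_queue VALUES (?, ?, ?)",
        [
            ("job-a", "import_client", b"{}"),
            ("job-b", "resize", b"{}"),
            ("job-c", "import_client", b"{}"),
        ],
    )
    return db
```

A missing row is ambiguous by itself: it can mean the job finished or that it was never queued, so the application still needs its own record to tell the two apart.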

Bonus tip: we also catch exceptions inside the worker. If a task fails, we write its payload (a serialized JSON object) to a file; a cron job then picks up the file and re-queues the task, incrementing an internal "retry" counter, so that a task is retried at most 3 more times and we can check on it later if it still does not work.
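One way this retry scheme might look is sketched below in Python. The directory layout, the file naming by unique key (assumed to be safe as a file name), the "retry" field name, and the `submit` callable that re-queues the job are all assumptions made for illustration, not details from the answer.

```python
import json
from pathlib import Path

MAX_RETRIES = 3  # retries in addition to the original attempt


def record_failure(failed_dir: Path, unique_key: str, payload: dict) -> Path:
    """Called from the worker's exception handler: persist the failed payload."""
    failed_dir.mkdir(parents=True, exist_ok=True)
    path = failed_dir / f"{unique_key}.json"
    # Carry the retry counter inside the payload so it survives re-queueing.
    path.write_text(json.dumps({**payload, "retry": payload.get("retry", 0)}))
    return path


def requeue_failed(failed_dir: Path, submit) -> list:
    """Cron entry point: re-submit failed jobs that still have retries left.

    Returns the file names of jobs that exhausted their retries; those files
    are left in place so someone can inspect them later.
    """
    given_up = []
    for path in sorted(failed_dir.glob("*.json")):
        payload = json.loads(path.read_text())
        if payload.get("retry", 0) >= MAX_RETRIES:
            given_up.append(path.name)
            continue
        payload["retry"] = payload.get("retry", 0) + 1
        path.unlink()
        submit(payload)  # assumed to hand the payload back to Gearman
    return given_up
```

Keeping the counter in the payload means the worker needs no extra state: if the retried job fails again, the handler writes the file back with the incremented count.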

+1