Web application: real-time job status updates from a worker backend

I would like to implement an open-source web application where a user sends some kind of request through their browser to a Python web application. The request data is used to define and dispatch some heavy computing task. The heavy computations are outsourced to a worker backend (also Python). While a job is being processed, it passes through several stages over time (from "submitted" through intermediate states to, ideally, "finished"). What I would like to do is show the current state of the job to the user in real time. This means the worker backend must report job states back to the web application, and the web application should then push that information to the user's browser. I have drawn a picture that schematically describes the basic idea: schematic problem description

The numbers in red circles indicate the chronological order of events. The "web application" and the "worker backend" do not exist yet. Now I would be grateful if you could help me with some technology decisions.

My questions, in particular:

  • What messaging technology should I use between the web application and the worker backend? When the worker backend emits a signal (some message) about a job, it must trigger an event in the web application. So I need some kind of callback associated with the client that originally submitted the request. It seems to me that I need a pub/sub mechanism here, where the worker backend publishes and the web application subscribes. When the web application receives a message, it reacts by pushing a status update to the client. I want the worker backend to be scalable and well decoupled from the web application, so I was thinking about using Redis or ZeroMQ for this. What do you think? Is my overall approach too complicated?

  • What technology should I use to push information to the browser? Just out of perfectionism, I would like updates in real time. I do not want high-frequency polling; I want the client to be notified immediately when the worker backend emits a message :-). Also, I do not need broad browser support; this project is first and foremost a technical exercise for me. Should I go for HTML5 server-sent events / WebSockets? Or would you recommend something else?
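For illustration, the pub/sub pattern described above can be sketched with ZeroMQ (pyzmq). This is a minimal single-process sketch: the socket address, topic layout, and message shape are all assumptions, and a real deployment would use a `tcp://` transport between separate processes instead of `inproc://`.

```python
import json
import time
import zmq

ctx = zmq.Context.instance()

# worker backend side: publishes job status updates
pub = ctx.socket(zmq.PUB)
pub.bind("inproc://status")          # real deployment: e.g. tcp://*:5556

# web application side: subscribes and forwards updates to the browser
sub = ctx.socket(zmq.SUB)
sub.connect("inproc://status")
sub.setsockopt(zmq.SUBSCRIBE, b"")   # empty prefix = receive everything
time.sleep(0.2)                      # let the subscription propagate (slow-joiner)

# worker reports a state change
pub.send_string(json.dumps({"job": 42, "state": "finished"}))

# web application receives it and would now push it to the client
update = json.loads(sub.recv_string())
print(update["state"])
```

Note the short sleep after subscribing: PUB/SUB has a "slow joiner" race where messages sent before the subscription is registered are silently dropped, which matters for real deployments too.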

Thanks so much for your recommendations in advance.

+8
python web-applications websocket zeromq messaging
3 answers

To be useful, your web application will have a database anyway. I would create a table in that database specifically for these jobs, with a "state" column for each job.

This simplifies your system, because you can simply accept the request and hand the job off to the backend workers (zmq is a good solution for this, IMO). Since you are using Python on the backend, it is easy to have the workers either update the current job's row in the database themselves, or to have a separate "updater" whose only task is to update those fields (keeping that logic separate makes for a cleaner solution and lets you run several updaters if you have a lot of updates).
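A minimal sketch of such a job table, here with the standard library's sqlite3 (the table layout and state names are illustrative, not prescribed by the answer):

```python
import sqlite3

conn = sqlite3.connect(":memory:")   # a real app would use its actual database

conn.execute(
    """CREATE TABLE jobs (
           id    INTEGER PRIMARY KEY,
           state TEXT NOT NULL DEFAULT 'submitted'
       )"""
)

# web application: record the job when the request comes in
job_id = conn.execute("INSERT INTO jobs DEFAULT VALUES").lastrowid

# worker / updater: advance the state as the job progresses
for state in ("running", "finished"):
    conn.execute("UPDATE jobs SET state = ? WHERE id = ?", (state, job_id))
    conn.commit()

print(conn.execute("SELECT state FROM jobs WHERE id = ?", (job_id,)).fetchone()[0])
```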

Then, for your front end, since you don't want high-frequency polling, I would do something like a "long poll". What you basically do is poll the server, but the server doesn't actually respond until there is a change in the data you are interested in. As soon as a change occurs, the server responds to the pending request. On the front end, you have JS that reconnects as soon as it receives the latest update. This solution is cross-browser compatible if you use a JS framework that is also cross-browser (I would suggest jQuery).
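The server side of a long poll boils down to "block this request until the state changes". A standard-library sketch of that blocking logic, assuming a threaded web server; the `JobBoard` class and its method names are invented for illustration, and a request handler would simply call `wait_for_change` and return its result:

```python
import threading

class JobBoard:
    """Holds job states; long-poll handlers block until a state changes."""

    def __init__(self):
        self._states = {}
        self._cond = threading.Condition()

    def set_state(self, job_id, state):
        with self._cond:
            self._states[job_id] = state
            self._cond.notify_all()          # wake every waiting long-poll request

    def wait_for_change(self, job_id, last_state, timeout=30.0):
        """Block until the job's state differs from last_state (or timeout)."""
        with self._cond:
            self._cond.wait_for(
                lambda: self._states.get(job_id) != last_state, timeout=timeout
            )
            return self._states.get(job_id)

board = JobBoard()
board.set_state(42, "submitted")

# simulate the worker finishing shortly after the client starts waiting
threading.Timer(0.1, board.set_state, args=(42, "finished")).start()
print(board.wait_for_change(42, "submitted"))
```

The timeout matters: without it, a vanished client would pin a handler thread forever, which is exactly the watchdog concern raised further down.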


To avoid having the web application poll the database, you could do the following:

Make the initial request a long-poll request to the web application. The web application sends a zmq message to your backend (this probably wants a REQ/REP socket) and waits. It waits until it receives a status message from the zmq backend. When it receives a state change, it responds to the front end with that change. At that point, the front end sends a new long-poll request (carrying the current job identifier, which can double as its identity), and the web application connects to the backend again and waits for another state change. The trick that makes this work is using zmq's ZMQ_IDENTITY on the socket when it is first created (on the first request). This allows the web application to reconnect to the same backend socket and receive further updates. When the backend has a new update to send, it signals the web application, which in turn responds to the pending long-poll request with the state change. This way there is no polling, no database on the update path, and everything is driven by active clients.
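The identity-based reconnect trick can be sketched with pyzmq ROUTER/DEALER sockets, which expose exactly this addressing. A minimal single-process sketch under assumed names: `inproc://jobs`, the `job-42` identity, and the message bodies are all illustrative.

```python
import zmq

ctx = zmq.Context.instance()

# worker backend side: a ROUTER socket can address peers by their identity
backend = ctx.socket(zmq.ROUTER)
backend.bind("inproc://jobs")

# web application side: one DEALER per long-polled job, with a fixed
# identity so a later reconnect reaches the same routing entry
frontend = ctx.socket(zmq.DEALER)
frontend.setsockopt(zmq.IDENTITY, b"job-42")   # e.g. derived from the job id
frontend.connect("inproc://jobs")

frontend.send(b"subscribe")                    # initial long-poll request

identity, request = backend.recv_multipart()   # ROUTER prepends the identity
# ... later, when the worker reports a state change, the backend can
# push to that specific peer by reusing the stored identity:
backend.send_multipart([identity, b"finished"])

print(frontend.recv().decode())
```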

I would also install some kind of watchdog so that, if the front end goes away (the user switches pages or closes the browser), the backend sockets are properly closed. There is no need for them to sit there blocked, waiting for a state change nobody will read.

+3

One option would be to use WebSocket. If you go down that road, you could check out Autobahn, which includes WebSocket clients and servers for Python (Twisted), as well as an RPC + PubSub protocol on top of WebSocket (with libraries for Python, JavaScript, and Android). Using RPC + PubSub makes things considerably easier and may fit your needs well (job submission => RPC, job updates => PubSub).

AutobahnPython runs on Twisted, which can also act as a WSGI container, allowing you to run Flask (or another WSGI-based web framework). That way you can run everything on one port/server. The Autobahn GitHub repository has an example of the latter.

Disclaimer: I am the original author of Autobahn and WAMP and work for Tavendo.

Details: I assume your workers do CPU-intensive and/or blocking work.

First: are your workers pure Python, or external programs?

If the latter, you can use Twisted protocol instances that communicate over stdio pipes (in a non-blocking manner) with the main Twisted thread. If the former, you can use the Twisted background thread pool via deferToThread (see http://twistedmatrix.com/documents/current/core/howto/threading.html ).

Autobahn runs on the main Twisted reactor thread. If your worker does too (see the previous point), you can directly call methods on the WebSocket/WAMP factory/protocol instances. If not (the worker runs on a background thread), you should call those methods via callFromThread.

If you use WAMP, the main thing is to get a reference to the WampServerFactory to each worker. The worker can then push a PubSub event to all subscribers by calling the appropriate factory method.

+4

Since you are talking about a Python web application, I would recommend you look into the following:

What messaging technology should be used between the web application and the worker backend?

Celery - break your work into smaller tasks that return results to be displayed to the client

What technology should I use to push information to the browser?

Either Socket.IO on NodeJS, which is a kind of server-side JS framework, or a WebSocket library for your Python web framework

If you are not too attached to Python, check out Meteor

Based on this thread , other ways of pushing progress updates from the server to the web client in real time include writing the progress status to a redis database, or using Orbited / MorbidQ (both based on Twisted ) with the STOMP protocol, driven by asynchronous results from Celery subtasks

+3
