Is it possible to exchange data in memory between two separate processes?

I have an xmlrpc server using Twisted. The server holds a huge amount of data in memory. Is it possible to have a second, separate xmlrpc server that can access an object in memory on the first server?

So serverA starts up and creates an object. serverB starts up and can read from the object in serverA.

* EDIT *

Shared Data is a list of 1 million tuples.

+40
python
Aug 12 '09 at 19:33
8 answers

Without some deep and dark rewriting of the Python core runtime (to force an allocator that uses a given segment of shared memory and guarantees compatible addresses between disparate processes), there is no way to "share objects in memory" in any general sense. That list will hold a million addresses of tuples, each tuple is made up of the addresses of all its items, and each of these addresses will have been assigned by pymalloc in a way that inevitably varies between processes and spreads all over the heap.

On just about any system other than Windows, you can spawn a subprocess that has essentially read-only access to objects in the parent process's space ... as long as the parent process does not alter those objects either. This is obtained with a call to os.fork(), which in practice "snapshots" the entire memory space of the current process and starts another simultaneous process on the copy/snapshot. On all modern operating systems this is very fast thanks to a "copy on write" approach: pages of virtual memory that are not altered by either process after the fork are not actually copied (access to the same pages is shared instead); as soon as either process modifies any bit in a previously shared page, poof, that page is copied and the page table modified, so the modifying process now has its own copy, while the other process still sees the original.

This extremely limited form of sharing can still be a lifesaver in some cases (although it is extremely limited: remember, for example, that adding a reference to a shared object counts as "altering" that object, because of reference counts, and so will force a page copy!) ... except on Windows, of course, where it is not available. With this single exception (which I don't think will cover your use case), sharing object graphs that include references/pointers to other objects is basically unfeasible, and just about any set of objects of interest in modern languages (including Python) falls under this classification.
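A minimal sketch of the fork-based sharing described above, assuming Python 3 on a Unix-like system (os.fork() does not exist on Windows). The helper name sum_in_child is hypothetical. The child reads the parent's list without it ever being pickled or sent, and only the small result travels back through a pipe.

```python
import os


def sum_in_child(data):
    """Fork a child that reads `data` from its snapshot of the parent's memory."""
    read_fd, write_fd = os.pipe()
    pid = os.fork()  # Unix only
    if pid == 0:
        # Child process: `data` is visible here without any copying up front.
        try:
            os.close(read_fd)
            total = sum(len(item) for item in data)
            os.write(write_fd, str(total).encode())
            os.close(write_fd)
        finally:
            os._exit(0)  # leave immediately, skipping the parent's cleanup handlers
    # Parent process: collect the child's answer.
    os.close(write_fd)
    with os.fdopen(read_fd, "rb") as reader:
        result = int(reader.read())
    os.waitpid(pid, 0)
    return result


if __name__ == "__main__":
    big_list = [(i, i + 1) for i in range(100_000)]
    print(sum_in_child(big_list))
```

Note that even this read-only loop touches reference counts in the child, so some of the "shared" pages will still be copied, as the answer warns.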

In extreme (but sufficiently simple) cases, you can obtain sharing by giving up the native in-memory representation of such object graphs. For example, a list of a million tuples, each with sixteen floats, could actually be represented as a single 128 MB block of shared memory (all 16M floats in IEEE double-precision representation, laid end to end) with a little shim on top to "make it look like" you are addressing things in the normal way (and, of course, the not-so-little-after-all shim would also have to take care of the extremely hairy inter-process synchronization problems that are certain to arise ;-). It only gets hairier and more complicated from there.
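The flat-block idea can be sketched as follows, assuming Python 3.8+ (for multiprocessing.shared_memory, which did not exist when this answer was written). The class name SharedRows and its methods are hypothetical, and the sketch deliberately ignores the synchronization problems mentioned above.

```python
import struct
from multiprocessing import shared_memory

ROW_WIDTH = 16                            # floats per tuple, as in the example above
ROW_FORMAT = f"{ROW_WIDTH}d"              # native double-precision floats
ROW_BYTES = struct.calcsize(ROW_FORMAT)   # 16 * 8 = 128 bytes per row


class SharedRows:
    """Sequence-like shim over one shared block of fixed-width float rows."""

    def __init__(self, n_rows, name=None):
        create = name is None             # no name: create a new block; else attach
        size = n_rows * ROW_BYTES if create else 0
        self.shm = shared_memory.SharedMemory(name=name, create=create, size=size)
        self.n_rows = n_rows

    def _offset(self, index):
        if not 0 <= index < self.n_rows:
            raise IndexError(index)
        return index * ROW_BYTES

    def __len__(self):
        return self.n_rows

    def __getitem__(self, index):
        # Decode one row from raw bytes into an ordinary tuple of floats.
        return struct.unpack_from(ROW_FORMAT, self.shm.buf, self._offset(index))

    def __setitem__(self, index, row):
        # No locking here: concurrent writers would need their own protocol.
        struct.pack_into(ROW_FORMAT, self.shm.buf, self._offset(index), *row)

    def close(self, unlink=False):
        self.shm.close()
        if unlink:
            self.shm.unlink()             # only the creating process should do this
```

A second process would attach with SharedRows(n_rows, name=first.shm.name) and read rows by index; each access builds a fresh tuple, so no Python object is ever shared, only the raw doubles.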

Modern approaches to concurrency increasingly disdain "shared-anything" designs in favor of "shared-nothing" ones, where tasks communicate by message passing (even on multi-core systems using threads and a shared address space, the synchronization issues and the performance hits the hardware incurs in terms of caching, pipeline stalls, etc., when large areas of memory are actively modified by several cores at once, are pushing people away).

For example, the multiprocessing module in the Python standard library relies mostly on pickling and sending objects back and forth, not on sharing memory (certainly not in a read/write way!).

I realize this is not welcome news to the OP, but if he does need to put multiple processors to work, he had better think in terms of having anything they must share reside in places where it can be accessed and modified by message passing: a database, a memcache cluster, a dedicated process that does nothing but keep the data in memory and send and receive it on request, and other such message-passing-centric architectures.
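The "dedicated process that keeps the data in memory" option might look roughly like this. It is a sketch only: the request format ("len", "get", "stop") and the function names serve and demo are invented for illustration, and a real setup would more likely use sockets or XML-RPC than a pipe.

```python
import multiprocessing as mp


def serve(conn, data):
    """Hold `data` and answer requests arriving on `conn` until told to stop."""
    while True:
        request = conn.recv()
        command = request[0]
        if command == "stop":
            break
        if command == "len":
            conn.send(len(data))
        elif command == "get":
            conn.send(data[request[1]])   # only the requested item is pickled and sent
    conn.close()


def demo():
    """Start a holder process and query it from the current process."""
    data = [(i, i * i) for i in range(1000)]
    client_end, server_end = mp.Pipe()
    holder = mp.Process(target=serve, args=(server_end, data))
    holder.start()
    server_end.close()                    # the holder process keeps its own copy of this end

    client_end.send(("len",))
    print(client_end.recv())
    client_end.send(("get", 12))
    print(client_end.recv())

    client_end.send(("stop",))
    holder.join()
```

Call demo() from under an `if __name__ == "__main__":` guard. The design point is that the big list lives in exactly one process, and clients only ever receive the small pieces they ask for.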

+83
Aug 12 '09 at 22:20
 mmap.mmap(0, 65536, 'GlobalSharedMemory') 

I think the tag ("GlobalSharedMemory") should be the same for all processes that want to share the same memory.

http://docs.python.org/library/mmap.html
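The tag name argument above is Windows-only. A portable variation, sketched here under the assumption that both processes can agree on a file path, is to map the same file in each process. The helper name open_shared is hypothetical.

```python
import mmap
import os

SIZE = 65536


def open_shared(path, size=SIZE):
    """Map `size` bytes of the file at `path`, creating and growing it if needed."""
    fd = os.open(path, os.O_RDWR | os.O_CREAT)
    try:
        if os.fstat(fd).st_size < size:
            os.ftruncate(fd, size)        # a file must be at least as large as its mapping
        return mmap.mmap(fd, size)        # default access is shared read/write
    finally:
        os.close(fd)                      # the mapping remains valid after this
```

Every process that calls open_shared with the same path sees the same bytes, for example `region = open_shared("/tmp/shared_region.bin")` followed by `region[:5] = b"hello"` in one process and `region[:5]` in another. As with any shared memory, only raw bytes are shared, not Python objects.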

+14
Aug 12 '09 at 19:36

Python has a couple of third-party libraries [1] available for low-level shared memory manipulation:

  • sysv_ipc
    • For systems that are not POSIX-compliant
  • posix_ipc
    • Works on Windows with Cygwin

Both are available via pip.

[1] Another package, shm, is available but outdated. See this page for a comparison of the libraries.

Sample code for C-to-Python communication, courtesy of Martin O'Hanlon:

shmwriter.c

    #include <stdio.h>
    #include <string.h>
    #include <stdlib.h>
    #include <unistd.h>
    #include <sys/types.h>
    #include <sys/ipc.h>
    #include <sys/shm.h>

    int main(int argc, const char **argv)
    {
        int shmid;
        // give your shared memory an id, anything will do
        key_t key = 123456;
        char *shared_memory;

        // Set up shared memory; 12 bytes holds "Hello World" plus its terminating NUL
        if ((shmid = shmget(key, 12, IPC_CREAT | 0666)) < 0)
        {
            printf("Error getting shared memory id\n");
            exit(1);
        }

        // Attach shared memory
        if ((shared_memory = shmat(shmid, NULL, 0)) == (char *) -1)
        {
            printf("Error attaching shared memory id\n");
            exit(1);
        }

        // copy "Hello World" (including the NUL) to shared memory
        memcpy(shared_memory, "Hello World", sizeof("Hello World"));

        // sleep so there is enough time to run the reader!
        sleep(10);

        // Detach (by address, not by id) and remove shared memory
        shmdt(shared_memory);
        shmctl(shmid, IPC_RMID, NULL);
        return 0;
    }

shmreader.py

    import sysv_ipc

    # Attach to the existing shared memory segment by key
    memory = sysv_ipc.SharedMemory(123456)

    # Read the value from shared memory (bytes under Python 3)
    memory_value = memory.read()

    # Find the 'end' of the string and strip
    i = memory_value.find(b'\0')
    if i != -1:
        memory_value = memory_value[:i]

    print(memory_value.decode())

+6
Feb 06 '15 at 23:01

You can write a C library to create and manage shared memory arrays for your specific purpose, and then use ctypes to access them with Python.

Or, put them on the filesystem in /dev/shm (which is tmpfs). You'll save a lot of development effort for very little performance overhead: reading from and writing to a tmpfs filesystem is little more than a memcpy.
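A rough sketch of the /dev/shm approach, assuming the tuples are fixed-width rows of floats. The helper names are hypothetical, and the code falls back to the regular temp directory on systems without /dev/shm.

```python
import array
import os
import tempfile


def shared_dir():
    """Prefer the tmpfs mount at /dev/shm; fall back to the normal temp directory."""
    return "/dev/shm" if os.path.isdir("/dev/shm") else tempfile.gettempdir()


def save_rows(path, rows):
    """Flatten equal-length tuples of floats into packed doubles and write them out."""
    flat = array.array("d", (value for row in rows for value in row))
    with open(path, "wb") as handle:
        flat.tofile(handle)


def load_rows(path, width):
    """Read the packed doubles back and regroup them into tuples of `width` items."""
    flat = array.array("d")
    with open(path, "rb") as handle:
        flat.frombytes(handle.read())
    return [tuple(flat[i:i + width]) for i in range(0, len(flat), width)]
```

One server would call save_rows(os.path.join(shared_dir(), "points.bin"), data) and the other load_rows on the same path. Each reader still builds its own Python objects from the bytes; the saving is in transport, not in per-process memory.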

+4
Aug 13 '09 at 12:06
+1
Aug 12 '09 at 20:20

Why not put the shared data into a memcache server? Then both servers can access it quite easily.

+1
Aug 13 '09 at 9:13

Simple, really. You can just use shared memory. This example creates a list of tuples (as Python source text) in C++ and shares it with a Python process, which can then use the list of tuples. To share between two Python processes instead, simply open the mapping with ACCESS_WRITE in the sender process and call the write method.

C ++ (sender process):

    #include <windows.h>
    #include <stdio.h>
    #include <conio.h>
    #include <tchar.h>
    #include <iostream>
    #include <string>

    #define BUF_SIZE 256
    TCHAR szName[]=TEXT("Global\\MyFileMappingObject");
    TCHAR szMsg[]=TEXT("[(1, 2, 3), ('a', 'b', 'c', 'd', 'e'), (True, False), 'qwerty']");

    int _tmain(int argc, _TCHAR* argv[])
    {
        HANDLE hMapFile;
        LPCTSTR pBuf;

        hMapFile = CreateFileMapping(
                     INVALID_HANDLE_VALUE,    // use paging file
                     NULL,                    // default security
                     PAGE_READWRITE,          // read/write access
                     0,                       // maximum object size (high-order DWORD)
                     BUF_SIZE,                // maximum object size (low-order DWORD)
                     szName);                 // name of mapping object

        if (hMapFile == NULL)
        {
            _tprintf(TEXT("Could not create file mapping object (%d).\n"),
                     GetLastError());
            return 1;
        }

        pBuf = (LPTSTR) MapViewOfFile(hMapFile,            // handle to map object
                                      FILE_MAP_ALL_ACCESS, // read/write permission
                                      0,
                                      0,
                                      BUF_SIZE);

        if (pBuf == NULL)
        {
            _tprintf(TEXT("Could not map view of file (%d).\n"),
                     GetLastError());
            CloseHandle(hMapFile);
            return 1;
        }

        CopyMemory((PVOID)pBuf, szMsg, (_tcslen(szMsg) * sizeof(TCHAR)));
        _getch();

        UnmapViewOfFile(pBuf);
        CloseHandle(hMapFile);
        return 0;
    }

Python (receiver process):

    import mmap

    shmem = mmap.mmap(0, 256, "Global\\MyFileMappingObject", mmap.ACCESS_READ)
    msg_bytes = shmem.read()
    msg_utf16 = msg_bytes.decode("utf-16")
    code = msg_utf16.rstrip('\0')
    yourTuple = eval(code)

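Because eval executes whatever text it finds in the shared region, a more cautious receiver could parse the payload with ast.literal_eval, which accepts only Python literals. A small sketch, where the function name parse_payload is hypothetical and the little-endian UTF-16 encoding is an assumption about how the sender was built (a Unicode Windows build):

```python
import ast


def parse_payload(raw_bytes):
    """Decode a NUL-padded UTF-16 buffer and parse it as a Python literal."""
    text = raw_bytes.decode("utf-16-le").rstrip("\0")
    # literal_eval handles lists, tuples, strings, numbers, booleans and None,
    # and raises an exception for anything else instead of executing it.
    return ast.literal_eval(text)
```

In the receiver above this would replace the last three lines with `yourTuple = parse_payload(msg_bytes)`.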
+1
May 12 '17 at 19:09

Why not just use a database for the shared data? You have many lightweight options where you don't have to worry about concurrency issues: sqlite, any of the NoSQL/key-value breed of databases, etc.
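As a rough illustration of the sqlite option, assuming the tuples are (id, x, y) rows (the table layout and function names here are made up for the example):

```python
import sqlite3


def write_rows(db_path, rows):
    """Store (id, x, y) tuples in an SQLite file that other processes can open."""
    conn = sqlite3.connect(db_path)
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "CREATE TABLE IF NOT EXISTS points "
                "(id INTEGER PRIMARY KEY, x REAL, y REAL)"
            )
            conn.executemany(
                "INSERT OR REPLACE INTO points (id, x, y) VALUES (?, ?, ?)", rows
            )
    finally:
        conn.close()


def read_row(db_path, row_id):
    """Fetch one row by id; returns an (x, y) tuple, or None if the id is absent."""
    conn = sqlite3.connect(db_path)
    try:
        cursor = conn.execute("SELECT x, y FROM points WHERE id = ?", (row_id,))
        return cursor.fetchone()
    finally:
        conn.close()
```

serverA would call write_rows once at startup and serverB would call read_row per request; SQLite's own file locking handles the coordination between the two processes.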

0
Aug 13 '09 at 9:03