I have a client written in Python for a server that runs over a LAN. One part of the algorithm uses the socket heavily, and it runs about 3-6 times slower than a nearly identical client written in C++. What are the options for speeding up reads from Python sockets?
I have some simple buffering, and my socket class looks like this:
    import socket
    import struct


    class Sock():
        def __init__(self):
            self.s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
            self.recv_buf = b''
            self.send_buf = b''

        def connect(self):
            self.s.connect(('127.0.0.1', 6666))

        def close(self):
            self.s.close()

        def recv(self, lngth):
            # Keep reading until exactly lngth bytes are buffered.
            while len(self.recv_buf) < lngth:
                chunk = self.s.recv(lngth - len(self.recv_buf))
                if not chunk:
                    # recv returns b'' once the peer has closed the connection.
                    raise ConnectionError('socket closed by peer')
                self.recv_buf += chunk
            res = self.recv_buf[-lngth:]
            self.recv_buf = self.recv_buf[:-lngth]
            return res

        def next_int(self):
            return struct.unpack("i", self.recv(4))[0]

        def next_float(self):
            return struct.unpack("f", self.recv(4))[0]

        def write_int(self, i):
            self.send_buf += struct.pack('i', i)

        def write_float(self, f):
            self.send_buf += struct.pack('f', f)

        def flush(self):
            # Send everything queued by write_int / write_float in one call.
            self.s.sendall(self.send_buf)
            self.send_buf = b''
PS: Profiling also shows that most of the time is spent reading sockets.
Edit: Since the data is received in blocks with a known size, I can immediately read the entire block. So I changed my code to this:
    import socket
    import struct


    class Sock():
        def __init__(self):
            self.s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
            self.send_buf = b''

        def connect(self):
            self.s.connect(('127.0.0.1', 6666))

        def close(self):
            self.s.close()

        def recv_prepare(self, cnt):
            # Read one whole block of cnt bytes, then reset the read cursor.
            self.recv_buf = bytearray()
            while len(self.recv_buf) < cnt:
                chunk = self.s.recv(cnt - len(self.recv_buf))
                if not chunk:
                    # recv returns b'' once the peer has closed the connection.
                    raise ConnectionError('socket closed by peer')
                self.recv_buf.extend(chunk)
            self.recv_buf_i = 0

        def skip_read(self, cnt):
            self.recv_buf_i += cnt

        def next_int(self):
            self.recv_buf_i += 4
            return struct.unpack("i", self.recv_buf[self.recv_buf_i - 4:self.recv_buf_i])[0]

        def next_float(self):
            self.recv_buf_i += 4
            return struct.unpack("f", self.recv_buf[self.recv_buf_i - 4:self.recv_buf_i])[0]

        def write_int(self, i):
            self.send_buf += struct.pack('i', i)

        def write_float(self, f):
            self.send_buf += struct.pack('f', f)

        def flush(self):
            # Send everything queued by write_int / write_float in one call.
            self.s.sendall(self.send_buf)
            self.send_buf = b''
Reading from the socket with recv now looks optimal in this code. But next_int and next_float have become the second bottleneck: they take about 1 µs (roughly 3000 CPU cycles) per call just for unpacking. Is it possible to make them faster, for example by moving that part to C++?
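To make the question concrete, here is a minimal sketch of one direction that stays in pure Python. I have not measured it, so treat it as an assumption rather than a confirmed fix. It precompiles the formats with struct.Struct and reads with unpack_from at an offset, which avoids building a new bytearray slice on every call, and it fills a preallocated buffer with recv_into. The names BlockReader and next_ints are hypothetical and not part of my code above. The native-order "i" and "f" formats are kept as they were.

```python
import socket
import struct

# Precompiled formats: same native-order "i" and "f" as in the class above.
_INT = struct.Struct("i")
_FLOAT = struct.Struct("f")


class BlockReader:
    """Hypothetical helper: receives one fixed-size block, then unpacks from it."""

    def __init__(self, sock):
        self.s = sock            # an already connected socket.socket
        self.buf = bytearray()
        self.pos = 0

    def recv_prepare(self, cnt):
        # Allocate the block once and let recv_into fill it in place.
        self.buf = bytearray(cnt)
        view = memoryview(self.buf)
        got = 0
        while got < cnt:
            n = self.s.recv_into(view[got:])
            if n == 0:
                raise ConnectionError("socket closed before the block was complete")
            got += n
        self.pos = 0

    def skip_read(self, cnt):
        self.pos += cnt

    def next_int(self):
        # unpack_from reads at an offset, so no slice copy is created.
        value = _INT.unpack_from(self.buf, self.pos)[0]
        self.pos += _INT.size
        return value

    def next_float(self):
        value = _FLOAT.unpack_from(self.buf, self.pos)[0]
        self.pos += _FLOAT.size
        return value

    def next_ints(self, n):
        # Unpack n consecutive ints with a single struct call.
        fmt = "%di" % n
        values = struct.unpack_from(fmt, self.buf, self.pos)
        self.pos += struct.calcsize(fmt)
        return values
```

Usage would be something like reader = BlockReader(sock.s), then reader.recv_prepare(block_size) followed by the next_int / next_float calls as before. Where several values of the same type are adjacent, next_ints replaces many Python-level calls with one, which is where I would expect most of any gain to come from.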