Easy way to save / load data in Python

What is the easiest way to save and load data in python, preferably in a human readable format?

The data that I save / load consists of two float vectors. Ideally, these vectors will be named in the file (for example, X and Y).

My current save() and load() functions use file.readline(), file.write() and string-to-float conversion. There must be something better.

+9
python io
7 answers

There are several options; I don't know exactly what you prefer. If the two vectors have the same length, you can use numpy.savetxt() to save your vectors, say x and y, as columns:

    import numpy

    # saving:
    f = open("data", "w")
    f.write("# x y\n")   # column names
    numpy.savetxt(f, numpy.array([x, y]).T)
    f.close()

    # loading:
    x, y = numpy.loadtxt("data", unpack=True)

If you are dealing with large floating-point vectors, you should probably use NumPy anyway.

+9

The easiest way to get readable output is to use a serialization format such as JSON. Python ships with a json library that you can use to serialize data to and from a string. Like pickle, you can use it with an IO object to write the data to a file.

    import json

    file = open('/usr/data/application/json-dump.json', 'w+')
    data = { "x": 12153535.232321, "y": 35234531.232322 }

    json.dump(data, file)

If you want to get a simple string back instead of dumping it to a file, you can use json.dumps():

    import json

    print(json.dumps({ "x": 12153535.232321, "y": 35234531.232322 }))

Reading back from the file is just as easy:

    import json

    file = open('/usr/data/application/json-dump.json', 'r')
    print(json.load(file))

The json library is fully functional, so I recommend checking the documentation to see what things you can do with it.

+22
  • If it should be human readable, I would also go with JSON. Unless you need to exchange it with enterprise-type people; they like XML better. :-)

  • If it should be human editable and not too complicated, I would probably go with some kind of INI-like format, such as configparser (see the sketch after this list).

  • If it is complex and does not need to be exchanged, I would simply pickle the data, unless it is very complicated, in which case I would use ZODB.

  • If it contains a lot of data and needs to be exchanged, I would use SQL.

I think that covers almost everything.
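
For the INI-style option above, here is a minimal sketch using configparser; the file name, section name, and the comma-separated encoding of the vectors are illustrative choices, not a fixed convention:

    import configparser

    x = [1.0, 2.5, 3.3]
    y = [4.2, 5.1, 6.0]

    # saving: store each vector as a comma-separated string in one section
    config = configparser.ConfigParser()
    config["vectors"] = {
        "x": ", ".join(str(v) for v in x),
        "y": ", ".join(str(v) for v in y),
    }
    with open("data.ini", "w") as f:
        config.write(f)

    # loading: parse the strings back into lists of floats
    config = configparser.ConfigParser()
    config.read("data.ini")
    x = [float(v) for v in config["vectors"]["x"].split(",")]
    y = [float(v) for v in config["vectors"]["y"].split(",")]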

+7

A simple serialization format that is easy for both people and computers to read is JSON.

You can use Python's json module.
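
Applied to the two vectors from the question, a minimal sketch might look like this (the file name and key names are just examples):

    import json

    x = [1.0, 2.5, 3.3]
    y = [4.2, 5.1, 6.0]

    # saving: write both vectors to a human-readable JSON file
    with open("data.json", "w") as f:
        json.dump({"X": x, "Y": y}, f, indent=2)

    # loading: read them back
    with open("data.json") as f:
        data = json.load(f)
    x, y = data["X"], data["Y"]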

+2

Since we are talking about a human editing the file, I assume we are dealing with a relatively small amount of data.

How about something like the following skeleton? It simply saves the data as key=value pairs and works with lists, tuples, and many other things.

    def save(fname, **kwargs):
        f = open(fname, "wt")
        for k, v in kwargs.items():
            f.write("%s=%r\n" % (k, v))
        f.close()

    def load(fname):
        ret = {}
        for line in open(fname, "rt"):
            k, v = line.strip().split("=", 1)
            ret[k] = eval(v)  # for untrusted files, ast.literal_eval() is the safer choice
        return ret

    x = [1, 2, 3]
    y = [2.0, 1e15, -10.3]

    save("data.txt", x=x, y=y)
    d = load("data.txt")
    print(d["x"])
    print(d["y"])
0

As I commented on the accepted answer, with numpy this can be done through a simple one-line interface:

Assuming numpy is imported as np (which is common practice),

    np.savetxt('xy.txt', np.array([x, y]).T, fmt="%.3f", header="x y")

saves the data (in an optional format), and

 x, y = np.loadtxt('xy.txt', unpack=True) 

loads it back.

The xy.txt file will look like this:

    # x y
    1.000 1.000
    1.500 2.250
    2.000 4.000
    2.500 6.250
    3.000 9.000

Note that the fmt=... format string is optional, but if the goal is human readability it can be very useful. If used, it is specified with ordinary printf-style codes (in my example: a floating-point number with 3 decimal places).

0

Here is an example of an Encoder that you would probably want to write for the Body class:

    # add this to your code
    import json
    import numpy as np

    class BodyEncoder(json.JSONEncoder):
        def default(self, obj):
            if isinstance(obj, np.ndarray):
                return obj.tolist()
            if hasattr(obj, '__jsonencode__'):
                return obj.__jsonencode__()
            if isinstance(obj, set):
                return list(obj)
            return obj.__dict__

    # Here you construct your way to load your data back into instances;
    # you need to customize this function
    def deserialize(data):
        bodies = [Body(d["name"], d["mass"], np.array(d["p"]), np.array(d["v"]))
                  for d in data["bodies"]]
        axis_range = data["axis_range"]
        timescale = data["timescale"]
        return bodies, axis_range, timescale

    # Here you construct your way to dump your data for each instance;
    # you need to customize this function
    def serialize(data):
        file = open(FILE_NAME, 'w+')
        json.dump(data, file, cls=BodyEncoder, indent=4)
        print("Dumping Parameters of the Latest Run")
        print(json.dumps(data, cls=BodyEncoder, indent=4))

Here is an example of a class that I want to serialize:

    class Body(object):
        # you do not need to change your class structure
        def __init__(self, name, mass, p, v=(0.0, 0.0, 0.0)):
            # init variables like normal
            self.name = name
            self.mass = mass
            self.p = p
            self.v = v
            self.f = np.array([0.0, 0.0, 0.0])

        def attraction(self, other):
            # not important functions that I wrote...
            pass

Here's how to serialize:

    # you need to customize this function
    def serialize_everything():
        bodies, axis_range, timescale = generate_data_to_serialize()
        data = {"bodies": bodies, "axis_range": axis_range, "timescale": timescale}
        serialize(data)

Here's how to load it back:

    def dump_everything():
        data = json.loads(open(FILE_NAME, "r").read())
        return deserialize(data)
0
