Basic Data Warehouse with Python

Question

I need to store basic data about customers, the cars they bought, and the payment schedules for those cars. The data comes from a graphical interface written in Python. I do not have enough experience to use a database system such as SQL, so I want to save my data in a file as plain text. It also should not be online.

To be able to search and filter the data, I first convert it (lists of lists) to a string, and then, when I need the data, convert it back to the usual Python list syntax. I know this is very brute force, but is it safe to do this, or can you advise me differently?

+4

python database data-storage

erkangur Jul 18 '10 at 16:07

7 answers

The answer suggesting pickle is good, but I personally prefer shelve. It lets you keep variables in the same state between runs, and I find it easier to use than pickle. http://docs.python.org/library/shelve.html
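A minimal sketch of what that could look like with the standard library's shelve module. The file location, the key, and the record layout are assumptions made up for illustration, not something taken from the question:

```python
import os
import shelve
import tempfile

# Hypothetical location and record layout, chosen only for this sketch.
db_path = os.path.join(tempfile.mkdtemp(), "customers")

# First run: store a customer record under a string key.
with shelve.open(db_path) as db:
    db["alice"] = {
        "car": "Volvo V70",
        "payments": [("2010-08-01", 250), ("2010-09-01", 250)],
    }

# A later run: reopen the shelf and read the record back unchanged.
with shelve.open(db_path) as db:
    record = db["alice"]

print(record["car"])
print(sum(amount for _, amount in record["payments"]))
```

Note that changing a stored object in place (for example appending to record["payments"]) is not written back automatically; reassign the key or open the shelf with writeback=True.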

+5

gddc Jul 18 '10 at 16:44

I agree with the others that serious and important data will be safer in any lightweight database, but I can also sympathize with the desire to keep things simple and transparent.

So, instead of inventing your own textual data format, I suggest you use YAML.

The format is human readable, for example:

    List of things:
        - Alice
        - Bob
        - Evan

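You load the file as follows (yaml.safe_load is used here because current PyYAML versions require an explicit Loader for yaml.load):

    >>> import yaml
    >>> with open('test.yaml', 'r') as f:
    ...     data = yaml.safe_load(f)

And the loaded data will look like this:

    {'List of things': ['Alice', 'Bob', 'Evan']}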

Of course, you can also do the opposite and save data as YAML; the documentation will help you with this.

At the very least, it is another alternative to consider.

+4

Hagge Jul 19 '10 at 7:57

A very simple and basic approach (more info at http://pastebin.com/A12w9SVd ):

    import json
    import os

    db_name = 'udb.db'

    def check_db(name=db_name):
        # Create an empty database file if none exists yet.
        if not os.path.isfile(name):
            print('no db\ncreating..')
            open(name, 'w').close()

    def read_db():
        # Return the stored dict, or {} if the file is empty or not valid JSON.
        check_db()
        try:
            with open(db_name, 'r') as udb:
                return json.load(udb)
        except ValueError:
            return {}

    def update_db(newdata):
        # Merge newdata into the stored dict and write everything back.
        data = read_db()
        data.update(newdata)
        with open(db_name, 'w') as udb:
            json.dump(data, udb)

Usage:

    def adduser():
        print('add user:')
        name = input('name > ')
        password = input('password > ')
        update_db({name: password})

+3

DirectorX Oct 13 '13 at 16:07

You can use the pickle library to write an object to a file: http://docs.python.org/library/pickle.html
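As a rough sketch, assuming data in the "list of lists" shape the question describes (the customer names, car models, amounts, and file name below are invented for illustration):

```python
import os
import pickle
import tempfile

# Hypothetical data in the "list of lists" shape the question describes.
customers = [
    ["Alice", "Volvo V70", [250, 250, 250]],
    ["Bob", "Fiat Punto", [180, 180]],
]

# Hypothetical file name inside a temporary directory.
path = os.path.join(tempfile.mkdtemp(), "customers.pkl")

# Pickle files are binary, so open them with "wb" and "rb".
with open(path, "wb") as f:
    pickle.dump(customers, f)

with open(path, "rb") as f:
    restored = pickle.load(f)

print(restored == customers)
```

Only unpickle files you wrote yourself: loading a pickle from an untrusted source can run arbitrary code.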

+2

Quonux Jul 18 '10 at 16:11

Writing data to a file is not a safe way to store data. It is better to use a simple database library like SQLAlchemy. It is an ORM that makes databases easy to use ...

+1

svenwltr Jul 18 '10 at 16:50

You can also store simple data in a text file. However, you get little support for checking data consistency, duplicate values, etc.

Here is my code fragment for simple "card file" style data in a text file. It uses namedtuple so that you can access a field not only by its index in the row but also by its header name:

    # text based data input with data accessible
    # with named fields or indexing
    from __future__ import print_function  ## Python 3 style printing
    from collections import namedtuple
    import string

    datadict = {}
    with open("sample.dat") as filein:
        headerline = filein.readline().lower()  ## lowercase field names Python style
        ## first non-letter and non-number is taken to be the separator
        ## (string.ascii_lowercase exists in both Python 2 and 3)
        separator = headerline.strip(string.ascii_lowercase + string.digits)[0]
        print("Separator is '%s'" % separator)

        headerline = [field.strip() for field in headerline.split(separator)]
        Dataline = namedtuple('Dataline', headerline)
        print('Fields are:', Dataline._fields, '\n')

        for data in filein:
            data = [f.strip() for f in data.split(separator)]
            d = Dataline(*data)
            datadict[d.id] = d  ## do hash of id values for fast lookup (key field)

    ## examples based on sample.dat file example
    key = '123'
    print('Email of record with key %s by field name is: %s' %
          (key, datadict[key].email))

    ## by number
    print('Address of record with key %s by field number is: %s' %
          (key, datadict[key][3]))

    ## print the dictionary in separate lines for clarity
    for key, value in datadict.items():
        print('%s: %s' % (key, value))

    input('Ready')  ## let the output be seen when run directly

    """ Output:
    Separator is ';'
    Fields are: ('id', 'name', 'email', 'homeaddress')

    Email of record with key 123 by field name is: gishi@mymail.com
    Address of record with key 123 by field number is: 456 happy st.
    345: Dataline(id='345', name='tony', email='tony.veijalainen@somewhere.com', homeaddress='Espoo Finland')
    123: Dataline(id='123', name='gishi', email='gishi@mymail.com', homeaddress='456 happy st.')
    Ready
    """

0

Tony veijalainen Jul 18 '10 at 18:59

ntcong · Accepted Answer · 2010-07-18T16:39:00+0000

It is not safe to save the database in text format (whether using pickle or anything else). There is a risk that a problem while saving the data could corrupt it. Not to mention the risk of your data being stolen.

As your data set grows, performance may also suffer.

Take a look at sqlite (or sqlite3), which is small and easier to manage than MySQL, unless you have a very small data set that will fit into a text file.
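To give an idea of how little is needed, here is a minimal sketch using the sqlite3 module from the standard library. The table and column names, the sample rows, and the use of an in-memory database are all assumptions for illustration; the question does not specify a schema:

```python
import sqlite3

# ":memory:" keeps this sketch self-contained; pass a file name such as
# "cars.db" (a hypothetical name) to keep the data on disk instead.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE customers (
        id   INTEGER PRIMARY KEY,
        name TEXT NOT NULL
    );
    CREATE TABLE cars (
        id          INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customers(id),
        model       TEXT NOT NULL
    );
    CREATE TABLE payments (
        id       INTEGER PRIMARY KEY,
        car_id   INTEGER NOT NULL REFERENCES cars(id),
        due_date TEXT NOT NULL,
        amount   REAL NOT NULL
    );
""")

# "with conn" commits on success and rolls back if an exception is raised.
with conn:
    customer_id = conn.execute(
        "INSERT INTO customers (name) VALUES (?)", ("Alice",)
    ).lastrowid
    car_id = conn.execute(
        "INSERT INTO cars (customer_id, model) VALUES (?, ?)",
        (customer_id, "Volvo V70"),
    ).lastrowid
    conn.executemany(
        "INSERT INTO payments (car_id, due_date, amount) VALUES (?, ?, ?)",
        [(car_id, "2010-08-01", 250.0), (car_id, "2010-09-01", 250.0)],
    )

# Search and filter: payments for one customer due from September onwards.
rows = conn.execute(
    """
    SELECT customers.name, cars.model, payments.due_date, payments.amount
    FROM payments
    JOIN cars ON cars.id = payments.car_id
    JOIN customers ON customers.id = cars.customer_id
    WHERE customers.name = ? AND payments.due_date >= ?
    ORDER BY payments.due_date
    """,
    ("Alice", "2010-09-01"),
).fetchall()
print(rows)
```

The searching and filtering that the question does by converting strings back to lists becomes a single query here, and the whole database is still one local file with nothing online.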

P.S. By the way, using Berkeley DB in Python is simple and you don't need to learn all the DB stuff, just import bsddb.
