This concerns a project to convert a two-way ANOVA program from SAS to Python.
I only started learning the language on Thursday, so I know I have plenty of room for improvement. If I am missing something blatantly obvious, by all means let me know. I don't have Sage up and running yet, so for now this is all plain vanilla Python 2.6.1 (portable).
Primary query: I need a good set of list comprehensions that can extract the data, stored as lists of samples within nested lists, by factor A, by factor B, overall, and in groups for each combination of levels of factors A & B (AxB).
After some work, the data is in the following form (3 layers of nested lists):
response[a][b][n]
(That is, [a1 [b1 [n1, ..., nN] ... bB [n1, ..., nN]], ..., aA [b1 [n1, ..., nN] ... bB [n1, ..., nN]]]. I hope this is clear.)
Factor levels in my example: A = 3 (0-2), B = 8 (0-7), N = 8 (0-7)
byA= [[a[i] for i in range(b)] for a[b] in response]
(Can someone explain why this syntax works? I stumbled upon it while trying to see what the parser would accept. I haven't seen this syntax associated with this behavior elsewhere, but it's really nice. Any good links to sites or books on this topic would be appreciated. Edit: Variables persisting between runs explained this oddity. It does not actually work.)
byB=lstcrunch([[Bs[i] for i in range(len(Bs)) ]for Bs in response])
(It should be noted that zip(*response) almost does what I want. The version above does not actually work, as I recall. I have not tested it thoroughly yet.)
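To make the zip(*response) remark concrete, here is a small sketch on a made-up 2 x 3 x 2 table (the name `demo` and its values are assumptions for illustration only). It suggests why zip(*response) only "almost" works: it groups by level of B, but each group is still a collection of per-A replicate lists, so one more level of flattening is needed.

```python
# Hypothetical 2 x 3 x 2 table laid out as demo[a][b][n].
demo = [[[1, 2], [3, 4], [5, 6]],
        [[7, 8], [9, 10], [11, 12]]]

# zip(*demo) pairs the b-th sublist of every level of A.
# Each group is a tuple of replicate lists, one per level of A.
grouped = [list(group) for group in zip(*demo)]

# Flattening each group by one level yields all replicates per level of B.
by_b = [[rep for reps in group for rep in reps] for group in zip(*demo)]
```

With this layout, `grouped[0]` holds the two replicate lists for B level 0, while `by_b[0]` holds the same values as one flat list.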
byAxB= [item for sublist in response for item in sublist]
(Stolen from Alex Martelli's answer on this site. Again, can someone explain why it works? List comprehension syntax is not well explained in the texts I have been reading.)
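One way to read a comprehension with two `for` clauses is as nested loops written in the same left-to-right order, with the leading expression being what gets appended. The sketch below uses a made-up table (`demo` is an assumption, not the real data) and shows the loop form and the comprehension form producing the same result.

```python
# Hypothetical 2 x 2 x 2 table laid out as demo[a][b][n].
demo = [[[1, 2], [3, 4]],
        [[5, 6], [7, 8]]]

# Explicit loops: the outer `for` comes first, the inner `for` second.
flat_loop = []
for sublist in demo:          # each level of A
    for item in sublist:      # each B cell within that level of A
        flat_loop.append(item)

# Same thing as a comprehension: the `for` clauses keep the same order.
flat_comp = [item for sublist in demo for item in sublist]
```

Both forms remove exactly one level of nesting, which is why applying the same comprehension a second time gives the fully flat list.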
ByO= [item for sublist in byAxB for item in sublist]
(Obviously, I just reused the previous comprehension here because it did what I needed. Edit:)
I would like these to end up as the same data types, at least when looped over by the factor in question, so that the same mean / sum / SS / et cetera functions can be applied and reused.
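As a rough sketch of that goal, the example below applies the same two helper functions to a grouping by factor A. The helpers `mean` and `sum_sq` and the table `demo` are hypothetical names introduced here for illustration; SS is taken to mean the sum of squared deviations from the group mean.

```python
def mean(values):
    """Arithmetic mean of a flat list of numbers."""
    return sum(values) / float(len(values))

def sum_sq(values):
    """Sum of squared deviations from the mean of a flat list."""
    m = mean(values)
    return sum((v - m) ** 2 for v in values)

# Hypothetical 2 x 2 x 2 table laid out as demo[a][b][n].
demo = [[[1, 2], [3, 4]],
        [[5, 6], [7, 8]]]

# One flat list of replicates per level of A.
by_a = [[rep for row_b in row_a for rep in row_b] for row_a in demo]

a_means = [mean(group) for group in by_a]
a_ss = [sum_sq(group) for group in by_a]
```

Because every grouping ends up as a list of flat lists, the same helpers could be mapped over a by-B or AxB grouping without changes.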
The lstcrunch helper used above could probably be replaced with something cleaner:
def lstcrunch(Dlist):
    """Returns a list containing the entire contents of whatever is imported,
    reduced by one level.

    If a rectangular array, it reduces a dimension by one.
    lstcrunch(DataSet[a][b]) -> DataOutput[a]
    [[1, 2], [[2, 3], [2, 4]]] -> [1, 2, [2, 3], [2, 4]]
    """
    flat=[]
    if islist(Dlist):
Oh, while I'm on the topic, what is the preferred way to identify a variable as a list? I use:
def islist(a):
    "Returns 'True' if input is a list and 'False' otherwise"
    return type(a)==type([])
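For comparison, one commonly used alternative is the built-in `isinstance`, which also accepts subclasses of `list`. The sketch below is only an illustration; `islist_alt` and `TaggedList` are hypothetical names that do not appear in the original code.

```python
def islist_alt(a):
    """Hypothetical variant of islist: True for lists and list subclasses."""
    return isinstance(a, list)

class TaggedList(list):
    """Hypothetical list subclass, used only for this illustration."""
    pass

sample = TaggedList([1, 2, 3])

# The exact-type comparison rejects the subclass; isinstance accepts it.
exact_type_check = type(sample) == type([])
isinstance_check = islist_alt(sample)
```

Whether subclasses should count as lists depends on the use; for plain nested lists like the ones here, the two checks agree.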
Parting query: Is there a way to explicitly force a shallow copy to become a deep copy? Or, similarly, when copying into a variable, is there a way to declare that the assignment should also replace the pointer, and not just the value (such that the assignment will not propagate to other shallow copies)? Likewise, the shared behavior can be useful from time to time, so being able to control when it does or doesn't happen sounds very good. (I really tripped over myself when I prepared the table for input by calling: response = [[[0] * N] * B] * A)
Edit: Further research led to most of this working fine. I have since made a class and tested it. It works great. I left the list comprehension forms intact for reference.
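The pitfall with `[[[0] * N] * B] * A` can be reproduced on a small scale. The sketch below uses made-up sizes and names (`aliased`, `independent`, `original`, and so on are assumptions for illustration) and shows the standard library's `copy.copy` and `copy.deepcopy` side by side.

```python
import copy

A, B, N = 2, 3, 2  # hypothetical small sizes

# List repetition copies references, so every cell is the same inner list.
aliased = [[[0] * N] * B] * A
aliased[0][0][0] = 99          # shows up in every cell

# Building each level with a comprehension creates separate lists.
independent = [[[0] * N for _ in range(B)] for _ in range(A)]
independent[0][0][0] = 99      # changes one cell only

# copy.copy duplicates only the outer list; copy.deepcopy duplicates all levels.
original = [[1, 2], [3, 4]]
shallow = copy.copy(original)
deep = copy.deepcopy(original)
original[0][0] = -1            # visible through shallow, not through deep

# deepcopy preserves sharing inside the object it copies, so it does not
# untangle a table that was already built with list repetition.
still_aliased = copy.deepcopy(aliased)
```

In other words, `deepcopy` is useful for separating a copy from its source, while the repetition problem is best avoided when the table is first built.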
def byB(array_a_b_c):
    y=range(len(array_a_b_c))
    x=range(len(array_a_b_c[0]))
    return [[array_a_b_c[i][j][k]
             for k in range(len(array_a_b_c[0][0]))
             for i in y]
            for j in x]

def byA(array_a_b_c):
    return [[repn for rowB in rowA for repn in rowB] for rowA in array_a_b_c]

def byAxB(array_a_b_c):
    return [rowB for rowA in array_a_b_c for rowB in rowA]

def byO(array_a_b_c):
    return [rep for rowA in array_a_b_c for rowB in rowA for rep in rowB]

def gen3d(row, col, inner):
    """Produces a 3d nested array without any naughty shallow copies.

    [row[col[inner]]] named s.t. the outer can be split on, per lprn for easy display"""
    return [[[k for k in range(inner)] for i in range(col)] for j in range(row)]

def lprn(X):
    """This prints a list by lines. Not fancy, but works"""
    if isiterable(X):
        for line in X:
            print line
    else:
        print X

def isiterable(a):
    return hasattr(a, "__iter__")

Thanks to all who responded. There is already a noticeable improvement in the quality of the code thanks to my improved understanding. Of course, further thoughts are appreciated.