Count occurrences of 1, 0, and -1 in the first column of a file in Python

I have the following data:

  1 3 4 2 6 7 8 8 93 23 45 2 0 0 0 1
  0 3 4 2 6 7 8 8 90 23 45 2 0 0 0 1
  0 3 4 2 6 7 8 6 93 23 45 2 0 0 0 1
  -1 3 4 2 6 7 8 8 21 23 45 2 0 0 0 1
  -1 3 4 2 6 7 8 8 0 23 45 2 0 0 0 1

The data above is in a file. I want to count how many times 1, 0, and -1 occur, but only in the first column. I read the file from standard input, and the only way I could think of is this:

  cnt = 0
  cnt1 = 0
  cnt2 = 0
  for line in sys.stdin:
      (t1, <having 15 different variables as that many columns are in files>) = re.split("\s+", line.strip())
      if re.match("+1", t1):
         cnt = cnt + 1
      if re.match("-1", t1):
         cnt1 = cnt1 + 1
      if re.match("0", t1):
         cnt2 = cnt2 + 1

How can I do this better? In particular, I would like to avoid the 15 extra variables, since this is the only place they are used.

6 answers

Use collections.Counter:

from collections import Counter
with open('abc.txt') as f:
    c = Counter(int(line.split(None, 1)[0]) for line in f)
    print(c)

Output:

Counter({0: 2, -1: 2, 1: 1})

Here str.split(None, 1) splits the line only once, at the first run of whitespace:

>>> s = "1 3 4 2 6 7 8 8 93 23 45 2 0 0 0 1"                                                
>>> s.split(None, 1)
['1', '3 4 2 6 7 8 8 93 23 45 2 0 0 0 1']

NumPy makes it even easier:

>>> import numpy as np
>>> from collections import Counter                                                         
>>> Counter(np.loadtxt('abc.txt', usecols=(0,), dtype=int).tolist())
Counter({0: 2, -1: 2, 1: 1})
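Since the question reads from standard input rather than from a named file, here is a minimal sketch of the same Counter idea wrapped in a function that accepts any iterable of lines. The function name count_first_column and the sample list are made up for illustration; they are not part of the original answer.

```python
from collections import Counter

def count_first_column(lines):
    """Count the values in the first whitespace-separated column."""
    counts = Counter()
    for line in lines:
        fields = line.split(None, 1)  # split once, at the first run of whitespace
        if fields:                    # skip blank lines
            counts[int(fields[0])] += 1
    return counts

# Sample rows taken from the question; any iterable of lines works.
sample = [
    "1 3 4 2 6 7 8 8 93 23 45 2 0 0 0 1\n",
    "0 3 4 2 6 7 8 8 90 23 45 2 0 0 0 1\n",
    "0 3 4 2 6 7 8 6 93 23 45 2 0 0 0 1\n",
    "-1 3 4 2 6 7 8 8 21 23 45 2 0 0 0 1\n",
    "-1 3 4 2 6 7 8 8 0 23 45 2 0 0 0 1\n",
]
counts = count_first_column(sample)
print(counts[1], counts[0], counts[-1])  # 1 2 2
```

To use it on standard input, pass sys.stdin (after import sys) instead of the sample list. A Counter returns 0 for keys it has never seen, so counts[1] is safe even if no line starts with 1.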

Split each line only once and count the first field in a dictionary:

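import sys

count = dict()
for line in sys.stdin:
    # Split only once: t1 is the first column, rest is the remainder of the line
    (t1, rest) = line.split(' ', 1)
    try:
        count[t1] += 1
    except KeyError:
        # First time this value is seen
        count[t1] = 1
for item in count:
    print('%s occurs %i times' % (item, count[item]))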

Instead of using tuple unpacking, which requires exactly as many variables as there are parts returned by split(), you can simply take the first element of the result:

parts = re.split(r"\s+", line.strip())
t1 = parts[0]

or, equivalently, just

t1 = re.split(r"\s+", line.strip())[0]
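If you are on Python 3 and still prefer unpacking, extended unpacking is another option. This is only a sketch and not part of the original answer; the sample line is copied from the question and the name rest is arbitrary.

```python
line = "-1 3 4 2 6 7 8 8 21 23 45 2 0 0 0 1\n"

# Extended unpacking (Python 3): the first field goes to t1,
# every remaining field is collected into the list `rest`.
t1, *rest = line.split()

print(t1)         # -1
print(len(rest))  # 15
```

Note that this raises ValueError on a blank line, because there is no first field to assign to t1.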
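import collections

def countFirstColumn(fileName):
    # Missing keys start at 0, so no KeyError handling is needed
    res = collections.defaultdict(int)
    with open(fileName) as f:
        for line in f:
            parts = line.split(None, 1)
            if not parts:
                continue  # skip blank lines
            res[parts[0]] += 1
    return res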

This is from my script, which reads from an input file; it also works with standard input in place of the file:

dictionary = {}

for line in someInfile:  # someInfile can be an open file or sys.stdin
    f = line.split()  # split() with no arguments also drops the trailing newline
    if not f:
        continue  # skip blank lines
    # Count in a single pass: a file or stdin iterator is exhausted after one loop
    dictionary[f[0]] = dictionary.get(f[0], 0) + 1

print(dictionary)
rows = []
for line in f:
    column = line.strip().split(" ")
    rows.append(column)

Then you get a two-dimensional list.

To print the 1st column:

for row in rows:
    print(row[0])

Output:

1
0
0
-1
-1
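Once the rows are in a list, the first column can also be collected with a list comprehension and counted from there. A small sketch: the rows literal below is a shortened, hypothetical stand-in for the list built by the loop above.

```python
from collections import Counter

# Hypothetical stand-in for the rows list built by the loop above
# (only the first three columns are shown).
rows = [
    ["1", "3", "4"],
    ["0", "3", "4"],
    ["0", "3", "4"],
    ["-1", "3", "4"],
    ["-1", "3", "4"],
]

first_column = [row[0] for row in rows]  # first field of every row
counts = Counter(first_column)           # keys are strings here, not ints

print(first_column)                            # ['1', '0', '0', '-1', '-1']
print(counts["0"], counts["-1"], counts["1"])  # 2 2 1
```

Keeping every row in memory is only worthwhile if the other columns are needed later; for counting alone, reading just the first field per line is enough.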