I would build a hash using directory entries, like keys containing hashes (actually sets) of each list in which it was found. Iterate through each list, as each new record adds it to an external hash with one set (or hash) containing the identifier of the listing in which it first met. For any entry found in the hash, simply add the current list identifier to the set / hash value.
From there, you can simply process the sorted hash keys and create the rows of your resulting table.
Personally, I think Perl is ugly, but here's a sample in Python:
#!/usr/bin/env python import sys if len(sys.argv) < 2: print >> sys.stderr, "Must supply arguments" sys.exit(1) args = sys.argv[1:] # build hash entries by iterating over each listing d = dict() for each_file in args: name = each_file f = open(each_file, 'r') for line in f: line = line.strip() if line not in d: d[line] = set() d[line].add(name) f.close() # post process the hash report_template = "%-20s" + (" %-10s" * len(args)) print report_template % (("Dir Entries",) + tuple(args)) for k in sorted(d.keys()): row = list() for col in args: row.append("yes") if col in d[k] else row.append("no") print report_template % ((k,)+tuple(row))
Basically it should be legible, as if it were pseudo-code. The expressions (k,) and ("Dir Entries",) may look a little strange; but to make them be tuples that need to be unpacked into a format string using the % operator for strings. They can also be written as tuple([k]+row) for example (wrapping the first element in [] makes it a list that you can add to another list and convert everything to a tuple).
Also, translating to Perl should be fairly simple, just using hashes instead of dictionaries and collections.
(By the way, this example will work with an arbitrary number of lists, presented as arguments and displayed as columns. Obviously, after dozens of columns, the output will be rather cumbersome to print or display, but it was easy to generalize to).
source share