This looks like an LDIF file. The python-ldap library includes a pure-Python LDIF parsing module (ldif), which can help if your file has some of the nasty gotchas that LDIF allows, for example Base64-encoded values, line folding, etc.
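To make those two gotchas concrete, here is a minimal sketch of what a hand-rolled reader would have to deal with. The sample record and the helper names (unfold, parse_attribute) are made up for illustration; this is not how the ldif module is implemented, and it ignores comments, URL values and multi-valued attributes.

```python
import base64

def unfold(lines):
    """Join LDIF continuation lines (lines that start with a single space)."""
    unfolded = []
    for line in lines:
        if line.startswith(' ') and unfolded:
            # Drop the leading space and glue onto the previous line.
            unfolded[-1] += line[1:]
        else:
            unfolded.append(line)
    return unfolded

def parse_attribute(line):
    """Split 'attr: value' or 'attr:: base64value' into (attr, text)."""
    attr, _, rest = line.partition(':')
    if rest.startswith(':'):
        # A double colon marks a Base64-encoded value.
        return attr, base64.b64decode(rest[1:].strip()).decode('utf-8')
    return attr, rest.lstrip(' ')

# Hypothetical record: a folded mail value and a Base64-encoded cn.
sample = [
    'dn: uid=jdoe,ou=people,dc=example,dc=edu',
    'LoginId: jdoe',
    'mail: john.doe@exam',
    ' ple.edu',
    'cn:: Sm9zw6k=',
]
record = dict(parse_attribute(line) for line in unfold(sample))
```

A naive line-by-line split on ": " would get both the mail and the cn value wrong here, which is the main argument for letting a real LDIF parser do the work.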
You can use it like this:
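    import csv
    import ldif

    class ParseRecords(ldif.LDIFParser):
        def __init__(self, input_file, csv_writer):
            # Let the base class set up its parser state for input_file.
            ldif.LDIFParser.__init__(self, input_file)
            self.csv_writer = csv_writer

        def handle(self, dn, entry):
            # Called once per record. Under Python 3 (python-ldap 3.x) each
            # attribute maps to a list of bytes, so take the first value and decode it.
            self.csv_writer.writerow([entry['LoginId'][0].decode('utf-8'),
                                      entry['mail'][0].decode('utf-8')])

    with open('/path/to/large_file', 'rb') as input_file, \
            open('output_file', 'w', newline='') as output:
        csv_writer = csv.writer(output)
        csv_writer.writerow(['LoginId', 'Mail'])
        ParseRecords(input_file, csv_writer).parse()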
Edit
So, to extract from a live LDAP directory using the python-ldap library, you would want to do something like this:
    import csv
    import ldap

    con = ldap.initialize('ldap://server.fqdn.system.edu')
It is probably worth reading the documentation for the ldap module, especially the example.
Note that in the example above I completely skipped supplying a filter, which you would probably want to do in production. A filter in LDAP is similar to the WHERE clause in an SQL statement; it restricts which objects are returned. Microsoft actually has a good guide to LDAP filters. The canonical reference for LDAP filters is RFC 4515.
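As a rough sketch of how a filter fits into the search, assuming Python 3 and a connection that has already been bound: the function below takes the connection and scope as arguments, so with python-ldap you would pass the object returned by ldap.initialize() and ldap.SCOPE_SUBTREE. The filter itself, the domain and the helper names are hypothetical. The escaping helper is hand-rolled for illustration; python-ldap ships ldap.filter.escape_filter_chars for the same job.

```python
def escape_filter_value(value):
    """Escape the characters RFC 4515 treats as special inside a filter value."""
    replacements = {'\\': r'\5c', '*': r'\2a', '(': r'\28', ')': r'\29',
                    '\x00': r'\00'}
    return ''.join(replacements.get(ch, ch) for ch in value)

def first_text(attrs, name):
    """Return the first value of an attribute as text, or '' if it is absent."""
    values = attrs.get(name, [])
    return values[0].decode('utf-8') if values else ''

def export_matches(con, base_dn, scope, ldap_filter, csv_writer):
    """Write LoginId and mail for every entry that matches ldap_filter."""
    # Only ask the server for the two attributes we actually need.
    results = con.search_s(base_dn, scope, ldap_filter, ['LoginId', 'mail'])
    for dn, attrs in results:
        if dn is None:
            continue  # search references (referrals) carry no entry data
        csv_writer.writerow([first_text(attrs, 'LoginId'),
                             first_text(attrs, 'mail')])

# Hypothetical filter: person entries whose mail address is in one domain.
domain = 'system.edu'
ldap_filter = '(&(objectClass=person)(mail=*@{}))'.format(
    escape_filter_value(domain))
```

The attribute list plays the role of the SELECT column list in the SQL analogy: the filter narrows the rows, the attribute list narrows the columns.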
Similarly, if there are potentially several thousand entries even after applying an appropriate filter, you may need to look into the LDAP paging control (simple paged results), although using it would, again, make the example more complex. Hopefully this is enough to get you started, but if anything comes up, feel free to ask or open a new question.
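For what it is worth, the paging loop usually looks something like the sketch below. It assumes the server supports the simple paged results control and that you pass in the control object yourself; with python-ldap that would be ldap.controls.SimplePagedResultsControl(True, size=500, cookie=''), where the page size of 500 is an arbitrary choice. The function name paged_search is made up for this example.

```python
def paged_search(con, base_dn, scope, ldap_filter, attrlist, page_control):
    """Yield (dn, attrs) pairs, fetching one page of results at a time."""
    while True:
        # Send the search together with the paging control (and its cookie).
        msgid = con.search_ext(base_dn, scope, ldap_filter, attrlist,
                               serverctrls=[page_control])
        _rtype, rdata, _rmsgid, response_ctrls = con.result3(msgid)
        for dn, attrs in rdata:
            if dn is not None:  # skip search references
                yield dn, attrs
        # The server answers with its own paging control holding the cookie
        # that identifies the next page.
        cookies = [ctrl.cookie for ctrl in response_ctrls
                   if ctrl.controlType == page_control.controlType]
        if not cookies or not cookies[0]:
            break  # an empty cookie means there are no more pages
        page_control.cookie = cookies[0]
```

Because it is a generator, you can write each row to the CSV file as it arrives instead of holding the whole result set in memory.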
Good luck.