I order large batches of Landsat scenes from USGS, which are delivered as tar.gz archives. I am writing a simple Python script to unpack them. Each archive contains 15 TIFF images ranging in size from 60 to 120 MB, for a total of just over 2 GB. I can easily extract an entire archive with the following code:
import tarfile
fileName = "LT50250232011160-SC20140922132408.tar.gz"
tfile = tarfile.open(fileName, 'r:gz')
tfile.extractall("newfolder/")
I actually need only 6 of the 15 TIFFs, the ones identified as "bands" in their file names. These are some of the larger files, so together they account for about half the data. I therefore thought I could speed up the process by changing the code as follows:
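import tarfile
fileName = "LT50250232011160-SC20140922132408.tar.gz"
tfile = tarfile.open(fileName, 'r:gz')
# Read every member header and every member name from the archive
membersList = tfile.getmembers()
namesList = tfile.getnames()
# Keep only the members whose name contains "band"
bandsList = [x for x, y in zip(membersList, namesList) if "band" in y]
print("extracting...")
# Extract just the selected members
tfile.extractall("newfolder/", members=bandsList)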
script ( ). , , , , , .
, , , ? python tarfile , , , .
!
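One variation that might be worth timing is a single pass over the archive. As far as I understand it, getmembers() on a gzip-compressed tar has to decompress the whole stream just to find the headers, and extracting afterwards means going back through the compressed data again. The sketch below opens the archive in stream mode ("r|gz") and extracts each matching member as it is encountered. The function name extract_matching and its parameters are made up for illustration, and I have not measured whether this is actually faster on these scenes.

```python
import tarfile


def extract_matching(archive_path, dest_dir, keyword="band"):
    """Extract members whose name contains `keyword` in one pass.

    Hypothetical helper: the name and parameters are illustrative only.
    Returns the names of the members that were extracted.
    """
    extracted = []
    # "r|gz" treats the archive as a forward-only stream, so each member
    # is read once, in order, without seeking backwards.
    with tarfile.open(archive_path, "r|gz") as tfile:
        for member in tfile:
            if keyword in member.name:
                # In stream mode a member must be extracted while it is
                # the current one; it cannot be revisited later.
                tfile.extract(member, dest_dir)
                extracted.append(member.name)
    return extracted


if __name__ == "__main__":
    names = extract_matching(
        "LT50250232011160-SC20140922132408.tar.gz", "newfolder/"
    )
    print("extracted", len(names), "files")
```

Members that do not match still have to be decompressed so the stream can move past them, which means the total time cannot drop below the cost of reading the whole .tar.gz once. The only saving is in not writing the unwanted files to disk.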