Reading RAR file contents into memory in Python

I am looking for a way to read specific files from a RAR archive into memory. Specifically, they are a collection of numbered image files (I'm writing a comic book viewer). Although I could simply extract these files to disk and load them as needed (deleting them when done), I would rather avoid that if possible.

That said, I would prefer a solution that is cross-platform (Windows/Linux) if possible, but Linux is a must. Just as importantly, if you are going to point me to a library to handle this, please understand that it must be free (as in beer) or OSS.

+6
python linux stream rar
7 answers

The real answer: there is no library, and you can't create one. You can use rarfile, or you can use 7zip unRAR (which is less free than 7zip, but still free as in beer), but both approaches require an external executable. The RAR license basically requires this: while you can get the source code for unRAR, you can't modify it in any way, and turning it into a library would constitute an illegal modification.

In addition, solid RAR archives (the best-compressed ones) cannot be accessed randomly, so you have to unpack the whole thing anyway. WinRAR presents a user interface that seems to avoid this, but in reality it just unpacks and repacks the archive in the background.
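For the rarfile route mentioned in this answer, a minimal sketch might look like the following. It assumes the third-party rarfile package is installed and that an unrar-compatible executable is on the PATH (as noted above, rarfile delegates decompression to an external tool). The helper names natural_key and read_pages are made up for illustration and are not part of rarfile.

```python
import re


def natural_key(name):
    """Sort key that orders 'page2.jpg' before 'page10.jpg'."""
    parts = re.split(r"(\d+)", name)
    return [int(p) if p.isdigit() else p.lower() for p in parts]


def read_pages(archive_path, extensions=(".jpg", ".jpeg", ".png", ".gif")):
    """Yield (name, bytes) for each image in the archive, in page order.

    Nothing is written to disk by this code; each member is returned as bytes.
    """
    # Imported lazily so the sorting helper works without rarfile installed.
    import rarfile

    with rarfile.RarFile(archive_path) as rf:
        names = [n for n in rf.namelist() if n.lower().endswith(extensions)]
        for name in sorted(names, key=natural_key):
            yield name, rf.read(name)


# Page ordering can be checked without any archive at hand:
ordered = sorted(["page10.jpg", "page2.jpg", "page1.jpg"], key=natural_key)
```

With a real archive, something like `for name, data in read_pages("comic.cbr"): ...` would hand each page to an image decoder. For solid archives every read may still require unpacking the preceding files, so reading pages in order is likely the cheapest access pattern.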

+4

It appears that the limitation rarsoft places on derivative works is that you cannot use the unrar source code to create a variation of the RAR COMPRESSION algorithm. From the context, it seems to specifically allow people to use the code (modified or not) to decompress files, but you cannot use it if you intend to write your own compression code. Here is a direct quote from the license.txt file I just downloaded:

  • The UnRAR sources may be used in any software to handle RAR archives without limitations free of charge, but cannot be used to re-create the RAR compression algorithm, which is proprietary. Distribution of modified UnRAR sources in separate form or as a part of other software is permitted, provided that it is clearly stated in the documentation and source comments that the code may not be used to develop a RAR (WinRAR) compatible archiver.

Seeing as everyone seems to just want something that lets them write a comic viewer capable of reading images from CBR (RAR) files, I don't understand why people think anything prevents them from using the provided source code.

+2

RAR is a proprietary format; I don't think there are any public specifications, so third-party tool and library support is poor to nonexistent.

You are much better off using ZIP: it is completely free, has an accurate public specification, its compression library is available everywhere (zlib is one of the most widely deployed libraries in the world), and it is very easy to code for.

http://docs.python.org/library/zipfile.html
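As a rough sketch of the ZIP route, the standard-library zipfile module can return a member's bytes directly, with nothing extracted to disk. The member names and the in-memory archive below are stand-ins invented for the example; a real comic would be an existing ZIP (or CBZ) file on disk.

```python
import io
import zipfile


def read_member(archive, name):
    """Return the raw bytes of one archive member without extracting to disk.

    `archive` may be a filesystem path or a seekable file-like object.
    """
    with zipfile.ZipFile(archive) as zf:
        return zf.read(name)


# Build a tiny archive in memory to stand in for a real comic file.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
    zf.writestr("001.png", b"fake image bytes 1")
    zf.writestr("002.png", b"fake image bytes 2")

buf.seek(0)
page = read_member(buf, "002.png")
```

Unlike a solid RAR archive, each ZIP member is compressed independently, so reading page 2 does not require unpacking page 1 first.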

+1

The pyUnRAR2 library can extract files from RAR archives into memory (and to disk if you want). It is available under the MIT license and simply wraps UnRAR.dll on Windows and unrar on Unix. See its "QuickTutorial" page for usage examples.

On Windows, it can extract into memory (rather than to disk) with the (included) UnRAR.dll by setting a callback using RARSetCallback() and then calling RARProcessFile() with the RAR_TEST parameter instead of the RAR_EXTRACT parameter, so that no files are written to disk. The callback then watches for UCM_PROCESSDATA events to read the data. From the documentation for UCM_PROCESSDATA events: "Process unpacked data. It can be used to read a file while it is being extracted or tested without actually extracting the file to disk."

On Unix, unrar can simply print the file to stdout, so the library just reads from a pipe connected to unrar's stdout. This requires the unrar binary, whose "p" command prints a file to stdout. Use "apt-get install unrar" to install it on Ubuntu.
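The Unix approach described above can be sketched without any wrapper library by calling unrar through subprocess. This is not pyUnRAR2's actual code, just an illustration of the same idea; the function names are hypothetical, and it assumes an unrar binary on the PATH. The "-inul" switch (suppress unrar's own messages) is added so that stdout carries only the file data.

```python
import subprocess


def build_unrar_print_command(archive_path, member_name, unrar="unrar"):
    """Build the argument list for printing one member to stdout."""
    # "p" prints the member to stdout; "-inul" silences unrar's messages.
    return [unrar, "p", "-inul", archive_path, member_name]


def read_member(archive_path, member_name, unrar="unrar"):
    """Return one archive member as bytes by reading unrar's stdout pipe."""
    cmd = build_unrar_print_command(archive_path, member_name, unrar)
    result = subprocess.run(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        check=True,  # raise CalledProcessError if unrar reports a failure
    )
    return result.stdout


cmd = build_unrar_print_command("comic.cbr", "001.jpg")
```

Passing the arguments as a list (rather than a shell string) avoids quoting problems with file names that contain spaces.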

+1

Take a look at the Python "struct" module. With it you can interpret the RAR file format directly in your Python program, allowing you to retrieve the contents of the RAR without relying on external software to do it for you.

EDIT: This, of course, is vanilla Python - there are alternatives that use third-party modules (as already published).

EDIT 2: According to the Wikipedia article, my approach would require that you have permission from the format's author.
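To give a feel for what the struct approach involves, here is a small sketch that only inspects headers and does no decompression at all. It relies on the well-known RAR signature bytes and on the 7-byte RAR 4.x block header layout (CRC, type, flags, size, all little-endian); the function names are invented for the example. Actually unpacking file data would need the proprietary algorithm, which is where the permission caveat above applies.

```python
import struct

RAR4_SIGNATURE = b"Rar!\x1a\x07\x00"      # RAR 1.5 to 4.x archives
RAR5_SIGNATURE = b"Rar!\x1a\x07\x01\x00"  # RAR 5.0 and later


def rar_version(data):
    """Return 4 or 5 based on the leading signature, or None if not RAR."""
    if data.startswith(RAR5_SIGNATURE):
        return 5
    if data.startswith(RAR4_SIGNATURE):
        return 4
    return None


def parse_rar4_block_header(data, offset=0):
    """Decode the common 7-byte header that starts every RAR 4.x block."""
    head_crc, head_type, head_flags, head_size = struct.unpack_from(
        "<HBHH", data, offset
    )
    return {"crc": head_crc, "type": head_type,
            "flags": head_flags, "size": head_size}


# The RAR 4.x signature is itself a block (the "marker" block) of size 7.
marker = parse_rar4_block_header(RAR4_SIGNATURE)
```

In practice the first few bytes would come from `open(path, "rb").read(8)`; walking the remaining blocks means repeatedly advancing by each header's size field (plus any data size the block declares).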

0

The free 7zip library can also process RAR files.

0