Tuesday, August 7, 2012

Working with ZIP files in memory

Long time no see, right?

Today's topic is ZIP files. I'm going to tell you how to work with ZIP files in memory.

This comes in handy when your code has no access to an actual file persisted in the file system. The most common scenario, I think, is when a file is being uploaded or transferred through some endpoint.

As it turns out, Java has quite a lot of library support built into its java.util.zip package to make this magic happen (kind of logical that it's in the zip package, eh? I just DIDN'T see it before :P ).

These are the main classes we are going to be using:

  • java.util.zip.ZipEntry
  • java.util.zip.ZipInputStream
  • java.util.zip.ZipOutputStream

As you may have noticed already, if we can't access the actual file we should have access to something, right?
Well, that something is an InputStream. Not just any input stream, of course, but a ZipInputStream, so let's see how to create one out of any common InputStream:

// zipFileInputStream is whatever InputStream you were handed (an upload, an HTTP body, etc.)
InputStream zipFileInputStream;
ZipInputStream inputZipFile = new ZipInputStream(zipFileInputStream);

Pretty easy, right?
Of course, bear in mind that the use case here assumes you have access to an input stream; it could come from the actual file, an HTTP request, etc. The point is that once you have the input stream you won't need anything else to manage the ZIP file.
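
If you want to try this without any upload or endpoint at hand, here is a minimal sketch that wraps a plain byte array in a ByteArrayInputStream and lists what is inside. The class and method names (InMemoryZipListing, sampleZip, listEntryNames) are made up for this example, and the sample archive is built on the spot just so the snippet is self-contained.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

final class InMemoryZipListing {

    // Builds a tiny ZIP archive in memory so the example needs no file on disk.
    static byte[] sampleZip() throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ZipOutputStream zip = new ZipOutputStream(bytes)) {
            zip.putNextEntry(new ZipEntry("hello.txt"));
            zip.write("hello".getBytes(StandardCharsets.UTF_8));
            zip.closeEntry();
            zip.putNextEntry(new ZipEntry("docs/readme.md"));
            zip.write("# readme".getBytes(StandardCharsets.UTF_8));
            zip.closeEntry();
        }
        return bytes.toByteArray();
    }

    // Wraps any InputStream in a ZipInputStream and collects the entry names.
    static List<String> listEntryNames(InputStream source) throws IOException {
        List<String> names = new ArrayList<>();
        try (ZipInputStream zip = new ZipInputStream(source)) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                names.add(entry.getName());
            }
        }
        return names;
    }

    public static void main(String[] args) throws IOException {
        // The byte array stands in for whatever bytes you received.
        InputStream source = new ByteArrayInputStream(sampleZip());
        System.out.println(listEntryNames(source)); // prints [hello.txt, docs/readme.md]
    }
}
```

The only line that matters for real code is the new ZipInputStream(source) one; everything else is scaffolding to have some ZIP bytes to play with.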

Now we are going to see two use cases that may come in handy: first, how to read one or more particular files inside a ZIP file, and second, how to add content to the ZIP file.

Let us begin with the first use case, how to get a particular file from inside the ZIP file:

// fileList (a List<String>) and FILE_EXTENTION (lowercase, without the dot) are defined elsewhere.
// StringUtils is presumably the Apache Commons Lang one.
ZipEntry entryFile;
try {
  entryFile = inputZipFile.getNextEntry();
  while (entryFile != null) {
    String[] nameAndExtention = StringUtils.split(entryFile.getName(), ".");
    if (nameAndExtention.length >= 2) {
      String extension = nameAndExtention[nameAndExtention.length - 1].toLowerCase();
      if (extension.equals(FILE_EXTENTION)) {
        int n;
        byte[] buf = new byte[1024];
        ByteArrayOutputStream tmpOutStream = new ByteArrayOutputStream(1024);

        // read() returns -1 at the end of the current entry, not of the whole ZIP
        while ((n = inputZipFile.read(buf, 0, 1024)) > -1) {
          tmpOutStream.write(buf, 0, n);
        }

        // toString() decodes the bytes with the platform default charset
        fileList.add(tmpOutStream.toString());
      }
    }
    entryFile = inputZipFile.getNextEntry();
  }
} catch (IOException e) {
  e.printStackTrace();
}

So, as you can see, the first thing we use here is the ZipEntry class. This is what the ZipInputStream returns when we iterate through it. It represents a file inside the ZIP file and its related metadata.
So from this point onwards it's quite simple: you'll just need to play with the ZipEntry API to get what you need.
In my case I was looking for files with a certain extension (a word of advice, though: the getName method returns the entry's full name, including any directory path inside the archive).
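
Because getName gives you the whole path, splitting on "." can surprise you when a directory name contains a dot. Here is a hedged alternative that only looks at the last path segment and uses plain String methods instead of StringUtils. EntryFilter and hasExtension are names I made up for the sketch, and it treats dot-files like ".gitignore" as having no extension, which is my own choice.

```java
import java.util.Locale;
import java.util.zip.ZipEntry;

final class EntryFilter {

    // Returns true when the entry is a regular file whose name ends in ".<extension>".
    // The extension is passed without the dot and compared case-insensitively.
    static boolean hasExtension(ZipEntry entry, String extension) {
        if (entry.isDirectory()) {
            return false; // directory entries have names ending in '/'
        }
        String name = entry.getName();          // e.g. "docs/Report.XML"
        int lastSlash = name.lastIndexOf('/');  // -1 when the entry is at the root
        int lastDot = name.lastIndexOf('.');
        // No dot in the file name itself, or the file name starts with the dot.
        if (lastDot <= lastSlash + 1) {
            return false;
        }
        String actual = name.substring(lastDot + 1).toLowerCase(Locale.ROOT);
        return actual.equals(extension.toLowerCase(Locale.ROOT));
    }

    public static void main(String[] args) {
        System.out.println(hasExtension(new ZipEntry("docs/Report.XML"), "xml")); // prints true
        System.out.println(hasExtension(new ZipEntry("docs.v2/readme"), "xml"));  // prints false
        System.out.println(hasExtension(new ZipEntry("docs/"), "xml"));           // prints false
    }
}
```

Other ZipEntry getters such as getSize or getTime can be used for filtering in the same way, although sizes may be unknown (-1) while streaming.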

So once you have selected the files you need, you may want to read them, right?
As you may have guessed by now, this is achieved by this code:

int n;
byte[] buf = new byte[1024];
ByteArrayOutputStream tmpOutStream = new ByteArrayOutputStream(1024);

while ((n = inputZipFile.read(buf, 0, 1024)) > -1) {
  tmpOutStream.write(buf, 0, n);
}

fileList.add(tmpOutStream.toString());

And the first question should be: how does the ZipInputStream know which file to read? The answer is easy.
When you do:

inputZipFile.getNextEntry()

this works as a pointer for the ZipInputStream, which internally positions itself at the beginning of the entry it just returned to you. So each time you call getNextEntry, the pointer moves forward to the next entry.
Thus when you do:

inputZipFile.read

you are going to be reading until the end of the entry you are on, that is, the end of the file you selected. At that point read returns -1, even if more entries follow.
Finally, I just do a tmpOutStream.toString(), but only because the files I work with are text-based files.

In this way you can read the content of any file inside a ZIP file.
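
To see the whole read loop working end to end, here is a small self-contained sketch that reads every entry of an in-memory ZIP into a map of name to text. ZipTextReader, readTextEntries and sampleZip are hypothetical names for this example, and decoding as UTF-8 is an assumption on my side; use whatever charset your files really have.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

final class ZipTextReader {

    // Reads every non-directory entry and decodes its bytes as UTF-8 (an assumption).
    static Map<String, String> readTextEntries(InputStream source) throws IOException {
        Map<String, String> contents = new LinkedHashMap<>();
        try (ZipInputStream zip = new ZipInputStream(source)) {
            byte[] buffer = new byte[1024];
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                if (entry.isDirectory()) {
                    continue;
                }
                ByteArrayOutputStream entryBytes = new ByteArrayOutputStream();
                int n;
                // read() returns -1 at the end of the current entry only;
                // the next getNextEntry() call moves on to the following one.
                while ((n = zip.read(buffer)) != -1) {
                    entryBytes.write(buffer, 0, n);
                }
                contents.put(entry.getName(),
                        new String(entryBytes.toByteArray(), StandardCharsets.UTF_8));
            }
        }
        return contents;
    }

    // Two-entry archive built in memory, only to have something to read.
    static byte[] sampleZip() throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ZipOutputStream zip = new ZipOutputStream(bytes)) {
            zip.putNextEntry(new ZipEntry("a.txt"));
            zip.write("first".getBytes(StandardCharsets.UTF_8));
            zip.closeEntry();
            zip.putNextEntry(new ZipEntry("b.txt"));
            zip.write("second".getBytes(StandardCharsets.UTF_8));
            zip.closeEntry();
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        Map<String, String> contents = readTextEntries(new ByteArrayInputStream(sampleZip()));
        System.out.println(contents); // prints {a.txt=first, b.txt=second}
    }
}
```

Notice that each entry comes back with exactly its own content, which is the pointer behaviour described above in action.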

Now for the second use case: adding content to the ZIP file.
The idea is pretty much the same, BUT you'll have to do something like this:

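ByteArrayOutputStream outputStream = new ByteArrayOutputStream();
ZipOutputStream zipOutputFile = new ZipOutputStream(outputStream);

ZipEntry newZipentry = new ZipEntry("some_name");
zipOutputFile.putNextEntry(newZipentry);
zipOutputFile.write(newFileContent.getBytes());
zipOutputFile.closeEntry();

// close() writes the ZIP central directory; without it outputStream does not hold a complete ZIP
zipOutputFile.close();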


As you can see, you first need to create an output file.
So you'll say "BUT YOU SAID ADD CONTENT TO A ZIP FILE". Well, we ARE doing that; it's just a matter of creating your zipOutputFile from the original content: copy the existing entries into it, then add the new one.
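
Here is one way that copy-then-add idea could look, as a sketch rather than the one true way. ZipAppender, appendEntry, singleEntryZip and listNames are names invented for this example. It copies only entry names and data, so metadata such as timestamps and comments from the original entries is not preserved.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

final class ZipAppender {

    // Returns a new ZIP holding every entry of originalZip plus one extra entry.
    // putNextEntry throws a ZipException if entryName already exists in the archive.
    static byte[] appendEntry(byte[] originalZip, String entryName, byte[] content)
            throws IOException {
        ByteArrayOutputStream result = new ByteArrayOutputStream();
        try (ZipInputStream in = new ZipInputStream(new ByteArrayInputStream(originalZip));
             ZipOutputStream out = new ZipOutputStream(result)) {
            byte[] buffer = new byte[1024];
            ZipEntry entry;
            while ((entry = in.getNextEntry()) != null) {
                // A fresh ZipEntry carries only the name, so the sizes and CRC
                // are recomputed for the recompressed data.
                out.putNextEntry(new ZipEntry(entry.getName()));
                int n;
                while ((n = in.read(buffer)) != -1) {
                    out.write(buffer, 0, n);
                }
                out.closeEntry();
            }
            out.putNextEntry(new ZipEntry(entryName));
            out.write(content);
            out.closeEntry();
        }
        // The output stream is closed here, so the central directory is written.
        return result.toByteArray();
    }

    // Helper: a ZIP with a single UTF-8 text entry, used as the "original".
    static byte[] singleEntryZip(String name, String text) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ZipOutputStream zip = new ZipOutputStream(bytes)) {
            zip.putNextEntry(new ZipEntry(name));
            zip.write(text.getBytes(StandardCharsets.UTF_8));
            zip.closeEntry();
        }
        return bytes.toByteArray();
    }

    // Helper: entry names in archive order.
    static List<String> listNames(byte[] zipBytes) throws IOException {
        List<String> names = new ArrayList<>();
        try (ZipInputStream zip = new ZipInputStream(new ByteArrayInputStream(zipBytes))) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                names.add(entry.getName());
            }
        }
        return names;
    }

    public static void main(String[] args) throws IOException {
        byte[] original = singleEntryZip("old.txt", "old");
        byte[] updated = appendEntry(original, "new.txt", "new".getBytes(StandardCharsets.UTF_8));
        System.out.println(listNames(updated)); // prints [old.txt, new.txt]
    }
}
```

The resulting byte array can then go wherever the original stream came from, for example back out in an HTTP response.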


Anyway, I hope you've enjoyed it.
Also take a look at the following web sites I used to code this:






