Read a zip file comment with Java

I was not able to find a built-in Java solution to read the file comment of a ZIP.

Reading the comment of a ZipEntry can easily be done by invoking ZipEntry.getComment(), but you cannot read the global archive comment of the ZIP file with it.

Therefore, I have implemented the following solution. It reads the last n bytes of the ZIP file and searches backward for the magic sequence (0x50 0x4b 0x05 0x06) that marks the end of the ZIP directory.

The ZIP file comment starts 22 bytes after the beginning of that magic sequence, and its length is stored in the two bytes just before it:

import java.io.File;
import java.io.RandomAccessFile;

public static String extractZipComment (String filename) {
    String retStr = null;
    try {
        File file = new File(filename);
        long fileLen = file.length();

        // RandomAccessFile can seek to the tail and read it completely;
        // InputStream.skip() and read() may process fewer bytes than requested.
        RandomAccessFile in = new RandomAccessFile(file, "r");
        try {
            /* The whole ZIP comment (including the magic byte sequence)
             * MUST fit in the buffer
             * otherwise, the comment will not be recognized correctly
             *
             * You can safely increase the buffer size if you like
             * (a comment is at most 65535 bytes, so 65535 + 22 always suffices)
             */
            byte[] buffer = new byte[(int) Math.min(fileLen, 8192)];

            in.seek(fileLen - buffer.length);
            in.readFully(buffer);

            if (buffer.length > 0) {
                retStr = getZipCommentFromBuffer (buffer, buffer.length);
            }
        } finally {
            in.close();
        }
    } catch (Exception e) {
        e.printStackTrace();
    }
    return retStr;
}

private static String getZipCommentFromBuffer (byte[] buffer, int len) {
    // Magic sequence that starts the end-of-directory record ("PK" 0x05 0x06)
    byte[] magicDirEnd = {0x50, 0x4b, 0x05, 0x06};
    int buffLen = Math.min(buffer.length, len);
    // Check the buffer from the end. The record is 22 bytes long without the
    // comment, so the last position where it can start is buffLen-22.
    for (int i = buffLen-22; i >= 0; i--) {
        boolean isMagicStart = true;
        for (int k=0; k < magicDirEnd.length; k++) {
            if (buffer[i+k] != magicDirEnd[k]) {
                isMagicStart = false;
                break;
            }
        }
        if (isMagicStart) {
            // Magic Start found!
            // The length is an unsigned 16-bit little-endian value. Java bytes are
            // signed, so each one is masked with 0xFF to avoid negative lengths.
            int commentLen = (buffer[i+20] & 0xFF) + (buffer[i+21] & 0xFF)*256;
            int realLen = buffLen - i - 22;
            System.out.println ("ZIP comment found at buffer position " + (i+22) + " with len="+commentLen+", good!");
            if (commentLen != realLen) {
                System.out.println ("WARNING! ZIP comment size mismatch: directory says len is "+
                    commentLen+", but file ends after " + realLen + " bytes!");
            }
            // Note: the bytes are decoded with the platform default charset
            String comment = new String (buffer, i+22, Math.min(commentLen, realLen));
            return comment;
        }
    }
    System.out.println ("ERROR! ZIP comment NOT found!");
    return null;
}
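
To check the approach end to end, here is a minimal, self-contained sketch. The class name ZipCommentDemo and its helper methods are made up for this example and are not part of the code above. It builds a small ZIP in memory with ZipOutputStream.setComment(), locates the magic sequence in the same backward fashion, and decodes the comment length both with and without the 0xFF mask. The comment is 200 bytes long on purpose: its low length byte is 0xC8, which is negative as a Java byte. The sketch assumes Java 7 or later (try-with-resources, StandardCharsets) and a plain ASCII comment.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

public class ZipCommentDemo {

    // Builds an in-memory ZIP with one entry and the given archive comment.
    static byte[] buildZip(String comment) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ZipOutputStream zip = new ZipOutputStream(bytes)) {
            zip.setComment(comment);
            zip.putNextEntry(new ZipEntry("hello.txt"));
            zip.write("hello".getBytes(StandardCharsets.US_ASCII));
            zip.closeEntry();
        } catch (IOException e) {
            // Not expected for an in-memory stream
            throw new IllegalStateException(e);
        }
        return bytes.toByteArray();
    }

    // Returns the offset of the magic sequence 50 4b 05 06, searching backward, or -1.
    static int findDirEnd(byte[] zip) {
        for (int i = zip.length - 22; i >= 0; i--) {
            if (zip[i] == 0x50 && zip[i + 1] == 0x4b
                    && zip[i + 2] == 0x05 && zip[i + 3] == 0x06) {
                return i;
            }
        }
        return -1;
    }

    // Unsigned 16-bit little-endian comment length at offset 20 of the record.
    static int commentLength(byte[] zip, int dirEnd) {
        return (zip[dirEnd + 20] & 0xFF) | ((zip[dirEnd + 21] & 0xFF) << 8);
    }

    // The same field decoded without masking, i.e. with signed-byte arithmetic.
    static int unmaskedCommentLength(byte[] zip, int dirEnd) {
        return zip[dirEnd + 20] + zip[dirEnd + 21] * 256;
    }

    // Extracts the archive comment (ASCII only in this sketch), or null if not found.
    static String readComment(byte[] zip) {
        int dirEnd = findDirEnd(zip);
        if (dirEnd < 0) {
            return null;
        }
        int len = Math.min(commentLength(zip, dirEnd), zip.length - dirEnd - 22);
        return new String(zip, dirEnd + 22, len, StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 200; i++) {
            sb.append('x');
        }
        byte[] zip = buildZip(sb.toString());
        int dirEnd = findDirEnd(zip);
        System.out.println("masked length:   " + commentLength(zip, dirEnd));
        System.out.println("unmasked length: " + unmaskedCommentLength(zip, dirEnd));
        System.out.println("comment matches: " + sb.toString().equals(readComment(zip)));
    }
}
```

With the 200-byte comment, the masked length is 200 while the unmasked calculation yields -56, which is the kind of negative value that can show up when the length bytes are added without masking.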

3 Responses to “Read a zip file comment with Java”

  1. Shirkit says:

    Dude, did you know this code just saved my life? I knew there was a magic secret to read zip comments. But I had to make a few changes for this code to work:

    First, I added a check: if commentLen is < 0, multiply it by -1 (yes, to avoid negative values) (after line 45).
    Second, I changed line 52. realLen was always 10 bytes smaller than the actual length of the comment. So, I just removed that Math.min and used just commentLen.

  2. Ivan says:

    I am having the reverse problem with my code.

    I am trying to figure out how to write the registered trademark ‘®’ and copyright character ‘©’ to the archive comment for a zip file. This is not a comment for the ZipEntry, although the solution may be similar; but, rather, the whole zip file comment, as you have retrieved in your example above.

    I have tried a lot of different things; but, at the end of the day, the setComment() method on the JarOutputStream (which extends ZipOutputStream) writes a “circumflex a” (i.e., an ‘Â’) before the extended ASCII characters.

    So, instead of:

    MySoftware®
    Copyright © 2011

    I get:

    MySoftwareÂ®
    Copyright Â© 2011

    when viewing the archive comments using WinZip or PKZIP or 7-ZIP or any other archive tool I have tried (including your code above).

    I have tried converting to Unicode; but, since the setComment() implementation only writes single bytes, I get a literal ‘\u00A9’ string in the comment.

    Can you tell me how to write (and/or replace) the comments to a zip file without using the setComment() method? I have tried appending the comment directly to the end of the file using other suggestions; but I am somehow corrupting the archive when doing so.

    I know that I could simply use ‘(R)’ and ‘(c)’ instead, but I would rather use the extended ASCII characters, as they look better. I also know that this can be done via WinZip’s command line utility; but I would like to use Java so I don’t have to buy a zip license just to add an archive comment to a jar file. I also know that this might have been fixed with Java 7 (with the addition of Charset arg in constructor); but I must use 1.6 for the time being.

    Thank you in advance for any help you can offer.

  3. Ivan says:

    I forgot to follow up with my original query.

    I solved the problem and posted it here (http://www.coderanch.com/t/530362/Streams/java/Zip-file-archive-comment-extended).

    Thought it might help, sorry for the delay.

    Ivan
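
For readers on Java 7 or later, the Charset constructor argument mentioned in the comments above can be sketched as follows. This is not necessarily the solution from the linked post, and it does not help on Java 1.6. The class name ZipCommentCharsetDemo and the roundTrip helper are made up for this example. It writes a comment containing ‘©’ with one charset and reads it back with another, using ZipOutputStream(OutputStream, Charset) and ZipFile(File, Charset). The ‘Â’ symptom is consistent with the comment being stored as UTF-8 and then displayed as a single-byte encoding, and the sketch reproduces that case as well.

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;
import java.util.zip.ZipOutputStream;

public class ZipCommentCharsetDemo {

    // Writes a one-entry ZIP whose comment is encoded with writeCharset,
    // then reads the comment back, decoding it with readCharset.
    static String roundTrip(String comment, Charset writeCharset, Charset readCharset) {
        File tmp = null;
        try {
            tmp = File.createTempFile("zip-comment-demo", ".zip");
            try (ZipOutputStream zip =
                     new ZipOutputStream(new FileOutputStream(tmp), writeCharset)) {
                zip.setComment(comment);
                zip.putNextEntry(new ZipEntry("readme.txt"));
                zip.write("hello".getBytes(StandardCharsets.US_ASCII));
                zip.closeEntry();
            }
            try (ZipFile zip = new ZipFile(tmp, readCharset)) {
                return zip.getComment();
            }
        } catch (IOException e) {
            throw new IllegalStateException(e);
        } finally {
            if (tmp != null) {
                tmp.delete();
            }
        }
    }

    public static void main(String[] args) {
        // \u00A9 is the copyright sign; the escape keeps the source file pure ASCII.
        String comment = "Copyright \u00A9 2011";

        // Single-byte encoding on both sides: the comment survives unchanged.
        String latin1 = roundTrip(comment,
                StandardCharsets.ISO_8859_1, StandardCharsets.ISO_8859_1);
        System.out.println("unchanged: " + comment.equals(latin1));

        // Written as UTF-8 (two bytes, C2 A9) but decoded as ISO-8859-1:
        // the extra C2 byte shows up as a circumflex a (\u00C2).
        String mixed = roundTrip(comment,
                StandardCharsets.UTF_8, StandardCharsets.ISO_8859_1);
        System.out.println("mismatch reproduced: "
                + "Copyright \u00C2\u00A9 2011".equals(mixed));
    }
}
```

Whether an archive tool then displays the single byte as ‘©’ depends on the encoding that tool assumes for comments, so ISO-8859-1 here is an assumption to verify against the tools in use.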
