• ZIP file names encoding

  • We tried to make OpenKM as intuitive as possible, but advice is always welcome.
Forum rules: Please, before asking something, check the documentation wiki or use the search feature of the forum. And remember we don't have a crystal ball and are not mind readers, so if you post about an issue, tell us which OpenKM version you are using and also the browser and operating system version. For more info read How to Report Bugs Effectively.
 #3195  by DYA
 
Hello!

I've just installed OpenKM and find it very useful! Thank you for this great software!
But it has a very serious problem that I'm not able to overcome.
The thing is, some of my files have non-Latin names. If I upload or download them one by one, everything is fine, but if I download them as a zip archive, the file names are broken. I guess that's because Java creates zip archives with Unicode file names.
I found similar topics on Google, but no solution. I've tried to open the downloaded archives with 7-zip and WinRAR, which claim to support UTF-8, but with no luck.
However, the file name displayed in 7-zip corresponds to a UTF-8 -> CP866 conversion, so I guess the zip is in UTF-8.

Is this a known issue? Does it have a simple workaround? Which archiver can properly read UTF-8 zip files?
Or, if there is no simple solution, should I get my hands dirty and hardcode the preferred zip encoding to CP866, just for my case?
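
For illustration, hardcoding the archive charset could look roughly like the sketch below. It assumes Java 7 or later, where java.util.zip.ZipOutputStream accepts a Charset, and a JDK that ships the IBM866 (CP866) charset. The class and method names are made up for this example and are not taken from the OpenKM code.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

public class ZipNameEncoding {

    // Hypothetical helper: builds an in-memory zip with a single entry whose
    // name is written using the given charset.
    static byte[] zipWithCharset(String entryName, byte[] content, Charset charset)
            throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ZipOutputStream zip = new ZipOutputStream(buffer, charset)) {
            zip.putNextEntry(new ZipEntry(entryName));
            zip.write(content);
            zip.closeEntry();
        }
        return buffer.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        // Four Cyrillic letters ("Test"), escaped so the source stays ASCII.
        String name = "\u0422\u0435\u0441\u0442.doc";
        byte[] content = "hello".getBytes(StandardCharsets.UTF_8);

        byte[] utf8Zip = zipWithCharset(name, content, StandardCharsets.UTF_8);
        // Assumption: the running JDK provides IBM866 (CP866).
        byte[] cp866Zip = zipWithCharset(name, content, Charset.forName("IBM866"));

        System.out.println("UTF-8 archive: " + utf8Zip.length + " bytes");
        System.out.println("CP866 archive: " + cp866Zip.length + " bytes");
    }
}
```

An archive written this way would only suit clients that expect CP866 names, so this is a per-installation choice rather than a general fix.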

Thanks,
Artyom.
 #3204  by jllort
 
Please upload a simple .zip example here (remember to keep the file size small, a few KB, because the forum limits it).

Which OS are you using? If it is Windows, that's the problem: the Windows ISO charset could be the cause when JBoss is running under Windows.
 #3208  by DYA
 
I'm using Windows 2003.
Sample archive: [file name=__1090___1077___1089___1090_.zip size=2022]http://www.openkm.com/images/fbfiles/fi ... _1090_.zip[/file]
The folder and file names should be four Cyrillic letters that look like "TecT/TecT.doc".

I was not able to open this archive using WinZip, WinRAR or 7-zip (I also tried to force Unicode mode using command line parameters).
But this file opened correctly in the Mac OS Finder, with no problems at all!

Thanks,
Artyom.
 #3213  by jllort
 
The problem is on Windows, another great job by Bill. OpenKM needs to create a temporary directory to unpack the zip file, and it relies on OS support for that. The Windows encoding is not UTF-8, unlike GNU/Linux or Mac, which is why you can extract this zip file without problems on those systems.

You might find out whether it's possible to add some language support to your Windows installation to solve this problem (I'm really not sure, but I think it could be possible), because OpenKM delegates the unzipping process to the OS.
 #4717  by treblereel
 
I have the same problem, and I have opened two tickets on the Mantis bug tracker: 0001355 and 0001352. But I have no idea how to fix it. As I understand it, OpenKM keeps files in UTF-8, while zip archives on Windows XP do not support UTF-8. So we need to create the archive in the right Windows charset, but how can OpenKM know which one? Parse the headers?
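
On parsing the headers: the ZIP format reserves bit 11 of each entry's general purpose flag to declare that the entry name is UTF-8. When the bit is clear, the archive does not say which charset was used, so the flag alone cannot identify a Windows code page. A minimal sketch of reading that bit from the first local file header follows; the class and method names are hypothetical, and it assumes the archive starts with a local file header (not an empty or self-extracting archive).

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

public class ZipUtf8Flag {

    private static final int LOCAL_HEADER_SIGNATURE = 0x04034b50;
    private static final int UTF8_NAME_FLAG = 1 << 11; // "language encoding" bit

    // Returns true if the first entry declares its name as UTF-8.
    static boolean firstEntryDeclaresUtf8(byte[] zipBytes) {
        if (zipBytes.length < 30) {
            throw new IllegalArgumentException("Too short for a local file header");
        }
        ByteBuffer header = ByteBuffer.wrap(zipBytes).order(ByteOrder.LITTLE_ENDIAN);
        if (header.getInt(0) != LOCAL_HEADER_SIGNATURE) {
            throw new IllegalArgumentException("Does not start with a local file header");
        }
        // The general purpose bit flag is the 16-bit field at offset 6.
        int flags = header.getShort(6) & 0xFFFF;
        return (flags & UTF8_NAME_FLAG) != 0;
    }

    // Hypothetical helper that produces a small archive to inspect.
    static byte[] sampleZip(Charset charset) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ZipOutputStream zip = new ZipOutputStream(buffer, charset)) {
            zip.putNextEntry(new ZipEntry("test.txt"));
            zip.write(new byte[] {1, 2, 3});
            zip.closeEntry();
        }
        return buffer.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        System.out.println(firstEntryDeclaresUtf8(sampleZip(StandardCharsets.UTF_8)));
        System.out.println(firstEntryDeclaresUtf8(sampleZip(StandardCharsets.ISO_8859_1)));
    }
}
```

A complete check would walk every entry, since the flag is set per entry and not per archive.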
 #4720  by jllort
 
We've solved the problem in OpenKM 5.0: we distinguish Windows zip files from others. But it's not yet solved in 4.0 and later 4.x versions.
 #4739  by pavila
 
I've backported this fix to OpenKM 4.1-RC2 which I hope to release very soon.
 #6439  by zlatan24
 
jllort wrote: We've solved the problem in OpenKM 5.0: we distinguish Windows zip files from others. But it's not yet solved in 4.0 and later 4.x versions.
I have many zip archives, and once some of them were damaged. Luckily I kept my head, searched Google and found a tool. It helped me within seconds, and I hope it helps you in this situation: repair zip file.
 #7230  by treblereel
 
Thanks, export to zip works great, but I have problems when importing a zip into OpenKM 5 RC1. Russian names get the wrong charset in OpenKM.
An example is attached.

PS: Thanks for any help.
Attachments
(1.1 KiB) Downloaded 318 times
 #7236  by pavila
 
Which tool (and tool version) did you use to create the zip?
 #7343  by pavila
 
Unicode support in ZIP is a pain. If I tweak the unzip routine to read one charset, it breaks for other charsets. The only real solution is to use the JAR format, which has good support for UTF-8. My recommendation is to try several applications to create the ZIP and compare the results. It will also help us improve non-Latin language support.
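
To make the trade-off concrete, here is a rough sketch of an unzip routine that reads entry names with a caller-supplied charset. It is not the OpenKM import code: the class and method names are invented, and it assumes Java 7 or later (where java.util.zip.ZipInputStream accepts a Charset) and a JDK that provides IBM866 and IBM850. Per the ZipInputStream documentation, the supplied charset is ignored for entries whose UTF-8 flag is set.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.Charset;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

public class ZipNameDecoding {

    // Lists entry names, decoding them with the given charset.
    static List<String> entryNames(byte[] zipBytes, Charset charset) throws IOException {
        List<String> names = new ArrayList<>();
        try (ZipInputStream zip =
                new ZipInputStream(new ByteArrayInputStream(zipBytes), charset)) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                names.add(entry.getName());
            }
        }
        return names;
    }

    // Hypothetical helper: simulates an archive made by a tool that writes
    // names in a legacy code page without setting the UTF-8 flag.
    static byte[] zipWithName(String entryName, Charset charset) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ZipOutputStream zip = new ZipOutputStream(buffer, charset)) {
            zip.putNextEntry(new ZipEntry(entryName));
            zip.write(new byte[] {1, 2, 3});
            zip.closeEntry();
        }
        return buffer.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        // Four Cyrillic letters ("Test"), escaped so the source stays ASCII.
        String name = "\u0422\u0435\u0441\u0442.doc";
        Charset cp866 = Charset.forName("IBM866"); // assumption: present in this JDK
        byte[] archive = zipWithName(name, cp866);

        // The right charset recovers the name; a wrong one yields garbled text.
        System.out.println(entryNames(archive, cp866));
        System.out.println(entryNames(archive, Charset.forName("IBM850")));
    }
}
```

The same bytes decode to different names under different charsets, which is why a single hardcoded charset breaks archives produced elsewhere.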

Note: I've tested this zip under Linux and the OpenKM zip import works fine.
Attachments
Russian words (taken randomly from wikipedia)
(329.89 KiB) Downloaded 367 times
 #7504  by treblereel
 
I have tested a Russian zip (with UTF-8) under OS X, Linux (Red Hat, CentOS) and FreeBSD: everything is OK. But not under Windows XP/7. :roll:

I have found out how to read, under Linux, the names in a zip archive like the one in my earlier post (created with the Windows Russian charset):

[root@zip]# unzip -Z1 Новая\ папка.zip | iconv -f cp1252 -t cp850 | iconv -f cp866
Новая папка/
Новая папка/Новая папка/
Новая папка/Новая папка (2)/
Новая папка/Новая папка (2)/Текстовый документ.txt
Новая папка/Новая папка/Документ Microsoft Word (2).doc
Новая папка/Новая папка/Документ Microsoft Word.doc

It's OK.

PS:
I don't know GWT very well, but I can try to add something like a dropdown menu in the import dialog with a "Rus Windows Option".
What do you think?
 #7505  by jllort
 
All your zip files will contain Russian file names, so in this case it seems more like a general configuration parameter than an option in the import form.
