upload speed

Posted: Thu Jan 14, 2021 7:42 am
by kaajsoft
Hello,
Is there any way to upload large files faster? It takes about 15 minutes to upload a 1 GB file.

Re: upload speed

Posted: Fri Jan 15, 2021 7:19 am
by jllort
Usually this is a network problem. Are your desktop and OpenKM on the same network?
Do you have any kind of automation configured?
Do you have an antivirus enabled?
Finally, which OpenKM version are you talking about?

Re: upload speed

Posted: Fri Jan 15, 2021 9:28 pm
by kaajsoft
I'm talking about the latest Community version.
My network is okay; we transfer files over FTP very fast.
There is no automation and no antivirus.

Re: upload speed

Posted: Sat Jan 16, 2021 7:14 pm
by jllort
Describe your upload process (type of files, total amount, etc.).

Re: upload speed

Posted: Sun Jan 17, 2021 1:40 pm
by kaajsoft
Hello,
I have debugged the code.
This line in the file upload servlet takes a long time:
List<FileItem> items = upload.parseRequest(request);
I uploaded just one file, 1 GB in size.

Re: upload speed

Posted: Fri Jan 22, 2021 6:58 pm
by jllort
It reads the stream (in the background the file is not kept in memory; first it is stored in a temporary file in the file system, and later it is copied to its final destination).
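
For readers following along, the two-stage flow described above can be sketched with plain JDK classes. This is only an illustration of the pattern, not OpenKM's or Commons FileUpload's actual code; the names `SpoolThenCopy`, `spoolToTemp` and `copyToFinal` are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical illustration: the uploaded bytes are written to disk twice.
final class SpoolThenCopy {

    // Stage 1: drain the incoming stream into a temporary file.
    static Path spoolToTemp(InputStream body) throws IOException {
        Path tmp = Files.createTempFile("upload-", ".tmp");
        Files.copy(body, tmp, StandardCopyOption.REPLACE_EXISTING);
        return tmp;
    }

    // Stage 2: read the temporary file back and write it to its destination,
    // then remove the temporary file.
    static void copyToFinal(Path tmp, Path destination) throws IOException {
        try (InputStream in = Files.newInputStream(tmp)) {
            Files.copy(in, destination, StandardCopyOption.REPLACE_EXISTING);
        } finally {
            Files.deleteIfExists(tmp);
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] payload = "example upload".getBytes();
        Path destination = Files.createTempFile("stored-", ".bin");
        Path tmp = spoolToTemp(new ByteArrayInputStream(payload));
        copyToFinal(tmp, destination);
        System.out.println(Files.size(destination) + " bytes stored");
        Files.delete(destination);
    }
}
```

With a large file, each stage reads or writes the full payload, so the time for both stages adds up.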

Re: upload speed

Posted: Fri Jan 22, 2021 8:09 pm
by kaajsoft
Okay, I know that.
What do you think about moving the file instead of copying it? This is implemented in Commons FileUpload, which doesn't give us access to the file, but I think it is possible to access the file and move it.
Moving is usually faster.
Let me know what you think.
Thank you
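
The file-system half of this idea can be sketched with the JDK's `Files.move`. This is a minimal sketch under assumptions: `MoveInsteadOfCopy` and `moveIntoPlace` are hypothetical names, the temporary file is assumed to be reachable as a `Path`, and nothing here shows how a document would be registered with OpenKM.

```java
import java.io.IOException;
import java.nio.file.AtomicMoveNotSupportedException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical helper, not part of OpenKM or Commons FileUpload.
final class MoveInsteadOfCopy {

    // Try an atomic move first (a rename when both paths are on the same
    // file system). If that is not supported, fall back to an ordinary move,
    // which may copy and delete when the paths are on different file systems.
    static void moveIntoPlace(Path tmp, Path destination) throws IOException {
        try {
            Files.move(tmp, destination, StandardCopyOption.ATOMIC_MOVE);
        } catch (AtomicMoveNotSupportedException e) {
            Files.move(tmp, destination, StandardCopyOption.REPLACE_EXISTING);
        }
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("repo-");
        Path tmp = Files.createTempFile(dir, "upload-", ".tmp");
        Files.write(tmp, new byte[1024]);
        Path destination = dir.resolve("document.bin");

        moveIntoPlace(tmp, destination);
        System.out.println("moved: " + Files.exists(destination));

        Files.delete(destination);
        Files.delete(dir);
    }
}
```

A move only avoids rewriting the data when the temporary directory and the destination are on the same file system. Commons FileUpload 1.x also exposes `FileItem.write(File)`, whose documentation says an implementation may rename rather than copy where possible; whether that fits OpenKM's upload path is not established in this thread.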

Re: upload speed

Posted: Tue Jan 26, 2021 8:21 am
by kaajsoft
??

Re: upload speed

Posted: Sat Jan 30, 2021 7:55 am
by jllort
I do not follow you on the "move" scenario. The problem is not moving binary data in the file system; the problem is that the data must also be registered at the database level, and for that you must go through the API. I do not see any shortcut for it.

I suppose you have not enabled the option to keep binary data in the database?

Usually file uploading is fast. Maybe you are thinking of it as moving files between folders in the file system, which is the wrong concept; it is closer to uploading a file to an application in the cloud (but because it is in your network, it should be faster). Anyway, it is clear that something strange is going on, because 1 GB should upload in a few seconds, certainly not minutes. It may be a hardware issue, a configuration issue, or software such as an antivirus causing the delay (it is a very bad idea to combine an antivirus with a DMS).

I suggest testing on a clean computer to isolate the issue. Is your preferred OS Linux or Windows?

What is your preferred database?

Do you have the configuration parameter named "upload.throttle.filter" enabled? When it is enabled, a bandwidth filter is applied to uploads.

Re: upload speed

Posted: Wed Feb 03, 2021 8:57 pm
by kaajsoft
Hello, my OS is CentOS, my database is PostgreSQL, and files are stored on disk, not in the database.
When we upload a 2 GB file over FTP in my network it takes 2 minutes.
When I upload that file to OpenKM it takes about 4 minutes! I have found the problem:
When we upload a file to OpenKM, first Commons FileUpload stores the file, and then OpenKM reads that file through an InputStream and copies it to its place. This copy is the problem, I think. What I am suggesting is to move that file instead of copying it.
Do you understand me?
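
The figures quoted in this thread can be turned into rough throughput numbers for comparison. The sketch below assumes 1 GB means 1024 MB, which the posts do not specify; `UploadThroughput` and `megabytesPerSecond` are hypothetical names.

```java
// Rough throughput arithmetic for the upload times quoted in this thread.
// Assumption: 1 GB is taken as 1024 MB.
final class UploadThroughput {

    // Average rate in MB/s for a transfer of the given size and duration.
    static double megabytesPerSecond(double gigabytes, double minutes) {
        return gigabytes * 1024.0 / (minutes * 60.0);
    }

    public static void main(String[] args) {
        System.out.printf("OpenKM, 1 GB in 15 min: %.2f MB/s%n", megabytesPerSecond(1, 15));
        System.out.printf("FTP, 2 GB in 2 min: %.2f MB/s%n", megabytesPerSecond(2, 2));
        System.out.printf("OpenKM, 2 GB in 4 min: %.2f MB/s%n", megabytesPerSecond(2, 4));
    }
}
```

Under that assumption the three cases work out to roughly 1.1 MB/s, 17.1 MB/s and 8.5 MB/s respectively.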

Re: upload speed

Posted: Sat Feb 06, 2021 12:33 pm
by jllort
Did you debug the code to arrive at this conclusion?

You should enable debug logging for the following classes (take a look at log-back.xml in the tomcat-XXX folder for this purpose):
com.openkm.servlet.frontend.FileUploadServlet
com.openkm.api.OKMDocument
com.openkm.module.db.DbDocumentModule
com.openkm.module.db.base.BaseDocumentModule
com.openkm.dao.NodeDocumentDAO
You should also enable profiling and Hibernate statistics in administration, and look for methods and queries with high execution times.
I also suggest modifying the source code to add extra log statements that measure time (if needed; profiling should be enough to identify the methods or queries that take a lot of time).
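
As one possible shape for such timing logs, here is a minimal sketch of a helper that wraps a step and logs its duration. `StepTimer` and `timed` are hypothetical names, and `java.util.logging` is used only to keep the example dependency-free; in the OpenKM source you would presumably use the logger the class already has.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.TimeUnit;
import java.util.logging.Logger;

// Hypothetical timing helper for ad hoc measurements; not part of OpenKM.
final class StepTimer {
    private static final Logger LOG = Logger.getLogger(StepTimer.class.getName());

    // Runs the step, logs how long it took (even if it throws),
    // and returns the step's result.
    static <T> T timed(String label, Callable<T> step) throws Exception {
        long start = System.nanoTime();
        try {
            return step.call();
        } finally {
            long elapsedMs = TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
            LOG.info(label + " took " + elapsedMs + " ms");
        }
    }

    public static void main(String[] args) throws Exception {
        // Stand-in for a slow call such as upload.parseRequest(request).
        Integer result = timed("simulated step", () -> {
            Thread.sleep(50);
            return 42;
        });
        System.out.println("result = " + result);
    }
}
```

Wrapping the request parsing and the later copy into the repository separately would show which of the two accounts for most of the upload time.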