Large number of subdirectories/files

ticl
Fresh Boarder
Posts: 8
Joined: Wed Jan 06, 2010 1:39 pm

Large number of subdirectories/files

Post by ticl » Fri Jan 15, 2010 2:01 pm

I am planning to use OKM to store a large number of files (50 million+) in a bank. I would have roughly 5 million subdirectories under the root (one directory per customer), with an average of 5 files in each directory (scanned customer files). Can OKM handle such an archive? I am worried about the Taxonomy/FolderTree structure: if it tries to load the entire tree on startup, it would surely cause problems. Is it possible to get rid of the Taxonomy panel and add a small "Go to Folder" input box instead? What file system is recommended for storage? Is ZFS any good?

jllort
Moderator
Posts: 10929
Joined: Fri Dec 21, 2007 11:23 am
Location: Sineu - ( Illes Balears ) - Spain

Re: Large number of subdirectories/files

Post by jllort » Sat Jan 16, 2010 9:44 am

Hi,

The number of files is not a problem: OpenKM supports 50K files without problems, but you need some minimum hardware. In this case I recommend increasing memory to at least 4 GB.

You do have a problem, though :) 5k folders in the taxonomy is not a problem for the internal structure, but it really is a problem for the browser. Some changes to the code will be needed to adapt it to your needs, and they could be quite simple. Contact us through the contact form and we can talk about the modifications you need (I think it can be done in one week, but we will need to talk more about your installation).

Petr_Valenta
Senior Boarder
Posts: 47
Joined: Mon Mar 06, 2017 9:10 am

Re: Large number of subdirectories/files

Post by Petr_Valenta » Wed Oct 25, 2017 6:49 am

Hello

We have the same problem. Our folders are organized by the first letter of the surname and named A, B, ... and each sub-folder's name starts with the surname, followed by the first name and an additional ID.

Pagination has solved the "content" view, but loading the folder tree takes too long, displaying "Updating folder tree" (approx. 1 minute for 6k subfolders).

Could you please provide me with some advice?

Thank you very much in advance

Petr Valenta

Server:
VMWare 12 Pro
Memory 16 GB
1 proc - 4 cores
Centos 7 Server
PostgreSQL 9.2.18
OpenKM Community 6.3.4


Re: Large number of subdirectories/files

Post by jllort » Fri Oct 27, 2017 7:20 am

There is no way to render 6000 folders in the tree more quickly. The only solution is not to have so many subfolders, which basically means changing the logic behind your taxonomy: you must distribute these 6000 folders across an additional level (for good performance a folder should have 100-200 subfolders; 1000 folders is the limit at which things clearly start going wrong).

The reasons? First, the time to retrieve all the data (this is not the biggest problem, and you could enable the database cache, etc.). Second, the browser rendering time (this is the big problem, and it cannot be solved with server hardware; a faster desktop computer might help somewhat, but it is usually not a solution).

Note: You are using the application in the wrong way to solve your problem.
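
To illustrate the suggestion of spreading the folders over an additional level, here is a minimal sketch in Python. It is not OpenKM code: the function names, the two-character prefix, and the sample folder names are assumptions made for illustration. It groups folder names by a short prefix and reports any group that still exceeds the 100-200 subfolder guideline mentioned above.

```python
from collections import defaultdict


def bucket_folders(names, prefix_len=2):
    """Group folder names under an extra level keyed by their first characters.

    prefix_len is a hypothetical tuning knob: a longer prefix gives
    more, smaller buckets.
    """
    buckets = defaultdict(list)
    for name in names:
        key = name[:prefix_len].upper()
        buckets[key].append(name)
    # Return plain dict with sorted keys and sorted members for stable display.
    return {key: sorted(members) for key, members in sorted(buckets.items())}


def oversized(buckets, limit=200):
    """Return the buckets whose size is still above the chosen limit."""
    return {key: len(members) for key, members in buckets.items() if len(members) > limit}


if __name__ == "__main__":
    # Made-up sample names in a Surname_Firstname_ID pattern.
    sample = ["Novak_Jan_001", "Novotny_Petr_002", "Nemec_Karel_003", "Valenta_Petr_004"]
    grouped = bucket_folders(sample)
    for key, members in grouped.items():
        print(key, members)
    print("too large:", oversized(grouped, limit=1))
```

If some buckets remain too large (common surnames, for example), the same function could be applied again to those buckets with a longer prefix.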


Re: Large number of subdirectories/files

Post by Petr_Valenta » Mon Oct 30, 2017 7:15 am

Sorry, I cannot accept your note that I am using your application in the wrong way. The client wants to see the first letter of his clients' surnames on the first level, then the combination Surname+Firstname on the second level. Could you please give me some advice on how to satisfy his requirement in some other way?

Documentum (OpenText now) solves this by displaying a small number of folders (not all of them), and the last node in the folder tree is labeled "all xxx folders" (or something like that). After clicking it you have to wait for everything to load, but only the first time. When you leave the folder and then return, you no longer have to wait.

I think this can be solved, for example, by displaying the first n rows (n being the page size chosen for paging) followed by a node "xxx more" (or something like that), and by displaying n rows of content plus the paging controls in the right panel. If the user clicks the last node, "xxx more", he will wait for everything to load. If he clicks Next in the paging controls, simply read n more rows into the folder tree and show those n rows in the content panel. To make it even faster, you can display only the previous n folders, the current one in bold (or highlighted in some other way), and the next n folders, so each time you read just 2*n+1 folders.
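
The 2*n+1 window described above can be sketched roughly as follows. This is only an illustration in Python, not OpenKM or GWT code; the function name `tree_window` and the returned keys are made up for the example, and a real server would more likely fetch the slice with an offset/limit query than hold the whole sibling list in memory.

```python
def tree_window(folders, current, n):
    """Return the sibling folders to render around the current one.

    Shows at most n folders before and n folders after `current`,
    so no more than 2*n+1 nodes are sent to the tree at once.
    """
    idx = folders.index(current)  # raises ValueError if current is unknown
    start = max(0, idx - n)
    end = min(len(folders), idx + n + 1)
    return {
        "hidden_before": start,               # count for a "xxx more" node above
        "visible": folders[start:end],
        "hidden_after": len(folders) - end,   # count for a "xxx more" node below
    }


if __name__ == "__main__":
    # 6000 made-up sibling folders, matching the size discussed in this thread.
    siblings = [f"F{i:04d}" for i in range(6000)]
    window = tree_window(siblings, "F3000", 10)
    print(len(window["visible"]), window["hidden_before"], window["hidden_after"])
```

The two hidden counts are what the "xxx more" nodes would display; clicking one of them would trigger the full (slow) load only on demand.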


Re: Large number of subdirectories/files

Post by jllort » Wed Nov 01, 2017 10:14 am

I can accept without problems what you say about tree performance, but at present GWT does not give us an easy way to solve it. Here we have a technical problem and a restriction on the way the UI should be used.

You can create a custom UI; it is quite easy and fast to build with the SDK and the Spring Boot sample. With that approach you succeed on two points: an exact fit of features to your needs, and the best performance, showing only the features you need in the manner you wish. Do the users really need to navigate the folder structure, or might a simple search with two or three input boxes that then retrieves the data be easier? In a huge scenario where OpenKM is used as a very specific data container, as you are describing, I suggest a small custom UI in combination with the automatic catalog feature. Users should not need to know that there is a DMS behind the UI.

The last time I looked at Documentum, it was focused on something closer to a search engine view (filtering and paginating). I do not remember tree navigation in combination with a table list... If you can make some screenshots of it, they will be welcome, because we are always open to making improvements.

I have reviewed a lot of DMS tools, and for now I think there are two main directions:
1- Solutions based on search filtering (working with cabinet or document type concepts). You choose a cabinet or document type, and from there you can apply filters that show a paginated table of results (here you do not have any kind of tree).
2- Solutions based on a tree (taxonomy), where you navigate from the left side, and the right panel shows a paginated table of nodes.

Both solutions have advantages and disadvantages (I could give you a list of applications that go in one direction and others that go in the other; older DMS products started with option 1, and it seems newer ones later adopted option 2. That does not mean the first is better than the second, but in my opinion the tree option has some disadvantages and in some scenarios might not be a good idea). After looking at this in depth, I have come to the conclusion that it is not possible to build a UI that fits all user scenarios. At this point you can go in three directions:
1- The user must adapt to the UI restrictions.
2- The UI can be extended and/or modified (we have discarded this option; I will not explain all the reasons behind the decision, but I will share what for me is the most relevant one: here we are in an OSGi architecture, which is quite complex, and you need very highly qualified staff to work on it... this scenario has some disadvantages).
3- A custom UI based on the SDK + sample project (for me this is the best approach, the one with the fewest disadvantages relative to what you get and the maintenance it requires).
