Sorting a large number of files into a hierarchical tree structure in Java

I have a large number of files (several thousand XML files), and I need to write a graphical interface in Java that sorts these files into a tree structure based on the "Category" elements in the XML data of each file. The program may be run several times a day, and small changes or additions can be made to the files daily.

How can I save this sorted structure so as to minimize load time on subsequent launches of the application? Unfortunately, the program will work with files on a USB hard drive, so I am trying to avoid parsing every XML document each time the application starts just to build the tree.

For example, each XML file may have several categories (e.g. "Person" with the value "Fred" and "Organization" with the value "Google"), and I would like to let the user select groups of files based on these category values in the GUI.

Thank you in advance for any help =)

+4
2 answers

Ok, here is what you need to do.

  • Create an SQL database that stores BOTH the file names and the associated XML tree structure data.
    • MySQL is a good free option.
  • When the application starts, scan the directory for file names and compare that list with the file names stored in the database.
    • Any names that are not indexed yet should be parsed and added to the database.
    • Create a new thread to work through these unindexed files so that the user does not notice any delay.
  • Include a button in the application called "Recover Cache".
    • Add a note such as “Only click this button when a file has changed” or something similar.
    • Let the user tell your application when an existing file has changed, since that almost never happens.
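
A minimal sketch of the startup scan described above, under some assumptions: the class and method names are made up for illustration, the files are assumed to end in `.xml`, and the `indexedNames` set stands in for whatever a query against the database would return. It only compares file names, so no XML is parsed for files that are already indexed.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Set;
import java.util.TreeSet;
import java.util.function.Consumer;

// Hypothetical helper; the name is not from the answer.
class UnindexedFileScanner {

    /**
     * Returns the names of the *.xml files in dir that are not in indexedNames.
     * indexedNames is assumed to hold the file names already stored in the database.
     */
    static Set<String> findUnindexed(Path dir, Set<String> indexedNames) throws IOException {
        Set<String> unindexed = new TreeSet<>();
        // Only directory entries are read here; file contents are not opened.
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(dir, "*.xml")) {
            for (Path file : stream) {
                String name = file.getFileName().toString();
                if (!indexedNames.contains(name)) {
                    unindexed.add(name);
                }
            }
        }
        return unindexed;
    }

    /** Runs the scan on a background thread and hands the result to onDone. */
    static Thread scanInBackground(Path dir, Set<String> indexedNames,
                                   Consumer<Set<String>> onDone) {
        Thread worker = new Thread(() -> {
            try {
                onDone.accept(findUnindexed(dir, indexedNames));
            } catch (IOException e) {
                e.printStackTrace(); // a real application would report this to the user
            }
        }, "index-scanner");
        worker.setDaemon(true); // do not keep the JVM alive just for the scan
        worker.start();
        return worker;
    }
}
```

In a Swing application the `onDone` callback would typically pass the result on via `SwingUtilities.invokeLater` before touching any GUI component.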

As an alternative to the second and third steps above, you can do this:

  • Create a daemon task.
    • This will be a standalone program that maintains the database.
    • It watches the XML directory for changes and updates the database accordingly.
    • It can also check the other files for changes periodically, for example once a day at 2 a.m.
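
The periodic check could look roughly like the sketch below. It is only one possible approach, not something the answer prescribes: it assumes that a changed last-modified timestamp is a good enough signal, that the files end in `.xml`, and that the previously stored timestamps (here a plain `Map`) would come from the database. All names are hypothetical.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper; the name is not from the answer.
class ModifiedFileChecker {

    /** Reads the current last-modified time (epoch millis) of every *.xml file in dir. */
    static Map<String, Long> snapshot(Path dir) throws IOException {
        Map<String, Long> times = new HashMap<>();
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(dir, "*.xml")) {
            for (Path file : stream) {
                times.put(file.getFileName().toString(),
                          Files.getLastModifiedTime(file).toMillis());
            }
        }
        return times;
    }

    /** Returns the names that are new or whose timestamp differs from the stored one. */
    static Set<String> changedSince(Map<String, Long> stored, Map<String, Long> current) {
        Set<String> changed = new TreeSet<>();
        for (Map.Entry<String, Long> entry : current.entrySet()) {
            Long old = stored.get(entry.getKey());
            if (old == null || !old.equals(entry.getValue())) {
                changed.add(entry.getKey());
            }
        }
        return changed;
    }
}
```

Only the files returned by `changedSince` would need to be parsed again. Deleted files are not handled in this sketch; they are the names present in `stored` but missing from `current`.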
+1

Do not read and parse every file again each time the files need to be displayed. You can store the data from the XML files in a different format that can be read quickly and efficiently. A relational database is a perfect fit for this.

So here is what you need to do:

  • Install an SQL engine. I am not a licensing specialist, but MySQL should do what you need, and it's free. Create a table that matches the structure of your XML files.
  • Write a system service that monitors the file system for changes (you can use FileSystemWatcher from .NET). You can use Java instead of C#; in versions before Java 7 this requires periodic polling, whereas Java 7 and later offer java.nio.file.WatchService.
  • Each time a change occurs, the service takes the file and sends it to the SQL database. There you can easily parse the file with MySQL's ExtractValue(xml, xpath) function in a SELECT. Once you have the data, you write it to the table as an insert (new files) or an update (edited files).
  • Each time you need to load the files into the tree, you run a simple SELECT statement against the database that returns the data you need.
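
If you would rather do the parsing on the Java side than with ExtractValue, a rough sketch could look like this. The XML layout is an assumption, since the question does not show it: each category is taken to be an element of the form `<Category name="Person">Fred</Category>`. The class name is made up, and the in-memory map stands in for the database table.

```java
import java.io.InputStream;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper; the name and the assumed XML layout are not from the answer.
class CategoryIndex {

    // category name -> category value -> names of the files carrying that value
    private final Map<String, Map<String, List<String>>> groups = new TreeMap<>();

    /** Parses one document and records every Category element found in it. */
    void addFile(String fileName, InputStream xml) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = builder.parse(xml);
        NodeList nodes = doc.getElementsByTagName("Category");
        for (int i = 0; i < nodes.getLength(); i++) {
            Element category = (Element) nodes.item(i);
            String name = category.getAttribute("name");       // e.g. "Person"
            String value = category.getTextContent().trim();   // e.g. "Fred"
            groups.computeIfAbsent(name, k -> new TreeMap<>())
                  .computeIfAbsent(value, k -> new ArrayList<>())
                  .add(fileName);
        }
    }

    /** Returns the files recorded for the given category name and value, or an empty list. */
    List<String> filesFor(String name, String value) {
        Map<String, List<String>> byValue =
                groups.getOrDefault(name, Collections.emptyMap());
        return byValue.getOrDefault(value, Collections.emptyList());
    }
}
```

The nested map mirrors the two upper levels of the tree (category name, then value), so each `(name, value, fileName)` triple corresponds to one row that would be inserted into the table.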
+1
