Deleting millions of documents in MarkLogic

Does anyone have experience deleting millions of documents in MarkLogic? At the moment, I use simple XQuery queries to retrieve the URIs of the documents that need to be deleted, and then I use CoRB for the batch operation.

Is there a faster way to delete millions of documents when I already have a list of URIs?

+4
3 answers

There are several ways to solve this problem. First question: how do you get the document URIs? The best approach is to use the URI lexicon with cts:uris or cts:uri-match. Second, how do you perform the deletion? You can iterate over the URIs found and call xdmp:document-delete for each, but you could consider skipping all of the above and falling back on xdmp:collection-delete altogether. It works very efficiently. This requires that the documents to delete share a collection of their own, and that this collection can be removed in its entirety.
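The two options above might look roughly like the following. This is only a sketch: the URI pattern "/old/*" and the collection name "to-delete" are hypothetical, and cts:uri-match assumes the URI lexicon is enabled on the database. For millions of documents, a single transaction like Option A may be too large, so treat it as an illustration of the calls rather than a finished job.

```xquery
xquery version "1.0-ml";

(: Option A: look up URIs in the URI lexicon and delete them one by one.
   Assumption: the URI lexicon is enabled; "/old/*" is a hypothetical pattern. :)
for $uri in cts:uri-match("/old/*")
return xdmp:document-delete($uri);

xquery version "1.0-ml";

(: Option B: delete every document in one collection with a single call.
   Assumption: "to-delete" is a hypothetical collection that contains only
   documents that may be removed. :)
xdmp:collection-delete("to-delete")
```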

HTH!

+3

Calling xdmp:spawn or xdmp:spawn-function can be a little faster than CoRB, simply because it avoids the network round trip.
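A minimal sketch of the spawn approach, splitting the URIs into batches and handing each batch to the task server. The collection name "to-delete" and the batch size of 1000 are assumptions made for illustration, the URI lexicon is assumed to be enabled, and the `update` option assumes MarkLogic 9 or later (older releases use a `transaction-mode` option instead).

```xquery
xquery version "1.0-ml";

(: Assumption: 1000 is an arbitrary batch size; tune it for your cluster. :)
declare variable $batch-size as xs:integer := 1000;

(: Assumption: "to-delete" is a hypothetical collection; any cts:query
   that selects the documents to remove could be used here instead. :)
let $uris := cts:uris((), (), cts:collection-query("to-delete"))
let $batches := xs:integer(fn:ceiling(fn:count($uris) div $batch-size))
for $i in 1 to $batches
let $batch := fn:subsequence($uris, ($i - 1) * $batch-size + 1, $batch-size)
return
  (: Each batch is deleted in its own transaction on the task server. :)
  xdmp:spawn-function(
    function() {
      for $uri in $batch
      return xdmp:document-delete($uri)
    },
    <options xmlns="xdmp:eval">
      <update>true</update>
    </options>
  )
```

Keeping each spawned task to a bounded batch keeps the individual transactions small, and the task server works through the queue in parallel without any client round trips.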

If the documents are organized for it, xdmp:collection-delete or xdmp:directory-delete can also be faster. Ideally, though, each collection or directory should hold about 1,000 to 100,000 documents.

Finally, if you want to get rid of everything, it would be much faster to clear the forest or database. It may even be faster to export the content you want to keep (possibly using XQSync), clear the database, and then re-import it.
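Clearing can be done from the Admin interface, or from XQuery along these lines. This sketch assumes it is evaluated against the database you want to empty and that the user has the required privileges. It removes all documents in that database, so it only fits the "get rid of everything" case.

```xquery
xquery version "1.0-ml";

(: Clears every forest attached to the current database.
   WARNING: this removes ALL documents in that database. :)
xdmp:forest-clear(xdmp:database-forests(xdmp:database()))
```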

+3

Please also note that if you have created directories, deletion is much slower. If you do not need directories (they are only required for WebDAV), I suggest not using them; deleting will then be much faster.

+1