Oracle: how to perform full-text search on XMLType?

I have an application storing XML in an Oracle table as XMLType. I want to perform full-text searches on this data. The Oracle documentation, in Full-Text XML Data Search, recommends using the contains SQL function, which requires the data to be indexed with a context index. The problem is that context indexes appear to be asynchronous, which does not fit my case: I need to be able to search the data immediately after adding it.

Can I somehow make this index synchronous? If not, what other technique should I use for full-text search on XMLType?

2 answers

The index cannot be maintained transactionally (i.e. it will not be updated so that a change is visible to a subsequent statement in the same transaction). The best you can do is have it updated on commit (SYNC ON COMMIT), as in:

     create index your_table_x on your_table(your_column)
       indextype is ctxsys.context
       parameters ('sync (on commit)');

Text indexes are complex things, and I would be surprised if you could get a text index that is transactional / ACID-compliant (i.e. transaction A inserts documents and sees them in the index, while they stay invisible to transaction B until the commit).
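To make the on-commit behaviour concrete, here is a minimal sketch of the insert / commit / query sequence. It assumes the index above exists and that your_table also has a numeric id column; the id column and the sample document are made up for illustration.

```sql
-- Assumption: your_table has a numeric id column next to the XMLType column,
-- and the context index was created with parameters ('sync (on commit)').
insert into your_table (id, your_column)
values (1, XMLType('<doc><title>Lady Gaga</title></doc>'));

-- Before the commit, the new row has not been added to the text index,
-- so this query is not expected to return id 1.
select id from your_table where contains(your_column, 'gaga') > 0;

commit;  -- the commit triggers the index sync

-- After the commit, the same query should find the new row.
select id from your_table where contains(your_column, 'gaga') > 0;
```

If the data has to be searchable before the commit, this approach will not help, and one of the non-indexed techniques is the fallback.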

  • You can update the index at regular intervals, in a cron-like manner. In the worst case, you can sync the index after every update to the table on which it is built, using sync_index. For example:

     EXEC CTX_DDL.SYNC_INDEX('your_index');

    I'm not a big fan of this technique because of the complexity it introduces. In addition to the cron-like aspect, you have to deal with index fragmentation, which may require a full update of the index from time to time. Update: instead of syncing the index at regular intervals, you can sync it on commit, as suggested by Gary, and this is really what you are looking for.

  • You can perform a plain text search on the XML document, as if you were doing ctrl-f on the XML in a text editor. In many cases this does not give the expected result, because users usually do not want a match when the string they are looking for only occurs in an element name, an attribute name, or a namespace. But if this method works for you, go for it: it's simple and pretty fast. For instance:

     select count(*) from your_table d where lower(d.your_column.getClobVal()) like '%gaga%'; 
  • Use existsNode() in the where clause, as in the example below. This has two potential problems. First, without proper indexes it is slower than the plain LIKE search above (about 2 times slower in my testing), and I'm not sure how to create an index on unstructured data that this query would use. Second, the match is case-sensitive, which is often not what you want. And you cannot just call the XPath lower-case() function, since it is an XPath 2.0 function and Oracle only supports XPath 1.0.

     select * from your_table where existsNode(your_column, '//text()[contains(., "gaga")]') = 1; 
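One possible workaround for the case sensitivity is the XPath 1.0 translate() function, which can fold upper-case letters to lower case inside the expression. The sketch below assumes that Oracle's XPath evaluator accepts translate() in existsNode(); I have not verified this on every version, so test it on yours. The search term itself must be written in lower case.

```sql
-- Lower-case each text node with translate() (XPath 1.0) before matching.
-- Only the ASCII letters A-Z are folded here; accented characters are not.
select *
  from your_table
 where existsNode(
         your_column,
         '//text()[contains(translate(., "ABCDEFGHIJKLMNOPQRSTUVWXYZ", "abcdefghijklmnopqrstuvwxyz"), "gaga")]'
       ) = 1;
```

This does nothing for the performance problem: the expression is still evaluated against every document.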
