I use several UIMA annotators in a pipeline. It runs tasks such as:
- tokenizer
- sentence splitter
- gazetteer
- My Annotator
The problem is that I don't want to write all annotations (tokens, sentences, subtitles, time, myAnnotations, etc.) to disk, because the files become very large very quickly.
I want to delete all annotations and keep only those created by My Annotator.
I work with the following libraries:
- uimaFIT 2.0.0
- ClearTK 1.4.1
- Maven
And I use org.apache.uima.fit.pipeline.SimplePipeline with:
SimplePipeline.runPipeline(
    UriCollectionReader.getCollectionReaderFromDirectory(filesDirectory), // directory with text files
    UriToDocumentTextAnnotator.getDescription(),
    StanfordCoreNLPAnnotator.getDescription(), // Stanford tokenize, ssplit, pos, lemma, ner, parse, dcoref
    AnalysisEngineFactory.createEngineDescription(
        XWriter.class,
        XWriter.PARAM_OUTPUT_DIRECTORY_NAME, outputDirectory,
        XWriter.PARAM_FILE_NAMER_CLASS_NAME, ViewURIFileNamer.class.getName())
);
What I'm trying to do is use the Stanford NLP annotator (from ClearTK) and remove the annotations I don't need.
How can I do this?
From what I know, you can call the removeFromIndexes() method on an Annotation instance.
UIMA- ?
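To make the removeFromIndexes() idea concrete, here is an untested sketch of the kind of component I have in mind. The class name KeepOnlyAnnotator and the parameter keepTypeName are names I made up for illustration; they are not part of uimaFIT or ClearTK. It assumes the annotations I want to keep share one UIMA type (or subtypes of it), and it leaves the DocumentAnnotation in place. It only touches Annotation subtypes in the current view, so other indexed feature structures would remain, as would anything still referenced by a kept annotation.

```java
import java.util.ArrayList;
import java.util.List;

import org.apache.uima.analysis_engine.AnalysisEngineProcessException;
import org.apache.uima.cas.Type;
import org.apache.uima.cas.TypeSystem;
import org.apache.uima.fit.component.JCasAnnotator_ImplBase;
import org.apache.uima.fit.descriptor.ConfigurationParameter;
import org.apache.uima.fit.util.JCasUtil;
import org.apache.uima.jcas.JCas;
import org.apache.uima.jcas.tcas.Annotation;
import org.apache.uima.jcas.tcas.DocumentAnnotation;

// Hypothetical helper: removes every annotation from the indexes except
// those of one configured type (and its subtypes) and the DocumentAnnotation.
public class KeepOnlyAnnotator extends JCasAnnotator_ImplBase {

  // Hypothetical parameter name: fully qualified UIMA type name to keep.
  public static final String PARAM_KEEP_TYPE_NAME = "keepTypeName";

  @ConfigurationParameter(name = PARAM_KEEP_TYPE_NAME, mandatory = true)
  private String keepTypeName;

  @Override
  public void process(JCas jCas) throws AnalysisEngineProcessException {
    TypeSystem typeSystem = jCas.getTypeSystem();
    Type keepType = typeSystem.getType(keepTypeName);
    if (keepType == null) {
      throw new AnalysisEngineProcessException(
          new IllegalArgumentException("Unknown type: " + keepTypeName));
    }

    // Collect first, remove afterwards: modifying the index while
    // iterating over it is not safe.
    List<Annotation> toRemove = new ArrayList<Annotation>();
    for (Annotation annotation : JCasUtil.select(jCas, Annotation.class)) {
      boolean keep = annotation instanceof DocumentAnnotation
          || typeSystem.subsumes(keepType, annotation.getType());
      if (!keep) {
        toRemove.add(annotation);
      }
    }
    for (Annotation annotation : toRemove) {
      annotation.removeFromIndexes();
    }
  }
}
```

If this is the right approach, I assume it would go into the pipeline just before the XWriter, created with AnalysisEngineFactory.createEngineDescription(KeepOnlyAnnotator.class, KeepOnlyAnnotator.PARAM_KEEP_TYPE_NAME, "...") where "..." is the fully qualified type name of my own annotation.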