If you mean this document, then what is done is the following:
- train the parser with a generative model, i.e. one where you compute P(sentence | tree) and use Bayes' rule to invert this and get P(tree | sentence),
- apply this to get an initial k-best ranking of trees from the model,
- train a second model over arbitrary features of the desired trees,
- apply the second model to rerank the output of step 2 (a small numerical sketch of the Bayes step follows this list).
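For concreteness, here is a tiny sketch of the Bayes-rule step that produces the initial ranking. The trees and probabilities are invented for illustration and are not from the paper; a real parser would obtain them from a PCFG (or similar) and extract the k best with a chart parser.

```python
# P(tree | sentence) is proportional to P(sentence | tree) * P(tree).
candidates = {
    "tree_A": {"p_sent_given_tree": 1e-9,  "p_tree": 1e-4},
    "tree_B": {"p_sent_given_tree": 5e-10, "p_tree": 5e-4},
}
# P(sentence) is the same for every tree, so ranking by the numerator is enough.
joint = {t: v["p_sent_given_tree"] * v["p_tree"] for t, v in candidates.items()}
z = sum(joint.values())
posterior = {t: p / z for t, p in joint.items()}   # P(tree | sentence)
kbest = sorted(posterior, key=posterior.get, reverse=True)
print(posterior)   # {'tree_A': 0.2857..., 'tree_B': 0.7142...}
print(kbest)       # ['tree_B', 'tree_A']
```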
The reason the second model is useful is that in generative models (such as naive Bayes, HMMs, PCFGs) it can be hard to add features other than word identities: the model has to predict the probability of the exact feature vector rather than of individual features, and a vector that does not occur in the training data gets P(vector | tree) = 0 and therefore P(tree | vector) = 0 (smoothing helps, but the problem remains). This is the eternal data sparsity problem in NLP: you cannot build a training set that contains every single sentence you might want to process.
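To see the sparsity problem in runnable form, here is a toy count-based (maximum-likelihood) estimate of P(vector | tree); the feature values are invented purely for illustration:

```python
from collections import Counter

# Training data as whole feature vectors (invented values).
training_vectors = [
    ("VBD", "NP", "right-branching"),
    ("VBD", "NP", "left-branching"),
    ("VBZ", "NP", "right-branching"),
]
counts = Counter(training_vectors)
total = sum(counts.values())

def p_vector(vec):
    # Maximum-likelihood estimate: count(vector) / total
    return counts[vec] / total

print(p_vector(("VBD", "NP", "right-branching")))  # 0.333..., seen in training
print(p_vector(("VBG", "PP", "right-branching")))  # 0.0 -- unseen vector,
                                                   # so P(tree | vector) = 0 too
```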
Discriminative models such as MaxEnt handle feature vectors much better, but they take longer to train and can be harder to work with (although CRFs and neural networks have been used to build parsers as discriminative models). Collins et al. try to find a middle ground between fully generative and fully discriminative approaches.
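A rough sketch of what such a discriminative reranker looks like as a log-linear (MaxEnt-style) model over the k-best list. The feature names, weights, and candidates are made-up toy values, not anything from Collins et al.; a real reranker learns the weights from treebank data.

```python
import math

weights = {"log_p_generative": 1.0, "has_coordination": 0.8, "right_branching": 0.3}

def score(features):
    # Linear score: sum of weight * feature value.
    return sum(weights.get(name, 0.0) * value for name, value in features.items())

kbest = [  # feature vectors extracted from the k-best trees of stage 2
    {"log_p_generative": -12.1, "has_coordination": 1.0, "right_branching": 0.0},
    {"log_p_generative": -12.3, "has_coordination": 1.0, "right_branching": 1.0},
]
scores = [score(f) for f in kbest]
z = sum(math.exp(s) for s in scores)        # softmax normaliser
probs = [math.exp(s) / z for s in scores]   # P(tree | sentence) under the reranker
# Note the reranker prefers the second tree even though the generative
# model gave it a lower probability.
print(probs, "best:", max(range(len(kbest)), key=scores.__getitem__))
```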