In many natural language processing applications, two or more models usually have to be combined to achieve good accuracy. However, it is difficult for minor models, such as “backoff” taggers in part-of-speech tagging, to cooperate smoothly with the major probabilistic model. We introduce a two-stage approach to model selection between hidden Markov models and other minor models. In the first stage, the major model is extended to produce a set of candidates for model selection. A parameter-weighted hidden Markov model is presented, which uses a weighting ratio to create the candidate set. In the second stage, heuristic rules and features serve as evaluation functions that assign extra scores to the candidates in the set. These scores are computed with a diagnostic likelihood ratio test based on sensitivity and specificity criteria. The selection procedure is then carried out with a swarm optimization technique. Experimental results on public tagging data sets show the applicability of the proposed approach.
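
The following is a minimal, illustrative sketch of the two-stage idea summarized above, not the authors' implementation. It assumes a simple scheme in which a weighting ratio mixes HMM transition and emission log-probabilities to generate candidate taggings (stage one), and each candidate then receives an extra score from a diagnostic likelihood ratio (DLR = sensitivity / (1 - specificity)) estimated on development data (stage two). All function names, the mixing scheme, and the scoring constants are hypothetical; the paper's swarm optimization step is replaced here by a plain argmax over candidates.

```python
import math


def viterbi(words, tags, trans, emit, w):
    """Viterbi decoding where transition log-probs are scaled by w and
    emission log-probs by (1 - w); varying w yields the candidate set
    (assumed parameter-weighting scheme)."""
    def tr(p, t):               # weighted transition score
        return w * trans.get((p, t), -1e9)

    def em(t, x):               # weighted emission score
        return (1.0 - w) * emit.get((t, x), -1e9)

    scores = {t: tr('<s>', t) + em(t, words[0]) for t in tags}
    back = []
    for x in words[1:]:
        new_scores, ptr = {}, {}
        for t in tags:
            prev = max(tags, key=lambda p: scores[p] + tr(p, t))
            new_scores[t] = scores[prev] + tr(prev, t) + em(t, x)
            ptr[t] = prev
        back.append(ptr)
        scores = new_scores
    last = max(tags, key=lambda t: scores[t])
    seq = [last]
    for ptr in reversed(back):          # backtrace
        seq.append(ptr[seq[-1]])
    return list(reversed(seq)), scores[last]


def dlr(sensitivity, specificity):
    """Positive diagnostic likelihood ratio: LR+ = sensitivity / (1 - specificity)."""
    return sensitivity / max(1e-9, 1.0 - specificity)


def select(words, tags, trans, emit, weights, sens_spec):
    """Pick the candidate whose HMM score plus log-DLR bonus is largest.
    sens_spec maps each weight to (sensitivity, specificity) estimated on
    held-out data; the paper tunes this selection with swarm optimization,
    which is omitted from this sketch."""
    best, best_score = None, -math.inf
    for w in weights:
        seq, hmm_score = viterbi(words, tags, trans, emit, w)
        s, p = sens_spec[w]
        total = hmm_score + math.log(max(dlr(s, p), 1e-9))
        if total > best_score:
            best, best_score = seq, total
    return best, best_score
```

Under these assumptions, each weight in `weights` yields one candidate tag sequence, and the extra DLR score acts as the stage-two evaluation function that rewards candidates validated as more sensitive and specific on development data.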