Fix issue 1726 by allowing duplicate tags, using the lxml parser, and turning off thesaurus corrections.