58 Tech
Mar 2, 2020 · Artificial Intelligence
Low-Quality Text Detection Using Unsupervised Language Model Perplexity
This article proposes a method to identify low-quality text in business data by training a large-scale unsupervised language model to compute sentence perplexity, converting the detection problem into a threshold decision, and details model design, challenges, optimizations, and online performance results.
BERTNLPlanguage model
0 likes · 13 min read