What is noisy text? - Definition from Whatis.com

Noisy text is electronically stored communication that a text mining program cannot categorize properly. In an electronic document, noisy text is characterized by a discrepancy between the letters and symbols in the HTML code and the author's intended meaning. It results from writing that does not comply with the rules the program uses to identify and categorize words, phrases and clauses in a particular language.
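As a rough illustration of that mismatch, the following Python sketch scores a message by the share of its tokens that a program's word list does not recognize. The vocabulary, the function name noise_ratio and the sample messages are all hypothetical and chosen only for this example; real text mining software relies on far richer linguistic rules than a simple word lookup.

```python
import re

# Hypothetical toy vocabulary standing in for the rules a program knows.
KNOWN_WORDS = {"see", "you", "tomorrow", "at", "the", "meeting"}


def noise_ratio(text, vocabulary=KNOWN_WORDS):
    """Return the fraction of tokens that are not in the vocabulary."""
    # Lowercase the text and split it into simple alphanumeric tokens.
    tokens = re.findall(r"[a-z0-9']+", text.lower())
    if not tokens:
        return 0.0
    unknown = [token for token in tokens if token not in vocabulary]
    return len(unknown) / len(tokens)


print(noise_ratio("See you tomorrow at the meeting"))  # every token is known
print(noise_ratio("c u 2moro at teh mtg"))             # most tokens are unknown
```

A higher ratio suggests that the program would have trouble categorizing the message, even though a human reader understands both versions equally well.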

Idiomatic expressions, abbreviations, acronyms and business-specific jargon can all cause noisy text. Other potential causes include poor spelling and punctuation, typographical errors and inaccurate output from optical character recognition (OCR) and speech recognition programs.
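One common way to reduce this kind of noise before analysis is to map known abbreviations and misspellings to a canonical form. The Python sketch below assumes a small hand-built lookup table; the table entries and the function name normalize are hypothetical, and a lookup of this kind cannot handle idioms or errors it has never seen.

```python
import re

# Hypothetical lookup table mapping noisy forms to canonical wording.
NORMALIZATION_TABLE = {
    "c": "see",
    "u": "you",
    "2moro": "tomorrow",
    "teh": "the",
    "mtg": "meeting",
    "asap": "as soon as possible",
}


def normalize(text, table=NORMALIZATION_TABLE):
    """Replace each token found in the table with its canonical form."""

    def replace(match):
        word = match.group(0)
        # Look up the lowercase form; leave unknown words unchanged.
        return table.get(word.lower(), word)

    return re.sub(r"[A-Za-z0-9']+", replace, text)


print(normalize("c u 2moro at teh mtg"))
```

Words that are not in the table pass through untouched, so the approach is only as good as the table behind it.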

Noisy text is particularly prevalent in the unstructured text found in blog posts, chat conversations, discussion threads and SMS text messages.

See also: fuzzy logic

This was last updated in June 2010
Editorial Director: Margaret Rouse
