Noisy text is electronically stored communication that a text mining software program cannot
categorize properly. In an electronic document, noisy text is characterized by a discrepancy between
the letters and symbols in the HTML code and the author's intended meaning. Noisy text results from
writing that does not comply with the rules the program uses to identify and categorize words, phrases
and clauses in a particular language.
Idiomatic expressions, abbreviations, acronyms and business-specific lingo can all cause noisy
text. Other potential causes include poor spelling and punctuation, typographical errors and
inaccurate output from optical character recognition (OCR) and speech recognition programs.
Noisy text is particularly prevalent in the unstructured text found in blog posts, chat
conversations, discussion threads and SMS text messages.
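As a rough illustration of the idea, the following Python sketch flags tokens that a program's rules do not recognize and expands a few abbreviations of the kind found in SMS messages. The vocabulary, the abbreviation table and the function names are all hypothetical and invented for this example; a real text mining program would rely on far larger dictionaries and language-specific rules.

```python
import re

# Hypothetical lookup tables, invented for illustration only.
VOCABULARY = {"see", "you", "tomorrow", "at", "the", "meeting", "thanks", "great"}
ABBREVIATIONS = {"c": "see", "u": "you", "tmrw": "tomorrow", "thx": "thanks", "gr8": "great"}


def tokenize(text):
    """Lowercase the text and split it into runs of letters and digits."""
    return re.findall(r"[a-z0-9]+", text.lower())


def noise_ratio(text):
    """Return the fraction of tokens that are not in the known vocabulary."""
    tokens = tokenize(text)
    if not tokens:
        return 0.0
    noisy = sum(1 for token in tokens if token not in VOCABULARY)
    return noisy / len(tokens)


def normalize(text):
    """Expand known abbreviations and report tokens that stay unrecognized."""
    tokens = [ABBREVIATIONS.get(token, token) for token in tokenize(text)]
    unresolved = [token for token in tokens if token not in VOCABULARY]
    return tokens, unresolved


if __name__ == "__main__":
    message = "c u tmrw at the meeting, thx"
    print(noise_ratio(message))
    print(normalize(message))
```

In this sketch, the noise ratio gives a crude measure of how much of a message falls outside the program's rules, while the normalization step shows how a lookup table can recover the author's intended meaning for abbreviations it already knows about. Tokens the table does not cover remain unresolved.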
See also: fuzzy logic
This was last updated in June 2010