noisy text

Noisy text is an electronically stored communication that a text mining software program cannot categorize properly. It is often caused by an end user's heavy use of idiomatic expressions, abbreviations, chat and text acronyms, or business-specific lingo.

In an electronic document, noisy text is characterized by a discrepancy between the letters and symbols in the HTML code and the author's intended meaning.

Noisy text does not comply with the rules a text mining program uses to identify and categorize words, phrases and clauses in a particular language. Idiomatic expressions, abbreviations, acronyms and business-specific lingo can all cause noisy text. It is particularly prevalent in the unstructured text found in blog posts, chat conversations, discussion threads and SMS text messages. Other potential causes include poor spelling and punctuation, typographical errors, and inaccurate output from optical character recognition (OCR) and speech recognition programs.
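
As a rough illustration of the causes listed above, the following Python sketch expands a few chat acronyms and flags tokens that fall outside a known vocabulary. The word list, the acronym table and the function names (tokenize, normalize, noise_ratio) are all hypothetical and invented for this example; a real text mining program would rely on far larger dictionaries and language-specific rules.

```python
import re

# Hypothetical lookup tables, invented for this example only.
KNOWN_WORDS = {"see", "you", "tomorrow", "at", "the", "meeting", "thanks"}
CHAT_ACRONYMS = {"cu": "see you", "thx": "thanks", "tmrw": "tomorrow", "mtg": "meeting"}


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def normalize(text):
    """Expand known acronyms.

    Returns the cleaned word list and the words that are still
    not in the known vocabulary after expansion.
    """
    cleaned, unknown = [], []
    for token in tokenize(text):
        expanded = CHAT_ACRONYMS.get(token, token)
        for word in expanded.split():
            cleaned.append(word)
            if word not in KNOWN_WORDS:
                unknown.append(word)
    return cleaned, unknown


def noise_ratio(text):
    """Fraction of raw tokens that are not in the known vocabulary."""
    tokens = tokenize(text)
    if not tokens:
        return 0.0
    noisy = sum(1 for token in tokens if token not in KNOWN_WORDS)
    return noisy / len(tokens)


if __name__ == "__main__":
    message = "thx, cu tmrw at the mtg"
    print(normalize(message))
    print(noise_ratio(message))
```

In this sketch, acronyms found in the table are rewritten into ordinary words, while a misspelling such as "teh" stays in the unknown list because nothing in the tables covers it. Spelling correction would be a separate step and is not attempted here.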

See also: fuzzy logic, noisy data

This was first published in May 2012
