Text mining applications
Recently, text mining has received attention in many areas. Text mining usually involves structuring the input text (usually parsing, along with the addition of some derived linguistic features and the removal of others, and subsequent insertion into a database), deriving patterns within the structured data, and finally evaluating and interpreting the output. The quality of text mining output is usually judged by some combination of relevance, novelty, and interestingness. Typical text mining tasks include text categorization, text clustering, concept/entity extraction, production of granular taxonomies, sentiment analysis, document summarization, and entity relation modeling.
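The three stages described above (structuring, pattern derivation, evaluation) can be illustrated with a minimal Python sketch. It is not a reference implementation: the stop-word list, the function names, the sample documents, and the choice of document frequency as the "pattern" are all assumptions made for illustration.

```python
import re
from collections import Counter

# Hypothetical stop-word list, kept tiny for illustration only.
STOP_WORDS = {"the", "a", "of", "and", "in", "is", "to"}

def structure(text):
    """Structure raw text: lowercase, tokenize, and drop stop words."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [t for t in tokens if t not in STOP_WORDS]

def derive_patterns(documents):
    """Derive a simple pattern: in how many documents each term occurs."""
    doc_freq = Counter()
    for doc in documents:
        # set() so that a term counts once per document
        doc_freq.update(set(structure(doc)))
    return doc_freq

def evaluate(doc_freq, min_docs=2):
    """Keep only terms that occur in at least min_docs documents."""
    return sorted(t for t, n in doc_freq.items() if n >= min_docs)

# Made-up sample documents.
docs = [
    "Text mining is the analysis of text.",
    "Sentiment analysis is a text mining task.",
    "Clustering groups similar documents.",
]
frequent_terms = evaluate(derive_patterns(docs))
print(frequent_terms)  # ['analysis', 'mining', 'text']
```

A real system would typically replace each stage with something richer, for example linguistic parsing in the structuring step and a statistical or machine-learned model in the pattern step.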
Text mining in security applications
One of the largest text mining applications in existence is probably the classified ECHELON surveillance system. Additionally, many text mining software packages, such as Statistica Text Miner, AeroText, Attensity and Expert System, are marketed towards security applications, particularly analysis of plain text sources such as Internet news.
In 2007, Europol's Serious Crime division developed an analysis system in order to track transnational organized crime. This Overall Analysis System for Intelligence Support (OASIS) integrates some of the most advanced text analytics and text mining technologies available on the market. The system has helped Europol make significant progress in supporting law enforcement objectives at the international level.
Text mining in Biomedical applications
A range of text mining applications in the biomedical literature has been described. One example is PubGene, which combines biomedical text mining with network visualization as an Internet service. Another example, which uses ontologies with text mining, is GoPubMed.org. Semantic similarity has also been used by text mining systems such as GOAnnotator.
Text mining in Software and applications
Research and development departments of major companies, including IBM and Microsoft, are researching text mining techniques and developing programs to further automate the mining and analysis processes. Text mining software is also being researched by different companies working in the area of search and indexing in general as a way to improve their results.
Text mining in Online Media applications
Text mining is being used by large media companies, such as the Tribune Company, to disambiguate information and to provide readers with better search experiences, which in turn increases site "stickiness" and revenue. Additionally, on the back end, editors benefit from being able to share, associate and package news across properties, significantly increasing opportunities to monetize content.
Text mining in Marketing applications
Text mining is starting to be used in marketing as well, more specifically in analytical customer relationship management. Coussement and Van den Poel (2008) apply it to improve predictive analytics models for customer churn (customer attrition).
Text mining in Sentiment analysis
Sentiment analysis may, for example, involve analysis of movie reviews for estimating how favorable a review is for a movie. Such an analysis may require a labeled data set or labeling of the affectivity of words. A resource for affectivity of words has been made for WordNet.
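The word-labeling approach mentioned above can be sketched in Python as follows. The lexicon, its scores, and the function name `review_score` are hypothetical and chosen only for illustration; they are not taken from WordNet or from any published resource.

```python
import re

# Hypothetical affect lexicon mapping words to a polarity score
# (positive > 0, negative < 0). Values are invented for illustration.
AFFECT = {"great": 1.0, "enjoyable": 0.5, "boring": -0.5, "terrible": -1.0}

def review_score(review):
    """Average the polarity of the lexicon words found in a review.

    Returns 0.0 when the review contains no word from the lexicon.
    """
    tokens = re.findall(r"[a-z]+", review.lower())
    scores = [AFFECT[t] for t in tokens if t in AFFECT]
    return sum(scores) / len(scores) if scores else 0.0

print(review_score("A great and enjoyable film"))    # 0.75
print(review_score("Terrible plot, boring acting"))  # -0.75
```

Averaging word scores ignores negation and context ("not great" would still score as positive), which is one reason a labeled data set of whole reviews may be preferred where one is available.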
Text mining in Academic applications
The issue of text mining is of importance to publishers who hold large databases of information requiring indexing for retrieval. This is particularly true in scientific disciplines, in which highly specific information is often contained within written text. Therefore, initiatives have been taken such as Nature's proposal for an Open Text Mining Interface (OTMI) and NIH's common Journal Publishing Document Type Definition (DTD) that would provide semantic cues to machines to answer specific queries contained within text without removing publisher barriers to public access.