These techniques and processes discover and present knowledge – facts, business rules, and relationships – that is otherwise locked in textual form, impenetrable to automated processing.
Approaches to text analytics
A typical application scans a set of documents written in a natural language and either models the document set for predictive classification or populates a database or search index with the extracted information. Current approaches to text analytics use natural language processing techniques that focus on specialized domains. Text analytics applications typically follow one of two approaches:
- Statistical, which uses a system trained on training data to apply statistical algorithms and methods to unstructured content in order to identify and classify a document's concepts and subjects.
- Linguistic, which applies rule-based tokenization and analysis to unstructured content in order to identify specific kinds of information objects called "entities" and then extract these for further processing.
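The contrast between the two approaches can be illustrated with a minimal Python sketch. It is not taken from any of the products listed in this article: the Naive Bayes classifier stands in for the statistical approach in general, the regular expressions stand in for a linguistic rule set, and every name in it (`train`, `classify`, `extract_entities`, `ENTITY_RULES`, the toy `TRAINING` data) is hypothetical and chosen only for illustration.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


# Statistical approach: a tiny multinomial Naive Bayes classifier
# (an illustrative choice; real systems may use other models).
def train(labeled_docs):
    """Count words per label from (text, label) pairs."""
    word_counts = defaultdict(Counter)
    label_counts = Counter()
    for text, label in labeled_docs:
        label_counts[label] += 1
        word_counts[label].update(tokenize(text))
    return word_counts, label_counts


def classify(text, word_counts, label_counts):
    """Return the label with the highest log-probability (add-one smoothing)."""
    vocab = {w for counts in word_counts.values() for w in counts}
    total_docs = sum(label_counts.values())
    best_label, best_score = None, float("-inf")
    for label, n_docs in label_counts.items():
        score = math.log(n_docs / total_docs)
        total_words = sum(word_counts[label].values())
        for word in tokenize(text):
            count = word_counts[label][word]
            score += math.log((count + 1) / (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Linguistic approach: hand-written rules that pick out entities.
# These two patterns are hypothetical and far simpler than a real rule set.
ENTITY_RULES = {
    "MONEY": re.compile(r"\$\d+(?:\.\d+)?(?: (?:million|billion))?"),
    "ORGANIZATION": re.compile(r"\b(?:[A-Z][a-z]+ )+(?:Inc|Corp|Ltd)\b"),
}


def extract_entities(text):
    """Return (entity_type, matched_text) pairs found by the rules."""
    return [
        (kind, match.group(0))
        for kind, rule in ENTITY_RULES.items()
        for match in rule.finditer(text)
    ]


# Toy training data, invented for this example.
TRAINING = [
    ("shares rose as the bank reported strong profit", "finance"),
    ("the central bank raised interest rates", "finance"),
    ("the team won the match in extra time", "sports"),
    ("the striker scored two goals in the final", "sports"),
]

if __name__ == "__main__":
    word_counts, label_counts = train(TRAINING)
    print(classify("the bank reported higher profit", word_counts, label_counts))
    print(extract_entities("Acme Corp paid $5 million to Globex Inc"))
```

The statistical half learns its behavior from labeled examples and assigns a subject to a whole document, while the linguistic half needs no training data but only finds what its rules describe.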
Commercial text analytics software and tools
- AeroText - provides a suite of text mining applications for content analysis. Content used can be in multiple languages.
- AlchemyAPI - web-based text analytics API: document categorization, language identification, term extraction, named entities, etc. Multi-lingual support.
- IBM LanguageWare - the IBM suite for text analytics (tools and runtime).
- Infonic - provides commercial sentiment analysis of financial news feeds for the Thomson Reuters RMDS trading information system. The "sentiment scores" that this software provides are used within algorithmic trading systems by several major trading banks. Infonic also develops document summarization and textual navigation technologies that aid in knowledge management.
- Nstein Technologies - text mining solution that creates rich metadata for publishers, with the stated aims of increasing page views and site stickiness, optimizing SEO, automating tagging, improving the search experience, increasing editorial productivity, decreasing operational publishing costs, and increasing online revenues. In combination with search engines, it is used to create semantic search applications.
- SPSS - provider of PASW Text Analytics for Surveys and PASW Text Analytics, advanced NLP-based text analytics software (multilingual sentiment, event, and fact extraction) that can be used in conjunction with SPSS Predictive Analytics Solutions.
- Execware - publisher of Reason, a PC program with patented automated data tables for visually detecting connections in text and numeric data about objects, events, people, places, or anything else.
Open-source text analytics applications
- GATE - General Architecture for Text Engineering, an open-source toolbox for natural language processing
- UIMA - Unstructured Information Management Architecture
- RapidMiner - open-source software for data and text mining



