The research and development departments of major companies, including IBM and Microsoft, are investigating text mining techniques and developing programs to further automate the mining and analysis processes. Companies working in search and indexing are also researching text mining software as a way to improve their results.
Many companies provide commercial text mining programs:
- AeroText - provides a suite of text mining applications for content analysis. Content used can be in multiple languages.
- AlchemyAPI - SaaS-based text mining platform that supports 6+ languages. Includes named entity extraction, keyword extraction, document categorization, etc.
- Autonomy - suite of text mining, clustering and categorization solutions for a variety of industries.
- Endeca Technologies - provides software to analyze and cluster unstructured text.
- Expert System S.p.A. - suite of semantic technologies and products for developers and knowledge managers.
- Fair Isaac - leading provider of decision management solutions powered by advanced analytics (includes text analytics).
- Inxight - provider of text analytics, search, and unstructured visualization technologies. (Inxight was bought by Business Objects, which was in turn bought by SAP AG in 2008.)
- LexisNexis - provider of business intelligence solutions based on an extensive news and company information content set. Through its recent acquisition of Datops, LexisNexis is leveraging its search and retrieval expertise to become a player in the text and data mining field.
- LanguageWare - text analysis libraries and customization tooling from IBM.
- Linguamatics - text mining software for extracting meaningful information from unstructured data.
- Nstein Technologies - text mining solution that creates rich metadata, intended to help publishers increase page views and site stickiness, optimize SEO, automate tagging, improve the search experience, increase editorial productivity, decrease operational publishing costs, and increase online revenues. In combination with search engines it is used to create semantic search applications.
- Pervasive Data Integrator - includes Extract Schema Designer, which lets the user point and click to identify structure patterns in reports, HTML, emails, etc. for extraction into any database.
- RapidMiner/YALE - open-source data and text mining software for scientific and commercial use.
- SAS - solutions including SAS Text Miner and Teragram: commercial text analytics, natural language processing, and taxonomy software used for information management.
- SPSS - provider of SPSS Text Analysis for Surveys, Text Mining for Clementine, LexiQuest Mine and LexiQuest Categorize, commercial text analytics software that can be used in conjunction with SPSS Predictive Analytics Solutions.
- Thomson Data Analyzer - enables complex analysis of patent information, scientific publications and news.
Open source text mining software:
- GATE - natural language processing and language engineering tool.
- YALE/RapidMiner with its Word Vector Tool plugin - data and text mining software.
- UIMA - the Unstructured Information Management Architecture, a component framework for analysing unstructured content such as text, audio and video, originally developed by IBM.
- U-Compare (http://u-compare.org) - an integrated text mining/natural language processing system based on the UIMA framework. It provides access to a large collection of ready-to-use, interoperable natural language processing components, currently the world's largest UIMA component repository. U-Compare allows users to build complex NLP workflows via a drag-and-drop interface, and makes it simple to visualize and compare the outputs of these workflows.
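As the name of the Word Vector Tool plugin suggests, text mining software commonly represents documents as word vectors before analysing them. The following Python sketch is a generic illustration of that idea using only the standard library. It is not the API or method of any product listed above, and the function names `tokenize` and `word_vectors` and the sample documents are hypothetical.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and return its alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def word_vectors(documents):
    """Build one term-count vector per document.

    Returns (vocabulary, vectors), where vocabulary is a sorted list of
    all terms seen and each vector holds the count of every vocabulary
    term in the corresponding document.
    """
    counts = [Counter(tokenize(doc)) for doc in documents]
    vocabulary = sorted(set().union(*counts))
    vectors = [[c[term] for term in vocabulary] for c in counts]
    return vocabulary, vectors


# Hypothetical sample documents, used only for illustration.
docs = [
    "Text mining finds patterns in text.",
    "Mining tools analyse unstructured text.",
]
vocabulary, vectors = word_vectors(docs)
for term, column in zip(vocabulary, zip(*vectors)):
    print(term, column)
```

Each row of `vectors` can then be passed to a clustering or categorization step. Real tools usually add further processing, such as stop-word removal, stemming, or term weighting, which this sketch omits.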



