Text mining, the Glossary
Text mining, text data mining (TDM) or text analytics is the process of deriving high-quality information from text.[1]
Table of Contents
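134 relations: Advertising network, Affect (psychology), Algorithm, Annotation, Association of European Research Libraries, Australian Law Reform Commission, Authors Guild, Inc. v. Google, Inc., Automatic summarization, Big data, Bioinformatics, Biology, Biomedicine, Biotechnology and Biological Sciences Research Council, Blog, Book, Business intelligence, Business rule, Classification, Commercial software, Competitive intelligence, Concept mining, Content analysis, Context (linguistics), Copyright and Information Society Directive 2001, Copyright law of Australia, Copyright law of Japan, Copyright law of the European Union, Copyright law of the United States, Coreference, Corpus manager, Counterintelligence, Customer attrition, Customer relationship management, Data and information visualization, Data mining, Data model, Database, Database Directive, Database index, Digital journalism, Dimensionality reduction, Discovery (observation), Document, Document classification, Document clustering, Document processing, Document type definition, Electronic discovery, Email, Email filtering, Encryption, Engineering and Physical Sciences Research Council, Entity–relationship model, European Commission, Exploratory data analysis, Fair dealing, Fair use, File system, Full-text search, Gensim, GoPubMed, Homonym, IBM, Information, Information Awareness Office, Information extraction, Information retrieval, Intelligence analysis, Jisc, Lexical analysis, Limitations and exceptions to copyright, Linguistics, List of life sciences, List of text mining software, Machine learning, Machine translation, Macromolecular docking, Market sentiment, Microsoft, Name resolution (semantics and text extraction), Named-entity recognition, National Centre for Text Mining, National Institutes of Health, National security, Natural language, Natural language processing, Natural Language Toolkit, Nature (journal), News analytics, Noun phrase, Novelty (patent), Offender profiling, Ontology learning, Open access, Open Mind Common Sense, Open source, Parsing, Part-of-speech tagging, Pattern matching, Pattern recognition, Plain text, Predictive analytics, Protein, PubGene, Readability, Record linkage, Relevance (information retrieval), Research, Research Councils UK, Review, Search engine, Security appliance, Semantic Web, Sentiment analysis, Sequential pattern mining, Sexism, Social media, Social science, Statistics, Subject–verb–object word order, Tag (metadata), Text Analysis Portal for Research, Text corpus, Tribune Media, UC Berkeley School of Information, University of Alberta, University of California, Berkeley, University of Manchester, University of Tokyo, Unstructured data, W-shingling, Website, Weka (software), WordNet.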
- Applied data mining
- Text
Advertising network
An online advertising network or ad network is a company that connects advertisers to websites that want to host advertisements.
See Text mining and Advertising network
Affect (psychology)
Affect, in psychology, is the underlying experience of feeling, emotion, attachment, or mood.
See Text mining and Affect (psychology)
Algorithm
In mathematics and computer science, an algorithm is a finite sequence of mathematically rigorous instructions, typically used to solve a class of specific problems or to perform a computation.
Annotation
An annotation is extra information associated with a particular point in a document or other piece of information.
See Text mining and Annotation
Association of European Research Libraries
The Association of European Research Libraries (Ligue des Bibliothèques Européennes de Recherche or LIBER) is a professional association of national and university research libraries in Europe.
See Text mining and Association of European Research Libraries
Australian Law Reform Commission
The Australian Law Reform Commission (often abbreviated to ALRC) is an Australian independent statutory body established to conduct reviews into the law of Australia.
See Text mining and Australian Law Reform Commission
Authors Guild, Inc. v. Google, Inc.
Authors Guild v. Google, 804 F.3d 202 (2nd Cir. 2015), was a copyright case heard in federal court for the Southern District of New York, and then the Second Circuit Court of Appeals, between 2005 and 2015.
See Text mining and Authors Guild, Inc. v. Google, Inc.
Automatic summarization
Automatic summarization is the process of shortening a set of data computationally, to create a subset (a summary) that represents the most important or relevant information within the original content. Text mining and automatic summarization are both topics in computational linguistics and natural language processing.
See Text mining and Automatic summarization
Big data
Big data primarily refers to data sets that are too large or complex to be dealt with by traditional data-processing application software.
Bioinformatics
Bioinformatics is an interdisciplinary field of science that develops methods and software tools for understanding biological data, especially when the data sets are large and complex.
See Text mining and Bioinformatics
Biology
Biology is the scientific study of life.
Biomedicine
Biomedicine is also referred to as Western medicine, mainstream medicine or conventional medicine.
See Text mining and Biomedicine
Biotechnology and Biological Sciences Research Council
Biotechnology and Biological Sciences Research Council (BBSRC), part of UK Research and Innovation, is a non-departmental public body (NDPB), and is the largest UK public funder of non-medical bioscience.
See Text mining and Biotechnology and Biological Sciences Research Council
Blog
A blog (a truncation of "weblog") is an informational website consisting of discrete, often informal diary-style text entries (posts).
Book
A book is a medium for recording information in the form of writing or images.
Business intelligence
Business intelligence (BI) consists of strategies and technologies used by enterprises for the data analysis and management of business information.
See Text mining and Business intelligence
Business rule
A business rule defines or constrains some aspect of a business.
See Text mining and Business rule
Classification
Classification is usually understood to mean the allocation of objects to certain pre-existing classes or categories.
See Text mining and Classification
Commercial software
Commercial software, or less commonly payware, is computer software that is produced for sale or that serves commercial purposes.
See Text mining and Commercial software
Competitive intelligence
Competitive intelligence (CI) is the process and forward-looking practices used in producing knowledge about the competitive environment to improve organizational performance.
See Text mining and Competitive intelligence
Concept mining
Concept mining is an activity that results in the extraction of concepts from artifacts. Text mining and concept mining are both topics in natural language processing.
See Text mining and Concept mining
Content analysis
Content analysis is the study of documents and communication artifacts, which might be texts of various formats, pictures, audio or video.
See Text mining and Content analysis
Context (linguistics)
In semiotics, linguistics, sociology and anthropology, context refers to those objects or entities which surround a focal event, in these disciplines typically a communicative event, of some kind.
See Text mining and Context (linguistics)
Copyright and Information Society Directive 2001
The Copyright and Information Society Directive 2001 is a directive in European Union law that was enacted to implement the WIPO Copyright Treaty and to harmonise aspects of copyright law across Europe, such as copyright exceptions.
See Text mining and Copyright and Information Society Directive 2001
Copyright law of Australia
The copyright law of Australia defines the legally enforceable rights of creators of creative and artistic works under Australian law.
See Text mining and Copyright law of Australia
Copyright law of Japan
Japanese copyright laws consist of two parts: "Author's Rights" and "Neighbouring Rights".
See Text mining and Copyright law of Japan
Copyright law of the European Union
The copyright law of the European Union is the copyright law applicable within the European Union.
See Text mining and Copyright law of the European Union
Copyright law of the United States
The copyright law of the United States grants monopoly protection for "original works of authorship".
See Text mining and Copyright law of the United States
Coreference
In linguistics, coreference, sometimes written co-reference, occurs when two or more expressions refer to the same person or thing; they have the same referent.
See Text mining and Coreference
Corpus manager
A corpus manager (corpus browser or corpus query system) is a tool for multilingual corpus analysis, which allows effective searching in corpora.
See Text mining and Corpus manager
Counterintelligence
Counterintelligence (counter-intelligence) or counterespionage (counter-espionage) is any activity aimed at protecting an agency's intelligence program from an opposition's intelligence service.
See Text mining and Counterintelligence
Customer attrition
Customer attrition, also known as customer churn, customer turnover, or customer defection, is the loss of clients or customers.
See Text mining and Customer attrition
Customer relationship management
Customer relationship management (CRM) is a process in which a business or other organization administers its interactions with customers, typically using data analysis to study large amounts of information.
See Text mining and Customer relationship management
Data and information visualization
Data and information visualization (data viz/vis or info viz/vis) is the practice of designing and creating easy-to-communicate and easy-to-understand graphic or visual representations of a large amount of complex quantitative and qualitative data and information with the help of static, dynamic or interactive visual items.
See Text mining and Data and information visualization
Data mining
Data mining is the process of extracting and discovering patterns in large data sets involving methods at the intersection of machine learning, statistics, and database systems.
See Text mining and Data mining
Data model
A data model is an abstract model that organizes elements of data and standardizes how they relate to one another and to the properties of real-world entities.
See Text mining and Data model
Database
In computing, a database is an organized collection of data or a type of data store based on the use of a database management system (DBMS), the software that interacts with end users, applications, and the database itself to capture and analyze the data.
Database Directive
The Directive 96/9/EC of the European Parliament and of the Council of 11 March 1996 on the legal protection of databases is a directive of the European Union in the field of copyright law, made under the internal market provisions of the Treaty of Rome.
See Text mining and Database Directive
Database index
A database index is a data structure that improves the speed of data retrieval operations on a database table at the cost of additional writes and storage space to maintain the index data structure.
See Text mining and Database index
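The trade-off described in this entry (faster retrieval, paid for with extra writes and storage) can be sketched in a few lines of Python. This is a minimal illustration, not how any particular database implements indexes; the class `IndexedTable`, its `by_author` dictionary, and the sample rows are all hypothetical names chosen for the example.

```python
class IndexedTable:
    """A toy in-memory table with one secondary index on the 'author' column."""

    def __init__(self):
        self.rows = []        # the table itself
        self.by_author = {}   # the index: author -> positions of matching rows

    def insert(self, row):
        self.rows.append(row)
        # Extra write and extra storage: every insert also updates the index.
        self.by_author.setdefault(row["author"], []).append(len(self.rows) - 1)

    def find_scan(self, author):
        # Without the index: examine every row.
        return [row for row in self.rows if row["author"] == author]

    def find_indexed(self, author):
        # With the index: jump straight to the matching row positions.
        return [self.rows[i] for i in self.by_author.get(author, [])]


table = IndexedTable()
table.insert({"author": "Hearst", "title": "Untangling Text Data Mining"})
table.insert({"author": "Feldman", "title": "The Text Mining Handbook"})
table.insert({"author": "Hearst", "title": "Search User Interfaces"})

print(table.find_indexed("Hearst"))
```

Both lookup methods return the same rows; the indexed one simply avoids touching rows that cannot match.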
Digital journalism
Digital journalism, also known as netizen journalism or online journalism, is a contemporary form of journalism where editorial content is distributed via the Internet, as opposed to publishing via print or broadcast.
See Text mining and Digital journalism
Dimensionality reduction
Dimensionality reduction, or dimension reduction, is the transformation of data from a high-dimensional space into a low-dimensional space so that the low-dimensional representation retains some meaningful properties of the original data, ideally close to its intrinsic dimension.
See Text mining and Dimensionality reduction
Discovery (observation)
Discovery is the act of detecting something new, or something previously unrecognized as meaningful.
See Text mining and Discovery (observation)
Document
A document is a written, drawn, presented, or memorialized representation of thought, often the manifestation of non-fictional, as well as fictional, content.
Document classification
Document classification or document categorization is a problem in library science, information science and computer science. Text mining and document classification are both topics in natural language processing.
See Text mining and Document classification
Document clustering
Document clustering (or text clustering) is the application of cluster analysis to textual documents.
See Text mining and Document clustering
Document processing
Document processing is a field of research and a set of production processes aimed at making an analog document digital. Text mining and document processing are both applications of data mining.
See Text mining and Document processing
Document type definition
A document type definition (DTD) is a specification file that contains a set of markup declarations that define a document type for an SGML-family markup language (GML, SGML, XML, HTML).
See Text mining and Document type definition
Electronic discovery
Electronic discovery (also ediscovery or e-discovery) refers to discovery in legal proceedings such as litigation, government investigations, or Freedom of Information Act requests, where the information sought is in electronic format (often referred to as electronically stored information or ESI).
See Text mining and Electronic discovery
Email
Electronic mail (email or e-mail) is a method of transmitting and receiving messages using electronic devices.
Email filtering
Email filtering is the processing of email to organize it according to specified criteria.
See Text mining and Email filtering
Encryption
In cryptography, encryption is the process of transforming (more specifically, encoding) information in a way that, ideally, only authorized parties can decode.
See Text mining and Encryption
Engineering and Physical Sciences Research Council
The Engineering and Physical Sciences Research Council (EPSRC) is a British Research Council that provides government funding for grants to undertake research and postgraduate degrees in engineering and the physical sciences, mainly to universities in the United Kingdom.
See Text mining and Engineering and Physical Sciences Research Council
Entity–relationship model
An entity–relationship model (or ER model) describes interrelated things of interest in a specific domain of knowledge.
See Text mining and Entity–relationship model
European Commission
The European Commission (EC) is the primary executive arm of the European Union (EU).
See Text mining and European Commission
Exploratory data analysis
In statistics, exploratory data analysis (EDA) is an approach of analyzing data sets to summarize their main characteristics, often using statistical graphics and other data visualization methods.
See Text mining and Exploratory data analysis
Fair dealing
Fair dealing is a limitation and exception to the exclusive rights granted by copyright law to the author of a creative work.
See Text mining and Fair dealing
Fair use
Fair use is a doctrine in United States law that permits limited use of copyrighted material without having to first acquire permission from the copyright holder.
File system
In computing, a file system or filesystem (often abbreviated to FS or fs) governs file organization and access.
See Text mining and File system
Full-text search
In text retrieval, full-text search refers to techniques for searching a single computer-stored document or a collection in a full-text database.
See Text mining and Full-text search
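One common way to implement full-text search over a collection is an inverted index that maps each word to the documents containing it. The sketch below is only an illustration under simplifying assumptions: whitespace tokenization, lowercase matching, boolean AND queries, and hypothetical helper names `build_index` and `search`.

```python
from collections import defaultdict


def build_index(docs):
    """Map each lowercased word to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in text.lower().split():
            # Strip surrounding punctuation so "text." and "text" match.
            index[word.strip(".,;:!?\"'()")].add(doc_id)
    return index


def search(index, query):
    """Return ids of documents that contain every word of the query."""
    words = query.lower().split()
    if not words:
        return set()
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result


docs = {
    1: "Text mining derives information from text.",
    2: "Data mining discovers patterns in large data sets.",
    3: "Full-text search looks through every word of a document.",
}
index = build_index(docs)

print(search(index, "mining"))
print(search(index, "text mining"))
```

Because tokens are split on whitespace only, "Full-text" is indexed as a single word, so a query for "text" does not match document 3. Real systems usually add stemming, stop-word handling, and ranking on top of this structure.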
Gensim
Gensim is an open-source library for unsupervised topic modeling, document indexing, retrieval by similarity, and other natural language processing functionalities, using modern statistical machine learning.
GoPubMed
GoPubMed was a knowledge-based search engine for biomedical texts.
Homonym
In linguistics, homonyms are words which are either homographs—words that have the same spelling (regardless of pronunciation)—or homophones—words that have the same pronunciation (regardless of spelling)—or both.
IBM
International Business Machines Corporation (using the trademark IBM), nicknamed Big Blue, is an American multinational technology company headquartered in Armonk, New York and present in over 175 countries.
Information
Information is an abstract concept that refers to something which has the power to inform.
See Text mining and Information
Information Awareness Office
The Information Awareness Office (IAO) was established by the United States Defense Advanced Research Projects Agency (DARPA) in January 2002 to bring together several DARPA projects focused on applying surveillance and information technology to track and monitor terrorists and other asymmetric threats to U.S. national security.
See Text mining and Information Awareness Office
Information extraction
Information extraction (IE) is the task of automatically extracting structured information from unstructured and/or semi-structured machine-readable documents and other electronically represented sources. Text mining and information extraction are both topics in natural language processing.
See Text mining and Information extraction
Information retrieval
Information retrieval (IR) in computing and information science is the task of identifying and retrieving information system resources that are relevant to an information need. Text mining and information retrieval are both topics in natural language processing.
See Text mining and Information retrieval
Intelligence analysis
Intelligence analysis is the application of individual and collective cognitive methods to weigh data and test hypotheses within a secret socio-cultural context.
See Text mining and Intelligence analysis
Jisc
Jisc is a United Kingdom not-for-profit organisation that provides network and IT services and digital resources in support of further and higher education and research, as well as the public sector.
Lexical analysis
Lexical tokenization is the conversion of a text into (semantically or syntactically) meaningful lexical tokens belonging to categories defined by a "lexer" program.
See Text mining and Lexical analysis
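A small lexer can make the idea of categorized tokens concrete. The following is a minimal sketch using Python's standard `re` module; the three categories (`NUMBER`, `WORD`, `PUNCT`) and their patterns are assumptions chosen for illustration, not a standard token set.

```python
import re

# Each rule pairs a token category with a regular expression.
# Order matters: earlier rules win when several could match.
TOKEN_SPEC = [
    ("NUMBER", r"\d+(?:\.\d+)?"),
    ("WORD", r"[A-Za-z]+(?:'[A-Za-z]+)?"),
    ("PUNCT", r"[^\w\s]"),
    ("SKIP", r"\s+"),
]
TOKEN_RE = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))


def tokenize(text):
    """Return (category, lexeme) pairs; whitespace is dropped.

    Characters that match no rule (for example underscores or
    non-ASCII letters) are silently skipped in this sketch.
    """
    tokens = []
    for match in TOKEN_RE.finditer(text):
        kind = match.lastgroup
        if kind != "SKIP":
            tokens.append((kind, match.group()))
    return tokens


for token in tokenize("Mining 3.5 GB of text isn't easy."):
    print(token)
```

The output is a flat list of pairs such as `("WORD", "Mining")` and `("NUMBER", "3.5")`, which later stages (parsing, tagging) can consume.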
Limitations and exceptions to copyright
Limitations and exceptions to copyright are provisions, in local copyright law or the Berne Convention, which allow for copyrighted works to be used without a license from the copyright owner.
See Text mining and Limitations and exceptions to copyright
Linguistics
Linguistics is the scientific study of language.
See Text mining and Linguistics
List of life sciences
This list of life sciences comprises the branches of science that involve the scientific study of life – such as microorganisms, plants, and animals including human beings.
See Text mining and List of life sciences
List of text mining software
Text mining computer programs are available from many commercial and open source companies and sources.
See Text mining and List of text mining software
Machine learning
Machine learning (ML) is a field of study in artificial intelligence concerned with the development and study of statistical algorithms that can learn from data and generalize to unseen data and thus perform tasks without explicit instructions.
See Text mining and Machine learning
Machine translation
Machine translation is the use of computational techniques to translate text or speech from one language to another, including the contextual, idiomatic and pragmatic nuances of both languages. Text mining and machine translation are both topics in computational linguistics and natural language processing.
See Text mining and Machine translation
Macromolecular docking
Macromolecular docking is the computational modelling of the quaternary structure of complexes formed by two or more interacting biological macromolecules.
See Text mining and Macromolecular docking
Market sentiment
Market sentiment, also known as investor attention, is the general prevailing attitude of investors as to anticipated price development in a market.
See Text mining and Market sentiment
Microsoft
Microsoft Corporation is an American multinational corporation and technology company headquartered in Redmond, Washington.
Name resolution (semantics and text extraction)
In semantics and text extraction, name resolution refers to the ability of text mining software to determine which actual person, actor, or object a particular use of a name refers to. Text mining and name resolution are both topics in computational linguistics.
See Text mining and Name resolution (semantics and text extraction)
Named-entity recognition
Named-entity recognition (NER) (also known as (named) entity identification, entity chunking, and entity extraction) is a subtask of information extraction that seeks to locate and classify named entities mentioned in unstructured text into pre-defined categories such as person names, organizations, locations, medical codes, time expressions, quantities, monetary values, percentages, etc. Text mining and named-entity recognition are both topics in computational linguistics.
See Text mining and Named-entity recognition
National Centre for Text Mining
The National Centre for Text Mining (NaCTeM) is a publicly funded text mining (TM) centre. Text mining and the National Centre for Text Mining are both associated with computational linguistics.
See Text mining and National Centre for Text Mining
National Institutes of Health
The National Institutes of Health, commonly referred to as NIH, is the primary agency of the United States government responsible for biomedical and public health research.
See Text mining and National Institutes of Health
National security
National security, or national defence (national defense in American English), is the security and defence of a sovereign state, including its citizens, economy, and institutions, which is regarded as a duty of government.
See Text mining and National security
Natural language
In neuropsychology, linguistics, and philosophy of language, a natural language or ordinary language is any language that occurs naturally in a human community by a process of use, repetition, and change without conscious planning or premeditation. Text mining and natural language are both topics in natural language processing.
See Text mining and Natural language
Natural language processing
Natural language processing (NLP) is an interdisciplinary subfield of computer science and artificial intelligence. Text mining and natural language processing are both topics in computational linguistics.
See Text mining and Natural language processing
Natural Language Toolkit
The Natural Language Toolkit, or more commonly NLTK, is a suite of libraries and programs for symbolic and statistical natural language processing (NLP) for English, written in the Python programming language. Text mining and the Natural Language Toolkit are both associated with natural language processing.
See Text mining and Natural Language Toolkit
Nature (journal)
Nature is a British weekly scientific journal founded and based in London, England.
See Text mining and Nature (journal)
News analytics
In trading strategy, news analysis refers to the measurement of the various qualitative and quantitative attributes of textual (unstructured data) news stories. Text mining and news analytics are both topics in natural language processing.
See Text mining and News analytics
Noun phrase
A noun phrase – or NP or nominal (phrase) – is a phrase that usually has a noun or pronoun as its head, and has the same grammatical functions as a noun.
See Text mining and Noun phrase
Novelty (patent)
Novelty is one of the patentability requirements for a patent claim, whose purpose is to prevent issuing patents on known things, i.e. to prevent public knowledge from being taken away from the public domain.
See Text mining and Novelty (patent)
Offender profiling
Offender profiling, also known as criminal profiling, is an investigative strategy used by law enforcement agencies to identify likely suspects and has been used by investigators to link cases that may have been committed by the same perpetrator.
See Text mining and Offender profiling
Ontology learning
Ontology learning (ontology extraction, ontology generation, or ontology acquisition) is the automatic or semi-automatic creation of ontologies, including extracting the corresponding domain's terms and the relationships between the concepts that these terms represent from a corpus of natural language text, and encoding them with an ontology language for easy retrieval. Text mining and ontology learning are both topics in natural language processing.
See Text mining and Ontology learning
Open access
Open access (OA) is a set of principles and a range of practices through which research outputs are distributed online, free of access charges or other barriers.
See Text mining and Open access
Open Mind Common Sense
Open Mind Common Sense (OMCS) is an artificial intelligence project based at the Massachusetts Institute of Technology (MIT) Media Lab whose goal is to build and utilize a large commonsense knowledge base from the contributions of many thousands of people across the Web.
See Text mining and Open Mind Common Sense
Open source
Open source is source code that is made freely available for possible modification and redistribution.
See Text mining and Open source
Parsing
Parsing, syntax analysis, or syntactic analysis is the process of analyzing a string of symbols, either in natural language, computer languages or data structures, conforming to the rules of a formal grammar.
Part-of-speech tagging
In corpus linguistics, part-of-speech tagging (POS tagging or PoS tagging or POST), also called grammatical tagging is the process of marking up a word in a text (corpus) as corresponding to a particular part of speech, based on both its definition and its context.
See Text mining and Part-of-speech tagging
Pattern matching
In computer science, pattern matching is the act of checking a given sequence of tokens for the presence of the constituents of some pattern.
See Text mining and Pattern matching
Pattern recognition
Pattern recognition is the task of assigning a class to an observation based on patterns extracted from data.
See Text mining and Pattern recognition
Plain text
In computing, plain text is a loose term for data (e.g. file contents) that represent only characters of readable material but not its graphical representation nor other objects (floating-point numbers, images, etc.). It may also include a limited number of "whitespace" characters that affect simple arrangement of text, such as spaces, line breaks, or tabulation characters.
See Text mining and Plain text
Predictive analytics
Predictive analytics is a form of business analytics applying machine learning to generate a predictive model for certain business applications.
See Text mining and Predictive analytics
Protein
Proteins are large biomolecules and macromolecules that comprise one or more long chains of amino acid residues.
PubGene
PubGene AS is a bioinformatics company located in Oslo, Norway and is the daughter company of PubGene Inc.
Readability
Readability is the ease with which a reader can understand a written text.
See Text mining and Readability
Record linkage
Record linkage (also known as data matching, data linkage, entity resolution, and many other terms) is the task of finding records in a data set that refer to the same entity across different data sources (e.g., data files, books, websites, and databases).
See Text mining and Record linkage
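A very small version of record linkage can be sketched with approximate string matching from Python's standard `difflib` module. This is one illustrative approach among many, not a recommended production method; the normalization rule (lowercase, drop punctuation, sort tokens), the 0.85 threshold, and the sample names are all assumptions made for the example.

```python
from difflib import SequenceMatcher


def normalize(name):
    """Lowercase, drop commas and periods, and sort tokens so word order is ignored."""
    cleaned = name.lower().replace(",", " ").replace(".", " ")
    return " ".join(sorted(cleaned.split()))


def similarity(a, b):
    """Similarity in [0, 1] between two normalized names."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def link_records(left, right, threshold=0.85):
    """Pair each left record with its best-scoring right record, if it clears the threshold."""
    links = []
    for record in left:
        best = max(right, key=lambda candidate: similarity(record, candidate))
        if similarity(record, best) >= threshold:
            links.append((record, best))
    return links


source_a = ["John Smith", "Maria Garcia", "Wei Chen"]
source_b = ["Jon Smith", "Garcia, Maria", "Alice Jones"]
links = link_records(source_a, source_b)

for pair in links:
    print(pair)
```

Here "John Smith" links to "Jon Smith" despite the spelling difference, "Maria Garcia" links to "Garcia, Maria" because token order is ignored, and "Wei Chen" stays unlinked because no candidate is similar enough.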
Relevance (information retrieval)
In information science and information retrieval, relevance denotes how well a retrieved document or set of documents meets the information need of the user.
See Text mining and Relevance (information retrieval)
Research
Research is "creative and systematic work undertaken to increase the stock of knowledge".
Research Councils UK
Research Councils UK, sometimes known as RCUK, was a non-departmental public body that coordinated science policy in the United Kingdom from 2002 to 2018.
See Text mining and Research Councils UK
Review
A review is an evaluation of a publication, product, service, or company or a critical take on current affairs in literature, politics or culture.
Search engine
A search engine is a software system that provides hyperlinks to web pages and other relevant information on the Web in response to a user's query.
See Text mining and Search engine
Security appliance
A security appliance is any form of server appliance that is designed to protect computer networks from unwanted traffic.
See Text mining and Security appliance
Semantic Web
The Semantic Web, sometimes known as Web 3.0 (not to be confused with Web3), is an extension of the World Wide Web through standards set by the World Wide Web Consortium (W3C).
See Text mining and Semantic Web
Sentiment analysis
Sentiment analysis (also known as opinion mining or emotion AI) is the use of natural language processing, text analysis, computational linguistics, and biometrics to systematically identify, extract, quantify, and study affective states and subjective information. Text mining and sentiment analysis are both topics in natural language processing.
See Text mining and Sentiment analysis
Sequential pattern mining
Sequential pattern mining is a topic of data mining concerned with finding statistically relevant patterns between data examples where the values are delivered in a sequence.
See Text mining and Sequential pattern mining
Sexism
Sexism is prejudice or discrimination based on one's sex or gender.
Social media
Social media are interactive technologies that facilitate the creation, sharing and aggregation of content (such as ideas, interests, and other forms of expression) amongst virtual communities and networks.
See Text mining and Social media
Social science
Social science is one of the branches of science, devoted to the study of societies and the relationships among individuals within those societies.
See Text mining and Social science
Statistics
Statistics (from German: Statistik, "description of a state, a country") is the discipline that concerns the collection, organization, analysis, interpretation, and presentation of data.
See Text mining and Statistics
Subject–verb–object word order
In linguistic typology, subject–verb–object (SVO) is a sentence structure where the subject comes first, the verb second, and the object third.
See Text mining and Subject–verb–object word order
Tag (metadata)
In information systems, a tag is a keyword or term assigned to a piece of information (such as an Internet bookmark, multimedia, database record, or computer file).
See Text mining and Tag (metadata)
Text Analysis Portal for Research
TAPoR (Text Analysis Portal for Research) is a gateway that highlights tools and code snippets usable for textual criticism of all types. Text mining and the Text Analysis Portal for Research are both associated with computational linguistics.
See Text mining and Text Analysis Portal for Research
Text corpus
In linguistics and natural language processing, a corpus (plural: corpora) or text corpus is a dataset consisting of natively digital and older, digitized language resources, either annotated or unannotated. Text mining and text corpora are both topics in computational linguistics.
See Text mining and Text corpus
Tribune Media
Tribune Media Company, also known as Tribune Company, was an American multimedia conglomerate headquartered in Chicago, Illinois.
See Text mining and Tribune Media
UC Berkeley School of Information
The University of California, Berkeley, School of Information, also known as the UC Berkeley School of Information or the I School, is a graduate school and, created in 1994, the newest of the schools at the University of California, Berkeley.
See Text mining and UC Berkeley School of Information
University of Alberta
The University of Alberta (also known as U of A or UAlberta) is a public research university located in Edmonton, Alberta, Canada.
See Text mining and University of Alberta
University of California, Berkeley
The University of California, Berkeley (UC Berkeley, Berkeley, Cal, or California) is a public land-grant research university in Berkeley, California.
See Text mining and University of California, Berkeley
University of Manchester
The University of Manchester is a public research university in Manchester, England.
See Text mining and University of Manchester
University of Tokyo
The University of Tokyo (abbreviated as Tōdai (東大) in Japanese and UTokyo in English) is a public research university in Bunkyō, Tokyo, Japan.
See Text mining and University of Tokyo
Unstructured data
Unstructured data (or unstructured information) is information that either does not have a pre-defined data model or is not organized in a pre-defined manner.
See Text mining and Unstructured data
W-shingling
In natural language processing, a w-shingling is a set of unique shingles (that is, n-grams), each of which is composed of contiguous subsequences of tokens within a document, which can then be used to ascertain the similarity between documents. Text mining and w-shingling are both topics in natural language processing.
See Text mining and W-shingling
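The definition above can be illustrated with a short Python sketch that builds the set of w-shingles for two documents and compares them. Using the Jaccard coefficient (intersection size over union size) as the similarity measure is an assumption made here; the entry itself does not name a measure. The function names and the whitespace tokenization are likewise illustrative.

```python
def shingles(text, w=4):
    """Return the set of unique w-shingles: contiguous runs of w tokens."""
    tokens = text.lower().split()  # naive whitespace tokenization (assumption)
    return {tuple(tokens[i:i + w]) for i in range(len(tokens) - w + 1)}


def jaccard(a, b):
    """Jaccard similarity |A ∩ B| / |A ∪ B|; treated as 1.0 when both sets are empty."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


doc1 = "a rose is a rose is a rose"
doc2 = "a rose is a flower which is a rose"

s1 = shingles(doc1)
s2 = shingles(doc2)

print(len(s1), len(s2))   # 3 unique shingles in doc1, 6 in doc2
print(jaccard(s1, s2))    # 1 shared shingle out of 8 distinct: 0.125
```

Repeated shingles count once because the result is a set, which is why the eight-token `doc1` yields only three shingles.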
Website
A website (also written as a web site) is a collection of web pages and related content that is identified by a common domain name and published on at least one web server.
Weka (software)
Waikato Environment for Knowledge Analysis (Weka) is a collection of machine learning and data analysis free software licensed under the GNU General Public License.
See Text mining and Weka (software)
WordNet
WordNet is a lexical database of semantic relations between words that links words into semantic relations including synonyms, hyponyms, and meronyms. Text mining and WordNet are both associated with computational linguistics.
See also
Applied data mining
- Able Danger
- Anomaly Detection at Multiple Scales
- Automatic number-plate recognition in the United Kingdom
- Behavioral analytics
- Business analytics
- CORE (research service)
- Cross-industry standard process for data mining
- Customer analytics
- Daisy Intelligence
- Data Applied
- Data mining in agriculture
- Data thinking
- Document processing
- Educational data mining
- Equifax Workforce Solutions
- Examples of data mining
- Game analytics
- Inference attack
- Java Data Mining
- Open-source intelligence
- PRODIGAL
- Path analysis (computing)
- SEMMA
- Stellar Wind
- Text mining
- Zapaday
Text
- Computer keyboards
- Copy (publishing)
- Document collaboration
- Internationalization and localization
- Intertextuality
- Literary theory
- Memory typewriter
- Pastebin
- PrivateBin
- Source code
- Text (literary theory)
- Text display
- Text editors
- Text files
- Text linguistics
- Text mining
- Text processing
- Texts
- Transliteration
- Typewriter
- Typewriters
- Typing
- Typography
- Wikis
- Word processors
References
[1] https://en.wikipedia.org/wiki/Text_mining
Also known as Applications of text mining, Auto-entity extraction, Data and text mining, Intelligent text analysis, Text analytics, Text and data mining, Text-mining, Textmining.