Data mining

Data mining is the process of sorting through large amounts of data and picking out relevant information. It is most often used by business intelligence organizations and financial analysts, but is increasingly being used in the sciences to extract information from the enormous data sets generated by modern experimental and observational methods. It has been described as "the nontrivial extraction of implicit, previously unknown, and potentially useful information from data"[1] and "the science of extracting useful information from large data sets or databases."[2] In relation to enterprise resource planning, data mining is the statistical and logical analysis of large sets of transaction data, looking for patterns that can aid decision making.[3]


Background

Traditionally, analysts have performed the task of extracting useful information from recorded data, but the increasing volume of data in modern business and science calls for computer-based approaches. As data sets have grown in size and complexity, there has been a shift away from direct hands-on data analysis toward indirect, automatic data analysis using more complex and sophisticated tools. The modern technologies of computers, networks, and sensors have made data collection and organization much easier. However, the captured data needs to be converted into information and knowledge to become useful. Data mining is the entire process of applying computer-based methodology, including new techniques for knowledge discovery, to data.[4]


Data mining identifies trends within data that go beyond simple analysis. Through the use of sophisticated algorithms, users have the ability to identify key attributes of business processes and target opportunities.


Although data mining is a relatively new term, the technology is not. Companies have long used powerful computers to sift through volumes of data, such as supermarket scanner data, to produce market research reports. Continuous innovations in computer processing power, disk storage, and statistical software are dramatically increasing the accuracy and usefulness of analysis.


The term data mining is often applied to two separate processes: knowledge discovery and prediction. Knowledge discovery provides explicit information in a readable form that can be understood by a user. Forecasting, or predictive modeling, provides predictions of future events and may be transparent and readable in some approaches (e.g. rule-based systems) and opaque in others, such as neural networks. Moreover, some data-mining systems, such as neural networks, are inherently geared towards prediction and pattern recognition rather than knowledge discovery.
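The distinction can be illustrated with a short Python sketch using scikit-learn. The data set and feature names below are synthetic and purely hypothetical; the point is only that a shallow decision tree yields rules a person can read (knowledge discovery), while a neural network scores new cases without offering a readable explanation (prediction).

```python
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier, export_text
from sklearn.neural_network import MLPClassifier

# synthetic data standing in for, say, customer records with a yes/no outcome
X, y = make_classification(n_samples=500, n_features=4, random_state=0)

# knowledge discovery: a shallow decision tree yields rules a user can read
tree = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X, y)
print(export_text(tree, feature_names=[f"attr_{i}" for i in range(4)]))

# prediction: a neural network scores new cases, but its internals stay opaque
net = MLPClassifier(hidden_layer_sizes=(16,), max_iter=1000, random_state=0).fit(X, y)
print(net.predict_proba(X[:3]))
```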


Metadata, or data about a given data set, are often expressed in a condensed data-minable format, or one that facilitates the practice of data mining. Common examples include executive summaries and scientific abstracts.


Data mining relies on the use of real-world data. Such data are extremely vulnerable to collinearity precisely because data from the real world may have unknown interrelations. An unavoidable weakness of data mining is that the critical data that might explain the relationships may never have been observed. Alternative, experiment-based approaches, such as choice modelling for human-generated data, may be used instead: inherent correlations are either controlled for or removed altogether through the construction of an experimental design.


Recently, there have been efforts to define standards for data mining, for example the CRISP-DM standard for analysis processes or the Java Data Mining standard. Independent of these standardization efforts, freely available open-source software systems such as RapidMiner and Weka have become an informal standard for defining data-mining processes.


Privacy concerns

There are also privacy and human rights concerns associated with data mining, specifically regarding the source of the data analyzed. Data mining provides information that may be difficult to obtain otherwise. When the data collected involves individual people, there are many questions concerning privacy, legality, and ethics.[5] In particular, data mining government or commercial data sets for national security or law enforcement purposes has raised privacy concerns.[6][7]


Notable uses of data mining

Terrorism

Data mining has been cited as the method by which the U.S. Army unit Able Danger identified the September 11, 2001 attacks leader, Mohamed Atta, and three other 9/11 hijackers as possible members of an Al Qaeda cell operating in the U.S. more than a year before the attack.


It has been suggested that both the Central Intelligence Agency and the Canadian Security Intelligence Service have employed this method.[8]


Previous data-mining programs aimed at stopping terrorism under the US government include the Terrorism Information Awareness (TIA) program, the Computer-Assisted Passenger Prescreening System (CAPPS II), Analysis, Dissemination, Visualization, Insight, and Semantic Enhancement (ADVISE), the Multistate Anti-Terrorism Information Exchange (MATRIX), and the Secure Flight program. These programs have been discontinued due to controversy over whether they violate the Fourth Amendment to the US Constitution.


Games

Since the early 1960s, with the availability of oracles for certain combinatorial games, also called tablebases (e.g. for 3x3 chess with any beginning configuration, small-board dots-and-boxes, small-board hex, and certain endgames in chess, dots-and-boxes, and hex), a new area for data mining has been opened up: the extraction of human-usable strategies from these oracles. Current pattern recognition approaches do not seem to fully attain the high level of abstraction required to be applied successfully. Instead, extensive experimentation with the tablebases, combined with an intensive study of tablebase answers to well-designed problems and with knowledge of prior art, i.e. pre-tablebase knowledge, is used to yield insightful patterns. Elwyn Berlekamp in dots-and-boxes and John Nunn in chess endgames are notable examples of researchers doing this work, though they were not and are not involved in tablebase generation.


Business

Data mining in customer relationship management applications can contribute significantly to the bottom line. Rather than contacting a prospect or customer through a call center or sending mail indiscriminately, only prospects that are predicted to have a high likelihood of responding to an offer are contacted. More sophisticated methods may be used to optimize across campaigns so that one can predict which channel and which offer an individual is most likely to respond to, across all potential offers. Finally, in cases where many people will take an action without an offer, uplift modeling can be used to determine which people will show the greatest increase in response if given an offer. Data clustering can also be used to automatically discover the segments or groups within a customer data set, as in the sketch below.
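A minimal sketch of such segment discovery, assuming a hypothetical customer table with three numeric attributes; k-means is used here simply as one common clustering method, not as the specific technique any particular product applies.

```python
import numpy as np
from sklearn.cluster import KMeans

# hypothetical customer table: columns = [age, yearly spend, visits per month]
rng = np.random.default_rng(0)
customers = np.vstack([
    rng.normal([30, 200, 2], [5, 50, 1], size=(100, 3)),    # occasional shoppers
    rng.normal([45, 1500, 8], [8, 300, 2], size=(100, 3)),  # frequent high spenders
])

# discover segments without predefining them
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(customers)
for label in np.unique(kmeans.labels_):
    segment = customers[kmeans.labels_ == label]
    print(f"segment {label}: {len(segment)} customers, mean spend {segment[:, 1].mean():.0f}")
```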


Businesses employing data mining quickly see a return on investment, but they also recognize that the number of predictive models can quickly become very large. Rather than one model to predict which customers will churn, a business could build a separate model for each region and customer type. Then, instead of sending an offer to all people that are likely to churn, it may want to send offers only to customers that are likely to take up the offer. Finally, it may also want to determine which customers are going to be profitable over a window of time and send offers only to those that are likely to be profitable. In order to maintain this quantity of models, businesses need to manage model versions and move to automated data mining.
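The following sketch shows the basic targeting idea under stated assumptions: a synthetic churn data set, a single logistic-regression model standing in for what would in practice be many per-region, per-customer-type models, and a simple rule of contacting only the highest-risk customers.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# hypothetical features: [months as customer, support calls, monthly charges]
rng = np.random.default_rng(1)
X = rng.normal(size=(1000, 3))
# synthetic churn label, loosely driven by support calls and charges
y = (X[:, 1] + 0.5 * X[:, 2] + rng.normal(scale=0.5, size=1000) > 1).astype(int)

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
model = LogisticRegression().fit(X_train, y_train)

# contact only the customers with the highest predicted churn risk
churn_prob = model.predict_proba(X_test)[:, 1]
top = np.argsort(churn_prob)[::-1][:20]
print("customers to target:", top)
```

In practice, this fragment would be repeated per segment (region, customer type), which is exactly why model versioning and automation become necessary as the model count grows.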


Data mining can also be helpful to human-resources departments in identifying the characteristics of their most successful employees. Information obtained, such as universities attended by highly successful employees, can help HR focus recruiting efforts accordingly. Additionally, Strategic Enterprise Management applications help a company translate corporate-level goals, such as profit and margin share targets, into operational decisions, such as production plans and workforce levels.[9]


Another example of data mining, often called market basket analysis, relates to its use in retail sales. If a clothing store records the purchases of customers, a data-mining system could identify those customers who favour silk shirts over cotton ones. Although some explanations of such relationships may be difficult, taking advantage of them is easier. The example deals with association rules within transaction-based data. Not all data are transaction based, and logical or inexact rules may also be present within a database. In a manufacturing application, an inexact rule may state that 73% of products which have a specific defect or problem will develop a secondary problem within the next six months.
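A minimal illustration of association-rule mining on transaction data, restricted to single-item rules over item pairs and using hypothetical purchases; production systems typically use Apriori or FP-growth over itemsets of arbitrary size.

```python
from itertools import combinations
from collections import Counter

# hypothetical purchase transactions from a clothing store
transactions = [
    {"silk shirt", "tie"}, {"silk shirt", "tie", "belt"},
    {"cotton shirt", "jeans"}, {"silk shirt", "belt"},
    {"cotton shirt", "tie"}, {"silk shirt", "tie"},
]

def pair_rules(transactions, min_support=0.3, min_confidence=0.6):
    """Mine rules of the form {A} -> {B} from item pairs, using support and confidence."""
    n = len(transactions)
    item_counts = Counter(item for t in transactions for item in t)
    pair_counts = Counter(frozenset(p) for t in transactions for p in combinations(sorted(t), 2))
    rules = []
    for pair, count in pair_counts.items():
        if count / n < min_support:
            continue
        a, b = tuple(pair)
        for lhs, rhs in ((a, b), (b, a)):
            confidence = count / item_counts[lhs]
            if confidence >= min_confidence:
                rules.append((lhs, rhs, count / n, confidence))
    return rules

for lhs, rhs, support, confidence in pair_rules(transactions):
    print(f"{lhs} -> {rhs}  support={support:.2f}  confidence={confidence:.2f}")
```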


Related to an integrated-circuit production line, an example of data mining is described in the paper "Mining IC Test Data to Optimize VLSI Testing."[10] The paper describes the application of data mining and decision analysis to the problem of die-level functional test. Experiments reported there demonstrate that mining historical die-test data can produce a probabilistic model of patterns of die failure, which is then used to decide in real time which die to test next and when to stop testing. This system has been shown, based on experiments with historical test data, to have the potential to improve profits on mature IC products.
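The sketch below is a much-simplified stand-in for that approach: it merely estimates per-position failure probabilities from hypothetical historical data and orders testing accordingly, whereas the cited paper builds a richer probabilistic model and a decision-analytic stopping rule.

```python
import numpy as np

# hypothetical historical pass/fail data: rows = wafers, columns = die positions
rng = np.random.default_rng(2)
history = rng.random((500, 64)) < np.linspace(0.02, 0.30, 64)  # True = die failed

# simple probabilistic model: per-position failure rate estimated from history
fail_prob = history.mean(axis=0)

def test_order(fail_prob, stop_threshold=0.05):
    """Test the most failure-prone die positions first and stop once the
    remaining positions all fall below a failure-probability threshold."""
    order = np.argsort(fail_prob)[::-1]
    return [int(i) for i in order if fail_prob[i] >= stop_threshold]

print("positions to test, in order:", test_order(fail_prob)[:10])
```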


Science and engineering

In recent years, data mining has been widely used in areas of science and engineering, such as bioinformatics, genetics, medicine, education, and electrical power engineering.


In the study of human genetics, an important goal is to understand the mapping relationship between inter-individual variation in human DNA sequences and variability in disease susceptibility. In lay terms, it is to find out how changes in an individual's DNA sequence affect the risk of developing common diseases such as cancer. This is very important to help improve the diagnosis, prevention, and treatment of such diseases. One data mining technique used to perform this task is known as multifactor dimensionality reduction.[11]
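A minimal sketch of the core MDR step, pooling the multi-locus genotypes of a SNP pair into high-risk and low-risk groups by comparing each combination's case:control ratio with the overall ratio; the toy genotype data are hypothetical, and real MDR implementations add cross-validation and permutation testing.

```python
from collections import defaultdict
from itertools import combinations

def mdr_risk_labels(genotypes, labels, snp_pair):
    """Pool the multi-locus genotypes of one SNP pair into high/low risk
    using the case:control ratio, as in multifactor dimensionality reduction."""
    counts = defaultdict(lambda: [0, 0])   # genotype combo -> [cases, controls]
    for row, y in zip(genotypes, labels):
        combo = tuple(row[i] for i in snp_pair)
        counts[combo][0 if y == 1 else 1] += 1
    total_cases = sum(labels)
    total_controls = len(labels) - total_cases
    threshold = total_cases / max(total_controls, 1)
    return {combo: (c / max(n, 1)) >= threshold for combo, (c, n) in counts.items()}

def mdr_accuracy(genotypes, labels, snp_pair):
    """Classification accuracy of the pooled high/low-risk attribute."""
    risk = mdr_risk_labels(genotypes, labels, snp_pair)
    correct = 0
    for row, y in zip(genotypes, labels):
        combo = tuple(row[i] for i in snp_pair)
        pred = 1 if risk.get(combo, False) else 0
        correct += (pred == y)
    return correct / len(labels)

# toy data: rows are individuals, columns are SNP genotypes coded 0/1/2, label 1 = case
geno = [[0, 1, 2], [1, 1, 0], [2, 0, 1], [0, 2, 2], [1, 0, 0], [2, 2, 1]]
case = [1, 0, 1, 1, 0, 0]
best = max(combinations(range(3), 2), key=lambda p: mdr_accuracy(geno, case, p))
print("best SNP pair:", best, "accuracy:", mdr_accuracy(geno, case, best))
```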


In the area of electrical power engineering, data mining techniques have been widely used for condition monitoring of high-voltage electrical equipment. The purpose of condition monitoring is to obtain valuable information on the health status of the equipment's insulation. Data clustering techniques such as the self-organizing map (SOM) have been applied to vibration monitoring and analysis of transformer on-load tap changers (OLTCs). Using vibration monitoring, it can be observed that each tap change operation generates a signal that contains information about the condition of the tap changer contacts and the drive mechanisms. Different tap positions naturally generate different signals, but there is also considerable variability amongst normal-condition signals for the exact same tap position. SOM has been applied to detect abnormal conditions and to estimate the nature of the abnormalities.[12]
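A minimal SOM sketch in Python/NumPy is shown below; the random vectors stand in for features extracted from tap-changer vibration signatures (a hypothetical substitute for the measurements used in the cited work), and the quantization error of a new signature against the trained map is one simple way to flag abnormal conditions.

```python
import numpy as np

def train_som(data, grid_shape=(8, 8), epochs=50, lr0=0.5, sigma0=3.0, seed=0):
    """Train a small self-organizing map on feature vectors of shape (n_samples, n_features)."""
    rng = np.random.default_rng(seed)
    rows, cols = grid_shape
    weights = rng.normal(size=(rows, cols, data.shape[1]))
    # grid coordinates of each map node, used for neighbourhood distances
    coords = np.stack(np.meshgrid(np.arange(rows), np.arange(cols), indexing="ij"), axis=-1)
    n_steps = epochs * len(data)
    step = 0
    for _ in range(epochs):
        for x in rng.permutation(data):
            t = step / n_steps
            lr = lr0 * (1 - t)               # decaying learning rate
            sigma = sigma0 * (1 - t) + 0.5   # decaying neighbourhood radius
            # best-matching unit: node whose weight vector is closest to x
            dists = np.linalg.norm(weights - x, axis=-1)
            bmu = np.unravel_index(np.argmin(dists), dists.shape)
            # Gaussian neighbourhood around the BMU on the map grid
            grid_d2 = np.sum((coords - np.array(bmu)) ** 2, axis=-1)
            h = np.exp(-grid_d2 / (2 * sigma ** 2))[..., None]
            weights += lr * h * (x - weights)
            step += 1
    return weights

def quantization_error(weights, x):
    """Distance from a sample to its best-matching unit; unusually large
    values can flag abnormal signatures."""
    return np.min(np.linalg.norm(weights - x, axis=-1))

# toy usage: train on synthetic "vibration signatures" and score one of them
signatures = np.random.default_rng(1).normal(size=(200, 16))
som = train_som(signatures)
print(quantization_error(som, signatures[0]))
```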


Data mining techniques have also been applied to dissolved gas analysis (DGA) of power transformers. DGA, as a diagnostic technique for power transformers, has been available for decades. Data mining techniques such as SOM have been applied to analyse the data and to determine trends which are not obvious to standard DGA ratio techniques such as the Duval Triangle.[13]


A fourth area of application for data mining in science/engineering is within educational research, where data mining has been used to study the factors leading students to choose to engage in behaviors which reduce their learning[14] and to understand the factors influencing university student retention.[15]


Other examples of data mining applications include the analysis of biomedical data facilitated by domain ontologies,[16] mining clinical trial data,[17] and traffic analysis using SOM.[18]


A typical example

A useful example is the large database of thousands of photographs of all kinds, collected and expanded almost daily for more than a decade by the Norwegian Svein Ulvund. This huge database is available at www.vossnow.net [1] (Voss is a town in the Norwegian county of Hordaland, near Bergen).


Exploiting that photo collection is not easy. As a typical task, one may want to find photos of a specific place, e.g. the famous Nærøyfjord nearby. The name of that fjord contains letters that are not easily typed on a non-Norwegian keyboard. To get around this, one can instead identify towns on the fjord whose names lack such letters, e.g. Gudvangen, using a conventional map or the internet. One then enters the search page of the database with two entries: Gudv (which suffices for Gudvangen) and the year and month, e.g. Mai 2006 (not May 2006, since the search engine "speaks Norsk"). The result is still a rather large collection of photos from the Nærøyfjord, but the number is now reduced enough that a final selection, whether personal or automated, becomes feasible.


This simple example also makes two points obvious: (i) the necessity of automating the process and (ii) some of the problems involved.


See also

  • Data warehouse
  • Data analysis

References

  1. ^ W. Frawley, G. Piatetsky-Shapiro and C. Matheus (Fall 1992). "Knowledge Discovery in Databases: An Overview". AI Magazine: pp. 213–228. ISSN 0738-4602.
  2. ^ D. Hand, H. Mannila, P. Smyth (2001). Principles of Data Mining. MIT Press, Cambridge, MA. ISBN 0-262-08290-X.
  3. ^ Ellen Monk, Bret Wagner (2006). Concepts in Enterprise Resource Planning, Second Edition. Thomson Course Technology, Boston, MA. ISBN 0-619-21663-8.
  4. ^ Kantardzic, Mehmed (2003). Data Mining: Concepts, Models, Methods, and Algorithms. John Wiley & Sons. ISBN 0471228524.
  5. ^ Chip Pitts (March 15, 2007). "The End of Illegal Domestic Spying? Don't Count on It". Wash. Spec.
  6. ^ K.A. Taipale (December 15, 2003). "Data Mining and Domestic Security: Connecting the Dots to Make Sense of Data". Colum. Sci. & Tech. L. Rev. 5 (2). SSRN 546782 / OCLC 45263753.
  7. ^ John Resig, Ankur Teredesai (2004). "A Framework for Mining Instant Messaging Services". In Proceedings of the 2004 SIAM DM Conference.
  8. ^ Stephen Haag et al. Management Information Systems for the Information Age, p. 28. ISBN 0-07-095569-7.
  9. ^ Ellen Monk, Bret Wagner (2006). Concepts in Enterprise Resource Planning, Second Edition. Thomson Course Technology, Boston, MA. ISBN 0-619-21663-8.
  10. ^ http://web.engr.oregonstate.edu/~tgd/publications/kdd2000-dlft.pdf
  11. ^ Xingquan Zhu, Ian Davidson (2007). Knowledge Discovery and Data Mining: Challenges and Realities. Hershey, New York, p. 18. ISBN 978-1-59904-252-7.
  12. ^ A.J. McGrail, E. Gulski et al. "Data Mining Techniques to Assess the Condition of High Voltage Electrical Plant". CIGRE WG 15.11 of Study Committee 15.
  13. ^ A.J. McGrail, E. Gulski et al. "Data Mining Techniques to Assess the Condition of High Voltage Electrical Plant". CIGRE WG 15.11 of Study Committee 15.
  14. ^ R. Baker. "Is Gaming the System State-or-Trait? Educational Data Mining Through the Multi-Contextual Application of a Validated Behavioral Model". Workshop on Data Mining for User Modeling 2007.
  15. ^ J.F. Superby, J-P. Vandamme, N. Meskens. "Determination of factors influencing the achievement of the first-year university students using data mining methods". Workshop on Educational Data Mining 2006.
  16. ^ Xingquan Zhu, Ian Davidson (2007). Knowledge Discovery and Data Mining: Challenges and Realities. Hershey, New York, pp. 163–189. ISBN 978-1-59904-252-7.
  17. ^ Xingquan Zhu, Ian Davidson (2007). Knowledge Discovery and Data Mining: Challenges and Realities. Hershey, New York, pp. 31–48. ISBN 978-1-59904-252-7.
  18. ^ Yudong Chen, Yi Zhang, Jianming Hu, Xiang Li. "Traffic Data Analysis Using Kernel PCA and Self-Organizing Map". Intelligent Vehicles Symposium, 2006 IEEE.


Sources

  • Kurt Thearling, An Introduction to Data Mining (also available is a corresponding online tutorial)
  • Dean Abbott, I. Philip Matkovsky, and John Elder IV, An Evaluation of High-end Data Mining Tools for Fraud Detection, a comparative analysis of major high-end data mining software tools presented at the 1998 IEEE International Conference on Systems, Man, and Cybernetics, San Diego, CA, October 12–14, 1998.
  • Mierswa, Ingo and Wurst, Michael and Klinkenberg, Ralf and Scholz, Martin and Euler, Timm: YALE: Rapid Prototyping for Complex Data Mining Tasks, in Proceedings of the 12th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD-06), 2006.
  • Peng, Y., Kou, G., Shi, Y. and Chen, Z. "A Systemic Framework for the Field of Data Mining and Knowledge Discovery", in Proceeding of workshops on The Sixth IEEE International Conference on Data Mining Technique (ICDM), 2006

Books

  • Peter Cabena, Pablo Hadjnian, Rolf Stadler, Jaap Verhees, Alessandro Zanasi, Discovering Data Mining: From Concept to Implementation (1997), Prentice Hall, ISBN 0137439806
  • Ronen Feldman and James Sanger, The Text Mining Handbook, Cambridge University Press, ISBN 9780521836579
  • Pang-Ning Tan, Michael Steinbach and Vipin Kumar, Introduction to Data Mining (2005), ISBN 0-321-32136-7 (companion book site)
  • Bing Liu, "Web Data Mining: Exploring Hyperlinks, Contents and Usage Data", Springer, 2007, ISBN-10: 3-540-37881-2
  • Galit Shmueli, Nitin R. Patel and Peter C. Bruce, Data Mining for Business Intelligence (2006), ISBN 0-470-08485-5 (companion book site)
  • Richard O. Duda, Peter E. Hart, David G. Stork, Pattern Classification, Wiley Interscience, ISBN 0-471-05669-3, (see also Powerpoint slides)
  • Phiroz Bhagat, Pattern Recognition in Industry, Elsevier, ISBN 0-08-044538-1
  • Ian Witten and Eibe Frank, Data Mining: Practical Machine Learning Tools and Techniques with Java Implementations (2000), ISBN 1-55860-552-5, (see also Free Weka software)
  • Mark F. Hornick, Erik Marcade, Sunil Venkayala: "Java Data Mining: Strategy, Standard, And Practice: A Practical Guide for Architecture, Design, And Implementation" (paperback)
  • Weiss and Indurkhya, Predictive Data Mining, Morgan Kaufman
  • Yike Guo and Robert Grossman, editors: High Performance Data Mining: Scaling Algorithms, Applications and Systems, Kluwer Academic Publishers, 1999
  • Trevor Hastie, Robert Tibshirani and Jerome Friedman (2001). The Elements of Statistical Learning, Springer. ISBN 0387952845 (companion book site)
  • Pascal Poncelet, Florent Masseglia and Maguelonne Teisseire (Editors). Data Mining Patterns: New Methods and Applications, Information Science Reference, ISBN 978-1599041629, (October 2007).


External links

  • Data mining at the Open Directory Project
