Data Mining vs. OLAP
Data mining and OLAP are two common Business Intelligence (BI) technologies. Business intelligence refers to computer-based methods for identifying and extracting useful information from business data. Data mining is the field of computer science that deals with extracting interesting patterns from large data sets. It combines methods from artificial intelligence, statistics, and database management. OLAP (online analytical processing), as the name suggests, is a collection of ways to query multidimensional databases.
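Data mining is also known as Knowledge Discovery in Data (KDD). As mentioned above, it is the field of computer science that deals with extracting previously unknown and interesting information from raw data. Because data has grown exponentially, especially in areas such as business, manual extraction of patterns has become all but impossible over the past few decades, and data mining has become a very important tool for converting large volumes of data into business intelligence. It is currently used in applications such as social network analysis, fraud detection, and marketing. Data mining usually deals with the following four tasks: clustering, classification, regression, and association. Clustering is identifying groups of similar items in unstructured data. Classification is learning rules that can be applied to new data, and it typically includes the following steps: data preprocessing, model design, learning/feature selection, and evaluation/validation. Regression is finding the function that models the data with the least error. Association is looking for relationships between variables. Data mining is usually used to answer questions such as: which products are likely to bring Wal-Mart the highest profit next year?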
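Of the four tasks, association is perhaps the easiest to illustrate in a few lines. The following is a minimal sketch in Python, assuming a made-up list of shopping transactions. The item names and the helper functions count, support, and confidence are hypothetical and chosen only for illustration. It scores a candidate rule "bread -> milk" by its support (how often both items appear together) and its confidence (how often milk appears when bread does).

```python
# Hypothetical market-basket data: each set is one customer's transaction.
transactions = [
    {"bread", "milk"},
    {"bread", "diapers", "beer"},
    {"milk", "diapers", "beer"},
    {"bread", "milk", "diapers"},
    {"bread", "milk", "beer"},
]


def count(itemset, transactions):
    """Number of transactions that contain every item in itemset."""
    items = set(itemset)
    return sum(1 for t in transactions if items <= t)


def support(itemset, transactions):
    """Fraction of all transactions that contain the itemset."""
    return count(itemset, transactions) / len(transactions)


def confidence(antecedent, consequent, transactions):
    """Among transactions containing the antecedent, the share that
    also contain the consequent."""
    both = set(antecedent) | set(consequent)
    return count(both, transactions) / count(antecedent, transactions)


rule_support = support({"bread", "milk"}, transactions)
rule_confidence = confidence({"bread"}, {"milk"}, transactions)
print(f"bread -> milk: support={rule_support:.2f}, confidence={rule_confidence:.2f}")
```

Real association-rule miners search over many candidate itemsets and prune those with low support; this sketch only evaluates a single rule that was picked by hand.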
OLAP is a class of systems that provide answers to multidimensional queries. Typically, OLAP is used for marketing, budgeting, forecasting, and similar applications. The databases used for OLAP are configured for complex, ad hoc queries with fast performance in mind. The output of an OLAP query is typically displayed as a matrix whose rows and columns are formed by the dimensions of the query. OLAP systems often aggregate over multiple tables to obtain summaries. For example, OLAP can be used to answer questions such as: How do this year's sales at Wal-Mart compare with last year's? What is the prediction for sales in the next quarter? What can be said about the trend by looking at the percentage change?
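The kind of summary described here can be sketched in plain Python. This is only an illustration under stated assumptions: the sales rows, the two dimensions (year and region), and the helper names pivot, rollup_by_year, and percent_change are all hypothetical, and a real OLAP engine would run such queries against a multidimensional store rather than an in-memory list.

```python
from collections import defaultdict

# Hypothetical fact rows: (year, region, sales amount).
sales = [
    (2022, "East", 120.0),
    (2022, "West", 80.0),
    (2022, "East", 30.0),
    (2023, "East", 180.0),
    (2023, "West", 100.0),
    (2023, "West", 20.0),
]


def pivot(rows):
    """Sum sales into cells keyed by the two dimensions (region, year)."""
    cube = defaultdict(float)
    for year, region, amount in rows:
        cube[(region, year)] += amount
    return dict(cube)


def rollup_by_year(cube):
    """Aggregate away the region dimension, leaving totals per year."""
    totals = defaultdict(float)
    for (_region, year), amount in cube.items():
        totals[year] += amount
    return dict(totals)


def percent_change(old, new):
    """Percentage change from old to new."""
    return (new - old) / old * 100.0


cube = pivot(sales)
totals = rollup_by_year(cube)
years = sorted(totals)
regions = sorted({region for region, _year in cube})

# Display the result as a matrix: regions as rows, years as columns.
print("region".ljust(8) + "".join(str(y).rjust(10) for y in years))
for region in regions:
    cells = "".join(f"{cube.get((region, y), 0.0):10.1f}" for y in years)
    print(region.ljust(8) + cells)
print("total".ljust(8) + "".join(f"{totals[y]:10.1f}" for y in years))

change = percent_change(totals[years[0]], totals[years[-1]])
print(f"change {years[0]} -> {years[-1]}: {change:.1f}%")
```

Every number in the output comes from summing existing rows and comparing the sums, which is the aggregation-centered style of analysis that OLAP is built for.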
Although data mining and OLAP are similar in that both operate on data to gain intelligence, the main difference lies in how they operate on it. OLAP tools provide multidimensional data analysis and summaries of the data, whereas data mining focuses on ratios, patterns, and influences within the data set. That is, OLAP deals with aggregation, which boils down to operating on data via "addition", while data mining corresponds to "division". Another notable difference is that data mining tools model data and return actionable rules, whereas OLAP compares and contrasts data along business dimensions in real time.