An Improved Approach to Discover High Utility Item Set from Large Data Set
Keywords:
Utility, Candidates, Transactions, Thresholds, Item set
Abstract
The term data mining is often applied to two separate processes: knowledge discovery and prediction. Knowledge discovery provides explicit information in a readable form that a user can understand. Forecasting, or predictive modeling, provides predictions of future events; it may be transparent and readable in some approaches (e.g., rule-based systems) and opaque in others, such as neural networks. Moreover, some data mining systems, such as neural networks, are inherently geared towards prediction and pattern recognition rather than knowledge discovery. Utility item set mining is an extension of frequent pattern mining. The goal of high utility item set mining is to find all item sets whose utility is greater than or equal to a user-specified threshold. The deficiency of this approach is that it does not consider the statistical aspect of item sets. Utility-based measures should incorporate user-defined utility as well as the raw statistical aspects of the data. Consequently, it is meaningful to define a specialized form of high utility item sets, utility-frequent item sets, which are a subset of both the high utility item sets and the frequent item sets. In this paper we propose an efficient approach to mining high utility item sets from transactional records.
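To make the definitions above concrete, the following is a minimal brute-force sketch in Python, not the approach proposed in this paper. It assumes the common convention that the utility of an item in a transaction is its purchased quantity times a per-unit profit, and it reads "utility-frequent" as the plain intersection of the high utility and frequent item sets. The toy transactions, the profit table, the function names, and both thresholds are hypothetical and chosen only for illustration.

```python
from itertools import combinations

# Hypothetical toy data: each transaction maps item -> purchased quantity.
transactions = [
    {"a": 1, "b": 2},
    {"a": 2, "c": 1},
    {"a": 1, "b": 1, "c": 1},
    {"b": 3},
]
# Hypothetical per-unit profit of each item (an assumed utility model).
unit_profit = {"a": 5, "b": 2, "c": 10}


def itemset_utility(itemset, transactions, unit_profit):
    """Sum quantity * unit profit over the transactions that contain every item."""
    total = 0
    for t in transactions:
        if all(item in t for item in itemset):
            total += sum(t[item] * unit_profit[item] for item in itemset)
    return total


def itemset_support(itemset, transactions):
    """Count the transactions that contain every item of the item set."""
    return sum(1 for t in transactions if all(item in t for item in itemset))


def utility_frequent_itemsets(transactions, unit_profit, min_utility, min_support):
    """Enumerate all item sets and keep those meeting both thresholds.

    Exhaustive enumeration is exponential in the number of items, so this
    is only suitable for tiny examples.
    """
    items = sorted({item for t in transactions for item in t})
    result = {}
    for size in range(1, len(items) + 1):
        for itemset in combinations(items, size):
            utility = itemset_utility(itemset, transactions, unit_profit)
            support = itemset_support(itemset, transactions)
            if utility >= min_utility and support >= min_support:
                result[itemset] = (utility, support)
    return result


found = utility_frequent_itemsets(transactions, unit_profit, min_utility=17, min_support=2)
for itemset, (utility, support) in found.items():
    print(itemset, "utility =", utility, "support =", support)
```

With these assumed thresholds, the item set (a, b, c) reaches the utility threshold but appears in only one transaction, so it is a high utility item set without being utility-frequent. This is the distinction the statistical criterion is meant to capture.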