Foreign-Literature Translation for a Graduation Project on Retail Supermarkets (.doc)

Graduation Project (Thesis): Foreign-Literature Translation

Title: A Data Mining Framework for Optimal Product Selection in Retail Supermarket Data: The Generalized PROFSET Model
Major: Network Engineering

Appendix

English Original

A Data Mining Framework for Optimal Product Selection in Retail Supermarket Data: The Generalized PROFSET Model

1 Introduction

Since almost all mid- to large-size retailers today possess electronic sales transaction systems, retailers realize that competitive advantage will no longer be achieved by the mere use of these systems for purposes of inventory management or facilitating customer check-out. In contrast, competitive advantage will be gained by those retailers who are able to extract the knowledge hidden in the
data generated by those systems and use it to optimize their marketing decision making. In this context, knowledge about how customers are using the retail store is of critical importance, and distinctive competencies will be built by those retailers who best succeed in extracting actionable knowledge from these data. Association rule mining [2] can help retailers to efficiently extract this knowledge from large retail databases. We assume some familiarity with the basic notions of association rule mining.

In recent years, a lot of effort in the area of retail market basket analysis has been invested in the development of techniques to increase the interestingness of association rules. Currently, three research tracks that study the interestingness of association rules can be distinguished.

First, a number of objective measures of interestingness have been developed in order to filter out non-interesting association rules based on a number of statistical properties of the rules, such as support and confidence [2], interest [14], intensity of implication [7], J-measure [15], and correlation [12]. Other measures are based on the syntactical properties of the rules [11], or they are used to discover the least-redundant set of rules [4]. Second, it was recognized that domain knowledge may also play an important role in determining the interestingness of association rules. Therefore, a number of subjective measures of interestingness have been put forward, such as unexpectedness [13], actionability [1] and rule templates [10]. Finally, the most recent stream of research advocates the evaluation of the interestingness of associations in the light of the micro-economic framework of the retailer [9]. More specifically, a pattern in the data is considered interesting only to the extent to which it can be used in the decision-making process of the enterprise to increase its utility.

It is in this latter stream of research that the authors have previously developed a model for product selection called PROFSET [3], which takes into account both quantitative and qualitative elements of
retail domain knowledge in order to determine the set of products that yields maximum cross-selling profits. The key idea of the model is that products should not be selected based on their individual profitability, but rather on the total profitability that they generate, including profits from cross-selling. However, in its previous form, one major drawback of the model was its inability to deal with supermarket data (i.e., large baskets). To overcome this limitation, in this paper we will propose an important generalization of the existing PROFSET model that will effectively deal with large baskets. Furthermore, we generalize the model to include category management principles specified by the retailer in order to make the output of the model even more realistic.

The remainder of the paper is organized as follows. In Section 2 we will focus on the limitations of the previous PROFSET model for product selection. In Section 3, we will introduce the generalized PROFSET model. Section 4 will be devoted to the empirical implementation of the model and its results on real-world supermarket data. Finally, Section 5 will be reserved for conclusions and further research.

2 The PROFSET Model

The key idea of the PROFSET model is that when evaluating the business value of a product, one should not only look at the individual profits generated by that product (the naive approach), but one must also take into account the profits due to cross-selling effects with other products in the assortment. Therefore, to evaluate product profitability, it is essential to look at frequent sets rather than at individual product items since the former represent frequently co-occurring product combinations in the market baskets of the customer. As was also stressed by Cabena et al. [5], one disadvantage of associations discovery is that there is no provision for taking into account the business value of an association. The PROFSET model was a first attempt to solve this problem. Indeed, in terms of the associations discovered, the sale of an expensive bottle of wine with oysters accounts for as
much as the sale of a carton of milk with cereal. This example illustrates that, when evaluating the interestingness of associations, the micro-economic framework of the retailer should be incorporated. PROFSET was developed to maximize cross-selling opportunities by evaluating the profit margin generated per frequent set of products, rather than per product. In the next section we will discuss the limitations of the previous PROFSET model. More details can be found elsewhere [3].

2.1 Limitations

The previous PROFSET model was specifically developed for market basket data from automated convenience stores. Data sets of this origin are characterized by small market baskets (size 2 or 3) because customers typically do not purchase many items during a single shopping visit. Therefore, the profit margin generated per frequent purchase combination (X) could accurately be approximated by adding the profit margins of the market baskets (Tj) containing the same set of items, i.e. X = Tj. However, for supermarket data, the existing formulation of the PROFSET model poses significant problems since the size of market baskets typically exceeds the size of frequent item sets. Indeed, in supermarket data, frequent item sets mostly do not contain more than 7 different products, whereas the size of the average market basket is typically 10 to 15. As a result, the existing profit allocation heuristic cannot be used anymore since it would cause the model to heavily underestimate the profit potential from cross-selling effects between products. However, getting rid of this heuristic is not trivial and it will be discussed in detail in Section 3.1.

A second limitation of the existing PROFSET model relates to principles of category management. Indeed, there is an increasing trend in retailing to manage product categories as separate strategic business units [6]. In other words, because of the trend to offer more products, retailers can no longer evaluate and manage each product individually. Instead, they define product categories and define marketing actions (such as promotions or store layout) on the level of these categories. The generalized PROFSET model takes this domain knowledge into account and therefore offers the retailer the ability to specify product categories and place restrictions on them.

3 The Generalized PROFSET Model

In this section, we will highlight the improvements
being made to the previous PROFSET model [3].

3.1 Profit Allocation

Avoiding the equality constraint X = Tj results in different possible profit allocation systems. Indeed, it is important to recognize that the margin of transaction Tj can potentially be allocated to different frequent subsets of that transaction. In other words, how should the margin m(Tj) be allocated to one or more different frequent subsets of Tj?

The idea here is that we would like to know the purchase intentions of the customer who bought Tj. Unfortunately, since the customer has already left the store, we do not possess this information. However, if we can assume that some items occur more frequently together than others because they are considered complementary by customers, then frequent item sets may be interpreted as purchase intentions of customers. Consequently, there is the additional problem of finding out which and how many purchase intentions are represented in a particular transaction Tj. Indeed, a transaction may contain several frequent subsets of different sizes, so it is not straightforward to determine which frequent sets represent the underlying purchase intentions of the customer at the time
of shopping. Before proposing a solution to this problem, we will first define the concept of a maximal frequent subset of a transaction.

Definition 1. Let F be the collection of all frequent subsets of a sales transaction Tj. Then $X \in F$ is called maximal, denoted as $X_{max}$, if and only if $\forall Y \in F: |Y| \le |X|$.

Using this definition, we will adopt the following rationale to allocate the margin m(Tj) of a sales transaction Tj. If there exists a frequent set X = Tj, then we allocate m(Tj) to M(X), just as in the previous PROFSET model. However, if there is no such frequent set, then one maximal frequent subset X will be drawn from all maximal frequent subsets according to the probability distribution $\Theta_{T_j}$, with

$$\Theta_{T_j}(X_{max}) = \frac{\operatorname{supp}(X_{max})}{\sum_{Y_{max} \subseteq T_j} \operatorname{supp}(Y_{max})}$$

After this, the margin m(X) is assigned to M(X) and the process is repeated for Tj \ X. In summary, the margin of each transaction is allocated by repeatedly drawing a maximal frequent subset and removing its items, until the transaction no longer contains a frequent set.

Table 1 contains all frequent subsets of T for a particular transaction database. In this example, there is no unique maximal frequent subset of T. Indeed, there are two maximal frequent subsets of T, namely {cola, peanuts} and {peanuts, cheese}. Consequently, it is not obvious to which maximal frequent subset the profit margin m(T) should be allocated. Moreover, we would not allocate the entire profit margin m(T)
to the selected item set, but rather the proportion m(X) that corresponds to the items contained in the selected maximal subset. Now how can one determine to which of the two frequent subsets of T this margin should be allocated? As we have already discussed, the crucial idea here is that it really depends on what the purchase intentions of the customer who purchased T were. Unfortunately, one can never know exactly since we haven't asked the customer at the time of purchase. However, the support of the frequent subsets of T may provide some probabilistic estimation. Indeed, if the support of a frequent subset is an indicator for the probability of occurrence of this purchase combination, then according to the data, customers buy the maximal subset {cola, peanuts} two times more frequently than the maximal subset {peanuts, cheese}. Consequently, we can say that it is more likely that the customer's purchase intention has been {cola, peanuts} instead of {peanuts, cheese}. This information is used to construct the probability distribution $\Theta_T$, reflecting the relative frequencies of the frequent subsets of T. Now, each time a sales transaction {cola, peanuts, cheese} is encountered in the data, a random draw from the probability distribution $\Theta_T$ will select a likely purchase intention (i.e. frequent subset) for that transaction. Consequently, on average in two of the three times this transaction is encountered, maximal subset {cola, peanuts} will be selected and m({cola, peanuts}) will be allocated to M({cola, peanuts}). After this, T is split up as follows: T := T \ {cola, peanuts}, and the process of assigning the remaining margin is repeated as if the new T were a separate transaction, until T does not contain a frequent set anymore.

3.2 Category Management Restrictions

As pointed out in Section 2.1, a second limitation of the previous PROFSET model is its inability to include category management restrictions. This sometimes causes the model to exclude all products from one or more categories because they do not contribute enough to the overall profitability of the optimal set. This often contradicts the mission of retailers to offer customers a wide range of products, even if some of those categories or products are not profitable enough. Indeed, customers expect supermarkets to carry a wide variety of products and cutting away categories / departments would be against
the customers' expectations about the supermarket and would harm the store's image. Therefore, we want to offer the retailer the ability to include category restrictions in the generalized PROFSET model. This can be accomplished by adding an additional index k to the product variable to account for category membership, and by adding constraints on the category level. Several kinds of category restrictions can be introduced: which and how many categories should be included in the optimal set, or how many products from each category should be included. The relevance of these restrictions can be illustrated by the following common practices in retailing. First, when composing a promotion leaflet, there is only limited space to display products and therefore it is important to optimize the product composition in order to maximize cross-selling effects between products and avoid product cannibalization. Moreover, according to the particular retail environment, the retailer will include or exclude specific products or product categories in the leaflet. For example, the supermarket in this study attempts to differentiate itself from the competition by the following image components: fresh, profitable and friendly. Therefore, the promotion leaflet of the retailer emphasizes product categories that support this image, such as fresh vegetables and meat, freshly-baked bread, ready-made meals, and others. Second, product category constraints may reflect shelf space allocations to products. For
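instance, large categories have more product facings than smaller categories. These kinds of constraints can easily be included in the generalized PROFSET model, as will be discussed hereafter.

Chinese Translation

A Data Mining Framework for Optimal Product Selection in Retail Supermarket Data: The Generalized PROFSET Model

Chapter 1 Introduction

Now that almost all mid-size and large retailers have electronic sales transaction systems, retailers recognize that competitive advantage no longer comes merely from using these systems for inventory management or to speed up customer check-out.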
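On the contrary, competitive advantage goes to whoever can extract the information hidden in the data these systems generate and use it to optimize marketing decisions. In this context, the retailers who best succeed in extracting actionable information from these data hold knowledge that is of critical importance in retailing, and they gain a distinctive competitive strength. Association rule mining [2] can help retailers extract this knowledge from large retail databases; we assume the reader is familiar with its basic concepts. In recent years, much effort in retail market basket analysis has gone into techniques for increasing the interestingness of association rules. Currently, a number of objective measures of interestingness have been developed to filter out uninteresting rules on the basis of their statistical properties, for example support and confidence, interest, intensity of implication, J-measure, and correlation. Other measures are based on the syntactic properties of the rules. Second, it has been recognized that domain knowledge plays a very important role in determining the interestingness of rules; accordingly, subjective measures of interestingness such as unexpectedness, actionability, and rule templates have been put forward. Finally, the most recent line of research evaluates the interestingness of associations in light of the retailer's micro-economic framework; more specifically, a pattern counts as interesting only insofar as it can be used in the enterprise's decision making to increase utility. It is within this last line of research that the authors previously developed a product selection model called PROFSET. It takes both quantitative and qualitative elements of retail domain knowledge into account in order to determine the set of products that yields the greatest profit. The key point of the model is that products should be selected not on their individual profitability but on the total profit they generate, including profit from cross-selling. In its original form, however, the model could not cope with supermarket data. To solve this problem, this paper introduces an important generalization of the existing PROFSET model that can be applied effectively to large baskets. Furthermore, we extend the model to include category management rules specified by the retailer, so that its output is more realistic. The rest of the paper is organized as follows. Chapter 2 describes the limitations of the previous PROFSET model. Chapter 3 introduces the generalized PROFSET model. Chapter 4 presents the application of the generalized PROFSET model to real-world supermarket data. Finally, Chapter 5 concludes the paper and outlines directions for future research.

Chapter 2 The PROFSET Model

The key idea of the PROFSET model is that, when evaluating the business value of a product, one should look not only at the profit the product generates on its own (the naive approach) but also at the profit from cross-selling with other products. Therefore, to evaluate product profitability, one must look at frequent sets rather than individual items, since frequent sets represent the product combinations that recur in customers' market baskets.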
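The profit allocation procedure described in Section 3.1 of the English original can be illustrated with a short Python sketch. It is only a minimal illustration under stated assumptions, not the authors' implementation: the function names, the support counts, and the per-item margins are hypothetical, maximal subsets are taken to be the frequent subsets of largest size, and m(X) is assumed to be the sum of fixed per-item margins.

```python
import random
from collections import defaultdict


def maximal_frequent_subsets(basket, support):
    """Return the frequent itemsets contained in `basket` that have the largest size."""
    contained = [x for x in support if x <= basket]
    if not contained:
        return []
    largest = max(len(x) for x in contained)
    return [x for x in contained if len(x) == largest]


def draw_distribution(maximal_sets, support):
    """Probability of each maximal subset, proportional to its support."""
    total = sum(support[x] for x in maximal_sets)
    return {x: support[x] / total for x in maximal_sets}


def allocate_margins(transactions, support, item_margin, rng=None):
    """Spread each basket's margin over randomly drawn maximal frequent subsets."""
    rng = rng or random.Random()
    allocated = defaultdict(float)  # plays the role of M(X)
    for basket in transactions:
        remaining = frozenset(basket)
        while True:
            maximal = maximal_frequent_subsets(remaining, support)
            if not maximal:
                break  # no frequent subset is left in this basket
            theta = draw_distribution(maximal, support)
            chosen = rng.choices(list(theta), weights=list(theta.values()), k=1)[0]
            # Assumption: m(X) is the sum of fixed per-item margins.
            allocated[chosen] += sum(item_margin[i] for i in chosen)
            remaining = remaining - chosen  # T := T \ X
    return dict(allocated)


# Hypothetical support counts and margins, chosen only for illustration.
support = {
    frozenset({"cola"}): 60,
    frozenset({"peanuts"}): 50,
    frozenset({"cheese"}): 40,
    frozenset({"cola", "peanuts"}): 20,
    frozenset({"peanuts", "cheese"}): 10,
}
item_margin = {"cola": 0.5, "peanuts": 0.3, "cheese": 0.7}
basket = frozenset({"cola", "peanuts", "cheese"})

theta = draw_distribution(maximal_frequent_subsets(basket, support), support)
result = allocate_margins([basket], support, item_margin, rng=random.Random(0))
```

With these illustrative counts, {cola, peanuts} is drawn with probability 2/3 and {peanuts, cheese} with probability 1/3, which matches the "two of the three times" reasoning in the text. When a basket is itself a frequent set, it is the only maximal subset and receives the whole margin, as in the earlier PROFSET model.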
