Breadth search strategies for finding minimal reducts: towards hardware implementation


ORIGINAL ARTICLE

Mateusz Choromański · Tomasz Grześ · Piotr Hońko

Received: 27 December 2018 / Accepted: 5 March 2020
© The Author(s) 2020

Abstract
Attribute reduction, a complex problem in data mining, has attracted many researchers. The importance of this issue grows with the ever-increasing amount of data to be mined, and with data growth comes a need for faster computations. The contribution of this paper is twofold: (1) an investigation of breadth search strategies for finding minimal reducts, in order to identify the most promising method for processing large data sets; (2) the development and implementation of the first hardware approach to finding minimal reducts, in order to speed up time-consuming computations. Experimental research showed that, for the software implementation, the blind breadth search strategy is in general faster than the frequency-based breadth search strategy, not only in finding all minimal reducts but also in finding one of them. The inverse was observed for the hardware implementation. In future work, the implemented tool is to be used as a fundamental module of a system to be built for processing large data sets.

Keywords Attribute reduction · Minimal reducts · Breadth search strategy · Hardware implementation · Field programmable gate arrays

1 Introduction

Feature selection, especially attribute reduction [26], seems to be a more essential data preprocessing task than ever before. In the Big Data era, reducing a data set, even by one feature/attribute, may significantly influence the data size and/or the time needed to process it. Although reducing as many attributes as possible of a large data set may itself be expensive, the benefits that come from further processing a much smaller data set can be incomparably greater. Attribute reduction has been extensively investigated from both a theoretical and a practical viewpoint (e.g. [2, 4, 11, 19, 20, 30, 33, 37, 38, 40, 56]) in rough set theory

[26, 28], which is viewed as a mathematical tool for dealing with inconsistent data. Despite the existence of a rich literature, attribute reduction is still a hot topic; new areas of its application and new methods of its improvement are constantly being discovered (e.g. [3, 5, 14, 15, 22, 25, 39]). Due to the scope of this paper, the subsections below provide an overview of software and hardware rough set-based approaches for finding reducts.
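To make the central notion concrete, the blind breadth search strategy can be sketched as follows. This is an illustrative toy in Python, not the authors' software or hardware implementation; the decision table, attribute indices, and the helper names `consistent` and `minimal_reducts` are invented for this example. Attribute subsets are tested level by level, from size 1 upward, so the first consistent subsets encountered are guaranteed to be minimal reducts.

```python
from itertools import combinations

# Toy decision table: each row lists three condition attribute values
# followed by a decision value (all values are illustrative).
rows = [
    (0, 1, 0, 'yes'),
    (0, 1, 1, 'yes'),
    (1, 0, 0, 'no'),
    (1, 1, 0, 'no'),
]
n_attrs = 3

def consistent(attrs):
    """True if the projection onto `attrs` still discerns every pair of
    objects with different decisions (a positive-region style test)."""
    seen = {}
    for row in rows:
        key = tuple(row[i] for i in attrs)
        if seen.setdefault(key, row[-1]) != row[-1]:
            return False  # two objects agree on `attrs` but differ in decision
    return True

def minimal_reducts():
    """Blind breadth search: try subsets of size 1, then 2, ... and stop
    at the first level that contains any consistent subset."""
    for size in range(1, n_attrs + 1):
        level = [c for c in combinations(range(n_attrs), size)
                 if consistent(c)]
        if level:
            return level
    return []

print(minimal_reducts())  # → [(0,)]  (the first attribute alone suffices)
```

Because levels are explored in order of subset size, the search can terminate as soon as one level yields a hit, which is what makes a breadth strategy attractive when only one minimal reduct is needed.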

1.1 Software approaches for finding reducts

A variety of algorithms have been developed in rough set theory for finding reducts of an information system or decision table. They can be categorized into the following general groups: discernibility matrix-based approaches [4, 11, 19, 30, 44, 53, 55], positive region-based approaches [7, 9, 13, 26, 29, 50, 52], and information entropy-based approaches [10, 23, 24, 31, 47, 4