Big data classification problems have drawn great attention from diverse fields, and many classifiers have been developed. Among these, the extended belief rule-based system (EBRBS) has shown potential in both big data and multiclass settings, but its time complexity and computing efficiency remain two challenging issues. To address them, this paper first proposes three improvements to EBRBS that decrease its time complexity and improve its computing efficiency for multiclass classification under the assumption of a large amount of data: a strategy to skip rule weight calculation, a simplified evidential reasoning algorithm, and a domain division-based rule reduction method. The result is a micro version of the EBRBS, called Micro-EBRBS. Moreover, Apache Spark, a commonly used cluster computing framework, is then applied to implement the parallel rule generation and inference schemes of the Micro-EBRBS for big data multiclass classification problems. Comparative analyses of experimental studies demonstrate that the Micro-EBRBS not only obtains a desired accuracy but also has comparatively better time complexity and computing efficiency than some popular classifiers, especially for multiclass classification problems.
|Number of pages||21|
|Journal||IEEE Transactions on Systems, Man, and Cybernetics: Systems|
|Early online date||26 Oct 2018|
|Publication status||Published - 1 Jan 2021|
- Apache Spark
- big data
- extended belief rule-based system (EBRBS)