ANALYZE VISUAL MODELS FOR ASSESSMENT OF BIG DATA CLUSTERING RESULTS

Authors:

Mrs. A. P. Bhuvaneswari, Dr. C. Shoba Bindu, Dr. R. Praveen Sam

DOI NO:

https://doi.org/10.26782/jmcms.spl.3/2019.09.00021

Keywords:

VAT, cVAT, SpecVAT, cSpecVAT

Abstract

Cluster analysis is the process of grouping objects based on the similarity of their features. Traditional methods such as k-means and graph-based clustering are commonly used to cluster a given set of data objects. Other clustering models, namely visual assessment of tendency (VAT), cosine-based VAT (cVAT), spectral VAT (SpecVAT), and cosine-based spectral VAT (cSpecVAT), are more effective because they present the clustering results with visual evidence for big datasets. These methods compute an initial dissimilarity matrix for a set of objects and reorder it according to the ordering of dissimilarity values between objects. The image of the reordered dissimilarity matrix shows dark square blocks along the diagonal, in which each square block represents a cluster. Synthetic and benchmark datasets are used in the experimental study to demonstrate the efficiency of the visual-model-based clustering approaches.
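
The reordering step described above can be illustrated with a minimal sketch of the basic VAT procedure. It is not the authors' implementation: it assumes Euclidean distance as the dissimilarity measure and a Prim-like traversal that repeatedly appends the unvisited object closest to the objects already visited. The function names (`dissimilarity_matrix`, `vat_order`, `reorder`) and the toy points are hypothetical and chosen only for illustration.

```python
import math


def dissimilarity_matrix(points):
    """Pairwise Euclidean distances (an assumed dissimilarity measure)."""
    return [[math.dist(p, q) for q in points] for p in points]


def vat_order(d):
    """Return a VAT-style ordering of object indices for a square matrix d."""
    n = len(d)
    # Start from an object that takes part in the largest dissimilarity.
    start = max(range(n), key=lambda i: max(d[i]))
    order = [start]
    remaining = set(range(n)) - {start}
    # nearest[j]: distance from unvisited object j to its closest visited object.
    nearest = {j: d[start][j] for j in remaining}
    while remaining:
        # Pick the unvisited object closest to the visited set (index breaks ties).
        nxt = min(remaining, key=lambda j: (nearest[j], j))
        order.append(nxt)
        remaining.remove(nxt)
        for j in remaining:
            nearest[j] = min(nearest[j], d[nxt][j])
    return order


def reorder(d, order):
    """Permute rows and columns of d according to the given ordering."""
    return [[d[i][j] for j in order] for i in order]


# Two well-separated groups, deliberately interleaved in the input order.
points = [(0, 0), (10, 10), (0, 1), (10, 11), (1, 0), (11, 10)]
d = dissimilarity_matrix(points)
order = vat_order(d)
d_star = reorder(d, order)

for row in d_star:
    print(" ".join(f"{value:5.1f}" for value in row))
```

In the printed matrix, small values gather in two square blocks along the diagonal, one per group; rendered as a grayscale image with small values dark, these are the dark blocks the abstract refers to. The cosine-based and spectral variants would change how the dissimilarity matrix is built before reordering and are not covered by this sketch.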

