1 Research Supervisor,
2 PhD Research Scholar,
Clustering is widely used in many applications, including data mining, pattern recognition, and machine learning. Noise is a major problem in cluster analysis and degrades the performance of many existing methods. This paper aims to address the problem of noise in data clustering.
Many existing clustering algorithms are sensitive to the presence of outliers. In this paper, a new robust operator, the modified l2 norm, is developed to address this problem. The new measure has several merits. It requires no sensitive user-defined parameter, and it automatically assigns small weights to samples that lie far from the cluster center. It is robust to outliers and has a theoretical breakdown point of 50%. It can be solved without an exhaustive search and can be extended to more general prototypes, such as curves. We tested the method on four synthetic and three real-world datasets. Experimental results show that the method yields better results than the other clustering algorithms tested.