DESIDOC Bulletin of Information Technology
  • Year: 2007
  • Volume: 27
  • Issue: 4

Zipf's law in a random text from English with a new ranking method

  • Authors: Anurag Saxena¹, Monika Jauhari², B.M. Gupta³
  • Total Page Count: 8
  • Page Number: 51 to 58

¹ Indira Gandhi National Open University, New Delhi-110 068. E-mail: anurags@ignou.ac.in

² Bundelkhand University, Jhansi-284 128.

³ National Institute of Science, Technology and Developmental Studies, Dr K.S. Krishnan Marg, New Delhi-110 012. E-mail: bmgupta1@yahoo.com

Abstract

Zipf's law has attracted infometricians time and again, and many studies have explored its application to various areas. However, a few parameters largely affect any such study: the power law embedded in Zipf's law, the ranking method, the type of text taken for the study, and the behaviour of the extreme regions of Zipf's curve. This paper addresses all of these points by taking a random text in the English language from computer science literature. The selected text is called random because of the highly specific nature of its technical words. The paper studies the properties of this text and compares the product of rank and frequency for three ranking procedures. It also analyses how the data behave in the extreme regions of Zipf's curve. It is observed that the ranking procedure and the type of text have a definite bearing on the performance of Zipf's curve.
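
For orientation, Zipf's law in its classical form states that the product of a word's rank and its frequency is roughly constant across a text. The minimal Python sketch below shows how that product can be computed, and how it changes with the treatment of tied frequencies. The function name rank_frequency, the tokenising pattern, the sample sentence and the two ranking schemes shown (ordinal and dense) are assumptions made purely for illustration; they are not the three ranking procedures compared in the paper.

```python
import re
from collections import Counter


def rank_frequency(text):
    """Return (word, frequency, ordinal_rank, dense_rank) rows, most frequent first.

    Illustrative only: ordinal ranking gives every word its own rank,
    while dense ranking lets words with equal frequency share a rank.
    """
    # Crude tokeniser (an assumption): lowercase runs of letters and apostrophes.
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(words)

    # Sort by descending frequency, then alphabetically so ties are deterministic.
    ordered = sorted(counts.items(), key=lambda item: (-item[1], item[0]))

    rows = []
    dense_rank = 0
    previous_freq = None
    for ordinal_rank, (word, freq) in enumerate(ordered, start=1):
        if freq != previous_freq:
            # A new frequency value starts a new dense rank.
            dense_rank += 1
            previous_freq = freq
        rows.append((word, freq, ordinal_rank, dense_rank))
    return rows


# Hypothetical toy input; a real study would use a full text.
sample = "the cat and the dog and the bird"
rows = rank_frequency(sample)
for word, freq, r_ordinal, r_dense in rows:
    # Print the rank-frequency product under each ranking scheme.
    print(word, freq, r_ordinal * freq, r_dense * freq)
```

Even on this toy input the two schemes diverge in the low-frequency tail, where many words share a frequency of one. That is the kind of sensitivity to ranking procedure and to the extreme regions of the curve that the abstract refers to.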

Keywords

Zipf's law, Zipf's curve, infometrics, power law, computer science