Studied the significance of readability formulas, adopted from the linguistics domain, for document classification.
Developed a new input feature set extracted from the graph-of-word representation of text.
Analysed and reported the impact of document length and of different content-word sets on the effectiveness of document classification.
Contributed two large datasets for machine-learning-based text analysis, publicly available on GitHub.
The method can serve as a filtering attribute in advanced search tools and in web user profiles that capture a user's interests and browsing behaviour.
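
As a rough illustration of the two feature families mentioned above, the following minimal Python sketch builds a sliding-window graph-of-word representation and computes one well-known readability score. The highlights do not say which readability formulas or graph features were used, so the choices here are assumptions for illustration only: Flesch Reading Ease as the formula, a window of 3 tokens, node degree as the graph feature, a crude vowel-group syllable counter, and all function names.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and keep alphabetic tokens only."""
    return re.findall(r"[a-z]+", text.lower())


def graph_of_words(tokens, window=3):
    """Build an undirected co-occurrence graph.

    Two distinct words are linked if they occur within `window`
    consecutive tokens; the edge weight counts how often that happens.
    The window size is an illustrative assumption.
    """
    edges = defaultdict(int)
    for i, word in enumerate(tokens):
        for j in range(i + 1, min(i + window, len(tokens))):
            other = tokens[j]
            if other != word:
                edges[tuple(sorted((word, other)))] += 1
    return dict(edges)


def degree_features(edges):
    """Node degree (number of distinct neighbours) for every word."""
    degree = defaultdict(int)
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    return dict(degree)


def count_syllables(word):
    """Crude syllable estimate: number of vowel groups, at least 1."""
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))


def flesch_reading_ease(text):
    """Flesch Reading Ease, using the approximate syllable counter above."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = tokenize(text)
    if not sentences or not words:
        return 0.0
    syllables = sum(count_syllables(w) for w in words)
    return (206.835
            - 1.015 * (len(words) / len(sentences))
            - 84.6 * (syllables / len(words)))


if __name__ == "__main__":
    doc = "The cat sat on the mat. The dog barked at the cat."
    edges = graph_of_words(tokenize(doc))
    print(degree_features(edges))
    print(flesch_reading_ease(doc))
```

The per-word degrees and the readability score could then be combined into a feature vector for any standard classifier; how the features were actually combined is not stated in the highlights.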