
Abstract

The data available on the internet and the World Wide Web is vast: only about 10% is in structured form, while approximately 90% is either unstructured or semi-structured. Data mining is feasible only for structured data, not for unstructured or semi-structured data. This paper addresses the phases of text mining, namely text transformation, text pre-processing, filtration, and stemming. It also targets online news text about high-frequency viral infectious diseases, collected from various newspapers, and processes it for better information retrieval. The research focuses on various stemming techniques and their comparison.
