
Abstract

Big data usually refers to data sets whose size is beyond the ability of commonly used software tools to capture, create, manage, and process within a tolerable elapsed time. Big data now receives great attention in the information industry and in society because of the large amounts of data available and the crucial need to turn such data into useful information and knowledge. Such data is generally managed by a distributed file system (DFS); Hadoop's DFS is called HDFS. When the data contains time series, however, it is more efficient to apply an additional level of processing at the data source before storing the data. With Hadoop, data is stored and processed simultaneously. Scheduling in Hadoop refers to the distribution of jobs.
