Dark Data Analysis to Reduce Data Complexity in Big Data

Authors

  • Bansari Trivedi, M.Tech student, Department of Information Technology, Parul Institute of Engineering and Technology, Waghodia, Vadodara, India.

Keywords

Dark Data, Chunking, Classification, Big Data

Abstract

Big data refers to data sets too large to be handled by traditional systems; they require new structures, algorithms, and techniques. As data grows, dark data grows with it. Within a main data source there is a portion of data that is not in regular use but can still support decision making and data retrieval. This portion is known as "dark data". Dark data generally sits in an idle state. The term "dark data" appears to have been first used and defined by the consulting firm Gartner. Dark data is acquired through various operational sources but is not used in any manner to derive insights or support decisions. It is a subset of big data; on average, roughly 80% of a big data set is dark data. There are two ways to view the importance of dark data. One view is that unanalyzed data contains undiscovered, important insights and represents a lost opportunity. The other view is that unanalyzed data, if not handled well, can lead to serious problems, including legal and security issues. This work introduces a solution to the side effects of dark data on the whole data set. Dark data is an important part of big data, but because it sits idle it can place load on the system and its processes. It is therefore important to find a solution that keeps the dark data intact while preventing it from affecting the rest of the data.

Published

2017-03-25

How to Cite

Bansari Trivedi. (2017). Dark Data Analysis to Reduce Data Complexity in Big Data. International Journal of Advance Engineering and Research Development (IJAERD), 4(3), 443–447. Retrieved from https://ijaerd.org/index.php/IJAERD/article/view/2063