Implementing Big Data Analytics Projects in Business

Big Data analytics presents both opportunities and challenges for companies. Before embarking on a Big Data project, companies need to understand the value that Big Data offers and the processes needed to extract it. This chapter discusses why companies should progressively increase their data volumes and the process to follow when implementing a Big Data project. We present a variety of architectures for handling Big Data, from in-memory servers to Hadoop. We introduce the concept of the Data Lake and discuss its benefits for companies and the research still required to fully deploy it. We illustrate some of the points discussed in the chapter by presenting various architectures available for running Big Data initiatives, and discuss the expected evolution of hardware and software tools in the near future.
