Big data analytics projects are at the top of the IT priority list for many organizations looking to wring business benefits out of all the data -- structured, unstructured and semi-structured -- flowing into their systems. But as with any initiative that offers big rewards, there are also big accompanying risks. That's certainly true of a big data implementation, which makes planning and managing deployments effectively a must.
There are ways to go right -- and ways to go wrong. What follows is a list of steps that big data analytics project managers should take to help set their programs on the right path, one that leads to the expected business value and a strong return on investment.
Find business sponsors with solid business plans in mind. With all the hype surrounding big data analytics, business executives might well be lining up to sponsor a project. The key criterion for selecting sponsors should be whether they can articulate a clear set of business objectives with a realistic timeline. By having a well-defined target of the business results you're looking to achieve, you can establish a scope for the data management and analytics systems that need to be built along with the supporting technology that needs to be installed. If a project starts without that kind of scoping, it's likely to spin out of control and try to do too much, too soon.
Make learning -- and mistakes -- part of the project plan. Big data analytics will introduce new technologies, techniques and methodologies in your organization, and likely will require new skills. In addition, big data technologies are still evolving; a considerable amount of custom development work is often required; and there's a serious shortage of those required new skills, for both IT developers and the data scientists and other analytics professionals who will lead the data analysis work. As a result, your project team will be learning as it goes, and business managers and users will be figuring out what big data analytics really means to them. You need to create project schedules and budgets based on a long learning curve, including the inevitable mistakes that will be made in the process of that learning.
Get Agile on application development. Because substantial training and education are likely to be required on everyone's part, and detailed business requirements might change as you proceed, Agile development methodologies are a better fit for big data analytics applications than standard waterfall approaches are. An Agile approach that delivers functionality in small, iterative chunks and accommodates quick changes in development plans works best amid all the uncertainty. It should be coupled with a visible and transparent change management process and regular communications with project sponsors and participants about progress and the changes that do occur.
Time-box everything. One of the tried-and-true project management rules, especially when it comes to software development, is that work will expand to fill whatever time is available. As a big data analytics project manager, you'll very likely be blessed with an extremely enthusiastic community of business executives and workers looking for information they can use to drive operational strategies and tactics. While learning on the fly and being open to changing requirements are part of the process, you need to leverage that enthusiasm by fitting scheduled work into tight time boxes in order to keep the big data initiative moving forward -- and to keep people from getting discouraged when it stalls on particular tasks.
Treat data scientists as artists. Data scientists and other skilled analysts have a key role to play in pulling business insights out of big data stockpiles. Generating those insights, through applications such as predictive analytics and data mining, is an incremental and iterative process. A data scientist will devise an analytical model, test it, refine it, validate it, and finally run it and publish the results internally. In doing so, the data scientist might test out dozens or hundreds of variables using a variety of statistical methods. The term data science is somewhat misleading: Creating analytical insights is equal parts science and art. Treat data scientists as talented artists rather than common laborers and you'll encourage better productivity -- and get better results.
Set realistic expectations and manage them proactively. In organizations that are new to big data projects, lofty expectations can be set by technology vendors that claim big data tools are easy to use and point to other enterprises that have gained significant business value by using them. It's important to keep in mind that many of the early adopters of big data systems were large Internet companies that have significant expertise and, in many cases, played leading roles in developing Hadoop and other big data technologies. If you let expectations get out of hand and then can't meet them, your big data implementation could be viewed as a failure regardless of the business value it does produce. Constrain expectations to realistic levels at the outset -- and continue to do so throughout the project.
Clearly, there are both big risks and big rewards in undertaking a big data analytics project. But with proper attention to sound project management practices, project managers and their teams can minimize the downsides and make deployments a big business opportunity for their organizations.
About the author:
Rick Sherman is the founder of Athena IT Solutions, a consulting and training services company that focuses on business intelligence and data warehousing. Sherman is also an adjunct faculty member at Northeastern University's Graduate School of Engineering, and he blogs at The Data Doghouse. Email him at firstname.lastname@example.org.