Everywhere you go, everything you do, you’re generating data, and so is everyone around you. Your mobile phone usage, your internet browsing behavior, the way you drive your car, the number of times you buy turkey at the grocery store…all of that data is being collected and used by companies around the world. The massive growth in this information — which has exploded in volume, velocity and variety — has given rise to a new name for a new field: Big data.
But the explosion of data has also given rise to a tremendous need for skilled professionals capable of dealing with all of this information. In fact, the number of people needed in big data is simply staggering. According to one new projection from McKinsey & Company, the U.S. alone faces a shortfall of 140,000 to 190,000 big data professionals in the next five years. Another recent study from Gartner suggests that 4.4 million IT jobs worldwide will be needed to support big data by 2015. That’s a lot of potential employment for the right people.
Too Much Data, Not Enough People
But where will all of these new employees come from? While some of those thousands or millions of people will likely end up working in traditional areas such as storage or infrastructure or security, experts say the data scientists that are truly needed to make sense of all of this data remain a rare breed.
“The ability to successfully harness big data requires a unique combination of skills and attributes,” says Richard Rodts, manager of global analytics academic programs at IBM. “On the technical side, it’s essential to understand how to operate analytics technology solutions to read into the data for hidden insights and build predictive models that help business decision-makers chart smarter courses for their organizations.” Beyond that, it’s important to understand the business model and culture of your company or client so you can ask the right questions of your data. And then, Rodts says, “there are the very human attributes, such as a knack for both strategic and creative thinking, the ability to collaborate with colleagues across the business, and strong communication skills that enable you to convey data-driven findings to senior decision-makers in a compelling way.”
That’s a lot of skills for a single person. As Mark A. Herschberg, CTO of Madison Logic, puts it, “That combination doesn’t exactly grow on trees.”
So What Does a Big Data Person Do?
The roots of big data lie in the older, still valid term business intelligence. “Big data is just business intelligence on steroids,” says Marty Carney, CEO of WCI. “People doing BI data warehousing can do big data. They just need more experience dealing with bigger data sets and larger architectures.”
Rodts takes it a bit further. “Data scientists or analytics professionals are part digital trend-spotter and part storyteller,” he says. “These are people, teams and centers of excellence at businesses and organizations who sift through vast amounts of data to uncover insights that can yield revenue-growing opportunities, spot risks before they occur, save money, time — and even lives.”
The exact tasks for a big-data professional can vary depending on the goals at a particular company or project. “We start with a very simple question,” says Samer Forzley, VP of marketing at the data-management company Pythian. “What are you trying to achieve from a business point of view? Are you trying to save money? Are you trying to increase revenue? Do you need to create insight on the fly? Are you trying to create a condition engine on your website that will recommend other products?” Each answer has a different set of solutions, he says.
Meanwhile, a lot of the work being done in big data today isn’t analysis itself but the transition away from older, siloed legacy databases. “The biggest enemy of big data is silo data,” says Ali Riaz, CEO of Attivio. Companies may have been collecting disparate forms of data in various silos for years, but getting the full value of that information is a step many aren’t ready to take. “When we talk about big data, we’re talking about actually pulling all of your structured and unstructured information assets together,” Riaz says. “We can’t get to the big-data goals if everyone is married to smaller data.”
Getting In
To help address the need for big data professionals, several universities around the country have added new data analytics programs. Some, like the program at the University of Tennessee, focus not just on the technology but also on the business side of big data. “We think it is really important that our students have the technical skills, but that they also have some business savvy and understand the importance of subject-matter expertise in deciding both how you collect the data and how you will analyze it,” says Dr. Kenneth Gilbert, head of the university’s business analytics department. Toward that end, the school’s MS in business analytics program includes concentrations on teamwork, giving presentations to managers, and related skills.
For coursework, the best place to start is with statistics, says Dr. Olly Downs, senior VP of Data Sciences at Globys, who recently helped assemble the curriculum for the new data sciences certificate program at the University of Washington. But statistics alone isn’t enough, and Downs suggests that students get to know distributed computation as well as tools and languages such as Hadoop, Python and R. At that point, you can “start getting into data and visualizing it and gaining insight from it,” he says. The next step is to understand how to communicate and visualize the output of your data, since a key part of every data scientist’s job is getting managers to understand their conclusions.
Unlike more traditional data fields — which often specialized in a single tool — working in big data requires a broad knowledge base. “You can’t know just one tool,” says Riaz. “You have to be multifunctional. You have to be multidimensional.”
Even with the need for multidimensionality, Riaz suggests finding the big-data specialty that appeals most to you by talking to data scientists who are already in the field to see what they do. “Then you map it to who you are,” he says. “Are you an infrastructure guy, or are you a board-level guy? Do you want to interact with people? Do you want to educate? Do you want to consume? Do you want to make decisions? Do you want to enable? Do you want to drive?” He suggests talking to as many people as you can, being open to trying new things, and applying for internships. “Don’t get in a decision mode until you have finished your discovery mode.”
Once you’re in the field, it’s important to keep moving forward. “Get into a continuous learning mode,” Riaz says. “What it means to be a data scientist today is going to radically change the next time a big new technology comes your way.”
What’s Next?
Although companies are already basing more decisions than ever on data, experts say the full scope of how big data will impact business remains to be seen. “I have a colleague who compares the whole big data thing to Eisenhower’s interstate system,” Gilbert says. “It’s going to create business opportunities that people can’t even imagine at this point.”
But even with its rapid growth, big data may actually be due for a shakeup in the next few years, in part because it is so new. “Big data is in a way not fully defined yet because it is still emerging,” Forzley says. The rapid expansion we see today could eventually be followed by a comparable contraction as processes work themselves out, and as companies realize that they may have hired too many people. “We’re going to find efficiencies,” says Riaz, who expects the short-term projections of the number of people needed in the field to fall considerably by the end of the decade.
According to Downs, the role of data scientists will continue to evolve. “Data scientists are no longer going to just be modelers and visualizers of data,” he predicts. “They will also be creating near-product-worthy pieces of software that a software engineer can then integrate into a bigger system.”
Experts say the future of the field could bring more regulation to protect consumers’ data, but it will certainly require more security. “Now that we’re housing more sensitive information, you’re going to have to have more locks on your door and more gates around your castle and more guard dogs and policemen,” Carney says. “The securitizing of big data is going to be a huge business,” he predicts.
The biggest risk for the future of big data may be entrenched business practices that don’t yet see the value of analytics. Gilbert points to McKinsey’s study, which predicts a need not just for a few hundred thousand big-data professionals but also for 1.5 million data-savvy managers. “What is going to determine the winners and losers in the business world are the ones that learn how to use this new resource for strategic advantage,” he says.
This article originally appeared on IEEE and has been republished with permission.