Only 13% of companies achieve full-scale implementation of their in-house big data projects, and only 27% of executives describe their in-house big data initiatives as successful. Such a low success rate should concern executives considering big data technologies such as Hadoop or Spark, especially since many businesses adopt big data without a clear understanding of what the ROI will be. Big data requires experimentation to discover where it can yield significant ROI, but experimentation requires a risky up-front investment. Given the amount of risk involved, reducing time to value is crucial. Here are five factors businesses should consider to ensure they see timely ROI from their big data investment.
5 Crucial Considerations for Big Data Adoption from Qubole
1. Time to Deployment
The time required simply to deploy a big data solution varies significantly with the type of implementation. An in-house solution, built and installed on internal servers, can take six to nine months just to build the necessary infrastructure. A cloud-based solution requires no internal infrastructure, so average time to production is a matter of days or weeks.
2. Scalability
The ability to scale a project up or down is crucial. Many organizations underestimate how quickly their dataset will grow, or fail to take into account varying usage levels. Spikes in data usage or temporary projects can quickly create latency issues if a project isn’t easily scalable. For in-house deployments, this means either keeping extra capacity on hand to handle additional workloads or building out infrastructure as needed, which can delay a project for weeks or months. Cloud-based systems offer more scalability, allowing businesses to increase or decrease usage as needed.
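The trade-off between keeping extra capacity on hand and running short during a spike can be made concrete with a small calculation. The Python sketch below is only an illustration: the weekly demand figures, the cluster sizes, and the function names are all hypothetical and do not come from any real deployment.

```python
def unmet_demand(demand, capacity):
    """Total demand (in node-weeks) that exceeds a fixed cluster size."""
    return sum(max(0, d - capacity) for d in demand)


def idle_capacity(demand, capacity):
    """Total provisioned but unused capacity (in node-weeks)."""
    return sum(max(0, capacity - d) for d in demand)


# Hypothetical weekly demand in cluster nodes, with a temporary two-week spike.
weekly_demand = [10, 12, 11, 30, 28, 12, 13, 14]

# Assumption: a fixed in-house cluster sized for typical load.
typical_size = 15
print(unmet_demand(weekly_demand, typical_size))   # 28 node-weeks of shortfall
print(idle_capacity(weekly_demand, typical_size))  # 18 node-weeks sitting idle

# Assumption: the same cluster sized for the peak instead.
peak_size = 30
print(unmet_demand(weekly_demand, peak_size))      # 0 node-weeks of shortfall
print(idle_capacity(weekly_demand, peak_size))     # 110 node-weeks sitting idle
```

Under these assumed numbers, sizing for typical load leaves the spike unserved, while sizing for the peak leaves most of the cluster idle most of the time. A system that can grow and shrink with demand avoids both extremes.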
3. Connecting Tools
The Hadoop ecosystem is made up of many different engines and tools, each with its own use case, and businesses will need to evaluate which of them they want to use. Hive and Spark, for example, both offer SQL-on-Hadoop, but Spark is much better suited to handling real-time data. Putting these tools together and mastering each of them requires significant time and expertise. Organizations without internal expertise will have to rely on external experts or seek training for their internal teams.
4. Infrastructure Management
Once the big data infrastructure is in place, Hadoop still requires a significant investment in management. From cluster sizing and configuration management to health and performance monitoring, the tools require a dedicated team to manage and maintain the clusters. Hadoop-as-a-service offerings can ease the management burden by providing the vendor's own expertise to handle the majority of management requirements.
5. Accessibility
Finally, big data is only useful if it is accessible to the people who can actually learn something from the data and apply it to everyday business practices. Currently 57% of organizations cite a skills gap as a major inhibitor to Hadoop adoption. Once again, ease of access will vary based on implementation, so selecting a vendor that matches your internal capabilities is crucial. Consider the time and investment it will take to train teams when calculating time to value and overall ROI.
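One way to fold deployment and training time into a time-to-value estimate is a simple break-even calculation. The Python sketch below is a rough model under stated assumptions, not a definitive method: the function name, the cost and benefit figures, and the rule that no benefit accrues during ramp-up are all hypothetical.

```python
def months_to_break_even(upfront_cost, monthly_cost, monthly_benefit,
                         ramp_up_months, horizon=120):
    """Return the first month in which cumulative benefit covers cumulative
    cost, or None if that does not happen within `horizon` months.

    Assumption: no benefit is realized during the ramp-up period
    (deployment plus team training); costs are paid from month one.
    """
    balance = -upfront_cost
    for month in range(1, horizon + 1):
        balance -= monthly_cost
        if month > ramp_up_months:
            balance += monthly_benefit
        if balance >= 0:
            return month
    return None


# Hypothetical in-house build: large up-front spend, nine months of ramp-up.
in_house = months_to_break_even(
    upfront_cost=500_000, monthly_cost=20_000,
    monthly_benefit=80_000, ramp_up_months=9)

# Hypothetical cloud service: small up-front spend, higher running cost,
# one month of ramp-up.
cloud = months_to_break_even(
    upfront_cost=50_000, monthly_cost=40_000,
    monthly_benefit=80_000, ramp_up_months=1)

print(in_house)  # 21
print(cloud)     # 4
```

With these made-up inputs, the longer ramp-up dominates the result: every month spent deploying and training adds cost while delaying the first benefit. Substituting your own estimates for each option gives a quick first comparison before a fuller ROI analysis.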