What is the future of analytics?

6 trends shaping the future of data analytics

Organizations are demanding more from their data analytics efforts, wanting immediate insights that will help drive business decisions. In response, many are adopting new technologies such as machine learning, deep learning and natural language processing. Eric Mizell, vice president of global solution engineering at Kinetica, discusses what role these technologies will play in shaping the future of data analytics.

The algorithmic economy comes of age

“Organizations are dealing with a tsunami of data,” Mizell stresses. “The speed, size, shape of data generated by newer sources such as sensors, mobile apps, social media, machine logs, and connected devices far outpaces the ability of current systems and humans to comprehend, draw insights, and act on data. Organizations should look at algorithmic approaches such as machine learning, deep learning, and natural language processing (NLP) to automate insight discovery at scale.” 

Data and analytics architectures evolve

“Data and analytics architectures must evolve for the hybrid world,” Mizell says. “Cloud and on-premise, data in motion and at rest, transactional and analytic databases, in-memory and spinning disk, real-time and batch, AI and BI – all need to co-exist and interoperate. Organizations must look to bring together workload-specific, complementary analytic solutions to analyze all data, gain insights, and act. They must look at open, standard-based solutions that use APIs, micro-services, programming languages, and connectivity to seamlessly integrate with existing infrastructure and deliver business value while preserving existing investments.”

Need for speed

“From high-speed Internet to 5G networks to high-speed trading, ‘speed’ is a critical element for business success,” Mizell confirms. “Customers demand instant gratification and enterprises need fresh data and real-time insights to deliver business value. Gone are the days of nightly batch processing and waiting for hours or days to get answers to critical business questions. Technology executives need to build real-time analytic pipelines to simultaneously ingest, analyze, visualize, and act on data in motion and at rest and deliver fresh, timely insights to capitalize on fast-moving business opportunities.”

GPUs vs. CPUs

“CPUs have been the workhorse of business applications for decades,” Mizell says. “However, big data’s volume, variety, and velocity – coupled with shrinking insight shelf life – require organizations to investigate other technologies to address the compute bottleneck. GPUs, with thousands of processing cores per chip vs. 16 to 32 for CPUs, have emerged as the ‘go-to’ alternative to process complex data at scale. Organizations must investigate GPU-based analytic technologies that deliver performance, flexibility, and ease-of-use to modernize the analytic infrastructure.”

From control to collaboration

“Data and analytics must be pervasively available across the organization for maximum business value,” Mizell stresses. “Everyone in an organization – data scientists, business analysts, and business users regardless of their technical skills – must have fast, easy, and self-service access to data and analytics for data-insight-driven decision making. Organizations need to adopt analytic technologies that democratize analytics, data science, and machine learning to establish a data-insight-driven culture. Analytic technologies must be flexible to balance analytic innovation with guard rails of security, scalability, and availability.”

The impact of the Internet of Things

“The Internet of Things (IoT) will fundamentally transform how organizations do data and analytics,” Mizell says. “With the nexus of people, devices, and data, IoT will have a profound impact on every industry and every line of business. Organizations will have to figure out how to sense, interpret, and respond to data in motion and rest, in real-time and at scale. Organizations must evolve their data and analytics architectures to seamlessly ingest the tsunami of IoT data, combine it with data at rest for contextual insights, and act in real-time and at scale to maximize business value. They must look at analytic solutions that deliver exponential scale and flexibility to manage the IoT data cost-effectively.”

This article first appeared on Information-Management.com.

Reid Hoffman: A.I. Is Going to Change Everything About Managing Teams

Imagine a spider chart mapping a complex web of interactions, sentiments, and workflow within an office. What would your company look like?

When most of us think of artificial intelligence in the workplace, we imagine automated assembly lines of robots managed by an algorithm. LinkedIn’s Reid Hoffman has a different idea.

In an essay for MIT Sloan Management Review, Hoffman describes human applications for the technology. Among other things, it could use data science to improve the way we onboard new team members, organize workflow, and communicate about performance. Addressing the question of how technology will change management practices over the next five years, Hoffman explains how the use of a “knowledge graph” will become standard management practice.

Read more

The dplyr package for R

dplyr: A Grammar of Data Manipulation

A fast, consistent tool for working with data frame-like objects, both in memory and out of memory.

When working with data you must:
Figure out what you want to do.
Precisely describe what you want in the form of a computer program.
Execute the code.
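
The steps above map naturally onto dplyr’s core verbs. As a minimal sketch, assuming the dplyr package is installed, the following uses R’s built-in mtcars dataset:

```r
library(dplyr)

mtcars %>%
  filter(cyl == 6) %>%      # keep only six-cylinder cars
  select(mpg, hp, wt) %>%   # keep three columns of interest
  arrange(desc(mpg))        # sort by fuel economy, best first
```

Each verb takes a data frame and returns a data frame, so steps chain together with the pipe operator into a readable description of the transformation.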

Read more

ggvis data visualization

ggvis is a data visualization package for R which lets you:
Declaratively describe data graphics with a syntax similar in spirit to ggplot2.
Create rich interactive graphics that you can play with locally in RStudio or in your browser.
Leverage shiny’s infrastructure to publish interactive graphics usable from any browser (either within your company or to the world).
ggvis combines the best of R (e.g. every modelling function you can imagine) and the best of the web (everyone has a web browser). Data manipulation and transformation are done in R, and the graphics are rendered in a web browser, using Vega. For RStudio users, ggvis graphics display in a viewer panel, which is possible because RStudio is a web browser.
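
As a minimal sketch of the declarative syntax, assuming the ggvis package is installed, the following plots the built-in mtcars dataset with an interactive slider controlling point size:

```r
library(ggvis)

# Scatterplot of weight vs. fuel economy; the slider re-renders
# the points in the RStudio viewer or browser as it moves
mtcars %>%
  ggvis(~wt, ~mpg) %>%
  layer_points(size := input_slider(10, 300, label = "Point size"))
```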

Read more

How to load Shiny into RStudio

Shiny is an R package that makes it easy to build interactive web applications (apps) straight from R.

To install the Shiny package, open your RStudio environment and type the usual command into the R console:

install.packages("shiny")

To load Shiny, type:

library(shiny)

The Shiny package comes with eleven built-in examples. Each example is self-contained and demonstrates how Shiny works.

The first example you may want to try is called Hello Shiny, an example plot of an R dataset with a configurable number of bins. Users will be able to change the number of bins by moving a slider bar. The application will immediately respond to the input.

To run Hello Shiny, type:

runExample("01_hello")
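
A minimal app in the same spirit can also be written directly in the console. This is a hedged sketch, not the Hello Shiny source itself; it plots the built-in faithful dataset with a slider-controlled number of bins:

```r
library(shiny)

# ui: a slider input paired with a plot output
ui <- fluidPage(
  sliderInput("bins", "Number of bins:", min = 1, max = 50, value = 30),
  plotOutput("hist")
)

# server: redraws the histogram whenever the slider moves
server <- function(input, output) {
  output$hist <- renderPlot({
    hist(faithful$waiting, breaks = input$bins, col = "steelblue",
         main = "Waiting times between eruptions")
  })
}

shinyApp(ui = ui, server = server)
```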

How to install RStudio and swirl in Ubuntu

swirl is an R package that makes it possible to learn R in R.
swirl teaches you R programming and data science interactively, at your own pace, and right in the R console!

1. In order to run swirl, you must have R 3.0.2 or later installed on your computer. Check your version of R by typing the following into your terminal:

R --version

2. Install RStudio (recommended)

“RStudio is a set of integrated tools designed to help you be more productive with R. It includes a console, syntax-highlighting editor that supports direct code execution, as well as tools for plotting, history, debugging and workspace management.”
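
With R and RStudio in place, swirl itself is installed from the R console. A typical sequence, assuming an internet connection to reach CRAN, is:

```r
install.packages("swirl")  # download swirl from CRAN
library(swirl)             # load the package
swirl()                    # start the interactive lessons
```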

Read more

How to add PyDev IDE to Eclipse

PyDev is an open source Python IDE for Eclipse, which may be used in Python, Jython and IronPython development.

To add PyDev to Eclipse, open Eclipse, go to “Help” > “Eclipse Marketplace”, and search for “PyDev”.

Once you have found “PyDev – Python IDE for Eclipse”, click “Install”.

Read more

How to install R in Ubuntu

R is a language and environment for statistical computing and graphics. It is a GNU project which is similar to the S language and environment which was developed at Bell Laboratories (formerly AT&T, now Lucent Technologies) by John Chambers and colleagues. R can be considered as a different implementation of S. There are some important differences, but much code written for S runs unaltered under R.
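
One common way to install R on Ubuntu, assuming the distribution repositories carry a recent enough version, is via apt:

```shell
# Refresh package lists, then install the R interpreter and
# the development files needed to compile packages from source
sudo apt-get update
sudo apt-get install r-base r-base-dev

# Confirm the installation
R --version
```

For newer R releases than Ubuntu ships, CRAN also maintains its own Ubuntu repository that can be added to apt’s sources.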

Read more