Welcome to my Blog

The Data Post


9 Useful R Data Visualization Libraries for Any Discipline

By Asha Hill — Customer Success Analyst at Mode

If you’ve visited the CRAN repository of R packages lately, you might have noticed that the number of available packages has now topped a dizzying 12,550. This means there are packages for practically any data visualization task you can imagine, from visualizing cancer genomes to graphing the action of a book.

For new R coders, or anyone looking to hone their R data viz chops, CRAN’s repository may seem like an embarrassment of riches—there are so many data viz packages out there, it’s hard to know where to start.

To provide one path through the labyrinth, today we’re giving an overview of 9 useful interdisciplinary R data visualization packages. We’ve noted the ones you can take for a spin without the hassle of running R locally, using Mode R Notebooks.


Python for Big Data Analytics and the Role of R

Two Popular Open-Source Programming Languages to Consider for Your Data Science Toolkit
R and Python are two very popular open-source programming languages for data analysis. Users frequently debate which tool is more valuable, but both languages offer key features and can be used to complement one another. A common perception is that R offers more depth when it comes to data analysis, data modeling, and machine learning, but Python is easier to learn and tends to present graphs in a slightly more polished way.1,2 Using the interface Python offers for calling R allows users to reap the benefits of both of these powerful, popular tools for data science. Even if you choose not to combine the two, the different ways in which these two languages are valuable make them both important parts of a data science toolkit.

Why Python?

The (Data Science) Notebook: A Love Story by David Wallace

Computational notebooks for data science have exploded in popularity in recent years, and there’s a growing consensus that the notebook interface is the best environment to communicate the data science process and share its conclusions. We’ve seen this growth firsthand; notebook support in Mode has become one of our most adopted features since it launched in 2016.

This growth trend is corroborated by GitHub, the world’s most popular code repository. The number of Jupyter (then called IPython) notebooks hosted on GitHub has climbed from 200,000 in 2015 to almost two million today. Data from the nbestimate repository shows that the number of Jupyter notebooks hosted on GitHub is growing exponentially.

This trend raises a question: What’s driving the rapid adoption of the notebook interface as the preferred environment for data science work?

Inspired by an Analog Ancestor

The notebook interface draws inspiration (unsurprisingly) from the research lab notebook. In academic research, the methodology, results, and insights from experiments are sequentially documented in a physical lab notebook. This style of documentation is a natural fit for academic research because experiments must be intelligible, repeatable, and searchable.


Resources for Data Science Job Seekers

February 12, 2018 | Sadavath Sharma — Analyst

Getting your first job in data science can be a full-time job all on its own. Simply finding a job post worth applying to can be a chaotic pursuit (though we’ve tried to make that part easier with the Mode Analytics Data Jobs Board). Once you’ve found a job posting that looks like it could be a fit, you need to make sure you stand out from the crowd of other applicants.

As a data science job applicant, there are two stages to your search. First, you need to get an interview. To do that, you need documentation that you can fill the role. This is where your resume, your portfolio, and (unfortunately) your online presence come in. There are serious issues with looking up candidates on search engines, which range from creating unconscious bias to opening up murky legal situations, but it happens (not here at Mode, though). For better or worse, it’s worth taking a quick look at your name’s search results to get a sense of what people might find.


Thinking in SQL vs Thinking in Python

July 7, 2016 | Benn Stancil — Chief Analyst at Mode

Over the years, I’ve used a variety of languages and tools to analyze data. As I think back on my time using each tool, I’ve come to realize that each encourages a different mental framework for solving analytical problems. Being conscious of these frameworks—and the behaviors they promote—can be just as important as mastering the technical features of a new language or tool.

I was first introduced to data analysis about ten years ago as a college student (my time studying the backs of baseball cards notwithstanding). In school, and later as an economics researcher, I worked almost exclusively in two tools, Excel and R, which both worked well with CSVs or Excel files downloaded from government data portals or academic sources.


101+ Infographic Tools And Resources

Below is a roundup of helpful tools, resources, and articles for creating infographics that people love. It is broken down into categories according to each stage of the infographic creation process, from brainstorming to distribution, so you can skip to the categories that interest you most.

Infographic Tools & Resources for Ideas/Inspiration

  1. Why ideas are the most important piece of an infographic: Find out what makes a great infographic idea.
  2. 16 exercises to come up with great infographic ideas: Our favorite tips and tricks.
  3. 5 ways to know if your idea will work: Framework to vet your ideas.
  4. Alltop: An aggregator of the Internet’s most popular stories.
  5. Answer the Public: Visualizations of the questions people ask Google.
  6. Brainpickings: An inventory of cross-disciplinary interestingness, spanning art, science, design, history, philosophy, and more.
  7. BuzzSumo: Insights on the most-shared content on any topic.
  8. Dadaviz: Charts on a variety of subjects.
  9. Daily Infographic: Design inspiration updated each day.

Data Will Save Music

The writing is on the wall.
The music industry is dying.
Nobody buys music.
It’s the Wild West.
The last one might be true. But the rest? Not exactly.

In the Wild West, the winner of the shootout was always the one who was armed the best and able to take the best shot. Nowadays, artists and executives need to have that same kill-or-be-killed attitude. It’s time to upgrade the arsenal.

Leonardo da Vinci left us with a quote that we can use to bridge the gap of this analogy:

“Principles for the Development of a Complete Mind: Study the science of art. Study the art of science. Develop your senses — especially learn how to see. Realize that everything connects to everything else.”

Science + Art. That’s the future of the music (and entertainment) industry.


What to expect from business intelligence in 2017

Major Growth

It looks like Business Intelligence is going to go from strength to strength in 2017. Organizations in a variety of global markets are planning major investment in their Business Intelligence strategies this year. Across the pond in the UK, more than three quarters of small to medium-sized enterprises are planning a major analytics or data project this year.

Where there is investment, there is research; and where there is research, there is innovation. This means that we can expect some exciting steps forward this year, as organizations stumble over themselves to stay on the cutting edge of the discipline.

Data Diversity Is the Order of the Day

To keep ahead of the curve, businesses in 2017 are turning their attention to a variety of different sources from which to draw their data. After all, why limit your insight when we are practically adrift in a sea of data and understanding?

If you can find a way to connect it to an analytics platform, it is a data source. This means businesses now have the technology to measure just about everything. Need qualitative data from customer reviews? No problem. Want customer behavior data from a physical product itself? It’s yours. Looking for information on which of your competitors your churned customers have moved on to? Right here.

The fact is, you cannot get too much data, and the greater the variety of sources you have for that data, the more comprehensive the understanding you can gain from it. This is why datasets and sources will become increasingly diverse in 2017.


Areas of AI & machine learning to watch closely

Distilling a generally accepted definition of what qualifies as artificial intelligence (AI) has become a revived topic of debate in recent times. Some have rebranded AI as “cognitive computing” or “machine intelligence”, while others incorrectly use AI interchangeably with “machine learning”. This is in part because AI is not one technology. It is in fact a broad field made up of many disciplines, ranging from robotics to machine learning. The ultimate goal of AI, most of us affirm, is to build machines capable of performing tasks and cognitive functions that are otherwise only within the scope of human intelligence. In order to get there, machines must be able to learn these capabilities automatically instead of having each of them explicitly programmed end-to-end.


One Third of Americans Prefer a Software Robot Over a Human Boss

Digitization and automation are ever-growing topics in relation to the workplace.

A famous Oxford study on the future of employment from 2013 estimated that up to 47% of American jobs may be automated by 2035; a brand-new McKinsey study shows that current technologies could automate 45% of job activities; and the business mantra goes that if you can digitize, you should digitize to gain a competitive advantage.

But how do we, as human beings, really feel about potentially working with or even for AIs, and what impact do we think they will have on our workplace?

A recent study conducted in the US, UK and Denmark explores people’s openness towards working with and for “unbiased computer programs”—defined as “a software robot that makes decisions or proposals for decisions based on data from HR, financial or market information. The software robot is unbiased, i.e. it is not affected by the personal, social and cultural bias that influence human decision making, but balances all input only based on the data.”

The study shows some surprising results in openness, and big geographical differences.
