Going Back to the Basics: Data and Machine Learning

By Dan Knott

Artificial intelligence and machine learning are today’s industry buzzwords. In fact, even outside of the tech and marketing industries, it seems that every company is moving full speed ahead in the rush to leverage these technologies. The boom has occurred so rapidly that many organizations have neglected to take the first crucial step: understanding how, or if, machine learning can truly help solve their business problems.

With everyone ready to jump on the bandwagon, there is a growing need to dig back into the basics of machine learning—how the technology uses data and how to effectively leverage its power to design impactful products, workplaces, business models and solutions.

Defining the terms

I’m always cautious when people start talking about AI. What do they actually mean when they say “artificial intelligence”? A lot of ink has been spilled outlining the dilemma, not to mention the centuries humans have spent trying to understand just what intelligence is.

Vishal Maini from DeepMind provides a pretty good working definition of artificial intelligence. Specifically, he explains:

Artificial intelligence is the study of agents that perceive the world around them, form plans, and make decisions to achieve their goals.
Its foundations include mathematics, logic, philosophy, probability, linguistics, neuroscience, and decision theory. Many fields fall under the umbrella of AI, such as computer vision, robotics, machine learning, and natural language processing.

Maini goes on to define machine learning as:

…a subfield of artificial intelligence. Its goal is to enable computers to learn on their own. A machine’s learning algorithm enables it to identify patterns in observed data, build models that explain the world, and predict things without having explicit pre-programmed rules and models.

Simply put: machine learning is the technology that powers autocorrect on your iOS device, how Amazon makes product recommendations, what Uber uses to determine the best routes and drivers for customers, and how Facebook decides what shows up in your News Feed.

Let’s go another level deeper because, as it turns out, there are several types of machine learning:

  • supervised learning – the learning algorithm is provided with pre-labeled training examples to learn from.
  • unsupervised learning – the learning algorithm is provided with unlabeled examples. Generally, unsupervised learning is used to uncover some structure of or pattern in the data.
  • semi-supervised learning – the learning algorithm is provided with a mixture of labeled and unlabeled data.
  • active learning – similar to semi-supervised learning, but the algorithm can “ask” for extra labeled data based on what it needs to improve on.
  • reinforcement learning – actions are taken and rewarded or penalized in some way and the goal is maximizing lifetime/long-term reward (or minimizing lifetime/long-term penalty).

You’ll notice that, in all of these examples, there is a basic need for two things:

1. Data
2. A way of organizing the data through labels, rules, or goals

Bad data in, bad data out

Machine learning isn’t magic, and in its simpler forms, it’s really an exercise similar to linear regression modeling. Remember drawing graphs like these in your Statistics 101 class?

This is a simple, but (hopefully) illustrative example. If you don’t have the data to plot the dots on the graph (corresponding x and y values in numeric form, in this case), or if they’re organized in a way that isn’t helpful, how can you expect to draw the red line (read: draw any insight or predictions from your data)?
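For readers who would rather see the red line computed than drawn, here is a minimal sketch of an ordinary least-squares fit in plain Python. The function names (fit_line, predict) and the ad-spend and sales figures are made up for illustration; they are not taken from any real dataset or from a particular library.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    if len(xs) != len(ys) or len(xs) < 2:
        # Without paired x and y values there is nothing to plot or fit.
        raise ValueError("need at least two paired (x, y) observations")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Spread of x around its mean, and how x and y move together.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("x values must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(x, slope, intercept):
    """Read a predicted y value off the fitted line."""
    return slope * x + intercept


# Hypothetical, hand-made data: monthly ad spend vs. units sold.
ad_spend = [1.0, 2.0, 3.0, 4.0, 5.0]
units_sold = [3.0, 5.0, 7.0, 9.0, 11.0]

slope, intercept = fit_line(ad_spend, units_sold)
forecast = predict(6.0, slope, intercept)
```

Note that the function refuses to run when the inputs are missing or mismatched. That is the point of this section in miniature: the fit is only possible once the dots exist and are paired up correctly.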

Therein lies the problem many businesses face today: Their data isn’t collected, stored or organized in a way that is usable for machine learning—or they’re not collecting the right data in the first place.

So how can marketers ensure that they’re collecting the right information and using it in the most efficient way?

Solutions-based thinking…

Think about the ways that you want to impact your business through automation powered by machine learning, and then think about the data that would be required to make it a reality.

Do you want to know what product to recommend to customers based on their previous purchase on your site? You’ll need historical data on customers’ buying behaviors, organized so that you can detect patterns in purchase behavior over a long period of time on a macro scale.
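To make the recommendation case concrete, here is a rough sketch of what "organized purchase history" might look like and how a pattern could be pulled from it. It counts which products show up together in the same customer's history, a deliberately simple stand-in for a real recommendation model rather than a description of how any particular retailer does it. The customer IDs, product names, and function names are all hypothetical.

```python
from collections import Counter, defaultdict
from itertools import combinations


def build_co_purchase_counts(history):
    """Count how often each pair of products appears in one customer's history.

    `history` maps a customer ID to the list of products that customer bought.
    """
    counts = defaultdict(Counter)
    for products in history.values():
        # De-duplicate so repeat purchases by one customer count once per pair.
        for a, b in combinations(sorted(set(products)), 2):
            counts[a][b] += 1
            counts[b][a] += 1
    return counts


def recommend(product, counts, top_n=2):
    """Return the products most often bought alongside `product`."""
    related = counts.get(product, Counter())
    # Sort by count (highest first), then by name so ties are deterministic.
    ranked = sorted(related.items(), key=lambda kv: (-kv[1], kv[0]))
    return [name for name, _ in ranked[:top_n]]


# Hypothetical purchase history keyed by customer.
history = {
    "customer_a": ["tent", "sleeping bag", "lantern"],
    "customer_b": ["tent", "sleeping bag"],
    "customer_c": ["tent", "stove"],
    "customer_d": ["sleeping bag", "pillow"],
}

counts = build_co_purchase_counts(history)
suggestions = recommend("tent", counts)
```

None of this works unless each purchase is tied to a customer and recorded with consistent product names, which is exactly the kind of organization this section is arguing for.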

Do you want to empower sellers to share the best message at the best time based on the prospect’s industry, role, and your relationship to the client? You’ll need consistent sales data for your products and sellers, with a framework for organizing deal types. You’ll also need each record to identify the prospect’s industry, role, and relationship status with your company before machine learning can make any kind of prediction or decision.

There are far more in-depth studies and opinions about the strategies, philosophies, and technical requirements necessary to make moves in this area, but simply knowing what sort of data your brand or company needs to begin with is half of the battle. And, if you’re unsure of where to start, bring in a trusted partner.

…coupled with user centricity

A simple response to many of these challenges could be to ramp up data collection efforts while making sure the data is appropriately organized and labeled. Countless times I’ve heard the statement, “We need more data to make the things we want,” but I believe it’s important to take a step back and introduce user-centric thinking to the mix.

Even if you could create a machine learning solution for your organization, the question remains: should you? And not “should” based only on business outcomes: is it ethically good for your customers and for society? What biases should be considered in your data, your approach, etc.? Is the solution just… weird? Put yourself in the users’ shoes to best understand what solutions will appeal to them the most.

Data and machine learning are powerful tools that have the potential to transform businesses. And while there’s much appeal in jumping feet-first into these technologies, it’s far more important to take an empathetic, solutions-based approach. Nothing your brand builds will serve a purposeful end unless it’s been designed to achieve a specific desired outcome, one that comes from putting the user first and aligning their needs with your business objectives.


As Director of Client Engagement, Dan manages end-to-end client engagements, connecting VSA’s teams with clients and championing their needs throughout the lifecycle of each project. Building on more than 12 years of experience, he has led numerous traditional and digital efforts for IBM across multiple business units. Prior to joining VSA, Dan was a Strategic Account Executive at Digg, and SVP of Strategic Partnerships at LoudDoor, where he worked with clients including Beam Suntory, Kia Motors of America, Beautyrest and Kraft. Dan can be reached at dknott@vsapartners.com.