Updated: Sep 15
Photo by Drew Graham on Unsplash
The board has made AI a top strategic priority for your company and is ready to put its money where its mouth is. Good. But do you have the right people to execute the strategy? Whom do you train, and how? Whom do you hire, and how?
In this article we offer a decision framework to help you bring clarity to the complex web of choices around your AI Talent Strategy.
An AI Talent Strategy Framework
So, what does a good enterprise AI Talent Strategy look like? To help define your company’s needs, we’ve created a simple AI Talent Strategy Framework.
We will address other dimensions of a company's AI Transformation in other articles, but when it comes to AI Talent, companies should look across the following three dimensions: People, Organization and Processes.
Let's dive into each of these pillars.
Data Scientists and Unicorns
The first question to answer is: which roles must be defined and developed in the company to successfully support AI and Data Science activities?
If you are fairly early in your AI Transformation journey, chances are that you are relying on your traditional data analysts. This implies they are probably using “deductive” techniques to explore and summarize data. There is nothing wrong with that, but you’re probably not able to extract all the insight available in your data to drive specific decisions. A professional trained in modern Machine Learning techniques can start using more “inductive” techniques and start leveraging your data to create new products, make your existing ones smarter or make your processes more efficient.
Companies that have started hiring data scientists to get more value out of their data have quickly realized that their organizations had a number of challenges to solve before these data scientists could actually produce any business value. The best representation of this situation was presented by D. Sculley et alii in their now famous 2015 paper “Hidden Technical Debt in Machine Learning Systems”. Below is the awesome visualization from that paper, which puts in perspective how much work is left once you have the Machine Learning code covered.
from “Hidden Technical Debt in Machine Learning Systems” — D. Sculley et alii, Dec. 2015
Time and again we've seen companies join the "unicorn hunt race", assuming that by paying top dollar for very smart PhD-type hires they will solve all the data problems the company is facing and get ahead of the competition. As the above visualization eloquently shows, you need a lot more than a handful of top-notch Data Scientists if you are trying to extract business value from the ocean of data your company may be collecting in Data Warehouses, Data Lakes and the like.
You need a team.
Organizations that have reached Data Fluency have not only clearly codified and standardized the following 5 Data Roles across the enterprise, but also put in place the mechanisms that enable Continuous Learning for the employees in these roles:
Key Roles in advanced Data teams - Copyright The AI Academy 2021
Data Engineer: Having the data stored in a Data Lake and having quality data ready for Machine Learning algorithms are two very different things. Companies trying to do Machine Learning know this very well, and the list of references indicating that 80% of the time in Data Science projects is spent producing the right data is endless. Today the role of the Data Engineer is well established, both as a high-priority need and in terms of the required skills. In a nutshell, the Data Engineer is responsible for guaranteeing access to quality data to address the specific business problem tackled by the team.
Machine Learning Engineer: Once the raw data has been transformed into quality data we can use this data to write Machine Learning code, generate new features, train predictive models and get models ready to be deployed. Here is where the role of the Machine Learning Engineer comes in: he/she applies advanced analytics techniques to extract insight from data and build predictive models to address the business problem tackled by the team.
Data Analyst: A common pattern we see in organizations at the early stages of their AI Transformation journey is to focus excessively on the technical aspects of the Machine Learning magic box and to measure success by monitoring only technical metrics. In a mature organization, the teams creating data science products and solutions are able to clearly formulate the problem and to measure the results of their Machine Learning algorithms in terms of business benefits for the organization. The role of the Data/Business Analyst is therefore critical to bring the necessary Domain Knowledge and guide the team towards a business-oriented definition of success.
Data Storyteller: Extracting insights from data has become a commonly accepted need in most organizations, but those who work with data know that insights aren’t written down somewhere alongside the data, just waiting to be taken. They also know that it is sometimes easy to reach the wrong conclusions if the assumptions adopted for the analysis aren’t valid. A critical role in any data science team is that of the Data Storyteller, who is responsible for extracting valid insights from the existing data and creating compelling visualizations to clearly communicate the results achieved by the team.
MLOps Engineer: As the Sculley visualization above indicates, a great deal of work remains after a team has created an algorithm before it can produce the desired output on a continuous basis for the organization. Part of the skills required for this stage of the data science product lifecycle may be covered by existing DevOps Engineering resources, but monitoring the performance of a production predictive system, and being able to indicate when the model in production no longer provides good predictions, requires specific skills that are covered by the MLOps Engineer role. He/she is basically responsible for the system integration and maintenance tasks required to deploy and monitor the predictive models in a production environment.
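To make the monitoring part of the MLOps Engineer role more concrete, here is a minimal sketch of one way to flag that production data has moved away from the data a model was trained on. It uses the Population Stability Index (PSI), which is our choice for illustration and not something prescribed by this framework. The function names, the 10-bin default and the 0.25 alert threshold (a commonly quoted rule of thumb) are all assumptions; a real team would tune them and would usually rely on a dedicated monitoring tool.

```python
import math
from typing import List, Sequence

# Assumption: 0.25 is a commonly quoted rule-of-thumb threshold for a
# "significant" shift; it is not a universal standard.
DRIFT_THRESHOLD = 0.25


def population_stability_index(
    reference: Sequence[float], current: Sequence[float], bins: int = 10
) -> float:
    """PSI of one numeric feature between a reference and a current sample."""
    lo, hi = min(reference), max(reference)
    if hi == lo:
        raise ValueError("reference sample has no spread")
    width = (hi - lo) / bins

    def bin_shares(values: Sequence[float]) -> List[float]:
        counts = [0] * bins
        for v in values:
            # Values outside the reference range are clamped into the edge bins.
            idx = min(max(int((v - lo) / width), 0), bins - 1)
            counts[idx] += 1
        # A small floor avoids log(0) when a bin is empty.
        return [max(c / len(values), 1e-6) for c in counts]

    ref_shares = bin_shares(reference)
    cur_shares = bin_shares(current)
    return sum((c - r) * math.log(c / r) for r, c in zip(ref_shares, cur_shares))


def has_drifted(reference: Sequence[float], current: Sequence[float]) -> bool:
    """True when the PSI exceeds the (assumed) alert threshold."""
    return population_stability_index(reference, current) > DRIFT_THRESHOLD
```

In practice, `reference` would be a feature column from the training data and `current` the same feature collected from recent production traffic; a `True` result is a signal to investigate and possibly retrain, not proof that the model is wrong.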
"But wait, do I need all 5 profiles?" It depends on which stage of AI Adoption Maturity your company has reached, but you will only be able to reach the highest level of performance ("Data Fluency") if your data products are developed by cross-functional teams that cover these 5 profiles.
Now that we’ve defined which profiles are required in the organization, the next question will likely be: “where/how do I get people with these profiles and sufficient experience who can hit the ground running?” The good news is that there are only two ways to do this (end of good news …):
External Hiring - depending on where you are in your AI Transformation journey, you might need to bring in talent from outside your organization. In the case of Data Science and AI this is easier said than done: the field is still evolving rapidly and the number of people with the right knowledge and enough experience is small. By the rules of supply and demand, this also means that if you can actually find the right people, they will likely cost you a lot.
Upskill/Reskill - implementing an Upskilling/Reskilling strategy could be a more efficient and cost-effective way: equipping the people you already have in your team with the right skills and competences can be faster and cheaper than hiring external talent and guiding them through the learning curve as they come to understand how your company actually works.
Centralized, Decentralized, Hybrid
Defining the required roles and identifying the professionals allocated to these functions is only the first step of your AI Talent Strategy. Another critical aspect is defining your organizational structure and putting in place the structures and methodologies the organization needs to function smoothly.
As with other types of transformation initiatives, organizations tend to follow the “stone in the lake” pattern: the stone dropped in the lake is a new leader responsible for defining the new practice, which is more centralized in the early stages (small inner circles) and slowly expands to the entire organization as best practices and tools get distributed and adopted across the enterprise (expanding circles).
Let’s consider the following three stages of AI Maturity. Think of them as snapshots of a continuous maturity process that can take more or less time, depending on the company's business objectives and the competitive landscape it operates in:
Leadership: Several studies point to the presence of Data-Aware leaders in an organization as a key success factor. Like many other transformation efforts, the AI Transformation of your organization needs leaders who understand and support the required changes if it is to have any real chance of success. Typically this stage is marked by the appointment of a Chief Data Officer (CDO) in charge of the company's Data Strategy.
Center Of Excellence: An AI Center of Excellence (CoE) is a common way to experiment with tools and processes through a focused investment. It provides a lean way for the organization to identify what works and what doesn’t, minimizing innovation risk without resorting to a corporate-wide, big-bang approach. Over time, as best practices become more widespread and decentralized, the CoE can continue to play a valuable role in driving AI Governance, guiding the organization towards full Data Fluency.
Data Fluency: While having executive sponsorship and a CoE that drives the AI innovation roadmap is critical to starting any AI transformation journey, full Data Fluency can only be achieved if the large majority of people in the organization, regardless of area or level, have acquired at least the basic skills to approach any project and decision with a Data-Driven mindset. At this maturity stage, data science products are routinely developed by cross-functional, agile data teams across the enterprise that manage all stages of the data product lifecycle, from planning to continuous monitoring in production.
It's a long and difficult road but one worth taking to reap the full business benefits other organizations are able to get.
Data is a team sport
No AI Talent Strategy can be complete if it only looks at the internal organization and the resources that operate within it. We have seen plenty of examples where medium and large enterprises have spent significant time and money to build a machine with great potential, but fell short in providing the right fuel to move the machine forward at the required speed: that fuel is the way all these resources work together efficiently.
Photo by Josh Calabrese on Unsplash
This is why we have given "Collaboration" a special place in our AI Talent Strategy Framework. Let's take a look at some typical stages in this dimension:
Silos - At the lower stages of the AI Maturity Map we often see data professionals working in silos, performing work in isolation and storing data and results in local environments. This is basically a no-collaboration stage where data engineers need to go through long approval processes to get access to the data they need, machine learning engineers wait weeks to get processing capacity, and it takes forever to put any predictive model into production. Due to the lack of agility and the long lead times, models are typically built only once. This means the models in production might lead the company to take the wrong decisions every day if user behavior changes compared to what the model learned from the old data, a phenomenon called “data drift”. What better example than the pandemic to show how real this risk is? In larger organizations, besides the speed issue, working in silos also means that the same problem might be solved multiple times by separate data teams in different parts of the organization, which adds to the inefficiency.
Knowledge Repositories - As the organization realizes that working in silos is not a very efficient way to scale the data practice, the first step most companies tend to focus on is Knowledge Sharing. Unfortunately, when it comes to data science, this is not as simple as using a company Wiki or internal sharing site. Having helped large enterprises define and build their Data Science Knowledge Repository, we believe only purpose-built tools for knowledge sharing and data team collaboration are capable of meeting the Reproducibility, Consumability and Discoverability requirements typical of data science workflows.
Cross-functional Data Squads - Looking at the variety of skills required to build and maintain a data product, it is clear that only a coordinated effort by multiple people with a broad set of collective skills can efficiently accomplish the task. A simple way to convey this concept is the often-used phrase “Data is a Team Sport”. While hiring or qualifying the right people for the right roles is necessary, and having executive sponsorship and an AI Center of Excellence that drives the corporate AI Roadmap is critical to starting any AI Transformation journey, we believe full Data Fluency can only be achieved if data teams work in cross-functional squads that have developed the ability to build data products end to end together.
Making the right decisions about your AI Talent Strategy may be the most important factor in a successful transition to becoming an AI-first company. In this article we’ve suggested a decision framework centered around People, Organization and Processes to help you chart your path in the challenging AI Talent Race. If you found it useful, please comment below or simply share it within your network.