How to Define and Execute Your Data and AI Strategy
Over the past decade, many organizations have come to recognize that their future success will depend on data and AI (artificial intelligence) capabilities. Expectations are high and companies are heavily investing in the area. However, our experience advising organizations in diverse industries suggests that many have also become disillusioned in their journey to create companywide, data-driven business transformation. This article discusses some of the common pitfalls in the implementation of data and AI strategies and gives recommendations for business leaders on how to successfully include data and AI in their business processes. These recommendations address the core enablers for data and AI capabilities, from setting the ambition level to hiring the right talent and defining the AI organization and operating model.
Keywords: data strategy, AI strategy, data-driven business transformation, AI leadership, digital transformation
Many companies are currently investing in data and artificial intelligence (AI). Since the terminology varies, the activities may be called AI, advanced analytics, data science, or machine learning, but the goals are the same: to increase revenues and efficiency in current business and to develop new data-enabled offerings. In addition, many companies see an increasing responsibility to contribute their AI expertise toward humanitarian and social matters. It is well understood that to stay competitive in the digital economy, the company’s internal processes and products need to be smart—and smartness comes from data and AI.
Over the past 4 years, our company DAIN Studios has been involved in more than 40 data and AI initiatives in different companies and industries in Finland, Germany, Austria, Switzerland, and the Netherlands. Our clients are typically large, publicly listed companies. In our work, we have defined data and AI strategies, evaluated AI execution projects, and advised companies on topics such as data governance, organization, and operating model. We have also built cloud infrastructures, engineered data pipelines, and developed scalable machine-learning models. We have advised dozens of business leaders on how their organizations can become data driven and use AI to their benefit. This article reviews some of our findings and proposes best practices going forward.
When we started our company in early 2016, the term artificial intelligence had just begun its resurrection from the 1960s. Before 2016, the cool term was ‘Big Data’ (which now sounds hopelessly outdated) and before that ‘advanced analytics’ and ‘data science’ (still used!). While many digital-native companies were applying advanced AI methods in 2016, many older companies were not. Digitalization and the resulting requirements for data and AI have taken many established companies by surprise, disrupting business models in industries as diverse as fashion (Sun & Zhao, 2018), insurance (Bohn, 2018), and logistics (Ivanov et al., 2019). Industries experiencing competition from digital-native companies, such as media and retail (Sundström, 2019), have had to transform themselves and rapidly adopt data utilization. In contrast, many manufacturing companies are only in the first phases of their digital and data transformation.
As a result of increased data and AI awareness, many established companies have commenced targeted data and AI programs with big expectations to turn around the business and attract star talent. However, a couple of years into the programs, many show signs of fatigue and unmet expectations, with senior managers and leaders unhappy about the speed of progress. According to a new study, 70% of companies globally are currently working on getting the first AI deployment operational (Schmetzer, 2020). Pilots have been made in selected areas and even data-enabled products may have been launched, but the desired large-scale business transformation has not taken place. Data and AI are still niche activities, not a core competence of the business. In some cases, people whisper about ‘the project which must not be named.’ Bayer, the German pharmaceutical company, has humorously but aptly dubbed the habit of continuous piloting as a disease called “pilotitis.” As a result, management grows increasingly impatient and wonders how to get out of the rut.
The reality is that there are no shortcuts. Amazon, Google, Apple, and Facebook all used very different business strategies to gain their current market dominance and global influence, but their common success is arguably due to their foresight in understanding the value of data and positioning themselves early. They worked from the inside out, placing continuous emphasis on human capability building, alongside developing, testing, and deploying the top technologies internally, so that they could offer the best to their customers. For established, non-digital companies the road is even rockier. Old companies have established ways of working, digitally immature people, and legacy infrastructure. Overcoming these obstacles calls for strong determination and persistence from the company leadership (Ross et al., 2019). It means bringing data and AI into the core of all aspects of decision-making, from strategy to operations, supported by key performance indicators that reward data-driven decision-making. Such commitment usually shows in data and AI capability development being on the agenda of leadership meetings, from the board to the C-suite to senior managers. It is usually also on the agenda of forward-looking human resource managers, who understand that digital talent is of key importance to the company.
Based on our experience, business leaders need to be highly involved in all aspects of the execution of data and AI strategies and the capabilities that the supporting initiatives involve. We observe that fully committed leadership has been one of the common denominators for success in digital transformation and becoming a data-driven company.
Sometimes the leadership understands the importance of becoming data- and AI-driven but feels inadequate in their own knowledge about the subject matter. That is a good sign. Many universities and consultancies offer data and AI training for business leaders. An effective way is to custom-tailor a data and AI workshop as part of the leadership strategy days. A word of warning though: sometimes business leaders make the mistake of focusing on statistics, computer science, and coding in their desire to enhance their understanding of AI. While coding is a critical skill for data scientists and data engineers, business leaders are better off putting their efforts into creating an effective company environment for data and AI. That means setting business goals, hiring the right people, educating the workforce, committing to investments, and implementing an effective operating model and organization for data and AI. This is best done by setting clear goals and incentives for the organization and following up on them. Figure 1 presents the topics that need to be on the leadership agenda when defining and executing the data and AI strategy.
In the next section, we go deeper into the topics that should be accounted for when defining and executing a data and AI strategy.
2. Setting the Data and AI Vision
The economic benefits expected from AI across industries and countries are high. Using various econometric methods, PwC (2018) predicts the global GDP to be 14% higher in 2030 as a result of AI. They assume each industry to obtain a gain in GDP of at least 10% by 2030. For their part, McKinsey (Bughin, 2018) predicts the cumulative GDP to be about 16% higher by 2030. Accenture (2017) gives even higher estimates by modeling that AI could double the annual GDP growth rates by 2035 and increase productivity by up to 40%.
These predictions will not materialize by themselves. The premise for a successful data and AI strategy is to know your business goals. What are your must-win battles? Where do you need to succeed in the future? Access to data will help in the definition of business priorities, but it is important to remember that data and AI will not solve your issues in business models, products, and services. Proper uses of data and AI will help you make more informed decisions, obtain information faster, automate processes, and deliver faster than a human mind could, but they will not make up for a lack of business vision and ideas.
AI priorities are derived from business priorities. Since data and AI will make different contributions in different areas, one should consider the business case for each business area when assessing where to focus a company’s data and AI efforts, as well as AI’s relative importance to the case.
In addition to the outcomes, the implementation effort should be considered when deciding where to deploy data and AI first. For example, putting data and AI into use in sales and marketing will typically yield results quickly, while employing them in product development takes longer but can eventually have a large impact. Often, it makes sense to start with process optimization cases. An improvement of 1% in efficiency or avoided downtime may mean saving millions of euros. Calculating business cases for cost savings is often easier than assessing business opportunities for new revenue, as the existing business processes and data are known. Early wins are important to communicate in order to obtain buy-in from the organization and to increase general understanding of demonstrated AI benefits.
Gaining support and buy-in for a data and AI vision is equally important. The conventional way to do this is to make a business case for data and AI showing the baseline internal rate of return (IRR) on planned investments (‘current state’), and compare it to the IRR of investments in data and AI (‘future state’). Strategizing along these lines is a good exercise for understanding the big picture. However, it needs to be kept in mind that for many digital products, services, and businesses, the option of ‘not doing data and AI’ is not feasible.
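The comparison of a ‘current state’ and a ‘future state’ IRR can be illustrated with a short calculation. The following is a minimal sketch, not a recommended valuation method: the cash-flow figures are invented for illustration, the function names are our own, and the bisection search assumes a single initial investment followed by positive returns.

```python
from typing import Sequence


def npv(rate: float, cash_flows: Sequence[float]) -> float:
    """Net present value of yearly cash flows; cash_flows[0] occurs today."""
    return sum(cf / (1.0 + rate) ** t for t, cf in enumerate(cash_flows))


def irr(cash_flows: Sequence[float], low: float = -0.99, high: float = 10.0,
        tol: float = 1e-9) -> float:
    """Internal rate of return, found by bisection on the NPV function.

    Assumes the NPV changes sign exactly once between `low` and `high`,
    which holds for one initial outflow followed by inflows.
    """
    if npv(low, cash_flows) * npv(high, cash_flows) > 0:
        raise ValueError("NPV does not change sign in the search interval")
    while high - low > tol:
        mid = (low + high) / 2.0
        # Keep the half-interval that still contains the sign change.
        if npv(low, cash_flows) * npv(mid, cash_flows) <= 0:
            high = mid
        else:
            low = mid
    return (low + high) / 2.0


# Hypothetical yearly cash flows in millions of euros (illustration only).
current_state = [-10.0, 3.0, 3.0, 3.0, 3.0, 3.0]   # planned investments without data and AI
future_state = [-14.0, 3.0, 4.5, 5.5, 6.0, 6.0]    # higher investment, growing returns

irr_current = irr(current_state)
irr_future = irr(future_state)
print(f"Current state IRR: {irr_current:.1%}")
print(f"Future state IRR:  {irr_future:.1%}")
```

In practice, the difficult part is estimating the cash flows, not computing the rate; the sketch only shows how the two scenarios are put on the same scale.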
With so many potential opportunities to apply data and AI, it can be difficult for organizations to know where to start. To guide this process, we often use the data opportunity matrix (Figure 2). The natural starting point is often the optimization of current business processes: that is, leveraging existing internal data sources to enhance business models, products, services, internal processes, and functions (e.g., production, marketing, supply chain, HR). These improvements can be optimized further by the enrichment of internal data assets with external data sources. For example, weather data can help improve the accuracy of the predicted caller volume for an insurance company or the right supply for a fashion company.
Once you have a solid understanding of the data and AI use cases that help your current business, new data-driven business opportunities should be investigated. These include data as a business (e.g., selling data) and data partnerships (where new offerings are created by pooling data from several organizations). Neither topic is easy, but the opportunities are worth looking into.
One of our clients, Elo, a Finnish mutual pension company, is a good example of using a systematic approach to develop data and AI capability. As an insurance company, Elo has a solid actuarial unit, but data was not used in optimizing the customer experience. Elo started with the definition of its desired target state and roadmap, and then moved into implementation. That included the development of a new, modern cloud-based infrastructure, an analytics environment, data integrations and modeling from various sources, and the development of dashboarding tools and analytical scores for various customer-interfacing actions. Today, Elo has a dedicated data science team focusing on customer experience.
3. Data Management and Data Governance
The availability of high-quality data is the foundation for successful, productized AI. Data can be called an asset if it is structured according to the FAIR principles (Findable–Accessible–Interoperable–Reusable) as suggested by the European Commission (2018). Data that resides in various systems, in different formats and ontologies, or misses key attributes (such as unique identifiers), is not an asset. If the data asset is not reusable, every data science/AI activity will be a separate, possibly large IT exercise. The principle of ‘build once—use many’ is pivotal for maximizing the value of data assets. For example, for the personalization of an online service, you might want to use behavioral data from the online and mobile channels, Customer Relationship Management (CRM) data, and consumer online and offline transactions—not only data from the online service itself. The goal of a productized data asset is to support all use cases.
When building your company’s data asset, start with the data needed for the prioritized business opportunities/use cases. This sounds self-evident, but in many companies, there is an organizational disconnect between the IT teams that engineer data and the business functions that use data-driven insights. In the worst case, the data engineering teams may be busy building a data asset that integrates various data sources into a common data environment, which ultimately lacks the data sources that the business end-user needs. As a result, both sides end up frustrated.
One practical way to start building a data asset is to take stock of the current data, assessing how ‘FAIR’ it is. The process is called data due diligence or data inventory. A data due diligence responds to questions such as the following: What data exists? Where does it reside? How can it be accessed? What is its quality? Can it be linked to other data? How much effort does its retrieval take? Have we overlooked any obvious data sources for this use case? Once the current state of the data asset has been assessed, a roadmap for its development can be made.
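The due diligence questions above can be captured in a simple inventory record per data source. The sketch below is one possible shape for such a record, not a standard: the field names, the mapping of fields to the FAIR letters, the completeness threshold, and the two example sources are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class DataSource:
    """One row of a data inventory; every field is an illustrative assumption."""
    name: str                    # What data exists?
    system: str                  # Where does it reside?
    in_catalog: bool             # Is it documented so that others can find it?
    access_method: str           # How can it be accessed ("sql", "api", "manual export", "")?
    has_unique_id: bool          # Can it be linked to other data?
    completeness: float          # Rough quality measure: share of complete records (0.0-1.0)
    retrieval_effort_days: int   # How much effort does its retrieval take?


def fair_gaps(source: DataSource, min_completeness: float = 0.8) -> List[str]:
    """Return the reasons why a source falls short of being a reusable asset."""
    gaps = []
    if not source.in_catalog:
        gaps.append("findable: not documented in a catalog")
    if source.access_method in ("", "manual export"):
        gaps.append("accessible: no automated access")
    if not source.has_unique_id:
        gaps.append("interoperable: no unique identifier for linking")
    if source.completeness < min_completeness:
        gaps.append("reusable: too many incomplete records")
    return gaps


# Hypothetical inventory entries.
inventory = [
    DataSource("customer master", "CRM", True, "sql", True, 0.95, 2),
    DataSource("web behavior", "web analytics tool", False, "manual export", False, 0.70, 15),
]

for source in inventory:
    gaps = fair_gaps(source)
    status = "ready" if not gaps else "; ".join(gaps)
    print(f"{source.name} ({source.system}, ~{source.retrieval_effort_days} days): {status}")
```

A list of gaps per source gives a natural starting point for the development roadmap of the data asset.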
A Dutch energy provider we worked with was keen to understand the value of their data asset. They started by identifying the data needed for their prioritized use cases and then analyzed what data they already had and what was missing. Subsequently, they made a plan for retrieving the most critical internal and external data. Working along the value vs. effort axes helped them make a realistic use case implementation roadmap.
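Working along the value vs. effort axes can be sketched as a simple scoring exercise. The example below is a hypothetical illustration: the use cases, the 1 to 5 scores, the quadrant labels, and the ranking by value-to-effort ratio are our assumptions, not the client's actual data or method.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class UseCase:
    name: str
    value: int   # expected business value, 1 (low) to 5 (high)
    effort: int  # implementation effort incl. data retrieval, 1 (low) to 5 (high)


def quadrant(use_case: UseCase, threshold: int = 3) -> str:
    """Place a use case in one of four value vs. effort quadrants."""
    high_value = use_case.value >= threshold
    low_effort = use_case.effort < threshold
    if high_value and low_effort:
        return "quick win"
    if high_value:
        return "strategic bet"
    if low_effort:
        return "fill-in"
    return "deprioritize"


def roadmap(use_cases: List[UseCase]) -> List[UseCase]:
    """Order use cases by value per unit of effort, highest first."""
    return sorted(use_cases, key=lambda uc: uc.value / uc.effort, reverse=True)


# Hypothetical use cases and scores.
candidates = [
    UseCase("consumption forecasting", value=5, effort=4),
    UseCase("data marketplace", value=2, effort=5),
    UseCase("churn prediction", value=4, effort=2),
    UseCase("report automation", value=2, effort=2),
]

for position, uc in enumerate(roadmap(candidates), start=1):
    print(f"{position}. {uc.name}: value {uc.value}, effort {uc.effort}, {quadrant(uc)}")
```

The scores themselves should come out of workshops with business owners and the data due diligence; the code only makes the resulting ordering explicit and repeatable.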
4. Solution Architecture and Technology
Solution architecture and technology refer to the technical side of data asset management. Unlike digital-native companies, established companies typically carry plenty of legacy infrastructure. After defining the business and AI vision and conducting data due diligence, the next step is to have an experienced data and solution architect take a critical look at the current technical architecture and define the target architecture and its development roadmap. This task, too, should follow the end-to-end use case logic, accounting for data collection from operational systems (e.g., CRM, Enterprise Resource Planning (ERP)), data warehouses, cloud environments, analytical environments, and business-interfacing systems. Because their experience traditionally lies in reporting and business intelligence, many data solution architects stop the definition of the data architecture at the data warehouse level. However, automated machine-learning solutions need to be linked back to operational systems, meaning that operational systems should be an integral part of the data and solution architecture. For example, to use your consumer data asset (including individual microsegments, next-best offers, and other consumer scores) in real time as part of a modern, omnichannel marketing-automation system, you need to set up an end-to-end architecture. Yet sometimes the marketing department oversees marketing technology while IT is responsible for backend systems; this may lead to marketing using only marketing data (e.g., online, email) and overlooking many other interesting data sources, because they are not aware of their existence. In the worst case, the data science teams, which should create the companywide ML/AI algorithms, are not involved in either activity.
Managing the transition from traditional IT systems into the digital world is often a lengthy process. While automation and AI will eventually drive costs down, during the transition time, costs are likely to increase as new and old solutions live side by side. Furthermore, a typical IT department’s budgets are tied up with the operation and maintenance of current systems while development budgets are modest. It needs to be understood that new technical solutions require new investments.
5. Data and AI Protection, Privacy, and Regulation
Data protection and privacy are of key interest to consumers and to those with access to consumer data. Data protection relates to data collection, processing, and utilization. According to the General Data Protection Regulation (GDPR) of the European Union (European Parliament and the Council, 2016), the legitimate interest of data processing must be defined, and the user informed about the collection, processing, and combination of their data. The user must be offered mechanisms to opt out and object to data processing. The level of user identification in data flows between different data-processing systems must be defined.
A good team for setting up the companywide privacy policies consists of a combination of business owners, privacy lawyers, and AI strategists (and/or data scientists). AI strategists with a technical background will help translate the business use cases into data and AI requirements and discuss the interpretations of different options with privacy lawyers.
As companies begin to adopt AI solutions, the transparency and explainability of AI have become important societal topics. The European Commission (2020a) puts considerable attention on trustworthy and ethical AI; Europeans should be able to understand and trust the outcomes of AI solutions. While the European Commission Data and AI White Papers (European Commission, 2020b and 2020a, respectively) propose new investments in AI, they also promote a framework of new rules and regulations for assessing ‘high-risk’ AI cases, such as health care, policing, and transport. As already demonstrated by GDPR and now by the new strategy papers, Europe is taking a stricter stand on the deployment of AI than the United States and China.
6. Human Skills
The data and AI journey requires new roles in an organization. While the exact role terminology varies, data and AI roles are needed for four different levels of business processes:
1. Business units and business functions (e.g., sales, marketing, finance)
2. Data science (and business intelligence)
3. Data engineering and data architecture
4. Data platforms and technical solutions
The business use cases come from the business level (no. 1). In addition to actual businesspeople, the role of the AI strategist resides here. The AI strategist translates the business vision and goals into data and AI requirements, oversees project execution, and ensures that project outcomes are taken into use by business processes. Most companies do not have this role, but we see it as one of the most critical roles in the successful execution of data and AI projects. Without an AI strategist, the communication gap between people with a business background and the data scientists is often too wide, and aligning the two sides takes time (see also Malone, 2020). A good background for an AI strategist is that of a senior data scientist who wants to develop into a business and managerial talent. Over time, the AI strategist will take on responsibility for AI product ownership tasks. McKinsey presents this role as “Analytics Translator” (Henke et al., 2018). While this is similar to our definition, we place even more emphasis on the role of the AI strategist as a driver of business impact after the initial solution has been developed. In addition to AI strategists, business leaders themselves need to have a solid understanding of the opportunities of data and AI in order to drive the topic forward and integrate the AI outcomes into their respective business processes.
While most companies lack the role of the AI strategist, many have hired data scientists (no. 2). Data scientists come in various forms, with different backgrounds. Many have studied quantitative subjects such as computer science, mathematics, statistics, physics, or engineering. It makes sense to have a data science team with different types of educational backgrounds (also see Davenport, 2020). For example, people with a statistics or econometrics background are good at statistical inference, while people with a computer science background are proficient in machine-learning techniques and coding. Physicists are trained to model observed phenomena and think outside the box. Data-savvy sociologists, psychologists, or biologists can bring different perspectives to the team. Since machine learning/AI is a new field, there is a large demand for experienced data scientists. We recommend recruiting a senior data scientist as the first hire and letting them build a balanced team consisting of experienced people and promising young talent.
A common mistake is to hire only data scientists and not fill the technical roles such as data engineers and data architects (no. 3) or platform engineers and solution architects (no. 4). This leads to a high level of frustration among data scientists as they must retrieve data from the source systems and build the databases themselves. It has been estimated that nearly 80 percent of data scientists’ time is spent on these tasks, rather than building models and generating insights (Press, 2016). In practice, this often means that the data asset and data infrastructure will not be built properly. Data science teams will do pilots and build point solutions, but a scalable data foundation will be out of reach. Data scientists are trained to build machine-learning models, not to do extract, transform, and load (ETL) and build databases and cloud solutions. The frustrations will spill over to the management side as the pilots do not scale.
Nowadays, many people call themselves data scientists or data engineers. It may be difficult to distinguish walk from talk. In addition to looking at the education and past work experience of potential recruits in this field, we recommend using an assessment test in recruiting. This can reveal more about the candidate than an interview alone, and it can be tailored to the role at hand. In our own recruitment process, we use assessment tests and have found that, in addition to the insight they give us as employers, the candidates also enjoy the tasks and appreciate the insight they give into our daily work. A good start is to hire a chief data and AI officer with experience in business, data, data science, and technology to hire the talent and set up the teams. In addition to subject-matter expertise, this person should have excellent leadership and communication skills, as they need to communicate effectively with people at different levels of the organization.
7. Data and AI Organization
The optimal data and AI organization structure depends on the overall company size and organization, culture, the level of AI maturity, and the type of data/AI tasks.
To get things going, establishing a center of excellence (CoE) generally helps to bring focus to the topic. Depending on where the CoE sits in a company, it will be responsible for different areas. The CoE may consist of data science and business intelligence teams only, while the technical teams (data engineering, platforms) reside in IT. Alternatively, the CoE may cover the technology side, while the data scientists sit in business units. The optimal setup needs to be carefully analyzed. In our experience, most companies will benefit from a common technical infrastructure and data asset management, as well as some form of centralized data science team, which solves the most difficult use cases and creates a scalable AI portfolio for the use of all business units and functions. The AI strategists should optimally sit within business units to drive the AI use cases forward, but in the beginning, they can also reside with the data science teams and help business from there.
In a mature data and AI-driven company, the role of the CoE will become smaller as the whole company uses data in their daily business. At a mature stage, the CoE will continue taking care of common data governance topics such as data quality and integrity, technical systems, ontologies, and standards.
Sometimes, to make starting easy, it makes sense to introduce a companywide AI program to drive the data and AI agenda forward—with the premise that the program exists for two to three years and will then be dissolved. A program’s benefit is that you do not need to make line-organizational decisions in the early phase but will learn over time what type of team structure works for your organization.
8. Operating Model
A closely related topic to data and AI organization is the operating model between different business units. Prioritized business use cases should drive the development of specific data and AI capabilities as identified within the initial strategic assessment. In order to have the data experts work on the most important use cases, business leaders should establish an AI steering group or include the data and AI development into the existing leadership team meetings. The head of the CoE (chief data and AI officer) should drive the agenda in the meetings. In addition to a cross-unit steering group, individual use-case areas should have their own, operational steering groups.
For the first years of the CoE, we have seen that it often makes sense to centralize budgets. Budgets drive prioritization, and without a centralized budget, data and AI activities will not scale up. Typically, individual business units and functions do not want to carry the costs for companywide capability building (e.g., common data models, infrastructure, application programming interfaces) even if it would be optimal for the whole company. This means that the AI solutions become separate, disconnected islands. Furthermore, without a common roadmap, prioritization, and clear governance, businesses that contribute the most financially will demand that they get the resources even if resources would be strategically better used in another area.
It is important to remember that it is not only about the data experts. Business processes and businesspeople are fundamentally impacted by the utilization of data and AI. Close collaboration between the data and business functions secures tangible and sustainable results. The parallel between technical and business development processes is illustrated in Figure 3.
For example, in marketing, to drive trigger-based, personalized marketing, data and targeting models need to be available, but so do content production, customer treatment models, channel strategies, front-end systems, and so on. Always-on data-driven marketing requires different skills and capabilities from marketers than traditional marketing. Similarly, for process automation: if data scientists build a predictive service maintenance model, business impact will only be obtained if the service fleet and technical systems are enabled to perform timely intervention to respond to the predictions.
A smart way to increase the business impact is to give the same incentives to everyone involved in a data/AI project. For example, if the goal is to increase the marketing campaign lift by 20% with AI-driven targeting, this target should be given to marketers, data scientists, and data engineers. This is likely to prompt some objections but will eventually lead to the best results for the company.
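A shared incentive of this kind presupposes that everyone agrees on how the metric is computed. The sketch below shows one common way to define campaign lift, as the relative improvement in conversion rate of an AI-targeted group over a control group. This definition, the function names, and the campaign figures are assumptions for illustration; the article's 20% target could be operationalized differently.

```python
def conversion_rate(conversions: int, contacts: int) -> float:
    """Share of contacted customers who converted."""
    if contacts <= 0:
        raise ValueError("contacts must be positive")
    return conversions / contacts


def campaign_lift(targeted_conversions: int, targeted_contacts: int,
                  control_conversions: int, control_contacts: int) -> float:
    """Relative improvement of the targeted group over the control group."""
    targeted = conversion_rate(targeted_conversions, targeted_contacts)
    control = conversion_rate(control_conversions, control_contacts)
    if control == 0:
        raise ValueError("control group has no conversions; lift is undefined")
    return targeted / control - 1.0


SHARED_TARGET = 0.20  # the same target for marketers, data scientists, and data engineers

# Hypothetical campaign results.
lift = campaign_lift(targeted_conversions=560, targeted_contacts=10_000,
                     control_conversions=450, control_contacts=10_000)
target_met = lift >= SHARED_TARGET

print(f"Campaign lift: {lift:.1%}, shared target met: {target_met}")
```

Because the result depends on data quality, model quality, and campaign execution alike, a single number of this kind gives all three groups a reason to care about each other's work.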
9. Data Science and Machine Learning/AI Algorithms
Like data, algorithms can be treated as an asset. That means that over time, the portfolio of machine-learning/AI algorithms should itself become FAIR. Not every new analytical modeling exercise needs to start from scratch; it can build on top of tested code. This will make the data science team more efficient over time. As in software development teams, this requires the data science team to use common code repositories and standards.
It is also important to establish maintenance processes for the data and algorithm assets. Without them, development teams stagnate because their efforts go into keeping the current assets in production. With maintenance processes in place for the data and algorithm portfolios, the teams are free to discover and develop new solutions.
The more machine-learning algorithms are deployed, the more their explainability becomes a topic. People want to understand why certain predictions or decisions were made. This means that data experts need to be able to explain the reliability of the algorithms (especially neural networks) and the characteristics of the underlying data (e.g., sample and selection biases, stereotypes/prejudices, measurement errors, truncated and censored distributions). Regulators are also taking a growing interest in explainability. The European Commission (2020a), for example, has recently called for mandatory and voluntary labeling of “high-risk” and “no-high risk” AI applications, respectively, to increase the transparency and trustworthiness of algorithms.
10. Final Words
To summarize, the following steps are needed to execute a data and AI strategy successfully (see Figure 4 for illustration):
Translate your business and digital strategy into your data and AI vision and strategy, highlighting the biggest opportunity areas both for optimizing your current business and for building new, innovative businesses with AI and data.
Identify the business processes (product development, production, sales & marketing, supply chain, pricing, HR, finance, etc.) where you want to use data and AI.
Understand the current state of your data and AI capabilities.
Describe the target state for your business processes once data and AI capabilities have been deployed.
Define new data-driven business and product ideas.
Define your execution roadmap, including investments.
Execute the first data and AI use cases by creating your AI playbook, aiming at production readiness.
Automate and scale up operations.
Sometimes people think that a company reaches the highest level of data and AI maturity when all decision-making is done by automated algorithms. That is, however, a misconception. Automation and AI do not make smart business decisions by themselves. The highest level of AI maturity is when the whole company moves in unison, silos are dissolved, and data and AI are used by everyone as part of their daily business. Everything that can be automated will be automated, and humans need to ensure that automation is done in a smart way.
We would like to thank our clients and co-workers for their valuable input for this article. An earlier version of the article has been published as a white paper on www.dainstudios.com and on Medium.
European Parliament and the Council. (2016). Regulation (EU) 2016/679 on the protection of natural persons with regard to the processing of personal data and on the free movement of such data, and repealing Directive 95/46/EC (General Data Protection Regulation). https://gdpr-info.eu/
Ivanov, D., Dolgui, A., & Sokolov, B. (2019). The impact of digital technology and Industry 4.0 on the ripple effect and supply chain risk analytics. International Journal of Production Research, 57(3), 829–846. https://doi.org/10.1080/00207543.2018.1488086
Sun, L., & Zhao, L. (2018). Technology disruptions: Exploring the changing roles of designers, makers, and users in the fashion industry. International Journal of Fashion Design, Technology and Education, 11(3), 362–374. https://doi.org/10.1080/17543266.2018.1448462