Summary:
CEOs must not overlook the intricacies of GenAI costs. As CEOs navigate this dynamic landscape, a balanced approach that harmonizes innovation with fiscal prudence is the key to a sustainable and impactful GenAI journey.
With the rise of generative AI, a common misconception is that CEOs need not concern themselves with how much it costs to implement and should instead focus on its strategic implications. In reality, the costs associated with GenAI are intricately woven into the entire process of formulating and executing an enterprise’s GenAI strategy, demanding the continuous attention of CEOs throughout.
GenAI is poised to revolutionize business models, consistent with the historical pattern in which core technological innovations disrupt prevailing business paradigms. Throughout business history, the advent of pivotal technologies has heralded disruptive shifts, and enterprises that failed to adapt have faced extinction. As this transformation unfolds, businesses must navigate the shifting landscape, adapting and innovating to harness the power of GenAI and ensure their relevance and competitiveness in this new era.
Why CEOs Must Pay Attention
Cost considerations stand as vital pillars upon which CEOs must build their GenAI strategies. In an era marked by a digital renaissance, the implications of these costs cannot be overstated. A significant challenge arises from the fact that many companies are in the midst of cloud migration initiatives, while others, who once pioneered these efforts, are retreating from the cloud under the pressure of unforeseen costs. This exodus highlights a critical issue: insufficient consideration of full lifecycle costs during the strategic planning phase. Generative AI, far from a mere add-on to existing systems, demands a holistic approach that includes intricate system design and an understanding of nuanced hidden costs.
CEOs, as the architects of their company’s future, bear the responsibility of grasping the depth of these potential costs. As GenAI infiltrates industries, reshaping business models and customer experiences, a comprehensive understanding of the financial landscape is indispensable.
The Complex Landscape of Generative AI Costs
Inference Cost: The inference cost is the cost of calling a large language model (LLM) to generate a response. Each instance where you furnish an input (the prompt) to an LLM to yield an output (the completion) incurs the utilization of compute resources. This act of engaging a trained LLM to produce an output is formally termed “inference.” At its core, this process is underpinned by GPU compute, constituting what is known as the inference cost.
For example, the inference cost of generating a text completion from an LLM such as GPT-4 is typically around $0.006 per 1,000 output tokens plus $0.003 per 1,000 input tokens. Tokens are pieces of words; 1,000 tokens correspond to roughly 750 words. The inference cost of generating an image from a large image generation model such as DALL-E 2 is typically around $0.18 per image.
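The per-token arithmetic above is simple enough to capture in a short sketch. The rates below are the illustrative figures from the text, not a live price list; actual provider pricing varies by model and changes over time.

```python
def inference_cost(input_tokens, output_tokens,
                   input_rate=0.003, output_rate=0.006):
    """Estimate LLM inference cost in USD.

    Rates are per 1,000 tokens, using the illustrative
    figures from the text (assumed, not current pricing).
    """
    return (input_tokens / 1000) * input_rate \
         + (output_tokens / 1000) * output_rate

# A ~750-word prompt (about 1,000 tokens) yielding a
# similarly sized answer:
print(round(inference_cost(1000, 1000), 4))  # 0.009
```

At a fraction of a cent per call this looks negligible, but the same function applied to millions of daily customer interactions makes clear why volume, not unit price, drives the adoption barrier discussed next.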
The inference cost of generative AI can be a significant barrier to adoption, especially for businesses that need to generate large volumes of content. However, there are a number of ways to reduce it, such as using a smaller model, hosting an open-source LLM, or optimizing the inference process. CEOs can formulate optimal strategies by having engineers and consultants conduct a comparative analysis of these approaches, weighing their advantages and costs carefully.
Fine-tuning Cost: Fine-tuning is the process of adapting a pre-trained generative AI model to a specific task or domain. This involves training the model on a new dataset that is specific to the desired task or domain. The fine-tuning cost of generative AI depends on a number of factors, including: the size and complexity of the model, the amount of data used for fine-tuning, and the number of epochs (iterations) of training.
For example, according to OpenAI, to estimate the costs for a specific fine-tuning job, use the following formula: base cost per 1k tokens * number of tokens in the input file * number of epochs trained. For a training file with 100,000 tokens trained over 3 epochs on GPT-3.5 Turbo, the expected cost would be about $2.40. There are also ways to reduce the fine-tuning cost, notably “transfer learning” and “distributed training.” Transfer learning reuses knowledge learned from another task and therefore requires a smaller dataset. Distributed training uses multiple GPUs or CPUs simultaneously to reduce training time and cost.
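The formula above can be sketched directly. The $0.008-per-1K base rate is the figure implied by the article’s worked example for GPT-3.5 Turbo; treat it as an assumption and substitute the current published rate for any real estimate.

```python
def finetune_cost(tokens_in_file, epochs, base_cost_per_1k=0.008):
    """OpenAI's estimation formula: base cost per 1K tokens
    x tokens in the training file x number of epochs.

    base_cost_per_1k is an assumed rate inferred from the
    text's worked example, not guaranteed current pricing.
    """
    return base_cost_per_1k * (tokens_in_file / 1000) * epochs

# The article's example: 100,000 tokens over 3 epochs.
print(round(finetune_cost(100_000, 3), 2))  # 2.4
```

Note that the cost scales linearly in both dataset size and epochs, which is why transfer learning (a smaller dataset) attacks one factor and fewer epochs the other.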
Prompt Engineering Cost: Prompt engineering is the process of structuring text that can be interpreted and understood by a GenAI model. Effective prompt engineering can lead to significant improvements in the quality and relevance of the model’s output. The expenses linked with the development and utilization of prompts to enhance the performance of GenAI models could indeed be substantial. Such investments necessitate careful consideration and strategic planning to optimize their impact on overall business outcomes.
CEOs need to find the balance between prompt engineering and fine-tuning. If the task requires a high degree of accuracy and precision, then fine-tuning may be the best approach. Fine-tuning can be more expensive and time-consuming than prompt engineering, so CEOs need to consider their budget and resources when making a decision. Additionally, CEOs need to consider the expertise that is available in their company when making a decision.
Cloud Expense: When CEOs contemplate cloud expenses, it’s imperative to look beyond the mere hosting costs of GenAI applications. Instead, a holistic view of the overall cloud architecture after the implementation of GenAI strategies is essential. Take, for instance, the healthcare sector in the United States. Medical institutions cannot simply upload patient data to remote locations. In such scenarios, expanding local private cloud storage becomes imperative when dealing with the massive user-related data generated by GenAI-driven chatbots in customer service.
Additionally, some companies are yet to make a definitive choice among public cloud, private cloud, and multi-cloud solutions. CEOs, in crafting GenAI strategies, must refine their cloud strategies accordingly. For instance, our research indicates that 70% of corporate executives lean towards the “lift-and-shift” cloud migration strategy, wherein legacy systems are maintained as long as possible. However, this seemingly cost-saving approach might inadvertently elevate the expenses associated with GenAI adoption. Enterprises might find themselves forced to juggle two incompatible systems and teams simultaneously, thus compromising efficiency and inflating costs in the long run.
Talent Costs: Talent serves as the cornerstone of GenAI strategies. While the industry consensus acknowledges GenAI’s significant potential in enhancing efficiency across various roles, CEOs must exercise caution to avoid a short-term rush for talent, which could lead to a dramatic escalation in GenAI talent costs. In the long run, GenAI is poised to create entirely new job categories. CEOs, in collaboration with CHROs, must swiftly develop medium and long-term talent plans to adapt to the evolving work landscape driven by GenAI.
Particular attention must be paid to nurturing leadership and cultural capabilities. Faced with the strategic transformations brought about by GenAI — be it changes in organizational structures, underlying business logic, or shifts in work methodologies — businesses require a pool of talent capable of driving these changes. Identifying talents that can be sourced externally versus those that demand internal promotion and training necessitates CEOs to establish a comprehensive talent roadmap during the initial stages of leading GenAI transformations.
Moreover, CEOs must be vigilant regarding the changes in corporate culture stemming from shifts in the talent landscape. For instance, remote working, a preferred mode for many GenAI engineers, might not align with the current company structure if it isn’t inherently remote-first or operates on a hybrid working model. In such scenarios, the office engineering department plays a pivotal role in facilitating the seamless integration of these engineers into the team, fostering efficient collaboration with colleagues.
Operation Costs: Machine learning operations (MLOps) embodies a suite of practices designed to streamline workflow processes and automate machine learning and deep learning deployments. It enables models to be deployed and maintained in production reliably, efficiently, and at scale. Within AI initiatives, especially when contemplating expansion and the transition of models into production, MLOps stands as a pivotal cost-reducing force. Its impact ripples throughout the machine learning lifecycle, chiefly by automating tasks that would otherwise demand substantial manual effort. This not only simplifies the process but also enhances error detection mechanisms and elevates model management standards.
A holistic MLOps lifecycle encompasses pivotal stages: fetching data, engineering features, training, testing, storing, and model deployment. CEOs, when delving into the realm of MLOps, must be mindful of several key considerations, or caveats. These include the continual retraining of ML models based on new data, models exhibiting self-learning capabilities, the gradual degradation of models over time, the dispersion of data across multiple systems, and the immutability of data storage. Understanding these nuances is integral to navigating the intricate landscape of MLOps successfully.
Potential Hidden Costs
Infrastructure Overhaul: GenAI demands a thorough reassessment of existing infrastructure, with potential implications often overlooked by many CEOs. Legacy systems, upon integration with GenAI, might necessitate substantial modifications, exerting a significant impact on costs. Training and deploying GenAI models can be immensely computationally intensive, prompting the need for a robust infrastructure overhaul. This entails ensuring access to substantially augmented computing power, a challenge that can be tackled through the utilization of cloud-based resources, including GPUs and TPUs, or the establishment of dedicated on-premises data centers tailored specifically for GenAI demands.
Furthermore, GenAI models exhibit an insatiable appetite for data, demanding efficient and scalable solutions for data storage and management. CEOs, when estimating costs, often overlook the hidden complexities that emerge from these foundational infrastructure changes. For instance, a client striving to minimize training costs clustered multiple GPUs and employed distributed training, allocating independent data to each GPU simultaneously. Yet, the implementation revealed an unexpected bottleneck: limited bandwidth for communication among GPU cards hindered model training. Data trained on one GPU could only be transferred to another GPU after waiting for the networking module, resulting in substantial resource wastage. Recognizing such intricacies, we developed a novel technology enabling direct data transfer between GPU memories, bypassing the conventional networking module. Anticipating and accommodating potential technological patches of this nature necessitates CEOs to allocate budgetary space when estimating costs, ensuring flexibility in the face of evolving infrastructure demands.
Data Security: In the realm of data security for GenAI, several critical areas require meticulous attention before remediation becomes costly. These encompass preventing data and intellectual property leakage, addressing the risks associated with malicious content dissemination and targeted attacks, and countering misinformation, copyright infringement, biases, and discrimination. GenAI projects inherently pose a heightened risk of compromise, necessitating a well-planned security strategy from the outset.
To mitigate these risks effectively, CEOs must focus on key strategies. First, create a trusted environment and minimize data loss risks through data loss prevention (DLP) practices. Second, train the workforce proactively to ensure employees are informed about the technology’s risks and best practices, countering potential shadow IT and cybersecurity threats. Third, be transparent about the data used in training models, emphasizing openness about data sources and potential biases. Fourth, leverage human oversight alongside AI; incorporating reinforcement learning with human feedback strengthens response validation. Lastly, understand emerging threats such as prompt injection attacks, and design robust security systems around the models themselves.
Ethical Considerations: CEOs must consider the cost of responsibility when formulating their GenAI strategies. Addressing biases, ensuring transparency, and integrating fairness into Generative AI systems come with additional expenses.
For instance, GenAI can inadvertently perpetuate biases from its training data. In the deployment of generative AI, several biases require attention. Machine bias reflects biases ingrained in training data, reinforcing stereotypes. Availability bias favors easily accessible content, reinforcing existing biases. Confirmation bias leads users to encounter information aligning with their beliefs. Selection bias arises from unrepresentative training data, impacting content comprehensiveness. Group attribution bias generalizes behaviors based on a few individuals’ actions, reinforcing prejudices. Contextual, linguistic, and anchoring biases influence content relevance, cultural relatability, and reliance on initial information. Automation bias fosters unwarranted trust in AI-generated content, risking the spread of biased or false information.
Mitigating these biases requires inclusive training datasets, robust governance, and continuous monitoring. Ethical AI development, transparent guidelines, and responsible deployment are crucial. These steps are vital for harnessing GenAI’s potential while ensuring fairness and unbiased technological progress. CEOs must factor in these costs for a truly responsible and effective GenAI strategy.
Controlling Costs: A Strategic Approach for CEOs
Integrating Cost Control into GenAI Strategic Decision-Making Process: CEOs hold the pivotal role of orchestrators for GenAI strategy, even if they can’t delve into the intricacies of every GenAI project’s cost. Instead, CEOs can wield influence by integrating cost control into the decision-making process. This involves crafting a decision-making framework with clear parameters: who participates, what questions to address, what data to collect, which guidelines to uphold, meeting structures, and the delineation of decision-making roles.
For instance, a CEO we recently worked with established a committee to discern which GenAI projects demanded his direct input and which could be overseen by the CIO or CFO. In projects requiring his intervention, a collaboration with key executives and AI experts was orchestrated. Together, they outlined metrics for business evaluation, selected benchmarks, set a six-month timeline, and organized biweekly review sessions to track progress and strategize further steps. Furthermore, the CEO delegated the task of establishing a benchmark to the CIO and CFO, guiding technology selections and vendor choices.
A robust decision-making process not only clarifies roles and goals but also determines the timeframe and cost expectations. CEOs must delineate “who to task,” “what metrics to set,” “what timeframe to establish,” and “what cost expectations to create.” By embedding these parameters into the decision-making flow, CEOs can assert strategic control over GenAI project costs, ensuring a judicious allocation of resources aligned with overarching business objectives.
Monitoring Costs in the Execution of GenAI Strategies: CEOs must have a comprehensive dashboard to monitor all GenAI projects within their organizations. This includes tracking model training costs, inference expenses, fine-tuning expenditures, cloud usage costs, employee salaries, operational expenses, investments in foundational hardware, and more. Thanks to advancements in cloud and GenAI technologies, creating such a dashboard requires just a few hours of development time. CEOs must decide the granularity at which they want to monitor project costs. Current technologies allow for precise tracking, even down to per-minute expenditures for most infrastructure consumption, though CEOs may not need to delve into costs at such minute scales. Dashboard developers can automate weekly reports for CEOs and set up alerts for sudden events, such as when a customer-facing application unexpectedly consumes significant resources, indicating a potential business opportunity. A chatbot can also be easily set up to answer a CEO’s questions.
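The alerting idea described above reduces to a simple rule. The sketch below is a hypothetical illustration, not a production monitoring system; project names, the spend data, and the 2x threshold are all invented for the example.

```python
def flag_cost_spikes(daily_costs, threshold=2.0):
    """Flag projects whose latest daily spend jumps well above trend.

    daily_costs: {project_name: [cost_day_1, ..., cost_day_n]} in USD.
    Returns project names whose most recent day exceeds `threshold`
    times the average of the preceding days. (Hypothetical sketch;
    a real dashboard would pull these figures from billing APIs.)
    """
    alerts = []
    for project, costs in daily_costs.items():
        if len(costs) < 2:
            continue  # not enough history to establish a baseline
        baseline = sum(costs[:-1]) / len(costs[:-1])
        if baseline > 0 and costs[-1] > threshold * baseline:
            alerts.append(project)
    return alerts

# Invented example: one project spikes from ~$40/day to $120/day.
spend = {
    "support-chatbot": [40, 42, 38, 120],
    "internal-search": [15, 14, 16, 15],
}
print(flag_cost_spikes(spend))  # ['support-chatbot']
```

A weekly report is then just this check run over the aggregated figures; the judgment call a CEO retains is the threshold, which encodes how much variance counts as a "sudden event" worth an alert.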
This dashboard serves as the CEO’s prime tool for providing feedback to relevant teams. It offers a clear, time-bound presentation of project status, objectives, and actions. This enables CEOs to ask questions ranging from the overarching vision to intricate details. Simultaneously, project teams can adequately prepare for meetings, fostering a culture of informed decision-making and strategic precision.
Empowering Leadership and Teams: Aligning with strategic objectives, a well-structured GenAI talent pool not only saves costs but also fuels innovation in the GenAI domain. CEOs need to construct this talent pool considering selection, hiring, training, and retention dimensions, all intertwined with cost considerations. Collaboration between CEOs and CHROs is essential to address the scarcity of generative AI experts. Retaining top talent involves competitive salaries and strategic engagement. Upskilling existing employees proves vital and cost-effective in the long term. For example, GenAI developers should master code reviewing alongside generating code. CIOs and CTOs play a pivotal role in extending generative AI skills to non-tech staff, enabling proficient use of AI tools. CEOs can implement adaptable training programs and certifications. Shifting from a talent pyramid to a diamond structure is imperative, focusing on skill enhancement while reducing roles centered on manual tasks.
In conclusion, CEOs must not overlook the intricacies of GenAI costs. CEOs must grasp and integrate the multifaceted costs into their strategic vision, acknowledging nuances such as inference cost, fine-tuning cost, prompt engineering cost, cloud expenses, talent costs, and operation costs. Additionally, CEOs need to be vigilant about often overlooked expenses, including infrastructure overhaul, data security, and ethical considerations. Integrating cost control into decision-making processes, utilizing comprehensive monitoring dashboards, and empowering teams through strategic talent management are indispensable strategies. As CEOs navigate this dynamic landscape, a balanced approach that harmonizes innovation with fiscal prudence is the key to a sustainable and impactful GenAI journey.
Copyright 2023 Harvard Business School Publishing Corporation. Distributed by The New York Times Syndicate.