Prompt engineering is emerging as a critical discipline for bridging the gap between generative AI’s capabilities and its real-world enterprise value. As organisations strive to harness AI at scale, managing prompts with the same rigour as any other strategic asset may prove essential.
The story of generative AI in business is not one of sudden revolution but of cautious evolution. For many executives, the introduction of tools like Microsoft Copilot has been accompanied by anticipation and trepidation in equal measure. Behind the allure of productivity gains lies a growing realisation that the technology is only as good as the way we ask it to perform. At the heart of this is prompt engineering, a practice that, while still in its infancy, could prove to be the critical link between AI’s potential and its practical value for the enterprise.
Paul Walker, Global Solutions Director at iManage, draws a helpful analogy: “Prompt engineering reminds me of the early days of operating systems like MS-DOS in the ’70s and ’80s,” he says. “Back then, we had powerful computing capabilities but limited user interfaces. We handed people the command prompt and said, ‘Here, go explore!’ But only a small subset could fully leverage it.” The parallel with today’s generative AI is striking. For all its capacity to summarise documents, extract insights, or generate structured content, generative AI often arrives at the user interface as an intimidating blank text box.
This is not a problem of ambition but of accessibility. As Walker notes, the early adopters of Gen AI are “the curious and motivated”: those willing to explore, iterate, and refine their queries. Yet most users, particularly within large organisations, will not adopt this mindset. They need abstraction, simplification, and predictability. In other words, they need something more than a prompt.
The rise and recalibration of Copilot
Microsoft’s Copilot is one of the most visible examples of enterprise Gen AI and one that has had to learn quickly. It launched with fanfare and ambition, but adoption has been uneven. “Users didn’t fully grasp its capabilities,” Walker says. “Many didn’t know what to ask it to do or what kind of output to expect. They’d try a few things, feel disappointed when it didn’t meet their expectations, and then disengage.” This behavioural pattern is familiar. It mirrors how organisations often react to emerging technologies when usability does not meet expectations.
However, according to Walker, the solution is already becoming visible: “Copilot’s success has come through simplification, particularly through features like clickable buttons for functions like ‘summarise’ or ‘help with drafting’. These eliminate the need to type specific prompts.” The shift is subtle but significant. It signals a broader movement away from raw interaction with AI models and towards embedded intelligence within the software, a trend that may render the art of prompting invisible to the user.
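In engineering terms, the pattern is straightforward: each button is a thin wrapper around a vetted prompt template, so the user never faces the blank box. The following is a minimal sketch in Python, with hypothetical template names and a stubbed model call, not Copilot’s actual internals:

```python
# Minimal sketch of "buttons instead of a blank box": each UI action maps to
# a vetted prompt template. Template names and wording are illustrative only.
PROMPT_TEMPLATES = {
    "summarise": (
        "Summarise the following document in five bullet points, "
        "preserving key dates, parties, and obligations:\n\n{document}"
    ),
    "draft_help": (
        "Suggest an improved draft of the following text, keeping its "
        "original tone and intent:\n\n{document}"
    ),
}

def call_model(prompt: str) -> str:
    # Stand-in for the real LLM call (e.g. an HTTP request to a hosted model).
    return f"[model response to a {len(prompt)}-character prompt]"

def run_action(action: str, document: str) -> str:
    """Resolve a button press to its canned prompt and invoke the model."""
    prompt = PROMPT_TEMPLATES[action].format(document=document)
    return call_model(prompt)

print(run_action("summarise", "This agreement commences on 1 June 2025..."))
```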
Scaling intelligence, not just models
Beneath this evolution lies a structural challenge: how to scale prompt engineering to align with enterprise workflows, governance, and knowledge structures. For individuals using tools like ChatGPT, prompts are transient and personal. In an enterprise, they must be repeatable, auditable, and shareable.
“Prompt engineering is now almost like a utility,” Walker explains. “It is available everywhere. The real differentiation for companies lies in how they manage their prompt libraries.” These prompts are not simply strings of text; they are organisational assets. They must be created, reviewed, and retired with the same discipline as policy documents or internal knowledge bases. iManage, which handles over 13.5 billion documents and emails for high-stakes clients in law, finance, and government, has developed a framework for prompt lifecycle management. Prompts are linked to document types, controlled by permissions, and continuously benchmarked to ensure alignment with model behaviour.
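What lifecycle management can mean in practice is easiest to see as a schema. The sketch below illustrates the idea rather than iManage’s actual design: each prompt carries lifecycle state, document-type links, and permission controls, and only approved prompts a given role may use are ever surfaced.

```python
# Illustrative schema for prompts managed as governed assets; field names
# are hypothetical, not iManage's actual design.
from dataclasses import dataclass, field
from datetime import date
from enum import Enum

class PromptStatus(Enum):
    DRAFT = "draft"
    APPROVED = "approved"
    RETIRED = "retired"

@dataclass
class ManagedPrompt:
    prompt_id: str
    text: str
    owner: str
    status: PromptStatus
    document_types: list[str]      # e.g. ["nda", "engagement_letter"]
    permitted_roles: set[str]      # permission-controlled access
    version: int = 1
    last_reviewed: date = field(default_factory=date.today)

def visible_prompts(library: list[ManagedPrompt],
                    role: str, doc_type: str) -> list[ManagedPrompt]:
    """Surface only approved prompts this role may run on this document type."""
    return [p for p in library
            if p.status is PromptStatus.APPROVED
            and role in p.permitted_roles
            and doc_type in p.document_types]
```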
Benchmarking, in particular, plays a critical role. “Our internal data science team continuously benchmarks prompt lists against a large dataset to ensure consistency and quality,” Walker says. “As AI models improve, businesses must validate that these updates still align with their goals. A model might suddenly become more capable in certain areas but weaker in others. Without regular testing, the apparent improvement in model performance could, paradoxically, lead to a decline in the relevance or accuracy of outputs.”
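A regression-style check of this kind can be sketched in a few lines. The scoring function below is deliberately naive, a word-overlap comparison against a reference answer, where a production pipeline would use task-specific metrics or human review; the point is the workflow of re-running a prompt library against a fixed evaluation set whenever the model changes:

```python
# Sketch of prompt regression testing after a model update. The scoring is
# deliberately crude; real pipelines use task-specific metrics or reviewers.
def score(output: str, expected: str) -> float:
    """Crude word-overlap score between model output and a reference answer."""
    out, ref = set(output.lower().split()), set(expected.lower().split())
    return len(out & ref) / len(ref) if ref else 0.0

def find_regressions(prompts, eval_set, call_model, baseline, tolerance=0.05):
    """Flag prompt templates whose mean score drops below the stored baseline.

    prompts  -- templates containing a {document} placeholder
    eval_set -- list of (document, expected_answer) pairs
    baseline -- mean scores recorded against the previous model version
    """
    regressions = []
    for template in prompts:
        scores = [score(call_model(template.format(document=doc)), expected)
                  for doc, expected in eval_set]
        mean = sum(scores) / len(scores)
        if mean < baseline.get(template, 0.0) - tolerance:
            regressions.append((template, mean))
    return regressions
```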
Tacit knowledge and the democratisation dilemma
One of the more intriguing implications of prompt engineering is its potential to bridge long-standing organisational silos, particularly in knowledge-rich environments like legal, consulting, and accounting. Much of the critical expertise in these fields resides in tacit form, embedded in individual workflows, templates, and judgement. That knowledge is often difficult to codify, let alone transfer.
“Prompt engineering could democratise knowledge access by making expertise more accessible,” Walker adds. “But doing so requires both cultural and technical shifts. In many firms, senior staff view their personal knowledge assets, such as templates or processes, as intellectual property that distinguishes them. Sharing these assets, even within their own organisation, can feel like a loss of control or value.”
Then there is the generational risk. “Many firms face a looming knowledge gap as senior employees near retirement, taking with them decades of expertise,” Walker notes. When built with domain insight, structured prompts offer one way to capture and disseminate this institutional memory. However, doing so requires buy-in from those who hold the knowledge in the first place. AI may help extract expertise but cannot resolve organisational reluctance to share it.
Small models have a significant impact
As large language models strain the capacity of global cloud infrastructure, a more targeted approach is emerging. Walker believes the future lies in small language models (SLMs): leaner, more efficient models trained for specific industries. “These require less compute power and are potentially more efficient for use cases like law or accounting,” he explains. Where generalist models may struggle with domain-specific nuance, SLMs offer greater precision and reduced cost, particularly in regulated environments where hallucinations carry reputational risk.
This shift also aligns with a broader trend: the return of enterprise customisation. For two decades, firms moved away from bespoke software toward off-the-shelf platforms. That may be reversing. “AI can quickly generate code for custom functions, allowing businesses to address specific needs in real time without long procurement processes,” Walker explains. “What used to require a team of developers and months of effort might now be accomplished in hours. This potential re-energises enterprise IT as a strategic enabler rather than a cost centre.”
From reactive to proactive intelligence
Ultimately, the goal of AI in the enterprise is not to replace knowledge workers but to augment their capabilities in more intelligent, scalable, and secure ways. In domains like law, where the value of work often rests in the interpretation of complex documents, the benefit of AI lies not in output but in insight. According to Walker, “We aim to shift industries from reactive to proactive information management.”
This means embedding AI not just in user interfaces but in the very fabric of enterprise platforms. Organisations can query their data continuously instead of waiting for an audit or regulatory event to trigger document reviews. Instead of relying on email searches or memory, users can draw on structured prompts linked to validated knowledge assets. “AI enables clients to query their data proactively,” Walker explains. “This allows them to prepare for challenges before they arise, reducing costly manual interventions.”
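In code terms, the shift is from ad-hoc searches to scheduled checks. A sketch of the idea, assuming a hypothetical document store and a set of validated “watch” prompts:

```python
# Hypothetical sketch of proactive review: validated prompts run on a
# schedule over new or changed documents, surfacing issues before an audit.
WATCH_PROMPTS = {
    "expiring_terms": (
        "Does the following contract contain terms that expire within "
        "90 days? Answer yes or no, then explain.\n\n{document}"
    ),
}

def proactive_review(documents: dict, call_model) -> list[dict]:
    """Run each watch prompt over each document and collect flagged findings."""
    findings = []
    for doc_id, text in documents.items():
        for check, template in WATCH_PROMPTS.items():
            answer = call_model(template.format(document=text))
            if answer.strip().lower().startswith("yes"):
                findings.append({"document": doc_id, "check": check,
                                 "detail": answer})
    return findings
```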
The success of this vision hinges on infrastructure, governance, and culture. Access to generative AI is not enough. Enterprises must structure their data, benchmark their prompts, and continuously update their practices. As Walker states, “After around seven cycles of learning, AI models begin to reinforce their own outputs, leading to degraded quality.” The only solution is to feed the system fresh, high-quality data, something many organisations are still not equipped to do.
Prompt engineering may, one day, fade from view as its logic becomes embedded in applications and workflows. But for now, it remains the discipline that connects curiosity with capability, structure with scale, and potential with performance. Like the command line of early computing, it is the access point to a new kind of intelligence that does not merely process data but helps organisations understand themselves.