
Core Principles of AI Engineering

A 9-minute read on designing AI workflows and agentic systems

Artificial Intelligence (AI) has emerged as a transformative force across industries, enabling automation, decision-making, and enhanced user experiences. However, building effective AI systems is no simple task. Success in AI engineering and product development hinges on adhering to core principles that balance technical capabilities, user needs, and ethical considerations. This article explores four foundational principles for creating robust AI solutions: simplicity, performance vs latency, fixed paths vs open-ended systems, and transparency.

1. Do It as Simply as Possible

You can get really complicated, really quickly in AI development. Simplicity is often underestimated in AI: with the allure of cutting-edge algorithms and complex models, it’s easy to fall into the trap of over-engineering. You’ll find maintaining and scaling such solutions more difficult than it needs to be, though.

Ease of Understanding: A simpler system is easier for teams to understand, modify, and debug. This accelerates development cycles and reduces the likelihood of errors. New engineers joining the team can ramp up faster, too.

Reduced Resource Costs: Complex models typically require more computational power and storage. Simpler models optimise resource use without compromising effectiveness. Given that gen-AI costs are already amongst the highest in any form of computing, this is essential to the value of AI in business.

Improved Reliability: Overly intricate systems can behave unpredictably in edge cases, while simpler models often exhibit more stable performance across a range of scenarios.

Examples of Simplicity in Practice
  • Rule-Based Systems for Narrow Tasks: In domains with clear, well-defined problems, simple rule-based systems may outperform complex machine learning models. For instance, keyword-based email filters remain effective for basic categorisation tasks.

  • Model Pruning and Optimisation: Techniques like pruning, quantisation, and knowledge distillation reduce the complexity of machine learning models while retaining accuracy.
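The keyword-based email filter mentioned above is a good illustration of simplicity: no model, no inference cost, and fully predictable behaviour. A minimal sketch, with illustrative folder names and keywords:

```python
# Rule-based email categorisation: each folder maps to trigger keywords.
# Folder names and keywords here are illustrative, not prescriptive.
RULES = {
    "invoices": ["invoice", "payment due", "receipt"],
    "meetings": ["calendar invite", "meeting", "agenda"],
}

def categorise(subject: str, body: str, default: str = "inbox") -> str:
    """Return the first folder whose keywords appear in the email text."""
    text = f"{subject} {body}".lower()
    for folder, keywords in RULES.items():
        if any(keyword in text for keyword in keywords):
            return folder
    return default
```

Because every path through this function is enumerable, it is trivial to test and debug, which is exactly the reliability advantage simple systems offer.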

Guiding Questions for Simplicity
  • Do I need generative AI to achieve this goal? Executing code is much more efficient and cost-effective than making an AI do it. Any element with the same outcome each time should be handled by code, not AI.

  • What steps in the chain can I disintermediate? Are you creating several AIs with slightly different prompts when one foundational model could do it all?

  • Can existing tools or pre-trained models address the task without custom development? The less code you have to manage, the better your operations will be.

  • What trade-offs are involved in simplifying the solution? Precision and Recall versus latency is a big one, not to mention other considerations like guardrails, security, session and long-term data storage.
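The first question above, whether plain code can do the job, can be applied mechanically: anything with the same outcome each time gets a deterministic handler, and only the genuinely open-ended remainder reaches the model. A sketch of that routing, with a hypothetical `call_llm` standing in for a real model call:

```python
import re

def handle_deterministic(query: str):
    """Answer requests with a fixed outcome in plain code; return None otherwise."""
    # Example deterministic case: simple addition questions.
    m = re.fullmatch(r"\s*what is (\d+) \+ (\d+)\s*\??", query, re.IGNORECASE)
    if m:
        return str(int(m.group(1)) + int(m.group(2)))
    return None

def call_llm(query: str) -> str:
    # Placeholder for an actual model call; only reached when code can't answer.
    return f"[LLM response to: {query}]"

def answer(query: str) -> str:
    return handle_deterministic(query) or call_llm(query)
```

Every query the deterministic handler catches is one fewer generative call to pay for, which matters given how expensive gen-AI inference is.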

2. Task Performance vs Latency

Task performance and latency are two critical metrics in AI system design. They can be considered as two ends of the same rope: pulling on one end moves the other. Striking the right balance between them is essential for building systems that are both effective and usable. Great results are less valuable if one must wait inordinately long for them.

  • Task Performance: This refers to how well an AI system accomplishes its goals. Metrics such as accuracy, precision, recall, or F1 score quantify performance. This is dependent on the quality of the model in use.

  • Latency: Latency measures the time it takes for the system to generate a response or complete a task. High latency can degrade user experience, even if performance metrics are excellent. Latency is a product of the power of the infrastructure in use. In essence, faster networks and more powerful GPUs decrease latency, though other factors can also influence this.

Optimising for the Context

The optimal balance depends on the use case:

  • Low-Latency Applications: Real-time systems, such as voice assistants or autonomous vehicles, prioritise minimal response times. Sacrificing some accuracy for speed is often necessary in these contexts.

  • High-Accuracy Applications: Tasks such as medical diagnosis or fraud detection require high accuracy, even if it means longer processing times. Latency is secondary to the reliability of the output.

Techniques for Balancing Performance and Latency
  • Model Compression: Reducing the size of models can lower latency without significantly impacting performance.

  • Edge Computing: Running models locally on edge devices reduces the time spent on data transmission.

  • Asynchronous Processing: Splitting tasks into synchronous and asynchronous components ensures critical actions happen instantly while less time-sensitive tasks run in the background.
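The asynchronous split described above can be sketched with Python’s asyncio: the time-critical acknowledgement returns immediately, while a slower enrichment step (a stand-in for any long-running model call) continues in the background.

```python
import asyncio

async def slow_enrichment(request_id: str) -> str:
    # Stand-in for a slow, non-critical task (e.g. a long model call).
    await asyncio.sleep(0.1)
    return f"enriched:{request_id}"

async def handle_request(request_id: str, background: list) -> str:
    # Kick off the slow work without waiting for it...
    background.append(asyncio.create_task(slow_enrichment(request_id)))
    # ...and return the time-critical acknowledgement immediately.
    return f"accepted:{request_id}"

async def main() -> tuple:
    background = []
    ack = await handle_request("req-1", background)
    # Background tasks are gathered later, off the critical path.
    results = await asyncio.gather(*background)
    return ack, results

ack, results = asyncio.run(main())
```

The user-visible latency is the cost of `handle_request` alone; the expensive work no longer sits on the critical path.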

Guiding Questions for Performance vs Latency
  • What are the user’s expectations for response time? Some refinement or specificity may have to be sacrificed to achieve real-time latency, for example.

  • Is there room to optimise model architecture for faster inference? We can potentially use smaller, faster models for niche tasks, or multiple models in parallel for subtasks.

3. Fixed Path (Workflow) vs Open-Ended (Agent)

The basic premise is this: if you know the steps you’ll take to get to the goal, it’s a workflow. If you don’t know the steps you’ll take, it’s an agent. This is true regardless of how much of the work is done by GenAI.

Deciding which approach to adopt depends on the complexity and variability of the task.
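The distinction can be made concrete in code: a workflow executes a known sequence of steps, while an agent loops, letting the model decide when the goal is reached. Both sketches below use a hypothetical `call_llm` stub in place of a real model API.

```python
def call_llm(prompt: str) -> str:
    # Stub for a real model call; a production system would call an API here.
    # This stub "decides" the goal is met at step 3, purely for illustration.
    return "DONE" if "step 3" in prompt else "CONTINUE"

def run_workflow(document: str) -> list:
    """Fixed path: the steps are known in advance, so we just execute them."""
    steps = ["extract entities", "classify intent", "draft reply"]
    return [call_llm(f"{step}: {document}") for step in steps]

def run_agent(goal: str, max_steps: int = 10) -> int:
    """Open-ended: the model chooses the path and when the goal is met."""
    for step in range(1, max_steps + 1):
        decision = call_llm(f"goal: {goal}, step {step}")
        if decision == "DONE":
            return step
    return max_steps  # safety cap on autonomy
```

Note the structural difference: the workflow’s control flow lives in your code, where it can be tested and audited; the agent’s control flow lives in the model’s decisions, bounded only by the step cap.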

Fixed Path Workflows

Fixed path systems operate within predefined rules or workflows. These systems are ideal for tasks that:

  • Have well-defined inputs and outputs

  • Require predictable and repeatable behaviour

  • Benefit from clear user expectations

Examples include:

  • Automated Email Sorting: Rules-based systems that categorise emails into folders based on predefined criteria. Naturally, much of this work can be done by simple filters, a technology that’s been around since about 1987. Such filters don’t parse the content of emails, though.

  • Customer Support Workflows: Chatbots programmed with decision trees to answer FAQs or route users to human agents.
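A decision-tree chatbot of the kind described can be as simple as a nested mapping, where every conversational path is enumerable. The menu labels and responses below are illustrative only.

```python
# Nested dict encoding a decision tree: leaves are answers (str),
# branches are further choices (dict). All content is illustrative.
TREE = {
    "billing": {
        "refund": "Refunds are processed within a few working days.",
        "invoice": "Invoices are emailed at the start of each month.",
    },
    "technical": "Routing you to a human support agent.",
}

def resolve(choices: list) -> str:
    """Walk the tree along the user's menu choices and return a response."""
    node = TREE
    for choice in choices:
        if not isinstance(node, dict) or choice not in node:
            return "Sorry, I didn't understand that option."
        node = node[choice]
    return node if isinstance(node, str) else "Please choose a sub-topic."
```

Because the tree is static data, the full set of behaviours can be verified exhaustively before deployment, which is precisely the predictability advantage of fixed-path systems.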

Advantages:

  • Simplicity and reliability

  • Easier to debug and maintain

  • Faster to deploy for narrow use cases

Limitations:

  • Lack of flexibility

  • Inability to handle unexpected scenarios

Open-Ended Agents

This is where we head towards Artificial General Intelligence (AGI) territory. Open-ended agents operate autonomously within broad parameters. They can:

  • Adapt to new information or scenarios

  • Perform complex reasoning

  • Handle tasks with undefined paths or outcomes

Examples include:

  • Autonomous Virtual Assistants: Systems like OpenAI’s ChatGPT that can answer diverse questions and perform varied tasks.

  • AI-Powered Decision Support Systems: Agents that provide recommendations based on dynamic data, such as financial market analysis.

Advantages:

  • Greater flexibility and adaptability

  • Capable of learning and improving over time

  • Suitable for complex or unpredictable environments

Limitations:

  • Higher computational requirements

  • Increased difficulty in debugging and controlling behaviour

Guiding Questions for Workflow vs Agent
  • Is the task well-defined, or does it require flexibility and adaptability? If the task is well defined and currently has humans in the loop applying expertise, it can probably be handled by an AI workflow.

  • What level of control and predictability does the user require? More predictable systems are normally workflows.

  • How critical is transparency in the system’s decision-making? Giving GenAI broader responsibility for deciding how to proceed is generally less transparent and may be more subject to bias.

4. Transparency: Show Your Working

Transparency is a cornerstone of trust and accountability in AI systems. Given that LLM output is non-deterministic (i.e. the same inputs don’t always lead to the same outputs), users and stakeholders need to understand how decisions are made, particularly in high-stakes applications.

  • Building Trust: Transparent systems foster user confidence by explaining how and why decisions are made.

  • Accountability: Transparency ensures developers and organisations can address errors or biases.

  • Regulatory Compliance: Many industries, such as finance and healthcare, require transparent AI systems to meet legal and ethical standards.

Techniques for Transparency
  • Explainable AI (XAI): Methods like SHAP (SHapley Additive exPlanations) or LIME (Local Interpretable Model-agnostic Explanations) provide insights into how models generate outputs.

  • Decision Logs: Recording and presenting the steps taken by the AI system allows users to audit and review its processes.
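A decision log need not be elaborate: recording each step with its inputs and outputs as structured entries is enough to make a pipeline auditable after the fact. A minimal sketch, with illustrative step names:

```python
import json
import time

class DecisionLog:
    """Append-only record of the steps an AI pipeline took."""

    def __init__(self):
        self.entries = []

    def record(self, step: str, inputs, output):
        # Each entry captures what was decided, from what, and when.
        self.entries.append({
            "timestamp": time.time(),
            "step": step,
            "inputs": inputs,
            "output": output,
        })

    def dump(self) -> str:
        # Serialised form, suitable for storage or later audit.
        return json.dumps(self.entries, indent=2)

log = DecisionLog()
log.record("classify", {"text": "overdue invoice"}, "billing")
log.record("route", {"category": "billing"}, "billing-queue")
```

Presenting these entries back to the user, in full or summarised, is what turns an opaque pipeline into one that can show its working.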

  • User-Friendly Interfaces: Simplifying the presentation of complex decisions ensures non-technical users can understand AI behaviour.

Challenges of Transparency
  • Balancing simplicity and comprehensiveness in explanations

  • Ensuring that transparency efforts do not compromise proprietary algorithms

  • Educating users to interpret AI explanations correctly

Guiding Questions for Transparency
  • How will the system explain its decisions to users? It can be a default part of the output, or available on demand.

  • What level of detail is appropriate for different stakeholders? Given that transparency features are technical overheads, pitching at the right level is key. Your marketing AI probably doesn’t need to explain at each step why it’s targeting LinkedIn and Instagram for your social content. Conversely, a loan validation AI would need to explain its decisions much more robustly.

The core principles of AI engineering and product development—simplicity, task performance vs latency, fixed path vs open-ended design, and transparency—serve as guiding lights for creating impactful AI solutions.

  • Developers can ensure maintainability and scalability by keeping systems as simple as possible.

  • Balancing task performance and latency ensures that systems remain both efficient and user-friendly.

  • Choosing between fixed workflows and open-ended agents allows solutions to align with the complexity of the task at hand.

  • Finally, transparency builds trust, fosters accountability, and aligns systems with ethical and regulatory standards.

These principles will remain foundational as AI continues to evolve, ensuring that innovation matches responsibility and practicality. By adhering to these guidelines, businesses can deliver AI products that perform and resonate with users, stakeholders, and society at large.