AI

  • Published on
    Most AI applications, whether a RAG-based chatbot or a simple model wrapper, rely on prompts to generate responses. User or application inputs are converted into prompts, which are then fed to the underlying model. Because of how these models work, the quality of a response depends heavily on the quality of the prompt. We can test prompts manually to some extent, but that doesn't scale. In this article I will discuss Latitude, a prompt engineering platform that helps with refining prompts, A/B testing them, and measuring their performance.
  • Published on
    Last week I blogged about how quantization can help you run your models on lower-powered hardware. In today's blog, I extend that discussion to ONNX (Open Neural Network Exchange), a standard format for representing machine learning models. ONNX enables interoperability between frameworks and simplifies deployment across diverse hardware, including browser-based inference with onnxruntime-web. I have also included a demo that runs a model in the browser.
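    To make the idea concrete, here is a minimal sketch (not the post's demo) of exporting a small PyTorch model to ONNX and running it with the Python onnxruntime; the browser demo uses onnxruntime-web instead, but the flow is the same. The model here is a hypothetical stand-in.

    ```python
    # Minimal sketch: export a tiny PyTorch model to ONNX, then run it
    # with onnxruntime. The single Linear layer is a hypothetical
    # stand-in for a real network.
    import torch
    import onnxruntime as ort

    model = torch.nn.Linear(4, 2)
    model.eval()

    dummy_input = torch.randn(1, 4)
    torch.onnx.export(
        model,
        dummy_input,
        "model.onnx",
        input_names=["input"],
        output_names=["output"],
    )

    # Load the exported graph and run inference, independent of the
    # framework that produced it.
    session = ort.InferenceSession("model.onnx", providers=["CPUExecutionProvider"])
    outputs = session.run(None, {"input": dummy_input.numpy()})
    print(outputs[0])  # shape (1, 2)
    ```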
  • Published on
    When storing data in memory, the data type used to represent it affects both memory usage and the performance of the overall system. Consider saving a number. At a high level, it can be either an integer (a whole number) or a floating-point number (a number with a decimal part). Floating-point numbers can represent a larger range of values with higher precision. The weights and biases of a large language model, which are learned during training and used to make predictions, are stored as floating-point numbers to maintain that precision. The number of these parameters determines the model's size, its memory footprint, and the computational resources needed to run it. In this post, we will discuss how quantization can reduce a model's memory usage and improve performance, assuming the loss of precision is acceptable.
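    As a rough sketch of the idea (the model and sizes here are hypothetical, not the post's walkthrough): converting float32 weights, at 4 bytes each, to int8, at 1 byte each, gives roughly a 4x reduction in weight memory, so a 7B-parameter model drops from about 28 GB to about 7 GB of weights. PyTorch's dynamic quantization applies this conversion post-training.

    ```python
    # Minimal sketch of post-training dynamic quantization in PyTorch.
    # float32 weights (4 bytes each) become int8 (1 byte each), roughly
    # a 4x reduction in weight memory.
    import torch
    import torch.nn as nn

    # Hypothetical stand-in for a real network.
    model = nn.Sequential(nn.Linear(512, 512), nn.ReLU(), nn.Linear(512, 10))

    # Quantize the weights of Linear layers to int8; activations are
    # quantized dynamically at inference time.
    quantized = torch.ao.quantization.quantize_dynamic(
        model, {nn.Linear}, dtype=torch.qint8
    )

    x = torch.randn(1, 512)
    print(quantized(x).shape)  # same interface, smaller weights
    ```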
  • Published on
    Oracle recently announced that 50+ role-based AI agents within the Oracle Fusion Cloud Applications Suite will help execute frequent, repetitive tasks, and other companies are making similar announcements. In this article I will discuss what AI agents are, cover some of their use cases, and link to tools and frameworks that can help you design and build agents.
  • Published on
    Retrieval Augmented Generation (RAG) has been one of the most commonly implemented LLM use cases of the past couple of years. It is a technique that enhances the capabilities of LLMs by combining them with external knowledge sources: relevant information is retrieved from a knowledge base and incorporated into the LLM's context, and the model then generates a response that leverages both its internal knowledge and the retrieved information. Building RAG applications requires integrating components like vector databases and search algorithms, which can be quite involved. In this blog we'll briefly cover RAG basics and use OpenAI's Assistants to build a simple RAG application.
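    The post builds on OpenAI's Assistants, which handle retrieval for you; as background, here is a hand-rolled sketch of the retrieve-augment-generate loop using OpenAI's embeddings and chat completions APIs. The documents, question, and model names are illustrative assumptions, not the post's exact code.

    ```python
    # Minimal sketch of the RAG pattern: embed documents, retrieve the
    # closest one to the question, and answer with it in context.
    import numpy as np
    from openai import OpenAI

    client = OpenAI()  # reads OPENAI_API_KEY from the environment

    # Hypothetical knowledge base.
    documents = [
        "Our store is open 9am-5pm on weekdays.",
        "Returns are accepted within 30 days with a receipt.",
    ]

    def embed(texts):
        resp = client.embeddings.create(model="text-embedding-3-small", input=texts)
        return np.array([d.embedding for d in resp.data])

    doc_vectors = embed(documents)

    question = "What is the return policy?"
    q_vector = embed([question])[0]

    # Retrieve: cosine similarity between the question and each document.
    scores = doc_vectors @ q_vector / (
        np.linalg.norm(doc_vectors, axis=1) * np.linalg.norm(q_vector)
    )
    context = documents[int(np.argmax(scores))]

    # Augment and generate: put the retrieved text into the prompt.
    answer = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {"role": "system", "content": f"Answer using this context: {context}"},
            {"role": "user", "content": question},
        ],
    )
    print(answer.choices[0].message.content)
    ```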
  • Published on
    This blog post takes you through the process of building a recommendation system, introducing embeddings, vector databases, and their various use cases along the way. These concepts are not limited to recommendation systems; they are widely used in domains such as image recognition, natural language processing, semantic search, and anomaly detection. Representing complex, high-dimensional data in a dense, lower-dimensional space is a fundamental technique in machine learning.
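    As a minimal illustration of the core idea, the sketch below recommends items by cosine similarity between embedding vectors; the items and their 3-dimensional vectors are made up for illustration, whereas real embeddings are learned and have hundreds of dimensions.

    ```python
    # Minimal sketch of recommendation via embeddings: items live as
    # vectors, and "similar" means nearby in that space. Vectors here
    # are hand-picked toy values, not learned embeddings.
    import numpy as np

    item_embeddings = {
        "sci-fi movie A": np.array([0.9, 0.1, 0.0]),
        "sci-fi movie B": np.array([0.8, 0.2, 0.1]),
        "romance movie C": np.array([0.1, 0.9, 0.3]),
    }

    def recommend(liked_item, k=1):
        """Return the k items closest to liked_item by cosine similarity."""
        target = item_embeddings[liked_item]
        scores = {}
        for name, vec in item_embeddings.items():
            if name == liked_item:
                continue
            scores[name] = vec @ target / (
                np.linalg.norm(vec) * np.linalg.norm(target)
            )
        return sorted(scores, key=scores.get, reverse=True)[:k]

    print(recommend("sci-fi movie A"))  # ['sci-fi movie B']
    ```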