AI products seem magical and hard to conceptualize concretely. When I first thought of AI, I imagined a tool that magically computes artifacts that would take me hours … in 30 seconds, flawlessly, with little direction. That is obviously a naive picture, but I still marvel at the generative aspect of AI.
Let’s step through the pieces and demystify the aura of AI products. This will be a 2-part article where I 1) explain the core building blocks of a web-app AI product, and then 2) walk through the code of my project interviewssuck.com (a prototype of a tool that provides real-time recommendations during remote interviews), and share a GitHub template with comments that will let you build an AI product of your own.
Web App: Simply put, a website a user interacts with: it takes in input (stuff clicked on and typed in) and provides output (text, images, payments, etc.):
front-end: the part of the web app users see and interact with, built in HTML, CSS, and JavaScript.
back-end: the server side of the web app, which handles logic and data storage; it can be built with many technologies, e.g., JavaScript (Node.js) for the server and MongoDB for the database.
Note: technically there is a difference between websites and web apps. Before the 2010s, websites had significantly fewer capabilities than desktop software; as web development improved, the term “web app” came to refer to websites with software-like capabilities, in contrast to the original, bare-bones websites.
Generative AI Model: an AI model that can generate text, images, videos, etc. in response to some form of input, or prompt.
Techniques used to personalize the AI model:
As I’m sure you’ve seen, ChatGPT is a very general product that works okay across many use cases. I will explain a couple of ways to improve the quality of the output the model produces and make it more context-appropriate.
Data pipeline [note: this is my terminology]: the way data is passed from the web app to the AI model and back.
Web App
All browser-based products (which many AI products are) are web apps: HTML, CSS, and JavaScript that provide a front-end for users to interact with.
For instance, this Substack post is a web page. You are reading HTML that displays this text, and you can interact with the post in a couple of ways: you can like it, you can comment, etc.
An AI product supplements a web-app by generating output that is added to the content the web app is displaying. Here is a simplified side by side comparison of a normal grocery app and a grocery app that uses AI:
Let’s assume you are shopping for groceries online. Normal app:
you search for ingredients / browse
add to cart
check out
AI powered app:
you search for ingredients
add to cart
during this process, a grocery AI model takes as input the ingredients you have selected (apple, raisins, oatmeal) and then adds an output (peanut butter) because it anticipates you might want it
the web app then adds peanut butter, or asks whether you might want it, since peanut butter is a delicious complement to the ingredients you had already added
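The AI step in the flow above can be sketched as code. This toy version, assuming a hypothetical hard-coded pairing table instead of a real LLM call, shows the input → output shape:

```python
# Hypothetical sketch of the AI step above: the cart contents are the model's
# input (its "prompt") and the suggestion is its output. A real app would call
# an LLM API here; this toy version uses a hard-coded pairing table instead.
PAIRINGS = {
    frozenset(["apple", "raisins", "oatmeal"]): "peanut butter",
    frozenset(["pasta", "tomatoes"]): "basil",
}

def suggest_item(cart):
    """Return an item the shopper might also want, or None."""
    for ingredients, suggestion in PAIRINGS.items():
        if ingredients.issubset(cart):
            return suggestion
    return None

print(suggest_item(["apple", "raisins", "oatmeal"]))  # "peanut butter"
```

The real magic of an AI model is that it does not need a hand-built table: it infers plausible pairings from patterns in its training data.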
Officially known as an LLM (Large Language Model), an AI model is a tool that can generate text, images, video, and audio on its own. Historically, computers had to make decisions through very specific logic, e.g., when getting out of bed:
turn to the right, turn off alarm clock
put on T-shirt, then pants
check email
If your partner moved the alarm clock from the right of the bed to the left, a pre-AI computer following these steps would be unable to turn the alarm off and continue getting up.
AI allows a computer to, for the first time, make inferences and generate its own thinking and solutions as to how to solve a problem.
There are many different AI models; ChatGPT (from OpenAI) and the models served by Groq are well-known examples. Like people, whose brains all differ, each model performs relatively better on certain types of tasks, and some are created for very specific tasks (think of someone who studies brain surgery for years).
But what is it literally doing?
Let’s break down LLM, aka, ‘Large Language Model:’
Large = the model has been trained on a large amount of text (more on this below)
Language model = the model tries to predict the next word in a sequence, in response to a question or other type of prompt. It does this one word at a time, choosing the most likely word until it has completed the answer.
source: https://medium.com/data-science-at-microsoft/how-large-language-models-work-91c362f5b78f
The model takes in an input (most commonly submitted as text) and uses that input (we call this input a prompt) to produce output.
So in the above example, the input might be: please complete the following sentence in a realistic way: “the cat likes to”
output: the cat likes to sleep in the cat bed
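The “predict the next word” idea can be illustrated with a toy model. This sketch counts which word follows which in a tiny made-up training text and always picks the most frequent follower; real LLMs learn these patterns with neural networks over vastly more data, but the one-word-at-a-time shape is the same:

```python
from collections import Counter, defaultdict

# Toy next-word predictor: count which word follows each word in a tiny
# training text, then always pick the most frequent follower. Real LLMs do
# this with neural networks trained on billions of documents.
training_text = (
    "the cat likes to sleep . the cat likes to sleep . the dog likes to eat ."
)

follower_counts = defaultdict(Counter)
words = training_text.split()
for current, nxt in zip(words, words[1:]):
    follower_counts[current][nxt] += 1

def predict_next(word):
    """Return the most frequent word seen after `word` in the training text."""
    return follower_counts[word].most_common(1)[0][0]

print(predict_next("to"))   # "sleep" (seen twice after "to", vs "eat" once)
print(predict_next("cat"))  # "likes"
```

Feeding each predicted word back in as new input is how a model completes a whole sentence, one word at a time.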
At a very simplified level:
AI models learn from data
Much like humans go through education where they study concepts to be able to solve a variety of problems, AI studies a critical mass of data. This process is called pre-training.
A rabbit hole I won’t go into: generating / finding the data for an AI model to study, and the relationship between the data and the AI model being able to produce output in a specific context with a minimum quality
An AI model can learn in two ways: supervised and unsupervised.
Note, the above is from Chamath’s excellent deep-dive on AI
Supervised learning: models study data that is explicitly structured and labeled; in the above example, the teacher presents labeled examples of cats to the AI model.
Unsupervised learning: models are fed data that is not explicitly structured. Instead they must categorize and make sense of the data on their own
See https://www.ibm.com/blog/supervised-vs-unsupervised-learning/ for more info.
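As a rough illustration, here is what the two kinds of data might look like as Python structures (the file names and labels are made up for this sketch):

```python
# Illustrative only: labeled vs. unlabeled data as toy Python structures.

# Supervised: each example arrives with the correct answer attached.
supervised_data = [
    {"image": "photo_001.jpg", "label": "cat"},
    {"image": "photo_002.jpg", "label": "dog"},
]

# Unsupervised: just raw examples; the model must find structure on its own.
unsupervised_data = ["photo_001.jpg", "photo_002.jpg", "photo_003.jpg"]
```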
Techniques used to further personalize and improve an AI model
Let’s start with an analogy:
an analyst with a college education, plus 1 month of training for an entry-level job
Model training gives the AI its general education; prompt engineering and fine-tuning are techniques used to further personalize the AI, teaching it to perform a specific task on top of the overall education it received during training.
Prompt Engineering:
Imagine an AI model as an analyst: someone with potential who can analytically make sense of information, consistently, with a reasonable degree of accuracy.
Prompt engineering refers to specific instructions given to a model about how to handle a task, in addition to its general training.
For instance, imagine the task: summarize a 2-page memo and structure it in a specific 3-bullet format for a client. The prompt might look something like “for every summary, please identify the three main ideas and express your output as a numbered list, i.e., 1, 2, 3”
Summarizing the 2-page memo: this relies on the analyst’s general training, i.e., pre-training
Structuring it in a specific 3-bullet format: this relies on the model being given specific instructions that tell it to produce output in a particular way, by specifying the format and/or providing some examples.
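In code, prompt engineering often amounts to prepending standing instructions to each user input before it is sent to the model. This is a minimal sketch; the instruction text and helper name are my own illustration, not any provider’s API:

```python
# A minimal sketch of prompt engineering: standing instructions are prepended
# to every user input before it goes to the model. The instruction text and
# helper name here are illustrative, not a specific provider's API.
SYSTEM_INSTRUCTIONS = (
    "You are an analyst. For every document you receive, identify the three "
    "main ideas and present them as a numbered list (1., 2., 3.)."
)

def build_prompt(memo_text):
    """Combine the standing instructions with the memo to be summarized."""
    return f"{SYSTEM_INSTRUCTIONS}\n\nDocument:\n{memo_text}"

print(build_prompt("Q3 revenue grew 12% while costs stayed flat ..."))
```

Note the model’s weights never change here; the instructions simply ride along with every request.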
Finetuning
In contrast to a small set of instructions, fine-tuning provides actual examples of inputs and outputs for a model to consume, allowing the model to better understand the relationship between input and output by studying the examples provided.
For instance, as opposed to merely telling the model above to “provide a three-bullet format” (this is prompt engineering), I would provide actual examples of texts plus their summaries to supplement the data and examples the model already knows about.
This has a couple advantages:
Works faster: prompt engineering makes the model re-read its instructions on every single request, rather than actually learning the information; fine-tuning bakes it into the model
Works in a more specific context: imagine a given model is being used to diagnose health conditions (note: I took this example from DeepLearning.AI’s course). Before fine-tuning, the model can only diagnose the problem as “skin”; after fine-tuning, it can name the specific condition
Can deliver more consistent performance and avoid mistakes called hallucinations, where the model asserts statements that are not true
Privacy: one risk of working with AI is that proprietary data particular to your product is incorporated into an AI model. I’m still trying to understand whether fine-tuning can be done in a way that ensures you avoid sharing proprietary data. My hunch is yes, but this potentially involves using an AI model that is less mainstream, or that otherwise has less pre-training data than the mammoth models. I’ll keep you updated as I understand this better.
Going back to our analyst example: fine-tuning is as if our analyst takes a Master’s course, say an MBA, which helps them produce better output (let’s imagine it is a certain type of finance report) by actually training the analyst and giving them new skills and information.
Prompt engineering is limited because it is just a set of instructions, which can convey intent, whereas fine-tuning actually helps the model learn a body of information.
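As a sketch, a fine-tuning dataset is usually a file of input/output example pairs. The chat-style JSONL layout below is common among hosted fine-tuning APIs, but the exact field names vary by provider, so treat this schema as illustrative:

```python
import json

# A sketch of fine-tuning data for the memo-summary task: each example pairs
# an input (the memo) with the desired output (a three-point summary). The
# "messages"/role schema mirrors common hosted fine-tuning APIs, but field
# names differ by provider, so check your provider's docs.
examples = [
    {
        "messages": [
            {"role": "user", "content": "Summarize: <2-page memo text>"},
            {"role": "assistant", "content": "1. ...\n2. ...\n3. ..."},
        ]
    },
]

# Fine-tuning APIs typically accept one JSON object per line (JSONL).
with open("training_data.jsonl", "w") as f:
    for example in examples:
        f.write(json.dumps(example) + "\n")
```

In practice you would need many such examples, and the provider trains a private copy of the model on the file.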
I am still learning about RAG and will update this document as I learn
Data-Pipeline
Remember, there are 2 pieces to a web-app-based AI product:
the web-app where the user does typical actions (makes an account, browses, purchase something, writes, edits etc)
an AI model that takes in some input, and returns an output
The product needs a bidirectional relationship between 1 and 2, i.e., 1<>2: the web app sends an input to the AI model, which generates an output and sends it back to the web app.
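That round trip can be sketched as two small functions: one that packages the web app’s input for the model, and one that unpacks the model’s output for display. The payload shape and model name here are hypothetical placeholders, not any real provider’s API:

```python
# A sketch of the data pipeline between web app and AI model: the app packages
# user input into a request, sends it to the model's API, and displays the
# response. Payload shape and model name are hypothetical placeholders.
def build_model_request(user_text):
    """Package the web app's input for the model (1 -> 2)."""
    return {"model": "example-model", "prompt": user_text}

def handle_model_response(response_json):
    """Extract the generated text to display in the web app (2 -> 1)."""
    return response_json.get("output", "")

# Round trip, with the network call stubbed out by a fake response:
request = build_model_request("the cat likes to")
fake_response = {"output": "sleep in the cat bed"}
print(handle_model_response(fake_response))  # "sleep in the cat bed"
```

In a real app, the request would be sent over HTTPS to the model provider, and the response handler would run in the back-end before the result is rendered in the front-end.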
In my project interviewssuck.com, I take voice input, convert it to text, and then feed this text to a Groq-hosted model that uses prompt engineering (more on this in the follow-up article).
In the next article, I will step through the code of interviewssuck.com, my AI-powered recommendation tool, applying the framework above.
I will also update this as I improve my understanding of fine-tuning and RAG.
Sasha’s AI Resources:
Note: I am obviously still learning about this space and may have stated something incorrectly despite my best attempts not to. Very welcome to feedback!
For learning web-dev needed to build the wrapper tech around AI models
Finetuning:
For understanding LLMs:
Prompt engineering:
https://www.promptingguide.ai/
For understanding the bigger picture of how AI works and the types of products being created around it:
Chamath Palihapitiya, “Deep Dive: Artificial Intelligence” (Substack)