A 🚶 through the building blocks of a Web-App based AI product (part 1) (2024)

AI products can seem magical and hard to conceptualize concretely. When I first thought of AI, I imagined a tool that magically computes artifacts that would take me hours … in 30 seconds, flawlessly, with little direction. That is obviously a naive picture, but I still marvel at the generative aspect of AI.

Let’s step through the pieces and demystify the aura of AI products. This will be a 2-part article where I 1) explain the core building blocks of a web-app AI product, and then 2) walk through the code of my project interviewssuck.com (a prototype of a tool that provides real-time recommendations during remote interviews), and share a commented GitHub template that lets you build an AI product of your own.

  • Web App: Simply put, a website a user interacts with: it takes in input (stuff clicked on and typed in) and provides output (text, images, payments, etc.):

    • front-end: the part of the web-app the user sees and interacts with, built with HTML, CSS, and JavaScript.

    • back-end: the server-side logic and data storage of the web-app; it can be built in multiple ways, e.g. JavaScript (Node.js) with a database like MongoDB.

    • Note: technically there is a difference between websites and web-apps. Pre-2010s, websites had significantly fewer capabilities than desktop software; as web development improved, the term “web-app” came to mean a website with software-like capabilities, in contrast to the original, very bare-bones websites.

  • Generative AI Model: an AI model that can generate text, images, videos, etc. in response to some form of input, or prompt.

  • Techniques used to personalize the AI model:

    • As I’m sure you’ve seen, ChatGPT is a very general product that works okay across many use-cases. I will explain a couple of ways to improve the quality of the output the model produces and make it more context-appropriate.

  • Data pipeline [note: this is my terminology]: the way data is mapped from the web-app to the AI model and back.

Web App

All browser-based products (which many AI products are) are web apps: literally HTML and CSS that provide a front-end for users to interact with.

For instance, this Substack post is a web page. You are reading HTML that displays this text, and you can interact with the post in a couple of ways: you can like it, comment on it, etc.

An AI product supplements a web-app by generating output that is added to the content the web app is displaying. Here is a simplified side-by-side comparison of a normal grocery app and a grocery app that uses AI:

Let’s assume you are shopping for groceries online. Normal app:

  • you search for ingredients / browse

  • add to cart

  • check out

AI powered app:

  • you describe what you’re looking for (e.g. ingredients for a quick weeknight dinner)

  • the AI generates a suggested ingredient list and adds it to your cart

  • check out

Generative AI Model

Officially known as an LLM (Large Language Model), an AI model is a tool that can generate text, images, video, and audio on its own. Historically, computers had to make decisions through very specific logic, e.g. when getting out of bed:

  • turn to the right, turn off alarm clock

  • put on T-shirt, then pants

  • check email

Say your partner moved the alarm clock from the right of the bed to the left: if you were a pre-AI computer, you would be unable to turn the alarm off and continue getting up.
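The rigid, rule-based behavior described above can be sketched in a few lines of Python (the function and the "positions" are purely illustrative):

```python
def turn_off_alarm(alarm_position: str) -> str:
    """Pre-AI 'computer' logic: a fixed rule that only handles
    the exact situation it was programmed for."""
    if alarm_position == "right":
        return "alarm off"
    # The clock moved? The hard-coded routine simply fails.
    raise RuntimeError("alarm clock not where expected; routine stuck")
```

The rule works only while the world matches the programmer's assumptions; any change breaks it.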

AI allows a computer to, for the first time, make inferences and generate its own thinking and solutions as to how to solve a problem.

There are different AI models; ChatGPT (from OpenAI) and the models served by Groq are well-known examples. Like people with individually different brains, each model performs relatively better on specific types of task, while some are created for very specific tasks (think of someone who studies brain surgery for years).

But what is it literally doing?

Let’s break down LLM, aka, ‘Large Language Model:’

  • Large = the model has been trained on a large amount of text (more on this below)

  • Language model = the model tries to predict the next word in a sequence, in response to a question or other type of prompt. It does this one word at a time, choosing the most likely word until it has completed the answer

[Figure: an LLM predicting the next word in a sentence, one word at a time]

source: https://medium.com/data-science-at-microsoft/how-large-language-models-work-91c362f5b78f

  • The model takes in an input, most commonly submitted as text (we call this input a prompt), and uses it to produce output

  • So in the above example, the input might be: please complete the following sentence in a realistic way: “the cat likes to”

  • output: the cat likes to sleep in the cat bed
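The “predict the next word” loop can be illustrated with a toy bigram model. This is a drastic simplification of a real LLM (which uses vastly more data and a neural network rather than word counts), but the one-word-at-a-time mechanic is the same; the tiny corpus here is made up:

```python
from collections import Counter, defaultdict

# A tiny made-up corpus standing in for the model's training data.
corpus = (
    "the cat likes to sleep in the cat bed . "
    "the dog likes to play in the yard . "
    "the cat likes to sleep all day ."
).split()

# Count which word follows which (a bigram "language model").
follows = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    follows[prev][nxt] += 1

def complete(prompt: str, max_words: int = 4) -> str:
    """Repeatedly pick the most likely next word, one word at a time."""
    words = prompt.split()
    for _ in range(max_words):
        candidates = follows.get(words[-1])
        if not candidates:
            break
        words.append(candidates.most_common(1)[0][0])
    return " ".join(words)

print(complete("the cat likes to"))
```

Because “sleep” follows “to” most often in the corpus, the completion begins “the cat likes to sleep”, exactly the kind of most-likely-next-word choice the diagram above describes.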

At a very simplified level:

  • AI models learn from data

  • Much like humans go through education where they study concepts to be able to solve a variety of problems, AI studies a critical mass of data. This process is called pre-training.

  • A rabbit hole I won’t go into: generating / finding the data for an AI model to study, and the relationship between that data and the model’s ability to produce output of a minimum quality in a specific context

  • An AI model can learn in two ways: supervised and unsupervised

  • Note, the above is from Chamath’s excellent deep-dive on AI

  • Supervised learning: models study data that is explicitly structured and labeled, e.g. a “teacher” presenting labeled examples of cats to the AI model

  • Unsupervised learning: models are fed data that is not explicitly structured. Instead they must categorize and make sense of the data on their own

See https://www.ibm.com/blog/supervised-vs-unsupervised-learning/ for more info.
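The two learning styles can be contrasted with toy one-dimensional examples. Both the data and the labels below are made up, and real models are vastly more complex, but the distinction is the same: supervised learning gets the answers during training, unsupervised learning has to find structure on its own:

```python
# Supervised: learn from labeled examples (feature -> label).
labeled = [((1.0,), "cat"), ((1.2,), "cat"), ((8.0,), "dog"), ((8.5,), "dog")]

def predict(x: float) -> str:
    """1-nearest-neighbor: copy the label of the closest training example."""
    return min(labeled, key=lambda ex: abs(ex[0][0] - x))[1]

# Unsupervised: no labels; the model must group the data itself.
points = [1.0, 1.2, 8.0, 8.5]

def two_means(data, iters=10):
    """Tiny 1-D k-means (k=2): find two cluster centers without any labels."""
    a, b = min(data), max(data)
    for _ in range(iters):
        group_a = [x for x in data if abs(x - a) <= abs(x - b)]
        group_b = [x for x in data if abs(x - a) > abs(x - b)]
        a = sum(group_a) / len(group_a)
        b = sum(group_b) / len(group_b)
    return a, b
```

`predict(1.1)` returns "cat" because it was told the answers for nearby points; `two_means(points)` discovers two clusters (around 1.1 and 8.25) with no labels at all.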

Techniques used to further personalize and improve an AI model

Let’s start with an analogy:

  • an analyst with a college education, plus 1 month of on-the-job training for an entry-level role

The college education is how the AI gets its general knowledge; prompt-engineering and fine-tuning are the on-the-job training: techniques used to further personalize the AI, teaching it to perform a specific task and supplementing the overall education it received during the model training process.

Prompt Engineering:

Imagine an AI model as an analyst: someone with potential who can analytically make sense of information, consistently, with a reasonable degree of accuracy.

Prompt engineering refers to specific instructions given to a model about how to handle a task, in addition to its general training.

For instance, imagine the task: summarize a 2-page memo and structure it in a specific 3-bullet format for a client. The prompt might look something like: “for every summary, please identify the three main ideas and express your output as a numbered list (1, 2, 3)”

Summarizing the 2-page memo: this relies on the analyst’s general training, i.e. pre-training.

Structuring it in a specific 3-bullet format: this relies on the model being given specific instructions that tell it to produce output in a particular way, by specifying the format and/or providing some examples.
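In code, prompt engineering is just text assembly before the model is called. A minimal sketch, assuming an OpenAI-style chat “messages” format (the function name and the wording of the instructions are illustrative):

```python
def build_summary_request(memo_text: str) -> list[dict]:
    """Assemble a prompt-engineered request: general task framing plus
    specific formatting instructions, followed by the user's document."""
    system_prompt = (
        "You are a helpful analyst. For every summary, identify the "
        "three main ideas and express your output as a numbered list (1, 2, 3)."
    )
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": f"Summarize this memo:\n\n{memo_text}"},
    ]

# The resulting list would then be sent to the model's chat-completion API.
messages = build_summary_request("Q3 revenue grew 12%, costs were flat...")
```

Note that these instructions travel with every single request; the model itself is unchanged, which is exactly the limitation fine-tuning addresses below.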

Fine-tuning

In contrast to a small set of instructions, fine-tuning provides actual examples of inputs and outputs for a model to consume, allowing it to better understand the input/output relationship by studying and inferring from the examples provided.

For instance, as opposed to merely telling the model above “provide a three bullet format” (that is prompt engineering), I would provide actual examples of text + summaries to supplement the data / examples the model already knows about.
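Concretely, fine-tuning data is usually a file of worked input/output pairs. A sketch of what such a dataset might look like, using the JSONL chat format that several providers accept for fine-tuning (the memo and summary here are invented):

```python
import json

# Each record is one worked example: the task instruction, a sample
# input, and the exact output we want the model to learn to produce.
examples = [
    {
        "messages": [
            {"role": "system", "content": "Summarize memos as three numbered points."},
            {"role": "user", "content": "Memo: Q3 revenue grew 12%, costs flat, hiring paused."},
            {"role": "assistant", "content": "1. Revenue up 12%.\n2. Costs unchanged.\n3. Hiring on hold."},
        ]
    },
    # ...in practice, dozens to thousands more examples like this...
]

# Fine-tuning APIs typically expect one JSON object per line (JSONL).
jsonl = "\n".join(json.dumps(ex) for ex in examples)
```

After training on such a file, the desired format is baked into the model's weights instead of being re-stated in every prompt.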

This has a couple advantages:

  • Works faster: prompt engineering requires re-sending the instructions with every single request, rather than the model actually learning the information once

  • Works in a more specific context: imagine a given model being used to diagnose health conditions (note: I took this example from DeepLearning.AI’s course). Pre fine-tuning, the model can only diagnose the problem as “skin”; post fine-tuning, it can name the specific skin condition

  • Can deliver more consistent performance and avoid mistakes called hallucinations, where the model asserts statements that are not true

  • Privacy: one risk of working with AI is that proprietary data particular to your product is incorporated into an AI model. I’m still trying to understand whether fine-tuning can be done in a way that ensures you avoid sharing proprietary data. My hunch is yes, but potentially this involves using an AI model that is less mainstream, or otherwise has less pre-training data than the mammoth models. I’ll keep you updated as I understand this better.

Going back to our analyst example: fine-tuning is as if our analyst takes a Master’s course, say an MBA, which helps them produce better output (let’s imagine a certain type of finance report) by actually training the analyst and giving them new skills and information.

Prompt engineering is limited because it is just a set of instructions, which can convey intent, whereas fine-tuning actually helps the model learn a body of information.

  • I am still learning about RAG and will update this document as I learn

Data Pipeline

  • Remember there are 2 pieces to a web-app based AI product

    1. the web-app, where the user takes typical actions (makes an account, browses, purchases something, writes, edits, etc.)

    2. an AI model that takes in some input, and returns an output

  • A web-app based AI product needs a bidirectional relationship between 1 and 2, i.e. 1<>2: the web-app sends an input to the AI model, which generates an output and sends it back to the web-app

  • In my project interviewssuck.com, I take voice input, convert it to text, and then feed this text to a Groq model that uses prompt engineering (more on this in the follow-up article).

  • In part 2, I will step through the code of interviewssuck.com, my AI-powered recommendation website, applying the above framework

  • Will also update as I improve my understanding of fine-tuning and RAG
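The 1<>2 round trip can be sketched as a single request handler: the web-app collects input, forwards it to the model, and returns the generated output to the page. The model call below is a stub standing in for a real API client, and the function names and prompt wording are illustrative, not taken from interviewssuck.com:

```python
def call_model(prompt: str) -> str:
    """Stub for the AI model (piece 2). In a real product this would be
    an HTTP call to a provider such as Groq or OpenAI."""
    return f"Recommendation based on: {prompt!r}"

def handle_user_input(transcribed_speech: str) -> str:
    """The web-app side (piece 1): wrap the user's transcribed speech in a
    prompt, send it to the model, and return the output for display."""
    prompt = (
        "You are a real-time interview coach. Suggest one improvement "
        f"for this answer: {transcribed_speech}"
    )
    return call_model(prompt)

reply = handle_user_input("So, um, my biggest weakness is...")
```

Swapping the stub for a real API client is the only structural change a production pipeline needs; the web-app-to-model-and-back shape stays the same.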

Sasha’s AI Resources:

  • Note: I am obviously still learning about this space and may have stated something incorrectly despite my best attempts not to. Feedback is very welcome!

