7. Lab: Serving Digits with FastAPI#
Previously we made a model. Now it's time to serve it so others can use it!
7.1. Pre-lab#
For this lab, you will serve your previously trained MNIST digits model.
Make your own copy of the template repository: USAFA-ECE/ece386-lab01
Commit your Keras neural network model for predicting digits into server/digits.keras
Make four images of different handwritten digits and place them into client/img/
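If your trained model is still sitting in a notebook, a minimal sketch for saving it in the native Keras format might look like the following (this assumes your trained model is in a variable named model; adjust to your own code):
import keras
# Assumes `model` is the trained Keras model from the previous lab
model.save("digits.keras")  # commit this file as server/digits.keras
# Optional sanity check: confirm the saved file loads back correctly
reloaded = keras.models.load_model("digits.keras")
reloaded.summary()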
7.2. Colab Demo#
This exercise simply demonstrates how to serve a FastAPI app in Google Colab.
Get a terminal#
First, we install colab-xterm, which gives us an interactive terminal. We need this to keep the app process alive, since Colab normally runs a cell's process and then kills it when the cell finishes.
%pip install -q "fastapi[standard]"
!pip install -q colab-xterm
%load_ext colabxterm
%xterm
You should now have an interactive terminal! Run a command in it, such as
pwd
Create a Python script#
Next we will create the Python file. Run the command
vim hello.py
Vim is a ubiquitous text editor on Linux systems. Enter insert mode by pressing the i key. Then paste in this code:
from fastapi import FastAPI

app = FastAPI()

@app.get("/")
def read_root():
    return {"Hello": "World"}
Figure out how to exit Vim.
Once you exit the file, you can make sure you saved it properly with
cat hello.py
Run the script#
In production, you would query a domain name and DNS would resolve it to an IP address. Since we are just testing, simply check your own IP address with this command:
ip a
It shows a loopback at 127.0.0.1 (this is a reserved address and always means "myself"). You will also see an eth0 address that probably starts with 172. This subnet is reserved for local networks. In production, this would be a public IP address, but this will work for now… we are just going to talk to ourselves.
Run the script! We will use dev mode.
# First, view help menu
fastapi dev --help
# Then, you can run the app
# 8000 is the default port
fastapi dev hello.py
You should see output that includes something like:
Serving at: http://127.0.0.1:8000 … Application startup complete.
Notice that in dev mode you get automatic reloading and more debug info, but the app only listens on the local host. In run mode FastAPI will listen on all interfaces.
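For example, once you are ready for other machines to reach the app, you would switch to run mode. A quick sketch (by default run mode binds to all interfaces on port 8000 and does not auto-reload):
# Production-style mode: listens on 0.0.0.0:8000 by default
fastapi run hello.py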
Hit the API#
Now it's time to connect to the API endpoint and see what we get!
Down the road we'll do this with Python, but for our quick test, just run the below cell.
!curl 127.0.0.1:8000/
You should see a glorious {"Hello": "World"} returned! Yes, that's JSON. JSON is a standard way of representing text data, and you'll need to learn to work with it a bit.
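As a preview of what the client will eventually do, here is a minimal Python sketch of the same request (assuming the requests package is installed and the app is still running):
import requests

# Quick test: GET the root endpoint and parse the JSON body
response = requests.get("http://127.0.0.1:8000/")
print(response.status_code)  # 200
data = response.json()       # parse the JSON body into a Python dict
print(data["Hello"])         # World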
Back in your terminal, you should also see a "GET" request from yourself that was answered with a "200 OK" status code. You don't need to go in-depth on HTTP response status codes, but in general:
100s for info
200s for success
300s for redirect
400s for client error
500s for server error
So a "404 Not Found" error is your fault, as the client, because you requested something that doesn't exist!
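You can see this for yourself: curl's -i flag prints the status line, and requesting a path the app does not define comes back as a 404. A quick sketch, assuming the app is still running (run these in your terminal, or prefix with ! in a Colab cell):
# 200 OK for a path the app defines
curl -i 127.0.0.1:8000/
# 404 Not Found for a path it does not define
curl -i 127.0.0.1:8000/does-not-exist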
7.3. Lab#
This lab will require significant self-learning! You'll absolutely need to spend time reading documentation.
Large Language Models (such as ChatGPT or Claude) are authorized for clarifying questions and for asking how to do very specific things, but you cannot ask for a large chunk of the program. Furthermore, you must document your LLM usage in the README by:
Stating your primary query AND
Providing a link to the entire conversation.
For example, this is ok: "Write a demo FastAPI POST that accepts an image and explain each line."
But this is not ok: "Write a FastAPI app with a POST method that feeds an image into a Keras model."
Disclaimer: I entered both of those prompts into ChatGPT and neither result was very good! You really, really need to be careful taking code from LLMs. A better approach might be "I have this simple FastAPI method, but am new to using the package. Can you help explain and critique it for me?"
Note
You should do your development and testing on a single local machine, and then deploy to the server.
Server#
You will use a Raspberry Pi as your server.
All the code for the server will be located in the server/ directory of your repository.
Clone your repository.
Create a virtual environment on the Raspberry Pi.
Use server/requirements.txt to install packages (see the sketch after this list).
Serve the model with fastapi run digits.py
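One possible set of commands for the virtual environment steps, as a sketch (run from inside the server/ directory; exact commands can vary with your OS and shell):
# Create and activate a virtual environment, then install the pinned packages
python3 -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt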
The app should simply:
Open the previously saved Keras model.
Accept a POST request to /predict.
Reshape and grayscale the image to work with the model's input expectations.
Conduct inference on the image and return an integer of the predicted class (a rough sketch of these pieces follows this list).
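To give you a feel for how those pieces fit together, here is a heavily simplified, hypothetical sketch, not a drop-in solution. The preprocessing (28x28 size, scaling, input shape) and the upload field name are assumptions you must adjust to match how you actually trained your model, and file uploads require the python-multipart package:
import io

import keras
import numpy as np
from fastapi import FastAPI, UploadFile
from PIL import Image

app = FastAPI()
model = keras.models.load_model("digits.keras")  # open the previously saved model

@app.post("/predict")
async def predict(file: UploadFile) -> int:
    contents = await file.read()
    img = Image.open(io.BytesIO(contents)).convert("L")  # grayscale
    img = img.resize((28, 28))                           # assumed input size
    arr = np.asarray(img, dtype=np.float32) / 255.0      # assumed scaling
    arr = arr.reshape(1, 28, 28)                          # assumed input shape
    probs = model.predict(arr)
    return int(np.argmax(probs))                          # predicted digit as an integer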
Client#
For your client, you can use any machine with Python and on the same network as the Pi.
All the code for the client will be in the client directory of your repository.
Clone your repository.
Create a virtual environment on your client.
Use client/requirements.txt to install packages.
Run client.py "xxx.xxx.xxx.xxx", where the X's are the IP address of the server.
client.py prompts the user for a path to an image. After hitting "enter", the client makes an HTTP POST request to the server, waits for the response, then displays the integer to the user.
The client continues to prompt the user until the user exits with CTRL+C (a rough sketch of this loop follows).
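A minimal, hypothetical sketch of that client loop, assuming the requests package and that the server responds with a bare integer (adjust the URL, port, and upload field name to match your server):
import sys

import requests

# Hypothetical client sketch: repeatedly POST an image to the server and print the result
server_ip = sys.argv[1]
url = f"http://{server_ip}:8000/predict"

while True:  # exit with CTRL+C
    path = input("Path to a digit image: ")
    with open(path, "rb") as f:
        response = requests.post(url, files={"file": f})
    print("Predicted digit:", response.json())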
Tip
The requests package is easier to use than the built-in urllib.
Deliverables#
Demonstrate your client-server exchange.
Complete the three READMEs in the repository (root, client, server).
Push your code and submit to Gradescope.