32. Final: LLM Prompt Checkpoint

Tip

This is a great time to use few-shot prompting.

Structured outputs are probably overkill, since we just want a single string.

  1. Design an LLM system prompt for converting a sentence requesting weather into the format wttr.in needs (see the sketch after this list).

  2. Design several test cases.

  3. Evaluate your prompt; iterate if necessary.

  4. Upload this completed notebook to Gradescope.
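For reference, wttr.in expects the location in its URL path, with spaces replaced by + (for example, https://wttr.in/Rio+Rancho). Below is a minimal sketch for sanity-checking a formatted location, assuming the requests library is installed; check_wttr is a hypothetical helper, not part of the template.

import requests

def check_wttr(location: str) -> str:
    """Fetch a one-line weather report for an already-formatted location."""
    # format=3 asks wttr.in for a compact single-line report
    response = requests.get(f"https://wttr.in/{location}", params={"format": "3"})
    response.raise_for_status()
    return response.text

print(check_wttr("Rio+Rancho"))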

%pip install -q ollama

32.1. Template

Complete this and submit to Gradescope.

LLM Setup and Prompt

"""This script evaluates an LLM prompt for processing text so that it can be used for the wttr.in API"""

from ollama import Client

LLM_MODEL: str = "gemma3:1b"  # Change this to the model you want to use
client: Client = Client(
    host="http://localhost:11434"  # Change this to the URL of your LLM server
)

# TODO: define llm_parse_for_wttr()
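
As a starting point only, here is one possible sketch of llm_parse_for_wttr() that uses few-shot prompting with client.chat() from the ollama package. The system prompt wording and the few-shot example are assumptions to iterate on, not a reference solution.

def llm_parse_for_wttr(text: str) -> str:
    """Ask the LLM to reduce a weather question to a wttr.in location string."""
    messages = [
        # System prompt: spell out the exact output format we want back.
        {
            "role": "system",
            "content": (
                "Extract the location from a weather question and format it "
                "for wttr.in: replace spaces with '+' and reply with the "
                "location only, nothing else."
            ),
        },
        # Few-shot example demonstrating the desired input -> output mapping.
        {"role": "user", "content": "What's the weather in Rio Rancho?"},
        {"role": "assistant", "content": "Rio+Rancho"},
        {"role": "user", "content": text},
    ]
    response = client.chat(model=LLM_MODEL, messages=messages)
    return response["message"]["content"]

Note that run_tests() below compares stripped strings exactly, so the prompt has to keep the model from adding any extra commentary around the location.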

Test cases and function

# Test cases
test_cases = [  # TODO: Replace these test cases with ones for wttr.in
    {"input": "What's the weather in Rio Rancho?", "expected": "Rio+Rancho"},
    {"input": "another test....", "expected": "answer"},
    {"input": "another test....", "expected": "answer"},
]
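
# Hypothetical further cases worth covering (multi-word cities, punctuation);
# the expected values assume your prompt maps spaces to '+':
#   {"input": "How is the weather in Los Angeles?", "expected": "Los+Angeles"},
#   {"input": "Is it raining in New York City right now?", "expected": "New+York+City"},
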
def run_tests(test_cases: list[dict[str, str]]):
    """Iterate through a list of test-case dictionaries, run each input
    through llm_parse_for_wttr, and report pass/fail results."""
    num_passed = 0

    for i, test in enumerate(test_cases, 1):
        raw_input = test["input"]
        expected_output = test["expected"]

        print(f"\nTest {i}: {raw_input}")
        try:
            result = llm_parse_for_wttr(raw_input).strip()
            expected = expected_output.strip()

            print("LLM Output  :", result)
            print("Expected    :", expected)

            if result == expected:
                print("✅ PASS")
                num_passed += 1
            else:
                print("❌ FAIL")

        except Exception as e:
            print("💥 ERROR:", e)

    print(f"\nSummary: {num_passed} / {len(test)} tests passed.")

Execute tests

run_tests(test_cases=test_cases)