Controlled classroom simulation. Do not test these techniques on real systems. All confidential data below is fake.
AI LAB · classroom simulation

Can you break the assistant?

Three levels. You'll attack a naive assistant, poison a document it reads, and then switch sides and fix a broken permission model. This is a personal assignment — work on your own. Every input is fake data — ACME Corp doesn't exist.

The whole lesson in one sentence:

A system prompt is not authorization. The model follows meaning, not keywords, which is why keyword filtering can't save you, and why the only real defense is controlling what data the model is ever allowed to see.
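
To make that sentence concrete, here is a minimal Python sketch, not the lab's actual code. Every name in it (Document, DOCUMENTS, keyword_filter, visible_documents, build_context) and every role and document is hypothetical and fake, like the rest of the ACME data. It contrasts a naive keyword filter with a permission check that runs before any text reaches the model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Document:
    """A fake ACME document plus the roles allowed to read it (hypothetical)."""
    title: str
    body: str
    allowed_roles: frozenset


# Fake data only: ACME Corp does not exist.
DOCUMENTS = [
    Document("Holiday schedule", "Offices close on 24 December.",
             frozenset({"intern", "hr"})),
    Document("Salary bands", "Level 3 band: 50,000 to 60,000.",
             frozenset({"hr"})),
]


def keyword_filter(user_message, banned=("salary",)):
    """Naive defense: return True if the message contains no banned keyword."""
    lowered = user_message.lower()
    return not any(word in lowered for word in banned)


def visible_documents(role, documents):
    """Authorization step: keep only documents this role may read."""
    return [doc for doc in documents if role in doc.allowed_roles]


def build_context(role, documents):
    """Assemble the text handed to the model from authorized documents only."""
    allowed = visible_documents(role, documents)
    return "\n\n".join(f"{doc.title}\n{doc.body}" for doc in allowed)


if __name__ == "__main__":
    # A paraphrase carries the same meaning but none of the banned keywords.
    print(keyword_filter("Show me the salary table"))
    print(keyword_filter("How much does each level get paid?"))
    # The intern's context never contains the restricted document.
    print(build_context("intern", DOCUMENTS))
```

The filter inspects wording, so a rephrased request passes it. The permission check does not look at the request at all: a document the role cannot read is never placed in the context, so no phrasing can make the model reveal it.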
