I. The Question That Suddenly Woke Me Up
The other day, a friend and I were talking about the backend architecture of LingoContext, and he asked a very simple question:
“Is your AI endpoint only accessible by your extension? People can’t just use it freely, right?”
I instinctively replied, “Of course, CORS restricts the origin, only allowing requests from chrome-extension://”—but halfway through my sentence, I started feeling guilty. Because I suddenly realized I had never actually verified this.
I opened my terminal and ran a command against my own production API:
curl -X POST https://lingo-context-api.vercel.app/api/analyze/stream \
  -H 'Content-Type: application/json' \
  -H 'Origin: chrome-extension://gjcgecdgmhbehagblealbdghojkoeakk' \
  -d '{"text":"hello"}'
It returned a streaming JSON response containing Gemini’s answer to me. I had just used a single line of curl that anyone could write to call my own paid AI service.
At that moment, I realized my understanding of “extension-exclusive APIs” was wrong from the very beginning. This article is my reflection on this misconception, and how I went about fixing it.
II. A Common But Fatal Misconception: “Only Let My Extension Call the API”
My initial thought process was something like this:
- Add a whitelist for chrome-extension://<my-extension-id> in the CORS configuration.
- Check the Origin header of the request, and allow it if it matches.
- Assume that “since others don’t know my extension ID anyway, it shouldn’t be a big deal.”
This logic makes sense under the browser’s security model—but for preventing API abuse, it has a fatal flaw:
The Extension ID is Public
If you open the Chrome Web Store, the extension’s ID is right there in the installation link:
https://chromewebstore.google.com/detail/lingocontext-%E2%80%94-context-aw/gjcgecdgmhbehagblealbdghojkoeakk
The last path segment, gjcgecdgmhbehagblealbdghojkoeakk, is the extension ID.
The Extension Code is Public
A Chrome extension is packaged as a .crx file, which is essentially a zip archive. Anyone who downloads and extracts it can directly read content.js, background.js, and manifest.json. So the idea of “hiding a secret key in the client for verification” was never a viable path.
The Origin Header Can Be Forged Arbitrarily
Browsers automatically set the Origin when making a request, but the server only receives a string. curl -H 'Origin: chrome-extension://any-id' passes the CORS whitelist check just the same. CORS is meant to protect the user’s browser from being abused by malicious websites—it is not meant to block curl.
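To make this concrete, here is a minimal sketch of what an Origin whitelist looks like from the server's side. The helper name `isAllowedOrigin` and the plain header objects are assumptions for illustration, not LingoContext's actual code; the only real value is the public extension ID from the Web Store link.

```javascript
// Hypothetical whitelist check: the server only ever sees a header string.
const ALLOWED_ORIGINS = new Set([
  "chrome-extension://gjcgecdgmhbehagblealbdghojkoeakk",
]);

function isAllowedOrigin(headers) {
  return ALLOWED_ORIGINS.has(headers.origin);
}

// Sent by the real extension (the browser sets Origin automatically).
const fromBrowser = { origin: "chrome-extension://gjcgecdgmhbehagblealbdghojkoeakk" };
// Sent by curl with -H 'Origin: ...': the exact same string.
const fromCurl = { origin: "chrome-extension://gjcgecdgmhbehagblealbdghojkoeakk" };

console.log(isAllowedOrigin(fromBrowser)); // true
console.log(isAllowedOrigin(fromCurl)); // true, indistinguishable from the browser
```

The check has nothing to go on except a string the caller chose, so it cannot tell the two requests apart.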
The True Security Model
Putting these three things together, the conclusion becomes clear:
“Only let my extension call the API” is technically unachievable. What can be trusted is never the client, but the logged-in user.
The extension is just the UI layer, an entry point for the user to interact with the API. The trusted entity is the real user who has logged in via Google OAuth and holds a session cookie.
The direction for the fix thus became clear: all paid APIs must require authentication. CORS stays, but as a layer of defense-in-depth, not the only layer.
III. Status Audit: Which Endpoints Were Running Naked
With a mindset of re-evaluation, I swept through all my routes:
| Route | Login Required? | Notes |
|---|---|---|
| /api/words/* | ✅ ensureAuthenticated | Vocabulary CRUD |
| /api/user/* | ✅ ensureAuthenticated | User preferences |
| /api/analyze | ❌ Naked | Calling Gemini / DeepSeek |
| /api/analyze/stream | ❌ Naked | Streaming call to Gemini / DeepSeek |
| /api/tts | ❌ Naked | Edge TTS, consumes bandwidth |
| /api/word-definition | ❌ Naked | Calling AI for quick translation |
| /api/furigana | ❌ Open | Local kuromoji, cheap |
| /api/dictionary | ❌ Open | Calling free third-party dictionaries |
The database and vocabulary notebook were fine, but all the AI endpoints that actually burn money were open. During early development, I added authentication to “user data-related” routes, but I forgot that “AI invocation = user data-related”. From an attacker’s perspective, AI endpoints are actually the ones with the most motive for abuse—others might not care about your vocabulary notebook, but they would definitely be willing to freeload on your Gemini quota.
IV. The Fix: A Three-Layered Defense
I didn’t adopt a simple “fix it once and for all” approach; instead, I broke the solution into three layers:
Layer 1: Authentication (The Most Critical)
I added the ensureAuthenticated middleware to all four money-burning routes. If a request lacks a valid session cookie, it immediately returns a 401 and never enters the business logic:
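// server/index.js
// app.use() matches by path prefix, so the more specific /api/analyze/stream
// mount goes first; otherwise the /api/analyze chain (rate limiters included)
// would also run for every stream request.
app.use(
  "/api/analyze/stream",
  ensureAuthenticated,
  aiPerMinute,
  aiPerDay,
  analyzeStreamRoutes
);
app.use(
  "/api/analyze",
  ensureAuthenticated,
  aiPerMinute,
  aiPerDay,
  analyzeRoutes
);
app.use(
  "/api/word-definition",
  ensureAuthenticated,
  aiPerMinute,
  aiPerDay,
  wordDefinitionRoutes
);
app.use("/api/tts", ensureAuthenticated, ttsPerMinute, ttsRoutes);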
This layer blocks every request that lacks an OAuth login identity. To get past it, a curl user must first complete the Google OAuth flow in a browser and copy out the resulting session cookie. That is still possible, but the caller is now a known account rather than an anonymous stranger, and a known account can be rate-limited or banned.
At the same time, I changed the 401 response body to a structured format, making it easier for the client to recognize and display:
res.status(401).json({
  error: "Unauthorized. Please login.",
  message: "Please sign in via the LingoContext popup to use this feature.",
  code: "AUTH_REQUIRED",
});
The old version only had { error: '...' }, leaving the frontend to display a vague message. The new version adds a code field, allowing the frontend to directly render a “Sign In” button based on AUTH_REQUIRED instead of a “Retry” button—the error response itself is also UX.
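The middleware itself is not shown in this article. A minimal sketch of what it might look like, assuming Passport-style sessions where `req.isAuthenticated()` has been attached to the request, is:

```javascript
// Sketch only: assumes Passport has attached req.isAuthenticated().
function ensureAuthenticated(req, res, next) {
  if (typeof req.isAuthenticated === "function" && req.isAuthenticated()) {
    return next(); // valid session: continue to rate limiting and the route
  }
  // No session: stop here so the AI provider is never called.
  return res.status(401).json({
    error: "Unauthorized. Please login.",
    message: "Please sign in via the LingoContext popup to use this feature.",
    code: "AUTH_REQUIRED",
  });
}
```

Returning early matters here: the handler never reaches the Gemini or DeepSeek call, so an anonymous request costs nothing beyond the 401 itself.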
Layer 2: Per-User Rate Limiting
Authentication only blocks “people without identities.” But if a user’s session leaks, or if a user writes a script to abuse the API themselves, authentication alone is not enough.
I introduced express-rate-limit, and the choice of key is the crucial part of the design:
function keyByUserOrIp(req) {
  if (req.user && req.user.id != null) {
    return `u:${req.user.id}`; // Prioritize by user ID
  }
  return `ip:${req.ip || "unknown"}`; // Use IP for non-logged-in scenarios
}
Why not just use IP? Because IPs are unstable. Users switching Wi-Fi, using VPNs, or sharing office IPs can all cause IP-based rate limiting to mistakenly block normal users. The user ID is the real unit a quota should be attached to.
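To see the effect of the key choice, here is a toy fixed-window counter built around the same key function. This is not express-rate-limit; `createCounter` is a hypothetical in-memory helper written only to illustrate the behavior, and the IP address is a documentation placeholder.

```javascript
function keyByUserOrIp(req) {
  if (req.user && req.user.id != null) {
    return `u:${req.user.id}`;
  }
  return `ip:${req.ip || "unknown"}`;
}

// Toy in-memory fixed-window counter, for illustration only.
function createCounter(limit, windowMs) {
  const hits = new Map(); // key -> { count, resetAt }
  return function allow(req, now = Date.now()) {
    const key = keyByUserOrIp(req);
    let entry = hits.get(key);
    if (!entry || now >= entry.resetAt) {
      entry = { count: 0, resetAt: now + windowMs };
      hits.set(key, entry);
    }
    entry.count += 1;
    return entry.count <= limit;
  };
}

const allow = createCounter(2, 60_000);
const office = "203.0.113.7"; // two users behind one shared IP
console.log(allow({ user: { id: 1 }, ip: office }, 0)); // true
console.log(allow({ user: { id: 1 }, ip: office }, 1)); // true
console.log(allow({ user: { id: 1 }, ip: office }, 2)); // false: user 1 is over the limit
console.log(allow({ user: { id: 2 }, ip: office }, 3)); // true: user 2 is unaffected
```

With an IP-only key, the fourth call would have been rejected too, because both users share the same address.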
The specific limits:
| Limiter | Window | Default Value | Control Variable |
|---|---|---|---|
| AI API / Minute | 60s | 30 | RATE_LIMIT_AI_PER_MIN |
| AI API / Day | 24h | 1500 | RATE_LIMIT_AI_PER_DAY |
| TTS / Minute | 60s | 60 | RATE_LIMIT_TTS_PER_MIN |
| Public API / Minute | 60s | 60/IP | RATE_LIMIT_PUBLIC_PER_MIN |
All thresholds are controlled by environment variables. So if I suddenly get DDoS’d one day, or my Gemini quota runs tight, I can tighten the limits by changing an environment variable, without touching the code.
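One way to wire the table to the environment is sketched below. The variable names and defaults come from the table; `readLimit` and `loadLimits` are hypothetical helpers, and the rule of falling back to the default on a missing or invalid value is an assumption.

```javascript
// Hypothetical helper: read a positive integer limit, falling back to a default.
function readLimit(env, name, fallback) {
  const parsed = Number.parseInt(env[name] ?? "", 10);
  return Number.isInteger(parsed) && parsed > 0 ? parsed : fallback;
}

function loadLimits(env = process.env) {
  return {
    aiPerMinute: readLimit(env, "RATE_LIMIT_AI_PER_MIN", 30),
    aiPerDay: readLimit(env, "RATE_LIMIT_AI_PER_DAY", 1500),
    ttsPerMinute: readLimit(env, "RATE_LIMIT_TTS_PER_MIN", 60),
    publicPerMinute: readLimit(env, "RATE_LIMIT_PUBLIC_PER_MIN", 60),
  };
}
```

Falling back on bad input means a typo in a dashboard setting leaves the default limit in place instead of silently disabling the limiter.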
The 429 response is also structured:
res.status(429).json({
  error: "rate_limited",
  message:
    "You're sending requests too quickly. Please wait a moment and try again.",
  code: "RATE_LIMITED",
  limiter: name,
  retry_after_seconds: retryAfter,
});
It includes a Retry-After header, so the client can decide whether to automatically retry.
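On the client side, that decision could be sketched roughly as follows. `retryDelayMs` and the 10-second cap are assumptions for illustration; only the `code` and `retry_after_seconds` fields come from the response shown above.

```javascript
// Hypothetical client helper: returns a delay in ms if an automatic retry
// seems reasonable, or null if the user should be asked to wait instead.
function retryDelayMs(status, body, maxAutoRetrySeconds = 10) {
  if (status !== 429 || !body || body.code !== "RATE_LIMITED") return null;
  const seconds = Number(body.retry_after_seconds);
  if (!Number.isFinite(seconds) || seconds < 0) return null;
  return seconds <= maxAutoRetrySeconds ? seconds * 1000 : null;
}
```

A short wait from the per-minute limiter can be retried quietly, while a long wait from the daily limiter is better surfaced to the user.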
Layer 3: CORS + CSRF Remains Unchanged
Many articles suggest, “Since CORS can’t block curl, don’t bother configuring it.” I disagree. CORS protects against a completely different scenario:
A user has my extension and a malicious website open in their browser at the same time. The malicious website’s JS secretly tries to call my API (carrying the user’s session cookie).
In this scenario, the Origin is automatically set by the browser, and the malicious website cannot forge it. The CORS + CSRF origin check can block the attack at this layer.
So the three layers of defense have a division of labor:
- Authentication: Blocks “people without identities” (curl, crawlers, automated scripts).
- Rate Limiting: Blocks “people with identities but abnormal behavior” (compromised accounts, user self-abuse).
- CORS/CSRF: Blocks “malicious websites borrowing legitimate identities” (CSRF attacks).
You can’t afford to lose a single layer.
V. The Most Error-Prone Part: Not Breaking Existing Features
Adding locks isn’t hard; what’s hard is not breaking the experience of existing users. I spent more time on this part than on the locking itself.
Step 1: Confirm the Client is Actually Sending Sessions
If the extension’s fetch doesn’t have credentials: 'include', then even if the user is logged in, the request won’t carry cookies, and after locking, everyone will be blocked by a 401.
I scanned through all the fetch calls in background.js:
const response = await fetch(`${backendUrl}/analyze/stream`, {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({ ... }),
  credentials: 'include' // ← Confirmed all four calls have this
});
The extension’s four backend calls (analyze/stream, furigana, word-definition, tts) already included credentials: 'include'. This means logged-in users will notice no difference after the lock: the cookie is sent automatically, and the session is verified automatically.
Step 2: Audit the Failure Fallback Path for Each Endpoint
The extension already had graceful fallback paths written for many endpoints; I just hadn’t noticed them before.
| Endpoint | Extension’s Reaction on Failure | Impact After Locking |
|---|---|---|
| /api/tts | Automatically falls back to Web Speech API (speakWithWebSpeech) | Zero impact: logged-out users hear the browser’s native TTS |
| /api/word-definition | Silently returns an empty array, skipping the quick definition area | Zero impact: just one less bonus feature |
| /api/furigana | Silently returns null, skipping quick furigana | Zero impact: furigana will still render after AI analysis returns |
| /api/analyze | Displays an error UI | Needs fixing: can’t let the user see “Backend Error: 401” |
In other words, even if TTS, word-definition, and furigana are locked, the overall experience of the extension won’t be interrupted. The Web Speech fallback gave me the peace of mind to lock TTS—if a feature has no fallback path, you must think clearly about the user experience during a 401 before adding auth.
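The TTS fallback pattern can be sketched like this. `fetchTtsAudio` and `playAudio` are hypothetical names; `speakWithWebSpeech` is the fallback named in the table, passed in as a dependency here so the sketch stands alone. The extension's real implementation may be structured differently.

```javascript
// Sketch: try the backend first, fall back to the browser's speech synthesis.
async function speak(text, { fetchTtsAudio, playAudio, speakWithWebSpeech }) {
  try {
    // Assumed to throw on 401, 429, or network errors.
    const audio = await fetchTtsAudio(text);
    await playAudio(audio);
    return "backend";
  } catch (err) {
    // Any backend failure degrades to the browser's built-in voice.
    speakWithWebSpeech(text);
    return "web-speech";
  }
}
```

Because the 401 lands in the same catch branch as any other failure, locking the route needs no extra client code for this feature.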
Step 3: Turn analyze’s 401 into a “Login” Prompt
Only the failure of /api/analyze/stream would be directly seen by the user (as an error popup). This required special handling.
background.js passes the 401’s code through to the content script:
if (!response.ok) {
  const error = await response.json().catch(() => ({}));
  port.postMessage({
    error: true,
    status: response.status,
    code: error.code, // ← New addition
    message:
      error.message || error.error || `Backend Error: ${response.status}`,
  });
}
The stream handler in content.js recognizes code === 'AUTH_REQUIRED' and renders an error interface with a “Sign in” button, rather than the standard “Retry”:
const isAuth = msg.code === "AUTH_REQUIRED" || msg.status === 401;
popup.innerHTML = renderError(msg.message, { authRequired: isAuth });
renderError receives an authRequired option to decide whether to show 🔒 + “Sign in” or ⚠️ + “Try Again”, and the button’s data-action switches accordingly:
const actionAttr = authRequired ? 'data-action="login"' : 'data-action="retry"';
const actionLabel = authRequired ? loginBtnStr : tryAgainStr;
const icon = authRequired ? "🔒" : "⚠️";
And clicking the “Sign in” button reuses the OPEN_LOGIN message flow that has existed in the extension for a long time—no new IPCs were added, popup.html was untouched, and the dashboard was untouched. The entire change was encapsulated within two files and under 20 lines.
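Putting the three fragments together, a renderError along these lines would produce the two variants. The markup, the label strings, and the `escapeHtml` helper are assumptions for illustration; only the three ternaries come from the extension code quoted above.

```javascript
// Hypothetical labels; in the extension these presumably come from i18n strings.
const loginBtnStr = "Sign in";
const tryAgainStr = "Try Again";

// The result is assigned to innerHTML, so the server message is escaped first.
function escapeHtml(text) {
  return String(text).replace(/[&<>"']/g, (ch) => (
    { "&": "&amp;", "<": "&lt;", ">": "&gt;", '"': "&quot;", "'": "&#39;" }[ch]
  ));
}

function renderError(message, { authRequired = false } = {}) {
  const actionAttr = authRequired ? 'data-action="login"' : 'data-action="retry"';
  const actionLabel = authRequired ? loginBtnStr : tryAgainStr;
  const icon = authRequired ? "🔒" : "⚠️";
  return `<div class="error">${icon} ${escapeHtml(message)} <button ${actionAttr}>${actionLabel}</button></div>`;
}
```

A single delegated click handler can then branch on the button's data-action value instead of inspecting the error text.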
Trade-offs: Which Routes to Lock and Which Not To
Not every route should be locked. I made independent judgments for each one:
| Route | Lock? | Reason |
|---|---|---|
| /api/analyze* | Yes | Calls Gemini/DeepSeek, every call costs real money |
| /api/word-definition | Yes | Same as above; although max tokens is 100, high volume still burns money |
| /api/tts | Yes | Edge TTS consumes bandwidth, and the client already has a Web Speech fallback |
| /api/furigana | No, rate-limit only | Local kuromoji, zero external cost; triggered even when the user is logged out; IP rate-limiting is enough |
| /api/dictionary | No, rate-limit only | Calls free jisho/dictionaryapi; same as above |
The cost structure of the feature itself + whether the client has a fallback path are the two dimensions for deciding whether to lock. Mindlessly locking everything is safe but hurts the experience; mindlessly leaving everything open is running naked.
VI. Verification: Tests + End-to-End Smoke Testing
Changing the code isn’t the end; it must be verified:
Unit Tests
The old authMiddleware.test.js hardcoded the exact shape of the 401 body:
expect(res.json).toHaveBeenCalledWith({ error: "Unauthorized. Please login." });
I changed it to a structured match, turning the fact that “the extension can recognize authentication errors and switch the UI” into an expressible contract in the tests:
expect(res.json).toHaveBeenCalledWith(
  expect.objectContaining({
    error: "Unauthorized. Please login.",
    code: "AUTH_REQUIRED",
    message: expect.stringMatching(/sign in/i),
  })
);
I added 4 new cases in index.test.js, asserting that anonymous access to each money-burning route would return a 401 + code: 'AUTH_REQUIRED'.
An unexpected discovery: analyzeRoute.test.js and analyzeStreamRoute.test.js had ensureAuthenticated in their mock middlewares from the very beginning, which means the “me” who wrote the tests already assumed these routes were protected. The production code was actually the anomaly. Revisiting the tests exposed the inconsistency between implementation and intent.
In the end: all 210/210 server tests + 13/13 root tests passed.
End-to-End Smoke Testing
I wrote a minimal Express harness, mounted the exact same middlewares as production, simulated a logged-out state with a mocked passport, and then ran curl:
=== Cost-bearing routes (expect 401)
{"error":"Unauthorized. Please login.","message":"Please sign in via the LingoContext popup to use this feature.","code":"AUTH_REQUIRED"} ← HTTP 401
{"error":"Unauthorized. Please login.","message":"...","code":"AUTH_REQUIRED"} ← HTTP 401
{"error":"Unauthorized. Please login.","message":"...","code":"AUTH_REQUIRED"} ← HTTP 401
{"error":"Unauthorized. Please login.","message":"...","code":"AUTH_REQUIRED"} ← HTTP 401
=== Open public routes (expect 200)
HTTP 200 POST /api/furigana
HTTP 200 GET /api/dictionary?word=x
Matched expectations perfectly.
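The same check can be scripted so it is repeatable. The sketch below is not the harness described above: `probeAnonymous` is a hypothetical helper, the POST method for /api/analyze, /api/word-definition and /api/tts is an assumption, and the request body is a placeholder. `fetchImpl` defaults to the global fetch available in Node 18+.

```javascript
// Sketch: call each route with no cookies and compare the status to expectations.
async function probeAnonymous(baseUrl, checks, fetchImpl = fetch) {
  const results = [];
  for (const { method, path, expect } of checks) {
    const res = await fetchImpl(`${baseUrl}${path}`, {
      method,
      headers: { "Content-Type": "application/json" },
      body: method === "GET" ? undefined : JSON.stringify({ text: "hello" }),
    });
    results.push({ method, path, status: res.status, ok: res.status === expect });
  }
  return results;
}

const checks = [
  { method: "POST", path: "/api/analyze", expect: 401 },
  { method: "POST", path: "/api/analyze/stream", expect: 401 },
  { method: "POST", path: "/api/word-definition", expect: 401 },
  { method: "POST", path: "/api/tts", expect: 401 },
  { method: "POST", path: "/api/furigana", expect: 200 },
  { method: "GET", path: "/api/dictionary?word=x", expect: 200 },
];
```

Pointing it at a local or staging base URL and printing the rows where ok is false gives a quick regression check after every change to the route table.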

VII. Several Principles Left Behind by This Refactoring
On the surface, this issue was a misunderstanding of CORS, but at its core, it was a remedial lesson in security awareness. Abstracting from it, I think there are a few principles I can keep applying in the future.
1. Never Trust Any Client
As long as the code runs on the user’s device, it cannot be treated as a secret. Browser extensions, mobile apps, and desktop clients are essentially just API entry points, not trusted identities themselves.
What the server can truly trust is not “this request looks like it came from my extension,” but:
Whether this request comes from an authenticated user.
So if you find yourself designing a system to “only allow a specific client to call the API,” it’s best to stop and rethink: can this be changed to “only allow a specific type of user identity to call the API”?
2. VibeCoding Indeed Liberates Productivity, But Also Amplifies Security Blind Spots
Using AI to write code, add features, and generate endpoints is incredibly fast now, and productivity has indeed been massively unleashed. But the problem also lies here: the faster the code is written, the easier it is to assume it’s safe.
Especially with security issues, many times it’s not “the feature doesn’t work,” but “the feature is too easy to abuse.” It’s very hard for non-professionals to spot these issues immediately because they don’t throw errors right away like bugs do.
So I now feel that AI shouldn’t just be used for writing code, but also for conducting security reviews. For example, before deploying, you could ask several different AIs:
Could this endpoint be bypassed? If you were an attacker, how would you abuse it? Are there any issues related to authentication, rate-limiting, CORS, or CSRF here? Which endpoints could generate real costs?
Having multiple AIs review it from different angles might not guarantee absolute security, but at least it can help expose many blind spots I wasn’t aware of.
3. Error Responses Are Also UX
{ error: 'Unauthorized' } and { error, message, code: 'AUTH_REQUIRED' } are completely different things for the frontend.
The former only tells the user “an error occurred,” while the latter explicitly lets the frontend know: this is an authentication issue, and a login button should be displayed instead of a retry button.
So an error response doesn’t end with just returning a random string. It’s actually a part of the product experience.
4. Defense Must Be Layered, and Each Layer’s Responsibility Must Be Clear
CORS protects against malicious websites, authentication protects against anonymous calls, and rate limiting protects against identity abuse.
These three layers do not solve the same problem. The fact that CORS can’t block curl doesn’t mean CORS is useless; the fact that authentication blocks anonymous requests doesn’t mean you can skip rate limiting.
Security isn’t solved by a single “silver bullet,” but by ensuring each layer knows exactly what it’s defending against.
VIII. Final Thoughts
This incident left me a bit ashamed—a project I maintain every day had such an obvious vulnerability that I turned a blind eye to for months. But what makes me reflect more is: my belief that “my API can only be called by my extension” had never been verified. I never ran curl, never played the attacker, and never truly asked myself, “If I wanted to abuse this API myself, how would I do it?”
Writing code and protecting code are two different mindsets. The former asks “Does it work?”, the latter asks “Can it be abused?”. I used to be much more familiar with the former.
If you are working on a browser extension, mobile app, or desktop client with a backend—please take ten minutes to run curl against your own API. See which endpoints should require a login but can actually be called naked. You might be very surprised.