From f-string SQL to Parameterized Queries: Preventing Injection in Database Scripts

The Scenario

You ask your AI to write a Python script that manages a user directory — searching for users by name, inserting new records, and deleting old ones. The model produces clean, readable code that uses f-strings to build SQL queries from user input. It passes every test you throw at it. It also lets any malicious user delete your entire database with a single crafted input.

The Raw AI Draft

Here is what a model like GPT-4 or Claude typically generates on the first attempt. It works perfectly with honest inputs — and is a textbook SQL injection vulnerability.

Before — The Naive AI Draft
import sqlite3

def create_database():
  conn = sqlite3.connect("users.db")
  cursor = conn.cursor()
  cursor.execute("""
      CREATE TABLE IF NOT EXISTS users (
          id INTEGER PRIMARY KEY AUTOINCREMENT,
          name TEXT,
          email TEXT,
          role TEXT
      )
  """)
  conn.commit()
  conn.close()

def add_user(name, email, role):
  conn = sqlite3.connect("users.db")
  cursor = conn.cursor()
  cursor.execute(f"INSERT INTO users (name, email, role) VALUES ('{name}', '{email}', '{role}')")
  conn.commit()
  conn.close()
  print(f"Added user {name}")

def search_users(name):
  conn = sqlite3.connect("users.db")
  cursor = conn.cursor()
  cursor.execute(f"SELECT * FROM users WHERE name LIKE '%{name}%'")
  results = cursor.fetchall()
  conn.close()
  return results

def delete_user(user_id):
  conn = sqlite3.connect("users.db")
  cursor = conn.cursor()
  cursor.execute(f"DELETE FROM users WHERE id = {user_id}")
  conn.commit()
  conn.close()
  print(f"Deleted user {user_id}")

create_database()
add_user("Alice", "alice@example.com", "admin")
print(search_users("Alice"))

The Code Smells

⚠️Code Smells — What's Wrong Here?
  • SQL injection via f-string interpolationf"INSERT INTO users ... VALUES ('{name}', '{email}', '{role}')" pastes user input directly into the SQL string. An attacker can pass name = "'; DROP TABLE users; --" and execute arbitrary SQL. This is the number one vulnerability in web application security and has been documented since 1998.
  • No input validation on any field — The role parameter accepts any string. There is no check that it is a valid role. An attacker can insert arbitrary values like "superadmin" or use the field as an injection vector.
  • Database connection leaks on exceptions — Every function opens a connection with sqlite3.connect() and manually calls conn.close(). If the SQL execution throws an error, close() is never called. Repeated failures exhaust the process's file descriptor limit.
  • No transaction managementconn.commit() is called after each operation, but there is no rollback() on failure. A partially completed multi-step operation leaves the database in an inconsistent state.
  • No column constraints in the schema — The users table allows NULL values in every column. A user with no name, no email, and no role can be inserted without error. There is no UNIQUE constraint on email, so duplicate registrations go undetected.
  • Raw tuples returned to the callercursor.fetchall() returns a list of tuples with no column names. The caller has to know that index 0 is the ID, index 1 is the name, and so on. This is fragile and unreadable.
  • Hardcoded database path"users.db" is hardcoded in every function. Testing against a scratch database requires editing the source code.
  • No loggingprint() with no structure. In a production system you cannot distinguish informational messages from errors, and you have no timestamps for diagnosis.

The Best Practices

Parameterized Queries. The fundamental defense against SQL injection is to never build SQL strings with user input. Instead, use parameterized queries (also called prepared statements) where the SQL structure and data values are sent separately to the database engine. In Python's sqlite3 module, this means using ? placeholders: cursor.execute("SELECT * FROM users WHERE id = ?", (user_id,)). The database driver handles escaping and quoting. SQL injection becomes structurally impossible because user input can never be interpreted as SQL syntax.

Input Validation at the Application Layer. Parameterized queries prevent injection, but they do not prevent invalid data. If only three roles are valid in your system (admin, editor, viewer), enforce that in code before the data reaches the database. Use a CHECK constraint in the schema as a safety net, and validate in the application layer as the primary gate. This "defense in depth" approach means a bug in one layer does not compromise the other.

The Principle of Least Privilege. Database columns should constrain their values as tightly as possible. Use NOT NULL to prevent empty fields. Use UNIQUE to prevent duplicates. Use CHECK to restrict values to known-good sets. These constraints are not optional hardening — they are the schema communicating its own invariants.

Context Managers for Connection Lifecycle. Wrap database connections in a context manager (with block) that commits on success and rolls back on exception. This guarantees that connections are always closed, transactions are always finalized, and a crash mid-operation never leaves the database in a half-written state.

Structured Results. Return dictionaries (or dataclasses, or Pydantic models) instead of raw tuples. Named fields are self-documenting, less error-prone to consume, and trivially serializable to JSON for APIs. The small overhead of dict(zip(columns, row)) pays for itself immediately in readability and maintainability.

Type Coercion as a Defense Layer. For parameters that must be integers (like user_id), explicitly coerce the input with int(user_id) before using it in a query. This is not a substitute for parameterized queries — it is an additional barrier that catches malformed input before it reaches the database layer at all.

The Refactored Code

After — Production-Ready
import sqlite3
import os
import logging
from contextlib import contextmanager

logging.basicConfig(level=logging.INFO, format="%(asctime)s [%(levelname)s] %(message)s")
logger = logging.getLogger(__name__)

# Database path from environment — avoids hardcoding filesystem assumptions
DB_PATH = os.getenv("USER_DB_PATH", "users.db")


@contextmanager
def get_connection(db_path: str = DB_PATH):
  """Context manager for database connections — guarantees cleanup on error."""
  conn = sqlite3.connect(db_path)
  conn.execute("PRAGMA foreign_keys = ON")   # Enforce referential integrity
  conn.execute("PRAGMA journal_mode = WAL")  # Better concurrent read performance
  try:
      yield conn
      conn.commit()
  except Exception:
      conn.rollback()
      raise
  finally:
      conn.close()


def create_database(db_path: str = DB_PATH) -> None:
  """Initialize the database schema with constrained column definitions."""
  with get_connection(db_path) as conn:
      conn.execute("""
          CREATE TABLE IF NOT EXISTS users (
              id INTEGER PRIMARY KEY AUTOINCREMENT,
              name TEXT NOT NULL,
              email TEXT NOT NULL UNIQUE,
              role TEXT NOT NULL DEFAULT 'viewer'
                  CHECK(role IN ('admin', 'editor', 'viewer'))
          )
      """)
  logger.info("Database initialized")


def add_user(name: str, email: str, role: str = "viewer",
           db_path: str = DB_PATH) -> int:
  """Insert a new user using parameterized queries. Returns the new user ID."""
  # Validate role before touching the database — fail fast
  allowed_roles = {"admin", "editor", "viewer"}
  if role not in allowed_roles:
      raise ValueError(f"Invalid role '{role}'. Must be one of: {allowed_roles}")

  with get_connection(db_path) as conn:
      # Parameterized query: the ? placeholders are filled by the database driver,
      # which escapes all special characters. SQL injection is structurally impossible.
      cursor = conn.execute(
          "INSERT INTO users (name, email, role) VALUES (?, ?, ?)",
          (name, email, role),
      )
      user_id = cursor.lastrowid
      logger.info(f"Added user '{name}' (id={user_id}, role={role})")
      return user_id


def search_users(name: str, db_path: str = DB_PATH) -> list[dict]:
  """Search for users by name using a parameterized LIKE query."""
  with get_connection(db_path) as conn:
      # The % wildcards are part of the parameter value, not the SQL string.
      # This prevents injection through the search term.
      cursor = conn.execute(
          "SELECT id, name, email, role FROM users WHERE name LIKE ?",
          (f"%{name}%",),
      )
      columns = [desc[0] for desc in cursor.description]
      results = [dict(zip(columns, row)) for row in cursor.fetchall()]
      logger.info(f"Search for '{name}' returned {len(results)} result(s)")
      return results


def delete_user(user_id: int, db_path: str = DB_PATH) -> bool:
  """Delete a user by ID. Returns True if a row was actually deleted."""
  # Coerce to int — if someone passes "1; DROP TABLE users", this raises ValueError
  user_id = int(user_id)
  with get_connection(db_path) as conn:
      cursor = conn.execute(
          "DELETE FROM users WHERE id = ?",
          (user_id,),
      )
      deleted = cursor.rowcount > 0
      if deleted:
          logger.info(f"Deleted user id={user_id}")
      else:
          logger.warning(f"No user found with id={user_id}")
      return deleted


if __name__ == "__main__":
  create_database()
  uid = add_user("Alice", "alice@example.com", "admin")
  results = search_users("Alice")
  for user in results:
      print(user)

The Benchmarks

📊Security and Reliability Metrics
MetricBeforeAfterImprovement
SQL injection vulnerabilityWide openStructurally impossibleEliminated
Survives malicious inputNo — data destroyedYes — input safely escaped100% protection
Connection leak on errorLeaks every timeGuaranteed cleanupEliminated
Duplicate email insertionSilently succeedsRaises unique constraint errorData integrity enforced
Invalid role insertionAccepts anythingValidates against allowed setDefense in depth

The Prompt Tip

💡Prompt Tip — Feed This to Your AI

Write a Python script that manages a user database with SQLite. Requirements: use parameterized queries with ? placeholders for ALL SQL operations — never use f-strings or string concatenation to build SQL. Add a context manager for database connections that commits on success, rolls back on exception, and always closes the connection. Define the schema with NOT NULL constraints on all columns, a UNIQUE constraint on email, and a CHECK constraint that restricts the role column to a fixed set of values (admin, editor, viewer). Validate the role in the application layer before executing the INSERT. Return query results as a list of dictionaries with named columns instead of raw tuples. Coerce integer parameters with int() before using them in queries. Read the database file path from an environment variable with a sensible default. Use Python's logging module instead of print.