09

String Formatting

⏱️ 25 min

Advanced Strings: Turning Raw Text into Usable Data

What You Might Be Wondering

"I already know print and replace. What else is there?"

Real-world text is messy: random whitespace, inconsistent delimiters, mixed casing. The advanced stuff is about standardizing input so you can actually work with it.

One-Line Definition

Advanced string processing uses methods such as split, join, find, count, replace, and strip to transform messy text into structured information.

Real-Life Analogy

Think of it like sorting packages: unbox (split), sort (clean), repackage (join).

Minimal Working Example

line = "python,sql,git"
skills = line.split(",")
print(skills)  # ['python', 'sql', 'git']
print("|".join(skills))  # python|sql|git
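
The example above uses tidy input. As a small sketch of how split behaves when the input is less tidy (the sample strings below are invented for illustration):

```python
# Sample strings invented for illustration.
gappy = "python,,git"
spaced = "  python   sql\tgit "

# With an explicit delimiter, empty fields are kept as empty strings.
with_delimiter = gappy.split(",")
print(with_delimiter)  # ['python', '', 'git']

# With no argument, split() cuts on any run of whitespace and drops empties.
on_whitespace = spaced.split()
print(on_whitespace)  # ['python', 'sql', 'git']

# The second argument (maxsplit) limits the number of cuts;
# everything after the last cut stays in the final piece.
key, value = "name=Ada=Lovelace".split("=", 1)
print(key, value)  # name Ada=Lovelace
```

Which form to use depends on the data: an explicit delimiter preserves empty fields, while the no-argument form is usually the more forgiving choice for whitespace-separated text.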

Common Methods

text = "python python ai"
print(text.find("ai"))      # 14
print(text.count("python")) # 2
print(text.replace("ai", "AI"))  # python python AI
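
One detail worth seeing in action: find and count are case-sensitive, and count only counts non-overlapping matches. A minimal sketch, with a sample string chosen for illustration:

```python
text = "Python python AI"

# The capitalized "Python" does not match the lowercase keyword.
raw_count = text.count("python")
print(raw_count)    # 1

# Lowercasing first makes the count case-insensitive.
lower_count = text.lower().count("python")
print(lower_count)  # 2

# Same story for find: "AI" is uppercase in the original text.
raw_pos = text.find("ai")
print(raw_pos)      # -1
lower_pos = text.lower().find("ai")
print(lower_pos)    # 14

# count() does not count overlapping matches.
overlap = "aaaa".count("aa")
print(overlap)      # 2
```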

Cleaning Pipeline Example

raw = "  USER@EXAMPLE.COM  "
clean = raw.strip().lower()
print(clean)  # user@example.com
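
strip only removes whitespace at the two ends. If the messy part is in the middle, one common idiom is to split on whitespace and join with single spaces. A sketch, using an invented sample value:

```python
# Invented sample: extra spaces inside the text and a trailing newline.
raw = "  Ada    Lovelace \n"

# strip() alone leaves the inner run of spaces untouched.
stripped_only = raw.strip()
print(stripped_only)  # Ada    Lovelace

# split() with no argument discards every whitespace run;
# " ".join(...) puts the pieces back with exactly one space between them.
clean = " ".join(raw.split()).lower()
print(clean)  # ada lovelace
```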

Quick Quiz (5 min)

  1. Turn "a,b,c" into "a|b|c".
  2. Count how many times a keyword appears in a sentence.
  3. Clean 3 emails and output them as a list.

Quiz Rubric & Grading Criteria

  • Direction: write runnable code that covers the core requirements and edge cases from the prompt.
  • Criterion 1 (Correctness): main flow produces correct results, key branches execute.
  • Criterion 2 (Readability): clear variable names, no excessive nesting.
  • Criterion 3 (Robustness): basic protection against empty values, type errors, or unexpected input.

Take-Home Task

Implement normalize_tags(text):

  • Input: " Python, AI ,data "
  • Output: ["python", "ai", "data"]

Acceptance Criteria

You can independently:

  • Process text using split/join/find/count/replace
  • Build a reliable string cleaning pipeline
  • Convert raw text into a structured list

Common Errors & Debugging Steps (Beginner Edition)

  • Error message looks like gibberish: read the last line for the error type (TypeError, NameError, etc.), then trace back to the offending line.
  • Not sure what a variable holds: drop a temporary print(variable, type(variable)) to check.
  • Changed code but nothing happened: make sure you saved the file, you're running the right file, and your terminal environment (venv) is correct.
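
For strings in particular, the temporary print is more revealing with repr, because stray whitespace is otherwise invisible. A short sketch, where the value is a made-up stand-in for whatever your variable holds:

```python
# Made-up stand-in for a value read from a file or from user input.
value = "yes \n"

# repr() shows the quotes and escape sequences, so hidden characters stand out.
shown = repr(value)
print(shown, type(value))  # 'yes \n' <class 'str'>

# This explains why a comparison that "looks right" fails.
matches_raw = value == "yes"
print(matches_raw)         # False
matches_clean = value.strip() == "yes"
print(matches_clean)       # True
```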

Common Misconceptions

  • Misconception: find() throws an error when it doesn't find anything.

  • Reality: it returns -1. No exception.

  • Misconception: you can compare user input directly without cleaning it.

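  • Reality: strip whitespace and normalize case first (strip, then lower), then compare. The exception is input where case is meaningful, such as passwords.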
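
Both points can be checked in a few lines. This is only a sketch: the strings are sample values, and answer stands in for real user input.

```python
text = "python sql git"

# find() signals "not found" with -1 instead of raising.
position = text.find("rust")
print(position)  # -1

# -1 is truthy, so compare against -1 explicitly (or use the `in` operator).
if position == -1:
    print("not found")
found = "rust" in text
print(found)  # False

# index() is the related method that does raise when nothing matches.
try:
    text.index("rust")
    raised = False
except ValueError:
    raised = True
print(raised)  # True

# Comparing raw input fails; comparing cleaned input succeeds.
answer = "  Yes \n"  # stand-in for user input
raw_match = answer == "yes"
clean_match = answer.strip().lower() == "yes"
print(raw_match, clean_match)  # False True
```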