LLM Evaluation
Evaluation prompts (overview)
The core of evaluation is writing judging criteria clearly enough that an LLM acting as judge can produce explainable comparisons or scores. You're not looking for "the perfect answer" -- you're building a stable, reusable, auditable evaluation workflow.
Learning Path (suggested order)
- Beginner: Fix scoring dimensions and output format
- Intermediate: Introduce rubrics and weights
- Advanced: Use evaluation results to drive iteration
What Is an Evaluation Prompt?
An Evaluation Prompt has the model play judge/reviewer -- comparing output quality, scoring, and explaining its reasoning.
┌────────────────────────────────────────────────────────┐
│                 Evaluation Prompt Flow                 │
├────────────────────────────────────────────────────────┤
│                                                        │
│  Candidate outputs  →  Evaluation criteria             │
│  (A/B/multiple)        (rubric)                        │
│          ↓                                             │
│  Score / rank       →  Explanation & suggestions       │
│  (scores/ranking)      (improvement direction)         │
│                                                        │
└────────────────────────────────────────────────────────┘
Why Evaluation Matters
| Use Case | Specific Application | Business Value |
|---|---|---|
| Prompt iteration | Pick the better version | Lower trial-and-error cost |
| Content production | Copy/summary quality review | Better consistency |
| Model comparison | Compare outputs across models | Inform model selection |
| Standardized output | Auto-scoring and filtering | Better efficiency |
Business Output (PM Perspective)
With Evaluation Prompts you can deliver:
- Quantifiable comparison results (A/B output rankings)
- Evaluation templates (reusable rubrics)
- Improvement suggestions (for prompt iteration)
Completion criteria (suggested):
- Read this page + complete 1 exercise + self-check once
Core Prompt Structure
Goal: Evaluate candidate outputs
Criteria: Scoring dimensions and weights
Format: Output structure (scores/rationale/conclusion)
Input: Candidate answers
General Template
You are a strict evaluator. Compare the outputs using the criteria below.
Scoring criteria (1-5 per dimension):
1) Accuracy
2) Clarity
3) Completeness
Candidate outputs:
A: {output_a}
B: {output_b}
Output format:
- Scores: A=?, B=?
- Winner:
- Rationale (1-3 points):
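In code, the general template reduces to string formatting: only the candidate slots change, while the criteria and output format stay fixed. A minimal sketch (sending the prompt to a model is left as a placeholder, since the client API depends on your provider):

```python
# Build an evaluation prompt from the general template above.
# Only the candidate outputs vary; criteria and format stay fixed.

EVAL_TEMPLATE = """You are a strict evaluator. Compare the outputs using the criteria below.

Scoring criteria (1-5 per dimension):
1) Accuracy
2) Clarity
3) Completeness

Candidate outputs:
A: {output_a}
B: {output_b}

Output format:
- Scores: A=?, B=?
- Winner:
- Rationale (1-3 points):"""


def build_eval_prompt(output_a: str, output_b: str) -> str:
    """Fill the candidate slots; everything else is pinned down."""
    return EVAL_TEMPLATE.format(output_a=output_a, output_b=output_b)


prompt = build_eval_prompt("Answer 1", "Answer 2")
# Ready to send to whichever model acts as judge, e.g.:
# verdict = call_llm(prompt)   # call_llm is a placeholder for your client
```

Keeping the template as one constant (rather than assembling it ad hoc per call) is what makes the rubric reusable and the runs comparable.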
Quick Start: A/B Comparison
Compare the two answers below. Score on "accuracy, clarity, completeness" (1-5 each).
A: Answer 1
B: Answer 2
Example 1: Writing Quality Evaluation
Evaluate these two product descriptions. Criteria: conciseness, persuasiveness, information completeness.
A: Lightweight and durable, great for travel.
B: Ultra-light design, 30L capacity, works for both urban and travel use.
Example 2: Summary Quality Evaluation
Evaluate the two summaries. Criteria: covers key points, clearly expressed, doesn't introduce new information.
Example 3: Structured Scoring (Rubric)
Scoring dimensions:
1) Accuracy (40%)
2) Readability (30%)
3) Structure (30%)
Output:
- Total score (0-100)
- Per-dimension scores
- Winner
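The weighted total in Example 3 is plain arithmetic, so it can (and arguably should) be computed in code rather than trusted to the judge. A sketch, assuming the 1-5 per-dimension scale from the earlier examples mapped onto the 0-100 total:

```python
# Combine per-dimension scores (1-5) into a weighted 0-100 total,
# using the weights from the rubric above.
WEIGHTS = {"accuracy": 0.40, "readability": 0.30, "structure": 0.30}


def weighted_total(scores: dict) -> float:
    """Map each 1-5 score onto 0-100, then apply the rubric weights."""
    assert abs(sum(WEIGHTS.values()) - 1.0) < 1e-9, "weights must sum to 1"
    return sum(WEIGHTS[dim] * (score / 5) * 100 for dim, score in scores.items())


print(weighted_total({"accuracy": 4, "readability": 3, "structure": 5}))  # 80.0
```

Letting the judge emit only per-dimension scores and doing the aggregation yourself removes one source of arithmetic errors from the model's output.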
Migration Template (swap variables to reuse)
Criteria: {criteria}
Candidates: {outputs}
Output: Scores + Winner + Rationale
Self-check Checklist (review before submitting)
- Are scoring dimensions clear and actionable?
- Does it prevent the model from introducing new info?
- Is the output structure fixed?
- Can the output be parsed programmatically?
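The last checklist item is easiest to satisfy when the judge's reply follows the fixed fields of the general template. A regex sketch for that format (the field labels "Scores", "Winner" are taken from the template above; adjust the patterns if your format differs):

```python
import re

# Parse a judge reply that follows the fixed output format:
#   - Scores: A=?, B=?
#   - Winner:
#   - Rationale: ...

def parse_verdict(reply: str) -> dict:
    scores = re.search(r"Scores:\s*A\s*=\s*(\d+)\s*,\s*B\s*=\s*(\d+)", reply)
    winner = re.search(r"Winner:\s*([AB])", reply)
    if not scores or not winner:
        raise ValueError("judge output did not match the fixed format")
    return {
        "a": int(scores.group(1)),
        "b": int(scores.group(2)),
        "winner": winner.group(1),
    }


reply = "- Scores: A=3, B=5\n- Winner: B\n- Rationale: B covers capacity and use cases."
print(parse_verdict(reply))  # {'a': 3, 'b': 5, 'winner': 'B'}
```

Raising on a format mismatch (rather than guessing) surfaces drift in the judge's output early, which is exactly what "parseable" buys you.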
Advanced Tips
- Weighted scoring: Assign different weights to different metrics.
- Score first, explain after: Prevents rationale from retroactively influencing scores.
- Three-round evaluation: Run multiple times and average to reduce bias.
- Align with goals: Scoring criteria should match business objectives.
- Output improvement suggestions: Makes it easy to iterate directly.
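The multi-round tip is a small loop around the judge call. A sketch, where `judge` stands in for your actual LLM call plus score parsing (the stub below only exists to demonstrate the averaging):

```python
from statistics import mean

# Run the judge several times and average the scores,
# smoothing out run-to-run variation in the model's output.

def evaluate_n_rounds(judge, prompt: str, rounds: int = 3) -> float:
    return mean(judge(prompt) for _ in range(rounds))


# Stub judge returning a different score each call, standing in for
# a real call_llm + parse step:
_scores = iter([4, 5, 3])

def stub_judge(prompt: str) -> int:
    return next(_scores)


avg = evaluate_n_rounds(stub_judge, "Evaluate A vs B")
print(avg)  # averages [4, 5, 3] to 4
```

With a real model you would also want a nonzero temperature across rounds; averaging three identical greedy runs reduces nothing.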
Common Problems & Solutions
| Problem | Cause | Solution |
|---|---|---|
| Inconsistent scores | Vague criteria | Clarify dimension descriptions |
| Verbose output | No format limits | Fix output fields |
| Introduces new info | Not restricted | Add "based on input only" |
| Too subjective | No rubric | Design a scoring table |
Hands-on Exercises
Exercise 1: A/B Evaluation
Evaluate two course descriptions. Criteria: clarity, appeal, information completeness.
Exercise 2: Multi-candidate Ranking
Rank 3 answers and provide rationale.
Exercise Scoring Rubric (self-assessment)
| Dimension | Passing Criteria |
|---|---|
| Clear criteria | Scoring dimensions are actionable |
| Stable output | Scores and rationale are structurally consistent |
| Reusable | Rubric is swappable |
| Parseable | Output can be processed programmatically |
Takeaways
- The key to Evaluation Prompts is actionable scoring criteria.
- Fixed output structure makes comparison and automation easier.
- Rubrics significantly reduce subjective bias.
- Output suggestions can feed directly into prompt iteration.
- Templates improve reuse efficiency.