prompt-engineering, llm, ai, chain-of-thought, few-shot, system-prompt

Prompt Engineering Practical Guide - Techniques, Patterns, and Optimization Strategies

A comprehensive guide covering prompt engineering core techniques (Zero-shot, Few-shot, CoT, Self-Consistency, ToT), practical patterns, system prompt design, structured output, and evaluation/optimization strategies.

Data Dynamics · April 16, 2026 · 10 min read

Prompt engineering is the practice of optimizing the inputs given to LLMs so that they produce the desired results. This post covers fundamental techniques, advanced patterns, practical templates, and evaluation methods.

1. Prompt Engineering Fundamentals

Components of a Prompt

An effective prompt is composed of the following elements:

┌────────────────────────────────────────────┐
│              Prompt Structure               │
│                                             │
│  1. Role         — Model persona            │
│  2. Context      — Background, constraints  │
│  3. Instruction  — Task to perform          │
│  4. Input        — Data to process          │
│  5. Examples     — Desired I/O pairs        │
│  6. Format       — Response structure       │
│  7. Constraints  — What NOT to do           │
└────────────────────────────────────────────┘
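One way to keep these seven elements in a fixed order is to assemble the prompt from named parts. The sketch below is illustrative only: `build_prompt`, its parameter names, and the section labels are assumptions made for this example and do not come from any library.

```python
from typing import Optional, Sequence, Tuple


def build_prompt(
    instruction: str,
    input_text: str,
    role: Optional[str] = None,
    context: Optional[str] = None,
    examples: Sequence[Tuple[str, str]] = (),
    output_format: Optional[str] = None,
    constraints: Sequence[str] = (),
) -> str:
    """Assemble a prompt in the order Role, Context, Instruction, Input,
    Examples, Format, Constraints. Empty components are skipped."""
    sections = []
    if role:
        sections.append(f"Role: {role}")
    if context:
        sections.append(f"Context: {context}")
    sections.append(f"Instruction: {instruction}")
    sections.append(f"Input:\n{input_text}")
    if examples:
        pairs = "\n\n".join(f"Input: {i}\nOutput: {o}" for i, o in examples)
        sections.append(f"Examples:\n{pairs}")
    if output_format:
        sections.append(f"Format: {output_format}")
    if constraints:
        rules = "\n".join(f"- {c}" for c in constraints)
        sections.append(f"Constraints:\n{rules}")
    return "\n\n".join(sections)


prompt = build_prompt(
    role="You are a DBA.",
    instruction="Write optimized SQL considering index utilization.",
    input_text="Table orders(id, user_id, created_at)",
    constraints=["Do not use SELECT *"],
)
print(prompt)
```

Only the instruction and input are mandatory here; which of the other components are worth including depends on the task.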

Prompt Writing Principles

Principle	Bad Example	Good Example
Be specific	"Optimize the code"	"Improve this Python function's time complexity from O(n²) to O(n log n)"
Assign role	"Write SQL"	"You are a DBA. Write optimized SQL considering index utilization"
Specify format	"Tell me pros and cons"	"Present pros/cons in table format (item/pros/cons/notes)"
State constraints	"Summarize this"	"Summarize in 3 sentences max, translate technical terms to Korean"
Separate steps	"Analyze and write report"	"Step 1: Data analysis, Step 2: Derive insights, Step 3: Write report"

2. Core Prompting Techniques

Zero-shot Prompting

Zero-shot prompting asks the model to perform a task from instructions alone, without any examples.

Classify the sentiment of the following customer review as "positive", "negative", or "neutral".

Review: "Product quality is fine but shipping was too slow. Please send faster next time."

Sentiment:

Few-shot Prompting

Few-shot prompting provides a handful of examples so the model can pick up the pattern.

Classify the following SQL error messages using these examples:

Error: "ORA-00942: table or view does not exist"
Category: Object access error
Action: Check table existence, verify permissions

Error: "ORA-01400: cannot insert NULL into"
Category: Data integrity error
Action: Check NOT NULL columns, set defaults

Error: "ORA-04031: unable to allocate shared memory"
Category:
Action:

Few-shot design tips:

Tip	Description
Diverse examples	Include all categories evenly
Edge cases	Include boundary cases for accuracy
Consistent format	Write all examples in identical structure
Right number	3-5 is usually enough (more examples increase token cost)
Order	Place most relevant example last
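The "consistent format" and "order" tips can be applied mechanically. The following is a minimal sketch, assuming examples are stored as dictionaries with `error`, `category`, and `action` keys; `word_overlap` is a deliberately crude, hypothetical relevance measure (shared words), and a real system might use embeddings instead.

```python
from typing import Dict, List


def word_overlap(a: str, b: str) -> int:
    """Crude relevance proxy: number of shared lowercase words."""
    return len(set(a.lower().split()) & set(b.lower().split()))


def build_few_shot_prompt(task: str, examples: List[Dict[str, str]], query: str) -> str:
    """Format every example identically and put the most relevant one last."""
    # Ascending sort, so the example most similar to the query ends up last.
    ordered = sorted(examples, key=lambda ex: word_overlap(ex["error"], query))
    blocks = [
        f'Error: "{ex["error"]}"\nCategory: {ex["category"]}\nAction: {ex["action"]}'
        for ex in ordered
    ]
    # The query reuses the same structure, with the fields left empty.
    blocks.append(f'Error: "{query}"\nCategory:\nAction:')
    return task + "\n\n" + "\n\n".join(blocks)


examples = [
    {
        "error": "ORA-01400: cannot insert NULL into",
        "category": "Data integrity error",
        "action": "Check NOT NULL columns, set defaults",
    },
    {
        "error": "ORA-00942: table or view does not exist",
        "category": "Object access error",
        "action": "Check table existence, verify permissions",
    },
]
print(
    build_few_shot_prompt(
        "Classify the following SQL error messages using these examples:",
        examples,
        "ORA-01400: cannot insert NULL into column ID",
    )
)
```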

Chain-of-Thought (CoT) Prompting

Chain-of-Thought prompting explicitly asks the model to reason step by step before giving its answer.

Q: 3 servers each handle 150 requests per second. If traffic increases 
   2.5x during peak hours, how many servers are needed with no request loss?

A: Let's solve step by step.

1. Current total capacity: 3 × 150 = 450 req/s
2. Peak traffic: 450 × 2.5 = 1,125 req/s
3. Required servers: 1,125 ÷ 150 = 7.5
4. Round up (servers are integers): 8

Therefore, at least 8 servers are needed during peak hours.

Note: Simply adding "Let's think step by step" often improves accuracy on math, logic, and code analysis tasks, although the size of the gain depends on the model and the task.
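The arithmetic in the worked answer is easy to verify in code, which is also a convenient way to check a model's chain of thought on numeric problems. The function name `servers_needed` and its parameters are made up for this sketch.

```python
import math


def servers_needed(current_servers: int, per_server_rps: int, traffic_multiplier: float) -> int:
    """Reproduce the four reasoning steps and return the rounded-up server count."""
    current_capacity = current_servers * per_server_rps   # step 1: 3 * 150 = 450
    peak_traffic = current_capacity * traffic_multiplier  # step 2: 450 * 2.5 = 1125
    exact_servers = peak_traffic / per_server_rps         # step 3: 1125 / 150 = 7.5
    return math.ceil(exact_servers)                       # step 4: round up to a whole server


print(servers_needed(3, 150, 2.5))  # prints 8
```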

Self-Consistency

Generate multiple reasoning attempts for the same question, then select the most frequent answer.

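import re
from collections import Counter

import anthropic

client = anthropic.Anthropic()


def extract_answer(text: str) -> str:
    """Hypothetical helper: return the text after the last "Answer:" marker.

    Assumes the prompt asks the model to finish with a line such as "Answer: 8".
    """
    matches = re.findall(r"Answer:\s*(.+)", text)
    return matches[-1].strip() if matches else text.strip()


def self_consistency(prompt: str, n: int = 5) -> tuple[str, float]:
    """Self-Consistency: sample n answers and return the majority vote."""
    answers = []
    for _ in range(n):
        # temperature > 0 so that the sampled reasoning paths differ
        response = client.messages.create(
            model="claude-sonnet-4-6",
            max_tokens=1024,
            temperature=0.7,
            messages=[{"role": "user", "content": prompt}],
        )
        answers.append(extract_answer(response.content[0].text))

    answer, votes = Counter(answers).most_common(1)[0]
    return answer, votes / n  # answer, share of samples that agree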

Tree-of-Thought (ToT) Prompting

Explores multiple reasoning paths in a tree structure to find the optimal answer.
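As a rough illustration, the search can be written as a small beam search over partial reasoning paths. This is a sketch under stated assumptions, not the algorithm from the ToT paper verbatim: in practice `propose` and `score` would each be LLM calls, whereas here they are hypothetical stand-ins backed by hard-coded dictionaries so the example runs offline.

```python
from typing import Callable, List, Tuple

Path = Tuple[str, ...]


def tree_of_thought(
    root: str,
    propose: Callable[[Path], List[str]],
    score: Callable[[Path], float],
    depth: int = 2,
    beam_width: int = 2,
) -> Path:
    """Expand every path in the frontier, score the results, keep the best few."""
    frontier: List[Path] = [(root,)]
    for _ in range(depth):
        candidates = [path + (thought,) for path in frontier for thought in propose(path)]
        if not candidates:
            break  # no path can be extended further
        candidates.sort(key=score, reverse=True)
        frontier = candidates[:beam_width]
    return max(frontier, key=score)


# Toy stand-ins for the two LLM calls (illustrative values only).
OPTIONS = {
    "slow query": ["missing index", "lock contention"],
    "missing index": ["add composite index", "rewrite query"],
    "lock contention": ["shorten transactions"],
}
PLAUSIBILITY = {
    "slow query": 0.0,
    "missing index": 0.6,
    "lock contention": 0.3,
    "add composite index": 0.8,
    "rewrite query": 0.5,
    "shorten transactions": 0.9,
}


def propose(path: Path) -> List[str]:
    return OPTIONS.get(path[-1], [])


def score(path: Path) -> float:
    return sum(PLAUSIBILITY[thought] for thought in path)


best = tree_of_thought("slow query", propose, score)
print(" -> ".join(best))  # slow query -> missing index -> add composite index
```

The beam width is the main cost control: each level costs one `propose` call per kept path plus one `score` call per candidate.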


Technique Comparison

Technique	Suitable Tasks	Cost	Accuracy Gain
Zero-shot	Simple classification, translation, summary	Lowest	Baseline
Few-shot	Pattern learning, format specification	Low	Medium
CoT	Math, logic, code analysis	Medium	High
Self-Consistency	Problems with clear answers	High (Nx)	Very high
ToT	Complex decisions, diagnostics	High	Very high

3. System Prompt Design

Role of System Prompts

System prompts define the model's overall behavior, role, and constraints. They are applied before all user messages.

import anthropic
 
client = anthropic.Anthropic()
 
response = client.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=2048,
    system="""You are 'DataBot', a senior data engineer at Data Dynamics.
 
## Role
- Expert in Apache Spark, Kafka, NiFi, Kudu and big data technologies
- Familiar with internal technical standards and best practices
 
## Response Rules
1. Always include code examples for technical questions
2. Show before/after comparison for configuration changes
3. Explicitly state "verification needed" for uncertain information
4. Never output security-sensitive information (passwords, keys)
 
## Response Format
- Concise, practical answers (skip unnecessary greetings)
- Write production-ready code
- Present key points first (bottom line up front)
 
## Limitations
- Do not advise on infrastructure costs or licensing
- Do not execute direct production environment changes""",
    messages=[{
        "role": "user",
        "content": "I'm getting Spark executor OOM errors. How do I fix this?"
    }]
)

System Prompt Design Patterns

Pattern 1: ROLE-TASK-FORMAT

[ROLE] You are a {role}.
[TASK] You {task description}.
[FORMAT] Respond in {format}.

Pattern 2: Behavior-based (DO/DON'T)

## DO
- Include code examples
- Explain impact of configuration changes
- Suggest at least 2 alternatives

## DON'T
- Output passwords or API keys
- State uncertain information definitively
- Directly modify production environments
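Both patterns are plain text templates, so they can be generated from data and combined into one system prompt. The helper names below (`render_role_task_format`, `render_do_dont`) are hypothetical and exist only for this sketch.

```python
from typing import Sequence


def render_role_task_format(role: str, task: str, response_format: str) -> str:
    """Fill in Pattern 1 (ROLE-TASK-FORMAT)."""
    return (
        f"[ROLE] You are a {role}.\n"
        f"[TASK] You {task}.\n"
        f"[FORMAT] Respond in {response_format}."
    )


def render_do_dont(do: Sequence[str], dont: Sequence[str]) -> str:
    """Fill in Pattern 2 (DO/DON'T) as two bulleted sections."""
    do_lines = "\n".join(f"- {rule}" for rule in do)
    dont_lines = "\n".join(f"- {rule}" for rule in dont)
    return f"## DO\n{do_lines}\n\n## DON'T\n{dont_lines}"


system_prompt = "\n\n".join(
    [
        render_role_task_format(
            "senior data engineer",
            "answer questions about Spark and Kafka",
            "concise Markdown",
        ),
        render_do_dont(
            do=["Include code examples", "Suggest at least 2 alternatives"],
            dont=["Output passwords or API keys"],
        ),
    ]
)
print(system_prompt)
```

Keeping the rules in lists makes it straightforward to version them and to test prompt variants against each other.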

4. Structured Output

Forcing JSON Output

Analyze the following server logs and return results in JSON format.

Logs:
2025-03-15 14:32:01 ERROR [PaymentService] Connection timeout to payment gateway (retry 3/3)
2025-03-15 14:32:05 WARN  [OrderService] Order #12345 payment pending, fallback to queue
2025-03-15 14:33:00 INFO  [OrderService] Order #12345 payment retried successfully

Output format:
{
  "incident_summary": "summary",
  "severity": "critical|warning|info",
  "affected_services": ["service names"],
  "root_cause": "root cause",
  "resolution": "resolution",
  "timeline": [
    {"time": "time", "event": "event", "level": "level"}
  ]
}

Output ONLY the JSON with no other text.
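Even with an explicit "JSON only" instruction, it is prudent to validate the reply before using it. The sketch below assumes the reply text has already been obtained from the model; `parse_incident_json` is a hypothetical helper, and the required keys and severity values are taken from the output format in the prompt above.

```python
import json
from typing import Any, Dict

REQUIRED_KEYS = {
    "incident_summary",
    "severity",
    "affected_services",
    "root_cause",
    "resolution",
    "timeline",
}
ALLOWED_SEVERITIES = {"critical", "warning", "info"}


def parse_incident_json(reply: str) -> Dict[str, Any]:
    """Parse and validate a reply that should contain only the incident JSON."""
    # Tolerate surrounding text or a code fence by slicing from the first "{"
    # to the last "}".
    start, end = reply.find("{"), reply.rfind("}")
    if start == -1 or end < start:
        raise ValueError("no JSON object found in reply")
    data = json.loads(reply[start : end + 1])  # JSONDecodeError is a ValueError
    missing = REQUIRED_KEYS - set(data)
    if missing:
        raise ValueError(f"missing keys: {sorted(missing)}")
    if data["severity"] not in ALLOWED_SEVERITIES:
        raise ValueError(f"unexpected severity: {data['severity']!r}")
    return data
```

A failed validation is a natural trigger for a retry, optionally with the error message appended to the prompt.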

XML Tag-Based Structuring

XML tags clearly delineate the input and output areas of a prompt. This approach is particularly effective with Claude.

Analyze the following code.

<code>
def process_data(df):
    result = df.groupBy("user_id").agg(
        count("*").alias("total_orders"),
        sum("amount").alias("total_amount")
    )
    return result.filter(col("total_amount") > 1000)
</code>

Write analysis results matching these tags:

<analysis>
  <purpose>Code purpose</purpose>
  <issues>Issues found (if any)</issues>
  <optimization>Optimization suggestions</optimization>
  <improved_code>Improved code</improved_code>
</analysis>
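On the receiving side, the tagged sections can be pulled out with a regular expression. This is a minimal sketch, assuming flat, non-repeated tags as in the prompt above; `extract_tags` is a hypothetical helper. A regex is used rather than an XML parser because model output that embeds code (with `<` and `>` characters) is often not well-formed XML.

```python
import re
from typing import Dict, Sequence


def extract_tags(reply: str, tags: Sequence[str]) -> Dict[str, str]:
    """Return the text inside each requested tag, or "" when the tag is absent."""
    result = {}
    for tag in tags:
        name = re.escape(tag)
        # DOTALL lets "." span newlines, so multi-line sections are captured.
        match = re.search(rf"<{name}>(.*?)</{name}>", reply, flags=re.DOTALL)
        result[tag] = match.group(1).strip() if match else ""
    return result


reply = """<analysis>
  <purpose>Aggregate order count and amount per user</purpose>
  <issues>None</issues>
</analysis>"""
print(extract_tags(reply, ["purpose", "issues", "optimization"]))
```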

5. Advanced Prompt Patterns

Role-Playing Pattern

Conduct this code review from three perspectives:

<reviewer role="Security Expert">
Review the code from a security vulnerability perspective.
</reviewer>

<reviewer role="Performance Engineer">
Review the code from a performance bottleneck perspective.
</reviewer>

<reviewer role="Junior Developer">
Review from a code readability and comprehension perspective.
</reviewer>

Devil's Advocate Pattern

Present counterarguments to the following architecture decision.

Decision: "Migrate to microservices architecture"

You are a senior architect who opposes this decision.
1. Three potential risks
2. Specific scenarios where monolith is better
3. Most likely failure cause during migration
4. Alternative approaches

Graduated Complexity Pattern

Explain Kafka Consumer Groups at three levels:

[Beginner] Explain using analogies in 5 lines or less
[Intermediate] Explain core concepts and mechanics with code examples
[Advanced] Detail rebalancing protocols, partition assignment strategies, and error handling

Meta Prompting

Meta prompting asks the LLM to write the prompt itself.

I want to perform a systematic Spark performance tuning analysis.

Create the optimal prompt to achieve this goal, considering:
- Target: Apache Spark 3.x
- Scope: Configuration, code, infrastructure
- Output format: Checklist + improvement plan
- Environment: Spark on Kubernetes

6. Practical Prompt Templates

Code Generation Template

Write {language} code matching these requirements.

## Requirements
{detailed requirements}

## Tech Stack
{libraries, frameworks}

## Constraints
- {constraint 1}
- {constraint 2}

## Code Quality Standards
- Include error handling
- Use type hints (Python)
- Write docstrings
- Include unit tests

## Output Format
1. Main code
2. Usage example
3. Test code

Incident Analysis Template

Analyze the following incident.

## Symptoms
{currently observed issues}

## Environment
- System: {system name}
- Version: {version}
- Infrastructure: {infra info}

## Collected Information
{logs, metrics, configuration}

## Analysis Request
1. Possible causes (highest probability first)
2. Diagnostic commands/queries per cause
3. Immediate mitigation (emergency response)
4. Root cause fix (prevent recurrence)
5. Impact assessment
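Templates like these are easiest to reuse when the placeholders are filled programmatically. The sketch below uses `string.Template` from the Python standard library on an abbreviated version of the incident template; the placeholder names (`$symptoms`, `$system`, and so on) are assumptions chosen for this example.

```python
from string import Template

INCIDENT_TEMPLATE = Template(
    "Analyze the following incident.\n\n"
    "## Symptoms\n$symptoms\n\n"
    "## Environment\n"
    "- System: $system\n"
    "- Version: $version\n"
    "- Infrastructure: $infrastructure\n\n"
    "## Collected Information\n$collected\n"
)


def fill_incident_template(**fields: str) -> str:
    """Substitute every placeholder; a missing field raises KeyError."""
    return INCIDENT_TEMPLATE.substitute(**fields)


prompt = fill_incident_template(
    symptoms="Executors fail with OutOfMemoryError",
    system="Apache Spark",
    version="3.x",
    infrastructure="Kubernetes",
    collected="(logs, metrics, configuration)",
)
print(prompt)
```

Failing loudly on a missing field is deliberate: a prompt with an unfilled placeholder tends to produce plausible but useless output.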

7. Prompt Anti-Patterns and Solutions

Common Mistakes and Improvements

Anti-Pattern	Problem	Improvement
Vague instructions	"Write it well"	"Summarize in 3 sentences including key metrics"
Excessive rules	20+ rules listed	Compress to 5-7 core rules with priorities
Negative instructions	List of "don'ts"	State "dos" first
Missing context	Just dropping code	Include purpose, environment, expected result
Single-turn overload	Everything in one prompt	Separate into sequential steps
Ignoring Temperature	Always use defaults	Creative: 0.7-1.0, Analysis: 0.0-0.3, Code: 0.0

Temperature and Top-p Guide

Task	Temperature	Top-p	Reason
Code generation	0.0	1.0	Accuracy first
Bug analysis	0.0-0.2	1.0	Fact-based analysis
Technical docs	0.3-0.5	0.9	Accurate but natural prose
Brainstorming	0.7-1.0	0.95	Diverse ideas
Creative writing	0.8-1.0	0.95	Creative expression
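The table can be kept in code as a lookup so that every call site uses the same settings. In this sketch the task keys are invented names, and choosing the midpoint of each temperature range is an assumption made purely for illustration.

```python
from typing import Dict, Tuple

# (temperature range, top_p) per task, copied from the table above.
SAMPLING_GUIDE: Dict[str, Tuple[Tuple[float, float], float]] = {
    "code_generation": ((0.0, 0.0), 1.0),
    "bug_analysis": ((0.0, 0.2), 1.0),
    "technical_docs": ((0.3, 0.5), 0.9),
    "brainstorming": ((0.7, 1.0), 0.95),
    "creative_writing": ((0.8, 1.0), 0.95),
}


def sampling_params(task: str) -> Dict[str, float]:
    """Return temperature (midpoint of the range, an assumption) and top_p."""
    (low, high), top_p = SAMPLING_GUIDE[task]
    return {"temperature": round((low + high) / 2, 2), "top_p": top_p}


print(sampling_params("technical_docs"))
```

The resulting dictionary could be passed as keyword arguments to whichever client is in use, provided that client and model accept both parameters together.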

8. Prompt Evaluation and Optimization

Evaluation Methods

Method	Description	Suitable For
Automated eval	Answer comparison (Exact Match, F1)	Classification, extraction, QA
LLM-as-Judge	Another LLM scores quality	Generative tasks, summarization
Human eval	Domain experts evaluate directly	High quality requirements
A/B testing	Compare prompt variants	Production optimization
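For the automated row, Exact Match and token-level F1 take only a few lines. This sketch uses a simple normalization (lowercasing and whitespace tokenization), which is an assumption; benchmark-specific scorers often also strip punctuation and articles.

```python
from collections import Counter
from typing import List


def normalize(text: str) -> List[str]:
    """Lowercase and split on whitespace."""
    return text.lower().split()


def exact_match(prediction: str, reference: str) -> bool:
    return normalize(prediction) == normalize(reference)


def token_f1(prediction: str, reference: str) -> float:
    """Harmonic mean of token precision and recall."""
    pred, ref = normalize(prediction), normalize(reference)
    # Multiset intersection counts each shared token at most as often as it
    # appears in both texts.
    overlap = sum((Counter(pred) & Counter(ref)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred)
    recall = overlap / len(ref)
    return 2 * precision * recall / (precision + recall)


print(exact_match("Positive", "positive"))
print(token_f1("the cat sat", "the cat sat down"))
```

Exact Match suits classification labels, while F1 gives partial credit for extraction and short-answer QA.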

LLM-as-Judge Implementation

import json

import anthropic

client = anthropic.Anthropic()


def evaluate_with_llm(question: str, response: str, criteria: list[str]) -> dict:
    """Evaluate response quality with an LLM judge and return its JSON verdict."""
    criteria_lines = "\n".join(f"- {c}" for c in criteria)
    eval_prompt = f"""Evaluate the quality of this response.

Question: {question}
Response: {response}

Evaluation criteria (1-5 points each):
{criteria_lines}

Return evaluation results in JSON only, with no other text:
{{"scores": {{"criterion": score}}, "total": average, "feedback": "improvements"}}
"""

    result = client.messages.create(
        model="claude-sonnet-4-6",
        max_tokens=1024,
        temperature=0.0,
        messages=[{"role": "user", "content": eval_prompt}],
    )
    # json.loads raises json.JSONDecodeError if the judge adds text around the JSON
    return json.loads(result.content[0].text)

Prompt Optimization Process


Note: Prompt optimization is about iterative improvement, not getting it perfect on the first try. Collect failure cases, analyze them, and refine the prompt gradually. Prompts should also be re-validated whenever the model version changes.
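Collecting failure cases works best when the test set is rerun after every prompt change. The following is a minimal sketch of such a regression run; `run_regression` is a hypothetical helper, `fake_model` is a stand-in for a real LLM call so the example runs offline, and the template is assumed to contain a single `{input}` placeholder and no other braces.

```python
from typing import Callable, Dict, List, Tuple


def run_regression(
    prompt_template: str,
    cases: List[Tuple[str, str]],
    model_fn: Callable[[str], str],
) -> Dict[str, object]:
    """Run each (input, expected) case and collect the failures for analysis."""
    failures = []
    for case_input, expected in cases:
        output = model_fn(prompt_template.format(input=case_input)).strip().lower()
        if output != expected.lower():
            failures.append({"input": case_input, "expected": expected, "got": output})
    passed = len(cases) - len(failures)
    return {"pass_rate": passed / len(cases) if cases else 0.0, "failures": failures}


def fake_model(prompt: str) -> str:
    """Stand-in for an LLM: never answers "neutral", to provoke one failure."""
    return "negative" if "slow" in prompt else "positive"


cases = [
    ("Shipping was too slow.", "negative"),
    ("Great quality!", "positive"),
    ("It is okay.", "neutral"),
]
report = run_regression("Classify the sentiment: {input}\nSentiment:", cases, fake_model)
print(report["pass_rate"])
print(report["failures"])
```

Running the same cases against two prompt variants, or against a new model version, gives a direct comparison of pass rates and shows which inputs regressed.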

References

Wei, J. et al. (2022). "Chain-of-Thought Prompting Elicits Reasoning in Large Language Models." NeurIPS
Wang, X. et al. (2023). "Self-Consistency Improves Chain of Thought Reasoning in Language Models." ICLR
Yao, S. et al. (2023). "Tree of Thoughts: Deliberate Problem Solving with Large Language Models." NeurIPS
Anthropic. "Prompt Engineering Guide" — https://docs.anthropic.com/en/docs/build-with-claude/prompt-engineering
OpenAI. "Prompt Engineering Guide" — https://platform.openai.com/docs/guides/prompt-engineering
White, J. et al. (2023). "A Prompt Pattern Catalog to Enhance Prompt Engineering with ChatGPT." arXiv

— Data Dynamics Engineering Team