4.1 Understanding Tokenization
LLMs don't process text character by character or word by word -- they work with "tokens." A token is a chunk of text that the model treats as a single unit. Understanding tokenization helps you write more efficient prompts and stay within context limits.
Tokenization Rules of Thumb
- 1 token ≈ 4 characters in English (including spaces)
- 1 token ≈ 0.75 words on average
- 100 tokens ≈ 75 words (useful for estimation)
- 1,000 tokens ≈ 750 words or about 1.5 pages of text
- Common words are usually single tokens; rare words may split
- Non-English text often uses more tokens per word
Both input (your prompt) AND output (AI's response) count toward token limits and costs. A verbose prompt that could be written more concisely wastes tokens you could use for longer, more detailed responses.
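The rules of thumb above can be sketched in a few lines of Python. This is only an estimate: the function names and the characters-per-token constant are illustrative, and exact counts require the model's actual tokenizer (e.g. OpenAI's tiktoken library).

```python
def estimate_tokens(text: str) -> int:
    """Rough token estimate using the ~4 characters per token rule of thumb.

    Works tolerably for English prose; non-English text and rare words
    usually consume more tokens than this predicts.
    """
    return max(1, round(len(text) / 4))


def estimate_words(tokens: int) -> float:
    """Approximate word count from a token count (~0.75 words per token)."""
    return tokens * 0.75


prompt = "Summarize this contract in three bullet points."
print(estimate_tokens(prompt))   # 47 characters -> prints 12
print(estimate_words(1_000))     # prints 750.0, i.e. ~1.5 pages
```

Use these numbers for budgeting and sanity checks, not billing: providers charge on the tokenizer's exact count, which can differ noticeably for legal text full of defined terms and citations.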
4.2 Context Windows Explained
The context window is the total amount of text (measured in tokens) that an LLM can process at once -- including both your input and its output. Think of it as the AI's "working memory" for the current conversation.
Current Model Context Windows
| Model | Context Window | Approximate Text |
|---|---|---|
| GPT-4 Turbo | 128,000 tokens | ~300 pages |
| Claude 3 Opus | 200,000 tokens | ~500 pages |
| Gemini 1.5 Pro | 1,000,000 tokens | ~1,500 pages |
| GPT-3.5 Turbo | 16,000 tokens | ~40 pages |
Larger context windows don't mean unlimited memory. Performance often degrades with very long contexts, especially for information in the "middle" of the input. For critical analysis, keep relevant information near the beginning or end of your prompt.
Working Within Context Limits
- Prioritize essential information: Put the most important content first
- Summarize when possible: Replace lengthy documents with concise summaries
- Chunk large documents: Process in sections rather than all at once
- Remove redundancy: Don't repeat information unnecessarily
- Use references: "Refer to the contract above" instead of restating
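The chunking step above can be sketched as a paragraph-aware splitter. This is a minimal sketch assuming the ~4 characters per token estimate from section 4.1; `chunk_text` is an illustrative name, not a library function.

```python
def chunk_text(text: str, max_tokens: int = 4_000,
               chars_per_token: int = 4) -> list[str]:
    """Split a long document into chunks that fit a per-request token budget.

    Splits on blank-line paragraph boundaries so each chunk stays readable;
    the budget is approximated with the ~4 characters per token rule.
    """
    max_chars = max_tokens * chars_per_token
    chunks, current = [], ""
    for para in text.split("\n\n"):
        # Start a new chunk when adding this paragraph would exceed the budget.
        if current and len(current) + len(para) + 2 > max_chars:
            chunks.append(current)
            current = para
        else:
            current = f"{current}\n\n{para}" if current else para
    if current:
        chunks.append(current)
    return chunks
```

Splitting on paragraph (or clause) boundaries matters for legal documents: a chunk that cuts a clause in half invites the model to misread it.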
4.3 API Pricing and Cost Management
When using AI through APIs (application programming interfaces), you pay per token. Understanding pricing helps you make cost-effective choices and budget for AI integration in legal practice.
Typical Pricing Structure (as of 2024)
| Model Tier | Input Cost | Output Cost | Example Models |
|---|---|---|---|
| Economy | $0.50/1M tokens | $1.50/1M tokens | GPT-3.5, Claude Haiku |
| Standard | $3-5/1M tokens | $10-15/1M tokens | GPT-4o, Claude Sonnet |
| Premium | $10-15/1M tokens | $30-75/1M tokens | GPT-4 Turbo, Claude Opus |
Example: analyzing a 10-page contract with GPT-4 Turbo. Ten pages is roughly 6,700 tokens (by the 1.5-pages-per-1,000-tokens rule), so input costs about $0.07-0.10 at premium rates; even with a detailed 1,000-token response billed at the higher output rate, the total is typically under $0.25.
- Use cheaper models for simple tasks: GPT-3.5 or Claude Haiku for summarization, formatting, simple Q&A
- Reserve premium models for complex analysis: Use GPT-4 or Opus only when you need superior reasoning
- Minimize output tokens when possible: Request concise responses; output tokens cost more than input
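These tips can be combined into a rough per-call cost estimator. The rates below are the illustrative 2024 tier figures from the table above (midpoints of each range), not live prices; check your provider's pricing page before budgeting.

```python
# Illustrative per-million-token rates, taken from the 2024 tier table above.
PRICING = {
    "economy":  {"input": 0.50,  "output": 1.50},
    "standard": {"input": 4.00,  "output": 12.50},
    "premium":  {"input": 12.50, "output": 50.00},
}


def estimate_cost(input_tokens: int, output_tokens: int,
                  tier: str = "premium") -> float:
    """Estimated USD cost of one API call at the illustrative tier rates."""
    rates = PRICING[tier]
    return (input_tokens * rates["input"]
            + output_tokens * rates["output"]) / 1_000_000


# A 10-page contract is roughly 6,700 tokens (1,000 tokens ~= 1.5 pages);
# assume a ~1,000-token response.
print(f"${estimate_cost(6_700, 1_000):.2f}")             # prints $0.13
print(f"${estimate_cost(6_700, 1_000, 'economy'):.4f}")  # prints $0.0049
```

The ~27x gap between the economy and premium figures is the practical argument for routing simple tasks to cheaper models.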
4.4 Prompt Optimization Strategies
Efficient prompts achieve better results with fewer tokens. This section covers practical techniques for writing economical prompts without sacrificing clarity or effectiveness.
Token-Efficient Writing
- Be direct: "Summarize this contract" not "I would like you to please provide a summary of the following contract document"
- Use abbreviations contextually: After first defining "Information Technology Act" you can use "IT Act"
- Eliminate filler phrases: Remove "I think," "perhaps," "maybe," "in my opinion"
- Use structured formats: Bullet points and numbered lists are often more token-efficient than prose
- Reference, don't repeat: "Analyze the clause above" instead of copying it again
Before and After Examples
| Inefficient (More Tokens) | Efficient (Fewer Tokens) |
|---|---|
| "I would like you to please help me understand what the implications of this particular clause might be for my client who is a small business owner" | "Explain this clause's implications for a small business owner" |
| "Can you take a look at the following contract and let me know if there are any issues that I should be concerned about?" | "Identify potential issues in this contract" |
| "In your response, please make sure to include information about the relevant legal provisions and also provide some examples if possible" | "Include relevant provisions and examples" |
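Using the ~4 characters per token estimate from section 4.1, you can quantify the savings in the first table row above (a sketch; real counts need the model's tokenizer):

```python
def estimate_tokens(text: str) -> int:
    """Rough estimate: ~4 characters per token for English text."""
    return max(1, round(len(text) / 4))


before = ("I would like you to please help me understand what the implications "
          "of this particular clause might be for my client who is a small "
          "business owner")
after = "Explain this clause's implications for a small business owner"

saved = 1 - estimate_tokens(after) / estimate_tokens(before)
print(f"{saved:.0%} fewer prompt tokens")  # well over half the tokens saved
```

The savings compound: a template you reuse hundreds of times a month multiplies whatever you trim from it.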
Handling Long Documents
- Summarize first: Ask the AI to summarize the document, then ask follow-up questions
- Extract relevant sections: Only include the clauses actually relevant to your query
- Process in chunks: Analyze sections separately, then synthesize findings
- Use hierarchical analysis: Start with high-level overview, drill down as needed
When reviewing a 50-page contract, don't paste the entire document and ask for "issues." Instead: (1) Extract the table of contents first, (2) Identify high-risk sections based on headings, (3) Analyze those specific sections in detail. This approach is faster, cheaper, and produces better results.
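The three-step approach can be sketched as follows. `ask_model` is a hypothetical placeholder standing in for whatever API client you actually use (OpenAI, Anthropic, etc.); the point is the structure of the workflow, not the call itself.

```python
def ask_model(prompt: str) -> str:
    """Placeholder for a real LLM API call -- substitute your client here."""
    return f"[model response to: {prompt[:40]}...]"


def review_contract(sections: dict[str, str],
                    high_risk_headings: list[str]) -> dict[str, str]:
    """Step 3: detailed analysis of only the sections flagged as high-risk."""
    findings = {}
    for heading in high_risk_headings:
        findings[heading] = ask_model(
            f"Identify potential issues in this clause:\n\n{sections[heading]}"
        )
    return findings


# Steps 1-2 (extract the table of contents, flag high-risk headings) happen
# before this call -- here the sections and flags are illustrative.
sections = {"Indemnity": "...", "Termination": "...", "Notices": "..."}
report = review_contract(sections, ["Indemnity", "Termination"])
```

Only two of the three sections ever reach the model, which is exactly where the time and token savings come from.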
4.5 Practical Budgeting for AI Usage
Integrating AI into legal practice requires thoughtful budgeting. Here's how to estimate costs and make economically sound decisions about AI tool usage.
Cost-Benefit Framework
Before using AI for a task, consider:
- Time saved: How long would this take manually?
- AI cost: Estimated token cost for the task
- Quality needs: Does this need premium model accuracy?
- Verification effort: How much review will the output require?
Typical Monthly Costs by Usage Level
| Usage Level | Description | Estimated Cost |
|---|---|---|
| Light | Occasional research queries, simple drafting help | $20-50/month |
| Moderate | Daily use for research, document review, drafting | $100-300/month |
| Heavy | Extensive document analysis, complex legal research | $500-1,500/month |
| Enterprise | Firm-wide deployment, high-volume processing | $2,000+/month |
If AI saves a junior associate 2 hours per day (billed at Rs. 3,000/hour), that's about Rs. 1.8 lakh/month in recovered billable time, assuming 30 working days. Even with heavy AI usage costs, the ROI is typically positive if the tool is used effectively.
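The break-even arithmetic is easy to check. The ~Rs. 80/USD conversion and the 30-working-day month below are assumptions for illustration, not part of any pricing table.

```python
def monthly_roi(hours_saved_per_day: float, billing_rate: float,
                working_days: int, ai_cost: float) -> float:
    """Net monthly benefit: billable time recovered minus AI tooling cost."""
    return hours_saved_per_day * billing_rate * working_days - ai_cost


# 2 hours/day at Rs. 3,000/hour over 30 days = Rs. 1.8 lakh recovered,
# against heavy usage of ~$1,500/month ~= Rs. 1.2 lakh (at ~Rs. 80/USD).
print(monthly_roi(2, 3_000, 30, 120_000))  # Rs. 60,000 net benefit
```

The framework cuts both ways: if the hours saved are optimistic or the verification effort is heavy, the same formula shows where the ROI turns negative.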
Key Takeaways
- Tokens are the currency of LLMs -- both input and output count toward limits and costs
- Rule of thumb: 1 token ≈ 4 characters ≈ 0.75 words; 1,000 tokens ≈ 750 words
- Context windows range from 4K to 1M+ tokens; larger isn't always better for accuracy
- Output tokens typically cost 2-5x more than input tokens -- request concise responses
- Use economy models for simple tasks, reserve premium models for complex analysis
- Optimize prompts by being direct, eliminating filler, and processing long documents in chunks
