[AI] How to effectively use AI agents: a lesson learned from paper summary skills

Like many scientists and engineers, I have been working hard to leverage AI to boost my productivity by automating routine tasks (such as writing LinkedIn posts to summarize research papers) and generating code so that I don’t have to memorize every bit of syntax.

However, I have mixed feelings about AI tools like Claude Code, Gemini CLI, and Qwen-code. On the one hand, they are incredibly effective at generating ideas I hadn’t considered, such as finding the perfect word to refine my meaning. On the other hand, they often produce outputs I didn’t intend, such as incorporating unrequested information or suggesting inefficient algorithms.

I have been learning how to use these tools effectively, and here are the key lessons I’ve gathered from my experience:

  • Specify your boundaries: If you don’t want the AI to introduce new information, state that clearly.

  • Be precise: The more specific you are about your requirements—and the more examples you provide—the better the output.

  • Iterate wisely: Don’t be afraid to add constraints after seeing failure cases. However, avoid overloading the prompt with too many constraints at the beginning, as this can hinder the model’s ability to generate high-quality content.

  • Underlying models matter: I found that the command-line-based Gemini CLI was more effective than the web-based Gemini.google.com, and that Qwen-code outperformed Gemini CLI on my paper-summary task. The choice of model can significantly affect output quality, so it’s worth experimenting with different options. Note that I have been using only the free versions of these models and tools.

In this post, I demonstrate these points by sharing the evolution of my paper-summary agent skill. I will walk through how I iteratively refined the skill from v1 to v3 and detail the lessons I learned along the way.

You can find all versions of the skill files in this GitHub repository.

I iterated my paper-summary agent skill across three versions (more have been added since this post was published). Here are the key changes and the lessons behind them.


v1 → v2: From Over-Specified to Minimal

What changed: v1 was ~500 words with rigid roles (“Expert Scientific Communications AI”), dynamic bullet headers, emoji-per-bullet quotas, and a strict “Paywall Mode.” v2 stripped it down to ~200 words of pure behavioral rules.

Why: LLMs don’t need role-playing fluff. Over-specification creates brittleness — the agent started following template mechanics instead of producing natural summaries.

- ### **Role: Expert Scientific Communications AI**
- ##### **Dynamic Bullet Points**
-   * **Headers:** Use bold headers within bullets (e.g., **🧬 Methodological Advance:**)
-   * **Visuals:** Use exactly one relevant emoji per bullet.
+ ### Behavior
+ - Make the post concise, ranging from 20 to 70 words
+ - Use bullet points for complex study

Lesson 1: Less structure ≠ less quality. Give the LLM clear constraints, not a rigid template. The model already knows how to write — it needs guardrails, not a fill-in-the-blank form.
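
A side benefit of numeric constraints like the 20-to-70-word rule is that they are checkable outside the prompt. Here is a quick Python sketch of such a check (my own illustration, not part of the skill file; check_post is a hypothetical name):

```python
import re

def check_post(post: str, min_words: int = 20, max_words: int = 70) -> list[str]:
    """Flag violations of the v2-style length rule (20-70 words)."""
    problems = []
    n_words = len(re.findall(r"\b\w+\b", post))
    if not min_words <= n_words <= max_words:
        problems.append(f"word count {n_words} is outside {min_words}-{max_words}")
    return problems

draft = "A recent study found ..."
for issue in check_post(draft):
    print("FAIL:", issue)
```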


v2 → v3: Adding What Was Missing in Practice

What changed: Three additions driven by real usage:

| Addition | Example | Why |
| --- | --- | --- |
| Journal name mention | "A recent study in **Nature Genetics** found..." | Credibility & link verification |
| AI disclaimer | "This post was generated by AI (Qwen Code)..." | Transparency |
| Dual output (md + html) | LinkedIn/qwen/2026/0412-title.{md,html} | Ready-to-post HTML avoids manual formatting |
+ - Mention the journal name in the post
+ - Also adding a gentle reminder message at the end to indicate AI-generated
+ - Save the post in two formats: html and markdown
+ - File path format LinkedIn/[agent]/[year]/[mmdd]-[title].{md,html}

Lesson 2: Add constraints from failures, not guesses. Each v3 addition came from a real gap: missing source attribution, no transparency on AI origin, and manual HTML conversion for LinkedIn posting.
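
The dual-output rule is handled by the agent itself, but the file logic is simple enough to sketch. Below is a rough Python equivalent of what the skill asks for (my own illustration, not the agent's actual code; save_post is a hypothetical helper, and the HTML conversion assumes the third-party markdown package):

```python
from datetime import date
from pathlib import Path

import markdown  # third-party: pip install markdown

def save_post(body_md: str, title: str, agent: str = "qwen") -> None:
    """Write a post under LinkedIn/[agent]/[year]/[mmdd]-[title].{md,html}."""
    today = date.today()
    slug = title.lower().replace(" ", "-")
    out_dir = Path("LinkedIn") / agent / str(today.year)
    out_dir.mkdir(parents=True, exist_ok=True)
    stem = out_dir / f"{today:%m%d}-{slug}"
    # Keep the markdown source and a ready-to-post HTML rendering side by side.
    stem.with_suffix(".md").write_text(body_md, encoding="utf-8")
    stem.with_suffix(".html").write_text(markdown.markdown(body_md), encoding="utf-8")
```

Run on 2026-04-12 with agent="qwen", this would produce the LinkedIn/qwen/2026/0412-title.md and .html paths shown in the table above.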


Guardrail Evolution: Tightening Veracity

v1: "Zero-Inference" + "Paywall Mode" + "Final Cross-Reference"  → verbose
v2: "Never use or infer information beyond the paper"            → clean
v3: + "Try to use words in the abstract as much as possible"     → actionable

Lesson 3: Say “how,” not just “don’t.” Telling the model to reuse the abstract’s wording is more effective than just telling it not to hallucinate.
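
One reason the v3 phrasing is actionable: compliance is measurable. As a rough sketch (again my own illustration, not part of the skill; abstract_overlap is a hypothetical name), the share of a summary’s vocabulary that also appears in the abstract gives a quick drift signal:

```python
import re

def vocab(text: str) -> set[str]:
    """Lowercased set of words, ignoring case and punctuation."""
    return set(re.findall(r"[a-z]+", text.lower()))

def abstract_overlap(summary: str, abstract: str) -> float:
    """Fraction of the summary's vocabulary that also appears in the abstract."""
    summary_words = vocab(summary)
    if not summary_words:
        return 0.0
    return len(summary_words & vocab(abstract)) / len(summary_words)

# A low score is a cheap hint that the post has drifted from the source wording.
print(f"{abstract_overlap('The study maps gene variants.', 'We map gene variants linked to disease.'):.2f}")
```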


Key Takeaways for Future Skill Development

  1. Start minimal. A 20-line skill file that works beats a 50-line one that breaks. Add rules only when you observe failures.

  2. Remove role-playing. "You are an expert X" adds tokens, not quality. Behavioral constraints work better.

  3. Specify outputs precisely. File paths, formats, and exact phrasing examples prevent ambiguity:

    LinkedIn/[agent]/[year]/[mmdd]-[title].{md,html}
    
  4. Guardrails > Templates. Tell the model what to avoid and how to verify, not what each bullet must look like.

  5. Iterate from real usage. v3’s additions (journal name, AI disclaimer, dual format) all came from actual posting friction, not hypothetical needs.


This post was generated with help from AI (Qwen Code); please read with a critical eye.


Last modified on 2026-04-12