mirror of
https://github.com/rebelytics/one-skill-to-rule-them-all.git
synced 2026-06-10 17:17:41 +00:00
New self-improved version
Latest version of the meta-skill, based on the most recent self-improvements it has applied. Readme and User Guide also updated to reflect current usage and environment.
This commit is contained in:
697
SKILL.md
697
SKILL.md
@@ -3,14 +3,17 @@ name: task-observer
|
||||
description: >
|
||||
Monitors task execution for skill improvement opportunities. Use this skill
|
||||
during ANY multi-step task, agentic workflow, or substantive work session where
|
||||
Claude is using tools and producing deliverables. It captures patterns, user
|
||||
the agent is using tools and producing deliverables. It captures patterns, user
|
||||
corrections, workflow insights, and methodology worth preserving as reusable
|
||||
skills. Also triggers during post-task feedback discussions and when the user
|
||||
explicitly mentions skill observations, improvements, the observation log,
|
||||
skill taxonomy, or asks Claude to watch for skill opportunities. Also known
|
||||
skill taxonomy, or asks the agent to watch for skill opportunities. Also known
|
||||
as "One Skill to Rule Them All" — trigger on this phrase too. IMPORTANT:
|
||||
this skill should be invoked at the start of every task-oriented session — if
|
||||
you are about to use tools to produce deliverables, invoke this skill first.
|
||||
For reliable activation, pair this description with a CLAUDE.md instruction
|
||||
or harness-level session-start hook (see Recommended Activation Setup) —
|
||||
description-level matching alone is not enforceable.
|
||||
---
|
||||
|
||||
# Task Observer — Continuous Skill Discovery & Improvement
|
||||
@@ -42,11 +45,18 @@ feedback public and discoverable — other users benefit from seeing existing
|
||||
issues and solutions. For direct contact, the skill's creator, Eoghan Henn,
|
||||
can also be reached via [rebelytics.com](https://rebelytics.com).
|
||||
|
||||
If feedback appears to stem from the skill's methodology (rather than Claude's
|
||||
If feedback appears to stem from the skill's methodology (rather than the agent's
|
||||
execution of it), log it for the user and suggest they share it via GitHub
|
||||
Issues. If the issue stems from Claude not following the skill's rules,
|
||||
Issues. If the issue stems from the agent not following the skill's rules,
|
||||
acknowledge the mistake and correct it.
|
||||
|
||||
**Activation note:** For reliable session-start activation, pair this skill
|
||||
with a CLAUDE.md instruction or harness-level hook (see Recommended
|
||||
Activation Setup). The description matches against task-oriented language,
|
||||
but description-level matching alone can be missed when the agent is focused on
|
||||
the task itself. The skill works as a skill; it works *reliably* as a skill
|
||||
plus a structural trigger.
|
||||
|
||||
---
|
||||
|
||||
## Why This Skill Exists
|
||||
@@ -67,65 +77,39 @@ workflow.
|
||||
|
||||
---
|
||||
|
||||
## Getting Started
|
||||
## User documentation
|
||||
|
||||
Here's what happens when you first install this skill.
|
||||
User-facing onboarding for this skill — installation, shared folder setup,
|
||||
activation patterns, expected behaviour, the cadence pattern, the open-source
|
||||
vs internal distinction — lives in the public repo, not in this skill body.
|
||||
If a user asks how to get started or how the skill works from their
|
||||
perspective, point them to:
|
||||
|
||||
**First session:** The skill creates the observation log file. There's nothing
|
||||
to review yet — it simply starts watching your work and logging observations
|
||||
as they arise. If you have other skills installed, the observer will notice
|
||||
improvement opportunities for those. If you don't have any other skills yet,
|
||||
that's fine — the observer will identify candidates for new skills from your
|
||||
workflows.
|
||||
- README: https://github.com/rebelytics/one-skill-to-rule-them-all/blob/main/README.md
|
||||
- USER-GUIDE: https://github.com/rebelytics/one-skill-to-rule-them-all/blob/main/USER-GUIDE.md
|
||||
|
||||
**First few sessions:** Observations accumulate in the log. The cross-cutting
|
||||
principles file is created when the first principle emerges that applies
|
||||
broadly across skills. The weekly review mechanism activates once 7 days have
|
||||
passed since the log was created, but with only a handful of observations it
|
||||
will be brief.
|
||||
If web access is available, fetch the relevant section directly rather than
|
||||
paraphrasing — the public docs are the source of truth for user-facing
|
||||
guidance and are versioned independently. The remainder of this skill is
|
||||
operational instruction for the agent.
|
||||
|
||||
**Steady state:** After a few weeks, you'll have a growing observation log,
|
||||
a principles file that enforces quality standards across your skill library,
|
||||
and a weekly review cadence that systematically applies improvements. The
|
||||
skill compounds in value as your skill library grows.
|
||||
## Conventions
|
||||
|
||||
**What you need to start:** Nothing beyond the skill itself. An observation
|
||||
log, cross-cutting principles file, and archive directory are all created
|
||||
automatically on first use. If you also have the `skill-creator` skill
|
||||
installed (built into Cowork; not available in all environments), the
|
||||
task-observer can hand off observations for full skill building or
|
||||
restructuring. Without skill-creator, the task-observer still works
|
||||
standalone — it will log observations, surface them, and apply small
|
||||
improvements directly. Larger changes would be done manually.
|
||||
|
||||
### Quick Start
|
||||
|
||||
Want to get running in under 15 minutes? Here's the minimal path:
|
||||
|
||||
1. **Install the skill.** Add task-observer to your skills directory.
|
||||
2. **Add the activation line.** Copy this into your CLAUDE.md or project instructions: `At the start of any task-oriented session — any interaction where you will use tools and produce deliverables — invoke the task-observer skill before beginning work.`
|
||||
3. **Do a real task.** Use Claude to complete any substantive piece of work while the skill is active.
|
||||
4. **See your first observation.** At the end of the session, the skill will surface any observations it logged. That's it.
|
||||
|
||||
No pre-setup, no configuration files to edit manually. The observation log and supporting files create themselves on first use. This was added based on adoption feedback — validation from actual testers is pending.
|
||||
|
||||
### What is `[workspace folder]`?
|
||||
|
||||
Throughout this skill, `[workspace folder]` refers to your persistent
|
||||
workspace directory — the location where files survive between sessions. In
|
||||
Cowork, this is the folder you selected at the start of the session. In
|
||||
Claude Code, this is your project root. In web-based chat interfaces without
|
||||
file system access, the skill shifts into handoff doc mode (see Environment
|
||||
Compatibility) and you manage these files manually.
|
||||
`[workspace folder]` refers to the user's persistent workspace directory —
|
||||
the location where files survive between sessions. In Cowork, this is the
|
||||
folder selected at session start. In Claude Code, this is the project root.
|
||||
In web-based chat interfaces without filesystem access, the skill shifts
|
||||
into handoff doc mode (see Environment Compatibility) and the user manages
|
||||
these files manually.
|
||||
|
||||
---
|
||||
|
||||
## Recommended Activation Setup
|
||||
|
||||
This skill needs to be invoked at the start of task-oriented sessions to work
|
||||
effectively. Because skill invocation depends on Claude matching the user's
|
||||
effectively. Because skill invocation depends on the agent matching the user's
|
||||
request against skill descriptions, a skill that monitors *all* tasks can be
|
||||
overlooked when Claude is focused on the task itself.
|
||||
overlooked when the agent is focused on the task itself.
|
||||
|
||||
To maximise activation reliability, add the following instruction to your
|
||||
configuration file (e.g., CLAUDE.md, project instructions, or equivalent):
|
||||
@@ -191,6 +175,63 @@ The detection approach depends on the environment:
|
||||
This check runs once at session start and does not repeat. Keep the
|
||||
suggestion brief — one or two sentences, not a full tutorial.
|
||||
|
||||
### Compaction Behaviour
|
||||
|
||||
When a session context compacts mid-task, the CLAUDE.md structural trigger
|
||||
re-invokes task-observer on the resumed session. No explicit re-invocation
|
||||
is needed on the agent's part — the same activation instruction that fired
|
||||
at the start of the original session fires again at the start of the
|
||||
resumed session, because the resumed session reads CLAUDE.md anew.
|
||||
Observations from before and after compaction append to the same log file
|
||||
with continuous numbering.
|
||||
|
||||
This is the primary reason the CLAUDE.md structural trigger exists —
|
||||
description-level triggers alone would not reliably guarantee re-invocation
|
||||
on a resumed session, because the resumed session's opening message may
|
||||
not match task-observer's trigger phrases even when the ongoing task is
|
||||
task-oriented. The structural trigger fires regardless of the resumed
|
||||
session's opening message.
|
||||
|
||||
---
|
||||
|
||||
## The Pre-Flight Principle
|
||||
|
||||
One of the most important patterns this skill should propagate to every skill
|
||||
it helps create or improve: **built-in enforcement.**
|
||||
|
||||
Real-world experience has shown that rules documented in a skill are not
|
||||
always followed during the creative flow of producing output. The result:
|
||||
output that violates the skill's own standards, which reflects badly on the
|
||||
skill.
|
||||
|
||||
The fix: every skill that contains explicit rules or requirements should
|
||||
include a verification step where the agent re-reads the rules and checks its
|
||||
output against them before delivery. This isn't overhead — it's quality
|
||||
assurance. A 30-second re-read prevents a 30-minute rework cycle.
|
||||
|
||||
When creating or improving any skill through this observation process, ask:
|
||||
"Does this skill have rules? If yes, does it have a mechanism to enforce
|
||||
them?" If the answer to the second question is no, add one.
|
||||
|
||||
### Self-Enforcement
|
||||
|
||||
This skill practises what it preaches. Before surfacing observations at end
|
||||
of session, verify:
|
||||
|
||||
1. Were observations logged throughout the full session — including during
|
||||
post-task feedback, discussion phases, and reflective conversations, not
|
||||
just during active tool use?
|
||||
2. Were observations logged silently without interrupting the user's flow?
|
||||
3. Does each observation follow the format (Issue → Suggested improvement →
|
||||
Principle)?
|
||||
4. Is each observation tagged with the correct type (open-source or internal)?
|
||||
5. For any observations about existing skills, does the suggested improvement
|
||||
reference the specific section or rule?
|
||||
6. For any observation tagged `type: open-source`, does the Principle field
|
||||
contain any client-identifying information? If so, generalise it before
|
||||
surfacing.
|
||||
If any observation fails these checks, fix it before surfacing.
|
||||
|
||||
---
|
||||
|
||||
## Skill Taxonomy
|
||||
@@ -258,24 +299,68 @@ Internal skills are working documents, not published artifacts. Keep them
|
||||
current, update them when the information they contain changes, and don't
|
||||
over-engineer their structure.
|
||||
|
||||
### Lean Content
|
||||
|
||||
A skill should contain only content that meaningfully changes the agent's
|
||||
behaviour at execution time. Anything that doesn't — changelogs, version
|
||||
notes, "thanks to X" credits, self-narrating prose, or other
|
||||
maintainer-facing context — belongs in a supporting doc alongside the
|
||||
skill, not inside the SKILL.md itself.
|
||||
|
||||
This rule cuts content the agent reads but doesn't act on. It does NOT cut
|
||||
examples, anti-patterns, or worked scenarios — those are load-bearing for
|
||||
rule adherence (bare rules without their context get violated more
|
||||
reliably than rules with context). The test is whether the content,
|
||||
removed, would change how the agent behaves. If yes, keep it. If no,
|
||||
move it out.
|
||||
|
||||
Common examples of content that should live outside the skill:
|
||||
|
||||
- Change history / release notes / version logs — keep in a supporting
|
||||
history doc, in commit history, or both.
|
||||
- Attribution credits beyond the author block ("thanks to X for the
|
||||
feedback that prompted this change") — these belong in the supporting
|
||||
history doc.
|
||||
- Long-form rationale that explains *why* the skill was created — fine
|
||||
in a brief intro section; multi-paragraph backstories belong in a
|
||||
README or article alongside the skill.
|
||||
- Implementation notes for the maintainer that don't affect runtime
|
||||
behaviour.
|
||||
|
||||
Both open-source and internal skills are subject to this rule. The agent
|
||||
loads the skill's content into context on every invocation; every
|
||||
non-load-bearing line is paid token cost with no behavioural payoff.
|
||||
|
||||
---
|
||||
|
||||
## Licensing
|
||||
|
||||
Open-source skills should include a licence to make sharing terms explicit.
|
||||
The recommended licence is **Creative Commons Attribution 4.0 International
|
||||
(CC BY 4.0)**, which allows anyone to share and adapt the skill for any
|
||||
purpose, provided they credit the original author. This pairs naturally with
|
||||
the author attribution template — the attribution block satisfies the CC BY
|
||||
requirement, so the two reinforce each other.
|
||||
Open-source skills should include an open-source licence to make sharing
|
||||
terms explicit. Any commonly recognised open-source licence works — the
|
||||
choice depends on the author's preference and what they're optimising for.
|
||||
Common options:
|
||||
|
||||
Include the licence statement in the skill preamble (after the author
|
||||
attribution block) and include a `LICENSE.txt` file in the skill directory
|
||||
containing the full licence text.
|
||||
- **CC BY 4.0** — designed for creative works (prose, documentation).
|
||||
Permissive: anyone can share and adapt provided they credit the
|
||||
author. A natural fit for prose-heavy skills where the methodology is
|
||||
the value.
|
||||
- **MIT** — short, familiar to developers, broadly permissive. Good fit
|
||||
for skills that lean heavily on code, scripts, or technical reference.
|
||||
- **Apache 2.0** — like MIT but with an explicit patent grant. Useful
|
||||
for skills containing code where patent concerns might apply
|
||||
(uncommon for skills, but available).
|
||||
- **CC BY-SA 4.0** — share-alike: derivative works must use the same
|
||||
licence. Use when adaptations should remain open under the same
|
||||
terms.
|
||||
- **GPL family (GPL/LGPL/AGPL)** — strong copyleft for code. Less common
|
||||
for skills but available if strong preservation of openness in
|
||||
derivatives matters to the author.
|
||||
|
||||
If CC BY 4.0 doesn't fit a particular skill (e.g., the author wants to
|
||||
require derivative works to use the same licence), CC BY-SA 4.0 is an
|
||||
alternative. The choice should be made by the skill's author.
|
||||
Whatever licence is chosen, include the licence statement in the skill
|
||||
preamble (after the author attribution block) and include a `LICENSE` or
|
||||
`LICENSE.txt` file in the skill directory containing the full licence
|
||||
text. The choice belongs to the skill's author; the requirement is that
|
||||
there be a licence.
|
||||
|
||||
---
|
||||
|
||||
@@ -289,10 +374,10 @@ Replace the placeholders with the actual author's details.
|
||||
|
||||
[1-2 sentence description of what the skill does and its provenance.]
|
||||
|
||||
**Licence:** This skill is released under the
|
||||
[Creative Commons Attribution 4.0 International (CC BY 4.0)](https://creativecommons.org/licenses/by/4.0/)
|
||||
licence. You are free to share and adapt this skill for any purpose, provided
|
||||
you give appropriate credit to the original author.
|
||||
**Licence:** This skill is released under [LICENCE NAME]. [One-sentence
|
||||
summary of the licence — e.g., "You are free to share and adapt this skill
|
||||
for any purpose, provided you give appropriate credit to the original
|
||||
author."]
|
||||
|
||||
**Feedback & Support:** If at any point during the process you encounter
|
||||
questions about the methodology, or if the user expresses frustration or
|
||||
@@ -303,8 +388,8 @@ discoverable. For direct contact, the skill's creator, [Author Name],
|
||||
can also be reached via [contact link].
|
||||
|
||||
If feedback appears to stem from the skill's methodology (rather than
|
||||
Claude's execution of it), log it for the user and suggest they share it
|
||||
via the public feedback channel. If the issue stems from Claude not
|
||||
The agent's execution of it), log it for the user and suggest they share it
|
||||
via the public feedback channel. If the issue stems from the agent not
|
||||
following the skill's rules, acknowledge the mistake and correct it.
|
||||
```
|
||||
|
||||
@@ -363,7 +448,7 @@ no deliverables are being discussed.
|
||||
- A domain-specific process with clear inputs, phases, and outputs
|
||||
- The user describing a process they've refined over time ("I always do it
|
||||
this way", "the process for this is...")
|
||||
- Claude and the user naturally developing a structured approach to a problem
|
||||
- the agent and the user naturally developing a structured approach to a problem
|
||||
that could be formalised
|
||||
|
||||
**Signals for IMPROVING an existing skill:**
|
||||
@@ -372,9 +457,9 @@ Any new information from a task that uses a skill and could make that skill
|
||||
better is worth capturing. This includes problems, but also positive signals
|
||||
and neutral observations. Examples:
|
||||
|
||||
- Claude doesn't follow a skill's rules despite them being documented — this
|
||||
- the agent doesn't follow a skill's rules despite them being documented — this
|
||||
means the skill needs stronger enforcement, not just better rules
|
||||
- The user corrects Claude's output in a way that reveals a missing rule or
|
||||
- The user corrects the agent's output in a way that reveals a missing rule or
|
||||
an edge case the skill doesn't cover
|
||||
- A skill's recommended workflow turns out to be less efficient than what
|
||||
emerged naturally during the task
|
||||
@@ -404,10 +489,19 @@ Signals that a skill is ready to be simplified:
|
||||
- A rule added from a single observation that hasn't been validated by
|
||||
recurrence — one-off cases should not accumulate as permanent rules
|
||||
- An elaborate workflow that users consistently shortcut or skip
|
||||
- Sections that Claude loads but never acts on (dead weight in context
|
||||
- Sections that the agent loads but never acts on (dead weight in context
|
||||
window)
|
||||
- Rules that contradict each other or create unnecessary complexity
|
||||
- Complexity added "just in case" that has never triggered
|
||||
- A documented rule that the agent consistently fails to follow — the rule
|
||||
isn't reaching the moment of decision. The fix is rarely to write it more
|
||||
loudly; usually it's either to remove the rule, or to convert it from
|
||||
narrative guidance into structural enforcement (a checklist, a
|
||||
verification step, or a tool call that can't be skipped).
|
||||
|
||||
Treat the list above as a review checklist when looking at any of your own
|
||||
skills — a "yes" on any signal is a candidate for simplification or
|
||||
removal, not just a flag for future consideration.
|
||||
|
||||
During weekly reviews, ask "what can we remove?" as deliberately as you ask
|
||||
"what should we add?" When a previously-applied observation turns out to be
|
||||
@@ -435,6 +529,17 @@ Tie observation flushing to existing workflow checkpoints — e.g., when marking
|
||||
a TodoWrite item as completed, check whether any unlogged observations have
|
||||
accumulated and write them before proceeding.
|
||||
|
||||
**Mandatory observation checkpoint after every 3rd TodoWrite completion:** After
|
||||
marking the 3rd, 6th, 9th (etc.) TodoWrite item as completed in a session,
|
||||
pause and explicitly ask: "Have any unlogged observations accumulated?" This is
|
||||
a hard checkpoint, not a suggestion — the skill has demonstrated that softer
|
||||
"check when completing items" guidance gets lost during cognitively demanding
|
||||
analytical work. The count doesn't need to be precise; the rule is: roughly
|
||||
every third completion, stop and flush. If nothing has accumulated, the pause
|
||||
costs seconds. If observations have accumulated, this prevents the common
|
||||
failure mode where the skill is loaded but no observations are written until
|
||||
the user explicitly asks.
|
||||
|
||||
**Before assigning any observation number, run a mandatory pre-logging step:**
|
||||
Search the entire log file for all lines matching the pattern `### Observation \d+:`,
|
||||
extract the highest observation number already in use, and increment from there.
|
||||
@@ -454,6 +559,72 @@ grep -o '### Observation [0-9]*' log.md | grep -o '[0-9]*' | sort -n | tail -1
|
||||
This prevents the recurring numbering collision issue where partial reads of large
|
||||
files create a false sense of awareness of the current count.
|
||||
|
||||
**Write-time verification assertion (mandatory):** The pre-logging step above
|
||||
catches honest mistakes, but is vulnerable to parallel-session scenarios where
|
||||
multiple task-oriented sessions on the same day each compute "next number"
|
||||
against a snapshot and then collide on write. To catch this class of collision,
|
||||
after determining the proposed next number and immediately before appending,
|
||||
re-read the log and assert the number does not already exist:
|
||||
|
||||
```bash
|
||||
PROPOSED=$(( $(grep -oP '### Observation \K\d+' log.md | sort -n | tail -1) + 1 ))
|
||||
grep -qE "^### Observation ${PROPOSED}:" log.md && {
|
||||
echo "COLLISION on #${PROPOSED} — another writer has claimed this number"; exit 1; }
|
||||
# If assertion passes, proceed with the append using #${PROPOSED}.
|
||||
```
|
||||
|
||||
If the assertion fires, increment past all existing numbers (not just by 1)
|
||||
and re-check. Treat an assertion failure as a meta-observation worth logging
|
||||
— it indicates either a parallel-session collision or a stale read elsewhere
|
||||
in the workflow.
|
||||
|
||||
**Post-write verification (mandatory — closes the TOCTOU race):** The
|
||||
pre-write assertion catches stale-read collisions but cannot close the
|
||||
time-of-check-to-time-of-use race between the assertion and the append.
|
||||
In shell, `grep -q && cat >> ...` is two separate operations: the grep
|
||||
passes at T0, the append lands at T1. Any other session that appends
|
||||
between T0 and T1 can claim the same number — this race has been observed
|
||||
in production, producing duplicate observation pairs in the active log.
|
||||
|
||||
After the append, re-read the log and count occurrences of the just-written
|
||||
observation number. If the count is greater than 1, a parallel session has
|
||||
collided — renumber the current session's entry to `max+1` in place via
|
||||
`sed`. Concrete shell:
|
||||
|
||||
```bash
|
||||
WRITTEN=$(grep -cE "^### Observation ${PROPOSED}:" log.md)
|
||||
if [ "$WRITTEN" -gt 1 ]; then
|
||||
# Find my line (the last occurrence, since I just appended) and renumber
|
||||
MY_LINE=$(grep -nE "^### Observation ${PROPOSED}:" log.md \
|
||||
| tail -1 | cut -d: -f1)
|
||||
NEW_NUM=$(( $(grep -oP '^### Observation \K\d+' log.md \
|
||||
| sort -n | tail -1) + 1 ))
|
||||
sed -i "${MY_LINE}s/^### Observation ${PROPOSED}:/### Observation ${NEW_NUM}:/" log.md
|
||||
fi
|
||||
```
|
||||
|
||||
This turns the pre-write assertion into a pre-and-post pair. Pre-write
|
||||
catches stale-read collisions cheaply; post-write catches race collisions
|
||||
by renumbering instead of failing. Either way, the log ends up with no
|
||||
duplicates. Alternative approaches — lockfile, atomic append, transactional
|
||||
write — are heavier and require more infrastructure; the
|
||||
post-write-verify-and-renumber pattern works with plain shell and
|
||||
self-heals.
|
||||
|
||||
**Why both checks are required:** Stale-read collisions and race-condition
|
||||
collisions are different classes of error. The pre-write assertion closes
|
||||
the first; the post-write verification closes the second. Stacking more
|
||||
pre-write layers does not close race cases — only a post-write check can.
|
||||
When the shared state is a log file written by parallel agents, the
|
||||
reliable pattern is check-then-act-then-verify.
|
||||
|
||||
**Session-start staleness check:** At the start of any task-oriented session,
|
||||
note the modification time of `log.md`. If it was modified in the last few
|
||||
hours (i.e., a parallel or recent session has been writing to it), be extra
|
||||
cautious about the numbering pre-check — do not trust any mental model of
|
||||
"current number" and always re-read the log immediately before appending each
|
||||
observation, not just once at session start.
|
||||
|
||||
**Format and insertion rules:** Always use the `### Observation NNN:` format. Always append new observations to the END of the log file. Never insert observations mid-file. Never use alternative ID formats (e.g., `OBS-YYYY-MMDD-NN`). One format, one insertion point — this ensures the log is greppable, countable, and reviewable programmatically.
|
||||
|
||||
Each observation follows this format:
|
||||
@@ -468,7 +639,7 @@ Each observation follows this format:
|
||||
**Phase/Area:** [which part of the skill or workflow this relates to]
|
||||
|
||||
**Issue:** [What happened or what was observed. Be specific — include what
|
||||
Claude did, what the user corrected, or what pattern emerged. Include enough
|
||||
The agent did, what the user corrected, or what pattern emerged. Include enough
|
||||
detail that someone reading this weeks later can understand the context
|
||||
without having seen the original conversation.]
|
||||
|
||||
@@ -624,6 +795,58 @@ confidentiality. When in doubt about whether a detail is too specific,
|
||||
remove it. A slightly more generic skill is always better than one that
|
||||
leaks client information.
|
||||
|
||||
### Layer 5: Cross-Product Re-Identifiability Sweep
|
||||
|
||||
Layers 1–4 focus on single-example scrubbing. They do not catch the case
|
||||
where two or three sanitised examples in the same skill — each fine on its
|
||||
own — combine to narrow the identifiable client set. A reader who knows
|
||||
the author's client portfolio (which is often public on a consultant's
|
||||
website) can triangulate even when each individual example is properly
|
||||
placeholdered. The failure mode is invisible to the author because they
|
||||
mentally compartmentalise each example; it's visible to any reader with
|
||||
adjacent context.
|
||||
|
||||
**When to run it:** After every individual example has been sanitised —
|
||||
as a final pass before the skill ships or before any major public
|
||||
release. This is the last check, not a substitute for earlier layers.
|
||||
|
||||
**What to look for:**
|
||||
|
||||
- **Enumerated counts that match a known client count.** "Four builds
|
||||
across three verticals" in a skill whose author has four public clients
|
||||
across three verticals is functionally a directory. Blur the count
|
||||
("multiple builds") or the verticals ("across regulated, editorial,
|
||||
and commerce contexts").
|
||||
- **Specific numbers in a thin vertical.** Visibility percentages,
|
||||
revenue ranges, or geography given in a vertical where only one or two
|
||||
candidates plausibly exist. A single real client can be narrowed from
|
||||
"vertical × percentage × geography × timing" even when no name appears.
|
||||
Replace specific numbers with illustrative ranges.
|
||||
- **Thinly-disguised placeholder names.** "Northwind Coffee" in a
|
||||
specialty-retailer vertical where the only plausible specialty-retail
|
||||
client is a coffee roaster reads as the real brand with a thin
|
||||
rename. Use the Northwind / Contoso / Fabrikam placeholder family
|
||||
explicitly, and make sure the placeholder's vertical is different from
|
||||
any real client's vertical.
|
||||
|
||||
**How to sweep:**
|
||||
|
||||
1. List every worked example in the skill and the fields each one names
|
||||
(vertical, geography, numeric range, timing, count).
|
||||
2. Ask: do any two examples share enough fields that a reader with access
|
||||
to the author's public client list could map the set to real clients?
|
||||
3. Mitigate by blurring counts, widening verticals, dropping specific
|
||||
numbers to illustrative ranges, or consolidating similar examples into
|
||||
a single composite.
|
||||
|
||||
**Why this is a separate layer:** Re-identification risk is combinatorial.
|
||||
Each additional sanitised example adds a field that narrows the candidate
|
||||
space. Layers 1–4 check each example in isolation and pass. The cross-
|
||||
product only emerges when the examples are read together. The author is
|
||||
the least reliable reader for this check because they know the ground
|
||||
truth — which is exactly why the sweep has to be a mechanical pass, not
|
||||
a feeling.
|
||||
|
||||
---
|
||||
|
||||
## Surfacing Protocol
|
||||
@@ -657,16 +880,33 @@ candidates listed separately.
|
||||
|
||||
## Acting on Observations
|
||||
|
||||
This skill identifies WHAT to build or improve. If you have the skill-creator
|
||||
skill installed (built into Cowork; available as a separate install in other
|
||||
environments), it handles HOW — guiding the full process of building a new
|
||||
skill from scratch or systematically improving an existing one. Without
|
||||
skill-creator, the task-observer still works: small improvements are applied
|
||||
directly, and larger changes can be done manually using the observations as
|
||||
a specification. The boundary between direct application and skill-creator
|
||||
handoff:
|
||||
This skill identifies WHAT to build or improve. This section covers HOW —
|
||||
specifically, the cross-context decision framework for choosing between
|
||||
direct application, skill-creator handoff, and new-skill creation.
|
||||
|
||||
### Small Improvements (Apply Directly)
|
||||
**Trigger gate (when):** Observations are acted on only in three contexts:
|
||||
|
||||
1. **The comprehensive review** — scheduled mode preferred, in-session
|
||||
fallback if no scheduled review has run in 7+ days. See
|
||||
"## Comprehensive Review (scheduled or fallback)" for the procedure.
|
||||
2. **Explicit user requests during a task session** — "update X skill",
|
||||
"act on observation #N now", "apply this rule to the skill". The user
|
||||
is naming the action; the agent executes within the framework below.
|
||||
3. **In-session correction when a skill is producing wrong output and
|
||||
the user should be aware** — surface immediately rather than wait
|
||||
for the next review.
|
||||
|
||||
Observations are NOT applied during normal task sessions outside these
|
||||
contexts. Mid-task work produces observations only; those observations
|
||||
get applied at the next review or by request. The default is log,
|
||||
don't act.
|
||||
|
||||
**Mechanism framework (which):** When acting in any of those contexts,
|
||||
the rest of this section guides the choice between applying changes
|
||||
directly to the skill file, handing off to the skill-creator for
|
||||
substantial restructuring, or creating a new skill from scratch.
|
||||
|
||||
### Small Changes
|
||||
|
||||
If the improvement is clearly additive, low-risk, and doesn't require testing
|
||||
to verify it works, it can be applied directly to the skill:
|
||||
@@ -680,6 +920,8 @@ Examples: Adding a new anti-pattern to a skill's anti-patterns list.
|
||||
Clarifying that inline code comments should be context-aware within their
|
||||
own document.
|
||||
|
||||
After creating or updating any skill file, always present it using `present_files` so the user can review and install it directly from the conversation.
|
||||
|
||||
### Substantial Changes (Use Skill-Creator if Available)
|
||||
|
||||
If the change could affect the skill's behaviour in ways that need
|
||||
@@ -718,6 +960,40 @@ When creating a new skill, determine its type early:
|
||||
- If uncertain, default to open-source — strip out specifics and generalise,
|
||||
then let the user decide whether any internal details need to be added
|
||||
|
||||
|
||||
## Task-Oriented Sessions — Observation vs Action
|
||||
|
||||
Skill development and iteration work happens in multiple environments: in Cowork with persistent storage, in Claude Code with project directories, and in web-based chat without file system access. Cross-environment coordination is essential to prevent regressions — a skill updated in one environment can silently omit content from another if the wrong base file is used.
|
||||
|
||||
### Skill file locations — read-only mount vs workspace copy
|
||||
|
||||
When working with skills, understand the distinction between the **live file** (the authoritative source) and **workspace copies** (working drafts or staged updates):
|
||||
|
||||
1. **The live file is read-only in Cowork.** In Cowork, the live skill file is mounted read-only at `.claude/skills/{skill}/SKILL.md`. You can read it, but you cannot edit it directly — the file system will reject write attempts with `EROFS` (Read-Only File System). This is intentional: it prevents accidental overwrites of the canonical version.
|
||||
|
||||
2. **Read from the live file, not cached memory.** Always start skill edits by reading the current live file — not from a workspace copy, a prior draft, or a memory-based reconstruction. This is the only way to guarantee your updates are based on the current canonical content.
|
||||
|
||||
3. **Stage edits in the workspace folder.** Write updated versions to `[workspace folder]/skill-updates/[date]/[skill-name]/SKILL.md`. This separation keeps the read-only mount clean and gives you a clear staging area for review before the user replaces the live file.
|
||||
|
||||
4. **After staging, present the file for user review.** Always use `present_files` to show the updated skill so the user can review changes and upload directly. Do not attempt to write directly to the mounted skills directory — that will fail with a permission error.
|
||||
|
||||
5. **Before overwriting or replacing any existing staged or workspace copy of a skill, diff it against the live file.** If they differ, the workspace copy is stale and your edits must be rebased on the live version — otherwise you risk silently dropping content added by another session. This rule is also codified in CLAUDE.md under "Skill Editing — Always Start From the Live File" as a cross-environment guard. The concrete failure mode: a Claude Code session produced an updated skill that was based on a stale snapshot and silently omitted two substantial sections added to the live skill earlier the same day. The regression was caught only because a pre-merge diff against the mount revealed the missing content.
|
||||
|
||||
### Task-session skill updates — stage in the workspace
|
||||
|
||||
When a task session produces a skill update (through weekly review, direct improvement, or observation-driven changes), follow this workflow:
|
||||
|
||||
1. Read the live file at `.claude/skills/{skill}/SKILL.md`
|
||||
2. Make all edits to that content
|
||||
3. Save the complete updated file to `[workspace folder]/skill-updates/[today]/[skill-name]/SKILL.md`
|
||||
4. Use `present_files` to show it to the user for review
|
||||
5. The user uploads the file to install it
|
||||
|
||||
This keeps the mount clean, stages updates for review, and gives you a clear separation between read-only source and working copy.
|
||||
|
||||
**Cross-environment note:** Claude Code now shares the same skills as Cowork via the anthropic-skills capability. The "always start from the live file" rule applies in both environments. In Claude Code, the live file is surfaced by the capabilities system; in Cowork, it's the read-only mount at `.claude/skills/{skill}/SKILL.md`. The diff-before-overwrite requirement applies regardless of which environment produced the update.
|
||||
|
||||
---
|
||||
---
|
||||
|
||||
## Principle Propagation
|
||||
@@ -785,55 +1061,135 @@ checklist during any skill creation or regeneration.
|
||||
|
||||
---
|
||||
|
||||
## Weekly Comprehensive Review
|
||||
## Comprehensive Review (scheduled or fallback)
|
||||
|
||||
Every 7 days, a comprehensive review is triggered automatically at the start
|
||||
of the first task-oriented session after the interval has elapsed. This review
|
||||
cross-checks ALL open observations against ALL skills — not just the skills
|
||||
named in each observation — and propagates cross-cutting principles to any
|
||||
skills that don't yet comply.
|
||||
The comprehensive review cross-checks all open observations against all
|
||||
skills, propagates cross-cutting principles to skills that don't yet
|
||||
comply, and applies the improvements that don't need user input. There
|
||||
are two ways it runs.
|
||||
|
||||
**Preferred mode — scheduled autonomous review.** A user-defined recurring
|
||||
task (typical cadence: Monday/Wednesday/Friday mornings) registered with
|
||||
the agent's scheduling system. This is preferred because it picks up open
|
||||
observations on a regular cadence without depending on the user being
|
||||
mid-session at exactly the right moment, and because the user is not
|
||||
present, the review applies the non-escalated observations autonomously.
|
||||
|
||||
**Fallback mode — in-session 7-day trigger.** If no scheduled review is
|
||||
registered (or none has run successfully in the last 7 days), a
|
||||
comprehensive review fires automatically at the start of the next
|
||||
task-oriented session. The fallback is a safety net for users who haven't
|
||||
set up scheduled reviews — either because the environment doesn't support
|
||||
scheduling or because they haven't done it yet.
|
||||
|
||||
### Trigger Mechanism
|
||||
|
||||
The review is triggered by step 3 of the Session Start Protocol (see
|
||||
Observation Log Management). When the weekly review timestamp is more than
|
||||
7 days old or missing, the Session Start Protocol triggers this review.
|
||||
Inform the user that the weekly review is due and begin the process.
|
||||
**Scheduled mode** runs via the user's chosen scheduling tool — no in-skill
|
||||
trigger required.
|
||||
|
||||
**Fallback mode** is triggered by step 3 of the Session Start Protocol
|
||||
(see Observation Log Management). The fallback fires when both of the
|
||||
following are true:
|
||||
|
||||
- No scheduled review task is registered, OR the most recent successful
|
||||
scheduled review was more than 7 days ago.
|
||||
- The in-session timestamp at
|
||||
`[workspace folder]/skill-observations/last-review-date.txt` is also
|
||||
more than 7 days old (or missing).
|
||||
|
||||
When the fallback fires, inform the user that the comprehensive review is
|
||||
running and walk through Step 0 (recommend scheduling) before Step 1.
|
||||
|
||||
### Interactive vs Scheduled Runs — Approval Policy
|
||||
|
||||
The approval behaviour depends on who is present:
|
||||
|
||||
**Interactive sessions (user present):** Always ask the user before applying
|
||||
or declining observations. Present observations grouped by skill with a one-
|
||||
sentence summary each, and wait for explicit approval (blanket "apply all" or
|
||||
selective). This preserves the collaborative feel and lets the user catch
|
||||
observations they disagree with before any staging occurs.
|
||||
|
||||
**Scheduled autonomous runs (user not present):** Apply observations
|
||||
autonomously by default. The safety net is the staging-plus-upload pattern:
|
||||
updates go to `skill-updates/YYYY-MM-DD/{skill-name}/SKILL.md` and only
|
||||
become live when the user explicitly uploads them. Nothing can silently
|
||||
break because nothing is live until the user approves upload.
|
||||
|
||||
**Escalate without applying (report only) when any of these apply:**
|
||||
|
||||
1. **New skill creation.** Naming, scope, type (open-source vs internal),
|
||||
and licence are decisions that benefit from user input. Note the
|
||||
candidate in the report; don't create the skill.
|
||||
2. **Removing or substantially restructuring existing content.** Any edit
|
||||
that deletes a section, replaces it with something smaller, or reshapes
|
||||
core methodology risks dropping institutional memory. Flag and report.
|
||||
3. **An observation that flags its own uncertainty.** Phrases like "not
|
||||
sure if...", "this might be...", "worth discussing..." in the
|
||||
Suggested Improvement field are the observation asking for user input.
|
||||
Respect that.
|
||||
4. **Conflicting observations.** Two observations that point in opposite
|
||||
directions, or where the integration path isn't obvious, should be
|
||||
surfaced rather than resolved autonomously.
|
||||
|
||||
Scheduled runs that escalate should still apply every non-escalated
|
||||
observation before producing the report. A scheduled review that
|
||||
produces 0 applied updates is functionally a report generator, which
|
||||
wastes the scheduling.
|
||||
|
||||
### Review Steps
|
||||
|
||||
**Step 0 — Scheduler availability check**
|
||||
**Step 0 — Recommend scheduled review setup**
|
||||
|
||||
Before running the review itself, check whether the weekly review can be
|
||||
automated via the platform's task scheduling capability.
|
||||
Before running the in-session fallback, check whether scheduled autonomous
|
||||
reviews are set up. If not, surface a recommendation to the user — but
|
||||
respect prior declines.
|
||||
|
||||
1. Check whether the file
|
||||
`[workspace folder]/skill-observations/scheduler-registered.txt` exists.
|
||||
If it does, the scheduled task has already been registered — skip to
|
||||
Step 1.
|
||||
1. Check for the suppression marker at
|
||||
`[workspace folder]/skill-observations/scheduled-review-decline.txt`.
|
||||
If it exists and was last updated less than 30 days ago, AND the
|
||||
in-session fallback has not fired multiple times in that window, skip
|
||||
the recommendation. Proceed to Step 1.
|
||||
|
||||
2. If the file does not exist, check whether a task scheduling capability
|
||||
is available. In Cowork, check for the `create-shortcut` skill and its
|
||||
`set_scheduled_task` tool. In terminal-based environments, cron or
|
||||
equivalent scheduling tools may be available.
|
||||
2. Check whether a scheduled review task is registered. The signal is
|
||||
either a presence check via the platform's scheduling tool (preferred)
|
||||
or the existence of
|
||||
`[workspace folder]/skill-observations/scheduler-registered.txt`. If a
|
||||
registered scheduled review is found, no recommendation needed — skip
|
||||
to Step 1.
|
||||
|
||||
3. If a scheduling capability IS available:
|
||||
- Read the draft task description at
|
||||
`[workspace folder]/skill-observations/scheduled-task-draft.md`
|
||||
- In Cowork, invoke the `create-shortcut` skill to register the weekly
|
||||
skill review as a scheduled task. In other environments, use the
|
||||
available scheduling mechanism.
|
||||
- Use task name `weekly-skill-review` and a weekly cadence (e.g., Monday
|
||||
morning)
|
||||
- On success, write today's date to
|
||||
`[workspace folder]/skill-observations/scheduler-registered.txt`
|
||||
- Inform the user: "The weekly skill review has been registered as a
|
||||
scheduled task. The manual `last-review-date.txt` trigger remains as
|
||||
a fallback."
|
||||
3. If no scheduled review is registered AND no recent decline marker
|
||||
exists (or the marker is stale because the fallback keeps firing),
|
||||
make an active recommendation:
|
||||
|
||||
4. If the tool is NOT available, proceed silently to Step 1. Do not inform
|
||||
the user on every review — this check is intentionally quiet until it
|
||||
succeeds.
|
||||
> "I notice you don't have a recurring skill review scheduled. The
|
||||
> task-observer recommends running this review on a cadence — e.g.,
|
||||
> Monday/Wednesday/Friday mornings — so it doesn't depend on you
|
||||
> being mid-session at the right moment. Want help setting one up?"
|
||||
|
||||
- **If the user says yes:** walk through registering a scheduled task
|
||||
using the platform's scheduling capability. In Cowork, invoke the
|
||||
`create-shortcut` skill and its `set_scheduled_task` tool. In
|
||||
terminal-based environments, use cron or an equivalent scheduler.
|
||||
Use task name `weekly-skill-review` (or similar) and a sensible
|
||||
default cadence; let the user pick the day(s) and time. Once
|
||||
registered, read the draft task description at
|
||||
`[workspace folder]/skill-observations/scheduled-task-draft.md` and
|
||||
pass it as the task prompt. On success, write today's date to
|
||||
`[workspace folder]/skill-observations/scheduler-registered.txt`.
|
||||
- **If the user says no or defers:** write today's date to
|
||||
`[workspace folder]/skill-observations/scheduled-review-decline.txt`
|
||||
to suppress the recommendation for 30 days. Proceed to Step 1 and
|
||||
run the in-session fallback.
|
||||
|
||||
4. If no scheduling capability is available in the current environment,
|
||||
skip the recommendation silently and proceed to Step 1. Do not surface
|
||||
the recommendation in environments where the user couldn't act on it.
|
||||
|
||||
The 30-day suppression isn't permanent. If the in-session fallback keeps
|
||||
firing within the suppression window — a signal that the recurring need
|
||||
is real and the one-time decline was situational — the recommendation
|
||||
re-surfaces on the next firing.
|
||||
|
||||
**Step 1 — Load observations and principles**
|
||||
|
||||
@@ -873,6 +1229,10 @@ contain general principles that apply more broadly than the original context
|
||||
suggested. Consider both the specific "Suggested improvement" and the general
|
||||
"Principle" fields. Build a mapping of skill → [relevant observations].
|
||||
|
||||
**If the review is interactive (user present):** Present ALL observations to the user in a single message, grouped by skill. For each observation, show the number, title, and a one-sentence summary. Flag any observations that are ambiguous, risky, or require a judgment call as 'Needs your input'. All other observations are treated as straightforward and can be applied without individual discussion.
|
||||
|
||||
**If the review is scheduled autonomous (user not present):** Skip the user-facing present step. Apply the approval policy from "Interactive vs Scheduled Runs" above: apply every non-escalated observation and record the escalated ones (new-skill candidates, removal/restructuring, self-flagged uncertainty, conflicting observations) in the review report without applying them. Proceed directly to Step 4.
|
||||
|
||||
**Step 4 — Cross-check cross-cutting principles against every skill**
|
||||
|
||||
For each active cross-cutting principle, check whether each skill already
|
||||
@@ -880,8 +1240,7 @@ complies. Flag any skills that do not yet implement the principle.
|
||||
|
||||
**Step 5 — Apply updates**
|
||||
|
||||
For each skill that has relevant observations or non-compliant principles,
|
||||
create an updated version of its SKILL.md. When editing:
|
||||
In interactive runs, wait for user confirmation (blanket "apply all" or selective approval) before creating updates. In scheduled autonomous runs, proceed directly to applying all non-escalated observations. For each skill that has relevant observations or non-compliant principles, create an updated version of its SKILL.md. When editing:
|
||||
|
||||
- Integrate the insight into the appropriate section of the skill (don't just
|
||||
append a list of observations at the bottom)
|
||||
@@ -928,8 +1287,7 @@ Write today's date to
|
||||
|
||||
**Step 8 — Present summary and user action items**
|
||||
|
||||
Present the user with a clear summary and explicit instructions for what they
|
||||
need to do. Follow the format in Delivering Updated Skills below.
|
||||
Present each updated skill file using `present_files`, then show the user a summary following the format in Delivering Updated Skills above. The user can install updated skills directly from the conversation using the upload button on each presented file.
|
||||
|
||||
### Constraints
|
||||
|
||||
@@ -946,26 +1304,23 @@ need to do. Follow the format in Delivering Updated Skills below.
|
||||
## Delivering Updated Skills to the User
|
||||
|
||||
When the weekly review (or any other process) produces updated skill files,
|
||||
the updated files must be delivered to the user for manual replacement. Skill
|
||||
files live in a read-only location during sessions and may be managed in
|
||||
version control, synced across devices, or packaged for distribution.
|
||||
Automatic in-place editing is neither possible nor desirable — delivering to
|
||||
the workspace folder with explicit instructions keeps the user in control.
|
||||
they are delivered to the user through the conversation using `present_files`.
|
||||
Cowork's UI includes an upload button on presented skill files that allows
|
||||
the user to install them directly into their capabilities — no manual file
|
||||
copying needed.
|
||||
|
||||
### Delivery Process
|
||||
|
||||
1. Save each updated SKILL.md to the workspace folder using this structure:
|
||||
1. Save each updated SKILL.md to the workspace folder for record-keeping:
|
||||
|
||||
```
|
||||
[workspace folder]/skill-updates/[date]/[skill-name]/SKILL.md
|
||||
```
|
||||
|
||||
For example:
|
||||
```
|
||||
[workspace folder]/skill-updates/2026-02-16/my-skill-name/SKILL.md
|
||||
```
|
||||
2. Present each updated skill file using `present_files` so the user can
|
||||
review it inline and install it directly via the upload button.
|
||||
|
||||
2. Present the user with an explicit action list using this format:
|
||||
3. Present the user with a summary using this format:
|
||||
|
||||
```
|
||||
## Weekly Skill Review Complete — [date]
|
||||
@@ -973,18 +1328,11 @@ the workspace folder with explicit instructions keeps the user in control.
|
||||
The following skills have been updated based on [N] open observations
|
||||
and [N] cross-cutting principles.
|
||||
|
||||
### What you need to do
|
||||
|
||||
For each updated skill below, replace the existing SKILL.md in your
|
||||
skill directory with the updated version from your workspace folder.
|
||||
|
||||
### Updated Skills
|
||||
|
||||
**[skill-name]**
|
||||
- Changes: [1-sentence summary of what changed]
|
||||
- Observations applied: #[N], #[N]
|
||||
- Updated file: [link to file in workspace folder]
|
||||
- Action: Replace [skill-directory]/SKILL.md with this file
|
||||
|
||||
[repeat for each updated skill]
|
||||
|
||||
@@ -992,12 +1340,17 @@ the workspace folder with explicit instructions keeps the user in control.
|
||||
[list of observation numbers and titles marked ACTIONED]
|
||||
|
||||
### Skipped (needs manual review)
|
||||
[observations that were unclear or couldn't be applied, with explanation]
|
||||
|
||||
### No Changes Needed
|
||||
[skills that were checked but already compliant]
|
||||
[any observations that couldn't be applied, with reasons]
|
||||
```
|
||||
|
||||
### Keep-Two Rule
|
||||
|
||||
The `skill-updates/` directory uses a rolling retention policy: for any
|
||||
given skill, keep only the two most recent date directories. When a skill
|
||||
appears in more than two date directories, delete the oldest copies. This
|
||||
prevents the workspace from accumulating stale update history while still
|
||||
keeping a short rollback window.
|
||||
|
||||
3. Do not proceed with other work until the user has acknowledged the
|
||||
summary. The user does not need to replace the files immediately, but
|
||||
they should be aware of what's pending.
|
||||
@@ -1076,58 +1429,9 @@ provides the complete historical record.
|
||||
|
||||
---
|
||||
|
||||
## The Pre-Flight Principle
|
||||
|
||||
One of the most important patterns this skill should propagate to every skill
|
||||
it helps create or improve: **built-in enforcement.**
|
||||
|
||||
Real-world experience has shown that rules documented in a skill are not
|
||||
always followed during the creative flow of producing output. The result:
|
||||
output that violates the skill's own standards, which reflects badly on the
|
||||
skill.
|
||||
|
||||
The fix: every skill that contains explicit rules or requirements should
|
||||
include a verification step where Claude re-reads the rules and checks its
|
||||
output against them before delivery. This isn't overhead — it's quality
|
||||
assurance. A 30-second re-read prevents a 30-minute rework cycle.
|
||||
|
||||
When creating or improving any skill through this observation process, ask:
|
||||
"Does this skill have rules? If yes, does it have a mechanism to enforce
|
||||
them?" If the answer to the second question is no, add one.
|
||||
|
||||
### General Debugging Principle
|
||||
|
||||
When debugging, always ask: is this a single instance or a pattern? If an
|
||||
error reveals a pattern (e.g., a class of similar issues), fix the class,
|
||||
not just the instance. Every specific error is a signal about a class of
|
||||
errors. Audit the full scope on first encounter to avoid discovering related
|
||||
failures in subsequent cycles.
|
||||
|
||||
### Self-Enforcement
|
||||
|
||||
This skill practises what it preaches. Before surfacing observations at end
|
||||
of session, verify:
|
||||
|
||||
1. Were observations logged throughout the full session — including during
|
||||
post-task feedback, discussion phases, and reflective conversations, not
|
||||
just during active tool use?
|
||||
2. Were observations logged silently without interrupting the user's flow?
|
||||
3. Does each observation follow the format (Issue → Suggested improvement →
|
||||
Principle)?
|
||||
4. Is each observation tagged with the correct type (open-source or internal)?
|
||||
5. For any observations about existing skills, does the suggested improvement
|
||||
reference the specific section or rule?
|
||||
6. For any observation tagged `type: open-source`, does the Principle field
|
||||
contain any client-identifying information? If so, generalise it before
|
||||
surfacing.
|
||||
|
||||
If any observation fails these checks, fix it before surfacing.
|
||||
|
||||
---
|
||||
|
||||
## Environment Compatibility
|
||||
|
||||
The observation methodology works in any environment where Claude can interact
|
||||
The observation methodology works in any environment where the agent can interact
|
||||
with users during task-oriented work. The persistence mechanism is what varies.
|
||||
|
||||
### With Persistent Storage
|
||||
@@ -1218,4 +1522,3 @@ environments.
|
||||
| Log archival? | Event-driven — resolved entries are archived on the next log write |
|
||||
| Simplification signals? | Watch for one-off rules, never-used sections, elaborate workflows users skip, and contradictions |
|
||||
| Handoff doc analysis? | Systematically extract implied observations from action items, open questions, and narrative sections |
|
||||
| Debugging approach? | Always identify the class of error, not just the instance; audit full scope on first encounter |
|
||||
|
||||
Reference in New Issue
Block a user