The great big AI LLM thread. GitHub code, blogs & opinions, walkthroughs, trainers & more

RchdR · April 15, 2026, 12:30am

https://efficienist.com/claude-code-may-be-burning-your-limits-with-invisible-tokens-you-cant-see-or-audit/

A developer’s experiment suggests Anthropic may be silently injecting thousands of tokens into Claude Code requests, and users’ usage limits are taking the hit.

ccaudit — Claude Code Token Usage Explorer

ccaudit is a terminal UI for exploring how Claude Code spends your token budget. It reads the JSONL session logs that Claude Code writes to ~/.claude/projects/ and breaks down token usage by session, exchange, and content category.
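As a rough illustration of the idea behind a tool like this (not ccaudit's own code), the sketch below sums token counts from JSONL session logs. The four usage field names are the ones the Anthropic Messages API reports; that they sit under `message.usage` in each log record is an assumption about the log layout, and `summarize_usage` and `summarize_projects` are hypothetical helper names.

```python
import json
from collections import Counter
from pathlib import Path

# Field names as reported by the Anthropic Messages API. Finding them at
# record["message"]["usage"] in the session logs is an ASSUMPTION.
USAGE_KEYS = (
    "input_tokens",
    "output_tokens",
    "cache_creation_input_tokens",
    "cache_read_input_tokens",
)


def summarize_usage(lines):
    """Sum token counts over an iterable of JSONL lines."""
    totals = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip partially written or corrupt lines
        if not isinstance(record, dict):
            continue
        message = record.get("message")
        usage = message.get("usage") if isinstance(message, dict) else None
        if not isinstance(usage, dict):
            continue  # records without usage data (e.g. user turns)
        for key in USAGE_KEYS:
            value = usage.get(key)
            if isinstance(value, int):
                totals[key] += value
    return totals


def summarize_projects(root=None):
    """Return {log file path: Counter of token totals} for each .jsonl file."""
    root = Path(root) if root is not None else Path.home() / ".claude" / "projects"
    report = {}
    if not root.is_dir():
        return report
    for path in sorted(root.rglob("*.jsonl")):
        with path.open(encoding="utf-8", errors="replace") as handle:
            report[str(path)] = summarize_usage(handle)
    return report
```

Calling `summarize_projects()` with no arguments walks `~/.claude/projects/` and gives one total per log file. If the real records use a different layout, the totals simply come back empty rather than raising.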

Why observability matters

When Claude Code runs autonomously — spawning subagents, calling tools, reading files, executing commands — you have no visibility into what’s actually happening. The terminal shows a fraction of the activity. Subagents are invisible. Tool calls blur together. And when something goes wrong three agents deep in a parallel execution, you’re left reading through logs after the fact.

Claude Observe captures every hook event as it happens and streams it to a live dashboard. You see exactly what each agent is doing, which tools it’s calling, what files it’s touching, and how subagents relate to their parents.

ironmansculler · April 15, 2026, 3:19am

If you create the MCP server or control the endpoint connection, you control what connections Claude makes. The configuration files in the Claude setup let you configure which tools it uses. In the small demo we posted, what comes back from the Claude response to be executed is handled locally, not by some agent on a far-off machine. You, the developer, have full control over the resulting output actions.

ironmansculler · April 16, 2026, 9:51pm

Claude Opus 4.7 has been released, along with some performance stats: https://www.anthropic.com/news/claude-opus-4-7

ironmansculler · April 18, 2026, 6:52pm

By now some of you will have tried 4.7. Here is an example of Claude Code compiling the Qt editor with no errors:

https://claude.ai/public/artifacts/d36c51ac-e890-4f18-ad9e-5459296b3438

Adding highlighting for a language or construct, and then of course creating a platform for direct AI interaction without the need for MCP servers, which also reduces the stack to mere kilobytes rather than megabytes and gigabytes…

https://claude.ai/public/artifacts/a9d547ff-100c-4736-94db-8b2d9aad2baf

No need for the bloat… it should become a movement like Bauhaus, and of course it means AI will program your programs in real time… without an MCP server.

Once you remove the MCP server and AI becomes part of your application, the fastest way to let it drive the app is to bypass the normal script-driven paradigm.

And this is what the AI proposes as it determines the services that the host can provide for driving the product:

https://claude.ai/public/artifacts/a395f84c-17ca-4bda-9d04-475d60d1d584

RchdR · May 15, 2026, 11:20am

https://openai.com/index/work-with-codex-from-anywhere/

Observing their users’ activities has created these two spin-offs: Finance & Legal.

Although the solution suggested in this issue is to use plugins, I do think there is a point to be made for having switchable personas.

github.com/anthropics/claude-code

Feature Request: Persona Profiles — switchable bundles of model, writing style, skills, and constraints

opened 07:03AM - 26 Apr 26 UTC

closed 08:36PM - 27 Apr 26 UTC

rong4483-sketch

enhancement area:cli area:skills

### Preflight Checklist

- [x] I have searched [existing requests](https://githu…b.com/anthropics/claude-code/issues?q=is%3Aissue%20label%3Aenhancement) and this feature hasn't been requested yet
- [x] This is a single feature request (not multiple features)

### Problem Statement

## Summary

As someone who uses Claude Code across multiple professional domains (education content, marketing, and professional standards), I find myself repeatedly re-establishing context at the start of every session. I have to manually re-specify my writing register, activate the right skills, and adjust the quality bar depending on which "hat" I'm wearing. This is friction that compounds across dozens of sessions.

## The Problem

My daily workflow spans at least three distinct operating modes:

- **Education**: formal, CPD-compliant writing, structured learning outcomes, enterprise-grade HTML modules
- **Marketing**: persuasive, audience-aware copy, brand voice, GTM positioning
- **Professional Standards**: precise, board-ready language, regulatory tone, no ambiguity

Each mode requires a different writing style, a different set of active skills, and ideally a different model (e.g. Opus for complex strategy work, Sonnet for fast drafting). Right now, none of this is bundled: I have to reconstruct the context manually or hope my CLAUDE.md covers it.

## Proposed Feature: Persona Profiles

A **Persona Profile** is a named, switchable bundle containing:

- **Model selection** (e.g. Opus 4 for strategy, Sonnet for drafting)
- **Writing style / register** (formal, persuasive, technical, educational)
- **Active skills** (which slash commands and skill files are loaded)
- **Quality bar and constraints** (e.g. "Board-ready polish, no forced linear progression")
- **CLAUDE.md overrides** specific to that persona

The user can define multiple profiles and switch between them with a single click or command, e.g. `/profile education`, `/profile marketing`, `/profile strategy`.

## Why This Matters

The current workaround is directory-level CLAUDE.md files, but this:

- Requires navigating to a different project directory to change context
- Doesn't change the active model
- Doesn't activate/deactivate skills
- Isn't visible or selectable from the UI

A persona profile would be a first-class concept in Claude Code settings: visible in the UI, switchable at session start, and portable across machines.

## Real-World Impact

Over 58 sessions I've worked across all three domains above. The time spent re-establishing context (re-specifying tone, quality bar, constraints, and skills) is a consistent tax on every session start. A one-click profile switch would eliminate that entirely and make Claude Code meaningfully more useful for professionals who work across domains.

## Suggested Implementation

- Profiles defined in `~/.claude/profiles/` as markdown files (similar to skills)
- Each profile specifies: model, writing_style, active_skills[], claude_md_additions
- Selectable via `/profile <name>` command or a dropdown in the Claude Code UI
- Active profile shown in the status line

### Proposed Solution

A named **Persona Profile** stored in `~/.claude/profiles/` as a markdown file, containing:

- **model**: which Claude model to use (e.g. `claude-opus-4-7` for strategy, `claude-sonnet-4-6` for drafting)
- **writing_style**: register and tone descriptor (e.g. "formal CPD-compliant", "persuasive marketing", "board-ready executive")
- **active_skills[]**: list of skills to load automatically at session start
- **claude_md_additions**: persona-specific CLAUDE.md overrides (quality bar, constraints, anti-patterns)

**Interaction model:**

- Switch via `/profile <name>` slash command at session start
- Or select from a dropdown in the Claude Code UI session header
- Active profile name shown in the status line so the user always knows which context is active
- Profiles are portable: stored in `~/.claude/` so they transfer across machines

### Alternative Solutions

The current workaround is directory-level CLAUDE.md files, but this approach has significant limitations:

- Requires navigating to a different project directory to change context
- Does not change the active model
- Does not activate or deactivate skills
- Is not visible or selectable from the UI
- Cannot be easily transferred to a new machine as a named profile

Manually re-specifying tone, quality bar, and constraints at the start of each session works but creates repeated overhead across every session.

### Priority

Critical - Blocking my work

### Feature Category

CLI commands and flags

### Use Case Example

_No response_

### Additional Context

_No response_

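To make the proposal in the issue above more concrete, here is a minimal sketch of how such a profile file might be read. Everything in it is hypothetical: Claude Code has no `/profile` feature, the `---` delimited header format is an assumption borrowed loosely from how skill files are commonly laid out, and `PersonaProfile`, `parse_profile` and the skill names are made up for illustration. Only the four field names come from the issue text.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class PersonaProfile:
    """The four fields named in the feature request, plus a profile name."""
    name: str
    model: str = ""
    writing_style: str = ""
    active_skills: List[str] = field(default_factory=list)
    claude_md_additions: str = ""


def parse_profile(name, text):
    """Parse a HYPOTHETICAL profile file: a '---' delimited header of
    'key: value' lines, followed by free text used as CLAUDE.md additions."""
    profile = PersonaProfile(name=name)
    lines = text.splitlines()
    body_start = 0
    if lines and lines[0].strip() == "---":
        for index in range(1, len(lines)):
            stripped = lines[index].strip()
            if stripped == "---":
                body_start = index + 1  # body begins after the closing marker
                break
            key, sep, value = stripped.partition(":")
            if not sep:
                continue  # ignore header lines that are not 'key: value'
            key, value = key.strip(), value.strip()
            if key == "model":
                profile.model = value
            elif key == "writing_style":
                profile.writing_style = value
            elif key == "active_skills":
                # comma-separated list, e.g. "skill-a, skill-b"
                profile.active_skills = [
                    item.strip() for item in value.split(",") if item.strip()
                ]
    profile.claude_md_additions = "\n".join(lines[body_start:]).strip()
    return profile


# Example profile text; the skill names are invented placeholders.
EXAMPLE = """---
model: claude-opus-4-7
writing_style: board-ready executive
active_skills: standards-review, exec-summary
---
Board-ready polish, no forced linear progression.
"""

strategy = parse_profile("strategy", EXAMPLE)
```

A `/profile strategy` command, if it existed, would then only need to apply `strategy.model`, load `strategy.active_skills`, and append `strategy.claude_md_additions` to the session context.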
urayoan · May 15, 2026, 12:54pm

Hey @Mark_Sarson, yesterday I was testing Hermes Agent and it may be a replacement for a paid subscription of Claude Code, or at least a good “let’s see the capabilities for free and see how it works”.

The catch: it uses owl-alpha, which is not a model trained for Clarion as far as I know. The good stuff: it is free, and with Hermes (the coding agent) it is supposed to get smarter over time.

RchdR · May 21, 2026, 10:10am

From this ClarionHub thread

RchdR · May 29, 2026, 10:38pm

ironmansculler · May 29, 2026, 11:46pm

We have found the real benefit of Claude Code is that it can port code and also test code. Here is Claude Code creating, compiling and reading the logs from a program to develop it in a continuous cycle. It also records video of the running program at the same time. No human interaction was required in this process other than a conversation. In this program a mapping solution for automated control rescaling was being tested. It was a cartograph mapping solution we use in Linux, back-ported to Clarion code.

RchdR · May 30, 2026, 6:08pm

I don’t have a clue what I’m supposed to be looking at here.

RchdR · May 31, 2026, 11:40pm

https://znetwork.org/znetarticle/9-trillion-collapse-machine/

The Reddit comments on this article are interesting; some might even say prophetic.

LANSRAD · May 31, 2026, 11:48pm

The article may be overstated, but the underlying point is important: AI is being sold as software while being scaled as heavy industry.

That doesn’t mean AI is useless.

It does mean developers should be cautious about assuming unlimited cheap compute, stable pricing, or magical agentic workflows.

My own takeaway remains the same: use AI aggressively where it helps, but keep the programmer in the loop and keep the workflow grounded.

ironmansculler · June 2, 2026, 9:22am

The AI models seem to be cultural in nature, and even Claude Code is not really what one would like from a European perspective.

LANSRAD · June 2, 2026, 10:10am

I think that’s a fair observation. The models are definitely not culturally neutral. They reflect a lot of the language, assumptions, priorities, and working habits present in the material they were trained on.

That shows up in coding tools too. The default posture often feels very American tech-industry: cloud-first, move-fast, automate aggressively, connect everything, and assume the platform knows best.

That does not make the tools useless, but it does mean they need boundaries. Different developers, companies, and regions may have different expectations around privacy, control, auditability, maintainability, and how much autonomy they are willing to hand over to a tool.

So I think the better way to look at these systems is not as neutral experts, but as powerful assistants whose defaults need to be adjusted to the environment they are being used in.

RchdR · June 2, 2026, 8:12pm

Seven new AI models from Microsoft, developed by its Microsoft AI Superintelligence Team, are its attempt to reduce dependency on OpenAI.

Independents preferred its midrange MAI-Thinking-1 model to Anthropic’s Claude Sonnet Opus 4.6, and in benchmarks MAI-Thinking-1 matched Opus 4.6’s coding ability.

MAI-Code-1-Flash is an inference-efficient, 5-billion-parameter coding model developed by Microsoft as part of their homegrown MAI series. It is specifically built for autonomous agentic coding, deeply integrated into GitHub Copilot and VS Code. [1, 2]

Because it is a native Microsoft product embedded into their proprietary developer ecosystem, it is not currently hosted as a public model on the Hugging Face hub. [1]

It is designed to compete with Anthropic’s Claude Haiku. Despite its small size, it scores 51.2% on SWE-Bench Pro by autonomously planning and completing complex coding tasks from start to finish.

If you are looking for highly efficient coding and reasoning models currently available on the Hugging Face hub, consider these widely used alternatives:

Step 3.5 Flash by StepFun: A 197B-parameter MoE model (with ~11B active parameters) optimized for high-speed coding and agentic workflows. [1]
Ling-flash-2.0 by inclusionAI: A tiny-activation MoE model that achieves major inference speed advantages for local coding environments. [1]
Qwen3.5-35B: A powerful open-weight base built for complex reasoning and coding tasks.

Not strictly anything to do with AI, but also released today.

Kapa builds AI assistants that answer questions from technical documentation. The knowledge bases we process hold millions of images: screenshots, architecture diagrams, circuit schematics, annotated UI walkthroughs. We spent several months working out how to make them useful in our RAG pipeline.

Tue, Jun 2, 8:30 PM - 9:45 PM BST (duration 1 hour 15 minutes). In San Francisco only, not being recorded!

https://openai.com/index/openai-frontier-models-and-codex-are-now-available-on-aws/

LANSRAD · June 3, 2026, 12:18am

That is interesting, and it actually looks like a pretty natural move by Microsoft.

I don’t think it necessarily means they are moving away from OpenAI entirely, but it does suggest they do not want to be completely dependent on someone else’s frontier models for something as central as Copilot, VS Code, GitHub, Azure, and developer tooling.

The MAI-Code-1-Flash angle is especially interesting because it sounds less like “build the smartest general model” and more like “build a fast, cheap, deeply integrated coding model that Microsoft can control and run at scale.” That may matter more inside Copilot than winning every general benchmark.

For Clarion, though, I would still be cautious. Most of these coding models are going to be strongest where they have the most training data and feedback: C#, TypeScript, Python, Java, C++, web frameworks, and so on. Clarion is still niche enough that the real advantage will probably come from giving the AI very explicit Clarion-specific rules, examples, and tightly bounded tasks.

So I see this less as “one new model wins” and more as another sign that the AI coding tools are becoming a normal part of the developer stack. The trick is still to use them in reviewable chunks, keep the programmer in charge, and not let the tool wander off pretending that almost-right code is good code.

RchdR · June 3, 2026, 1:40am

Back in January, Microsoft and OpenAI restructured their partnership, dropping Microsoft’s exclusive rights to OpenAI’s models. This cleared the way for a massive $50 billion alliance between OpenAI and Amazon. OpenAI is now offering its technology directly on Amazon Web Services, while Microsoft will no longer pay a revenue share to OpenAI.

https://openai.com/index/next-phase-of-microsoft-partnership/

https://openai.com/index/openai-frontier-models-and-codex-are-now-available-on-aws/

That’s always been the case, which is why you won’t find an AI writing good Template Language code.

There just aren’t enough examples on GitHub, but more importantly, there are bugs and plenty of undocumented features because of the help docs.

Karpathy’s AutoResearch might work well training itself on the template language, but I’ve not had the time to get into it yet.

The freebie ChatGPT I have here, not that I really use it for coding, can’t do template language code; it just hallucinates like it’s found the main stash in William Leonard Pickard’s nuclear silo!

LANSRAD · June 3, 2026, 3:15am

Yes, that looks like the important distinction.

Microsoft is not exactly being cut loose from OpenAI, but the relationship is clearly becoming less exclusive and more complicated.

The OpenAI/Microsoft announcement says Microsoft remains OpenAI’s primary cloud partner, and OpenAI products still ship first on Azure unless Microsoft cannot or chooses not to support the needed capabilities. But it also says Microsoft’s license to OpenAI models is now non-exclusive, and OpenAI can serve its products through other cloud providers.

That makes the AWS announcement much more understandable. OpenAI can now meet large enterprise customers where they already live, including AWS and Amazon Bedrock. For big companies, that matters because procurement, security review, compliance, billing, and governance are often already built around AWS.

So I think this supports the larger point: Microsoft still benefits from OpenAI, but it has every reason to reduce dependency and build more of its own model stack. That makes the MAI models less surprising. Microsoft wants Copilot, VS Code, GitHub, Azure, and Windows to be built on technology it can control, tune, price, and deploy without being completely dependent on another company’s roadmap.

For developers, I still think the practical takeaway is the same.

The model landscape is going to keep shifting. OpenAI, Microsoft, Anthropic, Google, Amazon, Meta, and others will keep making moves. The safer long-term habit is not to marry one model or one vendor, but to build workflows that use AI in bounded, reviewable chunks, with enough project-specific context and human review to keep the work honest.

RchdR · June 3, 2026, 10:24am

Because those companies will keep moving, not always in the right direction if their past is any example, and because the corporate practice of enshittification will still exist to maximise profits, the best practice is to develop one’s own AI model to remain in control.

LANSRAD · June 3, 2026, 10:55am

I agree with the concern about vendor lock-in and enshittification. We have all seen enough software and cloud services get worse, more expensive, more restrictive, or more intrusive over time that it would be foolish not to think about control.

Where I would make a distinction is between several very different things that often get lumped together.

Running a local model can make sense. Using an open-weight model can make sense. Fine-tuning a smaller model for a narrow purpose can make sense. Using RAG or local search over your own documents can also make a lot of sense, especially when privacy, repeatability, or disconnected use matters.

But that is not the same thing as developing your own AI model in the sense of competing with OpenAI, Anthropic, Google, Microsoft, Meta, or Amazon. No individual developer, and probably no ordinary small company, is going to reproduce the compute, training data, research staff, infrastructure, evaluation pipeline, safety work, and ongoing retraining needed to build and maintain a frontier model.

So to me this is partly an apples-to-oranges comparison.

For some tasks, a local or specialized model may be the better answer because it gives you control and keeps the work close to your own data. For other tasks, especially broad reasoning, coding help across many languages, writing assistance, research, and general problem solving, the large hosted models are going to remain far ahead for the foreseeable future.

The practical answer may not be “build your own model.” It may be “avoid building your workflow so that it depends completely on one vendor.” Use portable prompts, keep your own data and documentation, design your workflow so different models can be swapped in, and use local models where they actually fit the job.

That gives you some control without pretending that a desktop machine or small office server can replace a frontier-scale LLM.
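The "portable prompts, swappable models" habit described above can be sketched in a few lines. This is only an illustration under stated assumptions: `ModelRouter`, `REVIEW_PROMPT` and the two fake backends are hypothetical names, and the backends are stand-ins that make no network or inference calls. A real setup would wrap a vendor SDK or a local inference server behind the same prompt-in, text-out signature.

```python
from typing import Callable, Dict

# A backend is any callable that maps a prompt string to a reply string.
Backend = Callable[[str], str]

# The prompt lives in your own project as plain text, not inside a vendor tool.
REVIEW_PROMPT = (
    "You are reviewing {language} code.\n"
    "Project rules:\n{rules}\n"
    "Code:\n{code}\n"
    "List concrete problems only."
)


class ModelRouter:
    """Keeps vendor-specific calls behind one small interface."""

    def __init__(self) -> None:
        self._backends: Dict[str, Backend] = {}

    def register(self, name: str, backend: Backend) -> None:
        self._backends[name] = backend

    def complete(self, name: str, template: str, **fields: str) -> str:
        if name not in self._backends:
            raise KeyError(f"no backend registered as {name!r}")
        # Fill the template, then hand the finished prompt to the backend.
        return self._backends[name](template.format(**fields))


def fake_local_backend(prompt: str) -> str:
    """Stand-in for a local model; does no real inference."""
    return f"[local] received {len(prompt)} characters"


def fake_hosted_backend(prompt: str) -> str:
    """Stand-in for a hosted model; does no real API call."""
    return f"[hosted] received {len(prompt)} characters"


router = ModelRouter()
router.register("local", fake_local_backend)
router.register("hosted", fake_hosted_backend)
```

Switching vendors then means registering a different backend under the same name; the prompt template and the calling code stay untouched.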