Introducing Institutionalized
An experiment in using LLMs to create a tool that uses an LLM. Hats on hats.
A couple weeks back, I wrote a post about vibe coding. It was something simple and stupid and it was an experiment in using the Copilot chat window in VS Code. It was fun, but the more I thought about it, the more I felt like I needed to really push it.
The Problem
In the last year I've probably made a few thousand git commits. I think anyone working on software and using git as their source control probably has.
In the last year, we also started rolling out more AI tools at work with an emphasis on using them.
Combining all that with a ChatGPT CLI made by kardolus (including their prompt repository), I came up with a Rube Goldberg machine of a command that would look at git status, check the diffs, and write a pull request description, which it would then pipe into the GitHub CLI.
It worked about 40% of the time and I eventually just kind of abandoned the setup...but not the idea.
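For the curious, here is a rough sketch of that pipeline's shape. The original was a chain of shell commands, not Go, so treat this as an illustration only: the function names (runGit, BuildPRPrompt, PRCreateArgs) and the prompt wording are made up for this post, and the actual LLM call is left out. The git subcommands and the gh pr create flags are the standard ones.

```go
package main

import (
	"fmt"
	"os/exec"
	"strings"
)

// runGit runs a git subcommand and returns its output without trailing newlines.
func runGit(args ...string) (string, error) {
	out, err := exec.Command("git", args...).Output()
	return strings.TrimRight(string(out), "\n"), err
}

// BuildPRPrompt is a hypothetical helper that combines the working tree
// status and the staged diff into a single prompt for the model.
func BuildPRPrompt(status, diff string) string {
	return "Write a pull request description for these changes.\n\n" +
		"Status:\n" + status + "\n\nDiff:\n" + diff
}

// PRCreateArgs builds the argument list for `gh pr create`.
func PRCreateArgs(title, body string) []string {
	return []string{"pr", "create", "--title", title, "--body", body}
}

func main() {
	status, err := runGit("status", "--porcelain")
	if err != nil {
		fmt.Println("git status failed:", err)
		return
	}
	diff, err := runGit("diff", "--staged")
	if err != nil {
		fmt.Println("git diff failed:", err)
		return
	}
	// The LLM call is omitted here; this only prints what would be sent
	// and the gh command that would receive the model's answer.
	fmt.Println(BuildPRPrompt(status, diff))
	fmt.Println("gh", strings.Join(PRCreateArgs("<title>", "<body from the model>"), " "))
}
```

Every arrow in that chain is a place for things to go wrong, which goes some way toward explaining the success rate.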
The Spark
In the aforementioned experiment with agentic coding, I came across a way to have the agent create commit messages and PR descriptions. It was actually kind of a neat feature since they were very descriptive and, most importantly, didn't take a whole lot of work.
That re-ignited my drive for a tool that would let me make all of the code changes, but not make me do the boring part of writing descriptive commit messages and adequate pull request descriptions.
So I took some time over the weekend and started mucking about.
The Solution
It's a CLI written in Go that leverages different LLM APIs to create your commit messages and pull request descriptions. It's relatively easy to install, relatively easy to configure, and has logic set up to handle three different LLM providers.
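To give a feel for what "logic to handle different providers" can look like in Go, here is a minimal sketch. It is not the tool's actual code: the Provider interface, stubProvider, BuildCommitPrompt, and SelectProvider are hypothetical names, the provider names other than "openai" are placeholders, and the stub returns a canned message instead of calling any API.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// Provider is a hypothetical interface; the real tool's types may differ.
type Provider interface {
	Name() string
	Complete(prompt string) (string, error)
}

// stubProvider stands in for a real API client so the sketch runs offline.
type stubProvider struct{ name string }

func (s stubProvider) Name() string { return s.name }

// Complete returns a canned commit message instead of calling an API.
func (s stubProvider) Complete(prompt string) (string, error) {
	if strings.TrimSpace(prompt) == "" {
		return "", errors.New("empty prompt")
	}
	return "chore: update files (" + s.name + ")", nil
}

// BuildCommitPrompt wraps a diff in instructions for the model.
func BuildCommitPrompt(diff string) string {
	return "Write a concise, descriptive git commit message for this diff:\n\n" + diff
}

// SelectProvider looks a provider up by its configured name.
func SelectProvider(name string, available []Provider) (Provider, error) {
	for _, p := range available {
		if p.Name() == name {
			return p, nil
		}
	}
	return nil, fmt.Errorf("unknown provider %q", name)
}

func main() {
	// "provider-b" and "provider-c" are placeholder names for illustration.
	providers := []Provider{
		stubProvider{"openai"},
		stubProvider{"provider-b"},
		stubProvider{"provider-c"},
	}
	p, err := SelectProvider("openai", providers)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	msg, err := p.Complete(BuildCommitPrompt("diff --git a/main.go b/main.go"))
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(msg)
}
```

The appeal of an interface like this is that the rest of the CLI only ever talks to a Provider, so adding or swapping a backend becomes mostly a configuration concern.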
The command is maybe a little wordy, but I couldn't stop thinking about that time that Mike wanted just one Pepsi.
The Process
I think this is the real meat of everything.
Starting
I created the repository and used the "jump start your project with Copilot" option when you create a new repository. I threw in a prompt with the basic requirements and let it go.
Thirty minutes or so later, I had a pull request waiting for me with a base program that connected to the OpenAI API and was able to create a commit message and complete the commit.
From a project management standpoint, this was an MVP. I felt like "yeah, this will work from here, but I want some more options."
Building
From there, I used a combination of two different approaches:
GitHub Issues Assigned to Copilot
The premise is pretty simple (if you've been around software engineering and project management for a while). You write a user story with your requirements, acceptance criteria, and any other notes. You assign it to Copilot and sometime later you get a pull request to review. Once complete, you can go back and check the logic it used and the steps it went through to get there. If it's all above board, you merge the pull request and you're there.
While this is by far and away the easiest approach, it also carries the biggest "you need to know what you're looking at" factor. I'm not super proficient in Go, which is why I used it instead of my beloved .NET. I wanted to have to really dive in and make sure that things made sense and that I could understand them. I had to really review them. If you were doing this without a solid foundation in programming, you'd be flying blind as to whether it works as expected and has any degree of maintainability.
Copilot Chat in VS Code
Again, it's pretty simple. Just like other GPTs, you tell it what you want and it will work through iteration with you. Start broad or start narrow and it will just go. Along the way it will prompt you for any commands it needs to run and check with you to make sure you want to keep going. When all is said and done it will explain to you what it did and in some cases propose some next steps for you.
This feels more like actually coding. You're closer to the metal as you watch the process go and you can interject when you see things start to go sideways. The visibility is pretty awesome.
What I found difficult was that it (somewhat regularly) can get stuck in a loop. In one case I was working with it to write tests and it got stuck in a loop trying to change a property name from camel case to Pascal case. I would stop the agent, re-prompt it to solve the problem, and it would get stuck in the loop again.
The other issue I ran into was context. When you use Copilot to work through an issue, it maintains its own context as it iterates. If you need it to fix something, it can check what it did and the original prompt and work from there. When using the chat, it seems to lose context more often. You can find it doing things like forgetting names for variables or re-doing the same thing twice.
Completion
Moving back and forth between the two, I was able to iterate and expand to the point that I hit everything I had originally wanted in the tool. A lot of those features came from observations while iterating, and a lot were just "nice to haves" that I kind of knew would be there.
I had Copilot help me create some CI workflows and it's ready for prime time.
The Concerns
I think the environmental, socio-economic, psychological, and educational issues with using AI tools across the board have been documented by people way smarter than me. However, as I said previously, this is the reality of the industry I'm in. I'll set all of that aside for now and try to make peace with myself on those other fronts.
I do have concerns about the actual usage of these tools though.
The first is that I think there's a level of comfort it creates that can be misleading. In cases of data security or logic, I've found it can get itself very twisted. So it may "work," but it wouldn't pass a basic test of logic or security. It would also struggle to scale, and those decisions would be costly to undo. If you don't have a base understanding of the tools/languages it is using, it can be incredibly difficult to troubleshoot on your own, and if you try to have the tooling do it, there can be further obfuscation.
The second is the lack of visibility it can create. If you use GitHub Issues, the commits and pull requests show they were performed by the AI agent. If you use the VS Code plugin, the same changes appear to come from you. The implication here is that you may lose visibility into what/who did what to your codebase. This could become an issue when working in a team and you're trying to chase down an issue or a change. You could find yourself in a loop of introducing a change that no one knows how to fix and the AI just loops trying to fix it. There has to be a very solid "driver in the loop" to keep it honest and validate everything happening.
The third is, and I said this before, it's a junior engineer at best. You give it a well-defined task and it completes it. You review it heavily, peer test, and send it if it's all above board. Sometimes you ask for fixes and it makes the problem worse and you have to guide it back. This means the window for junior engineers is closing, which is bad for an aging industry in the middle of a jobs crisis.
The final one is the actual cost. I can't actually nail down what the cost is for this functionality in GitHub. It looks like there is no cost to the right accounts until November. This has the potential to really bite you if you're not keeping an eye on your billing and budgets in GitHub.
The Closing
In the right hands, I think these tools are pretty handy. There are cases where I've used them to work through issues that I've been struggling with for months.
A GitHub Action introduced a breaking change and didn't really document the work-around that makes sense for my environment? Ask the chat agent to take a swing at fixing it.
That Action needs to be updated across 34 Workflows? Ask the agent to write a script to update all of them.
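As an illustration of the kind of script I mean, here is a minimal sketch in Go. It is not the script the agent wrote for me: BumpAction is a hypothetical helper, the action name and version are examples only, and it assumes workflows live in .github/workflows with a .yml extension and that the new ref contains no "$" characters.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"regexp"
)

// BumpAction rewrites every "uses: <action>@<ref>" in a workflow file's
// content so that it points at newRef. Other actions are left untouched.
func BumpAction(content, action, newRef string) string {
	pattern := regexp.MustCompile(`(uses:\s*` + regexp.QuoteMeta(action) + `)@[^\s#]+`)
	return pattern.ReplaceAllString(content, "${1}@"+newRef)
}

func main() {
	// Example action and ref, for illustration only.
	const action, newRef = "actions/checkout", "v4"

	// Only *.yml files are matched; *.yaml would need a second glob.
	files, err := filepath.Glob(".github/workflows/*.yml")
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	for _, path := range files {
		data, err := os.ReadFile(path)
		if err != nil {
			fmt.Fprintln(os.Stderr, err)
			continue
		}
		updated := BumpAction(string(data), action, newRef)
		if updated == string(data) {
			continue // nothing to change in this workflow
		}
		if err := os.WriteFile(path, []byte(updated), 0o644); err != nil {
			fmt.Fprintln(os.Stderr, err)
			continue
		}
		fmt.Println("updated", path)
	}
}
```

Run from the repository root, it prints each workflow it touches, so the resulting diff is easy to review before committing.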
I believe it also creates a nicer way to boilerplate your projects.
I want to build a data tool that has a .NET 8 backend, a PostgreSQL database, an Angular frontend, and I want to use docker-compose to run the whole thing? Prompt upon repository creation and start from there. I can build from there without the tedium of scaffolding out everything on my own and trying to remember where that one change I need to make in the Dockerfile to use the right version of NPM goes.
Will I keep using it? Probably. It's the nature of the industry right now and I have a backlog of things that need to be done without the time to do them. It also makes the barrier to starting new ideas (this is all a case in point) much lower.
Anyways, I hope the overlords eventually read this post and decide to take pity on me.
Besos robots!
