Folder-Based Orchestration for Smarter AI Agents
📝 Transcript
A year ago, I created a video about prompt chaining and how to improve your AI coding output. Obviously, a lot has changed since then. We're now in the era of agents, and cramming everything into one file won't work anymore. I switched to a folder structure instead. Every time I open a new chat with Claude, Cursor, Pi, or any other agent, I load a clear set of files that defines the coding style, the project rules, where I left off, and what is next. The agent has context from the first message. In this video, I will show you how to build this from scratch using one prompt. We will set it up live for a real Flutter project, but you can adapt this to any coding or non-coding task. You will find the prompts and the LLM conversation in the video description down below. You don't have to join the free Discord community, although I'd appreciate it if you did, because there are a lot of cool people, and you might learn a thing or two. So, without further ado, let's jump into the video.
Now, before I walk you through the process, I quickly want to touch on the subject of context engineering. For those of you who already know what that is, I've linked the timestamps in the video description down below, so you can just skip this part. Andrej Karpathy defined context engineering as the delicate art and science of filling the context window with just the right information for the next step. If we look at the next graphic, this becomes quite clear. Three to four years ago, when ChatGPT first came out, there wasn't really much context that needed to be handled. You had the system prompt, which the user didn't really have access to, and you had the user message, something like "Explain physics to me like I'm five." That's it. Of course, you had to do prompt engineering, which means writing the prompt in a certain way to increase the model's output quality, but you couldn't really do much else, right?
Now, in 2026, this looks a lot different, because the agent can look at docs and make tool calls, and you can add memory files and comprehensive instructions. So there's a lot more information that needs to be digested by the model or the agent. Unfortunately, models only have a limited context size, which means the amount of information they can hold isn't endless. At some point, no matter the model, whether it's Gemini, Claude, or Qwen, they all experience something called context rot. That happens when the context window gets too full: the model will forget certain definitions that you gave it earlier in the chat, or it will even hallucinate things. This is why it's so important that you only feed the absolutely necessary context to the model or the agent in order to increase its output quality.
And now we'll head back over to Perplexity, and I'll show you how to create the folder structure. We're in Perplexity, and for this part you can really use any LLM you like. You can even spin up a Claude Code instance; it doesn't really matter. I'm just on the Perplexity Pro subscription, which is why I'm using it. As you can see, I have selected Claude Sonnet 4.6 with thinking enabled. Before I walk you through some of the milestones of this conversation, I quickly want to look at the initial prompt. In the beginning, we define the role of the LLM, nothing too fancy: "You're helping me build a folder-based orchestration system for a new project. The system gives AI agents persistent context, clear routing, and consistent standards, so they never start from zero and always produce structured, predictable output." The discovery questions are divided into four rounds. Let me quickly disable my video here. The first round is about the project identity, what it is about. The second round is the most important one, because it is custom and unique to you.
You will see throughout the video that I'm currently using four different stages until I finish my app. For you, it might be different. Maybe you have just three work modes or stages, maybe you have 17. I don't really know. And this is the beauty of this prompt, because you can define the workspaces as you like. So, what are the two to four distinct mental modes you shift between in this project? Is it researching, writing, publishing? Is it planning, building, testing? Each of those boundaries is a workspace. Rounds three and four are also necessary, because they define the standards and repeating patterns, something like the coding patterns, for example, or how you want the agent to name the files it creates, and so on. And the life cycle defines where the project starts and where it ends: what happens if you take a break for a few days, and which files you need to look at to quickly understand where you left off and what you need to do next.
And now we head back over to Perplexity, so I can walk you through some of the results. As you can see, I just pasted the prompt here, and immediately I got the first-round questions: what is this project, what is the end goal, and so on. As I've said, it's a Flutter mobile app targeting iOS first, designed for creators. The app allows users to record voice memos on device, convert them to text, and use a local LLM to analyze and structure their thoughts to help make sense of them. I also added next steps for this specific round. You don't really have to do it, but I thought it would be a nice addition: I ask the LLM to research what packages are needed and what core features the target audience would want. Market research has always been a little difficult for me, because there are so many sources and so much information out there, so I like to offload some of the work to models like Claude or tools like Perplexity.
Of course, these are just suggestions. You still have to do your own research and understand whether you really need certain packages or certain features, but they may give you some inspiration that you hadn't thought about previously, so I find it very helpful. After round one is finished, you will immediately receive the round-two questions. As I've told you, I currently work in four different phases: the planning phase, the spec definition, the development and context, and the publishing phase. Those are pretty self-explanatory, but the planning phase is about defining the app: what it does, who its target audience is, what features it needs, and so on. The spec definition is the most important part, because instead of jumping straight into code, this phase focuses on iterating on specific functionalities, e.g., onboarding or journal functions, and it aims to map out 80% of the solution per feature. While unexpected developments may require last-minute changes, the bulk of the logic should already be predefined. During development, you will probably encounter issues and things that won't work as expected, so you'll have to change course a little. But overall, you want a clear path and a clear direction, so you can tell the agent which direction it should go in, if that makes sense.
The third phase is about the actual development, and it focuses on, as I've said, the execution and the logistics of building out the specs. We have blocker management to identify potential blockers that prevent finishing a spec, the context, and the order. The last phase is about everything I need to do before I can submit the app to the App Store for review, for example, or host it on the web. After that, you will receive the third round, which is about the standards and repeating patterns.
It will become clearer once we look into the actual folder workspace. Once you've answered the last round about the life cycle and resuming, you will receive the finished workspace as a zip file. You can download it and load it into VS Code, and this is what we're going to do now, so you can see the actual files.
Now we're in VS Code. For those of you who don't know what that is, it's really just a fancy tool to look at your files in a different way. On the left, we have the folder tree, and if you click on a file, you immediately see its content on the right. In the Explorer on Windows or the Finder on Mac, by comparison, you have to open the folder and then manually open the files in a different program and look at them one by one. This is just more convenient, right? The first thing you're going to see when you load this workspace into VS Code is two folders. There is Flutter standards with a context.md file, which we're going to look at in a bit. And there is the projects folder, where we have the current app project. This is how I work: I figured it's much more convenient to have a projects folder, so I can take this project, copy it, and rename it to a template. Whenever I start working on a new project, I can just copy that template folder again, rename it to the project I'm working on, and I'm good to go. I don't have to create a new workspace and create the files manually for every project. But if you'd like to have one app project per workspace, you can definitely do this. All you need to do is delete the projects folder and put the project into the root of the folder tree, so to speak, and then you're good to go, too. The most important file inside the project folder is the CLAUDE.md file. Let me quickly disable my video, so that you can see it more clearly.
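As a rough illustration of the layout just described (a shared standards folder plus a projects folder with a reusable template), here is a minimal Python sketch. The exact folder names (`flutter-standards`, `projects`, `_template`) and the project name are assumptions for illustration; the video only shows the generated workspace on screen.

```python
import shutil
import tempfile
from pathlib import Path

# Hypothetical names: the transcript describes the layout, not exact spellings.
STANDARDS_DIR = "flutter-standards"
PROJECTS_DIR = "projects"
TEMPLATE_NAME = "_template"


def create_workspace(root: Path) -> Path:
    """Create the shared standards folder and an empty template project."""
    standards = root / STANDARDS_DIR
    standards.mkdir(parents=True, exist_ok=True)
    (standards / "context.md").touch()

    template = root / PROJECTS_DIR / TEMPLATE_NAME
    template.mkdir(parents=True, exist_ok=True)
    (template / "CLAUDE.md").touch()
    return template


def new_project(root: Path, name: str) -> Path:
    """Copy the template folder and give the copy the new project's name."""
    source = root / PROJECTS_DIR / TEMPLATE_NAME
    target = root / PROJECTS_DIR / name
    shutil.copytree(source, target)
    return target


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        create_workspace(root)
        new_project(root, "voice-creator-app")
        for path in sorted(root.rglob("*")):
            print(path.relative_to(root).as_posix())
```

Copying a template rather than regenerating the workspace mirrors the "copy, rename, good to go" habit described above; a one-project-per-workspace setup would simply skip the `projects` level.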
Now, the role definition is currently empty, and I'm going to show you what this looks like in a bit when we look at the actual live workspace that I'm currently using to build out this voice creator app. I just wanted to show you the boilerplate files that you receive from the agent first. At the top we have the project identity, where we define the project and the stack. We have the workspace overview, and let me make this a little smaller so the formatting looks a little nicer. We have the routing table. You've probably heard of OpenClaw if you don't live under a rock, and OpenClaw, at the end of the day, is nothing more than an orchestration layer: a clear set of files that tells the agent what it needs to do and when. This is the exact same logic. We have a task, for example, "write or update the PRD." The agent has to go to the planning folder, read the context.md file, and also load the Flutter standards. We also have other tasks, like "make an architecture or schema decision," and so on. So the agent knows exactly where it needs to go and doesn't need to traverse the folder structure every time you give it a task. Then we have some global rules, like clean architecture, for example, or "load Flutter standards before every new feature," and so on. And we also have a part with the naming conventions: how you want the agent to name the folders, the BLoC files, the spec docs, and so on.
The status file is about having one source of truth, basically. When you take a break from the project for a few days, you can look at this file and immediately know where you left off and where you have to start again. Before we dive into the different phases, I want to look into the Flutter standards folder. If you open that folder, you will find a context.md file, which essentially defines the Flutter standards. Those are the app-agnostic conventions that apply to every Flutter project in this workspace.
Loading this file gives an agent the full context it needs to match existing architecture patterns, naming conventions, and code style. Again, you have something like a routing table: when should this file be loaded? The non-negotiables are a feature-based folder structure, so each feature lives at features, then a folder with the feature name, with different layers: data, domain, presentation. This is exactly the clean architecture I was talking about. We have a feature folder structure as an example, so the agent immediately understands what it needs to create. We have the standards files to add as each project evolves. The way you would kickstart the process is to spin up a Claude Code instance and say, "Please look at context.md in Flutter standards, and help me create the files by asking a series of questions." I currently don't have a Claude Code subscription, which is why that won't work here, but as I've said, the agent will start to ask questions, and it will help you create these files: BLoC conventions, theming, testing standards, architecture decisions, and service patterns. I'm going to show you what this looks like in a bit.
Now we come to the meat and potatoes of this workflow, which is the different stages. Each folder holds a context.md file and, in the case of development, for example, some additional information. The context.md file in 01 planning serves the same purpose as the context.md file in Flutter standards. This workspace establishes the complete foundation before any code is written. Every downstream decision, such as feature specs or implementation order, must trace back to an artifact created here. The procedure is the same. You spin up a Claude Code instance and say, "Read context.md in planning, ask a series of questions, and help me create these files: app overview, PRD, architecture decisions, database schema, app theming, risk assessment, core user loop."
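The routing idea (a task maps to the files the agent should load first) boils down to a lookup table. The sketch below is one way to picture it in Python, not how any agent actually reads CLAUDE.md. The first route paraphrases the PRD example from the video; the second route and all paths, including the hyphenated folder names, are assumptions.

```python
from pathlib import Path

# Task -> files the agent should load first. The first route paraphrases the
# example from the transcript; the second route and all paths are assumptions.
ROUTES: dict[str, list[str]] = {
    "write or update the prd": [
        "01-planning/context.md",
        "flutter-standards/context.md",
    ],
    "map out a feature spec": [
        "02-spec/context.md",
        "flutter-standards/context.md",
    ],
}


def files_for(task: str) -> list[str]:
    """Return the files to load for a task, or raise if no route exists."""
    key = task.strip().lower()
    if key not in ROUTES:
        raise KeyError(f"No route defined for task: {task!r}")
    return ROUTES[key]


def build_context(root: Path, task: str) -> str:
    """Concatenate the routed files into one block of context text."""
    parts = []
    for relative in files_for(task):
        path = root / relative
        # Missing files contribute an empty section instead of failing.
        text = path.read_text(encoding="utf-8") if path.exists() else ""
        parts.append(f"## {relative}\n{text}")
    return "\n\n".join(parts)


if __name__ == "__main__":
    print(files_for("Write or update the PRD"))
```

Raising on an unknown task, rather than falling back to loading everything, keeps the context small, which is the point of routing in the first place.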
You already know this from my previous video from last year, but this is a lot more nuanced, and each file serves a very specific purpose. You don't have one file with, I don't know, 600 or 700 lines of text. Each file is very lean, with around 50 to 60 lines of text, except for the PRD, but we're going to talk about this in a bit. As you can see, the artifacts we need to produce are the app overview, the PRD, and so on, as I've told you. Essentially, the process is the same for all of the other stages. Again, you would spin up a Claude Code instance and say, "Read context.md, please ask a series of questions, and help me map out each feature."
And now we'll head back over to the actual workspace that I'm currently working in, so that you can see how this looks for an actual live development project. I loaded my current workspace. As you can see, it looks exactly like the one before. The only thing that I changed is that inside the actual app project, I created a folder called .claude, and I put all of the different stage folders in there, as you can see with 01 planning. It also has a project-specific CLAUDE.md file, but this will be created if you kickstart the process by telling the agent to analyze the root CLAUDE.md file. First, let's have a look at the Flutter standards folder. As you can see, I have multiple files. For example, we have the architecture file, where I define the core pattern. I have some code snippets and examples of package imports: I always want to use package imports, never relative imports. Then we have the features folder structure and the definitions of the data layer, domain layer, and presentation layer, with, as I've said, some coding examples so the agent knows what patterns it needs to implement. Then the core folder structure and the naming conventions. The same is true for all of the other files.
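A rule like "package imports, never relative imports" is easy to check mechanically. The following Python sketch scans Dart source text and reports imports that are neither `package:` nor `dart:` URIs. It is a simplified illustration, not part of the workflow shown in the video; the package name `voice_app` and the file paths in the sample are made up.

```python
import re

# Matches Dart import directives such as: import 'package:app/foo.dart';
IMPORT_RE = re.compile(r"""^\s*import\s+['"]([^'"]+)['"]""", re.MULTILINE)


def relative_imports(dart_source: str) -> list[str]:
    """Return import targets that are neither `package:` nor `dart:` URIs."""
    targets = IMPORT_RE.findall(dart_source)
    return [t for t in targets if not t.startswith(("package:", "dart:"))]


# Hypothetical Dart file content used only for demonstration.
SAMPLE = """
import 'dart:async';
import 'package:voice_app/features/onboarding/domain/user.dart';
import '../data/user_repository.dart';
"""

if __name__ == "__main__":
    print(relative_imports(SAMPLE))
```

A check like this could run over the code the agent produces, so the standard in the architecture file is verified rather than only stated.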
For the BLoC conventions, we have a few more code snippets so the agent understands the pattern: how to register a BLoC, how to handle streams, how to handle errors. I usually work with Firebase, so I also added some Firebase patterns so the agent immediately understands the auth patterns and other important stuff. We have a packages registry, where you can keep track of all of the packages that you currently use. And we have a theming file to define the theming: the app theme, the app colors, the app text styles. If you remember, this is something that you had to define or explain in each new conversation. You would spin up a new conversation, the agent would implement a screen or a feature, and it would look nothing like the rest of the app. This will help resolve that issue.
Now let's look at the actual project folder, starting with the planning phase. You can see that we have all of the files that we talked about earlier: the app overview, app theming, architecture decisions, database schema, and PRD. If you look at the app overview, you will see that, again, this file is incredibly lean, less than 50 lines of text, but the agent will understand what the app is about, who it is for, the problem, the core features, what makes it different, and so on. The same is true for the theming. This file is actually a little longer because we have a lot of stuff to define, but it's all condensed. The agent only needs to load this one specific file; it doesn't need to look at three different files of 300 lines each. The only file that is considerably longer is PRD.md, because we define the acceptance criteria for each feature. As you can see, it's around 270 lines of text.
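To keep files lean over time, a small script can flag Markdown files that grow past a line budget. This is a sketch under assumptions: the default budget of 60 follows the "50 to 60 lines" remark, and the higher override for `PRD.md` is a made-up value that simply allows for the roughly 270 lines mentioned above.

```python
from pathlib import Path

DEFAULT_BUDGET = 60  # assumed soft limit, based on the "50 to 60 lines" remark
# Hypothetical per-file overrides for documents expected to be longer.
OVERRIDES = {"PRD.md": 300}


def over_budget(folder: Path) -> list[tuple[str, int, int]]:
    """List (file name, line count, budget) for Markdown files above budget."""
    findings = []
    for path in sorted(folder.rglob("*.md")):
        lines = len(path.read_text(encoding="utf-8").splitlines())
        budget = OVERRIDES.get(path.name, DEFAULT_BUDGET)
        if lines > budget:
            findings.append((path.name, lines, budget))
    return findings
```

Calling `over_budget(Path("01-planning"))` (folder name assumed) would return an empty list when every file is within its budget.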
But again, especially in the planning stage, you need a little more context so the agent really understands what you're going to implement. The architecture decisions file is also a little longer. Let me walk you through how this looks for the features. You have the folder 02 spec. Inside the features folder, you have all of the features mapped out. As you can see, we're currently at 11 features, I believe. If we look at onboarding, we get the requirements, design, and tasks files. Those files are also a little longer to begin with. The requirements file maps out exactly the feature, the acceptance criteria, and what needs to be implemented. Then we go over to the design: what do we actually want the feature to look like in code and also, partly, visually, as you can see. And then we have the tasks file, where we have the tasks for the feature, split by domain layer, data layer, and presentation layer. As you can see, those files are also a little longer, but it doesn't really matter because you're working on a per-feature basis. You don't need to load 10 different files of 600 or 700 lines of text each. When you work on a specific feature, you have one file that defines each task, and the agent can just go ahead and check them off one after another.
For the other folders, I can't really show you much. The implementation log is still empty because I haven't started the implementation phase yet. I have only finished the features inside the spec folder. But I want to quickly touch on another file, which is the risk register. This is also one of the files that is going to be created in this workflow, because especially if you're a beginner, or even slightly intermediate, you probably have a lot of unknown unknowns, as I called it here: every unknown unknown, documented before development begins. Risks are ordered by severity.
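A risk register ordered by severity can be pictured as a small sorted table. The Python sketch below is illustrative only: the severity scale, the field names, and the severities assigned to the sample risks are assumptions, while the risk titles paraphrase examples from the video.

```python
from dataclasses import dataclass

# Assumed severity scale; the transcript only says risks are ordered by severity.
SEVERITY_ORDER = {"critical": 0, "high": 1, "medium": 2, "low": 3}


@dataclass(frozen=True)
class Risk:
    title: str
    severity: str
    mitigation: str = ""


def ordered(risks: list[Risk]) -> list[Risk]:
    """Return risks sorted from most to least severe (stable for ties)."""
    return sorted(risks, key=lambda r: SEVERITY_ORDER[r.severity])


def to_markdown(risks: list[Risk]) -> str:
    """Render the register as a Markdown table, most severe risk first."""
    rows = [
        "| # | Risk | Severity | Mitigation |",
        "|---|------|----------|------------|",
    ]
    for i, r in enumerate(ordered(risks), start=1):
        rows.append(f"| {i} | {r.title} | {r.severity} | {r.mitigation} |")
    return "\n".join(rows)


if __name__ == "__main__":
    # Severities below are invented for the demo.
    register = [
        Risk("Package maturity and device compatibility", "medium"),
        Risk("Llama 3.2 3B exceeds iOS memory limits on older devices", "critical"),
        Risk("On-device transcription performance", "high"),
    ]
    print(to_markdown(register))
```

Keeping the register as structured data and rendering it to Markdown means the ordering never drifts when a new risk is added.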
You might have an idea for an app and think, "Oh yeah, this is going to be easy to implement. I'll just spin up Claude Code, prompt back and forth, and after a few hours I will have a new app." This might be possible for small projects. But especially if you have an idea that is a little more complex and that you want to grow to a certain user base, you probably don't know about certain problems that you could face. For example, for my app, the first risk could be that Llama 3.2 3B exceeds iOS memory limits on older devices. Of course, this is one of the more obvious risks, but there are also others, like the performance of the transcription, or RunAnywhere package maturity, where the package may not be compatible with certain devices, or whatever it may be. So this is really another point of control, so to speak, that you have before you start the process, so that you don't set yourself up for failure later down the line.
Now that we are done with the main video, I quickly want to touch on a few more things. Is this a perfect workflow? No. Is this how senior engineers work with AI? Probably also no. They would probably think about the problem from a different perspective. They would have a main agent or a core agent that is able to spin up multiple sub-agents, and those agents then fulfill their assigned tasks and report back to the main agent. At this current stage of my personal AI coding journey, this is not what I want. I want to remain in the driver's seat. I want to have control over every step of the development process. I'm still learning how to code myself. I'm still gathering experience, and I still want to be able to solve the bugs AI creates by myself, in case I don't have access to AI. That does not mean that I'm not already working on an extension of this workflow. In fact, in a few days or weeks, I will upload another video about the Pi AI agent harness.
Essentially, it's similar to Claude Code, but you get a lot more flexibility. I will talk about it in that later video, as I've said. This new tool will require me to update my current workflow, so stay tuned. However, this video is targeted towards an earlier version of myself. This is knowledge that I would have wanted when I started out with this whole AI coding spiel, so that I wouldn't run into issues like going back and forth with the agent and having it forget certain things that we discussed earlier in the conversation. That being said, I hope you gained a little bit of value from the video, and I'll see you in the next one.