# Celaya Solutions Research Lab

> Celaya Solutions Research Lab (CSR) is an independent LLM and applied intelligence research lab in El Paso, Texas. CSR builds local-first applied intelligence systems, large language model research instruments, multi-agent orchestration, archival intelligence, and industrial applied intelligence for manufacturing, data centers, maquiladoras, and the Ciudad Juarez / El Paso border region (the Paso del Norte Borderland).

This is the full-text companion to llms.txt. It concatenates the substantive written content of the site so an AI agent can read it in one request. Section order matches llms.txt. For the canonical, link-first index see https://celayasolutions.com/llms.txt.

## Research instruments

### CLOS
https://celayasolutions.com/research/clos

A 37-agent cognitive operating system for modeling distributed cognition.

### Trinidad
https://celayasolutions.com/research/trinidad

Archival intelligence over 22,000 Texas historical records (1780 to 1900), deployed at trinidad.town.

### DRAMATURG
https://celayasolutions.com/research/dramaturg

Generates provenance-enforced screenplays from the Trinidad historical corpus.

### CORTEX
https://celayasolutions.com/research/cortex

A 14-agent manufacturing intelligence platform for industrial applied intelligence and maquiladora LLM workflows.

### EPPE
https://celayasolutions.com/research/eppe

The El Paso Proof Engine, a civic blockchain notarization instrument for public accountability.

### VERDICT
https://celayasolutions.com/research/verdict

A 4-agent legal intelligence instrument with JUDGE, CLERK, WITNESS, and COUNSEL roles.

### MORTEM
https://celayasolutions.com/research/mortem

A real-time biometric streaming and audit ledger instrument.

### AXON
https://celayasolutions.com/research/axon

An autonomous agent with Earned Autonomy Architecture, a behavioral backbone, and a blockchain audit trail.

### GALEN
https://celayasolutions.com/research/galen

A biomedical literature retrieval system built as a research instrument.

### RECALL
https://celayasolutions.com/research/recall

A semantic-search RAG instrument for retrieval-augmented knowledge workflows.

### Local-First AI
https://celayasolutions.com/research/local-first-ai

Local-first applied intelligence architecture keeping inference, routing, and evidence handling close to the data and operator.

## Research papers

### CCP-Scholar-v1
https://celayasolutions.com/research/CCP-Scholar-v1 (PDF: https://celayasolutions.com/research/CCP-Scholar-v1.pdf)

Celaya Chain Protocol scholar paper.

### EAA-Scholar-v1
https://celayasolutions.com/research/EAA-Scholar-v1 (PDF: https://celayasolutions.com/research/EAA-Scholar-v1.pdf)

Earned Autonomy Architecture scholar paper.

## Technical notes

### Meet the Lab (2026-06-17)
https://celayasolutions.com/notes/meet-the-lab

## The short version

- This is a research lab. We build instruments: tools made to test an idea and learn something true.
- We work across many areas: documents, health signals, public records, factories, and the design of AI itself.
- We have built dozens of instruments. Below are a few flagship ones.
- These notes come from this lab. We build the same way we teach here: test it, measure it, write it down.

## What the lab is

We are Celaya Solutions Research (CSR), an independent lab based in El Paso, Texas. We did not start in software. We spent years in heavy industry: running data centers and building high-voltage equipment. That work taught us one thing that shapes everything we do now.

The same patterns show up everywhere. In wiring, in sound, in code, and in how a mind sorts the world. Once you see a pattern in one place, you start to see it in all of them.

So we build instruments, not products. An instrument is a tool made to learn something, not to sell something. We aim for surprise.
We want to find out what is true, even when the safe move would be to ship something simple.

One method runs through all of it. Treat the work like a lab. Form a guess. Test it. Measure what happens. Write it down so anyone can check it. That is the same method these notes teach you.

## A few flagship instruments

These are a handful of the tools we have built. Each one is its own project.

RECALL. Document intelligence. It reads a large pile of documents, then lets you ask plain questions and get answers that point back to the source. We proved it on a public archive of more than 22,000 Texas history documents. It ran at almost no cost, and it produced a receipt that shows the records were never changed.

EPPE, the El Paso Proof Engine. Civic proof. It takes an everyday record, like a utility bill, and stamps it onto a public ledger. Later, anyone can prove the record is real and was not changed. It is built for plain accountability, starting at home.

MORTEM. Live body data. It streams heart and body signals onto a dashboard in real time. It can also hand that data off in the same format hospitals use, so the readings are not locked inside one app.

CORTEX. Factory intelligence. A team of software agents that work together to make sense of how a plant runs. It came straight out of our years on the factory floor.

CLOS. A cognitive operating system. A larger group of agents that run as one system. It is our test bed for a hard question: how do many small parts add up to something that thinks?

BYTEFRAME. A lean AI engine. A new way to build a language model that is small enough to run on your own machine, not just in a giant data center. It is part of how we study running AI in private, on hardware you control.

These six are a sample. The lab runs dozens more. The wide range is on purpose. Working across many fields is how we find the patterns that tie them together.

## Where to begin

New here? Start with the note on token cost.
It covers the basics of using AI without wasting money. After that, the notes build in order: writing better prompts, picking the right model, running your own private model, and building systems that do many steps at once.

Every note ends with steps you can run yourself. That is the promise, and it is the whole reason this lab writes anything down.

---

### Lab Notes (2026-06-16)
https://celayasolutions.com/notes/lab-notes

## Introduction

Welcome to the lab notes. These are short, plain guides to using AI well. Not magic tricks. A method.

Most AI advice online is folklore. Someone says "try this prompt, it works great," and you have no way to know if that is true for you. We do it differently. We treat using AI like lab work. We test things. We measure them. We write down the steps and the numbers so you can repeat them and see for yourself.

You do not need a degree to read these. We write at about an 8th grade level on purpose. Plain words, real examples, and the math shown.

How to use this page. If you are new, read the notes in order. Each one builds on the last. If you need one thing, jump straight to it. Every note stands on its own and ends with a "Try it yourself" so you can run it.

## Lab principles

These are the rules we hold ourselves to. They also make a good way for you to work.

1. We test, we do not guess. We never just tell you something works. We show you how to check it.
2. Every note can be repeated. If we claim a result, we give you the steps and the numbers to get the same one.
3. Measure before you trust. Keep a few test cases you can run again. That is how you know a change really helped.
4. Change one thing at a time. It is the only way to learn what actually made the difference.
5. Start small and cheap. Use the smallest model and the shortest prompt that does the job. Step up only when you must.
6. The right context beats more context. A few useful details help more than a wall of text.
7. Write it down.
A result you did not record is a result you cannot repeat. Keep a lab notebook.
8. Be honest about limits. We say what we know, what we do not, and what might change.
9. Protect privacy. We keep people's names and private data out of public notes.
10. Facts go stale. Prices and tools change. Always check the source for today's number.

## Where this is going

More notes will land here over time. Each one is the same method pointed at a new piece: writing better prompts, picking the right model, running your own private model on your own machine, and building systems that do many steps at once.

The goal is simple, and a little strange for a lab to admit. We want you to stop needing us. When you can test your own AI work and trust the result, you are doing real lab work. That is the whole point.

This is an open lab. Try the steps. Push on them. If something breaks, or you find a better way, we want to hear it.

## Thanks for reading

Your attention is the rarest thing you own. Thank you for spending some of it here. We will keep these notes honest, plain, and worth your time. That is the deal.

---

### Tokens, Dollars, and Why Long Chats Get Expensive (2026-06-15)
https://celayasolutions.com/notes/token-cost-efficiency

## The short version

- An AI charges you by the token. A token is a small chunk of text. Roughly 4 letters, or about three quarters of a word.
- You pay for input (what you send) and output (what you get back). Output costs more.
- Every new message in a chat re-sends the whole chat so far. So a long chat costs more on every turn, and it can also get less accurate.
- Match the model to the job. Use a small, cheap model for simple work. Save the big model for hard problems.
- A clear question with the right context up front beats a long back-and-forth. Good questions save money.

## What a token is

A token is a piece of a word. Short common words like "lab" are often one token. Longer words get split into two or three.
In English, about 1,000 tokens is about 750 words. A short paragraph is around 100 tokens.

This matters because the model does not count words. It counts tokens. The bill counts tokens too. So when we talk about cost, we talk about tokens.

A rough way to guess: take your word count and multiply by 1.33. That gives you a close-enough token count.

## How tokens become dollars

There are two prices, and they are different:

- Input: the text you send (your question, your files, the chat history).
- Output: the text the model sends back.

Prices are listed per 1,000,000 tokens. People write that as "per million." Here is the formula:

cost = (input tokens / 1,000,000 x input price) + (output tokens / 1,000,000 x output price)

Let us do a real one. As of June 2026, Anthropic lists Claude Sonnet at $3 per million input tokens and $15 per million output tokens. (Always check the provider's pricing page for today's number. Prices change.)

Say you send 2,000 tokens and get back 1,000 tokens:

- Input: 2,000 / 1,000,000 x $3 = $0.006
- Output: 1,000 / 1,000,000 x $15 = $0.015
- Total: about $0.021, or 2 cents.

One chat is cheap. The trouble starts when you run a thousand chats, or one very long chat. Small numbers add up fast.

Notice one thing in those prices. Output costs 5 times more than input. That is true across the current models. So long, rambling answers cost more than short ones. If you only need the answer, ask for just the answer.

## Why long chats cost more and get worse

This is the part most people miss, so read it twice.

The model has no memory of its own between turns. Each time you send a new message, the model re-reads the entire chat from the top. The whole history goes back in as input, every single time.

So the chat grows like a snowball. Turn 1 might send 300 tokens. By turn 30, you might be sending 15,000 tokens with every message, because the bill now includes everything you said before.

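
The snowball can be sketched in a few lines of Python. This is a rough model, not a billing tool. It assumes every question is 300 tokens and every answer is 200 tokens (made-up sizes), and it uses the Sonnet prices quoted above. The names `turn_cost` and `chat_costs` are just labels for this sketch.

```python
# Sonnet prices quoted in this note (June 2026). Check the provider's page.
INPUT_PRICE = 3.00    # dollars per million input tokens
OUTPUT_PRICE = 15.00  # dollars per million output tokens


def turn_cost(input_tokens, output_tokens,
              input_price=INPUT_PRICE, output_price=OUTPUT_PRICE):
    """Dollar cost of one turn, using the formula from this note."""
    return (input_tokens / 1_000_000 * input_price
            + output_tokens / 1_000_000 * output_price)


def chat_costs(turns, question_tokens, answer_tokens):
    """Per-turn costs for a chat where each turn re-sends the history.

    Assumption for illustration: every question and every answer
    is the same size.
    """
    history = 0
    costs = []
    for _ in range(turns):
        input_tokens = history + question_tokens  # old chat + new question
        costs.append(turn_cost(input_tokens, answer_tokens))
        history = input_tokens + answer_tokens    # the answer joins the history
    return costs


costs = chat_costs(turns=30, question_tokens=300, answer_tokens=200)
print(f"turn 1:     ${costs[0]:.4f}")
print(f"turn 30:    ${costs[-1]:.4f}")
print(f"whole chat: ${sum(costs):.2f}")
```

Change the sizes or the prices and run it again. The last turn costs many times more than the first, even though the question is the same size.
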
Here is the same question in a short chat and in a long one, using the Sonnet prices above.

Short, fresh chat. Your question is 1,000 tokens. The answer is 500 tokens.

- (1,000 / 1,000,000 x $3) + (500 / 1,000,000 x $15) = $0.003 + $0.0075 = about $0.011 (1 cent).

Long chat. The history is now 15,000 tokens. You add the same 1,000-token question. The answer is still 500 tokens.

- (16,000 / 1,000,000 x $3) + (500 / 1,000,000 x $15) = $0.048 + $0.0075 = about $0.056 (5 to 6 cents).

Same question. About 5 times the cost. And you pay that bloated input on every later turn, not just once.

It is not only about money. A long chat that wanders across many topics also gets less sharp. The important stuff gets buried in old, off-topic text. The model has to split its attention across all of it. It can also get stuck on something you said early on that no longer fits what you need now.

Two problems, one fix. When your goal changes, start a new chat. Carry over only the short summary or the file you actually need. Not the whole transcript.

Rule of thumb: one objective per chat. New objective, new chat. Name the chat after its objective so you can find it later.

## Which model for which job

Think in three tiers. The names below are current Claude models, but the idea works anywhere.

- Small and fast (cheapest). Example: a Haiku-class model, about $1 per million input. Good for simple, clear, high-volume work: sorting text, pulling out fields, simple formatting, quick answers.
- Middle (balanced). Example: a Sonnet-class model, about $3 per million input. A good default for most real work.
- Large (priciest). Example: an Opus-class model, about $5 per million input. Save it for the hard stuff: tricky reasoning, messy problems, and answers where a mistake costs more than the tokens.

The move is simple. Start with the cheaper model. Step up only when it fails your own check. Do not reach for the biggest model out of habit. That habit is how bills balloon.

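
Here is one way to sketch that move in Python. The three "models" are stand-in functions with canned answers, so the code runs without any account or API. In real use, each one would call a real model, and `check` would be your own test. All names here are made up for the example.

```python
def first_passing_tier(task, tiers, check):
    """Try tiers from cheapest to priciest. Return the first that passes.

    tiers: list of (name, run) pairs, cheapest first.
           run(task) returns an answer.
    check: your own test. Returns True if the answer is good enough.
    """
    for name, run in tiers:
        answer = run(task)
        if check(answer):
            return name, answer
    return None, None  # nothing passed; rethink the task or the check


# Stand-in models with canned answers (hypothetical, for illustration only).
def small_model(task):
    return "I am not sure."


def middle_model(task):
    return "42"


def large_model(task):
    return "42"


tiers = [
    ("small", small_model),
    ("middle", middle_model),
    ("large", large_model),
]

name, answer = first_passing_tier(
    "What is 6 x 7?",
    tiers,
    check=lambda a: a.strip() == "42",
)
print(name, answer)
```

The large model never runs here, because the middle one already passed the check. That is the saving.
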
There is a stronger version of this idea. A cheap model with great context often beats an expensive model with poor context. The next section is about that.

## Ask a better question, give better context

A vague question gets a vague answer. Then you spend 5 more turns fixing it. Each turn costs tokens and time. A clear question with the right context can get the answer in one turn. That is cheaper and faster.

Give the model what it needs to win:

- The goal. What you are trying to do.
- The limits. What it must or must not do.
- The format. What you want the answer to look like.
- An example, if you have one.

This up-front context is leverage. A little more input buys a lot more output quality, and usually fewer total turns. That is the trade you want.

But more is not always better. The right context beats the most context. Dump in too much off-topic text and you are back to the long-chat problem: higher cost, lower focus. Aim for relevant, not huge.

## More ways to save

- Say how long. "Keep it short" or "just the code, no explanation" cuts output tokens. Output is the pricey part.
- Reuse a big context with caching. Some providers let you cache a long document or a set of instructions so you do not pay full price to send it again every turn. When you reuse the same big context a lot, this can cut most of that cost.
- Batch the non-urgent jobs. Some providers offer a lower rate (often about half) for work you can wait on.
- Count the cost of a redo. A cheap answer that is wrong, and makes you redo the work, was not cheap. Price the mistakes, not just the tokens.

## What it means

- Tokens are money. Input and output both cost. Output costs more.
- Long chats cost more on every turn and can get less accurate. Reset when the goal changes.
- Start with the cheaper model. Step up only when it fails your check.
- Spend your effort on the question and the context, not on more turns.
- Always check today's prices on the provider's page.

## Try it yourself

1. Find your model's input and output prices, per million tokens.
2. Take a recent chat. Guess its tokens: word count x 1.33.
3. Put the numbers into the formula: cost = (input / 1,000,000 x input price) + (output / 1,000,000 x output price).
4. Now do it twice for the same task: once in a long chat, once in a short fresh chat. Compare the cost. You will usually find the fresh chat is cheaper, and often the answer is just as good or better.
5. Bonus: run the same task on a small model and a big model. Can you really tell the difference for that task? If not, the small model wins.
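
The steps above can be sketched in Python. The word counts are made up, and the prices are the June 2026 Sonnet numbers from this note, so swap in your own. The 1.33 multiplier is the rough guess from step 2, not an exact tokenizer.

```python
# Step 1: your model's prices, in dollars per million tokens.
INPUT_PRICE = 3.00
OUTPUT_PRICE = 15.00


def estimate_tokens(word_count):
    """Step 2: rough token guess, word count x 1.33."""
    return round(word_count * 1.33)


def cost(input_tokens, output_tokens):
    """Step 3: the cost formula from this note."""
    return (input_tokens / 1_000_000 * INPUT_PRICE
            + output_tokens / 1_000_000 * OUTPUT_PRICE)


# Step 4: the same task asked two ways (made-up word counts).
question_words = 200
answer_words = 300
history_words = 6_000  # old chat that gets re-sent in the long version

fresh_cost = cost(estimate_tokens(question_words),
                  estimate_tokens(answer_words))
long_cost = cost(estimate_tokens(history_words + question_words),
                 estimate_tokens(answer_words))

print(f"fresh chat: ${fresh_cost:.4f}")
print(f"long chat:  ${long_cost:.4f}")
print(f"long chat costs {long_cost / fresh_cost:.1f}x the fresh chat")
```

Write the numbers in your lab notebook, then run it again after prices change.
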