<?xml version="1.0" encoding="utf-8"?><feed xmlns="http://www.w3.org/2005/Atom" ><generator uri="https://jekyllrb.com/" version="4.4.1">Jekyll</generator><link href="https://sinaptia.dev/feed.xml" rel="self" type="application/atom+xml" /><link href="https://sinaptia.dev/" rel="alternate" type="text/html" /><updated>2026-04-09T17:17:11+00:00</updated><id>https://sinaptia.dev/feed.xml</id><title type="html">SINAPTIA</title><author><name>SINAPTIA</name></author><entry><title type="html">Ruby Argentina March meetup</title><link href="https://sinaptia.dev/posts/ruby-argentina-march-2026-meetup" rel="alternate" type="text/html" title="Ruby Argentina March meetup" /><published>2026-03-31T00:00:00+00:00</published><updated>2026-03-31T00:00:00+00:00</updated><id>https://sinaptia.dev/posts/ruby-argentina-march-2026-meetup</id><content type="html" xml:base="https://sinaptia.dev/posts/ruby-argentina-march-2026-meetup"><![CDATA[<p><a href="https://ruby.com.ar/">Ruby Argentina</a> opened its 2026 meetup season at Eryx’s office in Buenos Aires, and the night set the tone quickly: pay attention to Ruby’s warnings, then make room for a few lightning talks.</p>

<p>The event kicked off with Ariel Juodziukynas’ talk <strong>“Warnings en Ruby: NO LOS IGNORES!!!”</strong> (“Warnings in Ruby: DON’T IGNORE THEM!!!”). Ariel focused on a part of everyday Ruby work that’s easy to ignore: warnings. His talk was a reminder that the yellow text in our terminal often points to real problems, and he showed practical examples of common warnings and how to fix them.</p>

<p align="center" width="100%">
  <img class="w-[70%]" alt="Ariel's talk" src="/assets/images/posts/ruby-argentina-march-2026-meetup/1.webp" />
</p>

<p>After Ariel’s talk, we took a break to catch up, meet new people, and keep the conversations going over drinks and empanadas.</p>

<p>The lightning talks took the night in a looser direction:</p>

<ul>
  <li><strong>Simon</strong> shared how he fixed a tricky performance issue with a fulltext search, walking us through his debugging process and the solution he found.</li>
  <li><strong>Viktor</strong> presented <strong>aj</strong>, a tool he built to improve his AI workflow, showing how Ruby developers can build utilities to streamline their daily work.</li>
  <li><strong>Santiago</strong> taught us “how to do juggling”. Yes, actual juggling with balls, not code.</li>
  <li><strong>Ariel</strong> came back for a second round with “how to tie your shoes”. It was even better watching everyone try Ariel’s method on the spot.</li>
</ul>

<p align="center" width="100%">
  <img class="w-[70%]" alt="Simon's talk" src="/assets/images/posts/ruby-argentina-march-2026-meetup/2.webp" />
</p>

<p>Thanks to the organizers, the sponsors (<a href="https://sinaptia.dev/">SINAPTIA</a>, Rootstrap, Ombulabs, and Eryx), and everyone who showed up. We’re already looking forward to April’s online meetup.</p>]]></content><author><name>SINAPTIA</name></author><category term="Ruby" /><category term="Community" /><summary type="html"><![CDATA[Ruby Argentina opened its 2026 meetup season with a night that went from Ruby warnings to juggling lessons.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://sinaptia.dev/assets/images/logo-black.png" /><media:content medium="image" url="https://sinaptia.dev/assets/images/logo-black.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">NO AI CODE IN PRODUCTION DIRECTIVE</title><link href="https://sinaptia.dev/posts/no-ai-code-in-production-directive" rel="alternate" type="text/html" title="NO AI CODE IN PRODUCTION DIRECTIVE" /><published>2026-03-17T00:00:00+00:00</published><updated>2026-03-17T00:00:00+00:00</updated><id>https://sinaptia.dev/posts/no-ai-code-in-production-directive</id><content type="html" xml:base="https://sinaptia.dev/posts/no-ai-code-in-production-directive"><![CDATA[<p>At SINAPTIA, we started enforcing a no “AI code in production” directive.</p>

<p>LOL, no. We are not the next tier of Luddites! We started using AI-assisted code generation a little over a year ago, and from the looks of it, we are going to use it even more in the near future. AI is here to stay, and programming will never be the same, what a time to be alive and so on, and so on…</p>

<p>But also, in a couple of years, we’ll all lose our jobs. <em>What a time to be alive, indeed.</em></p>

<p>We continually ask ourselves: what is AI capable of? Can it really replace us? Can it change the way we work? Will it make our problems simpler? Or will it create new kinds of problems, even more problems? Are we witnessing the death of software engineering?</p>

<p>There’s only one way to find out.</p>

<p>Initially, we were somewhat skeptical about coding agents. We tried them, but the autocomplete feature was usually annoying, and the chat mode wasn’t smart enough to understand the context of the code. But then, at some point, Claude Code and Cursor became smart enough to be of massive help while building features: they could explain things easily in the context they were in, and they could edit text nicely.</p>

<p>Since then, LLMs and coding agents have become a key part of our daily routine. We started building more and more <a href="https://sinaptia.dev/blog/tags/ai">features using AI</a> and growing tools for the RubyLLM ecosystem (such as <a href="https://github.com/sinaptia/ruby_llm-instrumentation">RubyLLM::Instrumentation</a>, <a href="https://github.com/sinaptia/ruby_llm-monitoring">RubyLLM::Monitoring</a>, and <a href="https://github.com/sinaptia/ruby_llm-evals">RubyLLM::Evals</a>) that we use in our projects and have open-sourced for the larger community, in the hope of making Ruby one of the top languages for building with AI.</p>

<h2 id="the-problem-with-ai-generated-code">The problem with AI-generated code</h2>

<p>I bet at some point you stumbled upon a piece of code that achieved something, but was hard to understand, poorly thought out, or just ugly (e.g., for us Rubyists, Ruby code that doesn’t feel like Ruby). That happened a lot in the StackOverflow era.</p>

<p>With AI-generated code, it’s the same as with any code you didn’t write yourself. You might find AI-generated code hard to understand, poorly thought out, or even solving problems that no one asked to solve.</p>

<p>We’ve seen coding agents working without oversight and proper feedback go down paths that ended in code no one would be able to manage: neither the coding agents nor the humans. Discarding and regenerating is a possibility, yes, but tokens are not actually free, and budgeting and financials are not something coding agents can fix either.</p>

<p>Models and agents will become better over time, and the barrier might be farther away each time, but the problem will always be there.</p>

<h2 id="the-hard-parts">The hard parts</h2>

<p>After more than a year of using coding agents for our daily work, developing AI-powered features, and running several experiments, we feel that AI is truly a multiplier force, but it isn’t changing the most fundamental bits of programming and software production: the hard parts of software production are still hard.</p>

<p>The power of solving a complex problem with a simple and elegant solution/architecture is what makes a good engineer a great one. AI can help one do so, but it is not able to come up with such architectural decisions on its own (at least not yet).</p>

<p>But in the areas where AI excels, we should try to leverage it. I honestly could not care less if you wrote the code we just deployed to production by punching holes in a card, tapping keys on a keyboard letter by letter, copy-pasting snippets from StackOverflow, or using a high-rate probabilistic word predictor that can produce hundreds of words per second. Tools are not on trial here. But, regardless of how that code came to life, I do care that you don’t fall prey to the laziness of not curating the code you produce, of not validating and understanding it. (There are many ways to do this, and they are changing, but isn’t this what engineering is about?)</p>

<p>Striving for quality and good architectural decisions is still central. Simplicity is still the only way software remains workable in the long run. For humans, yes, but for AI agents too.</p>

<p>Simple was hard in the pre-AI era and is still hard today (perhaps even harder), but what’s simpler for humans is also simpler for agents. And it is still worth all the effort.</p>

<p>So we say: There is no such thing as AI code. We just have <em>code</em>: good, bad, simple, or complex. And we have tools and processes to deal with it. Like we always did.</p>

<p>So, long live power tools. Long live software engineering.</p>

<hr />

<p><em>At SINAPTIA, <a href="/posts/building-intelligent-applications-with-rails">we specialize in helping businesses implement AI solutions</a> that deliver real value. If you’re facing challenges with prompt engineering or AI integration, we’d love to help.</em></p>]]></content><author><name>Fernando Martinez</name></author><category term="AI" /><summary type="html"><![CDATA[AI is here to stay. Programming will never be the same, or... will it?]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://sinaptia.dev/assets/images/logo-black.png" /><media:content medium="image" url="https://sinaptia.dev/assets/images/logo-black.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">Storing multi-valued enum fields in ActiveRecord</title><link href="https://sinaptia.dev/posts/storing-multi-valued-enum-fields-in-activerecord" rel="alternate" type="text/html" title="Storing multi-valued enum fields in ActiveRecord" /><published>2026-03-03T00:00:00+00:00</published><updated>2026-03-03T00:00:00+00:00</updated><id>https://sinaptia.dev/posts/storing-multi-valued-enum-fields-in-activerecord</id><content type="html" xml:base="https://sinaptia.dev/posts/storing-multi-valued-enum-fields-in-activerecord"><![CDATA[<p>A few weeks ago, we ran into an interesting problem in one of our projects. We had a <code class="language-plaintext highlighter-rouge">reports</code> table with a <code class="language-plaintext highlighter-rouge">reason</code> column that used the classic Rails enum approach:</p>

<div class="language-ruby highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1"># db/migrate/XXXXXXXXXXXXXX_create_reports.rb</span>
<span class="k">class</span> <span class="nc">CreateReports</span> <span class="o">&lt;</span> <span class="no">ActiveRecord</span><span class="o">::</span><span class="no">Migration</span><span class="p">[</span><span class="mf">7.1</span><span class="p">]</span>
  <span class="k">def</span> <span class="nf">change</span>
    <span class="n">create_table</span> <span class="ss">:reports</span> <span class="k">do</span> <span class="o">|</span><span class="n">t</span><span class="o">|</span>
      <span class="n">t</span><span class="p">.</span><span class="nf">string</span> <span class="ss">:title</span>
      <span class="n">t</span><span class="p">.</span><span class="nf">integer</span> <span class="ss">:reason</span><span class="p">,</span> <span class="ss">default: </span><span class="mi">0</span><span class="p">,</span> <span class="ss">null: </span><span class="kp">false</span>
    <span class="k">end</span>
  <span class="k">end</span>
<span class="k">end</span>

<span class="c1"># app/models/report.rb</span>
<span class="k">class</span> <span class="nc">Report</span> <span class="o">&lt;</span> <span class="no">ApplicationRecord</span>
  <span class="n">enum</span> <span class="ss">:reason</span><span class="p">,</span> <span class="p">{</span>
    <span class="ss">spam: </span><span class="mi">0</span><span class="p">,</span>
    <span class="ss">harassment: </span><span class="mi">1</span><span class="p">,</span>
    <span class="ss">inappropriate_content: </span><span class="mi">2</span><span class="p">,</span>
    <span class="ss">copyright: </span><span class="mi">3</span><span class="p">,</span>
    <span class="ss">misinformation: </span><span class="mi">4</span>
  <span class="p">},</span> <span class="ss">prefix: </span><span class="kp">true</span>
<span class="k">end</span>
</code></pre></div></div>

<p>Everything worked fine until the client requested the ability to select multiple reasons for a report. A user could report content for being spam and containing misinformation. The set of possible reasons is fixed and defined by developers.</p>

<p>We tried four approaches.</p>

<h2 id="four-ways-to-solve-it">Four ways to solve it</h2>

<ul>
  <li><strong>Bitwise Operations</strong>: store multiple values in a single integer using bit-level flags.</li>
  <li><strong>PostgreSQL Array</strong>: use native PostgreSQL array columns with enum-like syntax.</li>
  <li><strong>JSONB</strong>: store reasons as a JSON array inside a JSONB column.</li>
  <li><strong>HABTM</strong>: the classic many-to-many approach with a join table.</li>
</ul>

<h3 id="1-bitwise-operations">1. Bitwise Operations</h3>

<p>This approach uses bit-level operations to store multiple values in a single integer. Each reason occupies a specific bit. The <a href="https://github.com/kenn/active_flag">active_flag</a> gem provides a clean DSL for this:</p>

<div class="language-ruby highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1"># db/migrate/XXXXXXXXXXXXXX_add_reasons_to_reports.rb</span>
<span class="k">class</span> <span class="nc">AddReasonsToReports</span> <span class="o">&lt;</span> <span class="no">ActiveRecord</span><span class="o">::</span><span class="no">Migration</span><span class="p">[</span><span class="mf">7.1</span><span class="p">]</span>
  <span class="k">def</span> <span class="nf">change</span>
    <span class="n">add_column</span> <span class="ss">:reports</span><span class="p">,</span> <span class="ss">:reasons</span><span class="p">,</span> <span class="ss">:bigint</span><span class="p">,</span> <span class="ss">default: </span><span class="mi">0</span><span class="p">,</span> <span class="ss">null: </span><span class="kp">false</span>

    <span class="c1"># migrate data to new format</span>

    <span class="n">remove_column</span> <span class="ss">:reports</span><span class="p">,</span> <span class="ss">:reason</span>
  <span class="k">end</span>
<span class="k">end</span>

<span class="c1"># app/models/report.rb</span>
<span class="k">class</span> <span class="nc">Report</span> <span class="o">&lt;</span> <span class="no">ApplicationRecord</span>
  <span class="n">flag</span> <span class="ss">:reasons</span><span class="p">,</span> <span class="p">[</span><span class="ss">:spam</span><span class="p">,</span> <span class="ss">:harassment</span><span class="p">,</span> <span class="ss">:inappropriate_content</span><span class="p">,</span> <span class="ss">:copyright</span><span class="p">,</span> <span class="ss">:misinformation</span><span class="p">]</span>
<span class="k">end</span>

<span class="c1"># Usage</span>
<span class="n">report</span> <span class="o">=</span> <span class="no">Report</span><span class="p">.</span><span class="nf">new</span><span class="p">(</span><span class="ss">reasons: </span><span class="p">[</span><span class="ss">:spam</span><span class="p">,</span> <span class="ss">:misinformation</span><span class="p">])</span>
<span class="n">report</span><span class="p">.</span><span class="nf">reasons</span> <span class="c1"># =&gt; [:spam, :misinformation]</span>
<span class="n">report</span><span class="p">.</span><span class="nf">reasons</span><span class="p">.</span><span class="nf">spam?</span>   <span class="c1"># =&gt; true</span>
<span class="n">report</span><span class="p">.</span><span class="nf">reasons</span> <span class="o">=</span> <span class="p">[</span><span class="ss">:spam</span><span class="p">]</span>
<span class="n">report</span><span class="p">.</span><span class="nf">save!</span>

<span class="c1"># Read: check if spam is included</span>
<span class="n">report</span><span class="p">.</span><span class="nf">reasons</span><span class="p">.</span><span class="nf">spam?</span> <span class="c1"># =&gt; true</span>

<span class="c1"># Validation: invalid values raise ArgumentError</span>
<span class="n">report</span><span class="p">.</span><span class="nf">reasons</span> <span class="o">=</span> <span class="p">[</span><span class="ss">:invalid</span><span class="p">]</span> <span class="c1"># =&gt; raises ArgumentError</span>

<span class="c1"># Query: find all reports with spam AND misinformation</span>
<span class="no">Report</span><span class="p">.</span><span class="nf">where_reasons</span><span class="p">(</span><span class="ss">:spam</span><span class="p">,</span> <span class="ss">:misinformation</span><span class="p">)</span>
</code></pre></div></div>

<p>The benefits here are clear: extremely compact storage, very fast boolean checks, and no GIN index required, just a standard integer index. But the trade-offs become apparent quickly. Database values are inexpressive (what does value 4 mean?). You’re limited to 64 values with <code class="language-plaintext highlighter-rouge">bigint</code>. And query operations are less intuitive than standard ActiveRecord.</p>
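<p>To make “inexpressive database values” concrete, here is a minimal plain-Ruby sketch of how bit flags map reasons to a single integer. The <code>pack</code>/<code>unpack</code> helpers are our own illustration, not the gem’s internals:</p>

```ruby
# Illustrative bit-flag packing: each reason owns one bit position.
REASONS = [:spam, :harassment, :inappropriate_content, :copyright, :misinformation]

# Combine reasons into a single integer by setting their bits
def pack(reasons)
  reasons.sum { |r| 1 << REASONS.index(r) }
end

# Recover the reason list by testing each bit
def unpack(int)
  REASONS.select.with_index { |_, i| int[i] == 1 }
end

pack([:inappropriate_content]) # => 4 (bit 2 set: the "mystery value 4")
pack([:spam, :misinformation]) # => 17 (bits 0 and 4)
unpack(17)                     # => [:spam, :misinformation]

# "Has ALL of spam and misinformation" is a single mask check,
# which is why boolean queries on this representation are fast:
mask = pack([:spam, :misinformation])
(17 & mask) == mask # => true
```

<p>Reading a raw <code>17</code> in a database console tells you nothing unless you have the bit-position table at hand, which is exactly the inexpressiveness trade-off.</p>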

<p>Note: Most Ruby gems (including active_flag) are based on integers with a 64-bit limit. While PostgreSQL supports the bit string data type, which can store many more flags without this limitation, the Ruby ecosystem doesn’t have widely adopted gems for this approach.</p>

<h3 id="2-postgresql-array-field-multivalued-column">2. PostgreSQL Array Field (Multivalued Column)</h3>

<p>PostgreSQL has native array support. We can store reason strings directly in a string array column without any gem:</p>

<div class="language-ruby highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1"># db/migrate/XXXXXXXXXXXXXX_add_reasons_to_reports.rb</span>
<span class="k">class</span> <span class="nc">AddReasonsToReports</span> <span class="o">&lt;</span> <span class="no">ActiveRecord</span><span class="o">::</span><span class="no">Migration</span><span class="p">[</span><span class="mf">7.1</span><span class="p">]</span>
  <span class="k">def</span> <span class="nf">change</span>
    <span class="n">add_column</span> <span class="ss">:reports</span><span class="p">,</span> <span class="ss">:reasons</span><span class="p">,</span> <span class="ss">:string</span><span class="p">,</span> <span class="ss">array: </span><span class="kp">true</span><span class="p">,</span> <span class="ss">default: </span><span class="p">[]</span>
    <span class="n">add_index</span> <span class="ss">:reports</span><span class="p">,</span> <span class="ss">:reasons</span><span class="p">,</span> <span class="ss">using: </span><span class="s1">'gin'</span>

    <span class="c1"># migrate data to new format</span>

    <span class="n">remove_column</span> <span class="ss">:reports</span><span class="p">,</span> <span class="ss">:reason</span>
  <span class="k">end</span>
<span class="k">end</span>

<span class="c1"># app/models/report.rb</span>
<span class="k">class</span> <span class="nc">Report</span> <span class="o">&lt;</span> <span class="no">ApplicationRecord</span>
  <span class="no">VALID_REASONS</span> <span class="o">=</span> <span class="sx">%w[spam harassment inappropriate_content copyright misinformation]</span><span class="p">.</span><span class="nf">freeze</span>

  <span class="n">validate</span> <span class="ss">:reasons_must_be_valid</span>

  <span class="kp">private</span>

  <span class="k">def</span> <span class="nf">reasons_must_be_valid</span>
    <span class="k">return</span> <span class="k">if</span> <span class="n">reasons</span><span class="p">.</span><span class="nf">blank?</span> <span class="o">||</span> <span class="n">reasons</span><span class="p">.</span><span class="nf">all?</span> <span class="p">{</span> <span class="o">|</span><span class="n">r</span><span class="o">|</span> <span class="no">VALID_REASONS</span><span class="p">.</span><span class="nf">include?</span><span class="p">(</span><span class="n">r</span><span class="p">)</span> <span class="p">}</span>

    <span class="n">errors</span><span class="p">.</span><span class="nf">add</span><span class="p">(</span><span class="ss">:reasons</span><span class="p">,</span> <span class="s2">"contain invalid values"</span><span class="p">)</span>
  <span class="k">end</span>
<span class="k">end</span>

<span class="c1"># Usage</span>
<span class="n">report</span> <span class="o">=</span> <span class="no">Report</span><span class="p">.</span><span class="nf">create</span><span class="p">(</span><span class="ss">title: </span><span class="s2">"Report 1"</span><span class="p">,</span> <span class="ss">reasons: </span><span class="p">[</span><span class="s2">"spam"</span><span class="p">,</span> <span class="s2">"misinformation"</span><span class="p">])</span>

<span class="c1"># Read: check if spam is included</span>
<span class="n">report</span><span class="p">.</span><span class="nf">reasons</span><span class="p">.</span><span class="nf">include?</span><span class="p">(</span><span class="s2">"spam"</span><span class="p">)</span> <span class="c1"># =&gt; true</span>

<span class="c1"># Query: find all reports with spam AND misinformation</span>
<span class="no">Report</span><span class="p">.</span><span class="nf">where</span><span class="p">(</span><span class="s2">"reasons @&gt; ARRAY[?]::varchar[]"</span><span class="p">,</span> <span class="p">[</span><span class="s2">"spam"</span><span class="p">,</span> <span class="s2">"misinformation"</span><span class="p">])</span>
</code></pre></div></div>

<p>The upside: no gem dependency, human-readable values in the database, and easy to expand. The downside: it’s PostgreSQL-specific, validations require manual implementation, and it’s less familiar to developers who don’t use PostgreSQL regularly.</p>
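<p>A note on semantics: the <code>@&gt;</code> (contains) query above matches reports that have <em>all</em> of the listed reasons, while PostgreSQL’s <code>&amp;&amp;</code> (overlap) operator matches reports that have <em>any</em> of them. In plain Ruby terms, the two predicates behave like this (an illustrative sketch; the real filtering happens in SQL):</p>

```ruby
# Plain-Ruby analogues of the two PostgreSQL array predicates
contains = ->(row, wanted) { (wanted - row).empty? } # @> : row has ALL wanted values
overlap  = ->(row, wanted) { (row & wanted).any? }   # && : row has AT LEAST ONE

contains.call(["spam", "misinformation"], ["spam"])      # => true
contains.call(["spam"], ["spam", "misinformation"])      # => false
overlap.call(["spam"], ["spam", "misinformation"])       # => true
overlap.call(["harassment"], ["spam", "misinformation"]) # => false
```

<p>If you need the “any of these reasons” variant, the overlap query would mirror the contains one: <code>Report.where("reasons &amp;&amp; ARRAY[?]::varchar[]", ["spam", "misinformation"])</code>.</p>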

<h3 id="3-jsonb-multivalued-column">3. JSONB (Multivalued Column)</h3>

<p>We store reasons as a JSON array inside a JSONB column.</p>

<div class="language-ruby highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1"># db/migrate/XXXXXXXXXXXXXX_add_reasons_to_reports.rb</span>
<span class="k">class</span> <span class="nc">AddReasonsToReports</span> <span class="o">&lt;</span> <span class="no">ActiveRecord</span><span class="o">::</span><span class="no">Migration</span><span class="p">[</span><span class="mf">7.1</span><span class="p">]</span>
  <span class="k">def</span> <span class="nf">change</span>
    <span class="n">add_column</span> <span class="ss">:reports</span><span class="p">,</span> <span class="ss">:reasons</span><span class="p">,</span> <span class="ss">:jsonb</span><span class="p">,</span> <span class="ss">default: </span><span class="p">[],</span> <span class="ss">null: </span><span class="kp">false</span>
    <span class="n">add_index</span> <span class="ss">:reports</span><span class="p">,</span> <span class="ss">:reasons</span><span class="p">,</span> <span class="ss">using: </span><span class="s1">'gin'</span>

    <span class="c1"># migrate data to new format</span>

    <span class="n">remove_column</span> <span class="ss">:reports</span><span class="p">,</span> <span class="ss">:reason</span>
  <span class="k">end</span>
<span class="k">end</span>

<span class="c1"># app/models/report.rb</span>
<span class="k">class</span> <span class="nc">Report</span> <span class="o">&lt;</span> <span class="no">ApplicationRecord</span>
  <span class="no">VALID_REASONS</span> <span class="o">=</span> <span class="sx">%w[spam harassment inappropriate_content copyright misinformation]</span><span class="p">.</span><span class="nf">freeze</span>

  <span class="n">validate</span> <span class="ss">:reasons_must_be_valid</span>

  <span class="kp">private</span>

  <span class="k">def</span> <span class="nf">reasons_must_be_valid</span>
    <span class="k">return</span> <span class="k">if</span> <span class="n">reasons</span><span class="p">.</span><span class="nf">blank?</span> <span class="o">||</span> <span class="n">reasons</span><span class="p">.</span><span class="nf">all?</span> <span class="p">{</span> <span class="o">|</span><span class="n">r</span><span class="o">|</span> <span class="no">VALID_REASONS</span><span class="p">.</span><span class="nf">include?</span><span class="p">(</span><span class="n">r</span><span class="p">)</span> <span class="p">}</span>

    <span class="n">errors</span><span class="p">.</span><span class="nf">add</span><span class="p">(</span><span class="ss">:reasons</span><span class="p">,</span> <span class="s2">"contain invalid values"</span><span class="p">)</span>
  <span class="k">end</span>
<span class="k">end</span>

<span class="c1"># Usage</span>
<span class="n">report</span> <span class="o">=</span> <span class="no">Report</span><span class="p">.</span><span class="nf">create</span><span class="p">(</span><span class="ss">title: </span><span class="s2">"Report 1"</span><span class="p">,</span> <span class="ss">reasons: </span><span class="p">[</span><span class="s2">"spam"</span><span class="p">,</span> <span class="s2">"misinformation"</span><span class="p">])</span>

<span class="c1"># Read: check if spam is included</span>
<span class="n">report</span><span class="p">.</span><span class="nf">reasons</span><span class="p">.</span><span class="nf">include?</span><span class="p">(</span><span class="s2">"spam"</span><span class="p">)</span> <span class="c1"># =&gt; true</span>

<span class="c1"># Query: find all reports with spam AND misinformation</span>
<span class="no">Report</span><span class="p">.</span><span class="nf">where</span><span class="p">(</span><span class="s2">"reasons @&gt; ?"</span><span class="p">,</span> <span class="s1">'["spam", "misinformation"]'</span><span class="p">)</span>
</code></pre></div></div>

<p>The benefit is GIN indexing for fast searches. The cost: the flexible structure requires stronger validations to enforce the expected format, it’s slightly more verbose than alternatives, and there’s JSON parsing overhead in some cases.</p>

<h3 id="4-habtm-table">4. HABTM Table</h3>

<p>The classic many-to-many approach with a join table.</p>

<div class="language-ruby highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1"># db/migrate/XXXXXXXXXXXXXX_create_reasons_and_join_table.rb</span>
<span class="k">class</span> <span class="nc">CreateReasonsAndJoinTable</span> <span class="o">&lt;</span> <span class="no">ActiveRecord</span><span class="o">::</span><span class="no">Migration</span><span class="p">[</span><span class="mf">7.1</span><span class="p">]</span>
  <span class="k">def</span> <span class="nf">change</span>
    <span class="n">create_table</span> <span class="ss">:reasons</span> <span class="k">do</span> <span class="o">|</span><span class="n">t</span><span class="o">|</span>
      <span class="n">t</span><span class="p">.</span><span class="nf">string</span> <span class="ss">:name</span><span class="p">,</span> <span class="ss">null: </span><span class="kp">false</span>
      <span class="n">t</span><span class="p">.</span><span class="nf">timestamps</span>
    <span class="k">end</span>
    <span class="n">add_index</span> <span class="ss">:reasons</span><span class="p">,</span> <span class="ss">:name</span><span class="p">,</span> <span class="ss">unique: </span><span class="kp">true</span>

    <span class="n">create_join_table</span> <span class="ss">:reports</span><span class="p">,</span> <span class="ss">:reasons</span> <span class="k">do</span> <span class="o">|</span><span class="n">t</span><span class="o">|</span>
      <span class="n">t</span><span class="p">.</span><span class="nf">index</span> <span class="p">[</span><span class="ss">:report_id</span><span class="p">,</span> <span class="ss">:reason_id</span><span class="p">]</span>
      <span class="n">t</span><span class="p">.</span><span class="nf">index</span> <span class="p">[</span><span class="ss">:reason_id</span><span class="p">,</span> <span class="ss">:report_id</span><span class="p">]</span>
    <span class="k">end</span>

    <span class="c1"># migrate data to new format</span>

    <span class="n">remove_column</span> <span class="ss">:reports</span><span class="p">,</span> <span class="ss">:reason</span>
  <span class="k">end</span>
<span class="k">end</span>

<span class="c1"># app/models/report.rb</span>
<span class="k">class</span> <span class="nc">Report</span> <span class="o">&lt;</span> <span class="no">ApplicationRecord</span>
  <span class="n">has_and_belongs_to_many</span> <span class="ss">:reasons</span>
<span class="k">end</span>

<span class="c1"># app/models/reason.rb</span>
<span class="k">class</span> <span class="nc">Reason</span> <span class="o">&lt;</span> <span class="no">ApplicationRecord</span>
  <span class="n">has_and_belongs_to_many</span> <span class="ss">:reports</span>

  <span class="n">validates</span> <span class="ss">:name</span><span class="p">,</span> <span class="ss">inclusion: </span><span class="p">{</span> <span class="ss">in: </span><span class="sx">%w[spam harassment inappropriate_content copyright misinformation]</span> <span class="p">}</span>
<span class="k">end</span>

<span class="c1"># Usage</span>
<span class="n">report</span> <span class="o">=</span> <span class="no">Report</span><span class="p">.</span><span class="nf">create</span><span class="p">(</span><span class="ss">title: </span><span class="s2">"Report 1"</span><span class="p">,</span> <span class="ss">reasons: </span><span class="no">Reason</span><span class="p">.</span><span class="nf">where</span><span class="p">(</span><span class="ss">name: </span><span class="p">[</span><span class="s2">"spam"</span><span class="p">,</span> <span class="s2">"misinformation"</span><span class="p">]))</span>

<span class="c1"># Read: check if spam is included</span>
<span class="n">report</span><span class="p">.</span><span class="nf">reasons</span><span class="p">.</span><span class="nf">exists?</span><span class="p">(</span><span class="ss">name: </span><span class="s2">"spam"</span><span class="p">)</span> <span class="c1"># =&gt; true</span>

<span class="c1"># Query: find all reports with spam AND misinformation</span>
<span class="no">Report</span><span class="p">.</span><span class="nf">joins</span><span class="p">(</span><span class="ss">:reasons</span><span class="p">)</span>
      <span class="p">.</span><span class="nf">where</span><span class="p">(</span><span class="ss">reasons: </span><span class="p">{</span> <span class="ss">name: </span><span class="p">[</span><span class="s2">"spam"</span><span class="p">,</span> <span class="s2">"misinformation"</span><span class="p">]</span> <span class="p">})</span>
      <span class="p">.</span><span class="nf">group</span><span class="p">(</span><span class="ss">:id</span><span class="p">)</span>
      <span class="p">.</span><span class="nf">having</span><span class="p">(</span><span class="s2">"COUNT(DISTINCT reasons.id) = 2"</span><span class="p">)</span>
</code></pre></div></div>

<p>Benefits include total flexibility to add additional metadata, extreme familiarity for Rails developers, and unlimited scalability. The drawback: slower writes (create/update) and the need for more queries or joins to read data.</p>
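<p>The <code>GROUP BY</code>/<code>HAVING</code> pair in the query above is what enforces “all of these reasons”: after joining on the wanted reasons, only reports that reach the distinct count survive. Here is the same logic replayed over an in-memory join table in plain Ruby (hypothetical data, illustrative only):</p>

```ruby
# Rows as they'd look after the INNER JOIN: [report_id, reason_name]
joined_rows = [
  [1, "spam"], [1, "misinformation"], # report 1 has both wanted reasons
  [2, "spam"],                        # report 2 has only one
  [3, "copyright"]                    # report 3 has neither
]
wanted = ["spam", "misinformation"]

matching_ids = joined_rows
  .select { |_, name| wanted.include?(name) } # WHERE reasons.name IN (...)
  .group_by { |id, _| id }                    # GROUP BY reports.id
  .select { |_, rows| rows.map(&:last).uniq.size == wanted.size } # HAVING COUNT(DISTINCT ...) = 2
  .keys

matching_ids # => [1]
```

<p>Report 2 survives the <code>WHERE</code> but falls short of the distinct count, which is why the <code>HAVING</code> clause is essential for AND semantics.</p>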

<h2 id="comparison">Comparison</h2>

<p>Here’s a summary of how each approach compares across the attributes that matter most:</p>

<table>
  <thead>
    <tr>
      <th>Attribute</th>
      <th>Bitwise</th>
      <th>PostgreSQL Array</th>
      <th>JSONB</th>
      <th>HABTM</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td><strong>Write Performance</strong></td>
      <td>Excellent</td>
      <td>Very Good</td>
      <td>Very Good</td>
      <td>Fair</td>
    </tr>
    <tr>
      <td><strong>Read Performance</strong></td>
      <td>Very Good</td>
      <td>Good</td>
      <td>Good</td>
      <td>Good</td>
    </tr>
    <tr>
      <td><strong>Query Simplicity</strong></td>
      <td>Excellent</td>
      <td>Excellent</td>
      <td>Excellent</td>
      <td>Fair</td>
    </tr>
    <tr>
      <td><strong>Default Values</strong></td>
      <td>Very Good</td>
      <td>Excellent</td>
      <td>Excellent</td>
      <td>Poor</td>
    </tr>
    <tr>
      <td><strong>Database-Level Validation</strong></td>
      <td>Very Good</td>
      <td>Good</td>
      <td>Good</td>
      <td>Excellent</td>
    </tr>
    <tr>
      <td><strong>Extensibility</strong></td>
      <td>Poor</td>
      <td>Good</td>
      <td>Very Good</td>
      <td>Excellent</td>
    </tr>
    <tr>
      <td><strong>Familiarity</strong></td>
      <td>Poor</td>
      <td>Good</td>
      <td>Good</td>
      <td>Excellent</td>
    </tr>
    <tr>
      <td><strong>Ecosystem Support</strong></td>
      <td>Poor</td>
      <td>Fair</td>
      <td>Fair</td>
      <td>Excellent</td>
    </tr>
    <tr>
      <td><strong>DB Compatibility</strong></td>
      <td>Excellent</td>
      <td>Poor</td>
      <td>Excellent</td>
      <td>Excellent</td>
    </tr>
    <tr>
      <td><strong>Scalability</strong></td>
      <td>64 values</td>
      <td>Limited</td>
      <td>Limited</td>
      <td>Unlimited</td>
    </tr>
    <tr>
      <td><strong>Property vs Entity</strong></td>
      <td>property</td>
      <td>property</td>
      <td>property</td>
      <td>entity</td>
    </tr>
  </tbody>
</table>

<h3 id="property-vs-entity">Property vs Entity</h3>

<p>In data modeling terms, reasons are a property of a report, not an entity with its own identity and lifecycle. Nobody queries “show me all attributes of the spam reason”. That makes the multivalued column approaches (array, JSONB, bitwise) more true to the conceptual model, even though they break First Normal Form. HABTM is the most purely relational approach, but it treats a property as if it were an entity.</p>

<h3 id="performance">Performance</h3>

<p>We ran benchmarks with 1000 operations for create, find, and update scenarios. You can find the full benchmark code and results in the <a href="https://github.com/sinaptia/multivalued_attributes">multivalued_attributes</a> repository.</p>

<p>The HABTM approach is ~4x slower on creates and ~3-5x slower on updates compared to the other methods. However, it’s worth noting that most performance issues related to HABTM and JOINs aren’t actually caused by the JOIN itself. Databases like PostgreSQL are highly optimized for these operations. The real culprits are usually N+1 queries, missing indexes, or poorly designed queries. With proper indexing and eager loading, the read performance gap narrows significantly. The first three approaches (bitwise, array, and JSONB) have very similar performance. On reads, the difference is marginal (only 2-3 extra seconds over 1000 queries).</p>

<h3 id="query-simplicity">Query Simplicity</h3>

<p>One often overlooked aspect is query complexity at the call site. If your codebase filters by these values extensively (in scopes, serializers, admin interfaces), column-based approaches keep queries simple and confined to a single table. With array_enum or bitwise, you can do:</p>

<div class="language-ruby highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="no">Report</span><span class="p">.</span><span class="nf">with_reason</span><span class="p">(</span><span class="ss">:spam</span><span class="p">)</span>
<span class="no">Report</span><span class="p">.</span><span class="nf">where</span><span class="p">.</span><span class="nf">not</span><span class="p">(</span><span class="ss">reasons: </span><span class="p">[])</span>
</code></pre></div></div>

<p>With a HABTM, every call site would need a <code class="language-plaintext highlighter-rouge">.joins(:reasons)</code>, adding complexity and increasing the risk of N+1 queries if eager loading is forgotten. While you can mitigate this with default scopes or associations, it adds friction that column-based approaches simply don’t have.</p>

<h3 id="default-values">Default Values</h3>

<p>With array or JSONB columns, you can set default values directly in the migration:</p>

<div class="language-ruby highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">t</span><span class="p">.</span><span class="nf">integer</span> <span class="ss">:reasons</span><span class="p">,</span> <span class="ss">array: </span><span class="kp">true</span><span class="p">,</span> <span class="ss">default: </span><span class="p">[</span><span class="s2">"spam"</span><span class="p">,</span> <span class="s2">"harassment"</span><span class="p">]</span>
</code></pre></div></div>

<p>New records automatically get these values. No callbacks required. With a HABTM, you’d need a callback to seed join table rows, adding an extra step that can be forgotten or misconfigured.</p>

<h3 id="database-level-validation">Database-Level Validation</h3>

<p>With a HABTM, foreign keys enforce valid references automatically:</p>

<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">INSERT</span> <span class="k">INTO</span> <span class="n">reasons_reports</span> <span class="p">(</span><span class="n">report_id</span><span class="p">,</span> <span class="n">reason_id</span><span class="p">)</span>
<span class="k">VALUES</span> <span class="p">(</span><span class="s1">'some-uuid'</span><span class="p">,</span> <span class="mi">999</span><span class="p">);</span>
<span class="c1">-- ERROR: Key (reason_id)=(999) is not present in table "reasons"</span>
</code></pre></div></div>

<p>You also get cascading behavior. <code class="language-plaintext highlighter-rouge">ON DELETE CASCADE</code> automatically cleans up join table rows when a reason is removed.</p>

<p>With array or JSONB, you can use PostgreSQL CHECK constraints to enforce valid values at the database level:</p>

<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">ALTER</span> <span class="k">TABLE</span> <span class="n">reports</span>
<span class="k">ADD</span> <span class="k">CONSTRAINT</span> <span class="n">valid_reasons</span>
<span class="k">CHECK</span> <span class="p">(</span><span class="n">reasons</span> <span class="o">&lt;@</span> <span class="n">ARRAY</span><span class="p">[</span><span class="s1">'spam'</span><span class="p">,</span> <span class="s1">'harassment'</span><span class="p">,</span> <span class="s1">'inappropriate_content'</span><span class="p">,</span> <span class="s1">'copyright'</span><span class="p">,</span> <span class="s1">'misinformation'</span><span class="p">]);</span>

<span class="k">UPDATE</span> <span class="n">reports</span>
<span class="k">SET</span> <span class="n">reasons</span> <span class="o">=</span> <span class="n">ARRAY</span><span class="p">[</span><span class="s1">'spam'</span><span class="p">,</span> <span class="s1">'harassment'</span><span class="p">,</span> <span class="s1">'oops_typo'</span><span class="p">];</span>
<span class="c1">-- ERROR: new row for relation "reports" violates check constraint "valid_reasons"</span>
</code></pre></div></div>

<p>Both approaches can enforce valid values at the database level. The difference is ergonomics: foreign keys handle this naturally, while CHECK constraints require explicit maintenance when adding new values.</p>
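<p>For intuition, here is what the <code class="language-plaintext highlighter-rouge">&lt;@</code> containment operator checks, expressed in plain Ruby. This is only an illustration; the CHECK constraint above is what actually enforces it at the database level:</p>

```ruby
# The allowed values from the CHECK constraint above.
VALID_REASONS = %w[spam harassment inappropriate_content copyright misinformation].freeze

# Ruby equivalent of PostgreSQL's `reasons <@ ARRAY[...]`:
# true only when every submitted reason is in the allowed list.
def valid_reasons?(reasons)
  (reasons - VALID_REASONS).empty?
end

valid_reasons?(%w[spam harassment])           # => true
valid_reasons?(%w[spam harassment oops_typo]) # => false
```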

<h3 id="extensibility">Extensibility</h3>

<ul>
  <li><strong>Bitwise</strong>: Limited to 32 values with an <code class="language-plaintext highlighter-rouge">integer</code> column, 64 with <code class="language-plaintext highlighter-rouge">bigint</code>. Beyond that, you need bit strings.</li>
  <li><strong>Array</strong>: Easy to expand, but storing many values (dozens+) becomes unwieldy. Row size grows, and <a href="https://www.postgresql.org/docs/current/gin.html">GIN</a> indexes start having performance issues.</li>
  <li><strong>JSONB</strong>: Flexible, but the same problem as arrays when storing many values.</li>
  <li><strong>HABTM</strong>: The most extensible. You can add additional columns (e.g., <code class="language-plaintext highlighter-rouge">reason_details</code>, <code class="language-plaintext highlighter-rouge">reported_at</code> in the join table). Also ideal for dynamic or user-defined values.</li>
</ul>
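<p>To make the bitwise limit concrete, here is a minimal sketch of the encoding it relies on (illustrative plain Ruby, not tied to any gem): each reason gets a fixed bit position, so the column can never hold more values than the integer has bits.</p>

```ruby
# Each reason occupies one fixed bit position.
REASONS = %w[spam harassment inappropriate_content copyright misinformation].freeze

# Fold the selected reasons into a single integer bitmask.
def encode_reasons(reasons)
  reasons.sum { |r| 1 << REASONS.index(r) }
end

# Recover the reason names whose bits are set in the mask.
def decode_reasons(mask)
  REASONS.select.with_index { |_, i| mask[i] == 1 }
end

mask = encode_reasons(%w[spam misinformation])
mask                 # => 17 (binary 10001)
decode_reasons(mask) # => ["spam", "misinformation"]
```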

<h3 id="familiarity">Familiarity</h3>

<ul>
  <li><strong>Bitwise</strong>: Strange for most. Requires explaining bit-level operations.</li>
  <li><strong>Array</strong>: Intuitive if you know Ruby/PostgreSQL.</li>
  <li><strong>JSONB</strong>: Familiar to those who use modern REST APIs.</li>
  <li><strong>HABTM</strong>: The most idiomatic. Any Rails developer immediately understands <code class="language-plaintext highlighter-rouge">has_and_belongs_to_many</code>.</li>
</ul>

<h3 id="ecosystem">Ecosystem</h3>

<ul>
  <li><strong>Bitwise/Array/JSONB</strong>: Require custom form inputs, serializers, and manual validations. Some support from gems.</li>
  <li><strong>HABTM</strong>: Works out-of-the-box with ActiveAdmin, RailsAdmin, nested forms, and bulk operations.</li>
</ul>

<h3 id="database-compatibility">Database Compatibility</h3>

<p>PostgreSQL arrays are PostgreSQL-specific. Bitwise and HABTM work across any database. JSON works across MySQL, SQLite, and PostgreSQL, though with different syntax and index support: PostgreSQL uses GIN indexes, MySQL indexes JSON through generated columns, and SQLite ships the JSON1 extension.</p>

<h2 id="conclusion">Conclusion</h2>

<p>If your value set is static and small, a <strong>PostgreSQL array</strong> is the better choice over bitwise: arrays offer nearly identical performance without introducing unfamiliar bit-level operations that confuse most Rails developers.</p>

<p>But don’t dismiss HABTM. For most applications, the write overhead is irrelevant compared to the cost of maintaining “clever” code that the next developer doesn’t understand.</p>

<p><strong>In our specific case</strong>, a reporting system where users can select multiple reasons, we chose <strong>HABTM</strong>. Not because it was the fastest (it wasn’t), but because it didn’t need justification. Any new developer on the team immediately understands it, and that clarity is worth more than the performance gains we’d rarely notice in practice.</p>

<p>The best technical choice is the one that doesn’t need a justification.</p>]]></content><author><name>Nazareno Moresco</name></author><category term="Ruby on Rails" /><category term="Performance" /><summary type="html"><![CDATA[Rails' enum DSL is great for single values, but what about multiple? We compared 4 approaches across performance, extensibility, and maintainability to find the best fit.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://sinaptia.dev/assets/images/logo-black.png" /><media:content medium="image" url="https://sinaptia.dev/assets/images/logo-black.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">Evaluating LLM prompts in Rails</title><link href="https://sinaptia.dev/posts/evaluating-llm-prompts-in-rails" rel="alternate" type="text/html" title="Evaluating LLM prompts in Rails" /><published>2026-02-17T00:00:00+00:00</published><updated>2026-02-17T00:00:00+00:00</updated><id>https://sinaptia.dev/posts/evaluating-llm-prompts-in-rails</id><content type="html" xml:base="https://sinaptia.dev/posts/evaluating-llm-prompts-in-rails"><![CDATA[<p>We’ve built several AI features in Rails by now: <a href="/posts/scaling-image-classification-with-ai">image classification</a>, <a href="/posts/upscaling-images-with-ai">image upscaling</a>, <a href="/posts/improving-a-similarity-search-with-ai">similarity search</a>, etc. And every time, the same question came up: which model and prompt should we actually use? The image classification project made this especially painful: a pricing change blew up our budget, smaller images proved to work better than larger ones, and every model switch required re-running the entire evaluation from scratch.</p>

<p>Every change to a prompt opens up a tree of choices. Which provider should we use? Which model? How detailed should the instructions be? Would more samples in the prompt work better? How much context per message? Should we use a reasoning model? Or augment the data available to the model with multi-modal input? There’s also the cost vs. accuracy tradeoff: is 10x the price worth a 5% improvement for this specific feature?</p>

<p>The combinatorial explosion gets overwhelming fast, and the result of the process has this feeling of uncertainty… is there a branch I missed that works better? Or that costs less?</p>

<h2 id="the-pragmatic-choice-spreadsheets">The pragmatic choice: spreadsheets</h2>

<p>We needed a methodology to track changes across iterations so the team could follow along. Naturally, we took a pragmatic stance: we started using a spreadsheet per feature, tracking results across prompt/provider/model configurations, all run against the same data. It worked quite well, and over several features a workflow started to emerge, but…</p>

<h2 id="spreadsheets-dont-scale">Spreadsheets don’t scale</h2>

<p>We knew the limits going in, but they became harder to ignore over time:</p>

<ul>
  <li><strong>They fragment.</strong> People make copies. When you’re sharing with non-technical collaborators, you end up with multiple sources of truth.</li>
  <li><strong>No enforced structure.</strong> Each feature ended up with its own format. You have to re-learn how to read each one, and not all of them track the same metrics the same way.</li>
  <li><strong>Hard to compare.</strong> Eyeballing results across configurations isn’t intuitive, and people get confused.</li>
  <li><strong>No regression baseline.</strong> Once you settle on a configuration, how do you catch regressions later?</li>
  <li><strong>Prompts drift.</strong> Someone edits the spreadsheet and forgets to update the code. Nobody notices until something breaks.</li>
  <li><strong>Disconnected from code.</strong> Prompts and evaluations should live where the application lives.</li>
</ul>

<p>In one project with many AI features, this all came apart. Links got lost, copies multiplied across different people’s drives with small divergences. Building eval datasets meant downloading images and re-uploading them to sheets. Running prompts required manual dev work because the data lived in Google Drive, but prompts had to go through the LLM provider. We built some internal tooling to help, but since every sheet and feature had a different format, nothing was reusable.</p>

<p>But they were useful for uncovering what we needed: a place that couples a prompt configuration with a curated dataset extracted from real data and helps you find the right balance between accuracy and cost for the feature at hand. Ideally, without leaving the Rails app.</p>

<p>So we built <a href="https://github.com/sinaptia/ruby_llm-evals">RubyLLM::Evals</a>, a Rails engine for testing, comparing, and improving LLM prompts directly inside your application.</p>

<h2 id="rubyllmevals">RubyLLM::Evals</h2>

<p>Since we’re using <a href="https://github.com/crmne/ruby_llm">RubyLLM</a>, it made sense to build on top of it.</p>

<p>The core abstractions are <strong>prompts</strong> and <strong>samples</strong>. A prompt captures a full configuration: provider, model, system instructions, message template (with Liquid variables), tools, and output schemas. If you already have tools or schemas in your app, you can reuse them. Samples are your test cases: each one defines an evaluation type (exact match, contains, regex, LLM judge, or human judge) and an expected output.</p>

<p>The interesting design choice was making the LLM-as-judge a first-class eval type. For features like summarization or classification with fuzzy boundaries, exact matching doesn’t cut it. You need another model to assess whether the response is <em>good enough</em>. It’s not perfect (the judge has its own biases and failure modes), but for iterative prompt development it’s a pragmatic tradeoff: fast feedback now, human review on the edge cases.</p>
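<p>Conceptually, an eval type is just a predicate over an expected and an actual output. A toy sketch, with a lambda standing in for the judge model call (the names here are illustrative, not the engine’s API):</p>

```ruby
# `judge` stands in for a second LLM call; a real judge would be
# prompted with the response plus a grading rubric.
judge = ->(expected, actual) { actual.downcase.include?(expected.downcase) }

# Dispatch on the sample's eval type.
def evaluate(sample, response, judge:)
  case sample[:eval_type]
  when :exact     then sample[:expected] == response
  when :llm_judge then judge.call(sample[:expected], response)
  end
end

sample = { eval_type: :llm_judge, expected: "deck" }
evaluate(sample, "This looks like a wooden deck.", judge: judge) # => true
```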

<p>Each run saves a snapshot of the prompt settings and records accuracy, cost, and duration. A comparison tool lays all runs of a prompt side by side, so you can spot what changed and why.</p>
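<p>Under the hood, a comparison like that is little more than an aggregation over stored runs. A toy version over in-memory hashes (field names invented for illustration, not the engine’s schema):</p>

```ruby
# Two hypothetical runs of the same prompt against the same samples.
runs = [
  { model: "model-a", passed: 27, total: 30, cost: 0.04 },
  { model: "model-b", passed: 28, total: 30, cost: 0.02 }
]

# Compute accuracy per run so configurations can be ranked side by side.
summary = runs.map do |run|
  run.merge(accuracy: (run[:passed] * 100.0 / run[:total]).round(1))
end

best = summary.max_by { |row| row[:accuracy] }
best[:model]    # => "model-b"
best[:accuracy] # => 93.3
```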

<h3 id="real-application-data">Real application data</h3>

<p>One thing we really wanted was the ability to populate samples from the application’s data. For example, in our image categorization feature, we can:</p>

<div class="language-ruby highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">prompt</span> <span class="o">=</span> <span class="no">RubyLLM</span><span class="o">::</span><span class="no">Evals</span><span class="o">::</span><span class="no">Prompt</span><span class="p">.</span><span class="nf">find_by</span><span class="p">(</span><span class="ss">slug: </span><span class="s2">"image-categorization"</span><span class="p">)</span>

<span class="no">Image</span><span class="p">.</span><span class="nf">uncategorized</span><span class="p">.</span><span class="nf">limit</span><span class="p">(</span><span class="mi">50</span><span class="p">).</span><span class="nf">each</span> <span class="k">do</span> <span class="o">|</span><span class="n">image</span><span class="o">|</span>
  <span class="n">sample</span> <span class="o">=</span> <span class="n">prompt</span><span class="p">.</span><span class="nf">samples</span><span class="p">.</span><span class="nf">create</span><span class="p">(</span><span class="ss">eval_type: :human</span><span class="p">)</span>
  <span class="n">sample</span><span class="p">.</span><span class="nf">files</span><span class="p">.</span><span class="nf">attach</span><span class="p">(</span><span class="n">image</span><span class="p">.</span><span class="nf">attachment</span><span class="p">.</span><span class="nf">blob</span><span class="p">)</span>
<span class="k">end</span>
</code></pre></div></div>

<p>Now you’re iterating on your prompt with actual production data, not synthetic examples.</p>

<p>The temptation is to throw hundreds of samples at a prompt and see what sticks. In practice, a smaller curated set that covers your edge cases tells you more than a large random one. We typically start with 20-30 samples: a mix of straightforward cases, known hard cases from production, and a few adversarial examples. If accuracy looks promising, we expand. If not, the small set is faster to iterate on.</p>

<h3 id="in-production">In production</h3>

<p>Once you’re happy with a prompt, you can use it directly in your application:</p>

<div class="language-ruby highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">response</span> <span class="o">=</span> <span class="no">RubyLLM</span><span class="o">::</span><span class="no">Evals</span><span class="o">::</span><span class="no">Prompt</span><span class="p">.</span><span class="nf">execute</span><span class="p">(</span>
  <span class="s2">"image-categorization"</span><span class="p">,</span>
  <span class="ss">files: </span><span class="p">[</span><span class="n">image</span><span class="p">.</span><span class="nf">attachment</span><span class="p">.</span><span class="nf">blob</span><span class="p">]</span>
<span class="p">)</span>
<span class="n">response</span><span class="p">.</span><span class="nf">content</span>  <span class="c1"># =&gt; "deck"</span>
</code></pre></div></div>

<p>The configuration lives in the database, versioned through your evaluation runs, always in sync with what you tested. Rolling back to a previous version or A/B testing a new iteration becomes straightforward.</p>

<h2 id="where-this-leaves-us">Where this leaves us</h2>

<p>Production data has a way of surprising you: new usage patterns, edge cases you never curated a sample for, a provider silently updating a model or its pricing… your prompt’s accuracy can degrade, or your cost can skyrocket, without a single line of code changing. There’s no single solution to this, but monitoring a prompt’s performance in production is key. Each feature will require something different and use different metrics, but you need feedback: when your metrics surface drift, lower-quality results, more errors, or higher costs, you can pull new samples into RubyLLM::Evals and adjust the prompt to the new reality.</p>

<p>The pattern we keep seeing across projects is that prompts are never done. Models get updated, data distributions shift, and what worked last month might silently degrade and fail over time. Continuous testing and monitoring are critical.</p>

<p><a href="https://github.com/sinaptia/ruby_llm-evals">RubyLLM::Evals</a> and <a href="https://github.com/sinaptia/ruby_llm-monitoring">RubyLLM::Monitoring</a> are how we go from concept to production. Both are open source and built for Rails.</p>

<hr />

<p><em>At SINAPTIA, <a href="/posts/building-intelligent-applications-with-rails">we specialize in helping businesses implement AI solutions</a> that deliver real value. If you’re facing challenges with prompt engineering or AI integration, we’d love to help.</em></p>]]></content><author><name>Patricio Mac Adden</name></author><category term="Ruby on Rails" /><category term="AI" /><summary type="html"><![CDATA[Finding the right model and prompt for your AI feature is harder than it looks. Spreadsheets help, until they don't. So we did something about it.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://sinaptia.dev/assets/images/logo-black.png" /><media:content medium="image" url="https://sinaptia.dev/assets/images/logo-black.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">AI agents in Ruby: Why is it so easy?</title><link href="https://sinaptia.dev/posts/ai-agents-in-ruby-why-is-it-so-easy" rel="alternate" type="text/html" title="AI agents in Ruby: Why is it so easy?" /><published>2026-02-09T00:00:00+00:00</published><updated>2026-02-09T00:00:00+00:00</updated><id>https://sinaptia.dev/posts/ai-agents-in-ruby-why-is-it-so-easy</id><content type="html" xml:base="https://sinaptia.dev/posts/ai-agents-in-ruby-why-is-it-so-easy"><![CDATA[<p>Scott Werner (founder of Sublayer and organizer of <a href="https://www.artificialruby.ai/">Artificial Ruby</a>) told me something that stuck with me:</p>

<blockquote>
  <p><em>“The first version of the sublayer gem was actually a coding agent, but it was coming together so quickly… I was like, wait… if this is so easy for me, it’s going to be easy for everybody, and everybody is going to be making these…”</em></p>
</blockquote>

<p>Last week, we open-sourced a minimal but feature-packed coding agent.
We were after the simplest, most straightforward, stupidly effective agent possible, so we named it <a href="https://github.com/sinaptia/detritus">Detritus</a>, after Lance Constable Detritus of the Ankh-Morpork City Watch from <a href="https://en.wikipedia.org/wiki/Discworld">Discworld</a>
(thanks for so much and so many, Sir Terry).</p>

<p>Detritus is built in just <strong>250 lines of code</strong>, yet it packs a CLI with history, custom slash commands and skills (sort of), save/resume chats, subagents, and a two-level configuration (project and global). A full-featured coding agent.</p>

<p>While building this basic agent, we confirmed, firsthand, what Scott had said. And I kept wondering:</p>

<p><strong><em>Why?</em></strong>  <strong>What makes it <em>so</em> easy?</strong></p>

<p>Is it the LLMs? Is it Ruby? Is it that it’s fun, so you don’t really feel the pain? Or is it something else?</p>

<p>After giving it some thought and talking about this with teammates, we converged on two key factors:</p>

<h2 id="the-first-key-general-availability-of-llms">The first key: general availability of LLMs</h2>

<p>General availability of LLMs changed the nature of the problem of building something like Detritus. Before, building a coding AI was <em>unthinkable</em>, but current LLMs made impossible things almost trivial:</p>

<p>Code some utility functions for the LLM to call (one for editing files, one for bash commands), hook up an LLM via API, put it all in a loop, and that’s it. You have a coding agent.</p>
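<p>Stripped of every real dependency, that loop fits in a few lines. Below, <code class="language-plaintext highlighter-rouge">llm</code> is a stub lambda standing in for the provider API, and the tool set is trimmed to one function; this is a sketch of the shape, not Detritus’ actual code:</p>

```ruby
# One toy tool; a real agent's set includes file editing and bash.
TOOLS = { "upcase" => ->(arg) { arg.upcase } }

# The agentic loop: send the transcript, run any requested tool,
# append the result, repeat until the model returns plain content.
def agent_loop(llm, prompt)
  messages = [{ role: "user", content: prompt }]
  loop do
    reply = llm.call(messages)
    return reply[:content] unless reply[:tool]

    result = TOOLS.fetch(reply[:tool]).call(reply[:args])
    messages << { role: "tool", content: result }
  end
end
```

<p>A scripted stub is enough to watch it run: have <code class="language-plaintext highlighter-rouge">llm</code> request one tool call, then answer.</p>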

<p>What used to be a research problem is now an integration problem. The problem migrated from the lab to the workshop.</p>

<h2 id="the-second-key-rubys-power">The second key: Ruby’s power</h2>

<p>Ruby is well known for its historical focus on developer happiness: “A programmer’s best friend”. I think this is a fundamental characteristic of the language, but sometimes I feel it’s a little superficial, and it doesn’t tell you <em>why</em>.</p>

<p>I think Ruby brings something else that is a much more fundamental property that emerges out of its design and philosophy: <strong>Power</strong>.</p>

<p>Originally, the idea of “powerful programming languages” came to me via Amir Rajan, creator of DragonRuby, when he shared this article from Paul Graham, <a href="https://paulgraham.com/avg.html">“Beating the averages”</a>. We talked about how and why Lisp was the most powerful language, with Ruby being a close second. Graham’s key insight — what he calls the “Blub paradox” — is that power in programming languages sits on a continuum, and you can only recognize a more powerful language from above, never from below.</p>

<p>Any general-purpose programming language is nowadays more or less equivalent, equally capable. You can build Detritus with the exact same features in Python, Go, JavaScript, or even in C. And yet, the experience of building this in Ruby feels fluid and frictionless.
It’s like cutting a wooden block with a hand saw versus a circular saw. Both will cut the wood just fine (they are equally capable), and you can probably enjoy both (personal taste is not the point here), but one will make you feel more <em>powerfully invested</em> than the other.</p>

<p>I think power in programming languages is not just capability, but <strong>the relation between using the capability and the effort the developer has to invest in wielding it</strong>.</p>

<p>In this sense, Ruby has the ability to maximize the capability/effort ratio. The amount of power condensed in a few lines of code feels extraordinary.</p>

<p>If you take a look at Detritus’ source code, this is how you set up the agent:</p>

<div class="language-ruby highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">def</span> <span class="nf">create_chat</span><span class="p">(</span><span class="ss">instructions: </span><span class="n">state</span><span class="p">.</span><span class="nf">instructions</span><span class="p">,</span> <span class="ss">tools: </span><span class="p">[</span><span class="no">EditFile</span><span class="p">,</span> <span class="no">Bash</span><span class="p">,</span> <span class="no">WebSearch</span><span class="p">,</span> <span class="no">SubAgent</span><span class="p">],</span> <span class="ss">persist: </span><span class="kp">true</span><span class="p">)</span>
  <span class="n">chat</span> <span class="o">=</span> <span class="no">RubyLLM</span><span class="o">::</span><span class="no">Chat</span><span class="p">.</span><span class="nf">new</span><span class="p">(</span><span class="ss">model: </span><span class="n">state</span><span class="p">.</span><span class="nf">model</span><span class="p">,</span> <span class="ss">provider: </span><span class="vg">$state</span><span class="p">.</span><span class="nf">provider</span><span class="p">)</span>
  <span class="n">chat</span><span class="p">.</span><span class="nf">with_instructions</span><span class="p">(</span><span class="n">instructions</span><span class="p">)</span> <span class="k">if</span> <span class="n">instructions</span>
  <span class="n">chat</span><span class="p">.</span><span class="nf">on_end_message</span> <span class="p">{</span> <span class="o">|</span><span class="n">msg</span><span class="o">|</span> <span class="n">save_chat</span> <span class="p">}</span> <span class="k">if</span> <span class="n">persist</span>
  <span class="n">chat</span><span class="p">.</span><span class="nf">with_tools</span><span class="p">(</span><span class="o">*</span><span class="n">tools</span><span class="p">)</span>
<span class="k">end</span>
</code></pre></div></div>
<p>Five lines of RubyLLM set the model, system prompt, and tools. That’s all you need to set the agentic loop ready to go.</p>

<p>And the rest of the code is the same: chat persistence is <code class="language-plaintext highlighter-rouge">Marshal.dump</code>. The CLI router is a case statement. The subagent is a tool that calls <code class="language-plaintext highlighter-rouge">create_chat</code>. None of this code is clever or magical; it’s just plain Ruby. That’s exactly the point. When the language is powerful enough, building an AI agent doesn’t require anything special, just the mundane. And Ruby makes the mundane exquisitely short.</p>
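<p>That persistence claim is nearly literal: Marshal round-trips a whole object graph in two calls. A sketch with a plain hash standing in for the chat (the real chat object carries more state):</p>

```ruby
# Chat state as a plain structure; in a save/resume workflow the
# serialized blob would be written to a file and read back later.
chat_state = {
  model: "some-model",
  messages: [{ role: "user", content: "hi" }]
}

blob = Marshal.dump(chat_state) # save
restored = Marshal.load(blob)   # resume
restored == chat_state          # => true
```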

<p>Detritus’ history started when Thorsten Ball published <a href="https://ampcode.com/notes/how-to-build-an-agent">The Emperor Has No Clothes</a>, a guide to building a super basic coding agent in Go. My immediate thought after the head explosion was: if we did this in Ruby, it would take a fraction of the code and give us twice the features. So, as Thorsten suggested, “I went and tried how far I could get”. I got <em>this</em> far.</p>

<h2 id="raised-to-the-power">Raised to the power</h2>

<p>LLMs’ general availability turned AI from a “research problem” into an “integration problem”. The nature of the work changed to match Ruby’s strengths: orchestration, expressiveness, and fast iteration.</p>

<p>When you combine Ruby with LLMs, you get compounding power. Power * Power. Power squared.</p>

<p>The key to building an agent is defining what to delegate to the LLM and what to handle in code. For example, Detritus’ skills feature: the code just provides a list of instructions and scripts. The actual skill, knowing <em>when</em> to use each one, <em>how</em> to combine them, that’s all the LLM.</p>

<p>This is where both keys meet. LLMs do the hard part; our job is orchestration. And Ruby makes the orchestration so clean you can see just how little code is actually needed. Compounding power.</p>

<h2 id="the-opportunity">The Opportunity</h2>
<p>The Ruby AI ecosystem is young, but it’s growing fast. <a href="https://github.com/crmne/ruby_llm">RubyLLM</a>, the gem that powers Detritus, is already spawning its own ecosystem: MCP support, <a href="https://github.com/sinaptia/ruby_llm-monitoring">monitoring</a>, agent frameworks, etc. Andrew Kane has quietly built an entire ML infrastructure layer for Ruby: transformers, torch, embeddings, vector search, and ONNX runtime. Officially supported SDKs from OpenAI, Anthropic, and MCP. The foundations are being laid right now, the Ruby way: simple, expressive, and delightful to use.</p>

<p>In the coming years, most of us, Ruby developers, won’t be training models. We will be orchestrating API calls, building agents, capabilities, features, and designing systems. Building products on top of a dynamic, ever-changing landscape. We’ll be doing what Ruby does best: making powerful capabilities accessible through elegant, expressive interfaces. And because of Ruby’s power, we can do those things naturally, frictionlessly, easily.</p>

<p>The same things that made Ruby great for web development 15 years ago are perfectly aligned again, but now with a more mature, faster, and modern Ruby. The potential is huge.</p>

<p>The Ruby community has decades of experience building products and delightful tools. The AI landscape is wide open, the tools are here, and the problem fits like a glove. So… what are we, Rubyists, going to do?</p>]]></content><author><name>Fernando Martinez</name></author><category term="Ruby" /><category term="AI" /><summary type="html"><![CDATA[We found two keys to answer this question while building a full-featured coding agent in just 250 lines of Ruby code.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://sinaptia.dev/assets/images/logo-black.png" /><media:content medium="image" url="https://sinaptia.dev/assets/images/logo-black.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">RubyLLM::Instrumentation: The foundation for RubyLLM monitoring</title><link href="https://sinaptia.dev/posts/ruby-llm-instrumentation-the-foundation-for-rubyllm-monitoring" rel="alternate" type="text/html" title="RubyLLM::Instrumentation: The foundation for RubyLLM monitoring" /><published>2026-01-20T00:00:00+00:00</published><updated>2026-01-20T00:00:00+00:00</updated><id>https://sinaptia.dev/posts/ruby-llm-instrumentation-the-foundation-for-rubyllm-monitoring</id><content type="html" xml:base="https://sinaptia.dev/posts/ruby-llm-instrumentation-the-foundation-for-rubyllm-monitoring"><![CDATA[<p>In our <a href="/posts/monitoring-llm-usage-in-rails-with-rubyllm-monitoring">last post</a>, we introduced <a href="https://github.com/sinaptia/ruby_llm-monitoring">RubyLLM::Monitoring</a>, a Rails engine that captures every LLM request your application makes and provides a dashboard where you can see cost, throughput, response time, and error aggregations, and lets you set up alerts so that when something interesting to you happens, you receive an email or a Slack notification.</p>

<p>But how did we do it? What mechanism does RubyLLM provide that we can use to capture all LLM requests? Or did we use something else?</p>

<h2 id="rubyllm-event-handlers">RubyLLM event handlers</h2>

<p>RubyLLM provides event handlers out of the box. You can use them to capture an event when a message is sent to the LLM and, for example, calculate its cost. This is how you’d use them:</p>

<div class="language-ruby highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1"># Provided that you have gemini configured in config/initializers/ruby_llm.rb</span>
<span class="n">chat</span> <span class="o">=</span> <span class="no">RubyLLM</span><span class="p">.</span><span class="nf">chat</span> <span class="ss">provider: </span><span class="s2">"gemini"</span><span class="p">,</span> <span class="ss">model: </span><span class="s2">"gemini-2.5-flash"</span>

<span class="n">chat</span><span class="p">.</span><span class="nf">on_end_message</span> <span class="k">do</span> <span class="o">|</span><span class="n">message</span><span class="o">|</span>
  <span class="no">Event</span><span class="p">.</span><span class="nf">create</span><span class="p">(</span>
    <span class="ss">provider: </span><span class="n">chat</span><span class="p">.</span><span class="nf">model</span><span class="p">.</span><span class="nf">provider</span><span class="p">,</span>
    <span class="ss">model: </span><span class="n">chat</span><span class="p">.</span><span class="nf">model</span><span class="p">.</span><span class="nf">id</span><span class="p">,</span>
    <span class="ss">input_tokens: </span><span class="n">message</span><span class="o">&amp;</span><span class="p">.</span><span class="nf">input_tokens</span> <span class="o">||</span> <span class="mi">0</span><span class="p">,</span>
    <span class="ss">output_tokens: </span><span class="n">message</span><span class="o">&amp;</span><span class="p">.</span><span class="nf">output_tokens</span> <span class="o">||</span> <span class="mi">0</span>
  <span class="p">)</span>
<span class="k">end</span>

<span class="n">response</span> <span class="o">=</span> <span class="n">chat</span><span class="p">.</span><span class="nf">ask</span><span class="p">(</span><span class="s2">"Write a short poem about Ruby"</span><span class="p">)</span>
</code></pre></div></div>

<p>In the code above, an event record is created when a message is completed, and the cost is calculated in an ActiveRecord callback on the <code class="language-plaintext highlighter-rouge">Event</code> model. This approach is simple and works, but it doesn’t scale well:</p>

<ul>
  <li>You need to add this manual tracking everywhere. Every chat instance needs the callback wired up; otherwise you lose that data. You can extract the setup into a helper, but you still have to remember to call it for every chat.</li>
  <li>Your instrumentation code and your business logic are tightly coupled, which makes both harder to maintain.</li>
  <li>This only works for <code class="language-plaintext highlighter-rouge">RubyLLM::Chat</code> instances. What about embeddings, image generation, and other operations? You’d need different mechanisms for each.</li>
  <li>Tracking full request metrics like latency requires even more complex and intrusive code.</li>
</ul>

<p>We needed something more comprehensive and automatic that doesn’t rely on us remembering to hook the instrumentation code in everywhere. Luckily, Rails has something neat baked in.</p>

<h2 id="activesupportnotifications">ActiveSupport::Notifications</h2>

<p>ActiveSupport::Notifications is Rails’ instrumentation API. It’s what Rails uses internally to track things like database queries, view rendering, controller executions, and more.</p>

<p>Using it is simple: you make your code emit events by calling <code class="language-plaintext highlighter-rouge">ActiveSupport::Notifications.instrument(...)</code>, and subscribers can consume those events to do logging, monitoring, or whatever else you need. An interesting example is <a href="https://github.com/charkost/prosopite">Prosopite</a>, which hooks into <code class="language-plaintext highlighter-rouge">sql.active_record</code> events to detect N+1 queries.</p>

<p>This mechanism is especially valuable for libraries, because it decouples the library’s logic from the business logic of the application that uses it. In the case of RubyLLM::Monitoring, the monitoring logic lives separately and subscribes only to what it cares about. There’s no coupling between RubyLLM and RubyLLM::Monitoring.</p>

<p>So, this is what we did in <a href="https://github.com/sinaptia/ruby_llm-instrumentation">RubyLLM::Instrumentation</a> to make RubyLLM emit events after each LLM call. RubyLLM::Monitoring, on the other hand, provides an event subscriber that captures the events and feeds them into its dashboard.</p>

<h2 id="rubyllminstrumentation">RubyLLM::Instrumentation</h2>

<p>Instrumentation should be automatic and invisible. RubyLLM::Instrumentation achieves that: just add it to your <code class="language-plaintext highlighter-rouge">Gemfile</code>, run <code class="language-plaintext highlighter-rouge">bundle install</code>, and you’re done. RubyLLM will start emitting events for you to subscribe to.</p>

<p>Now, following the example above, the code becomes:</p>

<div class="language-ruby highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1"># in config/initializers/ruby_llm.rb</span>
<span class="no">ActiveSupport</span><span class="o">::</span><span class="no">Notifications</span><span class="p">.</span><span class="nf">subscribe</span><span class="p">(</span><span class="sr">/ruby_llm/</span><span class="p">)</span> <span class="k">do</span> <span class="o">|</span><span class="n">event</span><span class="o">|</span>
  <span class="c1"># Do whatever you want with the event; RubyLLM::Monitoring stores the event data in the database for later use</span>
  <span class="no">Event</span><span class="p">.</span><span class="nf">create</span><span class="p">(</span>
    <span class="ss">provider: </span><span class="n">event</span><span class="p">.</span><span class="nf">payload</span><span class="p">[</span><span class="ss">:provider</span><span class="p">],</span>
    <span class="ss">model: </span><span class="n">event</span><span class="p">.</span><span class="nf">payload</span><span class="p">[</span><span class="ss">:model</span><span class="p">],</span>
    <span class="ss">input_tokens: </span><span class="n">event</span><span class="p">.</span><span class="nf">payload</span><span class="p">[</span><span class="ss">:input_tokens</span><span class="p">]</span> <span class="o">||</span> <span class="mi">0</span><span class="p">,</span>
    <span class="ss">output_tokens: </span><span class="n">event</span><span class="p">.</span><span class="nf">payload</span><span class="p">[</span><span class="ss">:output_tokens</span><span class="p">]</span> <span class="o">||</span> <span class="mi">0</span>
  <span class="p">)</span>
<span class="k">end</span>

<span class="c1"># Provided that you have gemini configured in config/initializers/ruby_llm.rb</span>
<span class="n">chat</span> <span class="o">=</span> <span class="no">RubyLLM</span><span class="p">.</span><span class="nf">chat</span> <span class="ss">provider: </span><span class="s2">"gemini"</span><span class="p">,</span> <span class="ss">model: </span><span class="s2">"gemini-2.5-flash"</span>

<span class="c1"># RubyLLM will emit the event, and it'll be captured by the subscriber above</span>
<span class="n">response</span> <span class="o">=</span> <span class="n">chat</span><span class="p">.</span><span class="nf">ask</span><span class="p">(</span><span class="s2">"Write a short poem about Ruby"</span><span class="p">)</span>
</code></pre></div></div>

<p>The code remains practically the same as in the original example, but the instrumentation becomes much simpler and decoupled, and there’s no need to repeat the same hook in multiple places.</p>

<p>In the example above, all <code class="language-plaintext highlighter-rouge">ruby_llm</code> events are captured, but you can subscribe to specific events. You can read more about the instrumented events and their payload in the <a href="https://github.com/sinaptia/ruby_llm-instrumentation">project’s repository</a>.</p>

<h2 id="wrapping-up">Wrapping up</h2>

<p>RubyLLM::Instrumentation takes the burden of manually instrumenting code off users’ shoulders. It was originally written as part of RubyLLM::Monitoring, but we extracted it into its own gem because it’s a fundamental tool: just as we needed it, other people might need it too, whether to build a different monitoring tool, an analytics tool, or a different logging setup.</p>

<p>Give it a try, send us feedback, and contribute if you want to!</p>

<hr />

<p>If you’re building AI-powered applications with Rails and need help with architecture, optimization, or observability, <a href="/contact-us/">get in touch</a>.</p>]]></content><author><name>Patricio Mac Adden</name></author><category term="Ruby on Rails" /><category term="AI" /><summary type="html"><![CDATA[While working on RubyLLM::Monitoring, we needed a way to instrument all RubyLLM operations. But we wanted to do it without changing RubyLLM. Read along to know how we did it.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://sinaptia.dev/assets/images/logo-black.png" /><media:content medium="image" url="https://sinaptia.dev/assets/images/logo-black.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">Monitoring LLM usage in Rails with RubyLLM::Monitoring</title><link href="https://sinaptia.dev/posts/monitoring-llm-usage-in-rails-with-rubyllm-monitoring" rel="alternate" type="text/html" title="Monitoring LLM usage in Rails with RubyLLM::Monitoring" /><published>2026-01-14T00:00:00+00:00</published><updated>2026-01-14T00:00:00+00:00</updated><id>https://sinaptia.dev/posts/monitoring-llm-usage-in-rails-with-rubyllm-monitoring</id><content type="html" xml:base="https://sinaptia.dev/posts/monitoring-llm-usage-in-rails-with-rubyllm-monitoring"><![CDATA[<p>You’ve built an AI-powered feature into your Rails application using LLMs. You’ve built an evaluation set to test different prompts and model combinations, compared them, and improved them<sup id="fnref:1"><a href="#fn:1" class="footnote" rel="footnote" role="doc-noteref">1</a></sup> so you could get the best bang for the buck out of your LLM usage. You aimed for the highest accuracy at the lowest possible cost. You deployed it to production. And now?</p>

<p>Unlike most APIs, LLM API calls have variable costs: pricing is usage-based, so what you pay depends on the input and output tokens consumed. So how are your users actually using the feature? How much will it cost you monthly? Does that match your estimate? Are the usage limits you designed adequate, or even needed at all?</p>
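<p>A quick back-of-the-envelope cost model shows why this matters. The per-million-token prices below are invented for illustration; real prices vary by provider and model:</p>

```ruby
# Hypothetical usage-based pricing (USD per 1M tokens) -- not real prices.
INPUT_PRICE_PER_MTOK = 0.30
OUTPUT_PRICE_PER_MTOK = 2.50

def request_cost(input_tokens, output_tokens)
  input_tokens / 1_000_000.0 * INPUT_PRICE_PER_MTOK +
    output_tokens / 1_000_000.0 * OUTPUT_PRICE_PER_MTOK
end

# Output tokens usually cost several times more than input tokens,
# so a short prompt with a long answer costs more than the reverse:
request_cost(2_000, 500) # ~$0.00185
request_cost(500, 2_000) # ~$0.00515
```

<p>Two requests with the same total token count can differ several-fold in cost, which is exactly why per-request tracking beats monthly invoice surprises.</p>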

<h2 id="why-monitoring-llm-interactions-matters">Why monitoring LLM interactions matters</h2>

<p>Beyond basic visibility, monitoring unlocks practical improvements:</p>

<ul>
  <li><strong>Cost management</strong>: Track which models and features are costing you money, then focus optimization efforts where they matter. When 80% of your costs come from one feature, you can try a cheaper model, add caching, optimize prompts, or, if the provider and feature allow it, use <a href="https://sinaptia.dev/posts/the-untold-challenges-of-openai-s-batch-processing-api">batch processing</a>.</li>
  <li><strong>Performance tracking and anomaly detection</strong>: Monitor response times to identify slow prompts and set realistic expectations. A sudden spike in latency or requests usually means something changed—a bug causing retries, or model performance issues—and monitoring helps you correlate changes with their impact.</li>
  <li><strong>Capacity planning</strong>: Understanding your throughput patterns (requests per minute, hour, day) helps you forecast costs and identify features that might benefit from caching or batching.</li>
  <li><strong>Provider comparison</strong>: With multiple LLM providers offering similar capabilities at different price points, monitoring helps you make informed decisions about which model delivers the best results for your use case.</li>
  <li><strong>Reporting</strong>: Product managers and stakeholders want to know what AI is costing. With monitoring data in your database, generating reports is a SQL query away.</li>
  <li><strong>Model migration planning</strong>: When a provider releases a new model or changes pricing, you can estimate the impact on your costs before making the switch.</li>
</ul>

<h2 id="introducing-rubyllmmonitoring">Introducing RubyLLM::Monitoring</h2>

<p>As you might guess, after deploying our AI-powered features, we had several usage spikes that threatened their viability. We needed to monitor our LLM usage in production. At first, we did it manually, using whatever each inference platform provided. But as we started using different providers and models across several features, manually tracking cost and token usage became complicated and error-prone. So we built <a href="https://github.com/sinaptia/ruby_llm-monitoring">RubyLLM::Monitoring</a>: a Rails engine that tracks every LLM request your application makes and provides a dashboard where you can see cost, throughput, response time, and error aggregations. On top of that, you can set up alerts so that you receive an email or a Slack notification when something you care about happens.</p>

<p>As the name suggests, it’s built on top of <a href="https://github.com/crmne/ruby_llm">RubyLLM</a> and integrates seamlessly with your existing setup. No separate infrastructure, no external services, just another engine mounted in your Rails app.</p>

<h3 id="how-it-works">How it works</h3>

<p>The engine instruments every LLM request your app makes (stay tuned for a related post) and saves it to your database. Cost is calculated automatically using RubyLLM’s built-in pricing data. Since everything lives in your database, you can run custom queries when the dashboard isn’t enough.</p>
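<p>For example, a per-model cost breakdown is one query away. The sketch below shows the shape of that aggregation over plain hashes standing in for rows; in your app you’d run the equivalent <code class="language-plaintext highlighter-rouge">group</code>/<code class="language-plaintext highlighter-rouge">sum</code> against the engine’s events table (model names and costs here are illustrative):</p>

```ruby
# Each hash stands in for a persisted LLM request event.
events = [
  {model: "gemini-2.5-flash", cost: 0.0012},
  {model: "gemini-2.5-flash", cost: 0.0030},
  {model: "gpt-4o-mini", cost: 0.0008}
]

# Equivalent in spirit to: SELECT model, SUM(cost) FROM events GROUP BY model
cost_by_model = events
  .group_by { |e| e[:model] }
  .transform_values { |rows| rows.sum { |r| r[:cost] } }
```

<p>Because the rows live in your own database, any breakdown the dashboard doesn’t offer (per feature, per tenant, per day) is the same pattern with a different grouping key.</p>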

<h3 id="the-dashboard">The dashboard</h3>

<p>Once installed, you get a dashboard at <code class="language-plaintext highlighter-rouge">/monitoring</code> (or wherever you mount it) with:</p>

<ul>
  <li>Summary cards showing total requests, total cost, average response time, and error rate.</li>
  <li>A breakdown table grouping metrics by provider and model, so you can see at a glance which models are being used and what they’re costing you.</li>
  <li>Metrics:
    <ul>
      <li><strong>Throughput</strong>: Request count over time</li>
      <li><strong>Cost</strong>: Accumulated costs per time window</li>
      <li><strong>Response time</strong>: Average latency trends</li>
      <li><strong>Error rate</strong>: Percentage of failed requests</li>
    </ul>
  </li>
</ul>

<p><img src="/assets/images/posts/monitoring-llm-usage-in-rails-with-rubyllm-monitoring/metrics.webp" alt="RubyLLM::Monitoring metrics" /></p>
<p class="text-sm italic mb-5">Demo with slow responses and a high error rate.</p>

<p><img src="/assets/images/posts/monitoring-llm-usage-in-rails-with-rubyllm-monitoring/alerts.webp" alt="RubyLLM::Monitoring alerts" /></p>

<h3 id="alerts">Alerts</h3>

<p>Beyond the dashboard, you can configure custom alert rules to notify you when specific conditions are met. This is essential for catching cost overruns, error spikes, or unusual patterns before they become problems.</p>

<p>Alert rules are flexible and can trigger based on any condition you can express as a query. Here are some practical examples:</p>

<div class="language-ruby highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1"># config/initializers/ruby_llm_monitoring.rb</span>
<span class="no">RubyLLM</span><span class="o">::</span><span class="no">Monitoring</span><span class="p">.</span><span class="nf">channels</span> <span class="o">=</span> <span class="p">{</span>
  <span class="ss">email: </span><span class="p">{</span> <span class="ss">to: </span><span class="s2">"team@example.com"</span> <span class="p">},</span>
  <span class="ss">slack: </span><span class="p">{</span> <span class="ss">webhook_url: </span><span class="no">ENV</span><span class="p">[</span><span class="s2">"SLACK_WEBHOOK_URL"</span><span class="p">]</span> <span class="p">}</span>
<span class="p">}</span>

<span class="no">RubyLLM</span><span class="o">::</span><span class="no">Monitoring</span><span class="p">.</span><span class="nf">alert_rules</span> <span class="o">+=</span> <span class="p">[{</span>
  <span class="ss">time_range: </span><span class="o">-&gt;</span> <span class="p">{</span> <span class="no">Time</span><span class="p">.</span><span class="nf">current</span><span class="p">.</span><span class="nf">at_beginning_of_month</span><span class="o">..</span> <span class="p">},</span>
  <span class="ss">rule: </span><span class="o">-&gt;</span><span class="p">(</span><span class="n">events</span><span class="p">)</span> <span class="p">{</span> <span class="n">events</span><span class="p">.</span><span class="nf">sum</span><span class="p">(</span><span class="ss">:cost</span><span class="p">)</span> <span class="o">&gt;=</span> <span class="mi">500</span> <span class="p">},</span>
  <span class="ss">channels: </span><span class="p">[</span><span class="ss">:email</span><span class="p">,</span> <span class="ss">:slack</span><span class="p">],</span>
  <span class="ss">message: </span><span class="p">{</span> <span class="ss">text: </span><span class="s2">"More than $500 spent this month"</span> <span class="p">}</span>
<span class="p">},</span> <span class="p">{</span>
  <span class="ss">time_range: </span><span class="o">-&gt;</span> <span class="p">{</span> <span class="mi">1</span><span class="p">.</span><span class="nf">day</span><span class="p">.</span><span class="nf">ago</span><span class="o">..</span> <span class="p">},</span>
  <span class="ss">rule: </span><span class="o">-&gt;</span><span class="p">(</span><span class="n">events</span><span class="p">)</span> <span class="p">{</span> <span class="n">events</span><span class="p">.</span><span class="nf">average</span><span class="p">(</span><span class="ss">:response_time</span><span class="p">)</span> <span class="o">&gt;</span> <span class="mi">5000</span> <span class="p">},</span>
  <span class="ss">channels: </span><span class="p">[</span><span class="ss">:slack</span><span class="p">],</span>
  <span class="ss">message: </span><span class="p">{</span> <span class="ss">text: </span><span class="s2">"Average response time exceeded 5 seconds"</span> <span class="p">}</span>
<span class="p">}]</span>
</code></pre></div></div>

<p>Alert rules have built-in cooldown periods to prevent notification spam, and you can customize channels for each rule. You can even build custom notification channels beyond the built-in email and Slack options.</p>

<h2 id="conclusion">Conclusion</h2>

<p>Building AI-powered features doesn’t end at deployment. The models you depend on are expensive, their performance varies, and usage patterns shift over time; in a rapidly evolving AI landscape, models and providers themselves are a moving target. Without proper visibility, all you have are guesses. So we built <a href="https://github.com/sinaptia/ruby_llm-monitoring">RubyLLM::Monitoring</a>.</p>

<p>Give it a try, send us feedback, and contribute if you want to!</p>

<hr />

<p><em>At SINAPTIA, <a href="/posts/building-intelligent-applications-with-rails">we specialize in helping businesses implement AI solutions</a> that deliver real value. If you’re facing challenges with LLM monitoring or AI integration, we’d love to help.</em></p>

<h2 id="references">References</h2>

<div class="footnotes" role="doc-endnotes">
  <ol>
    <li id="fn:1">
      <p>If you don’t know how to do this, we’ll have a surprise for you soon. <a href="#fnref:1" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
  </ol>
</div>]]></content><author><name>Patricio Mac Adden</name></author><category term="Ruby on Rails" /><category term="AI" /><summary type="html"><![CDATA[When you're using multiple LLM providers, tracking costs manually becomes impossible fast. We needed visibility into our AI spending and LLM performance. Here's the monitoring engine we built for Rails.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://sinaptia.dev/assets/images/logo-black.png" /><media:content medium="image" url="https://sinaptia.dev/assets/images/logo-black.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">la_plata.rb November meetup</title><link href="https://sinaptia.dev/posts/la-plata-rb-november-meetup" rel="alternate" type="text/html" title="la_plata.rb November meetup" /><published>2025-12-02T00:00:00+00:00</published><updated>2025-12-02T00:00:00+00:00</updated><id>https://sinaptia.dev/posts/la-plata-rb-november-meetup</id><content type="html" xml:base="https://sinaptia.dev/posts/la-plata-rb-november-meetup"><![CDATA[<p>On November 27th, 2025, the <a href="http://laplatarb.github.io/">la_plata.rb</a> community came together at Calle Uno for what became the first and last meetup of the year in the city. More than 30 Ruby developers gathered together to share knowledge, experiences, and drinks.</p>

<p>The meetup was made possible thanks to the support of RubyCentral, GitHub, SINAPTIA, and Unagi. Many thanks to our sponsors; this wouldn’t have been possible without them.</p>

<h2 id="observability-en-la-era-de-ai">Observability en la era de AI</h2>

<p>The first talk was presented by Patricio Mac Adden from SINAPTIA, who shared his team’s journey combining observability tools with LLMs to tackle real-world Rails application problems. With an upfront disclaimer that this wasn’t “the definitive solution” but rather hard-earned experience from the trenches, Patricio dove into a fascinating story.</p>

<p>He started by providing context on how SINAPTIA uses LLMs daily—from AI agents that help with programming, code reviews, and debugging, to production features like <a href="https://sinaptia.dev/posts/scaling-image-classification-with-ai">image classification</a>, <a href="https://sinaptia.dev/posts/upscaling-images-with-ai">image upscaling</a>, <a href="https://sinaptia.dev/posts/improving-a-similarity-search-with-ai">similarity search</a>, and <a href="https://sinaptia.dev/posts/mcp-on-rails">MCP integration</a>. The goal? Optimize time and maximize value.</p>

<p>The talk then explored the observability challenges they faced: applications with performance problems, memory leaks, slow actions, and scarce hardware and software resources (think free-tier APMs or no APM at all), all while trying to develop new features. Their DIY APM journey took them from ActiveSupport::Notifications and ActiveSupport::ErrorReporter to OpenTelemetry, eventually leading to <a href="https://github.com/sinaptia/solid_telemetry">SolidTelemetry</a>.</p>

<p>But when they tried combining their APM data with LLMs for automated problem-solving, they hit roadblocks: the process was too manual (exporting traces, exceptions, and performance items was tedious), too repetitive (you had to tell the LLM what to do every time), used too much context (OpenTelemetry exports many spans per trace), and ultimately wasn’t LLM-friendly.</p>

<p>The solution? <a href="https://github.com/sinaptia/mini_telemetry">MiniTelemetry</a>. A simpler, more lightweight approach that replaces traces/spans with events, eliminates metrics, and is built on top of Rails’ native ActiveSupport::Notifications and ActiveSupport::ErrorReporter. Most importantly, it’s LLM-friendly by design. The talk concluded with a live demo showing how this approach works in practice.</p>

<p align="center" width="100%">
  <img class="w-[70%]" alt="Patricio's talk" src="/assets/images/posts/la-plata-rb-november-meetup/1.webp" />
</p>

<h2 id="elijo-tu-propia-aventura">Elijo tu propia aventura</h2>

<p>The second presentation, delivered by Renzo Quaggia from Unagi, took a creative storytelling approach inspired by the classic “Choose Your Own Adventure” books. Rather than an interactive format, Renzo shared his real-world experience working on a checkout page, a critical part of any e-commerce system where every decision can significantly impact conversion rates.</p>

<p>Renzo walked the audience through the decision points he faced during the project, much like the branching paths in those beloved adventure books. The talk explored how these choices ultimately led him to implement A/B testing as a solution, allowing data rather than assumptions to guide which path to take. It was a practical reminder that in software development, we often face multiple valid approaches, and sometimes the best answer is to test them all.</p>

<p align="center" width="100%">
  <img class="w-[70%]" alt="Renzo's talk" src="/assets/images/posts/la-plata-rb-november-meetup/2.webp" />
</p>

<p>As with any good Ruby meetup, the event concluded with time for networking, sharing experiences, and connecting with other developers over food and drinks.</p>

<p>Thanks again to RubyCentral, GitHub, SINAPTIA, and Unagi for making this event possible, and to everyone who attended. We hope next year we can see more events like this!</p>]]></content><author><name>SINAPTIA</name></author><category term="Ruby" /><category term="Community" /><summary type="html"><![CDATA[On November 27th, the la_plata.rb community gathered for its first and last meetup of 2025. Observability, AI, and A/B testing.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://sinaptia.dev/assets/images/logo-black.png" /><media:content medium="image" url="https://sinaptia.dev/assets/images/logo-black.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">Ruby Argentina November meetup</title><link href="https://sinaptia.dev/posts/ruby-argentina-november-meetup" rel="alternate" type="text/html" title="Ruby Argentina November meetup" /><published>2025-11-21T00:00:00+00:00</published><updated>2025-11-21T00:00:00+00:00</updated><id>https://sinaptia.dev/posts/ruby-argentina-november-meetup</id><content type="html" xml:base="https://sinaptia.dev/posts/ruby-argentina-november-meetup"><![CDATA[<p>On November 13th, the <a href="https://ruby.com.ar/">Ruby Argentina</a> community concluded the current year’s agenda of meetups. 2025 was an amazing year for the group, marked by the flow of speakers and sponsors, as well as the organization’s mechanics for each event. The 100% online events, which opened the field to speakers from around the world, such as Jason Swett and Rosa Gutierrez from Basecamp, are all remarkable achievements for the community. Props to the organization team and their fantastic work.</p>

<p>The main talk was given by Fernando E. Silva Jacquier, who shared his perspective on “Expressive coding”: the pros and cons of writing code in a more human way, and the importance of understanding what happens under the hood when expressions sound the same but behave slightly differently.</p>

<p align="center" width="100%">
  <img class="w-[70%]" alt="The first talk" src="/assets/images/posts/ruby-argentina-november-meetup/1.webp" />
</p>

<p>After a short break to eat empanadas and drink beers, there were some ⚡ lightning talks ⚡, a space to relax and share whatever you want in 5 minutes or less. Among the highlights were our own Nazareno Moresco sharing his journey understanding the Mayan calendar, which led to a fun side project gem (<a href="https://github.com/nazamoresco/mayan">Mayan</a>), and Gemma Falconi talking about turtles, those lovely and tiny dinosaurs.</p>

<p align="center" width="100%" class="flex-row md:flex space-y-4 md:gap-x-4">
  <img class="w-[70%] md:w-[50%]" alt="Nazareno's lightning talk" src="/assets/images/posts/ruby-argentina-november-meetup/2.webp" />
  <img class="w-[70%] md:w-[50%]" alt="Gemma's lightning talk" src="/assets/images/posts/ruby-argentina-november-meetup/3.webp" />
</p>

<p>Huge thanks to sponsors and organizers (<a href="https://sinaptia.dev/">SINAPTIA</a>, Rootstrap, Ombulabs, Eagerworks, Moony, Roxom, Crunchloop, Svitla, and LeWagon) and all the people who made each meeting special. We hope to see you again in 2026.</p>]]></content><author><name>SINAPTIA</name></author><category term="Ruby" /><category term="Community" /><summary type="html"><![CDATA[Last week we attended to the last meetup of the year from the Ruby Argentina community in Buenos Aires. Expressive coding and a lot of fun lightning talks.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://sinaptia.dev/assets/images/logo-black.png" /><media:content medium="image" url="https://sinaptia.dev/assets/images/logo-black.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">What’s actually slow? A practical guide to Rails performance</title><link href="https://sinaptia.dev/posts/whats-actually-slow" rel="alternate" type="text/html" title="What’s actually slow? A practical guide to Rails performance" /><published>2025-11-06T00:00:00+00:00</published><updated>2025-11-06T00:00:00+00:00</updated><id>https://sinaptia.dev/posts/whats-actually-slow</id><content type="html" xml:base="https://sinaptia.dev/posts/whats-actually-slow"><![CDATA[<p>For the last couple of months, we’ve been building an observability tool that we intend to use internally in our AI-powered solutions. One of the features we wanted to work on was slow action detection, but… What makes an action slow? It’s one of those questions that sounds simple but gets interesting fast. Let’s break it down.</p>

<h2 id="what-users-actually-experience">What users actually experience</h2>

<p>When a request hits your Rails app and a response goes back, that total time is just a portion of what users experience. Server response time is crucial, but it’s only one piece of perceived performance:</p>

<ul>
  <li>Network round-trip matters. Your app might respond in 100ms, but if the user is on a slow connection or geographically far from your server, they might wait 500ms for the round-trip. A fast server doesn’t fix slow networks.</li>
  <li>Download and rendering matter. Once the HTML arrives, the browser needs to download CSS, JavaScript, and images. Then it needs to parse, render, and potentially hydrate a JavaScript framework. A 100ms server response followed by 2 seconds of asset downloads and rendering feels slow to users.</li>
</ul>

<p>Performance has to be looked at holistically: server time, network latency, asset delivery, and browser rendering all add up to what users experience. In this post, we’ll focus exclusively on server response time.</p>

<h2 id="percentiles-the-right-way-to-measure">Percentiles: the right way to measure</h2>

<p>You’ve got a group of similar actions. Some are fast, some are slow. What metric do you use to decide whether an action is “slow”?</p>

<p>You shouldn’t use the average. The average lies. Imagine 99 requests at 50ms and 1 request at 5 seconds. Your average is 99.5ms, which looks great! But 1% of your users just waited 5 seconds. That’s not acceptable. Depending on the size of your user base, that 1% can be considered an outlier, but if your user base is large, it means a lot of people are having a bad experience.</p>
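<p>The arithmetic from that example, spelled out:</p>

```ruby
# 99 requests at 50ms plus a single request at 5 seconds
times_ms = [50] * 99 + [5_000]

average = times_ms.sum / times_ms.length.to_f
# => 99.5 -- the 5-second request all but disappears from the metric
```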

<p>Percentiles show you what real users experience:</p>

<ul>
  <li>P50 (median): The middle. Half your requests are faster, half are slower.</li>
  <li>P95: 95% of requests are faster than this number.</li>
  <li>P99: 99% of requests are faster than this number.</li>
</ul>

<p>Here’s what it looks like in practice:</p>

<p>Action: posts#index</p>
<ul>
  <li>P50: 120ms    ← typical case</li>
  <li>P95: 450ms    ← 5% of users wait this long or more</li>
  <li>P99: 2.1s     ← 1% of users are suffering</li>
</ul>

<p>That P99 of 2.1 seconds is telling you something. If you have 1000 requests a day, that’s 10 users waiting over 2 seconds every single day.</p>
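<p>The idea can be reproduced in a few lines of plain Ruby. This is a hedged sketch: the sample durations are invented for illustration, and the nearest-rank method used here is just one of several accepted ways to compute percentiles:</p>

```ruby
# Nearest-rank percentile: the value below which pct% of samples fall.
# pct is an integer percentage (50, 95, 99); integer math avoids float
# rounding surprises in the ceiling computation.
def percentile(durations, pct)
  sorted = durations.sort
  rank = (pct * sorted.length + 99) / 100 # integer ceil of pct% of n
  sorted[rank - 1]
end

# Invented sample: mostly fast requests, some slow, a handful terrible.
durations = [120] * 940 + [450] * 49 + [2100] * 11

average = durations.sum / durations.length.to_f # 157.95 -- looks fine
p50 = percentile(durations, 50)                 # 120  -- typical case
p95 = percentile(durations, 95)                 # 450  -- 1 in 20 waits this long
p99 = percentile(durations, 99)                 # 2100 -- the suffering 1%
```

<p>Notice how the average hides the tail entirely: it sits close to the median while 1% of requests take over 2 seconds.</p>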

<h3 id="which-percentile-should-you-use">Which Percentile Should You Use?</h3>

<h4 id="p50-median-too-optimistic">P50 (median): Too optimistic</h4>

<p>P50 only tells you about the typical case. It completely ignores tail latency, i.e., the slow requests that frustrate users.</p>

<p>If P50 is 120ms but P95 is 2 seconds, you have a serious problem that P50 won’t show you. Half your users get a fast experience, but a significant chunk are having a terrible time.</p>

<p>Don’t use P50 to decide what’s slow. It hides too much.</p>

<h4 id="p95-the-sweet-spot">P95: The sweet spot</h4>

<p>P95 catches problems that affect enough users to matter. If P95 is 2 seconds, that means 5% of your users (1 in 20) are waiting that long. That’s significant.</p>

<p>It’s not so sensitive that every minor blip flags the system. You’re looking at the experience of a meaningful percentage of users, not just the absolute worst cases.</p>

<p>When to use P95:</p>
<ul>
  <li>Setting performance thresholds for alerts</li>
  <li>Deciding if an action needs optimization</li>
  <li>Comparing performance across different endpoints</li>
</ul>

<h4 id="p99-more-aggressive-catches-edge-cases">P99: More aggressive, catches edge cases</h4>

<p>P99 is more aggressive than P95 as it looks at the worst 1% of requests. This catches the outliers, the edge cases, the weird scenarios.</p>

<p>Use P99 when:</p>
<ul>
  <li>You want to understand your absolute worst-case performance</li>
  <li>You’re debugging specific slow requests</li>
  <li>You have extremely high traffic, and 1% still represents many users</li>
  <li>You’re operating at a scale where tail latency really matters (think Amazon, Google)</li>
</ul>

<p>But for flagging what’s “slow” in most applications, P99 can be too noisy. That worst 1% might include legitimate edge cases—a user with a massive dataset, a bot, a weird network condition. Flagging everything where P99 exceeds your threshold might give you too many false positives.</p>

<h4 id="the-decision-rule">The decision rule</h4>

<p>Use P95 as your threshold for marking something as slow. Monitor P99 too; it tells you about edge cases worth investigating. But make decisions based on P95. Why? Because P95 catches problems that affect enough users to matter without drowning you in noise from edge cases.</p>

<h2 id="what-actually-matters-server-response-time">What actually matters: server response time</h2>

<p>Rails tells you this for free:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>Completed 200 OK in 250ms (Views: 180ms | ActiveRecord: 45ms)
</code></pre></div></div>

<p>That 250ms is what the server spent processing the request. In practice, here’s how those numbers translate to user experience:</p>

<p>Fast enough that nobody complains:</p>
<ul>
  <li>Under 100ms: Feels instant. Users are happy.</li>
  <li>100-200ms: Still responsive. Most users won’t notice.</li>
</ul>

<p>Getting into trouble territory:</p>
<ul>
  <li>200-500ms: Noticeable. Not great, not terrible.</li>
  <li>500ms-1s: Users are tapping their fingers.</li>
  <li>1-3 seconds: You’re losing people.</li>
  <li>Over 3 seconds: They’ve already opened another tab.</li>
</ul>

<p>Of course, context matters. A simple action with basic queries should be under 200ms, while a complex dashboard with aggregations spending 500ms to a second might be acceptable. But anything consistently over 500ms deserves investigation.</p>
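<p>If you ever need to aggregate these per-request timings yourself, the completion line can be parsed with a regex. This is a rough sketch that assumes the default log format shown above; a real app should prefer <code class="language-plaintext highlighter-rouge">ActiveSupport::Notifications</code> or an APM over scraping logs:</p>

```ruby
# Matches the default Rails completion log line and captures the three
# timings (total, view rendering, ActiveRecord) in milliseconds.
COMPLETED = /Completed \d{3} .+ in (?<total>[\d.]+)ms \(Views: (?<views>[\d.]+)ms \| ActiveRecord: (?<db>[\d.]+)ms\)/

line  = "Completed 200 OK in 250ms (Views: 180ms | ActiveRecord: 45ms)"
match = line.match(COMPLETED)

total_ms = match[:total].to_f # 250.0
views_ms = match[:views].to_f # 180.0
db_ms    = match[:db].to_f    # 45.0
```

<p>Feed enough of these into the percentile calculation above and you have per-action P50/P95/P99 without any extra tooling.</p>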

<h2 id="breaking-down-the-bottlenecks">Breaking down the bottlenecks</h2>

<p>Your action response time is the sum of its parts. This is what we use as a baseline when we analyze each component of a request. Bear in mind that these values are just guidelines; they vary from project to project and can be influenced by business requirements (e.g., SEO penalties) or context (e.g., for an admin interface that’s used sparingly for very specific tasks, it’s fine to relax them a little).</p>

<h3 id="database-queries">Database Queries</h3>

<p>Your actions are only as fast as your slowest queries.</p>

<p>Fast:</p>
<ul>
  <li>Under 10ms: Perfect. Nothing to do here, this is probably a properly designed query using the correct indexes.</li>
  <li>10-50ms: Good for queries with optimized joins.</li>
</ul>

<p>Acceptable:</p>
<ul>
  <li>50-100ms: Fine for moderately complex queries.</li>
  <li>100-200ms: Okay for heavy aggregations.</li>
</ul>

<p>Slow:</p>
<ul>
  <li>200-500ms: Here we start seeing things that are worth investigating.</li>
  <li>500ms-1s: Definitely needs work.</li>
  <li>Over 1 second: Critical. These must be fixed if they sit on a critical path.</li>
</ul>

<p>Simple queries (single table, indexed columns) should be under 10ms. If <code class="language-plaintext highlighter-rouge">User.find(123)</code> is taking 50ms, something’s wrong. Complex queries with joins and aggregations? They should be under 200ms.</p>

<p>The common root causes of slow queries we see in performance optimization work are missing indexes on foreign keys or WHERE/ORDER BY columns, N+1 queries, full table scans on large tables, and LIKE queries with wildcards on both sides, which can’t use a standard index.</p>

<p>The power tool to uncover these: <code class="language-plaintext highlighter-rouge">EXPLAIN ANALYZE</code>. It will let you see execution plans and identify missing indexes or sequential scans.</p>

<h3 id="view-rendering">View Rendering</h3>

<p>View rendering time is usually high because of:</p>

<ul>
  <li>Rendering too many partials (<a href="https://sinaptia.dev/posts/rails-views-performance-matters">partials are slow!</a>)</li>
  <li>N+1 queries hidden in view code</li>
  <li>Not <a href="https://sinaptia.dev/posts/think-before-you-cache">using fragment caching</a> where you could</li>
</ul>

<p>Our suggestion for flagging views as slow is: if they are consistently over 100ms, investigate.</p>

<h3 id="external-api-calls">External API Calls</h3>

<p>An action is only as fast as its slowest code statement, and hitting an external service in an action <em>will kill</em> your response time. It’s not always avoidable, but work hard to avoid calling 3rd-party services over the network while processing a request. Move those calls to background jobs and design the business process to accommodate the asynchronicity.</p>

<p>In cases where the above is not possible, we try to target under 200ms for API calls. Anything over 500ms should be moved to background jobs or cached aggressively.</p>

<p>If you must make synchronous API calls, remember to set timeouts and have fallback behavior or use circuit breakers.</p>
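<p>As an illustration of the timeout-plus-fallback idea, here’s a minimal Ruby sketch. The rates endpoint and fallback value are hypothetical, and a production app would likely reach for a circuit-breaker gem instead of a hand-rolled rescue:</p>

```ruby
require "net/http"
require "timeout"

# Run a third-party call under a hard deadline; return a fallback value
# instead of blowing up the whole request if the provider is slow or down.
def with_fallback(deadline_seconds, fallback)
  Timeout.timeout(deadline_seconds) { yield }
rescue Timeout::Error, Net::OpenTimeout, Net::ReadTimeout,
       SocketError, Errno::ECONNREFUSED
  fallback
end

# Net::HTTP's own per-phase timeouts are preferable when you can use them:
def fetch_rates(uri)
  Net::HTTP.start(uri.host, uri.port, use_ssl: uri.scheme == "https",
                  open_timeout: 0.2, read_timeout: 0.2) do |http|
    http.get(uri.request_uri).body
  end
end

# Usage: serve a cached/default value when the provider hangs.
rates = with_fallback(0.5, { "USD" => 1.0 }) do
  # fetch_rates(URI("https://rates.example.com/latest")) # hypothetical endpoint
  sleep 2 # simulate a hanging provider
end
# rates is { "USD" => 1.0 } because the call exceeded the deadline
```

<p>The key design point is that the fallback path is decided up front, so a slow provider degrades one feature instead of the whole response.</p>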

<h2 id="tldr-thresholds">TL;DR: Thresholds</h2>

<p>Here’s what to flag as slow using P95:</p>

<ul>
  <li>Actions: P95 &gt; 500ms</li>
  <li>Database queries: P95 &gt; 100ms</li>
  <li>API calls: P95 &gt; 200ms</li>
</ul>
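<p>Applied to per-action samples, these thresholds boil down to a simple check. The action names and timings below are invented for illustration:</p>

```ruby
THRESHOLDS_MS = { action: 500, db_query: 100, api_call: 200 }

# Nearest-rank P95 over a list of durations in milliseconds.
def p95(durations)
  sorted = durations.sort
  sorted[(95 * sorted.length + 99) / 100 - 1]
end

samples = {
  "posts#index" => [120, 140, 180, 600, 650],
  "posts#show"  => [40, 45, 50, 55, 60]
}

slow_actions = samples.select { |_, d| p95(d) > THRESHOLDS_MS[:action] }
slow_actions.keys # => ["posts#index"]
```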

<p>And remember, these thresholds can vary from project to project, influenced by business requirements (e.g., SEO penalties) or context (e.g., for an admin interface that’s used sparingly for very specific tasks, it’s fine to relax them a little), but they work as solid starting points.</p>

<h2 id="conclusion">Conclusion</h2>

<p>Performance is a whole-picture concern: users experience the sum of server time, network latency, asset delivery, and browser rendering.</p>

<p>Of all these components, server time is where you have the most control. Every millisecond you shave off server response time is a millisecond removed from the total the user waits.</p>

<p>Look at P95 for your actions. Find the bottlenecks (database queries, view rendering, API calls) and fix what’s making users wait.</p>

<p>Always take the whole picture into account when prioritizing performance-related work, and put your effort where it will give your users the biggest benefit.</p>]]></content><author><name>Patricio Mac Adden</name></author><category term="Ruby on Rails" /><category term="Performance" /><summary type="html"><![CDATA[Learn how to measure and identify slow Rails actions and their components: database queries, view rendering, and API calls.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://sinaptia.dev/assets/images/logo-black.png" /><media:content medium="image" url="https://sinaptia.dev/assets/images/logo-black.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry></feed>