What's actually slow? A practical guide to Rails performance

Learn how to measure and identify slow Rails actions and their components: database queries, view rendering, and API calls.

Nov 6th, 2025
By Patricio Mac Adden

For the last couple of months, we’ve been building an observability tool that we intend to use internally in our AI-powered solutions. One of the features we wanted to work on was slow action detection, but… What makes an action slow? It’s one of those questions that sounds simple but gets interesting fast. Let’s break it down.

What users actually experience

When a request hits your Rails app and a response goes back, that total time is just a portion of what users experience. Server response time is crucial, but it’s only one piece of perceived performance:

  • Network round-trip matters. Your app might respond in 100ms, but if the user is on a slow connection or geographically far from your server, they might wait 500ms for the round-trip. A fast server doesn’t fix slow networks.
  • Download and rendering matter. Once the HTML arrives, the browser needs to download CSS, JavaScript, and images. Then it needs to parse, render, and potentially hydrate a JavaScript framework. A 100ms server response followed by 2 seconds of asset downloads and rendering feels slow to users.

Performance has to be looked at as a whole: server time, network latency, asset delivery, and browser rendering all add up to what users experience. In this post, we will focus exclusively on server response time.

Percentiles: the right way to measure

You’ve got a group of similar actions. Some are fast, some are slow. What metric do you use to decide if it’s “slow”?

You shouldn’t use the average. The average lies. Imagine 99 requests at 50ms and 1 request at 5 seconds. Your average is 99.5ms, which looks great! But 1% of your users just waited 5 seconds. That’s not acceptable. Depending on the size of your user base, that 1% can be considered an outlier, but if your user base is large, it means a lot of people are having a bad experience.
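The arithmetic is easy to check (the request counts and durations here are the ones from the example above, not real data):

```ruby
# 99 fast requests at 50ms plus one 5-second outlier.
durations_ms = [50] * 99 + [5000]

average = durations_ms.sum.to_f / durations_ms.size
puts average # => 99.5, which hides the 5-second request entirely
```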

Percentiles show you what real users experience:

  • P50 (median): The middle. Half your requests are faster, half are slower.
  • P95: 95% of requests are faster than this number.
  • P99: 99% of requests are faster than this number.

Here’s what it looks like in practice:

Action: posts#index

  • P50: 120ms ← typical case
  • P95: 450ms ← 5% of users wait this long or more
  • P99: 2.1s ← 1% of users are suffering

That P99 of 2.1 seconds is telling you something. If you have 1,000 requests a day, that’s 10 requests taking over 2 seconds every single day.
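Computing these from a sample of durations is straightforward. Here is a minimal nearest-rank sketch (real APM tools use streaming estimators, but the idea is the same):

```ruby
# Nearest-rank percentile over a sample of request durations (in ms):
# the smallest value such that pct% of the samples are <= it.
def percentile(samples, pct)
  sorted = samples.sort
  rank = (pct / 100.0 * sorted.size).ceil - 1
  sorted[rank.clamp(0, sorted.size - 1)]
end

durations = (1..100).to_a          # toy data: 1ms..100ms
percentile(durations, 50)  # => 50
percentile(durations, 95)  # => 95
percentile(durations, 99)  # => 99
```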

Which percentile should you use?

P50 (median): Too optimistic

P50 only tells you about the typical case. It completely ignores tail latency, i.e., the slow requests that frustrate users.

If P50 is 120ms but P95 is 2 seconds, you have a serious problem that P50 won’t show you. Half your users get a fast experience, but a significant chunk are having a terrible time.

Don’t use P50 to decide what’s slow. It hides too much.

P95: The sweet spot

It catches problems that affect enough users to matter. If P95 is 2 seconds, that means 5% of your users (1 in 20) are waiting that long. That’s significant.

It’s not so sensitive that every minor blip flags the system. You’re looking at the experience of a meaningful percentage of users, not just the absolute worst cases.

When to use P95:

  • Setting performance thresholds for alerts
  • Deciding if an action needs optimization
  • Comparing performance across different endpoints

P99: More aggressive, catches edge cases

P99 is more aggressive than P95 as it looks at the worst 1% of requests. This catches the outliers, the edge cases, the weird scenarios.

Use P99 when:

  • You want to understand your absolute worst-case performance
  • You’re debugging specific slow requests
  • You have extremely high traffic, and 1% still represents many users
  • You’re operating at a scale where tail latency really matters (think Amazon, Google)

But for flagging what’s “slow” in most applications, P99 can be too noisy. That worst 1% might include legitimate edge cases—a user with a massive dataset, a bot, a weird network condition. Flagging everything where P99 exceeds your threshold might give you too many false positives.

The decision rule

Use P95 as your threshold for marking something as slow. Monitor P99 too; it tells you about edge cases worth investigating. But make decisions based on P95. Why? Because P95 catches problems that affect enough users to matter without drowning you in noise from edge cases.

What actually matters: server response time

Rails tells you this for free:

Completed 200 OK in 250ms (Views: 180ms | ActiveRecord: 45ms)
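If you want those numbers programmatically, one low-tech option is parsing that log line; a sketch (production apps should subscribe to `process_action.action_controller` via ActiveSupport::Notifications instead of scraping logs):

```ruby
# Pull the timings out of a Rails "Completed ..." log line.
LINE = "Completed 200 OK in 250ms (Views: 180ms | ActiveRecord: 45ms)"

match = LINE.match(/in (\d+)ms \(Views: ([\d.]+)ms \| ActiveRecord: ([\d.]+)ms\)/)
total = match[1].to_i
views = match[2].to_f
db    = match[3].to_f
other = total - views - db # controller code, middleware, serialization, ...

puts "total=#{total}ms views=#{views}ms db=#{db}ms other=#{other}ms"
```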

That 250ms is what the server spent processing the request. Here’s how those numbers map to user perception in practice:

Fast enough that nobody complains:

  • Under 100ms: Feels instant. Users are happy.
  • 100-200ms: Still responsive. Most users won’t notice.

Getting into trouble territory:

  • 200-500ms: Noticeable. Not great, not terrible.
  • 500ms-1s: Users are tapping their fingers.
  • 1-3 seconds: You’re losing people.
  • Over 3 seconds: They’ve already opened another tab.

Of course, the context matters. A simple action with basic queries should be under 200ms. A complex dashboard with aggregations spending 500ms to a second might be acceptable. But anything consistently over 500ms deserves investigation.

Breaking down the bottlenecks

Your action response time is the sum of its parts. This is what we use as a baseline when we analyze each component of a request. Bear in mind that these values are just guidelines; they can vary from project to project and be influenced by business requirements (e.g., SEO penalties) or context (e.g., an admin interface that’s used sparingly for very specific tasks can afford to relax them a bit).

Database Queries

Your actions are only as fast as your slowest queries.

Fast:

  • Under 10ms: Perfect. Nothing to do here; this is probably a well-designed query using the correct indexes.
  • 10-50ms: Good for queries with optimized joins.

Acceptable:

  • 50-100ms: Fine for moderately complex queries.
  • 100-200ms: Okay for heavy aggregations.

Slow:

  • 200-500ms: Here we start seeing things that are worth investigating.
  • 500ms-1s: Definitely needs work.
  • Over 1 second: Critical. These MUST be fixed if they sit on a critical path.

Simple queries (single table, indexed columns) should be under 10ms. If User.find(123) is taking 50ms, something’s wrong. Complex queries with joins and aggregations? They should be under 200ms.

The common root causes we see in performance optimization work are missing indexes on foreign keys or on WHERE/ORDER BY columns, N+1 queries, full table scans on large tables, and LIKE queries with wildcards on both sides (which can’t use a standard index).

The power tool to uncover these: EXPLAIN ANALYZE. It will let you see execution plans and identify missing indexes or sequential scans.
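For example, in PostgreSQL (table and index names here are hypothetical):

```sql
-- Posts for a user, newest first.
EXPLAIN ANALYZE
SELECT * FROM posts WHERE user_id = 123 ORDER BY created_at DESC LIMIT 20;

-- "Seq Scan on posts" in the output means the user_id filter has no usable
-- index. After:
--   CREATE INDEX index_posts_on_user_id ON posts (user_id);
-- the plan should switch to an Index Scan and the actual time should drop.
```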

View Rendering

View rendering time is usually high because of N+1 queries triggered lazily from templates, heavy partial rendering inside loops, or missing fragment caching.

Our suggestion for flagging views as slow: if they are consistently over 100ms, investigate.

External API Calls

An action is only as fast as its slowest statement, and hitting an external service synchronously will kill your response time. It’s not always avoidable, but work hard to keep third-party HTTP calls out of the request flow: move them to background jobs and design the business process around that asynchronicity.

In cases where the above is not possible, we try to target under 200ms for API calls. Anything over 500ms should be moved to background jobs or cached aggressively.

If you must make synchronous API calls, remember to set timeouts and have fallback behavior or use circuit breakers.
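A minimal sketch of a synchronous call with explicit timeouts and a fallback, using only Ruby’s stdlib (the URL, method name, and fallback value are illustrative, not from a real integration):

```ruby
require "net/http"
require "uri"

# Fetch a value from a third party, but never block the request for long:
# bounded open/read timeouts, and a fallback on any network failure.
def fetch_rate(url, fallback:)
  uri = URI(url)
  response = Net::HTTP.start(uri.host, uri.port,
                             use_ssl: uri.scheme == "https",
                             open_timeout: 0.5,   # seconds to establish the connection
                             read_timeout: 1.0) do |http|
    http.get(uri.request_uri)
  end
  response.is_a?(Net::HTTPSuccess) ? response.body : fallback
rescue Net::OpenTimeout, Net::ReadTimeout, SocketError,
       Errno::ECONNREFUSED, Errno::ENETUNREACH, Errno::EHOSTUNREACH
  fallback # degrade gracefully instead of failing the whole action
end
```

A circuit breaker adds one more layer: after N consecutive failures it stops calling the service entirely for a cooldown period, returning the fallback immediately.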

TL;DR: Thresholds

Here’s what to flag as slow using P95:

  • Actions: P95 > 500ms
  • Database queries: P95 > 100ms
  • API calls: P95 > 200ms

And remember, these thresholds can vary from project to project and be influenced by business requirements (e.g., SEO penalties) or context (e.g., an admin interface that’s used sparingly for very specific tasks can afford looser thresholds), but they work as solid starting points.
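These thresholds can be encoded as a simple check on top of a P95 computation; a sketch (the threshold values are the starting points above, meant to be tuned per project):

```ruby
# Starting-point thresholds, in milliseconds.
THRESHOLDS_MS = { action: 500, db_query: 100, api_call: 200 }.freeze

# Nearest-rank P95 over a sample of durations (ms).
def p95(samples)
  sorted = samples.sort
  sorted[((0.95 * sorted.size).ceil - 1).clamp(0, sorted.size - 1)]
end

def slow?(kind, samples)
  p95(samples) > THRESHOLDS_MS.fetch(kind)
end

slow?(:action, [120] * 94 + [900] * 6)  # => true  (P95 is 900ms)
slow?(:db_query, [8] * 100)             # => false (P95 is 8ms)
```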

Conclusion

Performance is a whole-system concern: server time, network latency, asset delivery, and browser rendering all shape what users experience.

Of all these components, server time is where you have the most control. Every millisecond you shave off server response time is a millisecond that doesn’t add to the total user experience.

Look at P95 for your actions. Find the bottlenecks (database queries, view rendering, API calls) and fix what’s making users wait.

Always take the whole picture into account when prioritizing performance-related work, and put your effort where it will give your users the biggest benefit.