Most teams think of Copilot as a writing tool. A study of 146,000 Jira tickets suggests the bigger return is somewhere else entirely.
Jellyfish pulled data from 145 companies and 6,500+ engineers, all using GitHub Copilot, and tracked the time savings per ticket end to end. The headline number is three workdays saved per ticket. The breakdown is the part most people haven't seen: one day came from reduced coding time, two days came from faster PR review. The tool most teams position as "code faster" is delivering its largest return on the phase that comes after the code is written.
What the Numbers Actually Say
That two-to-one split isn't a rounding artifact. It comes from how Copilot users approach review differently. They use it to understand unfamiliar code quickly, to generate test cases that surface edge cases before the review conversation starts, and to check whether incoming changes match the acceptance criteria from the ticket without reading every line manually. The review becomes less about comprehending everything and more about validating what the assistant flagged.
The coding time savings follow a more familiar pattern. Copilot writes boilerplate faster, suggests completions that reduce typing, and shortens the implementation loop for well-specified work. That one day per ticket is real. It's just not the bigger number.
Jellyfish also broke the savings down by seniority. Senior engineers reduced their coding time by 22% with Copilot. Junior engineers reduced it by 4%. That gap, roughly five to one, is worth sitting with. The usual assumption about AI coding tools is that they should help less experienced developers more, narrowing the skill gap. The data goes the opposite direction. One plausible explanation is that experienced developers have the context to use AI assistance effectively, in writing and in review alike: they know what to question, what the codebase expects, and where the risky edge cases live. Junior developers applying the same tool to the same task don't have that context to draw on, so the tool has less to amplify.
The Deployment Mismatch
Here is the problem: companies are not buying Copilot for their senior engineers doing review. They are buying it for developers writing code.
That framing is understandable. The product is called a coding assistant. The marketing shows autocomplete and chat-with-your-codebase demos. The obvious use case is writing. But that framing tracks the tool's label, not the data about where the value lands.
Jellyfish also found that average company-level adoption sits at 22.6%. Most teams are not deploying Copilot broadly — they are licensing it for a fraction of their engineers. Within that fraction, the seat allocation typically follows the "more code, more value" intuition: the engineers writing the most code get the tool that helps write code. That tends to be mid-level developers on active feature work, not the senior engineers and tech leads spending half their week in review queues.
The result is that the roles where the data shows the biggest return, senior engineers doing review, are frequently the roles without Copilot. The roles where the data shows a more modest return, less senior developers writing code, are where the seats actually went.
Copilot for Jira Makes This Worse
GitHub Copilot for Jira went generally available on June 25. The integration works exactly as it sounds: assign a Jira ticket to Copilot like you'd assign it to a developer. The agent reads the ticket description, acceptance criteria, and linked context, produces an implementation, and opens a draft pull request. No human writes a line of code. The ticket moves from "in progress" to "in review" automatically.
That is a useful automation. It's also a review queue multiplier. If the Jellyfish data is correct and code review is already where most of Copilot's value is, teams that adopt Copilot for Jira are about to have a lot more code in review written by a source that has no opinion about whether the output is easy to validate. AI-generated code, as CodeRabbit's analysis of 470 PRs found, contains more issues per line than human-written code. Not catastrophically more, but more. The reviewer receives a PR that is technically functional, passes lint, maybe even passes tests, and still requires genuine attention to approve safely.
When that code is coming from an autonomous agent rather than a teammate, the reviewer cannot ask "what were you thinking here?" The diff is the only artifact. Whatever efficiency Copilot provides on review has to absorb a larger, potentially harder queue.
Where the Leverage Actually Is
If the review phase is both the bigger time savings and the bigger risk surface, that is where the tool deployment should be concentrated. Not exclusively — writing faster still matters — but the allocation question deserves more scrutiny than most teams give it.
A senior engineer who spends six hours a day in review, three of them just building enough context to understand the change, is sitting on a much larger productivity lever than a junior developer who writes 20% more boilerplate. If Copilot helps with the context-loading and edge-case surfacing in review, those six hours might actually compress: the three hours of comprehension work might become ninety minutes. That is a different scale of saving than shaving typing time.
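To make that arithmetic concrete, here is a minimal sketch. The inputs are the hypothetical figures from the example above (six review hours, three of them comprehension, compressing to ninety minutes), not measured data, and the five-day week is an added assumption.

```python
# Hypothetical figures from the example above; none of these are measured data.
REVIEW_HOURS_PER_DAY = 6.0        # total time a senior engineer spends in review
COMPREHENSION_HOURS_BEFORE = 3.0  # part of that spent loading context
COMPREHENSION_HOURS_AFTER = 1.5   # "ninety minutes" in the scenario above
WORKDAYS_PER_WEEK = 5             # assumption, not from the study


def review_hours_after(review_hours, comprehension_before, comprehension_after):
    """Review hours per day once only the comprehension portion shrinks."""
    return review_hours - (comprehension_before - comprehension_after)


hours_after = review_hours_after(
    REVIEW_HOURS_PER_DAY, COMPREHENSION_HOURS_BEFORE, COMPREHENSION_HOURS_AFTER
)
saved_per_day = REVIEW_HOURS_PER_DAY - hours_after
saved_per_week = saved_per_day * WORKDAYS_PER_WEEK
reduction = saved_per_day / REVIEW_HOURS_PER_DAY

print(f"Review day: {REVIEW_HOURS_PER_DAY:.1f}h -> {hours_after:.1f}h")
print(f"Saved: {saved_per_day:.1f}h/day, {saved_per_week:.1f}h/week ({reduction:.0%})")
```

Under those assumptions the review day drops from 6.0 to 4.5 hours, which is 7.5 hours a week for one reviewer.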
But you cannot know whether this is happening without measuring it. "Time saved per ticket" is Jellyfish's number, drawn from aggregate data. For a specific team with a specific review bottleneck, the signal you want is simpler: how much of your senior engineers' time is currently going into review, and is it changing as AI tool adoption changes? If you can answer that, you can evaluate whether the tool is working where you actually need it to.
Most teams cannot answer that question. Editor-based tracking shows coding time. Jira shows ticket throughput. Neither captures the hours a senior engineer spends in GitHub reading diffs, the context switches between the PR and the codebase, or the time spent opening a second session to verify whether a generated function handles the edge case the original developer didn't test.
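If that data did exist, the calculation itself would be short. The sketch below assumes a hypothetical log of (engineer, day, activity, hours) rows. The format, the activity labels, and the sample values are all made up for illustration and do not reflect any particular tool's export. It computes the share of each engineer's tracked time that went to review in a given window, so the same function can be run on a window before adoption and one after.

```python
from collections import defaultdict
from datetime import date


def review_share(entries, start, end):
    """Fraction of each engineer's tracked hours spent on review in [start, end)."""
    total = defaultdict(float)
    review = defaultdict(float)
    for engineer, day, activity, hours in entries:
        if start <= day < end:
            total[engineer] += hours
            if activity == "review":
                review[engineer] += hours
    # Skip engineers with no tracked time in the window to avoid dividing by zero.
    return {e: review[e] / total[e] for e in total if total[e] > 0}


# Made-up sample rows in the hypothetical log format described above.
entries = [
    ("senior_a", date(2024, 1, 8), "review", 6.0),
    ("senior_a", date(2024, 1, 8), "coding", 2.0),
    ("senior_a", date(2024, 2, 12), "review", 4.5),
    ("senior_a", date(2024, 2, 12), "coding", 3.5),
]

before = review_share(entries, date(2024, 1, 1), date(2024, 2, 1))
after = review_share(entries, date(2024, 2, 1), date(2024, 3, 1))

for engineer in before:
    print(f"{engineer}: {before[engineer]:.0%} -> {after.get(engineer, 0.0):.0%}")
```

A falling review share is only a signal, not proof that the tool helped: the share also drops if reviews are being skimmed or if the queue simply moved to someone else, so it is worth reading alongside review volume.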
At xeve, we track at the system level — every app, every window, every session — specifically because the work developers actually do doesn't fit neatly into IDE activity metrics. A senior engineer's day includes the coding time that VS Code captures and the review, debugging, documentation, and investigation time that it doesn't. When you're evaluating where an AI tool is delivering value, the picture you get from editor heartbeats is incomplete in exactly the ways that matter for the review-side argument.
What to Do With This
The Jellyfish study is a single data source across a specific population of Copilot users. It probably doesn't transfer uniformly to every team. But the directionality — review benefits exceed writing benefits, senior engineers benefit more, adoption is shallow — is consistent with what you'd expect from a tool that amplifies existing skill rather than equalizing it.
The practical implication is: if you are thinking about AI coding tool deployment as "give it to developers so they write faster," you have the frame partially wrong. Give it to your reviewers. Give it to your senior engineers first, not last. Track what they do with their time before and after, at the system level, not just inside their editor.
Copilot for Jira generating autonomous PRs is going to create pressure in exactly the place the data says is already the more valuable phase. Whether your team is positioned to absorb that pressure, or whether it compounds the review backlog, depends on whether the people doing the reviewing have the tools to work at pace with the code being written.
The writing side of the stack just got automated. The question is whether the review side is keeping up.