Keeping human reasoning visible when AI joins research collaboration.

I Asked for Your Feedback, Not Your AI’s

Karla Stepanova (with AI used for editing and polishing the original full text version, not generating the content)

AI is becoming a powerful tool for individual work, already helping us program, write, debug, draft, edit, and brainstorm. We have learned where to trust it and where to stick with our own version. We know what our own thoughts were and what came from AI. But what about cooperative research work? What if we write a research paper together? What if we develop code in a team? Can we use the same system we use for individual work, or do we need to change something so that AI supports the collaboration rather than undermining it?

When one person uses AI, they know what they asked, what they changed, what they accepted, and where the AI might have hallucinated. They know how much of the work is theirs and how much is the AI’s, and where and how far to trust it.

But cooperation is different in several ways.

Cooperation is about respect and reasoning, not just output

Collaboration is not only about producing a result. It is about understanding the thoughts, intentions, and decisions behind it. When we work together on code, papers, or research, we are interested in what the other person actually thinks and why something has changed. This gives us the chance to agree or disagree, and to discuss it.

Imagine spending days on a section of a paper. You iterate over every sentence and choose every word intentionally. You send it to a colleague for feedback, and what comes back is not their comments or their reasoning, but a fully rewritten version. Now you are left wondering: which changes are my colleague’s and which are the AI’s? Why was a paragraph I worked on for a day rewritten wholesale? What did my colleague actually think was wrong?

Normally, you do not rewrite every sentence of a colleague’s paper. It carries their signature and their time. You change things when you have a reason, and ideally, you explain your reasoning. Otherwise, people feel their input was ignored and their time wasted without a good reason.

The frustration is not that the text changed. It is that the human reasoning has disappeared, and the extent of the colleague’s own involvement is unclear.

When I ask a colleague for feedback, I am asking for their judgment, their experience, their honest reaction. I am asking them for a favor: to use their own brain, which I trust and value. A wholesale rewrite (with heavy AI support) instead feels like sending someone your melody, asking them to add words, and getting back a totally new song with both a new melody and new words, without any explanation of why your melody wouldn’t work. They just made another song. Furthermore, it puts the work of reconstructing their reasoning on my shoulders. (This can be fine when a supervisor hands a student a worked example. It does not work between cooperating colleagues.)

I have no problem with a colleague checking their feedback against AI to verify a claim, test a criticism, or find better wording. But I still want their thoughts: what they noticed, what bothered them, what they were unsure about. Those thoughts should arrive uncovered, not buried under a layer of generated text from which I cannot separate them. And by the time I ask you, I have usually already iterated with AI several times myself. Its perspective is already baked into what I sent. That is why I am now asking a human.

The amount of input should correspond to your effort

AI also makes the effort invisible. The depth of feedback used to signal the effort behind it. But now five minutes with an AI can look like five days of careful reading. The number of changed lines of text or code no longer proves engagement, but engagement is exactly what I am asking for. When effort becomes unreadable, trust between collaborators (and in the feedback) erodes, even when nobody intended any harm. How much time must I now spend finding out whether a large volume of changed text or code was carefully checked by a human or is mainly unchecked AI output? How much honest feedback is in there?

The time AI “saves” does not disappear; it transfers to me. Faced with a wall of changed text, I have to decode it all: compare it line by line, work out what each change means, and guess which differences carry intent and which are noise. What could have been five comments I read in two minutes becomes hours of forensic work. The total time spent goes up, and the work has moved from the giver to the receiver. That is the opposite of what AI should do for a team.

The same problem hits the code

An AI-rewritten function may look better at first glance, but without explanation and without a clear line between human decisions and AI suggestions, you are no longer discussing logic — you are debugging a mix of human intention and machine output without knowing which is which.

This creates an ambiguity that didn’t exist before: you can no longer tell an accident from a decision. If a colleague adjusts one thing and the AI silently deletes a neighboring line, did they remove it because they disagreed, or did the model just delete it? Before, every change was a choice. Now it might be a choice, a side effect, or an accident, and they look identical. I cannot even ask the right question, because I don’t know whether there is a “why” to ask about.

It gets worse when something breaks. Debugging in a team used to be a conversation where you sit with the author and reconstruct the thinking: “Why handle this case here? What did you assume about this input?” But when AI-generated code is merged without any human truly absorbing it and taking responsibility for it, there is no one to have that conversation with. The logic exists, but it was never held in a human mind. We end up with orphaned code: it runs, until it doesn’t.

Invisibility of AI conversations

Finally, the AI conversations themselves are invisible. A modern team includes several collaborators plus several private AI chats, each holding context and reasoning that shaped the shared work. My colleague’s AI knows why a paragraph was restructured, while mine knows why I chose the approach I did. But these “AI brains” are not attached to the project. They don’t talk to each other, and no one else can look inside them. The team’s real decision history ends up scattered across chat logs nobody shares.

When a colleague rewrites your work themselves, you can usually trust that there is experience and intention behind it. When the rewrite comes through AI, that trust blurs.

And because the rewrite looks fine and we are trying to be efficient, we tend to leave it in place, even when the original was just as good. But I believe that in cooperation we should keep the colleague’s work unless we have a good reason to change it.

Sometimes the rewrite is the feedback

I don’t want to be one-sided. I believe we can make AI-assisted rewrites work for teamwork too. I just think we still have some way to go in learning how to do it well (at least I do). And sometimes an AI-assisted rewrite is already the most efficient feedback: a senior engineer’s cleaner version of a junior’s function serves as a teaching example. But that is supervision or teaching, not collaboration between equals. And even then, I would appreciate the reasoning, because it preserves trust in the supervisor’s actual involvement.

What might actually help

I am still thinking about these topics and do not have answers yet, only ideas about what might help (at least me) when using AI in cooperation:

Observations before any rewrite. Begin feedback with all your own observations, even in rough draft form: “This argument is unclear,” “This needs evidence.” Only then let AI help implement changes or propose alternatives. Keep the observations and the AI-assisted changes clearly separate.
Say what you verified, and how you engaged. “I tested the edge cases.” “I confirmed this citation.” “I did not check the benchmarks.” This matters more than labeling lines AI- or human-written. That boundary is blurry, but it is important to know what the human took responsibility for.
Mark what matters to you. Flag the parts of a text or codebase that you care about most and do not want changed, so that a collaborator takes extra care before letting AI touch them and has to give good reasons for any change.
Attach your reasoning to every significant change. Add a sentence of “why” per meaningful edit. If you can’t say why the AI’s version is better, you haven’t reviewed it, only forwarded it.
Attach the AI context to the shared work. If you used AI for your decisions, share the conversation with colleagues. Link it, paste the key exchange, or at least summarize: “I explored X, it suggested Y, I rejected it because Z.”
Do not favor AI over humans. Preserve the colleague’s work in place unless you have a reason to change it. Even if a rewrite is good, keep the original if the original was similarly good.

What I think matters most is to separate explicitly: what came purely from a human mind, what a human verified and takes responsibility for, and what is AI-suggested and unchecked or only lightly checked.
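
One way to make this separation concrete is to attach a small structured note to each change. The Python sketch below is only an illustration of the idea, not an existing tool: the names Provenance, ChangeNote, missing_reasoning, and render are hypothetical, and the example locations and reasons are placeholders.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class Provenance(Enum):
    # The three categories named in the text above.
    HUMAN = "written by a human"
    HUMAN_VERIFIED = "AI-suggested, verified by a human"
    AI_UNCHECKED = "AI-suggested, unchecked or lightly checked"


@dataclass
class ChangeNote:
    """Hypothetical record for one meaningful edit."""
    location: str                 # e.g. "Section 3, paragraph 4"
    provenance: Provenance
    why: str = ""                 # one sentence of reasoning
    verified: List[str] = field(default_factory=list)  # what was actually checked


def missing_reasoning(notes):
    """Return the notes that carry no stated 'why'."""
    return [n for n in notes if not n.why.strip()]


def render(observations, notes):
    """Format feedback: the human observations first, then the changes."""
    lines = ["Observations (before any rewrite):"]
    lines += [f"  - {o}" for o in observations]
    lines.append("Changes:")
    for n in notes:
        lines.append(f"  - {n.location} [{n.provenance.value}]")
        lines.append(f"      why: {n.why or 'NOT STATED'}")
        if n.verified:
            lines.append(f"      verified: {'; '.join(n.verified)}")
    return "\n".join(lines)


# Placeholder example data, for illustration only.
observations = ["This argument is unclear", "This needs evidence"]
notes = [
    ChangeNote("Section 2, paragraph 1", Provenance.HUMAN,
               why="The argument was unclear without the definition.",
               verified=["re-read against the cited source"]),
    ChangeNote("Section 3, paragraph 4", Provenance.AI_UNCHECKED),
]
print(render(observations, notes))
```

In this sketch, a change without a stated reason is printed as NOT STATED, and missing_reasoning(notes) lists such changes, so that the sender can see what has only been forwarded rather than reviewed.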

This is just an extension of existing practice

Parts of this already happen. Many journals and conferences now require authors to disclose AI use, and many state plainly that AI cannot be an author because it cannot take responsibility. Developer tooling has “Co-authored-by” trailers, and teams increasingly note AI assistance in pull requests. The ideas above extend these disclosure norms from the document level down to the decision level, where collaboration actually happens.
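
As a rough sketch of what decision-level disclosure could look like in a commit, the Python helper below assembles a commit message whose final paragraph holds trailers. Only “Co-authored-by” is an established convention; the “AI-assisted” and “Verified” trailer names, the function commit_message, and the example commit content are assumptions made up for illustration.

```python
def commit_message(summary, why, co_authors=(), ai_note="", verified=()):
    """Build a commit message: summary, reasoning, then a trailer paragraph."""
    # Established convention: one "Co-authored-by: Name <email>" line per person.
    trailers = [f"Co-authored-by: {name} <{email}>" for name, email in co_authors]
    if ai_note:
        # Hypothetical trailer name, not a standard.
        trailers.append(f"AI-assisted: {ai_note}")
    # Hypothetical trailer name: what the human actually checked.
    trailers += [f"Verified: {item}" for item in verified]

    parts = [summary, why]
    if trailers:
        parts.append("\n".join(trailers))
    # Paragraphs are separated by blank lines; trailers come last.
    return "\n\n".join(parts) + "\n"


# Placeholder example content, for illustration only.
msg = commit_message(
    "Handle empty input in parse()",
    "Why: an empty string should return an empty list instead of failing.",
    co_authors=[("Jane Doe", "jane@example.com")],
    ai_note="first draft suggested by an assistant, then rewritten by hand",
    verified=["edge cases tested"],
)
print(msg)
```

The point of the design is that the “why” and the list of what was verified travel with the change itself, rather than staying in a private chat.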

AI should save time, not waste it. It should support cooperation, not replace it. It should help us express our thoughts, not hide them. The goal is to make AI a tool for cooperation — not a wall between collaborators.

I want your ideas

This is an open question, and I would really like to hear from everyone working in research, software, and academic writing: how do you use AI in collaboration without losing the human thinking and involvement? Share your workflows (and frustrations) with me at karla.stepanova@cvut.cz or in the comments.