Generative AI and Changing Inputs

In previous posts I’ve written about Natural Language Programming and Intent/Realization Toolchains as two grounding concepts in AI productivity tools, particularly AI for Code. In this post I want to explore another angle that comes up in AI productivity tools, particularly in features that generate documentation, synthesize specifications, discover build rules, or perform other “extractive” tasks in AI-enriched toolchains.

Why we must deal with change

The context is creating AI features for developers. We often use LLMs and coding agents to make functions that take some input (e.g. the issue in a repo, or the source code plus a task, or the files in a pull request) and produce some output (extracting entities, summarizing, grouping, suggesting, rewriting, whatever). Let’s call that F(In) ↝ Out, a non-deterministic function. Sometimes F is a whole chain of LLM calls, composed into a coding agent. Sometimes tools are invoked, “agentically”.

For the purposes of this post, you can think of F as any of these:

  • Documentation generation (Repo ↝ Docs)
  • Extracting build rules (Repo ↝ Build Script)
  • Generating agent assistance files (Repo ↝ AGENTS.md + skills)
  • Summarizing a pull request (Repo + Changes ↝ Text)

These are all fundamentally extractive – and there is an infinite variety of similar functions which are now gloriously implementable thanks to the magic of the era that we live in. These functions are rarely adding insight or new knowledge. They’re taking knowledge already present in the existing information and reducing it, organizing it, presenting it.
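
To make this concrete, here is a minimal sketch of one such F (the Repo ↝ Docs case), written in F# as a chain of model calls. The callModel helper and the prompts are placeholders of my own, not any particular API:

```fsharp
// Placeholder for whatever LLM API the system actually uses;
// it returns a dummy string so the sketch stays self-contained.
let callModel (prompt: string) : string =
    sprintf "<model output for a %d-character prompt>" prompt.Length

// F : Repo ↝ Docs, reduced to two chained model calls.
let generateDocs (repoFiles: (string * string) list) : string =
    // Step 1: ask for a documentation outline based on the file listing.
    let fileListing = repoFiles |> List.map fst |> String.concat "\n"
    let outline =
        callModel ("Propose a documentation outline for a repository with these files:\n" + fileListing)
    // Step 2: ask for the documentation itself, given the outline and the sources.
    let sources =
        repoFiles
        |> List.map (fun (path, text) -> "### " + path + "\n" + text)
        |> String.concat "\n\n"
    callModel ("Write documentation following this outline:\n" + outline + "\n\nSources:\n" + sources)
```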

The problem we want to deal with in this blog post is this: in the real world, very often the inputs change. And of course, when things change, we need to update the outputs. So we need to compute F(In’) ↝ Out’ given some representation of F and the change between In and In’, or some approximation of it.

In developer tools with AI for Code features, inputs can change for all sorts of reasons:

  • The user is iterating and changes their mind, or
  • New information refines an input, a task or requirements spec, or
  • The code in a PR changes due to external changes, or
  • An issue gets a new comment, or
  • and many other such examples.

Whatever the cause, every AI feature or analysis operating over code, or over metadata about code on, say, GitHub, must deal with changing inputs. I say “must” for a reason – change is easily ignored when creating demos, but if you’re going to make your extractive feature into a long-term feature, then dealing with change is critical to both good user experiences and performance. When I look at GitHub and similar systems, what I see is a stream of changes, a stream of deltas, a stream of events. Not just commits, but the event stream of change for issues and pull requests too, and indeed change across the whole of what I like to call the “GitHub Information Fabric” (roughly speaking, the objects and relationships in the GitHub API). Change is deeply fundamental to how GitHub works, and indeed to how any information fabric works at scale.

If we’re going to deal with change well, we almost certainly want to do that incrementally. There are other options (we can basically ignore the change, and ask the user to regenerate, and deal with the consequences). But we need some kind of position on change.

If we do care about processing change, that means we’re almost certainly talking about implementing an FInc that in some way folds a change into the existing output from the previous run. There are lots of ways to do this, but one way is to augment the chain of calls and prompts that make up F with the ability to take incremental inputs and rewrite the output, where the change is characterised by a delta (“we added ΔIn”), or by comparison (e.g. “The input changed from In to In'”), or by commands (“The previous inputs were In, and the user has also requested that the outputs change by Command”), or by events (“After In, some new Events have also happened, update things please”).
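
As one concrete shape for this, here is a hedged sketch of an FInc for the documentation example above, folding a delta into the previous output by prompt augmentation. The callModel parameter is again a placeholder, and the prompt wording is illustrative rather than a recommendation:

```fsharp
// FInc : (previous Out, ΔIn) ↝ Out', by prompt augmentation.
// `callModel` is passed in so the sketch stays self-contained.
let updateDocs (callModel: string -> string) (previousDocs: string) (delta: string) : string =
    [ "Here is the existing documentation:"
      previousDocs
      "The repository has changed as follows (delta):"
      delta
      "Rewrite only the parts of the documentation affected by this change,"
      "keeping everything else word-for-word identical." ]
    |> String.concat "\n\n"
    |> callModel
```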

Competing Goals for Incremental Extractive AI

Now, for AI summaries and other AI-implemented functions, there is a tension here between different goals. And it’s really this tension between conflicting goals that’s my main point in this post. I’ll describe these goals in terms of our general function F, which we assume is non-deterministic (that is, a function choosing probabilistically from a manifold of possible outputs).

  1. Freshness: One goal is that we could run F all over again, on the entire new input. This forms some kind of “freshness” meta-goal – reload/refresh/regenerate. Yes the result is fresh, but it might have changed – and in a non-deterministic generative world with many choices, it may have changed very substantially. This is what early versions of Copilot Completions did, for example. No one cares much what the model offered previously. Just re-run and offer something new.
  2. Divergence: It’s possible F and FInc just don’t do the same thing. The incremental updates may lie in a completely different part of the solution space, misaligned. The goal here is to avoid this divergence.
  3. Stability: F is non-deterministic and we often prefer the new output Out' to be “similar” to the previous Out. This forms a “stability” meta-goal. There are lots of reasons to want this – for example, the information extracted may be user facing, and the user is already attuned to it.
  4. Performance: It can be more performant to produce deltas to the previous Out rather than producing the whole thing all over again (e.g. don’t recreate the whole video), or there can be model-level techniques like speculative decoding/predicted outputs. Either way we can call this a “performance” meta-goal.
  5. Localization: We try to reduce the amount of Out we need to rewrite, e.g. by rewriting the file-by-file plan, then seeing what changed in the plan, and only rewriting the files where the plan changed (there’s a sketch of this just after the list).
  6. Thoroughness: It can be useful to rewrite all of Out, if it is not enormous. Techniques like predicted outputs help make this efficient.
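
Here is a rough sketch of the localization idea in goal 5, for an output that is a set of files: regenerate a file-by-file plan, diff it against the previous plan, and only rewrite the files whose plan entries changed. The plan format, the parsing, and callModel are all placeholder assumptions of mine:

```fsharp
// A file-by-file plan: file path ↦ one-line description of intended content.
type Plan = Map<string, string>

let localizedUpdate
        (callModel: string -> string)        // placeholder for the model API
        (oldPlan: Plan)                      // plan from the previous run
        (oldOutputs: Map<string, string>)    // previous outputs, keyed by file path
        (newInput: string)                   // the changed input, e.g. the updated spec
        : Map<string, string> =
    // Step 1: regenerate the plan from the new input (cheap relative to a full rewrite).
    // Assumes, purely for the sketch, that the model answers in "path: description" lines.
    let newPlan : Plan =
        (callModel ("Produce a file-by-file plan, one 'path: description' line per file, for:\n" + newInput)).Split('\n')
        |> Array.choose (fun line ->
            match line.Split([| ':' |], 2) with
            | [| path; description |] -> Some (path.Trim(), description.Trim())
            | _ -> None)
        |> Map.ofArray
    // Step 2: only call the model for files whose plan entry changed.
    newPlan
    |> Map.map (fun path description ->
        match Map.tryFind path oldPlan, Map.tryFind path oldOutputs with
        | Some oldDescription, Some existing when oldDescription = description ->
            existing   // plan entry unchanged: keep the previous output (stability + performance)
        | _ ->
            callModel ("Write the content of " + path + " following this plan entry:\n" + description))
```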

All of these are reasonable goals. And you can’t have everything.

In practice we often just choose goal (1) – that is, we decide we simply don’t care about the other goals and just automatically re-run the extraction. This is a valid decision and is great for creating demos. But it will often come back to bite you later.

If we decide to do something else, incrementally, with an FInc, we have a much harder job. For example, let’s look more closely at the risk of divergence between F and FInc.

  • Sometimes FInc can fail to update things (be too biased towards stability, not noticing detail, over-prone to staying localized).
  • Sometimes FInc can change things too radically (be too biased towards freshness, regardless of the history, or over-emphasising the importance of a change – something we’ve noticed in incremental documentation updates).
  • Sometimes FInc can fail to detect that “everything has changed” – that the input has changed so radically that it’s just better the past is completely forgotten (another form of being too biased towards stability).

Towards Incrementality

So what’s to be done? What we’d really like is that, for any description of F (e.g. for any prompt-with-input-holes, or any coding agent), there was a “magic” way to make a “good-enough” FInc – an incremental version of F – that balanced the various priorities above according to some policy. Ideally there would be a way to get that incremental version by construction or by transformation (e.g. by prompt rewriting). Sort of like automatic incrementalization for agents (where an agent here is just a composition of LLM invocations). I know that’s nowhere near possible, but I’m just throwing the concept out there, as a sort of pipe dream.

So a possible project could be this: what if we took a more disciplined programming model for implementing AI functions – for example something like GenAIScript or GitHub Agentic Workflows – and modified it to be fundamentally incremental – a “ΔGenAIScript” or “ΔAgenticWorkflows” if you will – in the sense that, if you declare the overall “function” you want implemented in these systems, then you’d get the incremental version for free “by construction”.

If such a thing existed, it would be useful. For example, we could write a GenAIScript that says “generate the documentation like this”, and automatically get an incremental version that keeps (roughly) the same documentation up to date.

For systems like Agentic Workflows, it’s almost certainly possible to do this via a source-to-source translation. Indeed if you ask ChatGPT to make such an incremental version given the documentation of the techniques available in the programming model, it will do a pretty good job.
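
To caricature what the prompt-rewriting flavour of that translation might produce, here is a toy transform: given the original task prompt for F, it manufactures a prompt builder for an FInc that takes the previous output and a delta. This is purely illustrative of the shape of the idea, and is not how GenAIScript, Agentic Workflows or ChatGPT actually do it:

```fsharp
// A toy "incrementalization by prompt rewriting" transform: from the prompt
// describing F, produce a prompt-builder for an FInc over (previous Out, ΔIn).
let incrementalize (taskPrompt: string) : string -> string -> string =
    fun previousOut delta ->
        [ "You previously performed the following task:"
          taskPrompt
          "and produced this output:"
          previousOut
          "Since then, the inputs have changed as follows:"
          delta
          "Update the output to reflect the change, preserving unaffected parts verbatim." ]
        |> String.concat "\n\n"
```

A real ΔGenAIScript or ΔAgenticWorkflows would of course have to do much more than this – deciding when to fall back to a full re-run, for instance – but the shape of the transformation is the point.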

Much more ambitiously, imagine a world where a programming model could express the core transformations/extractions that go on inside, say, a coding agent, or AI-infused algorithmic transformations like GraphRAG – and an incremental version dropped out, including being able to tune how the incremental versions of the AI functions traded off the different goals above.

Conclusion

A reference point for incrementality for me is FSharp.Data.Adaptive (https://github.com/fsprojects/Fsharp.Data.Adaptive), where incremental computation drops out of ordinary-looking F# code. There are similar libraries for other languages, but the approach fits particularly well with F#’s computation expressions for computational DSLs.
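
For readers who haven’t seen it, the flavour is roughly this (using the library’s documented cval/AVal/transact API): a dependent value is recomputed only when something it depends on changes.

```fsharp
open FSharp.Data.Adaptive

// A changeable input value...
let sourceFile = cval "let add x y = x + y"

// ...and a derived, "extracted" view over it.
let summary = sourceFile |> AVal.map (fun text -> sprintf "%d characters" text.Length)

AVal.force summary                                             // "19 characters"
transact (fun () -> sourceFile.Value <- "let add x y z = x + y + z")
AVal.force summary                                             // recomputed: "25 characters"
```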

These problems of AI in the context of incremental changes and events have come up repeatedly in my work at GitHub Next:

  • In SpecLang, discussed in my post on Intent, meet Toolchain.
  • In Copilot Workspace, in the generative functions from spec ↝ plan ↝ code.
  • In Copilot for Pull Requests, a simple extractive summarization of a pull request.
  • In Build Discovery, if you want the inferred build steps to evolve automatically as the repository changes, but also want them to stay stable.
  • In documentation generation – getting reliable updates based on changes to code.

To summarise, just as one of the grand challenges of algorithmic programming is to automatically derive useful, practical, performant incremental versions of functions from their non-incremental originals, so one of the grand challenges for mixed AI/algorithmic programming models is to automatically derive incremental versions of AI “functions” from reasonably declarative mixtures of natural language and algorithmic specifications. There are fundamental tradeoffs between different goals when doing this.

p.s. the title of this post should logically be “Extractive AI and Changing Inputs”, but I knew “Generative” would get more clicks. My apologies!
