We Built a GTM Diagnostic. Here's What Building It Actually Taught Us.

AI in Practice

Jun 1

Most GTM assessments are built to sell something. Vendor health checks are designed to surface gaps their product fills. Consultant audits are thorough, but they take weeks, cost real money, and create a dependency before you've decided whether you need one. The result is a market full of diagnostic tools that tell you what you already suspect, dressed up as objective analysis.

BlindSpot built something different: a free, AI-powered GTM diagnostic that takes five minutes, scores your go-to-market across four structural dimensions, and delivers a specific read on where your motion is most likely breaking down. No vendor agenda. No lead-bait survey masquerading as insight.

The tool is live at blindspotrisk.com/gtm-diagnostic. But this post isn't an announcement. It's an accounting of what building it forced us to figure out: the design decisions that weren't obvious, what the AI layer is actually doing, and why the tool is built to give direction rather than hand you a tactical plan.

The Design Problem: What Does a Diagnostic Actually Need to Do?

The first version of this tool had 22 questions. It was comprehensive. It was also wrong.

A 22-question diagnostic isn't a diagnostic. It's a survey. It creates the feeling of rigor while burying the signal in noise. The leaders we built this for (VPs of PMM, CMOs, heads of GTM strategy at $50M–$300M ARR B2B SaaS companies) don't have patience for self-administered surveys. They need to know, quickly, whether their GTM motion has a structural problem worth investigating. The tool has to earn their time before it asks for more of it.

So we cut to 12 questions and forced ourselves to answer a harder one: what are the fewest questions that reveal the most about GTM health?

That constraint turned out to be generative. It forced a decision about what GTM health actually means, not as an abstraction but as something diagnosable in a short exchange. We landed on four dimensions: ICP and market clarity, messaging and narrative strength, GTM execution and launch discipline, and PMM function maturity. Each maps to a failure mode we've seen repeatedly across client engagements. Each question is designed to surface the difference between a structural problem and a surface symptom.

| Twelve questions sounds like a shortcut. It's actually a forcing function — it requires you to know what matters, not just what's measurable.

Four Dimensions, Not Five: Why We Left One Out

The original framework included a fifth dimension: competitive positioning. It's a legitimate PMM concern, and we track competitive displacement closely in client engagements.

We cut it anyway.

Competitive positioning weakness is almost never a root cause. It's a downstream symptom of unclear ICP definition, inconsistent messaging, or an underdeveloped PMM function. When a company loses deals it shouldn't be losing, the fix rarely lives in the competitive intelligence program. It lives in one of the four dimensions we kept. Including competitive positioning as a separate diagnostic category creates false parity between a structural failure mode and a symptom of other structural failures.

That's the kind of decision that feels arbitrary until you've reviewed several dozen GTM audit findings and noticed the pattern. Competitive displacement tends to resolve after you've addressed the other four areas. It doesn't need its own questions.

Why Answer Choices Matter More Than Questions

Most diagnostic tools ask reasonable questions with useless answer choices. "How mature is your product marketing function?" paired with options like "Not at all mature," "Somewhat mature," and "Very mature" produces a score, not an insight. The AI model analyzing those responses can only work with what you give it.

Every answer choice in this diagnostic is written as a behavioral description, not a label. For the question about win/loss analysis, the three options describe how an organization actually operates: no structured process and reliance on rep-reported reasons; occasional post-mortems that are ad hoc; and a structured win/loss program with third-party interviews that feeds back into positioning and enablement. The respondent recognizes themselves in one immediately. And the model generating the analysis can reference a specific operational reality, not just a tier.

We revised the answer choices more than the questions themselves. For a 12-question diagnostic with three options each, that's 36 behavioral descriptions that need to be mutually exclusive, specific enough to be recognizable, and calibrated so the middle option isn't just a softer version of "we're not sure." Getting that right took longer than the question design.
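To make the shape of that work concrete, here is a minimal sketch of how one question and its three behavioral options could be represented and sanity-checked. It is an illustration, not the production schema: the `Question` class, the `validate` helper, the prompt wording, and the dimension assigned to the win/loss question are all assumptions. Only the four dimension names and the three win/loss behaviors come from the text above.

```python
from dataclasses import dataclass

# The four dimensions named in this post.
DIMENSIONS = (
    "ICP and market clarity",
    "Messaging and narrative strength",
    "GTM execution and launch discipline",
    "PMM function maturity",
)


@dataclass(frozen=True)
class Question:
    """Hypothetical container for one diagnostic question."""
    dimension: str
    prompt: str
    options: tuple  # three behavioral descriptions, least to most mature


def validate(question: Question) -> list:
    """Return a list of structural problems; an empty list means none found."""
    problems = []
    if question.dimension not in DIMENSIONS:
        problems.append("unknown dimension")
    if len(question.options) != 3:
        problems.append("expected exactly three options")
    normalized = {option.strip().lower() for option in question.options}
    if len(normalized) != len(question.options):
        problems.append("options are not distinct")
    return problems


# Assumption: the dimension and prompt wording here are illustrative only.
win_loss = Question(
    dimension="GTM execution and launch discipline",
    prompt="How does your team run win/loss analysis?",
    options=(
        "No structured process; we rely on rep-reported reasons.",
        "Occasional post-mortems, run ad hoc.",
        "A structured win/loss program with third-party interviews "
        "that feeds back into positioning and enablement.",
    ),
)

print(validate(win_loss))
```

A check like this only catches structural mistakes. Whether the three options are truly mutually exclusive and recognizable is an editorial judgment that no validator makes for you.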

What the AI Layer Is Actually Doing and What It Isn't

The diagnostic collects 12 answers, a company stage, an ARR range, and a market segment, then sends all of it to an AI model that generates a personalized GTM health report in roughly 15 seconds.
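As a rough sketch of what "all of it" might look like on the wire, the example below bundles the four inputs into one validated object and serializes it as JSON. The field names, the 0/1/2 answer encoding, and the sample values are assumptions for illustration. The actual request format and the model call itself are not shown here.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class DiagnosticSubmission:
    """Hypothetical payload: 12 answers plus three pieces of company context."""
    answers: list          # assumed encoding: index of the chosen option (0, 1, or 2)
    company_stage: str
    arr_range: str
    market_segment: str

    def __post_init__(self):
        if len(self.answers) != 12:
            raise ValueError("expected exactly 12 answers")
        if any(answer not in (0, 1, 2) for answer in self.answers):
            raise ValueError("each answer must be 0, 1, or 2")


def to_payload(submission: DiagnosticSubmission) -> str:
    """Serialize the submission to a JSON string with stable key order."""
    return json.dumps(asdict(submission), sort_keys=True)


# Sample values are invented for the example.
submission = DiagnosticSubmission(
    answers=[2, 2, 1, 2, 1, 2, 0, 1, 0, 1, 1, 2],
    company_stage="Series C",
    arr_range="$50M-$100M",
    market_segment="DevOps",
)
payload = to_payload(submission)
print(payload)
```

Validating before sending keeps malformed submissions from reaching the model, where they would produce a confident-sounding report built on incomplete input.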

What the AI is not doing: returning a tier. Tier-based outputs ("You're a Level 2 GTM organization") are useless because they tell the reader nothing actionable. A $90M ARR DevOps platform with strong messaging and broken launch execution doesn't need to know it's a Level 2. It needs to know that launch discipline is undercutting the messaging work it's already done well.

What the AI is doing: pattern-matching the combination of answers to identify where gaps are most likely compounding each other. A company with weak ICP definition and inconsistent messaging gets a different analysis than one with strong ICP and weak PMM function maturity, even if their aggregate scores are identical. The dimension-level specificity is what makes the output useful.
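The point about identical aggregates can be shown with simple arithmetic. The sketch below assumes three questions per dimension and a 0 to 2 score per answer, neither of which is stated in this post, and uses a deliberately crude "weakest pair" rule as a stand-in for the model's pattern-matching. The short dimension keys and both company profiles are invented for the example.

```python
from itertools import combinations


def dimension_scores(answers: dict) -> dict:
    """Sum the per-question scores within each dimension."""
    return {dimension: sum(values) for dimension, values in answers.items()}


def weakest_pair(scores: dict) -> tuple:
    """Return the two dimensions with the lowest combined score."""
    pairs = combinations(sorted(scores), 2)
    return min(pairs, key=lambda pair: scores[pair[0]] + scores[pair[1]])


# Company A: weak ICP definition and inconsistent messaging.
company_a = {
    "icp": [0, 0, 1],
    "messaging": [0, 1, 0],
    "execution": [2, 2, 2],
    "pmm": [2, 2, 2],
}

# Company B: strong ICP, weak PMM function maturity.
company_b = {
    "icp": [2, 2, 2],
    "messaging": [2, 2, 1],
    "execution": [1, 1, 1],
    "pmm": [0, 0, 0],
}

scores_a = dimension_scores(company_a)
scores_b = dimension_scores(company_b)

print(sum(scores_a.values()), weakest_pair(scores_a))  # 14 ('icp', 'messaging')
print(sum(scores_b.values()), weakest_pair(scores_b))  # 14 ('execution', 'pmm')
```

Both companies total 14, so a tier-based output would treat them as the same organization. The dimension-level view shows two different problems, which is why the analysis works from the profile rather than the sum.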

The model also adjusts its framing based on company stage and ARR. The failure modes at Series A look different from those at Series C. A newly funded $15M ARR company with an undefined ICP is at a different kind of risk than a $180M ARR company with the same problem. Because stage and ARR are part of the input, the model can tell the two apart and frame the guidance accordingly.

Direction, Not Prescription: What the Tool Can't Do

This is the part most diagnostic tools don't say out loud.

The GTM diagnostic will tell you where your motion is most likely breaking down. It will not tell you exactly how to fix it. That distinction is not a limitation we're apologizing for. It's a deliberate design choice, and it matters.

AI is well-suited to pattern recognition at scale. Given a structured set of inputs, a well-prompted model can reliably identify which combination of gaps tends to produce which downstream outcomes. That's what this diagnostic does. What it cannot do is account for the organizational dynamics, the executive relationships, the competitive context, the product roadmap constraints, and the team capability questions that determine which intervention is actually executable for a specific company at a specific moment.

Consider what "weak ICP definition" looks like in practice. The diagnostic identifies the pattern: internally generated ICP, not validated against customer data or win/loss findings. That's a real signal. But the right response is not the same for every company. A 40-person Series B company where the founder still owns every strategic sales call has a different ICP problem than a $200M ARR company with three business units, four customer segments, and a PMM team that's never had a formal mandate to own the ICP. The diagnosis is similar. The intervention is completely different. Mapping that gap requires judgment built from having been inside these organizations, from knowing what weak ICP looks like at each altitude and what it actually takes to close it.

| The diagnostic surfaces what's most likely broken. What you do about it depends on context the tool can't see. That's not a gap in the AI — it's where experienced PMM judgment begins.

This is why the diagnostic is the start of a conversation, not the end of one. The five-minute read gives you enough signal to know whether a deeper conversation is worth having, and to walk into that conversation with a more precise question than "I think our GTM might be off." That's real value. But it's directional value, not a roadmap.

Two Jobs: PMM and Demand Gen Use This Differently

The diagnostic was built with two distinct use cases in mind, and they pull on the tool differently.

For PMM leaders, it functions as a positioning and maturity audit. The ICP and messaging dimensions surface structural gaps that appear in win/loss programs and competitive loss analysis. A PMM VP who suspects her organization has a messaging consistency problem can use the diagnostic to quantify that read and bring it to the CRO with more than anecdote. Messaging answers that consistently land in the middle tier make for a different conversation than "I think our messaging needs work."

For demand gen leaders, the tool surfaces upstream problems before they double down on a motion that's structurally compromised. At $30M–$80M ARR, the most common demand gen failure mode isn't insufficient budget or the wrong channels. It's a positioning problem that nobody has named yet. A demand gen leader who sees strong ICP clarity but weak launch discipline now has a diagnosis for why campaigns underdeliver despite strong creative. The gaps connect. The tool shows where they do.

In both cases, the output gives direction. The decision about which gap to close first, and how, still requires someone who understands the organizational context well enough to sequence the work. That's the part no five-minute tool can do.

What Building This Revealed About How GTM Actually Breaks

Three patterns emerged from the design process that we didn't fully anticipate.

ICP problems are almost always misclassified as messaging problems. The strongest predictor of downstream GTM failure, across the question structure we developed, wasn't messaging incoherence. It was ICP documentation that was internally generated rather than market-validated. Companies that defined their ICP from inside the building had messaging that sounded confident but didn't land. The root cause was upstream; the symptom showed up in the messaging layer. The diagnostic is structured to catch the distinction.

The middle tier is the most dangerous place to be. Companies at the low end of the maturity curve know they have a problem. Companies at the high end have built systems. The companies in the middle have enough in place to feel operational, but not enough to notice when their GTM motion starts to degrade. They keep executing the same plays quarter after quarter without realizing those plays have stopped fitting the market. The diagnostic was deliberately calibrated so the middle tier isn't reassuring.

Sales and PMM alignment problems rarely announce themselves as alignment problems. They surface as content utilization gaps, launch execution failures, and inconsistent rep messaging, all of which have tactical remedies that miss the structural cause. A team pulling things together at the last minute on every launch isn't having a project management problem. It's having a cross-functional ownership problem that lives at the organizational design level. Recognizing that from a diagnostic read is one thing. Knowing what to do about it in a specific organization is something else entirely.

What We'd Add Next

Three extensions are on the roadmap.

A CRM signal layer. Right now the diagnostic is self-reported, which means it reflects the respondent's perception of their GTM motion, not what pipeline data actually shows. A version that ingests win/loss rates, average deal cycles, and stage conversion from a CRM could cross-reference self-reported answers against actual motion performance. The gap between perception and reality is often where the most important blind spot lives.

A benchmark overlay. The diagnostic produces a personalized analysis, but it doesn't tell respondents how their answers compare to similar companies. A $120M ARR Series C company should know whether its messaging consistency scores are typical for its stage and segment, or outliers. That requires sufficient response volume to be meaningful, but it's the natural next layer once that volume exists.

Longitudinal tracking. GTM health isn't static. A diagnostic taken in Q1 before a positioning refresh should produce a different result than one taken in Q3 after the messaging rollout. A version that lets a team track scores over time and surface whether interventions are producing measurable improvement turns it from a point-in-time read into an operational instrument. That's the version that fits inside a PMM operating rhythm.
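A minimal sketch of what that tracking could compute, assuming each run is stored as a per-dimension score: compare two runs and report which dimensions moved. The dimension keys, the score scale, and the Q1 and Q3 numbers are invented for illustration. Nothing here reflects an existing feature.

```python
def score_deltas(before: dict, after: dict) -> dict:
    """Per-dimension change between two diagnostic runs (after minus before)."""
    return {dimension: after[dimension] - before[dimension] for dimension in before}


def summarize(deltas: dict) -> tuple:
    """Split dimensions into those that improved and those that regressed."""
    improved = sorted(d for d, change in deltas.items() if change > 0)
    regressed = sorted(d for d, change in deltas.items() if change < 0)
    return improved, regressed


# Hypothetical runs: Q1 before a positioning refresh, Q3 after the messaging rollout.
q1_scores = {"icp": 3, "messaging": 2, "execution": 4, "pmm": 3}
q3_scores = {"icp": 3, "messaging": 5, "execution": 4, "pmm": 2}

deltas = score_deltas(q1_scores, q3_scores)
improved, regressed = summarize(deltas)

print(deltas)     # {'icp': 0, 'messaging': 3, 'execution': 0, 'pmm': -1}
print(improved)   # ['messaging']
print(regressed)  # ['pmm']
```

The regression matters as much as the improvement. A messaging rollout that lifts one dimension while another slips is exactly the kind of trade-off a point-in-time read cannot show.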

Start with the Diagnostic

If you're a CMO or VP of Marketing trying to get a clear read on where your GTM motion is breaking down, the diagnostic will tell you something useful in five minutes. Not a tier. Not a vendor recommendation. A specific, directional read on what's most likely degrading and a sharper question to bring into any conversation about what to do next. PMMs and demand gen leaders use it too, often to surface the upstream positioning problems that explain why execution keeps underdelivering.

Take the free diagnostic at blindspotrisk.com/gtm-diagnostic. If the results surface a pattern worth going deeper on, BlindSpot can help you interpret what it means for your specific organization and build the roadmap to close the gap. The next step is a 30-minute GTM gap session.

AI in Practice, GTM Strategy, PMM Leadership, Product Marketing Strategy

Todd Fitzgibbon https://www.blindspotrisk.com

Perspectives on product marketing, GTM strategy, and what moves the market.