How to build a vendor scorecard.
Most vendor relationships run on the last conversation and the loudest complaint. A scorecard replaces opinion with a number both sides can manage from.
9 min read · By Everton Paula
Walk into most growth-stage operations and the vendors are managed by feel. The 3PL is "mostly fine." The support BPO is "a problem lately." The packaging supplier is "great, we love them," which on inspection means they are pleasant on calls, not that they ship on time. None of those statements is a number, and none of them survives the moment a vendor relationship goes wrong and the founder asks how long it has been wrong.
A vendor scorecard fixes that. It is a short, repeatable measurement of a supplier's performance against a few metrics that matter, each with a target and a weight, rolled up to a single number. The number is not the point on its own. The point is that it gives you and the vendor the same scoreboard, so the monthly review is a conversation about the same facts instead of a negotiation over whose anecdote is more recent.
The metrics that earn a place
The discipline of a good scorecard is restraint. Four metrics carry most operations, plus a fifth or sixth specific to what the vendor is actually for. Past that, you are building a dashboard, and dashboards get watched, not acted on.
- SLA adherence. Did the vendor hit the delivery and response times in the agreement? This is the spine of the scorecard, and it only works if the SLA was written in numbers, not adjectives.
- Quality or defect rate. Of what they delivered, how much was wrong? Damaged units, failed tickets, returns, reworks. The metric depends on the vendor; the principle does not.
- Responsiveness on escalations. When something broke, how fast did they own it? A vendor who hits SLA in calm weeks but disappears in a crisis is a bigger risk than the average suggests.
- Cost against plan. Are you paying what the contract said, including the surcharges and exceptions that quietly accumulate outside the headline rate?
Then add at most one or two that are specific to the job. On-time-in-full for a logistics vendor. First-contact resolution for a support partner. Yield for a manufacturing supplier. Pick the number that, if it slipped, would actually change your decision about the relationship. If a metric would not change a decision, it does not belong on the card.
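Each of these metrics reduces to a ratio over records you already hold. A minimal sketch in Python, assuming a hypothetical `Shipment` record with a promised date, an actual delivery date, and a damaged flag. The field names and sample data are illustrative and not taken from any particular system:

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Shipment:
    # Hypothetical record shape; map these fields to your own system's data.
    promised: date
    delivered: date
    damaged: bool


def sla_adherence(shipments: list[Shipment]) -> float:
    """Share of shipments delivered on or before the promised date."""
    if not shipments:
        raise ValueError("no shipments to score")
    on_time = sum(1 for s in shipments if s.delivered <= s.promised)
    return on_time / len(shipments)


def defect_rate(shipments: list[Shipment]) -> float:
    """Share of shipments that arrived damaged."""
    if not shipments:
        raise ValueError("no shipments to score")
    return sum(1 for s in shipments if s.damaged) / len(shipments)


# Illustrative data only.
shipments = [
    Shipment(date(2024, 3, 4), date(2024, 3, 4), False),  # on time
    Shipment(date(2024, 3, 5), date(2024, 3, 7), False),  # two days late
    Shipment(date(2024, 3, 6), date(2024, 3, 6), True),   # on time, damaged
    Shipment(date(2024, 3, 8), date(2024, 3, 7), False),  # a day early
]

print(f"SLA adherence: {sla_adherence(shipments):.0%}")
print(f"Defect rate: {defect_rate(shipments):.0%}")
```

With this sample the SLA adherence comes out at 75% and the defect rate at 25%. The same shape works for tickets or production lots; only the record and the definition of "wrong" change.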
Weighting, so the number means something
Not every metric matters equally, and a scorecard that averages them as if they do will hide the failures that count. Weight the metrics by what actually hurts when they slip. For a last-mile vendor, on-time delivery might carry half the score because a late delivery is a lost customer, while cost variance carries less because you can renegotiate it. For a payments vendor, an outage during peak is catastrophic and a slow invoice is not, so uptime dominates the weighting.
The weights are a strategy statement, not a math exercise. They say, out loud, what you will and will not tolerate from this vendor. Set them with the operating lead who lives with the consequences, not with procurement in isolation, and write them down before you start scoring so nobody can argue the weights after a bad month.
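The roll-up itself is a weighted average. A minimal sketch, assuming each metric has already been converted to a 0-100 attainment score against its target. The metric names, weights, and scores are hypothetical, loosely modeled on the last-mile example where on-time delivery carries half the score:

```python
# Hypothetical weights for a last-mile vendor; they must sum to 1.0.
WEIGHTS = {
    "on_time_delivery": 0.50,
    "defect_rate": 0.20,
    "escalation_response": 0.20,
    "cost_vs_plan": 0.10,
}


def weighted_score(attainment: dict[str, float], weights: dict[str, float]) -> float:
    """Roll per-metric attainment (0-100) up to one weighted score."""
    if abs(sum(weights.values()) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1.0")
    if attainment.keys() != weights.keys():
        raise ValueError("every weighted metric needs exactly one score")
    return sum(weights[m] * attainment[m] for m in weights)


# Illustrative month: late deliveries, everything else on target.
march = {
    "on_time_delivery": 70.0,
    "defect_rate": 100.0,
    "escalation_response": 100.0,
    "cost_vs_plan": 100.0,
}

score = weighted_score(march, WEIGHTS)
simple_average = sum(march.values()) / len(march)
print(f"weighted: {score:.1f}, simple average: {simple_average:.1f}")
```

In this made-up month the weighted score is 85.0 while the simple average is 92.5. The unweighted number would have hidden exactly the failure the weights were set up to expose.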
The escalation path is what gives it teeth
A scorecard with no consequence is a monthly report nobody reads twice. The score has to connect to an action. The cleanest way is a written escalation path tied to score bands. A green score runs normally. A yellow score triggers a documented improvement conversation with a 30-day window and a named owner on both sides. A red score, or two yellows in a row, triggers a formal review of the contract, the volume, or the relationship itself.
This is the part most teams skip, and it is the part that makes the scorecard real. The vendor needs to know, in writing and in advance, what a red month sets in motion. Once they do, the scorecard stops being a measurement and becomes the quiet pressure that moves the vendor, no raised voice required. In one operation I ran, replacing ad hoc vendor conversations with a scored card and a written escalation path was a meaningful part of cutting a defect rate from 6 percent to 3 percent in six months, because the vendors who drove the defects could finally see exactly where they stood and exactly what happened next.
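The band logic is simple enough to write down explicitly, which is also a good test of whether the escalation path is unambiguous. A minimal sketch, assuming hypothetical cut-offs of 90 for green and 75 for yellow. Those thresholds are placeholders; agree your own with the vendor in advance:

```python
GREEN_MIN = 90.0   # Assumed threshold, a placeholder.
YELLOW_MIN = 75.0  # Assumed threshold, a placeholder.


def band(score: float) -> str:
    """Map a 0-100 score to a colour band."""
    if score >= GREEN_MIN:
        return "green"
    if score >= YELLOW_MIN:
        return "yellow"
    return "red"


def escalation(history: list[float]) -> str:
    """Return the action for the latest month, given scores oldest-first."""
    if not history:
        raise ValueError("need at least one monthly score")
    current = band(history[-1])
    previous = band(history[-2]) if len(history) > 1 else None
    # A red month, or two yellows in a row, triggers the formal review.
    if current == "red" or (current == "yellow" and previous == "yellow"):
        return "formal review"
    if current == "yellow":
        return "30-day improvement plan"
    return "business as usual"


print(escalation([93.0]))        # green month
print(escalation([93.0, 82.0]))  # first yellow
print(escalation([82.0, 80.0]))  # second yellow in a row
print(escalation([95.0, 60.0]))  # red month
```

The four calls print "business as usual", "30-day improvement plan", "formal review", and "formal review" in that order. Keeping the rule this mechanical is deliberate: nobody has to decide after a bad month whether it counts.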
How to actually run it
A scorecard is only as good as the cadence that reviews it. Score monthly, on a fixed date, from data you can pull without asking the vendor for their version of the numbers. Review the score in the monthly operating meeting for that function, with the operating lead presenting, not procurement. Send the vendor the same card you used. Transparency is not a courtesy here, it is the mechanism: a vendor managing to a number they can see will move it; a vendor guessing at how you feel about them will not.
If you do not yet have a cadence to review it in, that is the prerequisite, and it is worth building first. The vendor scorecard is one of the artifacts that lives inside a weekly and monthly operating cadence. Without the cadence, the card has no room to run.
The three ways it goes wrong
Too many metrics. The card balloons to twenty lines because every team wants their number on it, and the score loses meaning. Hold the line at four to six. The pressure to add metrics is constant and you have to resist it every quarter.
Vendor-reported data. If the vendor supplies the numbers that grade the vendor, the scorecard measures their reporting, not their performance. Pull the data from your own systems wherever you can, and where you cannot, define the source jointly before the first score.
No teeth. The card gets built, gets shown, and nothing happens when it goes red. After two consequence-free red months, the vendor correctly concludes the scorecard is theater, and you have trained them to ignore it. The escalation path is not optional. It is the whole point.
Why this is operating work, not procurement work
A vendor scorecard looks like a procurement artifact, but it is an operating one. Procurement signs the contract. Operations lives with whether the vendor delivers, and operations is where the scorecard has to be owned, reviewed, and acted on. Building one is a small, concrete example of the larger job: installing the systems that let a company manage its operation with numbers instead of vibes, and that keep running after the person who built them moves on.
That is the work Plenor does. If your operation is managed by feel and you want an outside operator to read it and rank what to fix first, the one-week, fixed-fee Operating Teardown is the place to start. If you need someone to own the operation while you build it, that is fractional COO work.
