
Sentiment Balance Score (SBS): A Technical Specification for Topic-Level Sentiment Measurement and Benchmarking

Abstract


Sentiment Balance Score (SBS) is a topic-level, polarity-balanced metric derived from open-ended feedback. SBS converts unstructured comments into standardized aspect/topic mentions, assigns aspect-level polarity, applies confidence and intensity weighting, stabilizes estimates for low-volume topics using Bayesian smoothing, and enables peer-cohort benchmarking and trend/volatility analysis. This document defines the end-to-end pipeline and provides implementable formulas and data structures.

1. Data Model and Notation

Let each written comment be a document d with metadata:

  • builder (or entity) b(d)
  • timestamp t(d)

The pipeline transforms each comment into a set of mentions (aka aspect annotations). A mention i is a record:

i = ⟨b_i, τ_i, ℓ_i, s_i, m_i, c_i⟩

Where:

  • b_i: builder/entity id
  • τ_i: time bucket (e.g., day/week/month)
  • ℓ_i: canonical topic label (e.g., paint_quality)
  • s_i ∈ {−1, 0, +1}: polarity (negative, neutral, positive)
  • m_i ∈ [0, 1]: intensity/strength (how strong the sentiment is)
  • c_i ∈ [0, 1]: model confidence (probability-like; calibration discussed later)

A mention has a derived weight: w_i = m_i · c_i
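As a sketch, the mention record and its derived weight might look like this in Python (field names are illustrative, not a prescribed schema):

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Mention:
    builder_id: str    # b_i
    time_bucket: str   # tau_i, e.g. "2024-06" for a monthly bucket
    topic: str         # canonical topic label, e.g. "paint_quality"
    polarity: int      # s_i in {-1, 0, +1}
    intensity: float   # m_i in [0, 1]
    confidence: float  # c_i in [0, 1]

    @property
    def weight(self) -> float:
        # w_i = m_i * c_i
        return self.intensity * self.confidence

m = Mention("b-001", "2024-06", "paint_quality", +1, 0.8, 0.9)
```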


2. End-to-End Pipeline (Per Comment)

Each comment goes through deterministic stages. You can implement this as an event-driven pipeline.

Step 0 – Ingest

Store raw comment text and metadata:

  • comment_id, builder_id, created_at, text, source, etc.

Step 1 – Preprocessing

Recommended (not strictly required):

  • language detection
  • sentence segmentation
  • PII masking (optional, governance-driven)

Step 2 – Aspect Extraction (Topic Discovery)

Goal: identify spans that correspond to “what the person is talking about.”

Output: a set of extracted aspect spans: A(d) = {a_1, a_2, …, a_K}

Each aspect span a_k should include:

  • raw span text
  • candidate topic label(s)
  • evidence / anchor phrase boundaries (if your model provides them)

Implementation notes

  • You can do this with LLM structured extraction, ABSA models, or hybrid (LLM + taxonomy mapping).
  • A taxonomy constraint improves standardization (critical for benchmarking).

Step 3 – Aspect-Level Sentiment (No Inheritance)

For each extracted aspect a_k, predict:

  • polarity s
  • intensity m
  • confidence c

This avoids the inheritance error where a comment-level label is applied to all topics.

Mixed sentiment

If a span expresses mixed sentiment (rare but real), options:

  • split into two mentions (preferred)
  • or assign s = 0 and track a mixed flag

Step 4 – Canonical Topic Mapping

Map each aspect’s free-text label to a canonical topic ℓ.

Two-stage mapping recommended:

  1. alias dictionary / rules (deterministic)
  2. embedding similarity to canonical topic embeddings with thresholding

Unmatched topics go to OTHER for later curation.
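A sketch of the two-stage mapping; ALIASES, TOPIC_VECS, and the embed callback are illustrative stand-ins for a real taxonomy and embedding model:

```python
import math

# Stage 1: deterministic alias dictionary (contents are illustrative).
ALIASES = {"paint": "paint_quality", "painting": "paint_quality",
           "walkthrough": "final_walkthrough"}

# Stage 2: embedding similarity. Toy 2-d vectors stand in for real
# canonical topic embeddings.
TOPIC_VECS = {"paint_quality": (0.9, 0.1), "final_walkthrough": (0.1, 0.9)}

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def map_topic(raw_label, embed, threshold=0.75):
    """Return a canonical topic, or 'OTHER' if nothing clears the threshold."""
    key = raw_label.strip().lower()
    if key in ALIASES:  # stage 1: deterministic rules win
        return ALIASES[key]
    vec = embed(key)    # stage 2: nearest canonical topic embedding
    best, score = max(((t, cosine(vec, tv)) for t, tv in TOPIC_VECS.items()),
                      key=lambda x: x[1])
    return best if score >= threshold else "OTHER"
```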

Step 5 – Persist Mentions

Insert mention rows:

  • polarity s_i
  • intensity m_i
  • confidence c_i
  • weight w_i
  • canonical topic ℓ_i
  • builder/time metadata

3. Topic-Level Aggregation

For a given builder b, topic ℓ, and time window τ, aggregate weighted “votes”:

P_{b,ℓ,τ} = Σ_{i ∈ (b,ℓ,τ), s_i = +1} w_i
N_{b,ℓ,τ} = Σ_{i ∈ (b,ℓ,τ), s_i = −1} w_i
U_{b,ℓ,τ} = Σ_{i ∈ (b,ℓ,τ), s_i = 0} w_i

Define the weighted voting mass:

V_{b,ℓ,τ} = P_{b,ℓ,τ} + N_{b,ℓ,τ}

Neutral is tracked but non-voting by default (analogous to NPS passives). You can optionally include it for other diagnostics (e.g., “clarity/decisiveness”).
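A minimal aggregation sketch over mention rows (the tuple layout is illustrative, standing in for rows in the mentions table):

```python
from collections import defaultdict

# Each mention row: (builder, topic, bucket, polarity, weight).
mentions = [
    ("b1", "paint_quality", "2024-06", +1, 0.72),
    ("b1", "paint_quality", "2024-06", -1, 0.40),
    ("b1", "paint_quality", "2024-06",  0, 0.30),
    ("b1", "paint_quality", "2024-06", +1, 0.90),
]

def aggregate(mentions):
    """Return {(builder, topic, bucket): {'P', 'N', 'U', 'V'}} rollups."""
    out = defaultdict(lambda: {"P": 0.0, "N": 0.0, "U": 0.0})
    for b, topic, bucket, s, w in mentions:
        agg = out[(b, topic, bucket)]
        if s > 0:
            agg["P"] += w
        elif s < 0:
            agg["N"] += w
        else:
            agg["U"] += w  # neutral: tracked but non-voting
    for agg in out.values():
        agg["V"] = agg["P"] + agg["N"]  # voting mass excludes neutrals
    return dict(out)

rollup = aggregate(mentions)[("b1", "paint_quality", "2024-06")]
```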


4. Core SBS Metric

4.1 Raw (unsmoothed) SBS

If V > 0:

SBS^raw_{b,ℓ,τ} = 100 · (P_{b,ℓ,τ} − N_{b,ℓ,τ}) / (P_{b,ℓ,τ} + N_{b,ℓ,τ})

Range: [−100, +100]

If V = 0, SBS is undefined; return null and show “insufficient voting signal.”

This is the closest analogue to “%positive − %negative” in the two-class voting universe.


4.2 Bayesian-Smoothed SBS (recommended)

Low-volume topics produce unstable rates. Stabilize using Beta priors.

Interpret P and N as weighted pseudo-counts. Use a symmetric prior by default: α > 0, β > 0.

Smoothed positive and negative rates:

p̂_{b,ℓ,τ} = (P_{b,ℓ,τ} + α) / (V_{b,ℓ,τ} + α + β)
n̂_{b,ℓ,τ} = (N_{b,ℓ,τ} + β) / (V_{b,ℓ,τ} + α + β)

Then:

SBS_{b,ℓ,τ} = 100 · (p̂_{b,ℓ,τ} − n̂_{b,ℓ,τ})

Equivalent closed form:

SBS_{b,ℓ,τ} = 100 · ((P_{b,ℓ,τ} − N_{b,ℓ,τ}) + (α − β)) / (V_{b,ℓ,τ} + α + β)

With symmetric priors α = β, this becomes:

SBS_{b,ℓ,τ} = 100 · (P_{b,ℓ,τ} − N_{b,ℓ,τ}) / (V_{b,ℓ,τ} + 2α)

Choosing priors

  • α = β = 1: Laplace smoothing (simple, common)
  • α = β = k/2: “k pseudo-votes” stabilizer. Pick k based on desired damping.
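The closed form is a one-liner; a sketch with the Laplace default:

```python
def sbs_smoothed(P: float, N: float, alpha: float = 1.0, beta: float = 1.0) -> float:
    """Bayesian-smoothed SBS; alpha = beta = 1 gives Laplace smoothing."""
    V = P + N
    return 100.0 * ((P - N) + (alpha - beta)) / (V + alpha + beta)
```

Note that with a symmetric prior and no data, the score is pulled to 0 rather than being undefined, and unanimous low-volume topics are damped away from ±100.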

5. Confidence and Uncertainty

You should report SBS alongside a confidence measure.

5.1 Vote-Mass Confidence (simple, executive-friendly)

Map V to [0, 1]:

Conf_{b,ℓ,τ} = 1 − e^(−V_{b,ℓ,τ}/κ)

κ sets the scale of stability. Example: κ = 20.
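A sketch of the vote-mass mapping:

```python
import math

def vote_mass_confidence(V: float, kappa: float = 20.0) -> float:
    """Map weighted vote mass V >= 0 into [0, 1); kappa sets the scale."""
    return 1.0 - math.exp(-V / kappa)
```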

5.2 Credible Interval (statistician-friendly)

Let the posterior for the positive rate be:

p ~ Beta(P + α, N + β)

A credible interval for p yields an interval for SBS via the transformation:

SBS = 100 · (2p − 1)

Compute:

  • p_low = BetaInvCDF(0.025, P + α, N + β)
  • p_high = BetaInvCDF(0.975, P + α, N + β)

Then:

SBS_low = 100 · (2·p_low − 1),  SBS_high = 100 · (2·p_high − 1)
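If a Beta inverse CDF is at hand (e.g., scipy.stats.beta.ppf), use it directly; otherwise the interval can be approximated by sampling from the posterior with the standard library. A Monte Carlo sketch:

```python
import random

def sbs_credible_interval(P, N, alpha=1.0, beta=1.0,
                          level=0.95, draws=20000, seed=0):
    """Approximate (lo, hi) credible interval for SBS = 100 * (2p - 1),
    where p ~ Beta(P + alpha, N + beta), via posterior sampling."""
    rng = random.Random(seed)
    samples = sorted(rng.betavariate(P + alpha, N + beta) for _ in range(draws))
    tail = (1.0 - level) / 2.0
    p_low = samples[int(tail * draws)]
    p_high = samples[int((1.0 - tail) * draws) - 1]
    return 100.0 * (2.0 * p_low - 1.0), 100.0 * (2.0 * p_high - 1.0)
```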


6. Benchmarking Methodology

Compute SBS for peer cohorts.

Let cohort C be defined by filters (region, price tier, product line, delivery model, etc.). Aggregate across all builders b ∈ C:

P_{C,ℓ,τ} = Σ_{b ∈ C} P_{b,ℓ,τ}
N_{C,ℓ,τ} = Σ_{b ∈ C} N_{b,ℓ,τ}

Compute cohort SBS the same way:

SBS_{C,ℓ,τ} = 100 · (P_{C,ℓ,τ} − N_{C,ℓ,τ}) / ((P_{C,ℓ,τ} + N_{C,ℓ,τ}) + 2α)

Then define the benchmark delta:

Δ_{b,ℓ,τ} = SBS_{b,ℓ,τ} − SBS_{C,ℓ,τ}

Percentiles

For each topic ℓ in cohort C, compute the empirical distribution of SBS_{b,ℓ,τ} and report:

  • percentile rank
  • quartiles
  • z-score (optional, though SBS isn’t guaranteed normal)
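A sketch of cohort benchmarking; the percentile-rank convention here (share of cohort scores at or below the builder's) is an assumption, not part of the spec:

```python
def sbs(P, N, alpha=1.0):
    # symmetric-prior smoothed SBS
    return 100.0 * (P - N) / (P + N + 2.0 * alpha)

def benchmark(builder_pn, cohort_pn_list, alpha=1.0):
    """Return (builder SBS, cohort SBS, delta, percentile rank).
    builder_pn is (P, N); cohort_pn_list holds per-builder (P, N)
    pairs, including the builder itself."""
    P_c = sum(p for p, n in cohort_pn_list)
    N_c = sum(n for p, n in cohort_pn_list)
    s_b = sbs(*builder_pn, alpha)
    s_c = sbs(P_c, N_c, alpha)
    scores = sorted(sbs(p, n, alpha) for p, n in cohort_pn_list)
    pct = 100.0 * sum(s <= s_b for s in scores) / len(scores)
    return s_b, s_c, s_b - s_c, pct
```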

7. Trend and Volatility

Compute SBS over rolling windows (e.g., monthly). Let SBS_{b,ℓ,t} be the resulting time series.

Trend slope

Fit OLS on the last n periods: SBS_{b,ℓ,t} = a + b·t + ε

Report:

  • slope b (points/month)
  • R² (signal strength)

Volatility

Compute the standard deviation of SBS (or of the posterior mean) over the last n periods:

Vol_{b,ℓ} = StdDev(SBS_{b,ℓ,t})

High volatility + low confidence often indicates insufficient volume or operational instability.


8. Programmatic Implementation Notes

8.1 Storage

Maintain:

  • mentions table (atomic)
  • rollup table keyed by (builder, topic, time bucket):
    • pos_weight_sum
    • neg_weight_sum
    • neutral_weight_sum
    • vote_weight_sum
    • counts
  • model/prompt versions for reproducibility

8.2 Idempotency

Store comment_processing keyed by:

  • comment_id
  • model_version
  • prompt_version

and only process once per version.
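A sketch of the check, with an in-memory set standing in for a unique index on the comment_processing table:

```python
processed: set[tuple[str, str, str]] = set()

def should_process(comment_id: str, model_version: str, prompt_version: str) -> bool:
    """Claim a (comment, model, prompt) key; False if already processed.
    Re-running a comment under a new model or prompt version is allowed."""
    key = (comment_id, model_version, prompt_version)
    if key in processed:
        return False
    processed.add(key)
    return True
```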

8.3 Topic Governance

Maintain:

  • topic taxonomy
  • alias mappings
  • “OTHER queue” review process
  • periodic embedding re-indexing

8.4 Calibration

Confidence c_i should ideally be calibrated (temperature scaling or isotonic regression) against a labeled validation set. If not, treat it as relative and keep a QA program that measures drift.


9. Summary of the SBS Definition

Atomic unit: aspect mention i with (ℓ_i, s_i, m_i, c_i)

Weight: w_i = m_i · c_i

Aggregates: P = Σ w_i · 1[s_i = +1],  N = Σ w_i · 1[s_i = −1]

Smoothed SBS (symmetric priors): SBS = 100 · (P − N) / (P + N + 2α)

Benchmark delta: Δ = SBS_builder − SBS_cohort

Confidence: Conf = 1 − e^(−(P+N)/κ), or credible intervals via the Beta posterior
