Probability Model¶
This page explains the exact model Meerkit uses to turn account signals into a follow-back probability, and why each factor exists.
Summary¶
Meerkit uses a hybrid heuristic + calibration model:
- Start from a conservative baseline probability.
- Add interpretable feature adjustments on a log-odds scale.
- Convert that score to probability with a sigmoid.
- Blend the result with historical success rates from similar past predictions.
- Return both probability and confidence.
```mermaid
flowchart LR
    A[Baseline prior 0.28] --> B[Additive feature score]
    B --> C[Sigmoid conversion]
    C --> D[Historical cohort calibration]
    D --> E[Final probability]
    B --> F[Confidence adjustments]
    D --> F
    F --> G[Confidence and ambiguity flags]
```

1. Baseline Prior¶
The raw model starts from:

\[ p_0 = 0.28 \]

This is converted into log-odds before feature adjustments are applied:

\[ s_0 = \ln\frac{p_0}{1 - p_0} \approx -0.944 \]
Why start at 28%¶
The algorithm assumes that most targets are not naturally coin-flip reciprocity cases. A lower prior makes the model earn higher probabilities through evidence instead of drifting upward too easily.
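As a minimal sketch of this starting point (only the 0.28 prior comes from this page; the `logit` helper name is illustrative):

```python
import math

def logit(p: float) -> float:
    """Convert a probability to log-odds."""
    return math.log(p / (1.0 - p))

BASELINE_PRIOR = 0.28                     # conservative starting probability
baseline_score = logit(BASELINE_PRIOR)    # ~ -0.944 log-odds
```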
2. Why The Model Works In Log-Odds Space¶
Feature adjustments are added to a score, not directly to the final probability.
That gives three practical advantages:
- Weak signals combine smoothly.
- Strong signals saturate naturally after sigmoid conversion.
- The model remains interpretable without needing a trained classifier.
The heuristic stage is:

\[ s = s_0 + \sum_i \Delta_i \]

where each \(\Delta_i\) is one feature adjustment, and then:

\[ p_{\text{heuristic}} = \sigma(s) = \frac{1}{1 + e^{-s}} \]
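The score-then-sigmoid stage above can be sketched as follows (the specific adjustment values in the example call are illustrative, taken loosely from the factor tables below):

```python
import math

def sigmoid(s: float) -> float:
    """Map a log-odds score to a probability."""
    return 1.0 / (1.0 + math.exp(-s))

def heuristic_probability(baseline_score: float, adjustments: list[float]) -> float:
    """Add interpretable feature adjustments in log-odds space, then convert."""
    score = baseline_score + sum(adjustments)
    return sigmoid(score)

# Illustrative only: a public target (+0.06) the active account already follows (+0.20)
p = heuristic_probability(-0.944, [0.06, 0.20])
```

Because the adjustments combine before the sigmoid, several weak signals shift the probability smoothly, and very large totals flatten out instead of exceeding 100%.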
3. Factor Families¶
Relationship-state factors¶
These are the strongest direct behavioral signals.
| Factor | Approximate effect | Why it matters |
|---|---|---|
| Target already follows active account | Raises score to at least about 82% equivalent | This is effectively direct evidence of reciprocity already existing |
| Active account already follows target | +0.20 score | A follow is more likely to be returned when the connection is already initiated |
| Private account | -0.28 score | Private accounts are typically more selective |
| Public account | +0.06 score | Public profiles are a bit less restrictive |
| Verified account | -0.38 score | Verified profiles usually behave more like asymmetric audience accounts |
Why these factors exist¶
They represent actual relationship friction. Private and verified profiles usually have more selective or asymmetric follow behavior, while existing follow state is a direct social signal rather than a proxy.
Audience-size and ratio factors¶
The model does not just ask how big the account is. It also asks how reciprocal the account looks from its follower/following shape.
| Factor | Effect style | Why it matters |
|---|---|---|
| Follower count | Continuous log-scale adjustment, roughly capped between -0.42 and +0.22 | Smaller accounts are usually closer to reciprocal-follow behavior; very large audiences are less reciprocal |
| Following-to-follower ratio | Continuous tanh adjustment, capped by shape | Accounts that follow back more often tend to have following counts that are closer to their follower counts |
Why not use hard thresholds only¶
Hard buckets are useful for history calibration, but continuous adjustments are better for the live heuristic score. They avoid sudden jumps where two near-identical accounts fall on opposite sides of an arbitrary threshold.
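A sketch of what such continuous adjustments can look like. Only the caps (-0.42 to +0.22 for follower count) come from this page; the pivot, slope, and tanh center are assumptions for illustration:

```python
import math

def follower_count_adjustment(followers: int) -> float:
    """Continuous log-scale adjustment, clamped to the documented range.
    The pivot and slope are illustrative; only the caps come from the docs."""
    pivot = 1_000  # assumed reference audience size
    raw = -0.12 * math.log10(max(followers, 1) / pivot)
    return max(-0.42, min(0.22, raw))

def ratio_adjustment(following: int, followers: int) -> float:
    """Continuous tanh adjustment: a following count close to or above the
    follower count leans toward reciprocal behavior. Scale is illustrative."""
    ratio = following / max(followers, 1)
    return 0.15 * math.tanh(ratio - 1.0)
```

Note that two accounts with 999 and 1,001 followers get nearly identical adjustments, which is exactly the jump-free behavior a hard bucket cannot provide.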
Mutual-follower factors¶
| Factor | Approximate effect | Why it matters |
|---|---|---|
| Mutual followers count | Up to +0.22 score | Mutuals indicate local social proximity |
| Mutual-to-follower ratio | Up to +0.12 score when ratio is at least 3% | Mutuals mean more when they are dense relative to the target's audience size |
Why the effect is intentionally small¶
Mutuals are helpful, but they are not enough on their own. The code treats them as supporting context, not as a dominant shortcut. That avoids overrating targets just because they sit near a shared cluster.
Metadata factors¶
| Factor | Approximate effect | Why it matters |
|---|---|---|
| Very high media count (>= 1000) | -0.16 score | Extremely high output often correlates with broadcasting rather than reciprocal behavior |
| Public-figure style category | -0.18 score | Categories like artist, creator, celebrity, musician, and public figure often map to one-to-many audience patterns |
| Professional account flag | -0.14 score | Professional accounts are slightly less reciprocal on average |
Confidence-only metadata factors¶
Some metadata does not directly raise or lower probability. Instead it raises confidence because it means the app has a richer profile snapshot:
- Biography length of at least `80` characters: `+0.03` confidence
- Highlight reels present: `+0.02` confidence
Why separate probability from confidence here¶
Biography length or highlight reels do not reliably imply reciprocal behavior by themselves. They do, however, signal that the account metadata is richer and less sparse, which helps the system trust the estimate slightly more.
Overlap factors¶
If relationship graph data is available, the model compares the target's network to the active account's latest scanned follower set.
There are two overlap notions:
- Follower overlap: the target is followed by people already in the active audience
- Following overlap: the target follows people already in the active audience
| Factor | Approximate effect | Why it matters |
|---|---|---|
| Overlap followers count | Up to +0.50 score | The target is already adjacent to the audience that matters to the active account |
| Overlap following count | Up to +0.42 score | The target already points attention toward the same audience neighborhood |
| Overlap followers ratio vs reference audience | Up to +0.35 score | Raw overlap count should mean more when it forms a meaningful share of the active audience |
| Overlap following ratio vs reference audience | Up to +0.28 score | Same idea for following-side overlap |
Why overlap is powerful¶
This is the most graph-aware part of the model. Unlike profile metadata, overlap says something about network proximity rather than just account presentation.
Why overlap is optional¶
Fetching relationship graphs is heavier than reading cached profile metadata. So the system can run in metadata-only mode first and then improve the score when cached follower/following data becomes available.
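A sketch of the overlap scoring described above. The caps (+0.50, +0.42, +0.35, +0.28) come from the factor table; the ramp shape and the 25-account saturation point are assumptions:

```python
def overlap_adjustments(target_followers: set[str],
                        target_following: set[str],
                        active_audience: set[str]) -> float:
    """Score the overlap between the target's graph and the active account's
    scanned audience. Caps are documented; ramp shapes are illustrative."""
    def ramp(count: int, cap: float, scale: int = 25) -> float:
        # Linear up to `scale` overlapping accounts, then saturated at `cap`.
        return cap * min(count, scale) / scale

    followers_overlap = len(target_followers & active_audience)
    following_overlap = len(target_following & active_audience)
    ref = max(len(active_audience), 1)

    score = ramp(followers_overlap, 0.50)
    score += ramp(following_overlap, 0.42)
    score += 0.35 * min(followers_overlap / ref, 1.0)  # share of active audience
    score += 0.28 * min(following_overlap / ref, 1.0)
    return score
```

When no graph data has been fetched, both overlap sets are empty and the function contributes nothing, which matches the metadata-only mode described above.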
4. Historical Calibration¶
After the heuristic probability is computed, Meerkit calibrates it using past confirmed outcomes for the same active/reference account.
Which past rows are eligible¶
Historical rows must be:
- Labeled `correct` or `wrong`
- Stored with feature breakdown data
- Not cases where the target already followed the account
The last exclusion matters because those cases are too obvious and would inflate historical rates in a misleading way.
Cohort keys used for matching¶
The current prediction is compared against prior labeled rows using these cohort dimensions:
- Target size bucket
- Private flag
- Professional-account flag
- Verified flag
- Mutual bucket
- Overlap followers bucket
- Overlap following bucket
- Graph-fetch status
- Whether the active account already follows the target
Smoothing and posterior calculation¶
The model computes a global historical rate first, then uses it as a prior when estimating each cohort-specific posterior.
Conceptually it does this:

\[ p_{\text{cohort}} = \frac{c + k \cdot p_{\text{global}}}{n + k} \]

where \(c\) is the cohort's count of correct outcomes, \(n\) is its total sample count, \(p_{\text{global}}\) is the global historical rate, and \(k\) is a smoothing pseudo-count.
This prevents small cohorts from overreacting to sparse data.
Why smoothing is necessary¶
Without smoothing, a cohort with 1/1 correct would look like 100%, which is not credible. Smoothing keeps the model stable when history is still limited.
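A minimal sketch of this pseudo-count smoothing; the default `pseudo_count` value is illustrative, not taken from this page:

```python
def smoothed_cohort_rate(correct: int, total: int,
                         global_rate: float, pseudo_count: float = 10.0) -> float:
    """Blend a cohort's observed rate with the global rate via pseudo-counts,
    so sparse cohorts stay close to the global prior."""
    return (correct + pseudo_count * global_rate) / (total + pseudo_count)
```

With a global rate of 40%, a 1/1 cohort lands near 45% instead of 100%, while a 90/100 cohort stays close to its observed 90%.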
How history influences the final score¶
The calibration weight grows with sample count but is capped at 0.5.
That means history can shape the final probability strongly, but it can never fully override the live heuristic.
The blended result is:

\[ p_{\text{final}} = (1 - w)\, p_{\text{heuristic}} + w\, p_{\text{historical}} \]
where:
- \(p_{\text{heuristic}}\) is the score from current target features
- \(p_{\text{historical}}\) is the calibrated cohort rate
- \(w \le 0.5\)
Why cap history at 50%¶
Current target evidence still matters. Historical outcomes are useful, but they are a calibration layer, not a replacement for the actual observed target profile.
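The capped blend can be sketched like this. Only the 0.5 cap is documented; the growth curve of the weight is an assumption:

```python
def blend_with_history(p_heuristic: float, p_historical: float,
                       sample_count: int) -> float:
    """Blend heuristic and historical probabilities; the calibration weight
    grows with sample count but never exceeds 0.5, so the live heuristic
    always keeps at least half the say."""
    w = min(0.5, sample_count / 40.0)  # assumed ramp; only the 0.5 cap is documented
    return (1.0 - w) * p_heuristic + w * p_historical
```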
5. Confidence Model¶
Confidence starts low and rises as the app gathers more evidence.
Main confidence contributors¶
- Base confidence: `0.24`
- Target profile exists: `+0.14`
- Target already follows active account: `+0.12`
- Mutual count available: up to `+0.08`
- Overlap followers available: `+0.10`
- Overlap following available: `+0.08`
- Any target graph data available: `+0.16`
- Rich biography: `+0.03`
- Highlight reels: `+0.02`
- Historical sample count at least `20`: up to `+0.16`
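The contributor list above can be sketched as an accumulator. The fixed contributions come from this page; the ramp shapes for the "up to" items and the signal key names are illustrative:

```python
def confidence(signals: dict[str, bool], mutual_count: int,
               history_samples: int) -> float:
    """Accumulate confidence from evidence availability."""
    c = 0.24                                       # base confidence
    if signals.get("profile_exists"):    c += 0.14
    if signals.get("already_follows"):   c += 0.12
    c += min(0.08, 0.02 * mutual_count)            # up to +0.08, assumed ramp
    if signals.get("overlap_followers"): c += 0.10
    if signals.get("overlap_following"): c += 0.08
    if signals.get("graph_data"):        c += 0.16
    if signals.get("rich_bio"):          c += 0.03
    if signals.get("highlights"):        c += 0.02
    if history_samples >= 20:
        c += min(0.16, 0.004 * history_samples)    # up to +0.16, assumed ramp
    return min(c, 1.0)
```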
Why confidence is separate from probability¶
Probability answers whether the target looks likely to follow back.
Confidence answers whether the app had enough evidence to make that estimate robust.
That is why metadata-only predictions can be directionally useful but still less trustworthy than overlap-backed predictions.
6. Final Guardrails¶
Meerkit applies a few explicit guardrails before returning the result.
Probability clamp¶
Final probability is clamped between 3% and 97%.
Confidence clamp¶
Confidence is also clamped into a valid normalized range.
Ambiguity band¶
Predictions between 45% and 65% are marked as ambiguous.
Why these guardrails exist¶
The system is designed for decision support, not absolute certainty. Guardrails prevent overclaiming and help the UI distinguish strong recommendations from borderline ones.
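All three guardrails fit in a few lines; the clamp bounds and ambiguity band are the documented values, while the return shape is illustrative:

```python
def finalize(probability: float, confidence: float) -> dict:
    """Apply the documented guardrails: clamp probability to [0.03, 0.97],
    clamp confidence to [0.0, 1.0], and flag the 45-65% ambiguity band."""
    p = max(0.03, min(0.97, probability))
    c = max(0.0, min(1.0, confidence))
    return {"probability": p, "confidence": c, "ambiguous": 0.45 <= p <= 0.65}
```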
7. Why The Algorithm Uses These Particular Notions¶
History-based prediction¶
Because each active account attracts a different kind of audience, the best calibration is not global platform-wide behavior. It is account-local historical behavior.
Follower/following overlap¶
Overlap is the most direct approximation of network closeness that the app can compute from cached graph data. It is stronger than simple mutual counts because it uses the active account's actual scanned audience as the reference set.
Mutual followers¶
Mutuals are still useful, but they are intentionally weaker than full overlap. They capture social adjacency but not the same depth of audience embedding.
Account size and ratio behavior¶
These features approximate whether the account behaves like:
- a reciprocal personal account
- a selective niche account
- a large asymmetric audience account
Metadata richness¶
Metadata richness mostly improves confidence because it tells the system that it is reasoning from a more complete snapshot rather than sparse profile data.
8. Reading The Output Correctly¶
A prediction should be interpreted like this:
- Probability: estimated chance of follow-back
- Confidence: how much evidence supports that estimate
- Reasons: the strongest human-readable explanations for the score
- Graph fetch status: how complete the overlap data is
- Historical reference count/rate: how much past labeled behavior influenced calibration
In other words, the answer is not just a number. It is a number plus the evidence shape behind it.