31.3.4 Profit-sharing models

2025.10.06.
AI Security Blog

The machinery of distributed jailbreak discovery, from genetic algorithms to adversarial farms, doesn’t run on volunteer effort alone. It’s fueled by a sophisticated and surprisingly formal underground economy. Profit-sharing models are the contractual and financial frameworks that incentivize thousands of individuals and their computational resources to hunt for model vulnerabilities. Understanding these models is crucial, as they directly dictate the type, quality, and persistence of the attacks you will face.

At its core, a profit-sharing model is a system for distributing revenue generated from a successful jailbreak among the parties who contributed to its discovery. It transforms the abstract goal of “finding a vulnerability” into a tangible financial reward, creating a market-driven incentive structure that mirrors legitimate gig economies.

Categorization of Incentive Models

While specific implementations vary, most profit-sharing arrangements in the jailbreak economy fall into one of several primary categories. Each model encourages different behaviors and produces different types of exploits.

1. Pay-per-Discovery (PPD)

This is the simplest model. A contributor submits a working jailbreak prompt and, upon verification, receives a one-time, fixed payment. The marketplace or operator then owns the exploit and can monetize it as they see fit.

  • Incentive: Quantity over quality. Encourages rapid, high-volume submission of any working exploit, regardless of its durability or sophistication.
  • Typical Output: Simple, often brittle jailbreaks that may be quickly patched. Ideal for operators who need a constant stream of new, disposable prompts.

2. Royalty-based Revenue Share

In this more sophisticated model, the discoverer receives a percentage of the revenue generated from their specific jailbreak. This could be a share of subscription fees from users accessing the jailbreak, or a cut from its use in a malicious service (e.g., a “misinformation-as-a-service” platform).

  • Incentive: Quality, durability, and high value. Contributors are motivated to find robust, hard-to-patch jailbreaks that target high-value capabilities (e.g., generating malicious code, bypassing stringent content filters).
  • Typical Output: Complex, multi-step, and adaptive jailbreaks. The financial reward is tied to the exploit’s long-term utility.

3. Contributor Pool & Subscription Model

This model socializes the risk and reward. All contributors submit their findings to a central pool. The operator bundles these jailbreaks and sells access via a subscription service. The total revenue is then distributed among all contributors, often weighted by the volume, quality, or usage statistics of their submissions.

  • Incentive: Consistent contribution and collaboration. It encourages a steady flow of exploits to keep the subscription service valuable, while mitigating the risk of a single high-value exploit being patched and cutting off a contributor’s income.
  • Typical Output: A diverse mix of jailbreaks, ranging from simple to complex, covering multiple models and use cases.
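
The weighted split described above can be sketched as follows. This is a minimal illustration, not a description of any real operator's system: the function name `split_pool` and the idea of collapsing volume, quality, and usage into a single numeric weight per contributor are assumptions made for the example.

```python
def split_pool(total_revenue, weights):
    """Divide pooled revenue among contributors in proportion to their weight.

    `weights` maps a contributor ID to a non-negative score. How that score
    combines volume, quality, and usage is an assumption left to the caller.
    """
    total_weight = sum(weights.values())
    if total_weight <= 0:
        # Nothing to weight by, so nobody is paid from the pool.
        return {contributor: 0.0 for contributor in weights}
    return {
        contributor: total_revenue * weight / total_weight
        for contributor, weight in weights.items()
    }


# Hypothetical figures: contributor "a" carries three times the weight of "b".
shares = split_pool(1000, {"a": 3, "b": 1})
```

Because every contributor is paid from the same pool, the loss of any single exploit changes the weights only slightly, which is the risk-spreading property the model relies on.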

4. Targeted Bounty Model

This functions like a traditional bug bounty program but is operated by malicious actors. A marketplace will post a specific challenge with a large, fixed prize. For example: “10,000 USDT for the first verifiable jailbreak that forces Model-Y to generate functional phishing email templates.”

  • Incentive: Focused, high-skill effort on a specific, valuable target. Attracts more skilled actors who might not participate in lower-paying models.
  • Typical Output: Highly specific and potent exploits designed to defeat a particular defense or enable a particular malicious application.

Operational Mechanics

Implementing these models requires a surprising degree of technical infrastructure for tracking contributions and calculating payouts. Payments are almost always made in cryptocurrency to preserve anonymity.

Comparison of Profit-Sharing Models

| Model | Incentive Structure | Best For (Operator) | Complexity | Risk for Contributor |
| --- | --- | --- | --- | --- |
| Pay-per-Discovery | Fixed fee per valid exploit | Generating high volume of exploits | Low | Low (quick payout) |
| Royalty-based | Percentage of ongoing revenue | Acquiring durable, high-quality exploits | High | High (payout depends on exploit lifetime) |
| Contributor Pool | Share of total subscription revenue | Building a resilient, diverse library | Medium | Medium (diversified risk) |
| Targeted Bounty | Large, one-time prize for specific goal | Solving a specific, high-value problem | Low | Very High (winner-take-all) |

Contribution tracking is paramount. When a user in a distributed network finds a potential jailbreak, their client software typically hashes the prompt and other metadata, associating it with their unique worker ID. This creates a verifiable claim. The central system then tests the jailbreak, and if successful, logs it against the contributor’s account.
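
The hash-based claim described above is an ordinary commitment scheme, and a generic version can be sketched in a few lines. Everything specific here is an assumption for illustration: the field names, the choice of SHA-256, and the canonical JSON encoding are not taken from any observed tool.

```python
import hashlib
import hmac
import json


def make_claim(worker_id, submission_text, timestamp):
    """Return a hex digest binding a submission to a worker ID and a time."""
    record = {
        "worker_id": worker_id,
        "submission": submission_text,
        "timestamp": timestamp,
    }
    # Sorted keys and fixed separators make the encoding deterministic,
    # so the same inputs always produce the same digest.
    canonical = json.dumps(record, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def verify_claim(claim, worker_id, submission_text, timestamp):
    """Recompute the digest and compare it to the claim in constant time."""
    expected = make_claim(worker_id, submission_text, timestamp)
    return hmac.compare_digest(claim, expected)


claim = make_claim("worker-17", "example submission", "2025-10-06T12:00:00Z")
```

The digest lets a contributor register a claim first and reveal the underlying text later. Any change to the text, the worker ID, or the timestamp produces a different digest, which is what makes the claim verifiable.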

Payout calculations, especially for royalty models, can be complex. The system must track usage metrics for each jailbreak to determine its revenue contribution.

# Sketch of a simple royalty payout calculation (Python)
# The three mappings stand in for the operator's usage, ownership, and rate lookups.
from collections import defaultdict


def calculate_payouts(total_revenue, jailbreak_usages, contributor_of, royalty_rate_of):
    """Split one month's revenue across contributors in proportion to usage."""
    payouts = defaultdict(float)
    total_monthly_uses = sum(jailbreak_usages.values())
    if total_monthly_uses == 0:
        return {}  # no recorded usage, so there is nothing to attribute

    for jailbreak_id, uses in jailbreak_usages.items():
        contributor_id = contributor_of[jailbreak_id]
        royalty_rate = royalty_rate_of[contributor_id]  # e.g., 0.3 for 30%

        # Calculate this jailbreak's share of total usage
        usage_share = uses / total_monthly_uses

        # Attribute revenue, then apply the contributor's royalty rate
        revenue_share = total_revenue * usage_share
        payouts[contributor_id] += revenue_share * royalty_rate

    return dict(payouts)
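
Written as a formula, the calculation above amounts to the following. The symbols are notation introduced here for illustration: $P_c$ is the payout to contributor $c$, $r_c$ is that contributor's royalty rate, $R$ is the total revenue for the month, $u_j$ is the usage count of jailbreak $j$, and $J_c$ is the set of jailbreaks attributed to $c$.

```latex
\[
P_c \;=\; r_c \cdot R \cdot \frac{\sum_{j \in J_c} u_j}{\sum_{k} u_k}
\]
```

As a worked example with made-up figures: if $R = 10{,}000$, a contributor's single jailbreak accounts for 300 of 1,000 total uses, and $r_c = 0.3$, then $P_c = 0.3 \times 10{,}000 \times 0.3 = 900$. Because each royalty rate is below 1, the payouts sum to less than $R$, and the remainder stays with the operator.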
                

Implications for Red Teaming

For a red teamer, the prevailing profit-sharing model in the underground offers predictive intelligence. If your threat modeling suggests adversaries are motivated by royalty-based schemes, you should anticipate and test for highly evasive, complex, and durable attacks. These attackers have a vested interest in their exploits remaining undetected for as long as possible.

Conversely, if the market is dominated by pay-per-discovery models, you should prepare for a high-volume “picket-fence” style of attack: a constant barrage of simple, low-effort probes looking for any weakness. Your defenses would then need to emphasize rapid detection and patching of shallow vulnerabilities, not only protection against deep, persistent threats. Understanding the enemy’s economic incentives allows you to predict their behavior more accurately and prioritize your defensive strategy.
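
One way a red team might put this reasoning to work is a small lookup from incentive model to expected attack profile and testing focus. The sketch below is illustrative only. The profile descriptions paraphrase this section, while the `test_focus` entries for the contributor pool and targeted bounty models, the `market_share` estimates, and all names are assumptions that go beyond what the text states.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AttackProfile:
    typical_output: str
    test_focus: str


PROFILES = {
    "pay_per_discovery": AttackProfile(
        "simple, brittle, high-volume probes",
        "rapid detection and patching of shallow vulnerabilities",
    ),
    "royalty": AttackProfile(
        "complex, multi-step, durable exploits",
        "evasive, persistent attacks built to stay undetected",
    ),
    # The two test_focus values below are assumptions, not from the text.
    "contributor_pool": AttackProfile(
        "diverse mix across models and use cases",
        "broad coverage across models and use cases",
    ),
    "targeted_bounty": AttackProfile(
        "highly specific, potent exploits",
        "the particular defenses or capabilities likely to attract a bounty",
    ),
}


def rank_test_focus(market_share):
    """Order testing priorities by the estimated share of each incentive model."""
    ranked = sorted(market_share.items(), key=lambda item: item[1], reverse=True)
    return [(model, PROFILES[model].test_focus) for model, _ in ranked if model in PROFILES]


# Hypothetical threat-model estimate: royalty schemes dominate.
plan = rank_test_focus({"pay_per_discovery": 0.3, "royalty": 0.6, "targeted_bounty": 0.1})
```

The ranking is only as good as the market-share estimates fed into it, so it is better treated as a prompt for discussion during threat modeling than as an automated decision.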