Daniel Miessler e7d47a7405 feat: Add Answer First schema and revise knowledge worker estimates
- Create executive summary (SUMMARY.md) with narrative overview
- Revise global estimate from $50-70T to $35-50T (math validation)
- Add DATASET-TEMPLATE.md for future datasets
- Clarify McKinsey $5-7T measures automation impact, not compensation
- Add confidence levels, changelog, and supporting documentation links

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-10 14:25:22 -08:00


Knowledge Worker Global Compensation - Research Compilation

Source ID: DS-00005
Record Created: 2025-10-25
Last Updated: 2025-12-10
Cataloger: Substrate Data Curation
Review Status: Reviewed


🎯 CURRENT BEST ESTIMATE

| Metric | Value | Confidence |
|---|---|---|
| Global Knowledge Worker Compensation | $35-50 trillion/year | 65% |
| U.S. (Professional Definition) | $6-8 trillion/year | 95% |
| U.S. (Broad Definition) | $10-12 trillion/year | 85% |

December 2025 Revision: Global estimate reduced from $50-70T to $35-50T after mathematical validation against global labor share (~$58T total labor compensation). The $70T upper bound exceeded plausible labor share.
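The labor share check described in the revision note can be sketched as a short calculation. This is a minimal illustration that uses only the figures quoted above (~$58T total labor compensation, and the old and new ranges); the function names are hypothetical and are not taken from the research system itself.

```python
# Figures quoted in the revision note, in trillions of USD per year.
TOTAL_LABOR_COMPENSATION_T = 58.0  # approximate global labor compensation


def share_of_labor(estimate_t: float, total_t: float = TOTAL_LABOR_COMPENSATION_T) -> float:
    """Return an estimate as a fraction of total global labor compensation."""
    return estimate_t / total_t


def is_plausible(estimate_t: float, total_t: float = TOTAL_LABOR_COMPENSATION_T) -> bool:
    """Knowledge worker pay is a subset of all labor pay, so it cannot exceed the total."""
    return 0 < estimate_t <= total_t


for label, low, high in [("previous", 50.0, 70.0), ("revised", 35.0, 50.0)]:
    print(
        f"{label}: {share_of_labor(low):.0%} to {share_of_labor(high):.0%} "
        f"of labor compensation; upper bound plausible: {is_plausible(high)}"
    )
```

Under these inputs the previous $70T upper bound works out to roughly 121% of total labor compensation, which is why it fails the check, while the revised $50T upper bound is roughly 86%.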


Bibliographic Information

Title Statement

  • Main Title: Knowledge Worker Global Compensation: Summary Table (2024-2025)
  • Subtitle: Multi-Source Research Compilation on Global Knowledge Worker Salaries
  • Abbreviated Title: Knowledge Worker Compensation
  • Variant Titles: Global Tech Salaries, Knowledge Economy Compensation Data

Responsibility Statement

  • Publisher/Issuing Body: Substrate Data Curation (Kai Personal AI Infrastructure)
  • Department/Division: Multi-Agent Research System
  • Contributors: 10 parallel AI research agents (Perplexity, Claude, Gemini researchers)
  • Contact Information: Research compiled via automated research system

Publication Information

  • Place of Publication: Digital research compilation
  • Date of First Publication: 2025-10-19
  • Publication Frequency: On-demand research updates
  • Current Status: Active

Edition/Version Information

  • Current Version: 2025-12-10 research snapshot (revised)
  • Version History:
    • 2025-12-10: Revised global estimate from $50-70T to $35-50T based on labor share validation
    • 2025-11-25: 40-agent comprehensive synthesis with Bayesian reconciliation
    • 2025-10-19: Initial research compilation
  • Versioning Scheme: Date-based research snapshots

Authority Statement

Organizational Authority

Issuing Organization Analysis:

  • Official Name: Substrate Data Curation (Kai Personal AI Infrastructure)
  • Type: Research compilation via multi-agent AI system
  • Established: 2025
  • Mandate: Aggregate authoritative compensation data from multiple sources
  • Parent Organization: Independent research project
  • Governance Structure: Automated research with human validation

Domain Authority:

  • Subject Expertise: Aggregation of authoritative salary data from government agencies, consulting firms, industry reports
  • Recognition: Synthesizes data from BLS (U.S. Bureau of Labor Statistics), OECD, ILO, major consulting firms
  • Publication History: Initial research compilation (2025)
  • Peer Recognition: Sources include recognized authorities (BLS, OECD, Dice, Glassdoor, Robert Half)

Quality Oversight:

  • Peer Review: Multi-agent cross-validation (10 parallel agents)
  • Editorial Board: Human review of aggregated findings
  • Scientific Committee: Source validation against authoritative data providers
  • External Audit: Not applicable (research compilation)
  • Certification: None (data aggregation from certified sources)

Independence Assessment:

  • Funding Model: Independent research project
  • Political Independence: No political affiliations
  • Commercial Interests: None (non-commercial research)
  • Transparency: Full source attribution; research methodology documented

Data Authority

Provenance Classification:

  • Source Type: Tertiary (aggregates secondary and primary sources)
  • Data Origin: Multi-source aggregation from government agencies, consulting firms, industry reports
  • Chain of Custody: Primary sources (BLS, OECD, ILO, consulting firms) → AI research agents → Data synthesis → Public documentation

Tertiary Source Characteristics:

  • Synthesizes data from 20+ primary and secondary sources
  • Adds value through cross-validation and regional aggregation
  • Provides confidence levels based on source quality
  • Identifies data gaps and methodological limitations

Scope Note

Content Description

Subject Coverage:

  • Primary Subjects: Labor Economics, Knowledge Worker Compensation, Technology Sector Salaries, Global Workforce Statistics
  • Secondary Subjects: AI/ML Premium Roles, Regional Salary Comparisons, Freelance Knowledge Work
  • Subject Classification:
    • LC: HD (Labor Economics), HB (Economic Theory)
    • Dewey: 331.2 (Labor Economics - Wages)
  • Keywords: knowledge workers, technology salaries, global compensation, tech workers, AI/ML roles, software engineer salaries, consulting compensation, STEM workers, professional services salaries

Geographic Coverage:

  • Spatial Scope: Global (U.S., Europe, Asia-Pacific, Latin America)
  • Countries/Regions Included: United States, Switzerland, Denmark, Germany, Singapore, Eastern Europe, China, India, Latin America
  • Geographic Granularity: Country and regional aggregates
  • Coverage Completeness: U.S. high (85% confidence); Global medium (65% confidence)
  • Notable Exclusions: Africa, Middle East (limited data availability)

Temporal Coverage:

  • Start Date: 2024 (primary data year)
  • End Date: 2025 (latest projections)
  • Historical Depth: 1-2 years (snapshot with growth trends)
  • Frequency of Observations: Annual salary data
  • Temporal Granularity: Year-level
  • Time Series Continuity: Not applicable (single research snapshot)

Population/Cases Covered:

  • Target Population: Global knowledge workers (estimated at more than 1 billion)
  • Inclusion Criteria: Knowledge-intensive roles (software engineers, data scientists, consultants, healthcare professionals, financial analysts)
  • Exclusion Criteria: Manual labor, routine clerical work, retail service workers
  • Coverage Rate: U.S. ~100 million workers (38-42% of workforce); more than 1 billion globally
  • Sample vs. Census: Aggregation of sample surveys and employment statistics

Variables/Indicators:

  • Number of Variables: 15+ compensation and workforce metrics
  • Core Indicators:
    • Total annual compensation by geography and sector
    • Average/median salaries by role
    • Year-over-year growth rates
    • Workforce size estimates
    • AI/ML premium percentages
    • Freelance knowledge worker statistics
  • Derived Variables: Total market value calculations, regional averages, growth trend projections
  • Data Dictionary Available: Yes - see knowledge-worker-compensation-data.md

Content Boundaries

What This Source IS:

  • Comprehensive multi-source aggregation of knowledge worker compensation data
  • Best available synthesis of U.S. knowledge worker salaries (high confidence)
  • Useful global overview with regional estimates (medium confidence)
  • Identifies data gaps and methodological limitations transparently

What This Source IS NOT:

  • NOT primary salary survey data (use BLS OEWS, Dice, Glassdoor directly)
  • NOT real-time (2024-2025 snapshot; annual updates required)
  • NOT granular below country/regional level (no city-specific data)
  • NOT comprehensive for all countries (Africa, Middle East gaps)
  • NOT peer-reviewed academic research (research compilation)

Comparison with Similar Sources:

| Source | Advantages Over This Source | Disadvantages vs. This Source |
|---|---|---|
| BLS OEWS (U.S. only) | Primary government data; high accuracy | U.S. only; no global coverage; annual lag |
| OECD Average Wages | Official international data; high credibility | Country-level only; no occupation-specific knowledge worker data |
| Dice Tech Salary Report | Tech sector depth; annual trends | U.S. tech only; narrow scope |
| Glassdoor Salaries | User-generated; granular roles | Self-reported bias; quality varies |
| Payscale/Robert Half | Consulting firm depth; market insights | Subscription models; narrower scope |

Access Conditions

Technical Access

API Information:

  • Endpoint URL: N/A (static research compilation)
  • API Type: N/A
  • API Version: N/A
  • OpenAPI/Swagger Spec: N/A
  • SDKs/Libraries: N/A

Authentication:

  • Authentication Required: No
  • Authentication Type: None (public research documentation)
  • Registration Process: Not applicable
  • Approval Required: No
  • Approval Timeframe: N/A

Rate Limits:

  • Not applicable (static document)

Query Capabilities:

  • Not applicable (static document)

Data Formats:

  • Available Formats: Markdown (knowledge-worker-compensation-data.md)
  • Format Quality: Structured markdown tables
  • Compression: Not compressed
  • Encoding: UTF-8
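Because the data ships as structured markdown tables rather than through an API, consumers will likely need to parse the tables themselves. The sketch below assumes simple pipe-delimited tables with a header row and a separator row; the exact layout of knowledge-worker-compensation-data.md is an assumption, and the sample rows reuse the best-estimate figures from this record purely for illustration.

```python
def parse_markdown_table(text: str) -> list[dict[str, str]]:
    """Parse a simple pipe-delimited markdown table into a list of row dicts.

    Assumes one header row, one |---| separator row, then data rows.
    Escaped pipes inside cells are not handled.
    """
    rows = []
    for line in text.strip().splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue  # skip anything that is not a table row
        rows.append([cell.strip() for cell in line.strip("|").split("|")])
    if len(rows) < 2:
        return []
    header, body = rows[0], rows[2:]  # rows[1] is the separator row
    return [dict(zip(header, row)) for row in body]


# Illustrative sample only; the real file layout may differ.
SAMPLE = """
| Metric | Value | Confidence |
|---|---|---|
| Global Knowledge Worker Compensation | $35-50 trillion/year | 65% |
| U.S. (Professional Definition) | $6-8 trillion/year | 95% |
"""

records = parse_markdown_table(SAMPLE)
for record in records:
    print(record["Metric"], "->", record["Value"])
```

To parse the actual file, read it with `encoding="utf-8"`, as stated above, and pass the text to the same function.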

Download Options:

  • Bulk Download: Yes - markdown file
  • Streaming API: No
  • FTP/SFTP: No
  • Torrent: No
  • Data Dumps: Single markdown document

Reliability Metrics:

  • Not applicable (static research document)

Legal/Policy Access

License:

  • License Type: Research compilation (individual sources retain original licenses)
  • License Version: N/A
  • License URL: See individual source licenses (BLS: public domain, OECD: CC-BY, etc.)
  • SPDX Identifier: Mixed (varies by source)

Usage Rights:

  • Redistribution Allowed: Yes (with source attribution)
  • Commercial Use Allowed: Check individual source licenses
  • Modification Allowed: Yes (with source attribution)
  • Attribution Required: Yes - cite original sources
  • Share-Alike Required: No

Cost Structure:

  • Access Cost: Free

Terms of Service:

  • TOS URL: N/A (research compilation)
  • Key Restrictions: Cite original sources; verify currency of data
  • Liability Disclaimers: Research compilation "as is"; users responsible for validating currency
  • Privacy Policy: No personal data collected

Collection Development Policy Fit

Relevance Assessment

Substrate Mission Alignment:

  • Human Progress Focus: Knowledge worker compensation central to understanding economic progress and human capital value
  • Problem-Solution Connection:
    • Links to Problems: Wage stagnation, skills gaps, labor market inefficiencies
    • Links to Solutions: Education investment, skills development, labor mobility
  • Evidence Quality: Medium-High for U.S. (85% confidence); Medium for global (65% confidence)

Collection Priorities Match:

  • Priority Level: IMPORTANT - valuable for labor economics and human capital domain
  • Uniqueness: Best available multi-source synthesis of global knowledge worker compensation
  • Comprehensiveness: Fills critical gap for global knowledge economy salary data

Comparison with Holdings

Overlapping Sources:

  • None currently in Substrate

Unique Contribution:

  • Only global knowledge worker compensation dataset in Substrate
  • Multi-source validation (20+ authoritative sources)
  • Identifies data gaps and confidence levels transparently
  • U.S. + global coverage in single compilation

Preferred Use Cases:

  • When global knowledge worker compensation overview needed
  • Cross-country salary comparisons
  • Understanding knowledge economy labor market
  • AI/ML skills premium analysis

Technical Specifications

Data Model

Schema Documentation:

  • Schema Type: Markdown tables (structured text)
  • Schema URL: knowledge-worker-compensation-data.md
  • Schema Version: 2025-10-19

Entity Types:

  • Geographic regions (U.S., Global, countries)
  • Sectors (Technology, Finance, Healthcare, Professional Services)
  • Roles (Software Engineer, Data Scientist, Consultant, etc.)
  • Compensation metrics (average, median, growth rates)

Key Relationships:

  • Geography → Sector → Role → Compensation
  • Time → Geography → Growth Rate

Primary Keys:

  • Composite: (Geography, Sector, Role, Year)

Foreign Keys:

  • Not applicable (summary tables)
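One way to picture the data model above is as records keyed by the composite (Geography, Sector, Role, Year). The following is a hypothetical in-memory sketch; the class and field names are illustrative assumptions, since the source only defines the model as markdown tables.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class CompensationKey:
    """Composite primary key: (Geography, Sector, Role, Year)."""
    geography: str
    sector: str
    role: str
    year: int


@dataclass
class CompensationRecord:
    """Compensation metrics for one key; field names are hypothetical."""
    key: CompensationKey
    average_usd: Optional[float] = None
    median_usd: Optional[float] = None
    yoy_growth_pct: Optional[float] = None


class CompensationTable:
    """Stores records and enforces uniqueness of the composite key."""

    def __init__(self) -> None:
        self._rows: dict[CompensationKey, CompensationRecord] = {}

    def add(self, record: CompensationRecord) -> None:
        if record.key in self._rows:
            raise ValueError(f"duplicate key: {record.key}")
        self._rows[record.key] = record

    def get(self, geography: str, sector: str, role: str, year: int) -> Optional[CompensationRecord]:
        return self._rows.get(CompensationKey(geography, sector, role, year))
```

Making the key a frozen dataclass keeps it hashable, so the Geography → Sector → Role → Compensation relationship reduces to a single dictionary lookup.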

Metadata Standards Compliance

Standards Followed:

  • Dublin Core
  • DCAT - not applicable
  • Schema.org - not applicable
  • SDMX - not applicable
  • DDI - not applicable
  • ISO 19115 - not applicable
  • MARC - not applicable

Metadata Quality:

  • Completeness: 80% (source attribution comprehensive; some estimates)
  • Accuracy: High for U.S. (85% confidence); Medium for global (65% confidence)
  • Consistency: Good - standardized table format

API Documentation Quality

Documentation Assessment:

  • Not applicable (static research document)

Source Evaluation Narrative

Methodological Assessment

Data Collection Methodology:

Sampling Design:

  • Method: Multi-source aggregation (not original sampling)
  • Sample Size: Varies by source (BLS: millions of establishments; Dice: survey data)
  • Sampling Frame: Aggregates from multiple sampling frames
  • Stratification: By source methodology (varies)
  • Weighting: Not applicable (summary compilation)

Data Collection Instruments:

  • Instrument Type: AI research agents querying authoritative sources
  • Validation: Multi-agent cross-validation (10 parallel agents)
  • Question Wording: Not applicable (secondary data aggregation)
  • Mode: Web-based research + API queries

Quality Control Procedures:

  • Field Supervision: Automated multi-agent validation
  • Validation Rules: Cross-source consistency checks
  • Consistency Checks: Regional and sector-level validation
  • Verification: Human review of aggregated findings
  • Outlier Treatment: Flagged inconsistencies noted in research

Error Characteristics:

  • Sampling Error: Varies by source (BLS low; industry surveys higher)
  • Non-sampling Error: Aggregation errors; definitional differences across sources
  • Known Biases: U.S.-centric (better data coverage); self-reported bias in some sources
  • Accuracy Bounds: U.S. ±5-10% (high confidence); Global ±15-30% (medium confidence)

Methodology Documentation:

  • Transparency Level: 4/5 (Comprehensive)
  • Documentation URL: knowledge-worker-compensation-data.md (full source attribution)
  • Peer Review Status: Multi-agent validation; not academic peer review
  • Reproducibility: SPARQL/API queries documented; research snapshot date-stamped

Currency Assessment

Update Characteristics:

  • Update Frequency: On-demand research updates (not automated)
  • Update Reliability: Requires manual re-research
  • Update Notification: None (static snapshot)
  • Last Updated: 2025-12-10

Timeliness:

  • Collection to Publication Lag: Varies by source (BLS ~6 months; Dice ~1 month; OECD ~1 year)
  • Factors Affecting Timeliness: Annual salary survey cycles; government reporting schedules
  • Historical Timeliness: Single snapshot (not time series)

Currency for Different Uses:

  • Real-time Analysis: Unsuitable (static snapshot)
  • Recent Trends: Suitable for 2024-2025 estimates
  • Historical Research: Not applicable (single snapshot)

Objectivity Assessment

Potential Biases:

Political Bias:

  • Government Influence: Government sources (BLS, OECD) operate with professional independence
  • Editorial Stance: Research compilation is neutral
  • Political Pressure: None

Commercial Bias:

  • Funding Sources: Consulting firms (Robert Half, Payscale) have commercial interests in salary data
  • Advertising Influence: Not applicable
  • Proprietary Interests: Some sources proprietary (Glassdoor, Payscale)

Cultural/Social Bias:

  • Geographic Bias: U.S.-centric (better data coverage); Western Europe overrepresented
  • Social Perspective: Knowledge worker focus excludes non-knowledge sectors
  • Language Bias: English-language sources predominate
  • Selection Bias: "Knowledge worker" definition varies by source

Transparency:

  • Bias Disclosure: Data gaps acknowledged; confidence levels provided
  • Limitations Stated: Comprehensive limitation documentation
  • Raw Data Available: Source links provided; original data at individual sources

Reliability Assessment

Consistency:

  • Internal Consistency: Cross-source validation performed
  • Temporal Consistency: Not applicable (single snapshot)
  • Cross-source Consistency: Reasonable agreement for U.S. (±10%); wider variation globally
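A cross-source consistency check of this kind can be sketched as a tolerance test. The source does not say which reference point the ±10% is measured against, so this example assumes the median across sources; the source names and values below are hypothetical placeholders, not figures from the dataset.

```python
from statistics import median


def within_tolerance(values: dict[str, float], tolerance: float = 0.10) -> dict[str, bool]:
    """Flag each source as agreeing if it lies within +/- tolerance of the median.

    Using the median as the reference point is an assumption of this sketch.
    """
    center = median(values.values())
    return {name: abs(value - center) <= tolerance * center for name, value in values.items()}


# Hypothetical placeholder values for one role, in thousands of USD.
example = {"source_a": 100.0, "source_b": 108.0, "source_c": 125.0}
agreement = within_tolerance(example)
print(agreement)
```

A source flagged `False` would be a candidate for the outlier treatment described under Quality Control Procedures rather than being silently averaged in.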

Stability:

  • Definition Changes: "Knowledge worker" definition varies by source
  • Methodology Changes: Not applicable (single snapshot)
  • Series Breaks: Not applicable

Verification:

  • Independent Verification: 10 parallel AI agents cross-validated findings
  • Replication Studies: Not applicable (research compilation)
  • Audit Results: Multi-agent validation; human review

Accuracy Assessment

Validation Evidence:

  • Benchmark Comparisons: U.S. BLS data cross-checked with Dice, Glassdoor (±10% agreement)
  • Coverage Assessments: U.S. 85% confidence; Global 65% confidence
  • Error Studies: Not applicable

Accuracy for Different Uses:

  • Point Estimates: Reliable for U.S. averages; moderate for global
  • Trend Analysis: Limited (single snapshot with YoY growth rates)
  • Cross-sectional Comparison: Reliable for U.S. sectors; moderate for cross-country
  • Sub-population Analysis: Limited (sector-level; no demographics)

Known Limitations and Caveats

Coverage Limitations

Geographic Gaps:

  • Africa (minimal data available)
  • Middle East (limited coverage)
  • Rural areas (knowledge workers concentrated in urban centers)

Temporal Gaps:

  • Single snapshot (2024-2025)
  • No historical time series

Population Exclusions:

  • Non-knowledge workers (by design)
  • Informal economy knowledge workers
  • Gig economy workers (partial coverage)

Variable Gaps:

  • Equity compensation not captured in most sources (stock options, RSUs)
  • Benefits variation across countries
  • Demographic breakdowns limited

Methodological Limitations

Sampling Limitations:

  • Varies by source (BLS high quality; self-reported surveys lower quality)
  • Self-selection bias in Glassdoor, Payscale user-generated data

Measurement Limitations:

  • "Knowledge worker" definition inconsistent across sources
  • Purchasing power parity adjustments not uniformly applied
  • Exchange rate fluctuations affect cross-country comparisons

Processing Limitations:

  • Aggregation across sources with different methodologies
  • Confidence levels estimated (not statistically rigorous)

Comparability Limitations

Cross-national Comparability:

  • Definitional differences (what constitutes "knowledge worker")
  • PPP adjustments needed for true cost-of-living comparisons
  • Tax and benefit systems vary (gross vs. net compensation)
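The PPP point can be made concrete with a small conversion sketch. It uses the standard definition of a PPP conversion factor (local currency units per international dollar); the salary, exchange rate, and PPP factor below are hypothetical placeholders, and real factors would need to come from an official source such as the World Bank or OECD.

```python
def to_nominal_usd(amount_local: float, exchange_rate: float) -> float:
    """Convert a local-currency amount to USD at the market exchange rate.

    exchange_rate is expressed as local currency units per USD.
    """
    if exchange_rate <= 0:
        raise ValueError("exchange_rate must be positive")
    return amount_local / exchange_rate


def to_ppp_dollars(amount_local: float, ppp_factor: float) -> float:
    """Convert a local-currency amount to international (PPP) dollars.

    ppp_factor is expressed as local currency units per international dollar.
    """
    if ppp_factor <= 0:
        raise ValueError("ppp_factor must be positive")
    return amount_local / ppp_factor


# Hypothetical placeholders, not real rates for any country.
salary_local = 1_000_000.0
print("nominal USD:", to_nominal_usd(salary_local, exchange_rate=80.0))
print("PPP dollars:", to_ppp_dollars(salary_local, ppp_factor=20.0))
```

With these placeholder inputs the same salary is 12,500 in nominal USD but 50,000 in PPP dollars, which illustrates how far unadjusted cross-country comparisons can mislead.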

Temporal Comparability:

  • Single snapshot (cannot assess trends)

Sub-group Comparability:

  • Limited demographic data (gender, race, education level)

Usage Caveats

Inappropriate Uses:

  1. DO NOT use for precise individual salary negotiations - broad averages only
  2. DO NOT assume global estimates are highly accurate - medium confidence (65%)
  3. DO NOT use for historical trend analysis - single snapshot
  4. DO NOT assume equity compensation is included - most sources report cash compensation only
  5. DO NOT use without PPP adjustment - cross-country comparisons need cost-of-living adjustment

Ecological Fallacy Risks:

  • National/sector averages do not reflect individual company or role compensation
  • Regional averages mask within-country variation

Correlation vs. Causation:

  • Compensation levels do not imply causation
  • Appropriate for descriptive analysis only

Ideal Applications

Research Questions Well-Suited:

  1. "What is the global knowledge worker compensation market size?"
  2. "How do U.S. tech salaries compare to European tech salaries?"
  3. "What is the AI/ML skills premium in 2025?"
  4. "What percentage of the U.S. workforce are knowledge workers?"

Analysis Types Supported:

  • Descriptive statistics (compensation averages by geography/sector)
  • Cross-country comparison (regional salary differences)
  • Sector analysis (technology vs. finance vs. healthcare)
  • Skills premium analysis (AI/ML vs. general software engineering)

Appropriate Contexts

Geographic Contexts:

  • U.S. national analysis (high confidence)
  • Western Europe comparison (medium confidence)
  • Global overview (medium confidence)

Temporal Contexts:

  • Current snapshot (2024-2025)
  • Short-term growth trends (YoY)

Subject Contexts:

  • Knowledge economy labor markets
  • Technology sector compensation
  • STEM workforce analysis
  • Consulting and professional services

Use Warnings

Avoid Using This Source For:

  1. Individual salary negotiation → Use role-specific Glassdoor, Payscale
  2. Historical trend analysis → Single snapshot; need time series data
  3. Precise global estimates → Medium confidence (65%); use OECD, ILO for official stats
  4. Equity compensation analysis → Most sources exclude stock options/RSUs
  5. Demographic analysis → Limited demographic breakdowns

Recommended Alternatives For:

  • U.S. official statistics → BLS OEWS (Occupational Employment and Wage Statistics)
  • Global official statistics → OECD Average Wages, ILO Global Wage Report
  • Tech sector depth → Dice Tech Salary Report, Stack Overflow Developer Survey
  • Equity compensation → Carta Equity Report, Glassdoor total compensation
  • Real-time data → Glassdoor, Payscale (updated continuously)

Citation

Preferred Citation Format

APA 7th: Substrate Data Curation. (2025, October 19). Knowledge worker global compensation: Summary table (2024-2025) [Research compilation]. https://github.com/danielmiessler/substrate

Chicago 17th: Substrate Data Curation. "Knowledge Worker Global Compensation: Summary Table (2024-2025)." Research compilation. October 19, 2025. https://github.com/danielmiessler/substrate.

MLA 9th: Substrate Data Curation. Knowledge Worker Global Compensation: Summary Table (2024-2025). Research compilation, 19 Oct. 2025, github.com/danielmiessler/substrate.

Vancouver: Substrate Data Curation. Knowledge worker global compensation: summary table (2024-2025) [Internet]. Research compilation; 2025 Oct 19 [cited 2025 Oct 25]. Available from: https://github.com/danielmiessler/substrate

BibTeX:

@misc{substrate_knowledge_worker_2025,
  author = {{Substrate Data Curation}},
  title = {Knowledge Worker Global Compensation: Summary Table (2024-2025)},
  year = {2025},
  month = {October},
  howpublished = {Research compilation},
  url = {https://github.com/danielmiessler/substrate},
  note = {Accessed: 2025-10-25; Multi-source aggregation via 10 parallel AI research agents}
}

Data Citation Principles

Following FORCE11 Data Citation Principles:

  • Importance: Research compilation is citable output; cite original sources when possible
  • Credit and Attribution: Citations credit both compilation and original sources (BLS, OECD, etc.)
  • Evidence: Citations enable readers to verify compensation claims
  • Unique Identification: Date + version for exact reproducibility
  • Access: Citation provides access to research compilation
  • Persistence: Static snapshot preserved; future updates versioned
  • Specificity and Verifiability: Research date ensures snapshot reproducibility
  • Interoperability: Standard citation formats for reference managers
  • Flexibility: Adaptable to various research contexts

IMPORTANT: Always cite original sources (BLS, OECD, Dice, etc.) for primary data claims. This compilation provides aggregated overview with source attribution.


Version History

Current Version

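  • Version: 2025-12-10 Research Snapshot (revised)
  • Date: 2025-12-10
  • Changes: Revised global estimate from $50-70T to $35-50T based on labor share validation

Previous Versions

  • 2025-11-25: 40-agent comprehensive synthesis with Bayesian reconciliation
  • 2025-10-19: Initial multi-source research compilation (10 parallel AI agents)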

Review Log

Internal Reviews

  • Date: 2025-10-25 | Reviewer: Substrate Data Curation | Status: Approved | Notes: Initial catalog entry; research compilation with transparent methodology

Quality Checks

  • Last Metadata Validation: 2025-10-25
  • Last Authority Verification: 2025-10-25 (source attribution verified)
  • Last Link Check: 2025-10-25
  • Last Access Test: 2025-10-25 (markdown file accessible)

Cross-References

Related Substrate Entities:

  • Problems:
    • Wage stagnation
    • Skills gaps
    • Labor market inefficiencies
  • Solutions:
    • Education investment
    • Skills development programs
    • Labor mobility initiatives
  • Organizations:
    • U.S. Bureau of Labor Statistics
    • OECD
    • International Labour Organization
  • Other Data Sources:
    • DS-00002: U.S. GDP (economic output context)
    • DS-00003: U.S. Inflation (real wage purchasing power)

External Resources:

Additional Documentation

User Guides:

Research Using This Source:

  • Initial research compilation (2025)

Methodology Papers:

  • Research methodology: Multi-agent AI research system (10 parallel agents)
  • Sources: BLS, OECD, ILO, Dice, Glassdoor, Robert Half, Payscale, industry reports

Cataloger Notes

Internal Notes:

  • Unique tertiary source; multi-agent research compilation
  • U.S. data high confidence (85%); Global medium confidence (65%)
  • Future updates require re-research (not automated)
  • Consider annual update schedule (October/November)
  • Equity compensation gap noted (most sources exclude stock options/RSUs)

To Do:

  • Annual update (October 2026) with new research cycle
  • Consider adding demographic breakdowns if data becomes available
  • Explore equity compensation data sources (Carta, Glassdoor total comp)
  • Add Africa and Middle East data if sources improve

Questions for Review:

  • Should this be updated quarterly or annually?
  • How to handle exchange rate fluctuations in cross-country comparisons?
  • Should we add PPP-adjusted values?

END OF SOURCE RECORD