Substrate/README.md
svemagie 34c99ec7a9 fix: rename 6 DE datasets README.md→{Name}.md + add 7 missing source catalogs
Renames (consistent naming convention):
- DE-Church-Exits, DE-Mental-Health, DE-Social-Isolation,
  DE-Wastewater-Surveillance, DE-Wellbeing, DE-World-Values

New source catalog entries (DS-00021 through DS-00027):
- Church Exits (EKD/DBK), Common Metrics (Destatis/Bundesbank/BA/ZEW),
  Democracy Metrics (V-Dem/RSF/ALLBUS), Mental Health (Gallup/Destatis/DAK),
  Social Isolation (Genesis/Einsamkeitsstudie), Wellbeing (Eurostat EHIS),
  World Values (WVS/EVS)

All 16 DE datasets + 1 EU dataset now have consistent naming and source catalogs.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-23 12:05:57 +02:00


Substrate

Infrastructure for Human Knowledge & Progress


What Is It • Data • Quick Start • Docs • Contribute • Roadmap


🎯 What Is Substrate?


A Shared Foundation for Human Progress

Think of substrate as the base layer—the common ground where we can all work together to understand problems and drive solutions forward. Instead of fragmented discussions and reinventing the wheel, Substrate gives us one place to:

  • Document problems → What's actually broken, with evidence
  • Track solutions → What works, what doesn't, with results
  • Connect progress → Link people, organizations, and projects actually moving things forward
  • Build on each other → Arguments and ideas that build on shared evidence
  • Measure outcomes → Did it work? What actually changed?

The Purpose: Accelerate Human Progress

We can't solve problems we don't understand. We can't build on solutions we can't find. Substrate provides:

  • 🎯 Shared understanding → One place to understand what's wrong and what works
  • 🚀 Faster progress → Build on existing knowledge instead of starting over
  • 📊 Evidence-based action → Ground decisions in authoritative data, not opinions
  • 🔗 Connected knowledge → See how problems, solutions, people, and data interconnect
  • 🌍 Collective intelligence → Human insight + AI analysis working together toward progress

An open-source framework connecting 17+ knowledge components:

graph TB
    subgraph "🌍 The Substrate"
        A[🧩 Problems]
        B[💡 Solutions]
        C[📊 Data Sources]
        D[🗣️ Arguments]
        E[📋 Claims]
        F[👥 People]
        G[🏢 Organizations]
        H[🚀 Projects]
        I[📈 Plans]
        J[🎯 Values]
        K[💭 Ideas]
    end

    L[👤 Human Contributors] --> A
    L --> B
    L --> D
    M[🤖 AI Analysis] --> A
    M --> B
    M --> D

    A -.connects to.-> B
    B -.connects to.-> C
    D -.connects to.-> C
    E -.connects to.-> C
    F -.connects to.-> G
    G -.connects to.-> H
    H -.connects to.-> B

    A --> N[🔍 Shared Understanding]
    B --> N
    C --> N
    D --> N

🏗️ Structured Components

  • Problems - Documented challenges with evidence
  • Solutions - Proven approaches with results
  • Arguments - Reasoning chains with quality scores
  • Claims - Assertions linked to evidence
  • Plans - Actionable strategies with metrics
  • Ideas - Frameworks and concepts
  • People & Organizations - Who's working on what
  • Projects - Active initiatives with outcomes

📊 Authoritative Data

  • 24 Datasets across the US, Germany, and the EU
  • 27 Source Catalogs with full provenance
  • Library science methodology (8 dimensions)
  • Government agencies + verified databases
  • TypeScript automation with Bun runtime
  • Free APIs with excellent access
  • Human wellbeing indicators beyond GDP
  • Real-time to annual update frequencies

🚀 What's New

Tip

April 2026 — Germany expansion: 16 DE datasets + EU wealth comparison!

The Substrate now covers Germany in depth with 16 datasets tracking democratic resilience, wealth distribution, social mobility, epistemic competence, and more — tied together by the DE-Plan for democratic strengthening.

📅 Latest Updates

🔥 April 2026 — DE Wealth Distribution & EU Comparison

2 NEW Datasets:

| Added | Dataset | Key Finding |
| --- | --- | --- |
| 🆕 | DE-Wealth-Distribution | Wealth Gini 72.4%, far above income Gini (33.7) |
| 🆕 | EU-Wealth-Inequality | Germany has highest wealth Gini in Eurozone (72.6) |

Sources: Bundesbank PHF Wave 5 (2023), ECB DWA Q1 2025, World Inequality Database, Eurofound.
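
For readers unfamiliar with the Gini coefficient behind these findings, here is a minimal sketch of the standard formula on a sorted sample. It is illustrative only: the function name `gini` and the toy inputs are assumptions made for this example, and the published figures come from the sources listed above (which apply survey weights and their own methodology), not from this code.

```typescript
// Gini coefficient of a list of non-negative values (0 = perfect equality).
// Illustrative helper; not part of the Substrate codebase.
function gini(values: number[]): number {
  const n = values.length;
  if (n === 0) throw new Error("gini: empty input");
  const sorted = [...values].sort((a, b) => a - b);
  const total = sorted.reduce((sum, v) => sum + v, 0);
  if (total === 0) return 0; // nobody holds anything: treat as equal
  // G = (2 * sum(i * x_i)) / (n * sum(x_i)) - (n + 1) / n, with i starting at 1
  const weighted = sorted.reduce((sum, v, i) => sum + (i + 1) * v, 0);
  return (2 * weighted) / (n * total) - (n + 1) / n;
}

console.log(gini([1, 1, 1, 1])); // 0    (equal holdings)
console.log(gini([0, 0, 0, 1])); // 0.75 (one of four holds everything)
```

With this finite-sample formula the maximum for n people is (n - 1) / n, which is why the second toy example yields 0.75 rather than 1.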

March–April 2026 — Germany Dataset Expansion

16 German datasets created covering all 6 DE-Plan challenges:

  • Democratic metrics, parliament activity, lobby transparency
  • Social isolation, mental health, wellbeing
  • Platform media dependency, epistemic competence
  • Social mobility, wealth distribution
  • Energy mix, federal budget, church exits, wastewater surveillance
  • Common metrics dashboard (29 indicators from Destatis/Bundesbank/BA/ZEW)

October 2025 — US Wellbeing Infrastructure

5 US wellbeing data sources (DS-00004 through DS-00008): FRED economic stress, CDC mortality, Census social isolation, BLS worker agency, EPA air quality.

📜 Previous Updates

September 2024 — Community Growth

  • Claims, Arguments, and Values frameworks
  • TELOS integration

July 2024 — Foundation

  • Single-repo structure with 17+ object types
  • Public launch with initial datasets

→ View Full Changelog


📊 Data & Evidence

Note

All data sources include complete library science cataloging with 8-dimension evaluation: Authority, Currency, Objectivity, Accuracy, Methodology, Coverage, Reliability, and Provenance.
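
One way to make the 8-dimension evaluation machine-checkable is sketched below. The eight dimension names come from the note above; the record shape, the 1 to 5 integer scale, and the function name `invalidDimensions` are assumptions for illustration, not the format the catalogs actually use.

```typescript
// The eight evaluation dimensions named in the note above.
const DIMENSIONS = [
  "authority", "currency", "objectivity", "accuracy",
  "methodology", "coverage", "reliability", "provenance",
] as const;

type Dimension = (typeof DIMENSIONS)[number];

// Hypothetical record shape: one score per dimension on an assumed 1-5 scale.
type SourceEvaluation = Record<Dimension, number>;

// Returns the dimensions that are missing or hold an out-of-range score.
function invalidDimensions(evaluation: Partial<SourceEvaluation>): Dimension[] {
  return DIMENSIONS.filter((d) => {
    const score = evaluation[d];
    return (
      score === undefined ||
      Math.floor(score) !== score || // must be a whole number
      score < 1 ||
      score > 5
    );
  });
}
```

A catalog entry would pass the check only when `invalidDimensions` returns an empty array, which gives contributors a quick way to spot an incomplete evaluation before opening a pull request.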

Important

We know data can be controversial. That's exactly why we:

  • 📊 Collect from multiple sources - Cross-reference data from different authoritative providers
  • 🔍 Provide complete transparency - Every source fully documented with provenance, methodology, and limitations
  • 📝 Full logging - All data pulls logged with timestamps, source versions, and processing steps
  • 🔓 Open source everything - TypeScript update scripts show exactly how data is fetched and transformed

You can verify, audit, and challenge our data. That's the point.

Core Datasets (Data/)

🇺🇸 United States

| Dataset | Time Span | Source | Status |
| --- | --- | --- | --- |
| US GDP | 1929–2025 | FRED/BEA | Active |
| US Inflation | 1947–2025 | FRED/BLS | Active |
| US Common Metrics | Current | FRED/Census/BLS | Active |
| US Presidential Approval | 1937–2025 | Gallup/538 | Active |
| Bay Area COVID Wastewater | 2022–2025 | CDPH | Active |
| Pulitzer Prize Winners | 1918–2024 | Wikidata | Active |
| Knowledge Worker Salaries | Global | Research | Active |

🇩🇪 Germany (DE-Plan)

| Dataset | Key Metric | Source | Status |
| --- | --- | --- | --- |
| DE Common Metrics | 29 economic/demographic indicators | Destatis/Bundesbank/BA/ZEW | Active |
| DE Wealth Distribution | Wealth Gini 72.4% | Bundesbank PHF/ECB DWA/WID | Active |
| DE Social Mobility | 61% vs 14% Gymnasium rate | Destatis/OECD | Active |
| DE Democracy Metrics | Democratic resilience indicators | Various | Active |
| DE Social Isolation | Social connection metrics | SOEP/Einsamkeitsbarometer | Active |
| DE Mental Health | Mental health indicators | RKI/Destatis | Active |
| DE Wellbeing | Wellbeing indicators | Destatis/Eurostat | Active |
| DE World Values | Value orientations | WVS/EVS | Active |
| DE Platform Media | Platform dependency metrics | Reuters/Eurobarometer | Active |
| DE Epistemic Competence | Media literacy indicators | PISA/ICILS | Active |
| DE Parliament Activity | Bundestag legislative activity | DIP API | Active |
| DE Lobby Transparency | Lobbyregister metrics | Bundestag API | Active |
| DE Federal Budget | Bundeshaushalt (federal budget) data | Bundeshaushalt API | Active |
| DE Energy Mix | Energy transition metrics | SMARD API | Active |
| DE Church Exits | Kirchenaustritte (church exits) | Destatis | Active |
| DE Wastewater Surveillance | SARS-CoV-2 in wastewater | RKI AMELAG | Active |

🇪🇺 European Union

| Dataset | Key Metric | Source | Status |
| --- | --- | --- | --- |
| EU Wealth Inequality | Cross-country Gini comparison | ECB HFCS/DWA/Eurofound | Active |

Source Catalog (Data/sources/)

🌍 Global Health & Development

| ID | Name | Coverage | Update |
| --- | --- | --- | --- |
| DS-00001 | WHO Global Health Observatory | 194 countries, 2,000+ indicators | Quarterly |
| DS-00002 | UN SDG Indicators | 193 countries, 231 indicators | Biannual |
| DS-00003 | World Bank Open Data | Global development metrics | Varies |

🇺🇸 US Wellbeing Sources

| ID | Name | Key Indicators | Update |
| --- | --- | --- | --- |
| DS-00004 | FRED Economic Wellbeing | Debt stress, inequality, unemployment | Weekly–Annual |
| DS-00005 | CDC WONDER Mortality | Overdoses, suicides, deaths of despair | Annual |
| DS-00006 | Census ACS Social | Living alone, commute, digital divide | Annual |
| DS-00007 | BLS JOLTS Labor | Quit rate, job openings, layoffs | Monthly |
| DS-00008 | EPA Air Quality | PM2.5, ozone, AQI | Real-time |
| DS-00009 | EIA Energy Data | Energy production/consumption | Monthly |
| DS-00010 | Treasury FiscalData | Federal revenue/spending | Daily |

🇩🇪 Germany Sources

| ID | Name | Key Indicators | Update |
| --- | --- | --- | --- |
| DS-00011 | Lobbyregister Bundestag | Registered lobbyists, expenditure | Continuous |
| DS-00012 | SMARD Strommarkt | Energy mix, renewable share | 15-min |
| DS-00013 | Bundeshaushalt | Federal budget allocation | Annual |
| DS-00014 | DIP Bundestag | Legislative activity, Drucksachen | Continuous |
| DS-00015 | Platform Media | Platform usage, news consumption | Annual |
| DS-00016 | Epistemic Competence | Media literacy, PISA scores | Triennial |
| DS-00017 | Social Mobility | Gymnasium rates, education spending | Annual |
| DS-00018 | RKI AMELAG Wastewater | SARS-CoV-2 wastewater levels | Weekly |
| DS-00019 | Wealth Distribution | Wealth Gini, top shares, inheritance | Triennial |
| DS-00020 | EU Wealth Inequality | Cross-country Gini, DWA shares | Triennial |
| DS-00021 | Church Exits (EKD/DBK) | Kirchenaustritte (church exits), membership | Annual |
| DS-00022 | Common Metrics | 29 economic/demographic indicators | Monthly–Annual |
| DS-00023 | Democracy Metrics | V-Dem, press freedom, turnout | Annual |
| DS-00024 | Mental Health | Engagement, suicide, sick days | Annual |
| DS-00025 | Social Isolation | Single households, loneliness | Annual |
| DS-00026 | Wellbeing | Life satisfaction, meaning (Eurostat) | ~5 years |
| DS-00027 | World Values | Inglehart dimensions (WVS/EVS) | ~5 years |

Composite Wellbeing Indices

Tip

Combine multiple data sources to create powerful wellbeing metrics:

  • 💸 Financial Stress Composite - Debt + delinquency + evictions + stress index
  • 🚨 Crisis Alert Composite - Overdoses + suicides + long-term unemployment
  • 🤝 Community Health Composite - Living alone + commute + digital divide (inverted)
  • 🆓 Worker Agency Index - Quit rate + job openings / unemployment
  • 🌫️ Environmental Health Index - PM2.5 + ozone (inverted)
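
The composites listed above could be computed along the following lines. This is a sketch under stated assumptions: min-max scaling, equal weights, and the names `Indicator`, `normalize`, and `compositeIndex` are choices made for this example, and the README does not specify the actual weighting or normalization. The `invert` flag mirrors the "(inverted)" notes in the list, flipping indicators where a higher raw value means a worse outcome.

```typescript
// One input series for a composite; all series must cover the same periods.
interface Indicator {
  name: string;
  values: number[]; // one value per period
  invert: boolean;  // true when a higher raw value means a worse outcome
}

// Min-max scale a series to [0, 1]; a constant series maps to 0.5.
function normalize(values: number[]): number[] {
  const min = Math.min(...values);
  const max = Math.max(...values);
  if (max === min) return values.map(() => 0.5);
  return values.map((v) => (v - min) / (max - min));
}

// Equal-weight average of the normalized indicators, one score per period.
function compositeIndex(indicators: Indicator[]): number[] {
  const first = indicators[0];
  if (first === undefined) throw new Error("compositeIndex: no indicators");
  const periods = first.values.length;
  const scaled = indicators.map((ind) => {
    if (ind.values.length !== periods) {
      throw new Error(`${ind.name}: expected ${periods} values`);
    }
    const norm = normalize(ind.values);
    return ind.invert ? norm.map((v) => 1 - v) : norm;
  });
  const result: number[] = [];
  for (let t = 0; t < periods; t++) {
    const sum = scaled.reduce((acc, series) => acc + (series[t] ?? 0), 0);
    result.push(sum / scaled.length);
  }
  return result;
}

// Made-up readings for three periods, purely to show the call shape.
const environmentalHealth = compositeIndex([
  { name: "PM2.5", values: [12, 10, 8], invert: true },
  { name: "Ozone", values: [40, 50, 45], invert: true },
]);
console.log(environmentalHealth); // [0.5, 0.25, 0.75]
```

Min-max scaling keeps the example short but is sensitive to outliers; z-scores or fixed reference ranges are common alternatives when series have extreme values.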

→ Wellbeing Implementation Guide | → Data Directory


🤖 Human & AI Collaboration

Substrate provides the pieces. You and your AI create the connections.

👤 Humans Contribute

  • Document problems and solutions
  • Add authoritative data sources
  • Create arguments and claims
  • Link entities explicitly
  • Validate AI suggestions
  • Rate quality of evidence

🤖 AI Analyzes

  • Scan thousands of components
  • Suggest relationships automatically
  • Detect patterns across datasets
  • Score argument quality
  • Find contradictions
  • Generate knowledge graphs

Everything is designed for dual consumption:

  • Human-readable - Markdown and CSV anyone can open
  • Machine-parseable - Consistent formats AI can query
  • Fully documented - Complete methodology for every dataset
  • Linked with IDs - Unambiguous entity references
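
As a small illustration of what ID-based linking enables, the sketch below pulls source-catalog IDs out of a Markdown file's text and reports any that are not in a known catalog. The `DS-` plus five digits pattern follows the IDs shown in this README; the function names and the idea of passing the catalog as a plain array are assumptions for the example.

```typescript
// Matches source-catalog IDs in the DS-00001 style used in this README.
const SOURCE_ID = /\bDS-\d{5}\b/g;

// Returns the unique source IDs referenced in a piece of text, sorted.
function extractSourceIds(text: string): string[] {
  const matches: string[] = text.match(SOURCE_ID) || [];
  return matches.filter((id, i) => matches.indexOf(id) === i).sort();
}

// Reports referenced IDs that are absent from the known catalog.
function danglingReferences(text: string, catalog: string[]): string[] {
  return extractSourceIds(text).filter((id) => catalog.indexOf(id) === -1);
}

const note = "Built from DS-00004 and DS-00021, cross-checked against DS-00004.";
console.log(extractSourceIds(note)); // ["DS-00004", "DS-00021"]
```

A check like `danglingReferences` could run in CI so that a dataset never cites a source ID that has no catalog entry.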

Use Cases

🔍 OSINT & Investigation
  • Cross-reference public records, corporate filings, government data
  • Link people → organizations → transactions → policies
  • Surface connections invisible in isolated databases
  • Build evidence chains from claims to verifiable records
  • Pattern detection for fraud, corruption, illicit networks
📊 Research & Analysis
  • Track claims against authoritative data sources
  • Evaluate argument quality based on evidence backing
  • Compare solutions across different implementations
  • Measure progress toward stated goals with real metrics
  • Cross-correlate economic, health, social, environmental data
🌐 Public Accountability
  • Verify political claims against documented evidence
  • Track campaign promises → policy outcomes → measured results
  • Link donations → voting records → policy positions
  • Monitor government spending against stated objectives
  • Environmental justice analysis (who breathes toxic air?)

🚀 Quick Start

View the Data (No Installation Required)

All datasets are available as CSV and Markdown files you can browse directly:

# Clone the repository
git clone https://github.com/danielmiessler/Substrate.git
cd Substrate

# Explore core datasets
ls Data/*/

# Explore wellbeing data sources
ls Data-Sources/*/

Run the Automation (Optional)

Warning

Requires Bun runtime. Install: curl -fsSL https://bun.sh/install | bash

# Install dependencies
bun install

# Update a specific dataset (the subshell returns you to the repo root afterwards)
(cd Data/US-GDP && bun run update.ts)

# Update a wellbeing source (requires API key)
export FRED_API_KEY="your_key_here"
(cd Data-Sources/DS-00004—FRED_Economic_Wellbeing && bun run update.ts)

# Update all datasets (run from the repo root)
bun run scripts/update-all.ts
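
To give a sense of what an update script for a FRED-backed source might do, here is a minimal sketch of the two pure steps: building the request URL and turning the response into CSV. It is not the repository's actual `update.ts`. The helper names are hypothetical, and the endpoint and field names follow FRED's public `series/observations` API as I understand it, so check them against the FRED documentation before relying on them.

```typescript
// The fields this sketch reads from each FRED observation.
interface FredObservation {
  date: string;  // e.g. "2024-01-01"
  value: string; // numeric string, or "." when the value is missing
}

// Builds the request URL for one series (hypothetical helper).
function buildFredUrl(seriesId: string, apiKey: string): string {
  const base = "https://api.stlouisfed.org/fred/series/observations";
  const query = [
    `series_id=${encodeURIComponent(seriesId)}`,
    `api_key=${encodeURIComponent(apiKey)}`,
    "file_type=json",
  ].join("&");
  return `${base}?${query}`;
}

// Converts observations to CSV text, skipping missing values.
function observationsToCsv(observations: FredObservation[]): string {
  const rows = observations
    .filter((o) => o.value !== ".")
    .map((o) => `${o.date},${Number(o.value)}`);
  return ["date,value", ...rows].join("\n");
}
```

In a real script, the URL would be passed to `fetch`, the `observations` array from the JSON body handed to `observationsToCsv`, and the result written next to the dataset's Markdown file. Keeping these steps as pure functions makes them easy to test without network access or an API key.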

Get API Keys (Free)

| Data Source | Get Key | Rate Limit |
| --- | --- | --- |
| FRED Economic | fred.stlouisfed.org/docs/api | 120 req/min |
| Census ACS | api.census.gov/data/key_signup | 500 req/day |
| EPA Air Quality | Email: aqs.support@epa.gov | 10 req/min |
| BLS JOLTS | bls.gov/developers/home | 500 req/day |
| CDC WONDER | No key required | Fair use |
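
Staying under the per-minute limits in the table comes down to spacing requests and backing off on failures. The sketch below shows the arithmetic only; the function names, the 1-second base delay, and the 30-second cap are assumptions for illustration and do not describe the repository's actual retry logic.

```typescript
// Minimum spacing between requests, in ms, for a per-minute request budget.
function minIntervalMs(requestsPerMinute: number): number {
  if (requestsPerMinute <= 0) {
    throw new Error("requestsPerMinute must be positive");
  }
  return Math.ceil(60000 / requestsPerMinute);
}

// Exponential backoff delay for a retry attempt (0-based), capped at maxMs.
function backoffMs(attempt: number, baseMs = 1000, maxMs = 30000): number {
  return Math.min(maxMs, baseMs * Math.pow(2, attempt));
}

console.log(minIntervalMs(120)); // 500  (FRED: 120 req/min)
console.log(minIntervalMs(10));  // 6000 (EPA: 10 req/min)
console.log(backoffMs(3));       // 8000 (fourth attempt waits 8 s)
```

Daily quotas such as 500 req/day are better handled by counting requests per run than by spacing them, so they are left out of this sketch.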

→ Complete Getting Started Guide | → Quick Reference


💡 Contribute

Important

Anyone can submit components. No gatekeeping on ideas—just structured formats.

What You Can Add

📋 Problems

  • Water quality issues
  • Healthcare access gaps
  • Climate change impacts

💡 Solutions

  • Filtration systems
  • Telemedicine networks
  • Infrastructure adaptation

📈 Plans

  • Political platforms
  • Policy proposals
  • Action roadmaps

🗣️ Arguments

  • "This works because X, Y, Z"
  • "This failed due to A, B"
  • (We don't judge—community rates)

📊 Data

  • Public records
  • Research datasets
  • Compiled statistics

💭 Ideas

  • Theoretical frameworks
  • Novel measurement approaches
  • Conceptual models

👥 People & Orgs

  • Researchers on problems
  • Organizations implementing
  • Projects with outcomes

📏 Metrics

  • Success criteria
  • Measurement frameworks
  • Progress indicators

🎯 Values

  • Guiding principles
  • Evaluation criteria

How to Submit

  1. Fork the repository on GitHub
  2. Add your component in the appropriate directory (Problems/, Solutions/, etc.)
  3. Follow the format in that directory's README
  4. Submit a Pull Request

Note

We're building a web interface to make this easier for non-technical contributors!

→ Contribution Guidelines


📚 Documentation

Getting Started

Technical


🗺️ Roadmap

Completed

Phase 1: Foundation (July 2024)
  • Single-repo structure with 17+ object types
  • Core framework and documentation
  • Public launch with initial datasets
  • Community contribution framework
Phase 2: Community (Aug-Sep 2024)
  • Claims, Arguments, and Values frameworks
  • 6+ community contributors
  • 10+ merged pull requests
  • TELOS integration
Phase 3: Data Infrastructure (Oct 2025) 🔥
  • 13 authoritative data sources (5 core + 8 wellbeing)
  • Library science methodology (8-dimension evaluation)
  • TypeScript automation system with Bun runtime
  • 6,000+ lines of documentation across all sources
  • Comprehensive wellbeing indicators (economic, health, social, labor, environmental)
  • Free API access with rate limiting and retry logic

🚧 Planned

Phase 4: Enhanced Access (Q4 2025 - Q1 2026)

  • 🎨 Web-based contribution interface (no coding required)
  • 📊 Interactive data visualizations
  • 🔌 RESTful API for programmatic access
  • 📱 Mobile-friendly exploration

Phase 5: Dataset Expansion (2026)

  • 🌍 Additional international sources (UNICEF, OECD, IHME)
  • Real-time data feeds integration
  • 🗳️ Community-driven dataset requests
  • 🤝 Partnerships with research institutions

Phase 6: Intelligence Layer (2026+)

  • 🤖 Automated relationship discovery
  • 📈 Confidence scoring for AI-suggested links
  • 🎯 Pattern detection algorithms
  • 🔔 Email/Slack notifications for data updates
  • 📚 Machine-readable catalog (DCAT/CKAN compliance)

🔗 Integration with TELOS

Substrate provides evidence. TELOS provides intention.

| TELOS (Goals & Strategy) | Substrate (Evidence & Solutions) |
| --- | --- |
| Goals - What you want to achieve | Problems - What stands in the way |
| Strategies - How you'll pursue it | Solutions - Proven approaches |
| Challenges - Obstacles you face | Data - Measured evidence |
| Metrics - Progress tracking | Plans - Implementation roadmaps |

Together: Complete system for intention + evidence-based action.

Plans

| Plan | Scope | Author | Challenges |
| --- | --- | --- | --- |
| US Plan 1 | United States | Daniel Miessler | US-focused wellbeing |
| DE Plan 1 | Germany | Sven | 6 challenges: epistemic fragmentation, authoritarian normalization, platform dependency, exhaustion/precarity, knowledge isolation, press freedom |

The DE-Plan drives all 16 German datasets — each dataset maps to specific DE-Plan challenges.


🙏 Credits

Created By

Daniel Miessler • July 2024


Special Thanks

Inspiration & Contributions:

  • Jonathan Dunn @xssdoctor - Similar vision and collaboration
  • Joel Parish - Structure wisdom and guidance
  • Joseph Thacker - Continuous flow of innovative ideas

Community Contributors:

@ThatNateGuy • @JaymanW • @karai114 • @DesertEaglePWN • @ktfth


📄 License

MIT License - see the LICENSE file for details.


📊 Repository Stats

Data: 24 datasets • 27 source catalogs • 2 country plans (US, DE) • 1 EU comparison

Scope: US economic/wellbeing • Germany democratic resilience • EU wealth inequality

Community: 6+ contributors • 10+ PRs merged • 17 object types



Built with ❤️ for human understanding and progress

Powered by TypeScript • Bun • Open Data

