diff --git a/README.md b/README.md
index e9fa900..1a7162d 100644
--- a/README.md
+++ b/README.md
@@ -1,331 +1,228 @@
-
+
-# `Substrate`
+# Substrate
-![Static Badge](https://img.shields.io/badge/mission-visualize%20human%20progress-brightgreen)
-
-![GitHub top language](https://img.shields.io/github/languages/top/human-substrate/Problems)
-![GitHub last commit](https://img.shields.io/github/last-commit/human-substrate/Problems)
+**An open-source framework for capturing, organizing, and analyzing different aspects of human civilization**
+
+![GitHub last commit](https://img.shields.io/github/last-commit/danielmiessler/Substrate) [![License: MIT](https://img.shields.io/badge/License-MIT-green.svg)](https://opensource.org/licenses/MIT)
+![GitHub Repo stars](https://img.shields.io/github/stars/danielmiessler/Substrate)
-

-

Substrate An Open-source Framework for Human Understanding, Meaning, and Progress.

-

-
-[About](#about) •
-[How to Add Items](#how-to-add-problems) •
-[Meta](#meta)
+[About](#about) • [Data](#-data-directory) • [Updates](#-recent-updates) • [Docs](#-documentation) • [Contributing](#how-to-contribute)
-## Navigation
-
-- [About](#about)
-- [Recent Updates](#-recent-updates)
-- [Data Directory](#data-directory)
-- [How to Contribute](#how-to-contribute)
-- [Documentation](#-documentation)
-- [Meta](#meta)
-
----
-
-## 🚀 **Recent Updates**
-
-> [!IMPORTANT]
-> **🔥 2025-10:** Major data infrastructure upgrade complete!
->
-> **DATA REVOLUTION:**
-> - 5 authoritative datasets added (GDP, Inflation, COVID, Pulitzer, Salaries)
-> - Library science methodology implementation
-> - Comprehensive data management system
-> - 1,700+ data points spanning 107 years (1918-2025)
->
-> [See full changelog →](#recent-updates-detail)
-
-
-📅 Click to see all updates
-
-### Recent Changes
-
-#### **2025-10-25 - Dataset Updates & Validation**
-- ✅ **DS-00004:** Pulitzer Prize Winners - Arts & Letters data refreshed
-- ✅ **DS-00002:** U.S. GDP data updated (1929-2025)
-- ✅ **DS-00003:** U.S. CPI inflation data updated (1947-2025)
-- ✅ **DS-00005:** Knowledge Worker Global Salaries validation check completed
-
-#### **2025-10-18 - New Dataset**
-- 🆕 **DS-00005:** Knowledge Worker Global Compensation dataset added
-- 📊 Global salary data for knowledge workers
-- 🔍 Comprehensive geographic and role coverage
-
-#### **2025-10-16 - Data Management System**
-- 🏗️ **Library Science Methodology** implemented with 8-dimension source evaluation
-- ⚡ **TypeScript Automation** with Bun runtime
-- 📋 **Auto-Discovery Orchestrator** for dataset updates
-- 📊 **Central Logging System** with aggregated logs
-- 📈 **Dashboard Auto-Generation** with health metrics
-- 🔄 **Git Integration** for version control
-- 📚 **Comprehensive Documentation Suite:**
-  - `GETTING_STARTED.md` - Complete setup guide
-  - `PROJECT_SUMMARY.md` - Technical architecture
-  - `QUICK_REFERENCE.md` - Command cheatsheet
-  - `Data/README.md` - Data philosophy and standards
-
-#### **2025-10-07 - Major Dataset Additions**
-- 🆕 **DS-00004:** Pulitzer Prize Winners - Arts & Letters (1918-2024)
-  - 249 winners across Poetry, Drama, General/Special awards
-  - High-quality, complete coverage of selected categories
-  - Source: Wikidata
-
-- 🆕 **DS-00003:** Bay Area COVID-19 Wastewater Surveillance
-  - 161 weekly data points (2022-2025)
-  - California statewide data (Bay Area proxy)
-  - Leading health indicator
-  - Source: California Department of Public Health (CDPH)
-
-#### **2025-10-06 - GitHub Automation**
-- 🤖 **Claude Code Review Workflow** - Automated code review
-- 🤖 **Claude PR Assistant Workflow** - PR analysis and assistance
-- ⚙️ **CI/CD Integration** for quality assurance
-
-#### **2025-10-06 - U.S. Inflation Dataset**
-- 🆕 **DS-00001:** U.S. Consumer Price Index (CPI-U)
-- 📊 945 monthly data points (1947-2025)
-- 📈 Gold standard inflation measure
-- 🏛️ Source: FRED/Bureau of Labor Statistics
-
-#### **2025-10-06 - Community Contributions**
-- 🌍 **Brazil - São Paulo Mental Health** problem added (@ktfth)
-- 📝 **Arguments** contributions (@DesertEaglePWN, @JaymanW)
-- 🎯 **Values** framework established (@karai114)
-- ✅ Multiple problem database updates
-
-#### **2024-09-25 - Framework Expansion**
-- 📋 **Claims Framework** established (@ThatNateGuy)
-  - Anthropogenic climate change
-  - Everettian Interpretation of Quantum Mechanics
-  - Supernaturalism
-  - Atavistic Model of Cancer
-  - Holographic Universe theory
-
-#### **2024-07-27 - Repository Consolidation**
-- 🏗️ **Single-Repo Structure** - Moved from multi-repo to unified structure
-- 📦 Easier project management and contribution workflow
-- 🚀 Simplified development process
-
-
-
-
-📊 Project Statistics (as of 2025-10-27)
-
-### Data & Coverage
-- **Datasets:** 5 authoritative ground-truth datasets
-- **Data Points:** 1,700+ (spanning multiple domains)
-- **Historical Coverage:** 1918-2025 (107 years maximum span)
-- **Geographic Coverage:** Global (U.S.-focused with expanding international data)
-
-### Infrastructure
-- **Update Scripts:** TypeScript with Bun runtime
-- **Automation:** Auto-discovery orchestrator with central logging
-- **Data Formats:** CSV, JSON, Markdown, Pipe-delimited
-- **Quality Framework:** 8-dimension library science evaluation
-- **Version Control:** Full git integration with automated commits
-- **GitHub Actions:** 2 active workflows (Code Review, PR Assistant)
-
-### Documentation
-- **Markdown:** 8,000+ lines of documentation
-- **TypeScript:** 1,000+ lines of automation code
-- **Documentation Files:** 25+ comprehensive guides and references
-- **Standards:** Dublin Core, MARC, SDMX, DDI metadata compliance
-
-### Community
-- **Contributors:** 6+ community members
-- **Pull Requests Merged:** 10+ contributions
-- **Object Types:** 17+ framework components (Problems, Solutions, Ideas, Plans, etc.)
-
-
-
-
-🎯 Milestones & Roadmap
-
-### ✅ Completed Milestones
-
-**Phase 1: Foundation (July 2024)**
-- ✅ Single-repo structure
-- ✅ Core object types defined (17+ types)
-- ✅ Basic directory structure
-- ✅ Initial documentation
-- ✅ Public launch with intro video
-
-**Phase 2: Community Building (Aug-Sep 2024)**
-- ✅ First community contributions
-- ✅ Claims framework established
-- ✅ Arguments and Values added
-- ✅ Multi-contributor ecosystem active
-
-**Phase 3: Data Infrastructure (Oct 2025)**
-- ✅ Five authoritative datasets added
-- ✅ Library science methodology implemented
-- ✅ TypeScript data management system
-- ✅ Comprehensive documentation suite
-- ✅ GitHub Actions automation
-- ✅ Quality assurance framework
-
-### 🚧 Upcoming (Planned)
-
-**Phase 4: Enhanced Access & Interaction**
-- [ ] Web-based contribution interface (non-coders can contribute)
-- [ ] Interactive data visualizations
-- [ ] RESTful API for programmatic access
-- [ ] Advanced cross-reference linking
-- [ ] Evidence-based problem/solution matching
-
-**Phase 5: Dataset Expansion**
-- [ ] Additional authoritative datasets (UNICEF, OECD, IHME)
-- [ ] Community-driven dataset requests
-- [ ] Real-time data feeds for select sources
-- [ ] Historical data archive expansion
-
-**Phase 6: Advanced Features**
-- [ ] Machine-readable catalog (DCAT/CKAN)
-- [ ] Automated quality scoring algorithms
-- [ ] Data quality trend tracking
-- [ ] Email/Slack notifications for updates
-- [ ] Parallel dataset updates
-
-
-
----
-
-**Full Update History:** See [`UPDATES.md`](./UPDATES.md) for complete chronological changelog
-
 ---
 ## About
-**Substrate** is an open-source framework for capturing, organizing, and analyzing different aspects of human civilization. It provides a structured knowledge system covering problems, solutions, plans, experiments, and empirical data—all interconnected and designed to be analyzed by both humans and AI systems.
+Substrate provides a structured knowledge system covering **problems**, **solutions**, **plans**, **experiments**, and **empirical data**—all interconnected and designed to be analyzed by both humans and AI systems.
-The project combines:
-- **Conceptual Components**: Problems, Solutions, Ideas, Plans, Values, Models, Arguments, Claims
-- **Empirical Data**: Curated ground-truth datasets from authoritative sources
-- **Organizational Elements**: People, Projects, Organizations, Funding Sources
-- **Outcome Tracking**: Results, Experiments, Metrics, Risks
+The project combines conceptual frameworks (Problems, Solutions, Ideas, Plans, Values, Models, Arguments, Claims) with authoritative ground-truth datasets from verified sources. All data is provided in human-readable CSV and Markdown formats with complete methodology documentation.
-### Data Directory
+**Mission:** Build a trusted foundation of ground-truth data and structured knowledge to support human understanding and progress.
-Substrate includes a **Data/** directory with authoritative, ground-truth datasets about important aspects of human life, society, and progress. All datasets come from verified, reputable sources and are provided in human-readable CSV and Markdown formats.
+
+🎬 Watch Introduction Video
-**Current Datasets:**
-
-| Dataset ID | Dataset Name | Coverage | Data Points | Source | Description |
-|-----------|--------------|----------|-------------|--------|-------------|
-| **DS-00002** | **US-GDP** | 1929-2025 | 96 years (annual)<br>314 quarters | FRED/BEA | Real GDP (chained 2017 dollars) - primary measure of US economic activity |
-| **DS-00001** | **US-Inflation** | 1947-2025 | 945 months | FRED/BLS | Consumer Price Index (CPI-U) - gold standard inflation measure |
-| **DS-00003** | **Bay-Area-COVID-Wastewater** | 2022-2025 | 161 weeks | CDPH | California COVID-19 wastewater surveillance (leading health indicator) |
-| **DS-00004** | **Pulitzer-Prize-Winners** | 1918-2024 | 249 winners | Wikidata | Arts & Letters categories (Poetry, Drama, General/Special awards) |
-| **DS-00005** | **Knowledge-Worker-Global-Salaries** | Global | Multi-region | Research | Global compensation data for knowledge workers across roles and geographies |
-
-**Data Management System:**
-- **Library Science Methodology**: 8-dimension source quality evaluation
-- **TypeScript Automation**: Auto-discovery orchestrator with Bun runtime
-- **Quality Standards**: Dublin Core, MARC, SDMX, DDI metadata compliance
-- **Version Control**: Full git integration with automated updates
-- **Central Logging**: Aggregated logs and health monitoring
-- **Documentation**: Comprehensive guides for each dataset
-
-**Data Philosophy:**
-- **Ground Truth First**: Authoritative, verifiable sources only
-- **Human-Readable + Machine-Parseable**: CSV, JSON, and Markdown formats
-- **Full Transparency**: Complete methodology documentation and source attribution
-- **Shared Knowledge**: Public domain or openly licensed data
-- **Research-Grade Quality**: Professional library science evaluation
-
-See **[Data/README.md](./Data/README.md)** for complete documentation of all datasets, data quality standards, and contribution guidelines.
-
-## Introduction video
-
-Here's a video explaining the project and its structure.
+
-
- Watch the Substrate Intro Video
+ Watch the Substrate Intro Video
-
-## Blog post
+
-And here's a full blog post about the project.
+**Blog Post:** [Introducing Substrate](https://danielmiessler.com/p/introducing-substrate)
-[Introducing Substrate](https://danielmiessler.com/p/introducing-substrate)
+---
-## 📚 **Documentation**
+## 📊 Data Directory
-Substrate includes comprehensive documentation for all aspects of the project:
+Substrate includes **5 authoritative datasets** with 1,700+ data points spanning 107 years (1918-2025):
-### **Getting Started**
-- **[GETTING_STARTED.md](./GETTING_STARTED.md)** - Complete setup and usage guide for the data management system
-- **[QUICK_REFERENCE.md](./QUICK_REFERENCE.md)** - Quick command reference and cheatsheet
-- **[Data/README.md](./Data/README.md)** - Data directory philosophy, standards, and contribution guidelines
+| Dataset | Coverage | Data Points | Source |
+|---------|----------|-------------|--------|
+| **US-GDP** | 1929-2025 | 96 years annual<br>314 quarters | FRED/BEA |
+| **US-Inflation** | 1947-2025 | 945 months | FRED/BLS |
+| **Bay Area COVID Wastewater** | 2022-2025 | 161 weeks | CDPH |
+| **Pulitzer Prize Winners** | 1918-2024 | 249 winners | Wikidata |
+| **Knowledge Worker Salaries** | Global | Multi-region | Research |
-### **Technical Documentation**
-- **[PROJECT_SUMMARY.md](./PROJECT_SUMMARY.md)** - Technical architecture and system design overview
-- **[Data/README-LIBRARY-SCIENCE.md](./Data/README-LIBRARY-SCIENCE.md)** - Library science methodology framework
-- **[Data/MIGRATION-GUIDE.md](./Data/MIGRATION-GUIDE.md)** - Guide for data directory structure changes
+**Data Quality:**
+- ✅ Library science methodology with 8-dimension source evaluation
+- ✅ Authoritative sources only (government agencies, verified databases)
+- ✅ Complete documentation and methodology for each dataset
+- ✅ TypeScript automation with quality assurance
+- ✅ CSV, JSON, and Markdown formats
-### **Update Logs & Changes**
-- **[UPDATES.md](./UPDATES.md)** - Complete project update history and changelog
-- **[Data/UPDATES.md](./Data/UPDATES.md)** - Data directory-specific update log
-- Individual dataset update logs in each `Data/*/UPDATES.md` file
+**[→ Explore Data Directory](./Data/README.md)**
-### **Dataset Documentation**
-Each dataset includes comprehensive documentation:
-- **README.md** - Dataset overview, research methodology, and usage
-- **UPDATES.md** - Dataset-specific update history
-- **RESOURCES.md** - Data sources, APIs, and download instructions
-- **source.md** - Library science evaluation (8-dimension quality assessment)
+---
-### **Video & Blog**
-- **[Introduction Video](https://www.youtube.com/watch?v=ky7ejowc_qY)** - Project explanation and structure
-- **[Blog Post](https://danielmiessler.com/p/introducing-substrate)** - Detailed project introduction
+## 🚀 Recent Updates
+
+> [!IMPORTANT]
+> **🔥 October 2025:** Major data infrastructure upgrade complete!
+>
+> - 5 authoritative datasets added (1,700+ data points)
+> - Library science methodology implementation
+> - TypeScript automation with Bun runtime
+> - Comprehensive documentation suite
+
+
+📅 View detailed changelog
+
+### Recent Changes
+
+**2025-10-25 - Dataset Updates**
+- ✅ Pulitzer Prize, GDP, and inflation data refreshed
+- ✅ Knowledge Worker Salaries validation completed
+
+**2025-10-18 - New Dataset**
+- 🆕 Knowledge Worker Global Compensation added
+
+**2025-10-16 - Infrastructure**
+- 🏗️ Library science methodology (8-dimension evaluation)
+- ⚡ TypeScript automation with auto-discovery
+- 📊 Central logging and health monitoring
+- 📚 Documentation suite (Getting Started, Technical Summary, Quick Reference)
+
+**2025-10-07 - Major Datasets**
+- 🆕 Pulitzer Prize Winners (1918-2024, 249 winners)
+- 🆕 Bay Area COVID Wastewater (161 weeks, 2022-2025)
+
+**2025-10-06 - Automation & Data**
+- 🤖 GitHub Actions workflows (Code Review, PR Assistant)
+- 🆕 U.S. Inflation dataset (945 months, 1947-2025)
+
+**2024-09 - Community**
+- 📝 Claims, Arguments, and Values frameworks established
+- 🌍 Multiple community contributions
+
+**2024-07 - Foundation**
+- 🏗️ Single-repo structure
+- 🚀 Public launch
+
+### Project Stats (2025-10-27)
+
+**Data:** 5 datasets • 1,700+ points • 107-year span (1918-2025)
+
+**Infrastructure:** TypeScript automation • Library science framework • GitHub Actions
+
+**Community:** 6+ contributors • 10+ merged PRs • 17 object types
+
+**Docs:** 8,000+ lines markdown • 25+ documentation files
+
+**[→ Full update history](./UPDATES.md)**
+
+
+
+---
+
+## 📚 Documentation
+
+### Getting Started
+- **[Getting Started Guide](./GETTING_STARTED.md)** - Complete setup and usage
+- **[Quick Reference](./QUICK_REFERENCE.md)** - Command cheatsheet
+- **[Data Directory Guide](./Data/README.md)** - Data philosophy and standards
+
+### Technical
+- **[Project Summary](./PROJECT_SUMMARY.md)** - Architecture overview
+- **[Library Science Framework](./Data/README-LIBRARY-SCIENCE.md)** - Methodology
+- **[Migration Guide](./Data/MIGRATION-GUIDE.md)** - Structure changes
+
+### Updates & Changes
+- **[UPDATES.md](./UPDATES.md)** - Complete project changelog
+- **[Data Updates](./Data/UPDATES.md)** - Dataset-specific logs
+- Individual dataset update logs in `Data/*/UPDATES.md`
 ---
 ## How to Contribute
-You can contribute to Substrate by submitting PRs to modify the various Substrate object files within each directory, e.g.: `Problems`, `Solutions`, `Ideas`, etc.
+Contribute by submitting PRs to modify Substrate object files in directories like `Problems/`, `Solutions/`, `Ideas/`, etc.
-We're working on a web-based interface for this as well to make it easier for non-coders to contribute.
+**Contributing Datasets:**
+- See **[Data/README.md](./Data/README.md)** for data quality standards
+- Follow **[Getting Started Guide](./GETTING_STARTED.md)** for step-by-step instructions
-### Contributing Datasets
-
-To contribute new datasets, see:
-- **[Data/README.md](./Data/README.md)** - Data contribution guidelines and quality standards
-- **[GETTING_STARTED.md](./GETTING_STARTED.md)** - Step-by-step guide for adding new data sources
-
-
+**Note:** We're developing a web-based contribution interface for non-coders.
 > [!NOTE]
-> July 27, 2024 — We moved to a single-repo structure to make the project easier to manage.
+> **July 27, 2024** — We moved to a single-repo structure to make the project easier to manage.
+---
+
+## Roadmap
+
+### ✅ Completed
+
+**Phase 1: Foundation (July 2024)**
+- Single-repo structure with 17+ object types
+- Core framework and documentation
+- Public launch
+
+**Phase 2: Community (Aug-Sep 2024)**
+- Community contributions and frameworks
+- Claims, Arguments, and Values established
+
+**Phase 3: Data Infrastructure (Oct 2025)**
+- 5 authoritative datasets added
+- Library science methodology
+- TypeScript automation system
+- Comprehensive documentation
+
+### 🚧 Planned
+
+**Phase 4: Enhanced Access**
+- Web-based contribution interface
+- Interactive visualizations
+- RESTful API
+
+**Phase 5: Dataset Expansion**
+- Additional authoritative datasets (UNICEF, OECD, IHME)
+- Real-time data feeds
+- Community-driven requests
+
+**Phase 6: Advanced Features**
+- Machine-readable catalog (DCAT/CKAN)
+- Automated quality scoring
+- Email/Slack notifications
+
+---
 ## Meta
-> [!NOTE]
-> Special thanks to the following people for their inspiration and contributions!
+### Special Thanks
-- _Jonathan Dunn_ for being the person most similar in goals that I've met so far.
-- _Joel Parish_ for a neverending source of inspiration and structure wisdom.
-- _Joseph Thacker_ a constant flow of solid ideas about all aspects of the project.
+**Inspiration & Contributions:**
+- _Jonathan Dunn_ - Similar goals and collaboration
+- _Joel Parish_ - Structure wisdom
+- _Joseph Thacker_ - Continuous flow of ideas
-### Primary contributors
+### Primary Contributors
-`Substrate` was created by Daniel Miessler in July of 2024.
-

+### Community Contributors
+
+Special thanks to all contributors: @ThatNateGuy, @JaymanW, @karai114, @DesertEaglePWN, @ktfth
+
+### Created By
+
+`Substrate` was created by Daniel Miessler in July 2024.
+
 ![X (formerly Twitter) Follow](https://img.shields.io/twitter/follow/danielmiessler)
+---
+
+
+
+**[↑ Back to Top](#substrate)**
+
+