Redesign README based on comprehensive best practices research

Major improvements following research of top GitHub READMEs (2025):

**Above-the-Fold Improvements:**
- Reduce logo size from 400px to 250px (research optimal: 200-300px)
- Move "About" section immediately after header (passes 30-second test)
- Streamline to 3 key badges (last commit, license, stars)
- Add clear tagline under title for immediate understanding
- Single clean navigation (removed redundant ## Navigation section)

**Structure Enhancements:**
- "About" now appears in first screen (previously buried after 50+ lines)
- Data Directory showcased early as key differentiator
- Video made collapsible with 500px width (saves above-fold space)
- Recent Updates condensed and streamlined
- Added Roadmap section showing completed phases and future plans

**Content Organization:**
- Follows research pattern: Value prop → Key feature → Updates → Docs
- Progressive disclosure with collapsible sections
- Documentation organized by category (Getting Started, Technical, Updates)
- Gateway approach: README covers essentials, links to detailed docs
- Clean H2-only hierarchy (no deep nesting)

**Results:**
- 331 → 228 lines (31% reduction)
- Better scannability and visual hierarchy
- Passes first-visitor 30-second test
- Follows patterns from freeCodeCamp, React, Vue, Next.js

Based on 9-agent parallel research covering: best examples, best practices,
visual design, information architecture, sizing, readability, user needs,
top project patterns, and common mistakes.

🤖 Generated with Claude Code (https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
Daniel Miessler
2025-10-27 02:02:20 +01:00
parent ab2e582e77
commit 90d1ad7087

README.md

@@ -1,331 +1,228 @@
<div align="center">
<img src="https://github.com/user-attachments/assets/2137f529-e5de-4d8e-9ae2-3d67a797d0c9" width="400" height="400"/>
<img src="https://github.com/user-attachments/assets/2137f529-e5de-4d8e-9ae2-3d67a797d0c9" width="250"/>
# `Substrate`
# Substrate
![Static Badge](https://img.shields.io/badge/mission-visualize%20human%20progress-brightgreen)
<br />
![GitHub top language](https://img.shields.io/github/languages/top/human-substrate/Problems)
![GitHub last commit](https://img.shields.io/github/last-commit/human-substrate/Problems)
**An open-source framework for capturing, organizing, and analyzing different aspects of human civilization**
![GitHub last commit](https://img.shields.io/github/last-commit/danielmiessler/Substrate)
[![License: MIT](https://img.shields.io/badge/License-MIT-green.svg)](https://opensource.org/licenses/MIT)
![GitHub Repo stars](https://img.shields.io/github/stars/danielmiessler/Substrate)
<p class="align center">
<h4><code>Substrate</code> An Open-source Framework for Human Understanding, Meaning, and Progress.</h4>
</p>
[About](#about) •
[How to Add Items](#how-to-add-problems) •
[Meta](#meta)
[About](#about) • [Data](#-data-directory) • [Updates](#-recent-updates) • [Docs](#-documentation) • [Contributing](#how-to-contribute)
</div>
## Navigation
- [About](#about)
- [Recent Updates](#-recent-updates)
- [Data Directory](#data-directory)
- [How to Contribute](#how-to-contribute)
- [Documentation](#-documentation)
- [Meta](#meta)
---
## 🚀 **Recent Updates**
> [!IMPORTANT]
> **🔥 2025-10:** Major data infrastructure upgrade complete!
>
> **DATA REVOLUTION:**
> - 5 authoritative datasets added (GDP, Inflation, COVID, Pulitzer, Salaries)
> - Library science methodology implementation
> - Comprehensive data management system
> - 1,700+ data points spanning 107 years (1918-2025)
>
> [See full changelog →](#recent-updates-detail)
<details>
<summary><strong>📅 Click to see all updates</strong></summary>
### <a name="recent-updates-detail"></a>Recent Changes
#### **2025-10-25 - Dataset Updates & Validation**
- **DS-00004:** Pulitzer Prize Winners - Arts & Letters data refreshed
- **DS-00002:** U.S. GDP data updated (1929-2025)
- **DS-00001:** U.S. CPI inflation data updated (1947-2025)
- **DS-00005:** Knowledge Worker Global Salaries validation check completed
#### **2025-10-18 - New Dataset**
- 🆕 **DS-00005:** Knowledge Worker Global Compensation dataset added
- 📊 Global salary data for knowledge workers
- 🔍 Comprehensive geographic and role coverage
#### **2025-10-16 - Data Management System**
- 🏗️ **Library Science Methodology** implemented with 8-dimension source evaluation
- **TypeScript Automation** with Bun runtime
- 📋 **Auto-Discovery Orchestrator** for dataset updates
- 📊 **Central Logging System** with aggregated logs
- 📈 **Dashboard Auto-Generation** with health metrics
- 🔄 **Git Integration** for version control
- 📚 **Comprehensive Documentation Suite:**
- `GETTING_STARTED.md` - Complete setup guide
- `PROJECT_SUMMARY.md` - Technical architecture
- `QUICK_REFERENCE.md` - Command cheatsheet
- `Data/README.md` - Data philosophy and standards
#### **2025-10-07 - Major Dataset Additions**
- 🆕 **DS-00004:** Pulitzer Prize Winners - Arts & Letters (1918-2024)
- 249 winners across Poetry, Drama, General/Special awards
- High-quality, complete coverage of selected categories
- Source: Wikidata
- 🆕 **DS-00003:** Bay Area COVID-19 Wastewater Surveillance
- 161 weekly data points (2022-2025)
- California statewide data (Bay Area proxy)
- Leading health indicator
- Source: California Department of Public Health (CDPH)
#### **2025-10-06 - GitHub Automation**
- 🤖 **Claude Code Review Workflow** - Automated code review
- 🤖 **Claude PR Assistant Workflow** - PR analysis and assistance
- ⚙️ **CI/CD Integration** for quality assurance
#### **2025-10-06 - U.S. Inflation Dataset**
- 🆕 **DS-00001:** U.S. Consumer Price Index (CPI-U)
- 📊 945 monthly data points (1947-2025)
- 📈 Gold standard inflation measure
- 🏛️ Source: FRED/Bureau of Labor Statistics
#### **2025-10-06 - Community Contributions**
- 🌍 **Brazil - São Paulo Mental Health** problem added (@ktfth)
- 📝 **Arguments** contributions (@DesertEaglePWN, @JaymanW)
- 🎯 **Values** framework established (@karai114)
- ✅ Multiple problem database updates
#### **2024-09-25 - Framework Expansion**
- 📋 **Claims Framework** established (@ThatNateGuy)
- Anthropogenic climate change
- Everettian Interpretation of Quantum Mechanics
- Supernaturalism
- Atavistic Model of Cancer
- Holographic Universe theory
#### **2024-07-27 - Repository Consolidation**
- 🏗️ **Single-Repo Structure** - Moved from multi-repo to unified structure
- 📦 Easier project management and contribution workflow
- 🚀 Simplified development process
</details>
<details>
<summary><strong>📊 Project Statistics (as of 2025-10-27)</strong></summary>
### Data & Coverage
- **Datasets:** 5 authoritative ground-truth datasets
- **Data Points:** 1,700+ (spanning multiple domains)
- **Historical Coverage:** 1918-2025 (107 years maximum span)
- **Geographic Coverage:** Global (U.S.-focused with expanding international data)
### Infrastructure
- **Update Scripts:** TypeScript with Bun runtime
- **Automation:** Auto-discovery orchestrator with central logging
- **Data Formats:** CSV, JSON, Markdown, Pipe-delimited
- **Quality Framework:** 8-dimension library science evaluation
- **Version Control:** Full git integration with automated commits
- **GitHub Actions:** 2 active workflows (Code Review, PR Assistant)
### Documentation
- **Markdown:** 8,000+ lines of documentation
- **TypeScript:** 1,000+ lines of automation code
- **Documentation Files:** 25+ comprehensive guides and references
- **Standards:** Dublin Core, MARC, SDMX, DDI metadata compliance
### Community
- **Contributors:** 6+ community members
- **Pull Requests Merged:** 10+ contributions
- **Object Types:** 17+ framework components (Problems, Solutions, Ideas, Plans, etc.)
</details>
<details>
<summary><strong>🎯 Milestones & Roadmap</strong></summary>
### ✅ Completed Milestones
**Phase 1: Foundation (July 2024)**
- ✅ Single-repo structure
- ✅ Core object types defined (17+ types)
- ✅ Basic directory structure
- ✅ Initial documentation
- ✅ Public launch with intro video
**Phase 2: Community Building (Aug-Sep 2024)**
- ✅ First community contributions
- ✅ Claims framework established
- ✅ Arguments and Values added
- ✅ Multi-contributor ecosystem active
**Phase 3: Data Infrastructure (Oct 2025)**
- ✅ Five authoritative datasets added
- ✅ Library science methodology implemented
- ✅ TypeScript data management system
- ✅ Comprehensive documentation suite
- ✅ GitHub Actions automation
- ✅ Quality assurance framework
### 🚧 Upcoming (Planned)
**Phase 4: Enhanced Access & Interaction**
- [ ] Web-based contribution interface (non-coders can contribute)
- [ ] Interactive data visualizations
- [ ] RESTful API for programmatic access
- [ ] Advanced cross-reference linking
- [ ] Evidence-based problem/solution matching
**Phase 5: Dataset Expansion**
- [ ] Additional authoritative datasets (UNICEF, OECD, IHME)
- [ ] Community-driven dataset requests
- [ ] Real-time data feeds for select sources
- [ ] Historical data archive expansion
**Phase 6: Advanced Features**
- [ ] Machine-readable catalog (DCAT/CKAN)
- [ ] Automated quality scoring algorithms
- [ ] Data quality trend tracking
- [ ] Email/Slack notifications for updates
- [ ] Parallel dataset updates
</details>
---
**Full Update History:** See [`UPDATES.md`](./UPDATES.md) for complete chronological changelog
---
## About
**Substrate** is an open-source framework for capturing, organizing, and analyzing different aspects of human civilization. It provides a structured knowledge system covering problems, solutions, plans, experiments, and empirical data—all interconnected and designed to be analyzed by both humans and AI systems.
Substrate provides a structured knowledge system covering **problems**, **solutions**, **plans**, **experiments**, and **empirical data**—all interconnected and designed to be analyzed by both humans and AI systems.
The project combines:
- **Conceptual Components**: Problems, Solutions, Ideas, Plans, Values, Models, Arguments, Claims
- **Empirical Data**: Curated ground-truth datasets from authoritative sources
- **Organizational Elements**: People, Projects, Organizations, Funding Sources
- **Outcome Tracking**: Results, Experiments, Metrics, Risks
The project combines conceptual frameworks (Problems, Solutions, Ideas, Plans, Values, Models, Arguments, Claims) with authoritative ground-truth datasets from verified sources. All data is provided in human-readable CSV and Markdown formats with complete methodology documentation.
### Data Directory
**Mission:** Build a trusted foundation of ground-truth data and structured knowledge to support human understanding and progress.
Substrate includes a **Data/** directory with authoritative, ground-truth datasets about important aspects of human life, society, and progress. All datasets come from verified, reputable sources and are provided in human-readable CSV and Markdown formats.
<details>
<summary><strong>🎬 Watch Introduction Video</strong></summary>
**Current Datasets:**
| Dataset ID | Dataset Name | Coverage | Data Points | Source | Description |
|-----------|--------------|----------|-------------|--------|-------------|
| **DS-00002** | **US-GDP** | 1929-2025 | 96 years (annual)<br>314 quarters | FRED/BEA | Real GDP (chained 2017 dollars) - primary measure of US economic activity |
| **DS-00001** | **US-Inflation** | 1947-2025 | 945 months | FRED/BLS | Consumer Price Index (CPI-U) - gold standard inflation measure |
| **DS-00003** | **Bay-Area-COVID-Wastewater** | 2022-2025 | 161 weeks | CDPH | California COVID-19 wastewater surveillance (leading health indicator) |
| **DS-00004** | **Pulitzer-Prize-Winners** | 1918-2024 | 249 winners | Wikidata | Arts & Letters categories (Poetry, Drama, General/Special awards) |
| **DS-00005** | **Knowledge-Worker-Global-Salaries** | Global | Multi-region | Research | Global compensation data for knowledge workers across roles and geographies |
**Data Management System:**
- **Library Science Methodology**: 8-dimension source quality evaluation
- **TypeScript Automation**: Auto-discovery orchestrator with Bun runtime
- **Quality Standards**: Dublin Core, MARC, SDMX, DDI metadata compliance
- **Version Control**: Full git integration with automated updates
- **Central Logging**: Aggregated logs and health monitoring
- **Documentation**: Comprehensive guides for each dataset
**Data Philosophy:**
- **Ground Truth First**: Authoritative, verifiable sources only
- **Human-Readable + Machine-Parseable**: CSV, JSON, and Markdown formats
- **Full Transparency**: Complete methodology documentation and source attribution
- **Shared Knowledge**: Public domain or openly licensed data
- **Research-Grade Quality**: Professional library science evaluation
See **[Data/README.md](./Data/README.md)** for complete documentation of all datasets, data quality standards, and contribution guidelines.
## Introduction video
Here's a video explaining the project and its structure.
<br/>
<div align="center">
<a href="https://www.youtube.com/watch?v=ky7ejowc_qY">
<img src="https://img.youtube.com/vi/ky7ejowc_qY/0.jpg" alt="Watch the Substrate Intro Video" style="width:100%;">
<img src="https://img.youtube.com/vi/ky7ejowc_qY/0.jpg" alt="Watch the Substrate Intro Video" width="500">
</a>
</div>
## Blog post
</details>
And here's a full blog post about the project.
**Blog Post:** [Introducing Substrate](https://danielmiessler.com/p/introducing-substrate)
[Introducing Substrate](https://danielmiessler.com/p/introducing-substrate)
---
## 📚 **Documentation**
## 📊 Data Directory
Substrate includes comprehensive documentation for all aspects of the project:
Substrate includes **5 authoritative datasets** with 1,700+ data points spanning 107 years (1918-2025):
### **Getting Started**
- **[GETTING_STARTED.md](./GETTING_STARTED.md)** - Complete setup and usage guide for the data management system
- **[QUICK_REFERENCE.md](./QUICK_REFERENCE.md)** - Quick command reference and cheatsheet
- **[Data/README.md](./Data/README.md)** - Data directory philosophy, standards, and contribution guidelines
| Dataset | Coverage | Data Points | Source |
|---------|----------|-------------|--------|
| **US-GDP** | 1929-2025 | 96 years annual<br>314 quarters | FRED/BEA |
| **US-Inflation** | 1947-2025 | 945 months | FRED/BLS |
| **Bay Area COVID Wastewater** | 2022-2025 | 161 weeks | CDPH |
| **Pulitzer Prize Winners** | 1918-2024 | 249 winners | Wikidata |
| **Knowledge Worker Salaries** | Global | Multi-region | Research |
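
The headline figures ("1,700+ data points", "107 years") can be sanity-checked against the table above. The sketch below is one possible reconciliation, not the project's own tooling: the `DatasetRow` shape and the `summarize` helper are hypothetical, and adding the annual and quarterly GDP counts together is an assumption about how the total was reached. Knowledge Worker Salaries is left out because the table gives no numeric count or year range for it.

```typescript
// Hypothetical row shape; the numbers are copied from the dataset table.
interface DatasetRow {
  name: string;
  firstYear: number;
  lastYear: number;
  dataPoints: number;
}

const rows: DatasetRow[] = [
  // Assumption: annual (96) and quarterly (314) GDP observations are both counted.
  { name: "US-GDP", firstYear: 1929, lastYear: 2025, dataPoints: 96 + 314 },
  { name: "US-Inflation", firstYear: 1947, lastYear: 2025, dataPoints: 945 },
  { name: "Bay Area COVID Wastewater", firstYear: 2022, lastYear: 2025, dataPoints: 161 },
  { name: "Pulitzer Prize Winners", firstYear: 1918, lastYear: 2024, dataPoints: 249 },
  // Knowledge Worker Salaries omitted: no numeric count or year range in the table.
];

// Sum the data points and measure the span from the earliest to the latest year.
function summarize(data: DatasetRow[]): { totalPoints: number; spanYears: number } {
  const totalPoints = data.reduce((sum, row) => sum + row.dataPoints, 0);
  const earliest = Math.min(...data.map((row) => row.firstYear));
  const latest = Math.max(...data.map((row) => row.lastYear));
  return { totalPoints, spanYears: latest - earliest };
}

const summary = summarize(rows);
console.log(summary); // { totalPoints: 1765, spanYears: 107 }
```

Under these assumptions the four countable datasets give 1,765 points across 1918-2025, which is consistent with the "1,700+" and "107 years" claims.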
### **Technical Documentation**
- **[PROJECT_SUMMARY.md](./PROJECT_SUMMARY.md)** - Technical architecture and system design overview
- **[Data/README-LIBRARY-SCIENCE.md](./Data/README-LIBRARY-SCIENCE.md)** - Library science methodology framework
- **[Data/MIGRATION-GUIDE.md](./Data/MIGRATION-GUIDE.md)** - Guide for data directory structure changes
**Data Quality:**
- ✅ Library science methodology with 8-dimension source evaluation
- ✅ Authoritative sources only (government agencies, verified databases)
- ✅ Complete documentation and methodology for each dataset
- ✅ TypeScript automation with quality assurance
- ✅ CSV, JSON, and Markdown formats
### **Update Logs & Changes**
- **[UPDATES.md](./UPDATES.md)** - Complete project update history and changelog
- **[Data/UPDATES.md](./Data/UPDATES.md)** - Data directory-specific update log
- Individual dataset update logs in each `Data/*/UPDATES.md` file
**[→ Explore Data Directory](./Data/README.md)**
### **Dataset Documentation**
Each dataset includes comprehensive documentation:
- **README.md** - Dataset overview, research methodology, and usage
- **UPDATES.md** - Dataset-specific update history
- **RESOURCES.md** - Data sources, APIs, and download instructions
- **source.md** - Library science evaluation (8-dimension quality assessment)
---
### **Video & Blog**
- **[Introduction Video](https://www.youtube.com/watch?v=ky7ejowc_qY)** - Project explanation and structure
- **[Blog Post](https://danielmiessler.com/p/introducing-substrate)** - Detailed project introduction
## 🚀 Recent Updates
> [!IMPORTANT]
> **🔥 October 2025:** Major data infrastructure upgrade complete!
>
> - 5 authoritative datasets added (1,700+ data points)
> - Library science methodology implementation
> - TypeScript automation with Bun runtime
> - Comprehensive documentation suite
<details>
<summary><strong>📅 View detailed changelog</strong></summary>
### Recent Changes
**2025-10-25 - Dataset Updates**
- ✅ Pulitzer Prize, GDP, and inflation data refreshed
- ✅ Knowledge Worker Salaries validation completed
**2025-10-18 - New Dataset**
- 🆕 Knowledge Worker Global Compensation added
**2025-10-16 - Infrastructure**
- 🏗️ Library science methodology (8-dimension evaluation)
- ⚡ TypeScript automation with auto-discovery
- 📊 Central logging and health monitoring
- 📚 Documentation suite (Getting Started, Technical Summary, Quick Reference)
**2025-10-07 - Major Datasets**
- 🆕 Pulitzer Prize Winners (1918-2024, 249 winners)
- 🆕 Bay Area COVID Wastewater (161 weeks, 2022-2025)
**2025-10-06 - Automation & Data**
- 🤖 GitHub Actions workflows (Code Review, PR Assistant)
- 🆕 U.S. Inflation dataset (945 months, 1947-2025)
**2024-09 - Community**
- 📝 Claims, Arguments, and Values frameworks established
- 🌍 Multiple community contributions
**2024-07 - Foundation**
- 🏗️ Single-repo structure
- 🚀 Public launch
### Project Stats (2025-10-27)
**Data:** 5 datasets • 1,700+ points • 107-year span (1918-2025)
**Infrastructure:** TypeScript automation • Library science framework • GitHub Actions
**Community:** 6+ contributors • 10+ merged PRs • 17+ object types
**Docs:** 8,000+ lines markdown • 25+ documentation files
**[→ Full update history](./UPDATES.md)**
</details>
---
## 📚 Documentation
### Getting Started
- **[Getting Started Guide](./GETTING_STARTED.md)** - Complete setup and usage
- **[Quick Reference](./QUICK_REFERENCE.md)** - Command cheatsheet
- **[Data Directory Guide](./Data/README.md)** - Data philosophy and standards
### Technical
- **[Project Summary](./PROJECT_SUMMARY.md)** - Architecture overview
- **[Library Science Framework](./Data/README-LIBRARY-SCIENCE.md)** - Methodology
- **[Migration Guide](./Data/MIGRATION-GUIDE.md)** - Structure changes
### Updates & Changes
- **[UPDATES.md](./UPDATES.md)** - Complete project changelog
- **[Data Updates](./Data/UPDATES.md)** - Dataset-specific logs
- Individual dataset update logs in `Data/*/UPDATES.md`
---
## How to Contribute
You can contribute to Substrate by submitting PRs to modify the various Substrate object files within each directory, e.g.: `Problems`, `Solutions`, `Ideas`, etc.
Contribute by submitting PRs to modify Substrate object files in directories like `Problems/`, `Solutions/`, `Ideas/`, etc.
We're working on a web-based interface for this as well to make it easier for non-coders to contribute.
**Contributing Datasets:**
- See **[Data/README.md](./Data/README.md)** for data quality standards
- Follow **[Getting Started Guide](./GETTING_STARTED.md)** for step-by-step instructions
### Contributing Datasets
To contribute new datasets, see:
- **[Data/README.md](./Data/README.md)** - Data contribution guidelines and quality standards
- **[GETTING_STARTED.md](./GETTING_STARTED.md)** - Step-by-step guide for adding new data sources
<br />
**Note:** We're developing a web-based contribution interface for non-coders.
> [!NOTE]
> July 27, 2024 — We moved to a single-repo structure to make the project easier to manage.
> **July 27, 2024** — We moved to a single-repo structure to make the project easier to manage.
---
## Roadmap
### ✅ Completed
**Phase 1: Foundation (July 2024)**
- Single-repo structure with 17+ object types
- Core framework and documentation
- Public launch
**Phase 2: Community (Aug-Sep 2024)**
- Community contributions and frameworks
- Claims, Arguments, and Values established
**Phase 3: Data Infrastructure (Oct 2025)**
- 5 authoritative datasets added
- Library science methodology
- TypeScript automation system
- Comprehensive documentation
### 🚧 Planned
**Phase 4: Enhanced Access**
- Web-based contribution interface
- Interactive visualizations
- RESTful API
**Phase 5: Dataset Expansion**
- Additional authoritative datasets (UNICEF, OECD, IHME)
- Real-time data feeds
- Community-driven requests
**Phase 6: Advanced Features**
- Machine-readable catalog (DCAT/CKAN)
- Automated quality scoring
- Email/Slack notifications
---
## Meta
> [!NOTE]
> Special thanks to the following people for their inspiration and contributions!
### Special Thanks
- _Jonathan Dunn_ for being the person most similar in goals that I've met so far.
- _Joel Parish_ for a never-ending source of inspiration and structure wisdom.
- _Joseph Thacker_ for a constant flow of solid ideas about all aspects of the project.
**Inspiration & Contributions:**
- _Jonathan Dunn_ - Similar goals and collaboration
- _Joel Parish_ - Structure wisdom
- _Joseph Thacker_ - Continuous flow of ideas
### Primary contributors
### Primary Contributors
<a href="https://github.com/xssdoctor"><img src="https://avatars.githubusercontent.com/u/9218431?v=4" title="Jonathan Dunn" width="50" height="50"></a>
`Substrate` was created by <a href="https://danielmiessler.com/subscribe" target="_blank">Daniel Miessler</a> in July of 2024.
<br /><br />
### Community Contributors
Special thanks to all contributors: @ThatNateGuy, @JaymanW, @karai114, @DesertEaglePWN, @ktfth
### Created By
`Substrate` was created by <a href="https://danielmiessler.com/subscribe" target="_blank">Daniel Miessler</a> in July 2024.
<a href="https://twitter.com/intent/user?screen_name=danielmiessler">![X (formerly Twitter) Follow](https://img.shields.io/twitter/follow/danielmiessler)</a>
---
<div align="center">
**[↑ Back to Top](#substrate)**
</div>