From 90d1ad7087f52c59d709a85ae30f845497e640f3 Mon Sep 17 00:00:00 2001
From: Daniel Miessler
Date: Mon, 27 Oct 2025 02:02:20 +0100
Subject: [PATCH] Redesign README based on comprehensive best practices research
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Major improvements following research of top GitHub READMEs (2025):

**Above-the-Fold Improvements:**
- Reduce logo size from 400px to 250px (research optimal: 200-300px)
- Move "About" section immediately after header (passes 30-second test)
- Streamline to 3 key badges (last commit, license, stars)
- Add clear tagline under title for immediate understanding
- Single clean navigation (removed redundant ## Navigation section)

**Structure Enhancements:**
- "About" now appears in first screen (previously buried after 50+ lines)
- Data Directory showcased early as key differentiator
- Video made collapsible with 500px width (saves above-fold space)
- Recent Updates condensed and streamlined
- Added Roadmap section showing completed phases and future plans

**Content Organization:**
- Follows research pattern: Value prop → Key feature → Updates → Docs
- Progressive disclosure with collapsible sections
- Documentation organized by category (Getting Started, Technical, Updates)
- Gateway approach: README covers essentials, links to detailed docs
- Clean H2-only hierarchy (no deep nesting)

**Results:**
- 331 → 228 lines (31% reduction)
- Better scannability and visual hierarchy
- Passes first-visitor 30-second test
- Follows patterns from freeCodeCamp, React, Vue, Next.js

Based on 9-agent parallel research covering: best examples, best practices,
visual design, information architecture, sizing, readability, user needs,
top project patterns, and common mistakes.

🤖 Generated with Claude Code (https://claude.com/claude-code)

Co-Authored-By: Claude
---
 README.md | 455 +++++++++++++++++++++---------------------------------
 1 file changed, 176 insertions(+), 279 deletions(-)

diff --git a/README.md b/README.md
index e9fa900..1a7162d 100644
--- a/README.md
+++ b/README.md
@@ -1,331 +1,228 @@
- + -# `Substrate` +# Substrate -![Static Badge](https://img.shields.io/badge/mission-visualize%20human%20progress-brightgreen) -
-![GitHub top language](https://img.shields.io/github/languages/top/human-substrate/Problems)
-![GitHub last commit](https://img.shields.io/github/last-commit/human-substrate/Problems)
+**An open-source framework for capturing, organizing, and analyzing different aspects of human civilization**
+
+![GitHub last commit](https://img.shields.io/github/last-commit/danielmiessler/Substrate)
 [![License: MIT](https://img.shields.io/badge/License-MIT-green.svg)](https://opensource.org/licenses/MIT)
+![GitHub Repo stars](https://img.shields.io/github/stars/danielmiessler/Substrate)
-

-

Substrate An Open-source Framework for Human Understanding, Meaning, and Progress.

-

-
-[About](#about) •
-[How to Add Items](#how-to-add-problems) •
-[Meta](#meta)
+[About](#about) • [Data](#-data-directory) • [Updates](#-recent-updates) • [Docs](#-documentation) • [Contributing](#how-to-contribute)
-## Navigation
-
-- [About](#about)
-- [Recent Updates](#-recent-updates)
-- [Data Directory](#data-directory)
-- [How to Contribute](#how-to-contribute)
-- [Documentation](#-documentation)
-- [Meta](#meta)
-
----
-
-## 🚀 **Recent Updates**
-
-> [!IMPORTANT]
-> **🔥 2025-10:** Major data infrastructure upgrade complete!
->
-> **DATA REVOLUTION:**
-> - 5 authoritative datasets added (GDP, Inflation, COVID, Pulitzer, Salaries)
-> - Library science methodology implementation
-> - Comprehensive data management system
-> - 1,700+ data points spanning 107 years (1918-2025)
->
-> [See full changelog →](#recent-updates-detail)
-
-📅 Click to see all updates
-
-### Recent Changes
-
-#### **2025-10-25 - Dataset Updates & Validation**
-- ✅ **DS-00004:** Pulitzer Prize Winners - Arts & Letters data refreshed
-- ✅ **DS-00002:** U.S. GDP data updated (1929-2025)
-- ✅ **DS-00003:** U.S. CPI inflation data updated (1947-2025)
-- ✅ **DS-00005:** Knowledge Worker Global Salaries validation check completed
-
-#### **2025-10-18 - New Dataset**
-- 🆕 **DS-00005:** Knowledge Worker Global Compensation dataset added
-- 📊 Global salary data for knowledge workers
-- 🔍 Comprehensive geographic and role coverage
-
-#### **2025-10-16 - Data Management System**
-- 🏗️ **Library Science Methodology** implemented with 8-dimension source evaluation
-- ⚡ **TypeScript Automation** with Bun runtime
-- 📋 **Auto-Discovery Orchestrator** for dataset updates
-- 📊 **Central Logging System** with aggregated logs
-- 📈 **Dashboard Auto-Generation** with health metrics
-- 🔄 **Git Integration** for version control
-- 📚 **Comprehensive Documentation Suite:**
-  - `GETTING_STARTED.md` - Complete setup guide
-  - `PROJECT_SUMMARY.md` - Technical architecture
-  - `QUICK_REFERENCE.md` - Command cheatsheet
-  - `Data/README.md` - Data philosophy and standards
-
-#### **2025-10-07 - Major Dataset Additions**
-- 🆕 **DS-00004:** Pulitzer Prize Winners - Arts & Letters (1918-2024)
-  - 249 winners across Poetry, Drama, General/Special awards
-  - High-quality, complete coverage of selected categories
-  - Source: Wikidata
-
-- 🆕 **DS-00003:** Bay Area COVID-19 Wastewater Surveillance
-  - 161 weekly data points (2022-2025)
-  - California statewide data (Bay Area proxy)
-  - Leading health indicator
-  - Source: California Department of Public Health (CDPH)
-
-#### **2025-10-06 - GitHub Automation**
-- 🤖 **Claude Code Review Workflow** - Automated code review
-- 🤖 **Claude PR Assistant Workflow** - PR analysis and assistance
-- ⚙️ **CI/CD Integration** for quality assurance
-
-#### **2025-10-06 - U.S. Inflation Dataset**
-- 🆕 **DS-00001:** U.S. Consumer Price Index (CPI-U)
-- 📊 945 monthly data points (1947-2025)
-- 📈 Gold standard inflation measure
-- 🏛️ Source: FRED/Bureau of Labor Statistics
-
-#### **2025-10-06 - Community Contributions**
-- 🌍 **Brazil - São Paulo Mental Health** problem added (@ktfth)
-- 📝 **Arguments** contributions (@DesertEaglePWN, @JaymanW)
-- 🎯 **Values** framework established (@karai114)
-- ✅ Multiple problem database updates
-
-#### **2024-09-25 - Framework Expansion**
-- 📋 **Claims Framework** established (@ThatNateGuy)
-  - Anthropogenic climate change
-  - Everettian Interpretation of Quantum Mechanics
-  - Supernaturalism
-  - Atavistic Model of Cancer
-  - Holographic Universe theory
-
-#### **2024-07-27 - Repository Consolidation**
-- 🏗️ **Single-Repo Structure** - Moved from multi-repo to unified structure
-- 📦 Easier project management and contribution workflow
-- 🚀 Simplified development process
-
- -
-📊 Project Statistics (as of 2025-10-27)
-
-### Data & Coverage
-- **Datasets:** 5 authoritative ground-truth datasets
-- **Data Points:** 1,700+ (spanning multiple domains)
-- **Historical Coverage:** 1918-2025 (107 years maximum span)
-- **Geographic Coverage:** Global (U.S.-focused with expanding international data)
-
-### Infrastructure
-- **Update Scripts:** TypeScript with Bun runtime
-- **Automation:** Auto-discovery orchestrator with central logging
-- **Data Formats:** CSV, JSON, Markdown, Pipe-delimited
-- **Quality Framework:** 8-dimension library science evaluation
-- **Version Control:** Full git integration with automated commits
-- **GitHub Actions:** 2 active workflows (Code Review, PR Assistant)
-
-### Documentation
-- **Markdown:** 8,000+ lines of documentation
-- **TypeScript:** 1,000+ lines of automation code
-- **Documentation Files:** 25+ comprehensive guides and references
-- **Standards:** Dublin Core, MARC, SDMX, DDI metadata compliance
-
-### Community
-- **Contributors:** 6+ community members
-- **Pull Requests Merged:** 10+ contributions
-- **Object Types:** 17+ framework components (Problems, Solutions, Ideas, Plans, etc.)
-
- -
-🎯 Milestones & Roadmap
-
-### ✅ Completed Milestones
-
-**Phase 1: Foundation (July 2024)**
-- ✅ Single-repo structure
-- ✅ Core object types defined (17+ types)
-- ✅ Basic directory structure
-- ✅ Initial documentation
-- ✅ Public launch with intro video
-
-**Phase 2: Community Building (Aug-Sep 2024)**
-- ✅ First community contributions
-- ✅ Claims framework established
-- ✅ Arguments and Values added
-- ✅ Multi-contributor ecosystem active
-
-**Phase 3: Data Infrastructure (Oct 2025)**
-- ✅ Five authoritative datasets added
-- ✅ Library science methodology implemented
-- ✅ TypeScript data management system
-- ✅ Comprehensive documentation suite
-- ✅ GitHub Actions automation
-- ✅ Quality assurance framework
-
-### 🚧 Upcoming (Planned)
-
-**Phase 4: Enhanced Access & Interaction**
-- [ ] Web-based contribution interface (non-coders can contribute)
-- [ ] Interactive data visualizations
-- [ ] RESTful API for programmatic access
-- [ ] Advanced cross-reference linking
-- [ ] Evidence-based problem/solution matching
-
-**Phase 5: Dataset Expansion**
-- [ ] Additional authoritative datasets (UNICEF, OECD, IHME)
-- [ ] Community-driven dataset requests
-- [ ] Real-time data feeds for select sources
-- [ ] Historical data archive expansion
-
-**Phase 6: Advanced Features**
-- [ ] Machine-readable catalog (DCAT/CKAN)
-- [ ] Automated quality scoring algorithms
-- [ ] Data quality trend tracking
-- [ ] Email/Slack notifications for updates
-- [ ] Parallel dataset updates
-
-
----
-
-**Full Update History:** See [`UPDATES.md`](./UPDATES.md) for complete chronological changelog
-
 ---
 ## About
-**Substrate** is an open-source framework for capturing, organizing, and analyzing different aspects of human civilization. It provides a structured knowledge system covering problems, solutions, plans, experiments, and empirical data—all interconnected and designed to be analyzed by both humans and AI systems.
+Substrate provides a structured knowledge system covering **problems**, **solutions**, **plans**, **experiments**, and **empirical data**—all interconnected and designed to be analyzed by both humans and AI systems.
-The project combines:
-- **Conceptual Components**: Problems, Solutions, Ideas, Plans, Values, Models, Arguments, Claims
-- **Empirical Data**: Curated ground-truth datasets from authoritative sources
-- **Organizational Elements**: People, Projects, Organizations, Funding Sources
-- **Outcome Tracking**: Results, Experiments, Metrics, Risks
+The project combines conceptual frameworks (Problems, Solutions, Ideas, Plans, Values, Models, Arguments, Claims) with authoritative ground-truth datasets from verified sources. All data is provided in human-readable CSV and Markdown formats with complete methodology documentation.
-### Data Directory
+**Mission:** Build a trusted foundation of ground-truth data and structured knowledge to support human understanding and progress.
-Substrate includes a **Data/** directory with authoritative, ground-truth datasets about important aspects of human life, society, and progress. All datasets come from verified, reputable sources and are provided in human-readable CSV and Markdown formats.
+
+🎬 Watch Introduction Video
-**Current Datasets:**
-
-| Dataset ID | Dataset Name | Coverage | Data Points | Source | Description |
-|-----------|--------------|----------|-------------|--------|-------------|
-| **DS-00002** | **US-GDP** | 1929-2025 | 96 years (annual)<br>314 quarters | FRED/BEA | Real GDP (chained 2017 dollars) - primary measure of US economic activity |
-| **DS-00001** | **US-Inflation** | 1947-2025 | 945 months | FRED/BLS | Consumer Price Index (CPI-U) - gold standard inflation measure |
-| **DS-00003** | **Bay-Area-COVID-Wastewater** | 2022-2025 | 161 weeks | CDPH | California COVID-19 wastewater surveillance (leading health indicator) |
-| **DS-00004** | **Pulitzer-Prize-Winners** | 1918-2024 | 249 winners | Wikidata | Arts & Letters categories (Poetry, Drama, General/Special awards) |
-| **DS-00005** | **Knowledge-Worker-Global-Salaries** | Global | Multi-region | Research | Global compensation data for knowledge workers across roles and geographies |
-
-**Data Management System:**
-- **Library Science Methodology**: 8-dimension source quality evaluation
-- **TypeScript Automation**: Auto-discovery orchestrator with Bun runtime
-- **Quality Standards**: Dublin Core, MARC, SDMX, DDI metadata compliance
-- **Version Control**: Full git integration with automated updates
-- **Central Logging**: Aggregated logs and health monitoring
-- **Documentation**: Comprehensive guides for each dataset
-
-**Data Philosophy:**
-- **Ground Truth First**: Authoritative, verifiable sources only
-- **Human-Readable + Machine-Parseable**: CSV, JSON, and Markdown formats
-- **Full Transparency**: Complete methodology documentation and source attribution
-- **Shared Knowledge**: Public domain or openly licensed data
-- **Research-Grade Quality**: Professional library science evaluation
-
-See **[Data/README.md](./Data/README.md)** for complete documentation of all datasets, data quality standards, and contribution guidelines.
-
-## Introduction video
-
-Here's a video explaining the project and its structure.
+
- - Watch the Substrate Intro Video + Watch the Substrate Intro Video -
-## Blog post +
-And here's a full blog post about the project.
+**Blog Post:** [Introducing Substrate](https://danielmiessler.com/p/introducing-substrate)
-[Introducing Substrate](https://danielmiessler.com/p/introducing-substrate)
+---
-## 📚 **Documentation**
+## 📊 Data Directory
-Substrate includes comprehensive documentation for all aspects of the project:
+Substrate includes **5 authoritative datasets** with 1,700+ data points spanning 107 years (1918-2025):
-### **Getting Started**
-- **[GETTING_STARTED.md](./GETTING_STARTED.md)** - Complete setup and usage guide for the data management system
-- **[QUICK_REFERENCE.md](./QUICK_REFERENCE.md)** - Quick command reference and cheatsheet
-- **[Data/README.md](./Data/README.md)** - Data directory philosophy, standards, and contribution guidelines
+| Dataset | Coverage | Data Points | Source |
+|---------|----------|-------------|--------|
+| **US-GDP** | 1929-2025 | 96 years annual<br>314 quarters | FRED/BEA |
+| **US-Inflation** | 1947-2025 | 945 months | FRED/BLS |
+| **Bay Area COVID Wastewater** | 2022-2025 | 161 weeks | CDPH |
+| **Pulitzer Prize Winners** | 1918-2024 | 249 winners | Wikidata |
+| **Knowledge Worker Salaries** | Global | Multi-region | Research |
-### **Technical Documentation**
-- **[PROJECT_SUMMARY.md](./PROJECT_SUMMARY.md)** - Technical architecture and system design overview
-- **[Data/README-LIBRARY-SCIENCE.md](./Data/README-LIBRARY-SCIENCE.md)** - Library science methodology framework
-- **[Data/MIGRATION-GUIDE.md](./Data/MIGRATION-GUIDE.md)** - Guide for data directory structure changes
+**Data Quality:**
+- ✅ Library science methodology with 8-dimension source evaluation
+- ✅ Authoritative sources only (government agencies, verified databases)
+- ✅ Complete documentation and methodology for each dataset
+- ✅ TypeScript automation with quality assurance
+- ✅ CSV, JSON, and Markdown formats
-### **Update Logs & Changes**
-- **[UPDATES.md](./UPDATES.md)** - Complete project update history and changelog
-- **[Data/UPDATES.md](./Data/UPDATES.md)** - Data directory-specific update log
-- Individual dataset update logs in each `Data/*/UPDATES.md` file
+**[→ Explore Data Directory](./Data/README.md)**
-### **Dataset Documentation**
-Each dataset includes comprehensive documentation:
-- **README.md** - Dataset overview, research methodology, and usage
-- **UPDATES.md** - Dataset-specific update history
-- **RESOURCES.md** - Data sources, APIs, and download instructions
-- **source.md** - Library science evaluation (8-dimension quality assessment)
+---
-### **Video & Blog**
-- **[Introduction Video](https://www.youtube.com/watch?v=ky7ejowc_qY)** - Project explanation and structure
-- **[Blog Post](https://danielmiessler.com/p/introducing-substrate)** - Detailed project introduction
+## 🚀 Recent Updates
+
+> [!IMPORTANT]
+> **🔥 October 2025:** Major data infrastructure upgrade complete!
+>
+> - 5 authoritative datasets added (1,700+ data points)
+> - Library science methodology implementation
+> - TypeScript automation with Bun runtime
+> - Comprehensive documentation suite
+
+
+📅 View detailed changelog
+
+### Recent Changes
+
+**2025-10-25 - Dataset Updates**
+- ✅ Pulitzer Prize, GDP, and inflation data refreshed
+- ✅ Knowledge Worker Salaries validation completed
+
+**2025-10-18 - New Dataset**
+- 🆕 Knowledge Worker Global Compensation added
+
+**2025-10-16 - Infrastructure**
+- 🏗️ Library science methodology (8-dimension evaluation)
+- ⚡ TypeScript automation with auto-discovery
+- 📊 Central logging and health monitoring
+- 📚 Documentation suite (Getting Started, Technical Summary, Quick Reference)
+
+**2025-10-07 - Major Datasets**
+- 🆕 Pulitzer Prize Winners (1918-2024, 249 winners)
+- 🆕 Bay Area COVID Wastewater (161 weeks, 2022-2025)
+
+**2025-10-06 - Automation & Data**
+- 🤖 GitHub Actions workflows (Code Review, PR Assistant)
+- 🆕 U.S. Inflation dataset (945 months, 1947-2025)
+
+**2024-09 - Community**
+- 📝 Claims, Arguments, and Values frameworks established
+- 🌍 Multiple community contributions
+
+**2024-07 - Foundation**
+- 🏗️ Single-repo structure
+- 🚀 Public launch
+
+### Project Stats (2025-10-27)
+
+**Data:** 5 datasets • 1,700+ points • 107-year span (1918-2025)
+
+**Infrastructure:** TypeScript automation • Library science framework • GitHub Actions
+
+**Community:** 6+ contributors • 10+ merged PRs • 17 object types
+
+**Docs:** 8,000+ lines markdown • 25+ documentation files
+
+**[→ Full update history](./UPDATES.md)**
+
+
+
+---
+
+## 📚 Documentation
+
+### Getting Started
+- **[Getting Started Guide](./GETTING_STARTED.md)** - Complete setup and usage
+- **[Quick Reference](./QUICK_REFERENCE.md)** - Command cheatsheet
+- **[Data Directory Guide](./Data/README.md)** - Data philosophy and standards
+
+### Technical
+- **[Project Summary](./PROJECT_SUMMARY.md)** - Architecture overview
+- **[Library Science Framework](./Data/README-LIBRARY-SCIENCE.md)** - Methodology
+- **[Migration Guide](./Data/MIGRATION-GUIDE.md)** - Structure changes
+
+### Updates & Changes
+- **[UPDATES.md](./UPDATES.md)** - Complete project changelog
+- **[Data Updates](./Data/UPDATES.md)** - Dataset-specific logs
+- Individual dataset update logs in `Data/*/UPDATES.md`
 ---
 ## How to Contribute
-You can contribute to Substrate by submitting PRs to modify the various Substrate object files within each directory, e.g.: `Problems`, `Solutions`, `Ideas`, etc.
+Contribute by submitting PRs to modify Substrate object files in directories like `Problems/`, `Solutions/`, `Ideas/`, etc.
-We're working on a web-based interface for this as well to make it easier for non-coders to contribute.
+**Contributing Datasets:**
+- See **[Data/README.md](./Data/README.md)** for data quality standards
+- Follow **[Getting Started Guide](./GETTING_STARTED.md)** for step-by-step instructions
-### Contributing Datasets
-
-To contribute new datasets, see:
-- **[Data/README.md](./Data/README.md)** - Data contribution guidelines and quality standards
-- **[GETTING_STARTED.md](./GETTING_STARTED.md)** - Step-by-step guide for adding new data sources
-
-
+**Note:** We're developing a web-based contribution interface for non-coders.
 > [!NOTE]
-> July 27, 2024 — We moved to a single-repo structure to make the project easier to manage.
+> **July 27, 2024** — We moved to a single-repo structure to make the project easier to manage.
+---
+
+## Roadmap
+
+### ✅ Completed
+
+**Phase 1: Foundation (July 2024)**
+- Single-repo structure with 17+ object types
+- Core framework and documentation
+- Public launch
+
+**Phase 2: Community (Aug-Sep 2024)**
+- Community contributions and frameworks
+- Claims, Arguments, and Values established
+
+**Phase 3: Data Infrastructure (Oct 2025)**
+- 5 authoritative datasets added
+- Library science methodology
+- TypeScript automation system
+- Comprehensive documentation
+
+### 🚧 Planned
+
+**Phase 4: Enhanced Access**
+- Web-based contribution interface
+- Interactive visualizations
+- RESTful API
+
+**Phase 5: Dataset Expansion**
+- Additional authoritative datasets (UNICEF, OECD, IHME)
+- Real-time data feeds
+- Community-driven requests
+
+**Phase 6: Advanced Features**
+- Machine-readable catalog (DCAT/CKAN)
+- Automated quality scoring
+- Email/Slack notifications
+
+---
 ## Meta
-> [!NOTE]
-> Special thanks to the following people for their inspiration and contributions!
+### Special Thanks
-- _Jonathan Dunn_ for being the person most similar in goals that I've met so far.
-- _Joel Parish_ for a neverending source of inspiration and structure wisdom.
-- _Joseph Thacker_ a constant flow of solid ideas about all aspects of the project.
+**Inspiration & Contributions:**
+- _Jonathan Dunn_ - Similar goals and collaboration
+- _Joel Parish_ - Structure wisdom
+- _Joseph Thacker_ - Continuous flow of ideas
-### Primary contributors
+### Primary Contributors
-`Substrate` was created by Daniel Miessler in July of 2024.
-

+### Community Contributors
+
+Special thanks to all contributors: @ThatNateGuy, @JaymanW, @karai114, @DesertEaglePWN, @ktfth
+
+### Created By
+
+`Substrate` was created by Daniel Miessler in July 2024.
+
 ![X (formerly Twitter) Follow](https://img.shields.io/twitter/follow/danielmiessler)
+---
+
+
+
+**[↑ Back to Top](#substrate)**
+
+