Substrate/Data/UPDATES.md
svemagie 0698daea69 feat: add DE proxy datasets DS-00015–17 + metrics updates — flatten dirs to .md
- Add DE-Platform-Media (DS-00015), DE-Epistemic-Competence (DS-00016), DE-Social-Mobility (DS-00017) with source stubs
- Update DE-Democracy-Metrics, DE-Federal-Budget, DE-Lobby-Transparency, DE-Parliament-Activity, Knowledge-Worker salaries
- Add get-de-digital script for digital economy data retrieval
- Update de-plan1-sven with revised strategy sections
- Rename flat-dir index files to .md (Arguments, Claims, Problems, Values)
- Append new entries to Data/UPDATES.md

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-22 17:20:04 +02:00


Data Directory Update Log

This file tracks all datasets added to the Substrate Data directory.


2026-04-20 - DE Social Mobility (DS-00017)

Dataset: DE-Social-Mobility
Status: Active
Coverage: Gymnasium access rates by parental education (Destatis 2015), intergenerational mobility comparison (IWH 2017, PISA/IGLU)
Connection: DE-Plan CHALLENGE 5 (Knowledge isolation), CHALLENGE 4 (Exhaustion/precarity)


2026-04-20 - DE Epistemic Competence (DS-00016)

Dataset: DE-Epistemic-Competence
Status: Active
Coverage: Adult literacy (PIAAC 2023), science trust (Wissenschaftsbarometer 2024), digital skills (Eurostat), media literacy perception (Eurobarometer)
Connection: DE-Plan CHALLENGE 5 (Knowledge isolation)


2026-04-20 - DE Platform & Media (DS-00015)

Dataset: DE-Platform-Media
Status: Active
Coverage: Social network participation, online news reading, platform reach (Germany, 2021-2025)
Sources: Eurostat live API (bun get-de-digital) + Reuters Institute Digital News Report 2024 + ARD/ZDF Onlinestudie 2024
Connection: DE-Plan CHALLENGE 3 (Platform-mediated public sphere)
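
The Eurostat pull behind this entry might look roughly like the sketch below. It is an illustration only, not the contents of get-de-digital: the dataset code `isoc_ci_ac_i`, the indicator codes `I_IUSNET` and `I_IUNW1`, and the helper name `build_eurostat_url` are assumptions, and the actual script may query different tables. The sketch only builds a request URL for the Eurostat Statistics API and makes no network call.

```python
from urllib.parse import urlencode

# Base of the Eurostat dissemination Statistics API (returns JSON-stat).
EUROSTAT_BASE = "https://ec.europa.eu/eurostat/api/dissemination/statistics/1.0/data"


def build_eurostat_url(dataset, indicators, geo="DE", since=2021, until=2025):
    """Build a query URL for one dataset, filtered by country, years and indicators.

    `dataset` and `indicators` are assumed codes; check them against the
    Eurostat data browser before relying on the result.
    """
    params = [
        ("format", "JSON"),
        ("lang", "EN"),
        ("geo", geo),
        ("sinceTimePeriod", str(since)),
        ("untilTimePeriod", str(until)),
    ]
    # One `indic_is` parameter per indicator code.
    params += [("indic_is", code) for code in indicators]
    return f"{EUROSTAT_BASE}/{dataset}?{urlencode(params)}"


# Hypothetical query: social network participation and online news reading.
url = build_eurostat_url("isoc_ci_ac_i", ["I_IUSNET", "I_IUNW1"])
print(url)
```

Keeping URL construction separate from the fetch makes the query parameters easy to inspect before any request is sent.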


2025-10-16 - U.S. Gross Domestic Product (GDP)

Dataset: US-GDP
Status: Active
Coverage: 1929-2024 (annual), Q1 1947 - Q2 2025 (quarterly)
Source: Federal Reserve Economic Data (FRED) / Bureau of Economic Analysis (BEA)

Contents

  • Real-GDP-Quarterly-1947-2025.csv - Quarterly real GDP (314 data points)
  • Real-GDP-Annual-1929-2024.csv - Annual real GDP (96 data points)
  • US-GDP-1929-2025.md - Comprehensive metadata documentation
  • README.md - Dataset documentation with research methodology and historical context
  • UPDATES.md - Dataset-specific change log
  • RESOURCES.md - Data sources, APIs, and download instructions

Description

Authoritative U.S. GDP data representing the total value of all goods and services produced within the United States. Real GDP (chained 2017 dollars) enables inflation-adjusted comparisons across 96 years of American economic history. Quarterly data provides 78 years of detailed business cycle information. Data sourced directly from BEA via FRED, the Federal Reserve's economic data platform.
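
The data point counts quoted for this dataset follow directly from the coverage ranges. As a quick sanity check (a minimal sketch; the helper names `count_quarters` and `count_years` are hypothetical and not part of the dataset tooling), counting both endpoints inclusively:

```python
def count_quarters(start_year, start_q, end_year, end_q):
    """Number of quarters from start to end, counting both endpoints."""
    return (end_year - start_year) * 4 + (end_q - start_q) + 1


def count_years(start_year, end_year):
    """Number of calendar years from start to end, counting both endpoints."""
    return end_year - start_year + 1


quarterly_points = count_quarters(1947, 1, 2025, 2)  # Q1 1947 through Q2 2025
annual_points = count_years(1929, 2024)

print(quarterly_points, annual_points)  # 314 96
```

314 quarters is 78.5 years, which matches the "78 years" of quarterly coverage quoted here when rounded down.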

Research Methodology

Created through parallel research using 10 specialized research agents across 3 services (Perplexity, Claude WebSearch, Gemini). Twenty focused queries evaluated data sources, historical coverage, measurement methodologies, and quality standards. Source selection was made at a 95%+ confidence level. The research confirmed BEA as the primary official U.S. government source, with FRED providing the most accessible distribution.

Key Features

  • Gold standard economic indicator: Primary measure of U.S. economic activity
  • Long historical coverage: 96 years annual (1929-2024), 78 years quarterly (1947-2025)
  • Highest data quality: Three-stage quarterly revision process + annual comprehensive updates
  • Full transparency: Public domain data with complete methodology documentation
  • Easy access: Direct CSV downloads and free APIs available

2025-10-07 - Bay Area COVID-19 Wastewater Surveillance

Dataset: Bay-Area-COVID-Wastewater
Status: Active
Coverage: 2022-07-09 to 2025-08-02 (161 weekly data points)
Source: California Department of Public Health (CDPH)

Contents

  • COVID-Wastewater-California-Statewide-2022-2025.csv - Main dataset
  • COVID-Wastewater-SF-Bay-Area-2023-2025.md - Metadata documentation
  • README.md - Dataset documentation and research methodology
  • UPDATES.md - Dataset-specific change log
  • RESOURCES.md - Official dashboard and data source links

Description

California statewide COVID-19 wastewater surveillance data, serving as a proxy for Bay Area trends. Includes weekly viral concentration measurements from 12+ treatment plants across Bay Area counties (San Francisco, Alameda, Santa Clara, Contra Costa, Marin, San Mateo).
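
The count of 161 weekly data points is consistent with one sample per week and no gaps over the stated coverage window. A minimal check, assuming a strictly weekly cadence (the helper name `weekly_points` is hypothetical):

```python
from datetime import date


def weekly_points(first, last):
    """Count weekly samples from first to last, counting both endpoints.

    Assumes exactly one sample every 7 days with no gaps.
    """
    span_days = (last - first).days
    if span_days < 0 or span_days % 7 != 0:
        raise ValueError("dates must be ordered and a whole number of weeks apart")
    return span_days // 7 + 1


points = weekly_points(date(2022, 7, 9), date(2025, 8, 2))
print(points)  # 161
```

If the CSV ever has fewer rows than this function returns for its first and last dates, some weeks are missing.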


2025-10-07 - Pulitzer Prize Winners (Arts & Letters)

Dataset: Pulitzer-Prize-Winners
Status: Active
Coverage: 1918-2024 (249 winners in Arts & Letters categories)
Source: Wikidata
Focus: High-quality, complete coverage of Poetry, Drama, and General/Special awards

Contents

  • Pulitzer-Prize-Winners-Arts-Letters-1918-2024.csv - Combined dataset
  • category-poetry.csv - Poetry winners (105)
  • category-drama.csv - Drama winners (109)
  • category-general.csv - General/Special awards (35)
  • README.md - Dataset documentation and research methodology
  • UPDATES.md - Dataset-specific change log
  • RESOURCES.md - Official source links

Description

Curated Pulitzer Prize winners dataset focusing on Arts & Letters categories with high-quality, near-complete coverage. Includes 107 years of Poetry and Drama awards (1918-2024) plus General/Special citations. Data sourced from Wikidata SPARQL query with comprehensive cleaning. Journalism categories intentionally excluded due to low Wikidata coverage - prioritizing data quality over breadth.


Future Datasets

New datasets will be added above this line in reverse chronological order (newest first).
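
The newest-first convention can be checked mechanically. The sketch below assumes entry headings have the form `YYYY-MM-DD - Title`, optionally prefixed with markdown `#` markers; the function names `entry_dates` and `is_newest_first` are hypothetical, not existing Substrate tooling.

```python
import re
from datetime import date

# Matches headings such as "## 2025-10-16 - U.S. Gross Domestic Product (GDP)".
HEADING = re.compile(r"^(?:#+\s*)?(\d{4})-(\d{2})-(\d{2}) - \S")


def entry_dates(text):
    """Return the dates of all entry headings, in file order."""
    dates = []
    for line in text.splitlines():
        match = HEADING.match(line.strip())
        if match:
            year, month, day = (int(part) for part in match.groups())
            dates.append(date(year, month, day))
    return dates


def is_newest_first(dates):
    """True if no entry is dated later than the entry above it."""
    return all(a >= b for a, b in zip(dates, dates[1:]))


sample = """\
2026-04-20 - DE Social Mobility (DS-00017)
2026-04-20 - DE Epistemic Competence (DS-00016)
2026-04-20 - DE Platform & Media (DS-00015)
2025-10-16 - U.S. Gross Domestic Product (GDP)
2025-10-07 - Bay Area COVID-19 Wastewater Surveillance
2025-10-07 - Pulitzer Prize Winners (Arts & Letters)
"""

ordered = is_newest_first(entry_dates(sample))
print(ordered)  # True
```

Entries that share a date are accepted in any order, since the log does not define a tiebreak.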