Volunteer Data Science

Community Input-Output Modeling Projects

Overview

Active Projects - Combining data science visualizations with LLM Chat
View Starter Samples - Roll up your sleeves and get coding

Industry Evaluator - Top Industries by County
Impact Bubble Chart - Choose 3 indicators for industry comparison
Sankey Supply Chain - D3 version of the USEEIO ecosystem

More Embeddable IO Widgets

Current Focus

• Build and optimize real-time and batch data pipelines to feed ML models, handling structured/unstructured data streams from various sources.
• Integrate machine learning models into Python-based server environments (e.g., FastAPI/Flask), ensuring low-latency, scalable model inference APIs.
• Work with other data scientists to automate feature engineering, model training, and deployment processes for smooth handoff from development to production.
• Containerize and deploy ML services using Docker and manage them on cloud platforms like AWS/GCP, with CI/CD pipelines for automation.
• Implement logging and monitoring systems to track data flow, model performance, and system health; optimize infrastructure for efficiency and reliability.

Volunteer Developers

Our volunteer teams contribute to data visualizations, directories, and supply chain reports on topics like energy use, air and water quality, land use, and job creation using data from the following:

✪ US Bureau of Economic Analysis (BEA)
✪ US Bureau of Labor Statistics (BLS)
✪ US Environmental Protection Agency (EPA)
✪ US Census Bureau County Business Patterns
✪ Contributions from State and Local Agencies

Contributors focus on the following based on their areas of interest:

✪ React Vite, Next.js
✪ Python Data Prep and ML Forecasting
✪ jQuery, JavaScript
✪ Supabase and DuckDB
✪ ECharts and D3 Data Visualization
✪ LLM Chatbot UI for Data Science, Open WebUI
✪ Geospatial Mapping using Leaflet and Mapbox

Active Project Areas

JavaScript with Python - Machine Learning Tools for Industry Estimates

Environmentally-Extended Impact Evaluator

Focus: Tools combining NAICS industry groups (284) with Input-Output Visualizations for Electric Vehicle Manufacturing Transitions and Bioeconomy Waste-to-Energy analysis.

  • Top industry groups based on Public Use Microdata Areas (PUMAs)
  • Impact graphs for sets of locations and industries within EV Transition Evaluator

Implementation

  1. Set up a GitHub Action to pull industry group concentration from the
    DataUSA.io API or Google Data Commons API.
    Store the results as CSV files by state on GitHub in the Community-Data repo.

  2. Apply machine learning using a public websocket.
    Document deployment of the existing websocket for Industry Hotspot Python.

    Python server-side: Flask to Google Cloud with Docker/Kubernetes
    Websocket API: Amazon API Gateway and AWS Lambda with DynamoDB
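
The storage half of step 1 can be sketched as follows. This is a minimal illustration, not the project's actual pipeline: it assumes the API response has already been parsed into a list of dictionaries, and the field names (state, naics, employees) and the function name write_state_csvs are hypothetical.

```python
import csv
from collections import defaultdict
from pathlib import Path


def write_state_csvs(records, out_dir, state_key="state"):
    """Group records by state and write one CSV file per state.

    `records` is a list of dicts, as might be parsed from an API response.
    The field names, including `state_key`, are assumptions for illustration.
    Returns the list of file paths written, sorted by state.
    """
    out_path = Path(out_dir)
    out_path.mkdir(parents=True, exist_ok=True)

    # Bucket the rows by their state value
    by_state = defaultdict(list)
    for row in records:
        by_state[row[state_key]].append(row)

    written = []
    for state, rows in sorted(by_state.items()):
        # Lowercase file name with no spaces or underscores
        file_path = out_path / f"{state.lower()}.csv"
        fieldnames = list(rows[0].keys())
        with file_path.open("w", newline="", encoding="utf-8") as f:
            writer = csv.DictWriter(
                f, fieldnames=fieldnames, extrasaction="ignore", restval=""
            )
            writer.writeheader()
            writer.writerows(rows)
        written.append(file_path)
    return written
```

A scheduled workflow could call a function like this after fetching the data and then commit the resulting files to the repo.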

Industrial Ecology

Bioeconomy Input Audit

Production of fuels and chemicals from biomass can potentially support rural economies and new economic development with positive environmental impacts including capturing carbon, cleaning water, and generating green energy. Audits of regional fuel stocks will be conducted for use in net-positive energy production from waste.

The New Bioeconomy: Advanced Biofuels
Bioeconomy Planner - Regional Biomass Industries

How to add new technologies to the USEEIO model
Model.earth USEEIOR fork with Bioeconomy functions

Lead Intern: Cindy Azuero
Collaborators: Valerie Thomas, Wes Ingwersen, Loren Heyns and Mo Li
Focus: Southeast Georgia, 6-county region

Regional Modeling

County-Level USEEIO

Break down national and state data for county-level outputs. Methodology Documentation.
Generate county centroids from TIGER census data.
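
As a rough sketch of the centroid step, the function below computes the area-weighted centroid of a single polygon ring with the shoelace formula. It assumes the county boundary has already been read into a list of (x, y) vertices and treats coordinates as planar; real county shapes may be multi-part and stored in longitude/latitude, so a GIS library with a suitable projection would likely be used in practice. The function name is hypothetical.

```python
def polygon_centroid(ring):
    """Return the (x, y) area-weighted centroid of a simple polygon.

    `ring` is a list of (x, y) vertices. The first vertex may or may not be
    repeated at the end. Coordinates are treated as planar (an assumption).
    """
    ring = list(ring)
    if ring[0] != ring[-1]:
        ring.append(ring[0])  # close the ring

    area2 = 0.0  # twice the signed area
    cx = 0.0
    cy = 0.0
    for (x0, y0), (x1, y1) in zip(ring, ring[1:]):
        cross = x0 * y1 - x1 * y0
        area2 += cross
        cx += (x0 + x1) * cross
        cy += (y0 + y1) * cross

    if area2 == 0:
        raise ValueError("degenerate polygon with zero area")
    # Centroid = sum / (6 * area) = sum / (3 * area2)
    return cx / (3.0 * area2), cy / (3.0 * area2)
```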

Sankey Supply Chains

Display input and output sectors using D3 Sankey flowcharts
for insights on regional value chains.
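
One way to prepare data for such a chart is sketched below: flows between sectors are converted into the nodes/links structure that the d3-sankey plugin accepts, with links referring to nodes by index. The sector names and the function name build_sankey_graph are illustrative assumptions, not values from the USEEIO model.

```python
import json


def build_sankey_graph(flows):
    """Convert (source, target, value) flows to a nodes/links structure.

    Links refer to nodes by index. Flows that share the same source and
    target are summed into a single link.
    """
    node_index = {}
    totals = {}
    for source, target, value in flows:
        for name in (source, target):
            if name not in node_index:
                node_index[name] = len(node_index)
        key = (node_index[source], node_index[target])
        totals[key] = totals.get(key, 0) + value

    # Dicts preserve insertion order, so node positions match their indexes
    nodes = [{"name": name} for name in node_index]
    links = [
        {"source": s, "target": t, "value": v} for (s, t), v in totals.items()
    ]
    return {"nodes": nodes, "links": links}


if __name__ == "__main__":
    # Hypothetical sector flows for illustration only
    example = [
        ("Farming", "Food Manufacturing", 5),
        ("Food Manufacturing", "Restaurants", 3),
    ]
    print(json.dumps(build_sankey_graph(example), indent=2))
```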

Industry Icons

Icons for NAICS Categories

Lead Intern: Yilun Zha
Collaborators: Wes Ingwersen, Loren Heyns and Michael Srocka
Focus: LaGrange, Georgia - 12-county region

Machine Learning Tools

Environmentally-Extended Impact Evaluator

Industry Impact Evaluator - NAICS sectors by county and zip

  • Top 20 industries based on location
  • Impact graphs for each set of selected industries
  • Heatmap displaying industry sectors and indicators
  • Inflow-Outflow for commodity sets
  • Bubble chart comparing 3 impacts
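
The first item in the list above could be approximated with a simple ranking, sketched here. The row fields (location, naics, employees), the default measure, and the function name top_industries are assumptions for illustration; the evaluator itself may rank on a different measure.

```python
def top_industries(rows, location, n=20, measure="employees"):
    """Return the top `n` industry rows for one location, largest first.

    `rows` is a list of dicts. The keys "location" and `measure` are
    hypothetical field names, not the evaluator's actual schema.
    """
    matches = [row for row in rows if row["location"] == location]
    return sorted(matches, key=lambda row: row[measure], reverse=True)[:n]
```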

Python server-side: Flask to Google Cloud with Docker/Kubernetes
Websocket API: Amazon API Gateway and AWS Lambda with DynamoDB

Lead Intern: Nazanin Tabatabaei
Collaborators: Loren Heyns, Yilun Zha, Wes Ingwersen
Focus: US States

CSS for limiting visibility to localhost
DIV - Use the class "localonly" for display: block with padding: 20px
SPAN - Use the class "local" for display: inline-block with padding: 0 4px 0 4px

Rules of thumb for GitHub static website file names:
1. Avoid using spaces or parentheses
2. Use all lowercase (with occasional exceptions)
3. Avoid using underscores
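
The three rules above could be applied automatically with a small helper like the one below. It is a sketch only: replacing spaces and underscores with hyphens is an assumption (the rules say what to avoid, not what to use instead), and it lowercases every name without allowing exceptions. The function name is hypothetical.

```python
import re


def normalize_filename(name):
    """Return a file name that follows the three rules of thumb.

    Assumption: spaces and underscores become hyphens. Parentheses are
    dropped and the whole name is lowercased.
    """
    name = name.strip().lower()
    name = re.sub(r"[()]", "", name)      # rule 1: no parentheses
    name = re.sub(r"[\s_]+", "-", name)   # rules 1 and 3: no spaces or underscores
    name = re.sub(r"-{2,}", "-", name)    # collapse repeated hyphens
    return name.strip("-")
```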

GitHub Forks of main
1. Work in a fork of the main branch rather than creating multiple branches.
2. Turn on GitHub Pages so work in your forks can be reviewed.
