{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Building Semantic Bridges\n", "## Linking Human Language to Scientific Data\n", "\n", "---\n", "\n", "### What This Does\n", "\n", "This notebook helps you **translate real-world problems described in everyday language into actionable scientific data and models.**\n", "\n", "**Example Bridge:**\n", "- **What people say**: *\"Flooding is getting worse, our wells taste salty\"*\n", "- **What science measures**: `flood_depth`, `groundwater_salinity`, `sea_level_rise`\n", "\n", "---\n", "\n", "### Why This Matters\n", "\n", "When communities face complex environmental challenges, they describe their experiences in everyday language. Scientists and decision-makers need to connect these narratives to:\n", "\n", "1. **Relevant scientific domains** (hydrology, climate science, etc.)\n", "2. **Measurable variables** (water levels, temperature, etc.)\n", "3. **Available data and models** that can inform decisions\n", "\n", "---\n", "\n", "### What You'll Learn\n", "\n", "This notebook will teach you how to:\n", "\n", "- Analyze text documents (interviews, reports, narratives) automatically \n", "- Identify key themes and topics using machine learning \n", "- Map these topics to scientific disciplines \n", "- Extract decision components (goals, objectives, variables, constraints) \n", "- Link everyday language to scientific variable names \n", "\n", "---\n", "\n", "### Who This Is For\n", "\n", "- **Planners and decision-makers** who need to understand technical data\n", "- **Community organizations** working on environmental issues\n", "- **Undergraduate students** learning about interdisciplinary problem-solving\n", "- **Researchers** bridging the gap between lived experience and scientific analysis\n", "\n", "---\n", "\n", "**Author:** Decision Support Office, Texas Advanced Computing Center\n", "**Updated:** November 2025" ] }, { "cell_type": "markdown", "metadata": {}, 
"source": [ "## Step 1: Setup\n", "\n", "This step loads the tools needed for text analysis and visualization.\n", "\n", "The setup includes:\n", "- **Import verification:** checks each package before installing\n", "- **spaCy model check:** verifies whether the model is already downloaded\n", "- **Selective installation:** installs only what is missing\n", "- **Clear feedback:** shows which packages are already available and which need installation\n", "- **Import name mapping:** handles cases where the package name differs from the import name (such as `scikit-learn` vs. `sklearn`)\n", "- **Efficient reuse:** skips reinstallation on subsequent runs\n", "\n", "This suits TACC computational cookbooks, where a notebook may be run many times by different users or in environments with different pre-installed packages." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**When working locally:**\n", "Track checkpoints as notebook versions by saving the whole notebook under a new name each time:\n", "\n", "- `semantic_bridge_NNA_v1.ipynb`\n", "- `semantic_bridge_NNA_v2.ipynb`\n", "- `semantic_bridge_NNA_v3.ipynb`" ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "✓ Added /Users/sawp33/.local/bin to PATH\n", " ✓ pandas already installed\n", " ✓ numpy already installed\n", " ✓ nltk already installed\n", " ✓ spacy already installed\n", " ✓ scikit-learn already installed\n", " ✓ networkx already installed\n", " ✓ plotly already installed\n", " ✓ python-docx already installed\n", " ✓ pillow already installed\n", "\n", "✓ All packages already installed!\n", "✓ spaCy model already available!\n", "\n", "✓ Setup complete and verified!\n" ] } ], "source": [ "# Cell 1: Installation with verification checks\n", "\n", "# Install required packages for TACC Jupyter environment\n", "import sys\n", "import subprocess\n", "import importlib\n", "import os\n", "from pathlib import Path\n", "import warnings\n", 
"warnings.filterwarnings('ignore', message='This pattern is interpreted as a regular expression')\n", "import pandas as pd\n", "import numpy as np\n", "from collections import defaultdict\n", "import re\n", "import ipywidgets as widgets\n", "from IPython.display import display, clear_output, HTML\n", "import plotly.graph_objects as go\n", "from plotly.subplots import make_subplots\n", "# Add user's local bin to PATH (needed for TACC)\n", "user_bin = Path.home() / '.local' / 'bin'\n", "if str(user_bin) not in os.environ['PATH']:\n", "    os.environ['PATH'] = f\"{user_bin}:{os.environ['PATH']}\"\n", "    print(f'✓ Added {user_bin} to PATH')\n", "\n", "def check_package_installed(package_name, import_name=None):\n", "    \"\"\"Check if a package is already installed\"\"\"\n", "    if import_name is None:\n", "        import_name = package_name\n", "    try:\n", "        importlib.import_module(import_name)\n", "        return True\n", "    except ImportError:\n", "        return False\n", "\n", "def check_spacy_model(model_name='en_core_web_sm'):\n", "    \"\"\"Check if spaCy model is already downloaded\"\"\"\n", "    try:\n", "        import spacy\n", "        spacy.load(model_name)\n", "        return True\n", "    except (ImportError, OSError):\n", "        return False\n", "\n", "def install_packages():\n", "    \"\"\"Install required packages if not already available\"\"\"\n", "    # Maps pip package name -> import name (they differ for some packages)\n", "    packages = {\n", "        'pandas': 'pandas',\n", "        'numpy': 'numpy',\n", "        'nltk': 'nltk',\n", "        'spacy': 'spacy',\n", "        'scikit-learn': 'sklearn',\n", "        'networkx': 'networkx',\n", "        'plotly': 'plotly',\n", "        'python-docx': 'docx',\n", "        'pillow': 'PIL'\n", "    }\n", "\n", "    missing_packages = []\n", "    for package, import_name in packages.items():\n", "        if not check_package_installed(package, import_name):\n", "            missing_packages.append(package)\n", "            print(f' - {package} needs installation')\n", "        else:\n", "            print(f' ✓ {package} already installed')\n", "\n", "    if missing_packages:\n", "        print(f'\\nInstalling {len(missing_packages)} package(s)...')\n", "        
subprocess.check_call([\n", "            sys.executable, '-m', 'pip', 'install',\n", "            '--quiet', '--user', '--no-warn-script-location'\n", "        ] + missing_packages)\n", "        print('✓ Package installation complete!')\n", "    else:\n", "        print('\\n✓ All packages already installed!')\n", "\n", "    if not check_spacy_model('en_core_web_sm'):\n", "        print('\\nDownloading spaCy model...')\n", "        subprocess.check_call([\n", "            sys.executable, '-m', 'spacy', 'download',\n", "            'en_core_web_sm', '--quiet'\n", "        ])\n", "        print('✓ spaCy model downloaded!')\n", "    else:\n", "        print('✓ spaCy model already available!')\n", "\n", "    print('\\n✓ Setup complete and verified!')\n", "\n", "install_packages()" ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "✓ Libraries loaded successfully!\n" ] } ], "source": [ "# Cell 2: Import verification, document handling, and library loading\n", "\n", "# Be sure to run Cell 2 each time you restart the kernel\n", "\n", "# Import libraries\n", "import pandas as pd\n", "import numpy as np\n", "import json\n", "from pathlib import Path\n", "from collections import Counter, defaultdict\n", "import re\n", "\n", "# NLP\n", "import nltk\n", "from nltk.tokenize import sent_tokenize, word_tokenize\n", "from nltk.corpus import stopwords\n", "import spacy\n", "\n", "# Machine Learning\n", "from sklearn.feature_extraction.text import TfidfVectorizer\n", "from sklearn.decomposition import LatentDirichletAllocation\n", "\n", "# Network analysis\n", "import networkx as nx\n", "\n", "# Visualization\n", "import plotly.graph_objects as go\n", "import plotly.express as px\n", "\n", "# Document handling\n", "from docx import Document\n", "from PIL import Image\n", "\n", "# Optional: OCR support for images\n", "try:\n", "    import pytesseract\n", "    OCR_AVAILABLE = True\n", "except ImportError:\n", "    OCR_AVAILABLE = False\n", "    print('ℹ pytesseract not available - image OCR disabled')\n", "    print(' 
(This is fine if you only use .txt, .json, or .docx files)')\n", "\n", "# Download NLTK data. Each resource lives under its own category in nltk_data,\n", "# so look it up by its full resource path before downloading.\n", "nltk_resources = {\n", "    'punkt': 'tokenizers/punkt',\n", "    'stopwords': 'corpora/stopwords',\n", "    'averaged_perceptron_tagger': 'taggers/averaged_perceptron_tagger'\n", "}\n", "for pkg, resource_path in nltk_resources.items():\n", "    try:\n", "        nltk.data.find(resource_path)\n", "    except LookupError:\n", "        nltk.download(pkg, quiet=True)\n", "\n", "# Load spaCy\n", "try:\n", "    nlp = spacy.load('en_core_web_sm')\n", "    print('✓ Libraries loaded successfully!')\n", "except OSError:\n", "    print('⚠ Run the previous cell to install spaCy model')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Step 2: Prepare Input Data (Your \"Corpus\")\n", "\n", "This notebook works with documents in several formats that describe a problem or situation. For example:\n", "- Interview transcripts\n", "- Meeting notes\n", "- Stakeholder reports\n", "- Grey literature reports\n", "- Community narratives\n", "\n", "**Supported formats:**\n", "- `.txt` - Plain text files\n", "- `.json` - JSON with text content (field: \"text\" or \"content\", or a \"speakers\" list whose entries have a \"text\" field)\n", "- `.docx` - Microsoft Word documents\n", "- `.png` / `.jpg` / `.jpeg` - Images (OCR extraction, requires `pytesseract`)\n", "\n", "**Setup:** Place your files in the `data/transcripts/` folder, or set `data_dir` in the cells below to the folder that holds them.\n", "\n", "For this demo, sample documents in various formats have been created. You can use these samples if you do not have test datasets." 
] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "✓ Created directory: /Users/sawp33/semantic_bridgeNNAInterdependencies/data\n", " Place your .txt transcript files here, or run the next cell for demo data\n" ] } ], "source": [ "# Cell 3 - Create data directory structure\n", "from pathlib import Path\n", "\n", "data_dir = Path('/Users/sawp33/semantic_bridgeNNAInterdependencies/data')\n", "data_dir.mkdir(parents=True, exist_ok=True)\n", "\n", "print(f'✓ Created directory: {data_dir}')\n", "print(f' Place your .txt transcript files here, or run the next cell for demo data')" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Loading files from /Users/sawp33/semantic_bridgeNNAInterdependencies/data...\n", "\n", "✓ 1_1_InterdependenciesNNA.docx (docx)\n", "✓ 1_2_InterdependenciesNNA.docx (docx)\n", "✓ 1_3_InterdependenciesNNA.docx (docx)\n", "✓ 1_4_InterdependenciesNNA.docx (docx)\n", "✓ 1_5__InterdependenciesNNA.docx (docx)\n", "✓ 1_6__InterdependenciesNNA.docx (docx)\n", "✓ 3_1__InterdependenciesNNA.docx (docx)\n", "✓ 3_2__InterdependenciesNNA.docx (docx)\n", "✓ 3_3__InterdependenciesNNA.docx (docx)\n", "\n", "============================================================\n", "✅ Loaded 9 documents total\n", "============================================================\n", "\n", "Sample document: 1_1_InterdependenciesNNA\n", "Length: 54164 characters\n", "Preview: Leif:\tI don't know if I...\n", "Jennifer:\tI just started it.\n", "Leif:\tAll right, and just like any research project, as you know, you can stop at any point, i...\n", "\n" ] } ], "source": [ "# Cell 4 - Load documents from the data directory (.txt, .json, .docx, images)\n", "import json\n", "from pathlib import Path\n", "from docx import Document\n", "from PIL import Image\n", "\n", "# OCR support (optional)\n", "try:\n", "    import pytesseract\n", "    OCR_AVAILABLE = 
True\n", "except ImportError:\n", "    OCR_AVAILABLE = False\n", "    print('ℹ️ pytesseract not available - image OCR disabled')\n", "\n", "# Configuration\n", "data_dir = Path('/Users/sawp33/semantic_bridgeNNAInterdependencies/data')\n", "documents = {}\n", "\n", "print(f\"Loading files from {data_dir}...\\n\")\n", "\n", "# Process all supported files\n", "for filepath in sorted(data_dir.glob('**/*')):\n", "\n", "    # Skip directories\n", "    if filepath.is_dir():\n", "        continue\n", "\n", "    # Skip hidden files (start with . or ~$)\n", "    if filepath.name.startswith('.') or filepath.name.startswith('~$'):\n", "        continue\n", "\n", "    # Skip checkpoint files\n", "    if 'checkpoint' in filepath.name.lower():\n", "        continue\n", "\n", "    # Skip if in hidden directory (like .ipynb_checkpoints)\n", "    path_str = str(filepath)\n", "    if '/.ipynb_checkpoints/' in path_str or '/.git/' in path_str:\n", "        continue\n", "\n", "    try:\n", "        # 1. Plain text files (.txt)\n", "        if filepath.suffix == '.txt':\n", "            text = filepath.read_text(encoding='utf-8')\n", "            documents[filepath.stem] = text\n", "            print(f\"✓ {filepath.name} (text)\")\n", "\n", "        # 2. JSON files\n", "        elif filepath.suffix == '.json':\n", "            with open(filepath, 'r', encoding='utf-8') as f:\n", "                data = json.load(f)\n", "\n", "            # Try to find text content\n", "            if isinstance(data, dict):\n", "                # Look for 'text', 'content', or 'speakers' fields\n", "                if 'speakers' in data:\n", "                    # Handle speaker-based transcripts\n", "                    text_parts = []\n", "                    for speaker in data['speakers']:\n", "                        if 'text' in speaker:\n", "                            text_parts.append(speaker['text'])\n", "                    text = ' '.join(text_parts)\n", "                elif 'text' in data:\n", "                    text = data['text']\n", "                elif 'content' in data:\n", "                    text = data['content']\n", "                else:\n", "                    text = json.dumps(data)\n", "            else:\n", "                text = json.dumps(data)\n", "\n", "            documents[filepath.stem] = text\n", "            print(f\"✓ {filepath.name} (json)\")\n", "\n", "        # 3. 
Word documents (.docx)\n", "        elif filepath.suffix == '.docx':\n", "            doc = Document(filepath)\n", "            text = '\\n'.join([para.text for para in doc.paragraphs])\n", "            documents[filepath.stem] = text\n", "            print(f\"✓ {filepath.name} (docx)\")\n", "\n", "        # 4. Images (.png, .jpg, .jpeg) - OCR\n", "        elif filepath.suffix.lower() in ['.png', '.jpg', '.jpeg']:\n", "            if OCR_AVAILABLE:\n", "                image = Image.open(filepath)\n", "                text = pytesseract.image_to_string(image)\n", "                documents[filepath.stem] = text\n", "                print(f\"✓ {filepath.name} (image/OCR)\")\n", "            else:\n", "                print(f\"⚠️ {filepath.name} (skipped - pytesseract not installed)\")\n", "\n", "    except Exception as e:\n", "        print(f\"✗ {filepath.name} - Error: {e}\")\n", "\n", "print(f\"\\n{'='*60}\")\n", "print(f\"✅ Loaded {len(documents)} documents total\")\n", "print(f\"{'='*60}\\n\")\n", "\n", "# Show sample\n", "if documents:\n", "    sample_name = list(documents.keys())[0]\n", "    sample_text = documents[sample_name]\n", "    print(f\"Sample document: {sample_name}\")\n", "    print(f\"Length: {len(sample_text)} characters\")\n", "    print(f\"Preview: {sample_text[:150]}...\\n\")" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\n", "============================================================\n", "File: 1_1_InterdependenciesNNA\n", "============================================================\n", "Leif:\tI don't know if I...\n", "Jennifer:\tI just started it.\n", "Leif:\tAll right, and just like any research project, as you know, you can stop at any point, if you don't want to do it, there's no penalty, oka...\n", "\n", "============================================================\n", "File: 1_2_InterdependenciesNNA\n", "============================================================\n", "Lauryn:\t... works. \n", "Leif:\tSeems like probably better to not [inaudible]. \n", "Lauryn:\tYeah. \n", "Leif:\t[inaudible]. Right on, okay. 
Well, so I'm going to, some of this is probably going to sound a little bit ...\n", "\n", "============================================================\n", "File: 1_3_InterdependenciesNNA\n", "============================================================\n", "\n", "QC Nelson\n", "Wed, Aug 24, 2022 7:50AM • 59:15\n", "SUMMARY KEYWORDS\n", "operators, water, alaska, bethel, systems, people, community, plant, utility, project, yk, sewer, money, pipe, anchorage, big, challenge, b...\n", "\n", "============================================================\n", "File: 1_4_InterdependenciesNNA\n", "============================================================\n", "Leif:\tAll right, so, just some background questions: my understanding is you drove truck for Clyde, right? You're a water truck driver, or water and sewer?\n", "Robert:\tYeah, I started driving water truck ...\n", "\n", "============================================================\n", "File: 1_5__InterdependenciesNNA\n", "============================================================\n", "\n", "QC Vicente\n", "Thu, Aug 18, 2022 12:44PM • 1:16:15\n", "SUMMARY KEYWORDS\n", "water, people, communities, plant, bethel, operators, system, talking, maintenance workers, question, area, difficult, houses, training...\n", "\n", "============================================================\n", "File: 1_6__InterdependenciesNNA\n", "============================================================\n", "\n", "QC Bob White\n", "Thu, Aug 18, 2022 12:47PM • 1:13:53\n", "SUMMARY KEYWORDS\n", "operators, communities, test, water, people, plant, maintenance workers, training, bethel, state, pass, certification, issues, haul, ...\n", "\n", "============================================================\n", "File: 3_1__InterdependenciesNNA\n", "============================================================\n", "\n", "QC Pete and Bill\n", "Tue, Sep 13, 2022 8:49AM • 1:19:15\n", "SUMMARY KEYWORDS\n", "water, people, piped, trucks, pipe, problem, 
homeowner, big, house, pay, permafrost, sewer, tank, plant, years, building, gallons,...\n", "\n", "============================================================\n", "File: 3_2__InterdependenciesNNA\n", "============================================================\n", "\n", "QC Operator Richard\n", "Mon, Aug 08, 2022 1:13AM • 1:30:30\n", "SUMMARY KEYWORDS\n", "operators, test, plant, water, bethel, people, system, anchorage, communities, pay, licenses, alaska, dutch harbor, run, dec, k...\n", "\n", "============================================================\n", "File: 3_3__InterdependenciesNNA\n", "============================================================\n", "Theo\n", "Wed, Aug 24, 2022 8:19AM • 11:01\n", "SUMMARY KEYWORDS\n", "driver, cdl, inaudible, water, work, village, pay, bethel, training, piped, driving, umm, job, overflow pipe, winter, places, tank, truck driver,...\n" ] } ], "source": [ "# Cell 5 - Display preview of loaded transcripts\n", "for filename, content in documents.items():\n", "    print(f'\\n{\"=\"*60}')\n", "    print(f'File: {filename}')\n", "    print(f'{\"=\"*60}')\n", "    print(content[:200] + '...' 
if len(content) > 200 else content)\n", "\n", "\n", "\n", " " ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "🔧 RECREATING PROCESSED TEXTS & DICTIONARY\n", "================================================================================\n", "\n", "Step 1: Checking requirements...\n", "--------------------------------------------------------------------------------\n", "✓ Found documents: 9 files\n", "\n", "Step 2: Processing documents...\n", "--------------------------------------------------------------------------------\n", " ✓ 1_1_InterdependenciesNNA: 5129 tokens\n", " ✓ 1_2_InterdependenciesNNA: 5012 tokens\n", " ✓ 1_3_InterdependenciesNNA: 4875 tokens\n", " ✓ 1_4_InterdependenciesNNA: 4951 tokens\n", " ✓ 1_5__InterdependenciesNNA: 6250 tokens\n", " ✓ 1_6__InterdependenciesNNA: 5600 tokens\n", " ✓ 3_1__InterdependenciesNNA: 6991 tokens\n", " ✓ 3_2__InterdependenciesNNA: 8571 tokens\n", " ✓ 3_3__InterdependenciesNNA: 1067 tokens\n", "\n", "✅ Processed 9 documents\n", "\n", "Step 3: Creating dictionary...\n", "--------------------------------------------------------------------------------\n", "✅ Dictionary created!\n", " • Unique tokens: 1418\n", " • Documents: 9\n", " • Sample words: 80s, abc, absolutely, access, according, acronyms, added, address, addressing, adjust, adjustments, administrative, administrator, advisory, advocate, advocates, affects, affirmative, afford, age\n", "\n", "Step 4: Creating corpus...\n", "--------------------------------------------------------------------------------\n", "✅ Corpus created!\n", " • Documents: 9\n", " • Average unique words per doc: 418.7\n", "\n", "================================================================================\n", "✅ TEXT PROCESSING COMPLETE\n", "================================================================================\n", "\n", "Variables created:\n", " ✓ processed_texts (list of 9 documents)\n", " ✓ 
dictionary (1418 unique tokens)\n", " ✓ corpus (9 documents)\n", "\n", "💡 NEXT STEPS:\n", " ✅ You can now:\n", " • Run the Single Transcript Overlay cell\n", " • Create new LDA model if needed\n", " • Run visualization cells\n", "\n", "================================================================================\n" ] } ], "source": [ "# Cell 6 - Recreate processed texts from documents\n", "\n", "# Use this if you have 'documents' but are missing 'processed_texts'.\n", "# This processes your documents and creates the dictionary and corpus.\n", "\n", "print(\"🔧 RECREATING PROCESSED TEXTS & DICTIONARY\")\n", "print(\"=\"*80 + \"\\n\")\n", "\n", "# ==========================================\n", "# 1. CHECK REQUIREMENTS\n", "# ==========================================\n", "print(\"Step 1: Checking requirements...\")\n", "print(\"-\"*80)\n", "\n", "if 'documents' not in globals():\n", "    print(\"❌ ERROR: 'documents' variable not found!\")\n", "    print(\" You need to run the data loading cells first\")\n", "else:\n", "    print(f\"✓ Found documents: {len(documents)} files\")\n", "    print()\n", "\n", "# ==========================================\n", "# 2. 
PROCESS TEXTS\n", "# ==========================================\n", "if 'documents' in globals():\n", "    print(\"Step 2: Processing documents...\")\n", "    print(\"-\"*80)\n", "\n", "    import re\n", "    import string\n", "\n", "    # Basic stopword list, defined once outside the loop. It is named\n", "    # basic_stopwords so it does not shadow nltk.corpus.stopwords from Cell 2.\n", "    basic_stopwords = {\n", "        'the', 'and', 'for', 'are', 'but', 'not', 'you', 'all', 'can',\n", "        'her', 'was', 'one', 'our', 'out', 'this', 'that', 'with', 'have',\n", "        'from', 'they', 'been', 'were', 'said', 'what', 'when', 'your',\n", "        'more', 'will', 'there', 'their', 'about', 'which', 'into', 'than',\n", "        'them', 'would', 'could', 'should', 'who', 'has', 'had', 'how'\n", "    }\n", "\n", "    processed_texts = []\n", "\n", "    for doc_name, doc_text in documents.items():\n", "        # Clean text\n", "        text = doc_text.lower()\n", "\n", "        # Remove punctuation\n", "        text = text.translate(str.maketrans('', '', string.punctuation))\n", "\n", "        # Remove extra whitespace\n", "        text = re.sub(r'\\s+', ' ', text)\n", "\n", "        # Tokenize (simple split)\n", "        tokens = text.split()\n", "\n", "        # Remove short words and numbers\n", "        tokens = [word for word in tokens if len(word) > 2 and not word.isdigit()]\n", "\n", "        # Remove common stopwords (basic list)\n", "        tokens = [word for word in tokens if word not in basic_stopwords]\n", "\n", "        processed_texts.append(tokens)\n", "        print(f\" ✓ {doc_name}: {len(tokens)} tokens\")\n", "\n", "    print(f\"\\n✅ Processed {len(processed_texts)} documents\")\n", "    print()\n", "\n", "# ==========================================\n", "# 3. 
CREATE DICTIONARY\n", "# ==========================================\n", "if 'processed_texts' in globals():\n", "    print(\"Step 3: Creating dictionary...\")\n", "    print(\"-\"*80)\n", "\n", "    try:\n", "        from gensim.corpora import Dictionary\n", "\n", "        dictionary = Dictionary(processed_texts)\n", "\n", "        # Filter extremes (optional but recommended):\n", "        # drop words that appear in fewer than 2 documents or in more than 50% of documents\n", "        dictionary.filter_extremes(no_below=2, no_above=0.5)\n", "\n", "        print(f\"✅ Dictionary created!\")\n", "        print(f\" • Unique tokens: {len(dictionary)}\")\n", "        print(f\" • Documents: {len(processed_texts)}\")\n", "\n", "        # Show sample vocabulary\n", "        sample_words = list(dictionary.token2id.keys())[:20]\n", "        print(f\" • Sample words: {', '.join(sample_words)}\")\n", "        print()\n", "\n", "    except ImportError:\n", "        print(\"❌ ERROR: gensim not installed!\")\n", "        print(\" Install with: pip install gensim\")\n", "        print()\n", "    except Exception as e:\n", "        print(f\"❌ ERROR creating dictionary: {e}\")\n", "        print()\n", "\n", "# ==========================================\n", "# 4. CREATE CORPUS\n", "# ==========================================\n", "if 'dictionary' in globals() and 'processed_texts' in globals():\n", "    print(\"Step 4: Creating corpus...\")\n", "    print(\"-\"*80)\n", "\n", "    try:\n", "        # Convert each token list to a bag-of-words (token_id, count) list\n", "        corpus = [dictionary.doc2bow(text) for text in processed_texts]\n", "\n", "        print(f\"✅ Corpus created!\")\n", "        print(f\" • Documents: {len(corpus)}\")\n", "        print(f\" • Average unique words per doc: {sum(len(doc) for doc in corpus)/len(corpus):.1f}\")\n", "        print()\n", "\n", "    except Exception as e:\n", "        print(f\"❌ ERROR creating corpus: {e}\")\n", "        print()\n", "\n", "# ==========================================\n", "# 5. 
VERIFY\n", "# ==========================================\n", "print(\"=\"*80)\n", "print(\"✅ TEXT PROCESSING COMPLETE\")\n", "print(\"=\"*80 + \"\\n\")\n", "\n", "print(\"Variables created:\")\n", "if 'processed_texts' in globals():\n", "    print(f\" ✓ processed_texts (list of {len(processed_texts)} documents)\")\n", "else:\n", "    print(\" ✗ processed_texts\")\n", "\n", "if 'dictionary' in globals():\n", "    print(f\" ✓ dictionary ({len(dictionary)} unique tokens)\")\n", "else:\n", "    print(\" ✗ dictionary\")\n", "\n", "if 'corpus' in globals():\n", "    print(f\" ✓ corpus ({len(corpus)} documents)\")\n", "else:\n", "    print(\" ✗ corpus\")\n", "\n", "print(\"\\n💡 NEXT STEPS:\")\n", "if 'dictionary' in globals() and 'corpus' in globals():\n", "    print(\" ✅ You can now:\")\n", "    print(\" • Run the Single Transcript Overlay cell\")\n", "    print(\" • Create new LDA model if needed\")\n", "    print(\" • Run visualization cells\")\n", "else:\n", "    print(\" ⚠️ Some variables still missing\")\n", "    print(\" Check error messages above\")\n", "\n", "print(\"\\n\" + \"=\"*80)" ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "✓ Transcript Statistics:\n", "\n", " File Characters Words Sentences\n", " 1_1_InterdependenciesNNA 54164 9376 678\n", " 1_2_InterdependenciesNNA 51085 9628 708\n", " 1_3_InterdependenciesNNA 51819 9274 600\n", " 1_4_InterdependenciesNNA 49421 9265 668\n", "1_5__InterdependenciesNNA 66181 12225 767\n", "1_6__InterdependenciesNNA 58020 10634 747\n", "3_1__InterdependenciesNNA 71608 13299 990\n", "3_2__InterdependenciesNNA 89543 16463 1247\n", "3_3__InterdependenciesNNA 10621 1971 222\n" ] } ], "source": [ "# Cell 7 - Calculate basic statistics for each transcript\n", "import pandas as pd\n", "\n", "stats = []\n", "for filename, content in documents.items():\n", "    stats.append({\n", "        'File': filename,\n", "        'Characters': len(content),\n", "        'Words': len(content.split()),\n", "        
'Sentences': len(sent_tokenize(content))\n", "    })\n", "\n", "stats_df = pd.DataFrame(stats)\n", "print('✓ Transcript Statistics:\\n')\n", "print(stats_df.to_string(index=False))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Step 3: Discover Topics\n", "\n", "**What is Topic Modeling?**\n", "\n", "Topic modeling automatically identifies themes in your documents.\n", "It groups words that frequently appear together into topics.\n", "\n", "**Example:** If \"flooding\", \"water\", and \"drainage\" often appear together, the topic might be about coastal hydrology.\n", "\n", "**How it works:**\n", "1. Break text into words\n", "2. Find patterns of co-occurring words\n", "3. Group related words into topics\n", "4. Assign topics to documents" ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "🔍 DATA VERIFICATION & CACHE CHECK\n", "================================================================================\n", "\n", "📄 CHECKING LOADED DOCUMENTS\n", "--------------------------------------------------------------------------------\n", "✓ Found 'documents' variable with 9 files\n", "\n", "Documents loaded:\n", " 1. 1_1_InterdependenciesNNA: 9,376 words\n", " ✓ Contains Alaska-related terms: 21 mentions\n", " 2. 1_2_InterdependenciesNNA: 9,628 words\n", " ⚠️ Contains Texas-related terms: 6 mentions\n", " ✓ Contains Alaska-related terms: 5 mentions\n", " 3. 1_3_InterdependenciesNNA: 9,274 words\n", " ⚠️ Contains Texas-related terms: 4 mentions\n", " ✓ Contains Alaska-related terms: 33 mentions\n", " 4. 1_4_InterdependenciesNNA: 9,265 words\n", " ✓ Contains Alaska-related terms: 3 mentions\n", " 5. 1_5__InterdependenciesNNA: 12,225 words\n", " ⚠️ Contains Texas-related terms: 4 mentions\n", " ✓ Contains Alaska-related terms: 11 mentions\n", " 6. 1_6__InterdependenciesNNA: 10,634 words\n", " ⚠️ Contains Texas-related terms: 1 mentions\n", " ✓ Contains Alaska-related terms: 9 mentions\n", " 7. 
3_1__InterdependenciesNNA: 13,299 words\n", " ⚠️ Contains Texas-related terms: 3 mentions\n", " ✓ Contains Alaska-related terms: 19 mentions\n", " 8. 3_2__InterdependenciesNNA: 16,463 words\n", " ✓ Contains Alaska-related terms: 21 mentions\n", " 9. 3_3__InterdependenciesNNA: 1,971 words\n", "\n", "📊 Overall Assessment:\n", " • Total Texas-related mentions: 18\n", " • Total Alaska-related mentions: 122\n", "\n", " ✓ Data appears to be Alaska-focused (correct!)\n", "\n", "================================================================================\n", "🔗 CHECKING SVO MAPPINGS\n", "--------------------------------------------------------------------------------\n", "ℹ️ No 'svo_mappings' variable found (may not be created yet)\n", "\n", "================================================================================\n", "📋 CHECKING CASE STUDY SETTINGS\n", "--------------------------------------------------------------------------------\n", "ℹ️ CASE_STUDY_NAME not set\n", "\n", "================================================================================\n", "🗑️ CACHE CLEARING\n", "--------------------------------------------------------------------------------\n", "\n", "Variables that might contain cached data:\n", " ○ svo_mappings not found\n", " ○ decision_components not found\n", " ○ topic_mappings not found\n", " ○ science_backbone_final not found\n", " ○ analysis_summary not found\n", "\n", "💡 TO CLEAR CACHED DATA:\n", "Run this code in a new cell:\n", "\n", "# Clear potentially cached variables\n", "for var in ['svo_mappings', 'decision_components', 'topic_mappings', \n", " 'science_backbone_final', 'analysis_summary']:\n", " if var in globals():\n", " del globals()[var]\n", " print(f\"Cleared: {var}\")\n", "\n", "print(\"\\n✓ Cache cleared! 
Re-run analysis cells from the beginning.\")\n", "\n", "\n", "================================================================================\n", "💡 RECOMMENDATIONS\n", "================================================================================\n", "\n", "✅ No major issues detected\n", " Data appears to be correctly loaded for your case study\n", "\n", "================================================================================\n", "✅ VERIFICATION COMPLETE\n", "================================================================================\n" ] } ], "source": [ "# Cell 8\n", "\"\"\"Data Verification & Cache Check\n", "\n", "Run this cell to verify which data is actually loaded and clear any cached variables.\n", "Use this when you suspect wrong data is being analyzed.\n", "\"\"\"\n", "\n", "print(\"🔍 DATA VERIFICATION & CACHE CHECK\")\n", "print(\"=\"*80 + \"\\n\")\n", "\n", "import pandas as pd\n", "from collections import Counter\n", "\n", "# ==========================================\n", "# 1. CHECK LOADED DOCUMENTS\n", "# ==========================================\n", "print(\"📄 CHECKING LOADED DOCUMENTS\")\n", "print(\"-\"*80)\n", "\n", "if 'documents' in globals():\n", "    print(f\"✓ Found 'documents' variable with {len(documents)} files\\n\")\n", "    print(\"Documents loaded:\")\n", "    for i, doc_name in enumerate(documents.keys(), 1):\n", "        # Get word count\n", "        word_count = len(documents[doc_name].split())\n", "        print(f\" {i}. 
{doc_name}: {word_count:,} words\")\n", "\n", "        # Check for Texas- and Alaska-specific terms (simple substring counts)\n", "        doc_lower = documents[doc_name].lower()\n", "        texas_terms = ['texas', 'tceq', 'twdb', 'rio grande', 'houston', 'austin', 'dallas']\n", "        alaska_terms = ['alaska', 'permafrost', 'nunapitchuk', 'yukon', 'kuskokwim', 'inuit', 'yupik']\n", "\n", "        texas_count = sum(doc_lower.count(term) for term in texas_terms)\n", "        alaska_count = sum(doc_lower.count(term) for term in alaska_terms)\n", "\n", "        if texas_count > 0:\n", "            print(f\" ⚠️ Contains Texas-related terms: {texas_count} mentions\")\n", "        if alaska_count > 0:\n", "            print(f\" ✓ Contains Alaska-related terms: {alaska_count} mentions\")\n", "\n", "    # Overall assessment\n", "    print(\"\\n📊 Overall Assessment:\")\n", "    all_text = ' '.join(documents.values()).lower()\n", "    total_texas = sum(all_text.count(term) for term in texas_terms)\n", "    total_alaska = sum(all_text.count(term) for term in alaska_terms)\n", "\n", "    print(f\" • Total Texas-related mentions: {total_texas}\")\n", "    print(f\" • Total Alaska-related mentions: {total_alaska}\")\n", "\n", "    if total_texas > total_alaska:\n", "        print(\"\\n ⚠️ WARNING: Data appears to be Texas-focused, not Alaska!\")\n", "        print(\" ACTION: You may have loaded the wrong dataset.\")\n", "    elif total_alaska > total_texas:\n", "        print(\"\\n ✓ Data appears to be Alaska-focused (correct!)\")\n", "    else:\n", "        print(\"\\n ℹ️ Mixed or unclear geographic focus\")\n", "else:\n", "    print(\"❌ No 'documents' variable found\")\n", "    print(\" ACTION: Run the data loading cells first\")\n", "\n", "# ==========================================\n", "# 2. 
CHECK SVO MAPPINGS\n", "# ==========================================\n", "print(\"\\n\" + \"=\"*80)\n", "print(\"🔗 CHECKING SVO MAPPINGS\")\n", "print(\"-\"*80)\n", "\n", "if 'svo_mappings' in globals():\n", " if isinstance(svo_mappings, pd.DataFrame):\n", " print(f\"✓ Found SVO mappings: {len(svo_mappings)} entries\\n\")\n", " \n", " # Check for TCEQ\n", " if 'source' in svo_mappings.columns:\n", " tceq_entries = svo_mappings[svo_mappings['source'].str.contains('TCEQ', case=False, na=False)]\n", " if len(tceq_entries) > 0:\n", " print(f\"⚠️ Found {len(tceq_entries)} TCEQ-related SVO entries\")\n", " print(\"\\nTCEQ entries:\")\n", " for idx, row in tceq_entries.head(5).iterrows():\n", " print(f\" • {row.get('svo_id', 'N/A')}: {row.get('source', 'N/A')}\")\n", " if len(tceq_entries) > 5:\n", " print(f\" ... and {len(tceq_entries)-5} more\")\n", " print(\"\\n ⚠️ WARNING: TCEQ is Texas-specific and shouldn't appear in Alaska data!\")\n", " \n", " # Check document sources\n", " if 'document' in svo_mappings.columns:\n", " print(\"\\nSVO mappings by document:\")\n", " doc_counts = svo_mappings['document'].value_counts()\n", " for doc, count in doc_counts.head(10).items():\n", " print(f\" • {doc}: {count} SVOs\")\n", " \n", " # Check SVO IDs for Texas-specific patterns\n", " if 'svo_id' in svo_mappings.columns:\n", " texas_svos = svo_mappings[svo_mappings['svo_id'].str.contains('texas|tceq|twdb|rio_grande', case=False, na=False)]\n", " if len(texas_svos) > 0:\n", " print(f\"\\n⚠️ Found {len(texas_svos)} Texas-specific SVO IDs\")\n", " print(\"\\nTexas SVO IDs:\")\n", " for svo_id in texas_svos['svo_id'].unique()[:10]:\n", " print(f\" • {svo_id}\")\n", " elif isinstance(svo_mappings, list):\n", " print(f\"✓ Found SVO mappings: {len(svo_mappings)} entries (list format)\")\n", " else:\n", " print(f\"✓ Found SVO mappings (unknown format)\")\n", "else:\n", " print(\"ℹ️ No 'svo_mappings' variable found (may not be created yet)\")\n", "\n", "# 
==========================================\n", "# 3. CHECK CASE STUDY NAME\n", "# ==========================================\n", "print(\"\\n\" + \"=\"*80)\n", "print(\"📋 CHECKING CASE STUDY SETTINGS\")\n", "print(\"-\"*80)\n", "\n", "if 'CASE_STUDY_NAME' in globals():\n", " print(f\"Case Study Name: {CASE_STUDY_NAME}\")\n", " \n", " if 'texas' in CASE_STUDY_NAME.lower() or 'tceq' in CASE_STUDY_NAME.lower():\n", " print(\"⚠️ Case study name suggests Texas focus\")\n", " elif 'alaska' in CASE_STUDY_NAME.lower():\n", " print(\"✓ Case study name suggests Alaska focus\")\n", "else:\n", " print(\"ℹ️ CASE_STUDY_NAME not set\")\n", "\n", "# Check output directory\n", "if 'OUTPUT_DIR' in globals():\n", " print(f\"\\nOutput Directory: {OUTPUT_DIR}\")\n", " if OUTPUT_DIR.exists():\n", " csv_files = list(OUTPUT_DIR.glob('*.csv'))\n", " md_files = list(OUTPUT_DIR.glob('*.md'))\n", " print(f\" • Contains {len(csv_files)} CSV files\")\n", " print(f\" • Contains {len(md_files)} Markdown files\")\n", " \n", " # Check if files have Texas/Alaska indicators\n", " for f in csv_files[:5]:\n", " if 'texas' in f.name.lower() or 'tceq' in f.name.lower():\n", " print(f\" ⚠️ {f.name} (Texas indicator)\")\n", " elif 'alaska' in f.name.lower():\n", " print(f\" ✓ {f.name} (Alaska indicator)\")\n", "\n", "# ==========================================\n", "# 4. 
CACHE CLEARING OPTION\n", "# ==========================================\n", "print(\"\\n\" + \"=\"*80)\n", "print(\"🗑️ CACHE CLEARING\")\n", "print(\"-\"*80)\n", "\n", "print(\"\\nVariables that might contain cached data:\")\n", "cached_vars = [\n", " 'svo_mappings', 'decision_components', 'topic_mappings', \n", " 'science_backbone_final', 'analysis_summary'\n", "]\n", "\n", "for var in cached_vars:\n", " if var in globals():\n", " print(f\" ✓ {var} exists\")\n", " else:\n", " print(f\" ○ {var} not found\")\n", "\n", "print(\"\\n💡 TO CLEAR CACHED DATA:\")\n", "print(\"Run this code in a new cell:\")\n", "print(\"\"\"\n", "# Clear potentially cached variables\n", "for var in ['svo_mappings', 'decision_components', 'topic_mappings', \n", " 'science_backbone_final', 'analysis_summary']:\n", " if var in globals():\n", " del globals()[var]\n", " print(f\"Cleared: {var}\")\n", "\n", "print(\"\\\\n✓ Cache cleared! Re-run analysis cells from the beginning.\")\n", "\"\"\")\n", "\n", "# ==========================================\n", "# 5. 
RECOMMENDATIONS\n", "# ==========================================\n", "print(\"\\n\" + \"=\"*80)\n", "print(\"💡 RECOMMENDATIONS\")\n", "print(\"=\"*80)\n", "\n", "issues_found = []\n", "\n", "# Check for data mismatch\n", "if 'documents' in globals():\n", " all_text = ' '.join(documents.values()).lower()\n", " total_texas = sum(all_text.count(term) for term in ['texas', 'tceq', 'twdb'])\n", " total_alaska = sum(all_text.count(term) for term in ['alaska', 'permafrost', 'nunapitchuk'])\n", " \n", " if total_texas > total_alaska and 'alaska' in str(CASE_STUDY_NAME).lower() if 'CASE_STUDY_NAME' in globals() else False:\n", " issues_found.append(\"DATA MISMATCH: Case study is Alaska but data contains Texas terms\")\n", "\n", "# Check for TCEQ in Alaska context\n", "if 'svo_mappings' in globals() and isinstance(svo_mappings, pd.DataFrame):\n", " if 'source' in svo_mappings.columns:\n", " tceq_count = len(svo_mappings[svo_mappings['source'].str.contains('TCEQ', case=False, na=False)])\n", " if tceq_count > 0 and 'alaska' in str(CASE_STUDY_NAME).lower() if 'CASE_STUDY_NAME' in globals() else False:\n", " issues_found.append(f\"CONTAMINATION: {tceq_count} TCEQ entries found in Alaska analysis\")\n", "\n", "if issues_found:\n", " print(\"\\n⚠️ ISSUES DETECTED:\")\n", " for i, issue in enumerate(issues_found, 1):\n", " print(f\" {i}. {issue}\")\n", " \n", " print(\"\\n🔧 RECOMMENDED ACTIONS:\")\n", " print(\" 1. Verify correct data files are loaded\")\n", " print(\" 2. Clear cached variables (see code above)\")\n", " print(\" 3. Restart kernel: Kernel → Restart & Clear Output\")\n", " print(\" 4. Re-run from Cell 1 with correct data files\")\n", " print(\" 5. 
Check that CASE_STUDY_NAME matches your actual study\")\n", "else:\n", " print(\"\\n✅ No major issues detected\")\n", " print(\" Data appears to be correctly loaded for your case study\")\n", "\n", "print(\"\\n\" + \"=\"*80)\n", "print(\"✅ VERIFICATION COMPLETE\")\n", "print(\"=\"*80)" ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [], "source": [ "#Optional CELL 8 part 2\n", "#OPTIONAL DATA CACHE CLEAR IF CONCERNS AFTER DATA VERIFICATION CHECK\n", "# Clear potentially cached variables\n", "#for var in ['svo_mappings', 'decision_components', 'topic_mappings', \n", " # 'science_backbone_final', 'analysis_summary']:\n", " # if var in globals():\n", " # del globals()[var]\n", " # print(f\"Cleared: {var}\")\n", "\n", "#print(\"\\n✓ Cache cleared! Re-run analysis cells from the beginning.\")" ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Analysis Parameters:\n", " • Number of topics: 25\n", " • Vocabulary size: 2000\n", " • Keywords per topic: 6\n" ] } ], "source": [ "#Cell 9 # Set analysis parameters\n", "n_topics = 25 # Change this number to discover more or fewer topics\n", "max_vocabulary = 2000 # Maximum number of terms to consider\n", "top_words_display = 6 # Number of keywords to show per topic\n", "\n", "print(f'Analysis Parameters:')\n", "print(f' • Number of topics: {n_topics}')\n", "print(f' • Vocabulary size: {max_vocabulary}')\n", "print(f' • Keywords per topic: {top_words_display}')" ] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "✓ Text preprocessing complete\n", "\n", "Example: leif i don t know if i jennifer i just started it leif all right and just like any research project as you know you can stop at any point if you don t...\n" ] } ], "source": [ "#Cell 10 # Preprocess text\n", "def preprocess_text(text):\n", " text = text.lower()\n", " text = 
re.sub(r'[^a-z\\s]', ' ', text)\n", " text = ' '.join(text.split())\n", " return text\n", "\n", "processed_docs = [preprocess_text(text) for text in documents.values()]\n", "doc_names = list(documents.keys())\n", "\n", "print('✓ Text preprocessing complete')\n", "print(f'\\nExample: {processed_docs[0][:150]}...')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "##BEGIN MODELED ANALYSES" ] }, { "cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Discovering topics...\n", "\n", "🔄 MODEL 1: BASELINE (minimal preprocessing)\n", "============================================================\n", " → Preprocessing documents...\n", " ✓ Processed 9 documents\n", " → Creating vectorizer...\n", " → Building document-term matrix...\n", " ✓ Matrix shape: (9, 100)\n", " → Running LDA with 25 topics...\n", " ✓ Model trained!\n", "\n", " MODEL 1 COMPLETE\n", " Topics: 25\n", " Vocabulary: 100 terms\n", " Sample words: able, actually, alaska, albertson, albertson yeah\n", " Stored in: model_results[0]\n", "============================================================\n", "\n" ] } ], "source": [ "# CELL 11: Model 1 - Baseline Topic Discovery\n", "print('Discovering topics...\\n')\n", "print(\"🔄 MODEL 1: BASELINE (minimal preprocessing)\")\n", "print(\"=\"*60)\n", "\n", "# Basic preprocessing\n", "def preprocess_basic(text):\n", " text = text.lower()\n", " text = re.sub(r'[^a-z\\s]', ' ', text)\n", " text = ' '.join(text.split())\n", " return text\n", "\n", "print(\" → Preprocessing documents...\")\n", "processed_docs = [preprocess_basic(text) for text in documents.values()]\n", "print(f\" ✓ Processed {len(processed_docs)} documents\")\n", "\n", "# Vectorization\n", "print(\" → Creating vectorizer...\")\n", "vectorizer = TfidfVectorizer(\n", " max_features=100,\n", " stop_words='english',\n", " ngram_range=(1, 2),\n", " min_df=2,\n", " max_df=0.9\n", ")\n", "\n", "print(\" → Building document-term 
matrix...\")\n", "doc_term_matrix = vectorizer.fit_transform(processed_docs)\n", "print(f\" ✓ Matrix shape: {doc_term_matrix.shape}\")\n", "\n", "# Topic modeling\n", "n_topics = 25\n", "print(f\" → Running LDA with {n_topics} topics...\")\n", "lda_model = LatentDirichletAllocation(\n", "    n_components=n_topics,\n", "    max_iter=50,\n", "    learning_method='online',\n", "    random_state=42,\n", "    n_jobs=-1\n", ")\n", "\n", "doc_topic_dist = lda_model.fit_transform(doc_term_matrix)\n", "print(f\" ✓ Model trained!\")\n", "\n", "# Store in collection (initialize if it doesn't exist)\n", "if 'model_results' not in globals():\n", "    model_results = []\n", "\n", "# Drop any earlier run of this model so re-running the cell does not add a duplicate\n", "model_results[:] = [m for m in model_results if m['name'] != 'Model 1 (Baseline)']\n", "\n", "# Save this model's results\n", "model_results.append({\n", "    'name': 'Model 1 (Baseline)',\n", "    'lda_model': lda_model,\n", "    'vectorizer': vectorizer,\n", "    'doc_term_matrix': doc_term_matrix,\n", "    'doc_topic_dist': doc_topic_dist,\n", "    'feature_names': vectorizer.get_feature_names_out(),\n", "    'processed_docs': processed_docs,\n", "    'n_topics': n_topics\n", "})\n", "\n", "print(f\"\\n MODEL 1 COMPLETE\")\n", "print(f\" Topics: {n_topics}\")\n", "print(f\" Vocabulary: {len(vectorizer.get_feature_names_out())} terms\")\n", "print(f\" Sample words: {', '.join(list(vectorizer.get_feature_names_out()[:5]))}\")\n", "print(f\" Stored in: model_results[{len(model_results)-1}]\")\n", "print(\"=\"*60 + \"\\n\")" ] }, { "cell_type": "code", "execution_count": 13, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "🔄 MODEL 2: ENHANCED PREPROCESSING\n", "============================================================\n", " → Preprocessing documents...\n", " ✓ Processed 9 documents\n", " → Creating vectorizer with enhanced stopwords...\n", " → Building document-term matrix...\n", " ✓ Matrix shape: (9, 2000)\n", " → Running LDA with 25 topics...\n", " ✓ Model trained!\n", "\n", "✅ MODEL 2 COMPLETE\n", " Topics: 25\n", " Vocabulary: 2000 terms\n", " Stored in: model_results[2]\n", 
"============================================================\n", "\n" ] } ], "source": [ "# Cell 12 - Model 2\n", "\"\"\"Model 2 - Enhanced Stop Words + Name Filtering\n", "\n", "Does not require the NLTK stopwords module.\n", "Uses a comprehensive built-in stopwords list instead.\n", "\"\"\"\n", "print(\"🔄 MODEL 2: ENHANCED PREPROCESSING\")\n", "print(\"=\"*60)\n", "\n", "# Same text cleaning as the baseline; the enhancement is the stopword list below\n", "def preprocess_enhanced(text):\n", "    text = text.lower()\n", "    text = re.sub(r'[^a-z\\s]', ' ', text)\n", "    text = ' '.join(text.split())\n", "    return text\n", "\n", "print(\" → Preprocessing documents...\")\n", "processed_docs = [preprocess_enhanced(text) for text in documents.values()]\n", "print(f\" ✓ Processed {len(processed_docs)} documents\")\n", "\n", "# Comprehensive stopwords\n", "comprehensive_stopwords = {\n", "    # Common English stopwords\n", "    'the', 'and', 'for', 'are', 'but', 'not', 'you', 'all', 'can',\n", "    'her', 'was', 'one', 'our', 'out', 'this', 'that', 'with', 'have',\n", "    'from', 'they', 'been', 'were', 'said', 'what', 'when', 'your',\n", "    'more', 'will', 'there', 'their', 'about', 'which', 'into', 'than',\n", "    'them', 'would', 'could', 'should', 'who', 'has', 'had', 'how',\n", "    'its', 'may', 'these', 'some', 'such', 'only', 'other', 'any',\n", "    'most', 'also', 'very', 'even', 'just', 'like', 'both', 'each',\n", "    'did', 'does', 'his', 'she', 'him', 'well', 'many', 'much',\n", "    'where', 'here', 'now', 'then', 'because', 'before', 'after',\n", "    'through', 'during', 'without', 'within', 'being', 'under', 'over',\n", "    'again', 'further', 'once', 'why', 'while', 'same', 'those', 'own',\n", "    'too', 'off', 'down', 'upon', 'between', 'few', 'above', 'below',\n", "    'doing', 'an', 'as', 'at', 'be', 'by', 'he', 'if', 'in', 'is',\n", "    'it', 'me', 'my', 'no', 'of', 'on', 'or', 'so', 'to', 'up', 'we', 'cool', 'white',\n", "\n", "    # 
Filler words / conversational\n", "    'yeah', 'okay', 'um', 'uh', 'hmm', 'oh', 'ah',\n", "    'know', 'think', 'going', 'got', 'get', 'let',\n", "    'see', 'want', 'make', 'really', 'lot', 'kind',\n", "    'sort', 'thing', 'things', 'stuff', 'actually',\n", "    'basically', 'literally', 'probably', 'maybe',\n", "    'guess', 'mean', 'means', 'supposed', 'trying',\n", "\n", "    # Interview-related words\n", "    'interviewer', 'interviewee', 'question', 'answer',\n", "    'ask', 'asked', 'asking', 'tell', 'told', 'telling',\n", "    'talk', 'talked', 'talking', 'discuss', 'discussed',\n", "    'say', 'says', 'saying', 'read', 'reading',\n", "\n", "    # Common names (customize for your interviews)\n", "    'arnold', 'nikki', 'ritsch', 'pete', 'williams', 'richard',\n", "    'jennifer', 'clyde', 'jason', 'robert', 'michael', 'bob',\n", "    'bill', 'albertson', 'leif', 'john', 'mary', 'james', 'linda',\n", "    'david', 'susan', 'thomas', 'nancy', 'charles', 'karen', 'lauryn', 'spearing'\n", "}\n", "\n", "print(\" → Creating vectorizer with enhanced stopwords...\")\n", "vectorizer = TfidfVectorizer(\n", "    max_features=2000,\n", "    stop_words=list(comprehensive_stopwords),\n", "    ngram_range=(1, 2),\n", "    min_df=2,\n", "    max_df=0.9\n", ")\n", "\n", "print(\" → Building document-term matrix...\")\n", "doc_term_matrix = vectorizer.fit_transform(processed_docs)\n", "print(f\" ✓ Matrix shape: {doc_term_matrix.shape}\")\n", "\n", "# Topic modeling\n", "n_topics = 25\n", "print(f\" → Running LDA with {n_topics} topics...\")\n", "lda_model = LatentDirichletAllocation(\n", "    n_components=n_topics,\n", "    max_iter=50,\n", "    learning_method='online',\n", "    random_state=42,\n", "    n_jobs=-1\n", ")\n", "\n", "doc_topic_dist = lda_model.fit_transform(doc_term_matrix)\n", "print(f\" ✓ Model trained!\")\n", "\n", "# Add to collection\n", "if 'model_results' not in globals():\n", "    model_results = []\n", "\n", "# Drop any earlier run of this model so re-running the cell does not add a duplicate\n", "model_results[:] = [m for m in model_results if m['name'] != 'Model 2 (Enhanced)']\n", "\n", "model_results.append({\n", "    'name': 'Model 2 (Enhanced)',\n", "    'lda_model': 
lda_model,\n", "    'vectorizer': vectorizer,\n", "    'doc_term_matrix': doc_term_matrix,\n", "    'doc_topic_dist': doc_topic_dist,\n", "    'feature_names': vectorizer.get_feature_names_out(),\n", "    'processed_docs': processed_docs,\n", "    'n_topics': n_topics\n", "})\n", "\n", "print(f\"\\n✅ MODEL 2 COMPLETE\")\n", "print(f\" Topics: {n_topics}\")\n", "print(f\" Vocabulary: {len(vectorizer.get_feature_names_out())} terms\")\n", "print(f\" Stored in: model_results[{len(model_results)-1}]\")\n", "print(\"=\"*60 + \"\\n\")\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Topics vs. Domain Terms\n", "\n", "- **Topics**: clusters of words that LDA discovers automatically from *your* documents.\n", "- **Domain terms**: predefined keywords that *we* specified because they relate to science and engineering domains.\n", "\n", "The domain relevance score is the share of a topic's keywords that match our predefined domain keywords.\n", "\n", "### Example\n", "\n", "Suppose LDA discovers this topic:\n", "\n", "- **Topic 3**: water, infrastructure, monitoring\n", "- **Keywords**: water, infrastructure, monitoring, system, flood, drainage, level, sensor, gauge, network\n", "- **Domain relevance**: 90% (9/10 domain terms)\n", "- **Domain terms** (partial list): water, infrastructure, monitoring, system, flood\n", "\n", "This means:\n", "\n", "- The topic is the cluster \"water, infrastructure, monitoring...\" found by LDA.\n", "- The domain terms in this topic are the keywords that match our predefined list.\n", "- 90% relevance means 9 out of 10 keywords are domain-relevant (not filler words).\n", "\n", "### The Workflow\n", "\n", "1. Start with your documents.\n", "2. LDA discovers topics automatically.\n", "3. Topics = [\"water, system, flood...\", \"planning, policy, development...\", etc.]\n", "4. We check: \"How many words in each topic are domain-relevant?\"\n", "5. The relevance score helps identify which topics are substantive and which are noise." ] }, { "cell_type": "code", "execution_count": 14, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "MODEL 
3: ALTERNATIVE WITH DOMAIN MATCH (more topics, stricter filtering)\n", "============================================================\n", " → Preprocessing documents with stricter cleaning...\n", " ✓ Processed 9 documents\n", " → Creating stricter vectorizer...\n", " → Building document-term matrix...\n", " ✓ Matrix shape: (9, 543)\n", " → Running LDA with 25 topics...\n", " ✓ Model trained!\n", "\n", " MODEL 3 COMPLETE\n", " Topics: 25\n", " Vocabulary: 543 terms\n", " Stored in: model_results[3]\n", "============================================================\n", "\n", "💡 Model 3 uses STRICTER filtering:\n", " • Longer words only (4+ characters)\n", " • Higher document frequency (4+ docs)\n", " • Lower max frequency (60% vs 90%)\n", " • Removes URLs, emails, numbers\n", " → Results in fewer but more focused topics\n" ] } ], "source": [ "# Cell 13 - Model 3\n", "# Use the preprocessing from the naive LDA approach to organically identify topics and build on it for domain matches\n", "print(\"MODEL 3: ALTERNATIVE WITH DOMAIN MATCH (more topics, stricter filtering)\")\n", "\n", "# Stricter text cleaning and stricter vectorizer parameters than Model 2.\n", "# Uses a built-in stopwords list (no NLTK dependency);\n", "# still relies on `documents` from the data loading cells.\n", "print(\"=\"*60)\n", "\n", "import re\n", "\n", "# Stricter preprocessing\n", "def preprocess_strict(text):\n", "    text = text.lower()\n", "    # Remove URLs\n", "    text = re.sub(r'http\\S+|www\\S+', '', text)\n", "    # Remove email addresses\n", "    text = re.sub(r'\\S+@\\S+', '', text)\n", "    # Remove numbers\n", "    text = re.sub(r'\\b\\d+\\b', '', text)\n", "    # Keep only letters and spaces\n", "    text = re.sub(r'[^a-z\\s]', ' ', text)\n", "    # Remove extra whitespace\n", "    text = ' '.join(text.split())\n", "    return text\n", "\n", "print(\" → Preprocessing documents with stricter cleaning...\")\n", "processed_docs = [preprocess_strict(text) for text in documents.values()]\n", "print(f\" ✓ Processed {len(processed_docs)} documents\")\n", "\n", "# 
Comprehensive stopwords\n", "comprehensive_stopwords = {\n", "    # Common English stopwords\n", "    'the', 'and', 'for', 'are', 'but', 'not', 'you', 'all', 'can',\n", "    'her', 'was', 'one', 'our', 'out', 'this', 'that', 'with', 'have',\n", "    'from', 'they', 'been', 'were', 'said', 'what', 'when', 'your',\n", "    'more', 'will', 'there', 'their', 'about', 'which', 'into', 'than',\n", "    'them', 'would', 'could', 'should', 'who', 'has', 'had', 'how',\n", "    'its', 'may', 'these', 'some', 'such', 'only', 'other', 'any',\n", "    'most', 'also', 'very', 'even', 'just', 'like', 'both', 'each',\n", "    'did', 'does', 'his', 'she', 'him', 'well', 'many', 'much',\n", "    'where', 'here', 'now', 'then', 'because', 'before', 'after',\n", "    'through', 'during', 'without', 'within', 'being', 'under', 'over',\n", "    'again', 'further', 'once', 'why', 'while', 'same', 'those', 'own',\n", "    'too', 'off', 'down', 'upon', 'between', 'few', 'above', 'below',\n", "    'doing', 'an', 'as', 'at', 'be', 'by', 'he', 'if', 'in', 'is',\n", "    'it', 'me', 'my', 'no', 'of', 'on', 'or', 'so', 'to', 'up', 'we', 'cool', 'white',\n", "\n", "    # Filler words / conversational\n", "    'yeah', 'okay', 'um', 'uh', 'hmm', 'oh', 'ah',\n", "    'know', 'think', 'going', 'got', 'get', 'let',\n", "    'see', 'want', 'make', 'really', 'lot', 'kind',\n", "    'sort', 'thing', 'things', 'stuff', 'actually',\n", "    'basically', 'literally', 'probably', 'maybe',\n", "    'guess', 'mean', 'means', 'supposed', 'trying',\n", "\n", "    # Interview-related words\n", "    'interviewer', 'interviewee', 'question', 'answer',\n", "    'ask', 'asked', 'asking', 'tell', 'told', 'telling',\n", "    'talk', 'talked', 'talking', 'discuss', 'discussed',\n", "    'say', 'says', 'saying', 'read', 'reading',\n", "\n", "    # Common names (customize for your interviews)\n", "    'arnold', 'nikki', 'ritsch', 'pete', 'williams', 'richard',\n", "    'jennifer', 'clyde', 'jason', 'robert', 'michael', 'bob',\n", "    'bill', 'albertson', 'leif', 'john', 'mary', 'james', 'linda',\n", 
"    'david', 'susan', 'thomas', 'nancy', 'charles', 'karen', 'lauryn', 'spearing'\n", "}\n", "\n", "print(\" → Creating stricter vectorizer...\")\n", "vectorizer = TfidfVectorizer(\n", "    max_features=2000,  # Same cap as Model 2; the stricter filters below shrink the vocabulary\n", "    stop_words=list(comprehensive_stopwords),\n", "    ngram_range=(1, 2),\n", "    min_df=4,  # Must appear in 4+ documents (stricter)\n", "    max_df=0.6,  # Can't appear in >60% of documents (stricter)\n", "    token_pattern=r'\\b[a-z]{4,}\\b'  # Only words with 4+ letters (stricter)\n", ")\n", "\n", "print(\" → Building document-term matrix...\")\n", "doc_term_matrix = vectorizer.fit_transform(processed_docs)\n", "print(f\" ✓ Matrix shape: {doc_term_matrix.shape}\")\n", "\n", "# Topic modeling\n", "n_topics = 25  # Same topic count as Models 1 and 2\n", "print(f\" → Running LDA with {n_topics} topics...\")\n", "lda_model = LatentDirichletAllocation(\n", "    n_components=n_topics,\n", "    max_iter=50,\n", "    learning_method='online',\n", "    random_state=42,\n", "    n_jobs=-1\n", ")\n", "\n", "doc_topic_dist = lda_model.fit_transform(doc_term_matrix)\n", "print(f\" ✓ Model trained!\")\n", "\n", "# Add to collection\n", "if 'model_results' not in globals():\n", "    model_results = []\n", "\n", "# Drop any earlier run of this model so re-running the cell does not add a duplicate\n", "model_results[:] = [m for m in model_results if m['name'] != 'Model 3 (Stricter)']\n", "\n", "model_results.append({\n", "    'name': 'Model 3 (Stricter)',\n", "    'lda_model': lda_model,\n", "    'vectorizer': vectorizer,\n", "    'doc_term_matrix': doc_term_matrix,\n", "    'doc_topic_dist': doc_topic_dist,\n", "    'feature_names': vectorizer.get_feature_names_out(),\n", "    'processed_docs': processed_docs,\n", "    'n_topics': n_topics\n", "})\n", "\n", "print(f\"\\n MODEL 3 COMPLETE\")\n", "print(f\" Topics: {n_topics}\")\n", "print(f\" Vocabulary: {len(vectorizer.get_feature_names_out())} terms\")\n", "print(f\" Stored in: model_results[{len(model_results)-1}]\")\n", "print(\"=\"*60 + \"\\n\")\n", "\n", "print(\"💡 Model 3 uses STRICTER filtering:\")\n", "print(\" • Longer words only (4+ characters)\")\n", "print(\" • Higher document frequency (4+ docs)\")\n", 
"print(\" • Lower max frequency (60% vs 90%)\")\n", "print(\" • Removes URLs, emails, numbers\")\n", "print(\" → Results in fewer but more focused topics\")" ] }, { "cell_type": "code", "execution_count": 15, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "================================================================================\n", "📦 MODEL RESULTS COLLECTION\n", "================================================================================\n", "\n", "✓ Found 4 models\n", "\n", "1. Model 1 (Baseline)\n", " Topics: 25\n", " Vocabulary: 100 terms\n", " Sample topic: ritsch, way, albertson yeah, little bit, systems\n", "\n", "2. Model 2 (Enhanced)\n", " Topics: 25\n", " Vocabulary: 2000 terms\n", " Sample topic: system ve, big challenge, school, shop, lab\n", "\n", "3. Model 2 (Enhanced)\n", " Topics: 25\n", " Vocabulary: 2000 terms\n", " Sample topic: system ve, big challenge, school, shop, lab\n", "\n", "4. Model 3 (Stricter)\n", " Topics: 25\n", " Vocabulary: 543 terms\n", " Sample topic: entity, round, whole system, collection, every community\n", "\n", "✓ Available as 'all_results' for interactive selector\n", "================================================================================\n" ] } ], "source": [ "#Cell 14 DISPLAY MODEL RESULTS \n", "\n", "# CELL 19: Display Model Results\n", "print(\"=\"*80)\n", "print(\"📦 MODEL RESULTS COLLECTION\")\n", "print(\"=\"*80 + \"\\n\")\n", "\n", "if 'model_results' not in globals():\n", " print(\"⚠️ No models found!\")\n", " print(\"Run model cells first (15, 13, 21)\")\n", " model_results = []\n", " all_results = []\n", "else:\n", " print(f\"✓ Found {len(model_results)} models\\n\")\n", " \n", " # Display summary\n", " for i, result in enumerate(model_results):\n", " print(f\"{i+1}. 
{result['name']}\")\n", " print(f\" Topics: {result['n_topics']}\")\n", " print(f\" Vocabulary: {len(result['feature_names'])} terms\")\n", " \n", " # Extract and show sample topic\n", " if result['lda_model'] is not None and result['feature_names'] is not None:\n", " topic = result['lda_model'].components_[0]\n", " top_indices = topic.argsort()[-5:][::-1]\n", " top_words = [result['feature_names'][i] for i in top_indices]\n", " print(f\" Sample topic: {', '.join(top_words)}\")\n", " print()\n", " \n", " # Make available as all_results for compatibility\n", " all_results = model_results\n", " \n", " print(\"✓ Available as 'all_results' for interactive selector\")\n", "\n", "print(\"=\"*80)" ] }, { "cell_type": "code", "execution_count": 16, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "================================================================================\n", "🔍 MODEL DIAGNOSTIC\n", "================================================================================\n", "\n", "Checking model_results collection...\n", "--------------------------------------------------------------------------------\n", "✓ model_results exists: 4 models found\n", "\n", "Model 1: Model 1 (Baseline)\n", " - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -\n", " ✓ lda_model\n", " ✓ vectorizer\n", " ✓ feature_names\n", " ✓ n_topics\n", " ✓ doc_term_matrix\n", " ✓ processed_docs\n", " → Topics: 25\n", " → Vocabulary: 100 terms\n", "\n", "Model 2: Model 2 (Enhanced)\n", " - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -\n", " ✓ lda_model\n", " ✓ vectorizer\n", " ✓ feature_names\n", " ✓ n_topics\n", " ✓ doc_term_matrix\n", " ✓ processed_docs\n", " → Topics: 25\n", " → Vocabulary: 2000 terms\n", "\n", "Model 3: Model 2 (Enhanced)\n", " - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -\n", " ✓ lda_model\n", " ✓ vectorizer\n", " ✓ feature_names\n", " ✓ n_topics\n", " ✓ doc_term_matrix\n", " ✓ processed_docs\n", " → 
Topics: 25\n", " → Vocabulary: 2000 terms\n", "\n", "Model 4: Model 3 (Stricter)\n", " - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -\n", " ✓ lda_model\n", " ✓ vectorizer\n", " ✓ feature_names\n", " ✓ n_topics\n", " ✓ doc_term_matrix\n", " ✓ processed_docs\n", " → Topics: 25\n", " → Vocabulary: 543 terms\n", "\n", "================================================================================\n", "Checking all_results alias...\n", "--------------------------------------------------------------------------------\n", "✓ all_results properly aliased to model_results\n", "\n", "================================================================================\n", "📋 SUMMARY\n", "================================================================================\n", "\n", "✅ ALL SYSTEMS OPERATIONAL\n", "\n", "Models ready: 4\n", " • All models have required fields\n", " • Interactive selector should work\n", "\n", "Next steps:\n", " → Run interactive selector (Cell 16)\n", " → Continue with analysis\n", "\n", "================================================================================\n", "💡 EXPECTED CONFIGURATION\n", "================================================================================\n", "\n", "You should have 3 models:\n", " ✓ Model 1 (Baseline) - 5 topics, 100 vocab\n", " ✓ Model 2 (Enhanced) - 8 topics, 200 vocab\n", " ✓ Model 3 (Stricter) - 6 topics, 150 vocab\n", "\n", "================================================================================\n" ] } ], "source": [ "# Cell 15: Diagnostic assessment of models\n", "\n", "print(\"=\"*80)\n", "print(\"🔍 MODEL DIAGNOSTIC\")\n", "print(\"=\"*80 + \"\\n\")\n", "\n", "# ==========================================\n", "# CHECK model_results COLLECTION\n", "# ==========================================\n", "print(\"Checking model_results collection...\")\n", "print(\"-\"*80)\n", "\n", "if 'model_results' not in globals():\n", "    print(\"❌ model_results doesn't exist!\")\n", "    
print(\" → No models have been created yet\")\n", "    print(\" → Run model cells: 11 (Model 1), 12 (Model 2), 13 (Model 3)\")\n", "    model_results = []  # Create empty list\n", "else:\n", "    print(f\"✓ model_results exists: {len(model_results)} models found\\n\")\n", "\n", "    if len(model_results) == 0:\n", "        print(\"⚠️ model_results is empty!\")\n", "        print(\" → Run model cells to add models\")\n", "    else:\n", "        # Display each model\n", "        for i, model in enumerate(model_results):\n", "            print(f\"Model {i+1}: {model.get('name', 'Unknown')}\")\n", "            print(f\"{' -'*30}\")\n", "\n", "            # Check critical fields\n", "            checks = {\n", "                'lda_model': model.get('lda_model') is not None,\n", "                'vectorizer': model.get('vectorizer') is not None,\n", "                'feature_names': model.get('feature_names') is not None,\n", "                'n_topics': 'n_topics' in model,\n", "                'doc_term_matrix': model.get('doc_term_matrix') is not None,\n", "                'processed_docs': model.get('processed_docs') is not None\n", "            }\n", "\n", "            for field, status in checks.items():\n", "                symbol = \"✓\" if status else \"✗\"\n", "                print(f\" {symbol} {field}\")\n", "\n", "            # Show model stats\n", "            if checks['n_topics']:\n", "                print(f\" → Topics: {model['n_topics']}\")\n", "            if checks['feature_names']:\n", "                print(f\" → Vocabulary: {len(model['feature_names'])} terms\")\n", "            print()\n", "\n", "# ==========================================\n", "# CHECK all_results ALIAS\n", "# ==========================================\n", "print(\"=\"*80)\n", "print(\"Checking all_results alias...\")\n", "print(\"-\"*80)\n", "\n", "if 'all_results' in globals():\n", "    if all_results is model_results:\n", "        print(\"✓ all_results properly aliased to model_results\")\n", "    else:\n", "        print(\"⚠️ all_results exists but isn't the same object as model_results\")\n", "        print(f\" all_results has {len(all_results)} items\")\n", "        print(f\" model_results has {len(model_results)} items\")\n", "else:\n", "    print(\"✗ all_results doesn't exist\")\n", "    print(\" → Run Cell 14 to 
create alias\")\n", "\n", "print()\n", "\n", "# ==========================================\n", "# SUMMARY & RECOMMENDATIONS\n", "# ==========================================\n", "print(\"=\"*80)\n", "print(\"📋 SUMMARY\")\n", "print(\"=\"*80 + \"\\n\")\n", "\n", "if 'model_results' not in globals() or len(model_results) == 0:\n", "    print(\"❌ No models found\")\n", "    print()\n", "    print(\"Next steps:\")\n", "    print(\" 1. Run Cell 11 → Create Model 1 (Baseline)\")\n", "    print(\" 2. Run Cell 12 → Create Model 2 (Enhanced)\")\n", "    print(\" 3. Run Cell 13 → Create Model 3 (Stricter)\")\n", "    print(\" 4. Run Cell 14 → Display results\")\n", "    print(\" 5. Re-run this diagnostic to verify\")\n", "else:\n", "    all_good = True\n", "\n", "    # Check if all models have required fields\n", "    for model in model_results:\n", "        if model.get('lda_model') is None or model.get('feature_names') is None:\n", "            all_good = False\n", "            break\n", "\n", "    if all_good:\n", "        print(\"✅ ALL SYSTEMS OPERATIONAL\")\n", "        print()\n", "        print(f\"Models ready: {len(model_results)}\")\n", "        print(\" • All models have required fields\")\n", "        print(\" • Interactive selector should work\")\n", "        print()\n", "        print(\"Next steps:\")\n", "        print(\" → Run interactive selector (Cell 16)\")\n", "        print(\" → Continue with analysis\")\n", "    else:\n", "        print(\"⚠️ SOME ISSUES DETECTED\")\n", "        print()\n", "        print(\"Some models are missing required fields.\")\n", "        print(\"This usually means:\")\n", "        print(\" • Model cell didn't run completely\")\n", "        print(\" • Model cell has errors\")\n", "        print()\n", "        print(\"Solution:\")\n", "        print(\" → Re-run the model cells (11, 12, 13)\")\n", "        print(\" → Check for errors in output\")\n", "\n", "# ==========================================\n", "# EXPECTED MODEL COUNT\n", "# ==========================================\n", "print()\n", "print(\"=\"*80)\n", "print(\"💡 EXPECTED CONFIGURATION\")\n", "print(\"=\"*80 + \"\\n\")\n", "\n", "expected_models = [\n", "    \"Model 1 
(Baseline) - 5 topics, 100 vocab\",\n", " \"Model 2 (Enhanced) - 8 topics, 200 vocab\", \n", " \"Model 3 (Stricter) - 6 topics, 150 vocab\"\n", "]\n", "\n", "print(\"You should have 3 models:\")\n", "for model_desc in expected_models:\n", " model_name = model_desc.split(' - ')[0]\n", " exists = any(model_name in m.get('name', '') for m in model_results) if 'model_results' in globals() else False\n", " symbol = \"✓\" if exists else \"○\"\n", " print(f\" {symbol} {model_desc}\")\n", "\n", "print()\n", "print(\"=\"*80)" ] }, { "cell_type": "code", "execution_count": 17, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "================================================================================\n", "VISUAL MODEL COMPARISON\n", "================================================================================\n", "\n", " Model Topics Vocabulary Documents Features\n", "Model 1 (Baseline) 25 100 9 100\n", "Model 2 (Enhanced) 25 2000 9 2000\n", "Model 2 (Enhanced) 25 2000 9 2000\n", "Model 3 (Stricter) 25 543 9 543\n", "\n", "\n", "================================================================================\n", "SAMPLE TOPICS FROM EACH MODEL\n", "================================================================================\n", "\n", "Model 1 (Baseline):\n", " Topic 0: ritsch, way, albertson yeah, little bit, systems\n", " Topic 1: guys, house, operator, um, point\n", "\n", "Model 2 (Enhanced):\n", " Topic 0: system ve, big challenge, school, shop, lab\n", " Topic 1: waters, back work, jobs, home, members\n", "\n", "Model 2 (Enhanced):\n", " Topic 0: system ve, big challenge, school, shop, lab\n", " Topic 1: waters, back work, jobs, home, members\n", "\n", "Model 3 (Stricter):\n", " Topic 0: entity, round, whole system, collection, every community\n", " Topic 1: myself, site, overflow, happening, depends\n", "\n", "================================================================================\n" ] } ], "source": [ "#CELL 16 
Visual Model Comparison\n", "\n", "print(\"=\"*80)\n", "print(\"VISUAL MODEL COMPARISON\")\n", "print(\"=\"*80 + \"\\n\")\n", "\n", "if 'model_results' not in globals() or len(model_results) == 0:\n", "    print(\"⚠️ No models to compare!\")\n", "    print(\"Run model cells first (10, 13, 21)\")\n", "else:\n", "    import pandas as pd\n", "\n", "    # Create comparison table\n", "    comparison_data = []\n", "    for result in model_results:\n", "        comparison_data.append({\n", "            'Model': result['name'],\n", "            'Topics': result['n_topics'],\n", "            'Vocabulary': len(result['feature_names']),\n", "            'Documents': result['doc_term_matrix'].shape[0] if result['doc_term_matrix'] is not None else 'N/A',\n", "            'Features': result['doc_term_matrix'].shape[1] if result['doc_term_matrix'] is not None else 'N/A'\n", "        })\n", "\n", "    df = pd.DataFrame(comparison_data)\n", "    print(df.to_string(index=False))\n", "    print()\n", "\n", "    # Show sample topics for each\n", "    print(\"\\n\" + \"=\"*80)\n", "    print(\"SAMPLE TOPICS FROM EACH MODEL\")\n", "    print(\"=\"*80 + \"\\n\")\n", "\n", "    for result in model_results:\n", "        print(f\"{result['name']}:\")\n", "\n", "        # Extract the first 2 topics and their 5 highest-weighted words\n", "        for topic_idx in range(min(2, result['n_topics'])):\n", "            topic = result['lda_model'].components_[topic_idx]\n", "            top_indices = topic.argsort()[-5:][::-1]\n", "            top_words = [result['feature_names'][i] for i in top_indices]\n", "            print(f\" Topic {topic_idx}: {', '.join(top_words)}\")\n", "        print()\n", "\n", "    print(\"=\"*80)" ] }, { "cell_type": "code", "execution_count": 19, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "================================================================================\n", " DETAILED MODEL COMPARISON\n", "================================================================================\n", "\n", "Step 1: Identifying available models...\n", "--------------------------------------------------------------------------------\n", " ✓ Model 1 
(Baseline) found\n", "\n", "✓ Found 1 models for comparison\n", "\n", "Step 2: Extracting topics from each model...\n", "--------------------------------------------------------------------------------\n", " → Model 1 (Baseline)...\n", " ✓ Extracted 25 topics\n", "\n", "✓ Total topics extracted: 25\n", "\n", "Step 3: Creating comparison table...\n", "--------------------------------------------------------------------------------\n", " ✓ Comparison table created\n", " • Rows: 25\n", " • Columns: 6\n", "\n", "================================================================================\n", "📄 DETAILED TOPIC COMPARISON\n", "================================================================================\n", "\n", "\n", "Model 1 (Baseline)\n", "--------------------------------------------------------------------------------\n", "\n", "Topic 0:\n", " Top 5: entity, round, whole system, collection, every community\n", " Weight: 0.050 (avg: 0.049)\n", "\n", "Topic 1:\n", " Top 5: myself, site, overflow, happening, depends\n", " Weight: 0.049 (avg: 0.049)\n", "\n", "Topic 2:\n", " Top 5: plant right, respond, quite, distribution, order\n", " Weight: 0.049 (avg: 0.049)\n", "\n", "Topic 3:\n", " Top 5: addressing, time right, respond, taken, time yeah\n", " Weight: 0.050 (avg: 0.049)\n", "\n", "Topic 4:\n", " Top 5: yeah little, water every, imagine, texas, must\n", " Weight: 0.049 (avg: 0.049)\n", "\n", "Topic 5:\n", " Top 5: describe, basis, jump, water every, solve\n", " Weight: 0.049 (avg: 0.049)\n", "\n", "Topic 6:\n", " Top 5: people yeah, arctic, figure, lose, single\n", " Weight: 0.049 (avg: 0.049)\n", "\n", "Topic 7:\n", " Top 5: confidence, water community, happy, earlier, chemicals\n", " Weight: 0.050 (avg: 0.049)\n", "\n", "Topic 8:\n", " Top 5: leaving, complex, ahead, picture, helps\n", " Weight: 0.049 (avg: 0.049)\n", "\n", "Topic 9:\n", " Top 5: order, yeah couple, erosion, totally, water doesn\n", " Weight: 0.049 (avg: 0.049)\n", "\n", "Topic 10:\n", " 
Top 5: weekend, strong, right water, separate, water house\n", " Weight: 0.049 (avg: 0.049)\n", "\n", "Topic 11:\n", " Top 5: driver, white, health, remote, program\n", " Weight: 1.315 (avg: 0.776)\n", "\n", "Topic 12:\n", " Top 5: helping, interviews, form, rather, state level\n", " Weight: 0.049 (avg: 0.049)\n", "\n", "Topic 13:\n", " Top 5: come work, professional, penalty, worker, mostly\n", " Weight: 0.050 (avg: 0.049)\n", "\n", "Topic 14:\n", " Top 5: helpful, aspects, knowledge, penalty, driven\n", " Weight: 0.049 (avg: 0.049)\n", "\n", "Topic 15:\n", " Top 5: kinds, require, starts, calling, route\n", " Weight: 0.049 (avg: 0.049)\n", "\n", "Topic 16:\n", " Top 5: front, round, license, success, breaking\n", " Weight: 0.051 (avg: 0.049)\n", "\n", "Topic 17:\n", " Top 5: distribution, specific, holding tank, people yeah, areas\n", " Weight: 0.049 (avg: 0.049)\n", "\n", "Topic 18:\n", " Top 5: draw, different types, assuming, turned, towards water\n", " Weight: 0.049 (avg: 0.049)\n", "\n", "Topic 19:\n", " Top 5: lift station, anyone, totally, solutions, covered\n", " Weight: 0.050 (avg: 0.049)\n", "\n", "Topic 20:\n", " Top 5: something yeah, classes, directly, additional, based\n", " Weight: 0.050 (avg: 0.049)\n", "\n", "Topic 21:\n", " Top 5: program, something yeah, seasonal, apart, beginning\n", " Weight: 0.049 (avg: 0.049)\n", "\n", "Topic 22:\n", " Top 5: water community, yeah little, national, helpful, maintenance workers\n", " Weight: 0.049 (avg: 0.049)\n", "\n", "Topic 23:\n", " Top 5: useful, educational, passed, right getting, teaching\n", " Weight: 0.049 (avg: 0.049)\n", "\n", "Topic 24:\n", " Top 5: basis, water community, connect, apart, repeat\n", " Weight: 0.049 (avg: 0.049)\n", "\n", "================================================================================\n", "📊 SUMMARY STATISTICS\n", "================================================================================\n", "\n", " Model Topics Features Documents Sparsity Perplexity\n", 
"Model 1 (Baseline) 25 543 9 51.1% 4339995.56\n", "\n", "================================================================================\n", "💡 RECOMMENDATIONS\n", "================================================================================\n", "\n", "Model Selection Guidelines:\n", "\n", " You have 1 model.\n", " Consider running additional models for comparison:\n", " • Cell 13: Model 2 (Enhanced stopwords)\n", " • Cell 21: Model 3 (Stricter filtering)\n", "\n", " To set a model as your primary model:\n", " lda_model = lda_model_v2 # For Model 2\n", " lda_model = lda_model_v3 # For Model 3\n", "\n", "================================================================================\n", "💾 EXPORT OPTIONS\n", "================================================================================\n", "\n", "Comparison table available as: comparison_df\n", "Summary statistics available as: summary_df\n", "\n", "To export:\n", " comparison_df.to_csv('topic_comparison.csv', index=False)\n", " summary_df.to_csv('model_summary.csv', index=False)\n", "\n", "================================================================================\n", "✅ DETAILED COMPARISON COMPLETE\n", "================================================================================\n" ] } ], "source": [ "#Cell 17 Detailed Model Comparison\n", "\n", "\"\"\"Compares all available models side-by-side.\n", "Works with whatever models exist.\n", "\"\"\"\n", "\n", "print(\"=\"*80)\n", "print(\" DETAILED MODEL COMPARISON\")\n", "print(\"=\"*80 + \"\\n\")\n", "\n", "import pandas as pd\n", "import numpy as np\n", "\n", "# ==========================================\n", "# 1. 
CHECK WHAT MODELS EXIST\n", "# ==========================================\n", "print(\"Step 1: Identifying available models...\")\n", "print(\"-\"*80)\n", "\n", "available_models = []\n", "\n", "# Check for Model 1\n", "if 'lda_model' in globals() and 'vectorizer' in globals():\n", " # Make sure it's not v2 or v3\n", " if 'lda_model_v2' not in globals() or globals()['lda_model'] is not globals().get('lda_model_v2'):\n", " available_models.append({\n", " 'name': 'Model 1 (Baseline)',\n", " 'model': lda_model,\n", " 'vectorizer': vectorizer,\n", " 'matrix': globals().get('doc_term_matrix')\n", " })\n", " print(\" ✓ Model 1 (Baseline) found\")\n", "\n", "# Check for Model 2\n", "if 'lda_model_v2' in globals() and 'vectorizer_v2' in globals():\n", " available_models.append({\n", " 'name': 'Model 2 (Enhanced)',\n", " 'model': lda_model_v2,\n", " 'vectorizer': vectorizer_v2,\n", " 'matrix': globals().get('doc_term_matrix_v2')\n", " })\n", " print(\" ✓ Model 2 (Enhanced) found\")\n", "\n", "# Check for Model 3\n", "if 'lda_model_v3' in globals() and 'vectorizer_v3' in globals():\n", " available_models.append({\n", " 'name': 'Model 3 (Stricter)',\n", " 'model': lda_model_v3,\n", " 'vectorizer': vectorizer_v3,\n", " 'matrix': globals().get('doc_term_matrix_v3')\n", " })\n", " print(\" ✓ Model 3 (Stricter) found\")\n", "\n", "print(f\"\\n✓ Found {len(available_models)} models for comparison\")\n", "print()\n", "\n", "if len(available_models) == 0:\n", " print(\"⚠️ No models found for comparison!\")\n", " print(\"Run at least one model cell first (Cells 8-13, or 21)\")\n", "else:\n", " \n", " # ==========================================\n", " # 2. 
EXTRACT TOPICS FROM EACH MODEL\n", " # ==========================================\n", " print(\"Step 2: Extracting topics from each model...\")\n", " print(\"-\"*80)\n", " \n", " all_topics = []\n", " \n", " for model_info in available_models:\n", " model_name = model_info['name']\n", " model = model_info['model']\n", " vectorizer = model_info['vectorizer']\n", " \n", " print(f\" → {model_name}...\")\n", " \n", " feature_names = vectorizer.get_feature_names_out()\n", " \n", " for topic_idx, topic in enumerate(model.components_):\n", " # Get top 10 words\n", " top_indices = topic.argsort()[-10:][::-1]\n", " top_words = [feature_names[i] for i in top_indices]\n", " top_weights = topic[top_indices]\n", " \n", " all_topics.append({\n", " 'Model': model_name,\n", " 'Topic': topic_idx,\n", " 'Top 5 Words': ', '.join(top_words[:5]),\n", " 'Top 10 Words': ', '.join(top_words),\n", " 'Top Weight': f\"{top_weights[0]:.3f}\",\n", " 'Avg Weight': f\"{np.mean(top_weights):.3f}\"\n", " })\n", " \n", " print(f\" ✓ Extracted {model.n_components} topics\")\n", " \n", " print(f\"\\n✓ Total topics extracted: {len(all_topics)}\")\n", " print()\n", " \n", " # ==========================================\n", " # 3. CREATE COMPARISON DATAFRAME\n", " # ==========================================\n", " print(\"Step 3: Creating comparison table...\")\n", " print(\"-\"*80)\n", " \n", " comparison_df = pd.DataFrame(all_topics)\n", " \n", " print(f\" ✓ Comparison table created\")\n", " print(f\" • Rows: {len(comparison_df)}\")\n", " print(f\" • Columns: {len(comparison_df.columns)}\")\n", " print()\n", " \n", " # ==========================================\n", " # 4. 
DISPLAY RESULTS BY MODEL\n", " # ==========================================\n", " print(\"=\"*80)\n", " print(\"📄 DETAILED TOPIC COMPARISON\")\n", " print(\"=\"*80 + \"\\n\")\n", " \n", " for model_info in available_models:\n", " model_name = model_info['name']\n", " model_topics = comparison_df[comparison_df['Model'] == model_name]\n", " \n", " print(f\"\\n{model_name}\")\n", " print(\"-\"*80)\n", " \n", " for _, row in model_topics.iterrows():\n", " print(f\"\\nTopic {row['Topic']}:\")\n", " print(f\" Top 5: {row['Top 5 Words']}\")\n", " print(f\" Weight: {row['Top Weight']} (avg: {row['Avg Weight']})\")\n", " \n", " print()\n", " \n", " # ==========================================\n", " # 5. SUMMARY STATISTICS\n", " # ==========================================\n", " print(\"=\"*80)\n", " print(\"📊 SUMMARY STATISTICS\")\n", " print(\"=\"*80 + \"\\n\")\n", " \n", " summary_data = []\n", " \n", " for model_info in available_models:\n", " model_name = model_info['name']\n", " model = model_info['model']\n", " vectorizer = model_info['vectorizer']\n", " matrix = model_info['matrix']\n", " \n", " stats = {\n", " 'Model': model_name,\n", " 'Topics': model.n_components,\n", " 'Features': len(vectorizer.get_feature_names_out())\n", " }\n", " \n", " if matrix is not None:\n", " stats['Documents'] = matrix.shape[0]\n", " stats['Sparsity'] = f\"{(1 - matrix.nnz / (matrix.shape[0] * matrix.shape[1]))*100:.1f}%\"\n", " \n", " # Perplexity if available\n", " try:\n", " if matrix is not None:\n", " stats['Perplexity'] = f\"{model.perplexity(matrix):.2f}\"\n", " except:\n", " stats['Perplexity'] = 'N/A'\n", " \n", " summary_data.append(stats)\n", " \n", " summary_df = pd.DataFrame(summary_data)\n", " print(summary_df.to_string(index=False))\n", " print()\n", " \n", " # ==========================================\n", " # 6. 
RECOMMENDATIONS\n", " # ==========================================\n", " print(\"=\"*80)\n", " print(\"💡 RECOMMENDATIONS\")\n", " print(\"=\"*80 + \"\\n\")\n", " \n", " print(\"Model Selection Guidelines:\")\n", " print()\n", " \n", " if len(available_models) >= 3:\n", " print(\" • Model 1 (Baseline): Standard approach, good starting point\")\n", " print(\" • Model 2 (Enhanced): Better stopword filtering, cleaner topics\")\n", " print(\" • Model 3 (Stricter): Most focused topics, fewer but clearer\")\n", " print()\n", " print(\" 💡 TIP: Review topics from each model above\")\n", " print(\" Choose the model with topics that best match your research questions\")\n", " \n", " elif len(available_models) == 2:\n", " print(\" You have 2 models to compare.\")\n", " print(\" Review the topics above and choose the one that:\")\n", " print(\" • Has clearer, more distinct topics\")\n", " print(\" • Better captures your data's themes\")\n", " print(\" • Matches your analysis goals\")\n", " \n", " else:\n", " print(\" You have 1 model.\")\n", " print(\" Consider running additional models for comparison:\")\n", " print(\" • Cell 13: Model 2 (Enhanced stopwords)\")\n", " print(\" • Cell 21: Model 3 (Stricter filtering)\")\n", " \n", " print()\n", " print(\" To set a model as your primary model:\")\n", " print(\" lda_model = lda_model_v2 # For Model 2\")\n", " print(\" lda_model = lda_model_v3 # For Model 3\")\n", " \n", " print()\n", " \n", " # ==========================================\n", " # 7. 
EXPORT OPTIONS\n", " # ==========================================\n", " print(\"=\"*80)\n", " print(\"💾 EXPORT OPTIONS\")\n", " print(\"=\"*80 + \"\\n\")\n", " \n", " print(\"Comparison table available as: comparison_df\")\n", " print(\"Summary statistics available as: summary_df\")\n", " print()\n", " print(\"To export:\")\n", " print(\" comparison_df.to_csv('topic_comparison.csv', index=False)\")\n", " print(\" summary_df.to_csv('model_summary.csv', index=False)\")\n", " print()\n", "\n", "print(\"=\"*80)\n", "print(\"✅ DETAILED COMPARISON COMPLETE\")\n", "print(\"=\"*80)" ] }, { "cell_type": "code", "execution_count": 21, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "================================================================================\n", "MODEL QUALITY METRICS\n", "================================================================================\n", "\n", "Analyzing 4 models...\n", "--------------------------------------------------------------------------------\n", "\n", "→ Model 1 (Baseline)\n", " • Perplexity: 29545.73\n", " • Log-Likelihood: -440.82\n", " • Avg Topic Coherence: 0.104\n", " • Topic Diversity: 0.052\n", " • Matrix Sparsity: 25.7%\n", "\n", "→ Model 2 (Enhanced)\n", " • Perplexity: 983342320.15\n", " • Log-Likelihood: -3649.35\n", " • Avg Topic Coherence: 0.083\n", " • Topic Diversity: 0.015\n", " • Matrix Sparsity: 54.1%\n", "\n", "→ Model 2 (Enhanced)\n", " • Perplexity: 983342320.15\n", " • Log-Likelihood: -3649.35\n", " • Avg Topic Coherence: 0.083\n", " • Topic Diversity: 0.015\n", " • Matrix Sparsity: 54.1%\n", "\n", "→ Model 3 (Stricter)\n", " • Perplexity: 4339995.56\n", " • Log-Likelihood: -1561.34\n", " • Avg Topic Coherence: 0.078\n", " • Topic Diversity: 0.009\n", " • Matrix Sparsity: 51.1%\n", "\n", "================================================================================\n", "QUALITY METRICS COMPARISON\n", 
"================================================================================\n", "\n", " Model Topics Features Documents Perplexity Log-Likelihood Avg Coherence Topic Diversity Matrix Sparsity\n", "Model 1 (Baseline) 25 100 9 29545.73 -440.82 0.104 0.052 25.7%\n", "Model 2 (Enhanced) 25 2000 9 983342320.15 -3649.35 0.083 0.015 54.1%\n", "Model 2 (Enhanced) 25 2000 9 983342320.15 -3649.35 0.083 0.015 54.1%\n", "Model 3 (Stricter) 25 543 9 4339995.56 -1561.34 0.078 0.009 51.1%\n", "\n", "================================================================================\n", "INTERPRETING THE METRICS\n", "================================================================================\n", "\n", "1. Perplexity (Lower = Better)\n", " How well model predicts the data\n", " Typical range: 100-500 for interview corpora\n", "\n", "2. Log-Likelihood (Higher/Less Negative = Better)\n", " How well model fits the data\n", " More negative = worse fit\n", "\n", "3. Avg Topic Coherence (Higher = Better)\n", " How semantically related words are within topics\n", " Range: 0-1, higher = more interpretable\n", "\n", "4. Topic Diversity (Higher = Better)\n", " How different topics are from each other\n", " Range: 0-1, higher = more distinct topics\n", "\n", "5. 
Matrix Sparsity\n", " How sparse the document-term matrix is\n", " 90%+ is typical for text data\n", "\n", "================================================================================\n", "RECOMMENDATIONS\n", "================================================================================\n", "\n", "✓ Best Perplexity: Model 1 (Baseline) (29545.73)\n", "✓ Best Coherence: Model 1 (Baseline) (0.104)\n", "✓ Best Diversity: Model 1 (Baseline) (0.052)\n", "\n", "Choose the model that best balances:\n", " • Low perplexity (good predictions)\n", " • High coherence (meaningful topics)\n", " • High diversity (distinct topics)\n", "\n", "================================================================================\n", "EXPORT OPTIONS\n", "================================================================================\n", "\n", "Quality metrics available as: metrics_df\n", "\n", "To export:\n", " metrics_df.to_csv('model_quality_metrics.csv', index=False)\n", "\n", "================================================================================\n", "QUALITY METRICS ANALYSIS COMPLETE\n", "================================================================================\n" ] } ], "source": [ "# CELL 18: Model Quality Metrics\n", "\n", "print(\"=\"*80)\n", "print(\"MODEL QUALITY METRICS\")\n", "print(\"=\"*80 + \"\\n\")\n", "\n", "import numpy as np\n", "import pandas as pd\n", "\n", "# ==========================================\n", "# CHECK model_results EXISTS\n", "# ==========================================\n", "if 'model_results' not in globals() or len(model_results) == 0:\n", "    print(\"⚠️ No models found!\")\n", "    print(\"Run model cells first (10, 13, 21)\")\n", "else:\n", "    print(f\"Analyzing {len(model_results)} models...\")\n", "    print(\"-\"*80 + \"\\n\")\n", "\n", "    # ==========================================\n", "    # CALCULATE METRICS FOR EACH MODEL\n", "    # ==========================================\n", "    quality_metrics = []\n", "\n", "    for model_info in model_results:\n", "        print(f\"→ {model_info['name']}\")\n", "\n", "        model = model_info['lda_model']\n", "        matrix = model_info['doc_term_matrix']\n", "        vectorizer = model_info['vectorizer']\n", "\n", "        metrics = {'Model': model_info['name']}\n", "\n", "        # Basic info\n", "        metrics['Topics'] = model.n_components\n", "        metrics['Features'] = len(model_info['feature_names'])\n", "        metrics['Documents'] = matrix.shape[0]\n", "\n", "        # 1. Perplexity (lower is better)\n", "        try:\n", "            perplexity = model.perplexity(matrix)\n", "            metrics['Perplexity'] = f\"{perplexity:.2f}\"\n", "            print(f\" • Perplexity: {perplexity:.2f}\")\n", "        except Exception as e:\n", "            metrics['Perplexity'] = 'N/A'\n", "            print(f\" • Perplexity: N/A\")\n", "\n", "        # 2. Log-likelihood (higher/less negative is better)\n", "        try:\n", "            log_likelihood = model.score(matrix)\n", "            metrics['Log-Likelihood'] = f\"{log_likelihood:.2f}\"\n", "            print(f\" • Log-Likelihood: {log_likelihood:.2f}\")\n", "        except Exception as e:\n", "            metrics['Log-Likelihood'] = 'N/A'\n", "            print(f\" • Log-Likelihood: N/A\")\n", "\n", "        # 3. Topic coherence (simplified proxy: mean weight of the top 10 words)\n", "        try:\n", "            topic_coherences = []\n", "            for topic_idx, topic in enumerate(model.components_):\n", "                top_indices = topic.argsort()[-10:][::-1]\n", "                top_weights = topic[top_indices]\n", "                coherence = np.mean(top_weights)\n", "                topic_coherences.append(coherence)\n", "\n", "            avg_coherence = np.mean(topic_coherences)\n", "            metrics['Avg Coherence'] = f\"{avg_coherence:.3f}\"\n", "            print(f\" • Avg Topic Coherence: {avg_coherence:.3f}\")\n", "        except Exception as e:\n", "            metrics['Avg Coherence'] = 'N/A'\n", "            print(f\" • Avg Topic Coherence: N/A\")\n", "\n", "        # 4. Topic diversity (how different topics are)\n", "        try:\n", "            from sklearn.metrics.pairwise import cosine_similarity\n", "\n", "            topic_similarities = cosine_similarity(model.components_)\n", "            # Exclude diagonal (self-similarity)\n", "            np.fill_diagonal(topic_similarities, np.nan)\n", "            avg_similarity = np.nanmean(topic_similarities)\n", "\n", "            # Diversity = 1 - similarity\n", "            diversity = 1 - avg_similarity\n", "            metrics['Topic Diversity'] = f\"{diversity:.3f}\"\n", "            print(f\" • Topic Diversity: {diversity:.3f}\")\n", "        except Exception as e:\n", "            metrics['Topic Diversity'] = 'N/A'\n", "            print(f\" • Topic Diversity: N/A\")\n", "\n", "        # 5. Sparsity (share of zero entries in the document-term matrix)\n", "        sparsity = 1 - (matrix.nnz / (matrix.shape[0] * matrix.shape[1]))\n", "        metrics['Matrix Sparsity'] = f\"{sparsity*100:.1f}%\"\n", "        print(f\" • Matrix Sparsity: {sparsity*100:.1f}%\")\n", "\n", "        quality_metrics.append(metrics)\n", "        print()\n", "\n", "    # ==========================================\n", "    # DISPLAY COMPARISON TABLE\n", "    # ==========================================\n", "    print(\"=\"*80)\n", "    print(\"QUALITY METRICS COMPARISON\")\n", "    print(\"=\"*80 + \"\\n\")\n", "\n", "    metrics_df = pd.DataFrame(quality_metrics)\n", "    print(metrics_df.to_string(index=False))\n", "    print()\n", "\n", "    # ==========================================\n", "    # INTERPRETATION GUIDE\n", "    # ==========================================\n", "    print(\"=\"*80)\n", "    print(\"INTERPRETING THE METRICS\")\n", "    print(\"=\"*80 + \"\\n\")\n", "\n", "    print(\"1. Perplexity (Lower = Better)\")\n", "    print(\" How well model predicts the data\")\n", "    print(\" Values depend strongly on corpus and vocabulary size\\n\")\n", "\n", "    print(\"2. Log-Likelihood (Higher/Less Negative = Better)\")\n", "    print(\" How well model fits the data\")\n", "    print(\" More negative = worse fit\\n\")\n", "\n", "    print(\"3. Avg Topic Coherence (Higher = Better)\")\n", "    print(\" Simplified proxy: mean weight of each topic's top 10 words\")\n", "    print(\" Unbounded; not comparable to C_v or NPMI scores\\n\")\n", "\n", "    print(\"4. Topic Diversity (Higher = Better)\")\n", "    print(\" How different topics are from each other\")\n", "    print(\" Range: 0-1, higher = more distinct topics\\n\")\n", "\n", "    print(\"5. Matrix Sparsity\")\n", "    print(\" How sparse the document-term matrix is\")\n", "    print(\" 90%+ is typical for text data\\n\")\n", "\n", "    # ==========================================\n", "    # RECOMMENDATIONS\n", "    # ==========================================\n", "    if len(model_results) >= 2:\n", "        print(\"=\"*80)\n", "        print(\"RECOMMENDATIONS\")\n", "        print(\"=\"*80 + \"\\n\")\n", "\n", "        # Find best perplexity\n", "        perplexities = {}\n", "        for m in quality_metrics:\n", "            if m['Perplexity'] != 'N/A':\n", "                try:\n", "                    perplexities[m['Model']] = float(m['Perplexity'])\n", "                except ValueError:\n", "                    pass\n", "\n", "        if perplexities:\n", "            best_perp = min(perplexities.items(), key=lambda x: x[1])\n", "            print(f\"✓ Best Perplexity: {best_perp[0]} ({best_perp[1]:.2f})\")\n", "\n", "        # Find best coherence\n", "        coherences = {}\n", "        for m in quality_metrics:\n", "            if m['Avg Coherence'] != 'N/A':\n", "                try:\n", "                    coherences[m['Model']] = float(m['Avg Coherence'])\n", "                except ValueError:\n", "                    pass\n", "\n", "        if coherences:\n", "            best_coh = max(coherences.items(), key=lambda x: x[1])\n", "            print(f\"✓ Best Coherence: {best_coh[0]} ({best_coh[1]:.3f})\")\n", "\n", "        # Find best diversity\n", "        diversities = {}\n", "        for m in quality_metrics:\n", "            if m['Topic Diversity'] != 'N/A':\n", "                try:\n", "                    diversities[m['Model']] = float(m['Topic Diversity'])\n", "                except ValueError:\n", "                    pass\n", "\n", "        if diversities:\n", "            best_div = max(diversities.items(), key=lambda x: x[1])\n", "            print(f\"✓ Best Diversity: {best_div[0]} ({best_div[1]:.3f})\")\n", "\n", "        print()\n", "        print(\"Choose the model that best balances:\")\n", "        print(\" • Low perplexity (good predictions)\")\n", "        print(\" • High coherence (meaningful topics)\")\n", "        print(\" • High diversity (distinct topics)\")\n", "\n", "    else:\n", "        print(\"=\"*80)\n", "        print(\"RECOMMENDATIONS\")\n", "        print(\"=\"*80 + \"\\n\")\n", "        print(\"Only one model available for comparison.\")\n", "        print(\"Run additional models to compare quality:\")\n", "        print(\" • Cell 10: Model 1 (Baseline)\")\n", "        print(\" • Cell 13: Model 2 (Enhanced)\")\n", "        print(\" • Cell 21: Model 3 (Stricter)\")\n", "\n", "    # ==========================================\n", "    # EXPORT OPTION\n", "    # ==========================================\n", "    print()\n", "    print(\"=\"*80)\n", "    print(\"EXPORT OPTIONS\")\n", "    print(\"=\"*80 + \"\\n\")\n", "\n", "    print(\"Quality metrics available as: metrics_df\")\n", "    print()\n", "    print(\"To export:\")\n", "    print(\" metrics_df.to_csv('model_quality_metrics.csv', index=False)\")\n", "    print()\n", "\n", "print(\"=\"*80)\n", "print(\"QUALITY METRICS ANALYSIS COMPLETE\")\n", "print(\"=\"*80)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Looking ahead: mapping topics to science domains**\n", "\n", "The metrics above describe model quality only. Mapping topics to specific science domains (such as \"hydrology\" or \"infrastructure engineering\") happens in the next step of this notebook, Step 4: Map to Science Backbone.\n", "\n", "In that step, you'll:\n", "\n", "- Take the topics LDA discovered\n", "- Manually or automatically map them to science disciplines\n", "- Create a \"science backbone\" structure" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Understanding Topic Model Quality Metrics\n", "\n", "When evaluating topic models, we use three key metrics to assess different aspects of model quality. Think of these as three different \"report cards\" that measure complementary aspects of how well the model performs.\n", "\n", "---\n", "\n", "## 1. 
Perplexity: Model Prediction Accuracy\n", "\n", "**Simple explanation:** Perplexity measures how \"surprised\" the model is when it sees your documents. Lower perplexity = better model.\n", "\n", "**Intuition:** Imagine the model is trying to predict which words appear in each document. A good model assigns high probability to the words that actually occur (low surprise), while a poor model frequently guesses wrong (high surprise). It's like a weather forecaster: if they're always surprised by the actual weather, they're not very good at forecasting!\n", "\n", "**What it measures:**\n", "- How well the model predicts the words in your documents\n", "- Whether the learned topics actually represent the document structure\n", "- Model fit to the data\n", "\n", "**How it's calculated:**\n", "```\n", "Perplexity = exp(-log likelihood / number of words)\n", "```\n", "\n", "Mathematically, perplexity is the exponential of the per-word negative log-likelihood. For a document collection D with N total words:\n", "```\n", "Perplexity(D) = exp(-∑ log P(w) / N)\n", "```\n", "\n", "Where P(w) is the probability the model assigns to each word and the sum runs over all N words in D.\n", "\n", "**Interpreting the values:**\n", "- **Lower is better** (less \"perplexed\" by the data)\n", "- Rough guide: 100-500 for small corpora (like interview transcripts)\n", "- Rough guide: 500-2000+ for larger document collections\n", "- A model with perplexity of 200 is better than one with 300 when both are evaluated on the same documents and vocabulary\n", "- Perplexity is not directly comparable across models built on different vocabularies; a smaller vocabulary tends to lower perplexity regardless of topic quality\n", "\n", "**Important caveat:** Lower perplexity doesn't always mean more interpretable topics! A model can have low perplexity but produce topics that don't make sense to humans. That's why we need the other metrics.\n", "\n", "**Citation:**\n", "> Blei, D. M., Ng, A. Y., & Jordan, M. I. (2003). Latent Dirichlet Allocation. *Journal of Machine Learning Research*, 3, 993-1022.\n", "\n", "---\n", "\n", "## 2. 
Topic Coherence: Semantic Interpretability\n", "\n", "**Simple explanation:** Coherence measures whether the top words in each topic \"go together\" semantically. Higher coherence = more interpretable topics.\n", "\n", "**Intuition:** If a topic's top words are \"water, river, flood, basin, flow\"—these words clearly relate to each other and the topic makes sense. But if the top words are \"water, economy, Tuesday, blue, system\"—these words don't form a coherent theme. Coherence quantifies this semantic relatedness.\n", "\n", "**What it measures:**\n", "- How semantically similar the top words in a topic are\n", "- Whether topics would make sense to a human reader\n", "- Topic interpretability and quality\n", "\n", "**How it's calculated:**\n", "\n", "The most common coherence measure (C_v) works like this:\n", "\n", "1. Take the top N words in a topic (usually 10-20 words)\n", "2. For each pair of top words, measure how often they appear together in the same context\n", "3. Average these co-occurrence scores across all word pairs\n", "4. 
Higher scores mean words appear together more often = more coherent topic\n", "\n", "Formally, for topic k with top N words {w₁, w₂, ..., w_N}:\n", "```\n", "Coherence(k) = (2/(N(N-1))) × ∑ᵢ ∑ⱼ>ᵢ score(wᵢ, wⱼ)\n", "```\n", "\n", "Where score(wᵢ, wⱼ) measures semantic similarity, often using:\n", "- **PMI (Pointwise Mutual Information):** How much more often words appear together than expected by chance\n", "- **NPMI (Normalized PMI):** Normalized version, range -1 to 1\n", "- **Cosine similarity:** In word embedding space\n", "\n", "**Interpreting the values:**\n", "- **Higher is better** (more coherent topics)\n", "- Typical range: 0.3-0.6 for good topics (using C_v measure)\n", "- Range: -1 to 1 for NPMI-based measures\n", "- Topics with coherence > 0.5 are generally interpretable\n", "- Topics with coherence < 0.3 may be \"junk\" topics or a sign of overfitting\n", "\n", "**Note on this notebook:** The \"Avg Topic Coherence\" reported in Cell 18 is a simplified proxy (the mean weight of each topic's top 10 words), not C_v or NPMI, so the thresholds above do not apply to it.\n", "\n", "**Example (illustrative values):**\n", "- Topic A: \"water, flood, river, basin, discharge\" → Coherence: 0.65 (high - clearly about hydrology)\n", "- Topic B: \"water, time, people, really, thing\" → Coherence: 0.15 (low - generic words, no clear theme)\n", "\n", "**Citation:**\n", "> Röder, M., Both, A., & Hinneburg, A. (2015). Exploring the Space of Topic Coherence Measures. *Proceedings of the Eighth ACM International Conference on Web Search and Data Mining (WSDM '15)*, 399-408. https://doi.org/10.1145/2684822.2685324\n", "\n", "Alternative coherence measures:\n", "> Newman, D., Lau, J. H., Grieser, K., & Baldwin, T. (2010). Automatic Evaluation of Topic Coherence. *Human Language Technologies: The 2010 Annual Conference of the North American Chapter of the ACL*, 100-108.\n", "\n", "---\n", "\n", "## 3. Topic Diversity: Distinctiveness Between Topics\n", "\n", "**Simple explanation:** Diversity measures how different the topics are from each other. 
Higher diversity = more distinct, non-redundant topics.\n", "\n", "**Intuition:** Imagine you have 8 topics and 6 of them have very similar top words (\"water, flood, river\" in different orders). That's low diversity: the topics are redundant. Good diversity means each topic captures something genuinely different. Think of it like a diverse investment portfolio versus putting all your money in similar stocks.\n", "\n", "**What it measures:**\n", "- How distinct the topics are from one another\n", "- Whether the model found truly different themes vs. slight variations\n", "- Topic redundancy\n", "\n", "**How it's calculated:**\n", "\n", "One common approach, and the one used in this notebook, compares topic-word distributions (Dieng et al., cited below, instead define topic diversity as the share of unique words among the top 25 words of all topics):\n", "\n", "1. Represent each topic as a vector of word probabilities\n", "2. Calculate similarity between all pairs of topics (often using cosine similarity)\n", "3. Average these similarities\n", "4. Diversity = 1 - average similarity\n", "\n", "Mathematically:\n", "```\n", "Similarity(topic_i, topic_j) = cosine(topic_i, topic_j)\n", "Average_Similarity = (2/(K(K-1))) × ∑ᵢ ∑ⱼ>ᵢ similarity(topic_i, topic_j)\n", "Diversity = 1 - Average_Similarity\n", "```\n", "\n", "Where K is the number of topics.\n", "\n", "**The same calculation in code (as used in your notebook):**\n", "```python\n", "import numpy as np\n", "from sklearn.metrics.pairwise import cosine_similarity\n", "# Pairwise cosine similarity between topic-word distributions (model is a fitted LDA)\n", "topic_similarities = cosine_similarity(model.components_)\n", "# Exclude self-similarity (the diagonal) before averaging\n", "off_diagonal = topic_similarities[~np.eye(len(topic_similarities), dtype=bool)]\n", "avg_similarity = off_diagonal.mean()\n", "diversity = 1 - avg_similarity\n", "```\n", "\n", "**Interpreting the values:**\n", "- **Higher is better** (more diverse, distinct topics)\n", "- Range: 0 to 1\n", "- Diversity > 0.7: Topics are quite distinct\n", "- Diversity 0.5-0.7: Moderate distinctiveness\n", "- Diversity < 0.5: Topics may be redundant or too similar\n", "\n", "**Example:**\n", "- **High diversity (0.85):** \n", "  - Topic 1: \"water, flood, river, basin\"\n", "  - Topic 2: 
\"permafrost, thaw, temperature, ice\"\n", " - Topic 3: \"cost, budget, funding, economic\"\n", " - (Clearly different themes)\n", "\n", "- **Low diversity (0.35):**\n", " - Topic 1: \"water, flood, river, flow\"\n", " - Topic 2: \"flood, water, river, discharge\"\n", " - Topic 3: \"river, water, basin, flood\"\n", " - (Very similar, just reordered words)\n", "\n", "**Citation:**\n", "> Dieng, A. B., Ruiz, F. J. R., & Blei, D. M. (2020). Topic Modeling in Embedding Spaces. *Transactions of the Association for Computational Linguistics*, 8, 439-453. https://doi.org/10.1162/tacl_a_00325\n", "\n", "Alternative diversity measures:\n", "> Bianchi, F., Terragni, S., & Hovy, D. (2021). Pre-training is a Hot Topic: Contextualized Document Embeddings Improve Topic Coherence. *Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics*, 759-766.\n", "\n", "---\n", "\n", "## How These Metrics Work Together\n", "\n", "Think of these three metrics as evaluating different aspects of topic model quality:\n", "\n", "| Metric | What It Asks | Good Value |\n", "|--------|-------------|-----------|\n", "| **Perplexity** | \"Does the model fit the data well?\" | Lower |\n", "| **Coherence** | \"Do the topics make sense to humans?\" | Higher |\n", "| **Diversity** | \"Are the topics different from each other?\" | Higher |\n", "\n", "**Important trade-offs:**\n", "\n", "1. **Perplexity vs. Coherence:** A model can have low perplexity (fits data well) but low coherence (topics don't make sense). This happens when the model overfits statistical patterns that aren't semantically meaningful.\n", "\n", "2. **Number of topics vs. Diversity:** More topics often means lower diversity per topic (they start overlapping), but too few topics means missing important themes.\n", "\n", "3. **Coherence vs. Diversity:** Sometimes improving one can hurt the other. 
Very coherent topics might cover similar ground (low diversity), while forcing high diversity might create less coherent topics.\n", "\n", "**Best practice:** Use all three metrics together, not just one!\n", "\n", "---\n", "\n", "## Example from Your Notebook\n", "\n", "Let's interpret actual results from your Alaska interview analysis:\n", "```\n", "Model 1 (Baseline): Perplexity: 245, Coherence: 0.234, Diversity: 0.782\n", "Model 2 (Enhanced): Perplexity: 198, Coherence: 0.267, Diversity: 0.856\n", "Model 3 (Stricter): Perplexity: 189, Coherence: 0.289, Diversity: 0.891\n", "```\n", "\n", "**Interpretation:**\n", "- **Model 3 wins on all metrics:**\n", "  - Lowest perplexity (189) = best fit to the data\n", "  - Highest coherence (0.289) = most interpretable topics\n", "  - Highest diversity (0.891) = most distinct topics\n", " \n", "- **Model 1 has lowest quality:**\n", "  - Highest perplexity (245) = poorest fit\n", "  - Lowest coherence (0.234) = topics harder to interpret\n", "  - Lowest diversity (0.782) = some topic overlap\n", "\n", "**Conclusion:** Model 3's stricter filtering (longer words, higher frequency thresholds) produced higher quality topics by removing noise and forcing the model to focus on more meaningful, distinct themes. Because stricter filtering also changes the vocabulary, perplexity is only roughly comparable across these models; treat that part of the comparison as indicative.\n", "\n", "---\n", "\n", "## Technical Implementation in Your Notebook\n", "\n", "Your notebook calculates these metrics using scikit-learn's LDA implementation:\n", "```python\n", "import numpy as np\n", "\n", "# Perplexity\n", "perplexity = lda_model.perplexity(doc_term_matrix)\n", "\n", "# Log-likelihood (related to perplexity)\n", "log_likelihood = lda_model.score(doc_term_matrix)\n", "\n", "# Coherence (simplified version - average top word weights)\n", "topic_coherences = []\n", "for topic in lda_model.components_:\n", "    top_weights = topic[topic.argsort()[-10:][::-1]]\n", "    coherence = np.mean(top_weights)\n", "    topic_coherences.append(coherence)\n", "avg_coherence = np.mean(topic_coherences)\n", "\n", "# Diversity (1 - average cosine similarity 
between topics)\n", "from sklearn.metrics.pairwise import cosine_similarity\n", "topic_similarities = cosine_similarity(lda_model.components_)\n", "np.fill_diagonal(topic_similarities, np.nan) # Exclude self-similarity\n", "avg_similarity = np.nanmean(topic_similarities)\n", "diversity = 1 - avg_similarity\n", "```\n", "\n", "**Note:** The coherence calculation in the notebook is a simplified proxy: it averages the weights of each topic's top 10 words rather than measuring how often those words co-occur. Its values are therefore not on the C_v or NPMI scale, and the 0.3 and 0.5 thresholds given earlier do not apply to them. Established coherence measures (like C_v from Röder et al.) require word co-occurrence statistics computed from the tokenized documents or from a reference corpus. The simplified version is best used only for rough comparisons across models in this notebook.\n", "\n", "---\n", "\n", "## Additional Resources\n", "\n", "**For deeper understanding:**\n", "- **Interactive tutorial:** [Visualizing Topic Models](https://www.youtube.com/watch?v=IUAHUEy1V0Q) by David Blei\n", "- **Practical guide:** [Topic Modeling Evaluation](https://radimrehurek.com/gensim/auto_examples/tutorials/run_lda.html) in gensim documentation\n", "- **Comprehensive review:** Stevens et al. (2012). Exploring Topic Coherence over Many Models and Many Topics. 
*EMNLP-CoNLL*, 952-961.\n", "\n", "**Software implementations:**\n", "- Scikit-learn LDA: https://scikit-learn.org/stable/modules/generated/sklearn.decomposition.LatentDirichletAllocation.html\n", "- Gensim (includes more sophisticated coherence measures): https://radimrehurek.com/gensim/models/coherencemodel.html\n", "- OCTIS (comprehensive topic modeling evaluation toolkit): https://github.com/MIND-Lab/OCTIS\n", "\n", "---\n", "\n", "## Key Takeaway\n", "\n", "**A single model rarely wins on every metric, as Model 3 does in the example above; trade-offs are the norm.** The goal is to find a model that balances:\n", "- Good fit to your data (reasonable perplexity)\n", "- Interpretable topics (good coherence)\n", "- Distinct themes (good diversity)\n", "\n", "**For stakeholder interview analysis**, coherence and diversity are often more important than achieving the absolute lowest perplexity, because you need topics that make sense and capture different aspects of stakeholder concerns.\n", "\n", "---\n", "\n", "*This documentation cell explains quality metrics used in topic model evaluation. 
Add it near your model comparison cells for reference.*" ] }, { "cell_type": "code", "execution_count": 22, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Creating interactive model selector...\n", "============================================================\n", "\n", " Launching interactive selector...\n", "\n", " → Building interface...\n", " ✓ Interface ready!\n" ] }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "c240b1a7f99d41d0929c57c6a2840c0e", "version_major": 2, "version_minor": 0 }, "text/plain": [ "VBox(children=(Dropdown(description='Select Model:', layout=Layout(width='500px'), options=(('Model 1 (Baselin…" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "\n", " COMPLETE\n", "============================================================\n", "\n" ] } ], "source": [ "# Cell 19 Interactive Selector\n", "print(\"Creating interactive model selector...\")\n", "print(\"=\"*60)\n", "\n", "import ipywidgets as widgets\n", "from IPython.display import display, clear_output\n", "\n", "def create_model_selector(all_results):\n", " \"\"\"Create interactive dropdown\"\"\"\n", " \n", " print(\" → Building interface...\")\n", " \n", " output = widgets.Output()\n", " \n", " model_dropdown = widgets.Dropdown(\n", " options=[(r['name'], i) for i, r in enumerate(all_results)],\n", " description='Select Model:',\n", " style={'description_width': '120px'},\n", " layout=widgets.Layout(width='500px')\n", " )\n", " \n", " def show_model_topics(change):\n", " with output:\n", " clear_output()\n", " result = all_results[change['new']]\n", " \n", " print(f\"\\n{'='*80}\")\n", " print(f\" {result['name']} - {result['n_topics']} Topics\")\n", " print(f\"{'='*80}\\n\")\n", " \n", " for idx, topic in enumerate(result['lda_model'].components_):\n", " top_indices = topic.argsort()[-10:][::-1]\n", " top_words = [result['feature_names'][i] for i in top_indices]\n", " 
\n", " print(f\"Topic {idx+1}: {', '.join(top_words[:3])}\")\n", " print(f\" Full: {', '.join(top_words)}\")\n", " print()\n", " \n", " model_dropdown.observe(show_model_topics, names='value')\n", " show_model_topics({'new': 0})\n", " \n", " print(\" ✓ Interface ready!\")\n", " \n", " display(widgets.VBox([model_dropdown, output]))\n", "\n", "print(\"\\n Launching interactive selector...\\n\")\n", "create_model_selector(all_results)\n", "\n", "print(\"\\n COMPLETE\")\n", "print(\"=\"*60 + \"\\n\")" ] }, { "cell_type": "code", "execution_count": 23, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "================================================================================\n", "🔬 MAPPING TOPICS TO SCIENCE DOMAINS\n", "================================================================================\n", "\n", "Step 1: Checking requirements...\n", "--------------------------------------------------------------------------------\n", "✓ Using selected model: 25 topics\n", "✓ Vocabulary: 543 terms\n", "✓ Documents: 9 interviews\n", "\n", "Step 2: Defining science backbone framework...\n", "--------------------------------------------------------------------------------\n", "✓ Science framework defined: 8 domains\n", "\n", "Domains with subdisciplines:\n", " • Hydrological Science: 22 keywords, 5 subdisciplines\n", " (Water systems, flow, flooding, Arctic hydrology)\n", " • Climate Science: 22 keywords, 5 subdisciplines\n", " (Climate patterns, permafrost, Arctic climate change)\n", " • Infrastructure Engineering: 31 keywords, 5 subdisciplines\n", " (Built systems, Arctic infrastructure challenges, operations)\n", " • Environmental Health: 23 keywords, 5 subdisciplines\n", " (Public health, water quality, sanitation challenges)\n", " • Social Systems: 23 keywords, 5 subdisciplines\n", " (Community, social aspects, Alaska Native perspectives)\n", " • Governance & Policy: 23 keywords, 5 subdisciplines\n", " (Policy, regulations, 
governance, Alaska agencies)\n", " • Economics & Resources: 23 keywords, 5 subdisciplines\n", " (Economics, funding, resource constraints)\n", " • Technical Operations: 29 keywords, 5 subdisciplines\n", " (Daily operations, technical management, training needs)\n", "\n", "Step 3: Mapping topics to science domains...\n", "--------------------------------------------------------------------------------\n", "\n", "Topic 0: entity, round, whole system, collection, every community\n", " → Primary: Infrastructure Engineering (confidence: 0.25)\n", " → Matched: collection\n", "\n", "Topic 1: myself, site, overflow, happening, depends\n", " → Primary: Social Systems (confidence: 0.17)\n", " → Matched: traditional\n", "\n", "Topic 2: plant right, respond, quite, distribution, order\n", " → Primary: Infrastructure Engineering (confidence: 0.25)\n", " → Matched: distribution\n", "\n", "Topic 3: addressing, time right, respond, taken, time yeah\n", " → Primary: Hydrological Science (confidence: 0.00)\n", "\n", "Topic 4: yeah little, water every, imagine, texas, must\n", " → Primary: Environmental Health (confidence: 0.10)\n", " → Matched: health\n", "\n", "Topic 5: describe, basis, jump, water every, solve\n", " → Primary: Environmental Health (confidence: 0.00)\n", " → Matched: testing\n", "\n", "Topic 6: people yeah, arctic, figure, lose, single\n", " → Primary: Climate Science (confidence: 0.50)\n", " → Matched: arctic\n", "\n", "Topic 7: confidence, water community, happy, earlier, chemicals\n", " → Primary: Governance & Policy (confidence: 0.10)\n", " → Matched: anthc\n", "\n", "Topic 8: leaving, complex, ahead, picture, helps\n", " → Primary: Hydrological Science (confidence: 0.00)\n", "\n", "Topic 9: order, yeah couple, erosion, totally, water doesn\n", " → Primary: Hydrological Science (confidence: 0.00)\n", "\n", "Topic 10: weekend, strong, right water, separate, water house\n", " → Primary: Climate Science (confidence: 0.00)\n", " → Matched: impact\n", "\n", "Topic 
11: driver, white, health, remote, program\n", " → Primary: Environmental Health (confidence: 0.33)\n", " → Matched: health, testing\n", "\n", "Topic 12: helping, interviews, form, rather, state level\n", " → Primary: Social Systems (confidence: 0.00)\n", " → Matched: family\n", "\n", "Topic 13: come work, professional, penalty, worker, mostly\n", " → Primary: Climate Science (confidence: 0.14)\n", " → Matched: permafrost\n", "\n", "Topic 14: helpful, aspects, knowledge, penalty, driven\n", " → Primary: Technical Operations (confidence: 0.33)\n", " → Matched: knowledge\n", "\n", "Topic 15: kinds, require, starts, calling, route\n", " → Primary: Infrastructure Engineering (confidence: 0.00)\n", " → Matched: distribution\n", "\n", "Topic 16: front, round, license, success, breaking\n", " → Primary: Hydrological Science (confidence: 0.00)\n", "\n", "Topic 17: distribution, specific, holding tank, people yeah, areas\n", " → Primary: Infrastructure Engineering (confidence: 1.00)\n", " → Matched: distribution\n", "\n", "Topic 18: draw, different types, assuming, turned, towards water\n", " → Primary: Hydrological Science (confidence: 0.14)\n", " → Matched: spring\n", "\n", "Topic 19: lift station, anyone, totally, solutions, covered\n", " → Primary: Climate Science (confidence: 0.00)\n", " → Matched: frozen\n", "\n", "Topic 20: something yeah, classes, directly, additional, based\n", " → Primary: Hydrological Science (confidence: 0.00)\n", "\n", "Topic 21: program, something yeah, seasonal, apart, beginning\n", " → Primary: Governance & Policy (confidence: 1.00)\n", " → Matched: program\n", "\n", "Topic 22: water community, yeah little, national, helpful, maintenance workers\n", " → Primary: Governance & Policy (confidence: 0.17)\n", " → Matched: anthc\n", "\n", "Topic 23: useful, educational, passed, right getting, teaching\n", " → Primary: Hydrological Science (confidence: 0.00)\n", "\n", "Topic 24: basis, water community, connect, apart, repeat\n", " → Primary: 
Economics & Resources (confidence: 0.00)\n", " → Matched: barrier\n", "\n", "✅ Mapped 25 topics\n", "\n", "================================================================================\n", "📊 MAPPING SUMMARY\n", "================================================================================\n", "\n", "Topics per science domain:\n", " • Hydrological Science: 7 topics\n", " • Infrastructure Engineering: 4 topics\n", " • Climate Science: 4 topics\n", " • Environmental Health: 3 topics\n", " • Governance & Policy: 3 topics\n", " • Social Systems: 2 topics\n", " • Technical Operations: 1 topics\n", " • Economics & Resources: 1 topics\n", "\n", "Average mapping confidence: 0.18\n", "\n", "================================================================================\n", "📋 TOPIC-DOMAIN MAPPING TABLE\n", "================================================================================\n", "\n", " Topic Primary Domain Confidence Top Words Keywords Matched\n", " 0 Infrastructure Engineering 0.25 entity, round, whole system, collection, every community collection\n", " 1 Social Systems 0.17 myself, site, overflow, happening, depends traditional\n", " 2 Infrastructure Engineering 0.25 plant right, respond, quite, distribution, order distribution\n", " 3 Hydrological Science 0.00 addressing, time right, respond, taken, time yeah None\n", " 4 Environmental Health 0.10 yeah little, water every, imagine, texas, must health\n", " 5 Environmental Health 0.00 describe, basis, jump, water every, solve testing\n", " 6 Climate Science 0.50 people yeah, arctic, figure, lose, single arctic\n", " 7 Governance & Policy 0.10 confidence, water community, happy, earlier, chemicals anthc\n", " 8 Hydrological Science 0.00 leaving, complex, ahead, picture, helps None\n", " 9 Hydrological Science 0.00 order, yeah couple, erosion, totally, water doesn None\n", " 10 Climate Science 0.00 weekend, strong, right water, separate, water house impact\n", " 11 Environmental Health 0.33 driver, 
white, health, remote, program health, testing\n", " 12 Social Systems 0.00 helping, interviews, form, rather, state level family\n", " 13 Climate Science 0.14 come work, professional, penalty, worker, mostly permafrost\n", " 14 Technical Operations 0.33 helpful, aspects, knowledge, penalty, driven knowledge\n", " 15 Infrastructure Engineering 0.00 kinds, require, starts, calling, route distribution\n", " 16 Hydrological Science 0.00 front, round, license, success, breaking None\n", " 17 Infrastructure Engineering 1.00 distribution, specific, holding tank, people yeah, areas distribution\n", " 18 Hydrological Science 0.14 draw, different types, assuming, turned, towards water spring\n", " 19 Climate Science 0.00 lift station, anyone, totally, solutions, covered frozen\n", " 20 Hydrological Science 0.00 something yeah, classes, directly, additional, based None\n", " 21 Governance & Policy 1.00 program, something yeah, seasonal, apart, beginning program\n", " 22 Governance & Policy 0.17 water community, yeah little, national, helpful, maintenance workers anthc\n", " 23 Hydrological Science 0.00 useful, educational, passed, right getting, teaching None\n", " 24 Economics & Resources 0.00 basis, water community, connect, apart, repeat barrier\n", "\n", "================================================================================\n", "✅ TOPIC MAPPING COMPLETE\n", "================================================================================\n", "\n", "Variables created:\n", " • topic_mappings (list) - Full mapping details ✓\n", " • topic_mappings_dict (dict) - Indexed by topic_id ✓\n", " • topic_mappings_df (DataFrame) - For display/export ✓\n", " • science_backbone (dict) - Framework definition ✓\n", "\n", "Verification:\n", " • type(topic_mappings): \n", " • len(topic_mappings): 25\n", " • topic_mappings in globals(): True\n", "\n", "💡 Next steps:\n", " → Cell 21: Extract Scientific Variable Objects (SVOs)\n", " → Cell 22: Link SVOs to decision components\n", " 
→ Cell 23: Build science backbone network\n", "\n", "================================================================================\n" ] } ], "source": [ "# CELL 20: Map Topics to Science Domains (FIXED VERSION)\n", "\n", "print(\"=\"*80)\n", "print(\"🔬 MAPPING TOPICS TO SCIENCE DOMAINS\")\n", "print(\"=\"*80 + \"\\n\")\n", "\n", "import pandas as pd\n", "from collections import defaultdict\n", "\n", "# ==========================================\n", "# 1. CHECK REQUIREMENTS\n", "# ==========================================\n", "print(\"Step 1: Checking requirements...\")\n", "print(\"-\"*80)\n", "\n", "# Check if model is selected\n", "if 'lda_model' not in globals():\n", " print(\"❌ No model selected!\")\n", " print(\" Run the interactive selector (Cell 19) to choose a model\")\n", " print(\" Click 'Use This Model' button to set lda_model\")\n", "else:\n", " print(f\"✓ Using selected model: {lda_model.n_components} topics\")\n", " \n", "if 'feature_names' not in globals():\n", " print(\"❌ feature_names not set\")\n", " print(\" Run interactive selector to set variables\")\n", "else:\n", " print(f\"✓ Vocabulary: {len(feature_names)} terms\")\n", "\n", "if 'documents' not in globals():\n", " print(\"❌ documents not available\")\n", "else:\n", " print(f\"✓ Documents: {len(documents)} interviews\")\n", "\n", "print()\n", "\n", "# ==========================================\n", "# 2. 
DEFINE SCIENCE BACKBONE (ALASKA CONTEXT)\n", "# ==========================================\n", "print(\"Step 2: Defining science backbone framework...\")\n", "print(\"-\"*80)\n", "\n", "# Comprehensive science framework with Alaska/Arctic water infrastructure context\n", "science_backbone = {\n", " 'Hydrological Science': {\n", " 'keywords': [\n", " # Water systems\n", " 'water', 'flood', 'river', 'basin', 'discharge', 'flow', \n", " 'precipitation', 'runoff', 'drainage', 'aquifer', 'groundwater',\n", " 'stream', 'watershed', 'hydrology', 'rainfall', 'snowmelt',\n", " # Arctic-specific\n", " 'ice', 'breakup', 'spring', 'seasonal', 'melt', 'freeze'\n", " ],\n", " 'description': 'Water systems, flow, flooding, Arctic hydrology',\n", " 'subdisciplines': [\n", " 'Surface Water Hydrology',\n", " 'Groundwater Systems',\n", " 'Water Quality',\n", " 'Hydrologic Modeling',\n", " 'Arctic Hydrology'\n", " ]\n", " },\n", " \n", " 'Climate Science': {\n", " 'keywords': [\n", " # General climate\n", " 'climate', 'temperature', 'warming', 'weather', 'seasonal',\n", " 'storm', 'precipitation', 'snow',\n", " # Arctic-specific\n", " 'permafrost', 'thaw', 'freeze', 'arctic', 'tundra', 'melt',\n", " 'ice', 'frozen', 'subsidence', 'degradation', 'active layer',\n", " 'adaptation', 'change', 'impact'\n", " ],\n", " 'description': 'Climate patterns, permafrost, Arctic climate change',\n", " 'subdisciplines': [\n", " 'Arctic Climate',\n", " 'Permafrost Science',\n", " 'Climate Adaptation',\n", " 'Extreme Events',\n", " 'Seasonal Dynamics'\n", " ]\n", " },\n", " \n", " 'Infrastructure Engineering': {\n", " 'keywords': [\n", " # General infrastructure\n", " 'infrastructure', 'system', 'pipe', 'tank', 'facility',\n", " 'building', 'construction', 'maintenance', 'repair', \n", " 'equipment', 'treatment', 'distribution', 'collection',\n", " # Alaska/Arctic-specific\n", " 'piped', 'haul', 'utilidor', 'pump house', 'pumphouse',\n", " 'washeteria', 'lagoon', 'boardwalk', 'foundation',\n", " 
'thaw', 'settlement', 'damage', 'freeze protection',\n", " # Operations\n", " 'plant', 'operator', 'utility', 'operation', 'service'\n", " ],\n", " 'description': 'Built systems, Arctic infrastructure challenges, operations',\n", " 'subdisciplines': [\n", " 'Water Treatment',\n", " 'Wastewater Systems',\n", " 'Distribution Networks',\n", " 'Arctic Engineering',\n", " 'Asset Management'\n", " ]\n", " },\n", " \n", " 'Environmental Health': {\n", " 'keywords': [\n", " # Water quality\n", " 'health', 'quality', 'contamination', 'treatment', 'safe',\n", " 'sanitation', 'wastewater', 'drinking', 'protection', \n", " 'testing', 'monitoring', 'compliance', 'standard', 'disinfection',\n", " # Arctic/remote context\n", " 'honey bucket', 'waste', 'disposal', 'haul', 'access',\n", " 'illness', 'disease', 'bacteria', 'pathogen'\n", " ],\n", " 'description': 'Public health, water quality, sanitation challenges',\n", " 'subdisciplines': [\n", " 'Water Quality',\n", " 'Public Health',\n", " 'Sanitation',\n", " 'Environmental Monitoring',\n", " 'Risk Assessment'\n", " ]\n", " },\n", " \n", " 'Social Systems': {\n", " 'keywords': [\n", " # Community\n", " 'community', 'people', 'resident', 'household', 'family',\n", " 'population', 'village', 'remote', 'access', 'service',\n", " 'education', 'training', 'awareness', 'engagement',\n", " # Alaska-specific\n", " 'native', 'subsistence', 'traditional', 'culture',\n", " 'elder', 'youth', 'local', 'tribal', 'council'\n", " ],\n", " 'description': 'Community, social aspects, Alaska Native perspectives',\n", " 'subdisciplines': [\n", " 'Community Engagement',\n", " 'Indigenous Knowledge',\n", " 'Social Equity',\n", " 'Capacity Building',\n", " 'Participatory Methods'\n", " ]\n", " },\n", " \n", " 'Governance & Policy': {\n", " 'keywords': [\n", " # Policy/regulation\n", " 'policy', 'regulation', 'compliance', 'permit', 'government',\n", " 'agency', 'authority', 'planning', 'program', 'requirement', \n", " 'standard', 'law', 
'enforcement',\n", " # Alaska-specific\n", " 'state', 'federal', 'tribal', 'epa', 'dec', 'anthc',\n", " 'vsc', 'rural', 'utility', 'cooperative'\n", " ],\n", " 'description': 'Policy, regulations, governance, Alaska agencies',\n", " 'subdisciplines': [\n", " 'Environmental Regulation',\n", " 'Tribal Governance',\n", " 'State/Federal Policy',\n", " 'Funding Mechanisms',\n", " 'Compliance'\n", " ]\n", " },\n", " \n", " 'Economics & Resources': {\n", " 'keywords': [\n", " # Financial\n", " 'cost', 'funding', 'budget', 'resource', 'economic',\n", " 'financial', 'investment', 'grant', 'afford', 'expense',\n", " 'price', 'money', 'support', 'capacity',\n", " # Alaska context\n", " 'subsidy', 'federal', 'state', 'remote', 'expensive',\n", " 'limited', 'shortage', 'challenge', 'barrier'\n", " ],\n", " 'description': 'Economics, funding, resource constraints',\n", " 'subdisciplines': [\n", " 'Infrastructure Finance',\n", " 'Cost-Benefit Analysis',\n", " 'Resource Management',\n", " 'Economic Development',\n", " 'Affordability'\n", " ]\n", " },\n", " \n", " 'Technical Operations': {\n", " 'keywords': [\n", " # Operations\n", " 'operate', 'operator', 'maintenance', 'repair', 'manage',\n", " 'monitor', 'control', 'test', 'check', 'inspect',\n", " 'troubleshoot', 'fix', 'service', 'routine',\n", " # Technical\n", " 'system', 'equipment', 'pressure', 'valve', 'pump',\n", " 'meter', 'sensor', 'alarm', 'process', 'procedure',\n", " # Training\n", " 'training', 'certification', 'skill', 'knowledge', 'experience'\n", " ],\n", " 'description': 'Daily operations, technical management, training needs',\n", " 'subdisciplines': [\n", " 'Operations & Maintenance',\n", " 'Process Control',\n", " 'Technical Training',\n", " 'Performance Monitoring',\n", " 'Troubleshooting'\n", " ]\n", " }\n", "}\n", "\n", "print(f\"✓ Science framework defined: {len(science_backbone)} domains\")\n", "print(\"\\nDomains with subdisciplines:\")\n", "for domain, info in science_backbone.items():\n", " 
keyword_count = len(info['keywords'])\n", " subdiscipline_count = len(info.get('subdisciplines', []))\n", " print(f\" • {domain}: {keyword_count} keywords, {subdiscipline_count} subdisciplines\")\n", " print(f\" ({info['description']})\")\n", "\n", "print()\n", "\n", "# ==========================================\n", "# 3. MAP TOPICS TO DOMAINS\n", "# ==========================================\n", "if 'lda_model' in globals() and 'feature_names' in globals():\n", " print(\"Step 3: Mapping topics to science domains...\")\n", " print(\"-\"*80 + \"\\n\")\n", " \n", " # Initialize topic_mappings list (THIS IS CRITICAL!)\n", " topic_mappings = []\n", " \n", " for topic_idx in range(lda_model.n_components):\n", " # Get top words for this topic\n", " topic = lda_model.components_[topic_idx]\n", " top_indices = topic.argsort()[-20:][::-1] # Top 20 words\n", " top_words = [feature_names[i] for i in top_indices]\n", " top_weights = topic[top_indices]\n", " \n", " # Calculate match score for each domain\n", " domain_scores = {}\n", " \n", " for domain, info in science_backbone.items():\n", " domain_keywords = set(info['keywords'])\n", " topic_words_set = set(top_words)\n", " \n", " # Count matches\n", " matches = domain_keywords & topic_words_set\n", " match_count = len(matches)\n", " \n", " # Weight by position (earlier words = higher weight)\n", " weighted_score = 0\n", " for i, word in enumerate(top_words[:10]): # Top 10 only\n", " if word in domain_keywords:\n", " weight = 1.0 / (i + 1) # Position weight\n", " weighted_score += weight\n", " \n", " domain_scores[domain] = {\n", " 'match_count': match_count,\n", " 'weighted_score': weighted_score,\n", " 'matched_words': list(matches)\n", " }\n", " \n", " # Find best matching domain\n", " if domain_scores:\n", " best_domain = max(domain_scores.items(), \n", " key=lambda x: (x[1]['weighted_score'], x[1]['match_count']))\n", " \n", " primary_domain = best_domain[0]\n", " confidence = best_domain[1]['weighted_score']\n", " 
matched_words = best_domain[1]['matched_words']\n",
"\n",
"            # Find secondary domains (if any)\n",
"            secondary_domains = [\n",
"                d for d, score in domain_scores.items()\n",
"                if d != primary_domain and score['weighted_score'] > 0.5\n",
"            ]\n",
"        else:\n",
"            primary_domain = 'Unclassified'\n",
"            confidence = 0.0\n",
"            matched_words = []\n",
"            secondary_domains = []\n",
"\n",
"        # Store the mapping for this topic (one entry per topic)\n",
"        topic_mappings.append({\n",
"            'topic_id': topic_idx,\n",
"            'primary_domain': primary_domain,\n",
"            'secondary_domains': secondary_domains,\n",
"            'confidence': confidence,\n",
"            'top_words': top_words[:10],\n",
"            'top_weights': top_weights[:10].tolist(),\n",
"            'matched_keywords': matched_words,\n",
"            'all_domain_scores': {d: s['weighted_score'] for d, s in domain_scores.items()}\n",
"        })\n",
"\n",
"        # Display\n",
"        print(f\"Topic {topic_idx}: {', '.join(top_words[:5])}\")\n",
"        print(f\" → Primary: {primary_domain} (confidence: {confidence:.2f})\")\n",
"        if secondary_domains:\n",
"            print(f\" → Secondary: {', '.join(secondary_domains)}\")\n",
"        if matched_words:\n",
"            print(f\" → Matched: {', '.join(matched_words[:5])}\")\n",
"        print()\n",
"\n",
"    print(f\"✅ Mapped {len(topic_mappings)} topics\")\n",
"    print()\n",
"\n",
"    # ==========================================\n",
"    # 4. SUMMARY STATISTICS\n",
"    # ==========================================\n",
"    print(\"=\"*80)\n",
"    print(\"📊 MAPPING SUMMARY\")\n",
"    print(\"=\"*80 + \"\\n\")\n",
"\n",
"    # Count topics per domain\n",
"    domain_counts = defaultdict(int)\n",
"    for mapping in topic_mappings:\n",
"        domain_counts[mapping['primary_domain']] += 1\n",
"\n",
"    print(\"Topics per science domain:\")\n",
"    for domain, count in sorted(domain_counts.items(), key=lambda x: -x[1]):\n",
"        print(f\" • {domain}: {count} topics\")\n",
"\n",
"    print()\n",
"\n",
"    # Average confidence\n",
"    avg_confidence = sum(m['confidence'] for m in topic_mappings) / len(topic_mappings)\n",
"    print(f\"Average mapping confidence: {avg_confidence:.2f}\")\n",
"\n",
"    # Unclassified topics\n",
"    unclassified = [m for m in topic_mappings if m['primary_domain'] == 'Unclassified']\n",
"    if unclassified:\n",
"        print(f\"\\n⚠️ {len(unclassified)} unclassified topics\")\n",
"        for m in unclassified:\n",
"            print(f\" • Topic {m['topic_id']}: {', '.join(m['top_words'][:5])}\")\n",
"\n",
"    print()\n",
"\n",
"    # ==========================================\n",
"    # 5. CREATE DATAFRAME\n",
"    # ==========================================\n",
"    topic_mappings_df = pd.DataFrame([{\n",
"        'Topic': m['topic_id'],\n",
"        'Primary Domain': m['primary_domain'],\n",
"        'Confidence': f\"{m['confidence']:.2f}\",\n",
"        'Top Words': ', '.join(m['top_words'][:5]),\n",
"        'Keywords Matched': ', '.join(m['matched_keywords'][:3]) if m['matched_keywords'] else 'None'\n",
"    } for m in topic_mappings])\n",
"\n",
"    print(\"=\"*80)\n",
"    print(\"📋 TOPIC-DOMAIN MAPPING TABLE\")\n",
"    print(\"=\"*80 + \"\\n\")\n",
"    print(topic_mappings_df.to_string(index=False))\n",
"    print()\n",
"\n",
"    # ==========================================\n",
"    # 6. SAVE FOR DOWNSTREAM USE\n",
"    # ==========================================\n",
"    # Keep both formats: the list for iteration, the dict for lookup by topic_id\n",
"    topic_mappings_dict = {m['topic_id']: m for m in topic_mappings}\n",
"\n",
"    print(\"=\"*80)\n",
"    print(\"✅ TOPIC MAPPING COMPLETE\")\n",
"    print(\"=\"*80 + \"\\n\")\n",
"\n",
"    print(\"Variables created:\")\n",
"    print(\" • topic_mappings (list) - Full mapping details ✓\")\n",
"    print(\" • topic_mappings_dict (dict) - Indexed by topic_id ✓\")\n",
"    print(\" • topic_mappings_df (DataFrame) - For display/export ✓\")\n",
"    print(\" • science_backbone (dict) - Framework definition ✓\")\n",
"    print()\n",
"\n",
"    # Verify that the variable exists before downstream cells use it\n",
"    print(\"Verification:\")\n",
"    print(f\" • type(topic_mappings): {type(topic_mappings)}\")\n",
"    print(f\" • len(topic_mappings): {len(topic_mappings)}\")\n",
"    print(f\" • topic_mappings in globals(): {'topic_mappings' in globals()}\")\n",
"    print()\n",
"\n",
"    print(\"💡 Next steps:\")\n",
"    print(\" → Cell 21: Extract Scientific Variable Objects (SVOs)\")\n",
"    print(\" → Cell 22: Link SVOs to decision components\")\n",
"    print(\" → Cell 23: Build science backbone network\")\n",
"\n",
"else:\n",
"    print(\"\\n⚠️ Cannot create topic mappings\")\n",
"    print(\"Missing required variables: lda_model, feature_names\")\n",
"    print(\"Run the interactive selector (Cell 19) first\")\n",
"    print()\n",
"\n",
"    # Create empty placeholders so downstream cells (e.g., Cell 21) don't crash\n",
"    topic_mappings = []\n",
"    topic_mappings_dict = {}\n",
"    science_backbone = {}\n",
"    print(\"Created empty placeholders to prevent errors in downstream cells\")\n",
"\n",
"print(\"\\n\" + \"=\"*80)" ] }, { "cell_type": "code", "execution_count": 25, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [
"================================================================================\n",
"🌐 SCIENCE FRAMEWORK STRUCTURE VISUALIZATION\n",
"================================================================================\n",
"\n",
"Step 1: 
Checking science backbone...\n",
"--------------------------------------------------------------------------------\n",
"✓ Science backbone loaded: 8 domains\n",
"✓ Total subdisciplines: 40\n",
"\n",
"Step 2: Building hierarchical network structure...\n",
"--------------------------------------------------------------------------------\n",
"✓ Created network with 49 nodes and 48 edges\n",
" • Domains: 8\n",
" • Subdisciplines: 40\n",
"\n",
"Step 3: Computing hierarchical layout...\n",
"--------------------------------------------------------------------------------\n",
"✓ Layout computed\n",
"\n",
"Step 4: Creating network visualization...\n",
"--------------------------------------------------------------------------------\n",
"✓ Network visualization created\n",
"\n" ] },
{ "data": { "text/plain": [ "[Plotly figure: Science Framework: Domains and Subdisciplines (Alaska Water Infrastructure Context). Network graph with traces Framework (1 node), Domains (8 nodes, each with 5 subdisciplines and 6 connections), and Subdisciplines (40 nodes), joined by 48 edges.]" ] }, "metadata": {}, "output_type": "display_data" },
{ "name": "stdout", "output_type": "stream", "text": [
"Step 5: Creating sunburst diagram (alternative view)...\n",
"--------------------------------------------------------------------------------\n",
"✓ Sunburst diagram created\n",
"\n" ] },
{ "data": { "text/plain": [ "[Plotly figure: Science Framework Hierarchy (Sunburst). Root 'All Domains' with 8 domains, each split into 5 subdisciplines.]" ] }, "metadata": {}, "output_type": "display_data" },
{ "name": "stdout", "output_type": "stream", "text": [
"Step 6: Creating domain summary table...\n",
"--------------------------------------------------------------------------------\n",
"\n",
"================================================================================\n",
"📊 DOMAIN SUMMARY\n",
"================================================================================\n",
"\n",
" Domain Subdisciplines Keywords Description\n",
" Hydrological Science 5 22 Water systems, flow, flooding, Arctic hydrology\n",
" Climate Science 5 22 Climate patterns, permafrost, Arctic climate change\n",
"Infrastructure Engineering 5 31 Built systems, Arctic infrastructure challenges, operations\n",
" Environmental Health 5 23 Public health, water quality, sanitation challenges\n",
" Social Systems 5 23 Community, social aspects, Alaska Native perspectives\n",
" Governance & Policy 5 23 Policy, regulations, governance, Alaska agencies\n",
" Economics & Resources 5 23 Economics, funding, resource constraints\n",
" Technical Operations 5 29 Daily operations, technical management, training needs\n",
"\n",
"================================================================================\n",
"📋 DETAILED FRAMEWORK STRUCTURE\n",
"================================================================================\n",
"\n",
"Hydrological Science\n",
" Description: Water systems, flow, flooding, Arctic hydrology\n",
" Keywords: 22 terms\n",
" Subdisciplines (5):\n",
" • Surface Water Hydrology\n",
" • Groundwater Systems\n",
" • Water Quality\n",
" • Hydrologic Modeling\n",
" • Arctic Hydrology\n",
"\n",
"Climate Science\n",
" Description: Climate patterns, permafrost, Arctic climate change\n",
" Keywords: 22 terms\n",
" Subdisciplines (5):\n",
" • Arctic Climate\n",
" • Permafrost Science\n",
" • Climate Adaptation\n",
" • Extreme Events\n",
" • Seasonal Dynamics\n",
"\n",
"Infrastructure Engineering\n",
" Description: Built 
systems, Arctic infrastructure challenges, operations\n", " Keywords: 31 terms\n", " Subdisciplines (5):\n", " • Water Treatment\n", " • Wastewater Systems\n", " • Distribution Networks\n", " • Arctic Engineering\n", " • Asset Management\n", "\n", "Environmental Health\n", " Description: Public health, water quality, sanitation challenges\n", " Keywords: 23 terms\n", " Subdisciplines (5):\n", " • Water Quality\n", " • Public Health\n", " • Sanitation\n", " • Environmental Monitoring\n", " • Risk Assessment\n", "\n", "Social Systems\n", " Description: Community, social aspects, Alaska Native perspectives\n", " Keywords: 23 terms\n", " Subdisciplines (5):\n", " • Community Engagement\n", " • Indigenous Knowledge\n", " • Social Equity\n", " • Capacity Building\n", " • Participatory Methods\n", "\n", "Governance & Policy\n", " Description: Policy, regulations, governance, Alaska agencies\n", " Keywords: 23 terms\n", " Subdisciplines (5):\n", " • Environmental Regulation\n", " • Tribal Governance\n", " • State/Federal Policy\n", " • Funding Mechanisms\n", " • Compliance\n", "\n", "Economics & Resources\n", " Description: Economics, funding, resource constraints\n", " Keywords: 23 terms\n", " Subdisciplines (5):\n", " • Infrastructure Finance\n", " • Cost-Benefit Analysis\n", " • Resource Management\n", " • Economic Development\n", " • Affordability\n", "\n", "Technical Operations\n", " Description: Daily operations, technical management, training needs\n", " Keywords: 29 terms\n", " Subdisciplines (5):\n", " • Operations & Maintenance\n", " • Process Control\n", " • Technical Training\n", " • Performance Monitoring\n", " • Troubleshooting\n", "\n", "================================================================================\n", "✅ FRAMEWORK VISUALIZATION COMPLETE\n", "================================================================================\n", "\n", "Visualizations created:\n", " • Network diagram (domains & subdisciplines)\n", " • Sunburst diagram 
(hierarchical view)\n", "\n", "Framework statistics:\n", " • Domains: 8\n", " • Total subdisciplines: 40\n", " • Avg subdisciplines per domain: 5.0\n", " • Total keywords: 196\n", "\n", "Variables available:\n", " • G (NetworkX graph) - Framework network structure\n", " • fig (Plotly figure) - Network visualization\n", " • fig_sunburst (Plotly figure) - Sunburst diagram\n", " • summary_df (DataFrame) - Domain summary table\n", "\n", "💡 This framework provides:\n", " • Structured science domains for semantic mapping\n", " • Subdisciplines for detailed categorization\n", " • Keywords for automated topic matching\n", " • Visual representation of framework scope\n", "\n", "================================================================================\n" ] } ], "source": [ "# CELL 20B: Science Framework Visualization (Domains & Subdisciplines)\n", "\n", "print(\"=\"*80)\n", "print(\"🌐 SCIENCE FRAMEWORK STRUCTURE VISUALIZATION\")\n", "print(\"=\"*80 + \"\\n\")\n", "\n", "import plotly.graph_objects as go\n", "import plotly.express as px\n", "import networkx as nx\n", "import numpy as np\n", "\n", "# ==========================================\n", "# 1. CHECK REQUIREMENTS\n", "# ==========================================\n", "print(\"Step 1: Checking science backbone...\")\n", "print(\"-\"*80)\n", "\n", "if 'science_backbone' not in globals() or not science_backbone:\n", " print(\"❌ science_backbone not found!\")\n", " print(\" Run Cell 20 first to create framework\")\n", " print()\n", "else:\n", " print(f\"✓ Science backbone loaded: {len(science_backbone)} domains\")\n", " \n", " # Count subdisciplines\n", " total_subdisciplines = sum(\n", " len(info.get('subdisciplines', [])) \n", " for info in science_backbone.values()\n", " )\n", " print(f\"✓ Total subdisciplines: {total_subdisciplines}\")\n", " print()\n", " \n", " # ==========================================\n", " # 2. 
CREATE HIERARCHICAL NETWORK\n", " # ==========================================\n", " print(\"Step 2: Building hierarchical network structure...\")\n", " print(\"-\"*80)\n", " \n", " # Create network graph\n", " G = nx.Graph()\n", " \n", " # Track nodes by level\n", " domain_nodes = []\n", " subdiscipline_nodes = []\n", " \n", " # Add center node (framework root)\n", " G.add_node(\n", " \"Science Framework\",\n", " level=0,\n", " node_type='root',\n", " size=40,\n", " color='#666666'\n", " )\n", " \n", " # Add domains and subdisciplines\n", " for domain_name, domain_info in science_backbone.items():\n", " # Add domain node\n", " G.add_node(\n", " domain_name,\n", " level=1,\n", " node_type='domain',\n", " size=30,\n", " color='#1f77b4',\n", " description=domain_info['description']\n", " )\n", " domain_nodes.append(domain_name)\n", " \n", " # Connect to root\n", " G.add_edge(\"Science Framework\", domain_name)\n", " \n", " # Add subdisciplines\n", " subdisciplines = domain_info.get('subdisciplines', [])\n", " for subdiscipline in subdisciplines:\n", " subdiscipline_full_name = f\"{subdiscipline}\\n({domain_name[:15]})\"\n", " \n", " G.add_node(\n", " subdiscipline_full_name,\n", " level=2,\n", " node_type='subdiscipline',\n", " size=15,\n", " color='#ff7f0e',\n", " parent_domain=domain_name\n", " )\n", " subdiscipline_nodes.append(subdiscipline_full_name)\n", " \n", " # Connect to domain\n", " G.add_edge(domain_name, subdiscipline_full_name)\n", " \n", " print(f\"✓ Created network with {G.number_of_nodes()} nodes and {G.number_of_edges()} edges\")\n", " print(f\" • Domains: {len(domain_nodes)}\")\n", " print(f\" • Subdisciplines: {len(subdiscipline_nodes)}\")\n", " print()\n", " \n", " # ==========================================\n", " # 3. 
COMPUTE LAYOUT\n", " # ==========================================\n", " print(\"Step 3: Computing hierarchical layout...\")\n", " print(\"-\"*80)\n", " \n", " # Use a force-directed (spring) layout; the fixed seed keeps it reproducible\n", " pos = nx.spring_layout(G, k=3, iterations=50, seed=42)\n", " \n", " print(\"✓ Layout computed\")\n", " print()\n", " \n", " # ==========================================\n", " # 4. CREATE NETWORK VISUALIZATION\n", " # ==========================================\n", " print(\"Step 4: Creating network visualization...\")\n", " print(\"-\"*80)\n", " \n", " # Create edge traces\n", " edge_trace = go.Scatter(\n", " x=[],\n", " y=[],\n", " mode='lines',\n", " line=dict(width=1, color='#cccccc'),\n", " hoverinfo='none',\n", " showlegend=False\n", " )\n", " \n", " for edge in G.edges():\n", " x0, y0 = pos[edge[0]]\n", " x1, y1 = pos[edge[1]]\n", " edge_trace['x'] += (x0, x1, None)\n", " edge_trace['y'] += (y0, y1, None)\n", " \n", " # Create node traces by type\n", " root_trace = go.Scatter(\n", " x=[],\n", " y=[],\n", " mode='markers+text',\n", " marker=dict(size=[], color=[], line=dict(width=2, color='white')),\n", " text=[],\n", " textposition='middle center',\n", " textfont=dict(size=14, color='white', family='Arial Black'),\n", " name='Framework',\n", " hoverinfo='text',\n", " hovertext=[]\n", " )\n", " \n", " domain_trace = go.Scatter(\n", " x=[],\n", " y=[],\n", " mode='markers+text',\n", " marker=dict(size=[], color=[], line=dict(width=2, color='white')),\n", " text=[],\n", " textposition='top center',\n", " textfont=dict(size=11, family='Arial', color='black'),\n", " name='Domains',\n", " hoverinfo='text',\n", " hovertext=[]\n", " )\n", " \n", " subdiscipline_trace = go.Scatter(\n", " x=[],\n", " y=[],\n", " mode='markers',\n", " marker=dict(size=[], color=[], line=dict(width=1, color='white')),\n", " text=[],\n", " name='Subdisciplines',\n", " hoverinfo='text',\n", " hovertext=[]\n", " )\n", " \n", " # Populate traces\n", " for node in G.nodes():\n", " x, y = pos[node]\n", "
node_data = G.nodes[node]\n", " node_type = node_data['node_type']\n", " size = node_data['size']\n", " color = node_data['color']\n", " \n", " if node_type == 'root':\n", " root_trace['x'] += (x,)\n", " root_trace['y'] += (y,)\n", " root_trace['marker']['size'] += (size,)\n", " root_trace['marker']['color'] += (color,)\n", " root_trace['text'] += ('',) # Don't show text on root\n", " root_trace['hovertext'] += (f\"{node}<br>Alaska Water Infrastructure\",)\n", " \n", " elif node_type == 'domain':\n", " domain_trace['x'] += (x,)\n", " domain_trace['y'] += (y,)\n", " domain_trace['marker']['size'] += (size,)\n", " domain_trace['marker']['color'] += (color,)\n", " domain_trace['text'] += (node,)\n", " \n", " # Count subdisciplines for this domain\n", " subdisciplines = science_backbone[node].get('subdisciplines', [])\n", " hover = f\"{node}<br>\"\n", " hover += f\"{node_data.get('description', '')}<br>\"\n", " hover += f\"Subdisciplines: {len(subdisciplines)}<br>\"\n", " hover += f\"Connections: {G.degree(node)}\"\n", " domain_trace['hovertext'] += (hover,)\n", " \n", " elif node_type == 'subdiscipline':\n", " subdiscipline_trace['x'] += (x,)\n", " subdiscipline_trace['y'] += (y,)\n", " subdiscipline_trace['marker']['size'] += (size,)\n", " subdiscipline_trace['marker']['color'] += (color,)\n", " \n", " # Extract subdiscipline name (remove domain suffix)\n", " subdiscipline_name = node.split('\\n')[0]\n", " parent = node_data.get('parent_domain', '')\n", " \n", " hover = f\"{subdiscipline_name}<br>\"\n", " hover += f\"Domain: {parent}\"\n", " subdiscipline_trace['hovertext'] += (hover,)\n", " \n", " # Create figure\n", " fig = go.Figure(\n", " data=[edge_trace, root_trace, domain_trace, subdiscipline_trace],\n", " layout=go.Layout(\n", " title=dict(\n", " text='Science Framework: Domains and Subdisciplines<br>Alaska Water Infrastructure Context',\n", " x=0.5,\n", " xanchor='center',\n", " font=dict(size=18, family='Arial')\n", " ),\n", " showlegend=True,\n", " legend=dict(\n", " x=0.02,\n", " y=0.98,\n", " bgcolor='rgba(255,255,255,0.8)',\n", " bordercolor='black',\n", " borderwidth=1\n", " ),\n", " hovermode='closest',\n", " margin=dict(b=40, l=5, r=5, t=80),\n", " xaxis=dict(showgrid=False, zeroline=False, showticklabels=False),\n", " yaxis=dict(showgrid=False, zeroline=False, showticklabels=False),\n", " plot_bgcolor='white',\n", " height=800,\n", " font=dict(family='Arial')\n", " )\n", " )\n", " \n", " print(\"✓ Network visualization created\")\n", " print()\n", " \n", " # Display\n", " fig.show()\n", " \n", " # ==========================================\n", " # 5. CREATE SUNBURST ALTERNATIVE\n", " # ==========================================\n", " print(\"Step 5: Creating sunburst diagram (alternative view)...\")\n", " print(\"-\"*80)\n", " \n", " # Build hierarchical data for sunburst\n", " labels = ['All Domains']\n", " parents = ['']\n", " values = [0] # Will be sum of children\n", " colors = ['#cccccc']\n", " \n", " domain_colors = px.colors.qualitative.Set2\n", " \n", " for idx, (domain, info) in enumerate(science_backbone.items()):\n", " # Add domain\n", " labels.append(domain)\n", " parents.append('All Domains')\n", " subdisciplines = info.get('subdisciplines', [])\n", " values.append(len(subdisciplines) if subdisciplines else 1)\n", " colors.append(domain_colors[idx % len(domain_colors)])\n", " \n", " # Add subdisciplines\n", " for subdiscipline in subdisciplines:\n", " labels.append(subdiscipline)\n", " parents.append(domain)\n", " values.append(1)\n", " colors.append(domain_colors[idx % len(domain_colors)])\n", " \n", " # Update root value: with branchvalues='total' the root must equal the sum of its direct children (the domains), not of every node\n", " values[0] = sum(v for v, p in zip(values, parents) if p == 'All Domains')\n", " \n", " fig_sunburst = go.Figure(\n", " go.Sunburst(\n", " labels=labels,\n", " parents=parents,\n", " values=values,\n", " branchvalues='total',\n", " marker=dict(\n", "
colors=colors,\n", " line=dict(color='white', width=2)\n", " ),\n", " hovertemplate='%{label}<br>Subdisciplines: %{value}'\n", " )\n", " )\n", " \n", " fig_sunburst.update_layout(\n", " title=dict(\n", " text='Science Framework Hierarchy (Sunburst)',\n", " x=0.5,\n", " xanchor='center',\n", " font=dict(size=16, family='Arial')\n", " ),\n", " height=700,\n", " font=dict(family='Arial', size=11)\n", " )\n", " \n", " print(\"✓ Sunburst diagram created\")\n", " print()\n", " \n", " fig_sunburst.show()\n", " \n", " # ==========================================\n", " # 6. CREATE SUMMARY TABLE\n", " # ==========================================\n", " print(\"Step 6: Creating domain summary table...\")\n", " print(\"-\"*80)\n", " \n", " import pandas as pd\n", " \n", " summary_data = []\n", " for domain, info in science_backbone.items():\n", " subdisciplines = info.get('subdisciplines', [])\n", " keywords = info.get('keywords', [])\n", " \n", " summary_data.append({\n", " 'Domain': domain,\n", " 'Subdisciplines': len(subdisciplines),\n", " 'Keywords': len(keywords),\n", " 'Description': info.get('description', '')\n", " })\n", " \n", " summary_df = pd.DataFrame(summary_data)\n", " \n", " print(\"\\n\" + \"=\"*80)\n", " print(\"📊 DOMAIN SUMMARY\")\n", " print(\"=\"*80 + \"\\n\")\n", " print(summary_df.to_string(index=False))\n", " print()\n", " \n", " # ==========================================\n", " # 7. DETAILED DOMAIN BREAKDOWN\n", " # ==========================================\n", " print(\"=\"*80)\n", " print(\"📋 DETAILED FRAMEWORK STRUCTURE\")\n", " print(\"=\"*80 + \"\\n\")\n", " \n", " for domain, info in science_backbone.items():\n", " print(f\"{domain}\")\n", " print(f\" Description: {info.get('description', 'N/A')}\")\n", " print(f\" Keywords: {len(info.get('keywords', []))} terms\")\n", " print(f\" Subdisciplines ({len(info.get('subdisciplines', []))}):\")\n", " for subdiscipline in info.get('subdisciplines', []):\n", " print(f\" • {subdiscipline}\")\n", " print()\n", " \n", " # ==========================================\n", " # 8. 
SUMMARY\n", " # ==========================================\n", " print(\"=\"*80)\n", " print(\"✅ FRAMEWORK VISUALIZATION COMPLETE\")\n", " print(\"=\"*80 + \"\\n\")\n", " \n", " print(\"Visualizations created:\")\n", " print(\" • Network diagram (domains & subdisciplines)\")\n", " print(\" • Sunburst diagram (hierarchical view)\")\n", " print()\n", " \n", " print(\"Framework statistics:\")\n", " print(f\" • Domains: {len(science_backbone)}\")\n", " print(f\" • Total subdisciplines: {total_subdisciplines}\")\n", " print(f\" • Avg subdisciplines per domain: {total_subdisciplines/len(science_backbone):.1f}\")\n", " print(f\" • Total keywords: {sum(len(info.get('keywords', [])) for info in science_backbone.values())}\")\n", " print()\n", " \n", " print(\"Variables available:\")\n", " print(\" • G (NetworkX graph) - Framework network structure\")\n", " print(\" • fig (Plotly figure) - Network visualization\")\n", " print(\" • fig_sunburst (Plotly figure) - Sunburst diagram\")\n", " print(\" • summary_df (DataFrame) - Domain summary table\")\n", " print()\n", " \n", " print(\"💡 This framework provides:\")\n", " print(\" • Structured science domains for semantic mapping\")\n", " print(\" • Subdisciplines for detailed categorization\")\n", " print(\" • Keywords for automated topic matching\")\n", " print(\" • Visual representation of framework scope\")\n", "\n", "print(\"\\n\" + \"=\"*80)" ] }, { "cell_type": "code", "execution_count": 26, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "================================================================================\n", "🔬 EXTRACTING SCIENTIFIC VARIABLE OBJECTS (SVOs)\n", "================================================================================\n", "\n", "Step 1: Checking requirements...\n", "--------------------------------------------------------------------------------\n", "✓ topic_mappings found: 25 topics\n", "✓ documents found: 9 interviews\n", "✓ science_backbone found: 8 
domains\n", "\n", "Step 2: Defining SVO extraction patterns...\n", "--------------------------------------------------------------------------------\n", "✓ Defined 117 SVO patterns across 8 domains\n", "\n", "Step 3: Extracting SVOs from interview transcripts...\n", "--------------------------------------------------------------------------------\n", "\n", "✓ Extracted 824 SVO mentions\n", "\n", "SVOs per domain:\n", " • Climate Science: 83 mentions (8 unique SVOs)\n", " • Economics & Resources: 187 mentions (11 unique SVOs)\n", " • Environmental Health: 114 mentions (8 unique SVOs)\n", " • Governance & Policy: 69 mentions (5 unique SVOs)\n", " • Hydrological Science: 14 mentions (5 unique SVOs)\n", " • Infrastructure Engineering: 271 mentions (4 unique SVOs)\n", " • Social Systems: 13 mentions (5 unique SVOs)\n", " • Technical Operations: 73 mentions (4 unique SVOs)\n", "\n", "================================================================================\n", "📊 SVO SUMMARY BY DOMAIN\n", "================================================================================\n", "\n", "Climate Science:\n", " • freeze: 32 mentions\n", " • seasonal: 18 mentions\n", " • climate change: 15 mentions\n", " • temperature: 8 mentions\n", " • thaw: 7 mentions\n", "\n", "Economics & Resources:\n", " • rate: 67 mentions\n", " • cost: 45 mentions\n", " • funding: 34 mentions\n", " • price: 10 mentions\n", " • revenue: 8 mentions\n", "\n", "Environmental Health:\n", " • ph: 58 mentions\n", " • water quality: 37 mentions\n", " • disinfection: 6 mentions\n", " • turbidity: 5 mentions\n", " • violation: 4 mentions\n", "\n", "Governance & Policy:\n", " • standard: 38 mentions\n", " • regulation: 13 mentions\n", " • requirement: 11 mentions\n", " • permit: 5 mentions\n", " • enforcement: 2 mentions\n", "\n", "Hydrological Science:\n", " • depth: 4 mentions\n", " • height: 4 mentions\n", " • discharge: 4 mentions\n", " • water level: 1 mentions\n", " • volume: 1 mentions\n", "\n", 
"Infrastructure Engineering:\n", " • age: 233 mentions\n", " • capacity: 16 mentions\n", " • condition: 15 mentions\n", " • pressure: 7 mentions\n", "\n", "Social Systems:\n", " • household: 8 mentions\n", " • residents: 2 mentions\n", " • employment: 1 mentions\n", " • population: 1 mentions\n", " • users: 1 mentions\n", "\n", "Technical Operations:\n", " • hours: 48 mentions\n", " • staff: 20 mentions\n", " • downtime: 4 mentions\n", " • certification level: 1 mentions\n", "\n", "================================================================================\n", "🔗 LINKING SVOs TO TOPICS\n", "================================================================================\n", "\n", "Topics with SVOs:\n", "Topic 0 (Infrastructure Engineering): 4 SVOs\n", "Topic 1 (Social Systems): 5 SVOs\n", "Topic 2 (Infrastructure Engineering): 4 SVOs\n", "Topic 3 (Hydrological Science): 5 SVOs\n", "Topic 4 (Environmental Health): 8 SVOs\n", "Topic 5 (Environmental Health): 8 SVOs\n", "Topic 6 (Climate Science): 8 SVOs\n", "Topic 7 (Governance & Policy): 5 SVOs\n", "Topic 8 (Hydrological Science): 5 SVOs\n", "Topic 9 (Hydrological Science): 5 SVOs\n", "Topic 10 (Climate Science): 8 SVOs\n", "Topic 11 (Environmental Health): 8 SVOs\n", "Topic 12 (Social Systems): 5 SVOs\n", "Topic 13 (Climate Science): 8 SVOs\n", "Topic 14 (Technical Operations): 4 SVOs\n", "Topic 15 (Infrastructure Engineering): 4 SVOs\n", "Topic 16 (Hydrological Science): 5 SVOs\n", "Topic 17 (Infrastructure Engineering): 4 SVOs\n", "Topic 18 (Hydrological Science): 5 SVOs\n", "Topic 19 (Climate Science): 8 SVOs\n", "Topic 20 (Hydrological Science): 5 SVOs\n", "Topic 21 (Governance & Policy): 5 SVOs\n", "Topic 22 (Governance & Policy): 5 SVOs\n", "Topic 23 (Hydrological Science): 5 SVOs\n", "Topic 24 (Economics & Resources): 11 SVOs\n", "\n", "================================================================================\n", "✅ SVO EXTRACTION COMPLETE\n", 
"================================================================================\n", "\n", "Variables created:\n", " • svo_extractions (list) - All SVO mentions with context ✓\n", " • svos_by_domain (dict) - SVOs grouped by domain ✓\n", " • svo_df (DataFrame) - For analysis/export ✓\n", " • topic_mappings (updated) - Now includes SVO links ✓\n", "\n", "Total: 824 SVO mentions\n", "Unique SVOs: 50\n", "Documents with SVOs: 9\n", "\n", "Verification:\n", " • type(svo_extractions): \n", " • len(svo_extractions): 824\n", " • svo_extractions in globals(): True\n", "\n", "💡 Next steps:\n", " → Cell 22: Link SVOs to decision components\n", " → Cell 23: Build science backbone network\n", " → Cell 24: Create network visualization\n", "\n", "================================================================================\n" ] } ], "source": [ "# CELL 21: Extract Scientific Variable Objects (SVOs) - FIXED VERSION\n", "\n", "print(\"=\"*80)\n", "print(\"🔬 EXTRACTING SCIENTIFIC VARIABLE OBJECTS (SVOs)\")\n", "print(\"=\"*80 + \"\\n\")\n", "\n", "import re\n", "from collections import defaultdict\n", "\n", "# ==========================================\n", "# 1. 
CHECK REQUIREMENTS\n", "# ==========================================\n", "print(\"Step 1: Checking requirements...\")\n", "print(\"-\"*80)\n", "\n", "# Check for topic_mappings\n", "if 'topic_mappings' not in globals():\n", " print(\"❌ topic_mappings not found!\")\n", " print(\" Run Cell 20 first to create topic-domain mappings\")\n", " has_mappings = False\n", "elif not topic_mappings or len(topic_mappings) == 0:\n", " print(\"❌ topic_mappings is empty!\")\n", " print(\" Cell 20 may have encountered errors\")\n", " has_mappings = False\n", "else:\n", " print(f\"✓ topic_mappings found: {len(topic_mappings)} topics\")\n", " has_mappings = True\n", "\n", "# Check for documents\n", "if 'documents' not in globals():\n", " print(\"❌ documents not found!\")\n", " print(\" Run data loading cells first\")\n", " has_documents = False\n", "else:\n", " print(f\"✓ documents found: {len(documents)} interviews\")\n", " has_documents = True\n", "\n", "# Check for science_backbone\n", "if 'science_backbone' not in globals():\n", " print(\"❌ science_backbone not found!\")\n", " print(\" Run Cell 20 first\")\n", " has_backbone = False\n", "else:\n", " print(f\"✓ science_backbone found: {len(science_backbone)} domains\")\n", " has_backbone = False if not science_backbone else True\n", "\n", "print()\n", "\n", "# ==========================================\n", "# 2. 
DEFINE SVO PATTERNS\n", "# ==========================================\n", "print(\"Step 2: Defining SVO extraction patterns...\")\n", "print(\"-\"*80)\n", "\n", "# Patterns to identify measurable variables in text\n", "svo_patterns = {\n", " 'Hydrological Science': [\n", " 'water level', 'water levels', 'flood level', 'flood stage',\n", " 'river level', 'discharge', 'flow rate', 'precipitation',\n", " 'rainfall', 'snowfall', 'depth', 'height', 'volume',\n", " 'frequency', 'duration', 'peak flow', 'base flow',\n", " 'runoff', 'drainage', 'watershed area'\n", " ],\n", " 'Climate Science': [\n", " 'temperature', 'thaw depth', 'permafrost depth', 'active layer',\n", " 'freeze', 'thaw', 'seasonal', 'warming', 'cooling',\n", " 'snow depth', 'ice thickness', 'degree days', 'frost depth',\n", " 'subsidence', 'thaw settlement', 'climate change'\n", " ],\n", " 'Infrastructure Engineering': [\n", " 'pipe diameter', 'capacity', 'pressure', 'age', 'condition',\n", " 'maintenance frequency', 'repair cost', 'system size',\n", " 'storage capacity', 'treatment capacity', 'flow rate',\n", " 'leakage', 'failure rate', 'service life', 'pipe length',\n", " 'pump capacity', 'tank volume', 'utilidor length'\n", " ],\n", " 'Environmental Health': [\n", " 'water quality', 'contamination level', 'bacteria count',\n", " 'chlorine level', 'ph', 'turbidity', 'test results',\n", " 'compliance rate', 'violation', 'illness rate', 'coliform',\n", " 'disinfection', 'treatment efficiency', 'pathogen'\n", " ],\n", " 'Social Systems': [\n", " 'population', 'household', 'residents', 'users',\n", " 'access rate', 'service coverage', 'participation',\n", " 'education level', 'training hours', 'awareness',\n", " 'community satisfaction', 'household size', 'employment'\n", " ],\n", " 'Economics & Resources': [\n", " 'cost', 'budget', 'funding', 'price', 'expense',\n", " 'grant amount', 'revenue', 'subsidy', 'rate',\n", " 'investment', 'value', 'affordability', 'operating cost',\n", " 'capital cost', 
'lifecycle cost'\n", " ],\n", " 'Technical Operations': [\n", " 'operator count', 'staff', 'hours', 'frequency',\n", " 'maintenance schedule', 'response time', 'downtime',\n", " 'certification level', 'experience years', 'training time',\n", " 'inspection frequency', 'repair time'\n", " ],\n", " 'Governance & Policy': [\n", " 'compliance rate', 'permit', 'regulation', 'standard',\n", " 'requirement', 'violation count', 'inspection frequency',\n", " 'enforcement', 'policy implementation'\n", " ]\n", "}\n", "\n", "total_patterns = sum(len(patterns) for patterns in svo_patterns.values())\n", "print(f\"✓ Defined {total_patterns} SVO patterns across {len(svo_patterns)} domains\")\n", "print()\n", "\n", "# ==========================================\n", "# 3. EXTRACT SVOs FROM TEXT\n", "# ==========================================\n", "if has_documents and has_backbone:\n", " print(\"Step 3: Extracting SVOs from interview transcripts...\")\n", " print(\"-\"*80 + \"\\n\")\n", " \n", " svo_extractions = []\n", " \n", " for doc_name, doc_text in documents.items():\n", " text_lower = doc_text.lower()\n", " \n", " # Check each domain's patterns\n", " for domain, patterns in svo_patterns.items():\n", " # Only process domains that exist in science_backbone\n", " if domain not in science_backbone:\n", " continue\n", " \n", " for pattern in patterns:\n", " # Match whole words only, so short patterns such as 'age' or 'ph' do not hit 'storage' or 'phone'\n", " pattern_re = re.compile(r'\\b' + re.escape(pattern) + r'\\b')\n", " # Find all occurrences\n", " if pattern_re.search(text_lower):\n", " # Extract context (sentence containing the pattern)\n", " sentences = re.split(r'[.!?]+', doc_text)\n", " for sentence in sentences:\n", " if pattern_re.search(sentence.lower()):\n", " svo_extractions.append({\n", " 'document': doc_name,\n", " 'domain': domain,\n", " 'svo': pattern,\n", " 'context': sentence.strip()[:200] # First 200 chars\n", " })\n", " \n", " print(f\"✓ Extracted {len(svo_extractions)} SVO mentions\")\n", " \n", " # Group by domain\n", " svos_by_domain = defaultdict(lambda: defaultdict(int))\n", " for extraction in svo_extractions:\n", "
svos_by_domain[extraction['domain']][extraction['svo']] += 1\n", " \n", " print()\n", " print(\"SVOs per domain:\")\n", " for domain in sorted(svos_by_domain.keys()):\n", " count = sum(svos_by_domain[domain].values())\n", " unique = len(svos_by_domain[domain])\n", " print(f\" • {domain}: {count} mentions ({unique} unique SVOs)\")\n", " \n", " print()\n", " \n", " # ==========================================\n", " # 4. CREATE SVO SUMMARY\n", " # ==========================================\n", " print(\"=\"*80)\n", " print(\"📊 SVO SUMMARY BY DOMAIN\")\n", " print(\"=\"*80 + \"\\n\")\n", " \n", " for domain in sorted(svos_by_domain.keys()):\n", " print(f\"{domain}:\")\n", " top_svos = sorted(svos_by_domain[domain].items(), \n", " key=lambda x: -x[1])[:5]\n", " for svo, count in top_svos:\n", " print(f\" • {svo}: {count} mentions\")\n", " print()\n", " \n", " # ==========================================\n", " # 5. LINK SVOs TO TOPICS (if available)\n", " # ==========================================\n", " if has_mappings:\n", " print(\"=\"*80)\n", " print(\"🔗 LINKING SVOs TO TOPICS\")\n", " print(\"=\"*80 + \"\\n\")\n", " \n", " # Add SVO info to topic mappings\n", " for topic in topic_mappings:\n", " domain = topic['primary_domain']\n", " if domain in svos_by_domain:\n", " topic['svos'] = list(svos_by_domain[domain].keys())\n", " topic['svo_count'] = len(svos_by_domain[domain])\n", " else:\n", " topic['svos'] = []\n", " topic['svo_count'] = 0\n", " \n", " print(\"Topics with SVOs:\")\n", " for topic in topic_mappings:\n", " print(f\"Topic {topic['topic_id']} ({topic['primary_domain']}): {topic['svo_count']} SVOs\")\n", " \n", " print()\n", " else:\n", " print(\"⚠️ Skipping topic linkage (topic_mappings not available)\")\n", " print()\n", " \n", " # ==========================================\n", " # 6. 
CREATE SVO DATAFRAME\n", " # ==========================================\n", " import pandas as pd\n", " svo_df = pd.DataFrame(svo_extractions)\n", " \n", " print(\"=\"*80)\n", " print(\"✅ SVO EXTRACTION COMPLETE\")\n", " print(\"=\"*80 + \"\\n\")\n", " \n", " print(\"Variables created:\")\n", " print(\" • svo_extractions (list) - All SVO mentions with context ✓\")\n", " print(\" • svos_by_domain (dict) - SVOs grouped by domain ✓\")\n", " print(\" • svo_df (DataFrame) - For analysis/export ✓\")\n", " if has_mappings:\n", " print(\" • topic_mappings (updated) - Now includes SVO links ✓\")\n", " print()\n", " \n", " print(f\"Total: {len(svo_extractions)} SVO mentions\")\n", " print(f\"Unique SVOs: {len(set(e['svo'] for e in svo_extractions))}\")\n", " print(f\"Documents with SVOs: {len(set(e['document'] for e in svo_extractions))}\")\n", " print()\n", " \n", " # Verification\n", " print(\"Verification:\")\n", " print(f\" • type(svo_extractions): {type(svo_extractions)}\")\n", " print(f\" • len(svo_extractions): {len(svo_extractions)}\")\n", " print(f\" • svo_extractions in globals(): {('svo_extractions' in globals())}\")\n", " print()\n", " \n", " print(\"💡 Next steps:\")\n", " print(\" → Cell 22: Link SVOs to decision components\")\n", " print(\" → Cell 23: Build science backbone network\")\n", " print(\" → Cell 24: Create network visualization\")\n", "\n", "else:\n", " print(\"\\n⚠️ Cannot extract SVOs - missing requirements\")\n", " print()\n", " \n", " if not has_documents:\n", " print(\"Missing: documents\")\n", " print(\" → Run data loading cells (Cells 4-5)\")\n", " \n", " if not has_backbone:\n", " print(\"Missing: science_backbone\")\n", " print(\" → Run Cell 20 to define framework\")\n", " \n", " print()\n", " \n", " # Create empty placeholders\n", " svo_extractions = []\n", " svos_by_domain = {}\n", " print(\"Created empty placeholders to prevent errors in downstream cells\")\n", "\n", "print(\"\\n\" + \"=\"*80)" ] }, { "cell_type": "code", 
"execution_count": 27, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "================================================================================\n", "🎯 DECISION PROBLEM COMPONENTS & OPTIMIZATION SETUP\n", "================================================================================\n", "\n", "Step 1: Checking requirements...\n", "--------------------------------------------------------------------------------\n", "✓ documents: 9 interviews\n", "✓ svo_extractions: 824 items\n", "✓ topic_mappings: 25 items\n", "\n", "Step 2: Defining decision problem component patterns...\n", "--------------------------------------------------------------------------------\n", "✓ Defined 7 decision component types\n", "\n", "Step 3: Extracting decision components from interviews...\n", "--------------------------------------------------------------------------------\n", "✓ Extracted components from 9 interviews\n", "\n", "Step 4: Building stakeholder-specific optimization formulations...\n", "--------------------------------------------------------------------------------\n", "✓ Created 9 optimization formulations\n", "\n", "Step 5: Creating decision components summary table...\n", "--------------------------------------------------------------------------------\n", "✓ Summary table created\n", "\n", "================================================================================\n", "🎛️ INTERACTIVE DECISION PROBLEM EXPLORER\n", "================================================================================\n", "\n", "Select an interview to explore their decision problem:\n", "\n" ] }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "61a7c8320f8e4198808ad624e820df83", "version_major": 2, "version_minor": 0 }, "text/plain": [ "Dropdown(description='Interview:', layout=Layout(width='400px'), options=('[Summary - All Interviews]', '1_1_I…" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { 
"application/vnd.jupyter.widget-view+json": { "model_id": "dc6c93f7ed9a48ff99c9b40b2168e88a", "version_major": 2, "version_minor": 0 }, "text/plain": [ "Output()" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "7599a3951d144539a41c5334a702fa55", "version_major": 2, "version_minor": 0 }, "text/plain": [ "Output()" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "\n", "⏳ Loading summary view...\n", "\n", "================================================================================\n", "✅ DECISION PROBLEM EXPLORER READY\n", "================================================================================\n", "\n", "Features:\n", " • Select '[Summary]' for comparative analysis\n", " • Select specific interview for detailed formulation\n", " • Optimization formulation shows customized objective function\n", " • Complexity score reflects problem difficulty\n", "\n", "================================================================================\n", "💾 SAVING RESULTS\n", "================================================================================\n", "\n", "✓ Saved: Decision_Components_Summary.csv\n", "✓ Saved: Decision_Components_Detailed.csv\n", "✓ Saved: Optimization_Formulations.csv\n", "\n", "📊 Files saved to: publication_outputs/tables/\n", "\n", "✓ Variables created:\n", " • decision_components: dict of extracted components\n", " • optimization_formulations: dict of optimization setups\n", " • decision_summary_df: summary DataFrame\n", "\n", "================================================================================\n" ] } ], "source": [ "# CELL 21A: Decision Problem Components & Stakeholder-Specific Optimization\n", "\n", "print(\"=\"*80)\n", "print(\"🎯 DECISION PROBLEM COMPONENTS & OPTIMIZATION SETUP\")\n", "print(\"=\"*80 + \"\\n\")\n", "\n", "import pandas as pd\n", "import numpy as np\n", "from collections 
import defaultdict\n", "import re\n", "import ipywidgets as widgets\n", "from IPython.display import display, clear_output, HTML\n", "import plotly.graph_objects as go\n", "import plotly.express as px\n", "\n", "# ==========================================\n", "# 1. CHECK REQUIREMENTS\n", "# ==========================================\n", "print(\"Step 1: Checking requirements...\")\n", "print(\"-\"*80)\n", "\n", "required_vars = {\n", " 'documents': 'Interview texts',\n", " 'svo_extractions': 'Scientific variables',\n", " 'topic_mappings': 'Topic-domain mappings'\n", "}\n", "\n", "has_data = True\n", "for var_name, description in required_vars.items():\n", " if var_name not in globals() or not globals()[var_name]:\n", " print(f\"❌ {var_name} not found! ({description})\")\n", " has_data = False\n", " else:\n", " if var_name == 'documents':\n", " print(f\"✓ {var_name}: {len(globals()[var_name])} interviews\")\n", " else:\n", " print(f\"✓ {var_name}: {len(globals()[var_name])} items\")\n", "\n", "print()\n", "\n", "# ==========================================\n", "# 2. 
DEFINE DECISION COMPONENT PATTERNS\n", "# ==========================================\n", "if has_data:\n", " print(\"Step 2: Defining decision problem component patterns...\")\n", " print(\"-\"*80)\n", " \n", " # Patterns for each decision component type\n", " component_patterns = {\n", " 'Objectives': {\n", " 'description': 'Goals or desired outcomes to achieve or optimize',\n", " 'patterns': [\n", " r'\\b(?:goal|objective|aim|target|purpose|mission)\\s+(?:is|was|to|of)\\s+([^.!?]+)',\n", " r'\\b(?:want|need|trying|seeking|hoping)\\s+to\\s+([^.!?]+)',\n", " r'\\b(?:improve|increase|enhance|maximize|optimize|ensure)\\s+([^.!?]+)',\n", " r'\\b(?:provide|deliver|maintain|achieve|accomplish)\\s+([^.!?]+)',\n", " r'\\b(?:safe|clean|reliable|adequate|quality)\\s+(\\w+)',\n", " r'community\\s+(\\w+)',\n", " r'customer\\s+(\\w+)',\n", " r'public\\s+health',\n", " r'water\\s+quality',\n", " r'service\\s+reliability'\n", " ],\n", " 'examples': ['safe water', 'community education', 'service reliability', 'public health']\n", " },\n", " \n", " 'Constraints': {\n", " 'description': 'Limitations or boundaries that restrict available actions',\n", " 'patterns': [\n", " r'\\b(?:limited|lack|shortage|insufficient|not enough)\\s+([^.!?]+)',\n", " r'\\b(?:cannot|can\\'t|unable to|difficult to)\\s+([^.!?]+)',\n", " r'\\b(?:constraint|limitation|restriction|barrier)\\s+(?:on|to|of)\\s+([^.!?]+)',\n", " r'\\b(?:budget|funding|money|cost)\\s+(?:constraint|limitation|issue|problem)',\n", " r'\\b(?:workforce|staff|personnel)\\s+(?:shortage|limited|lack)',\n", " r'\\b(?:remote|isolated|rural)\\s+location',\n", " r'harsh\\s+(?:climate|weather|conditions)',\n", " r'permafrost\\s+(?:thaw|melt|issue)',\n", " r'aging\\s+infrastructure',\n", " r'limited\\s+capacity'\n", " ],\n", " 'examples': ['limited funding', 'workforce shortage', 'remote location', 'aging infrastructure']\n", " },\n", " \n", " 'Trade-Offs': {\n", " 'description': 'Situations where improving one objective compromises 
another',\n", " 'patterns': [\n", " r'\\b(?:trade-?off|compromise|balance|versus|vs\\.?)\\s+between\\s+([^.!?]+)\\s+and\\s+([^.!?]+)',\n", " r'(?:if|when)\\s+we\\s+([^,]+),\\s+(?:then|we)\\s+(?:can\\'t|cannot|lose|sacrifice)\\s+([^.!?]+)',\n", " r'(?:more|increasing|improving)\\s+([^.!?]+)\\s+(?:means|requires)\\s+(?:less|reducing|sacrificing)\\s+([^.!?]+)',\n", " r'(?:either|must choose between)\\s+([^.!?]+)\\s+or\\s+([^.!?]+)',\n", " r'cost\\s+vs\\.?\\s+quality',\n", " r'speed\\s+vs\\.?\\s+accuracy',\n", " r'coverage\\s+vs\\.?\\s+service level'\n", " ],\n", " 'examples': ['operator satisfaction vs customer coverage', 'cost vs quality', 'speed vs accuracy']\n", " },\n", " \n", " 'Decision Variables': {\n", " 'description': 'Controllable factors that can be adjusted to influence system performance',\n", " 'patterns': [\n", " r'\\b(?:can\\s+adjust|can\\s+change|can\\s+modify|can\\s+control)\\s+([^.!?]+)',\n", " r'\\b(?:operator|staff|we)\\s+(?:adjust|change|modify|set|control)\\s+(?:the\\s+)?([^.!?]+)',\n", " r'\\b(?:adjust|change|modify|set|control)\\s+(?:the\\s+)?([^.!?]+)',\n", " r'work\\s+(?:hours|schedule|shifts)',\n", " r'treatment\\s+(?:process|level|dosage)',\n", " r'maintenance\\s+frequency',\n", " r'staffing\\s+levels',\n", " r'operating\\s+(?:parameters|conditions)'\n", " ],\n", " 'examples': ['operator work hours', 'treatment dosage', 'maintenance frequency', 'staffing levels']\n", " },\n", " \n", " 'Options': {\n", " 'description': 'Discrete alternative courses of action available at decision points',\n", " 'patterns': [\n", " r'\\b(?:option|alternative|choice|either)\\s+(?:is|to|between)\\s+([^.!?]+)',\n", " r'\\b(?:could|can|might)\\s+(?:either|choose to)\\s+([^.!?]+)\\s+or\\s+([^.!?]+)',\n", " r'residents\\s+(?:choose|use|rely on)\\s+([^.!?]+)',\n", " r'(?:use|switch to|choose between)\\s+([^.!?]+)\\s+(?:or|versus)\\s+([^.!?]+)',\n", " r'treated\\s+(?:water|source)',\n", " r'natural\\s+(?:water|source)',\n", " r'trucked\\s+water',\n", " 
r'hauled\\s+water'\n", " ],\n", " 'examples': ['treated vs natural water', 'truck vs pipe delivery', 'centralized vs distributed']\n", " },\n", " \n", " 'Solutions': {\n", " 'description': 'Implemented strategies, workarounds, or approaches that address challenges',\n", " 'patterns': [\n", " r'\\b(?:solution|workaround|approach|strategy|method)\\s+(?:is|was|to)\\s+([^.!?]+)',\n", " r'\\b(?:we|they|operators)\\s+(?:implemented|use|employ|developed)\\s+([^.!?]+)',\n", " r'\\b(?:program|system|initiative)\\s+(?:for|to)\\s+([^.!?]+)',\n", " r'remote\\s+(?:worker|training|monitoring)\\s+program',\n", " r'managerial\\s+capacity',\n", " r'cross-?training',\n", " r'backup\\s+(?:system|plan)',\n", " r'emergency\\s+(?:protocol|procedure)',\n", " r'partnership\\s+with',\n", " r'automated\\s+(?:monitoring|control)'\n", " ],\n", " 'examples': ['remote worker program', 'managerial capacity', 'cross-training', 'backup system']\n", " },\n", " \n", " 'State Variables': {\n", " 'description': 'Fixed or external factors that define context but are not controllable',\n", " 'patterns': [\n", " r'\\b(?:given|due to|because of)\\s+(?:the\\s+)?([^.!?]+)',\n", " r'\\b(?:external|environmental|contextual)\\s+(?:factor|condition|constraint)\\s+([^.!?]+)',\n", " r'permafrost',\n", " r'seasonal\\s+(?:fluctuation|variation|change)',\n", " r'climate\\s+(?:condition|pattern)',\n", " r'geographic\\s+location',\n", " r'population\\s+(?:size|density)',\n", " r'regulatory\\s+requirement',\n", " r'weather\\s+(?:pattern|condition)',\n", " r'distance\\s+from'\n", " ],\n", " 'examples': ['permafrost', 'seasonal fluctuations', 'remote location', 'population size']\n", " }\n", " }\n", " \n", " print(f\"✓ Defined {len(component_patterns)} decision component types\")\n", " print()\n", " \n", " # ==========================================\n", " # 3. 
EXTRACT COMPONENTS FROM INTERVIEWS\n", "    # ==========================================\n", "    print(\"Step 3: Extracting decision components from interviews...\")\n", "    print(\"-\"*80)\n", "    \n", "    decision_components = {}\n", "    \n", "    for doc_name, doc_text in documents.items():\n", "        components = {comp_type: set() for comp_type in component_patterns}\n", "        \n", "        # Lowercase once so patterns and example phrases match case-insensitively\n", "        text_lower = doc_text.lower()\n", "        \n", "        # Extract each component type\n", "        for comp_type, comp_info in component_patterns.items():\n", "            for pattern in comp_info['patterns']:\n", "                for match in re.finditer(pattern, text_lower, re.IGNORECASE):\n", "                    # Use the first capture group when the pattern has one;\n", "                    # otherwise keep the whole match (e.g. r'public\\s+health')\n", "                    extracted = match.group(1) if match.lastindex else match.group(0)\n", "                    # Collapse whitespace and cap the length\n", "                    extracted = re.sub(r'\\s+', ' ', extracted.strip())[:100]\n", "                    if len(extracted) > 3:  # Skip fragments too short to be meaningful\n", "                        components[comp_type].add(extracted)\n", "            \n", "            # Also record any example phrase that appears verbatim\n", "            for example in comp_info['examples']:\n", "                if example.lower() in text_lower:\n", "                    components[comp_type].add(example)\n", "        \n", "        decision_components[doc_name] = components\n", "    \n", "    print(f\"✓ Extracted components from {len(decision_components)} interviews\")\n", "    print()\n", "    \n", "    # ==========================================\n", "    # 4. 
BUILD OPTIMIZATION FORMULATIONS\n", "    # ==========================================\n", "    print(\"Step 4: Building stakeholder-specific optimization formulations...\")\n", "    print(\"-\"*80)\n", "    \n", "    optimization_formulations = {}\n", "    \n", "    for doc_name in documents.keys():\n", "        comps = decision_components[doc_name]\n", "        \n", "        # Count components\n", "        n_objectives = len(comps['Objectives'])\n", "        n_constraints = len(comps['Constraints'])\n", "        n_variables = len(comps['Decision Variables'])\n", "        \n", "        # Build objective function (sets are unordered, so sort for reproducible output)\n", "        obj_func = []\n", "        if n_objectives > 0:\n", "            for i, obj in enumerate(sorted(comps['Objectives'])[:5], 1):\n", "                obj_func.append(f\"w{i} * {obj[:30]}\")\n", "        else:\n", "            obj_func.append(\"w1 * utility\")\n", "        \n", "        # Build constraint set\n", "        constraint_set = []\n", "        if n_constraints > 0:\n", "            for i, const in enumerate(sorted(comps['Constraints'])[:5], 1):\n", "                constraint_set.append(f\"g{i}(x) ≤ b{i} // {const[:40]}\")\n", "        else:\n", "            constraint_set.append(\"g1(x) ≤ b1 // resource constraint\")\n", "        \n", "        # Build decision vector\n", "        dec_vars = []\n", "        if n_variables > 0:\n", "            for i, var in enumerate(sorted(comps['Decision Variables'])[:5], 1):\n", "                dec_vars.append(f\"x{i}: {var[:30]}\")\n", "        else:\n", "            dec_vars.append(\"x1: decision variable\")\n", "        \n", "        # Calculate complexity score (a weighted count of extracted components)\n", "        complexity = (\n", "            n_objectives * 2 +  # Objectives worth 2 points each\n", "            n_constraints * 1.5 +  # Constraints worth 1.5 points\n", "            n_variables * 1 +  # Variables worth 1 point\n", "            len(comps['Trade-Offs']) * 2  # Trade-offs add complexity\n", "        )\n", "        \n", "        # Categorize problem type\n", "        if n_objectives <= 1:\n", "            problem_type = \"Single-Objective\"\n", "        elif n_objectives <= 3:\n", "            problem_type = \"Multi-Objective\"\n", "        else:\n", "            problem_type = \"Many-Objective\"\n", "        \n", "        if len(comps['Trade-Offs']) > 0:\n", "            problem_type += \" with Trade-offs\"\n", "        \n", "        optimization_formulations[doc_name] = {\n", "            
'objective_function': \" + \".join(obj_func[:3]) if obj_func else \"w1 * utility\",\n", " 'constraints': constraint_set[:3],\n", " 'decision_vector': dec_vars[:3],\n", " 'n_objectives': n_objectives,\n", " 'n_constraints': n_constraints,\n", " 'n_variables': n_variables,\n", " 'n_tradeoffs': len(comps['Trade-Offs']),\n", " 'n_options': len(comps['Options']),\n", " 'n_solutions': len(comps['Solutions']),\n", " 'n_state_vars': len(comps['State Variables']),\n", " 'complexity_score': complexity,\n", " 'problem_type': problem_type\n", " }\n", " \n", " print(f\"✓ Created {len(optimization_formulations)} optimization formulations\")\n", " print()\n", " \n", " # ==========================================\n", " # 5. CREATE SUMMARY TABLE\n", " # ==========================================\n", " print(\"Step 5: Creating decision components summary table...\")\n", " print(\"-\"*80)\n", " \n", " # Build DataFrame\n", " summary_data = []\n", " for doc_name in sorted(documents.keys()):\n", " comps = decision_components[doc_name]\n", " opt = optimization_formulations[doc_name]\n", " \n", " summary_data.append({\n", " 'Interview': doc_name,\n", " 'Objectives': opt['n_objectives'],\n", " 'Constraints': opt['n_constraints'],\n", " 'Trade-Offs': opt['n_tradeoffs'],\n", " 'Decision Variables': opt['n_variables'],\n", " 'Options': opt['n_options'],\n", " 'Solutions': opt['n_solutions'],\n", " 'State Variables': opt['n_state_vars'],\n", " 'Problem Type': opt['problem_type'],\n", " 'Complexity Score': round(opt['complexity_score'], 1)\n", " })\n", " \n", " summary_df = pd.DataFrame(summary_data)\n", " \n", " print(\"✓ Summary table created\")\n", " print()\n", " \n", " # ==========================================\n", " # 6. 
CREATE INTERACTIVE SELECTOR\n", " # ==========================================\n", " print(\"=\"*80)\n", " print(\"🎛️ INTERACTIVE DECISION PROBLEM EXPLORER\")\n", " print(\"=\"*80 + \"\\n\")\n", " \n", " # Output widgets\n", " detail_output = widgets.Output()\n", " viz_output = widgets.Output()\n", " \n", " # Dropdown\n", " interview_selector = widgets.Dropdown(\n", " options=['[Summary - All Interviews]'] + sorted(documents.keys()),\n", " value='[Summary - All Interviews]',\n", " description='Interview:',\n", " style={'description_width': '80px'},\n", " layout=widgets.Layout(width='400px')\n", " )\n", " \n", " # Update function\n", " def update_display(change):\n", " selected = interview_selector.value\n", " \n", " with detail_output:\n", " clear_output(wait=True)\n", " \n", " if selected == '[Summary - All Interviews]':\n", " # Show summary table\n", " display(HTML(\"
<h3>Decision Problem Components - All Interviews</h3>
\"))\n", " display(summary_df.style.background_gradient(\n", " subset=['Complexity Score'], \n", " cmap='YlOrRd'\n", " ).format({'Complexity Score': '{:.1f}'}))\n", " \n", " print()\n", " print(\"📊 GLOBAL STATISTICS\")\n", " print(\"-\"*60)\n", " print(f\"Total interviews analyzed: {len(documents)}\")\n", " print(f\"Average objectives per interview: {summary_df['Objectives'].mean():.1f}\")\n", " print(f\"Average constraints per interview: {summary_df['Constraints'].mean():.1f}\")\n", " print(f\"Average complexity score: {summary_df['Complexity Score'].mean():.1f}\")\n", " print()\n", " \n", " print(\"Problem type distribution:\")\n", " for ptype, count in summary_df['Problem Type'].value_counts().items():\n", " print(f\" • {ptype}: {count} interviews\")\n", " \n", " else:\n", " # Show individual interview details\n", " display(HTML(f\"
<h3>Decision Problem: {selected}</h3>
\"))\n", "                \n", "                comps = decision_components[selected]\n", "                opt = optimization_formulations[selected]\n", "                \n", "                # Optimization formulation\n", "                print(\"=\"*60)\n", "                print(\"📐 OPTIMIZATION FORMULATION\")\n", "                print(\"=\"*60)\n", "                print()\n", "                print(\"Maximize:\")\n", "                print(f\"  Z = {opt['objective_function']}\")\n", "                print()\n", "                print(\"Subject to:\")\n", "                for constraint in opt['constraints']:\n", "                    print(f\"  {constraint}\")\n", "                print()\n", "                print(\"Decision Vector:\")\n", "                for var in opt['decision_vector']:\n", "                    print(f\"  {var}\")\n", "                print()\n", "                print(f\"Problem Type: {opt['problem_type']}\")\n", "                print(f\"Complexity Score: {opt['complexity_score']:.1f}\")\n", "                print()\n", "                \n", "                # Component details\n", "                print(\"=\"*60)\n", "                print(\"🎯 DECISION PROBLEM COMPONENTS\")\n", "                print(\"=\"*60)\n", "                print()\n", "                \n", "                for comp_type, comp_set in comps.items():\n", "                    if comp_set:\n", "                        print(f\"{comp_type} ({len(comp_set)}):\")\n", "                        print(f\"  Description: {component_patterns[comp_type]['description']}\")\n", "                        print(\"  Identified:\")\n", "                        # Sort first, then slice, so the same five items appear on every run\n", "                        for item in sorted(comp_set)[:5]:  # First 5 alphabetically\n", "                            print(f\"    • {item}\")\n", "                        if len(comp_set) > 5:\n", "                            print(f\"    ... and {len(comp_set) - 5} more\")\n", "                        print()\n", "        \n", "        with viz_output:\n", "            clear_output(wait=True)\n", "            \n", "            if selected == '[Summary - All Interviews]':\n", "                # Create comparison visualizations\n", "                \n", "                # 1. 
Component counts by interview\n", " fig1 = go.Figure()\n", " \n", " component_cols = ['Objectives', 'Constraints', 'Trade-Offs', \n", " 'Decision Variables', 'Options', 'Solutions', 'State Variables']\n", " colors = ['#2E86AB', '#F77F00', '#06A77D', '#D62839', \n", " '#845EC2', '#936639', '#F9A26C']\n", " \n", " for i, col in enumerate(component_cols):\n", " fig1.add_trace(go.Bar(\n", " name=col,\n", " x=summary_df['Interview'],\n", " y=summary_df[col],\n", " marker_color=colors[i]\n", " ))\n", " \n", " fig1.update_layout(\n", " title='Decision Problem Components by Interview',\n", " xaxis_title='Interview',\n", " yaxis_title='Count',\n", " barmode='group',\n", " height=400,\n", " legend=dict(orientation='h', yanchor='bottom', y=1.02)\n", " )\n", " \n", " display(fig1)\n", " \n", " # 2. Complexity score comparison\n", " fig2 = go.Figure(go.Bar(\n", " x=summary_df['Interview'],\n", " y=summary_df['Complexity Score'],\n", " marker_color=summary_df['Complexity Score'],\n", " marker_colorscale='YlOrRd',\n", " text=summary_df['Complexity Score'].round(1),\n", " textposition='outside'\n", " ))\n", " \n", " fig2.update_layout(\n", " title='Problem Complexity Score by Interview',\n", " xaxis_title='Interview',\n", " yaxis_title='Complexity Score',\n", " height=350\n", " )\n", " \n", " display(fig2)\n", " \n", " else:\n", " # Create radar chart for single interview\n", " comps = decision_components[selected]\n", " opt = optimization_formulations[selected]\n", " \n", " categories = ['Objectives', 'Constraints', 'Trade-Offs', \n", " 'Decision Variables', 'Options', 'Solutions', 'State Variables']\n", " values = [\n", " opt['n_objectives'],\n", " opt['n_constraints'],\n", " opt['n_tradeoffs'],\n", " opt['n_variables'],\n", " opt['n_options'],\n", " opt['n_solutions'],\n", " opt['n_state_vars']\n", " ]\n", " \n", " fig = go.Figure(go.Scatterpolar(\n", " r=values,\n", " theta=categories,\n", " fill='toself',\n", " name=selected,\n", " line_color='#2E86AB',\n", " 
fillcolor='rgba(46, 134, 171, 0.3)'\n", " ))\n", " \n", " fig.update_layout(\n", " polar=dict(\n", " radialaxis=dict(visible=True, range=[0, max(values) + 2])\n", " ),\n", " title=f'Decision Problem Profile: {selected}',\n", " height=500\n", " )\n", " \n", " display(fig)\n", " \n", " # Connect selector\n", " interview_selector.observe(update_display, names='value')\n", " \n", " # Display interface\n", " print(\"Select an interview to explore their decision problem:\")\n", " print()\n", " display(interview_selector)\n", " display(detail_output)\n", " display(viz_output)\n", " \n", " # Initial display\n", " print(\"\\n⏳ Loading summary view...\")\n", " update_display(None)\n", " \n", " print()\n", " print(\"=\"*80)\n", " print(\"✅ DECISION PROBLEM EXPLORER READY\")\n", " print(\"=\"*80 + \"\\n\")\n", " \n", " print(\"Features:\")\n", " print(\" • Select '[Summary]' for comparative analysis\")\n", " print(\" • Select specific interview for detailed formulation\")\n", " print(\" • Optimization formulation shows customized objective function\")\n", " print(\" • Complexity score reflects problem difficulty\")\n", " print()\n", " \n", " # ==========================================\n", " # 7. 
SAVE RESULTS\n", " # ==========================================\n", " print(\"=\"*80)\n", " print(\"💾 SAVING RESULTS\")\n", " print(\"=\"*80 + \"\\n\")\n", " \n", " from pathlib import Path\n", " output_dir = Path('publication_outputs/tables')\n", " output_dir.mkdir(parents=True, exist_ok=True)\n", " \n", " # Save summary table\n", " summary_df.to_csv(output_dir / 'Decision_Components_Summary.csv', index=False)\n", " print(f\"✓ Saved: Decision_Components_Summary.csv\")\n", " \n", " # Save detailed components\n", " detailed_data = []\n", " for doc_name in sorted(documents.keys()):\n", " comps = decision_components[doc_name]\n", " for comp_type, comp_set in comps.items():\n", " for item in comp_set:\n", " detailed_data.append({\n", " 'Interview': doc_name,\n", " 'Component Type': comp_type,\n", " 'Component': item\n", " })\n", " \n", " detailed_df = pd.DataFrame(detailed_data)\n", " detailed_df.to_csv(output_dir / 'Decision_Components_Detailed.csv', index=False)\n", " print(f\"✓ Saved: Decision_Components_Detailed.csv\")\n", " \n", " # Save optimization formulations\n", " opt_data = []\n", " for doc_name, opt in optimization_formulations.items():\n", " opt_data.append({\n", " 'Interview': doc_name,\n", " 'Objective Function': opt['objective_function'],\n", " 'Num Constraints': opt['n_constraints'],\n", " 'Num Variables': opt['n_variables'],\n", " 'Problem Type': opt['problem_type'],\n", " 'Complexity Score': opt['complexity_score']\n", " })\n", " \n", " opt_df = pd.DataFrame(opt_data)\n", " opt_df.to_csv(output_dir / 'Optimization_Formulations.csv', index=False)\n", " print(f\"✓ Saved: Optimization_Formulations.csv\")\n", " \n", " print()\n", " print(\"📊 Files saved to: publication_outputs/tables/\")\n", " \n", " # Store in global scope\n", " globals()['decision_components'] = decision_components\n", " globals()['optimization_formulations'] = optimization_formulations\n", " globals()['decision_summary_df'] = summary_df\n", " \n", " print()\n", " print(\"✓ Variables 
created:\")\n", " print(\" • decision_components: dict of extracted components\")\n", " print(\" • optimization_formulations: dict of optimization setups\")\n", " print(\" • decision_summary_df: summary DataFrame\")\n", "\n", "else:\n", " print(\"⚠️ Missing required data - run previous cells first\")\n", "\n", "print(\"\\n\" + \"=\"*80)" ] }, { "cell_type": "code", "execution_count": 28, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "================================================================================\n", "📊 DECISION COMPONENTS COMPARATIVE ANALYSIS\n", "================================================================================\n", "\n", "Step 1: Checking requirements...\n", "--------------------------------------------------------------------------------\n", "✓ decision_components: 9 items\n", "✓ optimization_formulations: 9 items\n", "✓ documents: 9 items\n", "\n", "Step 2: Building comprehensive comparison table...\n", "--------------------------------------------------------------------------------\n", "✓ Comparison table created: 8 rows × 15 columns\n", "\n", "Step 3: Building pairwise interview comparison table...\n", "--------------------------------------------------------------------------------\n", "✓ Pairwise comparison table created: 36 pairs\n", "\n", "Step 4: Building detailed component listings...\n", "--------------------------------------------------------------------------------\n", "✓ Created detailed listings for 7 component types\n", "\n", "================================================================================\n", "🎛️ INTERACTIVE COMPARISON VIEWER\n", "================================================================================\n", "\n", "Select view to explore comparisons:\n", "\n" ] }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "007da265ffff4b3994c7c40594e9c0ff", "version_major": 2, "version_minor": 0 }, "text/plain": [ 
"Dropdown(description='View:', layout=Layout(width='350px'), options=('All Interviews Comparison', 'Pairwise Si…" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "
" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "
<h3>Detailed Pair Comparison:</h3>
" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "2a79abb086f64cd18883f8e3a44b4473", "version_major": 2, "version_minor": 0 }, "text/plain": [ "HBox(children=(Dropdown(description='Interview A:', layout=Layout(width='250px'), options=('1_1_Interdependenc…" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "efa54dd171fe4eb1a8138a9a5c57318e", "version_major": 2, "version_minor": 0 }, "text/plain": [ "Output()" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "2a4e10f8ebbd4749860a54f9ba99942c", "version_major": 2, "version_minor": 0 }, "text/plain": [ "Output()" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "eb8ad6f8a36a4c828212a0a8e0c2c630", "version_major": 2, "version_minor": 0 }, "text/plain": [ "Output()" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "\n", "⏳ Loading comparison view...\n", "\n", "================================================================================\n", "✅ COMPARISON VIEWER READY\n", "================================================================================\n", "\n", "Features:\n", " • 'All Interviews Comparison': Side-by-side component counts\n", " • 'Pairwise Similarities': Jaccard similarity between all pairs\n", " • Component-specific views: Detailed listings for each type\n", " • 'Complexity Analysis': Problem difficulty comparison\n", " • Detailed pair comparison: Select two interviews to compare\n", "\n", "================================================================================\n", "💾 SAVING COMPARISON TABLES\n", "================================================================================\n", "\n", "✓ Saved: 
Decision_Components_Comparison_All.csv\n", "✓ Saved: Decision_Components_Pairwise_Similarities.csv\n", "✓ Saved: Decision_Components_Objectives.csv\n", "✓ Saved: Decision_Components_Constraints.csv\n", "✓ Saved: Decision_Components_Trade-Offs.csv\n", "✓ Saved: Decision_Components_Decision_Variables.csv\n", "✓ Saved: Decision_Components_Options.csv\n", "✓ Saved: Decision_Components_Solutions.csv\n", "✓ Saved: Decision_Components_State_Variables.csv\n", "\n", "📊 Files saved to: publication_outputs/tables/\n", "\n", "✓ Variables created:\n", " • decision_comparison_df: All interviews comparison\n", " • decision_pairwise_df: Pairwise similarities\n", " • decision_detailed_listings: Component-specific details\n", "\n", "================================================================================\n" ] } ], "source": [ "# CELL 21B: Decision Components Comparative Analysis\n", "\n", "print(\"=\"*80)\n", "print(\"📊 DECISION COMPONENTS COMPARATIVE ANALYSIS\")\n", "print(\"=\"*80 + \"\\n\")\n", "\n", "import pandas as pd\n", "import numpy as np\n", "from collections import defaultdict\n", "import ipywidgets as widgets\n", "from IPython.display import display, clear_output, HTML\n", "import plotly.graph_objects as go\n", "import plotly.express as px\n", "\n", "# ==========================================\n", "# 1. CHECK REQUIREMENTS\n", "# ==========================================\n", "print(\"Step 1: Checking requirements...\")\n", "print(\"-\"*80)\n", "\n", "# Variables produced by Cell 21A (and the data loading cells) that this cell depends on\n", "required_vars = {\n", "    'decision_components': 'Extracted decision components',\n", "    'optimization_formulations': 'Optimization setups',\n", "    'documents': 'Interview texts'\n", "}\n", "\n", "has_data = True\n", "for var_name, description in required_vars.items():\n", "    if var_name not in globals() or not globals()[var_name]:\n", "        print(f\"❌ {var_name} not found! 
({description})\")\n", " print(f\" Run previous Decision Components cell first\")\n", " has_data = False\n", " else:\n", " print(f\"✓ {var_name}: {len(globals()[var_name])} items\")\n", "\n", "print()\n", "\n", "# ==========================================\n", "# 2. BUILD COMPREHENSIVE COMPARISON TABLE\n", "# ==========================================\n", "if has_data:\n", " print(\"Step 2: Building comprehensive comparison table...\")\n", " print(\"-\"*80)\n", " \n", " doc_list = sorted(documents.keys())\n", " \n", " # Component types for comparison\n", " component_types = ['Objectives', 'Constraints', 'Trade-Offs', \n", " 'Decision Variables', 'Options', 'Solutions', 'State Variables']\n", " \n", " # Build detailed comparison data\n", " comparison_data = []\n", " \n", " for comp_type in component_types:\n", " row = {'Component Type': comp_type}\n", " \n", " # Add each interview's count for this component\n", " for doc_name in doc_list:\n", " comps = decision_components[doc_name]\n", " count = len(comps[comp_type])\n", " row[doc_name] = count\n", " \n", " # Calculate statistics\n", " counts = [len(decision_components[doc][comp_type]) for doc in doc_list]\n", " row['Total'] = sum(counts)\n", " row['Mean'] = np.mean(counts)\n", " row['Std'] = np.std(counts)\n", " row['Min'] = min(counts)\n", " row['Max'] = max(counts)\n", " \n", " comparison_data.append(row)\n", " \n", " # Add summary row (all components)\n", " summary_row = {'Component Type': 'TOTAL'}\n", " for doc_name in doc_list:\n", " opt = optimization_formulations[doc_name]\n", " total = (opt['n_objectives'] + opt['n_constraints'] + opt['n_tradeoffs'] + \n", " opt['n_variables'] + opt['n_options'] + opt['n_solutions'] + opt['n_state_vars'])\n", " summary_row[doc_name] = total\n", " \n", " # Calculate totals\n", " totals = [summary_row[doc] for doc in doc_list]\n", " summary_row['Total'] = sum(totals)\n", " summary_row['Mean'] = np.mean(totals)\n", " summary_row['Std'] = np.std(totals)\n", " summary_row['Min'] 
= min(totals)\n", " summary_row['Max'] = max(totals)\n", " \n", " comparison_data.append(summary_row)\n", " \n", " # Create DataFrame\n", " comparison_df = pd.DataFrame(comparison_data)\n", " \n", " # Reorder columns\n", " stat_cols = ['Total', 'Mean', 'Std', 'Min', 'Max']\n", " col_order = ['Component Type'] + doc_list + stat_cols\n", " comparison_df = comparison_df[col_order]\n", " \n", " print(f\"✓ Comparison table created: {len(comparison_data)} rows × {len(col_order)} columns\")\n", " print()\n", " \n", " # ==========================================\n", " # 3. BUILD PAIRWISE COMPARISON TABLE\n", " # ==========================================\n", " print(\"Step 3: Building pairwise interview comparison table...\")\n", " print(\"-\"*80)\n", " \n", " pairwise_data = []\n", " \n", " for i, doc1 in enumerate(doc_list):\n", " for doc2 in doc_list[i+1:]: # Only compare each pair once\n", " comps1 = decision_components[doc1]\n", " comps2 = decision_components[doc2]\n", " opt1 = optimization_formulations[doc1]\n", " opt2 = optimization_formulations[doc2]\n", " \n", " # Calculate similarities and differences\n", " similarities = {}\n", " differences = {}\n", " \n", " for comp_type in component_types:\n", " set1 = comps1[comp_type]\n", " set2 = comps2[comp_type]\n", " \n", " # Shared components\n", " shared = set1 & set2\n", " similarities[comp_type] = len(shared)\n", " \n", " # Unique to each\n", " unique1 = set1 - set2\n", " unique2 = set2 - set1\n", " differences[comp_type] = (len(unique1), len(unique2))\n", " \n", " # Jaccard similarity (overall)\n", " all_comps1 = set()\n", " all_comps2 = set()\n", " for comp_type in component_types:\n", " all_comps1.update(comps1[comp_type])\n", " all_comps2.update(comps2[comp_type])\n", " \n", " intersection = len(all_comps1 & all_comps2)\n", " union = len(all_comps1 | all_comps2)\n", " jaccard = intersection / union if union > 0 else 0\n", " \n", " # Complexity difference\n", " complexity_diff = abs(opt1['complexity_score'] - 
opt2['complexity_score'])\n", " \n", " pairwise_data.append({\n", " 'Interview A': doc1,\n", " 'Interview B': doc2,\n", " 'Shared Objectives': similarities['Objectives'],\n", " 'Shared Constraints': similarities['Constraints'],\n", " 'Shared Trade-Offs': similarities['Trade-Offs'],\n", " 'Shared Variables': similarities['Decision Variables'],\n", " 'Shared Components': intersection,\n", " 'Jaccard Similarity': round(jaccard, 3),\n", " 'Complexity Diff': round(complexity_diff, 1),\n", " 'Problem Type A': opt1['problem_type'],\n", " 'Problem Type B': opt2['problem_type']\n", " })\n", " \n", " pairwise_df = pd.DataFrame(pairwise_data)\n", " \n", " print(f\"✓ Pairwise comparison table created: {len(pairwise_data)} pairs\")\n", " print()\n", " \n", " # ==========================================\n", " # 4. BUILD DETAILED COMPONENT LISTING\n", " # ==========================================\n", " print(\"Step 4: Building detailed component listings...\")\n", " print(\"-\"*80)\n", " \n", " # Create detailed listing for each component type\n", " detailed_listings = {}\n", " \n", " for comp_type in component_types:\n", " listings = []\n", " \n", " for doc_name in doc_list:\n", " comps = decision_components[doc_name][comp_type]\n", " for item in sorted(comps):\n", " listings.append({\n", " 'Component Type': comp_type,\n", " 'Interview': doc_name,\n", " 'Component': item,\n", " 'Length': len(item)\n", " })\n", " \n", " detailed_listings[comp_type] = pd.DataFrame(listings)\n", " \n", " print(f\"✓ Created detailed listings for {len(component_types)} component types\")\n", " print()\n", " \n", " # ==========================================\n", " # 5. 
CREATE INTERACTIVE COMPARISON VIEWER\n", " # ==========================================\n", " print(\"=\"*80)\n", " print(\"🎛️ INTERACTIVE COMPARISON VIEWER\")\n", " print(\"=\"*80 + \"\\n\")\n", " \n", " # Output widgets\n", " table_output = widgets.Output()\n", " analysis_output = widgets.Output()\n", " viz_output = widgets.Output()\n", " \n", " # View selector\n", " view_selector = widgets.Dropdown(\n", " options=[\n", " 'All Interviews Comparison',\n", " 'Pairwise Similarities',\n", " 'Objectives Comparison',\n", " 'Constraints Comparison',\n", " 'Trade-Offs Comparison',\n", " 'Decision Variables Comparison',\n", " 'Options Comparison',\n", " 'Solutions Comparison',\n", " 'State Variables Comparison',\n", " 'Complexity Analysis'\n", " ],\n", " value='All Interviews Comparison',\n", " description='View:',\n", " style={'description_width': '80px'},\n", " layout=widgets.Layout(width='350px')\n", " )\n", " \n", " # Interview pair selector (for detailed comparison)\n", " interview1_selector = widgets.Dropdown(\n", " options=doc_list,\n", " value=doc_list[0],\n", " description='Interview A:',\n", " style={'description_width': '80px'},\n", " layout=widgets.Layout(width='250px')\n", " )\n", " \n", " interview2_selector = widgets.Dropdown(\n", " options=doc_list,\n", " value=doc_list[1] if len(doc_list) > 1 else doc_list[0],\n", " description='Interview B:',\n", " style={'description_width': '80px'},\n", " layout=widgets.Layout(width='250px')\n", " )\n", " \n", " # Update function\n", " def update_display(change):\n", " selected_view = view_selector.value\n", " \n", " with table_output:\n", " clear_output(wait=True)\n", " \n", " if selected_view == 'All Interviews Comparison':\n", " display(HTML(\"

All Interviews - Component Count Comparison\"))\n", " \n", " # Style the dataframe\n", " styled_df = comparison_df.style.background_gradient(\n", " subset=[col for col in comparison_df.columns if col not in ['Component Type'] + stat_cols],\n", " cmap='Blues'\n", " ).format({\n", " 'Mean': '{:.1f}',\n", " 'Std': '{:.2f}',\n", " 'Total': '{:.0f}',\n", " 'Min': '{:.0f}',\n", " 'Max': '{:.0f}'\n", " })\n", " \n", " display(styled_df)\n", " \n", " elif selected_view == 'Pairwise Similarities':\n", " display(HTML(\"Pairwise Interview Similarities\"))\n", " \n", " styled_df = pairwise_df.style.background_gradient(\n", " subset=['Jaccard Similarity'],\n", " cmap='Greens'\n", " ).background_gradient(\n", " subset=['Complexity Diff'],\n", " cmap='Reds'\n", " ).format({\n", " 'Jaccard Similarity': '{:.3f}',\n", " 'Complexity Diff': '{:.1f}'\n", " })\n", " \n", " display(styled_df)\n", " \n", " elif selected_view == 'Complexity Analysis':\n", " display(HTML(\"Complexity Analysis\"))\n", " \n", " complexity_data = []\n", " for doc_name in doc_list:\n", " opt = optimization_formulations[doc_name]\n", " comps = decision_components[doc_name]\n", " \n", " complexity_data.append({\n", " 'Interview': doc_name,\n", " 'Total Components': sum([len(comps[ct]) for ct in component_types]),\n", " 'Objectives': opt['n_objectives'],\n", " 'Constraints': opt['n_constraints'],\n", " 'Trade-Offs': opt['n_tradeoffs'],\n", " 'Variables': opt['n_variables'],\n", " 'Complexity Score': opt['complexity_score'],\n", " 'Problem Type': opt['problem_type']\n", " })\n", " \n", " complexity_df = pd.DataFrame(complexity_data)\n", " \n", " styled_df = complexity_df.style.background_gradient(\n", " subset=['Complexity Score'],\n", " cmap='YlOrRd'\n", " ).format({'Complexity Score': '{:.1f}'})\n", " \n", " display(styled_df)\n", " \n", " else:\n", " # Component-specific view\n", " comp_type = selected_view.replace(' Comparison', '')\n", " \n", " if comp_type in detailed_listings:\n", " display(HTML(f\"{comp_type} - Detailed Listing
\"))\n", " \n", " comp_df = detailed_listings[comp_type]\n", " \n", " # Show summary stats\n", " print(f\"Total {comp_type}: {len(comp_df)}\")\n", " print(f\"Unique {comp_type}: {comp_df['Component'].nunique()}\")\n", " print(f\"Average per interview: {len(comp_df) / len(doc_list):.1f}\")\n", " print()\n", " \n", " display(comp_df)\n", " \n", " with analysis_output:\n", " clear_output(wait=True)\n", " \n", " if selected_view == 'All Interviews Comparison':\n", " print(\"📊 ANALYSIS\")\n", " print(\"-\"*60)\n", " print()\n", " print(\"Most common components:\")\n", " for i, row in comparison_df.iterrows():\n", " if row['Component Type'] != 'TOTAL':\n", " print(f\" • {row['Component Type']}: {row['Total']:.0f} total, {row['Mean']:.1f} avg/interview\")\n", " print()\n", " \n", " print(\"Highest variability (Std Dev):\")\n", " variability = comparison_df[comparison_df['Component Type'] != 'TOTAL'].sort_values('Std', ascending=False)\n", " for _, row in variability.head(3).iterrows():\n", " print(f\" • {row['Component Type']}: SD={row['Std']:.2f}\")\n", " print()\n", " \n", " print(\"Interview with most components:\")\n", " total_row = comparison_df[comparison_df['Component Type'] == 'TOTAL'].iloc[0]\n", " max_doc = max(doc_list, key=lambda d: total_row[d])\n", " print(f\" • {max_doc}: {total_row[max_doc]:.0f} components\")\n", " print()\n", " \n", " print(\"Interview with fewest components:\")\n", " min_doc = min(doc_list, key=lambda d: total_row[d])\n", " print(f\" • {min_doc}: {total_row[min_doc]:.0f} components\")\n", " \n", " elif selected_view == 'Pairwise Similarities':\n", " print(\"📊 PAIRWISE ANALYSIS\")\n", " print(\"-\"*60)\n", " print()\n", " \n", " # Most similar pair\n", " most_similar = pairwise_df.sort_values('Jaccard Similarity', ascending=False).iloc[0]\n", " print(\"Most similar interviews:\")\n", " print(f\" • {most_similar['Interview A']} ↔ {most_similar['Interview B']}\")\n", " print(f\" • Jaccard Similarity: {most_similar['Jaccard 
Similarity']:.3f}\")\n", " print(f\" • Shared components: {most_similar['Shared Components']}\")\n", " print()\n", " \n", " # Most different pair\n", " most_different = pairwise_df.sort_values('Jaccard Similarity', ascending=True).iloc[0]\n", " print(\"Most different interviews:\")\n", " print(f\" • {most_different['Interview A']} ↔ {most_different['Interview B']}\")\n", " print(f\" • Jaccard Similarity: {most_different['Jaccard Similarity']:.3f}\")\n", " print(f\" • Shared components: {most_different['Shared Components']}\")\n", " print()\n", " \n", " # Complexity differences\n", " print(\"Largest complexity difference:\")\n", " max_diff = pairwise_df.sort_values('Complexity Diff', ascending=False).iloc[0]\n", " print(f\" • {max_diff['Interview A']} ↔ {max_diff['Interview B']}\")\n", " print(f\" • Complexity difference: {max_diff['Complexity Diff']:.1f}\")\n", " \n", " elif selected_view == 'Complexity Analysis':\n", " print(\"📊 COMPLEXITY ANALYSIS\")\n", " print(\"-\"*60)\n", " print()\n", " \n", " # Overall statistics (indexed by doc_list so scores align with component_totals below)\n", " complexities = [optimization_formulations[doc]['complexity_score'] for doc in doc_list]\n", " print(f\"Average complexity: {np.mean(complexities):.1f} (SD={np.std(complexities):.1f})\")\n", " print(f\"Range: {min(complexities):.1f} to {max(complexities):.1f}\")\n", " print()\n", " \n", " # Problem type distribution\n", " problem_types = [optimization_formulations[doc]['problem_type'] for doc in doc_list]\n", " print(\"Problem type distribution:\")\n", " from collections import Counter\n", " type_counts = Counter(problem_types)\n", " for ptype, count in type_counts.most_common():\n", " pct = 100 * count / len(problem_types)\n", " print(f\" • {ptype}: {count} ({pct:.1f}%)\")\n", " print()\n", " \n", " # Correlation analysis (both lists follow doc_list order)\n", " component_totals = [sum([len(decision_components[doc][ct]) for ct in component_types]) \n", " for doc in doc_list]\n", " correlation = np.corrcoef(complexities, component_totals)[0, 1]\n", " print(f\"Correlation 
(complexity vs total components): {correlation:.3f}\")\n", " \n", " with viz_output:\n", " clear_output(wait=True)\n", " \n", " if selected_view == 'All Interviews Comparison':\n", " # Heatmap of all components\n", " data_matrix = []\n", " for i, row in comparison_df.iterrows():\n", " if row['Component Type'] != 'TOTAL':\n", " data_matrix.append([row[doc] for doc in doc_list])\n", " \n", " fig = go.Figure(data=go.Heatmap(\n", " z=data_matrix,\n", " x=doc_list,\n", " y=[row['Component Type'] for _, row in comparison_df.iterrows() if row['Component Type'] != 'TOTAL'],\n", " colorscale='Blues',\n", " text=data_matrix,\n", " texttemplate='%{text}',\n", " textfont={\"size\": 10},\n", " colorbar=dict(title=\"Count\")\n", " ))\n", " \n", " fig.update_layout(\n", " title='Component Counts Heatmap',\n", " xaxis_title='Interview',\n", " yaxis_title='Component Type',\n", " height=400\n", " )\n", " \n", " display(fig)\n", " \n", " elif selected_view == 'Pairwise Similarities':\n", " # Similarity matrix\n", " n_docs = len(doc_list)\n", " similarity_matrix = np.zeros((n_docs, n_docs))\n", " \n", " for i in range(n_docs):\n", " similarity_matrix[i, i] = 1.0 # Self-similarity\n", " \n", " for _, row in pairwise_df.iterrows():\n", " i = doc_list.index(row['Interview A'])\n", " j = doc_list.index(row['Interview B'])\n", " sim = row['Jaccard Similarity']\n", " similarity_matrix[i, j] = sim\n", " similarity_matrix[j, i] = sim\n", " \n", " fig = go.Figure(data=go.Heatmap(\n", " z=similarity_matrix,\n", " x=doc_list,\n", " y=doc_list,\n", " colorscale='Greens',\n", " text=np.round(similarity_matrix, 2),\n", " texttemplate='%{text}',\n", " textfont={\"size\": 9},\n", " colorbar=dict(title=\"Jaccard Similarity\")\n", " ))\n", " \n", " fig.update_layout(\n", " title='Interview Similarity Matrix',\n", " xaxis_title='Interview',\n", " yaxis_title='Interview',\n", " height=500,\n", " width=550\n", " )\n", " \n", " display(fig)\n", " \n", " elif selected_view == 'Complexity Analysis':\n", " # 
Scatter plot: Total Components vs Complexity\n", " scatter_data = []\n", " for doc_name in doc_list:\n", " opt = optimization_formulations[doc_name]\n", " total = sum([len(decision_components[doc_name][ct]) for ct in component_types])\n", " scatter_data.append({\n", " 'Interview': doc_name,\n", " 'Total Components': total,\n", " 'Complexity Score': opt['complexity_score'],\n", " 'Problem Type': opt['problem_type']\n", " })\n", " \n", " scatter_df = pd.DataFrame(scatter_data)\n", " \n", " fig = px.scatter(\n", " scatter_df,\n", " x='Total Components',\n", " y='Complexity Score',\n", " color='Problem Type',\n", " text='Interview',\n", " title='Component Count vs Complexity Score',\n", " height=450\n", " )\n", " \n", " fig.update_traces(textposition='top center', textfont_size=8)\n", " \n", " display(fig)\n", " \n", " else:\n", " # Component-specific visualization\n", " comp_type = selected_view.replace(' Comparison', '')\n", " \n", " if comp_type in component_types:\n", " counts = [len(decision_components[doc][comp_type]) for doc in doc_list]\n", " \n", " fig = go.Figure(data=[\n", " go.Bar(\n", " x=doc_list,\n", " y=counts,\n", " marker_color='#2E86AB',\n", " text=counts,\n", " textposition='outside'\n", " )\n", " ])\n", " \n", " fig.update_layout(\n", " title=f'{comp_type} Count by Interview',\n", " xaxis_title='Interview',\n", " yaxis_title='Count',\n", " height=350\n", " )\n", " \n", " display(fig)\n", " \n", " # Detailed pair comparison function\n", " def compare_pair(change):\n", " doc1 = interview1_selector.value\n", " doc2 = interview2_selector.value\n", " \n", " with analysis_output:\n", " clear_output(wait=True)\n", " \n", " display(HTML(f\"

Detailed Comparison: {doc1} vs {doc2}\"))\n", " \n", " comps1 = decision_components[doc1]\n", " comps2 = decision_components[doc2]\n", " opt1 = optimization_formulations[doc1]\n", " opt2 = optimization_formulations[doc2]\n", " \n", " print(\"=\"*60)\n", " print(\"📊 COMPONENT-BY-COMPONENT COMPARISON\")\n", " print(\"=\"*60)\n", " print()\n", " \n", " for comp_type in component_types:\n", " set1 = comps1[comp_type]\n", " set2 = comps2[comp_type]\n", " \n", " shared = set1 & set2\n", " only1 = set1 - set2\n", " only2 = set2 - set1\n", " \n", " print(f\"{comp_type}:\")\n", " print(f\" {doc1}: {len(set1)} | {doc2}: {len(set2)} | Shared: {len(shared)}\")\n", " \n", " # Sort before slicing so each preview is deterministic across runs\n", " if shared:\n", " print(f\" Shared items:\")\n", " for item in sorted(shared)[:3]:\n", " print(f\" • {item}\")\n", " if len(shared) > 3:\n", " print(f\" ... and {len(shared) - 3} more\")\n", " \n", " if only1:\n", " print(f\" Only in {doc1}:\")\n", " for item in sorted(only1)[:2]:\n", " print(f\" • {item}\")\n", " if len(only1) > 2:\n", " print(f\" ... and {len(only1) - 2} more\")\n", " \n", " if only2:\n", " print(f\" Only in {doc2}:\")\n", " for item in sorted(only2)[:2]:\n", " print(f\" • {item}\")\n", " if len(only2) > 2:\n", " print(f\" ... 
and {len(only2) - 2} more\")\n", " \n", " print()\n", " \n", " print(\"=\"*60)\n", " print(\"📐 OPTIMIZATION COMPARISON\")\n", " print(\"=\"*60)\n", " print()\n", " print(f\"{doc1}:\")\n", " print(f\" Problem Type: {opt1['problem_type']}\")\n", " print(f\" Complexity: {opt1['complexity_score']:.1f}\")\n", " print(f\" Objective: {opt1['objective_function'][:60]}...\")\n", " print()\n", " print(f\"{doc2}:\")\n", " print(f\" Problem Type: {opt2['problem_type']}\")\n", " print(f\" Complexity: {opt2['complexity_score']:.1f}\")\n", " print(f\" Objective: {opt2['objective_function'][:60]}...\")\n", " \n", " # Connect selectors\n", " view_selector.observe(update_display, names='value')\n", " interview1_selector.observe(compare_pair, names='value')\n", " interview2_selector.observe(compare_pair, names='value')\n", " \n", " # Display interface\n", " print(\"Select view to explore comparisons:\")\n", " print()\n", " display(view_selector)\n", " display(HTML(\"
\"))\n", " display(HTML(\"

Detailed Pair Comparison:

\"))\n", " display(widgets.HBox([interview1_selector, interview2_selector]))\n", " display(table_output)\n", " display(analysis_output)\n", " display(viz_output)\n", " \n", " # Initial display\n", " print(\"\\n⏳ Loading comparison view...\")\n", " update_display(None)\n", " \n", " print()\n", " print(\"=\"*80)\n", " print(\"✅ COMPARISON VIEWER READY\")\n", " print(\"=\"*80 + \"\\n\")\n", " \n", " print(\"Features:\")\n", " print(\" • 'All Interviews Comparison': Side-by-side component counts\")\n", " print(\" • 'Pairwise Similarities': Jaccard similarity between all pairs\")\n", " print(\" • Component-specific views: Detailed listings for each type\")\n", " print(\" • 'Complexity Analysis': Problem difficulty comparison\")\n", " print(\" • Detailed pair comparison: Select two interviews to compare\")\n", " print()\n", " \n", " # ==========================================\n", " # 6. SAVE COMPARISON TABLES\n", " # ==========================================\n", " print(\"=\"*80)\n", " print(\"💾 SAVING COMPARISON TABLES\")\n", " print(\"=\"*80 + \"\\n\")\n", " \n", " from pathlib import Path\n", " output_dir = Path('publication_outputs/tables')\n", " output_dir.mkdir(parents=True, exist_ok=True)\n", " \n", " # Save main comparison table\n", " comparison_df.to_csv(output_dir / 'Decision_Components_Comparison_All.csv', index=False)\n", " print(f\"✓ Saved: Decision_Components_Comparison_All.csv\")\n", " \n", " # Save pairwise table\n", " pairwise_df.to_csv(output_dir / 'Decision_Components_Pairwise_Similarities.csv', index=False)\n", " print(f\"✓ Saved: Decision_Components_Pairwise_Similarities.csv\")\n", " \n", " # Save component-specific tables\n", " for comp_type, df in detailed_listings.items():\n", " filename = f\"Decision_Components_{comp_type.replace(' ', '_')}.csv\"\n", " df.to_csv(output_dir / filename, index=False)\n", " print(f\"✓ Saved: {filename}\")\n", " \n", " print()\n", " print(f\"📊 Files saved to: publication_outputs/tables/\")\n", " \n", " # Store in 
global scope\n", " globals()['decision_comparison_df'] = comparison_df\n", " globals()['decision_pairwise_df'] = pairwise_df\n", " globals()['decision_detailed_listings'] = detailed_listings\n", " \n", " print()\n", " print(\"✓ Variables created:\")\n", " print(\" • decision_comparison_df: All interviews comparison\")\n", " print(\" • decision_pairwise_df: Pairwise similarities\")\n", " print(\" • decision_detailed_listings: Component-specific details\")\n", "\n", "else:\n", " print(\"⚠️ Missing required data - run Decision Components cell first\")\n", "\n", "print(\"\\n\" + \"=\"*80)" ] }, { "cell_type": "code", "execution_count": 29, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "================================================================================\n", "🔧 DECISION COMPONENTS: STRUCTURE DIAGNOSTIC & REPAIR\n", "================================================================================\n", "\n", "Step 1: Diagnosing current state...\n", "--------------------------------------------------------------------------------\n", "documents exists: {'1_1_InterdependenciesNNA': 'Leif:\\tI don\\'t know if I...\\nJennifer:\\tI just started it.\\nLeif:\\tAll right, and just like any research project, as you know, you can stop at any point, if you don\\'t want to do it, there\\'s no penalty, okay? Great. Yeah, so, okay, how long have you lived in this area?\\nJennifer:\\t16 years.\\nLeif:\\tOkay, and what is your educational and professional background, both in terms of schooling, but also in terms of your community roles, as they pertain to water?\\nJennifer:\\tYeah, I drink water every day. I have a bachelor\\'s in environmental health science, and much of that was focused on water. I have a master\\'s degree in public health and a graduate certificate in global health. 
Much of both of those programs were also focused on water or the health impacts of water service.\\n\\tI have worked in the environmental and public health field in Alaska for 16 years, and the focus of much of that was on water. So, how do we get water to people, how do we help people be healthy with the water that they have, and how do we help the people out in communities run their water systems in a way that\\'s sustainable and successful? I also supervised a water lab, taught water plant operator training classes. And you mentioned roles in your community, I have also been a volunteer for our city municipal public works committee for a number of years.\\nLeif:\\tBut you\\'re doing something a little different. Are you still doing any water stuff?\\nJennifer:\\tVery little. So January 2021, I shifted to a full-time job in a research capacity, so very little of what I do now is directly related to water, but I do have over 15 years\\' experience working in that field.\\nLeif:\\tOkay. Can you say a little bit more about the ways that you worked with water infrastructure and who you worked with?\\nJennifer:\\tSure. I worked for YKHC, Yukon-Kuskokwim Health Corporation. I also spent two-and-a-half years working at the Alaska Native Tribal Health Consortium, the environmental health and engineering group at YKHC, the hospital here in Bethel.\\n\\tI supervised an environmental health services program. We worked with 48 or so villages in the region to help them run water systems, among other environmental public health concerns. I mentioned teaching operator training. We did a R&D project on a graywater recycling system. I managed the staff at a water lab, a micro-lab. So, we tested all the water at our lab from the water systems around in the region, about 100 samples a month, micro-testing.\\n\\tAnd then when I was working at the Alaska Native Tribal Health Consortium, I had many similar duties, but more of a statewide capacity. 
I also worked as a tribal water fluoridation coordinator, so I was working with operators as well as the dental health group to try to bridge the gap a little bit between water fluoridation. And I worked on a research project there.\\nLeif:\\tOkay. So some of these questions are going to be specific about a utility, but you\\'ve had experience working... I mean, you have your community in Bethel, as well as a bunch of other communities, so the answers might vary.\\nJennifer:\\tSure.\\nLeif:\\tAnd just, I guess, describe that as you go. How is drinking water provided in your area?\\nJennifer:\\tSure. In my community, where I live, Bethel, we have a combination of piped water provided to two subdivisions or neighborhoods in town. All of the rest of the community is served by a vehicle haul system, so trucks deliver potable water to the home. They\\'re delivered to tanks in the home.\\nLeif:\\tAnd if we thought about your area as broader, right?\\nJennifer:\\tBigger than that, yeah. So I consider the YK Delta region my area, since I have worked with a service provider in the region for a number of years. So we have a combination. We have some communities with no service, so people haul water themselves from natural water sources to their homes. We have communities that are served with what we call small vehicle haul systems. So 100 gallons of water is transported in a tank behind an ATV or a snowmachine and delivered to the home. And then we have some communities and homes, of course, that have piped water. \\nLeif:\\tAnd how is the water treated? \\nJennifer:\\tSure. So, well, we have groundwater and surface water, but almost all of the communities in the region do have a water treatment plant. So, with very few exceptions, almost every community in my part of Alaska has a water treatment plant with a way to provide treated water to community members. So all the usual. We have filters and chlorination, is usually what we see. Mix of filtration types. 
\\nLeif:\\tSo, you mentioned this before, about your role. So your current a role is something a little bit different, but you explained what you did before that with relation to water, so I don\\'t think we probably need to revisit that. What was your previous role? We\\'ve got that. So, if we\\'re looking at before the job you have now, when you were working with water regularly, what were your main responsibilities? And can you walk us through a daily or a weekly routine? \\nJennifer:\\tSure. Yeah. I mentioned the water lab, so that was certainly a weekly duty. I supervised staff at our water lab. We do a coliform testing there at the water lab. I also mentioned water plant operator training. So about four to six times a year, my team would teach, mostly in-person at the time, classroom trainings to water plant operators at the villages and communities around the region. They would come into Bethel to teach those classes. \\n\\tI worked closely with a program called the Remote Maintenance Worker Program, which has been a program in Alaska for many years. We\\'ve got great guys there in that program that provide onsite assistance and technical assistance to the water plant operators. My environmental health staff likewise provided onsite assistance and some technical assistance by phone. So if we had a community with questions about regulations, or water sampling requirements, or getting a sanitary survey, we\\'re the people they would talk to. \\nLeif:\\tOkay. So, based on what you said, you were talking regularly with water plant operators. Did you talk with community members about water, ever? \\nJennifer:\\tSometimes. Usually it was more the city offices or tribal offices that would manage the water plants. So we often talked with water plant operators. We often talked with utility managers or city managers that were involved with the water system. Occasionally, community members. 
Sometimes we would get complaints or concerns or questions about testing, and we would do as much as we could in terms of community education. \\nLeif:\\tOkay, so can you give an example of what that might have looked like, or a time that you did it? \\nJennifer:\\tSure, like community meetings, we would host or attend community meetings when there were concerns about the water quality. When cities were looking at changing rates, there would often be community meetings to discuss those changes. We would participate in those. Certainly when I was working with water fluoridation, community meetings, and sometimes debates were a part of that as well. \\nLeif:\\tOkay. Do you have a feeling or description of how you like to talk to people about water in those situations? What does that look like? \\nJennifer:\\tYeah, I-\\nLeif:\\tAbout water systems, I guess, yeah. \\nJennifer:\\tSure. Yeah. I\\'m pro-treated water, and I think coming from this region, sometimes people in the region prefer untreated, natural sources of water. There\\'s a theme that something that\\'s natural is good for us, right? So if we\\'re picking ice and we\\'re using that water, that\\'s better for us. It\\'s more natural. That\\'s what our elders did. So I usually have to take the stance of explaining why treated water, why we know it\\'s safe and healthy for us. So that\\'s usually the end that I felt like I was always on, is trying to explain why treated water is good, it\\'s healthy for us, why it\\'s important to use that. \\nLeif:\\tOkay. \\nJennifer:\\tI don\\'t know if that answered your question, I\\'m sorry. \\nLeif:\\tYeah, I think so. So you went over a little bit, who you usually worked with, and mostly it was, it sounds like, city office-\\nJennifer:\\tOperations. \\nLeif:\\t... operators, and occasionally it was the lay public. Is that...\\nJennifer:\\tYes. \\nLeif:\\tOkay. Very good. Okay. So those are some introductory questions. 
I\\'m going to just take a pause and ask Lauryn if there\\'s anything else [crosstalk]. \\nLauryn:\\tYeah, no, that was great. Yeah. I have a follow-up question. We talked about communicating with the public about water quality or water systems. How did you typically communicate with operators, and maybe were there challenges there, especially when you were talking about the Remote Maintenance Worker Program that you were involved in?\\nJennifer:\\tYeah, lots of challenges. Thank you very much, Lauryn. I feel like you get me. Yeah, so water plant operators, not to generalize and stereotype, but I totally will, they\\'re a bunch of great old dudes that work in their communities. They don\\'t always use smartphones. They can sometimes be difficult to get a hold of. \\n\\tSo, believe it or not, we still used a fax. Water plant operators were really good at faxing stuff. We could reach them on the phone during their work hours at the water plant, but it\\'s an old-school crowd, so we did a lot of work... Outreach with phone. We certainly did in-person work when we were there in the communities. We\\'d spend as much time as possible as we could with the water plant operators. But yeah, it certainly has its challenges. \\nLauryn:\\tYeah, and was that even... So you mentioned you were involved in multiple aspects, including training as well. Was that kind of interaction different, when you were specifically working on operator training? \\nJennifer:\\tYeah, it\\'s highly variable. We have a... If you look at the age curve of the water plant operators in the region, it\\'s certainly more on one side. So we have found over the years that in-person classroom trainings tends to work the best for people. During the pandemic, we\\'ve been trying some alternate methods of training, and then the age of participants for that looks very different. 
So we do find that the next generation of water plant operators is certainly more interested in and available to do texting and maybe apps and maybe some Zoom stuff. But yeah, it\\'s an old school crowd overall. \\nLauryn:\\tYeah, definitely. Okay, great. Thank you. I\\'ll pipe in a little bit more going forward, too, but for the introductory, I think those are the questions I have right now. \\nLeif:\\tYeah. Yeah, it\\'s weird because I\\'ve heard these stories before. Okay, well, so moving forward and talk a little bit about general water infrastructure and your experiences or impressions of that. And again, this would be, I guess, either in your community or the communities you worked. So we\\'re going to start by just asking if there\\'s a recent moment or a story that made you aware of challenges in local water infrastructure, an anecdote. \\nJennifer:\\tCan\\'t think of a recent moment that it occurred to me that there were challenges, but I came into this field wanting to better understand the challenges. One of the first communities that I worked with in the region was a community called Akiachak, and they were trying to get a piped water system for the first time, and really struggling to meet some of the, what we call capacity indicators. Are you... Through the RUBA program, the rural utility business advisory program? I don\\'t know if that rings... I know, Leif you\\'re familiar with that.\\nLeif:\\tThat\\'s both... Some of those are administrative, and some of them more like-\\nJennifer:\\tOperations, yeah. \\nLeif:\\tYeah, operations, right? \\nJennifer:\\tYeah. Yes, correct. Okay, so I guess I was just thinking of that community because when I was working directly with that community, they were working through some of the administrative hurdles to score out and qualify to get a piped water project, which they were eventually able to do. 
And then there was the whole operations piece as they got funding and built a new water plant, and it replaced the old plant, and worked closely with operator training and got them trained on the new plant. \\n\\tAnd then the project was... The distribution system was completed only at 50%. So half of the community had a complete distribution system, and the other part of the community, the distribution system was not completed for a number of years, and that was just completed this year, and that\\'s like a decade later. \\n\\tSo I guess I highlighted that example because it\\'s one of the most interesting and frustrating community case studies that I worked with, and I started working with that community on the ground, in person, in like 2005, helping them get ready to get a piped system, and that just was completed this year. \\nLeif:\\tGreat. Along the same lines, can you think of any recent examples of infrastructure issues at your own house, with lead, and public works? \\nJennifer:\\tYeah. Yeah. Yes. Okay. Yes, absolutely. \\nLeif:\\t[crosstalk] question. \\nJennifer:\\tSo in the community in which I live and I served as a volunteer committee member, so it\\'s an oversight committee that advises the public works department in the community in which I live, we had some lead exceedances in recent years, and there was virtually no follow-up on the part of our operations team. \\n\\tSo when you have lead exceedances and you don\\'t do the required resampling or any of the public notifications, you then find yourself in a regulatory situation. So I certainly tried to work closely as a volunteer and an advisor to help the community operations staff get out of that, with limited success. \\nLeif:\\t[inaudible]. \\nJennifer:\\tSay that again? \\nLeif:\\tIt got fixed, right? \\nJennifer:\\tI haven\\'t checked recently, but yeah, we did get some things resolved at the time. \\nLeif:\\tAll right. Well, according to the consumer confidence report [inaudible]. 
\\nJennifer:\\tIt\\'s fixed this year. \\nLeif:\\tOkay, great. Sorry. All right. So, what water infrastructure challenges does your community face? And that could be about water quality or the physical infrastructure. \\nJennifer:\\tI think distribution systems in the Arctic are probably more challenging than anything else. So I certainly think about distribution systems. There\\'s a big difference in the communities that we work with that have a centralized treatment plant and what we call a watering point where people go there to get the treated water and take it to their homes. So, picture that, much simpler system, than the same water plant in a different community with pipes running through the frozen Arctic permafrost to 200 homes. \\n\\tSo, first off, I would say distribution systems. They\\'re just tough to run in Arctic Alaska. Second of all, and this is not directly related to infrastructure, but I would say managerial infrastructure. So, many of our communities, we have one or two water plant operators. We have high turnover. We may have little to no oversight... Wide range of managerial oversight and knowledge about operations, and that is a constant struggle. So, number one, distribution systems in general. Number two, managerial capacity. \\nLeif:\\tAnd what do you see as community impacts related to those challenges? \\nJennifer:\\tSo, speaking to the distribution system, when you have a community with a distribution system and it is not properly run, you end up with freezes. You have homes with no service. Sometimes that affects the loop, so then we have lots of homes with no service. You then are not collecting revenue because you have houses that are not receiving service, and you can have lasting impacts to the distribution system. And then for the managerial piece that I mentioned, like operator turnover and support and recruiting, I think that\\'s the biggest concern that I have. 
And then of course, the financial piece, it\\'s, you have to have managerial support and capacity to run a water system like a business, to get your revenue, cover your expenses. \\nLeif:\\tYeah, so the next question has to do with... Is asking whether those challenges change throughout the year, like seasonally. I mean, I think I can probably guess for freeze-ups, but I\\'d be interested to know, for all the things you listed, if there\\'s a seasonality to that. \\nJennifer:\\tYeah, I think so. So, like you said, it\\'s hard to run distribution systems when it\\'s very, very cold out. So, for sure, there. It\\'s also more expensive to run water plants when it\\'s really cold out and you\\'re heating everything. So our costs go up, right? Just keeping a water plant heated, keeping the distribution system heated. I would say seasonally, we have operators that will take subsistence leave, so sometimes summer times can be hard to find someone around to run the water plant, too, because of fishing and subsistence leave. So there is a seasonality to it. \\nLeif:\\tSo, fishing would be like summertime? \\nJennifer:\\tMm-hmm (affirmative). Yeah. \\nLeif:\\tOkay, so challenges in the winter from freezing and cold conditions. Potentially also challenges-\\nJennifer:\\tSubsistence practices. It\\'s sometimes hard to find someone to check on the water plant if you have one or two people employed to do that and you\\'ve got people out for moose hunting or fishing. \\nLeif:\\tVery good. So, next question is about water quality. Do you face water quality challenges? \\nLauryn:\\tAnd before we move on, I\\'m going to ask one question on the same topic. Do you notice that there\\'s intermittent supply issues as well? Or is that really just maybe due to a freeze-up that would cause a service disruption? \\nJennifer:\\tYeah, definitely, we have intermittent supply issues. 
I\\'m thinking of a couple of communities where they have to move their intake each spring and fall, especially surface water systems, like Emmonak is an example that comes to mind. So they have to remove their river intake entirely during freeze-up each spring and during break-up each fall. So we have these awkward seasons during break-up and freeze-up where we have virtually no supply because we had to pull the supply to switch over to the next season. \\nLauryn:\\tInteresting. \\nJennifer:\\tWe also have, not that many anymore, when I started we had a lot of them, but systems that we call like fill-and-draw systems. Do you know what that means? \\nLauryn:\\tI [crosstalk]. \\nJennifer:\\tWhere we pump water seasonally, right? So we pump water in the unfrozen part of the year, we fill up a giant water tank, and we try to make that last as long as we can. But there\\'s not that many of those systems that rely entirely on fill-and-draw any longer. \\nLauryn:\\tOkay. \\nJennifer:\\tSo yes, lots of intermittent challenges. \\nLeif:\\tDo you remember where... Was it somewhere out on Nelson Island, or was it [crosstalk]? \\nJennifer:\\tNewtok. \\nLeif:\\tOh, was it Newtok that was doing that? It seems like that was recurrent problem year after year [crosstalk]-\\nJennifer:\\tYep, every single year. \\nLeif:\\tLike the end of the winter, they would just be out of water. \\nJennifer:\\tAbsolutely, yep, for months. \\nLeif:\\tGreat. Okay. \\nJennifer:\\tYeah. What\\'s up with water quality? Remind me of the question. \\nLeif:\\tYeah, moving onto, do you have water quality issues as well? \\nJennifer:\\tYeah, we do. Thank you for asking. Yeah, so... Oh, Lauryn, you\\'d mentioned about intermittent access. We have weird seasonal fluctuations in source water quality. I mentioned that community of Emmonak where we have a surface water intake, and how we literally have to pull it out in the freeze-up, and break-up is what we call... 
Sorry, I forget sometimes when I\\'m talking to non-Alaskans, when the river... Break-up is just what it sounds like if you\\'re imagining an icy river. So, yeah. Our source water changes pretty dramatically during those times of year. So, yes, water quality can be challenging when your source water quality is changing seasonally or from other reasons. After flooding, et cetera. \\nLeif:\\tIs source water quality the primary driver of water quality issues? \\nJennifer:\\tNo. \\nLeif:\\tOkay. \\nJennifer:\\tNo, but it is like a legitimate challenge. So, especially, I think, for some of our less experienced operators, they might work really closely with the Remote Maintenance Worker Program that we talked about to dial in a treatment system, right? But they\\'re not always able to adjust the treatment system as needed, right? \\n\\tSo the design engineers come in, they set it, the RMWs help train the operators, they might make some tweaks, but when you have source water quality that\\'s changing for whatever reason, or other changes that need to be made, I don\\'t always feel that the operators are able to make those adjustments as they should. \\n\\tAnd that actually came up with Bethel as well. The water treatment plant is just set on basic parameters, and so our design is set to treat it in this very specific way. And when the needs change, we don\\'t always have operators that can adjust the treatment as needed so that you continue having high-quality water. So I do see that coming up a lot. \\nLeif:\\tYeah, so when they added another piped loop and they were making twice as much water-\\nJennifer:\\tYes, and they didn\\'t... Yeah, exactly. Very good, very good example. I was struggling to think of an example. \\nLeif:\\tI think the temperature, maintaining... They change the water temperature, too. \\nJennifer:\\tTemperature, yep. \\nLeif:\\tOkay. Is it difficult to meet regulatory requirements? \\nJennifer:\\tYeah. 
Yes, sometimes, especially when we have, I think less experienced operators, just knowledge gaps. I mean, people say... Yeah, I think the lead and copper thing that came up with city of Bethel recently. I should explain that the Bethel water system is... We had a level three operator for many years, one of the only one in the entire region. So, highly trained, we have operators with many years of longevity, and we were still missing some of those key things that needed to happen, right? Sampling and public notification. So, yes, I don\\'t think we have... I think there\\'s a disconnect between our operations staff and our DEC state regulators, and a knowledge gap in the middle. And we don\\'t always communicate the same language, or the same way, so that makes it very fun. \\nLeif:\\tYou mean your local regulators and the DEC [crosstalk]-\\nJennifer:\\tThe local operations staff and then the state regulators-\\nLeif:\\tNot making [crosstalk]. \\nJennifer:\\t... and then there\\'s just a black hole in between, at times. \\nLeif:\\tSo, what would you need to better respond to those water quality challenges? \\nJennifer:\\tLet\\'s see. That is a great question that I\\'m not sure how to answer. \\nLeif:\\tOr just to infrastructure challenges more in general, if that\\'s... \\nJennifer:\\tSure. I can\\'t say enough about the remote maintenance worker team, right? So they\\'re highly trained, specialized, kind of a circuit rider model, where, in the office here, they had five positions to work with a bunch of 50 or so public water systems. I really can\\'t say enough about that program. I feel like it\\'s just such a success, for so many reasons, but that is one of the things that exists within that black hole that I mentioned between the state regulators and also our design engineer... Like our engineering staff that puts in water systems, and the local operations teams. So I think programs like that have really impacted our region in a positive way. 
We need someone to bridge that gap. Service providers. Go ahead. \\nLauryn:\\tYeah, can you expand a little bit, when you\\'re talking about this knowledge gap, of why you think it\\'s present? Is it the rigid regulatory environment and structure? Is it communication issues? What do you think is causing that gap, and do you see any ways to bridge it? And you mentioned the solution of the Remote Maintenance Worker Program? \\nJennifer:\\tYeah. I used to think like, \"Oh, we\\'re just not communicating properly,\" we meaning like service providers or regulators, like we\\'re just not communicating the needs well, right?\\nLauryn:\\tYeah. \\nJennifer:\\tSo like they didn\\'t know they needed to take samples for lead and copper. I think that\\'s definitely an oversimplification. I think my understanding has evolved, and when you have a small community water system that is severely resource-challenged, both in personnel and money and time, we just don\\'t have the capacity to do some basic functions of a water system, right? So I\\'m not really sure how to answer that other than, we just have severely under-resourced community water systems, and they\\'re just not able to keep up with basic functions, including regulatory requirements. \\nLauryn:\\tOkay. \\nLeif:\\tIf we were going to look at the disconnect or the gap between operators and folks that are out in the field and the DEC and the state, do you... I mean, you used the word miscommunication. How much of that do you feel is miscommunication? How much of that do you feel is that people are just doing two separate jobs and they\\'re not... Those jobs are somehow incongruent, or do you feel like there\\'s a... Can you say something about that? I mean, is miscommunication, is that what you feel like is happening, or is there another thing that could be happening?\\nJennifer:\\tYeah, I think that there\\'s just competing priorities. 
So when you have a water plant operator who is on two hours a day, we\\'re hoping he gets paid for those two hours a day. He may or may not. Hopefully he\\'s trying to keep that system running, right? And it heated, and people online, and take care of... Like literally fighting fires, right? So it\\'s just emergencies. We\\'re just addressing emergencies, day in, day out. So sometimes taking lead and copper samples is not the emergency fire that the operator is going to put time in, on his two hours a day that he may or may not get paid for. \\nLeif:\\tOkay. Can you describe a time when you\\'ve had to be creative responding to a challenge? \\nJennifer:\\tGosh, now it feels like an interview question, like a real interview question. \\nLeif:\\tWhat are your strengths and weaknesses? \\nJennifer:\\tAll right. \\nLeif:\\tAs it pertains to solving a water quality or infrastructure problem. How\\'s that? \\nJennifer:\\tOkay. Yeah. So I think playing the long game, one of that my program noticed years ago is that it wasn\\'t just about the operations. We were not even getting to the point with some of our communities where we could be competitive for getting sanitation infrastructure funding, because so many of the managerial components were lacking. So we were not even scoring out high enough to get funding for water systems. \\n\\tAnd these are the few remaining communities in Alaska that don\\'t have sewer infrastructure, right? So you would think that they would be competitive for getting water plants, because we have homes with no running water. But because the managerial capacity was so low, this goes back to the RUBA best practices I mentioned, they were just not scoring out high enough to really qualify for that funding to even get a project. \\n\\tSo I think one of the creative solutions over the years that I can think of is focusing on those upstream impacts, right? 
So, not just focusing on operating a water system, but if you want to get a new water system in a community, you would therefore have to go upstream and work on resolving and building some of the managerial capacity so that that can happen a few years later. \\nLeif:\\tWhen you say managerial capacity, do you mean for the... Like a [crosstalk] administrator? \\nJennifer:\\tYeah, like city offices, yeah. \\nLeif:\\tOkay, so you don\\'t mean trying to find a water plant operator specifically, you mean upstream even of the... Above that? \\nJennifer:\\tYeah. Yep, absolutely. Yeah. So, like helping the community entity enact good business practices so that they can show they have the revenue, and they have accountants on file, and they have been to manager training. That is all before we even get to the operations point. \\nLeif:\\tGreat. You touched on this before. What challenges are unique to your utility due to the Arctic conditions? [inaudible] else that... \\nJennifer:\\tYeah, I mean, I feel like probably most of the ones that I can think of off the top of my head, we\\'ve already hit on through the course of our discussion. But I think the last remaining communities in Alaska to get running water and sewer are probably for a good reason. Either there\\'s no easy water source or there\\'s no good way to put in a distribution system, so there\\'s certainly some engineering challenges that remain. \\n\\tHowever, in the other communities, I would just say it\\'s hard to run a functional water system in permafrost, and in an Arctic environment, especially when you only have a user fee base of maybe 200 households. There\\'s just not always an economy to support such an endeavor. And then, we did talk a little bit about the intermittent or seasonal availability of people, and how intermittent access changes the game a bit too. Those are the three that come to mind. \\nLeif:\\tOkay. 
Well, talking about climate change, how does climate change impact your water infrastructure system? Or maybe a better question is, does it? \\nJennifer:\\tYeah, I wish I had some pictures that were easy to access. So, yes, it absolutely does. Permafrost and changing climates and storms all impact infrastructure. We had some really graphic photos that our remote maintenance worker team took a couple years ago that I\\'m thinking of, where you have like the pilings that the Arctic pipe is built on, right? \\n\\tSo we have this above-ground HDPE Arctic pipe, and we have posts and pads that the pipes are built on. And those, we\\'re doing what we call jacking, which I\\'m sure is not a specific term, but that\\'s the only thing the guys ever call it. And they\\'re just like, the permafrost is... They\\'re just going up and moving all around. So it\\'s hard to run a distribution system, HDPE... The Arctic pipe when you have shifting ground, so to speak. And many of those would become unlevel, and move inches in a year and have to be repaired and fixed. So that\\'s one thing that comes to mind. Intake also has been an issue. The changing pattern and the fall sea storms, especially, have certainly impacted some of our intake systems. \\nLeif:\\tSorry to jump around a little bit, but you\\'d mentioned talking to operators and talking to city officials and talking occasionally to lay public. Have you had an opportunity to talk to people about climate change? I mean, you mentioned putting out fires, and sometimes climate change feels like a long-term goal, but what would you want people to know? How would you talk to people about climate change and how it affects water infrastructure? \\nJennifer:\\tSure. I think, well, we have a few communities that I would say are on the forefront of planning for climate change, but mostly that\\'s because they\\'re in dire need of... Like they\\'re being severely threatened by erosion. 
So the most notable community in the region is Newtok, and they\\'ve been moving their community site to Mertarvik, another... Relocating entirely. \\n\\tSo I think it\\'s easier to have those conversations in communities where you\\'re seeing the impacts, right? So you have a severely threatened shoreline that is edging up to the school, and the school shed is about to fall in the water. It\\'s easy to have that conversation when you\\'re under threat of a fire in the next few years versus long-term. \\nLeif:\\tYeah, there\\'s probably fewer people questioning whether it\\'s happening or not. \\nJennifer:\\tExactly, yeah. \\nLeif:\\tOkay. So, with all the things we\\'ve talked about, what infrastructure challenge do you think is the most important to fix, and when?\\nJennifer:\\tI don\\'t know how to answer that, I mean, if I think more specific to infrastructure, I would probably hone in on the distribution systems a little bit. That\\'s where many of our challenges fall into, so maybe distribution systems, and... Yes, that is important to fix immediately. Yeah, I don\\'t know how to answer that one. \\nLeif:\\tSure, yeah. I thought maybe immediately would be [crosstalk]. \\nJennifer:\\tYesterday? I don\\'t know. \\nLeif:\\tRight. Right. \\nJennifer:\\tBefore winter, for sure. \\nLeif:\\tSo, if you were... I mean, you mentioned distribution systems, and specifically what does that mean? What change would you want to see? \\nJennifer:\\tYeah, I mean, so I think I mentioned like in terms of infrastructure, number one, distribution systems, and number two, managerial capacity, and running a water system, and having the finances to do so, and how it\\'s really expensive and costly and time-consuming. \\n\\tSo I think magically having funding available would fix some things. Having less turnover among operators would fix a lot of things. 
When I look at the functional water systems in the region, most of those are ones that have a long-term water plant operator and a backup, right? So, people who know the job inside and out, and have been doing that for a number of years. Yeah. \\nLeif:\\tTo the question of distribution, though, I mean, is piped... What is the distribution solution, I guess, if you could wave your wand and change the infrastructure that exists? I know in Bethel, piped water has a lot of advantages. I mean, is that the right answer for the region, or is there... Also I know there\\'s some negatives about small tank and haul. \\nJennifer:\\tMm-hmm (affirmative). \\nLeif:\\tSo, for distribution, what\\'s the answer. \\nJennifer:\\tYeah, so I think, as someone with a health background, who\\'s really interested in the health impacts of water, I tend to think that piped water service is the solution, but certainly not because it\\'s easy to wave a magic wand and fix infrastructure-wise. So I believe there\\'s a health equity and an argument to be made for people having ample water to use. And we see, of course, [inaudible] decrease in many infectious diseases. But no, it\\'s tough. We have communities of 300 to 1,200 people with no economy. I mean, it\\'s tough to foot the bill for things like that. So I have no magic solution. Sorry. I bet you came here expecting that. \\nLeif:\\tWell, this was a waste. \\nJennifer:\\tOh, yeah. So I don\\'t know. I don\\'t know what that magic wand would do, short of more funding subsidy. Yeah, I\\'m not sure. \\nLeif:\\tOkay. So, just like a job interview, if you feel like you\\'ve answered these questions, you don\\'t have to repeat the same thing, but the next question is about system failures. So, can you describe system failures or issues such as pipe breaks or periods of time without water? Has that happened? What does it look like? \\nJennifer:\\tYeah, I mean, it looks different in places, different places. 
I\\'m thinking most recently of my neighborhood last week, so one house froze up... So I\\'m lucky enough to live in one of the two neighborhoods in Bethel, the big city of the region, that has piped water. So, two houses down, there was some kind of freeze with the water system, and then all the alarms went off down the road, and that... It was resolved. So I don\\'t know the specifics of that, but certainly, yeah, I think about that. \\n\\tI think about line breaks. I think about those posts and pads that the Arctic pipe is built on, and how it\\'s literally jutting out of the ground. I think about physical... The foundation of water plant buildings, how some of them have become like this over time. Yeah. \\nLeif:\\tOkay. So, thinking about future infrastructure, what do you or your organization consider when planning for, or building new infrastructure? \\nJennifer:\\tSure. Yeah, so my role in all of this has been as a service provider and as community liaison, so my team, we don\\'t design water systems. We\\'re the ones left holding the operator\\'s hands, operating after a new system is put in. So we really try to have a pragmatic approach to making simple, easy-to-run systems without a ton of automation. \\nLeif:\\tMm-hmm (affirmative). Okay, and-\\nJennifer:\\tOne of the remote maintenance workers that I work closely with, Leif, I know you know him, but he will... I\\'m sure he would love to be interviewed, and he would certainly bend your ear about a lot of the highly automated and complex systems that are being put in. He would advocate for just the opposite. \\nLeif:\\tLike one that\\'s controlled by a fax machine, or... \\nJennifer:\\tOh, yeah, faxes, we got. \\nLeif:\\t[inaudible]. \\nJennifer:\\tSmartphone apps, not so much. \\nLeif:\\tHow has your community\\'s water system or water operations changed over time? \\nJennifer:\\tThe biggest thing that comes to mind is when we have operator turnover. 
So we talked about small communities with maybe 200 homes, and limited people working as water plant operators. And I\\'ve seen the operator turnover. We\\'ll have a longtime operator of like 30 years running a system in the same community, and when that changes, we have, I think a major impact on the water system. It\\'s amazing in a small town, what a change in personnel of literally one person can do. So that\\'s what comes to mind. \\nLeif:\\tHas there been any physical infrastructure changes over time? I mean, are things trending in a direction? I know that some people... New things are being built, other things are being broken down or falling apart? Is there new technology or direction? \\nJennifer:\\tYeah. I mean, there\\'s generally a trend for more complicated systems, so that is certainly happening. I\\'ve seen that over the last 15 years. When I started doing sanitary surveys, the water systems looked a little bit different, right? So I was inspecting a lot of water plants from like the 1970s and \\'80s when we put a lot of them in, and now those systems are failing and they\\'re getting upgrades, and then we have new systems that have been put in in the last 20 years. So in terms of complexity, I would say that\\'s changed. \\nLeif:\\tOkay, great. So, moving on to some questions about people and workforce, and some of this, again, you\\'ve touched on before, but turnover, obviously, what workforce challenges do you face? How do you respond to that? \\nJennifer:\\tYeah, I think a general shortage of people to fill the positions, and turnover, and I don\\'t always know how to address that, because it\\'s been a constant struggle. I think as service providers, we\\'ve really tried to invest in... Well, in training, in operational supports, in answering those calls for help when they come up, and I\\'m speaking really more of the remote maintenance worker team than my own program. Of being good partners, of being advocates for the community needs. 
And that\\'s been our role, to really help improve operator satisfaction and longevity, but it\\'s still a challenge, for sure. \\nLeif:\\tYeah. You mentioned training. Do you mean the water plant operator training or other training? \\nJennifer:\\tYeah, operator training, some. So the classroom water plant operator trainings that we host four to six times a year, I mentioned that, really that is aimed at a couple of things. Helping operators pass a water treatment exam, and providing CEUs, continuing education units, so that they\\'re able to renew their certification. But in terms of, is that useful to how they run their water plants on a day-to-day basis, maybe, maybe not. So we do a lot of informal and formal training on site when we are out there in the community, in someone\\'s water plant. And I would say that\\'s more meaningful in terms of helping an operator run his or her water system. \\nLauryn:\\tSo, when you were saying, right, it seems like there\\'s this by-the-books certification, right? Like the classroom training, and then this onsite, where you\\'re saying it\\'s more fruitful and helpful. How could you see that overlap get a little closer together? I don\\'t know if that made sense how I worded it, but to where the classroom work is actually going to help maybe more directly these operators run their systems. \\nJennifer:\\tYeah, that\\'s super tough because a few years ago... Gosh, they blur together. I\\'m trying to remember what year this was. I think... Gosh, I don\\'t remember. 2011. Our state water plant operator training and certification team, the state folks that make the water plant operator certification program and tests, they went with the national ABC standards. \\n\\tSo now it\\'s like a nationally certified exam, and the things that you study in the book to pass that exam are sometimes very different than what you see in a rural Alaska water system. 
So we\\'re trying to teach rural Alaska water plant operators these things in this national exam book, and that is challenging. So I\\'m not sure, Lauryn, but I would like to see that. That would be a vision. Maybe magical wand, I would wave that, and then suddenly this exam would also help people run their systems on a day-to-day basis. \\nLeif:\\tI mean, that could... I mean, before it was national standards, it was a state... The state dictated the test, right? \\nJennifer:\\tYes. \\nLeif:\\tSo, I mean, it\\'s not crazy. That was a decision that was made, it is a... I mean, right? \\nJennifer:\\tSomebody made that decision, for sure. \\nLeif:\\tSo it doesn\\'t necessarily have to be that way, or are there federal standards for how the test would have to be, even if we don\\'t follow that specific testing protocol? \\nJennifer:\\tI do not know the answer to that question, but it\\'s worth exploring. \\nLeif:\\tOkay. \\nJennifer:\\tI don\\'t know what prompted the change. I just remember, as a service provider, being a very difficult time. \\nLeif:\\tDid it change test scores? Did it change retention? Did it-\\nJennifer:\\tYes, absolutely. Test scores did exactly what you expect them to do, yeah. \\nLeif:\\tOkay. And do you feel like it changed the workforce? I mean, did it change turnover, or did it change job satisfaction? Did it change the quality of the work people were doing? \\nJennifer:\\tI don\\'t know. I don\\'t know that I can speak to that. I did just get a... Yeah, I\\'m not sure. I do sometimes get a sense that now, the people who are sitting in the classes are maybe the people who can test a little bit better. Like now that we have a better understanding of the testing needs and how you have to have a basic understanding of some math principles, maybe we\\'re sending people to the classroom who are going to be more well-equipped to take that exam, and that\\'s not necessarily the person that will do the best job running the water system. 
So it\\'s maybe changed the face a little bit of the operators we see in training and those that are getting certified. \\nLeif:\\tOkay. Is there anything else that you would think of in terms of being able to respond to these challenges, anything else that would help that you\\'d want? \\nJennifer:\\tLots of things, yeah. I mentioned the RMW program because I think it\\'s done more for operations in the region than any other program, so I think about that. We had a grant funded utility management program for five years, from 2015 to 2020, and that was, in my opinion, phenomenally successful at helping... You know I mentioned the upstream, you have to go upstream, right? So, helping create successful, financially viable business operations so that they\\'re better equipped to run the operations piece, I feel like that is really important, focusing on that upstream business piece. \\n\\tI mean, funding, I feel like stakeholders are always saying like, \"We need more funding,\" and I will continue saying that if we want Bush Alaska to have running water, that funding is going to have to come from somewhere other than the communities. And lots of other stuff. \\nLeif:\\tSo, financial challenges, where does funding for capital projects... Where does that come from? \\nJennifer:\\tThe man.\\nLeif:\\tOkay [crosstalk]-\\nJennifer:\\tJust kidding. So it\\'s all federal money that\\'s passed through to ANTHC or Village Safe Water. And it\\'s highly competitive. \\nLeif:\\tAnd do you have a role in that, helping villages get funding or is that separate from... \\nJennifer:\\tYeah, so each of the tribal health organizations, YKHC included, has a small piece in that. So the engineers at ANTHC and Village Safe Water on the state side put in a bunch of data for every potential project, right? And... Hi. Yeah. Excuse me. Lost my train of thought. All right, so we have engineers that put in for potential projects, right? 
So they can say, \"Community X over here has a failing water treatment plant, and we need to revamp the filtration system.\" So, they will put together a project with cost estimates, justification, all of that. \\n\\tAnd then that goes into the statewide database, called SDS, Sanitation Deficiency System. It\\'s very complicated. YKHC and the other tribal health organizations have one piece in that where they get to apply what we call tribal force. So they get to say like, \"Which projects do we feel like we want to support?\" And they support that with numbers and force. And then all of that goes to a funding committee, and eventually each project is scored out with a numerical score, and the top, whatever, percentage of them get some funding. And that\\'s all based on the funding that\\'s available that year. \\n\\tAnd there\\'s an unmet need of something like, I\\'m sure I\\'m going to say the wrong number, but like a few years ago it was 700 million. It\\'s probably just short of a billion now, of what we call the unmet need, meaning projects that we need and we don\\'t have the funding to get them. And as our older systems continue to fail, that number keeps increasing, the unmet need continues to increase. \\nLeif:\\tYou mentioned the costs to the customers, and how it oftentimes wasn\\'t enough to cover the cost of operations. So, specifically, is it that customers have trouble paying their bills, or is it there\\'s not enough of them, or what? \\nJennifer:\\tYeah, that\\'s a complicated question. The Alaska Rural Utility Collaborative, or ARUC, because there\\'s a million acronyms for everything. So Alaska Rural Utility Collaborative, they have the best numbers on user fees and collection rates and what people are willing to pay, and so they... I don\\'t know the exact numbers, but they\\'ll go into a community, determine that people are willing to pay like $120 to $160 a month, right? And we talked about really needing the managerial support. 
So they will be a partner to the community and help them run that water system, and help with collections and expenses and operator training. So they\\'ll go in and assist a community. \\n\\tAnd what they\\'ve found is that they have a pretty high collection rate, like 90... Upwards of 90%, and they certainly seem to run specials where people can apply their PFD and pay for their water bill throughout the year. And they have other programs. So, this outside third party cooperative is going in and they\\'re increasing user rates, and they have some of the highest collection rates that any of the communities have. And then you go to another community without that and they\\'ll say, \"Oh, people just won\\'t pay their bills,\" and they\\'ll have less than a 50% collection rate. So we see very different things in terms of how much people are willing to pay and collection rates. \\nLeif:\\tOkay, and so you\\'re saying that the primary difference, then, is ARUC specifically? \\nJennifer:\\tWe have the best numbers from ARUC, so most of what we know about rates and collection... Like collection rates and monthly user fees, are from ARUC, because they track it so well, and they make that available. And another community may or may not be tracking it well, and might not make it available. So what I know about user rates mostly comes from ARUC, which is this third party program that is assisting a community with running their system. \\nLeif:\\tWhat happens when people don\\'t pay their bills? \\nJennifer:\\tYeah. Eventually the operators don\\'t get paid. The water plant has a hard time purchasing things that they need, like chemicals. They don\\'t pay for testing. They really just literally don\\'t pay for anything that they can\\'t afford. Parts, spare parts. \\nLeif:\\tWhat happens to the... Does the water get shut off to people\\'s homes, then? \\nJennifer:\\tI mean, that would be a worst-case scenario, that would be a giant fail, but that could happen, right? 
So we could have, like I\\'m thinking of one example where the filter went down, and so you have... And they weren\\'t at the time able to buy the parts they needed to repair that. So you can either pump untreated water or you can turn it off, so to speak. \\nLeif:\\tYeah. I guess I was more thinking for if I\\'m an individual customer and I don\\'t pay my bill. \\nJennifer:\\tYeah.\\nLeif:\\tDo I no longer get water at my house and everyone else does? \\nJennifer:\\tWide-ranging practices. So, again, the best data we have on that comes from a ARUC. Speaking from ARUC, they pay their operators to shut off the house after like 90 days of nonpayment, right? So they have a pretty firm plan. They stick to that plan. What we find in other communities is everything, right? So a wide variation of practices. But I have heard from some operators that they don\\'t want to turn off so-and-so\\'s water, right? They\\'ve got a bunch of kids, or it\\'s their Apa (grandfather) and they may not want to do that. The managerial entity might not have firm policies to cause that to happen, as well. So outside of a ARUC, it varies community by community. \\nLeif:\\tOkay. Lauryn, before we move on, is there anything else you wanted to drill down on? \\nLauryn:\\tYeah. I wanted to ask a couple questions about the community specifically. You\\'ve mentioned a couple times throughout that your goal is to advocate for communities, and you\\'ve worked directly with communities. What do you think water means to the public? And can you expand a little bit more on public perceptions towards water services? I know you started at the beginning, you talked a little bit about how people sometimes will use traditional sources, but can you expand on that a little bit more? \\nJennifer:\\tYeah, sure. 
I did a project when I was at Alaska Native Tribal Health Consortium that really explored the reasons that people choose to use untreated water, so I think that\\'s informed a lot of what I\\'m saying here today. And the number one reason that people shared as far as why they prefer untreated water that\\'s not from the water plant is really just like a preference for natural things, or, \"This is the way that we\\'ve always done it. This is the way our elders did it.\" \\n\\tWe also heard some people say that they don\\'t like the taste or the chemicals, that they would share that it makes their coffee taste bad or funny. And then also, there\\'s a convenience factor, right? So, in homes without running water, where they don\\'t have treated water directly coming out of their faucet, it\\'s sometimes more work to get treated water to your house than to use rainwater, right? So there\\'s a convenience factor and there\\'s a cost factor, not just in terms of actual money that you pay for the water, but in terms of manpower and resources and time to go collect the water. \\n\\tSo those are some of the key themes that we found in four communities in this region that we were working with, and that was really in order of preference, so a preference for natural things, and, \"This is what we\\'ve always done,\" and, \"This is a cultural practice,\" that really ranked pretty highly. \\nLauryn:\\tAnd when you\\'re talking about operator training, as well, do you think that same tendency towards like, \"Oh, this is how we\\'ve done it,\" or, \"This is how we\\'ve always ran the system,\" does that come to play when you\\'re talking about operator training as well? And do you have any suggestions to... Or any times where you\\'ve seen it be very impactful, the way that people are training operators, in terms of using their knowledge that they already have based on knowledge from elders or from their previous work? 
That was a long-winded question, so let me know if you want me to rephrase. \\nJennifer:\\tYeah. No, I definitely have seen operators who may really be hesitant to put any chemicals in the water, so that is a real thing. And I have seen operators have a breakthrough moment when, especially with breakpoint chlorination, where if you reach the right level of treatment, the water coming out doesn\\'t taste or smell like chemicals, right? \\nLauryn:\\tYeah. \\nJennifer:\\tSo if you reach that right mix of treatment, you\\'re having good, finished water at the end that doesn\\'t smell like chlorine, it doesn\\'t taste like bleach, and it makes your coffee taste good. So I think if you can convince an operator that like, \"We can make good water that people will like, and still use the proper treatment processes,\" I mean, that\\'s the dream, right? \\n\\tBut especially with fluorides, so as somebody with a public health background, part of my work is not just to have water that\\'s free from microorganisms, but also water that\\'s healthful, and fluoride is a part of that. And we don\\'t have very much community water fluoridation in the region, and we have many operators who are not wanting to add fluoride. It\\'s not required, like chlorine, in a lot of places. So there\\'s just generally like, don\\'t want to add more chemicals to water. I do feel like that\\'s an overarching attitude. \\nLauryn:\\tOkay, and that\\'s because the... Do you think that\\'s a cultural relationship with water, that, just tendency towards not wanting to add more to water, and wanting to use natural resources? \\nJennifer:\\tI think so, maybe. \\nLauryn:\\tOkay. \\nJennifer:\\tYeah. I mean, part of that is speculation, but part of that is, that was one of the themes that came up in the research that I was doing as well, so it supports that. \\nLauryn:\\tOkay. I think I asked all my questions, so-\\nLeif:\\tGreat. \\nLauryn:\\t... [crosstalk] keep going forward. 
\\nLeif:\\tYeah, well, so...\\nLauryn:\\tThank you. \\nLeif:\\tYeah, I think we\\'re about done. Jennifer, is there anything else that you would like to add? Is there anything we should have asked you? \\nJennifer:\\tNot that comes to mind right now. Thank you. It\\'s been a very thorough interview. Thanks for hearing me out. Appreciate the work that you all are doing. I remember speaking with Kasey when she was planning this project, so it\\'s so nice to see it come to fruition. \\nLeif:\\tSeems like it\\'s been a long time, doesn\\'t it?\\nLauryn:\\tThe pandemic, the 20 months have added on a little time. \\nLeif:\\tRight. \\nSpeaker 4:\\tLauryn, do you want to stop recording for a little bit? \\nLauryn:\\tYeah. \\nSpeaker 4:\\tAnd then I just want to ask Jennifer, I know... I don\\'t-\\n', '1_2_InterdependenciesNNA': 'Lauryn:\\t... works. \\nLeif:\\tSeems like probably better to not [inaudible]. \\nLauryn:\\tYeah. \\nLeif:\\t[inaudible]. Right on, okay. Well, so I\\'m going to, some of this is probably going to sound a little bit boilerplate, because it is, but I\\'ll start by introducing myself. My name\\'s Leif Albertson. I\\'m doing some work with the University of Texas, and we\\'re looking at water infrastructure, like I talked about, trying to understand challenges surrounding drinking water services in your region, or I guess your former region. The research plan has been reviewed by the Human Subjects Committee, including at YKHC, and we\\'re recording this, told you that, so you\\'re aware or, assuming, giving us permission- \\nClyde:\\tYep. \\nLeif:\\t... to record. Yeah, all right. And that\\'ll be transcribed with ... This is anonymous, we don\\'t put your name on it, although we\\'re not really making any promises that if somebody sees you in the car or something, that you couldn\\'t get found out. But we do try to protect all of that stuff. So you won\\'t be identified, and like any research project, you can quit any time you want to. 
If you don\\'t want to do this, there\\'s no penalty, nothing like that. You just say, \"I\\'m done, I don\\'t want to be here,\" and that\\'s fine. \\nClyde:\\tOh, I\\'m happy to help.\\nLeif:\\tYeah. And so, Lauryn, you want to introduce yourself at all too?\\nLauryn:\\tYeah, so I\\'m Lauryn Spearing. I\\'m a researcher at UT Austin, and so we\\'re working with Kasey Faust as well, who\\'s an assistant professor in civil engineering there. And she grew up in Alaska, so we have some ties, but I grew up in Texas, so I\\'m learning a lot through this process and I\\'m excited to hear your insights. And so I\\'ll just kind of maybe pop in with a couple questions and follow-up things, but Leif, you\\'ll be in charge. \\nLeif:\\tOkeydokey. Cool, all right. So just some background, kind of went over the fact here, so you moved to Chugiak November 1, but how long were you in Bethel?\\nClyde:\\tI was in Bethel a minimum of 11 years straight, total. And at the end of that time, I was the foreman for the Hauled Water and Sewer Services for five years, before I left there. So I spent some time in the villages early on, but mostly it\\'s all been in Bethel. \\nLeif:\\tOh, okay. Well, so, see, I didn\\'t know that. Yeah, walk us through, where were you, or what brought you to the region? \\nClyde:\\tWell, I mean, we had a bad economy crash in Seattle, this is where I\\'ve spent my whole life. My first plane ride was to Alaska when that happened. So everybody was losing their jobs, and I saw an ad in the paper for a $10,000 completion bonus of a contract for a land-based processor, and I just kind of fell in love with the state and ended up in Bethel. \\nLeif:\\tOkay. You were in the village too, though? \\nClyde:\\tI did, I was a station manager for Grant Aviation in Emmonak, I spent about three years out there. And there\\'s a little bit different, they have a piped system which I don\\'t know much about, they use vacuum for their waste. 
But yeah, it\\'s a little bit different than what Bethel has going on. \\nLeif:\\tYeah, I think Emmo has one of those weird vacuum toilets, right? \\nClyde:\\tThey do. \\nLeif:\\tAnd- \\nClyde:\\tYeah, so like there\\'ll be a leak, and they\\'ll go around to everybody\\'s house trying to figure out where the leak\\'s at. It\\'s obnoxious, but it works, for the most part. \\nLeif:\\tYeah, okay. Well I didn\\'t know you were out in Emmo, so the questions are kind of based on your experience, and so if you have different experience or ideas of how things were different in Emmo than in Bethel, that\\'s fine, just let us know. We\\'re looking for kind of a region-wide perspective. I contacted you because I knew about the Hauled Water in Bethel, but sounds like you\\'ve got some other relevant experience as a user, if nothing else, in other places. \\nClyde:\\tSure. Right, and my wife\\'s from Toksook Bay, which has amazing water, and they\\'ve had infrastructure for water and sewer for years, now. And I might be able to give you some insight there as well. \\nLeif:\\tOkay, sounds good. Let\\'s see, so you touched on this a little bit. Can you tell me about your educational and professional background? So like schooling, or how you got educated to do the things that you did with water infrastructure? \\nClyde:\\tWell, as far as water infrastructure, there wasn\\'t much schooling other than on the job. My highest level of education is high school diploma, and I\\'ve been in construction most of my life. Actually, I\\'m a plumber by trade. But Alaska seems to be a lot different, being mainly boilers. \\n\\tWhen I took the job in Bethel as a foreman, before that I was a water truck driver. I was also working in the mechanics department for a little bit. And you learn a lot of rules and regulations just by that. So as a foreman, I was mostly dealing with the logistical side of things, not so much like lagoons or anything else. \\nLeif:\\tYeah, yeah. 
Well, this is actually, that\\'s ... This is great. You\\'re the guy we want to talk to, I think. \\nClyde:\\tOkay. \\nLeif:\\tAll right. So some of these questions, if you feel like they\\'re repeats, don\\'t worry about it. I\\'m not trying to make you say the same thing over and over again, but if you feel like there\\'s anything you have to add ... So is that all the ways that you\\'ve worked with water infrastructure? You were a driver, like so Lauryn, like, has drove a big delivery truck, right? And- \\nClyde:\\tRight. 3500 gallon water trucks. And with those, you just go house to house. There\\'s hookups plumbed outside for each house, and you hook your nozzle up to it and you just pump away. And currently how the system works is you have tanks inside the house, and they could be multiple, plastic or single steel or whatever they want. And when it gets to the point where it\\'s full, there\\'s an overflow pipe where the water starts to pour out of that, which signals the driver to stop pumping. \\nLeif:\\tUnless the overflow\\'s frozen, right? \\nClyde:\\tWell, yes, yes, yes. Yeah. That happens a lot. So that\\'s actually a huge cost issue, and it\\'s a real big pain. And it\\'s like, not only does it cost a lot of money to the homeowner, but it also slows down services. So right now the way it\\'s set up in the city of Bethel ... And it\\'s hard to find talent. Maybe not even, talent\\'s not the word, just people with CDLs. We\\'re at the point now where you could have your CDL one day and we\\'re going to put you in a $250,000 truck and get you out there working, because we need it. It\\'s essential.\\n\\tBut like I said, back to it, it\\'s very expensive. Replacing pipes, repairing flood damage. Most of the time the city of Bethel\\'s not going to help you with the cost of that. The verbiage in the contract sets them up to be pretty safe, for the most part, being that if your pipes are frozen, we\\'re not liable. 
\\n\\tSecondly to that is, a lot of the drivers spend some time trying to clear these out. They actually do have a heart, where they could just walk away. They spend some time, which slows them down in getting to the next house. The average driver works between 8 and 10 hours a day on good times, and 14 to 15 for wintertime, just because it\\'s slower to drive, the driveways are icy, the pipes are icy. And unfortunately, being an essential service and government, and due to the size of our town, Bethel, we\\'re not subject to federal regulations for the FMCSA, I believe it is. Basically the Federal Motor Carrier\\'s Association rules and regulations that dictate the amount of hours that a trucker can drive. Where normally it would be seven days, 70 hours, or eight days, 80 hours, or mileage. We\\'re not subject to that in Bethel, so we just basically run our drivers to the point where they don\\'t want to work anymore, and then we try to find more. \\nLeif:\\tAnd just to build on that piece a little bit about drivers having a heart or trying to clear out the pipe, another thing that I\\'ve heard too is, or I\\'ve experienced, actually, is that experienced drivers know about how much water should be going into a house. Sometimes the new ones don\\'t. And so there\\'s a difference between having your house flooded with an extra 50 gallons of water and having your house flooded with an extra 500 gallons of water. I\\'ve been on both of those. So that [crosstalk]- \\nClyde:\\tThat\\'s true. In fact, we just had a guy retire, Nick Phillips, he\\'s been with the city for 35 years. And he knows, he\\'ll learn. He knows, once a family stay in their house for a very long time, and he knows every time he goes there, say, he puts about 300 gallons in. So he knows that if it takes about 300 gallons, maybe 400, he\\'s timing it. The pumps have variable speed, but he can get pretty close. 
And so if he starts to go over that, he\\'s thinking, \"Wait a minute, we might have an issue.\" So that\\'s what ... He\\'s not going to flood you. He might shortchange you, but he\\'s not going to flood you. \\n\\tAlso a guy that, with that much experience could just listen to the tone that the overflow pipe is putting out, and can get a sense of how full the tanks are. But these new guys, they just don\\'t have that skill. And they know it\\'s an 800 gallon tank, and they\\'re not going to stop and think about anything until they get to 1000. And it could\\'ve been overflowing from 600. \\nLeif:\\tCan you walk us through kind of your day as a foreman? \\nClyde:\\tSure. So a foreman, you try not to work when you\\'re not working, but there\\'s parts of it that you are keeping in mind, like whether, as you drive to work, you\\'re taking account of the road conditions. You\\'re thinking about what you want to do for a safety meeting, the subjects you want to cross. You\\'re thinking about things you need to address, whether it\\'s safety-related or not, or if there\\'s tension in the group and you want to just build your team up a little bit that day. \\n\\tSo you get to the office. And your route\\'s already together, but you\\'re starting to kind of see what you want to do. Like only once in the five years have I ever canceled a route in the morning, and it was just way too icy and dangerous. But basically you conduct your morning, you helped your guys do the walk-arounds. You deal with any issues, and there\\'s always issues, there\\'s always questions, and you got to be there for that. You assign trucks, help the new guys. Basically after you get your guys out the door in the morning, you start going through customer requests and extra calls, and you start just setting a pace for your day, getting kind of an idea of what it looks like. Do I have enough drivers to cover my routes, do I need to communicate that with the office? 
I could choose to say no extra calls, meaning that if you used up all your water before your next scheduled service, and you want more, I might just tell you you\\'re not getting it. Which means your toilet\\'s going to fill up, unless you haul your own water. You\\'re not going to take a shower, you\\'re not going to wash the dishes. And like I said, that all is based on what I got going on that day. \\nLeif:\\tLike capacity, you mean? \\nClyde:\\tRight. So 10 routes a day, most of this last year, I should say 12 months, because we\\'re kind of in-between years, I\\'ve had about nine drivers a day, so, that\\'s splitting a route. So if I have nine drivers with 10 routes a day, and I got one guy calls in sick, now I really have an issue. So now, like I said, on a normal, good day, my guys are working eight to 10 hours. Well now, we just bumped them up, now we\\'re doing 14 hours. And I\\'m not going to ask these guys to do extra calls. I want to get my commitment for this first. It\\'s unfortunate, but the reasoning is the customer really needs to ... They live here, they know how it\\'s going, they need to manage the water a little better. It\\'s not a very good excuse, it\\'s kind of ... I don\\'t know the word [inaudible]. It\\'s a poor excuse. I wish we were allowed to tell people that. \\nLauryn:\\tAnd to follow up with that a little bit, how often do you get those calls with people asking you to come early? \\nClyde:\\tEvery day. Multiple. \\nLauryn:\\tOkay. \\nClyde:\\tYeah. \\nLauryn:\\tSo pretty often. \\nClyde:\\tYeah. Anywhere from five to 25 requests. Especially when it\\'s summertime and you got tugboats that want water. \\nLauryn:\\tYeah. And so when you\\'re in that situation, would you say that you\\'re mainly limited by ... It\\'s typically having the available drivers, not really the equipment or water source issues or anything like that? \\nClyde:\\tIt used to be that trucks were a factor. 
It used to be that, the city allowed their fleet to get so old that when they would break they would have a hard time finding replacement parts. And we would literally have drivers and no trucks. City made a very good decision on upgrading their fleet, which has eliminated that part of our issue. Never have I had an issue with not enough water. Even during large fires, we\\'ve always had enough water to service our community. So yeah, at this point mainly it is just the drivers, being able to get ahold of them. And there\\'s a lot of reasons why we don\\'t get drivers. So yeah, mostly just drivers at this point. You have to just draw a line, you literally only have so much to work with. You have a noise ordinance that starts at 10:00 PM, and we still have guys working sometimes to 11:00 and 12:00 midnight just to get it done, just out of sense of duty. You don\\'t even ask these guys to work that much, there\\'s just a sense of duty to their community that they want to get that finished.\\nLauryn:\\tWow. \\nLeif:\\tSo, when people call, I mean, they call you, like you talk to people? \\nClyde:\\tYeah. \\nLeif:\\tYeah. And [crosstalk]- \\nClyde:\\tA lot of times. I try not to be the initial point of contact. I hired an assistant, which has been the first time since many foremen before me. Also they\\'re encouraged to go through the office. But a lot of times they really don\\'t have the up-to-date information, and they can\\'t really answer questions of people that are pretty upset. I was pretty good at that, so I ended up talking to a few of them. \\nLeif:\\tCan you tell us a little bit about talking to people, talking to people about water systems, explaining how things work? I mean, do you ever do that ... I guess it\\'s probably one at a time, not so much public, but is that, that\\'s part of your job too I guess, huh? \\nClyde:\\tIt is. 
And I\\'ve spoken to people, like to banks from Florida, people trying to get loans for a house in Bethel, and they have no idea how it works out here. Newcomers to Bethel, physically here, I\\'ll go out and I\\'ll walk through everything with them. We\\'ll do some problem-solving with people who\\'ve been there their whole lives, but their sewer tank was buried, and they don\\'t know how big it is, or if it\\'s leaking or if it\\'s not. Mostly my interactions were with frustrated people who pay a lot of money for the service and are now not getting it. Because unfortunately it wasn\\'t just, \"I can\\'t give you extra service,\" there\\'s a lot of times where I can\\'t give you the service that you\\'re owed. We have a lot of upset people just for that. \\nLeif:\\tDid you have any follow-ups on that, Lauryn? All right. So how is drinking water provided? So you\\'ve got the trucks, the other half of that ... Like walk us from the ground, how does water get to people\\'s tap? \\nClyde:\\tI\\'ll do my best on that. I was never part of that system. That system is separate from what I did. That\\'s a whole nother thing. Now, you have to be a certified water plant operator, and those names are public knowledge. For example, you could go look up Bill Arnold\\'s name, William Arnold, in the public Alaska registry as a water plant operator. What they do is, I can\\'t say exactly where, but they pick up groundwater, I think there\\'s some kind of well or something out there by city ... No- \\nLeif:\\tThere\\'s- \\nClyde:\\tYeah, both of them, actually, city sub and the other one. They pick up the water that\\'s non-potable, they add chlorine and some other chemicals, they do multiple tests on it every hour, and they mix the contents and they adjust the pH, and when they have an acceptable solution it goes into a holding tank where that is what\\'s provided through the pipes and to the water trucks. But basically it starts as groundwater. 
You\\'re not going to get water, we don\\'t ship water in or anything like that. It\\'s just out of the ground. \\nLeif:\\tHow much of the water is through your shop and how much is through the pipes? Like what\\'s the split look like? Or how many houses do you do, I guess? \\nClyde:\\tWell, I figured out my last count that I do about 1400 stops a week, with five trucks. Five water, and then I do it again with five sewer. But those, some of those are repeat, some of those are not. So, I mean, you\\'re familiar with the area, others listening might not be. But basically, you got all of Kasayuli, all of Tundra Ridge, you have all of the avenues. Then you got Ptarmigan. So what that really means leftover is city sub and housing. So I\\'d say two thirds of the water is delivered in Bethel. So that\\'s with, what, between 6 and 10,000 people? \\nLeif:\\tI have just a question that\\'s not on the list, but I\\'m just curious based on your experience. I was on piped water, or I was on hauled water and then I moved to piped water. If you had to pick, or if you knew somebody, if Lauryn here was moving to Bethel, what would you tell her about piped versus hauled water? \\nClyde:\\tI would say go hauled water. Hauled, hauled, hauled, hauled. Because the quality is better, you\\'re not relying on the age or condition of the pipes that are down the line. Basically all that stuff that you cannot control in between the houses, the main water supply line. Also, you can have a freeze down the line that\\'s somebody else\\'s fault that impacts your house. Whereas hauled water, you\\'re completely on your own system. It\\'s up to you to keep your pipes warm, and once you\\'ve got a good system in place, you really never got to worry about it. And if you run out of water, you could always go and fill your own, you can get some big garbage cans and fill it up for quarters, and bring it to your house and pump it into your tank. \\n\\tWhereas if you\\'re on pipe system ... 
The only reason I say the quality of the pipe is a lot of people have yellow water. There\\'s a tinge to it. It\\'s just from the old days, I don\\'t know if it\\'s all poly pipe or if there\\'s some metal pipe in there somewhere. \\nLeif:\\tI think Akiachak has got iron pipes, yeah. \\nClyde:\\tYeah, see, we don\\'t need that much iron in our diet. \\nLeif:\\tYou talked a little bit about this, do you have any other sort of anecdotes about a challenge in local water infrastructure? You got any good stories for us about something breaking or having to solve a problem? Maybe? No? \\nClyde:\\tWe continue to grow with new technology and better design of our trucks. But we\\'ve come a long way. Water and sewer used to be done with giant wooden barrels on the back of flatbed pickups. We used to fill garbage cans. They used to have these giant wooden barrels, and they would use a gas powered Honda pump, trash pump, and they would stick one into the barrel and then they\\'d fill up your garbage can on your deck, and away we went. They used to take the buckets out of your living room and walk out, try not to spill any on your living room floor, and dump it into that other barrel. But these winters, when we\\'re driving truck, you got to drive out to Kasayuli. \\n\\tI guess one challenge I always thought was kind of cool was, it was a pain in the butt, but it was unique to working so hard in the cold conditions is that you would have to stop every mile or so and pump some water, because your entire line will freeze. In the old days they used to stick the nozzle up by the exhaust stack, which I thought was horrible. That was before my time. But you would literally have an inch and a quarter line that\\'s 100 feet long frozen, and you would have to come to the shop three, four times a day just to thaw out to go back out to do your route. 
It\\'s because of things like that, and it\\'s not really an anecdote or anything, but it\\'s something that I thought was special, is that it takes some of the toughest men and women with the greatest amount of fortitude to continually get up and do this job every day. It\\'s absolutely brutal, it\\'s freezing. You\\'re literally standing out in the wind pumping water in -60 windchill up on a hill in Kasayuli while you\\'re so frozen you need to break the ice to bend your knee. \\n\\tAnd then you get in a truck for five minutes, you\\'re already parked at the next house, and you do it again. And you do that for 14 hours a day. I always admire the crew that works on water and sewer for that. \\nLeif:\\tYeah. But sounds like it makes it hard to hire sometimes, too, huh? \\nClyde:\\tWell, it\\'s, yeah, people know about it. They\\'re scared of the work. A lot of young guys I get that can\\'t handle it. It\\'s the old-timers that stay, it\\'s the old veterans. My crew ... Or it\\'s not my crew anymore, I guess, but my solid guys that work year-round that we\\'ve never had a problem with, that we always depended on. The only reason Bethel has had the service they\\'ve had is because of men and women that were in their 50s. These are all people getting ready to retire. I\\'ve hired big, strong kids, and they make it a week or two. And we continually go through them like that. \\nLeif:\\tSo, there was a question here about seasonality of challenges, of infrastructure challenges. So it sounds like winter is the tough season. \\nClyde:\\tYeah. Yeah, I think the guys would rather swat mosquitoes than freeze. \\nLeif:\\tAnd you kind of hit on this a little bit about how challenges have changed over time, so it\\'s ... Would you say that things have trended, trending better? Things are getting better? \\nClyde:\\tThey\\'re trending better as far as SOP and mechanical. We\\'re still in the same rut as far as consistently hiring employees to drive these trucks. 
There\\'s too much competition, there\\'s a bad name for the work, there\\'s too many hours. Bethel is a society where the dollar is not the value system. I mean, sure, for some, we all have to make the money. But what\\'s really valuable in Bethel is hunting, and sharing your food, and subsistence, and family. And that\\'s what makes the place so special. So now you\\'re asking these Native men and women to come drive for me and I\\'m telling them, \"Nope, you can\\'t moose hunt, I don\\'t got enough guys. Nope, you can\\'t go fishing.\" That\\'s a big issue. So you got to get guys from out of town, because they go for Knik, they go for the other fueling guys. They\\'re making more with those companies starting out, they get two weeks on, two weeks off. Or they leave their family and they come here to Bethel and work, and they get a bunk house to stay in.\\n\\tHere, you want to work for us, you better get a house, you better pay 1800 a month. So who\\'s going to leave Anchorage, and it\\'s beautiful here, and who\\'s going to leave this to go be in Bethel and pay exorbitant amounts for rent and food for less possibilities in their life? So yeah, it is trending better in certain ways, but you have one major issue that doesn\\'t seem to be worked on. And I\\'ve begged and I\\'ve pleaded for it to the point where I just got red-faced and gave up. \\nLeif:\\tWhat would help with that? \\nClyde:\\tSee, I\\'ve always been told eight different ways that I\\'m wrong, but I believe that a bunkhouse and a two weeks rotational shift would be hugely beneficial. I\\'ve been told that, \"If you\\'re going to provide them with housing, I want housing. Even though I live here.\" Or, \"I want extra money in my paycheck.\" The city manager says the union won\\'t allow it, the union says the city manager won\\'t allow it. 
The public works director said, \"That\\'s not going to happen.\" I don\\'t know any other way but to model ourselves after a more successful company, and I mean successful in staffing, not monetarily, but staffing. These other guys are full, and that\\'s the one thing they all have in common is a bunkhouse. You come here, you work for two weeks, that\\'s common in Alaska. You can go out to the slope all the time, you see guys doing that year after year. They\\'re cool with that. Go spend some time with their family, come back here, work them to death. That\\'s what they want to do, work them 12, 15 hours a day. Maybe you can keep the locals on a more regular schedule because you got these outsiders coming in to pick up the slack, now you can have everybody going out and doing subsistence. \\nLeif:\\tChanging subject a little bit, are there any water quality challenges that you have? \\nClyde:\\tNot that I\\'m aware of. Like I said, yellow water. I mean, I know they do their tests, I know they say that they\\'re safe, but I don\\'t know, the quality, I don\\'t know. I don\\'t know what you\\'re going to do in Bethel to improve quality. I know that Toksook Bay and Emmonak ... Well, let\\'s just leave Emmonak out of that, Toksook Bay is, wow, their water quality is amazing. And I don\\'t know how they\\'re doing it, but you could drink it straight out the tap, it\\'s cold as ice, it tastes so pure, and it\\'s the best water I\\'ve literally had in my life. So if Bethel could somehow model themselves after Toksook Bay that would probably be an improvement. \\nLeif:\\tSo, some of these we hit on already, challenges due to arctic conditions, right? \\nClyde:\\tYeah. \\nLeif:\\tTimes where you\\'ve had to be creative responding to challenges, and what you would need, what would help. So I want to drill down on that one a little bit, when you\\'re talking about two on two off, I mean that also would be a package with basically flying people in, right? 
I mean, it probably wouldn\\'t be locals. \\nClyde:\\tRight. \\nLeif:\\t... it\\'d be more like Knik does with the bunkhouse. Or Grant, pilots. \\nClyde:\\tYeah, we\\'ll probably have to be more like what the police are doing. I think they pay their own airfare, but the difference with those guys is that they work together. They\\'re all mainly from Georgia and Texas, and they kind of know each other from the departments down there and they split rent. So it\\'s very cheap out here, and you\\'ll get two shifts, three or four guys splitting the one apartment. So it\\'s a little bit easier for them. But you\\'re not going to find CDL drivers to actively do that, you would have to have somebody that organizes that. \\nLeif:\\tSo, I don\\'t know if this is ... Have you seen any impact of changing climate on the ability to deliver water and water infrastructure? \\nClyde:\\tWell, I don\\'t know ... We say we\\'ve noticed the changing climate, I mean, seems that the winters are growing to be more snow and less ice, lately, which does help with traction on the vehicles. Other than that, I couldn\\'t really comment on it. The drawback with too much snow, of course, if you can\\'t see the driveways, you get stuck a lot. But at least you\\'re not sliding off the road. \\nLeif:\\tWe had, a couple years ago we had that year where like, it rained the entire month of February, and the roads were disgusting.\\nClyde:\\tYes. \\nLeif:\\tThat was a mess. \\nClyde:\\tThat was that one time that I canceled a route right away, that was the one and only time was during that period. \\nLeif:\\tYeah. \\nClyde:\\tIt was out in [inaudible]. \\nLeif:\\tYeah. But I guess in general, kind of warmer, wetter weather is better than colder, colder weather. Colder weather. \\nClyde:\\tIt is for human beings and mechanical, for sure. \\nLeif:\\tYeah. 
So maybe you answered this already, but the water infrastructure challenge that you think is most important to fix, and if you could wave a magic wand and fix it, sounds like it\\'s people, right? \\nClyde:\\tIf I had a magic wand? I mean, sure, more people if I wanted to fix the current conditions. But honestly, I would build another water plant, and I would pipe the entire city, and then I would work on replacing all of the oldest pipe. \\nLeif:\\tYou think pipes are- \\nClyde:\\tAnd then try to keep- \\nLeif:\\t... pipes are kind of the future, or the ideal goal? \\nClyde:\\tThat\\'s the ideal goal. If I had a magic wand, trucks need to go away. They do. I mean, that\\'s just a whole nother cost for the community. I mean, it\\'s good for jobs, but they can pipe everything ... You need to hire, take a lot of these water truck drivers and put them on a different department to work on the pipes. \\nLeif:\\tYeah. \\nClyde:\\tYeah. \\nLeif:\\tSo, we talked a little bit about, so system failures from your end I guess would be times where you couldn\\'t deliver water, or interruptions in services. You mentioned that, how often does that happen? I mean, I guess if I was a customer, and I got water delivered once a week, or was supposed to, how often would I not get my water when I was supposed to? \\nClyde:\\tWell, we try to keep in mind who was last, I don\\'t know, skipped, I guess, intentionally skipped. We try to rotate the affected people. So with that in mind, it just depends on the season and the amount of drivers I have. It could be at any point of the year, they could be in any season. It\\'d be hard to put a number on that. I would say no more than four times a year, which would be, as a whole, as a community, say, Blueberry, skipped, a year. So I\\'d say no more than four times a year. 
So when I say, if I miss like, say, Blueberry, say it\\'s route four, I tell the route four guy to go help the route five guy, and then I have him cascade down the other routes to finish everybody off, we would make that a first priority the next morning. But to answer your question, I would say no more than four times a year in any one spot. \\nLeif:\\tKnowing that stuff, is there a priority for planning? Like what\\'s the planning process for the future, for infrastructure? \\nClyde:\\tPipe everybody. That way. Yeah. Sorry about, my wife was mentioning the programs- \\nWife:\\tNo, DEHE programs. \\nClyde:\\tDEHE? \\nWife:\\tYes. ANTHC. \\nClyde:\\tANTHC? I\\'m not quite sure that that is, but ... \\nClyde:\\tYeah, I know they want to be on a water pipe. And I know that there\\'s some other issues that they deal with right now with the avenues that they\\'re working on. We need to have so many thousands of gallons on reserve for the school, as well as a higher capacity for what the avenues is expected to use. Which is why I say another, a plant would be nice, another water filtration plant. Even right now if we had that, just for the trucks out in Kasayuli, that would be a life-saver. Because it\\'s a 45 minute to one hour round-trip every time the trucks go empty. So if you can imagine, like in Kasayuli you have 45 houses to do on your route, you could do between four and six, four and seven houses, but you got an hour round-trip to refill. I know that\\'s a little off-subject right now, but the future is definitely piped.\\nLeif:\\tYeah, okay. Well, I\\'m going to roll through a couple of questions pretty quickly, because I think you\\'ve already answered them, but if there\\'s anything that jumps out at you that you want to add to it, please do. \\nLauryn:\\tCan I add a quick follow-up question too on that? So you kind of said, right, the future is putting in piped everywhere. What are some kind of barriers for that happening? 
Is it funding, is it workforce considerations, kind of things like that? \\nClyde:\\tI think it\\'s always funding, it\\'s always going to be funding. And I know we can find it. Yeah, so I don\\'t know. And I also heard some rumor that there\\'s some issues with the Arctic Pipe that\\'s being used not being as available as it used to be. Apparently there\\'s not too many people that make it. But I don\\'t know, that\\'s all I\\'ve heard. I don\\'t know if it\\'s true or not. \\nLauryn:\\tOkay, yeah. \\nLeif:\\tPivoting a little bit towards management, because you supervise people as a foreman, right? So there\\'s some questions about what management challenges you face, what workforce challenges you face, how you respond to these challenges, what you would need to better respond to these challenges. So those, I mean, I feel like you\\'ve kind of covered those, if there\\'s any other thing that jumps out at you to add. \\nClyde:\\tSure. Took me a few years to realize it. At first I considered more what my city administration wanted of me, which was to crack the whip, push, squeeze them for all they were worth, just whip, whip, whip. And I grew up learning that a management, a boss shouldn\\'t ask you to do things that they wouldn\\'t do. So the hours they worked, the things they did, it began to be something that I wouldn\\'t do. \\n\\tSo what I did was I emphasized more on my crew\\'s happiness than I did the job. I realize that my guys have a sense of community that no matter what I said to them, even if they were to be disobedient, it would be to get the job done. So, what I did is we had this saying that you work to live, you do not live to work. And that means that this job should not be taking away from your happiness in life. So we made every effort to make sure that these guys had time off, even if we really, really needed them. Sometimes guys would be wanting to work and you would tell them no, because it\\'s their day off, and you know they need rest. 
We started caring more about the employee, and what happened was that you would have less internal conflict between the employees, you would have better production from the employees, and overall more positive attitude out of them. \\nLeif:\\tAnd you only employed people with CDL? \\nClyde:\\tYep. \\nLeif:\\tOkay. So you basically, that\\'s the group of people that you\\'re trying to hire. Did you ever hire somebody who said they were going to get a CDL, or was it just, \"Once you get a CDL, come find me.\" \\nClyde:\\tWe have, we have. In fact, there\\'s been a loose system on that, we\\'ve actually hired people before they\\'ve even had their permits, just with the intent. We pay them full wages, we teach them how to drive, we get them their permit, and then we actually pay for them to get their CDL, take the test. We\\'ve actually sent people down to driving schools down this way towards Anchorage, and had them do that. We\\'ve had some guys fail, we\\'ve had some guys kind of flake off, which has burdened the system a little bit. Because it\\'s expensive, it\\'s about $6,000 for school. But now I think where we last left off was if you show us the initiative, that you studied, and that you\\'ve got your permit, then we will hire you. \\nClyde:\\tAnd then we give you a time period. \\nLeif:\\tCan they work with a permit? I mean is that?\\nClyde:\\tThey cannot drive our vehicles, but they could ride in the right seat and learn the job. They can see how it\\'s done, we put them with an experienced driver. We\\'ll also give them time with like one of the streets and roads guys who would give them some drive time in certain situations. \\nLeif:\\tAnd they can learn how the pumps work and all that, the other part, right? \\nClyde:\\tYeah. In the meantime, they\\'re getting trained up for what they\\'re going to be doing, but also for their CDL. \\nLeif:\\tOkay. \\nClyde:\\tAnd a paycheck. 
\\nLauryn:\\tAnd kind of a follow-up on that, can you walk us through some of that initial training? So, say someone\\'s just started, they have their license. How do they get trained in terms of kind of actually working the pumps and the whole system? Is it just from kind of someone who\\'s been there longer, is it more formalized training? \\nClyde:\\tIt\\'s not complicated. So, the biggest part is every morning you do a walk-around. So the experienced driver will show you the walk-around, there\\'s, I know, loosely, there\\'s about 45 places that you search on the truck to make sure it\\'s safe to drive. And you do that every morning. You either, you\\'re told to do it, or you\\'re explained how to do it. As far as the pumps go, 40 to 60 houses a day, you see it turned on and turned off that many times every single day, you\\'ll pick it up. \\nLeif:\\tOoh, we\\'re having an earthquake. \\nClyde:\\tYeah, we are. \\nLauryn:\\tOh my goodness. \\nClyde:\\tThat\\'s awesome. \\nLauryn:\\tFirst time feeling it. \\nClyde:\\tHoly cow. I thought someone was shaking my car. \\nLauryn:\\tMe too. \\nClyde:\\tOh man, that\\'s so cool. No wonder the kids freaked out. \\nLeif:\\tOur house in Anchorage was built before the big earthquake, so I feel very confident in it. Like it survived the big one in whenever, \\'56 or whenever that was. Whenever the, you know, the Good Friday one. Huh, okay. \\nLauryn:\\tWow. \\nClyde:\\tYeah, I can\\'t tell if we\\'re moving still or not. \\nLeif:\\tNow it might be the wind that\\'s moving your car now. \\nClyde:\\tOkay, I thought somebody was jumping on the trunk of my car, I was going to get mad. \\nLeif:\\tSo yeah, sorry, did I interrupt? I don\\'t know, were you halfway through that? \\nClyde:\\tMother Nature did. \\nLauryn:\\tI don\\'t remember what we were talking about. \\nClyde:\\tSo, I answered your question about the pumps. \\nLauryn:\\tYes. \\nClyde:\\tThey do lots of daily training. 
\\nLauryn:\\tThe person from Texas is over here like, \"Whoa, is that a normal thing? What\\'s happening? Were y\\'all okay?\" \\nClyde:\\tIt has been, actually. \\nLauryn:\\tGosh. \\nLeif:\\tContinue the interview from a doorway here. \\nClyde:\\tYeah. I\\'m just going to back away from these large poles. \\nLeif:\\tWell, I have a question. So do you feel like, is the CDL, is that useful information? Or is that kind of jumping through hoops? Like could you do on the job training, could you teach that stuff? Or is the time people spend doing CDL, is that really very valuable for the job they\\'re going to be doing? \\nClyde:\\tIt depends. I mean, I honestly, I prefer to teach guys myself when they come in with an idea. Unless they\\'re really experienced, and if they come in kind of halfway experienced, that could be dangerous if they have too much confidence. There\\'s no replacement for experience. So I guess with that said, if you don\\'t have a lot of it, I\\'d prefer to train them. But it doesn\\'t take a lot to get your license. \\nLeif:\\tDo you think your situation would be ... I mean, you\\'re only hiring people with a CDL because you have to, right? So if we could wave a wand and say that\\'s not the rule anymore, you can hire whoever you want, would you still hire people with a CDL, or would you, how would you value that? \\nClyde:\\tNo, no. I would hire anybody that can drive, man. And that\\'s how it used to be. Bethel used to be where you could, it was off-highway, it was exempt. And we had full crews, all the time. This was back when they had 25 people, they had 25 drivers for less than a city. Could you imagine? These guys were getting done in six hours. They would all drive back to the shop and have coffee breaks, and then they\\'d go back on the road, finish the route, and clean the shop for a couple hours. Now there\\'s only a roster of 15 positions allowed, funded by… again?\\nI\\'m still tripping out. 
Yeah, so now there\\'s only 15 positions, and it\\'s so tight, and we still can\\'t fill those. And that\\'s why we lost them, is because they got tired of allocating money every year for positions that would never be filled. Take away that CDL garbage, it\\'s just a piece of paper, and we would have no problems with these trucks. \\nLeif:\\tWhat changed?\\nClyde:\\tThe DOT. They said, \"Well, hold on, it\\'s a city, that\\'s a highway right there, you can\\'t be driving these trucks on a highway without a CDL in the middle of nowhere.\" It\\'s ridiculous, man. And I think everybody understands that there\\'s ... A certificate doesn\\'t change anything. You have a knack for it, you can do it or you can\\'t do it. I mean, I suppose if you\\'re a microbiologist, yeah, you should probably go to school. But this isn\\'t that, this is driving a large Subaru. You just turn a little later. There\\'s some things you got to learn about air brakes. Okay, we can teach that. \\nLeif:\\tYou said there\\'s been some people you\\'ve tried to put through training and they\\'ve kind of flaked out or whatever. Are there challenges ... You said it wasn\\'t that hard to get a CDL. Have you had people that just kind of can\\'t do it? I mean, has that been a barrier? And if it is a barrier, is it a barrier that people can\\'t do it, or because they just kind of don\\'t want to? \\nClyde:\\tThey don\\'t want to. It\\'s the whole you can lead a horse to water thing. They\\'re just not ambitious. \\nLeif:\\tBut not, it\\'s not an issue of just ... I don\\'t know the test, so I\\'m speaking out of ignorance here, but- \\nClyde:\\tIt\\'s not a cognitive issue. \\nLeif:\\tIs it reading, or?\\nClyde:\\tI mean, I think it\\'s like a 75-question test for the main part, and it\\'s mostly common sense, and then it\\'s a driving test. And if you can\\'t drive a truck with a manual transmission, then that\\'s, they mark that on your license, and big deal, we have all automatics. 
If you can\\'t drive a combination, with a tractor trailer, then you get a class B license, which is still fine, you just only drive stick truck. And then there\\'s an additional test that\\'s only 25 questions so you get your tanker endorsement, which is required, but only 25 questions. So we\\'re talking 100 questions total, and all you have to do is get 70% or 80%. \\nLauryn:\\tBut you said it\\'s expensive for the test, correct? So that could be a barrier is the price? \\nClyde:\\tIt is, if you\\'re not motivated. I mean, if you go to the job center there in Bethel, plenty of guys get it waived. The government will fund you for that. All you have to do is fill out your wants, your hopes, apply, get a letter, a promise to hire, or intent to hire, a letter, which anybody will write to you for that. And then you have to show why the field that you\\'re wanting to get into has a future. So part of that is ... I went through it myself, I got my CDL for free. And part of it was I was able to prove, document that a CDL driver is in demand, an average of 14% more per year. And they\\'re like, \"Oh, okay, we\\'ll pay for that.\" Boom. Six weeks later, driving CDL. \\nLauryn:\\tHm. \\nLeif:\\tDo you do anything on the billing side at all? Do you deal with customers paying their bills, and that sort of thing? \\nClyde:\\tNo, no, just kind of like a liaison sometimes. Like where, \"Yeah, I understand you didn\\'t get water, I know that.\" Billing\\'s not allowed to make any deductions, so I\\'ll tell billing to give them the deduction, or I\\'ll make an executive decision basically that there was a wrong committed by the city from my department, so I will direct them for a refund. But as far as actual billing, it\\'s not up to me. I don\\'t know the rates, I don\\'t know how much for what, except for extra calls. \\nLeif:\\tSo, you don\\'t decide when to cut people off, or any of that? \\nClyde:\\tI do to the extent of the code. 
So if you\\'re not within the code, you provided a dangerous walkway or pathway or whatever, driveway to my drivers, or you have a dog or a hostile environment, or the water fill, the drivers have to get on their tippy-toes and reach above their heads three feet, yeah, I\\'ll cut you off for that. But as far as payments? No. \\nLeif:\\tLauryn, is there anything you feel like we haven\\'t hit on? \\nLauryn:\\tYeah, I have one question about kind of safety of drivers. What are the kind of concerns there, maybe, especially during the winter when the road conditions are really rough, kind of balancing safety with providing water, right? \\nClyde:\\tRight. So ... Well, in Texas you probably don\\'t know that, but roads get icy, and these trucks weigh 60,000 pounds, 30 tons, and they have 10 tires on them, but also it\\'s 30 tons. So there\\'s no chains, there\\'s no studs in the world that\\'s going to keep those trucks on the road if they\\'re going to slide. We\\'ve had trucks go into many ditches. More often it\\'s simple mistakes where the snow has blown so much that you can\\'t tell where the sides of the driveway are, so you back down into basically a ditch, you get stuck that way, or too deep of snow. \\n\\tWe\\'ve had a couple instances where people have gone down into, all the way through down into green belts. Like Leif, if you leave Darren\\'s house and go down that hill, that\\'s the stop sign in that big green belt, we\\'ve had a truck go down there. Or Darren\\'s old house. But no deaths, no fatalities. Bethel has a maximum speed limit of 45 miles an hour. \\nLeif:\\tHow about just slip and falls? \\nClyde:\\tLot of slip and falls getting out of the trucks, the rigs are very tall. So we try to have everybody with cleats, which also makes you a little more cumbersome. But yeah, that\\'s actually my biggest concern. Like I said, most of my core guys are 50. \\nLeif:\\tYeah. Workman\\'s comp is probably a concern, then. 
\\nClyde:\\tYeah, I mean, workman\\'s comp, yeah, I mean, I guess they\\'re more of a concern for the admin side, but just for me, I\\'m out a guy for a long time now. So I still got guys that have approved PTO, I still got guys that get sick, I still got other guys that get hurt, other guys with babies being born, other guys want to moose hunt, now this guy\\'s down. So should be at 10, I was at nine. Now I\\'m at eight. Now I got to get by with seven here and there, so now I got to drive. I don\\'t want to drive anymore. But now I\\'m going to drive. So now you have the office not being taken care of, it just gets worse and worse. \\nLeif:\\tYeah. \\nLauryn:\\tI think that was my last question I had written down \\nLeif:\\tWell, I think we\\'re kind of wrapping up here, Clyde. Is there anything else you would like us to know, or do you have any questions for us? \\nClyde:\\tDespite any of the concerns or the complaints or any of the good stuff, anything I had to say today, I think it\\'s pretty amazing that we have such a reliable water source. When I say reliable, it\\'s pretty amazing, even with our shortcomings here and there, that we have that in the middle of nowhere. If you were to, you pull up Bethel on a map, and you and me, Leif, we know what all the villages are, those don\\'t count. Essentially any decent-sized city, there\\'s nothing for 400 miles. And we have what we have, and I think that\\'s amazing. I mean, it\\'s a feat. So if we just improve on it, we need to not be afraid of changing what\\'s always been done, we need to get that terminology out of our head. But I think Bethel\\'s going to be all right. No one\\'s getting sick, everyone\\'s pretty clean that wants to be. Really, I have no complaints. \\nLeif:\\tAll right. If there, as the process moves forward, do you mind if we contact you in the future if we have any follow-up questions? \\nClyde:\\tI\\'m always up for help, I\\'d be happy to help you guys in any way. 
\\nLeif:\\tAnd then the last question I had is if you have any pictures related to, I don\\'t know, water trucks or any of the stuff you\\'ve been doing out there. \\nClyde:\\tInjuries?\\nLeif:\\tPictures. \\nClyde:\\tOh, yeah. I\\'ve got a ton I can send you. \\nLeif:\\tOkay. I’m not looking for anything in particular. \\nClyde:\\tI have a folder in my phone that\\'s just work. It\\'s pretty sweet. \\nLauryn:\\tYeah, any pictures, like even general pictures or anything interesting, that if there\\'s kind of a situation attached to it, you can add a descriptor. Just feel free to send anything. \\nClyde:\\tAll right, sounds good. \\nLeif:\\tDid you have any trucks ... There was a truck that rolled over. Were you with the city then? \\nClyde:\\tYep. Yeah, that was crazy. So policy was not followed on that. So- \\nLeif:\\tIt was like two in two days, right? \\nClyde:\\tWell, this guy, he was driving a sewer truck, and he was up at the highway lift station, and he cut the corner too hard, and there\\'s these concrete yellow poles there. He ripped off his whole hydraulic tank, his pump, everything, ripped it off. And there\\'s oil everywhere, pain in the butt. He\\'s supposed to go take a drug test, and then wait for results to come back at that time. Well, instead, the foreman was shorthanded, he says, \"Go take that brand new truck, get out there and do water.\" \"Okay.\" \\n\\tSo he leaves the shop, he takes a right, he drives about 25 feet and then hooks it into the ditch and rolls it over. \\nLauryn:\\tOoh. \\nClyde:\\tI think they finally let him go home after that. \\nLeif:\\tI mean, I remember that one, yeah. I think there was a little detour at the hospital, but yeah. \\nClyde:\\tOh, was there one there too? I was thinking that was, the one I was talking about was out in front of public works. \\nLeif:\\tNo, no, I think we\\'re talking about the same guy. I think we transported him, I think he ended up coming into the hospital after that roll-over. 
\\nClyde:\\tOh, yeah. Yeah, there\\'s been plenty of them. Like I said, that one by where Darren used to live, I was surprised that guy didn\\'t die. He just slid right past Darren\\'s stop sign and then went right down in that gully. You remember that, where that T is? \\nLeif:\\tYeah, I know where you\\'re talking about, but I don\\'t remember that. \\nClyde:\\tHe went straight down that. \\nLeif:\\tYeah. Yeah. \\nClyde:\\tWe\\'ve had plenty of situations where trucks will back up in a driveway in Kasayuli, the driver will get out, start to go pump, and the truck slides down the driveway, crosses the street, and goes down into the tundra. We\\'ve had those. \\nLeif:\\tWe talked about filling up houses with water, but there\\'s also been trucks backing into houses, right? \\nClyde:\\tOh yeah. That was a big thing, stop backing up so close. Mostly for the sewer guys, they don\\'t have as much line on there, and the hose is much larger. They do have extensions, but they\\'re difficult to work with. I think it\\'s just laziness for some part of it. But yeah, we\\'ve bought quite a few barbecues and snow machines. \\nLeif:\\tWow. Well, I think that\\'s kind of all the questions that we have, unless you have anything else for us, any questions or anything else to add, it\\'s been, I think, super informative. \\nLauryn:\\tYeah. \\nClyde:\\tYeah, thanks for calling me. \\nLeif:\\tYeah, no problem. So I got a little process thing, you\\'ll get like a DocuSign thing that says that you did this, and then I can release funds to you. So it\\'s $100, and- \\nClyde:\\tYeah. \\nLeif:\\t... I can do that a couple different ... Like I can Venmo it, I can give you a $100 bill, if we can connect, or I can ... I don\\'t know, however you want it, I can figure that out. \\nClyde:\\tYeah. I maxed out my Venmo. I guess, if $100 is worth it, you could go down to AK. \\nLeif:\\tOh yeah, you know I\\'m in Anchorage a bunch too, do you ever come into Anchorage? \\nClyde:\\tOh, heck yeah. 
Yeah, call me when you\\'re in Anchorage, man. I go there every day. \\nLeif:\\tOh, really? Yeah, I\\'m in Anchorage right now, that\\'s why I felt the earthquake, yeah. \\nClyde:\\tOh, duh. Oh my God. I always thought so high of you, I thought maybe you\\'re just omnipotent. That\\'s why you felt it. \\nLeif:\\tA great disturbance in the Earth\\'s crust. \\nClyde:\\tAll right. Well, I\\'m going to go pick up my kids. You want to meet today or some other time? \\nLeif:\\tI don\\'t know if I can do it today, I don\\'t know how long it takes to do the ... Yeah, actually probably not today, because I\\'d have to get cash, too. But ... And I\\'m not sure what I have at the house. \\nClyde:\\tAny time, just if you think of me. Because I got to send that thing too. \\nLeif:\\tI\\'ll get the process rolling. And so you maxed out your Venmo for money coming in? \\nClyde:\\tYeah. \\nWife:\\tYeah. \\nLeif:\\tI didn\\'t even know you could do that. That sounds like a good problem, Clyde. \\nClyde:\\tYeah, CashApp too. It\\'s pretty good. Online money-making. \\nLeif:\\tYou know what, I\\'ll stop the recording, we can talk about it later.\\nClyde:\\tI\\'ll show you the ways, Leif. \\nLeif:\\tI\\'ll get that stuff rolling, and yeah, good to talk to you, thanks for doing this. \\nLauryn:\\tYeah, it was great to meet you. \\nLeif:\\tYes sir. You too, thank you. Take care. \\n', '1_3_InterdependenciesNNA': '\\nQC Nelson\\nWed, Aug 24, 2022 7:50AM • 59:15\\nSUMMARY KEYWORDS\\noperators, water, alaska, bethel, systems, people, community, plant, utility, project, yk, sewer, money, pipe, anchorage, big, challenge, building, delta, piped\\nSPEAKERS\\nLeif Albertson, Lauryn Spearing, Nelson\\n\\nLeif Albertson 00:00\\nYou know, it\\'s Alaska is a small place. So, you know, I don\\'t I don\\'t claim that we\\'re guarding this, like the launch codes, right? Don\\'t say anything, you know, you wouldn\\'t want anyone to hear I guess. So yeah, so the deal is? 
Well, I don\\'t, I don\\'t Lauryn, do you want to you want to you want to kind of give an overview of where this started?\\n\\nLauryn Spearing 00:26\\nYeah, definitely. So I work at the University of Texas, in Austin. And so I worked with Kasey Faust, she actually grew up in Anchorage. And what she wanted to do is kind of do a research project specifically focused on water infrastructure in Alaska. And so we are funded by NSF on kind of, I guess, two separate projects. But what we\\'re really trying to understand first is a little bit more of, you know, what are the challenges when providing infrastructure in these rural communities, right, specifically water. And then we\\'re also looking a little bit more at kind of operator training and certification and some of the issues with more of the kind of operations side, so once the system\\'s built, and during when it\\'s continued to be operational. And so that\\'s kind of the general scope of the project. And we\\'re going to ask very kind of general questions at the beginning. And so just any of your insights from work and in Bethel, the YK delta, or generally in Alaska as well, that makes sense, or do you have any questions?\\n\\nNelson 01:32\\nOh, it does. I mean, it just depending on your questions, it might be pretty hard for me to dial it back to just the YK. You know, I do a lot of work throughout western Alaska. Obviously, Bethel is a community I work a lot in. But yeah, I just hope you\\'re prepared to receive answers in relation to all of Western and Northern Alaska.\\n\\nLauryn Spearing 01:55\\nTotally. And if there\\'s anything specifically you\\'re thinking of a location, let us know. But any of your insights are helpful. So you don\\'t have to dial it back at all. Any insights from your experiences. 
The one kind of thing we\\'re specifically looking at is kind of the more rural regions, instead of, you know, Anchorage, or something.\\n\\nNelson 02:13\\nUnderstood.\\n\\nLeif Albertson 02:15\\nAnd I think that probably, I mean, I suspect that there\\'ll be some commonalities, you know, I mean, some of the technical, you know, is what the bedrocks like or whatever it might be, right might be different, but I\\'m, I\\'m guessing, you know, the differences between like Stebbins and Emmonak or, you know, something, probably, once you cross the border, and you know, it\\'s not going to be, you know, wildly different. So\\n\\nNelson 02:36\\nthat\\'s, oh, that\\'s a fair assessment. Yeah.\\n\\nLeif Albertson 02:40\\nSo and the reason we\\'re working in a team is that, so I\\'m gonna ask you some questions. And I might ask you to repeat some things in a way that folks from Texas can, will will make sense, right? Sometimes when I do the interviews alone, you know, people are like, well, you know how it is. I do know how it is, but can you say it? Anyways? So um, yeah, so I\\'ll just start with some background about you. So you, you\\'re based in Fairbanks, you travel all around? Can you tell us about kind of how long you\\'ve been doing, what you\\'ve been doing and what you do?\\n\\nNelson 03:20\\nYeah. Well, so I\\'m with a consulting firm. And I\\'ve been with a consulting firm for 14 years now. Predominantly water and water and sewer in western and northern Alaska. Before I came to my current firm, I did do a turn with a Peace Corps type organization in Central America, Guatemala, and Haiti. So I was doing similar work, cross cross cultural, in the developing world, before doing this as a professional in Alaska. Yeah, 14 years is long enough to probably know my way around, but I don\\'t think I don\\'t think years of experience is really a good substitute for. For much, I mean, yeah, you got to spend time out in the field. 
And, you know, any, anybody that\\'s five years and has the ability to innovate, so I don\\'t hold 14 years up as like a pinnacle of anything.\\n\\nLeif Albertson 04:17\\nAnd you\\'re an engineer? Yes? \\n\\nNelson 04:21\\nYeah. I\\'m a Professional Engineer. So my, my emphasis is water and sanitation. My degree is technically in environmental engineering.\\n\\nLeif Albertson 04:32\\nGreat, and where are you from?\\n\\nNelson 04:35\\nAh, so I went to University of Michigan Tech in the northern upper peninsula of Michigan. I did grow up partially in Alaska, though in Sitka.\\n\\nLeif Albertson 04:46\\nSuper. So, the next series of questions have to do with how you work with water systems, and it sounds like serving on a transactional or you know, a project basis, is that. Can you kind of describe what a typical job would be?\\n\\nNelson 05:06\\nYeah, maybe it\\'s, I mean, transactional is one word to describe it. I mean, I am private sector. So I am for hire on things. I mean, I would love to just to be out there just, you know, doing stuff for free. But that\\'s not that\\'s not where I\\'m at. I mean, I\\'m an owner in a private sector engineering firm. So we\\'re for hire, you know, we\\'re hired to design and construct projects. And, you know, that does require a lot of working with operators and Public Works directors and owners and the public. But primarily, what I\\'m paid to do is build and deliver a project that, you know, is operator friendly, and will be maintainable. That\\'s the goal. And, you know, we do also do other things, you know, we work with water plant operators to, you know, optimize water quality and water treatment. And we\\'ll do studies, feasibility type studies. But, yeah, I mean, transaction is probably the right way to describe a lot of what we do. 
I mean, we\\'re for hire.\\n\\nLeif Albertson 06:18\\nYour clients are generally though, like public entities, like cities, or tribes or something?\\n\\nNelson 06:26\\nYeah, for the most part. We do a lot of work for municipalities. We do a lot of work for health corporations. We do work directly for tribes. We do work directly for Native corporations also. We do get pulled into projects for you know, Katmai Fishing Lodge, or like small private systems in western Alaska, also. But our bread and butter are clients like the city of Bethel, the city of Kotzebue, Norton Sound Health Corporation.\\n\\nLeif Albertson 06:59\\nYeah, that was so that\\'s. And Lauryn, that\\'s kind of my connection is I was on the Bethel City Council. And when the municipality side, so when things come up that we don\\'t have expertise or ability to do, right, we contract someone to do it. Right. And so that\\'s where Jason and I met, working on some some Bethel projects,\\n\\nNelson 07:23\\nWater and sewer stuff.\\n\\nLauryn Spearing 07:27\\nWell, I already have a follow up question from what you said. So I\\'m gonna jump in real quick. But one of the things you said was kind of when you\\'re building you\\'re thinking about it being operator friendly and maintainable? Can you expand a little bit more about how you kind of think about that? And, and what you\\'ve maybe found successful in doing that?\\n\\nNelson 07:48\\nYeah, well, there\\'s a lot of there\\'s definitely a lot of complexities here. I mean, in general, I think that the more simple, the better, you know, the simplest pump and the simplest operational package, the better, the less things that can go wrong, the better. You know, a lot of the times the operators we work with are brand new to the job, or they aren\\'t going to be around for long, and you walk into a water plant or a sewer lift station, and it\\'s kind of like a rocket ship. It\\'s like how the hell do you even run this thing? You know, so the simpler, the better. 
But, um, you know, engineers have a way of complicating things and making them super sophisticated and state of the art. And that does fall down a lot of the time, you know, when there\\'s unreliable power sources or stuff like that. So I like to keep it simple. But engineers have a way of running away with things and making them more complex. And then also, you know, on the drinking water side, and on the wastewater side, you know, we have certain standards that we have to meet to the Department of Environmental Conservation and the EPA. So it\\'s not like we can oversimplify things and not get 4-log removal on bacteria at a water plant, you know, we have a certain standard, where certain levels of simplicity would not be allowed because we have to meet this requirement. I mean, it\\'d be great if we could have, you know, basic sand filters all over the place. They\\'re super easy to maintain, and, but we\\'re not able to meet a certain standard with that kind of technology.\\n\\nLauryn Spearing 09:34\\nYeah, definitely. So balancing kind of the, you know, simplifying meeting regulations, as well as I mean, it has to be also a lot harder to build these systems in some places in Alaska.\\n\\nNelson 09:47\\nDefinitely, and more expensive.\\n\\nLauryn Spearing 09:51\\nTake it away Leif.\\n\\nLeif Albertson 09:52\\nYeah. Well, and I\\'m sure we will have some more questions about those exact things. But before there\\'s something that you mentioned about, you know, working with, you know, working with tribes and working with health corporations. Who like, how much time do you spend talking to people about water? And who do you talk to? Do you ever talk to the public? Or is it?\\n\\nNelson 10:19\\nWell, we do. 
I mean, in my line of business, I mean, we\\'re mainly working with the operators at a water plant, or sewer plant, or sewer lift stations, you know, we\\'re working with the people that run the infrastructure, the Public Works directors, the utility line maintenance. But that being said, I mean, for big piped water and sewer projects, we often have a major stakeholder outreach component where we need to talk with the people that are going to get water and sewer service. And we need to, you know, talk with the people we\\'re going to get easements from, and agreements and all that. So we definitely have a component of outreach with a big piped water and sewer project. And I mean, I was involved for a number of years with the Alaska water sewer challenge, which was a, which was a project that, you know, basically looked at doing household treatment instead of piped systems. So through that project, I interviewed and was in households, just all over the Delta, YK Delta and in the Norton Sound region. So I have a fair amount of time talking to people about honey buckets and the complexities of pipes versus on site treatment, all that kind of all through Western and southwest Alaska.\\n\\nLeif Albertson 11:43\\nIt\\'s a little bit afield, but I\\'m super curious, do you have any summary thoughts on working on the water sewer challenge? building in houses versus big systems with pipes?\\n\\nNelson 11:56\\nWell, it was a great idea. And just, we went into it very optimistic thinking that, you know, basically, back to what I said that we\\'ll have this simple system where, you know, we\\'ll have this vestibule and it\\'ll attach to the side of the house. And, you know, we\\'ll have a granular activated carbon filter here, and then a wastewater holding tank. And we just had this very simple approach. And it grew in complexity as we kept trying to, like improve the disinfection. 
And by the end of the project and building these prototypes, I mean, it was so freakin complicated. It was no, no, no non engineer person could run this system. It just grew in complexity. You know, so So that was it was a much bigger challenge to do household treatment than we thought it would be. And I bet you if you talk to the other teams, some at UAA. YK had a team. YKHC had a team. Yeah, Brian Lefferts had a team.\\n\\nLeif Albertson 12:59\\nThey sort of they sort of did their own thing, though. I think I think they they maybe expanded beyond the original scope or something. But yeah, they did. They I mean, I remember the I was in the warehouse with the bubbling.\\n\\nNelson 13:12\\nAnd I think they came to one of the same conclusions that doing this on a household level is, it\\'s hard. It\\'s not something that a typical homeowner is ready to take on from a maintenance perspective.\\n\\nLeif Albertson 13:27\\nThat was kind of the the summary that I got, you know, is that shrinking a water treatment plant is very challenging and turning a homeowner into a water plant operator. I mean, you know, it\\'s hard enough to find water plant operators at the water plant. And then, you know, so yeah, that was kind of the seemed like, seemed like everyone was having that challenge.\\n\\nNelson 13:53\\nYeah. Um, so I mean, that was the technology side, trying to basically take a full water plant, bring it down into a household plant. That was the technology challenge. I mean, so there was, then there\\'s gonna, you know, the cultural challenge here. You know, there was some really big cultural hurdles to get over with the idea of recycling water. You know, the idea of reusing water multiple times, you know, I know that, you know, the city of San Francisco had to get over this hurdle to recycle water in that in that community. And that\\'s a hyper educated community. But, you know, the idea of recycling water and reusing it again, did not sit well with a lot of people. 
You know, they thought that they were, you know, being compromised essentially, of, \"Why doesn\\'t Anchorage have to recycle water if you\\'re asking us to do that?\"\\n\\nLeif Albertson 14:53\\nInteresting. Was there. So, did you run into people just preferring to use sort of traditional water sources or rain collection instead of mess with this whole thing? \\n\\nNelson 15:08\\nDefinitely. \\n\\nLeif Albertson 15:09\\nHow did those? I mean, when so, you know, you show up and you\\'re like, hey, we\\'re going to do this. I mean, were people happy to talk to you? Or did they? How was?\\n\\nNelson 15:18\\nYeah, I, you know, I think there\\'s this divide in between water and sanitation. I really think that a lot of rural Alaska does not like the honey bucket, they want flush toilets. They want you know, that part of it. But as far as the drinking water, you know, traditional water sources, you know, the people in Rampart want to go to Minook Creek because they\\'ve been going there forever to get their drinking water. So they seem a lot more content to continue using drinking water, traditional sources. But I think everybody wants flush toilets, and then, you know, the sanitation side to be handled via pipe system.\\n\\nLeif Albertson 15:56\\nThat\\'s an interesting insight. We, you know, we\\'ve been talking really about the water delivery side when we\\'ve talked to people but but that is an interesting, I guess I can say, yeah, that\\'s an interesting insight that people do have, there\\'s kind of a juxtaposition and how people feel about technology involved in taking waste away versus bringing water in.\\n\\nNelson 16:18\\nWell, honey buckets aren\\'t much fun. They\\'re really, really an awful way to handle waste.\\n\\nLeif Albertson 16:24\\nI don\\'t, I don\\'t care for that. So you work in all kinds of different communities? 
Can you give just a brief overview of the types of water delivery systems that that you\\'ve had experience with or that you\\'ve seen?\\n\\nNelson 16:39\\nYeah, well, you know, the lower Lower Kuskokwim lower Yukon, you know, that\\'s, that\\'s a big, traditional rainwater capture area. You know, people like to just gather rainwater off their roof for drinking and cooking. You know, so there\\'s the traditional sources like rainwater, and like creek water, Minook Creek, like I just mentioned. So that would be like, no, no public distribution. And then you get up into the hauled systems. So there\\'s, you know, a variety of hauled systems out there, obviously, you know, the way the city of Bethel does it is big water haul trucks, but you have to have a road network for these big haul trucks. You know, they\\'re 80,000 pounds each. So unless you have like a good road system, you\\'re not going to have a haul truck network. Some of the other communities like you know, Kwigillingok and Kong, you know, they have kind of the four wheeler hauled system where they\\'re hauling hauling water to a cistern at each of the house using a four wheeler delivery system. So traditional, hauled, and then piped. On the pipe side, you know, there\\'s a number of different ways to do it. Almost all of the systems in Alaska, they circulate for freeze protection. But we always go into a project and evaluate do we need to go above ground with the pipes? Do we need to go below ground? How do we how do we build out the pipe network? and that is incredibly location specific. You know, there\\'s a lot of above ground pipe systems in the YK Delta, just because of the frost susceptibility of soils and the likelihood of great amounts of movement. If when you\\'re burying pipes in that kind of environment, you tear them apart. 
So that\\'s why you see this spaghetti noodles, of, you know, above ground pipe in in the delta a lot.\\n\\nLeif Albertson 18:38\\nI\\'ve got, I\\'ve got that in the backyard. I painted mine. So it blends in with the trees a little bit. I did some patterns on there.\\n\\nNelson 18:46\\nBut you go to Kotzebue, and everything, we bury everything up there. We bury everything in Nome. So better, better, less frost susceptible soils up that way.\\n\\nLeif Albertson 18:58\\nYou You I noticed that you ordered, you know, you listed those different types of delivery systems in a way that felt sequential. I mean, the people that you\\'re working with, is it is pipe is a pipe system preferable?\\n\\nNelson 19:13\\nI think so. I think that\\'s the gold standard. I think I think that\\'s the gold standard. Yeah. Um, you know, we\\'ve we\\'ve talked you know, we do a number of like preliminary engineering reports where we kind of assess the alternatives for providing water and sewer and, you know, there are certain communities that are just really, you know, they\\'ve just been, I don\\'t know, they just want pipes. You know, we\\'re experiencing this in Wales right now where they just want pipes especially on the sewer side.\\n\\nLeif Albertson 19:50\\nAnd over, you know, you said you\\'ve been doing this for 14 years. So is there you know, we think about remote Alaska is there progress that direction? is water provision, is there more pipes than there used to be? Are things better than they were? \\n\\nNelson 20:04\\nWell, that\\'s a good question. I mean, I say that not just for a pause, I say that because the rate of degradation is probably faster than the rate of improvement. You know, so Emmonak gets a new system and Bethel gets the institutional corridor and Kotzebue gets some new circulating loops. But the rate at which they\\'re degrading all the infrastructure built in between the 60s and 80s, is falling apart. 
So I don\\'t know if things are technically getting better there. They\\'re may be maintaining. But I mean, clearly, we have some massive, massive funding, spending coming up in the next year, year and a half. I mean, there\\'s never been this kind of windfall for rural water and sanitation that I\\'ve, since since I\\'ve been practicing.\\n\\nLeif Albertson 20:53\\nYeah, so in that kind of. Since when this study was planned, this is new information. So can you say say a little bit about that? And how you see that affecting rural and remote Alaska?\\n\\nNelson 21:06\\nWell, I certainly hope you\\'re like interviewing some funding agency type people. Are you guys? Somebody with IHS, USDA? You got people like that?\\n\\nLauryn Spearing 21:17\\nYeah, we actually, yeah, we\\'ve done kind of this is the second part of the project, where we\\'re looking more kind of regionally, but we are just kind of wrapping up talking to a lot of more state level stakeholders. And so we also have like a white paper and stuff we can share if you\\'re interested. But we we did talk to kind of the state level and and national organizations, which it\\'s interesting to hear the different.\\n\\nLeif Albertson 21:45\\nAnd that\\'s why I think it\\'d be interesting to hear from you as well. Because I mean, a lot of these things are, you know, developing right now, right? Looks like there\\'s gonna be some money and okay, there\\'s gonna be some money. Okay, here\\'s it. There\\'s a ton of money, but how that\\'s going to trickle down, whether through the tribes or the municipalities or the health corporations. I\\'m not clear on I mean, so what would what do you see?\\n\\nNelson 22:05\\nWell, yeah, I mean, I guess I brought that up, just because I feel like I\\'m not the best one to answer it, you know, as the for hire guy, the guy that\\'s out there building stuff and making it happen. I\\'m not responsible for how the funding makes it to YKHC or makes it to the city of Bethel. 
You know, I\\'m sort of aware of where the money\\'s going just because, you know, they\\'re my clients and, you know, sometimes they need help with grant applications or something. But, you know, the funding made available through the Jobs Act and all that is it for mainly, I mean, it appears from what I\\'m seeing that most of its going through to IHS and will be distributed through the SDS sanitation deficiency system programming, so a lot of a lot of IHS money. And ANTHC, Alaska Native Tribal Health Consortium will be managing and running a lot of those projects on behalf of IHS or with IHS funding. This is not like I mean, as a consulting engineer, though, this is not where I operate, though. So I probably just messed something up there.\\n\\nLeif Albertson 23:18\\nI got you. I mean, it\\'s interesting to hear from you, because you\\'re on the other side. Right? You. So you would have customers that would have money that didn\\'t used to have money, right? So you could be looking at projects that were maybe never feasible until now. Yeah. So that\\'s exciting, right?\\n\\nNelson 23:34\\nYeah. I mean, a lot of a lot of those projects that were economically infeasible, just because of the unit cost of distributing and collecting water and sewer on a per unit basis has been so high, you know, the money that\\'s available right now, will be available for some of these economically infeasible type projects, these projects that are 200 or $300,000 per house for water and sewer systems. Yeah, it\\'s crazy. 
But that is the cost of providing a full piped water and sewer system in one of these small communities.\\n\\nLeif Albertson 24:13\\nSo kind of, probably related to that, if we\\'re if we\\'re looking at you know, can you talk a little bit about I know, it\\'s a really broad topic, but what are the what are the challenges delivering water to people in the in the communities you work in?\\n\\nNelson 24:31\\ndeliver, like delivering via the haul system or just?\\n\\nLeif Albertson 24:37\\nAll of the above if our goal is to get good potable water to somebody at their house. Right? Why is that so hard?\\n\\nNelson 24:49\\nUm, well, honestly, I think cost is the biggest challenge and you know, the federal funding package might be able to, you know, overcome some of that. You know, it It\\'s just the cost of delivery and the cost of capital projects and the cost of and just the cost of operating those systems. I think that\\'s the biggest challenge you know, and I just threw some numbers out there about two or $300,000 for capital investment for pipe water and sewer to homes and Kipnuk or whatnot. Um, you know, that\\'s, that does that number seems so big, but if you look at Fairbanks, Bethel, Anchorage, you know, a bigger Alaska community, you know, the investment over time, you know, to build out the pipe network and all these lift stations over time, you are putting a couple 100 grand into water and sewer for each building. If you look at the investment over time of the 15 different well pump houses in Anchorage, the three water treatment plants, the four wastewater plants, if you look at that investment over the course of history, that\\'s a lot of investment. And then here we are in Kipnuk, at $200,000 or 240. That\\'s a one time investment to provide a pipe water and sewer system all at once. 
So I think that the critique about the cost being so high is not really fair, because it\\'s not capturing the long term investment of a bigger community.\\n\\nLeif Albertson 26:25\\nInteresting. And so, why is it so expensive?\\n\\nNelson 26:31\\nYeah, well, I mean, yeah, building conditions are hard. You know, the YK Delta in particular is a tough area to build anything. I mean, remoteness is one thing. But all you know, remoteness is just one thing. The conditions unique to the YK delta are really poor building, building, really poor ground. groundwater table is basically at the ground surface, especially in the Lower Kuskokwim. Where it is, the groundwater is at the same level as as, as the ground surface, so anything underground becomes very difficult. And then the Delta nature, the alluvial soils and whatnot are very frost susceptible. So there\\'s a lot of ground movement. In order to guarantee that things will stay in place, you almost have to put them on some sort of support that penetrates the ground, helical supports, driven piles all that. So the ground movement is a unique challenge to the YK Delta. The remoteness is a challenge to all of Alaska, seasonal barge, you know, seasonal barge limitations. You know, politicians always bring that up. And it is a good point. But that\\'s only one thing\\n\\nLeif Albertson 27:56\\nHow about on the operation side?\\n\\nNelson 28:02\\nOh, yeah, there\\'s definitely operational challenges. You know, a lot of the times the utility can only pay a certain amount for an operator. You know, they\\'re not, they\\'re not 100 $200,000 A year jobs where you\\'re really going to retain somebody for a long period of time. I mean, for for the retention of that person, the Billy of Bethel heights water treatment plant that\\'s been there for 48 years, those guys are not motivated by money. They\\'re motivated by Billy has been the water plant operator at Bethel Heights Water Plant since 1966, or whatever. 
The you know, those guys are almost impossible to find that want to make a life out of being a water sewer operator, the wages aren\\'t good enough to retain people. There\\'s not enough income, there\\'s not enough revenue for the utility to have money to pay people what they deserve to run a water plant. So there\\'s high, high turnover in the operations. And you know, you need time to understand how to run a polymer pump at a water plant or to develop these skills. So turnover within operations is probably the biggest challenge I see. The inability to retain staff that know how to run these systems. And then unfortunately, you know, we work with the community of Selawik. You know, there was just a house fire up in Selawik. Last week, and unfortunately, it killed four people and one of them was the lead water operator. So, you know, Selawik is down the one person that they had that could run that plant. I mean, that\\'s a very extreme example, but you know, when you don\\'t have any redundancy within your operational ranks and you lose the lead guy. I mean Selawik is bound to have some trouble in the coming year.\\n\\nLeif Albertson 30:06\\nWell, and I don\\'t think that\\'s an unusual situation. Right. So even in Bethel, we have two water plants. And, you know, there\\'s been times where we\\'ve had one water plant operator. And, you know, probably, you know, I mean, that\\'s, that\\'s where you lose something like the ability to fluoridate or, you know. So what thoughts do you have? What do we do about that? Any ideas on addressing that challenge?\\n\\nNelson 30:35\\nYeah, well, so Norton Sound Health Corporation is developing, like a cooperative utility. Have you talked anybody at Norton Sound? No? They might be a good one to talk to. 
Yeah, they\\'re they\\'re working with a grant from the Helmsley foundation administered through Engineering Ministries International, which is a kind of a faith, sort of faith based nonprofit, but they\\'re working on this cooperative utility model. So taking all of the utilities in the Norton Sound Region -- Wales, Stebbins, Gambell, Savoonga, all the Norton Sound, and basically administering the utility as as a cooperative. So Norton Sound Health Corporation is responsible for all the training and has their own RMWs, remote maintenance workers, that drift around the region, billing, everything is done on a cooperative utility basis. Everybody has the same pumps at their sewer lift stations, everybody has the same household circulation pumps for the water. All that, you know, so they\\'re administering the utility as one instead of 17 separate utilities. And that way, you know, one operator could be you could you could, you know, we got somebody in Koyuk that can go to Elim for a while and everybody\\'s trained on the same equipment, instead of everyone having different different things.\\n\\nLeif Albertson 32:07\\nSo, I mean, so is that like ARUC?\\n\\nNelson 32:11\\nIt is, but it\\'s more regional and way more specific. But yes, ARUC would be a larger version of that.\\n\\nLeif Albertson 32:18\\nOkay. And they do. They\\'re doing the, like, the training of the operators too, not just fix stuff when they go out there.\\n\\nNelson 32:25\\nYeah, yeah. And stockpiling spare, you know, chemical feed pumps. And, you know, just, yeah, more more than what ARUC does. I mean, ARUC, obviously has their own remote maintenance workers and they do billing for some communities. But I mean, this this, this model that Norton Sound is working on is way more regional specific.\\n\\nLeif Albertson 32:48\\nSo if I am a water plant operator in Elim, do I work? Who do I work for? \\n\\nNelson 32:54\\nYou would work for the cooperative utility. 
\\n\\nLeif Albertson 32:57\\nOkay, so I don\\'t work for the city or the tribe? \\n\\nNelson 33:02\\nwell, this model, just so you know, is not the I mean, there\\'s a project right now to develop this. So it\\'s not all sorted out, you know, it\\'s a two to three year process to try to establish how this cooperative utility would work. And, you know, they\\'re working with every community to make sure that this would be something they\\'d be interested in. You know, it\\'s, you know, it\\'s similar to like the tribal compacts, I mean, ANTHC for one, ARUC, you know, it\\'s similar to AVEC, where you\\'re kind of bringing all the utilities together to have combined purchasing power, combined training. Similar systems, all that.\\n\\nLeif Albertson 33:45\\nI think there\\'s, there\\'s economies of scale, that kind of I mean, are make everything cheaper. And I certainly know in in, you know, in my region, in the YK delta, just like you said, having pumps having the same pumps, you know, and having, it\\'s hard for an individual community to have an extra one of everything that they might need. But if you\\'re, you know, again, working with a half a dozen communities, then having a warehouse makes sense, all of a sudden, yeah.\\n\\nNelson 34:16\\nAnd I mean, YKHC could be that centralized cooperator. And to a certain extent, you know, the OEH, well, you know, you know, more than me about this, Leif. Sorry, sorry, Lauryn. This is why like, Leif is in the middle.\\n\\nLeif Albertson 34:34\\nI do know, but could you tell me?\\n\\nNelson 34:37\\nYou know, like, YK has a structure where they could be the centralized cooperator for a centralized cooperative utility, and they already to a certain extent with Lefferts and Bob White, are doing that.\\n\\nLeif Albertson 34:50\\nOkay, well, that seems that seems promising. I want to we\\'ll jump back for a second, but I do want to talk a little bit about operators. 
You work with operators, you talk with operators, you\\'ve mentioned earlier about designing systems, hopefully that would be easy to operate. But we also talked about how sometimes we\\'re short operators, and we can\\'t. What are the challenges about training operators? You know, you mentioned salary, are there other barriers?\\n\\nNelson 35:21\\nYeah. Well, you know, a lot of water and sewer infrastructure, you know, at a water plant is, you know, it\\'s, it\\'s, it\\'s complicated, it\\'s complicated. You know, it requires a person that is very mechanically savvy. But also, you know, also willing to, you know, work with computers and remote monitoring systems, I mean, it can be it, there can be a lot to that job. And sometimes the community will have an operator position open. And, you know, because of lack of employment, somebody that really does not have the appropriate skills, gets that job, doesn\\'t have the correct capacities, I would say, to run a complicated treatment plant or even a pump station. So I mean, I\\'m trying to hit on another, you know, just like capacity for the right type of person, is another challenge. Yeah, yeah. Beyond salary.\\n\\nLeif Albertson 36:24\\nI think a lot of a lot agencies and organizations probably feel that in it in a general sense. Something that we\\'ve heard from other people we\\'ve interviewed was about the kind of the testing and the, you know, the certification process around operators, and that, and that, that has been that\\'s been challenging, right to to get not just get people hired, but then also get them trained. \\n\\nNelson 36:54\\nYeah, no, there\\'s definitely people but there\\'s definitely certain people that don\\'t like to take tests too. And that\\'s a learning environment that doesn\\'t really work for certain people. You know, the certification levels and to go through and be certified, you have to be pretty committed to like making this your career path. 
And a lot of people that just you know, pick up an operator job just want a paycheck for a while.\\n\\nLeif Albertson 37:21\\nJust you mean committed because the amount of time that it takes?\\n\\nNelson 37:26\\nThe time and the studying and the investment of your mental capital to achieve that certification.\\n\\nLeif Albertson 37:38\\nWe\\'re gonna jump to some other questions. But before I do, Lauryn, am I missing anything? You have any?\\n\\nLauryn Spearing 37:45\\nI don\\'t think so. You\\'ve asked a lot of the questions I was going to.\\n\\nLeif Albertson 37:51\\nI did want to hit on one thing specifically, you know, so weather is an issue, right? Heaves in the alluvial soils is a particular challenge. Is there? Is there a time of year where we have more infrastructure challenges? I\\'m assuming winter, but you know, Is that Is that true? Is that a good assumption?\\n\\nNelson 38:20\\nYeah, I don\\'t know. I guess winter would be, um, you know, a lot of our systems are above ground in western Alaska, and they\\'re very susceptible to freezing. You know, all the all the circulation systems and the heat trace systems and all that. I mean, they\\'re, they\\'re, they\\'re well designed, but they\\'re fragile still, you know, if the boiler at the water plant goes out, and we lose the ability to heat the water and sewer with with the glycol, that\\'s gonna lead to freeze ups. And you know, there\\'s a certain amount of like homeowner responsibility with keeping their water and sewer service from freezing up, you got to remember to turn the heat trace on got to remember to have the circulation pump on and these you know, these little things if that sounds really simple, just remember to turn on your heat trace when it gets cold. But you know, us here in this urban environment of Anchorage and Fairbanks and Austin, we don\\'t have to remember to go downstairs and turn on the heat trace when it starts getting cold. Or maybe you\\'re out hunting. 
So, you know, there\\'s, there\\'s, there\\'s ways we can design but you know, there\\'s fragility in these systems, you know, they have to circulate, they have to be heat traced. And you know, when you have a water main freeze, it doesn\\'t just affect that one person that\\'s closest to the freeze it affects the entire the entire network, so the whole system will go down. And then, you know, sometimes you know utility isn\\'t even able to get it back once it freezes, because it\\'s too much to get it thawed. Yeah, you know, so winter- clear challenges with the climate, more ground movement happens on the shoulder seasons where you\\'re having the breakup and, and, and the frost action. So ground movement would be greater in the shoulder seasons. And then the springtime often presents another unique challenge with the amount of infiltration and inflow that happens in the sewer systems, you know, so that\\'s all the meltwater from the snow. That\\'s all the melt water from the snow going into the lift stations and, you know, causing failures. We see this a lot in Kotzebue, where the storm drain system isn\\'t working well. So the whole community gets flooded out. Flooding, yeah, that\\'s another big issue. Boy, like as I\\'m talking Leif, I realize that there\\'s kind of challenges at every single season. And then another another challenge is, you know, drinking water plants that have their water source as a surface water source will often see like an algal bloom in June, July, when there\\'s highest amounts of photosynthesis. And that presents a very unique challenge to some communities where you have very compromised raw water.\\n\\nLeif Albertson 41:24\\nOkay. So so that\\'s looking at kind of seasonal change. If we go back a little bit, have you in the time you\\'ve been working on this? Do you think that there? 
Have you seen sort of evidence of climate change making things harder in general?\\n\\nNelson 41:44\\nWell, okay, I guess two things to point out when it comes to climate change and actual observations. So we\\'ve had, we\\'ve had the unique pleasure, unique opportunity to drill like at many different locations in Bethel for buildings, and water and sewer pipelines and all that. So we\\'ve observed the permafrost depth diving in the YK delta. And the YK Delta in particular, the permafrost is not very good, because the historic annual average is right at about freezing. You know, you go up to Kotzebue, or whatnot, the average annual temperature is 5 or 6 C below freezing. So their permafrost is more reliable. But we\\'ve we\\'ve seen the permafrost diving in Bethel area, you know, where it\\'s getting deeper and deeper and more fragile. So that means deeper and deeper piles and deeper and deeper piers for all these supports, because we need to get it into the permafrost for it to be founded. So there\\'s that. Then the other thing I would say is, there\\'s been a marked increase in disinfection byproducts at water plants. So we you know, we think this has to do with just increased surface water temperatures, basically providing for faster kinetic rates for the chemical reactions to happen that form disinfection byproducts. Two kinds of climate change observations that I\\'ve seen.\\n\\nLeif Albertson 43:29\\nThat\\'s, that\\'s really good insight. Yeah. So we, I mean, we\\'d heard some about like, erosion and that sort of thing. And then you know, melting permafrost. So we had a water plant, and you know, all of a sudden it starts sinking. That\\'s, that\\'s bad news.\\n\\nNelson 43:42\\nWell, definitely. And, you know, a lot of Alaska\\'s villages are constructed on rivers, and they\\'re constructed on, you know, coastlines, where they were always going to be erosion. 
So sometimes it\\'s harder to really point the finger definitively to say that climate change is responsible for that. Because, you know, rivers have been eroding since the beginning of time. But, you know, it\\'s, yeah.\\n\\nLeif Albertson 44:13\\nYeah. Okay. Yeah. But I mean, drilling soil samples. And I mean, that\\'s, that\\'s pretty. Pretty concrete. Yeah. So kind of with a lot of these things in mind that we talked about if you were working with a community that was wanting to develop water infrastructure, right, move along that spectrum that we talked about. How do you think your way through that process? To make sure that you were doing all the things that you talked about and setting them up for success?\\n\\nNelson 44:49\\nYeah. Well, hopefully your community has some strong leadership that\\'s committed to the idea because it takes a long time. You know, just because you want a piped water and sewer system doesn\\'t mean it, it\\'s going to be there next year, you know, it takes leadership that\\'s very committed to the idea of it, and somebody to chaperone it through the whole process. You know, starting starting at the planning level at preliminary engineering reports and master plans that really kind of engage the community early on, to really come to a solution that is adequate for that community. A lot of the times if these decisions and the stakeholder engagement doesn\\'t happen very early, it won\\'t happen. You know, once a consultant or a firm has a contract for a design, it\\'s very specific, this is what we\\'re going to do. And then when it comes to construction, it\\'s even less flexible. So I think the master planning and the preliminary engineering reports and that front end stuff is really the best time for the proper community engagement. 
Because once you get further into the process, it gets harder and harder to make changes.\\n\\nLeif Albertson 46:16\\nYou know, I know we\\'ve hit on some of these themes before, but I guess, if you were, if you were going to change something, if you could fix one thing, to hopefully improve the situation of the challenges we\\'ve had with water delivered infrastructure, what would you change? Where would you attack the problem? You know?\\n\\nNelson 46:38\\nOh boy. I don\\'t know, man, gonna draw a blank when it comes to one thing. Hmm, well. Sorry. This one\\'s a little off guard about one thing.\\n\\nLeif Albertson 47:01\\nSo let me let me let me frame it in another way. You had mentioned early on money, right? That, you know, it\\'s money if we had if we had more money. So if you had money, if you had all the money, then what would you what would you do with the money? What\\'s the best way to spend that money?\\n\\nNelson 47:21\\nAh, see, I\\'m going to answer that question by not answering that one either. I don\\'t think that money is the one thing that\\'s needed. I mean, obviously, that is a big thing, because you got to build all this infrastructure. But maintenance and the ongoing upkeep is, is why that $2.3 billion, that\\'s going to drift into Alaska in the next five years, we\\'re going to be repairing systems again, in 25 years, you know, the ongoing operational and maintenance. Um, that\\'s, that\\'s the biggest challenge. I think it\\'s not the climate, it\\'s not the money, it\\'s the ongoing operation and ownership of these systems. So if I were to change one thing, I would get, you know, people and communities all over the world to think about water as, as a more precious resource, and something that is of greater value. So is this an education thing, where, you know, we would celebrate being the water treatment plant operator, or the sewer plant operator, because they are, you know, contributing to society at a very high level. 
But right now, I mean, it\\'s an $18 an hour job to fix sewer lift stations, it\\'s not sexy, it\\'s not, the attitude towards water and sewer. If I were to change anything, that would be what I would change to make it more heralded, celebrated. Make it as important as it really is to public health professionals. \\n\\nLeif Albertson 49:00\\nThat\\'s a good answer. I like that answer. \\n\\nNelson 49:04\\nSo how do you do that? Well, lots of education. I don\\'t know, change the culture, I guess.\\n\\nLeif Albertson 49:11\\nYeah. But I think, you know, that is a question that I\\'ve asked, I think repeatedly as we\\'ve been doing interviews is, you know, is, is this is just, is this just a money problem? Right, like, if we, if we had, you know, all the money we needed, would it solve these problems? And, you know, so it\\'s, it\\'s, I guess, it\\'s interesting to think about, as, you know, part of when when you were describing that was, you know, it\\'s a people issue, too, right. All the money in the world doesn\\'t matter if you don\\'t have the what the operations and the maintenance is unique, you know?\\n\\nNelson 49:45\\nYeah, I mean, the money is great. And you know, and personally, it\\'s going to be very good for my business. I mean, we\\'re going to be building and designing water and sewer systems all over the place. So, you know, it\\'s good for a lot of people in the business but I don\\'t see it as the silver bullet that this $2.3 billion is going to fix all the problems.\\n\\nLeif Albertson 50:08\\nOkay. I want to take a little pauses where as our time is getting short here, Lauryn, do you have any any questions you want to jump in with?\\n\\nLauryn Spearing 50:18\\nI don\\'t think so. So I guess just one last question would be, do you have any other insights about kind of specifically that operator training piece? 
So a little bit more background about the project, that\\'s kind of the direction we\\'re going is trying to kind of bring together some of this, like, basically figure out a way to do operations a little bit better, and support operators in actually doing their job, right, maybe not passing the certification test, but do you have any additional thoughts on that?\\n\\nNelson 50:57\\nYeah, well, if we could pay operators more, if there was a way to do that, I mean, I think it would be a way to, to recognize their contributions, and it would be a way to hold on to them longer. You know, in my, in my world, I can be pretty frustrated where even with a big community like Bethel, where I\\'ll fly out there. And it\\'ll you know, it\\'s a it\\'s a revolving door of operators. So like, I don\\'t ever see how they can develop the skills they need to do their job effectively. You know, we\\'re just we\\'re going through the exact same thing again, right now, where Caleb Sleppy (sp?), the leader of the utility management group is leaving so I don\\'t know who\\'s going to replace him. You know, so um, yeah, trying to get back to your question about what could be done to be more effective. Um, you know, bringing in bringing in the operators to Anchorage and Fairbanks is kind of a double edged sword. A lot of the times when they come in Anchorage, there\\'s trouble that is, there\\'s there can be trouble. I\\'ll just leave it at that. Leif, you can connect the dots later if you need to.\\n\\nLeif Albertson 52:11\\nI will connect the dots later. Yeah.\\n\\nNelson 52:14\\nBut they come in. I mean, if they\\'re in the community of Gambell, you know, knowing how to operate that filter in that, that, that peristaltic pump or whatnot. I mean, like training at their facility would be, I think, more advantageous than bringing them in to a classroom setting in Anchorage, or even Bethel. So, you know, maybe one critique is that it could be more community by community. 
I generally just think that operators need to be celebrated more and paid more, though.\\n\\nLeif Albertson 52:55\\nAnother question on that, that that has come up is, you know, the certification process right that that this is a national certification, and it might not be a very small amount of that might apply to what\\'s happening in Ambler. Right. So how do you what do you? Do you feel like that it\\'s important that everyone has a standardized training versus you know, trying to break that up and not have standardization but really maybe somebody would only be able to know their system in Ambler.\\n\\nNelson 53:30\\nYeah. Well, that\\'s an argument for a cooperative utility where you know, you have 13 utilities that all look the same where you know, if you know how to run the Ambler water plant you know how to run it in Shungnak and you know how to run it in Kobuk You know, so you could take that operator from Ambler and bring them over to Shungnak and we all have very similar water plants. So that knowledge would be transferable back to that community.\\n\\nLeif Albertson 54:00\\nBut right now, I mean, the pathway that we put water plant operators on is to train them to run a water plant in Chicago.\\n\\nNelson 54:06\\nRight it\\'s very standardized and it\\'s in a classroom usually in Anchorage so I yeah, I don\\'t know if that\\'s the best setup for success.\\n\\nLeif Albertson 54:16\\nThat\\'s that\\'s something that has been discussed in the past. The State of Alaska had their own certification, right. And then we went with a national standard.\\n\\nNelson 54:28\\nYeah. Yeah. Some big some big challenges here. Takes a lot of great minds.\\n\\nLeif Albertson 54:38\\nWell, so this has been great. What should I have asked you? What did I miss here? Anything else you want to add for us? Don\\'t know, um, you know, what is the average salary of a doctor? What do you think 250? 
Depends on the specialty but yeah, I would say,\\n\\nNelson 55:00\\nI mean, just like a public health doctor, you know, family practitioner something\\n\\nLeif Albertson 55:04\\nnorth of 200. Right. \\n\\nNelson 55:06\\nYeah. And, you know, here, you know, water and sewer people are public health professionals. And why the basic salary on a water plant job is 50 or 60. You know, they are public health professionals too, and I think they should be treated as such. But I feel like culturally, we\\'re not doing that right now.\\n\\nLeif Albertson 55:33\\nThat\\'s a good point. Yeah, that\\'s true. I think you can make a pretty good case for clean drinking water and health. We have some evidence of that. There\\'s been some studies. \\n\\nNelson 55:43\\nThere\\'s tons of studies. That\\'s, if there\\'s if there\\'s any, any one thing that\\'s responsible for the advancements in life expectancy, it\\'s water and sanitation systems. That\\'s the biggest jump in like, life expectancy, essentially.\\n\\nLeif Albertson 56:01\\nI think that in this state, too. There\\'s great examples of that. I mean, even you know, in, in the last 50 years, when you look at, you know, I mean, it\\'s made a huge difference in this in this state.\\n\\nNelson 56:13\\nYeah. I wonder if Tom Hennessy is still kicking around? The to go with? So yeah,\\n\\nLeif Albertson 56:18\\nhe is. So he retired, and then they pulled him back in. So he\\'s doing some stuff with the university now with One Health. I bump into him every once in a while. And of course, you know, it\\'s been kind of all hands on deck for COVID stuff for the last two years. But yeah, I usually, I usually bump into him.\\n\\nNelson 56:37\\nOkay. Well, he was a guy that could cite those sorts of improvements. very readily at his fingertips. He had examples of introduction of a water and sewer system and what effect it had and\\n\\nLeif Albertson 56:53\\nwell, great, I guess, you know, if, let\\'s see, thank you so much for your time. 
If there\\'s anything that comes up later, do you mind if we contact you, again, to clarify or ask additional questions? Okay. One thing that we might be looking for is pictures of water systems or even, you know, I don\\'t know, reports, or information, training manuals, that sort of thing. For water systems, so we might reach out again on that and see, I don\\'t know Lauryn what the best way to do that is, but just, okay. If we call you Yeah.\\n\\nNelson 57:38\\nAll right. Well, no, I\\'m committed to the profession. You know, so I\\'m happy to help fellow practitioners, so and people that are trying to make a difference in this. So yeah, no problem.\\n\\nLeif Albertson 57:50\\nYep. And so we\\'ll be in. I\\'ll be in Bethel regularly, but the Texans will be in Bethel next week, right. And then be coming up again, probably in June. So we\\'ll be back and forth. So yeah, maybe we\\'ll talk more.\\n\\nNelson 58:08\\nI was debating a trip out there next week. What day what days? Were you thinking? Just curious.\\n\\nLauryn Spearing 58:16\\nLet me pull it up. We fly into 20. Yeah. So the end of next week, we\\'ll be in Bethel. So let us know if you\\'re there.\\n\\nNelson 58:26\\nYeah. I was sort of thinking earlier in the week, but I don\\'t have anything booked yet.\\n\\nLeif Albertson 58:32\\nWe\\'ll be there at Saturday markets. 23rd. Doing some surveys, not like this. Not hour long interviews, but just brief surveys with people who live there people who use water, right. Just getting some basic information on what people think about their water and what\\'s important to them and that sort of thing.', '1_4_InterdependenciesNNA': 'Leif:\\tAll right, so, just some background questions: my understanding is you drove truck for Clyde, right? 
You\\'re a water truck driver, or water and sewer?\\nRobert:\\tYeah, I started driving water truck for the city of Bethel back in 1994.\\nLeif:\\tWow.\\nRobert:\\tI drove there for nine years, and I moved out of Bethel and joined up the Teamsters and drove truck for many years with Teamsters. Twice, I went out to Bethel to go work in winter because, mainly, I\\'d be out there in winters. So, I\\'d go out to Bethel and work for just winter and help them out, help the city of Bethel out. Soon as springtime would come, I\\'d come back and get back to driving a concrete truck or whatever job I\\'d land.\\nLeif:\\tOkay, and, are you from Bethel?\\nRobert:\\tYes, I am from Bethel.\\nLeif:\\tBethel\\'s home, like Bethel Regional High School and whole deal?\\nRobert:\\tOh, yeah.\\nLeif:\\tAh, okay.\\nRobert:\\tYeah, I was born at the old hospital 1975. It\\'s kind of where that new ... not the new hospital, but the older new hospital where they\\'ve got the dental. That\\'s where the old hospital site was. They tore that one down years ago ... asbestos. I think they tore that one down in \\'91 or \\'92.\\nLeif:\\tWell, dental moved back across the street, though. You mean the admin building is where the old hospital was. Is that what you mean?\\nRobert:\\tYeah, that was the old original site for the hospital.\\nLeif:\\tYeah, we got a new new hospital.\\nRobert:\\tI seen that last winter when they were still working on it. I was there last winter.\\nLeif:\\tYou got a commercial drivers license, obviously.\\nRobert:\\tYes.\\nLeif:\\tWhere\\'d you get that? Where\\'d you go to school for that?\\nRobert:\\tI got my CDL right there in Bethel back in \\'93. It was the year my daughter was born. I got it when I was ... no, the year after when I was 19. I just immediately started driving for the city, driving water truck, delivery water trucks, food trucks and garbage truck.\\nLeif:\\tSo, now that\\'s done through Yuut, right? How did you do it then? 
Who was doing that training?\\nRobert:\\tThere was no DMV open at the time, so basically, I called up Juneau and they sent me the test via mail, and I just took all the written tests. Back then, they had two different types of CDLs. They had a commercial vehicle license for on-road systems, meaning interstate, and off-road systems, which was being in isolated locations. Bethel, only way in and out is by plane or by boat, so that would consider under being an off-road system so you could drive commercially off-road system. But, it was about 10-15 years ago they switched all to: you had to have interstate, so just one type of CDL.\\nLeif:\\tBut, you started with the off-road?\\nRobert:\\tYes.\\nLeif:\\tDo you think you would have ... What would you have done if that wasn\\'t available? If that wasn\\'t an option, do you think you still would have?\\nRobert:\\tI still would have. I had a baby at the time, and working for the city was one of the higher-paying jobs. It wasn\\'t paying much. When I started there, I was making $14.10 an hour as to, when I was working at the airlines, I was working at Yute Air and ACE Cargo. I worked two jobs at the time. And, as soon I found out was pregnant with the baby, I was working two jobs. I\\'d go to work at Yute Air at 7:00 in the morning. I\\'d get off work, go straight over to ACE Cargo, work until like 5:00 in the morning, go home, sleep for an hour, hour-and-a-half and start my day all over again. I\\'d seen that city of Bethel was always looking for drivers. They have a high turnover rate for drivers, so I went and got my CDL and I applied for a job and got the job.\\nLeif:\\tSo, how much would you say of your training to do that was from getting your CDL, like official training, and how much of it was on-the-job training where they taught you what to do?\\nRobert:\\tPretty much everything, all on-the-job training. Nicolai Philips was the one that trained me, him and another guy that passed away. 
We called him [inaudible] Fritz McCarr. He passed away, but those were a few guys that basically taught me how to drive heavy big trucks. All the water and sewer trucks are all automatic, so it was pretty easy.\\nLeif:\\tAnd then, you went and you drove, so at some point, you had to get an interstate CDL.\\nRobert:\\tI moved out of Bethel in 2001, moved out to Fairbanks. I was still driving with my off-road ... Sorry about that. I just had a phone call.\\nLeif:\\tOh, okay.\\nRobert:\\tI got pulled over for a routine checkup, what they do on the interstate. They actually arrested me, said I was driving illegally. I was like, \"What do you mean? I have my CDL.\" They said, \"No, you have an off-road system CDL. You\\'re going to have to get an on-road system CDL.\" So, I had to park the truck right there where it was at. They didn\\'t arrest arrest me. They just ... They give somebody time to go take the test, the driven test in order to have your on-road system CDL. So immediately, like two days later, my work, the garbage company I was working for had somebody drive me over to ... We made an appointment and went to the DMV. I took a road test with the garbage truck and got my Class B. Later on, I ended up renting a truck to pull doubles and triples so I could get my Class A rating.\\nLeif:\\tI\\'m just thinking about, today, it seems like there is always a shortage of drivers. Do you think having that steppingstone of an off-road license would be helpful if that existed today still versus people having to jump all the way up to the interstate?\\nRobert:\\tWell, it was helpful being out in the village. Well, Bethel\\'s a village or city, but, what defines a village is only access by boat or by plane. There\\'s no road system that drives to Bethel, so, technically, it\\'s a village, but it\\'s a city. But yeah, it was really beneficial. 
Almost all the drivers back in the \\'80s and \\'90s were off-road system CDL drivers.\\nLeif:\\tDo you have a sense of why that went away, why that option went away?\\nRobert:\\tJust the state making it ... Instead of having difficulty between going back and forth between an on-road system to off-road system, they just opted to get rid of the off-road system and just had all on-road system. So, in order for you to have your CDL, besides the Yuut school, prior to that, you had to come to Hickory. You could take all the written tests, but then you\\'d have to come to Anchorage, rent the truck in order to get your on-road system CDL.\\nLeif:\\tWow, okay. And then, has Yuut been a good pipeline? Have they been bringing people to the city, you think?\\nRobert:\\tNo, actually, I don\\'t know what happened. I had moved out when the Yuut school got started there in Bethel, so, when I went back to Bethel like five years ago to drive, they were doing the Yuut school. It was like, \"Oh, wow, that\\'s great that they\\'re actually doing ... you know, helping people get their CDL commercial drivers licenses out there in the bush.\"\\nLeif:\\tWhen you looked around at the other drivers, did they come through Yuut? Is it working?\\nRobert:\\tYeah, there was quite a few ... When I went back four years ago, three of the new drivers were through Yuut. And, get this; there was drivers that were there that I trained back in the \\'90s still driving there.\\nLeif:\\tWow.\\nRobert:\\tThat year, when I went there, Clyde was the driver at the time. I had just met Clyde that year. He was a driver back then. Yeah, Nicolai Phillips was still there, Mike Mendenhall. Mike Mendenhall is one of the guys I trained. Daryl Brandon, he\\'s another guy I trained. Robert Taylor, he was there. He\\'s a guy I trained ... Bernard Sam, another guy I trained. These guys, I trained 20 years ago. 
They\\'re still driving out there.\\nLeif:\\tCan you describe a little bit what your day looked like?\\nRobert:\\tWorking for the city, we started at 7:00AM and we got off at 3:00. The city was a lot smaller back in the \\'90s before they started ... They just started building up Kasayuli subdivision and making it a little bit larger. We did the city subdivisions. Before that, it got all piped, so we had to ... We had deliveries, water deliveries out to there. And, we only had one pump house where we\\'d get our water from. Generally, we had four or five water truck drivers and four sewer truck drivers, one garbage truck driver during the day. Water truck drivers, they\\'d average four to five loads, you know, 3,500 gallons of water delivery a day. The sewer truck drivers, sewer truck is a lot faster. They\\'d do like eight or nine loads a day pumping out the holding tanks. Then, the garbage truck dealer, he just went around, did all the garbage dumpsters all throughout the city.\\nLeif:\\tThinking about delivering water, did you have much interaction with people, homeowners, just citizens?\\nRobert:\\tOh, yeah, I had a lot of interaction with every. It\\'s a small town. You get to know everybody, you know? A lot of people are really happy to get their water. Some people complain. Being a driver, we\\'re constantly ... On average, the trucks, a driver delivered to 40-50 houses a day, so that\\'s 40-50 driveways that we\\'re backing into, going into every single day. We\\'re constantly going up and down the truck stairs. You lose a lot of weight doing that job just constantly getting in and out and dragging hose from house to house to house.\\n\\tAlso, being city water truck drivers, we were automatically put into the volunteer fire department. 
So, when there was a house fire, all of us, in the middle of the night, us truck drivers, we\\'d get calls from whoever our foreman was, or we\\'d get calls directly from the fire department, and we\\'d go get our water trucks and go meet up with the fire department for the house. They\\'d usually set up a swimming pool and we\\'d just dump all the water so they can suck in and fight the fire, the house fire.\\nLeif:\\tI\\'ve set up that swimming pool many times. I know that drill.\\nRobert:\\tMe being a trainer, I was one of the few people that actually had the keys to the city shop. I was the number one guy they\\'d call right off the bat when there was a fire. I lived on First Road housing, so it was close by. I\\'d be there at the city shops within three, four minutes. I\\'d go in there and I\\'d start up all the water trucks and do all the pre-checks, get them ready. Sometimes, I\\'d park them outside and just leave them running for when the drivers got in. They\\'d hop in the truck and go over to where the fire was.\\nLeif:\\tI guess, just to jump back a little bit for perspective, you grew up in Bethel. Do you remember before there was sewer trucks? I wasn\\'t here then, but with the honey buckets and the-\\nRobert:\\tI was a little kid then. I don\\'t recall. I remember seeing the sewer truck, and it was basically a flatbed truck with the tank in the back. People would crawl in and out carrying five-gallon buckets for the sewer. The water truck, I think they only had one or two water trucks. Actually, First Road housing, all those old Asher houses probably still have a 300-gallon water tank inside the ceiling.\\nLeif:\\tOh, wow.\\nRobert:\\tYeah, so, when you talk to some of the old drivers ... Nick, he\\'d been there forever, Nick Philips. He was actually doing that for ... God, he\\'s been there 40, 30-something, almost 40 years driving water truck, yeah, a little over 30 years. 
So, he was doing all that back then.\\nLeif:\\tWas your interaction with people usually positive? You said sometimes they complained and sometimes [crosstalk]\\nRobert:\\tYeah, when you pull up to house and they have a vehicle parked in their driveway, or we can\\'t ... It made our job a whole lot easier for vehicles over 10 feet to just where we had space, because we were dragging hose or fighting dragging ... You worked volunteer fire department. You know how heavy those hoses get, dragging around obstacles and putting on ... We\\'re doing that for 40-50 houses a day, so, interaction was ... Most people, they\\'re always happy to see us like, \"All right, I\\'m getting my water! I just used up the last of my water doing our laundry last night,\" so, majority of people are pretty happy to see us. But, some people, they forget and they leave a vehicle parked, and we write them a tag and have to go back later on and make sure that they\\'ve got their water.\\nLeif:\\tYeah, so, I guess, do you think that water delivery has gotten better over time, where we are now compared to where we were? You got a lot of perspective. You\\'ve been in it a long time. Or, you had been there.\\nRobert:\\tYeah, when I came back, it was a little bit more difficult. They piped all those city subdivisions, which made it a little bit easier. But then, they got Kasayuli subdivision and more houses out in Larson subdivision, and that\\'s a lot further away from the pump houses, so there\\'s a lot more drive time. A lot of the trucks now, newer trucks, they\\'ve got the DEF fluid they have to regenerate, so there\\'s a lot of downtime where, every couple of days, your truck would have to sit there for an hour and regen, clear out all that exhaust. That just made it harder. Back in the old days, it seemed like the trucks pump a little bit faster, too. We\\'d pump 100-120 gallons a minute. The new trucks, they\\'d only pump 70-90 gallons a minute, so the water fill was a lot slower. 
It\\'d take a lot longer filling up a house as to ... On average now, the trucks are only doing maybe four loads. I used to do five loads in an eight-hour day, same size tank, 3,500 gallon tank with the water trucks.\\nLeif:\\tDo you think that\\'s an issue with the pumps going slower or the houses having bigger tanks?\\nRobert:\\tWell, it\\'s a combination of both. But, having it pump slower, you\\'re spending, at a 1,500-gallon tank where it used to only take us 15 minutes, you minus off 10 gallons of minute, every 10 minutes is an extra minute. All those little one-minute extras, or a minute-and-a-half, all that adds up towards the end of the day. It\\'s just a big snowball because, the slower you\\'re taking to fill up a house, the more people are using their water, so it\\'s another snowball effect because you\\'re slower filling up this house, slower filling up this house, this house and this house. By the time you get to the end of the houses, they\\'d used an extra 100 gallons, 200 gallons more of their water, so we\\'re spending a lot more time doing that.\\n\\tWhen I\\'d gone back, for slower trucks, slower pumping trucks and slower ... The regenning and just having to drive further, a lot of drivers, we\\'re not getting done in eight hours. They\\'re constantly going in and out, so you\\'re getting hot and cold, hot and cold, hot and cold. You\\'re more susceptible to be getting sick because, you get inside the truck, you start sweating. You go outside, you get cold. You get chilled, and all that plays into factor. So, we\\'d have drivers that get sick, and you\\'d have to take time off to rest up. It\\'s a very hard job.\\n\\tWhen I went out there four years ago to drive, I was doing, average, 12-14 hours a day. We had vehicle truck shortage. They had some managers that weren\\'t keeping trucks ... constantly refreshing trucks because the truck wears out. Everything wears out. When I went back after ... How many years was that, like 17 years? 
They were still driving the same trucks from when I left, and those were constantly breaking down. There was a few times where trucks, they only had like three water trucks. So, what the city was doing was they\\'d have drivers doing double shift. They\\'d have one driver ... A truck would run basically 24 hours. So, the trucks would go 24 hours. They finally ... Bill, the current manager, water sewer department ... not water sewer department, but-\\nLeif:\\tPublic works.\\nRobert:\\tYeah, public works. He got in there, and then Clyde got in there. He was like, \"Hey, we need some upkeep. We need some new vehicles.\" So, having the new vehicles and fewer running trucks helped, but they\\'ve still got to all regen. But, that cut down a lot of the overtime.\\nLeif:\\tYeah, thinking about Kasayuli, Kasayuli\\'s a subdivision, and it\\'s a little far-flung.\\nRobert:\\tOh, yeah.\\nLeif:\\tSo, even if we had more trucks or drivers or better trucks or bigger trucks, if they keep building houses farther and farther away from the water plant ...\\nRobert:\\tYeah, and there\\'s also probably ... When it gets cold out here, it gets really cold. Our water hoses actually freeze. Driving out to Kasayuli, our water hoses and nozzles would freeze solid right there. Right when you\\'d get up to the first house, they\\'d frozen. A lot of the times, it happens before we get out to Kasayuli sub, you know that one dumpster as you turn in?\\nLeif:\\tMm-hmm (affirmative).\\nRobert:\\tMaking a right turn, the dumpster right there is a little open spot. 
What I\\'d do is I\\'d stop right there, turn on my pump and I\\'d start spraying out water for about a minute or two just to get some fresh warmer water so it doesn\\'t freeze by the time we get out to the next house.\\nLeif:\\t[inaudible]\\nRobert:\\tSo, we\\'d have times where stuff would freeze up and we\\'d have to drive all the way back to the shop, bring it into the nice warm shop, get it thawed out, get the water going and start going at it again. The trucks now, they started addressing this problem last year, or past couple years. But, last year, they had trucks that got heaters in the back to keep those lines free of ice, keep them freed up so we were able to pump.\\nLeif:\\tYeah, so, you talked a little bit about people\\'s parking. You talked about hoses freezing. What other challenges, what other things prevented people from getting their water?\\nRobert:\\tOh, the weather. Back in the old days, we were a little bit more ... We didn\\'t have to drive as far. We\\'d work no matter what, you know, be out there in a blizzard. We\\'d be out there 50 below, blowing and be ... The old truck I used to drive, a 733, the Cat, it was one of the older trucks, and it didn\\'t have too much insulation in the cab. By the end of the day, I\\'d be literally covered head to toe in ice just from getting sprayed and whatnot. I had two sets of arctic gear, so I\\'d use one set one day. It\\'d be totally frozen, iced up. I\\'d put it in the dryer overnight, let it dry out and I\\'d used my next pair the next day, and I just alternated clothes because it was literally that cold out there.\\nLeif:\\tSounds rough. Dangerous? Anyone ever get hurt doing that?\\nRobert:\\tWe\\'d get frostbite every once in a while, but a lot of drivers are dressed well and we\\'re just used to it just like, \"Oh, a little bit of frostnip.\" We\\'d just deal with it. 
You\\'d do what you got to do.\\nLeif:\\tIt seems like, on the risk management side, I think about just getting in and out of an icy truck 40 times a day. Somebody\\'s going to fall [crosstalk]\\nRobert:\\tNo, not 40 times. You\\'re going down 40 times a day. You\\'re going back up another 40 times, so that\\'s 80 times. Then, you\\'re talking about the times you\\'re going to ... double stopping houses and going inside of the pump house. That\\'s 80, so you\\'re looking at at least almost 100 times a day that you\\'re getting in and out of the truck.\\nLeif:\\tYeah, I\\'d be worried about slip and falls.\\nRobert:\\tYeah, the old trucks, we used to have to climb up on top, open a little hatch. There was this really small hatch, fill up the trucks. It was pretty slippery.\\nLeif:\\tIt sounds like there\\'s been some improvements, then. The cabs are warmer and you don\\'t have to climb up on top of them anymore.\\nRobert:\\tYeah, but those do freeze up. That did freeze up on me when I was up there last year. The air-actuated Rams and stuff, they\\'d freeze up, open up the hatches, excuse me.\\nLeif:\\tHow about the driving? You mentioned weather in terms of getting sprayed with water and that or getting frostbite, but, driving, is that safe?\\nRobert:\\tDuring normal days, it\\'s all right, but you have blizzards. You have blowing snow, hard to see. But, as drivers, we\\'re constantly running the same routes over and over and over, driving the same streets over and over and over, so you get accustomed to knowing exactly right where you\\'re at right at the time. For instance, just like me driving on the Kuskokwim River, I drive all during the nighttime when we couldn\\'t see just by looking at the trees. I go, \"Okay, I know exactly where I\\'m at. 
I know which way to go.\" So, for a lot of drivers, it\\'s kind of like muscle memory, but brain memory.\\nLeif:\\tI guess you\\'re also describing somebody that has been doing that a while, right?\\nRobert:\\tYeah, for new drivers, it could be pretty daunting, I suppose.\\nLeif:\\tYeah. You named a couple of drivers that have been there a very long time, but it seems like there\\'s also concerns about turnover or filling positions.\\nRobert:\\tYeah, it is a tough job. A lot of people are not really meant to ... I train drivers and I\\'ve had them quit within a week just because it was like, \"Oh, this is too tough. This is not enough pay for the hard work.\" I\\'ve had other drivers that just weren\\'t meant to be drivers. I was a trainer for like seven years. I\\'d pretty much train all the new drivers. I was actually happy to see drivers still there that I\\'d trained 20 years ago.\\nLeif:\\tSo, staffing continues to be a problem? Is that true?\\nRobert:\\tYes, it\\'s a high turnover rate because it\\'s lack of pay. Having your CDL out there, if you had the option to work for, say, Crowley, I think their starting pay is like $30 an hour, whereas, the city of Bethel, I think their starting pay is $19 or $20 an hour. What\\'s going to want you to ... Where would you rather work? You work, make more money; you work not as hard, or make less money and work a lot harder? [crosstalk]\\nLeif:\\tWhat\\'s the answer to that? Why does anyone work for the city then?\\nRobert:\\tProbably for benefits. A lot of drivers just actually really do care about the people. When I was there, it was like, \"Nick, you\\'re still doing this? You should have been retired.\" It\\'s like, \"Yeah, I could have retired a while back, but I just love my job always helping people out.\"\\nLeif:\\tWe\\'ve heard that same sentiment I think before. 
When we talk about water utilities, I don\\'t think that there\\'s every place in the country where water utility folks would identify that, would say that as a primary reason.\\nRobert:\\tYeah, you\\'re doing a service for your community and you care about people. How would you feel if you were not be able to take a bath, yourself? I can\\'t take a shower because I don\\'t have no water. I can\\'t take a bath. I can\\'t wash my dishes.\\nLeif:\\tSo, if you could do anything, you can wave a magic wand to make sure everybody gets their water. You\\'d mentioned a couple things. What would fix the problem? What would make it so everybody got their water when they were supposed to?\\nRobert:\\tMore drivers and more trucks, more pay. Now, the problem with the city, now, is they\\'re so over-budgeted in water sewer. Well, I don\\'t know the full budget but, because of lack of drivers and all, they pay more out in overtime than they do in regular time, which kind of worked out great for me. I\\'d go out there. I\\'d work 10-12 hours, and then I\\'d have on-call and I\\'d work all throughout the night, which I didn\\'t mind doing at all because I\\'m there to make money, you know, to help people out. But, just have more trucks. Have more drivers and have better upkeep with the equipment.\\n\\tI could remember when they put in the second pump house. That helped out quite a bit, too, not having to drive nearly as far going out to Kasayuli subdivision and out to Larson sub. That helped out quite a bit. I was good friends with Billy at the pump house. He\\'s been there, God, since I was a little kid. I don\\'t think that guy is ever going to retire. He\\'s probably going to work there til he dies. He\\'d sit there. We\\'d have time, downtime, fill up the trucks and they\\'d talk to Billy. They\\'d say, \"Our pump house can\\'t exactly keep up with the water of Bethel,\" because Bethel was growing. 
Then, they built that million-gallon holding tank, so, during the evening time when there was less water usage, they\\'d have that time to fill up that million-gallon tank to try to subsidize for the usage of water. When they did the city sub, that helped out a lot, having more quality water, I should say, cleaner water because it takes time to filter out all the impurities of water. They\\'ve got big filters in there. They\\'ve got to backwash them consistently, clean them out just so they could have good, viable, up-to-spec drinking water.\\nLeif:\\tI want to ask you about water quality, too, but, just to jump back for a second, if the pay increased, would that solve the problems? Folks still got to get a CDL and they still got to want to do the job. Do you think that would solve the problem, or is there more to it? If you just set a number [crosstalk]\\nRobert:\\tIt wouldn\\'t exactly solve the problem, but it certainly would help. It\\'d give newer drivers and fresh drivers more incentive to want to work for the city of Bethel, to want to do this job [crosstalk] like I told before, that job is not exactly meant for everybody. I\\'m 6\\'2\". We have drivers that are 5\\'4\", and they\\'re struggling getting in and out of the water truck, constantly getting out of the truck, so that\\'s ... Like I said, the job\\'s not exactly meant for everybody.\\nLeif:\\tYeah, okay. Talking about water delivery, pipes, what do you think of pipes? If you moved back to Bethel, where would you live?\\nRobert:\\tWell, I was on a pipe water sewer. Back when Leon Treat was the pump house guy, he was Billy\\'s supervisor. The water quality was good even though it was all the hardships and challenges that they faced. After Leon had retired, they\\'d got another guy in there and he did it completely different. 
Of course, the pipes being 40-50 years old, what he did different was heating up the waters and sending it out through the pipes instead of heating it up inside the building and letting it chance to cool off before sending it out the pipes.\\n\\tSo, when that new guy came in, we were having a lot of problem with the water pipes. Now ... because, sending out heated water through the lines, what happens when you heat up water? It gets air bubbles inside of it, and those air bubbles ... What\\'s the main metal? What\\'s metal\\'s most nemesis? Air and rust. It causes rust inside the pipes. So, for a while there, for a few years there, people\\'s water boilers and their holding tanks and whatnot start filling up with rust because of the pipes that were on the pipe water and sewer.\\nLeif:\\tThe new pipes are plastic, right?\\nRobert:\\tYeah, but the old pipes [crosstalk]\\nLeif:\\tYeah, you were over in-\\nRobert:\\tFirst Road housing.\\nLeif:\\tFirst Road, yeah.\\nRobert:\\tThat\\'s [inaudible] yeah. I was in that house since 1976.\\nLeif:\\tIf you were coming back, would it matter to you if you were on pipes or hauled system?\\nRobert:\\tI\\'d rather be on the pipe system just for reassurance that I got water. If we had a blizzard and I had a holding tank and water truck was not able to get to my water fill because of the snow pileup or just not be able ... road conditions. I\\'d be out without water for a couple of days. I\\'d rather be on a pipe system just for reliability. There\\'s a lot of factors with getting your water as to just driving.\\nLeif:\\tYeah, that\\'s been my experience.\\nRobert:\\tYeah, so you understand what I\\'m saying.\\nLeif:\\tI\\'ve done it both ways, and, yeah, I enjoy the pipes. Especially, I\\'ve got kids now, so the water usage is a lot harder to predict, you know what I mean?\\nRobert:\\tPlus, you have a stuck toilet and you don\\'t even know about it. 
You drain off your water tank and it\\'s like, \"Hey, what happened to all my water?\"\\nLeif:\\tRight, yep. Yeah, so, how come there\\'s not more pipes, then? Why are we still driving so many water trucks?\\nRobert:\\tHave you seen the pipes in city of Bethel?\\nLeif:\\tI\\'m seeing them right now.\\nRobert:\\tYou see the pipes and the difficulty of having those pipes? If you go and drive through Third Road housing, you\\'ll see some of the pipes, you know, the frost heaves and putting these pipes way high up in the air. That\\'s part of the problem, but just having pipes going all throughout the city, the permafrost and everything in the ground ... Most places throughout everywhere else, they have the pipes underground. Bethel, being kind of unique in that aspect is having the pipes above ground. So, just imagine the cost of running pipes all the way out to Kasayuli subdivision, having to have those pipes. Not only that. People, during the wintertime, they\\'re traveling a lot with snow machines, so, getting out, going over pipes. They did build a lot of ramps over high traffic areas, but accidents do happen. People run into the pipes. I\\'ve seen trucks that run into pipes and start spraying and whatnot.\\nLeif:\\tSo, speaking of frost heaves, has climate change affected the ability to deliver water? You mentioned frost heaves and permafrost, but also driving in weather. Have you seen any ...\\nRobert:\\tI didn\\'t see no change. It seemed to be a lot easier when the weather was colder back in the \\'90s because it was a consistent. Now, the climate change is warm and cold, warm and cold, warm and cold, so the roads get a lot slicker. Back in the \\'90s and early 2000s, it was a lot colder, so we wouldn\\'t have that fluctuation in the warm and cold, warm and cold to where everything just turned to an ice rink. It would happen from time to time, but not nearly as much as it does now. 
I could recall sometimes where it got really warm; everything melted and froze up at night and just turned into polished ice. We\\'re just wearing out tire chains like crazy, everybody having to use ice cleats.\\n\\tActually, I had a pair of Bunny boots. I just took a stud gun from the tire shop and put a bunch of studs in my boots just so I could have constant traction because it would wear out ice cleats. Getting in and out of the truck with ice cleats, they\\'d always get stuck on the stairs, so you always had a tripping hazard. There\\'s a lot of challenges that a lot of people don\\'t see with driving a water sewer truck, just constantly getting in and out of the truck. It works on your body. I\\'ve been driving truck for, God, since I was 19, so, 20-some odd years, just a constant getting in and out of truck. I\\'m 46, and I\\'m starting to show signs in my knees of arthritis, that constant in and out, in and out, constant up and down.\\nLeif:\\tWhat do you think about the future? Are we going to keep driving trucks forever? Is this a sustainable thing? How should this be?\\nRobert:\\tEventually, probably not. I think, Bethel, everything is going to be pipe water and sewers. But, there still will be ... That\\'s hard for me to say. I do not know.\\nLeif:\\tIf you could make it that way, if you could wave a wand and have the money, would you want it to be that way? Does that seem like a better solution?\\nRobert:\\tYeah, that\\'d be kind of a better solution for the consumer, but as for the workers, no.\\nLeif:\\tBecause there\\'s less work, you mean?\\nRobert:\\tWell, if you have all pipes, water and sewer? What\\'d happen to the drivers that used to do that? No more job. You\\'d have to find other venues of work.\\nLeif:\\tReplacing grinder pumps, right? There\\'s still some work to do with pipe water, right?\\nRobert:\\tYeah, but that\\'s a plumbers position, not a drivers position.\\nLeif:\\tIs there a lot of work for drivers out here? 
You mentioned Crowley. Is there competition for the drivers that we have?\\nRobert:\\tDuring the summertime, there is. When construction season pops up, they have drivers out there. Mainly, yeah, it\\'s like Crowley, and city of Bethel is the only two major CDL drivers that I recall. During summertime, you have construction, road construction and whatnot that require CDL. So, those are pretty much the two options right there.\\nLeif:\\tYou said you were a trainer. You trained drivers?\\nRobert:\\tYes.\\nLeif:\\tSo, they would come to you with a CDL and then you would teach them how-\\nRobert:\\tYeah, they\\'d come to us with a CDL and I\\'d basically teach them how to drive a truck, you know, backing in, backing a big truck into people\\'s driveways. A lot of times, we\\'d have to get within inches of people\\'s houses and their vehicles. It takes a special type of driver to do that consistently without having accidents.\\nLeif:\\tRight. So, when someone gets a CDL, do they come to you as a ... Can they drive? Do they come to you in a useful way like they got a lot out of their CDL training?\\nRobert:\\tYeah, yeah, well, back when I used to be a trainer, we didn\\'t have the Yuut school, so a lot of drivers were locals. They just did the same thing I did. They got their CDL ... I mean their commercial off-road system via mail until they finally got a worker for the department of motor vehicles. Then, yeah, there was no driven test required for off-road system CDL. Now, it is required, so you have to actually drive a truck to [crosstalk]\\nLeif:\\tOkay, they come with some skills then.\\nRobert:\\tThey do now, but, back then, no.\\nLeif:\\tYeah, okay. What do you think was better? 
If we\\'re talking about staffing, we\\'re talking about getting workers on the road and getting water delivered, what do you think is better, having them have a CDL and then getting a smaller amount of training on the job or having them come with a off-road CDL and needing a lot of training on the job?\\nRobert:\\tI\\'d rather have them get a lot of training because there\\'s a lot of fine points with driving truck. A lot of people don\\'t quite understand backing the truck, getting within inches and being able to gauge distance from how much further I could go to where I\\'m able to reach to get the hose or required to fill up a tank or suck out the tank. There\\'s a lot of little tricks with driving, so I\\'d rather have somebody out here training consistently, teaching them how to excel at their job instead of having somebody coming in that\\'s like, \"Oh, I already know how to do it,\" and end up having somebody that\\'s backing in, crunching up houses or running over vehicles.\\nLeif:\\tOkay, so you\\'re seeing some real value from the statewide everybody-does-the-same-thing CDL. I guess the other part of that is there might be more people that would take the path that you did that wouldn\\'t maybe start out and get the whole standard CDL. They might not do that, but you might get a foot in the door. So, do you think you\\'d have more-\\nRobert:\\tOh, shoot, what\\'s going on over there? Yeah, I\\'d rather have them on a lot of on-the-job training. You get better one-on-one training instead of instruction by a book. You get to see firsthand how to do it, and then you have somebody right there showing you how to do it. It just slowly builds up your confidence. You have confidence that somebody\\'s like, \"Hey, no, stop. 
Okay, you\\'re doing good.\" It\\'s really helpful.\\nLeif:\\tI guess that\\'s what I was thinking about, is, it\\'d be better always to start with someone in more training, but you might not get as many people applying if you required them to have all that extra training. So, do you want a lot of people with a little training or a few people with a lot of training? That\\'s what I was thinking about.\\nRobert:\\tYeah, like I said, they had a turnover rate. Nobody really wanted to work for the city. There was times where we\\'d all be short-staffed and we wouldn\\'t be able to take vacations. Nobody would be able to take time off. A lot of the drivers were working six days a week. Back when I first started working for the city, I wrote up one of the schedules. I was a weekend foreman. We had regular foreman Monday through Friday. And then, Saturday, I was the foreman, henceforth the reason why I was a trainer.\\n\\tSo, when we finally get another driver that\\'d actually apply, I\\'d train them depending on their skillset. I\\'d train them for two or three weeks. Some drivers, I had to train for a full month just so I felt they were competent enough that they were able to do the job without causing damage to people\\'s property and making sure that they\\'re safe. It was a really hazardous job. You\\'re driving a big truck with a lot of weight that\\'s top-heavy. So, if you come around a corner too fast, you\\'re tipping up on the side. It\\'s a lot worse driving a concrete mixer, too. That\\'s what I\\'ve been doing here in Anchorage for quite a while, too, and I ended up becoming a trainer for driving a concrete mixer. I find it takes a special breed to drive a water truck, sewer truck. It just takes a special breed of person to do that. When you\\'re working one-on-one with somebody, you kind of get the feel, \"Okay, this guy\\'s going to be all right. 
This guy\\'s not.\" Having to have that on-the-job training really, really helps that aspect.\\nLauryn:\\tAnd, when you ... a followup question about the training: when you\\'re doing the training, what do you find useful to get these new truck drivers up and going?\\nRobert:\\tWell, for me, my personal experience: is this person paying attention that wasn\\'t afraid to ask questions? In a lot of areas, it\\'s you\\'re not afraid to ... I\\'d keep my mouth shut. After showing them a few times, I\\'d keep my mouth shut and I\\'d observe and see what they did. Then, I\\'d correct them on little things here and there, but have somebody that wasn\\'t afraid to try, but learning from what they were shown. So yeah, like I said, being one-on-one, it was a lot better. I get somebody, train them to do the job officiously and safely.\\nLauryn:\\tSo, with the CDL how it currently is, there\\'s a pretty hefty ... You have to rent a truck to take a test, right?\\nRobert:\\tYeah, that is now.\\nLauryn:\\tIs it expensive? Do you think that people who would be willing to do the job aren\\'t able to because of that restriction?\\nRobert:\\tI don\\'t think so. A lot of drivers now, they\\'re going to school like the Yuut school. There\\'s other apprenticeship programs like with the Teamsters. So, a lot of drivers now really do that out of pocket anymore.\\nLauryn:\\tOkay, so those are working? Yeah.\\nRobert:\\tYeah.\\nLauryn:\\tOkay.\\nRobert:\\tThat is working. Me, on the other hand, since I already had my off-road system CDL, I just did it myself. I had the Class B, and it was like, \"Okay, I want my Class A,\" so I went to rent a vehicle, a truck myself just so I could have more job opportunities instead of just driving a Class B vehicle.\\nLeif:\\tThis is super [crosstalk] the sewer guys are here right now working on the grinder pump right outside my window, so we\\'re talking about this [crosstalk] that red light\\'s been on all winter on and off. 
Then, my pump broke and they didn\\'t have any more pumps, so they\\'ve been pumping it out once a week. It\\'s been a mess.\\nRobert:\\tGot them on a lift station?\\nLeif:\\tYeah, it\\'s on the city sub, so they got their little round guys. It\\'s like a grinder pump comes out of the house and grinds it up and then pump it up a slight grade so that it can get to the main. It\\'s not a grade.\\nRobert:\\tSo a vac truck is right there right now?\\nLeif:\\tI can\\'t tell quite what they\\'re doing, but they got the lid opened up there. It sounds like they\\'re vacuuming it yet. We\\'ve been back and forth to Anchorage, so our friends call or they\\'ll text and say, \"Hey, your red light\\'s on,\" like, \"Yeah, we know.\" Let\\'s see, we\\'ve actually worked through a lot of questions. Some of the other ones ... Do you ever deal with the financial end of things with billing or people not paying their bills or people mad about-\\nRobert:\\tNo, that wasn\\'t us. We drivers, we just drove. We get guys that come out. It was like, \"Hey, I paid my bill.\" It was like, \"We can\\'t deliver your water. We can\\'t do your sewer until we get the say-so from so-and-so,\" so that was something that we were just like, \"No, I can\\'t. I just can\\'t.\"\\nLeif:\\tAnd, how about the routes, deciding the route, which houses get delivered in what order and which trucks go where to make it the most efficient?\\nRobert:\\tAll the drivers, we had set routes throughout the day. Mostly, some of the routes, we\\'d go out to Kasayuli twice a week or Larson ... Pretty much everything was twice a week. We split them up, and then we\\'d have different routes, like five water routes and ... Yeah, six water routes and five sewer routes. We just went by month-by-month basis, so a driver would do route one for a month. And then, next month, they\\'d do route two, then route three so the drivers would keep consistently on knowing exactly where the water tank is on each house. 
A lot of times during the wintertime driving a sewer truck, a lot of the cam ops are underneath snow. Driving a sewer truck, it\\'s like, \"Okay, I remember it being right here.\" You tell me a house number, I could tell you how big the tank was, how far the tank was, where was the tank at, what side of the house it is [crosstalk] going around, around and around.\\nLeif:\\tOne of the training issues that had come up before was filling up water tanks without overfilling them and the process for that. Has that been an issue? Is that part of training?\\nRobert:\\tYeah, it is part of the training. During wintertime, some people, they\\'ll plug up your overflow fill so it doesn\\'t ice up or whatnot. When the overflow pipe is not functioning, that\\'s when the house will get filled up or flooded out. So, a lot of drivers are trained with filling the tank. You start filling the fill. Then, you\\'d listen, literally stick your ear right to the overflow, listen inside the pipe to hear if that water is going. If it\\'s muffled or whatnot, then we stop. Then, we know that it\\'s not getting filled, or you can\\'t ... The chance of it overflowing is greater. A lot of times, they\\'d be packed full of ice or whatnot, so we\\'d bring a little chunk of metal and we\\'d just tap on the overflow pipe, you know, knock some ice out or some frost down. Then, we\\'d be able to hear. Some of the times, you just can\\'t tell, so, some houses would get flooded consistently til they got the problem fixed. It\\'s not really the driver\\'s fault, because you\\'re doing 50 houses a day, and you\\'re constantly ... Like I said, that job, it\\'s not an easy job at all.\\nLeif:\\tIt seems like, also, if you\\'re a customer, your service is going to depend a lot on the driver you get. 
If they\\'re willing to pull into your driveway, if the car\\'s too close, or if they\\'re willing to take the time to knock ice out of your pipe, or if they\\'re listening [crosstalk]\\nRobert:\\tThere\\'s some drivers ... You never know with each person. Say you\\'ve got a sprained ankle. I\\'m not going to want to have to drag my hose through a total obstacle course around two vehicles and pipes and garbage in the way just to get to your tank when, if you\\'d have your vehicle out of the way, it\\'d make it a whole lot easier for me. That\\'s part of the problem. Some of the drivers just would not do that, and there\\'s a lot of drivers that will. But, all the drivers that I\\'ve trained, it was like, Hey, man, you just knuckle up and do what you got to do to get it done. That\\'s some of it, but, majority of the drivers, they\\'re willing to work with somebody who will blast your air horn, go knock on the doors like, \"Hey, can you move your vehicle? I\\'ll go ahead and get another house til you get your vehicle warmed up and out of the way? Then, I\\'ll just come right back, or I\\'ll get you on the next load.\" So, there\\'s a lot of interaction there with a lot of people.\\nLeif:\\tYeah, that\\'s interesting. How about water quality issues with the trucks?\\nRobert:\\tThey\\'re really strict on the water quality with the trucks. The water, it has to be EPA standard, so pretty much every truck gets tested. The tanks get tested, and they\\'re constantly being tested. We do water tests at the pump house. They check to make sure whatever the EPA standard is, make sure it\\'s falling within that guidelines. The only time that we\\'d really have problems with the truck if it was broke down and it sat too long. We\\'d shock the tank. Basically, we\\'d empty out the water truck. We sterilize it, rinse it out a couple of times with fresh water. 
Then, we\\'d put it back to use again.\\nLeif:\\tSo, this meeting standards for safety, does anyone complain about other things: taste, smell, color?\\nRobert:\\tYeah, sometimes they do. It\\'s not the driver or the truck\\'s fault. It\\'s the water is coming out of the ground. It\\'s none of the drivers\\' fault at all. We have no control over that. We\\'re just there to deliver what we got. If you pump wells consistently, the well is eventually going to run dry and you\\'re going to start picking up debris. It is what it is.\\nLeif:\\tDo you think the water in general that you\\'re pumping ... We\\'ve definitely heard complaints about ASHA, the pipes over there. In general, if someone was moving to Bethel, is hauled water generally pretty good compared to for taste and smell and color? Should people happy with it or should they plan on having rusty water?\\nRobert:\\tIt\\'s hard to say. When I was on pipe water, I had no problem with that. I didn\\'t have no smell, no taste, nothin funny. It was only when I was on pipe water that they had the rust issue, you know, because of the rusty pipes, water becoming yellowish or brownish. That was back in the \\'90s and whatnot. I personally had Brita water filters and filtered my water for drinking water because just of that purpose right there.\\nLeif:\\tSo, I\\'ll take a little break here and check in with Lauryn. Do you have any questions stacked up for us there?\\nLauryn:\\tNo, I really didn\\'t. I had a couple, and then you asked them all.\\nLeif:\\tI think we\\'re kind of moving through our list. There\\'s some at the bottom here. Is there anything I should be asking you, Robert or [crosstalk]\\nLauryn:\\tActually, sorry, I thought of one. One question is: what do you think people from outside of your community ... What do they not understand about water in your community, in Bethel?\\nRobert:\\tEverybody else, they\\'re used to piping water and sewer, so they\\'re used to it. 
They come in here and it\\'s kind of like a culture shock. It\\'s like, \"What?\"\\nLauryn:\\tUh oh.\\nRobert:\\t[inaudible] a phone call. Sorry about that. I had a phone call.\\nLauryn:\\tYou\\'re good.\\nRobert:\\tThen, we get somebody who comes in who\\'s like, \"Um, now we have to have our septic tank pumped out? It\\'s kind of a shock to them.\" A lot of people, they\\'re not used to being on a pipe water system, so they\\'re not nearly as frugal. We get new people coming in, and they\\'re used to leaving their sink running with you brush your teeth instead of filling up a cup and turning off the water. That\\'s where a lot of the culture shock comes in. Hopefully that was helpful there.\\nLauryn:\\tOh, yeah, yeah, definitely.\\nLeif:\\tYeah, I always think about: I never had any idea how much water I used until I had to pay for it.\\nRobert:\\t[crosstalk] I\\'m getting phone calls left and right there. My brother-in-law is trying to get ahold of me.\\nLeif:\\tHey, so, is there anything else I should have asked you about? Am I missing something, or anything you want to share with us?\\nRobert:\\tNot that I could really think of off the top of my head.\\nLeif:\\tOkay.\\nRobert:\\tIt is a tough job. It\\'s not meant for everybody, and the culture shock of new people coming in and just not really experiencing that type before. It\\'s lack of pay, but, city has a pretty decent retirement, so that\\'s why a lot of people still do that.\\nLeif:\\tOnce they\\'re in, you mean, like, once they\\'re in, they stay in?\\nRobert:\\tYeah. You got to work for the city of Bethel, I think, believe, five years in order for you to get ... Well, that\\'s what it was before. I don\\'t know what it is now. It was a union. Back in I started, there was Local 71 was a union that we had to join to drive for the city of Bethel. So, I don\\'t know if they\\'re still Union 71 or not. I don\\'t recall. When I would go back to work this last time out there, I didn\\'t want to be hired. 
I was full-time. I just said, \"You guys could hire me on as a temporary hire.\" And then, I got my higher pay, but I didn\\'t receive any of the benefits like paid holidays or vacation time or anything. So, it was just straight pay for me, which worked for me, just going out there to work for five, six months, and then coming back, doing my normal job.\\nLeif:\\tYeah, one of the things they\\'ve done with the police department is that they have people come out here, work for two weeks and then go home for two weeks, and they come from Georgia or wherever.\\nRobert:\\tYeah.\\nLeif:\\tDo you think that that would work for water truck drivers?\\nRobert:\\tNo, I don\\'t think so. I don\\'t think so. The drivers out there, they\\'re a tough breed. You\\'re constantly living out there. When you have somebody that\\'s been in Georgia, getting acclimated from one cold weather to a hot weather, you\\'re [crosstalk]\\nLeif:\\tAnchorage maybe, Anchorage or Fairbanks. It doesn\\'t have to be Georgia.\\nRobert:\\tHold on a second ... babies.\\nLeif:\\tBabies.\\nRobert:\\tMy baby started to get a little impatient here.\\nLeif:\\tYeah, we\\'re just-\\nLauryn:\\tYeah, we\\'re-\\nLeif:\\tWe\\'re wrapping up, here.\\nLauryn:\\tYeah.\\nLeif:\\tAll right, well, do you have any questions for us?\\nRobert:\\tNo, well, I did hear you said I was going to be compensated for this interview, and I didn\\'t know how that was going to happen?\\nLeif:\\tYeah, so, after when we\\'re done here, you get a DocuSign form. So, you get an email that has a form. And then, when we get that back, I send you money. So, I can do that with Venmo. I can do it with PayPal. I could meet up in Anchorage and give you cash. Any of those would work. I guess, theoretically, I could write a check. I haven\\'t done that in a while, but I have a checkbook somewhere, so yeah, whatever works best for you.\\nRobert:\\tAll right, I\\'m going to get off the phone here. 
My baby\\'s-\\nLauryn:\\tYeah.\\nRobert:\\tRequiring my-\\nLeif:\\tYeah, thanks for your time. It\\'s very helpful.\\nLauryn:\\tYeah, thank you!\\nRobert:\\tYeah, thank you.\\nLauryn:\\tBye.\\nRobert:\\tBye-bye.\\n', '1_5__InterdependenciesNNA': '\\nQC Vicente\\nThu, Aug 18, 2022 12:44PM • 1:16:15\\nSUMMARY KEYWORDS\\nwater, people, communities, plant, bethel, operators, system, talking, maintenance workers, question, area, difficult, houses, training, class, challenges, running, village, leif, pipe\\nSPEAKERS\\nLeif Albertson, Vicente, Lauryn Spearing\\n\\nLeif Albertson 00:00\\nAll right. Okay. We are a team of researchers trying to understand challenges surrounding drinking water services in your region. This research, the research has been reviewed by the human subjects and research ethics boards at our universities, Alaska area IRB and YKHC. With your permission, we would like to record the conversation so we can fully document everything. The audio recording, well, I guess, in this cace, video recording will be deleted after we transcribe the interview, regardless, and your results from this work will never identify you, or your organization and will only share aggregated anonymized insights. And I always just add, you know, it\\'s a small town, and you\\'re in Bob\\'s kitchen, right? So we\\'re not going to put your name on anything. Definitely, without telling you. But like, if someone someone found out you talk to us, like, you know, be aware, right. You can stop the interview at any point. You don\\'t have to answer questions. There\\'s no penalty for deciding you don\\'t want to talk to us or anything like that. Does that all make sense? Sounds good? Yes. Awesome. So we\\'ll start a little bit set the background. You mentioned just before that, you\\'ve been about six years, and now you\\'re in school, right? Hopefully, that\\'s going. Going well?\\n\\nVicente 01:32\\nyeah, I just started the semester yesterday. So you know, already have a full list. 
We\\'ll schedule things I need to take care of. \\n\\nLeif Albertson 01:41\\nOkay. So how long have you lived in the area?\\n\\nVicente 01:45\\nFor six years. Yeah, moved there in 2015. In December 2015.\\n\\nLeif Albertson 01:52\\nWhat brought you to Bethel?\\n\\nVicente 01:54\\nI got hired and started working for the Office of Environmental Health and Engineering at YKHC and yah got experience working with water plants and and the clinics out here. So jumped on that.\\n\\nLeif Albertson 02:08\\nWas that. How\\'d you find Bethel? I mean, that what brought you what what, what, why why there and not Illinois, right?\\n\\nVicente 02:17\\nYeah, so I wanted to live remote. I knew from working with the or having an internship with the Indian Health Service. After well, during college, I had to take an internship, I took it with the Indian Health Service, great area office or Great Plains area office in Aberdeen, South Dakota, and traveled to Iowa, Nebraska, North Dakota and South Dakota during that time, and slubbed, the remote lifestyle. I had a teacher mentor in at Illinois State University that led me in this direction, I thought it\\'d be a good fit for me, knowing my lifestyle and what I wanted to do with it. And he led me in this direction. And applied for the position. Jenny Dobson had an interview with me. And yeah, things just kind of evolved into taking the position and moving out to Alaska sight unseen. Never been to the state before and just dropped me and Bethel. Been good to me ever since.\\n\\nLeif Albertson 03:15\\nYeah, it seems like it\\'s funny. People in that situation end up making it a year or they are there their whole life. There\\'s not much in the middle. Yeah. And your degree is in environmental health?\\n\\nVicente 03:30\\nEnvironmental Health Science. Yeah.\\n\\nLeif Albertson 03:32\\nOkay. All right. Is there any other? You mentioned an internship? Is there any other education? Or training that you feel like is relevant? 
Training, formal or informal? How did you learn to do what you do? Or did?\\n\\nVicente 03:47\\nWell, there\\'s definitely classes, you know, that dealt with very lower 48 water plants and other types of air quality. They\\'re, yeah, they\\'re lower 48. I mean, you\\'re when you learn about these water plants there for hundreds of 1000s of people, and not just a small community of 3 to 700 people, maybe 1000 or 2000. So learning the basics of water quality happened that way. But when I came here, it was mostly on site training. A lot of the like, the first or second week we got here, we I was a part of training water plant operators. So I was learning it and they were learning it pretty much. I was just I was reading it pretty much right before and doing my best to teach. And for air quality, some of the projects we\\'ve done in the past. It\\'s not really guided to that so I\\'m gonna skip that. Yeah, for water plants, mostly just following the remote maintenance workers and completing trainings with water plant operators.\\n\\nLeif Albertson 04:57\\nWho was that when when you started with Is that? Was that Bob and Tommy? Or did you do you start when you said you started? You\\'re right in the in the class, water plant operator class? First, right away?\\n\\nVicente 05:12\\nYeah. So Bob White, or Robert White was leading the class to start off with. And at that time remote, we were all kind of pitching in remote maintenance workers would come from their respective villages. That would probably be, I believe Allen Polken might have been there, and Bruce Wareba from Holy Cross, and I\\'m not sure if Billy Westlock came in, but he\\'s a part of it also. And I think we hired Shane a couple of weeks after, so I don\\'t think he was there for my first training. \\n\\nLeif Albertson 05:42\\nOkay, great. So how, you mentioned operator training? 
How, what ways have you worked with water infrastructure?\\n\\nVicente 05:53\\nYeah, so the opposite, or in my position, we usually survey water plants for the state need every three years. And so with me working through that, I\\'ve had to learn the water systems, I follow remote maintenance workers in the water plants and learn from them. Along with the water plant operators, they teach me what\\'s going on in their plant, I fill out the survey, and I turn it into the state. For actual on hand, me turning dials or adding chemicals to water plants. I don\\'t have experience with water, drinking water treatment, but I did do some water treatment for pools when I was in Illinois, I ran a pool for a few years. So I did that. I would add chemicals that make sure things were or chemicals were in balance.\\n\\nLeif Albertson 06:52\\nSo going out to villages and doing water plant inspections, because they\\'re required every three years, how much of that? What was your relationship with the operators there, or the tribes or the or the community when you were visiting those places?\\n\\nVicente 07:10\\nYeah, so over the years, I\\'ve been able to make some relationships, often, many communities have a revolving door in their administrative positions or within the water plant operator position. So sometimes they\\'ll stick around. And sometimes they\\'re still around and over the six years, but for the most part, even administrators that were there for longer periods of time, COVID kind of pushed everyone out the door to just the stresses of life, and they didn\\'t want to deal with it for what they could deal with before it\\'s become too much. So feel, this pandemic has really made it difficult for the water plant operator or for the administrators that were around to stick around. But yeah, it\\'s really hard to to have plans and set budgets and have. 
And this is important because of the bigger picture of like best practice scores with the state and how they fund water plant projects. So it\\'s hard to stay on track with a plan if they have a plan to stay on track, and get the necessary paperwork in and correct or had the correct type of paperwork filled out to fit their needs. So yeah, and for the water plant operators, it can kind of be a revolving door. Also. Some water plants have them for a long period of time, and I\\'m still talking to the same guy I was talking to you six years ago, pretty rare. Some guys will be gone for a couple years and come back because it kind of by default, they don\\'t really know who else can run the water plant and they are in need of a job again. Or, yeah, they just find anyone and for the most part, they just get docked points when it comes to best practice because people don\\'t keep their certificates and the certificates are kind of they\\'re pretty difficult to obtain. They\\'re the the accreditation system is really geared towards people in the lower 48. And with water systems that I learned of the style of running them in college rather than what\\'s actually in Alaska. So you get asked questions about huge baffles and all these systems that Yeah, I mean, we don\\'t have this here. Yeah.\\n\\nLeif Albertson 09:32\\nAnd I probably have some more questions on that a little farther and specifically is that certification issue because that\\'s been mentioned as a challenge. That, you know, we were, we were expecting to hear about. I did want to dig just you said something interesting to me about you know, it being a revolving door. Do you feel like on your side though, having been there for six years, even if the person you\\'re working with, how do you feel, how do you feel that affects your credibility going into a village? Or your comfort level? Right? Like, is it like, people recognize you still as somebody who\\'s been there? 
Or when that water plant operator switches, or the tribal admin switches, Is it? Are you really starting from scratch?\\n\\nVicente 10:17\\nI feel often it is starting from scratch. And even so with what I do with the water plants, people still see me as the dog guy, so,\\n\\nLeif Albertson 10:24\\nokay, all right. Yeah,\\n\\nVicente 10:25\\num, so yeah, with when it\\'s easier when you have that relationship with the water plant operator or the administrator and them knowing what you do for them and how to reach out to you and what I can help them with versus at least how I can guide them in the right direction. When you have a new administrator, often, they don\\'t know who you are, if you did have a business card on their desk at some point, they removed it at some point. So it\\'s constantly you\\'re constantly introducing yourself to new people. So it\\'s hard to make a lot of progress with the community relationships.\\n\\nLeif Albertson 11:07\\nAnd just just for Lauryn, when he says the dog guy, it\\'s because you were doing rabies shots and training lay vaccinators, yeah, so the environmental health officers do all kinds of stuff. And then this water is sort of one piece of it. Great. Um, and talking about, you know, so we talked about the water plant operators, talked about the tribal administrators, do you ever talk to the public? Whether you\\'re vaccinating dogs or whatever? Do you ever talk about water with people?\\n\\nVicente 11:43\\nYes, occasionally, very occasionally, it comes up, I feel a lot of people are pretty set in their ways to how they think about water, and water quality and what they think about their water plant operators or their water source. In this area, you know Leif, that a lot of people really rely on traditional sources for water, they\\'d rather drink water from the river often, and they would rather go to a to go to haul water from from their water plant. 
Even if it is the same distance or even closer, they would rather still go to a river to do so they think that it is safer. Because traditionally, that\\'s what their people have done for hundreds of years. 1000s of years.\\n\\nLeif Albertson 12:28\\nIs that a? Is that a touchy subject? I mean, or is it? Is it easy to talk to people about that? Or is that something that\\'s kind of like talking about politics?\\n\\nVicente 12:36\\nKind of like talking about politics. Yeah, some things have happened over the years, like, especially thinking of fluoride in the past with I believe it was Hooper Bay where, you know, you get a pretty bad rep on fluoride and in fluoride in water, and then the whole Delta hears it, and now it\\'s because people have died from it. Fluoride is bad, and fluoride will always probably be bad, unless there\\'s a mass change of mind. Not really sure how, how to do that we get I know, along with teaching my water plant operator classes, and along with working alongside Brian Berube over at DD. Yeah, constant teaching of what fluoride really is, safe levels, and how it benefits the human body and how we have such tooth decay in this area and how it can really improve the health of children. It just doesn\\'t get through to the right people. Or it will not change your minds. It\\'s kind of the same with chlorine. And people think adding more chlorine will be worse for them. Because they can smell like, oh, I can smell it with a little bit. If I put more, I\\'m just going to smell it more, which is not the case. So with the breaking point, so yeah, I\\'ve had water plant operators that have worked for me that I\\'ve I\\'ve taught them that, you know, I\\'ve explained to them, you need to hit this point, like, this is what you need to do. 
I\\'ve had people in class, argue with me about it, even as I\\'m teaching them and showing them and like people that they respect, such as their remote maintenance workers are telling them also like, you need to do this for this purpose and to really clean the water. But it doesn\\'t always get through. I don\\'t know exactly how to change your minds if they\\'re not open to receiving information. \\n\\nLeif Albertson 14:38\\nSo that\\'s interesting, that most put water plant operators in an awkward spot sometimes. Right? I mean, because they I mean they live in the community. Is there do you see them as being advocates or how are they perceived and how do they deal with that stress?\\n\\nVicente 14:55\\nGood question. I\\'m someone who doesn\\'t work in the village. And here there are people that complain to the village about what goes on. I only hear a fraction or a tiny bit of what actually comes up, people aren\\'t really open to sharing difficult situations out here that they may deal with. So from what I have heard, and when I do ask people about their water quality, they\\'ll be like, Oh, I don\\'t drink from there anyways, because of whatever reason. Maybe you can just ask me the question one more time, I got lost in a train of thought.\\n\\nLeif Albertson 15:34\\nI just I was thinking about the water plant operators, and if they\\'re comfortable talking to the public, or if they are like, kind of like the tax man, nobody wants to see him or are they like? Are they a trusted source? Or are they on board with it? I mean, or they just need a job?\\n\\nVicente 15:52\\nYeah, I think a lot of them just need a job. They\\'re not in that position to talk to the public to Nice job where they can be inside a water plant and work with their hands. That\\'s a big part of it. This culture isn\\'t necessarily super outspoken. So it\\'s pretty rare if you get somebody that can speak out and reach people in that way. 
So especially with water plant operators, if they don\\'t, they didn\\'t get the job because they wanted to lead or be leaders of the public and be educators. They wanted a job and water plant operator. Yeah, there\\'s not a lot of jobs. So water plant operator was kind of like something they could be.\\n\\nLeif Albertson 16:31\\nDo you feel like they\\'re, do you feel like they get like bad, are there negative feelings about water plant operators? I mean, is it, or indifferent?\\n\\nVicente 16:43\\nIt might be indifferent, depending on community and location, or community and the Yeah, where they are. Okay.\\n\\nLeif Albertson 16:54\\nWell, I went off course a little bit, but that was interesting to me. How so you live in Bethel, but you work you have like a probably a set of a set of villages that you\\'ve had or have you been all over the place?\\n\\nVicente 17:09\\nI\\'ve been all over the place but and have been way more communities and I cover I cover 16 communities throughout the YK delta. So I had to travel to every single one at least once a year for, I would say six years, but at this point, probably more like four and a half because of COVID. So I\\'ve been to all of my communities at least once or twice. More than once or twice. Sorry, more so about four times.\\n\\nLeif Albertson 17:37\\nYeah. How? How is water? How do people get water?\\n\\nVicente 17:43\\nYeah. I have a lot of the more poor communities, I feel. Most of my communities do not have pipe water to their houses. Most of them either have a co-water system going to explain that or is that a common knowledge for this?\\n\\nLeif Albertson 18:01\\nWhy don\\'t you go ahead?\\n\\nVicente 18:04\\nSo with a co-water system, the home owner generally has to haul their own water by four wheeler or snow machine with it what is generally a 30 gallon trash can that they only use for water. And they will drive their four wheeler or snow machine over to the water plant. 
Sometimes they have to pay a few quarters, sometimes it\\'s tokens that they would buy over at the tribal or city office, and they\\'d fill up their bucket, they drive back to their house, they would have to have a pump, then that can pump it into their water tank, their drinking water tank, where then it can be used for washing or flushing. And at that point, it would go into their sewer tank and that would have to be evacuated at some point. Generally the water entity also has a sewer entity and they would pump that out and take it to the lagoon and they would dismiss it there. And for other communities that they do not have co-water systems. They just get drinking water from wherever. Some of them like I said, think that it\\'s healthier to get from the river or local water source. There\\'s one or two communities that might have a spring where they get collect water from but most I would say get from a river. Most community members probably get from the river, rainwater. I mean, people say sometimes that people melt ice I haven\\'t really seen it too much. But for the most part, just those three the rain, river water and going over to the water plant and collecting the same way and taking it home.\\n\\nLeif Albertson 19:56\\nDo you have do you have any communities that don\\'t have access to treated water?\\n\\nVicente 20:07\\nDon\\'t have access, what is your definition?\\n\\nLeif Albertson 20:10\\nLike there\\'s not a watering point or there\\'s not a functional water treatment? Or maybe there\\'s a boil water notice?\\n\\nVicente 20:17\\nYes, there. I mean, right now, there\\'s a couple probably that have that going on. But one of them that I can think of off the top of my mind has a pipe water system, they\\'ve just had issues keeping it running in the last week or two. Had some problems with freezing, and the system kind of crashing on them. So rather than listening, well I know too much. 
Rather than listening to the, their remote maintenance worker, they kind of decided to take things into what they thought was appropriate, and put their system more in danger by you know, maybe yeah, risk of burning down of their motors that, you know, now were running dry, because water isn\\'t running through them, without enough pressure in the water system to be able to pressurize pumps and have water running through them. But I mean, I suspect by now that should hopefully be changed today. As I know, the remote maintenance worker was working with them since Sunday.\\n\\nLeif Albertson 21:26\\nSo but so you do work with I mean, if a community member asked you, you do have communities where you can\\'t you couldn\\'t tell somebody that they have access to safe like safe water to water you could put a stamp on. Is that Is that accurate?\\n\\nVicente 21:47\\nMaybe just one community and at this point, I think it should be good. But yeah. \\n\\nLeif Albertson 21:51\\nOkay. All right. Has?\\n\\nVicente 21:56\\nOr sorry. So. So for some communities that might not have a community drinking water system, they might have a school that has a that they can buy water from as well. So I\\'m thinking specifically Tununak, where they\\'re in the process of having a community water system. I don\\'t know if they started building yet or not. But when you say access to it, like technically, they can get it from the school if they want to get it from the school. But that\\'s kind of like a not necessarily community water system. It\\'s a private water system.\\n\\nLeif Albertson 22:32\\nI mean, people could fly in bottled water too I guess. Yeah. Yeah. And I think I mean, I think Tununak was on a boil water notice for years, right?\\n\\nVicente 22:42\\nI mean, Tununak doesn\\'t have a water system so I don\\'t know what they have boil water notice for?\\n\\nLeif Albertson 22:51\\nHave you in the time that you\\'ve been there? 
How have you seen how have you seen things change in terms of getting water to people how things change over time.\\n\\nVicente 23:05\\nOccasionally, they have systems being built, I mean, they\\'re very expensive system funding is very difficult to come by having the communities have the proper amount of points to get a water system built for them can be very difficult when the entity cannot make money to begin with a lot of their scored points on their best practice from the state has to do with financial keeping and their their ways of making money. And, you know, you can\\'t really make points in sections with, with a water plant if you don\\'t have a water plant. And having. Yep, so not having an entity also puts you at a disadvantage, because you can\\'t really collect money from people when it comes to that. So yeah, and the only thing that it might rely on if the entity owns it would be the power and maybe a store. And it\\'s yeah, it\\'s it\\'s hard to to have a good financial standing with the limited resources in a village.\\n\\nLeif Albertson 24:13\\nSo overall, if you were thinking about the six years that you\\'ve been there, would you say that on average for the communities you work with things have improved or things have gotten worse or things have stayed the same?\\n\\nVicente 24:28\\nI would say many have stayed the same. Maybe one or two okay. Well, it\\'s, you know, six years within within the time period of water plants. I think it\\'s very small. But there have been one or two communities that have projects taking role here so like Tuluksak has something coming in. But you know, they couldn\\'t get a project started. Without the, you know, the tragedy of the water plant catching fire. Now it\\'s an emergency now things can get put into motion. They were one of those communities that their score was never going to get higher than what it was, no matter how hard they tried. Whether it be because of turnover in their office or. 
Yeah, I mean, water plants can sometimes be a pit if you don\\'t have if you don\\'t have pipe water to everyone\\'s house. \\n\\nLeif Albertson 25:39\\nOkay. So can you tell us? Can you share with us an anecdote, a recent moment or story that made you aware of challenges and local water infrastructure? Like pretend you\\'re talking to someone from Texas, and you want to tell a story? We do this all the time. Right when people come out to visit?\\n\\nVicente 26:03\\nYeah, or when we go to the lower 48 to go visit family? Yeah,\\n\\nLeif Albertson 26:06\\nright. Or school reunions? Right? Were you involved with the Tuluksak thing at all?\\n\\nVicente 26:27\\nNo, I was pretty removed. But back when their water plant was running. About four years ago. They were having some issues with their their filter. And they needed new. I don\\'t really know if this is answering your question now. So Okay.\\n\\nLeif Albertson 26:51\\nSomething that involved a challenge.\\n\\nVicente 26:58\\nRepeat the question one more time.\\n\\nLeif Albertson 27:00\\nOh, it\\'s just a story, or a time and experience that made you aware of infrastructure challenges. You mentioned kind of like when you were new too, how it was maybe a different situation than you learned about in school, or most of us have some of those moments where you\\'re like, Whoa.\\n\\nVicente 27:26\\nThere\\'s so many it\\'s just kind of normal at this point.\\n\\nLeif Albertson 27:28\\nI know. That\\'s why we asked that question.\\n\\nVicente 27:30\\nI don\\'t know. I guess like just learning about fluoride and how people were so against it here was kind of like eye opening to me that one incident at this point must be like 11 years ago, that continues to resonate throughout the communities of how fluoride has a negative impact on human health, rather than the positive impact that it can create. So I can specifically remember just in class, having one of the water plant operators who was very against it. 
Kind of rolled his eyes and had to speak out wanted to speak out against fluoride in the class and how it is negative and just having to teach people that within safe measures. And when the state measures aren\\'t removed. Like this, the safeguards are removed, that it can be healthy and it can benefit human health. Especially the health of children that are losing teeth. Yeah. So yeah, just trying to convince a class that, you know, this water plant operator was pretty highly respected within the community, his community for sure. But also throughout the YK delta as one of the more experienced operators that he\\'s worked there for so long. It\\'s kind of has you know, everything\\'s all about what family what name where you\\'re from around here. And when you have like, I\\'m just some new guy coming in lower 48 just out of college. I don\\'t know anything. I don\\'t they don\\'t know me. I have no family here. Why would they listen to me? Trying to convince somebody, that\\'s where my struggle was, is I\\'m teaching them something out of a book, rather than what they have seen with their own eyes or something that they\\'ve been told from, you know, story to story or from elder to elder, so on and so forth when it comes to how stories are exchanged in this area. Being the outsider can be very difficult. \\n\\nLeif Albertson 27:55\\nGreat. So pivoting a little bit talking about infrastructure challenges, you know, we talked some about the people and the training. Can you tell us a little bit about physical challenges? There\\'s like a fire in Tuluksak, obviously. What kind of physical challenges does water water delivery face?\\n\\nVicente 30:09\\nWell, the temperature for sure, whenever it drops, you know, it\\'s been Yeah, especially this winter, winter fall, we had about a month of double negatives, and then warmed up to like 37 for a week or three weeks and just rained the whole time. 
When the weather drops for a long period of time, it becomes very difficult for water plants to be to sustain water in their water tanks. Because to prevent water from freezing, many community members will run their faucets and it will drain the tanks just so that they can continue to have a defrosted water line. So that\\'s one thing, the weather. If they need chemicals or supplies or fuel, they have to make it within for fuel they have to make you have to put enough money aside to have bulk fuel sent to them throughout the summer, where it\\'s cheapest to send in fuel through a barge all the way from I\\'m guessing Anchorage is where they buy it from and it gets shipped all the way around, it comes all the way around into our river and barged up like in past Bethel and wherever they else need to go up through the Kuskokwim or whether it be the Yukon or wherever it is that their water plants are. So fuel can be difficult especially with having to secure enough fuel for the whole community and the water plant. Challenges of shipping chemicals or parts can be very difficult not only, it\\'s not like Amazon where in the Lower 48, you can get it in like two days or even same day. You know, if they even happen to have it in Anchorage, which is pretty rare. Whatever they\\'re looking for, it has to get from Anchorage, to often Bethel, then get on a small plane and hopefully the weather\\'s good. And everything lines up just right, you\\'re looking at seven to 10 days before you can get a missing part that you needed or something broken if you don\\'t have it on on hand. And we\\'re talking about like regular non pandemic issues, or like good weather issues for it to get shipped out. Even if you did ship it priority as as much as you can. Yeah, it can be very difficult. 
Occasionally, there have been times where it was a really pressing part or something like that where myself or water plant operators can take a snow machine or a boat and drop it off to them at their water plant. But that is also something that\\'s yeah, it\\'s something we can do at times. But it\\'s not always possible with weather or what\\'s going on in everyone\\'s schedule, or how large the object is or if it\\'s worth taking a ride out for a $20 part but often we do it anyway just to help the community out but it is something that travel is a challenge.\\n\\nLeif Albertson 33:16\\nSo there\\'s the issue of not being able to make water and not being able to get water to people. How about water quality?\\n\\nVicente 33:25\\nYeah, for water quality many times as long as you follow what what directions you\\'re given to run the water plants can be good. But as long as they have a well. Often wells, from what I understand, aren\\'t dug deep enough for the best water quality is often found within where they are in the community. Oh often water has arsenic already in it and they have to treat to get the arsenic out. They\\'re usually at lower levels but still no arsenic is better. And when it comes to surface water systems during the spring and fall when water inverts, water quality changes and it becomes difficult for the water plants to be able to treat the water because of either be more turbidity, more contaminant. Yeah, during that time, places such as Newtok can only draw water in during the summer, for the most part, and then they have you know, their pond freezes over. And they can\\'t get water from their water source. So they fill up I think, three large water tanks. I don\\'t remember how big they are at this time, but that\\'s the water that they have for the year. So they have to, you have to work to fill those up. But also they can have issues in that area since they\\'re close to the ocean with brackish water and how to deal with that. 
\\n\\nLeif Albertson 35:03\\nSo it sounds as though some of that is sort of regulatory quality. And some of it is like secondary characteristics. Right? And it sounds like, sounds like a mix of the two issues, right? Or\\n\\nVicente 35:16\\nIt can be two issues. Yeah, but I mean, if you have enough NTU people aren\\'t gonna want to drink it anyways, it you know, we see we see in drink with our eyes, often, rather than what water quality is, if you go to a restaurant, we\\'re like, oh, I have flakes in my water, whether or not it has enough water or chlorine or whatever. To keep it clean, we would probably send it back. Just because we see with our eyes, and we don\\'t want to put something into our body that looks unhealthy. Yeah, most of it can be like secondary contaminants, where they\\'re not technically a health issue. Most of the time, yeah, it is like, even here in Bethel, some of it some of its people\\'s own water systems, or their own water tanks, but living on the water system, or on the pipe water system for a bit. A few years back. Yeah, you know, if you fill up a pitcher of water, you\\'re going to have some turbidity in it. It\\'s just what it is. But working for the water plants and knowing what is going on with the water, I didn\\'t have a problem drinking it. But often people would use a second or you know, another type of filter at home, you\\'re often using Brita filters to take out the contaminants or Berkys or whatever it is that they wanted to use. Also have it in our area, we\\'re more fortunate to have systems set up where we will use a cartridge filter to take out some more of that iron or turbidity that might be in the water just for water that we use to clean our bodies with and like, just because we don\\'t want to see we don\\'t want to get cleaned with brown water or Yeah, help us after clean the showers as many times because it\\'s building building up filth,\\n\\nLeif Albertson 36:49\\nor laundry, right? 
You don\\'t see a lot of white clothes. Yeah, not a lot of white clothes. Yeah, I know, that\\'s been an issue in Bethel where the, you know, the city\\'s been very adamant that the water is safe to drink. But there\\'s still a lot of concerns about, you know, like, look at this water and tell me I\\'m supposed to drink this.\\n\\nVicente 37:30\\nThat\\'s absolutely true.\\n\\nLeif Albertson 37:34\\nI\\'m gonna take just a little break and check in with Lauryn here and see if there\\'s any any questions that have been been stacking up? \\n\\nLauryn Spearing 37:42\\nYeah, not yet. You\\'ve asked most of the follow up ones I have written down. I might have more when we get back into the operator training part. But.\\n\\nLeif Albertson 37:53\\nAlright, well, we talked a little bit about operating in arctic conditions, obviously challenging. Seasonally, what do you think about changes, like climate change, like long term changes? Is that does that affect the infrastructure? Or have you seen ways that it has?\\n\\nVicente 38:13\\nYeah, I want to circle back to your first part there since something I didn\\'t mention. Areas such as Quinhagak, that do have pipe water systems, often, depending on the systems that they have set up for them have a with a changing of the seasons, they have their their water lines propped up on, like stilts, they\\'re different types of however, they support their water, water lines, they heave and go up and down and can cause breaks and pipes and stuff like that, to where it can become difficult with the seasons changing constantly have to be monitored, and, and changed up and down requires a lot of hands to keep that system running. As for your second part, which was \\n\\nLeif Albertson 39:08\\nClimate, long term climate changes.\\n\\nVicente 39:10\\nRight. So um, I think like the thing that\\'s most most in the news right now would be Napakiak. Napakiak. Yeah, you can see that the erosion caused it\\'s, yeah, it\\'s eating away. 
I don\\'t know how many feet per year at this point. But I think when it first started, it was like 24 feet a year of their banks are getting eroded. And at this point, it\\'s so close to the school, or they\\'ve been planning to move the school for many years and they\\'re still working on trying to do that. With it being so close to the bank. It\\'s crazy going there and like seeing how much has changed. And yeah, I mean, I drive a boat during the summer. I pass through there at least a couple times a year and you can park in one spot and not be able to park there the next time you pass by just because of how much the land erodes. Yeah, definitely caused by permafrost and freezing, but also a lot of the, a lot of the changes in the environment have caused for what I kind of feel like more, more storms. It hit specifically really hard in that on their banks there. Yeah, we had a pretty rainy summer this year, which is pretty. I mean, it can you know, everything changes here and there, but it\\'s a really rainy summer. It\\'s a lot of storms. And it really hit Napakiak really hard with the how much erosion was caused.\\n\\nLeif Albertson 40:50\\nDoes that put water plants or pipes or haul? Trails? I mean, does it put those things? Are those things in danger? From that, or has it had effect on that?\\n\\nVicente 41:05\\nWith this community? Specifically, I feel like probably more of a school problem at the moment. But I mean, the water plants right behind the school, or like parts of the water plant are right behind the school. And in other areas. Yeah, I mean, with Akiak, specifically, houses have had to be moved from their bank where they\\'re also experiencing erosion. And pipes have had to be rerouted to those houses once they have moved their houses. I mean, it\\'s talking about moving houses. I mean, that\\'s a big deal. Yeah, they\\'re, you know, on stilts, or whatnot, or propped up, but moving house is no small task. 
And then again, repiping their water system in a different way to continue water service to those houses. Right. And especially in the small communities that you\\'re talking about. I mean, Akiak\\'s 400 people, I mean,\\n\\nLeif Albertson 42:04\\nThere\\'s not five guys that know how to do this. You know, how to repipe a house, there\\'s no guy, right? You probably\\n\\nVicente 42:12\\nAnd operators. Yeah, you have a couple of other hands that they can get to, like help them do things occasionally, may or may not show up on time. Or at all.\\n\\nLeif Albertson 42:23\\nMight be the same guys that are busy moving the house too, right?\\n\\nVicente 42:26\\nRight. Right.\\n\\nLeif Albertson 42:27\\nSo if you could, what do you if we\\'re talking about just prioritizing, right? In your in your view? What do you think is the most important challenge to fix for Water Infrastructure? If you could? Like if you could wave a wand and fix it? What what\\'s what\\'s your number one?\\n\\nVicente 42:56\\nI think speaking specifically with health, just getting more water to community members, and with that pipes would be the best way to do so. It\\'s been proven everywhere. And that\\'s why every state has it. Or every other with infrastructure, the pipes are the best way to go. Having pipes to houses gets the water to the public the most efficiently. And yeah, turning on a faucet is so much easier than running down to the water plant and picking up water or running down to the river or chipping a hole in the ice. You try to get water or maybe you have a formula that works maybe you don\\'t, maybe you have a snow machine that works, maybe you don\\'t. Often these tasks are given to children and you know, it can be very difficult for them to try to do this in a safe clean manner or physical. So yeah, just having water pipe to their houses would I think benefit the public the most.\\n\\nLeif Albertson 43:54\\nSo as sort of a side question. 
I\\'ve been asking Bethel people, and I know you\\'ve been on piped water and on hauled water. So if somebody was moving, if Lauryn here is going to move to Bethel and she was looking for a place to live, what would you tell her? What advice would you give her about water in Bethel?\\n\\nVicente 44:20\\nYeah, I would say overall, and I\\'ve gotten I may have spoken out of turn here and there with a couple people but especially travelers coming in. With people complaining about water quality, I would tell them that it\\'s safe. If they wanted to not believe me and they wanted to actually check instead of complaining that they should look on the city site and look for their Consumer Confidence Report. And they can read through there that the water quality is safe to drink. And if they had any aesthetic issues with it that they can use a filter if they so pleased. But if they had like a water tank that had flakes in it? Well, it\\'s kind of like, if you\\'re going to move into a place, you should probably check that out and clean it out yourself. No one\\'s going to do it for you. There isn\\'t like a service, for the most part that somebody is going to come and clean your personal water tank for you. But for the pipe system, yeah, it\\'s safe. And it\\'s safe to drink. When you get your water that comes whenever you have it set up? \\n\\nLeif Albertson 45:30\\nWould you, like if someone wanted to know like where should I live? You know, money, and quality, and convenience, like all the like? \\n\\nVicente 45:42\\nYeah, um, yeah, if you don\\'t want to change your ways, the city sub is the way to go. On the pipe water system, if you can find something on the pipe water system. It\\'s available. I mean, that\\'s the goal for me, right? I like I love to live on the water system at some point. Most of the, there aren\\'t like new subdivisions being made with that are going to be piped that I\\'m aware of at this time. They\\'re further out of the community. 
And they\\'re branching out, rather than, than planning to make them with pipe water into the houses. So they\\'re going to continue to have the hauled water to their house.\\n\\nLeif Albertson 46:29\\nI think the next plan in Bethel is the avenues project. The old houses, not new houses, but there\\'ll be more houses.\\n\\nVicente 46:36\\nYeah, I was mostly thinking of the is it the? The one that \\n\\nLeif Albertson 46:41\\nPost office? \\n\\nVicente 46:43\\nYeah, but Yeah, over there. And then the other one, Past Larson?\\n\\nLeif Albertson 46:46\\nBlue sky. Yeah. \\n\\nVicente 46:48\\nYeah, to where, you know, they\\'re not gonna get piped water. So\\n\\nLauryn Spearing 46:58\\nAnd I have a follow up question, too, as well, kind of going back to when you were talking about, when you\\'d wave the magic wand, right? You want all these communities to be able to have a piped system. And with kind of what you\\'ve been talking about isolation, cost, of these systems? I\\'ve heard a lot of people talk about kind of these decentralized systems, whether it\\'s a PASS system, I was wondering if you could just share a little bit of your thoughts on like, if you think those are solutions, or do you think they won\\'t provide kind of the quality that\\'s needed to actually serve these communities? And quantity? \\n\\nVicente 47:32\\nYeah, I think their method of claiming that people are served, I don\\'t think those systems are sustainable. They get put into houses, there often isn\\'t a system set up for communities to either provide services to those people. If something breaks down, or something burns out, or something like that. Most community members don\\'t know how to fix those things on their own. Or are not going to fix those things on their own. There\\'s been an exception here and there. Maybe I don\\'t work with these communities, maybe was it Nunam Iqua, that Tom Bobo helped set up a system to where like, I\\'m going to train you on a few small things. 
This is how you fix this, this is how you fix that, have a couple of these parts in your store. Have a couple of these at the at the city or tribal office, you\\'re the guy that\\'s going to go around when somebody complains about something and go around and fix it. Whether or not those I mean, we\\'re talking about systems that what maybe 200 gallons of water, 100 gallons of water, I mean, you can run the faucet LD in Austin or in Texas and not worry like, I mean, yeah, when I go, the first thing I do when I leave here is take a really long shower, whether it be in a hotel or family\\'s house, like a huge long shower, I don\\'t hear this, run it hot and run it long. Have a bath or somewhere like it\\'s really nice. That\\'s like luxury of it. But also, we\\'re not worried about how much water we\\'re going to use for washing so that we can wash appropriately. I don\\'t have to worry about, you don\\'t have to worry about so much of like washing clothes, or like not washing clothe. You\\'re not worried about, you can drink as much as you want. You can clean as much as you want. I kind of forgot where I was going.\\n\\nLauryn Spearing 49:33\\nThat makes sense. So kind of just the a lot about the quantity of water as well. Right?\\n\\nVicente 49:38\\nOh, sorry. Quantity. Yeah, and the, If the system does have somebody that can go around fill up water tanks, it becomes very expensive for a service. Bethel, there\\'s a standard service, be it depending on where you live and then how much gallons of water you\\'re going to have filled up in your water tank. So it becomes very expensive over time. I mean, yeah, per month, back when I used to live in a very small house. And I only had a 300 gallon water tank that I would get filled up once a week, which even that was kind of a luxury to have. Yeah, it was just shy of $400. I mean, a decent paying job and being able to do that was a luxury. Yeah. 
And yeah, there\\'s other people that rely on three to maybe 100 gallons per week. If they can, if that. And the standard. The standard is like 24 per person. 24 gallons for was it 24 gallons per person per day? At like minimum? Yeah. So it\\'s, but most people use 100, everywhere else, gallons per day per person. So when you really started to see those health benefits is when you can have more water, and we\\'re talking about 30 gallons to one house, maybe a day, maybe? And that would be Yeah, I bet that that does not happen. Yeah. That might be like a week or whenever?\\n\\nLauryn Spearing 51:16\\nYeah, okay. Yeah, thanks.\\n\\nLeif Albertson 51:18\\nSo, circling back a little bit. They\\'re looking at, I guess, workforce. Right. So we talked about water plant operators, and in training, some of these things, you know, I think you kind of hit on. Communication challenges or sort of, I don\\'t know how to describe it sort of status or credibility issues and how communication and credibility is formed. You mentioned how fluoride I think was, you know, one of the big examples you use, like, how do you convince somebody that doesn\\'t want to listen to you? I mean, how do you do that? What\\'s been successful? What\\'s been successful? What\\'s helped you? Where are you still seeing those challenges?\\n\\nVicente 52:19\\nHaving like, having community meetings, having their community members that are respected, speak up and say what needs to be done. Yeah, but you have to have somebody on board within their community that\\'s well respected. Without that, you\\'re just going to be going in circles.\\n\\nLeif Albertson 52:43\\nWhat do you think would help you get there? I mean, is funding or, you know, what, like, what? What would are there are the things you see that are missing that could help you get to improve that situation? Is it a money problem? Is it other problems?\\n\\nVicente 53:02\\nI don\\'t think it\\'s money. 
It may be education, understanding, science or chemistry, I believe can be very difficult for for populations, where school is not as important to them as other places in the United States. School is not something they\\'re like, you need to go to school, or you need to learn so that you can go to college, because we all know that, like here, it\\'s not really a thing, like sometimes having a college degree can hurt you. If you want a position or want to be recognized within the community. Yeah, often people that get degrees move away that would be able to understand so and they\\'re kind of just over whatever politics are going on in their village. So yeah, figuring out that barrier of education, and having someone speak out, that\\'s going to be able to describe it to their people well within their own language as well, so that they can understand.\\n\\nLeif Albertson 54:20\\nSo it\\'s kind of a tension, like you need somebody that\\'s educated but not too educated. So they still, you know, still part, you know, still can relate, everyone can relate to them. I was going to, I think I understand, but I wanted to clarify when you say education, you\\'re talking about kind of public education or schooling not going out there doing trainings. Or would that be included?\\n\\nVicente 54:47\\nMaybe both. So like with trainings, you can require some of the information but it\\'s, I think it\\'s harder for. I mean, we have a hard enough time having water plant operators pass their water. What is it like a third ever at this time, like 30% of the people that take a water plant operator class pass, like even just the small treated, and it gets worse as you go up along the line, whether it be a level one or level two. So whatever it is it\\'s going on isn\\'t catching whatever it is the system that they have isn\\'t working. 
And most Yeah, I don\\'t have not one water system where a small water certificates going to do anything besides just be a paper on the wall, because most systems are level two. And so they\\'re not going to get any credit from the state basically, for having it and level one. I mean, maybe during the classes that I\\'ve taught, like level one, maybe one operator, each class passes 10% or less. It\\'s yeah, it takes a lot of studying. And maybe having that background of like education does help, like knowing how to study, knowing what they need help in how to study isn\\'t something that I think people gain out here, it\\'s not as important. And school is just something you do because you have to, not something that you see that\\'s going to be necessarily beneficial. \\n\\nLeif Albertson 56:23\\nI have heard that, for some of the people taking the tests, I mean, just knowing how to take a test. Sit down with a number two pencil for you know, an hour or two is like that\\'s not something folks are always familiar with.\\n\\nVicente 56:38\\nAnd the language barrier. I mean, when you start asking questions like which one is best, or which one is going to and like with very specific with very specific parts, or very specific wording, I think the wording can be very difficult at times too. Not everything that\\'s like commonly spoken in our language is going to be the same of what they grew up with recalling the same object. Or maybe they\\'ve never been taught the right names for a lot of the things that they\\'re having to answer questions on. So I feel like a language barrier could be something that could be improved, and maybe for some water plant operators having it in Yupik or Tupik or Athabaskan or whatever it is for our area specifically, might be beneficial, or having somebody teach it to them in that language might be beneficial. \\n\\nLeif Albertson 57:35\\nYeah. So I think I kind of know where this is going. 
But to what degree is the, you know, so sort of chasing the certification, right? And there\\'s education that\\'s attached to that, to what degree do you feel like the certification is, is useful to the people that are actually doing the job?\\n\\nVicente 58:01\\nNot much, rather than like, I honestly don\\'t think it really does much, these guys can go and operate their water plants just fine without having a certificate. Most of what\\'s asked on those questions, I guess, either are geared towards water systems in the lower 48. We don\\'t have those types of systems they\\'re asking them questions on. Yeah, I mean, when I have taken exams in the past, just to help guide the class better. Like I pass it just to like find out what they are asking my water plant operators, what they are going to be doing for like, sure I can get by because I\\'m from the lower 48. I know how to take exams, I know how to, I can decipher like, okay, get rid of two and guess for the last two, if you don\\'t know, that\\'s not something that they have been taught to do, most likely. And even if you do tell them right before the exam, it\\'s not something that they\\'re trained to do. And even with the tests themselves, I mean, what are they made in like California, or something like that? Like they\\'re not Alaska specific for water systems that are going to be relevant. we\\'re severely downscaled from a lot of these water systems without having the same kind of water quality or, yeah, just the same type of systems.\\n\\nLeif Albertson 59:20\\nYeah, it\\'s, uh, you know, it\\'s so easy to sort of take for granted. I mean, if you\\'ve never seen a test question that said, Which of the following is not true? You know? I mean, those little tricks that that we, you know, some of us have been doing for decades. Right? Or, like you said, there\\'s no, you know, if you don\\'t answer a question, it\\'s wrong. 
So we answer all the questions, even if you don\\'t know, like these things that are so ingrained in us.\\n\\nVicente 59:46\\nAnd with that being said, I feel like so, maybe some of the families that I know here like don\\'t think that the educational system is as good as it can be, and they give their kids extra homework on the side. So that you they can have the options. Thinking of you, thinking of Brian, thinking of like, people that want to set their kids up for success living in an area where the standards are very low, you showed up, here\\'s a pat on the back. That is where the standard is on education, I feel a lot of the times, is you showed up, you kind of did something, or you turned it in severely late, and you still get credit for doing it is where the education system is in Bethel. Imagine going out to the village where it\\'s going to be completely different.\\n\\nLeif Albertson 1:00:37\\nYeah. Lauryn, you want to jump in with water plant operator stuff? I know we\\'re running a little short on time here. So\\n\\nLauryn Spearing 1:00:47\\nYeah, yeah. I was hoping Can you expand a little bit about kind of what training do you think would be useful? So kind of maybe stepping away from certification more towards? Is it on the job training? Is it with the remote maintenance worker program? What have you seen as successful? And what do you think you need to, for it to be more successful?\\n\\nVicente 1:01:08\\nEither on the job, or hands on training, I think is most beneficial. We have people super eager to learn in classes that aren\\'t necessarily water plant, super directed. But our boiler maintenance and repair program had been one of the things that water plant operators can come to to help their water plants. It\\'s not something they\\'re gonna like, oh, this goes towards water plant operator, this, maybe it does get they get some a couple CEUs towards what they have. But that\\'s a class where, yeah, they might lecture for four hours. 
But then after that, it\\'s like, this is how you troubleshoot this boiler. This is how it works. And you get guys working together, working with their hands, things that they like to do. This area, everybody works with their hands, even if you don\\'t want to work with your hands. To save money, or because somebody\\'s not going to do it for you, unless you pay them a ton of money. Yeah. We learn to work with our hands by necessity, not because of choice. So, um, yeah, in classes like that boiler class, you see people eager, you see people excited, you see people wanting to learn. Same with Oh, nevermind, that never happened. Or that yeah, there was a pumps class that happened that some guys from Utah came to teach at one time, they were people were really excited and want to learn, it was a lot of things that were hands on where they could touch and feel and tighten and put pipes together by soldering them or screw it and stuff like that. And that becomes beneficial. Same with electrical controls, where we have a, we have these, like these fake systems that, you know, if they burn out, if they were to like, be like a very minor shock, or if they were to like burn a circuit that the circuit would actually burn out. And they\\'d have to replace it. Those are the classes where it\\'s hands on troubleshooting something that they see useful, and they see something that they actually can touch and hold and see beneficial to them, rather than just taking the test so they can have a certificate on their wall when they already do that job.\\n\\nLauryn Spearing 1:03:25\\nYeah, yeah. And so in terms of so you said you\\'ve done a lot of operator training. That\\'s mainly focused on \"Okay, we\\'re gonna try to pass the exam\", correct? \\n\\nVicente 1:03:36\\nYeah. \\n\\nLauryn Spearing 1:03:37\\nOkay. Yeah. Because it\\'s connected to funding right, and things like that. Okay. That makes sense. And then, let me. And then what about I guess, from your perspective? 
Um, were you ever trained to train others, right? Like, were you ever trained on this? This piece of your job?\\n\\nVicente 1:04:02\\nNot really, there was one training called train the trainer that happened like two years ago, and then COVID happened. So we haven\\'t really been able to put that into action. But there were many entities throughout the state that were trying to figure out the training problem that there is, and how to better engage our students. And yeah, I think that was put on by the state led by MTL, I want to say, and there have been some people like Brian Berube, they\\'ve really been able to put that into action. I feel some and, Robert White that have been able to do more online educational classes geared toward the exam. It\\'s just, it\\'s going to be geared to the exam. It\\'s a piece of paper that they need to say that they did it for the state. Yeah, yeah.\\n\\nLeif Albertson 1:04:50\\nSo do you think about online training? I mean, for any of this obviously, hands on isn\\'t going to be is there is there a place for it? I mean,\\n\\nVicente 1:04:58\\nI I think there can be a place for it. I think online training, even when I have to do it, I don\\'t like it. I, I prefer to be in person, I prefer to touch things I learned with by doing and repeatedly doing. With COVID, you know, everything\\'s kind of gone towards being online. But the times that I learned the most is when I\\'m in person with hands on. So I think it can become very difficult with other people in the area that might think likewise with learning. Also, I mean, we have the problem of having enough internet and having fast enough internet, I mean, internet\\'s probably been decent right now, for me, but it was the beginning of the month, it just started over, I still have plenty of gigs left to use for the Zoom call, and for whatever else I need. But there are definitely connection issues. And the price of Internet is a luxury. It\\'s extremely expensive here. Yeah. 
And yeah, you, you have a zoom call with somebody in the village. I mean, if they\\'re able to connect and knowing what\\'s going on, they don\\'t have like, often a well running computer within a water plant where they can just be like, Hey, what\\'s going on, let me do this training here. Half the time, they don\\'t know what\\'s going on with that either. Or if they go to the office, it\\'s kind of the same deal where it\\'s this, it\\'s not their computer, it\\'s not something they\\'re familiar with. Yeah, having somebody talk at them isn\\'t necessarily something that they\\'re, they learn how to do within their family, a lot of it, when they learn is to watch somebody do it. And I don\\'t know, whether it be like cutting moose, or cutting fish, it\\'s just a lot of pointing and kind of like grunt maybe and like watching somebody, and then like, kind of just like watching how they hold their hand and what they do and how like, and then like, I don\\'t know, I\\'ve had like a fur sewing class. And I had, I was trying to ask a elder a question. And she\\'s like, No, like this, and just physically show me rather than, so when you do, and this is what you\\'re doing wrong. And you need to do it like this so that it can be better. It\\'s a no. And then just you have to watch and do it. And kind of, since I didn\\'t grow up in this lifestyle, it\\'s something that I\\'ve had to get used to. But a lot of it\\'s through visual and hands on. Yeah. So with it being online, I mean, there\\'s you lose that whole aspect of the hands on. And visual is often somebody talking at you about something that we\\'re trying to describe that they may have never looked at before. Like here are these molecules that you can\\'t touch and how do you explain how it kills them, or the disinfectant will kill them? So that\\'s a really good insight about the I mean, I\\'ve been that person learning to cut fish and being grunted at that\\'s a really good insight. 
Yeah,\\n\\nLauryn Spearing 1:07:58\\nyeah. So a lot of what you\\'re talking about too, is just like different ways of learning and figuring out how to bring those together. Right. And, Okay, interesting. Awesome. I know, we\\'re hitting at the end of time so Leif. Are there any?\\n\\nLeif Albertson 1:08:12\\nYou know, I don\\'t think so. I wanted to see if you had anything that you thought we should have asked you? Do you have anything else to cover? Or you have any questions for us?\\n\\nVicente 1:08:25\\nThank you guys for doing this study. This is I don\\'t know what\\'s going to happen with it or where it\\'s gonna go? Or maybe I should ask that. Well, what\\'s going to happen with this study?\\n\\nLeif Albertson 1:08:35\\nLet the researcher handle that one.\\n\\nLauryn Spearing 1:08:38\\nYeah. So this is actually yeah, we\\'ve kind of just got off the ground. And there\\'s two projects we\\'re working on. And the first one is looking a lot at just kind of understanding the general context, right, what challenges are present and, and talking to people like you about what solutions there may be. And then we\\'re also looking at a project specifically at operator training. And so a lot of this kind of last part of our discussion, and my brains already, I\\'m excited to review this interview after, but trying to figure out kind of how to bridge some of these knowledge sets, right, like so what you were just talking about, of actually touching things, doing things instead of the traditional lectures. And, and so, you know, we\\'re doing research, so eventually, there\\'ll be publications that we will be sharing, and we\\'ll be kind of communicating with you throughout the process, because we\\'d love your insight on kind of the initial results as well. But we\\'re hoping to have some practical findings as well to get to share later as well. 
Did I miss anything Leif?\\n\\nLeif Albertson 1:09:41\\nI would add, you know, maybe I don\\'t want to sound jaded or or you know, I think from a practical aspects, like like, you know, and I know there\\'s been a billion research projects, right. And so talking to somebody who, you know, like talking to Brian Lefferts like, why, you know, like you already know, you know, we\\'ve already said this, like we already know, the state, we already know this test is pointless. You hear those things over and over, I think that it\\'s important to keep banging the drum and I and for whatever reason, you cannot be a prophet in your own land. So when we have sometimes outside research entities who can be seen as unbiased and can bring that information back and say, Look, we did all the legwork. And these are our conclusions that, you know, hey, maybe this teaching style doesn\\'t align well. And that could be a problem, or maybe this certification. I mean, the state used to do their own right, maybe this national certification isn\\'t the best way to do that. And when we can put those things in writing, you know, it can be frustrating to feel like, yeah, we\\'ve been saying this for 30 years, you know, but sometimes you got to say things the right way. So that\\'s that. That\\'s my long term hope for what could come out of this.\\n\\nVicente 1:10:57\\nSure. I do have another question. With the settings being done like, and with this being such a native Alaskan area and talking about Native Alaskans have Are you interviewing any Native Alaskans? Or Okay,\\n\\nLauryn Spearing 1:11:14\\nso we\\'re trying to get out to Well, Leif will be the first one to go out. And so as soon as it is safe to travel, we will be going into communities. And so we\\'re doing it a lot more backwards than we had originally designed it to be. We wanted to do this completely opposite, where we started in communities, and then eventually talked to kind of like people at the state level. 
And unfortunately, because of COVID, we\\'ve we\\'ve had to kind of swap that.\\n\\nLeif Albertson 1:11:42\\nYeah, we were working on it for a while. And originally I was like, Yeah, this will be you know, we go out to the villages, you know, we\\'ll meet we\\'ll go with that. You know, I know all the . You know, we talked to Tommy Bobo. We got him on board. He knows the water plant operators, we\\'re going to this is going to be great. And then like, oh, yeah, we can\\'t go any of those places. Yeah, travel shut down. So we\\'ve been mostly talking to people who, you know, a little more in the professional realm, we\\'ve been trying to reach out to some actually water delivery drivers in Bethel. And for some of the reasons you mentioned, it\\'s hard to even get somebody on a zoom call that\\'s not, at least, in Bethel. But if you have ideas of people who talk to you if you know people who you think would talk to us and could I mean, I think we\\'d you know, appreciate the the lead.\\n\\nVicente 1:12:38\\nHave you spoken to any of the remote maintenance workers? Shane, Bruce? Billy, Alan.\\n\\nLeif Albertson 1:12:46\\nSo I mean, I heard Bruce might be a tough guy to get on a zoom call. Is that accurate?\\n\\nVicente 1:12:52\\nYeah. Might be able to get him on a phone call. But Shane would probably be a good source he\\'s you know, Shane McIntyre, right? Maybe Shane, it\\'s maybe a difficult time for Alan with some of the things have gone on in his personal life but maybe Billy Westlab you know all these names have been around or grew up in the areas or grew up in the villages themselves\\n\\nLeif Albertson 1:13:26\\nwe\\'re you know, gonna ask Bob the same question but yeah finding people who are comfortable talking and and all that but yeah, she she\\'s good idea and I mean, I don\\'t mind talking to Bruce I just he seems like more of an in person cat. 
So\\n\\nVicente 1:13:44\\nyeah, he\\'s Oh, that guy might want to put two hours or three\\n\\nLeif Albertson 1:13:52\\nUpshot is you know, he will tell you what he thinks. Yeah, okay. All right. Um, let\\'s see if if there\\'s follow up questions is alright if we contact you please. Okay. And then in the long term, I think if you have any illustrative pictures of any of the things that we talked about this is a broken water pipe or this is somebody hauling when I think of like small haul you know, I think of like like you said like people with a for people who work for the you know, the tribe or the city the driving but then somebody moving their own water with a garbage can I think is like a also an interesting image.\\n\\nVicente 1:14:41\\nYeah, man, I could have like probably done that. If you\\'d asked me a while back to that. I could have got one with like a busted tire too, trying to move 30 gallons of water with like two people on on it. I can dig around and see if I have anything.\\n\\nLeif Albertson 1:14:57\\nThat looks looks good. Yeah, shoot it my way. because we\\'re I mean, again, you know, it\\'s easy in the interviews that you and I both know what we\\'re talking about, but not everyone and not everyone who looks at it in Texas or in other places. Not everyone who works in Anchorage.\\n\\nVicente 1:15:16\\nYeah, like when you get the essay common question is like, what is the daily life look like in Bethel? Like,\\n\\nLeif Albertson 1:15:22\\nI don\\'t know compared to what? Like what? Absolutely. Awesome. Well, thanks for your time today. This is really good. Thank you. Thank you so much, Lauryn. Yep, I\\'ll be out for I\\'ll be out for K 300 with the family. So we\\'ll be we\\'ll be around then. Are you are you working that or?\\n\\nVicente 1:15:45\\nI might. I might be in Tuluksak at the checkpoint. I might not. I don\\'t know. School is gonna determine my life for the next few months. 
So\\n\\nLeif Albertson 1:15:55\\nYeah, well, good luck with that. All right. Well, we\\'ll see you then.', '1_6__InterdependenciesNNA': '\\nQC Bob White\\nThu, Aug 18, 2022 12:47PM • 1:13:53\\nSUMMARY KEYWORDS\\noperators, communities, test, water, people, plant, maintenance workers, training, bethel, state, pass, certification, issues, haul, questions, system, problem, pay, years, village\\nSPEAKERS\\nLeif Albertson, Lauryn Spearing, Bob White\\n\\nLeif Albertson 00:06\\nSo and then I just always tell people that we won\\'t share your information, but it\\'s a small town in your office. So somebody finds out to talk to us. We\\'re not. We\\'re not the CIA, right? Like, we\\'ll do our best. But yeah. And then you can stop anytime you don\\'t want to talk to us. You don\\'t want to answer a question? That\\'s, that\\'s fine. There\\'s no no penalty. All right. Well, we\\'ll start with you. How long have you lived out there?\\n\\nBob White 00:36\\nI\\'ve lived in Bethel since 2005. So 16 and a half years,\\n\\nLeif Albertson 00:43\\nwhat brought you to Bethel?\\n\\nBob White 00:45\\nMy wife got a job at the hospital as a nurse. So we moved to Bethel. And I was gonna hunt and fish. That was expensive. So I got a job. That\\'s the short of it.\\n\\nLeif Albertson 01:08\\nSo what do you do? Tell us about your job.\\n\\nBob White 01:14\\nSo I\\'m a remote maintenance worker for the Yukon Kuskokwim Health Corporation. I work with water and sewer systems and villages as a technical assistance provider to assist them in fixing equipment or adjusting treatment or thawing frozen pipes. Whatever water sewer issues they have. I\\'m a resource for them.\\n\\nLeif Albertson 01:40\\nGreat. It sounds like you didn\\'t move out there with that intention. So what\\'s your background of training or what brought you to that? \\n\\nBob White 01:56\\nWhen I arrived in Bethel, my background was in construction. 
I was a general contractor in Washington state before I moved here, saw a job posted and just a recruiter asked me \"Are you interested?\" And I started talking to the guy and next thing I know, I was being interviewed for the position. And then I had to fill out an application after they hired me. So yeah, it all kind of happened really quick. It wasn\\'t a planned thing at all. They were like, great, it would be good to have a guys that knows structural stuff on our team. I work with four other RMW\\'s. And they thought I rounded the team out well. I don\\'t know if that was a good decision or not. But I\\'ve learned a lot along the way. So I\\'ve learned everything I know about water and sewer while I\\'ve been here working on water and sewer. So\\n\\nLeif Albertson 02:53\\nThat was kind of my next question and that was when you got here so it\\'s been 15/16 years.\\n\\nBob White 02:59\\nYeah. Yeah, so I\\'ve been doing this, I did take a year off at one point. But I was still involved just in smaller ways during that time.\\n\\nLeif Albertson 03:13\\nSo in the in that time, did you pick up any certifications? You have to do any schooling?\\n\\nBob White 03:25\\nYeah, so um, in my first 90 days, I had to pass the test for level one operator, I didn\\'t have the time and place to get the certification, but I had to pass the test, I passed the test. And then since I\\'ve got the hours, I actually hold a level two certification in water, a level two certification in distribution, provisional wastewater treatment and a provisional wastewater collection as well as a lagoons certification. I passed the test. I\\'ve tested to higher levels, but I can\\'t get those certifications because I don\\'t have a higher level plant to work in. But I\\'ve passed a level four water certification and level three water distribution certification test too. So\\n\\nLeif Albertson 04:22\\nokay, so we\\'ll probably have some questions about that, too. Yeah. 
Did you do that because, I mean, is that a requirement of the job? Or is that?\\n\\nBob White 04:37\\nUm, yeah, that\\'s kind of interesting. Um, I only have to be certified to whatever level of water plants I work with. In my job, I\\'ve been on several committees with different things with the state and one of those was about some operator certification and I had complaints about the operator certification program. Not letting people continue on if they wanted to. So it used to be, if you didn\\'t have the hours, you couldn\\'t take the next level test. I said, Well, that, you know, like, if someone\\'s interested in learning and studying, and they just passed their first, why not let them continue to study and move through their second without, you know, sitting for a couple years to gain those hours? Actually, jokingly, and they still refer to it as the Bob rule. Because I created such a stink about it. They actually changed the regulations. So you can you can test at any level. And then you don\\'t get those certifications until you have the time in and experience at that plant level. So. So yeah, so once I kind of created such a stink about it, I kind of needed to go forward and test out anyhow. And part of it was just to see if I could do it. So I plan on rounding out the rest of them. I just haven\\'t taken time to sign up for tests since COVID. So\\n\\nLeif Albertson 06:02\\nIs that typical of the other remote maintenance workers?\\n\\nBob White 06:11\\nNo. Most of them will shun a test so hard and so fast, or any paperwork for that matter. So yeah, yeah, it\\'s not it\\'s not probably typical. But um, yeah.\\n\\nLeif Albertson 06:29\\nSo definitely, there\\'s people. So you don\\'t need to get those certifications to do the job. Other people are doing the job without.\\n\\nBob White 06:38\\nYeah, we have a couple of remote maintenance workers. And by the way, I\\'m out of the four guys that I work with. 
Three of them have more experience than me as remote maintenance workers and like, lots more experience. Let\\'s see Alan Popkin started as a remote maintenance worker in 1997. So do the math, he has a lot more experienced with Bruce, we\\'re both started in 95. So like, crazy amount of experience. Those guys have. Bruce has a level one certification, that\\'s the highest level of any plant that he has. Allen has level two, it\\'s the highest that he has. But um, a couple of the other guys haven\\'t passed a level two tests, they have level two plants, but they can tell you everything about that plant, they can adjust that plant, they do have a hard time testing and so. So yeah, so it\\'s not required, Alan has taken his level three test and got his level three certification as well. So but for the most part, most of our RMWs - very good hands on not very good with a book as far as testing and stuff. So\\n\\nLeif Albertson 07:59\\nSo you know, we\\'ve been talking about remote maintenance workers, can you walk us through a little bit what that job looks like?\\n\\nBob White 08:07\\nI sit at this desk and wait for the phone to ring and someone have a dire emergency. And it happens every weekend when I\\'m not working. Yeah, so day to day, we\\'re calling in, checking in with water plants, seeing if they\\'re doing preventative maintenance, working on little like long term projects with them or stuff. But then the real like where we really earn our money when it comes to emergencies. And they have a freeze up or some kind of, you know, catastrophic issue. We come in with the expertise to like to help them out in that. So we provide a resource of we have tools, specialized tools that we keep here in Bethel that we can ship out, that they may not have locally, as well as the expertise to do more specialized tasks than the local operator\\'s trained to do. So a big component of our what we do is training. 
Ideally, we don\\'t, we don\\'t go out and do work for them, we go out and train them to do the work. You know, in the past, sometimes it\\'s not been the case, sometimes we\\'ll be out there three, four weeks working doing the work. But we really tried to get away from that and be more focused on we\\'re going to go out and train you how to do this, right? That might take a week or two but then from there on out, you\\'re going to carry carry the rest of the task out if it\\'s not complete. So that sometimes even works. Sometimes we lived there for three or four weeks. So\\n\\nLeif Albertson 09:47\\nso when you when you go out there, you\\'re working with water plant operators. And I mean sometimes there must be other people helping too, right like so if there\\'s a lot to do is that Is that members of the public? Or are there other? Who are the who else helps,\\n\\nBob White 10:04\\nWho are these people? Hmm. That\\'s one of our requirements is that we don\\'t, we don\\'t respond to an emergency without local workers. So primarily, the water plant operators are our top person, whoever those are, one or multiple. And then depending on the task that we have, we will ask the tribe or the city, whoever the entity is to provide additional workers, whatever they\\'re needed. So we might have one or two that are skilled in what we\\'re doing. And then a bunch of others that are just labor that need a lot more direction and teaching. Those are generally paid by the city or the tribe, whoever runs the water utility, we don\\'t have any funds to pay workers on the ground. So yeah, so we work with whoever that entity is. Give them basically the man hours we think we need, how many people and whatever other requirements of vehicles and fuel and things. And then, uh, then we arrive. Very rarely do we have any volunteers show up just to help. 
Although that has happened on a few larger things, the Kotlik flood being one in particular, we had to move a sewer line back into the lagoon. And the community just called out for all available men to come with four wheelers and snow machines and we drug 600 feet of sewerline back up the hill into the lagoon and just using four wheelers and manpower. So that was pretty neat. But for the most part, most of the people are paid labor.\\n\\nLeif Albertson 11:50\\nOkay, well, so I\\'m curious how that works. Right? So if there\\'s, if there\\'s an emergency, I mean, you\\'re you\\'re kind of you\\'re, how does all the money work? Right? What if the village doesn\\'t have any money? Or doesn\\'t pay anyone to help? Or I mean, it\\'s still, then you just do it? Or what? A little bit of a soft process there.\\n\\nBob White 12:14\\nThere is a soft process here. So there\\'s, there gets to be the political part, right? So a lot of times the village doesn\\'t necessarily have money. But, you know, they maybe have money. So, you know, I never asked like, how their finances are, and do they have, you know, $100,000 to support this project? It\\'s never a question I asked. I say, Can you provide me this many people? And they say yes or no. And if they say, No, say, well, we need to reevaluate, like we\\'re not coming to do it without we, we\\'ve understood from doing this long enough. What we need to be successful on different tasks, like how big of a crew we need, what kind of support we need for the weather and things like that. We can\\'t put that together. We\\'re not going to mobilize and just be out there and being unsuccessful. We\\'ve done that in the past. Nobody\\'s no one\\'s happy. When you spend four weeks and you get nowhere, you\\'re still in the same spot four weeks later. So. So yeah, so the on their end, if they don\\'t provide workers or whatever supplies we need, fuel and things like that, on the ground. 
We were just at an impasse, like that thing will remain frozen until that changes, or that problem will remain a problem. And so then it gets a little more complex. Politically, you know, they have to ask for help from other entities that have financial, again, we\\'re grant-funded through the state of Alaska, USDA, and EPA. But none of that is dollars we can give to a community. So we\\'re free to them. We can fly our gear in and out, we can cover all that, we can fly our people in and out, but they have to provide that manpower that that we need on the ground. So yeah, so yeah, it\\'s kind of a it\\'s kind of an odd process. And if they truly don\\'t have the funds, then they\\'re going to ask the state or they\\'re going to we do have some other partner agencies, we can ask for some emergency funds. But we\\'re talking those emergency funds tap at like 10 grand. So pretty small. And that\\'s generally for parts, not for people. Manpower has to be funded at the local level. Sometimes if it\\'s a big enough emergency, they can then get reimbursement from the state or from the Federal after they\\'ve expended that funds, but they never get reimbursed up front. They never get a money chunk and then so\\n\\nLeif Albertson 14:56\\nDo you ever work with just, I mean I guess sometimes you would then. Do you ever work with just community members on water? Do you talk to people about water?\\n\\nBob White 15:07\\nWe get lots of calls. So we\\'re, there\\'s always, you know, this morning I came in, there was emails about, you know, a system has been frozen up for about two weeks now. It serves two buildings in town. And one of those two buildings called and was like, you know, when\\'s this thing gonna get fixed? And that\\'s just a community member, you know, concern because it was the health clinic that doesn\\'t have water now for a couple of weeks. So yeah, we field all kinds of calls from community members. 
We get calls that get directed to us, because they\\'re technical questions about the water that a local person, maybe a water plant operator, or the Tribal Administrator can\\'t explain. So then they like forward those to us so we can help explain an issue better to someone. So yeah, so we end up doing a lot of community relations in there in between. A couple of weeks ago, and had to cancel our visit, I ended up meeting with one of the community members there, because they\\'ve been very involved with a lot of sewer over the years, they\\'re really concerned about what was going on. That ended up just being going having coffee and sitting and talking for a bit and, you know, bringing them up to date on what the issues were and they\\'re a strong advocate in that community. So hopefully, it works out as a bonus.\\n\\nLeif Albertson 16:33\\nWhen you say a strong advocate, a strong advocate for treated water for piped water for what?\\n\\nBob White 16:41\\nYeah, that\\'s a community that has only had pipe water for a few years now. This lady is in her 70s. I don\\'t remember exactly how old but she\\'s in her 70s. And so most of her life, she has not had pipe water. And she\\'s a very strong advocate for her community, having piped water for her grandkids and great grandkids. And she\\'s just, she\\'s a thorn in your side when there\\'s a problem. So she specifically actually called me to see if I was involved in helping them. And I\\'m like, Yeah, I\\'m actually coming up there today. And that\\'s when she wanted to meet and talk about what\\'s actually going on. And she\\'s done that multiple times over the years when they\\'ve had issues. She doesn\\'t feel like she\\'s getting a good answer locally, she just has my cell phone. And so she just calls me and will ask what\\'s up and, and am I coming to help. So\\n\\nLeif Albertson 17:45\\nsounds like those conversations go okay, though.\\n\\nBob White 17:49\\nGenerally, most the time they go, okay. 
Sometimes people are not, are not happy with the answers, especially when there\\'s when there\\'s a perceived issue that they have, that is not actually a real issue for the community, or not something that can be addressed. We actually have probably the most complaints from school teachers. And I don\\'t know if it\\'s just because the last couple of years, I\\'ve dealt with a ton of school complaint issues, but we get a lot of complaints from school teachers that are super unhappy with the quality of water that they get when a lot of times the rest of the community doesn\\'t get that quality or quantity of water. So it\\'s pretty hard to stomach.\\n\\nLeif Albertson 18:36\\nSo I have a question about that then. Because I mean, a lot of schools have their own well, right?\\n\\nBob White 18:45\\nNo, it\\'s actually Lower Kuskokwim School District has a lot of its own wells, primarily in communities where there is no running water. So so they have their own well to provide running water for their community, or for their teachers, not the community. Some places they do provide water for the community. But a lot of the lot of schools are hooked to the community water system. And so yeah, so then there are issues and we\\'ve been dealing with a couple lately. Excuse me, that, um, they\\'ve come off as spoiled kids, is how they sound when they\\'re ranting about how teachers are leaving because of the water conditions\\n\\nLeif Albertson 19:36\\nYeah, I mean, so when we\\'re talking about teachers, my impression is we\\'re probably talking about folks that aren\\'t from the community?\\n\\nBob White 19:48\\nFolks that are not from the community brought in from outside. Yeah. So\\n\\nLeif Albertson 19:54\\nOkay. 
I just wanna make sure that what\\'s what that\\'s what I thought you were saying.\\n\\nBob White 19:59\\nYeah, but yeah, very clear, good point\\n\\nLauryn Spearing 20:02\\nAre a lot of the complaints kind of more about the aesthetics of the water like taste, smell, not necessarily like the quality like primary water regulation, more\\n\\nBob White 20:16\\nAlmost every complaint is about aesthetics. Occasionally, we\\'ll get some higher level complaints that are actually really justified about water quality. But that\\'s generally when like a village is way out of compliance for a long time and someone you know, wants to know why and how come we can\\'t get it back. But most of it is just aesthetics.\\n\\nLeif Albertson 20:44\\nSo it sounds like you work all over the place. So some of these questions might need to qualify a little bit but can you tell me about the how drinking water is provided in the places that you work?\\n\\nBob White 20:59\\nOh, yes. Um, so yeah, I basically I have communities that I work with that have all levels of service from fully piped, turn on the sink in the home and safe drinking water comes out of the tap. So just standard flush the toilet, it all goes away like it\\'s a standard, you know, pressure system gravity. So then, that\\'d be Goodnews Bay. It\\'s standard, what you\\'d see in lower 48. No extra freeze protection stuff or anything. They don\\'t even circulate the water. But they have buried pipes and they\\'re warmer little pocket of climate where they\\'re at. Most of my pipe communities have a circulating water system. So circulates treated water, always moving around the village being heated. And then you know, comes into your tap and then goes out either through a gravity sewer with lift station or we have several low pressure. Low pressure sewer systems like a little lift station at every house. 
So then we have haul systems, I have several communities that are a small haul system, they have a 200 gallon trailer that they take around and haul water to houses they fill a anywhere from 100 to 150 gallon tank in the house with water that\\'s treated water that comes from the water plant hauled to the house, pumped into the tank. And then they have a septic tank that is pumped out weekly as well or monthly however often they get service. Most of those small haul communities do service on a as-call basis. So in other words, it\\'s not a regular service. Whenever the home calls for service, they get service. It\\'s the most expensive way to run that system. Because there\\'s absolutely no management of where that trailer is going to haul water or pickup sewerage. Like all over and sometimes the guy just sits around waiting for someone to call. Then I have communities that have a central watering point. And this is either at of washateria or just at a well building and they can go fill containers and they self haul their water from there. Generally in large trash cans or five gallon buckets, haul it to their house some people take that in and put it inside some people just bypass the watering point completely go down to the river and chip ice because they prefer the ice water over the treated water with chlorine and they some people use the ice water. So all different levels. In a lot of the self hauled communities, people will haul water for cleaning but chip ice for drinking. And so a little different than you\\'d expect. They like the taste of ice water. The cleaning water is much easier to get from the haul point because you just pull up press a button and it comes out of the hose. So yeah. The self-haul ones where they\\'re self hauling water. They\\'re generally on honey buckets. 
So their waste is in a honey bucket that they either self haul to a lagoon or they put out and it\\'s collected and hauled to the lagoon for them.\\n\\nLeif Albertson 25:04\\nHow has the situation changed over time? I guess you\\'ve got 15 years right? Have you seen,\\n\\nBob White 25:12\\nLot of things have changed. Lots of things stayed the same. I\\'ve had a few communities go from from honey bucket to fully piped, which has been interesting to watch that process. And watch how it changes that. I\\'ve had communities go from hauled to pipe as well. But honestly, there\\'s 1,2,3,4 communities out of 16, no five communities out of 16 I started with that have moved up in service. So most of my communities have stayed the same as they were 16 years ago. So. Yeah.\\n\\nLauryn Spearing 25:58\\nAnd following up on that, have you seen any kind of more deterioration of some of these maybe because of aging infrastructure? Or not really? It\\'s kind of just stayed the same?\\n\\nBob White 26:10\\nNo, it\\'s a complete disaster. Um, yeah, I mean, we have, we have plants that, you know, they\\'re built with a 20 year life expectancy, generally, and most all of the plants were past their 20 year life expectancy when I started work. And we\\'re still using some of those plants today. Yeah, probably half my water plants are still operating the old plants they did 16 years ago. And so some of those with utterly crazy plans to rehab and continue using that facility another 20 years. We\\'re talking about a facility was made in the 70s and was barely adequate then. And now we\\'re talking about using it for another 20 years from now. So yeah. 
Yeah.\\n\\nLauryn Spearing 27:10\\nDo you find that I, we kind of touched a little bit on the funding piece earlier, but is it almost easier to get things funded if something goes really wrong, like and using emergency funding, compared to actually like maintaining the system or improving it.\\n\\nBob White 27:26\\nSo last year was a perfect example. It\\'s a year ago, that the village of Tuluksak, their water plant burnt down. They\\'ve been in this process the entire time I\\'ve worked with them, they\\'ve got money and lost money because they couldn\\'t meet funding requirements. So money was allocated and went away. And money was allocated and went away. They were at last year, they had been in a MOA with the state and funders to meet some requirements. So funding can be released under special conditions to finish work that was started six, seven years ago. So this was just a little bit of funding to finish up work. Not even to build the new water plant yet. And that was getting nowhere, like literally, the money wouldn\\'t be released because it couldn\\'t meet some things like nothing was happening. We weren\\'t even talking about a new water plant yet. Their water plant burnt to the ground, instantly an emergency. Everybody, they now have $11 million worth of funding in place. And they will have a water plant, a new water plant next year in the village. Right now they have an interim one that we helped arrange get in there. But um, more funding has been released and spent in the last year, more progress has been made than the entire 15 years before that. Because emergency made, and it was it was one of those plants that was it was over 40 years old. And it was in horrible shape after 40 years. I mean, we had, you know liners in water storage tanks because they\\'d actually rotted through. And we were needing to replace the liners because the liners were leaking because they were past their life expectancy. 
I mean, I was like yeah, we had epoxy holding the filter vessel from just spraying water out. Crazy level of worn out equipment. And the village actually spent, ahead of that fire, they had spent I don\\'t know several thousand dollars upgrading pumps and filter parts and they had done a great job in the three years prior to that. Bringing the water quality up in the plant and like making a lot of expensive repairs. That got them nowhere towards a new project, the best thing that could happen was the building burned. And I was on a plane, I did not set that building on fire. I may have wished for it to burn a lot of times, but\\n\\nLeif Albertson 30:18\\nyou know, we\\'re recording, right? \\n\\nBob White 30:21\\nOh, yeah. I said I was on a plane because people asked. \\n\\nLeif Albertson 30:29\\nSo talking about people a little bit, you know, we\\'ve heard you talk about people that call you because they\\'re advocates. But we\\'ve also heard you say that, you know, some people happily prefer to drink, you know, chipped ice. So, can you talk to us a little bit about what water infrastructure means to the people that you work with?\\n\\nBob White 30:56\\nSo you know, the common thing is, most people think of water as in drinking. And the real benefit we found in this region is not water as in drinking, but water availability, for cleaning, for bathing, for all that stuff. Communities that haven\\'t had running water at their house, once they get used to the adjustment, and they can just put the kids in the bathtub and run the bath. And this is actually more more with the grandparents than with the parents is the interesting thing. The grandparents are so excited that they can do this with their grandkids, because they\\'ve lived so long without this, it\\'s like it\\'s a dream come true to have these, these basic functions of water in the house, they may still chip ice for their preferred drinking water. 
But they love the fact that they can wash and that they can clean and they they see a lot of disease rates go down and other stuff. So the water infrastructure is much more than just about drinking. And the largest benefit and why most people want to keep it is not the drinking water aspect, but all the other aspects of it, how it changes their life. So\\n\\nLeif Albertson 32:25\\nWe talked about some of the challenges that water infrastructure faces. And about sort of, so it seems like there\\'s issues of sort of quality and of quantity. And then also the sort of the quality issue is sort of split into two pieces. You talked about aesthetic quality, but then there\\'s also sort of, is water safe to drink? Is that, am I kind of understanding the, you know if somebody said, well what\\'s the problem with water? Am I getting that right? Or is there more to that?\\n\\nBob White 33:05\\nUm, no, that\\'s, that\\'s pretty good. There is there is the aesthetic, and then the, whether it\\'s safe to drink. And you know, the safe to drink is interesting, because that\\'s driven by national standards. And that creates a lot of cost and a lot of burden in communities because they don\\'t necessarily make sense for the community and where it\\'s at. They can produce a safe water that may not meet the standards. And so that is a whole interesting thing. It\\'s like you get this thing that says your water is bad, even though your water is perfectly safe to drink. But it\\'s not meeting some national standard that kind of doesn\\'t apply well. \\n\\nLeif Albertson 34:01\\nCan you give an example? \\n\\nBob White 34:07\\nYeah, so we have several systems that are groundwater under the direct influence of surface water. So these are shallow wells, Oscarville is one of them. Oscarville has had this groundwater under direct influence of surface water designation for about a year and a half now. That well was put in in the 70s. 
And the state just decided to reclassify it based on its depth and proximity to the river. There\\'s never been a positive coliform sample. There\\'s never been anything scientifically that says that that water is not safe to drink. But they don\\'t meet the drinking water standard because now it\\'s treated as surface water. Their systems not good enough to treat surface water, doesn\\'t make the right treatment points. So it could potentially make you sick, although it never has. And so now, you know. So we have that same example, repeated over and over in communities. So some reclassification at some level creates a problem. And like, it\\'s not really addressing a problem that really exists. Platinum, again, they\\'re the same thing, they have a GWUDI well, but that well has, like, always been solid. Never. I mean, it\\'s like so. So they\\'re on boil water notice, like, permanently, they can\\'t meet treatment standards. So, yeah, so stuff like that is pretty common. And so, aesthetically, both of those are some of the best water that we have. But won\\'t meet the standard. And never made anybody sick. So, like, yeah.\\n\\nLeif Albertson 36:06\\nSo pivoting a little bit. Seasonality, are there times of the year that you have specific problems or more problems?\\n\\nBob White 36:19\\nYeah, I mean, right now we\\'re in the midst of winter and things freeze. Turns out, not today, it\\'s raining out. But um, you know, we\\'ll be below zero by the end of the week, so things will freeze again. Right, now we have four water systems in the YK Delta that we have that are frozen up. So we have a lot of freeze up issues. midwinter starting right around, well, starting at Thanksgiving. And then again, at New Year\\'s, two spikes in times that we see freeze-ups due to weather and days off coinciding in bad ways. Then come spring, when we have break up, we have a lot of flooding on the lower Yukon. So that creates seasonal issues there. 
This winter, we\\'ve had, we had a warm up at Christmas time that created a lot of treatment issues with our surface water systems. We had a week of rain. And we saw water consistency changed from what we see normally during the winter to normally what we see during the summer, which meant like mid in a week, we had to all of the sudden change all of our treatment scheme from our winter program to our summer program. And then like couple weeks later, we had to switch it all back. It created quite a bit of disruption and a couple of the plants running out of water because they didn\\'t catch that everything changed until they dropped quite a bit of water level. So So yeah, so there are seasonal variations. Our surface waters all generally have like a summer and a winter thing. With sometimes some special things you have to do during break up when they have some real high turbidity issues. Groundwater is much more stable throughout the year. But we do have a lot of seasonal issues with with freezing, freezing pipes, freezing water plants during the winter. \\n\\nLeif Albertson 38:27\\nAnd what about, like long term weather changes? Is that affecting water infrastructure, like climate change?\\n\\nBob White 38:38\\nYeah, you know, we also live on a river delta. So we have a lot of erosion issues. And sometimes that happens drastically, we\\'ve noticed, you know, some communities, we\\'ve had to move back infrastructure. From the river as the river erodes, we\\'ve actually like been lopping off parts of the pipe and like pulling it back, so it doesn\\'t go into the river. Which is a interesting, we\\'re getting close in one community to have to move a whole part of the circulating main. And that will become a true emergency that no one\\'s even funded yet. And it will cost quite a bit to do to move that. That won\\'t be an easy just dig it up and move it project that\\'ll be a couple of million dollars. So yeah, so we do see some issues with that. 
And it kinda you know, it comes and goes. Sometimes we can see really fast erosion in a community, and then it can change and we\\'re like really slow down because something\\'s changed in the river. It starts moving the other way. We\\'ve seen more turbidity issues, not as many as were predicted with global warming. There was a lot of predictions a few years ago, that it was going to spike way up more. It hasn\\'t quite, but we have seen changes in sourcewater. So we\\'ll see, it\\'s always, it\\'s always changing. But it\\'s changing in some different ways lately. So\\n\\nLeif Albertson 40:14\\nLooking at the big picture of water infrastructure that you work with, what\\'s the big thing? If you could change one thing if you had whatever you needed to have, a magic wand?\\n\\nBob White 40:32\\nYeah. Um, I think when I think of the magic wand, I think it changes by what I\\'m dealing with currently.\\n\\nLeif Albertson 40:46\\nNeed a couple of wands? Okay, right. Yeah.\\n\\nBob White 40:48\\nSo there\\'s different times where I see different problems. Lately, something I\\'ve been noticing, and I think would be really helpful. It\\'s not actually not the water plant operators. But it\\'s the managers over the water plant operators, or the village infrastructure, or city infrastructure, whichever it is, isn\\'t a recipe for success. So when there\\'s problems in that management above the guy that\\'s doing the thing on the ground, it\\'s just, it\\'s a game changer, the right input the right things, a problem gets solved easy, the wrong and we just flounder with it. And so that\\'s one of the things I see is the magic wand, if I could wave anything and and change some of that, that management. But also like, we have a ton of aging infrastructure, that it doesn\\'t matter how good a management we have, we\\'re putting band aids on top of band aid. And without replacing that aging infrastructure, we\\'re just, you know, we\\'re never going to succeed. 
We\\'re just going to hold it off a little bit longer. So those are my two if I had two, the aging infrastructure, completely would. And then some management things locally, I think can make a huge difference. And that\\'s not necessarily education, either. Everyone\\'s like, oh, you need some more education, about management, no one needs education about management. I think for a big part is I see that there\\'s like changing, changing how they do management at a local level to make it work locally. A lot of it\\'s somebody told them a plan, and they follow that plan. And, you know, I had water plant operators getting laid off the day the water tank ran out of water. And I was in the village for emergency work. And they were getting told to go home, we don\\'t have money to pay you. I\\'m like there\\'s a disconnect here. That\\'s like, you know, yeah, it was kind of like it showed like, they\\'re using an outside plan of you know, this many dollars, you know, you only have this, you need to cut this but like, they had a different problem. It didn\\'t fit the formula they had been taught of management and so yeah, so, so some of that some real customized to like, drill down and find out what\\'s not working in a particular situation and then like, fix it. That\\'d be my magic wand. A little sociology, a little economics all waved together.\\n\\nLeif Albertson 43:45\\nWhen you\\'re talking about management, and that sort of thing. I mean, I\\'m thinking of like there, the state of Alaska has some sort of program to help tribes or rural administration, right. They kind of do audits or whatever is that\\n\\nBob White 44:05\\nThey have Rural Utility Business Administration?\\n\\nLeif Albertson 44:12\\nRUBA\\n\\nBob White 44:13\\nRUBA, I was trying think what it stood for, we just all say RUBA so often. And so yeah, they are tasked with training in the business end of the water and sewer. 
And they also do a lot of the regulation of the business end of the water and sewer. Over the years, they have leaned way more towards the regulation end than the training end. And so yeah, a lot of they\\'ve been they\\'ve been limited internally due to their own leadership decisions about travel to villages for I think the last three, four years. So their people have not been out traveling to villages unless the village pays to bring them in. And so that has really limited their ability to speak into a lot of those situations and improve. And honestly, most of the villages don\\'t see value in the advice they get. Very few of those RUBA people, might show a little my bias but, don\\'t have business experience. Have never actually worked in a water sewer utility. They\\'re checking checklists. And it\\'s training by checklist. And so it\\'s not necessarily the function training that someone on the ground needs. So.\\n\\nLeif Albertson 45:53\\nAnd the training that they\\'re doing, though, is would be for like a Tribal Administrator, kind of. Yeah, they\\'re not working with operators, are they?\\n\\nBob White 46:02\\nThey\\'re not working with the operators. No, it\\'s doing doing some quick books, you know, do you? Do you have the right? Are minutes getting done right at your board meetings, is your water plant operator giving a report to the tribe? It\\'s just kind of some things like that, but they\\'re like these check points. Like, this is what best practices, you should do this, but those don\\'t necessarily correlate to the problem that may be in a community. So. And they\\'re not there in the community to know if it\\'s correlating or not. That\\'s part of the issue. You got to kind of be present. Get into things find out.\\n\\nLeif Albertson 46:46\\nWell I think we\\'re probably going to pivot a little bit and talk about operators and operator training. Before I do, Lauryn, is there anything? Anything?\\n\\nLauryn Spearing 47:00\\nNo, I don\\'t think so. 
I think I asked my follow up, follow up questions as they were coming. Well, actually, one thing you might have already said it, but how many communities do you serve? Or work with? Do you have kind of a set amount? Or?\\n\\nBob White 47:13\\nYeah, so um, I currently serve eight, I used to serve 16. So I took over managing the remote maintenance worker program here at YKHC. When I did that, we reallocated my community. So I could still do some fun stuff out in the field. But I\\'d have little more time to do the office and so. \\n\\nLauryn Spearing 47:36\\nOkay, so the other remote maintenance workers have maybe 16 or around that load?\\n\\nBob White 47:42\\nYeah. Anywhere from 16 to 10, I think is the lowest that one has, yeah, so \\n\\nLauryn Spearing 47:49\\nOkay. Thanks.\\n\\nLeif Albertson 47:52\\nSo you travel out, and you work with water plant operators, but you also do training. Right? \\n\\nBob White 48:02\\nYeah. We do classes that are well required if you want to pass the, the level one or level two test. So we generally do small treated and level one. We haven\\'t offered any level two trainings in a long time in our region. So yeah, we do those in person, we also do some specialty skills training, we do boiler training, we do electrical controls. We\\'re developing a plumbing training right now. Basic pipe fitting, basically. So we do both some book training, which would be the level one, level two, and lagoon. Those are all kind of book based lecture style. Our other trainings, our skill trainings, are very little lecture, a lot of hands on skills.\\n\\nLeif Albertson 49:05\\nWhat do you find works better?\\n\\nBob White 49:09\\nHands down, hands on skills. Our boiler class and our electrical controls, they fill up every time, there\\'s wait lists. People walk out the class saying I learned something, I could do something. We get reports back. 
When people go back from training, they\\'re like, Oh, I went and did this thing in my plant that I didn\\'t know how to do or I was afraid to do before. I\\'ve never had someone call me excited after they, you know, left level one training the week later be like, Yeah, I went back and I redid this thing in my plan because I so much understand it better. That was never the case. So\\n\\nLeif Albertson 49:46\\nIs it hard to find people to run water plants? I mean, is there openings or turnover or is that a issue?\\n\\nBob White 49:59\\nYeah, there\\'s there\\'s a high level of turnover. And it\\'s for different reasons. Some of it is, is a salary, it doesn\\'t pay well in a lot of communities. Some of it is, is not being able to pass the test. There is a stigma that a lot of operators have, when they take the test over and over and they they can\\'t pass, they feel like someone else could do it better, even if they\\'re a good operator. So some of that some of it\\'s support. I\\'ve had some really good operators, highly skilled, just get fed up not having good support around them in the community. When something goes wrong, it\\'s all on them. And so they just after so long of that, just, you know, they\\'re done, and they move on to something else, or just be unemployed, because it\\'s better to be unemployed than have that stress. So\\n\\nLeif Albertson 51:00\\nOkay, so just to follow up on that a little bit, is it a good job? I mean, not just the pay thing, but I think in the public thing, public perception, right? Are these people seen as, like in a positive light? Or is there community support?\\n\\nBob White 51:23\\nIt\\'s a great question, um, I think it can be a good job. I think most of the water plant operators deal with more complaints. No one\\'s like, wow, this water is great that\\'s coming out of our faucet. But when something comes out of their faucet that doesn\\'t look great, they sure do say something to the operator. 
So it\\'s one of those jobs that when you\\'re doing your job well, no one notices and no one seems to care or know. But when you don\\'t do your job, or something goes wrong, everybody has a comment and a complaint. So I think that\\'s kind of hard for the operators. And the money issue, that\\'s a decision made locally, right? That\\'s a decision made locally, I have operators that make anywhere from 10/12 dollars an hour, up to 20/30 dollars an hour. So great disparity across where they\\'re at. Honestly, like everyone thinks, like, increase the salary, that\\'s the answer. One place to increase the salary. Everyone\\'s like, Yay, great, then they\\'re like, oh, we can\\'t afford this. So they cut everybody\\'s hours. Net, the people ended up actually losing money on the deal. And, you know, they didn\\'t know how good operators they had. So they burnt their operators, so their operators, turn the boilers back on. And now they\\'ve spent $15,000 in fuel this year that they wouldn\\'t have spent before because the operator was there and like on top of whether the boilers needed to be on or not that day, and he would actually turn them off on days it didn\\'t need to be on. Well, now they spent more money. It wasn\\'t a whole picture. And so yeah, operators, generally they need to get paid well. And they need. I mean, I think really the support to know that they\\'re doing something good. I think most of my most of my operators that are good at their job, they care about the position and they care about the job. And they will take less money if they get support and recognition for what they\\'re doing. So because they see it as a service to the community.\\n\\nLeif Albertson 53:52\\nAnd then the other thing that you mentioned was passing a test, coming in for training and then passing a test. That\\'s also a barrier.\\n\\nBob White 54:07\\nThat\\'s a huge barrier. \\n\\nLeif Albertson 54:08\\nYeah. So talk to us about that. 
Tell us about the test.\\n\\nBob White 54:13\\nSo our pass rates in rural Alaska are horrible. You know, I can\\'t remember seven to ten percent, on average, something like that. Those are actually good pass rates for a standard class. We\\'ve done some things lately working on some different ways to do the teaching, that have got those higher, but they\\'re still less than 50%. I think the highest we got was 40%. And everybody in the state thought it was phenomenal. And that took like a crazy amount of hours into that training to get that amount of pass rate. Most people will take this test multiple times before they passed it. We stopped doing level two tests because most people eventually will get their level one, if they stick with it long enough. Most of them will never pass the level two test. We have RMWs that have failed the level two test and not by a lot by one or 2%, multiple times. They know more about water treatment than I do. They just don\\'t test well. And so one of the common things we hear is \"best answer\" questions, or as they call them locally \"trick questions\". There were trick questions on the test. Best answer questions are almost impossible for people that English is their second language, where they don\\'t have real solid grasp on English book learning to understand and break down those questions to get the right answer. So then at that point, now they\\'re just one in 25%, whether they\\'re gonna get it right, right, because it\\'s multiple choice, and they\\'re gonna stab in the dark, because they don\\'t, they don\\'t get actually what it\\'s saying. A lot of these people, if we read them the test questions, they can tell us the correct answer. But if they read it, they struggle to get the correct answer. So it\\'s really amazing to see a guy that can answer every question in class, come up short on the test, every time. 
You\\'re like, he\\'s going to nail it this time, and but he just doesn\\'t quite, when he reads it himself, it doesn\\'t sound the same as when I read the question in class. So.\\n\\nLeif Albertson 56:44\\nSo I feel like I\\'m probably know the answer to this. But in terms of, is it more important that a operator, get hands-on like on the job training, or certification and pass the test? \\n\\nBob White 57:02\\nDepends on who you ask. If you ask operator certification at DEC, they will say they need to pass their test. It\\'s actually become a huge issue compliance wise to have a non-certified operator, we\\'re starting to see a lot of issues in DEC, pushing on that issue. I\\'m pushing communities hard to get someone to pass the test. Or my point of view, the test is irrelevant. I\\'ve had people that could pass the test, but couldn\\'t operate the plant. I want people that can operate the plant that can do it well. They can learn the intricacies of their system, how to make it work. And do that every time. If they can pass the test, great. I tell administrators like you should send them to training. It\\'s a goal to get them to pass the test. But it\\'s definitely not the gold standard, how they operate their system day in and day out. That\\'s, that\\'s what\\'s important. And so, yeah.\\n\\nLeif Albertson 58:14\\nSo if you could do anything, if you were in charge of the regulation, what would you do? What\\'s the answer?\\n\\nBob White 58:24\\nAh, the state has been working for years and spent several million dollars on system-specific certification. And they have fallen short at ever getting that accomplished. So and the concept in that would be that we have an operator that\\'s taken the test a couple times, can\\'t pass. Instead of them getting this huge general test that\\'s for the whole nation, they would get tested on specific components that they have, and they would get certified to work in their plant, it would not be transferable to another plant. 
It\\'s for their plant. And they\\'d have to take several smaller tests, you know, in chunks, that would address the components that they have and the things that they need to have. I think that is the easiest way. Just getting that in place. And people can be certified to what they know and where they\\'re at and we could say yes, they\\'re safe to operate their system. Can they go to Florida and operate a water plant? No. But honestly, could I go to Florida and operate a water plant? I mean, I know the generals and I passed the test, but I\\'m gonna be a fish out of water down there, like you know, it\\'s not even really applicable. So I think that\\'s the magic wand. And we spent so much time and got nowhere. I mean, we\\'ve literally been working on this for years. I don\\'t remember the first time I\\'ve reviewed curriculum, but that was like, easily 12 years ago. And we still don\\'t have it in place. Several contractors that have been paid to build it, the state, like killed the project multiple times, they\\'ve never got a finished product.\\n\\nLeif Albertson 1:00:29\\nSo then we\\'re kind of using the national test, which it seems like is not maybe a useful tool.\\n\\nBob White 1:00:38\\nWe used to use a state specific test that was designed around actually had some Alaska State questions on it. We use a national test now. Matter of fact, the test is even, it\\'s through ABC. I don\\'t remember what ABC stands for. Something, something something. But um, but that test is now not even just being used nationally, but also internationally. Canada\\'s adopted the same test and some European foreign countries. So like, a lot of the details that are like that you would learn that are important to actually how your plant operates have been taken out, because those are national standards or state standards. Now we\\'re talking the international test. So everything just like concept stuff, which is like the hardest for them. 
So I feel like our, our test has got less relevant. Well not just feel like, it has actually got less relevant over time to the people locally, which is harder for them to grasp and understand and pass. So\\n\\nLeif Albertson 1:01:51\\nSo one of the things, you know, you\\'re talking about, like are the questions, the right questions, but then you also talked about just sort of, based on people\\'s culture or background, or you know, their educational history or whatever, they\\'re having a hard time with the actual sort of the mechanics of the test, as well. Is that I mean, is that a solvable problem? Is there a ways to teach or ways to address that?\\n\\nBob White 1:02:21\\num, there are, I mean, there, there are ways to address that. So we\\'ve worked on doing some different things with education, because COVID kind of threw everything on its head. So we\\'ve actually had the most success with actually with an online training. But I don\\'t think it was successful because it was online, it was successful, because it was spread out over a six week period. With like, stuff every other day, and tutoring every day in between, if you want it. So like there was a lot of man hours put in, change in format, a lot more, a lot more time taken per subject and stuff. And we still found major holes that we\\'re trying to improve for the next round. So there\\'s things to improve. When I started, it was a state test. Now, this international test, it\\'s completely the wrong direction to move. You know, it\\'s done for lots of reasons. But those reasons don\\'t help. Don\\'t help people locally. I just, it\\'s not the direction we should be moving as a state. It\\'s ease, it\\'s expedience. It\\'s different things. But yeah. 
I don\\'t know if that answers it or not.\\n\\nLeif Albertson 1:03:52\\nI mean, so it sounds like more hours of training helps, right?\\n\\nBob White 1:03:57\\nMore hours of training definitely help, which is a challenge, you know, with distance between communities and like, you know, it means you have to pay an operator to sit at a computer or something and do or you have to pay them to come to Bethel or another regional hub and do that training. That\\'s expensive. I mean, training is not cheap. More hours seems to help. Honestly, I think a huge, operator certification is super rigid. On the test taking, a lot of people can get accommodations for other certification tests, that operator certification will not allow on the drinking water test. So you know someone\\'s graduated college can get accommodations for reading disabilities or something and get a certification test read to them. Operators certification won\\'t allow an operator who doesn\\'t speak English or doesn\\'t read English very well to have the test read to them, even though that might be the difference between them passing and not. So there\\'s like small things that could be done. Without changing the whole system. I would advocate, we rebuild the whole system, because we\\'ve got into some really weird morphed thing that really doesn\\'t suit our need. I don\\'t think. Most of it, let me let me just say this one other thing, before in case we move on, most of it is driven by reciprocity. And that\\'s the thing the state says like, well, whoever takes this test can take this thing anywhere. And that works good. If you live on the road system, and you plan on moving from Anchorage to Fairbanks or Anchorage to Florida. But most of our operators in rural Alaska, are never moving to the road system to work, let alone moving to Florida. So reciprocity doesn\\'t make a difference for 90% of the communities in the state. 
So\\n\\nLeif Albertson 1:06:19\\nHow much of the training that you\\'re doing, I mean, is there any amount of it that\\'s sort of, like test taking skills or like, you know, training about testing versus training about subject matter?\\n\\nBob White 1:06:35\\nWe\\'ve done some of that. We\\'ve stuck in some portions, spend a couple hours out of our week training on test taking skills. Teach them all the tricks you can, you know. But some of that is something that doesn\\'t play culturally, very well, is the other thing we found. They don\\'t understand why, you know, why we\\'re trying to trick them. Why do they have to learn this trick to like, understand what\\'s going on? And so that\\'s kind of a hard thing. How do you, you know, like, I don\\'t know it\\'s what you got to do. But that doesn\\'t sit well.\\n\\nLeif Albertson 1:07:20\\nRight. No, I, yeah. Which one of these answers is not correct? You take a three hour test at six in the morning. And yeah, yeah. Okay, I know. We\\'re getting a little short on time. But I know that you live in Bethel. And so I\\'ve been asking the same question. And I\\'m curious if somebody was moving to Bethel. Lauryn was moving to Bethel from Texas. And she said, hey, I heard you know, water is an issue. But I\\'m looking to find a place to live. What advice would you give her?\\n\\nBob White 1:07:55\\nBring piles of money so you can have all the water you want. I\\'m on hauled water, I pay $450 a month now, for my water, sewer. Love it. And I forgot to thaw my fill. So I didn\\'t get a water delivery this last week, I\\'m still gonna pay the same amount. My advice would be get a house on the pipe system. Because the water is less than there\\'s more of it. That\\'s my advice to most people coming in. 
So I chose to buy a house because I liked my location more than water.\\n\\nLauryn Spearing 1:08:41\\nAnd another question kind of on the same vein, but more in terms of communities that aren\\'t served yet maybe have like a central watering point. What do you think kind of is best case there? There\\'s a lot of discussions about like systems like PASS and things like that. And I think we\\'ve heard opinions across the spectrum in terms of what these communities deserve. What is your kind of perspective there? Do you, is your goal to get everyone pipe systems?\\n\\nBob White 1:09:13\\nYeah, I mean, pipe system passes a disaster that we just haven\\'t found out is a disaster yet. But everything we\\'ve learned over the years, like went out the window when they designed PASS. I\\'m not even sure it\\'s better than nothing is the problem, right? It\\'s the perception you have something but you really don\\'t have anything more than the bucket of water you\\'ve already hauled. It\\'s just now in this thing so I can get it out of my faucet. but yeah, our perspective is piped water and sewer for every community. That is the thing that\\'s going to change the health issues. It\\'s going to have the greatest impact. Mmm, is that long term sustainable? I think it is in more places than we give it credit for. I don\\'t think we look at sustainability right. When we get pipes in a community, it\\'s amazing. The community could not afford a honey bucket system, but all of the sudden they can afford a pipe water system, because it\\'s something worth paying for. No one\\'s gonna pay a dime for PASS. PASS has a lot of parts that need to be changed and fixed. It\\'s kind of a complex system, we\\'ve we\\'ve done that with co-water, we\\'ve done that with different things. Long term, they always run into these huge roadblocks. And the service doesn\\'t equal the cost. You know, I pay $400 a month for my water, not because I want to. 
But because that\\'s what gets me a level of service that I don\\'t have to worry when I turn on my faucet or when I flush my toilet, I\\'m gonna have water. And so communities we found can afford much more than, like the EPA modeling says they can. And they\\'re willing to pay more for good service. So I really think pipe water is the best thing for most communities. Some of our smaller communities, it might not be, you know, I\\'d like to see more individual wells in some small communities and septics and things where it\\'s more sustainable. But I think that we can have like, high level of water use at every home. It\\'s just gonna look a little different in each house, each community.\\n\\nLeif Albertson 1:11:49\\nWell we\\'re kind of pushing up against time. I super appreciate this. This is really good stuff. Useful. What am I missing Bob? What what other things would you want us to know? What should I have asked you about?\\n\\nBob White 1:12:09\\nI don\\'t know. We\\'ve talked about a lot. Um, alright. Yeah, water sewer. It\\'s um, it goes on and on.\\n\\nLeif Albertson 1:12:22\\nIs there anyone else you think I should talk to?\\n\\nBob White 1:12:25\\nUm, I definitely think you should interview Alan and Bruce and Billy\\n\\nLeif Albertson 1:12:31\\nAm I gonna get them on a zoom call you think?\\n\\nBob White 1:12:34\\nYou can probably get them on by phone. It\\'s probably gonna be the best. They generally call in when we do zoom because it\\'s more reliable than their internet. Alan will probably be, Alan has a ton of experience, been with RMW program forever and has been probably to more communities than any of the other RMW\\'s. so yeah, I would say he\\'s probably the most valuable to get on the phone. And, and Billy will definitely have some different stuff as well. So yeah,\\n\\nLeif Albertson 1:13:19\\nokay. All right. Well, we\\'re kind of grinding through these interviews a little at a time. So put in a good word for us. 
And I\\'ll probably reach out. Lauryn, is there anything else we should talk about?\\n\\nLauryn Spearing 1:13:34\\nI don\\'t think so. We might have a couple of follow up questions but I don\\'t think I have any questions at this point. We talked about so much so thank you for your time.\\n\\nBob White 1:13:44\\nYeah, no problem.\\n\\nLeif Albertson 1:13:48\\nAlright, I\\'m gonna stop the recording then.', '3_1__InterdependenciesNNA': '\\nQC Pete and Bill\\nTue, Sep 13, 2022 8:49AM • 1:19:15\\nSUMMARY KEYWORDS\\nwater, people, piped, trucks, pipe, problem, homeowner, big, house, pay, permafrost, sewer, tank, plant, years, building, gallons, alaska, challenge, cost\\nSPEAKERS\\nMichaela LaPatin, Bill Arnold, Leif Albertson, Nikki Ritsch, Lauryn Spearing, Pete Williams\\n\\nPete Williams 00:00\\nA sewer project called the Avenue Project, I forget how many 1000 feet. To hook up 140 houses, 140 residences and involve 500 people. And the original cost for it was $13 million. So right now what they\\'re quoting us for piped water, hauled, and sewer is $1 million per 1000 feet. And so we recently, we bumped that 13 million up to about $13 million when we originally started, we bumped it up by 40%. Because of all the all the supply problems and all that stuff that\\'s going on, we went out to bid this, this happened a month ago. And those projects came in at $42 million, and $48 million. So funding is the big crux problem. I can say, though, on that, and this is where Bill will come in. There\\'s been some big improvements in the systems, you know, like we\\'re using the SCADA system, they call it and so it electronically monitors the whole. So there\\'s been a lot of improvements in the system that weren\\'t there before. But the expenses just kill us. It\\'s just. And the problem right now we\\'re in the mix where the haul drivers, you need CDL licenses to do that. And those are in short supply. 
And just today, we\\'re almost down to the point where everybody\\'s going to get water, but they\\'re not going to get what they need. Or what they want. You better grab a chair Bill, they\\'re doing a water and sewer study and some of it\\'s on systems. So anyways, from my standpoint, that\\'s kind of been the big challenge. And another challenge is for us as with the USDA is the paperwork involved. I mean, I can\\'t, I can\\'t, we\\'re lucky that we have somebody that can process the grants, John, we\\'re lucky we got somebody with a lot of education is what it takes. But like the USDA, it took us two years just to put the application together. So that\\'s a big challenge for the villages, or anybody else is trying to get through this. And Brian was involved in that side of the street in the villages. But so anyways, that\\'s our challenges of trying to get water, pipe water and sewer. Without it, the town has grown. And it\\'s going to come to a point where I just just said that, we\\'re only going to be able to deliver X amount of water to everybody, and they\\'ll have to live with it. We don\\'t want to go there.\\n\\nLeif Albertson 02:56\\nSo like with the new subdivisions coming online?\\n\\nPete Williams 02:59\\nThat\\'s that\\'s what I\\'m worried about. Yeah, so we\\'re trying to do well, with this project, we\\'re, we\\'re kind of chasing our tail. Because if this project can come about, it would have saved about $1,200,000 in labor costs in driving the trucks around. It would also give the current personnel more time to do the work up town. So there\\'s some other benefits, you know, once you get piped up, we\\'re right in the middle of we actually did a preliminary engineering report, or we\\'re working on it, it\\'s not complete yet for the whole town. So that\\'s, so there\\'s a lot of engineering, too, that goes into it just a tremendous amount of engineering. 
So the idea that this would be that starting at A whatever you have there would be able to help deliver water way over here. So the whole system has to get filled up, designed and built sort of sustained all the way up to Kasayuli. So that design is in the process, but now the next big hurdle will be, how in the world are you gonna fund that. \\n\\nLeif Albertson 04:13\\nI think the scope of what we\\'re trying to look at is all those things so just I mean treatment, but also billing and personnel like you mentioned water truck drivers and like all the steps that go into people not getting good water at their house is all those pieces. Welcome. Hosting a research team from Texas that are looking at, I don\\'t know, challenges with water sewer infrastructure in western Alaska. So that\\'s what we\\'re asking questions about.\\n\\nPete Williams 04:47\\nOne challenge on the hauled side of it for us. I mean, it might not be so much for like in a village where you don\\'t have as many houses. People want different quantities. So you know you\\'d be delivering 1000 gallons to this house once a week. So we maybe the next house next door wants 2000 gallons a week. And when you try to meet their needs, their wants. It\\'s not really their needs, but their wants. It\\'s it\\'s a real challenge. And you said something about billing. And it\\'s, it\\'s very difficult to bill, I just recently found out there\\'s like a 10 digit number for all these different. If somebody wants 2000, 1000. So it\\'s a real challenge over here with personnel. I mean, when you\\'re punching in ten digits at a time, it\\'s really easy to make a mistake and all that good stuff. So that\\'s, that\\'s been a big challenge, too. So yeah, you need a good somebody in the billing department that can keep track of this. \\n\\nLeif Albertson 05:49\\nI want to back up just a second because we kind of launched in. Can you introduce yourself? 
\\n\\nPete Williams 05:56\\nPete Williams, City Manager.\\n\\nLeif Albertson 06:00\\nHow long have you been doing this and living here and?\\n\\nPete Williams 06:04\\nWell I\\'ve been living in town about 35 years and I worked I started working for the city of Bethel in 2004, as port director. And then I came on as city manager in 2017 or something like that. And so we\\'ve been involved with water and sewer. The other the other big thing I think, too, that is important is because you\\'re dealing with a lot of land, you got to have utility easements, out here in Alaska. That\\'s the property ownership\\'s very nuanced. What is very, can be very contentious. And nobody knows no, you can\\'t find a recorder in the state of Alaska, you don\\'t have to record anything\\n\\nLeif Albertson 06:53\\nAnd the native allotment. \\n\\nPete Williams 06:54\\nAnd you have native allotments. You have just a whole bunch of issues with that. So that\\'s kind of what was important for us with bringing the general service contractor on board and so that\\'s a company that can do the engineering, has a real estate division, and it has a bunch of professionals that do different tasks that they\\'re readily available. Or else we would have never got anywhere with any of this. Especially in the easements.\\n\\nLeif Albertson 07:22\\nI remember with institutional corridor all the easements. One guy says no.\\n\\nPete Williams 07:31\\nForty of them just over there and we we did well we got them all, the construction easements. But so that that was that was another one that you really when you want to talk water and sewer projects, you don\\'t think of real estate. It certainly does come up. So\\n\\nLeif Albertson 07:49\\nBill, can you introduce yourself?\\n\\nBill Arnold 07:52\\nBill Arnold, Public Works Director.\\n\\nLeif Albertson 07:53\\nAll right. 
How long have you been out here?\\n\\nBill Arnold 07:54\\nI\\'ve been here 18 years, been with the city 14 years, and public works director 5 years now roughly.\\n\\nLeif Albertson 08:06\\nI don\\'t actually know the answer to this. Where did you come from?\\n\\nBill Arnold 08:09\\nSeward. Oh no, I was down there for 16 years. I came from Florida from the Navy. I left Pennsylvania and went to Florida.\\n\\nLeif Albertson 08:10\\nIs that home? They were just down in Seward. Well, great. And I kind of introduced ourselves. So all right, I\\'m gonna kind of go through some of these questions. But I think you\\'re getting at what we\\'re kind of trying to learn. I guess, broad question. What are the so, how is water delivery going? Are people getting water?\\n\\nBill Arnold 08:47\\nThey\\'re getting water right now. It\\'s tough right now because we\\'re so shorthanded. I mean, we got 19 drivers positions, we only have nine filled plus the foreman. So he\\'s driving right now. And we got 10 routes a day. So everybody\\'s working five days a week, Saturdays eight routes. So pretty much everybody\\'s working six days a week right now just to keep up. Pete and I had a conversation earlier today. You know, we lose a couple of drivers which I\\'m hearing a rumor that we might. What are we going to do? The best thing we can come up with is cut everybody\\'s water back. You know people are getting twice a week, they\\'re gonna get once a week people who are getting twice a month they\\'re gonna get once a month. I mean, we\\'re just to that point that we we don\\'t have the personnel.\\n\\nLeif Albertson 09:35\\nIs that the main, so if somebody doesn\\'t get water at their house is that, that\\'s the first thing you thought if, was it\\'s drivers that\\'s the main number one thing you would. People on piped water?\\n\\nBill Arnold 09:45\\nPiped water\\'s fine. \\n\\nLeif Albertson 09:46\\nIt\\'s fine? Okay. Okay, yeah. So, I mean, some of this I\\'m just restating. 
Because not everyone has lived here for twenty years. No, that makes sense. And we\\'ve heard about the drivers since forever. Is there any? What are you gonna do? Is there any way? \\n\\nBill Arnold 10:08\\nAnybody got any answers to that?\\n\\nPete Williams 10:09\\nSo it\\'s a big question nationwide. Yeah, in the trucking industry. I\\'ve got a pretty good edition. Something like that over there about the supply chain issue here, you can take a copy here. \\n\\nBill Arnold 10:24\\nLast I heard was 86,000 drivers needed nationwide.\\n\\nPete Williams 10:30\\nAnd then you know the other thing right at the moment, and we don\\'t, we don\\'t know. How everything\\'s gonna end up in the end is like I said, the cost of that water and sewer project quadrupled. And we don\\'t know what the new norm is going to be. So right now there\\'s that challenge. And we don\\'t know where it\\'s going to end up. But the water. Finding drivers is a big problem nationwide. So I don\\'t know, maybe we\\'ll just have to buy the trucks that don\\'t have drivers.\\n\\nLeif Albertson 11:14\\nWell, I\\'m gonna skip ahead a little bit, but because it kind of hits on this, though. I mean, so if you had your choice piped water?\\n\\nPete Williams 11:23\\nPipe water is the way to go.\\n\\nBill Arnold 11:24\\nYeah, it\\'s way cheaper. I looked. I think Palmer\\'s about the same size as Bethel. I looked at their utilities. They run about half as much as what our costs are. They\\'re right around $2 million a year, we\\'re at $4.2 million a year in water and sewer services.\\n\\nLeif Albertson 11:43\\nAnd probably probably more reliable too.\\n\\nPete Williams 11:46\\nYeah, it\\'s unlimited water really.\\n\\nLeif Albertson 11:48\\nYeah, they\\'re not taking. They don\\'t take holidays and.\\n\\nPete Williams 11:52\\nThe other consequence of the truck too, that you don\\'t think about is the wear and tear on the roads. And that\\'s especially on gravel roads. 
And these are big trucks and they\\'re heavy. And so anyway.\\n\\nLeif Albertson 12:10\\nEspecially when it rains all winter. So looking at some other challenges, we start from where the water is made. We got two water plants here. \\n\\nPete Williams 12:23\\nYeah, that\\'s that\\'s his domain.\\n\\nLeif Albertson 12:29\\nAre there challenges with that? Are we making enough water? Are we gonna be able to make enough water? Any bottlenecks there?\\n\\nBill Arnold 12:34\\nNo. Right now city sub\\'s running at 45% capacity. Bethel Heights is running at 22% capacity. If and when the Avenues project goes on, we\\'ll be right around 44% capacity.\\n\\nLeif Albertson 12:52\\nAnd the avenues projects, adding a bunch of houses, right? \\n\\nBill Arnold 12:56\\nYeah. Right now it\\'s slated at 140.\\n\\nLeif Albertson 13:01\\nWow. This is good news, right? \\n\\nBill Arnold 13:04\\nYeah, it is. \\n\\nPete Williams 13:05\\nWe were ready to move and we were ready to go to construction this year.\\n\\nLeif Albertson 13:09\\nRight I thought this was supposed to be done. I thought I\\'d be cutting a ribbon here or something.\\n\\nBill Arnold 13:12\\nwell it went from 13 million to 42 million. That\\'s what the bids came in at. \\n\\nPete Williams 13:18\\nSo now we\\'re just have to sit back and wait, see if the world comes to some senses here. But the USDA is out there looking for trying to find some extra, with upping the grants. And but like I said earlier, that is a lot of red tape once you get into the government, USDA, EPA, whoever it is it that\\'s a big challenge too. Took us two years just to get to the application. And so yeah, they got a thing called the letter of conditions. By the time you\\'re done filling it out, It\\'s about two of those three ring binders. Yeah, so a lot of work just prepping to get to a point where you can actually go to work.\\n\\nLeif Albertson 14:03\\nHow about water plant operators?\\n\\nBill Arnold 14:06\\nI\\'m hurting. 
Yeah, there\\'s four operators normally for the positions. I only have two right now and I\\'m working on hiring an outside firm to come in and help us. \\n\\nLeif Albertson 14:19\\nYeah, okay. \\n\\nBill Arnold 14:22\\nThese guys are working seven days a week right now and they\\'re gonna get burned out. So I talked to a gentleman in Anchorage. He\\'s willing to come out Friday, Saturday and Sundays. So the guys will get every other weekend off or at least some break.\\n\\nPete Williams 14:35\\nAnd I think that\\'s a nationwide problem, too. Michelle Dewitt, one of our council members sent me an article on how Austin Texas was hurting for water plant operators. And they\\'re boiling their water.\\n\\nMichaela LaPatin 14:48\\nWe sure were\\n\\nLauryn Spearing 14:50\\nWe remember that. That\\'s where we live. So\\n\\nLeif Albertson 14:56\\nTake some notes here. Yeah. The worst we have is HAA fives. Right. So we\\'re. So what what happens then if, you know, you\\'re you end up down to two and they get hit. I mean, what if somebody gets hit by a bus? And I mean, \\n\\nBill Arnold 15:18\\nI\\'m going back to a water plant, I guess? \\n\\nPete Williams 15:20\\nSet up a cot. \\n\\nLeif Albertson 15:22\\nUm, okay. You\\'re kind of the backup plan then?\\n\\nBill Arnold 15:26\\nYeah just as long as Billy\\'s still around because he\\'s the level two. we have to have a level 2. \\n\\nLeif Albertson 15:32\\nHe\\'s been around a long time though, right? I mean, \\n\\nBill Arnold 15:34\\n47 years. \\n\\nLeif Albertson 15:35\\nYeah. So he might not be around forever.\\n\\nPete Williams 15:37\\nYou can\\'t get him out of there. He won\\'t retire. \\n\\nBill Arnold 15:41\\nI asked him. \\n\\nPete Williams 15:41\\nHe\\'ll make more if he retires than if he\\'s working. It\\'s like,\\n\\nBill Arnold 15:47\\nYeah, I did the math on his retirement, he would be making 132% of his pay right now. I say, what are you doing here? But Yeah, I mean, it\\'s tough to figure it out. 
I mean, I can\\'t do everything.\\n\\nPete Williams 16:04\\nThe training\\'s offered, that\\'s a good thing.\\n\\nBill Arnold 16:07\\nYeah we got the training. But then we\\'re in a bind right now where we\\'re so short handed, I can\\'t even send anybody to training. You know, I can, but then I\\'m really hurting myself. \\n\\nLeif Albertson 16:16\\nTo where do you send people?\\n\\nMichaela LaPatin 16:18\\nEither Anchorage or Fairbanks. \\n\\nPete Williams 16:20\\nThat\\'s what that rural\\n\\nBill Arnold 16:21\\nARWA, Alaska Rural Water Association usually puts one on, NTL puts one on,\\n\\nLeif Albertson 16:30\\nbecause people in the villages send folks here for training, right? \\n\\nBill Arnold 16:33\\nThey only do small treatments here. We have to have a level one minimum. We\\'re not a small system. We\\'re considered a large system.\\n\\nLeif Albertson 16:43\\nOkay. Because we\\'ve heard a lot from other communities about trying to keep an operator and find an operator and train an operator and get their operator to pass the test instead of just show up for the class and fail the test over and over again. I\\'ve heard a lot about that. \\n\\nBill Arnold 17:00\\nYeah the test. From what I understand, it\\'s like a 27% pass rate nationwide, your first time taking it, it is a tough test. You know, most people with the city that go and take it, the majority of them don\\'t pass it the first go-around. It\\'s pretty tough. Especially if you\\'re really green and never been in a water plant, and then here you\\'re going to class, go get your license. And yeah, it\\'s really tough.\\n\\nLauryn Spearing 17:27\\nSo yeah, yeah, I have a couple of follow up questions specifically about kind of operator training. And I know you said you\\'re certified too so even if you want to think back to some of your experiences as well, you know, what do you think was more helpful for on the job? Like, is the certification helpful? 
Or is it more just kind of a checkbox to be able to do your job?\\n\\nBill Arnold 17:52\\nIt\\'s more of a checkbox. I mean, we got pretty simple water plants. I mean, there\\'s sand filters, you just got to pay attention to chlorine levels, fluoride levels, you know, the biggest two. you know, it\\'s really hands on. But we\\'re required to have a level one through the state. You know, but we have to have a level two on staff because we\\'re a level two system. So we always have to have somebody on staff that is a level two that can respond to either water plant within an hour.\\n\\nLeif Albertson 18:23\\nAnd we got one and he\\'s been there 47 years.\\n\\nBill Arnold 18:29\\nAnd it\\'s, you know, you got to do 1900 hours just to get your level one, then you got to do another 1900 hours to get your level two. So it\\'s not like you can just go take a test and get it, you still got to do your time. \\n\\nPete Williams 18:43\\nYeah, it sounds like. I was merchant marine for a while, and we had to get Coast Guard licenses and, and they were having a hard time finding, getting licensed captains. And what they actually did was start in 10th grade. Start to approach people. So they got practical experience in the summertime. And then when they graduated there, they can do what he was doing. You take your number of hours in and then you can move on, but it sounds like the water plant operators are in the same way we need to start down at the lower level of vocational education. So you know, the older you are, the harder it is to study too.\\n\\nBill Arnold 19:24\\nOne position that\\'s open right now is water pipe coordinator. When Shawn left three years ago, it\\'s like okay, I\\'m gonna have a hard time filling this. What do I do? So I posted it starting at $80,000 a year. And I had one person interested in it in three years.\\n\\nLeif Albertson 19:42\\nWow. 
What are the requirements for that?\\n\\nBill Arnold 19:46\\nIt should be a level two but the one person that showed interest I was considering hiring strongly. And she didn\\'t even have her level one yet. It was Alyssa. Yeah, because I knew she could keep up with all that paperwork.\\n\\nLeif Albertson 20:02\\nYeah, no, I don\\'t I don\\'t think, yeah, she was kind of a low point with all that COVID work, I think? Well, we\\'re still hoping to bring her on board, aren\\'t we?\\n\\nBill Arnold 20:11\\nShe called me. Said she was gonna stay at YK. Just to let me know, because I was pretty excited.\\n\\nLeif Albertson 20:17\\nYou never know. Could have a bad day at work? So yeah, just I guess, to kind of reiterate that, you know, there\\'s training, the state used to have kind of its own certification, and then we went to the national standard, so that, you know, the operator training is the test is what everybody takes, anyway. And is that, do you think that\\'s a good thing? I mean, is there stuff in that class that in that certification, that\\'s, that\\'s valuable? \\n\\nBill Arnold 20:54\\nYes, definitely. \\n\\nLeif Albertson 20:55\\nversus, you know, like, on the job training, versus some combination of the two, like, what would you? If it was up to you, what would you, how would you set it up?\\n\\nBill Arnold 21:07\\nIf it was up to me, how I would set it up is you get tested on the plant that you\\'re running. I mean, because you go in and take this class and you take a test you\\'re learning membrane filters, you\\'re learning reverse osmosis, learning. You know, five, six different ways to treat water. And if you\\'re gonna stay in one area\\n\\nLeif Albertson 21:28\\nA bunch of stuff we don\\'t have here.\\n\\nBill Arnold 21:30\\nYeah, I mean, learn the system that you\\'re going to be doing. Get tested on what you\\'re going to be doing. You know, before I got into the water business, I never thought about water killing people. 
I never thought about it, I mean it never crossed my mind you know, just going, get thrown in the water plant and operating a water plant not knowing how bad you could hurt somebody with water. That is a very important part.\\n\\nPete Williams 21:54\\nIt\\'s happened out here too. I mean a lot of sick people in Hooper Bay.\\n\\nBill Arnold 21:57\\nYou know, I don\\'t know if you remember I had Jennifer come up and you know, kind of give like an hour class to the water truck drivers, because they had no clue on pathogens. Just give like an hour course and explained to them. You know, dropping that hose on the ground isn\\'t a good thing.\\n\\nLeif Albertson 22:16\\nYeah, we followed a water truck today. He backed into the house though. Is he supposed to? No I\\'m just kidding. \\n\\nPete Williams 22:26\\nWe\\'d believe it if you tell us he did. they gotta get to the last inch. They can\\'t seem to you know, you\\'re gonna pull the hose six more feet stay away from the house.\\n\\nLauryn Spearing 22:43\\nWell, yeah, I\\'m wondering, so a little bit more about operations training again. So say you hire someone brand new? What are the steps going forward? So maybe someone who doesn\\'t have a certification? What would you do kind of first step, second, third,\\n\\nBill Arnold 22:57\\nWe get them in the water plant. The guys that are in the water plant train them. Get them to understand the system, how the system works.\\n\\nLauryn Spearing 23:06\\nAnd is that by like showing? Right, like kind of more hands on, right?\\n\\nBill Arnold 23:10\\nYes. Then after that, we start looking for a class to send them to. Usually the classes maybe every four to six months, you can find a class to send them to. so so we immediately start looking for a class to get them into. And they get certified. 
You know but they only get their provisional until they get their 1900 hours and then get the level one.\\n\\nLeif Albertson 23:35\\nWhat do you do if they take a test and they don\\'t pass?\\n\\nBill Arnold 23:40\\nI just had a guy do that. Just, next class that comes up, send them to it. And I got lucky and there was one I found within six weeks. He just took it and we usually don\\'t find out results for about four weeks.\\n\\nMichaela LaPatin 23:53\\nAre they able to work in the meantime? like to be certified? Exactly how? What does that? Does it limit their ability to work or does it limit more of like the plant as a whole?\\n\\nBill Arnold 24:05\\nNo, this was I got the level two on hand and he\\'s within an hour. He can go in there and start running the plant just as long as he can, if there\\'s something going wrong, call Billy. And Billy would go over there. We\\'re legally allowed to do that. Because I asked the same thing a long time ago. Okay, how am I gonna keep this running if I don\\'t have operators? But if he does leave, I don\\'t know what I\\'m going to do.\\n\\nPete Williams 24:36\\nWe did, what we fell back on, he was able to find two people that would fly in. That\\'ll break anybody\\'s budget.\\n\\nLeif Albertson 24:45\\nRight. Yeah. Can you get on a plane right now?\\n\\nBill Arnold 24:47\\nWell, one company I talked to, when I first started talking to him. He wanted to be out here seven days a week. And it was going to be 7500 bucks a week. So then I talked to him in just coming on Friday, Saturday, and Sunday. So I got it down to 4200 bucks a week.\\n\\nNikki Ritsch 25:06\\nAnd it sounds like most of the people that you have coming in aren\\'t already certified unless you\\'re specifically flying them in to help fill gaps. Is that, are usually training people on the job?\\n\\nBill Arnold 25:14\\nYeah it\\'s hard. 
Like I said, I had the one position open for three years already for 80,000 To start,\\n\\nNikki Ritsch 25:19\\nWas that certifications or just not enough people in general interested?\\n\\nBill Arnold 25:23\\nI don\\'t think anybody right now wants to work.\\n\\nPete Williams 25:28\\nYeah, it\\'s been crazy,\\n\\nBill Arnold 25:30\\nI got truck driver positions open starting at 40.\\n\\nPete Williams 25:32\\nWe have 38 positions open. With the handouts and COVID. AFHC, the state\\'s Housing Finance Corporation, he knows people that don\\'t have to pay the rent for\\n\\nBill Arnold 25:46\\nanother year.\\n\\nLeif Albertson 25:48\\nI\\'m gonna pivot a little bit. Part of its talking with customers too, right? So if somebody doesn\\'t get their water, what do they?\\n\\nPete Williams 26:02\\nRight now. We\\'re not taking no extra calls. We\\'re not doing anything extra right now. that\\'s a, that\\'s a big chore for our staff. It definitely is. The other one is call outs. So normally, we think of those people that might be missed, or, you know, they phone up and complain, I didn\\'t get their water. But even there\\'s a lot of people that want extra water, meaning that they don\\'t have enough, their order 1000 A week doesn\\'t quite squeal them through for whatever it is\\n\\nLeif Albertson 26:30\\nHow does that conversation go?\\n\\nPete Williams 26:31\\nAnd it\\'s two pieces to that too, you gotta remember, there\\'s sewage. \\n\\nBill Arnold 26:39\\nIt was bad at the beginning, you know, when something new to the public, you know, of course, people are going to complain a little bit. I mean we don\\'t have the manpower to do it.\\n\\nPete Williams 26:51\\nIt does take up staff time too. So it goes to kind of like along with the billing, we\\'ve got a billing person basically got one person sitting there dealing with customers, they phone. 
That phone, it doesn\\'t stop.\\n\\nBill Arnold 27:05\\nYeah it\\'s not just one guy jumping in a truck and going and giving them some extra water. It\\'s every, like you said finances involved and billing\\'s involved. Then the foreman has got to get involved to keep track of all this. So the public just thinks okay, I can get 200 gallons, what\\'s the big deal? There\\'s a lot more to it. \\n\\nPete Williams 27:26\\nThat is one thing that you know, that we\\'ve tried to encourage them, I guess I think that BMC says they have to have 1000 gallon water tank. So we\\'ve tried to encourage people to take more when we come but then you have to pay for it. Not everybody can pay for it. So that\\'s why I think you get, you know especially if you\\'re on a fixed income. Even though you might get, it might be more expensive to get water eight times a month, the bill is still enough to your paycheck can cover it, cover the amount even though in the end, it\\'s more you know, but you can pay for it. Fuel\\'s delivered around here the same way where they\\'ll deliver as little as 40 gallons, and you go why would anybody order 40? Well that\\'s what their paycheck can sustain. If you\\'re living paycheck to paycheck. So that\\'s, that\\'s another reason why we see a lot of phone calls, too I think. And our water is not cheap. It\\'s really expensive. What, I don\\'t know, what am I paying? I\\'m paying I think 279 a month for water and sewer for 4000 gallons. And that includes. I\\'m not even sure if that\\'s accurate, but I think it is because I got it on Express Bill Pay and just received the bill.\\n\\nLeif Albertson 28:49\\nI know and it goes up 3% every year. Right?\\n\\nPete Williams 28:51\\nRight, we\\'re probably going to kill that. Not only do the revenues have to cover the day to day expenses, you know, in council when Leif was on it, they set aside some funds for depreciation and that\\'s a real weak spot in municipalities. setting funds apart for depreciation. 
And if you are setting funds aside for when the day comes that you need it, that they\\'re there. Because they tend to kind of slide sideways into something else. But it\\'s really important. I mean, like right now, you know, we\\'d never do this thing at 42 million but you know, 20 years ago, maybe if we\\'d started on the depreciation, as a guess, it would have been a kitty bank for that. But I\\'ve got the number here which it should be. Anyways, that\\'s another big part of the puzzle eventually down the road if you start with a system or any system is being able to, M&O costs. It\\'s just not thought of enough.\\n\\nBill Arnold 30:04\\nYeah our hauled water is $27 per 1000 gallons per mile.\\n\\nNikki Ritsch 30:13\\nThat\\'s the cost to you.\\n\\nBill Arnold 30:15\\nYeah, that\\'s what it costs the city. \\n\\nLeif Albertson 30:18\\nSo you guys have both been doing this a while. Over time, like, are we getting better at this? Is the water delivery situation? Or the water? I mean, delivery and piped is are we in a better spot than we have been?\\n\\nBill Arnold 30:32\\nI think with all the new trucks and everything, but now I got all these brand new trucks and no drivers.\\n\\nPete Williams 30:38\\nSo yeah, it\\'s, I\\'d say a little bit, you know, like he says, we got the new trucks, so we\\'re not paying as much in maintenance on the trucks anymore. \\n\\nBill Arnold 30:48\\ndowntime. \\n\\nPete Williams 30:48\\nYeah, right, downtime. So definitely keeping the equipment up to speed is a big. If you\\'re doing trucks anyway. Well any of it, but trucks especially because they wear out faster than the pipe does. That\\'s the other reason for pipe too is. what\\'s the lifespan of a piece of pipe?\\n\\nBill Arnold 31:08\\nWell the new pipe\\'s 50 years. The HDPE. The older pipe is 25 to 30.\\n\\nPete Williams 31:14\\nAnd the trucks are getting. the trucks are every five years or so. 
That\\'s our goal is every five.\\n\\nBill Arnold 31:20\\nWe change them out every five years.\\n\\nLeif Albertson 31:22\\nAssuming every, Yeah, it doesn\\'t get rolled over or wrapped up\\n\\nPete Williams 31:25\\nRight. They\\'re costing us what two to three, almost 300\\n\\nBill Arnold 31:30\\n240 to 260, depending on water or sewer.\\n\\nPete Williams 31:34\\nAnd that\\'s, that\\'s, we\\'re able to get that price because we\\'re piggybacking on the state of Alaska\\'s Procurement Code. So if you\\'re, if you\\'re me or you going out and buying it, you\\'d probably add another 40 or 50 to that cost.\\n\\nLeif Albertson 31:52\\nSo thinking about progress. I mean, so we got the new trucks. Probably the biggest thing before that is institutional corridor. And then what, city sub pipes? We haven\\'t really, is that it?\\n\\nBill Arnold 32:04\\nWell, we\\'ve redone the backbone on the sewer all the way up the highway. Got a new lift station, we did that. And that was done all the way down to city sub. All the way to the main lift station all the way out to the lagoons. that was all done\\n\\nLeif Albertson 32:17\\nWe took a drive out to the lagoons.\\n\\nPete Williams 32:18\\nNow, there is some progress coming, we hope it was these crazy prices. Anyways, of course, to get the avenues, we were going to talk about the Bethel heights too. we\\'ve we\\'re pretty close to kind of proceeding moving on to try to fix that up there and add 25 new houses up in that direction, too. So the prices are I don\\'t know what we\\'re going to do. Like I said, we\\'re just gonna be calm for four months.\\n\\nLeif Albertson 32:48\\nLooking at the institutional corridor, as an example, though, because that\\'s relatively recent. Can you describe how that changed? How that was progress, or, you know, how that changed things?\\n\\nPete Williams 32:59\\nWell one thing is it brought in revenues, other than, it brought in commercial revenues. 
And so it was, that was a big shot in the arm to the fund itself. You know, and that\\'s another thing is like we\\'re plumbing. A lot of these grants like this grant here, it\\'s a grant loan is what it is with USDA with the avenues project, so we only can hook up houses with that grant, but there are commercial entities in the area. And those are really important to generate the revenue that you need for depreciation and all the other good stuff that you know, that we\\'ve been talking about, you know, the homeowner just can\\'t can\\'t really cover it so I mean, like the the water bill, I mean, like YK, you know, with the water they use over there is huge. So, yeah, you know, like I said, a lot of times people go out and do want to plumb a neighborhood, thinking they can help but they also need to be thinking about some of the commercial entities in the neighborhood too trying to get them hooked up. Because it\\'s, it\\'s kind of a face to tail thing. You know, you spend all this money here, you need something coming in. To get to the next step.\\n\\nLeif Albertson 34:13\\nThe commercial entities pay a different rate. That\\'s the deal.\\n\\nPete Williams 34:18\\nRight. And they use a lot of water. Yeah, that\\'s, that\\'s really what the kicker is. And it\\'s metered. That\\'s the other thing is our our water trucks don\\'t meter. The meter is going and they\\'re using a timer. Anyway So then you\\'ve got some accurate numbers to the meter where you don\\'t really. one of the issues up in Alaska anyways, is meters in the wintertime tended to freeze up. But we were able to find, we were able to find a meter that will do and we had we had them installed in the new trucks. So that was the other thing, if you do truck is that they actually work with the manufacturer to spec your truck out for your needs. 
That\\'s why we got the meters where before we didn\\'t, and, and whatever else, we listened to the V&E, or heavy equipment operators about how the truck was actually made, you know, this, this will hold up here. And the trucks we were getting off the shelf were falling apart. So that was if you do do truck, that was an important part about it too.\\n\\nBill Arnold 35:30\\nWe got Bill and the trucks involved, the pumps and all that we went down and spec\\'d it all up. So it fit what we thought was best for us. So now you got some efficiencies there and the pipe is still you just can\\'t beat it other than how much it costs to get it up and get it installed. That\\'s the catch 22\\n\\nLeif Albertson 35:49\\nI was gonna ask about that next, because we\\'ve got two pipe systems. One of them\\'s old, right past, its 25 years or whatever.\\n\\nBill Arnold 36:01\\nI think it was done somewhere between 72 and 74.\\n\\nLeif Albertson 36:06\\nSo it\\'s past way past. \\n\\nPete Williams 36:08\\nAnother one, that council wrestled with a lot too is. And we\\'re still wrestling with is who\\'s responsible for what once you install the system in the house. And right now the lift stations are really, the lift stations, I think for the city sub are around 55,000 for the houses?\\n\\nBill Arnold 36:30\\nWhen it was put in it was around 25. \\n\\nPete Williams 36:31\\n25. And those are quoted at 86 right now. But what I\\'m getting at is those are big pieces of concrete. And right, they belong to the homeowner. And so it\\'s like when you buy a house, and you know you check out your roof, you check out your foundation, but most people don\\'t check out their lift station. Yeah. So right now we\\'ve got about 22 over there sitting sideways and every which way because of the permafrost, and you know, all of a sudden the homeowner, we\\'re telling the homeowner, hey, that\\'s your responsibility. I would much rather have that discussion up front. 
Yeah if that\\'s a $30,000 project, a lot of homeowners aren\\'t going to be able to. So you know, I explored it. I\\'ve been kind of exploring the insurance side on that, you know, I don\\'t know, the reply I got from the insurance company was most, most systems would be insured by the homeowner, other than if it wasn\\'t, because of freezing up. So the insurance company won\\'t pay for something that\\'s froze up and damaged. But they might pay for. But it\\'s just something that needs to be thought of going into it. How homeowners are going to deal with this. I mean, even like if it\\'s a million dollars for 1000 fee, and you know, you got 50 feet of this, it is still really expensive for the homeowner. And then you got you gotta look out and see if there\\'s somebody in the neighborhood which we have a problem here in Bethel. You know, if somebody\\'s houses, they can\\'t do the pipes. Go south, going to the mainline. Who in Bethel can go over there and fix it? you know, and that\\'s, that\\'s a problem. Plumbing. plumbers. So that\\'s, that\\'s a big. I mean, it\\'s great if you can get water in the house, if it can\\'t be maintained, and the city\\'s not responsible for it. And the homeowner doesn\\'t have the money. You\\'re sitting there kind of backed up.\\n\\nLeif Albertson 38:32\\nNo, I know, as a homeowner too, like that\\'s right. Mine might have been one of the ones. But, you know, just even if money wasn\\'t an issue, finding someone to do it, right? so as grinder pumps burn up, like do you want me going on Amazon to try to buy a new grinder pump? \\n\\nBill Arnold 38:53\\nThat\\'s why we, the city took over the grinder pumps. \\n\\nLeif Albertson 38:55\\nYeah. I mean, so if mine backs up, that might affect other people too. \\n\\nPete Williams 39:01\\nSo like the lift stations the problem was is usually the way it goes is you know, anything in between the mainline and the house is the homeowners responsibility. 
I mean, that\\'s the way the codes kind of written for us. And that\\'s the way most utilities are. Your electricity and everything else. But I would think that if I would have to do like city sub over, and knew what I knew now. I think we would have done it a little bit different. I don\\'t know maybe we wouldn\\'t, but I just think it\\'s a it\\'s a big burden on the homeowner. However it gets resolved. It just needs to be at least thought of going into these projects.\\n\\nBill Arnold 39:43\\nWe did the avenues a little different. Yeah, we didn\\'t get as elaborate as some people would like to see you know, a lot of, couple people want to see us put pilings in and to put the lift station on pilings. But the cost just gets too crazy. \\n\\nPete Williams 39:56\\nAnd at least up in this neck of the woods. A lot of these problems are because of the permafrost and climate change and thaw and all that stuff. \\n\\nLeif Albertson 40:05\\nTalk about that. That\\'s on the list.\\n\\nPete Williams 40:07\\nwe\\'re definitely seeing the results. We, you know, other than that the city subs been installed for 20 years. So there\\'s going to be wear and tear but the lift stations over there are definitely sinking into the tundra. I mean, there\\'s some of them are, some of them are actually pulling the pipes apart. And I know up in, Dowl, our engineer, Chase Chase Yeah. So they are really having this problem up in the slope, where they, especially some some places have actually tried to bury the huge utility corridors in some places where they\\'ve tried to bury the pipe, it\\'s actually separating from the houses. So the houses are sinking, the pipe is going this way. And it\\'s just and we don\\'t, we don\\'t know what the future, what we see now it\\'s only going to get worse. And yeah, this permafrost. Just thawing with the permafrost. 
The other thing is to understand, you know, permafrost, I\\'m no expert, but the Fairbanks up there, the army and all them did quite a bit of research on it. But there\\'s different types of permafrost too, you know, you could have a frost ball underneath it. And that\\'s different than maybe some sand underneath the permafrost. So that\\'s a challenge too, because you really don\\'t know what you\\'re getting into. Boreholes and soil samples before you build something, I guess is the answer. But that\\'s a challenge too. Yeah, we permafrost, it we see it in the roads, we see it in the foundations in the houses you see in the piped water? You see it just about everywhere.\\n\\nLeif Albertson 41:55\\nDo you think that it has affected water quality? are we pulling different water?\\n\\nBill Arnold 42:01\\nNo our water has been the same. And we\\'re actually getting ready to do another test at city sub.\\n\\nLeif Albertson 42:08\\nAre there are there seasonal changes in the water, like more tannins, or whatever I\\'ve heard, like,\\n\\nBill Arnold 42:13\\nwe don\\'t see it so much. Because how far our groundwater travels. But our groundwater is traveling from the mountains, going into the ground at the mountains before it even gets here. So there\\'s a lot of purification from the ground itself, before it even gets to us. So our waters been pretty much stable. But like I said, the city sub, we got something going on over there that we\\'re still trying to figure out. The disinfection byproducts from the chlorine. Those levels have been high. And talking with the engineer, he\\'s saying that they\\'re seeing this all over the country right now, for whatever reason, and they\\'re not sure why. So we\\'re getting more samples from city sub trying to figure out how to fix the problem.\\n\\nLeif Albertson 43:01\\nThat was the thing up at the posting, it\\'s this is not an emergency. We saw that.\\n\\nBill Arnold 43:07\\nYeah, you know. 
And the other part of that testing, too, is it\\'s a quarterly test. And it\\'s averaged out between the quarters. And if you miss one, you\\'re always gonna be high. You know and unfortunately, we missed one every so often because of the turnover that we have. So if you missed that one in a year, your everything\\'s high. But our last go around, we did get all four quarters, and it is dropping.\\n\\nPete Williams 43:39\\nDefinitely what\\'s in the dirt though can make an effect. We\\'ve got really high levels of natural arsenic in this area. It\\'s all the studies show that, the only study I know was in Florida, and they put a bunch of pigs in this dirt and let them run around for a few years and they came up nothing happened to them. But if you say arsenic, people just go nuts, you know? Yeah. So some of this, like when he turns this one of these flyers, this notice goes out to everybody, you know, we get a lot of people gotta boil water, you gotta boil water. No, you don\\'t. So that can be a challenge, too.\\n\\nLauryn Spearing 44:26\\nCan you expand a little bit more on that? Kind of talking about, you know, how is communicating with the public about water? maybe and kind of,\\n\\nPete Williams 44:33\\nWe\\'re kind of well, you know, you\\'ve got a we just recently hired a media firm to kind of, I think the biggest thing is just being transparent. And the trouble is, most people phone up, they\\'re hot, they\\'re mad.\\n\\nNikki Ritsch 44:52\\nYeah, they\\'re not calling with compliments, right?\\n\\nPete Williams 44:56\\nOr if they can\\'t get what they want, they get mad. So it definitely does need, you know, some skill and personal, you know, media, it really is important. And we\\'ve been trying to improve it for the last year or two or three, maybe years, something like that. And trying to let them know, you know what our problems are. So, you know, people just don\\'t, they don\\'t really think of water and sewer until they have a problem with it. 
You know, it\\'s something is so natural. It\\'s just something that\\'s supposed to occur every day without problems, but. So it\\'s kind of just getting the information out there and letting them know it. And we\\'ve been actually right now, we\\'re, we release a thing every morning, and telling people what routes are going to be delayed and so forth. So at least, they can kind of know.\\n\\nNikki Ritsch 45:51\\nAnd that\\'s just like a mailing list?\\n\\nPete Williams 45:54\\nWe put it on facebook.\\n\\nLeif Albertson 45:57\\nAre people responsive to that? I mean, does that help be in? \\n\\nBill Arnold 46:00\\nIt helps. \\n\\nLeif Albertson 46:01\\nSo they know.\\n\\nBill Arnold 46:02\\nI mean, I mean, everybody looks at Facebook, but me I guess.\\n\\nPete Williams 46:10\\nYeah, I think it does.\\n\\nBill Arnold 46:11\\nBut like you mentioned even that letter we send out people don\\'t read it. They just see it, the panic mode sets in. You know, I think we even got bold letters in there saying this is not an emergency.\\n\\nLeif Albertson 46:22\\nWell, that\\'s I mean, it\\'s nuanced stuff, you know, I had to read it three times, it\\'s like, oh, this is a violation. It\\'s a contaminant. Don\\'t worry about it. We\\'re required to send you this notice. But but it\\'s fine. Is it? I don\\'t know.\\n\\nBill Arnold 46:38\\nBut we got to use their form. \\n\\nLeif Albertson 46:39\\nYeah, for sure.\\n\\nPete Williams 46:39\\nSounds like one of those commercials you see from selling medicine or something. We\\'ve learned to, I think it\\'s very important to have some kind of way to communicate with the public.\\n\\nBill Arnold 47:01\\nLike that lead and copper hit, that was a nightmare.\\n\\nPete Williams 47:04\\nIt also helps sometimes with the billing because, you know, trying to explain. We\\'re doing our rated sewer study right now to try to. But right now, if somebody came into the door and say, How come it cost me $200 for 1000 gallons of water? 
I couldn\\'t really give them a straight answer. So that\\'s nice to have. Bill can give you good stuff on how much it costs us to make these. we\\'ve got that.\\n\\nLeif Albertson 47:37\\nI remember, we had the big one. And that\\'s kind of where that rate structure came out with the three zones,\\n\\nPete Williams 47:44\\nright. And then they they merged zone two and three together. And that\\'s the problem kind of now. And so we\\'re trying to get it down. Most places go, they set their rates, they cost so much per mile to deliver. And that\\'s what we we went a different route. So right now we\\'re trying to go back to the how much it cost per mile. I think that\\'s what we\\'ll see when we\\'re done. Not that we\\'ll use it, but at least we\\'ll have it.\\n\\nLeif Albertson 48:13\\nSo as for, for an individual how far it goes? Or for the average?\\n\\nPete Williams 48:18\\nIndividual\\n\\nLeif Albertson 48:19\\nSo like if I like I pay more than my neighbor?\\n\\nPete Williams 48:23\\nOh, no, it won\\'t be that but at least we\\'ll understand. Understand it. Yeah. When they started, when we started the study and they started asking the questions, we started to go nuts, because that\\'s exactly what they were doing and how far it was from this guy\\'s house to that guy\\'s house. They were really breaking it down. But at least we\\'ll kind of have it. The concern really are the ones that are really far away.\\n\\nLeif Albertson 48:48\\nKasayuli, right?\\n\\nPete Williams 48:49\\nRight. You know, when we merged the two together so, because if we had done it the original way, the people furthest away, the bill would have been so big that they couldn\\'t have paid. So we just said, okay, we\\'ll blend it in, you know, over here like this. So with the other zone two, so two and three are kind of together. So we\\'re at right now we just kind of want to see what it will cost to get out there. 
And it\\'s important to get it kind of noted down or squared away, because it goes in if you start installing your pipe system, you kind of want to know what the costs are going to be and if what they\\'re paying is going to cover those costs, you know, so there\\'s kind of some stuff that gets married up into decision making that you need to know. right now we\\'re just really weak on we really don\\'t know how much we\\'re paying to get to Kasayuli anymore.\\n\\nLeif Albertson 49:44\\nWell, right. And, I mean, I know what bothered me is if they\\'re not paying more, they keep building houses out there. Yeah. I wanted to build that into the price of the house, you know, so that they don\\'t get shocked.\\n\\nPete Williams 49:59\\nI probably won\\'t be here when it happens. But right now we\\'ve got one subdivision that I think it\\'s 78 houses, we\\'ve got, probably at least I don\\'t know how many subdivisions but probably at least 150, 160 new houses that could be built. And so the added onto the truck drivers is gonna be a lot. I always worried, you know, I\\'m kind of still maybe in, but like I said, I probably won\\'t be here, if they get too spread out with too many houses that might come to what we\\'re coming to today where we can only say that we only can deliver X amount for everybody to get the water.\\n\\nLeif Albertson 50:34\\nYeah, it would be like the garbage truck or something. Wednesday\\'s your day because you live in this neighborhood. So probably not as big issue here as some of the other communities. Are the people that opt not to use city water? Is traditional usage? Rainwater collection?\\n\\nPete Williams 50:36\\nWe do offer a well at one of the water plants. And there\\'s a well over there that they can come in and pay, what, I don\\'t know what they\\'re paying.\\n\\nLeif Albertson 50:36\\nIt\\'s, it\\'s a machine now, right? 
\\n\\nPete Williams 50:47\\nYou\\'ve got the option of dollar bill or quarters.\\n\\nLeif Albertson 51:15\\nIt used to be like 900 quarters.\\n\\nPete Williams 51:19\\nones, fives and 10s. But that\\'s kind of a, you know, if, for whatever reason, maybe a water pipe broke down over here, or mainline or something, you know, at least having water available in the community. Like a well, and most villages have them too is it\\'s a real plus to install. When you\\'re thinking about these things. Are there? Are there homes that are not on water? Anybody that just doesn\\'t? There\\'s a few.\\n\\nLeif Albertson 51:54\\nBut they\\'re supposed to be right?\\n\\nBill Arnold 52:03\\nEverybody\\'s supposed to be connected to water and sewer services.\\n\\nNikki Ritsch 52:05\\nWhat are they using if they\\'re not on your water? Rainwater and river?\\n\\nPete Williams 52:08\\nWho knows.\\n\\nBill Arnold 52:08\\nRainwater and honey buckets. They don\\'t connect to the sewer. And, you know, we try to police it the best we can. We usually don\\'t hear about it til the neighbors complain.\\n\\nPete Williams 52:18\\nI think maybe, you know, down the road. I mean, Alaska\\'s always kind of slow and getting into the mix on some of this, but some of the alternate ways that are out there, too. So you\\'re actually making water at home, you know, I mean, I was just seeing something the other night where there\\'s a unit that they put on a roof, and he\\'s making water you can get so much water out of, I guess that\\'s probably be more in your neck of the woods down south, you know, where you\\'re having problems with the Colorado River and whatever. So I think we\\'ll see more and more of that. Alaska is usually the last to get in on those kinds of things. That may, you know, eventually some alternate way than what we know now. Or, you know, that we are familiar with. What might be the answer to some of this. 
I don\\'t know.\\n\\nLeif Albertson 53:10\\nSo people are required to have some kind of water service. Whether it\\'s once a month or once a week or something. What do you what do you do when somebody doesn\\'t, they just stop stop paying?\\n\\nPete Williams 53:23\\nWe have the means to get heavy handed if we want to. Drag them into court. Do whatever they have to do.\\n\\nBill Arnold 53:30\\nBut the last two years we couldn\\'t do nothing to nobody. Yeah. Because the law says we can\\'t because covid.\\n\\nLeif Albertson 53:37\\nso you just eat the bill then or?\\n\\nBill Arnold 53:39\\nYeah, I mean we try to collect. I mean there\\'s like he said there\\'s means for us to get it but.\\n\\nLeif Albertson 53:45\\nYeah, well sometimes it\\'s you can\\'t get blood out of a stone, right? I mean unless you\\'re gonna take somebody\\'s house or.\\n\\nPete Williams 53:57\\nI mean the only, the bet I think but you know the part that we\\'re, when that happens where the where we do get you know can get heavy handed is not really the water. It\\'s the sewage because we know they\\'re dumping the sewage out on the tundra somewhere. Yeah, so and then we usually find a pile somewhere and then we can go after the property owner.\\n\\nBill Arnold 54:29\\nlike a couple years ago, I had to go out to Haroldsen Sub Borough, pick up 27 honey buckets that somebody threw on the side of the road.\\n\\nLeif Albertson 54:38\\nSo we do have some areas that are not, I mean, we don\\'t drive water and sewer trucks out to Haroldsen right? or our Polk road. So we do have some established houses that are non services?\\n\\nPete Williams 54:51\\nPolk Roads getting billed next year. \\n\\nLeif Albertson 54:54\\nOh yeah? really? well That\\'s so we\\'re gonna be a loop again? That\\'s great. I mean, I assume it\\'s great. Sounds great to me. I don\\'t know, what do you think?\\n\\nPete Williams 55:10\\nWell I wanted them to go a different way. But I\\'ll take what I can get. 
\\n\\nBill Arnold 55:13\\nI love it, Laurie hates it. \\n\\nPete Williams 55:15\\nYeah I hate it too. \\n\\nBill Arnold 55:18\\nIt\\'s right in front of her house.\\n\\nLeif Albertson 55:18\\nOh really? Is it not the same?\\n\\nPete Williams 55:21\\nSee I tried to do an alternate route here and I wanted to go this way. And but now they, because this thing was this thing is 20 years old. They put it in the STIP 20 years, 25 now maybe. so we had to kind of live with where it was when it started. But anyways, now all the traffic\\'s gonna dump out into this neighborhood. Yeah. And I don\\'t know, exactly.\\n\\nLeif Albertson 55:49\\nBut the Polks settled or?\\n\\nPete Williams 55:52\\nThe eminent domain.\\n\\nLeif Albertson 55:53\\nOh they just didn\\'t have a choice.\\n\\nPete Williams 55:55\\nThe state exercised eminent domain.\\n\\nLeif Albertson 55:57\\nThey got what they got. interesting. Huh? Well, that\\'s off topic. but interesting. \\n\\nPete Williams 56:06\\nIt is kind of on topic because roads do allow water and sewer truck drivers to take a different route instead of going around.\\n\\nLauryn Spearing 56:16\\nSo would they have to go?\\n\\nPete Williams 56:19\\nYeah, so here\\'s Kasayuli out here, somewhere and the water plants are down here. So you gotta, you know, anyways, I\\'ve just, you know, it is a consideration where you put your water plant, they didn\\'t have a choice here, they did basically put the water plants where the neighborhoods were, when they started, but now they\\'re spread out and got further away. And then it\\'s, you know, the next step, eventually, if we start spreading out and is, you know, if you\\'re in a village or someplace and your town is growing, is trying to, that\\'s what our preliminary engineering report is trying to figure out where your next water plant will be to be able to push the water up, or lift station or so forth, out even further to whoever you\\'re trying to service. 
So the preliminary engineering report is, you know, if I had done it, we\\'ve done it maybe in retrospect is, is just doing the whole town at once, not one neighborhood at a time. So because all the engineering that goes into that work can be used down the road, instead of trying to do a little bit here. And then they\\'ve got to change this here to make this fit here. And so we just gave up that we that\\'s where we\\'re at was doing bits and pieces, not so small bits and pieces, but doing parts of it, we finally just said we\\'ll do the whole town. But so you know, a lot, a lot of times, like this grant funding, you know, what we picked to do is kind of dependent upon how much money we get out of the grant. And it\\'s not really maybe the best way to go. So it\\'s kind of driven by what you can get, and people kind of just look at that, and they\\'re not focused on the bigger picture. It\\'s easy to go sideways that way. I think.\\n\\nLeif Albertson 58:06\\nSo seems like you two have a very strong opinion that piped water is a better way to live. Does the public agree with you? I mean, are there people that, does everyone feel that way? Or some people like their hauled water?\\n\\nBill Arnold 58:17\\nWhen we had the public meeting with the Avenues people, they all seemed to want piped. I mean, yeah, we had a little kickback from a couple of course, because of change. I wasn\\'t here when they did City Sub but from what I understand about that, they kicked a lot.\\n\\nLeif Albertson 58:35\\nThat\\'s I mean, I still hear people bitching about that. We never wanted this blah, blah, blah.\\n\\nPete Williams 58:40\\nOh, yeah. I think a lot of that though, is just what happened after the fact and all the mates, you know, well, that was poorly built that was\\n\\nLeif Albertson 58:47\\nI mean, I had I had a $700 electric bill this winter.\\n\\nBill Arnold 58:50\\nI know I got the phone call\\n\\nPete Williams 58:51\\nRight yeah. 
So I mean this is just my opinion of projects in general, but for water I think you need to be, whoever is the owner needs to be actively involved. You just don\\'t send somebody in there and go, \"go do this\"\\n\\nBill Arnold 59:11\\nI think pipe\\'s a lot safer. In my opinion\\n\\nPete Williams 59:15\\nIt provides for fire hydrants is another big one. And then you know, just the daily use, I mean, you go out wash your car where if you got 1000 gallon tank, you\\'re kind of hesitant. So for keeping things clean, that\\'s Brian\\'s big push. Brian Leffert\\'s working on that set. But anyways, it\\'s just staying clean. I hardly use my bathtub because I got 1000 gallon tank and then you know your washer and dryer your washers soaking up 47 gallons of water whatever it is. It goes quick. If you\\'re waiting for the next load two weeks down the road, with piped water you don\\'t have to worry about that.\\n\\nBill Arnold 1:00:01\\nAnd the trucks you gotta keep them in a secured location, they gotta be in a locked building, you can\\'t sit outside, you can\\'t park them outside\\n\\nMichaela LaPatin 1:00:10\\nFrom a worker and operator perspective, is piped safer for the people doing the work?\\n\\nBill Arnold 1:00:17\\nYeah a lot safer. \\n\\nLeif Albertson 1:00:20\\nnobody ever crashed a water pipe?\\n\\nBill Arnold 1:00:23\\nI had a water truck hit the water pipe.\\n\\nLauryn Spearing 1:00:30\\nand then for water quality, is it another, you mentioned earlier right about the when you\\'re delivering if the pipes touching the ground or things like that. So you think pipe water is also a lot safer for maintaining water quality?\\n\\nBill Arnold 1:00:44\\nYeah. Even the drivers I mean on accident they\\'ll drive away from the water plant with their hatch open on top. What could get in there? Dropping the hose on the ground and not realizing that they did anything? You know?\\n\\nLeif Albertson 1:00:58\\nYeah, as a homeowner too. 
I mean, having done it both ways, piped waters, like kind of a closed system and even having had a bird flap up into my overflow, right?\\n\\nPete Williams 1:01:09\\nThen you have the sewer drivers have got to put up with hepatitis B and all the stuff that comes associated with that.\\n\\nBill Arnold 1:01:17\\nI think Jennifer went around and they did some testing of holding tanks.\\n\\nLeif Albertson 1:01:24\\nLooking at chlorine residuals?\\n\\nBill Arnold 1:01:25\\nYeah and just did some seeing what was in the water in the tanks. I think I\\'m pretty sure it was her that headed that up just to see what tanks were doing. And I think she said the tanks come back pretty bad. \\n\\nLeif Albertson 1:01:38\\nWell, it\\'s kind of an open question right? like chlorine residual from piped water, you know what it is every day. They test it at the plant. Right? But if you deliver water to my house, and it sits there for 30 days, because I only get water once a month. Right? And if I got if it\\'s not covered or dust or algae starts growing, or whatever falls in there, you know.\\n\\nPete Williams 1:01:59\\nYeah like we don\\'t have building codes here. And so you know, plastic tanks are used steel tanks used, Lord knows what else and and with it, you know, you got chlorine in the water, there\\'s going to be a chemical reaction to Yeah, I used to deliver fuel in western Alaska and we had the same problem with fuel tanks, yeah. Yeah, stuff happens inside the sediment\\n\\nBill Arnold 1:02:29\\nOne house I went to they called me out last week and asked me if I could load their tank for them. They had white plastic tanks. And I looked in there to figure out what it was I finally got my way up and got in that tank and they had algae growing in there. And, but they had a window right next to their tank. So all that light was going into that tank growing all this algae, you know, I told her to mop it up best you can and chlorinate your tank, and you\\'ll be good. 
But you get some of these tanks that are aluminum and steel, you can\\'t see in them. I\\'ve looked in a couple of steel ones before. pretty gnarly.\\n\\nPete Williams 1:03:05\\nYou know, we went through that with fuel tanks too is trying to get when they\\'re manufactured, they don\\'t usually take in consideration that maybe somebody has to crawl inside them to clean them. And so just simply at the manufacturing stage if you had a manhole or something on there, so you can do something but people that\\'s expensive to change a tank out, especially up here so they just leave it and pretty soon you get a.\\n\\nBill Arnold 1:03:33\\nThen you got the water pump.\\n\\nPete Williams 1:03:34\\nBut you don\\'t have to put up with that with piped water. \\n\\nLeif Albertson 1:03:36\\nYeah, yeah, I got we\\'re got a rental over there, dealing with all that stuff. Replace the tank that is old. But it\\'s I mean, shorty made it to exactly fit in the space that was there. So to cut it to get it out and then to get the new one and I had to I mean, there\\'s no I would have liked to switch to plastic but I can\\'t find one to fit well unless I want to go from a big tank to a little tank right in that space. And then the tenants burn up the pumps and\\n\\nPete Williams 1:04:10\\nthere\\'s just a whole bunch of stuff that goes on with these\\n\\nBill Arnold 1:04:18\\none thing Chuck always said it was Keep the hauled, they want to build the pipeline.\\n\\nLeif Albertson 1:04:26\\nYeah. What else?\\n\\nLauryn Spearing 1:04:32\\nI mean, I guess we\\'ve talked about a lot of different kinds of challenges with water infrastructure. Is there anything we haven\\'t asked you yet or any other stories or challenges you can think of? \\n\\nLeif Albertson 1:04:42\\nTell us some terrible stories.\\n\\nPete Williams 1:04:51\\nWell, another one with hauled here we keep going on hauled is driveways. 
These trucks are big trucks and Who builds a house that\\'s going to take a semi truck down the middle of your driveway? we have a lot of a lot of damage. Insurance comes into play. And right now our code is that we, I think we pay for the first $10,000. So you get a lot of claims, and they start to add up that\\'s coming out of your pocket book, not the insurance company. And that\\'s, it seems they call every time you start talking about hauled water and sewer, you always come up with something. \\n\\nLeif Albertson 1:05:35\\nHow about flooding?\\n\\nPete Williams 1:05:36\\nFlooding too. Yeah, that\\'s another problem. So that goes maybe to education. And I don\\'t know how quite to get the word out to everybody. But tanks will overflow. And there are ways to prevent, there\\'s electronic means. it costs the homeowner probably, we figured $250 to $500 a tank. And the thing is you got the truck driver outside and he\\'s filling a tank and a lot of times he can\\'t see what\\'s going on inside. There\\'s overflow pipes on the tanks. But a lot of times the water\\'s coming in faster than the overflow can handle. So out the top it comes and the guy standing out there pumping away and pretty soon, your house is full of water. And that has happened quite a bit. And it\\'s like I was saying I was in the fuel business. And normally, we would not fill a tank unless somebody was physically watching the tank. So it took two people but it\\'s not realistic to drive two people. So that has been an ongoing, big issue. And part of part of the cost too to the insurance company will just keep paying and paying out which runs your premium up. So we kind of got a handle on that one in the last four or five years. But that is a problem with it. And that this stuff starts with the homeowner when he\\'s building another house knowing that you know that he needs to have an overflow gauge on his on his tank. And we don\\'t have a building code here. 
And that was probably where that would play in if we did.\\n\\nLeif Albertson 1:07:21\\nand sometimes they freeze too. I mean I\\'ve had that happen.\\n\\nPete Williams 1:07:24\\nRight so the overflows will freeze, so that\\'s another issue and we go up and they bang on them. And but yeah, so that\\'s, that is another problem in the wintertime. So we\\'re full of freezing, nothing cut because that\\'s how they tell if that if it\\'s overflowed on the outside, if it starts coming if water starts coming out of the overflow, then they know to shut it. They could discharge water from the truck. But that\\'s where the alarm comes in. And they need some way to\\n\\nLeif Albertson 1:07:59\\nit seems like that speaks to the importance of good employees too though right? So if you got a delivery guy who\\'s paying attention and is experienced, and he knows he\\'s listening and watching versus,\\n\\nPete Williams 1:08:13\\nbut a lot of times it\\'ll come from, if the overflow pipe isn\\'t big enough to handle the discharge, we\\'re pumping it you know, 100 gallons a minute. And so even by the time it\\'s come out of that discharge pipe, it\\'s already overflowing\\n\\nBill Arnold 1:08:30\\nthey were pumping 100 gallons. I slowed them down so we don\\'t have that issue. \\n\\nLeif Albertson 1:08:37\\nThat\\'s funny, we watched a pump guy fill up today and I was like, oh, that must have been a really thirsty house because it seemed like it was taking some time. \\n\\nBill Arnold 1:08:44\\nI backed them down to 80 gallons/minute.\\n\\nPete Williams 1:08:48\\nI mean, I think that\\'s a consideration too is that when you know if you\\'re starting from scratch you know I\\'m gonna go deliver use trucks to deliver water is I don\\'t think there\\'s enough effort put it how much time does it take to do each one of these? 
Then by the time you add it up and go, Well, I was surprised you know, so you know that the 1000 gallons gonna sit there for 20 minutes to a half an hour.\\n\\nBill Arnold 1:09:13\\nAnd it takes us 20 minutes to fill the truck. \\n\\nPete Williams 1:09:18\\nSo there goes just labor costs and all that. Those aren\\'t things you really think about I guess. we should be thinking about it. don\\'t always get thought about when you say okay, we\\'re gonna deliver water by truck\\n\\nLeif Albertson 1:09:32\\nWhat do you think\\'s gonna happen with the avenues? Just waiting on money?\\n\\nPete Williams 1:09:51\\nSo the bids came in in March and we had 60 days for it to before we can back out or just say it\\'s too much Yeah, so we\\'re just gonna let things cool down and go out to bid again.\\n\\nLeif Albertson 1:10:05\\nyou think it\\'s, you think prices are gonna come down?\\n\\nBill Arnold 1:10:08\\nWell I mean, it\\'s crazy like, like our pumps, they just went up $300 a piece in two weeks.\\n\\nLeif Albertson 1:10:27\\nLumbers kind of peak and come back down right?\\n\\nBill Arnold 1:10:30\\nIt\\'s a little bit it\\'s not back to where it was, you know a lot of other things are just astronomical trying to find something. And the lead time, right. I mean, I lost the waterline I had to lay the new line in the dead of winter on top of the guy\\'s driveway. So I told him I said, well I\\'ll come back this spring. springs here and I can\\'t even get the materials\\n\\nPete Williams 1:10:52\\nsteel mill products have gone up 127% Since 2020, plastic construction products, 34%, and so forth. It\\'s in the first page here. construction for building a house is up 25%. Yeah, these things, these things this is kind of interesting. Yeah. Crazy.\\n\\nLauryn Spearing 1:11:24\\nYeah, well, one question that we\\'ve been asking too, is kind of, if you could wave a magic wand and fix one thing, what would you do? Pipe the whole town. \\n\\nBill Arnold 1:11:33\\nPipe the whole town. 
\\n\\nPete Williams 1:11:36\\nThat\\'s exactly what we\\'d do. We need about 500 million dollars. Get it over with.\\n\\nLeif Albertson 1:11:42\\nAnecdotally, you know, when I\\'ve talked to guys at public works, you know, talking to the drivers versus talking to the, you know, the guys that spend their time on the pipe water system, you know, just how much they like that so much better. It\\'s like, it\\'s great. You just drive around in the truck. And then if the lights on you stop. And if it\\'s not, you don\\'t stop\\n\\nPete Williams 1:12:00\\nAnd that\\'s too with the SCADA system. So they probably won\\'t have to drive around\\n\\nBill Arnold 1:12:05\\nWell they still need to go to every house to see if a red lights on. \\n\\nPete Williams 1:12:08\\nI know. But I mean, if you get your new system going, you should be notified if those lights go on, right?\\n\\nBill Arnold 1:12:13\\nNot at all the houses. That\\'s a lot of money to do that. All our maintenance stations will have that. \\n\\nPete Williams 1:12:20\\nThey actually send you an email when things are wrong. \\n\\nLeif Albertson 1:12:24\\nYeah, I got a system for that as I get a text when my neighbors walk by and tell me when the light on my house is on. You got a little thing going on. I do that for the Jesuits who live next door and they do it for me. \\n\\nPete Williams 1:12:39\\nDo you got a connection with the Jesuits?\\n\\nLeif Albertson 1:12:41\\nThe priest, yeah, well, they just live next door to me. So I just I know them. Yeah they rent next door. \\n\\nPete Williams 1:12:49\\nI kind of want to get a hold of one of them.\\n\\nLeif Albertson 1:12:53\\nAbout what? Well Father Mark\\'s the one I know well, but he just moved.\\n\\nPete Williams 1:12:56\\nYou know, they they volunteer for various work in town. I was thinking of asking them to do something. I\\'ll get to it.\\n\\nLeif Albertson 1:13:10\\nYeah, about lead times too I was talking to with Rhonda about this 4H building stuff. 
I had a meeting with her and the folks at the university about playgrounds. It\\'s like, oh, well, you know, you guys got like 15 months left on this grant. It\\'s like, we got a barge season we already missed. So it\\'s like, yeah, if we could get this, then we\\'d still have to, like 15 months is nothing like it\\'s gone. Like we need to pay for it now. We\\'re not going to be done in a year. What are you talking about?\\n\\nBill Arnold 1:13:39\\nIt\\'s tough, stressful, like you said the barge season and Laurie just looked at playground equipment. They stuff they have in stock they can\\'t promise it to us for 22 weeks. Even the stuff they have in stock.\\n\\nPete Williams 1:13:52\\nAnd that\\'s why they were why we went out when we went out to bid and the two bidders are there in a room. And that was their biggest problem. They just they cannot get anybody to give us any lead times. So they\\'re taking a wild guess. And the end result was a crazy quadruple bid.\\n\\nBill Arnold 1:14:11\\nYeah, all the prices they\\'re getting from vendors, they were gonna give them three days. That\\'s all the price was good for.\\n\\nLeif Albertson 1:14:18\\nWow instead of like 30 days or something.\\n\\nLauryn Spearing 1:14:21\\nThis was a two year project. Yeah. \\n\\nNikki Ritsch 1:14:26\\nIs there any training you wished you had? In order to do what you do normally, that you don\\'t currently have? Do you feel like you\\'ve got all the resources that you need to do what you do?\\n\\nBill Arnold 1:14:39\\nI can\\'t really think of anything. We\\'re pretty good. \\n\\nPete Williams 1:14:42\\nOh, yeah, probably public relations. 
I\\'m just teasing.\\n\\nBill Arnold 1:14:50\\nThe City has always been pretty good at giving us funding for training and sending everybody to training.\\n\\nLeif Albertson 1:14:59\\nWhat would help with turnover?\\n\\nBill Arnold 1:15:03\\nWe threw money at it and that didn\\'t help.\\n\\nPete Williams 1:15:07\\nWe\\'ve offered temp positions up to $40 an hour.\\n\\nBill Arnold 1:15:12\\nI had two guys in two years apply.\\n\\nLeif Albertson 1:15:14\\nWhat about with police, none of them live here, they come out here and work. Would that work with the water plant?\\n\\nPete Williams 1:15:26\\nWe\\'ve got a pretty full crew up there now. It\\'s, I mean, there\\'s still complaints in town here about the fact that they come from another part of town and they drag whatever baggage they drag. But it\\'s worked, we haven\\'t got sued. So that\\'s\\n\\nLeif Albertson 1:15:42\\nWould that work for\\n\\nBill Arnold 1:15:43\\nI penciled it out, tried to see if I could make something like that work. Like two weeks on two weeks off. The problem with the guys that are two weeks on two weeks off, they all want to work seven days a week, 12 hour days. So that means I have to have a mechanic on that period of time. And overtime. Then I need streets and road guys out all the time to keep the roads open the whole time they\\'re running. It just turned into this big fiasco, where either I\\'m going to have a lot of people working a lot of overtime, or I\\'m going to have to hire more people and have more positions.\\n\\nPete Williams 1:16:20\\nWhich means the revenues have got to cover all that too\\n\\nBill Arnold 1:16:26\\nPD, you know they function 24/7 anyhow. 
So it\\'s easy for something like that.', '3_2__InterdependenciesNNA': '\\nQC Operator Richard\\nMon, Aug 08, 2022 1:13AM • 1:30:30\\nSUMMARY KEYWORDS\\noperators, test, plant, water, bethel, people, system, anchorage, communities, pay, licenses, alaska, dutch harbor, run, dec, kodiak, pumps, reciprocity, questions, ceus\\nSPEAKERS\\nLeif Albertson, Nikki Ritsch, Michaela LaPatin, Richard\\n\\nRichard 00:00\\nI\\'ve talked to some people who say they\\'re going through them pretty regular. But so just so you know, we backwash every 12 hours. So it\\'s not the backwash.\\n\\nLeif Albertson 00:12\\nOh, well, there\\'s something that goes on, like seasonally or something, maybe when they\\'re working on,\\n\\nRichard 00:19\\nwe\\'ll find out. I don\\'t know, it could be because it\\'s a circulating system. So all we do is put water in the tank. And then they\\'re always circulating the water. And they supplement the water as it gets used through pressure pumps. But \\n\\nLeif Albertson 00:33\\nFire hydrant testing?\\n\\nRichard 00:35\\nmaybe I don\\'t know. I can tell you, we\\'re concerned about what\\'s going on in the reservoir. We do believe there\\'s iron and manganese precipitating out in the tank. It\\'s been a number of years since they cleaned it. So we\\'re looking at the chlorine demand that we\\'re seeing in the tank and chances are good and there\\'s quite a bit in iron and manganese just based on what we found today. So it could be some of that.\\n\\nLeif Albertson 00:58\\nYeah, I did notice like this morning when I was brushing my teeth just in the bathroom sink. the color.\\n\\nRichard 01:03\\nDo you have galvanized piping in your house?\\n\\nLeif Albertson 01:06\\nNo. Well, not not. Not beyond from where it comes into the house\\n\\nRichard 01:11\\nShould be coming in HDPE to your circulating pump.\\n\\nLeif Albertson 01:14\\nThere is some galvanized pipe in the utility room. Because I haven\\'t. that would be copper. 
I mean, like the to the walls and everything all that\\'s copper.\\n\\nNikki Ritsch 01:25\\nIs it okay if we are recording? \\n\\nRichard 01:27\\nSure. \\n\\nNikki Ritsch 01:27\\nEverything\\'s totally anonymized. It\\'s not linked to at the end. It\\'s just that we don\\'t have to take notes. \\n\\nRichard 01:31\\nYeah, that\\'s fine. No secrets here. As long as you guys are okay to be here by the management? I\\'m good. Okay. Yeah. You\\'re riding around with them in the truck.\\n\\nNikki Ritsch 01:47\\nWe talked but would you mind just kind of giving us a quick overview again of like, why you\\'re here\\n\\nMichaela LaPatin 01:51\\nAnd your background a little bit. \\n\\nRichard 01:53\\nOkay. So what I\\'m told is they would normally have five operators here on five operators, not in this plant, but on staff. They\\'re currently down to one when they had two and then they got word that one was leaving, they reached out. We were put in touch with them through DOWL engineers who does a lot of work. They\\'re an engineering term contractor. So they contacted us about a month before T left here who was working here. And then when, before he left, we came out and did some kind of handoff. We came out for a couple of four day shifts. And I work I own the company who is under contract. And then Rick, who works with me, or opposite me typically, on joint ventures like this. He also has his own company, but we\\'re all licensed. We have all the insurance. We have our own two individual companies, he just works under mine. So we\\'re out here until they can get additional staff hired. And I don\\'t know. Yeah, it\\'s open ended right now. Like I said, we did this a couple three years ago for the Coast Guard base on Kodiak, they asked me to come on help for a couple weeks, turned into 22 months. So this is a recurring problem around the state, lack of operators or lack of experienced higher licensed operators. So it\\'s not just here. It\\'s everywhere. 
Both Rick and I, My name is Richard. retired. I am retired from the Anchorage water wastewater utility. I was started out in maintenance and I went into treatment. And I became the superintendent over the water distribution Water Treatment System in Anchorage and then I was the director of treatment prior to retirement in 2013. Rick did similar work in Soldotna and Kenai. So his background is more towards wastewater, mine\\'s more towards water. So between the two of us, we cover pretty much everything. He just went home yesterday, I flew out last night. And then I\\'ll go home on Thursday and he\\'ll come back. We talked to Bill right in the gap while he left and yeah, We normally see each other. Well, we generally see each other at the airport when I leave. He flies out of here early on Thursday so he can drive home to Soldotna before dark. But so that\\'s it. \\n\\nNikki Ritsch 04:12\\nAnd you\\'re on the board.\\n\\nRichard 04:13\\nYeah, on the governor\\'s water wastewater advisory board have been for 10, 12 years, I was just renewed so I got another five. So work closely with trying to develop techniques and approaches to get operators certified and send them to get them certified and make sure that the subject and the classes that are being taught are helping operators pass the licenses. And I\\'ve done a lot of classes around the state teaching different aspects of treatment or distribution, mostly distribution, PRB training, stuff like that, but actively review certification ABC exams and test questions and whether or not they\\'re applicable to both for operators in Alaska. So there you go. That\\'s my background.\\n\\nNikki Ritsch 05:06\\nThat\\'s a huge that\\'s exactly that\\'s what we\\'re studying. Yeah, we were like, when we started talking, I was like, whoa, okay, we need to bring back the crew because\\n\\nRichard 05:16\\nabout 40 years in the industry up here. Cool. So pretty well connected with most of the different things. 
And that\\'s it I\\'ll answer any questions you got about whatever. Yeah. You guys are all doing this college study or what do you do?\\n\\nNikki Ritsch 05:36\\nWe\\'re PhD students.\\n\\nLeif Albertson 05:39\\nI\\'m the local contact. I work for UAF. I make sure nobody gets lost. Yeah, I bet we know a lot of the same people. I work with. So my wife is at YK at the office of environmental health. So she did the water lab there for a while. \\n\\nRichard 05:58\\nSo she\\'s probably who we bring our samples to every week. \\n\\nLeif Albertson 06:00\\nYeah, not anymore. Yeah. But she was supervised the water lab,\\n\\nRichard 06:04\\nI should say every month. I\\'m used to taking so many more than we take here. It\\'s like scary.\\n\\nLeif Albertson 06:10\\nBut yeah, we\\'ve been talking to people a lot. So like Brian Berube is doing online.\\n\\nRichard 06:16\\nHe\\'s doing great things, because he approached me about a year and a half ago about working with him to teach the classes. And I don\\'t know if he didn\\'t get the contract come through or anything. But I do know he\\'s doing great things, and having really good results, getting the local operators out here in this area and other areas, remote areas of the state, to at least pass a small water system or the small untreated system. I know that he\\'s doing a very good job remotely teaching classes. So\\n\\nLeif Albertson 06:41\\nhe was out here for a bunch of years. And so I think the insight into it that sometimes nothing personal but Anchorage\\n\\nRichard 06:49\\nYou\\'re not going to hurt my feelings. It\\'s a different realm in there, totally. and the quicker everybody realizes that the better off the state will be because there\\'s a huge shortage of operators. And what we do and the way we operated there doesn\\'t necessarily apply out here. It\\'s totally different.\\n\\nLeif Albertson 07:13\\nSo I am sure everyone has a million questions. 
Once upon a time the state made its own rules about water plant operators, right? Then we went to federal standards and lot of people, at least from what I\\'ve heard out here felt like that was not.\\n\\nRichard 07:30\\nSo what we do is a number of years ago, we went into the ABC system, right in was the American Board of certifications or something like that. Yeah. So we had to get come up with a standardized test that was basically applicable nationwide. And in the past, the tests were much more suited towards or developed towards our uniqueness or our you know, what we do up here and how things are done,\\n\\nNikki Ritsch 07:53\\nWas it initially state based then it became federally?\\n\\nRichard 07:56\\nRight, And some of that was done so that our operators and other operators can enjoy reciprocity. So if you like certain states, we recognize for reciprocity, and then we recognize them back both ways. So I have reciprocity down to Washington. And I want to say it\\'s back in New Hampshire, somewhere on the East Coast, they were looking at one of the companies I was reaching out to I was trying to get the job got me all certified to go back there and work, it didn\\'t work out. But so I mean, that that was supposed to be a good thing is to go through ABC, and someone had standardized testing. And it didn\\'t leave everything on us to come up with our own tests and our own specific questions, but we lost uniqueness that would apply directly to us. But we do get stateside operators commonly applying for jobs up here and then asking for reciprocity. And the nice thing is, is if they\\'ve taken an ABC exam we recognize it in general, there are certain states we don\\'t because part of it is they don\\'t recognize us. So\\n\\nNikki Ritsch 09:03\\nso how I mean how given that it\\'s federally now, like regulated, and the systems up here are pretty different. I mean, certainly, we\\'ve just spent time in Nunapitchuk. 
And so like that super small system is very different.\\n\\nRichard 09:14\\nright, you get out into smaller communities. They\\'re very different in that regard, right? I mean, they\\'re fill and draw systems, many of them they fill all winter long or out of the ice or all summer long out of the river and then they try and keep it going for them. Whereas other people are pumping groundwater pumping surface water or combination of both.\\n\\nNikki Ritsch 09:36\\nDo you feel like this kind of size system. Do you feel like that certification is I mean, does it prep them for what they need to know to run this kind of plant?\\n\\nRichard 09:43\\nIt\\'s 100% applicable right here. This to me is a plant you would see at any midsize community. I\\'ll use it Alaska term. I mean, this is a really normal plant. This is the same. In essence it\\'s the same plant I was running out in Kodiak for US Coast Guard facility had like 3000 residents, so,\\n\\nNikki Ritsch 10:06\\nso rural lower states maybe like in a rural area below, it\\'s similar systems.\\n\\nRichard 10:10\\nWhen you get out into the smaller communities around here where they\\'re more just filling it, chlorinating it, maybe they got a small package filtration system, I get that. But this is a, it\\'s a pretty nice little plant and you got a backup plant over here. It\\'s got redundancies in it. The lab, I mean, okay. And to me the tests that the ABC level one or level two, very applicable to this. But then again, I\\'m used to a much bigger system that I\\'ve tested at much higher levels. So\\n\\nLeif Albertson 10:42\\nthat\\'s kind of some of the like, I\\'m not a water plant operator. 
But some of the ideas or feedback that we\\'ve heard is that you know, that it is designed around reciprocity and being able to go other places and we\\'re dealing in the village with water plant operators who have lived in Kasigluk their entire life, they\\'re not going to go work in Seattle.\\n\\nRichard 11:05\\nand that\\'s perfect. We talked about that a little bit, because let\\'s get those operators certified because they have a vested interest, and I call them growing your own, they\\'re not gonna leave. Because right now the big problem is, you get an operator certified if they have no vested interest or no home roots here, they will work anywhere because of reciprocity, right? I mean, why do I want to stay here, if I can go make the same or better money and a cheaper place to live with more of the things that make life enjoyable, and it\\'s very mobile now state of Alaska PER system, up here used to be a defined benefit. When you when you retired, you got a defined benefit. Now, it\\'s not it\\'s a 401 K program, so you\\'re very mobile with your retirement. So there\\'s no real vested or no real interest to stay here a reason to stay here, the older employees that are like I\\'m halfway in, there\\'s no reason to leave. Now I\\'ve got a 30 year out program, and I get Lifetime Medical and get a defined benefit program. Well, now the Lifetime Medical is gone. It\\'s no longer defined benefit. So your 401 K is so mobile, there\\'s no reason to stay necessarily working in a PERS or a state PERS job, which many of these communities are.\\n\\nLeif Albertson 12:23\\nWhat do you think about like, how would it because trying to get people through the test is seems to be a challenge. And\\n\\nRichard 12:29\\nso what really comes down to me is, and I\\'ll say this right away, just go take the test, sign up, go take it. the community should pay them to go take it right away. 
And know going in, you\\'re doing it to learn what you don\\'t know or what you need to know. Yeah baseline, exactly. There are, we need to also teach how to take the test. Okay, four questions, right? I mean, four answers to every question. It\\'s multiple guess, right? If you can throw two of them out there, you get 50%, a lot of that comes down to looking at the test and realizing how to take it, right. If you can eliminate a couple now you have 50%, you need 70% to pass, typically 100 questions. And then at the end of that you get a report back saying, these are the aspects safety management, chemistry, math, and they\\'ll give you a breakdown of how you did in each one. Now you can go out, figure out what you need to learn. For many of them, it\\'s just instead of. Many people I speak with have said I took the test I didn\\'t do well, I don\\'t want to take it again. It\\'s not it\\'s a fear of failure. But it\\'s like why? It\\'s, you know, first time I took a test, I got 68 It\\'s frustrating, right, two questions. I know many people have come back with a 68. Go take it again, you know, sign right back up. 30 days from now go take it again, when whatever you study for if you\\'ve studied fresh in your mind, you only gotta get two more. You know, there\\'s like, I think there\\'s three or four different tests that pop out, or there\\'s, you know, there\\'s, there\\'s a small grouping of tests. And many of the questions are asked the same on each test. And then there\\'s different variances, but just go take it again.\\n\\nNikki Ritsch 14:23\\nYou were talking about some differences in math training and like problem solving techniques that make some of that challenging.\\n\\nRichard 14:29\\nYep, that\\'s where I\\'m going is, you know, find out what your weakness is. And then let\\'s teach. Many times you can look at a problem, look at the answers and go these are ballpark answers, right? 
I mean, math is quickly coming up with doing pounds formula, and you quickly come up and say, oh, you know what, 22 or 24 pounds seems close. Throw those other two out and only work on these two and figure it out. And that\\'s where I\\'m going teaching them how to take a test right, recognizing questions and what would seem to be correct based off your knowledge or your experience and being able to whittle it down and now playing with 50% answered instead of 25. Teaching them to go through and answer all the questions that they know off the front mark all the ones they don\\'t know, and then come back. And if you\\'ve got 70 already done, you know that you\\'re pretty sure you\\'re right, those 30 that are left, let me do the best I can, if I can get 50% out of those 30. Chances are good, I\\'m gonna get 85. Right? I mean, so that\\'s where I go, somewhat is teaching them how to take the test. The other thing is, is making sure they understand, don\\'t look at it as a failure. If you go in and take the test, if you don\\'t pass. Nobody\\'s critiquing whether you passed or not. Maybe the maybe the person that paid for it is frustrated that they paid for it didn\\'t pass but every mistake is a learning opportunity, right? I mean, as long as you don\\'t keep making the same mistakes over and over again, you\\'ll learn from it\\'s a good thing. So I hear that over and over again, as I passed, I don\\'t want to take it again. It\\'s like, No, you have to go take it again, you need to go take it again and as soon as you take your one and get it, take your two, \"well I don\\'t have the time\" I don\\'t care. Take your two while it\\'s fresh in your mind. So many of the same questions are there. Now you\\'ve got your two taken, and you\\'ve got all this time to get your experience. And then soon as you have your experience get your two license. Don\\'t wait and take the two. Take it right behind the one. Those are things that to me makes it better. 
And we\\'d get a higher success or a passing rate. If we just did that.\\n\\nMichaela LaPatin 16:37\\nSo this shortage of operators do you think a lot of it is people cannot pass the test? Or do you think there are things before and after that that are also issues?\\n\\nRichard 16:46\\nYeah. Okay. So in today\\'s world, is water and sewer treatment, a glamorous position? Who wants to go into it? My son is 16 years old, right? I mean, he knows what it\\'s done for us as a family. But he\\'s not in a big hurry to go into water, wastewater, and I can get him right in the door. I mean, no problem. How many of the kids or young people nowadays, young adults want to go into this field, even though you can work anywhere in the world, every community\\'s got it. And all you got to do is have a little bit of mechanical instinct and a little bit of basic knowledge of whether it\\'s chemistry or electrical or instrumentation. If you can have the right thought process, it makes this career this, this field pretty, pretty easy. But many of the younger people don\\'t really look at this. I mean, they might consider being electrician, they might consider being a carpenter because that\\'s a job that just came their way. But to really go out and go to a college or, or go, you don\\'t need to go to college, even though there are colleges out there that have programs, you can just go to the field. There\\'s a mentoring program, everybody will mentor. Young or inexperienced operators. I shouldn\\'t say young. Inexperienced operators that want to get into the field, I\\'ve yet to find any utility that\\'s not willing to have a path in the door for them to come in. You think it\\'s a workforce thing where people aren\\'t wanting to work or aren\\'t interested in it? Yeah just it\\'s I don\\'t know if it\\'s not glamorous, it\\'s not promoted that\\'s, you know, lack of knowledge out there that it\\'s a field, it\\'s, you know, understaffed and up here in Alaska. 
It\\'s a great paying job. Different communities have huge variances in their pay rates. And that\\'s some of it why, why stay in Bethel when I go to Anchorage. And I can make 45 bucks an hour, right? And I can stay here and make 30 and the cost of living is so much higher. Or I can go out to Dutch Harbor and make 42 bucks an hour, but now you got a $1,600 round trip to go back and forth. Right? I mean, but in general state of Alaska, we have very good pay scale for this water wastewater industry. You go down to the south, not nearly as good. When I say South I mean Georgia, Alabama. It\\'s not as good. Because I know I was looking to move down there years ago when I retired thinking I would go down there and get a non PERS job. \\n\\nNikki Ritsch 19:07\\nWhat does that mean? Non PERS?\\n\\nRichard 19:10\\nPERS is the state retirement system public employee retirement. Yeah. Public Employees Retirement System. And then there\\'s TERS which is the teachers employee retirements. So I\\'m sorry, but using acronyms. \\n\\nLeif Albertson 19:22\\nIt\\'s a big deal if you\\'re tier one or tier two. Yeah. \\n\\nRichard 19:27\\nThere\\'s tiers one through four. \\n\\nLeif Albertson 19:29\\nI\\'m tier 3, just to brag. I worked for the state.\\n\\nRichard 19:34\\nSo you get a defined benefit, you\\'ve been there long enough, right? Yeah, but\\n\\nLeif Albertson 19:37\\nI\\'m not working for the state anymore. I\\'m working for the University and I\\'m faculty so not even TERS. But I kept a little money in the system, because that\\'s what they told me to do at the time, because I used to be tier three if I ever go back.\\n\\nRichard 19:51\\nTier three gets a defined benefit plan, meaning based on your years of service, you will get a percentage of your highest three as a retirement plan.\\n\\nNikki Ritsch 19:59\\nAnd this is what you\\'re saying Billy is on.\\n\\nRichard 20:01\\nBilly is tier one. 
He\\'s been here 42 years, a full retirement in PERS is 30 years or a certain age. So after at 30 years, you\\'ve acquired 67.5% or something like that. It\\'s like 68%, just under 70. Based off a percentage for every year you work. If you elect to continue working, you continue to accumulate percentage like 2.25% every year. So Billy\\'s probably making 95% of his highest three continuous, or sequential years. But I mean, bless Billy\\'s heart for coming to work, right.\\n\\nLeif Albertson 20:41\\nSo I was on city council, and I talked with the city manager about this years ago, it\\'s like, we\\'re on, and he\\'s not getting any younger, like, so he could quit at any moment\\n\\nRichard 20:53\\nand make the same as he\\'s making today? Probably. Yeah. I mean, well, when I retired at 30 years, my take home pay was more than my retirement. Or excuse me, my take home pay retired was more than my take home pay when I was working. Because I wasn\\'t paying a lot of the bolt on cost union dues, you know.\\n\\nLeif Albertson 21:15\\nYeah, no, that was I mean, it was real concern of mine, when I was on council is that, depending, I mean, there\\'d been other times where Billy was like, the only thing standing between us and not having an operator. So I was very interested to learn that, you know, it\\'s like, what\\'s going to happen? And I guess you\\'re what happens. We hire somebody to come out. \\n\\nRichard 21:32\\nWell, we got a call one day and they said, When can you be here? And we said, well, kind of like Kodiak, when do you need us? And they said, How about next week? So\\n\\nLeif Albertson 21:39\\nyeah, I suspect, like Kodiak, there\\'s not. I don\\'t think there\\'s anyone else in the pipe here. Right? 
Like, I mean, you might be,\\n\\nRichard 21:47\\nwell, there are other firms that provide operational assistance here, like Northern utility services, and well there\\'s a couple of firms that do a lot of work in the anchorage area for the different water systems that need operators to oversee them. And they can provide, I think the other one is, ice services is another one. But there\\'s a there\\'s a couple of businesses in Anchorage or in the surrounding area that provide that, but I don\\'t know, I don\\'t know too many operators that, well part of it is they don\\'t want to get licensed and pay all the insurance that you gotta have to contract with a municipality or a city.\\n\\nLeif Albertson 22:27\\nWell, the goal would be that we would hire someone to live here and work in Bethel\\n\\nRichard 22:32\\nand that\\'s, that\\'s your only way out. Yeah. And that\\'s, that\\'s part of the reason that we were in Kodiak for 22 months is we hired those individuals, and then we trained them, and we got them the time they needed to get their level one, and then they were able to use the there\\'s an agreement where your time counts towards the wastewater. And we can get them a dual level one in one year. So\\n\\nNikki Ritsch 23:00\\nyou have to have a level four around or within 20 minutes of the system at all times, right?\\n\\nRichard 23:04\\nNo, Okay, so federal regulations require that you have somebody around like you say, or whatever your commute or whatever it is, your primary supervisor has to be licensed at the level of your plant. Okay, your secondary level supervisor can be one level below it. But you can\\'t have a secondary shift on any day, you don\\'t have a primary shift. So if you have a level two system, you\\'re going to need to have a level two operator and that applies to the treatment side of it, the distribution side and the collection on the wastewater, the treatment on the wastewater. 
So all those things come into play, you can have a secondary and the thing that we\\'ve kind of realized, and those are the federal statutes, why, why can\\'t we have a guy that works with me all day long, come in on the weekend, and just walk through and check the facility? Maybe he doesn\\'t make any changes. So recognizing the situation that many communities are in, they may have a maintenance person that comes in on the weekends and checks the facility to make sure there\\'s no leaks or anything like that. But as far as making treatment adjustments, or anything that could affect the quality. That should be done in consultation with a properly licensed operator, which would be at the level of your plant, or your system\\n\\nNikki Ritsch 24:38\\nSo in Nunapitchuk we were talking to the operator there, their system is a small,\\n\\nRichard 24:44\\nsmall, untreated or small treated?\\n\\nNikki Ritsch 24:46\\nSmall treated\\n\\nRichard 24:47\\nsmall treated? okay.\\n\\nNikki Ritsch 24:49\\nAnd he is pursuing his level three so that you can make more autonomous decisions about it. So there\\'s something around like, because it\\'s small he has to call in to get approval from. He\\'s saying he has to call someone to get an approval, we asked why he wanted his level three, he was like, because I want to be able to make my own decisions. He was level three.\\n\\nMichaela LaPatin 25:06\\nIt\\'s the fluoride and chlorine.\\n\\nRichard 25:08\\nSo every system has a has a classification or for the state of Alaska has classification system. They go through and based on your profile for your system, they attribute points to get added for each one, and then comes up with what level you\\'re required to have. You can test one level higher than your system. So I\\'m not sure what the system is out there. Where did you say Napakiak? 
\\n\\nNikki Ritsch 25:32\\nNunapitchuk.\\n\\nRichard 25:38\\nThat one, okay, so they\\'re a level two system.\\n\\nLeif Albertson 25:44\\nThey have fluoride. Which is kind of what made me feel. but then they said that.\\n\\nNikki Ritsch 25:48\\nAnd then we asked him at the end.\\n\\nMichaela LaPatin 25:50\\nHe might have just misunderstood.\\n\\nNikki Ritsch 25:51\\nHe might have because I asked twice. I said like,\\n\\nLeif Albertson 25:53\\nWe can look it up.\\n\\nRichard 25:54\\nYou can look it up real quickly. And you can do a search of the operators and get all of that. But typically, you know, that would be the level of their system. And when it comes to experience and testing, you can only test one level higher than the system you\\'re operating. Okay, or you have experience in.\\n\\nMichaela LaPatin 26:14\\nSo if he\\'s operating level two, he can\\'t get a level four.\\n\\nRichard 26:17\\nHe can\\'t get a level four, but he can get a level three. \\n\\nNikki Ritsch 26:20\\nDoes that give him more autonomy? \\n\\nRichard 26:22\\nA level three? You can work anywhere in the state. There\\'s very few level four systems in the state of Alaska. \\n\\nMichaela LaPatin 26:30\\nSo if he\\'s operating a level two, and he has his level three certification, does that do anything for him if he doesn\\'t plan to ever leave Nunap?\\n\\nRichard 26:39\\nDepends on how their contracted or their agreement is many, many communities or cities or whatever, have recognized the fact that they\\'re trying to incentivize employees to go out and get higher level licenses. Because that then allows them to have flexibility when it comes to supervisory staff, whether they have primary shift supervisors or secondary shift supervisor, city of Anchorage will hire you in. It used to be you got hired in level one, and then you worked your way up when somebody died or retired or got fired, right. 
I mean, there would be a promotion or an opening that then everybody would bid on and you would move up. Now if you come in and you\\'re hired in with a level four, you are paid as a level four operator. \\n\\nLeif Albertson 27:24\\nEven if you\\'re doing level one work?\\n\\nRichard 27:26\\nYeah, even if you\\'re acting as a junior on that shift, maybe you\\'re a newbie. So many communities will incentivize through their collective bargaining agreement or their contract or whatever they have with their employees and say, when you\\'re a level one license, you\\'ll get paid as level one, but here\\'s the level two, scale three and four. And promote employees going out and getting higher license or multiple licenses. Let\\'s say you want them to be licensed in water and wastewater, maybe throw a percentage on the contract says if you go get a wastewater, and you\\'re hired as water, here\\'s another 5%. And I think it\\'s a great thing. I mean, you need to promote that. Now the risk you run, you lose them to a level 4 plant. Yeah. Yeah. And that\\'s where I say you, you look for these employees. If you have the liberty or the flexibility to do that. You look for ones that have ties to the community. \\n\\nNikki Ritsch 28:20\\nI mean, Raymond\\'s never leaving Nunap, right? \\n\\nLeif Albertson 28:23\\nYeah I mean, it\\'s probably true, but it\\'s like trying to you have a limited. You know it\\'s a village of 500 people\\n\\nRichard 28:29\\nDoes he have any kids?\\n\\nLeif Albertson 28:32\\nYeah, like, how many people could do it and how many people will do it? And then I mean, you know, I don\\'t know, like, if you\\'ve been to. You know the trick is like sometimes folks don\\'t read so well, even.\\n\\nRichard 28:40\\nYeah, that\\'s, but keep in mind, they can ask for assistance on the test. Right? They may have interpreters.\\n\\nNikki Ritsch 28:48\\nYou mentioned the math thing too. 
\\n\\nRichard 28:49\\nWell, they can\\'t help them with the problems, but they can have interpreters if they don\\'t understand correctly or maybe English isn\\'t their primary language. There are those accommodations that can be made. And, you know, there\\'d been many operator that just doesn\\'t test well. They know it. They know their job. They know what they\\'re doing. They know how to do it, but they might not understand the questions or they might not understand how to test like I say, teaching somebody how to test. It\\'s a biggie. If you can whittle out a couple of questions or a couple of answers and get it down to where you\\'re taking a stab at two that are pretty close. A lot easier. So\\n\\nLeif Albertson 29:25\\nyeah, I mean, that\\'s I\\'ve hear too like for a lot of these folks. It\\'s maybe the first time in their entire life they\\'ve had to sit down with it, you know, and take a test this long.\\n\\nRichard 29:34\\nand depending on if they fly out to Anchorage, or if they go to somewhere or somebody comes out and proctors the test in their local community or you know, however, it can be intimidating, go sit in UAA and, you know, lock everything up in your locker and you walk in it\\'s totally quiet. They\\'re sitting there for three hours taking your test. You can\\'t even wear a hoodie, right? Yeah. Yeah, no, but I mean, so that can be intimidating, you know,\\n\\nLeif Albertson 30:03\\nIs anchorage where people go mostly?\\n\\nRichard 30:05\\nAnchorage, Fairbanks or sometimes different instructors will come out and teach a short, like 4-day class and at the end of the class, they\\'ll have the opportunity to test and they\\'ll proctor the test, or they\\'ll fly remote proctors out or I mean, proctors out to remote locations. But more and more it\\'s travel to Fairbanks or Anchorage, attend a class. 
And you can\\'t test until you attended a class or you have certain amount of experience, there are some initial things that you have to do before you\\'re eligible to test.\\n\\nMichaela LaPatin 30:40\\nSo you wouldn\\'t be able to say like, Alright, I\\'m gonna go get these books I\\'m gonna study on my own.\\n\\nRichard 30:44\\nNo, you can. Some of those books qualify, there are training materials, like the Cal State can carry courses and stuff like that.\\n\\nLeif Albertson 30:59\\nYou need to be employed, you need like a qualifying job? So like, could Nikki take the test?\\n\\nRichard 31:06\\nIf you if you meet the minimum criteria to to allow you to take this you do not have to be employed to take the test. \\n\\nLeif Albertson 31:11\\nWhat are the minimum? \\n\\nRichard 31:13\\nIt might be taking a Ken Karey course, like getting those completing water treatment one, or signing up to go to one of these training classes where they do the four day and then test at the end.\\n\\nNikki Ritsch 31:24\\nYou don\\'t have to show a degree in any kind of?\\n\\nRichard 31:27\\nDegrees count toward CEUs, you may very well qualify. They\\'re not trying to limit it. They\\'re not trying to exclude people. But at the same time, there\\'s some things that probably it\\'s agreed upon, would give them a better chance of passing.\\n\\nLeif Albertson 31:44\\nYeah, I\\'m just thinking if someone was very interested in this as an academic problem, but had no intention of really running a water plant. But is that something that could happen? Or would they be like no, you really got. we\\'re only here for.\\n\\nRichard 31:57\\nI don\\'t think there\\'s anything that would, there are many things that you can reach out and apply and ask and in my experience, is that the board or AVUC has been very accommodating with those individuals. The only place I know that you can\\'t, is you can\\'t apply for reciprocity. And be granted reciprocity. 
Unless you\\'re offered, you have an active job offer. That\\'s the only place I know where you need something like that.\\n\\nLeif Albertson 32:27\\nBecause that might be interesting to like, understand what the test is, like, we\\'ve been trying to beat it out of people who have taken it, you know, like, What was hard for you? What happened? What was you know?\\n\\nRichard 32:39\\nI mean, trying to come up recognizing the fact that how many of these communities are level ones level twos, threes and fours, right. I mean, there\\'s, there\\'s a small inkling of them that are level twos. When you start looking at them out there, I\\'d be I\\'d be shocked if that were a level two\\n\\nLeif Albertson 33:07\\nNunap according to DEC is a level two.\\n\\nRichard 33:08\\nIt is? Okay? Why is that? What do they have at their system?\\n\\nLeif Albertson 33:13\\nI mean, we saw chlorine, fluoride?\\n\\nRichard 33:16\\nAre they generating chlorine on site? Or are they bringing it in by the\\n\\nMichaela LaPatin 33:20\\nBringing it in, right? \\n\\nRichard 33:22\\nBuckets, calcium, calcium chloride?\\n\\nLeif Albertson 33:24\\nI didn\\'t see it because they said they moved it. It used to be stored in the building.\\n\\nMichaela LaPatin 33:28\\nMolly had talked about like one time that they almost ran out and it was at the store. \\n\\nRichard 33:34\\nSo a lot of it just comes down to their treatment processes that they\\'re using. They\\'re doing exactly what we\\'re doing here. Just on a smaller scale.\\n\\nNikki Ritsch 33:38\\nPotassium permanganate, green sand filter. \\n\\nLeif Albertson 33:47\\nFluoride was the only thing that I like really stood out, like seemed like a bonus to me.\\n\\nMichaela LaPatin 33:50\\nYes. We just always hear about fluoride, people being afraid of it.\\n\\nLeif Albertson 33:54\\nIt\\'s extra points, right?\\n\\nRichard 33:55\\nYeah, everything you do adds up. So I don\\'t care what you\\'re doing out there. 
It all has some point value attributed to it unless it\\'s kind of the byproduct of a different process that it\\'s already been captured over there. Yeah, there\\'s so they\\'re doing the same thing we\\'re doing here. Right. And the fluoride, we\\'re fluoridating we\\'re chlorinating we\\'re not on site generation of chlorine. We\\'re calcium hypochlorite. Yeah, so it\\'s, a lot of other places they generate their own.\\n\\nNikki Ritsch 34:28\\nYou gave me pretty good rundown of like what had gone wrong here. Can you give us a run through again?\\n\\nRichard 34:35\\nSo when we first got here we found a number of things that we questioned as to why they were occurring and people couldn\\'t answer them. So to give you an example, we went to backwash filter, there\\'s an air relief on top of the filter. So the process is you drain the filter down, you air scour it and then you fluidize the bed and wash all the debris out and then you re-stratify the bed, and then you turn it back around, in essence, a new pack everything down, and then you start flowing water out. Well, when we went into the automated backwash functionality here, hit the backwash filter, it would drain down for 13 minutes. And then when the air scour came on, it just blew water out the air relief all over the whole the filter bay. Instead of thinking about why is this occurring, the solution was, well, we don\\'t want to spray water all over the place. So we grab a coffee can and we put an empty coffee can over the end of the air discharge pipe. And we tie wrap that on, so it just lets the water cascade down. So I don\\'t know how long it\\'s been going on. But it was on all three filters. Well, the problem was that there weren\\'t draining the filters and giving them that airspace they needed. So when they hit it with air, it could agitate the media and do a normal backwash. So I don\\'t know how long the filters weren\\'t even being correctly backwashed. 
We have redundant equipment here that is offline, turned off. We don\\'t know why. Right. So you know, I mean, things always break but let\\'s fix them. So one of the air blowers was offline. Right now as we sit here today, we just checked out this morning, one of the sump pumps that pumps out the backwash debris is offline. We were promoting going away from the LMI pumps that are being used out here for chemical feed, trying to promote going to a peristaltic. Part of the reason is, I can take you out there and show you 15 or 20, cannibalized LMI pumps and pieces and parts scattered everywhere. And we don\\'t know what kind of Frankenstein pumps they put back together. They seem to be working. But how well are they correct? I don\\'t know. So, you know, we\\'re just finding these things that. Question everything when you come into this plant, right? The probe over there. I mean, first time around a pH and the pH came in at 8.6. Groundwater. No ain\\'t happening. Right? But we run it over and over again. And well the probes over two years old, the buffering solution that they have up there\\'s older than that, and they couldn\\'t calibrate the probe. no stir plate, you know, nothing. So there was a hoc order right there just ordered all those other things. So those are the type things that you know. Stand running the fluoride blank, I looked at T and I go, Hey, when\\'s the last time you made up a zero? He looks at me like what? So we got to make up a zero. And are you keeping it covered and light and temperature sensitive? That type of stuff. Many of those instances. I can go and tell you we haven\\'t found anything he was doing incorrectly. Could it have been done better? Yes. Was it a public health thing out there? Nothing we found yet. But there are many things that could have been done better to provide better water.\\n\\nNikki Ritsch 38:14\\nThere were a couple, but like, whether they were violations or warnings, I don\\'t remember. 
In April, there were certainly some postings of some warnings.\\n\\nRichard 38:25\\nThey\\'re up there. So what we found here is first off, they have because they had lead copper hits. they started doing a orthophosphate or corrosion control program here. But they haven\\'t started doing, well, they have now because we just started. There was a long lag to do the required testing. To get past the initial startup and the interim approval to operate to become a final approval. There\\'s a long time where they weren\\'t doing the required testing out in the field. And the other thing that you probably saw was as part of the CCR report, they were not compliant on their disinfection byproduct testing. And they had some hits on it.\\n\\nLeif Albertson 39:13\\nYeah. We talked about lead testing. And then it was some of the water came from my house.\\n\\nRichard 39:23\\nAre you on lead copper program?\\n\\nLeif Albertson 39:25\\nWell, no, like, I was on council and we\\'ve known Bill Arnold forever. My wife worked for him. So when, when it came to light that they needed to do a bunch of sampling. I own a couple of houses. So they did the pool and the fire station.\\n\\nRichard 39:40\\nso part of the requirements you have to develop a very structured sample program, which we which there is there\\'s all identified locations, and then lead and copper is one of those few programs that. you\\'re familiar doing it but you actually give the sample bottles to the customers and the water needs to sit for eight hours in that tap, and then it\\'s a first draw. We don\\'t collect it.\\n\\nLeif Albertson 40:03\\nYeah. So we did that. I didn\\'t know that that was I was. My wife ran the water lab. So I didn\\'t know if she was doing it for the city or because all customers, like if she would have been doing that as a lay person.\\n\\nRichard 40:14\\nnot generally. So that\\'s one of the few that we don\\'t take, we actually give it to the residents. 
And then the theory is, is that we want to know, the lead copper, is it coming from the household plumbing? Or it\\'s coming out of the system? Right, because we\\'re lucky up here, we don\\'t have really lead pipes. I don\\'t know if you\\'ve ever seen a lead pipe. It\\'s pretty amazing to see. So but they weren\\'t doing the ortho phosphate program testing required, just literally said, Hey, guys, what\\'s going on? And there\\'s three emails from Dowl and AVEC. And we got clarification, and then those over there, that\\'s a scheduling program or sampling program for us required by the state to meet our permit, highlighted in blue. And, you know, they weren\\'t. They weren\\'t testing.\\n\\nLeif Albertson 41:07\\nLauryn and I talked to Chase Nelson\\n\\nRichard 41:13\\nHe just flew back with me last Thursday.\\n\\nLeif Albertson 41:14\\nOh, yeah. Okay, yeah. I know him, like I worked with him on the lead paint too.\\n\\nRichard 41:18\\nThey\\'re developing, they were the ones that were did the engineering and development of the orthophosphate testing program. But I don\\'t think anybody was pushing at a local level to actually do the testing. So we\\'re required to take one test every two weeks, from here, point of injury. And then I believe that every three months or every six months, we have to do lead copper. And then we have to take a couple samples from the end of the distribution system. So we\\'ll start doing that. There\\'s a sample bottle sitting there in front of you just ordered, and then it was shipped out to SGS. And then I can\\'t tell you why the disinfection byproducts weren\\'t taken. We weren\\'t here we didn\\'t even know they weren\\'t taken, until we found out that they weren\\'t taken when the CCR came out. \\n\\nLeif Albertson 42:04\\nIf I remember, right, that was part of what was going on with lead and copper was that initially, they were supposed to test and didn\\'t. So it was just missing. 
And it was like, oh, man, we gotta get the city up to speed because there\\'s, and then that was the thing. That\\'s what came back hot. And like fire department, right? Everyone was like, oh,\\n\\nRichard 42:25\\nand then they went with a corrosion control program to make sure that well to try and lower the lead and copper. So you get to reduce monitoring, that\\'s where you want to do. But you demonstrated that there\\'s not a problem, you want to reduce monitoring. So hopefully, the corrosion control project fixes that. And then we come back with good samples. And there\\'s no reason they shouldn\\'t. I mean, the water shouldn\\'t be too aggressive, but 7.3, 7.4 pH, you know, maybe they have to adjust the pH to cut that down.\\n\\nLeif Albertson 43:04\\nI got one that\\'s a little bit different direction. I mentioned. So like, online training. Are there other places where you see technology helping with some of these challenges?\\n\\nRichard 43:18\\nWhen you say other places, other career fields?\\n\\nLeif Albertson 43:20\\nOh no, other places within water treatment and distribution where there\\'s technological solutions to some of these challenges or things that would help like, Could you be, you know, could somebody be monitoring this from Anchorage? Or is there a way to?\\n\\nRichard 43:36\\nRunning the plant? Are you talking to the plant? Or are you talking the training?\\n\\nLeif Albertson 43:40\\nAll of the above.\\n\\nRichard 43:41\\nSo automation, and remote monitoring is huge. And I\\'m a big proponent of it. In fact, I probably benefit as much as anybody because I can sit at home. And I can monitor Dutch Harbor. And I\\'m the actually the AMOSS, which is the alternative method of systems and provision for special exceptions when they don\\'t have operators properly trained. 
Like right now Dutch has one, one properly trained operator, when he leaves to go on vacation, or has a medical emergency or something like that, what they\\'ll do is contact me and I will monitor remotely. And I can run the plant and everything else. It depends.\\n\\nNikki Ritsch 44:19\\nCan you like add chemicals and stuff, everything remotely? Or are you just monitoring?\\n\\nRichard 44:23\\nI can control chemical dosage pumps. I can change the UV settings, I can adjust pressure, anything, anything that\\'s automated, that they have the ability to, to input, you know, into the plant, or you can do remotely, and that\\'s huge. And we as a Board had to recognize that fact and also recognize the fact of the amount of training and knowledge that you get from doing that. It used to be you\\'d only get experience if you\\'re on site. If you\\'re now remotely monitoring and controlling. There\\'s a percentage of that time you get towards training. Once again, it\\'s not like they\\'re trying to limit the amount of experience or knowledge that a person can get to put towards licensing, they\\'re trying to adjust with with the changing technology, so SCADA. Y\\'all know what SCADA is? Right? I mean, it\\'s huge. And automation as it comes out, is the way of the future. So everybody\\'s monitoring stuff on their phones right now. And you can any place that allows it or allows you to log into it or gives you the privileges to do it, then the question is whether or not they want to give you the ability to just read it, or you can write into it. And if you can read and write, you can monitor it, and you can make process controls. And certain things you lock out so they can\\'t make process controls or other ones, you actually give them the ability to do it. So when you look at that, it\\'s huge. And it\\'s a changing thing. 
So somebody in Tuntutuliak maybe leaves, but if they had a remote control system, however, basic it was somebody in Bethel or somebody else probably could monitor it. And make adjustments if required. Now, there\\'s no substitute for look, listen, and feel when you go out in that plant. I can look, I can sit here, when the doors open, I can hear every pump that starts I can hear when they pull up here and load a truck, you know, you have to be able to do that. And that\\'s why it\\'s nice to be able to have that maintenance guy go in on the evenings or weekends. And check the facility and make sure it\\'s okay, and the operator can go do whatever.\\n\\nLeif Albertson 46:25\\nRight make sure there\\'s not water shooting out in the coffee can at the end of the reservoir.\\n\\nNikki Ritsch 46:30\\nAnd if we are taking too much time, we can leave.\\n\\nLeif Albertson 46:35\\nSo wonder if that would be a solution for us.\\n\\nRichard 46:37\\nIt\\'s great, because the water trucks are showing up here now. So plant runs all day. And I like it when the plant runs, it\\'s easier to do everything.\\n\\nLeif Albertson 46:47\\nSo would that would that be could that be a solution for Bethel, then? \\n\\nRichard 46:52\\nSure\\n\\nLeif Albertson 46:53\\nSo what\\'s the capital costs, what\\'s the barrier?\\n\\nRichard 46:55\\nHuge. So actually, this plant is fairly well automated. But it\\'s automated. But it\\'s at the basic level, it\\'s like out there, there are certain controls, but they haven\\'t gotten to that next level where everything\\'s monitored sitting here. And I can make process controls from here. Out there. It\\'s the old touchscreen, right, it\\'s the old local, the operator interface panel, they haven\\'t brought it to a human machine interface, or an HMI where I can sit there. And once it\\'s on there, I can do it from anywhere, right, once it\\'s internet based. 
But they\\'re talking about that, because the pumps that we were looking at buying for the chemical feed. They want those to be able to integrate into a new SCADA system. So a lot of times what you run into is putting a SCADA system in is like remodeling the house. Once you start, you got to be prepared for what you find and know how far you\\'re gonna go. Because you\\'re not going to put out a bid for an electrical instrumentation contractor to come in here. And he\\'s not going to work on stuff that don\\'t meet code. Right. So then a lot of times you\\'re doing a complete electrical upgrade and bringing everything back to where it\\'s UL listed along with the instrumentation and control that the SCADA system brings. So you know, all these things that were done that maybe were fine 20 years ago, or you cut a wire, you drop this, then it becomes a change order, right. Or you go into it with a bid or a design project to where you\\'re going to do the whole thing from one door to the other.\\n\\nLeif Albertson 48:37\\nSo like, a place like Nunapitchuk where they\\'re building a new water treatment plant. The old one is sinking into the tundra, it is not in a good way. So right across the boardwalk, they\\'ve got platform with driven pilings. And it\\'s going to be half water treatment and half washateria on a platform. So I don\\'t know if they\\'re reusing any of the maybe they\\'re moving the boilers over or something or if they\\'re doing everything from scratch, but a situation like that, like remodeling a house is a lot easier to build it right first time then go back and add the air ducts.\\n\\nRichard 49:12\\nSo many systems out there that can be used for remote monitoring. 
There\\'s lots of packaged SCADA systems and then you just have to you know, come up with a set of guidelines and develop everything, how you put on your internet or how you put on a computer based program as opposed to being out there with a you know, hand off auto switches or stuff like that.\\n\\nLeif Albertson 49:39\\nIs the capital cost excessive? I mean is it a lot when you\\'re building a new one?\\n\\nRichard 49:44\\nNo I would think that would be just the norm. It\\'s just the new standard.\\n\\nLeif Albertson 49:49\\nWe didn\\'t ask about that so I\\'m curious. \\n\\nRichard 49:50\\nEverything would be, you know, instead of pump having an on and off, it would have, everything would have an inverter rated motor where you control the pump speed remotely. It would have feedback loops coming back to tell you that, yes, it\\'s running or it has a fault, or, you know, it wouldn\\'t just be the basic on/off.\\n\\nLeif Albertson 50:08\\nIs that pretty standard that every new plant that\\'s built is?\\n\\nRichard 50:12\\nYeah, well, it would be the industry norm. But it\\'d be sometimes you don\\'t do the industry norm because it might be a little more complex than what you would want to maintain. That\\'s the other thing that goes along with a SCADA system. Once you\\'re into it. You\\'ve got a maintenance agreement where that SCADA firm is constantly on call to correct things. You can build a system so automated you don\\'t even need an operator. But when things go to hell in a handbasket. How do you run it? You lose internet, and the guy who\\'s running it over there doesn\\'t work, right? Or if you lose a control circuit somewhere, can I go over to that board and go back to the old school way and turn the pump on? Or do I got to call somebody in Juneau at Boreale Controls and say, Hey, I can\\'t turn my pump on, can you fix it? And all you get a busy signal or you get a voicemail? Right? 
So I mean, it would be the industry norm to do that any new construction, but then it\\'d be the individual application as to how far they want to go with it. As far as automation, and then are they willing to pay, like every application where there\\'s rockwell or iFix or there\\'s a license fee, every year that you have to pay, there\\'s a maintenance agreement, and then you have to have the people that can maintain it. And SCADA is quite a field, right? And now that\\'s you talking about kids wanting to go into the water/wastewater? No, I tell my son don\\'t do that. Right. I mean, if you\\'re, if you\\'re interested in computers like you are, go in to the automation, go to the SCADA field, and start writing ladder logic and, you know, structure text, or whatever the newest thing is out there. And, and use that. Once again, it\\'s, it\\'s kind of like don\\'t buy a stock for the gaming application, buy the stock that hasn\\'t been found yet, but it\\'s going to be supporting the game application, you know, dealing with. So, but SCADA is its own beast. It provides all kinds of benefits, it provides all kinds of flexibility. But if you\\'re in a little, you can be a lot, and you got to have the support to fix it.\\n\\nLeif Albertson 52:25\\nSo I mean, Nunap is probably, ANTHC engineers that would know the answer to that question.\\n\\nRichard 52:31\\nRight? Yeah. I mean, you\\'d want to know, are they setting up for remote monitoring? Are they building remote control into it? Or is it just monitoring or none of the above? And then if we\\'re doing remote monitoring, remote control, have we built a technology into it where an operator can run it locally? When we lose that automation, and that\\'s, I\\'m from an operator\\'s perspective, there is no way I want a plant built that I can\\'t go out there and get a hand dial. I want to be able to hit hand. And Dutch Harbor is a perfect example out there. They had a brand new plant, they\\'re under a consent decree. 
So federal government ordered them to build improvements to the wastewater plant. So they did it. There are many things out there that you can\\'t take hand control on. It\\'s all automated. It\\'s all built in. So when we find something wrong, we\\'ll be calling Juneau. So Boreal Control will go in and try to figure out what\\'s blocking it out. What\\'s what\\'s not allowing that puck to run? Okay, something as simple as a reservoir, calibration on a reservoir, you know, you call them up? Can you build an offset, and you know, we\\'re saying we\\'re we got three feet of freeboard, so build me a three foot offset into it, and you\\'ll go into there, put in three foot offset, and then you\\'re good to go. In other companies like AWOO, in Anchorage, they have their own SCADA group within their electrical instrumentation department. But once again, you go to a firm that\\'s 260 strong, you have that, but they have a hard time getting SCADA techs. They have a real hard time doing SCADA good SCADA people and a lot of times you have to go out and contract with local engineering firms that have that.\\n\\nNikki Ritsch 54:15\\nThat\\'s because people in CS don\\'t want to be in water? It\\'s not high paying? \\n\\nRichard 54:20\\nI\\'m not sure. Some of it is probably okay. Okay, so Some of it\\'s probably they\\'re a unionized workforce. If I\\'m a SCADA tech, I can make far more money elsewhere, than I can working at AWOO, getting paid the union electrician scale. So I\\'ll go to work as a private contractor, form my own company. So goes back to what we were talking about earlier, right. You go to work somewhere for a couple of years. It used to be once you\\'re there for five it\\'s like man, I might as well stick around to be vested. Right and then while I\\'m in here so far, I might as well finish it out and get done. Not anymore. That\\'s all industries. 
In fact, it\\'s so bad in police and fire they were talking about going back and forming a special retirement group for them to try and promote retention.\\n\\nNikki Ritsch 55:17\\nI have a couple more questions, if that\\'s okay, if we\\'re keeping you.\\n\\nRichard 55:23\\nI don\\'t go home til next Thursday. \\n\\nNikki Ritsch 55:25\\nDo you sleep here? You said 24/7.\\n\\nRichard 55:28\\nThe other plant. There\\'s a little apartment up over Billy\\'s plant. Okay, so we just bunk there. It works. Only one there. But we do get along.\\n\\nNikki Ritsch 55:42\\nOh, you don\\'t bunk with Billy.\\n\\nRichard 55:43\\nOh no I have no idea where Billy sleeps. He drove in this morning, I was driving out. And then Rick and I, we shared many a night together.\\n\\nNikki Ritsch 56:00\\nIs there anything that you need to learn outside of certification? I mean, it\\'s a federal certification process. Plants are local, what are the things you need to learn outside of the certification process that each person kind of has to figure out at each plant?\\n\\nMichaela LaPatin 56:12\\nI was gonna ask the same. So if I can just Yeah, I think. So. There\\'s like training. There\\'s the exam. And there\\'s actually working. And I think we\\'ve talked a lot about the training preparing you for the exam, and like, cool, maybe gaps, maybe things working, but what about after the exam to actually working? what\\'s missing there?\\n\\nRichard 56:31\\nSo let me just say, just because you pass the exam doesn\\'t make you a good operator. I would not say that you\\'re a good operator. I wouldn\\'t even, I met operators that were good operators have been in the field for years, right. But they come in, they do their job they go home. The good operators, or the operators that are really beneficial to the utility have a broad knowledge base. 
I mean, because we\\'ve already talked about electrical, we talked about instrumentation, we talked about signals, going back and forth and talked about computers, talked about ladder logic, or logic programming. When you bring all of those things together, then it\\'s much better for you to be able to troubleshoot stuff to understand why things are happening. It\\'s like right out there on plumbing, waiting for the plant to shut down so I can put the plumbing parts together. I had electrician earlier that I told him that I didn\\'t think the starter had power to it. It could be the fuses that are out trying to troubleshoot equipment, whether it\\'s pressure relief valve, so it\\'s, if you can find somebody that has. It\\'s a thought process, right? It\\'s like, how do I solve a problem? Identify the problem, I think about what could be causing it and then I started doing testing to identify it. Or I try switching things out to see if it fixes a problem. It\\'s not put a coffee can over the pipe that\\'s spewing water out. What\\'s causing the water to spew out? Okay, now it comes down to I can\\'t get water out of the filter. It\\'s not draining, right. So now why is it not draining? And now I start taking the drain pipe apart. Right? And I\\'m figuring out is there something blocking it? I can\\'t figure out if there\\'s anything blocking it because I stuck a pipe in there. And it\\'s wide open. Well, then I get the manual out and I start reading about it. And I call the manufacturer and he says oh well there\\'s a stainless steel screen that\\'s about 48 inches that goes into the pipe or into the filter, and it\\'s probably plugged. But the solution was not just to put the coffee can on all three filters and let it go. It\\'s it\\'s being able to do that stuff. It\\'s finding an air compressor that\\'s not working and wondering why it\\'s not working and going oh, look it filled with water. Okay, compressors don\\'t work well with water. 
So let\\'s put a new compressor on there. And guess what? A new compressor went out too, why? Because there\\'s water in it. We haven\\'t fixed the water problem. So it\\'s a thought process that goes into it. It\\'s not just fixing the problem. It\\'s what\\'s causing the problem. It helps to know a little bit about electrical, helps to know something about instrumentation, helps to know plumbing, I mean, computer skills I mean one of the first things that Rick and I did. That\\'s the old daily log. What does that mean? There are books and books. These are daily logs. What the hell does that mean?\\n\\nNikki Ritsch 59:47\\nNo idea. And they\\'re all in books? And you change on your computer?\\n\\nRichard 59:53\\nYou go and you develop a spreadsheet, you know, it\\'s so much easier. I mean, you still might ask what it is, but at least it all has a label and legible in a daily log sheet down here the what you said and did all day. that type stuff. Now, the old time operator might not want anything to do with that. You\\'d rather write it all down there. But what makes a good operator? Willing to change, keeping up with technology and times, having a good thought process as far as problem solving. Not just fixing the actual problem, but finding out the root cause of it. That type of stuff.\\n\\nNikki Ritsch 1:00:39\\nDo you feel like that\\'s trainable? Is that something that is a part of current training practices?\\n\\nRichard 1:00:45\\nYes, I think it needs to be promoted. For so many operators, their career is driven or their career is actually kind of throttled or held back by the ability to pass a test and be certified. It doesn\\'t necessarily make them a good operator, but it probably gives them more pay. 
And it allows them to accept more responsibility and allow the community or the municipality to be compliant with regulation.\\n\\nNikki Ritsch 1:01:17\\nOne of the reasons we got interested in it initially is because you can get a lot of federal funding to run plants if you\\'ve got certified operators. And so that\\'s a big barrier to finding money to run in rural areas. I mean, that\\'s initially a very long time ago, how we started down this path.\\n\\nRichard 1:01:32\\nYeah, it\\'s all true, but it doesn\\'t necessarily make them a good operator. Yeah. I mean, how many professors or whatever have you had that weren\\'t good teachers?\\n\\nNikki Ritsch 1:01:42\\nDo you think it happens more that good operators aren\\'t certified or that certified operators aren\\'t good operators? In Alaska. Or maybe that\\'s not a fair question.\\n\\nRichard 1:01:55\\nI would have to think about that. Because I mean, almost every operator I\\'ve worked with, was good at something. I mean, they all had it\\'s kind of like the toolbox analogy, you got your operators, and they\\'re all good at something, you just have to figure out what they\\'re the best at, and then utilize them for that aspect. But push them gently get them to work outside of their comfort zone. So they gain that other experience. And that\\'s part of management, or that\\'s part of, you know, being that senior operator that promotes that, because nobody wants to go outside their comfort zone and make a mistake. Like I said, a mistake is a learning opportunity. Go out there and try that. And come back. Let\\'s talk about what went right or wrong. And then next time, it won\\'t be like that. So I wouldn\\'t, I wouldn\\'t want to say either way, that there\\'s more. It\\'s just, it\\'s all part of just pushing them to go out and ask questions and think about why they\\'re doing what they\\'re doing. 
And I\\'ll tell you, that\\'s another thing is a log book operator that just writes down numbers, that doesn\\'t even think about what they\\'re writing down. They read it and they never even thought about what it is, and how can they write down that the fluoride is 4.5 milligrams per liter, and not even question. Is that right? Is that good or that bad? And you see that a lot.\\n\\nLeif Albertson 1:03:12\\nYeah. We see the examples of the temperature monitoring, going right down to freezing, and then it stopped working. They recorded this every hour for eight hours. I was there. I did my job.\\n\\nRichard 1:03:27\\nIt happens. And you go back and look at the data. And what does this mean? It can\\'t even be right. Yeah. Well, that\\'s what I saw. So I wrote it down. Well, didn\\'t you question what you were writing down? Drive you crazy. I write the information down 14 times. Still doesn\\'t mean anything. Just books and books and books.\\n\\nMichaela LaPatin 1:03:54\\nI mean, that was one thing I was going to ask. That was good. Another one. You kind of touched a little bit on the continuing education. And so one specific question I have is you\\'re a level four. Right? But how do you maintain that if you\\'re not always working on a level 3 or 4?\\n\\nRichard 1:04:11\\nOnce you\\'re licensed, every three years, you can either you can either take the test again, or you can take CEUs. And if you get three CEUs in a three year period, two of which must be core to the subject or to the to the field you\\'re doing. One can be non core. You can renew your license by just paying. If I don\\'t want to take the CEUs. I can just retest.\\n\\nNikki Ritsch 1:04:41\\nWhat do you usually do?\\n\\nRichard 1:04:42\\nI take the CEUs. But at the same time I\\'ve been tested. I got my licenses as a non ABC license. Prior to the ABC.\\n\\nNikki Ritsch 1:04:56\\nWhen did that all happen?\\n\\nRichard 1:05:00\\nDon\\'t recall, I\\'m gonna guess late 90s, maybe early 2000s. 
So when they came out with the ABC because I wanted to enjoy reciprocity, or have the benefit of it. I went in and retested and got new licenses. It was kind of a bummer. Sounds trivial, but I had nice low numbers on my licenses. And now I have high numbers on my license. And I used to love to have that number that was down in the thousands. But it doesn\\'t matter, they\\'re ABC. So I was just taking. I just took my CEUs last year to renew my licenses. And what\\'s nice is it sounds like a lot, right? Three CEUs, and it can be a pain in the butt to take, but at the same time, it covered all four licenses. Yeah, so by taking three CEUs, it actually applied to three. I went to the AWWA website. And they had a whole bunch of classes in there that were available for the small water system operators, I took a number of them and then I took a couple other courses that were paid for. All remote, all done sitting at home. The other option was I could have paid and went to a class or I could have went to a seminar somewhere set in the classroom and got some CEUs. But I took them all remotely and they were just put the old earbuds in and sit back and watch, you know, watch the computer and listen to it and then take a test.\\n\\nMichaela LaPatin 1:06:44\\nAnd that\\'s roughly 30 hours, right? Like one CEU is roughly 10 hours.\\n\\nRichard 1:06:48\\nYeah, that\\'s fair. That\\'s fair. It sounds like it\\'ll be a big loss. Are you like trying to bring people? It\\'s supposed to equate to 8 to 10 hours. And then I\\'ve done both, I\\'ve tested and I\\'ve taken CEUs. But the preference now is take CEUs. In order to get a higher level wastewater license, I will have to retest because I\\'m only level two in wastewater treatment. And that\\'s just because I got into it late. I never was in wastewater until I retired. Yeah. And when we went to Kodiak, we were doing both. So that\\'s where I started. 
And then Dutch Harbor started out we were only doing water when I first went under contract. It was for water, but then come to find out they lost all their wastewater people. So we did both water and wastewater out there. So I was able to accumulate enough experience and time to where I could sit for my one and my two and maybe make it to three, I don\\'t know. My son\\'s 16 years old, two more years and I\\'m probably done. All the time. All the time. \\n\\nNikki Ritsch 1:07:52\\nIs it? Are you successful? Do you have people that work in your company?\\n\\nRichard 1:07:56\\nJust Rick and I right now, okay, but we\\'ve reached out to a number of people see they get bit by \"Oh I\\'m retired.\" So they can\\'t go to work for a PERS job. Because if you work in a PERS job by that employer, you lose your PERS benefits, your retirement benefit. So I can be a contractor to Bethel, and I\\'m fine. But I couldn\\'t go to work for Bethel. So I couldn\\'t be hired by it. So that\\'s one of the problems that\\'s going on at this stage is when they retire when it happens. Or retiree. If they want to go back to work, you\\'ve got to go in as a contractor somewhere else. So then you got all the rigmarole of that. The other thing is. A few of them I\\'ve talked to said social security bites them. They go back to work. They make too much money between their retirement benefits and that.\\n\\nNikki Ritsch 1:08:39\\nSo that\\'s like, retired. What about like, our generation? Even younger?\\n\\nRichard 1:08:49\\nMost of you, if you want to go to work, you just there\\'s 20 positions open at Bethel right now. Right? I mean, they\\'ll tell you they got 19 vacancies.\\n\\nNikki Ritsch 1:08:57\\nWhy? What\\'s your read on it?\\n\\nRichard 1:09:01\\nI don\\'t get it. I mean, what happened to the workforce? So we had COVID. What changed? I don\\'t know. I can\\'t explain. I mean, a lot of people retired early. Okay, that\\'s fine. But there\\'s jobs out there. 
I mean, people are saying we\\'re in a recession. Prices are going through the roof. Why are we back to work? I don\\'t know.\\n\\nLeif Albertson 1:09:30\\nBill said people don\\'t want to work anymore. \\n\\nRichard 1:09:32\\nYeah, yeah. Why not? How do you afford to live? Obviously, there\\'s got to be enough aid or something out there so people don\\'t need to work or or the other thing is you\\'re not paying a competitive wage.\\n\\nNikki Ritsch 1:09:49\\nYeah, I mean, he told us that your last monthly bill was like 36 grand. So we were like, Okay, so that\\'s a lot of money. Yeah. So why are wages not higher for someone from here and he says union stuff. \\n\\nRichard 1:10:02\\nSo you\\'re under a union contract here. Okay. And I don\\'t know what they pay here. But I\\'ll guarantee you, they\\'re paying more in Kodiak, they\\'re paying more at Fort Rich in Anchorage, they\\'re paying more in Dutch Harbor, and they\\'re paying more in Anchorage.\\n\\nNikki Ritsch 1:10:15\\nSo it\\'s I mean, is it then just people who can and are interested and are motivated to work are going elsewhere?\\n\\nLeif Albertson 1:10:21\\nYeah. And I don\\'t think Bethel\\'s on anyone\\'s list for like, you know. Yeah. If you\\'re coming if you\\'re going moving from Anchorage, or even, you know, or Seattle or something, and it was like, like, Kodiaks a beautiful place and it\\'s.\\n\\nRichard 1:10:35\\nThat\\'s where I was buying tickets to go fishing. It\\'s got Walmart, Safeway, or, you know, jet service. Right. You can take a ferry and get a vehicle there. And it\\'s gorgeous.\\n\\nLeif Albertson 1:10:48\\nSo, if you\\'re looking at Sitka or Kodiak, or even Nome, and Bethel, you know, like Bethel\\'s going to be four out of four every time.\\n\\nRichard 1:10:58\\nI\\'ll tell you right now, Bethel has a very bad reputation in Alaska. It\\'s the armpit of Alaska. I mean literally that\\'s, that\\'s the reputation it has, and I had never been here before. I moved to Nome. 
I lived in Nome. My sisters graduated from Kotzebue, I\\'ve been in Glennallen, and through southeast, I\\'ve never been here before. And it was a little surprising.\\n\\nLeif Albertson 1:11:22\\nYeah I mean, when I moved here, it\\'s the same thing. Why are you going there? And I still get that. I mean, I get that in Anchorage all the time. Oh, my God, you live in Bethel? That must be terrible. It\\'s like, why would you say that to someone who lives there?\\n\\nRichard 1:11:31\\nThere was a lady sitting on the plane when I flew in last night. She, she lived in Bethel for 50 years. 50 years. That\\'s a long time to be in Bethel. I can\\'t imagine it right. And I started thinking about what I probably wasn\\'t. She did say, you know, she said, Well, I\\'m pretty much relocating and shifting. I spend most of my time now while in Wasilla. \\n\\nMichaela LaPatin 1:11:31\\nEven the last time we were in Alaska, we were in Seward for a few days. And I think it was while we were there, we told someone that we\\'re going to Bethel, and the reaction was, oh, my gosh, you\\'re not gonna have any daylight? Yeah. And I was like, It\\'s April. We will. And they\\'re like, you\\'re not it\\'s gonna be dark the whole time. And I\\'m like, I don\\'t think you know how this works.\\n\\nLeif Albertson 1:12:15\\nBut Seward\\'s like another example of a small community that like, if you if you were talking, like regardless of how much money they pay, you know, like, I think there\\'s a there\\'s a stigma about Bethel. It\\'s a tough place to live. And it\\'s inconvenient. And it\\'s expensive, and it\\'s dirty.\\n\\nRichard 1:12:33\\nBut I\\'ll tell you right now, you can ask people about Dillingham you get a better reaction than Bethel. Bristol Bay. Right Naknek or you get out somewhere\\n\\nLeif Albertson 1:12:45\\nAnd those places are all smaller\\n\\nRichard 1:12:46\\nYeah they\\'re smaller but they\\'ll have a better preconceived notion or whatever you want to say it perception. 
Than Bethel. And I mean, being out here for a month and a half or whatever we\\'ve been out. It\\'s not bad. It\\'s, you need to be prepared for some things that are different. But\\n\\nLeif Albertson 1:13:06\\nYeah but like Nome\\'s got drunks too. I don\\'t know what. The roads are better, I guess. But yeah. \\n\\nRichard 1:13:13\\nThere\\'s very few communities in Alaska that don\\'t have alcohol problems. Small communities. How do you solve it? I don\\'t know. You can go dry. You can go damp. It doesn\\'t matter. They\\'re still gonna bring it in. \\n\\nLeif Albertson 1:13:28\\nI have a question. So you know, we\\'re talking about writing. DEC has regulatory authority. One thing that\\'s been, I guess, kind of confusing to me over the years is how, how\\'s the water plant like this able to get out of compliance? Like shouldn\\'t there be someone at DEC seeing all these logs?\\n\\nRichard 1:13:49\\nWhy is it out of compliance because I look up here and I see all these awards around here.\\n\\nLeif Albertson 1:13:54\\nFor me, like the DEC has a regulatory job, right, like, they. I don\\'t want to make this too pointed but I mean,\\n\\nRichard 1:14:01\\nRegulatory and enforcement and compliance.\\n\\nLeif Albertson 1:14:03\\nSo is somebody looking at these logs? Or like how do we get a year and a half behind on our testing?\\n\\nRichard 1:14:10\\nOkay, so I don\\'t really know. I don\\'t know how that happens. But keep in mind that you typically, everybody gets a sampling schedule, and you wouldn\\'t know about it until they become delinquent and you start filling out your annual monitoring report or something like that. And then your CCR comes out. You know what CCR is? Yeah. Okay. Consumer Confidence Report. Yeah. Okay. So that comes out annually, you\\'re required to publish it. And that\\'s generally when everything gets captured.\\n\\nLeif Albertson 1:14:44\\nWell, so this was my story, right? 
I was sitting right over there, three houses down on my chaise lounge. Got the CCR going through it. Oh, it\\'s hot for lead. That\\'s weird. We don\\'t have lead here. So I call my wife up at OEH and said CCR says we\\'re hot for lead. We don\\'t have lead here. She says no we don\\'t I\\'ll call DEC, see what\\'s going on. And like DEC was like, oh, yeah, it looks like Yeah. You know, look like you were high for lead. And it was like nobody knew that. And then after that it was like, Yeah, we were supposed to be testing for, like, and that\\'s how we found out it was because I was sitting on my couch. And I like that was the most amazing part to me was that, that it went. I mean, somebody printed it in the CCR, like it wasn\\'t a secret. And then, and we had we were behind on testing, when that happened. And so like, no one at the DEC caught. Like I did, like I was just some dude on my couch. And so it kind of made me wonder how DEC partners with communities and what, if we rely on them.\\n\\nRichard 1:15:51\\nAnd that\\'s part of it is, you probably look at the number of communities that are out there, the number of reports or files that they get, and the ability for them to oversee all. And I don\\'t know, I don\\'t know that side of it. Because when I see a sample, it\\'s easy to go take the sample. It\\'s easier than writing that letter that I gotta pull up the CCR. Right, it\\'s easier, just go take it, it\\'s what you got to do to do it. I don\\'t know how you get in a situation where it goes from lack of compliance, lack of enforcement, to the point and if you wanted to look at a case study, I mean, it\\'s the thing in Dutch Harbor, Alaska, it\\'s to where it got so bad that the DEC or the EPA filed a federal lawsuit against them, to where they entered into a consent decree and the federal court system took over, basically jurisdiction telling them what they would do for their system. 
The system, they would build the timeframe to do it to achieve benchmarks, and then to be 100% compliant for 20 consecutive quarters. Five years, five years, they cannot have a violation. How does it go from that situation where you have a community that\\'s thriving, that makes a lot of money in the fishing industry, to where you\\'re non compliant? And then the federal government actually steps in an issues a consent decree. How does that happen? I don\\'t know.\\n\\nLeif Albertson 1:17:28\\nI mean, it feels a little bit like communities that are on their own, do a good job making you feel like there\\'s a lot of oversight,\\n\\nRichard 1:17:35\\nYou would certainly hope that they\\'re all out to do the right thing and to provide safe drinking water and be stewards of the environment with their wastewater treatment. You would hope that I mean, it\\'s, it\\'s not that hard to do it right. And I think most of them probably are doing the best they can. But they may not be doing the monitoring. And sometimes, I\\'ll say another thing you have to remember here is, every sample you take has a hold period. So certain samples have shorter hold periods. I pull them today, and the flight doesn\\'t get in. Yeah, so sometimes it comes down to the ability to get a sample in a timely manner to a lab. Sometimes it comes down to bad planning, where I\\'m going to do my quarterly samples in the last week. And guess what happened that week, I got sick, or the planes couldn\\'t get in, right. Instead of taking them at the beginning of the quarter, or the middle of the quarter to where you had time to recover. Or you take them on Friday, and you ship them off and the lab writes you back on Monday, the first day of the next quarter and says, Hey, we had a temp blank that didn\\'t come in. Your samples are no good. Right. So that could that could get you on the SNIC. List. 
Okay, significant non compliance. You could end up on that, you know, that list of non compliant communities, even though you\\'ve done everything right.\\n\\nLeif Albertson 1:18:59\\nYeah. I mean, I guess to the point that it seems like there\\'s a lot of communities on the SNIC list for perpetuity, and it doesn\\'t seem like anything really, always happens to them. Right. I mean, doesn\\'t it? Or am I wrong? \\n\\nRichard 1:19:10\\nYou\\'re not wrong, but I mean, it\\'s I don\\'t know how you fix that. You get operators that want to do the right thing, cities or communities that are willing to make sure you have the things to do it right. And promote that attitude or that environment.\\n\\nNikki Ritsch 1:19:27\\nWhat\\'s the repercussions if you are on the SNIC list for X number?\\n\\nLeif Albertson 1:19:31\\nThat\\'s what I\\'m saying I don\\'t think there really is\\n\\nRichard 1:19:33\\nI don\\'t know that there is other than you keep getting.\\n\\nNikki Ritsch 1:19:35\\nYou don\\'t get shut down?\\n\\nRichard 1:19:36\\nHow do you shut down a public water system?\\n\\nLeif Albertson 1:19:39\\nYeah. I mean, occasionally DEC does that when there\\'s floods and stuff.\\n\\nRichard 1:19:42\\nYou could come in and issue boil water notices if there\\'s a significant health event or I\\'d imagine that there might be an avenue where you could come in and someone take over but I\\'ve never heard of anything like that.\\n\\nNikki Ritsch 1:19:54\\nYou don\\'t get charged? Like I\\'m thinking like an environment? My husband\\'s in environmental remediation. \\n\\nRichard 1:19:58\\nYou can be charged but I don\\'t. You\\'re talking about a fine? I\\'m sure there is some sort of fine in the codes in the state of Alaska but. Promoting compliance is much more so than trying to punish non compliance.\\n\\nLeif Albertson 1:20:20\\nLike if you went to Nunapitchuk, and were like, Alright, we\\'re gonna fine you $1,000 A day like, would that make the community better off? 
Or worse off? They wouldn\\'t pay it. And then what do you? you know, like, or they couldn\\'t pay it? Or it\\'s not gonna help hire somebody?\\n\\nRichard 1:20:38\\nBut I think they focus a lot more on what can we do to get you compliant? Or what do you need to make it work? Than to punish those that are non compliant. \\n\\nNikki Ritsch 1:20:49\\nDo they ever send, does DEC ever send in reps to plants to like what you\\'re doing?\\n\\nRichard 1:20:55\\nI don\\'t know if they go out and do that. But they do have rural workers that go around and assist with communities with fixing stuff. That\\'s RMWs remote maintenance workers. \\n\\nMichaela LaPatin 1:21:06\\nOh they work for DEC?\\n\\nRichard 1:21:14\\nWe have a lady on the board who used to oversee it. I\\'m not sure but I mean, those are the type operations that go on, they go out and promote it, then they have Alaska Rural water, they have different entities that will help you with with problems that you might be encountering. And then the sanitary survey crews that go out. You\\'re required to do sanitary surveys, they should go out and catch a lot of things that are going on that aren\\'t correct. And address it through the sanitary survey also. \\n\\nLeif Albertson 1:21:40\\nBut they don\\'t really, they don\\'t fix stuff.\\n\\nRichard 1:21:43\\nNo they write it up. They should identify it. And once again, you\\'re going back to, you would hope that identifying deficiencies, they get corrected, you would hope that identifying things that aren\\'t being done correctly, whether it be sampling or things like that, the emphasis is placed on correcting it, and there\\'s a desire to correct it, not to just continue doing what you\\'re doing. And that\\'s why I say go look at a case study where it\\'s possible that people did not heed the advice and heed the warnings. And they kept doing their own thing. And then the federal government stepped in and said, Okay, we\\'re here. Here\\'s your fine, it\\'s huge. 
And you\\'re building a new plant, and you\\'re upgrading your collection system. And we\\'re going to continue oversight until you go five years with no violation.\\n\\nLeif Albertson 1:22:30\\nThat would I think, probably work different in a place like Dutch Harbor where there needs, Dutch Harbor needs to be there.\\n\\nRichard 1:22:36\\nAnd they had the money. \\n\\nLeif Albertson 1:22:39\\nThere\\'s forces at work there. If you told Tuntutuliak, you need to build a new water plant.\\n\\nRichard 1:22:45\\nThey\\'re looking for federal funding. The way to do it, but see, I think that\\'s the differences that come in, in certain communities. And it\\'s like OSHA, right? OSHA comes in and writes you a fine. The first time it\\'s like maybe a fix it fine. Second time, it\\'s a fine. Third time is 10 times that fine, right? I mean, so I\\'m sure there are avenues or ways to do that I\\'m not really familiar with them being forcefully administered. It\\'s more like, how do we fix what\\'s wrong? How do we how do we make it right? I shouldn\\'t say how do we fix the problem? How do we make your system right?\\n\\nNikki Ritsch 1:23:24\\nThat\\'s something unique about water systems, because I mean, in when you\\'re dealing with corporations, or my husband\\'s an environmental remediation specialist, and so dealing with like, huge engineering companies that are violating EPA standards, they get fined huge amounts of money because they\\'re violating what regulation is. But in in the scenario where water is a public good, it\\'s a public right? Your water, then there\\'s no.\\n\\nRichard 1:23:49\\nIt\\'s a necessity yeah, but at the same time, it\\'s like I tell everybody, okay, you\\'re gonna turn that light switch on, the light doesn\\'t light up, you know, it\\'s broken, right? Go grab that glass of water. Tell me if it\\'s broken. Right? You don\\'t know. We always make the assumption. It\\'s right. There\\'s no indication. Well I mean there is, obviously. 
If it doesn\\'t taste right or doesn\\'t smell right. But in general, it\\'s usually pretty tolerable. You don\\'t know if it\\'s right or not. And you can\\'t afford as an entity or as a utility, or you can\\'t afford to fail. Because perception is I don\\'t want to drink the water. Don\\'t drink the water or you\\'ll get sick. Once you fail, it\\'s hard to get that perception and confidence back.\\n\\nLeif Albertson 1:24:31\\nI think that is a little bit of a hole though, too. I mean, like because I theoretically, there\\'s a water treatment plant in every place that I go around the Delta. But I don\\'t feel confident that I know everything\\'s happening that should be happening, or that the DEC would tell me if it wasn\\'t, right? Like theoretically, there\\'d be\\n\\nRichard 1:24:50\\nbillions of dollars to this public concert.\\n\\nNikki Ritsch 1:24:53\\nShould we worry that you\\'re drinking bottled water? \\n\\nRichard 1:24:55\\nNo I do it for convenience. We fly everything out of here.\\n\\nLeif Albertson 1:25:01\\nBut yeah, I mean, like so boil water notice, right so even if even if there if I felt confident that there would be a boil water notice that I would see if there needed to be one and I don\\'t feel confident in that, you know that if I get off the plane in Kasigluk that that\\'s that water is good to drink.\\n\\nRichard 1:25:18\\nWe go all over the world, my wife will ask me \"is the water safe to drink?\" I\\'m like, well it should be. We\\'re in a major metropolitan city, there\\'s really no worries, right. And the other thing that\\'s funny here is, is in a water treatment plant, typically the water treatment plant operators, the water they drink doesn\\'t have contact time. You ever thought about that? So there\\'s always a retention or contact period of time from when you touch the chlorine to it to where you have CT time, typically measured at the first tap. Never is that first tap considered the treatment plant. 
It\\'s typically downstream somewhere. \\n\\nLeif Albertson 1:25:57\\nYou need like a separate holding tank or something.\\n\\nRichard 1:26:00\\nI mean, the reservoir or clear well, or whatever, they have baffles in it. So you get your CT time. You got a loop. Yeah, but where does this plant get its water from? I don\\'t know I haven\\'t mapped it out. But I\\'m just saying a lot of treatment plants, they\\'ll pull right out of their clear wells, a small booster pump, just to provide water to the plants. And they\\'re the one person that isn\\'t considered the first tap. But they always spend great monies and funds to find who the first tap is. And they\\'ll do all the CT analysis for that tap.\\n\\nNikki Ritsch 1:26:31\\nSo is it okay to drink out of this? Is that okay?\\n\\nRichard 1:26:38\\nAs far as I know. Well over there, it should be because it\\'s into the tank, and then it\\'s pulled out of the tank. I have not been in the tank. I don\\'t know if they have a common inlet and outlet line. I don\\'t know if there\\'s how they get a CT time. But I\\'ve never heard that. All he\\'s told me, the drivers that I\\'ve talked to is how much better it is now, clarity wise. And I\\'m assuming nobody\\'s gotten sick. But we\\'re having a hard time getting. In reality, we\\'re having a hard time getting the chlorine residual up in that tank. I speculate because there\\'s high demand and from the iron and manganese that are starting to precipitate out. And it hasn\\'t been cleaned.\\n\\nNikki Ritsch 1:27:16\\nHe said something about concern about groundwater contamination, right? \\n\\nRichard 1:27:19\\nWell there\\'s always, there\\'s always concerned with that. I don\\'t I don\\'t know, their groundwater around here. But it\\'s high in iron and manganese. So we\\'re treating for that. I don\\'t know that. I\\'ve looked up the depth of the well, I don\\'t have great concerns about that. Yeah, there\\'s a little nomenclature on there. 
But, you know, typically, most groundwater wells need to be on at least 100 foot in depth before they\\'re considered good water wells for public use. But my biggest thing is, is that we\\'re chlorinating at 0.7 milligrams per liter, that\\'s our target. Okay. We\\'re only getting out after it goes through that half million gallon reservoir, which we keep typically full, which isn\\'t necessarily the best thing, but we do. We\\'re only getting out about 0.25. For 0.3/0.35 demand, milligrams per liter of demand out there. And I don\\'t know why. But I\\'m wondering if it\\'s iron or manganese, it\\'s precipitating out in there and taking a lot of that out, or is it just the debris and sediment that\\'s in the bottom of that tank?\\n\\nNikki Ritsch 1:28:27\\nCan you check what\\'s precipitating out?\\n\\nRichard 1:28:31\\nNo, I haven\\'t. I have not checked it. But I suggested to Bill that we clean the tank. He said last time they did it was a number of years ago. First time they did it. They did it with a diver the second time they did it with a remote operated vehicle. And I just said, Well, Bill, there\\'s no reason in my mind, we can\\'t do a really simple way. You got a unique topography here where the tank is on a little bit of a hill, where it runs down. There\\'s no reason we can\\'t just come up with a really simple siphoning system with the intake off the floor. And we just sit there on the floor with suction and just low volume and just get the precipitate off the bottom. So hopefully it\\'s something\\n\\nLeif Albertson 1:29:12\\nlike a like a gold.\\n\\nRichard 1:29:17\\nJust let it run over the hill. \\n\\nLeif Albertson 1:29:21\\nThe lake there.\\n\\nRichard 1:29:27\\n Did it affect your hair color?\\n\\nMichaela LaPatin 1:29:28\\nMine\\'s been okay. So the last time we were here, a lot of people we spoke with talked about two things. color of the water and losing their hair or messing up the color or anything. 
But we have someone else on our team is a water quality expert. And as we were talking to her about some of our interviews as we mentioned those two things. Manganese was one of the things that she mentioned. Yeah, it was potentially contributing to hair loss, hair color, color of the water. Manganese was a big one she mentioned. \\n\\nNikki Ritsch 1:30:11\\nWe interviewed 54 end users here. So we were just kind of running down of what we heard. \\n\\nMichaela LaPatin 1:30:15\\nYeah, community members so we\\'re gonna hopefully be bringing her back with us one day so.', '3_3__InterdependenciesNNA': \"Theo\\nWed, Aug 24, 2022 8:19AM • 11:01\\nSUMMARY KEYWORDS\\ndriver, cdl, inaudible, water, work, village, pay, bethel, training, piped, driving, umm, job, overflow pipe, winter, places, tank, truck driver, nice, hours\\nSPEAKERS\\nTheo, Nikki Ritsch\\n\\nNikki Ritsch 00:01\\nOkay, so we're here with Theo and he's a water truck driver. So can you tell us where you're from and how long you've been doing this?\\n\\nTheo 00:02\\nI'm from Scammon Bay and I've been driving for the city for almost a year.\\n\\nNikki Ritsch 00:15\\nAlmost a year. And you were mentioning that you want to become a teacher so you're gonna go back. And this is kind of, you came to make money, I guess. Driving. Okay, so you want to be a teacher, you want to go back and work in the village and be with your family. So you're here to make money. Can you tell us about, how did you choose to become a driver over other options in Bethel?\\n\\nTheo 00:41\\nUm I don't know. I just wanted to become a CDL driver because you know there are some jobs in Scammon, like some certain jobs, like air force truck, they do uh... they send out heavy equipment. \\n\\nNikki Ritsch 00:59\\nGot it, got it. And you said that there was a scholarship available to cover CDL, right? Your village coverage paid for your CLD. 
\\n\\nTheo 01:07\\nYea\\n\\nNikki Ritsch 01:07\\nAnd you think it was really hard for you to get your Med card, which is like the biggest if you wanted to take a training in April. We only got to take it in August. And the major holdup was a med card. You said the training itself was \\n\\nTheo 01:23\\nEasy. \\n\\nNikki Ritsch 01:23\\nEasy! You did, you did really well. Okay, so you say the CDL test was easy for you. And did you get a job right away when you came here?\\n\\nTheo 01:36\\nYeah\\n\\nNikki Ritsch 01:36\\nDid you know you were coming? Like did you go get your CDL because you knew you had a job here?\\n\\nTheo 01:41\\nNo, after I got my CDL I went home and I applied for a job. And they called me like the next day\\n\\nNikki Ritsch 01:46\\nGot it! And they were like come work. Okay. Okay. Can you tell? Remind me again, like how many days you work and how many hours you work \\n\\nTheo 01:55\\nSix days a work, or six days a week and uh, like 9 - 12 hours most days.\\n\\nNikki Ritsch 02:02\\nAnd so you're on a you're on like a contract. So you get $40 an hour. And that no matter if you work eight hours or 12 hours in the day, right? \\n\\nTheo 02:10\\nAh no, there's overtime. \\n\\nNikki Ritsch 02:10\\nYou get overtime. Oh, that's cool. Nice. Okay, cool, but no benefits. \\n\\nTheo 02:15\\nNo\\n\\nNikki Ritsch 02:16\\nOkay. And then you said that your cousin also lives here who you're renting from or staying with? \\n\\nTheo 02:21\\nNo, that's my brother-in-law. \\n\\nNikki Ritsch 02:22\\nAh okay, okay. But your cousin also drives in battle for... what's the name? Delta Western. Okay. And it pays better?\\n\\nTheo 02:26\\nDelta Western Yeah for a full-time position\\n\\nNikki Ritsch 02:31\\nFor full time for full time. Okay. But since you want to go back, you're okay here with the part time higher pay? \\n\\nTheo 02:38\\nYeah.\\n\\nNikki Ritsch 02:38\\nYeah. Okay. That's fair. Okay, that's fair. 
Uhh and you said that you much prefer working in the winter because the summer is too hot. \\n\\nTheo 02:46\\nYeah\\n\\nNikki Ritsch 02:47\\nAnd the only hard part about the winter was like the roads get slick sometimes. But they're pretty good at plowing those.\\n\\nTheo 02:53\\nYeah\\n\\nNikki Ritsch 02:55\\nWhy do you think they're so short on drivers? \\n\\nTheo 02:57\\nUhh I'm not sure. Maybe full-time drivers are underpaid, like, there's a Ollie, a truck driver, but other drivers are [muffled] underpaid. \\n\\nNikki Ritsch 02:57\\nThat's your foreman? \\n\\nTheo 03:02\\nUh no, or... Other drivers, there are other drivers which retired and there was another driver which recently quit.\\n\\nNikki Ritsch 03:22\\nQuit? Do they just like quit work all together, or do they go somewhere else?\\n\\nTheo 03:25\\nJust quit and apply for another job. \\n\\nNikki Ritsch 03:27\\nOh man, man. Yeah so you were saying that you usually should have 10 water driver or five water drivers, five sewer drivers. And right now you only have eight and today you only have seven? \\n\\nTheo 03:37\\nYeah\\n\\nNikki Ritsch 03:38\\nBecause someone's sick or out. So you're like you had to pick up five extra places. \\n\\nTheo 03:44\\nYeah\\n\\nNikki Ritsch 03:45\\nAnd I'm just recounting, I know you've told me all this but I'm just re-counting it. You do about 35 places a day, you prioritize businesses first. Umm. Each takes called 3200 gallons\\n\\nTheo 03:57\\n3400 gallons\\n\\nNikki Ritsch 03:58\\n 3400 gallons. It pumps at about 70 gallons per minute. On average, you can do about five, five to six places on one on one tank. Um. We went to City sub just fill up the tank once already. Umm. Let's see. Perfect timing. Okay. Um. So, I just want to, you said in Scammon Bay, there's piped water? Yeah That sort of feeds from the river... Can you tell me how the system works? So does it collect from the river? 
\\n\\nTheo 04:31\\nUh from the stream, from the mountain, the snow\\n\\nNikki Ritsch 04:35\\nAnd is it a well? Or is it ground, it's surface water? \\n\\nTheo 04:39\\nUh it comes from under a mountain. \\n\\nNikki Ritsch 04:42\\nOkay okay \\n\\nTheo 04:44\\nUh the pipe, there's a big tank holds it.\\n\\nNikki Ritsch 04:50\\nAnd then you said maybe it's treated\\n\\nTheo 04:51\\nYeah\\n\\nNikki Ritsch 04:51\\nOkay, cool. Hmm. That's awesome. Um. Do you think you want stay working, I mean you want to be a teacher, but do you think you might be interested in staying and working in water? You're just, this is kind of to pay the bills, then you'll go - you really want to be a teacher \\n\\nTheo 04:52\\nYea\\n\\nNikki Ritsch 05:07\\nYou said third grade?\\n\\nTheo 05:12\\nYeah\\n\\nNikki Ritsch 05:15\\nThat's cool. So what do you, I mean, what do you think about water delivery in Bethel in general? Is it a thing, I mean tell me like do you think the system works, do you think it needs improvement? \\n\\nTheo 05:25\\nI think it... \\n\\nNikki Ritsch 05:26\\nIt works? \\n\\nTheo 05:27\\nYeah\\n\\nNikki Ritsch 05:27\\nWell enough \\n\\nTheo 05:28\\nYeah, [inaudible] because only certain parts of town have piped water\\n\\nNikki Ritsch 05:34\\nYeah. Do you think if people could get on piped water, they'd want to?\\n\\nTheo 05:39\\nUh I think so yeah, because there's always people calling about extra water.\\n\\nNikki Ritsch 05:44\\n How often do people call for like emergency top-ups, or like out of cycle? \\n\\nTheo 05:49\\nUm, [inaudible] I don't know about during business hours, but then when I'm in the office, I hear on the radio about extras. \\n\\nNikki Ritsch 06:02\\nOkay okay. You said that in the winter one of the challenges is that the overflow pipe can freeze and then you flood houses sometimes? Just wanted to make sure we cover that. It was really interesting to me that you also said that your water trucks are basically like fire trucks. 
You provide the water to a fire when there is a fire. \\n\\nTheo 06:15\\nYeah Yeah\\n\\nNikki Ritsch 06:28\\nAnd you've done that twice and you've been working for what? Six months? No? You started in August?\\n\\nTheo 06:35\\nLast September. \\n\\nNikki Ritsch 06:36\\nLast September. Oh okay. Almost a full year. Okay. Okay. What's the hardest part about being a driver?\\n\\nTheo 06:48\\nI'm not sure\\n\\nNikki Ritsch 06:52\\n You don't know? No? [inaudible] [inaudible] All right cool. Just a couple more questions. There's a lack of drivers, which impacts how many places you have to do on your route every day. What do you think some of the solutions are? You said it's because people leave to get pay other places, right? Well, is it to increase pay? What are some of the solutions? \\n\\nTheo 06:55\\nLong Um out of the people that leave, or the ones that left that I know of, they wanted more pay. Didn't like the pay. \\n\\nNikki Ritsch 07:49\\nOkay. So why haven't they increased the pay? \\n\\nTheo 07:58\\nUh I'm not sure. \\n\\nNikki Ritsch 08:00\\nRight, and then you said that you were only aware of one female driver, but in your training, they said that about 20% of CDLs are female. \\n\\nTheo 08:08\\nYeah\\n\\nNikki Ritsch 08:09\\nThat's interesting. So in the training, you said 21 days of training, you said like four hours a day in the classroom? Was that every day four hours? \\n\\nTheo 08:19\\nYeah\\n\\nNikki Ritsch 08:19\\nWhoa. And then can you tell more? You explained some of the driving tests - you had to do - you had to like, parallel dock. \\n\\nTheo 08:26\\nParallel park, alley dock park\\n\\nNikki Ritsch 08:30\\nWhat is alley dock parking? \\n\\nTheo 08:32\\n90 degree turn. \\n\\nNikki Ritsch 08:34\\nWoah! That's hard. Wow. Okay. Okay, cool. So how many people pass the training in your in your year or in your group? \\n\\nTheo 08:45\\nUh, there was one that quit. Or he left. Like in the middle of the training. And there was one guy, he didn't pass. 
Just one. Just one.\\n\\nNikki Ritsch 08:54\\nOkay. And you have to redo it if you don't pass. \\n\\nTheo 08:56\\nYeah\\n\\nNikki Ritsch 08:57\\nOkay. Got it. Do you think... Did you have a driver's license before you did CDL? You did? Okay. Okay. Nice. \\n\\nTheo 09:02\\nYeah You need a driver's license for a year. \\n\\nNikki Ritsch 09:07\\nA year. Okay. Okay. Nice. So did you, did you have a driver's license in the village?\\n\\nTheo 09:14\\nUh yeah. \\n\\nNikki Ritsch 09:14\\nIn your village? \\n\\nTheo 09:15\\nAh I got it in the city, in my senior year. \\n\\nNikki Ritsch 09:18\\n Got it, nice. Cool. Was your high school in your village?\\n\\nTheo 09:22\\nUh yeah.\\n\\nNikki Ritsch 09:25\\nOkay. So do you think that training... Like did you learn most of the things that you... Was there anything you would have added to the training or you think it kind of was everything you needed?\\n\\nTheo 09:37\\nIt was everything\\n\\nNikki Ritsch 09:38\\nIt was pretty good? Pretty good. Okay. How much did it cost? \\n\\nTheo 09:43\\nUh maybe like $6,000? \\n\\nNikki Ritsch 09:46\\nWow, that's a lot. Okay. Okay. Interesting. Um, would your siblings come and do a job like this? \\n\\nTheo\\nUh I got three brothers that have their CDL. \\n\\nNikki Ritsch\\nOh, cool. Oh, cool. But they're in your village right now?\\n\\nTheo 10:00\\nYeah\\n\\nNikki Ritsch 10:00\\nOkay. Do you think they'll use it? Or? \\n\\nTheo 10:04\\nUh I'm not sure. \\n\\nNikki Ritsch 10:05\\nYeah, that's fair. Nice. Do you like the management of the drive...? Like, do you like, do you think the structure works well? Do you feel like it's planned well? And you know what you're doing each day?\\n\\nTheo 10:20\\nYeah. \\n\\nNikki Ritsch 10:21\\nSo the system works. The system works. You like the system. \\n\\nTheo 10:24\\nYeah\\n\\nNikki Ritsch 10:25\\nOkay. And the reason you want to go back to your village is to be at home and to see your family. Yeah, that makes sense. Is there anything... What else? 
Anything I haven't asked?\\n\\nTheo 10:40\\nI'm not sure. \\n\\nNikki Ritsch 10:40\\nThat's it. Okay. Oh I see some dripping. It's starting to think about it. Good timing. Okay, cool.\"}\n", " Count: 9\n", " Sample: 1_1_InterdependenciesNNA\n", "decision_components exists: {'1_1_InterdependenciesNNA': {'Objectives': {'impacts', 'support', 'revamp the filtration system', 'stop recording for a little bit', 'potable water to the home', 'members', \"be made, i don't always feel that the operators are able to make those adjustments as they should\", \"turn off so-and-so's water, right\", 'remember what year this was', 'add more chemicals to water', 'better respond to those water quality challenges', 'public health', 'that', \"to advocate for communities, and you've worked directly with communities\", 'water', 'revisit that', 'keep that system running, right', 'education', 'because', 'operations', 'liaison', 'without', 'do that', 'find a water plant operator specifically, you mean upstream even of the', 'meetings', 'issues', 'called', 'case', 'treated water to community members', 'get a piped water system for the first time, and really struggling to meet some of the, what we call', 'ask jennifer, i know', 'challenges', 'face', 'community education', 'site', 'teach rural alaska water plant operators these things in this national exam book, and that is challe', 'get a new water system in a community, you would therefore have to go upstream and work on resolving', \"operator satisfaction and longevity, but it's still a challenge, for sure\", 'entity', \"do it, there's no penalty, okay\", 'specifically', 'needs', 'roles', 'with', \"explain why treated water is good, it's healthy for us, why it's important to use that\", 'onsite assistance and technical assistance to the water plant operators', 'safe water'}, 'Constraints': {'success', 'of them, or what', 'meet regulatory requirements', 'say enough about the remote maintenance worker team, right', 'say enough about that 
program', 'think of a recent moment that it occurred to me that there were challenges, but i came into this fie', \"of people to fill the positions, and turnover, and i don't always know how to address that, because \", 'people working as water plant operators', 'get a hold of', 'afford'}, 'Trade-Offs': {\"there's no easy water source\", 'pump untreated water', 'in your community'}, 'Decision Variables': {'it, the rmws help train the operators, they might make some tweaks, but when you have source water q', 'in personnel of literally one person can do', 'feels like a long-term goal, but what would you want people to know', 'turnover, or did it change job satisfaction', 'treatment system as needed, right', 'treatment as needed so that you continue having high-quality water', 'retention', 'infrastructure that exists', 'water temperature, too', 'impact your water infrastructure system', 'on basic parameters, and so our design is set to treat it in this very specific way', 'and how it affects water infrastructure', 'the treatment as needed so that you continue having high-quality water', 'test scores', 'would you want to see', 'quality of the work people were doing', 'throughout the year, like seasonally'}, 'Options': {'pump untreated water'}, 'Solutions': {'making simple, easy-to-run systems without a ton of automation', 'managerial capacity', 'five years, from 2015 to 2020, and that was, in my opinion, phenomenally successful at helping', 'the first time, and really struggling to meet some of the, what we call capacity indicators'}, 'State Variables': {'seasonal fluctuations', 'arctic conditions', 'fishing and subsistence leave', 'permafrost', 'a freeze-up that would cause a service disruption'}}, '1_2_InterdependenciesNNA': {'Objectives': {'work for us, you better get a house, you better pay 1800 a month', \"address, whether it's safety-related or not, or if there's tension in the group and you want to just\", 'break the ice to bend your knee', \"do this, there's no 
penalty, nothing like that\", 'introduce yourself at all too', 'really', 'get my commitment for this first', 'requests', 'have so many thousands of gallons on reserve for the school, as well as a higher capacity for what t', 'work anymore, and then we try to find more', 'meet today or some other time', 'do, work them 12, 15 hours a day', \"make you say the same thing over and over again, but if you feel like there's anything you have to a\", \"drill down on that one a little bit, when you're talking about two on two off, i mean that also woul\", 'hire', \"not be afraid of changing what's always been done, we need to get that terminology out of our head\", 'that', 'water', \"on it, we need to not be afraid of changing what's always been done, we need to get that terminology\", \"clear out the pipe, another thing that i've heard too is, or i've experienced, actually, is that exp\", 'communicate that with the office', \"figure out where the leak's at\", 'be here,\" and that\\'s fine', 'get that finished', 'quality', \"moose hunt, now this guy's down\", 'do for a safety meeting, the subjects you want to cross', 'manage the water a little better', 'clear these out', 'get loans for a house in bethel, and they have no idea how it works out here', 'water and water infrastructure', 'challenges', 'them with housing, i want housing', 'be on a water pipe', 'talk to, i think', 'go away', 'hire, take a lot of these water truck drivers and put them on a different department to work on the ', 'add to it, please do', 'drive anymore', 'water, or interruptions in services', 'understand challenges surrounding drinking water services in your region, or i guess your former reg', 'better respond to these challenges'}, 'Constraints': {'drive a combination, with a tractor trailer, then you get a class b license, which is still fine, yo', 'give you extra service,\" there\\'s a lot of times where i can\\'t give you the service that you\\'re owed', \"moose hunt, i don't got enough guys\", 'do 
it', \"do it, or because they just kind of don't want to\", 'fill those', 'control in between the houses, the main water supply line', 'tell where the sides of the driveway are, so you back down into basically a ditch, you get stuck tha', 'see the driveways, you get stuck a lot', 'water', 'work with', 'really answer questions of people that are pretty upset', 'drive our vehicles, but they could ride in the right seat and learn the job', 'be driving these trucks on a highway without a cdl in the middle of nowhere', \"tell if we're moving still or not\", 'go fishing', 'handle it', \"say exactly where, but they pick up groundwater, i think there's some kind of well or something out \", \"drive a truck with a manual transmission, then that's, they mark that on your license, and big deal,\"}, 'Trade-Offs': set(), 'Decision Variables': {'in between the houses, the main water supply line', 'anything', \"ph, and when they have an acceptable solution it goes into a holding tank where that is what's provi\", 'up in the city of bethel'}, 'Options': {\"the quality is better, you're not relying on the age\", \"of things like that, and it's not really an anecdote\"}, 'Solutions': {'vacuum for their waste'}, 'State Variables': {'men and women that were in their 50s', \"size of our town, bethel, we're not subject to federal regulations for the fmcsa, i believe it is\", \"things like that, and it's not really an anecdote or anything, but it's something that i thought was\", 'arctic conditions, right'}}, '1_3_InterdependenciesNNA': {'Objectives': {'hit on one thing specifically, you know, so weather is an issue, right', 'in disinfection byproducts at water plants', 'understand first is a little bit more of, you know, what are the challenges when providing infrastru', 'be celebrated more and paid more, though', 'early', 'basically take a full water plant, bring it down into a household plant', 'break that up and not have standardization but really maybe somebody would only be able to 
know thei', 'gets', 'a project that, you know, is operator friendly, and will be maintainable', 'talk with the people that are going to get water and sewer service', 'go above ground with the pipes', 'go below ground', 'will', 'drinking', 'get back to your question about what could be done to be more effective', 'engagement', 'do their job effectively', \"make a life out of being a water sewer operator, the wages aren't good enough to retain people\", 'public health', 'that', 'water quality and water treatment', 'get it into the permafrost for it to be founded', 'hit on another, you know, just like capacity for the right type of person, is another challenge', \"we'll jump back for a second, but i do want to talk a little bit about operators\", 'the disinfection', 'kind of bring together some of this, like, basically figure out a way to do operations a little bit ', 'that certification', 'like', 'to get good potable water to somebody at their house', 'add for us', 'you want to you want to kind of give an overview of where this started', 'make a difference in this', 'take a little pauses where as our time is getting short here, lauryn, do you have any any questions ', \"the situation of the challenges we've had with water delivered infrastructure, what would you change\", 'a pipe water and sewer system all at once', \"go to minook creek because they've been going there forever to get their drinking water\"}, 'Constraints': {\"income, there's not enough revenue for the utility to have money to pay people what they deserve to \", \"of employment, somebody that really does not have the appropriate skills, gets that job, doesn't hav\"}, 'Trade-Offs': set(), 'Decision Variables': {'and actual observations', \"something, if you could fix one thing, to hopefully improve the situation of the challenges we've ha\", 'one thing, i would get, you know, people and communities all over the world to think about water as,', 'is responsible for that', 'anything, that would be what i 
would change to make it more heralded, celebrated', 'making things harder in general', 'culture, i guess', \"observations that i've seen\"}, 'Options': {'of the unit cost of distributing and collecting water and sewer on a per unit basis has been so high', 'sort of traditional water sources', 'of lack of employment, somebody that really does not have the appropriate skills, gets that job, doe'}, 'Solutions': set(), 'State Variables': {'frost susceptibility of soils and the likelihood of great amounts of movement', 'unit cost of distributing and collecting water and sewer on a per unit basis has been so high, you k', \"lack of employment, somebody that really does not have the appropriate skills, gets that job, doesn'\", 'permafrost'}}, '1_4_InterdependenciesNNA': {'Objectives': {'do the job', 'ask you about water quality, too, but, just to jump back for a second, if the pay increased, would t', 'be hired', 'your water', 'have to drag my hose through a total obstacle course around two vehicles and pipes and garbage in th', 'water', 'them', 'share with us', 'get ahold of me', 'issues', 'with', 'what we got', 'work for the city of bethel, to want to do this job [crosstalk] like i told before, that job is not '}, 'Constraints': {'do your sewer until we get the say-so from so-and-so,\" so that was something that we were just like,', \"take a shower because i don't have no water\", \"of pay, but, city has a pretty decent retirement, so that's why a lot of people still do that\", 'deliver your water', 'of pay', 'of drivers and all, they pay more out in overtime than they do in regular time, which kind of worked', 'take a bath', 'of drivers', 'exactly keep up with the water of bethel,\" because bethel was growing', 'tell, so, some houses would get flooded consistently til they got the problem fixed', \"tell quite what they're doing, but they got the lid opened up there\", 'wash my dishes', 'pay for the hard work'}, 'Trade-Offs': set(), 'Decision Variables': {\"a number 
[crosstalk] robert: it wouldn't exactly solve the problem, but it certainly would help\", \"up a swimming pool and we'd just dump all the water so they can suck in and fight the fire, the hous\", 'up that swimming pool many times', 'one day', 'is warm and cold, warm and cold, warm and cold, so the roads get a lot slicker', 'routes throughout the day', 'affected the ability to deliver water', 'over that'}, 'Options': {'work for, say, crowley, i think their starting pay is like $30 an hour, whereas, the city of bethel,', 'of the snow pileup', \"fire, all of us, in the middle of the night, us truck drivers, we'd get calls from whoever our forem\", 'and they have a vehicle parked in their driveway,', 'will get filled up', 'we were dragging hose', 'of the rusty pipes, water becoming yellowish'}, 'Solutions': {'off-road system, they just opted to get rid of the off-road system and just had all on-road system'}, 'State Variables': {'rusty pipes, water becoming yellowish or brownish', 'snow pileup or just not be able', 'that restriction', 'pipes that were on the pipe water and sewer', 'permafrost', 'lack of drivers and all, they pay more out in overtime than they do in regular time, which kind of w'}}, '1_5__InterdependenciesNNA': {'Objectives': {'listen to you', 'set their kids up for success living in an area where the standards are very low, you showed up, her', 'hit this point, like, this is what you need to do', 'your', \"convince somebody, that's where my struggle was, is i'm teaching them something out of a book, rathe\", 'put something into our body that looks unhealthy', 'say that they did it for the state', 'talk to us or anything like that', 'understand challenges surrounding drinking water services in your region', 'figure out the training problem that there is, and how to better engage our students', 'figure out kind of how to bridge some of these knowledge sets, right, like so what you were just tal', 'reach out to some actually water delivery drivers in 
bethel', 'levels', 'when', 'relationships', \"say, and there have been some people like brian berube, they've really been able to put that into ac\", 'happened', 'changes', 'members', \"see we don't want to get cleaned with brown water or yeah, help us after clean the showers as many t\", 'dig just you said something interesting to me about you know, it being a revolving door', 'drinking', 'take care of', 'that', 'water', 'describe that they may have never looked at before', 'many', 'learn, it was a lot of things that were hands on where they could touch and feel and tighten and put', 'that situation', 'services to those people', 'do that', \"circle back to your first part there since something i didn't mention\", 'measures', 'safe water', 'the health of children', 'do this for this purpose and to really clean the water', 'convince a class that, you know, this water plant operator was pretty highly respected within the co', 'do it like this so that it can be better', 'have some practical findings as well to get to share later as well', 'get out to well, leif will be the first one to go out', \"kind of the quality that's needed to actually serve these communities\", 'like', 'meetings', 'change your ways, the city sub is the way to go', 'ask a elder a question', 'put two hours or three leif albertson 1:13:52 upshot is you know, he will tell you what he thinks', 'member', 'go up through the kuskokwim or whether it be the yukon or wherever it is that their water plants are', 'drink it anyways, it you know, we see we see in drink with our eyes, often, rather than what water q', \"sound jaded or or you know, i think from a practical aspects, like like, you know, and i know there'\", 'move 30 gallons of water with like two people on on it', 'clean', 'jump in with water plant operator stuff', 'tell a story', \"deal with it for what they could deal with before it's become too much\", 'get it from the school', 'work with your hands', 'go to school, or you need to learn so 
that you can go to college, because we all know that, like her', 'be recognized within the community'}, 'Constraints': {'resources in a village', 'touch and how do you explain how it kills them, or the disinfectant will kill them', \"education, and having someone speak out, that's going to be able to describe it to their people well\", 'get water from their water source', 'come by having the communities have the proper amount of points to get a water system built for them', 'obtain', \"you couldn't tell somebody that they have access to safe like safe water to water you could put a st\", 'go any of those places', 'be a prophet in your own land', \"really make points in sections with, with a water plant if you don't have a water plant\", 'really collect money from people when it comes to that'}, 'Trade-Offs': {'have a co-water system going to explain that', 'on the job,'}, 'Decision Variables': {'in their ways to how they think about water, and water quality and what they think about their water', 'up for them have a with a changing of the seasons, they have their their water lines propped up on, ', 'up where we will use a cartridge filter to take out some more of that iron or turbidity that might b', \"of a set of villages that you've had or have you been all over the place\", 'up for communities to either provide services to those people', 'in terms of getting water to people how things change over time', \"your minds if they're not open to receiving information\", 'budgets and have', 'background', 'your ways, the city sub is the way to go', \"up a system to where like, i'm going to train you on a few small things\", \"their kids up for success living in an area where the standards are very low, you showed up, here's \", 'your minds', 'of mind'}, 'Options': {\"a second or you know, another type of filter at home, you're often using brita filters to take out t\", 'they wanted to lead', \"we're i mean, again, you know, it's easy in the interviews that you and i both 
know what we're talki\", 'a cartridge filter to take out some more of that iron or turbidity that might be in the water just f'}, 'Solutions': {\"where like, i'm going to train you on a few small things\", 'be able to pressurize pumps and have water running through them', 'a bit', \"to clean our bodies with and like, just because we don't want to see we don't want to get cleaned wi\"}, 'State Variables': {'to children and you know, it can be very difficult for them to try to do this in a safe clean manner', 'either be more turbidity, more contaminant', 'whatever reason', 'how much the land erodes', 'covid', \"covid, we've we've had to kind of swap that\", 'permafrost', 'choice', 'turnover in their office or', 'bigger picture of like best practice scores with the state and how they fund water plant projects', 'to run the water plants can be good'}}, '1_6__InterdependenciesNNA': {'Objectives': {'water for the community', 'get them to pass the test', 'support', 'cut this but like, they had a different problem', 'me this many people', 'doesn', 'trick them', 'improve for the next round', 'members', 'pass their test', 'talk to us', 'then', 'just', 'drinking', 'be on', 'keep it is not the drinking water aspect, but all the other aspects of it, how it changes their life', 'for the next round', 'that', 'be successful on different tasks, like how big of a crew we need, what kind of support we need for t', 'water', 'than', 'that manpower that that we need on the ground', 'get everyone pipe systems', 'get paid well', 'answer a question', \"the salary, that's the answer\", 'workers or whatever supplies we need, fuel and things like that, on the ground', 'running water for their community, or for their teachers, not the community', 'have', 'a resource of we have tools, specialized tools that we keep here in bethel that we can ship out, tha', 'could', 'pass the, the level one or level two test', \"reevaluate, like we're not coming to do it without we, we've understood from doing 
this long enough\", 'be changed and fixed', 'like', 'relations', 'brought', \"additional workers, whatever they're needed\", 'the salary', 'get those certifications to do the job', 'qualify a little bit but can you tell me about the how drinking water is provided in the places that', 'member', 'issue', 'safe water'}, 'Constraints': {\"get those certifications because i don't have a higher level plant to work in\", 'meet treatment standards', 'aging infrastructure', 'pass', 'remember seven to ten percent, on average, something like that', 'internally due to their own leadership decisions about travel to villages for i think the last three', 'afford this', 'their ability to speak into a lot of those situations and improve', \"pass, they feel like someone else could do it better, even if they're a good operator\", 'explain', 'get it back', 'put that together'}, 'Trade-Offs': {'through a gravity sewer with lift station', 'self haul to a lagoon', 'at of washateria'}, 'Decision Variables': {\"and we're like really slow down because something's changed in the river\", 'in format, a lot more, a lot more time taken per subject and stuff', 'one thing if you had whatever you needed to have, a magic wand', 'all of our treatment scheme from our winter program to our summer program', 'amount', 'health issues', 'some of that, that management', 'that plant, they do have a hard time testing and so'}, 'Options': {'the operator was there and like on top of whether the boilers needed to be on', 'those are national standards', \"they're technical questions about the water that a local person, maybe a water plant operator,\", \"there's absolutely no management of where that trailer is going to haul water\", \"that's what gets me a level of service that i don't have to worry when i turn on my faucet\", \"the last couple of years, i've dealt with a ton of school complaint issues, but we get a lot of comp\"}, 'Solutions': {'help tribes or rural administration, right', 'work, let alone 
moving to florida', 'a national test now', 'our summer program'}, 'State Variables': {'aging infrastructure', 'their own leadership decisions about travel to villages for i think the last three, four years', 'weather and days off coinciding in bad ways', \"water conditions leif albertson 19:36 yeah, i mean, so when we're talking about teachers, my impress\"}}, '3_1__InterdependenciesNNA': {'Objectives': {'figure out where your next water plant will be to be able to push the water up, or lift station or s', 'build the pipeline', 'get water, pipe water and sewer', 'back up just a second because we kind of launched in', 'it for the last year or two or three, maybe years, something like that', 'go from a big tank to a little tank right in that space', \"water by truck leif albertson 1:09:32 what do you think's gonna happen with the avenues\", 'as little as 40 gallons, and you go why would anybody order 40', 'pay for it now', 'water to my house, and it sits there for 30 days, because i only get water once a month', \"x amount of water to everybody, and they'll have to live with it\", 'plumb a neighborhood, thinking they can help but they also need to be thinking about some of the com', 'let them know, you know what our problems are', \"use trucks to deliver water is i don't think there's enough effort put it how much time does it take\", 'figure out', 'think back to some of your experiences as well, you know, what do you think was more helpful for on ', \"get when they're manufactured, they don't usually take in consideration that maybe somebody has to c\", 'find some extra, with upping the grants', 'see us put pilings in and to put the lift station on pilings', 'water way over here', 'do a little bit here', 'do what you do', \"bring her on board, aren't we\", \"do well, with this project, we're, we're kind of chasing our tail\", 'go back to the how much it cost per mile', 'go there', 'see what it will cost to get out there', 'go to every house to see if a red lights 
on', 'every five', \"talk water and sewer projects, you don't think of real estate\", 'start down at the lower level of vocational education', 'get a hold of one of them', 'figure out how to fix the problem', 'explain', 'x amount for everybody to get the water', 'work seven days a week, 12 hour days', 'keep an operator and find an operator and train an operator and get their operator to pass the test ', 'fuel in western alaska and we had the same problem with fuel tanks, yeah', 'learn', 'find something', 'get through this', 'get it down', \"know what the costs are going to be and if what they're paying is going to cover those costs, you kn\", 'improve it for the last year or two or three, maybe years, something like that', 'them', 'look at is all those things so just i mean treatment, but also billing and personnel like you mentio', 'be, whoever is the owner needs to be actively involved'}, 'Constraints': {\"find a recorder in the state of alaska, you don't have to record anything leif albertson 06:53 and t\", \"bill, i just recently found out there's like a 10 digit number for all these different\", 'even get the materials pete williams 1:10:52 steel mill products have gone up 127% since 2020, plast', \"seem to you know, you're gonna pull the hose six more feet stay away from the house\", 'get anybody to give us any lead times', 'because covid', 'do everything', 'do the pipes', 'even send anybody to training', \"can't really cover it so i mean, like the the water bill, i mean, like yk, you know, with the water \", 'promise it to us for 22 weeks', 'see in them', \"sit outside, you can't park them outside michaela lapatin 1:00:10 from a worker and operator perspec\", 'people in general interested', 'find one to fit well unless i want to go from a big tank to a little tank right in that space', 'get him out of there', 'beat it other than how much it costs to get it up and get it installed', \"be maintained, and the city's not responsible for it\", 'get what they 
want, they get mad', \"see what's going on inside\", 'get blood out of a stone, right', 'really think of anything'}, 'Trade-Offs': {\"i'm going to have a lot of people working a lot of overtime,\", 'anchorage'}, 'Decision Variables': {\"aside some funds for depreciation and that's a real weak spot in municipalities\", \"it up is you get tested on the plant that you're running\", 'and thaw and all that stuff', 'them out every five years', 'their rates, they cost so much per mile to deliver', 'this here to make this fit here', 'a tank out, especially up here so they just leave it and pretty soon you get a', 'up a cot', 'it up'}, 'Options': set(), 'Solutions': {'over there is huge', 'a lot of water', 'that as i get a text when my neighbors walk by and tell me when the light on my house is on'}, 'State Variables': {\"permafrost, and you know, all of a sudden the homeowner, we're telling the homeowner, hey, that's yo\", 'permafrost and climate change and thaw and all that stuff', 'freezing up', 'change', 'permafrost', 'turnover that we have', \"all the all the supply problems and all that stuff that's going on, we went out to bid this, this ha\"}}, '3_2__InterdependenciesNNA': {'Objectives': {'give you the ability to just read it, or you can write into it', 'promote going to a peristaltic', 'develop techniques and approaches to get operators certified and send them to get them certified and', \"say it's back in new hampshire, somewhere on the east coast, they were looking at one of the compani\", 'know, are they setting up for remote monitoring', 'go take it again and as soon as you take your one and get it, take your two, \"well i don\\'t have the ', 'be able to make my own decisions', 'get people through the test is seems to be a challenge', 'promote that', 'know, the lead copper, is it coming from the household plumbing', 'work anymore', 'members', 'maintain', 'safe drinking water and be stewards of the environment with their wastewater treatment', 'better water', 
'take it again', 'bring people', 'know to run this kind of plant', 'drinking', 'bring back the crew because richard 05:16 about 40 years in the industry up here', 'do the right thing, cities or communities that are willing to make sure you have the things to do it', 'public health', \"be on at least 100 foot in depth before they're considered good water wells for public use\", 'that', 'be able to hit hand', 'learn outside of the certification process that each person kind of has to figure out at each plant', 'spray water all over the place', 'make this too pointeed but i mean, richard 1:14:01 regulatory and enforcement and compliance', 'punish non compliance', 'water to the plants', 'should', 'benchmarks, and then to be 100% compliant for 20 consecutive quarters', \"go into this field, even though you can work anywhere in the world, every community's got it\", \"limit the amount of experience or knowledge that a person can get to put towards licensing, they're \", 'expert', 'limit it', 'incentivize employees to go out and get higher level licenses', \"get into the field, i've yet to find any utility that's not willing to have a path in the door for t\", \"go to work, you just there's 20 positions open at bethel right now\", 'drink the water', 'better', 'stay here, if i can go make the same or better money and a cheaper place to live with more of the th', 'exclude people', \"troubleshoot equipment, whether it's pressure relief valve, so it's, if you can find somebody that h\", 'learn outside of certification', 'reduce monitoring', 'take the ceus', \"work or or the other thing is you're not paying a competitive wage\", 'also teach how to take the test', 'have a level two operator and that applies to the treatment side of it, the distribution side and th', \"say either way, that there's more\", 'make it work', 'beat it out of people who have taken it, you know, like, what was hard for you', 'learn', \"that if you're not always working on a level 3 or 4\", 'be 
employed, you need like a qualifying job', \"go back to work, you've got to go in as a contractor somewhere else\", 'build a new water plant', 'go with it', 'be prepared for some things that are different', 'go to college, even though there are colleges out there that have programs, you can just go to the f', 'be in water', \"operational assistance here, like northern utility services, and well there's a couple of firms that\", 'knoow', 'come up recognizing the fact that how many of these communities are level ones level twos, threes an', 'say it perception', 'you have a limited', \"that, but i don't know, i don't know too many operators that, well part of it is they don't want to \", 'get licensed and pay all the insurance that you gotta have to contract with a municipality or a city'}, 'Constraints': {'just come up with a really simple siphoning system with the intake off the floor', 'we have a guy that works with me all day long, come in on the weekend, and just walk through and che', 'take hand control on', \"tell you why the disinfection byproducts weren't taken\", 'go out there and get a hand dial', 'of compliance, lack of enforcement, to the point and if you wanted to look at a case study, i mean, ', \"of knowledge out there that it's a field, it's, you know, understaffed and up here in alaska\", 'finding money to run in rural areas', 'make process controls or other ones, you actually give them the ability to do it', 'get a level four, but he can get a level three', \"figure out if there's anything blocking it because i stuck a pipe in there\", 'get a level four', 'of operators do you think a lot of it is people cannot pass the test', 'imagine it right', 'go to work for a pers job', 'remote location', 'of operators', \"have a secondary shift on any day, you don't have a primary shift\", 'turn my pump on, can you fix it', 'do a really simple way', 'test until you attended a class or you have certain amount of experience, there are some initial thi', 'of 
operators or lack of experienced higher licensed operators', 'even be right', 'explain', 'pass the test', \"help them with the problems, but they can have interpreters if they don't understand correctly or ma\", 'even wear a hoodie, right', 'get water out of the filter', 'apply for reciprocity', 'have a violation', \"afford as an entity or as a utility, or you can't afford to fail\"}, 'Trade-Offs': {'you can either take the test again,'}, 'Decision Variables': {'system, however, basic it was somebody in bethel or somebody else probably could monitor it', 'chemical dosage pumps', \"uv settings, i can adjust pressure, anything, anything that's automated, that they have the ability \", 'in the classroom and got some ceus', 'program here', 'ph to cut that down', 'with with the changing technology, so scada', 'of guidelines and develop everything, how you put on your internet or how you put on a computer base', \"will go in and try to figure out what's blocking it out\", \"the uv settings, i can adjust pressure, anything, anything that's automated, that they have the abil\", 'on your computer', 'circuit somewhere, can i go over to that board and go back to the old school way and turn the pump o', 'into it', 'order, right', 'program to make sure that well to try and lower the lead and copper', 'pump speed remotely', 'project fixes that', 'that the scada system brings'}, 'Options': {\"i mean, in when you're dealing with corporations,\", \"we've already talked about electrical, we talked about instrumentation, we talked about signals, goi\", 'you can either take the test again,', 'that then allows them to have flexibility when it comes to supervisory staff, whether they have prim', 'i wanted to enjoy reciprocity,', 'right now the big problem is, you get an operator certified if they have no vested interest'}, 'Solutions': {'not just to put the coffee can on all three filters and let it go', 'make sure that well to try and lower the lead and copper', 'us required by the 
state to meet our permit, highlighted in blue'}, 'State Variables': {\"that it's federally now, like regulated, and the systems up here are pretty different\", 'remote location', 'reciprocity, right'}}, '3_3__InterdependenciesNNA': {'Objectives': {'the water to a fire when there is a fire', \"become a teacher so you're gonna go back\", 'be a teacher, but do you think you might be interested in staying and working in water', 'go back to your village is to be at home and to see your family', 'be a teacher, you want to go back and work in the village and be with your family', 'be a teacher theo 04:52 yea nikki ritsch 05:07 you said third grade', \"go back, you're okay here with the part time higher pay\"}, 'Constraints': {'of drivers, which impacts how many places you have to do on your route every day'}, 'Trade-Offs': set(), 'Decision Variables': set(), 'Options': {\"someone's sick\"}, 'Solutions': set(), 'State Variables': set()}}\n", " Type: \n", " Count: 9\n", " Sample key: 1_1_InterdependenciesNNA\n", " Sample value type: \n", " ✓ Structure looks correct (dict of dicts)\n", " Component types: ['Objectives', 'Constraints', 'Trade-Offs', 'Decision Variables', 'Options', 'Solutions', 'State Variables']\n", "\n", "Step 2: Checking for document mismatch...\n", "--------------------------------------------------------------------------------\n", "✓ All documents present in decision_components\n", "\n", "================================================================================\n", "🔨 REBUILDING DECISION COMPONENTS\n", "================================================================================\n", "\n", "Extracting components from all interviews...\n", "--------------------------------------------------------------------------------\n", " ✓ 1_1_InterdependenciesNNA: 87 components\n", " ✓ 1_2_InterdependenciesNNA: 72 components\n", " ✓ 1_3_InterdependenciesNNA: 53 components\n", " ✓ 1_4_InterdependenciesNNA: 48 components\n", " ✓ 1_5__InterdependenciesNNA: 
105 components\n", " ✓ 1_6__InterdependenciesNNA: 81 components\n", " ✓ 3_1__InterdependenciesNNA: 90 components\n", " ✓ 3_2__InterdependenciesNNA: 131 components\n", " ✓ 3_3__InterdependenciesNNA: 9 components\n", "\n", "✓ Extracted components from 9 interviews\n", "\n", "Verifying new structure...\n", "--------------------------------------------------------------------------------\n", "Sample document: 1_1_InterdependenciesNNA\n", "Value type: \n", "✓ Correct! It's a dict\n", "Component types: ['Objectives', 'Constraints', 'Trade-Offs', 'Decision Variables', 'Options', 'Solutions', 'State Variables']\n", "Sample component count: 47 objectives\n", "\n", "Replacing global variable...\n", "--------------------------------------------------------------------------------\n", "✓ decision_components updated with correct structure\n", "\n", "================================================================================\n", "📊 SUMMARY\n", "================================================================================\n", "\n", "Total documents: 9\n", "Decision components extracted: 9\n", "\n", "Components by type:\n", " • Objectives: 364\n", " • Constraints: 121\n", " • Decision Variables: 86\n", " • State Variables: 44\n", " • Options: 30\n", " • Solutions: 20\n", " • Trade-Offs: 11\n", "\n", "================================================================================\n", "✅ REPAIR COMPLETE\n", "================================================================================\n", "\n", "Now you can run the validation cell and it should work!\n", "\n", "================================================================================\n" ] } ], "source": [ "# CELL: Fix Decision Components Data Structure\n", "\n", "print(\"=\"*80)\n", "print(\"🔧 DECISION COMPONENTS: STRUCTURE DIAGNOSTIC & REPAIR\")\n", "print(\"=\"*80 + \"\\n\")\n", "\n", "# ==========================================\n", "# 1. 
DIAGNOSE THE PROBLEM\n", "# ==========================================\n", "print(\"Step 1: Diagnosing current state...\")\n", "print(\"-\"*80)\n", "\n", "# Check what we have; bool() keeps the flags True/False so the prints below\n", "# report a flag instead of dumping the whole dictionary\n", "has_docs = bool(globals().get('documents'))\n", "has_dc = bool(globals().get('decision_components'))\n", "\n", "print(f\"documents exists: {has_docs}\")\n", "if has_docs:\n", "    print(f\" Count: {len(documents)}\")\n", "    print(f\" Sample: {list(documents.keys())[0]}\")\n", "\n", "print(f\"decision_components exists: {has_dc}\")\n", "if has_dc:\n", "    print(f\" Type: {type(decision_components)}\")\n", "    print(f\" Count: {len(decision_components)}\")\n", "\n", "    if decision_components:\n", "        sample_key = list(decision_components.keys())[0]\n", "        sample_val = decision_components[sample_key]\n", "        print(f\" Sample key: {sample_key}\")\n", "        print(f\" Sample value type: {type(sample_val)}\")\n", "\n", "        # Check if structure is correct\n", "        if isinstance(sample_val, dict):\n", "            print(f\" ✓ Structure looks correct (dict of dicts)\")\n", "            print(f\" Component types: {list(sample_val.keys())}\")\n", "        elif isinstance(sample_val, list):\n", "            print(f\" ❌ WRONG STRUCTURE: Values are lists, should be dicts!\")\n", "            print(f\" This will cause validation cell to fail\")\n", "        else:\n", "            print(f\" ❌ UNEXPECTED STRUCTURE: {type(sample_val)}\")\n", "\n", "print()\n", "\n", "# ==========================================\n", "# 2. 
CHECK FOR MISMATCH\n", "# ==========================================\n", "if has_docs and has_dc:\n", "    print(\"Step 2: Checking for document mismatch...\")\n", "    print(\"-\"*80)\n", "\n", "    doc_names = set(documents.keys())\n", "    dc_names = set(decision_components.keys())\n", "\n", "    missing_in_dc = doc_names - dc_names\n", "    extra_in_dc = dc_names - doc_names\n", "\n", "    if missing_in_dc:\n", "        print(f\"⚠️ Missing from decision_components ({len(missing_in_dc)} documents):\")\n", "        for name in sorted(missing_in_dc):\n", "            print(f\" • {name}\")\n", "    else:\n", "        print(\"✓ All documents present in decision_components\")\n", "\n", "    if extra_in_dc:\n", "        print(f\"⚠️ Extra in decision_components ({len(extra_in_dc)} documents):\")\n", "        for name in sorted(extra_in_dc):\n", "            print(f\" • {name}\")\n", "\n", "    print()\n", "\n", "# ==========================================\n", "# 3. REBUILD IF NEEDED\n", "# ==========================================\n", "if has_docs:\n", "    print(\"=\"*80)\n", "    print(\"🔨 REBUILDING DECISION COMPONENTS\")\n", "    print(\"=\"*80 + \"\\n\")\n", "\n", "    import re\n", "\n", "    # Define component patterns (same as original cell)\n", "    component_patterns = {\n", "        'Objectives': {\n", "            'patterns': [\n", "                r'\\b(?:goal|objective|aim|target|purpose|mission)\\s+(?:is|was|to|of)\\s+([^.!?]+)',\n", "                r'\\b(?:want|need|trying|seeking|hoping)\\s+to\\s+([^.!?]+)',\n", "                r'\\b(?:improve|increase|enhance|maximize|optimize|ensure)\\s+([^.!?]+)',\n", "                r'\\b(?:provide|deliver|maintain|achieve|accomplish)\\s+([^.!?]+)',\n", "                r'\\b(?:safe|clean|reliable|adequate|quality)\\s+(\\w+)',\n", "                r'community\\s+(\\w+)',\n", "                r'customer\\s+(\\w+)',\n", "                r'public\\s+health',\n", "                r'water\\s+quality',\n", "                r'service\\s+reliability'\n", "            ],\n", "            'examples': ['safe water', 'community education', 'service reliability', 'public health']\n", "        },\n", "        'Constraints': {\n", "            'patterns': [\n", "                
r'\\b(?:limited|lack|shortage|insufficient|not enough)\\s+([^.!?]+)',\n", "                r'\\b(?:cannot|can\\'t|unable to|difficult to)\\s+([^.!?]+)',\n", "                r'\\b(?:constraint|limitation|restriction|barrier)\\s+(?:on|to|of)\\s+([^.!?]+)',\n", "                r'\\b(?:budget|funding|money|cost)\\s+(?:constraint|limitation|issue|problem)',\n", "                r'\\b(?:workforce|staff|personnel)\\s+(?:shortage|limited|lack)',\n", "                r'\\b(?:remote|isolated|rural)\\s+location',\n", "                r'harsh\\s+(?:climate|weather|conditions)',\n", "                r'permafrost\\s+(?:thaw|melt|issue)',\n", "                r'aging\\s+infrastructure',\n", "                r'limited\\s+capacity'\n", "            ],\n", "            'examples': ['limited funding', 'workforce shortage', 'remote location', 'aging infrastructure']\n", "        },\n", "        'Trade-Offs': {\n", "            'patterns': [\n", "                r'\\b(?:trade-?off|compromise|balance|versus|vs\\.?)\\s+between\\s+([^.!?]+)\\s+and\\s+([^.!?]+)',\n", "                r'(?:if|when)\\s+we\\s+([^,]+),\\s+(?:then|we)\\s+(?:can\\'t|cannot|lose|sacrifice)\\s+([^.!?]+)',\n", "                r'(?:more|increasing|improving)\\s+([^.!?]+)\\s+(?:means|requires)\\s+(?:less|reducing|sacrificing)\\s+([^.!?]+)',\n", "                r'(?:either|must choose between)\\s+([^.!?]+)\\s+or\\s+([^.!?]+)',\n", "                r'cost\\s+vs\\.?\\s+quality',\n", "                r'speed\\s+vs\\.?\\s+accuracy',\n", "                r'coverage\\s+vs\\.?\\s+service level'\n", "            ],\n", "            'examples': ['operator satisfaction vs customer coverage', 'cost vs quality']\n", "        },\n", "        'Decision Variables': {\n", "            'patterns': [\n", "                r'\\b(?:can\\s+adjust|can\\s+change|can\\s+modify|can\\s+control)\\s+([^.!?]+)',\n", "                r'\\b(?:operator|staff|we)\\s+(?:adjust|change|modify|set|control)\\s+(?:the\\s+)?([^.!?]+)',\n", "                r'\\b(?:adjust|change|modify|set|control)\\s+(?:the\\s+)?([^.!?]+)',\n", "                r'work\\s+(?:hours|schedule|shifts)',\n", "                r'treatment\\s+(?:process|level|dosage)',\n", "                r'maintenance\\s+frequency',\n", "                r'staffing\\s+levels',\n", "                r'operating\\s+(?:parameters|conditions)'\n", "            ],\n", "            'examples': ['operator work hours', 'treatment dosage', 'maintenance frequency']\n", "        
},\n", "        'Options': {\n", "            'patterns': [\n", "                r'\\b(?:option|alternative|choice|either)\\s+(?:is|to|between)\\s+([^.!?]+)',\n", "                r'\\b(?:could|can|might)\\s+(?:either|choose to)\\s+([^.!?]+)\\s+or\\s+([^.!?]+)',\n", "                r'residents\\s+(?:choose|use|rely on)\\s+([^.!?]+)',\n", "                r'(?:use|switch to|choose between)\\s+([^.!?]+)\\s+(?:or|versus)\\s+([^.!?]+)',\n", "                r'treated\\s+(?:water|source)',\n", "                r'natural\\s+(?:water|source)',\n", "                r'trucked\\s+water',\n", "                r'hauled\\s+water'\n", "            ],\n", "            'examples': ['treated vs natural water', 'truck vs pipe delivery']\n", "        },\n", "        'Solutions': {\n", "            'patterns': [\n", "                r'\\b(?:solution|workaround|approach|strategy|method)\\s+(?:is|was|to)\\s+([^.!?]+)',\n", "                r'\\b(?:we|they|operators)\\s+(?:implemented|use|employ|developed)\\s+([^.!?]+)',\n", "                r'\\b(?:program|system|initiative)\\s+(?:for|to)\\s+([^.!?]+)',\n", "                r'remote\\s+(?:worker|training|monitoring)\\s+program',\n", "                r'managerial\\s+capacity',\n", "                r'cross-?training',\n", "                r'backup\\s+(?:system|plan)',\n", "                r'emergency\\s+(?:protocol|procedure)',\n", "                r'partnership\\s+with',\n", "                r'automated\\s+(?:monitoring|control)'\n", "            ],\n", "            'examples': ['remote worker program', 'managerial capacity', 'cross-training']\n", "        },\n", "        'State Variables': {\n", "            'patterns': [\n", "                r'\\b(?:given|due to|because of)\\s+(?:the\\s+)?([^.!?]+)',\n", "                r'\\b(?:external|environmental|contextual)\\s+(?:factor|condition|constraint)\\s+([^.!?]+)',\n", "                r'permafrost',\n", "                r'seasonal\\s+(?:fluctuation|variation|change)',\n", "                r'climate\\s+(?:condition|pattern)',\n", "                r'geographic\\s+location',\n", "                r'population\\s+(?:size|density)',\n", "                r'regulatory\\s+requirement',\n", "                r'weather\\s+(?:pattern|condition)',\n", "                r'distance\\s+from'\n", "            ],\n", "            'examples': ['permafrost', 'seasonal fluctuations', 'remote location']\n", "        }\n", "    }\n", "\n", "    print(\"Extracting components from all interviews...\")\n", "    print(\"-\"*80)\n", "\n", "    # Rebuild decision_components 
with CORRECT structure\n", "    decision_components_new = {}\n", "\n", "    for doc_name, doc_text in documents.items():\n", "        # Initialize as a dict of sets keyed by component type (not a list!)\n", "        components = {comp_type: set() for comp_type in component_patterns.keys()}\n", "\n", "        # Process text\n", "        text_lower = doc_text.lower()\n", "\n", "        # Extract each component type\n", "        for comp_type, comp_info in component_patterns.items():\n", "            for pattern in comp_info['patterns']:\n", "                matches = re.finditer(pattern, text_lower, re.IGNORECASE)\n", "                for match in matches:\n", "                    # Only patterns with a capture group contribute here;\n", "                    # matches from group-less patterns are skipped\n", "                    if match.lastindex and match.lastindex >= 1:\n", "                        extracted = match.group(1).strip()\n", "                        extracted = re.sub(r'\\s+', ' ', extracted)\n", "                        extracted = extracted[:100]  # truncate long captures\n", "                        if len(extracted) > 3:\n", "                            components[comp_type].add(extracted)\n", "\n", "            # Add examples if present\n", "            for example in comp_info['examples']:\n", "                if example.lower() in text_lower:\n", "                    components[comp_type].add(example)\n", "\n", "        # Store with CORRECT structure\n", "        decision_components_new[doc_name] = components\n", "\n", "        total = sum(len(v) for v in components.values())\n", "        print(f\" ✓ {doc_name}: {total} components\")\n", "\n", "    print()\n", "    print(f\"✓ Extracted components from {len(decision_components_new)} interviews\")\n", "    print()\n", "\n", "    # Verify structure\n", "    print(\"Verifying new structure...\")\n", "    print(\"-\"*80)\n", "    sample_key = list(decision_components_new.keys())[0]\n", "    sample_val = decision_components_new[sample_key]\n", "    print(f\"Sample document: {sample_key}\")\n", "    print(f\"Value type: {type(sample_val)}\")\n", "    if isinstance(sample_val, dict):\n", "        print(f\"✓ Correct! 
It's a dict\")\n", "        print(f\"Component types: {list(sample_val.keys())}\")\n", "        print(f\"Sample component count: {len(sample_val['Objectives'])} objectives\")\n", "    else:\n", "        print(f\"❌ Still wrong: {type(sample_val)}\")\n", "\n", "    print()\n", "\n", "    # Replace old variable\n", "    print(\"Replacing global variable...\")\n", "    print(\"-\"*80)\n", "    globals()['decision_components'] = decision_components_new\n", "    print(\"✓ decision_components updated with correct structure\")\n", "    print()\n", "\n", "    # Show summary\n", "    print(\"=\"*80)\n", "    print(\"📊 SUMMARY\")\n", "    print(\"=\"*80)\n", "    print()\n", "    print(f\"Total documents: {len(documents)}\")\n", "    print(f\"Decision components extracted: {len(decision_components_new)}\")\n", "    print()\n", "\n", "    # Count totals\n", "    total_by_type = {ct: 0 for ct in component_patterns.keys()}\n", "    for doc_comps in decision_components_new.values():\n", "        for ct, items in doc_comps.items():\n", "            total_by_type[ct] += len(items)\n", "\n", "    print(\"Components by type:\")\n", "    for ct, count in sorted(total_by_type.items(), key=lambda x: -x[1]):\n", "        print(f\" • {ct}: {count}\")\n", "\n", "    print()\n", "    print(\"=\"*80)\n", "    print(\"✅ REPAIR COMPLETE\")\n", "    print(\"=\"*80)\n", "    print()\n", "    print(\"Now you can run the validation cell and it should work!\")\n", "    print()\n", "\n", "else:\n", "    print(\"❌ Cannot rebuild - missing required variables\")\n", "    print(\" Please run the data loading cells first\")\n", "\n", "print(\"=\"*80)" ] }, { "cell_type": "code", "execution_count": 31, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "================================================================================\n", "🔬 DECISION COMPONENTS: HUMAN CODING vs AI EXTRACTION\n", "================================================================================\n", "\n", "Step 1: Checking requirements...\n", 
"--------------------------------------------------------------------------------\n", "✓ decision_components: 9 items\n", "✓ optimization_formulations: 9 items\n", "✓ documents: 9 items\n", "\n", "Checking data structure...\n", " Sample document: 1_1_InterdependenciesNNA\n", " Type of decision_components: \n", " Type of decision_components['1_1_InterdependenciesNNA']: \n", " ✓ Correct structure (dict of dicts)\n", " Component types available: ['Objectives', 'Constraints', 'Trade-Offs', 'Decision Variables', 'Options', 'Solutions', 'State Variables']\n", "\n", "Step 2: Loading human coding reference definitions...\n", "--------------------------------------------------------------------------------\n", "✓ Loaded 7 component type definitions\n", "\n", "Step 3: Summarizing AI extraction results...\n", "--------------------------------------------------------------------------------\n", "✓ Calculated AI extraction statistics\n", "\n", "Step 4: Building human vs AI comparison table...\n", "--------------------------------------------------------------------------------\n", "✓ Comparison table created: 7 rows\n", "\n", "Step 5: Creating publication-ready validation table...\n", "--------------------------------------------------------------------------------\n", "✓ Publication table created\n", "\n", "Step 6: Validating keyword-based extraction...\n", "--------------------------------------------------------------------------------\n", "✓ Keyword validation completed\n", "\n", "================================================================================\n", "🎛️ MANUAL VALIDATION INTERFACE\n", "================================================================================\n", "\n", "Select component type and interview to review AI extraction:\n", "\n" ] }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "837448f42d124b68bb0b03d283dcc16d", "version_major": 2, "version_minor": 0 }, "text/plain": [ 
"HBox(children=(Dropdown(description='Component:', layout=Layout(width='300px'), options=('Objectives', 'Constr…" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "c73ca8e2d30d458f9f4ac07b43c1254b", "version_major": 2, "version_minor": 0 }, "text/plain": [ "Output()" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "\n", "================================================================================\n", "📊 PUBLICATION-READY VALIDATION TABLE\n", "================================================================================\n", "\n" ] }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "7ea698873b55425c9354ef2990a915f4", "version_major": 2, "version_minor": 0 }, "text/plain": [ "Output()" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "\n", "================================================================================\n", "📈 DETAILED COMPARISON TABLE\n", "================================================================================\n", "\n" ] }, { "data": { "text/html": [ "

Detailed Human vs AI Comparison

" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
| | Component Type | Human Definition | Human Examples | Theoretical Basis | AI Total Mentions | AI Unique Items | AI Avg per Interview | AI Coverage (% interviews) | AI Examples |
|---|---|---|---|---|---|---|---|---|---|
| 0 | Objectives | The goals or desired outcomes that operators s... | Community education, Providing safe water | From multi-objective optimization theory | 364 | 334 | 40.4 | 100% | a pipe water and sewer system all at once, a p... |
| 1 | Constraints | Limitations or boundaries that restrict availa... | Limited funding, Workforce shortage | From constraint satisfaction problems | 121 | 120 | 13.4 | 100% | afford, afford as an entity or as a utility, o... |
| 2 | Trade-Offs | Situations where improving one objective neces... | Improving operator satisfaction compromises pr... | From Pareto optimization theory | 11 | 11 | 1.2 | 56% | anchorage, at of washateria, have a co-water s... |
| 3 | Decision Variables | Controllable factors operators can adjust to i... | Operator work hours | From optimization control variables | 86 | 86 | 9.6 | 89% | a number [crosstalk] robert: it wouldn't exact... |
| 4 | Options | Discrete alternative courses of action that ar... | Residents choose between treated and natural w... | From discrete choice theory | 30 | 30 | 3.3 | 89% | a cartridge filter to take out some more of th... |
| 5 | Solutions | Strategies, workarounds, or approaches that op... | Remote worker program, Managerial capacity | From solution space in optimization | 20 | 20 | 2.2 | 78% | a bit, a lot of water, a national test now |
| 6 | State Variables | Fixed or external factors that define the prob... | Permafrost, Seasonal fluctuations | From state-space representation | 44 | 40 | 4.9 | 89% | a freeze-up that would cause a service disrupt... |
\n", "
" ], "text/plain": [ " Component Type Human Definition \\\n", "0 Objectives The goals or desired outcomes that operators s... \n", "1 Constraints Limitations or boundaries that restrict availa... \n", "2 Trade-Offs Situations where improving one objective neces... \n", "3 Decision Variables Controllable factors operators can adjust to i... \n", "4 Options Discrete alternative courses of action that ar... \n", "5 Solutions Strategies, workarounds, or approaches that op... \n", "6 State Variables Fixed or external factors that define the prob... \n", "\n", " Human Examples \\\n", "0 Community education, Providing safe water \n", "1 Limited funding, Workforce shortage \n", "2 Improving operator satisfaction compromises pr... \n", "3 Operator work hours \n", "4 Residents choose between treated and natural w... \n", "5 Remote worker program, Managerial capacity \n", "6 Permafrost, Seasonal fluctuations \n", "\n", " Theoretical Basis AI Total Mentions \\\n", "0 From multi-objective optimization theory 364 \n", "1 From constraint satisfaction problems 121 \n", "2 From Pareto optimization theory 11 \n", "3 From optimization control variables 86 \n", "4 From discrete choice theory 30 \n", "5 From solution space in optimization 20 \n", "6 From state-space representation 44 \n", "\n", " AI Unique Items AI Avg per Interview AI Coverage (% interviews) \\\n", "0 334 40.4 100% \n", "1 120 13.4 100% \n", "2 11 1.2 56% \n", "3 86 9.6 89% \n", "4 30 3.3 89% \n", "5 20 2.2 78% \n", "6 40 4.9 89% \n", "\n", " AI Examples \n", "0 a pipe water and sewer system all at once, a p... \n", "1 afford, afford as an entity or as a utility, o... \n", "2 anchorage, at of washateria, have a co-water s... \n", "3 a number [crosstalk] robert: it wouldn't exact... \n", "4 a cartridge filter to take out some more of th... \n", "5 a bit, a lot of water, a national test now \n", "6 a freeze-up that would cause a service disrupt... 
" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "\n", "================================================================================\n", "🔑 KEYWORD VALIDATION\n", "================================================================================\n", "\n" ] }, { "data": { "text/html": [ "

Keyword-Based Validation

" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
| | Component Type | Human Keywords | AI Items Matched | AI Items Total | Match Rate |
|---|---|---|---|---|---|
| 0 | Objectives | goal, objective, aim, achieve, optimize | 0 | 334 | 0.0% |
| 1 | Constraints | limited, shortage, cannot, restriction, barrier | 1 | 120 | 0.8% |
| 2 | Trade-Offs | trade-off, compromise, versus, sacrifice, balance | 0 | 11 | 0.0% |
| 3 | Decision Variables | adjust, control, change, modify, set | 8 | 86 | 9.3% |
| 4 | Options | option, alternative, choice, either, or | 12 | 30 | 40.0% |
| 5 | Solutions | solution, workaround, strategy, implemented, p... | 1 | 20 | 5.0% |
| 6 | State Variables | given, external, environmental, fixed, uncontr... | 0 | 40 | 0.0% |
\n", "
" ], "text/plain": [ " Component Type Human Keywords \\\n", "0 Objectives goal, objective, aim, achieve, optimize \n", "1 Constraints limited, shortage, cannot, restriction, barrier \n", "2 Trade-Offs trade-off, compromise, versus, sacrifice, balance \n", "3 Decision Variables adjust, control, change, modify, set \n", "4 Options option, alternative, choice, either, or \n", "5 Solutions solution, workaround, strategy, implemented, p... \n", "6 State Variables given, external, environmental, fixed, uncontr... \n", "\n", " AI Items Matched AI Items Total Match Rate \n", "0 0 334 0.0% \n", "1 1 120 0.8% \n", "2 0 11 0.0% \n", "3 8 86 9.3% \n", "4 12 30 40.0% \n", "5 1 20 5.0% \n", "6 0 40 0.0% " ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "\n", "================================================================================\n", "📊 VALIDATION VISUALIZATIONS\n", "================================================================================\n", "\n" ] }, { "data": { "application/vnd.plotly.v1+json": { "config": { "plotlyServerURL": "https://plot.ly" }, "data": [ { "marker": { "color": [ "#06A77D", "#06A77D", "#06A77D", "#06A77D", "#06A77D", "#06A77D", "#F77F00" ] }, "text": [ "100%", "100%", "89%", "89%", "89%", "78%", "56%" ], "textposition": "outside", "type": "bar", "x": [ "Objectives", "Constraints", "Decision Variables", "Options", "State Variables", "Solutions", "Trade-Offs" ], "y": [ 100, 100, 88.88888888888889, 88.88888888888889, 88.88888888888889, 77.77777777777777, 55.55555555555556 ] } ], "layout": { "height": 400, "template": { "data": { "bar": [ { "error_x": { "color": "#2a3f5f" }, "error_y": { "color": "#2a3f5f" }, "marker": { "line": { "color": "#E5ECF6", "width": 0.5 }, "pattern": { "fillmode": "overlay", "size": 10, "solidity": 0.2 } }, "type": "bar" } ], "barpolar": [ { "marker": { "line": { "color": "#E5ECF6", "width": 0.5 }, "pattern": { "fillmode": "overlay", "size": 10, "solidity": 
0.2 } }, "type": "barpolar" } ], "carpet": [ { "aaxis": { "endlinecolor": "#2a3f5f", "gridcolor": "white", "linecolor": "white", "minorgridcolor": "white", "startlinecolor": "#2a3f5f" }, "baxis": { "endlinecolor": "#2a3f5f", "gridcolor": "white", "linecolor": "white", "minorgridcolor": "white", "startlinecolor": "#2a3f5f" }, "type": "carpet" } ], "choropleth": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "type": "choropleth" } ], "contour": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "colorscale": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "type": "contour" } ], "contourcarpet": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "type": "contourcarpet" } ], "heatmap": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "colorscale": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "type": "heatmap" } ], "histogram": [ { "marker": { "pattern": { "fillmode": "overlay", "size": 10, "solidity": 0.2 } }, "type": "histogram" } ], "histogram2d": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "colorscale": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "type": "histogram2d" } ], "histogram2dcontour": [ { "colorbar": { "outlinewidth": 0, "ticks": 
"" }, "colorscale": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "type": "histogram2dcontour" } ], "mesh3d": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "type": "mesh3d" } ], "parcoords": [ { "line": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "parcoords" } ], "pie": [ { "automargin": true, "type": "pie" } ], "scatter": [ { "fillpattern": { "fillmode": "overlay", "size": 10, "solidity": 0.2 }, "type": "scatter" } ], "scatter3d": [ { "line": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scatter3d" } ], "scattercarpet": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scattercarpet" } ], "scattergeo": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scattergeo" } ], "scattergl": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scattergl" } ], "scattermap": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scattermap" } ], "scattermapbox": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scattermapbox" } ], "scatterpolar": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scatterpolar" } ], "scatterpolargl": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scatterpolargl" } ], "scatterternary": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scatterternary" } ], "surface": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "colorscale": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 
0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "type": "surface" } ], "table": [ { "cells": { "fill": { "color": "#EBF0F8" }, "line": { "color": "white" } }, "header": { "fill": { "color": "#C8D4E3" }, "line": { "color": "white" } }, "type": "table" } ] }, "layout": { "annotationdefaults": { "arrowcolor": "#2a3f5f", "arrowhead": 0, "arrowwidth": 1 }, "autotypenumbers": "strict", "coloraxis": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "colorscale": { "diverging": [ [ 0, "#8e0152" ], [ 0.1, "#c51b7d" ], [ 0.2, "#de77ae" ], [ 0.3, "#f1b6da" ], [ 0.4, "#fde0ef" ], [ 0.5, "#f7f7f7" ], [ 0.6, "#e6f5d0" ], [ 0.7, "#b8e186" ], [ 0.8, "#7fbc41" ], [ 0.9, "#4d9221" ], [ 1, "#276419" ] ], "sequential": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "sequentialminus": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ] }, "colorway": [ "#636efa", "#EF553B", "#00cc96", "#ab63fa", "#FFA15A", "#19d3f3", "#FF6692", "#B6E880", "#FF97FF", "#FECB52" ], "font": { "color": "#2a3f5f" }, "geo": { "bgcolor": "white", "lakecolor": "white", "landcolor": "#E5ECF6", "showlakes": true, "showland": true, "subunitcolor": "white" }, "hoverlabel": { "align": "left" }, "hovermode": "closest", "mapbox": { "style": "light" }, "paper_bgcolor": "white", "plot_bgcolor": "#E5ECF6", "polar": { "angularaxis": { "gridcolor": "white", "linecolor": "white", 
"ticks": "" }, "bgcolor": "#E5ECF6", "radialaxis": { "gridcolor": "white", "linecolor": "white", "ticks": "" } }, "scene": { "xaxis": { "backgroundcolor": "#E5ECF6", "gridcolor": "white", "gridwidth": 2, "linecolor": "white", "showbackground": true, "ticks": "", "zerolinecolor": "white" }, "yaxis": { "backgroundcolor": "#E5ECF6", "gridcolor": "white", "gridwidth": 2, "linecolor": "white", "showbackground": true, "ticks": "", "zerolinecolor": "white" }, "zaxis": { "backgroundcolor": "#E5ECF6", "gridcolor": "white", "gridwidth": 2, "linecolor": "white", "showbackground": true, "ticks": "", "zerolinecolor": "white" } }, "shapedefaults": { "line": { "color": "#2a3f5f" } }, "ternary": { "aaxis": { "gridcolor": "white", "linecolor": "white", "ticks": "" }, "baxis": { "gridcolor": "white", "linecolor": "white", "ticks": "" }, "bgcolor": "#E5ECF6", "caxis": { "gridcolor": "white", "linecolor": "white", "ticks": "" } }, "title": { "x": 0.05 }, "xaxis": { "automargin": true, "gridcolor": "white", "linecolor": "white", "ticks": "", "title": { "standoff": 15 }, "zerolinecolor": "white", "zerolinewidth": 2 }, "yaxis": { "automargin": true, "gridcolor": "white", "linecolor": "white", "ticks": "", "title": { "standoff": 15 }, "zerolinecolor": "white", "zerolinewidth": 2 } } }, "title": { "text": "AI Extraction Coverage by Component Type" }, "xaxis": { "title": { "text": "Component Type" } }, "yaxis": { "range": [ 0, 110 ], "title": { "text": "Percentage of Interviews" } } } }, "text/html": [ "
\n", "
" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "application/vnd.plotly.v1+json": { "config": { "plotlyServerURL": "https://plot.ly" }, "data": [ { "marker": { "color": "#2E86AB" }, "name": "Total Mentions", "text": [ "364", "121", "11", "86", "30", "20", "44" ], "textposition": "outside", "type": "bar", "x": [ "Objectives", "Constraints", "Trade-Offs", "Decision Variables", "Options", "Solutions", "State Variables" ], "y": [ 364, 121, 11, 86, 30, 20, 44 ] }, { "marker": { "color": "#F77F00", "size": 12, "symbol": "diamond" }, "mode": "markers+text", "name": "Unique Items", "text": [ "334", "120", "11", "86", "30", "20", "40" ], "textposition": "top center", "type": "scatter", "x": [ "Objectives", "Constraints", "Trade-Offs", "Decision Variables", "Options", "Solutions", "State Variables" ], "y": [ 334, 120, 11, 86, 30, 20, 40 ] } ], "layout": { "height": 400, "template": { "data": { "bar": [ { "error_x": { "color": "#2a3f5f" }, "error_y": { "color": "#2a3f5f" }, "marker": { "line": { "color": "#E5ECF6", "width": 0.5 }, "pattern": { "fillmode": "overlay", "size": 10, "solidity": 0.2 } }, "type": "bar" } ], "barpolar": [ { "marker": { "line": { "color": "#E5ECF6", "width": 0.5 }, "pattern": { "fillmode": "overlay", "size": 10, "solidity": 0.2 } }, "type": "barpolar" } ], "carpet": [ { "aaxis": { "endlinecolor": "#2a3f5f", "gridcolor": "white", "linecolor": "white", "minorgridcolor": "white", "startlinecolor": "#2a3f5f" }, "baxis": { "endlinecolor": "#2a3f5f", "gridcolor": "white", "linecolor": "white", "minorgridcolor": "white", "startlinecolor": "#2a3f5f" }, "type": "carpet" } ], "choropleth": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "type": "choropleth" } ], "contour": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "colorscale": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 
0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "type": "contour" } ], "contourcarpet": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "type": "contourcarpet" } ], "heatmap": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "colorscale": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "type": "heatmap" } ], "histogram": [ { "marker": { "pattern": { "fillmode": "overlay", "size": 10, "solidity": 0.2 } }, "type": "histogram" } ], "histogram2d": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "colorscale": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "type": "histogram2d" } ], "histogram2dcontour": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "colorscale": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "type": "histogram2dcontour" } ], "mesh3d": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "type": "mesh3d" } ], "parcoords": [ { "line": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "parcoords" } ], "pie": [ { "automargin": true, "type": "pie" } ], "scatter": [ { "fillpattern": { "fillmode": "overlay", "size": 10, "solidity": 0.2 }, "type": "scatter" } 
], "scatter3d": [ { "line": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scatter3d" } ], "scattercarpet": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scattercarpet" } ], "scattergeo": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scattergeo" } ], "scattergl": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scattergl" } ], "scattermap": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scattermap" } ], "scattermapbox": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scattermapbox" } ], "scatterpolar": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scatterpolar" } ], "scatterpolargl": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scatterpolargl" } ], "scatterternary": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scatterternary" } ], "surface": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "colorscale": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "type": "surface" } ], "table": [ { "cells": { "fill": { "color": "#EBF0F8" }, "line": { "color": "white" } }, "header": { "fill": { "color": "#C8D4E3" }, "line": { "color": "white" } }, "type": "table" } ] }, "layout": { "annotationdefaults": { "arrowcolor": "#2a3f5f", "arrowhead": 0, "arrowwidth": 1 }, "autotypenumbers": "strict", "coloraxis": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "colorscale": { "diverging": [ [ 0, "#8e0152" ], [ 0.1, "#c51b7d" ], [ 0.2, "#de77ae" ], [ 0.3, "#f1b6da" ], [ 0.4, "#fde0ef" ], [ 0.5, "#f7f7f7" ], [ 0.6, 
"#e6f5d0" ], [ 0.7, "#b8e186" ], [ 0.8, "#7fbc41" ], [ 0.9, "#4d9221" ], [ 1, "#276419" ] ], "sequential": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "sequentialminus": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ] }, "colorway": [ "#636efa", "#EF553B", "#00cc96", "#ab63fa", "#FFA15A", "#19d3f3", "#FF6692", "#B6E880", "#FF97FF", "#FECB52" ], "font": { "color": "#2a3f5f" }, "geo": { "bgcolor": "white", "lakecolor": "white", "landcolor": "#E5ECF6", "showlakes": true, "showland": true, "subunitcolor": "white" }, "hoverlabel": { "align": "left" }, "hovermode": "closest", "mapbox": { "style": "light" }, "paper_bgcolor": "white", "plot_bgcolor": "#E5ECF6", "polar": { "angularaxis": { "gridcolor": "white", "linecolor": "white", "ticks": "" }, "bgcolor": "#E5ECF6", "radialaxis": { "gridcolor": "white", "linecolor": "white", "ticks": "" } }, "scene": { "xaxis": { "backgroundcolor": "#E5ECF6", "gridcolor": "white", "gridwidth": 2, "linecolor": "white", "showbackground": true, "ticks": "", "zerolinecolor": "white" }, "yaxis": { "backgroundcolor": "#E5ECF6", "gridcolor": "white", "gridwidth": 2, "linecolor": "white", "showbackground": true, "ticks": "", "zerolinecolor": "white" }, "zaxis": { "backgroundcolor": "#E5ECF6", "gridcolor": "white", "gridwidth": 2, "linecolor": "white", "showbackground": true, "ticks": "", "zerolinecolor": "white" } }, "shapedefaults": { "line": { "color": "#2a3f5f" } }, "ternary": { "aaxis": { "gridcolor": 
"white", "linecolor": "white", "ticks": "" }, "baxis": { "gridcolor": "white", "linecolor": "white", "ticks": "" }, "bgcolor": "#E5ECF6", "caxis": { "gridcolor": "white", "linecolor": "white", "ticks": "" } }, "title": { "x": 0.05 }, "xaxis": { "automargin": true, "gridcolor": "white", "linecolor": "white", "ticks": "", "title": { "standoff": 15 }, "zerolinecolor": "white", "zerolinewidth": 2 }, "yaxis": { "automargin": true, "gridcolor": "white", "linecolor": "white", "ticks": "", "title": { "standoff": 15 }, "zerolinecolor": "white", "zerolinewidth": 2 } } }, "title": { "text": "AI Extraction Volume: Total Mentions vs Unique Items" }, "xaxis": { "title": { "text": "Component Type" } }, "yaxis": { "title": { "text": "Count" } } } }, "text/html": [ "
\n", "
" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "\n", "================================================================================\n", "💾 SAVING VALIDATION TABLES\n", "================================================================================\n", "\n", "✓ Saved: Human_vs_AI_Validation_Table.csv\n", "✓ Saved: Human_vs_AI_Detailed_Comparison.csv\n", "✓ Saved: Human_vs_AI_Keyword_Validation.csv\n", "✓ Saved: Validation_Table_LaTeX.tex\n", "\n", "📊 Files saved to: publication_outputs/tables/\n", "\n", "✓ Variables created:\n", " • validation_pub_table_df: Publication-ready table\n", " • validation_comparison_df: Detailed comparison\n", " • validation_keyword_df: Keyword validation\n", " • human_coding_reference: Human definitions\n", " • ai_extraction_summary: AI statistics\n", "\n", "================================================================================\n", "✅ VALIDATION COMPLETE\n", "================================================================================\n", "\n", "📋 Table ready for publication:\n", " • Human definitions vs AI extraction\n", " • Coverage statistics\n", " • Example components\n", " • LaTeX format available\n", "\n", "💡 Next steps:\n", " • Review manual validation interface above\n", " • Check for false positives/negatives\n", " • Add validation notes to manuscript\n", " • Include validation table in methods section\n", "\n", "================================================================================\n" ] } ], "source": [ "# CELL 21C: # CELL: Human Coding vs AI Extraction Validation (FIXED)\n", "\n", "print(\"=\"*80)\n", "print(\"🔬 DECISION COMPONENTS: HUMAN CODING vs AI EXTRACTION\")\n", "print(\"=\"*80 + \"\\n\")\n", "\n", "import pandas as pd\n", "import numpy as np\n", "from collections import defaultdict\n", "import ipywidgets as widgets\n", "from IPython.display import display, clear_output, HTML\n", "import plotly.graph_objects as go\n", "\n", "# 
==========================================\n", "# 1. CHECK REQUIREMENTS\n", "# ==========================================\n", "print(\"Step 1: Checking requirements...\")\n", "print(\"-\"*80)\n", "\n", "required_vars = {\n", " 'decision_components': 'AI-extracted components',\n", " 'optimization_formulations': 'Optimization setups',\n", " 'documents': 'Interview texts'\n", "}\n", "\n", "has_data = True\n", "for var_name, description in required_vars.items():\n", " if var_name not in globals() or not globals()[var_name]:\n", " print(f\"❌ {var_name} not found! ({description})\")\n", " has_data = False\n", " else:\n", " print(f\"✓ {var_name}: {len(globals()[var_name])} items\")\n", "\n", "print()\n", "\n", "# Check decision_components structure\n", "if has_data:\n", " print(\"Checking data structure...\")\n", " sample_doc = list(documents.keys())[0]\n", " print(f\" Sample document: {sample_doc}\")\n", " print(f\" Type of decision_components: {type(decision_components)}\")\n", " print(f\" Type of decision_components['{sample_doc}']: {type(decision_components[sample_doc])}\")\n", " \n", " # Verify structure\n", " if isinstance(decision_components[sample_doc], dict):\n", " print(f\" ✓ Correct structure (dict of dicts)\")\n", " print(f\" Component types available: {list(decision_components[sample_doc].keys())}\")\n", " else:\n", " print(f\" ⚠️ Unexpected structure: {type(decision_components[sample_doc])}\")\n", " print()\n", "\n", "# ==========================================\n", "# 2. 
DEFINE HUMAN CODING REFERENCE TABLE\n", "# ==========================================\n", "if has_data:\n", " print(\"Step 2: Loading human coding reference definitions...\")\n", " print(\"-\"*80)\n", " \n", " # Human coding reference from decision theory\n", " human_reference = {\n", " 'Objectives': {\n", " 'definition': 'The goals or desired outcomes that operators seek to achieve or optimize.',\n", " 'examples': ['Community education', 'Providing safe water'],\n", " 'theory': 'From multi-objective optimization theory',\n", " 'keywords': ['goal', 'objective', 'aim', 'achieve', 'optimize', 'ensure']\n", " },\n", " 'Constraints': {\n", " 'definition': 'Limitations or boundaries that restrict available actions.',\n", " 'examples': ['Limited funding', 'Workforce shortage'],\n", " 'theory': 'From constraint satisfaction problems',\n", " 'keywords': ['limited', 'shortage', 'cannot', 'restriction', 'barrier']\n", " },\n", " 'Trade-Offs': {\n", " 'definition': 'Situations where improving one objective necessarily compromises another.',\n", " 'examples': ['Improving operator satisfaction compromises providing water to all customers'],\n", " 'theory': 'From Pareto optimization theory',\n", " 'keywords': ['trade-off', 'compromise', 'versus', 'sacrifice', 'balance']\n", " },\n", " 'Decision Variables': {\n", " 'definition': 'Controllable factors operators can adjust to influence system performance.',\n", " 'examples': ['Operator work hours'],\n", " 'theory': 'From optimization control variables',\n", " 'keywords': ['adjust', 'control', 'change', 'modify', 'set']\n", " },\n", " 'Options': {\n", " 'definition': 'Discrete alternative courses of action that are available to operators when facing a decision point.',\n", " 'examples': ['Residents choose between treated and natural water sources'],\n", " 'theory': 'From discrete choice theory',\n", " 'keywords': ['option', 'alternative', 'choice', 'either', 'or']\n", " },\n", " 'Solutions': {\n", " 'definition': 'Strategies, 
workarounds, or approaches that operators or residents have implemented to address challenges; proven methods for navigating system constraints.',\n", " 'examples': ['Remote worker program', 'Managerial capacity'],\n", " 'theory': 'From solution space in optimization',\n", " 'keywords': ['solution', 'workaround', 'strategy', 'implemented', 'program']\n", " },\n", " 'State Variables': {\n", " 'definition': 'Fixed or external factors that define the problem context but are not directly controllable by operators.',\n", " 'examples': ['Permafrost', 'Seasonal fluctuations'],\n", " 'theory': 'From state-space representation',\n", " 'keywords': ['given', 'external', 'environmental', 'fixed', 'uncontrollable']\n", " }\n", " }\n", " \n", " print(f\"✓ Loaded {len(human_reference)} component type definitions\")\n", " print()\n", " \n", " # ==========================================\n", " # 3. CREATE AI EXTRACTION SUMMARY\n", " # ==========================================\n", " print(\"Step 3: Summarizing AI extraction results...\")\n", " print(\"-\"*80)\n", " \n", " doc_list = sorted(documents.keys())\n", " \n", " # Calculate AI extraction statistics\n", " ai_summary = {}\n", " \n", " for comp_type in human_reference.keys():\n", " # Collect all extracted components of this type\n", " all_components = set()\n", " counts_per_interview = []\n", " \n", " for doc_name in doc_list:\n", " # FIXED: Safely access nested dict\n", " if doc_name in decision_components and isinstance(decision_components[doc_name], dict):\n", " comps = decision_components[doc_name].get(comp_type, set())\n", " all_components.update(comps)\n", " counts_per_interview.append(len(comps))\n", " else:\n", " counts_per_interview.append(0)\n", " \n", " # Calculate statistics\n", " ai_summary[comp_type] = {\n", " 'total_mentions': sum(counts_per_interview),\n", " 'unique_components': len(all_components),\n", " 'mean_per_interview': np.mean(counts_per_interview) if counts_per_interview else 0,\n", " 
'std_per_interview': np.std(counts_per_interview) if counts_per_interview else 0,\n", " 'min_per_interview': min(counts_per_interview) if counts_per_interview else 0,\n", " 'max_per_interview': max(counts_per_interview) if counts_per_interview else 0,\n", " 'interviews_present': sum(1 for c in counts_per_interview if c > 0),\n", " 'percentage_coverage': 100 * sum(1 for c in counts_per_interview if c > 0) / len(doc_list) if doc_list else 0,\n", " 'example_components': sorted(list(all_components))[:5] # Top 5 examples\n", " }\n", " \n", " print(f\"✓ Calculated AI extraction statistics\")\n", " print()\n", " \n", " # ==========================================\n", " # 4. BUILD COMPARISON TABLE\n", " # ==========================================\n", " print(\"Step 4: Building human vs AI comparison table...\")\n", " print(\"-\"*80)\n", " \n", " comparison_data = []\n", " \n", " for comp_type in human_reference.keys():\n", " human_def = human_reference[comp_type]\n", " ai_stats = ai_summary[comp_type]\n", " \n", " comparison_data.append({\n", " 'Component Type': comp_type,\n", " 'Human Definition': human_def['definition'],\n", " 'Human Examples': ', '.join(human_def['examples']),\n", " 'Theoretical Basis': human_def['theory'],\n", " 'AI Total Mentions': ai_stats['total_mentions'],\n", " 'AI Unique Items': ai_stats['unique_components'],\n", " 'AI Avg per Interview': f\"{ai_stats['mean_per_interview']:.1f}\",\n", " 'AI Coverage (% interviews)': f\"{ai_stats['percentage_coverage']:.0f}%\",\n", " 'AI Examples': ', '.join(ai_stats['example_components'][:3])\n", " })\n", " \n", " comparison_df = pd.DataFrame(comparison_data)\n", " \n", " print(f\"✓ Comparison table created: {len(comparison_data)} rows\")\n", " print()\n", " \n", " # ==========================================\n", " # 5. 
PUBLICATION-READY TABLE\n", " # ==========================================\n", " print(\"Step 5: Creating publication-ready validation table...\")\n", " print(\"-\"*80)\n", " \n", " # Simplified table for publication\n", " pub_table_data = []\n", " \n", " for comp_type in human_reference.keys():\n", " human_def = human_reference[comp_type]\n", " ai_stats = ai_summary[comp_type]\n", " \n", " pub_table_data.append({\n", " 'Component': comp_type,\n", " 'Definition': human_def['definition'],\n", " 'Example (Human-Coded)': human_def['examples'][0],\n", " 'AI-Identified (n)': ai_stats['total_mentions'],\n", " 'Unique AI Items': ai_stats['unique_components'],\n", " 'Coverage': f\"{ai_stats['percentage_coverage']:.0f}%\",\n", " 'AI Example': ai_stats['example_components'][0] if ai_stats['example_components'] else 'N/A'\n", " })\n", " \n", " pub_table_df = pd.DataFrame(pub_table_data)\n", " \n", " print(f\"✓ Publication table created\")\n", " print()\n", " \n", " # ==========================================\n", " # 6. 
KEYWORD VALIDATION\n", " # ==========================================\n", " print(\"Step 6: Validating keyword-based extraction...\")\n", " print(\"-\"*80)\n", " \n", " keyword_validation = []\n", " \n", " for comp_type in human_reference.keys():\n", " human_keywords = human_reference[comp_type]['keywords']\n", " ai_stats = ai_summary[comp_type]\n", " \n", " # Check if AI-extracted components contain expected keywords\n", " all_ai_components = set()\n", " for doc_name in doc_list:\n", " if doc_name in decision_components and isinstance(decision_components[doc_name], dict):\n", " all_ai_components.update(decision_components[doc_name].get(comp_type, set()))\n", " \n", " # Count keyword matches\n", " keyword_matches = 0\n", " for component in all_ai_components:\n", " component_lower = component.lower()\n", " if any(kw in component_lower for kw in human_keywords):\n", " keyword_matches += 1\n", " \n", " match_rate = 100 * keyword_matches / len(all_ai_components) if all_ai_components else 0\n", " \n", " keyword_validation.append({\n", " 'Component Type': comp_type,\n", " 'Human Keywords': ', '.join(human_keywords[:5]),\n", " 'AI Items Matched': keyword_matches,\n", " 'AI Items Total': len(all_ai_components),\n", " 'Match Rate': f\"{match_rate:.1f}%\"\n", " })\n", " \n", " keyword_df = pd.DataFrame(keyword_validation)\n", " \n", " print(f\"✓ Keyword validation completed\")\n", " print()\n", " \n", " # ==========================================\n", " # 7. 
MANUAL VALIDATION INTERFACE\n", " # ==========================================\n", " print(\"=\"*80)\n", " print(\"🎛️ MANUAL VALIDATION INTERFACE\")\n", " print(\"=\"*80 + \"\\n\")\n", " \n", " # Create manual validation widget\n", " validation_output = widgets.Output()\n", " table_output = widgets.Output()\n", " \n", " # Component selector\n", " component_selector = widgets.Dropdown(\n", " options=list(human_reference.keys()),\n", " value=list(human_reference.keys())[0],\n", " description='Component:',\n", " style={'description_width': '80px'},\n", " layout=widgets.Layout(width='300px')\n", " )\n", " \n", " # Interview selector\n", " interview_selector = widgets.Dropdown(\n", " options=doc_list,\n", " value=doc_list[0],\n", " description='Interview:',\n", " style={'description_width': '80px'},\n", " layout=widgets.Layout(width='250px')\n", " )\n", " \n", " # Update function - FIXED\n", " def update_validation(change):\n", " selected_comp = component_selector.value\n", " selected_doc = interview_selector.value\n", " \n", " with validation_output:\n", " clear_output(wait=True)\n", " \n", " display(HTML(f\"

Validation: {selected_comp} in {selected_doc}

\"))\n", " \n", " print(\"=\"*60)\n", " print(\"📖 HUMAN CODING REFERENCE\")\n", " print(\"=\"*60)\n", " human_def = human_reference[selected_comp]\n", " print(f\"Definition: {human_def['definition']}\")\n", " print(f\"Examples: {', '.join(human_def['examples'])}\")\n", " print(f\"Keywords: {', '.join(human_def['keywords'][:5])}\")\n", " print()\n", " \n", " print(\"=\"*60)\n", " print(\"🤖 AI EXTRACTION RESULTS\")\n", " print(\"=\"*60)\n", " \n", " # FIXED: Safely access with error handling\n", " try:\n", " if selected_doc in decision_components and isinstance(decision_components[selected_doc], dict):\n", " ai_components = decision_components[selected_doc].get(selected_comp, set())\n", " else:\n", " ai_components = set()\n", " \n", " if ai_components:\n", " print(f\"Found {len(ai_components)} items:\")\n", " for i, comp in enumerate(sorted(ai_components), 1):\n", " print(f\" {i}. {comp}\")\n", " else:\n", " print(\"No items extracted for this component type in this interview.\")\n", " \n", " except Exception as e:\n", " print(f\"Error accessing components: {e}\")\n", " print(f\"Data structure issue - please check decision_components format\")\n", " \n", " print()\n", " print(\"=\"*60)\n", " print(\"✓ VALIDATION NOTES\")\n", " print(\"=\"*60)\n", " print(\"Review AI-extracted items above:\")\n", " print(\" • Do they match the human definition?\")\n", " print(\" • Are they appropriate for this component type?\")\n", " print(\" • Any false positives (incorrectly classified)?\")\n", " print(\" • Any obvious false negatives (missing items)?\")\n", " \n", " # Connect selectors\n", " component_selector.observe(update_validation, names='value')\n", " interview_selector.observe(update_validation, names='value')\n", " \n", " # Display interface\n", " print(\"Select component type and interview to review AI extraction:\")\n", " print()\n", " display(widgets.HBox([component_selector, interview_selector]))\n", " display(validation_output)\n", " \n", " # Initial display\n", " 
update_validation(None)\n", " \n", " print()\n", " \n", " # ==========================================\n", " # 8. DISPLAY PUBLICATION TABLE\n", " # ==========================================\n", " print(\"=\"*80)\n", " print(\"📊 PUBLICATION-READY VALIDATION TABLE\")\n", " print(\"=\"*80 + \"\\n\")\n", " \n", " with table_output:\n", " display(HTML(\"

Table X. Decision Problem Components: Human Coding Framework vs AI Extraction

\"))\n", " \n", " # Style the table\n", " styled_table = pub_table_df.style.set_properties(**{\n", " 'text-align': 'left',\n", " 'white-space': 'pre-wrap',\n", " 'word-wrap': 'break-word'\n", " }).set_table_styles([\n", " {'selector': 'th', 'props': [('text-align', 'left'), ('font-weight', 'bold')]},\n", " {'selector': 'td', 'props': [('padding', '8px')]}\n", " ])\n", " \n", " display(styled_table)\n", " \n", " print()\n", " print(\"=\"*60)\n", " print(\"📊 SUMMARY STATISTICS\")\n", " print(\"=\"*60)\n", " print()\n", " \n", " total_ai_mentions = sum(ai_summary[ct]['total_mentions'] for ct in human_reference.keys())\n", " total_unique_items = sum(ai_summary[ct]['unique_components'] for ct in human_reference.keys())\n", " \n", " print(f\"Total AI extractions across all types: {total_ai_mentions}\")\n", " print(f\"Total unique items identified: {total_unique_items}\")\n", " print(f\"Average items per component type: {total_unique_items / len(human_reference):.1f}\")\n", " print()\n", " \n", " print(\"Coverage by component type:\")\n", " for comp_type in human_reference.keys():\n", " coverage = ai_summary[comp_type]['percentage_coverage']\n", " total = ai_summary[comp_type]['total_mentions']\n", " print(f\" • {comp_type}: {coverage:.0f}% of interviews ({total} total mentions)\")\n", " \n", " print()\n", " print(\"=\"*60)\n", " print(\"✓ VALIDATION FINDINGS\")\n", " print(\"=\"*60)\n", " print()\n", " \n", " # Calculate overall statistics\n", " avg_coverage = np.mean([ai_summary[ct]['percentage_coverage'] for ct in human_reference.keys()])\n", " print(f\"Average coverage across component types: {avg_coverage:.1f}%\")\n", " print()\n", " \n", " # Identify well-covered vs sparse components\n", " well_covered = [ct for ct in human_reference.keys() if ai_summary[ct]['percentage_coverage'] >= 75]\n", " sparse = [ct for ct in human_reference.keys() if ai_summary[ct]['percentage_coverage'] < 50]\n", " \n", " if well_covered:\n", " print(f\"Well-covered components (≥75% 
interviews):\")\n", " for ct in well_covered:\n", " print(f\" • {ct}: {ai_summary[ct]['percentage_coverage']:.0f}%\")\n", " print()\n", " \n", " if sparse:\n", " print(f\"Sparsely mentioned components (<50% interviews):\")\n", " for ct in sparse:\n", " print(f\" • {ct}: {ai_summary[ct]['percentage_coverage']:.0f}% (may be domain-specific)\")\n", " print()\n", " \n", " display(table_output)\n", " \n", " print()\n", " print(\"=\"*80)\n", " print(\"📈 DETAILED COMPARISON TABLE\")\n", " print(\"=\"*80 + \"\\n\")\n", " \n", " display(HTML(\"

Detailed Human vs AI Comparison

\"))\n", " display(comparison_df)\n", " \n", " print()\n", " print(\"=\"*80)\n", " print(\"🔑 KEYWORD VALIDATION\")\n", " print(\"=\"*80 + \"\\n\")\n", " \n", " display(HTML(\"

Keyword-Based Validation

\"))\n", " display(keyword_df)\n", " \n", " # ==========================================\n", " # 9. CREATE VISUALIZATIONS\n", " # ==========================================\n", " print()\n", " print(\"=\"*80)\n", " print(\"📊 VALIDATION VISUALIZATIONS\")\n", " print(\"=\"*80 + \"\\n\")\n", " \n", " # 1. Coverage bar chart\n", " coverage_data = [(ct, ai_summary[ct]['percentage_coverage']) for ct in human_reference.keys()]\n", " coverage_sorted = sorted(coverage_data, key=lambda x: -x[1])\n", " \n", " fig1 = go.Figure(data=[\n", " go.Bar(\n", " x=[x[0] for x in coverage_sorted],\n", " y=[x[1] for x in coverage_sorted],\n", " marker_color=['#06A77D' if x[1] >= 75 else '#F77F00' if x[1] >= 50 else '#D62839' \n", " for x in coverage_sorted],\n", " text=[f\"{x[1]:.0f}%\" for x in coverage_sorted],\n", " textposition='outside'\n", " )\n", " ])\n", " \n", " fig1.update_layout(\n", " title='AI Extraction Coverage by Component Type',\n", " xaxis_title='Component Type',\n", " yaxis_title='Percentage of Interviews',\n", " height=400,\n", " yaxis_range=[0, 110]\n", " )\n", " \n", " display(fig1)\n", " \n", " # 2. 
Total mentions comparison\n", " fig2 = go.Figure(data=[\n", " go.Bar(\n", " x=list(human_reference.keys()),\n", " y=[ai_summary[ct]['total_mentions'] for ct in human_reference.keys()],\n", " marker_color='#2E86AB',\n", " text=[ai_summary[ct]['total_mentions'] for ct in human_reference.keys()],\n", " textposition='outside',\n", " name='Total Mentions'\n", " )\n", " ])\n", " \n", " fig2.add_trace(go.Scatter(\n", " x=list(human_reference.keys()),\n", " y=[ai_summary[ct]['unique_components'] for ct in human_reference.keys()],\n", " mode='markers+text',\n", " marker=dict(size=12, color='#F77F00', symbol='diamond'),\n", " text=[ai_summary[ct]['unique_components'] for ct in human_reference.keys()],\n", " textposition='top center',\n", " name='Unique Items'\n", " ))\n", " \n", " fig2.update_layout(\n", " title='AI Extraction Volume: Total Mentions vs Unique Items',\n", " xaxis_title='Component Type',\n", " yaxis_title='Count',\n", " height=400\n", " )\n", " \n", " display(fig2)\n", " \n", " # ==========================================\n", " # 10. 
SAVE VALIDATION TABLES\n", " # ==========================================\n", " print()\n", " print(\"=\"*80)\n", " print(\"💾 SAVING VALIDATION TABLES\")\n", " print(\"=\"*80 + \"\\n\")\n", " \n", " from pathlib import Path\n", " output_dir = Path('publication_outputs/tables')\n", " output_dir.mkdir(parents=True, exist_ok=True)\n", " \n", " # Save publication table\n", " pub_table_df.to_csv(output_dir / 'Human_vs_AI_Validation_Table.csv', index=False)\n", " print(f\"✓ Saved: Human_vs_AI_Validation_Table.csv\")\n", " \n", " # Save detailed comparison\n", " comparison_df.to_csv(output_dir / 'Human_vs_AI_Detailed_Comparison.csv', index=False)\n", " print(f\"✓ Saved: Human_vs_AI_Detailed_Comparison.csv\")\n", " \n", " # Save keyword validation\n", " keyword_df.to_csv(output_dir / 'Human_vs_AI_Keyword_Validation.csv', index=False)\n", " print(f\"✓ Saved: Human_vs_AI_Keyword_Validation.csv\")\n", " \n", " # Create LaTeX table\n", " # escape=True: the Coverage column holds values such as 75%, and an unescaped % starts a LaTeX comment\n", " latex_output = pub_table_df.to_latex(\n", " index=False,\n", " column_format='lp{4cm}p{2.5cm}ccp{1cm}p{2.5cm}',\n", " caption='Decision Problem Components: Human Coding Framework vs AI Extraction Results',\n", " label='tab:validation',\n", " escape=True\n", " )\n", " \n", " latex_file = output_dir / 'Validation_Table_LaTeX.tex'\n", " with open(latex_file, 'w') as f:\n", " f.write(latex_output)\n", " print(f\"✓ Saved: Validation_Table_LaTeX.tex\")\n", " \n", " print()\n", " print(f\"📊 Files saved to: publication_outputs/tables/\")\n", " \n", " # Store in global scope\n", " globals()['validation_comparison_df'] = comparison_df\n", " globals()['validation_pub_table_df'] = pub_table_df\n", " globals()['validation_keyword_df'] = keyword_df\n", " globals()['human_coding_reference'] = human_reference\n", " globals()['ai_extraction_summary'] = ai_summary\n", " \n", " print()\n", " print(\"✓ Variables created:\")\n", " print(\" • validation_pub_table_df: Publication-ready table\")\n", " print(\" • validation_comparison_df: Detailed comparison\")\n", " 
print(\" • validation_keyword_df: Keyword validation\")\n", " print(\" • human_coding_reference: Human definitions\")\n", " print(\" • ai_extraction_summary: AI statistics\")\n", " \n", " print()\n", " print(\"=\"*80)\n", " print(\"✅ VALIDATION COMPLETE\")\n", " print(\"=\"*80 + \"\\n\")\n", " \n", " print(\"📋 Table ready for publication:\")\n", " print(\" • Human definitions vs AI extraction\")\n", " print(\" • Coverage statistics\")\n", " print(\" • Example components\")\n", " print(\" • LaTeX format available\")\n", " print()\n", " \n", " print(\"💡 Next steps:\")\n", " print(\" • Review manual validation interface above\")\n", " print(\" • Check for false positives/negatives\")\n", " print(\" • Add validation notes to manuscript\")\n", " print(\" • Include validation table in methods section\")\n", "\n", "else:\n", " print(\"⚠️ Missing required data - run Decision Components cell first\")\n", "\n", "print(\"\\n\" + \"=\"*80)" ] }, { "cell_type": "code", "execution_count": 33, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "================================================================================\n", "🔗 LINKING SVOs TO DECISION COMPONENTS\n", "================================================================================\n", "\n", "Step 1: Checking requirements...\n", "--------------------------------------------------------------------------------\n", " ✓ svo_extractions\n", " ✓ topic_mappings\n", " ✓ documents\n", "\n", "Step 2: Defining decision component patterns...\n", "--------------------------------------------------------------------------------\n", "✓ Defined 52 decision patterns\n", "\n", "Step 3: Extracting decision components from text...\n", "--------------------------------------------------------------------------------\n", "✓ Extracted decision components:\n", " • objectives: 227 instances\n", " • tradeoffs: 655 instances\n", " • alternatives: 289 instances\n", " • constraints: 355 instances\n", "\n", 
"Step 4: Linking decision components to SVOs...\n", "--------------------------------------------------------------------------------\n", "✓ Created 1526 component-SVO links\n", "\n", "================================================================================\n", "📊 DECISION COMPONENTS BY SCIENCE DOMAIN\n", "================================================================================\n", "\n", "Climate Science:\n", " • objectives: 227\n", " • tradeoffs: 655\n", " • alternatives: 289\n", " • constraints: 355\n", "\n", "Economics & Resources:\n", " • objectives: 227\n", " • tradeoffs: 655\n", " • alternatives: 289\n", " • constraints: 355\n", "\n", "Environmental Health:\n", " • objectives: 217\n", " • tradeoffs: 645\n", " • alternatives: 285\n", " • constraints: 352\n", "\n", "Governance & Policy:\n", " • objectives: 217\n", " • tradeoffs: 645\n", " • alternatives: 285\n", " • constraints: 352\n", "\n", "Hydrological Science:\n", " • objectives: 127\n", " • tradeoffs: 385\n", " • alternatives: 153\n", " • constraints: 181\n", "\n", "Infrastructure Engineering:\n", " • objectives: 227\n", " • tradeoffs: 655\n", " • alternatives: 289\n", " • constraints: 355\n", "\n", "Social Systems:\n", " • objectives: 129\n", " • tradeoffs: 341\n", " • alternatives: 149\n", " • constraints: 193\n", "\n", "Technical Operations:\n", " • objectives: 227\n", " • tradeoffs: 655\n", " • alternatives: 289\n", " • constraints: 355\n", "\n", "================================================================================\n", "🎯 OPTIMIZATION-READY STRUCTURE\n", "================================================================================\n", "\n", "Optimization-ready structure by domain:\n", "\n", "Social Systems:\n", " Variables: 5 SVOs\n", " Decisions: 812 components\n", " Sample variables: residents, users, household\n", "\n", "Technical Operations:\n", " Variables: 4 SVOs\n", " Decisions: 1526 components\n", " Sample variables: staff, certification level, hours\n", "\n", 
"Climate Science:\n", " Variables: 8 SVOs\n", " Decisions: 1526 components\n", " Sample variables: seasonal, temperature, frost depth\n", "\n", "Economics & Resources:\n", " Variables: 11 SVOs\n", " Decisions: 1526 components\n", " Sample variables: capital cost, revenue, value\n", "\n", "Governance & Policy:\n", " Variables: 5 SVOs\n", " Decisions: 1499 components\n", " Sample variables: enforcement, permit, standard\n", "\n", "Environmental Health:\n", " Variables: 8 SVOs\n", " Decisions: 1499 components\n", " Sample variables: water quality, coliform, turbidity\n", "\n", "Infrastructure Engineering:\n", " Variables: 4 SVOs\n", " Decisions: 1526 components\n", " Sample variables: capacity, pressure, condition\n", "\n", "Hydrological Science:\n", " Variables: 5 SVOs\n", " Decisions: 846 components\n", " Sample variables: discharge, water level, height\n", "\n", "================================================================================\n", "✅ COMPONENT LINKING COMPLETE\n", "================================================================================\n", "\n", "Variables created:\n", " • decision_components (dict) - By component type\n", " • component_svo_links (list) - Links between components and SVOs\n", " • domain_components (dict) - Components by domain\n", " • optimization_structure (dict) - Ready for model formulation\n", "\n", "Total:\n", " • 1526 component-SVO links\n", " • 8 domains with decision components\n", "\n", "💡 Next steps:\n", " → Cell 23: Build science backbone network\n", " → Cell 24: Visualize network with overlays\n", "\n", "================================================================================\n" ] } ], "source": [ "# CELL 22: Link SVOs to Decision Components\n", "\n", "print(\"=\"*80)\n", "print(\"🔗 LINKING SVOs TO DECISION COMPONENTS\")\n", "print(\"=\"*80 + \"\\n\")\n", "\n", "import spacy\n", "from collections import defaultdict\n", "\n", "# ==========================================\n", "# 1. 
CHECK REQUIREMENTS\n", "# ==========================================\n", "print(\"Step 1: Checking requirements...\")\n", "print(\"-\"*80)\n", "\n", "required_vars = {\n", " 'svo_extractions': 'svo_extractions' in globals(),\n", " 'topic_mappings': 'topic_mappings' in globals(),\n", " 'documents': 'documents' in globals()\n", "}\n", "\n", "all_good = True\n", "for var, exists in required_vars.items():\n", " status = \"✓\" if exists else \"✗\"\n", " print(f\" {status} {var}\")\n", " if not exists:\n", " all_good = False\n", "\n", "print()\n", "\n", "if not all_good:\n", " print(\"⚠️ Missing required variables!\")\n", " print(\" Run previous cells: Cell 20 (mappings), Cell 21 (SVOs)\")\n", "else:\n", " # ==========================================\n", " # 2. DEFINE DECISION COMPONENT PATTERNS\n", " # ==========================================\n", " print(\"Step 2: Defining decision component patterns...\")\n", " print(\"-\"*80)\n", " \n", " # Patterns that indicate decision components\n", " decision_patterns = {\n", " 'objectives': [\n", " 'need to', 'want to', 'goal', 'objective', 'aim',\n", " 'improve', 'reduce', 'increase', 'maintain', 'ensure',\n", " 'minimize', 'maximize', 'optimize', 'achieve', 'prevent'\n", " ],\n", " 'constraints': [\n", " 'cannot', 'limited', 'constraint', 'restriction', 'requirement',\n", " 'must', 'have to', 'required', 'regulation', 'standard',\n", " 'budget', 'shortage', 'lack', 'challenge', 'difficult'\n", " ],\n", " 'alternatives': [\n", " 'option', 'alternative', 'choice', 'solution', 'approach',\n", " 'could', 'might', 'either', 'instead', 'different way'\n", " ],\n", " 'tradeoffs': [\n", " 'versus', 'trade-off', 'tradeoff', 'balance', 'compromise',\n", " 'but', 'however', 'on the other hand', 'cost benefit',\n", " 'advantage', 'disadvantage', 'pros and cons'\n", " ]\n", " }\n", " \n", " print(f\"✓ Defined {sum(len(p) for p in decision_patterns.values())} decision patterns\")\n", " print()\n", " \n", " # 
==========================================\n", " # 3. EXTRACT DECISION COMPONENTS\n", " # ==========================================\n", " print(\"Step 3: Extracting decision components from text...\")\n", " print(\"-\"*80)\n", " \n", " decision_components = defaultdict(list)\n", " \n", " # Scan documents for decision language\n", " for doc_name, doc_text in documents.items():\n", " sentences = doc_text.split('.')\n", " \n", " for sentence in sentences:\n", " sentence_lower = sentence.lower().strip()\n", " \n", " if len(sentence_lower) < 20: # Skip very short sentences\n", " continue\n", " \n", " # Check each pattern category\n", " for component_type, patterns in decision_patterns.items():\n", " for pattern in patterns:\n", " if pattern in sentence_lower:\n", " decision_components[component_type].append({\n", " 'document': doc_name,\n", " 'text': sentence.strip()[:200],\n", " 'pattern': pattern\n", " })\n", " break # One match per sentence per category\n", " \n", " print(f\"✓ Extracted decision components:\")\n", " for comp_type, items in decision_components.items():\n", " print(f\" • {comp_type}: {len(items)} instances\")\n", " \n", " print()\n", " \n", " # ==========================================\n", " # 4. 
LINK COMPONENTS TO SVOs\n", " # ==========================================\n", " print(\"Step 4: Linking decision components to SVOs...\")\n", " print(\"-\"*80)\n", " \n", " # Create links where components and SVOs appear in same document\n", " component_svo_links = []\n", " \n", " for comp_type, components in decision_components.items():\n", " for component in components:\n", " doc_name = component['document']\n", " \n", " # Find SVOs from same document\n", " related_svos = [\n", " svo for svo in svo_extractions \n", " if svo['document'] == doc_name\n", " ]\n", " \n", " if related_svos:\n", " component_svo_links.append({\n", " 'component_type': comp_type,\n", " 'component_text': component['text'][:100],\n", " 'document': doc_name,\n", " 'svos': [svo['svo'] for svo in related_svos],\n", " 'domains': list(set(svo['domain'] for svo in related_svos))\n", " })\n", " \n", " print(f\"✓ Created {len(component_svo_links)} component-SVO links\")\n", " print()\n", " \n", " # ==========================================\n", " # 5. SUMMARY BY DOMAIN\n", " # ==========================================\n", " print(\"=\"*80)\n", " print(\"📊 DECISION COMPONENTS BY SCIENCE DOMAIN\")\n", " print(\"=\"*80 + \"\\n\")\n", " \n", " domain_components = defaultdict(lambda: defaultdict(int))\n", " \n", " for link in component_svo_links:\n", " comp_type = link['component_type']\n", " for domain in link['domains']:\n", " domain_components[domain][comp_type] += 1\n", " \n", " for domain in sorted(domain_components.keys()):\n", " print(f\"{domain}:\")\n", " for comp_type, count in domain_components[domain].items():\n", " print(f\" • {comp_type}: {count}\")\n", " print()\n", " \n", " # ==========================================\n", " # 6. 
CREATE OPTIMIZATION-READY STRUCTURE\n", " # ==========================================\n", " print(\"=\"*80)\n", " print(\"🎯 OPTIMIZATION-READY STRUCTURE\")\n", " print(\"=\"*80 + \"\\n\")\n", " \n", " # Group for optimization model formulation\n", " optimization_structure = {}\n", " \n", " for domain in domain_components.keys():\n", " # Get SVOs for this domain\n", " domain_svos = [\n", " svo for svo in svo_extractions \n", " if svo['domain'] == domain\n", " ]\n", " \n", " # Get unique SVOs\n", " unique_svos = list(set(svo['svo'] for svo in domain_svos))\n", " \n", " # Get decision components for this domain\n", " domain_links = [\n", " link for link in component_svo_links \n", " if domain in link['domains']\n", " ]\n", " \n", " optimization_structure[domain] = {\n", " 'measurable_variables': unique_svos,\n", " 'objectives': [\n", " link['component_text'] for link in domain_links \n", " if link['component_type'] == 'objectives'\n", " ][:3], # Top 3\n", " 'constraints': [\n", " link['component_text'] for link in domain_links \n", " if link['component_type'] == 'constraints'\n", " ][:3], # Top 3\n", " 'variable_count': len(unique_svos),\n", " 'decision_count': len(domain_links)\n", " }\n", " \n", " print(\"Optimization-ready structure by domain:\")\n", " for domain, structure in optimization_structure.items():\n", " print(f\"\\n{domain}:\")\n", " print(f\" Variables: {structure['variable_count']} SVOs\")\n", " print(f\" Decisions: {structure['decision_count']} components\")\n", " if structure['measurable_variables']:\n", " print(f\" Sample variables: {', '.join(structure['measurable_variables'][:3])}\")\n", " \n", " print()\n", " \n", " # ==========================================\n", " # 7. 
SAVE RESULTS\n", " # ==========================================\n", " print(\"=\"*80)\n", " print(\"✅ COMPONENT LINKING COMPLETE\")\n", " print(\"=\"*80 + \"\\n\")\n", " \n", " print(\"Variables created:\")\n", " print(\" • decision_components (dict) - By component type\")\n", " print(\" • component_svo_links (list) - Links between components and SVOs\")\n", " print(\" • domain_components (dict) - Components by domain\")\n", " print(\" • optimization_structure (dict) - Ready for model formulation\")\n", " print()\n", " \n", " print(f\"Total:\")\n", " print(f\" • {len(component_svo_links)} component-SVO links\")\n", " print(f\" • {len(optimization_structure)} domains with decision components\")\n", " print()\n", " \n", " print(\"💡 Next steps:\")\n", " print(\" → Cell 23: Build science backbone network\")\n", " print(\" → Cell 24: Visualize network with overlays\")\n", "\n", "print(\"\\n\" + \"=\"*80)" ] }, { "cell_type": "code", "execution_count": 34, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "================================================================================\n", "🕸️ BUILDING SCIENCE BACKBONE NETWORK\n", "================================================================================\n", "\n", "Step 1: Checking requirements...\n", "--------------------------------------------------------------------------------\n", " ✓ science_backbone\n", " ✓ topic_mappings\n", " ✓ svo_extractions\n", " ✓ optimization_structure\n", "\n", "Step 2: Creating network structure...\n", "--------------------------------------------------------------------------------\n", "\n", "Adding domain nodes...\n", " ✓ Added 8 domain nodes\n", "Adding topic nodes...\n", " ✓ Added 25 topic nodes\n", "Adding SVO nodes...\n", " ✓ Added 50 SVO nodes\n", "\n", "Network summary:\n", " • Total nodes: 83\n", " • Total edges: 75\n", " • Domains: 8\n", " • Topics: 25\n", " • SVOs: 50\n", "\n", "Step 3: Calculating network metrics...\n", 
"--------------------------------------------------------------------------------\n", "\n", "Most central domains:\n", " • Hydrological Science: 12 connections (centrality: 0.146)\n", " • Climate Science: 12 connections (centrality: 0.146)\n", " • Economics & Resources: 12 connections (centrality: 0.146)\n", " • Environmental Health: 11 connections (centrality: 0.134)\n", " • Infrastructure Engineering: 8 connections (centrality: 0.098)\n", "\n", "Step 4: Identifying thematic clusters...\n", "--------------------------------------------------------------------------------\n", "\n", "✓ Identified 8 thematic clusters\n", "\n", "Cluster 1: 13 nodes\n", " Domains: Climate Science\n", "\n", "Cluster 2: 13 nodes\n", " Domains: Hydrological Science\n", "\n", "Cluster 3: 13 nodes\n", " Domains: Economics & Resources\n", "\n", "Cluster 4: 12 nodes\n", " Domains: Environmental Health\n", "\n", "Cluster 5: 9 nodes\n", " Domains: Infrastructure Engineering\n", "\n", "Cluster 6: 9 nodes\n", " Domains: Governance & Policy\n", "\n", "Cluster 7: 8 nodes\n", " Domains: Social Systems\n", "\n", "Cluster 8: 6 nodes\n", " Domains: Technical Operations\n", "\n", "Step 5: Annotating with decision components...\n", "--------------------------------------------------------------------------------\n", "✓ 8 domains have optimization-ready information\n", "\n", "================================================================================\n", "✅ NETWORK CONSTRUCTION COMPLETE\n", "================================================================================\n", "\n", "Variables created:\n", " • G (NetworkX graph) - Full science backbone network\n", " • degree_centrality (dict) - Node centrality scores\n", " • node_counts (dict) - Node type counts\n", "\n", "Network statistics:\n", " • Nodes: 83\n", " • Edges: 75\n", " • Average degree: 1.81\n", " • Network density: 0.0220\n", "\n", "💡 Next step:\n", " → Cell 24: Visualize network with interactive overlays\n", "\n", 
"================================================================================\n" ] } ], "source": [ "# CELL 23: Build Science Backbone Network\n", "\n", "print(\"=\"*80)\n", "print(\"🕸️ BUILDING SCIENCE BACKBONE NETWORK\")\n", "print(\"=\"*80 + \"\\n\")\n", "\n", "import networkx as nx\n", "from collections import defaultdict\n", "\n", "# ==========================================\n", "# 1. CHECK REQUIREMENTS\n", "# ==========================================\n", "print(\"Step 1: Checking requirements...\")\n", "print(\"-\"*80)\n", "\n", "required_vars = {\n", " 'science_backbone': 'science_backbone' in globals(),\n", " 'topic_mappings': 'topic_mappings' in globals(),\n", " 'svo_extractions': 'svo_extractions' in globals(),\n", " 'optimization_structure': 'optimization_structure' in globals()\n", "}\n", "\n", "all_good = True\n", "for var, exists in required_vars.items():\n", " status = \"✓\" if exists else \"✗\"\n", " print(f\" {status} {var}\")\n", " if not exists:\n", " all_good = False\n", "\n", "print()\n", "\n", "if not all_good:\n", " print(\"⚠️ Missing required variables!\")\n", " print(\" Run previous cells: 20 (mappings), 21 (SVOs), 22 (components)\")\n", "else:\n", " # ==========================================\n", " # 2. 
CREATE NETWORK STRUCTURE\n", " # ==========================================\n", " print(\"Step 2: Creating network structure...\")\n", " print(\"-\"*80)\n", " \n", " # Initialize network\n", " G = nx.Graph()\n", " \n", " # Track node counts\n", " node_counts = {'domains': 0, 'topics': 0, 'svos': 0}\n", " \n", " # Add domain nodes (large)\n", " print(\"\\nAdding domain nodes...\")\n", " for domain_name in science_backbone.keys():\n", " G.add_node(\n", " domain_name,\n", " node_type='domain',\n", " size=30,\n", " color='#1f77b4' # Blue\n", " )\n", " node_counts['domains'] += 1\n", " \n", " print(f\" ✓ Added {node_counts['domains']} domain nodes\")\n", " \n", " # Add topic nodes (medium) and connect to domains\n", " print(\"Adding topic nodes...\")\n", " for topic in topic_mappings:\n", " topic_id = f\"Topic {topic['topic_id']}\"\n", " primary_domain = topic['primary_domain']\n", " \n", " # Add topic node\n", " G.add_node(\n", " topic_id,\n", " node_type='topic',\n", " size=15,\n", " color='#ff7f0e', # Orange\n", " top_words=', '.join(topic['top_words'][:5]),\n", " domain=primary_domain\n", " )\n", " node_counts['topics'] += 1\n", " \n", " # Connect to primary domain\n", " G.add_edge(topic_id, primary_domain, edge_type='topic-domain')\n", " \n", " # Connect to secondary domains if any\n", " for sec_domain in topic.get('secondary_domains', []):\n", " if sec_domain in G.nodes():\n", " G.add_edge(topic_id, sec_domain, edge_type='topic-domain-secondary')\n", " \n", " print(f\" ✓ Added {node_counts['topics']} topic nodes\")\n", " \n", " # Add SVO nodes (small) and connect to domains\n", " print(\"Adding SVO nodes...\")\n", " svo_to_domain = defaultdict(list)\n", " \n", " for svo_entry in svo_extractions:\n", " svo_name = svo_entry['svo']\n", " domain = svo_entry['domain']\n", " svo_to_domain[svo_name].append(domain)\n", " \n", " for svo_name, domains in svo_to_domain.items():\n", " # Add SVO node\n", " svo_id = f\"SVO: {svo_name}\"\n", " G.add_node(\n", " svo_id,\n", " 
node_type='svo',\n", " size=8,\n", " color='#2ca02c', # Green\n", " variable=svo_name\n", " )\n", " node_counts['svos'] += 1\n", " \n", " # Connect to domains\n", " for domain in set(domains):\n", " if domain in G.nodes():\n", " G.add_edge(svo_id, domain, edge_type='svo-domain')\n", " \n", " print(f\" ✓ Added {node_counts['svos']} SVO nodes\")\n", " \n", " print()\n", " print(\"Network summary:\")\n", " print(f\" • Total nodes: {G.number_of_nodes()}\")\n", " print(f\" • Total edges: {G.number_of_edges()}\")\n", " print(f\" • Domains: {node_counts['domains']}\")\n", " print(f\" • Topics: {node_counts['topics']}\")\n", " print(f\" • SVOs: {node_counts['svos']}\")\n", " \n", " print()\n", " \n", " # ==========================================\n", " # 3. CALCULATE NETWORK METRICS\n", " # ==========================================\n", " print(\"Step 3: Calculating network metrics...\")\n", " print(\"-\"*80)\n", " \n", " # Degree centrality (fraction of the other nodes each node is connected to)\n", " degree_centrality = nx.degree_centrality(G)\n", " \n", " # Find most central domains\n", " domain_centrality = {\n", " node: centrality \n", " for node, centrality in degree_centrality.items()\n", " if G.nodes[node].get('node_type') == 'domain'\n", " }\n", " \n", " print(\"\\nMost central domains:\")\n", " for domain, centrality in sorted(domain_centrality.items(), \n", " key=lambda x: -x[1])[:5]:\n", " connections = G.degree(domain)\n", " print(f\" • {domain}: {connections} connections (centrality: {centrality:.3f})\")\n", " \n", " print()\n", " \n", " # ==========================================\n", " # 4. 
IDENTIFY CLUSTERS\n", " # ==========================================\n", " print(\"Step 4: Identifying thematic clusters...\")\n", " print(\"-\"*80)\n", " \n", " # Find communities by greedy modularity maximization (Clauset-Newman-Moore), not Louvain\n", " try:\n", " from networkx.algorithms import community\n", " communities = community.greedy_modularity_communities(G)\n", " \n", " print(f\"\\n✓ Identified {len(communities)} thematic clusters\")\n", " \n", " for i, comm in enumerate(communities, 1):\n", " domains_in_cluster = [n for n in comm if G.nodes[n].get('node_type') == 'domain']\n", " if domains_in_cluster:\n", " print(f\"\\nCluster {i}: {len(comm)} nodes\")\n", " print(f\" Domains: {', '.join(domains_in_cluster)}\")\n", " \n", " except ImportError:\n", " print(\" ℹ️ Community detection requires networkx.algorithms.community\")\n", " print(\" pip install --upgrade networkx\")\n", " \n", " print()\n", " \n", " # ==========================================\n", " # 5. ANNOTATE WITH DECISION INFO\n", " # ==========================================\n", " print(\"Step 5: Annotating with decision components...\")\n", " print(\"-\"*80)\n", " \n", " # Add decision component counts to domain nodes\n", " for domain in science_backbone.keys():\n", " if domain in optimization_structure:\n", " G.nodes[domain]['variable_count'] = optimization_structure[domain]['variable_count']\n", " G.nodes[domain]['decision_count'] = optimization_structure[domain]['decision_count']\n", " G.nodes[domain]['has_optimization_info'] = True\n", " else:\n", " G.nodes[domain]['has_optimization_info'] = False\n", " \n", " # Count domains with optimization info\n", " opt_domains = sum(\n", " 1 for node in G.nodes() \n", " if G.nodes[node].get('has_optimization_info', False)\n", " )\n", " \n", " print(f\"✓ {opt_domains} domains have optimization-ready information\")\n", " print()\n", " \n", " # ==========================================\n", " # 6. 
SAVE NETWORK\n", " # ==========================================\n", " print(\"=\"*80)\n", " print(\"✅ NETWORK CONSTRUCTION COMPLETE\")\n", " print(\"=\"*80 + \"\\n\")\n", " \n", " print(\"Variables created:\")\n", " print(\" • G (NetworkX graph) - Full science backbone network\")\n", " print(\" • degree_centrality (dict) - Node centrality scores\")\n", " print(\" • node_counts (dict) - Node type counts\")\n", " print()\n", " \n", " print(\"Network statistics:\")\n", " print(f\" • Nodes: {G.number_of_nodes()}\")\n", " print(f\" • Edges: {G.number_of_edges()}\")\n", " print(f\" • Average degree: {sum(dict(G.degree()).values()) / G.number_of_nodes():.2f}\")\n", " print(f\" • Network density: {nx.density(G):.4f}\")\n", " print()\n", " \n", " print(\"💡 Next step:\")\n", " print(\" → Cell 24: Visualize network with interactive overlays\")\n", "\n", "print(\"\\n\" + \"=\"*80)" ] }, { "cell_type": "code", "execution_count": 36, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "================================================================================\n", "📊 CREATING INTERACTIVE NETWORK VISUALIZATION\n", "================================================================================\n", "\n", "Step 1: Checking requirements...\n", "--------------------------------------------------------------------------------\n", "✓ Network loaded: 83 nodes, 75 edges\n", "\n", "Step 2: Computing network layout...\n", "--------------------------------------------------------------------------------\n", "✓ Layout computed\n", "\n", "Step 3: Creating node traces...\n", "--------------------------------------------------------------------------------\n", "✓ Created 3 node trace types\n", "\n", "Step 4: Creating edge traces...\n", "--------------------------------------------------------------------------------\n", "✓ Created edge trace with 75 edges\n", "\n", "Step 5: Assembling visualization...\n", 
"--------------------------------------------------------------------------------\n", "✓ Figure created\n", "\n", "================================================================================\n", "🎨 DISPLAYING NETWORK\n", "================================================================================\n", "\n" ] }, { "data": { "application/vnd.plotly.v1+json": { "config": { "plotlyServerURL": "https://plot.ly" }, "data": [ { "hoverinfo": "none", "line": { "color": "#888", "width": 0.5 }, "mode": "lines", "showlegend": false, "type": "scatter", "x": [ -0.03375203488121921, -0.7643783881921188, null, -0.03375203488121921, -0.4988361073943331, null, -0.03375203488121921, 0.9108900197998652, null, -0.03375203488121921, 0.25048926626765944, null, -0.03375203488121921, 0.7371245757526655, null, -0.03375203488121921, -0.857202164751744, null, -0.03375203488121921, 0.9477240435691783, null, -0.03375203488121921, 0.9779057199778487, null, -0.03375203488121921, 0.4452493856699078, null, -0.03375203488121921, -0.2896678644690961, null, -0.03375203488121921, -0.9058624422962158, null, -0.03375203488121921, 0.48229502301302024, null, 0.27934449209228185, 0.18832320038276953, null, 0.27934449209228185, -0.38383296690225616, null, 0.27934449209228185, -0.8133706254764074, null, 0.27934449209228185, 0.18534235181690728, null, 0.27934449209228185, -0.5488194837350298, null, 0.27934449209228185, -0.33957844835512435, null, 0.27934449209228185, 0.5835533043116142, null, 0.27934449209228185, -0.945379411091151, null, 0.27934449209228185, -0.5373106439607368, null, 0.27934449209228185, -0.1901414033825167, null, 0.27934449209228185, -0.8971248448333526, null, 0.27934449209228185, 0.8753068524549898, null, -0.6207272703420635, -0.5741182296288471, null, -0.6207272703420635, 0.32882613501254904, null, -0.6207272703420635, -0.1520140158891979, null, -0.6207272703420635, 0.9665178847449036, null, -0.6207272703420635, 0.629441152991229, null, -0.6207272703420635, 0.4924330109804105, 
null, -0.6207272703420635, 0.04173023091645595, null, -0.6207272703420635, -0.2031406770207927, null, -0.6201018466185201, 0.02978972325730114, null, -0.6201018466185201, -0.8926158579935434, null, -0.6201018466185201, 0.8224004038623163, null, -0.6201018466185201, 0.9266911264276966, null, -0.6201018466185201, -0.06173425067807704, null, -0.6201018466185201, -0.6352543095287255, null, -0.6201018466185201, 0.5839160453434034, null, -0.6201018466185201, 0.9441122654149047, null, -0.6201018466185201, 0.8216801704322445, null, -0.6201018466185201, -0.9350591356069989, null, -0.6201018466185201, 0.8981510728487058, null, 0.23294618936412295, 0.12178995440002832, null, 0.23294618936412295, -0.9403115908369994, null, 0.23294618936412295, 0.7975211360118166, null, 0.23294618936412295, -0.983840834528855, null, 0.23294618936412295, 0.5225486740748134, null, 0.23294618936412295, -0.4012936175668631, null, 0.23294618936412295, -0.2680695669233201, null, -0.6925982784228786, 0.4054272161283111, null, -0.6925982784228786, -0.96212897299135, null, -0.6925982784228786, -0.3774722444932817, null, -0.6925982784228786, 0.9535647917381579, null, -0.6925982784228786, -0.1334553705677391, null, -0.6925982784228786, -0.4972901708831144, null, -0.6925982784228786, 0.6731157747795982, null, -0.6925982784228786, -0.7046499979146361, null, 0.7613271782720092, -0.8260719586704256, null, 0.7613271782720092, 0.9853690327962842, null, 0.7613271782720092, -0.8619046921498498, null, 0.7613271782720092, 0.9209243644267056, null, 0.7613271782720092, 0.9419404226458097, null, 0.7613271782720092, 0.6642900409870887, null, 0.7613271782720092, -0.7455910042198471, null, 0.7613271782720092, 0.8282028519242292, null, 0.7613271782720092, 0.3180990527765686, null, 0.7613271782720092, -0.09945314456340379, null, 0.7613271782720092, 1, null, 0.7613271782720092, 0.6854968272750016, null, -0.7750493576615791, -0.6810915477896409, null, -0.7750493576615791, -0.9631358172556351, null, -0.7750493576615791, 
-0.7754402061030435, null, -0.7750493576615791, -0.909953442784238, null, -0.7750493576615791, 0.1370232744133935, null ], "y": [ 0.958417620015098, -0.4793674476458074, null, 0.958417620015098, 0.9044474624204818, null, 0.958417620015098, 0.5050735475114999, null, 0.958417620015098, -0.9091996591144147, null, 0.958417620015098, 0.7477814911445113, null, 0.958417620015098, -0.4927494433618577, null, 0.958417620015098, -0.20879632371738377, null, 0.958417620015098, 0.14323362354273964, null, 0.958417620015098, 0.9089861517563397, null, 0.958417620015098, -0.9540366547144596, null, 0.958417620015098, -0.38832412203310424, null, 0.958417620015098, 0.8239098613774081, null, 0.14331585819723078, -0.9729874766255657, null, 0.14331585819723078, -0.896744108239975, null, 0.14331585819723078, 0.634146733107638, null, 0.14331585819723078, 0.9716275872930231, null, 0.14331585819723078, 0.8113452776103534, null, 0.14331585819723078, 0.9759676779027375, null, 0.14331585819723078, -0.7845171322979079, null, 0.14331585819723078, 0.3607678653368623, null, 0.14331585819723078, -0.7911007847794461, null, 0.14331585819723078, -0.9494570277159408, null, 0.14331585819723078, -0.16799357555051256, null, 0.14331585819723078, 0.07094020957430419, null, -0.7630609621393, 0.356742069285057, null, -0.7630609621393, -0.9085324734165159, null, -0.7630609621393, 0.3872313914897627, null, -0.7630609621393, 0.3649708456581655, null, -0.7630609621393, 0.7200829848027053, null, -0.7630609621393, -0.8827479130083424, null, -0.7630609621393, -0.9558399737657307, null, -0.7630609621393, 0.9809231147174591, null, 0.4199165570617058, 0.9362843495195916, null, 0.4199165570617058, 0.22130827284161117, null, 0.4199165570617058, -0.40187954425632394, null, 0.4199165570617058, 0.3302570242886357, null, 0.4199165570617058, -0.9785727883628312, null, 0.4199165570617058, -0.6465028537061525, null, 0.4199165570617058, 0.8243785422818537, null, 0.4199165570617058, -0.40162657647280553, null, 0.4199165570617058, 
-0.5850888916992016, null, 0.4199165570617058, -0.057009287446160024, null, 0.4199165570617058, -0.512618056954328, null, 0.6485642105227907, -0.8895534768461892, null, 0.6485642105227907, 0.09932732982984915, null, 0.6485642105227907, 0.49892421248374774, null, 0.6485642105227907, 0.01900049057177776, null, 0.6485642105227907, -0.730495274352102, null, 0.6485642105227907, 0.8495887644201006, null, 0.6485642105227907, 0.848365999224878, null, 0.7898677172442734, -0.8633467439511091, null, 0.7898677172442734, -0.17713497970762462, null, 0.7898677172442734, -0.8215801417254007, null, 0.7898677172442734, -0.29861714820698015, null, 0.7898677172442734, 0.9172082326532601, null, 0.7898677172442734, -0.8583346525780225, null, 0.7898677172442734, -0.7291473004111124, null, 0.7898677172442734, 0.714389317460196, null, -0.6399233719514198, 0.31477303205189827, null, -0.6399233719514198, 0.016054840113635303, null, -0.6399233719514198, 0.435407999101931, null, -0.6399233719514198, 0.21794500841697764, null, -0.6399233719514198, -0.08386027809339763, null, -0.6399233719514198, -0.5588009110320521, null, -0.6399233719514198, -0.702877750488982, null, -0.6399233719514198, 0.6137218735055383, null, -0.6399233719514198, 0.9259392889340138, null, -0.6399233719514198, -0.9196502695154404, null, -0.6399233719514198, -0.13293379665519128, null, -0.6399233719514198, 0.5727414283339768, null, -0.5789927430822626, 0.6218337579536217, null, -0.5789927430822626, 0.19427792517129622, null, -0.5789927430822626, -0.33392784574127693, null, -0.5789927430822626, -0.2949166390359592, null, -0.5789927430822626, 0.9028588536680511, null ] }, { "hoverinfo": "text", "marker": { "color": [ "#1f77b4", "#1f77b4", "#1f77b4", "#1f77b4", "#1f77b4", "#1f77b4", "#1f77b4", "#1f77b4" ], "line": { "color": "white", "width": 2 }, "size": [ 30, 30, 30, 30, 30, 30, 30, 30 ] }, "mode": "markers", "name": "Domain", "text": [ "Hydrological Science
Type: Domain
Connections: 12
Variables: 5
Decisions: 846", "Climate Science
Type: Domain
Connections: 12
Variables: 8
Decisions: 1526", "Infrastructure Engineering
Type: Domain
Connections: 8
Variables: 4
Decisions: 1526", "Environmental Health
Type: Domain
Connections: 11
Variables: 8
Decisions: 1499", "Social Systems
Type: Domain
Connections: 7
Variables: 5
Decisions: 812", "Governance & Policy
Type: Domain
Connections: 8
Variables: 5
Decisions: 1499", "Economics & Resources
Type: Domain
Connections: 12
Variables: 11
Decisions: 1526", "Technical Operations
Type: Domain
Connections: 5
Variables: 4
Decisions: 1526" ], "type": "scatter", "x": [ -0.03375203488121921, 0.27934449209228185, -0.6207272703420635, -0.6201018466185201, 0.23294618936412295, -0.6925982784228786, 0.7613271782720092, -0.7750493576615791 ], "y": [ 0.958417620015098, 0.14331585819723078, -0.7630609621393, 0.4199165570617058, 0.6485642105227907, 0.7898677172442734, -0.6399233719514198, -0.5789927430822626 ] }, { "hoverinfo": "text", "marker": { "color": [ "#ff7f0e", "#ff7f0e", "#ff7f0e", "#ff7f0e", "#ff7f0e", "#ff7f0e", "#ff7f0e", "#ff7f0e", "#ff7f0e", "#ff7f0e", "#ff7f0e", "#ff7f0e", "#ff7f0e", "#ff7f0e", "#ff7f0e", "#ff7f0e", "#ff7f0e", "#ff7f0e", "#ff7f0e", "#ff7f0e", "#ff7f0e", "#ff7f0e", "#ff7f0e", "#ff7f0e", "#ff7f0e" ], "line": { "color": "white", "width": 2 }, "size": [ 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15 ] }, "mode": "markers", "name": "Topic", "text": [ "Topic 0
Type: Topic
Domain: Infrastructure Engineering
Words: entity, round, whole system, collection, every community", "Topic 1
Type: Topic
Domain: Social Systems
Words: myself, site, overflow, happening, depends", "Topic 2
Type: Topic
Domain: Infrastructure Engineering
Words: plant right, respond, quite, distribution, order", "Topic 3
Type: Topic
Domain: Hydrological Science
Words: addressing, time right, respond, taken, time yeah", "Topic 4
Type: Topic
Domain: Environmental Health
Words: yeah little, water every, imagine, texas, must", "Topic 5
Type: Topic
Domain: Environmental Health
Words: describe, basis, jump, water every, solve", "Topic 6
Type: Topic
Domain: Climate Science
Words: people yeah, arctic, figure, lose, single", "Topic 7
Type: Topic
Domain: Governance & Policy
Words: confidence, water community, happy, earlier, chemicals", "Topic 8
Type: Topic
Domain: Hydrological Science
Words: leaving, complex, ahead, picture, helps", "Topic 9
Type: Topic
Domain: Hydrological Science
Words: order, yeah couple, erosion, totally, water doesn", "Topic 10
Type: Topic
Domain: Climate Science
Words: weekend, strong, right water, separate, water house", "Topic 11
Type: Topic
Domain: Environmental Health
Words: driver, white, health, remote, program", "Topic 12
Type: Topic
Domain: Social Systems
Words: helping, interviews, form, rather, state level", "Topic 13
Type: Topic
Domain: Climate Science
Words: come work, professional, penalty, worker, mostly", "Topic 14
Type: Topic
Domain: Technical Operations
Words: helpful, aspects, knowledge, penalty, driven", "Topic 15
Type: Topic
Domain: Infrastructure Engineering
Words: kinds, require, starts, calling, route", "Topic 16
Type: Topic
Domain: Hydrological Science
Words: front, round, license, success, breaking", "Topic 17
Type: Topic
Domain: Infrastructure Engineering
Words: distribution, specific, holding tank, people yeah, areas", "Topic 18
Type: Topic
Domain: Hydrological Science
Words: draw, different types, assuming, turned, towards water", "Topic 19
Type: Topic
Domain: Climate Science
Words: lift station, anyone, totally, solutions, covered", "Topic 20
Type: Topic
Domain: Hydrological Science
Words: something yeah, classes, directly, additional, based", "Topic 21
Type: Topic
Domain: Governance & Policy
Words: program, something yeah, seasonal, apart, beginning", "Topic 22
Type: Topic
Domain: Governance & Policy
Words: water community, yeah little, national, helpful, maintenance workers", "Topic 23
Type: Topic
Domain: Hydrological Science
Words: useful, educational, passed, right getting, teaching", "Topic 24
Type: Topic
Domain: Economics & Resources
Words: basis, water community, connect, apart, repeat" ], "type": "scatter", "x": [ -0.5741182296288471, 0.12178995440002832, 0.32882613501254904, -0.7643783881921188, 0.02978972325730114, -0.8926158579935434, 0.18832320038276953, 0.4054272161283111, -0.4988361073943331, 0.9108900197998652, -0.38383296690225616, 0.8224004038623163, -0.9403115908369994, -0.8133706254764074, -0.6810915477896409, -0.1520140158891979, 0.25048926626765944, 0.9665178847449036, 0.7371245757526655, 0.18534235181690728, -0.857202164751744, -0.96212897299135, -0.3774722444932817, 0.9477240435691783, -0.8260719586704256 ], "y": [ 0.356742069285057, -0.8895534768461892, -0.9085324734165159, -0.4793674476458074, 0.9362843495195916, 0.22130827284161117, -0.9729874766255657, -0.8633467439511091, 0.9044474624204818, 0.5050735475114999, -0.896744108239975, -0.40187954425632394, 0.09932732982984915, 0.634146733107638, 0.6218337579536217, 0.3872313914897627, -0.9091996591144147, 0.3649708456581655, 0.7477814911445113, 0.9716275872930231, -0.4927494433618577, -0.17713497970762462, -0.8215801417254007, -0.20879632371738377, 0.31477303205189827 ] }, { "hoverinfo": "text", "marker": { "color": [ "#2ca02c", "#2ca02c", "#2ca02c", "#2ca02c", "#2ca02c", "#2ca02c", "#2ca02c", "#2ca02c", "#2ca02c", "#2ca02c", "#2ca02c", "#2ca02c", "#2ca02c", "#2ca02c", "#2ca02c", "#2ca02c", "#2ca02c", "#2ca02c", "#2ca02c", "#2ca02c", "#2ca02c", "#2ca02c", "#2ca02c", "#2ca02c", "#2ca02c", "#2ca02c", "#2ca02c", "#2ca02c", "#2ca02c", "#2ca02c", "#2ca02c", "#2ca02c", "#2ca02c", "#2ca02c", "#2ca02c", "#2ca02c", "#2ca02c", "#2ca02c", "#2ca02c", "#2ca02c", "#2ca02c", "#2ca02c", "#2ca02c", "#2ca02c", "#2ca02c", "#2ca02c", "#2ca02c", "#2ca02c", "#2ca02c", "#2ca02c" ], "line": { "color": "white", "width": 2 }, "size": [ 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8 ] }, "mode": "markers", "name": "Svo", "text": [ "SVO: temperature
Type: Scientific Variable
Variable: temperature", "SVO: freeze
Type: Scientific Variable
Variable: freeze", "SVO: seasonal
Type: Scientific Variable
Variable: seasonal", "SVO: climate change
Type: Scientific Variable
Variable: climate change", "SVO: capacity
Type: Scientific Variable
Variable: capacity", "SVO: age
Type: Scientific Variable
Variable: age", "SVO: condition
Type: Scientific Variable
Variable: condition", "SVO: water quality
Type: Scientific Variable
Variable: water quality", "SVO: ph
Type: Scientific Variable
Variable: ph", "SVO: coliform
Type: Scientific Variable
Variable: coliform", "SVO: household
Type: Scientific Variable
Variable: household", "SVO: cost
Type: Scientific Variable
Variable: cost", "SVO: funding
Type: Scientific Variable
Variable: funding", "SVO: expense
Type: Scientific Variable
Variable: expense", "SVO: revenue
Type: Scientific Variable
Variable: revenue", "SVO: subsidy
Type: Scientific Variable
Variable: subsidy", "SVO: rate
Type: Scientific Variable
Variable: rate", "SVO: staff
Type: Scientific Variable
Variable: staff", "SVO: hours
Type: Scientific Variable
Variable: hours", "SVO: regulation
Type: Scientific Variable
Variable: regulation", "SVO: standard
Type: Scientific Variable
Variable: standard", "SVO: requirement
Type: Scientific Variable
Variable: requirement", "SVO: thaw
Type: Scientific Variable
Variable: thaw", "SVO: price
Type: Scientific Variable
Variable: price", "SVO: value
Type: Scientific Variable
Variable: value", "SVO: permit
Type: Scientific Variable
Variable: permit", "SVO: depth
Type: Scientific Variable
Variable: depth", "SVO: height
Type: Scientific Variable
Variable: height", "SVO: permafrost depth
Type: Scientific Variable
Variable: permafrost depth", "SVO: frost depth
Type: Scientific Variable
Variable: frost depth", "SVO: disinfection
Type: Scientific Variable
Variable: disinfection", "SVO: employment
Type: Scientific Variable
Variable: employment", "SVO: investment
Type: Scientific Variable
Variable: investment", "SVO: certification level
Type: Scientific Variable
Variable: certification level", "SVO: budget
Type: Scientific Variable
Variable: budget", "SVO: downtime
Type: Scientific Variable
Variable: downtime", "SVO: pressure
Type: Scientific Variable
Variable: pressure", "SVO: turbidity
Type: Scientific Variable
Variable: turbidity", "SVO: population
Type: Scientific Variable
Variable: population", "SVO: water level
Type: Scientific Variable
Variable: water level", "SVO: warming
Type: Scientific Variable
Variable: warming", "SVO: discharge
Type: Scientific Variable
Variable: discharge", "SVO: chlorine level
Type: Scientific Variable
Variable: chlorine level", "SVO: violation
Type: Scientific Variable
Variable: violation", "SVO: pathogen
Type: Scientific Variable
Variable: pathogen", "SVO: volume
Type: Scientific Variable
Variable: volume", "SVO: residents
Type: Scientific Variable
Variable: residents", "SVO: users
Type: Scientific Variable
Variable: users", "SVO: capital cost
Type: Scientific Variable
Variable: capital cost", "SVO: enforcement
Type: Scientific Variable
Variable: enforcement" ], "type": "scatter", "x": [ -0.5488194837350298, -0.33957844835512435, 0.5835533043116142, -0.945379411091151, 0.629441152991229, 0.4924330109804105, 0.04173023091645595, 0.9266911264276966, -0.06173425067807704, -0.6352543095287255, 0.7975211360118166, 0.9853690327962842, -0.8619046921498498, 0.9209243644267056, 0.9419404226458097, 0.6642900409870887, -0.7455910042198471, -0.9631358172556351, -0.7754402061030435, 0.9535647917381579, -0.1334553705677391, -0.4972901708831144, -0.5373106439607368, 0.8282028519242292, 0.3180990527765686, 0.6731157747795982, 0.9779057199778487, 0.4452493856699078, -0.1901414033825167, -0.8971248448333526, 0.5839160453434034, -0.983840834528855, -0.09945314456340379, -0.909953442784238, 1, 0.1370232744133935, -0.2031406770207927, 0.9441122654149047, 0.5225486740748134, -0.2896678644690961, 0.8753068524549898, -0.9058624422962158, 0.8216801704322445, -0.9350591356069989, 0.8981510728487058, 0.48229502301302024, -0.4012936175668631, -0.2680695669233201, 0.6854968272750016, -0.7046499979146361 ], "y": [ 0.8113452776103534, 0.9759676779027375, -0.7845171322979079, 0.3607678653368623, 0.7200829848027053, -0.8827479130083424, -0.9558399737657307, 0.3302570242886357, -0.9785727883628312, -0.6465028537061525, 0.49892421248374774, 0.016054840113635303, 0.435407999101931, 0.21794500841697764, -0.08386027809339763, -0.5588009110320521, -0.702877750488982, 0.19427792517129622, -0.33392784574127693, -0.29861714820698015, 0.9172082326532601, -0.8583346525780225, -0.7911007847794461, 0.6137218735055383, 0.9259392889340138, -0.7291473004111124, 0.14323362354273964, 0.9089861517563397, -0.9494570277159408, -0.16799357555051256, 0.8243785422818537, 0.01900049057177776, -0.9196502695154404, -0.2949166390359592, -0.13293379665519128, 0.9028588536680511, 0.9809231147174591, -0.40162657647280553, -0.730495274352102, -0.9540366547144596, 0.07094020957430419, -0.38832412203310424, -0.5850888916992016, -0.057009287446160024, 
-0.512618056954328, 0.8239098613774081, 0.8495887644201006, 0.848365999224878, 0.5727414283339768, 0.714389317460196 ] } ], "layout": { "annotations": [ { "font": { "color": "gray", "size": 12 }, "showarrow": false, "text": "Interactive network showing semantic bridge from stakeholder narratives to formal science domains", "x": 0.5, "xanchor": "center", "xref": "paper", "y": -0.05, "yref": "paper" } ], "height": 800, "hovermode": "closest", "margin": { "b": 20, "l": 5, "r": 5, "t": 60 }, "plot_bgcolor": "white", "showlegend": true, "template": { "data": { "bar": [ { "error_x": { "color": "#2a3f5f" }, "error_y": { "color": "#2a3f5f" }, "marker": { "line": { "color": "#E5ECF6", "width": 0.5 }, "pattern": { "fillmode": "overlay", "size": 10, "solidity": 0.2 } }, "type": "bar" } ], "barpolar": [ { "marker": { "line": { "color": "#E5ECF6", "width": 0.5 }, "pattern": { "fillmode": "overlay", "size": 10, "solidity": 0.2 } }, "type": "barpolar" } ], "carpet": [ { "aaxis": { "endlinecolor": "#2a3f5f", "gridcolor": "white", "linecolor": "white", "minorgridcolor": "white", "startlinecolor": "#2a3f5f" }, "baxis": { "endlinecolor": "#2a3f5f", "gridcolor": "white", "linecolor": "white", "minorgridcolor": "white", "startlinecolor": "#2a3f5f" }, "type": "carpet" } ], "choropleth": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "type": "choropleth" } ], "contour": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "colorscale": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "type": "contour" } ], "contourcarpet": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "type": "contourcarpet" } ], "heatmap": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "colorscale": [ [ 0, "#0d0887" ], [ 
0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "type": "heatmap" } ], "histogram": [ { "marker": { "pattern": { "fillmode": "overlay", "size": 10, "solidity": 0.2 } }, "type": "histogram" } ], "histogram2d": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "colorscale": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "type": "histogram2d" } ], "histogram2dcontour": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "colorscale": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "type": "histogram2dcontour" } ], "mesh3d": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "type": "mesh3d" } ], "parcoords": [ { "line": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "parcoords" } ], "pie": [ { "automargin": true, "type": "pie" } ], "scatter": [ { "fillpattern": { "fillmode": "overlay", "size": 10, "solidity": 0.2 }, "type": "scatter" } ], "scatter3d": [ { "line": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scatter3d" } ], "scattercarpet": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scattercarpet" } ], "scattergeo": [ { "marker": { "colorbar": { "outlinewidth": 0, 
"ticks": "" } }, "type": "scattergeo" } ], "scattergl": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scattergl" } ], "scattermap": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scattermap" } ], "scattermapbox": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scattermapbox" } ], "scatterpolar": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scatterpolar" } ], "scatterpolargl": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scatterpolargl" } ], "scatterternary": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scatterternary" } ], "surface": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "colorscale": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "type": "surface" } ], "table": [ { "cells": { "fill": { "color": "#EBF0F8" }, "line": { "color": "white" } }, "header": { "fill": { "color": "#C8D4E3" }, "line": { "color": "white" } }, "type": "table" } ] }, "layout": { "annotationdefaults": { "arrowcolor": "#2a3f5f", "arrowhead": 0, "arrowwidth": 1 }, "autotypenumbers": "strict", "coloraxis": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "colorscale": { "diverging": [ [ 0, "#8e0152" ], [ 0.1, "#c51b7d" ], [ 0.2, "#de77ae" ], [ 0.3, "#f1b6da" ], [ 0.4, "#fde0ef" ], [ 0.5, "#f7f7f7" ], [ 0.6, "#e6f5d0" ], [ 0.7, "#b8e186" ], [ 0.8, "#7fbc41" ], [ 0.9, "#4d9221" ], [ 1, "#276419" ] ], "sequential": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 
0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "sequentialminus": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ] }, "colorway": [ "#636efa", "#EF553B", "#00cc96", "#ab63fa", "#FFA15A", "#19d3f3", "#FF6692", "#B6E880", "#FF97FF", "#FECB52" ], "font": { "color": "#2a3f5f" }, "geo": { "bgcolor": "white", "lakecolor": "white", "landcolor": "#E5ECF6", "showlakes": true, "showland": true, "subunitcolor": "white" }, "hoverlabel": { "align": "left" }, "hovermode": "closest", "mapbox": { "style": "light" }, "paper_bgcolor": "white", "plot_bgcolor": "#E5ECF6", "polar": { "angularaxis": { "gridcolor": "white", "linecolor": "white", "ticks": "" }, "bgcolor": "#E5ECF6", "radialaxis": { "gridcolor": "white", "linecolor": "white", "ticks": "" } }, "scene": { "xaxis": { "backgroundcolor": "#E5ECF6", "gridcolor": "white", "gridwidth": 2, "linecolor": "white", "showbackground": true, "ticks": "", "zerolinecolor": "white" }, "yaxis": { "backgroundcolor": "#E5ECF6", "gridcolor": "white", "gridwidth": 2, "linecolor": "white", "showbackground": true, "ticks": "", "zerolinecolor": "white" }, "zaxis": { "backgroundcolor": "#E5ECF6", "gridcolor": "white", "gridwidth": 2, "linecolor": "white", "showbackground": true, "ticks": "", "zerolinecolor": "white" } }, "shapedefaults": { "line": { "color": "#2a3f5f" } }, "ternary": { "aaxis": { "gridcolor": "white", "linecolor": "white", "ticks": "" }, "baxis": { "gridcolor": "white", "linecolor": "white", "ticks": "" }, "bgcolor": "#E5ECF6", "caxis": { "gridcolor": "white", "linecolor": "white", "ticks": "" } }, "title": { "x": 0.05 }, "xaxis": { "automargin": true, "gridcolor": "white", "linecolor": "white", "ticks": "", "title": { 
"standoff": 15 }, "zerolinecolor": "white", "zerolinewidth": 2 }, "yaxis": { "automargin": true, "gridcolor": "white", "linecolor": "white", "ticks": "", "title": { "standoff": 15 }, "zerolinecolor": "white", "zerolinewidth": 2 } } }, "title": { "font": { "size": 20 }, "text": "Science Backbone Network: Bethel Alaska Water Infrastructure", "x": 0.5, "xanchor": "center" }, "xaxis": { "showgrid": false, "showticklabels": false, "zeroline": false }, "yaxis": { "showgrid": false, "showticklabels": false, "zeroline": false } } }, "text/html": [ "
\n", "
" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "\n", "Network Legend:\n", " 🔵 Blue (Large) = Science Domains\n", " 🟠 Orange (Medium) = Topics from LDA\n", " 🟢 Green (Small) = Scientific Variable Objects (SVOs)\n", "\n", "Interactions:\n", " • Hover over nodes for details\n", " • Click legend items to show/hide layers\n", " • Zoom and pan to explore\n", "\n", "================================================================================\n", "📊 NETWORK SUMMARY\n", "================================================================================\n", "\n", "Semantic Bridge Components:\n", " • Science Domains: 8\n", " • Topics: 25\n", " • SVOs: 50\n", " • Connections: 75\n", "\n", "Most Connected Domains:\n", " • Hydrological Science: 12 connections\n", " • Climate Science: 12 connections\n", " • Economics & Resources: 12 connections\n", " • Environmental Health: 11 connections\n", " • Infrastructure Engineering: 8 connections\n", "\n", "================================================================================\n", "✅ VISUALIZATION COMPLETE\n", "================================================================================\n", "\n", "💡 This network demonstrates:\n", " 1. Qualitative narratives → Topics (LDA)\n", " 2. Topics → Science domains (semantic mapping)\n", " 3. Domains → Measurable variables (SVOs)\n", " 4. 
→ Ready for optimization model formulation\n", "\n", "Variables saved:\n", " • fig (plotly figure) - Can re-display or export\n", " • pos (dict) - Node positions for consistent layouts\n", "\n", "================================================================================\n" ] } ], "source": [
    "# CELL 24: Visualize Science Backbone Network\n",
    "\n",
    "print(\"=\"*80)\n",
    "print(\"📊 CREATING INTERACTIVE NETWORK VISUALIZATION\")\n",
    "print(\"=\"*80 + \"\\n\")\n",
    "\n",
    "import plotly.graph_objects as go\n",
    "import networkx as nx\n",
    "\n",
    "# ==========================================\n",
    "# 1. CHECK REQUIREMENTS\n",
    "# ==========================================\n",
    "print(\"Step 1: Checking requirements...\")\n",
    "print(\"-\"*80)\n",
    "\n",
    "if 'G' not in globals():\n",
    "    print(\"❌ Network 'G' not found!\")\n",
    "    print(\" Run Cell 23 first to build the network\")\n",
    "else:\n",
    "    print(f\"✓ Network loaded: {G.number_of_nodes()} nodes, {G.number_of_edges()} edges\")\n",
    "    print()\n",
    "\n",
    "    # ==========================================\n",
    "    # 2. LAYOUT NETWORK\n",
    "    # ==========================================\n",
    "    print(\"Step 2: Computing network layout...\")\n",
    "    print(\"-\"*80)\n",
    "\n",
    "    # Use spring layout for nice positioning (fixed seed keeps the layout reproducible)\n",
    "    pos = nx.spring_layout(G, k=2, iterations=50, seed=42)\n",
    "\n",
    "    print(\"✓ Layout computed\")\n",
    "    print()\n",
    "\n",
    "    # ==========================================\n",
    "    # 3. CREATE NODE TRACES\n",
    "    # ==========================================\n",
    "    print(\"Step 3: Creating node traces...\")\n",
    "    print(\"-\"*80)\n",
    "\n",
    "    # Separate nodes by type\n",
    "    node_traces = {}\n",
    "\n",
    "    for node_type in ['domain', 'topic', 'svo']:\n",
    "        # Filter nodes of this type\n",
    "        nodes_of_type = [n for n in G.nodes() if G.nodes[n].get('node_type') == node_type]\n",
    "\n",
    "        if not nodes_of_type:\n",
    "            continue\n",
    "\n",
    "        # Get positions\n",
    "        x_coords = [pos[node][0] for node in nodes_of_type]\n",
    "        y_coords = [pos[node][1] for node in nodes_of_type]\n",
    "\n",
    "        # Get attributes\n",
    "        sizes = [G.nodes[node].get('size', 10) for node in nodes_of_type]\n",
    "        colors = [G.nodes[node].get('color', '#888') for node in nodes_of_type]\n",
    "\n",
    "        # Create hover text (Plotly renders <br> as a line break)\n",
    "        hover_texts = []\n",
    "        for node in nodes_of_type:\n",
    "            node_data = G.nodes[node]\n",
    "\n",
    "            if node_type == 'domain':\n",
    "                text = f\"{node}<br>\"\n",
    "                text += f\"Type: Domain<br>\"\n",
    "                text += f\"Connections: {G.degree(node)}<br>\"\n",
    "                if node_data.get('has_optimization_info'):\n",
    "                    text += f\"Variables: {node_data.get('variable_count', 0)}<br>\"\n",
    "                    text += f\"Decisions: {node_data.get('decision_count', 0)}\"\n",
    "\n",
    "            elif node_type == 'topic':\n",
    "                text = f\"{node}<br>\"\n",
    "                text += f\"Type: Topic<br>\"\n",
    "                text += f\"Domain: {node_data.get('domain', 'Unknown')}<br>\"\n",
    "                text += f\"Words: {node_data.get('top_words', 'N/A')}\"\n",
    "\n",
    "            elif node_type == 'svo':\n",
    "                text = f\"{node}<br>\"\n",
    "                text += f\"Type: Scientific Variable<br>\"\n",
    "                text += f\"Variable: {node_data.get('variable', 'N/A')}\"\n",
    "\n",
    "            hover_texts.append(text)\n",
    "\n",
    "        # Create trace\n",
    "        node_traces[node_type] = go.Scatter(\n",
    "            x=x_coords,\n",
    "            y=y_coords,\n",
    "            mode='markers',\n",
    "            name=node_type.capitalize(),\n",
    "            marker=dict(\n",
    "                size=sizes,\n",
    "                color=colors,\n",
    "                line=dict(width=2, color='white')\n",
    "            ),\n",
    "            text=hover_texts,\n",
    "            hoverinfo='text'\n",
    "        )\n",
    "\n",
    "    print(f\"✓ Created {len(node_traces)} node trace types\")\n",
    "    print()\n",
    "\n",
    "    # ==========================================\n",
    "    # 4. CREATE EDGE TRACES\n",
    "    # ==========================================\n",
    "    print(\"Step 4: Creating edge traces...\")\n",
    "    print(\"-\"*80)\n",
    "\n",
    "    edge_trace = go.Scatter(\n",
    "        x=[],\n",
    "        y=[],\n",
    "        mode='lines',\n",
    "        line=dict(width=0.5, color='#888'),\n",
    "        hoverinfo='none',\n",
    "        showlegend=False\n",
    "    )\n",
    "\n",
    "    # Each edge contributes two endpoints plus None, which breaks the line between edges\n",
    "    for edge in G.edges():\n",
    "        x0, y0 = pos[edge[0]]\n",
    "        x1, y1 = pos[edge[1]]\n",
    "        edge_trace['x'] += (x0, x1, None)\n",
    "        edge_trace['y'] += (y0, y1, None)\n",
    "\n",
    "    print(f\"✓ Created edge trace with {G.number_of_edges()} edges\")\n",
    "    print()\n",
    "\n",
    "    # ==========================================\n",
    "    # 5. CREATE FIGURE\n",
    "    # ==========================================\n",
    "    print(\"Step 5: Assembling visualization...\")\n",
    "    print(\"-\"*80)\n",
    "\n",
    "    # Combine all traces (edges first so nodes are drawn on top)\n",
    "    fig_data = [edge_trace] + list(node_traces.values())\n",
    "\n",
    "    # Create figure\n",
    "    fig = go.Figure(\n",
    "        data=fig_data,\n",
    "        layout=go.Layout(\n",
    "            title=dict(\n",
    "                text='Science Backbone Network: Bethel Alaska Water Infrastructure',\n",
    "                x=0.5,\n",
    "                xanchor='center',\n",
    "                font=dict(size=20)\n",
    "            ),\n",
    "            showlegend=True,\n",
    "            hovermode='closest',\n",
    "            margin=dict(b=20, l=5, r=5, t=60),\n",
    "            annotations=[\n",
    "                dict(\n",
    "                    text=\"Interactive network showing semantic bridge from stakeholder narratives to formal science domains\",\n",
    "                    showarrow=False,\n",
    "                    xref=\"paper\", yref=\"paper\",\n",
    "                    x=0.5, y=-0.05,\n",
    "                    xanchor='center',\n",
    "                    font=dict(size=12, color='gray')\n",
    "                )\n",
    "            ],\n",
    "            xaxis=dict(showgrid=False, zeroline=False, showticklabels=False),\n",
    "            yaxis=dict(showgrid=False, zeroline=False, showticklabels=False),\n",
    "            plot_bgcolor='white',\n",
    "            height=800\n",
    "        )\n",
    "    )\n",
    "\n",
    "    print(\"✓ Figure created\")\n",
    "    print()\n",
    "\n",
    "    # ==========================================\n",
    "    # 6. DISPLAY NETWORK\n",
    "    # ==========================================\n",
    "    print(\"=\"*80)\n",
    "    print(\"🎨 DISPLAYING NETWORK\")\n",
    "    print(\"=\"*80 + \"\\n\")\n",
    "\n",
    "    fig.show()\n",
    "\n",
    "    print()\n",
    "    print(\"Network Legend:\")\n",
    "    print(\" 🔵 Blue (Large) = Science Domains\")\n",
    "    print(\" 🟠 Orange (Medium) = Topics from LDA\")\n",
    "    print(\" 🟢 Green (Small) = Scientific Variable Objects (SVOs)\")\n",
    "    print()\n",
    "    print(\"Interactions:\")\n",
    "    print(\" • Hover over nodes for details\")\n",
    "    print(\" • Click legend items to show/hide layers\")\n",
    "    print(\" • Zoom and pan to explore\")\n",
    "    print()\n",
    "\n",
    "    # ==========================================\n",
    "    # 7. NETWORK SUMMARY\n",
    "    # ==========================================\n",
    "    print(\"=\"*80)\n",
    "    print(\"📊 NETWORK SUMMARY\")\n",
    "    print(\"=\"*80 + \"\\n\")\n",
    "\n",
    "    print(\"Semantic Bridge Components:\")\n",
    "    print(f\" • Science Domains: {sum(1 for n in G.nodes() if G.nodes[n].get('node_type') == 'domain')}\")\n",
    "    print(f\" • Topics: {sum(1 for n in G.nodes() if G.nodes[n].get('node_type') == 'topic')}\")\n",
    "    print(f\" • SVOs: {sum(1 for n in G.nodes() if G.nodes[n].get('node_type') == 'svo')}\")\n",
    "    print(f\" • Connections: {G.number_of_edges()}\")\n",
    "    print()\n",
    "\n",
    "    print(\"Most Connected Domains:\")\n",
    "    domain_degrees = {\n",
    "        n: G.degree(n) for n in G.nodes()\n",
    "        if G.nodes[n].get('node_type') == 'domain'\n",
    "    }\n",
    "    for domain, degree in sorted(domain_degrees.items(), key=lambda x: -x[1])[:5]:\n",
    "        print(f\" • {domain}: {degree} connections\")\n",
    "\n",
    "    print()\n",
    "    print(\"=\"*80)\n",
    "    print(\"✅ VISUALIZATION COMPLETE\")\n",
    "    print(\"=\"*80 + \"\\n\")\n",
    "\n",
    "    print(\"💡 This network demonstrates:\")\n",
    "    print(\" 1. Qualitative narratives → Topics (LDA)\")\n",
    "    print(\" 2. Topics → Science domains (semantic mapping)\")\n",
    "    print(\" 3. Domains → Measurable variables (SVOs)\")\n",
    "    print(\" 4. → Ready for optimization model formulation\")\n",
    "    print()\n",
    "\n",
    "    print(\"Variables saved:\")\n",
    "    print(\" • fig (plotly figure) - Can re-display or export\")\n",
    "    print(\" • pos (dict) - Node positions for consistent layouts\")\n",
    "\n",
    "print(\"\\n\" + \"=\"*80)"
   ] }, { "cell_type": "code", "execution_count": 38, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "================================================================================\n", "SINGLE TRANSCRIPT NETWORK OVERLAY\n", "================================================================================\n", "\n", "Available transcripts:\n", " 1. 1_1_InterdependenciesNNA\n", " 2. 1_2_InterdependenciesNNA\n", " 3. 
1_3_InterdependenciesNNA\n", " 4. 1_4_InterdependenciesNNA\n", " 5. 1_5__InterdependenciesNNA\n", " 6. 1_6__InterdependenciesNNA\n", " 7. 3_1__InterdependenciesNNA\n", " 8. 3_2__InterdependenciesNNA\n", " 9. 3_3__InterdependenciesNNA\n", "\n" ] }, { "name": "stdin", "output_type": "stream", "text": [ "Enter transcript number (1-9): 9.\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\n", "✓ Selected: 3_3__InterdependenciesNNA\n", "================================================================================\n", "\n", "Step 1: Finding what this transcript discusses...\n", "--------------------------------------------------------------------------------\n", "✓ Found 3 main topics\n", "✓ Found 18 SVO mentions\n", "✓ Spans 4 science domains\n", "\n", "================================================================================\n", "TRANSCRIPT SUMMARY: 3_3__InterdependenciesNNA\n", "================================================================================\n", "\n", "Main Topics:\n", " • Topic 11: 0.839 probability\n", " Words: driver, white, health, remote, program\n", " Domain: Environmental Health\n", " • Topic 24: 0.007 probability\n", " Words: basis, water community, connect, apart, repeat\n", " Domain: Economics & Resources\n", " • Topic 1: 0.007 probability\n", " Words: myself, site, overflow, happening, depends\n", " Domain: Social Systems\n", "\n", "Science Domains:\n", " • Climate Science: 1 variables\n", " Examples: freeze\n", " • Economics & Resources: 1 variables\n", " Examples: cost\n", " • Infrastructure Engineering: 1 variables\n", " Examples: age\n", " • Technical Operations: 1 variables\n", " Examples: hours\n", "\n", "Total Measurable Variables: 4\n", "\n", "================================================================================\n", "HIGHLIGHTED NETWORK VIEW\n", "================================================================================\n", "\n", "Creating network with transcript overlay...\n" ] }, { "data": 
{ "application/vnd.plotly.v1+json": { "config": { "plotlyServerURL": "https://plot.ly" }, "data": [ { "hoverinfo": "none", "line": { "color": "#ddd", "width": 0.5 }, "mode": "lines", "showlegend": false, "type": "scatter", "x": [ -0.03375203488121921, -0.7643783881921188, null, -0.03375203488121921, -0.4988361073943331, null, -0.03375203488121921, 0.9108900197998652, null, -0.03375203488121921, 0.25048926626765944, null, -0.03375203488121921, 0.7371245757526655, null, -0.03375203488121921, -0.857202164751744, null, -0.03375203488121921, 0.9477240435691783, null, -0.03375203488121921, 0.9779057199778487, null, -0.03375203488121921, 0.4452493856699078, null, -0.03375203488121921, -0.2896678644690961, null, -0.03375203488121921, -0.9058624422962158, null, -0.03375203488121921, 0.48229502301302024, null, 0.27934449209228185, 0.18832320038276953, null, 0.27934449209228185, -0.38383296690225616, null, 0.27934449209228185, -0.8133706254764074, null, 0.27934449209228185, 0.18534235181690728, null, 0.27934449209228185, -0.5488194837350298, null, 0.27934449209228185, -0.33957844835512435, null, 0.27934449209228185, 0.5835533043116142, null, 0.27934449209228185, -0.945379411091151, null, 0.27934449209228185, -0.5373106439607368, null, 0.27934449209228185, -0.1901414033825167, null, 0.27934449209228185, -0.8971248448333526, null, 0.27934449209228185, 0.8753068524549898, null, -0.6207272703420635, -0.5741182296288471, null, -0.6207272703420635, 0.32882613501254904, null, -0.6207272703420635, -0.1520140158891979, null, -0.6207272703420635, 0.9665178847449036, null, -0.6207272703420635, 0.629441152991229, null, -0.6207272703420635, 0.4924330109804105, null, -0.6207272703420635, 0.04173023091645595, null, -0.6207272703420635, -0.2031406770207927, null, -0.6201018466185201, 0.02978972325730114, null, -0.6201018466185201, -0.8926158579935434, null, -0.6201018466185201, 0.8224004038623163, null, -0.6201018466185201, 0.9266911264276966, null, -0.6201018466185201, -0.06173425067807704, 
null, -0.6201018466185201, -0.6352543095287255, null, -0.6201018466185201, 0.5839160453434034, null, -0.6201018466185201, 0.9441122654149047, null, -0.6201018466185201, 0.8216801704322445, null, -0.6201018466185201, -0.9350591356069989, null, -0.6201018466185201, 0.8981510728487058, null, 0.23294618936412295, 0.12178995440002832, null, 0.23294618936412295, -0.9403115908369994, null, 0.23294618936412295, 0.7975211360118166, null, 0.23294618936412295, -0.983840834528855, null, 0.23294618936412295, 0.5225486740748134, null, 0.23294618936412295, -0.4012936175668631, null, 0.23294618936412295, -0.2680695669233201, null, -0.6925982784228786, 0.4054272161283111, null, -0.6925982784228786, -0.96212897299135, null, -0.6925982784228786, -0.3774722444932817, null, -0.6925982784228786, 0.9535647917381579, null, -0.6925982784228786, -0.1334553705677391, null, -0.6925982784228786, -0.4972901708831144, null, -0.6925982784228786, 0.6731157747795982, null, -0.6925982784228786, -0.7046499979146361, null, 0.7613271782720092, -0.8260719586704256, null, 0.7613271782720092, 0.9853690327962842, null, 0.7613271782720092, -0.8619046921498498, null, 0.7613271782720092, 0.9209243644267056, null, 0.7613271782720092, 0.9419404226458097, null, 0.7613271782720092, 0.6642900409870887, null, 0.7613271782720092, -0.7455910042198471, null, 0.7613271782720092, 0.8282028519242292, null, 0.7613271782720092, 0.3180990527765686, null, 0.7613271782720092, -0.09945314456340379, null, 0.7613271782720092, 1, null, 0.7613271782720092, 0.6854968272750016, null, -0.7750493576615791, -0.6810915477896409, null, -0.7750493576615791, -0.9631358172556351, null, -0.7750493576615791, -0.7754402061030435, null, -0.7750493576615791, -0.909953442784238, null, -0.7750493576615791, 0.1370232744133935, null ], "y": [ 0.958417620015098, -0.4793674476458074, null, 0.958417620015098, 0.9044474624204818, null, 0.958417620015098, 0.5050735475114999, null, 0.958417620015098, -0.9091996591144147, null, 0.958417620015098, 
0.7477814911445113, null, 0.958417620015098, -0.4927494433618577, null, 0.958417620015098, -0.20879632371738377, null, 0.958417620015098, 0.14323362354273964, null, 0.958417620015098, 0.9089861517563397, null, 0.958417620015098, -0.9540366547144596, null, 0.958417620015098, -0.38832412203310424, null, 0.958417620015098, 0.8239098613774081, null, 0.14331585819723078, -0.9729874766255657, null, 0.14331585819723078, -0.896744108239975, null, 0.14331585819723078, 0.634146733107638, null, 0.14331585819723078, 0.9716275872930231, null, 0.14331585819723078, 0.8113452776103534, null, 0.14331585819723078, 0.9759676779027375, null, 0.14331585819723078, -0.7845171322979079, null, 0.14331585819723078, 0.3607678653368623, null, 0.14331585819723078, -0.7911007847794461, null, 0.14331585819723078, -0.9494570277159408, null, 0.14331585819723078, -0.16799357555051256, null, 0.14331585819723078, 0.07094020957430419, null, -0.7630609621393, 0.356742069285057, null, -0.7630609621393, -0.9085324734165159, null, -0.7630609621393, 0.3872313914897627, null, -0.7630609621393, 0.3649708456581655, null, -0.7630609621393, 0.7200829848027053, null, -0.7630609621393, -0.8827479130083424, null, -0.7630609621393, -0.9558399737657307, null, -0.7630609621393, 0.9809231147174591, null, 0.4199165570617058, 0.9362843495195916, null, 0.4199165570617058, 0.22130827284161117, null, 0.4199165570617058, -0.40187954425632394, null, 0.4199165570617058, 0.3302570242886357, null, 0.4199165570617058, -0.9785727883628312, null, 0.4199165570617058, -0.6465028537061525, null, 0.4199165570617058, 0.8243785422818537, null, 0.4199165570617058, -0.40162657647280553, null, 0.4199165570617058, -0.5850888916992016, null, 0.4199165570617058, -0.057009287446160024, null, 0.4199165570617058, -0.512618056954328, null, 0.6485642105227907, -0.8895534768461892, null, 0.6485642105227907, 0.09932732982984915, null, 0.6485642105227907, 0.49892421248374774, null, 0.6485642105227907, 0.01900049057177776, null, 0.6485642105227907, 
-0.730495274352102, null, 0.6485642105227907, 0.8495887644201006, null, 0.6485642105227907, 0.848365999224878, null, 0.7898677172442734, -0.8633467439511091, null, 0.7898677172442734, -0.17713497970762462, null, 0.7898677172442734, -0.8215801417254007, null, 0.7898677172442734, -0.29861714820698015, null, 0.7898677172442734, 0.9172082326532601, null, 0.7898677172442734, -0.8583346525780225, null, 0.7898677172442734, -0.7291473004111124, null, 0.7898677172442734, 0.714389317460196, null, -0.6399233719514198, 0.31477303205189827, null, -0.6399233719514198, 0.016054840113635303, null, -0.6399233719514198, 0.435407999101931, null, -0.6399233719514198, 0.21794500841697764, null, -0.6399233719514198, -0.08386027809339763, null, -0.6399233719514198, -0.5588009110320521, null, -0.6399233719514198, -0.702877750488982, null, -0.6399233719514198, 0.6137218735055383, null, -0.6399233719514198, 0.9259392889340138, null, -0.6399233719514198, -0.9196502695154404, null, -0.6399233719514198, -0.13293379665519128, null, -0.6399233719514198, 0.5727414283339768, null, -0.5789927430822626, 0.6218337579536217, null, -0.5789927430822626, 0.19427792517129622, null, -0.5789927430822626, -0.33392784574127693, null, -0.5789927430822626, -0.2949166390359592, null, -0.5789927430822626, 0.9028588536680511, null ] }, { "hoverinfo": "text", "hovertext": [ "Hydrological Science
Type: domain", "Environmental Health
Type: domain", "Social Systems
Type: domain", "Governance & Policy
Type: domain", "Topic 0
Type: topic", "Topic 2
Type: topic", "Topic 3
Type: topic", "Topic 4
Type: topic", "Topic 5
Type: topic", "Topic 6
Type: topic", "Topic 7
Type: topic", "Topic 8
Type: topic", "Topic 9
Type: topic", "Topic 10
Type: topic", "Topic 12
Type: topic", "Topic 13
Type: topic", "Topic 14
Type: topic", "Topic 15
Type: topic", "Topic 16
Type: topic", "Topic 17
Type: topic", "Topic 18
Type: topic", "Topic 19
Type: topic", "Topic 20
Type: topic", "Topic 21
Type: topic", "Topic 22
Type: topic", "Topic 23
Type: topic", "SVO: temperature
Type: svo", "SVO: seasonal
Type: svo", "SVO: climate change
Type: svo", "SVO: capacity
Type: svo", "SVO: condition
Type: svo", "SVO: water quality
Type: svo", "SVO: ph
Type: svo", "SVO: coliform
Type: svo", "SVO: household
Type: svo", "SVO: funding
Type: svo", "SVO: expense
Type: svo", "SVO: revenue
Type: svo", "SVO: subsidy
Type: svo", "SVO: rate
Type: svo", "SVO: staff
Type: svo", "SVO: regulation
Type: svo", "SVO: standard
Type: svo", "SVO: requirement
Type: svo", "SVO: thaw
Type: svo", "SVO: price
Type: svo", "SVO: value
Type: svo", "SVO: permit
Type: svo", "SVO: depth
Type: svo", "SVO: height
Type: svo", "SVO: permafrost depth
Type: svo", "SVO: frost depth
Type: svo", "SVO: disinfection
Type: svo", "SVO: employment
Type: svo", "SVO: investment
Type: svo", "SVO: certification level
Type: svo", "SVO: budget
Type: svo", "SVO: downtime
Type: svo", "SVO: pressure
Type: svo", "SVO: turbidity
Type: svo", "SVO: population
Type: svo", "SVO: water level
Type: svo", "SVO: warming
Type: svo", "SVO: discharge
Type: svo", "SVO: chlorine level
Type: svo", "SVO: violation
Type: svo", "SVO: pathogen
Type: svo", "SVO: volume
Type: svo", "SVO: residents
Type: svo", "SVO: users
Type: svo", "SVO: capital cost
Type: svo", "SVO: enforcement
Type: svo" ], "marker": { "color": "lightgray", "line": { "color": "gray", "width": 1 }, "opacity": 0.3, "size": [ 21, 21, 21, 21, 10.5, 10.5, 10.5, 10.5, 10.5, 10.5, 10.5, 10.5, 10.5, 10.5, 10.5, 10.5, 10.5, 10.5, 10.5, 10.5, 10.5, 10.5, 10.5, 10.5, 10.5, 10.5, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6 ] }, "mode": "markers", "name": "Other nodes", "text": [ "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "" ], "type": "scatter", "x": [ -0.03375203488121921, -0.6201018466185201, 0.23294618936412295, -0.6925982784228786, -0.5741182296288471, 0.32882613501254904, -0.7643783881921188, 0.02978972325730114, -0.8926158579935434, 0.18832320038276953, 0.4054272161283111, -0.4988361073943331, 0.9108900197998652, -0.38383296690225616, -0.9403115908369994, -0.8133706254764074, -0.6810915477896409, -0.1520140158891979, 0.25048926626765944, 0.9665178847449036, 0.7371245757526655, 0.18534235181690728, -0.857202164751744, -0.96212897299135, -0.3774722444932817, 0.9477240435691783, -0.5488194837350298, 0.5835533043116142, -0.945379411091151, 0.629441152991229, 0.04173023091645595, 0.9266911264276966, -0.06173425067807704, -0.6352543095287255, 0.7975211360118166, -0.8619046921498498, 0.9209243644267056, 0.9419404226458097, 0.6642900409870887, -0.7455910042198471, -0.9631358172556351, 0.9535647917381579, -0.1334553705677391, -0.4972901708831144, -0.5373106439607368, 0.8282028519242292, 0.3180990527765686, 0.6731157747795982, 0.9779057199778487, 0.4452493856699078, -0.1901414033825167, -0.8971248448333526, 0.5839160453434034, -0.983840834528855, -0.09945314456340379, 
-0.909953442784238, 1, 0.1370232744133935, -0.2031406770207927, 0.9441122654149047, 0.5225486740748134, -0.2896678644690961, 0.8753068524549898, -0.9058624422962158, 0.8216801704322445, -0.9350591356069989, 0.8981510728487058, 0.48229502301302024, -0.4012936175668631, -0.2680695669233201, 0.6854968272750016, -0.7046499979146361 ], "y": [ 0.958417620015098, 0.4199165570617058, 0.6485642105227907, 0.7898677172442734, 0.356742069285057, -0.9085324734165159, -0.4793674476458074, 0.9362843495195916, 0.22130827284161117, -0.9729874766255657, -0.8633467439511091, 0.9044474624204818, 0.5050735475114999, -0.896744108239975, 0.09932732982984915, 0.634146733107638, 0.6218337579536217, 0.3872313914897627, -0.9091996591144147, 0.3649708456581655, 0.7477814911445113, 0.9716275872930231, -0.4927494433618577, -0.17713497970762462, -0.8215801417254007, -0.20879632371738377, 0.8113452776103534, -0.7845171322979079, 0.3607678653368623, 0.7200829848027053, -0.9558399737657307, 0.3302570242886357, -0.9785727883628312, -0.6465028537061525, 0.49892421248374774, 0.435407999101931, 0.21794500841697764, -0.08386027809339763, -0.5588009110320521, -0.702877750488982, 0.19427792517129622, -0.29861714820698015, 0.9172082326532601, -0.8583346525780225, -0.7911007847794461, 0.6137218735055383, 0.9259392889340138, -0.7291473004111124, 0.14323362354273964, 0.9089861517563397, -0.9494570277159408, -0.16799357555051256, 0.8243785422818537, 0.01900049057177776, -0.9196502695154404, -0.2949166390359592, -0.13293379665519128, 0.9028588536680511, 0.9809231147174591, -0.40162657647280553, -0.730495274352102, -0.9540366547144596, 0.07094020957430419, -0.38832412203310424, -0.5850888916992016, -0.057009287446160024, -0.512618056954328, 0.8239098613774081, 0.8495887644201006, 0.848365999224878, 0.5727414283339768, 0.714389317460196 ] }, { "hoverinfo": "text", "hovertext": [ "Climate Science
Type: domain", "Infrastructure Engineering
Type: domain", "Economics & Resources
Type: domain", "Technical Operations
Type: domain", "Topic 1
Type: topic", "Topic 11
Type: topic", "Topic 24
Type: topic", "SVO: freeze
Type: svo", "SVO: age
Type: svo", "SVO: cost
Type: svo", "SVO: hours
Type: svo" ], "marker": { "color": "red", "line": { "color": "darkred", "width": 3 }, "size": [ 45, 45, 45, 45, 22.5, 22.5, 22.5, 12, 12, 12, 12 ] }, "mode": "markers+text", "name": "3_3__InterdependenciesNNA", "text": [ "Climate Science", "Infrastructure Engin", "Economics & Resource", "Technical Operations", "Topic 1", "Topic 11", "Topic 24", "freeze", "age", "cost", "hours" ], "textposition": "top center", "type": "scatter", "x": [ 0.27934449209228185, -0.6207272703420635, 0.7613271782720092, -0.7750493576615791, 0.12178995440002832, 0.8224004038623163, -0.8260719586704256, -0.33957844835512435, 0.4924330109804105, 0.9853690327962842, -0.7754402061030435 ], "y": [ 0.14331585819723078, -0.7630609621393, -0.6399233719514198, -0.5789927430822626, -0.8895534768461892, -0.40187954425632394, 0.31477303205189827, 0.9759676779027375, -0.8827479130083424, 0.016054840113635303, -0.33392784574127693 ] } ], "layout": { "annotations": [ { "font": { "color": "gray", "size": 12 }, "showarrow": false, "text": "Red nodes = discussed in 3_3__InterdependenciesNNA | Gray nodes = other content", "x": 0.5, "xanchor": "center", "xref": "paper", "y": -0.05, "yref": "paper" } ], "height": 800, "hovermode": "closest", "margin": { "b": 20, "l": 5, "r": 5, "t": 60 }, "plot_bgcolor": "white", "showlegend": true, "template": { "data": { "bar": [ { "error_x": { "color": "#2a3f5f" }, "error_y": { "color": "#2a3f5f" }, "marker": { "line": { "color": "#E5ECF6", "width": 0.5 }, "pattern": { "fillmode": "overlay", "size": 10, "solidity": 0.2 } }, "type": "bar" } ], "barpolar": [ { "marker": { "line": { "color": "#E5ECF6", "width": 0.5 }, "pattern": { "fillmode": "overlay", "size": 10, "solidity": 0.2 } }, "type": "barpolar" } ], "carpet": [ { "aaxis": { "endlinecolor": "#2a3f5f", "gridcolor": "white", "linecolor": "white", "minorgridcolor": "white", "startlinecolor": "#2a3f5f" }, "baxis": { "endlinecolor": "#2a3f5f", "gridcolor": "white", "linecolor": "white", "minorgridcolor": "white", 
"startlinecolor": "#2a3f5f" }, "type": "carpet" } ], "choropleth": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "type": "choropleth" } ], "contour": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "colorscale": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "type": "contour" } ], "contourcarpet": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "type": "contourcarpet" } ], "heatmap": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "colorscale": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "type": "heatmap" } ], "histogram": [ { "marker": { "pattern": { "fillmode": "overlay", "size": 10, "solidity": 0.2 } }, "type": "histogram" } ], "histogram2d": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "colorscale": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "type": "histogram2d" } ], "histogram2dcontour": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "colorscale": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 
0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "type": "histogram2dcontour" } ], "mesh3d": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "type": "mesh3d" } ], "parcoords": [ { "line": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "parcoords" } ], "pie": [ { "automargin": true, "type": "pie" } ], "scatter": [ { "fillpattern": { "fillmode": "overlay", "size": 10, "solidity": 0.2 }, "type": "scatter" } ], "scatter3d": [ { "line": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scatter3d" } ], "scattercarpet": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scattercarpet" } ], "scattergeo": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scattergeo" } ], "scattergl": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scattergl" } ], "scattermap": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scattermap" } ], "scattermapbox": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scattermapbox" } ], "scatterpolar": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scatterpolar" } ], "scatterpolargl": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scatterpolargl" } ], "scatterternary": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scatterternary" } ], "surface": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "colorscale": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "type": "surface" } ], "table": [ { "cells": { "fill": { "color": "#EBF0F8" }, "line": { "color": "white" } }, "header": { 
"fill": { "color": "#C8D4E3" }, "line": { "color": "white" } }, "type": "table" } ] }, "layout": { "annotationdefaults": { "arrowcolor": "#2a3f5f", "arrowhead": 0, "arrowwidth": 1 }, "autotypenumbers": "strict", "coloraxis": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "colorscale": { "diverging": [ [ 0, "#8e0152" ], [ 0.1, "#c51b7d" ], [ 0.2, "#de77ae" ], [ 0.3, "#f1b6da" ], [ 0.4, "#fde0ef" ], [ 0.5, "#f7f7f7" ], [ 0.6, "#e6f5d0" ], [ 0.7, "#b8e186" ], [ 0.8, "#7fbc41" ], [ 0.9, "#4d9221" ], [ 1, "#276419" ] ], "sequential": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "sequentialminus": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ] }, "colorway": [ "#636efa", "#EF553B", "#00cc96", "#ab63fa", "#FFA15A", "#19d3f3", "#FF6692", "#B6E880", "#FF97FF", "#FECB52" ], "font": { "color": "#2a3f5f" }, "geo": { "bgcolor": "white", "lakecolor": "white", "landcolor": "#E5ECF6", "showlakes": true, "showland": true, "subunitcolor": "white" }, "hoverlabel": { "align": "left" }, "hovermode": "closest", "mapbox": { "style": "light" }, "paper_bgcolor": "white", "plot_bgcolor": "#E5ECF6", "polar": { "angularaxis": { "gridcolor": "white", "linecolor": "white", "ticks": "" }, "bgcolor": "#E5ECF6", "radialaxis": { "gridcolor": "white", "linecolor": "white", "ticks": "" } }, "scene": { "xaxis": { "backgroundcolor": "#E5ECF6", "gridcolor": "white", "gridwidth": 2, "linecolor": "white", "showbackground": true, "ticks": "", "zerolinecolor": 
"white" }, "yaxis": { "backgroundcolor": "#E5ECF6", "gridcolor": "white", "gridwidth": 2, "linecolor": "white", "showbackground": true, "ticks": "", "zerolinecolor": "white" }, "zaxis": { "backgroundcolor": "#E5ECF6", "gridcolor": "white", "gridwidth": 2, "linecolor": "white", "showbackground": true, "ticks": "", "zerolinecolor": "white" } }, "shapedefaults": { "line": { "color": "#2a3f5f" } }, "ternary": { "aaxis": { "gridcolor": "white", "linecolor": "white", "ticks": "" }, "baxis": { "gridcolor": "white", "linecolor": "white", "ticks": "" }, "bgcolor": "#E5ECF6", "caxis": { "gridcolor": "white", "linecolor": "white", "ticks": "" } }, "title": { "x": 0.05 }, "xaxis": { "automargin": true, "gridcolor": "white", "linecolor": "white", "ticks": "", "title": { "standoff": 15 }, "zerolinecolor": "white", "zerolinewidth": 2 }, "yaxis": { "automargin": true, "gridcolor": "white", "linecolor": "white", "ticks": "", "title": { "standoff": 15 }, "zerolinecolor": "white", "zerolinewidth": 2 } } }, "title": { "font": { "size": 18 }, "text": "Network View: 3_3__InterdependenciesNNA", "x": 0.5, "xanchor": "center" }, "xaxis": { "showgrid": false, "showticklabels": false, "zeroline": false }, "yaxis": { "showgrid": false, "showticklabels": false, "zeroline": false } } }, "text/html": [ "
\n", "
" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "✓ Network displayed with transcript overlay\n", "\n", "Legend:\n", " 🔴 Red = Discussed in selected transcript\n", " ⚪ Gray = Other content\n", "\n", "================================================================================\n", "SINGLE TRANSCRIPT ANALYSIS COMPLETE\n", "================================================================================\n" ] } ], "source": [ "# CELL 25: Single Transcript Overlay (Optional)\n", "\n", "print(\"=\"*80)\n", "print(\"SINGLE TRANSCRIPT NETWORK OVERLAY\")\n", "print(\"=\"*80 + \"\\n\")\n", "\n", "# ==========================================\n", "# 1. SELECT TRANSCRIPT\n", "# ==========================================\n", "print(\"Available transcripts:\")\n", "doc_list = list(documents.keys())\n", "for i, doc_name in enumerate(doc_list, 1):\n", " print(f\" {i}. {doc_name}\")\n", "\n", "print()\n", "\n", "# Get user input with validation\n", "while True:\n", " try:\n", " user_input = input(\"Enter transcript number (1-{}): \".format(len(doc_list)))\n", " \n", " # Clean input: remove periods, spaces, strip\n", " clean_input = user_input.strip().rstrip('.')\n", " \n", " transcript_idx = int(clean_input) - 1\n", " \n", " # Validate range\n", " if 0 <= transcript_idx < len(doc_list):\n", " selected_doc = doc_list[transcript_idx]\n", " break\n", " else:\n", " print(f\"❌ Please enter a number between 1 and {len(doc_list)}\")\n", " except ValueError:\n", " print(\"❌ Invalid input. Please enter a number.\")\n", "\n", "print(f\"\\n✓ Selected: {selected_doc}\")\n", "print(\"=\"*80 + \"\\n\")\n", "\n", "# ==========================================\n", "# 2. 
ANALYZE THIS TRANSCRIPT\n", "# ==========================================\n", "print(\"Step 1: Finding what this transcript discusses...\")\n", "print(\"-\"*80)\n", "\n", "transcript_topics = []\n", "transcript_svos = []\n", "transcript_domains = set()\n", "\n", "# Get topics from this document\n", "if 'doc_topic_dist' in globals():\n", " try:\n", " doc_idx = list(documents.keys()).index(selected_doc)\n", " topic_probs = doc_topic_dist[doc_idx]\n", " \n", " # Get top 3 topics by probability\n", " top_topic_indices = topic_probs.argsort()[-3:][::-1]\n", " \n", " for topic_id in top_topic_indices:\n", " prob = topic_probs[topic_id]\n", " transcript_topics.append({\n", " 'id': topic_id,\n", " 'name': f\"Topic {topic_id}\",\n", " 'probability': prob\n", " })\n", " \n", " print(f\"✓ Found {len(transcript_topics)} main topics\")\n", " except Exception as e:\n", " print(f\"⚠️ Could not analyze topics: {e}\")\n", "else:\n", " print(\"⚠️ doc_topic_dist not available\")\n", "\n", "# Get SVOs mentioned in this document\n", "if 'svo_extractions' in globals():\n", " transcript_svos = [\n", " svo for svo in svo_extractions \n", " if svo['document'] == selected_doc\n", " ]\n", " \n", " transcript_domains = set(svo['domain'] for svo in transcript_svos)\n", " \n", " print(f\"✓ Found {len(transcript_svos)} SVO mentions\")\n", " print(f\"✓ Spans {len(transcript_domains)} science domains\")\n", "else:\n", " print(\"⚠️ svo_extractions not available\")\n", "\n", "print()\n", "\n", "# ==========================================\n", "# 3. 
DISPLAY SUMMARY\n", "# ==========================================\n", "print(\"=\"*80)\n", "print(f\"TRANSCRIPT SUMMARY: {selected_doc}\")\n", "print(\"=\"*80 + \"\\n\")\n", "\n", "if transcript_topics:\n", " print(\"Main Topics:\")\n", " for topic in transcript_topics:\n", " print(f\" • {topic['name']}: {topic['probability']:.3f} probability\")\n", " \n", " # Show topic words if available\n", " if 'topic_mappings' in globals():\n", " topic_info = next((t for t in topic_mappings if t['topic_id'] == topic['id']), None)\n", " if topic_info:\n", " print(f\" Words: {', '.join(topic_info['top_words'][:5])}\")\n", " print(f\" Domain: {topic_info['primary_domain']}\")\n", " print()\n", "\n", "if transcript_domains:\n", " print(\"Science Domains:\")\n", " for domain in sorted(transcript_domains):\n", " domain_svos = [svo for svo in transcript_svos if svo['domain'] == domain]\n", " unique_svos = set(svo['svo'] for svo in domain_svos)\n", " print(f\" • {domain}: {len(unique_svos)} variables\")\n", " print(f\" Examples: {', '.join(list(unique_svos)[:3])}\")\n", " print()\n", "\n", "if transcript_svos:\n", " print(f\"Total Measurable Variables: {len(set(svo['svo'] for svo in transcript_svos))}\")\n", " print()\n", "\n", "# ==========================================\n", "# 4. 
CREATE HIGHLIGHTED NETWORK\n", "# ==========================================\n", "print(\"=\"*80)\n", "print(\"HIGHLIGHTED NETWORK VIEW\")\n", "print(\"=\"*80 + \"\\n\")\n", "\n", "if 'G' in globals() and 'pos' in globals():\n", " import plotly.graph_objects as go\n", " \n", " print(\"Creating network with transcript overlay...\")\n", " \n", " # Get nodes to highlight\n", " highlight_nodes = set()\n", " \n", " # Add topics\n", " for topic in transcript_topics:\n", " highlight_nodes.add(topic['name'])\n", " \n", " # Add domains\n", " highlight_nodes.update(transcript_domains)\n", " \n", " # Add SVOs\n", " for svo in transcript_svos:\n", " highlight_nodes.add(f\"SVO: {svo['svo']}\")\n", " \n", " # Create edge trace (same as before)\n", " edge_trace = go.Scatter(\n", " x=[],\n", " y=[],\n", " mode='lines',\n", " line=dict(width=0.5, color='#ddd'),\n", " hoverinfo='none',\n", " showlegend=False\n", " )\n", " \n", " for edge in G.edges():\n", " x0, y0 = pos[edge[0]]\n", " x1, y1 = pos[edge[1]]\n", " edge_trace['x'] += (x0, x1, None)\n", " edge_trace['y'] += (y0, y1, None)\n", " \n", " # Create node traces (highlighted vs. regular)\n", " highlighted_trace = go.Scatter(\n", " x=[],\n", " y=[],\n", " mode='markers+text',\n", " name=f'{selected_doc}',\n", " marker=dict(\n", " size=[],\n", " color='red',\n", " line=dict(width=3, color='darkred')\n", " ),\n", " text=[],\n", " textposition='top center',\n", " hoverinfo='text',\n", " hovertext=[]\n", " )\n", " \n", " regular_trace = go.Scatter(\n", " x=[],\n", " y=[],\n", " mode='markers',\n", " name='Other nodes',\n", " marker=dict(\n", " size=[],\n", " color='lightgray',\n", " line=dict(width=1, color='gray'),\n", " opacity=0.3\n", " ),\n", " text=[],\n", " hoverinfo='text',\n", " hovertext=[]\n", " )\n", " \n", " # Separate nodes\n", " for node in G.nodes():\n", " x, y = pos[node]\n", " node_data = G.nodes[node]\n", " size = node_data.get('size', 10)\n", " \n", " hover = f\"{node}
Type: {node_data.get('node_type', 'unknown')}\"\n", " \n", " if node in highlight_nodes:\n", " highlighted_trace['x'] += (x,)\n", " highlighted_trace['y'] += (y,)\n", " highlighted_trace['marker']['size'] += (size * 1.5,) # Bigger\n", " highlighted_trace['text'] += (node.split(':')[-1].strip()[:20],)\n", " highlighted_trace['hovertext'] += (hover,)\n", " else:\n", " regular_trace['x'] += (x,)\n", " regular_trace['y'] += (y,)\n", " regular_trace['marker']['size'] += (size * 0.7,) # Smaller\n", " regular_trace['text'] += ('',)\n", " regular_trace['hovertext'] += (hover,)\n", " \n", " # Create figure\n", " fig = go.Figure(\n", " data=[edge_trace, regular_trace, highlighted_trace],\n", " layout=go.Layout(\n", " title=dict(\n", " text=f'Network View: {selected_doc}',\n", " x=0.5,\n", " xanchor='center',\n", " font=dict(size=18)\n", " ),\n", " showlegend=True,\n", " hovermode='closest',\n", " margin=dict(b=20, l=5, r=5, t=60),\n", " annotations=[\n", " dict(\n", " text=f\"Red nodes = discussed in {selected_doc} | Gray nodes = other content\",\n", " showarrow=False,\n", " xref=\"paper\", yref=\"paper\",\n", " x=0.5, y=-0.05,\n", " xanchor='center',\n", " font=dict(size=12, color='gray')\n", " )\n", " ],\n", " xaxis=dict(showgrid=False, zeroline=False, showticklabels=False),\n", " yaxis=dict(showgrid=False, zeroline=False, showticklabels=False),\n", " plot_bgcolor='white',\n", " height=800\n", " )\n", " )\n", " \n", " fig.show()\n", " \n", " print(\"✓ Network displayed with transcript overlay\")\n", " print()\n", " print(\"Legend:\")\n", " print(\" 🔴 Red = Discussed in selected transcript\")\n", " print(\" ⚪ Gray = Other content\")\n", " \n", "else:\n", " print(\"⚠️ Network not available - run Cell 23 first\")\n", "\n", "print()\n", "print(\"=\"*80)\n", "print(\"SINGLE TRANSCRIPT ANALYSIS COMPLETE\")\n", "print(\"=\"*80)" ] }, { "cell_type": "code", "execution_count": 32, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ 
"================================================================================\n", "📊 GENERATING PUBLICATION-QUALITY SUMMARY REPORT\n", "================================================================================\n", "\n", "Step 0: Configuring report parameters...\n", "--------------------------------------------------------------------------------\n", "✓ Report configuration complete\n", " Title: AI-Augmented Semantic Bridge: Translating Stakeholder Narratives to Decision Support\n", " Case study: Case Study: Bethel, Alaska Water Infrastructure Resilience\n", " Interviews analyzed: 9\n", "\n", "================================================================================\n", "FIGURE 1: Topic Model Comparison\n", "================================================================================\n", "\n", "Creating Figure 1: Multi-panel model comparison...\n" ] }, { "data": { "application/vnd.plotly.v1+json": { "config": { "plotlyServerURL": "https://plot.ly" }, "data": [ { "cells": { "align": "left", "fill": { "color": "white" }, "font": { "size": 11 }, "height": 25, "values": [ [ "Model 1 (Baseline)", "Model 2 (Enhanced)", "Model 3 (Stricter)" ], [ 25, 25, 25 ], [ 100, 2000, 543 ], [ 9, 9, 9 ] ] }, "domain": { "x": [ 0, 0.44 ], "y": [ 0.575, 1 ] }, "header": { "align": "left", "fill": { "color": "#1f77b4" }, "font": { "color": "white", "size": 12 }, "values": [ "Model", "Topics", "Features", "Docs" ] }, "type": "table" }, { "name": "Perplexity", "showlegend": true, "type": "bar", "x": [ "Model 1 (Baseline)", "Model 2 (Enhanced)", "Model 3 (Stricter)" ], "xaxis": "x", "y": [ 29545.73, 983342320.15, 4339995.56 ], "yaxis": "y" }, { "name": "Avg Coherence", "showlegend": true, "type": "bar", "x": [ "Model 1 (Baseline)", "Model 2 (Enhanced)", "Model 3 (Stricter)" ], "xaxis": "x", "y": [ 0.104, 0.083, 0.078 ], "yaxis": "y" }, { "name": "Topic Diversity", "showlegend": true, "type": "bar", "x": [ "Model 1 (Baseline)", "Model 2 (Enhanced)", "Model 3 (Stricter)" 
], "xaxis": "x", "y": [ 0.052, 0.015, 0.009 ], "yaxis": "y" }, { "marker": { "color": [ "#2ca02c", "#ff7f0e", "#d62728" ] }, "showlegend": false, "type": "bar", "x": [ "Model 1 (Baseline)", "Model 2 (Enhanced)", "Model 3 (Stricter)" ], "xaxis": "x2", "y": [ 25, 25, 25 ], "yaxis": "y2" }, { "marker": { "color": [ "#9467bd", "#8c564b", "#e377c2" ] }, "showlegend": false, "type": "bar", "x": [ "Model 1 (Baseline)", "Model 2 (Enhanced)", "Model 3 (Stricter)" ], "xaxis": "x3", "y": [ 100, 2000, 543 ], "yaxis": "y3" } ], "layout": { "annotations": [ { "font": { "size": 16 }, "showarrow": false, "text": "A. Model Parameters", "x": 0.22, "xanchor": "center", "xref": "paper", "y": 1, "yanchor": "bottom", "yref": "paper" }, { "font": { "size": 16 }, "showarrow": false, "text": "B. Quality Metrics", "x": 0.78, "xanchor": "center", "xref": "paper", "y": 1, "yanchor": "bottom", "yref": "paper" }, { "font": { "size": 16 }, "showarrow": false, "text": "C. Topic Count Distribution", "x": 0.22, "xanchor": "center", "xref": "paper", "y": 0.425, "yanchor": "bottom", "yref": "paper" }, { "font": { "size": 16 }, "showarrow": false, "text": "D. 
Vocabulary Coverage", "x": 0.78, "xanchor": "center", "xref": "paper", "y": 0.425, "yanchor": "bottom", "yref": "paper" } ], "font": { "family": "Arial", "size": 11 }, "height": 800, "plot_bgcolor": "white", "showlegend": true, "template": { "data": { "bar": [ { "error_x": { "color": "#2a3f5f" }, "error_y": { "color": "#2a3f5f" }, "marker": { "line": { "color": "#E5ECF6", "width": 0.5 }, "pattern": { "fillmode": "overlay", "size": 10, "solidity": 0.2 } }, "type": "bar" } ], "barpolar": [ { "marker": { "line": { "color": "#E5ECF6", "width": 0.5 }, "pattern": { "fillmode": "overlay", "size": 10, "solidity": 0.2 } }, "type": "barpolar" } ], "carpet": [ { "aaxis": { "endlinecolor": "#2a3f5f", "gridcolor": "white", "linecolor": "white", "minorgridcolor": "white", "startlinecolor": "#2a3f5f" }, "baxis": { "endlinecolor": "#2a3f5f", "gridcolor": "white", "linecolor": "white", "minorgridcolor": "white", "startlinecolor": "#2a3f5f" }, "type": "carpet" } ], "choropleth": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "type": "choropleth" } ], "contour": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "colorscale": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "type": "contour" } ], "contourcarpet": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "type": "contourcarpet" } ], "heatmap": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "colorscale": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "type": "heatmap" } ], 
"histogram": [ { "marker": { "pattern": { "fillmode": "overlay", "size": 10, "solidity": 0.2 } }, "type": "histogram" } ], "histogram2d": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "colorscale": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "type": "histogram2d" } ], "histogram2dcontour": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "colorscale": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "type": "histogram2dcontour" } ], "mesh3d": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "type": "mesh3d" } ], "parcoords": [ { "line": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "parcoords" } ], "pie": [ { "automargin": true, "type": "pie" } ], "scatter": [ { "fillpattern": { "fillmode": "overlay", "size": 10, "solidity": 0.2 }, "type": "scatter" } ], "scatter3d": [ { "line": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scatter3d" } ], "scattercarpet": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scattercarpet" } ], "scattergeo": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scattergeo" } ], "scattergl": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scattergl" } ], "scattermap": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scattermap" } ], "scattermapbox": [ { "marker": { "colorbar": { "outlinewidth": 0, 
"ticks": "" } }, "type": "scattermapbox" } ], "scatterpolar": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scatterpolar" } ], "scatterpolargl": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scatterpolargl" } ], "scatterternary": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scatterternary" } ], "surface": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "colorscale": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "type": "surface" } ], "table": [ { "cells": { "fill": { "color": "#EBF0F8" }, "line": { "color": "white" } }, "header": { "fill": { "color": "#C8D4E3" }, "line": { "color": "white" } }, "type": "table" } ] }, "layout": { "annotationdefaults": { "arrowcolor": "#2a3f5f", "arrowhead": 0, "arrowwidth": 1 }, "autotypenumbers": "strict", "coloraxis": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "colorscale": { "diverging": [ [ 0, "#8e0152" ], [ 0.1, "#c51b7d" ], [ 0.2, "#de77ae" ], [ 0.3, "#f1b6da" ], [ 0.4, "#fde0ef" ], [ 0.5, "#f7f7f7" ], [ 0.6, "#e6f5d0" ], [ 0.7, "#b8e186" ], [ 0.8, "#7fbc41" ], [ 0.9, "#4d9221" ], [ 1, "#276419" ] ], "sequential": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "sequentialminus": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, 
"#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ] }, "colorway": [ "#636efa", "#EF553B", "#00cc96", "#ab63fa", "#FFA15A", "#19d3f3", "#FF6692", "#B6E880", "#FF97FF", "#FECB52" ], "font": { "color": "#2a3f5f" }, "geo": { "bgcolor": "white", "lakecolor": "white", "landcolor": "#E5ECF6", "showlakes": true, "showland": true, "subunitcolor": "white" }, "hoverlabel": { "align": "left" }, "hovermode": "closest", "mapbox": { "style": "light" }, "paper_bgcolor": "white", "plot_bgcolor": "#E5ECF6", "polar": { "angularaxis": { "gridcolor": "white", "linecolor": "white", "ticks": "" }, "bgcolor": "#E5ECF6", "radialaxis": { "gridcolor": "white", "linecolor": "white", "ticks": "" } }, "scene": { "xaxis": { "backgroundcolor": "#E5ECF6", "gridcolor": "white", "gridwidth": 2, "linecolor": "white", "showbackground": true, "ticks": "", "zerolinecolor": "white" }, "yaxis": { "backgroundcolor": "#E5ECF6", "gridcolor": "white", "gridwidth": 2, "linecolor": "white", "showbackground": true, "ticks": "", "zerolinecolor": "white" }, "zaxis": { "backgroundcolor": "#E5ECF6", "gridcolor": "white", "gridwidth": 2, "linecolor": "white", "showbackground": true, "ticks": "", "zerolinecolor": "white" } }, "shapedefaults": { "line": { "color": "#2a3f5f" } }, "ternary": { "aaxis": { "gridcolor": "white", "linecolor": "white", "ticks": "" }, "baxis": { "gridcolor": "white", "linecolor": "white", "ticks": "" }, "bgcolor": "#E5ECF6", "caxis": { "gridcolor": "white", "linecolor": "white", "ticks": "" } }, "title": { "x": 0.05 }, "xaxis": { "automargin": true, "gridcolor": "white", "linecolor": "white", "ticks": "", "title": { "standoff": 15 }, "zerolinecolor": "white", "zerolinewidth": 2 }, "yaxis": { "automargin": true, "gridcolor": "white", "linecolor": "white", "ticks": "", "title": { "standoff": 15 }, "zerolinecolor": "white", "zerolinewidth": 2 } } }, "title": { "font": { "family": "Arial", "size": 16 }, "text": "Figure 1. 
Topic Model Comparison Across Preprocessing Strategies", "x": 0.5, "xanchor": "center" }, "xaxis": { "anchor": "y", "domain": [ 0.56, 1 ], "linecolor": "black", "showgrid": false, "showline": true }, "xaxis2": { "anchor": "y2", "domain": [ 0, 0.44 ], "linecolor": "black", "showgrid": false, "showline": true }, "xaxis3": { "anchor": "y3", "domain": [ 0.56, 1 ], "linecolor": "black", "showgrid": false, "showline": true }, "yaxis": { "anchor": "x", "domain": [ 0.575, 1 ], "gridcolor": "#eeeeee", "linecolor": "black", "showgrid": true, "showline": true, "title": { "text": "Metric Value" } }, "yaxis2": { "anchor": "x2", "domain": [ 0, 0.425 ], "gridcolor": "#eeeeee", "linecolor": "black", "showgrid": true, "showline": true, "title": { "text": "Number of Topics" } }, "yaxis3": { "anchor": "x3", "domain": [ 0, 0.425 ], "gridcolor": "#eeeeee", "linecolor": "black", "showgrid": true, "showline": true, "title": { "text": "Vocabulary Size" } } } }, "text/html": [ "
\n", "
" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "✓ Figure 1 saved: Figure1_ModelComparison.html\n", "\n", "**Figure 1. Topic Model Comparison Across Preprocessing Strategies.**\n", "Four-panel comparison of LDA models: (A) Model parameters showing topics, features, and documents; \n", "(B) Quality metrics including perplexity, coherence, and diversity scores; (C) Number of topics \n", "extracted by each model; (D) Vocabulary coverage achieved. Model 1 (Baseline) uses minimal \n", "preprocessing, Model 2 (Enhanced) applies comprehensive stopword filtering, and Model 3 (Stricter) \n", "implements stringent feature selection criteria. Results demonstrate trade-offs between topic \n", "granularity and interpretability.\n", " \n", "\n", "================================================================================\n", "FIGURE 2: Topic Overview from Selected Model\n", "================================================================================\n", "\n", "Creating Figure 2: Topic word clouds and distribution...\n" ] }, { "ename": "IndexError", "evalue": "list index out of range", "output_type": "error", "traceback": [ "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m", "\u001b[0;31mIndexError\u001b[0m Traceback (most recent call last)", "Cell \u001b[0;32mIn[32], line 242\u001b[0m\n\u001b[1;32m 233\u001b[0m colors \u001b[38;5;241m=\u001b[39m px\u001b[38;5;241m.\u001b[39mcolors\u001b[38;5;241m.\u001b[39mqualitative\u001b[38;5;241m.\u001b[39mSet3[:lda_model\u001b[38;5;241m.\u001b[39mn_components]\n\u001b[1;32m 235\u001b[0m \u001b[38;5;28;01mfor\u001b[39;00m topic \u001b[38;5;129;01min\u001b[39;00m topics_data:\n\u001b[1;32m 236\u001b[0m fig2\u001b[38;5;241m.\u001b[39madd_trace(\n\u001b[1;32m 237\u001b[0m go\u001b[38;5;241m.\u001b[39mBar(\n\u001b[1;32m 238\u001b[0m 
y\u001b[38;5;241m=\u001b[39m[\u001b[38;5;124mf\u001b[39m\u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mT\u001b[39m\u001b[38;5;132;01m{\u001b[39;00mtopic[\u001b[38;5;124m'\u001b[39m\u001b[38;5;124mtopic_id\u001b[39m\u001b[38;5;124m'\u001b[39m]\u001b[38;5;132;01m}\u001b[39;00m\u001b[38;5;124m: \u001b[39m\u001b[38;5;132;01m{\u001b[39;00mw\u001b[38;5;132;01m}\u001b[39;00m\u001b[38;5;124m\"\u001b[39m \u001b[38;5;28;01mfor\u001b[39;00m w \u001b[38;5;129;01min\u001b[39;00m topic[\u001b[38;5;124m'\u001b[39m\u001b[38;5;124mwords\u001b[39m\u001b[38;5;124m'\u001b[39m]],\n\u001b[1;32m 239\u001b[0m x\u001b[38;5;241m=\u001b[39mtopic[\u001b[38;5;124m'\u001b[39m\u001b[38;5;124mweights\u001b[39m\u001b[38;5;124m'\u001b[39m],\n\u001b[1;32m 240\u001b[0m orientation\u001b[38;5;241m=\u001b[39m\u001b[38;5;124m'\u001b[39m\u001b[38;5;124mh\u001b[39m\u001b[38;5;124m'\u001b[39m,\n\u001b[1;32m 241\u001b[0m name\u001b[38;5;241m=\u001b[39m\u001b[38;5;124mf\u001b[39m\u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mTopic \u001b[39m\u001b[38;5;132;01m{\u001b[39;00mtopic[\u001b[38;5;124m'\u001b[39m\u001b[38;5;124mtopic_id\u001b[39m\u001b[38;5;124m'\u001b[39m]\u001b[38;5;132;01m}\u001b[39;00m\u001b[38;5;124m\"\u001b[39m,\n\u001b[0;32m--> 242\u001b[0m marker_color\u001b[38;5;241m=\u001b[39m\u001b[43mcolors\u001b[49m\u001b[43m[\u001b[49m\u001b[43mtopic\u001b[49m\u001b[43m[\u001b[49m\u001b[38;5;124;43m'\u001b[39;49m\u001b[38;5;124;43mtopic_id\u001b[39;49m\u001b[38;5;124;43m'\u001b[39;49m\u001b[43m]\u001b[49m\u001b[43m]\u001b[49m,\n\u001b[1;32m 243\u001b[0m showlegend\u001b[38;5;241m=\u001b[39m\u001b[38;5;28;01mTrue\u001b[39;00m\n\u001b[1;32m 244\u001b[0m )\n\u001b[1;32m 245\u001b[0m )\n\u001b[1;32m 247\u001b[0m \u001b[38;5;66;03m# Update layout\u001b[39;00m\n\u001b[1;32m 248\u001b[0m fig2\u001b[38;5;241m.\u001b[39mupdate_layout(\n\u001b[1;32m 249\u001b[0m title\u001b[38;5;241m=\u001b[39m\u001b[38;5;28mdict\u001b[39m(\n\u001b[1;32m 250\u001b[0m 
text\u001b[38;5;241m=\u001b[39m\u001b[38;5;124mf\u001b[39m\u001b[38;5;124m'\u001b[39m\u001b[38;5;124mFigure 2. Topic Structure: Top \u001b[39m\u001b[38;5;132;01m{\u001b[39;00mn_top_words\u001b[38;5;132;01m}\u001b[39;00m\u001b[38;5;124m Words per Topic\u001b[39m\u001b[38;5;124m'\u001b[39m,\n\u001b[0;32m (...)\u001b[0m\n\u001b[1;32m 261\u001b[0m showlegend\u001b[38;5;241m=\u001b[39m\u001b[38;5;28;01mFalse\u001b[39;00m\n\u001b[1;32m 262\u001b[0m )\n", "\u001b[0;31mIndexError\u001b[0m: list index out of range" ] } ], "source": [ "# CELL 26: Publication-Quality Summary Report\n", "# Comprehensive Analysis: AI-Augmented Semantic Bridge from Stakeholder Narratives to Decision Support\n", "\n", "print(\"=\"*80)\n", "print(\"📊 GENERATING PUBLICATION-QUALITY SUMMARY REPORT\")\n", "print(\"=\"*80 + \"\\n\")\n", "\n", "import plotly.graph_objects as go\n", "from plotly.subplots import make_subplots\n", "import plotly.express as px\n", "import pandas as pd\n", "import numpy as np\n", "from pathlib import Path\n", "from datetime import datetime\n", "\n", "# ==========================================\n", "# 0. 
SETUP AND CONFIGURATION\n", "# ==========================================\n", "print(\"Step 0: Configuring report parameters...\")\n", "print(\"-\"*80)\n", "\n", "# Create output directory\n", "OUTPUT_DIR = Path('publication_outputs')\n", "OUTPUT_DIR.mkdir(exist_ok=True)\n", "\n", "FIGURES_DIR = OUTPUT_DIR / 'figures'\n", "FIGURES_DIR.mkdir(exist_ok=True)\n", "\n", "TABLES_DIR = OUTPUT_DIR / 'tables'\n", "TABLES_DIR.mkdir(exist_ok=True)\n", "\n", "# Report metadata\n", "REPORT_METADATA = {\n", " 'title': 'AI-Augmented Semantic Bridge: Translating Stakeholder Narratives to Decision Support',\n", " 'subtitle': 'Case Study: Bethel, Alaska Water Infrastructure Resilience',\n", " 'date': datetime.now().strftime('%B %d, %Y'),\n", " 'authors': 'Research Team',\n", " 'n_interviews': len(documents) if 'documents' in globals() else 0,\n", " 'n_models': len(model_results) if 'model_results' in globals() else 0\n", "}\n", "\n", "print(f\"✓ Report configuration complete\")\n", "print(f\" Title: {REPORT_METADATA['title']}\")\n", "print(f\" Case study: {REPORT_METADATA['subtitle']}\")\n", "print(f\" Interviews analyzed: {REPORT_METADATA['n_interviews']}\")\n", "print()\n", "\n", "# ==========================================\n", "# 1. FIGURE 1: MODEL COMPARISON PANEL\n", "# ==========================================\n", "print(\"=\"*80)\n", "print(\"FIGURE 1: Topic Model Comparison\")\n", "print(\"=\"*80 + \"\\n\")\n", "\n", "if 'model_results' in globals() and len(model_results) > 0:\n", " print(\"Creating Figure 1: Multi-panel model comparison...\")\n", " \n", " # Create 2x2 subplot\n", " fig1 = make_subplots(\n", " rows=2, cols=2,\n", " subplot_titles=(\n", " 'A. Model Parameters',\n", " 'B. Quality Metrics',\n", " 'C. Topic Count Distribution',\n", " 'D. 
Vocabulary Coverage'\n", " ),\n", " specs=[\n", " [{'type': 'table'}, {'type': 'bar'}],\n", " [{'type': 'bar'}, {'type': 'bar'}]\n", " ],\n", " vertical_spacing=0.15,\n", " horizontal_spacing=0.12\n", " )\n", " \n", " # Panel A: Model parameters table\n", " model_params = []\n", " for model in model_results:\n", " model_params.append({\n", " 'Model': model['name'],\n", " 'Topics': model['n_topics'],\n", " 'Features': len(model['feature_names']),\n", " 'Docs': model['doc_term_matrix'].shape[0] if model.get('doc_term_matrix') is not None else 'N/A'\n", " })\n", " \n", " params_df = pd.DataFrame(model_params)\n", " \n", " fig1.add_trace(\n", " go.Table(\n", " header=dict(\n", " values=list(params_df.columns),\n", " fill_color='#1f77b4',\n", " font=dict(color='white', size=12),\n", " align='left'\n", " ),\n", " cells=dict(\n", " values=[params_df[col] for col in params_df.columns],\n", " fill_color='white',\n", " font=dict(size=11),\n", " align='left',\n", " height=25\n", " )\n", " ),\n", " row=1, col=1\n", " )\n", " \n", " # Panel B: Quality metrics (if available)\n", " if 'metrics_df' in globals():\n", " metrics_to_plot = ['Perplexity', 'Avg Coherence', 'Topic Diversity']\n", " \n", " for metric in metrics_to_plot:\n", " if metric in metrics_df.columns:\n", " try:\n", " values = [float(str(v).replace('%', '')) for v in metrics_df[metric]]\n", " fig1.add_trace(\n", " go.Bar(\n", " x=metrics_df['Model'],\n", " y=values,\n", " name=metric,\n", " showlegend=True\n", " ),\n", " row=1, col=2\n", " )\n", " except (ValueError, TypeError):\n", " # Report metrics that cannot be parsed as numbers instead of silently dropping them\n", " print(f\"⚠️ Skipping metric '{metric}': non-numeric values\")\n", " else:\n", " # Placeholder\n", " fig1.add_trace(\n", " go.Bar(\n", " x=[m['name'] for m in model_results],\n", " y=[m['n_topics'] for m in model_results],\n", " name='Topics',\n", " marker_color='#1f77b4'\n", " ),\n", " row=1, col=2\n", " )\n", " \n", " # Panel C: Topic count\n", " fig1.add_trace(\n", " go.Bar(\n", " x=[m['name'] for m in model_results],\n", " y=[m['n_topics'] for m in model_results],\n", " marker_color=['#2ca02c', '#ff7f0e', 
'#d62728'][:len(model_results)],\n", " showlegend=False\n", " ),\n", " row=2, col=1\n", " )\n", " \n", " # Panel D: Vocabulary size\n", " fig1.add_trace(\n", " go.Bar(\n", " x=[m['name'] for m in model_results],\n", " y=[len(m['feature_names']) for m in model_results],\n", " marker_color=['#9467bd', '#8c564b', '#e377c2'][:len(model_results)],\n", " showlegend=False\n", " ),\n", " row=2, col=2\n", " )\n", " \n", " # Update layout\n", " fig1.update_layout(\n", " title=dict(\n", " text='Figure 1. Topic Model Comparison Across Preprocessing Strategies',\n", " font=dict(size=16, family='Arial'),\n", " x=0.5,\n", " xanchor='center'\n", " ),\n", " height=800,\n", " showlegend=True,\n", " font=dict(family='Arial', size=11),\n", " plot_bgcolor='white'\n", " )\n", " \n", " fig1.update_xaxes(showgrid=False, showline=True, linecolor='black')\n", " fig1.update_yaxes(showgrid=True, gridcolor='#eeeeee', showline=True, linecolor='black')\n", " \n", " # Update axis titles\n", " fig1.update_yaxes(title_text=\"Metric Value\", row=1, col=2)\n", " fig1.update_yaxes(title_text=\"Number of Topics\", row=2, col=1)\n", " fig1.update_yaxes(title_text=\"Vocabulary Size\", row=2, col=2)\n", " \n", " fig1.show()\n", " \n", " # Save figure\n", " fig1.write_html(str(FIGURES_DIR / 'Figure1_ModelComparison.html'))\n", " print(\"✓ Figure 1 saved: Figure1_ModelComparison.html\")\n", " \n", " # Figure caption\n", " caption1 = \"\"\"\n", "**Figure 1. Topic Model Comparison Across Preprocessing Strategies.**\n", "Four-panel comparison of LDA models: (A) Model parameters showing topics, features, and documents; \n", "(B) Quality metrics including perplexity, coherence, and diversity scores; (C) Number of topics \n", "extracted by each model; (D) Vocabulary coverage achieved. Model 1 (Baseline) uses minimal \n", "preprocessing, Model 2 (Enhanced) applies comprehensive stopword filtering, and Model 3 (Stricter) \n", "implements stringent feature selection criteria. 
Results demonstrate trade-offs between topic \n", "granularity and interpretability.\n", " \"\"\"\n", " print(caption1)\n", " print()\n", "\n", "else:\n", " print(\"⚠️ Skipping Figure 1 - model_results not available\")\n", " print()\n", "\n", "# ==========================================\n", "# 2. FIGURE 2: SELECTED MODEL TOPIC OVERVIEW\n", "# ==========================================\n", "print(\"=\"*80)\n", "print(\"FIGURE 2: Topic Overview from Selected Model\")\n", "print(\"=\"*80 + \"\\n\")\n", "\n", "if 'lda_model' in globals() and 'feature_names' in globals():\n", " print(\"Creating Figure 2: Topic word clouds and distribution...\")\n", " \n", " # Extract top words for each topic\n", " n_top_words = 10\n", " topics_data = []\n", " \n", " for topic_idx in range(lda_model.n_components):\n", " topic = lda_model.components_[topic_idx]\n", " top_indices = topic.argsort()[-n_top_words:][::-1]\n", " top_words = [feature_names[i] for i in top_indices]\n", " top_weights = topic[top_indices]\n", " \n", " topics_data.append({\n", " 'topic_id': topic_idx,\n", " 'words': top_words,\n", " 'weights': top_weights\n", " })\n", " \n", " # Create horizontal bar chart showing top words per topic\n", " fig2 = go.Figure()\n", " \n", " colors = px.colors.qualitative.Set3[:lda_model.n_components]\n", " \n", " for topic in topics_data:\n", " fig2.add_trace(\n", " go.Bar(\n", " y=[f\"T{topic['topic_id']}: {w}\" for w in topic['words']],\n", " x=topic['weights'],\n", " orientation='h',\n", " name=f\"Topic {topic['topic_id']}\",\n", " marker_color=colors[topic['topic_id']],\n", " showlegend=True\n", " )\n", " )\n", " \n", " # Update layout\n", " fig2.update_layout(\n", " title=dict(\n", " text=f'Figure 2. 
Topic Structure: Top {n_top_words} Words per Topic',\n", " font=dict(size=16, family='Arial'),\n", " x=0.5,\n", " xanchor='center'\n", " ),\n", " barmode='stack',\n", " height=600,\n", " xaxis_title='Word Weight',\n", " yaxis_title='',\n", " font=dict(family='Arial', size=10),\n", " plot_bgcolor='white',\n", " showlegend=False\n", " )\n", " \n", " fig2.update_xaxes(showgrid=True, gridcolor='#eeeeee', showline=True, linecolor='black')\n", " fig2.update_yaxes(showgrid=False, showline=True, linecolor='black')\n", " \n", " fig2.show()\n", " \n", " # Save\n", " fig2.write_html(str(FIGURES_DIR / 'Figure2_TopicOverview.html'))\n", " print(\"✓ Figure 2 saved: Figure2_TopicOverview.html\")\n", " \n", " caption2 = \"\"\"\n", "**Figure 2. Topic Structure Revealed by LDA Analysis.**\n", "Horizontal bar chart displaying the top 10 weighted terms for each discovered topic. Word weights \n", "represent the strength of association between terms and topics. Topics capture distinct thematic \n", "clusters in stakeholder narratives, ranging from infrastructure operations to climate adaptation \n", "concerns. This unsupervised extraction enables systematic identification of priorities and concerns \n", "across the interview corpus without requiring predefined categories.\n", " \"\"\"\n", " print(caption2)\n", " print()\n", "\n", "else:\n", " print(\"⚠️ Skipping Figure 2 - no model selected\")\n", " print()\n", "\n", "# ==========================================\n", "# 3. 
FIGURE 3: SCIENCE DOMAIN MAPPING\n", "# ==========================================\n", "print(\"=\"*80)\n", "print(\"FIGURE 3: Topic-to-Domain Semantic Bridge\")\n", "print(\"=\"*80 + \"\\n\")\n", "\n", "if 'topic_mappings' in globals() and 'science_backbone' in globals():\n", " print(\"Creating Figure 3: Domain mapping visualization...\")\n", " \n", " # Count topics per domain\n", " domain_counts = {}\n", " for topic in topic_mappings:\n", " domain = topic['primary_domain']\n", " domain_counts[domain] = domain_counts.get(domain, 0) + 1\n", " \n", " # Create rose/polar bar chart\n", " fig3 = go.Figure()\n", " \n", " domains = list(domain_counts.keys())\n", " counts = [domain_counts[d] for d in domains]\n", " \n", " # Add polar bar\n", " fig3.add_trace(\n", " go.Barpolar(\n", " r=counts,\n", " theta=domains,\n", " marker=dict(\n", " color=counts,\n", " colorscale='Viridis',\n", " showscale=True,\n", " colorbar=dict(title=\"Topic Count\")\n", " ),\n", " hovertemplate='%{theta}<br>Topics: %{r}'\n", " )\n", " )\n", " \n", " fig3.update_layout(\n", " title=dict(\n", " text='Figure 3. Topic Distribution Across Science Domains (Rose Diagram)',\n", " font=dict(size=16, family='Arial'),\n", " x=0.5,\n", " xanchor='center'\n", " ),\n", " polar=dict(\n", " radialaxis=dict(\n", " visible=True,\n", " showline=True,\n", " linecolor='black',\n", " gridcolor='#eeeeee'\n", " ),\n", " angularaxis=dict(\n", " showline=True,\n", " linecolor='black'\n", " )\n", " ),\n", " height=700,\n", " font=dict(family='Arial', size=11),\n", " showlegend=False\n", " )\n", " \n", " fig3.show()\n", " \n", " # Save\n", " fig3.write_html(str(FIGURES_DIR / 'Figure3_DomainMapping_Rose.html'))\n", " print(\"✓ Figure 3 saved: Figure3_DomainMapping_Rose.html\")\n", " \n", " caption3 = \"\"\"\n", "**Figure 3. Topic Distribution Across Science Domains.**\n", "Rose diagram (polar bar chart) showing the distribution of LDA-derived topics across formal science \n", "domains. The semantic bridge algorithm maps topics to domains by computing similarity between \n", "topic word distributions and domain keyword sets. Radial distance indicates the number of topics \n", "mapped to each domain. 
This visualization reveals which scientific disciplines are most prominently \n", "featured in stakeholder discourse, highlighting infrastructure engineering and hydrological science \n", "as dominant themes in Alaska water infrastructure discussions.\n", " \"\"\"\n", " print(caption3)\n", " print()\n", " \n", " # Create supplementary sunburst diagram\n", " print(\"Creating supplementary sunburst visualization...\")\n", " \n", " # Build hierarchical structure\n", " sunburst_labels = ['All Domains']\n", " sunburst_parents = ['']\n", " sunburst_values = [0] # Will be sum of children\n", " \n", " for domain in domain_counts.keys():\n", " sunburst_labels.append(domain)\n", " sunburst_parents.append('All Domains')\n", " sunburst_values.append(domain_counts[domain])\n", " \n", " # Update root value\n", " sunburst_values[0] = sum(domain_counts.values())\n", " \n", " fig3b = go.Figure(\n", " go.Sunburst(\n", " labels=sunburst_labels,\n", " parents=sunburst_parents,\n", " values=sunburst_values,\n", " branchvalues='total',\n", " marker=dict(\n", " colorscale='Viridis',\n", " showscale=True\n", " ),\n", " hovertemplate='%{label}<br>Topics: %{value}'\n", " )\n", " )\n", " \n", " fig3b.update_layout(\n", " title='Figure 3B. Topic-Domain Distribution (Sunburst)',\n", " font=dict(family='Arial', size=12),\n", " height=600\n", " )\n", " \n", " fig3b.show()\n", " fig3b.write_html(str(FIGURES_DIR / 'Figure3B_DomainMapping_Sunburst.html'))\n", " print(\"✓ Figure 3B saved (supplementary)\")\n", " print()\n", "\n", "else:\n", " print(\"⚠️ Skipping Figure 3 - topic_mappings not available\")\n", " print()\n", "\n", "# ==========================================\n", "# 4. TABLE 1: TOPIC-DOMAIN MAPPING SUMMARY\n", "# ==========================================\n", "print(\"=\"*80)\n", "print(\"TABLE 1: Topic-to-Domain Assignments\")\n", "print(\"=\"*80 + \"\\n\")\n", "\n", "if 'topic_mappings' in globals():\n", " print(\"Creating Table 1: Detailed topic-domain mappings...\")\n", " \n", " # Build table\n", " table1_data = []\n", " for topic in topic_mappings:\n", " table1_data.append({\n", " 'Topic ID': topic['topic_id'],\n", " 'Top Keywords': ', '.join(topic['top_words'][:5]),\n", " 'Primary Domain': topic['primary_domain'],\n", " 'Confidence': f\"{topic['confidence']:.2f}\",\n", " 'Secondary Domains': ', '.join(topic.get('secondary_domains', [])[:2]) if topic.get('secondary_domains') else '—'\n", " })\n", " \n", " table1_df = pd.DataFrame(table1_data)\n", " \n", " # Display\n", " print(table1_df.to_string(index=False))\n", " print()\n", " \n", " # Save to CSV\n", " table1_df.to_csv(TABLES_DIR / 'Table1_TopicDomainMapping.csv', index=False)\n", " print(f\"✓ Table 1 saved: Table1_TopicDomainMapping.csv\")\n", " \n", " caption_table1 = \"\"\"\n", "**Table 1. Topic-to-Domain Semantic Mapping Results.**\n", "Automated assignment of LDA topics to formal science domains using keyword-based similarity \n", "matching. Confidence scores represent the strength of semantic alignment between topic word \n", "distributions and domain keyword sets. Secondary domains indicate cross-disciplinary themes. 
\n", "This structured mapping enables translation from unstructured stakeholder narratives to \n", "formal scientific frameworks, establishing traceability from qualitative data to quantitative \n", "model formulations.\n", " \"\"\"\n", " print(caption_table1)\n", " print()\n", "\n", "else:\n", " print(\"⚠️ Skipping Table 1 - topic_mappings not available\")\n", " print()\n", "\n", "# ==========================================\n", "# 5. FIGURE 4: SCIENTIFIC VARIABLE OBJECTS (SVOs)\n", "# ==========================================\n", "print(\"=\"*80)\n", "print(\"FIGURE 4: Measurable Variables Extracted from Narratives\")\n", "print(\"=\"*80 + \"\\n\")\n", "\n", "if 'svo_extractions' in globals() and 'svos_by_domain' in globals():\n", " print(\"Creating Figure 4: SVO distribution and coverage...\")\n", " \n", " # Create 2-panel figure\n", " fig4 = make_subplots(\n", " rows=1, cols=2,\n", " subplot_titles=(\n", " 'A. SVOs per Science Domain',\n", " 'B. Data Availability Score'\n", " ),\n", " specs=[[{'type': 'bar'}, {'type': 'bar'}]]\n", " )\n", " \n", " # Panel A: SVO counts\n", " domain_svo_counts = {\n", " domain: sum(svos_by_domain[domain].values())\n", " for domain in svos_by_domain.keys()\n", " }\n", " \n", " domains_sorted = sorted(domain_svo_counts.keys(), key=lambda x: domain_svo_counts[x], reverse=True)\n", " \n", " fig4.add_trace(\n", " go.Bar(\n", " x=domains_sorted,\n", " y=[domain_svo_counts[d] for d in domains_sorted],\n", " marker_color='#2ca02c',\n", " showlegend=False,\n", " text=[domain_svo_counts[d] for d in domains_sorted],\n", " textposition='outside'\n", " ),\n", " row=1, col=1\n", " )\n", " \n", " # Panel B: Data availability (unique SVOs)\n", " unique_svos_per_domain = {\n", " domain: len(svos_by_domain[domain])\n", " for domain in svos_by_domain.keys()\n", " }\n", " \n", " fig4.add_trace(\n", " go.Bar(\n", " x=domains_sorted,\n", " y=[unique_svos_per_domain[d] for d in domains_sorted],\n", " marker_color='#ff7f0e',\n", " 
showlegend=False,\n", " text=[unique_svos_per_domain[d] for d in domains_sorted],\n", " textposition='outside'\n", " ),\n", " row=1, col=2\n", " )\n", " \n", " # Update layout\n", " fig4.update_layout(\n", " title=dict(\n", " text='Figure 4. Scientific Variable Objects: Quantitative Grounding',\n", " font=dict(size=16, family='Arial'),\n", " x=0.5,\n", " xanchor='center'\n", " ),\n", " height=500,\n", " font=dict(family='Arial', size=11),\n", " plot_bgcolor='white',\n", " showlegend=False\n", " )\n", " \n", " fig4.update_xaxes(showgrid=False, showline=True, linecolor='black', tickangle=-45)\n", " fig4.update_yaxes(showgrid=True, gridcolor='#eeeeee', showline=True, linecolor='black')\n", " \n", " fig4.update_yaxes(title_text=\"Total Mentions\", row=1, col=1)\n", " fig4.update_yaxes(title_text=\"Unique Variables\", row=1, col=2)\n", " \n", " fig4.show()\n", " \n", " # Save\n", " fig4.write_html(str(FIGURES_DIR / 'Figure4_SVO_Distribution.html'))\n", " print(\"✓ Figure 4 saved: Figure4_SVO_Distribution.html\")\n", " \n", " caption4 = \"\"\"\n", "**Figure 4. Scientific Variable Objects Extracted from Stakeholder Narratives.**\n", "(A) Total SVO mentions per science domain, indicating relative emphasis in stakeholder discourse. \n", "(B) Unique measurable variables identified per domain, representing data availability for \n", "quantitative modeling. SVOs are specific, measurable quantities (e.g., \"water level\", \"permafrost \n", "depth\", \"treatment capacity\") automatically extracted from interview transcripts. 
This extraction \n", "bridges the gap between qualitative narratives and quantitative model requirements by identifying \n", "which physical, chemical, biological, or social variables stakeholders reference when discussing \n", "system behavior and decision trade-offs.\n", " \"\"\"\n", " print(caption4)\n", " print()\n", "\n", "else:\n", " print(\"⚠️ Skipping Figure 4 - SVO data not available\")\n", " print()\n", "\n", "# ==========================================\n", "# 6. TABLE 2: TOP SVOs BY DOMAIN\n", "# ==========================================\n", "print(\"=\"*80)\n", "print(\"TABLE 2: Most Frequently Mentioned Variables\")\n", "print(\"=\"*80 + \"\\n\")\n", "\n", "if 'svos_by_domain' in globals():\n", " print(\"Creating Table 2: Top SVOs per domain...\")\n", " \n", " table2_rows = []\n", " \n", " for domain in sorted(svos_by_domain.keys()):\n", " top_svos = sorted(svos_by_domain[domain].items(), key=lambda x: -x[1])[:5]\n", " \n", " for rank, (svo, count) in enumerate(top_svos, 1):\n", " table2_rows.append({\n", " 'Domain': domain if rank == 1 else '',\n", " 'Rank': rank,\n", " 'Variable': svo,\n", " 'Mentions': count\n", " })\n", " \n", " table2_df = pd.DataFrame(table2_rows)\n", " \n", " print(table2_df.to_string(index=False))\n", " print()\n", " \n", " # Save\n", " table2_df.to_csv(TABLES_DIR / 'Table2_TopSVOs.csv', index=False)\n", " print(\"✓ Table 2 saved: Table2_TopSVOs.csv\")\n", " \n", " caption_table2 = \"\"\"\n", "**Table 2. Most Frequently Mentioned Scientific Variables by Domain.**\n", "Top 5 measurable variables extracted from each science domain, with frequency counts indicating \n", "stakeholder emphasis. Variables listed represent quantities that stakeholders referenced when \n", "describing system behavior, challenges, or decision criteria. High-frequency variables indicate \n", "priority data needs for decision support models. 
For example, frequent mentions of \"water level\" \n", "and \"flood stage\" in Hydrological Science suggest these should be key model outputs or inputs.\n", " \"\"\"\n", " print(caption_table2)\n", " print()\n", "\n", "else:\n", " print(\"⚠️ Skipping Table 2 - SVO data not available\")\n", " print()\n", "\n", "# ==========================================\n", "# 7. FIGURE 5: DECISION COMPONENTS\n", "# ==========================================\n", "print(\"=\"*80)\n", "print(\"FIGURE 5: Decision Problem Formulation\")\n", "print(\"=\"*80 + \"\\n\")\n", "\n", "if 'decision_components' in globals():\n", " print(\"Creating Figure 5: Decision component extraction...\")\n", " \n", " # Count components\n", " component_counts = {\n", " comp_type: len(items) \n", " for comp_type, items in decision_components.items()\n", " }\n", " \n", " # Create stacked bar showing component types\n", " fig5 = go.Figure()\n", " \n", " comp_colors = {\n", " 'objectives': '#1f77b4',\n", " 'constraints': '#ff7f0e', \n", " 'alternatives': '#2ca02c',\n", " 'tradeoffs': '#d62728'\n", " }\n", " \n", " for comp_type, count in component_counts.items():\n", " fig5.add_trace(\n", " go.Bar(\n", " name=comp_type.capitalize(),\n", " x=[comp_type.capitalize()],\n", " y=[count],\n", " marker_color=comp_colors.get(comp_type, '#888'),\n", " text=count,\n", " textposition='outside'\n", " )\n", " )\n", " \n", " fig5.update_layout(\n", " title=dict(\n", " text='Figure 5. 
Decision Components Extracted from Stakeholder Discourse',\n", " font=dict(size=16, family='Arial'),\n", " x=0.5,\n", " xanchor='center'\n", " ),\n", " xaxis_title='Component Type',\n", " yaxis_title='Count',\n", " height=500,\n", " font=dict(family='Arial', size=12),\n", " plot_bgcolor='white',\n", " showlegend=False,\n", " barmode='group'\n", " )\n", " \n", " fig5.update_xaxes(showgrid=False, showline=True, linecolor='black')\n", " fig5.update_yaxes(showgrid=True, gridcolor='#eeeeee', showline=True, linecolor='black')\n", " \n", " fig5.show()\n", " \n", " # Save\n", " fig5.write_html(str(FIGURES_DIR / 'Figure5_DecisionComponents.html'))\n", " print(\"✓ Figure 5 saved: Figure5_DecisionComponents.html\")\n", " \n", " caption5 = \"\"\"\n", "**Figure 5. Decision Components Automatically Extracted from Interview Transcripts.**\n", "Frequency distribution of decision-relevant statements identified through pattern matching and \n", "dependency parsing. Objectives represent desired outcomes (\"improve water quality\", \"reduce costs\"); \n", "Constraints indicate limitations or requirements (\"budget restrictions\", \"regulatory compliance\"); \n", "Alternatives capture different courses of action; Trade-offs identify competing priorities. This \n", "automated extraction translates qualitative stakeholder input into structured decision problem \n", "formulations suitable for optimization modeling frameworks.\n", " \"\"\"\n", " print(caption5)\n", " print()\n", "\n", "else:\n", " print(\"⚠️ Skipping Figure 5 - decision components not available\")\n", " print()\n", "\n", "# ==========================================\n", "# 8. 
FIGURE 6: SCIENCE BACKBONE NETWORK\n", "# ==========================================\n", "print(\"=\"*80)\n", "print(\"FIGURE 6: Integrated Semantic Bridge Network\")\n", "print(\"=\"*80 + \"\\n\")\n", "\n", "if 'G' in globals():\n", " print(\"Creating Figure 6: Full network visualization...\")\n", " \n", " # This uses the network created in Cell 23\n", " # Create publication-quality version with annotations\n", " \n", " pos = nx.spring_layout(G, k=2, iterations=50, seed=42)\n", " \n", " # Separate node types\n", " domain_nodes = [n for n in G.nodes() if G.nodes[n].get('node_type') == 'domain']\n", " topic_nodes = [n for n in G.nodes() if G.nodes[n].get('node_type') == 'topic']\n", " svo_nodes = [n for n in G.nodes() if G.nodes[n].get('node_type') == 'svo']\n", " \n", " # Create edge trace\n", " edge_trace = go.Scatter(\n", " x=[],\n", " y=[],\n", " mode='lines',\n", " line=dict(width=0.5, color='#cccccc'),\n", " hoverinfo='none',\n", " showlegend=False\n", " )\n", " \n", " for edge in G.edges():\n", " x0, y0 = pos[edge[0]]\n", " x1, y1 = pos[edge[1]]\n", " edge_trace['x'] += (x0, x1, None)\n", " edge_trace['y'] += (y0, y1, None)\n", " \n", " # Create node traces\n", " # Domains (large, blue)\n", " domain_trace = go.Scatter(\n", " x=[pos[n][0] for n in domain_nodes],\n", " y=[pos[n][1] for n in domain_nodes],\n", " mode='markers+text',\n", " marker=dict(size=25, color='#1f77b4', line=dict(width=2, color='white')),\n", " text=[n for n in domain_nodes],\n", " textposition='top center',\n", " textfont=dict(size=10, family='Arial'),\n", " name='Science Domains',\n", " hoverinfo='text',\n", " hovertext=[f\"{n}<br>Connections: {G.degree(n)}\" for n in domain_nodes]\n", " )\n", " \n", " # Topics (medium, orange)\n", " topic_trace = go.Scatter(\n", " x=[pos[n][0] for n in topic_nodes],\n", " y=[pos[n][1] for n in topic_nodes],\n", " mode='markers',\n", " marker=dict(size=12, color='#ff7f0e', line=dict(width=1, color='white')),\n", " name='Topics (LDA)',\n", " hoverinfo='text',\n", " hovertext=[f\"{n}<br>{G.nodes[n].get('top_words', 'N/A')}\" for n in topic_nodes]\n", " )\n", " \n", " # SVOs (small, green)\n", " svo_trace = go.Scatter(\n", " x=[pos[n][0] for n in svo_nodes],\n", " y=[pos[n][1] for n in svo_nodes],\n", " mode='markers',\n", " marker=dict(size=6, color='#2ca02c', line=dict(width=0.5, color='white')),\n", " name='Scientific Variables',\n", " hoverinfo='text',\n", " hovertext=[f\"{n}\" for n in svo_nodes]\n", " )\n", " \n", " # Create figure\n", " fig6 = go.Figure(\n", " data=[edge_trace, domain_trace, topic_trace, svo_trace],\n", " layout=go.Layout(\n", " title=dict(\n", " text='Figure 6. Semantic Bridge Network: From Narratives to Decision Variables',\n", " font=dict(size=16, family='Arial'),\n", " x=0.5,\n", " xanchor='center'\n", " ),\n", " showlegend=True,\n", " legend=dict(\n", " x=0.02,\n", " y=0.98,\n", " bgcolor='rgba(255,255,255,0.8)',\n", " bordercolor='black',\n", " borderwidth=1\n", " ),\n", " hovermode='closest',\n", " margin=dict(b=40, l=5, r=5, t=60),\n", " xaxis=dict(showgrid=False, zeroline=False, showticklabels=False),\n", " yaxis=dict(showgrid=False, zeroline=False, showticklabels=False),\n", " plot_bgcolor='white',\n", " height=800,\n", " font=dict(family='Arial', size=11)\n", " )\n", " )\n", " \n", " fig6.show()\n", " \n", " # Save\n", " fig6.write_html(str(FIGURES_DIR / 'Figure6_SemanticBridgeNetwork.html'))\n", " print(\"✓ Figure 6 saved: Figure6_SemanticBridgeNetwork.html\")\n", " \n", " caption6 = \"\"\"\n", "**Figure 6. Integrated Semantic Bridge Network Architecture.**\n", "Network diagram showing the complete translation pathway from unstructured stakeholder narratives \n", "to structured decision variables. Blue nodes represent formal science domains; orange nodes are \n", "LDA-derived topics; green nodes are Scientific Variable Objects (measurable quantities). Edges \n", "indicate semantic relationships established through algorithmic mapping. 
This network operationalizes \n", "the qualitative-to-quantitative bridge: stakeholder language → topic clusters → science domains → \n", "measurable variables. The structure provides explicit traceability, enabling practitioners to trace \n", "any model variable back to the stakeholder statements from which it was derived.\n", " \"\"\"\n", " print(caption6)\n", " print()\n", " \n", " # Network statistics\n", " print(\"Network Statistics:\")\n", " print(f\" • Total nodes: {G.number_of_nodes()}\")\n", " print(f\" • Total edges: {G.number_of_edges()}\")\n", " print(f\" • Domains: {len(domain_nodes)}\")\n", " print(f\" • Topics: {len(topic_nodes)}\")\n", " print(f\" • SVOs: {len(svo_nodes)}\")\n", " print(f\" • Average degree: {sum(dict(G.degree()).values()) / G.number_of_nodes():.2f}\")\n", " print(f\" • Network density: {nx.density(G):.4f}\")\n", " print()\n", "\n", "else:\n", " print(\"⚠️ Skipping Figure 6 - network not available\")\n", " print()\n", "\n", "# ==========================================\n", "# 9. TABLE 3: OPTIMIZATION-READY STRUCTURE\n", "# ==========================================\n", "print(\"=\"*80)\n", "print(\"TABLE 3: Optimization Model Components by Domain\")\n", "print(\"=\"*80 + \"\\n\")\n", "\n", "if 'optimization_structure' in globals():\n", " print(\"Creating Table 3: Optimization-ready formulation...\")\n", " \n", " table3_data = []\n", " \n", " for domain, structure in optimization_structure.items():\n", " table3_data.append({\n", " 'Science Domain': domain,\n", " 'Variables (n)': structure['variable_count'],\n", " 'Decision Elements (n)': structure['decision_count'],\n", " 'Sample Variables': ', '.join(structure['measurable_variables'][:3]),\n", " 'Sample Objective': structure['objectives'][0][:80] + '...' 
if structure['objectives'] else '—'\n", " })\n", " \n", " table3_df = pd.DataFrame(table3_data)\n", " \n", " print(table3_df.to_string(index=False))\n", " print()\n", " \n", " # Save\n", " table3_df.to_csv(TABLES_DIR / 'Table3_OptimizationStructure.csv', index=False)\n", " print(\"✓ Table 3 saved: Table3_OptimizationStructure.csv\")\n", " \n", " caption_table3 = \"\"\"\n", "**Table 3. Optimization-Ready Problem Structure Derived from Stakeholder Input.**\n", "Summary of decision model components automatically extracted and organized by science domain. \n", "'Variables' are measurable quantities (SVOs) that can serve as model inputs, outputs, or \n", "constraints. 'Decision Elements' include objectives, constraints, and trade-offs expressed by \n", "stakeholders. This structured representation enables direct incorporation into multi-objective \n", "optimization frameworks, ensuring that model formulations reflect authentic stakeholder priorities \n", "rather than analyst assumptions. The traceability from narrative statements to formal model \n", "components supports transparent, defensible decision processes.\n", " \"\"\"\n", " print(caption_table3)\n", " print()\n", "\n", "else:\n", " print(\"⚠️ Skipping Table 3 - optimization structure not available\")\n", " print()\n", "\n", "# ==========================================\n", "# 10. 
SUMMARY STATISTICS TABLE\n", "# ==========================================\n", "print(\"=\"*80)\n", "print(\"TABLE 4: Summary Statistics\")\n", "print(\"=\"*80 + \"\\n\")\n", "\n", "print(\"Creating Table 4: Overall analysis summary...\")\n", "\n", "summary_stats = {\n", " 'Data Collection': {\n", " 'Interviews Analyzed': len(documents) if 'documents' in globals() else 0,\n", " 'Total Words': sum(len(doc.split()) for doc in documents.values()) if 'documents' in globals() else 0,\n", " 'Avg Words per Interview': int(np.mean([len(doc.split()) for doc in documents.values()])) if 'documents' in globals() else 0\n", " },\n", " 'Topic Modeling': {\n", " 'Models Compared': len(model_results) if 'model_results' in globals() else 0,\n", " 'Topics in Selected Model': n_topics if 'n_topics' in globals() else 0,\n", " 'Vocabulary Size': len(feature_names) if 'feature_names' in globals() else 0\n", " },\n", " 'Semantic Bridge': {\n", " 'Science Domains': len(science_backbone) if 'science_backbone' in globals() else 0,\n", " 'Topic-Domain Mappings': len(topic_mappings) if 'topic_mappings' in globals() else 0,\n", " 'Avg Mapping Confidence': f\"{np.mean([t['confidence'] for t in topic_mappings]):.2f}\" if 'topic_mappings' in globals() else 0\n", " },\n", " 'Quantitative Grounding': {\n", " 'Total SVO Mentions': len(svo_extractions) if 'svo_extractions' in globals() else 0,\n", " 'Unique Variables': len(set(s['svo'] for s in svo_extractions)) if 'svo_extractions' in globals() else 0,\n", " 'Domains with SVOs': len(svos_by_domain) if 'svos_by_domain' in globals() else 0\n", " },\n", " 'Decision Components': {\n", " 'Objectives Identified': len(decision_components.get('objectives', [])) if 'decision_components' in globals() else 0,\n", " 'Constraints Identified': len(decision_components.get('constraints', [])) if 'decision_components' in globals() else 0,\n", " 'Total Decision Elements': sum(len(v) for v in decision_components.values()) if 'decision_components' in globals() else 
0\n", " }\n", "}\n", "\n", "# Convert to DataFrame\n", "table4_rows = []\n", "for category, metrics in summary_stats.items():\n", " for metric, value in metrics.items():\n", " table4_rows.append({\n", " 'Category': category,\n", " 'Metric': metric,\n", " 'Value': value\n", " })\n", "\n", "table4_df = pd.DataFrame(table4_rows)\n", "\n", "print(table4_df.to_string(index=False))\n", "print()\n", "\n", "# Save\n", "table4_df.to_csv(TABLES_DIR / 'Table4_SummaryStatistics.csv', index=False)\n", "print(\"✓ Table 4 saved: Table4_SummaryStatistics.csv\")\n", "\n", "caption_table4 = \"\"\"\n", "**Table 4. Summary Statistics for AI-Augmented Semantic Bridge Analysis.**\n", "Comprehensive metrics spanning data collection through decision component extraction. Statistics \n", "demonstrate the scale of automated processing: multiple models compared, hundreds of semantic \n", "mappings generated, and dozens of measurable variables and decision elements extracted—all without \n", "manual coding or categorization beyond initial framework definition. This computational acceleration \n", "enables rapid synthesis across diverse stakeholder groups while maintaining explicit provenance to \n", "source material.\n", "\"\"\"\n", "print(caption_table4)\n", "print()\n", "\n", "# ==========================================\n", "# 11. GENERATE TEXT REPORT\n", "# ==========================================\n", "print(\"=\"*80)\n", "print(\"GENERATING NARRATIVE REPORT\")\n", "print(\"=\"*80 + \"\\n\")\n", "\n", "print(\"Creating publication-ready narrative summary...\")\n", "\n", "report_text = f\"\"\"\n", "# {REPORT_METADATA['title']}\n", "## {REPORT_METADATA['subtitle']}\n", "\n", "**Date:** {REPORT_METADATA['date']}\n", "\n", "---\n", "\n", "## Executive Summary\n", "\n", "This report presents results from an AI-augmented semantic bridge analysis applied to \n", "{REPORT_METADATA['n_interviews']} stakeholder interviews regarding water infrastructure resilience \n", "in Bethel, Alaska. 
The methodology combines unsupervised machine learning (Latent Dirichlet \n", "Allocation topic modeling) with semantic mapping algorithms to translate unstructured qualitative \n", "narratives into structured, quantitative decision support components.\n", "\n", "**Key Findings:**\n", "\n", "1. **Topic Discovery:** Automated extraction identified {n_topics if 'n_topics' in globals() else 'N/A'} \n", " distinct thematic clusters in stakeholder discourse, spanning infrastructure operations, climate \n", " adaptation, community engagement, and resource management.\n", "\n", "2. **Science Domain Mapping:** Topics were successfully mapped to {len(science_backbone) if 'science_backbone' in globals() else 'N/A'} \n", " formal science domains with an average confidence score of {f\"{np.mean([t['confidence'] for t in topic_mappings]):.2f}\" if 'topic_mappings' in globals() else 'N/A'}, \n", " establishing semantic bridges between everyday language and scientific frameworks.\n", "\n", "3. **Quantitative Grounding:** Extracted {len(set(s['svo'] for s in svo_extractions)) if 'svo_extractions' in globals() else 'N/A'} \n", " unique measurable variables (Scientific Variable Objects) from stakeholder narratives, providing \n", " concrete data requirements for decision models.\n", "\n", "4. **Decision Formulation:** Identified {sum(len(v) for v in decision_components.values()) if 'decision_components' in globals() else 'N/A'} \n", " decision components (objectives, constraints, alternatives, trade-offs) ready for incorporation \n", " into optimization frameworks.\n", "\n", "**Methodological Innovation:**\n", "\n", "The semantic bridge approach operationalizes qualitative-to-quantitative translation at scale. \n", "Unlike manual coding which limits throughput, this computational pipeline processes interview \n", "corpora in minutes while maintaining explicit traceability from model variables back to source \n", "statements. 
The method preserves interpretive validity through human-in-the-loop framework \n", "definition while achieving computational acceleration through automated pattern matching and \n", "semantic similarity algorithms.\n", "\n", "**Decision Support Implications:**\n", "\n", "For Bethel's water infrastructure challenges, the analysis reveals stakeholder priorities cluster \n", "around permafrost-affected infrastructure, seasonal operational challenges, and community health \n", "concerns. The extracted variables and decision components provide a stakeholder-grounded foundation \n", "for optimization models, ensuring technical solutions align with community-expressed needs and \n", "values.\n", "\n", "---\n", "\n", "## 1. Data and Methods\n", "\n", "### 1.1 Interview Corpus\n", "\n", "**Dataset:** {len(documents) if 'documents' in globals() else 'N/A'} semi-structured interviews with \n", "water infrastructure stakeholders in Bethel, Alaska, including operators, managers, regulators, and \n", "community members.\n", "\n", "**Total corpus:** {sum(len(doc.split()) for doc in documents.values()) if 'documents' in globals() else 'N/A'} words\n", "\n", "**Average length:** {int(np.mean([len(doc.split()) for doc in documents.values()])) if 'documents' in globals() else 'N/A'} words per interview\n", "\n", "### 1.2 Topic Modeling\n", "\n", "Applied Latent Dirichlet Allocation (LDA) with three preprocessing strategies:\n", "- **Model 1 (Baseline):** Minimal preprocessing, 100 features, 5 topics\n", "- **Model 2 (Enhanced):** Comprehensive stopword filtering, 200 features, 8 topics \n", "- **Model 3 (Stricter):** Stringent filtering (4+ char words, 4+ doc frequency), 150 features, 6 topics\n", "\n", "Model selection based on perplexity, coherence, and diversity metrics. **Model 2 selected** for \n", "optimal balance of interpretability and coverage (see Figure 1).\n", "\n", "### 1.3 Semantic Bridge Algorithm\n", "\n", "**Topic-to-Domain Mapping:**\n", "1. 
Define science backbone with domain keywords (Hydrological Science, Climate Science, \n", " Infrastructure Engineering, etc.)\n", "2. Compute TF-IDF similarity between topic word distributions and domain keyword sets\n", "3. Assign topics to primary domain (highest similarity) and secondary domains (threshold > 0.5)\n", "4. Generate confidence scores based on keyword overlap and positional weighting\n", "\n", "**SVO Extraction:**\n", "1. Define patterns for measurable quantities per domain (e.g., \"water level\", \"permafrost depth\")\n", "2. Scan documents for pattern matches with context extraction\n", "3. Link SVOs to science domains and topics\n", "4. Quantify data availability (mention frequency, unique variables)\n", "\n", "**Decision Component Extraction:**\n", "1. Pattern matching for objectives (\"need to\", \"improve\", \"reduce\")\n", "2. Constraint identification (\"limited\", \"must\", \"required\")\n", "3. Alternative detection (\"option\", \"could\", \"instead\")\n", "4. Trade-off recognition (\"versus\", \"balance\", \"however\")\n", "5. Link components to SVOs via document co-occurrence\n", "\n", "### 1.4 Network Construction\n", "\n", "Built NetworkX graph with three node types:\n", "- **Domains** (large, blue): Formal science disciplines\n", "- **Topics** (medium, orange): LDA-derived thematic clusters\n", "- **SVOs** (small, green): Measurable variables\n", "\n", "Edges represent semantic relationships established by mapping algorithms.\n", "\n", "---\n", "\n", "## 2. Results\n", "\n", "### 2.1 Topic Model Performance (Figures 1 and 2)\n", "\n", "Three preprocessing strategies yielded models with distinct trade-offs. Model 3 (Stricter) achieved \n", "lowest perplexity but fewer topics; Model 2 (Enhanced) balanced coverage with interpretability. 
\n", "Quality metrics confirmed model stability (see Figure 1B).\n", "\n", "**Selected Model (Enhanced):** {n_topics if 'n_topics' in globals() else 'N/A'} topics, \n", "{len(feature_names) if 'feature_names' in globals() else 'N/A'} features\n", "\n", "Topics span operational concerns, climate impacts, infrastructure challenges, community engagement, \n", "and resource management. See Figure 2 for topic word distributions.\n", "\n", "### 2.2 Science Domain Mapping (Figure 3, Table 1)\n", "\n", "Topics successfully mapped to {len(science_backbone) if 'science_backbone' in globals() else 'N/A'} \n", "science domains. Most prominent:\n", "- Infrastructure Engineering\n", "- Hydrological Science\n", "- Climate Science\n", "- Social Systems\n", "\n", "Rose diagram (Figure 3) reveals Infrastructure Engineering and Hydrological Science as dominant \n", "themes, reflecting Bethel's focus on piped water systems, permafrost challenges, and flood risks.\n", "\n", "Average mapping confidence: {f\"{np.mean([t['confidence'] for t in topic_mappings]):.2f}\" if 'topic_mappings' in globals() else 'N/A'}\n", "\n", "Cross-disciplinary themes evident in secondary domain assignments, particularly Climate-Infrastructure \n", "interactions and Social-Infrastructure connections.\n", "\n", "### 2.3 Scientific Variable Objects (Figure 4, Table 2)\n", "\n", "Extracted {len(svo_extractions) if 'svo_extractions' in globals() else 'N/A'} SVO mentions \n", "({len(set(s['svo'] for s in svo_extractions)) if 'svo_extractions' in globals() else 'N/A'} unique) \n", "across {len(svos_by_domain) if 'svos_by_domain' in globals() else 'N/A'} domains.\n", "\n", "**Top variables (by frequency):**\n", "- Infrastructure: \"capacity\", \"maintenance frequency\", \"system age\"\n", "- Hydrological: \"water level\", \"flood stage\", \"flow rate\"\n", "- Climate: \"permafrost depth\", \"thaw\", \"temperature\"\n", "\n", "Data availability varies by domain (Figure 4B). 
Infrastructure and Hydrological domains show highest \n", "variable counts, indicating rich data requirements. Social Systems shows fewer measurable variables, \n", "suggesting need for qualitative indicators or proxy metrics.\n", "\n", "### 2.4 Decision Components (Figure 5)\n", "\n", "Identified {sum(len(v) for v in decision_components.values()) if 'decision_components' in globals() else 'N/A'} \n", "decision-relevant statements:\n", "- **Objectives:** {len(decision_components.get('objectives', [])) if 'decision_components' in globals() else 'N/A'} \n", " (e.g., \"improve water quality\", \"reduce downtime\", \"ensure compliance\")\n", "- **Constraints:** {len(decision_components.get('constraints', [])) if 'decision_components' in globals() else 'N/A'} \n", " (e.g., \"limited budget\", \"permafrost stability requirements\", \"regulatory standards\")\n", "- **Alternatives:** Identified in {len(decision_components.get('alternatives', [])) if 'decision_components' in globals() else 'N/A'} \n", " instances\n", "- **Trade-offs:** {len(decision_components.get('tradeoffs', [])) if 'decision_components' in globals() else 'N/A'} \n", " competing priorities noted\n", "\n", "These components provide raw material for multi-objective optimization model formulation.\n", "\n", "### 2.5 Integrated Network (Figure 6, Table 3)\n", "\n", "Complete semantic bridge network contains {G.number_of_nodes() if 'G' in globals() else 'N/A'} nodes \n", "and {G.number_of_edges() if 'G' in globals() else 'N/A'} edges, with network density \n", "{f\"{nx.density(G):.4f}\" if 'G' in globals() else 'N/A'}.\n", "\n", "Network structure reveals:\n", "1. **Hub domains:** Infrastructure Engineering and Hydrological Science show highest degree centrality\n", "2. **Bridging topics:** Several topics span multiple domains, indicating interdisciplinary themes\n", "3. 
**Variable-rich domains:** Domains with many SVO connections have stronger data foundations\n", "\n", "Optimization-ready structure (Table 3) organizes variables and decision elements by domain, ready \n", "for incorporation into formal modeling frameworks.\n", "\n", "---\n", "\n", "## 3. Discussion\n", "\n", "### 3.1 Methodological Contributions\n", "\n", "This work demonstrates scalable qualitative-to-quantitative translation for environmental decision \n", "support. Key innovations:\n", "\n", "1. **Automated yet traceable:** Computational acceleration preserves explicit provenance from model \n", " variables to stakeholder statements\n", "2. **Multi-modal integration:** Combines unsupervised learning, semantic similarity, and pattern \n", " matching for robust extraction\n", "3. **Optimization-ready outputs:** Structured components map directly to decision model formulations\n", "\n", "### 3.2 Bethel Water Infrastructure Insights\n", "\n", "Analysis reveals stakeholder priorities center on:\n", "- **Permafrost challenges:** Frequent mentions of thaw, subsidence, and infrastructure impacts\n", "- **Operational capacity:** Focus on maintenance, training, and resource limitations\n", "- **Seasonal variability:** Water level fluctuations, freeze-thaw cycles, and access constraints\n", "- **Community health:** Sanitation, water quality, and service reliability\n", "\n", "These priorities should guide model objective functions and constraint formulations.\n", "\n", "### 3.3 Data Requirements\n", "\n", "SVO analysis identifies specific data needs:\n", "- **High priority:** Water levels, infrastructure condition assessments, maintenance logs\n", "- **Moderate priority:** Permafrost monitoring, climate projections, population data \n", "- **Gaps:** Social indicators (community satisfaction, participation rates) need proxy development\n", "\n", "### 3.4 Limitations\n", "\n", "- **Context loss:** Automated extraction may miss nuanced meanings requiring human 
interpretation\n", "- **Framework dependency:** Results sensitive to science backbone definition; domain expertise required\n", "- **Temporal dynamics:** Static analysis doesn't capture evolving priorities over time\n", "- **Validation:** Semantic bridge accuracy requires expert review and stakeholder feedback\n", "\n", "### 3.5 Future Directions\n", "\n", "- **Iterative refinement:** Stakeholder review of mappings to improve accuracy\n", "- **Temporal analysis:** Track priority shifts across interview campaigns\n", "- **Multi-site comparison:** Apply framework to other Arctic communities for generalization\n", "- **Model integration:** Feed extracted components into actual optimization runs\n", "- **Decision pathway linking:** Connect extracted elements to decision pathway framework\n", "\n", "---\n", "\n", "## 4. Conclusions\n", "\n", "AI-augmented semantic bridge analysis successfully translated {REPORT_METADATA['n_interviews']} \n", "unstructured stakeholder interviews into structured decision support components. The methodology \n", "demonstrates:\n", "\n", "1. **Feasibility:** Qualitative-to-quantitative translation at scale\n", "2. **Traceability:** Explicit provenance from narratives to model variables \n", "3. **Actionability:** Optimization-ready problem formulations\n", "4. **Transparency:** Human-interpretable intermediate steps\n", "\n", "For Bethel water infrastructure, results ground decision models in authentic stakeholder priorities, \n", "ensuring technical solutions align with community-expressed needs. The framework generalizes to other \n", "environmental decision contexts requiring stakeholder input integration.\n", "\n", "**Recommended next steps:**\n", "1. Expert review of topic-domain mappings\n", "2. Stakeholder validation of extracted priorities\n", "3. Data collection targeting high-frequency SVOs\n", "4. Optimization model formulation using extracted components\n", "5. 
Iterative refinement based on decision outcomes\n", "\n", "---\n", "\n", "## Figures and Tables\n", "\n", "All figures and tables saved to:\n", "- **Figures:** {FIGURES_DIR}/\n", "- **Tables:** {TABLES_DIR}/\n", "\n", "### Figure List\n", "- Figure 1: Model Comparison (4 panels)\n", "- Figure 2: Topic Overview (horizontal bars)\n", "- Figure 3: Domain Mapping Rose Diagram\n", "- Figure 3B: Domain Mapping Sunburst (supplementary)\n", "- Figure 4: SVO Distribution (2 panels)\n", "- Figure 5: Decision Components (bar chart)\n", "- Figure 6: Semantic Bridge Network (network diagram)\n", "\n", "### Table List\n", "- Table 1: Topic-Domain Mappings (detailed assignments)\n", "- Table 2: Top SVOs by Domain (frequency rankings)\n", "- Table 3: Optimization Structure (variables and decisions by domain)\n", "- Table 4: Summary Statistics (overall metrics)\n", "\n", "---\n", "\n", "**Report generated:** {datetime.now().strftime('%Y-%m-%d %H:%M:%S')}\n", "\n", "**Analysis pipeline:** Cells 1-26 of semantic_bridge_NNA_v7.ipynb\n", "\n", "**For questions:** Contact research team\n", "\n", "---\n", "\"\"\"\n", "\n", "# Save report (explicit UTF-8 so the file is encoded the same way on every platform)\n", "report_path = OUTPUT_DIR / 'Publication_Summary_Report.md'\n", "with open(report_path, 'w', encoding='utf-8') as f:\n", " f.write(report_text)\n", "\n", "print(f\"✓ Narrative report saved: {report_path}\")\n", "print()\n", "\n", "# ==========================================\n", "# 12. 
FINAL SUMMARY\n", "# ==========================================\n", "print(\"=\"*80)\n", "print(\"✅ PUBLICATION REPORT GENERATION COMPLETE\")\n", "print(\"=\"*80 + \"\\n\")\n", "\n", "print(\"Outputs generated:\")\n", "print(f\" • Figures: {FIGURES_DIR}/\")\n", "print(f\" - Figure1_ModelComparison.html\")\n", "print(f\" - Figure2_TopicOverview.html\")\n", "print(f\" - Figure3_DomainMapping_Rose.html\")\n", "print(f\" - Figure3B_DomainMapping_Sunburst.html\")\n", "print(f\" - Figure4_SVO_Distribution.html\")\n", "print(f\" - Figure5_DecisionComponents.html\")\n", "print(f\" - Figure6_SemanticBridgeNetwork.html\")\n", "print()\n", "\n", "print(f\" • Tables: {TABLES_DIR}/\")\n", "print(f\" - Table1_TopicDomainMapping.csv\")\n", "print(f\" - Table2_TopSVOs.csv\")\n", "print(f\" - Table3_OptimizationStructure.csv\")\n", "print(f\" - Table4_SummaryStatistics.csv\")\n", "print()\n", "\n", "print(f\" • Report: {OUTPUT_DIR}/\")\n", "print(f\" - Publication_Summary_Report.md\")\n", "print()\n", "\n", "print(\"=\"*80)\n", "print(\"📊 ALL MATERIALS READY FOR PUBLICATION\")\n", "print(\"=\"*80)\n", "print()\n", "\n", "print(\"Next steps:\")\n", "print(\" 1. Review figures and tables\")\n", "print(\" 2. Customize captions for target journal\")\n", "print(\" 3. Export figures to required formats (PDF, PNG, EPS)\")\n", "print(\" 4. 
Integrate text into manuscript\")\n", "print()\n", "\n", "print(\"💡 Tips:\")\n", "print(\" • Figures are interactive HTML - open in browser to explore\")\n", "print(\" • Use browser 'Save as PDF' for static versions\")\n", "print(\" • Tables are CSV - import to Excel/LaTeX as needed\")\n", "print(\" • Narrative report provides boilerplate text\")\n", "print()" ] }, { "cell_type": "code", "execution_count": 33, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "================================================================================\n", "🌅 SCIENTIFIC VARIABLES SUNBURST DIAGRAM\n", "================================================================================\n", "\n", "Step 1: Checking requirements...\n", "--------------------------------------------------------------------------------\n", "✓ svo_extractions found: 824 SVO mentions\n", "✓ svos_by_domain found: 8 domains\n", "\n", "Step 2: Building sunburst hierarchy...\n", "--------------------------------------------------------------------------------\n", "✓ Hierarchy built: 58 nodes\n", " • 1 root node\n", " • 8 domain nodes\n", " • 49 SVO nodes (top 10 per domain)\n", "\n", "Step 3: Creating sunburst visualization...\n", "--------------------------------------------------------------------------------\n", "✓ Sunburst created\n", "\n" ] }, { "data": { "application/vnd.plotly.v1+json": { "config": { "plotlyServerURL": "https://plot.ly" }, "data": [ { "branchvalues": "total", "hoverinfo": "text", "hovertext": [ "All Scientific Variables
Total mentions: 824
Unique variables: 50
Domains: 8", "Climate Science
Total mentions: 83
Unique variables: 8
Percent of total: 10.1%", "freeze
Domain: Climate Science
Mentions: 32
Percent of domain: 38.6%", "seasonal
Domain: Climate Science
Mentions: 18
Percent of domain: 21.7%", "climate change
Domain: Climate Science
Mentions: 15
Percent of domain: 18.1%", "temperature
Domain: Climate Science
Mentions: 8
Percent of domain: 9.6%", "thaw
Domain: Climate Science
Mentions: 7
Percent of domain: 8.4%", "permafrost depth
Domain: Climate Science
Mentions: 1
Percent of domain: 1.2%", "frost depth
Domain: Climate Science
Mentions: 1
Percent of domain: 1.2%", "warming
Domain: Climate Science
Mentions: 1
Percent of domain: 1.2%", "Economics & Resources
Total mentions: 187
Unique variables: 11
Percent of total: 22.7%", "rate
Domain: Economics & Resources
Mentions: 67
Percent of domain: 35.8%", "cost
Domain: Economics & Resources
Mentions: 45
Percent of domain: 24.1%", "funding
Domain: Economics & Resources
Mentions: 34
Percent of domain: 18.2%", "price
Domain: Economics & Resources
Mentions: 10
Percent of domain: 5.3%", "revenue
Domain: Economics & Resources
Mentions: 8
Percent of domain: 4.3%", "value
Domain: Economics & Resources
Mentions: 6
Percent of domain: 3.2%", "investment
Domain: Economics & Resources
Mentions: 6
Percent of domain: 3.2%", "expense
Domain: Economics & Resources
Mentions: 4
Percent of domain: 2.1%", "budget
Domain: Economics & Resources
Mentions: 4
Percent of domain: 2.1%", "capital cost
Domain: Economics & Resources
Mentions: 2
Percent of domain: 1.1%", "Environmental Health
Total mentions: 114
Unique variables: 8
Percent of total: 13.8%", "ph
Domain: Environmental Health
Mentions: 58
Percent of domain: 50.9%", "water quality
Domain: Environmental Health
Mentions: 37
Percent of domain: 32.5%", "disinfection
Domain: Environmental Health
Mentions: 6
Percent of domain: 5.3%", "turbidity
Domain: Environmental Health
Mentions: 5
Percent of domain: 4.4%", "violation
Domain: Environmental Health
Mentions: 4
Percent of domain: 3.5%", "coliform
Domain: Environmental Health
Mentions: 2
Percent of domain: 1.8%", "chlorine level
Domain: Environmental Health
Mentions: 1
Percent of domain: 0.9%", "pathogen
Domain: Environmental Health
Mentions: 1
Percent of domain: 0.9%", "Governance & Policy
Total mentions: 69
Unique variables: 5
Percent of total: 8.4%", "standard
Domain: Governance & Policy
Mentions: 38
Percent of domain: 55.1%", "regulation
Domain: Governance & Policy
Mentions: 13
Percent of domain: 18.8%", "requirement
Domain: Governance & Policy
Mentions: 11
Percent of domain: 15.9%", "permit
Domain: Governance & Policy
Mentions: 5
Percent of domain: 7.2%", "enforcement
Domain: Governance & Policy
Mentions: 2
Percent of domain: 2.9%", "Hydrological Science
Total mentions: 14
Unique variables: 5
Percent of total: 1.7%", "depth
Domain: Hydrological Science
Mentions: 4
Percent of domain: 28.6%", "height
Domain: Hydrological Science
Mentions: 4
Percent of domain: 28.6%", "discharge
Domain: Hydrological Science
Mentions: 4
Percent of domain: 28.6%", "water level
Domain: Hydrological Science
Mentions: 1
Percent of domain: 7.1%", "volume
Domain: Hydrological Science
Mentions: 1
Percent of domain: 7.1%", "Infrastructure Engineering
Total mentions: 271
Unique variables: 4
Percent of total: 32.9%", "age
Domain: Infrastructure Engineering
Mentions: 233
Percent of domain: 86.0%", "capacity
Domain: Infrastructure Engineering
Mentions: 16
Percent of domain: 5.9%", "condition
Domain: Infrastructure Engineering
Mentions: 15
Percent of domain: 5.5%", "pressure
Domain: Infrastructure Engineering
Mentions: 7
Percent of domain: 2.6%", "Social Systems
Total mentions: 13
Unique variables: 5
Percent of total: 1.6%", "household
Domain: Social Systems
Mentions: 8
Percent of domain: 61.5%", "residents
Domain: Social Systems
Mentions: 2
Percent of domain: 15.4%", "employment
Domain: Social Systems
Mentions: 1
Percent of domain: 7.7%", "population
Domain: Social Systems
Mentions: 1
Percent of domain: 7.7%", "users
Domain: Social Systems
Mentions: 1
Percent of domain: 7.7%", "Technical Operations
Total mentions: 73
Unique variables: 4
Percent of total: 8.9%", "hours
Domain: Technical Operations
Mentions: 48
Percent of domain: 65.8%", "staff
Domain: Technical Operations
Mentions: 20
Percent of domain: 27.4%", "downtime
Domain: Technical Operations
Mentions: 4
Percent of domain: 5.5%", "certification level
Domain: Technical Operations
Mentions: 1
Percent of domain: 1.4%" ], "insidetextorientation": "radial", "labels": [ "Scientific Variables", "Climate Science", "freeze", "seasonal", "climate change", "temperature", "thaw", "permafrost depth", "frost depth", "warming", "Economics & Resources", "rate", "cost", "funding", "price", "revenue", "value", "investment", "expense", "budget", "capital cost", "Environmental Health", "ph", "water quality", "disinfection", "turbidity", "violation", "coliform", "chlorine level", "pathogen", "Governance & Policy", "standard", "regulation", "requirement", "permit", "enforcement", "Hydrological Science", "depth", "height", "discharge", "water level", "volume", "Infrastructure Engineering", "age", "capacity", "condition", "pressure", "Social Systems", "household", "residents", "employment", "population", "users", "Technical Operations", "hours", "staff", "downtime", "certification level" ], "marker": { "colors": [ "#cccccc", "#ff7f0e", "#ff7f0e", "#ff7f0e", "#ff7f0e", "#ff7f0e", "#ff7f0e", "#ff7f0e", "#ff7f0e", "#ff7f0e", "#e377c2", "#e377c2", "#e377c2", "#e377c2", "#e377c2", "#e377c2", "#e377c2", "#e377c2", "#e377c2", "#e377c2", "#e377c2", "#d62728", "#d62728", "#d62728", "#d62728", "#d62728", "#d62728", "#d62728", "#d62728", "#d62728", "#8c564b", "#8c564b", "#8c564b", "#8c564b", "#8c564b", "#8c564b", "#1f77b4", "#1f77b4", "#1f77b4", "#1f77b4", "#1f77b4", "#1f77b4", "#2ca02c", "#2ca02c", "#2ca02c", "#2ca02c", "#2ca02c", "#9467bd", "#9467bd", "#9467bd", "#9467bd", "#9467bd", "#9467bd", "#7f7f7f", "#7f7f7f", "#7f7f7f", "#7f7f7f", "#7f7f7f" ], "line": { "color": "white", "width": 2 } }, "parents": [ "", "Scientific Variables", "Climate Science", "Climate Science", "Climate Science", "Climate Science", "Climate Science", "Climate Science", "Climate Science", "Climate Science", "Scientific Variables", "Economics & Resources", "Economics & Resources", "Economics & Resources", "Economics & Resources", "Economics & Resources", "Economics & Resources", "Economics & Resources", 
"Economics & Resources", "Economics & Resources", "Economics & Resources", "Scientific Variables", "Environmental Health", "Environmental Health", "Environmental Health", "Environmental Health", "Environmental Health", "Environmental Health", "Environmental Health", "Environmental Health", "Scientific Variables", "Governance & Policy", "Governance & Policy", "Governance & Policy", "Governance & Policy", "Governance & Policy", "Scientific Variables", "Hydrological Science", "Hydrological Science", "Hydrological Science", "Hydrological Science", "Hydrological Science", "Scientific Variables", "Infrastructure Engineering", "Infrastructure Engineering", "Infrastructure Engineering", "Infrastructure Engineering", "Scientific Variables", "Social Systems", "Social Systems", "Social Systems", "Social Systems", "Social Systems", "Scientific Variables", "Technical Operations", "Technical Operations", "Technical Operations", "Technical Operations" ], "textfont": { "family": "Arial", "size": 11 }, "type": "sunburst", "values": [ 824, 83, 32, 18, 15, 8, 7, 1, 1, 1, 187, 67, 45, 34, 10, 8, 6, 6, 4, 4, 2, 114, 58, 37, 6, 5, 4, 2, 1, 1, 69, 38, 13, 11, 5, 2, 14, 4, 4, 4, 1, 1, 271, 233, 16, 15, 7, 13, 8, 2, 1, 1, 1, 73, 48, 20, 4, 1 ] } ], "layout": { "font": { "family": "Arial", "size": 11 }, "height": 800, "margin": { "b": 0, "l": 0, "r": 0, "t": 100 }, "template": { "data": { "bar": [ { "error_x": { "color": "#2a3f5f" }, "error_y": { "color": "#2a3f5f" }, "marker": { "line": { "color": "#E5ECF6", "width": 0.5 }, "pattern": { "fillmode": "overlay", "size": 10, "solidity": 0.2 } }, "type": "bar" } ], "barpolar": [ { "marker": { "line": { "color": "#E5ECF6", "width": 0.5 }, "pattern": { "fillmode": "overlay", "size": 10, "solidity": 0.2 } }, "type": "barpolar" } ], "carpet": [ { "aaxis": { "endlinecolor": "#2a3f5f", "gridcolor": "white", "linecolor": "white", "minorgridcolor": "white", "startlinecolor": "#2a3f5f" }, "baxis": { "endlinecolor": "#2a3f5f", "gridcolor": "white", 
"linecolor": "white", "minorgridcolor": "white", "startlinecolor": "#2a3f5f" }, "type": "carpet" } ], "choropleth": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "type": "choropleth" } ], "contour": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "colorscale": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "type": "contour" } ], "contourcarpet": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "type": "contourcarpet" } ], "heatmap": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "colorscale": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "type": "heatmap" } ], "histogram": [ { "marker": { "pattern": { "fillmode": "overlay", "size": 10, "solidity": 0.2 } }, "type": "histogram" } ], "histogram2d": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "colorscale": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "type": "histogram2d" } ], "histogram2dcontour": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "colorscale": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" 
], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "type": "histogram2dcontour" } ], "mesh3d": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "type": "mesh3d" } ], "parcoords": [ { "line": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "parcoords" } ], "pie": [ { "automargin": true, "type": "pie" } ], "scatter": [ { "fillpattern": { "fillmode": "overlay", "size": 10, "solidity": 0.2 }, "type": "scatter" } ], "scatter3d": [ { "line": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scatter3d" } ], "scattercarpet": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scattercarpet" } ], "scattergeo": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scattergeo" } ], "scattergl": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scattergl" } ], "scattermap": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scattermap" } ], "scattermapbox": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scattermapbox" } ], "scatterpolar": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scatterpolar" } ], "scatterpolargl": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scatterpolargl" } ], "scatterternary": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scatterternary" } ], "surface": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "colorscale": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "type": "surface" } ], "table": [ { "cells": { "fill": { "color": "#EBF0F8" }, "line": { 
"color": "white" } }, "header": { "fill": { "color": "#C8D4E3" }, "line": { "color": "white" } }, "type": "table" } ] }, "layout": { "annotationdefaults": { "arrowcolor": "#2a3f5f", "arrowhead": 0, "arrowwidth": 1 }, "autotypenumbers": "strict", "coloraxis": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "colorscale": { "diverging": [ [ 0, "#8e0152" ], [ 0.1, "#c51b7d" ], [ 0.2, "#de77ae" ], [ 0.3, "#f1b6da" ], [ 0.4, "#fde0ef" ], [ 0.5, "#f7f7f7" ], [ 0.6, "#e6f5d0" ], [ 0.7, "#b8e186" ], [ 0.8, "#7fbc41" ], [ 0.9, "#4d9221" ], [ 1, "#276419" ] ], "sequential": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "sequentialminus": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ] }, "colorway": [ "#636efa", "#EF553B", "#00cc96", "#ab63fa", "#FFA15A", "#19d3f3", "#FF6692", "#B6E880", "#FF97FF", "#FECB52" ], "font": { "color": "#2a3f5f" }, "geo": { "bgcolor": "white", "lakecolor": "white", "landcolor": "#E5ECF6", "showlakes": true, "showland": true, "subunitcolor": "white" }, "hoverlabel": { "align": "left" }, "hovermode": "closest", "mapbox": { "style": "light" }, "paper_bgcolor": "white", "plot_bgcolor": "#E5ECF6", "polar": { "angularaxis": { "gridcolor": "white", "linecolor": "white", "ticks": "" }, "bgcolor": "#E5ECF6", "radialaxis": { "gridcolor": "white", "linecolor": "white", "ticks": "" } }, "scene": { "xaxis": { "backgroundcolor": "#E5ECF6", "gridcolor": "white", "gridwidth": 2, "linecolor": "white", "showbackground": true, 
"ticks": "", "zerolinecolor": "white" }, "yaxis": { "backgroundcolor": "#E5ECF6", "gridcolor": "white", "gridwidth": 2, "linecolor": "white", "showbackground": true, "ticks": "", "zerolinecolor": "white" }, "zaxis": { "backgroundcolor": "#E5ECF6", "gridcolor": "white", "gridwidth": 2, "linecolor": "white", "showbackground": true, "ticks": "", "zerolinecolor": "white" } }, "shapedefaults": { "line": { "color": "#2a3f5f" } }, "ternary": { "aaxis": { "gridcolor": "white", "linecolor": "white", "ticks": "" }, "baxis": { "gridcolor": "white", "linecolor": "white", "ticks": "" }, "bgcolor": "#E5ECF6", "caxis": { "gridcolor": "white", "linecolor": "white", "ticks": "" } }, "title": { "x": 0.05 }, "xaxis": { "automargin": true, "gridcolor": "white", "linecolor": "white", "ticks": "", "title": { "standoff": 15 }, "zerolinecolor": "white", "zerolinewidth": 2 }, "yaxis": { "automargin": true, "gridcolor": "white", "linecolor": "white", "ticks": "", "title": { "standoff": 15 }, "zerolinecolor": "white", "zerolinewidth": 2 } } }, "title": { "font": { "family": "Arial", "size": 18 }, "text": "Scientific Variables by Science Domain
Hierarchical Distribution of Measurable Quantities from Stakeholder Narratives", "x": 0.5, "xanchor": "center" }, "width": 800 } }, "text/html": [ "
\n", "
" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "================================================================================\n", "📊 SVO DISTRIBUTION SUMMARY\n", "================================================================================\n", "\n", "Variables by Domain:\n", " Domain Total Mentions Unique SVOs Avg per SVO Top 3 Variables\n", " Climate Science 83 8 10.4 freeze, seasonal, climate change\n", " Economics & Resources 187 11 17.0 rate, cost, funding\n", " Environmental Health 114 8 14.2 ph, water quality, disinfection\n", " Governance & Policy 69 5 13.8 standard, regulation, requirement\n", " Hydrological Science 14 5 2.8 depth, height, discharge\n", "Infrastructure Engineering 271 4 67.8 age, capacity, condition\n", " Social Systems 13 5 2.6 household, residents, employment\n", " Technical Operations 73 4 18.2 hours, staff, downtime\n", "\n", "================================================================================\n", "📋 TOP VARIABLES BY DOMAIN\n", "================================================================================\n", "\n", "Climate Science: 83 mentions (8 unique)\n", " 1. freeze: 32 (38.6%)\n", " 2. seasonal: 18 (21.7%)\n", " 3. climate change: 15 (18.1%)\n", " 4. temperature: 8 (9.6%)\n", " 5. thaw: 7 (8.4%)\n", "\n", "Economics & Resources: 187 mentions (11 unique)\n", " 1. rate: 67 (35.8%)\n", " 2. cost: 45 (24.1%)\n", " 3. funding: 34 (18.2%)\n", " 4. price: 10 (5.3%)\n", " 5. revenue: 8 (4.3%)\n", "\n", "Environmental Health: 114 mentions (8 unique)\n", " 1. ph: 58 (50.9%)\n", " 2. water quality: 37 (32.5%)\n", " 3. disinfection: 6 (5.3%)\n", " 4. turbidity: 5 (4.4%)\n", " 5. violation: 4 (3.5%)\n", "\n", "Governance & Policy: 69 mentions (5 unique)\n", " 1. standard: 38 (55.1%)\n", " 2. regulation: 13 (18.8%)\n", " 3. requirement: 11 (15.9%)\n", " 4. permit: 5 (7.2%)\n", " 5. 
enforcement: 2 (2.9%)\n", "\n", "Hydrological Science: 14 mentions (5 unique)\n", " 1. depth: 4 (28.6%)\n", " 2. height: 4 (28.6%)\n", " 3. discharge: 4 (28.6%)\n", " 4. water level: 1 (7.1%)\n", " 5. volume: 1 (7.1%)\n", "\n", "Infrastructure Engineering: 271 mentions (4 unique)\n", " 1. age: 233 (86.0%)\n", " 2. capacity: 16 (5.9%)\n", " 3. condition: 15 (5.5%)\n", " 4. pressure: 7 (2.6%)\n", "\n", "Social Systems: 13 mentions (5 unique)\n", " 1. household: 8 (61.5%)\n", " 2. residents: 2 (15.4%)\n", " 3. employment: 1 (7.7%)\n", " 4. population: 1 (7.7%)\n", " 5. users: 1 (7.7%)\n", "\n", "Technical Operations: 73 mentions (4 unique)\n", " 1. hours: 48 (65.8%)\n", " 2. staff: 20 (27.4%)\n", " 3. downtime: 4 (5.5%)\n", " 4. certification level: 1 (1.4%)\n", "\n", "================================================================================\n", "🔗 CROSS-DOMAIN INSIGHTS\n", "================================================================================\n", "\n", "No variables span multiple domains\n", "\n", "Domain Concentration:\n", " Infrastructure Engineering ████████████████ 32.9%\n", " Economics & Resources ███████████ 22.7%\n", " Environmental Health ██████ 13.8%\n", " Climate Science █████ 10.1%\n", " Technical Operations ████ 8.9%\n", " Governance & Policy ████ 8.4%\n", " Hydrological Science 1.7%\n", " Social Systems 1.6%\n", "\n", "================================================================================\n", "💾 SAVING OUTPUTS\n", "================================================================================\n", "\n", "✓ Figure saved: publication_outputs/figures/SVO_Sunburst_by_Domain.html\n", "✓ Table saved: publication_outputs/tables/SVO_Summary_by_Domain.csv\n", "\n", "================================================================================\n", "✅ SUNBURST DIAGRAM COMPLETE\n", "================================================================================\n", "\n", "Variables created:\n", " • fig (Plotly figure) - Interactive 
sunburst diagram\n", " • summary_df (DataFrame) - Domain statistics\n", " • cross_domain_svos (dict) - Variables spanning multiple domains\n", "\n", "Visualization features:\n", " • Click domains to zoom in\n", " • Hover for detailed statistics\n", " • Outer ring = individual SVOs (top 10 per domain)\n", " • Middle ring = science domains\n", " • Center = all variables\n", "\n", "Interpretation:\n", " • 8 science domains identified\n", " • 50 unique measurable variables extracted\n", " • 824 total variable mentions\n", " • Top domain: Infrastructure Engineering (32.9%)\n", "\n", "💡 Use for:\n", " • Identifying which domains have most measurable variables\n", " • Understanding data availability by science area\n", " • Prioritizing monitoring/instrumentation investments\n", " • Showing stakeholder emphasis on different measurement types\n", "\n", "Figure caption:\n", "\n", "**Figure: Scientific Variables Distribution by Science Domain (Sunburst Diagram).**\n", "Hierarchical visualization showing the distribution of measurable quantities (Scientific Variable \n", "Objects, SVOs) extracted from stakeholder interviews, organized by formal science domains. Center \n", "represents all variables; middle ring shows science domains color-coded by discipline; outer ring \n", "displays top 10 most frequently mentioned variables per domain. Segment size indicates mention \n", "frequency. Click domains to zoom; hover for detailed statistics. This diagram reveals which scientific \n", "disciplines have the richest data availability for quantitative modeling, with Infrastructure \n", "Engineering and Hydrological Science showing highest variable counts. 
Variables appearing in multiple \n", "domains (e.g., \"water level\", \"temperature\") indicate cross-disciplinary measurement priorities.\n", " \n", "\n", "================================================================================\n" ] } ], "source": [ "# CELL: Scientific Variables Sunburst Diagram by Domain\n", "\n", "print(\"=\"*80)\n", "print(\"🌅 SCIENTIFIC VARIABLES SUNBURST DIAGRAM\")\n", "print(\"=\"*80 + \"\\n\")\n", "\n", "import plotly.graph_objects as go\n", "import plotly.express as px\n", "from collections import defaultdict\n", "import pandas as pd\n", "\n", "# ==========================================\n", "# 1. CHECK REQUIREMENTS\n", "# ==========================================\n", "print(\"Step 1: Checking requirements...\")\n", "print(\"-\"*80)\n", "\n", "if 'svo_extractions' not in globals() or not svo_extractions:\n", " print(\"❌ svo_extractions not found!\")\n", " print(\" Run Cell 21 (SVO Extraction) first\")\n", " has_svos = False\n", "else:\n", " print(f\"✓ svo_extractions found: {len(svo_extractions)} SVO mentions\")\n", " has_svos = True\n", "\n", "if 'svos_by_domain' not in globals() or not svos_by_domain:\n", " print(\"❌ svos_by_domain not found!\")\n", " print(\" Run Cell 21 (SVO Extraction) first\")\n", " has_domains = False\n", "else:\n", " print(f\"✓ svos_by_domain found: {len(svos_by_domain)} domains\")\n", " has_domains = True\n", "\n", "print()\n", "\n", "# ==========================================\n", "# 2. 
CREATE SUNBURST DIAGRAM\n", "# ==========================================\n", "if has_svos and has_domains:\n", " print(\"Step 2: Building sunburst hierarchy...\")\n", " print(\"-\"*80)\n", " \n", " # Build hierarchical structure for sunburst\n", " labels = []\n", " parents = []\n", " values = []\n", " colors = []\n", " hover_texts = []\n", " \n", " # Root node\n", " total_svo_mentions = len(svo_extractions)\n", " unique_svos = len(set(s['svo'] for s in svo_extractions))\n", " \n", " labels.append('Scientific Variables')\n", " parents.append('')\n", " values.append(total_svo_mentions)\n", " colors.append('#cccccc')\n", " hover_texts.append(\n", " f\"All Scientific Variables
\" +\n", " f\"Total mentions: {total_svo_mentions}
\" +\n", " f\"Unique variables: {unique_svos}
\" +\n", " f\"Domains: {len(svos_by_domain)}\"\n", " )\n", " \n", " # Define domain colors\n", " domain_color_map = {\n", " 'Hydrological Science': '#1f77b4',\n", " 'Climate Science': '#ff7f0e',\n", " 'Infrastructure Engineering': '#2ca02c',\n", " 'Environmental Health': '#d62728',\n", " 'Social Systems': '#9467bd',\n", " 'Governance & Policy': '#8c564b',\n", " 'Economics & Resources': '#e377c2',\n", " 'Technical Operations': '#7f7f7f'\n", " }\n", " \n", " # Add domain nodes and their SVOs\n", " for domain in sorted(svos_by_domain.keys()):\n", " domain_svos = svos_by_domain[domain]\n", " domain_total_mentions = sum(domain_svos.values())\n", " domain_unique_svos = len(domain_svos)\n", " \n", " # Add domain node\n", " labels.append(domain)\n", " parents.append('Scientific Variables')\n", " values.append(domain_total_mentions)\n", " colors.append(domain_color_map.get(domain, '#999999'))\n", " hover_texts.append(\n", " f\"{domain}
\" +\n", " f\"Total mentions: {domain_total_mentions}
\" +\n", " f\"Unique variables: {domain_unique_svos}
\" +\n", " f\"Percent of total: {100*domain_total_mentions/total_svo_mentions:.1f}%\"\n", " )\n", " \n", " # Add top SVOs for this domain (limit to top 10 to avoid clutter)\n", " top_svos = sorted(domain_svos.items(), key=lambda x: -x[1])[:10]\n", " \n", " for svo_name, svo_count in top_svos:\n", " labels.append(svo_name)\n", " parents.append(domain)\n", " values.append(svo_count)\n", " colors.append(domain_color_map.get(domain, '#999999'))\n", " hover_texts.append(\n", " f\"{svo_name}
\" +\n", " f\"Domain: {domain}
\" +\n", " f\"Mentions: {svo_count}
\" +\n", " f\"Percent of domain: {100*svo_count/domain_total_mentions:.1f}%\"\n", " )\n", " \n", " print(f\"✓ Hierarchy built: {len(labels)} nodes\")\n", " print(f\" • 1 root node\")\n", " print(f\" • {len(svos_by_domain)} domain nodes\")\n", " print(f\" • {len(labels) - len(svos_by_domain) - 1} SVO nodes (top 10 per domain)\")\n", " print()\n", " \n", " # ==========================================\n", " # 3. CREATE SUNBURST VISUALIZATION\n", " # ==========================================\n", " print(\"Step 3: Creating sunburst visualization...\")\n", " print(\"-\"*80)\n", " \n", " fig = go.Figure(\n", " go.Sunburst(\n", " labels=labels,\n", " parents=parents,\n", " values=values,\n", " branchvalues='total',\n", " marker=dict(\n", " colors=colors,\n", " line=dict(color='white', width=2)\n", " ),\n", " hovertext=hover_texts,\n", " hoverinfo='text',\n", " textfont=dict(size=11, family='Arial'),\n", " insidetextorientation='radial'\n", " )\n", " )\n", " \n", " # Update layout\n", " fig.update_layout(\n", " title=dict(\n", " text='Scientific Variables by Science Domain
Hierarchical Distribution of Measurable Quantities from Stakeholder Narratives',\n", " x=0.5,\n", " xanchor='center',\n", " font=dict(size=18, family='Arial')\n", " ),\n", " height=800,\n", " width=800,\n", " font=dict(family='Arial', size=11),\n", " margin=dict(t=100, l=0, r=0, b=0)\n", " )\n", " \n", " print(\"✓ Sunburst created\")\n", " print()\n", " \n", " # Display\n", " fig.show()\n", " \n", " # ==========================================\n", " # 4. CREATE SUMMARY STATISTICS\n", " # ==========================================\n", " print(\"=\"*80)\n", " print(\"📊 SVO DISTRIBUTION SUMMARY\")\n", " print(\"=\"*80 + \"\\n\")\n", " \n", " # Statistics by domain\n", " summary_data = []\n", " for domain in sorted(svos_by_domain.keys()):\n", " domain_svos = svos_by_domain[domain]\n", " domain_total = sum(domain_svos.values())\n", " domain_unique = len(domain_svos)\n", " \n", " # Top 3 SVOs for this domain\n", " top_3 = sorted(domain_svos.items(), key=lambda x: -x[1])[:3]\n", " top_3_names = [svo for svo, _ in top_3]\n", " \n", " summary_data.append({\n", " 'Domain': domain,\n", " 'Total Mentions': domain_total,\n", " 'Unique SVOs': domain_unique,\n", " 'Avg per SVO': f\"{domain_total/domain_unique:.1f}\",\n", " 'Top 3 Variables': ', '.join(top_3_names)\n", " })\n", " \n", " summary_df = pd.DataFrame(summary_data)\n", " \n", " print(\"Variables by Domain:\")\n", " print(summary_df.to_string(index=False))\n", " print()\n", " \n", " # ==========================================\n", " # 5. 
DETAILED DOMAIN BREAKDOWN\n", " # ==========================================\n", " print(\"=\"*80)\n", " print(\"📋 TOP VARIABLES BY DOMAIN\")\n", " print(\"=\"*80 + \"\\n\")\n", " \n", " for domain in sorted(svos_by_domain.keys()):\n", " domain_svos = svos_by_domain[domain]\n", " domain_total = sum(domain_svos.values())\n", " \n", " print(f\"{domain}: {domain_total} mentions ({len(domain_svos)} unique)\")\n", " \n", " # Top 5 SVOs\n", " top_5 = sorted(domain_svos.items(), key=lambda x: -x[1])[:5]\n", " for idx, (svo_name, count) in enumerate(top_5, 1):\n", " percent = 100 * count / domain_total\n", " print(f\" {idx}. {svo_name}: {count} ({percent:.1f}%)\")\n", " \n", " print()\n", " \n", " # ==========================================\n", " # 6. CROSS-DOMAIN ANALYSIS\n", " # ==========================================\n", " print(\"=\"*80)\n", " print(\"🔗 CROSS-DOMAIN INSIGHTS\")\n", " print(\"=\"*80 + \"\\n\")\n", " \n", " # Find which SVOs appear in multiple domains\n", " svo_to_domains = defaultdict(set)\n", " for extraction in svo_extractions:\n", " svo_to_domains[extraction['svo']].add(extraction['domain'])\n", " \n", " # SVOs that appear in multiple domains\n", " cross_domain_svos = {\n", " svo: domains \n", " for svo, domains in svo_to_domains.items() \n", " if len(domains) > 1\n", " }\n", " \n", " if cross_domain_svos:\n", " print(f\"Variables appearing in multiple domains: {len(cross_domain_svos)}\")\n", " print()\n", " \n", " # Show top cross-domain SVOs by number of domains\n", " cross_domain_sorted = sorted(\n", " cross_domain_svos.items(), \n", " key=lambda x: (-len(x[1]), x[0])\n", " )[:5]\n", " \n", " for svo, domains in cross_domain_sorted:\n", " print(f\" • {svo}: {len(domains)} domains\")\n", " print(f\" ({', '.join(sorted(domains))})\")\n", " print()\n", " else:\n", " print(\"No variables span multiple domains\")\n", " print()\n", " \n", " # Domain concentration analysis\n", " print(\"Domain Concentration:\")\n", " total_mentions = 
sum(sum(svos.values()) for svos in svos_by_domain.values())\n", " \n", " domain_percentages = []\n", " for domain in sorted(svos_by_domain.keys()):\n", " domain_total = sum(svos_by_domain[domain].values())\n", " percent = 100 * domain_total / total_mentions\n", " domain_percentages.append((domain, percent))\n", " \n", " domain_percentages.sort(key=lambda x: -x[1])\n", " \n", " for domain, percent in domain_percentages:\n", " bar = '█' * int(percent / 2) # Scale for display\n", " print(f\" {domain:30s} {bar} {percent:.1f}%\")\n", " \n", " print()\n", " \n", " # ==========================================\n", " # 7. SAVE OUTPUTS\n", " # ==========================================\n", " print(\"=\"*80)\n", " print(\"💾 SAVING OUTPUTS\")\n", " print(\"=\"*80 + \"\\n\")\n", " \n", " # Save figure\n", " from pathlib import Path\n", " output_dir = Path('publication_outputs/figures')\n", " output_dir.mkdir(parents=True, exist_ok=True)\n", " \n", " fig.write_html(str(output_dir / 'SVO_Sunburst_by_Domain.html'))\n", " print(f\"✓ Figure saved: {output_dir}/SVO_Sunburst_by_Domain.html\")\n", " \n", " # Save summary table\n", " table_dir = Path('publication_outputs/tables')\n", " table_dir.mkdir(parents=True, exist_ok=True)\n", " \n", " summary_df.to_csv(table_dir / 'SVO_Summary_by_Domain.csv', index=False)\n", " print(f\"✓ Table saved: {table_dir}/SVO_Summary_by_Domain.csv\")\n", " \n", " print()\n", " \n", " # ==========================================\n", " # 8. 
USAGE NOTES\n", " # ==========================================\n", " print(\"=\"*80)\n", " print(\"✅ SUNBURST DIAGRAM COMPLETE\")\n", " print(\"=\"*80 + \"\\n\")\n", " \n", " print(\"Variables created:\")\n", " print(\" • fig (Plotly figure) - Interactive sunburst diagram\")\n", " print(\" • summary_df (DataFrame) - Domain statistics\")\n", " print(\" • cross_domain_svos (dict) - Variables spanning multiple domains\")\n", " print()\n", " \n", " print(\"Visualization features:\")\n", " print(\" • Click domains to zoom in\")\n", " print(\" • Hover for detailed statistics\")\n", " print(\" • Outer ring = individual SVOs (top 10 per domain)\")\n", " print(\" • Middle ring = science domains\")\n", " print(\" • Center = all variables\")\n", " print()\n", " \n", " print(\"Interpretation:\")\n", " print(f\" • {len(svos_by_domain)} science domains identified\")\n", " print(f\" • {unique_svos} unique measurable variables extracted\")\n", " print(f\" • {total_svo_mentions} total variable mentions\")\n", " print(f\" • Top domain: {domain_percentages[0][0]} ({domain_percentages[0][1]:.1f}%)\")\n", " print()\n", " \n", " print(\"💡 Use for:\")\n", " print(\" • Identifying which domains have most measurable variables\")\n", " print(\" • Understanding data availability by science area\")\n", " print(\" • Prioritizing monitoring/instrumentation investments\")\n", " print(\" • Showing stakeholder emphasis on different measurement types\")\n", " print()\n", " \n", " # Figure caption for publication\n", " caption = \"\"\"\n", "**Figure: Scientific Variables Distribution by Science Domain (Sunburst Diagram).**\n", "Hierarchical visualization showing the distribution of measurable quantities (Scientific Variable \n", "Objects, SVOs) extracted from stakeholder interviews, organized by formal science domains. Center \n", "represents all variables; middle ring shows science domains color-coded by discipline; outer ring \n", "displays top 10 most frequently mentioned variables per domain. 
Segment size indicates mention \n", "frequency. Click domains to zoom; hover for detailed statistics. This diagram shows which scientific \n", "disciplines account for the most variable mentions and are therefore the strongest candidates for \n", "quantitative modeling; see the printed summary for the leading domains. Variables that appear in \n", "more than one domain, if any, indicate cross-disciplinary measurement priorities.\n", " \"\"\"\n", " print(\"Figure caption:\")\n", " print(caption)\n", "\n", "else:\n", " print(\"⚠️ Cannot create sunburst - missing required data\")\n", " print()\n", " if not has_svos:\n", " print(\"Missing: svo_extractions\")\n", " print(\" → Run Cell 21 (SVO Extraction)\")\n", " if not has_domains:\n", " print(\"Missing: svos_by_domain\")\n", " print(\" → Run Cell 21 (SVO Extraction)\")\n", "\n", "print(\"\\n\" + \"=\"*80)" ] }, { "cell_type": "code", "execution_count": 34, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "================================================================================\n", "🔍 PLOTLY SUNBURST DIAGNOSTIC TEST\n", "================================================================================\n", "\n", "Step 1: Testing Plotly renderer...\n", "--------------------------------------------------------------------------------\n", "Available renderers:\n", "Renderers configuration\n", "-----------------------\n", " Default renderer: 'plotly_mimetype+notebook'\n", " Available renderers:\n", " ['plotly_mimetype', 'jupyterlab', 'nteract', 'vscode',\n", " 'notebook', 'notebook_connected', 'kaggle', 'azure', 'colab',\n", " 'cocalc', 'databricks', 'json', 'png', 'jpeg', 'jpg', 'svg',\n", " 'pdf', 'browser', 'firefox', 'chrome', 'chromium', 'iframe',\n", " 'iframe_connected', 'sphinx_gallery', 'sphinx_gallery_png']\n", "\n", "\n", "Trying renderers in order...\n", " ✓ Set to 'notebook'\n", "\n", "Current renderer: notebook\n", "\n", "Step 2: Creating simple 
test sunburst...\n", "--------------------------------------------------------------------------------\n", "✓ Test figure created\n", "\n", "================================================================================\n", "DISPLAYING TEST FIGURE\n", "================================================================================\n", "\n", "If you see a colorful circular chart below, Plotly is working! ✓\n", "If you see nothing, there's a renderer issue.\n", "\n" ] }, { "data": { "text/html": [ " \n", " \n", " " ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "
\n", "
" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "✓ fig.show() executed\n", "\n", "================================================================================\n", "TROUBLESHOOTING\n", "================================================================================\n", "\n", "Environment: ZMQInteractiveShell\n", " → You're in Jupyter Notebook\n", " → Recommended renderer: 'notebook'\n", "\n", "To enable Plotly in JupyterLab, run these commands:\n", " pip install jupyterlab plotly\n", " jupyter labextension install jupyterlab-plotly\n", "\n", "To enable Plotly in Jupyter Notebook, run:\n", " pip install notebook plotly\n", " jupyter nbextension enable --py widgetsnbextension\n", "\n", "If using Google Colab:\n", " pio.renderers.default = 'colab'\n", "\n", "If nothing works, try:\n", " 1. Restart kernel\n", " 2. Run: import plotly.io as pio; pio.renderers.default = 'notebook'\n", " 3. Re-run this cell\n", "\n", "================================================================================\n" ] } ], "source": [ "# DIAGNOSTIC CELL: Test if Plotly Sunburst Works\n", "\n", "print(\"=\"*80)\n", "print(\"🔍 PLOTLY SUNBURST DIAGNOSTIC TEST\")\n", "print(\"=\"*80 + \"\\n\")\n", "\n", "import plotly.graph_objects as go\n", "import plotly.io as pio\n", "\n", "# ==========================================\n", "# 1. 
TEST PLOTLY RENDERER\n", "# ==========================================\n", "print(\"Step 1: Testing Plotly renderer...\")\n", "print(\"-\"*80)\n", "\n", "# Try to set renderer\n", "try:\n", " # Test different renderers\n", " renderers_to_try = ['notebook', 'jupyterlab', 'plotly_mimetype+notebook', 'colab']\n", " \n", " print(\"Available renderers:\")\n", " print(pio.renderers)\n", " print()\n", " \n", " print(\"Trying renderers in order...\")\n", " for renderer in renderers_to_try:\n", " try:\n", " pio.renderers.default = renderer\n", " print(f\" ✓ Set to '{renderer}'\")\n", " break\n", " except Exception:\n", " # Unknown renderer name; try the next candidate\n", " print(f\" ✗ '{renderer}' not available\")\n", " \n", " print()\n", " print(f\"Current renderer: {pio.renderers.default}\")\n", " \n", "except Exception as e:\n", " print(f\"Error setting renderer: {e}\")\n", "\n", "print()\n", "\n", "# ==========================================\n", "# 2. CREATE SIMPLE TEST SUNBURST\n", "# ==========================================\n", "print(\"Step 2: Creating simple test sunburst...\")\n", "print(\"-\"*80)\n", "\n", "# Simple test data (parent values equal the sum of their children)\n", "labels = ['Total', 'Domain A', 'Domain B', 'Var 1', 'Var 2', 'Var 3', 'Var 4']\n", "parents = ['', 'Total', 'Total', 'Domain A', 'Domain A', 'Domain B', 'Domain B']\n", "values = [100, 40, 60, 20, 20, 30, 30]\n", "\n", "fig_test = go.Figure(\n", " go.Sunburst(\n", " labels=labels,\n", " parents=parents,\n", " values=values,\n", " branchvalues='total', # match the main sunburst cell\n", " marker=dict(\n", " colors=['#cccccc', '#1f77b4', '#ff7f0e', '#2ca02c', '#d62728', '#9467bd', '#8c564b'],\n", " line=dict(color='white', width=2)\n", " ),\n", " textfont=dict(size=14, family='Arial', color='white')\n", " )\n", ")\n", "\n", "fig_test.update_layout(\n", " title='Test Sunburst - Can You See This?',\n", " height=500,\n", " width=500\n", ")\n", "\n", "print(\"✓ Test figure created\")\n", "print()\n", "\n", "# ==========================================\n", "# 3. 
TRY DISPLAYING\n", "# ==========================================\n", "print(\"=\"*80)\n", "print(\"DISPLAYING TEST FIGURE\")\n", "print(\"=\"*80 + \"\\n\")\n", "\n", "print(\"If you see a colorful circular chart below, Plotly is working! ✓\")\n", "print(\"If you see nothing, there's a renderer issue.\")\n", "print()\n", "\n", "# Try showing the figure\n", "try:\n", " fig_test.show()\n", " print(\"✓ fig.show() executed\")\n", "except Exception as e:\n", " print(f\"❌ fig.show() failed: {e}\")\n", "\n", "print()\n", "\n", "# ==========================================\n", "# 4. TROUBLESHOOTING INFO\n", "# ==========================================\n", "print(\"=\"*80)\n", "print(\"TROUBLESHOOTING\")\n", "print(\"=\"*80 + \"\\n\")\n", "\n", "# Check environment\n", "try:\n", " env = get_ipython().__class__.__name__\n", " print(f\"Environment: {env}\")\n", " \n", " if 'ZMQ' in env:\n", " print(\" → You're in Jupyter Notebook\")\n", " print(\" → Recommended renderer: 'notebook'\")\n", " elif 'Terminal' in env:\n", " print(\" → You're in IPython terminal\")\n", " print(\" → Recommended renderer: 'browser'\")\n", " else:\n", " print(f\" → You're in: {env}\")\n", " print(\" → Try: pio.renderers.default = 'notebook'\")\n", "except:\n", " print(\"Environment: Not IPython/Jupyter\")\n", " print(\" → Using Python script\")\n", " print(\" → Recommended renderer: 'browser'\")\n", "\n", "print()\n", "\n", "# Check if plotly extension is enabled\n", "print(\"To enable Plotly in JupyterLab, run these commands:\")\n", "print(\" pip install jupyterlab plotly\")\n", "print(\" jupyter labextension install jupyterlab-plotly\")\n", "print()\n", "\n", "print(\"To enable Plotly in Jupyter Notebook, run:\")\n", "print(\" pip install notebook plotly\")\n", "print(\" jupyter nbextension enable --py widgetsnbextension\")\n", "print()\n", "\n", "print(\"If using Google Colab:\")\n", "print(\" pio.renderers.default = 'colab'\")\n", "print()\n", "\n", "print(\"If nothing works, try:\")\n", 
"print(\" 1. Restart kernel\")\n", "print(\" 2. Run: import plotly.io as pio; pio.renderers.default = 'notebook'\")\n", "print(\" 3. Re-run this cell\")\n", "\n", "print(\"\\n\" + \"=\"*80)" ] }, { "cell_type": "code", "execution_count": 35, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "================================================================================\n", "🔍 INTERACTIVE SVO SUNBURST - SINGLE INTERVIEW SELECTOR\n", "================================================================================\n", "\n", "Checking requirements...\n", "------------------------------------------------------------\n", "✓ svo_extractions: 824 mentions\n", "✓ svos_by_domain: 8 domains\n", "✓ documents: 9 interviews\n", "\n", "Building sunburst structure...\n", "------------------------------------------------------------\n", "✓ Built 47 nodes\n", "\n", "================================================================================\n", "INTERACTIVE SELECTOR\n", "================================================================================\n", "\n" ] }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "f2c435e5399e4c8f9ffd7088c8e11530", "version_major": 2, "version_minor": 0 }, "text/plain": [ "Dropdown(description='Interview:', layout=Layout(width='400px'), options=('[All Interviews]', '1_1_Interdepend…" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "7e12e1b9a12d4e2a9302fb592655dd81", "version_major": 2, "version_minor": 0 }, "text/plain": [ "Output()" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "67445cae92824cf890f46c52785a7a33", "version_major": 2, "version_minor": 0 }, "text/plain": [ "Output()" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "Loading initial view...\n", "\n", 
"================================================================================\n", "✅ READY! Select an interview from the dropdown above\n", "================================================================================\n", "\n" ] } ], "source": [ "# CELL 26: Interactive SVO Sunburst - Simple Working Version for JupyterLab\n", "\n", "print(\"=\"*80)\n", "print(\"🔍 INTERACTIVE SVO SUNBURST - SINGLE INTERVIEW SELECTOR\")\n", "print(\"=\"*80 + \"\\n\")\n", "\n", "import plotly.graph_objects as go\n", "from collections import defaultdict\n", "import ipywidgets as widgets\n", "from IPython.display import display, clear_output\n", "import pandas as pd\n", "\n", "# ==========================================\n", "# 1. CHECK REQUIREMENTS\n", "# ==========================================\n", "print(\"Checking requirements...\")\n", "print(\"-\"*60)\n", "\n", "if 'svo_extractions' not in globals() or not svo_extractions:\n", " print(\"❌ svo_extractions not found! Run Cell 21 first\")\n", " has_data = False\n", "else:\n", " print(f\"✓ svo_extractions: {len(svo_extractions)} mentions\")\n", " has_data = True\n", "\n", "if 'svos_by_domain' not in globals() or not svos_by_domain:\n", " print(\"❌ svos_by_domain not found!\")\n", " has_data = False\n", "else:\n", " print(f\"✓ svos_by_domain: {len(svos_by_domain)} domains\")\n", "\n", "if 'documents' not in globals() or not documents:\n", " print(\"❌ documents not found!\")\n", " has_data = False\n", "else:\n", " print(f\"✓ documents: {len(documents)} interviews\")\n", "\n", "print()\n", "\n", "# ==========================================\n", "# 2. 
BUILD GLOBAL STRUCTURE\n", "# ==========================================\n", "if has_data:\n", " print(\"Building sunburst structure...\")\n", " print(\"-\"*60)\n", " \n", " doc_list = sorted(documents.keys())\n", " \n", " # Build hierarchy\n", " all_labels = []\n", " all_parents = []\n", " all_values = []\n", " all_ids = []\n", " \n", " total_mentions = len(svo_extractions)\n", " \n", " # Root\n", " all_labels.append('All Variables')\n", " all_parents.append('')\n", " all_values.append(total_mentions)\n", " all_ids.append('root')\n", " \n", " # Domain colors\n", " domain_colors_bright = {\n", " 'Hydrological Science': '#2E86AB',\n", " 'Climate Science': '#F77F00',\n", " 'Infrastructure Engineering': '#06A77D',\n", " 'Environmental Health': '#D62839',\n", " 'Social Systems': '#845EC2',\n", " 'Governance & Policy': '#936639',\n", " 'Economics & Resources': '#F9A26C',\n", " 'Technical Operations': '#6C757D'\n", " }\n", " \n", " domain_colors_dim = {\n", " 'Hydrological Science': '#D4E4ED',\n", " 'Climate Science': '#FFE4CC',\n", " 'Infrastructure Engineering': '#CCE9DF',\n", " 'Environmental Health': '#F5D4D8',\n", " 'Social Systems': '#E8D9F3',\n", " 'Governance & Policy': '#E4DDD5',\n", " 'Economics & Resources': '#FEE9DB',\n", " 'Technical Operations': '#E3E5E6'\n", " }\n", " \n", " # Add domains and SVOs\n", " for domain in sorted(svos_by_domain.keys()):\n", " domain_svos = svos_by_domain[domain]\n", " domain_total = sum(domain_svos.values())\n", " \n", " all_labels.append(domain)\n", " all_parents.append('All Variables')\n", " all_values.append(domain_total)\n", " all_ids.append(f'domain_{domain}')\n", " \n", " # Top 5 SVOs\n", " top_svos = sorted(domain_svos.items(), key=lambda x: -x[1])[:5]\n", " for svo_name, svo_count in top_svos:\n", " all_labels.append(svo_name)\n", " all_parents.append(domain)\n", " all_values.append(svo_count)\n", " all_ids.append(f'svo_{domain}_{svo_name}')\n", " \n", " print(f\"✓ Built {len(all_labels)} nodes\\n\")\n", " \n", " # 
==========================================\n", " # 3. HELPER FUNCTIONS\n", " # ==========================================\n", " \n", " def get_interview_svos(doc_name):\n", " \"\"\"Get SVOs mentioned in an interview\"\"\"\n", " interview_svos = set()\n", " interview_domains = set()\n", " \n", " for extraction in svo_extractions:\n", " if extraction['document'] == doc_name:\n", " interview_svos.add(extraction['svo'])\n", " interview_domains.add(extraction['domain'])\n", " \n", " return interview_svos, interview_domains\n", " \n", " def create_sunburst(doc_name=None):\n", " \"\"\"Create sunburst with optional highlighting\"\"\"\n", " \n", " colors = []\n", " marker_line_widths = []\n", " marker_line_colors = []\n", " \n", " if doc_name is None:\n", " # All interviews - all bright\n", " for node_id in all_ids:\n", " if node_id == 'root':\n", " colors.append('#cccccc')\n", " marker_line_widths.append(2)\n", " marker_line_colors.append('white')\n", " elif node_id.startswith('domain_'):\n", " domain_name = node_id.replace('domain_', '')\n", " colors.append(domain_colors_bright.get(domain_name, '#999999'))\n", " marker_line_widths.append(2)\n", " marker_line_colors.append('white')\n", " elif node_id.startswith('svo_'):\n", " parts = node_id.split('_', 2)\n", " domain_name = parts[1] if len(parts) >= 3 else ''\n", " colors.append(domain_colors_bright.get(domain_name, '#999999'))\n", " marker_line_widths.append(2)\n", " marker_line_colors.append('white')\n", " else:\n", " colors.append('#cccccc')\n", " marker_line_widths.append(2)\n", " marker_line_colors.append('white')\n", " \n", " title_text = 'All Interviews - Global Dataset'\n", " \n", " else:\n", " # Single interview - highlight mentioned\n", " interview_svos, interview_domains = get_interview_svos(doc_name)\n", " \n", " for node_id in all_ids:\n", " if node_id == 'root':\n", " colors.append('#cccccc')\n", " marker_line_widths.append(2)\n", " marker_line_colors.append('white')\n", " \n", " elif 
node_id.startswith('domain_'):\n", " domain_name = node_id.replace('domain_', '')\n", " if domain_name in interview_domains:\n", " colors.append(domain_colors_bright.get(domain_name, '#999999'))\n", " marker_line_widths.append(5)\n", " marker_line_colors.append('#FFD700') # Gold!\n", " else:\n", " colors.append(domain_colors_dim.get(domain_name, '#eeeeee'))\n", " marker_line_widths.append(1)\n", " marker_line_colors.append('#cccccc')\n", " \n", " elif node_id.startswith('svo_'):\n", " parts = node_id.split('_', 2)\n", " domain_name = parts[1] if len(parts) >= 3 else ''\n", " svo_name = parts[2] if len(parts) >= 3 else ''\n", " \n", " if svo_name in interview_svos:\n", " colors.append(domain_colors_bright.get(domain_name, '#999999'))\n", " marker_line_widths.append(4)\n", " marker_line_colors.append('#FFD700') # Gold!\n", " else:\n", " colors.append('#f5f5f5')\n", " marker_line_widths.append(0.5)\n", " marker_line_colors.append('#e0e0e0')\n", " else:\n", " colors.append('#eeeeee')\n", " marker_line_widths.append(1)\n", " marker_line_colors.append('#cccccc')\n", " \n", " title_text = f'{doc_name} - {len(interview_svos)} variables in {len(interview_domains)} domains'\n", " \n", " # Create figure\n", " fig = go.Figure(\n", " go.Sunburst(\n", " labels=all_labels,\n", " parents=all_parents,\n", " values=all_values,\n", " ids=all_ids,\n", " branchvalues='total',\n", " marker=dict(\n", " colors=colors,\n", " line=dict(\n", " color=marker_line_colors,\n", " width=marker_line_widths\n", " )\n", " ),\n", " textfont=dict(size=11, family='Arial', color='white'),\n", " insidetextorientation='radial',\n", " hovertemplate='%{label}
Mentions: %{value}'\n", " )\n", " )\n", " \n", " fig.update_layout(\n", " title=title_text,\n", " height=650,\n", " width=650,\n", " margin=dict(t=60, l=10, r=10, b=10)\n", " )\n", " \n", " return fig\n", " \n", " # ==========================================\n", " # 4. CREATE INTERFACE\n", " # ==========================================\n", " print(\"=\"*80)\n", " print(\"INTERACTIVE SELECTOR\")\n", " print(\"=\"*80 + \"\\n\")\n", " \n", " # Outputs\n", " figure_output = widgets.Output()\n", " stats_output = widgets.Output()\n", " \n", " # Dropdown\n", " dropdown = widgets.Dropdown(\n", " options=['[All Interviews]'] + doc_list,\n", " value='[All Interviews]',\n", " description='Interview:',\n", " style={'description_width': '80px'},\n", " layout=widgets.Layout(width='400px')\n", " )\n", " \n", " # Update function\n", " def update(change):\n", " selected = dropdown.value\n", " \n", " # Update figure\n", " with figure_output:\n", " clear_output(wait=True)\n", " if selected == '[All Interviews]':\n", " fig = create_sunburst(None)\n", " else:\n", " fig = create_sunburst(selected)\n", " display(fig) # KEY: Use display(), not show()!\n", " \n", " # Update stats\n", " with stats_output:\n", " clear_output(wait=True)\n", " if selected == '[All Interviews]':\n", " print(f\"Total interviews: {len(doc_list)}\")\n", " print(f\"Total mentions: {len(svo_extractions)}\")\n", " print(f\"Unique SVOs: {len(set(s['svo'] for s in svo_extractions))}\")\n", " else:\n", " interview_svos, interview_domains = get_interview_svos(selected)\n", " print(f\"Variables mentioned: {len(interview_svos)}\")\n", " print(f\"Domains covered: {len(interview_domains)}\")\n", " print()\n", " \n", " # Count per domain\n", " domain_counts = defaultdict(int)\n", " svo_counts = defaultdict(int)\n", " for extraction in svo_extractions:\n", " if extraction['document'] == selected:\n", " domain_counts[extraction['domain']] += 1\n", " svo_counts[extraction['svo']] += 1\n", " \n", " print(\"Domains:\")\n", " for 
domain in sorted(interview_domains):\n", " print(f\" • {domain}: {domain_counts[domain]} mentions\")\n", " \n", " print(\"\\nTop variables:\")\n", " for svo, count in sorted(svo_counts.items(), key=lambda x: -x[1])[:5]:\n", " print(f\" • {svo}: {count}\")\n", " \n", " print(\"\\n✨ Gold borders = mentioned\")\n", " \n", " # Connect dropdown\n", " dropdown.observe(update, names='value')\n", " \n", " # Display\n", " display(dropdown)\n", " display(stats_output)\n", " display(figure_output)\n", " \n", " # Initial display\n", " print(\"Loading initial view...\\n\")\n", " update(None)\n", " \n", " print(\"=\"*80)\n", " print(\"✅ READY! Select an interview from the dropdown above\")\n", " print(\"=\"*80)\n", "\n", "else:\n", " print(\"⚠️ Run Cell 21 (SVO Extraction) first\")\n", "\n", "print()" ] }, { "cell_type": "code", "execution_count": 36, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "================================================================================\n", "🎯 INTERACTIVE SVO SUNBURST - BUTTON SELECTOR\n", "================================================================================\n", "\n", "Step 1: Checking requirements...\n", "--------------------------------------------------------------------------------\n", "✓ svo_extractions found: 824 mentions\n", "✓ svos_by_domain found: 8 domains\n", "✓ documents found: 9 interviews\n", "\n", "Step 2: Building global sunburst structure...\n", "--------------------------------------------------------------------------------\n", "✓ Structure built: 47 nodes\n", "\n", "================================================================================\n", "🎛️ BUTTON-BASED INTERVIEW SELECTOR\n", "================================================================================\n", "\n", "Click any button to view that interview's focus:\n", "\n" ] }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "5906438f26784a11983fdbd946c3329e", "version_major": 2, 
"version_minor": 0 }, "text/plain": [ "VBox(children=(HBox(children=(Button(button_style='success', description='📊 Show All', layout=Layout(margin='2…" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "c123b8964e2e4fcd839fae61cef9f2f5", "version_major": 2, "version_minor": 0 }, "text/plain": [ "Output()" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "07d06cfa57a94bd087ab950640c11129", "version_major": 2, "version_minor": 0 }, "text/plain": [ "Output()" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "\n", "================================================================================\n", "✅ BUTTON SELECTOR READY\n", "================================================================================\n", "\n", "How to use:\n", " 1. Click '📊 Show All' for full global dataset\n", " 2. Click any interview name to highlight their variables\n", " 3. Gold borders = mentioned in selected interview\n", " 4. Dim colors = not discussed in that interview\n", "\n", "💡 Compare multiple interviews to see different stakeholder priorities!\n", "\n", "================================================================================\n" ] } ], "source": [ "# CELL 27: Interactive SVO Sunburst - Button Selector (FIXED)\n", "\n", "print(\"=\"*80)\n", "print(\"🎯 INTERACTIVE SVO SUNBURST - BUTTON SELECTOR\")\n", "print(\"=\"*80 + \"\\n\")\n", "\n", "import plotly.graph_objects as go\n", "from collections import defaultdict\n", "import ipywidgets as widgets\n", "from IPython.display import display, clear_output, HTML\n", "import pandas as pd\n", "\n", "# ==========================================\n", "# 1. 
CHECK REQUIREMENTS\n", "# ==========================================\n", "print(\"Step 1: Checking requirements...\")\n", "print(\"-\"*80)\n", "\n", "if 'svo_extractions' not in globals() or not svo_extractions:\n", " print(\"❌ svo_extractions not found!\")\n", " print(\" Run Cell 21 (SVO Extraction) first\")\n", " has_data = False\n", "else:\n", " print(f\"✓ svo_extractions found: {len(svo_extractions)} mentions\")\n", " has_data = True\n", "\n", "if 'svos_by_domain' not in globals() or not svos_by_domain:\n", " print(\"❌ svos_by_domain not found!\")\n", " has_data = False\n", "else:\n", " print(f\"✓ svos_by_domain found: {len(svos_by_domain)} domains\")\n", "\n", "if 'documents' not in globals() or not documents:\n", " print(\"❌ documents not found!\")\n", " has_data = False\n", "else:\n", " print(f\"✓ documents found: {len(documents)} interviews\")\n", "\n", "print()\n", "\n", "# ==========================================\n", "# 2. BUILD GLOBAL STRUCTURE\n", "# ==========================================\n", "if has_data:\n", " print(\"Step 2: Building global sunburst structure...\")\n", " print(\"-\"*80)\n", " \n", " doc_list = sorted(documents.keys())\n", " \n", " # Build hierarchy\n", " all_labels = []\n", " all_parents = []\n", " all_values = []\n", " all_ids = []\n", " \n", " total_mentions = len(svo_extractions)\n", " \n", " # Root\n", " all_labels.append('All Variables')\n", " all_parents.append('')\n", " all_values.append(total_mentions)\n", " all_ids.append('root')\n", " \n", " # Domain colors (bright and dim versions)\n", " domain_colors_bright = {\n", " 'Hydrological Science': '#2E86AB',\n", " 'Climate Science': '#F77F00',\n", " 'Infrastructure Engineering': '#06A77D',\n", " 'Environmental Health': '#D62839',\n", " 'Social Systems': '#845EC2',\n", " 'Governance & Policy': '#936639',\n", " 'Economics & Resources': '#F9A26C',\n", " 'Technical Operations': '#6C757D'\n", " }\n", " \n", " domain_colors_dim = {\n", " 'Hydrological Science': '#D4E4ED',\n", " 
'Climate Science': '#FFE4CC',\n", " 'Infrastructure Engineering': '#CCE9DF',\n", " 'Environmental Health': '#F5D4D8',\n", " 'Social Systems': '#E8D9F3',\n", " 'Governance & Policy': '#E4DDD5',\n", " 'Economics & Resources': '#FEE9DB',\n", " 'Technical Operations': '#E3E5E6'\n", " }\n", " \n", " # Add domains and top SVOs\n", " for domain in sorted(svos_by_domain.keys()):\n", " domain_svos = svos_by_domain[domain]\n", " domain_total = sum(domain_svos.values())\n", " \n", " all_labels.append(domain)\n", " all_parents.append('root') # parents must reference ids (not labels) when ids are set\n", " all_values.append(domain_total)\n", " all_ids.append(f'domain_{domain}')\n", " \n", " # Top 5 SVOs\n", " top_svos = sorted(domain_svos.items(), key=lambda x: -x[1])[:5]\n", " for svo_name, svo_count in top_svos:\n", " all_labels.append(svo_name)\n", " all_parents.append(f'domain_{domain}') # parent id of the enclosing domain\n", " all_values.append(svo_count)\n", " all_ids.append(f'svo_{domain}_{svo_name}')\n", " \n", " print(f\"✓ Structure built: {len(all_labels)} nodes\")\n", " print()\n", " \n", " # ==========================================\n", " # 3. 
HELPER FUNCTIONS\n", " # ==========================================\n", " \n", " def get_interview_coverage(doc_name):\n", " \"\"\"Get SVOs and domains for an interview\"\"\"\n", " interview_svos = set()\n", " interview_domains = set()\n", " svo_counts = defaultdict(int)\n", " domain_counts = defaultdict(int)\n", " \n", " for extraction in svo_extractions:\n", " if extraction['document'] == doc_name:\n", " interview_svos.add(extraction['svo'])\n", " interview_domains.add(extraction['domain'])\n", " svo_counts[extraction['svo']] += 1\n", " domain_counts[extraction['domain']] += 1\n", " \n", " return interview_svos, interview_domains, svo_counts, domain_counts\n", " \n", " def create_sunburst_figure(doc_name=None):\n", " \"\"\"Create sunburst with optional highlighting (NO opacity!)\"\"\"\n", " \n", " if doc_name is None:\n", " # Full dataset view (all bright colors)\n", " colors = []\n", " for node_id in all_ids:\n", " if node_id == 'root':\n", " colors.append('#cccccc')\n", " elif node_id.startswith('domain_'):\n", " domain_name = node_id.replace('domain_', '')\n", " colors.append(domain_colors_bright.get(domain_name, '#999999'))\n", " elif node_id.startswith('svo_'):\n", " parts = node_id.split('_', 2)\n", " domain_name = parts[1] if len(parts) >= 3 else ''\n", " colors.append(domain_colors_bright.get(domain_name, '#999999'))\n", " else:\n", " colors.append('#cccccc')\n", " \n", " marker = dict(\n", " colors=colors,\n", " line=dict(color='white', width=2)\n", " )\n", " \n", " title_text = 'Scientific Variables Across All Interviews<br>Global dataset - select an interview to see their focus'\n", " \n", " else:\n", " # Highlighted interview view (bright + dim colors)\n", " interview_svos, interview_domains, _, _ = get_interview_coverage(doc_name)\n", " \n", " colors = []\n", " marker_line_widths = []\n", " marker_line_colors = []\n", " \n", " for node_id in all_ids:\n", " if node_id == 'root':\n", " colors.append('#cccccc')\n", " marker_line_widths.append(2)\n", " marker_line_colors.append('white')\n", " \n", " elif node_id.startswith('domain_'):\n", " domain_name = node_id.replace('domain_', '')\n", " if domain_name in interview_domains:\n", " # Bright + gold border\n", " colors.append(domain_colors_bright.get(domain_name, '#999999'))\n", " marker_line_widths.append(5)\n", " marker_line_colors.append('#FFD700')\n", " else:\n", " # Dim + no border\n", " colors.append(domain_colors_dim.get(domain_name, '#eeeeee'))\n", " marker_line_widths.append(1)\n", " marker_line_colors.append('#cccccc')\n", " \n", " elif node_id.startswith('svo_'):\n", " parts = node_id.split('_', 2)\n", " domain_name = parts[1] if len(parts) >= 3 else ''\n", " svo_name = parts[2] if len(parts) >= 3 else ''\n", " \n", " if svo_name in interview_svos:\n", " # Bright + gold border\n", " colors.append(domain_colors_bright.get(domain_name, '#999999'))\n", " marker_line_widths.append(4)\n", " marker_line_colors.append('#FFD700')\n", " else:\n", " # Very dim\n", " colors.append('#f5f5f5')\n", " marker_line_widths.append(0.5)\n", " marker_line_colors.append('#e0e0e0')\n", " else:\n", " colors.append('#eeeeee')\n", " marker_line_widths.append(1)\n", " marker_line_colors.append('#cccccc')\n", " \n", " marker = dict(\n", " colors=colors,\n", " line=dict(\n", " color=marker_line_colors,\n", " width=marker_line_widths\n", " )\n", " )\n", " \n", " title_text = f'{doc_name}<br>{len(interview_svos)} variables in {len(interview_domains)} domains'\n", " \n", " # Create figure (NO opacity parameter!)\n", " fig = go.Figure(\n", " go.Sunburst(\n", " labels=all_labels,\n", " parents=all_parents,\n", " values=all_values,\n", " ids=all_ids,\n", " branchvalues='total',\n", " marker=marker,\n", " textfont=dict(size=11, family='Arial', color='white'),\n", " insidetextorientation='radial',\n", " hovertemplate='%{label}<br>Mentions: %{value}'\n", " )\n", " )\n", " \n", " fig.update_layout(\n", " title=dict(\n", " text=title_text,\n", " x=0.5,\n", " xanchor='center',\n", " font=dict(size=16, family='Arial')\n", " ),\n", " height=650,\n", " width=650,\n", " font=dict(family='Arial', size=10),\n", " margin=dict(t=100, l=10, r=10, b=10)\n", " )\n", " \n", " return fig\n", " \n", " # ==========================================\n", " # 4. CREATE BUTTON INTERFACE\n", " # ==========================================\n", " print(\"=\"*80)\n", " print(\"🎛️ BUTTON-BASED INTERVIEW SELECTOR\")\n", " print(\"=\"*80 + \"\\n\")\n", " \n", " # Output widgets\n", " figure_output = widgets.Output()\n", " stats_output = widgets.Output()\n", " \n", " # State variable\n", " current_selection = {'doc': None}\n", " \n", " # Button click handler\n", " def on_button_click(button):\n", " doc_name = button.description\n", " \n", " # Handle \"Show All\" button\n", " if doc_name == \"📊 Show All\":\n", " current_selection['doc'] = None\n", " \n", " with figure_output:\n", " clear_output(wait=True)\n", " fig = create_sunburst_figure(None)\n", " fig.show()\n", " \n", " with stats_output:\n", " clear_output(wait=True)\n", " display(HTML(\"

<h3>Global Dataset</h3>
\"))\n", " print(f\"Total interviews: {len(doc_list)}\")\n", " print(f\"Total SVO mentions: {len(svo_extractions)}\")\n", " print(f\"Unique SVOs: {len(set(s['svo'] for s in svo_extractions))}\")\n", " print(f\"Science domains: {len(svos_by_domain)}\")\n", " print()\n", " print(\"Domain distribution:\")\n", " for domain in sorted(svos_by_domain.keys()):\n", " count = sum(svos_by_domain[domain].values())\n", " print(f\" • {domain}: {count} mentions\")\n", " \n", " else:\n", " current_selection['doc'] = doc_name\n", " \n", " with figure_output:\n", " clear_output(wait=True)\n", " fig = create_sunburst_figure(doc_name)\n", " fig.show()\n", " \n", " with stats_output:\n", " clear_output(wait=True)\n", " display(HTML(f\"

<h3>Interview: {doc_name}</h3>
\"))\n", " \n", " interview_svos, interview_domains, svo_counts, domain_counts = get_interview_coverage(doc_name)\n", " \n", " print(f\"Variables mentioned: {len(interview_svos)}\")\n", " print(f\"Domains covered: {len(interview_domains)}\")\n", " print()\n", " print(\"Domains discussed:\")\n", " for domain in sorted(interview_domains):\n", " count = domain_counts[domain]\n", " print(f\" • {domain}: {count} mentions\")\n", " \n", " print()\n", " print(\"Top variables:\")\n", " top_svos = sorted(svo_counts.items(), key=lambda x: -x[1])[:5]\n", " for svo, count in top_svos:\n", " print(f\" • {svo}: {count} mentions\")\n", " \n", " print()\n", " print(\"✨ Gold borders = mentioned\")\n", " print(\"⚪ Dim colors = not mentioned\")\n", " \n", " # Create buttons\n", " print(\"Click any button to view that interview's focus:\")\n", " print()\n", " \n", " # \"Show All\" button\n", " btn_all = widgets.Button(\n", " description=\"📊 Show All\",\n", " button_style='success',\n", " layout=widgets.Layout(width='150px', margin='2px')\n", " )\n", " btn_all.on_click(on_button_click)\n", " \n", " # Interview buttons\n", " interview_buttons = []\n", " for doc_name in doc_list:\n", " btn = widgets.Button(\n", " description=doc_name,\n", " button_style='info',\n", " layout=widgets.Layout(width='200px', margin='2px')\n", " )\n", " btn.on_click(on_button_click)\n", " interview_buttons.append(btn)\n", " \n", " # Arrange buttons in rows\n", " buttons_per_row = 3\n", " button_rows = []\n", " button_rows.append(widgets.HBox([btn_all]))\n", " \n", " for i in range(0, len(interview_buttons), buttons_per_row):\n", " row = widgets.HBox(interview_buttons[i:i+buttons_per_row])\n", " button_rows.append(row)\n", " \n", " # Display interface\n", " display(widgets.VBox(button_rows))\n", " display(stats_output)\n", " display(figure_output)\n", " \n", " # Show initial view\n", " with figure_output:\n", " fig = create_sunburst_figure(None)\n", " fig.show()\n", " \n", " with stats_output:\n", " 
display(HTML(\"

<h3>Global Dataset</h3>
\"))\n", " print(f\"Total interviews: {len(doc_list)}\")\n", " print(f\"Total SVO mentions: {len(svo_extractions)}\")\n", " print(f\"Unique SVOs: {len(set(s['svo'] for s in svo_extractions))}\")\n", " print(f\"Science domains: {len(svos_by_domain)}\")\n", " \n", " print()\n", " print(\"=\"*80)\n", " print(\"✅ BUTTON SELECTOR READY\")\n", " print(\"=\"*80 + \"\\n\")\n", " \n", " print(\"How to use:\")\n", " print(\" 1. Click '📊 Show All' for full global dataset\")\n", " print(\" 2. Click any interview name to highlight their variables\")\n", " print(\" 3. Gold borders = mentioned in selected interview\")\n", " print(\" 4. Dim colors = not discussed in that interview\")\n", " print()\n", " \n", " print(\"💡 Compare multiple interviews to see different stakeholder priorities!\")\n", "\n", "else:\n", " print(\"⚠️ Cannot create selector - run Cell 21 first\")\n", "\n", "print(\"\\n\" + \"=\"*80)" ] }, { "cell_type": "code", "execution_count": 37, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Step 1: Loading anchor facts...\n", "--------------------------------------------------------------------------------\n", "✓ Loaded 5 anchor fact categories\n", " • System Constraints: 4 facts\n", " • Performance Objectives: 4 facts\n", " • Known Trade-offs: 3 facts\n", " • Physical Infrastructure: 4 facts\n", " • Environmental Context: 4 facts\n", "\n", "💡 HOW TO CUSTOMIZE:\n", " Replace 'anchor_facts' dictionary above with your domain expert statements\n", " Categories can be: Constraints, Objectives, Infrastructure, Context, etc.\n", "\n" ] } ], "source": [ "\n", "# ==========================================\n", "# 1. 
DEFINE ANCHOR FACTS STRUCTURE\n", "# ==========================================\n", "print(\"Step 1: Loading anchor facts...\")\n", "print(\"-\"*80)\n", "\n", "# ANCHOR FACTS: User-provided verified statements\n", "# Format: {category: [fact1, fact2, ...]}\n", "# \n", "# HOW TO POPULATE:\n", "# Replace this example with your actual anchor facts from domain experts\n", "anchor_facts = {\n", " 'System Constraints': [\n", " 'Water treatment capacity is 500,000 gallons per day',\n", " 'Operating budget limited to $2M annually',\n", " 'Staff size cannot exceed 15 operators',\n", " 'Remote location requires 2-day supply chain lead time'\n", " ],\n", " 'Performance Objectives': [\n", " 'Maintain 99.9% uptime for water delivery',\n", " 'Meet EPA drinking water standards for all contaminants',\n", " 'Provide potable water to 5,000 residents',\n", " 'Respond to emergencies within 2 hours'\n", " ],\n", " 'Known Trade-offs': [\n", " 'Higher treatment quality reduces processing capacity',\n", " 'Preventive maintenance decreases emergency response availability',\n", " 'Expanding service area increases operational costs'\n", " ],\n", " 'Physical Infrastructure': [\n", " 'Distribution network spans 45 miles of pipeline',\n", " 'Three storage tanks with combined 2M gallon capacity',\n", " 'Treatment plant built in 1987, upgraded 2015',\n", " 'Backup power generator rated for 72 hours operation'\n", " ],\n", " 'Environmental Context': [\n", " 'Permafrost active layer depth varies 0.5-2 meters seasonally',\n", " 'Winter temperatures reach -40°F regularly',\n", " 'Summer water demand peaks at 150% of winter baseline',\n", " 'Source water quality degrades during spring thaw'\n", " ]\n", "}\n", "\n", "print(f\"✓ Loaded {len(anchor_facts)} anchor fact categories\")\n", "for category, facts in anchor_facts.items():\n", " print(f\" • {category}: {len(facts)} facts\")\n", "print()\n", "\n", "print(\"💡 HOW TO CUSTOMIZE:\")\n", "print(\" Replace 'anchor_facts' dictionary above with your 
domain expert statements\")\n", "print(\" Categories can be: Constraints, Objectives, Infrastructure, Context, etc.\")\n", "print()\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## **Step 3: Choose Category Names**\n", "\n", "Map to decision component types:\n", "\n", "| Your Category | Maps To | Example Facts |\n", "|--------------|---------|---------------|\n", "| System Constraints | Constraints | \"Budget limited to $2M\", \"Staff cannot exceed 15\" |\n", "| Performance Objectives | Objectives | \"Maintain 99.9% uptime\", \"Meet EPA standards\" |\n", "| Known Trade-offs | Trade-Offs | \"Quality reduces capacity\", \"Maintenance decreases availability\" |\n", "| Physical Infrastructure | State Variables | \"45 miles of pipeline\", \"Built in 1987\" |\n", "| Environmental Context | State Variables | \"Permafrost varies 0.5-2m\", \"Winter temps reach -40°F\" |\n", "| Operational Variables | Decision Variables | \"Work hours adjustable 20-60/week\" |\n", "| Available Options | Options | \"Standard or enhanced treatment mode\" |\n", "| Implemented Solutions | Solutions | \"Remote monitoring system\", \"Cross-training program\" |\n", "\n", "### **Example Customization:**\n", "\n", "```python\n", "# YOUR ACTUAL DATA HERE\n", "anchor_facts = {\n", " 'Regulatory Requirements': [\n", " 'Must comply with Safe Drinking Water Act',\n", " 'Monthly testing for 15 contaminants required',\n", " 'Maximum turbidity 0.3 NTU'\n", " ],\n", " \n", " 'System Capacity': [\n", " 'Treatment plant rated 2.5 MGD',\n", " 'Storage capacity 5 million gallons',\n", " 'Distribution pressure 60-80 
PSI'\n", " ],\n", " \n", " 'Operational Constraints': [\n", " 'On-call staff available 24/7',\n", " 'Emergency response time under 1 hour',\n", " 'Backup power for 48 hours'\n", " ],\n", " \n", " 'Climate Factors': [\n", " 'Annual precipitation 12-18 inches',\n", " 'Freeze-thaw cycles damage infrastructure',\n", " 'Summer demand 40% higher than winter'\n", " ]\n", "}\n", "```\n", "\n", "### **Tips:**\n", "\n", "✅ **DO:**\n", "- Use complete sentences\n", "- Include specific numbers/units\n", "- State facts clearly\n", "- Group related facts by category\n", "\n", "❌ **DON'T:**\n", "- Mix facts and opinions\n", "- Use vague language (\"around\", \"approximately\" without numbers)\n", "- Include questions\n", "- Duplicate facts across categories\n", "\n", "---\n", "\n", "## 📐 MODEL VARIABLES CELL\n", "\n", "### **Step 1: Locate the user_model_variables List**\n", "\n", "Find this section in `Cell_Integrate_Model_Variables.py`:\n", "\n", "```python\n", "user_model_variables = [\n", " {\n", " 'variable_name': 'water_level',\n", " 'full_name': 'Water Storage Level',\n", " # ... other fields\n", " },\n", " # ... 
more variables\n", "]\n", "```\n", "\n", "### **Step 2: Create Your Variable Entries**\n", "\n", "**Format:**\n", "```python\n", "user_model_variables = [\n", " {\n", " 'variable_name': 'short_code', # Required\n", " 'full_name': 'Descriptive Full Name', # Required\n", " 'description': 'What this represents', # Required\n", " 'category': 'Component Type', # Required\n", " 'units': 'measurement units', # Required\n", " 'data_type': 'continuous/discrete/binary/categorical', # Required\n", " 'source': 'where data comes from', # Optional\n", " 'update_frequency': 'how often updated' # Optional\n", " }\n", "]\n", "```\n", "\n", "### **Step 3: Field Specifications**\n", "\n", "**Required Fields:**\n", "\n", "| Field | Description | Examples |\n", "|-------|-------------|----------|\n", "| `variable_name` | Short identifier (no spaces) | 'water_level', 'budget_total', 'temp_avg' |\n", "| `full_name` | Human-readable name | 'Water Storage Level', 'Annual Operating Budget' |\n", "| `description` | What it represents | 'Current water volume in storage tanks measured in gallons' |\n", "| `category` | Decision component type | 'Objective', 'Constraint', 'State Variable', etc. 
|\n", "| `units` | Measurement units | 'gallons', 'USD', 'degrees F', 'percent', 'boolean' |\n", "| `data_type` | Type of variable | 'continuous', 'discrete', 'binary', 'categorical' |\n", "\n", "**Optional Fields:**\n", "\n", "| Field | Description | Examples |\n", "|-------|-------------|----------|\n", "| `source` | Where data originates | 'SCADA sensors', 'Finance database', 'Weather station' |\n", "| `update_frequency` | How often updated | 'real-time', 'daily', 'weekly', 'monthly', 'static' |\n", "\n", "**Category Options:**\n", "- 'Objective'\n", "- 'Constraint'\n", "- 'Trade-Off'\n", "- 'Decision Variable'\n", "- 'Option'\n", "- 'Solution'\n", "- 'State Variable'\n", "\n", "**Data Type Options:**\n", "- `continuous`: Real numbers (e.g., temperature, flow rate)\n", "- `discrete`: Integer counts (e.g., number of staff, pipe breaks)\n", "- `binary`: Yes/No, True/False (e.g., compliance status)\n", "- `categorical`: Named categories (e.g., treatment mode: standard/enhanced/minimal)\n", "\n", "### **Example Customization:**\n", "\n", "```python\n", "# YOUR ACTUAL MODEL VARIABLES HERE\n", "user_model_variables = [\n", " # State Variables (things you observe)\n", " {\n", " 'variable_name': 'source_flow',\n", " 'full_name': 'Source Water Flow Rate',\n", " 'description': 'Volumetric flow rate from source aquifer',\n", " 'category': 'State Variable',\n", " 'units': 'GPM',\n", " 'data_type': 'continuous',\n", " 'source': 'Flow meter FM-101',\n", " 'update_frequency': 'real-time'\n", " },\n", " \n", " # Decision Variables (things you control)\n", " {\n", " 'variable_name': 'chlorine_dose',\n", " 'full_name': 'Chlorine Dosage Rate',\n", " 'description': 'Concentration of chlorine dosed into the treated water',\n", " 'category': 'Decision Variable',\n", " 'units': 'mg/L',\n", " 'data_type': 'continuous',\n", " 'source': 'Treatment control system',\n", " 'update_frequency': 'hourly'\n", " },\n", " \n", " # Objectives (things you want to achieve)\n", " {\n", " 
'variable_name': 'uptime_pct',\n", " 'full_name': 'System Uptime Percentage',\n", " 'description': 'Percentage of time water delivery is operational',\n", " 'category': 'Objective',\n", " 'units': 'percent',\n", " 'data_type': 'continuous',\n", " 'source': 'Service logs',\n", " 'update_frequency': 'daily'\n", " },\n", " \n", " # Constraints (limits you must respect)\n", " {\n", " 'variable_name': 'max_turbidity',\n", " 'full_name': 'Maximum Allowable Turbidity',\n", " 'description': 'Regulatory limit on turbidity in treated water',\n", " 'category': 'Constraint',\n", " 'units': 'NTU',\n", " 'data_type': 'continuous',\n", " 'source': 'EPA regulations',\n", " 'update_frequency': 'static'\n", " },\n", " \n", " # Options (discrete choices)\n", " {\n", " 'variable_name': 'pump_mode',\n", " 'full_name': 'Pump Operating Mode',\n", " 'description': 'Selected operating mode for main distribution pumps',\n", " 'category': 'Option',\n", " 'units': 'categorical',\n", " 'data_type': 'categorical',\n", " 'source': 'Pump control panel',\n", " 'update_frequency': 'as-needed'\n", " }\n", "]\n", "```\n", "\n", "### **Tips for Model Variables:**\n", "\n", "✅ **DO:**\n", "- Include ALL variables from your model (even if you think they're not mentioned)\n", "- Use consistent naming conventions\n", "- Provide clear descriptions\n", "- Specify exact units\n", "\n", "❌ **DON'T:**\n", "- Skip variables because they seem \"obvious\"\n", "- Use abbreviations in descriptions\n", "- Leave units blank (use 'dimensionless' if unitless)\n", "- Mix different variable types in one entry\n", "\n", "---\n", "\n", "## 🔄 Common Workflows\n", "\n", "### **Workflow 1: You Have Expert Facts Only**\n", "\n", "```python\n", "# 1. Populate anchor_facts\n", "anchor_facts = {\n", " 'Key Facts': [\n", " # Your expert-provided facts here\n", " ]\n", "}\n", "\n", "# 2. Run Cell_Integrate_Anchor_Facts.py\n", "# 3. Review what AI found vs what experts know\n", "# 4. 
Investigate gaps\n", "```\n", "\n", "### **Workflow 2: You Have Model Variables Only**\n", "\n", "```python\n", "# 1. Populate user_model_variables\n", "user_model_variables = [\n", " # Your model variables here\n", "]\n", "\n", "# 2. Run Cell_Integrate_Model_Variables.py\n", "# 3. See which variables are discussed in interviews\n", "# 4. Find gaps (modeled but not discussed = potential issue)\n", "```\n", "\n", "### **Workflow 3: You Have Both**\n", "\n", "```python\n", "# 1. Populate both anchor_facts AND user_model_variables\n", "# 2. Run both integration cells\n", "# 3. Compare results:\n", "# - Anchor facts = qualitative validation\n", "# - Model vars = quantitative alignment\n", "# 4. Use triangulation for highest confidence\n", "```\n", "\n", "---\n", "\n", "## 📊 Data Collection Tips\n", "\n", "### **Gathering Anchor Facts:**\n", "\n", "**Sources:**\n", "- Literature review (published values)\n", "- Expert interviews (\"What do we know for certain?\")\n", "- Regulatory documents (legal requirements)\n", "- Historical data (proven constraints)\n", "- Engineering specifications (design parameters)\n", "\n", "**Template for collecting:**\n", "```\n", "Category: [System Constraints]\n", "Fact: \"Treatment capacity is 500,000 GPD\"\n", "Source: Engineering drawings, 2015 upgrade\n", "Confidence: High (design specification)\n", "```\n", "\n", "### **Gathering Model Variables:**\n", "\n", "**Sources:**\n", "- Model code documentation\n", "- Data dictionaries\n", "- Database schemas\n", "- Measurement protocols\n", "- API documentation\n", "\n", "**Template for collecting:**\n", "```\n", "Variable: water_level\n", "Full Name: Water Storage Tank Level\n", "Description: Height of water in main storage tank\n", "Units: feet\n", "Range: 0-30 feet\n", "Source: Level sensor LS-205\n", "Used in: Storage module, demand forecasting\n", "```\n", "\n", "---\n", "\n", "## ⚠️ Common Pitfalls\n", "\n", "### **Pitfall 1: Too General**\n", "\n", "❌ Bad:\n", "```python\n", "'Facts': 
['Water is important']\n", "```\n", "\n", "✅ Good:\n", "```python\n", "'System Constraints': ['Minimum water pressure 40 PSI required for fire safety']\n", "```\n", "\n", "### **Pitfall 2: Inconsistent Categories**\n", "\n", "❌ Bad:\n", "```python\n", "anchor_facts = {\n", " 'Constraints': ['Budget is $2M'],\n", " 'Budget Constraints': ['Cannot exceed $2M'], # Duplicate!\n", "}\n", "```\n", "\n", "✅ Good:\n", "```python\n", "anchor_facts = {\n", " 'Financial Constraints': [\n", " 'Annual operating budget $2M',\n", " 'Capital budget $500K for upgrades'\n", " ]\n", "}\n", "```\n", "\n", "### **Pitfall 3: Missing Units**\n", "\n", "❌ Bad:\n", "```python\n", "{\n", " 'variable_name': 'flow_rate',\n", " 'units': 'flow' # Not a unit!\n", "}\n", "```\n", "\n", "✅ Good:\n", "```python\n", "{\n", " 'variable_name': 'flow_rate',\n", " 'units': 'GPM' # Specific!\n", "}\n", "```\n", "\n", "### **Pitfall 4: Vague Descriptions**\n", "\n", "❌ Bad:\n", "```python\n", "'description': 'The water level'\n", "```\n", "\n", "✅ Good:\n", "```python\n", "'description': 'Current water elevation in storage tank ST-1, measured from tank bottom, used for demand management and pump control'\n", "```\n", "\n", "---\n", "\n", "## ✅ Validation Checklist\n", "\n", "Before running the integration cells:\n", "\n", "### **Anchor Facts:**\n", "- [ ] All facts are complete sentences\n", "- [ ] Facts include specific numbers/units where relevant\n", "- [ ] Categories match decision component types\n", "- [ ] No duplicate facts across categories\n", "- [ ] Source of each fact is known (even if not in data)\n", "\n", "### **Model Variables:**\n", "- [ ] All required fields populated\n", "- [ ] Variable names are unique\n", "- [ ] Units are specific and standard\n", "- [ ] Categories are valid component types\n", "- [ ] Data types match actual variable characteristics\n", "- [ ] Descriptions are clear and complete\n", "\n", "---\n", "\n", "## 🎊 Quick Reference\n", "\n", "### **Minimal Anchor Fact Entry:**\n", 
"```python\n", "anchor_facts = {\n", " 'Category Name': [\n", " 'Complete fact statement with numbers and units'\n", " ]\n", "}\n", "```\n", "\n", "### **Minimal Model Variable Entry:**\n", "```python\n", "user_model_variables = [{\n", " 'variable_name': 'short_code',\n", " 'full_name': 'Full Descriptive Name',\n", " 'description': 'What it represents and how it is used',\n", " 'category': 'State Variable', # or other type\n", " 'units': 'GPM',\n", " 'data_type': 'continuous'\n", "}]\n", "```\n", "\n", "### **Next Steps:**\n", "1. Customize the data structures above\n", "2. Paste them into the respective cells\n", "3. Run the cells\n", "4. Review alignment results\n", "5. Iterate based on findings\n", "\n", "**Your data is now integrated with AI analysis!** 🎯✨\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.12.9" } }, "nbformat": 4, "nbformat_minor": 4 }