Data Driven Dude

📊What Agent-Mode ChatGPT Can and Can’t Do for Open-Data Exploration🩌


tl;dr

Author’s notes

For plain-vanilla charts like this, it quickly pulls data that “just works” (though you’ll often find gaps).

Background

Motivation: Why are wild deer and boar increasing so much in Japan? → Preprocessing is a pain → Can ChatGPT handle it?

Plenty of folks want to do exploratory data analysis and hypothesis-driven visualization (in Tableau) when a question pops into their head. But searching, obtaining, and preprocessing public data is a slog (speaking from experience).

So I ran an experiment to see how far AI can take over preprocessing—and specifically what Agent-mode ChatGPT (with browser access, file handling, and light Python) can and can’t do.

“Tableau Uma-Uma Kai” (Yum-Yum Meetup)

This started with an event called Tableau Uma-Uma Kai—a very active community where we “build better viz + eat great food.” The theme this time was game meat BBQ. It was amazing.

A full game-meat tasting course

What is “gibier”?

(This section is pasted straight from Perplexity search results.)

“Gibier” refers to meat from wild birds and mammals taken by hunting, and the dishes made from it. [1][2][5]

Etymology & definition

Characteristics

Common examples

Notes

Thus gibier is a culinary tradition drawing on nature’s bounty, and today it’s gaining attention for sustainability and regional revitalization. [6][3]

References: 1 2 3 4 5 6 7 8 9 10


And with that delicious prelude, the question naturally arose: why are wild deer and boar populations increasing? Hence the experiment outlined above.

Experiment overview

To build an MVP (Tableau) dashboard that visually demonstrates key drivers of Japan’s sika deer increase (lack of predators, fewer hunters, warm winters/less snow, protection policies, land-use changes), we quickly collected and cleaned open data. This post is a field log of that exploration, and a summary of what Agent-mode ChatGPT (browser automation, file ops, light Python) could and couldn’t do.


The data-exploration plan the AI came up with

To craft a persuasive storyline fast, we gathered these MVP essentials:

With these, we have the four core axes—population / culling / hunters / snow—to assemble a minimal yet compelling dashboard.

The AI confidently says “we collected it,” but in reality (likely due to token limits) it sometimes couldn’t fetch everything in one go. Please forgive the occasional swagger.
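
One cheap guard against an over-confident "we collected it" is to diff what you expected against what actually arrived. Here is a minimal Python sketch of that idea. The function name, the station keys, and the year range are all hypothetical and only illustrate the check; this is not something the agent produced.

```python
from itertools import product


def missing_pairs(expected_stations, expected_years, fetched):
    """Return the (station, year) pairs that were expected but not fetched."""
    fetched = set(fetched)
    return [
        (station, year)
        for station, year in product(expected_stations, expected_years)
        if (station, year) not in fetched
    ]


# Hypothetical example: three stations, three years, two files never arrived.
stations = ["nara", "niigata", "takayama"]
years = range(1989, 1992)
fetched = [
    (s, y)
    for s, y in product(stations, years)
    if (s, y) not in {("niigata", 1990), ("takayama", 1991)}
]

for station, year in missing_pairs(stations, years, fetched):
    print(f"missing: {station} {year}")
```

Running a check like this after every fetch turns "it sometimes couldn't fetch everything" into a concrete list of what to re-request.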


The exploration log

1) Environmental White Paper Excel → CSV

2) Fixed-point snow data (since 1989; not necessarily all prefectures)
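
To make the annual aggregation concrete (snow days / avg / max / total), here is a minimal Python sketch. It is not the code the agent ran. The function name and input shape are assumptions, a "snow day" is assumed to mean a day with snow depth above 0 cm, and years are grouped by calendar year rather than by winter season.

```python
from collections import defaultdict
from statistics import mean


def annual_snow_stats(daily_rows):
    """Aggregate daily snow depth into per-year stats.

    daily_rows: iterable of (iso_date, depth_cm), where depth_cm is None
    for a missing observation. Missing days are skipped, not counted as 0.
    """
    by_year = defaultdict(list)
    for iso_date, depth in daily_rows:
        if depth is None:
            continue
        by_year[int(iso_date[:4])].append(float(depth))

    stats = {}
    for year, depths in sorted(by_year.items()):
        stats[year] = {
            "snow_days": sum(1 for d in depths if d > 0),  # assumed definition
            "avg_depth_cm": mean(depths),
            "max_depth_cm": max(depths),
            "total_depth_cm": sum(depths),
            "observed_days": len(depths),  # lets you spot thin years
        }
    return stats


# Toy rows for illustration only (not real observations).
sample = [
    ("1989-01-01", 10),
    ("1989-01-02", 0),
    ("1989-01-03", None),
    ("1990-02-01", 5),
]
print(annual_snow_stats(sample))
```

Keeping `observed_days` next to the aggregates matters here because coverage is uneven: a low snow-day count can mean a warm winter or simply missing data.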


Where we stumbled (and how we worked around it)


Preprocessing recipe (what we actually did)


Minimal storyboard in Tableau

“Minimal” because, as noted above, the goal here is an MVP.

  1. Estimated population (Q0.50) trend: national time-series line

  2. Culling stack: stacked bars for hunting / permitted / designated projects

  3. Aging of hunters: area chart or population pyramid by age band

  4. Fixed-point snow comparison: annual snow-day counts—Niigata (snowy) vs Nara (warm)

    • Highlight periods where warm/low-snow years overlap with population increase
  5. (Optional) Annotation layer: vertical reference lines for policy shifts (e.g., end of doe-hunting bans)
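
For view 3, Tableau is easier to work with if each age band's share of the yearly total is already in the data. Here is a minimal Python sketch of that step, assuming long-format rows of (year, age band, count). The function name, the band labels, and the numbers are made up for illustration.

```python
from collections import defaultdict


def age_band_shares(rows):
    """Add each age band's share of its year's total.

    rows: list of (year, age_band, count) tuples in long format.
    """
    totals = defaultdict(float)
    for year, _band, count in rows:
        totals[year] += count

    return [
        {
            "year": year,
            "age_band": band,
            "count": count,
            "share": count / totals[year] if totals[year] else 0.0,
        }
        for year, band, count in rows
    ]


# Toy values for illustration only (not real license counts).
rows = [
    (2000, "20-29", 10),
    (2000, "60+", 30),
    (2010, "20-29", 5),
    (2010, "60+", 45),
]
for record in age_band_shares(rows):
    print(record)
```

The same pattern works for the culling stack in view 2: swap the age band for the cull category (hunting / permitted / designated projects).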


Agent-mode ChatGPT: What it did well

If your goal is to get to a usable state quickly, it’s an excellent fit.


Agent-mode ChatGPT: What it struggled with / caveats


Reproducible workflow (make it your team standard)

  1. Acquisition

    • White Paper Excel → /raw/env_whitepaper/*.xlsx|.xls
    • Station×year CSV → /raw/snow/{station}/{year}.csv.gz
  2. Light ETL

    • Convert (.xls→csv), normalize headers, era→Gregorian, numeric casting
    • Wide→long; ensure keys (year / prefecture / station) align
  3. Features

    • Snow days; avg/max/total snow depth
    • Hunter age-band shares; median age (optional)
  4. Validation views

    • Lines (population, culls), stacked bars (cull composition), pyramid (hunters), snow comparison
  5. Export

    • /out/csv/*.csv → import to Tableau; bind to a workbook template

Make it even better (practical tips)


Conclusion


Appendix 1: Hypotheses and the data sought/obtained

| Hypothesis | Aim | Data sought/obtained (source) | Status / notes |
| --- | --- | --- | --- |
| Fewer/older hunters weaken control capacity | Track long-term hunter counts & age structure | Hunting license holders (by age, national) (Environmental White Paper Excel) | Obtained & cleaned (long-format, annual totals) |
| Climate change (warm winters / less snow) raises overwinter survival | Check if less snow broadens “survivable conditions” | Daily snow depth since 1989 (stations): Nara (Katsuragi), Niigata (Niigata), Gifu (Takayama) → annual aggregates (snow days / avg / max / total) | Obtained & aggregated (Nara = rare snow; Niigata = big year-to-year swings) |
| Population increased since the 1990s | Visualize long-term rise | Estimated population (Honshu & south; quantiles 0.05–0.95) (White Paper Excel) | Obtained, CSV; Q0.50 (median) as main series |
| Changes in culling composition affect abundance | Understand composition & totals | Deer culls (hunting / permitted / designated projects / total) (White Paper Excel) | Obtained, CSV; stacked annual viz |
| (Supporting) Warm-winter tendency at prefecture scale | Align station data with prefecture scale | Prefecture-level: annual mean temp & snow days (2019) (published rankings) | Obtained (2019 aligned across indicators) |
| (Next) Policy / legal changes contribute to trends | Overlay events with dynamics | Timeline: end of doe bans, expansion of damage-control programs, etc. | In progress (MVP lists candidate events) |
| (Next) Habitat change (abandoned farmland, forest type) | Test resource/cover expansion | Abandoned farmland area trend (national/prefectural), forest statistics | To be added (currently a candidate list) |
| (Next) Prefecture-level population / culls | Strengthen regional explanatory power | Prefecture-level culls / slaughter counts; population estimates | Some access constraints → phase in later |

Note: Wolves as “natural predators” are handled as historical qualitative context; quantitative comparison isn’t feasible.


Appendix 2: Main tools/features used in Agent mode (this run)

Acquisition

Conversion / preprocessing

Validation / viz-prep

Constraints & workarounds