gumadeiras/fruitloops

fruitloops

Agent-friendly CLI for querying connectome analysis tables from hemibrain and FlyWire.

The repository keeps generated CSV products in a predictable layout:

data/
  manifest.csv
  hemibrain/
  flywire/
  comparison/

Quick Use

Install with Homebrew:

brew tap gumadeiras/tap
brew install fruitloops

The Homebrew install is lightweight by default. To add the bulk, live, and plot Python dependencies to fruitloops' Homebrew virtualenv when you need them, run:

fruitloops-install-extras

Install from PyPI:

python -m pip install fruitloops
python -m pip install 'fruitloops[bulk,live,plot]'

Run directly from the repository:

python -m fruitloops datasets
python -m fruitloops files --dataset flywire --contains summary
python -m fruitloops head --table flywire:analysis_outputs/full_summary
python -m fruitloops query --table comparison:matched_ln_class_similarity --contains LN_class=il3LN6 --format json
python -m fruitloops ln il3LN6 --dataset flywire --format json
python -m fruitloops partners il3LN6 --dataset flywire --kind orn --format csv
python -m fruitloops compare il3LN6 --format json

For editable installation:

python -m pip install -e '.[bulk,live,plot]'
fruitloops datasets

Table References

Tables can be referenced as:

  • dataset:relative/path/without_csv
  • dataset:collection/file_stem
  • file_id from data/manifest.csv

Examples:

fruitloops schema --table flywire:analysis_outputs/full_summary
fruitloops query --table hemibrain:analysis_outputs/full_summary --select bodyId,LN_type,input_preference
fruitloops query --table flywire:source_audit/ln_observations_by_hemisphere --where LN_type=il3LN6
fruitloops query --table flywire:source_audit/orn_partner_counts_by_hemisphere --where LN_type=il3LN6 --format csv
fruitloops path --table comparison:matched_ln_class_similarity
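The dataset:relative/path form can be split mechanically. The sketch below is an illustration, not fruitloops' own resolver: split_table_ref and candidate_csv_path are hypothetical helper names, and the mapping to data/<dataset>/<relative path>.csv is an assumption based on the layout shown at the top of this README. It does not handle the file_id form. Use fruitloops path --table to see where a table actually resolves.

```python
from pathlib import Path


def split_table_ref(ref: str) -> tuple[str, str]:
    """Split 'dataset:relative/path' into (dataset, relative_path)."""
    dataset, sep, relative = ref.partition(":")
    if not sep or not dataset or not relative:
        raise ValueError(f"expected 'dataset:relative/path', got {ref!r}")
    return dataset, relative


def candidate_csv_path(ref: str, data_root: str = "data") -> Path:
    """Guess the CSV location for a table reference.

    Assumption: tables live at data/<dataset>/<relative path>.csv.
    The README does not confirm this mapping.
    """
    dataset, relative = split_table_ref(ref)
    return Path(data_root) / dataset / f"{relative}.csv"
```

For example, candidate_csv_path("comparison:matched_ln_class_similarity") yields data/comparison/matched_ln_class_similarity.csv under this assumption.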

Common Agent Queries

Aggregate any table without pandas:

fruitloops aggregate \
  --table flywire:source_audit/orn_partner_counts_by_hemisphere \
  --where LN_type=il3LN6 \
  --by LN_type,analysis_hemisphere,input_relation \
  --sum n_synapses \
  --format csv
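For readers who want to see what a filter, group, and sum over a table amounts to, here is a minimal standard-library sketch. It is not the fruitloops implementation; aggregate_sum is a hypothetical helper, the column names are taken from the command above, and the rows are made up purely for illustration.

```python
import csv
import io
from collections import defaultdict


def aggregate_sum(rows, by, sum_column, where=None):
    """Keep rows matching `where`, group by the `by` columns, sum `sum_column`."""
    where = where or {}
    totals = defaultdict(int)
    for row in rows:
        # Skip rows that fail any equality filter.
        if any(row.get(col) != value for col, value in where.items()):
            continue
        key = tuple(row[col] for col in by)
        totals[key] += int(row[sum_column])
    return dict(totals)


# Made-up rows for illustration only; these are NOT values from the real table.
SAMPLE_CSV = """LN_type,analysis_hemisphere,input_relation,n_synapses
il3LN6,left,ipsi,10
il3LN6,left,ipsi,5
il3LN6,left,contra,2
otherLN,left,ipsi,7
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
totals = aggregate_sum(
    rows,
    by=["LN_type", "analysis_hemisphere", "input_relation"],
    sum_column="n_synapses",
    where={"LN_type": "il3LN6"},
)
```

With the sample rows, the two ipsi rows for il3LN6 collapse into one group with a total of 15, and the otherLN row is filtered out.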

Summarize ORN or PN partners for one LN:

fruitloops partners il3LN6 --dataset flywire --kind orn --format csv
fruitloops partners il3LN6 --dataset flywire --kind pn --format csv
fruitloops partners il3LN6 --dataset hemibrain --kind orn --format csv
fruitloops partners il3LN6 --dataset hemibrain --kind pn --format csv

Pull the reconciled hemibrain/FlyWire comparison:

fruitloops compare il3LN6 --format json

Useful LN workflow:

fruitloops ln il3LN6 --format csv
fruitloops query --table flywire:source_audit/ln_observations_by_hemisphere --where LN_type=il3LN6 --format csv
fruitloops aggregate \
  --table flywire:source_audit/orn_partner_counts_by_hemisphere \
  --where LN_type=il3LN6 \
  --by analysis_hemisphere,input_relation \
  --sum n_synapses \
  --format csv
fruitloops compare il3LN6 --format jsonl

Olfaction Offline Cache

Build derived AL/LH/MB tables after importing bulk connectivity:

python -m pip install -e '.[bulk]'
fruitloops olfaction build
fruitloops olfaction tables

For complete names, classes, and glomeruli, cache annotations once from the live APIs with the commands below, then rerun fruitloops olfaction build:

python -m pip install -e '.[bulk,live]'
fruitloops olfaction cache-annotations --dataset hemibrain
fruitloops olfaction cache-annotations --dataset flywire

The builder creates the following tables in the DuckDB store: olf_edges_by_neuropil, olf_edges_total (aggregated over AL/LH/MB), olf_neuropil_membership, olf_neurons, and olf_provenance. It uses imported annotation tables when available:

  • hemibrain_olfaction_neuron_annotations or hemibrain_traced_neurons
  • flywire_hierarchical_neuron_annotations
  • flywire_neuron_information_v2

Example olfaction queries:

fruitloops olfaction neurons --dataset flywire --region AL --class ORN --format csv
fruitloops olfaction pns --dataset hemibrain --glomerulus DM1 --format csv
fruitloops olfaction orn-inputs --dataset hemibrain --glomerulus DM1 --by-side --format csv
fruitloops olfaction edges --dataset flywire --region LH --min-synapses 5 --format csv

Generic Plotting

Plotting is reusable and table-agnostic. Install the plotting extra when needed:

python -m pip install -e '.[plot]'

Render from any fruitloops table reference:

fruitloops plot \
  --table comparison:matched_ln_class_similarity \
  --kind scatter \
  --x hemibrain_mean_contra_preference \
  --y flywire_mean_contra_preference \
  --label LN_class \
  --top-labels 8 \
  --output outputs/contra_preference_scatter \
  --formats png,svg

Or render from any CSV path:

fruitloops plot \
  --csv path/to/table.csv \
  --kind scatter \
  --x x_column \
  --y y_column \
  --output outputs/my_scatter

Other generic plot kinds:

fruitloops plot --table comparison:matched_ln_class_similarity --kind bar --x LN_class --y orn_input_distribution_correlation --output outputs/orn_corr_bar
fruitloops plot --table flywire:source_audit/orn_partner_counts_by_hemisphere --kind violin --x input_relation --value n_synapses --where LN_type=il3LN6 --output outputs/il3ln6_orn_violin
fruitloops plot --table flywire:source_audit/orn_partner_counts_by_hemisphere --kind heatmap --x glomerulus --y input_relation --value n_synapses --where LN_type=il3LN6 --output outputs/il3ln6_orn_heatmap
fruitloops plot --table comparison:matched_ln_class_similarity --kind bubble --x orn_input_distribution_correlation --y pn_output_distribution_correlation --size flywire_orn_input_total --color flywire_contra_fraction --label LN_class --output outputs/similarity_bubble

The wrapper script is equivalent:

python scripts/plot_csv.py --csv path/to/table.csv --kind hist --value score --output outputs/score_hist

Live Connectome Access

Live database access is optional. Credentials come from environment variables or from a local .env file. The .env file is ignored by git; create it by copying .env.example.

python -m pip install -e '.[live]'
cp .env.example .env

Use a different env file with --env-file path/to/file.env.
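As a rough picture of how a .env file and the process environment can be combined, here is a small standard-library sketch. It is not how fruitloops loads credentials: parse_env_text and resolve_credentials are hypothetical names, and letting environment variables override .env values is an assumed precedence that this README does not specify.

```python
import os


def parse_env_text(text: str) -> dict[str, str]:
    """Parse KEY=VALUE lines, ignoring blanks, comments, and a leading 'export '."""
    values = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue
        if line.startswith("export "):
            line = line[len("export "):]
        key, sep, value = line.partition("=")
        if not sep:
            continue  # not a KEY=VALUE line
        # Drop surrounding whitespace and simple surrounding quotes.
        values[key.strip()] = value.strip().strip("'\"")
    return values


def resolve_credentials(env_text: str, environ=None) -> dict[str, str]:
    """Merge .env values with the environment.

    Assumed precedence: a variable already set in the environment wins
    over the same key in the .env text.
    """
    environ = os.environ if environ is None else environ
    merged = parse_env_text(env_text)
    merged.update({k: v for k, v in environ.items() if k in merged})
    return merged
```

Passing an explicit environ mapping keeps the function easy to test without touching the real environment.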

Hemibrain uses neuprint-python:

export NEUPRINT_SERVER=neuprint.janelia.org
export NEUPRINT_DATASET=hemibrain:v1.2.1
export NEUPRINT_APPLICATION_CREDENTIALS=<neuprint-token>

fruitloops live hemibrain neurons --type-contains il3LN6 --limit 5 --format csv
fruitloops live hemibrain connections --upstream-body-id 5813018460 --limit 20 --format json
fruitloops live hemibrain cypher --query 'MATCH (n:Neuron) RETURN n.bodyId AS bodyId, n.type AS type LIMIT 5'

FlyWire uses caveclient:

export FLYWIRE_DATASTACK=flywire_fafb_public
export CAVE_AUTH_TOKEN=<cave-token>

fruitloops live flywire tables --format csv
fruitloops live flywire table --table synapses_nt_v1 --in pre_pt_root_id=720575940623636701 --limit 10 --format csv
fruitloops live flywire synapses --pre-root-id 720575940623636701 --limit 10 --format json

Script shortcuts are equivalent:

python scripts/live_hemibrain.py neurons --type-contains il3LN6 --limit 5
python scripts/live_flywire.py tables

Offline-First Live Cache

Use fruitloops offline fetch when you want local data first, with live APIs consulted only on a cache miss. Results are saved under cache/live/, which is ignored by git.

fruitloops offline fetch \
  --dataset flywire \
  --action synapses \
  --pre-root-id 720575940623636701 \
  --limit 10 \
  --format csv

Repeat the same command to read the cached CSV. Use --offline-only to fail instead of hitting the network, or --refresh to force a live re-fetch.

fruitloops offline list
fruitloops offline fetch --dataset flywire --action tables --offline-only
fruitloops offline fetch --dataset hemibrain --action neurons --type-contains il3LN6 --limit 5
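The cache-first behaviour described above (read locally, go live only on a miss, with switches to forbid or force the network) can be sketched in a few lines. This is an illustration under stated assumptions, not fruitloops' code: fetch_cached and cache_key are hypothetical names, the hashed-parameter file naming is invented, and fetch_live stands in for whatever performs the live query.

```python
import hashlib
import json
from pathlib import Path


def cache_key(params: dict) -> str:
    """Derive a stable file stem from the query parameters (assumed scheme)."""
    canonical = json.dumps(params, sort_keys=True)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:16]


def fetch_cached(params, fetch_live, cache_dir, offline_only=False, refresh=False):
    """Return cached CSV text if present; otherwise fetch live and store it."""
    path = Path(cache_dir) / f"{cache_key(params)}.csv"
    if path.exists() and not refresh:
        return path.read_text(encoding="utf-8")  # cache hit
    if offline_only:
        # Mirrors the idea of failing instead of hitting the network.
        raise FileNotFoundError(f"no cached result for {params!r}")
    text = fetch_live(params)  # cache miss, or a forced refresh
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(text, encoding="utf-8")
    return text
```

In this sketch, combining offline_only with refresh raises an error; that is an arbitrary choice made here, since the README does not say how the two flags interact.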

Bulk Offline Releases

Bulk releases should be the primary offline source when you need broad connectivity, with live/cache queries only filling gaps.

List known public release files:

fruitloops bulk sources

Download the practical FlyWire connection table first:

fruitloops bulk download --dataset flywire --kind proofread-connections

Optional larger downloads:

fruitloops bulk download --dataset hemibrain --kind compact-adjacencies
fruitloops bulk download --dataset flywire --kind synapses
fruitloops bulk download --dataset hemibrain --kind neo4j-inputs

Import CSV/Parquet/Feather into local DuckDB:

python -m pip install -e '.[bulk]'
fruitloops bulk import \
  --path bulk/raw/flywire/proofread_connections_783.feather \
  --table flywire_proofread_connections \
  --replace
fruitloops bulk tables
fruitloops bulk query --table flywire_proofread_connections --limit 10 --format csv

Optimize imported connection tables before repeated partner queries:

fruitloops bulk optimize --table flywire_proofread_connections --prefix flywire
fruitloops bulk optimize --table hemibrain_traced_roi_connections --prefix hemibrain

Agent-facing wrappers infer common pre/post/weight/ROI column names:

fruitloops bulk schema --table flywire_proofread_connections
fruitloops bulk connections --table flywire_proofread_connections --pre-id ROOT --limit 20 --format csv
fruitloops bulk inputs --table flywire_proofread_connections --body-id ROOT --format csv
fruitloops bulk outputs --table flywire_proofread_connections --body-id ROOT --format csv
fruitloops bulk partners --table flywire_proofread_connections --body-id ROOT --format json
fruitloops bulk views --table flywire_proofread_connections --prefix flywire
fruitloops bulk optimize --table flywire_proofread_connections --prefix flywire
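One plausible way to infer pre/post/weight/ROI columns is to try a short list of known names for each role. The sketch below shows that idea only. infer_columns is a hypothetical helper, and the candidate lists are guesses based on column names commonly seen in public FlyWire and hemibrain exports, not the list fruitloops actually uses; fruitloops bulk schema shows the real columns of an imported table.

```python
# Candidate names per role, in priority order. These lists are assumptions.
CANDIDATES = {
    "pre": ("pre_pt_root_id", "pre_root_id", "bodyId_pre"),
    "post": ("post_pt_root_id", "post_root_id", "bodyId_post"),
    "weight": ("syn_count", "weight", "n_synapses"),
    "roi": ("neuropil", "roi"),
}


def infer_columns(columns):
    """Map each role to the first matching column name, or None if absent."""
    # Case-insensitive lookup that returns the column's original spelling.
    lookup = {name.lower(): name for name in columns}
    inferred = {}
    for role, names in CANDIDATES.items():
        inferred[role] = next(
            (lookup[n.lower()] for n in names if n.lower() in lookup), None
        )
    return inferred
```

A role that maps to None signals that the table needs explicit column names rather than inference.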

Hemibrain's compact adjacency and Neo4j bundles are CSV archives; extract first, then import the CSVs you need:

fruitloops bulk extract --path bulk/raw/hemibrain/exported-traced-adjacencies-v1.2.tar.gz
fruitloops bulk import \
  --path bulk/extracted/exported-traced-adjacencies-v1.2/traced-roi-connections.csv \
  --table hemibrain_traced_roi_connections \
  --replace
fruitloops bulk import \
  --path bulk/extracted/exported-traced-adjacencies-v1.2/traced-total-connections.csv \
  --table hemibrain_traced_total_connections \
  --replace
fruitloops bulk import \
  --path bulk/extracted/exported-traced-adjacencies-v1.2/traced-neurons.csv \
  --table hemibrain_traced_neurons \
  --replace
fruitloops bulk extract --path bulk/raw/hemibrain/hemibrain_v1.2_neo4j_inputs.zip
fruitloops bulk import --path bulk/extracted/hemibrain_v1.2_neo4j_inputs/<file>.csv --table hemibrain_<name>

End-to-end offline setup:

python -m pip install -e '.[bulk]'
fruitloops bulk download --dataset flywire --kind proofread-connections
fruitloops bulk import --path bulk/raw/flywire/proofread_connections_783.feather --table flywire_proofread_connections --replace
fruitloops bulk optimize --table flywire_proofread_connections --prefix flywire
fruitloops bulk download --dataset hemibrain --kind compact-adjacencies
fruitloops bulk extract --path bulk/raw/hemibrain/exported-traced-adjacencies-v1.2.tar.gz
fruitloops bulk import --path bulk/extracted/exported-traced-adjacencies-v1.2/traced-roi-connections.csv --table hemibrain_traced_roi_connections --replace
fruitloops bulk optimize --table hemibrain_traced_roi_connections --prefix hemibrain
fruitloops bulk tables

flywire_synapses_783.feather is much larger than the proofread connection table. Fruitloops streams Feather imports through Arrow record batches, but the resulting DuckDB database still needs enough local disk for the imported table and indexes.

Output Formats

Most commands support --format table, --format csv, --format json, or --format jsonl. CSV and JSONL are intended for downstream agent pipelines.
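A downstream pipeline can turn the machine-readable formats into lists of records with the standard library alone. The sketch below is a hedged example: parse_output is a hypothetical helper, and treating JSON output as either a list of records or a single object is an assumption about the output shape that this README does not state.

```python
import csv
import io
import json


def parse_output(text: str, fmt: str) -> list[dict]:
    """Turn CLI output text into a list of row dictionaries."""
    if fmt == "csv":
        # CSV values stay as strings; convert types downstream as needed.
        return list(csv.DictReader(io.StringIO(text)))
    if fmt == "json":
        data = json.loads(text)
        # Assumed shape: a list of records, or one record.
        return data if isinstance(data, list) else [data]
    if fmt == "jsonl":
        # One JSON object per non-empty line.
        return [json.loads(line) for line in text.splitlines() if line.strip()]
    raise ValueError(f"unsupported format: {fmt!r}")
```

The text can come from capturing a command's standard output, for example with subprocess.run([...], capture_output=True, text=True, check=True).stdout from the Python standard library.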

Rebuilding the Data Snapshot

From the paper repository root:

python scripts/build_data_snapshot.py \
  --source "/path/to/widespread-direction-selectivity" \
  --dest data

The script copies generated CSVs and rewrites data/manifest.csv.

Test

python -m unittest discover -s tests
