Data flow diagram

The diagrams below show the flow of data through the system. The main purpose is highlight the distinction between the data preparation pipeline (poppusher) and the data access components (popgetter-*).

--- title: Poppusher --- graph LR subgraph download [Download country data] census_download@{ shape: text, label: "(customised for each country's census format)" } aA(Scotland) aB(Northern Ireland) aC(Singapore) aD(USA) aE(Belgium) aF(Australia) end raw(raw data) aA & aB & aC & aD & aE & aF --> raw ==> ingest subgraph ingest bA(Convert to common file formats) bB(Derive common metadata info) bC(Derive common metrics) bA --> bB bB --> bA bC --> bB bB --> bC end direction TB subgraph processed [Cloud hosted structure data store] direction TB dir_struct_docs@{ shape: text, label: "(**_See docs_**)" } dA("`**countries** (plain-text)`") subgraph percountry [per-country files] dCa("`**metadata** (parquet)`") dCb("`**metrics** (parquet)`") dCc("`**geometry** - (flatgeobuff) - (GeoJSON) - (PMTiles)`") end dir_struct_docs ~~~ dA dA ~~~ percountry click dir_struct_docs href "https://poppusher.readthedocs.io/en/latest/output_structure/" _blank end ingest ==> processed
--- title: Popgetter --- graph LR subgraph processed [Cloud hosted structure data store] direction TB dir_struct_docs@{ shape: text, label: "(**_See docs_**)" } dA("`**countries** (plain-text)`") subgraph percountry [per-country files] dCa("`**metadata** (parquet)`") dCb("`**metrics** (parquet)`") dCc("`**geometry** - (flatgeobuff) - (GeoJSON) - (PMTiles)`") end dir_struct_docs ~~~ dA dA ~~~ percountry click dir_struct_docs href "https://poppusher.readthedocs.io/en/latest/output_structure/" _blank end direction TB subgraph clients core("`**popgetter-core** common part of all clients - complied to wasm. - understands the directory structure and downloads the data. `") direction TB fA("`**popgetter-cli** A commandline tool to query and download data`") fB("`**popgetter-py** Enables access from Python`") fC("`**popgetter-browser** A web interface for exploring the available data`") fD("`**popgetter-llm** An experimental natural language client using LLMs`") core --> fA core --> fB core --> fC core --> fD end processed ===> core