Apparently the *-NS0-*.json.tar.gz files really do contain only namespace 0 pages, even on wikis where $wgContentNamespaces includes namespaces other than 0, unlike the more flexible *-pages-articles-*.xml* dumps.
```
$ curl https://dumps.wikimedia.org/other/enterprise_html/runs/20220301/idwikisource-NS0-20220301-ENTERPRISE-HTML.json.tar.gz | tar xzOf - | jq -r ".namespace.identifier" | LANG=C sort -u
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 39.2M  100 39.2M    0     0  4333k      0  0:00:09  0:00:09 --:--:-- 4591k
0
```
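For comparison, the namespaces present in a pages-articles XML dump can be listed the same way, since each `<page>` records its namespace in a `<ns>` element. A minimal sketch against an inline sample (standing in for the real, bzip2-compressed dump stream; titles and namespace numbers here are illustrative):

```shell
# Pull the number out of each <ns>…</ns> element and deduplicate,
# mirroring the jq check above for the JSON dumps.
cat <<'EOF' | sed -n 's|.*<ns>\([0-9]*\)</ns>.*|\1|p' | LANG=C sort -u
<page><title>A</title><ns>0</ns></page>
<page><title>Talk:A</title><ns>1</ns></page>
<page><title>Index:B</title><ns>102</ns></page>
EOF
```

Run against a real dump, a result listing more than just `0` is what makes the NS0-only JSON dumps stand out.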
Not sure what logic was followed to decide the contents of the JSON dumps, but it's clear they don't mirror the XML dumps, so I can't tell whether that's a bug. Either way, there are two separate tasks here, plus a conditional third:
- make sure relevant content namespaces are reflected in the configuration of each wiki, ideally by adding them to $wgContentNamespaces (this is a community decision);
- respect this configuration and include the content of those namespaces somewhere in the dumps;
- if that place is not the existing file, adapt clients to discover where it is (if the extra namespaces end up in separate files, clients also need to retrieve the aforementioned configuration in order to know which additional files to look for).
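For that last point, a client can read the content-namespace list from the siteinfo API (`action=query&meta=siteinfo&siprop=namespaces`), where each namespace object carries a `content` flag. A sketch with jq against an illustrative payload (shaped like a `formatversion=2` response, not a real capture):

```shell
# List the namespace IDs flagged as content namespaces; a client would
# then know which per-namespace dump files to look for. The payload is
# a hypothetical stand-in for the live API response.
cat <<'EOF' | jq -r '.query.namespaces[] | select(.content) | .id'
{"query":{"namespaces":[
  {"id":0,"name":"","content":true},
  {"id":1,"name":"Talk","content":false},
  {"id":102,"name":"Index","content":true}
]}}
EOF
```

The same query against each wiki's api.php would give clients the list to drive file discovery, instead of hardcoding NS0.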