Data directories [indicator] and [ref_area]
There are two data directories, reflecting two ways of organizing the tables: by ‘indicator’ (and frequency) or by ‘ref_area’ (and frequency). The indicator is the title of a specific table, including the variable represented and any breakdowns applied to it (for instance, ‘labour force by sex and age’, ‘employment by sex and economic activity’ and ‘unemployment rate by sex, age and rural / urban areas’ are ILOSTAT indicators). The ref_area (reference area) is the geographic area for which data are available. Since ILOSTAT includes country-level data as well as regional and global estimates, the reference area can be a country, a region (geographic regions such as Africa, Americas or Arab States, income groups such as low income countries, or other groups such as the BRICS or the G20) or the world as a whole. Note, however, that global and regional estimates are available only for some indicators, so most datasets include only country-level data. The frequency indicates whether the data points are annual, quarterly or monthly.
The files in both data directories, whether organized by indicator or by ref_area, are in csv format and compressed with gzip (‘gz’). All ‘gz’ files can be uncompressed using WinZip or 7zip. For further information on the csv files, see the section ‘Format of CSV data files’. To download data, first choose one of the two approaches (tables by indicator or by ref_area) by clicking on the name of the directory, then click on the code name(s) of the table(s) you are looking for.
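For users who prefer a script to a desktop tool, the decompression step can be sketched with the Python standard library. This is a minimal sketch: the file name ‘ABW_A.csv.gz’ is an assumption about how a downloaded file is saved locally, and the archive is created on the fly so that the example runs without any download.

```python
import gzip
import shutil
import tempfile
from pathlib import Path


def decompress_gz(src: Path, dst: Path) -> Path:
    """Stream-decompress the gzip file src into dst and return dst."""
    with gzip.open(src, "rb") as fin, open(dst, "wb") as fout:
        shutil.copyfileobj(fin, fout)
    return dst


if __name__ == "__main__":
    # Build a tiny stand-in archive; a real file would come from the facility.
    with tempfile.TemporaryDirectory() as tmp:
        archive = Path(tmp) / "ABW_A.csv.gz"  # assumed local file name
        with gzip.open(archive, "wb") as f:
            f.write(b"ref_area,time,obs_value\n")
        # with_suffix("") drops only the trailing '.gz'
        print(decompress_gz(archive, archive.with_suffix("")).name)
```

Copying in a stream keeps memory use low for large datasets, since the file is never held in memory as a whole.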
The [dic] directory provides dictionaries of all the code lists needed to identify the indicator or reference area that you are looking for. For reference, please note that codes all follow the same structure. The indicator code includes, in this order:
code of the topic
code to identify the indicator within that topic
breakdowns, or ‘NOC’ for ‘no classification’ if there is no breakdown
unit of measure: ‘NB’ for absolute values or numbers, ‘RT’ for percentages or rates
frequency: ‘A’ for annual data, ‘Q’ for quarterly data, ‘M’ for monthly data
Similarly, the code names of the files by reference area refer to:
country (ISO Alpha-3 country code) or the region (codes starting with X) and
frequency
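The code structure described above can be illustrated with a short Python sketch that splits a file code into its parts. It assumes that the components are joined by underscores, that only the breakdown part may span several components, and that a ‘.csv’ extension may or may not be present; the function and class names are hypothetical and not part of ILOSTAT.

```python
from typing import NamedTuple


class IndicatorFile(NamedTuple):
    topic: str
    indicator: str
    breakdowns: tuple  # empty when the code says 'NOC'
    unit: str
    freq: str


def _stem(file_id: str) -> str:
    """Drop a trailing '.csv' if there is one."""
    return file_id[:-4] if file_id.endswith(".csv") else file_id


def parse_indicator_id(file_id: str) -> IndicatorFile:
    """Split an [indicator] file code such as 'EAP_TEAP_SEX_AGE_NB_A'."""
    parts = _stem(file_id).split("_")
    if len(parts) < 5:
        raise ValueError(f"unexpected indicator code: {file_id!r}")
    # Assumption: everything between the indicator and the unit is a breakdown.
    topic, indicator, *breakdowns, unit, freq = parts
    if breakdowns == ["NOC"]:
        breakdowns = []
    return IndicatorFile(topic, indicator, tuple(breakdowns), unit, freq)


def parse_ref_area_id(file_id: str) -> tuple:
    """Split a [ref_area] file code such as 'ABW_A' into (area, frequency)."""
    area, freq = _stem(file_id).rsplit("_", 1)
    return area, freq


print(parse_indicator_id("EAP_TEAP_SEX_AGE_NB_A.csv"))
print(parse_ref_area_id("ABW_M.csv"))
```

The unit is returned as found rather than checked against ‘NB’ and ‘RT’, so that codes carrying another unit are not rejected.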
The two tables presented next show the contents of the [indicator] and the [ref_area] directories, which contain approximately 500 and 700 datasets respectively.
Contents of [indicator]
Files | Contents
table_of_contents_en | Table of contents in English
table_of_contents_fr | Table of contents in French
table_of_contents_sp | Table of contents in Spanish
EAP_TEAP_SEX_AGE_NB_A.csv | Dataset containing all annual data available on the labour force by sex and age
EMP_DWAP_NOC_RT_A.csv | Dataset containing all annual data available for the employment-to-population ratio
Contents of [ref_area]
Files | Contents
table_of_contents_en | Table of contents in English
table_of_contents_fr | Table of contents in French
table_of_contents_sp | Table of contents in Spanish
ABW_A.csv | Dataset containing all annual data available for Aruba
ABW_M.csv | Dataset containing all monthly data available for Aruba
Format of CSV data files
Files in ‘csv’ format store tabular information (numbers or text) as plain text, in the form of comma-separated values. The columns (or fields) of the original table are separated by commas, and each row or line of the file corresponds to one data record consisting of one or more fields. These files can be opened directly in Excel. In ILOSTAT ‘csv’ files, the first row contains the headers of the fields or columns. The subsequent rows present the data records, each consisting of three parts: the key of the record (the ‘names’ of the dimensions used to identify it, including the reference area, the source of the data, the classifications used, etc.; these are all fields from ‘ref_area’ to ‘time’), the observation value (‘obs_value’), and any other metadata available (such as the geographical coverage of the source or the specific definitions used for some concepts; these are all fields from ‘obs_status’ to ‘note_source’). All of the labels corresponding to the code names used as field headers in the csv files available for download are presented in the code lists’ dictionary ([dic] files, see following section for further information). The only code name not explained in the [dic] files is ‘obs_value’, which corresponds to the observation value.
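The split of each record into key, observation value and metadata can be sketched in Python as follows. The field names are those given in this guide, but their exact order in a real file is an assumption, and the data row with its codes (‘SRC_X’, ‘SEX_X’, ‘AGE_X’) is invented for illustration.

```python
import csv
import io

KEY_FIELDS = ["ref_area", "source", "indicator", "sex", "classif1", "classif2", "time"]
META_FIELDS = ["obs_status", "note_classif", "note_indicator", "note_source"]

# Hypothetical sample: the header uses the field names from this guide,
# the data row and its codes are made up.
SAMPLE = (
    "ref_area,source,indicator,sex,classif1,classif2,time,"
    "obs_value,obs_status,note_classif,note_indicator,note_source\n"
    "ABW,SRC_X,EAP_TEAP_SEX_AGE_NB,SEX_X,AGE_X,,2010,12.5,,,,\n"
)


def read_records(text_stream):
    """Yield (key, value, metadata) for each data row of an ILOSTAT-style csv."""
    for row in csv.DictReader(text_stream):
        key = {name: row[name] for name in KEY_FIELDS}
        raw = row["obs_value"]
        # ILOSTAT files use a dot as the decimal symbol, so float() applies.
        value = float(raw) if raw else None
        meta = {name: row[name] for name in META_FIELDS if row[name]}
        yield key, value, meta


records = list(read_records(io.StringIO(SAMPLE)))
print(records[0])
```

For a downloaded file, the same function could be given a stream opened with `gzip.open(path, "rt", newline="")`; the text encoding to pass would need to be checked against the actual files.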
There is no dictionary (or no ‘dic’ file) for the time dimension. The syntax of the codes used for this dimension is the following:
Annual data: YYYY where YYYY is the year.
Quarterly data: YYYYQ where YYYY is the year and Q is the quarter (the number corresponding to the quarter from 1 to 4).
Monthly data: YYYYMM where YYYY is the year and MM is the month (the number corresponding to the month from 01 to 12).
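The three patterns can be told apart mechanically, as in the Python sketch below. It reads the syntax literally, as strings of 4, 5 and 6 digits, which is an assumption worth checking against a downloaded file; the names `Period` and `parse_time` are hypothetical.

```python
import re
from typing import NamedTuple, Optional


class Period(NamedTuple):
    freq: str            # 'A', 'Q' or 'M'
    year: int
    sub: Optional[int]   # quarter 1-4, month 1-12, or None for annual data


_PATTERNS = [
    ("A", re.compile(r"(\d{4})")),                   # YYYY
    ("Q", re.compile(r"(\d{4})([1-4])")),            # YYYYQ
    ("M", re.compile(r"(\d{4})(0[1-9]|1[0-2])")),    # YYYYMM
]


def parse_time(code: str) -> Period:
    """Parse a time code written as YYYY, YYYYQ or YYYYMM."""
    for freq, pattern in _PATTERNS:
        match = pattern.fullmatch(code)
        if match:
            sub = int(match.group(2)) if freq != "A" else None
            return Period(freq, int(match.group(1)), sub)
    raise ValueError(f"unrecognised time code: {code!r}")


print(parse_time("2015"), parse_time("20153"), parse_time("201512"))
```

Because the three patterns differ in length, a code cannot match more than one of them, and anything else raises an error instead of being guessed.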
The number format applied in ILOSTAT files uses a dot as the decimal symbol (‘.’).
Dictionary directory [dic]
Code lists are predefined sets of terms from which coded statistical concepts (statistical characteristics of data) take their values. All of the code lists presented in ILOSTAT are available in three languages (‘en’ for English, ‘fr’ for French and ‘sp’ for Spanish). All ILOSTAT code list files have the same structure, consisting of three columns: the variable name or code (‘var_name’), the variable label or description of the code (‘var_label’) and a number used to sort the information in the file (‘var_sort’). The following table provides an example of an ILOSTAT code list.
Extract of ‘indicator_en.csv’
Indicator | Indicator.label | Indicator.sort
GDP_211P_NOC_NB | Output per worker (GDP constant 2011 international $ in PPP) — ILO estimates and projections, Nov. 2016 (units) |
CPI_NCPI_COI_IN | National consumer price index (CPI) by COICOP (units) |
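A code list with this three-column structure can be turned into a lookup from code to label, as in the Python sketch below. The header and labels are copied from the extract above; the sort numbers are placeholders, since the extract does not show them, and the function name is hypothetical.

```python
import csv
import io

# Labels copied from the extract; the sort values 1 and 2 are placeholders.
SAMPLE = (
    "Indicator,Indicator.label,Indicator.sort\n"
    'GDP_211P_NOC_NB,"Output per worker (GDP constant 2011 international $ in PPP)'
    ' — ILO estimates and projections, Nov. 2016 (units)",1\n'
    "CPI_NCPI_COI_IN,National consumer price index (CPI) by COICOP (units),2\n"
)


def load_code_list(text_stream) -> dict:
    """Map each code (first column) to its label (second column)."""
    reader = csv.reader(text_stream)
    next(reader)  # skip the header row
    return {row[0]: row[1] for row in reader if row}


labels = load_code_list(io.StringIO(SAMPLE))
print(labels["CPI_NCPI_COI_IN"])
```

The columns are read by position rather than by header name, so the same function would work for any code list file that follows the three-column structure, whatever its headers are called.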
The various code lists available in English, French and Spanish in the [dic] directory correspond to the fields used in the downloaded csv files described in the previous section (except for the ‘obs_value’ field used for the observation value and not requiring a dictionary with labels). The following table enumerates the code lists included in the [dic] directory.
Code lists included in [dic]
Variable name | Brief description
ref_area | Reference area – this can refer to countries, geographic regions, groups of countries (by income level or others) or the world
source | The specific source of the data, including information on the country or region for which it is used and the main type of source (population census, labour force survey, administrative records, etc.) as well as the precise name of the source.
indicator | The indicator, including information on the represented variables, the classifications used (if any) and the unit.
sex | The breakdown by sex and the items of this breakdown.
classif1 | All classifications used as the first breakdown in the various indicators available (excluding the breakdown by sex, which is treated separately) and the corresponding classification categories or items.
classif2 | All classifications used as the second breakdown in the various indicators available (excluding the breakdown by sex, which is treated separately) and the corresponding classification categories or items.
obs_status | The value status or flags on the values, such as breaks in series or provisional values.
note_classif | Metadata and/or footnotes related to the classifications used and the specific classification categories.
note_indicator | Metadata and/or footnotes related to the indicator.
note_source | Metadata and/or footnotes related to the data source.
Note that these code lists present only the label corresponding to each code. For further methodological information, including definitions of the main statistical terms used in ILOSTAT, detailed indicator descriptions and statistical standards, refer to the concepts and definitions page.
The two data directories [indicator] and [ref_area] include a table of contents, available in csv format and in three languages (‘en’ for English, ‘fr’ for French and ‘sp’ for Spanish). These tables of contents list all of the data files available for download in the corresponding directory, and provide summary information on each data file.
The table of contents of the [indicator] directory lists all the indicators available, with the label of the indicator and the frequency of the data.
The table of contents of the [ref_area] directory lists all the reference areas available (countries, regions, groups of countries), with the label of the reference area and the frequency of the data.
Both tables indicate the size of each data file, the time period covered by the data in the file and the date when the data file was last updated. Since ILOSTAT’s datasets include projections of the main labour market indicators, the time period covered by some data files can go as far as 2050. The codes or identifiers used in the tables of contents for the indicators and reference areas in the first field or column (‘id’) are unique and allow for the unequivocal identification of the corresponding item. The two tables presented next show extracts of the tables of contents of the [indicator] and the [ref_area] directories.
Extract of ‘table_of_contents_en.csv’ in [indicator]
Variable name | Brief description
id | File name of the dataset
indicator | Indicator code
indicator.label | Indicator name, including information on the represented variables, the classifications used (if any) and the unit.
freq | Frequency code (A, Q, M)
freq.label | Frequency label
size | Size of the .csv.gz file
data.start | First time period available in the dataset
data.end | Last time period available in the dataset
last.update | Last update of the dataset (Europe/Paris time zone)
n.records | Number of records in the dataset
collection | Collection code
collection.label | Data collection or compilation from which the data was derived, from all the various data compilations carried out by the ILO and disseminated in ILOSTAT
subject | Subject code
subject.label | How the indicator is displayed on the ILOSTAT website
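The table of contents can be used to select files before downloading anything. The Python sketch below picks the annual datasets whose period covers a given year. The column names come from the table above, while the rows are invented, and it is an assumption that ‘data.start’ and ‘data.end’ hold plain YYYY values for annual files.

```python
import csv
import io

# Hypothetical rows: only the column names are taken from this guide.
SAMPLE = (
    "id,indicator,freq,data.start,data.end\n"
    "EAP_TEAP_SEX_AGE_NB_A,EAP_TEAP_SEX_AGE_NB,A,1990,2030\n"
    "EAP_TEAP_SEX_AGE_NB_Q,EAP_TEAP_SEX_AGE_NB,Q,20001,20164\n"
    "EMP_DWAP_NOC_RT_A,EMP_DWAP_NOC_RT,A,2005,2015\n"
)


def annual_files_covering(text_stream, year: int) -> list:
    """Return the ids of annual datasets whose period includes the given year."""
    ids = []
    for row in csv.DictReader(text_stream):
        if row["freq"] != "A":
            continue  # quarterly and monthly periods are not plain years
        if int(row["data.start"]) <= year <= int(row["data.end"]):
            ids.append(row["id"])
    return ids


print(annual_files_covering(io.StringIO(SAMPLE), 2020))
```

Rows with another frequency are skipped before any conversion, so their time codes are never compared with a year.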
Extract of ‘table_of_contents_en.csv’ in [ref_area]
Variable name | Brief description
id | File name of the dataset
ref_area | Reference area code
ref_area.label | Reference area name; this can refer to countries, geographic regions, groups of countries (by income level or others) or the world
freq | Frequency code (A, Q, M)
freq.label | Frequency label
size | Size of the file
data.start | First time period available in the dataset
data.end | Last time period available in the dataset
last.update | Last update of the dataset (Europe/Paris time zone)
n.records | Number of records in the dataset
group_geo | Geographical group code
group_geo.label | Geographical group name of the reference area
group_income | Income group code
group_income.label | Income group name of the reference area
Updates
All of the information stored in the facility is updated daily at 12:00 pm (Europe/Paris time zone). The updating procedure only involves datasets for which there is new data or that have undergone a modification or a structural change.
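Since only modified datasets are updated, a local copy can be kept current by downloading only the files whose ‘last.update’ value has changed. The Python sketch below shows one way to find them from a table of contents. The rows and the ‘stamp-…’ values are placeholders, because the exact format of ‘last.update’ is not described here, and the function name is hypothetical.

```python
import csv
import io

# Hypothetical table of contents; 'stamp-1' and 'stamp-2' stand in for
# whatever value the 'last.update' column actually holds.
SAMPLE = (
    "id,last.update\n"
    "ABW_A,stamp-2\n"
    "ABW_M,stamp-1\n"
    "ABW_Q,stamp-1\n"
)


def changed_files(toc_stream, seen: dict) -> list:
    """Return ids that are new or whose 'last.update' differs from seen."""
    stale = []
    for row in csv.DictReader(toc_stream):
        if seen.get(row["id"]) != row["last.update"]:
            stale.append(row["id"])
    return stale


# Values recorded after the previous download (ABW_Q was never downloaded).
previous = {"ABW_A": "stamp-1", "ABW_M": "stamp-1"}
print(changed_files(io.StringIO(SAMPLE), previous))
```

The values are compared only for equality, never ordered, so the sketch does not depend on how the update date is written.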