ESS Standard for Quality Reports Structure (ESQRS)
National Statistical Institute
|Contact organisation unit|
Statistics on Living Conditions Department,
Demographic and Social Statistics Directorate
Desislava Dimitrova, PhD
|Contact person function|
head of department
|Contact mail address|
2 P.Volov street, 1038 Sofia
|Contact email address|
|Contact phone number|
+359 2 9857 183
|Contact fax number|
The Survey on Income and Living Conditions (SILC) is a tool for providing timely and comparable data on income distribution and on the level and structure of poverty and social exclusion. The survey follows a common European methodology and provides information on the current state (cross-sectional data) and on changes over time (longitudinal data) in income levels and in the structure of poverty and social exclusion.
EU-SILC provides four basic files containing target variables based on common concepts and definitions.
Annual data for the countries contain the following components:
• Household register (D-file);
• Personal register (R-file);
• Household data (H-file);
• Personal data of people aged 16 and over (P-file).
Each year, additional data on households and household members are collected on specific topics in the so-called ad-hoc modules.
The indicators on poverty and social inclusion are calculated from the survey "Statistics on income and living conditions", using a common methodology approved by Eurostat for data collection, derivation of target variables and calculation of common indicators. The poverty rate is the share of persons living in households whose equivalised disposable income is below the poverty line, which is defined as 60% of the median equivalised disposable income.
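The threshold rule can be illustrated with a minimal Python sketch. It assumes that an equivalised disposable income and a cross-sectional weight are already available for each person. The function names and the simple weighted-median rule (first value at which the cumulative weight reaches half of the total) are illustrative assumptions, not the Eurostat algorithm.

```python
def weighted_median(values, weights):
    """Smallest value at which the cumulative weight reaches half the total.

    This simple rule is an assumption made for illustration only.
    """
    pairs = sorted(zip(values, weights))
    if not pairs:
        raise ValueError("no observations")
    half = sum(weights) / 2.0
    cumulative = 0.0
    for value, weight in pairs:
        cumulative += weight
        if cumulative >= half:
            return value
    return pairs[-1][0]


def at_risk_of_poverty_rate(incomes, weights, share=0.60):
    """Return (poverty line, weighted share of persons below the line)."""
    line = share * weighted_median(incomes, weights)
    below = sum(w for y, w in zip(incomes, weights) if y < line)
    return line, below / sum(weights)


# Hypothetical equivalised incomes and weights for five persons.
line, rate = at_risk_of_poverty_rate(
    [1000.0, 4000.0, 5000.0, 6000.0, 10000.0], [1.0, 1.0, 1.0, 1.0, 1.0]
)
```

With these hypothetical figures only the first person falls below the line, because the line is 60% of the median income of 5000.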
Data refer to all private households and individuals living in private households in the national territory at the time of data collection.
The EU-SILC survey is a key instrument for providing information required by the European Semester and the European Pillar of Social Rights, in particular on income distribution, poverty and social exclusion, as well as for various related EU policies on living conditions and poverty, such as child poverty, access to health care and other services, housing, over-indebtedness and quality of life. It is also the main source of data for microsimulation purposes and flash estimates of income distribution and poverty rates.
The following social fields are included in the survey methodology:
|Statistical concepts and definitions|
Total household income:
Two main concepts for total household income are applied:
Total household gross income (HY010) is computed as the sum for all household members of gross personal income components:
Total disposable household income (HY020) is computed as total household gross income (HY010) minus:
A household is two or more persons living in one dwelling or part of a dwelling, sharing a common budget and eating together.
A household is also a single person living in a dwelling, a room or part of a dwelling, who has a separate budget for meals and for other needs.
For the calculation of indicators of poverty and social inclusion, the total disposable household income is equivalised. Because households differ in size and composition, an equivalence scale is applied. The modified OECD scale is used, which gives a weight of 1.0 to the first person aged 14 or more, a weight of 0.5 to other persons aged 14 or more and a weight of 0.3 to persons aged 0-13. The weights of all household members are summed to obtain the equivalent household size. The total disposable net income of each household is divided by its equivalent size to give the total disposable net income per equivalent unit.
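The equivalisation step can be sketched in Python as follows. The function names and the example household are hypothetical; only the scale weights (1.0, 0.5, 0.3) and the age limit of 14 come from the description above.

```python
def equivalised_size(ages):
    """Modified OECD scale: 1.0 for the first person aged 14+,
    0.5 for each further person aged 14+, 0.3 for each person aged 0-13."""
    older = sum(1 for age in ages if age >= 14)
    younger = len(ages) - older
    if older == 0:
        # The scale as described needs at least one person aged 14 or more.
        raise ValueError("household has no member aged 14 or more")
    return 1.0 + 0.5 * (older - 1) + 0.3 * younger


def equivalised_income(disposable_income, ages):
    """Total disposable household income per equivalent unit."""
    return disposable_income / equivalised_size(ages)


# Hypothetical household: two adults and two children aged 10 and 5.
size = equivalised_size([40, 38, 10, 5])        # 1.0 + 0.5 + 0.3 + 0.3
income = equivalised_income(21000.0, [40, 38, 10, 5])
```

Every member of the household is then assigned the same equivalised income.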
Units of observation are households and household members.
The EU-SILC target population consists of all private households and their current members residing in the country. Persons living in collective households and in institutions are generally excluded from the target population.
Entire territory of Republic of Bulgaria
2006 - 2021
The sample for BG-SILC 2021 is selected from the sampling frame based on the Population Census 2011. The database includes all private households and their current members residing in the country. Persons living in collective households and in institutions are excluded from the target population. Students' and workers' hostels are excluded at the first stage of selection of primary sampling units (PSUs), because student and worker households rarely stay at the same address and are difficult to trace.
The frame is regularly updated according to the administrative changes made.
Household data within the selected PSUs are updated according to the Information System “Demography” data (ISD).
The longitudinal component consists of the sub-samples R2, R3, R4, R5 and R6.
All personal/household income variables were collected by interview. Where the information is available, the data from the administrative source is directly used.
The National Revenue Agency provides data from the register of insured persons. This register is used for the PY010, PY030, PY050 and HY090 variables.
The National Social Security Institute provides data on income from pensions and other social security payments. This register is used for the PY090, PY100, PY110, PY120, PY130, HY050 and HY110 variables.
The Social Assistance Agency provides data on income from social benefits. This register is used for the HY050, HY060 and HY070 variables.
Two-stage sampling on a territorial principle is implemented as follows:
- at the first stage, the census enumeration units (PSUs) are selected;
- at the second stage, the households are identified.
Sampling rate and sampling size
Concerning the SILC instrument, three different sample size definitions can be applied:
- the actual sample size which is the number of sampling units selected in the sample
- the achieved sample size which is the number of observed sampling units (household or individual) with an accepted interview
- the effective sample size which is defined as the achieved sample size divided by the design effect with regard to the at-risk-of-poverty rate indicator
Given that the effective sample size has been already treated in the section dealing with sampling errors, in this section the attention focuses mainly on the achieved sample size.
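The relation between the achieved and the effective sample size can be written as a short Python sketch. The figures used are hypothetical and are not BG-SILC results.

```python
def effective_sample_size(achieved, design_effect):
    """Achieved sample size divided by the design effect of the indicator."""
    if design_effect <= 0:
        raise ValueError("design effect must be positive")
    return achieved / design_effect


# Hypothetical example: 6000 accepted interviews and a design effect of 1.5.
n_eff = effective_sample_size(6000, 1.5)
```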
The total gross sample size (number of households) has been calculated analyzing the non-response rates and design effects of the previous EU-SILC surveys.
The total sample size in 2021 is 9123 households:
Number of households for which an interview is accepted for the database.
Rotational group breakdown and total
Number of persons aged 16 or older who are members of the households for which the interview is accepted for the database, and who completed a personal interview (RB250 = 11, 14).
Rotational group breakdown and total
The sample size for the longitudinal component was 29214 households and 53865 persons aged 16 and over.
Number of households in longitudinal component for which an interview is accepted for the database
Number of persons 16 years and older who are members of the households for which the interview is accepted for the database, and who completed a personal interview
|Frequency of data collection|
SILC2021 data are collected with CAPI questionnaires through personal interview with households included in the sample as well as all household members aged 16 and more.
Fieldwork began in mid-April 2021 instead of in March as planned, owing to restrictions imposed in connection with the COVID-19 pandemic.
The unfavorable situation with COVID-19 significantly complicates the normal conduct of the survey. People are skeptical and critical of the need to conduct a large-scale survey during an epidemic. The survey is conducted with electronic devices, but in some cases the interview can be conducted by phone, with the data recorded directly in the program on the device.
The mean interview duration
The mean interview duration per household is calculated as the sum of the duration of all household interviews plus the sum of the duration of all personal interviews, divided by the number of household questionnaires completed. Only households accepted for the database have to be considered.
The average household interview duration was about 24 minutes, while the average individual interview duration was about 23 minutes.
Average interview duration = 69.7 minutes
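The definition of the mean interview duration per household can be sketched in Python as follows. The durations are hypothetical values in minutes and the function name is illustrative.

```python
def mean_interview_duration(household_minutes, personal_minutes):
    """Sum of all household and personal interview durations divided by
    the number of completed household questionnaires (accepted households)."""
    if not household_minutes:
        raise ValueError("no accepted household questionnaires")
    total = sum(household_minutes) + sum(personal_minutes)
    return total / len(household_minutes)


# Hypothetical example: two households with four personal interviews in total.
mean_minutes = mean_interview_duration([20, 30], [25, 15, 20, 10])
```

Because personal interviews are summed over all members aged 16 and over, the mean per household is larger than the sum of one household interview and one personal interview.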
During data entry, logical controls are carried out on extreme values, on the completeness of answers to all questions, on data comparability and on the links between individual questionnaires and registers. After the primary data are processed and the target variables are derived, the data are verified and validated with the SAS program provided by Eurostat. Additional compatibility checks are performed before the information is published.
The database of each country contains different types of weights:
Weighting factors were calculated as required to take into account the units’ probability of selection, non-response and to adjust the sample to external data relating to the distribution of households and persons in the target population, such as sex and age, residence or administrative-territorial districts (NUTS 3).
For the first year of the panel each household from the new rotation group got a sampling weight inversely proportional to the probability of selection of the household. These were the household’s design weights DB080.
To adjust for non-responding households the procedure “weighting classes” was used. The households were divided into classes where the probability to respond was assumed to be homogenous within the classes. Due to lack of information (demographic characteristics) for the non-responding households these classes were the sampling strata. The ratio of the weights of the responding households to the weights of all households in the given class was calculated.
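The weighting-class adjustment can be sketched in Python as follows. The sketch assumes that the design weight of each responding household is divided by the weighted response rate of its stratum, so that respondents also represent the non-respondents of the same class. The field names are hypothetical.

```python
from collections import defaultdict


def weighting_class_adjustment(households):
    """households: list of dicts with keys 'id', 'stratum',
    'design_weight' and 'responded' (hypothetical field names).

    Returns adjusted weights for the responding households only.
    """
    total = defaultdict(float)
    responding = defaultdict(float)
    for hh in households:
        total[hh["stratum"]] += hh["design_weight"]
        if hh["responded"]:
            responding[hh["stratum"]] += hh["design_weight"]

    adjusted = {}
    for hh in households:
        if hh["responded"]:
            # Weighted response rate of the class (here: sampling stratum).
            rate = responding[hh["stratum"]] / total[hh["stratum"]]
            adjusted[hh["id"]] = hh["design_weight"] / rate
    return adjusted


sample = [
    {"id": 1, "stratum": "A", "design_weight": 100.0, "responded": True},
    {"id": 2, "stratum": "A", "design_weight": 100.0, "responded": True},
    {"id": 3, "stratum": "A", "design_weight": 100.0, "responded": True},
    {"id": 4, "stratum": "A", "design_weight": 100.0, "responded": False},
]
adjusted = weighting_class_adjustment(sample)
```

In this hypothetical stratum three of four households respond, so the adjusted weights of the respondents again sum to the stratum total of 400.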
After reflecting the non-responding households the base weights for the new rotation group were calibrated to the population as of 31.12.2020. For the calibration the following variables at individual and at household level were used:
The information on individuals as of 31.12.2020 was available from the ISD. The information on the households was an estimate based on the updated Census 2011 file and on data on the split-off households from the SILC survey. Persons born in 2021 were not included in the calibration as they were not part of the population as of the end of 2020. The SAS macro Calmar 2 was used for the calibration of weights. The logit method (M=3 in Calmar) was used, with upper and lower limits set on the g-weights. The g-weights are the ratio of the final calibrated weights to the initially assigned weights. The upper limit in 2020 was 2.9 and the lower limit was 0.3.
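The role of the g-weights and their limits can be illustrated with a minimal Python sketch. It does not reproduce the logit method of Calmar: as a simplifying assumption it calibrates to a single margin by post-stratification and then checks the resulting g-weights against the limits 0.3 and 2.9. All names and figures are hypothetical.

```python
def poststratify(weights, groups, population_totals):
    """Scale weights so that each group sums to its known population total."""
    sums = {}
    for w, g in zip(weights, groups):
        sums[g] = sums.get(g, 0.0) + w
    return [w * population_totals[g] / sums[g] for w, g in zip(weights, groups)]


def g_weights(initial, calibrated):
    """Ratio of the calibrated weight to the initial weight for each unit."""
    return [c / i for i, c in zip(initial, calibrated)]


def within_limits(g, lower=0.3, upper=2.9):
    """True if every g-weight lies inside the chosen limits."""
    return all(lower <= value <= upper for value in g)


initial = [100.0, 100.0, 200.0]
groups = ["male", "male", "female"]
calibrated = poststratify(initial, groups, {"male": 300.0, "female": 180.0})
g = g_weights(initial, calibrated)
ok = within_limits(g)
```

A bounded method such as the logit method enforces the limits during calibration, whereas this sketch only verifies them afterwards.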
The calibrated weights with reflected non-responding households were the base weights (RB060) for the new rotation group and will be used in the weighting procedure in the following years. These weights were also the longitudinal weights (DB095) of the households from the new rotation group.
Weighting procedure for rotation groups (14, 15, 16, 17 and 18) from previous survey waves.
To get the base weights for the current year, the base weights (RB060) for each rotation group from the previous year were adjusted taking into account the non-response. The adjustment procedure was made on an individual and not on household level.
To adjust for non-response, all persons from the 2020 register (DB135 = 1 & RB110 in (1,2,3,4)) who were followed up in 2021 were first marked as responding (current members of the household). Persons who had left the household between the two survey waves (2020 and 2021) were marked as non-responding. A logistic regression was used to calculate the probability of each individual being enumerated again in 2021. The weights of the enumerated persons were adjusted by the probability of being followed up (the result of the logistic regression), which gave the base weights (RB060) for 2021.
The model was applied for each rotation group separately. The independent variables used in the model were: poverty indicators, education, economic activity, age, sex, household size, household type, income, dwelling type. The dependent variable was the one showing if the individual was enumerated or not.
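The adjustment by the estimated follow-up probability can be sketched in Python as follows. The sketch assumes that the model has already been fitted and that the adjustment divides the previous base weight by the predicted probability. The coefficients, features and function names are hypothetical.

```python
import math


def follow_up_probability(features, coefficients, intercept):
    """Predicted probability from an already fitted logistic model."""
    z = intercept + sum(c * x for c, x in zip(coefficients, features))
    return 1.0 / (1.0 + math.exp(-z))


def adjusted_base_weight(previous_weight, probability):
    """Assumed adjustment: previous base weight divided by the probability."""
    if not 0.0 < probability <= 1.0:
        raise ValueError("probability must be in (0, 1]")
    return previous_weight / probability


# Hypothetical person with two features (for example age in decades and sex).
p = follow_up_probability([4.5, 1.0], [0.2, 0.1], intercept=1.0)
new_weight = adjusted_base_weight(350.0, p)
```

Persons with a low estimated probability of being followed up receive a larger weight, which compensates for similar persons lost between the waves.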
New members of the household after first year who were not part of the sample got base weights for the current year as follows:
Each person in the household should receive an equal weight within the household (RB050 cross-sectional weight). For this reason, each household member, whether with a zero or a non-zero base weight, received the average base weight within the household.
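The averaging of base weights within a household can be sketched in Python as follows. The household identifiers and weights are hypothetical.

```python
from collections import defaultdict


def average_within_household(persons):
    """persons: list of (household_id, base_weight) pairs.

    Returns one weight per person, equal to the mean base weight of the
    household, with zero weights of new members included in the mean.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for household_id, weight in persons:
        totals[household_id] += weight
        counts[household_id] += 1
    return [totals[h] / counts[h] for h, _ in persons]


# Household A has two sample persons and one new member with zero weight.
shared = average_within_household(
    [("A", 300.0), ("A", 300.0), ("A", 0.0), ("B", 120.0)]
)
```

The sum of weights within each household is preserved, while every member ends up with the same weight.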
After the non-response adjustment procedures, each of the 5 rotation groups was calibrated separately to the population as of 31.12.2020 according to the method described above.
The same variables and levels as for the new rotation group were used for calibration.
Combining all (6) sub-samples
After applying all procedures for non-response adjustment and calibration, all sub-samples (rotation groups) were combined. Each sub-sample separately represented the whole population of the country. To combine the sub-samples, all weights were multiplied by an appropriate scaling factor. The scaling factor used was 1/6 as there were 6 rotation groups in the panel.
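The scaling step can be sketched in Python as follows. The group labels and weights are hypothetical, and dividing by the number of groups is used as the equivalent of multiplying by 1/6.

```python
def combine_rotation_groups(weights_by_group):
    """Scale the weights of every rotation group by 1 / (number of groups),
    so that the combined sample represents the population once."""
    n_groups = len(weights_by_group)
    return {
        group: [w / n_groups for w in weights]
        for group, weights in weights_by_group.items()
    }


# Six hypothetical rotation groups, each summing to a population of 1800.
groups = {g: [600.0, 1200.0] for g in ["g1", "g2", "g3", "g4", "g5", "g6"]}
combined = combine_rotation_groups(groups)
combined_total = sum(sum(ws) for ws in combined.values())
```

Each group alone sums to the population total before scaling, and after scaling the six groups together sum to the same total.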
Final cross-sectional weights
Calibration of all rotation groups to current population.
After successfully applying all the procedures the weights were calibrated to the population as of 31.12.2020. The following variables on individual and household level were used for calibration:
(0-15) (16-19) (20-24) (25-29) (30-34) (35-39) (40-44) (45-49) (50-54) (55-59) (60-64) (65-69) (70-74) (75+)
In 2016 the number of pensioners was used as a calibration variable for the first time.
This variable had 3 levels:
1 - old-age pensions
2 - social pensions
3 - all others (rest of population)
To allocate each person to the correct sub-population, NSSI data on the number of personal pensions as of 31.12 were used. There were two reasons for using this variable in the calibration: first, to obtain a better estimate of the number of pensioners and, second, to reduce the standard error of the AROPE indicator.
After calibration, the final cross-sectional weight DB090 of the household was obtained. The individual cross-section weight RB050 was equal to the corresponding household weight DB090 (RB050=DB090).
Children born in 2021 were not included in the calibration. They received the corresponding household weight after calibration.
The personal cross-section weight for all individuals aged 16 and more (PB040) was calculated after the age group (0-15) was removed. Only the individuals who have responded (or were imputed) to the individual questionnaire (RB250 in (11,14)) were used. After one more calibration, the weight PB040 (personal cross-sectional weight for all household members aged 16 and more) was obtained.
The Survey on Income and Living Conditions (SILC) is an annual survey implemented in the framework of Regulation (EU) 2019/1700, which defines Scope, Definitions, Time coverage, Characteristics of the data, Sample size, Publication and Access to data.
The National Statistical Institute is certified according to ISO 9001. In practical terms for the EU-SILC survey, this means:
Data are accompanied with quality reports analysing the accuracy, coherence and comparability of the data.
The quality of the BG-SILC survey can be assumed to be high. Its concepts and methodology have been developed according to European and international standards and using best practices from all EU Member States. BG-SILC indicators are considered sufficiently accurate for all practical purposes to which they are put. The indicators are disseminated following a predetermined release calendar.
Further work is ongoing to improve the quality and in particular the comparability of the indicators. Key priorities are greater harmonisation of methods for quality adjustment and sampling.
Internal and external ISO 9001 audits are carried out yearly for the whole department.
The main users of BG-SILC are:
SILC covers only people living in private households (all persons aged 16 and over within the household are eligible for the operation), i.e. persons living in collective households and in institutions are generally excluded from the target population.
|Data completeness - rate|
|Accuracy and reliability|
As with any other statistical survey, SILC may be affected by sampling errors, by errors arising from the inability to interview some of the units in the sample, and by errors occurring at the stage of data recording, data processing, etc.
In terms of precision requirements, the representativeness of the sample and the effective sample size is to be achieved. The effective sample size combines sample size and sampling design effect which depends on sampling design, population structure and non-response rate.
Regulation (EU) 2019/1700 defines the minimum effective sample sizes to be achieved to compensate for all kinds of non-response. The effective sample size is allocated according to the size of the country and so as to ensure minimum precision criteria for the key indicator at national level (absolute precision of the at-risk-of-poverty rate of 1%).
Computations of standard errors were carried out using SAS programs for the SILC Quality Reports and Complex Sample analysis in IBM SPSS ver.27.
|Sampling errors - indicators|
Main indicators, standard error and CI at country level
Main indicators, standard error and CI at NUTS 2 level
Estimation for main indicators by ethnic groups in 2021
Sampling errors for the income components by mean, total number of observations (before and after imputation) and standard errors
Non-sampling errors are basically of 4 types:
Coverage errors include over-coverage, under-coverage and misclassification:
|Over-coverage - rate|
|Common units - proportion|
Not requested by Reg. 2019/2180
As with any other statistical survey, EU-SILC may be burdened with non-sampling errors which occur at various stages of the survey and which cannot be eliminated completely. This mainly applies to interviewers’ errors at the stage of collecting the information, errors due to the respondents’ misunderstanding of questions and inaccurate or sometimes even false answers as well as the errors taking place at the stage of data recording.
EU-SILC is a voluntary, representative survey of individual households, carried out by face-to-face interview using the CAPI method. Two types of questionnaire were applied: an individual and a household questionnaire. When the questionnaires were finalised, observations made on the questionnaires of previous years were taken into account. The data collected in the survey were compared with the data obtained from the registers. Some of the persons who, according to the register, receive the minimum income defined themselves as unemployed or non-active in the survey because they assess their current activity as temporary, and they did not indicate their income. Income from interest and from dividends in unincorporated businesses is in general not reported by households.
|Non response error|
|Unit non-response - rate|
|Item non-response - rate|
The computation of item non-response is essential to fulfil the precision requirements. Item non-response rate is provided for the main income variables both at household and personal level.
Item non-response refers to the situation where a sample unit has been successfully enumerated, but not all the required information has been obtained.
EU-SILC data were collected with two kinds of questionnaires, a household questionnaire and an individual questionnaire. Households and individuals are interviewed using electronic devices (CAPI).
The data entry program was developed in Visual Basic .NET (MS Visual Studio 2017). The program currently runs on Windows 10 tablet PCs.
We used the following components when installing the program:
A large number of edit checks (hard and soft) between questions in both questionnaires were implemented to ensure data correctness and consistency. For example, two external files (at household and personal level) were used to verify the correctness of identifiers and to check against previously collected information, such as household composition and the day, month and year of birth, sex, etc. of individuals who are not observed for the first time. All gross income values were checked to be greater than or equal to the net values (hard error), and net values were checked to be greater than or equal to the gross values divided by two (soft error). To check the consistency of data on child allowances, an additional check was implemented: the program checks whether the number and age of children in the household correspond to the child allowances received in the household (hard error). Another check compares the salary of an individual with his/her profession and the minimum insurance income (soft error). According to national legislation, the minimum insurance income is set at a certain level for each profession type. For checking purposes, lower and upper boundaries, narrower than the absolute ones, were set for most of the questions on income (e.g. social benefits, pensions) based on national legislation. Internal files (implemented in the database) holding valid ISCO-08 and NACE codes and descriptions were included.
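The gross/net consistency check can be illustrated with a minimal Python sketch. The original program was written in Visual Basic .NET, so the sketch below is only an illustration of the rule with hypothetical function and label names.

```python
def check_gross_net(gross, net):
    """Classify a gross/net income pair.

    'hard' : gross is lower than net (must be corrected)
    'soft' : net is lower than half of gross (signal, may be suppressed)
    'ok'   : both conditions are satisfied
    """
    if gross < net:
        return "hard"
    if net < gross / 2.0:
        return "soft"
    return "ok"


# Hypothetical values entered by an interviewer.
results = [
    check_gross_net(1000.0, 1200.0),  # gross below net
    check_gross_net(1000.0, 400.0),   # net below half of gross
    check_gross_net(1000.0, 800.0),   # consistent pair
]
```

A hard error blocks the record until it is corrected, while a soft error only raises a signal that the operator may confirm.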
During the data entry phase, data entry operators were able to generate progress reports using SQL queries. The report contained form IDs, form status, the number of errors and the number of suppressed signals. A report on the number of individuals and households interviewed or not interviewed, grouped by interviewer, was also added.
Data processing phase
After the data-entry phase, further data checking and editing was performed by the SILC unit, using SPSS scripts.
Initially, the data were checked to confirm that all questionnaires had been entered and completed. Special attention was paid to split-off households. Next, all suppressed signals and remarks made by data entry operators were checked and the relevant corrections were made. After that, the data were converted to SPSS data sets. Extreme income values were compared with data provided by the National Social Security Institute or other administrative data sources and with data from previous waves, where possible, and corrected if necessary. All SILC target variables were computed after checking the original variable(s). Finally, the four transmission files were converted to .csv format and verified with Eurostat's SAS checking programs.
The main errors detected in the post-data-collection process were related to double registration of child allowances and of personal income from agriculture, property or land. Both were recorded in the household and the individual questionnaires. In addition, there were values that exceeded the maximum possible amounts of unemployment, old-age, survivor's, sickness and disability benefits.
|Imputation - rate|
Data processing is performed with statistical software SPSS.
Total gross income and disposable household income were calculated according to Document 065 (2021 operation). All personal/household income variables were collected by interview. Where the information is available, the data from the administrative source is directly used. The National Revenue Agency provides data from the register of insured persons. The National Social Security Institute provides data on income from pensions and other social security payments. The Social Assistance Agency provides data on income from social benefits.
The interviewers and the respondents have the option of reporting income gross and/or net at component level. Since 2012, Employee cash or near cash income (PY010) has been collected net only. The net amounts recorded in the database are net of tax on income at source and of social contributions.
The gross income was obtained by summing the net value, income tax payments and compulsory social insurance contributions. If the information on tax and insurance contributions was missing, the amounts were imputed in accordance with labour and social insurance legislation. If either the net or the gross value was missing for PY010 or PY050, the missing value was calculated by net-to-gross or gross-to-net conversion. Where information on income components is missing, data from the National Revenue Agency, the National Social Security Institute and the Social Assistance Agency are used. When data from administrative registers are not available, deterministic regression imputation is applied.
For imputation of income variables in personal data file the following groups were created:
The gross income was obtained by summing up net value, income tax payments and compulsory social insurance contributions. If the information on tax and insurance contributions was missing, the amounts were imputed according to labour and social insurance legislations. In some cases where only net income amounts were available these had to be converted to gross values using all necessary information. Extreme income values and missing values were compared with data provided by National Social Security Institute or administrative data sources and data from previous waves, where possible and corrected if necessary.
Imputed rents are estimated for dwellings used as main residence by the households. The imputation is applied for those households that did not report paying rent:
The market rent is the rent due for the right to use an unfurnished dwelling on the private market, excluding charges for heating, water, electricity, etc.
A stratification method based on actual rents is used (the same as in National Accounts, with the same stratification variables and the same market rents). The method is in line with ESA 95 and with the requirements of Commission Decision 95/309 and Commission Regulation 1722/2005 on the principles for estimating dwelling services. The stratification variables are:
- location (district centre with university, other district centre, smaller town, rural area)
- size of the dwelling
- number of rooms (1, 2, 3, 4+)
- amenities (availability of central heating)
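The stratification approach can be sketched in Python as follows. The sketch assumes that the imputed rent of a household is the mean actual rent observed in its stratum. The stratum keys, rent values and function names are hypothetical, and the actual procedure follows the National Accounts method.

```python
from collections import defaultdict


def stratum_mean_rents(observed):
    """observed: list of (stratum, actual_rent) pairs from rented dwellings.

    A stratum is a tuple of the stratification variables, for example
    (location, number of rooms, central heating).
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for stratum, rent in observed:
        sums[stratum] += rent
        counts[stratum] += 1
    return {stratum: sums[stratum] / counts[stratum] for stratum in sums}


def impute_rent(stratum, mean_rents):
    """Mean market rent of the stratum, or None if no rent was observed."""
    return mean_rents.get(stratum)


observed = [
    (("smaller town", 2, True), 300.0),
    (("smaller town", 2, True), 340.0),
    (("rural area", 3, False), 150.0),
]
means = stratum_mean_rents(observed)
owner_rent = impute_rent(("smaller town", 2, True), means)
```

Strata without observed rents return None in this sketch and would need to be merged with neighbouring strata in practice.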
Actual market rents – main data sources:
- current price statistics
- household budget survey
- real estate agencies
The information on the private use of a company car is collected in the individual questionnaire. To evaluate the benefit from the private use of a company car, the number of kilometres driven, the number of months in which the car was used, the cost of fuel under the statutory spending limits and the average fuel price for the year are used. The fuel cost limit provided by the employer is also taken into account. Missing values are imputed using hot-deck imputation and regression imputation with simulated residuals.
|Model assumption error|
|Data revision - policy|
|Data revision - practice|
No revisions to report.
|Data revision - average size|
|Timeliness and punctuality|
SILC cross-sectional and longitudinal data are available in the form of tables 10 months after the end of the data collection period.
|Time lag - first results|
First data are available 6 months after data collection.
|Time lag - final results|
Final results are available 12 months after data collection.
|Punctuality - delivery and publication|
|Coherence and comparability|
The coherence of two or more statistical outputs refers to the degree to which the statistical processes, by which they were generated, used the same concepts and harmonised methods. A comparison with external sources for all income target variables and the number of persons who receive income from each ‘income component’ will be provided, where the Member States concerned consider such external data to be sufficiently reliable.
|Comparability - geographical|
Comparability across EU Member States is considered high due to use of harmonised concepts, variables, definitions and classifications.
Comparability between different regions of the country is considered high.
|Asymmetry for mirror flows statistics - coefficient|
|Comparability - over time|
In Bulgaria there were no breaks in series or significant changes in 2021. Due to COVID-19, the fieldwork was postponed by a month.
A number of income measures were implemented during the year which could be explained by taking into consideration the following:
|Length of comparable time series|
|Coherence - cross domain|
The cross-sectional data for EU-SILC 2020 were compared with the Labour Force Survey 2020 and HBS 2020.
When comparing SILC and HBS, the discrepancies must be taken into account. The differences are to a great extent due to methodological differences. The main methodological differences are:
|Coherence - sub annual and annual statistics|
Highest ISCED level attained
Self-defined current economic status
Status in employment weighted
|Coherence - National Accounts|
|Coherence - internal|
|Accessibility and clarity|
Poverty and Social Inclusion Indicators.
Detailed results are available to all users of the NSI website under the heading Social Inclusion and Living Conditions - Poverty and Social Inclusion Indicators: https://www.nsi.bg/en/node/8292 and INFOSTAT
|Data tables - consultations|
Anonymised individual data can be made available for scientific research purposes upon individual request, in accordance with the Rules for the provision of anonymised individual data for scientific and research purposes.
Information service on request, according to the Rules for the dissemination of statistical products and services of the NSI.
|Metadata - consultations|
|Documentation on methodology|
Detailed information about the list of social inclusion indicators, their definitions and the algorithm for their calculation at European level can be found on the following site:
|Metadata completeness – rate|
National Quality Report.
|Cost and burden|
The total interview length per household is on average 70 minutes.
|Confidentiality - policy|
|Confidentiality – data treatment|
According to Art. 25 of the Statistics Act, individual data are not published (they are suppressed). Dissemination of individual data is possible only according to Art. 26 of the Statistics Act.