Analytical Center Is Working on a Method for Assessing and Improving Data Quality

31 July 2019

The Analytical Center presented a draft method for assessing and improving data quality for expert discussion. The document is part of the methodology of the National Data Management System and was developed within the framework of the federal project "Public Digital Management" of the national program "Digital Economy".

The method makes it possible to assess the quality of data in an information resource. It excludes the safety parameter, which applies to the information system as a whole rather than to the data.

The authors of the document identified 14 parameters related to data quality, 7 of which have measurable values: coverage, completeness, accuracy, timeliness, consistency, integrity and uniqueness.
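
The article does not give formulas for these parameters, so the following is only a minimal sketch of how two of them, completeness and uniqueness, might be measured on a small set of records. The function names, the field names and the definitions themselves (share of filled required fields, share of distinct keys) are illustrative assumptions, not the method's own definitions.

```python
from typing import Any, Mapping, Sequence


def completeness(records: Sequence[Mapping[str, Any]],
                 required_fields: Sequence[str]) -> float:
    """Assumed definition: share of required field values that are filled in.

    A value counts as missing when it is absent, None, or an empty string.
    """
    total = len(records) * len(required_fields)
    if total == 0:
        return 1.0  # nothing is required, so nothing is missing
    filled = sum(
        1
        for record in records
        for field in required_fields
        if record.get(field) not in (None, "")
    )
    return filled / total


def uniqueness(records: Sequence[Mapping[str, Any]],
               key_fields: Sequence[str]) -> float:
    """Assumed definition: share of distinct key values among all records."""
    if not records:
        return 1.0
    keys = {tuple(record.get(f) for f in key_fields) for record in records}
    return len(keys) / len(records)


# Hypothetical records from an information resource.
records = [
    {"id": "1", "name": "Alpha", "region": "North"},
    {"id": "2", "name": "", "region": "South"},      # missing name
    {"id": "2", "name": "Beta", "region": None},     # duplicate id, missing region
    {"id": "3", "name": "Gamma", "region": "East"},
]

print(round(completeness(records, ["id", "name", "region"]), 3))
print(uniqueness(records, ["id"]))
```

Both functions return a value between 0 and 1, so measurements of different parameters can be compared or combined later.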

"These parameters are of different value for different information resources. Some of them need completeness and accuracy, and in some situations the speed and timeliness are of priority, for example in case of emergency," said Alexander Malakhov, Head of the Division for Methodological Support for Data Handling, Analytical Center.

The expert emphasized that this method is just one of a series of documents being developed: "The criteria, assessment and improvement of data quality in this method are based on the assessment of a single information resource. Consequently, the questions that arise in relation to the system of inconsistent information resources existing in the sphere of public administration are not considered."

Currently, the document covers 4 basic steps of quality assessment: assessing the information resource (based on an analysis of regulations), a desktop study, obtaining the data array (quality parameters are checked and incidents are analyzed) and extracting errors (as part of the complete extraction of the report and its check against the parameters).

This analysis produces a report on the current state of the resource. However, data quality assessment must be conducted continuously, Mr Malakhov emphasized, adding that the chosen parameters may apply to each information resource in a different way, thus influencing the final assessment.
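
The article does not say how the parameters are combined into a final assessment. One simple way to picture parameters mattering differently for different resources is a weighted average, sketched below. The weighting scheme, the function name and all numbers are assumptions made for illustration only.

```python
from typing import Mapping


def overall_score(scores: Mapping[str, float],
                  weights: Mapping[str, float]) -> float:
    """Weighted average of per-parameter scores (an assumed aggregation rule).

    `scores` maps a parameter name to a value between 0 and 1.
    `weights` maps the same names to non-negative importance weights.
    """
    missing = set(weights) - set(scores)
    if missing:
        raise ValueError(f"no score for parameters: {sorted(missing)}")
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(scores[name] * w for name, w in weights.items()) / total_weight


# Hypothetical measurements for one information resource.
scores = {"completeness": 0.9, "accuracy": 0.8, "timeliness": 0.5}

# Hypothetical weights: a registry values completeness and accuracy,
# while an emergency service values timeliness most.
registry_weights = {"completeness": 2, "accuracy": 2, "timeliness": 1}
emergency_weights = {"completeness": 1, "accuracy": 1, "timeliness": 3}

print(round(overall_score(scores, registry_weights), 2))
print(round(overall_score(scores, emergency_weights), 2))
```

With the same measurements, the resource scores higher under the registry weights than under the emergency weights, because its weakest parameter, timeliness, counts for less there.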

During the discussion, the experts made various proposals for refining the method. Irina Pantina, Deputy Director of the Department of Statistics and Data Management and Director of the Data Management Center of the Bank of Russia, proposed two phases of data quality assessment. Phase 1 is the connection of an information resource to the National Data Management System, which presupposes the use of similar metrics, methods and types of communication with such information sources. Phase 2 is the measurement of metrics, criteria and quality indicators, to be assessed on a regular basis.

Olga Dudorova, Head of the Department of Statistics of Education, Science and Innovations of the Federal Statistics Service, suggested that the method should include the approaches used in the Multidimensional Information System.

Yaroslav Omelay, Director of the Department of Information Technologies and Assurance of Project Activity of the Ministry of Labor of Russia, noted that the method lacked a data characteristic that would indicate whether the data are primary or derived. "It is important that an information resource is assessed on whether these are primary data, and whether they are involved in real processes. If the data are not applied, they will never be complete and of high quality," he explained.

Representatives of the Federal Meteorology Center, the Federal Service for Execution of Punishment, the Federal Protective Service, the Ministry of Defense of Russia and other authorities and organizations also spoke during the discussion.

An updated version of the method will be published in August.