On the second day of the All-Russia Summit ‘Open Data – 2015’, experts discussed linguistic methods of data processing, experience in publishing, and the methodology of collecting and analyzing Open Data, and examined the role of data in data journalism, education and science.
At the Round Table ‘Position and Role of Open Data in Big Data’, special attention was paid to the intersection of personal data and Big Data. Yuriy Ammosov, an expert of the Analytical Center, is sure that the two kinds of data are not identical but can overlap when they concern the same person. Experts also noted that the sector has many ambiguous concepts, so it is necessary to define Big Data and set out the direction of work in this area as quickly as possible.
During the Session ‘Open Data Publishing Practices’, regional representatives talked about their practices and about tools that simplify and monitor the compliance of data sets with guidelines. St. Petersburg has adapted its portal for devices with various screen resolutions and visualized statistical data about its work; the city also supports loading information from third-party systems. Perm Krai has only just started to work in this direction, but has already made a huge leap, shifting its focus from the quantity of data to their quality. All regions are moving towards electronic interaction with different authorities and automated data exchange. ‘The Open Data sector is only being formed, so it is important to inform the public about the best technologies so that new entrants can use proven methods,’ said Yuriy Linkov, a representative of ZAO Gosbook. According to the expert, this requires updating the register of successful projects on the Open Data Portal, holding various events and bringing together representatives of the industry.
‘Scientific communication is developing through the publication of scientific papers, yet the Ministry of Education and Science ranks only 32nd in data openness among federal executive authorities. However, in 2016, 60% of all scientific papers will be in open access.’
Dmitriy Semyachkin, Director of the ‘Otkrytaya Nauka’ Association, ‘KiberLeninka’ Project Manager
‘Open Data practices are relatively new practices of processing digital mass data for making and justifying management decisions,’ said Galina Gradoselskaya, an expert of the Analytical Center, at the Round Table ‘Methodology for Collecting and Analyzing Open Data.’ According to the expert, the sector has a major problem: the lack of a developed, unified methodology. As a result, every sphere of activity has its own tools, which differ from one another. According to experts, this problem has arisen in Russia because the industry began developing rather late and because the main Portal was designed without a clearly reasoned specification or ready data consumers. The US and Europe have their own methodologies, but they do not fit Russia. During the discussion, experts worked out steps towards creating a unified national methodology that meets the interests of all parties in the industry.
The Session ‘Open Data in Education and Science’ touched upon the role of Open Data in improving the efficiency of scientific research and access to its results. Inna Karakchieva, an expert of the Analytical Center, moderated the event. She emphasized the need to publish all Open Data from studies funded by government grants. Irina Anisimova, a representative of the Federal Labour and Employment Service, considers it an important task to create a classifier of competencies that will set out employers' quality requirements for specialists. ‘Scientific communication is expanding through the opening of scientific publications, yet the Ministry of Education and Science ranks only 32nd in data openness among federal executive authorities. However, in 2016, 60% of all scientific papers will be in open access,’ said Dmitriy Semyachkin, Director of the ‘Otkrytaya Nauka’ Association and ‘KiberLeninka’ Project Manager. The gap between education and real business was noted by Sergey Neizvestniy, a professor at the Russian State Social University. In his opinion, it is necessary to change the curriculum and the motivation of teaching staff in favor of Open Data so that graduates' skills meet the needs of business.
Speaking at the Round Table ‘Linguistic Techniques of Data Processing’, field experts noted that linguistic systematization is crucial for processing Big Data. Methods of computational linguistics are necessary to extract facts and to verify, standardize, aggregate and link them. Iliya Dimitrov, the Executive Director of the Association of Electronic Trading Platforms (AETP), said that ‘without linguistic technologies there will be no Big Data.’ Representatives of companies involved in the systematization of linguistic data shared their practices and discussed problems. The systems presented independently search the context for key features and interpret them in the desired format; they are used for data protection, document processing and other tasks. According to experts, the business value of Open Data begins when the data are clean and aggregated. They noted a shortage of personnel in the sector, as it is difficult to find professionals who can both program and understand linguistics. They also drew attention to the fact that a single technology is often not enough today: a set of analysis techniques is needed to reduce the number of false positives.