Big data analytics

About the Big Data Analytics research direction

The Big Data research subprogramme focuses on efficient big data analysis, in particular on identifying the limits of existing data analysis techniques and on providing guidance for selecting the most efficient technique for each specific problem setup. Besides benchmarks and comparative studies of existing data analysis techniques, new methods may be proposed to optimize the utilization of the infrastructure.

In the Big Data area, CERIT-SC aims to identify and validate the effectiveness and limitations of existing data analysis techniques when applied to different datasets under various setups (e.g. as determined by the research question being answered for a specific dataset). Particular attention is paid to extremely large datasets, which often require very simple techniques that balance analysis feasibility (e.g. response time, amount of resources used) against achievable information value. Example domains include large networks of interconnected sensors (as found, for instance, in the Internet of Things), cybersecurity assurance and (cyber)crime detection (dealing with vast amounts of heterogeneous data), as well as various bioinformatics data portals and analyses (e.g. genome DNA/RNA sequencing and analysis).

Research topics

  • Examination of existing data analysis algorithms and techniques
  • Comparison of existing data analysis tools
  • Architecture of data analysis infrastructure (tool composition)
  • Solutions to domain-specific data analysis problems
  • Domains of smart grids, bioinformatics, cybercrime, and others

Application domains

Big data analysis applies to essentially all domains where big data is present. We specialise in two domains that have the strongest links to our other partners and projects. The first is the Internet of Things (IoT), and smart energy grids in particular. The second is data describing biological problems and samples.


At the moment, we work mostly with publicly available tools, which we configure to best match the problem at hand. These tools are mainly Elasticsearch and Hadoop.


[1] Bangui, H., Ge, M., Buhnova, B., Rakrak, S., Raghay, S., & Pitner, T. (2017). Multi-Criteria Decision Analysis Methods in the Mobile Cloud Offloading Paradigm. Journal of Sensor and Actuator Networks, 6(4), 25.

This work is supported by the project OP RD&E CERIT Scientific Cloud CZ.02.1.01/0.0/0.0/16_013/0001802
