Research Interests
Key words:
- data mining and data analysis,
- astroinformatics,
- data visualization,
- computational astrophysics,
- high-performance computing,
- distributed systems.
During the past decade, data mining has gained a solid foothold in a variety of application domains. The amount of data, available in a growing range of formats, is increasing at an unprecedented speed. Furthermore, both data and computational resources are increasingly distributed across geographical locations. Most current data-mining techniques, however, assume that data are homogeneous and stored in a centralized location, which opens a wide range of research opportunities to address these issues.
My short-term research goals include, but are not limited to, the
application of data-mining techniques to open research questions in
astronomy (Notes for ADASS 2007 Tutorial on Data Mining in Astronomy,
Additional Handout). The extension of the distributed clustering scheme
developed for my thesis, which enables scalable, efficient, and
privacy-preserving data mining for geographically distributed data,
will target additional clustering techniques and the problem of
clustering tasks in sensor networks. The astronomy applications, such
as a multi-wavelength galaxy classification scheme, draw on clustering
and matrix decomposition techniques, for example to detect
three-dimensional structures in data obtained from the Canadian
Galactic Plane Survey.
Long-term research goals include the development of new data-analysis and data-mining techniques capable of dealing with the special issues associated with data in certain application domains. For example, astronomical data obtained from multiple surveys are measured at varying resolutions, may contain noisy data and multiple measurements, require techniques for cross-identification of objects detected in various surveys, and contain measurements influenced by measurement errors and selection effects. These characteristics correspond directly to characteristics of data obtained in other application domains, including health care, economics, bioinformatics, security, and sensor networks, and therefore open numerous opportunities for research and collaboration.
I am also interested in visualization techniques for distributed and high-dimensional data, the parallelization and optimization of existing data-mining techniques, the development and optimization of astrophysical simulation and modeling code, and the integration of data-mining approaches into the Virtual Observatory.
"I don't believe in mathematics" (Albert Einstein)