Solving 'Big Data' Problems using Coresets

Lead Researcher: Dr. Dan Feldman

Background
"Big data" refers to the unprecedented volumes of data streaming from all aspects of our lives: posts on social networks, readings from sensors, digital pictures and videos, GPS signals from mobile phones, and other sources, together estimated at 2,500,000,000,000,000,000 bytes (2.5 quintillion) a day. Commonly used software tools are incapable of processing all this information in real time.
In a world where technological progress generates these massive amounts of digital data, leaner, faster and more affordable solutions are needed to help sort and analyze information in real time.

Coresets – A New Paradigm for Better and Faster Machine Learning Performance in Robotics
Coresets (data reduction algorithms) are a new paradigm that can help produce more accurate results on bigger and more complex datasets faster than ever. Unlike general-purpose compression techniques (such as ZIP or MP4), a coreset is a problem-dependent data reduction technique: it keeps only the information that a specific problem needs, which can make solving that problem faster by an order of magnitude. Working with the smaller dataset improves running time while only marginally compromising accuracy relative to the original data.
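To make "problem-dependent data reduction" concrete, here is a minimal Python sketch. It is not one of the lab's algorithms. It assumes one particular problem, the sum of squared distances from a query point to every point in a dataset, for which three statistics are enough to answer any query exactly. The names `SquaredDistanceSummary`, `summarize` and `brute_force_cost` are hypothetical and chosen for this illustration. Coresets for harder problems are typically small weighted subsets of the data with approximation guarantees rather than exact summaries.

```python
from dataclasses import dataclass
from typing import List, Sequence

Point = Sequence[float]


@dataclass
class SquaredDistanceSummary:
    """Three statistics that stand in for a point set, for one cost function."""
    count: int               # number of points n
    coord_sums: List[float]  # coordinate-wise sum of all points
    sum_sq_norms: float      # sum of squared Euclidean norms

    def cost(self, query: Point) -> float:
        # sum_i ||p_i - q||^2 = n*||q||^2 - 2*<q, sum_i p_i> + sum_i ||p_i||^2
        q_sq = sum(c * c for c in query)
        dot = sum(c * s for c, s in zip(query, self.coord_sums))
        return self.count * q_sq - 2.0 * dot + self.sum_sq_norms

    def merge(self, other: "SquaredDistanceSummary") -> "SquaredDistanceSummary":
        # Summaries of two chunks combine by addition, so data can be streamed.
        return SquaredDistanceSummary(
            self.count + other.count,
            [a + b for a, b in zip(self.coord_sums, other.coord_sums)],
            self.sum_sq_norms + other.sum_sq_norms,
        )


def summarize(points: Sequence[Point]) -> SquaredDistanceSummary:
    """Reduce a non-empty list of equal-length points to the three statistics."""
    sums = [0.0] * len(points[0])
    sq = 0.0
    for p in points:
        for i, c in enumerate(p):
            sums[i] += c
            sq += c * c
    return SquaredDistanceSummary(len(points), sums, sq)


def brute_force_cost(points: Sequence[Point], query: Point) -> float:
    """Reference answer computed from the full dataset."""
    return sum(sum((a - b) ** 2 for a, b in zip(p, query)) for p in points)


if __name__ == "__main__":
    points = [(0.0, 0.0), (2.0, 0.0), (0.0, 2.0), (2.0, 2.0)]
    query = (1.0, 1.0)
    # Both numbers agree, but the summary no longer needs the original points.
    print(brute_force_cost(points, query), summarize(points).cost(query))
```

The summary has the same size whether it was built from four points or four billion, and it is useless for any other question about the data. That trade-off is what separates this kind of reduction from general-purpose compression.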

Dr. Dan Feldman, Assistant Professor in the University of Haifa's Department of Computer Science, is seeking to bridge the gap between mathematical theory and engineering applications and is introducing a new approach to solving big data problems plaguing the IT industry. Feldman and his students are using coresets (as a statistical computation tool) to solve fundamental problems emerging in big data that affect machine learning performance through robotic projects in the Robotics and Big Data (RBD) Lab.

The RBD Lab's projects aim to support better business intelligence and to optimize the performance of simple robots for improved customer service and greater cost savings.

Research Status
The group has recently begun to apply coresets to differential privacy as a way of addressing security challenges in cloud computation. Statistical data is extracted from large datasets while preserving the privacy and anonymity of their users, creating what the research team refers to as private coresets, or sanitized databases.
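For readers unfamiliar with differential privacy, the following Python sketch shows the textbook Laplace mechanism applied to a counting query. It illustrates the general idea of releasing a statistic without exposing any individual record. It is not the team's private-coreset construction. The function names, the sample data and the choice of `epsilon` are assumptions made for this example, and the code is not hardened for production use.

```python
import random
from typing import Callable, Iterable, Optional, TypeVar

T = TypeVar("T")


def laplace_noise(scale: float, rng: random.Random) -> float:
    # The difference of two independent exponential variables with mean
    # `scale` follows a Laplace distribution with that scale.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_count(
    records: Iterable[T],
    predicate: Callable[[T], bool],
    epsilon: float,
    rng: Optional[random.Random] = None,
) -> float:
    """Return a noisy count of the records that satisfy `predicate`."""
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    rng = rng or random.Random()
    true_count = sum(1 for r in records if predicate(r))
    # Adding or removing one record changes a count by at most 1, so
    # Laplace noise with scale 1/epsilon is the standard calibration.
    return true_count + laplace_noise(1.0 / epsilon, rng)


if __name__ == "__main__":
    ages = [23, 35, 41, 52, 29, 67]  # made-up sample data
    print(private_count(ages, lambda a: a >= 40, epsilon=1.0))
```

A smaller `epsilon` adds more noise and gives stronger privacy, while a larger one gives a more accurate answer.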

Other projects include the development of gesture control armbands – the future of wearable technology – and human-computer interaction that offers the user hands-free control of a device (e.g., a smartphone, computer, and even industrial machines), using only gestures and motion.

At the Robotics and Big Data (RBD) Laboratory, a team of Computer Science students is developing inexpensive real-time tracking systems that will turn ordinary toy drones into autonomous drones, capable of navigating through complex grounds and buildings by combining low-cost (but very safe) ‘simple’ hardware with strong, novel algorithms.

Soliman Nasser (PhD student) and Ibrahim Jubran (MSc student) are developing state-of-the-art algorithms based on coresets that will be able to track and localize flying robots. With the help of research assistant Michael Volgin and BSc student George Kesaev, the team has succeeded in creating a low-cost tracking system intended ultimately to replace commercial systems that are up to a hundred times more expensive.

 

Applications

  • Drone / quad-copter navigation / tracking systems
  • Image de-noising
  • Telecommunication network optimization
  • GPS / video data compression and analysis
  • Google page-ranking
  • Latent semantic analysis
  • Privacy-preserving learning using differential privacy and homomorphic encryption
  • Additional confidential applications

 

Related pages

Dr. Dan Feldman, researcher page

Robotics and Big Data Lab website