Data Fusion

Data Fusion: Getting good results from noisy data

Real World Data

In the scientific world, data comes from carefully controlled experiments, so it is clean and reliable. In the real world, data comes from many sources and in many formats, and it contains noise, biases and gaps. It is messy, but with the right mathematical analysis it can be very valuable.

Advanced Bayesian Reasoning

Our core technology is advanced Bayesian reasoning, for which we are developing powerful in-house tools. We build models of how the things you care about (e.g. position in tracking, demographics in profiling, opinions in marketing) generate the things you can observe (the data). Bayes' rule then lets us work backwards from the observed data to the things you care about.

The benefits of this approach are that the models:

  • Can leverage knowledge from domain experts.

  • Can combine different types of evidence.

  • Can be adapted to handle changes in your data collection practices.

  • Can explain data in human terms.

  • Can be used and adjusted by non-technical staff.

  • Provide confidence estimates.
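As a rough illustration of several of these points, the sketch below applies Bayes' rule to a toy problem: an expert-supplied prior is combined with two different types of evidence, and the result is a probability for each possibility rather than a single answer. The states, evidence sources and all numbers are hypothetical, chosen only for the example, and the evidence is assumed to be independent given the hidden state.

```python
# Prior belief about the hidden state, e.g. supplied by a domain expert.
# All names and numbers here are hypothetical, for illustration only.
PRIOR = {"home": 0.5, "work": 0.4, "transit": 0.1}

# Generative model: P(observation | state) for two different evidence types.
LIKELIHOODS = {
    "cell_tower": {
        "tower_A": {"home": 0.7, "work": 0.1, "transit": 0.3},
        "tower_B": {"home": 0.3, "work": 0.9, "transit": 0.7},
    },
    "time_of_day": {
        "daytime": {"home": 0.3, "work": 0.8, "transit": 0.5},
        "night": {"home": 0.7, "work": 0.2, "transit": 0.5},
    },
}


def posterior(prior, observations):
    """Apply Bayes' rule, treating observations as independent given the state."""
    unnormalised = dict(prior)
    for source, value in observations.items():
        for state in unnormalised:
            unnormalised[state] *= LIKELIHOODS[source][value][state]
    total = sum(unnormalised.values())
    return {state: weight / total for state, weight in unnormalised.items()}


belief = posterior(PRIOR, {"cell_tower": "tower_B", "time_of_day": "daytime"})
for state, probability in sorted(belief.items(), key=lambda kv: -kv[1]):
    print(f"{state}: {probability:.2f}")
```

Because the prior and the likelihood tables are plain, readable numbers, they can be inspected and adjusted without touching the inference code, and a new evidence type is added by supplying one more table.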

Example: Combining Evidence

Problem: A major broadcaster wanted to deliver an innovative location-based mobile phone service, but most of their customers did not have GPS-enabled phones.

Solution: Statistically combining information from sources that, on their own, were noisy and incomplete: cell ID, some GPS, and Bluetooth.

Benefits: A major improvement in accuracy, from ~500 m to ~30 m.

Related problems: Tracking, navigation, analysing hybrid data, combining human intuition with statistical inference.
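One common way to combine noisy position estimates of this kind is inverse-variance weighting, sketched below. This is not necessarily the method used in the project described above: it assumes each source reports a position with independent Gaussian error of known size, and the `Fix` type, the coordinates and the error values are all made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Fix:
    """One noisy position estimate: east/north in metres plus a 1-sigma error."""
    source: str
    east: float
    north: float
    sigma: float


def fuse(fixes):
    """Inverse-variance weighted mean, assuming independent isotropic Gaussian errors."""
    weights = [1.0 / (f.sigma ** 2) for f in fixes]
    total = sum(weights)
    east = sum(w * f.east for w, f in zip(weights, fixes)) / total
    north = sum(w * f.north for w, f in zip(weights, fixes)) / total
    fused_sigma = (1.0 / total) ** 0.5
    return east, north, fused_sigma


# Hypothetical readings; a source with no reading is simply left out of the list.
fixes = [
    Fix("cell_id", east=120.0, north=-80.0, sigma=500.0),
    Fix("gps", east=10.0, north=5.0, sigma=15.0),
    Fix("bluetooth", east=-5.0, north=12.0, sigma=25.0),
]
east, north, sigma = fuse(fixes)
print(f"fused position: ({east:.1f}, {north:.1f}) m, sigma ~ {sigma:.1f} m")
```

Under these assumptions the fused estimate is never less certain than the best single source, and a very noisy source such as cell ID still contributes a little rather than being discarded.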