Intellifest 2012: Long-Ji Lin: Efficient Predictive Models for On-Line Ads

on October 25, 2012

Long-Ji Lin delivered a very interesting presentation on how he has addressed the key issues facing predictive models for on-line ad placement. These challenges are wide-ranging and significant:

  • latency
  • scalability under cost control
  • curse of dimensionality – very large number of input variables, very few positive cases
  • actual efficiency – with very few positive cases to work with
  • environment challenges – drifts in taste, impact of unpredictable events

Interestingly, these are very similar to the challenges faced by fraud detection systems, which I am more familiar with.

Long-Ji pointed out that the actual choice of algorithm for building these models is significantly less important than handling the challenges above.

The solutions mentioned are also similar to those in the fraud industry:

  • Downsample the negative data (heuristic: 1 positive case for every 5 to 100 negative cases)
  • Use near-conversion as positive data – such as putting items in the shopping cart
  • Use pooled models
  • Inject domain knowledge you may have into the models
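The first heuristic above, downsampling the negatives, can be sketched in a few lines. This is only an illustration, not the pipeline described in the talk: the function name `downsample_negatives`, the `(features, label)` row layout, and the default ratio of 10 negatives per positive (picked from the 5 to 100 range) are all assumptions.

```python
import random


def downsample_negatives(rows, negatives_per_positive=10, seed=0):
    """Keep every positive row and a random subset of the negative rows.

    rows: list of (features, label) pairs, where label 1 marks a
    conversion and label 0 marks a non-conversion (assumed layout).
    """
    rng = random.Random(seed)
    positives = [row for row in rows if row[1] == 1]
    negatives = [row for row in rows if row[1] == 0]

    # Never ask for more negatives than are available.
    keep = min(len(negatives), negatives_per_positive * len(positives))
    sampled = rng.sample(negatives, keep)

    # Shuffle so positives are not grouped at the front of the training set.
    result = positives + sampled
    rng.shuffle(result)
    return result
```

A model trained on data sampled this way sees conversions far more often than they occur in live traffic, so its raw scores generally need to be recalibrated before they are read as conversion probabilities.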

To address the curse of dimensionality (50K features per campaign/model), the solutions are also similar:

  • pruning (tree pruning, connection pruning, L1/L2 norm regularization)
  • but even better, spend time/energy finding the good features!
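To make the regularization item concrete, here is a minimal sketch of how an L1 penalty prunes features: it drives the weights of uninformative inputs to exactly zero, so those inputs can be dropped. The talk did not say which model or optimizer was used; logistic regression trained by proximal gradient descent, and the names `soft_threshold`, `fit_l1_logistic` and `kept_features`, are assumptions made for illustration.

```python
import math


def soft_threshold(w, t):
    """Shrink w toward zero by t; anything within [-t, t] becomes exactly 0."""
    if w > t:
        return w - t
    if w < -t:
        return w + t
    return 0.0


def fit_l1_logistic(X, y, l1=0.1, lr=0.1, epochs=200):
    """Fit L1-penalized logistic regression with proximal gradient descent."""
    n, d = len(X), len(X[0])
    w = [0.0] * d
    for _ in range(epochs):
        # Gradient of the average log-loss with respect to each weight.
        grad = [0.0] * d
        for xi, yi in zip(X, y):
            z = sum(wj * xj for wj, xj in zip(w, xi))
            p = 1.0 / (1.0 + math.exp(-z))
            for j in range(d):
                grad[j] += (p - yi) * xi[j] / n
        # Gradient step, then the L1 proximal step (soft-thresholding).
        w = [soft_threshold(wj - lr * gj, lr * l1) for wj, gj in zip(w, grad)]
    return w


def kept_features(w):
    """Indices of the features that survive pruning (non-zero weight)."""
    return [j for j, wj in enumerate(w) if wj != 0.0]


# Toy data: feature 0 determines the label, feature 1 is unrelated to it.
X = [[1, 1], [1, -1], [-1, 1], [-1, -1]]
y = [1, 1, 0, 0]
weights = fit_l1_logistic(X, y)
print(kept_features(weights))
```

On this toy data only feature 0 keeps a non-zero weight. With 50K features per model the same mechanism applies, although a production system would rely on a sparse, distributed solver rather than nested Python loops.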

The scale is Big Data scale:

  • 20 B ad requests
  • 100 M ad views
  • 1 TB data
  • 1000 advertisers
  • 20 trillion decisions

The models are built frequently. One key point is that they often need to be tested in real situations rather than on historical data in order to truly assess their quality.

To build and test the models, the systems Long-Ji works with combine the usual Big Data suspects:

  • Hadoop
  • Hive
  • HBase
