My solution plan for the GF Challenge

I am a solver in the IARPA GF Challenge. This is a cross-post of my solution plan. I am sharing it mostly because I am afraid of having no competition at all. About 500 people have signed on, but they are notably silent. People who are getting ready usually try things and ask questions, yet only a few are posting or asking questions in the GF Forum. I don’t take this as a sign that everybody else is busy. I take it as a sign that nobody has a clue. We’ll see how many people pop up on the Leaderboard when things get started. Anyhoo, here’s my plan:

The rules say: “Finalists who wish to compete for monetary prizes will need to provide the solution package for review [by] the GF Challenge Team” and “Solvers will provide a short (4 page) explanation for their solution”.

There are several sources of information for each question:

  • Individual: Individual crowd forecasts
  • Consensus: Crowd consensus forecast
  • Model: Output of domain-specific models for the question
  • Intel: History of reference data required to resolve the question, and events and opinions gathered from news and social media
  • Crowd Experience: Brier scores of crowd forecasters identified by their anonymous GUID and recorded forecasts on resolved questions. This information becomes available as we work through and resolve the 175 questions to be answered during the Challenge.
  • My Experience: Applying my own domain-specific knowledge to the question, putting myself in the role of Analyst Manager.
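Since the Brier score is the yardstick for everything that follows, here is a minimal sketch of it for a binary question. It assumes the simple one-sided form (squared error between the forecast probability and the 0/1 outcome); the Challenge's exact scoring rule is not stated in this post and may differ, for example by summing over answer options. The function names are mine, for illustration only.

```python
def brier_score(forecast, outcome):
    """Squared error between a forecast probability and the 0/1 outcome.

    Assumption: one-sided binary Brier score; lower is better, 0 is perfect.
    """
    return (forecast - outcome) ** 2


def mean_brier(forecasts, outcome):
    """Average Brier score of a stream of forecasts on one resolved question."""
    return sum(brier_score(f, outcome) for f in forecasts) / len(forecasts)


if __name__ == "__main__":
    # A forecaster who said 50%, then 70%, then 90% on a question that resolved "yes".
    print(mean_brier([0.5, 0.7, 0.9], 1))
```

Under this form, a forecast of 80% scores roughly 0.04 if the event happens and 0.64 if it does not, which is why confident forecasts pay off only when they are right.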

I will create the following kinds of domain-specific models, which take Intel as input (not Individual or Consensus forecasts, or derived Crowd Experience information):

  • Time series rate: Will Gold price lie between $1250 and $1350 on 18 March 2018?
  • Time series frequency: Will there be a mass killing event in Sudan in the month of March 2018?
  • Selectorate: Will Scotland vote to secede from the UK on or before 18 March 2018?
  • Multiple Criteria Decision Analysis (MCDA): Will either Turkey or Russia officially suspend or cancel the Akkuyu nuclear power plant project before 18 March 2018?
  • Analysis of Competing Hypotheses (ACH): Will President Putin meet with Prime Minister Abe in Japan before 18 March 2018?
  • Social Media (SM): Will the UK vote to exit the European Union before 18 March 2018?

I distinguish MCDA from ACH largely by whether the outcome is predictable by reading the news or whether there is something more structural and quantifiable about the decision factors. I reserve MCDA for questions with structure and ACH for news-driven questions. Typically a power plant decision would have some structural features that can be quantified. Whether or not Putin goes to Japan is more of a guessing game signalled by diplomatic moves reported in the press. I reserve the Social Media category for matters so unpredictable that only keeping an ear to the ground (munching Twitter posts) will give an idea of the sentiment of the selectorate.

On a question by question basis I may use more than one model, and combine their results with a linear weighted average.
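A linear weighted average of model outputs could look like the sketch below. The function name, the example probabilities, and the weights are all hypothetical; the weights in particular are a judgment call made per question, not something this post specifies.

```python
def weighted_average(forecasts, weights):
    """Combine probability forecasts with a linear weighted average.

    forecasts: probabilities in [0, 1], one per model
    weights:   non-negative weights, one per model (need not sum to 1)
    """
    if len(forecasts) != len(weights):
        raise ValueError("need exactly one weight per forecast")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(f * w for f, w in zip(forecasts, weights)) / total


if __name__ == "__main__":
    # Hypothetical example: a time series model says 60%, an ACH model says 80%,
    # and I trust the ACH model three times as much.
    print(weighted_average([0.6, 0.8], [1, 3]))
```

Dividing by the sum of the weights keeps the result a valid probability whenever the inputs are valid probabilities.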

I will use the following prediction methods which combine the above sources of information in various ways to get a forecast stream which yields a better Brier score than the Consensus forecast stream:

  • crcc01: Consensus 0/1. Round the Consensus forecast to 0%/100%
  • crcdmf: Expert crowd (see https://larswericson.wordpress.com/2015/12/22/double-median-filter/). Take all the forecasters in a question who have beaten the crowd in some related questions. Each forecaster’s weight is the sum of their accuracy scores in the questions where they beat the crowd, so the more questions they were in, the higher their weight. Take the 24 forecasters with the most negative sum of accuracy scores. Keep the forecasts which are within 2 standard deviations of the median. Take the median again and keep the remaining forecasts which are within 2 standard deviations of that. Then take the average of those forecasts, and that’s the prediction.
  • crcdmf01: Expert crowd 0/1. Take the output of double median filter and round it to 0%/100%.
  • crcm: Models. Take the output of domain-specific models only and ignore the crowd and consensus forecast stream.
  • crcm01: Models 0/1. Take the output of domain-specific models and round it to 0%/100%.
  • crcmcm: Managed consensus plus models. Combine the Consensus forecast and models using weights decided by the analyst manager (me).
  • crcmcm01: Managed consensus plus models 0/1. Round the managed consensus plus models to 0%/100%.
  • crcmem: Managed expert plus models. Combine the Expert crowd output with models using weights decided by the analyst manager (me).
  • crcmem01: Managed expert plus models 0/1. Round the managed expert plus models to 0%/100%.
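The double median filter and the 0/1 rounding used by the methods above can be sketched as follows. This is my reading of the description, under several assumptions that I label in the comments: a negative accuracy score means the forecaster beat the crowd, the standard deviation is the population one and is recomputed on the survivors for the second pass, and a forecast of exactly 50% rounds up. The function names and the input layout (dictionaries keyed by forecaster GUID) are hypothetical.

```python
from statistics import median, pstdev


def filter_near_median(values, k=2.0):
    """Keep the values that lie within k standard deviations of the median."""
    m = median(values)
    s = pstdev(values)  # assumption: population standard deviation
    kept = [v for v in values if abs(v - m) <= k * s]
    return kept or list(values)  # guard against an empty result


def double_median_filter(forecasts, accuracy_sums, top_n=24, k=2.0):
    """Expert-crowd forecast for one question.

    forecasts:     {guid: current probability forecast on this question}
    accuracy_sums: {guid: sum of accuracy scores on related questions}
    Assumption: a negative sum means the forecaster beat the crowd.
    """
    experts = [g for g in forecasts if accuracy_sums.get(g, 0.0) < 0]
    experts.sort(key=lambda g: accuracy_sums[g])  # most negative first
    values = [forecasts[g] for g in experts[:top_n]]
    if not values:
        return None  # no qualifying experts on this question
    once = filter_near_median(values, k)
    twice = filter_near_median(once, k)  # median and deviation recomputed
    return sum(twice) / len(twice)


def round01(p):
    """Round a probability to 0% or 100% (assumption: exactly 50% rounds up)."""
    return 1.0 if p >= 0.5 else 0.0


if __name__ == "__main__":
    forecasts = {"a": 0.70, "b": 0.72, "c": 0.68, "d": 0.71, "e": 0.05, "f": 0.10}
    sums = {"a": -0.9, "b": -0.5, "c": -0.4, "d": -0.3, "e": -0.2, "f": 0.6}
    p = double_median_filter(forecasts, sums)
    print(p, round01(p))
```

In this made-up example, forecaster f is dropped for not beating the crowd and forecaster e is dropped as an outlier in the first pass, so the prediction is the average of the four remaining forecasts.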
