Book-length thread on single forecast, preserved for posterity


If some day 30,000 years in the future space aliens visit the empty hulk of Earth, this is how they will come to know us:

Reaffirming my forecast.

The question is stated as delivery of a “S-300 or S-400 missile system.” A “system” requires all the major operational components to be delivered (search radar, targeting radar, missile launcher, command post, etc.). Regarding hiding the system, while I agree that generally you don’t want to give away the exact location of your military assets, it is important to credibly signal your enemies if you expect to deter them. Most of the Iranian comments to this point are probably inward directed, to create a sense of strength in the aftermath of the nuclear deal with the West. Outward directed statements (to Israel) will come when the system is actually functioning.

I agree with your assessment. I have been forecasting under the assumption of one completed S-300 (or S-400) SYSTEM. Others have been speculating that S-300 manuals or even just the missiles, will suffice.



Haha we should have all listened to Dima K’s rumor and stayed at 100%. There is a Superforecaster Platform, people, and we’re not on it!

Makes absolutely no difference since the resolution is backdated to Nov.23rd

@Dima K. Well, GJ still hasn’t closed this question, so I think they are keeping an open mind to the debate here. I can’t agree with the criteria they used to resolve the question in the Superforecaster Platform.

“to deliver” means that the system needs to actually be in Iran.

All that happened on the 23rd is Russia “began the process of delivering.” For all we know, that means all they did was move the S-300 from Siberia to a warehouse in Moscow.



No it’s closed, you can’t enter new predictions. They just haven’t run the scoring function yet so it doesn’t show up on your profile Scores.

Ah. I see. I’d love to see their reasoning with this one. “Agreed to deliver” is not equal to actually delivering. There were so many examples in past seasons that a promise or an announcement to do something (instead of actually doing it) would not suffice.

I would not be surprised if the delay in calculating scores is because the Powers that Be are debating whether to void this question or reopen it. The announced resolution is a classic example of the question substitution fallacy. Whoever made the decision substituted the question, “When will news stories say delivery ‘has begun’?” for the stated question, “Will Russia deliver an S-300 or S-400 missile system to Iran before 1 January 2016?” This decision is highly offensive and, if permitted to stand, will drive away players. Our scores will become meaningless if this sort of incompetence among the organizers of this game were to continue.



That plus the fact that it just reinforces the idea that we are just here to make an off-site team look good.

“we are just here to make an off-site team look good”

Jesus… From everything I know, that is definitely not true. One would still exist if the other didn’t. The two are independent and the question overlap is not that big. (In this case, the question was the same but its cutoff date was very different. This provided for very, very different forecasts.) Finally, AFAIK, there is zero interest in comparing the two.



Oh, come on, you’re the one that started the rumor! [1] What do you think the GJ, Inc. business plan is? How do they expect to monetize this site?





Yes but,…still not scored. Open Question: The score hasn’t shown up because

A. GJOpen computer broken or they need to hire a programmer: 18%
B. @cmeinel is right, they may void this question because mishegoss meshuganah: 68%
C. Line at Chicago Starbucks backed up: 14%

@000 I haven’t laughed this hard in a month of Mondays.

Good place to vent! I’ve been on the “(almost) manuals” side since the other extreme did not make sense to me. Not sure this decision really passes the giggle test, but here is the way I’ve rationalized it.

Parchin was “inspected” by the IAEA Director carrying away some random handfuls of dirt. Since the IAEA blessed the inspection (regardless of what one may think of its utility), it makes sense in a way. Who better to “judge.” Along the same lines, (ignoring all the silly blather battered about as news) if enough key individuals (particularly on the Iranian side) say it is happening/happened, then who better to judge?



Except that people have been known to misrepresent the ground truth if they perceive an advantage in doing so. [1]


Excellent point, Lars. For example, just a few days ago, inspectors determined that despite many denials, Iran did, at one time, have a nuclear weapons program.



I ❤ it when things go a-rye. Although, I’d rather they go a-malted barley and peat 😉

Has anyone seen a reputable (i.e. not Tass, sorry!) source that the system(s) were actually delivered? I can’t find a thing online.





I found an interview with one of the GJP judges talking about how this question was resolved. [1]


I’m just trying to figure out what the objective of this game might be. A quick scan of the people I’ve identified as the top 20 superforecasters overall shows that most of them now have Brier scores close to or exceeding 0.5, for example mike97mike, jeremylichtman, Clairvoyance, just to mention those active in this thread. I’m now up to 0.449 but only escaped worse by having been scored on 18 questions so far, dilution effect.

“The only way to win is not to play the game.”



Well you’ve kept your mojo anyway, you’re showing up in the top 12 in many of my dream team picks:

000: “Is” is a dirty word, in some quarters!

cmeinel: There’s no prize. I don’t know about anyone else, but I “compete” because it’s an interesting puzzle. That said, it looks like they’re moving away from “pure” Brier scores to relative scoring, so a 0.5 may not mean the same thing. We’re also on “classic” Brier scoring, where the worst possible score is 2 (not 1, in the modified score). Also, I really messed up on a couple of questions!
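For anyone unsure of the difference between the two conventions mentioned above, here is a minimal sketch for a yes/no question. The function names are illustrative only and are not taken from the GJ Open site, and the 5% forecast is a made-up example rather than anyone's actual forecast.

```python
def classic_brier(p_yes: float, outcome_yes: bool) -> float:
    """Two-category Brier score for a binary question (0 is perfect, 2 is worst)."""
    o = 1.0 if outcome_yes else 0.0
    # Sum the squared error over both categories, "yes" and "no".
    return (p_yes - o) ** 2 + ((1.0 - p_yes) - (1.0 - o)) ** 2


def modified_brier(p_yes: float, outcome_yes: bool) -> float:
    """Single-category variant (0 is perfect, 1 is worst)."""
    o = 1.0 if outcome_yes else 0.0
    return (p_yes - o) ** 2


if __name__ == "__main__":
    # Hypothetical forecasts on a question that resolves "yes".
    for p in (0.5, 0.05):
        print(p, classic_brier(p, True), modified_brier(p, True))
```

On the classic scale a permanent 50% forecast scores 0.5 on every binary question, so a cumulative score near 0.5 is roughly what hedging everywhere would have earned. A hypothetical forecaster sitting at 5% on a question that resolves Yes would take about 1.8 on that one question.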

@000 But have you updated it since this afternoon’s score updates? Now our various superpick techniques will be biased by this weird scoring decision.

That system outage today must have been an upgrade of the software. Now the @username technique sends a notification to username.



@cmeinel, my run last night assumed that this question was closed as advertised. I reran it with today’s additional forecasts and the numbers across all questions only change a % or 2 here and there, not worth updating right at the moment.

My cumulative score has hit the 1.0 zone with this question. I’m going to step back from thinking about the world and just run my superpicker algorithm for a little while to see if that restrains me back into a more reasonable score. Or I could eat ice cream and watch TV, we’ll see what takes precedence.

Scores. Smores. If you haven’t lived on the street of an American city, or in a refugee camp in the Jordanian desert, or woken up on the windswept frozen taiga with no warm coffee, then this is a good opportunity to experience the feeling. If you think forecasting is all reading The Diplomat or Die Welt, or Googling the right source engines, or even the correct algorithm, like a good, mythical, field agent – James Bond – you survive your 0.4-0.5 Brier and even more numbing scores. You learn that now is different from then. That 2015 is not 2005. The spiral moves forward.

@mike97mike Yes, John le Carre.



@mike97mike, @jeremylichtman, I rely for my global inspiration strictly on the wisdom of this guy [1][2], although I’m told this guy [3] is also not too far off the mark.


@mike97mike: The issue I’m trying to figure out is why Dr. Tetlock, who is a part owner via Good, would wish to demonstrate to the world that he has lost his mojo. That is the effect of running a game where known habitual superforecasters [his TM] are being demonstrated to be not so super after all. Consider that this is his selling point to UBS and his statements to the press, Edge Master Classes to IARPA etc. In the case of the S-300 question, his team, people who worked on his IARPA contract, made a classic cognitive science error: question substitution. Instead of the hard question, which during GJP would only have been resolved by approval by IARPA, GJOpen has documented the fact that it substituted the easy question, which was when will news sources that the scorers like, regardless of how frequently they have been known to be outlets for governmental disinformation, report that shipments “have begun.” Even if true, these news stories did not mean a “system” has arrived. As pointed out by others, this could mean, for example, movement from one Russian warehouse to another, and of components such as just a screwdriver or just a manual. This even could mean that preparation of shipping documentation has begun. Heck, you don’t need to be an industrial engineer with expertise in arms shipments to know that these sorts of things don’t get shipped overnight via Federal Express, especially in Russia.

However, lost mojo is not, IMHO, a credible explanation for what is going on here. Coming up soon are a raft of trivial questions that promise to raise most people’s scores above the level we would have had by forecasting 50% all the time everywhere. Companies to whom Dr. Tetlock’s team are pitching proposals might just look at the surface conduct of this site.

Another possibility is that I have reason to believe that the codebase for UBS, and presumably future customers, is totally different from this game. This comparison could be used to make the pitch that Dr. Tetlock has a secret sauce that only he will deliver, via the correct user interface.

A third possibility, and none of these possibilities are exclusive, is that this game is designed to further winnow the superforecasters [TM] in order to make job offers for the banksters, er, customers. You can read about Dr. Tetlock’s first commercial customer that I know of in the book The Fall of the UBS: The Reasons Behind the Decline of the Union Bank of Switzerland. So could this game be a Hail Mary to sort out the ultras among the supers to save UBS’s rear end?

Could this be why IARPA is about to release a Broad Agency Announcement for its CREATE program? I have reason to believe that the Good Judgment Lab has ruled out going it alone, and still is looking for a prime contractor to take them on.

@cmeinel My gut feeling is that they simply don’t have access (right now) to the resources at IARPA that produced high quality questions and high validity resolutions for GJP (whoever was doing those jobs had effectively infinite resources at their command). I wouldn’t be surprised if they’re trying to recreate that relationship, given how well it worked.

Note to everyone: out of respect for CIA folks who risk their lives abroad: they are not “agents” as described in the header for the Geopolitical Challenge. They are operatives. There is nothing like having a pair of them and a jeep show up within minutes of a mob trying to burn down your hotel to impress a person with what they do. I feel that I owe them big time. The ones who work in an office analyzing information are analysts. They save lives, too, and they always have depended in part on information provided by people who sift through open source info. See the CIA’s Open Source Center, for example. The more we could do for them via open source, the more time these people have to do what is essential. Contact me via Twitter @cmeinel if you believe you might have something to contribute to CREATE.



@cmeinel, some items:

1. Nice Wolf of Wall Street clip! Full disclosure and boast: One of the characters portrayed in that movie, a shoe salesman who did well, had his daughter in my son’s fancy private school kindergarten a few years ago when I was trying to keep up with the Manhattan uppercrust. The cagey old crook never said Hi next to the cubbies when he was dropping his daughter off. His wife, his former secretary, was marginally more friendly.

2. I think CIA prefers the term “officer” for their employees, and “agent” for anybody taking the bait, which may have included Saddam Hussein [1] and Osama (or “Usama” depending on if you work for the Gummint) BL. [2] I don’t know about the term “operative”, which could apply well to contractors, of which there are many, such as the choir boys at Academi (formerly Xe Services (formerly Blackwater)). [6]

3. The jeep scene. Yes, I saw that scene in Season 4 (of Homeland, not GJP) Episode 1. [3] Although caveats. [4][5]

@jeremylichtman, one item: It is not clear that in Season 4 (of GJP), IARPA had any direct input into the questions or the resolution. They could simply have been observers. Since the whole process was extremely public (participants from many countries with relatively uncontrolled/unverified identities), it makes sense to have the working spooks out of the loop. Otherwise you are compromising a mental map of US concerns to pretty much everybody. That especially holds for GJOpen: My blog, for example, has readers from 21 countries including 2 with moons on their flags (Pakistan and Turkey), and several behind the Iron Curtain (Estonia, Romania, and Bulgaria). (Note to everybody: Start a WordPress blog and hook it to your Blog button. It’s easy!) So my claim is that one single very clever, well read and highly educated person made up the questions and did all the judging. Maybe someone from GJP can enlighten us on this point. @kmcochran or @GJDrew, care to shed some light?


@000 Yeah, I would like to know who was doing questions and answers for Season 4.

@000 I’m not talking about a jeep scene from some TV show. I’m talking real life, example, personal experience, New Delhi, Nov. 1966. There are thousands of U.S. citizens who can thank these courageous operatives for their rescues, decade after decade.
@jeremylichtman: In GJP 1 and 2, IARPA posed the questions and vetted the resolutions. I know this both from reading GJP literature, news stories and interviews with key players. GJP3,4 provided questions based upon a team they recruited, as publicized in their blog. I know that IARPA had final say on question resolution because I was in an experimental cohort where we received alerts that they were waiting to either score or reopen a question on the basis of feedback from IARPA.

@cmeinel I was consistently amazed at their choice of question options (as in: how did they know that ‘x’ and ‘y’ would be the relevant numbers), and in how they were able to close some of them.



Excluding South Stream which would have been closed as “Yes” last year based on the standards used to close this question.

@000 and don’t forget my favorite, Russians in Ukraine…

Iran to Drop Lawsuit against Russia before S-300 Missile Supply: Report

The implication of this article is that the system has still not been delivered.

The resolution sure looks very premature now:
“The deliveries of Russian S-300PMU-2 air defense systems to Iran will begin in January, a source in the military and technical cooperation system told TASS on Friday.

“It is planned to begin the process of delivery of the first regiment of the S-300PMU-2 air defense systems in January and to complete it in February. Iran is due to receive the second regiment of these systems in August or September 2016,” the source said.

“Russia will thus fulfill its obligations to supply the S-300PMU-2 air defense systems to Iran,” he added.

According to the source, about 80 Iranian specialists will be trained to use the S-300 missile systems (NATO reporting name SA-10 Grumble) at the Mozhaisky Military Space Academy.

“About 80 military specialists from Iran in January, 2016 will begin a training to use S-300 at the training centers of the Mozhaisky Academy,” the source said. “The training program will be four months long. Its cost is a part of the contract. After the course, supposedly in May, the Iranians will return home.

In September the parties signed an additional agreement to the contract on the S-300 air defense systems, the source recalled. The first batch of the S-300PMU-2 systems under the contract — one regimental set — was immediately sent to the Kapustin Yar range. Currently these systems are completing tests there to confirm their stated specifications. Then they will arrive at the port of shipment in the Russian part of the Caspian Sea from where they will be delivered to Iran by maritime transport,” the source said.”

Happy Festivus, fellow forecasters!

I wonder if GJP has a process in place for addressing mistaken closures?

Looks like I was right about maritime transport – makes sense to float equipment down-river to the Caspian, and then load it on barges and deliver directly to N. Iran. The other option would be a fleet of Antonovs, of course.



@GJDrew, @kmcochran, @WarrenHatch, cleanup on Aisle 7!

Interesting juxtaposition between training and Kapustin Yar:

Another article citing the same TASS information. This article from Israel National News even mentions how Iran’s Ambassador to Russia Mehdi Sanai “claimed” the first delivery had already come in.

My opinion is that Mehdi is a liar and simply trying to save face. Mehdi promised delivery by “the end of the year” but the fact of the matter is the Iranian lawsuit against Russia is still active due to non-delivery.

I wish I had emailed GJ back when we found out that the other superforecaster platform had closed the question. I was thinking about asking them to keep the question active until January 1st, so the resolution could be debated/sorted out later without voiding the question.

As it is now, if/when we get reports in Jan/Feb that the S-300 has been delivered and the Iranian lawsuit against Russia dropped, I don’t see any other recourse but to void the question.

There is so much back and forth with this question, I think it’s reasonable for all involved to admit that the delivery can’t be credibly confirmed within the stated time frame.



@000 Well…It looks like the S-300 were actually delivered today. I’m glad the controversy with this question can finally come to a close.

Here’s a photo of an S-300 missile being paraded to the citizens of Iran, just in time for Christmas:



Alas, I call fake, for the simple reason that those ladies would be immediately arrested in Tehran for un-Islamic garb. Although their heads are properly covered, their hair is not secured. Apparently hair is an erotic signifier over there, so it has to be battened down to maintain the public order.

You better apply some Antarctica to that burn!

This question is infinite.

NOW, they are scheduling delivery for February.

This from TASS–that dependable source this question seemed to rely upon to “close” prior.



It’s the gift that keeps on giving. At least this way, GJ will never run out of questions.

That the resolution is problematic does not mean it is incorrect. I’m not saying that it is correct; only that we don’t know. I’ve assumed from the beginning that all information on this topic was subject to maskirovka. I don’t see any reason to drop that assumption now. Moreover, we are simply not privy to any evidence that would be determinative. For example, who amongst us can say that the prior claim to “operator training” was made false by the current (more detailed) claim? Can’t they both be true? I am comfortable with the idea that something was delivered, and the “contract” has morphed into a series of related contracts. I wish you all expressed such a passion for accuracy on the OPEC question, where we have all the necessary evidence, and the proof of biased criteria substitution is staring you right in the face.

@redacted, Bravo! And what else is staring us in the face? The loss of an Amazon gift card? Why do we do this in the first place? What are we testing in Beta GJI? I don’t want to go John le Carre on anyone, but I know the basis for my forecasts. The hours spent reading and translating the open literature online were not wasted. Moreover, it gave me greater insight into the geopolitical pressure for creating the “correct” answer to this question. One take-home lesson: the Fog-of-War extends even to the analysis of who has actually won or lost in the available media; tactical loss, but strategic victory? There is much invested in perceived “correct” outcomes. Good fortune to you!

“Schrödinger’s Forecaster”… 🙂

@redacted makes a good point re: OPEC. The decision to not make a decision has the clear effect of making a decision to lift all quotas. OTOH, they did not say in so many words that they were lifting all quotas. To be fair, the question ought to be voided, just as the Nicaraguan Canal last year was voided.

@GeneH the “Schrödinger’s Forecaster”… 🙂 pretty well sums up redacted’s point, that we have no evidence other than conflicting and vague news stories. At least when IARPA was making the final decisions, we could have some confidence that they were making a genuine effort at determining the truth.

The only reason I’m in this game is for research in the hopes of winding up on a winning team for IARPA’s CREATE program. You folks participating in this thread are examples of high quality research and analysis, delightfully illuminating for me, thank you. Too bad the admins are treating us so poorly, in particular making errors that preferentially harm the scores of the best forecasters.

Even if I’m not on a winning team, I intend to volunteer for IARPA’s next forecasting games. If the admins continue to permit the use of “@”, I’ll let you folks know via this game when some competent games are ready to run.



The clearest evidence of pressure for an outcome was when they paused FDLR last year in anticipation of a weekend State Department press release declaring victory. So this venue is definitely subject to manipulation.

@cmeinel: So as to not hijack this thread anymore than I already have by interjecting the OPEC question (apologies), I refer you instead to my first December 17th comment on the more appropriate thread, where I describe the specific criteria substitution in detail.



We understand, you lost your heart and your innocence at the OPEC question. Such trauma is hard to walk away from.

@000: It’s true. I did.

@redacted, @000, thank you both in this thread and in the OPEC thread for clarifying issues and proposing solutions. The program manager for ACE/GJP, Dr. Steve Rieber, reports having received many complaints about unfairness issues during the GJP seasons. According to refereed scientific papers written about the four GJP seasons, morale may well have been low, as reflected in the average number of forecasts per question being <2. Lars (000), in his blog, has shown that activity in GJOpen has continued this trend of extremely low participation.

On the plus side, according to IARPA program manager Dr. Rieber’s briefing to proposers, the next four years of forecasting games will include a criterion of providing a pleasant experience for participants.

So, folks, please share more ideas on how these sorts of situations could be avoided or repaired, and what would make these sorts of prediction poll games more pleasant, more motivating?



@redacted, as an antidote to your distress, I suggest a cover-to-cover close reading of Bridget Nolan’s thesis:

@cmeinel, we like to gossip and watch each other’s action, so for a “pleasant experience” in this context, it would probably suffice to do a few things:

1. Add their best training modules to the “training” section which is currently a decoration.

2. Add the ability for all forecasters to download in some convenient form (base case is Excel spreadsheet .CSV format) all of the data on the site (questions, users, forecasts, comments). This ability is currently restricted to those with some Web programming skills.

3. Add the ability for users to create discussion forum topics outside of questions (we had this ability in the Inkling Markets interface last year).

4. Maybe add some more facetime components like Google Hangouts, and, what Wikipedians do frequently, which is establish local groups that meet in person to discuss. You could extend that to an annual meeting at a vacation destination somewhere, I’m sure there’d be some takers.

@000, yes, and I’ve asked for the same thing, since I have invested both time and treasure into the effort of pulling all the data, as I can see you have, too. The site isn’t good enough to even help me to find out what I’ve done, let alone what anyone else has done. With the site being open to anyone, and no teams/groups established, and anyone able to generate tens or hundreds of accounts for whatever purpose they choose, I feel like I don’t have any choice.

There. I’m out of the closet. I’m a GJP scraper.

Comment deleted on Dec 31, 2015 09:52PM UTC

I get that they have to try to judge these questions. It would, I think, make more sense to leave these questions open until the end of the period (unless the resolution is clear, which it never was, in this case), and then, if the answer is unclear (within a good margin), to leave it unjudged until it is clear. If the season ends and the answer is still unclear, then the question can be voided.
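The rule proposed above can be sketched as a small decision function. This is only an illustration of the suggestion in this comment, not how GJ Open actually resolves questions; the status names and the three boolean inputs are hypothetical.

```python
from enum import Enum


class Status(Enum):
    OPEN = "open"          # still accepting forecasts
    RESOLVED = "resolved"  # outcome is clear, question can be scored
    UNJUDGED = "unjudged"  # period over, outcome still unclear
    VOIDED = "voided"      # season over, outcome never became clear


def next_status(period_ended: bool, outcome_clear: bool, season_ended: bool) -> Status:
    """Apply the proposed rule to a single question."""
    if outcome_clear:
        # A clear outcome resolves the question at any point.
        return Status.RESOLVED
    if not period_ended:
        # Unclear and still inside the question period: keep it open.
        return Status.OPEN
    if season_ended:
        # Unclear after the season is over: void rather than guess.
        return Status.VOIDED
    # Unclear after the period but before season end: wait.
    return Status.UNJUDGED
```

Under this sketch, a question whose outcome is still disputed after its closing date would sit as unjudged rather than being scored on the first news report.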



Has anyone else ever been on a comment thread and noticed that it was as if they were a different user?

For the record, right now, I’m seeing the page as if I were @redacted (but I’m @einsteinjs). I’m typing this comment and I’m not sure who it will show up as when I hit submit.

EDIT: Ah, ok. So, it showed up as “einsteinjs,” but if I scroll up the page, it shows me the “edit” function for @redacted’s comments. Of course, I won’t change them, but I guess that’s part of the reason we’re still in beta.

EDIT #2: But when I refresh the page AFTER I’ve made a comment, I’m not able to edit @redacted’s comments (thankfully). This isn’t the first time I’ve seen this, either. Anyone else come across this?



Ah, I see you have reached Level 2 of the game.

No einsteinjs, but I alerted the admins some time ago to something they probably didn’t intend (ability to change different parts of another user’s account).

@einsteinjs I’ve reported that bug, too.

Ahem. At the risk of sounding paranoid, I do hope the bug selects other participants as well. As far as I can tell, my comments have not been abused or vandalized. An honorable and virtuous crowd. Although I must say, it is great comfort to now always have the future ability to repudiate and deny my own comments.



@cmeinel, Congratulations, you have reached Level 2 of the game.

@000 — except it’s not replicable, so it doesn’t *feel* like I’ve reached another level. Hmmm, is that another ‘level’ TO the level?

@000: Thanks for the link to the Nolan thesis. I quite enjoyed it. It provided both a number of useful insights and sources of frustration. The phenomenon of the “Social Construct of Precision” was interesting.
@000 & @cmeinel: Just a quick story on Training. In general, training might be segregated into “Scoring” and “Prediction.” In terms of Scoring, @morrell and @Khalid (I think) have offered some handy tips. In terms of Prediction, I am reminded of a phrase (from Blackstone?) about the “infinite variety of human creativity” or the like. Perhaps with the exception of the probabilistic approach apparently outlined in the Tetlock book (which I will eventually buy), and the causal concepts of “system decomposition” (Newell/Simon) and “nearly decomposable systems” (B. Russell), I’m not sure how one could present training that is suitable to all problem cases. I was once charged with supervising a freelancer of sorts as he advised participants in a program for which I had overall responsibility. He had come highly recommended, had an impressive pedigree (Bell Labs), the participants were pleased, and I could see the project was moving forward most satisfactorily. I must have engaged him in three or four different conversations in an attempt to find out the method of his success, and I must say that these conversations were some of the most curious that I have ever had. As I recall, his general message was that he could not provide to me a general methodology, but only “a bag of tools.” You can’t imagine my dismay. Here I was, paying this advisor major sums of money, and all he could tell me was that he had some “bag of tools!” It has taken me years to see how right he was. You can’t have a methodology of any specificity for an infinite variety of problems. The real trick is in diagnosing the problem perhaps. This, along with knowledge of your toolset, will together provide the solution.



Oh no, they had a nice 20 page PowerPoint with a drill on how to think with base rates and whatnot wrapped up in a nice acronym like CHEESY SPEEKS or SPICEY NICE. Alas, I have forgotten the mnemonics. They swore the drill was worth 20 points of accuracy though. I have suffered the consequences of not remembering it.

@einsteinjs – yes, I’ve had that problem with being another user too. At least once I was you, though apparently now I am @redacted. Which means, some strong possibility that it’s related to the @ mentions. I got here via a link in an email notification of a mention, so… Will go report it now.

Haven’t been able to repeat it though, since I added this comment. So now I wonder if it only happens when going to a comment thread that you have not as yet participated in, other than being mentioned.

In the end, despite all the broad brush presented in the book, no one knows for sure what makes a “superforecaster”. They come in all varieties and I am convinced that everyone succeeds according to his own “bag of tools”. FWIW, I never could get anything useful out of the “CHAMPS KNOW” and to this day I consider it quite silly (that’s putting it mildly).

I have not noticed any of my comments edited by friendly hackers / scrapers, but would you quit hiding my menu bar when you are done on my home computer? Sheesh! You may be Anonymous but practice tradecraft! 😉

Lots to comment about here! I second the suggestions given by @000 especially
2. Add the ability for all forecasters to download in some convenient form (base case is Excel spreadsheet .CSV format) all of the data on the site (questions, users, forecasts, comments).
3. Add the ability for users to create discussion forum topics outside of questions (we had this ability last year).

I’m not holding my breath for these, though, because the “graphs and stats” still don’t superimpose our own forecast history on the overall trends, despite being told quite some time ago that GJI is working on it. This was already available in the original GJP. In some cases we’ve gained a snazzier interface but lost functionality.

I was in teams for both my GJP years, and possibly the best part was the ability to have general discussions. In the present format, discussions such as the one on this thread tend to get lost very quickly. However, discussions here are much more lively and responses are almost guaranteed, unlike in the small team format.
What I like about this open format is that I get to read comments from everyone instead of just perhaps half a dozen team-mates. Similarly, the “consensus” is arrived at from hundreds of forecasts, and is therefore likely to be more accurate. I expect the overall accuracy on this site will be better (even though current Brier Scores are terrible).
The biggest problem for me on this site is that I’m spending (“wasting”, at the cost of my real life) too much time reading as many comments as I can. When time is short, I end up reading only those I’m following but that’s still more than 80 people. It’s going to be hard to keep up. Are we ever going to emerge from this Beta phase?
As to why I’m doing this, I’ve been treating it as an educational game, and it’s an addiction.
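One reason to expect a consensus of many forecasts to hold up well can be shown with a few lines of arithmetic. The sketch below assumes the consensus is a simple mean of individual probabilities (the site may well compute it differently), uses the classic 0-to-2 Brier scale, and the five forecasts are made up for illustration.

```python
from statistics import mean


def brier(p_yes: float, outcome_yes: bool) -> float:
    """Classic Brier score for a binary question (0 is perfect, 2 is worst)."""
    o = 1.0 if outcome_yes else 0.0
    return 2.0 * (p_yes - o) ** 2


def consensus_vs_individuals(forecasts, outcome_yes):
    """Return (score of the mean forecast, mean of the individual scores)."""
    consensus = mean(forecasts)
    average_individual = mean(brier(p, outcome_yes) for p in forecasts)
    return brier(consensus, outcome_yes), average_individual


if __name__ == "__main__":
    # Hypothetical forecasts on a question that resolves "yes".
    made_up_forecasts = [0.9, 0.6, 0.3, 0.8, 0.4]
    print(consensus_vs_individuals(made_up_forecasts, True))
```

With these made-up numbers the consensus scores about 0.32 against about 0.42 for the average individual. Because squared error is convex, the mean forecast can never score worse than the average of the individual scores on a given question, though it can still lose to the best individuals.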

Last season, from December/January, I was isolated in a group of 5, and no one talked; one made 3 forecasts and then disappeared. I could see no other comments but my own, and the final score included some interaction and upvote-type analysis for which I (my group of dead people) had none. It was a test of will and sheer determination to complete the season and rank our group up to 5th place, final Brier .331. I am here to see what a super looks, thinks and acts like. 🙂

It was obviously no picnic being a guinea pig in isolation! I thought one of the reasons for GJP’s success was the team format, as a result of which small groups put together at random were able to work together to collectively improve the team score. It’s true that there are quite a few inactive members – in my first year only 5 out of 15 contributed regularly, though the second year was better, with about 7 out of 10. That’s probably why our team in the second year was ranked #2 out of 50 in its group, compared to #15 in the previous year. Being able to discuss rationales and adjust accordingly is a big advantage.



@Dima-K, thanks for reminding me about CHAMPS KNOW. I kept thinking SPICEY NICEY and DONKEY WONKEY and now I remember it, I can Google for it and dig it up. @GJDrew, @WarrenHatch, it’s a bit of a mystery why these training modules are not in your “Training” section even though they were made available to your competitors back in the day (the Internets never forget):

@000 OK, here’s the “cheesy mnemonic.” I found it to be useful in GJP, back when we were playing a fair game:

Comparison classes. Provide relevant classes and calculate base rates.
Hunt for info. Share the evidence that helps inform your prediction.
Adjust. Explain the reason for updating your prediction.
Models. Explain any application of mathematical or statistical models.
Post-mortem. Interpret past failures and successes; draw implications for the future.
Select effort. Describe the optimal level of effort.
Know the power-players. Describe individuals with influence and how will they use it.
Norms & protocols. Describe laws, rules, scheduled meetings, etc., that matter.
Other perspectives. Describe bottom-up processes (protest, culture, market forces, technology, etc.) that may play a role.
Wildcards. Explain the sources and magnitude of uncertainty.

WHEW! I kept coming up with “DRINK PEPSI®” and “MARY JANE”, but I knew those couldn’t be right.

@ravel @kahneman Please don’t take that “superforecaster [TM]” label too seriously. As best as I can tell, reading the published accounts of GJP in refereed publications, the two primary factors in winning this designation were putting in time and getting some sort of feedback from others who put in time. All the rest, IMHO, is ex post facto analysis. I recall that it was after I’d been in the top 1% for some time that the Powers That Be queried me about openmindedness etc. How do we know the openmindedness was a cause of success, a result of striving for success and getting Brier score feedback, or both? I’m hoping the next four years of planned research under CREATE could help illuminate this.

Are any of you wondering why I added @kahneman here? Somebody signed up under the name Daniel Kahneman. Is that perhaps the Daniel Kahneman of Nobel Prize fame? He has written eloquently about his struggles as a young psychologist with the false efficacy syndrome. That’s what has me wondering about the superforecaster designation. Yeah, I got one, so how fun to think I’m special. What if I’m not? What if anyone with similar education and intelligence, of whom there must be many millions, could do as well or better under the right conditions? What if a billion people, or more, have this potential? Inquiring minds want to know.

P.S. Here’s the location of the latest report on the last two seasons of GJP:



Hunhh. I wanted to upvote @Heffalump, but apparently I am Heffalump now. I feel….strange…..

@000 last night, one of my comments had the “upvoted” button on it, suggesting that I had upvoted myself, but I can’t seem to make a habit of it, yet…time to play with cookies, and maybe some well-placed semicolons…

@cmeinel, well, if you look at the first Tetlock training video under Training, you will see that he begins his series by thanking Daniel Kahneman for all their collaboration, and Daniel Kahneman introduces himself as a psychologist, as they go around the table, along with Robert Axelrod and others, several of whom are “players” here, facilitating and doing what psychologists and analysts do, probe the rats. 🙂

@Heffalump Re: this long reported, still unfixed bug whereby we could screw up other people’s comments: consider databases of SQL injection attacks, for purposes of beta testing only, of course.

@ravel We’re rats in a cage?

Regarding this superforecasting business, here’s the final report by the top competitor to GJP, SciCast Predict:

Bottom line: in their final year, using the market prediction approach, they were achieving results comparable to GJP4 but without this business of identifying superduperforecasters and putting them/us into special groups to share info, analyses.

I think it’s bad enough that it’s an open secret that several/many of us scrape the site with regularity. I’m not going to do basic script-kiddie/JS/SQL hacking without some sort of mandate.



I don’t know about folks who don’t make any forecasts at all, I think they should be given some kind of hidden Peanut Gallery registration if they just want to watch.

Also I would time out anybody who registered and never forecast after a few weeks, because we don’t have 14,000 active punters, we have about 70.

On the other hand that establishes that with 70 or fewer forecasters you can still build a well-functioning analyst team.

@Heffalump Early on I asked @GJDrew for permission to simply run nmap against the plethora of servers that contribute to the pages built on the fly that we see here, and he asked me to not do so. Perhaps he will read this thread and consider how many people have complained for so long about this particular, apparently SQL-based bug, that he would permit us to diagnose the problem for free.

@000, you mean like Daniel Kahneman?
or user number 7, Katie Cochran?
or dataguy42 Phillip Rescober at user number 6?
or “an Industrial-Organizational Psychology researcher and with applied and practical experiences in continuous improvement in organizations” Abel Gallardo Olcay, “all the better to study your psyche, my dear” at user number 8, also no scores, predictions, upvotes:
or perhaps UofIballer Christopher Marshall, user number 9?
The whole world is a lab rat maze and we are the runners. And if you quit, sit home and watch TV on the couch, they are peeping at you through that little light adjustment gauge thingee and monitoring your cable usage and your refrigerator tells them how many beers or sodas you drank, and your power meter tells them which room of the house you are in and your smoke detector sends radiation beams into your brain making you sad, so you hurry back to GJP Open to feel closely monitored, nurtured and able to fellowship and commiserate with the other jacked up, hacker scraper super dooper crackerjack rats. Yeah, maybe they have been put in time out, or maybe they just walk around with clipboard and measure your pulse rate when you found out you were Heffalump 🙂

@kahneman @kmcochran @dataguy42 @Abel @UofIballer

@ravel The thing is, smart rats get to have all of the fun. Just act a little bit unpredictable, and cover your smirk with your hand, so they don’t see it.

@ravel: On one of the comments on here, you were asking about what a “super” looks/talks/speaks like. I thought you might find this post from Farnam Street useful:

He distills the “Ten Commandments for Aspiring Superforecasters.” I think that this *kind of* gets to the question you were asking.

@000: “with 70 or fewer forecasters you can still build a well-functioning analyst team.” Well, now that that’s sorted out, let’s move on to POLICY formation, shall we?

@everyone in this thread, I haven’t laughed so hard in a month of Tuesdays.

Comment deleted on Dec 31, 2015 09:53PM UTC

@000, Hypermind has this rule that kicks out anyone who doesn’t make a prediction for more than two weeks. I don’t think they’ve actually implemented that yet, though. They also threaten to periodically drop the 15% worst performing participants!

Only 70 out of 14,000? That’s worse than SciCast, where my guess was that less than 50 were really active, though nearly 11,000 are supposed to have signed up. Inkling’s Public Market was better though. I think there were well over a thousand active out of 15,000 registered.



@cmeinel, regarding SciCast, it reminds me that GJP published their final picture, which is also in their final report that you shared, and which, filtering three times, “proves” that forecasting is better than betting. However, those three filters (training, teaming, tools) were not consistently applied in the market condition. So the picture “proving” forecasting better than markets is not entirely well-founded, because the same items were not uniformly tested in both formats. Nevertheless, I agree that the forecast format has pragmatic benefits.



@cmeinel, @Heffalump, they can pragmatically shut off scraping any time they want. If you ask, they will say they discourage it. Discourage and prevent/admonish are two different words. It is hard to believe that they made the site as orderly and porous as it is with no intention of letting people scrape. Scraping is legal until people actively tell you not to and take measures to stop you. Any Windows 10 purchaser will have to acknowledge that our whole lives are being scraped constantly by Silicon Valley and the resulting data is at the core of the fortunes of the Googles and Amazons of this world. So let’s not feel too bad about scraping where scraping is being de facto permitted. Let them say No absolutely and then we can just walk away from this game, because without the scraping and without the IARPA connection, there’s not really that much to do except throw darts in favor of “Challenges” without prizes that nobody is watching. I’m just saying. Be yourself.

So once again, as a reminder, here is the API:

1. All users are serially indexed, look at the URL
2. There is a “link” button for every comment. All comments are serially indexed. Every forecast is contained in a comment. Look at the URL.
3. All questions are serially indexed, click on a question, look at the URL.

So there you have all users, all forecasts, all comments. There for the taking. A little elbow grease is required. There are courses for that, you just have to commit to putting your own time into getting the skills:

@Khalid It looks to me like SciCast had considerably more activity in its final year, thanks to substantial cash prizes. OTOH, this also sparked a major cheating problem. On the third hand, they developed a decent way of closing and invalidating questions. IMHO, the Scicast/George Mason University team has a good chance of winning a CREATE contract, and likely will be using an Inkling type market platform. You can read their report for the 2014-2015 season here:

I give them huge smiley face points for being much more open about their research than certain competitors.

@000 Two more easily scrapeable items.

To get everyone’s “accuracy scores” for every challenge, scrape through this index:
Substitute user number for the first “1” and challenge number for the second “1” which currently comprises the set of {1,2,3,5}

If a player has entered less than all current challenges, but more than zero, you can get the scores for unentered challenges as well by this technique.

Next fun scrape target: the source code for each comments page has a finer grained time stamp than what is on the browser-displayed page, even if no time and date are visible on the page, for example, from the above noted
<span data-localizable-timestamp="2015-12-29T16:13:20Z">Dec 29, 2015 04:13PM UTC</span>

Fine grained time stamps can help us detect bots, for example.
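
For anyone who wants to try this on page source they have already saved, here is a minimal Python sketch. It assumes the attribute looks exactly like the span quoted above; `extract_timestamps` and `gaps_in_seconds` are names made up for this illustration, and the notion that suspiciously regular gaps hint at a bot is only an assumption, not a tested detector.

```python
import re
from datetime import datetime, timezone

# Matches the attribute quoted above. Straight and curly quotes are both
# accepted, because copied page source sometimes carries the curly kind.
TS_PATTERN = re.compile(r'data-localizable-timestamp=["“”]([^"“”]+)["“”]')

def extract_timestamps(html):
    """Return timezone-aware UTC datetimes found in the page source."""
    stamps = []
    for raw in TS_PATTERN.findall(html):
        # The trailing "Z" means UTC, so attach that zone explicitly.
        parsed = datetime.strptime(raw, "%Y-%m-%dT%H:%M:%SZ")
        stamps.append(parsed.replace(tzinfo=timezone.utc))
    return stamps

def gaps_in_seconds(stamps):
    """Seconds between consecutive timestamps, oldest first."""
    ordered = sorted(stamps)
    return [(b - a).total_seconds() for a, b in zip(ordered, ordered[1:])]

sample = ('<span data-localizable-timestamp="2015-12-29T16:13:20Z">'
          'Dec 29, 2015 04:13PM UTC</span>')
print(extract_timestamps(sample))
```

Feeding one user's timestamps to `gaps_in_seconds` gives the spacing between their comments, which is the raw material for any regularity check.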

I see someone, as me, upvoted Lars’s (@000) Amazons and Googles comment before I got here. Thank you. 🙂



@ravel, you have reached Level 3.

Best. Post. Ever. You win.

@khalid email me for the user data from yesterday. Who knows, if you contribute to the analysis, maybe you’ll get a copy regularly

@000 no, they can TRY. And I can’t stand semicolon-terminated-line languages. A semicolon is a precious thing that should be cherished, loved, and used sparingly, just like the Oxford comma (see what I did, there?)

All of us who are scraping should really just get together and share so we can cut the traffic load down.

So is this question going to be reconsidered, or are we all just bragging for the sake of being booted so we can go get some work done instead of sniffing the cheese?



@Heffalump: I hear you; but I do have concerns about sharing code with an unknown Sir or Madam from an unknown country, whose purpose is to extract information from a website which is obviously a venture of Air America:
As you can see:
It feels a bit dicey.

If you can assure me that you have 4 beefy Russian pimps in an old Mercedes with a roomy trunk who can whisk me to guaranteed safety should a misunderstanding arise, then you will have refined my thinking.

I believe, without a shred of evidence, that the Oct 10 and following Iranian missile tests that get them in a wrist-slap with the UN, where the un-named US intelligence source says there was indeed an ICBM rocket launch, and “other” launches, that the Russians had already delivered an S-300/S-400 battery, that when they launched rockets from their Caspian sea naval vessels and undisclosed sources claimed one or more of those missiles crash landed in Iran, which Iran denied, and which dropped off the screen, that that was the testing of the new delivery, no Fed-X required since 20+ cargo plane loads from Iran to Syria were being delivered daily and the S-300/S-400 delivery could have been brought from theater, from naval group, from Russia, from Eastern Territory, all of which had the same, plus Arctic in recent days, i.e. the period of this question. Therefore the paperwork, lawsuit, factory order, retrofit jumbled mess was subterfuge, the missiles are in place, along with other batteries Iran has bought over the years from eastern bloc sources, and to think otherwise would be to fly, launch, attack, invade to one’s own peril. This will never be substantiated, the question should resolve as NO, my score will take the hit, but I would not overfly sensitive areas in Iran without the Israeli intelligence counter S-300/ S-400 tactics memorized. Best wishes 🙂

@000: If you are referring to “if they try”, then yes, agreed. Advanced is for Advanced. However, for the way it is now, a) any script kiddie can do it and b) we can collectively make our activity less annoying if instead of each of us doing it individually, and therefore redundantly/simultaneously, we limit that load to one of us and then share. So tonight is my night, tomorrow is yours, etc., or we could be hitting different parts of the db (e.g. I do users 1..7000, you do 7001..14000, and Sally, over there does 14001..14239 (as of a moment ago)). With 77,690 comments, it would also potentially save them from the attack of the n00bz since several of us already have pulled all of it…
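
The split described above is just a partition of an ID range into contiguous chunks. A small Python sketch, where `split_range` is a made-up helper and the only real number is the user count quoted in the comment:

```python
def split_range(first, last, parts):
    """Split the inclusive range first..last into contiguous chunks.

    Chunk sizes differ by at most one; empty chunks are dropped when
    there are more parts than IDs.
    """
    total = last - first + 1
    base, extra = divmod(total, parts)
    chunks = []
    start = first
    for i in range(parts):
        size = base + (1 if i < extra else 0)
        if size == 0:
            continue
        chunks.append((start, start + size - 1))
        start += size
    return chunks

# Three people sharing users 1..14239, as in the example above.
for person, (lo, hi) in enumerate(split_range(1, 14239, 3), start=1):
    print(f"person {person}: users {lo}..{hi}")
```

An even split is the simplest choice; the 7000/7000/239 division in the comment works just as well if someone only has a little time.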



@ravel, great comment, missiles could well be there now. This just highlights the absurdity of doing military intelligence at home for free without access to anything but press releases from interested parties.

@Heffalump, I love you, but you are a completely and carefully anonymous stranger living I don’t know where, who is asking me to cooperate over the Internet on an activity which could be misinterpreted. Please just save your comment html scrapes to disk. Then you don’t have to redownload all 77000 every night, and I won’t have to worry about meeting Saul and Carrie in an overlit room. With caching you will only have to download a few hundred forecasts a night.
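
The save-to-disk idea can be sketched in a few lines of Python. Everything here is an assumption for illustration: `cached_fetch` is a made-up name, the one-file-per-comment layout is arbitrary, and `fetch` stands for whatever download function you already have (none is supplied, and no site URL is assumed).

```python
import tempfile
from pathlib import Path

def cached_fetch(comment_id, fetch, cache_dir="comment_cache"):
    """Return the HTML for comment_id, calling fetch only on a cache miss."""
    folder = Path(cache_dir)
    folder.mkdir(parents=True, exist_ok=True)
    path = folder / f"{comment_id}.html"
    if path.exists():
        # Already downloaded on an earlier run: read it back from disk.
        return path.read_text(encoding="utf-8")
    html = fetch(comment_id)
    path.write_text(html, encoding="utf-8")
    return html

# Demo with a stand-in fetcher that records how often it is called.
calls = []

def fake_fetch(comment_id):
    calls.append(comment_id)
    return f"<p>comment {comment_id}</p>"

with tempfile.TemporaryDirectory() as tmp:
    first = cached_fetch(42, fake_fetch, tmp)
    second = cached_fetch(42, fake_fetch, tmp)

print(len(calls))  # the stand-in ran once; the second call was served from disk
```

This only suits items that never change once posted, such as old comments; anything that is edited or rescored would need a freshness check on top.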

If you think being a stranger is not a concern, note that my GJ connected blog has attracted readers from 100 countries including most in scope for GJOpen military intelligence questions. It’s all a lot of fun but I often worry about crossing some invisible line that I didn’t see painted anywhere.

@000 I think we both know that anyone who knows anything about how to do this isn’t scraping the comments over and over, and I would hope that they aren’t saving the html to disk when they have access to more appropriate tools. It’s the user data that changes, and I haven’t invested the effort into reverse-engineering the algorithms so that I don’t have to do that, yet.
You are more than welcome to be as careful as you choose. I was only offering to split the work, but suit yourself. You, and everyone else, are more than welcome to my data. I was hoping to make the zookeepers a bit less anxious about our activities by reducing the load on the servers, but whatever. I have other things to do, too. Some of them are even interesting – like fingernail avulsions.



@Heffalump, I don’t think the load of the 3.5 people who may be scraping is a material drag on their servers.

Also, far be it from me to deny GJ and Air America the intelligence value of observing who is scraping, just as I don’t deny myself the pleasure of counting the number of visitor flags on my WordPress blog. If we combined efforts that would obscure those results.

It’s interesting that you think so few people could figure it out. Heck, you figured it out, and you’re not even a girl…
Can I have level -1 now? What if I smile?



@Heffalump, it’s not that I think only 3.5 people can figure out how to scrape web data, it’s that I honestly think only 3.5 people actually care about this topic.

@000: Don’t you think that it’s weird, though, that so large a proportion would be from the same cadre?



@Heffalump, you and I are from the same кадровый состав (cadre). @cmeinel is not. That makes 3. @Khalid is not. I’m counting him as 0.25 because he doesn’t program. @mparrault asked for code sharing on my blog. He’s the other 0.25 and he’s not in our кадровый состав. That makes 3.5 people that care, out of 13,000. The servers will survive. We are the only people that care about this, just as there are only a few people in the whole world that care about minimal regularity bounds in a perturbative framework. You may feel a wave of loneliness upon learning this fact. However, unless you are in fact this young lady: [1], you may find it hard to convince yourself or other people that any of this matters.


To me, the fact that the dung beetle actually uses constellations in the night sky to orient its correct direction has more clues than any minimal perturbative framework boundary limitations. There are water molecules perpetually captured in buckyballs and not one scientist to the rescue! The nanoscopic fractal nature of the macro cosmos should make the ripples in space time themselves be adequate breadcrumbs to solution, for those permutations indicate usurpation of order springing from chaos, and true super forecasters can feel this trembling in the ephemeral web as clearly as church bells and air raid sirens! Stop rolling your balls of dung to deconstructed mnemonic thrills, the universe is crying out the answer and it lies yonder!

I love you, @ravel.

@000: I’ll let you in on a little secret: @cmeinel doesn’t scrape. She’s too proud.



@ravel, I saw that dung beetle play in 1972 at the Ipswich High School. “My pile, my pile!” said the beetle.

Nor shall we bow, nor shall we scrape,
But go gently into the shadow of nightfall.
Alors, alors.
Quand meme.

I *can* scrape. But I’ve done way too much of that, and I’m tired of it.



As we scrape on by
Little do we know
What the future portends.

What an interesting thread!

I lost my heart and innocence in the fall of season 3 on the OPCW question (“How many of Assad’s 23 chemical weapons sites will OPCW inspect prior to x date?”), which they closed early, after only 21 were inspected. Since then, judges have repeatedly proven to me their human fallibility, and, at times, suspected “political” influences (evidence: the “little green men” in Ukraine).

The interaction, discussion & debate this year has been incredible, and so engaging that one of the new challenges for me is to balance the time I spend reading the posts of other forecasters with the time I devote to doing my own research.

I am so appreciative of the input of the forecasters in this string, and others, like Jean-Pierre; you’ve made this experience wonderful! I am looking forward to the new year and to more wisely selecting which questions I’ll forecast on. I hope to be able to contribute more constructively by being more focused.

This thread ought to have reached the inboxes of Those Who Matter in GJI by now. Once in the GJP some team mates and I were discussing the project in the team forum or possibly in-team mail when Terry Murray joined in. They probably monitor the most commented threads.

SciCast was certainly more relaxed about its users. They would provide an “API key” to whoever asked for it and tolerated bots that would exploit every chance to make more points. The only time I remember that they drew a line was when some users apparently set up multiple accounts to manipulate the market. However, I still think they had very few active users compared to GJOpen – I was on it practically throughout the period it was functioning and I barely added 5% to my initial stake, yet I ranked between 30 and 50 on the leaderboard. There were about 20 “power users” who accumulated practically all the available earnings. In my case, with GJP training, I was concerned more with predicting correctly than maximizing earnings.
Following @000, I’ve signed up for an online Python course. I’ve completed several MOOCs over the last three years, but none on computing. Let’s see how it goes.



@Khalid, that’s great, now if I can just get a commission from the Coursera people. Note: Their participation rate (students who sign up and then follow through to completion on a course) is actually around 6%.[1][2] (We can’t have a discussion here without some stats!)

Those Who Matter: @GJDrew, @WarrenHatch, @kmcochran, @ben, @Reynard, @Littlefinger, @kahneman: Look at us, look at us! Take a picture!!


It looks like I up voted @anneinak. Oh well, I was going to, anyway. STOP PICKING THE UKRAINE WOUNDS!

@000 Bae will love those pieces for the PhD thesis, so thanks!

@000, my MOOC completion rate is 18/24 = 75% though a lot of those were “soft” subjects. If learning Python feels too much like “work” I’ll drop out. Alternatively, I’ll have to cut down on GJOpen (I’ll probably do so anyway once a bunch of questions close over the New Year).



@Khalid, I’m in the same state. I have MOOCs on numerical methods, terrorisme (in French), bioinformatics and game theory coming up. Keeping busy on these will distract me from my lousy Brier score. Of course if the 4 questions getting scored on Monday tick my score back down below 1, I will probably remain addicted.

Actually I thought terrorismes was on-demand but it looks like I’ve missed the entire session! Will have to re-enroll

There’s a START terrorism course on Coursera I think I’m enrolled in which is upcoming. It’s not a subject that hooks me that much, being rather ill-defined. I started another Dutch terrorism course on Coursera a while ago which was on-demand but I didn’t like the format. If you know French, the lecturer in the above course was more engaging.



@Khalid, I’m looking at the course archive for Terrorismes, luckily I should be able to watch the videos. The annoying ambiguity of the topic is illustrated by the following exchange between a student and the professor for the course:

Q: Algerian War: you talk about the terrorism of the Algerians but you forget the massacres committed by the French!

Awesome discussion, very interesting links.

For the record, I believe the question should be voided, but if not, well, the game’s the game.

Happy New Year, everyone!

@Khalid, I’m so disappointed to hear that you intend to cut back your participation in GJI. I will miss your contributions!

@Anneinak, thank you. I very much doubt I’ll really cut back! However, in the last four months I’ve probably spent more time on this project than any equivalent period in the GJP and that’s hard to sustain.



@Khalid, interesting that you’re more hooked this year, because I found last year more intense.

@000 @khalid: Lars was in the market format, last year, but Khalid, I don’t remember bumping into you. I agree, Lars, the market format is more compelling. The only thing that is keeping me interested, this year, is messing with the scraping and the game-within-a-game. I am definitely not doing hard research like I did or reading nonstop to keep up on topics.

Remember a couple of months ago when I threw down “I’m bored”? I’m still bored.



@Heffalump, yeah I’m super bored with this. It’s filling out the same 88-question survey over and over every day for no reason. No gift card, no spooks, no drama because the interaction format doesn’t support flame wars.

@Heffalump Bored? Me, too. The larger game is the only reason I’m playing. 50% likelihood I’ll abandon one or more players/bots/ringers I may or might not be running within four weeks. Hmmm, that would be a good prediction problem for GJopen: how many bots/ringers will voluntarily shut themselves down as verified by their formal announcements to their followers before Jan. 31? How many involuntarily shut down, as verified by formal announcements by GJopen?

Scicast Predict kept track of these sorts of pseudo players and shut down those who were harming the game while tolerating the others. Does GJopen even know who these are in this game? Hint: you need an industrial grade data scientist, someone with the smarts of @000 or Heff to figure this out.

@cmeinel actually, it’s pretty easy to tell if multiple accounts are from the same user. See:



@jeremylichtman, ha ha, she goes through 15 different hops and tunnels through the Silk Road and you can track her with some cookies and browser specs. Bravo!

Combine that thing with an “ever cookie” (almost impossible to remove), and a database to keep track of separate computers/browsers that coincidentally appear via the same IP….

Basically, there’s no way to hide online. Even users on a shared computer (i.e. internet cafe, public library) can be uniquely fingerprinted.

Happy New Year to all of you! I’ll bust my anonymity for you with this one link:


@Khalid, I totally understand the need to find a way to manage a sustainable level of participation!

@000, @jeremylichtman Have you ever tried to place cookies into the xombrero browser? How about browsing a website via telnet? Muhahaha… The most fun thing about telnet browsing is the ease with which one can insert jokes into the record of the visit. Here’s my tutorial:

That website isn’t as accurate as it makes itself out to be. Using Ubuntu 15.10, which uses the Linux 4.2.0-7.7 kernel, with two different browsers, that site gave two different answers for my kernel: “Linux 4.2.0-22-generic” and “Linux i686”. It also thought each one was unique, so it clearly did not flag them as coming from the same computer. I revisited with xombrero and it did recognize it as having the same fingerprint as it did in the preceding visit.

With one of my browsers — which was not Safari 8.0 — and which does not save any data — xombrero– I got this:

Are you unique?
Yes! (You can be tracked!)
3.89 % of observed browsers are Safari, as yours.

0.51 % of observed browsers are Safari 8.0, as yours.

17.05 % of observed browsers run Linux, as yours.

3.20 % of observed browsers have UTC-7 as their timezone, as yours.

However, your full fingerprint is unique among the 119187 collected so far. Want to know why?

I ran another browser, one that actually stores things — Arora — and got this:

7.08 % of observed browsers are unidentified, as yours.

17.05 % of observed browsers run Linux, as yours.

61.81 % of observed browsers have set “en” as their primary language, as yours.

3.20 % of observed browsers have UTC-7 as their timezone, as yours.

However, your full fingerprint is unique among the 119188 collected so far. Want to know why?

OK, folks, want to guess how many different computers I have, how many browsers I have, and whether I can do shell programming?

I think if you visited something you shouldn’t have using telnet, that probably uniquely identifies the visit as having come from you. 🙂

Same goes for lynx. I still use that occasionally for testing purposes.

Or wget, for that matter.

“Ever cookie” doesn’t use actual browser cookies. No idea if it works via xombrero, but it’s a nasty piece of work, and heavily used by corporate types.

That “am I unique” site is flawed, but it demonstrates the usage of system fonts as a tracking tool. The technique is used elsewhere to good (or bad, depending on who you are) effect.

Happy New Year!



@ravel, thanks for sharing your music and anonymity and Happy New Years!

They were the first of their kind. Lots of things block the original ones, but I’ve experienced some odd behavior of certain company websites that indicate that the lessons were learned.

@jeremylichtman Given that swarms of script kiddies read my harmless hacking tutorials, there are probably lots of telnet browsing incidents all over the place. In addition, how would one tie a telnet jokester to my staid old Chrome browser? Note that each of my browsers, even when run from the same operating system on the same Linux box, gives a different signature. Now if I were to run my plethora of browsers on various boxen through Tor, how would the admins of any web server know they all came from the same mostly harmless hacker? (P.S. I don’t trust Tor. Toooo convenient… but then maybe I’m paranoid.)

This actually is a serious problem that needs a solution for future IARPA/DARPA crowdsource experiments.

@jeremylichtman Where did you get your Lynx? Is it the old ASCII only version?

I’ve got the old ASCII only lynx on a bunch of old servers in my basement. Can’t recall how I installed it on them. Probably something silly like “sudo apt-get install lynx”. I occasionally use it in command-line mode for similar purposes as wget, or to trigger remote scripts, or just for accessibility testing.

I don’t think the Tor folks are deliberately malevolent, but it is known to “leak”, and I worry about folks in certain countries who think it makes them safe.

@jeremylichtman could you please see what happens when your lynx tries to connect to Mine gets instantly aborted. lynx all 2.8.9dev6-3 [3,728 B]

Same problem. It looks like the SSL setup here isn’t compatible with lynx.

When I was working on the original GJP scraper in September or October, I ran into a similar issue. I don’t remember exactly what the problem was, but for some reason TLS version comes to mind. I ended up switching to a different tool to get around it.

I won’t publicly admit to knowing anything about TLS, SSL or anything related.

In my experience, that always leads to being asked to babysit servers.

@jeremylichtman @000 But wget can download pages. Then sort through the directory where they are stored with find . -name '*' -print | xargs grep

Crude alternative to scraping.
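
For those who would rather not pipe find into grep, roughly the same search can be sketched in Python. `grep_tree` is a made-up name, and the search pattern is whatever you supply, since the command above leaves it out.

```python
import re
from pathlib import Path

def grep_tree(root, pattern):
    """Yield (path, line_number, line) for each matching line under root."""
    regex = re.compile(pattern)
    # rglob("*") walks the whole tree, like find . -name '*'.
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file():
            continue
        # errors="replace" keeps odd bytes in saved pages from raising.
        with path.open(encoding="utf-8", errors="replace") as handle:
            for number, line in enumerate(handle, start=1):
                if regex.search(line):
                    yield path, number, line.rstrip("\n")
```

Calling `list(grep_tree(".", "S-300"))` in the directory where wget stored the pages would list every saved line mentioning S-300, with the file and line number attached.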

Yup, more than one way to scrape a website, Lars. But as Heffalump points out, I won’t because I’m too proud, having kindly asked permission to nmap and scrape and was denied so I’ll just wait for stuff to automagically arrive in my inboxen. No SQL injections, either. Not even insertions of jokes into syslog or wherever they store that stuff.

Do I win the kiddie haxor bragfest yet?

I google’d “lynx tls support”. Looks like it has issues with some types of TLS.

This site’s cert is using TLS 1.2 (current). I think lynx isn’t up to date with handling for that yet.


So like, this thread is super-long. *Takes a puff* Do you ever think, like, there’s another thread out there, like, with lots of other people continuing to write comments and comments?

<thundering voice>
</thundering voice>

Time for an ontology break.

This is the palm of my hand
And this is my face
Watch them slowly intersect

Obligatory “Picard double facepalm” –>

He also does a really good “quadruple take”.

@ravel stop deleting your comments! You haven’t had a bad one, yet, either



I concur with @cmeinel, it is time for an oncology break. Or did she say eschatology.

Arcology? Archaeology? Ecology? Ophthalmology?

@000 Eschatology? I’m with Martin Luther, Revelation should never have been included in the Bible. Why aren’t you boys all getting drunk? My husband and I are eating ice cream right now, our version of a vice. We haven’t managed to stay awake for New Year’s midnight since we rang in the New Year 2000 on a Y2K-compliant sailboat on Elephant Butte Lake. Darn, that Y2K sure was boring, not a speck of eschatological incidents.

I don’t drink because I’m Amish.



Your glass is amiss?

Happy New Year! Ontology break or not, this is just too fascinating to stop now.
@000, @Heffalump, I was in small (10 and 13 member) teams and didn’t know what was happening outside them. At most I interacted with 20 team-mates in two years. By comparison, the comment stream here is like trying to drink from a fire-hose. There are about a dozen users – you know who you are – who make compulsive reading, so much so that actual forecasting is just a secondary activity these days.
Last year my team was in an experimental group that didn’t receive feedback on Brier Scores till the season was over, which was somewhat demotivating, though that was offset by being part of a very engaged team, with 7 out of 10 regularly contributing thoughtful rationales. This year is way more interesting, though it’s about time we moved on from the beta phase.
@Ravel, nice music!

Welcome, 2016!
We’re just 31 min into it, here in the ADT zone.
@Khalid: we must have been in the same experimental condition both years (on different teams). Interaction? Almost non-existent.
I agree that forecasters’ comments this year are incredibly engaging.

@Khalid I’ve been gathering and analyzing data from GJP season 4, as nothing yet has been published about it and I want to understand this research better. Furthermore, even if I were to be patient and wait, what has been published about the first three seasons has been, let us say, sparse. So I don’t expect much more in the future. Did you perchance save a copy of your Participant Feedback Report? I’ll gladly trade my results for yours. Same deal for anyone else reading this.

To put a little faith in GJI back in you, guys, the good news is that the S-300 question will be voided. This is an interesting thread but no need to bitch about the bad resolution anymore 🙂



Should we put our faith in GJI or this guy:

If we couldn’t bitch about bad resolutions there would be almost nothing to talk about. I say bring on more random Mission Accomplished resolutions.

@Dima-K: I wonder… are you getting some info before there’s an announcement on this site because GJ is announcing things on the *other* platform?



Clearly yes @Dima-K is on the A team because he originated the matter of the “other platform” by mentioning that “this one day, in Band Camp, I heard some other kids talking about the Other Platform, and….”

Barring that, Occam’s Razor would suggest that he has been pinged by our Director of Government Relations.

PS And no, this isn’t the right clip:

@GL2814, you’re missing this whole thread, where are you?

Thanks for the news, @Dima-K! And Happy New Year to everyone… Wow, did this thread ever expand in the last few days. @000 – clearly @GL2814 is over at @einsteinjs’s other thread.

@morrell – you have the better of me. Other thread? I didn’t realize this was still going on.

@000 – I was beginning to think you didn’t like me anymore. 😉

This question should be resolved yes or no. We obviously lost our spooks, and just as obviously lost anyone with the phone number of a spook, for this question has an answer, but GJOpen, Inc. is obviously in over its head in asking it! How many manned bases are on the moon? “None,” @cmeinel would say. Yet this week my research indicates between four and forty. Some people should not ask questions they themselves cannot answer. Void the question. Wimps!



@ravel we need some references please on those moonbases.

@000, as good as the sources to resolve this question, gladly 🙂

@ravel, @000 These lunar base allegations mostly trace back to Richard Hoagland, who purports to have been a NASA consultant. He has been debunked multiple times regarding this and many other activities dating back to July 1975, which is when he went over a cliff. But there’s always a new crop of Hoagland believers, year after year.

Nowadays you can buy your own telescope that’s good enough to NOT see Hoagland’s “structures” — if you have trouble understanding how hard it would be for ALL astronomers to conspire to hide them. Example: Celestron CPC Deluxe 800 with Edge HD Optics



Everybody knows that brand of telescopes has embedded secret imaging circuits that recognize the bases and smooth them over in the lens.

Thanks for the update, Dima-K.

Credit to the GJO for making the right call with the S-300 question.

Although I completely agree that voiding this question is the correct course of action, I think some of the comments directed at the GJO admins were a little over the top. I believe that they are doing their best, and I do appreciate the fact that they provided a platform for all of us to forecast on. Without the GJO, we wouldn’t be forecasting right now. On the other hand, I understand the frustration of my fellow forecasters (after all, my Brier score got annihilated on this question: I held my forecast at near 0%).

I think going forward, more open communication between forecasters and the GJO would be constructive. We can all live and learn and make this better.

My one piece of advice for the GJO is that they shouldn’t close a question until they are as certain of the outcome as is practical. Based on all our comments, GJO must have known how controversial the S-300 question was. Many of us (me included) were calling the Iranian ambassador a liar. When there is this kind of doubt, they should leave the question open (for as long as needed) until we have conclusive proof of a resolution. Once the question closes we can’t update our forecast; however, if we leave the question open, we can always go back and calculate our Brier scores if a specific date of delivery is confirmed.

In this way we could have avoided voiding the question.
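
To make the "go back and calculate" idea concrete, here is a rough Python sketch of rescoring once a delivery date is known. It assumes one forecast per day, the Brier formulation summed over both outcomes (so scores run from 0 to 2), and that only days up to the confirmed delivery day count. Those are assumptions for illustration, not a description of how GJO actually scores, and both function names are invented.

```python
def brier_score(p_yes, resolved_yes):
    """Brier score for a yes/no question, summed over both outcomes (range 0 to 2)."""
    outcome = 1.0 if resolved_yes else 0.0
    return (p_yes - outcome) ** 2 + ((1.0 - p_yes) - (1.0 - outcome)) ** 2


def rescored_mean_brier(daily_forecasts, delivery_day, resolved_yes=True):
    """Average the daily scores up to and including delivery_day (0-based index)."""
    counted = daily_forecasts[: delivery_day + 1]
    if not counted:
        raise ValueError("no forecasts on or before the delivery day")
    return sum(brier_score(p, resolved_yes) for p in counted) / len(counted)
```

Under these assumptions, a forecast held at 0% on a question that resolves yes scores the worst possible 2.0 on each day it stands, which is the kind of hit described above.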

@000 @cmeinel isn’t this my point? Hoaxes are debunked by true science, like this lovely experiment. Discredited sources, the braggarts in Iran and Russia, are offset by human, technical and intercepted intelligence. If “they” cannot determine the truth on missile systems as big as trash / dump trucks, how will “they” ever determine the question regarding 300 grams of nuclear material? Therefore GJOpen, Inc. has NO credible sources and cannot discern hoax from braggart from secret maneuver. Void the question. I do not believe there are human bases on the moon. I wish it were a question here though 🙂

@ravel Excellent point and one we need to address in the CREATE program. IARPA had both GJP and Scicast Predict develop their own questions in the last two years of ACE. However, IARPA maintained control over question resolution. I suspect that IARPA may do something similar in the next four years of crowd forecasting research.

As for credible sources, there are ways to get to the bottom of news stories. But it takes hard work. Given the major world mass media rush to withdraw their reporters from foreign bureaus, news stories are becoming less reliable. What they are doing now is relying on freelancers, who tend to get killed or wind up behind bars, or who might even be playing a double game. For this reason, crowd forecasting might become less reliable over time.

OTOH, Twitter and other social media are becoming more useful to get to original sources: eye witnesses with smart phones, geolocation of imagery, etc.

On my infamous third hand, the troll factories are getting increasingly good at mimicking eyewitness, video and other evidence.

@Clairvoyance The crux of my complaints against GJOpen is that the research underlying it by Dr. Tetlock et al. has not, to my knowledge, judging by their publications and news stories, ever addressed the problem of how to present questions to the crowd so as to enable the crowd to make the best possible forecasts. By contrast, their competitors in the ACE tournament, Scicast Predict, have published their research into this issue. They are competing for a CREATE contract, and teaming with Cultivate Labs, to my knowledge. So hang on, we may soon have a far better playground.

Specifically, from pp. 88-89 of the SciCast Annual Report, Year 4 (25 May 2015):

“The resolution source is by far the most critical portion of a question and must be rigorously crafted to ensure there is no ambiguity when resolving it. Each user who reads a question will have his or her own interpretation of it and what could constitute a resolution; it is the job of question management to minimize these instances…. Adherence to a consistent style guide is necessary to ensure high-quality questions. Concise grammar, diction, and unambiguous explanations are absolutely necessary.”

I apologize for the infantile humor. I could not resist. @cmeinel’s infamous “third hand:”

@redacted – you be serious and have fun too. I think it helps memory retention.

I, too, am glad to hear that the S-300 question will be voided. I assume my Brier score will be better after the score on it is backed out, but more importantly my trust in the process and morale will be better.

Even more importantly, I take this question as providing many examples of deliberate dissemination of misinformation–something that I knew went on but I had not been able to identify before.

@clairvoyance: I think that leaving a question open is not sufficient when there has been a significant event. I certainly agree that more communication would be beneficial between administrators and forecasters. Because silence implies that the status quo is still operative, I think that occasionally it would be appropriate for administrators to announce something to the effect: “We are aware of the reports linked and are evaluating their relevance for the resolution of this question.” To me, this says, “No decision has been made yet regarding the significance of these reports. Do not assume that we have decided that we consider this event to NOT have resolved this question, AND do not assume that we will decide that this event HAS resolved this question. Please continue to forecast on this question and provide documentation to support your forecast(s).”



@cmeinel, apropos of nothing here is a recent paper on how they implemented the pricing mechanism for market format:

@000 Thank you, I had not previously seen that paper.

@einsteinjs Could you please send a copy of your Participant Feedback report to me? I’ll gladly trade mine plus my final scores and leaderboard from the prediction polls group I was in.



TA DA: Admin is just as bored as we are. They just sent out an email (at 9:40PM EST on Saturday night!) closing the question.

@GL2814: @cmeinel and I are not the only ones from our group on here. She has probably kept better track than me of who’s shown up, but there are at least 25 or so who were on our leaderboards at one time or another. We can only recognize folks from the leaderboards since there was no interaction in our group. (At least, I had none, but it’s possible that other people found each other outside of GJP through their names.)

This has been a long grind — not only for us forecasters, but I’m sure for administrators, too. I thought that the announcement was very well written. One thing that always amazes me is how gracious GJI administrators are, even under such frustrating and difficult circumstances.

I’m not in entire agreement with previous posts by forecasters about leaving questions open without comment when there are significant events. I think that it would have been appropriate for GJ administrators to have announced that they were aware of the press releases by the Russians & Iranians . . . more or less like they did with the OPEC question where they opted to let us know that they didn’t consider the events of the early Dec meeting to have satisfied their requirement of an announced decision of change (an interpretation that I strongly disagree with).

I am indifferent as to whether this question should be voided. I just want to reiterate that there is no law of nature that requires a disinformation campaign to end when the underlying event occurs. Who knows if the latest news bits are true? In particular, if you are to believe that the latest constellation of data is the “true story,” you must now explain the inconsistency that did not previously exist; namely, the purported fact that the operator training will begin in Russia after the first “regiment” has already been delivered to Iran. Why even send the units if they’re going to be sitting idle and unmanned for, what, 2, 4, 6 weeks? As to the lawsuit, its dismissal has seemingly always been conditioned upon receipt of “enough” of something, not just upon receipt of anything. But I’m not seeking to change the current adjudicated outcome. Only to again, for the now third time, remind us all that time disassociation is a basic method of information manipulation.

[Edit: I’m also wondering what the silent @g thinks of all this!]

@anneinak because…?

@000: I very much appreciated when GJP moved to a bidding system for the prediction market. There was still the problem of the folks who were able to get to the IFP as soon as it went live, but it felt like this certainly mitigated that ‘race.’ I also liked the game aspect of trying to figure out where other forecasters were going to peg the IFP and then submit my bid based on that.

@cmeinel: I thought I’d already sent it to you, but I guess not. Earlier this evening, I sent it. I should also say, I don’t ever check that email address, so you’ll need to ping me on here if there’s an error. Hopefully you’ll be able to enlarge the pictures, (some of the graphs may not be too easy to read), as I was a bit of an outlier on a number of measures.

@redacted, wasn’t it stated early on that Iranians were in Russia training on the upgraded command units and software? In contradiction to the statement that all of this series were obsolete and scrapped? In contradiction to trade show mixed messages saying Iranians were in process of picking makes and models? In contradiction of May 2015 orders from Putin to fill the order (do countries with dictators ignore supreme leader mandates?). Also with a multitude of positions regarding the lawsuit, factory orders and other paperwork. Meanwhile, Egypt, India and others were “placing orders” and various stock units were deployed to Syria, Russian Eastern Territory and Arctic during this question period. I believe, like the naval missiles to Syria, the downed missiles in Iran, the onboard S-400 batteries on ships in the Mediterranean and the submarine missile firings, these were primarily for Western consumption. All the talk of a bankrupt Russia cannot diminish the perceived panoply in multiple theaters: if Western aggression begins anywhere, there are so many responses in place (east coast, west coast, Alaska, Canada, Gulf of Mexico, Scandinavia, Eastern Europe, Mediterranean, Turkey, Caucasus, Levant) that any reasonable military strategist would start running out of battle pieces to cover the map with adequate response. Borrowing an old American slogan/flag motto, Russia hung a huge “Don’t Tread On Me!” banner for all the world to see. It hasn’t hurt their arms sales business.

@ravel: Yes, you’re right. Perhaps I was too myopic in selecting my discrepancy exemplar. We should have more questions focused on Russia. Does Jane’s track transaction volumes? I’m kind of counting on some big Russian arms sales to deliver the 4th quarter GDP surprise for the BRIC question. Maybe we could work backward from the financial transaction to the S-300 delivery date, eh?

@einsteinjs Thank you, I got your participant feedback report and converted it into pdf and odt formats and returned it to you along with my feedback and final leaderboard. It certainly is fascinating getting a look at season 4’s many experimental groups. Anyone else want to trade?



@ravel, Russia won’t go bankrupt as long as they can keep selling rocket engines to the United States Air Force.

PS not to mention the Putin/Obama joint venture on the moon bases

@000, Lars, Moon Maid died in the explosion so I am too sad to continue that discussion. However before we discuss alt-news, perhaps we can entertain ideas as why or what agreements between Old Families, Bloodline and Magicians, White Hats, Dragons and P2 that allow P2 to fall, Iran a pass, and the re-emergence of Assyria. Ah, those are great themes for future incarnations of prediction games and markets, no?
If only we had invested in rocket engines: “The debate over the Russian engines has become entangled in an emerging rivalry among the companies vying for the lucrative business of space launches, amounting to $70 billion in contracts for military and intelligence missions alone between now and 2030, according to an estimate cited by the Government Accountability Office,” as per your source. Yes, that is a lot of money, and if the McCain party position wins, great news for Elon Musk and Robert T. Bigelow of Skinwalker Ranch fame, and ah yes, this brings up again, the Aviary. So who knows how the Raven died, and his identity? I have an unresolved question about that. Sunday morning before coffee! I love it!



@ravel, hmmm, Skinwalker Ranch, mmm hmmmm. The secret lies in this journey:

Other secrets can be found here, in this text, including the disappearance in these parts of the mysterious “V”:

The raven seeks news of his cousin, the Falcon:

And that’s all I’ve got, I need to go buy some pricey almond rolls from the local fancy cafe before I can continue.


And finally:

Good Judgment™ Open – New announcement from admins
Sat 1/2/2016 9:40 PM

Hey Lars,

An admin made a new announcement:

1/2/15: GJ has decided to cancel the question (#3) about whether Russia would deliver the S-300 or S-400 missile system to Iran because new evidence suggests that the initial decision to close the question as a “yes” was made based on premature announcements. The question will be removed from the GJP Classic Challenge and the Iran Challenge, and scores will be updated to reflect this decision as soon as technologically feasible.

As mentioned in the initial closing announcement, GJ decided to close the question as a “yes” after both the Iranian and Russian ambassadors made statements confirming delivery. At the time, GJ considered this decision a very close call: Although credible Western outlets had picked up the announcements, they merely repeated the ambassadors’ claims or quoted the Russian and Iranian news outlets. Because of this, GJ continued to monitor the question to see if there was further confirmation in credible sources or evidence from any sources suggesting that systems had not been delivered. After a week of no further reporting, and no contradictory evidence, the question was closed.

However, contradictory evidence has now emerged that shows the initial decision to close the question as a yes was premature. According to new Russian and Iranian sources, the delivery will be delayed until Iran’s lawsuit against Russia has been dropped and the latest estimates suggest that delivery may happen early in 2016. Because the question has been closed for almost a month and forecasters have been unable to update their forecasts in response to new developments, GJ is voiding the question rather than scoring the question as a “no.”

GJ’s question team is committed to asking tough geopolitical questions with real-world significance. The team spends a great deal of time on the front end trying to make sure all of our questions are falsifiable. Unfortunately, resolving some of those questions using open source materials can be challenging. GJ takes every resolution decision very seriously, often consulting outside experts and tapping into additional resources to make sure that resolution decisions reflect what happened in the real world. The question team also does extensive post-mortem analyses on problematic questions so that future questions can avoid falling into similar traps. This question provided a number of important lessons that GJ plans on internalizing, including the need to take more time in resolving questions when the resolution is ambiguous, the dangers of relying only on announcements to resolve questions, and the intractability of questions about weapons transfers. Please be assured that these lessons, and others, will be applied as the team writes and resolves future questions.

The Good Judgment™ Open Team



