Some GJOpen crowd statistics

These are based on my own observations, not official stats, so the results are not guaranteed to be reliable.

Current registered GJOpen Population: 18,377 more or less

[Figure: newforecasterspd]

The maximum here is 6,707, which is 36.5% of the 18,377 registered users. That is, about two-thirds of the people who register never make a forecast or comment:

[Figure: totalforecasterspd]

[Figure: fpd]

[Figure: pctactiveforecasterspd]
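As a quick check on the registered-versus-active split quoted above, here is a minimal sketch using only the two counts given in the post (the variable names are mine):

```python
registered = 18_377   # registered users quoted in the post
active = 6_707        # users who made at least one forecast

active_share = active / registered
inactive_share = 1 - active_share

print(f"active:   {active_share:.1%}")    # share who forecast at least once
print(f"inactive: {inactive_share:.1%}")  # share who never forecast
```

The inactive share comes out a little under two-thirds, which is where the "about two-thirds" figure comes from.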

Here’s an interesting picture, a little fatter than the last time I did this but otherwise much the same. 6,707 people forecast on at least one question, and the median number of questions per forecaster is 3:

[Figure: qpf]

This one cries out for a table, since the numbers to the right are hard to read off the graph (guess what, I’m the one who answered 115 questions). What this table says is that 5% of the active population is doing most of the heavy lifting:

#Questions #Forecasters % of 6,707 Cum %
115 1 0.015% 0.0%
114 2 0.030% 0.0%
112 1 0.015% 0.1%
111 2 0.030% 0.1%
107 1 0.015% 0.1%
105 1 0.015% 0.1%
104 1 0.015% 0.1%
102 1 0.015% 0.1%
101 2 0.030% 0.2%
100 1 0.015% 0.2%
98 1 0.015% 0.2%
96 2 0.030% 0.2%
95 1 0.015% 0.3%
93 1 0.015% 0.3%
92 2 0.030% 0.3%
91 1 0.015% 0.3%
89 2 0.030% 0.3%
88 3 0.045% 0.4%
87 1 0.015% 0.4%
86 1 0.015% 0.4%
85 1 0.015% 0.4%
84 3 0.045% 0.5%
81 3 0.045% 0.5%
80 2 0.030% 0.6%
79 2 0.030% 0.6%
78 4 0.060% 0.6%
77 1 0.015% 0.7%
75 3 0.045% 0.7%
74 2 0.030% 0.7%
71 2 0.030% 0.8%
70 1 0.015% 0.8%
69 2 0.030% 0.8%
68 4 0.060% 0.9%
67 2 0.030% 0.9%
65 2 0.030% 0.9%
64 4 0.060% 1.0%
63 3 0.045% 1.0%
62 3 0.045% 1.1%
61 3 0.045% 1.1%
60 2 0.030% 1.1%
58 1 0.015% 1.2%
57 1 0.015% 1.2%
56 5 0.075% 1.3%
55 6 0.089% 1.3%
54 5 0.075% 1.4%
53 1 0.015% 1.4%
52 4 0.060% 1.5%
51 3 0.045% 1.5%
50 4 0.060% 1.6%
49 6 0.089% 1.7%
48 3 0.045% 1.7%
47 7 0.104% 1.8%
46 4 0.060% 1.9%
45 1 0.015% 1.9%
44 9 0.134% 2.0%
43 5 0.075% 2.1%
42 3 0.045% 2.2%
41 8 0.119% 2.3%
40 3 0.045% 2.3%
39 13 0.194% 2.5%
38 7 0.104% 2.6%
37 5 0.075% 2.7%
36 6 0.089% 2.8%
35 10 0.149% 2.9%
34 12 0.179% 3.1%
33 14 0.209% 3.3%
32 7 0.104% 3.4%
31 10 0.149% 3.6%
30 6 0.089% 3.7%
29 17 0.253% 3.9%
28 9 0.134% 4.1%
27 16 0.239% 4.3%
26 14 0.209% 4.5%
25 20 0.298% 4.8%
24 24 0.358% 5.2%
23 16 0.239% 5.4%
22 20 0.298% 5.7%
21 20 0.298% 6.0%
20 31 0.462% 6.5%
19 30 0.447% 6.9%
18 42 0.626% 7.5%
17 52 0.775% 8.3%
16 48 0.716% 9.0%
15 40 0.596% 9.6%
14 62 0.924% 10.5%
13 78 1.163% 11.7%
12 91 1.357% 13.1%
11 103 1.536% 14.6%
10 108 1.610% 16.2%
9 152 2.266% 18.5%
8 217 3.235% 21.7%
7 219 3.265% 25.0%
6 299 4.458% 29.4%
5 384 5.725% 35.2%
4 475 7.082% 42.2%
3 695 10.362% 52.6%
2 1166 17.385% 70.0%
1 2013 30.013% 100.0%
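For anyone who wants to rebuild the two percentage columns, here is a rough sketch of how they follow from the raw counts. The function name and the choice to demonstrate on only the first four rows are mine; the counts and the 6,707 total are taken from the table.

```python
def share_table(rows, total):
    """rows: (questions, forecasters) pairs, sorted by questions descending.

    Returns (questions, forecasters, pct_of_total, cumulative_pct) tuples,
    where the cumulative column runs from the busiest forecasters down.
    """
    out = []
    running = 0
    for questions, forecasters in rows:
        running += forecasters
        out.append((questions,
                    forecasters,
                    100 * forecasters / total,
                    100 * running / total))
    return out


# First four rows of the table above
top_rows = [(115, 1), (114, 2), (112, 1), (111, 2)]
table = share_table(top_rows, total=6_707)

for questions, forecasters, pct, cum in table:
    print(f"{questions} {forecasters} {pct:.3f}% {cum:.1f}%")
```

Feeding in the full list of rows in the same order should reproduce the "% of 6,707" and "Cum %" columns to the rounding shown.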

If you then ask how many people have a negative total Accuracy score, i.e. how many people beat the crowd in the sum of Accuracy over all the questions they answered, this is going to be a smaller number. Note that I am calculating the Brier score per question branch, treating each branch as a separate question, so for multi-part questions my calculation will differ from GJOpen’s. There are a few questions we can ask. The first is: overall, what is the distribution of accuracy, irrespective of the number of questions answered? It looks like this, and tells us that exactly 1/3rd of our population beats the crowd, which is a good start:

[Figure: accuracy]
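To make the "negative total Accuracy score" idea concrete, here is a simplified sketch of a per-branch calculation. It assumes each branch is scored as the squared error of a single probability against a 0/1 outcome, and that Accuracy is the forecaster's Brier score minus a crowd baseline for the same branch. The function names and the example numbers are hypothetical, and GJOpen's real scoring has details that are not modelled here.

```python
def brier(prob, outcome):
    """Squared error between a forecast probability and a 0/1 outcome."""
    return (prob - outcome) ** 2


def total_accuracy(own, crowd, outcomes):
    """Sum over branches of (own Brier - crowd Brier).

    A negative total means the forecaster beat the crowd baseline overall.
    """
    return sum(brier(p, o) - brier(c, o)
               for p, c, o in zip(own, crowd, outcomes))


# Hypothetical forecaster, crowd baseline and outcomes on three branches
own = [0.9, 0.2, 0.6]
crowd = [0.7, 0.4, 0.5]
outcomes = [1, 0, 1]

score = total_accuracy(own, crowd, outcomes)
print(f"total accuracy: {score:.2f}")
```

In this made-up example the forecaster is closer to the outcome than the crowd on every branch, so the total is negative.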

Next we might ask how many of the people who beat the crowd have answered 1, 2, …, 115 questions. We know how many accurate forecasters there are. The question then is how many accurate busy forecasters there are, i.e. forecasters who, day in, day out, take whatever is thrown at them and crank out a crowd-beating forecast. Let’s see:

[Figure: abf]

Well, this is another picture that’s hard to read without a table.  What it’s saying is that you’ve got around 700 one-hit-wonders out of 1386 who happened to answer one question right on the money.  Then they went home.  That’s 10% of the active population.  The numbers get a whole lot smaller when you consider busy forecasters, i.e. people answering 20% or more of the available questions.  To my mind, these are the people who are the best model for professional analysts who, day-in, day-out, have to forecast some workload of questions, whether they are in a good mood or not.  So let’s look at the table:

#Questions #Forecasters Cum #Fcsters % of 1386 Cum %
74 1 1 0.072% 0.1%
69 1 2 0.072% 0.1%
68 1 3 0.072% 0.2%
64 1 4 0.072% 0.3%
62 1 5 0.072% 0.4%
61 1 6 0.072% 0.4%
57 1 7 0.072% 0.5%
49 1 8 0.072% 0.6%
48 3 11 0.216% 0.8%
44 1 12 0.072% 0.9%
40 1 13 0.072% 0.9%
36 1 14 0.072% 1.0%
35 1 15 0.072% 1.1%
34 3 18 0.216% 1.3%
33 1 19 0.072% 1.4%
32 1 20 0.072% 1.4%
28 1 21 0.072% 1.5%
26 1 22 0.072% 1.6%
25 2 24 0.144% 1.7%
24 1 25 0.072% 1.8%
23 2 27 0.144% 1.9%
22 1 28 0.072% 2.0%
20 2 30 0.144% 2.2%
19 1 31 0.072% 2.2%
18 5 36 0.361% 2.6%
17 4 40 0.289% 2.9%
16 5 45 0.361% 3.2%
15 3 48 0.216% 3.5%
14 6 54 0.433% 3.9%
13 10 64 0.722% 4.6%
12 6 70 0.433% 5.1%
11 14 84 1.010% 6.1%
10 13 97 0.938% 7.0%
9 9 106 0.649% 7.6%
8 22 128 1.587% 9.2%
7 23 151 1.659% 10.9%
6 37 188 2.670% 13.6%
5 55 243 3.968% 17.5%
4 101 344 7.287% 24.8%
3 107 451 7.720% 32.5%
2 214 665 15.440% 48.0%
1 721 1386 52.020% 100.0%

So if we set our cutoff point for busy-ness at 25 questions, then a grand total of 24 people out of 18,377 registrants are doing most of the heavy lifting.  That’s 0.13% of the total.
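The cutoff arithmetic can be sketched as follows. The rows are copied from the table of crowd-beating forecasters, but only down to the 20-question row, so cutoffs below 20 would need the remaining rows added. The helper name is mine.

```python
# (questions answered, crowd-beating forecasters), from the table above,
# truncated at the 20-question row
accurate_rows = [
    (74, 1), (69, 1), (68, 1), (64, 1), (62, 1), (61, 1), (57, 1),
    (49, 1), (48, 3), (44, 1), (40, 1), (36, 1), (35, 1), (34, 3),
    (33, 1), (32, 1), (28, 1), (26, 1), (25, 2), (24, 1), (23, 2),
    (22, 1), (20, 2),
]


def busy_count(rows, cutoff):
    """Number of forecasters who answered at least `cutoff` questions."""
    return sum(n for questions, n in rows if questions >= cutoff)


registered = 18_377
busy = busy_count(accurate_rows, cutoff=25)
print(busy, f"{busy / registered:.2%}")
```

Moving the cutoff to 20 questions with the same data raises the count only slightly, so the conclusion is not very sensitive to where exactly the line is drawn.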

Now, to be a little bloody-minded, we could ask what would happen if 18,377 people each flipped a fair coin 25 times, with the coin coming out heads more than half the time. I think I want to say 18,377 times 2 to the -24th power times 100 to get the percentage. That might get you a number like 0.10% of the total. So we’re ahead. By 0.03%. So are these people skilled or just lucky? Hard to say. But if they answered 25 questions, at least they’re busy, and they’re having fun.
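For anyone who wants to check the coin-flip arithmetic, here is a small sketch. It evaluates the expression as written above and, for comparison, two exact fair-coin probabilities for 25 flips: heads on more than half of them, and heads on all of them. Treating each question as an independent fair coin is the simplification already made in the text; the variable names are mine.

```python
from math import comb

registered = 18_377
flips = 25

# The expression as written in the text
as_written = registered * 2 ** -24 * 100

# Exact probabilities for one person flipping a fair coin 25 times
majority = range(flips // 2 + 1, flips + 1)          # 13 to 25 heads
p_majority_heads = sum(comb(flips, k) for k in majority) / 2 ** flips
p_all_heads = 0.5 ** flips

print(f"expression as written:    {as_written:.4f}")
print(f"P(more than half heads):  {p_majority_heads}")
print(f"expected all-heads count: {registered * p_all_heads:.6f}")
```

The expression as written comes to roughly 0.11. With an odd number of flips, heads on more than half of them has probability exactly one half by symmetry, while an all-heads run is expected for far fewer than one person in 18,377, so the luck baseline depends heavily on how strictly "beating the crowd" is read.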


3 thoughts on “Some GJOpen crowd statistics”

    1. I’ve noticed that there are a bunch of new forecasters who are forecasting only the primary questions (and not very accurately – they wildly overestimate Cruz in NH). On February 7th, there were 600+ forecasts in the primary questions, vs. around 60 for the other top questions. The forecasters in this new group all seem to have descriptions of the form ut:”username” (e.g. UT: lp24923 for user #18245). It looks to me like a new group of forecasters from a single source. Do you have any clue what’s going on?

