When I analyzed the data collected during the last BruCON edition, I had the idea to correlate the timeslots assigned to talks with the amount of Internet traffic. First, a big disclaimer: my goal is not to judge the popularity of a speaker or the quality of his/her presentation, but rather to investigate whether network usage could reveal interesting facts.
Measuring bandwidth is not a good indicator: some people used BitTorrent clients and others were downloading big files in the background. I think that it is more relevant to count the number of sessions. The first step was to extract relevant data. I decided to focus only on HTTP(S) traffic (TCP 80 & 443). Only public destination IP addresses have been used (e.g. connections to the Wall of Sheep are not included). All sessions with their timestamp have been extracted and indexed by Splunk:
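As a rough illustration of the selection criteria above (TCP 80 or 443, public destination only), here is a minimal Python sketch. It is not the extraction I actually ran: the function names and the sample records are made up for the example, and "public" is approximated with the standard library's `is_global` check.

```python
import ipaddress

WEB_PORTS = {80, 443}  # the two TCP ports kept in the analysis


def is_public(ip: str) -> bool:
    """True when the address is globally routable (not private, loopback, link-local...)."""
    return ipaddress.ip_address(ip).is_global


def keep_session(dst_ip: str, dst_port: int) -> bool:
    """Keep a session only if it targets a web port on a public address."""
    return dst_port in WEB_PORTS and is_public(dst_ip)


# Hypothetical records: (timestamp, destination IP, destination port)
sessions = [
    ("2000-01-01 10:02:11", "10.0.0.5", 80),   # internal host: dropped
    ("2000-01-01 10:02:15", "8.8.8.8", 443),   # public host, web port: kept
    ("2000-01-01 10:03:40", "8.8.8.8", 22),    # public host, other port: dropped
]

kept = [s for s in sessions if keep_session(s[1], s[2])]
```

Filtering on the destination address is what excludes internal services such as the Wall of Sheep from the count.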
Then, I grouped the connections into slots of 30 minutes and exported the result to a CSV file:
source="/opt/splunk/var/run/splunk/csv/httpbrucon.csv" index="brucon" | timechart span=30m count
Finally, I exported the schedule from sched.brucon.org and correlated both with Excel:
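For those who prefer a script to a spreadsheet, the same correlation could be sketched in Python. This is only an illustration: the column names (`slot_start`, `count`, `talk`, `start`, `end`), the timestamp format and the sample values are all assumptions, not the real BruCON exports.

```python
import csv
import io
from datetime import datetime

# Hypothetical sample data standing in for the two exports.
TRAFFIC_CSV = """slot_start,count
2000-01-01 10:00,120
2000-01-01 10:30,95
2000-01-01 11:00,210
2000-01-01 11:30,180
"""

SCHEDULE_CSV = """talk,start,end
Talk A,2000-01-01 10:00,2000-01-01 11:00
Talk B,2000-01-01 11:00,2000-01-01 12:00
"""

FMT = "%Y-%m-%d %H:%M"  # assumed timestamp format


def load_traffic(text):
    """Map the start of each 30-minute slot to its session count."""
    return {
        datetime.strptime(row["slot_start"], FMT): int(row["count"])
        for row in csv.DictReader(io.StringIO(text))
    }


def sessions_per_talk(traffic, schedule_text):
    """Sum the counts of every slot that starts inside a talk's time window."""
    totals = {}
    for row in csv.DictReader(io.StringIO(schedule_text)):
        start = datetime.strptime(row["start"], FMT)
        end = datetime.strptime(row["end"], FMT)
        totals[row["talk"]] = sum(
            count for slot, count in traffic.items() if start <= slot < end
        )
    return totals


totals = sessions_per_talk(load_traffic(TRAFFIC_CSV), SCHEDULE_CSV)
for talk, count in totals.items():
    print(talk, count)
```

Talks do not all have the same duration, so dividing each total by the number of slots it covers would probably give a fairer comparison than the raw sum.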
And the graph showing traffic per talk:
So now, how to interpret those numbers? A peak of traffic can be interpreted in two ways. When the speaker has a nice slide or explains something awesome, attendees will often share it on social networks. But, on the other side, bored people (or those who are lost in overly complex slides) will be tempted to surf the web while waiting for the end of the presentation. Based on the feedback received about some talks, both situations are present in my results (again, I won’t disclose which ones).
This model is not perfect. Besides regular talks, there were also workshops, and they could generate a significant number of connections too. One way to improve the reporting would be to restrict the analysis to connections made through the wireless access points located in the main room…