Obviously, the Snowden whistleblowing is about a lot more than the kind of data I'm going to be discussing here; it's about being private online. In my eyes, Snowden has done a very brave and very good thing for his country and for the internet community. What happens next, with regards to the American public and their willingness to challenge the government, will determine whether it was worth it. (You can read a timeline of events on Snowden here).
However, in relation to testing, a big pool of data about your application or website will come from analytics, be it WebTrends or Google Analytics. I am sure that many companies are already taking advantage of analytics data to determine which browsers are most popular and which devices are used to access their company's website or application. Having this data lets us decide which browsers and devices we support, and thus test on. This list will grow larger and larger as more companies release devices and as new versions of browsers are released. We already have Internet Explorer 7, 8, 9 and now 10, for instance; I'm just thankful that we stopped supporting IE6, as that was a monstrosity. So this is just one way that we are using data to support our testing.
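As a rough illustration of turning browser figures into a support list, here is a minimal sketch. It assumes you have already exported visit counts per browser from your analytics tool into a dictionary; the function name, the 5% threshold and the numbers are all made up for the example, not taken from any real report.

```python
def browsers_to_support(visits, min_share=0.05):
    """Return browsers whose share of visits meets min_share, most popular first.

    visits: dict mapping a browser name to its visit count (hypothetical export).
    min_share: assumed cut-off; pick whatever your team agrees on.
    """
    total = sum(visits.values())
    if total == 0:
        return []
    # Sort by visit count, highest first, so the list reads as a priority order.
    ranked = sorted(visits.items(), key=lambda item: item[1], reverse=True)
    return [name for name, count in ranked if count / total >= min_share]


# Invented figures, purely for illustration.
example_visits = {"IE9": 400, "IE8": 250, "Chrome": 300, "IE6": 10, "IE7": 40}
supported = browsers_to_support(example_visits)
```

With these invented figures, IE6 and IE7 fall below the threshold and drop off the list. In practice the cut-off is a business decision as much as a testing one.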
Also in our analytics data are things like user journeys and the common things that users do, and we can use this to drive our testing. If there's a common journey that a number of users take, then we can ensure in our testing that there's nothing that will disrupt that journey. Often the journey is common sense, but it's a useful piece of data nonetheless.
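To show the idea of finding common journeys, here is a small sketch. It assumes each session has been exported as an ordered list of page names, which is a simplification of what an analytics tool actually gives you; the function name and the page names are hypothetical.

```python
from collections import Counter


def most_common_journeys(sessions, top_n=3):
    """Count identical page sequences and return the top_n most frequent.

    sessions: iterable of page-name lists, one list per user session
    (an assumed export format, not any specific tool's output).
    """
    # Lists are not hashable, so convert each journey to a tuple before counting.
    counts = Counter(tuple(pages) for pages in sessions)
    return counts.most_common(top_n)


# Invented sessions, purely for illustration.
example_sessions = [
    ["home", "search", "product", "checkout"],
    ["home", "search", "product", "checkout"],
    ["home", "account"],
]
top_journey = most_common_journeys(example_sessions, top_n=1)
```

The journeys at the top of this list are the ones worth covering first in a regression pack.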
We can also use other forms of data to determine where bugs are likely to appear in the system. If you have a system that has evolved quite a bit, with a lot of bugs that have been raised and subsequently fixed, and you know that a certain area of the system is prone to regression-type bugs, then you can make sure that you have tests covering this area in your test pack, or perform exploratory testing around it as necessary. This is where CI builds can come in handy: by having tests run every night against a build, you can determine fairly easily whether certain areas of the system are prone to bugs, and protect against this going forward.
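One way to spot bug-prone areas from nightly runs is to work out a failure rate per area. The sketch below assumes you can reduce your CI results to simple (area, passed) pairs; that format, the function name and the area names are assumptions for the example rather than anything a particular CI server produces.

```python
from collections import defaultdict


def failure_rate_by_area(results):
    """Return the fraction of failed test runs for each area of the system.

    results: iterable of (area, passed) pairs collected across nightly builds
    (a hypothetical, flattened view of CI test results).
    """
    totals = defaultdict(int)
    failures = defaultdict(int)
    for area, passed in results:
        totals[area] += 1
        if not passed:
            failures[area] += 1
    # Every area in totals has at least one run, so the division is safe.
    return {area: failures[area] / totals[area] for area in totals}


# Invented results, purely for illustration.
example_results = [
    ("checkout", False),
    ("checkout", True),
    ("login", True),
    ("login", True),
]
rates = failure_rate_by_area(example_results)
```

Areas with a persistently high rate are candidates for extra automated coverage or an exploratory session.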
Finally, analytics can be used to determine how often certain scenarios happen. For instance, we have a timeout page that has certain analytics on it. When this page is hit, we know what the user was doing at the time, which helps in researching what the problem was and how it arose.
Data doesn't have to be limited to analytics. We can widen what counts as "data" to include the output of tools such as SQL Server Profiler. If we have this running during a test run, the data can later be used to drive improvements to the databases, whether to improve efficiency or to find bugs that might otherwise have been missed.
Similarly, we can run tests with console windows open to see any errors, use HttpWatch to see the requests that the browser is making, and view logs on the servers. This is all data that is available to us, and as testers we should be using it regularly as part of our test runs.
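Checking server logs after a test run can be as simple as pulling out the lines that look like errors. This is a minimal sketch that assumes error lines contain the text "ERROR"; real log formats vary, so the marker, the function name and the sample lines are all assumptions.

```python
def error_lines(log_lines, marker="ERROR"):
    """Return the log lines that contain the marker, without trailing newlines.

    log_lines: any iterable of strings, such as an open log file.
    marker: assumed text that identifies an error line in your log format.
    """
    return [line.rstrip("\n") for line in log_lines if marker in line]


# Invented log lines, purely for illustration.
example_log = [
    "10:00:01 INFO  Request started\n",
    "10:00:02 ERROR Timeout calling payment service\n",
    "10:00:03 INFO  Request finished\n",
]
errors_found = error_lines(example_log)
```

Running something like this at the end of each test run makes it harder for a quiet server-side error to slip past unnoticed.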
Obviously, this data might not be available to everyone, so is there a risk that smaller companies get left behind? In my eyes, no. Smaller companies will eventually have this data available to them too, for example by building up multiple releases that have had tests run against them. Whether they decide to use it or not is up to them.
The point I'm getting at is that there is an abundance of data out there, some readily at our fingertips and some we might have to work for a bit. The more data we have as testers, the more pinpointed our testing can be. No single piece of this data would be much use to us on its own, but combining it all helps our test approach become more direct and focused.