Monday, April 14, 2008

Data-Mining

The sheer amount of data that human observers can accumulate just by watching a single moderately-sized environment is staggering. Suppose your observation unit is a small state of five million souls. Already, the number of theoretical interactions beggars the power of any computer on the face of this earth. Still, you can record simple things like power consumption, water usage, tonnage of rice imported, stuff like that.

But to use the data to prove that people are happy (and how much so), that people are moral (and how much, and in what way), that people are kind or gentle or emotional, or relaxed or stressed — this is not so easy, even though it might seem so at first glance. In the scientific paradigm, it is best to have questions that give you some idea of what kind of answer is sought, and how certain those answers might be, and why.

People trained in such disciplines can make assertions, and test them. Sometimes, however, the training is misapplied. When scientists ask whether God exists, the question is a bad one. How would you know? Given the proposed characteristics of God, why do you think that the data we have access to can establish His existence or lack of it? What level of evidence would you require to prove God existed, and if you knew, would God allow you to prove His existence? Why would He; why would He not?

The thing about assertions, I suppose, is that if you were to present your evidence to a panel of reasonable human beings, and then contrary evidence of superior quantity and quality were presented, you should be able to reject your hypothesis. If you cannot reject your hypothesis, you're not being very scientific. On the other hand, the burden is asymmetric. Presenting refutations in advance without knowing what is to be refuted is slightly odd. Not that it cannot be done, but it isn't often done.

In the end, the huge mass of data will be interpreted by those who mine it best, who extract the ore carefully, reduce it to its true metal, and make useful implements from it. As I sit here, I sometimes find myself at a loss as to what the eventual fate of this data should be. I try to be as painstaking as possible. I try not to make statements I can't support with the full burden of data. It kills me, sometimes. But it is useful to marshal the facts in neat formations before sending them out to do battle against the unknown.
