Yottabytes and the Data Analysis Challenge

July 6th, 2009 by Steven Aftergood

The increasing capability of high-resolution military and intelligence sensors is producing ever-growing quantities of data, which could overwhelm the capacity to analyze them unless new approaches to data management and analysis are adopted, according to a newly released report (pdf) from the JASON defense advisory panel.

“As the amount of data captured by these sensors grows, the difficulty in storing, analyzing, and fusing the sensor data becomes increasingly significant,” the report said.

Extrapolating from current trends, data production could hypothetically reach the Yottabyte range by 2015.  (The Yotta- prefix means ten raised to the twenty-fourth power.  Mega- means ten to the sixth power, Giga- means ten to the ninth power, and Tera- is ten to the twelfth power.)  If one byte of data were used to image one square meter of the Earth’s surface, then 1.6 Yottabytes would be generated by imaging the entire surface of the Earth every second for a hundred years, the report explained.
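
The report's illustration can be checked with a few lines of arithmetic. The sketch below is not taken from the report; it assumes an Earth surface area of roughly 5.1 × 10^14 square meters (about 510 million square kilometers, land and ocean combined) and a year of 365.25 days, and the names it uses are purely illustrative.

```python
# Back-of-the-envelope check of the "1.6 Yottabytes" illustration.
# Assumptions (not stated in the report): Earth's total surface area
# is taken as ~5.1e14 m^2, and a year as 365.25 days.

EARTH_SURFACE_M2 = 5.1e14               # assumed total surface area, m^2
BYTES_PER_M2 = 1                        # one byte per square meter, per image
SECONDS_PER_YEAR = 365.25 * 24 * 3600   # one full-Earth image per second
YOTTABYTE = 10**24                      # Yotta- = ten to the 24th power


def total_bytes(years, bytes_per_m2=BYTES_PER_M2, area_m2=EARTH_SURFACE_M2):
    """Bytes produced by imaging the whole surface once per second."""
    images = years * SECONDS_PER_YEAR
    return area_m2 * bytes_per_m2 * images


century_yb = total_bytes(100) / YOTTABYTE
print(f"{century_yb:.2f} Yottabytes over 100 years")
```

Under these assumptions the total comes out at about 1.6 Yottabytes, in line with the figure quoted from the report. Changing bytes_per_m2 shows how quickly a finer resolution or additional spectral bands would shorten the time needed to reach the same volume.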

While the data management challenge is daunting, it is not unmanageable in principle, the JASONs said, nor is it entirely unprecedented.  “Important parallels can be drawn with data intensive science efforts such as high energy physics and astronomy.”  These efforts show how data filtering approaches can be applied to reduce data storage and processing requirements well below the Yottabyte range.

The report suggested several research and development strategies for improving data management and analysis.  The JASONs also proposed a series of “grand challenges” that would set ambitious technical goals and provide monetary rewards for their achievement.

The December 2008 JASON report was initially withheld from public access, but a copy was released in response to a Freedom of Information Act request from Secrecy News.  See “Data Analysis Challenges”.

5 Responses to “Yottabytes and the Data Analysis Challenge”

  1. NMvoiceofreason Says:

    If you image the entire earth at 128×128 per meter resolution (roughly 3″ squares), then you reach a yottabyte in about a month. 2/3rds of that data is almost totally useless, as it shows the surface of the ocean. If you go multispectral, C/m/y/k/Ir/Uv you can use a yotta per week. If you add lidar and sar, you can do a yotta a day.

    Nothing is unmanageable with sufficient time, money, manpower, except for the bureaucracies that gather sensor data.

  2. Matthew Tanner Says:

    NMvoiceofreason, (July 7th, 2009 at 8:32 am)

    Your twinky analogy is missing something. I get the resolution we’re working at but that says nothing about the rate that we’re using up memory. I’m presuming that it has something to do with the rate imaging satellites produce data.

  3. taochiapet Says:

    i think NMvoiceofreason’s point is that we’re already capable of, and indeed do, generate yotta’s worth of image data. the post’s unrealistic use of 1 byte = 1 m^2 (granted, for illustrative purposes) tends to obscure that fact.

    rather than an increase in the rate that imaging satellites produce data, perhaps there’s just an increase in the use of imaging satellites to collect data. in that regard, i’d say the only memory being used up is the collective memory of a government limited in power and constrained by law…

  4. Grumpy Old Man Says:

    You are saying that data including oceans is totally useless. You are talking about the engine of the world and calling it totally useless. I think you might be thinking with blinders on.

  5. Company - News - Solera Networks™ Says:

    [...] Secrecy works to challenge excessive government secrecy and to promote public oversight”) in a post on the challenges of dealing with large data sets. The December 2008 JASON (not an acronym) report [...]

© 2009 Secrecy News All Rights Reserved