{"id":181,"date":"2024-04-29T18:01:38","date_gmt":"2024-04-29T18:01:38","guid":{"rendered":"http:\/\/localhost:8888\/sawberries\/2024\/04\/29\/tornet-ai-dataset-carves-new-paths-tornado-detection-0429\/"},"modified":"2024-04-29T18:01:38","modified_gmt":"2024-04-29T18:01:38","slug":"tornet-ai-dataset-carves-new-paths-tornado-detection-0429","status":"publish","type":"post","link":"http:\/\/localhost:8888\/sawberries\/2024\/04\/29\/tornet-ai-dataset-carves-new-paths-tornado-detection-0429\/","title":{"rendered":"An AI dataset carves new paths to tornado detection"},"content":{"rendered":"
The return of spring in the Northern Hemisphere touches off tornado season. A tornado’s twisting funnel of dust and debris seems an unmistakable sight. But that sight can be obscured to radar, the tool of meteorologists. It’s hard to know exactly when a tornado has formed, or even why.<\/p>\n
A new dataset could hold answers. It contains radar returns from thousands of tornadoes that have hit the United States in the past 10 years. Storms that spawned tornadoes are flanked by other severe storms, some with nearly identical conditions, that never did. MIT Lincoln Laboratory researchers who curated the dataset, called TorNet<\/a>, have now released it open source. They hope to enable breakthroughs in detecting one of nature’s most mysterious and violent phenomena.<\/p>\n \u201cA lot of progress is driven by easily available, benchmark datasets. We hope TorNet will lay a foundation for machine learning algorithms to both detect and predict tornadoes,\u201d says Mark Veillette, the project’s co-principal investigator with James Kurdzo. Both researchers work in the Air Traffic Control Systems Group.\u00a0<\/p>\n Along with the dataset, the team is releasing models trained on it. The models show promise for machine learning’s ability to spot a twister. Building on this work could open new frontiers for forecasters, helping them provide more accurate warnings that might save lives.\u00a0<\/p>\n Swirling uncertainty<\/strong><\/p>\n About 1,200 tornadoes occur in the United States every year, causing millions to billions of dollars in economic damage<\/a> and claiming 71 lives on average. Last year, one unusually long-lasting tornado<\/a> killed 17 people and injured at least 165 others along a 59-mile path in Mississippi. \u00a0<\/p>\n Yet tornadoes are notoriously difficult to forecast because scientists don’t have a clear picture of why they form. \u201cWe can see two storms that look identical, and one will produce a tornado and one won’t. We don’t fully understand it,\u201d Kurdzo says.<\/p>\n A tornado\u2019s basic ingredients are thunderstorms with instability caused by rapidly rising warm air and wind shear that causes rotation. Weather radar is the primary tool used to monitor these conditions. 
But tornadoes lie too low to be detected, even when moderately close to the radar. As the radar beam with a given tilt angle travels farther from the antenna, it gets higher above the ground, mostly seeing reflections from rain and hail carried in the \u201cmesocyclone,\u201d the storm’s broad, rotating updraft. A mesocyclone doesn’t always produce a tornado.<\/p>\n With this limited view, forecasters must decide whether or not to issue a tornado warning. They often err on the side of caution. As a result, the rate of false alarms for tornado warnings is more than 70 percent. \u201cThat can lead to boy-who-cried-wolf syndrome,\u201d Kurdzo says.\u00a0\u00a0<\/p>\n In recent years, researchers have turned to machine learning to better detect and predict tornadoes. However, raw datasets and models have not always been accessible to the broader community, stifling progress. TorNet is filling this gap.<\/p>\n The dataset contains more than 200,000 radar images, 13,587\u00a0of which depict tornadoes. The rest of the images are non-tornadic, taken from storms in one of two categories: randomly selected severe storms or false-alarm storms (those that led a forecaster to issue a warning but that didn\u2019t produce a tornado).<\/p>\n Each sample of a storm or tornado comprises two sets of six radar images. The two sets correspond to different radar sweep angles. The six images portray different radar data products, such as reflectivity (showing precipitation intensity) or radial velocity (indicating whether winds are moving toward or away from the radar).<\/p>\n A challenge in curating the dataset was first finding tornadoes. Within the corpus of weather radar data, tornadoes are extremely rare events. The team then had to balance those tornado samples with difficult non-tornado samples. 
If the dataset were too easy, say by comparing tornadoes to snowstorms, an algorithm trained on the data would likely over-classify storms as tornadic.<\/p>\n \u201cWhat’s beautiful about a true benchmark dataset is that we’re all working with the same data, with the same level of difficulty, and can compare results,\u201d Veillette says. \u201cIt also makes meteorology more accessible to data scientists, and vice versa. It becomes easier for these two parties to work on a common problem.\u201d<\/p>\n Both researchers represent the progress that can come from cross-collaboration. Veillette is a mathematician and algorithm developer who has long been fascinated by tornadoes. Kurdzo is a meteorologist by training and a signal processing expert. In grad school, he chased tornadoes with custom-built mobile radars, collecting data to analyze in new ways.<\/p>\n \u201cThis dataset also means that a grad student doesn’t have to spend a year or two building a dataset. They can jump right into their research,\u201d Kurdzo says.<\/p>\n