Programming contest taps analytics to mark endangered whales

This episode of the 'Talking Data' podcast discusses how machine learning is being applied to whale identification as part of a programming contest.

A programming contest encouraging the use of machine learning to help identify endangered whales is the topic in this edition of the Talking Data podcast.

Analytics software maker MathWorks in Natick, Mass., is sponsoring the competition, along with the U.S. National Oceanic and Atmospheric Administration (NOAA). It's hosted on Kaggle, a platform for programming and analytics competitions.

Underway until Jan. 7, 2016, the public competition is looking for algorithms that comb through aerial photographs and recognize individual whales in the right whale population.

The right whales' recent history is a tortured one. In the days of massive whaling, it seems, the right whales were a coast-hugging species -- and easy prey for whalers who decimated their ranks. Those whales have been a focus of conservation efforts since 1935.

Their very small population has been growing and recently numbered about 500, according to Christin Khan, fishery biologist with NOAA Fisheries. That number remains precarious. Among the threats the right whales face is entanglement in fishing lines. They also can be struck by ships.

"Part of our work is to identify every right whale we can," Khan said. That job can be labor-intensive when staff members have to carefully eyeball photo after photo, she noted.


Khan said she and her colleagues have watched the growth of machine learning projects that identify images. If such analytical algorithms can trim time from the whale identification task, that can free up staff for more proactive efforts. 

According to Paul Pilotte, MathWorks' technical marketing manager, contestants typically start by identifying the whales by their unique head markings, employing, for example, edge detection routines. They then test the accuracy of their predictive identifiers against existing NOAA image databases. Programming languages used by contestants include R, Python and MathWorks' own MATLAB, he said. The top prize in the whale recognition contest is $5,000 for the winning team.
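For readers unfamiliar with edge detection routines, the following is a minimal illustrative sketch in Python, not any contestant's actual method. It assumes a Sobel filter, one common edge detection technique the article does not name, and runs it on a tiny made-up grayscale image. The function name `sobel_edges`, the threshold value and the toy image are all hypothetical.

```python
def sobel_edges(image, threshold):
    """Return a binary edge map for a grayscale image given as a list of rows.

    Assumption: a Sobel filter stands in for the unspecified
    "edge detection routines" mentioned in the article.
    """
    # Standard 3x3 Sobel kernels for horizontal and vertical gradients.
    gx_kernel = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
    gy_kernel = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]

    height, width = len(image), len(image[0])
    edges = [[0] * width for _ in range(height)]

    # Border pixels are skipped and left as 0 (no edge).
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            gx = gy = 0
            for dy in range(3):
                for dx in range(3):
                    pixel = image[y + dy - 1][x + dx - 1]
                    gx += gx_kernel[dy][dx] * pixel
                    gy += gy_kernel[dy][dx] * pixel
            # Gradient magnitude: large where brightness changes sharply.
            magnitude = (gx * gx + gy * gy) ** 0.5
            edges[y][x] = 1 if magnitude >= threshold else 0
    return edges


# Hypothetical toy image: dark on the left, bright on the right.
toy_image = [[0, 0, 0, 255, 255, 255] for _ in range(5)]
edge_map = sobel_edges(toy_image, threshold=100)

for row in edge_map:
    print(row)
```

In this toy case the filter flags only the columns where dark meets bright. On a real aerial photograph, a step like this would be just one stage before any attempt to match a whale's head markings against known individuals.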
