Machine Learning for Ecologists and Other Interesting People


Every spring, graduate students at the Odum School of Ecology at the University of Georgia organize a short weekly seminar around an interesting theme. This year, the theme is “Machine Learning for Ecologists”. At first glance, machine learning may not seem very related to a discipline mostly concerned with the distribution and abundance of organisms, but machine learning has plenty of applications in ecology and beyond. 

File:Sarkar&Saha Figure1A.png
Modified after Sarkar & Saha, 2019 by CreightonMA via Wikimedia Commons. Licensed under CC BY-SA 4.0.

Machine learning describes a broad class of computer algorithms that can improve automatically through experience and the use of data. These algorithms make up a subfield of artificial intelligence and include many techniques such as clustering, decision trees, and neural networks. Machine learning methods are becoming more advanced and implementation has become increasingly accessible due to online tutorials and dedicated software packages.

Machine learning has become a part of our daily lives in both obvious and less obvious ways. For example, digital personal assistants and navigation applications employ machine learning to provide us with information when requested. Machine learning methods are also often used for highly specific tasks such as playing games like chess or Go. More covertly, machine learning determines the advertisements we see, filters our emails, and flags signs of fraud. In addition, industries such as medicine and law that rely on highly trained professionals have begun to use these methods to improve performance and reduce costs. Computers can now read radiology images for doctors and improve the document review process for lawyers.

In academic research, the adoption of machine learning methods has also been growing steadily. For example, in my field of ecology, researchers now use machine learning to process large amounts of data by training computer algorithms to sort through complex data such as images. Large datasets are then employed for predictive applications, such as estimating how the geographic range of an animal or plant species may change under pressures such as climate change. Machine learning can also be used to estimate values when datasets are incomplete. As an example, ecologists have applied these methods to the problem of estimating life history traits of species, such as lifespans and reproductive rates, when these traits have not been directly measured. In quantum chemistry, machine learning is being used to leverage large amounts of data to predict energies and atomic charges, tasks that usually need to be tackled separately but can be handled by a single machine learning algorithm. In plant molecular biology, machine learning is useful for handling large volumes of data such as genetic sequences, which can be used to determine which genes code for specific proteins.
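
To make the idea of estimating unmeasured values a little more concrete, here is a minimal sketch in Python. It is not the method used in any particular ecological study. It swaps in a very simple technique, k-nearest-neighbor imputation, and every species name, trait value, and function name below is hypothetical and made up purely for illustration. A missing lifespan is filled in with the average lifespan of the species that are most similar in body mass.

```python
def knn_impute(records, target, feature, k=2):
    """Fill missing `target` values with the mean of the k records that are
    closest in `feature` and that do have the target measured."""
    known = [r for r in records if r[target] is not None]
    filled = []
    for r in records:
        if r[target] is None:
            # Rank measured species by how similar they are in the feature.
            neighbors = sorted(known, key=lambda o: abs(o[feature] - r[feature]))[:k]
            estimate = sum(n[target] for n in neighbors) / len(neighbors)
            # Copy the record so the original data are left unchanged.
            r = {**r, target: estimate, "imputed": True}
        filled.append(r)
    return filled


# Hypothetical trait table: lifespan was never measured for species_d.
species = [
    {"name": "species_a", "log_body_mass": 1.0, "lifespan_years": 2.0},
    {"name": "species_b", "log_body_mass": 1.2, "lifespan_years": 3.0},
    {"name": "species_c", "log_body_mass": 3.0, "lifespan_years": 12.0},
    {"name": "species_d", "log_body_mass": 1.1, "lifespan_years": None},
]

filled = knn_impute(species, target="lifespan_years", feature="log_body_mass", k=2)
for row in filled:
    print(row["name"], row["lifespan_years"], row.get("imputed", False))
```

Real applications typically use many traits at once and more flexible models, but the logic is the same: learn from the species we have measured to make an informed guess about the ones we have not.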

“Colorado” By Inam Jameel. Used with permission

Because machine learning methods have such broad uses, almost any discipline is likely to benefit from applying them thoughtfully. However, these benefits come with challenges and risks. Machine learning methods often separate data into training and testing sets, so that an algorithm can be fit on one set and evaluated on the other, and this can require large amounts of data that are difficult or costly to gather. Results from many machine learning methods can also be notoriously difficult to interpret because these algorithms can identify patterns that are not immediately clear to humans. In these situations, it is important to have a strong understanding of the data itself in order to avoid the common issue of conflating correlation with causation.
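
For readers who have not seen a training and testing split before, the following is a minimal sketch in plain Python. The data are invented (hypothetical body lengths for two made-up species), the function names are my own, and the nearest-neighbor classifier is just one simple stand-in for whatever algorithm you might actually use. The point is only that the algorithm is fit on one portion of the data and scored on a portion it has never seen.

```python
import random


def train_test_split(rows, test_fraction=0.25, seed=0):
    """Shuffle the rows and split them into (train, test) lists."""
    rng = random.Random(seed)
    shuffled = rows[:]  # copy so the caller's list is left untouched
    rng.shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def nearest_neighbor_predict(train, x):
    """Predict the label of x from the single closest training point."""
    closest = min(train, key=lambda row: abs(row[0] - x))
    return closest[1]


def accuracy(train, test):
    """Fraction of held-out points whose label is predicted correctly."""
    correct = sum(nearest_neighbor_predict(train, x) == label for x, label in test)
    return correct / len(test)


# Hypothetical data: (body length in mm, species label).
data = [(float(x), "small_species") for x in range(10, 30)]
data += [(float(x), "large_species") for x in range(40, 60)]

train, test = train_test_split(data, test_fraction=0.25, seed=42)
score = accuracy(train, test)
print(len(train), "training rows,", len(test), "testing rows")
print("accuracy on held-out data:", score)
```

Because every row set aside for testing is a row the algorithm cannot learn from, small datasets force a trade-off between learning well and evaluating honestly, which is part of why data-hungry methods can be costly.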

Machine learning adds to our toolkit as scientists, and it is important to understand when these methods can be most useful for our research. If you'd like to learn more, online resources such as machinelearningmastery.com, towardsdatascience.com, and scikit-learn.org are all helpful starting places. I also highly recommend this free textbook from Hastie and Tibshirani. Most of all, I recommend searching through the literature to see how machine learning is being applied in your discipline. Please leave a comment if you find something that interests you!

Note: the title is inspired by Hanna Kokko's “Modelling for Field Biologists and Other Interesting People”, which is a great resource for modeling for field biologists and other interesting people.

About the Author

Daniel is a Ph.D. student in the Odum School of Ecology at the University of Georgia. He is interested in the relationship between host biodiversity and parasite transmission. Stemming from a background in freshwater ecology, Daniel often uses diverse communities of amphibians and their many pathogens as study systems.
