In the preliminary report from the MIT Task Force on the Future of Libraries, we make several references to the importance of optimizing library content, data, and metadata for machine learning applications.
We imagine a repository of knowledge and data that can be exploited and analyzed by humans, machines, and algorithms. This transformation will accelerate the accumulation and validation of knowledge, and will enable the creation of new knowledge and of solutions to the world’s great challenges. Libraries will no longer be geared primarily to direct readers but instead to content contributors, community curators, text-mining programs, machine-learning algorithms, and visualization tools.
I am convinced that machine learning is going to have a major impact on the advancement of knowledge in lots of ways we can’t anticipate, and I want to understand it better. I am also convinced that without the intervention of folks who understand the biases built into our collections in terms of content, organization, and description, machine learning applications will re-inscribe and reify existing inequalities.
To that end, I’m trying to put together a reading list to get smarter about what machine learning is, what it can do for libraries, and what libraries can do to support and inspire creative, productive, just and inclusive applications of machine learning. Here’s my very incomplete initial list. Additional suggestions welcome in the comments.
- Machine Learning: The new AI by Ethem Alpaydin (part of the Essential Knowledge Series from MIT Press)
- Always Already Computational: Library Collections as Data, an IMLS-funded project, Thomas Padilla.
- “New AI-Based Search Engines are a ‘Game Changer’ for Science Research,” by Nicola Jones, in Nature, November 12, 2016
- Artificial Intelligence Is Lost in the Woods, by David Gelernter, in MIT Technology Review
- Searching for Lost Knowledge in the Age of Intelligent Machines, by Adrienne LaFrance, in The Atlantic
- AI Songsmith Cranks Out Surprisingly Catchy Tunes, by Will Knight, in MIT Technology Review
- And this list of resources: Hitchhiker’s guide to data science, machine learning, R, Python, from Vincent Granville
This quote resonated with me:
“I am also convinced that without the intervention of folks who understand the biases built into our collections in terms of content, organization, and description, machine learning applications will re-inscribe and reify existing inequalities.”
A bit of background: I have a particular interest in personal epistemology, meaning the implicit beliefs about the nature of knowledge and knowing that drive, or at least guide, an individual’s actions. My husband is an electrical engineer who does a lot of machine learning.
He is convinced that I should turn a critical eye to the personal epistemology of machine learning programmers for the exact reasons you state above: machine learning shapes so much of what we can access, but it is not inherently neutral. It is shaped by the programmers themselves. Machine learning programmers need to be reflexive in their practices, yet many of them are never introduced to such concepts or practices.
I’m glad we are not the only ones thinking critically about this.
Thanks for putting together this list, Chris. I obviously have some catching up to do. I would add this recent Guardian article to demonstrate why it is important for libraries to pay attention to this issue:
https://www.theguardian.com/technology/2016/dec/04/google-democracy-truth-internet-search-facebook
I would also add the classic In the Age of the Smart Machine by Shoshana Zuboff. It is still in print, although it was originally published in 1988.