Algorithms and academic research
Last Friday, I attended Computers and Crowds: Unexpected Authors and Their Impact on Scholarly Research at the CUNY Graduate School of Journalism, an excellent event organized by the LACUNY Emerging Tech Committee, LILAC, and OLS.
My notes are available as a very messy PDF of my scribbles made with Paper (iPad app). Another version of the presentation slides by Paul Zenke and Kate Peterson is also online under the title "Black Hats, Farms, and Bubbles."
Impressions, connections, and resolutions:
- What's a filter bubble? As a web algorithm learns what interests you, it gives you more of what you tend to like. It's a positive feedback loop. The downside is that you get less exposure to material that makes you uncomfortable or challenges your preconceptions or politics.
- You need a balanced information diet. It was hard enough before personalized filters became the norm, and now it's harder!
- See: Eli Pariser: Beware online "filter bubbles" (10-minute TED talk)
- One of the library's roles may be providing a place of neutrality. We can better provide neutral information for our users by installing tools that increase user privacy and decrease tracking, especially if these might be inconvenient or undesirable to use at home.
- Some practices to protect yourself and your students from unwanted tracking:
  - clear your history and cookies regularly
  - use ad blocking software
  - see who's tracking you using Collusion (Chrome & Firefox plugin)
  - use private browsing
  - understand how to de-personalize your Google search results
  - try out alternatives like DuckDuckGo
- Challenge students to evaluate not just the resource but also the algorithms that led them there.
  - Why might one article rise to the top of the results list using Google or an academic database?
  - How would they design a system to recommend material to a friend?
- Challenge yourself to understand and compare these algorithms and filters. Do the legwork and the research to ensure you're providing your students with acceptable platforms for information hunting, consumption, and creation.
  - For example, if you use Primo, familiarize yourself with ScholarRank.
- Algorithm-created content is already here. Narrative Science is hugely successful. NLP and text mining are changing journalism and are on their way to changing academic writing as well.
- Algorithmically created essays might be the next cheating trend. I have heard of online education programs (MOOCs, probably) asking students for a portfolio of past writing so that stylometric analysis can check algorithmically whether the work they submit is really theirs.
- See also: "The Great Automatic Grammatizator," a prescient short story by Roald Dahl
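
To make the stylometry idea above a little more concrete, here is a minimal Python sketch of one common approach: compare how often two texts use ordinary function words. This is only an illustration under my own assumptions, not what any particular MOOC or vendor actually runs. The word list, the function names (`profile`, `cosine_similarity`, `style_similarity`), and the choice of cosine similarity are all hypothetical.

```python
import math
import re
from collections import Counter

# Hypothetical, deliberately tiny list of function words.
# Real stylometric studies typically use much longer lists.
FUNCTION_WORDS = [
    "the", "and", "of", "to", "a", "in", "that", "it",
    "is", "was", "i", "for", "with", "but", "not",
]


def profile(text):
    """Return the relative frequency of each function word in the text."""
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(words)
    total = len(words) or 1  # avoid dividing by zero on empty text
    return [counts[w] / total for w in FUNCTION_WORDS]


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length frequency vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # no function words found, so nothing to compare
    return dot / (norm_a * norm_b)


def style_similarity(portfolio, submission):
    """Score how closely a submission's function-word use matches a portfolio."""
    return cosine_similarity(profile(portfolio), profile(submission))


if __name__ == "__main__":
    past = "I think that the library is a place of neutrality, and it was not easy to build."
    new = "The essay is about the role of the algorithm in the writing of the news."
    print(round(style_similarity(past, new), 3))
```

A score near 1 means the two texts lean on function words in similar proportions; a low score only hints at a different author. Short texts, quotations, and a change of genre can all move the number, so a sketch like this would be a prompt for a conversation with a student, not proof of cheating.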