The problem with data

2017-05-25 Trouble-With-Data

We’re being inundated with data. That’s what we’re told, right? We hear all the time how many exabytes of new data are being created every day. There’s just one problem: maybe none of it is the data we actually need.

I recently had the opportunity, along with several of my OCLC colleagues, to attend the Electronic Resources and Libraries (ER&L) Conference. I’ve been going to this great conference for the last two years, and each year it offers a really valuable look into how libraries manage e-resources. This year, several topics across multiple presentations led me to the conclusion that actionable data is actually pretty hard to find and even harder to wrangle successfully.

You will never “have it all”

An important element of any successful data analysis strategy is to first realize that you’ll never have all of the data you need. At some point, you have to take what you’ve got, come to some conclusions and move on. But we should also be looking for ways to start by collecting data that is more actionable from the get-go.

As I thought through the ER&L presentations and listened to librarians talk about their “wish lists” for data, I found that three themes seemed to resonate around this idea of “good data” as opposed to “more data.”

  1. Don’t collect data that’s not actionable
  2. Go right to your users
  3. Standardize and share

1. What are you willing to change?

If you’re not planning on making changes based on the data you’re collecting and analyzing, just stop. I heard from folks who’d been collecting all kinds of statistics that, at the end of the day, weren’t useful in supporting actual strategy or tactics.

What changes are you willing to make based on how you analyze data?

The exact opposite approach shows up in the demand-driven acquisition (DDA) and evidence-based acquisition (EBA) models, which we heard quite a lot about at ER&L. In these programs, libraries partner with publishers to make e-resources available, with purchase or continued access triggered once usage crosses some threshold. It’s a great example of using just-in-time data to make specific collection development decisions.
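For readers who like to see the idea in code, here is a minimal sketch of a usage-threshold decision in Python. It is an illustration only, not how any particular DDA or EBA program works: the `TitleUsage` record, the `titles_to_purchase` function, the sample titles and the threshold of 3 are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class TitleUsage:
    title: str
    uses: int  # count of qualifying uses, however the (hypothetical) agreement defines them


def titles_to_purchase(usage, threshold):
    """Return the titles whose usage has reached the purchase threshold."""
    return [u.title for u in usage if u.uses >= threshold]


# Made-up sample data, for illustration only.
usage = [
    TitleUsage("Title A", 5),
    TitleUsage("Title B", 1),
    TitleUsage("Title C", 3),
]

print(titles_to_purchase(usage, threshold=3))
```

The point is that every number collected here maps directly to a decision: a title either crosses the line or it doesn’t.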

2. Be an anthropologist, not a number cruncher

In the absence of readily available (or useful) data, it seems that more libraries are investing in user surveys and usability tests, especially involving students. Librarians from CUNY spoke about watching their student testers locate an electronic article on a specific topic. The staff watched where the students navigated online, what they typed into search boxes and how they sorted results.

Some libraries, like Montana State University, are working to optimize their e-resources for search engines and social media. Their Open SESMO project incorporates linked data terms and social media-friendly images into e-resource records to help online information seekers find their resources. The “data” in this case comes from observing students, so that the library can “be in the right place at the right time.”

Smart, simple glimpses into the “life of the user” can inform your strategy as much as pages of gate stats.

3. All together now

Most librarians I talked to are trying to manage whatever data they can collect in Excel, which has limitations. Library staff who want to engage in data-driven collection development are still sometimes forced to make guesses about what their users want or compare “apples and oranges” when data comes in different formats from different sources.

This is where a conference like ER&L really helps. When we can get together and agree on formats and standards that work well, we can develop better, shared tools for analysis and implementation. For example, when vendors and publishers use COUNTER-compliant statistics, libraries can compare data from across multiple sources much more easily. Some folks I spoke with were a bit frustrated that they weren’t able to get COUNTER reports for streaming video services.
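To make the “apples and oranges” problem concrete, here is a rough Python sketch of what it takes to combine usage reports when each source uses its own layout. Everything in it is an assumption for illustration: the vendor names, the field names in `VENDOR_FIELDS`, and the sample numbers are made up, and none of them come from the COUNTER specification.

```python
from collections import defaultdict

# Hypothetical mapping from each vendor's own field names to one shared shape.
VENDOR_FIELDS = {
    "vendor_a": {"title": "Title", "uses": "Requests"},
    "vendor_b": {"title": "item_name", "uses": "views"},
}


def normalize(vendor, rows):
    """Convert one vendor's rows into dicts with shared 'title' and 'uses' keys."""
    fields = VENDOR_FIELDS[vendor]
    return [
        {"title": row[fields["title"]], "uses": int(row[fields["uses"]])}
        for row in rows
    ]


def total_uses(reports):
    """Sum usage per title across all vendors' normalized rows."""
    totals = defaultdict(int)
    for vendor, rows in reports.items():
        for row in normalize(vendor, rows):
            totals[row["title"]] += row["uses"]
    return dict(totals)


# Made-up sample reports, for illustration only.
reports = {
    "vendor_a": [{"Title": "Journal X", "Requests": "12"}],
    "vendor_b": [
        {"item_name": "Journal X", "views": "8"},
        {"item_name": "Journal Y", "views": "3"},
    ],
}

print(total_uses(reports))
```

Note that the sum is only meaningful if a “request” at one vendor counts the same thing as a “view” at another. Renaming columns is the easy part; agreeing on what gets counted is the part that shared standards handle for us.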

Using data standards also allows us to do wider-ranging, even global analysis and research. It’s handy for any one library to have access to standardized reports. But when thousands of libraries get together?

Data is a means, not the end

Collectively—when we are thoughtful about how and why we collect and share data—libraries can make an impact not just on their own institutions, but across the entire information landscape. The trick is to always think of data in terms of what it can do for us, not just what we need to do to get more.