Institutional Analytics Blog

5 most common higher ed data quality challenges—and solutions

By Tyler Richardett

"There are two interesting moments in the lifetime of a piece of data: The moment it is created and the moment it is used. Quality, the degree to which the data is fit for use, is judged at the moment of use," writes Dr. Thomas Redman in the Harvard Business Review.

Higher education offers no exception to this rule. In fact, one of the most prominent barriers to using data for optimal resource allocation is incomplete and inaccurate department-level data. As we push toward new, more complex analytics for the purpose of reducing inefficiencies, it is imperative to limit gaps, inaccuracies, and inconsistencies in your own data.

Over the last three years, we’ve audited data quality at more than 90 colleges and universities during the implementation of our Academic Performance Solutions (APS) analytics platform. Through this process, we’ve identified the five data quality challenges we see most often among members:

1. Capture instructors’ percent responsibility

Challenge: Most colleges and universities have sections with multiple instructors but don’t consistently record the share for which each instructor is responsible.

Impact: Without a documented share of responsibility, it's impossible to analyze instructor teaching activity and accurately gauge available instructional capacity.

Potential remedy: Within your student information system (SIS), ensure that instructors’ responsibility shares sum to 100% for each section. If a value is missing altogether, one solution may be to split the responsibility evenly among all assigned instructors. Another is to assign default percentages based on instructor rank and allow instructors to override them manually.
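As a rough illustration of the even-split remedy, here is a minimal Python sketch. The function names and the input shape (a mapping from instructor to percent, with None for a missing value) are assumptions made for this example, not part of any particular SIS.

```python
def fill_responsibility(shares):
    """Return a complete {instructor: percent} mapping for one section.

    `shares` is a hypothetical input shape: each assigned instructor maps
    to a recorded percent, or to None when no value was recorded.
    If any value is missing, responsibility is split evenly among all
    assigned instructors.
    """
    if any(pct is None for pct in shares.values()):
        even = 100.0 / len(shares)
        return {name: even for name in shares}
    return dict(shares)


def sums_to_100(shares, tolerance=0.01):
    """Check that a section's recorded shares add up to 100%."""
    return abs(sum(shares.values()) - 100.0) <= tolerance


section = {"Instructor A": None, "Instructor B": None}
filled = fill_responsibility(section)
print(filled, sums_to_100(filled))
```

A section that fails the sum check after filling (for example, recorded shares of 60% and 60%) likely needs manual review rather than an automatic correction.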

2. Assign sections to the correct course types

Challenge: Most colleges and universities have inconsistent policies for setting course types. Some set all course types to a default option (e.g., lecture); others allow faculty to set the course type without providing guidance or a description of each.

Impact: The course type assigned to each course should reflect characteristics of the course, including relative size and pedagogy. This is necessary to capture instructional workload, cost variations, and student progress implications for each course type, as well as benchmark against similar courses.

Potential remedy: Within your SIS, have faculty or department staff select from a picklist when assigning course types, rather than entering them into a free-form text field, and provide a definition for each option. Additionally, establish a process for identifying when a course’s size is misaligned with its type (e.g., a five-person lecture or a 200-person seminar).
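One way to spot size and type mismatches is a simple rule check. The sketch below is only an illustration: the size thresholds, field names, and section IDs are hypothetical and would need to reflect your own course type definitions.

```python
# Hypothetical expected enrollment ranges per course type: (minimum, maximum).
# None means no upper bound. These thresholds are assumptions for the example.
EXPECTED_SIZE = {
    "lecture": (10, None),
    "seminar": (1, 25),
}


def misaligned_sections(sections):
    """Return IDs of sections whose enrollment falls outside the range for their type."""
    flagged = []
    for section in sections:
        bounds = EXPECTED_SIZE.get(section["course_type"])
        if bounds is None:
            continue  # no rule defined for this course type
        low, high = bounds
        size = section["enrolled"]
        if size < low or (high is not None and size > high):
            flagged.append(section["section_id"])
    return flagged


sections = [
    {"section_id": "HIST101-01", "course_type": "lecture", "enrolled": 5},
    {"section_id": "ENGL480-01", "course_type": "seminar", "enrolled": 200},
    {"section_id": "BIOL110-01", "course_type": "lecture", "enrolled": 30},
]
print(misaligned_sections(sections))
```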

3. Establish accurate enrollment caps

Challenge: Nearly all colleges and universities document the maximum capacity for sections of a course inconsistently. Some set caps to zero as a means of strictly controlling registration, while others set caps to a value like 9,999, essentially removing a maximum altogether. An analysis of APS members’ data found that 37% of courses have variable max caps among sections that should all have the same pedagogical and space needs.

Figure: Enrollment caps data

Impact: Fill-rate analyses, calculated as the enrolled class size divided by the maximum capacity, can help you determine if there are too many or too few sections of a course. Surplus sections can represent wasted resources; an insufficient number of sections can pose a barrier to student progress toward timely graduation.
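The fill-rate calculation described above can be sketched in a few lines of Python. This is a minimal example under stated assumptions: the placeholder values (0 and 9,999) come from the workarounds mentioned earlier, while the function names and record layout are hypothetical.

```python
from collections import defaultdict

# Cap values treated as workarounds rather than real capacities (assumption).
PLACEHOLDER_CAPS = {0, 9999}


def fill_rate(enrolled, max_cap):
    """Enrolled class size divided by maximum capacity.

    Returns None when the cap is missing or looks like a placeholder,
    since a fill rate computed from it would be meaningless.
    """
    if max_cap is None or max_cap <= 0 or max_cap in PLACEHOLDER_CAPS:
        return None
    return enrolled / max_cap


def courses_with_variable_caps(sections):
    """Return courses whose sections carry more than one distinct max cap."""
    caps_by_course = defaultdict(set)
    for section in sections:
        caps_by_course[section["course"]].add(section["max_cap"])
    return sorted(course for course, caps in caps_by_course.items() if len(caps) > 1)


sections = [
    {"course": "MATH101", "max_cap": 30},
    {"course": "MATH101", "max_cap": 9999},
    {"course": "CHEM201", "max_cap": 24},
    {"course": "CHEM201", "max_cap": 24},
]
print(fill_rate(18, 24))
print(courses_with_variable_caps(sections))
```

Returning None for placeholder caps keeps those sections out of averages instead of silently reporting a fill rate near zero.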

Potential remedy: Establish a policy that sets clear guidelines about how max caps should be set. Within the policy, create processes that minimize workarounds, such as allowing faculty to retain control over registration without having to set the cap to zero.

4. Identify tenure and rank codes

Challenge: Many colleges and universities do not track instructors’ tenure and rank information, or, if they do, it is often inaccurate or outdated. For example, we often see null values for instructors without tenure, making it difficult to distinguish between those who are ineligible for tenure and those who are on the tenure track.

Impact: Tenure and rank codes group faculty and instructional staff into meaningful categories that describe their relationship with the school. This allows you to calculate faculty headcounts and instructional workloads among these different groups and compare them with both internal and external benchmarks. For example, is your math department relying more heavily on adjunct staff than peer departments?

Potential remedy: Given the sensitivity of data relating to tenure, having formal data governance processes can be especially important in this area. Leadership must make clear that all university data is owned by the institution rather than by departments or individuals, who act as stewards of the data. Responsibilities for data governance should be split between two primary committees: A prioritization committee of executives and a definition- and access-focused committee of technologists and data custodians who are subject-matter experts. Rely on the latter committee to set picklist options that reflect local language and norms for talking about tenure and rank categories.
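To illustrate why an explicit picklist helps, the sketch below models tenure status as a fixed set of codes and rejects null values, so that "not eligible" and "on track" can no longer be confused. The three option names are hypothetical; the actual options would be set by your governance committee.

```python
from enum import Enum


class TenureStatus(Enum):
    """Hypothetical picklist options; real options should reflect local norms."""
    TENURED = "tenured"
    TENURE_TRACK = "tenure_track"
    NOT_ELIGIBLE = "not_eligible"


def parse_tenure_status(raw):
    """Convert a raw SIS/HR value to a TenureStatus, rejecting nulls and unknown codes."""
    if raw is None or not str(raw).strip():
        raise ValueError("tenure status is required; record an explicit code instead of null")
    # Enum lookup by value raises ValueError for codes outside the picklist.
    return TenureStatus(str(raw).strip().lower())


print(parse_tenure_status("Tenure_Track"))
```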

5. Map distinct organization codes to academic departments

Challenge: Many colleges and universities do not accurately relate distinct organization codes and academic departments. In other words, they are not reliably attributing costs to academic units.

Impact: Accurately cross-walking financial organizations and academic departments is necessary to perform effective cost analyses. Downstream, bad data limits the extent to which you can tie the decisions of academic stakeholders to the budgetary impacts of those decisions.

Potential remedy: Across your SIS, HR, and finance systems, ensure that each academic department is matched to distinct organization codes and vice versa. One academic department may be associated with multiple financial organizations, but one financial organization should not be associated with more than one academic department. Put a process in place to update crosswalks regularly; this becomes particularly important when academic or financial schemas change. As with the previous challenge, technologists and subject-matter experts should oversee these mappings.
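The mapping rule above (a department may have many organization codes, but an organization code should belong to only one department) lends itself to an automated check. The following is a minimal sketch; the function name, input shapes, and sample codes are assumptions for illustration.

```python
from collections import defaultdict


def crosswalk_problems(pairs, departments):
    """Check a crosswalk given as a list of (org_code, department) pairs.

    Returns two items:
      - org codes mapped to more than one department, with those departments
      - departments that have no org code at all
    """
    depts_by_org = defaultdict(set)
    for org_code, dept in pairs:
        depts_by_org[org_code].add(dept)

    shared_orgs = {
        org: sorted(depts) for org, depts in depts_by_org.items() if len(depts) > 1
    }
    mapped = {dept for _, dept in pairs}
    unmapped_departments = sorted(set(departments) - mapped)
    return shared_orgs, unmapped_departments


pairs = [
    ("ORG100", "Mathematics"),
    ("ORG101", "Mathematics"),   # one department, several org codes: allowed
    ("ORG200", "Biology"),
    ("ORG200", "Chemistry"),     # one org code, two departments: flagged
]
departments = ["Mathematics", "Biology", "Chemistry", "History"]
print(crosswalk_problems(pairs, departments))
```

Running a check like this each time the crosswalk is updated can surface problems before they reach cost analyses.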

How this university used data to save 700 work hours and $466,000

St. Ambrose University was able to use key department-level metrics to inform resource requests and make distribution choices, saving them almost half a million dollars.


Sources:

  • Academic Performance Solutions data and analysis, 2018
  • A Common Currency: Achieving Excellence in Data Governance and Adoption of Analytics, IT Forum, 2015
