Assessment of information literacy skills among first year students

The development of research and information literacy skills in first year students is essential but challenging. Approaches that embed these skills within subject design and blend online and face-to-face delivery are considered best practice in this area. However, research has yet to identify the most appropriate form of assessment of these skills. We used constructive alignment to embed research skills in a first year subject. Students were assessed on their research skills using a diagnostic online quiz in week one, and in week six the application of those skills in their assignment was assessed using a rubric. We created a matched sample of results on these two forms of assessment that included 227 students. Our main aim was to determine whether there was a relationship between quiz and rubric scores, and to assess the practical relevance of the quiz for identifying students who might need additional support. We found a small but significant positive correlation between quiz and rubric results, and conclude that both the quiz and the rubric are useful forms of assessment and that there are benefits to using both within an embedded curriculum.


Introduction and background literature
Graduate skills and capabilities incorporate the skills and knowledge that undergraduates should develop beyond the discipline-specific content traditionally associated with a university education (Barrie, 2007). Research and information literacy form a crucial part of these capabilities, as they contribute to students' writing and critical thinking (Andrews & Patil, 2007; Grafstein, 2002). It is particularly important that students are able to use these skills in their first year at university, but there is little continuity in the expectations, teaching or assessment of these skills from high school to tertiary settings (Willison & O'Regan, 2005). Universities are under increasing pressure to measure and report levels of graduate attribute type skills during the first year at university (Barrie, 2007), and then to demonstrate improvement of those skills over the course of a degree. In order to do this, library and teaching staff must utilise best practice in the direct teaching and assessment of information literacy and research skills.
In the tertiary education environment, an embedded approach to the development of research skills has been acknowledged as superior to providing stand-alone workshops (Price, Becker, Clark, & Collins, 2011). This approach provides greater opportunities for students to learn, practice and receive feedback on their skills (Treleaven & Voola, 2011). A study previously published in this journal described an embedded approach to the development of generic academic skills using online tutorials to teach information literacy skills (Cassar, Funk, Hutchings, Henderson, & Pancini, 2012). Instruction for the development of information literacy skills is now commonly provided online in order to provide services to an increasing number of students (Anderson & May, 2010; Zhang, Watson, & Banfield, 2007). Blended instruction formats (with a combination of online and face-to-face delivery) emerge as favourable in a review of studies comparing these approaches (Zhang et al., 2007).
The two dominant forms of assessment of information literacy skills are short online quizzes and the assessment of skills as demonstrated in students' assignments. Online quizzes are often favoured as universities come under pressure to provide diagnostic and summative assessment of graduate skills (Barrie, 2007).
Academics are often sceptical of the capacity of a brief, online, multiple-choice quiz to accurately assess these higher order skills. However, they do acknowledge that using such methods is the easiest option, and might facilitate cross-institutional comparisons (Scharf, Elliott, Huey, Briller, & Joshi, 2007). Many Australian universities have therefore developed short online multiple-choice tests to assess information literacy (Price et al., 2011), and large, standardised forms of online testing to examine these skills have been used in the USA (Educational Testing Service, 2004; Kent State University Libraries and Media Services, 2007). Assessment of applied information literacy skills in students' written work, using rubrics that scaffold assessment criteria and indicate where students could improve, is often considered to be a more authentic method of assessment (Knight, 2006). However, this option is resource intensive and not particularly viable with large class sizes. Nevertheless, many authors have described their use of portfolio-based assessment to determine information literacy skills using either rubrics or checklists as a grading framework (e.g., Knight, 2006; Scharf et al., 2007).
In this paper, we aim to contribute to the debate regarding the assessment of research and information literacy skills. We compare the scores of the same students on a multiple-choice assessment and a rubric-based assessment of information literacy, and use statistics to determine the relationship between the two. In addition, we provide information regarding the potential usefulness of each assessment approach in identifying students who are performing well, or those who are in need of additional support. We (the teaching team) trialled an approach to embedding the graduate capability of Inquiry/Research in Concepts of Wellbeing [EDU1CW], a large (N ≈ 340) first year subject. This subject is delivered in the first semester of the first year of study for all primary and secondary Bachelor of Education students. One of the major aims of EDU1CW is to facilitate first year students' transition to university through a content focus on their personal wellbeing, and a skills focus on their academic capabilities through formative assessment (using a model described in Taylor, 2008). Full details of the subject are published elsewhere (Yager, 2011).
In order to embed the teaching and assessment of Inquiry/Research skills, we used constructive alignment. This involves subject design where the Intended Learning Outcomes [ILOs], teaching and learning activities and assessment are all related to each other in order to encourage deep learning (Biggs & Tang, 2007). In Concepts of Wellbeing, we included the development of Inquiry/Research as an ILO of the subject and this was communicated to students in written and verbal forms. A variety of online and face-to-face teaching and learning activities were provided for direct instruction about Inquiry/Research skills. Online activities included the Inquiry/Research Quiz [IRQ], a multiple-choice assessment with automated feedback, and LibSkills online modules, which provided further information following the quiz. In class, lectures about database searching, using library resources and referencing were provided. Students then had the opportunity to practice searching library databases to find journal articles relevant to their assessment topic in tutorials held in the computer labs. None of these activities were technically compulsory, but all students were strongly encouraged to complete all activities.
Students were initially assessed on their Inquiry/Research skills using the IRQ, and then on a rubric-based evaluation of their skills as demonstrated in their assessment. Students were encouraged to complete the IRQ in the first or second week of classes, so this formed the first learning activity that taught students about Inquiry/Research skills, but it was also an assessment of their baseline skill level. Students then practiced and demonstrated what they had learned in the first low-stakes written assessment for EDU1CW (Stage 1, described below).
Finally, students were formally assessed on whether they met the cornerstone standards for Inquiry/Research in their Stage 2 assignment (Theoretical and Background Plan, described below) due in week six, and were given formal feedback on their Inquiry/Research skills on a rubric in week eight. The rubric that we used was based on the La Trobe University Information Literacy Framework (La Trobe University, 2011b). The Framework has six standards, which articulate learning outcomes at cornerstone, midpoint and capstone levels, and is based on a standardised Australian Framework (Bundy, 2004). The cornerstone outcomes from the Framework were transferred to the rubric and used to assess students' assignments in terms of meeting, not meeting, or exceeding the standard.
The major assessment in EDU1CW, the Personal Wellbeing Plan (PWP), was designed to facilitate Inquiry/Research skill development through a series of written assessments in four stages, described below.
• Stage 1: the Proposal (10%, due week four) required students to present an evidence-based plan for personal behaviour change and give APA-style references of two peer-reviewed journal articles that they might use to support this plan. Feedback to students focussed on academic writing and referencing skills as well as the suitability and credibility of the articles chosen. Referencing was required, but did not attract a grade, giving students a "free trial."
• Stage 2: Theoretical and Background Information (30%, due in week six) required students to summarise their peer-reviewed journal articles and indicate how the research related to their plan for improving their wellbeing. Inquiry/Research skills were assessed using the rubric described above. An overview of the criteria, and of the mechanisms for assessing each criterion used in the rubric, is provided in Table 1 (below).
• Stage 3: the Reflection (20%, due week 11) required students to respond to a series of structured reflective questions about their experiences of behaviour change and to demonstrate continuing improvement in their writing and referencing skills. This allowed students the opportunity to further practice and demonstrate skills after they had received formal feedback on how well they had met the cornerstone standards.
• Stage 4: the Artefact (10%, due week 13) required students to provide a visual representation of their attempts at behaviour change and allowed a final attempt at referencing.
For the PWP assessment, students were also required to submit all previous stages of their work when they submitted their current piece of assessment. This allowed academic staff to refer back to students' past attempts and check whether they had responded to the feedback that was provided. Grading was such that students were penalised for failing to respond to and incorporate this feedback.

Research questions
The main aim of this research was to use statistics to determine the correlation between students' Inquiry/Research skills as assessed in an online quiz, and as demonstrated through their written assessment. The research questions were as follows:
1) Did either demographic factors (age, gender, course enrolled in) or quiz factors (amount of time taken, week quiz was done, making more than one attempt at the quiz) impact on students' quiz results?
2) Did demographic factors (age, gender, course enrolled in) impact on students' rubric results?
3) Was there a relationship between the quiz and rubric scores?
4) Was the quiz a useful tool for identifying students who might be performing well, or in need of additional support for this graduate capability?

Participants
Participants were first year undergraduates enrolled in the first year, first semester subject Concepts of Wellbeing.
The Faculty of Education Human Ethics Committee approved a universal ethics application that covered many projects relating to the first year in the faculty. This meant that students gave informed consent to the collection of data, test scores, artefacts of assessment and a first year survey in the first week of class.
No students refused participation. A total of 320 students were enrolled in the class, but matched data for both the IRQ and rubric were only available for 227 students, who comprised the sample for this study.

Measurement
Students' research skills were assessed using the IRQ, and the rubric-based assessment in Stage 2 of their major assignment, the PWP. In the first week of semester, students were directed to the IRQ through their learning management system (Moodle). Completion of the quiz and modules was voluntary, but strongly encouraged, and students were allowed as many attempts at the quiz as they liked. Students' total score on their first attempt at the quiz and total scores of any subsequent attempts were recorded using program software. This information was exported to Microsoft Excel by library staff, and provided to teaching staff. In week 6, students submitted Stage 2 of their PWP and their Inquiry/Research skills were assessed using a rubric (described above). Marks for each of the six areas of the rubric were recorded as 1 = standard not met, 2 = standard met and 3 = standard exceeded in accordance with the university guidelines for measuring graduate capabilities, providing a total score out of 18. In addition, tutors recorded whether or not the student was considered to have met the standard (or not met, or exceeded) overall. This information was then entered into an Excel database, along with details of each student's birth date, gender, course, and student number.
Raw data in Excel spreadsheets were obtained from teaching and library staff and sorted by surname. Data were copied into SPSS, and matched manually, by student name. A total of N = 319 first year students had results on the online quiz, and a total of N = 320 students were enrolled in EDU1CW. However, the lists of students in each database were not identical. From a total of N = 338 entries into the SPSS database, n = 90 were removed as they did not have rubric data, and n = 21 were removed as they did not have quiz data. This resulted in a final sample of n = 227 students for whom matched data for both the quiz and rubric were available.
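The rubric scoring described above can be sketched in a few lines. This is an illustrative sketch only: the label strings and the function name `rubric_total` are hypothetical, and only the 1/2/3 coding and the six-standard total come from the text.

```python
# Hypothetical labels; the 1-3 codes follow the scoring described in the text.
SCORE_CODES = {"not met": 1, "met": 2, "exceeded": 3}

def rubric_total(ratings):
    """Sum the coded ratings for the six rubric standards (possible range 6-18)."""
    if len(ratings) != 6:
        raise ValueError("expected one rating per standard (six in total)")
    return sum(SCORE_CODES[rating] for rating in ratings)

# Made-up example student, not taken from the study data.
example = ["met", "met", "exceeded", "not met", "met", "met"]
print(rubric_total(example))  # 2 + 2 + 3 + 1 + 2 + 2 = 12
```

A student who only just meets every standard therefore scores 12, which is a useful reference point when reading the mean rubric totals reported later.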
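The matching step can be illustrated with a small sketch. It assumes each record carries a unique key (a hypothetical student identifier), whereas the study matched records manually by student name in SPSS; the function name and toy data are likewise illustrative.

```python
def match_records(quiz, rubric):
    """Combine two {student_id: score} dicts and report which entries are dropped."""
    all_ids = set(quiz) | set(rubric)
    matched = {sid: (quiz[sid], rubric[sid])
               for sid in all_ids if sid in quiz and sid in rubric}
    missing_rubric = sorted(sid for sid in all_ids if sid not in rubric)
    missing_quiz = sorted(sid for sid in all_ids if sid not in quiz)
    return matched, missing_rubric, missing_quiz

# Toy data with made-up identifiers and scores.
quiz = {"s01": 7, "s02": 9, "s03": 5}
rubric = {"s02": 13, "s03": 11, "s04": 12}

matched, no_rubric, no_quiz = match_records(quiz, rubric)
print(len(matched), no_rubric, no_quiz)  # 2 ['s01'] ['s04']

# The counts reported in the text are internally consistent:
# 338 entries - 90 without rubric data - 21 without quiz data = 227 matched.
assert 338 - 90 - 21 == 227
```

Keeping the two exclusion lists separate makes it straightforward to compare excluded and included students afterwards, as is done in the description of the sample.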

Data analysis
Data screening and initial exploration revealed that the total scores on the first attempt of the quiz, and scores on the rubric, were not normally distributed; therefore non-parametric tests were used in all analyses. Descriptive statistics were used to obtain means and frequencies in relation to demographic data and performance on the IRQ and rubric. Where data were categorical and allowed for the comparison of two groups, Mann-Whitney U tests (the non-parametric alternative to an independent samples t-test) were used to determine the differences between these groups on quiz and rubric scores. Where data were categorical and allowed for the comparison of three groups, Kruskal-Wallis tests (the non-parametric version of a one-way ANOVA) were used to test for differences on quiz and rubric outcomes by course and quiz factors.
Where data were continuous, Spearman's rho was used as the non-parametric version of the Pearson's test to determine correlations between scores. This same test was used to determine whether there was a correlation between the IRQ score and the total score on the rubric. Where there were significant correlations, the relationship was explored further using Mann-Whitney U tests.
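For readers unfamiliar with the Mann-Whitney U test, the following is a minimal sketch of how a z statistic of the kind reported in this paper can be obtained. It uses the tie-corrected normal approximation without a continuity correction; whether this matches the exact SPSS settings used in the study is an assumption, and the function names and toy scores are illustrative.

```python
import math

def average_ranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def mann_whitney_z(group_a, group_b):
    """Return (U for group_a, z, two-sided p) via the normal approximation."""
    n1, n2 = len(group_a), len(group_b)
    n = n1 + n2
    combined = list(group_a) + list(group_b)
    ranks = average_ranks(combined)
    u1 = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    counts = {}
    for v in combined:
        counts[v] = counts.get(v, 0) + 1
    tie_term = sum(t ** 3 - t for t in counts.values())
    # Variance is zero (and z undefined) if every score is identical.
    var_u = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    z = (u1 - n1 * n2 / 2) / math.sqrt(var_u)
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided p from the normal curve
    return u1, z, p

# Toy scores only, not the study data.
u, z, p = mann_whitney_z([1, 2, 3], [4, 5, 6])
print(u, round(z, 3), round(p, 3))
```

The normal approximation is only reasonable for samples of moderate size; with very small groups, exact tables are normally used instead.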
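Spearman's rho is simply Pearson's correlation computed on ranks rather than raw scores. A minimal sketch under that definition follows; the function names and the toy scores are illustrative and are not taken from the study.

```python
import math

def average_ranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def spearman_rho(x, y):
    """Pearson correlation of the rank-transformed x and y (handles ties)."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(x)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in rx))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in ry))
    return cov / (sd_x * sd_y)

# Toy quiz (out of 10) and rubric (out of 18) scores, made up for illustration.
quiz_scores = [5, 7, 7, 9]
rubric_scores = [10, 13, 12, 15]
print(round(spearman_rho(quiz_scores, rubric_scores), 3))
```

Because the calculation depends only on rank order, it is unaffected by the skewed score distributions that ruled out Pearson's test here.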

Description of the sample
Data for both the quiz and the rubric were available for 227 students. Mann-Whitney U tests supported the representativeness of this sample, as there were no significant differences between the total score on the online quiz of the students in the final sample and those who were excluded due to missing rubric data (z = -0.51, p = .61). There was also no difference on the total rubric scores between those included in the final sample and those who were excluded due to missing quiz data (z = -0.99, p = .32).
The sample was predominantly female (females: 71.4%, n = 162; males: 28.6%, n = 65). Most students were enrolled in a Bachelor of Education (70%, n = 159), and a smaller proportion were enrolled in a Bachelor of Physical and Health Education (10.6%, n = 24) or a Bachelor of Early Childhood (15%, n = 34). A small number of students (4.4%, n = 10) were enrolled in degrees in other faculties. Students ranged in age from 18 to 58 years. The median age was 19 years and the mean age was 21.05 years [SD = 5.62].

Results of quiz-based assessment
On their first attempt at the quiz, students' scores ranged from 2 to 10 and the mean [SD] was 7.33 [1.53]. The proportion of students who got each of the quiz items correct is provided in Table 1. The majority of students were correct in responding to the majority of quiz items on the first attempt, with the exception of question three and question one. Just under half (47.14%, n = 107) of students made a second attempt at the quiz. The mean score on second attempts at the quiz was 8.57 [1.68]. A further 18.06% (n = 41) made a third attempt [mean = 9.07, SD = 1.32], five (2.20%) students made a fourth [mean = 9.20, SD = 0.84], and three (1.32%) made a fifth attempt [mean = 10.00, SD = 0]. The majority of students (76%, n = 174) completed their first attempt at the quiz in the first week of the semester, while 20.7% (n = 47) completed the quiz in the second week and 2.6% (n = 6) completed the quiz after week four.

Table 1: Proportion of students who chose the correct option on their first attempt at the IRQ
We were interested in determining whether there were any significant correlations between demographic factors and students' results on their first attempt at the quiz. Spearman's rho found that there was no significant correlation between students' age and their total score on the first quiz attempt (rs = .04, p = .54). Mann-Whitney U tests found that there was no significant difference between the total score on the first quiz attempt by gender (z = -1.52, p = .13). Finally, Kruskal-Wallis tests found that there was no significant difference between the total score on the first quiz attempt according to the course that students were enrolled in [χ²(2, n = 217) = 1.66, p = .44].
We were also interested in determining whether any of the factors related to the quiz were associated with students' total scores on their first attempt. We found that students who made more than one attempt at the quiz (n = 106) had a significantly lower mean score on their initial quiz attempt (mean = 6.74, SD = 1.57) than those who only made one attempt at the quiz (mean = 7.85, SD = 1.29), according to a Mann-Whitney U test (z = -5.29, p < .01). Kruskal-Wallis tests found that there was no significant difference between the total score on the first quiz attempt according to the week that students completed the quiz [χ²(2, n = 226) = 0.08, p = .95]. There was also no significant correlation between the amount of time taken to complete the quiz and the total score on the first attempt (rs = 0.02, p = .72) according to Spearman's rho.

Results of rubric-based assessment
The majority of students (59.5%) were considered to have met the cornerstone standards for Inquiry/Research according to the rubric-based assessment of the second stage of their major assignment. Table 2 indicates the proportion of students who met each of the standards as provided in the Information Literacy Framework, and whether students met the standards overall.
Again, we were interested in determining whether there were any relationships between demographic factors and total rubric scores. Total rubric scores were generated by adding together the values of not meeting the standard (1), meeting the standard (2) or exceeding the standard (3) for each of the six areas of the framework. There was a significant difference between total rubric scores by gender, as males received significantly lower scores on the rubric (mean = 11.41, SD = 3.01) than females (mean = 12.61, SD = 1.52) according to the Mann-Whitney U test (z = -2.67, p < .01). However, there was no correlation between age and total rubric scores (Spearman's rho, rs = 0.11, p = .11). There was also no significant difference between total rubric scores according to the course that students were enrolled in [χ²(2, n = 216) = 0.42, p = .81] according to a Kruskal-Wallis test.

Relationship between quiz and rubric scores
As the IRQ and the rubric were based on the same Information Literacy Framework, and attempted to measure the same construct in very different ways, we were interested in seeing whether there was a relationship between the scores on these assessments. It is important to note that we did not consider this to be a repeated measures analysis of the change in student scores from the quiz (in week 1) to the rubric (in week 6), as this would require using the exact same measure at each timepoint to make the analysis valid. Instead, we were interested in seeing whether students' scores on the two tasks were related, and whether the quiz could be a valid instrument for determining whether students would meet the standard in their written assessment. Spearman's rho indicated that there was a significant positive correlation between scores on the initial quiz attempt and the total grade given on the rubric (rs = 0.21, p = .001). Cohen (1988) classifies a correlation of 0.2 as within the small range (from 0.10 to 0.29). Although the correlation was statistically significant, quiz scores only explained 4.49% of the variance in rubric scores. The dataset was then split by demographic subgroup. There was a stronger correlation between quiz and rubric scores for those who were recent school leavers (aged 18 or 19; rs = 0.25, p < .01) as opposed to others (aged 20 years or over; rs = 0.17, p < .05). In addition, there was a stronger correlation between quiz and rubric scores for males (rs = 0.24, p < .05) than for females (rs = 0.19, p < .05). Finally, there was a stronger correlation for those students enrolled in a Bachelor of Early Childhood (rs = 0.50, p < .01) than for those in the Bachelor of Education (rs = 0.16, p = .05) or Bachelor of Physical and Health Education (rs = 0.23, p < .05).
A Kruskal-Wallis test was used to compare the initial quiz results of students who were later classified as either having met, not met or exceeded the standards according to the rubric-based assessment of their written work. There was a significant difference overall [χ²(2, n = 226) = 14.68, p < .01], and scores were as expected: those who were considered to have not met the standard (n = 44) had a mean initial quiz total score of 6.61 [1.69]; those who met the standard (n = 135) had a mean quiz score of 7.38 [1.45]; and those who exceeded the standard (n = 47) had a mean quiz score of 7.85 [1.38]. Follow-up Mann-Whitney U tests found that those who did not meet the standard in the rubric had a significantly lower mean score on the quiz than those who met (z = -2.9, p < .01) and those who exceeded the standard (z = -3.65, p < .01). However, those who were classified as exceeding the standard in the rubric did not have a significantly higher score on the initial quiz attempt than those who met the standard (z = -1.77, p = .08).
In order to evaluate the accuracy of the quiz in determining Inquiry/Research skills in real terms, we conducted further analyses. Using the mean scores given above, we determined a cut-off score of seven as representing the midpoint between the mean quiz scores of those who met and did not meet the standard according to their rubric assessment. When a quiz score of 7 is used as a cut-off point, only 27.7% (n = 20) of the n = 112 students who had an initial quiz score of 7 or less were identified as not meeting the standard according to the rubric later on. A further 57.1% (n = 64) of these students were classified as having met the standard and 15.2% (n = 17) were classified as having exceeded the standard based on their work that was assessed in the rubric.
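The effect-size reasoning above can be made concrete with a short sketch. The small band (0.10 to 0.29) is the one cited in the text; the medium and large thresholds (0.30 and 0.50) are Cohen's conventional values, while the "negligible" label and the function names are assumptions introduced here for illustration.

```python
def cohen_label(r):
    """Classify a correlation coefficient by Cohen's (1988) conventional bands."""
    size = abs(r)
    if size < 0.10:
        return "negligible"  # label assumed here; not used in the text
    if size < 0.30:
        return "small"
    if size < 0.50:
        return "medium"
    return "large"

def variance_explained(r):
    """Percentage of shared variance implied by a correlation coefficient."""
    return 100 * r ** 2

# Coefficients as reported (rounded) in the text.
reported = {
    "all students": 0.21,
    "recent school leavers": 0.25,
    "aged 20 or over": 0.17,
    "males": 0.24,
    "females": 0.19,
    "Bachelor of Early Childhood": 0.50,
}
for group, r in reported.items():
    print(group, cohen_label(r), round(variance_explained(r), 2))
```

Squaring the rounded overall coefficient of 0.21 gives roughly 4.4%; the 4.49% reported in the text is presumably derived from the unrounded coefficient.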
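The cut-off logic can be sketched as follows. The two group means come from the results above; the function names, the toy records, and the choice to treat "at or below the cut-off" as flagged are assumptions made for illustration.

```python
def midpoint_cutoff(mean_not_met, mean_met):
    """Midpoint between the mean quiz scores of two rubric outcome groups."""
    return (mean_not_met + mean_met) / 2

def outcome_breakdown(records, cutoff):
    """Percentage of each rubric outcome among students scoring <= cutoff.

    records is a list of (quiz_score, rubric_outcome) pairs.
    """
    flagged = [outcome for score, outcome in records if score <= cutoff]
    if not flagged:
        return {}
    return {outcome: 100 * flagged.count(outcome) / len(flagged)
            for outcome in sorted(set(flagged))}

# Group means reported in the text: 6.61 (standard not met), 7.38 (standard met).
cutoff = midpoint_cutoff(6.61, 7.38)
print(round(cutoff, 3), round(cutoff))  # 6.995 7

# Toy records only, not the study data.
toy = [(5, "not met"), (6, "met"), (7, "met"), (7, "exceeded"), (9, "met")]
print(outcome_breakdown(toy, 7))
```

A breakdown of this kind shows how many flagged students would actually have needed support, which is the practical question when a quiz is used for early screening.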

Discussion
In this paper, we provided details of an approach to embedding the development of Inquiry/Research skills in a first year, first semester subject using constructive alignment. We compared the scores of 227 students on two different approaches to the assessment of Inquiry/Research skills. We found that there was a positive, significant correlation between students' scores on a ten question, online quiz (the IRQ) and a rubric-based assessment of their Inquiry/Research skills. However, the relative strength of this relationship was low. Correlations were stronger for students who were male, recent school leavers (aged 18 or 19) and enrolled in the Bachelor of Early Childhood course.
Of the students with an initial IRQ score at or below the cut-off of 7, 27.7% were later classified as not meeting cornerstone standards on the rubric-based assessment of their written work. This indicates that an online quiz might be useful in terms of identifying some, but not all, students who could be offered additional workshops and resources. It was interesting that the likely cut-off score for not meeting the standard (7) was quite high, and this might reflect the difficulty of the quiz questions. An important practical finding was that the quiz was not particularly useful in determining those students who would later go on to demonstrate that they exceeded the cornerstone-level standards in Inquiry/Research. Both forms of assessment were based on the La Trobe University Information Literacy Framework but were very different in terms of the investment of staff time, and the feedback that was provided to students.
Using students' written assessment to evaluate their research skills was useful in this subject, and we found that this is the only mechanism by which students with high information literacy levels can be identified. However, it was also extremely time consuming. Although rubric-based assessment of information literacy skills is considered to be beneficial by many others (e.g., Knight, 2006), most who use this approach do so to assess information literacy and research skills at the capstone level, where class sizes may be smaller.
We suggest that, rather than choosing one form of graduate capability assessment over the other, using the quiz and rubric in tandem offers more opportunities for learning and assessment. Other authors have indicated that students' self-perceptions of their information literacy skills are particularly inaccurate, which might make them less likely to seek unprompted assistance (Dean & Cowley, 2009). Price and colleagues (2011) found that first year students initially demonstrated higher levels of confidence in their own information literacy skills than those in later year levels at university, but they revised their confidence upon the receipt of feedback in relation to their performance. Using online quizzes at the very beginning of first year may assist students in more accurately determining their capabilities in this area, and provide additional motivation for attending classes with face-to-face delivery of skills instruction, as well as for using online materials. Rubric-based assessment that is embedded within a formative assessment process can then support students in their development of these skills, and ultimately reward them for exceeding standards and doing well.
In our attempt to evaluate two methods of assessment of research skills, we were limited by a major practical issue. Frameworks and standards generally identify information literacy processes, whereas assessment of these skills is generally limited to the outputs or outcomes of these processes (Willison & O'Regan, 2005). Some of the criteria from the framework used for the rubric referred to processes that students would use, whereas teaching staff could only provide grades and feedback on the outcomes of those processes, as demonstrated in students' written assessment. This issue will persist unless researchers and university staff commit to identifying the areas of frameworks that might be practically determined using student assessments. There were some other limitations to this assessment and research. Both assessments were relatively brief considerations of students' ability in this area. Students might have had assistance from others when completing their IRQ, which may have influenced the results. There may also have been some variability in the grading of students' Inquiry/Research skills on the rubric, as inter-rater reliability could not be calculated.

Conclusion
We found that both an online quiz and a more complex rubric-based assessment of students' research skills were useful in the assessment of student graduate capabilities such as research and information literacy. As there was very little discipline focus, these findings have implications for all involved in teaching first year students. This includes library and other support staff as well as academics in a range of disciplines who aim to develop and assess student graduate capabilities and skills such as information literacy and research.