| Summary: | The Brazilian High School Exam (Enem) consists of a written essay and four tests with
45 multiple-choice items: Human Sciences and their Technologies (HS); Natural Sciences and
their Technologies (NS); Languages, Codes and their Technologies (LC); and Mathematics and
their Technologies (MT). The exam is used to select candidates for higher education
courses. This use poses challenges for the exam's format, such as producing precise scores
for a diverse population, minimizing the effect of item position on performance, and
constructing equivalent tests. Administering Enem in a Computerized Adaptive Testing (CAT)
format can help address these challenges. Therefore, the objective of this work was to
develop an Enem CAT that is more efficient, precise, and secure than the current format. We divided
the thesis into two Articles and two Products. Article 1 compared sample distributions for
calibrating items under the three-parameter logistic model of Item Response Theory. We used
information from the four Enem 2020 tests to simulate the responses of 5,040 participants
drawn from three types of sampling designs: random, rectangular, and shifted. There was no
significant difference between the designs for the discrimination parameter. The shifted sample
recovered the difficulty parameters in HS better than the rectangular sample and in NS better
than the random sample. The shifted and random samples recovered the pseudo-guessing
parameters better than the rectangular sample in all four tests. The results do not indicate
that any one type of sample is superior for calibrating the Enem 2020 items. Product 1 consisted of a
statistical package for simulating CATs in R. Article 2 evaluated the progressive restricted
(PR) exposure control method with different acceleration parameters in a CAT in
terms of efficiency, precision, and security. We manipulated the item selection method
(Random, Maximum Fisher Information - MFI - and PR with two acceleration parameters) and
the stopping criterion (fixed length of 20 or 45 items, standard error of 0.30, or error
reduction of 0.015 combined with standard error of 0.30) and simulated the administration of 16 CAT
conditions for each test. Finally, we simulated the administration of Enem 2020 in a linear format
and compared it to the 20-item fixed-length CAT. Test length was greater with MFI. In the
fixed-length CATs and with the stopping criterion of standard error, MFI and PR (with both
acceleration parameters) had similar results for precision. With the error reduction criterion, PR
performed worse. Security increased as the acceleration parameter increased. The adaptive
version of Enem had higher precision than the linear version. Product 2 of the thesis was the
publication of an Enem CAT application using the algorithm selected in Article 2. We
concluded that a CAT can reduce the length of Enem and improve its precision and
security.
|
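
To make the terms in the summary concrete, the following is a minimal Python sketch of the three-parameter logistic model, its item information, Maximum Fisher Information (MFI) selection, and a standard-error stopping check. It is an illustration only, not the thesis's R package: the function names, the toy item bank `ITEMS`, and the use of a logistic scaling constant of 1 are assumptions made here.

```python
import math

def p_3pl(theta, a, b, c):
    """Probability of a correct response under the 3PL model.
    a: discrimination, b: difficulty, c: pseudo-guessing.
    Assumes a scaling constant of 1 (no 1.7 factor)."""
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))

def info_3pl(theta, a, b, c):
    """Fisher information of one 3PL item at ability theta."""
    p = p_3pl(theta, a, b, c)
    return a ** 2 * ((1.0 - p) / p) * ((p - c) / (1.0 - c)) ** 2

def select_mfi(theta, items, administered):
    """Index of the not-yet-administered item with maximum information."""
    candidates = [i for i in range(len(items)) if i not in administered]
    return max(candidates, key=lambda i: info_3pl(theta, *items[i]))

def standard_error(theta, items, administered):
    """Asymptotic standard error: 1 / sqrt(total test information)."""
    total = sum(info_3pl(theta, *items[i]) for i in administered)
    return 1.0 / math.sqrt(total)

# Hypothetical item bank of (a, b, c) triples, invented for illustration.
ITEMS = [(1.2, -1.0, 0.2), (1.2, 0.0, 0.2), (1.2, 1.5, 0.2)]

first = select_mfi(0.0, ITEMS, administered=set())
# A CAT with the standard-error criterion would stop once this is <= 0.30.
se_after_first = standard_error(0.0, ITEMS, {first})
```

With only three toy items the standard error stays far above 0.30, so a variable-length CAT would keep selecting items. The PR method evaluated in Article 2 modifies this selection step to control item exposure and is not reproduced here.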