(PDF) Enhancing Classroom Assessment: Practical Insights into Turkish EFL Teachers' Strengths and Challenges
Academia.edu uses cookies to personalize content, tailor ads and improve the user experience.
By using our site, you agree to our collection of information through the use of cookies.
To learn more, view our
Privacy Policy.
Enhancing Classroom Assessment: Practical Insights into Turkish EFL Teachers' Strengths and Challenges
Journal of Language Education and Research, 2026, 12 (1), 96-128
Research Article

Enhancing Classroom Assessment: Practical Insights into Turkish EFL
Teachers’ Strengths and Challenges
İrem Aydın 1*

Devrim Höl2**

ARTICLE INFO

ABSTRACT

Received: 13.11.2025
Revised form:09.02.2026
Accepted: 10.02.2026
Doi:10.31464/jlere.1822952

This study aimed to investigate perceptions of Turkish in-service
English teachers concerning their strengths and weaknesses as well as
the possible sources for their challenges and their needs in foreign
language (L2) assessment. For data management and analysis,
MAXQDA Analytics Pro 2022 was employed. The findings of the study
revealed that Turkish English as a Foreign Language (EFL) teachers
possess a strong sense of self-perception and high levels of perceived
competence in their abilities to effectively ensure the reliability and
validity of language assessments. Conversely, it was discovered that the
teachers exhibited a perception of inadequacy and lack of proficiency in
the rating procedure of L2 assessments, as well as in the design of tests
and formulation of test items. The teachers attributed the origins of these
issues to external factors and the education system.

Keywords:
language assessment
testing
evaluation
perception
strengths
weakness
Acknowledgments

This article was extracted from the author's M.A. thesis titled "Investigating
Perceptions of Turkish EFL Teachers Upon L2 Assessment ".

Statement of Publication Ethics

The ethics committee approval for the current study was obtained from
Pamukkale University Social and Human Sciences Research and Publication
Ethics Board on 10/03/2022 with decision number 05-5.

Authors’ Contribution Rate

The authors’ contribution rates are equal.

Conflict of Interest

The authors declare that there is no conflict of interest associated with this
study.
Aydın, İ., & Höl, D. (2026). Enhancing classroom assessment: practical insights
into Turkish EFL teachers’ strengths and challenges. Journal of Language
Education and Research, 12 (1), 96-128.

Reference

1*

PhD student, İrem AYDIN, ORCID ID: https://orcid.org/0000-0001-6750-9328, Pamukkale University, Department of
English Language Teaching,
[email protected]
2**
Dr. Devrim HÖL, ORCID ID: https://orcid.org/0000-0001-5151-2581, Pamukkale University, Department of English
Language Teaching,
[email protected]
Copyright © 2026 by JLERE- https://dergipark.org.tr/en/pub/jlere
ISSN: 2149-5602

97
Enhancing classroom assessment: practical insights…

Introduction
The Role of Assessment in Education
Assessment is a fundamental component of the educational process because teaching and
assessment are inherently interconnected, especially serving not only to enhance and inform
learning but also to systematically monitor learners’ progress throughout instruction
(Çetinkaya, 2020). Effective assessment provides teachers with feedback about student
progress, informs instructional decisions, and ensures that learning objectives are achieved
(Rogier, 2014; Sevimel-Şahin & Subaşı, 2019). It also contributes to continuous educational
improvement by guiding curriculum reform and supporting evidence-based teaching practices
(Al-Mahrooqi, 2017). As Rea-Dickins (2004) states, assessment is not an adjunct to instruction
but a natural part of the teaching act. Similarly, Alderson et al. (2017) emphasize that
assessment and teaching depend on each other for their effectiveness. The assertion that “the
quality of instruction in any classroom turns on the quality of the assessments used there”
underscores this connection (Stiggins, 1999, p. 20).
Within this framework, teachers play a critical role in maintaining the quality of
assessment. They are not merely users of tests but decision-makers who interpret results,
diagnose learning needs, and plan subsequent instruction (Mertler & Campbell, 2005). Thus,
possessing important levels of assessment competence is not optional but central to teachers’
professional identity. Inadequate assessment skills can lead to unreliable judgments about
student performance and weaken instructional quality (Zulaiha et al., 2020). Consequently, the
notion of assessment literacy has emerged as an essential element of teacher expertise,
particularly in contexts where teachers are both assessors and facilitators of learning.

Language Assessment Literacy (LAL): Concepts, Training, and Needs
In language education, assessment plays an especially key role because it directly evaluates
learners’ linguistic knowledge and communicative abilities. Green (2013) defines language
assessment as the process of collecting and interpreting data to make valid inferences about a
person’s language proficiency. In this sense, Language Assessment Literacy (LAL) represents
teachers’ ability to understand, design, implement, and interpret assessment practices in ways
that align with pedagogical goals. It combines theoretical understanding, technical skills, and
reflective awareness by enabling teachers to use assessment results to support learning rather
than simply measure it (Fulcher, 2012).
However, extensive research has shown that both pre-service and in-service EFL teachers
often possess insufficient assessment literacy. Studies in various contexts (Büyükkarcı, 2019;
Fard & Tabatabaei, 2018; Najib Muhammad & Bardakçı, 2019; Ölmezer-Öztürk & Aydın,
2019; Tsagari & Vogt, 2017; Vogt & Tsagari, 2014) consistently report that teachers’
theoretical knowledge rarely translates into classroom practice. Although they understand key
assessment concepts, they frequently struggle with applying them effectively or designing valid
instruments.
© 2026 JLERE, Journal of Language Education and Research, 12(1), 96-128

İrem AYDIN & Devrim HÖL
Several studies have investigated how LAL can be developed through formal education and
professional training, and revelaed that language assessment courses improved pre-service
teachers’ theoretical and operational knowledge, helping them connect assessment design with
instructional objectives (Giraldo & Quintero, 2019), however, it was also found that LAL
courses were primarily theory-based and lacked practical components due to time constraints
(Sevimel-Şahin & Subaşı, 2019; Şahin, 2019). These findings give us a clear implication that
assessment literacy development requires not only conceptual instruction but also opportunities
for practice and reflection.
From another perspective, research on teachers’ needs for training further supports this
conclusion. Teachers sought detailed procedural guidance for designing, implementing, and
validating assessments (Fulcher, 2012), and they reported insufficient knowledge of formative
assessment practices as well as limited awareness of assessment fairness, particularly in terms
of avoiding linguistic bias, ensuring clarity of test instructions, and aligning assessment tasks
with learners’ proficiency levels (Türk, 2018). In addition, participants from various studies
conducted stated they need more skill-specific training, especially in assessing writing and
speaking, and practical sessions on ensuring objectivity and reliability (Semiz & Odabaş, 2016;
Ölmezer-Öztürk &Aydın, 2019; Berry et al., 2019), and teachers prefer hands- on workshops
and ready-to-use classroom tools to purely theoretical instruction (Canlı &Altay, 2024).
Collectively, these studies emphasize the importance of continuous, context-sensitive
professional development for improving teachers’ LAL.

Teachers’ Beliefs, Practices, and the Turkish Context
While training improves theoretical knowledge, teachers’ beliefs, self-efficacy, and
contextual conditions strongly influence how this knowledge takes place and applied in
practice accordingly. In many studies it was found that teachers understood basic assessment
principles but could not consistently implement them due to overcrowded classrooms and
limited teaching time (Shim, 2009), and their theoretical knowledge did not work effectively
in classrooms (Jannati, 2015; Öz & Atay, 2017) while Djoub (2017) concluded that even
trained teachers often felt underqualified and requested more autonomy and feedback. Rad
(2019) further demonstrated that teachers with strong assessment literacy produced more
coherent instructional plans and exhibited better awareness of learners’ strengths and
weaknesses. Conversely, from another perspective, institutional policies, limited resources, and
frequent curriculum change constrained teachers’ ability to apply assessment knowledge
effectively (Zulaiha et al., 2020; Luthfiyyah et al., 2020).
In local context, similar challenges persist. Han and Kaya (2014) found that EFL teachers
prioritize grammar and reading while viewing speaking and writing as the most difficult skills
to assess. Saka (2016) reported that teachers rely on written and multiple-choice exams because
of the dominance of high stakes testing and insufficient time for communicative assessment.
Karagül et al. (2017) confirmed that crowded classrooms and limited teaching hours further
reinforced the use of traditional methods. More recent studies show that these practices remain
widespread. Arslan and Üçok-Atasoy (2022) revealed a mismatch between curriculum

© 2026 JLERE, Journal of Language Education and Research, 12(1), 96-128

99
Enhancing classroom assessment: practical insights…

expectations and classroom realities, as teachers continued to use pen-and-paper tests rather
than integrated skill assessments. Similarly, Çimen (2022) found that although the curriculum
promotes multiple assessment tools, most teachers neglect listening and rely heavily on
grammar-focused exams. Teachers also expressed negative emotions about assessment, citing
workload, low student motivation, and large classes. Genç et al. (2020) added that while
teachers were confident in assessing speaking, their understanding of writing assessment
remained limited.
Overall, the literature suggests that assessment literacy is not a unidimensional skill but a
multidimensional construct encompassing knowledge, beliefs, and contextual adaptability
(Ülper & Bağcı, 2012). Teachers’ ability to assess effectively depends not only on what they
know but also on how institutional and emotional factors shape their decisions. Although
quantitative studies have documented teachers’ strengths and weaknesses, qualitative research
exploring how they interpret and enact assessment literacy in authentic classroom contexts
remains limited. Addressing these gaps through practice-based and context-aware professional
development will be essential to bridge the persistent divide between assessment theory and
classroom reality in language education.
Research Aim and Research Questions
The present study intends to explore the perceptions of Turkish EFL teachers regarding L2
assessment. For this purpose, their opinions on their strengths and weaknesses, challenges and
the reasons for their challenges, and their needs for L2 assessment were investigated.
Methodology
Research Design/Model
A qualitative descriptive research design was selected for this study because it is the
most practical approach for conducting initial research on themes that have not yet been fully
explored (Hill et al., 1997). Moreover, Creswell (2013) notes that qualitative research is
employed when it is necessary to explore a subject or a problem and to fully understand the
subject. In addition, qualitative research seeks to comprehend certain issues or circumstances
by exploring people's attitudes, behaviors, and background for their actions (Kaplan &
Maxwell, 2005). As a result, this study utilized a qualitative descriptive research design that is
entirely data-driven and tries to create codes from the data during the research process.
Additionally, this approach ensures a logical, clear, and succinct descriptive overview of the
data's informative components (Lambert & Lambert, 2012).
Publication Ethics
This study was conducted in accordance with the principles of research and publication
ethics. The required ethics approval was obtained from Pamukkale University Social and
Human Sciences Research and Publication Ethics Board on 10 March 2022, with the decision
number 05-5.

© 2026 JLERE, Journal of Language Education and Research, 12(1), 96-128

İrem AYDIN & Devrim HÖL
Participants
Turkish in-service EFL teachers from various levels of teaching and all regions of
Türkiye made up the study's research group. For this study, voluntary-response sampling
design which aims to collect accurate and thorough data from volunteer subjects (Murairwa,
2020) was used, and those who responded to the interview were approved as participants of
this study. Regarding the number of participants, the study followed Kvale’s (1996)
recommendation to “interview as many subjects as necessary to find out what you need to
know” (p. 101). Thus, due to data saturation, the data collection process was ended once
recurring and similar responses began to emerge. Consequently, there were 116 Turkish inservice EFL teachers responding to the interview in total. The majority of the sample (64.7%,
n = 75) were female, while 35.3% (n = 41) were male in-service EFL teachers. Almost half of
the teachers (49.1%, n = 57) worked at secondary schools, 30.2% (n = 35) worked at high
schools, 18.1% (n = 21) worked at primary schools, and 2.6% (n = 3) were employed at
universities.
Data Collection and Procedure
A written online semi-structured interview was utilized to gather information about the
strengths and weaknesses of Turkish in-service EFL teachers, as well as the challenges they
face and their needs regarding language assessment. The interview was generated by the
researchers and the procedures for generating an interview in the relevant literature (Benson &
Clark, 1982; Creswell, 2002; Kvale, 1996; Rose et al., 2019) served as the basis for the
interview's design (Figure 1). The researchers designed a semi-structured interview consisting
of six open-ended questions in line with the purpose of this study. The interview consists of
two parts. Prior to delving into the subsequent sections, the researchers provided an
introduction encompassing their identities, the research's objective, the assurance of
confidentiality pertaining to the interviewee and the collected data, instructions regarding the
handling of research findings, and the acquisition of informed consent from the interviewee. In
the first part, the teachers were requested to provide demographic information, including their
gender, the type of school they are employed at, and the city in which they work. In the second
part, teachers were asked to answer open-ended questions. To avoid potential errors and
address complex challenges related to data collection (Janghorban et al., 2014), a pilot study
was conducted. This preliminary study involved eight English language teachers and provided
an opportunity for researchers to make necessary adjustments and enhancements to the main
study. As a result, a total of 116 English language teachers participated in the interview.

© 2026 JLERE, Journal of Language Education and Research, 12(1), 96-128

101
Enhancing classroom assessment: practical insights…

Figure 1. Instrument Development Stages

Data Analysis
The research employed a summative content analysis approach, as described by Hsieh
and Shannon (2005), to identify and quantify certain words and textual elements. This method
was utilized to ascertain the manner in which these words and content pieces were employed
within the relevant context. For data archiving, data management and analysis were performed
using MAXQDA Analytics Pro 2022.
First, in MAXQDA, the interview questions were allocated within the code window as
categories, and then, the next step involved preparing the raw data for analysis. This entailed
reading the data sequentially and assigning a code label or term to each text segment (Creswell,
2016). This study adopted the terms code and category as overarching terms in data analysis.
As per the definition given by Saldaña (2016, p. 4), “a code in qualitative inquiry is most often
a word or short phrase that symbolically assigns a summative, salient, essence-capturing,
and/or evocative attribute for a portion of language-based or visual data.” The coding
procedures encompass two primary cycles that incorporate distinct coding methods, namely
the first cycle and the second cycle coding methods. During the initial coding phase, a datadriven approach was adopted without predefining categories (Kuckartz & Rädiker, 2019).
Similar events and actions were grouped together through an inductive segmentation of the
data. At the end of this phase, 116 initial codes were generated.
In the subsequent categorization phase, the initially generated codes were examined,
merged, and reorganized into broader thematic categories. Namely, identification of frequently
occurring and marginal codes was conducted, followed by the reorganization of the dataset by
the removal of synonymous codes and the elimination of unnecessary codes. Finally, the most
representative codes were selected (Boeije, 2010). Following the initial cycle, a second-order
labeling technique known as subcoding was employed to further refine the codes (Saldaña,
2016) (see Figure 3.4). Subsequently, the encoded data were gathered and organized into
categories characterized by overarching and inclusive headers. Consequently, the initial set of
116 open codes was condensed to 52 codes, with certain codes containing sub-codes, which
were then classified into five distinct categories.
The trustworthiness of the study was ensured in line with Lincoln and Guba’s (1985)
criteria of credibility, transferability, and dependability. Credibility was established through

© 2026 JLERE, Journal of Language Education and Research, 12(1), 96-128

İrem AYDIN & Devrim HÖL
peer debriefing and independent coding by the researcher and a second coder, followed by
consensus meetings and the use of direct participant quotations. Detailed descriptions of the
research context and transparent reporting of the research design and analytical procedures
further supported transferability and dependability.
Results
The study's findings are reported in a descriptive manner. To help the reader
comprehend the findings, they are presented based on the research questions and are described
in a detailed way through tables and figures. The emerging categories and codes are quantified
by establishing the frequency and percentage of each of them. Lastly, the category and code
frequencies and percentages are accompanied by several quotes. This approach allows for a
comprehensive analysis where numerical data and textual descriptions complement each other,
enabling robust conclusions for the reader. As for quotations, T stands for Turkish EFL inservice teacher, and the number denotes the participant's rank in the study.
1. The Strengths of the Participants in L2 Assessment
(What are the strengths of Turkish EFL teachers in the testing, evaluation, and assessment
process?)
The primary objective of this study was to ascertain the strengths of L2 assessment. Figure 2
displays the codes representing the strengths of the teachers. According to Figure 3, the analysis
of interview responses yielded a total of eight different strengths codes, with five of these codes
further divided into sub-codes.
Figure 2. Frequency Graph Showing the Number of Coded Sections Related to Teachers’ Strengths

The code pertaining to ensuring reliability emerged as the prevailing theme inside this
particular domain. This was followed by ensuring validity, test design and item writing,
personal factors, and the rating procedure, respectively. The three remaining strengths were
identified by the teachers as the least prevalent themes among the identified codes. The
subsequent sections provide a comprehensive explanation of the emerging codes and subcodes,
accompanied by individual tables for each.

© 2026 JLERE, Journal of Language Education and Research, 12(1), 96-128

103
Enhancing classroom assessment: practical insights…

Figure 3. Categorization of Views about Teachers' Strengths

Ensuring Reliability
Figure 4. Sub-Codes of Ensuring Reliability

To be fair (Teacher22-fairness)
…….and I think I can ensure that I can be objective as much as I can and I am proficient enough. (T87objectivity)
I keep the number of exam questions high. I try to prepare questions similar to the activities we do in
the lesson. (T14-reliability of language tests)
…using various techniques to approach the answers given by the students impartially (preparing and
scoring the answer key clearly, closing the names of the students, etc.) (T12-rater-reliability)

© 2026 JLERE, Journal of Language Education and Research, 12(1), 96-128

İrem AYDIN & Devrim HÖL
When asked about their areas of strength, the most prevalent response among teachers related
to their dedication to preserving reliability. This code comprised four distinct sub-codes,
namely fairness, objectivity, reliability of language tests, and rater-reliability. The table reveals
a clear trend wherein teachers indicated that they were fair, followed by objective. Specifically,
they claimed to have prioritized the development of reliable language examinations by
including a sufficient number of questions and constructing assessments that closely resemble
typical classroom exercises. Additionally, the teachers indicated that they implemented various
measures to mitigate potential biases. These measures included the creation of an answer key
before the examination and the blinding of student names during the evaluation of the exam
papers. The aforementioned statements reflect certain perspectives that align with the
viewpoints of the educators.
Ensuring Validity
Figure 5. Sub-Codes of Ensuring Validity

Most of the exams I prepare are those with high content validity. I pay attention to asking questions about
every topic covered in my classes. (T72-content validity)
I prepare exams that reflect the course content, designed in accordance with the teaching activities, and
measured the achievements, and took the necessary precautions against cheating and applied for the
exams. (T107-content validity)
When using assessment and evaluation tools, I believe I can successfully match students’ level of
proficiency and course requirements and aims. (T37-level-based assessment)
To determine their levels correctly and to make assessments according to their levels. (T77-level-based
assessment)

The sub-code derived from the participants' responses to the initial question presented in Figure
5 indicated that the educators demonstrated proficiency in guaranteeing content validity by
posing questions related to all subjects addressed during the instructional session. Furthermore,
they exhibited attentiveness to the diverse levels of students' abilities during the assessment
process, recognizing that each class or individual student's proficiency level may vary. The
statements mentioned above provide a glimpse into the perspectives held by certain educators
regarding this matter.

© 2026 JLERE, Journal of Language Education and Research, 12(1), 96-128

105
Enhancing classroom assessment: practical insights…

Test Design and Item Writing
Figure 6. Sub-Codes of Test Design and Item Writing

Including every type of questions including subskills for each skill (Open-ended, multiple choice,
listening, writing...etc.) (T82-using different item types)
I do not think that I am addressing a part of the class when I make assessment. I pay attention that my
questions are at the upper, intermediate, and elementary level. (T49- preparing questions with different
difficulty levels)
I make sure that my rubric/scale is reliable, valid and effective. (T94- preparing a reliable and valid
measurement tool)

The creation of tests and writing of items emerged as a frequently cited strength by the
teachers. In relation to this matter, teachers expressed their ability to create items in various
formats and generate questions of varying levels of difficulty, with the aim of achieving a
uniform assessment procedure and ensuring equitable opportunities for all students. According
to the data presented in Figure 6, educators articulated four distinct perspectives regarding this
matter. The responses provided by the educators regarding their individual answers are
included in the corresponding sections above.
Personal Factors
Figure 7. Sub-Codes of Personal Factors

Patient, Collaborative, Self-confident Flexible, Young and Dynamic (T30,68,100,108,105-personality
traits)
Being an educator who loves to learn is my strongest point. Attending webinars has become my hobby.
I am keeping up with the updates. (T84-being eager to improve themselves)
My strong point is that I like to work collaboratively with not only my colleaques and also my students.
(T83-working in cooperation)

© 2026 JLERE, Journal of Language Education and Research, 12(1), 96-128

İrem AYDIN & Devrim HÖL
The code derived from the responses to the initial question, as illustrated in Figure 7, shows
the positive personal attributes and behaviors exhibited by instructors in relation to assessment.
According to the statements provided by the teachers themselves, it can be inferred that they
expressed a curiosity to acquire knowledge and displayed a willingness to embrace
advancements in the field of assessment. Moreover, they demonstrated proficiency in fostering
collaboration with their peers.
The Rating Procedure
This code addresses the positive aspects of teachers in relation to the rating process. During
this procedure, the individuals placed significant emphasis on their proficiency in evaluating
examination papers, employing assessment in a constructive manner, and delivering feedback.
The aforementioned ideas presented align with the perspectives held by the teachers.
Figure 8. Sub-Codes of the Rating Procedure

Grading exam papers, calculating scores, (T11-grading exam papers)
And even if there is a misspelling, if what he wants to say is correct, I accept the answer, I have a
motivating aspect to the student. (T79-using assessment in a positive way)
As soon as I took the exam, I evaluate them immediately. I evaluate their exams, and then, the paper is
self-exaluated by the student and I tell him to calculate his/her score and examine the paper before it is
erased from his memory. I believe this is effective and students remember their mistakes. (T69-providing
feedback)

Formative Assessment
A substantial number of teachers have shown their inclination towards utilizing
formative assessment, identifying it as a key aspect of their assessment practices. The educators
elucidated their approach to assessment as a systematic procedure, employing portfolio
assessment as well as regular administration of quizzes to monitor students' progress and
evaluate their performance. The following are the teachers' verbatim statements regarding the
implementation of formative assessment:
Administering quizzes and monitoring students' progress frequently. Allow students and parents to
make up the difference without having to save. Checking homework regularly. (T67).
In order to assess processes, I also employ portfolio works. (T72)

Assessment Literacy and Experience
One of the less frequently acknowledged qualities within this particular category was
to possess a comprehensive understanding and practical knowledge in the field of assessment.
© 2026 JLERE, Journal of Language Education and Research, 12(1), 96-128

107
Enhancing classroom assessment: practical insights…

The subsequent excerpts consist of direct quotations extracted from the perspectives of teachers
about assessment literacy and experience codes.
-Graduating from ELT (I attended testing lessons)
-Having a master's degree
-Having Ph.D. degree
-I received in-service training on testing.
-I attended an extra course on testing.
- Having worked as the Exam Coordinator in my institution
-and my experience here (T4)
-I read studies/articles about various test and task preparation processes and made applications. (T7)
I took assessment and evaluation courses in undergraduate, graduate, and doctorate degrees. While we
learned practical information in the undergraduate course, we learned theoretical information in the
graduate course. It was like a combination of the two in the doctorate. I can say that my strength is that I
can easily prepare an exam by looking at the syllabus and aims of the school. I owe this to the courses I
took and to my exam experiences. (T10)

Assessing Four Skills
Based on the findings of the interview, it was observed that a less frequently expressed
perspective regarding strengths in assessment related to the assessment of four distinct skills.
The findings of this study indicate that a limited number of teachers incorporate assessments
of all four language skills into their examinations and perceive this as an area of expertise. The
subsequent statements reflect the perspectives of teachers regarding this topic:
I know how to assess all four skills (T1)
The use of four basic skills in foreign language teaching, their development with grammar and
vocabulary (T18)
Preparing exam questions to assess four skills (T46)

2. The Weaknesses of the Participants in L2 Assessment
(What are the weaknesses of Turkish in-service EFL teachers in testing, evaluation, and
assessment process?)
Figure 9. Frequency graph showing the number of coded sections related to the weaknesses

© 2026 JLERE, Journal of Language Education and Research, 12(1), 96-128

İrem AYDIN & Devrim HÖL
The second primary objective of this study was to examine the deficiencies exhibited by
Turkish EFL teachers in the domain of second language (L2) assessment. The study of
MAXQDA software yielded a total of twelve codes categorized as weaknesses. Among these
codes, three were found to have sub-codes, as depicted in Figure 10. Upon careful examination
of the viewpoints expressed by all the teachers, it became evident that the rating
procedure depicted in Figure 4.3 was seen as a major weakness. Furthermore, the teachers
commonly identified test design and item composition, personal variables, assessment literacy,
and experience, as well as deficiencies in assessing speaking skills and the absence of
acknowledging any deficits as the primary issues. The teachers had weaknesses in several
different areas, including the rating technique, test design and item composition, personal
variables, and assessment literacy and experience, as shown by the codes. Based on the data
shown in Figure 9, it is evident that the teachers' least frequently reported areas of weakness
were the assessment of writing, objectivity in assessment, and the utilization of technology in
assessment.
Figure 10. Categorization of views about teachers' weaknesses

The Rating Procedure
Figure 11. Sub-Codes of the Rating Procedure

© 2026 JLERE, Journal of Language Education and Research, 12(1), 96-128

109
Enhancing classroom assessment: practical insights…

to be very merciful, (T54), I do not like talented students getting low grades (T74), According to the
questions, I ignore minor mistakes and keep the notes abundant (T86-rater bias)
Grading exam papers is very time consuming for me. When I think about preparing multiple choice
items and doing optical reading, I think that multiple choice questions will be insufficient. (T79grading exam papers)
Not knowing which points to score in written texts (T11-grading writing)
When the number of students is high in assessment and evaluation, I am insufficient to give
feedback and to deal with them individually. (T88-providing feedback)
I'm not always sure if he delivers the correct answer because he is so familiar with the test questions.
(T5-chance success)

Upon inquiry into the deficiencies of L2 assessment among teachers, it was determined that
the most salient themes pertain to the rating procedure. The process was categorized into six
subheadings based on the perspectives of the teachers. According to the findings presented in
Figure 11, a considerable number of teachers had a perception of being deficient in terms of
rater bias. Grading exam papers and writing came next, with the remaining three being the least
acknowledged weaknesses in this respect. The aforementioned comments provided by the
teachers regarding this code are shown in the order in which they were delivered.
Test Design and Item Writing
Figure 12. Sub-Codes of Test Design and Item Writing

I do not prepare the questions myself; I compile from ready-made questions. This is the weakness.
(T69-item writing)
Since I usually have A2 level students, I may have difficulty in assessing high level students such as B1
or B2. (T1-level-based test design)
Lack of using various question types (T12-using different item types)

According to the teacher reports, after the rating procedure, a considerable portion of the
teachers' challenges stemmed from test design and item writing. They expressed three distinct
perspectives regarding their shortcomings in this area. Among the emerging sub-codes, the
greatest challenges lay in item writing, followed by difficulties in level-based test design and
the use of various item types. The preceding statements present the perspectives of the teachers
from which these sub-codes were derived.
Personal Factors
The following replies to the question regarding the teachers' weaknesses in language
assessment concern their personal traits and were categorized under the personal factors code.
The sample sentences below, as articulated by the teachers, describe the influence of personal
factors on their assessment processes. Some teachers feel the need to make their assessment
more flexible and student-centered, while others point out how being hypercritical affects their
assessment. Prolonged assessment processes or busy personal circumstances lead some
teachers to spend less time on assessment, while others emphasize the importance of patience
in online teaching. These factors shape teachers' approaches to assessment and, in this context,
have a significant impact on assessment practice.
Sometimes my obsessions cannot make my assessment more static and result oriented. Also, I cannot
turn the evaluation into a solution. (T19)
Being hypercritical (T45)
Spending limited time due to the excessive intensity of my private life. (T67)
I reinforced my patience in evaluating my students in my online classes. (T84)
Being comfort (T108)

Assessment Literacy and Experience
Having insufficient assessment expertise and experience was one of the most frequently
cited difficulties in the teachers' responses addressing weaknesses in language assessment.
Views on this topic were classified under the assessment literacy and experience code. Several
opinions expressed by the teachers on this matter are given below. These quotations emphasize
that one of the problems most affecting teachers' assessment practices is a lack of assessment
knowledge and experience. The teachers stated that their training in this field was insufficient
and pointed out that this lack of knowledge was at the root of the difficulties they experienced
in assessment practice. Moreover, they wished to overcome these weaknesses by receiving
more training, especially on assessment and evaluation processes. These quotes highlight the
teachers' desire for professional development and for improving their skills in assessing
students, which suggests that education systems should provide more support and resources in
this regard.

-I do not think that I received sufficient training on assessment and evaluation. (T2)
-How to ensure validity and reliability of an exam, item discrimination, and factor analysis are
important, and I would like to learn more.
- I can get more training on item writing.
-I would like to do practical studies on “the process of adapting tests” (T4)
-I do not have enough knowledge and experience (T13)
-Insufficient information given for assessment and evaluation before starting teaching life. (T38)
-I wish I had received more training in measurement, I wish I could make more creative and
courageous decisions and use methods. (T115)

Having No Weaknesses
When the teachers were asked about their weaknesses in language assessment, responses
stating that they had no weaknesses were frequently encountered. The participating teachers'
views on this code are given below. One teacher states that she has no shortcomings in her
assessments but that there is definitely room for improvement, while another states
unequivocally that she has no weaknesses in this area. A third, while acknowledging the
possibility of potential weaknesses, explains that she is not currently aware of any. One teacher
confidently states that they have no shortcomings and that they have strengthened their
assessment skills by taking the necessary seminars. These responses reflect a range of attitudes,
from confidence in their assessment abilities to openness to the possibility of unidentified
weaknesses; what matters is a willingness to treat potential weaknesses as areas for
improvement and a readiness for professional growth.
I have not experienced any shortcomings in my measurements, but there are definitely things that I
need to improve. (T43)
There must be, but if I were aware of it, I would have already done my best to change it. (T67)
I don't think I have any shortcomings, I took all the necessary seminars (T104)

Lack in Assessing Speaking Skills
Upon analyzing teachers' perspectives on the deficiencies in language assessment, it
was determined that, in addition to other abilities, they identified weaknesses in assessing
speaking proficiency. Furthermore, they reported encountering challenges related to time
constraints and inadequate physical conditions. Given the significance of each of the four
language abilities, they were categorized individually under weaknesses, with different codes
assigned to each. Below are a few examples of sentences stated by the teachers, which served
as the basis for the development of this code. These quotes reflect teachers' perceptions of the
difficulties in assessing speaking skills. In particular, one teacher states that there is not enough
time and that physical conditions are inadequate for assessing speaking. Another notes that,
since speaking is taught little in public schools, deficiencies in assessing it may follow. A third
states that she has difficulty assessing speaking skills and is unable to conduct a speaking
exam. In addition, some teachers mention general shortcomings in assessing speaking, and one
attributes the inability to assess speaking skills partly to student motivation. These quotes
reflect the practical problems teachers face in assessing speaking skills and the difficulties
associated with student motivation.
There are not enough time and physical conditions to assess and evaluate speaking, and we do
not have a chance to assess speaking skills. (T37)
Since speaking cannot be taught much in public schools, there may be a lack of speaking. (T39)
and not being able to do the speaking exam (T89)
Having deficiencies in assessing speaking skill (T107)
Unfortunately, I am insufficient in assessing speaking skills, which is a bit related to student
motivation. (T113)

Lack in Assessing Skills
Teachers frequently discussed their weaknesses in assessing speaking abilities.
However, they also acknowledged, albeit less frequently, their inability to effectively assess
the four language skills and their failure to design comprehensive exams that encompassed all
of these skills. The participant teachers provided explanations for the deficiency in assessing
the four skills. For instance, they pointed to the difficulty of preparing an assessment that
covers all four skills and noted that the inadequate application of these skills in lessons
contributes to the problem. In particular, the teachers reported difficulty in preparing questions
for each skill, which led them to question the feasibility of a fair assessment process that
includes all four skills. These quotes emphasize the importance of fair and effective assessment
of the four language skills and reflect the need to address this challenge in language education
systems. The need for support focused on teachers' professional development and for access to
better resources for assessing language learners' skills is therefore clear.
Examinations should vary and we should evaluate students with exams where we can address the four
language skills rather than just assessing writing-listening. (T5)
As a result of the fact that the four language skills cannot be applied sufficiently in the lessons, some of
these areas are neglected in the assessment and evaluation. (T12)
We cannot make an assessment that covers all four areas of foreign language. (T22)
Not being able to prepare questions for every skill (T33)
I would like the exams in English to be more comprehensive; they should be at a level to assess all
skills. (T42)
not being able to assess and evaluate skills (T68)

3. The Possible Sources for The Challenges in L2 Assessment
(What are the possible sources for the challenges stated by Turkish in-service EFL teachers
in foreign language assessment?)
The third inquiry centered on the perspectives of the participating teachers regarding the
origins of the challenges they encountered in language assessment. Analysis of the responses
revealed that the teachers referenced external factors and the education system as influences
on language assessment, as depicted in Figures 13 and 14.
External Factors
Figure 13. Frequency Graph Showing the Number of Coded Sections Related to the External Factors


The study revealed that factors outside the teachers' own influence had an impact on language
assessment. The results pertaining to external factors, as presented in Figure 13, encompassed
five codes observed with varying frequency.
Parents.
Parental pressure was found to be the most influential factor among the external factors
affecting teachers' language assessments. Participant teachers wrote the following sentences
to express their thoughts on this issue:
Not being able to give the children the grade they deserve because of the parents because I teach in a
private school (T3)
Parent factors can sometimes affect my evaluation process. Parental pressure can lead me badly. (T19)
Parents put pressure on grades. Most parents do not know the capacity of their child. Or they do not
want to know. So, they expect a lot from us. They do not like the grades given. (T62)
Parents expect high course grades (T25)
Parent ignorance (T52)
Unnecessary reactions and demands of parents and students (T67)

These quotes show that parents are a crucial factor influencing teachers' language assessments.
Parents' expectations and demands concerning their children's grades can affect teachers'
assessment processes and limit their freedom to grade. This underscores the importance of
collaborating and communicating with parents, but it also reflects the expectations that society
and the education system place on education and assessment.
Physical infrastructure and equipment.
The second most emphasized external factor was physical infrastructure and equipment. In the
following quotes, the teachers describe how physical infrastructure and equipment affect their
language assessments:
I would like to evaluate this more as the physical structure of the schools. A quality assessment cannot
be made without the necessary equipment. (T2)
-Technical facilities (copier, listening materials etc.)
- Number of invigilators (whether there is enough or not) (T4)

I prepare quizzes online and send them to my students and I want them to do it at a certain time.
However, some students cannot enter on time because there is a problem on the internet, or they have
trouble attending because their class conflicts with their other siblings at home. This creates difficulties
for me to provide assessment and evaluation as a whole. (T56)
Problems related to the internet, insufficient technological tools, a computer that is not enough for two
students and a teacher at home, telephone (T83)

These quotes clearly show how physical infrastructure and equipment can affect assessment
processes. In particular, inadequate technical facilities and internet access can make it difficult
for students to participate in assessments and for teachers to manage these processes. This
underscores the responsibility of educational institutions and systems to provide the physical
infrastructure and equipment needed for assessment to function properly, while also pointing
to the practical challenges faced in digital education.
Education System
Figure 14. Frequency Graph Showing the Number of Coded Sections Related to the Education
System

The education system was identified as the second biggest challenge faced by
instructors in language assessment. The analysis of the interview responses revealed that this
category consisted of six different codes (see Figure 14). Moreover, two of these codes have
subcodes (see Figure 15). The most striking finding under this category is related to
administrative issues, which gathered the highest frequency of comments among all findings.
In contrast, teacher autonomy emerged as the least discussed issue according to the
perspectives shared by teachers.
Figure 15. Categorization of views about teachers' challenges regarding education system with
hierarchical code model


Administrative problems.
The most notable discovery within the realm of the education system category related to the
administrative problems code. As shown in Figure 16, the instructors covered ten subjects
related to administrative problems. The code reveals that school management emerged as the
most prevalent administrative issue encountered during the language assessment procedure.
School management exerts a great deal of influence on assessments, especially through
grading policies. Limited class hours limit teachers' ability to monitor and assess students. In
addition, complex or intensive curricula can affect student learning, while the quality of
textbooks can affect the reliability of assessments. Large classes can make assessment more
time-consuming for teachers, while the class passing system may encourage some students not
to study. In addition, the sheer number of subjects and objectives can complicate teachers'
assessment processes. The following sentences are the statements made by the teachers
regarding each sub-code, listed in sequence.
Figure 16. Frequency Graph Showing the Number of Coded Sections Related to the Administrative
Problems

Management's attitude. In general, we are asked to prepare tests, to push too hard, and not to give low
grades. This is the most influential factor. (T1) (school management)

Both school administrators and district administrators constantly base English teaching on test success,
and this pushes me to think that I am inadequate in English teaching. (T63) (school management)
Due to the short course hours, a detailed assessment and evaluation cannot be made. (T85) (course
hours)
curriculum density (T52) (Curricula-related problems)
I want textbooks to be a resource that measures all language skills, not just one skill, we can only
measure reading and writing. (T42) (quality of coursebooks)
The assessment-evaluation subject takes a long time due to the large number of students. (T34) (class
sizes)
Students who think that I will pass the class anyway, what is the need to study (T96) (class passing
system)

Assessment issues.
This code was derived from the viewpoints expressed by the teachers regarding assessment
themes closely linked to the education system. As Figure 17 shows, the predominant topic of
discussion among the teachers was the insufficiency of assessment tools.
Figure 17. Sub-Codes of Assessment Issues

Teachers' views on issues related to assessment are directly quoted in the examples shown
below. These quotes reflect the teachers' differing views on assessment and measurement
processes. Some believe that the current assessment tools are inadequate for assessing
skills-oriented lessons; others suggest moving to an assessment system without exams or
adopting a listening- and play-based approach. Some teachers expressed the need for
standardization of assessment and evaluation activities, while others emphasized the
shortcomings of common exams. Likewise, some prefer to focus on traditional exams, while
others favor practice exams that measure the four skills. These differing perspectives and
suggestions reflect the diversity of assessment practices within the education system.
Backwash effect of exams.


One of the most common problems identified in the teachers' opinions was the impact of
exams on education. The quotes below reflect teachers' views on the effects of exams on
education and highlight the shortcomings of learning in an exam-oriented system.
Despite being included in the national exams, the density of the subjects and the small number of
questions push the students to focus on other lessons, and they have difficulty in studying so many
subjects for so few questions. The preparation process for the exam goes through intensively with
intensive vocabulary teaching and test question solving techniques. This negatively supports the
general judgments against the course. (T51)
Students study not because they are interested in the English lesson, but only because of their fear of
exams. (T93)

4. The Needs of the Participants in L2 Assessment
(What are the needs of Turkish in-service EFL teachers in terms of the testing, evaluation, and
assessment process?)
The final objective of the study was to investigate the perceptions of the participants on their
language assessment needs. Within the category of needs, a total of eight separate codes were
identified, as illustrated in Figure 18. Figure 19 displays five codes together with their
respective sub-codes. The results of the study indicate that equipment needs were the most
prominent requirement for language assessment, followed by systematic, training, assessment,
and occupational demands, in that order.
Figure 18. Frequency Graph Showing the Number of Coded Sections Related to The Needs of Teachers

Figure 19. Categorization of Views about Teachers' Language Assessment Needs with Hierarchical
Code Model


Equipment Needs
Figure 20. Sub-Codes of Equipment Needs

The findings indicate that the equipment needs code has the highest coding frequency in this
category. The teachers' perspectives on their equipment requirements were categorized into
four sub-codes: suitable equipment, Web 2.0 tools, assessment tools, and technical
infrastructure. Of these, the most frequently referenced was the need for suitable equipment.
The teachers often expressed this need in relation to language laboratories and well-equipped
classrooms, particularly for testing listening and speaking skills.
Each school should have a foreign language class in a separate room and include a smartboard and a
computer. It is very suitable for listening activities and watching movies. If every English teacher uses
that room for only two lessons a week, it will be beneficial for listening and speaking. (T58-suitable
equipment)
Technological software. (T19-Web 2.0 tools)
Effective and easy-to-apply assessment and evaluation tools that will appeal to all student
levels (T20-assessment tools)
Lack of internet in schools (T18-technological infrastructure)

Systematic Needs
Figure 21. Sub-Codes of Systematic Needs

Class hours should be more, (T114-increasing class hours)

Narrowing the subject with the arrangements and improvements made in the curriculum (T51-revising
curriculum)
System change. I think that removing English from being considered a course will allow us to be free in
assessment and evaluation like many other things. A ton of learned methods are unfortunately forgotten
in the face of the system... (T2-reorganizing system)
The teacher should be given full authority in planning and measuring, and flexibility should be
provided in this regard. Instead of a uniform system, tools should be developed that will take care of
the needs of the individual. (T76-teacher autonomy)
A standard framework performance scale from the ministry could have been useful to assess in-class
performance in second and third grades. Assessment depends on the teacher at primary school level.
(T9- performance scale for primary school)

After equipment needs, the teachers listed systematic needs as the most important area. Their
requests regarding the education system typically encompassed their expectations of the
system and the areas they wished to see reformed. According to the data presented in Figure
21, the teachers' primary expectation of the system is an increase in course hours. The
teachers' actual words on this topic are shown above.
Training Needs
Figure 22. Sub-Codes of Training Needs

I think that the testing courses that many English teachers take during their undergraduate education
should be supported in their working life. In this sense, I believe that in-service training and workshops
should be held frequently in institutions. Particular attention should be paid to personal development so
that teachers can prepare for their own exams without being bound by certain ready-made exams. More
teachers should receive master's and doctoral education. By establishing exam offices in institutions,
teachers should be provided with a practice area and gain experience. -Item writing -Scale/rubric
development -Adapting tests -Technology usage in testing, (how can I benefit?) (T4-in-service training)
The measurement and evaluation course I took at the university was taught, but I do not think it was
enough in terms of time and passing the subjects quickly. (T14-pre-service training)

According to Figure 22, a noteworthy portion of respondents expressed a preference for
receiving education on language assessment while actively engaged in their career. The
aforementioned sentences serve as illustrations of the viewpoints held by teachers regarding
their need for training. These training requests are related to teachers' desire to have more
knowledge and skills to develop more effective assessment tools and to better assess their
students. These findings reflect teachers' interest in receiving ongoing training and continuing
their professional development. The development of programs tailored to their training needs
can help teachers more effectively assess their students and improve their teaching.


Discussion and Conclusion
The primary objective of this study was to ascertain the strengths and weaknesses of Turkish
in-service EFL teachers in the domain of language assessment. Furthermore, an examination
was conducted to identify the potential origins of the challenges faced by the participants, as
well as their specific requirements in the realm of language assessment. A thorough
examination of the existing literature revealed a lack of data specifically addressing the
competencies and weaknesses of Turkish in-service EFL teachers in language assessment.
Numerous studies have been conducted on teachers' roles, beliefs, and practices in language
assessment, as well as on their assessment needs. The current research has revealed results
that both support and contradict the results of previous research in this field.
First, in line with the present findings, Wach (2012) examined the perspectives of teachers in
both primary and tertiary educational settings on the merits of assessment and the challenges
associated with L2 assessment, and found that the challenges the teachers enumerated
outweighed the positive qualities they listed. In brief, it may be inferred that teachers had a
greater number of weaknesses in language assessment than strengths, even when not explicitly
acknowledging them, and were more likely to elaborate on their shortcomings. When
asked about their strengths in language assessment, the primary concern expressed by the
teachers was to ensure assessment reliability. This code likely formed because teachers often
assert their fairness and objectivity, employ strategies to
enhance the reliability of language exam preparation, and adopt procedures to improve rater
reliability during the rating process. Regarding the validity of the assessments, the findings are
directly in line with Shim’s (2009) findings which showed that most of the primary school
English teachers said that they paid attention to content validity and that they made assessments
based on what was taught in the lesson and the activities and methods they applied in the
classroom. In contrast to this study’s findings, De Silva (2021) found that teachers did not
consider essential assessment principles, such as validity and reliability, during the
development of examination papers. Additionally, the test design neglected the aspect of
content validity. In parallel with the results of this study, the teachers in Xie and Tan’s study
(2019) emphasized the significance of appropriately calibrating the difficulty level of
assessment items in accordance with the students' ability levels during test preparation. In
practice, however, creating tasks that accurately reflect curricular objectives and student
proficiency requires not only theoretical knowledge, but also expertise in item writing and
ample time for meticulous planning. These conditions may not always be available to teachers.
Regarding formative assessment, the findings of this study parallel those of Xie and Tan
(2019), who discovered that while preservice and novice teachers both recognize the value of
implementing formative assessment, their knowledge of the relevant techniques is insufficient.
Regarding weaknesses, this study has revealed that the participating teachers identified their
greatest weaknesses in the rating process while assessing language proficiency. On this matter,
teachers' opinions primarily focus on rater bias. Several teachers acknowledged engaging in
practices that may introduce bias, such as factoring in a student's in-class performance,
overlooking minor errors, and compensating for missing points throughout
the assessment procedure. Since raters are human beings rather than objective scoring devices,
their decisions may also be impacted by feelings, empathy, past experiences, and interpersonal
dynamics in the classroom. One further problem encountered by teachers during the process of
test preparation and item writing referred to the development of level-based language
assessments. Teachers usually complained that they struggled to determine which questions to
ask at each language level and how to assess the various language levels. These results align
with the work of Yan et al. (2018). Nevertheless, this challenge may be the result of a more
fundamental structural issue: although proficiency frameworks like the Common European
Framework of Reference for Languages (CEFR) offer general performance descriptors, the
conversion of these abstract-level definitions into concrete, quantifiable classroom tasks
necessitates advanced assessment literacy and item-writing proficiency.
Furthermore, teachers presented objectivity as an area in which they wanted to improve in
language assessment. One of the teachers expressed her concern that she would not be able to
assess students' writing and speaking abilities objectively. This result was consistent with
Bérešová's (2019) findings, which demonstrated that Slovak teachers preferred to assess
vocabulary and grammar since they believed it was impossible to assess productive skills
objectively. Regarding assessing skills, the teachers in this study stated that they assessed
speaking skills the least among the four skills. In the local literature, in parallel with this
finding, Han & Kaya (2014) found that the assessment of speaking is widely regarded as the
most difficult skill to assess whereas assessing reading skills is comparatively less challenging.
Not only in Türkiye but also in other parts of the world, speaking has been found hard to
assess: in a study conducted with teacher participants, speaking was considered the most
difficult skill to assess and, in some cases, was not even tested (Banitz, 2022).
findings suggest that this difficulty may not be solely pedagogical but also structural. Assessing
speaking requires individual performance time, careful listening, scoring across multiple
criteria, and often repeated evaluation to ensure fairness. These requirements make speaking
assessment more time-consuming and logistically demanding compared to receptive skills.
Therefore, beyond concerns about objectivity, the limited assessment of speaking may also be
linked to broader administrative and institutional constraints that shape teachers’ assessment
practices.
Within the challenges related to the education system, administrative issues were found to have
the greatest impact on language assessment, and the majority of teachers' perspectives on
administrative problems were centered around the management of schools. That is to say, most
teachers expressed dissatisfaction with the way the school administration managed language assessments, with their own limited participation in the process, and with the pressure administrators applied, for example, by asking teachers to use simple questions and to avoid giving failing grades. After school administration, teachers expressed the most dissatisfaction with the limited course hours. Many educators observed that the instructional hours allocated to the course were inadequate for assessment purposes; they attributed their inability to conduct comprehensive assessments and to provide sufficient practice opportunities for the four skills to this shortage of course hours. This result aligns with the outcome of a study conducted by Gelbal and
Kelecioğlu (2007), which revealed that the primary constraints faced in implementing
assessment methodologies were the large class size and limited course hours. One significant
issue highlighted by teachers within the realm of the education system was related to the
backwash effect of examinations. Numerous teachers asserted that the prevailing educational
framework exhibited a predominant focus on examinations, resulting in students displaying a
lack of interest in acquiring knowledge beyond the scope of material covered in both school
and national assessments. Consequently, this phenomenon has had an obvious impact on the
pedagogical approaches employed by teachers. This finding echoes the research conducted by
Yan et al. (2018), which demonstrated that standardized examinations have negative
consequences on the field of education. Another issue the participating teachers complained about was the rote learning-based education system. This finding supports previous research into the most significant problems in the Turkish education system, which found a widespread opinion among pre-service teachers that the primary challenge facing the system is an educational strategy centered on rote learning (Yeşil & Şahan, 2015).

© 2026 JLERE, Journal of Language Education and Research, 12(1), 96-128
According to the teachers, parents were identified as the primary factor externally influencing
language assessment. The significance of parental involvement in student academic
achievement has been recognized by educators, administrators, and policymakers, who view it
as a crucial element of contemporary educational reforms and initiatives (Wilder, 2014).
However, the majority of teachers participating in this study expressed concerns regarding the
adverse impact of parental involvement on the assessment process. The prominence of parental
pressure reflects the socio-cultural positioning of assessment within the Turkish educational
system, where exam results often function as indicators of both student and institutional
success. However, this dynamic is not unique to Türkiye: similarly, the teachers in Xie and Tan's (2019) study stated that they tried to score exam papers very carefully to avoid criticism from parents. Physical infrastructure and equipment were also significant
factors that exerted a substantial influence on teachers’ language assessment. The educators
contended that the absence of essential equipment rendered the task of conducting a thorough
assessment impractical. Teachers commonly expressed that the assessment setting was
unsuitable, citing insufficient technical resources such as internet access, smart boards, and copier machines, which they expected to affect the assessment process. This result is
supported by previous research (Han & Kaya, 2014).
The findings concerning the needs of Turkish in-service EFL teachers in the domain of
language assessment indicated that their primary demand was related to equipment. The
participants’ perspectives regarding this particular topic were categorized into four distinct
categories: suitable equipment, technological infrastructure, assessment tools, and Web 2.0
tools. The study identified several requirements, including a language classroom, personal
computers, a smart board, a sound system for testing listening skills, a quiet and appropriate
assessment environment, visual materials, supplementary resources, diverse assessment tools,
technological infrastructure, and software specifically designed for assessment purposes. In
contrast to this finding, in Djoub's (2017) study, teachers expressed the least need for supporting items, despite being identified as the group requiring the greatest training. When queried about their requirements for language assessment,
educators frequently emphasized the necessity of training. The requirements regarding this
topic were categorized into two primary categories: pre-service and in-service training. The
majority of educators expressed a desire for professional development opportunities,
particularly in the areas of test design, item writing, scale/rubric development, test adaptation,
technology integration in assessment, and the fundamental concepts of assessment.
Furthermore, they expressed a desire for the implementation of frequent seminars and webinars
relevant to this topic, with the aim of providing ongoing support for the testing courses
undertaken throughout their undergraduate education, as they move into their professional
careers. In Ölmezer-Öztürk and Aydın's (2019) study, Turkish in-service EFL teachers
expressed similar requests. On the other hand, the findings of the current study are also
consistent with those of Janatifar and Marandi (2018) who found that EFL teachers hold the
belief that in addition to the theoretical aspects of assessment, it is essential for them to receive
practical instruction focused on developing skills in language assessment. Lastly, some teachers expressed a desire for higher income and greater prestige, factors that are not directly associated with language assessment but rather with the teaching profession as a whole. Regarding the social status of
the teaching profession, the findings of this study align with the research conducted by Kara (2020), who identified the decline in teachers' reputation as one of the significant problems in the education system.
Based on its findings, this study provides valuable information for the assessment of English in schools affiliated with the Ministry of National Education, for the improvement of English teacher training programs at academic institutions, and for future research in the field of language assessment.
On 9 September 2023, the Ministry of National Education published its Regulation on Assessment and Evaluation for the first time. The purpose of
this regulation is to regulate the procedures and principles of central system exams conducted
by the Ministry of National Education, national/international monitoring surveys, monitoring
of academic and social development in pre-school education institutions and primary schools,
common exams in secondary and secondary education institutions, and the duties, authorities
and responsibilities of assessment and evaluation center directorates. When the regulation is
examined, it is seen that only one article is related to the foreign language course. In subparagraph I of Article 5: “Exams for Turkish/Turkish language and literature and foreign
language courses are conducted in written and practical exams to measure listening, speaking,
reading, and writing skills. In case of a nationwide or province/district-wide common exam,
the school conducts the practical part of the exam, and the two exams are evaluated together.”
However, the remaining articles apply to the English as a foreign language course along with all other courses.
This newly published assessment and evaluation regulation responds to some of the issues
raised by teachers in this study.
Limitations and Suggestions
Despite its contributions, this study has several limitations that should be
acknowledged. First, the study relies solely on self-reported data obtained through semi-structured interviews. While such data provide valuable insight into teachers' perceptions and experiences, they do not allow for direct observation of classroom assessment practices.
Therefore, discrepancies may exist between reported beliefs and actual practices. Second, the
research was undertaken within the Turkish educational setting, which may influence the
findings through local structural and cultural dynamics, hence constraining their wider
applicability. Finally, although systematic coding procedures, including independent coding by
a second researcher and consensus meetings, were employed, the absence of additional data
sources, such as classroom observations or document analysis, limits methodological
triangulation.
In prospective investigations, it may be beneficial to categorize language instructors
according to their geographic location, such as distinguishing between those teaching in urban
versus rural schools, as well as their educational stage, encompassing elementary, secondary,
and tertiary-level English teachers. This classification is warranted due to the anticipated
differences in assessment training, competence, and practice among EFL teachers, which are
likely to be influenced by factors such as school type, regional context, and instructional level.
The categories and codes that have resulted from this research can be examined separately and
further investigated in order to expand the existing literature.


References
Alderson, J. C., Brunfaut, T., & Harding, L. (2017). Bridging assessment and learning: A view from
second and foreign language assessment. Assessment in Education: Principles, Policy &
Practice, 24(3), 379-387. https://doi.org/10.1080/0969594X.2017.1331201
Al-Mahrooqi, R. (2017). Introduction: EFL assessment: Back in focus. In R. Al-Mahrooqi, C. Coombe,
F. Al-Maamari and V. Thakur (Eds.), Revisiting EFL assessment (pp. 1-6). Springer.
Arslan, R. S., & Üçok-Atasoy, M. (2020). An investigation into EFL teachers' assessment of young
learners of English: Does practice match the policy? International Online Journal of Education and Teaching, 7(2), 468-484. https://iojet.org/index.php/IOJET/article/view/818
Banitz, B. (2022). Language assessment in Mexico: Exploring university language teachers' backgrounds, practices, and opinions. System, 110, 102898.
Benson, J., & Clark, F. (1982). A guide for instrument development and validation. The American
Journal of Occupational Therapy, 36(12), 789-800. https://doi.org/10.5014/ajot.36.12.789
Bérešová, J. (2019). The importance of objectivity in assessing writing skills. In INTED2019
Proceedings (pp. 4376–4380). IATED.
Berry, V., Sheehan, S., & Munro, S. (2019). What does language assessment literacy mean to teachers?
ELT Journal, 73(2), 113–123. https://doi.org/10.1093/elt/ccy055
Boeije, H. (2010). Analysis in qualitative research. Sage.
Büyükkarcı, K. (2016). Identifying the areas for English language teacher development: A study of assessment literacy. Pegem Eğitim ve Öğretim Dergisi, 6(3), 333-346.
Canlı, G. S., & Altay, İ. F. (2024). Training needs of in-service EFL teachers in language testing and assessment. Pegem Journal of Education and Instruction, 14(1), 69-79.
Çetinkaya, G. (2020). Dinleme eğitimi sürecinde ölçme ve değerlendirme. In G. Çetinkaya (Ed.),
Türkçe eğitimi sürecinde ölçme ve değerlendirme (2nd ed.). Anı Yayıncılık.
Çimen, Ş. S. (2022). Exploring EFL assessment in Turkey: Curriculum and teacher practices.
International Online Journal of Education and Teaching (IOJET), 9(1), 531-550.
Creswell, J. W. (2013). Qualitative inquiry & research design: Choosing among five approaches. Sage.
Creswell, J. W. (2016). 30 essential skills for the qualitative researcher. SAGE Publications.
De Silva, R. (2021). Assessment literacy and assessment practices of teachers of English in a South Asian context: Issues and possible washback. In B. Lanteigne, C. Coombe, and J. D. Brown (Eds.), Challenges in language testing around the world (pp. 433-446). Springer.
Djoub, Z. (2017). Assessment literacy: Beyond teacher practice. In R. Al-Mahrooqi, C. Coombe, F.
Al-Maamari, and V. Thakur (Eds.), Revisiting EFL assessment (pp. 9-27). Springer.
Fard, Z. R., & Tabatabaei, O. (2018). Investigating assessment literacy of EFL teachers in Iran. Journal
of Applied Linguistics and Language Research, 5(3), 91-100.
Fulcher, G. (2012). Assessment literacy for the language classroom. Language Assessment Quarterly,
9(2), 113–132.
Gelbal, S., & Kelecioğlu, H. (2007). Öğretmenlerin ölçme ve değerlendirme yöntemleri hakkındaki
yeterlik algıları ve karşılaştıkları sorunlar. Hacettepe Üniversitesi Eğitim Fakültesi Dergisi, 33,
135-145.

Genç, E., Çalışkan, H., & Yüksel, D. (2020). Language assessment literacy level of EFL teachers: A
focus on writing and speaking assessment knowledge of the teachers. Sakarya University
Journal of Education, 10(2), 274-291. https://doi.org/10.19126/suje.626156
Giraldo, F., & Murcia Quintero, D. (2019). Language assessment literacy and the professional
development of pre-service language teachers. Colombian Applied Linguistics Journal, 21(2),
243–259. https://doi.org/10.14483/22487085.14514
Green, A. (2013). Exploring language assessment and testing: Language in action. Routledge.
Han, T., & Kaya, H. İ. (2014). Turkish EFL teachers’ assessment preferences and practices in the
context of constructivist instruction. Journal of Studies in Education, 4(1), 77-93.
Hill, C. E., Thompson, B. J., & Williams, E. N. (1997). A guide to conducting consensual qualitative
research. The Counseling Psychologist, 25(4), 517-572.
Hsieh, H. F., & Shannon, S. E. (2005). Three approaches to qualitative content analysis. Qualitative
Health Research, 15(9), 1277-1288.
Janatifar, M., & Marandi, S. S. (2018). Iranian EFL teachers’ language assessment literacy under an
assessing lens. Applied Research on English Language, 7(3), 361–382.
Janghorban, R., Latifnejad Roudsari, R., & Taghipour, A. (2014). Pilot study in qualitative research:
The roles and values. Journal of Hayat, 19(4), 1-5.
Jannati, S. (2015). ELT teachers’ language assessment literacy: Perceptions and practices. The
International Journal of Research in Teacher Education, 6(2), 26-37.
Kaplan, B., & Maxwell, J. A. (2005). Qualitative research methods for evaluating computer
information systems. In J. G. Anderson and C. E. Aydin (Eds.), Evaluating the organizational
impact of healthcare information systems (pp. 30-55). Springer.
Kara, M. (2020). Eğitim paydaşlarının görüşleri doğrultusunda Türk eğitim sisteminin sorunları.
Kırşehir Eğitim Fakültesi Dergisi, 21(3), 1650-1694. https://doi.org/10.29299/kefad.853999
Karagül, B. İ., Yüksel, D., & Altay, M. (2017). Assessment and grading practices of EFL teachers in Turkey. International Journal of Language Academy, 5(5), 168-174.
Kuckartz, U., & Rädiker, S. (2019). Analyzing qualitative data with MAXQDA. Springer International
Publishing.
Kvale, S. (1996). Interviews: An introduction to qualitative research interviewing. Sage Publications.
Lambert, V. A., & Lambert, C. E. (2012). Qualitative descriptive research: An acceptable design.
Pacific Rim International Journal of Nursing Research, 16(4), 255-256.
Luthfiyyah, R., Basyari, I. W., & Dwiniasih. (2020). EFL secondary teachers' assessment literacy: Assessment conceptions and practices. Journal on English as a Foreign Language, 10(2), 402-421. https://doi.org/10.23971/jefl.v10i2.2101
Mertler, C. A., & Campbell, C. S. (2005). Measuring teachers' knowledge and application of classroom
assessment concepts: Development of the assessment literacy inventory. The Annual Meeting
of the American Educational Research Association, April, Montreal, Quebec, Canada.
Murairwa, S. (2015). Voluntary sampling design. International Journal of Advanced Research in
Management and Social Sciences, 4(2), 185-200.
Najib Muhammad, F. H., & Bardakçı, M. (2019). Iraqi EFL teachers' assessment literacy: Perceptions and practices. Arab World English Journal (AWEJ), 10.


Ölmezer-Öztürk, E., & Aydın, B. (2019). Investigating language assessment knowledge of EFL teachers. Hacettepe University Journal of Education, 34(3), 602-620.
Öz, S., & Atay, D. (2017). Turkish EFL instructors’ in-class language assessment literacy: Perceptions
and practices. ELT Research Journal, 6(1), 25-44.
Rad, M. R. (2019). The impact of EFL teachers’ assessment literacy on their assessment efficiency in
classroom. Britain International of Linguistics, Arts and Education (BIoLAE) Journal, 1(1),
9–17. https://doi.org/10.33258/biolae.v1i1.14
Rea-Dickins, P. (2004). Understanding teachers as agents of assessment. Language Testing, 21(3),
249–258.
Rogier, D. (2014). Assessment literacy: Building a base for better teaching and learning. English
Language Teaching Forum, 52(3), 2-13.
Rose, H., McKinley, J., & Baffoe-Djan, J. B. (2019). Data collection research methods in applied
linguistics. Bloomsbury Academic.
Şahin, S. (2019). An analysis of English language testing and evaluation course in English language
teacher education programs in Turkey: Developing language assessment literacy of pre-service EFL teachers [Doctoral dissertation, Middle East Technical University].
Saka, F. Ö. (2016). What do teachers think about testing procedure at schools? Procedia-Social and
Behavioral Sciences, 232, 575-582. https://doi.org/10.1016/j.sbspro.2016.10.079
Saldaña, J. (2016). The coding manual for qualitative researchers. Sage.
Semiz, O., & Odabaş, K. (2016). Turkish EFL teachers' familiarity of and perceived needs for language testing and assessment literacy. In Proceedings of the Third International Linguistics and Language Studies Conference (pp. 66-72).
Sevimel-Şahin, A., & Subasi, G. (2019). An overview of language assessment literacy research within English language education context. Journal of Theoretical Educational Science, 12(4), 1340-1364. https://doi.org/10.30831/akukeg.501817
Shim, K. N. (2009). An investigation into teachers' perceptions of classroom-based assessment of English as a foreign language in Korean primary education [Unpublished doctoral dissertation, University of Exeter].
Şişman, E. P., & Büyükkarcı, K. (2019). A review of foreign language teachers' assessment literacy. Sakarya University Journal of Education, 9(3), 628-650.
Stiggins, R. J. (1999). Are you assessment literate? The High School Journal, 6(5), 20-23.
Strauss, A. L., & Corbin, J. M. (1988). Shaping a new health care system: The explosion of chronic
illness as a catalyst for change. Jossey-Bass.
Tsagari, D., & Vogt, K. (2017). Assessment literacy of foreign language teachers around Europe:
Research, challenges and future prospects. Papers in Language Testing and Assessment, 6(1),
41-63. https://doi.org/10.58379/UHIX9883
Türk, M. (2018). Language assessment training level and perceived training needs of English language
instructors: A mixed methods study. [Unpublished master’s thesis, Bahçeşehir University].
Ülper, H., & Bağcı, H. (2012). Türkçe öğretmeni adaylarının öğretmenlik mesleğine dönük öz yeterlik
algıları. Journal of Turkish Studies, 7(2), 1115-1131.

Vogt, K., & Tsagari, D. (2014). Assessment literacy of foreign language teachers: Findings of a European study. Language Assessment Quarterly, 11(4), 374-402.
Wach, A. (2012). Classroom-based language efficiency assessment: A challenge for EFL teachers.
Glottodidactica, 39(1), 81-92.
Wilder, S. (2014). Effects of parental involvement on academic achievement: A meta-synthesis.
Educational Review, 66(3), 377–397. https://doi.org/10.1080/00131911.2013.780009
Xie, Q., & Tan, S. (2019). Preparing primary English teachers in Hong Kong: Focusing on language assessment literacy. Journal of Asia TEFL, 16(2), 653.
Yan, X., Zhang, C., & Fan, J. J. (2018). 'Assessment knowledge is important, but…': How contextual and experiential factors mediate assessment practice and training needs of language teachers. System, 74, 158-168. https://doi.org/10.1016/j.system.2018.03.003
Yeşil, R., & Şahan, E. (2015). Öğretmen adaylarının Türk eğitim sisteminin en önemli sorun, neden ve
çözüm yollarına ilişkin algıları. Ahi Evran Üniversitesi Kırşehir Eğitim Fakültesi Dergisi,
16(3), 123-143.
Zulaiha, S., Mulyono, H., & Ambarsari, L. (2020). An investigation into EFL teachers' assessment
literacy: Indonesian teachers' perceptions and classroom practice. European Journal of
Contemporary Education, 9(1), 189-201. https://doi.org/10.13187/ejced.2020.1.189
Zulaiha, S., & Mulyono, H. (2020). Exploring junior high school EFL teachers' training needs of assessment literacy. Cogent Education, 7(1), 1772943.
