An Analysis of Theories and Practice of Language Testing in a Historical Timeline and the Implication for the Language Teacher


by
Mike Adeyi
Department of English
Federal College of Education, Kano
Email: mikeadeyi2@gmail.com

Abstract
Language testing aims at determining the quality and quantity of the language skills that a learner has attained at a particular point in time. Language tests are important tools because they are used to measure the success of individuals in different aspects of life. Despite their importance, it was only in the 1980s that language theorists began to raise highly sensitive questions about them. Tests were considered purely linguistic acts; therefore, very little attention was paid to the social dimension of language as the most important medium of communication among humans. With the advent of integrative testing, and later communicative language testing, linguists became more conscious of the social impact that tests have. Testing is a highly complex undertaking that must be based on theory as well as practice. In other words, there should be principles that guide teachers in the construction and administration of a language test. This paper undertakes an analysis of theories and practice of language testing in a historical timeline and the implication for the language teacher. It highlights the major principles or characteristics that should guide every test. The paper examines the challenges as well as the prospects of language testing and then offers suggestions.

Key words: theories, practice, language testing, language teacher, implication, historical timeline

Introduction
Linguists generally see human language as complex and mysterious, a concept that tends to defy a universally accepted definition. There are therefore as many definitions as there are linguists, depending on what a particular linguist or scholar is interested in. This is corroborated by Halliday (1975), who posits that language could be defined in various ways depending on whether one is interested in dialects and those that speak them, words and their histories, the differences in language across cultures, the formal properties of language systems, language as an art medium, uses of language and the like. It is pertinent to highlight and examine some of the definitions given by linguists.
Edward Sapir's (1921) definition, given close to a century ago, still holds sway and has been accepted by some linguists (Azikiwe 1998; Foyewa 2015; Adewuyi and Oluokun 2001) as a good definition of language. Sapir defines language as
... a purely human and non-instinctive method of communicating ideas, emotions and desires by means of a system of voluntarily produced symbols which are in the first instance auditory.
In another development, Sapir (1961) again sees language as a means by which representations of human experience work efficiently together when he says that:
... language is primarily a vocal actualization of the tendency to see reality symbolically ... an actualization in terms of vocal expression of the tendency to master reality not by direct and ad hoc handling of this element but by the reduction of experience to familiar form.

Another renowned linguist, Blumfit (1985), in agreeing with Sapir, sees language as a relationship between words and experiences which could be produced and received in the form of verbalization of experience or experience of verbalization. These two terms have different meanings, and both explain language as a means of representing experiences. Henry Sweet, an English phonetician and language scholar, states: "Language is the expression of ideas by means of speech-sounds combined into words. Words are combined into sentences, this combination answering to that of ideas into thoughts."
This paper adopts the definition of language by Azikiwe (1998), who, on her part, asserts that:
Language is simply a code whereby ideas of the user about the environment and the world at large are represented through a conventional system of arbitrary signals for communication.
A close analysis of the definition above reveals some key words that adequately project what language truly is. These key words are code, ideas, convention, system, and communication. This, in effect, suggests that language is a code which represents an idea, and a system which is a convention that is used for communication.
Human languages are sometimes learnt in formal situations, and evaluation is the parameter used to measure the success or otherwise of the learning process. Evaluation, according to Foyewa (2015), citing Adewuyi and Oluokun (2001), is a process of gathering and interpreting evidence regarding the problems and progress of learners in achieving desirable educational goals. This is in agreement with Okpala, Onocha and Oyedeji (1993), who say that evaluation is a process of gathering valid information on the attainment of educational objectives, and of analyzing and fashioning that information to aid judgment on the effectiveness of teaching or the educational objectives. Evaluation in any educational enterprise is absolutely crucial, as it assists in determining the level of understanding and provides the opportunity for rating accordingly. There are different forms of evaluation, among which are projects, observation, tests and examinations. This paper focuses on the test component of evaluation and all that it entails.
For some years, language testing research and second language acquisition research have been viewed as distinct areas of inquiry in applied linguistics. Bachman and Cohen (1998), however, posit that since the late 1980s there has been an increasing number of studies in which the two subfields of applied linguistics come together, both in terms of the substantive issues being investigated and the methodological approaches being used. Sajitha (2013) is of the opinion that when students learn English as a Second Language, they face various problems. These problems, he says, can be categorised partly as problems caused by mother tongue interference and partly as those caused by the method of language teaching and assessment. One of the greatest curses of the modern educational system is the lack of harmony between what is taught and what is tested (Bachman and Cohen 1998). This paper sets out to analyse various theories and practices of language testing in historical perspective and examines the guiding principles of testing generally. The paper also highlights the problems and prospects of test administration in linguistic study.

Forms and the General Principles of Language Testing
Language testing has been classified into two major types. The broad classes, according to Desheng and Varghese (2013), are: (i) testing the language skills and (ii) testing the knowledge of the content.
Skill testing deals with testing the various language skills of listening, speaking, reading and writing, as well as sub-skills such as comprehension, vocabulary, grammar, spelling and punctuation. In knowledge testing, different types of tests are used to determine the extent of the learner's knowledge of the language. Alabi and Babatunde (2001) and Desheng and Varghese (2013) identify these forms of tests as the non-referential test, achievement test, diagnostic test, aptitude test, and proficiency test.
i. Achievement test: this test helps the teacher to determine the level of the learner's achievement in the language class, that is, whether he has mastered the content he has been taught or not.
ii. Aptitude test: this test is used to predict the ability of a learner to learn a language. The prediction may not always be sustained, as factors such as the teaching methods of the teacher, the availability of learning material, or the learning environment can either enhance or retard the predicted ability.
iii. Proficiency test: this test aims at measuring the language ability of the learner and determining the learner's readiness to undertake a particular communicative task, as well as predicting the future language performance of the student.
iv. Diagnostic test: this test usually aims at finding out the strengths and weaknesses of the learner in learning the language skills, and it is usually used for remedial purposes. Owing to differences in their linguistic backgrounds, learners manifest varying degrees of ability to undertake certain language tasks. The diagnostic test helps the teacher to identify each student's problems and strengths.
Alma (2013) asserts that a good test should rest on certain principles that guide teachers when they wish to conduct one. Sajitha (2015) lists the following seven principles as those that every good test should possess: validity, reliability, practicality, security, washback, transparency and usefulness.
i. Validity: A communicative language learning approach must be matched by communicative language testing. A valid test measures what it is supposed to measure, or what it sets out to measure. There are different types of validity, which include face validity (whether the test appears, to those who take and use it, to measure what it claims to), content validity, predictive validity, concurrent validity, and construct validity.
ii. Reliability: Reliability refers to the consistency of the test scores. The result of a test should be the same if the test is conducted at any other time. There should be consistency in the format, content and timing of the exam. Exam administration and the ambience in which the test is conducted are also important.
iii. Practicality: The practicality of the test can be obtained only when the tests are marked and the students are given proper feedback.
iv. Security: It is part of both reliability and validity.
v. Washback: It refers to the effect of testing on teaching and learning. The students accomplish the desired result when they perceive that the tests are markers of their progress.
vi. Transparency: Students should be provided with clear and accurate information about the test; this is known as transparency.
vii. Usefulness: It is an important quality of testing. It means that a test must serve a particular purpose. A good test should adhere to all the above-mentioned features. A language assessment should also cover the four skills. If the assessment is confined to one or two skills, we will not recognize the skill in which a student excels or the one in which he or she performs poorly. An assessment that tests all four skills is therefore beneficial for students of different intelligence levels.
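The consistency described under reliability is often summarised as a single coefficient. The sketch below is an illustration only, not a procedure taken from the sources cited in this paper: it assumes a test-retest design in which the same learners sit the same test on two occasions, and it uses the Pearson correlation between the two sets of scores as the reliability estimate. The scores and the function name pearson_r are hypothetical.

```python
from math import sqrt


def pearson_r(first, second):
    """Pearson correlation between two equal-length lists of scores."""
    if len(first) != len(second) or len(first) < 2:
        raise ValueError("need two equal-length lists with at least two scores")
    n = len(first)
    mean_a = sum(first) / n
    mean_b = sum(second) / n
    # Sum of cross-products of deviations from each mean.
    cov = sum((a - mean_a) * (b - mean_b) for a, b in zip(first, second))
    # Sums of squared deviations for each sitting.
    var_a = sum((a - mean_a) ** 2 for a in first)
    var_b = sum((b - mean_b) ** 2 for b in second)
    if var_a == 0 or var_b == 0:
        raise ValueError("scores must vary for a correlation to be defined")
    return cov / sqrt(var_a * var_b)


# Hypothetical scores (out of 100) for the same five learners on two sittings.
first_sitting = [52, 61, 70, 45, 88]
second_sitting = [55, 59, 73, 47, 85]

# Test-retest reliability estimate for this invented data set.
print(round(pearson_r(first_sitting, second_sitting), 2))
```

A coefficient close to 1 suggests that learners keep roughly the same rank order across the two sittings. A low value suggests that something other than language ability, such as the conditions of administration, is influencing the scores.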
The Concept of Theory
Theory, according to Kerlinger (1973) is a set of interrelated constructs (concepts), definitions, and propositions that presents a systematic view of phenomena by specifying relations among variables with the purpose of explaining and predicting the phenomena. Severin and Tankard (1982), on their part, define theory as a set of ideas of systematic generalizations based on scientific observation leading to further empirical observation. And Osuala (1982) says that a theory is an attempt at synthesizing and integrating empirical data for maximum clarification and unification. From the foregoing, we can conclude that a theory is a generalization arrived at as a result of organized analysis of interrelated variables about a situation.
To adequately understand a theory, a highlight of its features is pertinent. These features, according to Osuala, are that (i) a theory must have the quality of parsimony, especially in the use of words, that is, it must be stated in simple statements and in clear terms; (ii) it should be as applicable as possible and must be grounded in empirical data which have been verified; and (iii) it should allow interpretations and deductions that can be tested empirically.
Language Testing Theories in Historical Perspective
Alma (2013) is of the opinion that the history of language testing is closely related to the historical development of theories of linguistics in general. Spolsky (1976) identifies three historical periods of modern language testing. These are the pre-scientific, the psychometric-structuralist, and the psycholinguistic-sociolinguistic. Shohamy (1996), on her part, identifies five stages of development, which she names the discrete point era, the integrative era, the communicative era, the performance testing era and the alternative assessment era. All these eras are reflections of the philosophies, viewpoints and trends of their time. Each of the eras is discussed below for a clear picture of the theoretical foundations of language testing in general.
The First Era
Testing during this era, Spolsky (2005) says, is deeply rooted in the pre-scientific phase, with tests of Confucian doctrine, until the beginning of the last century, when some basic concepts started to evolve. This is the stage that Taber (2006) refers to as the pre-behaviourist stage, when, according to her, the first theory-based methods of second-language instruction started with Francis Gouin in the mid-nineteenth century. Even though his work did not win universal and lasting recognition, it set the stage for later theorists. During this period, some methods of language teaching evolved. Among them were the following:
The Series Method
Gouin's theory of language acquisition arose out of his own failure to learn German. His failure stemmed from his refusal to interact or converse with native speakers of German while trying to learn the language. Imagine trying to learn a foreign language by shunning interaction with the very people who speak it. When he returned to his native France, disillusioned by his failure, he discovered that during his twelve-month absence his three-year-old nephew had become miraculously fluent in French. He wondered how a toddler could so easily outperform his own considerable intellect. He then began to observe his nephew and other children who were in the process of acquiring language. As a consequence, he was able to theorize that the language one uses is related to one's actions at the time of the utterance. On this basis, he developed the Series Method, which seeks to teach a second language by recreating the conditions in which children learn a first language (Brown 2000). Specifically, the teacher performs an activity, such as walking to the door, and simultaneously verbalizes the process:
          "I walk toward the door. I draw near to the door. I draw nearer to the door. I get to the door. I stop at the door." The student then mimics the instructor. As time goes on, the student is able to expand his or her linguistic skills: "Am I walking to the door?" "Did I walk to the door?" "I am thinking about walking to the door." "I am walking to the window."
The Direct Method
Second-language theorists maintain that the first real method of language teaching was the Direct Method, which was developed as a reaction against the monotony and ineffectiveness of grammar-translation classes. The Direct Method, Taber (2006) says, is the brainchild of Charles Berlitz, a nineteenth-century linguist whose schools of language learning are famous throughout the world. The method borrows and applies Gouin's findings of the previous generation, seeking to imitate his naturalistic approach. In light of Gouin's miserable failure in German, Berlitz wanted to immerse students in the target language. He believed, like Gouin, that one could learn a second language by imitation, just the way children learn their first language, that is, directly, without explanations of grammatical points, and using only the target language. Grammar was therefore taught inductively. The objectives were speaking and listening comprehension, not translation; for this reason, vocabulary was introduced in context and through demonstrations and pictures, and emphasis was placed on correct usage and pronunciation. Students learned to write by taking dictation in the target language.
A typical Direct Method class had few students. Students might first take turns reading aloud, preferably a dialogue or anecdotal passage. To test for understanding, the teacher would then ask questions in the target language, and students would have to respond appropriately in the target language. Following the question-response session, the instructor might dictate the passage to the students three times. Students would then read the dictation back to the class. The Direct Method was popular in Europe and the United States, especially during the first quarter of the twentieth century. However, its very intensity and necessarily small class sizes made the method impossible for public schools. In addition, it was considered a weak method because it was not supported by heavy-duty theories and it depended too much on teachers' ability to teach as well as their fluency in the target language. "So, it was back to the old reliable grammar-translation method until behaviorism began to shine its light on the field of second-language teaching" (Brown 2000:44). This was the period when subjective tests became very popular, with teachers being the primary (and in most cases the only) assessors of the linguistic competence of their students. As a result, according to Alma (2013), issues of reliability and validity did not arise.
Weir (2005) posits that in this phase, experienced teachers would score written essays or examine the students orally (although with great limitations), and their judgment of the students' competence was accepted and relied on. The overall view of this phase was that 'everything was going well' (p.5).
The Psychometric-Structuralist Era
Pavlov, Skinner, and Watson are the founders of the behaviorism-based techniques employed in United States of America's classrooms as well as the Audio-lingual Method of second-language instruction. Skinner's theory of operant conditioning is based on the concept that learning results from a change in overt behavior. Applied to language acquisition, one learns language by emitting an utterance (operant), which is reinforced by a response by another (consequence). If the consequence of the imitated behavior is negative, the behavior is not repeated; if, on the other hand, the response is positive, one repeats the behavior. Repetition then leads to habit formation. Thus, behaviorists agree with the likes of Francis Bacon and John Locke that one is born a tabula rasa, a blank slate, and all learning is the result of outside stimuli. From this thinking sprang the popular Audio-lingual Method.
The Audio-lingual Method (ALM) was first known as the Army Method because it had been adopted by the military during the Second World War, when it became evident that most Americans were hopelessly monolingual. ALM is not unlike the Direct Method in that its purpose is to teach students to communicate in the target language. The Audio-lingual Method is a purely behavioristic approach to language teaching. It is based on drill work that aims to form good language habits, and it makes use of extensive conversation practice in the target language. Students enter the target-language classroom with their cognitive slates entirely blank, at least in theory, and they receive various linguistic stimuli and respond to them. If they respond correctly, they enjoy a reward and repeat the response, which promotes good habit formation. If they respond incorrectly, they receive no reward and therefore repress the response. The method's theoretical support also comes from post-war structural linguists. Structural linguists analyze how language is formed, not in a historical-descriptive or diachronic sense, but as it is "currently spoken in the speech community" (Stafford 1995). Language was now seen as a set of abstract linguistic units that made up a whole language system. The realization that all languages are complex, unique systems allowed linguists to understand the multifaceted, singular structure of English without comparing it to Latin, which had long been the paragon of excellence among prescriptive grammarians. This led to new thinking about how language should be taught. Individual structures should be presented one at a time and practiced via repetition drills. Grammar explanations should be minimal or nonexistent, for students will learn grammatical structures by inductive analogy.
A typical ALM class consists of ten-minute drill periods interspersed with activities such as the reading and memorization of a dialogue. The instructor then examines a grammar point by contrasting it with a similar point in the students' native language. (The teacher speaks in the native language, but discourages its use among students.) This is followed by more drills: chain drills, repetition drills, substitution drills. Target language vocabulary is introduced and learned in context, and teachers make abundant use of visual aids. Like its predecessors, ALM focuses on the surface forms of language and rote learning. While some students, especially those who could memorize dialogues, did well in the classroom, they still were not able to use the target language with any proficiency.
Developments in language testing became fully shaped in this era, from the middle of the last century, with discussions on the measurement of human mental knowledge and the concerns of the statistician Francis Y. Edgeworth about the fairness of tests (Edgeworth, as cited by Spolsky, p. 171). Language testing, unlike other fields of study, was skeptical about achieving objectivity in assessment. The main obstacle was the debate on linguistic knowledge and whether or not it could be divided into measurable and statistically assessable segments. Although these issues have not been fully resolved to date and continue to be a subject of discussion and divided opinion, in the second stage of historical development these discussions took a clearer shape. Unlike the first phase, the second phase saw an increased interest in theoretical concepts, and a pool of linguists and experts entered the field.
Fulcher (2000), citing Morrow (1979), calls this stage the "Vale of Tears" because of the obsessive efforts to reach objectivity in macro- as well as micro-skills. Psychometrics, which flourished at the time, dealt only with the reliability of tests and was interested in sophisticated formulae to achieve it. Alma (2013), citing Shohamy (2007), says that at that time nothing was mentioned about 'testing as an experience'. Nobody dealt with this experience, its consequences and attitudes towards it. The contribution of psychometrics has not completely lost its relation to the social side of tests, but the connection is very limited. On the other hand, as tests become more qualitative in measuring knowledge, the conclusions drawn from test results are more rigorous, thereby reducing the undesirable social impact (McNamara and Roever, 2006). For quite a while, tests were far from being social, but with the advent of the communicative competence paradigm and the introduction of social context, tests have managed to bridge the gaps in these aspects.
The Integrative Era
The third phase of theoretical developments in testing came as a natural consequence of the new movement in linguistics, the Communicative Approach. Linguistic concepts about how students learn a foreign language began to change radically with Chomsky (1965, 1970), who proposed a system of knowledge based on linguistic rules and the distinction between competence and performance. To Chomsky, competence is the ideal speaker-listener's knowledge of the rules of language, and performance is the actual use of language in concrete situations. These developments in the communicative approach, and the need for tests that measure the productive language skills, led to demands for language tests that include performance.
According to McNamara (2000), language professionals began to believe that a language is more than the collection of separate elements that were tested during the psychometric-structuralist movement, and that this tradition was focused on the formal system of language rather than on how the knowledge is used to achieve a successful communicative act. This led to the development and design of tests which integrated more of the skills of language. The feeling of linguists was that if language is used for communicative purposes, then language testing should take into account all the language skills that enhance effective communication.
The Communicative Era
With the developments of the previous stages discussed so far, the foundations of the communicative approach in language teaching and testing were laid. Communicative Language Teaching (CLT) was the flavour of the decade during the 1990s (Taber 2013). CLT does not teach about language; rather, it teaches language. It is often associated with the Functional-Notional Approach, that is, the emphasis is on functions such as time, location, travel and measurements. In short, it seeks to recreate real-life social and functional situations in the classroom to guide students toward communicative competence. The issues this approach raised were not new in the realm of language testing, but what it did was to make a serious effort to tackle them. In this new light, tests were not seen simply as linguistic tests per se; rather, more factors were taken into consideration. These factors include the individual taking the test and the impact the test results had on his or her life. The recent models raised questions about previous concepts of linguistic knowledge and the way tests were compiled to assess it. As psycholinguistics, which dealt with individual cognitive abilities, was turning into 'the ephemeral fad', more and more linguists became interested in testing the social dimension of language and its use (Alma, 2013). This dimension was highlighted further by the emergence of the communicative approach, which appeared after the structuralist linguistics and behaviourist psychology of the previous stage had failed to solve the problems of teaching and testing.
Fulcher (2000) believes that communicative testing initially came as a response to the enormous importance of reliability and validity, which were lacking in the previous stages. Morrow (1979) states that reliability was treated as the requirement of objectivity, and that validity depended on criteria that were based on questionable assumptions. Redefining these two concepts became an important task for communicative testers. Shohamy (2001) is of the opinion that a number of authors began to ask questions about the social and ethical side of testing at much the same time. Consequently, language testing took another direction.
Conclusion
Second-language instruction has come a long way since the days of rote learning. Still, it has a long way to go. The trend since the late 1990s has been towards eclecticism, which is a combination of some of these methods and this is probably the healthiest approach for it accommodates many styles of learning and endeavors to do more than elicit monosyllabic utterances from students. In like manner, language testing and evaluation have reflected the goal and method of instruction the learner receives. Furthermore, an eclectic approach allows teachers to glean the effective elements from many methods that really work in the classroom. A combination of a little of some methods can create interesting activities and fun in the classroom and can give students a sense of true accomplishment in the task given to them in the target language. Language learning methodologies and testing certainly mirror the times in which they thrive; but some have claimed to have virtues that are not evident beyond their theoretical framework.
The eclectic approach takes the best that theorists have to offer and incorporates it with techniques that work. Language testing needs to be planned well to achieve the desired results. Observing the ethics of testing is key in this regard. It is pertinent that the teacher knows that there are factors that affect the effectiveness of language testing. The students' attitudes to the teaching methods the teacher employs, their home situations, literacy, self-confidence, academic level, and identification with their native language are only a few of the factors that affect their ability to learn or acquire a new language and do well when tested in it. In the end, teachers have a tremendous challenge in trying to give their students the tools with which to function on all levels in the target language through the instrument of language testing because, in the words of McNamara and Shohamy (2008), "test has a strong power.., can motivate students to learn and teachers to become more effective in their instruction" (p. 90).

Bibliography
Akindele, F. and Adegbite, W. (2005) The Sociology and Politics of English in Nigeria: An Introduction. Ile-Ife: O.A.U. Press.
Alma, C.V. (2015) "Developments in Language Testing with the Focus on Ethics." Journal of Education and Practice, Vol. 6, No. 32.
Anyanchonkeya, N. and Anyanchonkeya, C. (2006) The Anatomy of English Studies. Owerri: Chukwumeka Printers and Publishers.
Anyawu, S. (2001) "Nature of Language." In Amadi R., Anyawu S. and Iruagba A. (Eds) Language Education: Issues and Insights. Owerri: Barioz Publishers.
Azikiwe, Uche (1998) Language Teaching and Learning. Onitsha: Africana-FEP Publishers Ltd.
Blumfit (1985) "Language and Literature Teaching from Practice." ELT Document 102 on English as an International Language. London: The British Council.
Brown, H.D. (2000) Principles of Language Learning and Teaching. New York: Longman.
Finegan, Edward (2004) Language: Its Structure and Use. Boston: Thomson Wadsworth.
Foyewa, R.A. (2015) "Testing and Evaluating English Language in Teaching: A Case of O Level English in Nigeria." International Journal of English Language Teaching, Vol. 3, No. 6.
Fulcher, G. "Ethics in Language Testing." Retrieved from http://taesig.8m.com/newsl.html
Kerlinger, F.N. (1973) Foundations of Behavioral Research. New York: Holt, Rinehart and Winston.
Izuagba, A.C. (2001) "The Functions of Language." In Amadi R., Anyawu S. and Iruagba A. (Eds) Language Education: Issues and Insights. Owerri: Barioz Publishers.
Halliday, M.A.K. (1975) Explorations in the Functions of Language. London: Edward Arnold Ltd.
McNamara, T. (2000) Language Testing. Oxford: Oxford University Press.
McNamara, T. and Roever, C. (2006) Language Testing: The Social Dimension. Malden, MA and Oxford: Blackwell.
Ndimele, Ozo-Mekuri (ed.) (2001) Readings on Language. Port Harcourt: M. J. Grand Orbit Communications Ltd.
Okpala, P.N., Onocha, C.O. and Oyedeji, O.A. (1998) Measurement and Evaluation in Education. Jattu-Uzairue: Stirling-Horden Publishers.
Osuala, E.C. (1987) Introduction to Research Methodology. Onitsha: Africana-FEP Publishers.
Sapir, E. (1921) Language. New York: Harcourt Brace and World.
Severin, W.J. and Tankard, J.W. (2001) Communication Theories: Origins, Methods and Uses in the Mass Media. New York: Longman.
Shohamy, E. (2007) "Tests as Power Tools: Looking Back, Looking Forward." In J. Fox et al. (eds) Language Testing Reconsidered. University of Ottawa Press.
Shohamy, E. (2001) "The Social Responsibility of the Language Testers." New Perspectives and Issues in Educational Language Policy. John Benjamins Publishing Company.
Spolsky, B. (1995) Measured Words: The Development of Objective Language Testing. Oxford: Oxford University Press.
Spolsky, B. (1997) "The Ethics of Gatekeeping Tests: What Have We Learned in a Hundred Years?" Language Testing, Vol. 14, No. 3.
Stafford, Amy (1995) "Structural Linguistics: Its History, Contributions and Relevance."
Taber, Joan (2006) "A Brief History of ESL Instruction: Theories, Methodologies, and Upheavals." http://papersbyjoantaber.blogspot.com/2006/05/brief-history-of-esl-instruction.html
Udofot, I. (2005) An Introduction to the Morphology of English. Ikot Ekpene: D.U.C Press.
Udofot, I. and Ekpenyong (2006) A Comprehensive English Course. Ikot Ekpene: D.U.C Press.
Ukpana, Ime D. (2004) "Indigenous Musicians from Nigeria's Akwa Ibom State Are prophet." Uyo Journal of Humanities, Vol. 10.
Ukpana, Ime D. (2003) "Career Prospects in Music: The Case Nigerian Professionals." Uyo Journal of Humanities, Vol. 8, pp. 174-189.
Yusuf, Ore (2007) Basic Linguistics. Port Harcourt: M. J. Grand Orbit Communications Ltd.
